LLM PARSING OF NATIONAL VIOLENT DEATH REPORTING SYSTEM NARRATIVES
Suicide is one of the leading causes of death in the United States for 5-24 year-olds. Researchers and policymakers study the circumstances of youth suicides to better understand them and reduce their occurrence. One key source of information is the National Violent Death Reporting System (NVDRS). The NVDRS captures information about violent deaths across the United States that has been abstracted from sources including law enforcement reports, coroner/medical examiner reports, toxicology reports, and death certificates.
​
The NVDRS contains narrative summaries drawing from those sources as well as standard variables that are useful to researchers. The process of recording standardized variables is time consuming and prone to human error.
​
Generative prompting of large language models such as the Mistral-7B-Instruct-v0.2 and
Code Llama 7b to abstract dimensions of stigma and discrimination from coroner or
medical examiner narratives yielded promising results. This analysis revealed
commonalities between many of the tragic events described in the project dataset that
would be important to document in the NVDRS dataset.

