Artificial Intelligence and Natural Language Processing for Digital Epidemiology: Overcoming the Challenges of Real World Data

Dr. Graciela Gonzalez-Hernandez

Vice Chair, Department of Computational Biomedicine, Cedars-Sinai Medical Center, CA

October 10, 2022 | 3:30 PM

Via Zoom:

Password: BINF865

Health records and patient-generated data (in health forums or social media) constitute what the FDA and the CDC refer to as “real world data”, which can be extremely valuable and become “real world evidence” to advance health research. However, using these data in large-scale studies presents many challenges, and the data are sometimes misused. In this talk, I will showcase some of the approaches my team has deployed to identify cohorts, reduce data bias, find what drives patients to switch medications, and enrich metadata for SARS-CoV-2 sequences in public repositories, among other projects that incorporate real world data using natural language processing and artificial intelligence techniques.


Dr. Graciela Gonzalez-Hernandez is Vice Chair for Research and Education in the new Department of Computational Biomedicine at Cedars-Sinai Medical Center in California. Prior to joining Cedars-Sinai in May 2022, Dr. Gonzalez-Hernandez was an Associate Professor of Informatics in the Department of Biostatistics, Epidemiology and Informatics (DBEI) of the Perelman School of Medicine, University of Pennsylvania.

The Gonzalez-Hernandez Laboratory has been a pioneer in integrating natural language processing and artificial intelligence for digital epidemiology. The language processing methods the lab has developed are open to the research community and are designed for portability, with applications to mining information from real world data such as electronic health records, published literature, and social media. Her lab has made available advanced tools such as DeepADEMiner, Kusuri, and SEED for extracting adverse events, medication names, and symptoms mentioned on Twitter and in electronic health records.
