KG-COVID-19: A Framework to Produce Customized Knowledge Graphs for COVID-19 Response

Justin T. Reese, Deepak Unni, Tiffany J. Callahan, Luca Cappelletti, Vida Ravanmehr, Seth Carbon, Kent A. Shefchek, Benjamin M. Good, James P. Balhoff, Tommaso Fontana, Hannah Blau, Nicolas Matentzoglu, Nomi L. Harris, Monica C. Munoz-Torres, Melissa A. Haendel, Peter N. Robinson, Marcin P. Joachimiak, Christopher J. Mungall

Research output: Contribution to journal › Article › peer-review

31 Scopus citations


Integrated, up-to-date data about SARS-CoV-2 and COVID-19 are crucial for the biomedical research community's ongoing response to the COVID-19 pandemic. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is held in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically between tasks; the optimal data for a machine learning task, for example, differ greatly from the data used to populate a browsable user interface for clinicians. To address these challenges, we created KG-COVID-19, a flexible framework that ingests and integrates heterogeneous biomedical data to produce knowledge graphs (KGs), and applied it to create a KG for COVID-19 response. This KG framework can also be applied to other problems in which siloed biomedical data must be quickly integrated for different research applications, including future pandemics.

An effective response to the COVID-19 pandemic relies on the integration of many different types of data available about SARS-CoV-2 and related viruses. KG-COVID-19 is a framework for producing knowledge graphs that can be customized for downstream applications, including machine learning tasks, hypothesis-based querying, and browsable user interfaces that enable researchers to explore COVID-19 data and discover relationships.

Original language: English (US)
Article number: 100155
Issue number: 1
State: Published - Jan 8 2021


Keywords

  • COVID-19
  • DSML 3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems
  • MERS-CoV
  • SARS-CoV
  • SARS-CoV-2
  • coronavirus
  • data integration
  • knowledge graph
  • machine learning
  • ontology

ASJC Scopus subject areas

  • Decision Sciences(all)

