Spatial Prediction of COVID-19 Pandemic Dynamics in the United States

Çiğdem Ak, Alex D. Chitsazan, Mehmet Gonen, Ruth Etzioni, Aaron J. Grossberg

Research output: Contribution to journal, Article, peer-review

Abstract

The impact of COVID-19 across the United States (US) has been heterogeneous, with rapid spread and greater mortality in some areas compared with others. We used geographically linked data to test the hypothesis that the risk for COVID-19 was defined by location and sought to define which demographic features were most closely associated with elevated COVID-19 spread and mortality. We leveraged geographically restricted social, economic, political, and demographic information from US counties to develop a computational framework using a structured Gaussian process to predict county-level case and death counts during the pandemic’s initial and nationwide phases. After identifying the most predictive information sources by location, we applied an unsupervised clustering algorithm and topic modeling to identify groups of features most closely associated with COVID-19 spread. Our model successfully predicted COVID-19 case counts of unseen locations after examining case counts and demographic information of neighboring locations, with an overall Pearson’s correlation coefficient and proportion of variance explained of 0.96 and 0.84, respectively, during the initial phase and 0.95 and 0.87 during the nationwide phase. Aside from population metrics, presidential vote margin was the most consistently selected spatial feature in our COVID-19 prediction models. Urbanicity and 2020 presidential vote margins were more predictive than other demographic features. Models trained using death counts showed similar performance metrics. Topic modeling showed that counties with similar socioeconomic and demographic features tended to group together, and some of these feature sets were associated with COVID-19 dynamics. Clustering of counties based on the feature groups found by topic modeling revealed groups of counties that experienced markedly different COVID-19 spread. We conclude that topic modeling can be used to group similar features and identify counties with similar features in epidemiologic research.
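
The abstract reports two performance metrics side by side, Pearson’s correlation coefficient and the proportion of variance explained. The abstract does not say how the latter was computed; the minimal sketch below assumes one common definition, the coefficient of determination 1 - SS_res / SS_tot, and uses made-up numbers (not data from the study) to show that the two metrics can differ. The function names are illustrative only.

```python
import numpy as np


def pearson_r(y_true, y_pred):
    """Pearson's correlation coefficient between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def variance_explained(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is only one common way to define the proportion of
    variance explained; the abstract does not state the exact formula used.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)


# Hypothetical counts, not taken from the paper: the predictions are
# perfectly correlated with the observations but systematically too high.
observed = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
predicted = observed * 1.2 + 5.0

r = pearson_r(observed, predicted)
r2 = variance_explained(observed, predicted)
print(f"Pearson r = {r:.3f}, variance explained = {r2:.3f}")
```

Under this definition, correlation measures only how well predictions track observations linearly, while the variance explained also penalizes systematic bias and scale errors, so the second figure need not equal the square of the first.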

Original language: English (US)
Article number: 470
Journal: ISPRS International Journal of Geo-Information
Volume: 11
Issue number: 9
State: Published - Sep 2022

Keywords

  • computational epidemiology
  • COVID-19
  • infectious diseases
  • interpretable predictions
  • spatial clustering
  • spatiotemporal modeling

ASJC Scopus subject areas

  • Geography, Planning and Development
  • Computers in Earth Sciences
  • Earth and Planetary Sciences (miscellaneous)
