Utilizing Geospatial Predictive Modeling to Identify Emerging Health Clusters and Strategize Low-Cost Preventive Interventions for County Health Departments
Abstract
Emerging health clusters—unexpected aggregations of disease incidence—often go undetected until they escalate into
public health emergencies, driving high reactive costs for county health departments. This study addresses the gap in
proactive, data-driven surveillance by investigating the utility of geospatial predictive modeling for early cluster
identification and low-cost preventive planning. The primary purpose is to develop and validate a hybrid geospatial
framework combining spatial autocorrelation (Getis-Ord Gi*) and machine learning (random forest regression) to
predict high-risk health clusters for chronic and communicable diseases. Using a retrospective ecological design, the
study analyzes five years of de-identified electronic health records and environmental data from three mid-sized U.S.
county health departments. Key findings indicate that the hybrid model achieves 87.4% predictive accuracy for
emerging clusters up to four weeks in advance, at a marginal cost to each county of $0.18 per capita when integrated into
existing geographic information systems (GIS). Furthermore, the model enables targeted interventions (mobile
clinics and outreach) that reduce potential outbreak costs by 34%. The findings suggest that county
health departments can operationalize geospatial predictive modeling using existing data infrastructures to shift from
reactive to preventive public health, substantially reducing both health disparities and long-term expenditures.
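
The spatial component named in the abstract, the Getis-Ord Gi* statistic, can be illustrated with a minimal sketch. The abstract does not describe the study's implementation, so everything here is an assumption made for illustration: the function name `getis_ord_gi_star`, the five-area layout, the binary adjacency weights (each area counts itself and its immediate neighbors), and the case counts are all hypothetical. The resulting z-scores are the kind of feature that could be passed to a random forest alongside other predictors.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Return the Getis-Ord Gi* z-score for each area.

    values  : 1-D array of case counts or rates, one per area.
    weights : (n, n) spatial weights matrix; for Gi* the diagonal is
              non-zero, so each area is part of its own neighborhood.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size

    mean = x.mean()
    # Population standard deviation of the observed values.
    s = np.sqrt((x ** 2).mean() - mean ** 2)

    w_sum = w.sum(axis=1)            # sum of weights for each area
    w_sq_sum = (w ** 2).sum(axis=1)  # sum of squared weights for each area

    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical data: five areas arranged in a line, with made-up case counts.
counts = np.array([2, 3, 20, 25, 4])

# Binary adjacency weights (assumed): 1 for the area itself and its
# immediate neighbors along the line, 0 otherwise.
weights = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
])

z = getis_ord_gi_star(counts, weights)
hot_spot = int(np.argmax(z))  # index of the area with the strongest local clustering
```

Large positive z-scores mark areas surrounded by high values (candidate hot spots), and large negative scores mark cold spots. The statistic is undefined for an area whose neighborhood covers every area with equal weight, because the denominator is then zero, so a real weights matrix should be checked for that case.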
License
Copyright (c) 2026 Tanjin Islam (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.