Public health as we know it started with data.

In 1854, hundreds of people in London’s Soho neighborhood were dying from cholera. At the time, no one knew what caused the disease, but a public health pioneer named John Snow suspected it was in the water. He painstakingly collected and mapped data from people who had become ill and discovered the source: a water pump on Broadwick Street. The city removed the pump’s handle, the outbreak dissipated, and epidemiology—the study of how diseases originate and spread—was born.

Data is even more vital to public health today than it was 170 years ago. Indeed, just as clinicians rely on patient data (blood pressure, glucose levels, and other vital signs) to diagnose illnesses and guide treatment, public health agencies need data to keep communities healthy. Data enables them to detect disease hot spots, control the spread of infection, and direct limited resources as efficiently as possible to the populations that need them most.

But before public health agencies can start analyzing and using data, they must first collect it.

Just 20 years after John Snow’s breakthrough, U.S. doctors started systematically reporting cases of disease to their health departments. Back then, they used postcards. In the 20th century, they progressed to phone and then fax. Today, despite the advent of electronic medical records and the internet, phone and fax often remain the way that cases get reported. This causes delays and errors in reporting that can cost lives and money. These challenges were especially stark during the COVID-19 pandemic, a time when many health departments received lab reports by fax. As The New York Times reported in September 2022: “The precise cost in needless illness and death cannot be quantified. … But federal experts are certain that the lack of comprehensive, timely data has also exacted a heavy toll.”

Fortunately, the pandemic also sparked innovation. At the time, doctors and public health agencies rarely used automated digital systems to share case data, a practice known as electronic case reporting. Today, about 53% of state, local, Tribal, and territorial public health agencies receive digitized case data for at least three-quarters of the health conditions that providers are required by law to report.

Indeed, the movement to bring public health into the digital age has many bright spots. Most lab results, which public health agencies use to confirm incidents of disease, are reported using automated systems. Immunization reports are increasingly digital. And public health agencies are using digital data from emergency rooms—known as syndromic surveillance—in increasingly creative ways to detect and respond to a wide range of infectious and environmental health threats.

Health departments are also pioneering partnerships with insurers such as Medicaid and harnessing their claims data to inform and improve treatments for chronic diseases. Some are looking outside of health care and engaging with social services agencies, corrections departments, educational institutions, and others to better understand the social and economic factors that drive health outcomes and then design more effective programs to prevent diseases. Other intrepid agencies are even exploring how artificial intelligence can help them to overcome technological hurdles and workforce shortages.

These technological advances and creative uses of data are not yet the norm, but some state and local agencies are pointing toward a future of public health that takes full advantage of the innovations available to them.

Support from syndromic surveillance

Lab reports confirming the presence of a disease are vital to public health, but by the time public health agencies get them, the data is often days or weeks old—the time it takes for doctors to order tests and wait for the results. However, when diseases are spreading quickly, time is of the essence.

That’s why, around the time of 9/11 and the anthrax attacks of 2001, the U.S. government established the national syndromic surveillance program to serve as a sentinel against bioterrorist attacks. By collecting digital data on just the symptoms that patients reported in emergency departments, the Centers for Disease Control and Prevention (CDC) and state public health departments could detect threats in near real time.

Today, public health departments across the country are using syndromic surveillance data to track not just infectious diseases but also environmental and behavioral health issues that arise from wildfire smoke, extreme heat, substance use, and automobile injuries, to name a few.

For example, when wildfires swept across Oregon in September 2020, hospitals filled with patients suffering from respiratory illnesses. As doctors treated people individually, public health officials also sprang into action, using real-time data from emergency rooms and urgent care centers to track the smoke’s health effects and issue targeted warnings to vulnerable communities.

And in northern Idaho, where suicide has been a leading cause of death among youth, the regional health department in 2017 used syndromic surveillance to better understand and respond to childhood suicide risk. It did this by analyzing data on suicidal ideation or attempts from hospital emergency departments, which captured critical data often missed by traditional sources such as coroner reports. The health department monitored the data and generated weekly analyses that examined possible correlations of suicidal behavior with sex and age. It shared its analyses with the Suicide Prevention Action Network, which used the data to improve its efforts to prevent self-harm.

Data to curb chronic disease

Before the advent of vaccines, antibiotics, and modern sanitation, infectious diseases were the leading killers. But today, chronic diseases such as diabetes, hypertension, and asthma account for 70% of deaths and 86% of health care expenses in the United States. Yet the clinical data that doctors must report is still largely restricted to infectious diseases.

So, if doctors aren’t required to report chronic disease data and cannot feasibly do so, where can public health agencies turn? One major source is insurance providers. They collect information daily on the types of illnesses that patients have, the treatments being recommended, and the medications being prescribed. Insurance providers use this data to determine reimbursement rates, assess the quality of care, and guide treatment, but public health agencies do not have ready access to this information.

To help overcome this gap, The Pew Charitable Trusts recently launched a project to build data-driven partnerships between state public health agencies and their Medicaid counterparts. Why Medicaid? First, as the nation’s largest single payer, it can provide public health agencies with a large pool of claims data. Second, Medicaid serves people who would benefit most from more effective public health programs, including families with low incomes, people with disabilities, and older adults. And, lastly, Medicaid influences the practices of two critical constituencies: private insurers that contract with Medicaid and the 70% of doctors who accept Medicaid payments.

In 2020, the Centers for Medicare & Medicaid Services launched an initiative through its innovation accelerator program to help states reduce maternal mortality and severe maternal morbidity among Medicaid beneficiaries. Seven states participated in the program, which focused on strengthening partnerships and building the ability to analyze data to better understand and address maternal health outcomes.

Each state developed tailored data strategies by collaborating with public health agencies and maternal mortality review committees. For example, Delaware linked Medicaid data with maternal mortality data to gain a fuller picture of maternal deaths, while Kentucky focused on severe cardiac-related illnesses and developed a plan to identify, track, and ultimately reduce risk factors. Massachusetts integrated clinical and social data to identify and narrow disparities in maternal health outcomes that have left some populations at greater risk of harm. And Wyoming focused on identifying psychiatric and substance use-related risk factors associated with maternal death and illness.

Medicaid supported the states in drafting data use agreements, linking datasets, and identifying risk factors. Peer-to-peer learning further enriched the experience and fostered continued collaboration. The initiative demonstrated how Medicaid programs can use data partnerships and analytics to drive targeted, evidence-based interventions that improve maternal health outcomes—especially for low-income populations disproportionately affected by preventable complications.

Looking beyond health data

Medical care has a significant influence on whether a sick person gets well, but researchers estimate that social, economic, and environmental factors (where people live, what job opportunities they have, the quality of the education they receive, their access to affordable and nutritious food) account for about 80% of health outcomes. That’s because the further upstream the root causes of illness can be addressed, the more efficient and effective the solutions are. Yet as challenging as it is to collect data from doctors, hospitals, and health insurers, it can be even more difficult for public health agencies to partner with social services agencies outside the health care sector. Fortunately, there are a growing number of models to emulate.

In 2017, when Alabama was facing an epidemic of opioid overdoses, its public health department launched a central data repository to pool information from a variety of state agencies, including public health, mental health, Medicaid, and corrections. The system allowed state officials to identify overdose hot spots, evaluate program effectiveness, and secure funding for targeted interventions. For example, when state officials could see that overdoses per capita were much higher in one county than in others, they directed prevention and peer-support programs there. The database has since expanded beyond opioids to support broader public health efforts.

And across California from 2022 to 2024, a mix of state and local health departments, health systems, and social services agencies partnered to improve care for people experiencing homelessness. The project was facilitated by the California Health Care Foundation and Center for Health Care Strategies. In Alameda County, health care providers and community-based organizations collaborated to offer more integrated behavioral health and housing services that address underlying factors—substance use and lack of access to affordable housing—that drive homelessness. In San Diego County, a mix of academic institutions, health care providers, and organizations such as the YMCA created a real-time bed availability tracker and tool for referring people to temporary residential medical care.

Potential for artificial intelligence

While many state and local public health departments are experimenting with chatbots to communicate with the public, AI has also been tested and deployed to detect infectious and foodborne outbreaks by analyzing language in internet searches, news, and social media messages. It’s also been used to accelerate the analysis of disease-causing microbes and their genomes, and even to analyze thermal imaging in hospital waiting rooms to predict daily flu counts.

AI might also help public health agencies overcome obstacles to data sharing and analysis. Interoperability—the concept that two different systems can communicate with each other—requires that multiple stakeholders develop a common language that their products will speak. It takes much time and attention to develop consensus on standards for so many different products and conditions across a broad and diverse group of software vendors, policymakers, public health practitioners, and health care providers.

Without interoperability, public health officials must spend a lot of time and money manually processing data. But AI programs can process data themselves, taking the burden off public health officials and letting health care providers focus on people, not paperwork. In April 2022, for example, about seven months before ChatGPT debuted, the CDC and the Georgia Tech Research Institute unveiled an AI tool that could analyze patients’ records to classify the severity of COVID-19 in pregnant women. They found that the software reached the same conclusions as a human clinician in 99.4% of 4,378 cases.

Since then, the development and adoption of AI tools in public health have grown dramatically.

New data dawning

These examples show how modernizing data strengthens public health. Privacy concerns, data silos, and inconsistent regulations can slow progress, so addressing these barriers will require thoughtful policy reform and stakeholder engagement.

And data alone is not enough. Agencies need long-term investments for skilled analysts, epidemiologists, and IT professionals to turn raw information into actionable insights. They also need community partnerships to ensure that interventions are culturally appropriate and responsive to local needs.

Emerging infections, chronic diseases, and environmental threats will continue to challenge our health systems. But with better data and deeper partnerships, public health agencies can detect and respond to threats more quickly, target resources more effectively, and improve health outcomes for all.

The Takeaway

Automatically transferring medical data to public health agencies allows experts to spot disease outbreaks, control the spread, and quickly direct limited resources to the people who need them the most.

Kathy Talkington oversees teams of policy experts, scientists, and staff who lead engagement for The Pew Charitable Trusts’ work on public health issues.

Illustrations by Ned Drummond/The Pew Charitable Trusts
