Much has been written about the fear of life in a post-fact America where “pants on fire” falsities no longer seem to matter, or even raise eyebrows. More recently, there is a fear that data itself will be under siege. Instead of inconvenient truths, the worry is about inconvenient data: valuable (often irreplaceable) data that has been openly published online by government agencies, and that might threaten the policy positions of an incoming administration.
Over the past seven years, tremendous momentum has grown behind the public release of data held by the U.S. government, and by governments at all levels. Nearly 200,000 datasets are freely available through a federal website called data.gov. Most are seemingly uncontroversial and widely used, like census data, consumer complaints, and hourly precipitation data. And then there are data that might prove uncomfortable for incoming West Wing policymakers or appointees at certain federal agencies: data about climate change, renewable energy, or even White House visitor logs. This has people worried. Frantically worried. There is a mountain of data potentially at risk, and not just from incoming climate-change deniers. The bigger problem is Data Deniers: people in leadership positions who vocally challenge the basic legitimacy of data, whether it’s employment figures, air pollution, or water quality. Census data is protected by law; most other data collected by the government is not.
If major political fights are sparked by legislative or regulatory actions, will there be pressure not only to end certain federal research programs but to remove data currently available that might question the factual underpinnings of those actions? This might touch a broad swath of data now available — from measurements on rising sea levels and ocean temperatures to the number of wildfires in the country.
The removal of any data under circumstances that have even a whiff of political expediency will set a highly dangerous precedent. Could someone get away with it? Possibly. The brazenness of political machinations has sunk to unprecedented levels if North Carolina is any indicator.
Even if concerned citizens save copies of datasets and archive websites (as scientists and others are doing right now with various types of climate data), this does not fully secure the situation. First, such rescued data will grow “stale” over time without official updates to incorporate more recent measurements. Second, and more insidiously, investments in data collection, systems, and data management might decline, especially if budgets are squeezed at agencies that are heavy data producers, like the EPA and the Bureau of Labor Statistics. This could have severe, long-term effects on the federal government’s ability to collect and preserve data crucial to future decisions on a wide range of issues, from national security to the economy and the environment.
In the end, this could undermine data quality and, just as importantly, the perception of data quality across the federal government.
So what’s to be done?
Of course, ferrying at-risk data to non-government institutions and servers should continue. That’s just basic redundancy, a sound security measure for any digital information. Over the long term, the collection and free dissemination of government-produced data needs to be codified in law. The importance of this was recognized long ago for the Census, in Article 1 of the Constitution and in Title 13 of the U.S. Code. It’s time for our government and society to catch up and recognize that data is fundamental to an effective government, our economy, and our national well-being, today and tomorrow.