There’s a nationwide grassroots movement that’s energizing scientists, environmental activists, technologists and librarians. These individuals, known as Data Rescuers, are passionate about open data—free, publicly available data that anyone can access and use, without restrictions. They have participated in events across the country as part of the DataRefuge initiative, which is “committed to identifying, assessing, prioritizing, securing, and distributing reliable copies of federal climate and environmental data so that it remains available to researchers.”
According to Bethany Wiggin, Director of the University of Pennsylvania program in environmental humanities and a DataRefuge organizer, the DataRefuge project provides an insurance policy to protect valuable datasets. Researchers are concerned that data, particularly climate and environmental data, could be made difficult to access or removed altogether in the future. This concern has prompted over a dozen communities to organize events in the past month, with more in the queue.
How is DataRefuge identifying the most valuable datasets? Through a public survey that was widely circulated to the Union of Concerned Scientists’ network.
The Data Rescue movement came to Washington, DC in February. Hundreds of Data Rescuers participated in Data Rescue DC, a two-day event at Georgetown University.
Saturday 2/18 consisted of discussion sessions and a training for participants.
- During the opening panel, representatives from the University of Pennsylvania, Georgetown University and New America led a teach-in on the history and importance of climate data.
- During the second session, representatives from the Sunlight Foundation, the University of Pennsylvania, George Washington University and the Union of Concerned Scientists, along with a former Department of Commerce official, discussed key issues around the vulnerability of open data.
- During the training, volunteer guides met in preparation for leading the Data Rescue teams to make Sunday’s event as productive as possible.
On Sunday 2/19, students, data advocates, librarians and scientists participated in an archive-a-thon. They split into several working groups:
- Seeders and sorters (technologists) identified important URLs on government agency sites. Those deemed “crawlable” by the Internet Archive’s web crawler were recommended for the End of Term Web Archive. “Uncrawlable” URLs were added to a spreadsheet for further research.
- Researchers and harvesters (technologists) recommended the best approaches for capturing datasets and then constructed a meaningful and complete data archive.
- Baggers and checkers (technologists) inspected the harvested dataset to make sure it was complete, conducted a quality assurance check and packaged the data.
- Describers (librarians/scientists) created a metadata record for each archived dataset.
- Storytellers (journalists/photographers) documented the day’s activities through social media posts, interviewed participants to create rescuer profiles, contacted reporters and blogged about the event.
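For readers curious about what the bagging and checking step can look like in practice, here is a minimal sketch in Python. It is an illustration only, not the tooling used at Data Rescue DC: the function names (`sha256_of`, `build_manifest`, `check_manifest`) and the sample file `readings.csv` are hypothetical. The idea is to record a checksum for every file in a harvested dataset, then re-verify those checksums later to confirm that nothing is missing or altered.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> Dict[str, str]:
    """Map each file's relative path to its checksum (a stand-in for 'bagging')."""
    return {
        path.relative_to(dataset_dir).as_posix(): sha256_of(path)
        for path in sorted(dataset_dir.rglob("*"))
        if path.is_file()
    }


def check_manifest(dataset_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Re-verify a dataset against its manifest (a stand-in for 'checking')."""
    problems = []
    for relative_path, expected in manifest.items():
        path = dataset_dir / relative_path
        if not path.is_file():
            problems.append(f"missing: {relative_path}")
        elif sha256_of(path) != expected:
            problems.append(f"changed: {relative_path}")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, created in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp)
        (dataset / "readings.csv").write_text("station,temp_c\nA,12.5\n")
        manifest = build_manifest(dataset)
        print(check_manifest(dataset, manifest))  # empty list: nothing wrong
        (dataset / "readings.csv").write_text("station,temp_c\nA,99.9\n")
        print(check_manifest(dataset, manifest))  # reports the altered file
```

A manifest like this could be stored alongside the archived copy, so that anyone who later downloads the dataset can run the same check themselves.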
So how productive was Data Rescue DC?
The participants ended the day energized and eager to stay involved. Open data is strong and getting stronger in Washington, DC.