Data about almost everything we do is collected and used to study our habits and choices. Well-known examples of data science applications by innovative companies are everywhere: Netflix uses an advanced recommendation algorithm to get us to watch more movies; Amazon suggests other products that we might want to add to our shopping cart; and Facebook shows us ads and news according to our likes, friends and browsing history. Companies that realize they have access to petabytes of data about their customers can now employ new machine learning algorithms to find insights, and use cheaper-than-ever computing resources to improve the way they run their business.
Instead of companies using these powerful data science tools to make us buy more of their products, imagine if these tools were widely adopted by governments, NGOs and other organizations working to promote 'social good'. Rather than a recommendation engine for which movie to watch next, we could build a model to suggest which ambulance should be dispatched to the next incoming 911 call for the shortest response time. Or imagine a tool that suggests which government program to offer a person to reduce their probability of long-term unemployment. Or a model that identifies the students at highest risk of dropping out of high school, so that someone can intervene before it is too late.
Unfortunately, most organizations with a primarily public mission (such as educational institutions, municipal or national governments, and NGOs) still lag behind most corporations when it comes to using data science. There are several reasons for this: a lack of financial resources or investment in the field, insufficient in-house data science skills, or limited awareness of the benefits and impact that data science can bring to an organization. Moreover, talent in this field is still scarce and sought after, and big tech companies and banks usually attract the available candidates by offering higher pay.
One organization that is working to reverse this is Data Science for Social Good (DSSG). The DSSG Summer Fellowship was founded at the University of Chicago by Rayid Ghani, Chief Data Scientist of the Obama 2012 Election Campaign, and aims both to 1) train aspiring data scientists to apply their skills to social problems and 2) help organizations adopt data science. Past DSSG projects include helping the Ministry of Social Services in Mexico identify which poor people are in greatest need of conditional cash transfers; a collaboration with the White House and various police departments to reduce adverse interactions between police officers and the public; and helping the City of Cincinnati improve its emergency response services.
This year, DSSG was brought to Europe by NOVA School of Business & Economics and the municipality of Cascais, Portugal, in an effort to create a global network and tackle social problems around the world. One team of fellows is working with the Dutch Ministry of Transportation and Environment to reduce response times to traffic incidents. The Netherlands sees around 120,000 incidents annually, and the ministry deploys 260 inspectors to keep Dutch roads safe. The team aims to use the available incident and geospatial data to optimize the positioning of road inspectors and improve road safety.
Other European initiatives include predicting the risk of long-term unemployment in Cascais with data from the municipality; a partnership with the World Economic Forum to detect illegal fishing using satellite images and ocean data; and a collaboration with the Tuscany Tourism Office to promote sustainable tourism in Florence, among others. The work done by Data Science for Social Good is made public and uses open source software, so that other organizations can use it to address their own, often similar, problems. Opening up the code also allows other people to take the existing solutions and improve them. The words “big data” can be a black box for many. Transparency is therefore key to engaging with the public and giving them the opportunity to identify biases or flaws in the models.
These are just some examples of what can be achieved in different sectors by using data for non-traditional purposes. Given the amount of data collected nowadays, the applications of data science for social good are endless. Data Science for Social Good is currently fundraising to ensure this work can continue in the future (contact the authors of this article for ways to help). Collaboration between universities, NGOs, governments, data scientists and the public can ensure that big data is used for more positive, just and long-lasting solutions. Sharing projects like these, and their results, with the world and making data science accessible to more organizations are key to ensuring that these new tools and techniques are used for the problems that matter most.