Great Challenges Raised in Entrusting Data with Ethical Questions

Can we see data as essential, even vital, to human rights protections? How informed is data, and how informed can data-informed policing truly be? Can we embrace the possibilities of algorithmic tools? These were some of the questions raised during the panel discussion 'Human Rights Challenges in Predictive Analytics' at New York University's Bernstein Institute, which kicked off the Institute's annual conference. The topic of this year's conference was 'Tyranny of the Algorithm? Predictive Analytics & Human Rights', and it included discussion of the challenges that arise in data-driven risk assessment as well as the social, political, economic, and cultural contexts in which such assessments are designed and implemented.

Moderator Margaret Satterthwaite, Faculty Director of the Institute and Professor of Clinical Law, began by placing data risk assessment in its greater context, one defined and redefined by whoever is using the data, be it scholars, politicians, or the media. Data risk assessment in the human rights field can therefore always be considered problematic: because it is incomplete and cannot tell the whole story, because it is biased and therefore harmful to human rights, because attempting to quantify human suffering cannot get at the source of human rights, or because, quite frankly, it is a lie. However, it also has the potential to amplify the messaging behind human rights obligations and become a powerful tool to speak truth to power, if it can, indeed, capture a truth.

Latanya Sweeney, a computer scientist at Harvard University, opened the discussion by apologizing for the adverse effects of algorithms. Algorithms have contributed to human rights violations, to the policing of black and brown bodies, and to broad and unwarranted surveillance. Sweeney presented an image of a Google search of her name and the ads that accompanied it: ads for arrest records, though there were none to be found for Sweeney. When she first ran the search, a male colleague of Italian origin suggested the reason was that Sweeney had a 'Black sounding' name. Sweeney conducted an experiment and concluded that the observation was true: 'White sounding' names yielded neutral advertising alongside search results, and the disparity was 80% to 20%, which, Sweeney noted, meets the litmus test set by the Department of Justice for instances of discrimination.

Rather than discussing whether we live in an algorithmic tyranny, Sweeney proposed that we live in a technocracy, where "every value is up for grabs". As an example, Sweeney presented domains targeting the Black American population. Notably, Sweeney found that more domains are exclusive to Blacks and Asians than to any other groups. Sweeney concluded that "design of algorithms dictates how we live our lives" and that we need to reconceptualize algorithms as being akin to stereotyping: assessments of who we are based on the sites we visit and who is in our social networks.

Jeff Brantingham presented an alternate narrative. Chief of Research and Development at PredPol and a Professor of Anthropology at UCLA, he explained that data can be used in three areas of police work: response to crimes, apprehension, and prevention. Where it is actually being employed, however, is preventive analytics, which focuses on when and where crime is most likely to occur, based on information about past crimes, within a system that is ever-changing. Data examining gun violence in Chicago, dubbed 'Chiraq' specifically because of the systematic nature of the violence, cannot be static, as hot spots of gun violence pop up one day only to disappear the next and reappear somewhere else shortly thereafter.

The data, according to Brantingham, is used to assess how much time and how many resources law enforcement should spend engaging the public at any specific location, with the end goal of minimizing the uncertainty of risk, not eliminating it. Brantingham also stressed that data pertaining to crimes examines neither the who nor the why, only the type of crime, its time, and its location. The aim is rather to identify those likely to become victims of opportunity crimes, with the end goal of assisting them.

Jennifer Lynch, Senior Staff Attorney with the Electronic Frontier Foundation, an organization defending human rights in the digital age with a focus on freedom of expression and user privacy, provided a civil liberties counterargument to data-driven 'predictive policing': in preventing victimization through crime, the police are victimizing whole communities. This encompasses limits on freedom of movement, suspicion of police, fear of authorities, and an inability to feel safe in a targeted policing environment. To that effect, Lynch spoke of the GIGO ('garbage in, garbage out') phenomenon, in which faulty data going in produces faulty analysis coming out. With only 50% of crimes reported, and more complex ones, such as rape, never reported, we cannot place our decision-making and ethical choices behind data reasoning, Lynch argued. Moreover, the biases behind these algorithms cannot be ignored when the average American commits three felonies a day, with only some of those felonies policed. Adding to that, the NYPD, LAPD, and CPD have historically accumulated racially charged information, all of which has informed the crime data fed into the system. Finally, both traditional police secrecy and private-sector secrecy obscure how data is gathered, limiting a defendant's ability to question the technology used to accuse him or her of a crime, which is another civil liberties question.

Another question arises: while police are targeting specific communities, is their time being taken away from other crimes they could have been identifying instead? The data justifies their presence in a neighborhood, suspicion of criminal activity, questioning of individuals, belief that individuals may be violent, and subsequent escalation in response. This raises further questions. Does such a process endanger free will? As a society, can we be comfortable with the idea that an algorithm will predict what we will do before we have even decided to do it?

Rachel Levinson, Senior Counsel for the Liberty and National Security Program at the Brennan Center for Justice, highlighted another question: how are data collection and sharing impacting human rights? When data is gathered on vulnerable people, those at risk of persecution, the information could fall into the wrong hands when shared. One example came when Europol and EU law enforcement agencies attempted to access information collected by EURODAC, a transnational database established by the European Council in which sensitive information was collected on asylum seekers fleeing repressive regimes. Moreover, the sharing of data is often well within the rights of several organizations, which can transfer data to third parties. This, in turn, may have a chilling effect on asylum seekers and migrants, causing them not to seek out services for fear of data collection and sharing. It may also create further divisions and strata in society between those who are tracked and those who are not, be it, nationally, individuals who are incarcerated or, internationally, those seeking refugee status. Data analytics can have even greater adverse effects on migrants and refugees as companies like IBM propose software tools to distinguish refugees from imposters and possible terrorists. This raises the question of whether companies can argue that such tools are just engineering when credit fraud and terrorism operate so differently.

However, what all panel members seemed to agree upon was the potential of data to support the offering of services, for example by tracking the likelihood of recidivism for those who have already been incarcerated and providing accompanying services. The risk of offering misplaced services may be less dire. There is also a difference between community efforts and police presence: the former creates more support, while the latter creates more stigma. As Sweeney also pointed out, many crimes are crimes of opportunity, and just because they do not occur in one place, at one time, to one individual does not mean they do not occur in another place, at another time, to another individual. Services may work towards eliminating the roots of crimes rather than displacing them.

Though some solutions to the current ambiguity in data usage and human rights may include a code of ethics, standards of operation, and diversity in data, the discussion proved to be a catalyst for further discussions on human dignity, privacy, and ethics. Future questions to be asked may be: 'should we be placing more trust in data and technology than necessary?', 'are we relying on certain technologies without making complex ethical decisions?', and, importantly, 'are algorithms the best way to address these problems?'