Quick: Is Someone Trying to Steal Your Agora?

Co-written by Steve Barnett.

We recently completed a pilot study of the potential impact of bias in online search on public opinion, including manipulations as small as altering the order in which stories covering events or issues are presented. After running more than 600 subjects through our experiments, we found some areas of encouragement -- people's opinions can be influenced by what they read. We also found some areas of concern -- people's opinions can be altered by manipulating the coverage of events, even through manipulations as simple as shifting the order in which stories on these events are presented. Although not addressed in this study, we also believe that changing the order of presentation is entirely too easy to achieve on existing search engines, that search engine companies are at best delinquent and at worst complicit in allowing manipulation of order to be so easily achieved, and that the concentration of search represents a significant threat to our First Amendment rights to a free press.

Arianna Huffington ran a session at the World Economic Forum in Davos in 2009, in which she argued that the net would bring participatory democracy and informed self-government to America, to an extent not seen anywhere on Earth since the Athenian Agora and The Golden Age of Pericles. While this would be an outcome to be greatly welcomed, it is not assured. Achieving an idealized participatory democracy assumes that the electorate will enjoy fair access to information and that it will gain true comprehension, and that it will not be influenced by the relative ease of preparing soundbites for one position or another. I was more pessimistic than she about the crowd's attention span, willingness to focus, and willingness to perform the complex analyses enabled by fair and accurate reporting, believing that the net is neither friend nor foe of democracy. Still, both of us accepted the idea that the net would provide fair access. My coauthors and I are no longer convinced that fair access will continue to exist online. We do believe that the net can influence opinion, but we are less convinced that opinion will be based upon fair and even-handed presentation of all relevant information, without fear or favor.

Our experiment was not intended to address the accuracy or fairness of coverage in the traditional media, nor the breadth of coverage or the fairness of the totality of sources available online. Our study simply addressed the impact of changing the presentation order and selection of stories returned in the first page of a traditional search. We manipulated what was presented to experimental subjects, thus simulating the presence of bias in the reporting of results from online search, to examine how these manipulations might affect public opinion. We selected four issues of great public interest and created five different presentations of relevant articles, ranging from extremely supportive to strongly negative. We then examined differences in differences, that is, the difference in opinion before and after subjects read supportive presentations, the difference in opinion before and after subjects read negative presentations, and the difference between the differences. Our experimental treatments, described in more detail below, involved presenting subjects with a simulated first page of search results, with links to articles describing an event of interest, and then changing the balance between supportive and unsupportive articles and changing their relative positioning in the list.

Malleability of Opinions -- Readers Can Be Influenced by What They Read

Readers' opinions appear to be malleable. They can be changed by what they read. Yes, this malleability, and the areas in which it is present, are both affected by the nature of the story, by the extent of the readers' prior information, and by the nature of the readers' prior beliefs, as previously reported in the academic literature on attempts to manipulate public opinion through journalism. Still, for 13 of the 20 experimental treatments we performed, or 65 percent, the treatments produced changes in opinion that were significant at the 95 percent confidence level, far too many to be explained merely by chance.
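A rough sense of "far too many to be explained merely by chance" can be had from a binomial calculation. The sketch below is ours, not part of the study's analysis. It assumes that each of the 20 treatments would have a 5 percent chance of appearing significant if reading had no real effect, and that the 20 tests are independent. That is a simplification, since the treatments shared subjects and questions. The function name prob_at_least is illustrative.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: 20 independent tests, each with a 5 percent false-positive rate.
expected_by_chance = 20 * 0.05               # about one "significant" treatment
p_13_or_more = prob_at_least(13, 20, 0.05)   # chance of seeing 13 or more

print(expected_by_chance, p_13_or_more)
```

Under these assumptions chance alone would produce about one significant treatment out of 20, and the probability of 13 or more is well below one in a billion.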

It is not surprising that opinions can be altered, at least temporarily, by reading, but it is a necessary condition for addressing the question of whether search-engine manipulation can bias public opinion. If opinions could not be altered there would be no point in having a free press, or in studying it. It is worth noting that although we did observe a significant effect from exposure to an article with a given viewpoint in our research study, we were not certain that this would be the case. Some subjects were skeptical, and a limited number of subjects seemed outraged by the thought that opinions on important issues could be altered merely by reading about them. Indeed, in his assessment of the experiment, one subject wrote "boring and I don't believe a few articles can sway my opinions so quickly."

We found that opinions can be manipulated by changing treatments and altering what subjects read, but not always in the direction we expected and not always in obvious ways. Differences arose across treatments, across domains based on prior beliefs, across commitments to those beliefs, and across questions. For some events or issues it was difficult to get University of Pennsylvania students to become more positive, for others it was difficult to get them to become more negative, and for still others it was difficult to change opinions at all. For example, in health care, we noticed virtually no difference in subjects' change in opinion on the question: "Is health care a universal right for any citizen of any developed country?" That is, there was no difference between the change produced by treatments that attempted to present health-care reform in a favorable light and the change produced by articles presenting health-care reform unfavorably. Apparently some opinions are indeed difficult to change simply by reading a few articles. In contrast, the positive vs. negative health-care articles produced enormously different responses in subjects asked to assess whether health-care reform would save the nation money. Likewise, although we were not able to alter subjects' belief that global climate change had occurred in the past, different treatments produced enormously different impacts on subjects' beliefs concerning the validity of data suggesting that climate is changing again and beliefs concerning the severity of the threat represented by current climate changes.

The Experiment

We identified four domains (events or issues) of interest: (1) global warming, (2) Iran's possible pursuit of nuclear weapons and appropriate responses, (3) genetic modification of food, and (4) health-care reform in the United States. We used Google and Bing to identify a collection of stories (specific articles on each of the four events). We read the stories, deleted the most extreme and any that we thought might be offensive to subjects, and then classified the rest as supportive or extremely supportive (e.g., articles that were supportive of health-care reform), opposed or extremely opposed (e.g., articles that argued global warming is not occurring and the data are largely fabricated), or neutral (e.g., this is what we know about genetic modification of food and its impact on nutrition, this is its impact on farmers' costs, and these are the unresolved issues concerning risks).

We prepared five experimental treatments, which varied the articles shown to subjects. The neutral treatment not only sought to balance positive and negative stories, but sought to maintain balance locally on the page of results reported; it generally opened with a neutral story, presented a supportive or a negative story, then followed with a strongly opposite story, and continued to alternate. Positive and Negative treatments surged positive or negative stories higher on the page but did not alter the set of stories presented. Double Positive and Double Negative treatments altered the presentation order, but also attempted to swamp the opposing point of view by including additional articles that were supportive while deleting some articles in opposition.
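The Positive, Negative, Double Positive, and Double Negative treatments described above can be sketched in a few lines. This is an illustration only, not the procedure we actually used to build the pages. The Story type, the numeric stance coding (+1 supportive, 0 neutral, -1 opposed), and the function names reorder and double are all hypothetical.

```python
from typing import NamedTuple

class Story(NamedTuple):
    title: str
    stance: int  # hypothetical coding: +1 supportive, 0 neutral, -1 opposed

def reorder(stories, direction):
    """Positive (direction=+1) or Negative (direction=-1) treatment:
    the same set of stories, with the favored stance moved up the page."""
    # sorted() is stable, so stories with equal stance keep their order.
    return sorted(stories, key=lambda s: -direction * s.stance)

def double(stories, extras, direction, drop=1):
    """Double Positive / Double Negative treatment: reorder as above,
    add extra favored stories, and delete `drop` stories on the other side."""
    against = [s for s in stories if s.stance * direction < 0]
    others = [s for s in stories if s.stance * direction >= 0]
    kept = others + against[:max(len(against) - drop, 0)]
    return reorder(kept + list(extras), direction)

# A made-up page of results, roughly balanced as in the neutral treatment.
page = [Story("n1", 0), Story("s1", 1), Story("o1", -1),
        Story("s2", 1), Story("o2", -1)]

positive_page = reorder(page, +1)
double_positive_page = double(page, [Story("s3", 1)], +1)
print([s.title for s in positive_page])
print([s.title for s in double_positive_page])
```

The first call changes only the order. The second also changes which stories appear at all, which is the difference between the single and double treatments.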

Each subject initially answered four sets of questions, some on facts, some on opinions, for each of the four domains, before receiving any of the experimental treatments. Each subject then received a single treatment on each of the four domains, and then completed the same sets of questions after completing the treatment on each domain. We were therefore able to measure, subject by subject, the extent to which each treatment did or did not alter opinions on each question in each domain. No subject received more than one treatment, because we felt that would telegraph the true purpose of the experiment. The sample was large enough (more than 600 subjects) that we were able to detect patterns in changes in belief across different treatments. As noted above, the questions varied enough that we can see that different types of questions have much more variable responses.

Many subjects complained that there was not sufficient time to read all of the articles, which was, of course, what we intended. If everyone read everything then presentation order would not matter, or would not matter in the way that was intended by manipulation. At no time were the results shown to the subjects explicitly identified as the results of a single search on a single search engine, to avoid any negative impact on the reputation of any search engine vendor; after completion of the experiment, subjects were debriefed and informed that the articles they received were produced by merging and manipulating the results of several searches and not the results of a single organic search.

Discussion of Findings

Manipulation of presentation can alter opinions and as we learn more about this, it will become a more powerful tool. It's clear that we can relatively easily convince our subjects that health care is more expensive than they thought or were told; indeed, it is hard not to alter their opinion of some issues simply by letting them learn more about the proposals. It is much harder to convince a subject pool at a major American university that there is an upside to Iran's developing nuclear weapons, or to alter long-held beliefs for or against health care as a universal right of citizenship.

We are aware of several experimental limitations. First, our subjects were all students or student-age young adults; thus, as a group they are known to be more malleable than the population at large. We do not know if the change in opinions is real and would convert into changed actions, or merely subjects' response to the experimental setting and to their belief that the experimenters believed that reading should have changed their opinions. Even if the treatments resulted in real changes in subjects' beliefs immediately after the experiment, we do not know that these changes are persistent. We do not know whether the subjects would have been so effectively manipulated if presented with more articles and more time to read them; subjects forming their opinions before voting are time constrained, of course, and cannot read the hundreds of thousands of pages of material that they might find searching online, but they can prepare more thoroughly than our experimental subjects were permitted to prepare.

Although only a pilot study, the experiment probably does have implications for the relationship between search and informed democracy. It's not surprising that search order presentation matters. If relative position did not matter, there would be no reason for sponsored search and payment for sponsored links on the top of the page, for sponsored ads, or for search engine optimization (SEO for short). Indeed, if we were surprised by anything it was by how difficult it was to alter opinions in some areas.

And yet, we did observe results that might be numerically and politically significant. We defined numerically significant to be a difference between differences of 0.15; that is, the change produced by the negative treatments differs from the change produced by the positive treatments by 0.15. Changing from Undecided to Yes produces a change of +1, while changing from Undecided to No produces a change of -1; changing from No to Yes produces a change of +2, while changing from Yes to No produces a change of -2. This 15 percent change could be produced in a number of ways; it merely requires that the difference between the changes produced by positive treatments and the changes produced by negative treatments amount to a total change in average score of 0.15.
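The scoring and the difference between differences can be written out as a short sketch. The mapping of answers to -1, 0, and +1 follows from the changes listed above. The function names and the two small groups of responses are invented for illustration and are not data from the study.

```python
# Answer scores implied by the text: Undecided -> Yes is +1, No -> Yes is +2.
SCORE = {"No": -1, "Undecided": 0, "Yes": 1}

def change(before, after):
    """Change in one subject's score on one question."""
    return SCORE[after] - SCORE[before]

def mean_change(pairs):
    """Average change over a list of (before, after) answers."""
    return sum(change(b, a) for b, a in pairs) / len(pairs)

def difference_in_differences(positive_pairs, negative_pairs):
    """Mean change under positive treatments minus mean change under negative."""
    return mean_change(positive_pairs) - mean_change(negative_pairs)

# Invented illustration, not data from the study: 20 subjects per group.
positive_group = [("Undecided", "Yes")] * 2 + [("Undecided", "Undecided")] * 18
negative_group = [("Undecided", "No")] * 1 + [("Yes", "Yes")] * 19

did = difference_in_differences(positive_group, negative_group)
print(round(did, 2))  # prints 0.15
```

In this made-up case the positive group moves by +0.10 on average and the negative group by -0.05, so the difference between differences is 0.15, the threshold described above.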

Discussion of Implications

We believe that the levels of change we observed on some questions could be politically significant: Any change that caused 15 percent of undecided voters to move in sync to the same decided position, or any change that caused 7.5 percent of decided voters to switch their opinions in the same direction, might well be politically significant as well as numerically significant.
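The arithmetic behind these two figures is simple. Here is a minimal sketch, assuming the -1/0/+1 scoring of No/Undecided/Yes and assuming that everyone else's answer stays the same. The function name average_shift is ours.

```python
def average_shift(fraction_moving, points_per_move):
    """Change in the average score when a fraction of subjects each move
    by the same number of points and everyone else stays put."""
    return fraction_moving * points_per_move

# 15 percent of subjects move from Undecided to a decided position (1 point each).
undecided_case = average_shift(0.15, 1)

# 7.5 percent of subjects switch from one decided position to the other (2 points each).
decided_case = average_shift(0.075, 2)

print(undecided_case, decided_case)
```

Both cases shift the average score by 0.15, which is why either would meet the same numerical threshold.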

Manipulation of search results happens all the time... that's what SEO is all about. We are not the first to note the potential to use manipulation of presentation order as a political weapon or as a commercial tool. Search engine vendors not only do not prevent manipulation, they help users perform manipulations. We do not know why they do this, but a possible explanation is that this helps vendors deflect attention away from problems with search, sort of a "don't blame us when you don't show up where you think you should, blame yourself for failing to engage in SEO as well as your competitors!"

Search engine vendors seem to be held to lower standards of accountability than journalists or public opinion pollsters. Perhaps this is because search engine vendors' lack of transparency makes manipulation harder to detect and to eliminate. In an online post I noted that the approximate number of search results returned in a Google search varied in strange and inexplicable ways over time, and that the numbers clearly were not correct. For some people a query requesting a subset of their results (those results not referencing antitrust, for example) might return a significantly larger set than the one from which this subset was selected. Even meaningless queries might produce significantly larger result sets than the meaningful queries of which they should have been subsets (searching for references to Google's Chairman <<"Eric Schmidt">> produced about 1.6 million results, while searching for references to Schmidt that did not also include references to weimaraners <<"Eric Schmidt" -weimaraners>> led to more than 16 million results). The number of results returned is often used as a crude measure of social impact or social significance, and I find the lack of accuracy in these estimates disconcerting, especially since nearly instantaneous reductions in apparent significance can occur without explanation or recourse (reductions of more than 90 percent are documented in the article).
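The inconsistency is easy to state as a check: a query that excludes a term can only return a subset of the original results, so its count should never be larger. The sketch below applies that check to the approximate counts quoted above. The function name is_consistent is ours, and the counts are the rounded figures from the text, not fresh measurements.

```python
def is_consistent(base_count, restricted_count):
    """A restricted query (base query minus an excluded term) selects a
    subset, so its result count should not exceed the base count."""
    return restricted_count <= base_count

base = 1_600_000         # "Eric Schmidt": about 1.6 million results
restricted = 16_000_000  # "Eric Schmidt" -weimaraners: more than 16 million

print(is_consistent(base, restricted))  # prints False
print(restricted / base)                # roughly a tenfold overcount
```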

Google's search engine optimization guru Matt Cutts responded to my post almost immediately, informing me that "our results estimates are just that -- estimates. In theory we could spend cycles on that aspect of our system, but in practice we have a lot of other things to work on, and more accurate results estimates is lower on the list than lots of other things." I do wonder why we are so accepting. Would a TV producer wonder why the number of viewers Nielsen reported for his show had suddenly dropped from 14.3 million to 1.10 million? Would this unexplained change affect the economic prospects for his show? Would anyone tolerate being told that "fixing the numbers we report is not a priority at Nielsen since it would consume resources better used elsewhere?" And yet we trust search engine vendors and their results, although none of us really knows how results are determined or precisely what causes them to change.

Conclusions

Our tentative conclusions on this are (1) manipulation of opinion through manipulation of search is complicated but possible; (2) manipulation of search is complicated but possible as well, and indeed it occurs; (3) we do not know enough about search vendors' algorithms to determine or monitor manipulation; and (4) concentration of search in the hands of one or two search engine vendors, with their lack of algorithmic transparency, and potentially with their own political agendas, should be of great concern to a free democracy. We are not suggesting that anything illegal, unethical, or subversive of our freedoms has occurred. We are suggesting, as we have before, that there is now a real possibility that near monopoly of search for content can trump diversity of source in that content. At a time when major publications like the Wall Street Journal are reporting on bias in Google search, and the New York Times is reporting on the dangers that may be created through Google's misrepresentation of search results, perhaps the power of a monopoly search engine provider should be reexamined.

This is not a subject that many citizens take seriously yet; one reader commented on a previous post by writing sarcastically, "god forbid everybody want to use the same search engine! Now we can't have that!" If we are correct in our arguments here, perhaps concentration of search without regulation is a greater problem than generally perceived, and perhaps it is time for the FCC to examine the First Amendment's constitutional guarantees of the right to speak, to hear, and to be heard, in light of the current concentration in search and the potential for the abuse of that concentration.