Sunday, July 22, 2018

The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings

How technology can make us all smarter, not dumber

We thought social media and computer-mediated communications technologies would make us smarter, but recent experience with Facebook, Twitter, and others suggests they are now making us much dumber. We face a major and fundamental crisis. Civilization seems to be descending into a battle of increasingly polarized factions who cannot understand or accept one another, fueled by filter bubbles and echo chambers.

Many have begun to focus serious attention on this problem, but it seems we are fighting the last war -- not using tools that match the task.

A recent conference, "Fake News Horror Show," convened people focused on these issues from government, academia, and industry, and one of the issues was who decides what is "fake news," how, and on what basis. There are many efforts at fact checking, and at certification or rating of reputable vs. disreputable sources -- but also recognition that such efforts can be crippled by circularity: who is credible enough in the eyes of diverse communities of interest to escape the charge of "fake news" themselves?

I raised two points at that conference. This post expands on the first point and shows how it provides a basis for addressing the second:
  • The core issue is one of trust and authority -- it is hard to get consistent agreement in any broad population on who should be trusted or taken as an authority, no matter what their established credentials or reputation. Who decides what is fake news? What I suggested is that this is the same problem that has been made manageable by getting smarter about the wisdom of crowds -- much as Google's PageRank algorithm beat out Yahoo and AltaVista at making search engines effective at finding content that is relevant and useful.

    As explained further below, the essence of the method is to "rate the raters" -- and to weight those ratings accordingly. Working at Web scale, no rater's authority can be relied on without drawing on the judgement of the crowd. Furthermore, simple equal voting does not fully reflect the wisdom of the crowd -- there is deeper wisdom about those votes to be drawn from the crowd.

    Some of the crowd are more equal than others. Deciding who is more equal, and whose vote should be weighted more heavily can be determined by how people rate the raters -- and how those raters are rated -- and so on. Those ratings are not universal, but depend on the context: the domain and the community -- and the current intent or task of the user. Each of us wants to see what is most relevant, useful, appealing, or eye-opening -- for us -- and perhaps with different balances at different times. Computer intelligence can distill those recursive, context-dependent ratings, to augment human wisdom.
  • A major complicating issue is that of biased assimilation. The perverse truth seems to be that "balanced information may actually inflame extreme views." This is all too clear in the mirror worlds of pro-Trump and anti-Trump factions and their media favorites like Fox, CNN, and MSNBC. Each side thinks the other is unhinged or even evil, and layers a vicious cycle of distrust around anything they say. It seems one of the few promising counters to this vicious cycle is what Cass Sunstein referred to as surprising validators: people one usually gives credence to, but who suggest one's view on a particular issue might be wrong. A recent example of a surprising validator was the "Confession of an Anti-GMO Activist." This item is readily identifiable as a "turncoat" opinion that might be influential for many, but smart algorithms can find similar items that are more subtle, and tied to less prominent people who may be known and respected by a particular user. There is an opportunity for electronic media services to exploit this insight that "what matters most may be not what is said, but who, exactly, is saying it."
These are themes I have been thinking and writing about on and off for decades. This growing crisis, as highlighted by the Fake News Horror Show conference, spurred me to write this outline for a broad architecture (and specific methods) for addressing these issues. Discussions at that event led to my invitation to an upcoming workshop hosted by the Global Engagement Center (a US State Department unit) focused on "technologies for use against foreign propaganda, disinformation, and radicalization to violence." This post is offered to contribute to those efforts.

Beyond that urgent focus, this architecture has relevance to the broader improvement of social media and other collaborative systems. Some key themes:
  • Binary, black or white thinking is easy and natural, but humans are capable of dealing with the fact that reality is nuanced in many shades of gray, in many dimensions. Our electronic media can augment that capability.
  • Instead, our most widely used social media now foster simplistic, binary thinking.
  • Simple strategies (analogous to those proven and continually refined in Google's search engine) enable our social media systems to recognize more of the underlying nuance, and bring it to our attention in far more effective ways.
  • We can apply an architecture that draws on some core structures and methods to enable intelligent systems to better augment human intelligence, and to do that in ways tuned to the needs of a diversity of people -- from different schools of thought and with different levels of intelligence, education, and attention.
  • Doing this can not only better expose truly fake news for what it is, but can make us smarter and more aware and reflective of nuance. 
  • This can not only guide our attention toward quality, but can also expose us to the surprising validators and other forms of serendipity needed to escape our filter bubbles.
Where I am coming from

I was first exposed to early forms of augmented intelligence and hypermedia in 1969 (notably Nelson and Engelbart), and to collaborative systems in 1971 (notably Turoff). That set a broad theme for my work. After varied roles in IT and media technology, I became an inventor, and one of my patent applications outlined a collaborative system for social development of inventions and other ideas (in 2002-3). While my specific business objective proved elusive (as the world of patents changed), what I described was a general architecture for collaborative development of ideas that has very wide applicability ("ideas" include news stories, social media posts, and "likes"). That is obviously more timely now than ever. I had written on this blog about some specific aspects of those ideas in 2012: "Filtering for Serendipity -- Extremism, 'Filter Bubbles' and 'Surprising Validators.'" To encourage use of those ideas, I released that patent filing into the public domain in 2016.

Here, I take a first shot at a broad description of these strategies that is intended to be more readable and relevant to our current crisis than the legalese of the patent application. As a supplement to this, a copy of that patent document with highlighting of the portions that remain most relevant is posted online.*

Of course some of these ideas are more readily applied than others. But the goal of an architecture is to provide a vision and a framework to build on. Considering the broad scope of what might be done over time is the best way to be sure that we do the best that we can do at any point in time. We can then adjust and improve on that to build toward still-better solutions.

Augmenting the wisdom of crowds

Civilization has risen because of our human skills: to cooperate, to learn from one another, and to coalesce on wisdom and resist folly -- difficult as it may often be to distinguish which is which.

Life is complex, and things are rarely black or white. The Tao symbolizes the realization that everything contains its opposite -- Ted Nelson put it that "everything is deeply intertwingled," and conceived of hypertext as a way to reflect that. But throughout human history this nuanced intertwingling has remained challenging for people to grasp.

Behavioral psychology has elucidated the mechanisms behind our difficulty. We are capable of deep and subtle rational thought (Kahneman's System 2, "thinking slow"), but we are pragmatic and lazy, and prefer the more instinctive, quick, and easy path (System 1, "thinking fast" -- a mode that offers great survival value when faced with urgent decisions). Only reluctantly do we think more deeply. The thinking fast of System 1 favors biased assimilation, with its reliance on cognitive ease, quick reactions, and emotional and tribal appeal, rather than rationality.

Augmenting human intellect

For over half a century, a seminal dream of computer technology has been "augmenting human intellect" based on "man-computer symbiosis." The developers of our augmentation tools and our social media believed in their power to enhance community and wisdom -- but we failed to realize how easily our systems can reduce us to the lowest common denominator if we do not apply consistent and coherent measures to ensure that they augment the intelligence they automate. A number of early collaborative Web services recognized that some contributors should be more equal than others (for example, Slashdot, with its "karma" reputation system). Simple reputation systems have also proven important for eBay and other market services. However, the social media that came to dominate broader society failed to realize how important that is, and were motivated to "move fast and break things" in a rush to scale and profit.

Now, we are trying to clean up the broken mess of this Frankenstein's monster, to find ways to flag "fake news" in its various harmful forms. But we still seem not to be applying the seminal work in this field. That failure has made our use of the wisdom of crowds stupid to the point of catastrophe. Instead of augmenting our intellect as Engelbart proposed, we are de-augmenting it. People see what is popular, read a headline without reading the full story, jump to conclusions and "like" it, making it more popular, so more people see it. The headlines increasingly become clickbait that distorts the real story. Influence shifts from ideas to memes. This is clearly a vicious cycle -- one that the social media services have little economic incentive to change -- polarization increases engagement, which sells more ads. We urgently need fundamental changes to these systems.

Crowdsourced, domain-specific authorities -- rating the raters -- much like Google

Raw forms of the wisdom of crowds look to "votes" from the crowd, weight them equally, and select the most popular or "liked" items (or a simple average of all votes). This has been done for forecasting, for citation analysis of academic papers, and in early computer searching. But it becomes apparent that this can lead to the lowest common denominator of wisdom, and is easily manipulated with fraudulent votes. Of course we can restrict this to curated "expert" opinion, but then we lose the wisdom of the larger crowd (including its ability to rapidly sense early signs of change).

It was learned that better results can be obtained by weighting votes based on authority, as done in Google's PageRank algorithm, so that votes with higher authority count more heavily (while still using the full crowd to balance the effects of supposed authorities who might be wrong). In academic papers, it was realized that it matters which journal cites an article (now that many low-quality pay-to-publish journals have proliferated).

In Google's search algorithm (dating from 1996, and continuously refined), it was realized that links from a widely-linked-to Web site should be weighted higher in authority than links from another that has few links in to it. The algorithm became recursive: PageRank (used to rank the top search results) depends on how many direct links come in, weighted by a second level factor of how many sites link in to those sites, and weighted in turn by a third level factor of how many of those have many inward links, and so on. Related refinements partitioned these rankings by subject domain, so that authority might be high in one domain, but not in others. The details of how many levels of recursion and how the weighting is done are constantly tuned by Google, but this basic rate the raters strategy is the foundation for Google's continuing success, even as it is now enhanced with many other "signals" in a continually adaptive way. (These include scoring based on analysis of page content and format to weight sites that seem to be legitimate above those that seem to be spam or link farms.)
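
To make the recursion concrete, here is a minimal sketch in Python of how link-based authority can be computed iteratively, so that "votes" from well-linked-to sites count for more. The toy graph, damping factor, and iteration count are my own illustrative assumptions, not Google's actual parameters.

```python
# A minimal sketch of recursive link-based authority (PageRank-style).
# The graph, damping factor, and iteration count are illustrative assumptions.

def page_rank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)  # each link is a weighted "vote"
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Toy example: A is widely linked to, so it ranks highest and its own "votes" carry more weight.
toy_links = {"A": ["B"], "B": ["A", "C"], "C": ["A"], "D": ["A", "B"]}
print(sorted(page_rank(toy_links).items(), key=lambda kv: -kv[1]))
```

The same shape of computation -- seed everything equally, then repeatedly redistribute weight according to who "votes" for whom -- underlies the more general rate the raters methods proposed below.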

Proposed methods and architecture

My patent disclosure explains much the same rate the raters strategy (call it RateRank?) as applicable to ranking items of nearly any kind, in a richly nuanced, open, social context for augmenting the wisdom of crowds. (It is a strategy that can itself be adapted and refined by augmenting the wisdom of crowds -- another case of "eat your own dog food!")

The core architecture works in terms of three major dimensions that apply to a full range of information systems and services:
  1. Items. These can be any kind of information item, including contribution items (such as news stories, blog posts, or social media posts, or even books or videos, or collections of items), comment/analysis items (including social media comments on other items), and rating/feedback items (including likes and retweets, as well as comments that imply a rating of another item).
  2. Participants (and communities and sub-communities of participants). These are individuals, who may or may not have specific roles (including submitters, commenters, raters, and special roles such as experts, moderators, or administrators). In social media systems, these might include people (with verified IDs or anonymous), collections of people in the form of businesses, commercial advertisers, political advertisers, and other organizations. (Special rules and restrictions might apply to non-human participants, including bots and corporate or state actors.) Communities of participants might be explicit (with controlled membership), such as Facebook groups, or implicit (and fuzzy), based on closeness of social graph relationships and domain interests. These might include communities of interest, practice, geographic locality, or degree of social graph closeness.
  3. Domains (and sub-domains). These may be subject-matter domains in various dimensions. Domains may overlap or cross-cut. (For example issues about GMOs might involve cross-cutting scientific, business, governmental/regulatory, and political domains.)
An important aspect of generality in this architecture is that:
  • Any item or participant can be rated (explicitly or implicitly)
  • Any item can contain one or more ratings of other items or participants (and of itself)
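
As a rough illustration only -- the names and fields below are my own assumptions, not a specification from the patent filing -- the three dimensions just listed, and the ratings that connect them, might be represented along these lines:

```python
# A minimal data-model sketch of items, participants, domains, and ratings.
# All class names and fields are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class Participant:
    id: str
    verified: bool = False
    communities: Set[str] = field(default_factory=set)   # e.g. {"nra-members", "local-pta"}

@dataclass
class Item:
    id: str
    author_id: str
    kind: str                                             # "contribution", "comment", or "rating"
    domains: Set[str] = field(default_factory=set)        # e.g. {"science", "politics"}

@dataclass
class Rating:
    rater_id: str
    target_id: str                                        # an Item id or a Participant id
    value: float                                          # e.g. -1.0 (down) to +1.0 (up)
    domain: Optional[str] = None                          # the context in which the rating applies
    explicit: bool = True                                 # False for likes, shares, dwell time, etc.
```
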
It should be understood that Google's algorithm is a specialized instance of such an architecture -- one where all the items are Web pages, and all links between Web pages are implicit ratings of the link destination by the link source. The key element of man-computer symbiosis here is that the decision to place a link is assumed to be a "rating" decision of a human Webmaster or author (a vote for the destination, by the source, from the source context), but the analysis and weighting of those links (votes) is algorithmic. Much as could be applied to fake news, Google has developed finely tuned algorithms for detecting the multitudes of "link farms" whose bots seek to fraudulently mimic this human intelligence, and for downgrading the weighting of such links.

How the augmenting works

The heart of the method is a fully adaptive process that rates the raters recursively, using explicit and implicit ratings of items and raters (and potentially even the algorithms of the system itself). Rate the raters, rate those who rate the raters, and so on. Weight the ratings according to the rater's reputation (in context), so the wisest members of the crowd, in the current context, as judged by the crowd, have the most say. "Wisest in context" means wisest in the domains and communities that are most relevant to the current usage context. But still, all of the crowd should be considered at some level.
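
Here is a minimal sketch of that recursive weighting (the update rule, seed values, and iteration count are illustrative assumptions, not the method of the patent filing): raters are repeatedly re-scored according to who rates them, and item scores are then weighted by the resulting reputations.

```python
# A minimal sketch of rate-the-raters: reputations seeded equally, then recomputed
# from who rates the raters; item votes are weighted by the resulting reputations.
# Update rule, seed, and iteration count are illustrative assumptions.

def rater_reputation(rater_ratings, iterations=20, seed=0.5):
    """rater_ratings: list of (rater, target_rater, value) tuples with value in [0, 1]."""
    raters = {r for r, _, _ in rater_ratings} | {t for _, t, _ in rater_ratings}
    rep = {r: seed for r in raters}
    for _ in range(iterations):
        new_rep = {}
        for person in raters:
            received = [(rep[r], v) for r, t, v in rater_ratings if t == person]
            weight = sum(w for w, _ in received)
            if weight > 0:
                new_rep[person] = sum(w * v for w, v in received) / weight
            else:
                new_rep[person] = seed              # unrated raters keep the neutral seed
        rep = new_rep
    return rep

def item_scores(item_ratings, reputation):
    """item_ratings: list of (rater, item, value); each vote is weighted by rater reputation."""
    totals = {}
    for rater, item, value in item_ratings:
        w = reputation.get(rater, 0.1)              # unknown raters get a small, nonzero say
        num, den = totals.get(item, (0.0, 0.0))
        totals[item] = (num + w * value, den + w)
    return {item: num / den for item, (num, den) in totals.items()}

rep = rater_reputation([("alice", "bob", 0.9), ("bob", "alice", 0.8), ("carol", "bob", 0.2)])
print(item_scores([("alice", "story-1", 1.0), ("bob", "story-1", 0.3)], rep))
```

In a fuller system the same loop would be partitioned by domain and community, blended with implicit signals, and continually tuned -- but the core shape is this simple.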

This causes good items and raters (and algorithms) to bubble up into prominence, and less well-rated ones to sink from prominence. This process would rarely be binary black and white. Highly rated items or participants can lose that rating over time, and in other contexts. Poorly rated items or participants might never be removed (except for extreme abuse) but simply downgraded (to contribute what small weight is warranted, especially if many agree on a contrary view) and can remain accessible with digging, when desired. (As noted below, our social media systems have become essential utilities, and exclusion of people or ideas on the fringe is at odds with the value of free speech in our open society.) The rules and algorithms could be continuously learning and adaptive, using a hybrid of machine learning and human oversight. 

Attention management systems can ensure that the best items tend to be made most visible, and the worst least visible, but the system should adjust those rankings to the context of what is known about the user in general, and what is inferred about what the user is seeking at a given time -- with options for explicit overrides (much as Google adjusts its search rankings to the user and their current query patterns). It should be noted that Facebook and others already use some similar methods, but unfortunately these are oriented to maximizing an intensity of "engagement" that optimizes for the company's ad sale opportunities, rather than to a quality of content and engagement for the user. We need the sophistication of algorithms, data science, and machine learning applied to quality for users, not just engagement for advertisers and those who would manipulate us.

Participants might be imputed high authority in one domain, or in one community, but lower in others. Movie stars might outrank Nobel prize-winners when considering a topic in the arts or even in social awareness, but not in economic theory. NRA members might outrank gun control advocates for members of an NRA community, but not for non-members of that community.
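
In code terms, that context dependence can be as simple as keying reputation by participant and domain rather than keeping one global score -- a minimal sketch, with invented names and numbers:

```python
# A minimal sketch of context-dependent authority: reputation is kept per
# (participant, domain) rather than as one global score. Names and numbers
# are illustrative assumptions.

reputation = {
    ("movie_star", "arts"):           0.9,
    ("movie_star", "economics"):      0.2,
    ("nobel_laureate", "economics"):  0.95,
    ("nobel_laureate", "arts"):       0.4,
}

def authority(participant, domain, default=0.3):
    """Look up authority in the relevant domain, falling back to a neutral default."""
    return reputation.get((participant, domain), default)

print(authority("movie_star", "arts"), authority("movie_star", "economics"))
```
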

Openness is a key enabling feature: these algorithms should not be monolithic, opaque, and controlled by any one system, but should be flexible, transparent, and adaptive -- and depend on user task/context/desires/skill at any given time. Some users may choose simple default processes and behaviors, but others could be enabled to mix and match alternative ranking and filtering processes, and to apply deeper levels of analytics to understand why the system is presenting a given view. Users should be able to modify the view they see as they may desire, either by changing parameters or swapping alternative algorithms. Such alternative algorithms could be from a single provider, or alternative sources in an open marketspace, or "roll your own."
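
One way to picture that openness: ranking algorithms as pluggable functions that a user (or third party) can swap in, rather than a single opaque built-in. The registry, function names, and item fields below are illustrative assumptions, not a proposed interface.

```python
# A minimal sketch of swappable ranking algorithms behind a user-selectable registry.
# Registry, names, and item fields are illustrative assumptions.

RANKERS = {}

def ranker(name):
    """Register a ranking function under a user-selectable name."""
    def register(fn):
        RANKERS[name] = fn
        return fn
    return register

@ranker("most_recent")
def most_recent(items, user):
    return sorted(items, key=lambda i: i["timestamp"], reverse=True)

@ranker("crowd_quality")
def crowd_quality(items, user):
    return sorted(items, key=lambda i: i["weighted_score"], reverse=True)

def build_feed(items, user, choice="crowd_quality"):
    """Let the user pick (or supply) the algorithm that shapes their view."""
    return RANKERS[choice](items, user)

items = [
    {"id": 1, "timestamp": 100, "weighted_score": 0.9},
    {"id": 2, "timestamp": 200, "weighted_score": 0.4},
]
print([i["id"] for i in build_feed(items, user=None, choice="most_recent")])
```
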

Within this framework, key design factors include how these processes are managed to work in concert, and how each of them behaves for a given user, at a given time, depending on task/context/desires/skill (including the level of effort a user wishes to put in):
  • The core rate the raters process, based on both implicit and explicit ratings, weighted by authority as assessed by other raters (as themselves weighted based on ratings by others), with selective levels of partitioning by community and domain. Consideration of formal and institutional authority can be applied to partially balance crowdsourced authority. Dynamic selection of weighting and balancing methods might depend on user task/context/desires.
  • Attention tools that filter irrelevant items and highlight relevant ones, so that different Facebook or Twitter users could get different views of their feed, and change those views as desired.
  • Consideration with regard to which communities and sub-communities most contribute to rankings for specific items at specific times.  Communities might have graded openness (in the form of selectively permeable boundaries) to avoid groupthink and cross-fertilize effectively. This could be applied by using insider/outsider thresholds to manage separation/openness.
  • Consideration with regard to domains and sub-domains to maximize the quality and relevance of ratings, authority, and attention, and to avoid groupthink and cross-fertilize effectively.
  • Consideration of explicit vs. implicit ratings. While explicit ratings may provide the strongest and most nuanced information, implicit ratings may be far more readily available, thus representing a larger crowd, and so may have the greatest value in augmenting the wisdom of the crowd. Just as with search and ad targeting, implicit ratings can include subtle factors, such as measures of attention, sentiment, emotion, and other behaviors. (A simple blending of explicit and implicit signals is sketched just after this list.)
  • Consideration of verified vs. unverified vs. anonymous participants. It may be desirable to allow a range of levels, with weighting such that anonymous participants start with little or no reputation, or even a negative one. Bots might be banned, or given very poor reputation.
  • Open creation, selection and use of alternative tools for filtering, discovery, attention/alerting, ranking, and analytics depending on user task/context/desires. This kind of openness can stimulate development and testing of creative alternatives and enable market-based selection of the best-suited tools.
  • Valuation, crowdfunding, recognition, publicity, and other non-monetary incentives can also be used to encourage productive and meaningful participation, to bring out the best of the crowd.
(As expanded on below, all of this should be done with transparency and user control.)
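
As a concrete (and deliberately simplified) illustration of how some of these factors might combine -- the specific weights here are my assumptions, not recommendations -- explicit ratings can count more per vote than implicit signals, and each vote can be discounted by the participant's verification level:

```python
# A minimal sketch of blending explicit vs. implicit signals with participant
# verification levels. The specific weights are illustrative assumptions.

SIGNAL_WEIGHT = {"explicit": 1.0, "implicit": 0.3}
IDENTITY_WEIGHT = {"verified": 1.0, "unverified": 0.5, "anonymous": 0.2, "bot": 0.0}

def blended_score(votes):
    """votes: list of dicts like {"value": 0.8, "signal": "explicit", "identity": "verified"}."""
    num = den = 0.0
    for v in votes:
        w = SIGNAL_WEIGHT[v["signal"]] * IDENTITY_WEIGHT[v["identity"]]
        num += w * v["value"]
        den += w
    return num / den if den else None

votes = [
    {"value": 0.9, "signal": "explicit", "identity": "verified"},
    {"value": 0.2, "signal": "implicit", "identity": "anonymous"},
    {"value": 1.0, "signal": "implicit", "identity": "bot"},      # bots get zero weight
]
print(blended_score(votes))
```
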

[Update 10/10/18:] This subsequent post: In the War on Fake News, All of Us are Soldiers, Already!, may help make this more concrete and clarify why it is badly needed.

Applying this to social media -- fake news, community standards, polarization, and serendipity

A core objective is to augment the wisdom of crowds -- to benefit from the crowd to filter out the irrelevant or poor quality -- but to have augmented intelligence in determining relevance and quality in a dynamically nuanced way that reduces the de-augmenting effect of echo chambers and filter bubbles.

Using these methods, true fake news, which is clearly dishonest and created by known bad actors, can be readily filtered out, with low risk of blocking good-faith contrarian perspectives from quality sources. Such fake news can readily be distinguished from legitimate partisan spin (point and counterpoint), from legitimate criticism (a news photo of a Nazi sign) or historically important news items (the Vietnam "terror of war" photo), and from legitimate humor or satire.

A dilemma that has become very apparent in our social media relates to "community standards" for managing people and items that are "objectionable." Since our social media systems have become essential utilities, exclusion of people or ideas on the fringe is at odds with the rights of free speech in our open society. Jessica Lessin recently commented on Facebook's "clumsy" struggles with content moderation, and on the calls of some to ban people and items. She observes that Facebook wants the community to determine the rules, but also is pressed to placate regulators -- and observes that "getting two billion people to write your rules isn’t very practical."

"Getting two billion people to write your rules" is just what the augmented wisdom of crowds does seek to make practical -- and more effective than any other strategy. The rules would rarely ban people (real humans) or items, but simply limit their visibility beyond the participants and communities that choose to accept such people or items. Such "objectionable" people have no right to require they be granted wide exposure, and, at the same time, those who find some people or materials objectionable rarely have a right to insist on an absolute and total ban.

This ties back to the converse issue, the seeking of surprising validators and serendipity described in my 2012 post. By understanding the items and participants, how they are rated by whom, and how they fit into communities, social graphs, and domains, highly personalized attention management tools can minimize exposure to what is truly objectionable, but can find and present just the right surprising validators for each individual user (at times when they might be receptive). Similarly, these tools can custom-choose serendipitous items from other communities and domains that would otherwise be missed.
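
A minimal sketch of that idea (the scoring rule and data are illustrative assumptions): prefer items whose stance disagrees with the user's prior on a topic, but whose author the user already trusts -- exactly the profile of a surprising validator.

```python
# A minimal sketch of surfacing "surprising validators": score items by
# (user's trust in the author) x (how much the item disagrees with the user's prior).
# The scoring rule and data are illustrative assumptions.

def surprising_validators(items, user_trust, user_stance, top_n=3):
    """items: list of dicts with "author", "topic", and "stance" in [-1, 1].
    user_trust: dict author -> trust in [0, 1]; user_stance: dict topic -> stance in [-1, 1]."""
    def score(item):
        trust = user_trust.get(item["author"], 0.0)
        disagreement = abs(item["stance"] - user_stance.get(item["topic"], 0.0)) / 2.0
        return trust * disagreement          # high only when trusted AND contrarian
    return sorted(items, key=score, reverse=True)[:top_n]

items = [
    {"author": "familiar_columnist", "topic": "gmo", "stance": +0.8},
    {"author": "unknown_blog",       "topic": "gmo", "stance": +0.9},
]
print(surprising_validators(items, {"familiar_columnist": 0.9}, {"gmo": -0.7}))
```
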

This is an area where advanced augmentation of crowd wisdom can become uniquely powerful. The mainstream will become more aware and accepting of fringe views and materials (and might set aside specific times for exploring such items), and the extremes will have the freedom to choose (1) to make their case in a way that others can accept as unpleasant but not unreasonable or antisocial, or (2) to be placed beyond the pale of broader society: hard to find, but still short of total exclusion. Again, a high degree of customization can be applied (and varied with changing context). Those who want walled gardens can create them -- with windows and gates that open where and when desired.

Innovation, openness, transparency, and privacy

Of course the key issues are how do we apply quick fixes for our current crisis, how do we evolve toward better media ecosystems, and how do we balance privacy and transparency. I generally advocate for openness and transparency. 

The Internet and the early Web were built on openness and transparency, which fueled a huge burst of innovation.  (Just as I refer to my 2002-3 patent filing, one can make a broad argument that many of the most important ideas of digital society emerged around the time of that "dot-com" era or before.) Open, interoperable systems (both Web 1.0 and Web 2.0) enabled a thousand flowers to bloom. There are also similar lessons from systems for financial market data (one of the first great data market ecologies) fueled by open access to market data from trading exchanges, and to competing, interoperable distribution, analytics, and presentation services. The patent filing I describe here (and others of mine) build on similar openness and interoperability. 

Now that we have veered down a path of closed, monopolistic walled gardens that have gained great power, we face difficult questions of how to manage them for the public good. I suggest we probably need a mix of all five of the following. Determining just how to do that will be challenging. (Some suggestions related to each of these follow.)
  1. Can we motivate monopolies like Facebook to voluntarily shift to better serve us? Ideally, that would be the fastest solution, since they have full power to introduce such methods (and the skills to do so are much the same as the skills they now apply for targeting ads).
  2. Can we independently layer needed functions on top of such services (or in competition with them)? The questions are how to interface to existing services (with or without cooperation) and how to gain critical mass. Even at more limited scale, such secondary systems might provide augmented wisdom that could be fed back into the dominant systems, such as to help flag harmful items.
  3. Should we mandate regulatory controls, accepting these systems as natural monopolies to be regulated as such (much like early days of regulating the Bell System monopoly on telephonic media platforms)? There seem to be strong arguments for at least some of this, but being smart about it will be a challenge.
  4. Should we open them up or break portions of them apart (much like the later days of regulating the Bell System)? Here, too, there seem to be strong arguments for at least some of this, but being smart about it will be a challenge.
  5. Can we use regulation to force the monopolies to better serve their users (and society) by forcing changes in their business model (with incentives to serve users rather than advertisers)? I suggest that may be one of the most feasible and effective levers we can apply.
My suggestions about those alternatives:
A transparent society?

A central (and increasingly urgent) dilemma relates to privacy. Some of my suggestions for openness and transparency in our social media and similar collaborative systems could potentially conflict with privacy concerns. We may have to choose between strict privacy and smart, effective systems that create immense new value for users and society. We need to think more deeply about which objectives matter, and how to get the best mix. Privacy is an important human issue, but its role in our world of Big Data and AI is changing:
  • As David Brin suggested in The Transparent Society, the question of privacy is not just what is known about us, but who controls that information. Brin suggests the greatest danger is that authoritarian governments will control information and use it to control us (as China is increasingly on track to do).
  • We now face a similar concern with monopolies that have taken on quasi-governmental roles -- they seem to be answerable to no one, and are motivated not to serve their users, but to manipulate us to serve the advertisers from whom they profit. (And there are the advertisers themselves.)
  • Brin suggested our technology will return us to the more transparent human norms of the village -- everyone knew one another's secrets, but that created a balance of power where all but the most antisocial secrets were largely ignored and accepted. We seem to be well on the way to accepting less privacy, as long as our information is not abused.
  • I suggest we will gain the most by moving in the direction of openness and transparency -- with care to protect the aspects of privacy that really need protection (by managing well-targeted constraints on who has access to what, under what controls). 
That takes us back to the genius of man-computer symbiosis -- AI and machine learning thrive on big data. Locking up or siloing big data can cripple our ability to augment the wisdom of crowds and leave us at the mercy of the governments or businesses that do have our data. We need to find a wise middle ground of openness that fuels augmented intelligence and market forces -- in which service providers are driven by customer demand and desires, and constrained only by the precision-crafted privacy protections that are truly needed.

-----------------------

See the Selected Items tab for related posts 
[Update 12/30/19, 12/14/21: That list replaces the shorter list originally posted here.]

Supportive References for Augmenting the Wisdom of Crowds and The Tao of Truth
------

*Appendix -- My patent disclosure document (now in public domain)

This post draws on the architecture and methods described in detail in my US patent application entitled "Method and Apparatus for an Idea Adoption Marketplace" (10/692,974), which was published 9/17/04. It was filed 10/24/03, formalizing a provisional filing on 10/24/02. I released this material into the public domain on 12/19/16. I retain no patent rights in it, and it is open to all who can benefit from it.

A copy of that application with highlighting of portions most relevant to current needs is now online. While this is written in the hard-to-read legalese style required for patent applications, it is hoped that the highlighted sections are helpful to those with interest. (A duplicate copy is here.)

The highlighted sections present a broad architecture that now seems more timely than ever, and provides an extensible framework for far better social media -- and important aspects of digital democracy in general.

For those who are curious, there is a brief write-up on the original motivation of this work.

(This patent application was cited by 183 other patent applications (as of 12/21/21), an indicator of its contribution. 21 of those citations were by Facebook.)