Friday, November 23, 2018

"As We Will Think" -- The Legacy of Ted Nelson, Original Visionary of the Web

From Nelson, As We Will Think (1972 version)
The vision of the Web in 1945

From a recent email from the Editor of The Atlantic:
In July 1945, Vannevar Bush, then the director of the U.S. Office of Scientific Research and Development—the military’s R&D lab, the predecessor to DARPA—published an essay in The Atlantic that would become one of the seminal pieces of technology literature of the 20th century.
Entitled “As We May Think,” the essay laid out a vision for a new kind of relationship between human and machine. Bush introduced an idea he called the memex: a sprawling, shared store of humankind’s knowledge that could be used for social good, not destruction. In the following years, preeminent technologists—including Doug Engelbart, whose work eventually led to the invention of the mouse, the word processor, and hyperlinks; and Sir Tim Berners-Lee, inventor of the World Wide Web—cited “As We May Think” as inspiration for their work.
The seminal re-visioning in the 1960's

It seems ironic that even the Atlantic seems to be neglecting Ted Nelson's role as an equally seminal visionary of the Web -- especially given that one of his early works was an explicit call to re-center on and realize Bush's vision, a work that plays off Bush's title as "As We Will Think."

I tweeted back to the Atlantic:
Some further tweets raised the question whether that was online somewhere, and it seems to be only in The Wayback Machine (archive.org), as a 1972 version -- and a poor copy at that, missing the original figures.

It happens that I have a better copy of the 1972 version, as well as another version that is labelled as being from 1968. So I am posting scans of both versions online (links below). I include some comments on provenance and on my recent email interchange with Nelson below (both of which lead me to believe the 1968 date is correct). But first...

Why Nelson matters 

A fuller explanation of  why Nelson matters is in my post from a few years ago, Digital Camelot - The Once and Future Web of Engelbart and Nelson, but here I caption its core message:

If you care about modern culture and how technology is shaping it, this is worth thinking about -- A powerful eulogy for where the Web might have gone, and still may someday, and the friendship of the two people most responsible for envisioning the Web*  --  Ted Nelson's eulogy for his friend Doug Engelbart, as reported by John Markoff in The Times -- with Nelson's inimitable flair.

As Markoff says:
Theodor Holm Nelson, who coined the term hypertext, has been a thorn in the side of the computing establishment for more than a half century. Last week, in an encomium to his friend Douglas Engelbart, he took his critique to Shakespearean levels. It deserves a wider audience. 
Dr. Engelbart and Ted Nelson became acquaintances at the dawn of the modern computing era. They had envisioned and invented the computing that we have come to take for granted.
I first encountered both of them in 1969, and what I saw set the direction for my life's work.  Engelbart gave "The Mother of All Demos" in 1968 (and I first saw him give a follow-up the next year, and read most of his work).  Nelson dreamed of hypertext and hypermedia, and wrote papers on what he called "hypertext" in the 'mid-60s and the highly influential Whole Earth Catalog of "Computer Lib / Dream Machines" in 1974.

As Nelson laments, both received a degree of recognition, but both were marginalized. Powerful as it may be, expediency took the Web in more limiting directions.

Their ideas remain profound and forward looking. Anyone who really cares about the future of media, intellect, and culture, and how information technology can augment that, should consider their work. Just because the Web took a turn to expediency in the past does not mean it will not realize its richer potential in the future. (One hint of that is noted in the next section [of that post].) ...

Nelson's insight

Ted's iconoclastic and visionary style is apparent from the opening of his "As We Will Think" (1968 version):
Bush was right, His famous article is, however, generally misinterpreted, for it has little to do with "information retrieval" as prosecuted today, Bush rejected indexing and discussed instead new forms of interwoven documents. 
It is possible that Bush's vision will be fulfilled substantially as he saw it, and that information retrieval systems of the kinds now popular will see less use than anticipated.
As the technological base has changed, we must recast his thesis slightly, and regard Bush's "memex" as three things: the personal presentation, editing and file console; a digital feeder network for the delivery of documents in full-text digital form; and new types of documents, or hypertexts, which are especially worth receiving and sending in this manner.
In addition, we also consider a likely design for specialist hypertexts, and discuss problems of their publication. 
BEATING AROUND THE BUSH
Twenty-three years ago, in a widely acclaimed article, Vannevar Bush made certain predictions about the way we of the future would handle written information (1). We are not yet doing so. Yet the Bush article is often cited as the historical beginning, or as a technological watershed, of the field of information retrieval. It is frequently cited without interpretation (2,3). Although some commentators have said its predictions were improbable (4), in general its precepts have been ignored by acclamation...
In hindsight, it is obvious that Ted was right about Bush's vision. The memex pre-saged wonders far beyond the mundane notion of "information retrieval" as generally understood in the 1960s (even if not all of Bush, Engelbart and Nelson's visions have been embodied in the Web).

For an interesting update that theme, see this 2016 Quartz article and its reference to Werner Herzog's interview of Ted for his film Lo and Behold, and a short video of Ted expanding on what he spoke of. Both provide a nice live demos of the "parallel textface," much as shown in the above image from Ted's 1972 article. This also explains Ted's ideas of "transclusion" of elements from one work into another, as a rich kind of mashup that retains the identity of the original elements.  He explains how that can support creator rights to what is linked in, and micropayment-based payment/compensation models.* I have often heard people speculate about some of these exciting ideas, thinking they were new (and sometimes that they invented them). Few realize that Ted described all this in the '60s and '70s,

Those trying to invent this "deeply intertwingled" future might want to stand on Ted's shoulders. Ted may not have had the entrepreneurial genius of Steve Jobs, but his inventive vision is second to none.

The Atlantic might want to talk to him...

---
1968? really? -- it seems yes

Nelson actually wrote previously about his ideas for hypertext (in the mid 1960s), so the exact date of this particular paper may not be of great importance, but its earliest provenance is a be a bit of a puzzle.

I recently corresponded with Ted by email, and he was intrigued by these finds -- happy to have the full 1972 version and puzzled by the "1968" version. He said he did not recall a formal publication from that date, but that he might have provided a version at the ACS Annual Meeting then.

Both papers are from my hard copy file, just as they appear in the scans now posted (with the hand annotations apparently being mine, from when I first read them). I believe I had ordered them from my company library and that the label "ACS Annual Meeting 1968" was the citation information with which I ordered that copy. (I presumed that referred to the American Chemical Society, which seemed a bit far afield, but Ted did have a wide range.) So it seemed to remain a puzzle.

However as I was drafting this post, I noticed that the 1968 version says "Twenty-three years ago..." while the 1972 version says "Twenty-seven years ago..." That would seem to be compelling evidence that my "1968" version actually was from that year.

To add more personal history, I had the pleasure of meeting with Ted in 1970 to explore assisting in an experimental hypertext implementation under Claude Kagan's direction, as part of my masters degree fellowship work at the AT&T Western Electric Research Center. That project did not materialize, but chatting with Ted about hypertext was one of the most memorable hours of my career.

---
Early works by Ted Nelson from my collection

The following are items by Ted that may not to be generally available online. I collected these from 1969 onward, and plan to post scans of all them as I get time (after checking whether comparable copies are already accessible elsewhere).
  • As We Will Think ("ACS Annual Meeting 1968" version
    (unable to confirm citation and date)
  • As We Will Think ("Online 72 Conference Proceedings" version 
    (fuller than the scan at archive.org, includes original figures/photos)
  • “Hypertext Editing System,” published by Brown University on 5/6/69 for the Spring Joint Computer Conference, 5/14-16/69
    (My first exposure to hypertext. I clicked a link and saw the future. It was at the IBM booth running on a "mid-sized" IBM 360/50 mainframe with a 2250 vector graphics workstation equipped with a light-pen. Coincidentally, I knew Andy van Dam and some of the developers from my time at Brown the years just before.)
  • Short Computer Lib “$5 First Edition” ©’73, on typewriter paper hand-duplexed, 12 pages including Dream Machines flip-side.
  • A File Structure for the Changing and the Indeterminate, ACM National Conference 1965
  • Xanadu Draft Brochure, 27 November 1969
  • Computer Decisions 9/70  -- No More Teacher’s Dirty Looks
  • Hypertext Note 0-9, various dates in ‘67
  • Decision/Creativity Systems dated 19 July 1970
  • Hypertexts 20 Mar 70
  • Getting it out of our system, in Schechter, ‘67
  • A 14 December 1970 PDP10 teletype printout of Ted's “final report” for Claude Kagan of Western Electric (maybe incomplete with related fragments) -- as noted above, I met with Ted around that time to discuss assisting in this project while in my master's degree fellowship program. (I suspect this was not distributed beyond Ted, Claude, and me.)
(Copies will also be placed on Google Drive.)

---
*An innovation of my own is relevant to this use of micropayments. Micropayments have a long history of enthusiasm and failure. The problem is that micropayments add up to macropayments, resulting in the shock of a nasty surprise when the bill is presented, or the fear of such a surprise. My short answer to how to fix that is to make the micro-payments variable, including some form of volume discounts and price caps, and to provide forgiveness when the value received is not satisfactory. Details of how do that are in this recent post on my other blog: "The Case Against Micropayments" -- From Fear and Surprise to The Comfy Chair.

Historical Clarification:  I should note that my original tweet, above was imprecise in saying "bringing Bush to the attention of those you name." I gather that Engelbart arrived at the basic idea of hypertext independently, and only later became aware of Ted's work. However my understanding is the Tim Berner-Lee was influenced by Ted, as referenced in his proposal for the WWW.

Wednesday, October 10, 2018

In the War on Fake News, All of Us are Soldiers, Already!

This is intended as a supplement to my posts "A Cognitive Immune System for Social Media -- Developing Systemic Resistance to Fake News" and "The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings." (But hopefully this stands on its own as well).Maybe this can make a clearer point of why the methods I propose are powerful and badly needed...
---

A NY Times article titled "Soldiers in Facebook’s War on Fake News Are Feeling Overrun" provides a simple context for showing how I propose to use information already available from all of us, on what is valid and what is fake.

The Times article describes a fact checking organization that works with Facebook in the Philippines (emphasis added):
On the front lines in the war over misinformation, Rappler is overmatched and outgunned - and that could be a worrying indicator of Facebook’s effort to curb the global problem by tapping fact-checking organizations around the world.
...it goes on to describe what I suggest is the heart of the issue:
When its fact checkers determine that a story is false, Facebook pushes it down on users’ News Feeds in favor of other material. Facebook does not delete the content, because it does not want to be seen as censoring free speech, and says demoting false content sharply reduces abuse. Still, falsehoods can resurface or become popular again.
The problem is that the fire hose of fake news is too fast and furious, and too diverse, for any specialized team of fact-checkers to keep up with it. Plus, the damage is done by the time they do identify the fakes and begin to demote them.

But we are all fact checking to some degree without even realizing it. We are all citizen-soldiers. Some do it better than others.

The trick is to draw out all of the signals we provide, in real time -- and use our knowledge of which users' signals are reliable -- to get smarter about what gets pushed down and what gets favored in our feeds. That can serve as a systemic cognitive immune system -- one based on rating the raters and weighting the ratings.

We are all rating all of our news, all of the time, whether implicitly or explicitly, without making any special effort:

  • When we read, "like," comment, or share an item, we provide implicit signals of interest, and perhaps approval.
  • When we comment or share an item, we provide explicit comments that may offer supplementary signals of approval or disapproval.
  • When we ignore an item, we provide a signal of disinterest (and perhaps disapproval).
  • When we return to other activity after viewing an item, the time elapsed signals our level of attention and interest.
Individually, inferences from the more implicit signals may be erratic and low in meaning. But when we have signals from thousands of people, the aggregate becomes meaningful. Trends can be seen quickly. (Facebook already uses such signals to target its ads -- that is how they makes so much money).

But simply adding all these signals can be misleading. 
  • Fake news can quickly spread through groups who are biased (including people or bots who have an ulterior interest in promoting an item) or are simply uncritical and easily inflamed -- making such an item appear to be popular.
  • But our platforms can learn who has which biases, and who is uncritical and easily inflamed.
  • They can learn who is respected within and beyond their narrow factions, and who is not, who is a shill (or a malicious bot) and who is not.
  • They can use this "rating" of the raters to weight their ratings higher or lower.
Done at scale, that can quickly provide probabilistically strong signals that an item is fake or misleading or just low quality. Those signals can enable the platform to demote low quality content and promote high quality content. 

To expand just a bit:
  • Facebook can use outside fact checkers, and can build AI to automatically signal items that seem questionable as one part of its defense.
  • But even without any information at all about the content and meaning of an item, it can make realtime inferences about its quality based on how users react to it.
  • If most of the amplification is from users known to be malicious, biased, or unreliable it can downrank items accordingly
  • It can test that downranking by monitoring further activity.
  • It might even enlist "testers" by promoting a questionable item to users known to be reliable, open, and critical thinkers -- and may even let some generally reliable users to self-select as validators (being careful not to overload them).
  • By being open-ended in this way, such downranking is not censorship -- it is merely a self-regulating learning process that works at Internet scale, on Internet time.
That is how we can augment the wisdom of the crowd -- in real time, with increasing reliability as we learn. That is how we build a cognitive immune system (as my other posts explain further).

This strategy is not new or unproven. It is is the core of Google's wildly successful PageRank algorithm for finding useful search results. And (as I have noted before), it was recently reported that Facebook is now beginning to do a similar, but apparently still primitive form of rating the trustworthiness of its users to try to identify fake news -- they track who spreads fake news and who reports abuse truthfully or deceitfully.* 

What I propose is that we take this much farther, and move rapidly to make it central to our filtering strategies for social media -- and more broadly. An all out effort to do that quickly may be our last, best hope for enlightened democracy.

---
Please see my other posts for more.
----
(*More background from Facebook on their current efforts was cited in the Times article: Hard Questions: What is Facebook Doing to Protect Election Security?

[Update 10/12:] A subsequent Times article by Sheera Frenkel, adds perspective on the scope and pace of the problem -- and the difficulty in definitively identifying items as fakes that can rightly be censored "because of the blurry lines between free speech and disinformation" -- but such questionable items can be down-ranked.

Monday, October 08, 2018

A Cognitive Immune System for Social Media -- Developing Systemic Resistance to Fake News

To counter the spread of fake news, it's more important to manage and filter its spread than to try to interdict its creation -- or to try to inoculate people against its influence. 

A recent NY Times article on their inside look at Facebook's election "war room" highlights the problem, quoting cybersecurity expert Priscilla Moriuchi:
If you look at the way that foreign influence operations have changed these last two years, their focus isn’t really on propagating fake news anymore. “It’s on augmenting stories already out there which speak to hyperpartisan audiences.”
That is why much of the growing effort to respond to the newly recognized crisis of fake news, Russian disinformation, and other forms of disruption in our social media fails to address the core of the problem. We cannot solve the problem by trying to close our systems off from fake news, nor can we expect to radically change people's natural tendency toward cognitive bias. The core problem is that our social media platforms lack an effective "cognitive immune system" that can resist our own tendency to spread the "cognitive pathogens" that are endemic in our social information environment.

Consider how living organisms have evolved to contain infections. We did that not by developing impermeable skins that could be counted on to keep all infections out, nor by making all of our cells so invulnerable that they can resist whatever infectious agents may unpredictably appear.

We have powerfully complemented what we can do in those ways by developing a richly nuanced internal immune system that is deeply embedded throughout our tissues. That immune system uses emergent processes at a system-wide level -- to first learn to identify dangerous agents of disease, and then to learn how to resist their replication and virulence as they try to spread through our system.

The problem is that our social media lack an effective "cognitive immune system" of this kind. 

In fact many of our social media platforms are designed by the businesses that operate them to maximize engagement so they can sell ads. In doing so, they have learned that spreading incendiary disinformation that makes people angry and upset, polarizing them into warring factions, increases their engagement. As a result, these platforms actually learn to spread disease rather than to build immunity. They learn to exploit the fact that people have cognitive biases that make them want to be cocooned in comfortable filter bubbles and feel-good echo-chambers, and to ignore and refute anything that might challenge beliefs that are wrong but comfortable. They work against our human values, not for them.

What are we doing about it? Are we addressing this deep issue of immunity, or are we just putting on band-aids and hoping we can teach people to be smarter? (As a related issue, are we addressing the underlying issue of business model incentives?) Current efforts seem to be focused on measures at the end-points of our social media systems:
  • Stopping disinformation at the source. We certainly should apply band-aids to prevent bad-actors from injecting our media with news, posts, and other items that are intentionally false and dishonest. Of course we should seek to block such items and those who inject them. Band-aids are useful when we find an open wound that germs are gaining entry through. But band-aids are still just band-aids.
  • Making it easier for individuals to recognize when items they receive may be harmful because they are not what they seem. We certainly should provide "immune markers" in the form of consumer-reports-like ratings of items and of the publishers or people who produce them (as many are seeking to do). Making such markers visible to users can help prime them to be more skeptical, and perhaps apply more critical thinking -- much like applying an antiseptic. But that depends on the willingness of users to pay attention to such markers and apply the antiseptic. There is good reason to doubt that will have more than modest effectiveness, given people's natural laziness and instinct for thinking fast rather than slow. (Many social media users "like" items based only on click-bait headlines that are often inflammatory and misleading, without even reading the item -- and that is often enough to cause those items to spread massively.)
These end-point measures are helpful and should be aggressively pursued, but we need to urgently pursue a more systemic strategy of defense. We need to address the problem of dissemination and amplification itself. We need to be much smarter about what gets spread -- from whom, to whom, and why.

Doing that means getting deep into the guts of how our media are filtered and disseminated, step by step, through the "viral" amplification layers of the media systems that connect us. That means integrating a cognitive immune system into the core of our social media platforms. Getting the platform owners to buy in to that will be challenging, but it is the only effective remedy.

Building a cognitive immune system -- the biological parallel

This perspective comes out of work I have been doing for decades, and have written about on this blog (and in a patent filing since released into the public domain). That work centers on ideas for augmenting human intelligence with computer support. More specifically, it is centers on augmenting the wisdom of crowds. It is based on the idea the our wisdom is not the simple result of a majority vote -- but results from an emergent process that applies smart filters that rate the raters and weight the ratings. That provides a way to learn which votes should be more equal than others (in a way that is democratic and egalitarian, but also merit-based). This approach is explained in the posts listed below. It extends an approach that has been developing for centuries.

Supportive of those perspectives, I recently turned to some work on biological immunity that uses the term "cognitive immune system." That work highlight the rich informational aspects of actual immune systems, as a model for understanding how these systems work at a systems level. As noted in one paper (see longer extract below*), biological immune systems are "cognitive, adaptive, fault-tolerant, and fuzzy conceptually." I have only begun to think about the parallels here, but it is apparent that the system architecture I have proposed in my other posts is at least broadly parallel, being also "cognitive, adaptive, fault-tolerant, and fuzzy conceptually." (Of course being "fuzzy conceptually" makes it not the easiest thing to explain and build, but when that is the inherent nature of the problem, it may also necessarily be the essential nature of the solution -- just as it is for biological immune systems.)

An important aspect of this being "fuzzy conceptually," is what I call The Tao of Truth. We can't definitively declare good-faith "speech" as "fake" or "false" in the abstract. Validity is "fuzzy" because it depends on context and interpretation. ("Fuzzy logic" recognizes that in the real world, it is often the case that facts are not entirely true or false but, rather, have degrees of truth.)  That is why only the clearest cases of disinformation can be safely cut off at the source. But we can develop a robust system for ranking the probable (fuzzy) value and truthfulness of speech, revising those rankings, and using that to decide how to share it with whom. For practical purposes, truth is a filtering process, and we can get much smarter about how we apply our collective intelligence to do our filtering. It seems the concepts of "danger" and "self/not-self" in our immune systems have a similarly fuzzy Tao -- many denizens of our microbiome that are not "self" are beneficial to us, and our immune systems have learned that we live better with them inside of us.

My proposals

Details of the architecture I have proposed for a cognitive immune system -- and the need for it -- are here:
  • The Tao of Fake News – the essential need for fuzziness in our logic: the inherent limits of experts, moderators, and rating agencies – and the need for augmenting the wisdom of the crowd (as essential to maintaining the intellectual openness of our democratic/enlightenment values).
(These works did not explicitly address the parallels with biological cognitive immune systems -- exploring those parallels might well lead to improvements on these strategies.)

To those without a background in the technology of modern information platforms, this brief outline may seem abstract and unclear. But as noted in these more detailed posts, these methods are a generalization of methods used by Google (in its PageRank algorithm) to do highly context-relevant filtering of search results using a similar rate the raters and weight the ratings strategy. (That is also "cognitive, adaptive, fault-tolerant, and fuzzy conceptually.") These methods not simple, but they are little stretch from the current computational methods of search engines, or from the ad targeting methods already well-developed by Facebook and others. They can be readily applied -- if the platforms can be motivated to do so.

Broader issues of support for our cognitive immune system

The issue of motivation to do this is crucial. For the kind of cognitive immune system I propose to be effective, it must be built deeply into the guts of our social media platforms (whether directly, or via APIs). As noted above, getting incumbent platforms to shift their business models to align their internal incentives with that need will be challenging. But I suggest it need not be as difficult as it might seem.
A related non-technical issue that many have noted is the need for education of citizens 1) in critical thinking, and 2) in the civics of our democracy. Both seem to have been badly neglected in recent decades. Aggressively remedying that is important, to help inoculate users from disinformation and sloppy thinking -- but that will have limited effectiveness unless we alter the overwhelmingly fast dynamics of our information flows (with the cognitive immune system suggested here) -- to help make us smarter, not dumber in the face of this deluge of information.

---
[Update 10/12:] A subsequent Times article by Sheera Frenkel, adds perspective on the scope and pace of the problem -- and the difficulty in definitively identifying items as fakes that can rightly be censored "because of the blurry lines between free speech and disinformation" -- but such questionable items can be down-ranked.
-----
*Background on our Immune Systems -- from the introduction to the paper mentioned above, "A Cognitive Computational Model Inspired by the Immune System Response" (emphasis added):
The immune system (IS) is by nature a highly distributed, adaptive, and self-organized system that maintains a memory of past encounters and has the ability to continuously learn about new encounters; the immune system as a whole is being interpreted as an intelligent agent. The immune system, along with the central nervous system, represents the most complex biological system in nature [1]. This paper is an attempt to investigate and analyze the immune system response (ISR) in an effort to build a framework inspired by ISR. This framework maintains the same features as the IS itself; it is cognitive, adaptive, fault-tolerant, and fuzzy conceptually. The paper sets three phases for ISR operating sequentially, namely, “recognition,” “decision making,” and “execution,” in addition to another phase operating in parallel which is “maturation.” This paper approaches these phases in detail as a component based architecture model. Then, we will introduce a proposal for a new hybrid and cognitive architecture inspired by ISR. The framework could be used in interdisciplinary systems as manifested in the ISR simulation. Then we will be moving to a high level architecture for the complex adaptive system. IS, as a first class adaptive system, operates on the body context (antigens, body cells, and immune cells). ISR matured over time and enriched its own knowledge base, while neither the context nor the knowledge base is constant, so the response will not be exactly the same even when the immune system encounters the same antigen. A wide range of disciplines is to be discussed in the paper, including artificial intelligence, computational immunology, artificial immune system, and distributed complex adaptive systems. Immunology is one of the fields in biology where the roles of computational and mathematical modeling and analysis were recognized...
The paper supposes that immune system is a cognitive system; IS has beliefs, knowledge, and view about concrete things in our bodies [created out of an ongoing emergent process], which gives IS the ability to abstract, filter, and classify the information to take the proper decisions.

Monday, August 27, 2018

The Tao of Fake News

We are smarter than this!

Everyone with any sense sees "fake news" disinformation campaigns as an existential threat to "truth, justice, and the American Way," but we keep looking for a Superman to sort out what is true and what is fake. A moment's reflection shows that, no Virginia, there is no SuperArbiter of truth. No matter who you choose to check or rate content, there will always be more or less legitimate claims of improper bias.
  • We can't rely on "experts" or "moderators" or any kind of "Consumer Reports" of news. We certainly can't rely on the Likes of the Crowd, a simplistic form of the Wisdom of the Crowd that is too prone to "The Madness of Crowds." 
  • But we can Augment the Wisdom of the Crowd.
  • We can't definitively declare good-faith "speech" as "fake" or "false." 
  • But we can develop a robust system for ranking the probable value and truthfulness of speech, revising those rankings, and using that to decide how to share it with whom.
For practical purposes, truth is a filtering process, and we can get much smarter about how we apply our collective intelligence to do our filtering.

The Tao of Fake News, Truth, and Meaning

Truth is a process. Truth is complex. Truth depends on interpretation and context. Meaning depends on who is saying something, to whom, and why (as Humpty-Dumpty observed). The truth in Rashomon is different for each of the characters. Truth is often very hard for individuals (even "experts") to parse.

Truth is a process, because there is no practical way to ensure that people speak the truth, nor any easy way to determine if they have spoken the truth. Many look to the idea of flagging fake news sources, but who judges, on what basis and what aspects? (A recent NeimanLab assessment of NewsGuard's attempt to do this shows how open to dispute even well-funded, highly professional efforts to do that are.)

Truth is a filtering process: How do we filter true speech from false speech? Over centuries we have come to rely on juries and similar kinds of panels, working in a structured process to draw out and "augment" the collective wisdom of a small crowd. In the sciences, we have a more richly structured process for augmenting the collective wisdom of a large crowd of scientists (and their experiments), informally weighing the authority of each member of the crowd -- and avoiding over-reliance on a few "experts." Our truths are not black and white, absolute, and eternal -- they are contingent, nuanced, and tentative -- but this Tao of truth has served us well.

It is now urgent that our methods for augmenting and filtering our collective wisdom be enhanced. We need to apply computer-mediated collaboration to apply a similar augmented wisdom of the crowd at Internet scale and speed. We can make quick initial assessments, then adapt, grow, and refine our assessments of what is true, in what way, and with regard to what.

Filtering truth -- networks, context, and community

If our goal is to exclude all false and harmful material, we will fail. The nature of truth, and of human values, is too complex. We can exclude the most obviously pernicious frauds -- but for good-faith speech from humans in a free society, we must rely on a more nuanced kind of wisdom.

Our media filter what we see. Now the filters in our dominant social media are controlled by a few corporations motivated to maximize ad revenue by maximizing engagement. They work to serve the advertisers that are their customers, not we users (who now are really their product). We need to get them to change how the filters operate, to maximize value to their users.

We need filters to be tuned to the real value of speech as communication from one person to other people.  Most people want the "firehose" of items on the Internet to be filtered in some way, but just how may vary. Our filters need to be responsive to the desires of the recipients. Partisans may like the comfort of their distorting filter bubbles, but most people will want at least some level of value, quality, and reality, at least some of the time. We can reinforce that by doing well at it.

There is also the fact that people live in communities. Standards for what is truthful and valuable vary from community to community -- and communities and people change over time. This is clearer than ever, now that our social networks are global.

Freedom of speech requires that objectionable speech be speak-able, with very narrow exceptions. The issue is who hears that speech, and what control do they have over what they hear. A related issues is when do third parties have a right to influence those listener choices, and how to keep that intrusive hand as light as possible. Some may think we should never see a swastika or a heresy, but who has the right to draw such lines for everyone in every context?

We cannot shut off objectionable speech, but we can get smarter about managing how it spreads. 

To see this more clearly, consider our human social network as a system of collective intelligence, one that informs an operational definition of truth. Whether at the level of a single social network like Facebook, or all of our information networks, we have three kinds of elements:
  • Sources of information items (publishers, ordinary people, organizations, and even bots) 
  • Recipients of information items  
  • Distribution systems that connect the sources and recipients using filters and presentation service that determine what we see and how we see it (including optional indicators of likely truthfulness, bias, and quality).
Controlling truth at the source may, at first, seem the simple solution, but requires a level of control of speech that is inconsistent with a free society. Letting misinformation and harmful content enter our networks may seem unacceptable, but (with narrow exceptions) censorship is just not a good solution.

Some question whether it is enough to "downrank" items in our feeds (not deleted, but less likely to be presented to us), but what better option do we have than to do that wisely? The best we can reasonably do is manage the spread of low quality and harmful information in a way that is respectful of the rights of both sources and recipients, to limit harm and maximize value.*

How can we do that, and who should control it? We, the people, should control it ourselves (with some limited oversight and support).  Here is how.

Getting smarter -- The Augmented Wisdom of Crowds

Neither automation nor human intelligence alone is up to the scale and dynamics of the problem.  We need a computer-augmented approach to managing the wisdom of the crowd -- as embodied in our filters, and controlled by us. That will pull in all of the human intelligence we can access, and apply algorithms and machine learning (with human oversight) to refine and apply it. The good news is that we have the technology to do that. It is just a matter of the will to develop and apply it.

My previous post outlines a practical strategy for doing that -- "The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings." Google has already shown how powerful a parallel form of this strategy can be to filter which search results should be presented to whom-- on Internet scale. My proposal is to broaden these methods to filter what our our social media present to us.

The method is one of considering all available "signals" in the network and learning how to use them to inform our filtering process. The core of the information filtering process -- that can be used for all kinds of media, including our social media -- is to use all the data signals that our media systems have about our activity. We can consider activity patterns across these three dimensions:
  • Information items (content of any kind, including news items, personal updates, comments/replies, likes, and shares/retweets).
  • Participants (and communities and sub-communities of participants), who can serve as both sources and recipients of items (and of items about other items)
  • Subject and task domains (and sub-domains) that give important context to information items and participants.
We can apply this data with the understanding that any item or participant can be rated, and any item can contain one or more ratings (implicit or explicit) of other items and/or participants. The trick is to tease out and make sense of all of these interrelated ratings and relationships. To be smart about that, we must recognize that not all ratings as equal, so we "rate the raters, and weight the ratings" (using any data that signals a rating). We take that to multiple levels -- my reputational authority depends not only on the reputational authority of those who rate me, but on those who rate them (and so on).

This may seem very complicated (and at scale, it is), but Google proved the power of such algorithms to determine which search results are relevant to a user's query (at mind-boggling scale and speed). Their PageRank algorithm considers what pages link to a given page to assess the imputed reputational authority of that page -- with weightings based on the imputed authority of the pages that link to it (again to multiple levels). Facebook uses similarly sophisticated algorithms to determine what ads should be targeted to whom -- tracking and matching user interests, similarities, and communities and matching that with information on their response to similar ads.

In some encouraging news, it was recently reported that Facebook is now also doing a very primitive form of rating the trustworthiness of its users to try to identify fake news -- they track who spreads fake news and who reports abuse truthfully or deceitfully. What I propose is that we take this much farther, and make it central to our filtering strategies for social media and more broadly.

With this strategy, we can improve our media filters to better meet our needs, as follows:
  • Track explicit and implicit signals to determine authority and truthfulness -- both of the speakers (participants) and of the things they say (items) -- drawing on the wisdom of those who hear and repeat it (or otherwise signal how they value it).
  • Do similar tracking to understand the desires and critical thinking skills of each of the recipients
  • Rate the raters (all of us!) -- and weight the votes to favor those with better ratings. Do that n-levels deep (much as Google does).
  • Let the users signal what levels and types of filtering they want. Provide defaults and options to accommodate users desiring different balances of ease or of fine control and reporting. Let users change that as they desire, depending on their wish to relax, to do focused critical thinking, or to open up to serendipity.
  • Provide transparency and auditability -- to each user (and to independent auditors) -- as to what is filtered for them and how.**
  • Open the filtering mechanisms to independent providers, to spur innovation in a competitive marketplace in filtering algorithms for users to choose from.
That is the best broad solution that we can apply. As we get good at it we will be amazed at how effective it can be. But given the catastrophic folly of where have have let this get to...

First, do no harm!

Most urgently, we need to change the incentives of our filters to do good, not harm. At present, our filters are pouring gasoline on the fires (even as their corporate owners claim to be trying to put them out). As explained in a recent HBR article, "current digital advertising business models incentivize the spread of false news." That article explains the insidious problem of the ad model for paying for services (others have called it "the original sin of the Web") and offers some sensible remedies.  

I have proposed more innovative approaches to better-aligning business models -- and to using a light-handed, market-driven, regulatory approach to mandate doing that -- in "An Open Letter to Influencers Concerned About Facebook and Other Platforms."

We have learned that the Internet has all the messiness of humanity and its truths. We are facing a Pearl Harbor of a thousand pin-pricks that is rapidly escalating. We must mobilize onto a war footing now, to halt that before it is too late.
  • First we need to understand the nature and urgency of this threat to democracy, 
  • Then we must move on both short and longer time horizons to slow and then reverse the threat. 
The Tao of fake news contains its opposite, the Tao of Augmented Wisdom. If we seek that, the result will be not only to manage fake news, but to be smarter in our collective wisdom than we can now imagine.

Related posts:
---
*Of course some information items will be clearly malicious, coming from fraudulent human accounts or bots -- and shutting some of that off at the source is feasible and desirable. But much of the spread of "fake news" (malicious or not) is from real people acting in good faith, in accord with their understanding and beliefs. We cannot escape that non-binary nature of human reality, and must come to terms with our world in nuanced shades of gray. But we can get very sophisticated at distinguishing when news is spread by participants who are usually reliable from when it is spread by those who have established a reputation for being credulous, biased, or malicious.

**The usual concern with transparency is that if the algorithms are known, then bad-actors will game them. That is a valid concern, and some have suggested that even if the how of the filtering algorithm is secret, we should be able to see and audit the why for a given result.  But to the extent that there is an open market in filtering methods (and in countermeasures to disinformation), and our filters vary from user to user and time to time, there will be so much variability in the algorithms that it will be hard to game them effectively.

---
[Update 8/30:]  Giuliani and The Tao of Truth 

To indulge in some timely musing, the Tao of Truth gives a perspective on the widely noted recent public statement that "truth isn't truth." At the level of the Tao, we can say that "truth is/isn't truth," or more precisely, "truth is/isn't Truth" (with one capital T). That is the level at which we understand truth to be a process in which the question "what is truth?" depends on what we mean, at what level, in what context, with what assurance -- and how far we are in that process. We as a society have developed a broadly shared expectation of how that process should work. But as the process does its never-ending work, there are no absolutes -- only more or less strong evidence, reasoning, and consensus about what we believe the relevant truth to be. (That, of course is an Enlightenment social perspective, and some disagree with this very process, and instead favor a more absolute and authoritarian variation. Perhaps most fundamentally, we are now in a reactionary time in which our prevailing process for truth is being prominently questioned. The hope here is that continuing development of a free, open, and wise process prevails over return to a closed, authoritarian one -- and prevails over the loss of any consensus at all.

[Update 10/12:] A Times article by Sheera Frenkel, adds perspective on the scope and pace of the problem -- and the difficulty in definitively identifying items as fakes that can be censored "because of the blurry lines between free speech and disinformation" -- but such questionable items can be down-ranked.

Sunday, July 22, 2018

The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings

How technology can make us all smarter, not dumber

We thought social media and computer-mediated communications technologies would make us smarter, but recent experience with Facebook, Twitter, and others suggests they are now making us much dumber. We face a major and fundamental crisis. Civilization seems to be descending into a battle of increasingly polarized factions who cannot understand or accept one another, fueled by filter bubbles and echo chambers.

Many have begun to focus serious attention on this problem, but it seems we are fighting the last war -- not using tools that match the task.

A recent conference, "Fake News Horror Show," convened people focused on these issues from government, academia, and industry, and one of the issues was who decides what is "fake news," how, and on what basis. There are many efforts at fact checking, and at certification or rating of reputable vs. disreputable sources -- but also recognition that such efforts can be crippled by circularity: who is credible-enough in the eyes of diverse communities of interest to escape the charge of "fake news" themselves?

I raised two points at that conference. This post expands on the first point and shows how it provides a basis for addressing the second:
  • The core issue is one of trust and authority -- it is hard to get consistent agreement in any broad population on who should be trusted or taken as an authority, no matter what their established credentials or reputation. Who decides what is fake news? What I suggested is that this is the same problem that has been made manageable by getting smarter about the wisdom of crowds -- much as Google's PageRank algorithm beat out Yahoo and AltaVista at making search engines effective at finding content that is relevant and useful.

    As explained further below, the essence of the method is to "rate the raters" -- and to weight those ratings accordingly. Working at Web scale, no rater's authority can be relied on without drawing on the judgement of the crowd. Furthermore, simple equal voting does not fully reflect the wisdom of the crowd -- there is deeper wisdom about those votes to be drawn from the crowd.

    Some of the crowd are more equal than others. Deciding who is more equal, and whose vote should be weighted more heavily can be determined by how people rate the raters -- and how those raters are rated -- and so on. Those ratings are not universal, but depend on the context: the domain and the community -- and the current intent or task of the user. Each of us wants to see what is most relevant, useful, appealing, or eye-opening -- for us -- and perhaps with different balances at different times. Computer intelligence can distill those recursive, context-dependent ratings, to augment human wisdom.
  • A major complicating issue is that of biased assimilation. The perverse truth seems to be that "balanced information may actually inflame extreme views." This is all too clear in the mirror worlds of pro-Trump and anti-Trump factions and their media favorites like Fox, CNN, and MSNBC. Each side thinks the other is unhinged or even evil, and layers a vicious cycle of distrust around anything they say. It seems one of the few promising counters to this vicious cycle is what Cass Sunstein referred to as surprising validators: people one usually gives credence to, but who suggest one's view on a particular issue might be wrong. A recent example of a surprising validator was the "Confession of an Anti-GMO Activist." This item is  readily identifiable as a "turncoat" opinion that might be influential for many, but smart algorithms can find similar items that are more subtle, and tied to less prominent people who may be known and respected by a particular user. There is an opportunity for electronic media services to exploit this insight that "what matters most may be not what is said, but who, exactly, is saying it."
These are themes I have been thinking and writing about on and off for decades. This growing crisis, as highlighted by the Fake News Horror Show conference, spurred me to write this outline for a broad architecture (and specific methods) for addressing these issues. Discussions at that event led to my invitation to an up-coming workshop hosted by the Global Engagement Center (a US State Department unit) focused on "technologies for use against foreign propaganda, disinformation, and radicalization to violence." This post is offered to contribute to those efforts.

Beyond that urgent focus, this architecture has relevance to the broader improvement of social media and other collaborative systems. Some key themes:
  • Binary, black or white thinking is easy and natural, but humans are capable of dealing with the fact that reality is nuanced in many shades of gray, in many dimensions. Our electronic media can augment that capability.
  • Instead, our most widely used social media now foster simplistic, binary thinking.
  • Simple strategies (analogous to those proven and continually refined in Google's search engine) enable our social media systems to recognize more of the underlying nuance, and bring it to our attention in far more effective ways.
  • We can apply an architecture that draws on some core structures and methods to enable intelligent systems to better augment human intelligence, and to do that in ways tuned to the needs of a diversity of people -- from different schools of thought and with different levels of intelligence, education, and attention.
  • Doing this can not only better expose truly fake news for what it is, but can make us smarter and more aware and reflective of nuance. 
  • This can not only guide our attention toward quality, but can also enable us to be more favored by surprising validators and other forms of serendipity needed to escape our filter bubbles.
Where I am coming from

I was first exposed to early forms of augmented intelligence and hypermedia in 1969 (notably Nelson and Engelbart), and to collaborative systems in 1971 (notably Turoff). That set a broad theme for my work. After varied roles in IT and media technology, I became an inventor, and one of my patent applications outlined a collaborative system for social development of inventions and other ideas (in 2002). While my specific business objective proved elusive (as the world of patents changed), what I described was a general architecture for collaborative development of ideas that has very wide applicability ("ideas" include news stories, social media posts, and "likes"). That is obviously more timely now than ever. I had written on this blog about some specific aspects of those ideas in 2012: "Filtering for Serendipity -- Extremism, 'Filter Bubbles' and 'Surprising Validators.'" To encourage use of those ideas, I released that patent filing into the public domain in 2016.

Here, I take a first shot at a broad description of these strategies that is intended to be more readable and relevant to our current crisis than the legalese of the patent application. As supplement to this, a copy of that patent document with highlighting of the portions that remain most relevant is posted online.*

Of course some of these ideas are more readily applied than others. But the goal of an architecture is to provide a vision and a framework to build on. Without considering the broad scope of what might be done over time is the best way to be sure that we do the best that we can do at any point in time. We can then adjust and improve on that to build toward still-better solutions.

Augmenting the wisdom of crowds

Civilization has risen because of our human skills: to cooperate, to learn from one another, and to coalesce on wisdom and resist folly -- difficult as it may often be to distinguish which is which.

Life is complex, and things are rarely black or white. The Tao symbolizes the realization that everything contains its opposite -- Ted Nelson put it that "everything is deeply intertwingled," and conceived of the Web as a way to reflect that. But throughout human history this nuanced intertwingling has remained challenging for people to grasp.

Behavioral psychology has elucidated the mechanisms behind our difficulty. We are capable of deep and subtle rational thought (Kahneman's System 2, "thinking slow"), but we are pragmatic and lazy, and prefer the the more instinctive, quick, and easy path (system 1, "thinking fast" -- a mode that offers great survival value when faced with urgent decisions. Only reluctantly do we think more deeply. The thinking fast of System 1 favors biased assimilation, with its reliance on the "cognitive ease," quick reactions, and emotional and tribal appeal, rather than rationality.

Augmenting human intellect

For over half a century, a seminal dream of computer technology has been "augmenting human intellect" based on "man-computer symbiosis." The developers of our augmentation tools and our social media believed in their power to enhance community and wisdom -- but we failed to realize how easily our systems can reduce us to the lowest common denominator if we do not apply consistent and coherent measures to better augment the intelligence they automated. A number of early collaborative Web services recognized that some contributors should be more equal than others (for example, Slashdot, with its "karma" reputation system). Simple reputation systems have also proven important for eBay and other market services. However, the social media that came to dominate broader society failed to realize how important that is, and were motivated to "move fast and break things" in a rush to scale and profit.

Now, we are trying to clean up the broken mess of this Frankenberg's monster, to find ways to flag "fake news" in its various harmful forms. But we still seem not to be applying the seminal work in this field. That failure has made our use of the wisdom of crowds stupid to the point of catastrophe. Instead of augmenting our intellect as Engelbart proposed, we are de-augmenting it. People see what is popular, read a headline without reading the full story, jump to conclusions and "like" it, making it more popular, so more people see it. The headlines increasingly become clickbait that distorts the real story. Influence shifts from ideas to memes. This is clearly a vicious cycle -- one that the social media services have little economic incentive to change -- polarization increases engagement, which sells more ads. We urgently need fundamental changes to these systems.

Crowdsourced, domain-specific, authorities -- rating the raters -- much like Google

Raw forms of the wisdom of crowds look to "votes" from crowd, weight them equally, and select the most popular or "liked" items (or a simple average of all votes). This has been done for forecasting, for citation analysis of academic papers, and in early computer searching. But it becomes apparent that this can lead to the lowest common denominator of wisdom, and is easily manipulated with fraudulent votes. Of course we can restrict this to curated "expert" opinion, but then we lose the wisdom of the larger crowd (including its ability to rapidly sense early signs of change).

It was learned that better results can be obtained by weighting votes based on authority, as done in Google's PageRank algorithm, so that votes with higher authority count more heavily (while still using the full crowd to balance the effects of supposed authorities who might be wrong). In academic papers, it was realized that it matters which journal cites an article (now that many low-quality pay-to-publish journals have proliferated).

In Google's search algorithm (dating from 1996, and continuously refined), it was realized that links from a widely-linked-to Web site should be weighted higher in authority than links from another that has few links in to it. The algorithm became recursive: PageRank (used to rank the top search results) depends on how many direct links come in, weighted by a second level factor of how many sites link in to those sites, and weighted in turn by a third level factor of how many of those have many inward links, and so on. Related refinements partitioned these rankings by subject domain, so that authority might be high in one domain, but not in others. The details of how many levels of recursion and how the weighting is done are constantly tuned by Google, but this basic rate the raters strategy is the foundation for Google's continuing success, even as it is now enhanced with many other "signals" in a continually adaptive way. (These include scoring based on analysis of page content and format to weight sites that seem to be legitimate above those that seem to be spam or link farms.)

Proposed methods and architecture

My patent disclosure explains much the same rate the raters strategy (call it RateRank?) as applicable to ranking items of nearly any kind, in a richly nuanced, open, social context for augmenting the wisdom of crowds. (It is a strategy that can itself be adapted and refined by augmenting the wisdom of crowds -- another case of "eat your own dog food!")

The core architecture works in terms of three major dimensions that apply to a full range of information systems and services:
  1. Items. These can be any kind of information item, including contribution items (such as news stories, blog posts, or social media posts, or even books or videos, or collections of items), comment/analysis items (including social media comments on other items), and rating/feedback items (including likes and retweets, as well as comments that imply a rating of another item)
  2. Participants (and communities and sub-communities of participants). These are individuals, who may or may not have specific roles (including submitters, commenters, raters, and special roles such as experts, moderators, or administrators). In social media systems, these might include people (with verified IDs or anonymous), collections of people in the form of businesses, commercial advertisers, political advertisers, and other organizations. (Special rules and restrictions might apply to non-human participants, including bots and corporate or state actors.) Communities of participants might be explicit (with controlled membership), such as Facebook groups, and implicit (and fuzzy), based on closeness of social graph relationships and domain interests. These might include communities of interest, practice, geographic locality, or  degree of social graph closeness. 
  3. Domains (and sub-domains). These may be subject-matter domains in various dimensions. Domains may overlap or cross-cut. (For example issues about GMOs might involve cross-cutting scientific, business, governmental/regulatory, and political domains.)
An important aspect of generality in this architecture is that:
  • Any item or participant can be rated (explicitly or implicitly)
  • Any item can contain one or more ratings of other items or participants (and of itself)
It should be understood that Google's algorithm is a specialized instance of such an architecture -- one where all the items are Web pages, and all links between Web pages are implicit ratings of the link destination by the link source. The key element of man-computer symbiosis here is that the decision to place a link is assumed to be a "rating" decision of a human Webmaster or author (a vote for the destination, by the source, from the source context), but the analysis and weighting of those links (votes) is algorithmic. Much as could be applied to fake news, Google has developed finely tuned algorithms for detecting the multitudes of "link farms" that use bots that seek to fraudulently mimic this human intelligence, and downgrades the weighting of such links.

How the augmenting works

The heart of the method is a fully adaptive process that rates the raters recursively, using explicit and implicit ratings of items and raters (and potentially even the algorithms of the system itself). Rate the raters, rate those who rate the raters, and so on. Weight the ratings according to the rater's reputation (in context), so the wisest members of the crowd, in the current context, as judged by the crowd, have the most say. The wisest in context meaning the wisest in the domains and communities that are most relevant to the current usage context. But still, all of the crowd should be considered at some level.

This causes good items and raters (and algorithms) to bubble up into prominence, and less well-rated ones to sink from prominence. This process would rarely be binary black and white. Highly rated items or participants can lose that rating over time, and in other contexts. Poorly rated items or participants might never be removed (except for extreme abuse) but simply downgraded (to contribute what small weight is warranted, especially if many agree on a contrary view) and can remain accessible with digging, when desired. (As noted below, our social media systems have become essential utilities, and exclusion of people or ideas on the fringe is at odds with the value of free speech in our open society.) The rules and algorithms could be continuously learning and adaptive, using a hybrid of machine learning and human oversight. 

Attention management systems can ensure that the best items tend to made most visible, and the worst least visible, but the system should adjust those rankings to the context of what is known about the user in general, and what is inferred about what the user is seeking at a given time -- with options for explicit overrides (much as Google adjusts its search rankings to the user and their current query patterns).  It should be noted that Facebook and others already use some similar methods, but unfortunately these are oriented to maximizing an intensity of "engagement" that optimizes for the company's ad sale opportunities, rather than to a quality of content and engagement for the user. We need sophistication of algorithms, data science, and machine learning applied to quality for users, not just engagement for advertisers and those who would manipulate us.

Participants might be imputed high authority in one domain, or in one community, but lower in others. Movie stars might outrank Nobel prize-winners when considering a topic in the arts or even in social awareness, but not in economic theory. NRA members might outrank gun control opponents for members of an NRA community, but not for non-members of that community.

Openness is a key enabling feature: these algorithms should not be monolithic, opaque, and controlled by any one system, but should be flexible, transparent, and adaptive -- and depend on user task/context/desires/skill at any given time. Some users may choose simple default processes and behaviors, but other could be enabled to mix and match alternative ranking and filtering processes, and to apply deeper levels of analytics to understand why the system is presenting a given view. Users should be able to modify the view they see as they may desire, either by changing parameters or swapping alternative algorithms. Such alternative algorithms could be from a single provider, or alternative sources in an open marketspace, or "roll your own."

Within this framework, key design factors include how these key processes are managed to work in concert, and to change how each of these behaves, for a given user, at given time, depending on task/context/desires/skill (including the level of effort a user wishes to put in):
  • The core rate the raters process, based on both implicit and explicit ratings, weighted by authority as assessed by other raters (as themselves weighted based on ratings by others), with selective levels of partitioning by community and domain. Consideration of formal and institutional authority can be applied to partially balance crowdsourced authority. Dynamic selection of weighting and balancing methods might depend on user task/context/desires
  • Attention tools that filter irrelevant items and highlight relevant ones (such as to give Facebook or Twitter users different views of their feed). Thus different Facebook or Twitter user might be able to get different views of their feed, and change that as desired.
  • Consideration with regard to which communities and sub-communities most contribute to rankings for specific items at specific times.  Communities might have graded openness (in the form of selectively permeable boundaries) to avoid groupthink and cross-fertilize effectively. This could be applied by using insider/outsider thresholds to manage separation/openness.
  • Consideration with regard to domains and sub-domains to maximize the quality and relevance of ratings, authority, and attention, and to avoid groupthink and cross-fertilize effectively.
  • Consideration of explicit vs. implicit ratings.. While explicit ratings may provide the strongest and most nuanced information, implicit ratings may be far more readily available, thus representing a larger crowd, and so may have the greatest value in augmenting the wisdom of the crowd. Just as with search and ad targeting, implicit ratings can include subtle factors, such as measures of attention, sentiment, emotion, and other behaviors.
  • Consideration of verified vs. unverified vs. anonymous participants. It may be desirable to allow a range of levels, use weighting where anonymous participants have no reputation or a negative reputation. Bots might be banned, or given very poor reputation.
  • Open creation, selection and use of alternative tools for filtering, discovery, attention/alerting, ranking, and analytics depending on user task/context/desires. This kind of openness can stimulate development and testing of creative alternatives and enable market-based selection of the best-suited tools.
  • Use of valuation, crowdfunding, recognition, publicity, and other non-monetary incentives can also be used to encourage productive and meaningful participation, to bring out the best of the crowd.
(As expanded on below, all of this should be done with transparency and user control.)

[Update 10/10/18:] This subsequent post: In the War on Fake News, All of Us are Soldiers, Already!, may help make this more concrete and clarify why it is badly needed.

Applying this to social media -- fake news, community standards, polarization, and serendipity

A core objective is to augment the wisdom of crowds -- to benefit from the crowd to filter out the irrelevant or poor quality -- but to have augmented intelligence in determining relevance and quality in a dynamically nuanced way that reduces the de-augmenting effect of echo chambers and filter bubbles.

Using these methods, true fake news, which is clearly dishonest and created by known bad actors, can be readily filtered out, with low risk of blocking good-faith contrarian perspectives from quality sources. Such fake news can readily be distinguished from  legitimate partisan spin (point and counterpoint), from legitimate criticism (a news photo of a Nazi sign) or historically important news items (the Vietnam "terror of war" photo), and from legitimate humor or satire.

A dilemma that has become very apparent in our social media relates to "community standards" for managing people and items that are "objectionable." Since our social media systems have become essential utilities, exclusion of people or ideas on the fringe is at odds with the rights of free speech in our open society. Jessica Lessin recently commented on Facebook's "clumsy" struggles with content moderation, and on the calls of some to ban people and items. She observes that Facebook wants the community to determine the rules, but also is pressed to placate regulators -- and observes that "getting two billion people to write your rules isn’t very practical."

"Getting two billion people to write your rules" is just what the augmented wisdom of crowds does seek to make practical -- and more effective than any other strategy. The rules would rarely ban people (real humans) or items, but simply limit their visibility beyond the participants and communities that choose to accept such people or items. Such "objectionable" people have no right to require they be granted wide exposure, and, at the same time, those who find some people or materials objectionable rarely have a right to insist on an absolute and total ban.

This ties back to the converse issue, the seeking of surprising validators and serendipity described in my 2015 post. By understanding the items and participants, how they are rated by whom, and how they fit into communities, social graphs, and domains, highly personalized attention management tools can minimize exposure to what is truly objectionable, but can find and present just the right surprising validators for each individual user (at times when they might be receptive). Similarly, these tools can custom-chose serendipitous items from other communities and domains that would otherwise be missed.

This is an area where advanced augmentation of crowd wisdom can become uniquely powerful. The mainstream will become more aware and accepting of fringe views and materials (and might set aside specific times for exploring such items), and the extremes will have the freedom to choose (1) whether they wish to make their case in a way that others can accept as unpleasant but not unreasonable and antisocial, or (2) to be placed beyond the pale of broader society: hard to find, but still short of total exclusion. Again, a high degree of customization can be applied (and varied with changing context). Those who want walled gardens can create them -- with windows and gates that open where and when desired.

Innovation, openness, transparency, and privacy

Of course the key issues are how do we apply quick fixes for our current crisis, how do we evolve toward better media ecosystems, and how do we balance privacy and transparency. I generally advocate for openness and transparency. 

The Internet and the early Web were built on openness and transparency, which fueled a huge burst of innovation.  (Just as I refer to my 2002 patent filing, one can make a broad argument that many of the most important ideas of digital society emerged around the time of that "dot-com" era or before.) Open, interoperable systems (both Web 1.0 and Web 2.0) enabled a thousand flowers to bloom. There are also similar lessons from systems for financial market data (one of the first great data market ecologies) fueled by open access to market data from trading exchanges, and to competing, interoperable distribution, analytics, and presentation services. The patent filing I describe here (and others of mine) build on similar openness and interoperability. 

Now that we have veered down a path of closed, monopolistic walled gardens that have gained great power, we face difficult questions of how to manage them for the public good. I suggest we probably need a mix of all four of the following. Determining just how to do that will be challenging. (Some suggestions related to each of these follow.) 
  1. Can we motivate monopolies like Facebook to voluntarily shift to better serve us? Ideally, that would be the fastest solution, since they have full power to introduce such methods (and the skills to do so are much the same as the skills they now apply for targeting ads).
  2. Can we independently layer needed functions on top of such services (or in competition with them)? The questions are how to interface to existing services (with or without cooperation) and how to gain critical mass. Even at more limited scale, such secondary systems might provide augmented wisdom that could be fed back into the dominant systems, such as to help flag harmful items.
  3. Should we mandate regulatory controls, accepting these systems as natural monopolies to be regulated as such (much like early days of regulating the Bell System monopoly on telephonic media platforms)? There seem to be strong arguments for at least some of this, but being smart about it will be a challenge.
  4. Should we open them up or break portions of them apart (much like the later days of regulating the  Bell System)? Here, too, there seem to be strong arguments for at least some of this, but being smart about it will be a challenge.
  5. Can we use regulation to force the monopolies to better serve their users (and society) by forcing changes in their business model (with incentives to serve users rather than advertisers)? I suggest that may be one of the most feasible and effective levers we can apply.
My suggestions about those alternatives:
A transparent society?

A central (and increasingly urgent) dilemma relates to privacy. Some of my suggestions for openness and transparency in our social media and similar collaborative systems could potentially conflict with privacy concerns. We may have to choose between strict privacy and smart, effective systems that create immense new value for users and society. We need to think more deeply about which objectives matter, and how to get the best of mix. Privacy is an important human issue, but its role in our world of Big Data and AI is changing: 
  • As David Brin suggested in The Transparent Society, the question of privacy is not just what is known about us, but who controls that information. Brin suggests the greatest danger is that authoritarian governments will control information and use it to control us (as China is increasingly on track to do that). 
  • We now face a similar concern with monopolies that have taken on quasi-governmental roles -- they seem to be answerable to no one, and are motivated not to serve their users, but to manipulate us to serve the advertisers who they profit from. (There are also the advertisers, themselves.)
  • Brin suggested our technology will return us to the more transparent human norms of the village -- everyone knew one-another's secrets, but that created a balance of power where all but the most antisocial secrets were largely ignored and accepted. We seem to be well on the way to accepting less privacy, as long as our information is not abused.
  • I suggest we will gain the most by moving in the direction of openness and transparency -- with care to protect the aspects of privacy that really need protection (by managing well-targeted constraints on who has access to what, under what controls). 
That takes us back to the genius of man-computer symbiosis -- AI and machine learning thrive on big data. Locking up or siloing big data can cripple our ability to augment the wisdom of crowds and leave us at the mercy of the governments or businesses that do have our data. We need to find a wise middle ground of openness that fuels augmented intelligence and market forces -- in which service providers are driven by customer demand and desires, and constrained only by the precision-crafted privacy protections that are truly needed.

-----------------------
Related Posts:

------

*Appendix -- My patent disclosure document (now in public domain)

This post draws on the architecture and methods described in detail in my US patent application entitled "Method and Apparatus for and Idea Adoption Marketplace" (10/692,974), which was published 9/17/04. It was filed 10/24/03 formalizing a provisional filing on 10/24/02. I released this material into the public domain on 12/19/16. I retain no patent rights in it, and it is open to all who can benefit from it.

A copy of that application with highlighting of portions that remain most relevant to current needs is now online. While this is written in a legalese style required for patent applications that is not very readable, it is hoped that the highlighted sections are readable to those with interest. (A duplicate copy is here.)

The highlighted sections present a broad architecture that now seems more timely than ever, and provides an extensible framework for far better social media -- and important aspects of digital democracy in general.