Monday, January 27, 2020

Make it So, Now! - 10 Ways Tech Platforms Can Safeguard the 2020 Election

"Ten things technology platforms can do to safeguard the 2020 U.S. election" is an urgent and vital statement that we should all read -- and do all we can to make happen -- especially if you have any connection to the platforms, Congress, or regulators (or the press). Hopefully, anyone reading this understands why this is urgent (but the article begins with a brief reminder).

Thirteen prominent thought leaders "discuss immediate steps the major social media companies can take to help safeguard our democratic process and mitigate the weaponization of their platforms in the run-up to the 2020 U.S. elections." They published this as a "living document."

Here is their list of "What can be done … now" (the article explains each):
  1. Remove and archive fraudulent and automated accounts
  2. Clearly identify paid political posts — even when they’re shared
  3. Use consistent definitions of an ad or paid post
  4. Verify and accurately disclose advertising entities in political ads
  5. Require certification for political ads to receive organic reach
  6. Remove pricing incentives for presidential candidates that reward virality (including a limit on microtargeting)
  7. Provide detailed resources with accurate voting information at top of feeds
  8. Provide a more transparent and consistent set of data in political ad archives
  9. Clarify where they draw the line on “lying”
  10. Be transparent about the resources they are putting into safety and security
All of these should be doable in a matter of months. While many of the signatories "...are working on longer-term ways to create a healthier, safer internet, [they] are proposing more immediate steps that could be implemented before the 2020 election for Facebook and other social media platforms to consider."

The writers include "a Facebook co-founder, former Facebook, Google and Twitter employees, early Facebook and Twitter investors, academics, non-profit leaders, national security and public policy professionals": John Borthwick, Sean Eldridge, Yael Eisenstat, Nir Erfat, Tristan Harris, Justin Hendrix, Chris Hughes, Young Mie Kim, Roger McNamee, Adav Noti, Eli Pariser, Trevor Potter and Vivian Schiller.

I, too, am working on longer-term issues, as outlined in this recent summary in the context of some important think tank reports: Regulating our Platforms -- A Deeper Vision. Similarly, I have addressed one of the most urgent stop-gap issues (which is part of their #6) in 2020: A Goldilocks Solution for False Political Ads on Social Media is Emerging.

Monday, January 20, 2020

Personalized Nutrition -- Because Everything is Deeply Intertwingled!

Nutrition is hard to get right because everything is deeply intertwingled. Personalized Nutrition is changing that!

This new perspective on nutrition is gaining attention, as an aspect of personalized medicine, and is the subject of a new paper, Toward the Definition of Personalized Nutrition: A Proposal by The American Nutrition Association.  (I saw it as it was finalized, since my wife, Dana Reed, is a co-author, and a board member and part of the nutrition science team at ANA.)

The key idea is:
Personalized nutrition (PN) is rooted in the concept that one size does not fit all; differences in biochemistry, metabolism, genetics, and microbiota contribute to the dramatic inter-individual differences observed in response to nutrition, nutrient status, dietary patterns, timing of eating, and environmental exposures. PN has been described in a variety of ways, and other terms such as “precision nutrition,” “individualized nutrition,” and “nutritional genomics” have similar, sometimes overlapping, meanings in the literature.
I have always been something less than a poster child for following nutrition guidelines, for reasons that this report cites:  "...guidelines have only limited ability to address the myriad inputs that influence the unique manifestation of an individual’s health or disease status."

I frequently cite the conundrum from Woody Allen's Sleeper, when the 1970s protagonist had just been awakened by doctors after 200 years:
Dr. Melik: This morning for breakfast he requested something called "wheat germ, organic honey and tiger's milk."
Dr. Aragon: [chuckling] Oh, yes. Those are the charmed substances that some years ago were thought to contain life-preserving properties.
Dr. Melik: You mean there was no deep fat? No steak or cream pies or... hot fudge?
Dr. Aragon: Those were thought to be unhealthy... precisely the opposite of what we now know to be true.
Overstated to be sure, but the real issue is that "one man's meat is another man's poison." Determining which is which for a given person has been impractical, but now we are not only learning that this is far more intertwingled than was thought, but we are gaining the ability to tease out what applies to a given person.

I come from this not from biology, but from machine learning and predictive analytics. My focus is on getting smarter about how everything is intertwingled.

One of the most intriguing companies I have run across is Nutrino, a startup acquired by Medtronic, that analyzes data from continuous glucose monitors used by diabetics to understand the factors that affect their glucose response over time. They correlate to specific food intakes, activity, sleep, mood, blood tests, genomics, biomics, and more. They call it a FoodPrint, "a digital signature of how our body reacts to different foods. It is contextually driven and provides correlations, insights and predictions that become the underpinning for personal and continually improving nutrition recommendations." This is one of the first successful efforts to tease out how what I eat (and what else I do) really affects me as an individual, in all of its real-world intertwingularity.
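To make the idea of teasing out individual responses concrete, here is a toy sketch of the kind of analysis involved. The feature (carbohydrates per meal), the numbers, and the single-variable framing are all invented for illustration; Nutrino's actual FoodPrint models are proprietary and surely far richer, drawing on activity, sleep, mood, genomics, and more.

```python
# Toy sketch: does carb intake correlate with glucose response for ONE
# person? All data here is invented for illustration.

def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# One person's logged meals: grams of carbs vs. glucose rise (mg/dL)
carbs = [60, 20, 45, 30]
glucose_rise = [55, 12, 38, 35]

r = pearson(carbs, glucose_rise)  # strongly positive for this person
```

The point is that the same analysis run on another person's logs might show a weak or very different relationship: the correlation is personal, which is exactly why population-level guidelines fall short.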

It is time to move beyond the current so-called "gold standard" of intervention-based studies, the randomized double-blind placebo-controlled (RDBPC) clinical trial. Reality is far too intertwingled for that to be more than narrowly useful. It is time to embrace big data, correlation, and predictive analytics. Some early recognition of this is that drugmakers are getting the FDA to accept mining of patient data as a way to avoid the need for clinical trials.

We have a long way to go, but I want to know how likely it is that a given amount of deep fat or hot fudge, or wheat germ or kale (in combination with the rest of my diet, behavior and risk factors), will have a significant effect, over a time frame that can motivate whether or not I indulge in my chocolate or eat my spinach.

It is not enough to know that the dose makes the poison -- I want to know if the average man's poison is really just my meat.

Before very long we will know.

Friday, January 10, 2020

The Dis-information Choke Point: Dis-tribution (Not Supply or Demand) [Stub]

Demand for Deceit: How the Way We Think Drives Disinformation is an excellent report from the National Endowment for Democracy (by Samuel Woolley and Katie Joseff, 1/8/20). It highlights the dual importance of both supply and demand side factors in the problem of disinformation (fake news). That crystallizes in my mind an essential gap in this field -- smarter control of distribution. The importance of this third element that mediates between supply and demand was implicit in my comments on algorithms (in section #2 of the prior post).

[This is a stub for a fuller post yet to come. (It is an adaptation of a brief update to my prior post on Regulating the Platforms, but deserves separate treatment.)]

There is little fundamentally new about the supply or the demand for disinformation.  What is fundamentally new is how disinformation is distributed.  That is what we most urgently need to fix. If disinformation falls in a forest… but appears in no one’s feed, does it disinform?

In social media a new form of distribution mediates between supply and demand.  The media platform does filtering that upranks or downranks content, and so governs what users see.  If disinformation is downranked, we will not see it -- even if it is posted and potentially accessible to billions of people.  Filtered distribution is what makes social media not just more information, faster, but an entirely new kind of medium.  Filtering is a new, automated form of moderation and amplification.  That has implications for both the design and the regulation of social media.

[Update: see comments below on Facebook's 2/17/20 White Paper on Regulation.] 

Controlling the choke point

By changing social media filtering algorithms we can dramatically reduce the distribution of disinformation.  It is widely recognized that there is a problem of distribution: current social media promote content that angers and polarizes because that increases engagement and thus ad revenues.  Instead the services could filter for quality and value to users, but they have little incentive to do so.  What little effort they have ever made to do that has been lost in their quest for ad revenue.

Social media marketers speak of "amplification." It is easy to see the supply and demand for disinformation, but marketing professionals know that it is amplification in distribution that makes all the difference. Distribution is the critical choke point for controlling this newly amplified spread of disinformation. (And as Feld points out, the First Amendment does not protect inappropriate uses of loudspeakers.)

While this is a complex area that warrants much study, as the report observes, the arguments cited against the importance of filter bubbles in the box on page 10 are less relevant to social media, where the filters are largely based on the user’s social graph (who promotes items to be fed to them, in the form of posts, likes, comments, and shares), not just active search behavior (what they search for). 

Changing the behavior of demand is clearly desirable, but a very long and costly effort. It is recognized that we cannot stop the supply. But we can control distribution -- changing filtering algorithms could have significant impact rapidly, and would apply across the board, at Internet scale and speed -- if the social media platforms could be motivated to design better algorithms.

How can we do that? A quick summary of key points from my prior posts...

We seem to forget what Google’s original PageRank algorithm taught us.  Content quality can be inferred algorithmically based on human user behaviors, without intrinsic understanding of the meaning of the content.  Algorithms can be enhanced to be far more nuanced.  The current upranking is based on likes from all of one’s social graph -- all treated as equally valid.  Instead, we can design algorithms that learn to recognize the user behaviors on page 8, to learn which users share responsibly (reading more than headlines and showing discernment for quality) and which are promiscuous (sharing reflexively, with minimal dwell time) or malicious (repeatedly sharing content determined to be disinformation).  Why should those users have more than minimal influence on what other users see?

The spread of disinformation could be dramatically reduced by upranking “votes” on what to share from users with good reputations, and downranking votes from those with poor reputations.  I explain further in A Cognitive Immune System for Social Media -- Developing Systemic Resistance to Fake News and In the War on Fake News, All of Us are Soldiers, Already!  More specifics on designing such algorithms are in The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings.  Social media are now reflecting the wisdom of the mob -- instead we need to seek the wisdom of the smart crowd.  That is what society has sought to do for centuries.
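A minimal sketch can make "rate the raters and weight the ratings" concrete: each user's share counts as a vote on an item, weighted by a reputation learned from how their past shares held up to fact-checking. The user names, the smoothing scheme, and the neutral default are all illustrative assumptions, not a description of any platform's actual algorithm.

```python
# Sketch of reputation-weighted ranking: votes from users whose past
# shares held up are worth more than votes from promiscuous sharers.
# All names and numbers are illustrative.

def reputation(shares_checked):
    """shares_checked: booleans, True = a past share survived fact-checking."""
    if not shares_checked:
        return 0.5  # unknown users get a neutral weight
    good = sum(shares_checked)
    # Laplace smoothing so one bad share doesn't zero out a new user
    return (good + 1) / (len(shares_checked) + 2)

def item_score(voters, reputations):
    """Sum the reputation weights of everyone who shared the item."""
    return sum(reputations.get(u, 0.5) for u in voters)

reps = {
    "careful_reader": reputation([True, True, True, True]),    # ~0.83
    "junk_sharer_1": reputation([True, False, False]),         # 0.4
    "junk_sharer_2": reputation([False, False, True]),         # 0.4
}

# Two items with the same raw share count rank very differently:
quality_item = item_score(["careful_reader", "new_user"], reps)
viral_junk = item_score(["junk_sharer_1", "junk_sharer_2"], reps)
```

The design point is that raw share counts (the current "wisdom of the mob") and reputation-weighted counts can disagree sharply, and the weighted version is the one that resists coordinated or reflexive amplification.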

Beyond that, better algorithms could combat the social media filter bubble effects by applying judo to the active drivers noted on page 8.  Cass Sunstein suggested “surprising validators” in 2012 as one way this might be done, and I built on that to explain how it could be applied in social media algorithms:  Filtering for Serendipity -- Extremism, 'Filter Bubbles' and 'Surprising Validators’.

If platforms and regulators focused more on what such distribution algorithms could do, they might take action to make that happen (as addressed in Regulating our Platforms -- A Deeper Vision).

Yes, "the way we think drives disinformation," and social media distribution algorithms drive how we think -- we can drive them for good, not bad!

Background note: NiemanLab today pointed to a PNAS paper showing evidence that "... ratings given by our [lay] participants were very strongly correlated with ratings provided by professional fact-checkers. Thus, incorporating the trust ratings of laypeople into social media ranking algorithms may effectively identify low-quality news outlets and could well reduce the amount of misinformation circulating online." The study was based on explicit quality judgments, but using implicit data on quality judgments as I suggest should be similarly correlated, and could apply the imputed judgments of every social media user who interacted with an item with no added user effort.

Comments on Facebook's 2/17/20 White Paper, Charting a Way Forward on Online Content Regulation

This is an interesting document, with some good discussion, but it provides evidence that leads directly to the point I make here while totally missing it. Again this seems to be a case in which "It is difficult to get a man to understand something when his job depends on not understanding it."

The report makes the important point that:
Companies may be able to predict the harmfulness of posts by assessing the likely reach of content (through distribution trends and likely virality), assessing the likelihood that a reported post violates (through review with artificial intelligence), or assessing the likely severity of reported content
So Facebook understands that they can predict "the likely reach of content" -- why not influence it??? It is their distribution process and filtering algorithms that control "the likely reach of content." Why not throttle distribution to reduce the reach in accord with the predicted severity of the violation? Why not gather realtime feedback from the distribution process (including the responses of users) to refine those predictions, so they can course correct the initial predictions and rapidly refine the level of the throttle? That is what I have suggested in many posts, notably In the War on Fake News, All of Us are Soldiers, Already!
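The throttle-with-feedback idea above can be sketched in a few lines. Everything here is an assumption for illustration: the severity score (imagined as a classifier output in [0, 1]), the report-rate multiplier, and the function names are invented, not Facebook's actual machinery.

```python
# Sketch: scale an item's reach by predicted violation severity, then
# course-correct as realtime user feedback (reports) arrives.
# The severity scale and the 50x report-rate penalty are invented.

def throttle_factor(severity, reports, impressions):
    """Return a multiplier in [0, 1] applied to an item's distribution."""
    factor = 1.0 - severity  # higher predicted severity -> less reach
    if impressions > 0:
        report_rate = reports / impressions
        # realtime course correction: user reports tighten the throttle
        factor *= max(0.0, 1.0 - 50 * report_rate)
    return max(0.0, min(1.0, factor))

# An item predicted as likely violating (severity 0.7):
early = throttle_factor(0.7, reports=0, impressions=100)   # ~0.3 reach
later = throttle_factor(0.7, reports=2, impressions=200)   # tightened further
```

The key property is the feedback loop: an initial prediction sets the throttle, and the distribution process itself generates the signals (reports, and in a fuller version the reactions of high-reputation users) that refine it, at Internet scale and speed.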

See the Selected Items tab for more on this theme.