Smartly Intertwingled: My Decades of Blueskying, and Hopes for the Bluesky Project [Updated 6/28/23]

***See updates to Key Ideas section that follow below (6/28/23)***

This vision also applies to the future of the Fediverse, Mastodon, and all of social media. ...And, as to "vibe," see why I call Bluesky "the shmoo of social media" [Update 8/9/23]

Having been thinking for decades about the potential of social media to offer steerable "bicycles for our minds" individually and collectively -- and becoming increasingly concerned by the directions of the past decade -- I now see some very encouraging patches of blue.

I have been following and commenting generally on the Bluesky project that Jack Dorsey spun out from Twitter, and this April wrote about a similarly aligned project, the Initiative for Digital Public Infrastructure (iDPI) led by Ethan Zuckerman. That post explained how the iDPI effort aligned with my ideas, and where I hoped it might go.

After using Bluesky for nearly two months, and reading some of the growing body of their thinking (in their blog posts and related details on GitHub), it seems timely for me to respond to their requests for feedback by outlining my thinking on where I hope they will take us. Even if Bluesky, the company, fails to achieve critical mass in its mission “to develop and drive large-scale adoption of technologies for open and decentralized public conversation,” it has the potential to lay the foundations for next-generation protocols and services that will. What I have seen so far -- building out Composable Moderation, Moderation in a Public Commons, and Algorithmic Choice -- seems well-aligned with my vision.

KEY IDEAS FOR BLUESKY
[...and the Pluriverse of social media in general]

This is a first, brief, and informal discussion draft, summarizing and pointing to ideas I hope the Bluesky team will be pursuing [with updates below]. I don't know how much of this is already in their long-term architectural plan. Of course it is not reasonable to expect the team to be far along in implementing much of what I suggest here -- that will be a massive and extended whole-of-society effort. My objective is simply to paint the vision, in hopes that they (and others) will share it, to ensure that their architecture is designed to extend in these directions as it develops. The hope is that like the web, this architecture will be generative and extensible enough to evolve over decades to provide a rich backbone for augmenting nearly all human discourse -- and the processes of its social mediation.

Following this brief summary are pointers to works of mine that expand on this vision in some detail.

Hypercommunities

The What is Bluesky? blog post says “In the federated network, people can move between cities depending on what kind of community they’d like to be in.” This analogy takes a step in the right direction, but strikes me as missing the essential multi-dimensionality of humanity's social web.

The beauty of online discourse is that I can "be in" many virtual communities at once – I don’t need to “move between” them. Because these communities are virtual, I can participate in many at once, and interact with community members who also participate in many communities at once, as a giant web of overlapping Venn diagrams. I can have multiple "home" communities. At any given time, I should be able to have a view of my own composed virtual community, a view that includes whatever mix of communities I wish to participate in or just observe, ranked into my attention as I choose at the time. Feeds (and searches) should be composable and steerable to provide that view.

This hyperlinking of public (and/or private) spaces is explained in Community and Content Moderation in the Digital Public Hypersquare (co-authored with Chris Riley). Much as web sites form a hyperlinked web that can be seamlessly connected with varying degrees of openness (manually, with links, or using web services), we can build webs of hyper-communities that are connected by our webs of connections to them and to their members. I refer to that as semipermeability, like a membrane that selectively passes some things and not others. As Ted Nelson said, “everything is deeply intertwingled.”

Ranking as the core task

Perhaps it is implicit, not yet documented, or I have missed it in the Bluesky materials, but it seems to me nearly all mediation boils down to ranking. Except in the most egregious cases, "moderation as removal" is anathema. "Filtering" is often narrowly understood as weeding out, not as ranking up or down. Egregious content might be downranked with prejudice, and quarantined, but the value of most content is in the eye of the beholder, and in the eye of those communities that beholders participate in based on shared norms and values.

Done well, downranking can provide safety from bad content, and upranking can bubble up quality and value. Composability of ranking tools can work at both individual and community levels to blend a mix of rankings, weighted as appropriate and desired. Rankings can be based on many dimensions of attributes, with items coming to our attention based on which attention agents uprank or downrank them, by how much, and what weight is given to each of those agents.

Composability should also be dynamically steerable. Think of “bicycles for our minds” and how we can steer them at will. And remember that these bicycles should steer us through the multidimensional and semipermeably overlapping web of hypercommunities.
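To make the blending of attention agents described above a little more concrete, here is a minimal sketch. Everything in it is a hypothetical illustration rather than anything from the Bluesky protocol: the agent names, the score range of -1.0 (strong downrank) to +1.0 (strong uprank), and the `blend_rankings` function are all assumptions.

```python
from typing import Dict

# Hypothetical shape: each attention agent maps item ids to a score in
# [-1.0, 1.0], where negative values downrank and positive values uprank.
AgentScores = Dict[str, float]


def blend_rankings(agent_scores: Dict[str, AgentScores],
                   agent_weights: Dict[str, float]) -> Dict[str, float]:
    """Blend per-agent up/down scores into one weighted score per item."""
    # Normalize by total weight so blended scores stay in [-1.0, 1.0].
    total_weight = sum(abs(w) for w in agent_weights.values()) or 1.0
    blended: Dict[str, float] = {}
    for agent, scores in agent_scores.items():
        weight = agent_weights.get(agent, 0.0)  # unweighted agents are ignored
        for item, score in scores.items():
            blended[item] = blended.get(item, 0.0) + weight * score / total_weight
    return blended


# Two hypothetical agents: one focused on safety, one on quality.
agent_scores = {
    "safety_agent": {"post_a": -1.0, "post_b": 0.0, "post_c": 0.0},
    "quality_agent": {"post_a": 0.5, "post_b": 1.0, "post_c": 0.25},
}
# This user gives the safety agent twice the say of the quality agent.
agent_weights = {"safety_agent": 2.0, "quality_agent": 1.0}

blended = blend_rankings(agent_scores, agent_weights)
ranked = sorted(blended, key=blended.get, reverse=True)
print(ranked)
```

Here the heavily weighted safety agent pushes `post_a` to the bottom even though the quality agent likes it somewhat. Nothing is removed; the item is simply ranked out of attention.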

Multilevel feed composition from multiple algorithms

I hope the Bluesky architects have this in mind, but have not seen it clearly stated. Currently My Feeds gives a list of pre-defined feed algorithms that we can view one at a time. A truly composable, steerable feed would have a higher level interface that lets us merge a mix of feeds, with defined relative weights. A steerable feed would allow those mixes and weights to be easily changed at will to suit our tasks and moods. Obviously, this full capability and the appropriate UIs for it will take time to develop, but I hope the architecture is being designed to provide extensibility and protocol support for this. Some UI options might be very simple, and some might be suited to those who desire fine granularity of control.
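As a rough illustration of what merging a mix of feeds with relative weights could look like, the sketch below scores each post by its position in each feed, scaled by that feed's weight. The position-based scoring rule, the feed names, and the `compose_feed` function are my own assumptions for illustration, not how Bluesky feeds actually work.

```python
from collections import defaultdict
from typing import Dict, List


def compose_feed(feeds: Dict[str, List[str]],
                 weights: Dict[str, float],
                 limit: int = 10) -> List[str]:
    """Merge several ranked feeds into one, honoring per-feed weights.

    Assumed scoring rule: a post at 0-based position p in a feed with
    weight w earns w / (p + 1); a post's scores are summed across feeds.
    """
    scores: Dict[str, float] = defaultdict(float)
    for name, posts in feeds.items():
        weight = weights.get(name, 0.0)
        for position, post in enumerate(posts):
            scores[post] += weight / (position + 1)
    # Highest score first; ties broken by post id so output is deterministic.
    return sorted(scores, key=lambda p: (-scores[p], p))[:limit]


# Two hypothetical pre-defined feeds, each already ranked by its own algorithm.
feeds = {
    "news": ["n1", "n2", "n3"],
    "cats": ["c1", "c2", "n2"],
}

# "Steering" is just changing the weights to suit the task or mood.
work_mode = compose_feed(feeds, {"news": 0.9, "cats": 0.1})
relax_mode = compose_feed(feeds, {"news": 0.2, "cats": 0.8})
print(work_mode)
print(relax_mode)
```

A simple UI might expose these weights as a few sliders or presets, while a fine-grained one might let users edit them directly.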

Multi-dimensional reputation based on explicit and/or implicit signals

I view reputation as essential to making ranking work well. Reputation cannot be adequately captured by simple lists. I have written frequently about “rate the raters and weight the ratings” as an extension of Google’s PageRank algorithm to develop what Scott Aaronson has called "eigentrust" (=“eigenreputation”). I have suggested this use implicit ratings -- likes, shares, comments (and perhaps more value-indicative signals) -- as well as explicit ratings (which might include labels). Feed algorithms can use these methods in an infinite variety of ways. As a simple example, a feed might be composed in part based on implicit ratings from users of some mix selected from followers of Fox, MSNBC, the NY Times, or People magazine -- or alumni of Harvard, Ohio State, or Texas A&M, or members of some church or union. The beauty of this kind of computed PageRank-style "eigenreputation" is that it is far more nuanced, current, and broadly sourced than binary lists of who is vouched for or not by some list curator.
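For readers who want to see the recursion behind "rate the raters and weight the ratings," here is a minimal PageRank-style power iteration. It is a sketch under stated assumptions: the `endorsements` data is invented, signals are treated as non-negative strengths only, and the damping factor of 0.85 is the conventional PageRank default rather than anything tuned for reputation.

```python
from typing import Dict


def eigenreputation(endorsements: Dict[str, Dict[str, float]],
                    damping: float = 0.85,
                    iterations: int = 50) -> Dict[str, float]:
    """Compute PageRank-style reputation from weighted endorsements.

    endorsements[rater][target] is the non-negative strength of the rater's
    signals (likes, shares, explicit ratings...) toward the target. Each
    rater's influence is proportional to the rater's own reputation.
    """
    users = set(endorsements)
    for targets in endorsements.values():
        users.update(targets)
    users = sorted(users)
    n = len(users)
    rep = {u: 1.0 / n for u in users}  # start with equal reputation

    for _ in range(iterations):
        new_rep = {u: (1.0 - damping) / n for u in users}
        for rater in users:
            targets = endorsements.get(rater, {})
            total = sum(targets.values())
            if total > 0:
                # Split the rater's reputation across targets by signal strength.
                for target, strength in targets.items():
                    new_rep[target] += damping * rep[rater] * strength / total
            else:
                # A rater with no outgoing signals spreads reputation evenly.
                for u in users:
                    new_rep[u] += damping * rep[rater] / n
        rep = new_rep
    return rep


# Hypothetical signals: who has endorsed whom, and how strongly.
endorsements = {
    "alice": {"bob": 1.0, "carol": 1.0},
    "bob": {"carol": 1.0},
    "carol": {"alice": 1.0},
    "dave": {"carol": 1.0},
}
reputation = eigenreputation(endorsements)
for user in sorted(reputation, key=reputation.get, reverse=True):
    print(user, round(reputation[user], 3))
```

Running the same computation over signals drawn only from a chosen community (say, followers of one publication) would give a reputation score as seen from that community's perspective.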

This reputation system should ultimately be multidimensional. Reputation ratings may be segmented with respect to specific subject domains and value orientations, and can be selectively sourced from specific communities of interest and value. That way content and people can be ranked in different ways for different purposes. While doing this at scale may seem very complex, my understanding is that Google does similar context-specific segmentation for PageRank. Resources to do that are not yet in hand, but as such services reach scale, funding models will follow.

Rebuilding our social mediation ecosystem

Communities and mediating services can be decoupled. The speech layer may be more tightly tied to specific communities than the reach layer. Real life communities and institutions may be re-enabled to mediate our online discourse, both for their direct membership and those who wish to follow them. The ecosystem that shaped and stabilized discourse in the real world should be reconstituted in the virtual world, where many of the same communities and institutions can add value. These signals of human judgment can be crowdsourced from their membership, but they can also derive from editorial curation sanctioned by these communities/institutions. Many providers of Bluesky algorithms might be tightly integrated into the technical infrastructure of these communities/institutions.

+++Update 6/28 -- thinking out loud further:

My posting on Bluesky linking to the original version of this post led to a productive dialog with Paul Frazee of the Bluesky team and Chris Riley (my frequent co-author), leading to these further thoughts for discussion.

Labeling and ranking

Paul observed that "you're generally suggesting rankings instead of decision-points as well. that's what a lot of the labeling system gets into (proposal 0002)." Having only skimmed that proposal initially, a closer reading led me to comment further:

Thanks both. I agree labeling complements/combines with ranking, and with Chris’s reminder on Trust & Safety.
Reading 0002 more deeply, it seems a good start. I see interesting issues in how labels relate to feeds. Will think more on this, esp. idea that labels can be the ranking dimensions I suggested.
Key idea: simple labels are binary – I hate binary! Maybe you might provide for labels to have a non-binary strength. Then feed ranking agents can factor labels in as rankings with regard to the labelled attribute.

I have been hypothesizing that ranking can more or less fully subsume other forms of automation and include manual ranking and label inputs. For example, as I had suggested above, "Rankings can be based on many dimensions of attributes" -- so rankings could take a hybrid form that includes label attributes.

While binary-valued, the structure in 0002 points to considerable richness in the category structure of labels, along with variations in how they can be applied to presentation. I think adding a quantifier for the strength of a label (how strongly positive or negative it might be) would ultimately be essential to achieving nuance (even if that might not be implemented initially). From this holistic perspective, both positive and negative labeling would be desirable -- that might just be a matter of plus and minus quantifiers.
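A small sketch may help show how a signed label strength could feed into ranking. Nothing here comes from proposal 0002: the `Label` shape, the -1.0 to +1.0 strength range, the label names, and the per-user `sensitivity` weights are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Label:
    name: str        # hypothetical label names, e.g. "spam" or "well-sourced"
    strength: float  # -1.0 (strongly negative) through +1.0 (strongly positive)


def label_adjusted_score(base_score: float,
                         labels: List[Label],
                         sensitivity: Dict[str, float]) -> float:
    """Shift a post's ranking score by its labels.

    sensitivity[name] is how much this user (or feed) cares about a label;
    labels the user has not opted into contribute nothing.
    """
    adjustment = sum(sensitivity.get(label.name, 0.0) * label.strength
                     for label in labels)
    return base_score + adjustment


# This user cares a lot about spam and moderately about sourcing.
sensitivity = {"spam": 1.0, "well-sourced": 0.5}

spammy = label_adjusted_score(0.5, [Label("spam", -0.8)], sensitivity)
sourced = label_adjusted_score(0.5, [Label("well-sourced", 0.6)], sensitivity)
unlabeled = label_adjusted_score(0.5, [], sensitivity)
print(spammy, sourced, unlabeled)
```

A binary label is then just the special case where strength is always -1.0 or +1.0.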

Extending that suggestion, my 2002-3 design for a collaborative system for open innovation provided for ratings of item value to have quantifications of both the value rating and of the rater's confidence level in that value rating (see the Rating and Reputation section starting at paragraph 0150). This was to feed back into the reputation of the rater, so that raters could indicate low confidence when unsure of their rating, to limit harm to their reputation from ratings that might come to be seen as questionable.
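The sketch below shows one way a stated confidence level might limit reputational harm. The update rule is invented for illustration and is not the rule from that 2002-3 design: ratings, confidence, and reputation are all assumed to lie between 0 and 1, and `consensus` stands in for whatever value the community later settles on.

```python
def updated_reputation(reputation: float,
                       rating: float,
                       confidence: float,
                       consensus: float,
                       learning_rate: float = 0.1) -> float:
    """Adjust a rater's reputation after their rating is compared to consensus.

    Hypothetical rule: the change is proportional to the rater's stated
    confidence, so a low-confidence rating that turns out wrong costs little.
    """
    error = abs(rating - consensus)   # 0.0 = full agreement, 1.0 = opposite
    agreement = 1.0 - 2.0 * error     # +1.0 = agreed, -1.0 = fully disagreed
    new_reputation = reputation + learning_rate * confidence * agreement
    return min(1.0, max(0.0, new_reputation))  # clamp to [0, 1]


# Two raters both rate an item 1.0, and consensus later settles on 0.0.
confident = updated_reputation(0.5, rating=1.0, confidence=0.9, consensus=0.0)
hedged = updated_reputation(0.5, rating=1.0, confidence=0.1, consensus=0.0)
print(confident, hedged)
```

The rater who flagged low confidence loses far less reputation for the same mistaken rating, and by the same rule gains less when a low-confidence rating turns out right.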

Broader issues of labeling and ranking -- and federation

My bias toward ranking based on non-binary inputs stems from a healthy respect for how our notions of truth and value -- and authority about that -- are contingent, changeable, and heavily influenced by our social mediation ecosystem. That has been central to the generative success of human society. Thus our social media should reflect that social contingency, and provide for a high degree of subsidiarity in how decisions are made. That is the essence of what I call freedom of impression, and how it serves to balance freedom of expression. Bluesky seems very aligned with that.

This also ties to the idea that rankings and labels should ideally be crowdsourced, with weightings based on a reputation system that is itself crowdsourced, to provide social mediation with subsidiarity to how and by whom labels are applied.

Thinking out loud here, after rereading the Bluesky Federation post and other items, the basis for the apparent split in Bluesky thinking about moderation versus algorithmic choice is not very clear to me, since I see them as falling into the same continuum of deciding what we should see online and who has agency over that.

Presumably this split relates to the conventional idea of "moderation as removal" as necessary to achieve "trust and safety," and the idea that some classes of content may be beyond the pale of reasonable discourse, and thus should not be left entirely to rankings based on user agency. In the short term, having no better option readily at hand, of course we must judiciously apply the blunt tools we do have to manage illegal content -- and also in many contexts to much of the "lawful but awful." As Chris points out, well-resourced and responsive trust and safety teams are important for now -- and it seems they will always have a key role.

To that point, I suggest labels also carry authority attributes, so that downranking labels by vetted T&S authorities can be given special classes of high weight. That could include a severity that has the effect of removal, but it seems reasonable to limit that level of authority.
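As a sketch of how authority attributes might work, the code below gives labels from a vetted T&S authority a much higher weight class, and lets only that class reach a severity that has the effect of removal. The authority class names, the weights, and the removal threshold are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical weight classes: vetted T&S labels count ten times as much.
AUTHORITY_WEIGHT = {"community": 1.0, "vetted_trust_and_safety": 10.0}
REMOVAL_SEVERITY = 1.0  # assumed threshold at which a vetted label removes


@dataclass
class AuthorityLabel:
    name: str
    severity: float  # 0.0 (mild) through 1.0 (most severe) downranking
    authority: str   # which class of labeler applied it


def score_with_authority(base_score: float,
                         labels: List[AuthorityLabel]) -> Optional[float]:
    """Downrank by authority-weighted severity; return None if removed."""
    score = base_score
    for label in labels:
        vetted = label.authority == "vetted_trust_and_safety"
        if vetted and label.severity >= REMOVAL_SEVERITY:
            return None  # only vetted authorities can have the effect of removal
        score -= AUTHORITY_WEIGHT.get(label.authority, 0.0) * label.severity
    return score


community_flag = score_with_authority(
    5.0, [AuthorityLabel("rude", 1.0, "community")])
vetted_flag = score_with_authority(
    5.0, [AuthorityLabel("harassment", 0.5, "vetted_trust_and_safety")])
vetted_removal = score_with_authority(
    5.0, [AuthorityLabel("illegal", 1.0, "vetted_trust_and_safety")])
print(community_flag, vetted_flag, vetted_removal)
```

Even a maximum-severity community label only downranks here, which keeps the removal level of authority limited to the vetted class.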

Further thoughts on the federated architecture

The diagram in the Federation post is helpful, but still leaves me with some questions, and I see some places to extend that. It does seem the team is rightly considering many options, and trying to build maximum optionality into how various players implement interoperable components within that architecture and its protocols. The relation between AppViews and Feed Gens seems right, reading that as allowing many app view providers to draw on multiple Feed Gen providers.

One thing I don't see clearly in there is the need for multilevel algorithmic choice, and whether the two (or more) levels I see a need for are all distinct from the app view service offerings:

At a lower level is an open market in basic algorithms with very specific objective functions in terms of subjects, values, and vibes/moods.
At a higher level is an open market in UX-level services that enable composition and orchestration of those lower level algorithmic rankings to present an overall view that blends multiple objective functions, and to allow steering that dynamically as the user's moods and needs change.
These levels might be a continuum of levels that can feed into one another.
Those higher levels then integrate within AppView services, which might pass control downward (or expose it from below) at varying levels of granularity.
Presumably all of this can be rationalized with a ranking protocol that allows up/down rankings from each of the multiple levels to be merged in accord with desired weightings.
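One way to picture levels that feed into one another is to treat every ranker, at any level, as the same kind of thing: a function from a post to an up/down score. A composition of rankers is then itself a ranker and can be composed again. The sketch below assumes that framing; the `Ranker` type, the `compose` helper, and the toy keyword-based rankers are hypothetical and are not part of any Bluesky protocol.

```python
from typing import Callable, List, Tuple

# A ranker maps a post's text to a score; higher means more attention.
Ranker = Callable[[str], float]


def compose(parts: List[Tuple[Ranker, float]]) -> Ranker:
    """Blend weighted rankers into a new ranker (weights must sum to non-zero)."""
    total = sum(weight for _, weight in parts)

    def ranker(post: str) -> float:
        return sum(r(post) * weight for r, weight in parts) / total

    return ranker


# Lower level: basic algorithms, each with one narrow objective (toy versions).
def topic_science(post: str) -> float:
    return 1.0 if "science" in post.lower() else 0.0


def topic_sports(post: str) -> float:
    return 1.0 if "sports" in post.lower() else 0.0


def calm_vibe(post: str) -> float:
    return -1.0 if "!" in post else 1.0


# Middle level: a blend of subject interests.
interests = compose([(topic_science, 3.0), (topic_sports, 1.0)])
# Higher level: the user's overall view blends interests with a mood.
my_view = compose([(interests, 1.0), (calm_vibe, 1.0)])

posts = ["New science result", "SPORTS FINAL TONIGHT!", "Quiet gardening notes"]
ranked = sorted(posts, key=my_view, reverse=True)
print(ranked)
```

Because each level exposes the same interface, an AppView could let a user adjust only the top-level weights or reach down to any lower level.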

The other thing that seems especially important to me is a reframing of my 6/25 points about ranking, labeling and reputation, as follows. My initial take is that in the Federation post diagram, this can break out into multiple layers of structure within the box now captioned as Labeler.

Ranking: As I suggested, it all comes down to ranking. That seems where it feeds into FeedGen and AppView.
Reputation: A key factor in ranking is reputation (and vice versa), and I see the most robust form of reputation as being the PageRank-like process of computing "eigenreputation." That can draw from all available signals of human judgment, combined algorithmically -- not to replace that human judgment but to weight it in a way that distills the wisdom of the crowd weighted by reputation. That sounds circular, but the math of eigenvalues that Google applied draws on human judgment (originally links by "webmasters") recursively, n layers deep, to make that work. Those signals can be implicit (likes, shares, comments, etc.) or explicit (labels, expert ratings, etc.). Importantly, choices of algorithms should also apply to choices of reputation algorithms.
Labels: As I said, labeling can begin as a separate initial alternative to reputation based systems, but I suggest it should evolve to become a component of them. The team seems to be starting with labels managed by individuals or small numbers of individuals. That might remain as an option, but could evolve into more massive crowdsourcing, both in explicit form (like Twitter Community Notes), and in implicit form. The implicit form might be what I had earlier described as ratings that have multiple attribute dimensions, which could be inferred from massive interaction data (likes, shares, comments, and more). Actually all of these could merge into a system of attribute dimensions.

Doing all that is obviously a major challenge that will take many years to build out technically, and in the social mediation ecosystem that would drive it. Hopefully the architecture can be designed for phased implementation with extensibility that would later allow those layers to interact in flexible and highly dynamic ways that might not be foreseeable.

Enabling subsidiarity of "moderation"

I believe it is desirable that our social mediation ecosystem eventually carry most of the burden of "moderation" (a term I use advisedly, preferring "mediation" as broader and more multivalued) -- just as it does in traditional society. I gather that the Bluesky team gets this, but to emphasize:

This requires 1) an architecture for that ecosystem that is well supported by an open technical infrastructure, and 2) a structure that uses subsidiarity to apply a nuanced blend of top-down controls to limit dissemination of the truly harmful (the trust and safety teams and tools that Chris referred to), along with mostly bottom-up controls to manage more contingent levels of awfulness -- and goodness! -- in multiple dimensions.
This should apply at the level of 1) membership communities (servers/instances plus other communities/groups) and 2) cross-community attention/mediation agent services that users choose to opt into. The line between these seems best kept fuzzy and flexible.
All of these dialectics and their fuzziness are at the heart of federation.

Perhaps a workable split is that community services (servers/instances and the like) should be free to have whatever level of moderation as removal (within their scope of control) that they like. But multi-homing, cross-instance user attention agent services -- feed agents and recommenders -- should be free to rely entirely on ranking to whatever extent their users support. Moderation as removal might be limited in scope in that way.

Outsourced trust and safety at scale: Also worth emphasis is that one of the beauties of federating mediation as a layer distinct from server instances is that it facilitates outsourcing trust and safety to cross-instance services that can have economies of scale, and thus levels of resources and skill, not feasible for any but the largest instances. That is much like email spam filtering and other security services.

"Vibe"-- Bluesky as "the shmoo of social media"

A bit tangential, but perhaps important to Bluesky's success, is what seems to be a misunderstanding of how Bluesky federation and algorithmic choice should make questions of a Bluesky "vibe" moot -- in a way that gives everyone what they want (within reason). The near-term challenge for Bluesky is to manage and limit that so concerns about vibe do not impede progress toward the maturity that will moot those concerns. That suggests a need for some PR about how federation and user choice will enable users to build their network views to have whatever vibe they want -- and change it as they want. I see Bluesky as being designed to be "the shmoo of social media." (Shmoos were Al Capp cartoon creatures that tasted like whatever you wanted.)

Bluesky has benefited from its "cool kids" buzz, but is also increasingly seeing negative spin, such as articles in Vanity Fair and the New Yorker about its "vibe" and how it is changing as it grows to a broader demographic. Few seem to understand that they ain't seen nothin' yet. With selectable, composable feeds, users will be able to create views that offer whatever vibe they want (and with whatever levels of moderation they want). Users concerned about the current vibe should be advised that this is the infancy of a flexible new social ecosystem, and that whatever initial vibe chaos might arise will give way to a new order -- if given a chance.

[End of updates]

MORE ON MY VISION

Here are the most relevant of my writings on these aspects of my vision. (A fuller list is here.)

Recent visions of the fediverse/pluriverse

Context Lost; Context Regained – Comments on iDPI's “A Manifesto for a Smaller, Denser Internet” (4/5/23) - My comments and my own expansions on a manifesto from Ethan Zuckerman's iDPI team. A "pluriverse" of many linked diverse platforms and a "loyal" agent that "aggregates, cross-posts, and curates."
Into the Plativerse… Through Fiddleware? (Tech Policy Press, 12/20/22) - Suggesting a future that enables nuanced control -- and may enable the emergence of federated middleware (fiddleware?) to best serve users. See especially the last two sections here. (*My more recent 4/5/23 post favors the term "pluriverse" as more evocative.)

In more depth on the vision:

The Internet Beyond Social Media Thought-Robber Barons (Tech Policy Press, 4/22/21) - See especially Part 1 and beyond.

The Augmented Wisdom of Crowds: Rate the Raters and Weight the Ratings (7/22/18) - An earlier detailing of my vision.

Broad statements of direction and motivation:

From Freedom of Speech and Reach to Freedom of Expression and Impression (Tech Policy Press, 2/14/23) - Distilling and updating essential reframings from the Delegation series (just below).