Improving Fediverse Discovery & Onboarding

sapient [they/them] · edit-2 2 years ago

Step 4 - Term Merging

Each instance has provided subject trees describing what its community is meant to be like. Moreover, it has provided the terms it believes refer to the various concepts within its subject tree.

This step is where all those terms get merged together, so that they can be used later by some kind of search algorithm for the more sophisticated cases.

The steps are as follows.

Collect all the subject trees from each instance into some form that can be iterated over.
Construct a BTree-based map from topic paths to their associated term information, merging in new values at every level from every federated server ^.^. Much more sophisticated and efficient ways of doing this kind of merge are documented in the Common Interest Algorithm snippet, even though it does not cover terms specifically, so just look at that :)
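As a rough illustration of the two steps above, here is a minimal sketch in Python. It is not the Common Interest Algorithm snippet itself. The names `TopicPath`, `SubjectTree` and `merge_terms` are made up for this example, a topic path is assumed to be a tuple of strings, and a plain dict with lexicographically sorted keys stands in for a real BTree map.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed shapes (hypothetical, not from the original post):
TopicPath = Tuple[str, ...]              # e.g. ("tech", "programming", "rust")
SubjectTree = Dict[TopicPath, Set[str]]  # topic path -> terms an instance uses for it


def merge_terms(trees: Iterable[SubjectTree]) -> Dict[TopicPath, Set[str]]:
    """Merge the term sets of every instance into one path-ordered map."""
    merged: Dict[TopicPath, Set[str]] = {}
    for tree in trees:
        for path, terms in tree.items():
            # Register every prefix so that each level of the tree has an entry.
            for depth in range(1, len(path) + 1):
                merged.setdefault(path[:depth], set())
            # Lowercasing is an assumption, so that later lookups ignore case.
            merged.setdefault(path, set()).update(t.lower() for t in terms)
    # Sorted keys give the lexicographic ordering a BTree map would provide.
    return dict(sorted(merged.items()))


instance_a = {("tech", "rust"): {"Rust", "rustlang"}}
instance_b = {("tech", "rust"): {"ferris"}, ("art",): {"Drawing"}}
merged = merge_terms([instance_a, instance_b])
```

In a language with a real ordered map (for example Rust's `BTreeMap`) the final sort would not be needed, since the keys stay ordered as they are inserted.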

Step 5 - Common Interest Weighting

Apply Common Interest Weighting via the Common Interest Algorithm between the user and each possible instance.

There may be a way to use heaps or some other hierarchical data structure to sort the instances more efficiently. Even without that, as long as the implementation of the Common Interest Algorithm uses BTrees and pre-calculates lexicographically ordered maps of data, the cost of a single instance/user comparison grows only with the size of the tree specified by the user and the tree of the one instance being compared, rather than with the trees of all instances ^.^
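To show why pre-sorted maps keep a single comparison cheap, here is a sketch of a merge-style walk over two path-ordered weight lists. The scoring used here (a normalised dot product over shared topic paths, with weights assumed to lie in [-1, 1]) is only a stand-in chosen for this example, not the actual Common Interest Algorithm, and the names `Weighted` and `alignment` are hypothetical.

```python
from math import sqrt
from typing import List, Tuple

# Assumed shape: (topic path, weight) pairs, pre-sorted by path.
Weighted = List[Tuple[Tuple[str, ...], float]]


def alignment(user: Weighted, instance: Weighted) -> float:
    """Stand-in alignment score in [-1, 1]; negative means anti-aligned."""
    i = j = 0
    dot = 0.0
    # Walk both sorted lists once: cost is O(len(user) + len(instance)).
    while i < len(user) and j < len(instance):
        user_path, user_weight = user[i]
        inst_path, inst_weight = instance[j]
        if user_path == inst_path:
            dot += user_weight * inst_weight
            i += 1
            j += 1
        elif user_path < inst_path:
            i += 1
        else:
            j += 1
    norm = sqrt(sum(w * w for _, w in user)) * sqrt(sum(w * w for _, w in instance))
    return dot / norm if norm else 0.0


user_tree = [(("art",), 1.0), (("tech", "rust"), 1.0)]
rust_instance = [(("tech", "rust"), 1.0)]
anti_art_instance = [(("art",), -1.0)]
```

Because each list is only advanced forwards, no other instance's data is touched during the comparison, which is the property described above.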

There may also be ways to compare the user against all instances at once more efficiently that I don’t know of. But the point is, we can use the Common Interest Algorithm to assign weights for each instance/group/etc. relative to each user.

We could also find some way to convert a user's search query into Common Interest Algorithm tree weights, using the list of known terms. This would be for slightly more advanced searches, or for people searching for communities or other groups rather than instances.
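One very simple way this query conversion could look is sketched below. It assumes a merged map from topic paths to lowercase, single-word terms, and it gives every matched path a flat weight of 1.0. Both of those choices, and the name `query_to_weights`, are assumptions for illustration only.

```python
from typing import Dict, Set, Tuple

TopicPath = Tuple[str, ...]


def query_to_weights(
    query: str, merged_terms: Dict[TopicPath, Set[str]]
) -> Dict[TopicPath, float]:
    """Turn a free-text query into topic-path weights via known terms."""
    tokens = set(query.lower().split())
    weights: Dict[TopicPath, float] = {}
    for path, terms in merged_terms.items():
        # Any shared word between the query and a path's terms counts as a hit.
        if tokens & terms:
            weights[path] = 1.0  # flat weight: an assumption, not from the post
    # Keep the result path-ordered so it can feed a sorted-map comparison.
    return dict(sorted(weights.items()))


known_terms = {
    ("art",): {"drawing", "painting"},
    ("tech", "rust"): {"rust", "rustlang"},
}
query_weights = query_to_weights("Rust painting communities", known_terms)
```

A real version would probably want multi-word terms, stemming, and weights that depend on how specific the matched term is.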

Step 6 - Elimination of Anti-Aligned Instances

Any instances/groups/communities/etc. with alignment <0 should be immediately eliminated from the list of suggested instances/groups/communities/etc. to the user.

Step 7 - Combining Sentiment Alignment Weights & Other Ranking, plus Final Selection

We already have some ranking information based on how willing and able each instance is to take on new users, plus information on how aligned each instance is with this hypothetical new user. The alignment is now a fraction from 0 to 1, as we cut out the instances that have a negative alignment with the user ^.^. I suggest we find some simple way to join those two values together. For now, I suggest simply multiplying the alignment fraction by the weight for each instance, and then using probabilistic selection to direct the user to an instance that aligns with what they want ^.^
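The multiply-then-pick idea could be sketched like this. The names `pick_instance`, `alignment` and `readiness` are hypothetical, and "probabilistic selection" is assumed here to mean a weighted random choice, done with the standard library's `random.choices`.

```python
import random
from typing import Dict, Optional


def pick_instance(
    alignment: Dict[str, float],
    readiness: Dict[str, float],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick one instance at random, weighted by alignment * readiness."""
    if rng is None:
        rng = random.Random()
    # Negative alignments are skipped defensively (they belong to Step 6).
    combined = {
        name: a * readiness.get(name, 0.0)
        for name, a in alignment.items()
        if a >= 0
    }
    candidates = [name for name, weight in combined.items() if weight > 0]
    if not candidates:
        return None  # nothing suitable; random.choices rejects all-zero weights
    weights = [combined[name] for name in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


example_alignment = {"a.example": 0.9, "b.example": -0.2, "c.example": 0.5}
example_readiness = {"a.example": 1.0, "b.example": 1.0, "c.example": 0.0}
chosen = pick_instance(example_alignment, example_readiness)
```

Picking randomly rather than always taking the top score spreads new users across several well-aligned instances instead of sending everyone to the same one.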

It may also be desirable to prioritise somewhat older instances with better uptime or greater trustworthiness (e.g. using some kind of heuristic to detect bot instances or similar), and to modify the weightings based on that, or to eliminate some instances entirely ^.^

For non-instance searching or discovery, we can use the alignment ranking directly as a form of search ranking :)

Step 8 - Redirection

Redirect the user to the “final” signup page listed in the instance metadata, passing their desired username as a parameter. It may be worth using WebFinger to check that the username isn’t already taken on the selected instance, automatically moving on to other instances from the list until one is found where the username is free, and warning the user when that happens.
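A sketch of how the redirect and the username check might fit together is below. The WebFinger URL shape (`/.well-known/webfinger?resource=acct:user@host`) comes from RFC 7033. Everything else is an assumption for illustration: the `username` query parameter name, the function names, and the `is_taken` callback, which in a real service would fetch the WebFinger URL and interpret the response (for example treating a 404 as "free", which servers may not all honour).

```python
from typing import Callable, Iterable, Optional, Tuple
from urllib.parse import urlencode


def webfinger_url(host: str, username: str) -> str:
    """Build the RFC 7033 lookup URL for acct:username@host."""
    query = urlencode({"resource": f"acct:{username}@{host}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def signup_url(signup_page: str, username: str) -> str:
    """Append the desired username; 'username' is a hypothetical parameter name."""
    separator = "&" if "?" in signup_page else "?"
    return f"{signup_page}{separator}{urlencode({'username': username})}"


def first_free_signup(
    candidates: Iterable[Tuple[str, str]],  # (host, signup page), best first
    username: str,
    is_taken: Callable[[str, str], bool],   # injected lookup, e.g. via WebFinger
) -> Optional[str]:
    """Return the signup URL of the first instance where the name is free."""
    for host, page in candidates:
        if not is_taken(host, username):
            return signup_url(page, username)
    return None  # caller should warn the user and ask for another name


taken_names = {("a.example", "alice")}
redirect = first_free_signup(
    [("a.example", "https://a.example/signup"),
     ("b.example", "https://b.example/join?ref=x")],
    "alice",
    lambda host, name: (host, name) in taken_names,
)
```

Keeping the network lookup behind a callback means the selection logic can be tested without contacting any instance.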

If we’re talking about discoverability of communities or similar, you just put those in order of their direct sentiment alignment rank ^.^

Igotz80HDnImWinning · 2 years ago

I would LOVE to see a user tuneability control for continued content discovery along these same weighted relationships. Kind of like a Discover Weekly meter that you could adjust/threshold to see suggested content from instances that are more vs less similar to ones we follow. You may have said that in here but either way this seems really useful for instance steering/selection and distribution.

CaptBobbers · 2 years ago

@Igotz80HDnImWinning @sapient_cogbag
Seems super useful.
