(7) Composability Series: Web3 Scores, Stacks, and Algorithms
The types of web3 scores, and how they shape the future of the data stack
We’ve seen algorithms become the lifeblood of the internet - driving infinite recommendations on top of infinite feeds. In web2, these algorithms are complete black boxes (end-to-end), used for one specific application, and manipulated by some interested party. No one knows exactly what inputs are used, what scores are given to people and content, or what models are used for curation. And you can be sure there’s no API to run their models and get their outputs yourself.
Web3 algorithms are still nascent, but we’re close to a tipping point due to a combination of:
Bigger Blockchains: The blockchain data behemoth is growing exponentially due to transaction capacity scaling and the number of layer 2 chains launched.
Modular Evolution: Web3 protocols are being optimized and simplified while emphasizing modular composability. That means choices that used to belong to the protocol team are now up to individual communities - everything from pricing curves to token and voting permissions to customizable and conditional token marketplaces.
Hyperbundling: With increased composability comes heavy aggregation - across defi, NFTs, and DAOs. It should be no surprise that Uniswap and OpenSea are already heavily focused here.
Maturing Data Infra: Everyone and their grandma knows how to do auto-generated decoded tables already. Data providers like Dune are maturing to compete at the aggregated tables level instead, across many chains.
All this means more data, more cross-app and cross-domain intricacies, and more powerful ways to index and analyze data. It’s inevitable that we’ll see this data leveraged to create curated and tiered experiences for communities and users alike.
Currently, web3 algorithms focus on calculating scores targeting different facets of a user’s profile (using aggregate activity across multiple chains/apps). Protocol teams and communities can then decide how to use those scores to enhance their product experience, governance processes, or tokenomics.
In this article, I’ll cover a framework for comparing scores and understanding the interlinked data stacks behind them.
⚠️ What I’m covering is separate from token-based reputation score systems. While they are related, I’m not interested in exploring tokenomics in this article. Most of the scores here don’t directly translate to some balance or distribution of tokens. I’ll have a follow-up article at some point discussing how to possibly model tokenomics with various kinds of categorical scores, but I believe trying too hard to attach them at the start will lead to perverse incentives and difficulty growing organically.
Modeling the Web3 Score Landscape
Let’s start by defining some of the current main categories for user scores:
Creators/Curators: Scoring someone’s ability to create or curate content
DeFi/Credit Management: Scoring someone’s ability to deploy and manage capital
Contributors: Scoring someone’s contributions to a protocol/DAO and their skill set
Rule-based: Scoring someone based on a defined set of actions taken
Sybil (Identity): Scoring someone based on how likely they are to be human
We’ll likely see Sybil scores that combine with credit scores for enhanced access in DeFi, or contributor scores that combine with rule-based scores for prioritized onboarding to DAOs. However, scores can differ a lot even within the same category, so let’s compare them on two axes:
Contextual <> Generalized: how generalizable is a score across the communities it’s used in? The more contextual a score is, the less well it translates across communities or applications. Generalized scores are good for comparisons across communities and protocols, while contextual scores are better for granular analysis of health over time.
Elo Scores <> Levels: is the score constantly adjusted and normalized, or is it a number that only goes up? With an Elo score, your score depends on, and is ranked against, the actions of others in the ecosystem. For the most part, levels will be more beginner friendly and better understood than Elo scores.
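To make the second axis concrete, here is a minimal Python sketch contrasting the two behaviours. It uses the standard Elo update formula from chess; the K-factor of 32, the 400-point scale, and the 100-XP-per-level rule are illustrative assumptions, not parameters used by any provider mentioned in this article.

```python
from typing import Tuple


def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected result for A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(
    rating_a: float, rating_b: float, result_a: float, k: float = 32.0
) -> Tuple[float, float]:
    """Return both new ratings; result_a is 1.0 (win), 0.5 (draw) or 0.0 (loss)."""
    delta = k * (result_a - elo_expected(rating_a, rating_b))
    # Whatever A gains, B loses: the score is always relative to others.
    return rating_a + delta, rating_b - delta


def level_from_xp(xp: int, xp_per_level: int = 100) -> int:
    """A level only goes up, because earned XP is never taken away."""
    return 1 + xp // xp_per_level
```

An Elo-style score can fall even if you do nothing wrong, simply because others improved, while a level is a private counter that nobody else can move.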
Here are some examples of scores from the real world, placed in our four quadrants:
Contextual + Elo Score: Your competitive ranking in a game like Chess or League of Legends
Contextual + Level: Your level in a rewards program, like Citibike Bike Angels
Generalized + Elo Score: Your socioeconomic status (combination of education, income, health)
Generalized + Level: Your years of education/degrees earned
Here’s where some of the top web3 scoring systems/protocols sit along the axes:
I’ll briefly describe what each score provider does, hopefully explaining their placement. Note that I placed each provider based on how I interpreted their main offerings (as of September 1st, 2022):
Creators/Curators:
yup.io: Action adapters that are combined in a weighted score.
$WRITE airdrop: Betweenness centrality algorithm to give you a weighted score of your Mirror activity.
jokedao: Round-based voting on best jokes.
DeFi/Credit:
MACRO: Creates a market conditions time-weighted score of your defi actions.
ARCx: A gamified and clearly defined three-part defi scoring system.
Degenscore: Ranks how much of a degen you are compared to other protocol users with a score and percentile format.
Contributors:
layer3xyz: Experience earned by fulfilling bounties and winning contests.
0xStation: DAO passports where defined actions become a part of a searchable system of records (slide 10).
metropolis: Building generalized identity around gnosis safes to build more context-rich data around on-chain group interactions.
Coordinape: Your fellow contributors vote on your level of contribution per epoch (interval of time).
Rule-based:
Rabbithole: Custom skill trees with specific completable actions.
Metagame: Activity logs of various categories of (decoded) transaction types.
Sybil:
BrightID: A vote-in system where you get a fluctuating health score based on the SybilRank of the network closest to you.
PoH: Also a vote-in system, but with deposit minimums. You’re either in the network or you’re out (challenged/voted out). Union kind of builds off of this.
Worldcoin: Scans your eye to give you a private key. Don’t lose either one!
Right now we’re mostly seeing scores used in tiered protocol access and leaderboards/ranked feeds, but more powerful curation algorithms are right around the corner. To understand how they’ll emerge, we need to take a look at both the score data stack and score aggregators.
From data provider to score provider - then back
Below is my simplified interpretation of the data stack used by score providers - I’ll walk through it from the bottom to the top.
Sourcing Data:
While you can still build your own events/transactions indexer off of node providers like Alchemy, a few web3 product teams are open-sourcing their decoding indexers (zora, gallery). This trend will continue for most standard contract types, such that direct JSON RPC calls get abstracted more and more.
For nicely aggregated data, you’d likely use The Graph (for mapped GraphQL subscriptions) or Dune Analytics (for cross-protocol/chain abstractions).
Feature Engineering:
Unlike data mappers such as Dune that capture every on-chain data point since genesis, score providers usually ingest just a subset of data depending on the protocols they’re interested in. From there, we have three steps of feature engineering:
Splitting actions into binary and continuous components, such as “borrowed USDC” versus “health factor of loan over time”.
Generalize variables (usually complex structs) across protocols of the same type; for example, loan health factors across DeFi protocols must be generalized based on respective risk parameters.
Wallet normalization means choosing a score or level distribution you want wallets to fit into. Binary variables can be used to boost or decay scores for more custom distributions.
There are more steps if you consider including external datasets that are already preprocessed, but I’ve left that out of the diagram.
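The three feature engineering steps above can be sketched in a few lines of Python. Everything here is hypothetical: the protocol names, the event tuples, the liquidation thresholds, and the choice of a 0-100 rank distribution with a flat boost are assumptions made for illustration, not how any particular score provider works.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw events: (wallet, protocol, action, health_factor or None)
EVENTS = [
    ("0xaaa", "lend_a", "borrow", 1.8),
    ("0xaaa", "lend_a", "borrow", 1.4),
    ("0xbbb", "lend_b", "borrow", 2.5),
    ("0xccc", "lend_b", "deposit", None),
]

# Assumed health factor at which each (made-up) protocol liquidates a loan.
LIQUIDATION_AT = {"lend_a": 1.0, "lend_b": 1.25}


def build_features(events):
    """Steps 1 and 2: a binary flag plus a protocol-generalized continuous value."""
    borrowed = defaultdict(bool)
    safety = defaultdict(list)
    for wallet, protocol, action, health in events:
        borrowed[wallet] |= action == "borrow"
        if health is not None:
            # Rescale so 1.0 means "at the liquidation line" on every protocol.
            safety[wallet].append(health / LIQUIDATION_AT[protocol])
    return {
        w: {"borrowed": borrowed[w], "avg_safety": mean(safety[w]) if safety[w] else 0.0}
        for w in borrowed
    }


def normalize(features, boost=5.0):
    """Step 3: rank wallets onto a 0-100 scale, then boost wallets that borrowed."""
    ordered = sorted(features, key=lambda w: features[w]["avg_safety"])
    span = max(len(ordered) - 1, 1)
    scores = {}
    for rank, wallet in enumerate(ordered):
        base = 100.0 * rank / span
        bonus = boost if features[wallet]["borrowed"] else 0.0
        scores[wallet] = min(100.0, base + bonus)
    return scores
```

Calling `normalize(build_features(EVENTS))` gives each wallet a position in the chosen distribution. Swapping the rank step for a different target distribution, or turning the boost into a decay, changes only `normalize`.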
Model:
While inputs/outputs are much more transparent in web3, some models are still privately developed right now (either for IP or Sybil/gamification reasons). I don’t have much to say on the models themselves since we’re still using the same web2 academic algorithms and libraries. However, I do believe that open-source models will win in the long run.
At the end of the day, everyone needs to be able to trust and interpret these scores - especially if there’s any sort of tokenomics/value involved. You need both builder and consumer communities to buy into using your scores - there’s no way for you to just force it upon them (yet, and hopefully never). Once you generate that trust, then you’ll have an extremely strong flywheel of more applications, more data, more builders, and more users.
Most people already understand that it’s not the technology that’s special or proprietary in web3 - it’s the community. It doesn’t matter how strong your team is, you won’t win by keeping things private.
Service:
On top of their basic score offerings, score providers also act as special data providers. All their engineered features can be re-used as a tiered data service similar to Hugging Face NLP libraries.
There are three main data consumers:
Apps: Some products/protocols/communities will integrate directly with base scores, most likely the contextual scores (especially contributor scores). Features within the model will likely be useful for various UI elements and forums/discords.
Aggregators: Improving the developer experience for integrations is a no-brainer. Some examples out currently are:
Chainlink flux aggregator: for on-chain score deployment, you might want to take a weighted Sybil score from multiple sources.
Gitcoin passport: for social apps, you’ll likely want easy access to multiple types of scores to enhance user profiles. Ceramic fits here too.
Orange Protocol: for complex protocols (i.e. defi) you might need an ensemble model approach where you have complete control of inputs, models, and output variability.
My guess is that aggregators will end up having a larger and larger say in the governance and tech stack of score providers over time.
Analytics Platforms: Wallet labeling has already proven extremely powerful for various analytics use cases. Having scores as joinable tables would give us completely new ways of measuring adoption/market share, user segments, and product-community-fit across protocols.
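As a rough illustration of the aggregator idea of taking a weighted Sybil score from multiple sources, here is a small Python sketch. The provider names, weights, and the 0-1 score range are made up, and this is a plain weighted mean written for this article, not the mechanism of any aggregator named above.

```python
from typing import Dict, Optional


def weighted_score(
    scores: Dict[str, Optional[float]], weights: Dict[str, float]
) -> float:
    """Weighted mean over the providers that actually returned a score."""
    available = {
        provider: score
        for provider, score in scores.items()
        if score is not None and provider in weights
    }
    total_weight = sum(weights[provider] for provider in available)
    if total_weight == 0:
        raise ValueError("no weighted provider returned a score")
    # Re-normalizing by total_weight keeps the result on the same scale
    # even when some providers are missing.
    return sum(weights[p] * s for p, s in available.items()) / total_weight


# Hypothetical providers; None means the provider has no score for this wallet.
sybil_scores = {"provider_a": 0.9, "provider_b": 0.5, "provider_c": None}
provider_weights = {"provider_a": 2.0, "provider_b": 1.0, "provider_c": 1.0}
combined = weighted_score(sybil_scores, provider_weights)
```

Who sets the weights is exactly the governance question: whoever controls `provider_weights` effectively decides how much each score provider matters.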
I believe that the more generalizable the score, the more valuable the final outputs are (compared to the features). Conversely, the more contextual the score, the more valuable the engineered features are. This should be intuitive given that a community would likely have more use for contextual features in their forums and membership processes than something general like who voted for whom in a Sybil network.
Concluding thoughts on the future of scores
There’s a ton we can extrapolate about where scores are heading and what they’ll be used for. I’m personally excited about how score providers will enhance web3 native profile search and contributor relationship management systems (web3 CRM).
Transaction hashes and addresses give us strong anchors for finding relationships between users, but individual activity isn’t easily searchable. For example, you can see an address has some number of tokens and transactions, but you can’t search for their first ENS registered, first liquidation, or when they fulfilled their first bounty. These require the data to be more semantically layered, which is what score providers do when they engineer features into binary/continuous variables most relevant to their models. A transaction that’s just a USDC transfer would now have standardized metadata around what grant proposal it was tied to, and a transaction showing an ETH deposit into Compound is now classified as “avoiding liquidation”. It might not seem like much on its own, but combining all the features across score providers gives us the ability to do a top-down search starting from some “category” to get a “proof of action” - rather than trying to filter bottom-up manually for specific function calls or transactions.
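Here is a minimal Python sketch of what that top-down search could look like once transactions carry a semantic label. The `LabeledTx` record, the category strings, and the sample hashes are all hypothetical; the only assumption taken from the text is that a score provider has already attached a category to each transaction.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LabeledTx:
    tx_hash: str
    wallet: str
    block_time: int  # unix seconds
    category: str    # semantic label attached by a (hypothetical) score provider


def first_action(
    txs: Iterable[LabeledTx], wallet: str, category: str
) -> Optional[LabeledTx]:
    """Start from a category and return the earliest matching 'proof of action'."""
    matches = [t for t in txs if t.wallet == wallet and t.category == category]
    return min(matches, key=lambda t: t.block_time, default=None)


# Made-up sample data for illustration only.
TXS = [
    LabeledTx("0x03", "0xaaa", 1_660_000_300, "bounty_fulfilled"),
    LabeledTx("0x01", "0xaaa", 1_660_000_100, "avoiding_liquidation"),
    LabeledTx("0x02", "0xaaa", 1_660_000_200, "bounty_fulfilled"),
    LabeledTx("0x04", "0xbbb", 1_660_000_050, "bounty_fulfilled"),
]

first_bounty = first_action(TXS, "0xaaa", "bounty_fulfilled")
```

The query never mentions a contract address or function selector, which is the point: the category does the work that manual filtering of function calls would otherwise do.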
Combining scores allows us to filter through large clusters of users and create flexible metrics for measuring community health. Each segment within a web3 community will have a sliding scale for understanding underlying users and their activity. Younger communities would likely check generalized + level scores more often to see if overall health is going up compared to others, while mature communities would rely more on contextual + Elo scores to track the pulse and retention of top members. You could prioritize your onboarding funnel for voters by layering Sybil scores on top of historical governance activity (proposals, delegations, votes), or for builders by layering contribution levels on top of their historical compensation distributions (payment rate, token diversity). I’m sure there are many teams working on contributor relationship management systems, where contributor activity is layered on top of score-based filtering to build out a more accurate understanding of which contributors (old, new, or tangential) are most active and likely to be interested in taking on a new project.
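The voter onboarding example can be sketched as a filter followed by a sort. This is a toy Python version under stated assumptions: the field names, the 0.7 Sybil threshold, and the equal weighting of proposals, delegations, and votes are all invented for illustration.

```python
from typing import Dict, List


def prioritize_voters(wallets: List[Dict], min_sybil: float = 0.7) -> List[str]:
    """Drop likely Sybils first, then rank the rest by governance activity."""
    eligible = [w for w in wallets if w["sybil_score"] >= min_sybil]
    ranked = sorted(
        eligible,
        key=lambda w: w["proposals"] + w["delegations"] + w["votes"],
        reverse=True,
    )
    return [w["address"] for w in ranked]


# Made-up candidate wallets with a Sybil score layered on governance history.
CANDIDATES = [
    {"address": "0xaaa", "sybil_score": 0.95, "proposals": 1, "delegations": 0, "votes": 4},
    {"address": "0xbbb", "sybil_score": 0.30, "proposals": 9, "delegations": 9, "votes": 9},
    {"address": "0xccc", "sybil_score": 0.80, "proposals": 2, "delegations": 3, "votes": 7},
]

funnel = prioritize_voters(CANDIDATES)
```

The most active wallet here is filtered out entirely because its Sybil score is low, which is the kind of outcome you only get by layering one score on top of another.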
Both activity search and score combinations go towards scaling membership - an article I have planned for later this year! As always, thanks for reading this far - if you enjoyed this piece, please collect and subscribe for more 🙂
My DMs are always open if you have questions or want to discuss anything I’ve written here (especially if you’re currently building or researching this topic).
Thanks to Serena for her feedback on this article! 😊