twitter/the-algorithm is the full source for the systems that power the For You Timeline, recommended notifications, and related surfaces at X (formerly Twitter). It covers candidate generation, ML ranking, content filtering, and serving infrastructure — roughly 73k GitHub stars and a codebase that spans Scala, Python, Java, and Rust.
Why I starred it
April 2023 was a wild month. Elon Musk tweeted that he'd open-source the algorithm, and then they actually did it. The repo dropped with 22 components and real production code — not a cleaned-up toy. The architecture docs are legitimate, the Scala internals compile against internal Twitter frameworks, and there's enough there to understand how recommendation systems at this scale actually work.
Most "recommendation algorithm" discourse is hand-wavy. This repo is not.
How it works
The For You timeline is a 4-stage pipeline:
1. Candidate generation — The system queries roughly 5 sources in parallel, narrowing the pool from ~500M tweets down to around 1,500 candidates:
- search-index (Earlybird) — in-network tweets ranked by a light Scala model
- cr-mixer — a coordinating service that calls out to external candidate sources and caches results
- user-tweet-entity-graph (UTEG) — an in-memory graph traversal using the GraphJet framework
- follow-recommendations-service (FRS) — out-of-network accounts and their tweets
- tweet-mixer — coordination layer for out-of-network candidates
2. Feature hydration and ranking — roughly 6,000 features are hydrated for each candidate, then ranking runs in two neural passes: the light ranker runs inside the search index (Earlybird), while the heavy ranker is a separate neural network served via Navi.
3. Filtering — visibilitylib/ applies rules based on safety labels, block/mute relationships, legal compliance, and content quality. The library is described as "a centralized rule engine" — one policy per safety level, evaluated as a priority-ordered rule sequence.
4. Mixing — home-mixer/ (built on product-mixer/) blends tweets, ads, who-to-follow modules, and prompts into the final response.
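Stripped of the infrastructure, the four stages read like a single function. Here is a loose Python sketch of that flow — every name in it is mine for illustration, not an API from the repo:

```python
# Hypothetical sketch of the four-stage For You pipeline described above.
# Sources, rankers, filters, and mixers are passed in as plain callables.

def build_timeline(user_id, sources, heavy_ranker, filters, mixer, k=1500):
    # 1. Candidate generation: gather from every source
    #    (the real system fans out in parallel; sequential here)
    candidates = [c for source in sources for c in source(user_id)]
    # 2. Hydrate features and rank: the light pass happens inside the
    #    source (Earlybird); the heavy ranker scores the merged pool
    scored = sorted(candidates, key=heavy_ranker, reverse=True)[:k]
    # 3. Filtering: drop anything a visibility rule rejects
    visible = [c for c in scored if all(rule(user_id, c) for rule in filters)]
    # 4. Mixing: blend in ads, modules, prompts for the final response
    return mixer(visible)
```

The real pipeline is far more elaborate, but the shape — fan-out, rank, filter, mix — is exactly this.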
The most interesting piece is SimClusters (src/scala/com/twitter/simclusters_v2/). The algorithm detects ~145,000 communities from the Twitter follow graph by:
- Computing cosine similarity between producer follow vectors to build a producer-producer similarity graph
- Running Metropolis-Hastings sampling-based community detection to assign each producer a cluster
- Building InterestedIn embeddings by multiplying the follow matrix by the KnownFor matrix
The KnownFor matrix is intentionally sparse — each producer belongs to at most one community. ProducerEmbeddingsFromInterestedIn.scala relaxes this for content: a user can be "known for" multiple topics when you compute the cosine similarity between their followers and each community's InterestedIn vector.
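The matrix product is easy to picture with toy sizes. A minimal sketch, assuming invented dimensions and values (3 users, 4 producers, 2 communities instead of 20M and 145k):

```python
# Toy illustration of InterestedIn = Follow x KnownFor, with made-up data.

def matmul(a, b):
    # Plain-Python matrix multiply; zip(*b) transposes b into columns
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Follow[u][p] = 1 if user u follows producer p
follow = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

# KnownFor[p][c]: each producer is "known for" at most one community,
# so every row has at most one non-zero entry (the sparsity constraint)
known_for = [
    [1.0, 0.0],  # producer 0 -> community 0
    [1.0, 0.0],  # producer 1 -> community 0
    [0.0, 1.0],  # producer 2 -> community 1
    [0.0, 0.0],  # producer 3 -> no community
]

# InterestedIn[u][c]: how much of user u's follow graph sits in community c
interested_in = matmul(follow, known_for)
print(interested_in)  # [[2.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
```

User 0 follows two community-0 producers, so their InterestedIn vector leans entirely into that community — the whole scheme is just this, at 20M x 145k scale.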
Tweet embeddings are updated in real-time. Every time a user favorites a tweet, their InterestedIn vector is added to that tweet's embedding. This runs as an online Heron streaming job — so fresh tweets accumulate signal within minutes.
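The update rule itself is a one-liner: add the favoriting user's InterestedIn vector into the tweet's embedding. A tiny sketch, where a dict stands in for the Heron job's state and all names are illustrative:

```python
from collections import defaultdict

NUM_CLUSTERS = 4  # the real system uses ~145,000

# tweet_id -> accumulated per-cluster weights
tweet_embeddings = defaultdict(lambda: [0.0] * NUM_CLUSTERS)

def on_favorite(tweet_id, user_interested_in):
    # Each favorite event folds the user's InterestedIn vector
    # into the tweet's embedding, cluster by cluster
    emb = tweet_embeddings[tweet_id]
    for c, weight in enumerate(user_interested_in):
        emb[c] += weight

on_favorite("t42", [0.0, 1.0, 0.5, 0.0])
on_favorite("t42", [0.0, 0.5, 0.0, 0.0])
print(tweet_embeddings["t42"])  # [0.0, 1.5, 0.5, 0.0]
```

Because it's additive and event-driven, a fresh tweet's embedding sharpens with every favorite — no batch job required.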
The UpdateKnownFor20M145K2020.scala batch job runs on a 7-day schedule covering the top 20M producers. The parameter that caught my eye:
// squareWeightEnable:
// edge weight = cosine_sim * cosine_sim * 10
// Squaring makes high-weight edges relatively more important;
// a neighbor with cosine_sim=0.1 is more than 2x important
// compared to cosine_sim=0.05
val squareWeightEnable = args.boolean("squareWeightEnable")
It's a deliberate tradeoff to amplify strong community signals and suppress noise. The comment explains the "why" — uncommon in production code.
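The arithmetic is worth seeing once. A Python rendering of the weighting (the Scala original only exposes the boolean flag; the rest follows the comment):

```python
# Squared edge weights, per the comment in UpdateKnownFor20M145K2020.scala.

def edge_weight(cosine_sim, square_weight_enabled):
    if square_weight_enabled:
        # edge weight = cosine_sim * cosine_sim * 10
        return cosine_sim * cosine_sim * 10
    return cosine_sim

strong = edge_weight(0.1, True)    # 0.1 * 0.1 * 10 = 0.1
weak = edge_weight(0.05, True)     # 0.05 * 0.05 * 10 = 0.025
print(strong / weak)  # 4.0 — linearly the ratio would only be 2.0
```

Squaring turns a 2x similarity gap into a 4x weight gap, which is exactly the "amplify strong signals, suppress noise" tradeoff the comment describes.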
Navi (navi/) is Twitter's own ML serving layer, written in Rust. It exposes a gRPC API compatible with TensorFlow Serving, which means existing TF clients drop in without changes. Supports TF, ONNX, and experimental PyTorch. The ONNX path is specialized for the home recommendation use case with a proprietary BatchPredictRequest Thrift format.
The product-mixer/ framework is the glue. Pipelines are composed of typed steps: Candidate Pipelines feed into Recommendation Pipelines, which feed into Mixer Pipelines. The type hierarchy in product-mixer/core/src/main/scala/com/twitter/product_mixer/core/pipeline/ shows the full abstraction: PipelineConfig, Pipeline, PipelineResult, FailOpenPolicy. When a step fails, the framework has explicit fallback behavior rather than crashing the whole request.
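The fail-open idea is simple enough to sketch in a few lines. A hypothetical miniature in Python (the real framework is Scala with a much richer type hierarchy; these class names only echo, not match, the repo's):

```python
class FailOpenStep:
    """A pipeline step that degrades to a fallback instead of raising."""

    def __init__(self, fn, fallback):
        self.fn = fn
        self.fallback = fallback

    def run(self, value):
        try:
            return self.fn(value)
        except Exception:
            # Fail open: serve a degraded result rather than
            # crashing the whole timeline request
            return self.fallback(value)

def run_pipeline(steps, value):
    for step in steps:
        value = step.run(value)
    return value

def flaky_enrich(tweets):
    raise TimeoutError("downstream feature store is down")

steps = [
    FailOpenStep(lambda ts: [t for t in ts if t["ok"]], fallback=lambda ts: ts),
    FailOpenStep(flaky_enrich, fallback=lambda ts: ts),  # falls back to identity
]
result = run_pipeline(steps, [{"id": 1, "ok": True}, {"id": 2, "ok": False}])
print(result)  # [{'id': 1, 'ok': True}]
```

One flaky dependency costs you that step's enrichment, not the user's timeline — which is the point of making the fallback policy explicit in the type system.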
Using it
This isn't a drop-in library — it's production infrastructure tied to Twitter's internal Hadoop/Scalding/Thrift stack. You can read individual components, but building them is another matter: the Bazel setup has no top-level WORKSPACE, so you can't just bazel build //.... To compile a component like cr-mixer, you'd need to wire up the missing Twitter-internal dependencies yourself.
What you can do is study the algorithm definitions. The RETREIVAL_SIGNALS.md (their typo, not mine) documents every user signal across components in a clean matrix — which signals are used as features vs. labels in SimClusters, TwHIN, UTEG, FRS, and light ranking. That table alone is worth bookmarking.
| Signal | USS | SimClusters | TwHIN | UTEG | FRS | Light Ranker |
| --- | --- | --- | --- | --- | --- | --- |
| Tweet Favorite | Features | Features | Features/Labels | Features | Features/Labels | Features/Labels |
| Tweet Don't like | Features | N/A | N/A | N/A | N/A | N/A |
| Tweet Report | Features | N/A | N/A | N/A | N/A | N/A |
The "don't like" and "report" signals only feed into USS (User Signal Service) as features — they don't directly label training data for most candidate sources.
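Encoded as data, the matrix becomes queryable. A small sketch — the column order (USS first, then the components as the doc lists them) is my reading of the table, not a canonical format from the repo:

```python
# The signal matrix above as a dict, so you can ask which components
# consume a given signal. "F" = features, "F/L" = features and labels.

COMPONENTS = ["USS", "SimClusters", "TwHIN", "UTEG", "FRS", "LightRanker"]

SIGNALS = {
    "Tweet Favorite":   ["F", "F", "F/L", "F", "F/L", "F/L"],
    "Tweet Don't like": ["F", None, None, None, None, None],
    "Tweet Report":     ["F", None, None, None, None, None],
}

def components_using(signal):
    # Pair each usage cell with its component name, skipping N/A cells
    return [c for c, usage in zip(COMPONENTS, SIGNALS[signal]) if usage]

print(components_using("Tweet Report"))  # ['USS']
```

Run against the full table in the repo, this kind of lookup makes the features-vs-labels asymmetry between positive and negative signals immediately visible.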
Rough edges
The build system is incomplete by design. Twitter said they plan to add a full build and test system "in the future" — as of the latest commits, that hasn't materialized. The last real code commit was July 2023; there was a single update in September 2025 to "for-you recommendations code" but no context or changelog.
The visibilitylib/ README is explicit that parts of the code were removed and sanitized before open-sourcing. The rule engine skeleton is there, but the actual safety label definitions and enforcement logic are stripped.
Navi has no test or benchmark code — removed "due to data security concerns" per the README. So you have the serving architecture, but no performance baseline.
Bottom line
If you build recommendation systems, this is the most complete production reference architecture available anywhere. The SimClusters embedding strategy, the staged pipeline structure in product-mixer, and the real-time tweet embedding update loop are all worth studying. Don't expect to run it — expect to read it.
