xai-org / x-algorithm

Algorithm powering the For You feed on X

16,030 stars
2,780 forks
0 issues
Rust, Python

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing xai-org/x-algorithm in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/xai-org/x-algorithm)

Repository Overview (README excerpt)

X For You Feed Algorithm

This repository contains the core recommendation system powering the "For You" feed on X. It combines in-network content (from accounts you follow) with out-of-network content (discovered through ML-based retrieval) and ranks everything using a Grok-based transformer model.

> **Note:** The transformer implementation is ported from the Grok-1 open source release by xAI, adapted for recommendation system use cases.

Table of Contents

• Overview
• System Architecture
• Components
• Home Mixer
• Thunder
• Phoenix
• Candidate Pipeline
• How It Works
• Pipeline Stages
• Scoring and Ranking
• Filtering
• Key Design Decisions
• License

---

Overview

The For You feed algorithm retrieves, ranks, and filters posts from two sources:

• **In-Network (Thunder)**: Posts from accounts you follow
• **Out-of-Network (Phoenix Retrieval)**: Posts discovered from a global corpus

Both sources are combined and ranked together using **Phoenix**, a Grok-based transformer model that predicts engagement probabilities for each post. The final score is a weighted combination of these predicted engagements.

We have eliminated every single hand-engineered feature and most heuristics from the system. The Grok-based transformer does all the heavy lifting by understanding your engagement history (what you liked, replied to, shared, etc.) and using that to determine what content is relevant to you.

---

System Architecture

---

Components

Home Mixer

**Location:**

The orchestration layer that assembles the For You feed.
It leverages the framework with the following stages:

| Stage | Description |
|-------|-------------|
| Query Hydrators | Fetch user context (engagement history, following list) |
| Sources | Retrieve candidates from Thunder and Phoenix |
| Hydrators | Enrich candidates with additional data |
| Filters | Remove ineligible candidates |
| Scorers | Predict engagement and compute final scores |
| Selector | Sort by score and select top K |
| Post-Selection Filters | Final visibility and dedup checks |
| Side Effects | Cache request info for future use |

The server exposes a gRPC endpoint ( ) that returns ranked posts for a given user.

---

Thunder

**Location:**

An in-memory post store and realtime ingestion pipeline that tracks recent posts from all users. It:

• Consumes post create/delete events from Kafka
• Maintains per-user stores for original posts, replies/reposts, and video posts
• Serves "in-network" post candidates from accounts the requesting user follows
• Automatically trims posts older than the retention period

Thunder enables sub-millisecond lookups for in-network content without hitting an external database.

---

Phoenix

**Location:**

The ML component with two main functions:

• Retrieval (Two-Tower Model)

  Finds relevant out-of-network posts:

  • **User Tower**: Encodes user features and engagement history into an embedding
  • **Candidate Tower**: Encodes all posts into embeddings
  • **Similarity Search**: Retrieves top-K posts via dot product similarity

• Ranking (Transformer with Candidate Isolation)

  Predicts engagement probabilities for each candidate:

  • Takes user context (engagement history) and candidate posts as input
  • Uses special attention masking so candidates cannot attend to each other
  • Outputs probabilities for each action type (like, reply, repost, click, etc.)

See for detailed architecture documentation.

---

Candidate Pipeline

**Location:**

A reusable framework for building recommendation pipelines.
Defines traits for:

| Trait | Purpose |
|-------|---------|
| | Fetch candidates from a data source |
| | Enrich candidates with additional features |
| | Remove candidates that shouldn't be shown |
| | Compute scores for ranking |
| | Sort and select top candidates |
| | Run async side effects (caching, logging) |

The framework runs sources and hydrators in parallel where possible, with configurable error handling and logging.

---

How It Works

Pipeline Stages

• **Query Hydration**: Fetch the user's recent engagement history and metadata (e.g., following list)
• **Candidate Sourcing**: Retrieve candidates from:
  • **Thunder**: Recent posts from followed accounts (in-network)
  • **Phoenix Retrieval**: ML-discovered posts from the global corpus (out-of-network)
• **Candidate Hydration**: Enrich candidates with:
  • Core post data (text, media, etc.)
  • Author information (username, verification status)
  • Video duration (for video posts)
  • Subscription status
• **Pre-Scoring Filters**: Remove posts that are:
  • Duplicates
  • Too old
  • From the viewer themselves
  • From blocked/muted accounts
  • Containing muted keywords
  • Previously seen or recently served
  • Ineligible subscription content
• **Scoring**: Apply multiple scorers sequentially:
  • **Phoenix Scorer**: Get ML predictions from the Phoenix transformer model
  • **Weighted Scorer**: Combine predictions into a final relevance score
  • **Author Diversity Scorer**: Attenuate repeated author scores for diversity
  • **OON Scorer**: Adjust scores for out-of-network content
• **Selection**: Sort by score and select the top K candidates
• **Post-Selection Processing**: Final validation of post candidates to be served

---

Scoring and Ranking

The Phoenix Grok-based transformer model predicts probabilities for multiple engagement types:

The **Weighted Scorer** combines these into a final score:

Positive actions (like, repost, share) have positive weights.
Negative actions (block, mute, report) have negative weights, pushing down content the user would likely dislike.

---

Filtering

Filters run at two stages:

**Pre-Scoring Filters:**

| Filter | Purpose |
|--------|---------|
| | Remove duplicate post IDs |
| | Remove posts that failed to hydrate core metadata |
| | Remove posts older than threshold |
| | Remove user's own posts |
| | Dedupe reposts of same content |
| | Remove paywalled content user can't access |
| | Remove posts user has already seen |
| | Remove posts already served in session |

…
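
The filter, score, and select stages described in this excerpt can be illustrated with a minimal Python sketch. It is not the repository's implementation. The names `Candidate`, `ACTION_WEIGHTS`, `weighted_score`, and `rank`, the weight values, and the 48-hour age threshold are all hypothetical and chosen only for illustration. The excerpt states only that the final score is a weighted combination of predicted engagement probabilities, with positive weights for positive actions and negative weights for negative ones.

```python
from dataclasses import dataclass

# Hypothetical weights: the excerpt gives only the signs, not the values.
ACTION_WEIGHTS = {
    "like": 1.0,
    "repost": 2.0,
    "reply": 1.5,
    "block": -10.0,
    "report": -20.0,
}


@dataclass
class Candidate:
    post_id: int
    author_id: int
    age_hours: float
    probs: dict  # predicted probability per action type, e.g. {"like": 0.4}
    score: float = 0.0


def drop_duplicates(cands):
    """Keep the first candidate seen for each post ID."""
    seen = set()
    kept = []
    for c in cands:
        if c.post_id not in seen:
            seen.add(c.post_id)
            kept.append(c)
    return kept


def drop_old(cands, max_age_hours):
    """Remove posts older than the (assumed) age threshold."""
    return [c for c in cands if c.age_hours <= max_age_hours]


def drop_self(cands, viewer_id):
    """Remove the viewer's own posts."""
    return [c for c in cands if c.author_id != viewer_id]


def weighted_score(probs, weights=ACTION_WEIGHTS):
    """Weighted sum of predicted engagement probabilities."""
    return sum(weights.get(action, 0.0) * p for action, p in probs.items())


def rank(cands, viewer_id, k, max_age_hours=48.0):
    """Apply pre-scoring filters, score the survivors, return the top k."""
    cands = drop_duplicates(cands)
    cands = drop_old(cands, max_age_hours)
    cands = drop_self(cands, viewer_id)
    for c in cands:
        c.score = weighted_score(c.probs)
    return sorted(cands, key=lambda c: c.score, reverse=True)[:k]
```

Filtering before scoring keeps ineligible posts from ever reaching the scorer, and a predicted negative action such as a report can push an otherwise likeable post below zero. The author diversity and out-of-network adjustments mentioned in the pipeline stages are left out of this sketch.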