diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md new file mode 100644 index 000000000..18ea8f3fe --- /dev/null +++ b/docs/CLAUDE.md @@ -0,0 +1,483 @@ +# CLAUDE.md - Interactive Documentation Guide + +This file provides guidance to Claude Code when working on the interactive documentation in `/docs`. + +## Purpose + +The `docs/` directory contains **public-facing interactive documentation** that translates our technical research findings (from `notes/`) into accessible, engaging explanations for a broad audience. + +**Goals**: +- Make complex algorithmic behavior understandable to non-technical readers +- Provide concrete examples with real calculations +- Use interactive visualizations to demonstrate dynamic effects +- Maintain objectivity and verifiability (all claims backed by code references) +- Create an engaging reading experience + +**Audience**: Users, creators, researchers, policy makers, and anyone curious about how algorithmic systems shape discourse. + +## Quick Reference: Content Structure + +**Every algorithmic concept must follow this three-part structure:** + +1. **Intuition** - Plain language explanation answering "What is this?" and "Why does it matter?" +2. **Feel** - Interactive visualization or calculator to experience the dynamics +3. **Proof** - Mathematical formulas, concrete calculations, and code references for verification + +This ensures content is accessible (intuition), engaging (feel), and verifiable (proof). + +See [Writing Principles](#writing-principles) for detailed guidance. + +## Content Structure + +The investigation is structured as a multi-part series: + +1. **Introduction** (`parts/01-introduction.html`) - Why this matters, approach, questions +2. **Tweet Journey** (`parts/02-tweet-journey.html`) - Following a tweet through all 5 algorithmic stages +3. **User Journey** (`parts/03-user-journey.html`) - How user experience evolves over 6 months +4. 
**Discourse Levers** (`parts/04-discourse-levers.html`) - 6 mechanisms shaping platform discourse +5. **What This Means** (`parts/05-what-this-means.html`) - Objective analysis of designed vs emergent effects +6. **Conclusions** (`parts/06-conclusions.html`) - Perspective and implications (subjective analysis) +7. **Appendix** (`parts/07-appendix.html`) - Methodology, file index, verification guide + +**Landing page** (`index.html`) - Overview with key findings and navigation + +## Writing Principles + +### Content Structure: Intuition → Feel → Proof + +Every algorithmic concept should be presented in three layers to serve different learning styles: + +**1. General Description (Intuition)** +- Plain language explanation of what this mechanism does and why it exists +- Build intuition: help readers understand the "shape" of the behavior +- Answer: "What is this?" and "Why does it matter?" +- Example: "Twitter prevents any single author from dominating your feed by applying an exponential penalty..." + +**2. Visualization/Interactive Element (Feel)** +- Let readers experience the dynamic, not just read about it +- Interactive calculators, charts, simulations, or animated diagrams +- Answer: "How does this feel?" and "What happens if I change X?" +- Example: Slider to adjust tweet count and see penalty compound in real-time + +**3. Math & Code (Proof)** +- Concrete formulas with actual parameters from the code +- Step-by-step calculations with real numbers +- Code references with file paths and line numbers for verification +- Answer: "How exactly does this work?" and "Can I verify this?" 
+- Example: Formula with decay factor 0.5, floor 0.25, plus code reference + +This three-part structure ensures content is: +- **Accessible**: Non-technical readers get the intuition +- **Engaging**: Visual learners can explore interactively +- **Verifiable**: Technical readers can check the implementation + +### Objectivity Until Conclusions + +- **Parts 1-5**: Present facts, mechanics, and observable effects without judgment +- **Part 6**: Bring perspective, interpretation, and implications +- Use phrases like "the algorithm optimizes for" not "the algorithm wants to" +- Describe effects objectively: "this increases polarization" not "this is bad" + +### Show the Math + +Every algorithmic effect should include **concrete calculations**: + +```html +
When an author posts multiple tweets, each subsequent tweet receives a penalty:
+ +Formula: multiplier = (1 - floor) × decayFactor^position + floor
+
+Where:
+- decayFactor = 0.5
+- floor = 0.25 (minimum multiplier)
+- position = tweet number from this author (0-indexed)
+
+Example - Author posts 3 tweets with base score 100:
+
+Tweet 1: 100 × 1.0 = 100
+Tweet 2: 100 × 0.625 = 62.5
+Tweet 3: 100 × 0.4375 = 43.75
+
+Total reach: 206.25 (not 300!)
+Effective penalty: 31% loss vs posting separately
+
+Code:
+home-mixer/server/src/main/scala/com/twitter/home_mixer/product/scored_tweets/scorer/AuthorBasedListwiseRescoringProvider.scala:54
Out-of-network tweets receive a 0.75x multiplier (25% penalty).
+Code:
+home-mixer/.../RescoringFactorProvider.scala:45-57
+ +``` + +### Accessible Language + +Translate technical concepts into plain language while maintaining accuracy: + +**Technical** (for `notes/`): +> "The MaskNet architecture predicts 15 engagement probability distributions using parallel task-specific towers with shared representation layers." + +**Accessible** (for `docs/`): +> "The ranking model predicts 15 different ways you might engage with a tweet (like, reply, retweet, etc.) all at once. Each prediction gets a weight, and the weighted sum becomes the tweet's score." + +## HTML Structure and Patterns + +### Standard Page Template + +```html + + + + + +Code:
+path/to/file.scala:line-numbers
Key Insight: This is a critical finding that deserves emphasis.
+Brief description of the finding and its implications.
+Code: file.scala:123
Formula or calculation here
+With multiple lines
+Showing step-by-step math
+```
+
+## Interactive Visualizations
+
+Use interactive elements to help readers **experience** algorithmic dynamics, not just read about them.
+
+### Types of Visualizations
+
+**1. Decay Functions** - Show how effects change over time
+- Author diversity decay (exponential)
+- Feedback fatigue decay (linear over 140 days)
+- Temporal decay for tweet age
+
+**2. Multiplicative Effects** - Demonstrate compound behaviors
+- Gravitational pull (interest drift from 60/40 to 76/24)
+- Cluster reinforcement
+- Follower advantage compounding
+
+**3. Interactive Calculators** - Let readers adjust parameters
+- Tweet score calculator (adjust engagement weights)
+- Author diversity penalty (adjust number of tweets)
+- In-network vs out-of-network comparison
+
+**4. Flow Diagrams** - Show pipelines and stages
+- Tweet journey through 5 stages
+- Candidate funnel (1B → 1,400 → 50-100)
+- Signal usage across components
+
+**5. Comparison Charts** - Relative values and impacts
+- Engagement weights bar chart (Reply: 75.0 vs Favorite: 0.5)
+- Filter penalties comparison
+- Signal importance across systems
+
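+Before building a widget, the underlying curve can be sanity-checked in a few lines. Here is a minimal Python sketch of the author diversity decay, using the decayFactor = 0.5 and floor = 0.25 parameters quoted in the "Show the Math" example (the function name is ours, not from the codebase):

```python
# Author diversity penalty, as in the "Show the Math" example:
#   multiplier = (1 - floor) * decayFactor**position + floor
DECAY_FACTOR = 0.5
FLOOR = 0.25

def author_diversity_multiplier(position: int) -> float:
    """Score multiplier for the Nth consecutive tweet from one author (0-indexed)."""
    return (1 - FLOOR) * DECAY_FACTOR ** position + FLOOR

# Reproduce the worked example: 3 tweets, each with base score 100
scores = [100 * author_diversity_multiplier(i) for i in range(3)]
print(scores)       # [100.0, 62.5, 43.75]
print(sum(scores))  # 206.25 — not 300
```

+Plotting this function directly is the simplest static fallback for the interactive calculator.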
+### Implementation Guidelines
+
+**Technology**:
+- Vanilla JavaScript (no frameworks, keep it lightweight)
+- For charts: Chart.js or D3.js (include via CDN)
+- For animations: CSS transitions or Canvas API
+- Store data in separate JSON files in `assets/` if needed
+
+**Accessibility**:
+- All visualizations should have text fallbacks
+- Describe what the visualization shows before showing it
+- Provide static examples alongside interactive ones
+- Ensure keyboard navigation works
+
+**Example: Interactive Author Diversity Calculator**
+
+```html
+Adjust the number of tweets to see how the penalty compounds:
+ + + + +``` + +### When to Use Interactive vs Static + +**Use interactive visualizations when**: +- The concept involves change over time (decay curves, drift) +- Readers benefit from experimenting with parameters +- Multiple scenarios need comparison +- The dynamic is hard to grasp from numbers alone + +**Use static examples when**: +- A single concrete example is sufficient +- The calculation is straightforward +- Interaction would add complexity without insight +- Page performance is a concern + +## Content Tone and Style + +### Voice + +- **Clear and direct**: Short sentences, active voice +- **Conversational but precise**: Explain like you're talking to a smart friend +- **Objective in facts, thoughtful in implications**: Present mechanics objectively, analyze effects thoughtfully +- **No hyperbole**: Let the findings speak for themselves + +### Framing Effects + +Be careful with word choice that implies intent: + +**Avoid** (implies intent): +- "The algorithm wants you to..." +- "Twitter designed this to manipulate..." +- "This is meant to exploit..." + +**Prefer** (describes mechanism): +- "The algorithm optimizes for..." +- "This design choice results in..." +- "This creates an incentive to..." + +### Example Transformations + +**From technical finding** (`notes/`): +> "The Heavy Ranker applies a -74.0 weight to the predicted probability of negative feedback, while favorites receive only a 0.5 weight. This creates a 148:1 ratio favoring avoidance of negative signals over accumulation of positive ones." + +**To accessible explanation** (`docs/`): +> "When scoring a tweet, the algorithm severely penalizes content that might trigger 'not interested' clicks (-74.0 weight) while barely rewarding favorites (0.5 weight). This means one 'not interested' click has the same negative impact as 148 likes have positive impact. The algorithm is designed to avoid showing you things you'll reject, not to show you things you'll like." 
+ +## Visual Design + +The site uses a dark theme (`css/style.css`) inspired by Twitter/X's interface. + +**Design principles**: +- High contrast for readability (light text on dark background) +- Generous whitespace and line height +- Code blocks with syntax highlighting colors +- Responsive layout (works on mobile) +- Consistent spacing and typography + +**Color usage**: +- Background: `#15202b` (dark blue-gray) +- Text: `#e7e9ea` (light gray) +- Links: `#1d9bf0` (Twitter blue) +- Code blocks: `#192734` background, `#50fa7b` for highlights +- Callouts: Subtle border and background variation + +## File Organization + +``` +docs/ +├── index.html # Landing page +├── CLAUDE.md # This file (documentation guidance) +├── README.md # Deployment and viewing instructions +│ +├── parts/ # Main content sections +│ ├── 01-introduction.html +│ ├── 02-tweet-journey.html +│ ├── 03-user-journey.html +│ ├── 04-discourse-levers.html +│ ├── 05-what-this-means.html +│ ├── 06-conclusions.html +│ └── 07-appendix.html +│ +├── css/ +│ └── style.css # Dark theme styling +│ +├── js/ # Interactive widgets +│ ├── author-diversity-calculator.js +│ ├── engagement-weight-chart.js +│ ├── gravitational-pull-simulator.js +│ └── tweet-journey-flow.js +│ +└── assets/ # Data files, images + └── data/ + └── engagement-weights.json +``` + +## Common Tasks + +### Adding a New Interactive Visualization + +1. **Research the mechanism** in the codebase (see root `/CLAUDE.md`) +2. **Create concrete examples** with real numbers in `notes/` +3. **Design the interaction**: What should users be able to adjust? What do they see? +4. **Write the HTML structure** in the appropriate `parts/*.html` file +5. **Create the JavaScript** in `js/` with clear comments +6. **Test interactivity**: Does it help understanding? Is it intuitive? +7. **Add text explanation**: Describe what the visualization shows +8. 
**Include code reference**: Point to the actual implementation + +### Writing a New Section + +Follow the **Intuition → Feel → Proof** structure: + +**Step 1: Research & Prepare** +1. Review related notes in `notes/` for detailed findings +2. Locate the relevant code implementation +3. Extract exact formulas, parameters, and thresholds +4. Calculate concrete examples with real numbers + +**Step 2: Write Part 1 (Intuition)** +1. Start with plain language: "What is this mechanism?" +2. Explain why it exists: "What problem does it solve?" +3. Describe the shape: "How does it behave?" (exponential? linear? threshold-based?) +4. Make it relatable: Connect to user experience +5. Draft as if explaining to a non-technical friend + +**Step 3: Create Part 2 (Feel)** +1. Choose visualization type: calculator? graph? simulation? comparison chart? +2. Identify which parameters users should control +3. Decide what should be displayed: results? charts? comparisons? +4. Sketch the HTML structure with proper IDs and classes +5. Create the JavaScript (or note for separate task) +6. Add text description: "Use this to explore..." +7. Ensure it teaches, not just decorates + +**Step 4: Write Part 3 (Proof)** +1. Show the actual formula with exact parameters from code +2. Explain each variable: "Where decayFactor = 0.5..." +3. Walk through step-by-step calculation +4. Provide concrete example: "If an author posts 4 tweets..." +5. Show the consequences: "Effective loss: 40%..." +6. Add code reference: file path and line numbers +7. Cross-check accuracy against source code + +**Step 5: Review & Refine** +1. Check tone: Objective (Parts 1-5) or analytical (Part 6)? +2. Verify all code references are accurate +3. Test calculations manually +4. Run quality checklist +5. Test locally: Open in browser, check rendering +6. Verify links work correctly + +### Verifying Technical Accuracy + +Before publishing any claim: +1. **Check the source code** in the main repository +2. 
**Verify file paths and line numbers** are current +3. **Test calculations** with real numbers +4. **Cross-reference** with `notes/comprehensive-summary.md` +5. **Look for edge cases**: Are there exceptions or conditions? + +## Quality Checklist + +Before considering a section complete: + +**Content Structure (Intuition → Feel → Proof)**: +- [ ] **Part 1 (Intuition)**: Plain language explanation of what and why +- [ ] **Part 1 (Intuition)**: Describes the "shape" of the behavior +- [ ] **Part 2 (Feel)**: Interactive or visual element present +- [ ] **Part 2 (Feel)**: Visualization enhances understanding (not decorative) +- [ ] **Part 3 (Proof)**: Concrete formulas with actual parameters +- [ ] **Part 3 (Proof)**: Step-by-step calculations with real numbers +- [ ] **Part 3 (Proof)**: Code reference with file path and line numbers + +**Technical Accuracy**: +- [ ] Every claim has a code reference with file path and line numbers +- [ ] Formulas match the actual implementation in code +- [ ] Calculations are correct and use realistic examples +- [ ] Technical accuracy verified against source code +- [ ] Edge cases or conditions are mentioned if relevant + +**Accessibility & Style**: +- [ ] Language is accessible to non-technical readers +- [ ] Tone is objective (or appropriately analytical for Part 6) +- [ ] No jargon without explanation +- [ ] Active voice and short sentences + +**Technical Implementation**: +- [ ] Links to other sections work correctly +- [ ] Renders correctly on mobile and desktop +- [ ] No broken internal or external links +- [ ] Follows the established HTML/CSS patterns +- [ ] Interactive elements are keyboard accessible +- [ ] Text fallbacks exist for visualizations + +## Resources + +**For technical details**: See root `/CLAUDE.md` and `notes/comprehensive-summary.md` + +**For deployment**: See `docs/README.md` + +**For visual design**: See `docs/css/style.css` + +**Reference material**: +- Twitter's open-source repo: 
https://github.com/twitter/the-algorithm +- Engineering blog: https://blog.x.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm diff --git a/docs/DESIGN_STANDARDS.md b/docs/DESIGN_STANDARDS.md new file mode 100644 index 000000000..7afa36272 --- /dev/null +++ b/docs/DESIGN_STANDARDS.md @@ -0,0 +1,267 @@ +# Design Standards for Twitter Algorithm Documentation + +## Navigation Structure + +### Current State +We have files in: +- `index.html` (landing page) +- `parts/01-introduction.html` +- `parts/07-appendix.html` +- `interactive/` (6 interactive tools) + +### Standard Navigation + +**For index.html:** +```html + +``` + +**For parts/*.html:** +```html + +``` + +**For interactive/*.html:** +```html + +``` + +**Rationale:** Only show what actually exists. Keep it simple. + +--- + +## Code Reference Standard + +### Format + +All code references should be **clickable GitHub links** to the actual source code. + +**Base URL:** `https://github.com/twitter/the-algorithm/blob/main/` + +### HTML Pattern + +```html +
+ Code:
+
+ home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930
+
+
+ Code: home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930
+
+ Code:
+
+ home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930
+
+
RETREIVAL_SIGNALS.md
+
+```
+
+2. **External repo** (algorithm-ml):
+```html
+
+<a href="https://github.com/twitter/the-algorithm-ml/blob/main/projects/home/recap/README.md" target="_blank" rel="noopener">
+  the-algorithm-ml/projects/home/recap/README.md
+</a>
+
+```
+
+---
+
+## File Organization
+
+```
+docs/
+├── index.html # Landing page with overview
+├── DESIGN_STANDARDS.md # This file (design guidelines)
+├── README.md # How to view/deploy
+├── CLAUDE.md # AI assistant guidelines
+│
+├── parts/ # Main content sections
+│ ├── 01-introduction.html
+│ └── 07-appendix.html
+│
+├── interactive/ # Interactive visualizations
+│ ├── cluster-explorer.html
+│ ├── engagement-calculator.html
+│ ├── invisible-filter.html
+│ ├── journey-simulator.html
+│ ├── pipeline-explorer.html
+│ └── reinforcement-loop.html
+│
+├── js/ # JavaScript for interactives
+│ ├── cluster-explorer.js
+│ ├── engagement-calculator.js
+│ ├── invisible-filter.js
+│ ├── journey-simulator.js
+│ ├── pipeline-explorer.js
+│ └── reinforcement-loop.js
+│
+├── css/
+│ └── style.css # Global styles
+│
+└── assets/ # Static assets (if needed)
+ └── data/ # JSON data files
+```
+
+---
+
+## Migration Plan
+
+### Step 1: Update Navigation (All Files)
+- Remove links to non-existent parts (02-06)
+- Update navigation HTML in:
+ - `index.html`
+ - `parts/01-introduction.html`
+ - `parts/07-appendix.html`
+ - All `interactive/*.html` files
+
+### Step 2: Add Code Reference Styling
+- Add CSS to `css/style.css`
+
+### Step 3: Convert Code References
+- Find all `.code-ref` instances across all HTML files
+- Convert plain text paths to clickable GitHub links
+- Use the standard format above
+
+### Step 4: Test
+- Open each page and verify:
+ - Navigation links work
+ - Code reference links go to correct GitHub locations
+ - No broken links
+
+---
+
+## Checklist for New Pages
+
+When creating a new page, ensure:
+
+- [ ] Navigation uses the standard format (only existing pages)
+- [ ] All code references are clickable GitHub links
+- [ ] Footer includes page navigation (Previous/Next if applicable)
+- [ ] Page uses `css/style.css`
+- [ ] Title follows pattern: "Page Title - How Twitter's Algorithm Really Works"
+- [ ] Meta description included
+- [ ] Links to related interactives where relevant
+
+---
+
+## Notes
+
+- **Why GitHub links?** Users can verify claims by reading the actual code
+- **Why simplified nav?** Don't promise what doesn't exist yet
+- **target="_blank"** and **rel="noopener"**: Security best practice for external links
+- **Line number format**: GitHub uses `#L123` for single line, `#L123-L456` for ranges
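+The path-to-link rule above is mechanical enough to script. A minimal Python sketch (the helper name `to_github_url` is illustrative, not from the repo):

```python
BASE_URL = "https://github.com/twitter/the-algorithm/blob/main/"

def to_github_url(ref: str) -> str:
    """Turn 'path/file.scala:123-456' (or ':123', or no line info) into a GitHub link."""
    path, _, lines = ref.partition(":")
    if not lines:
        return BASE_URL + path
    start, _, end = lines.partition("-")
    # GitHub uses #L123 for a single line, #L123-L456 for a range
    anchor = f"#L{start}" + (f"-L{end}" if end else "")
    return BASE_URL + path + anchor

print(to_github_url("home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930"))
# → ends with "HomeGlobalParams.scala#L788-L930"
```

+The same logic belongs in `convert-code-refs.py`, so manual edits and the batch converter stay consistent.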
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 000000000..dba38461e
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,157 @@
+# How Twitter's Algorithm Really Works
+
+A code-based investigation of X's (Twitter's) open-source recommendation algorithm.
+
+**Live Site**: [View the investigation →](https://ernests.github.io/the-algorithm/)
+
+---
+
+## What This Is
+
+In March 2023, X (formerly Twitter) open-sourced their recommendation algorithm. We analyzed the implementation—reading thousands of lines of Scala, examining ML model weights, and tracing data flows through the pipeline.
+
+This site presents our findings through:
+- **Verified claims** backed by specific code references (file paths + line numbers)
+- **Interactive explorations** to experience algorithmic mechanics hands-on
+- **Concrete examples** with real calculations showing how the system works
+
+---
+
+## Top Findings
+
+### 🤯 The Favorites Paradox
+Likes have the **lowest** positive weight (0.5) while replies have 27x more value (13.5). Reply with author engagement: 75.0 weight (150x more valuable than a like).
+
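+The weight gap is easiest to see in a weighted-sum score. A toy Python sketch using the weights above — the engagement probabilities are invented for illustration and are not from the model:

```python
# Weighted-sum scoring: score = sum(predicted probability × weight).
# Weights are from the finding above; the probabilities are made up.
WEIGHTS = {"favorite": 0.5, "reply": 13.5, "reply_engaged_by_author": 75.0}

def score(probs: dict) -> float:
    return sum(probs[k] * WEIGHTS[k] for k in probs)

likable  = {"favorite": 0.20, "reply": 0.01, "reply_engaged_by_author": 0.001}
divisive = {"favorite": 0.02, "reply": 0.10, "reply_engaged_by_author": 0.01}

# The tweet that provokes replies outscores the one that collects likes,
# even though the likable tweet has 10x the favorite probability.
print(score(likable))   # ≈ 0.31
print(score(divisive))  # ≈ 2.11
```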
+### 🔥 Conflict is Emergent, Not Intentional
+The algorithm cannot distinguish agreement from disagreement—all replies get the same weight regardless of sentiment. Conflict amplification is a design limitation, not malicious intent.
+
+### 📊 Multiplicative Scoring = Mathematical Echo Chambers
+Tweet scores use multiplication (`score = baseScore × clusterInterest`), not addition. Any imbalance compounds over time through reinforcement loops.
+
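+The drift can be illustrated with a toy loop. This update rule is a deliberate simplification for intuition, not the production code: exposure tracks interest because scoring is multiplicative, and engagement feeds back into interest.

```python
# Toy reinforcement loop: multiplicative scoring means exposure is
# proportional to interest, and engagement concentrates it further.
def drift(interest: float, weeks: int, feedback: float = 0.25) -> float:
    """Share of feed attention on topic A after `weeks` of feedback."""
    for _ in range(weeks):
        # You engage roughly in proportion to (interest × exposure);
        # with exposure tracking interest, that is interest squared.
        engaged = interest**2 / (interest**2 + (1 - interest)**2)
        # Interest shifts a little toward this week's engagement mix.
        interest += feedback * (engaged - interest)
    return interest

# A perfectly balanced 50/50 split is a fixed point; any imbalance grows.
print(drift(0.5, 26))  # stays at 0.5
print(drift(0.6, 26))  # drifts well above 0.6
```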
+### 👑 Verified Accounts Get 100x Multiplier
+Verification provides a massive algorithmic advantage. Combined with other structural benefits, large verified accounts have a **348:1 reach advantage** over small accounts posting identical content.
+
+### ☢️ "Not Interested" is Nuclear
+One click triggers 0.2x multiplier (80% penalty) with 140-day linear recovery. Removes an author from your feed for ~5 months.
+
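+The recovery curve is easy to compute. A minimal Python sketch, assuming a straight-line interpolation from the 0.2x multiplier back to 1.0x over 140 days (the interpolation shape is our reading of "linear recovery"):

```python
def not_interested_multiplier(days_since_click: float) -> float:
    """Score multiplier for an author after a 'not interested' click.
    Starts at 0.2 (80% penalty) and recovers linearly to 1.0 over 140 days."""
    if days_since_click >= 140:
        return 1.0
    return 0.2 + 0.8 * (days_since_click / 140)

print(not_interested_multiplier(0))    # 0.2 — day of the click
print(not_interested_multiplier(70))   # ~0.6 — halfway through recovery
print(not_interested_multiplier(140))  # 1.0 — fully recovered after ~5 months
```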
+**[See all findings →](https://ernests.github.io/the-algorithm/#findings)**
+
+---
+
+## Interactive Explorations
+
+Experience how the algorithm works through 9 interactive demos:
+
+### Understanding The Pipeline
+- **Pipeline Explorer** - Follow a tweet through all 5 algorithmic stages
+- **Engagement Calculator** - Calculate tweet scores with real weights
+
+### Understanding Your Algorithmic Identity
+- **Cluster Explorer** - Discover which of ~145,000 communities you belong to
+- **Algorithmic Identity Builder** - See your dual profiles (consumer vs creator)
+
+### Understanding Filter Bubbles & Echo Chambers
+- **Journey Simulator** - Model how interests drift over time
+- **Invisible Filter Demo** - See how personalization creates different realities
+- **Reinforcement Loop Visualizer** - Watch feedback loops compound week by week
+
+### Understanding Structural Advantages
+- **Algorithmic Aristocracy** - Explore how follower count creates different rules
+
+### Understanding Next-Generation Systems
+- **Phoenix: Behavioral Prediction** - X's transformer-based system (likely in active A/B testing) that models 522 of your recent actions to predict what you'll do next
+
+**[Explore all interactive demos →](https://ernests.github.io/the-algorithm/#interactives)**
+
+---
+
+## Our Approach
+
+**Objective Evidence**: Every claim backed by:
+- File path (exact location in codebase)
+- Line numbers (specific implementation)
+- Code snippets (what it actually does)
+- Explanation (how the mechanism works)
+- Consequences (what it means for users and creators)
+
+**Verifiable**: The algorithm is [open source](https://github.com/twitter/the-algorithm). You can check our work.
+
+**Interactive**: We built simulators and calculators so you can experience the mechanics hands-on, not just read about them.
+
+---
+
+## Who This Is For
+
+- **Users** wondering why their feed looks the way it does
+- **Creators** optimizing for reach and engagement
+- **Researchers** studying recommendation algorithms and their societal effects
+- **Policy makers** understanding algorithmic amplification
+- **Anyone curious** about how algorithmic systems shape online discourse
+
+---
+
+## Technology
+
+- **Plain HTML/CSS/JavaScript** - No build step, no dependencies, fast loading
+- **Interactive visualizations** - Chart.js for graphs, vanilla JS for simulators
+- **Responsive design** - Works on desktop and mobile
+- **Accessible** - Semantic HTML, keyboard navigation, text alternatives
+
+---
+
+## About This Investigation
+
+This analysis was conducted by reading X's open-source algorithm code (released March 2023). All findings are based on the actual implementation, not speculation or reverse engineering.
+
+**Repository**: [github.com/twitter/the-algorithm](https://github.com/twitter/the-algorithm)
+
+**Methodology**: We read thousands of lines of Scala, traced data flows through pipelines, examined ML model configurations, and documented every mechanism with file paths and line numbers.
+
+**Last Updated**: November 2025
+
+---
+
+## Structure
+
+```
+docs/
+├── index.html # Landing page with key findings
+├── interactive/ # 9 interactive explorations
+│ ├── pipeline-explorer.html
+│ ├── engagement-calculator.html
+│ ├── cluster-explorer.html
+│ ├── algorithmic-identity.html
+│ ├── journey-simulator.html
+│ ├── invisible-filter.html
+│ ├── reinforcement-loop.html
+│ ├── algorithmic-aristocracy.html
+│ └── phoenix-sequence-prediction.html
+├── parts/
+│ └── reference.html # Code reference documentation
+├── css/
+│ └── style.css # Clean, readable styling
+├── js/ # Interactive widget scripts
+└── assets/ # Images and data files
+```
+
+---
+
+## Contributing
+
+Found an error or have a correction? [Open an issue](https://github.com/twitter/the-algorithm/issues) or submit a pull request.
+
+All claims should be backed by specific code references with file paths and line numbers.
+
+---
+
+## License
+
+This documentation is provided for educational and research purposes. The analyzed algorithm code is owned by X Corp.
+
+---
+
+## Questions?
+
+Open an issue on [GitHub](https://github.com/twitter/the-algorithm/issues) or explore the interactive demos to understand how the algorithm works.
+
+**Key Insight**: The algorithm is not neutral. It is designed for engagement, not for truth, diversity, or societal health. Understanding how it works is the first step to using it consciously rather than being shaped by it.
diff --git a/docs/convert-code-refs.py b/docs/convert-code-refs.py
new file mode 100644
index 000000000..7531f0eb5
--- /dev/null
+++ b/docs/convert-code-refs.py
@@ -0,0 +1,144 @@
+#!/usr/bin/env python3
+"""
+Convert plain code references to clickable GitHub links.
+
+Usage:
+    python3 convert-code-refs.py <file.html> ...
+
+Converts:
+    <strong>Label:</strong> <code>path/to/file.scala:123-456</code>
+into:
+    <strong>Code:</strong> a clickable <a> link wrapping <code>path</code>
+"""
+
+import re
+
+
+def convert_code_ref(match):
+    """Wrap the <code>...</code> portion of a code-ref block in a GitHub link."""
+    full_match = match.group(0)
+
+    # Find the <code> tag
+    code_start = full_match.find('<code>')
+    code_end = full_match.find('</code>') + len('</code>')
+
+    before = full_match[:code_start]
+    after = full_match[code_end:]
+
+    # github_url / display_name are derived from the path inside <code>
+    linked_code = f'<a href="{github_url}" target="_blank" rel="noopener"><code>{display_name}</code></a>'
+
+    return before + linked_code + after
+
+
+def process_file(filepath):
+    """Process a single HTML file."""
+    with open(filepath, 'r', encoding='utf-8') as f:
+        content = f.read()
+
+    # Pattern to match code-ref blocks
+    # Match from <div class="code-ref"> to the closing </div>
+    pattern = r'<div class="code-ref"[^>]*>.*?</div>'
' + + # Count matches before + matches_before = len(re.findall(pattern, content)) + + # Convert all code-refs + new_content = re.sub(pattern, convert_code_ref, content, flags=re.DOTALL) + + # Count links after + matches_after = len(re.findall(r']*>.*? [ We read X's open-source algorithm. Here's what the code actually says. In March 2023, X (then Twitter) open-sourced their recommendation algorithm. We analyzed the implementation—reading thousands of lines of Scala, examining ML model weights, and tracing data flows through the pipeline. Every claim below is backed by specific code references you can verify yourself. Our approach: No speculation. No reverse engineering. Just reading the actual implementation and documenting what it does. The Finding: Favorites (likes) have the lowest positive weight despite being the most visible metric. Why It Matters: The most tracked engagement type has the least algorithmic value. Tweets that drive conversation (replies) vastly outrank tweets that drive agreement (likes). This means controversial content systematically outranks agreeable content. Consequence: One reply with author engagement = 150 likes in algorithmic value. ⚠️ Weight Values: The open-source code defines weights as configurable parameters (FSBoundedParam with default = 0.0). The actual production weight values shown here come from X's ML training repository and represent their documented configuration.
+ Parameter definitions: Explore: Engagement Calculator → The Finding: The algorithm cannot distinguish between: All replies get the same weight (13.5) regardless of sentiment or tone. Why It Matters: Conflict gets amplified not because X wants it, but because the algorithm is blind to it. This is a design limitation, not malicious intent. The code has no sentiment analysis, no toxicity detection in the scoring model—just raw engagement counts. Consequence: Controversial takes (50 angry replies) score higher than helpful explanations (50 supportive replies) despite identical engagement volume.
+ Code: No sentiment analysis anywhere in ML scoring pipeline ( Explore: Engagement Calculator → The Finding: Tweet scores use multiplication, not addition: This single design choice means any imbalance compounds over time. Why It Matters: Echo chambers aren't a bug—they're mathematical inevitability. If you're 60% interested in AI and 40% in cooking, multiplicative scoring gives AI tweets a permanent 50% scoring advantage. This advantage means you see more AI → engage more with AI → interest increases → advantage grows. Example: Consequence: The algorithm concentrates your interests over time through reinforcement loops. Balanced interests are unstable—any imbalance (even 51/49) drifts toward concentration.
+ Code: Explore: Invisible Filter → | Reinforcement Loop → The Finding: Verified status = 100x reputation boost in the TweepcredGraph system. Why It Matters: This isn't just a badge—it's a massive algorithmic advantage. Combined with other structural benefits (follower count, PageRank, follow ratio scoring), large verified accounts have a 348:1 reach advantage over small accounts posting identical content. Consequence: The platform has algorithmic aristocracy built into its architecture. Not all accounts are treated equally—some start with order-of-magnitude advantages.
+ Code: Explore: Algorithmic Aristocracy → The Finding: A single "not interested" click triggers: Why It Matters: This is your most powerful tool as a user. One click removes an author from your feed for 5 months. This is personal to you and doesn't aggregate globally. Note: Content that the ML model predicts will receive negative feedback (based on historical patterns) gets penalized globally with -74.0 weight. Reports have an even more severe predicted penalty (-369.0 weight).
+ Code: Explore: Engagement Calculator → Your InterestedIn profile (which clusters you belong to) updates every 7 days via batch job, not real-time. Your feed is stuck with a week-old view of your interests. Code: Tweets from people you don't follow get a 0.75x multiplier (25% score reduction). Breaking out of your network requires 33% more engagement to compete. Code: Multiple tweets from same author get exponential penalty: 1st tweet (100%), 2nd tweet (62.5%), 3rd tweet (43.75%). Posting more = severe diminishing returns. Code: Tweets need ≥16 engagements to get TwHIN embeddings (used for recommendations). Below threshold = zero-out. Small accounts locked out of this candidate source. Code: ~145,000 total clusters exist, but you're only assigned to 10-20 of them (default). Highly sparse representation of your interests. Code: Experience how the algorithm works through interactive demos. Each one lets you experiment with the actual mechanics found in the code. Follow a tweet through all 5 algorithmic stages from posting to your timeline. See exactly what happens at each stage with real score calculations, filters, and penalties. Calculate tweet scores yourself. Adjust engagement probabilities and see how replies (13.5 weight) vastly outweigh likes (0.5 weight). Discover which of ~145,000 algorithmic communities you belong to based on your follows and engagement. See how X groups users into interest clusters. Understand your dual algorithmic profiles: InterestedIn (consumer) vs Producer Embeddings (creator). See how clusters + engagement create your personalized identity (updates weekly!). Model how your interests might drift over time based on multiplicative scoring mechanics. See the reinforcement loop in action. See how you and a friend see completely different rankings for the same tweets. Multiplicative scoring creates personalized realities. Step through the feedback loop week by week. 
Watch how seeing more AI content → engaging more → seeing even more creates drift through multiplicative mechanics. Explore how follower count creates different algorithmic rules. Verified accounts get 100x multipliers; small accounts hit hard barriers. See the four mechanisms that multiply together for a 348:1 advantage. X's next-generation transformer-based system models your last 522 actions (hours to days of behavior) to predict what you'll do next. Evidence suggests active A/B testing: 9-cluster deployment with parallel evaluation, progressive rollout infrastructure, and hybrid mode for incremental migration. This represents a paradigm shift from static features to behavioral sequences. Objective Evidence: Every claim backed by: Verifiable: The algorithm is open source. You can check our work. Interactive: We built simulators and calculators so you can experience the mechanics hands-on, not just read about them. This analysis was conducted by reading X's open-source algorithm code (released March 2023). All findings are based on the actual implementation, not speculation or reverse engineering. Repository: github.com/twitter/the-algorithm Methodology: We read thousands of lines of Scala, traced data flows through pipelines, examined ML model configurations, and documented every mechanism with file paths and line numbers. Our research notes contain detailed analysis with step-by-step code walkthroughs. Last Updated: November 2025 Key Insight: The algorithm is not neutral. It is designed for engagement, not for truth, diversity, or societal health. Understanding how it works is the first step to using it consciously rather than being shaped by it. How follower count creates different algorithmic treatment through four mechanisms that multiply together Not all Twitter accounts are treated equally by the algorithm. 
Two accounts posting identical content with identical quality can experience order-of-magnitude differences in reach—not because of what they say, but because of structural characteristics like follower count, verification status, and follow ratio. The algorithm uses multiplication to combine structural advantages. This single design choice creates exponential scaling where small accounts face compounding disadvantages and large accounts receive compounding benefits. If the algorithm used addition (linear): But the algorithm uses multiplication (exponential): The result: A small account (500 followers, unverified) and a large account (50,000 followers, verified) posting identical tweets can see a 348:1 difference in reach purely from structural advantages, not content quality. Four distinct algorithmic mechanisms create advantages that multiply together: Verified accounts get 100× boost to TweepCred (reputation score) Available via Twitter Blue ($8/month) Accounts must get ≥16 engagements to access advanced ML features Small accounts often can't cross this threshold Exponential penalty when following > followers Ratio of 2:1 (following 2× followers) = 1,097× penalty Your follower count determines out-of-network reach potential 500 followers × 0.75 = 375 potential vs 50K × 0.75 = 37,500 potential This analysis is evidence-based: Every mechanism documented with file paths, line numbers, and formulas from Twitter's open-source algorithm. These mechanisms are documented in Twitter's open-source code. Each includes file paths, line numbers, formulas, and concrete examples for verification. Key Pattern: These four mechanisms multiply together, not add. This multiplication creates exponential scaling where advantages compound. Code reference: Mechanism: Verified accounts receive a 100x multiplier on their TweepCred (reputation score). 
Type: Hardcoded constant (requires code deployment to change) Effect calculation: Availability: Twitter Blue ($8/month) or legacy verification status Code reference: Mechanism: Tweets with fewer than 16 engagements receive zero embeddings, excluding them from TwHIN candidate generation and features. Type: Hardcoded constant Effect: Differential impact: Code reference: Mechanism: Accounts following >500 users with a following/followers ratio >0.6 receive an exponential penalty on TweepCred. Type: Hardcoded formula, no maximum cap Penalty table: Observation: Large accounts typically have more followers than following (ratio <0.6), avoiding this penalty entirely. Code reference: Mechanism: Out-of-network tweets receive a 0.75x multiplier on their score (25% reduction). Type: FSBoundedParam (configurable without deployment, range: 0.0-1.0) Differential impact: Two accounts post identical tweets with identical quality. Different structural characteristics produce different reach. Observation: Identical content, 348x difference in reach due to structural characteristics. How mechanisms apply at different follower counts: Observation: Reach multiplier grows faster than follower count (non-linear scaling). Which parameters Twitter can adjust: Observation: Most mechanisms are hardcoded architectural decisions. Only the out-of-network penalty is configurable. All mechanisms documented here can be verified in Twitter's open-source algorithm: Methodology note: This analysis went through multiple corrections. Initial interpretations of the SLOP filter (incorrectly read as a minimum-follower gate) and the follow-ratio penalty (incorrectly assumed to be capped) were revised after careful code reading. All findings presented here have been verified against the actual implementation. Every user has two algorithmic profiles: one determines what you see, the other determines who sees you. 
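The addition-vs-multiplication contrast above can be made concrete. The per-mechanism multipliers below are illustrative assumptions (the document does not enumerate the exact factor contributed by each mechanism), chosen so their product matches the 348:1 figure cited in the text:

```python
import math

# Illustrative only: hypothetical per-mechanism advantages of a large,
# verified account relative to a small one. The individual values are
# assumptions; the point is how addition and multiplication differ.
advantages = {
    "tweepcred_verification": 4.0,
    "engagement_threshold_access": 2.9,
    "follow_ratio_exemption": 3.0,
    "network_size_reach": 10.0,
}

additive = 1.0 + sum(a - 1.0 for a in advantages.values())  # linear stacking
multiplicative = math.prod(advantages.values())             # compounding

print(f"additive combination:       {additive:.1f}x")
print(f"multiplicative combination: {multiplicative:.0f}x")
```

Under addition, the same four advantages would stack to a modest overall edge; under multiplication they compound to hundreds of times the reach.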
They're calculated independently and can drift in completely different directions—and you're not fully in control of either. Twitter's recommendation system is built on three interconnected layers. Understanding how they work together—and how they can diverge—is key to understanding why your feed behaves the way it does. KnownFor is the foundation layer that identifies what the top 20 million most popular accounts on X are "known for"—which clusters they belong to. This is the algorithmic "map" of who creates what kind of content. It's slow to change (weekly updates) because it represents the relatively stable structure of content communities on the platform. Key insight: KnownFor only covers the top 20M accounts. If you're not in the top 20M, you're not in this foundation layer—but you still get a producer profile (see Layer 3). InterestedIn is your consumer profile—it determines what content the algorithm shows you. If KnownFor is about what people create and are "known for," InterestedIn is what you are "interested in"—or more precisely, what content the algorithm predicts you'll engage with. The 100-Day Half-Life: Here's the catch—your old engagement doesn't disappear. The algorithm uses exponential decay to weight your historical engagement: What this means in practice: If you engaged heavily with AI content for a week back in January, that engagement will continue influencing your recommendations through July. Your InterestedIn profile is like a slow-moving ship—it takes sustained effort to change course. Relationship type: ONE-TO-MANY. You (one person) engage with many producers. You choose what to engage with (moderate agency). Every user has a second algorithmic representation: your producer profile. This is calculated by Producer Embeddings and determines who sees YOUR content. The 100-Follower Threshold: Below 100 followers, you have no algorithmic boost for exposure. At 99 followers, you're invisible to the recommendation system. 
At 101 followers, your content starts being calculated and shown to users beyond your immediate followers. Relationship type: MANY-TO-ONE. Many consumers engage with you (one producer). You DON'T choose who engages with you (low agency). Your InterestedIn (consumer profile) and Producer Embeddings (producer profile) are calculated completely independently. They can—and often do—follow completely different paths: The bottom line: Neither profile stays where you want it. Your consumer profile drifts based on what the algorithm shows you and what you engage with. Your producer profile drifts based on who happens to engage with you. Both are moving targets shaped by forces beyond your full control. Your engagement doesn't have an expiration date—it decays exponentially with a 100-day half-life: Your InterestedIn consumer profile is calculated from YOUR engagement with many producers: Your Producer Embeddings producer profile is calculated from OTHERS' engagement with you: Producer Embeddings only exist for accounts with ≥100 followers: When a cluster in your InterestedIn drops below 0.072 (7.2%), it gets filtered out completely: One viral tweet can completely reshape your Producer Embedding: Changing your algorithmic identity is slow: InterestedIn calculation: Producer Embeddings calculation: 100-day half-life decay: 100-follower threshold: Weekly batch updates: KnownFor weekly updates: L2 normalization: 0.072 threshold filtering: You have two algorithmic identities on X: These profiles are calculated independently. They can diverge. And you're not fully in control of where either ends up. The result: Most users drift into algorithmic states they didn't consciously choose—consuming content that reinforces one cluster, producing for audiences that don't match their interests, or both. The architecture creates paths of least resistance, and users follow them without realizing it's happening. 
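The 0.072 (7.2%) cutoff described above can be sketched as a filter-and-renormalize step. For simplicity this sketch uses sum-to-one normalization; the real pipeline also applies L2 normalization, which is not reproduced here:

```python
# Sketch of the InterestedIn threshold: clusters whose share of your profile
# falls below 0.072 are dropped entirely, then the rest is renormalized.
# Cluster names and raw weights are illustrative.
THRESHOLD = 0.072

def filter_interested_in(clusters: dict[str, float]) -> dict[str, float]:
    total = sum(clusters.values())
    shares = {name: w / total for name, w in clusters.items()}
    kept = {name: s for name, s in shares.items() if s >= THRESHOLD}
    total_kept = sum(kept.values())
    return {name: s / total_kept for name, s in kept.items()}

profile = {"ai_ml": 0.62, "cooking": 0.31, "photography": 0.07}
print(filter_interested_in(profile))  # photography (7%) falls below the cutoff
```

Note the knock-on effect: once a minority interest is filtered out, the surviving clusters absorb its share, making the remaining profile even more concentrated.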
Twitter's algorithm doesn't see you as "interested in AI" or "interested in cooking." Instead, it assigns you to invisible communities called clusters—discovered by analyzing the follow graph of 400+ million users. The scale: There are approximately 145,000 clusters discovered from the top 20 million most-followed accounts. Each cluster represents a community with similar interests, discovered organically from who follows whom. Why clusters matter: Your cluster membership determines what appears in your For You feed. If you're 70% assigned to the "AI/ML Research" cluster and 30% to "Cooking," your feed will reflect that split. The algorithm shows you content from producers (accounts) in your clusters. Clusters aren't manually defined—they emerge from the data using Sparse Binary Matrix Factorization (SBF) with Metropolis-Hastings optimization: The Critical Insight: Your clusters come from ENGAGEMENT (likes, replies, retweets), NOT just follows. Engagement has a 100-day half-life, which means: This makes your clusters "sticky"—they change slowly over months, not days. Following diverse accounts isn't enough; you must engage with diverse content. For new accounts: Your follows determine your initial clusters. If you follow 5 AI researchers and 2 chefs, you'll start ~70% AI cluster, ~30% cooking cluster. Over time (weeks to months): Your engagement history dominates. If you like 50 AI tweets and 10 cooking tweets in 100 days, your clusters will drift toward AI regardless of who you follow. Long-term steady state: Your clusters reflect the last 100-200 days of engagement with exponential decay. Past behavior has momentum—changing clusters requires sustained engagement pattern changes over 3-6 months. Use this calculator to see how the algorithm would categorize you based on your follows and engagement patterns. Notice how engagement weights dominate over follows. 
Your cluster membership (InterestedIn) is calculated through matrix multiplication: Let's say you follow and engage with these producers over 100 days: Engagement decay follows exponential decay with 100-day half-life: Why this matters: Your engagement from 6 months ago (180 days) still has 29% weight. Your clusters have momentum—they resist change. Diversifying your feed requires sustained engagement pattern changes over 3-6 months, not just following different accounts. Cluster data updates at different cadences: Implication: Your clusters lag behind your behavior. It takes up to 1 week for new engagement to affect InterestedIn, and up to 1 week for follow graph changes to affect which clusters exist (KnownFor). Both update on the same weekly schedule. The ~145,000 clusters are discovered using Sparse Binary Matrix Factorization (SBF) with Metropolis-Hastings optimization: Why sparsity matters: The "one cluster per producer" constraint is what creates echo chambers by design. Producers can't belong to multiple clusters, which enforces clear community boundaries and limits cross-cluster discovery. This number is emergent from the follow graph structure, not directly tuned: Cluster formation (SBF/Metropolis-Hastings): KnownFor generation (production job): Note on Louvain: InterestedIn calculation: 100-day half-life decay: L2 normalization: Update frequencies (verified from Twitter Engineering Blog): For users trying to diversify: For creators trying to reach audiences: Why echo chambers emerge: Not all engagement is equal. When you like, reply, or retweet a tweet, the algorithm assigns each action a different value. These values—called engagement weights—determine how the algorithm scores every tweet in your feed. How it works: The Heavy Ranker (Twitter's scoring model) predicts 15 different types of engagement for every tweet. Each prediction gets multiplied by its weight, and the sum becomes the tweet's score. Higher score = higher ranking in your feed. 
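The 100-day half-life described above reduces to a one-line function, weight = 0.5^(age / 100); the 180-day example works out to roughly 29%:

```python
def engagement_weight(age_days: float, half_life_days: float = 100.0) -> float:
    """Exponential decay with a 100-day half-life, as described above."""
    return 0.5 ** (age_days / half_life_days)

for age in (0, 50, 100, 180, 365):
    print(f"{age:>3} days old -> {engagement_weight(age):.0%} weight")
```

This is why clusters have momentum: even a year-old engagement retains about 8% of its original weight, so changing course requires sustained new engagement, not a one-off burst.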
Why weights matter: These weights reveal what Twitter optimizes for. They're not just scoring mechanisms—they shape: Twitter massively prioritizes conversation (replies, especially with author engagement) over passive consumption (likes, video views): The math: One reply with author engagement (75.0) equals 150 likes (75.0 ÷ 0.5). This isn't a bug—it's the business model. Why? Conversation drives time on platform: High-value actions (10.0 to 75.0): Deep engagement requiring effort—replies, profile exploration, meaningful clicks. These signal genuine interest. Low-value actions (0.5 to 1.0): Passive engagement—likes, retweets, basic viewing. Easy to give, minimal signal. Negative actions (-74.0 to -369.0): Explicit dislike or policy violations. Catastrophic penalties that last weeks to months. The Engagement Weights Paradox: Favorites (likes) are the most visible metric, tracked by all systems, and the easiest engagement. Yet they have the lowest positive weight (0.5). The algorithm doesn't care what's "popular" by likes—it cares what generates conversation. Use this calculator to see how different engagement patterns score. Notice how high-weighted engagement (conversation) massively outweighs passive engagement (likes) even when volume is much lower. These weight values come from X's ML training repository. The algorithm multiplies predicted probabilities by these weights. ⚠️ Important Context on Weight Values:
+ Parameter definitions:
+ Compare how different tweet types score. Each scenario represents realistic engagement probabilities. Notice how conversation-driven tweets can outscore viral content.
+ These weights don't just rank tweets—they train the algorithm what YOU care about. The weekly feedback loop (FavBasedUserInterestedIn - DEFAULT):
+ Want to see this in action? Code: InterestedIn uses engagement history: Every tweet's base score is calculated as a weighted sum of predicted engagement probabilities: Let's score a tweet with realistic predictions: Key observation: The 1% reply-with-author-engagement probability (0.75 contribution) contributes more than the 20% favorite probability (0.10 contribution). This is by design. From Twitter's official documentation: Translation: Every weight is defined as an Where the actual weight values come from: What this means: Note: Bookmark, Share, Dwell, Video Quality View, and Video Extended weights are configurable but not disclosed in March 2023 snapshot Equivalent to -148 favorites of negative value Triggered by: "Not interested" click, "See less often" click, muting author Duration: 140-day linear decay Impact: A single "not interested" click suppresses content from that author for 5 months. Equivalent to -738 favorites of negative value Triggered by: Explicit report for spam, harassment, misinformation, or other policy violations To overcome ONE report, a tweet would need: In practice: Impossible to overcome. Reports are a platform-level harm signal. Content that gets reported is effectively dead. Engagement type definitions: Weight configuration: Scoring implementation: Heavy Ranker (MaskNet) details: InterestedIn uses engagement: The Heavy Ranker predicts engagement probability, not: The algorithm optimizes for engagement, not for truth, quality, or societal value. This is a business decision, not a technical limitation. For users trying to shape their feed: For creators trying to maximize reach: The conversation advantage: You and your friend see the same tweets, but in completely different orders. This is how clusters create filter bubbles. Twitter doesn't show everyone the same feed in the same order. Even if you and your friend follow similar accounts and see identical tweets, those tweets will rank completely differently for each of you. 
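The worked example above (a 1% reply-with-author-engagement probability out-contributing a 20% favorite probability) is just a weighted sum. A minimal sketch using a subset of the weights quoted in this section:

```python
# Minimal weighted-sum scorer. Real scoring sums ~15 predicted engagement
# heads; only a few of the weights quoted in this section are shown.
WEIGHTS = {
    "favorite": 0.5,
    "reply": 13.5,
    "reply_engaged_by_author": 75.0,
    "negative_feedback": -74.0,
    "report": -369.0,
}

def score(predictions: dict[str, float]) -> float:
    """Sum of predicted probability x weight over the supplied heads."""
    return sum(WEIGHTS[head] * p for head, p in predictions.items())

tweet = {"favorite": 0.20, "reply_engaged_by_author": 0.01}
print(score(tweet))  # 0.10 from favorites + 0.75 from the reply head = 0.85
```

Swapping in even a tiny probability of a report (weight -369.0) drives the score sharply negative, which is why reported content is effectively dead.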
This isn't random—it's driven by an invisible mechanism called cluster-based personalization. Twitter assigns every user and every tweet to invisible communities called "clusters" (there are ~145,000 of them). Think of clusters as interest groups: "AI/Tech enthusiasts," "Cooking fans," "Political junkies," etc. When scoring tweets for your feed, the algorithm multiplies each tweet's base quality score by your cluster interest. The result: The exact same tweet with the exact same base quality will score completely differently for you vs your friend based on which clusters you belong to and how strongly. This mechanism creates different realities on the same platform: Cluster filtering happens through multiplication, not addition. If you're 60% interested in AI/Tech and 15% interested in Politics: Consequence: Your existing interests get amplified, minority interests get suppressed, and you drift toward increasingly concentrated feeds over time. New to clusters? See the Cluster Explorer to understand where these communities come from and how they're based on who you follow. See how cluster interests create completely different feeds for different people. Adjust YOUR cluster interests and compare against a friend's profile—the same 15 tweets will rank in completely different orders.
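The same-tweets, different-order effect described above reduces to one multiplication per tweet. A minimal sketch with illustrative base scores and cluster interests:

```python
# Same tweets, same base scores -- ranked differently per user because each
# base score is multiplied by that user's cluster interest. Values illustrative.
tweets = [
    ("AI paper thread", "ai_tech", 10.0),
    ("Election analysis", "politics", 10.0),
]
you = {"ai_tech": 0.60, "politics": 0.15}
friend = {"ai_tech": 0.15, "politics": 0.60}

def rank(feed, interests):
    scored = [(title, base * interests[cluster]) for title, cluster, base in feed]
    return sorted(scored, key=lambda t: t[1], reverse=True)

print(rank(tweets, you))     # AI thread first for you (6.0 vs 1.5)
print(rank(tweets, friend))  # Election analysis first for your friend
```

Identical inputs, mirrored orderings: the 4:1 score gap in each direction comes entirely from the cluster-interest multiplier, not from the tweets themselves.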
+ Adjust your cluster interests to see how they affect your feed ranking.
+
+ Total: 100%
+
+ Select a friend's profile to compare against yours.
+ Each tweet's final score is calculated as: Cluster-based personalization doesn't just happen once—it happens at multiple stages, compounding the effect. Before any engagement scoring happens, the algorithm fetches candidates based on YOUR clusters: Each tweet gets multiplied by your cluster interest: After cluster multiplication, engagement weights are applied. But cluster filtering already determined which tweets you see! Cluster scoring happens at MULTIPLE stages: This is why 60/40 becomes 76/24 in the Journey Simulator: the multiplicative effect compounds at multiple stages. Cluster assignment (InterestedIn): Multiplicative scoring: Cluster count: ~145,000 clusters total (most users assigned to 10-20) L2 normalization: Experience how your feed will drift over time, even if you don't change anything. Your interests won't stay balanced. Even if you start with 60% AI and 40% Cooking, you'll drift to 76% AI and 24% Cooking over 6 months—without consciously changing your behavior. This isn't a bug; it's how multiplicative scoring works. The algorithm uses multiplication, not addition, to score tweets. Your dominant interest gets a scoring advantage, which means you see more of it, which means you engage more with it, which increases the advantage, creating a feedback loop. The drift is exponential at first, then plateaus. The first 12 weeks see rapid change (60/40 → 70/30), then it slows as you approach saturation (~80/20 is a typical plateau). The algorithm doesn't show you what you want—it shows you what it predicts you'll engage with. When you first joined X, you followed a mix of accounts. Let's say you followed accounts in two interest areas. What was your initial split? This simulator models the gravitational pull effect based on the actual algorithm code: This simulator uses simplified formulas for illustration. 
The actual algorithm: Multiplicative scoring: InterestedIn calculation: Weekly batch updates: L2 normalization: Twitter built a sequence-based prediction system for user behavior. Instead of aggregating features, Phoenix models up to 522 of your recent actions (spanning hours to days of behavior) to predict what you'll do next—like, reply, click. The architecture suggests a fundamental shift from feature-based to sequence-based recommendation. Important: This analysis is based on code structure and architecture patterns. While the infrastructure is verifiably complete, some aspects (like training objectives and behavioral modeling details) are inferred from architectural similarities to transformer-based systems. We clearly mark what's verified code vs. reasoned inference throughout. Status: Phoenix infrastructure is complete and production-ready (September 2025 commit). It's currently feature-flagged (default = false), suggesting it may be in testing phase. The architecture represents a shift from feature-based to sequence-based prediction. The current recommendation system (NaviModelScorer) thinks about you in terms of averages and statistics: "Alice likes 30% tech content, 20% sports, follows 342 people, engages 10 times per day." Phoenix thinks about you in terms of what you're doing right now: "Alice just clicked 3 tech tweets in a row, expanded photos, watched a video—she's deep-diving into tech content." Feature-Based Prediction Your profile: Algorithm asks: "What does Alice usually like?" Time horizon: Months of aggregated behavior Updates: Daily batch recalculation Sequence-Based Prediction Your recent actions: Algorithm asks: "What will Alice do next given her recent behavioral pattern?" 
Time horizon: Hours to days of behavioral history (522 aggregated actions) Updates: Real-time action capture, aggregated into sessions Hypothesis: Phoenix uses a transformer-based architecture similar to language models, but instead of predicting text, it predicts actions. This inference is based on: Comparison to language models: Hypothesis: By modeling behavior as a sequence, Phoenix could understand dynamics that aggregated features miss. These capabilities are inferred from how sequence models typically work, not explicitly verified in code: Hypothesis: Phoenix could represent Twitter's move toward "delete heuristics"—the vision of replacing manual tuning with learned patterns. This interpretation is based on architectural design patterns: The result: An algorithm that understands your current intent from your behavioral patterns, not your historical preferences from aggregated statistics. This is closer to how humans actually browse—following threads of interest as they emerge, not mechanically consuming averaged content. This simulator demonstrates how sequence-based prediction could work based on Phoenix's architecture. The predictions shown are illustrative of what behavioral sequence modeling enables, not actual Phoenix output. Phoenix splits prediction and aggregation into two separate stages: Code: Input: Action Sequence Processing: Transformer Model (Inferred Architecture) Verified: Phoenix calls an external gRPC service named Why gRPC Service? (Verified: separate service, inferred: reasons) Code: Step 1: Per-Head Max Normalization For each engagement type (each "head"), find the maximum prediction across all candidates: Why normalize per-head? Different engagement types have different prediction ranges. Normalization ensures fair aggregation. 
Step 2: Weighted Aggregation Phoenix uses the same weights as NaviModelScorer for fair A/B testing comparison: Weight parameters: Code: Phoenix predicts probabilities for 13 different action types: Action sequence hydration: CRITICAL: The "5-minute window" is for aggregation (grouping actions within proximity), not filtering (time limit on history). What "5-minute aggregation window" means: Actual temporal span (522 actions): Comparison to LLM context windows: Phoenix Infrastructure: All components verified in twitter/the-algorithm repository Infrastructure Status (September 2025 commit): What This Means: Cluster configuration: Phoenix isn't a single model—it's 9 separate transformer deployments designed for parallel experimentation: Twitter can test 8 different Phoenix variants simultaneously: Each user's request can be routed to a different cluster: A/B testing flow: Safe, gradual deployment with instant rollback: Key advantage: Zero-downtime experimentation. New models can be tested without code deployment or service restart—just change the Multi-cluster logging: For offline analysis, Twitter can query all 9 clusters simultaneously for the same candidates: Purpose: Create comprehensive comparison dataset without affecting user experience. Only the selected cluster's predictions are used for ranking, but all predictions are logged for evaluation. Hybrid configuration: Twitter can use Navi predictions for some action types and Phoenix predictions for others: Incremental migration strategy: Why this matters: Reduces risk by preserving proven Navi predictions while testing Phoenix predictions incrementally. If Phoenix predictions for clicks are great but favorites are worse, Twitter can use Phoenix for clicks only. This isn't experimental infrastructure—it's production A/B testing at scale. The sophistication of the cluster system suggests: Phoenix's feature gate (default = false) doesn't mean "not deployed"—it means "controlled rollout." 
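The two-step scoring described above (per-head max normalization, then weighted aggregation with the shared weight vector) can be sketched as follows. Head names, weights, and prediction values here are illustrative assumptions, not actual Phoenix output:

```python
# Sketch of Phoenix's two-step aggregation as described above:
# (1) normalize each head by its max across candidates, (2) weighted sum.
# Heads, weights, and predictions are illustrative assumptions.
WEIGHTS = {"favorite": 0.5, "reply": 13.5}

candidates = {
    "tweet_a": {"favorite": 0.30, "reply": 0.02},
    "tweet_b": {"favorite": 0.10, "reply": 0.04},
}

# Step 1: per-head max normalization across all candidates, so heads with
# different prediction ranges contribute on a comparable scale.
head_max = {h: max(preds[h] for preds in candidates.values()) for h in WEIGHTS}
normalized = {
    tid: {h: preds[h] / head_max[h] for h in WEIGHTS}
    for tid, preds in candidates.items()
}

# Step 2: weighted aggregation with the shared weight vector.
scores = {
    tid: sum(WEIGHTS[h] * p for h, p in preds.items())
    for tid, preds in normalized.items()
}
print(scores)  # tweet_b wins on the heavily weighted reply head
```

Without step 1, tweet_a's larger favorite probability would dominate despite favorites carrying 27x less weight than replies; normalization keeps each head's contribution governed by its weight, not its raw scale.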
Twitter can activate Phoenix for specific user cohorts, test different model variants, and compare results, all without changing code. Connection pooling: Each cluster maintains 10 gRPC channels for load balancing and fault tolerance (90 total connections across 9 clusters). Request routing: Randomly selects one of 10 channels per request for even load distribution (PhoenixUtils.scala:107-117). Retry policy: 2 attempts with different channels, 500ms default timeout (configurable to 10s max). Graceful degradation: If a cluster fails to respond, the system continues with other clusters (for logging) or falls back to Navi (for production scoring). Phoenix Scorer (Stage 1 - Prediction): Phoenix Reranking Scorer (Stage 2 - Aggregation): User Action Sequence Hydrator: 13 Engagement Predictions (Action Types): gRPC Transformer Service Integration: Per-Head Max Normalization: Weighted Aggregation Logic: Model Weight Parameters: Actual Weight Values (ML Repo): Action Sequence Max Count: What we know: Phoenix infrastructure is complete, feature-gated, and production-ready. The architecture represents a fundamental shift from feature-based to sequence-based prediction. More importantly, Phoenix has sophisticated A/B testing infrastructure that strongly suggests active deployment on real users. The 9-cluster system isn't just placeholder infrastructure—it's production experimentation at scale: Conclusion: This level of infrastructure sophistication indicates Phoenix is likely being tested on production traffic right now, not merely prepared for future deployment. Paradigm shift in progress: The algorithm would understand you not as a static profile of historical preferences, but as a dynamic behavioral sequence revealing your current intent. Your feed would adapt as you browse, following threads of interest that emerge in your behavior—not mechanically serving averaged content from long-term statistics. 
This mirrors how humans actually consume content: following curiosity as it arises, deep-diving into topics that capture attention, switching contexts when interest shifts. An algorithm that learns to follow your behavioral lead, not force you into a predetermined statistical box. Verified from code: What we don't know from open-source code: Most likely scenario: Phoenix is in active A/B testing with controlled user cohorts. Twitter is iterating on multiple model variants (via Experiment1-8 clusters), comparing results, and gradually expanding deployment as metrics improve. The infrastructure is too sophisticated to be merely preparatory. Follow a tweet's complete journey from posting to your timeline through all 5 algorithmic stages. See exactly how scoring, filters, and penalties determine what you see. Every day, Twitter processes approximately 1 billion tweets through a 5-stage pipeline. By the time one reaches your feed, it has passed through candidate generation, feature extraction, machine learning scoring, multiple filters and penalties, and final mixing. Only ~4% survive to appear in feeds.
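The funnel above can be made concrete with a back-of-the-envelope sketch. Only the ~1,400 candidate count and the ~4% overall survival figure come from the text; the per-stage survival rates below are illustrative assumptions, not measured values:

```python
# Funnel sketch of the 5-stage pipeline described above. Per-stage rates are
# assumed for illustration; only the 1,400 and ~4% figures come from the text.
survival_rates = [
    ("feature hydration", 1.00),     # features attached; nothing dropped yet
    ("heavy ranker scoring", 0.50),  # low-scoring candidates fall behind
    ("filters & penalties", 0.20),   # visibility filters, diversity penalties
    ("mixing & serving", 0.40),      # ads and recommendations take slots
]

remaining = 1400.0  # candidates fetched in stage 1
print(f"candidate generation: {remaining:.0f} tweets")
for stage, rate in survival_rates:
    remaining *= rate
    print(f"{stage}: {remaining:.0f} tweets remain")

print(f"overall survival: {remaining / 1400:.0%}")
```

However the losses are distributed across stages, the compounding is the point: four modest cuts multiply into a ~25:1 reduction from candidates to timeline.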
+ Choose a tweet scenario to follow through the pipeline. Each scenario has realistic engagement probabilities and characteristics.
+ High-quality thread from someone you follow in your main interest cluster Great content from someone you don't follow, different cluster Hot take that drives replies, from followed author Good content but author already has 2 tweets in your feed Fetch ~1,400 candidate tweets from various sources based on your profile: Attach ~6,000 features to each tweet: MaskNet model predicts 15 engagement probabilities and calculates weighted score: Multiple filters reshape the ranking: Insert ads, promoted tweets, follow recommendations, and serve final timeline. Candidate generation: Heavy Ranker weights: Scoring computation: Out-of-network penalty: Author diversity: Cluster scoring: Step through the algorithmic feedback loop that causes your interests to drift, even when you don't change your behavior. The algorithm creates a self-reinforcing feedback loop: Your profile determines what you see. What you see determines what you engage with. What you engage with updates your profile. Round and round, amplifying imbalances week after week. The drift follows a logistic curve: Fast growth initially (60 → 70% in 12 weeks), then slowing as you approach saturation (~80% is typical plateau). This isn't about your behavior changing—it's pure mathematics from multiplicative scoring + L2 normalization + weekly batch updates.
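A minimal version of the loop described above (multiplicative exposure, engagement following exposure, weekly renormalized profile updates) shows the compounding drift. The constants here are illustrative assumptions tuned to produce the kind of 60/40 → ~80/20 trajectory the text describes, not values from the code:

```python
# Simplified drift model per the feedback loop described above. LEARN_RATE and
# EXPLORE are illustrative assumptions, not values from the algorithm.
LEARN_RATE = 0.3   # how strongly one week's engagement shifts the profile
EXPLORE = 0.32     # fraction of engagement not driven by ranked exposure

def weekly_update(profile: dict[str, float]) -> dict[str, float]:
    # Exposure share grows multiplicatively with interest (interest gates both
    # candidate sourcing and scoring), approximated here as interest squared.
    exposure = {c: w * w for c, w in profile.items()}
    total = sum(exposure.values())
    uniform = 1.0 / len(profile)
    engagement = {
        c: (1 - EXPLORE) * exposure[c] / total + EXPLORE * uniform
        for c in profile
    }
    blended = {
        c: (1 - LEARN_RATE) * profile[c] + LEARN_RATE * engagement[c]
        for c in profile
    }
    norm = sum(blended.values())  # renormalize so interests sum to 1.0
    return {c: w / norm for c, w in blended.items()}

profile = {"ai": 0.60, "cooking": 0.40}
for _ in range(24):  # ~6 months of weekly batch updates
    profile = weekly_update(profile)
print({c: round(w, 2) for c, w in profile.items()})
```

Two properties of the model match the behavior described in the text: a perfectly balanced 50/50 profile is a fixed point (no drift without an initial imbalance), and any imbalance compounds toward a plateau rather than growing forever, because the exploration term caps how concentrated engagement can become.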
+ Set your initial cluster interests. Watch how even a small imbalance (60/40) compounds over time.
+
+ Total: 100%
+ At the scoring stage, tweets get multiplied by your cluster interest: Because AI content ranks higher, you see more of it. Your engagement naturally matches what's visible: Cluster interests must sum to 1.0 (100%). When AI increases, Cooking MUST decrease: InterestedIn updates weekly via batch jobs. Each week's drift becomes the new baseline: The output becomes the input: This isn't about user behavior. It's pure math: Key insight: As long as there's ANY imbalance (not perfect 50/50), drift will occur. The stronger interest always wins. In the current design, drift is inevitable for any user with unequal interests. Multiplicative scoring: Weekly batch updates: L2 normalization: InterestedIn calculation: ${tierDescriptions[tier]} Strong consolidation: Your feed is dominated by ${dominant.name} (${dominant.percent.toFixed(1)}%).
+ This means: Warning: You're approaching a filter bubble. Consider diversifying your engagement. Moderate imbalance: ${dominant.name} is your dominant interest (${dominant.percent.toFixed(1)}%),
+ but you still see meaningful ${secondary.name} content. This will likely drift further toward ${dominant.name} over time. Tip: If you want to maintain balance, actively engage more with ${secondary.name} content. Relatively balanced: Your interests are fairly distributed. However, the algorithm will
+ naturally drift toward the strongest interest (${dominant.name} at ${dominant.percent.toFixed(1)}%) over time. Tip: Balanced interests require active maintenance; the algorithm has no rebalancing mechanisms.
+ Next week: Your InterestedIn will update based on what you engage with this week.
+ The cycle repeats every 7 days, creating compounding drift.
+ Starting point: Slightly unbalanced (60/40). This small imbalance will compound over time. KnownFor update week: The underlying cluster structure recalculates. This creates a new baseline for the next 3 weeks of InterestedIn updates. Result: Without changing your behavior, your feed drifted from 60/40 to ${data.aiPercent}/${data.cookingPercent}. This will continue toward 80/20, 90/10, eventually 100/0. Heavily concentrated: You're ${dominantPercent}% ${dominantName}. This cluster will dominate your For You feed. Roughly ${dominantPercent}% of your feed will be ${dominantName} content. ${secondName} (${secondPercent}%) will be much less visible. Moderately concentrated: You're ${dominantPercent}% ${dominantName}, ${secondPercent}% ${secondName}. Your feed will be roughly ${dominantPercent}% ${dominantName} and ${secondPercent}% ${secondName}. The smaller cluster may drop below the threshold over time due to multiplicative scoring. Relatively balanced: Your top cluster is ${dominantPercent}% ${dominantName}. Your feed will be fairly diverse, but expect drift toward ${dominantName} over time due to multiplicative scoring (gravitational pull effect). ${weakestName} (${weakestPercent}%) is approaching the algorithm's threshold. If it drops below ~7%, it may be filtered out entirely from your feed. To maintain diversity, you need to actively engage with ${weakestName} content to keep this cluster above the threshold. Your engagement history is strongly influencing your clusters (${(totalEngagement / (totalFollows + totalEngagement) * 100).toFixed(0)}% of the signal). This is normal! Engagement with a 100-day half-life dominates over follows for active users. Your clusters reflect what you actually engage with, not just who you follow. Your follows are the primary signal (${(totalFollows / (totalFollows + totalEngagement) * 100).toFixed(0)}% of the calculation). This suggests you're either a new account or a light user. 
As you engage more, your engagement history will start dominating. Within a few weeks of active use, engagement will override your follow choices. ${scenario.name} "${scenario.description}" Why this score: ${scenario.explanation}
+ Phoenix detected this pattern from your last ${actionSequence.length} actions
+ Passive Browsing Mode Detected Your sequence shows mostly scrolling with minimal engagement. Phoenix predicts: In Navi (old system): You'd get your standard 50/50 mix regardless of browsing mode. High Engagement Streak Detected! You've been actively engaging (replies, likes) with ${dominant.toLowerCase()} content. Phoenix predicts: In Navi: Static prediction based on lifetime averages. Focused Interest Detected: ${dominant} Your recent sequence shows clear focus on ${dominant.toLowerCase()} content (${dominantProb}% probability). Phoenix predicts: Key difference: Phoenix sees you're interested in ${dominant.toLowerCase()} right now, not based on what you liked last month. Context Switching Detected Your sequence shows mixed interests (Tech: ${analysis.predictions.tech}%, Sports: ${analysis.predictions.sports}%). Phoenix predicts: Phoenix advantage: Can detect when you switch contexts mid-session and adapt instantly. Your tweet enters the pipeline as one of ~1,400 candidates selected based on your profile.
+ ${currentScenario.network === 'in-network'
+ ? 'Retrieved from Earlybird search index because you follow this author. ~50% of candidates come from in-network.'
+ : 'Retrieved via SimClusters ANN based on your interest clusters. ~20% of candidates come from similar content clusters.'}
+
+ From ~1 billion tweets posted, only 1,400 make it to your candidate pool. That's a 99.99986% rejection rate before any scoring!
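For readers who want to check the arithmetic, the funnel above works out as follows (a back-of-the-envelope calculation, taking the ~1 billion and ~1,400 figures at face value):

```python
# Candidate-generation funnel: ~1,400 candidates from ~1 billion tweets.
CANDIDATES = 1_400
TWEET_POOL = 1_000_000_000

selection_rate = CANDIDATES / TWEET_POOL
rejection_rate = 1 - selection_rate
print(f"selected: {selection_rate:.5%}, rejected: {rejection_rate:.5%}")
```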
+
+ ${currentScenario.network === 'in-network'
+ ? 'You follow this author, so this tweet has in-network status and will avoid the 25% out-of-network penalty.'
+ : 'You don\'t follow this author. Will receive a 0.75x multiplier (25% penalty) later in the pipeline.'}
+ The algorithm attaches ~6,000 features to this tweet for the ML model to evaluate.
+ The Heavy Ranker will predict probabilities for 15 different engagement types. These feed into the weighted scoring formula.
+ MaskNet model predicts engagement probabilities and calculates weighted score: Multiple filters reshape the ranking by applying multipliers and penalties: ${mod.description}
+ ${baseScore > currentScore
+ ? `Score reduced by ${(((baseScore - currentScore) / baseScore) * 100).toFixed(1)}% through filters`
+ : 'Score maintained through filters'}
+ The final stage inserts ads, promoted tweets, and modules before serving your timeline.
+ Based on the final score of ${finalScore.toFixed(3)}, this tweet would rank approximately #${estimateRank(finalScore)} in your timeline of ~100-200 tweets.
+
+ Only ~50-100 tweets make it to your final timeline. ${estimateRank(finalScore) <= 100 ? 'This tweet made it!' : 'This tweet was filtered out in the final ranking.'}
+
+ Twitter inserts ads (~10% of timeline), promoted tweets, "Who to Follow" modules, and topic suggestions. Your organic timeline is interspersed with monetization elements.
+ You've experienced one complete loop. Now let's see how this compounds over 4 weeks: The compounding effect: Each week, the loop repeats. Your profile updates based on your engagement, which was determined by your feed, which was determined by your profile. The imbalance grows automatically. Key insight: You didn't change your behavior at all! You consistently engaged with what you saw. The algorithm's multiplicative scoring and L2 normalization created this drift. Here's the long-term effect of the reinforcement loop: After 6 months, you've drifted from ${initialAi}/${100-initialAi} to ${week24Ai}/${100-week24Ai}. ${week24Ai >= 75 ? 'Your feed is now a monoculture - the minority interest has nearly disappeared.' : 'The drift continues accelerating as the imbalance grows.'} This isn't because you changed. The algorithm's design makes drift mathematically inevitable for any imbalanced starting point. This is your current InterestedIn profile - the algorithm's understanding of what you care about: What this means: The algorithm will use these weights to score tweets. AI content gets multiplied by ${(profile.ai).toFixed(2)}, Cooking by ${(profile.cooking).toFixed(2)}. Change from Week ${currentWeek - 1}: AI ${profile.ai > history[currentWeek - 1].ai ? '↑' : '↓'} ${Math.abs((profile.ai - history[currentWeek - 1].ai) * 100).toFixed(1)}%, Cooking ${profile.cooking > history[currentWeek - 1].cooking ? '↑' : '↓'} ${Math.abs((profile.cooking - history[currentWeek - 1].cooking) * 100).toFixed(1)}% The algorithm fetches ~1,600 candidate tweets from your clusters, proportional to your interests: What this means: Before any scoring happens, the algorithm already fetched ${aiPercent}% AI content and ${cookingPercent}% Cooking content. Your profile determines what's even in the pool! 
Each tweet gets scored by multiplying base quality × your cluster interest. What this means: Despite equal quality (0.85), the AI tweet scores ${scoreAdvantage}x higher due to your cluster interests. This determines what ranks at the top of your feed.
The algorithm sorts tweets by score and builds your feed. The composition matches your profile. What this means: Because AI content scored higher, it dominates your feed. You'll see ${aiPercent}% AI tweets and ${cookingPercent}% Cooking tweets. This isn't random - it's a direct result of the multiplicative scoring.
You engage with what you see. Since ${aiPercent}% of your feed is AI, ${aiPercent}% of your engagements are with AI content. Critical insight: You didn't change your preferences! You just engaged with what was shown to you. The algorithm controlled what you saw, which determined what you engaged with.
Based on your engagement pattern, the algorithm updates your InterestedIn profile. This happens via weekly batch jobs (L2 normalization ensures interests sum to 100%). The feedback loop: AI increased because you engaged more with AI. Cooking decreased because interests must sum to 100% (zero-sum). This new profile becomes the input for Week ${currentWeek + 1}, and the cycle repeats. This is drift! Small changes compound week after week, pushing you toward monoculture.

Technical terminology, code references, and verification guide for the interactive documentation

The Twitter algorithm is built from many interconnected systems. Here's what each piece does, explained intuitively rather than technically. Reading this glossary: Each entry explains what the system does and why it exists. Think of these as tools in a toolbox - each serves a specific purpose in the larger recommendation pipeline.

Heavy Ranker

What it is: The main machine learning model that scores tweets. How to think about it: Imagine a judge at a competition who can predict 15 different ways the audience might react to each performance.
The Heavy Ranker looks at a tweet and predicts: "There's a 5% chance you'll like this, 2% chance you'll reply, 0.1% chance you'll click 'not interested'," and so on. Each prediction gets a weight (replies are worth 13.5x more than likes), and the weighted sum becomes the tweet's final score. Why it exists: Scoring thousands of tweets per user is computationally expensive. The Heavy Ranker is "heavy" because it's thorough - it uses a neural network with ~48 million parameters to make highly accurate predictions. But you can only afford to run something this expensive on a pre-filtered set of candidates. Architecture: Uses MaskNet (see below) - a special neural network design that predicts all 15 engagement types simultaneously while sharing knowledge between predictions. Code: External repo Weights: What it is: A faster, simpler scoring model embedded in the search index. How to think about it: If Heavy Ranker is a detailed film critic analyzing every aspect of a movie, Light Ranker is a quick star rating. It's a basic logistic regression model that runs inside the search index (Earlybird) to quickly score millions of tweets and pick the top few thousand worth sending to Heavy Ranker. Why it exists: You can't run Heavy Ranker on a billion tweets - it would take too long and cost too much. Light Ranker is the bouncer that gets the candidate pool down from millions to thousands in milliseconds. Trade-off: Fast but less accurate. Uses only ~20 features vs Heavy Ranker's ~6,000 features. Code: What it is: A giant knowledge graph that represents everything on Twitter (users, tweets, topics, communities) as connected points in mathematical space. How to think about it: Imagine a 3D map where every user is a point, every tweet is a point, and every topic is a point. Similar things are close together. If you like sci-fi movies and engage with certain accounts, you'll be positioned near other sci-fi fans. 
TwHIN can then say "show this person tweets from that nearby cluster they haven't seen yet." Why it exists: Finding relevant content from people you don't follow is hard. TwHIN solves this by representing similarity mathematically - it can find "users similar to you" or "tweets similar to what you engage with" by measuring geometric distance in this abstract space. Heterogeneous means: The graph includes different types of things (users, tweets, topics, hashtags) all in one unified mathematical representation. Code: What it is: A system that divides X into ~145,000 interest-based communities and represents both users and tweets as membership in these communities. How to think about it: Instead of saying "Alice follows Bob and Carol," SimClusters says "Alice is 60% in the AI cluster, 30% in the cooking cluster, and 10% in the gardening cluster." Tweets are described the same way. Then matching is simple: show people tweets from clusters they belong to. Why it exists: Communities are more stable than individual follow relationships, and they're much more efficient to compute with. Rather than comparing you to millions of individual users, the algorithm can compare your cluster membership to tweet cluster scores. The gravitational pull effect: Because scoring uses multiplication ( How clusters are created: X analyzes the follow graph using community detection algorithms to discover ~145,000 natural communities. Your interests (InterestedIn) are calculated from your engagement history with a 100-day half-life, updated weekly. See the Cluster Explorer interactive to understand how you're categorized. Code: What it is: An in-memory graph database that tracks recent engagement patterns to make real-time recommendations. How to think about it: UTEG is like a short-term memory system. It remembers "in the last 24 hours, people similar to you engaged with these tweets." 
It's built using GraphJet (see below), which keeps a live graph in RAM that can answer queries in milliseconds. Why it exists: Some recommendation systems (like SimClusters) are based on long-term patterns and update slowly. UTEG captures what's happening right now - trending topics, breaking news, viral content. It provides the "fresh" recommendations that complement the more stable systems. Graph traversal: To find recommendations, UTEG does graph walks: "You liked tweet A → Other people who liked A also liked B → Show you tweet B." Code: What it is: An in-memory graph database optimized for real-time recommendations. How to think about it: A traditional database stores data on disk and reads it when needed (slow). GraphJet keeps the entire graph in RAM (fast) and is optimized for the specific types of queries Twitter needs: "given this user, find related tweets" or "given this tweet, find similar users." Why it exists: Speed. When you refresh your timeline, Twitter has ~200 milliseconds to gather candidates, score them, and serve the results. GraphJet can traverse millions of graph edges in memory in just a few milliseconds. Trade-off: RAM is expensive and limited, so GraphJet only stores recent data (typically last 24-48 hours of engagement). Code: Open-sourced separately at What it is: Twitter's real-time search index - a specialized database optimized for finding tweets by keywords, authors, or engagement patterns. How to think about it: When you search for "machine learning" on Twitter, Earlybird finds matching tweets in milliseconds even though there are billions of tweets. For the recommendation algorithm, Earlybird serves as the main source of in-network candidates (tweets from people you follow). Why it exists: Traditional databases aren't fast enough for Twitter's scale. Earlybird is custom-built for one purpose: extremely fast tweet retrieval with ranking. 
It includes the Light Ranker (see above) built directly into the index so it can return already-scored candidates. Real-time means: New tweets are indexed within seconds, so Earlybird always has the latest content. Code: What it is: A system that predicts the strength of relationships between users based on interaction patterns, not just follow relationships. How to think about it: You might follow 500 people, but you only regularly interact with 20 of them. Real Graph identifies those 20 by tracking who you reply to, whose profiles you visit, whose tweets you engage with. It creates a weighted graph where edge strength = relationship strength. Why it exists: Following someone is a weak signal. The algorithm needs to know who you actually care about. Real Graph provides this by analyzing behavior: "You follow both @alice and @bob, but you reply to Alice 10x more often, so Alice gets 10x more weight in your recommendations." Used for: Prioritizing in-network content, finding follow recommendations, and scoring out-of-network candidates based on similarity to your real connections. Code: What it is: A coordination service that gathers out-of-network tweet candidates from multiple sources and combines them. How to think about it: Tweet Mixer is like a talent scout that asks multiple agencies (TwHIN, SimClusters, UTEG, FRS) for their best recommendations, then combines those lists into one unified candidate pool to send to the Heavy Ranker. Why it exists: Each recommendation system has different strengths - UTEG finds trending content, SimClusters finds thematic matches, TwHIN finds geometric similarity. Tweet Mixer orchestrates these systems and ensures you get a diverse mix of out-of-network candidates rather than duplicates from the same source. Does NOT score: Tweet Mixer just fetches and combines. The actual scoring happens later in the Heavy Ranker. Code: What it is: A high-performance inference engine that runs machine learning models in production. 
How to think about it: Training a neural network happens offline in Python/TensorFlow. But when it's time to actually score tweets for millions of users, you need something blazing fast. Navi is a Rust-based serving system optimized for running the Heavy Ranker model with minimal latency. Why it exists: Python is too slow for production inference at Twitter's scale. Navi compiles the trained model into optimized Rust code that can score thousands of tweets per second with single-digit millisecond latency. Trade-off: More complex to deploy and maintain than standard TensorFlow Serving, but much faster. Code: Proprietary, but referenced in What it is: A framework for building content feeds - provides reusable components for fetching candidates, scoring, filtering, and mixing content. How to think about it: Building a recommendation timeline involves many common steps: fetch candidates, hydrate features, run ML models, apply filters, insert ads, etc. Product Mixer provides these as Lego blocks so teams can assemble feeds without reimplementing everything from scratch. Why it exists: Twitter has multiple feeds (For You, Following, Lists, Search, Notifications). Product Mixer lets them share code and ensure consistency while customizing each feed's specific logic. Pipeline structure: Product Mixer uses a pipeline model where each stage's output feeds into the next stage, making the data flow explicit and testable. Code: What it is: A neural network architecture designed for multi-task learning - predicting multiple related outcomes simultaneously. How to think about it: Traditional models predict one thing ("will you like this?"). MaskNet predicts 15 things at once (like, reply, retweet, report, etc.) while sharing knowledge between tasks. The insight is that all these predictions are related - if someone is likely to reply, they're probably also likely to like - so the model can learn more efficiently by predicting them together. 
Why it exists: Training 15 separate models would be inefficient and they'd miss shared patterns. MaskNet uses "shared towers" (neural network layers that all tasks use) and "task-specific towers" (layers unique to each prediction), getting the best of both worlds. The mask part: During training, MaskNet randomly "masks" (hides) some tasks to prevent the model from cheating by learning shortcuts between correlated tasks. Code: External repo What it is: A configuration system that lets Twitter tune algorithm parameters without deploying new code. How to think about it: Hardcoded values like Why it exists: Algorithm optimization is experimental. Twitter needs to test "what if we change the out-of-network penalty from 0.75 to 0.80?" dozens of times per week. FSBoundedParam makes this safe (the bounds prevent catastrophically bad values) and fast (no deployment required). Important implication: Most weights, penalties, and thresholds in the algorithm are FSBoundedParams. The March 2023 open-source code shows the structure and formulas, but Twitter can tune the parameters without us seeing the changes. Code: Used throughout, defined in What it is: A reputation score for users based on their follower graph quality, using PageRank-like algorithms. How to think about it: Not all followers are equal. A verified account with 1M engaged followers has higher TweepCred than a bot farm with 1M fake followers. TweepCred measures "how much does the Twitter network trust/value this user?" by looking at who follows them and the quality of those followers. Why it exists: Follower count alone is easily gamed. TweepCred provides a more robust measure of influence by analyzing the graph structure. It's used to boost high-quality accounts and filter low-quality ones (the SLOP filter removes users with TweepCred below a threshold). Verified accounts: Get a ~100x TweepCred multiplier, which partly explains why verified accounts dominate recommendations. 
Code: What it is: A service that recommends users you might want to follow. How to think about it: FRS analyzes your follow graph and engagement patterns to suggest accounts similar to those you already follow or engage with. But it has a dual purpose: it also feeds into timeline recommendations by showing you tweets from accounts it thinks you should follow before you actually follow them. Why it exists: Growing your follow graph improves your timeline quality (more in-network candidates). But FRS also serves as a candidate source - "here are tweets from people you don't follow but should." Cluster reinforcement: FRS recommends users from your strongest SimClusters, which accelerates the gravitational pull effect. If you're 60% AI cluster, FRS recommends more AI accounts, you follow them, which makes you even more AI-cluster-heavy. Code: What it is: A centralized platform for collecting, storing, and serving user behavior signals. How to think about it: Every action you take on Twitter (like, reply, click, scroll, dwell time) generates a signal. Rather than having every recommendation system separately track these signals, USS centralizes them. When the algorithm needs to know "what has this user engaged with recently?", it queries USS. Why it exists: Reduces duplication and ensures consistency. Multiple systems use the same signals (favorites, follows, etc.), so centralizing this in USS means one source of truth. Real-time and batch: USS provides both real-time signals (recent clicks in the last hour) and batch signals (aggregated engagement over weeks/months). Code: Twitter's algorithm was open-sourced in two major releases: ~300 files showing the 5-stage pipeline structure, basic candidate sources, and core concepts. The +762 new files adding 161 feature hydrators, 56 filters, 29 scorers, complete ML integration, and full parameter definitions. 
Our investigation analyzes a composite system: Important: Parameter definitions exist with
Core findings remain valid: The fundamental mechanisms (multiplicative scoring, exponential decay, 0.75x out-of-network penalty, 140-day feedback fatigue) are unchanged. The September 2025 release added detail and confirmed the architecture we analyzed.
Every finding in this investigation can be verified. Here's how: We provide file paths like: To view this: To see when code was last modified: We show calculations like: You can verify these against the code references we provide.
Twitter published some official explanations: Our analysis adds detail and implications beyond what Twitter officially documented.
Entry point: Scoring orchestration: All 15 weight parameters: Engagement type definitions: "Not interested" filtering: 140-day penalty calculation: Author diversity exponential decay: Out-of-network 0.75x multiplier: Earlybird search index: UTEG: Out-of-network coordination: FRS: Community detection and embeddings: Approximate nearest neighbor search: Complete list of 20+ tracked signals: Signal collection and serving: Real-time action stream:
Questions or corrections? This is a living document. If you find errors or have questions about our analysis, please open an issue or submit a pull request on GitHub.

How Twitter's Algorithm Really Works
+
+
+ What We Found
+
+
+
+ Top 5 Most Interesting Findings
+
+
+ 🤯 The Favorites Paradox
+
+
+
+ Favorite (like): 0.5 weight
+Reply: 13.5 weight (27x more valuable)
+Reply with Author Engagement: 75.0 weight (150x more valuable!)
+
+HomeGlobalParams.scala:786-930
+ Production weight values: the-algorithm-ml/projects/home/recap
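The paradox above falls out of the weighted sum directly. A sketch using just three of the 15 weights (the engagement probabilities below are invented for illustration):

```python
# Sketch of the Heavy Ranker's weighted-sum scoring, using the three example
# weights above: favorite 0.5, reply 13.5, reply-with-author-engagement 75.0.
# The real model combines 15 predicted probabilities; this shows three.
WEIGHTS = {"favorite": 0.5, "reply": 13.5, "reply_engaged_by_author": 75.0}

def weighted_score(probabilities):
    return sum(WEIGHTS[k] * p for k, p in probabilities.items())

# A tweet likely to be liked vs. one likely to spark replies (made-up probabilities):
likeable = weighted_score({"favorite": 0.10, "reply": 0.01, "reply_engaged_by_author": 0.001})
provocative = weighted_score({"favorite": 0.02, "reply": 0.05, "reply_engaged_by_author": 0.01})
print(likeable, provocative)  # the reply-driving tweet outscores the well-liked one
```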
+ 🔥 Conflict is Emergent, Not Intentional
+
+
+
+
+ NaviModelScorer.scala, HomeGlobalParams.scala)
+ 📊 Multiplicative Scoring = Mathematical Echo Chambers
+
+
+
+ score = baseScore × clusterInterest
+
+ AI tweet (quality 0.9): 0.9 × 0.60 = 0.54
+Cooking tweet (quality 0.9): 0.9 × 0.40 = 0.36
+
+Same quality, 50% score difference!
+
+ApproximateCosineSimilarity.scala:94 → score * sourceClusterScore
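A minimal sketch of this scoring step, using the cluster interests and quality values from the example above:

```python
# score = baseScore × clusterInterest, with the illustrative numbers above.
interested_in = {"AI": 0.60, "Cooking": 0.40}  # your cluster interest profile

def cluster_score(base_score, cluster):
    return base_score * interested_in[cluster]

ai = cluster_score(0.9, "AI")            # 0.9 × 0.60 = 0.54
cooking = cluster_score(0.9, "Cooking")  # 0.9 × 0.40 = 0.36
print(ai, cooking, ai / cooking)  # identical quality, 1.5× ranking advantage
```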
+ 👑 Verified Accounts Get 100x Multiplier
+
+
+
+ if (isVerified) 100
+else { /* calculate based on followers, PageRank, etc. */ }
+
+UserMass.scala:40-41
+ ☢️ "Not Interested" is Nuclear
+
+
+
+
+
+
+ Day 0: 0.2x multiplier (nearly invisible)
+Day 70: 0.6x multiplier (still suppressed)
+Day 140: 1.0x multiplier (penalty expires)
+
+FeedbackFatigueScorer.scala:38 → val DurationForDiscounting = 140.days
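A sketch of the recovery curve implied by the numbers above, assuming the multiplier climbs linearly from 0.2x back to 1.0x over the 140-day window (the open-source code defines the 140-day duration; the exact curve shape here is our reading of it):

```python
# Assumed shape of the "not interested" feedback-fatigue discount:
# linear recovery from 0.2x to 1.0x over DurationForDiscounting = 140 days.
DURATION_DAYS = 140
MIN_MULTIPLIER = 0.2

def fatigue_multiplier(days_since_feedback):
    if days_since_feedback >= DURATION_DAYS:
        return 1.0  # penalty fully expired
    fraction_elapsed = days_since_feedback / DURATION_DAYS
    return MIN_MULTIPLIER + (1.0 - MIN_MULTIPLIER) * fraction_elapsed

for day in (0, 70, 140):
    print(day, fatigue_multiplier(day))
```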
+
+
+ Other Notable Code-Backed Findings
+
+ ⏰ Weekly Batch Updates
+ InterestedInFromKnownFor.scala:59 → Days(7)
+ 🚧 Out-of-Network Penalty
+ RescoringFactorProvider.scala:45-57
+ 📝 Author Diversity Decay
+ AuthorBasedListwiseRescoringProvider.scala:54
+ 🥶 TwHIN Cold Start Problem
+ TwhinEmbeddingsStore.scala:48
+ 🗺️ Sparse Cluster Assignment
+ InterestedInFromKnownFor.scala:148 → maxClustersPerUser = 20
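As one example of how these mechanisms behave, the author-diversity decay can be sketched as an exponential multiplier with a floor. The decay factor and floor below are illustrative assumptions, not values taken from AuthorBasedListwiseRescoringProvider.scala:

```python
# Illustrative author-diversity decay: each additional tweet from the same
# author in one timeline gets an exponentially smaller multiplier.
DECAY = 0.5   # assumed per-tweet decay factor (not the production value)
FLOOR = 0.25  # assumed minimum multiplier (not the production value)

def author_diversity_multiplier(nth_tweet_from_author):
    # 0th tweet: no penalty; later tweets decay exponentially toward the floor
    return FLOOR + (1.0 - FLOOR) * (DECAY ** nth_tweet_from_author)

for n in range(4):
    print(n, author_diversity_multiplier(n))
```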
+
+ Interactive Explorations
+
+ Understanding The Pipeline
+
+ 🔍 Pipeline Explorer
+ 🧮 Engagement Calculator
+ Understanding Your Algorithmic Identity
+
+ 🗺️ Cluster Explorer
+ 🎭 Algorithmic Identity Builder
+ Understanding Filter Bubbles & Echo Chambers
+
+ 🚀 Journey Simulator
+ 👥 Invisible Filter Demo
+ 🔁 Reinforcement Loop Visualizer
+ Understanding Structural Advantages
+
+ 👑 Algorithmic Aristocracy
+ Understanding Next-Generation Systems
+
+ 🔮 Phoenix: Behavioral Prediction (Likely Active)
+
+
+ Our Approach
+
+
+
+
+
+
+ Who This Is For
+
+
+
+
+
+
+ About This Investigation
+
+
+
+ The Algorithm's Aristocracy
+ What Is Algorithmic Aristocracy?
+
+ The Key Architectural Choice: Multiplication, Not Addition
+
+ Why Multiplication Matters
+
+
+ Small account: 1 + 1 + 1 + 1 = 4
+Large account: 100 + 10 + 5 + 2 = 117
+Ratio: 29:1 (manageable difference)
+ Small account: 1 × 1 × 0.001 × 100 = 0.1
+Large account: 100 × 1 × 1 × 50,000 = 5,000,000
+Ratio: 50,000,000:1 (insurmountable difference)
+
+The Four Mechanisms That Multiply
+
+ 1. Verification Multiplier
+ 2. TwHIN Engagement Threshold
+ 3. Follow Ratio Penalty
+ 4. Out-of-Network Base
+ How They Compound: Concrete Example
+
+
+
+ Account A (small account):
+ Verification: 1× (no multiplier)
+ TwHIN: 0.5× (mostly below the 16-engagement threshold, so only partial TwHIN support)
+ Follow ratio: 0.001× (following 2× more than followers)
+ Base reach: 500 followers
+
+ Calculation: 1 × 0.5 × 0.001 × 500 = ~0.25
+ Effective reach: ~575 (with partial OON)
+
+Account B (large account):
+ Verification: 100× (Twitter Blue)
+ TwHIN: 1× (crossed threshold - full features)
+ Follow ratio: 1× (no penalty)
+ Base reach: 50,000 followers
+
+ Calculation: 100 × 1 × 1 × 50,000 = 5,000,000
+ Effective reach: ~200,000 (after normalization)
+
+Ratio: 348:1 from identical content
+
+The multipliers compound:
+• 1 × 0.5 × 0.001 × 500 = 0.25
+• 100 × 1 × 1 × 50,000 = 5,000,000
+• Gap created purely by multiplication of structural advantages
+
+Why This Matters
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ The Four Mechanisms (Technical Details)
+ 1. Verification Multiplier (100x)
+
+ UserMass.scala:41
+ if (isVerified) 100
+
+Account A (10K followers, unverified):
+ TweepCred ≈ 50 (calculated from graph structure)
+
+Account B (10K followers, verified):
+ TweepCred ≈ 5,000 (100x multiplier)
+
+Same follower count, 100:1 difference in algorithmic treatment
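The worked example above, as a sketch (treating verification as a flat 100x multiplier on an otherwise graph-derived mass, which is how this investigation reads UserMass.scala):

```python
# Sketch of the verification advantage in TweepCred. The graph-derived mass
# here is a placeholder; the real value comes from PageRank over the follow graph.
def tweep_cred(graph_mass, is_verified):
    return graph_mass * 100 if is_verified else graph_mass

print(tweep_cred(50, False))  # unverified: graph-derived mass only
print(tweep_cred(50, True))   # verified: same account, 100x treatment
```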
+
+ 2. TwHIN Engagement Threshold (16)
+
+ TwhinEmbeddingsStore.scala:48-65
+ val MinEngagementCount = 16
+
+if (persistentEmbedding.updatedCount < MinEngagementCount)
+ embedding.map(_ => 0.0) // Zero out if insufficient engagement
+
+
+ Small account (500 followers):
+ Average tweet: 8 engagements
+ Result: Most tweets never cross threshold
+
+Large account (50K followers):
+ Average tweet: 250 engagements
+ Result: All tweets cross threshold immediately
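A sketch of the gating logic, following the snippet above (the embedding values are placeholders):

```python
# TwHIN minimum-engagement gate: embeddings are zeroed out until the
# engagement count reaches MinEngagementCount = 16.
MIN_ENGAGEMENT_COUNT = 16

def usable_embedding(embedding, engagement_count):
    if engagement_count < MIN_ENGAGEMENT_COUNT:
        return [0.0] * len(embedding)  # below threshold: no TwHIN support
    return embedding

print(usable_embedding([0.3, 0.7], 8))    # small account's typical tweet: zeroed
print(usable_embedding([0.3, 0.7], 250))  # large account's typical tweet: kept
```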
+ 3. Follow Ratio Penalty (Exponential, Unbounded)
+
+ UserMass.scala:54-64
+ val friendsToFollowersRatio = (1.0 + numFollowings) / (1.0 + numFollowers)
+val adjustedMass = mass / exp(5.0 * (ratio - 0.6))
+
+
+
+
+
+
+
+ Ratio
+ Penalty Multiplier
+
+
+ 0.6
+ 1x (no penalty)
+
+
+ 1.0
+ 7.4x penalty
+
+
+ 2.0
+ 1,097x penalty
+
+
+
+ 5.0
+ 3.6 billion x penalty
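The table above can be reproduced directly from the formula (assuming, as the table does, that no penalty applies at or below the 0.6 threshold):

```python
import math

# Penalty divisor from UserMass.scala: adjustedMass = mass / exp(5.0 * (ratio - 0.6))
def follow_ratio_penalty(ratio):
    return math.exp(5.0 * (ratio - 0.6)) if ratio > 0.6 else 1.0

for r in (0.6, 1.0, 2.0, 5.0):
    print(f"ratio {r}: {follow_ratio_penalty(r):,.1f}x penalty")
```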
+ 4. Out-of-Network Penalty (0.75x)
+
+ RescoringFactorProvider.scala:46-57
+ object OutOfNetworkScaleFactorParam extends FSBoundedParam[Double](
+ name = "out_of_network_scale_factor",
+ default = 0.75,
+ min = 0.0,
+ max = 1.0
+)
+
+1K followers account:
+ In-network base: 1,000 users (no penalty)
+ Out-of-network: ~99% of potential audience (0.75x penalty applies to nearly all growth)
+
+1M followers account:
+ In-network base: 1,000,000 users (no penalty)
+ Out-of-network: ~95% of potential audience, but base is 1000x larger
+ Same penalty (0.75x), different absolute impact
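Multiplying the four illustrative factors from the worked example shows how the raw gap forms (this is the unnormalized multiplier gap; the effective-reach figures of ~575 vs ~200,000 involve normalization steps not modeled here):

```python
# Compounding the four structural multipliers from the worked example above.
# All inputs are the illustrative values used in that example, not measured data.
def structural_score(verification, twhin, follow_ratio_mult, base_reach):
    return verification * twhin * follow_ratio_mult * base_reach

small = structural_score(1, 0.5, 0.001, 500)      # 1 × 0.5 × 0.001 × 500 = 0.25
large = structural_score(100, 1, 1, 50_000)       # 100 × 1 × 1 × 50,000 = 5,000,000
print(small, large, large / small)
```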
+ Same Content, Different Treatment
+
+
+
+
+
+
+
+ Characteristic
+ Account A
+ Account B
+
+
+ Followers
+ 500
+ 50,000
+
+
+ Following
+ 1,000
+ 200
+
+
+ Verified
+ No
+ Yes ($8/month)
+
+
+ Avg engagements/tweet
+ 8
+ 250
+
+
+ Mechanisms Applied:
+
+
+ 1. Verification multiplier
+ 1x (no multiplier)
+ 100x multiplier
+
+
+ 2. TwHIN threshold
+ Not crossed (8 < 16)
+ Crossed (250 ≥ 16)
+
+
+ 3. Follow ratio penalty
+ Ratio 2.0 → 1,097x penalty
+ Ratio 0.004 → no penalty
+
+
+ 4. Out-of-network base
+ ~100 × 0.75 = 75
+ ~50,000 × 0.75 = 37,500
+
+
+ Estimated Effective Reach
+ ~575
+ ~200,000
+
+
+
+ Reach Ratio: 348:1
+ The Five Tiers
+
+
+
+
+
+
+
+ Tier
+ Followers
+ Typical Characteristics
+ Reach Multiplier
+
+
+ 1
+ <1,000
+ Unverified, high follow ratio, <16 engagements, no TwHIN support
+ ~1x (base only)
+
+
+ 2
+ 1,000-10,000
+ Possibly verified, improving ratio, occasional TwHIN on popular tweets
+ ~1-15x
+
+
+ 3
+ 10,000-100,000
+ Often verified (100x), low ratio, frequent TwHIN support
+ ~15-200x
+
+
+ 4
+ 100,000-1,000,000
+ Verified, minimal ratio penalty, all tweets get TwHIN
+ ~200-2,000x
+
+
+
+ 5
+ ≥1,000,000
+ Verified, all penalties negligible, maximum algorithmic support
+ ~2,000x+
+ Configurability
+
+
+
+
+
+
+
+ Mechanism
+ Value
+ Type
+ Adjustable
+
+
+ Verification multiplier
+ 100x
+ Hardcoded
+ No (requires deployment)
+
+
+ TwHIN threshold
+ 16 engagements
+ Hardcoded
+ No (requires deployment)
+
+
+ Follow ratio formula
+ exp(5.0 × (ratio - 0.6))
+ Hardcoded
+ No (requires deployment)
+
+
+
+ Out-of-network penalty
+ 0.75x (default)
+ FSBoundedParam
+ Yes (range: 0.0-1.0)
+ Code Verification
+
+
+
+ UserMass.scala:41
+ TwhinEmbeddingsStore.scala:48-65
+ UserMass.scala:54-64
+ RescoringFactorProvider.scala:46-57
+
+Explore More
+
+ Your Algorithmic Identity
+ The Three-Layer Architecture
+
+ Layer 1: KnownFor (The Foundation)
+
+
+
+
+ Layer 2: InterestedIn (Your Consumer Profile)
+
+
+
+
+
+
+ Engagement from 3 months ago: 50% weight
+Engagement from 6 months ago: 25% weight
+Engagement from 9 months ago: 12.5% weight
+
+Old engagement lingers for months, making your cluster scores slow to shift.
+
+Layer 3: Producer Embeddings (Your Producer Profile)
+
+
+
+
+ The Critical Insight: They Can Diverge
+
+ Example: The Divergent Profile
+
+ Your InterestedIn (what you consume):
+ Cooking: 70%
+ AI/Tech: 30%
+ → Algorithm shows you cooking recipes, food content
+
+Your Producer Embeddings (who sees your content):
+ AI/Tech: 65%
+ Politics: 25%
+ Cooking: 10%
+ → Algorithm shows your posts to AI and politics audiences
+
+Result: You consume cooking content but produce for AI/politics audiences.
+When you post about cooking (your interest), your AI/politics audience
+doesn't engage. Your reach collapses.
+
+Why You're Not in Control
+
+
+
+
+
+
+
+
+
+ Profile
+ You Control?
+ Why/Why Not
+
+
+ InterestedIn
+
+ (Consumer)
+ Moderate control
+ You choose what to engage with, BUT:
+
• The algorithm chooses what to show you first
• 100-day half-life means old engagement lingers
• Multiplicative scoring creates drift you didn't choose
+
+
+ Producer Embeddings
+
+ (Producer)
+ Low control
+ You DON'T choose who engages with you:
+
• One viral tweet can completely reshape your producer profile
• You can't control which audience discovers you
• Weighted by audience size (50k engagement > 1k engagement)
+
+
+
+
+
+
+ The Technical Details
+
+ The 100-Day Half-Life Formula
+
+
+
+ weight = 2^(-days_elapsed / 100)
+
+Examples:
+Day 0: weight = 2^(-0 / 100) = 1.0 (100%)
+Day 100: weight = 2^(-100 / 100) = 0.5 (50%)
+Day 200: weight = 2^(-200 / 100) = 0.25 (25%)
+Day 300: weight = 2^(-300 / 100) = 0.125 (12.5%)
+Day 400: weight = 2^(-400 / 100) = 0.0625 (6.25%)
+
+Practical meaning:
+• Engagement from 3 months ago: 50% weight
+• Engagement from 6 months ago: 25% weight
+• Engagement from 1 year ago: 6.25% weight
+
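This decay curve is easy to check numerically. A minimal sketch (not Twitter's code; the function name is mine):

```python
def decay_weight(days_elapsed: float, half_life_days: float = 100.0) -> float:
    """Exponential decay with a 100-day half-life, as in the formula above."""
    return 2 ** (-days_elapsed / half_life_days)

# decay_weight(0)   = 1.0    (100%)
# decay_weight(100) = 0.5    (50%)
# decay_weight(300) = 0.125  (12.5%)
```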
+Your profile is a weighted average of months of behavior, not just this week.
+
+InterestedIn Calculation (ONE-TO-MANY)
+
+
+
+ For each cluster C:
+ InterestedIn[C] = Σ (engagement_weight × time_decay × author_KnownFor[C])
+ for all authors you engaged with
+
+Then normalize so all cluster scores sum to 1.0
+
+Example:
+You engaged with 10 AI authors (decay-weighted engagement: 100)
+You engaged with 5 Cooking authors (decay-weighted engagement: 50)
+
+Before normalization:
+  AI: 100
+  Cooking: 50
+
+After normalization (total: 100 + 50 = 150):
+  AI: 100 / 150 = 67%
+  Cooking: 50 / 150 = 33%
+
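A minimal sketch of this one-to-many aggregation and normalization (the engagement numbers are the example's, not real data; the helper name is mine):

```python
def interested_in(decayed_engagement: dict[str, float]) -> dict[str, float]:
    """Normalize decay-weighted engagement per cluster so scores sum to 1.0."""
    total = sum(decayed_engagement.values())
    return {cluster: score / total for cluster, score in decayed_engagement.items()}

profile = interested_in({"AI": 100.0, "Cooking": 50.0})
# profile["AI"] ≈ 0.67, profile["Cooking"] ≈ 0.33
```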
+Your feed becomes 67% AI, 33% Cooking based on your engagement choices.
+
+Producer Embeddings Calculation (MANY-TO-ONE)
+
+
+
+ For each cluster C:
+ ProducerEmbedding[C] = Σ (engagement_weight × time_decay × consumer_InterestedIn[C])
+ for all consumers who engaged with you
+
+Then normalize so all cluster scores sum to 1.0
+
+Example:
+1,000 AI enthusiasts engaged with you (avg InterestedIn: AI 75%)
+50 Cooking enthusiasts engaged with you (avg InterestedIn: Cooking 80%)
+
+Weighted contributions:
+ AI: 1,000 × 0.75 = 750
+ Cooking: 50 × 0.80 = 40
+
+After normalization (750 + 40 = 790):
+ AI: 750 / 790 = 95%
+ Cooking: 40 / 790 = 5%
+
+Your content gets shown to 95% AI audiences, 5% Cooking audiences.
+
+Note: You can't choose this! It's determined by who engaged with you.
+
+The 100-Follower Threshold
+
+
+
+ Followers < 100:
+ • No Producer Embedding calculated
+ • No algorithmic boost beyond your immediate followers
+ • Your content is essentially invisible to the recommendation system
+
+Followers ≥ 100:
+ • Producer Embedding calculated weekly
+ • Your content enters the recommendation pipeline
+ • Algorithm can show your tweets to users who don't follow you
+
+Practical impact:
+At 99 followers: Only your 99 followers might see your tweets
+At 101 followers: Potentially millions could see your tweets (if well-matched)
+
+The 0.072 Threshold (Death of a Cluster)
+
+
+
+ Week 0: AI 60%, Cooking 40% (balanced start)
+Week 12: AI 70%, Cooking 30% (drifting)
+Week 24: AI 76%, Cooking 24% (minority struggling)
+Week 40: AI 85%, Cooking 15% (barely visible)
+Week 60: AI 93%, Cooking 7% (below threshold!)
+Week 61: AI 100%, Cooking 0% (Cooking filtered out permanently)
+
+Result: Complete monopolarization from a balanced starting point.
+
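A toy simulation of this trajectory. The 0.072 cutoff comes from InterestedInParams.scala; the 5% weekly drift rate is purely illustrative:

```python
THRESHOLD = 0.072  # InterestedInParams.scala:63

def simulate_weeks(major: float, minor: float, weeks: int,
                   weekly_gain: float = 1.05) -> float:
    """Minority cluster's share after compounding weekly drift; 0.0 once filtered."""
    for _ in range(weeks):
        major *= weekly_gain          # feedback loop favors the majority cluster
        total = major + minor
        major, minor = major / total, minor / total
        if minor < THRESHOLD:
            return 0.0                # below cutoff: filtered out, cannot recover
    return minor

# simulate_weeks(0.6, 0.4, 60) → 0.0: a balanced start still ends in monopolarization
```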
+The 0.072 threshold creates a "death spiral"—once a cluster falls below it,
+you stop seeing that content, so you can't engage with it, so it can never
+recover. Permanent filter bubble lock-in.
+
+Why Divergence Happens: The Viral Tweet Trap
+
+
+
+ Week 0: Your Producer Embedding
+ AI/Tech: 75% (your core audience, 1,000 followers)
+ Cooking: 25% (secondary interest, 300 followers)
+
+Week 1: You post one politics joke (human moment, exploring)
+ Goes viral: 50,000 politics enthusiasts engage
+
+New calculation:
+ Old engagement: 1,000 × 0.75 (AI) = 750
+ 300 × 0.25 (Cooking) = 75
+ New engagement: 50,000 × 0.80 (Politics) = 40,000
+
+After normalization (750 + 75 + 40,000 = 40,825):
+ AI: 750 / 40,825 = 1.8%
+ Cooking: 75 / 40,825 = 0.2%
+ Politics: 40,000 / 40,825 = 98%
+
+Your Producer Embedding is now 98% Politics.
+
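The renormalization arithmetic above, runnable (audience sizes and InterestedIn averages are the example's; the helper name is mine):

```python
def producer_embedding(contributions: dict[str, float]) -> dict[str, float]:
    """Normalize engagement-weighted audience contributions per cluster."""
    total = sum(contributions.values())
    return {cluster: round(weight / total, 3) for cluster, weight in contributions.items()}

before = producer_embedding({"AI": 750, "Cooking": 75})
after_viral = producer_embedding({"AI": 750, "Cooking": 75, "Politics": 40_000})
# before: AI 0.909, Cooking 0.091
# after_viral: Politics 0.98 — one viral tweet dominates the profile
```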
+Result: When you post AI/Tech content (your passion), algorithm shows it to
+Politics audiences who don't care. Engagement collapses. You're trapped.
+
+Recovery Times
+
+
+
+
+
+
+
+
+
+ Scenario
+ Timeline
+ Strategy
+
+
+ Shift InterestedIn consumer profile
+ 8-12 weeks
+ Stop engaging with dominant cluster entirely. Over-engage with target cluster (40+ interactions/day).
+
+
+ Shift Producer Embeddings producer profile
+ 12-16+ weeks
+ Consistently post target content. Manually engage target audience. Accept low reach during transition.
+
+
+ Recover from viral misalignment
+ 16-24 weeks
+ Wait for viral engagement to decay (100-day half-life). Sustain core audience engagement. Most don't have patience.
+
+
+
+ Recover from threshold death (<0.072)
+ Impossible algorithmically
+ Must manually rebuild: unfollow dominant cluster, follow target cluster accounts, use "Following" tab.
+ Code References
+
+ InterestedInFromKnownFor.scala:26-30
+ ProducerEmbeddingsFromInterestedIn.scala:41-54
+ InterestedInFromKnownFor.scala:88 - val halfLifeInDaysForFavScore = 100
+ ProducerEmbeddingsFromInterestedIn.scala:47 - filters for numFollowers >= minNumFollowers where minNumFollowers = 100
+ InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)
+ UpdateKnownFor20M145K2020.scala:46 - batchIncrement: Duration = Days(7)
+ SimClustersEmbedding.scala:59-72
+ InterestedInParams.scala:63 - default = 0.072
+
+ The Bottom Line
+
+
+
+
+ Cluster Explorer: Your Algorithmic Communities
+
+
+
+
+
+ What Are Clusters?
+
+ How Clusters Are Discovered
+
+
+
+
+
+
+ The Shape of Cluster Assignment
+
+
+
+
+
+
+
+ Experience Your Cluster Assignment
+
+
+
+
+
+
+
+ The Technical Details
+
+ How Cluster Assignment Actually Works
+
+
+
+ InterestedIn[you] = EngagementGraph[you, producers] × KnownFor[producers, clusters]
+
+Where:
+- EngagementGraph: Your follows + engagement history (100-day half-life)
+- KnownFor: Each producer's primary cluster assignment
+- Result: Your score for each of the ~145,000 clusters
+- Final step: normalization (scores sum to 1.0)
+
+Concrete Example
+
+
+
+ Follows:
+- @ylecun (KnownFor: AI cluster)
+- @karpathy (KnownFor: AI cluster)
+- @gordonramsay (KnownFor: Cooking cluster)
+
+Engagement (likes, weighted):
+- AI tweets: 50 engagements
+- Cooking tweets: 30 engagements
+
+Matrix multiplication:
+AI cluster score = (2 follows × follow_weight) + (50 engagements × engagement_weight)
+Cooking cluster score = (1 follow × follow_weight) + (30 engagements × engagement_weight)
+
+If follow_weight = 1.0 and engagement_weight = 5.0 (engagement dominates):
+
+AI: (2 × 1.0) + (50 × 5.0) = 2 + 250 = 252
+Cooking: (1 × 1.0) + (30 × 5.0) = 1 + 150 = 151
+
+Normalization (divide by sum to get percentages):
+Sum = 252 + 151 = 403
+
+AI: 252 / 403 = 0.625 (62.5%)
+Cooking: 151 / 403 = 0.375 (37.5%)
+
+Result: You're assigned 62.5% AI, 37.5% Cooking
+
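The matrix arithmetic above can be reproduced in a few lines (follow_weight and engagement_weight are the example's illustrative values, not constants from the repo):

```python
FOLLOW_WEIGHT, ENGAGEMENT_WEIGHT = 1.0, 5.0  # illustrative only

def cluster_scores(follows: dict[str, int], engagements: dict[str, int]) -> dict[str, float]:
    """Combine follow and engagement signals per cluster, then normalize to shares."""
    raw = {c: follows.get(c, 0) * FOLLOW_WEIGHT + engagements.get(c, 0) * ENGAGEMENT_WEIGHT
           for c in set(follows) | set(engagements)}
    total = sum(raw.values())
    return {c: score / total for c, score in raw.items()}

shares = cluster_scores({"AI": 2, "Cooking": 1}, {"AI": 50, "Cooking": 30})
# shares["AI"] = 252/403 ≈ 0.625, shares["Cooking"] = 151/403 ≈ 0.375
```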
+Notice: Engagement dominated! Even though you followed 2:1 AI:Cooking,
+your 50:30 engagement ratio (1.67:1) created a 62.5:37.5 final ratio (also 1.67:1).
+The engagement weight (5.0) completely overwhelmed the follow weight (1.0).
+
+The 100-Day Half-Life Formula
+
+
+
+ weight(t) = initial_weight × (0.5)^(days_ago / 100)
+
+Examples:
+- Today (t=0): weight = 1.0 × (0.5)^(0/100) = 1.0 (100%)
+- 50 days ago: weight = 1.0 × (0.5)^(50/100) = 0.707 (70.7%)
+- 100 days ago: weight = 1.0 × (0.5)^(100/100) = 0.5 (50% - HALF-LIFE)
+- 200 days ago: weight = 1.0 × (0.5)^(200/100) = 0.25 (25%)
+- 300 days ago: weight = 1.0 × (0.5)^(300/100) = 0.125 (12.5%)
+
+Update Frequencies
+
+
+
+
+
+
+
+
+ Cluster Discovery Process
+
+
+
+
+
+ similarity(Producer_A, Producer_B) = (shared_followers) / √(followers_A × followers_B)
+ Constraint: Each producer → exactly ONE cluster (maximally sparse)
+Optimization: Metropolis-Hastings sampling to find best assignments
+Initialization: Start from previous week's assignments (incremental stability)
+
+Why 145,000 Clusters?
+
+
+
+
+ Code References
+
+
+ UpdateKnownForSBFRunner.scala
+ UpdateKnownFor20M145K2020.scala
+ Louvain clustering exists in LouvainClusteringMethod.scala but is used for TWICE (alternative multi-embeddings), NOT for main KnownFor cluster formation
+ InterestedInFromKnownFor.scala:292
+ favScoreHalfLife100Days (used throughout SimClusters codebase)
+ SimClustersEmbedding.scala:59-72
+ Twitter's Recommendation Algorithm Blog Post (March 2023)
+
+Key Implications
+
+
+
+
+
+
+
+
+
+ Engagement Weight Calculator
+
+
+
+
+
+ What Are Engagement Weights?
+
+
+
+
+ The Core Insight: Conversation Over Consumption
+
+
+
+ Conversation (active):
+- Reply with author engagement: 75.0 (most valuable)
+- Reply: 13.5
+- Good profile click: 12.0
+
+Passive consumption:
+- Favorite (like): 0.5
+- Retweet: 1.0
+- Video playback 50%: 0.005 (nearly worthless)
+
+
+ The Shape of the Weights
+
+
+
+
+
+
+
+ Experience How Engagement Weights Work
+
+ The 15 Engagement Weights
+
+
+ FSBoundedParam values (configurable parameters with default = 0.0 and range ±10,000)
+ HomeGlobalParams.scala:788-930
+ Weight values source: the-algorithm-ml/projects/home/recap
+ Tweet Score Calculator
+ How Weights Train Your Feed
+
+
+
+
+
+ → Calculate how your engagement shapes your clusters
+ → See how your feed drifts over 6 months
+ InterestedInFromKnownFor.scala:292
+
+
+
+
+
+ The Technical Details
+
+ The Scoring Formula
+
+
+
+ tweet_score = Σ(P(engagement_i) × weight_i) + epsilon
+
+where:
+- P(engagement_i) = Heavy Ranker's predicted probability (0.0 to 1.0)
+- weight_i = configured weight from the table above
+- epsilon = 0.001 (small constant to avoid zero scores)
+- i ranges over all 15 engagement types
+
+Concrete Example
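The worked example that follows can be reproduced in a few lines (weights from the March 2023 table; the predicted probabilities are illustrative, and the function name is mine):

```python
EPSILON = 0.001  # small constant to avoid zero scores

def tweet_score(predictions: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of predicted engagement probabilities, plus epsilon."""
    return sum(p * weights[name] for name, p in predictions.items()) + EPSILON

score = tweet_score(
    {"favorite": 0.20, "reply": 0.05, "reply_engaged_by_author": 0.01,
     "retweet": 0.10, "good_click": 0.15, "negative_feedback": 0.02},
    {"favorite": 0.5, "reply": 13.5, "reply_engaged_by_author": 75.0,
     "retweet": 1.0, "good_click": 11.0, "negative_feedback": -74.0},
)
# round(score, 3) == 1.796
```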
+
+
+
+ Heavy Ranker predictions:
+- P(favorite) = 0.20 (20% chance user will like)
+- P(reply) = 0.05 (5% chance user will reply)
+- P(reply_with_author_engagement) = 0.01 (1% chance of reply-back)
+- P(retweet) = 0.10 (10% chance of retweet)
+- P(good_click) = 0.15 (15% chance of meaningful click)
+- P(negative_feedback) = 0.02 (2% chance of "not interested")
+
+Weighted contributions (using March 2023 weights):
+- Favorite: 0.20 × 0.5 = 0.10
+- Reply: 0.05 × 13.5 = 0.675
+- Reply w/ author: 0.01 × 75.0 = 0.75
+- Retweet: 0.10 × 1.0 = 0.10
+- Good click: 0.15 × 11.0 = 1.65
+- Negative: 0.02 × -74.0 = -1.48
+- Epsilon: 0.001
+
+Total score = 0.10 + 0.675 + 0.75 + 0.10 + 1.65 - 1.48 + 0.001 = 1.796
+
+Why These Specific Weights?
+
+
+ "Each engagement has a different average probability, the weights were originally set so that, on average, each weighted engagement probability contributes a near-equal amount to the score. Since then, we have periodically adjusted the weights to optimize for platform metrics."
+
+
+
+
+
+ Configurability: All Weights Are Tunable
+
+ All engagement weights are defined as FSBoundedParam values, meaning X can adjust them without deploying new code:
+
+ // From HomeGlobalParams.scala:788-930
+object ReplyEngagedByAuthorParam extends FSBoundedParam[Double](
+ name = "home_mixer_model_weight_reply_engaged_by_author",
+ default = 0.0, // Not actual value - just placeholder
+ min = -10000.0, // Can be negative (penalty)
+ max = 10000.0 // Can be very positive (amplification)
+)
+
+
+
+
+
+ The Complete Weight Table (March 2023)
+
+
+
+
+
+
+
+
+
+ Engagement Type
+ Weight
+ Relative to Favorite
+ Category
+
+
+ Reply Engaged by Author
+ 75.0
+ 150x
+ 🏆 Conversation
+
+
+ Reply
+ 13.5
+ 27x
+ 💬 Conversation
+
+
+ Good Profile Click
+ 12.0
+ 24x
+ 🔍 Deep Engagement
+
+
+ Good Click V1
+ 11.0
+ 22x
+ 🔍 Deep Engagement
+
+
+ Good Click V2
+ 10.0
+ 20x
+ 🔍 Deep Engagement
+
+
+ Retweet
+ 1.0
+ 2x
+ 🔄 Sharing
+
+
+ Favorite (Like)
+ 0.5
+ 1x (baseline)
+ ❤️ Passive
+
+
+ Video Playback 50%
+ 0.005
+ 0.01x
+ 📹 Passive
+
+
+ Negative Feedback V2
+ -74.0
+ -148x
+ ❌ Negative
+
+
+
+ Report
+ -369.0
+ -738x
+ ☠️ Nuclear
+ The Nuclear Penalties
+
+ Negative Feedback V2: -74.0
+
+
+
+ Day 0: 0.2x multiplier (80% penalty - nearly invisible)
+Day 70: 0.6x multiplier (40% penalty - recovering)
+Day 140: 1.0x multiplier (penalty expires)
+
+Report: -369.0
+
+
+
+ 738 favorites (at 0.5 weight each), OR
+28 replies (at 13.5 weight each), OR
+5 reply-with-author-engagements (at 75.0 weight each)
+
+Code References
+
+
+ PredictedScoreFeature.scala:62-336
+ HomeGlobalParams.scala:788-930
+ NaviModelScorer.scala:139-178
+ External repo: the-algorithm-ml/projects/home/recap/README.md
+ InterestedInFromKnownFor.scala:292
+
+What The Algorithm Doesn't Know
+
+
+
+
+ Key Implications
+
+
+
+
+
+
+
+
+
+ The Invisible Filter
+ What Are Cluster Filters?
+
+ The Core Mechanism
+
+ Why This Matters
+
+
+
+
+ The Shape of the Behavior
+
+
+
+ AI tweet (base quality: 0.85):
+ Your score: 0.85 × 0.60 = 0.51
+
+Politics tweet (same base quality: 0.85):
+ Your score: 0.85 × 0.15 = 0.128
+
+Same quality, 4× score difference just from clusters!
+
+
+
+
+
+
+ Experience The Filter
+
+ Configure Your Profiles
+
+ 👤 You
+ 👥 Your Friend
+
+
+
+
+
+
+
+ The Technical Details
+
+ The Scoring Formula
+
+
+
+ final_score = base_quality_score × your_cluster_interest
+
+Where:
+- base_quality_score = tweet's inherent quality (0.0 to 1.0)
+- your_cluster_interest = your interest in the tweet's cluster (0.0 to 1.0)
+
+Example:
+Tweet belongs to AI/Tech cluster (cluster_id: 12345)
+Base quality: 0.85
+Your AI/Tech interest: 0.60
+Your friend's AI/Tech interest: 0.15
+
+Your score: 0.85 × 0.60 = 0.51
+Friend's score: 0.85 × 0.15 = 0.128
+
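The two-factor multiplication above as code (base quality and interest values are the example's; the helper name is mine):

```python
def cluster_adjusted_score(base_quality: float, cluster_interest: float) -> float:
    """Multiply a tweet's quality by the viewer's interest in its cluster."""
    return base_quality * cluster_interest

mine = cluster_adjusted_score(0.85, 0.60)     # 0.51
friends = cluster_adjusted_score(0.85, 0.15)  # 0.1275
# mine / friends == 4.0 — same tweet, 4x gap purely from cluster interest
```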
+Same tweet, 4× score difference!
+
+How Clusters Affect the Full Pipeline
+
+ Stage 1: Candidate Generation
+
+
+ Your clusters: 60% AI, 25% Cooking, 15% Politics
+
+Candidate fetching:
+- Fetch 800 tweets from AI cluster
+- Fetch 800 tweets from Cooking cluster
+- Fetch 800 tweets from Politics cluster
+
+Initial bias: More AI tweets fetched simply because you're 60% AI!
+
+Stage 2: Cluster Scoring (What This Simulator Shows)
+
+
+ AI tweet (base quality: 0.85):
+- Your score: 0.85 × 0.60 = 0.51
+- Friend's score: 0.85 × 0.15 = 0.128
+
+Politics tweet (base quality: 0.85):
+- Your score: 0.85 × 0.15 = 0.128
+- Friend's score: 0.85 × 0.80 = 0.68
+
+Same quality tweet, 5.3× score difference!
+
+Stage 3: Engagement Scoring
+ The Compound Effect
+
+
+
+
+
+ Code References
+ InterestedInFromKnownFor.scala:26-30
+ ApproximateCosineSimilarity.scala:84-94
+ SimClustersEmbedding.scala:59-72
+
+Your Feed Journey Simulator
+ What Is The Gravitational Pull Effect?
+
+ The Core Mechanism
+
+
+
+ Multiplicative scoring (what actually happens):
+AI tweet: base_score × 0.60 = higher score
+Cooking tweet: base_score × 0.40 = lower score
+
+Result: You see more AI → engage more with AI → AI interest increases → cycle repeats
+
+Why This Matters
+
+
+
+
+ The Shape of the Drift
+
+
+
+
+
+
+
+
+ Simulate Your Own Journey
+
+ Your Starting Point
+
+
+
+
+
+
+
+ The Technical Details
+
+ How This Simulator Works
+
+
+
+ Simplifications in This Model
+
+
+
+ Technical Details
+ ApproximateCosineSimilarity.scala:94
+ InterestedInFromKnownFor.scala:26-30
+ InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)
+ SimClustersEmbedding.scala:59-72
+
+Phoenix: The Behavioral Prediction System
+ From Averages to Sequences
+
+ The Core Difference
+
+ Current System: NaviModelScorer
+
+
+ User features: {
+ avg_likes_per_day: 10.5
+ avg_replies_per_day: 2.3
+ favorite_topics: [tech, sports]
+ follower_count: 342
+ engagement_rate: 0.15
+ ... (many aggregated features)
+}
+
+Phoenix System
+
+
+ Action sequence: [
+ CLICK(tech_tweet_1)
+ READ(tech_tweet_1)
+ LIKE(tech_tweet_1)
+ CLICK(tech_tweet_2)
+ EXPAND_IMAGE(tech_tweet_2)
+ CLICK(tech_tweet_3)
+ ... (up to 522 aggregated actions)
+]
+
+The LLM Analogy (Inferred from Architecture)
+
+
+
+
+ user_history_transformer - external service (dependency in BUILD.bazel:20, gRPC client in PhoenixUtils.scala:26)
+
+
+
+
+
+
+
+ Aspect
+ ChatGPT / Claude
+ Phoenix
+
+
+ Architecture
+ Transformer (attention-based)
+ Transformer (attention-based)
+
+
+ Input
+ Sequence of tokens (words)
+ Sequence of actions (likes, clicks, replies)
+
+
+ Context Window
+ 8K-200K tokens
+ 522 aggregated actions (hours to days of behavior)
+
+
+ Prediction Task
+ "What word comes next?"
+ "What action comes next?"
+
+
+ Output
+ Probability distribution over vocabulary
+ Probability distribution over 13 action types
+
+
+ Training Objective
+ Predict next token from context
+ Predict next action from behavioral context
+
+
+
+ What It Learns
+ Language patterns, grammar, context
+ Behavioral patterns, engagement momentum, intent
+ What Phoenix Could Capture (Inference from Sequence Modeling)
+
+ 1. Session-Level Interest
+
+
+ Scenario: User interested in both Tech and Sports (50/50 split)
+
+Navi prediction: 50% tech, 50% sports (always the same)
+
+Phoenix prediction:
+ Monday morning: [TECH] [TECH] [TECH] [TECH] → 85% tech, 15% sports
+ Monday evening: [SPORTS] [SPORTS] [SPORTS] → 10% tech, 90% sports
+
+Same user, different behavioral context → different predictions
+
+2. Behavioral Momentum
+
+
+ Engagement Streak:
+[LIKE] [REPLY] [LIKE] [LIKE] [CLICK] [LIKE] → High engagement mode
+Phoenix: Next tweet gets 75% engagement probability
+
+Passive Browsing:
+[SCROLL] [SCROLL] [CLICK] [SCROLL] → Low engagement mode
+Phoenix: Next tweet gets 15% engagement probability
+
+Same user, different momentum → different feed composition
+
+3. Context Switches
+
+
+ Context Switch Detection:
+[NEWS] [NEWS] [NEWS] → [MEME] [MEME] → Context switch!
+
+Phoenix recognizes: User shifted from serious content to entertainment
+Adapts feed: More memes, less news (for this session)
+
+4. Intent Signals
+
+ Behavioral Pattern: Profile Click + Follow
+[CLICK_TWEET] → [CLICK_PROFILE] → [FOLLOW] → Next tweet from that author
+
+Phoenix learns: Profile click + follow = strong interest signal
+Result: Boost similar authors immediately
+
+Why This Could Change Everything
+
+ What Gets Deleted
+
+
+
+
+
+
+
+
+
+ Experience Behavioral Prediction (Simulation)
+
+
+
+
+
+
+
+
+ The Technical Architecture
+
+ Two-Stage Pipeline
+
+
+
+ Stage 1: PhoenixScorer (Prediction via gRPC)
+ Input: User action sequence (up to 1024 actions) + candidate tweets
+ Process: Transformer model predicts engagement probabilities
+ Output: 13 predicted probabilities per tweet
+
+Stage 2: PhoenixModelRerankingScorer (Aggregation)
+ Input: 13 predicted probabilities from Stage 1
+ Process: Per-head normalization + weighted aggregation
+ Output: Final Phoenix score for ranking
+
+Stage 1: Behavioral Sequence Prediction
+
+ PhoenixScorer.scala:30-85
+
+ User action sequence (522 aggregated actions spanning hours to days):
+[
+ Session 1: FAV(tweet_123, author_A) + CLICK(tweet_123, author_A),
+ Session 2: CLICK(tweet_456, author_B),
+ Session 3: REPLY(tweet_789, author_C) + FAV(tweet_790, author_C),
+ Session 4: FAV(tweet_234, author_A),
+ ...
+ Session 522: CLICK(tweet_999, author_D)
+]
+
+(Actions grouped into sessions using 5-minute proximity windows)
+
+Candidate tweets: [tweet_X, tweet_Y, tweet_Z]
+
+user_history_transformer (dependency in BUILD.bazel:20, client interface RecsysPredictorGrpc in PhoenixUtils.scala:26, usage in PhoenixUtils.scala:110-135)
+ Note: The actual service implementation is not in the open-source repository.
+ Inferred: The internal architecture likely follows transformer patterns based on the service name and sequence-to-sequence design:
+
+ Inferred Transformer Architecture:
+ 1. Embed each action in the sequence (action type + tweet metadata)
+ 2. Apply self-attention to identify relevant behavioral patterns
+ 3. For each candidate tweet, compute relevance to behavioral context
+ 4. Output 13 engagement probabilities via softmax
+
+Verified Output Format (log probabilities):
+{
+ "tweet_X": {
+ "SERVER_TWEET_FAV": {"log_prob": -0.868, "prob": 0.42},
+ "SERVER_TWEET_REPLY": {"log_prob": -2.526, "prob": 0.08},
+ "SERVER_TWEET_RETWEET": {"log_prob": -2.996, "prob": 0.05},
+ "CLIENT_TWEET_CLICK": {"log_prob": -1.273, "prob": 0.28},
+ ... (9 more engagement types)
+ },
+ ...
+}
+
+
+ Stage 2: Per-Head Normalization and Aggregation
+
+ PhoenixModelRerankingScorer.scala:23-81
+
+ 3 candidates, 3 engagement types:
+Candidate A: [FAV: 0.42, REPLY: 0.08, CLICK: 0.28]
+Candidate B: [FAV: 0.15, REPLY: 0.35, CLICK: 0.20]
+Candidate C: [FAV: 0.30, REPLY: 0.12, CLICK: 0.25]
+
+Per-head max:
+ Max FAV: 0.42
+ Max REPLY: 0.35
+ Max CLICK: 0.28
+
+Attach max to each candidate for normalized comparison:
+Candidate A: [(0.42, max:0.42), (0.08, max:0.35), (0.28, max:0.28)]
+Candidate B: [(0.15, max:0.42), (0.35, max:0.35), (0.20, max:0.28)]
+Candidate C: [(0.30, max:0.42), (0.12, max:0.35), (0.25, max:0.28)]
+
+HomeGlobalParams.scala:786-1028
+ Actual values: the-algorithm-ml/projects/home/recap
+
+ Weights (configured in production):
+ FAV: 0.5
+ REPLY: 13.5
+ REPLY_ENGAGED_BY_AUTHOR: 75.0
+ RETWEET: 1.0
+ GOOD_CLICK: 12.0
+ ... (8 more positive weights)
+ NEGATIVE_FEEDBACK: -74.0
+ REPORT: -369.0
+
+Final Score = Σ (prediction_i × weight_i)
+
+Example for Candidate A:
+ FAV: 0.42 × 0.5 = 0.21
+ REPLY: 0.08 × 13.5 = 1.08
+ CLICK: 0.28 × 12.0 = 3.36
+ ... (sum all 13 engagement types)
+
+ Phoenix Score = 0.21 + 1.08 + 3.36 + ... = 8.42
+
+The 13 Engagement Types
+
+ PhoenixPredictedScoreFeature.scala:30-193
+
+
+
+
+
+
+
+ Engagement Type
+ Action
+ Weight
+
+
+ FAV
+ Like/favorite tweet
+ 0.5
+
+
+ REPLY
+ Reply to tweet
+ 13.5
+
+
+ REPLY_ENGAGED_BY_AUTHOR
+ Reply + author engages back
+ 75.0
+
+
+ RETWEET
+ Retweet or quote
+ 1.0
+
+
+ GOOD_CLICK
+ Click + dwell (quality engagement)
+ 12.0
+
+
+ PROFILE_CLICK
+ Click author profile
+ 3.0
+
+
+ VIDEO_QUALITY_VIEW
+ Watch video ≥10 seconds
+ 8.0
+
+
+ ... (6 more)
+ Share, bookmark, open link, etc.
+ 0.2 - 11.0
+
+
+ NEGATIVE_FEEDBACK
+ Not interested, block, mute
+ -74.0
+
+
+
+ REPORT
+ Report tweet
+ -369.0
+ Context Window: 522 Aggregated Actions (Hours to Days)
+
+ UserActionsQueryFeatureHydrator.scala:56-149
+ Max count parameter: HomeGlobalParams.scala:1373-1379
+
+ Configuration:
+// Aggregation window (for grouping, NOT filtering)
+private val windowTimeMs = 5 * 60 * 1000 // Groups actions within 5-min proximity
+private val maxLength = 1024 // Max AFTER aggregation
+
+// Actual default used
+object UserActionsMaxCount extends FSBoundedParam[Int](
+ name = "home_mixer_user_actions_max_count",
+ default = 522, // ← Actual default
+ min = 0,
+ max = 10000 // ← Configurable up to 10K
+)
+
+Processing flow:
+1. Fetch user's full action history from storage (days/weeks)
+2. Decompress → 2000+ raw actions
+3. Aggregate using 5-min proximity window (session detection)
+ → Actions within 5-min windows grouped together
+4. Cap at 522 actions (default)
+
+Result: 522 aggregated actions spanning HOURS TO DAYS, not 5 minutes!
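A sketch of this proximity-window grouping (a hypothetical reconstruction for illustration; the real aggregation in UserActionsQueryFeatureHydrator.scala is more involved):

```python
WINDOW_MS = 5 * 60 * 1000   # 5-minute proximity window
MAX_ACTIONS = 522           # default cap AFTER aggregation

def aggregate_actions(actions: list[tuple[int, str]]) -> list[list[str]]:
    """Group (timestamp_ms, action) pairs into sessions, keep the newest entries."""
    sessions: list[list[str]] = []
    last_ts = None
    for ts, action in sorted(actions):
        if last_ts is None or ts - last_ts > WINDOW_MS:
            sessions.append([])          # gap too large: start a new session
        sessions[-1].append(action)
        last_ts = ts
    return sessions[-MAX_ACTIONS:]

sessions = aggregate_actions([(0, "FAV"), (60_000, "CLICK"), (3_600_000, "REPLY")])
# → two sessions: ["FAV", "CLICK"] and ["REPLY"]
```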
+
+
+
+
+ Active user (~100 actions/hour): ~5 hours of behavioral history
+Normal user (~30 actions/hour): ~17 hours of behavioral history
+Light user (~10 actions/hour): ~52 hours (2+ days) of behavioral history
+
+Maximum (10,000 actions): Could span WEEKS for light users
+
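The span estimates above are straightforward division (the activity rates are the doc's rough figures, not measured values):

```python
MAX_ACTIONS = 522  # default context window size

def coverage_hours(actions_per_hour: float, window: int = MAX_ACTIONS) -> float:
    """How many hours of behavior a fixed-size action window spans."""
    return window / actions_per_hour

# coverage_hours(100) ≈ 5 hours; coverage_hours(10) ≈ 52 hours (2+ days)
```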
+ GPT-3: 2048 tokens (~1500 words, ~3-4 pages of text)
+GPT-4: 8K-32K tokens (~6K-24K words)
+Phoenix: 522 aggregated actions (~hours to days of behavior)
+
+Phoenix vs Navi: Architecture Comparison
+
+
+
+
+
+
+
+
+
+ Aspect
+ NaviModelScorer (Current)
+ Phoenix (Future)
+
+
+ Input Data
+ Many aggregated features
+ Action sequence (522 aggregated actions, spanning hours to days)
+
+
+ Temporal Modeling
+ ❌ Lost in aggregation
+ ✅ Explicit via self-attention
+
+
+ Behavioral Context
+ ⚠️ Via real-time aggregates
+ ✅ Recent actions directly inform predictions
+
+
+ Session Awareness
+ ❌ Same prediction all day
+ ✅ Adapts to current browsing mode
+
+
+ Feature Engineering
+ ❌ Many hand-crafted features
+ ✅ Minimal (actions + metadata)
+
+
+ Manual Tuning
+ ❌ 15+ engagement weights, penalties
+ ✅ Learned patterns (eventually)
+
+
+ Computational Cost
+ ✅ O(n) feature lookup
+ ⚠️ O(n²) transformer attention
+
+
+
+ Update Frequency
+ Daily batch recalculation
+ Real-time, every action
+ Current Status (Verified from Code)
+
+
+
+
+ user_history_transformer - dependency in BUILD.bazel:20, gRPC client RecsysPredictorGrpc in PhoenixUtils.scala:26
+
+
+ Multi-Model Experimentation Infrastructure
+
+ HomeGlobalParams.scala:1441-1451
+ Connection management: PhoenixClientModule.scala:21-61
+ Cluster selection: PhoenixScorer.scala:52-53
+
+ PhoenixCluster enumeration:
+- Prod // Production model
+- Experiment1 // Test variant 1
+- Experiment2 // Test variant 2
+- Experiment3 // Test variant 3
+- Experiment4 // Test variant 4
+- Experiment5 // Test variant 5
+- Experiment6 // Test variant 6
+- Experiment7 // Test variant 7
+- Experiment8 // Test variant 8
+
+What This Enables
+
+ 1. Parallel Model Testing
+
+
+ 2. Per-Request Cluster Selection
+
+ // From PhoenixScorer.scala:52-53
+val phoenixCluster = query.params(PhoenixInferenceClusterParam) // Select cluster
+val channels = channelsMap(phoenixCluster) // Route request
+
+// Default: PhoenixCluster.Prod
+// But can be dynamically set per user via feature flags
+ User Alice (bucket: control) → PhoenixCluster.Prod
+User Bob (bucket: experiment_1) → PhoenixCluster.Experiment1
+User Carol (bucket: experiment_2) → PhoenixCluster.Experiment2
+
+3. Progressive Rollout Strategy
+
+ Week 1: Deploy new model to Experiment1
+ Route 1% of users to Experiment1
+ Other 99% stay on Prod
+ ↓
+Week 2: Compare metrics (engagement, dwell time, follows, etc.)
+ If Experiment1 > Prod: increase to 5%
+ If Experiment1 < Prod: rollback instantly
+ ↓
+Week 3: Gradually increase: 10% → 25% → 50%
+ Monitor metrics at each step
+ ↓
+Week 4: If consistently better, promote Experiment1 → Prod
+ Start testing next variant in Experiment2
+
+The active cluster is chosen by setting the PhoenixInferenceClusterParam value via feature flag dashboard.
+
+4. Parallel Evaluation (All Clusters Queried)
+ ScoredPhoenixCandidatesKafkaSideEffect.scala:85-104
+ // getPredictionResponsesAllClusters queries ALL clusters in parallel
+User request → Candidates [tweet_A, tweet_B, tweet_C]
+ ↓
+Query Prod: tweet_A: {FAV: 0.40, REPLY: 0.10, CLICK: 0.60}
+Query Experiment1: tweet_A: {FAV: 0.45, REPLY: 0.12, CLICK: 0.58}
+Query Experiment2: tweet_A: {FAV: 0.38, REPLY: 0.15, CLICK: 0.62}
+Query Experiment3: tweet_A: {FAV: 0.42, REPLY: 0.11, CLICK: 0.65}
+... (all 9 clusters)
+ ↓
+Log to Kafka: "phoenix.Prod.favorite", "phoenix.Experiment1.favorite", ...
+ ↓
+Offline analysis: Compare predicted vs actual engagement across all models
+
+Hybrid Mode: Mixing Navi and Phoenix Predictions
+
+ HomeGlobalParams.scala:1030-1108
+
+ Hybrid Mode Configuration (per action type):
+- EnableProdFavForPhoenixParam = true // Use Navi for favorites
+- EnableProdReplyForPhoenixParam = true // Use Navi for replies
+- EnableProdGoodClickV2ForPhoenixParam = false // Use Phoenix for clicks
+- EnableProdVQVForPhoenixParam = false // Use Phoenix for video views
+- EnableProdNegForPhoenixParam = true // Use Navi for negative feedback
+... (13 total flags, one per engagement type)
+
+ Phase 1: Enable Phoenix, but use Navi for all predictions
+ (Shadow mode - Phoenix predictions logged but not used)
+ ↓
+Phase 2: Use Phoenix for low-risk actions (photo expand, video view)
+ Keep Navi for high-impact actions (favorite, reply, retweet)
+ ↓
+Phase 3: Gradually enable Phoenix for more action types
+ Monitor metrics after each change
+ ↓
+Phase 4: Full Phoenix mode - all predictions from transformer
+ Navi retired or kept as fallback
+
+What This Reveals
+
+
+
+ Technical Details
+
+ Code References
+
+ PhoenixScorer.scala:30-85
+ PhoenixModelRerankingScorer.scala:23-81
+ UserActionsQueryFeatureHydrator.scala:56-149
+ PhoenixPredictedScoreFeature.scala:30-193
+ PhoenixUtils.scala:26-159
+ RerankerUtil.scala:38-71
+ RerankerUtil.scala:91-137
+ HomeGlobalParams.scala:786-1028
+ the-algorithm-ml/projects/home/recap
+ HomeGlobalParams.scala:1373-1379
+
+ The Bottom Line
+
+ Evidence of Active Deployment
+
+
+
+ What This Means
+
+
+
+
+ If/When Phoenix Becomes Default
+
+ Current Reality
+
+
+
+ false, but can be enabled per-user
+
+ The Full Pipeline Explorer
+ What Is The Recommendation Pipeline?
+
+ The Five Stages
+
+
+
+ Stage 1: Candidate Generation (~1B → ~1,400)
+ Fetch potential tweets from various sources
+
+Stage 2: Feature Hydration (~1,400 tweets)
+ Attach ~6,000 features to each tweet
+
+Stage 3: Heavy Ranker ML Scoring (~1,400 tweets)
+ Predict 15 engagement types, calculate weighted scores
+
+Stage 4: Filters & Penalties (~1,400 → ~100-200)
+ Apply multipliers, diversity penalties, safety filters
+
+Stage 5: Mixing & Serving (~100-200 → 50-100)
+ Insert ads, modules, deliver final timeline
+
+Why This Matters
+
+
+
+
+
+
+
+
+
+
+
+ Follow a Tweet Through The Pipeline
+
+ Configure the Tweet
+ 🔥 Viral Educational Thread
+ In-Network
+ 🌐 Out-of-Network Quality
+ Out-of-Network
+ ⚡ Controversial Take
+ In-Network
+ 📝 3rd Tweet from Same Author
+ In-Network
+
+
+
+
+
+
+
+ The Technical Details
+
+ Stage 1: Candidate Generation
+
+
+
+ Stage 2: Feature Hydration
+
+
+
+ Stage 3: Heavy Ranker (ML Scoring)
+
+
+ score = Σ (probability_i × weight_i)
+
+Top weights:
+- Reply with Author Engagement: 75.0
+- Reply: 13.5
+- Good Profile Click: 12.0
+- Retweet: 1.0
+- Favorite: 0.5
+
+Negative weights:
+- Negative Feedback: -74.0
+- Report: -369.0
+
+Stage 4: Filters & Penalties
+
+
+
+ Stage 5: Mixing & Serving
+
+
+ Code References
+ ForYouScoredTweetsCandidatePipelineConfig.scala
+ HomeGlobalParams.scala:786-1028
+ NaviModelScorer.scala:139-178
+ RescoringFactorProvider.scala:45-57
+ AuthorBasedListwiseRescoringProvider.scala:54
+ ApproximateCosineSimilarity.scala:84-94
+
+The Reinforcement Loop Machine
+ What Is The Reinforcement Loop?
+
+ The Six Steps of the Loop
+
+
+
+ 1. Your Profile: 60% AI, 40% Cooking
+
+2. Fetch Candidates: Algorithm fetches more AI tweets (60%) than Cooking (40%)
+
+3. Score Tweets: AI tweets score higher (0.9 × 0.60 vs 0.9 × 0.40)
+
+4. Build Feed: You see 60% AI, 40% Cooking (matches your profile)
+
+5. You Engage: You engage with what you see (60% AI, 40% Cooking)
+
+6. Update Profile: Next week, your profile becomes 62% AI, 38% Cooking
+
+→ Return to Step 1 with NEW profile (loop repeats)
+
+Why This Matters
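Iterating this loop shows why small weekly shifts matter: they compound. A toy step function using the simulator's drift formula (these constants are the simulator's, not production values):

```python
def drift_step(ai: float, drift_rate: float = 0.015) -> float:
    """One weekly update of the AI share under the simulator's drift formula."""
    cooking = 1.0 - ai
    advantage = ai / cooking                  # e.g. 0.60 / 0.40 = 1.5
    slowdown = 1.0 - (ai - cooking) * 0.5     # slows near the extremes
    return min(ai + drift_rate * advantage * slowdown, 1.0)

week1 = drift_step(0.60)
# week1 == 0.62025 — a 60/40 profile becomes ~62/38 after one week
```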
+
+
+
+
+ The Shape of the Drift
+
+
+
+
+
+
+
+
+ Step Through The Loop
+
+ Configure Your Starting Profile
+
+
+
+
+
+
+
+
+ The Technical Details
+
+ Why This Loop Creates Drift
+
+ 1. Multiplicative Scoring Amplifies Advantages
+
+
+ AI tweet (quality 0.9): 0.9 × 0.60 = 0.54
+Cooking tweet (quality 0.9): 0.9 × 0.40 = 0.36
+
+50% score advantage for AI despite equal quality!2. You Engage With What You See
+
+
+ Feed composition: 60% AI, 40% Cooking
+Your engagement: 60% AI, 40% Cooking
+
+You didn't change your preferences - you engaged with what was shown!
+
+3. Normalization Creates Zero-Sum Dynamics
+
+
+ Before: 60% AI, 40% Cooking (sum = 100%)
+After: 62% AI, 38% Cooking (sum = 100%)
+
+AI gained 2%, Cooking lost 2% - it's zero-sum!
+
+4. Weekly Batch Updates Lock In Changes
+
+
+ Week 0: 60% AI, 40% Cooking
+Week 1: 62% AI, 38% Cooking ← New baseline
+Week 4: 64% AI, 36% Cooking ← Compounds
+Week 12: 70% AI, 30% Cooking ← Accelerates
+Week 24: 76% AI, 24% Cooking ← Lock-in
+
+5. The Loop Feeds Itself
+
+
+
+
+
+ Mathematical Inevitability
+
+ The Drift Formula
+
+ New_AI_Interest = Old_AI_Interest + (drift_rate × advantage × slowdown)
+
+Where:
+- drift_rate = engagement intensity (0.008 to 0.025)
+- advantage = AI_interest / Cooking_interest (e.g., 0.60 / 0.40 = 1.5)
+- slowdown = 1 - (imbalance × 0.5) (slows as approaching extremes)
+
+Example (Week 0 → Week 1):
+New_AI = 0.60 + (0.015 × 1.5 × 0.9) = 0.60 + 0.02025 ≈ 0.62
+
+The Only Ways to Prevent Drift
+
+
+
+
+
+
+ Code References
+ ApproximateCosineSimilarity.scala:84-94
+ InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)
+ SimClustersEmbedding.scala:59-72
+ InterestedInFromKnownFor.scala:88-95 - Follows who you follow, what you engage with
+
+ Your Tier: ${tier} of 5
+ Estimated Effective Reach
+ Mechanisms Affecting You
+
+ UserMass.scala:41
+ TwhinEmbeddingsStore.scala:48
+ UserMass.scala:54-64
+ RescoringFactorProvider.scala:46-57
+ Observations
';
+ recommendations += '';
+
+ if (!verified && tier <= 3) {
+ recommendations += '
';
+ recommendations += 'What This Means
';
+
+ if (dominant.percent >= 70) {
+ interpretation += `
+
+
+
+
+
+
+ This Week's Events:
';
+ eventsHTML += '';
+
+ data.events.forEach(event => {
+ eventsHTML += `
';
+
+ // Add update badges
+ let badgesHTML = 'Your Current State
+
+
+
+ ■ AI/Tech
+ ${followsPercent.ai}%
+ ${engagementPercent.ai}%
+ ${finalPercent.ai}%
+
+
+ ■ Cooking
+ ${followsPercent.cooking}%
+ ${engagementPercent.cooking}%
+ ${finalPercent.cooking}%
+
+
+ `;
+
+ comparisonBody.innerHTML = html;
+}
+
+/**
+ * Generate warnings
+ */
+function generateWarnings(clusters, fromFollows, fromEngagement) {
+ const warningsHtml = [];
+
+ // Check for threshold danger
+ const sorted = Object.entries(clusters).sort((a, b) => b[1] - a[1]);
+ const weakest = sorted[sorted.length - 1]; // lowest-weighted cluster
+
+ if (weakest[1] < 0.1) {
+ const weakestName = formatClusterName(weakest[0]);
+ const weakestPercent = (weakest[1] * 100).toFixed(1);
+ warningsHtml.push(`
+ ■ Politics
+ ${followsPercent.politics}%
+ ${engagementPercent.politics}%
+ ${finalPercent.politics}%
+ ⚠️ Threshold Danger
+ 💡 Engagement Dominates
+ 📊 Follows Dominate (For Now)
+ Excellent Score - High Amplification
`;
+ } else if (score > 2) {
+ statusHTML = `Good Score - Moderate Amplification
`;
+ } else if (score > 0) {
+ statusHTML = `Positive Score - Limited Amplification
`;
+ } else if (score > -2) {
+ statusHTML = `Slightly Negative - Suppressed
`;
+ } else {
+ statusHTML = `Highly Negative - Heavily Suppressed
`;
+ }
+
+ return `
+ ${statusHTML}
+
+
+ `;
+ })
+ .filter(row => row !== '')
+ .join('');
+
+ // Update total
+ const totalColor = totalScore > 0 ? 'var(--success)' : 'var(--warning)';
+ document.getElementById('breakdown-total').innerHTML = `
+
+ ${totalScore >= 0 ? '+' : ''}${totalScore.toFixed(3)}
+
+ `;
+}
diff --git a/docs/js/invisible-filter.js b/docs/js/invisible-filter.js
new file mode 100644
index 000000000..3a4854142
--- /dev/null
+++ b/docs/js/invisible-filter.js
@@ -0,0 +1,411 @@
+/**
+ * The Invisible Filter - Cluster-based Feed Personalization
+ *
+ * Shows how the same tweets get ranked completely differently
+ * for users with different cluster interests.
+ *
+ * Based on:
+ * - Multiplicative scoring from ApproximateCosineSimilarity.scala:84-94
+ * - InterestedIn cluster assignments from InterestedInFromKnownFor.scala
+ * - L2 normalization (cluster weights sum to 1.0)
+ */
+
+// Tweet dataset with cluster assignments and base quality scores
+const TWEETS = [
+ {
+ id: 1,
+ cluster: 'ai',
+ content: 'New breakthrough in transformer architecture - 10x faster training with same accuracy [technical thread]',
+ author: '@ai_researcher',
+ baseQuality: 0.88
+ },
+ {
+ id: 2,
+ cluster: 'ai',
+ content: 'Just released our open-source ML framework for edge devices. Check it out! [link]',
+ author: '@ml_startup',
+ baseQuality: 0.75
+ },
+ {
+ id: 3,
+ cluster: 'ai',
+ content: 'Fascinating paper on LLM reasoning capabilities. Thread on key findings ↓',
+ author: '@phd_student',
+ baseQuality: 0.82
+ },
+ {
+ id: 4,
+ cluster: 'ai',
+ content: 'Hot take: Most "AI" products are just wrappers around OpenAI API',
+ author: '@tech_critic',
+ baseQuality: 0.65
+ },
+ {
+ id: 5,
+ cluster: 'ai',
+ content: 'Hiring: Senior ML Engineer for our AI safety team. Must have experience with...',
+ author: '@ai_company',
+ baseQuality: 0.55
+ },
+ {
+ id: 6,
+ cluster: 'cooking',
+ content: 'Made the perfect sourdough after 3 years of trying. Here\'s what finally worked [detailed guide]',
+ author: '@bread_master',
+ baseQuality: 0.86
+ },
+ {
+ id: 7,
+ cluster: 'cooking',
+ content: 'PSA: You\'re probably overcooking your pasta. Al dente means "to the tooth" - here\'s the test...',
+ author: '@italian_chef',
+ baseQuality: 0.79
+ },
+ {
+ id: 8,
+ cluster: 'cooking',
+ content: 'Unpopular opinion: Expensive knives are overrated. Here\'s my $30 knife that\'s lasted 10 years',
+ author: '@home_cook',
+ baseQuality: 0.71
+ },
+ {
+ id: 9,
+ cluster: 'cooking',
+ content: 'Just meal prepped for the entire week in 2 hours. Here\'s my system: [photos]',
+ author: '@meal_prep_pro',
+ baseQuality: 0.68
+ },
+ {
+ id: 10,
+ cluster: 'cooking',
+ content: 'The science of umami - why MSG is unfairly demonized (thread)',
+ author: '@food_scientist',
+ baseQuality: 0.77
+ },
+ {
+ id: 11,
+ cluster: 'politics',
+ content: 'BREAKING: Major policy announcement expected this afternoon. Here\'s what we know so far...',
+ author: '@political_reporter',
+ baseQuality: 0.84
+ },
+ {
+ id: 12,
+ cluster: 'politics',
+ content: 'Detailed analysis of yesterday\'s debate performance - fact-checking key claims [long thread]',
+ author: '@policy_analyst',
+ baseQuality: 0.80
+ },
+ {
+ id: 13,
+ cluster: 'politics',
+ content: 'This is exactly what I\'ve been saying for months. Finally someone in power gets it.',
+ author: '@political_commentator',
+ baseQuality: 0.62
+ },
+ {
+ id: 14,
+ cluster: 'politics',
+ content: 'New poll shows surprising shift in voter sentiment. Methodology breakdown in thread ↓',
+ author: '@pollster',
+ baseQuality: 0.76
+ },
+ {
+ id: 15,
+ cluster: 'politics',
+ content: 'Both sides are missing the point on this issue. Here\'s the nuanced take no one wants to hear:',
+ author: '@centrist_voice',
+ baseQuality: 0.70
+ }
+];
+
+// Friend profile presets
+const FRIEND_PROFILES = {
+ 'politics-focused': { ai: 0.15, cooking: 0.05, politics: 0.80 },
+ 'cooking-enthusiast': { ai: 0.20, cooking: 0.75, politics: 0.05 },
+ 'balanced': { ai: 0.33, cooking: 0.33, politics: 0.34 },
+ 'tech-specialist': { ai: 0.90, cooking: 0.02, politics: 0.08 }
+};
+
+// Cluster display names and colors
+const CLUSTER_INFO = {
+ 'ai': { name: 'AI/Tech', color: '#1DA1F2' },
+ 'cooking': { name: 'Cooking', color: '#17bf63' },
+ 'politics': { name: 'Politics', color: '#ff9500' }
+};
+
+// Current profiles
+let userProfile = { ai: 0.60, cooking: 0.25, politics: 0.15 };
+let friendProfile = { ai: 0.15, cooking: 0.05, politics: 0.80 };
+let selectedFriend = 'politics-focused';
+
+// DOM elements
+const userAiSlider = document.getElementById('user-ai');
+const userCookingSlider = document.getElementById('user-cooking');
+const userPoliticsSlider = document.getElementById('user-politics');
+const compareBtn = document.getElementById('compare-btn');
+const comparisonContainer = document.getElementById('comparison-container');
+
+// Initialize
+window.addEventListener('DOMContentLoaded', () => {
+ initializeSliders();
+ initializeFriendSelector();
+ attachEventListeners();
+});
+
+/**
+ * Initialize sliders with normalization
+ */
+function initializeSliders() {
+ // Update displays
+ updateUserDisplays();
+
+ // Attach input handlers with normalization
+ [userAiSlider, userCookingSlider, userPoliticsSlider].forEach(slider => {
+ slider.addEventListener('input', () => {
+ normalizeUserProfile();
+ updateUserDisplays();
+ });
+ });
+}
+
+/**
+ * Normalize user profile to sum to 100%
+ * When one slider changes, adjust others proportionally
+ */
+function normalizeUserProfile() {
+ const ai = parseInt(userAiSlider.value);
+ const cooking = parseInt(userCookingSlider.value);
+ const politics = parseInt(userPoliticsSlider.value);
+ const total = ai + cooking + politics;
+
+ if (total === 0) return; // guard: all sliders at 0 would divide by zero
+
+ if (total !== 100) {
+ // Normalize to 100%
+ const normAi = Math.round((ai / total) * 100);
+ const normCooking = Math.round((cooking / total) * 100);
+ const normPolitics = 100 - normAi - normCooking; // Ensure exact 100%
+
+ userProfile = {
+ ai: normAi / 100,
+ cooking: normCooking / 100,
+ politics: normPolitics / 100
+ };
+ } else {
+ userProfile = {
+ ai: ai / 100,
+ cooking: cooking / 100,
+ politics: politics / 100
+ };
+ }
+}
+
+/**
+ * Update user profile displays
+ */
+function updateUserDisplays() {
+ const aiPercent = Math.round(userProfile.ai * 100);
+ const cookingPercent = Math.round(userProfile.cooking * 100);
+ const politicsPercent = Math.round(userProfile.politics * 100);
+ const total = aiPercent + cookingPercent + politicsPercent;
+
+ document.getElementById('user-ai-display').textContent = `${aiPercent}%`;
+ document.getElementById('user-cooking-display').textContent = `${cookingPercent}%`;
+ document.getElementById('user-politics-display').textContent = `${politicsPercent}%`;
+ document.getElementById('user-total').textContent = `${total}%`;
+
+ // Update slider values
+ userAiSlider.value = aiPercent;
+ userCookingSlider.value = cookingPercent;
+ userPoliticsSlider.value = politicsPercent;
+}
+
+/**
+ * Initialize friend profile selector
+ */
+function initializeFriendSelector() {
+ updateFriendDisplay();
+
+ const friendBtns = document.querySelectorAll('.friend-btn');
+ friendBtns.forEach(btn => {
+ btn.addEventListener('click', () => {
+ // Update active state
+ friendBtns.forEach(b => b.classList.remove('active'));
+ btn.classList.add('active');
+
+ // Load profile
+ const profileKey = btn.dataset.profile;
+ selectedFriend = profileKey;
+ friendProfile = FRIEND_PROFILES[profileKey];
+ updateFriendDisplay();
+ });
+ });
+}
+
+/**
+ * Update friend profile display
+ */
+function updateFriendDisplay() {
+ const aiPercent = Math.round(friendProfile.ai * 100);
+ const cookingPercent = Math.round(friendProfile.cooking * 100);
+ const politicsPercent = Math.round(friendProfile.politics * 100);
+
+ document.getElementById('friend-ai-display').textContent = `${aiPercent}%`;
+ document.getElementById('friend-cooking-display').textContent = `${cookingPercent}%`;
+ document.getElementById('friend-politics-display').textContent = `${politicsPercent}%`;
+}
+
+/**
+ * Attach event listeners
+ */
+function attachEventListeners() {
+ compareBtn.addEventListener('click', () => {
+ generateComparison();
+ comparisonContainer.style.display = 'block';
+ comparisonContainer.scrollIntoView({ behavior: 'smooth', block: 'start' });
+ });
+
+ // View toggle
+ document.getElementById('view-user').addEventListener('click', () => {
+ setActiveView('user');
+ });
+
+ document.getElementById('view-friend').addEventListener('click', () => {
+ setActiveView('friend');
+ });
+}
+
+// Current view state
+let currentView = 'user';
+let scoredData = null;
+
+/**
+ * Generate and display feed comparison
+ */
+function generateComparison() {
+ // Score tweets for each user
+ const userTweets = scoreTweets(userProfile);
+ const friendTweets = scoreTweets(friendProfile);
+
+ // Create rank maps
+ const userRanks = {};
+ const friendRanks = {};
+
+ userTweets.forEach((tweet, index) => {
+ userRanks[tweet.id] = {
+ rank: index + 1,
+ score: tweet.score
+ };
+ });
+
+ friendTweets.forEach((tweet, index) => {
+ friendRanks[tweet.id] = {
+ rank: index + 1,
+ score: tweet.score
+ };
+ });
+
+ // Store for view toggling
+ scoredData = {
+ userTweets,
+ friendTweets,
+ userRanks,
+ friendRanks
+ };
+
+ // Render initial view
+ renderFeed();
+}
+
+/**
+ * Set active view and re-render
+ */
+function setActiveView(view) {
+ currentView = view;
+
+ // Update button states
+ document.getElementById('view-user').classList.toggle('active', view === 'user');
+ document.getElementById('view-friend').classList.toggle('active', view === 'friend');
+
+ // Re-render
+ renderFeed();
+}
+
+/**
+ * Score tweets based on profile
+ * score = base_quality × cluster_interest
+ */
+function scoreTweets(profile) {
+ return TWEETS.map(tweet => {
+ const clusterInterest = profile[tweet.cluster];
+ const score = tweet.baseQuality * clusterInterest;
+
+ return {
+ ...tweet,
+ score: score
+ };
+ }).sort((a, b) => b.score - a.score); // Sort by score descending
+}
+
+/**
+ * Render the feed based on current view
+ */
+function renderFeed() {
+ if (!scoredData) return;
+
+ const container = document.getElementById('tweet-feed');
+ const tweets = currentView === 'user' ? scoredData.userTweets : scoredData.friendTweets;
+ const { userRanks, friendRanks } = scoredData;
+
+ container.innerHTML = tweets.map(tweet => {
+ const userRank = userRanks[tweet.id].rank;
+ const friendRank = friendRanks[tweet.id].rank;
+ const userScore = userRanks[tweet.id].score;
+ const friendScore = friendRanks[tweet.id].score;
+
+ const rankDiff = Math.abs(userRank - friendRank);
+ const clusterInfo = CLUSTER_INFO[tweet.cluster];
+
+ // Highlight big differences
+ const isDifferent = rankDiff >= 5;
+ const diffClass = isDifferent ? 'rank-different' : '';
+
+ return `
+ ${type}
+ ${probability.toFixed(1)}%
+ ${weight.toFixed(1)}
+
+ ${contribution >= 0 ? '+' : ''}${contribution.toFixed(3)}
+
+
+
+ `;
+ })
+ .join('');
+}
+
+/**
+ * Get explanation for what's happening at each milestone
+ */
+function getWeekExplanation(week, percent1, interest1Name) {
+ if (week === 0) {
+ return 'Your initial state based on who you followed.';
+ } else if (week <= 4) {
+ return `Subtle drift begins. ${interest1Name} content scores slightly higher in the algorithm, so you see more of it.`;
+ } else if (week <= 12) {
+ return `Engagement reinforcement. You're engaging more with ${interest1Name} because you're seeing more of it. This increases your cluster score.`;
+ } else if (week <= 20) {
+ return `FRS acceleration (if enabled). X recommends ${interest1Name} accounts. Following them accelerates drift.`;
+ } else if (week <= 24) {
+ return `Approaching equilibrium. Drift slows as you near the algorithm's "natural" balance for your engagement pattern.`;
+ } else if (percent1 >= 85) {
+ return `Deep in the gravity well. Breaking out now requires deliberate counter-engagement for 30+ days.`;
+ } else {
+ return `Continued drift toward monoculture. Your secondary interest is becoming barely visible.`;
+ }
+}
+
+// Initialize with default values on page load
+window.addEventListener('DOMContentLoaded', () => {
+ // Set initial split display
+ const initialSplit = parseInt(splitSlider.value);
+ splitDisplay.textContent = `${initialSplit}% / ${100 - initialSplit}%`;
+});
diff --git a/docs/js/phoenix-simulator.js b/docs/js/phoenix-simulator.js
new file mode 100644
index 000000000..35b9de085
--- /dev/null
+++ b/docs/js/phoenix-simulator.js
@@ -0,0 +1,328 @@
+// Phoenix Behavioral Sequence Simulator
+
+document.addEventListener('DOMContentLoaded', function() {
+
+// ============================================================================
+// State Management
+// ============================================================================
+
+let actionSequence = [];
+const MAX_SEQUENCE_LENGTH = 8;
+
+// Action metadata
+const ACTION_INFO = {
+ 'LIKE_tech': { emoji: '❤️', label: 'Like Tech', category: 'tech', engagement: 'high', color: '#1DA1F2' },
+ 'CLICK_tech': { emoji: '👁️', label: 'Click Tech', category: 'tech', engagement: 'medium', color: '#1DA1F2' },
+ 'REPLY_tech': { emoji: '💬', label: 'Reply Tech', category: 'tech', engagement: 'very high', color: '#1DA1F2' },
+ 'LIKE_sports': { emoji: '❤️', label: 'Like Sports', category: 'sports', engagement: 'high', color: '#17bf63' },
+ 'CLICK_sports': { emoji: '👁️', label: 'Click Sports', category: 'sports', engagement: 'medium', color: '#17bf63' },
+ 'REPLY_sports': { emoji: '💬', label: 'Reply Sports', category: 'sports', engagement: 'very high', color: '#17bf63' },
+ 'SCROLL_neutral': { emoji: '📜', label: 'Scroll Past', category: 'neutral', engagement: 'none', color: '#8899AA' }
+};
+
+// ============================================================================
+// Event Handlers
+// ============================================================================
+
+// Add action to sequence
+document.querySelectorAll('.action-btn').forEach(button => {
+ button.addEventListener('click', () => {
+ const action = button.dataset.action;
+ const category = button.dataset.category;
+ const actionKey = `${action}_${category}`;
+
+ if (actionSequence.length >= MAX_SEQUENCE_LENGTH) {
+ // Remove oldest action (shift left)
+ actionSequence.shift();
+ }
+
+ actionSequence.push(actionKey);
+ updateSequenceDisplay();
+ analyzeBehavior();
+ });
+});
+
+// Clear sequence
+document.getElementById('clear-sequence-btn').addEventListener('click', () => {
+ actionSequence = [];
+ updateSequenceDisplay();
+ document.getElementById('predictions-container').style.display = 'none';
+});
+
+// ============================================================================
+// Display Functions
+// ============================================================================
+
+function updateSequenceDisplay() {
+ const container = document.getElementById('action-sequence');
+
+ if (actionSequence.length === 0) {
+ container.innerHTML = 'No actions yet. Click buttons above to build your sequence.';
+ return;
+ }
+
+ container.innerHTML = '';
+
+ actionSequence.forEach((actionKey, index) => {
+ const info = ACTION_INFO[actionKey];
+ const badge = document.createElement('div');
+ badge.style.cssText = `
+ padding: 0.5rem 1rem;
+ background-color: ${info.color}22;
+ border: 2px solid ${info.color};
+ border-radius: 6px;
+ font-weight: 600;
+ font-size: 0.95rem;
+ display: inline-flex;
+ align-items: center;
+ gap: 0.5rem;
+ `;
+ badge.innerHTML = `${info.emoji} ${info.label}`;
+ container.appendChild(badge);
+ });
+}
+
+// ============================================================================
+// Behavioral Analysis
+// ============================================================================
+
+function analyzeBehavior() {
+ if (actionSequence.length === 0) {
+ document.getElementById('predictions-container').style.display = 'none';
+ return;
+ }
+
+ // Analyze sequence
+ const analysis = analyzeSequencePattern(actionSequence);
+
+ // Show predictions
+ document.getElementById('predictions-container').style.display = 'block';
+
+ // Display behavioral state
+ displayBehavioralState(analysis);
+
+ // Display predictions
+ displayPredictions(analysis);
+
+ // Display interpretation
+ displayInterpretation(analysis);
+}
+
+function analyzeSequencePattern(sequence) {
+ // Count by category
+ const categoryCounts = { tech: 0, sports: 0, neutral: 0 };
+ const actionCounts = { LIKE: 0, CLICK: 0, REPLY: 0, SCROLL: 0 };
+ const engagementLevels = { 'very high': 0, high: 0, medium: 0, none: 0 };
+
+ sequence.forEach(actionKey => {
+ const info = ACTION_INFO[actionKey];
+ const [action, category] = actionKey.split('_');
+
+ categoryCounts[category]++;
+ actionCounts[action]++;
+ engagementLevels[info.engagement]++;
+ });
+
+ // Calculate dominant category
+ const totalActions = sequence.length;
+ const techPercent = (categoryCounts.tech / totalActions) * 100;
+ const sportsPercent = (categoryCounts.sports / totalActions) * 100;
+ const neutralPercent = (categoryCounts.neutral / totalActions) * 100;
+
+ // Determine behavioral state
+ let behavioralState = '';
+ let stateColor = '';
+
+ if (neutralPercent >= 60) {
+ behavioralState = 'Passive Browsing Mode';
+ stateColor = '#8899AA';
+ } else if (actionCounts.REPLY >= 2 || engagementLevels['very high'] >= 2) {
+ behavioralState = 'High Engagement Streak';
+ stateColor = '#ff6b6b';
+ } else if (techPercent >= 75 || sportsPercent >= 75) {
+ const dominant = techPercent > sportsPercent ? 'Tech' : 'Sports';
+ behavioralState = `Deep Dive: ${dominant} Content`;
+ stateColor = techPercent > sportsPercent ? '#1DA1F2' : '#17bf63';
+ } else if (techPercent >= 50 && sportsPercent === 0) {
+ behavioralState = 'Focused Exploration: Tech';
+ stateColor = '#1DA1F2';
+ } else if (sportsPercent >= 50 && techPercent === 0) {
+ behavioralState = 'Focused Exploration: Sports';
+ stateColor = '#17bf63';
+ } else {
+ behavioralState = 'Context Switching: Mixed Interests';
+ stateColor = '#f5a623';
+ }
+
+ // Calculate predictions based on behavioral pattern
+ const predictions = calculatePredictions({
+ techPercent,
+ sportsPercent,
+ neutralPercent,
+ engagementLevels,
+ actionCounts,
+ sequence
+ });
+
+ return {
+ behavioralState,
+ stateColor,
+ categoryCounts,
+ techPercent,
+ sportsPercent,
+ neutralPercent,
+ predictions,
+ engagementLevels,
+ actionCounts
+ };
+}
+
+function calculatePredictions(analysis) {
+ const { techPercent, sportsPercent, neutralPercent, engagementLevels, sequence } = analysis;
+
+ // Base probabilities
+ let techEngagement = Math.max(5, techPercent);
+ let sportsEngagement = Math.max(5, sportsPercent);
+
+ // Boost based on recent momentum (last 3 actions)
+ const recentActions = sequence.slice(-3);
+ const recentTech = recentActions.filter(a => a.includes('tech')).length;
+ const recentSports = recentActions.filter(a => a.includes('sports')).length;
+
+ techEngagement += recentTech * 15;
+ sportsEngagement += recentSports * 15;
+
+ // Boost based on engagement intensity
+ if (engagementLevels['very high'] >= 2) {
+ // High engagement mode - boost everything
+ techEngagement *= 1.3;
+ sportsEngagement *= 1.3;
+ }
+
+ // Penalty for passive browsing
+ if (neutralPercent >= 60) {
+ techEngagement *= 0.3;
+ sportsEngagement *= 0.3;
+ }
+
+ // Normalize to 100%
+ const total = techEngagement + sportsEngagement;
+ const techProb = (techEngagement / total) * 100;
+ const sportsProb = (sportsEngagement / total) * 100;
+
+ return {
+ tech: Math.round(techProb),
+ sports: Math.round(sportsProb)
+ };
+}
+
+// ============================================================================
+// Display Predictions
+// ============================================================================
+
+function displayBehavioralState(analysis) {
+ const container = document.getElementById('behavioral-state');
+ container.style.borderLeftColor = analysis.stateColor;
+ container.innerHTML = `
+ Week ${week}
+ ${percent1}%
+ ${percent2}%
+ ${explanation}
+
+
+
+ In Phoenix: Algorithm recognizes passive mode and adjusts accordingly.
+
+
+ In Phoenix: Real-time adaptation to your engagement momentum.
+
+
+
+ Stage 1: Candidate Generation
+ Selection Source
+ Initial Pool
+ Network Status
+ Stage 2: Feature Hydration
+ Author Features
+
+
+ Tweet Features
+
+
+ User-Tweet Features
+
+
+ Engagement Predictions
+ Stage 3: Heavy Ranker (ML Scoring)
+ score = Σ (probability_i × weight_i)
+ Stage 4: Filters & Penalties
+ ${mod.name}
+ ${mod.multiplier}
+ Final Score After Filters
+ Stage 5: Mixing & Serving
+ Final Ranking
+ Survival Rate
+ Score Evolution
+ Mixing & Ads
+
+ ${survived
+ ? '✓ This tweet survived the 96% rejection rate and would appear in your timeline.'
+ : '✗ This tweet was filtered out in the final ranking and would not appear in your timeline.'}
+
+ ${currentScenario.network === 'out-of-network' && !survived
+ ? 'The 25% out-of-network penalty significantly reduced its competitiveness.'
+ : currentScenario.authorPosition > 1 && !survived
+ ? 'The author diversity penalty reduced its ranking below the visibility threshold.'
+ : survived && currentScenario.network === 'in-network'
+ ? 'In-network status and strong engagement predictions helped it survive.'
+ : 'Try exploring different scenarios to see how network status and engagement affect outcomes.'}
+ `;
+
+ document.getElementById('summary-text').innerHTML = summaryText;
+ finalSummary.style.display = 'block';
+ finalSummary.scrollIntoView({ behavior: 'smooth', block: 'start' });
+
+ nextStageBtn.style.display = 'none';
+}
diff --git a/docs/js/reinforcement-loop.js b/docs/js/reinforcement-loop.js
new file mode 100644
index 000000000..208d91a09
--- /dev/null
+++ b/docs/js/reinforcement-loop.js
@@ -0,0 +1,596 @@
+/**
+ * The Reinforcement Loop Machine
+ *
+ * Shows step-by-step how the feedback loop creates drift:
+ * Profile → Candidates → Scoring → Feed → Engagement → Profile Update → repeat
+ *
+ * Based on:
+ * - Multiplicative scoring (ApproximateCosineSimilarity.scala:84-94)
+ * - Weekly InterestedIn updates (InterestedInFromKnownFor.scala:59)
+ * - L2 normalization (SimClustersEmbedding.scala:59-72)
+ */
+
+// State
+let currentWeek = 0;
+let currentStage = 0;
+let profile = { ai: 0.60, cooking: 0.40 };
+let history = [];
+
+// Configuration
+const DRIFT_RATE = 0.015; // Medium engagement
+const STAGES = ['profile', 'candidates', 'scoring', 'feed', 'engagement', 'update'];
+
+// DOM elements
+const loopAiSlider = document.getElementById('loop-ai');
+const loopCookingSlider = document.getElementById('loop-cooking');
+const startLoopBtn = document.getElementById('start-loop-btn');
+const loopContainer = document.getElementById('loop-container');
+const stageContainer = document.getElementById('stage-container');
+const nextStageBtn = document.getElementById('next-stage-btn');
+const restartBtn = document.getElementById('restart-btn');
+const currentWeekDisplay = document.getElementById('current-week');
+const historyContainer = document.getElementById('history-container');
+
+// Initialize
+window.addEventListener('DOMContentLoaded', () => {
+ initializeSliders();
+ attachEventListeners();
+});
+
+/**
+ * Initialize sliders with normalization
+ */
+function initializeSliders() {
+ updateLoopDisplays();
+
+ [loopAiSlider, loopCookingSlider].forEach(slider => {
+ slider.addEventListener('input', () => {
+ normalizeLoopProfile();
+ updateLoopDisplays();
+ });
+ });
+}
+
+/**
+ * Normalize loop profile to 100%
+ */
+function normalizeLoopProfile() {
+ const ai = parseInt(loopAiSlider.value);
+ const cooking = parseInt(loopCookingSlider.value);
+ const total = ai + cooking;
+
+ if (total === 0) return; // guard: both sliders at 0 would divide by zero
+
+ if (total !== 100) {
+ const normAi = Math.round((ai / total) * 100);
+ const normCooking = 100 - normAi;
+
+ loopAiSlider.value = normAi;
+ loopCookingSlider.value = normCooking;
+
+ profile = { ai: normAi / 100, cooking: normCooking / 100 };
+ } else {
+ profile = { ai: ai / 100, cooking: cooking / 100 };
+ }
+}
+
+/**
+ * Update loop displays
+ */
+function updateLoopDisplays() {
+ const aiPercent = Math.round(profile.ai * 100);
+ const cookingPercent = Math.round(profile.cooking * 100);
+
+ document.getElementById('loop-ai-display').textContent = `${aiPercent}%`;
+ document.getElementById('loop-cooking-display').textContent = `${cookingPercent}%`;
+ document.getElementById('loop-total').textContent = `${aiPercent + cookingPercent}%`;
+}
+
+/**
+ * Attach event listeners
+ */
+function attachEventListeners() {
+ startLoopBtn.addEventListener('click', startLoop);
+ // Use onclick here so showProjection() can later swap the handler
+ // without the original click listener also firing.
+ nextStageBtn.onclick = nextStage;
+ restartBtn.addEventListener('click', restart);
+}
+
+/**
+ * Start the loop
+ */
+function startLoop() {
+ // Reset state
+ currentWeek = 0;
+ currentStage = 0;
+ history = [{ week: 0, ai: profile.ai, cooking: profile.cooking }];
+
+ // Show loop container
+ loopContainer.style.display = 'block';
+ loopContainer.scrollIntoView({ behavior: 'smooth', block: 'start' });
+
+ // Render first stage
+ renderStage();
+ updateProgress();
+}
+
+/**
+ * Advance to next stage
+ */
+function nextStage() {
+ currentStage++;
+
+ if (currentStage >= STAGES.length) {
+ // Completed first loop, show 4-week projection
+ showProjection();
+ } else {
+ renderStage();
+ updateProgress();
+ }
+}
+
+/**
+ * Calculate drift for one week
+ */
+function calculateWeeklyDrift(currentProfile) {
+ const advantage = currentProfile.ai / currentProfile.cooking;
+ const imbalance = Math.abs(currentProfile.ai - currentProfile.cooking);
+ const slowdown = 1 - (imbalance * 0.5);
+ const drift = DRIFT_RATE * advantage * slowdown;
+
+ const newAi = Math.min(0.95, currentProfile.ai + drift);
+ const newCooking = 1 - newAi; // L2 normalization
+
+ return { ai: newAi, cooking: newCooking };
+}
+
+/**
+ * Show 4-week projection after completing one loop
+ */
+function showProjection() {
+ nextStageBtn.textContent = 'Show 6-Month Projection →';
+ nextStageBtn.onclick = extendToSixMonths;
+
+ // Calculate 4 weeks of drift
+ const projection = [history[0]]; // Week 0
+ let currentProfile = { ai: profile.ai, cooking: profile.cooking };
+
+ for (let week = 1; week <= 4; week++) {
+ currentProfile = calculateWeeklyDrift(currentProfile);
+ projection.push({ week, ai: currentProfile.ai, cooking: currentProfile.cooking });
+ }
+
+ const initialAi = Math.round(projection[0].ai * 100);
+ const week4Ai = Math.round(projection[4].ai * 100);
+ const totalDrift = week4Ai - initialAi;
+
+ stageContainer.innerHTML = `
+ Loop Complete - 4-Week Projection
+ Week-by-Week Breakdown
+ 6-Month Projection (24 Weeks)
+ Complete Timeline
+ Filter Bubble Lock-In
+ Stage 1: Your Profile
+ Stage 2: Fetch Candidates
+ Stage 3: Score Tweets
+ Base Quality: 0.85
+ × Your AI Interest: ${profile.ai.toFixed(2)}
+ = Score: ${aiScore}
+ Base Quality: 0.85
+ × Your Cooking Interest: ${profile.cooking.toFixed(2)}
+ = Score: ${cookingScore}
+ Stage 4: Build Your Feed
+ Stage 5: You Engage
+ Stage 6: Update Your Profile
+ Reference & Glossary
+
+ Contents
+
+ Glossary: Algorithm Components
+
+
+
+ Reference Sections
+
+
+
+
+ Glossary: Building Blocks of the Algorithm
+
+ Heavy Ranker
+
+ recap
+ HomeGlobalParams.scala:786-1028
+
+ Light Ranker
+
+ earlybird
+
+ TwHIN (Twitter Heterogeneous Information Network)
+
+ recos and related embeddings
+
+ SimClusters
+
+ Because scoring is multiplicative (your_cluster_score × tweet_cluster_score), your strongest cluster keeps getting stronger. If you're 60% AI and 40% cooking today, engaging slightly more with AI content makes you 65% AI, which makes AI content score even higher, which makes you engage more with AI... and six months later you're 76% AI.
+ simclusters_v2
+
+ UTEG (User-Tweet-Entity-Graph)
+
+ user_tweet_entity_graph
+
+ GraphJet
+
+ GraphJet
+
+ Earlybird
+
+ search
+
+ Real Graph
+
+ interaction_graph
+
+ Tweet Mixer
+
+ tweet-mixer
+
+ Navi
+
+ Navi
+ ModelScorer.scala
+
+ Product Mixer
+
+ product-mixer
+
+ MaskNet
+
+ recap
+
+ FSBoundedParam (Feature Switch)
+
+ Hard-coded values like val penalty = 0.75 require a code deployment to change. FSBoundedParam defines parameters like OutOfNetworkPenalty(default=0.75, min=0.0, max=1.0) that can be adjusted through a dashboard. Twitter can run A/B tests or tune values in real-time without touching code.
+ param
+
+ TweepCred
+
+ tweepcred
+
+ FRS (Follow Recommendations Service)
+
+ follow-recommendations-service
+
+ User Signal Service (USS)
+
+ user-signal-service
+
+ Code Evolution Timeline
+
+ March 2023: Architecture Skeleton
+ The HomeGlobalParams.scala file contained only 86 lines with basic configuration: no engagement weights, no ML integration configs.
+
+ September 2025: Complete Implementation
+ The HomeGlobalParams.scala file expanded to 1,479 lines with all engagement weight parameters defined.
+
+ What This Means
+
+
+
+
+ Many parameters are defined with default = 0.0, but actual production values come from Twitter's internal configuration system. The code shows structure and formulas; external documentation provides values.
+
+ How to Verify Our Claims
+
+ 1. Get the Code
+
+
+ git clone https://github.com/twitter/the-algorithm.git
+cd the-algorithm
+
+2. Navigate to Referenced Files
+ HomeGlobalParams.scala:786-1028
+
+ cd home-mixer/server/src/main/scala/com/twitter/home_mixer/param/
+cat HomeGlobalParams.scala | sed -n '786,1028p'
+
+3. Check Implementation Date
+
+
+ git blame path/to/file.scala | grep -A5 "pattern"4. Verify Calculations
+
+
+ Tweet score = 0.5 × P(favorite) + 13.5 × P(reply)
+
+Example:
+P(favorite) = 0.1 (10% chance)
+P(reply) = 0.02 (2% chance)
+
+Score = 0.5 × 0.1 + 13.5 × 0.02
+ = 0.05 + 0.27
+ = 0.32
+
+5. Cross-Reference Documentation
+
+
+
+
+
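The worked calculation in step 4 above is a two-term weighted sum; it can be checked generically with a few lines of JavaScript. The weights and probabilities here are just the example values from step 4, not the full production set:

```javascript
// score = Σ (probability_i × weight_i) over engagement types
function rankerScore(probabilities, weights) {
  return Object.keys(weights)
    .reduce((sum, key) => sum + probabilities[key] * weights[key], 0);
}

const example = rankerScore(
  { favorite: 0.10, reply: 0.02 },  // predicted engagement probabilities
  { favorite: 0.5, reply: 13.5 }    // weights from the example above
);
// 0.5 × 0.1 + 13.5 × 0.02 = 0.05 + 0.27 = 0.32
```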
+ File Index: Where to Find Things
+
+ Main Pipeline
+ home-mixer/server/src/main/scala/com/twitter/home_mixer/product/for_you/ForYouProductPipelineConfig.scalahome-mixer/server/src/main/scala/com/twitter/home_mixer/product/scored_tweets/ScoredTweetsProductPipelineConfig.scalaEngagement Weights
+ home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:786-1028home-mixer/server/src/main/scala/com/twitter/home_mixer/model/PredictedScoreFeature.scala:62-336Filters and Penalties
+ home-mixer/.../filter/FeedbackFatigueFilter.scalahome-mixer/.../scorer/FeedbackFatigueScorer.scalahome-mixer/.../scorer/AuthorBasedListwiseRescoringProvider.scala:54home-mixer/.../scorer/RescoringFactorProvider.scala:45-57Candidate Sources
+ src/java/com/twitter/search/src/scala/com/twitter/recos/user_tweet_entity_graph/tweet-mixer/follow-recommendations-service/SimClusters and Communities
+ src/scala/com/twitter/simclusters_v2/simclusters-ann/User Signals
+ RETREIVAL_SIGNALS.mduser-signal-service/unified_user_actions/
+
+ Further Reading
+
+ Official Sources
+
+
+
+
+
+