From f8addff97814ee2c65d3cc8ad92d99cd138c1f55 Mon Sep 17 00:00:00 2001 From: Ernest Date: Wed, 12 Nov 2025 18:00:25 +0200 Subject: [PATCH] initial additions --- docs/CLAUDE.md | 483 ++++ docs/DESIGN_STANDARDS.md | 267 +++ docs/README.md | 157 ++ docs/convert-code-refs.py | 144 ++ docs/css/style.css | 2116 +++++++++++++++++ docs/index.html | 317 +++ docs/interactive/algorithmic-aristocracy.html | 529 +++++ docs/interactive/algorithmic-identity.html | 363 +++ docs/interactive/cluster-explorer.html | 437 ++++ docs/interactive/engagement-calculator.html | 531 +++++ docs/interactive/invisible-filter.html | 315 +++ docs/interactive/journey-simulator.html | 229 ++ .../phoenix-sequence-prediction.html | 832 +++++++ docs/interactive/pipeline-explorer.html | 281 +++ docs/interactive/reinforcement-loop.html | 283 +++ docs/js/algorithmic-aristocracy.js | 281 +++ docs/js/algorithmic-identity.js | 361 +++ docs/js/cluster-explorer.js | 416 ++++ docs/js/engagement-calculator.js | 613 +++++ docs/js/invisible-filter.js | 411 ++++ docs/js/journey-simulator.js | 360 +++ docs/js/phoenix-simulator.js | 328 +++ docs/js/pipeline-explorer.js | 636 +++++ docs/js/reinforcement-loop.js | 596 +++++ docs/parts/reference.html | 396 +++ 25 files changed, 11682 insertions(+) create mode 100644 docs/CLAUDE.md create mode 100644 docs/DESIGN_STANDARDS.md create mode 100644 docs/README.md create mode 100644 docs/convert-code-refs.py create mode 100644 docs/css/style.css create mode 100644 docs/index.html create mode 100644 docs/interactive/algorithmic-aristocracy.html create mode 100644 docs/interactive/algorithmic-identity.html create mode 100644 docs/interactive/cluster-explorer.html create mode 100644 docs/interactive/engagement-calculator.html create mode 100644 docs/interactive/invisible-filter.html create mode 100644 docs/interactive/journey-simulator.html create mode 100644 docs/interactive/phoenix-sequence-prediction.html create mode 100644 docs/interactive/pipeline-explorer.html create mode 
100644 docs/interactive/reinforcement-loop.html create mode 100644 docs/js/algorithmic-aristocracy.js create mode 100644 docs/js/algorithmic-identity.js create mode 100644 docs/js/cluster-explorer.js create mode 100644 docs/js/engagement-calculator.js create mode 100644 docs/js/invisible-filter.js create mode 100644 docs/js/journey-simulator.js create mode 100644 docs/js/phoenix-simulator.js create mode 100644 docs/js/pipeline-explorer.js create mode 100644 docs/js/reinforcement-loop.js create mode 100644 docs/parts/reference.html diff --git a/docs/CLAUDE.md b/docs/CLAUDE.md new file mode 100644 index 000000000..18ea8f3fe --- /dev/null +++ b/docs/CLAUDE.md @@ -0,0 +1,483 @@ +# CLAUDE.md - Interactive Documentation Guide + +This file provides guidance to Claude Code when working on the interactive documentation in `/docs`. + +## Purpose + +The `docs/` directory contains **public-facing interactive documentation** that translates our technical research findings (from `notes/`) into accessible, engaging explanations for a broad audience. + +**Goals**: +- Make complex algorithmic behavior understandable to non-technical readers +- Provide concrete examples with real calculations +- Use interactive visualizations to demonstrate dynamic effects +- Maintain objectivity and verifiability (all claims backed by code references) +- Create an engaging reading experience + +**Audience**: Users, creators, researchers, policy makers, and anyone curious about how algorithmic systems shape discourse. + +## Quick Reference: Content Structure + +**Every algorithmic concept must follow this three-part structure:** + +1. **Intuition** - Plain language explanation answering "What is this?" and "Why does it matter?" +2. **Feel** - Interactive visualization or calculator to experience the dynamics +3. **Proof** - Mathematical formulas, concrete calculations, and code references for verification + +This ensures content is accessible (intuition), engaging (feel), and verifiable (proof). 
+ +See [Writing Principles](#writing-principles) for detailed guidance. + +## Content Structure + +The investigation is structured as a multi-part series: + +1. **Introduction** (`parts/01-introduction.html`) - Why this matters, approach, questions +2. **Tweet Journey** (`parts/02-tweet-journey.html`) - Following a tweet through all 5 algorithmic stages +3. **User Journey** (`parts/03-user-journey.html`) - How user experience evolves over 6 months +4. **Discourse Levers** (`parts/04-discourse-levers.html`) - 6 mechanisms shaping platform discourse +5. **What This Means** (`parts/05-what-this-means.html`) - Objective analysis of designed vs emergent effects +6. **Conclusions** (`parts/06-conclusions.html`) - Perspective and implications (subjective analysis) +7. **Appendix** (`parts/07-appendix.html`) - Methodology, file index, verification guide + +**Landing page** (`index.html`) - Overview with key findings and navigation + +## Writing Principles + +### Content Structure: Intuition → Feel → Proof + +Every algorithmic concept should be presented in three layers to serve different learning styles: + +**1. General Description (Intuition)** +- Plain language explanation of what this mechanism does and why it exists +- Build intuition: help readers understand the "shape" of the behavior +- Answer: "What is this?" and "Why does it matter?" +- Example: "Twitter prevents any single author from dominating your feed by applying an exponential penalty..." + +**2. Visualization/Interactive Element (Feel)** +- Let readers experience the dynamic, not just read about it +- Interactive calculators, charts, simulations, or animated diagrams +- Answer: "How does this feel?" and "What happens if I change X?" +- Example: Slider to adjust tweet count and see penalty compound in real-time + +**3. 
Math & Code (Proof)** +- Concrete formulas with actual parameters from the code +- Step-by-step calculations with real numbers +- Code references with file paths and line numbers for verification +- Answer: "How exactly does this work?" and "Can I verify this?" +- Example: Formula with decay factor 0.5, floor 0.25, plus code reference + +This three-part structure ensures content is: +- **Accessible**: Non-technical readers get the intuition +- **Engaging**: Visual learners can explore interactively +- **Verifiable**: Technical readers can check the implementation + +### Objectivity Until Conclusions + +- **Parts 1-5**: Present facts, mechanics, and observable effects without judgment +- **Part 6**: Bring perspective, interpretation, and implications +- Use phrases like "the algorithm optimizes for" not "the algorithm wants to" +- Describe effects objectively: "this increases polarization" not "this is bad" + +### Show the Math + +Every algorithmic effect should include **concrete calculations**: + +```html +

Author Diversity Penalty

+

When an author posts multiple tweets, each subsequent tweet receives a penalty:

+ +
Formula: multiplier = (1 - floor) × decayFactor^position + floor
+
+Where:
+- decayFactor = 0.5
+- floor = 0.25 (minimum multiplier)
+- position = tweet number from this author (0-indexed)
+
+Example - Author posts 3 tweets with base score 100:
+
+Tweet 1 (position 0): 100 × 1.0 = 100
+Tweet 2 (position 1): 100 × 0.625 = 62.5
+Tweet 3 (position 2): 100 × 0.4375 = 43.75
+
+Total reach: 206.25 (not 300!)
+Effective penalty: 31.25% loss (93.75 of 300) compared with the same three tweets scored without the penalty
+ +

Code: +home-mixer/server/src/main/scala/com/twitter/home_mixer/product/scored_tweets/scorer/AuthorBasedListwiseRescoringProvider.scala:54

+```
+
+### Code References Are Mandatory
+
+Every claim must include verifiable code references:
+
+**Good**:
+```html
+

Out-of-network tweets receive a 0.75x multiplier (25% penalty).

+

Code: +home-mixer/.../RescoringFactorProvider.scala:45-57

+```
+
+**Bad**:
+```html
+

Out-of-network tweets are penalized.

+ +``` + +### Accessible Language + +Translate technical concepts into plain language while maintaining accuracy: + +**Technical** (for `notes/`): +> "The MaskNet architecture predicts 15 engagement probability distributions using parallel task-specific towers with shared representation layers." + +**Accessible** (for `docs/`): +> "The ranking model predicts 15 different ways you might engage with a tweet (like, reply, retweet, etc.) all at once. Each prediction gets a weight, and the weighted sum becomes the tweet's score." + +## HTML Structure and Patterns + +### Standard Page Template + +```html + + + + + + Part Title - How Twitter's Algorithm Really Works + + + + + +
+ +
+ + + + + + +``` + +### Common HTML Patterns + +**Code reference block**: +```html +

Code: +path/to/file.scala:line-numbers

+```
+
+**Callout for important insights**:
+```html
+
+

Key Insight: This is a critical finding that deserves emphasis.

+
+```
+
+**Finding cards** (for index page):
+```html
+
+

Finding Title

+

Brief description of the finding and its implications.

+

Code: file.scala:123

+
+```
+
+**Calculations and formulas**:
+```html
+
Formula or calculation here
+With multiple lines
+Showing step-by-step math
+``` + +## Interactive Visualizations + +Use interactive elements to help readers **experience** algorithmic dynamics, not just read about them. + +### Types of Visualizations + +**1. Decay Functions** - Show how effects change over time +- Author diversity decay (exponential) +- Feedback fatigue decay (linear over 140 days) +- Temporal decay for tweet age + +**2. Multiplicative Effects** - Demonstrate compound behaviors +- Gravitational pull (interest drift from 60/40 to 76/24) +- Cluster reinforcement +- Follower advantage compounding + +**3. Interactive Calculators** - Let readers adjust parameters +- Tweet score calculator (adjust engagement weights) +- Author diversity penalty (adjust number of tweets) +- In-network vs out-of-network comparison + +**4. Flow Diagrams** - Show pipelines and stages +- Tweet journey through 5 stages +- Candidate funnel (1B → 1,400 → 50-100) +- Signal usage across components + +**5. Comparison Charts** - Relative values and impacts +- Engagement weights bar chart (Reply: 75.0 vs Favorite: 0.5) +- Filter penalties comparison +- Signal importance across systems + +### Implementation Guidelines + +**Technology**: +- Vanilla JavaScript (no frameworks, keep it lightweight) +- For charts: Chart.js or D3.js (include via CDN) +- For animations: CSS transitions or Canvas API +- Store data in separate JSON files in `assets/` if needed + +**Accessibility**: +- All visualizations should have text fallbacks +- Describe what the visualization shows before showing it +- Provide static examples alongside interactive ones +- Ensure keyboard navigation works + +**Example: Interactive Author Diversity Calculator** + +```html +

Interactive: Author Diversity Penalty

+

Adjust the number of tweets to see how the penalty compounds:

+ +
+ + +
+ +
+ + +
+ + +``` + +### When to Use Interactive vs Static + +**Use interactive visualizations when**: +- The concept involves change over time (decay curves, drift) +- Readers benefit from experimenting with parameters +- Multiple scenarios need comparison +- The dynamic is hard to grasp from numbers alone + +**Use static examples when**: +- A single concrete example is sufficient +- The calculation is straightforward +- Interaction would add complexity without insight +- Page performance is a concern + +## Content Tone and Style + +### Voice + +- **Clear and direct**: Short sentences, active voice +- **Conversational but precise**: Explain like you're talking to a smart friend +- **Objective in facts, thoughtful in implications**: Present mechanics objectively, analyze effects thoughtfully +- **No hyperbole**: Let the findings speak for themselves + +### Framing Effects + +Be careful with word choice that implies intent: + +**Avoid** (implies intent): +- "The algorithm wants you to..." +- "Twitter designed this to manipulate..." +- "This is meant to exploit..." + +**Prefer** (describes mechanism): +- "The algorithm optimizes for..." +- "This design choice results in..." +- "This creates an incentive to..." + +### Example Transformations + +**From technical finding** (`notes/`): +> "The Heavy Ranker applies a -74.0 weight to the predicted probability of negative feedback, while favorites receive only a 0.5 weight. This creates a 148:1 ratio favoring avoidance of negative signals over accumulation of positive ones." + +**To accessible explanation** (`docs/`): +> "When scoring a tweet, the algorithm severely penalizes content that might trigger 'not interested' clicks (-74.0 weight) while barely rewarding favorites (0.5 weight). This means one 'not interested' click has the same negative impact as 148 likes have positive impact. The algorithm is designed to avoid showing you things you'll reject, not to show you things you'll like." 
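
The 148:1 ratio in the example above can be checked with a few lines of arithmetic. The sketch below is illustrative only. It uses the two weights quoted in this section (-74.0 for predicted negative feedback, 0.5 for a predicted favorite). The function name `weighted_score` and the example probabilities are hypothetical and are not taken from the ranking code.

```python
# Illustrative sketch only: the weights are the two quoted in this guide;
# the function name and the example probabilities are hypothetical.
WEIGHTS = {
    "favorite": 0.5,             # predicted probability of a like
    "negative_feedback": -74.0,  # predicted probability of "not interested"
}


def weighted_score(probabilities):
    """Sum each predicted engagement probability times its weight."""
    return sum(WEIGHTS[name] * p for name, p in probabilities.items())


# How many units of favorite probability offset one unit of negative feedback?
break_even_ratio = abs(WEIGHTS["negative_feedback"]) / WEIGHTS["favorite"]

# A tweet that is likely to be liked but has a small chance of rejection.
example = {"favorite": 0.60, "negative_feedback": 0.01}
example_score = weighted_score(example)

print(f"break-even ratio: {break_even_ratio:.0f}:1")
print(f"example score: {example_score:.2f}")
```

With these assumed probabilities, a 1% chance of rejection (contributing -0.74) outweighs a 60% chance of a like (contributing +0.30), which is the asymmetry the accessible explanation describes.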
+ +## Visual Design + +The site uses a dark theme (`css/style.css`) inspired by Twitter/X's interface. + +**Design principles**: +- High contrast for readability (light text on dark background) +- Generous whitespace and line height +- Code blocks with syntax highlighting colors +- Responsive layout (works on mobile) +- Consistent spacing and typography + +**Color usage**: +- Background: `#15202b` (dark blue-gray) +- Text: `#e7e9ea` (light gray) +- Links: `#1d9bf0` (Twitter blue) +- Code blocks: `#192734` background, `#50fa7b` for highlights +- Callouts: Subtle border and background variation + +## File Organization + +``` +docs/ +├── index.html # Landing page +├── CLAUDE.md # This file (documentation guidance) +├── README.md # Deployment and viewing instructions +│ +├── parts/ # Main content sections +│ ├── 01-introduction.html +│ ├── 02-tweet-journey.html +│ ├── 03-user-journey.html +│ ├── 04-discourse-levers.html +│ ├── 05-what-this-means.html +│ ├── 06-conclusions.html +│ └── 07-appendix.html +│ +├── css/ +│ └── style.css # Dark theme styling +│ +├── js/ # Interactive widgets +│ ├── author-diversity-calculator.js +│ ├── engagement-weight-chart.js +│ ├── gravitational-pull-simulator.js +│ └── tweet-journey-flow.js +│ +└── assets/ # Data files, images + └── data/ + └── engagement-weights.json +``` + +## Common Tasks + +### Adding a New Interactive Visualization + +1. **Research the mechanism** in the codebase (see root `/CLAUDE.md`) +2. **Create concrete examples** with real numbers in `notes/` +3. **Design the interaction**: What should users be able to adjust? What do they see? +4. **Write the HTML structure** in the appropriate `parts/*.html` file +5. **Create the JavaScript** in `js/` with clear comments +6. **Test interactivity**: Does it help understanding? Is it intuitive? +7. **Add text explanation**: Describe what the visualization shows +8. 
**Include code reference**: Point to the actual implementation + +### Writing a New Section + +Follow the **Intuition → Feel → Proof** structure: + +**Step 1: Research & Prepare** +1. Review related notes in `notes/` for detailed findings +2. Locate the relevant code implementation +3. Extract exact formulas, parameters, and thresholds +4. Calculate concrete examples with real numbers + +**Step 2: Write Part 1 (Intuition)** +1. Start with plain language: "What is this mechanism?" +2. Explain why it exists: "What problem does it solve?" +3. Describe the shape: "How does it behave?" (exponential? linear? threshold-based?) +4. Make it relatable: Connect to user experience +5. Draft as if explaining to a non-technical friend + +**Step 3: Create Part 2 (Feel)** +1. Choose visualization type: calculator? graph? simulation? comparison chart? +2. Identify which parameters users should control +3. Decide what should be displayed: results? charts? comparisons? +4. Sketch the HTML structure with proper IDs and classes +5. Create the JavaScript (or note for separate task) +6. Add text description: "Use this to explore..." +7. Ensure it teaches, not just decorates + +**Step 4: Write Part 3 (Proof)** +1. Show the actual formula with exact parameters from code +2. Explain each variable: "Where decayFactor = 0.5..." +3. Walk through step-by-step calculation +4. Provide concrete example: "If an author posts 4 tweets..." +5. Show the consequences: "Effective loss: 40%..." +6. Add code reference: file path and line numbers +7. Cross-check accuracy against source code + +**Step 5: Review & Refine** +1. Check tone: Objective (Parts 1-5) or analytical (Part 6)? +2. Verify all code references are accurate +3. Test calculations manually +4. Run quality checklist +5. Test locally: Open in browser, check rendering +6. Verify links work correctly + +### Verifying Technical Accuracy + +Before publishing any claim: +1. **Check the source code** in the main repository +2. 
**Verify file paths and line numbers** are current +3. **Test calculations** with real numbers +4. **Cross-reference** with `notes/comprehensive-summary.md` +5. **Look for edge cases**: Are there exceptions or conditions? + +## Quality Checklist + +Before considering a section complete: + +**Content Structure (Intuition → Feel → Proof)**: +- [ ] **Part 1 (Intuition)**: Plain language explanation of what and why +- [ ] **Part 1 (Intuition)**: Describes the "shape" of the behavior +- [ ] **Part 2 (Feel)**: Interactive or visual element present +- [ ] **Part 2 (Feel)**: Visualization enhances understanding (not decorative) +- [ ] **Part 3 (Proof)**: Concrete formulas with actual parameters +- [ ] **Part 3 (Proof)**: Step-by-step calculations with real numbers +- [ ] **Part 3 (Proof)**: Code reference with file path and line numbers + +**Technical Accuracy**: +- [ ] Every claim has a code reference with file path and line numbers +- [ ] Formulas match the actual implementation in code +- [ ] Calculations are correct and use realistic examples +- [ ] Technical accuracy verified against source code +- [ ] Edge cases or conditions are mentioned if relevant + +**Accessibility & Style**: +- [ ] Language is accessible to non-technical readers +- [ ] Tone is objective (or appropriately analytical for Part 6) +- [ ] No jargon without explanation +- [ ] Active voice and short sentences + +**Technical Implementation**: +- [ ] Links to other sections work correctly +- [ ] Renders correctly on mobile and desktop +- [ ] No broken internal or external links +- [ ] Follows the established HTML/CSS patterns +- [ ] Interactive elements are keyboard accessible +- [ ] Text fallbacks exist for visualizations + +## Resources + +**For technical details**: See root `/CLAUDE.md` and `notes/comprehensive-summary.md` + +**For deployment**: See `docs/README.md` + +**For visual design**: See `docs/css/style.css` + +**Reference material**: +- Twitter's open-source repo: 
https://github.com/twitter/the-algorithm +- Engineering blog: https://blog.x.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm diff --git a/docs/DESIGN_STANDARDS.md b/docs/DESIGN_STANDARDS.md new file mode 100644 index 000000000..7afa36272 --- /dev/null +++ b/docs/DESIGN_STANDARDS.md @@ -0,0 +1,267 @@ +# Design Standards for Twitter Algorithm Documentation + +## Navigation Structure + +### Current State +We have files in: +- `index.html` (landing page) +- `parts/01-introduction.html` +- `parts/07-appendix.html` +- `interactive/` (6 interactive tools) + +### Standard Navigation + +**For index.html:** +```html + +``` + +**For parts/*.html:** +```html + +``` + +**For interactive/*.html:** +```html + +``` + +**Rationale:** Only show what actually exists. Keep it simple. + +--- + +## Code Reference Standard + +### Format + +All code references should be **clickable GitHub links** to the actual source code. + +**Base URL:** `https://github.com/twitter/the-algorithm/blob/main/` + +### HTML Pattern + +```html +

+ Code: + + home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930 + +

+```
+
+### Line Number Formats
+
+- **Single line**: `file.scala:123` → GitHub URL ends with `#L123`
+- **Range**: `file.scala:123-456` → GitHub URL ends with `#L123-L456`
+- **Multiple lines**: `file.scala:123,145,167` → Link to first line `#L123` (GitHub doesn't support non-contiguous)
+
+### Helper Function (JavaScript)
+
+```javascript
+/**
+ * Convert file path with line numbers to GitHub URL
+ * @param {string} path - e.g., "home-mixer/server/.../file.scala:123-456"
+ * @returns {string} GitHub URL
+ */
+function codeRefToGitHubUrl(path) {
+  const baseUrl = 'https://github.com/twitter/the-algorithm/blob/main/';
+
+  // A colon separates the file path from its line numbers
+  if (path.includes(':')) {
+    const [file, lines] = path.split(':');
+    // Non-contiguous lists ("123,145,167") link to the first line only
+    const first = lines.split(',')[0];
+    const lineFragment = first.includes('-')
+      ? `#L${first.replace('-', '-L')}`
+      : `#L${first}`;
+    return baseUrl + file + lineFragment;
+  }
+
+  // No line numbers, just file
+  return baseUrl + path;
+}
+
+// Example usage:
+// codeRefToGitHubUrl("home-mixer/server/src/main/scala/.../HomeGlobalParams.scala:788-930")
+// Returns: "https://github.com/twitter/the-algorithm/blob/main/home-mixer/server/src/main/scala/.../HomeGlobalParams.scala#L788-L930"
+```
+
+### CSS Styling
+
+```css
+/* Code reference links */
+.code-ref a {
+  color: var(--primary-color);
+  text-decoration: none;
+  transition: color 0.2s;
+  display: inline-flex;
+  align-items: center;
+  gap: 0.25rem;
+}
+
+.code-ref a:hover {
+  color: var(--primary-hover);
+  text-decoration: underline;
+}
+
+.code-ref a::after {
+  content: "→";
+  font-size: 0.875em;
+  opacity: 0.6;
+}
+
+.code-ref code {
+  background-color: var(--code-bg);
+  border: 1px solid var(--code-border);
+  border-radius: 3px;
+  padding: 0.2em 0.4em;
+  font-size: 0.8em;
+}
+```
+
+### Examples
+
+**Before (plain text):**
+```html
+

+ Code: home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930 +

+```
+
+**After (clickable link):**
+```html
+

+ Code: + + home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930 + +

+``` + +**Special Cases:** + +1. **Root-level files** (like RETREIVAL_SIGNALS.md): +```html + + RETREIVAL_SIGNALS.md + +``` + +2. **External repo** (algorithm-ml): +```html + + the-algorithm-ml/projects/home/recap/README.md + +``` + +--- + +## File Organization + +``` +docs/ +├── index.html # Landing page with overview +├── DESIGN_STANDARDS.md # This file (design guidelines) +├── README.md # How to view/deploy +├── CLAUDE.md # AI assistant guidelines +│ +├── parts/ # Main content sections +│ ├── 01-introduction.html +│ └── 07-appendix.html +│ +├── interactive/ # Interactive visualizations +│ ├── cluster-explorer.html +│ ├── engagement-calculator.html +│ ├── invisible-filter.html +│ ├── journey-simulator.html +│ ├── pipeline-explorer.html +│ └── reinforcement-loop.html +│ +├── js/ # JavaScript for interactives +│ ├── cluster-explorer.js +│ ├── engagement-calculator.js +│ ├── invisible-filter.js +│ ├── journey-simulator.js +│ ├── pipeline-explorer.js +│ └── reinforcement-loop.js +│ +├── css/ +│ └── style.css # Global styles +│ +└── assets/ # Static assets (if needed) + └── data/ # JSON data files +``` + +--- + +## Migration Plan + +### Step 1: Update Navigation (All Files) +- Remove links to non-existent parts (02-06) +- Update navigation HTML in: + - `index.html` + - `parts/01-introduction.html` + - `parts/07-appendix.html` + - All `interactive/*.html` files + +### Step 2: Add Code Reference Styling +- Add CSS to `css/style.css` + +### Step 3: Convert Code References +- Find all `.code-ref` instances across all HTML files +- Convert plain text paths to clickable GitHub links +- Use the standard format above + +### Step 4: Test +- Open each page and verify: + - Navigation links work + - Code reference links go to correct GitHub locations + - No broken links + +--- + +## Checklist for New Pages + +When creating a new page, ensure: + +- [ ] Navigation uses the standard format (only existing pages) +- [ ] All code references are clickable GitHub links +- [ ] Footer 
includes page navigation (Previous/Next if applicable) +- [ ] Page uses `css/style.css` +- [ ] Title follows pattern: "Page Title - How Twitter's Algorithm Really Works" +- [ ] Meta description included +- [ ] Links to related interactives where relevant + +--- + +## Notes + +- **Why GitHub links?** Users can verify claims by reading the actual code +- **Why simplified nav?** Don't promise what doesn't exist yet +- **target="_blank"** and **rel="noopener"**: Security best practice for external links +- **Line number format**: GitHub uses `#L123` for single line, `#L123-L456` for ranges diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 000000000..dba38461e --- /dev/null +++ b/docs/README.md @@ -0,0 +1,157 @@ +# How Twitter's Algorithm Really Works + +A code-based investigation of X's (Twitter's) open-source recommendation algorithm. + +**Live Site**: [View the investigation →](https://ernests.github.io/the-algorithm/) + +--- + +## What This Is + +In March 2023, X (formerly Twitter) open-sourced their recommendation algorithm. We analyzed the implementation—reading thousands of lines of Scala, examining ML model weights, and tracing data flows through the pipeline. + +This site presents our findings through: +- **Verified claims** backed by specific code references (file paths + line numbers) +- **Interactive explorations** to experience algorithmic mechanics hands-on +- **Concrete examples** with real calculations showing how the system works + +--- + +## Top Findings + +### 🤯 The Favorites Paradox +Likes have the **lowest** positive weight (0.5) while replies have 27x more value (13.5). Reply with author engagement: 75.0 weight (150x more valuable than a like). + +### 🔥 Conflict is Emergent, Not Intentional +The algorithm cannot distinguish agreement from disagreement—all replies get the same weight regardless of sentiment. Conflict amplification is a design limitation, not malicious intent. 
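
To make the two findings above concrete, here is a minimal sketch that uses only the three weights quoted in this README (0.5, 13.5, 75.0). The `Reply` type and the `reply_contribution` function are hypothetical names chosen for illustration, and the sketch treats each weight as a per-reply value, whereas the real ranker applies weights to predicted probabilities. Sentiment is never an input here, so a supportive reply and a hostile reply contribute identically.

```python
from dataclasses import dataclass

# Weights as quoted in this README; every other name here is hypothetical.
FAVORITE_WEIGHT = 0.5
REPLY_WEIGHT = 13.5
REPLY_ENGAGED_BY_AUTHOR_WEIGHT = 75.0


@dataclass
class Reply:
    text: str
    engaged_by_author: bool = False


def reply_contribution(reply: Reply) -> float:
    """Weight for one reply; the text (and so its sentiment) is never read."""
    if reply.engaged_by_author:
        return REPLY_ENGAGED_BY_AUTHOR_WEIGHT
    return REPLY_WEIGHT


supportive = Reply("Great point, thanks for sharing!")
hostile = Reply("This is completely wrong.")

# Both replies get the same weight because sentiment is not modeled.
same_weight = reply_contribution(supportive) == reply_contribution(hostile)

# Ratios quoted under "The Favorites Paradox".
reply_vs_like = REPLY_WEIGHT / FAVORITE_WEIGHT
engaged_reply_vs_like = REPLY_ENGAGED_BY_AUTHOR_WEIGHT / FAVORITE_WEIGHT

print(same_weight, reply_vs_like, engaged_reply_vs_like)
```

The ratios reproduce the 27x and 150x figures, and the equality check shows why a heated argument and a friendly thread look the same to a score built this way.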
+ +### 📊 Multiplicative Scoring = Mathematical Echo Chambers +Tweet scores use multiplication (`score = baseScore × clusterInterest`), not addition. Any imbalance compounds over time through reinforcement loops. + +### 👑 Verified Accounts Get 100x Multiplier +Verification provides a massive algorithmic advantage. Combined with other structural benefits, large verified accounts have a **348:1 reach advantage** over small accounts posting identical content. + +### ☢️ "Not Interested" is Nuclear +One click triggers 0.2x multiplier (80% penalty) with 140-day linear recovery. Removes an author from your feed for ~5 months. + +**[See all findings →](https://ernests.github.io/the-algorithm/#findings)** + +--- + +## Interactive Explorations + +Experience how the algorithm works through 9 interactive demos: + +### Understanding The Pipeline +- **Pipeline Explorer** - Follow a tweet through all 5 algorithmic stages +- **Engagement Calculator** - Calculate tweet scores with real weights + +### Understanding Your Algorithmic Identity +- **Cluster Explorer** - Discover which of ~145,000 communities you belong to +- **Algorithmic Identity Builder** - See your dual profiles (consumer vs creator) + +### Understanding Filter Bubbles & Echo Chambers +- **Journey Simulator** - Model how interests drift over time +- **Invisible Filter Demo** - See how personalization creates different realities +- **Reinforcement Loop Visualizer** - Watch feedback loops compound week by week + +### Understanding Structural Advantages +- **Algorithmic Aristocracy** - Explore how follower count creates different rules + +### Understanding Next-Generation Systems +- **Phoenix: Behavioral Prediction** - X's transformer-based system (likely in active A/B testing) that models 522 of your recent actions to predict what you'll do next + +**[Explore all interactive demos →](https://ernests.github.io/the-algorithm/#interactives)** + +--- + +## Our Approach + +**Objective Evidence**: Every claim backed by: +- 
File path (exact location in codebase) +- Line numbers (specific implementation) +- Code snippets (what it actually does) +- Explanation (how the mechanism works) +- Consequences (what it means for users and creators) + +**Verifiable**: The algorithm is [open source](https://github.com/twitter/the-algorithm). You can check our work. + +**Interactive**: We built simulators and calculators so you can experience the mechanics hands-on, not just read about them. + +--- + +## Who This Is For + +- **Users** wondering why their feed looks the way it does +- **Creators** optimizing for reach and engagement +- **Researchers** studying recommendation algorithms and their societal effects +- **Policy makers** understanding algorithmic amplification +- **Anyone curious** about how algorithmic systems shape online discourse + +--- + +## Technology + +- **Plain HTML/CSS/JavaScript** - No build step, no dependencies, fast loading +- **Interactive visualizations** - Chart.js for graphs, vanilla JS for simulators +- **Responsive design** - Works on desktop and mobile +- **Accessible** - Semantic HTML, keyboard navigation, text alternatives + +--- + +## About This Investigation + +This analysis was conducted by reading X's open-source algorithm code (released March 2023). All findings are based on the actual implementation, not speculation or reverse engineering. + +**Repository**: [github.com/twitter/the-algorithm](https://github.com/twitter/the-algorithm) + +**Methodology**: We read thousands of lines of Scala, traced data flows through pipelines, examined ML model configurations, and documented every mechanism with file paths and line numbers. 
+ +**Last Updated**: November 2025 + +--- + +## Structure + +``` +docs/ +├── index.html # Landing page with key findings +├── interactive/ # 9 interactive explorations +│ ├── pipeline-explorer.html +│ ├── engagement-calculator.html +│ ├── cluster-explorer.html +│ ├── algorithmic-identity.html +│ ├── journey-simulator.html +│ ├── invisible-filter.html +│ ├── reinforcement-loop.html +│ ├── algorithmic-aristocracy.html +│ └── phoenix-sequence-prediction.html +├── parts/ +│ └── reference.html # Code reference documentation +├── css/ +│ └── style.css # Clean, readable styling +├── js/ # Interactive widget scripts +└── assets/ # Images and data files +``` + +--- + +## Contributing + +Found an error or have a correction? [Open an issue](https://github.com/twitter/the-algorithm/issues) or submit a pull request. + +All claims should be backed by specific code references with file paths and line numbers. + +--- + +## License + +This documentation is provided for educational and research purposes. The analyzed algorithm code is owned by X Corp. + +--- + +## Questions? + +Open an issue on [GitHub](https://github.com/twitter/the-algorithm/issues) or explore the interactive demos to understand how the algorithm works. + +**Key Insight**: The algorithm is not neutral. It is designed for engagement, not for truth, diversity, or societal health. Understanding how it works is the first step to using it consciously rather than being shaped by it. diff --git a/docs/convert-code-refs.py b/docs/convert-code-refs.py new file mode 100644 index 000000000..7531f0eb5 --- /dev/null +++ b/docs/convert-code-refs.py @@ -0,0 +1,144 @@ +#!/usr/bin/env python3 +""" +Convert plain code references to clickable GitHub links. 
+ +Usage: + python3 convert-code-refs.py +""" + +import re +import sys +from pathlib import Path + +GITHUB_BASE = "https://github.com/twitter/the-algorithm/blob/main/" +GITHUB_SEARCH = "https://github.com/twitter/the-algorithm/search?q=" + + +def extract_filename(path): + """Extract just the filename from a full path.""" + return Path(path).name + + +def convert_line_numbers(lines): + """Convert line number format to GitHub fragment.""" + if not lines: + return "" + if "-" in lines: + start, end = lines.split("-") + return f"#L{start}-L{end}" + return f"#L{lines}" + + +def create_github_link(path, lines=None): + """Create a GitHub URL for the given path and optional line numbers.""" + # Check if it's just a constant/variable name (no slashes or extension) + if "/" not in path and "." not in path: + # It's a constant like "favScoreHalfLife100Days" + return GITHUB_SEARCH + path + + # Normal file path + url = GITHUB_BASE + path + if lines: + url += convert_line_numbers(lines) + return url + + +def convert_code_ref(match): + """Convert a single code-ref match to a linked version.""" + full_match = match.group(0) + + # Extract the structure + # Pattern:

Label: path/to/file.scala:123-456 + # or: Code: path

+ + # Check if it already has a link + if '([^<]+)' + code_match = re.search(code_pattern, full_match) + + if not code_match: + return full_match # No code block found + + code_content = code_match.group(1).strip() + + # Parse the code content + # Could be: "path/to/file.scala:123-456" or "path/to/file.scala" or "constantName" + if ":" in code_content: + path, lines = code_content.rsplit(":", 1) + else: + path = code_content + lines = None + + # Create GitHub link + github_url = create_github_link(path, lines) + + # Extract just the filename for display + if "/" in path: + display_name = extract_filename(path) + if lines: + display_name += f":{lines}" + else: + # It's a constant or short name, keep as is + display_name = code_content + + # Build the replacement + # Find the position of tag + code_start = full_match.find('') + code_end = full_match.find('') + len('') + + before = full_match[:code_start] + after = full_match[code_end:] + + linked_code = f'{display_name}' + + return before + linked_code + after + + +def process_file(filepath): + """Process a single HTML file.""" + with open(filepath, 'r', encoding='utf-8') as f: + content = f.read() + + # Pattern to match code-ref blocks + # Match from

+ pattern = r'

]*>.*?

' + + # Count matches before + matches_before = len(re.findall(pattern, content)) + + # Convert all code-refs + new_content = re.sub(pattern, convert_code_ref, content, flags=re.DOTALL) + + # Count links after + matches_after = len(re.findall(r'

]*>.*? [ ...]") + sys.exit(1) + + total_before = 0 + total_after = 0 + + for filepath in sys.argv[1:]: + print(f"Processing {filepath}...") + before, after = process_file(filepath) + total_before += before + total_after += after + print(f" ✓ Converted {after}/{before} code references") + + print(f"\nTotal: {total_after}/{total_before} code references converted") + + +if __name__ == "__main__": + main() diff --git a/docs/css/style.css b/docs/css/style.css new file mode 100644 index 000000000..2867be91b --- /dev/null +++ b/docs/css/style.css @@ -0,0 +1,2116 @@ +/* Modern Light Blog Theme */ + +* { + margin: 0; + padding: 0; + box-sizing: border-box; +} + +:root { + /* Light theme colors */ + --primary-color: #1DA1F2; + --primary-hover: #1a8cd8; + --text-primary: #1a1a1a; + --text-secondary: #6b6b6b; + --text-muted: #9b9b9b; + --background: #ffffff; + --background-alt: #f7f9fa; + --border-color: #e1e8ed; + --code-bg: #f5f7f8; + --code-border: #d9e1e8; + --highlight-bg: #fff4cc; + --success: #17bf63; + --warning: #ff6b6b; + --info: #1DA1F2; + + /* Typography */ + --font-sans: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif; + --font-serif: Charter, Georgia, 'Times New Roman', serif; + --font-mono: "SF Mono", Monaco, "Cascadia Code", "Roboto Mono", Consolas, monospace; + + /* Spacing */ + --max-width: 720px; + --content-padding: 2rem; +} + +body { + font-family: var(--font-serif); + font-size: 20px; + line-height: 1.6; + color: var(--text-primary); + background-color: var(--background); + padding: 0; + margin: 0; + -webkit-font-smoothing: antialiased; + -moz-osx-font-smoothing: grayscale; +} + +/* Navigation */ +nav { + background-color: var(--background); + border-bottom: 1px solid var(--border-color); + position: sticky; + top: 0; + z-index: 100; + padding: 1rem 0; + backdrop-filter: blur(10px); + background-color: rgba(255, 255, 255, 0.95); +} + +nav .nav-container { + max-width: var(--max-width); + margin: 0 auto; + 
padding: 0 var(--content-padding); + display: flex; + align-items: center; + justify-content: space-between; + gap: 2rem; + flex-wrap: wrap; +} + +nav h1 { + font-family: var(--font-sans); + font-size: 0.95rem; + font-weight: 600; + margin: 0; + letter-spacing: -0.01em; +} + +nav h1 a { + color: var(--text-primary); + text-decoration: none; +} + +nav h1 a:hover { + color: var(--primary-color); +} + +.nav-links { + display: flex; + align-items: center; + gap: 1.5rem; + font-family: var(--font-sans); +} + +.nav-links a { + color: var(--text-secondary); + text-decoration: none; + font-size: 0.875rem; + font-weight: 500; + transition: color 0.2s; + white-space: nowrap; +} + +.nav-links a:hover { + color: var(--primary-color); +} + +/* Main Content */ +main { + max-width: var(--max-width); + margin: 3rem auto 4rem; + padding: 0 var(--content-padding); +} + +/* Typography */ +h1 { + font-family: var(--font-sans); + font-size: 2.75rem; + font-weight: 800; + line-height: 1.15; + margin: 0 0 1.5rem; + color: var(--text-primary); + letter-spacing: -0.03em; +} + +h2 { + font-family: var(--font-sans); + font-size: 2rem; + font-weight: 700; + line-height: 1.25; + margin: 3rem 0 1rem; + color: var(--text-primary); + letter-spacing: -0.02em; +} + +h2:first-of-type { + margin-top: 2rem; +} + +h3 { + font-family: var(--font-sans); + font-size: 1.5rem; + font-weight: 600; + line-height: 1.3; + margin: 2rem 0 0.75rem; + color: var(--text-primary); + letter-spacing: -0.01em; +} + +h4 { + font-family: var(--font-sans); + font-size: 1.125rem; + font-weight: 600; + margin: 1.5rem 0 0.5rem; + color: var(--text-secondary); +} + +p { + margin: 1.5rem 0; + color: var(--text-primary); +} + +a { + color: var(--primary-color); + text-decoration: none; + transition: color 0.2s; +} + +a:hover { + color: var(--primary-hover); + text-decoration: underline; +} + +strong { + font-weight: 600; + color: var(--text-primary); +} + +em { + font-style: italic; +} + +hr { + border: none; + border-top: 1px 
solid var(--border-color); + margin: 3rem 0; +} + +/* Lists */ +ul, ol { + margin: 1.5rem 0; + padding-left: 2rem; + color: var(--text-primary); +} + +li { + margin: 0.75rem 0; + line-height: 1.6; +} + +li::marker { + color: var(--text-secondary); +} + +/* Code */ +code { + font-family: var(--font-mono); + font-size: 0.875em; + background-color: var(--code-bg); + padding: 0.2em 0.4em; + border-radius: 3px; + border: 1px solid var(--code-border); + color: var(--text-primary); +} + +pre { + background-color: var(--code-bg); + padding: 1.5rem; + border-radius: 6px; + overflow-x: auto; + margin: 2rem 0; + border: 1px solid var(--code-border); + line-height: 1.5; +} + +pre code { + background: none; + padding: 0; + border: none; + font-size: 0.875rem; +} + +/* Blockquotes */ +blockquote { + border-left: 3px solid var(--primary-color); + padding-left: 1.5rem; + margin: 2rem 0; + color: var(--text-secondary); + font-style: italic; + font-size: 1.1em; +} + +/* Tables */ +table { + width: 100%; + border-collapse: collapse; + margin: 2rem 0; + font-family: var(--font-sans); + font-size: 0.9rem; +} + +thead { + background-color: var(--background-alt); +} + +th, td { + padding: 0.75rem 1rem; + text-align: left; + border: 1px solid var(--border-color); +} + +th { + font-weight: 600; + color: var(--text-primary); +} + +tbody tr:nth-child(even) { + background-color: var(--background-alt); +} + +/* Callout Boxes */ +.callout { + background-color: var(--background-alt); + border-left: 4px solid var(--info); + padding: 1.5rem; + margin: 2rem 0; + border-radius: 4px; + font-size: 0.95em; +} + +.callout p:first-child { + margin-top: 0; +} + +.callout p:last-child { + margin-bottom: 0; +} + +.callout.warning { + border-left-color: var(--warning); + background-color: #fff5f5; +} + +.callout.success { + border-left-color: var(--success); + background-color: #f0fdf4; +} + +/* Finding Cards */ +.finding-card { + background-color: var(--background-alt); + border: 1px solid 
var(--border-color); + border-radius: 8px; + padding: 2rem; + margin: 2rem 0; + transition: box-shadow 0.3s, border-color 0.3s; +} + +.finding-card:hover { + box-shadow: 0 4px 12px rgba(0, 0, 0, 0.08); + border-color: var(--primary-color); +} + +.finding-card h3 { + margin-top: 0; + font-size: 1.375rem; +} + +.finding-card h3 a { + color: var(--text-primary); + text-decoration: none; +} + +.finding-card h3 a:hover { + color: var(--primary-color); +} + +.finding-card p { + font-family: var(--font-sans); + font-size: 1rem; + line-height: 1.5; + color: var(--text-secondary); +} + +.finding-card .consequence { + margin-top: 1.5rem; + padding-top: 1.5rem; + border-top: 1px solid var(--border-color); + color: var(--text-secondary); + font-size: 0.95em; +} + +.finding-card .code-ref { + font-family: var(--font-mono); + font-size: 0.8rem; + color: var(--text-muted); + margin-top: 1rem; + display: block; +} + +/* Page Navigation */ +.page-nav { + display: flex; + justify-content: space-between; + align-items: center; + margin: 4rem 0 2rem; + padding: 2rem 0; + border-top: 1px solid var(--border-color); + font-family: var(--font-sans); +} + +.page-nav a { + padding: 0.75rem 1.5rem; + background-color: var(--background); + border: 2px solid var(--border-color); + border-radius: 6px; + font-weight: 600; + color: var(--text-primary); + transition: all 0.2s; + text-decoration: none; +} + +.page-nav a:hover { + background-color: var(--primary-color); + border-color: var(--primary-color); + color: white; + transform: translateY(-2px); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.3); +} + +/* Interactive Widgets */ +.widget { + background-color: var(--background-alt); + border: 2px solid var(--primary-color); + border-radius: 8px; + padding: 2rem; + margin: 2rem 0; + font-family: var(--font-sans); +} + +.widget h3 { + margin-top: 0; + color: var(--primary-color); + font-size: 1.25rem; +} + +.widget input, +.widget select, +.widget button { + font-family: var(--font-sans); + 
font-size: 1rem; + padding: 0.75rem; + border-radius: 6px; + border: 2px solid var(--border-color); + margin: 0.5rem 0; + background-color: var(--background); + color: var(--text-primary); +} + +.widget button { + cursor: pointer; + background-color: var(--primary-color); + border-color: var(--primary-color); + color: white; + font-weight: 600; + transition: all 0.2s; +} + +.widget button:hover { + background-color: var(--primary-hover); + transform: translateY(-1px); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.3); +} + +/* Footer */ +footer { + max-width: var(--max-width); + margin: 4rem auto 2rem; + padding: 2rem var(--content-padding); + border-top: 1px solid var(--border-color); + color: var(--text-secondary); + text-align: center; + font-family: var(--font-sans); + font-size: 0.875rem; +} + +footer p { + margin: 0.5rem 0; + font-size: 0.875rem; +} + +/* Intro/Lead Paragraph */ +main > p:first-of-type { + font-size: 1.25em; + line-height: 1.5; + color: var(--text-secondary); + margin: 1.5rem 0 2rem; +} + +/* Highlight text */ +mark { + background-color: var(--highlight-bg); + padding: 0.1em 0.2em; + border-radius: 2px; +} + +/* Responsive Design */ +@media (max-width: 768px) { + :root { + --content-padding: 1.5rem; + } + + body { + font-size: 18px; + } + + h1 { + font-size: 2rem; + } + + h2 { + font-size: 1.5rem; + } + + h3 { + font-size: 1.25rem; + } + + nav .nav-container { + flex-direction: column; + align-items: flex-start; + gap: 0.75rem; + } + + .nav-links { + flex-direction: column; + align-items: flex-start; + gap: 0.75rem; + width: 100%; + } + + .page-nav { + flex-direction: column; + gap: 1rem; + align-items: stretch; + } + + .page-nav a { + text-align: center; + } + + main { + margin: 2rem auto 3rem; + } + + pre { + padding: 1rem; + font-size: 0.8rem; + } + + table { + font-size: 0.8rem; + } + + th, td { + padding: 0.5rem; + } +} + +@media (max-width: 480px) { + body { + font-size: 17px; + } + + h1 { + font-size: 1.75rem; + } + + h2 { + font-size: 
1.375rem; + } + + .finding-card { + padding: 1.5rem; + } +} + +/* Print Styles */ +@media print { + nav { + display: none; + } + + body { + font-size: 12pt; + color: black; + } + + a { + color: black; + text-decoration: underline; + } + + .page-nav, + footer { + display: none; + } + + pre { + border: 1px solid #ccc; + page-break-inside: avoid; + } + + h2, h3 { + page-break-after: avoid; + } +} + +/* Smooth scroll */ +html { + scroll-behavior: smooth; +} + +/* Selection */ +::selection { + background-color: rgba(29, 161, 242, 0.2); + color: var(--text-primary); +} + +/* Focus styles for accessibility */ +a:focus, +button:focus, +input:focus { + outline: 2px solid var(--primary-color); + outline-offset: 2px; +} + +/* Loading animation for future use */ +@keyframes fadeIn { + from { + opacity: 0; + transform: translateY(10px); + } + to { + opacity: 1; + transform: translateY(0); + } +} + +main { + animation: fadeIn 0.5s ease-out; +} + +/* Engagement Calculator Specific Styles */ +.engagement-input { + margin: 1.5rem 0; + padding: 1rem; + background-color: var(--background); + border: 1px solid var(--border-color); + border-radius: 6px; +} + +.engagement-input label { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 0.75rem; + font-weight: 600; + font-family: var(--font-sans); +} + +.engagement-input .label-text { + font-size: 1rem; + color: var(--text-primary); +} + +.engagement-input .probability-display { + font-size: 1.125rem; + color: var(--primary-color); + font-weight: 700; + min-width: 50px; + text-align: right; +} + +.engagement-input .probability-slider { + width: 100%; + height: 8px; + border-radius: 4px; + background: linear-gradient(to right, var(--background-alt), var(--primary-color)); + outline: none; + -webkit-appearance: none; + appearance: none; +} + +.engagement-input .probability-slider::-webkit-slider-thumb { + -webkit-appearance: none; + appearance: none; + width: 20px; + height: 20px; + border-radius: 
50%; + background: var(--primary-color); + cursor: pointer; + border: 2px solid white; + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2); +} + +.engagement-input .probability-slider::-moz-range-thumb { + width: 20px; + height: 20px; + border-radius: 50%; + background: var(--primary-color); + cursor: pointer; + border: 2px solid white; + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2); +} + +.engagement-input .help-text { + font-size: 0.875rem; + color: var(--text-secondary); + margin: 0.5rem 0 0 0; + font-family: var(--font-sans); +} + +.scenario-btn { + padding: 0.625rem 1.25rem; + background-color: var(--background-alt); + border: 2px solid var(--border-color); + border-radius: 6px; + font-weight: 600; + font-family: var(--font-sans); + font-size: 0.875rem; + color: var(--text-primary); + cursor: pointer; + transition: all 0.2s; +} + +.scenario-btn:hover { + background-color: var(--background); + border-color: var(--primary-color); + color: var(--primary-color); + transform: translateY(-1px); + box-shadow: 0 2px 8px rgba(29, 161, 242, 0.2); +} + +/* Scenario Grid */ +.scenario-grid { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); + gap: 1.25rem; + margin: 2rem 0; +} + +.scenario-card { + background-color: var(--background); + border: 2px solid var(--border-color); + border-radius: 8px; + padding: 1.5rem; + cursor: pointer; + transition: all 0.2s; + position: relative; +} + +.scenario-card:hover { + border-color: var(--primary-color); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.15); + transform: translateY(-2px); +} + +.scenario-card.selected { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.05); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.2); +} + +.scenario-card h4 { + margin: 0 0 0.75rem 0; + font-size: 1.125rem; + font-family: var(--font-sans); + color: var(--text-primary); +} + +.scenario-card .scenario-description { + font-size: 0.9rem; + color: var(--text-primary); + font-style: italic; + margin: 0.5rem 
0; + line-height: 1.4; + min-height: 2.8em; +} + +.scenario-card .scenario-engagement { + font-size: 0.8rem; + color: var(--text-secondary); + margin: 0.75rem 0 0 0; + font-family: var(--font-sans); +} + +@media (max-width: 768px) { + .scenario-grid { + grid-template-columns: 1fr; + gap: 1rem; + } + + .scenario-card { + padding: 1.25rem; + } +} + +/* Invisible Filter Specific Styles */ +.profile-configurator { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); + gap: 2rem; + margin: 2rem 0; +} + +.profile-config { + background-color: var(--background-alt); + padding: 2rem; + border-radius: 8px; + border: 2px solid var(--border-color); +} + +.cluster-slider { + margin: 1.5rem 0; +} + +.cluster-slider label { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 0.5rem; + font-family: var(--font-sans); + font-weight: 600; +} + +.cluster-name { + font-size: 1rem; +} + +.cluster-value { + font-size: 1.125rem; + font-weight: 700; + min-width: 50px; + text-align: right; +} + +.cluster-slider input[type="range"] { + width: 100%; + height: 8px; + border-radius: 4px; + background: var(--border-color); + outline: none; + -webkit-appearance: none; + appearance: none; +} + +.cluster-slider input[type="range"]::-webkit-slider-thumb { + -webkit-appearance: none; + appearance: none; + width: 20px; + height: 20px; + border-radius: 50%; + background: var(--primary-color); + cursor: pointer; + border: 2px solid white; + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2); +} + +.cluster-slider input[type="range"]::-moz-range-thumb { + width: 20px; + height: 20px; + border-radius: 50%; + background: var(--primary-color); + cursor: pointer; + border: 2px solid white; + box-shadow: 0 2px 4px rgba(0, 0, 0, 0.2); +} + +.friend-selector { + display: grid; + gap: 0.75rem; +} + +.friend-btn { + padding: 1rem; + background-color: var(--background); + border: 2px solid var(--border-color); + border-radius: 6px; + font-family: 
var(--font-sans); + font-weight: 600; + font-size: 0.95rem; + color: var(--text-primary); + cursor: pointer; + transition: all 0.2s; + text-align: left; + line-height: 1.4; +} + +.friend-btn:hover { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.05); +} + +.friend-btn.active { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.1); + box-shadow: 0 2px 8px rgba(29, 161, 242, 0.2); +} + +.view-toggle-container { + background-color: var(--background-alt); + padding: 1.5rem; + border-radius: 8px; + border: 2px solid var(--border-color); +} + +.view-toggle { + display: flex; + gap: 1rem; + justify-content: center; +} + +.view-btn { + flex: 1; + max-width: 200px; + padding: 1rem 1.5rem; + background-color: var(--background); + border: 2px solid var(--border-color); + border-radius: 6px; + font-family: var(--font-sans); + font-weight: 600; + font-size: 1rem; + color: var(--text-primary); + cursor: pointer; + transition: all 0.2s; +} + +.view-btn:hover { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.05); +} + +.view-btn.active { + border-color: var(--primary-color); + background-color: var(--primary-color); + color: white; + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.3); +} + +.tweet-list { + display: flex; + flex-direction: column; + gap: 1rem; +} + +.tweet-card { + background-color: var(--background-alt); + border: 2px solid var(--border-color); + border-radius: 8px; + padding: 1.25rem; + transition: all 0.2s; +} + +.tweet-card:hover { + box-shadow: 0 4px 12px rgba(0, 0, 0, 0.08); +} + +.tweet-card.rank-different { + border-color: var(--warning); + background-color: rgba(255, 107, 107, 0.05); +} + +.tweet-header { + display: flex; + align-items: center; + gap: 0.75rem; + margin-bottom: 0.75rem; + flex-wrap: wrap; +} + +.tweet-rank { + display: inline-block; + padding: 0.25rem 0.625rem; + border-radius: 4px; + color: white; + font-family: var(--font-sans); + font-weight: 700; + 
font-size: 0.875rem; +} + +.tweet-ranks { + display: flex; + align-items: center; + gap: 0.5rem; +} + +.rank-badge { + display: flex; + align-items: center; + gap: 0.375rem; + padding: 0.375rem 0.75rem; + border-radius: 6px; + font-family: var(--font-sans); + font-size: 0.85rem; + background-color: var(--background); + border: 1.5px solid var(--border-color); +} + +.rank-badge.primary { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.1); + font-weight: 700; +} + +.rank-badge.secondary { + opacity: 0.7; +} + +.rank-number { + font-weight: 700; + font-size: 0.95rem; +} + +.rank-label { + font-size: 0.75rem; + opacity: 0.8; +} + +.tweet-cluster { + font-family: var(--font-sans); + font-weight: 600; + font-size: 0.875rem; +} + +.rank-diff { + margin-left: auto; + padding: 0.25rem 0.5rem; + background-color: var(--warning); + color: white; + border-radius: 4px; + font-family: var(--font-sans); + font-weight: 700; + font-size: 0.75rem; +} + +.tweet-content { + font-size: 0.95rem; + line-height: 1.5; + color: var(--text-primary); + margin: 0.75rem 0; +} + +.tweet-author { + font-family: var(--font-sans); + font-size: 0.875rem; + color: var(--text-secondary); + margin: 0.5rem 0; +} + +.tweet-score { + margin-top: 1rem; + padding-top: 1rem; + border-top: 1px solid var(--border-color); + font-family: var(--font-mono); + font-size: 0.8rem; +} + +.score-breakdown { + display: flex; + justify-content: space-between; + margin: 0.25rem 0; + color: var(--text-secondary); +} + +.score-breakdown.score-total { + margin-top: 0.5rem; + padding-top: 0.5rem; + border-top: 1px solid var(--border-color); + font-weight: 700; + color: var(--text-primary); + font-size: 0.875rem; +} + +@media (max-width: 1024px) { + .feeds-comparison { + grid-template-columns: 1fr; + } +} + +@media (max-width: 768px) { + .profile-configurator { + grid-template-columns: 1fr; + } +} + +/* Reinforcement Loop Specific Styles */ +.loop-config { + background-color: 
var(--background-alt); + padding: 2rem; + border-radius: 8px; + border: 2px solid var(--border-color); + max-width: 500px; + margin: 2rem auto; +} + +.cluster-input-group { + margin: 1.5rem 0; +} + +.cluster-input { + margin: 1rem 0; +} + +.cluster-input label { + display: flex; + justify-content: space-between; + margin-bottom: 0.5rem; + font-family: var(--font-sans); + font-weight: 600; +} + +.cluster-input input[type="range"] { + width: 100%; +} + +.week-header { + text-align: center; + margin-bottom: 2rem; +} + +.week-header h2 { + margin: 0; + color: var(--primary-color); +} + +.stage-display { + min-height: 400px; + margin: 2rem 0; +} + +.stage-card { + background-color: var(--background-alt); + border: 2px solid var(--primary-color); + border-radius: 8px; + padding: 2rem; +} + +.stage-card h3 { + margin-top: 0; + color: var(--primary-color); +} + +.stage-explanation { + margin-top: 1.5rem; + padding-top: 1.5rem; + border-top: 2px solid var(--border-color); + color: var(--text-secondary); + font-size: 0.95rem; + line-height: 1.6; +} + +/* Profile Bars */ +.profile-bars { + margin: 1.5rem 0; +} + +.profile-bar { + margin: 1rem 0; +} + +.profile-bar-label { + display: flex; + justify-content: space-between; + margin-bottom: 0.5rem; + font-family: var(--font-sans); + font-size: 0.95rem; +} + +.profile-bar-fill { + height: 32px; + border-radius: 4px; + display: flex; + align-items: center; + justify-content: center; + color: white; + font-weight: 700; + font-family: var(--font-sans); + transition: width 0.3s ease; +} + +/* Candidates Display */ +.candidates-display { + margin: 1.5rem 0; + background-color: var(--background); + padding: 1.5rem; + border-radius: 6px; + border: 1px solid var(--border-color); +} + +.candidate-row { + display: flex; + justify-content: space-between; + padding: 0.75rem 0; + border-bottom: 1px solid var(--border-color); +} + +.candidate-row:last-child { + border-bottom: none; +} + +/* Scoring Examples */ +.scoring-examples { + margin: 
1.5rem 0; + display: grid; + gap: 1.5rem; +} + +.scoring-example { + background-color: var(--background); + border-radius: 6px; + padding: 1.5rem; + border: 1px solid var(--border-color); +} + +.tweet-preview { + padding: 1rem; + background-color: var(--background-alt); + border-radius: 4px; + margin-bottom: 1rem; + font-size: 0.95rem; +} + +.scoring-calc { + font-family: var(--font-mono); + font-size: 0.875rem; +} + +.scoring-calc code { + display: block; + padding: 0.375rem 0; + background: none; + border: none; +} + +/* Feed Composition */ +.feed-composition { + display: flex; + height: 100px; + margin: 1.5rem 0; + border-radius: 6px; + overflow: hidden; +} + +.feed-section { + display: flex; + align-items: center; + justify-content: center; + transition: flex 0.3s ease; +} + +.feed-section-content { + text-align: center; + font-family: var(--font-sans); +} + +.feed-section-content strong { + display: block; + font-size: 1.5rem; + margin-bottom: 0.25rem; +} + +/* Engagement Pattern */ +.engagement-pattern { + margin: 1.5rem 0; +} + +.engagement-row { + margin: 1rem 0; +} + +.engagement-row > span:first-child { + display: block; + margin-bottom: 0.5rem; + font-family: var(--font-sans); +} + +.engagement-bar { + position: relative; + height: 40px; + background-color: var(--background); + border-radius: 6px; + border: 1px solid var(--border-color); + overflow: hidden; +} + +.engagement-fill { + height: 100%; + display: flex; + align-items: center; + justify-content: center; + color: white; + font-weight: 700; + font-family: var(--font-sans); + font-size: 0.9rem; + transition: width 0.3s ease; +} + +/* Profile Update */ +.profile-update { + margin: 1.5rem 0; + background-color: var(--background); + padding: 1.5rem; + border-radius: 6px; + border: 1px solid var(--border-color); +} + +.update-row { + display: flex; + justify-content: space-between; + align-items: center; + padding: 0.75rem; + font-family: var(--font-sans); +} + +.update-label { + font-weight: 600; + 
color: var(--text-secondary); +} + +.update-value { + font-size: 1.05rem; +} + +.update-arrow { + text-align: center; + font-size: 1.5rem; + color: var(--primary-color); + font-weight: 700; + padding: 0.5rem 0; +} + +.update-diff { + margin-top: 1rem; + padding-top: 1rem; + border-top: 1px solid var(--border-color); + text-align: center; + font-family: var(--font-mono); + font-size: 0.9rem; + color: var(--warning); + font-weight: 600; +} + +/* Loop Navigation */ +.loop-navigation { + display: flex; + justify-content: center; + gap: 1rem; + margin: 2rem 0; +} + +.loop-nav-btn { + padding: 1rem 2rem; + font-size: 1.05rem; + font-weight: 600; + font-family: var(--font-sans); + border-radius: 6px; + cursor: pointer; + transition: all 0.2s; + border: none; +} + +.loop-nav-btn.primary { + background-color: var(--primary-color); + color: white; +} + +.loop-nav-btn.primary:hover { + background-color: var(--primary-hover); + transform: translateY(-2px); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.3); +} + +.loop-nav-btn.secondary { + background-color: var(--background-alt); + border: 2px solid var(--border-color); + color: var(--text-primary); +} + +.loop-nav-btn.secondary:hover { + border-color: var(--primary-color); + color: var(--primary-color); +} + +/* Progress Tracker */ +.progress-tracker { + display: flex; + align-items: center; + justify-content: center; + margin: 3rem 0; + padding: 2rem; + background-color: var(--background-alt); + border-radius: 8px; + flex-wrap: wrap; + gap: 0.5rem; +} + +.progress-item { + display: flex; + flex-direction: column; + align-items: center; + gap: 0.5rem; +} + +.progress-item span { + font-size: 0.75rem; + font-family: var(--font-sans); + font-weight: 600; + color: var(--text-secondary); + text-align: center; + min-width: 80px; +} + +.progress-dot { + width: 16px; + height: 16px; + border-radius: 50%; + background-color: var(--border-color); + transition: all 0.3s; +} + +.progress-item.active .progress-dot { + width: 24px; + 
height: 24px; + background-color: var(--primary-color); + box-shadow: 0 0 0 4px rgba(29, 161, 242, 0.2); +} + +.progress-item.completed .progress-dot { + background-color: var(--success); +} + +.progress-item.active span { + color: var(--primary-color); + font-weight: 700; +} + +.progress-line { + width: 40px; + height: 2px; + background-color: var(--border-color); +} + +/* History Timeline */ +.history-timeline { + margin: 2rem 0; +} + +.history-entry { + display: flex; + align-items: center; + gap: 1rem; + margin: 0.75rem 0; +} + +.history-week { + min-width: 80px; + font-family: var(--font-sans); + font-weight: 600; + font-size: 0.9rem; + color: var(--text-secondary); +} + +.history-bar { + flex: 1; + display: flex; + height: 32px; + border-radius: 4px; + overflow: hidden; + border: 1px solid var(--border-color); +} + +.history-segment { + display: flex; + align-items: center; + justify-content: center; + color: white; + font-family: var(--font-sans); + font-weight: 700; + font-size: 0.85rem; + transition: width 0.3s ease; +} + +/* Projection display */ +.projection-summary { + background-color: rgba(29, 161, 242, 0.05); + border: 1px solid var(--border-color); + border-radius: 8px; + padding: 1.5rem; + margin: 1.5rem 0; +} + +.projection-row { + display: flex; + justify-content: space-between; + align-items: center; + padding: 0.75rem 0; +} + +.projection-label { + font-weight: 600; + color: var(--text-secondary); +} + +.projection-value { + font-size: 1.1rem; + font-family: var(--font-mono); +} + +.projection-arrow { + text-align: center; + font-size: 1.5rem; + color: var(--text-secondary); + margin: 0.5rem 0; +} + +.projection-diff { + text-align: center; + margin-top: 1rem; + padding-top: 1rem; + border-top: 1px solid var(--border-color); + font-weight: 600; + color: var(--warning); + font-size: 1.05rem; +} + +.projection-timeline { + display: flex; + flex-direction: column; + gap: 1rem; + margin-top: 1rem; +} + +.projection-timeline.long { + gap: 0.75rem; 
+} + +.projection-week { + display: flex; + align-items: center; + gap: 1rem; +} + +.projection-week-label { + min-width: 80px; + font-weight: 600; + color: var(--text-secondary); + font-family: var(--font-mono); +} + +.projection-week-bar { + flex: 1; + display: flex; + height: 40px; + border-radius: 6px; + overflow: hidden; + border: 1px solid var(--border-color); +} + +.projection-segment { + display: flex; + align-items: center; + justify-content: center; + color: white; + font-weight: 700; + font-size: 0.9rem; + font-family: var(--font-mono); + transition: all 0.3s ease; +} + +/* Pipeline Explorer */ +.pipeline-scenarios { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); + gap: 1.5rem; + margin: 2rem 0; +} + +.pipeline-scenario-card { + background-color: rgba(29, 161, 242, 0.05); + border: 2px solid var(--border-color); + border-radius: 8px; + padding: 1.5rem; + cursor: pointer; + transition: all 0.2s ease; +} + +.pipeline-scenario-card:hover { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.1); + transform: translateY(-2px); + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.2); +} + +.scenario-header { + display: flex; + justify-content: space-between; + align-items: flex-start; + margin-bottom: 0.75rem; +} + +.scenario-header h4 { + margin: 0; + font-size: 1.1rem; +} + +.scenario-badge { + padding: 0.25rem 0.75rem; + border-radius: 4px; + font-size: 0.75rem; + font-weight: 600; + text-transform: uppercase; +} + +.scenario-badge.in-network { + background-color: rgba(23, 191, 99, 0.2); + color: var(--success); + border: 1px solid var(--success); +} + +.scenario-badge.out-of-network { + background-color: rgba(255, 149, 0, 0.2); + color: var(--warning); + border: 1px solid var(--warning); +} + +.scenario-desc { + color: var(--text-secondary); + font-size: 0.9rem; + margin: 0.5rem 0 1rem 0; +} + +.scenario-stats { + display: flex; + flex-wrap: wrap; + gap: 0.5rem; + font-size: 0.85rem; + color: 
var(--text-secondary); + font-family: var(--font-mono); +} + +/* Pipeline Funnel */ +.pipeline-overview { + margin: 2rem 0; + padding: 2rem; + background-color: rgba(29, 161, 242, 0.05); + border-radius: 8px; + border: 1px solid var(--border-color); +} + +.pipeline-funnel { + display: flex; + flex-direction: column; + align-items: center; + gap: 0.5rem; +} + +.funnel-stage { + display: flex; + flex-direction: column; + align-items: center; + padding: 1rem 2rem; + background-color: var(--background-color, var(--background)); + border: 2px solid var(--border-color); + border-radius: 8px; + min-width: 200px; + transition: all 0.3s ease; +} + +.funnel-stage.active { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.1); + box-shadow: 0 0 20px rgba(29, 161, 242, 0.3); +} + +.funnel-stage.completed { + border-color: var(--success); + background-color: rgba(23, 191, 99, 0.05); +} + +.funnel-count { + font-size: 1.5rem; + font-weight: 700; + font-family: var(--font-mono); + color: var(--primary-color); +} + +.funnel-label { + font-size: 0.9rem; + color: var(--text-secondary); + margin-top: 0.25rem; +} + +.funnel-arrow { + font-size: 1.5rem; + color: var(--text-secondary); + opacity: 0.5; +} + +/* Stage Detail */ +.stage-detail { + margin: 3rem 0; +} + +.stage-detail h3 { + color: var(--primary-color); + margin-bottom: 1rem; +} + +.stage-content { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); + gap: 1.5rem; + margin: 2rem 0; +} + +.detail-card { + background-color: rgba(29, 161, 242, 0.05); + border: 1px solid var(--border-color); + border-radius: 8px; + padding: 1.5rem; +} + +.detail-card h4 { + margin: 0 0 1rem 0; + font-size: 1rem; + color: var(--text-secondary); +} + +.detail-value { + font-size: 1.3rem; + font-weight: 700; + margin-bottom: 0.75rem; + color: var(--text-color, var(--text-primary)); +} + +.detail-value.in-network { + color: var(--success); +} + +.detail-value.out-of-network { + color: var(--warning); +} + +.detail-explanation { +
font-size: 0.9rem; + color: var(--text-secondary); + line-height: 1.6; + margin: 0; +} + +.feature-list { + list-style: none; + padding: 0; + margin: 0; +} + +.feature-list li { + padding: 0.5rem 0; + border-bottom: 1px solid var(--border-color); + font-size: 0.9rem; +} + +.feature-list li:last-child { + border-bottom: none; +} + +.stage-note { + background-color: rgba(255, 149, 0, 0.1); + border-left: 4px solid var(--warning); + padding: 1rem 1.5rem; + margin: 2rem 0; + border-radius: 4px; +} + +/* Scoring Formula */ +.scoring-formula { + text-align: center; + margin: 1.5rem 0; + padding: 1rem; + background-color: rgba(29, 161, 242, 0.05); + border-radius: 8px; +} + +.scoring-formula code { + font-size: 1.2rem; + font-weight: 600; +} + +/* Score Breakdown Table */ +.score-breakdown-table { + background-color: var(--code-background); + border-radius: 8px; + overflow: hidden; + border: 1px solid var(--border-color); +} + +.breakdown-header, +.breakdown-row, +.breakdown-total { + display: grid; + grid-template-columns: 2fr 1fr 1fr 1fr; + gap: 1rem; + padding: 0.75rem 1rem; + align-items: center; +} + +.breakdown-header { + background-color: rgba(29, 161, 242, 0.1); + font-weight: 600; + font-size: 0.85rem; + text-transform: uppercase; + color: var(--text-secondary); + border-bottom: 2px solid var(--border-color); +} + +.breakdown-row { + border-bottom: 1px solid var(--border-color); + font-family: var(--font-mono); + font-size: 0.9rem; +} + +.breakdown-row:last-of-type { + border-bottom: 2px solid var(--border-color); +} + +.breakdown-row.negative { + background-color: rgba(255, 59, 48, 0.05); +} + +.breakdown-type { + font-weight: 500; +} + +.breakdown-contribution.positive { + color: var(--success); + font-weight: 700; +} + +.breakdown-contribution.negative { + color: var(--danger); + font-weight: 700; +} + +.breakdown-total { + background-color: rgba(29, 161, 242, 0.1); + font-weight: 700; + font-size: 1rem; + border-top: 2px solid var(--primary-color); +} + 
+.total-score { + color: var(--primary-color); + font-size: 1.2rem; + font-family: var(--font-mono); +} + +/* Modifications List */ +.modifications-list { + display: flex; + flex-direction: column; + gap: 1rem; +} + +.modification-card { + background-color: rgba(29, 161, 242, 0.05); + border: 1px solid var(--border-color); + border-radius: 8px; + padding: 1.25rem; +} + +.modification-header { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 0.75rem; +} + +.modification-header h4 { + margin: 0; + font-size: 1rem; +} + +.modification-multiplier { + font-family: var(--font-mono); + font-weight: 700; + font-size: 1.1rem; + color: var(--warning); + background-color: rgba(255, 149, 0, 0.1); + padding: 0.25rem 0.75rem; + border-radius: 4px; +} + +.modification-scores { + display: flex; + align-items: center; + gap: 1rem; + margin: 0.75rem 0; + font-family: var(--font-mono); + font-size: 1.1rem; +} + +.score-before { + color: var(--text-secondary); +} + +.score-arrow { + color: var(--primary-color); + font-weight: 700; +} + +.score-after { + color: var(--primary-color); + font-weight: 700; +} + +.modification-description { + font-size: 0.9rem; + color: var(--text-secondary); + margin: 0.5rem 0 0 0; +} + +.final-score-card { + background-color: rgba(29, 161, 242, 0.1); + border: 2px solid var(--primary-color); + border-radius: 8px; + padding: 1.5rem; + text-align: center; + margin-top: 1.5rem; +} + +.final-score-card h4 { + margin: 0 0 1rem 0; + color: var(--primary-color); +} + +.final-score-value { + font-size: 2.5rem; + font-weight: 700; + font-family: var(--font-mono); + color: var(--primary-color); + margin: 1rem 0; +} + +/* Score Evolution */ +.score-evolution { + display: flex; + flex-direction: column; + gap: 0.5rem; +} + +.evolution-row { + display: flex; + justify-content: space-between; + align-items: center; + padding: 0.5rem; +} + +.evolution-score { + font-family: var(--font-mono); + font-weight: 700; + font-size: 
1.2rem; +} + +.evolution-score.positive { + color: var(--success); +} + +.evolution-score.negative { + color: var(--danger); +} + +.evolution-change { + font-size: 0.9rem; + margin-left: 0.5rem; +} + +.evolution-arrow { + text-align: center; + color: var(--text-secondary); + font-size: 1.2rem; +} + +/* Score Tracker */ +.score-tracker { + margin: 3rem 0; + padding: 2rem; + background-color: rgba(29, 161, 242, 0.05); + border-radius: 8px; + border: 1px solid var(--border-color); +} + +.score-tracker h3 { + margin-top: 0; + margin-bottom: 1.5rem; +} + +#score-breakdown { + display: flex; + flex-direction: column; + gap: 0.5rem; + align-items: center; +} + +.score-stage { + display: flex; + justify-content: space-between; + align-items: center; + padding: 0.75rem 1.5rem; + background-color: var(--background-color); + border: 1px solid var(--border-color); + border-radius: 6px; + width: 100%; + max-width: 500px; +} + +.score-stage.current { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.1); + box-shadow: 0 0 10px rgba(29, 161, 242, 0.2); +} + +.score-stage-name { + font-weight: 600; + color: var(--text-secondary); +} + +.score-stage-value { + font-family: var(--font-mono); + font-weight: 700; + font-size: 1.1rem; + color: var(--primary-color); +} + +.score-arrow { + color: var(--text-secondary); + font-size: 1.2rem; +} + +/* Pipeline Navigation */ +.pipeline-navigation { + display: flex; + justify-content: center; + gap: 1rem; + margin: 2rem 0; +} + +.pipeline-nav-btn { + padding: 0.75rem 2rem; + font-size: 1rem; + font-weight: 600; + border: 2px solid; + border-radius: 6px; + cursor: pointer; + transition: all 0.2s ease; + font-family: var(--font-sans); +} + +.pipeline-nav-btn.primary { + background-color: var(--primary-color); + color: white; + border-color: var(--primary-color); +} + +.pipeline-nav-btn.primary:hover:not(:disabled) { + background-color: #1a8cd8; + box-shadow: 0 4px 12px rgba(29, 161, 242, 0.3); +} + 
+.pipeline-nav-btn.secondary { + background-color: transparent; + color: var(--primary-color); + border-color: var(--primary-color); +} + +.pipeline-nav-btn.secondary:hover:not(:disabled) { + background-color: rgba(29, 161, 242, 0.1); +} + +.pipeline-nav-btn:disabled { + opacity: 0.3; + cursor: not-allowed; +} + +@media (max-width: 768px) { + .progress-tracker { + flex-direction: column; + gap: 1rem; + } + + .progress-line { + width: 2px; + height: 30px; + } + + .loop-navigation { + flex-direction: column; + } + + .history-entry { + flex-direction: column; + align-items: flex-start; + } + + .history-week { + min-width: auto; + } +} + +/* Code Reference Links */ +.code-ref { + font-family: var(--font-mono); + font-size: 0.8rem; + color: var(--text-muted); + margin-top: 1rem; + display: block; +} + +.code-ref a { + color: var(--primary-color); + text-decoration: none; + transition: all 0.2s; + display: inline-flex; + align-items: center; + gap: 0.25rem; +} + +.code-ref a:hover { + color: var(--primary-hover); + text-decoration: underline; +} + +.code-ref a:hover code { + border-color: var(--primary-color); + background-color: rgba(29, 161, 242, 0.05); +} + +.code-ref a::after { + content: "→"; + font-size: 0.875em; + opacity: 0.6; + transition: opacity 0.2s; +} + +.code-ref a:hover::after { + opacity: 1; +} + +.code-ref code { + background-color: var(--code-bg); + border: 1px solid var(--code-border); + border-radius: 3px; + padding: 0.2em 0.4em; + font-size: 0.875em; + transition: all 0.2s; +} diff --git a/docs/index.html b/docs/index.html new file mode 100644 index 000000000..e212e0a0d --- /dev/null +++ b/docs/index.html @@ -0,0 +1,317 @@ + + + + + + How X's Algorithm Really Works + + + + +

+ +
+

How X's Algorithm Really Works

+

We read X's open-source algorithm. Here's what the code actually says.

+ +
+ +

What We Found

+ +

In March 2023, X (then Twitter) open-sourced their recommendation algorithm. We analyzed the implementation—reading thousands of lines of Scala, examining ML model weights, and tracing data flows through the pipeline. Every claim below is backed by specific code references you can verify yourself.

+ +
+

Our approach: No speculation. No reverse engineering. Just reading the actual implementation and documenting what it does.

+
+ +
+ +

Top 5 Most Interesting Findings

+ + +
+

🤯 The Favorites Paradox

+ +

The Finding: Favorites (likes) have the lowest positive weight despite being the most visible metric.

+ +
Favorite (like):                0.5 weight
+Reply:                          13.5 weight (27x more valuable)
+Reply with Author Engagement:   75.0 weight (150x more valuable!)
+ +

Why It Matters: The most tracked engagement type has the least algorithmic value. Tweets that drive conversation (replies) vastly outrank tweets that drive agreement (likes). This means controversial content systematically outranks agreeable content.

+ +

Consequence: One reply with author engagement = 150 likes in algorithmic value.
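The arithmetic behind this comparison can be sketched in a few lines. This is a minimal illustration, not the production scorer: it assumes the final score is a weighted sum of predicted engagement probabilities and uses only the three weights quoted above. The helper name `weighted_score` and the example probabilities are hypothetical.

```python
# Weights as quoted in this document; treat them as illustrative inputs.
WEIGHTS = {
    "favorite": 0.5,
    "reply": 13.5,
    "reply_engaged_by_author": 75.0,
}


def weighted_score(probabilities):
    """Hypothetical helper: sum of (predicted probability x weight)."""
    return sum(WEIGHTS[name] * p for name, p in probabilities.items())


# Made-up predictions for two tweets.
likes_only = weighted_score({"favorite": 0.30})               # likeable, no discussion
replies = weighted_score({"favorite": 0.02, "reply": 0.05})   # few likes, some replies

print(f"likes-heavy tweet: {likes_only:.3f}")
print(f"reply-heavy tweet: {replies:.3f}")
print("reply / favorite weight:", WEIGHTS["reply"] / WEIGHTS["favorite"])
```

With these made-up inputs, a tweet with a 5% reply probability outscores a tweet with a 30% like probability by more than four to one.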

+ +
+

⚠️ Weight Values: The open-source code defines weights as configurable parameters (FSBoundedParam with default = 0.0). The actual production weight values shown here come from X's ML training repository and represent their documented configuration.

+
+ +

+ Parameter definitions: HomeGlobalParams.scala:786-930
+ Production weight values: the-algorithm-ml/projects/home/recap +

+ +

Explore: Engagement Calculator →

+
+ + +
+

🔥 Conflict is Emergent, Not Intentional

+ +

The Finding: The algorithm cannot distinguish between:

+
    +
  • Agreement reply vs disagreement reply
  • +
  • Supportive engagement vs hostile engagement
  • +
  • Productive conversation vs rage bait
  • +
+ +

All replies get the same weight (13.5) regardless of sentiment or tone.

+ +

Why It Matters: Conflict gets amplified not because X wants it, but because the algorithm is blind to it. This is a design limitation, not malicious intent. The scoring model contains no sentiment analysis and no toxicity detection; it works only from predicted engagement.

+ +

Consequence: A controversial take with 50 angry replies scores exactly the same as a helpful explanation with 50 supportive replies. Whichever tweet provokes more replies wins, whatever their tone.

+ +

+ Code: No sentiment analysis anywhere in ML scoring pipeline (NaviModelScorer.scala, HomeGlobalParams.scala) +

+ +

Explore: Engagement Calculator →

+
+ + +
+

📊 Multiplicative Scoring = Mathematical Echo Chambers

+ +

The Finding: Tweet scores use multiplication, not addition:

+ +
score = baseScore × clusterInterest
+ +

This single design choice means any imbalance compounds over time.

+ +

Why It Matters: Echo chambers aren't a bug; they're a mathematical inevitability. If you're 60% interested in AI and 40% in cooking, multiplicative scoring gives AI tweets a permanent 50% scoring advantage. This advantage means you see more AI → engage more with AI → interest increases → advantage grows.

+ +

Example:

+
AI tweet (quality 0.9):      0.9 × 0.60 = 0.54
+Cooking tweet (quality 0.9): 0.9 × 0.40 = 0.36
+
+Same quality, 50% score difference!
+ +

Consequence: The algorithm concentrates your interests over time through reinforcement loops. Balanced interests are unstable—any imbalance (even 51/49) drifts toward concentration.
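The example and the drift claim above can be reproduced with a small sketch. The first function is the quoted formula. The `drift` loop is a toy model of our own, not code from the repository: it assumes exposure is proportional to score and that next week's interest is proportional to current interest times exposure.

```python
def cluster_score(base_score, cluster_interest):
    """score = baseScore x clusterInterest, as quoted above."""
    return base_score * cluster_interest


ai = cluster_score(0.9, 0.60)       # about 0.54
cooking = cluster_score(0.9, 0.40)  # about 0.36


def drift(interests, steps):
    """Toy reinforcement loop (assumption, not the real update rule)."""
    current = dict(interests)
    for _ in range(steps):
        # Exposure follows the score; engagement follows interest x exposure.
        raw = {k: v * cluster_score(1.0, v) for k, v in current.items()}
        total = sum(raw.values())
        current = {k: v / total for k, v in raw.items()}  # renormalize to 100%
    return current


after = drift({"ai": 0.51, "cooking": 0.49}, steps=5)
print(f"AI advantage at equal quality: {ai / cooking:.2f}x")
print({k: round(v, 3) for k, v in after.items()})
```

Under this toy assumption a 51/49 split drifts to roughly 78/22 after five rounds. The exact speed depends entirely on the assumed update rule; only the direction of the drift is the point.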

+ +

+ Code: ApproximateCosineSimilarity.scala:94score * sourceClusterScore +

+ +

Explore: Invisible Filter → | Reinforcement Loop →

+
+ + +
+

👑 Verified Accounts Get 100x Multiplier

+ +

The Finding: Verified status = 100x reputation boost in the TweepcredGraph system.

+ +
if (isVerified) 100
+else { /* calculate based on followers, PageRank, etc. */ }
+ +

Why It Matters: This isn't just a badge—it's a massive algorithmic advantage. Combined with other structural benefits (follower count, PageRank, follow ratio scoring), large verified accounts have a 348:1 reach advantage over small accounts posting identical content.

+ +

Consequence: The platform has algorithmic aristocracy built into its architecture. Not all accounts are treated equally—some start with order-of-magnitude advantages.

+ +

+ Code: UserMass.scala:40-41 +

+ +

Explore: Algorithmic Aristocracy →

+
+ + +
+

☢️ "Not Interested" is Nuclear

+ +

The Finding: A single "not interested" click triggers:

+
    +
  • 0.2x multiplier (80% penalty immediately)
  • +
  • 140-day linear recovery (penalty slowly fades over roughly 4.5 months)
  • +
  • Author-level effect (affects ALL tweets from that author)
  • +
+ +
Day 0:   0.2x multiplier (nearly invisible)
+Day 70:  0.6x multiplier (still suppressed)
+Day 140: 1.0x multiplier (penalty expires)
+ +

Why It Matters: This is your most powerful tool as a user. One click suppresses an author in your feed for up to 140 days: the penalty starts at 80% and fades linearly to zero. This is personal to you and doesn't aggregate globally.
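The recovery schedule can be written as a one-line interpolation. This is a sketch that reproduces the three data points listed above (0.2x, 0.6x, 1.0x); the function name and parameters are hypothetical, and the real scorer may handle edge cases differently.

```python
def feedback_multiplier(days_since_feedback, floor=0.2, duration_days=140):
    """Linear recovery from `floor` back to 1.0 over `duration_days`."""
    if days_since_feedback >= duration_days:
        return 1.0  # penalty has expired
    progress = max(days_since_feedback, 0) / duration_days
    return floor + (1.0 - floor) * progress


for day in (0, 35, 70, 105, 140):
    print(f"Day {day:3d}: {feedback_multiplier(day):.2f}x")
```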

+ +

Note: Content that the ML model predicts will receive negative feedback (based on historical patterns) gets penalized globally with -74.0 weight. Reports have an even more severe predicted penalty (-369.0 weight).

+ +

+ Code: FeedbackFatigueScorer.scala:38val DurationForDiscounting = 140.days +

+ +

Explore: Engagement Calculator →

+
+ +
+ +

Other Notable Code-Backed Findings

+ +
+ +
+

⏰ Weekly Batch Updates

+

Your InterestedIn profile (which clusters you belong to) updates every 7 days via batch job, not in real time. Your feed can therefore lag your actual interests by up to a week.

+

Code: InterestedInFromKnownFor.scala:59Days(7)

+
+ +
+

🚧 Out-of-Network Penalty

+

Tweets from people you don't follow get a 0.75x multiplier (25% score reduction). Breaking out of your network requires 33% more engagement to compete.

+

Code: RescoringFactorProvider.scala:45-57

+
+ +
+

📝 Author Diversity Decay

+

Multiple tweets from the same author get an exponentially decaying multiplier: 1st tweet (100%), 2nd tweet (62.5%), 3rd tweet (43.75%). Posting more brings severely diminishing returns.
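The three percentages above fit a simple decay-with-floor curve. The sketch below assumes a decay factor of 0.5 and a floor of 0.25; we chose those two values because they reproduce the stated numbers, so verify them against the referenced file before relying on them.

```python
def author_diversity_factor(position, decay=0.5, floor=0.25):
    """Multiplier for an author's n-th tweet in a feed (position 0 = first).

    decay and floor are assumptions chosen to match 100% / 62.5% / 43.75%.
    """
    return (1.0 - floor) * decay ** position + floor


for position in range(5):
    print(f"tweet #{position + 1}: {author_diversity_factor(position):.2%}")
```

Under these assumptions the multiplier never falls below the floor, so a fifth or tenth tweet is penalized only slightly more than a fourth.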

+

Code: AuthorBasedListwiseRescoringProvider.scala:54

+
+ +
+

🥶 TwHIN Cold Start Problem

+

Tweets need ≥16 engagements to get TwHIN embeddings (used for recommendations). Below that threshold the embedding is zeroed out, so small accounts are effectively locked out of this candidate source.

+

Code: TwhinEmbeddingsStore.scala:48

+
+ +
+

🗺️ Sparse Cluster Assignment

+

~145,000 total clusters exist, but you're only assigned to 10-20 of them (default). Highly sparse representation of your interests.

+

Code: InterestedInFromKnownFor.scala:148maxClustersPerUser = 20

+
+ +
+ +
+ +

Interactive Explorations

+ +

Experience how the algorithm works through interactive demos. Each one lets you experiment with the actual mechanics found in the code.

+ +

Understanding The Pipeline

+ +
+

🔍 Pipeline Explorer

+

Follow a tweet through all 5 algorithmic stages from posting to your timeline. See exactly what happens at each stage with real score calculations, filters, and penalties.

+
+ +
+

🧮 Engagement Calculator

+

Calculate tweet scores yourself. Adjust engagement probabilities and see how replies (13.5 weight) vastly outweigh likes (0.5 weight).

+
+ +

Understanding Your Algorithmic Identity

+ +
+

🗺️ Cluster Explorer

+

Discover which of ~145,000 algorithmic communities you belong to based on your follows and engagement. See how X groups users into interest clusters.

+
+ +
+

🎭 Algorithmic Identity Builder

+

Understand your dual algorithmic profiles: InterestedIn (consumer) vs Producer Embeddings (creator). See how clusters + engagement create your personalized identity (updates weekly!).

+
+ +

Understanding Filter Bubbles & Echo Chambers

+ +
+

🚀 Journey Simulator

+

Model how your interests might drift over time based on multiplicative scoring mechanics. See the reinforcement loop in action.

+
+ +
+

👥 Invisible Filter Demo

+

See how you and a friend see completely different rankings for the same tweets. Multiplicative scoring creates personalized realities.

+
+ +
+

🔁 Reinforcement Loop Visualizer

+

Step through the feedback loop week by week. Watch how seeing more AI content → engaging more → seeing even more creates drift through multiplicative mechanics.

+
+ +

Understanding Structural Advantages

+ +
+

👑 Algorithmic Aristocracy

+

Explore how follower count creates different algorithmic rules. Verified accounts get 100x multipliers, small accounts hit hard barriers. See the four mechanisms that multiply together for 348:1 advantage.

+
+ +

Understanding Next-Generation Systems

+ +
+

🔮 Phoenix: Behavioral Prediction (Likely Active)

+

X's next-generation transformer-based system models your last 522 actions (hours to days of behavior) to predict what you'll do next. Evidence suggests active A/B testing: 9-cluster deployment with parallel evaluation, progressive rollout infrastructure, and hybrid mode for incremental migration. This represents a paradigm shift from static features to behavioral sequences.

+
+ +
+ +

Our Approach

+ +

Objective Evidence: Every claim backed by:

+
    +
  • File path - Exact location in codebase
  • +
  • Line numbers - Specific implementation
  • +
  • Code snippets - What it actually does
  • +
  • Explanation - How the mechanism works
  • +
  • Consequences - What it means for users and creators
  • +
+ +

Verifiable: The algorithm is open source. You can check our work.

+ +

Interactive: We built simulators and calculators so you can experience the mechanics hands-on, not just read about them.

+ +
+ +

Who This Is For

+ +
    +
  • Users wondering why their feed looks the way it does
  • +
  • Creators optimizing for reach and engagement
  • +
  • Researchers studying recommendation algorithms and their societal effects
  • +
  • Policy makers understanding algorithmic amplification
  • +
  • Anyone curious about how algorithmic systems shape online discourse
  • +
+ +
+ +

About This Investigation

+ +

This analysis was conducted by reading X's open-source algorithm code (released March 2023). All findings are based on the actual implementation, not speculation or reverse engineering.

+ +

Repository: github.com/twitter/the-algorithm

+ +

Methodology: We read thousands of lines of Scala, traced data flows through pipelines, examined ML model configurations, and documented every mechanism with file paths and line numbers. Our research notes contain detailed analysis with step-by-step code walkthroughs.

+ +

Last Updated: November 2025

+ +
+ +
+

Key Insight: The algorithm is not neutral. It is designed for engagement, not for truth, diversity, or societal health. Understanding how it works is the first step to using it consciously rather than being shaped by it.

+
+ +
+ +
+

Questions or corrections? Open an issue on GitHub.

+

This is a living document based on X's March 2023 open-source algorithm.

+
+ + + diff --git a/docs/interactive/algorithmic-aristocracy.html b/docs/interactive/algorithmic-aristocracy.html new file mode 100644 index 000000000..6ab5d43eb --- /dev/null +++ b/docs/interactive/algorithmic-aristocracy.html @@ -0,0 +1,529 @@ + + + + + + The Algorithm's Aristocracy - Follower Count Effects + + + + + + +
+

The Algorithm's Aristocracy

+

How follower count creates different algorithmic treatment through four mechanisms that multiply together

+ + + + + + +

What Is Algorithmic Aristocracy?

+ +

Not all Twitter accounts are treated equally by the algorithm. Two accounts posting identical content with identical quality can experience order-of-magnitude differences in reach—not because of what they say, but because of structural characteristics like follower count, verification status, and follow ratio.

+ +

The Key Architectural Choice: Multiplication, Not Addition

+ +

The algorithm uses multiplication to combine structural advantages. This single design choice creates exponential scaling where small accounts face compounding disadvantages and large accounts receive compounding benefits.

+ +
+

Why Multiplication Matters

+

If the algorithm used addition (linear):

+
Small account: 1 + 1 + 1 + 1 = 4
+Large account: 100 + 10 + 5 + 2 = 117
+Ratio: 29:1 (manageable difference)
+ +

But the algorithm uses multiplication (exponential):

+
Small account: 1 × 1 × 0.001 × 100 = 0.1
+Large account: 100 × 1 × 1 × 50,000 = 5,000,000
+Ratio: 50,000,000:1 (insurmountable difference)
+
+ +

The result: A small account (500 followers, unverified) and a large account (50,000 followers, verified) posting identical tweets can see a 348:1 difference in reach purely from structural advantages, not content quality.
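The addition-versus-multiplication comparison above is easy to check. This sketch only replays the illustrative factors from the two examples; the numbers are the document's, not measured values.

```python
from math import prod

# Factors from the "addition" example above.
additive_small = [1, 1, 1, 1]
additive_large = [100, 10, 5, 2]

# Factors from the "multiplication" example above.
small = [1, 1, 0.001, 100]
large = [100, 1, 1, 50_000]

additive_ratio = sum(additive_large) / sum(additive_small)
multiplicative_ratio = prod(large) / prod(small)

print(f"additive gap:       {additive_ratio:,.2f} : 1")
print(f"multiplicative gap: {multiplicative_ratio:,.0f} : 1")
```

A single small factor (here 0.001) barely moves a sum but scales an entire product, which is why the multiplicative gap is six orders of magnitude wider.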

+ +

The Four Mechanisms That Multiply

+ +

Four distinct algorithmic mechanisms create advantages that multiply together:

+ +
+
+

1. Verification Multiplier

+

Verified accounts get 100× boost to TweepCred (reputation score)

+

Available via Twitter Blue ($8/month)

+
+ +
+

2. TwHIN Engagement Threshold

+

Tweets must get ≥16 engagements to access advanced ML features

+

Small accounts often can't cross this threshold

+
+ +
+

3. Follow Ratio Penalty

+

Exponential penalty when following >500 accounts with a following/followers ratio above 0.6

+

Ratio of 2:1 (following 2× followers) = 1,097× penalty

+
+ +
+

4. Out-of-Network Base

+

Your follower count determines out-of-network reach potential

+

500 followers × 0.75 = 375 potential vs 50K × 0.75 = 37,500 potential

+
+
+ +

How They Compound: Concrete Example

+ +
Account A (small account):
+  Verification:   1× (no multiplier)
+  TwHIN:          0.5× (below 16 engagement threshold - no advanced features)
+  Follow ratio:   0.001× (following 2× as many accounts as followers)
+  Base reach:     500 followers
+
+  Calculation: 1 × 0.5 × 0.001 × 500 = ~0.25
+  Effective reach: ~575 (with partial OON)
+
+Account B (large account):
+  Verification:   100× (Twitter Blue)
+  TwHIN:          1× (crossed threshold - full features)
+  Follow ratio:   1× (no penalty)
+  Base reach:     50,000 followers
+
+  Calculation: 100 × 1 × 1 × 50,000 = 5,000,000
+  Effective reach: ~200,000 (after normalization)
+
+Ratio: 348:1 from identical content
+
+The multipliers compound:
+• 1 × 0.5 × 0.001 × 500 = 0.25
+• 100 × 1 × 1 × 50,000 = 5,000,000
+• Gap created purely by multiplication of structural advantages
+ +

Why This Matters

+ +
    +
  • Content quality becomes secondary: Structural advantages outweigh content quality in determining reach
  • +
  • Winner-take-all dynamics: The algorithm amplifies existing advantages, making it harder for new voices to break through
  • +
  • Verification as pay-to-win: Twitter Blue ($8/month) provides the 100× multiplier, creating a paid advantage layer
  • +
  • Most mechanisms are hardcoded: These aren't configuration choices Twitter can easily adjust—they're architectural decisions baked into the code
  • +
  • Compounding is by design: Multiplication (not addition) is an intentional architectural choice that creates exponential scaling
  • +
+ +

This analysis is evidence-based: Every mechanism documented with file paths, line numbers, and formulas from Twitter's open-source algorithm.

+ +
+ + + + + + +
+

Calculate Your Tier

+

Enter your account characteristics to see which mechanisms affect you:

+ +
+
+ + +
+
+ + +
+
+ + +
+
+ +
+
+ + + + +
+ +
+ + + + + + + +
+

The Four Mechanisms (Technical Details)

+

These mechanisms are documented in Twitter's open-source code. Each includes file paths, line numbers, formulas, and concrete examples for verification.

+ +
+

Key Pattern: These four mechanisms multiply together, not add. This multiplication creates exponential scaling where advantages compound.

+
+ + +
+

1. Verification Multiplier (100x)

+ +

Code reference: UserMass.scala:41

+ +

Mechanism: Verified accounts receive a 100x multiplier on their TweepCred (reputation score).

+ +
+
if (isVerified) 100
+
+ +

Type: Hardcoded constant (requires code deployment to change)

+ +

Effect calculation:

+
Account A (10K followers, unverified):
+  TweepCred ≈ 50 (calculated from graph structure)
+
+Account B (10K followers, verified):
+  TweepCred ≈ 5,000 (100x multiplier)
+
+Same follower count, 100:1 difference in algorithmic treatment
+ +

Availability: Twitter Blue ($8/month) or legacy verification status

+
+ + +
+

2. TwHIN Engagement Threshold (16)

+ +

Code reference: TwhinEmbeddingsStore.scala:48-65

+ +

Mechanism: Tweets with fewer than 16 engagements receive zero embeddings, excluding them from TwHIN candidate generation and features.

+ +
+
val MinEngagementCount = 16
+
+if (persistentEmbedding.updatedCount < MinEngagementCount)
+  embedding.map(_ => 0.0)  // Zero out if insufficient engagement
+
+ +

Type: Hardcoded constant

+ +

Effect:

+
    +
  • <16 engagements: No TwHIN candidate generation, no TwHIN features for Heavy Ranker, invisible to 10+ TwHIN feature hydrators
  • +
  • ≥16 engagements: Full TwHIN support (ANN search + feature hydration)
  • +
+ +

Differential impact:

+
Small account (500 followers):
+  Average tweet: 8 engagements
+  Result: Most tweets never cross threshold
+
+Large account (50K followers):
+  Average tweet: 250 engagements
+  Result: All tweets cross threshold immediately
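The gating logic can be mirrored in a short sketch. This is a Python paraphrase of the Scala snippet above, not the original code; `gated_embedding` and the example vectors are hypothetical.

```python
MIN_ENGAGEMENT_COUNT = 16  # threshold quoted above


def gated_embedding(embedding, updated_count):
    """Return the embedding, or all zeros when engagement is below threshold."""
    if updated_count < MIN_ENGAGEMENT_COUNT:
        return [0.0 for _ in embedding]
    return list(embedding)


small_account_tweet = gated_embedding([0.3, -0.1, 0.8], updated_count=8)
large_account_tweet = gated_embedding([0.3, -0.1, 0.8], updated_count=250)

print(small_account_tweet)  # zeroed out
print(large_account_tweet)  # unchanged
```

A zero vector carries no similarity signal, so a tweet below the threshold cannot be retrieved by a nearest-neighbor search over these embeddings.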
+
+ + +
+

3. Follow Ratio Penalty (Exponential, Unbounded)

+ +

Code reference: UserMass.scala:54-64

+ +

Mechanism: Accounts following >500 users with a following/followers ratio >0.6 receive exponential penalty on TweepCred.

+ +
+
val friendsToFollowersRatio = (1.0 + numFollowings) / (1.0 + numFollowers)
+val adjustedMass = mass / math.exp(5.0 * (friendsToFollowersRatio - 0.6))
+
+ +

Type: Hardcoded formula, no maximum cap

+ +

Penalty table:

+ + + + + + + + + + + + + + + + + + + + + + + + + +
Ratio | Penalty Multiplier
0.6 | 1x (no penalty)
1.0 | 7.4x penalty
2.0 | 1,097x penalty
5.0 | 3.6 billion x penalty
+ +

Observation: Large accounts typically have more followers than following (ratio <0.6), avoiding this penalty entirely.
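The penalty table can be regenerated from the formula. The sketch below uses the constants quoted in this section (5.0, 0.6, and the 500-following gate); the function names are ours, and the gating conditions follow the prose description rather than a line-by-line reading of the source.

```python
import math


def penalty_for_ratio(ratio, threshold=0.6, steepness=5.0):
    """Divisor applied to reputation mass for a given following/followers ratio."""
    return math.exp(steepness * (ratio - threshold))


def follow_ratio_penalty(num_followings, num_followers, min_followings=500):
    """Penalty for an account; 1.0 means no penalty is applied."""
    ratio = (1.0 + num_followings) / (1.0 + num_followers)
    if num_followings <= min_followings or ratio <= 0.6:
        return 1.0
    return penalty_for_ratio(ratio)


for ratio in (0.6, 1.0, 2.0, 5.0):
    print(f"ratio {ratio}: {penalty_for_ratio(ratio):,.1f}x")

print(follow_ratio_penalty(1_000, 500))   # Account A style: heavy penalty
print(follow_ratio_penalty(200, 50_000))  # Account B style: no penalty
```

Because the exponent is linear in the ratio and nothing caps it, each additional unit of ratio multiplies the penalty by e^5, roughly 148.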

+
+ + +
+

4. Out-of-Network Penalty (0.75x)

+ +

Code reference: RescoringFactorProvider.scala:46-57

+ +

Mechanism: Out-of-network tweets receive a 0.75x multiplier on their score (25% reduction).

+ +
+
object OutOfNetworkScaleFactorParam extends FSBoundedParam[Double](
+  name = "out_of_network_scale_factor",
+  default = 0.75,
+  min = 0.0,
+  max = 1.0
+)
+
+ +

Type: FSBoundedParam (configurable without deployment, range: 0.0-1.0)

+ +

Differential impact:

+
1K followers account:
+  In-network base: 1,000 users (no penalty)
+  Out-of-network: ~99% of potential audience (0.75x penalty applies to nearly all growth)
+
+1M followers account:
+  In-network base: 1,000,000 users (no penalty)
+  Out-of-network: ~95% of potential audience, but base is 1000x larger
+  Same penalty (0.75x), different absolute impact
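The effect of the scale factor on a single score is a plain multiplication. In this sketch, `rescore` and `clamp` are hypothetical names, and clamping to the 0.0-1.0 range is our stand-in for the parameter bounds shown above, not a statement about how FSBoundedParam enforces them.

```python
def clamp(value, low, high):
    """Keep a configured value inside its allowed range."""
    return max(low, min(high, value))


def rescore(score, in_network, oon_scale_factor=0.75):
    """Apply the out-of-network scale factor to a tweet score."""
    factor = clamp(oon_scale_factor, 0.0, 1.0)
    return score if in_network else score * factor


in_network = rescore(1.0, in_network=True)
out_of_network = rescore(1.0, in_network=False)
required_uplift = 1.0 / 0.75  # raw score needed to tie an in-network tweet

print(in_network, out_of_network)
print(f"out-of-network tweets need {required_uplift - 1:.0%} more raw score to tie")
```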
+
+ +
+ + +
+

Same Content, Different Treatment

+

Two accounts post identical tweets with identical quality. Different structural characteristics produce different reach.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Characteristic | Account A | Account B
Followers | 500 | 50,000
Following | 1,000 | 200
Verified | No | Yes ($8/month)
Avg engagements/tweet | 8 | 250
Mechanisms Applied:
1. Verification multiplier | 1x (no multiplier) | 100x multiplier
2. TwHIN threshold | Not crossed (8 < 16) | Crossed (250 ≥ 16)
3. Follow ratio penalty | Ratio 2.0 → 1,097x penalty | Ratio 0.004 → no penalty
4. Out-of-network base | ~100 × 0.75 = 75 | ~50,000 × 0.75 = 37,500
Estimated Effective Reach | ~575 | ~200,000
Reach Ratio: 348:1
+
+ +

Observation: Identical content, 348x difference in reach due to structural characteristics.

+
+ + +
+

The Five Tiers

+

How mechanisms apply at different follower counts:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Tier | Followers | Typical Characteristics | Reach Multiplier
1 | <1,000 | Unverified, high follow ratio, <16 engagements, no TwHIN support | ~1x (base only)
2 | 1,000-10,000 | Possibly verified, improving ratio, occasional TwHIN on popular tweets | ~1-15x
3 | 10,000-100,000 | Often verified (100x), low ratio, frequent TwHIN support | ~15-200x
4 | 100,000-1,000,000 | Verified, minimal ratio penalty, all tweets get TwHIN | ~200-2,000x
5 | ≥1,000,000 | Verified, all penalties negligible, maximum algorithmic support | ~2,000x+
+
+ +

Observation: Reach multiplier grows faster than follower count (non-linear scaling).

+
+ + +
+

Configurability

+

Which parameters Twitter can adjust:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Mechanism | Value | Type | Adjustable
Verification multiplier | 100x | Hardcoded | No (requires deployment)
TwHIN threshold | 16 engagements | Hardcoded | No (requires deployment)
Follow ratio formula | exp(5.0 × (ratio - 0.6)) | Hardcoded | No (requires deployment)
Out-of-network penalty | 0.75x (default) | FSBoundedParam | Yes (range: 0.0-1.0)
+
+ +

Observation: Most mechanisms are hardcoded architectural decisions. Only out-of-network penalty is configurable.

+
+ + +
+

Code Verification

+

All mechanisms documented here can be verified in Twitter's open-source algorithm:

+ + +
+

Methodology note: This analysis went through multiple corrections. Our initial readings of the SLOP filter (incorrectly interpreted as a minimum follower gate) and of the follow ratio penalty (incorrectly assumed to be capped) were revised after careful code reading. All findings presented here have been verified against the actual implementation.

+
+
+ + +
+

Explore More

+
+ + See Your Account Evolution → + + + ← Calculate Your Clusters + +
+
+ +
+ +
+

Based on Twitter's open-source algorithm (March 2023 release)

+

All mechanisms verified with file paths and line numbers

+
+ + + + diff --git a/docs/interactive/algorithmic-identity.html b/docs/interactive/algorithmic-identity.html new file mode 100644 index 000000000..62d5b6fbb --- /dev/null +++ b/docs/interactive/algorithmic-identity.html @@ -0,0 +1,363 @@ + + + + + + Your Algorithmic Identity - KnownFor, InterestedIn, and Producer Embeddings + + + + + +
+

Your Algorithmic Identity

+

Every user has two algorithmic profiles: one determines what you see, the other determines who sees you. They're calculated independently and can drift in completely different directions—and you're not fully in control of either.

+ + + + + + +

The Three-Layer Architecture

+ +

Twitter's recommendation system is built on three interconnected layers. Understanding how they work together—and how they can diverge—is key to understanding why your feed behaves the way it does.

+ +

Layer 1: KnownFor (The Foundation)

+ +

KnownFor is the foundation layer that identifies what the top 20 million most popular accounts on X are "known for"—which clusters they belong to.

+ +
    +
  • Who's included: Top 20M producers on the platform
  • +
  • What it calculates: Which clusters each producer belongs to (e.g., AI/Tech, Cooking, Politics)
  • +
  • Update frequency: Weekly (7 days)
  • +
  • Purpose: Creates the cluster structure that everything else builds on
  • +
+ +

This is the algorithmic "map" of who creates what kind of content. It's slow to change (weekly updates) because it represents the relatively stable structure of content communities on the platform.

+ +
+

Key insight: KnownFor only covers the top 20M accounts. If you're not in the top 20M, you're not in this foundation layer—but you still get a producer profile (see Layer 3).

+
+ +

Layer 2: InterestedIn (Your Consumer Profile)

+ +

InterestedIn is your consumer profile—it determines what content the algorithm shows you. If KnownFor is about what people create and are "known for," InterestedIn is what you are "interested in"—or more precisely, what content the algorithm predicts you'll engage with.

+ +
    +
  • Who has this: Every user on the platform
  • +
  • What it calculates: Your cluster interests as a consumer (e.g., 60% AI/Tech, 40% Cooking)
  • +
  • Update frequency: Every 7 days (weekly batch updates)
  • +
  • Based on: YOUR engagement choices—what you like, reply to, retweet, click on
  • +
  • Controls: What content YOU see in your timeline
  • +
+ +

The 100-Day Half-Life: Here's the catch—your old engagement doesn't disappear. The algorithm uses exponential decay to weight your historical engagement:

+ +
Engagement from 100 days ago (~3 months): 50% weight
+Engagement from 200 days ago (~6.5 months): 25% weight
+Engagement from 300 days ago (~10 months): 12.5% weight
+
+Old engagement lingers for months, making your cluster scores slow to shift.
+ +

What this means in practice: If you engaged heavily with AI content for a week back in January, that engagement will continue influencing your recommendations through July. Your InterestedIn profile is like a slow-moving ship—it takes sustained effort to change course.

+ +
+

Relationship type: ONE-TO-MANY. You (one person) engage with many producers. You choose what to engage with (moderate agency).

+
+ +

Layer 3: Producer Embeddings (Your Producer Profile)

+ +

Users with at least 100 followers have a second algorithmic representation: a producer profile. This is calculated by Producer Embeddings and determines who sees YOUR content.

+ +
    +
  • Who has this: Only users with ≥100 followers
  • +
  • What it calculates: Which audiences should see your content (based on who engages with you)
  • +
  • Update frequency: Every 7 days (weekly batch updates)
  • +
  • Based on: OTHERS' engagement with you—who likes, replies to, retweets your content
  • +
  • Controls: Who sees YOUR content when you post
  • +
+ +

The 100-Follower Threshold: Below 100 followers, no Producer Embedding is calculated, so you get no algorithmic boost for exposure. At 99 followers, you're invisible to this part of the recommendation system. At 100 followers, your producer profile starts being calculated and your content can be shown to users beyond your immediate followers.

+ +
+

Relationship type: MANY-TO-ONE. Many consumers engage with you (one producer). You DON'T choose who engages with you (low agency).

+
+ +

The Critical Insight: They Can Diverge

+ +

Your InterestedIn (consumer profile) and Producer Embeddings (producer profile) are calculated completely independently. They can—and often do—follow completely different paths:

+ +
+

Example: The Divergent Profile

+
Your InterestedIn (what you consume):
+  Cooking: 70%
+  AI/Tech: 30%
+  → Algorithm shows you cooking recipes, food content
+
+Your Producer Embeddings (who sees your content):
+  AI/Tech: 65%
+  Politics: 25%
+  Cooking: 10%
+  → Algorithm shows your posts to AI and politics audiences
+
+Result: You consume cooking content but produce for AI/politics audiences.
+When you post about cooking (your interest), your AI/politics audience
+doesn't engage. Your reach collapses.
+
+ +

Why You're Not in Control

+ + + + + + + + + + + + + + + + + + + + + +
Profile | You Control? | Why/Why Not
InterestedIn (Consumer) | Moderate control | You choose what to engage with, BUT: • The algorithm chooses what to show you first • 100-day half-life means old engagement lingers • Multiplicative scoring creates drift you didn't choose
Producer Embeddings (Producer) | Low control | You DON'T choose who engages with you: • One viral tweet can completely reshape your producer profile • You can't control which audience discovers you • Weighted by audience size (50k engagement > 1k engagement)
+ +

The bottom line: Neither profile stays where you want it. Your consumer profile drifts based on what the algorithm shows you and what you engage with. Your producer profile drifts based on who happens to engage with you. Both are moving targets shaped by forces beyond your full control.

+ +
+ + + + + + +

The Technical Details

+ +

The 100-Day Half-Life Formula

+ +

Your engagement doesn't have an expiration date—it decays exponentially with a 100-day half-life:

+ +
weight = 2^(-days_elapsed / 100)
+
+Examples:
+Day 0:   weight = 2^(-0 / 100)   = 1.0    (100%)
+Day 100: weight = 2^(-100 / 100) = 0.5    (50%)
+Day 200: weight = 2^(-200 / 100) = 0.25   (25%)
+Day 300: weight = 2^(-300 / 100) = 0.125  (12.5%)
+Day 400: weight = 2^(-400 / 100) = 0.0625 (6.25%)
+
+Practical meaning:
+• Engagement from about 3 months ago (100 days): 50% weight
+• Engagement from about 6 months ago (180 days): ~29% weight
+• Engagement from 1 year ago (365 days): ~8% weight
+
+Your profile is a weighted average of months of behavior, not just this week.
+ +
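The decay formula above can be sketched in a few lines of Python. This is an illustrative sketch rather than the Scala implementation: the function name `decay_weight` is hypothetical, and only the 100-day half-life is taken from the code references.

```python
HALF_LIFE_DAYS = 100.0  # matches halfLifeInDaysForFavScore = 100 in the code references


def decay_weight(days_elapsed, half_life_days=HALF_LIFE_DAYS):
    """Return the weight of an engagement that happened `days_elapsed` days ago."""
    return 2.0 ** (-days_elapsed / half_life_days)


# Print the weight at a few ages, including calendar-style checkpoints.
for days in (0, 90, 100, 180, 365):
    print(f"{days:>3} days ago: {decay_weight(days):.1%}")
```

Running it with calendar checkpoints (90, 180, 365 days) rather than multiples of 100 shows why "3 months" and "100 days" are close but not identical.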

InterestedIn Calculation (ONE-TO-MANY)

+ +

Your InterestedIn consumer profile is calculated from YOUR engagement with many producers:

+ +
For each cluster C:
+  InterestedIn[C] = Σ (engagement_weight × time_decay × author_KnownFor[C])
+                    for all authors you engaged with
+
+Then normalize. The code L2-normalizes (the vector is scaled to unit length,
+so the squared scores sum to 1.0). Dividing by the plain sum instead gives
+shares that add up to 100%.
+
+Example:
+You engaged with 10 AI authors (decay-weighted engagement: 100)
+You engaged with 5 Cooking authors (decay-weighted engagement: 50)
+
+Before normalization:
+  AI: 100
+  Cooking: 50
+
+L2-normalized (divide by sqrt(100² + 50²) = 111.8):
+  AI: 100 / 111.8 = 0.89
+  Cooking: 50 / 111.8 = 0.45
+  (0.89² + 0.45² ≈ 1.0; these values are not percentages)
+
+As shares of the total (divide by 100 + 50 = 150):
+  AI: 100 / 150 = 67%
+  Cooking: 50 / 150 = 33%
+
+In share terms, your profile is 67% AI, 33% Cooking based on your engagement choices.
+ +
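The difference between L2 normalization and dividing by the total is easy to check numerically. The sketch below is illustrative only: `l2_normalize` and `share_of_total` are hypothetical helper names, not functions from the SimClusters codebase.

```python
import math


def l2_normalize(scores):
    """Scale scores to unit length, so the squared values sum to 1.0."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {cluster: v / norm for cluster, v in scores.items()}


def share_of_total(scores):
    """Divide by the plain sum, so the values read as percentages."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)
    return {cluster: v / total for cluster, v in scores.items()}


raw = {"AI": 100.0, "Cooking": 50.0}  # decay-weighted engagement from the example
unit = l2_normalize(raw)
shares = share_of_total(raw)

print({c: round(v, 2) for c, v in unit.items()})
print({c: f"{v:.0%}" for c, v in shares.items()})
```

Both outputs preserve the 2:1 ratio between AI and Cooking; only the scale differs, which is why either form can be used to compare clusters.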

Producer Embeddings Calculation (MANY-TO-ONE)

+ +

Your Producer Embeddings producer profile is calculated from OTHERS' engagement with you:

+ +
For each cluster C:
+  ProducerEmbedding[C] = Σ (engagement_weight × time_decay × consumer_InterestedIn[C])
+                         for all consumers who engaged with you
+
+Then normalize (L2 in the code; the example below divides by the sum so the results read as shares)
+
+Example:
+1,000 AI enthusiasts engaged with you (avg InterestedIn: AI 75%)
+50 Cooking enthusiasts engaged with you (avg InterestedIn: Cooking 80%)
+
+Weighted contributions:
+  AI: 1,000 × 0.75 = 750
+  Cooking: 50 × 0.80 = 40
+
+After normalization (750 + 40 = 790):
+  AI: 750 / 790 = 95%
+  Cooking: 40 / 790 = 5%
+
+Your content gets shown to 95% AI audiences, 5% Cooking audiences.
+
+Note: You can't choose this! It's determined by who engaged with you.
+ +
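The many-to-one aggregation above can be sketched as a sum over engagers. This is a simplified illustration, not the production job: the function name `producer_embedding` and the tuple layout are assumptions, and each audience from the example is collapsed into a single aggregate row.

```python
def producer_embedding(engagers):
    """Sum each engager's InterestedIn vector, scaled by engagement weight and decay.

    `engagers` is a list of (engagement_weight, time_decay, interested_in) tuples,
    where `interested_in` maps cluster name -> score.
    """
    totals = {}
    for weight, decay, interested_in in engagers:
        for cluster, score in interested_in.items():
            totals[cluster] = totals.get(cluster, 0.0) + weight * decay * score
    return totals


# The example above: 1,000 AI enthusiasts and 50 Cooking enthusiasts, no decay yet.
engagers = [
    (1000, 1.0, {"AI": 0.75}),
    (50, 1.0, {"Cooking": 0.80}),
]
raw = producer_embedding(engagers)
total = sum(raw.values())
shares = {cluster: value / total for cluster, value in raw.items()}

for cluster, share in shares.items():
    print(f"{cluster}: {share:.0%}")
```

Note that nothing in the inputs is chosen by the producer: every term comes from the people who engaged.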

The 100-Follower Threshold

+ +

Producer Embeddings only exist for accounts with ≥100 followers:

+ +
Followers < 100:
+  • No Producer Embedding calculated
+  • No algorithmic boost beyond your immediate followers
+  • Your content is essentially invisible to the recommendation system
+
+Followers ≥ 100:
+  • Producer Embedding calculated weekly
+  • Your content enters the recommendation pipeline
+  • Algorithm can show your tweets to users who don't follow you
+
+Practical impact:
+At 99 followers: Only your 99 followers might see your tweets
+At 100 followers: Potentially millions could see your tweets (if well-matched)
+ +

The 0.072 Threshold (Death of a Cluster)

+ +

When a cluster in your InterestedIn drops below 0.072 (7.2%), it gets filtered out completely:

+ +
Week 0:  AI 60%, Cooking 40% (balanced start)
+Week 12: AI 70%, Cooking 30% (drifting)
+Week 24: AI 76%, Cooking 24% (minority struggling)
+Week 40: AI 85%, Cooking 15% (barely visible)
+Week 60: AI 93%, Cooking 7% (below threshold!)
+Week 61: AI 100%, Cooking 0% (Cooking filtered out permanently)
+
+Result: Complete monopolarization from a balanced starting point.
+
+The 0.072 threshold creates a "death spiral"—once a cluster falls below it,
+you stop seeing that content, so you can't engage with it, so it can never
+recover. Permanent filter bubble lock-in.
+ +
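The filtering step can be sketched as follows. Only the 0.072 default comes from the code references; the helper name `filter_clusters` is hypothetical, and the renormalization of surviving clusters is an assumption made here to match the "Week 61" row above.

```python
MIN_CLUSTER_SCORE = 0.072  # default cited from InterestedInParams.scala


def filter_clusters(scores, threshold=MIN_CLUSTER_SCORE):
    """Drop every cluster whose score is below the threshold."""
    return {cluster: s for cluster, s in scores.items() if s >= threshold}


week_60 = {"AI": 0.93, "Cooking": 0.07}
kept = filter_clusters(week_60)

# Assumption of this sketch: survivors are rescaled so they share 100% again.
total = sum(kept.values())
week_61 = {cluster: s / total for cluster, s in kept.items()}

print(kept)
print(week_61)
```

Because the filtered cluster contributes nothing to what you are shown, there is no new engagement to lift its score back over the threshold.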

Why Divergence Happens: The Viral Tweet Trap

+ +

One viral tweet can completely reshape your Producer Embedding:

+ +
Week 0: Your Producer Embedding
+  AI/Tech: 75% (your core audience, 1,000 followers)
+  Cooking: 25% (secondary interest, 300 followers)
+
+Week 1: You post one politics joke (human moment, exploring)
+  Goes viral: 50,000 politics enthusiasts engage
+
+New calculation:
+  Old engagement: 1,000 × 0.75 (AI) = 750
+                  300 × 0.25 (Cooking) = 75
+  New engagement: 50,000 × 0.80 (Politics) = 40,000
+
+After normalization (750 + 75 + 40,000 = 40,825):
+  AI: 750 / 40,825 = 1.8%
+  Cooking: 75 / 40,825 = 0.2%
+  Politics: 40,000 / 40,825 = 98%
+
+Your Producer Embedding is now 98% Politics.
+
+Result: When you post AI/Tech content (your passion), algorithm shows it to
+Politics audiences who don't care. Engagement collapses. You're trapped.
+ +
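The arithmetic in the viral example can be reproduced with a short sketch. The helper names `cluster_shares` and `add_burst` are hypothetical, and the inputs are the illustrative numbers from the example, not measured data.

```python
def cluster_shares(contributions):
    """Divide each cluster's contribution by the total so shares sum to 1.0."""
    total = sum(contributions.values())
    return {cluster: value / total for cluster, value in contributions.items()}


def add_burst(baseline, cluster, engagers, avg_interest):
    """Return a copy of `baseline` with one burst of engagement added to `cluster`."""
    updated = dict(baseline)
    updated[cluster] = updated.get(cluster, 0.0) + engagers * avg_interest
    return updated


# Contributions from the example: 750 (AI/Tech) and 75 (Cooking) before the viral tweet.
baseline = {"AI/Tech": 1_000 * 0.75, "Cooking": 300 * 0.25}
viral = add_burst(baseline, "Politics", engagers=50_000, avg_interest=0.80)

for cluster, share in cluster_shares(viral).items():
    print(f"{cluster}: {share:.1%}")
```

Changing `engagers` shows how quickly a burst dominates: the outcome depends on the size of the new audience relative to the old one, not on what the producer usually posts.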

Recovery Times

+ +

Changing your algorithmic identity is slow:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Scenario | Timeline | Strategy
Shift InterestedIn consumer profile | 8-12 weeks | Stop engaging with dominant cluster entirely. Over-engage with target cluster (40+ interactions/day).
Shift Producer Embeddings producer profile | 12-16+ weeks | Consistently post target content. Manually engage target audience. Accept low reach during transition.
Recover from viral misalignment | 16-24 weeks | Wait for viral engagement to decay (100-day half-life). Sustain core audience engagement. Most don't have patience.
Recover from threshold death (<0.072) | Impossible algorithmically | Must manually rebuild: unfollow dominant cluster, follow target cluster accounts, use "Following" tab.
+ +

Code References

+ +
+

InterestedIn calculation: InterestedInFromKnownFor.scala:26-30

+

Producer Embeddings calculation: ProducerEmbeddingsFromInterestedIn.scala:41-54

+

100-day half-life decay: InterestedInFromKnownFor.scala:88 - val halfLifeInDaysForFavScore = 100

+

100-follower threshold: ProducerEmbeddingsFromInterestedIn.scala:47 - filters for numFollowers >= minNumFollowers where minNumFollowers = 100

+

Weekly batch updates: InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)

+

KnownFor weekly updates: UpdateKnownFor20M145K2020.scala:46 - batchIncrement: Duration = Days(7)

+

L2 normalization: SimClustersEmbedding.scala:59-72

+

0.072 threshold filtering: InterestedInParams.scala:63 - default = 0.072

+
+ +
+ +

The Bottom Line

+ +

You have two algorithmic identities on X:

+ +
    +
  1. InterestedIn (Consumer): Determines what you see. Based on what you engage with. Updates weekly. You have moderate control—but 100-day decay, multiplicative scoring, and threshold death create drift you didn't choose.
  2. +
  3. Producer Embeddings (Producer): Determines who sees you. Based on who engages with you. Updates weekly. Requires ≥100 followers. You have low control: one viral moment can reshape it for months.
  4. +
+ +

These profiles are calculated independently. They can diverge. And you're not fully in control of where either ends up.

+ +

The result: Most users drift into algorithmic states they didn't consciously choose—consuming content that reinforces one cluster, producing for audiences that don't match their interests, or both. The architecture creates paths of least resistance, and users follow them without realizing it's happening.

+ +
+ + + + diff --git a/docs/interactive/cluster-explorer.html b/docs/interactive/cluster-explorer.html new file mode 100644 index 000000000..998bf1ba1 --- /dev/null +++ b/docs/interactive/cluster-explorer.html @@ -0,0 +1,437 @@ + + + + + + Cluster Explorer - Discover Your Algorithmic Communities + + + + + + +
+

Cluster Explorer: Your Algorithmic Communities

+ + + + + +

What Are Clusters?

+ +

Twitter's algorithm doesn't see you as "interested in AI" or "interested in cooking." Instead, it assigns you to invisible communities called clusters—discovered by analyzing the follow graph of 400+ million users.

+ +

The scale: There are approximately 145,000 clusters discovered from the top 20 million most-followed accounts. Each cluster represents a community with similar interests, discovered organically from who follows whom.

+ +

Why clusters matter: Your cluster membership determines what appears in your For You feed. If you're 70% assigned to the "AI/ML Research" cluster and 30% to "Cooking," your feed will reflect that split. The algorithm shows you content from producers (accounts) in your clusters.

+ +

How Clusters Are Discovered

+ +

Clusters aren't manually defined—they emerge from the data using Sparse Binary Matrix Factorization (SBF) with Metropolis-Hastings optimization:

+ +
    +
  1. Build similarity graph: Calculate how similar each producer is to every other producer based on who follows them (cosine similarity of follower patterns)
  2. +
  3. Find communities: Use SBF algorithm with Metropolis-Hastings optimization to group producers who share many followers into clusters, with the constraint that each producer belongs to exactly one cluster (maximally sparse)
  4. +
  5. Assign producers: Each producer gets assigned to their strongest cluster (called "KnownFor")—this sparsity constraint is what creates clear community separation
  6. +
  7. Derive user interests: Your cluster membership comes from the clusters of producers you follow and engage with (called "InterestedIn")
  8. +
+ +
+

The Critical Insight: Your clusters come from ENGAGEMENT (likes, replies, retweets), NOT just follows. Engagement has a 100-day half-life, which means:

+
    +
  • Your likes from today have 100% weight
  • +
  • Your likes from 100 days ago still have 50% weight
  • +
  • Your likes from 200 days ago still have 25% weight
  • +
+

This makes your clusters "sticky"—they change slowly over months, not days. Following diverse accounts isn't enough; you must engage with diverse content.

+
+ +

The Shape of Cluster Assignment

+ +

For new accounts: Your follows determine your initial clusters. If you follow 5 AI researchers and 2 chefs, you'll start ~70% AI cluster, ~30% cooking cluster.

+ +

Over time (weeks to months): Your engagement history dominates. If you like 50 AI tweets and 10 cooking tweets in 100 days, your clusters will drift toward AI regardless of who you follow.

+ +

Long-term steady state: Your clusters reflect the last 100-200 days of engagement with exponential decay. Past behavior has momentum—changing clusters requires sustained engagement pattern changes over 3-6 months.

+ +
+ + + + + +

Experience Your Cluster Assignment

+ +

Use this calculator to see how the algorithm would categorize you based on your follows and engagement patterns. Notice how engagement weights dominate over follows.

+ +
+

Step 1: Choose a Profile

+

+ Start with a preset or build your own custom profile: +

+ +
+ + +
+ +
+

Step 2: Select Accounts You Follow

+

+ These provide your initial cluster assignment. Engagement will shift this over time. +

+ +
+ + + + + + + + + + + + + + + + + + +
+ +

Step 3: Add Your Engagement History

+

+ This is where cluster assignment actually happens. Engagement with 100-day half-life dominates over follows. +

+ +
+ + +

+ Likes, replies, retweets (weighted by engagement type) +

+
+ +
+ + +
+ +
+ + +
+
+ + +
+ + + + +
+ + + + + +

The Technical Details

+ +

How Cluster Assignment Actually Works

+ +

Your cluster membership (InterestedIn) is calculated through matrix multiplication:

+ +
InterestedIn[you] = EngagementGraph[you, producers] × KnownFor[producers, clusters]
+
+Where:
+- EngagementGraph: Your follows + engagement history (100-day half-life)
+- KnownFor: Each producer's primary cluster assignment
+- Result: Your score for each of the ~145,000 clusters
+- Final step: L2 normalization (vector scaled to unit length, so squared scores sum to 1.0)
+ +

Concrete Example

+ +

Let's say you follow and engage with these producers over 100 days:

+ +
Follows:
+- @ylecun (KnownFor: AI cluster)
+- @karpathy (KnownFor: AI cluster)
+- @gordonramsay (KnownFor: Cooking cluster)
+
+Engagement (likes, weighted):
+- AI tweets: 50 engagements
+- Cooking tweets: 30 engagements
+
+Matrix multiplication:
+AI cluster score = (2 follows × follow_weight) + (50 engagements × engagement_weight)
+Cooking cluster score = (1 follow × follow_weight) + (30 engagements × engagement_weight)
+
+If follow_weight = 1.0 and engagement_weight = 5.0 (engagement dominates):
+
+AI: (2 × 1.0) + (50 × 5.0) = 2 + 250 = 252
+Cooking: (1 × 1.0) + (30 × 5.0) = 1 + 150 = 151
+
+Normalization (divide by sum to get percentages):
+Sum = 252 + 151 = 403
+
+AI: 252 / 403 = 0.625 (62.5%)
+Cooking: 151 / 403 = 0.375 (37.5%)
+
+Result: You're assigned 62.5% AI, 37.5% Cooking
+
+Notice: Engagement dominated! Even though you followed 2:1 AI:Cooking,
+your 50:30 engagement ratio (1.67:1) created a 62.5:37.5 final ratio (also 1.67:1).
+The engagement weight (5.0) completely overwhelmed the follow weight (1.0).
+ +
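The concrete example can be checked with a small sketch. The two weights are the illustrative values from the example (1.0 and 5.0), not production constants, and `cluster_scores` is a hypothetical helper name.

```python
FOLLOW_WEIGHT = 1.0       # illustrative value from the example above
ENGAGEMENT_WEIGHT = 5.0   # illustrative value from the example above


def cluster_scores(follows, engagements):
    """Combine follow counts and engagement counts into one score per cluster."""
    clusters = set(follows) | set(engagements)
    return {
        c: follows.get(c, 0) * FOLLOW_WEIGHT + engagements.get(c, 0) * ENGAGEMENT_WEIGHT
        for c in clusters
    }


scores = cluster_scores(
    follows={"AI": 2, "Cooking": 1},
    engagements={"AI": 50, "Cooking": 30},
)
total = sum(scores.values())
shares = {c: s / total for c, s in scores.items()}

for cluster in sorted(shares):
    print(f"{cluster}: score={scores[cluster]:.0f}, share={shares[cluster]:.1%}")
```

Setting `ENGAGEMENT_WEIGHT` to 0 recovers the follows-only split (2:1), which makes it easy to see how much of the final ratio comes from engagement.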

The 100-Day Half-Life Formula

+ +

Engagement decay follows exponential decay with 100-day half-life:

+ +
weight(t) = initial_weight × (0.5)^(days_ago / 100)
+
+Examples:
+- Today (t=0):       weight = 1.0 × (0.5)^(0/100)   = 1.0   (100%)
+- 50 days ago:       weight = 1.0 × (0.5)^(50/100)  = 0.707 (70.7%)
+- 100 days ago:      weight = 1.0 × (0.5)^(100/100) = 0.5   (50% - HALF-LIFE)
+- 200 days ago:      weight = 1.0 × (0.5)^(200/100) = 0.25  (25%)
+- 300 days ago:      weight = 1.0 × (0.5)^(300/100) = 0.125 (12.5%)
+ +

Why this matters: Your engagement from 6 months ago (180 days) still has 29% weight. Your clusters have momentum—they resist change. Diversifying your feed requires sustained engagement pattern changes over 3-6 months, not just following different accounts.
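To estimate how long old engagement keeps mattering, the decay formula can be inverted. The sketch below assumes the same 100-day half-life; `days_until_weight` is a hypothetical helper, not part of the SimClusters code.

```python
import math


def days_until_weight(target_weight, half_life_days=100.0):
    """Days until an engagement's weight decays to `target_weight` (0 < target <= 1)."""
    if not 0.0 < target_weight <= 1.0:
        raise ValueError("target_weight must be in (0, 1]")
    return half_life_days * math.log2(1.0 / target_weight)


for target in (0.5, 0.25, 0.10):
    print(f"weight {target:.0%} is reached after {days_until_weight(target):.0f} days")
```

Solving for the day count makes the momentum concrete: an engagement needs close to a year before its weight falls to a tenth of its original value.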

+ +

Update Frequencies

+ +

Cluster data updates at different cadences:

+ +
    +
  • KnownFor (producer → cluster mapping): Updated weekly (7 days) +
      +
    • Computationally expensive (runs SBF with Metropolis-Hastings on 20M producers)
    • +
    • Changes slowly (follow graph is stable)
    • +
    +
  • +
  • InterestedIn (your cluster interests): Updated weekly (7 days) +
      +
    • Much cheaper (matrix multiplication using existing KnownFor)
    • +
    • Picks up your recent engagement at the next weekly run
    • +
    +
  • +
+ +

Implication: Your clusters lag behind your behavior. It takes up to 1 week for new engagement to affect InterestedIn, and up to 1 week for follow graph changes to affect which clusters exist (KnownFor). Both update on the same weekly schedule.

+ +

Cluster Discovery Process

+ +

The ~145,000 clusters are discovered using Sparse Binary Matrix Factorization (SBF) with Metropolis-Hastings optimization:

+ +
    +
  1. Build follow graph: 400M+ users, filter to top 20M most-followed (producers)
  2. +
  3. Calculate similarity: Cosine similarity between producers based on shared followers +
    similarity(Producer_A, Producer_B) = (shared_followers) / √(followers_A × followers_B)
    +
  4. +
  5. Filter weak edges: Remove producer pairs with low similarity (threshold ~0.1-0.2)
  6. +
  7. Run SBF with Metropolis-Hastings: Iteratively optimize cluster assignments over 4 epochs with sparsity constraint +
    Constraint: Each producer → exactly ONE cluster (maximally sparse)
    +Optimization: Metropolis-Hastings sampling to find best assignments
    +Initialization: Start from previous week's assignments (incremental stability)
    +
  8. +
  9. Result: ~145,000 clusters, each representing a community with shared interests, with clear separation (no producer overlap)
  10. +
+ +
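The similarity formula in step 2 can be sketched with Python sets. This is a toy illustration on a handful of made-up follower IDs; `producer_similarity` is a hypothetical name, and the production job works on a sparse graph of millions of producers rather than in-memory sets.

```python
import math


def producer_similarity(followers_a, followers_b):
    """Cosine similarity of two producers based on their shared followers."""
    if not followers_a or not followers_b:
        return 0.0
    shared = len(followers_a & followers_b)
    return shared / math.sqrt(len(followers_a) * len(followers_b))


# Made-up follower IDs: producer A has 4 followers, producer B has 9, and 2 are shared.
followers_a = {"u1", "u2", "u3", "u4"}
followers_b = {"u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10", "u11"}

similarity = producer_similarity(followers_a, followers_b)
print(f"similarity = {similarity:.3f}")
```

Dividing by the geometric mean of the two follower counts keeps very large accounts from looking similar to everyone just because they share some followers with everyone.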

Why sparsity matters: The "one cluster per producer" constraint is what creates echo chambers by design. Producers can't belong to multiple clusters, which enforces clear community boundaries and limits cross-cluster discovery.

+ +

Why 145,000 Clusters?

+ +

This number is emergent from the follow graph structure, not directly tuned:

+
    +
  • Self-perpetuating: The algorithm reads the max cluster ID from the previous week's data and uses that as the starting point
  • +
  • Not directly configurable: Would require completely rebuilding the clustering from scratch to change
  • +
  • ~145,000 reflects natural community structure: When running SBF on 20M producers with current similarity thresholds
  • +
  • Trade-offs: Too few = overly broad ("Tech" too diverse), too many = too granular (sparse data)
  • +
  • Empirically stable: Week-over-week incremental updates maintain approximately this count
  • +
+ +

Code References

+ +
+

Cluster formation (SBF/Metropolis-Hastings):
+ UpdateKnownForSBFRunner.scala

+ +

KnownFor generation (production job):
+ UpdateKnownFor20M145K2020.scala

+ +

Note on Louvain:
+ Louvain clustering exists in LouvainClusteringMethod.scala but is used for TWICE (alternative multi-embeddings), NOT for main KnownFor cluster formation

+ +

InterestedIn calculation:
+ InterestedInFromKnownFor.scala:292

+ +

100-day half-life decay:
+ favScoreHalfLife100Days (used throughout SimClusters codebase)

+ +

L2 normalization:
+ SimClustersEmbedding.scala:59-72

+ +

Update frequencies (verified from Twitter Engineering Blog):
+ Twitter's Recommendation Algorithm Blog Post (March 2023)

+
+ +

Key Implications

+ +
+

For users trying to diversify:

+
    +
  • Following diverse accounts gives you initial diverse clusters (helpful for new accounts)
  • +
  • But engagement is what matters long-term—you must like/reply to diverse content
  • +
  • Expect 3-6 months to shift cluster balance due to 100-day half-life
  • +
  • You're fighting the gravitational pull effect (multiplicative scoring amplifies dominant clusters)
  • +
+ +

For creators trying to reach audiences:

+
    +
  • Your KnownFor cluster assignment determines who sees your tweets
  • +
  • Clear niche = strong cluster assignment = multiplicative advantage
  • +
  • Multi-topic accounts get weak/diffuse cluster assignment = penalty
  • +
  • Building reach requires building a strong following within ONE cluster first
  • +
+ +

Why echo chambers emerge:

+
    +
  • SBF algorithm enforces sparsity constraint (one cluster per producer = forced separation)
  • +
  • 100-day half-life creates momentum (clusters resist change)
  • +
  • Multiplicative scoring amplifies dominant clusters (gravitational pull)
  • +
  • No built-in exploration or cross-cluster recommendation mechanisms
  • +
  • Echo chamber architecture is the optimization objective, not a side effect
  • +
+
+ +
+ + + + + + diff --git a/docs/interactive/engagement-calculator.html b/docs/interactive/engagement-calculator.html new file mode 100644 index 000000000..1eebac49e --- /dev/null +++ b/docs/interactive/engagement-calculator.html @@ -0,0 +1,531 @@ + + + + + + Engagement Weight Calculator - Tweet Scoring Simulator + + + + + + +
+

Engagement Weight Calculator

+ + + + + +

What Are Engagement Weights?

+ +

Not all engagement is equal. When you like, reply, or retweet a tweet, the algorithm assigns each action a different value. These values—called engagement weights—determine how the algorithm scores every tweet in your feed.

+ +

How it works: The Heavy Ranker (Twitter's scoring model) predicts 15 different types of engagement for every tweet. Each prediction gets multiplied by its weight, and the sum becomes the tweet's score. Higher score = higher ranking in your feed.

+ +

Why weights matter: These weights reveal what Twitter optimizes for. They're not just scoring mechanisms—they shape:

+
    +
  • What content gets amplified - High-weighted engagement (replies: 13.5) drives visibility more than low-weighted engagement (likes: 0.5)
  • +
  • Your algorithmic identity - Your engagement history (weighted by these values) trains your cluster assignment weekly
  • +
  • Creator incentives - Creators optimize for high-weighted engagement, not just high volume
  • +
  • The feedback loop - High-weighted engagement → stronger cluster assignment → see more similar content → more high-weighted engagement
  • +
+ +

The Core Insight: Conversation Over Consumption

+ +

Twitter massively prioritizes conversation (replies, especially with author engagement) over passive consumption (likes, video views):

+ +
Conversation (active):
+- Reply with author engagement: 75.0 (most valuable)
+- Reply: 13.5
+- Good profile click: 12.0
+
+Passive consumption:
+- Favorite (like): 0.5
+- Retweet: 1.0
+- Video playback 50%: 0.005 (nearly worthless)
+ +

The math: One reply with author engagement (75.0) equals 150 likes (75.0 ÷ 0.5). This isn't a bug—it's the business model.

+ +

Why? Conversation drives time on platform:

+
    +
  • Reading replies takes time
  • +
  • Writing replies takes time
  • +
  • Waiting for responses keeps users checking back
  • +
  • Back-and-forth conversations = more ad impressions
  • +
+ +

The Shape of the Weights

+ +

High-value actions (10.0 to 75.0): Deep engagement requiring effort—replies, profile exploration, meaningful clicks. These signal genuine interest.

+ +

Low-value actions (0.005 to 1.0): Passive engagement such as likes, retweets, and basic video viewing. Easy to give, minimal signal.

+ +

Negative actions (-74.0 to -369.0): Explicit dislike or policy violations. Catastrophic penalties that last weeks to months.

+ +
+

The Engagement Weights Paradox: Favorites (likes) are the most visible metric, tracked by all systems, and the easiest engagement. Yet at 0.5 they carry one of the lowest positive weights (only 50% video playback, at 0.005, is lower). The algorithm doesn't care what's "popular" by likes; it cares what generates conversation.

+
+ +
+ + + + + +

Experience How Engagement Weights Work

+ +

Use this calculator to see how different engagement patterns score. Notice how high-weighted engagement (conversation) massively outweighs passive engagement (likes) even when volume is much lower.

+ +

The 15 Engagement Weights

+

These weight values come from X's ML training repository. The algorithm multiplies predicted probabilities by these weights.

+ +
+ +
+ +
+

⚠️ Important Context on Weight Values:

+
    +
  • Parameter structure: The open-source code defines weights as FSBoundedParam values (configurable parameters with default = 0.0 and range ±10,000)
  • +
  • Actual values: The weight values shown here (0.5, 13.5, 75.0, etc.) come from X's separate ML training repository, not the main algorithm repo
  • +
  • Configurability: X can adjust these weights without deploying new code
  • +
  • Current production: We don't know if these exact values are still used in production; they represent X's documented configuration from their ML repo
  • +
+

+ Parameter definitions: HomeGlobalParams.scala:788-930
+ Weight values source: the-algorithm-ml/projects/home/recap +

+
+ +

Tweet Score Calculator

+

+ Compare how different tweet types score. Each scenario represents realistic engagement probabilities. Notice how conversation-driven tweets can outscore viral content. +

+ +
+

Select a Tweet Scenario

+ +
+ +
+

📚 Educational Thread

+

"How to build a neural network from scratch (10 tweet thread with code examples)"

+

High replies with author engagement, good clicks, moderate likes

+
+ +
+

📰 Breaking News

+

"BREAKING: Major tech company announces layoffs. Thread with details ↓"

+

Very high engagement across all types, some profile clicks

+
+ +
+

🔧 Useful Resource

+

"I've compiled 100 free resources for learning data science: [link]"

+

High good clicks, retweets, bookmarks, moderate replies

+
+ + +
+

❤️ Wholesome Content

+

"My daughter just wrote her first line of code. So proud! [cute photo]"

+

Very high likes, low replies, some retweets

+
+ +
+

😂 Viral Meme

+

"me: I'll just check Twitter for 5 minutes [4 hours later meme]"

+

Extremely high likes/retweets, low replies, good clicks

+
+ +
+

💭 Personal Story

+

"Thread about my journey from bootcamp to senior engineer (authentic, relatable)"

+

Balanced engagement, good reply-with-engagement rate

+
+ + +
+

🔥 Hot Take

+

"Unpopular opinion: [controversial tech opinion that sparks debate]"

+

Very high replies, moderate negative feedback, low likes

+
+ +
+

💢 Quote Tweet Dunk

+

"lmao imagine actually believing this [quote tweets bad take]"

+

High replies (debate), high negative feedback, mixed signals

+
+ +
+

🎣 Engagement Bait

+

"Drop a 🔥 if you agree! Follow me for more content like this! #engagement"

+

Moderate replies/likes, HIGH negative feedback (users hate this)

+
+ + +
+

🚫 Spam/Low Quality

+

"CHECK OUT MY CRYPTO COURSE!!! 🚀💰 LINK IN BIO [generic spam]"

+

Very low positive engagement, high negative feedback, reports

+
+ +
+

😬 Reply Guy

+

"Actually, [unsolicited correction on someone's casual tweet]"

+

Low engagement, moderate negative feedback

+
+ +
+

🤖 Algorithm Gaming

+

"Agree or disagree? Comment below! ⬇️ [intentionally vague to drive replies]"

+

High replies but also high negative feedback

+
+
+
+ + + + +

How Weights Train Your Feed

+ +
+

These weights don't just rank tweets—they train the algorithm what YOU care about.

+ +

The weekly feedback loop (FavBasedUserInterestedIn - DEFAULT):

+
    +
  1. You engage with content - All engagement types (likes, replies, clicks) get tracked
  2. +
  3. InterestedIn updates weekly - Your cluster assignment shifts based on engagement patterns
  4. +
  5. High-weighted engagements signal stronger interest - Replies (13.5) signal 27x more interest than likes (0.5)
  6. +
  7. Algorithm shows you more of what you reply to - Not just what you like
  8. +
  9. Loop compounds - More AI replies → stronger AI cluster → see more AI content → even more AI replies
  10. +
+ +

+ Want to see this in action?
+ → Calculate how your engagement shapes your clusters
+ → See how your feed drifts over 6 months +

+ +

Code: InterestedIn uses engagement history: InterestedInFromKnownFor.scala:292

+
+ +
+ + + + + +

The Technical Details

+ +

The Scoring Formula

+ +

Every tweet's base score is calculated as a weighted sum of predicted engagement probabilities:

+ +
tweet_score = Σ(P(engagement_i) × weight_i) + epsilon
+
+where:
+- P(engagement_i) = Heavy Ranker's predicted probability (0.0 to 1.0)
+- weight_i = configured weight from the table above
+- epsilon = 0.001 (small constant to avoid zero scores)
+- i ranges over all 15 engagement types
+ +

Concrete Example

+ +

Let's score a tweet with realistic predictions:

+ +
Heavy Ranker predictions:
+- P(favorite) = 0.20 (20% chance user will like)
+- P(reply) = 0.05 (5% chance user will reply)
+- P(reply_with_author_engagement) = 0.01 (1% chance of reply-back)
+- P(retweet) = 0.10 (10% chance of retweet)
+- P(good_click) = 0.15 (15% chance of meaningful click)
+- P(negative_feedback) = 0.02 (2% chance of "not interested")
+
+Weighted contributions (using March 2023 weights):
+- Favorite: 0.20 × 0.5 = 0.10
+- Reply: 0.05 × 13.5 = 0.675
+- Reply w/ author: 0.01 × 75.0 = 0.75
+- Retweet: 0.10 × 1.0 = 0.10
+- Good click: 0.15 × 11.0 = 1.65
+- Negative: 0.02 × -74.0 = -1.48
+- Epsilon: 0.001
+
+Total score = 0.10 + 0.675 + 0.75 + 0.10 + 1.65 - 1.48 + 0.001 = 1.796
+ +

Key observation: The 1% reply-with-author-engagement probability (0.75 contribution) contributes more than the 20% favorite probability (0.10 contribution). This is by design.
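The weighted sum from the concrete example can be written as a short sketch. The weights are the March 2023 values quoted in this document and may not match current production; `tweet_score` and the dictionary keys are hypothetical names chosen for illustration.

```python
# March 2023 weights quoted in this document; production values may differ.
WEIGHTS = {
    "favorite": 0.5,
    "reply": 13.5,
    "reply_engaged_by_author": 75.0,
    "retweet": 1.0,
    "good_click": 11.0,
    "negative_feedback": -74.0,
}
EPSILON = 0.001  # small constant added to every score


def tweet_score(predictions, weights=WEIGHTS, epsilon=EPSILON):
    """Weighted sum of predicted engagement probabilities plus epsilon."""
    return sum(p * weights[name] for name, p in predictions.items()) + epsilon


predictions = {
    "favorite": 0.20,
    "reply": 0.05,
    "reply_engaged_by_author": 0.01,
    "retweet": 0.10,
    "good_click": 0.15,
    "negative_feedback": 0.02,
}
score = tweet_score(predictions)
print(f"score = {score:.3f}")
```

Setting `negative_feedback` to 0.0 in `predictions` shows how much a 2% predicted chance of "not interested" costs relative to the positive contributions.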

+ +

Why These Specific Weights?

+ +

From Twitter's official documentation:

+ +
+ "Each engagement has a different average probability, the weights were originally set so that, on average, each weighted engagement probability contributes a near-equal amount to the score. Since then, we have periodically adjusted the weights to optimize for platform metrics." +
+ +

Translation:

+
    +
  • Initial design: Normalize by rarity (rare actions get high weights, common actions get low weights)
  • +
  • Current state: Tuned to optimize Twitter's business goals—time on platform, conversation depth, user retention
  • +
+ +

Configurability: All Weights Are Tunable

+ +

Every weight is defined as an FSBoundedParam, meaning X can adjust them without deploying new code:

+ +
// From HomeGlobalParams.scala:788-930
+object ReplyEngagedByAuthorParam extends FSBoundedParam[Double](
+  name = "home_mixer_model_weight_reply_engaged_by_author",
+  default = 0.0,        // Not actual value - just placeholder
+  min = -10000.0,       // Can be negative (penalty)
+  max = 10000.0         // Can be very positive (amplification)
+)
+ +

Where the actual weight values come from:

+
    +
  • The open-source algorithm repo shows parameter structure (default = 0.0 is a placeholder)
  • +
  • X's ML training repo (the-algorithm-ml/projects/home/recap) documents the actual production configuration
  • +
  • The values we show (0.5, 13.5, 75.0, etc.) come from that ML repo
  • +
  • These represent X's documented weights, but current production may differ
  • +
+ +

What this means:

+
    +
  • X can A/B test different weight configurations
  • +
  • Weights can be adjusted per-user or per-region
  • +
  • Production values may differ from the documented ML repo values
  • +
  • X can pivot platform priorities by adjusting weights (more video, less conversation, etc.)
  • +
+ +

The Complete Weight Table (March 2023)

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Engagement Type | Weight | Relative to Favorite | Category
Reply Engaged by Author | 75.0 | 150x | 🏆 Conversation
Reply | 13.5 | 27x | 💬 Conversation
Good Profile Click | 12.0 | 24x | 🔍 Deep Engagement
Good Click V1 | 11.0 | 22x | 🔍 Deep Engagement
Good Click V2 | 10.0 | 20x | 🔍 Deep Engagement
Retweet | 1.0 | 2x | 🔄 Sharing
Favorite (Like) | 0.5 | 1x (baseline) | ❤️ Passive
Video Playback 50% | 0.005 | 0.01x | 📹 Passive
Negative Feedback V2 | -74.0 | -148x | ❌ Negative
Report | -369.0 | -738x | ☠️ Nuclear
+ +

Note: Bookmark, Share, Dwell, Video Quality View, and Video Extended weights are configurable but not disclosed in the March 2023 snapshot.
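The "Relative to Favorite" column is simply each weight divided by the Favorite weight. A minimal sketch that recomputes it from the documented values (the object and function names are illustrative, not from the codebase):

```javascript
// Documented March 2023 weights, copied from the table above.
const weights = {
  "Reply Engaged by Author": 75.0,
  "Reply": 13.5,
  "Good Profile Click": 12.0,
  "Good Click V1": 11.0,
  "Good Click V2": 10.0,
  "Retweet": 1.0,
  "Favorite (Like)": 0.5,
  "Video Playback 50%": 0.005,
  "Negative Feedback V2": -74.0,
  "Report": -369.0,
};

// Express a weight as a multiple of the Favorite weight (the baseline).
function relativeToFavorite(weight) {
  return weight / weights["Favorite (Like)"];
}

for (const [name, weight] of Object.entries(weights)) {
  console.log(`${name}: ${relativeToFavorite(weight)}x`);
}
```

Using Favorite as the baseline makes the spread easy to read: the largest positive and negative weights sit two to three orders of magnitude away from a like.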

+ +

The Nuclear Penalties

+ +

Negative Feedback V2: -74.0

+ +

Equivalent in magnitude to 148 favorites, but negative

+ +

Triggered by: "Not interested" click, "See less often" click, muting author

+ +

Duration: 140-day linear decay

+
Day 0: 0.2x multiplier (80% penalty - nearly invisible)
+Day 70: 0.6x multiplier (40% penalty - recovering)
+Day 140: 1.0x multiplier (penalty expires)
+ +

Impact: A single "not interested" click suppresses content from that author for 140 days (about 4.6 months), with the strongest suppression at the start.
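The three data points above fit a straight line from 0.2x on day 0 to 1.0x on day 140. A minimal sketch of that decay, assuming the multiplier is linear between those endpoints (the function and constant names are illustrative):

```javascript
// Sketch of the 140-day linear decay described above.
// Assumes the multiplier rises linearly from 0.2 on day 0 to 1.0 on day 140.
const DECAY_DAYS = 140;
const INITIAL_MULTIPLIER = 0.2;

function feedbackMultiplier(daysSinceFeedback) {
  // Clamp so that inputs outside [0, 140] stay within the documented range.
  const clamped = Math.min(Math.max(daysSinceFeedback, 0), DECAY_DAYS);
  const progress = clamped / DECAY_DAYS;
  return INITIAL_MULTIPLIER + (1 - INITIAL_MULTIPLIER) * progress;
}

for (const day of [0, 70, 140]) {
  console.log(`Day ${day}: ${feedbackMultiplier(day).toFixed(1)}x`);
}
```

Under this reading the penalty weakens by a constant amount each day, so half of it is gone at the 70-day midpoint.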

+ +

Report: -369.0

+ +

Equivalent in magnitude to 738 favorites, but negative

+ +

Triggered by: Explicit report for spam, harassment, misinformation, or other policy violations

+ +

To overcome ONE report, a tweet would need:

+
738 favorites (at 0.5 weight each), OR
+28 replies (at 13.5 weight each), OR
+5 reply-with-author-engagements (at 75.0 weight each)
+ +

In practice: Impossible to overcome. Reports are a platform-level harm signal. Content that gets reported is effectively dead.
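The break-even counts listed above come from dividing the size of the report penalty by each positive weight and rounding up. A small sketch, assuming weights add up linearly (a simplification; the function name is illustrative):

```javascript
// How many engagements of one type offset a single report,
// assuming simple additive weights (a simplification for illustration).
const REPORT_WEIGHT = -369.0;

function engagementsToOffsetReport(positiveWeight) {
  return Math.ceil(Math.abs(REPORT_WEIGHT) / positiveWeight);
}

console.log(engagementsToOffsetReport(0.5));  // favorites
console.log(engagementsToOffsetReport(13.5)); // replies
console.log(engagementsToOffsetReport(75.0)); // replies engaged by author
```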

+ +

Code References

+ +
+

Engagement type definitions:
+ PredictedScoreFeature.scala:62-336

+ +

Weight configuration:
+ HomeGlobalParams.scala:788-930

+ +

Scoring implementation:
+ NaviModelScorer.scala:139-178

+ +

Heavy Ranker (MaskNet) details:
+ External repo: the-algorithm-ml/projects/home/recap/README.md

+ +

InterestedIn uses engagement:
+ InterestedInFromKnownFor.scala:292

+
+ +

What The Algorithm Doesn't Know

+ +

The Heavy Ranker predicts engagement probability, not:

+
    +
  • Truth: No fact-checking in the scoring model
  • +
  • Quality: Only engagement likelihood, not content value
  • +
  • Intent: Can't distinguish productive debate from toxic conflict
  • +
  • Context: Replies could be agreement or furious disagreement—same weight either way
  • +
+ +

The algorithm optimizes for engagement, not for truth, quality, or societal value. This is a business decision, not a technical limitation.

+ +

Key Implications

+ +
+

For users trying to shape their feed:

+
    +
  • Your likes barely affect your feed (0.5 weight)
  • +
  • Your replies HEAVILY affect your feed (13.5 weight, 27x more)
  • +
  • Want diverse content? Reply to diverse content, not just like it
  • +
  • "Not interested" clicks are very effective (140-day penalty)
  • +
+ +

For creators trying to maximize reach:

+
    +
  • Optimize for conversation (replies: 13.5, reply w/ author: 75.0)
  • +
  • ENGAGE WITH YOUR REPLIES (this is the 75.0 weight—highest value action!)
  • +
  • Drive curiosity (profile clicks: 12.0)
  • +
  • Avoid like-bait (0.5 weight, minimal value)
  • +
  • Avoid negative feedback triggers (-74.0 = death sentence for 5 months)
  • +
+ +

The conversation advantage:

+
    +
  • One reply-with-author-engagement (75.0) = 150 likes
  • +
  • Creators who engage with replies have massive algorithmic advantage
  • +
  • A tweet with 10 likes + 5 engaged replies (10 × 0.5 + 5 × 75.0 = 380) carries about the same weight as 760 passive likes
  • +
+
+ +
+ + + + + + diff --git a/docs/interactive/invisible-filter.html b/docs/interactive/invisible-filter.html new file mode 100644 index 000000000..b0288c1ac --- /dev/null +++ b/docs/interactive/invisible-filter.html @@ -0,0 +1,315 @@ + + + + + + The Invisible Filter - How Clusters Create Different Realities + + + + + +
+

The Invisible Filter

+

You and your friend see the same tweets, but in completely different orders. This is how clusters create filter bubbles.

+ + + + + + +

What Are Cluster Filters?

+ +

Twitter doesn't show everyone the same feed in the same order. Even if you and your friend follow similar accounts and see identical tweets, those tweets will rank completely differently for each of you. This isn't random—it's driven by an invisible mechanism called cluster-based personalization.

+ +

The Core Mechanism

+ +

Twitter assigns every user and every tweet to invisible communities called "clusters" (there are ~145,000 of them). Think of clusters as interest groups: "AI/Tech enthusiasts," "Cooking fans," "Political junkies," etc. When scoring tweets for your feed, the algorithm multiplies each tweet's base quality score by your cluster interest.

+ +

The result: The exact same tweet with the exact same base quality will score completely differently for you vs your friend based on which clusters you belong to and how strongly.

+ +

Why This Matters

+ +

This mechanism creates different realities on the same platform:

+
    +
  • Same content, different visibility: What's #1 in your feed might be #15 in your friend's feed
  • +
  • Invisible and uncontrollable: You can't see your cluster assignments, can't opt out, can't manually balance them
  • +
  • Reinforces over time: Each engagement strengthens your cluster interests, making the filter stronger
  • +
  • Creates echo chambers by design: The more you engage with a topic, the more of it you see, the less of everything else you see
  • +
+ +

The Shape of the Behavior

+ +

Cluster filtering happens through multiplication, not addition. If you're 60% interested in AI/Tech and 15% interested in Politics:

+ +
AI tweet (base quality: 0.85):
+  Your score: 0.85 × 0.60 = 0.51
+
+Politics tweet (same base quality: 0.85):
+  Your score: 0.85 × 0.15 = 0.128
+
+Same quality, 4× score difference just from clusters!
+ +

Consequence: Your existing interests get amplified, minority interests get suppressed, and you drift toward increasingly concentrated feeds over time.

+ +
+

New to clusters? See the Cluster Explorer to understand where these communities come from and how they're based on who you follow.

+
+ +
+ + + + + + +

Experience The Filter

+ +

See how cluster interests create completely different feeds for different people. Adjust YOUR cluster interests and compare against a friend's profile—the same 15 tweets will rank in completely different orders.

+ +

Configure Your Profiles

+ +
+
+

👤 You

+

+ Adjust your cluster interests to see how it affects your feed ranking. +

+ +
+ + +
+ +
+ + +
+ +
+ + +
+ +

+ Total: 100% +

+
+ +
+

👥 Your Friend

+

+ Select a friend's profile to compare against yours. +

+ +
+ + + + +
+ +
+
+ ■ AI/Tech + 15% +
+
+ ■ Cooking + 5% +
+
+ ■ Politics + 80% +
+
+
+
+ + + + + +
+ + + + + + +

The Technical Details

+ +

The Scoring Formula

+ +

Each tweet's final score is calculated as:

+ +
final_score = base_quality_score × your_cluster_interest
+
+Where:
+- base_quality_score = tweet's inherent quality (0.0 to 1.0)
+- your_cluster_interest = your interest in the tweet's cluster (0.0 to 1.0)
+
+Example:
+Tweet belongs to AI/Tech cluster (cluster_id: 12345)
+Base quality: 0.85
+Your AI/Tech interest: 0.60
+Your friend's AI/Tech interest: 0.15
+
+Your score: 0.85 × 0.60 = 0.51
+Friend's score: 0.85 × 0.15 = 0.128
+
+Same tweet, 4× score difference!
+ +
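The formula above can be turned into a tiny ranking function. This is a sketch of the simplified model on this page, not the production scorer; the tweets, the two profiles, and the function name `rankFor` are made up for illustration.

```javascript
// Simplified cluster filter: final score = base quality * cluster interest.
// Tweets and interest profiles are made-up examples.
const tweets = [
  { id: "ai-1", cluster: "AI/Tech", baseQuality: 0.85 },
  { id: "politics-1", cluster: "Politics", baseQuality: 0.85 },
  { id: "cooking-1", cluster: "Cooking", baseQuality: 0.85 },
];

const you = new Map([["AI/Tech", 0.60], ["Cooking", 0.25], ["Politics", 0.15]]);
const friend = new Map([["AI/Tech", 0.15], ["Cooking", 0.05], ["Politics", 0.80]]);

// Score every candidate for one profile and sort best-first.
function rankFor(profile, candidates) {
  return candidates
    .map((t) => ({ id: t.id, score: t.baseQuality * (profile.get(t.cluster) ?? 0) }))
    .sort((a, b) => b.score - a.score);
}

console.log(rankFor(you, tweets).map((t) => t.id));
console.log(rankFor(friend, tweets).map((t) => t.id));
```

All three tweets have the same base quality, so the two orderings differ only because of the interest profiles.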

How Clusters Affect the Full Pipeline

+ +

Cluster-based personalization doesn't just happen once—it happens at multiple stages, compounding the effect.

+ +

Stage 1: Candidate Generation

+

Before any engagement scoring happens, the algorithm fetches candidates based on YOUR clusters:

+
Your clusters: 60% AI, 25% Cooking, 15% Politics
+
+Candidate fetching:
+- Fetch 800 tweets from AI cluster
+- Fetch 800 tweets from Cooking cluster
+- Fetch 800 tweets from Politics cluster
+
+Initial bias: Every fetch targets one of your clusters, so topics outside them start at a disadvantage before any scoring happens!
+ +

Stage 2: Cluster Scoring (What This Simulator Shows)

+

Each tweet gets multiplied by your cluster interest:

+
AI tweet (base quality: 0.85):
+- Your score: 0.85 × 0.60 = 0.51
+- Friend's score: 0.85 × 0.15 = 0.128
+
+Politics tweet (base quality: 0.85):
+- Your score: 0.85 × 0.15 = 0.128
+- Friend's score: 0.85 × 0.80 = 0.68
+
+Same-quality Politics tweet, 5.3× score difference between you and your friend!
+ +

Stage 3: Engagement Scoring

+

After cluster multiplication, engagement weights are applied. But cluster filtering already determined which tweets you see!

+ +

The Compound Effect

+

Cluster scoring happens at MULTIPLE stages:

+
    +
  • Candidate Generation: Determines which tweets are fetched (Stage 1)
  • +
  • ML Scoring: Multiplies scores (Stage 3)
  • +
  • Result: Double amplification of your existing interests
  • +
+ +

This is why 60/40 becomes 76/24 in the Journey Simulator - the multiplicative effect compounds at multiple stages.

+ +
+ +

Code References

+
+

Cluster assignment (InterestedIn): InterestedInFromKnownFor.scala:26-30

+

Multiplicative scoring: ApproximateCosineSimilarity.scala:84-94

+

Cluster count: ~145,000 clusters total (most users assigned to 10-20)

+

L2 normalization: SimClustersEmbedding.scala:59-72

+
+ +
+ + + + + + diff --git a/docs/interactive/journey-simulator.html b/docs/interactive/journey-simulator.html new file mode 100644 index 000000000..daf60a05f --- /dev/null +++ b/docs/interactive/journey-simulator.html @@ -0,0 +1,229 @@ + + + + + + Journey Simulator - How Your Feed Will Drift + + + + + + +
+

Your Feed Journey Simulator

+

Experience how your feed will drift over time, even if you don't change anything.

+ + + + + + +

What Is The Gravitational Pull Effect?

+ +

Your interests won't stay balanced. Even if you start with 60% AI and 40% Cooking, you'll drift to 76% AI and 24% Cooking over 6 months—without consciously changing your behavior. This isn't a bug, it's how multiplicative scoring works.

+ +

The Core Mechanism

+ +

The algorithm uses multiplication, not addition, to score tweets. Your dominant interest gets a scoring advantage, which means you see more of it, which means you engage more with it, which increases the advantage, creating a feedback loop.

+ +
Multiplicative scoring (what actually happens):
+AI tweet: base_score × 0.60 = higher score
+Cooking tweet: base_score × 0.40 = lower score
+
+Result: You see more AI → engage more with AI → AI interest increases → cycle repeats
+ +

Why This Matters

+ +
    +
  • Inevitable drift: You can't maintain balance without active intervention
  • +
  • Echo chambers by design: The algorithm concentrates your interests over time
  • +
  • Loss of diversity: Minority interests get progressively buried
  • +
  • Unconscious shift: Most users never realize this is happening to them
  • +
+ +

The Shape of the Drift

+ +

The drift is fastest at first, then plateaus. The first 12 weeks see rapid change (60/40 → 70/30), then it slows as you approach saturation (a plateau around 80/20 is typical). The algorithm doesn't show you what you want; it shows you what it predicts you'll engage with.

+ +
+ + + + + + +

Simulate Your Own Journey

+ +

Your Starting Point

+

When you first joined X, you followed a mix of accounts. Let's say you followed accounts in two interest areas. What was your initial split?

+ +
+

Configure Your Interests

+ +
+ + +
+ +
+ + +
+ +
+ + +
+ 50/50 (balanced) + 80/20 (skewed) +
+
+ +
+ + +
+ +
+ +

+ When enabled, X will recommend accounts from your dominant interest, accelerating the gravitational pull. +

+
+ + +
+ + + +
+ + + + + + +

The Technical Details

+ +

How This Simulator Works

+

This simulator models the gravitational pull effect based on the actual algorithm code:

+
    +
  • Multiplicative scoring: Cluster interest multiplies tweet scores at candidate generation AND ML scoring stages
  • +
  • Engagement reinforcement: More visibility → more engagement → higher interest score
  • +
  • L2 normalization: Your interest vector is rescaled to unit length (the squared interests sum to 1.0), so one interest growing shrinks the relative share of the others (zero-sum game)
  • +
  • FRS acceleration: Follow recommendations create a triple reinforcement (see content → engage → follow)
  • +
  • Weekly batches: InterestedIn scores update weekly via batch jobs, not real-time
  • +
+ +
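The feedback loop in the list above can be sketched as a small weekly update. This is an illustrative model, not the simulator's actual formula and not the production algorithm. It assumes visibility is proportional to interest squared (the multiplier applied at two stages), that shares are renormalized to sum to 1 as the sliders do, and that each weekly batch moves interest a fraction `rate` toward the observed engagement share. Both the rule and the value of `rate` are assumptions.

```javascript
// Illustrative drift model, NOT the simulator's actual formula.
// Assumptions: visibility ~ interest squared, and each weekly batch moves
// interest a fraction `rate` of the way toward the engagement share.
function simulateDrift(initial, weeks, rate = 0.1) {
  let interests = [...initial];
  const history = [interests];
  for (let week = 0; week < weeks; week++) {
    const visibility = interests.map((x) => x * x);
    const total = visibility.reduce((sum, v) => sum + v, 0);
    const engagementShare = visibility.map((v) => v / total);
    interests = interests.map(
      (x, i) => (1 - rate) * x + rate * engagementShare[i]
    );
    history.push(interests);
  }
  return history;
}

const history = simulateDrift([0.6, 0.4], 24);
console.log(history[24].map((x) => x.toFixed(2)));
```

The exact numbers depend entirely on the assumed update rule, so they will not match the simulator's output. The direction does not depend on it: under any rule of this shape, the dominant interest gains share every week.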

Simplifications in This Model

+

This simulator uses simplified formulas for illustration. The actual algorithm:

+
    +
  • Assigns you to ~10-20 clusters (out of 145,000 total), not just 2
  • +
  • Uses complex engagement prediction models with 6,000+ features per tweet
  • +
  • Applies multiple filters and penalties that compound
  • +
  • Updates in weekly batches, creating 0-7 day lag
  • +
+ +

Technical Details

+
+

Multiplicative scoring: ApproximateCosineSimilarity.scala:94

+

InterestedIn calculation: InterestedInFromKnownFor.scala:26-30

+

Weekly batch updates: InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)

+

L2 normalization: SimClustersEmbedding.scala:59-72

+
+
+ + + + + + diff --git a/docs/interactive/phoenix-sequence-prediction.html b/docs/interactive/phoenix-sequence-prediction.html new file mode 100644 index 000000000..97b6c4b4b --- /dev/null +++ b/docs/interactive/phoenix-sequence-prediction.html @@ -0,0 +1,832 @@ + + + + + + Phoenix: Behavioral Sequence Prediction System - Twitter Algorithm Analysis + + + + + +
+

Phoenix: The Behavioral Prediction System

+

Twitter built a sequence-based prediction system for user behavior. Instead of aggregating features, Phoenix models up to 522 of your recent actions (spanning hours to days of behavior) to predict what you'll do next—like, reply, click. The architecture suggests a fundamental shift from feature-based to sequence-based recommendation.

+ +
+

Important: This analysis is based on code structure and architecture patterns. While the infrastructure is verifiably complete, some aspects (like training objectives and behavioral modeling details) are inferred from architectural similarities to transformer-based systems. We clearly mark what's verified code vs. reasoned inference throughout.

+
+ +
+

Status: Phoenix infrastructure is complete and production-ready (September 2025 commit). It's currently feature-flagged (default = false), suggesting it may be in a testing phase. The architecture represents a shift from feature-based to sequence-based prediction.

+
+ + + + + + +

From Averages to Sequences

+ +

The current recommendation system (NaviModelScorer) thinks about you in terms of averages and statistics: "Alice likes 30% tech content, 20% sports, follows 342 people, engages 10 times per day." Phoenix thinks about you in terms of what you're doing right now: "Alice just clicked 3 tech tweets in a row, expanded photos, watched a video—she's deep-diving into tech content."

+ +

The Core Difference

+ +
+
+

Current System: NaviModelScorer

+

Feature-Based Prediction

+ +

Your profile:

+
User features: {
+  avg_likes_per_day: 10.5
+  avg_replies_per_day: 2.3
+  favorite_topics: [tech, sports]
+  follower_count: 342
+  engagement_rate: 0.15
+  ... (many aggregated features)
+}
+ +

Algorithm asks: "What does Alice usually like?"

+

Time horizon: Months of aggregated behavior

+

Updates: Daily batch recalculation

+
+ +
+

Phoenix System

+

Sequence-Based Prediction

+ +

Your recent actions:

+
Action sequence: [
+  CLICK(tech_tweet_1)
+  READ(tech_tweet_1)
+  LIKE(tech_tweet_1)
+  CLICK(tech_tweet_2)
+  EXPAND_IMAGE(tech_tweet_2)
+  CLICK(tech_tweet_3)
+  ... (up to 522 aggregated actions)
+]
+ +

Algorithm asks: "What will Alice do next given her recent behavioral pattern?"

+

Time horizon: Hours to days of behavioral history (522 aggregated actions)

+

Updates: Real-time action capture, aggregated into sessions

+
+
+ +

The LLM Analogy (Inferred from Architecture)

+ +

Hypothesis: Phoenix uses a transformer-based architecture similar to language models, but instead of predicting text, it predicts actions. This inference is based on:

+ + +

Comparison to language models:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Aspect | ChatGPT / Claude | Phoenix
Architecture | Transformer (attention-based) | Transformer (attention-based)
Input | Sequence of tokens (words) | Sequence of actions (likes, clicks, replies)
Context Window | 8K-200K tokens | 522 aggregated actions (hours to days of behavior)
Prediction Task | "What word comes next?" | "What action comes next?"
Output | Probability distribution over vocabulary | Probability distribution over 13 action types
Training Objective | Predict next token from context | Predict next action from behavioral context
What It Learns | Language patterns, grammar, context | Behavioral patterns, engagement momentum, intent
+ +

What Phoenix Could Capture (Inference from Sequence Modeling)

+ +

Hypothesis: By modeling behavior as a sequence, Phoenix could understand dynamics that aggregated features miss. These capabilities are inferred from how sequence models typically work, not explicitly verified in code:

+ +
+

1. Session-Level Interest

+
Scenario: User interested in both Tech and Sports (50/50 split)
+
+Navi prediction: 50% tech, 50% sports (always the same)
+
+Phoenix prediction:
+  Monday morning: [TECH] [TECH] [TECH] [TECH] → 85% tech, 15% sports
+  Monday evening: [SPORTS] [SPORTS] [SPORTS] → 10% tech, 90% sports
+
+Same user, different behavioral context → different predictions
+ +

2. Behavioral Momentum

+
Engagement Streak:
+[LIKE] [REPLY] [LIKE] [LIKE] [CLICK] [LIKE] → High engagement mode
+Phoenix: Next tweet gets 75% engagement probability
+
+Passive Browsing:
+[SCROLL] [SCROLL] [CLICK] [SCROLL] → Low engagement mode
+Phoenix: Next tweet gets 15% engagement probability
+
+Same user, different momentum → different feed composition
+ +

3. Context Switches

+
Context Switch Detection:
+[NEWS] [NEWS] [NEWS] → [MEME] [MEME] → Context switch!
+
+Phoenix recognizes: User shifted from serious content to entertainment
+Adapts feed: More memes, less news (for this session)
+ +

4. Intent Signals

+
Behavioral Pattern: Profile Click + Follow
+[CLICK_TWEET] → [CLICK_PROFILE] → [FOLLOW] → Next tweet from that author
+
+Phoenix learns: Profile click + follow = strong interest signal
+Result: Boost similar authors immediately
+
+ +

Why This Could Change Everything

+ +

Hypothesis: Phoenix could represent Twitter's move toward "delete heuristics"—the vision of replacing manual tuning with learned patterns. This interpretation is based on architectural design patterns:

+ +
+
+

What Gets Deleted

+
    +
  • Manual weights: Reply Engaged by Author: 75.0, Favorite: 0.5, Report: -369.0 → Phoenix learns what matters from data
  • +
  • Hand-crafted aggregated features: avg_likes_per_day, favorite_topics, engagement_rate → Just action sequences
  • +
  • 15+ manual penalties: OON penalty, author diversity, feedback fatigue → Phoenix learns user preferences
  • +
  • Static predictions: "Alice likes 30% tech" → "Alice is deep-diving tech RIGHT NOW"
  • +
+
+
+ +

The result: An algorithm that understands your current intent from your behavioral patterns, not your historical preferences from aggregated statistics. This is closer to how humans actually browse—following threads of interest as they emerge, not mechanically consuming averaged content.

+ +
+ + + + + + +

Experience Behavioral Prediction (Simulation)

+ +

This simulator demonstrates how sequence-based prediction could work based on Phoenix's architecture. The predictions shown are illustrative of what behavioral sequence modeling enables, not actual Phoenix output.

+ +
+

Behavioral Sequence Simulator

+ +
+

Your Recent Actions (Last 8 actions):

+

Click actions to add them to your sequence. Phoenix analyzes your behavioral pattern and predicts your next action.

+ +
+ + + + + + + + + +
+ +
+
+

Your Sequence:

+ +
+
+ No actions yet. Click buttons above to build your sequence. +
+
+ + +
+ +
+

Try These Patterns:

+
    +
  • Deep Dive: Click [Tech] [Tech] [Tech] [Tech] → Phoenix detects focused exploration
  • +
  • Engagement Streak: [Like Tech] [Reply Tech] [Like Tech] → High momentum mode
  • +
  • Context Switch: [Tech] [Tech] [Sports] [Sports] → Phoenix adapts to interest shift
  • +
  • Passive Browsing: [Scroll] [Scroll] [Scroll] → Low engagement mode
  • +
+
+
+ +
+ + + + + + +

The Technical Architecture

+ +

Two-Stage Pipeline

+ +

Phoenix splits prediction and aggregation into two separate stages:

+ +
Stage 1: PhoenixScorer (Prediction via gRPC)
+  Input: User action sequence (522 aggregated actions by default, capped at 1024) + candidate tweets
+  Process: Transformer model predicts engagement probabilities
+  Output: 13 predicted probabilities per tweet
+
+Stage 2: PhoenixModelRerankingScorer (Aggregation)
+  Input: 13 predicted probabilities from Stage 1
+  Process: Per-head normalization + weighted aggregation
+  Output: Final Phoenix score for ranking
+ +

Stage 1: Behavioral Sequence Prediction

+ +

Code: PhoenixScorer.scala:30-85

+ +

Input: Action Sequence

+
User action sequence (522 aggregated actions spanning hours to days):
+[
+  Session 1: FAV(tweet_123, author_A) + CLICK(tweet_123, author_A),
+  Session 2: CLICK(tweet_456, author_B),
+  Session 3: REPLY(tweet_789, author_C) + FAV(tweet_790, author_C),
+  Session 4: FAV(tweet_234, author_A),
+  ...
+  Session 522: CLICK(tweet_999, author_D)
+]
+
+(Actions grouped into sessions using 5-minute proximity windows)
+
+Candidate tweets: [tweet_X, tweet_Y, tweet_Z]
+ +

Processing: Transformer Model (Inferred Architecture)

+

Verified: Phoenix calls an external gRPC service named user_history_transformer (dependency in BUILD.bazel:20, client interface RecsysPredictorGrpc in PhoenixUtils.scala:26, usage in PhoenixUtils.scala:110-135)
+ Note: The actual service implementation is not in the open-source repository.
+ Inferred: The internal architecture likely follows transformer patterns based on the service name and sequence-to-sequence design:

+ +
Inferred Transformer Architecture:
+  1. Embed each action in the sequence (action type + tweet metadata)
+  2. Apply self-attention to identify relevant behavioral patterns
+  3. For each candidate tweet, compute relevance to behavioral context
+  4. Output 13 engagement probabilities via softmax
+
+Verified Output Format (log probabilities):
+{
+  "tweet_X": {
+    "SERVER_TWEET_FAV": {"log_prob": -0.868, "prob": 0.42},
+    "SERVER_TWEET_REPLY": {"log_prob": -2.526, "prob": 0.08},
+    "SERVER_TWEET_RETWEET": {"log_prob": -2.996, "prob": 0.05},
+    "CLIENT_TWEET_CLICK": {"log_prob": -1.273, "prob": 0.28},
+    ... (9 more engagement types)
+  },
+  ...
+}
+ +
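In the output format above, each `prob` is the exponential of its `log_prob`. A short sketch that checks this with the sample values (the function name is illustrative):

```javascript
// Convert a natural-log probability back to a probability.
function toProbability(logProb) {
  return Math.exp(logProb);
}

// Sample log probabilities copied from the output format above.
const sampleLogProbs = {
  SERVER_TWEET_FAV: -0.868,
  SERVER_TWEET_REPLY: -2.526,
  SERVER_TWEET_RETWEET: -2.996,
  CLIENT_TWEET_CLICK: -1.273,
};

for (const [head, logProb] of Object.entries(sampleLogProbs)) {
  console.log(`${head}: ${toProbability(logProb).toFixed(2)}`);
}
```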

Why gRPC Service? (Verified: separate service, inferred: reasons)

+
    +
  • Verified: Phoenix calls external gRPC service for predictions (PhoenixUtils.scala:110-135)
  • +
  • Inferred: Sequence model inference is compute-intensive (likely GPU/TPU accelerated)
  • +
  • Inferred: Separate service allows independent scaling
  • +
  • Inferred: Runs on specialized ML infrastructure, not home-mixer cluster
  • +
+ +

Stage 2: Per-Head Normalization and Aggregation

+ +

Code: PhoenixModelRerankingScorer.scala:23-81

+ +

Step 1: Per-Head Max Normalization

+

For each engagement type (each "head"), find the maximum prediction across all candidates:

+ +
3 candidates, 3 engagement types:
+Candidate A: [FAV: 0.42, REPLY: 0.08, CLICK: 0.28]
+Candidate B: [FAV: 0.15, REPLY: 0.35, CLICK: 0.20]
+Candidate C: [FAV: 0.30, REPLY: 0.12, CLICK: 0.25]
+
+Per-head max:
+  Max FAV: 0.42
+  Max REPLY: 0.35
+  Max CLICK: 0.28
+
+Attach max to each candidate for normalized comparison:
+Candidate A: [(0.42, max:0.42), (0.08, max:0.35), (0.28, max:0.28)]
+Candidate B: [(0.15, max:0.42), (0.35, max:0.35), (0.20, max:0.28)]
+Candidate C: [(0.30, max:0.42), (0.12, max:0.35), (0.25, max:0.28)]
+ +

Why normalize per-head? Different engagement types have different prediction ranges. Normalization ensures fair aggregation.
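The per-head maximum step can be sketched as follows, using the three candidates from the example. This only reproduces the (value, max) pairing shown above. How the maximum is consumed downstream is not shown here, and the function names are illustrative rather than taken from `PhoenixModelRerankingScorer`.

```javascript
// Predictions for three candidates across three heads (from the example above).
const candidates = {
  A: { FAV: 0.42, REPLY: 0.08, CLICK: 0.28 },
  B: { FAV: 0.15, REPLY: 0.35, CLICK: 0.20 },
  C: { FAV: 0.30, REPLY: 0.12, CLICK: 0.25 },
};

// For each head, find the largest prediction across all candidates.
function perHeadMax(cands) {
  const maxes = {};
  for (const preds of Object.values(cands)) {
    for (const [head, p] of Object.entries(preds)) {
      maxes[head] = Math.max(maxes[head] ?? -Infinity, p);
    }
  }
  return maxes;
}

// Attach the head-wise max to every prediction as a { value, max } pair.
function attachMax(cands) {
  const maxes = perHeadMax(cands);
  const out = {};
  for (const [id, preds] of Object.entries(cands)) {
    out[id] = Object.fromEntries(
      Object.entries(preds).map(([head, p]) => [head, { value: p, max: maxes[head] }])
    );
  }
  return out;
}

console.log(perHeadMax(candidates));
console.log(attachMax(candidates).B);
```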

+ +

Step 2: Weighted Aggregation

+

Phoenix uses the same weights as NaviModelScorer for fair A/B testing comparison:

+ +

Weight parameters: HomeGlobalParams.scala:786-1028
+ Actual values: the-algorithm-ml/projects/home/recap

+ +
Weights (configured in production):
+  FAV: 0.5
+  REPLY: 13.5
+  REPLY_ENGAGED_BY_AUTHOR: 75.0
+  RETWEET: 1.0
+  GOOD_CLICK: 12.0
+  ... (8 more positive weights)
+  NEGATIVE_FEEDBACK: -74.0
+  REPORT: -369.0
+
+Final Score = Σ (prediction_i × weight_i)
+
+Example for Candidate A:
+  FAV:    0.42 × 0.5   = 0.21
+  REPLY:  0.08 × 13.5  = 1.08
+  CLICK:  0.28 × 12.0  = 3.36
+  ... (sum all 13 engagement types)
+
+  Phoenix Score = 0.21 + 1.08 + 3.36 + ... = 8.42
+ +
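To see how the weights reorder candidates, the sketch below applies the weighted sum to the three example candidates, using only the three heads shown (FAV, REPLY, CLICK). It is a partial sum for illustration: the real score covers all 13 heads, and the names `weightedScore` and `ranking` are ours.

```javascript
// Weights for the three heads used in the example above.
const weights = { FAV: 0.5, REPLY: 13.5, CLICK: 12.0 };

const predictions = {
  A: { FAV: 0.42, REPLY: 0.08, CLICK: 0.28 },
  B: { FAV: 0.15, REPLY: 0.35, CLICK: 0.20 },
  C: { FAV: 0.30, REPLY: 0.12, CLICK: 0.25 },
};

// Sum of prediction * weight over the heads present in `preds`.
function weightedScore(preds, w) {
  return Object.entries(preds).reduce(
    (sum, [head, p]) => sum + p * (w[head] ?? 0),
    0
  );
}

// Score every candidate and sort best-first.
const ranking = Object.entries(predictions)
  .map(([id, preds]) => ({ id, score: weightedScore(preds, weights) }))
  .sort((a, b) => b.score - a.score);

console.log(ranking.map((r) => `${r.id}: ${r.score.toFixed(2)}`));
```

Candidate B has the lowest favorite probability but the highest reply probability, and the 13.5 reply weight is enough to put it first on this partial sum.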

The 13 Engagement Types

+ +

Code: PhoenixPredictedScoreFeature.scala:30-193

+ +

Phoenix predicts probabilities for 13 different action types:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Engagement Type | Action | Weight
FAV | Like/favorite tweet | 0.5
REPLY | Reply to tweet | 13.5
REPLY_ENGAGED_BY_AUTHOR | Reply + author engages back | 75.0
RETWEET | Retweet or quote | 1.0
GOOD_CLICK | Click + dwell (quality engagement) | 12.0
PROFILE_CLICK | Click author profile | 3.0
VIDEO_QUALITY_VIEW | Watch video ≥10 seconds | 8.0
... (6 more) | Share, bookmark, open link, etc. | 0.2 - 11.0
NEGATIVE_FEEDBACK | Not interested, block, mute | -74.0
REPORT | Report tweet | -369.0
+ +

Context Window: 522 Aggregated Actions (Hours to Days)

+ +

Action sequence hydration: UserActionsQueryFeatureHydrator.scala:56-149
+ Max count parameter: HomeGlobalParams.scala:1373-1379

+ +

CRITICAL: The "5-minute window" is for aggregation (grouping actions within proximity), not filtering (time limit on history).

+ +
Configuration:
+// Aggregation window (for grouping, NOT filtering)
+private val windowTimeMs = 5 * 60 * 1000  // Groups actions within 5-min proximity
+private val maxLength = 1024                // Max AFTER aggregation
+
+// Actual default used
+object UserActionsMaxCount extends FSBoundedParam[Int](
+  name = "home_mixer_user_actions_max_count",
+  default = 522,    // ← Actual default
+  min = 0,
+  max = 10000       // ← Configurable up to 10K
+)
+
+Processing flow:
+1. Fetch user's full action history from storage (days/weeks)
+2. Decompress → 2000+ raw actions
+3. Aggregate using 5-min proximity window (session detection)
+   → Actions within 5-min windows grouped together
+4. Cap at 522 actions (default)
+
+Result: 522 aggregated actions spanning HOURS TO DAYS, not 5 minutes!
+ +

What "5-minute aggregation window" means:

+
    +
  • NOT: "Only use last 5 minutes of actions"
  • +
  • YES: "Group actions that occur within 5-minute proximity"
  • +
  • Purpose: Session detection and noise reduction
  • +
+ +
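The distinction between grouping and filtering can be sketched as below. This is one plausible reading of "5-minute proximity", not the logic in `UserActionsQueryFeatureHydrator`. It assumes an action joins the current group when it falls within five minutes of that group's first action, and that the cap keeps the most recent groups. The function name and action shape are hypothetical.

```javascript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute aggregation window

// actions: array of { type, timestampMs }, sorted oldest first.
// Assumption: an action joins the current group when it falls within
// WINDOW_MS of that group's FIRST action; otherwise it starts a new group.
function aggregateActions(actions, maxCount = 522) {
  const groups = [];
  for (const action of actions) {
    const current = groups[groups.length - 1];
    if (current && action.timestampMs - current[0].timestampMs <= WINDOW_MS) {
      current.push(action);
    } else {
      groups.push([action]);
    }
  }
  // Assumption: the cap keeps the most recent groups and drops the oldest.
  return maxCount > 0 ? groups.slice(-maxCount) : [];
}

const minutes = (m) => m * 60 * 1000;
const sampleActions = [0, 1, 4, 10, 11, 30].map((m) => ({
  type: "CLICK",
  timestampMs: minutes(m),
}));

console.log(aggregateActions(sampleActions).map((g) => g.length));
```

No action is discarded for being old. The window only decides which actions are merged, and the history is shortened only by the cap.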

Actual temporal span (522 actions):

+
Active user (~100 actions/hour):  ~5 hours of behavioral history
+Normal user (~30 actions/hour):   ~17 hours of behavioral history
+Light user (~10 actions/hour):    ~52 hours (2+ days) of behavioral history
+
+Maximum (10,000 actions): Could span WEEKS for light users
+ +
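The spans above are the action cap divided by an activity rate. A minimal sketch, treating each aggregated action as one action for simplicity (the per-hour rates are the illustrative figures from the text, and the function name is ours):

```javascript
// Hours of history covered by `maxActions` at a given activity rate.
function historySpanHours(maxActions, actionsPerHour) {
  return maxActions / actionsPerHour;
}

const DEFAULT_MAX_ACTIONS = 522;
const exampleRates = { active: 100, normal: 30, light: 10 }; // actions per hour

for (const [label, rate] of Object.entries(exampleRates)) {
  const hours = historySpanHours(DEFAULT_MAX_ACTIONS, rate);
  console.log(`${label}: ~${hours.toFixed(1)} hours (~${(hours / 24).toFixed(1)} days)`);
}
```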

Comparison to LLM context windows:

+
GPT-3:   2048 tokens (~1500 words, ~3-4 pages of text)
+GPT-4:   8K-32K tokens (~6K-24K words)
+Phoenix: 522 aggregated actions (~hours to days of behavior)
+ +

Phoenix vs Navi: Architecture Comparison

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Aspect | NaviModelScorer (Current) | Phoenix (Future)
Input Data | Many aggregated features | Action sequence (522 aggregated actions, spanning hours to days)
Temporal Modeling | ❌ Lost in aggregation | ✅ Explicit via self-attention
Behavioral Context | ⚠️ Via real-time aggregates | ✅ Recent actions directly inform predictions
Session Awareness | ❌ Same prediction all day | ✅ Adapts to current browsing mode
Feature Engineering | ❌ Many hand-crafted features | ✅ Minimal (actions + metadata)
Manual Tuning | ❌ 15+ engagement weights, penalties | ✅ Learned patterns (eventually)
Computational Cost | ✅ O(n) feature lookup | ⚠️ O(n²) transformer attention
Update Frequency | Daily batch recalculation | Real-time, every action
+ +

Current Status (Verified from Code)

+ +

Phoenix Infrastructure: All components verified in twitter/the-algorithm repository

+ +

Infrastructure Status (September 2025 commit):

+ + +

What This Means:

+
    +
  • Infrastructure is production-ready but not yet default
  • +
  • Feature flags allow enabling Phoenix without code deployment
  • +
  • The system is designed for A/B testing (multiple clusters configured)
  • +
  • Whether it's currently active in production is unknown from open-source code
  • +
+ +

Multi-Model Experimentation Infrastructure

+ +

Cluster configuration: HomeGlobalParams.scala:1441-1451
+ Connection management: PhoenixClientModule.scala:21-61
+ Cluster selection: PhoenixScorer.scala:52-53

+ +

Phoenix isn't a single model—it's 9 separate transformer deployments designed for parallel experimentation:

+ +
PhoenixCluster enumeration:
+- Prod          // Production model
+- Experiment1   // Test variant 1
+- Experiment2   // Test variant 2
+- Experiment3   // Test variant 3
+- Experiment4   // Test variant 4
+- Experiment5   // Test variant 5
+- Experiment6   // Test variant 6
+- Experiment7   // Test variant 7
+- Experiment8   // Test variant 8
+ +

What This Enables

+ +
+
+
1. Parallel Model Testing
+

Twitter can test 8 different Phoenix variants simultaneously:

+
    +
  • Different architectures: 6-layer vs 12-layer transformers, varying attention heads
  • +
  • Different context windows: 256 vs 522 vs 1024 vs 2048 actions
  • +
  • Different training data: Models trained on different time periods or user segments
  • +
  • Feature integration tests: Actions only vs. actions + embeddings vs. actions + temporal features
  • +
+
+ +
+
2. Per-Request Cluster Selection
+

Each user's request can be routed to a different cluster:

+
// From PhoenixScorer.scala:52-53
+val phoenixCluster = query.params(PhoenixInferenceClusterParam)  // Select cluster
+val channels = channelsMap(phoenixCluster)                        // Route request
+
+// Default: PhoenixCluster.Prod
+// But can be dynamically set per user via feature flags
+

A/B testing flow:

+
User Alice (bucket: control)      → PhoenixCluster.Prod
+User Bob   (bucket: experiment_1) → PhoenixCluster.Experiment1
+User Carol (bucket: experiment_2) → PhoenixCluster.Experiment2
+
+ +
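The routing in the A/B testing flow can be sketched as a lookup with a default. This is a hypothetical JavaScript analogue, not the Scala in `PhoenixScorer.scala`. The bucket names come from the example above, and the `bucketToCluster` mapping and `selectCluster` function are assumptions for illustration.

```javascript
// Mirror of the cluster names listed above (subset shown for brevity).
const PhoenixCluster = Object.freeze({
  Prod: "Prod",
  Experiment1: "Experiment1",
  Experiment2: "Experiment2",
});

// Hypothetical mapping from experiment bucket to inference cluster.
const bucketToCluster = new Map([
  ["control", PhoenixCluster.Prod],
  ["experiment_1", PhoenixCluster.Experiment1],
  ["experiment_2", PhoenixCluster.Experiment2],
]);

// Unknown buckets fall back to the default cluster (Prod).
function selectCluster(bucket) {
  return bucketToCluster.get(bucket) ?? PhoenixCluster.Prod;
}

console.log(selectCluster("control"));
console.log(selectCluster("experiment_1"));
console.log(selectCluster("not_enrolled"));
```

Falling back to Prod for unknown buckets mirrors the documented default: users not enrolled in any experiment are still served by the production model.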
+
3. Progressive Rollout Strategy
+

Safe, gradual deployment with instant rollback:

+
Week 1: Deploy new model to Experiment1
+        Route 1% of users to Experiment1
+        Other 99% stay on Prod
+        ↓
+Week 2: Compare metrics (engagement, dwell time, follows, etc.)
+        If Experiment1 > Prod: increase to 5%
+        If Experiment1 < Prod: rollback instantly
+        ↓
+Week 3: Gradually increase: 10% → 25% → 50%
+        Monitor metrics at each step
+        ↓
+Week 4: If consistently better, promote Experiment1 → Prod
+        Start testing next variant in Experiment2
+

Key advantage: Zero-downtime experimentation. New models can be tested without code deployment or service restart—just change the PhoenixInferenceClusterParam value via feature flag dashboard.

+
+ +
+
4. Parallel Evaluation (All Clusters Queried)
+

Multi-cluster logging: ScoredPhoenixCandidatesKafkaSideEffect.scala:85-104

+

For offline analysis, Twitter can query all 9 clusters simultaneously for the same candidates:

+
// getPredictionResponsesAllClusters queries ALL clusters in parallel
+User request → Candidates [tweet_A, tweet_B, tweet_C]
+             ↓
+Query Prod:         tweet_A: {FAV: 0.40, REPLY: 0.10, CLICK: 0.60}
+Query Experiment1:  tweet_A: {FAV: 0.45, REPLY: 0.12, CLICK: 0.58}
+Query Experiment2:  tweet_A: {FAV: 0.38, REPLY: 0.15, CLICK: 0.62}
+Query Experiment3:  tweet_A: {FAV: 0.42, REPLY: 0.11, CLICK: 0.65}
+... (all 9 clusters)
+             ↓
+Log to Kafka: "phoenix.Prod.favorite", "phoenix.Experiment1.favorite", ...
+             ↓
+Offline analysis: Compare predicted vs actual engagement across all models
+

Purpose: Create comprehensive comparison dataset without affecting user experience. Only the selected cluster's predictions are used for ranking, but all predictions are logged for evaluation.
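A rough sketch of the logging step described above: every cluster's predictions are flattened into records keyed `phoenix.<cluster>.<action>`, while ranking reads only the selected cluster. The prediction values and the `phoenix.Prod.favorite` key shape come from the example; `buildLogRecords`, `predictionsForRanking`, and the `reply` and `click` key suffixes are assumptions, and no Kafka client is involved here.

```javascript
// Assumed mapping from prediction head to log-key suffix.
// Only "favorite" appears in the example above; "reply" and "click" are guesses.
const ACTION_NAMES = { FAV: "favorite", REPLY: "reply", CLICK: "click" };

// Predictions for tweet_A from three of the clusters in the example above.
const predictionsByCluster = {
  Prod: { FAV: 0.40, REPLY: 0.10, CLICK: 0.60 },
  Experiment1: { FAV: 0.45, REPLY: 0.12, CLICK: 0.58 },
  Experiment2: { FAV: 0.38, REPLY: 0.15, CLICK: 0.62 },
};

// Flatten every cluster's predictions into log records for offline comparison.
function buildLogRecords(tweetId, byCluster) {
  const records = [];
  for (const [cluster, predictions] of Object.entries(byCluster)) {
    for (const [head, probability] of Object.entries(predictions)) {
      records.push({ key: `phoenix.${cluster}.${ACTION_NAMES[head]}`, tweetId, probability });
    }
  }
  return records;
}

// Ranking only ever sees the predictions of the cluster selected for this user.
function predictionsForRanking(byCluster, selectedCluster) {
  return byCluster[selectedCluster];
}

const records = buildLogRecords("tweet_A", predictionsByCluster);
console.log(records.map((r) => `${r.key}=${r.probability}`).join("\n"));
console.log(predictionsForRanking(predictionsByCluster, "Prod"));
```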

+
+
+ +

Hybrid Mode: Mixing Navi and Phoenix Predictions

+ +

Hybrid configuration: HomeGlobalParams.scala:1030-1108

+ +

Twitter can use Navi predictions for some action types and Phoenix predictions for others:

+ +
Hybrid Mode Configuration (per action type):
+- EnableProdFavForPhoenixParam         = true   // Use Navi for favorites
+- EnableProdReplyForPhoenixParam       = true   // Use Navi for replies
+- EnableProdGoodClickV2ForPhoenixParam = false  // Use Phoenix for clicks
+- EnableProdVQVForPhoenixParam         = false  // Use Phoenix for video views
+- EnableProdNegForPhoenixParam         = true   // Use Navi for negative feedback
+... (13 total flags, one per engagement type)
+ +

Incremental migration strategy:

+
Phase 1: Enable Phoenix, but use Navi for all predictions
+         (Shadow mode - Phoenix predictions logged but not used)
+         ↓
+Phase 2: Use Phoenix for low-risk actions (photo expand, video view)
+         Keep Navi for high-impact actions (favorite, reply, retweet)
+         ↓
+Phase 3: Gradually enable Phoenix for more action types
+         Monitor metrics after each change
+         ↓
+Phase 4: Full Phoenix mode - all predictions from transformer
+         Navi retired or kept as fallback
+ +

Why this matters: Reduces risk by preserving proven Navi predictions while testing Phoenix predictions incrementally. If Phoenix predictions for clicks are great but favorites are worse, Twitter can use Phoenix for clicks only.
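The per-action switch can be sketched as follows. The flag values mirror the configuration listing above (true means "use Navi"); the short action keys, the prediction values, and `mixPredictions` are hypothetical and only illustrate the selection logic.

```javascript
// true = use the Navi prediction, false = use the Phoenix prediction,
// following the flag values in the configuration listing above.
const useNaviFor = { fav: true, reply: true, goodClickV2: false, vqv: false, neg: true };

// Illustrative prediction values; not taken from any real model output.
const naviPredictions = { fav: 0.40, reply: 0.10, goodClickV2: 0.60, vqv: 0.20, neg: 0.02 };
const phoenixPredictions = { fav: 0.45, reply: 0.12, goodClickV2: 0.58, vqv: 0.25, neg: 0.03 };

// Pick, per action type, which model's prediction feeds the ranker.
function mixPredictions(navi, phoenix, flags) {
  const mixed = {};
  for (const action of Object.keys(flags)) {
    mixed[action] = flags[action] ? navi[action] : phoenix[action];
  }
  return mixed;
}

const mixed = mixPredictions(naviPredictions, phoenixPredictions, useNaviFor);
console.log(mixed);
```

Flipping a single flag moves one action type between models, which is what makes the phased migration above possible without touching the others.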

+ +

What This Reveals

+ +
+

This isn't experimental infrastructure—it's production A/B testing at scale.

+

The sophistication of the cluster system suggests:

+
    +
  • Active deployment: Phoenix is likely running on real users right now
  • +
  • Ongoing iteration: Multiple transformer variants being tested simultaneously
  • +
  • Serious commitment: This level of infrastructure investment indicates Phoenix is a strategic priority
  • +
  • Modern ML engineering: Safe, data-driven model deployment with comprehensive monitoring
  • +
+

Phoenix's feature gate (default = false) doesn't mean "not deployed"—it means "controlled rollout." Twitter can activate Phoenix for specific user cohorts, test different model variants, and compare results, all without changing code.

+
+ +

Technical Details

+ +

Connection pooling: Each cluster maintains 10 gRPC channels for load balancing and fault tolerance (90 total connections across 9 clusters).

+ +

Request routing: Randomly selects one of 10 channels per request for even load distribution (PhoenixUtils.scala:107-117).

+ +

Retry policy: 2 attempts with different channels, 500ms default timeout (configurable to 10s max).

+ +

Graceful degradation: If a cluster fails to respond, the system continues with other clusters (for logging) or falls back to Navi (for production scoring).
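The channel selection and retry behaviour described above might look roughly like this. The constants (10 channels, 2 attempts) come from the text; `callWithRetry`, the `send` callback, and the choice of the next channel index after a failure are assumptions. Timeouts are not modelled, and the sketch is synchronous for brevity.

```javascript
const CHANNELS_PER_CLUSTER = 10; // from the text: 10 gRPC channels per cluster
const MAX_ATTEMPTS = 2;          // from the text: 2 attempts with different channels

// Try up to MAX_ATTEMPTS channels, never reusing one that already failed.
// `send(channelIndex)` is a hypothetical callback that returns a response or throws.
// `random` is injectable so the behaviour can be checked deterministically.
function callWithRetry(send, random = Math.random) {
  const tried = new Set();
  let lastError = null;
  while (tried.size < MAX_ATTEMPTS) {
    let channel = Math.floor(random() * CHANNELS_PER_CLUSTER);
    // If the random pick was already tried, move to the next index (an assumption).
    while (tried.has(channel)) {
      channel = (channel + 1) % CHANNELS_PER_CLUSTER;
    }
    tried.add(channel);
    try {
      return { channel, response: send(channel) };
    } catch (err) {
      lastError = err;
    }
  }
  // Both attempts failed: the caller decides how to degrade (e.g. fall back to Navi).
  return { channel: null, response: null, error: lastError };
}

const flaky = (channel) => {
  if (channel === 3) throw new Error("channel 3 is down");
  return `ok:${channel}`;
};
console.log(callWithRetry(flaky, () => 0.35));
```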

+ +

Code References

+ +
+

Phoenix Scorer (Stage 1 - Prediction): PhoenixScorer.scala:30-85

+

Phoenix Reranking Scorer (Stage 2 - Aggregation): PhoenixModelRerankingScorer.scala:23-81

+

User Action Sequence Hydrator: UserActionsQueryFeatureHydrator.scala:56-149

+

13 Engagement Predictions (Action Types): PhoenixPredictedScoreFeature.scala:30-193

+

gRPC Transformer Service Integration: PhoenixUtils.scala:26-159

+

Per-Head Max Normalization: RerankerUtil.scala:38-71

+

Weighted Aggregation Logic: RerankerUtil.scala:91-137

+

Model Weight Parameters: HomeGlobalParams.scala:786-1028

+

Actual Weight Values (ML Repo): the-algorithm-ml/projects/home/recap

+

Action Sequence Max Count: HomeGlobalParams.scala:1373-1379

+
+ +
+ +

The Bottom Line

+ +

What we know: Phoenix infrastructure is complete, feature-gated, and production-ready. The architecture represents a fundamental shift from feature-based to sequence-based prediction. More importantly, Phoenix has sophisticated A/B testing infrastructure that strongly suggests active deployment on real users.

+ +

Evidence of Active Deployment

+ +
+

The 9-cluster system isn't just placeholder infrastructure—it's production experimentation at scale:

+
    +
  • Multi-cluster A/B testing: 8 experimental variants can be tested simultaneously against production
  • +
  • Parallel evaluation: All 9 clusters can be queried for the same candidates, with every prediction logged to Kafka for comparison
  • +
  • Progressive rollout: Per-user cluster selection enables gradual traffic shifting (1% → 5% → 100%)
  • +
  • Hybrid mode: Can mix Navi and Phoenix predictions per action type (incremental migration strategy)
  • +
  • Connection pooling: 90 maintained gRPC connections (9 clusters × 10 channels) indicate active use
  • +
  • Instant rollback: Feature flags allow switching clusters without code deployment
  • +
+

Conclusion: This level of infrastructure sophistication indicates Phoenix is likely being tested on production traffic right now, not merely prepared for future deployment.

+
+ +

What This Means

+ +

Paradigm shift in progress:

+ +
    +
  • From static features to behavioral sequences: Your last 522 aggregated actions (hours to days of behavior) replacing lifetime averages
  • +
  • From "what you usually like" to "what you're doing now": Session-aware, context-sensitive predictions that adapt as you browse
  • +
  • From manual tuning to learned patterns: Transformer learns what matters from behavioral data, replacing 15+ hand-tuned engagement weights and penalties
  • +
  • From daily batch updates to continuous adaptation: Algorithm learns your behavioral patterns over hours/days, not months
  • +
+ +

If/When Phoenix Becomes Default

+ +

The algorithm would understand you not as a static profile of historical preferences, but as a dynamic behavioral sequence revealing your current intent. Your feed would adapt as you browse, following threads of interest that emerge in your behavior—not mechanically serving averaged content from long-term statistics.

+ +

This mirrors how humans actually consume content: following curiosity as it arises, deep-diving into topics that capture attention, switching contexts when interest shifts. The result would be an algorithm that learns to follow your behavioral lead rather than forcing you into a predetermined statistical box.

+ +

Current Reality

+ +
+

Verified from code:

+
    +
  • ✅ Phoenix infrastructure is production-ready and deployed
  • +
  • ✅ Multi-cluster A/B testing system is fully operational
  • +
  • ✅ Feature gates allow instant activation without code deployment
  • +
  • ⚠️ Default setting is false, but can be enabled per-user
  • +
+

What we don't know from open-source code:

+
    +
  • ❓ What percentage of users currently experience Phoenix predictions
  • +
  • ❓ Which experimental clusters are active and how they differ
  • +
  • ❓ How Phoenix performance compares to Navi in production metrics
  • +
  • ❓ Timeline for full rollout (if planned)
  • +
+

Most likely scenario: Phoenix is in active A/B testing with controlled user cohorts. Twitter is iterating on multiple model variants (via Experiment1-8 clusters), comparing results, and gradually expanding deployment as metrics improve. The infrastructure is too sophisticated to be merely preparatory.

+
+ +
+ +
+ +

Questions or corrections? Open an issue on GitHub.

+

This analysis is based on code found in a September 2025 commit to Twitter's open-source algorithm repository. Phoenix infrastructure is feature-gated and may not be active in production.

+
+ + + + diff --git a/docs/interactive/pipeline-explorer.html b/docs/interactive/pipeline-explorer.html new file mode 100644 index 000000000..ced482b4d --- /dev/null +++ b/docs/interactive/pipeline-explorer.html @@ -0,0 +1,281 @@ + + + + + + The Full Pipeline Explorer - How Tweets Get Ranked + + + + + +
+

The Full Pipeline Explorer

+

Follow a tweet's complete journey from posting to your timeline through all 5 algorithmic stages. See exactly how scoring, filters, and penalties determine what you see.

+ + + + + + +

What Is The Recommendation Pipeline?

+ +

Every day, Twitter processes approximately 1 billion tweets through a 5-stage pipeline. By the time one reaches your feed, it has passed through candidate generation, feature extraction, machine learning scoring, multiple filters and penalties, and final mixing. Of the ~1,400 candidates fetched for a timeline request, only ~4% survive to appear in the feed.

+ +

The Five Stages

+ +
Stage 1: Candidate Generation (~1B → ~1,400)
+  Fetch potential tweets from various sources
+
+Stage 2: Feature Hydration (~1,400 tweets)
+  Attach ~6,000 features to each tweet
+
+Stage 3: Heavy Ranker ML Scoring (~1,400 tweets)
+  Predict 15 engagement types, calculate weighted scores
+
+Stage 4: Filters & Penalties (~1,400 → ~100-200)
+  Apply multipliers, diversity penalties, safety filters
+
+Stage 5: Mixing & Serving (~100-200 → 50-100)
+  Insert ads, modules, deliver final timeline
+ +
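The funnel arithmetic behind these stage counts can be checked with a short sketch. The counts are the ones listed above; `STAGES` and `survivalRange` are names introduced here for illustration, and the percentages are relative to the ~1,400 candidates fetched for one request, not to the ~1B tweets posted.

```javascript
// Candidate counts leaving each stage, as [low, high], from the stage list above.
const STAGES = [
  { name: "Candidate Generation", out: [1400, 1400] },
  { name: "Filters & Penalties", out: [100, 200] },
  { name: "Mixing & Serving", out: [50, 100] },
];

// Percentage of the fetched candidates still present after a stage,
// returned as [low, high] rounded to one decimal place.
function survivalRange(stage, candidates = 1400) {
  return stage.out.map((n) => Math.round((n / candidates) * 1000) / 10);
}

for (const stage of STAGES) {
  const [low, high] = survivalRange(stage);
  console.log(`${stage.name}: ${low}% to ${high}% of candidates remain`);
}
```

With these counts, between 3.6% and 7.1% of fetched candidates reach the served timeline, so a figure of roughly 4% corresponds to the low end of that range.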

Why This Matters

+ +
    +
  • Extreme filtering: ~96% of the candidates fetched for a request never reach that user's feed
  • +
  • Multi-stage compounding: Advantages and penalties multiply across stages
  • +
  • Invisible decisions: Most filtering happens before you ever see rankings
  • +
  • Same tweet, different treatment: Identical content gets different scores for different users
  • +
+ +
+ + + + + + +

Follow a Tweet Through The Pipeline

+ +

Configure the Tweet

+

+ Choose a tweet scenario to follow through the pipeline. Each scenario has realistic engagement probabilities and characteristics. +

+ +
+
+
+

🔥 Viral Educational Thread

+ In-Network +
+

High-quality thread from someone you follow in your main interest cluster

+
+ 📊 Reply: 8% + ❤️ Like: 25% + 🔁 RT: 3% +
+
+ +
+
+

🌐 Out-of-Network Quality

+ Out-of-Network +
+

Great content from someone you don't follow, different cluster

+
+ 📊 Reply: 5% + ❤️ Like: 18% + 🔁 RT: 2% +
+
+ +
+
+

⚡ Controversial Take

+ In-Network +
+

Hot take that drives replies, from followed author

+
+ 📊 Reply: 12% + ❤️ Like: 8% + 👎 Negative: 4% +
+
+ +
+
+

📝 3rd Tweet from Same Author

+ In-Network +
+

Good content but author already has 2 tweets in your feed

+
+ 📊 Reply: 6% + ❤️ Like: 20% + ⚠️ 3rd tweet penalty +
+
+
+ + + +
+ + + + + + +

The Technical Details

+ +

Stage 1: Candidate Generation

+

Fetch ~1,400 candidate tweets from various sources based on your profile:

+
    +
  • In-Network: ~50% from people you follow (via Earlybird search)
  • +
  • SimClusters: ~20% from your interest clusters
  • +
  • Real Graph: ~15% from social connections
  • +
  • UTEG: ~15% from engagement graph
  • +
+ +
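As a quick check of what those shares mean in absolute terms, the sketch below splits a ~1,400-candidate budget across the four sources. The shares are the approximate ones listed above; `SOURCE_SHARES` and `candidateBudget` are illustrative names, and real per-request counts will vary.

```javascript
// Approximate share of candidates per source, from the list above.
const SOURCE_SHARES = { inNetwork: 0.5, simClusters: 0.2, realGraph: 0.15, uteg: 0.15 };

// Split a total candidate budget across sources, rounding to whole tweets.
function candidateBudget(total) {
  const budget = {};
  for (const [source, share] of Object.entries(SOURCE_SHARES)) {
    budget[source] = Math.round(total * share);
  }
  return budget;
}

console.log(candidateBudget(1400));
```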

Stage 2: Feature Hydration

+

Attach ~6,000 features to each tweet:

+
    +
  • Author features (follower count, verified status, reputation score)
  • +
  • Tweet features (media type, length, recency, topic)
  • +
  • Engagement features (predicted probabilities for 15 engagement types)
  • +
  • User-tweet features (cluster similarity, real graph connection)
  • +
+ +

Stage 3: Heavy Ranker (ML Scoring)

+

The MaskNet model predicts 15 engagement probabilities and combines them into a weighted score:

+
score = Σ (probability_i × weight_i)
+
+Top weights:
+- Reply with Author Engagement: 75.0
+- Reply: 13.5
+- Good Profile Click: 12.0
+- Retweet: 1.0
+- Favorite: 0.5
+
+Negative weights:
+- Negative Feedback: -74.0
+- Report: -369.0
+ +
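The weighted sum above can be worked through for two of the scenarios from the explorer. The weights are the ones listed; the probabilities are the scenario values shown on the cards, treating "Like" as the Favorite action (an assumption). `WEIGHTS` and `weightedScore` are names introduced for this sketch, and only the listed weights are included, not all 15.

```javascript
// Weights from the list above (a subset of the 15 engagement types).
const WEIGHTS = {
  replyWithAuthorEngagement: 75.0,
  reply: 13.5,
  goodProfileClick: 12.0,
  retweet: 1.0,
  favorite: 0.5,
  negativeFeedback: -74.0,
  report: -369.0,
};

// score = sum over actions of probability_i * weight_i
function weightedScore(probabilities) {
  let score = 0;
  for (const [action, probability] of Object.entries(probabilities)) {
    if (!(action in WEIGHTS)) {
      throw new Error(`no weight for action: ${action}`);
    }
    score += probability * WEIGHTS[action];
  }
  return score;
}

// "Viral Educational Thread": Reply 8%, Like 25%, RT 3%
const viralThread = weightedScore({ reply: 0.08, favorite: 0.25, retweet: 0.03 });
// "Controversial Take": Reply 12%, Like 8%, Negative 4%
const controversialTake = weightedScore({ reply: 0.12, favorite: 0.08, negativeFeedback: 0.04 });
console.log(viralThread.toFixed(3), controversialTake.toFixed(3));
```

By hand: 0.08 × 13.5 + 0.25 × 0.5 + 0.03 × 1.0 = 1.235 for the thread, and 0.12 × 13.5 + 0.08 × 0.5 + 0.04 × (−74.0) = −1.30 for the controversial take. A 4% chance of negative feedback outweighs a 12% chance of a reply.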

Stage 4: Filters & Penalties

+

Multiple filters reshape the ranking:

+
    +
  • Out-of-Network Penalty: 0.75x multiplier (25% penalty)
  • +
  • Author Diversity: Exponential decay for multiple tweets from same author
  • +
  • Cluster Scoring: Multiply by your cluster interest (this creates filter bubbles!)
  • +
  • Feedback Fatigue: 80% penalty after "not interested" click
  • +
  • Previously Seen: Remove tweets you've already seen
  • +
  • Safety Filters: Remove NSFW, blocked users, muted keywords
  • +
+ +
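The multiplicative penalties in this list can be sketched as a chain of multipliers. The 0.75 out-of-network multiplier and the 80% feedback-fatigue penalty come from the list; the author-diversity decay factor of 0.5 is a hypothetical value (the list only says "exponential decay"), and `applyPenalties` is a name introduced here. The sketch assumes a positive score and leaves out cluster scoring and the hard filters.

```javascript
const OUT_OF_NETWORK_MULTIPLIER = 0.75;  // 25% penalty, from the list above
const FEEDBACK_FATIGUE_MULTIPLIER = 0.2; // "80% penalty" read as keeping 20% of the score
const AUTHOR_DECAY = 0.5;                // hypothetical decay factor per earlier tweet by the same author

// Apply the multiplicative penalties to a (positive) ranking score.
function applyPenalties(score, options = {}) {
  const { outOfNetwork = false, priorTweetsFromAuthor = 0, notInterested = false } = options;
  let adjusted = score;
  if (outOfNetwork) adjusted *= OUT_OF_NETWORK_MULTIPLIER;
  adjusted *= Math.pow(AUTHOR_DECAY, priorTweetsFromAuthor); // exponential decay
  if (notInterested) adjusted *= FEEDBACK_FATIGUE_MULTIPLIER;
  return adjusted;
}

console.log(applyPenalties(1.0, { outOfNetwork: true }));       // out-of-network tweet
console.log(applyPenalties(1.0, { priorTweetsFromAuthor: 2 })); // 3rd tweet from the same author
```

Because the penalties multiply, they compound: a tweet hit by several of them keeps only the product of the individual multipliers.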

Stage 5: Mixing & Serving

+

Insert ads, promoted tweets, follow recommendations, and serve final timeline.

+ +
+ +

Code References

+ + +
+ + + + + + diff --git a/docs/interactive/reinforcement-loop.html b/docs/interactive/reinforcement-loop.html new file mode 100644 index 000000000..402bdde9b --- /dev/null +++ b/docs/interactive/reinforcement-loop.html @@ -0,0 +1,283 @@ + + + + + + The Reinforcement Loop Machine - Why Drift Is Inevitable + + + + + +
+

The Reinforcement Loop Machine

+

Step through the algorithmic feedback loop that causes your interests to drift, even when you don't change your behavior.

+ + + + + + +

What Is The Reinforcement Loop?

+ +

The algorithm creates a self-reinforcing feedback loop: Your profile determines what you see. What you see determines what you engage with. What you engage with updates your profile. Round and round, amplifying imbalances week after week.

+ +

The Six Steps of the Loop

+ +
1. Your Profile: 60% AI, 40% Cooking
+
+2. Fetch Candidates: Algorithm fetches more AI tweets (60%) than Cooking (40%)
+
+3. Score Tweets: AI tweets score higher (0.9 × 0.60 vs 0.9 × 0.40)
+
+4. Build Feed: You see 60% AI, 40% Cooking (matches your profile)
+
+5. You Engage: You engage with what you see (60% AI, 40% Cooking)
+
+6. Update Profile: Next week, your profile becomes 62% AI, 38% Cooking
+
+→ Return to Step 1 with NEW profile (loop repeats)
+ +

Why This Matters

+ +
    +
  • Mathematical inevitability: Any imbalance (even 51/49) will drift over time
  • +
  • Zero-sum dynamics: L2 normalization means one interest growing forces others to shrink
  • +
  • Weekly compounding: Each cycle amplifies the previous imbalance
  • +
  • No equilibrium: The system has no rebalancing mechanism—it only concentrates
  • +
+ +

The Shape of the Drift

+ +

The drift follows a logistic curve: fast growth initially (60 → 70% in 12 weeks), then slowing as you approach saturation (~80% is a typical plateau). This isn't about your behavior changing; it's pure mathematics from multiplicative scoring + L2 normalization + weekly batch updates.

+ +
+ + + + + + +

Step Through The Loop

+ +

Configure Your Starting Profile

+ +
+

+ Set your initial cluster interests. Watch how even a small imbalance (60/40) compounds over time. +

+ +
+
+ + +
+ +
+ + +
+ +

+ Total: 100% +

+
+ + +
+ + + +
+ + + + + + +

The Technical Details

+ +

Why This Loop Creates Drift

+ +

1. Multiplicative Scoring Amplifies Advantages

+

At the scoring stage, tweets get multiplied by your cluster interest:

+
AI tweet (quality 0.9): 0.9 × 0.60 = 0.54
+Cooking tweet (quality 0.9): 0.9 × 0.40 = 0.36
+
+50% score advantage for AI despite equal quality!
+ +

2. You Engage With What You See

+

Because AI content ranks higher, you see more of it. Your engagement naturally matches what's visible:

+
Feed composition: 60% AI, 40% Cooking
+Your engagement: 60% AI, 40% Cooking
+
+You didn't change your preferences - you engaged with what was shown!
+ +

3. L2 Normalization Creates Zero-Sum Dynamics

+

After normalization, cluster interests are relative shares of a fixed total (shown here as percentages that sum to 100%). When AI increases, Cooking MUST decrease:

+
Before: 60% AI, 40% Cooking (sum = 100%)
+After:  62% AI, 38% Cooking (sum = 100%)
+
+AI gained 2%, Cooking lost 2% - it's zero-sum!
+ +
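The zero-sum effect can be demonstrated with a small normalization sketch. `toShares` produces the percentage view used in this section; `l2Normalize` divides by the Euclidean norm, which fixes the sum of squared weights at 1 rather than the plain sum. Both function names and the raw scores are illustrative assumptions, not taken from SimClustersEmbedding.scala.

```javascript
// Normalize raw scores into shares that sum to 1 (the percentage view used above).
function toShares(raw) {
  const total = Object.values(raw).reduce((sum, v) => sum + v, 0);
  return Object.fromEntries(Object.entries(raw).map(([k, v]) => [k, v / total]));
}

// L2 normalization: divide by the Euclidean norm, so the squared weights sum to 1.
function l2Normalize(raw) {
  const norm = Math.sqrt(Object.values(raw).reduce((sum, v) => sum + v * v, 0));
  return Object.fromEntries(Object.entries(raw).map(([k, v]) => [k, v / norm]));
}

// Raw AI engagement grows from 60 to 65; raw Cooking engagement stays at 40.
const before = { ai: 60, cooking: 40 };
const after = { ai: 65, cooking: 40 };

console.log(toShares(before), toShares(after));
console.log(l2Normalize(before), l2Normalize(after));
```

Under either normalization the Cooking weight falls even though raw Cooking engagement did not change, which is the zero-sum behaviour described above.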

4. Weekly Batch Updates Lock In Changes

+

InterestedIn updates weekly via batch jobs. Each week's drift becomes the new baseline:

+
Week 0:  60% AI, 40% Cooking
+Week 1:  62% AI, 38% Cooking  ← New baseline
+Week 4:  64% AI, 36% Cooking  ← Compounds
+Week 12: 70% AI, 30% Cooking  ← Accelerates
+Week 24: 76% AI, 24% Cooking  ← Lock-in
+ +

5. The Loop Feeds Itself

+

The output becomes the input:

+
    +
  • Week 0: 60/40 profile → see 60/40 feed → engage 60/40
  • +
  • Week 1: 62/38 profile → see 62/38 feed → engage 62/38
  • +
  • Week 4: 64/36 profile → see 64/36 feed → engage 64/36
  • +
  • Result: Each iteration amplifies the imbalance
  • +
+ +
+ +

Mathematical Inevitability

+ +

This isn't about user behavior. It's pure math:

+ +
+

The Drift Formula

+
New_AI_Interest = Old_AI_Interest + (drift_rate × advantage × slowdown)
+
+Where:
+- drift_rate = engagement intensity (0.008 to 0.025)
+- advantage = AI_interest / Cooking_interest (e.g., 0.60 / 0.40 = 1.5)
+- slowdown = 1 - (imbalance × 0.5), where imbalance = AI_interest - Cooking_interest (0.20 here); slows drift near the extremes
+
+Example (Week 0 → Week 1):
+New_AI = 0.60 + (0.015 × 1.5 × 0.9) = 0.60 + 0.02025 ≈ 0.62
+
+ +

Key insight: As long as there's ANY imbalance (not perfect 50/50), drift will occur. The stronger interest always wins.
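One step of the drift formula above can be reproduced as follows. The formula and the Week 0 → Week 1 numbers come from the box above; `driftStep` is a name introduced here, and defining imbalance as AI interest minus Cooking interest is an inference from the 0.9 slowdown in the worked example. The sketch covers a single update with two interests and assumes AI is the larger one; any caps the simulator applies are not modelled.

```javascript
// One weekly update of the AI interest share, following the drift formula above.
function driftStep(aiInterest, driftRate) {
  const cookingInterest = 1 - aiInterest;         // two interests that sum to 1
  const advantage = aiInterest / cookingInterest; // e.g. 0.60 / 0.40 = 1.5
  const imbalance = aiInterest - cookingInterest; // inferred definition: 0.20 at 60/40
  const slowdown = 1 - imbalance * 0.5;           // e.g. 1 - 0.10 = 0.9
  return aiInterest + driftRate * advantage * slowdown;
}

const weekOne = driftStep(0.60, 0.015);
console.log(weekOne.toFixed(5));
```

With a drift rate of 0.015 this gives 0.60 + 0.015 × 1.5 × 0.9 = 0.62025, matching the 62% shown for Week 1.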

+ +

The Only Ways to Prevent Drift

+ +
    +
  1. Start perfectly balanced (50/50) - But even 51/49 will drift over time
  2. Use additive scoring - Instead of multiplication, use addition (but this eliminates personalization)
  3. Disable cluster scoring - Don't multiply by interest (but then why have clusters?)
  4. Active counterbalancing - Manually over-engage with minority interest (exhausting)
+ +

In the current design, drift is inevitable for any user with unequal interests.

+ +
+ +

Code References

+
+

Multiplicative scoring: ApproximateCosineSimilarity.scala:84-94

+

Weekly batch updates: InterestedInFromKnownFor.scala:59 - val batchIncrement: Duration = Days(7)

+

L2 normalization: SimClustersEmbedding.scala:59-72

+

InterestedIn calculation: InterestedInFromKnownFor.scala:88-95 - Derived from who you follow and what you engage with

+
+ +
+ + + + + + diff --git a/docs/js/algorithmic-aristocracy.js b/docs/js/algorithmic-aristocracy.js new file mode 100644 index 000000000..8c9c3fd2e --- /dev/null +++ b/docs/js/algorithmic-aristocracy.js @@ -0,0 +1,281 @@ +// Algorithmic Aristocracy Calculator + +document.getElementById('calculate-btn').addEventListener('click', calculateTier); + +// Also trigger on Enter key +['followers', 'following', 'avg-engagement'].forEach(id => { + document.getElementById(id).addEventListener('keypress', (e) => { + if (e.key === 'Enter') calculateTier(); + }); +}); + +function calculateTier() { + // Get input values + const followers = parseInt(document.getElementById('followers').value) || 0; + const following = parseInt(document.getElementById('following').value) || 0; + const avgEngagement = parseInt(document.getElementById('avg-engagement').value) || 0; + const verified = document.getElementById('verified').checked; + + if (followers === 0) { + alert('Please enter follower count'); + return; + } + + // Calculate mechanisms + const mechanisms = calculateMechanisms(followers, following, avgEngagement, verified); + + // Determine tier + const tier = determineTier(followers, verified); + + // Calculate effective reach + const effectiveReach = calculateEffectiveReach(followers, mechanisms); + + // Display results + displayResults(followers, following, avgEngagement, verified, mechanisms, tier, effectiveReach); +} + +function calculateMechanisms(followers, following, avgEngagement, verified) { + const mechanisms = { + verification: { + applies: verified, + multiplier: verified ? 100 : 1, + description: verified ? '100x TweepCred multiplier' : 'No multiplier (unverified)' + }, + twhin: { + applies: avgEngagement >= 16, + threshold: 16, + description: avgEngagement >= 16 + ? 
`Threshold crossed (${avgEngagement} ≥ 16)` + : `Threshold NOT crossed (${avgEngagement} < 16)` + }, + followRatio: calculateFollowRatio(followers, following), + outOfNetwork: { + penalty: 0.75, + description: 'All out-of-network tweets: 0.75x multiplier' + } + }; + + return mechanisms; +} + +function calculateFollowRatio(followers, following) { + const ratio = (1 + following) / (1 + followers); + + // Check if penalty applies (following > 500 AND ratio > 0.6) + const penaltyApplies = following > 500 && ratio > 0.6; + + let penaltyMultiplier = 1; + if (penaltyApplies) { + // Formula: mass / exp(5.0 * (ratio - 0.6)) + penaltyMultiplier = Math.exp(5.0 * (ratio - 0.6)); + } + + return { + ratio: ratio, + applies: penaltyApplies, + multiplier: penaltyMultiplier, + description: penaltyApplies + ? `Ratio ${ratio.toFixed(2)} → ${formatNumber(penaltyMultiplier)}x penalty` + : `Ratio ${ratio.toFixed(2)} → no penalty (${ratio > 0.6 ? 'following ≤500' : 'ratio ≤0.6'})` + }; +} + +function determineTier(followers, verified) { + if (followers < 1000) return 1; + if (followers < 10000) return 2; + if (followers < 100000) return 3; + if (followers < 1000000) return 4; + return 5; +} + +function calculateEffectiveReach(followers, mechanisms) { + // Simplified reach calculation + // In-network base (no penalty) + let inNetworkReach = followers; + + // Out-of-network potential (simplified model) + // Small accounts: limited OON reach + // Large accounts: substantial OON reach + let oonPotential = followers * 0.2; // Simplified: 20% of followers as OON base + + // Apply verification multiplier to OON potential + if (mechanisms.verification.applies) { + oonPotential *= mechanisms.verification.multiplier / 10; // Scaled down for realism + } + + // Apply follow ratio penalty to OON + if (mechanisms.followRatio.applies) { + oonPotential /= mechanisms.followRatio.multiplier; + } + + // Apply out-of-network penalty + oonPotential *= mechanisms.outOfNetwork.penalty; + + // Total reach + 
const totalReach = inNetworkReach + oonPotential; + + return { + inNetwork: inNetworkReach, + outOfNetwork: oonPotential, + total: totalReach + }; +} + +function displayResults(followers, following, avgEngagement, verified, mechanisms, tier, reach) { + const resultsContainer = document.getElementById('calculator-results'); + + // Show results container + resultsContainer.style.display = 'block'; + + // Generate tier description + const tierDescriptions = { + 1: 'Subject to all penalties, no structural advantages', + 2: 'Some advantages if verified, improving ratio', + 3: 'Often verified (100x), penalties matter less', + 4: 'Verified, minimal penalties, strong reach', + 5: 'Above most rules, maximum algorithmic support' + }; + + // Build results HTML + let html = ` +
+

Your Tier: ${tier} of 5

+

${tierDescriptions[tier]}

+ +
+

Estimated Effective Reach

+
+ ~${formatNumber(Math.round(reach.total))} +
+
+
+
In-network
+
${formatNumber(Math.round(reach.inNetwork))}
+
+
+
Out-of-network
+
${formatNumber(Math.round(reach.outOfNetwork))}
+
+
+
+ +
+

Mechanisms Affecting You

+ +
+
+ ${mechanisms.verification.applies ? '✓' : '✗'} 1. Verification Multiplier +
+
+ ${mechanisms.verification.description} +
+
+ Code: UserMass.scala:41 +
+
+ +
+
+ ${mechanisms.twhin.applies ? '✓' : '✗'} 2. TwHIN Threshold +
+
+ ${mechanisms.twhin.description} +
+ ${mechanisms.twhin.applies ? + '
Full TwHIN support: ANN candidate generation + 10+ feature hydrators
' : + '
No TwHIN support: Zero embeddings, no candidate generation
' + } +
+ Code: TwhinEmbeddingsStore.scala:48 +
+
+ +
+
+ ${mechanisms.followRatio.applies ? '⚠' : '✓'} 3. Follow Ratio Penalty +
+
+ ${mechanisms.followRatio.description} +
+ ${mechanisms.followRatio.applies ? + `
High penalty: Your TweepCred is divided by ${formatNumber(mechanisms.followRatio.multiplier)}
` : + '
No penalty applied to your account
' + } +
+ Code: UserMass.scala:54-64 +
+
+ +
+
+ 4. Out-of-Network Penalty (Universal) +
+
+ ${mechanisms.outOfNetwork.description} +
+
+ ${followers < 1000 ? + 'High impact: ~99% of your potential reach is out-of-network' : + followers < 100000 ? + 'Moderate impact: Large in-network base reduces relative effect' : + 'Low relative impact: In-network base is very large' + } +
+
+ Code: RescoringFactorProvider.scala:46-57 +
+
+
+ + ${generateRecommendations(followers, following, verified, mechanisms, tier)} +
+ `; + + resultsContainer.innerHTML = html; + + // Scroll results into view + resultsContainer.scrollIntoView({ behavior: 'smooth', block: 'nearest' }); +} + +function generateRecommendations(followers, following, verified, mechanisms, tier) { + let recommendations = '
'; + recommendations += '

Observations

'; + recommendations += '
    '; + + if (!verified && tier <= 3) { + recommendations += '
  • Verification would provide 100x TweepCred multiplier ($8/month Twitter Blue)
  • '; + } + + if (!mechanisms.twhin.applies) { + recommendations += '
  • Average engagement below TwHIN threshold (16) - most tweets lack embedding support
  • '; + } + + if (mechanisms.followRatio.applies && mechanisms.followRatio.multiplier > 10) { + recommendations += '
  • High follow ratio penalty significantly reduces reach
  • '; + } + + if (tier === 1) { + recommendations += '
  • Tier 1: Limited algorithmic support, primarily in-network reach
  • '; + } + + if (tier >= 4) { + recommendations += '
  • Tier 4-5: Most penalties negligible due to large base
  • '; + } + + recommendations += '
'; + recommendations += '
'; + + return recommendations; +} + +function formatNumber(num) { + if (num >= 1000000) { + return (num / 1000000).toFixed(1) + 'M'; + } else if (num >= 1000) { + return (num / 1000).toFixed(1) + 'K'; + } else if (num >= 10) { + return num.toFixed(0); + } else { + return num.toFixed(1); + } +} diff --git a/docs/js/algorithmic-identity.js b/docs/js/algorithmic-identity.js new file mode 100644 index 000000000..12ed7fde2 --- /dev/null +++ b/docs/js/algorithmic-identity.js @@ -0,0 +1,361 @@ +// Algorithmic Identity Interactive JavaScript + +// Wait for DOM to be ready +document.addEventListener('DOMContentLoaded', function() { + +// ============================================================================ +// InterestedIn Calculator +// ============================================================================ + +document.getElementById('calculate-btn').addEventListener('click', calculateInterestedIn); + +// Also trigger on Enter key +['ai-likes', 'cooking-likes', 'politics-likes'].forEach(id => { + document.getElementById(id).addEventListener('keypress', (e) => { + if (e.key === 'Enter') calculateInterestedIn(); + }); +}); + +function calculateInterestedIn() { + // Get input values + const aiLikes = parseInt(document.getElementById('ai-likes').value) || 0; + const cookingLikes = parseInt(document.getElementById('cooking-likes').value) || 0; + const politicsLikes = parseInt(document.getElementById('politics-likes').value) || 0; + + // Calculate total + const total = aiLikes + cookingLikes + politicsLikes; + + if (total === 0) { + alert('Please enter at least some engagement!'); + return; + } + + // Calculate percentages (raw scores) + const aiPercent = (aiLikes / total) * 100; + const cookingPercent = (cookingLikes / total) * 100; + const politicsPercent = (politicsLikes / total) * 100; + + // Display results + displayResults({ + ai: { raw: aiLikes, percent: aiPercent }, + cooking: { raw: cookingLikes, percent: cookingPercent }, + politics: { raw: politicsLikes, 
percent: politicsPercent }, + total: total + }); +} + +function displayResults(data) { + const container = document.getElementById('results-container'); + const barsContainer = document.getElementById('results-bars'); + const interpretationContainer = document.getElementById('results-interpretation'); + + // Show results + container.style.display = 'block'; + + // Create visual bars + barsContainer.innerHTML = ` +
+
+ AI/Tech + ${data.ai.percent.toFixed(1)}% +
+
+
+
+
+ +
+
+ Cooking + ${data.cooking.percent.toFixed(1)}% +
+
+
+
+
+ +
+
+ Politics + ${data.politics.percent.toFixed(1)}% +
+
+
+
+
+ `; + + // Determine dominant cluster + const sorted = [ + { name: 'AI/Tech', percent: data.ai.percent }, + { name: 'Cooking', percent: data.cooking.percent }, + { name: 'Politics', percent: data.politics.percent } + ].sort((a, b) => b.percent - a.percent); + + const dominant = sorted[0]; + const secondary = sorted[1]; + + // Generate interpretation + let interpretation = '

What This Means

'; + + if (dominant.percent >= 70) { + interpretation += ` +

Strong consolidation: Your feed is dominated by ${dominant.name} (${dominant.percent.toFixed(1)}%). + This means:

+ +

Warning: You're approaching a filter bubble. Consider diversifying your engagement.

+ `; + } else if (dominant.percent >= 50) { + interpretation += ` +

Moderate imbalance: ${dominant.name} is your dominant interest (${dominant.percent.toFixed(1)}%), + but you still see meaningful ${secondary.name} content. This will likely drift further toward ${dominant.name} over time.

+ +

Tip: If you want to maintain balance, actively engage more with ${secondary.name} content.

+ `; + } else { + interpretation += ` +

Relatively balanced: Your interests are fairly distributed. However, the algorithm will + naturally drift toward the strongest interest (${dominant.name} at ${dominant.percent.toFixed(1)}%) over time.

+ +

Tip: Balanced interests require active maintenance; the algorithm has no rebalancing mechanisms.

+ `; + } + + interpretation += ` +
+

+ Next week: Your InterestedIn will update based on what you engage with this week. + The cycle repeats every 7 days, creating compounding drift. +

+
+ `; + + interpretationContainer.innerHTML = interpretation; + + // Scroll results into view + container.scrollIntoView({ behavior: 'smooth', block: 'nearest' }); +} + +// ============================================================================ +// Timeline Scrubber +// ============================================================================ + +const weekSlider = document.getElementById('week-slider'); +const weekDisplay = document.getElementById('week-display'); +const timelineEvents = document.getElementById('timeline-events'); +const timelineState = document.getElementById('timeline-state'); + +// Timeline data +const timelineData = [ + { + week: 0, + aiPercent: 60, + cookingPercent: 40, + events: ['You follow 50 accounts: 30 AI, 20 Cooking', 'Initial engagement tracked'], + knownForUpdate: false, + interestedInUpdate: false + }, + { + week: 1, + aiPercent: 62, + cookingPercent: 38, + events: ['InterestedIn first calculation', 'Feed shifts to 62/38 composition', 'You engage with what you see (62% AI)'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 2, + aiPercent: 64, + cookingPercent: 36, + events: ['InterestedIn updates (weekly)', 'Compounding begins: AI engagement increases AI score'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 3, + aiPercent: 65, + cookingPercent: 35, + events: ['KnownFor updates (first time since Week 0)', 'Cluster structure recalculates', 'InterestedIn recalculates with new KnownFor', 'Some accounts may shift clusters'], + knownForUpdate: true, + interestedInUpdate: true + }, + { + week: 4, + aiPercent: 66, + cookingPercent: 34, + events: ['InterestedIn updates on new KnownFor baseline', 'Compounding accelerates'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 5, + aiPercent: 68, + cookingPercent: 32, + events: ['InterestedIn updates (weekly)', 'Drift momentum building'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 6, + aiPercent: 70, + 
cookingPercent: 30, + events: ['KnownFor updates (second time)', 'InterestedIn updates', 'AI dominance clear: 70/30 split'], + knownForUpdate: true, + interestedInUpdate: true + }, + { + week: 7, + aiPercent: 71, + cookingPercent: 29, + events: ['InterestedIn updates (weekly)', 'Cooking content becoming rare'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 8, + aiPercent: 72, + cookingPercent: 28, + events: ['InterestedIn updates (weekly)', 'Feed increasingly homogeneous'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 9, + aiPercent: 74, + cookingPercent: 26, + events: ['KnownFor updates (third time)', 'InterestedIn updates', 'Cluster structure locked in'], + knownForUpdate: true, + interestedInUpdate: true + }, + { + week: 10, + aiPercent: 75, + cookingPercent: 25, + events: ['InterestedIn updates (weekly)', 'Approaching threshold danger zone'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 11, + aiPercent: 76, + cookingPercent: 24, + events: ['InterestedIn updates (weekly)', 'Cooking barely visible'], + knownForUpdate: false, + interestedInUpdate: true + }, + { + week: 12, + aiPercent: 76, + cookingPercent: 24, + events: ['KnownFor updates (fourth time)', 'InterestedIn updates', 'Consolidation complete: 60/40 → 76/24', '16 percentage point drift in 12 weeks'], + knownForUpdate: true, + interestedInUpdate: true + } +]; + +// Initialize timeline +updateTimeline(0); + +weekSlider.addEventListener('input', (e) => { + const week = parseInt(e.target.value); + updateTimeline(week); +}); + +function updateTimeline(week) { + weekDisplay.textContent = week; + const data = timelineData[week]; + + // Display events + let eventsHTML = '

This Week's Events:

'; + eventsHTML += ''; + + // Add update badges + let badgesHTML = '
'; + + if (data.knownForUpdate) { + badgesHTML += 'KnownFor Updates'; + } + + if (data.interestedInUpdate) { + badgesHTML += 'InterestedIn Updates'; + } + + badgesHTML += '
'; + + timelineEvents.innerHTML = eventsHTML + badgesHTML; + + // Display current state + const aiDrift = data.aiPercent - 60; + const cookingDrift = data.cookingPercent - 40; + + let stateHTML = ` +

Your Current State

+ +
+
+ AI/Tech + ${data.aiPercent}% +
+
+
+
+ ${aiDrift > 0 ? `
+${aiDrift} points from Week 0
` : ''} +
+ +
+
+ Cooking + ${data.cookingPercent}% +
+
+
+
+ ${cookingDrift < 0 ? `
${cookingDrift} points from Week 0
` : ''} +
+ `; + + // Add interpretation + if (week === 0) { + stateHTML += ` +
+

Starting point: Slightly unbalanced (60/40). This small imbalance will compound over time.

+
+ `; + } else if (week === 3 || week === 6 || week === 9) { + stateHTML += ` +
+

KnownFor update week: The underlying cluster structure recalculates. This creates a new baseline for the next 3 weeks of InterestedIn updates.

+
+ `; + } else if (week === 12) { + stateHTML += ` +
+

Result: Without changing your behavior, your feed drifted from 60/40 to ${data.aiPercent}/${data.cookingPercent}. This will continue toward 80/20, 90/10, eventually 100/0.

+
+ `; + } + + timelineState.innerHTML = stateHTML; +} + +// End DOMContentLoaded +}); diff --git a/docs/js/cluster-explorer.js b/docs/js/cluster-explorer.js new file mode 100644 index 000000000..2f946d8ea --- /dev/null +++ b/docs/js/cluster-explorer.js @@ -0,0 +1,416 @@ +/** + * Cluster Explorer - Calculate Your Algorithmic Communities + * + * Based on Twitter's SimClusters algorithm: + * - InterestedIn = EngagementGraph × KnownFor + * - Default type: FavBasedUserInterestedIn (engagement-based) + * - 100-day half-life for engagement decay + * - L2 normalization in SimClusters (unit-length embeddings); this demo scales cluster weights to sum to 1.0 + * - Weekly batch updates + * + * Code references: + * - InterestedInFromKnownFor.scala:292 (favScore calculation) + * - SimClustersEmbeddingId.scala:46 (default type) + * - SimClustersEmbedding.scala:59-72 (L2 normalization) + */ + +// DOM elements +const profileSelect = document.getElementById('profile-select'); +const customSelection = document.getElementById('custom-selection'); +const producerCheckboxes = document.querySelectorAll('.producer-checkbox'); +const aiEngagement = document.getElementById('ai-engagement'); +const cookingEngagement = document.getElementById('cooking-engagement'); +const politicsEngagement = document.getElementById('politics-engagement'); +const aiWeightDisplay = document.getElementById('ai-weight-display'); +const cookingWeightDisplay = document.getElementById('cooking-weight-display'); +const politicsWeightDisplay = document.getElementById('politics-weight-display'); +const calculateBtn = document.getElementById('calculate-btn'); +const resultsContainer = document.getElementById('results-container'); +const interpretation = document.getElementById('interpretation'); +const comparisonBody = document.getElementById('comparison-body'); +const warnings = document.getElementById('warnings'); + +// Chart instance +let clusterChart = null; + +// Preset profiles +const PROFILES = { + balanced: { + follows: { ai: 4, cooking: 4, politics: 4 }, + engagement: 
{ ai: 50, cooking: 50, politics: 50 } + }, + tech: { + follows: { ai: 9, cooking: 2, politics: 1 }, + engagement: { ai: 150, cooking: 30, politics: 20 } + }, + politics: { + follows: { ai: 2, cooking: 1, politics: 9 }, + engagement: { ai: 30, cooking: 10, politics: 160 } + }, + cooking: { + follows: { ai: 2, cooking: 9, politics: 1 }, + engagement: { ai: 40, cooking: 150, politics: 10 } + } +}; + +// Initialize +updateEngagementDisplays(); + +// Event listeners +profileSelect.addEventListener('change', handleProfileChange); +aiEngagement.addEventListener('input', updateEngagementDisplays); +cookingEngagement.addEventListener('input', updateEngagementDisplays); +politicsEngagement.addEventListener('input', updateEngagementDisplays); +calculateBtn.addEventListener('click', calculateClusters); + +/** + * Handle profile selection + */ +function handleProfileChange() { + const profile = profileSelect.value; + + if (profile === 'custom') { + customSelection.style.display = 'block'; + return; + } + + // Load preset profile + customSelection.style.display = 'block'; + const preset = PROFILES[profile]; + + // Set checkboxes + producerCheckboxes.forEach(checkbox => { + checkbox.checked = false; + }); + + // Check appropriate number of boxes per cluster + let aiChecked = 0, cookingChecked = 0, politicsChecked = 0; + producerCheckboxes.forEach(checkbox => { + const cluster = checkbox.dataset.cluster; + if (cluster === 'ai' && aiChecked < preset.follows.ai) { + checkbox.checked = true; + aiChecked++; + } else if (cluster === 'cooking' && cookingChecked < preset.follows.cooking) { + checkbox.checked = true; + cookingChecked++; + } else if (cluster === 'politics' && politicsChecked < preset.follows.politics) { + checkbox.checked = true; + politicsChecked++; + } + }); + + // Set engagement sliders + aiEngagement.value = preset.engagement.ai; + cookingEngagement.value = preset.engagement.cooking; + politicsEngagement.value = preset.engagement.politics; + updateEngagementDisplays(); 
+} + +/** + * Update engagement weight displays + */ +function updateEngagementDisplays() { + aiWeightDisplay.textContent = aiEngagement.value; + cookingWeightDisplay.textContent = cookingEngagement.value; + politicsWeightDisplay.textContent = politicsEngagement.value; +} + +/** + * Calculate cluster assignment + */ +function calculateClusters() { + // Count follows per cluster + const follows = { ai: 0, cooking: 0, politics: 0 }; + producerCheckboxes.forEach(checkbox => { + if (checkbox.checked) { + follows[checkbox.dataset.cluster]++; + } + }); + + // Get engagement counts from the sliders + const engagement = { + ai: parseInt(aiEngagement.value, 10), + cooking: parseInt(cookingEngagement.value, 10), + politics: parseInt(politicsEngagement.value, 10) + }; + + // Calculate InterestedIn + const result = calculateInterestedIn(follows, engagement); + + // Display results + displayResults(result); + + // Scroll to results + resultsContainer.style.display = 'block'; + resultsContainer.scrollIntoView({ behavior: 'smooth', block: 'start' }); +} + +/** + * Calculate InterestedIn using a simplified stand-in for the matrix multiplication + * + * InterestedIn = EngagementGraph × KnownFor + * + * Simplified model: + * - Follows contribute base weight (starting point) + * - Engagement contributes weighted score (dominates over time) + * - Normalize so weights sum to 1.0 (L1 normalization, not L2) + */ +function calculateInterestedIn(follows, engagement) { + // Step 1: Calculate from follows (base weight) + const fromFollows = { + ai: follows.ai * 10, // Each follow = 10 base weight + cooking: follows.cooking * 10, + politics: follows.politics * 10 + }; + + // Step 2: Add engagement (with 100-day half-life, current engagement = full weight) + // Each engagement counts less than a follow (5 vs. 10), but engagement volume is usually far higher, so it dominates (this is the key!) 
+ const fromEngagement = { + ai: engagement.ai * 5, // Each engagement = 5 weight + cooking: engagement.cooking * 5, + politics: engagement.politics * 5 + }; + + // Step 3: Combine + let clusters = { + ai: fromFollows.ai + fromEngagement.ai, + cooking: fromFollows.cooking + fromEngagement.cooking, + politics: fromFollows.politics + fromEngagement.politics + }; + + // Step 4: Normalize so weights sum to 1.0 (L1 normalization, not L2) + const total = clusters.ai + clusters.cooking + clusters.politics; + + if (total === 0) { + // No follows or engagement - equal distribution + clusters = { ai: 0.33, cooking: 0.33, politics: 0.34 }; + } else { + clusters.ai /= total; + clusters.cooking /= total; + clusters.politics /= total; + } + + return { + clusters, + fromFollows, + fromEngagement + }; +} + +/** + * Display results with Chart.js + */ +function displayResults(result) { + const { clusters, fromFollows, fromEngagement } = result; + + // Destroy existing chart + if (clusterChart) { + clusterChart.destroy(); + } + + // Create bar chart + const ctx = document.getElementById('cluster-chart').getContext('2d'); + clusterChart = new Chart(ctx, { + type: 'bar', + data: { + labels: ['AI/Tech', 'Cooking', 'Politics'], + datasets: [{ + label: 'Your Cluster Interest (%)', + data: [ + (clusters.ai * 100).toFixed(1), + (clusters.cooking * 100).toFixed(1), + (clusters.politics * 100).toFixed(1) + ], + backgroundColor: ['#1DA1F2', '#17bf63', '#ff9500'], + borderWidth: 2, + borderColor: '#192734' + }] + }, + options: { + responsive: true, + maintainAspectRatio: true, + plugins: { + legend: { + display: false + }, + tooltip: { + callbacks: { + label: function(context) { + return context.parsed.y.toFixed(1) + '%'; + } + } + } + }, + scales: { + y: { + beginAtZero: true, + max: 100, + ticks: { + callback: function(value) { + return value + '%'; + } + }, + grid: { + color: 'rgba(255, 255, 255, 0.1)' + } + }, + x: { + grid: { + display: false + } + } + } + } + }); + + // Generate interpretation + 
generateInterpretation(clusters); + + // Generate comparison table + generateComparisonTable(fromFollows, fromEngagement, clusters); + + // Generate warnings + generateWarnings(clusters, fromFollows, fromEngagement); +} + +/** + * Generate interpretation text + */ +function generateInterpretation(clusters) { + // Find dominant cluster + const sorted = Object.entries(clusters) + .sort((a, b) => b[1] - a[1]); + + const dominant = sorted[0]; + const dominantName = formatClusterName(dominant[0]); + const dominantPercent = (dominant[1] * 100).toFixed(0); + + const second = sorted[1]; + const secondName = formatClusterName(second[0]); + const secondPercent = (second[1] * 100).toFixed(0); + + let html = '
'; + + if (dominant[1] > 0.7) { + html += `

Heavily concentrated: You're ${dominantPercent}% ${dominantName}. This cluster will dominate your For You feed.

`; + html += `

Roughly ${dominantPercent}% of your feed will be ${dominantName} content. ${secondName} (${secondPercent}%) will be much less visible.

`; + } else if (dominant[1] > 0.5) { + html += `

Moderately concentrated: You're ${dominantPercent}% ${dominantName}, ${secondPercent}% ${secondName}.

`; + html += `

Your feed will be roughly ${dominantPercent}% ${dominantName} and ${secondPercent}% ${secondName}. The smaller cluster may drop below the threshold over time due to multiplicative scoring.

`; + } else { + html += `

Relatively balanced: Your top cluster is ${dominantPercent}% ${dominantName}.

`; + html += `

Your feed will be fairly diverse, but expect drift toward ${dominantName} over time due to multiplicative scoring (gravitational pull effect).

`; + } + + html += '
'; + interpretation.innerHTML = html; +} + +/** + * Generate comparison table + */ +function generateComparisonTable(fromFollows, fromEngagement, clusters) { + // Calculate percentages from follows only + const totalFollows = fromFollows.ai + fromFollows.cooking + fromFollows.politics; + const followsPercent = { + ai: totalFollows > 0 ? (fromFollows.ai / totalFollows * 100).toFixed(0) : 0, + cooking: totalFollows > 0 ? (fromFollows.cooking / totalFollows * 100).toFixed(0) : 0, + politics: totalFollows > 0 ? (fromFollows.politics / totalFollows * 100).toFixed(0) : 0 + }; + + // Calculate percentages from engagement only + const totalEngagement = fromEngagement.ai + fromEngagement.cooking + fromEngagement.politics; + const engagementPercent = { + ai: totalEngagement > 0 ? (fromEngagement.ai / totalEngagement * 100).toFixed(0) : 0, + cooking: totalEngagement > 0 ? (fromEngagement.cooking / totalEngagement * 100).toFixed(0) : 0, + politics: totalEngagement > 0 ? (fromEngagement.politics / totalEngagement * 100).toFixed(0) : 0 + }; + + const finalPercent = { + ai: (clusters.ai * 100).toFixed(0), + cooking: (clusters.cooking * 100).toFixed(0), + politics: (clusters.politics * 100).toFixed(0) + }; + + const html = ` + + AI/Tech + ${followsPercent.ai}% + ${engagementPercent.ai}% + ${finalPercent.ai}% + + + Cooking + ${followsPercent.cooking}% + ${engagementPercent.cooking}% + ${finalPercent.cooking}% + + + Politics + ${followsPercent.politics}% + ${engagementPercent.politics}% + ${finalPercent.politics}% + + `; + + comparisonBody.innerHTML = html; +} + +/** + * Generate warnings + */ +function generateWarnings(clusters, fromFollows, fromEngagement) { + const warningsHtml = []; + + // Check for threshold danger + const sorted = Object.entries(clusters).sort((a, b) => b[1] - a[1]); + const weakest = sorted[2]; + + if (weakest[1] < 0.1) { + const weakestName = formatClusterName(weakest[0]); + const weakestPercent = (weakest[1] * 100).toFixed(1); + warningsHtml.push(` +
+

⚠️ Threshold Danger

+

${weakestName} (${weakestPercent}%) is approaching the algorithm's threshold. If it drops below ~7%, it may be filtered out entirely from your feed.

+

To maintain diversity, you need to actively engage with ${weakestName} content to keep this cluster above the threshold.

+
+ `); + } + + // Check for engagement vs follows mismatch + const totalFollows = fromFollows.ai + fromFollows.cooking + fromFollows.politics; + const totalEngagement = fromEngagement.ai + fromEngagement.cooking + fromEngagement.politics; + + if (totalEngagement > totalFollows * 2) { + warningsHtml.push(` +
+

💡 Engagement Dominates

+

Your engagement history is strongly influencing your clusters (${(totalEngagement / (totalFollows + totalEngagement) * 100).toFixed(0)}% of the signal).

+

This is normal! Engagement with a 100-day half-life dominates over follows for active users. Your clusters reflect what you actually engage with, not just who you follow.

+
+ `); + } else if (totalFollows > totalEngagement * 2) { + warningsHtml.push(` +
+

📊 Follows Dominate (For Now)

+

Your follows are the primary signal (${(totalFollows / (totalFollows + totalEngagement) * 100).toFixed(0)}% of the calculation).

+

This suggests you're either a new account or a light user. As you engage more, your engagement history will start dominating. Within a few weeks of active use, engagement will override your follow choices.

+
+ `); + } + + warnings.innerHTML = warningsHtml.join(''); +} + +/** + * Format cluster name for display + */ +function formatClusterName(cluster) { + const names = { + ai: 'AI/Tech', + cooking: 'Cooking', + politics: 'Politics' + }; + return names[cluster] || cluster; +} diff --git a/docs/js/engagement-calculator.js b/docs/js/engagement-calculator.js new file mode 100644 index 000000000..9fcefae3f --- /dev/null +++ b/docs/js/engagement-calculator.js @@ -0,0 +1,613 @@ +/** + * Engagement Weight Calculator - Tweet Scoring Simulator + * + * Based on Twitter's algorithm code: + * - Engagement weights from HomeGlobalParams.scala:788-930 + * - Scoring logic from NaviModelScorer.scala:139-178 + * - Heavy Ranker predictions from MaskNet architecture + * + * Code references: + * - HomeGlobalParams.scala:788-930 (engagement weights) + * - NaviModelScorer.scala:139-178 (weighted score computation) + */ + +// Actual engagement weights from March 2023 code +// Source: home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:788-930 +const ENGAGEMENT_WEIGHTS = { + 'Reply with Author Engagement': 75.0, + 'Reply': 13.5, + 'Good Profile Click': 12.0, + 'Good Click': 11.0, + 'Video Playback 50%': 0.005, + 'Retweet': 1.0, + 'Favorite': 0.5, + 'Negative Feedback': -74.0, + 'Report': -369.0 +}; + +// Engagement type keys (for consistent ordering) +const ENGAGEMENT_TYPES = Object.keys(ENGAGEMENT_WEIGHTS); + +// Scenario definitions with realistic engagement probabilities +// Probabilities represent: "What % of users who see this tweet will engage this way?" +const SCENARIOS = { + 'educational-thread': { + name: '📚 Educational Thread', + description: 'How to build a neural network from scratch (10 tweet thread with code examples)', + explanation: 'Educational content drives deep engagement. Users who learn something valuable are likely to reply with questions or engage with the author. 
High "good click" rate as users read the entire thread.', + probabilities: { + 'Reply with Author Engagement': 3.0, // People ask questions, author responds + 'Reply': 8.0, // High conversation + 'Good Profile Click': 6.0, // Check out the author + 'Good Click': 28.0, // Read the whole thread + 'Video Playback 50%': 0.0, // No video + 'Retweet': 5.0, // Share with followers + 'Favorite': 18.0, // Bookmark/like + 'Negative Feedback': 1.0, // Very low + 'Report': 0.1 // Minimal + } + }, + 'breaking-news': { + name: '📰 Breaking News', + description: 'BREAKING: Major tech company announces layoffs. Thread with details ↓', + explanation: 'Breaking news drives extremely high engagement across all types. People want to discuss, share, and learn more. Profile clicks as people check if source is credible.', + probabilities: { + 'Reply with Author Engagement': 2.0, + 'Reply': 15.0, // High discussion + 'Good Profile Click': 10.0, // Check credibility + 'Good Click': 35.0, // Read full thread + 'Video Playback 50%': 0.0, + 'Retweet': 18.0, // High shares + 'Favorite': 25.0, // Many likes + 'Negative Feedback': 2.0, // Some don't care + 'Report': 0.2 + } + }, + 'useful-resource': { + name: '🔧 Useful Resource', + description: 'I\'ve compiled 100 free resources for learning data science: [link]', + explanation: 'Resource lists drive clicks (people visit the link), retweets (share with others), and bookmarks (save for later). Lower reply rate because there\'s less to discuss.', + probabilities: { + 'Reply with Author Engagement': 1.0, + 'Reply': 3.0, // Low conversation + 'Good Profile Click': 4.0, + 'Good Click': 32.0, // Click the link! + 'Video Playback 50%': 0.0, + 'Retweet': 12.0, // Share the resource + 'Favorite': 28.0, // Bookmark/like + 'Negative Feedback': 1.5, + 'Report': 0.2 + } + }, + 'wholesome': { + name: '❤️ Wholesome Content', + description: 'My daughter just wrote her first line of code. So proud! 
[cute photo]', + explanation: 'Wholesome content gets LOTS of likes (feel-good engagement) but very few replies (what is there to say?). This is the "Favorites Paradox" in action - high engagement but low algorithmic value.', + probabilities: { + 'Reply with Author Engagement': 0.3, + 'Reply': 2.0, // "Congrats!" replies + 'Good Profile Click': 1.0, + 'Good Click': 8.0, // Look at photo + 'Video Playback 50%': 0.0, + 'Retweet': 3.0, + 'Favorite': 42.0, // TONS of likes! + 'Negative Feedback': 0.5, // Very positive + 'Report': 0.05 + } + }, + 'viral-meme': { + name: '😂 Viral Meme', + description: 'me: I\'ll just check Twitter for 5 minutes [4 hours later meme]', + explanation: 'Viral memes get massive passive engagement (likes, retweets) but relatively low conversation. The algorithm values this less than educational threads despite higher total engagement!', + probabilities: { + 'Reply with Author Engagement': 0.2, + 'Reply': 4.0, // Some funny responses + 'Good Profile Click': 2.0, + 'Good Click': 15.0, // View the meme + 'Video Playback 50%': 0.0, + 'Retweet': 15.0, // High sharing + 'Favorite': 45.0, // VERY high likes + 'Negative Feedback': 1.0, + 'Report': 0.1 + } + }, + 'personal-story': { + name: '💭 Personal Story', + description: 'Thread about my journey from bootcamp to senior engineer (authentic, relatable)', + explanation: 'Authentic stories drive balanced engagement. 
People relate, engage meaningfully, and sometimes have conversations with the author.', + probabilities: { + 'Reply with Author Engagement': 2.5, // Author engages with supporters + 'Reply': 7.0, + 'Good Profile Click': 8.0, // Check out their profile + 'Good Click': 25.0, // Read the story + 'Video Playback 50%': 0.0, + 'Retweet': 6.0, + 'Favorite': 20.0, + 'Negative Feedback': 1.2, + 'Report': 0.1 + } + }, + 'hot-take': { + name: '🔥 Hot Take', + description: 'Unpopular opinion: [controversial tech opinion that sparks debate]', + explanation: 'Controversial takes drive HIGH reply rates (people want to argue) but also significant negative feedback. May score negatively overall despite high engagement!', + probabilities: { + 'Reply with Author Engagement': 0.8, + 'Reply': 22.0, // LOTS of debate + 'Good Profile Click': 3.0, + 'Good Click': 8.0, + 'Video Playback 50%': 0.0, + 'Retweet': 5.0, // Some people share + 'Favorite': 8.0, // Low agreement + 'Negative Feedback': 14.0, // Many click "not interested"! + 'Report': 1.5 // Some reports + } + }, + 'quote-dunk': { + name: '💢 Quote Tweet Dunk', + description: 'lmao imagine actually believing this [quote tweets bad take]', + explanation: 'Dunking drives engagement but creates negative experiences. High replies (people join the pile-on) but also high negative feedback (many find it toxic).', + probabilities: { + 'Reply with Author Engagement': 1.0, + 'Reply': 18.0, // Pile-on replies + 'Good Profile Click': 4.0, // See the drama + 'Good Click': 12.0, + 'Video Playback 50%': 0.0, + 'Retweet': 8.0, // Share the dunk + 'Favorite': 15.0, // Agree with dunk + 'Negative Feedback': 12.0, // Many hide this + 'Report': 2.0 // Harassment reports + } + }, + 'engagement-bait': { + name: '🎣 Engagement Bait', + description: 'Drop a 🔥 if you agree! Follow me for more content like this! #engagement', + explanation: 'Obvious engagement bait gets moderate replies but VERY high negative feedback. Users hate this type of content. 
Despite replies, usually scores negative!', + probabilities: { + 'Reply with Author Engagement': 0.3, + 'Reply': 8.0, // Some engagement + 'Good Profile Click': 1.0, + 'Good Click': 3.0, + 'Video Playback 50%': 0.0, + 'Retweet': 2.0, + 'Favorite': 5.0, + 'Negative Feedback': 18.0, // Users HATE this! + 'Report': 3.0 // Spam reports + } + }, + 'spam': { + name: '🚫 Spam/Low Quality', + description: 'CHECK OUT MY CRYPTO COURSE!!! 🚀💰 LINK IN BIO [generic spam]', + explanation: 'Spam gets almost no positive engagement and very high negative signals. Heavily suppressed by the algorithm.', + probabilities: { + 'Reply with Author Engagement': 0.0, + 'Reply': 0.5, // Almost nothing + 'Good Profile Click': 0.2, + 'Good Click': 1.0, + 'Video Playback 50%': 0.0, + 'Retweet': 0.1, + 'Favorite': 0.3, + 'Negative Feedback': 25.0, // VERY high + 'Report': 8.0 // Lots of reports + } + }, + 'reply-guy': { + name: '😬 Reply Guy', + description: 'Actually, [unsolicited correction on someone\'s casual tweet]', + explanation: 'Unsolicited corrections get low engagement and moderate negative feedback. People don\'t like being corrected on casual tweets.', + probabilities: { + 'Reply with Author Engagement': 0.2, + 'Reply': 3.0, // Some arguments + 'Good Profile Click': 1.0, + 'Good Click': 2.0, + 'Video Playback 50%': 0.0, + 'Retweet': 0.5, + 'Favorite': 2.0, + 'Negative Feedback': 8.0, // Annoying + 'Report': 0.8 + } + }, + 'algorithm-hack': { + name: '🤖 Algorithm Gaming', + description: 'Agree or disagree? Comment below! ⬇️ [intentionally vague to drive replies]', + explanation: 'Attempts to game the algorithm with vague engagement prompts. 
Gets replies but also significant negative feedback from savvy users.', + probabilities: { + 'Reply with Author Engagement': 0.5, + 'Reply': 12.0, // Gets replies + 'Good Profile Click': 1.5, + 'Good Click': 4.0, + 'Video Playback 50%': 0.0, + 'Retweet': 2.0, + 'Favorite': 6.0, + 'Negative Feedback': 10.0, // Users recognize the tactic + 'Report': 1.5 + } + } +}; + +// Chart instances +let weightsChart = null; +let contributionChart = null; + +// Currently selected scenario +let selectedScenario = null; + +// Initialize on page load +window.addEventListener('DOMContentLoaded', () => { + renderWeightsChart(); + attachScenarioHandlers(); +}); + +/** + * Attach click handlers to scenario cards + */ +function attachScenarioHandlers() { + const cards = document.querySelectorAll('.scenario-card'); + cards.forEach(card => { + card.addEventListener('click', () => { + const scenarioKey = card.dataset.scenario; + selectScenario(scenarioKey); + }); + }); +} + +/** + * Select and display a scenario + */ +function selectScenario(scenarioKey) { + const scenario = SCENARIOS[scenarioKey]; + if (!scenario) return; + + selectedScenario = scenarioKey; + + // Update visual selection + document.querySelectorAll('.scenario-card').forEach(card => { + card.classList.remove('selected'); + }); + document.querySelector(`[data-scenario="${scenarioKey}"]`).classList.add('selected'); + + // Calculate and display + calculateScenario(scenario); + + // Scroll to results + document.getElementById('results-container').scrollIntoView({ + behavior: 'smooth', + block: 'start' + }); +} + +/** + * Calculate score for a scenario + */ +function calculateScenario(scenario) { + let totalScore = 0; + const contributions = {}; + + // Calculate contributions + ENGAGEMENT_TYPES.forEach(type => { + const probability = scenario.probabilities[type]; + const weight = ENGAGEMENT_WEIGHTS[type]; + const contribution = (probability / 100) * weight; + contributions[type] = contribution; + totalScore += contribution; + 
+ }); + + // Display results + displayResults(totalScore, scenario, contributions); +} + +/** + * Render the engagement weights bar chart + */ +function renderWeightsChart() { + const ctx = document.getElementById('weights-chart').getContext('2d'); + + const labels = Object.keys(ENGAGEMENT_WEIGHTS); + const data = Object.values(ENGAGEMENT_WEIGHTS); + const colors = data.map(value => { + if (value >= 10) return '#17bf63'; // High positive - green + if (value > 0) return '#1DA1F2'; // Low positive - blue + if (value > -100) return '#ff9500'; // Moderate negative - orange + return '#ff6b6b'; // Severe negative - red + }); + + weightsChart = new Chart(ctx, { + type: 'bar', + data: { + labels: labels, + datasets: [{ + label: 'Engagement Weight', + data: data, + backgroundColor: colors, + borderColor: colors.map(c => c), + borderWidth: 1 + }] + }, + options: { + indexAxis: 'y', + responsive: true, + maintainAspectRatio: true, + plugins: { + title: { + display: true, + text: 'Engagement Type Weights (March 2023)', + font: { + size: 16, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a', + padding: 20 + }, + legend: { + display: false + }, + tooltip: { + backgroundColor: 'rgba(0, 0, 0, 0.8)', + titleFont: { + size: 14, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + bodyFont: { + size: 13, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + padding: 12, + callbacks: { + label: function(context) { + const value = context.parsed.x; + return `Weight: ${value}`; + }, + afterLabel: function(context) { + const value = context.parsed.x; + if (value === 75.0) return 'Highest value - conversation!'; + if (value === 0.5) return 'Low value - passive'; + if (value === -369.0) return 'Nuclear penalty!'; + if (value < 0) return 'Negative signal'; + return 'Positive signal'; + } + } + } + }, + scales: { + x: { + title: { + display: 
true, + text: 'Weight', + font: { + size: 14, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a' + }, + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b' + }, + grid: { + color: 'rgba(0, 0, 0, 0.1)' + } + }, + y: { + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b' + }, + grid: { + display: false + } + } + } + } + }); +} + +/** + * Display calculation results + */ +function displayResults(totalScore, scenario, contributions) { + const resultsContainer = document.getElementById('results-container'); + resultsContainer.style.display = 'block'; + + // Update total score + const scoreElement = document.getElementById('total-score'); + scoreElement.textContent = totalScore.toFixed(2); + scoreElement.style.color = totalScore > 0 ? 'var(--success)' : 'var(--warning)'; + + // Score interpretation + const interpretation = document.getElementById('score-interpretation'); + interpretation.innerHTML = getScoreInterpretation(totalScore, scenario); + + // Render contribution chart + renderContributionChart(contributions); + + // Render breakdown table + renderBreakdownTable(scenario.probabilities, contributions, totalScore); +} + +/** + * Get human-readable interpretation of score + */ +function getScoreInterpretation(score, scenario) { + let statusHTML = ''; + + if (score > 5) { + statusHTML = `

Excellent Score - High Amplification

`; + } else if (score > 2) { + statusHTML = `

Good Score - Moderate Amplification

`; + } else if (score > 0) { + statusHTML = `

Positive Score - Limited Amplification

`; + } else if (score > -2) { + statusHTML = `

Slightly Negative - Suppressed

`; + } else { + statusHTML = `

Highly Negative - Heavily Suppressed

`; + } + + return ` + ${statusHTML} +

${scenario.name}

+

"${scenario.description}"

+

Why this score: ${scenario.explanation}

+ `; +} + +/** + * Render contribution breakdown chart + */ +function renderContributionChart(contributions) { + const ctx = document.getElementById('contribution-chart').getContext('2d'); + + // Destroy existing chart + if (contributionChart) { + contributionChart.destroy(); + } + + // Prepare data - only show non-zero contributions + const entries = Object.entries(contributions) + .filter(([_, value]) => Math.abs(value) > 0.001) + .sort((a, b) => Math.abs(b[1]) - Math.abs(a[1])); + + const labels = entries.map(([type, _]) => type); + const data = entries.map(([_, value]) => value); + const colors = data.map(value => { + if (value >= 1) return '#17bf63'; // High positive + if (value > 0) return '#1DA1F2'; // Low positive + return '#ff6b6b'; // Negative + }); + + contributionChart = new Chart(ctx, { + type: 'bar', + data: { + labels: labels, + datasets: [{ + label: 'Contribution to Score', + data: data, + backgroundColor: colors, + borderColor: colors.map(c => c), + borderWidth: 1 + }] + }, + options: { + indexAxis: 'y', + responsive: true, + maintainAspectRatio: true, + plugins: { + title: { + display: true, + text: 'Score Contribution by Engagement Type', + font: { + size: 16, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a', + padding: 20 + }, + legend: { + display: false + }, + tooltip: { + backgroundColor: 'rgba(0, 0, 0, 0.8)', + titleFont: { + size: 14, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + bodyFont: { + size: 13, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + padding: 12, + callbacks: { + label: function(context) { + const value = context.parsed.x; + return `Contribution: ${value.toFixed(3)}`; + } + } + } + }, + scales: { + x: { + title: { + display: true, + text: 'Contribution to Total Score', + font: { + size: 14, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, 
sans-serif' + }, + color: '#1a1a1a' + }, + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b' + }, + grid: { + color: 'rgba(0, 0, 0, 0.1)' + } + }, + y: { + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b' + }, + grid: { + display: false + } + } + } + } + }); +} + +/** + * Render breakdown table + */ +function renderBreakdownTable(probabilities, contributions, totalScore) { + const tbody = document.getElementById('breakdown-tbody'); + + tbody.innerHTML = ENGAGEMENT_TYPES + .map(type => { + const probability = probabilities[type]; + const weight = ENGAGEMENT_WEIGHTS[type]; + const contribution = contributions[type]; + + // Skip if no probability + if (probability === 0) return ''; + + const contributionColor = contribution > 0 ? 'var(--success)' : 'var(--warning)'; + + return ` + + ${type} + ${probability.toFixed(1)}% + ${weight.toFixed(1)} + + ${contribution >= 0 ? '+' : ''}${contribution.toFixed(3)} + + + `; + }) + .filter(row => row !== '') + .join(''); + + // Update total + const totalColor = totalScore > 0 ? 'var(--success)' : 'var(--warning)'; + document.getElementById('breakdown-total').innerHTML = ` + + ${totalScore >= 0 ? '+' : ''}${totalScore.toFixed(3)} + + `; +} diff --git a/docs/js/invisible-filter.js b/docs/js/invisible-filter.js new file mode 100644 index 000000000..3a4854142 --- /dev/null +++ b/docs/js/invisible-filter.js @@ -0,0 +1,411 @@ +/** + * The Invisible Filter - Cluster-based Feed Personalization + * + * Shows how the same tweets get ranked completely differently + * for users with different cluster interests. 
+ * + * Based on: + * - Multiplicative scoring from ApproximateCosineSimilarity.scala:84-94 + * - InterestedIn cluster assignments from InterestedInFromKnownFor.scala + * - L2 normalization (cluster weights sum to 1.0) + */ + +// Tweet dataset with cluster assignments and base quality scores +const TWEETS = [ + { + id: 1, + cluster: 'ai', + content: 'New breakthrough in transformer architecture - 10x faster training with same accuracy [technical thread]', + author: '@ai_researcher', + baseQuality: 0.88 + }, + { + id: 2, + cluster: 'ai', + content: 'Just released our open-source ML framework for edge devices. Check it out! [link]', + author: '@ml_startup', + baseQuality: 0.75 + }, + { + id: 3, + cluster: 'ai', + content: 'Fascinating paper on LLM reasoning capabilities. Thread on key findings ↓', + author: '@phd_student', + baseQuality: 0.82 + }, + { + id: 4, + cluster: 'ai', + content: 'Hot take: Most "AI" products are just wrappers around OpenAI API', + author: '@tech_critic', + baseQuality: 0.65 + }, + { + id: 5, + cluster: 'ai', + content: 'Hiring: Senior ML Engineer for our AI safety team. Must have experience with...', + author: '@ai_company', + baseQuality: 0.55 + }, + { + id: 6, + cluster: 'cooking', + content: 'Made the perfect sourdough after 3 years of trying. Here\'s what finally worked [detailed guide]', + author: '@bread_master', + baseQuality: 0.86 + }, + { + id: 7, + cluster: 'cooking', + content: 'PSA: You\'re probably overcooking your pasta. Al dente means "to the tooth" - here\'s the test...', + author: '@italian_chef', + baseQuality: 0.79 + }, + { + id: 8, + cluster: 'cooking', + content: 'Unpopular opinion: Expensive knives are overrated. Here\'s my $30 knife that\'s lasted 10 years', + author: '@home_cook', + baseQuality: 0.71 + }, + { + id: 9, + cluster: 'cooking', + content: 'Just meal prepped for the entire week in 2 hours. 
Here\'s my system: [photos]', + author: '@meal_prep_pro', + baseQuality: 0.68 + }, + { + id: 10, + cluster: 'cooking', + content: 'The science of umami - why MSG is unfairly demonized (thread)', + author: '@food_scientist', + baseQuality: 0.77 + }, + { + id: 11, + cluster: 'politics', + content: 'BREAKING: Major policy announcement expected this afternoon. Here\'s what we know so far...', + author: '@political_reporter', + baseQuality: 0.84 + }, + { + id: 12, + cluster: 'politics', + content: 'Detailed analysis of yesterday\'s debate performance - fact-checking key claims [long thread]', + author: '@policy_analyst', + baseQuality: 0.80 + }, + { + id: 13, + cluster: 'politics', + content: 'This is exactly what I\'ve been saying for months. Finally someone in power gets it.', + author: '@political_commentator', + baseQuality: 0.62 + }, + { + id: 14, + cluster: 'politics', + content: 'New poll shows surprising shift in voter sentiment. Methodology breakdown in thread ↓', + author: '@pollster', + baseQuality: 0.76 + }, + { + id: 15, + cluster: 'politics', + content: 'Both sides are missing the point on this issue. 
Here\'s the nuanced take no one wants to hear:', + author: '@centrist_voice', + baseQuality: 0.70 + } +]; + +// Friend profile presets +const FRIEND_PROFILES = { + 'politics-focused': { ai: 0.15, cooking: 0.05, politics: 0.80 }, + 'cooking-enthusiast': { ai: 0.20, cooking: 0.75, politics: 0.05 }, + 'balanced': { ai: 0.33, cooking: 0.33, politics: 0.34 }, + 'tech-specialist': { ai: 0.90, cooking: 0.02, politics: 0.08 } +}; + +// Cluster display names and colors +const CLUSTER_INFO = { + 'ai': { name: 'AI/Tech', color: '#1DA1F2' }, + 'cooking': { name: 'Cooking', color: '#17bf63' }, + 'politics': { name: 'Politics', color: '#ff9500' } +}; + +// Current profiles +let userProfile = { ai: 0.60, cooking: 0.25, politics: 0.15 }; +let friendProfile = { ai: 0.15, cooking: 0.05, politics: 0.80 }; +let selectedFriend = 'politics-focused'; + +// DOM elements +const userAiSlider = document.getElementById('user-ai'); +const userCookingSlider = document.getElementById('user-cooking'); +const userPoliticsSlider = document.getElementById('user-politics'); +const compareBtn = document.getElementById('compare-btn'); +const comparisonContainer = document.getElementById('comparison-container'); + +// Initialize +window.addEventListener('DOMContentLoaded', () => { + initializeSliders(); + initializeFriendSelector(); + attachEventListeners(); +}); + +/** + * Initialize sliders with normalization + */ +function initializeSliders() { + // Update displays + updateUserDisplays(); + + // Attach input handlers with normalization + [userAiSlider, userCookingSlider, userPoliticsSlider].forEach(slider => { + slider.addEventListener('input', () => { + normalizeUserProfile(); + updateUserDisplays(); + }); + }); +} + +/** + * Normalize user profile to sum to 100% + * When one slider changes, adjust others proportionally + */ +function normalizeUserProfile() { + const ai = parseInt(userAiSlider.value); + const cooking = parseInt(userCookingSlider.value); + const politics = 
parseInt(userPoliticsSlider.value); + const total = ai + cooking + politics; + + if (total > 0 && total !== 100) { + // Normalize to 100% (skipped when every slider is 0, which would divide by zero) + const normAi = Math.round((ai / total) * 100); + const normCooking = Math.round((cooking / total) * 100); + const normPolitics = 100 - normAi - normCooking; // Ensure exact 100% + + userProfile = { + ai: normAi / 100, + cooking: normCooking / 100, + politics: normPolitics / 100 + }; + } else { + userProfile = { + ai: ai / 100, + cooking: cooking / 100, + politics: politics / 100 + }; + } +} + +/** + * Update user profile displays + */ +function updateUserDisplays() { + const aiPercent = Math.round(userProfile.ai * 100); + const cookingPercent = Math.round(userProfile.cooking * 100); + const politicsPercent = Math.round(userProfile.politics * 100); + const total = aiPercent + cookingPercent + politicsPercent; + + document.getElementById('user-ai-display').textContent = `${aiPercent}%`; + document.getElementById('user-cooking-display').textContent = `${cookingPercent}%`; + document.getElementById('user-politics-display').textContent = `${politicsPercent}%`; + document.getElementById('user-total').textContent = `${total}%`; + + // Update slider values + userAiSlider.value = aiPercent; + userCookingSlider.value = cookingPercent; + userPoliticsSlider.value = politicsPercent; +} + +/** + * Initialize friend profile selector + */ +function initializeFriendSelector() { + updateFriendDisplay(); + + const friendBtns = document.querySelectorAll('.friend-btn'); + friendBtns.forEach(btn => { + btn.addEventListener('click', () => { + // Update active state + friendBtns.forEach(b => b.classList.remove('active')); + btn.classList.add('active'); + + // Load profile + const profileKey = btn.dataset.profile; + selectedFriend = profileKey; + friendProfile = FRIEND_PROFILES[profileKey]; + updateFriendDisplay(); + }); + }); +} + +/** + * Update friend profile display + */ +function updateFriendDisplay() { + const aiPercent = Math.round(friendProfile.ai * 100); +
const cookingPercent = Math.round(friendProfile.cooking * 100); + const politicsPercent = Math.round(friendProfile.politics * 100); + + document.getElementById('friend-ai-display').textContent = `${aiPercent}%`; + document.getElementById('friend-cooking-display').textContent = `${cookingPercent}%`; + document.getElementById('friend-politics-display').textContent = `${politicsPercent}%`; +} + +/** + * Attach event listeners + */ +function attachEventListeners() { + compareBtn.addEventListener('click', () => { + generateComparison(); + comparisonContainer.style.display = 'block'; + comparisonContainer.scrollIntoView({ behavior: 'smooth', block: 'start' }); + }); + + // View toggle + document.getElementById('view-user').addEventListener('click', () => { + setActiveView('user'); + }); + + document.getElementById('view-friend').addEventListener('click', () => { + setActiveView('friend'); + }); +} + +// Current view state +let currentView = 'user'; +let scoredData = null; + +/** + * Generate and display feed comparison + */ +function generateComparison() { + // Score tweets for each user + const userTweets = scoreTweets(userProfile); + const friendTweets = scoreTweets(friendProfile); + + // Create rank maps + const userRanks = {}; + const friendRanks = {}; + + userTweets.forEach((tweet, index) => { + userRanks[tweet.id] = { + rank: index + 1, + score: tweet.score + }; + }); + + friendTweets.forEach((tweet, index) => { + friendRanks[tweet.id] = { + rank: index + 1, + score: tweet.score + }; + }); + + // Store for view toggling + scoredData = { + userTweets, + friendTweets, + userRanks, + friendRanks + }; + + // Render initial view + renderFeed(); +} + +/** + * Set active view and re-render + */ +function setActiveView(view) { + currentView = view; + + // Update button states + document.getElementById('view-user').classList.toggle('active', view === 'user'); + document.getElementById('view-friend').classList.toggle('active', view === 'friend'); + + // Re-render + 
renderFeed(); +} + +/** + * Score tweets based on profile + * score = base_quality × cluster_interest + */ +function scoreTweets(profile) { + return TWEETS.map(tweet => { + const clusterInterest = profile[tweet.cluster]; + const score = tweet.baseQuality * clusterInterest; + + return { + ...tweet, + score: score + }; + }).sort((a, b) => b.score - a.score); // Sort by score descending +} + +/** + * Render the feed based on current view + */ +function renderFeed() { + if (!scoredData) return; + + const container = document.getElementById('tweet-feed'); + const tweets = currentView === 'user' ? scoredData.userTweets : scoredData.friendTweets; + const { userRanks, friendRanks } = scoredData; + + container.innerHTML = tweets.map(tweet => { + const userRank = userRanks[tweet.id].rank; + const friendRank = friendRanks[tweet.id].rank; + const userScore = userRanks[tweet.id].score; + const friendScore = friendRanks[tweet.id].score; + + const rankDiff = Math.abs(userRank - friendRank); + const clusterInfo = CLUSTER_INFO[tweet.cluster]; + + // Highlight big differences + const isDifferent = rankDiff >= 5; + const diffClass = isDifferent ? 'rank-different' : ''; + + return ` +
+
+
+
+ 👤 You: + #${userRank} +
+
+ 👥 Friend: + #${friendRank} +
+ ${isDifferent ? `Δ${rankDiff}` : ''} +
+ + ${clusterInfo.name} + +
+
+ ${tweet.content} +
+
${tweet.author}
+
+
+ Base Quality: + ${tweet.baseQuality.toFixed(2)} +
+
+ × Your ${clusterInfo.name} Interest (${(userProfile[tweet.cluster] * 100).toFixed(0)}%): + = ${userScore.toFixed(3)} +
+
+ × Friend's ${clusterInfo.name} Interest (${(friendProfile[tweet.cluster] * 100).toFixed(0)}%): + = ${friendScore.toFixed(3)} +
+
+
+ `; + }).join(''); +} diff --git a/docs/js/journey-simulator.js b/docs/js/journey-simulator.js new file mode 100644 index 000000000..e2df38257 --- /dev/null +++ b/docs/js/journey-simulator.js @@ -0,0 +1,360 @@ +/** + * Journey Simulator - Models the gravitational pull effect + * + * Based on Twitter's algorithm code: + * - Multiplicative scoring at candidate generation and ML scoring stages + * - L2 normalization (interests sum to 1.0) + * - Weekly batch updates for InterestedIn + * - FRS (Follow Recommendations) acceleration + * + * Code references: + * - ApproximateCosineSimilarity.scala:94 (multiplicative scoring) + * - InterestedInFromKnownFor.scala:59 (weekly batches) + * - SimClustersEmbedding.scala:59-72 (L2 normalization) + */ + +// DOM elements +const interest1Input = document.getElementById('interest1-name'); +const interest2Input = document.getElementById('interest2-name'); +const splitSlider = document.getElementById('interest-split'); +const splitDisplay = document.getElementById('split-display'); +const engagementSelect = document.getElementById('engagement-level'); +const frsCheckbox = document.getElementById('frs-enabled'); +const simulateBtn = document.getElementById('simulate-btn'); +const resultsContainer = document.getElementById('results-container'); +const projectionSummary = document.getElementById('projection-summary'); +const driftTableBody = document.getElementById('drift-table-body'); +const interest1Header = document.getElementById('interest1-header'); +const interest2Header = document.getElementById('interest2-header'); + +// Chart instance +let driftChart = null; + +// Update split display when slider moves +splitSlider.addEventListener('input', (e) => { + const primary = parseInt(e.target.value); + const secondary = 100 - primary; + splitDisplay.textContent = `${primary}% / ${secondary}%`; +}); + +// Simulate button click handler +simulateBtn.addEventListener('click', runSimulation); + +/** + * Main simulation function + */ +function 
runSimulation() { + // Get inputs + const interest1Name = interest1Input.value.trim() || 'Interest 1'; + const interest2Name = interest2Input.value.trim() || 'Interest 2'; + const initialSplit = parseInt(splitSlider.value) / 100; // e.g., 0.60 + const engagementLevel = engagementSelect.value; // 'low', 'medium', 'high' + const frsEnabled = frsCheckbox.checked; + + // Update table headers + interest1Header.textContent = interest1Name; + interest2Header.textContent = interest2Name; + + // Run simulation + const weeks = 52; // Simulate 1 year + const data = simulateDrift(initialSplit, engagementLevel, frsEnabled, weeks); + + // Display results + displayResults(data, interest1Name, interest2Name); + + // Scroll to results + resultsContainer.scrollIntoView({ behavior: 'smooth', block: 'start' }); +} + +/** + * Simulate the gravitational pull effect over time + * + * @param {number} initialSplit - Initial interest split (0.5 to 0.8) + * @param {string} engagementLevel - 'low', 'medium', or 'high' + * @param {boolean} frsEnabled - Whether FRS is enabled + * @param {number} weeks - Number of weeks to simulate + * @returns {Array} Array of {week, interest1, interest2} objects + */ +function simulateDrift(initialSplit, engagementLevel, frsEnabled, weeks) { + // Engagement level affects drift rate + const driftRates = { + low: 0.008, // Slow drift + medium: 0.015, // Moderate drift (matches observed 60→76 in 24 weeks) + high: 0.025 // Fast drift + }; + + let interest1 = initialSplit; + let interest2 = 1 - initialSplit; + const data = [{ week: 0, interest1, interest2 }]; + + // Base drift rate + let baseDriftRate = driftRates[engagementLevel]; + + // FRS acceleration (adds ~20% to drift rate) + if (frsEnabled) { + baseDriftRate *= 1.2; + } + + for (let week = 1; week <= weeks; week++) { + // Calculate multiplicative advantage + // The stronger interest gets amplified by its existing strength + const advantage = interest1 / interest2; + + // Drift is proportional to the 
imbalance and engagement level + // As the gap widens, drift slows (approaching asymptote) + const imbalance = Math.abs(interest1 - interest2); + const slowdownFactor = 1 - (imbalance * 0.5); // Slow down as approaching extremes + const drift = baseDriftRate * advantage * slowdownFactor; + + // Apply drift with L2 normalization (zero-sum) + interest1 = Math.min(0.95, interest1 + drift); // Cap at 95% + interest2 = 1 - interest1; // L2 normalization + + // Weekly batch update (InterestedIn recalculates weekly) + data.push({ week, interest1, interest2 }); + + // Stop if reached near-total dominance + if (interest1 >= 0.95) break; + } + + return data; +} + +/** + * Display simulation results + */ +function displayResults(data, interest1Name, interest2Name) { + resultsContainer.style.display = 'block'; + + // Summary + const finalWeek = data[data.length - 1]; + const initial = data[0]; + const initialPercent1 = Math.round(initial.interest1 * 100); + const initialPercent2 = Math.round(initial.interest2 * 100); + const finalPercent1 = Math.round(finalWeek.interest1 * 100); + const finalPercent2 = Math.round(finalWeek.interest2 * 100); + const changeMagnitude = finalPercent1 - initialPercent1; + + projectionSummary.innerHTML = ` + You started at ${initialPercent1}% ${interest1Name} / ${initialPercent2}% ${interest2Name}. + After ${finalWeek.week} weeks, your feed will be ${finalPercent1}% ${interest1Name} / ${finalPercent2}% ${interest2Name}. + That's a ${changeMagnitude} percentage point shift toward ${interest1Name}, even though you didn't unfollow anyone. 
+ `; + + // Render chart + renderChart(data, interest1Name, interest2Name); + + // Render table (show key milestones) + renderTable(data, interest1Name, interest2Name); +} + +/** + * Render the drift chart using Chart.js + */ +function renderChart(data, interest1Name, interest2Name) { + const ctx = document.getElementById('drift-chart').getContext('2d'); + + // Destroy existing chart if it exists + if (driftChart) { + driftChart.destroy(); + } + + // Prepare data + const labels = data.map(d => `Week ${d.week}`); + const interest1Data = data.map(d => (d.interest1 * 100).toFixed(1)); + const interest2Data = data.map(d => (d.interest2 * 100).toFixed(1)); + + // Create chart + driftChart = new Chart(ctx, { + type: 'line', + data: { + labels: labels, + datasets: [ + { + label: interest1Name, + data: interest1Data, + borderColor: '#1DA1F2', + backgroundColor: 'rgba(29, 161, 242, 0.1)', + borderWidth: 3, + fill: true, + tension: 0.3 + }, + { + label: interest2Name, + data: interest2Data, + borderColor: '#17bf63', + backgroundColor: 'rgba(23, 191, 99, 0.1)', + borderWidth: 3, + fill: true, + tension: 0.3 + } + ] + }, + options: { + responsive: true, + maintainAspectRatio: true, + interaction: { + mode: 'index', + intersect: false, + }, + plugins: { + title: { + display: true, + text: 'Feed Composition Over Time (Gravitational Pull Effect)', + font: { + size: 16, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a', + padding: 20 + }, + legend: { + display: true, + position: 'top', + labels: { + font: { + size: 14, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a', + padding: 15, + usePointStyle: true, + pointStyle: 'circle' + } + }, + tooltip: { + backgroundColor: 'rgba(0, 0, 0, 0.8)', + titleFont: { + size: 14, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + bodyFont: { + size: 13, + family: '-apple-system, 
BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + padding: 12, + displayColors: true, + callbacks: { + label: function(context) { + return `${context.dataset.label}: ${context.parsed.y}%`; + } + } + } + }, + scales: { + x: { + title: { + display: true, + text: 'Time (Weeks)', + font: { + size: 14, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a' + }, + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b', + maxRotation: 0, + autoSkip: true, + autoSkipPadding: 20 + }, + grid: { + color: 'rgba(0, 0, 0, 0.05)' + } + }, + y: { + title: { + display: true, + text: 'Feed Composition (%)', + font: { + size: 14, + weight: 'bold', + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#1a1a1a' + }, + min: 0, + max: 100, + ticks: { + font: { + size: 12, + family: '-apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif' + }, + color: '#6b6b6b', + callback: function(value) { + return value + '%'; + } + }, + grid: { + color: 'rgba(0, 0, 0, 0.1)' + } + } + } + } + }); +} + +/** + * Render the milestone table + */ +function renderTable(data, interest1Name, interest2Name) { + // Show milestones: week 0, 4, 8, 12, 16, 20, 24, and final + const milestones = [0, 4, 8, 12, 16, 20, 24]; + const finalWeek = data[data.length - 1].week; + if (finalWeek > 24 && !milestones.includes(finalWeek)) { + milestones.push(finalWeek); + } + + driftTableBody.innerHTML = milestones + .filter(week => week <= finalWeek) + .map(week => { + const point = data[week]; + const percent1 = Math.round(point.interest1 * 100); + const percent2 = Math.round(point.interest2 * 100); + const explanation = getWeekExplanation(week, percent1, interest1Name); + + return ` + + Week ${week} + ${percent1}% + ${percent2}% + ${explanation} + + `; + }) + .join(''); +} + +/** + * Get explanation for what's happening at each 
milestone + */ +function getWeekExplanation(week, percent1, interest1Name) { + if (week === 0) { + return 'Your initial state based on who you followed.'; + } else if (week <= 4) { + return `Subtle drift begins. ${interest1Name} content scores slightly higher in the algorithm, so you see more of it.`; + } else if (week <= 12) { + return `Engagement reinforcement. You're engaging more with ${interest1Name} because you're seeing more of it. This increases your cluster score.`; + } else if (week <= 20) { + return `FRS acceleration (if enabled). X recommends ${interest1Name} accounts. Following them accelerates drift.`; + } else if (week <= 24) { + return `Approaching equilibrium. Drift slows as you near the algorithm's "natural" balance for your engagement pattern.`; + } else if (percent1 >= 85) { + return `Deep in the gravity well. Breaking out now requires deliberate counter-engagement for 30+ days.`; + } else { + return `Continued drift toward monoculture. Your secondary interest is becoming barely visible.`; + } +} + +// Initialize with default values on page load +window.addEventListener('DOMContentLoaded', () => { + // Set initial split display + const initialSplit = parseInt(splitSlider.value); + splitDisplay.textContent = `${initialSplit}% / ${100 - initialSplit}%`; +}); diff --git a/docs/js/phoenix-simulator.js b/docs/js/phoenix-simulator.js new file mode 100644 index 000000000..35b9de085 --- /dev/null +++ b/docs/js/phoenix-simulator.js @@ -0,0 +1,328 @@ +// Phoenix Behavioral Sequence Simulator + +document.addEventListener('DOMContentLoaded', function() { + +// ============================================================================ +// State Management +// ============================================================================ + +let actionSequence = []; +const MAX_SEQUENCE_LENGTH = 8; + +// Action metadata +const ACTION_INFO = { + 'LIKE_tech': { emoji: '❤️', label: 'Like Tech', category: 'tech', engagement: 'high', color: '#1DA1F2' }, + 
'CLICK_tech': { emoji: '👁️', label: 'Click Tech', category: 'tech', engagement: 'medium', color: '#1DA1F2' }, + 'REPLY_tech': { emoji: '💬', label: 'Reply Tech', category: 'tech', engagement: 'very high', color: '#1DA1F2' }, + 'LIKE_sports': { emoji: '❤️', label: 'Like Sports', category: 'sports', engagement: 'high', color: '#17bf63' }, + 'CLICK_sports': { emoji: '👁️', label: 'Click Sports', category: 'sports', engagement: 'medium', color: '#17bf63' }, + 'REPLY_sports': { emoji: '💬', label: 'Reply Sports', category: 'sports', engagement: 'very high', color: '#17bf63' }, + 'SCROLL_neutral': { emoji: '📜', label: 'Scroll Past', category: 'neutral', engagement: 'none', color: '#8899AA' } +}; + +// ============================================================================ +// Event Handlers +// ============================================================================ + +// Add action to sequence +document.querySelectorAll('.action-btn').forEach(button => { + button.addEventListener('click', () => { + const action = button.dataset.action; + const category = button.dataset.category; + const actionKey = `${action}_${category}`; + + if (actionSequence.length >= MAX_SEQUENCE_LENGTH) { + // Remove oldest action (shift left) + actionSequence.shift(); + } + + actionSequence.push(actionKey); + updateSequenceDisplay(); + analyzeBehavior(); + }); +}); + +// Clear sequence +document.getElementById('clear-sequence-btn').addEventListener('click', () => { + actionSequence = []; + updateSequenceDisplay(); + document.getElementById('predictions-container').style.display = 'none'; +}); + +// ============================================================================ +// Display Functions +// ============================================================================ + +function updateSequenceDisplay() { + const container = document.getElementById('action-sequence'); + + if (actionSequence.length === 0) { + container.innerHTML = 'No actions yet. 
Click buttons above to build your sequence.'; + return; + } + + container.innerHTML = ''; + + actionSequence.forEach((actionKey, index) => { + const info = ACTION_INFO[actionKey]; + const badge = document.createElement('div'); + badge.style.cssText = ` + padding: 0.5rem 1rem; + background-color: ${info.color}22; + border: 2px solid ${info.color}; + border-radius: 6px; + font-weight: 600; + font-size: 0.95rem; + display: inline-flex; + align-items: center; + gap: 0.5rem; + `; + badge.innerHTML = `${info.emoji} ${info.label}`; + container.appendChild(badge); + }); +} + +// ============================================================================ +// Behavioral Analysis +// ============================================================================ + +function analyzeBehavior() { + if (actionSequence.length === 0) { + document.getElementById('predictions-container').style.display = 'none'; + return; + } + + // Analyze sequence + const analysis = analyzeSequencePattern(actionSequence); + + // Show predictions + document.getElementById('predictions-container').style.display = 'block'; + + // Display behavioral state + displayBehavioralState(analysis); + + // Display predictions + displayPredictions(analysis); + + // Display interpretation + displayInterpretation(analysis); +} + +function analyzeSequencePattern(sequence) { + // Count by category + const categoryCounts = { tech: 0, sports: 0, neutral: 0 }; + const actionCounts = { LIKE: 0, CLICK: 0, REPLY: 0, SCROLL: 0 }; + const engagementLevels = { 'very high': 0, high: 0, medium: 0, none: 0 }; + + sequence.forEach(actionKey => { + const info = ACTION_INFO[actionKey]; + const [action, category] = actionKey.split('_'); + + categoryCounts[category]++; + actionCounts[action]++; + engagementLevels[info.engagement]++; + }); + + // Calculate dominant category + const totalActions = sequence.length; + const techPercent = (categoryCounts.tech / totalActions) * 100; + const sportsPercent = (categoryCounts.sports / 
totalActions) * 100; + const neutralPercent = (categoryCounts.neutral / totalActions) * 100; + + // Determine behavioral state + let behavioralState = ''; + let stateColor = ''; + + if (neutralPercent >= 60) { + behavioralState = 'Passive Browsing Mode'; + stateColor = '#8899AA'; + } else if (actionCounts.REPLY >= 2 || engagementLevels['very high'] >= 2) { + behavioralState = 'High Engagement Streak'; + stateColor = '#ff6b6b'; + } else if (techPercent >= 75 || sportsPercent >= 75) { + const dominant = techPercent > sportsPercent ? 'Tech' : 'Sports'; + behavioralState = `Deep Dive: ${dominant} Content`; + stateColor = techPercent > sportsPercent ? '#1DA1F2' : '#17bf63'; + } else if (techPercent >= 50 && sportsPercent === 0) { + behavioralState = 'Focused Exploration: Tech'; + stateColor = '#1DA1F2'; + } else if (sportsPercent >= 50 && techPercent === 0) { + behavioralState = 'Focused Exploration: Sports'; + stateColor = '#17bf63'; + } else { + behavioralState = 'Context Switching: Mixed Interests'; + stateColor = '#f5a623'; + } + + // Calculate predictions based on behavioral pattern + const predictions = calculatePredictions({ + techPercent, + sportsPercent, + neutralPercent, + engagementLevels, + actionCounts, + sequence + }); + + return { + behavioralState, + stateColor, + categoryCounts, + techPercent, + sportsPercent, + neutralPercent, + predictions, + engagementLevels, + actionCounts + }; +} + +function calculatePredictions(analysis) { + const { techPercent, sportsPercent, neutralPercent, engagementLevels, sequence } = analysis; + + // Base probabilities + let techEngagement = Math.max(5, techPercent); + let sportsEngagement = Math.max(5, sportsPercent); + + // Boost based on recent momentum (last 3 actions) + const recentActions = sequence.slice(-3); + const recentTech = recentActions.filter(a => a.includes('tech')).length; + const recentSports = recentActions.filter(a => a.includes('sports')).length; + + techEngagement += recentTech * 15; +
sportsEngagement += recentSports * 15; + + // Boost based on engagement intensity + if (engagementLevels['very high'] >= 2) { + // High engagement mode - boost everything + techEngagement *= 1.3; + sportsEngagement *= 1.3; + } + + // Penalty for passive browsing + if (neutralPercent >= 60) { + techEngagement *= 0.3; + sportsEngagement *= 0.3; + } + + // Normalize to 100% + const total = techEngagement + sportsEngagement; + const techProb = (techEngagement / total) * 100; + const sportsProb = (sportsEngagement / total) * 100; + + return { + tech: Math.round(techProb), + sports: Math.round(sportsProb) + }; +} + +// ============================================================================ +// Display Predictions +// ============================================================================ + +function displayBehavioralState(analysis) { + const container = document.getElementById('behavioral-state'); + container.style.borderLeftColor = analysis.stateColor; + container.innerHTML = ` +
+
+ ${analysis.behavioralState} +
+

+ Phoenix detected this pattern from your last ${actionSequence.length} actions +

+ `; +} + +function displayPredictions(analysis) { + const container = document.getElementById('prediction-bars'); + + const predictions = [ + { label: 'Next Tech Tweet', prob: analysis.predictions.tech, color: '#1DA1F2' }, + { label: 'Next Sports Tweet', prob: analysis.predictions.sports, color: '#17bf63' } + ]; + + container.innerHTML = ''; + + predictions.forEach(pred => { + const barContainer = document.createElement('div'); + barContainer.style.marginBottom = '1.5rem'; + + barContainer.innerHTML = ` +
+ ${pred.label} + ${pred.prob}% +
+
+
+ ${pred.prob > 15 ? `${pred.prob}%` : ''} +
+
+ `; + + container.appendChild(barContainer); + }); +} + +function displayInterpretation(analysis) { + const container = document.getElementById('prediction-interpretation'); + + let interpretation = ''; + const dominant = analysis.predictions.tech > analysis.predictions.sports ? 'Tech' : 'Sports'; + const dominantProb = Math.max(analysis.predictions.tech, analysis.predictions.sports); + const dominantColor = analysis.predictions.tech > analysis.predictions.sports ? '#1DA1F2' : '#17bf63'; + + if (analysis.neutralPercent >= 60) { + interpretation = ` +

Passive Browsing Mode Detected

+

Your sequence shows mostly scrolling with minimal engagement. Phoenix predicts:

+ +

In Navi (old system): You'd get your standard 50/50 mix regardless of browsing mode.
+ In Phoenix: Algorithm recognizes passive mode and adjusts accordingly.

+ `; + } else if (analysis.engagementLevels['very high'] >= 2) { + interpretation = ` +

High Engagement Streak Detected!

+

You've been actively engaging (replies, likes) with ${dominant.toLowerCase()} content. Phoenix predicts:

+ +

In Navi: Static prediction based on lifetime averages.
+ In Phoenix: Real-time adaptation to your engagement momentum.

+ `; + } else if (dominantProb >= 70) { + interpretation = ` +

Focused Interest Detected: ${dominant}

+

Your recent sequence shows clear focus on ${dominant.toLowerCase()} content (${dominantProb}% probability). Phoenix predicts:

+ +

Key difference: Phoenix sees you're interested in ${dominant.toLowerCase()} right now, not based on what you liked last month.

+ `; + } else { + interpretation = ` +

Context Switching Detected

+

Your sequence shows mixed interests (Tech: ${analysis.predictions.tech}%, Sports: ${analysis.predictions.sports}%). Phoenix predicts:

+ +

Phoenix advantage: Can detect when you switch contexts mid-session and adapt instantly.

+ `; + } + + container.innerHTML = interpretation; +} + +// End DOMContentLoaded +}); diff --git a/docs/js/pipeline-explorer.js b/docs/js/pipeline-explorer.js new file mode 100644 index 000000000..11d124267 --- /dev/null +++ b/docs/js/pipeline-explorer.js @@ -0,0 +1,636 @@ +/** + * The Full Pipeline Explorer + * + * Shows complete tweet journey through all 5 algorithmic stages: + * 1. Candidate Generation + * 2. Feature Hydration + * 3. Heavy Ranker (ML Scoring) + * 4. Filters & Penalties + * 5. Mixing & Serving + * + * Based on Twitter's complete recommendation pipeline. + */ + +// Engagement weights from HomeGlobalParams.scala:786-1028 +const WEIGHTS = { + 'Reply with Author Engagement': 75.0, + 'Reply': 13.5, + 'Good Profile Click': 12.0, + 'Video Playback 50%': 8.0, + 'Retweet': 1.0, + 'Favorite': 0.5, + 'Negative Feedback': -74.0, + 'Report': -369.0 +}; + +// Tweet scenarios +const SCENARIOS = { + 'viral-thread': { + name: '🔥 Viral Educational Thread', + network: 'in-network', + cluster: 'primary', // User's main interest + clusterScore: 0.85, + authorPosition: 1, // First tweet from this author + probabilities: { + 'Reply with Author Engagement': 0.03, + 'Reply': 0.08, + 'Good Profile Click': 0.28, + 'Video Playback 50%': 0.0, + 'Retweet': 0.03, + 'Favorite': 0.25, + 'Negative Feedback': 0.01, + 'Report': 0.001 + } + }, + 'out-of-network': { + name: '🌐 Out-of-Network Quality', + network: 'out-of-network', + cluster: 'secondary', // User's minority interest + clusterScore: 0.35, + authorPosition: 1, + probabilities: { + 'Reply with Author Engagement': 0.02, + 'Reply': 0.05, + 'Good Profile Click': 0.22, + 'Video Playback 50%': 0.0, + 'Retweet': 0.02, + 'Favorite': 0.18, + 'Negative Feedback': 0.015, + 'Report': 0.002 + } + }, + 'controversial': { + name: '⚡ Controversial Take', + network: 'in-network', + cluster: 'primary', + clusterScore: 0.85, + authorPosition: 1, + probabilities: { + 'Reply with Author Engagement': 0.01, + 'Reply': 0.12, + 'Good Profile 
Click': 0.15, + 'Video Playback 50%': 0.0, + 'Retweet': 0.015, + 'Favorite': 0.08, + 'Negative Feedback': 0.04, + 'Report': 0.005 + } + }, + 'repeat-author': { + name: '📝 3rd Tweet from Same Author', + network: 'in-network', + cluster: 'primary', + clusterScore: 0.85, + authorPosition: 3, // Third tweet from this author + probabilities: { + 'Reply with Author Engagement': 0.02, + 'Reply': 0.06, + 'Good Profile Click': 0.25, + 'Video Playback 50%': 0.0, + 'Retweet': 0.025, + 'Favorite': 0.20, + 'Negative Feedback': 0.01, + 'Report': 0.001 + } + } +}; + +// State +let currentScenario = null; +let currentStage = 0; +let scoreHistory = []; + +// DOM elements +const pipelineContainer = document.getElementById('pipeline-container'); +const scenarioNameDisplay = document.getElementById('scenario-name'); +const stageDetailContainer = document.getElementById('stage-detail-container'); +const scoreBreakdown = document.getElementById('score-breakdown'); +const prevStageBtn = document.getElementById('prev-stage-btn'); +const nextStageBtn = document.getElementById('next-stage-btn'); +const finalSummary = document.getElementById('final-summary'); + +// Initialize +window.addEventListener('DOMContentLoaded', () => { + attachScenarioListeners(); +}); + +/** + * Attach scenario card listeners + */ +function attachScenarioListeners() { + const scenarioCards = document.querySelectorAll('.pipeline-scenario-card'); + scenarioCards.forEach(card => { + card.addEventListener('click', () => { + const scenarioKey = card.dataset.scenario; + startPipeline(scenarioKey); + }); + }); +} + +/** + * Start pipeline with selected scenario + */ +function startPipeline(scenarioKey) { + currentScenario = SCENARIOS[scenarioKey]; + currentStage = 1; + scoreHistory = []; + + // Show container + pipelineContainer.style.display = 'block'; + pipelineContainer.scrollIntoView({ behavior: 'smooth', block: 'start' }); + + // Update display + scenarioNameDisplay.textContent = currentScenario.name; + + // Render 
first stage + renderStage(); + updateNavigation(); + updateFunnelHighlight(); +} + +/** + * Navigate to previous stage + */ +function previousStage() { + if (currentStage > 1) { + currentStage--; + renderStage(); + updateNavigation(); + updateFunnelHighlight(); + finalSummary.style.display = 'none'; + } +} + +/** + * Navigate to next stage + */ +function nextStage() { + if (currentStage < 5) { + currentStage++; + renderStage(); + updateNavigation(); + updateFunnelHighlight(); + } else { + // Show final summary + showFinalSummary(); + } +} + +/** + * Update navigation buttons + */ +function updateNavigation() { + prevStageBtn.disabled = currentStage === 1; + nextStageBtn.textContent = currentStage === 5 ? 'Show Final Summary →' : 'Next Stage →'; + + prevStageBtn.onclick = previousStage; + nextStageBtn.onclick = nextStage; +} + +/** + * Update funnel stage highlight + */ +function updateFunnelHighlight() { + const funnelStages = document.querySelectorAll('.funnel-stage[data-stage]'); + funnelStages.forEach(stage => { + const stageNum = parseInt(stage.dataset.stage); + if (stageNum === currentStage) { + stage.classList.add('active'); + } else if (stageNum < currentStage) { + stage.classList.add('completed'); + stage.classList.remove('active'); + } else { + stage.classList.remove('active', 'completed'); + } + }); +} + +/** + * Calculate base score from Heavy Ranker + */ +function calculateBaseScore() { + let score = 0; + const breakdown = []; + + for (const [engType, weight] of Object.entries(WEIGHTS)) { + const prob = currentScenario.probabilities[engType] || 0; + const contribution = prob * weight; + score += contribution; + + if (prob > 0) { + breakdown.push({ + type: engType, + probability: prob, + weight: weight, + contribution: contribution + }); + } + } + + return { score, breakdown }; +} + +/** + * Render current stage + */ +function renderStage() { + let html = ''; + + switch (currentStage) { + case 1: + html = renderCandidateGeneration(); + break; + case 2: + 
html = renderFeatureHydration(); + break; + case 3: + html = renderMLScoring(); + break; + case 4: + html = renderFiltersAndPenalties(); + break; + case 5: + html = renderMixingAndServing(); + break; + } + + stageDetailContainer.innerHTML = html; + updateScoreTracker(); +} + +/** + * Stage 1: Candidate Generation + */ +function renderCandidateGeneration() { + const source = currentScenario.network === 'in-network' ? 'Earlybird (In-Network)' : 'SimClusters ANN'; + + return ` +
+

Stage 1: Candidate Generation

+

Your tweet enters the pipeline as one of ~1,400 candidates selected based on your profile.

+ +
+
+

Selection Source

+
${source}
+

+ ${currentScenario.network === 'in-network' + ? 'Retrieved from Earlybird search index because you follow this author. ~50% of candidates come from in-network.' + : 'Retrieved via SimClusters ANN based on your interest clusters. ~20% of candidates come from similar content clusters.'} +

+
+ +
+

Initial Pool

+
~1,400 candidates
+

+ From ~1 billion tweets posted, only ~1,400 make it to your candidate pool. That's a rejection rate of roughly 99.9999% before any scoring! +

+
+ +
+

Network Status

+
${currentScenario.network === 'in-network' ? 'In-Network ✓' : 'Out-of-Network'}
+

+ ${currentScenario.network === 'in-network' + ? 'You follow this author, so this tweet has in-network status and will avoid the 25% out-of-network penalty.' + : 'You don\'t follow this author. Will receive a 0.75x multiplier (25% penalty) later in the pipeline.'} +

+
+
+ +
+ Key Insight: Most tweets never even enter your candidate pool. The algorithm pre-filters based on your follow graph and interest clusters. +
+
+ `; +} + +/** + * Stage 2: Feature Hydration + */ +function renderFeatureHydration() { + return ` +
+

Stage 2: Feature Hydration

+

The algorithm attaches ~6,000 features to this tweet for the ML model to evaluate.

+ +
+
+

Author Features

+
    +
  • Follower count & verified status
  • +
  • Reputation score (TweetCred)
  • +
  • Historical engagement rates
  • +
  • Account age & activity level
  • +
+
+ +
+

Tweet Features

+
    +
  • Media type (text, image, video, link)
  • +
  • Length & linguistic features
  • +
  • Recency (time since posting)
  • +
  • Topic & entity recognition
  • +
+
+ +
+

User-Tweet Features

+
    +
  • Cluster similarity: ${(currentScenario.clusterScore * 100).toFixed(0)}%
  • +
  • Real graph connection strength
  • +
  • Past engagement with author
  • +
  • Similar tweets you engaged with
  • +
+
+ +
+

Engagement Predictions

+
15 probability predictions
+

+ The Heavy Ranker will predict probabilities for 15 different engagement types. These feed into the weighted scoring formula. +

+
+
+ +
+ Key Insight: The cluster similarity (${(currentScenario.clusterScore * 100).toFixed(0)}%) will multiply the final score, creating personalization and filter bubbles. +
+
+ `; +} + +/** + * Stage 3: ML Scoring + */ +function renderMLScoring() { + const { score, breakdown } = calculateBaseScore(); + + // Store base score + if (scoreHistory.length === 0) { + scoreHistory.push({ stage: 'Base Score (Heavy Ranker)', score }); + } + + // Sort breakdown by contribution (absolute value) + breakdown.sort((a, b) => Math.abs(b.contribution) - Math.abs(a.contribution)); + + return ` +
+

Stage 3: Heavy Ranker (ML Scoring)

+

MaskNet model predicts engagement probabilities and calculates weighted score:

+ +
+ score = Σ (probability_i × weight_i) +
+ +
+
+
+ Engagement Type + Probability + Weight + Contribution +
+ ${breakdown.map(item => ` +
+ ${item.type} + ${(item.probability * 100).toFixed(1)}% + ${item.weight.toFixed(1)} + + ${item.contribution >= 0 ? '+' : ''}${item.contribution.toFixed(2)} + +
+ `).join('')} +
+ Total Base Score + + + ${score.toFixed(2)} +
+
+
+ +
+ Key Insight: Notice how favorites (0.5 weight) contribute far less than replies (13.5 weight). The algorithm optimizes for engagement depth, not breadth. +
+
+ `; +} + +/** + * Stage 4: Filters & Penalties + */ +function renderFiltersAndPenalties() { + const { score: baseScore } = calculateBaseScore(); + let currentScore = baseScore; + const modifications = []; + + // Out-of-network penalty + if (currentScenario.network === 'out-of-network') { + const penalty = 0.75; + const newScore = currentScore * penalty; + modifications.push({ + name: 'Out-of-Network Penalty', + multiplier: `×${penalty}`, + before: currentScore, + after: newScore, + description: 'You don\'t follow this author, so score reduced by 25%' + }); + currentScore = newScore; + } + + // Cluster scoring (personalization) + const clusterMultiplier = currentScenario.clusterScore; + const afterCluster = currentScore * clusterMultiplier; + modifications.push({ + name: 'Cluster Scoring', + multiplier: `×${clusterMultiplier.toFixed(2)}`, + before: currentScore, + after: afterCluster, + description: `Multiplied by your ${currentScenario.cluster === 'primary' ? 'primary' : 'secondary'} interest cluster score. This creates filter bubbles!` + }); + currentScore = afterCluster; + + // Author diversity penalty + if (currentScenario.authorPosition > 1) { + const position = currentScenario.authorPosition - 1; // 0-indexed + const floor = 0.25; + const decayFactor = 0.5; + const multiplier = (1 - floor) * Math.pow(decayFactor, position) + floor; + const afterDiversity = currentScore * multiplier; + + modifications.push({ + name: 'Author Diversity Penalty', + multiplier: `×${multiplier.toFixed(3)}`, + before: currentScore, + after: afterDiversity, + description: `This is the ${currentScenario.authorPosition}${currentScenario.authorPosition === 3 ? 'rd' : 'th'} tweet from this author in your feed. Exponential penalty applied.` + }); + currentScore = afterDiversity; + } + + // Store final score + scoreHistory.push({ stage: 'After Filters & Penalties', score: currentScore }); + + return ` +
+

Stage 4: Filters & Penalties

+

Multiple filters reshape the ranking by applying multipliers and penalties:

+ +
+
+ ${modifications.map((mod, index) => ` +
+
+

${mod.name}

+ ${mod.multiplier} +
+
+ ${mod.before.toFixed(3)} + + ${mod.after.toFixed(3)} +
+

${mod.description}

+
+ `).join('')} +
+ +
+

Final Score After Filters

+
${currentScore.toFixed(3)}
+

+ ${baseScore > currentScore + ? `Score reduced by ${(((baseScore - currentScore) / baseScore) * 100).toFixed(1)}% through filters` + : 'Score maintained through filters'} +

+
+
+ +
+ Key Insight: Filters can dramatically change rankings. ${currentScenario.network === 'out-of-network' ? 'Out-of-network tweets need 33% higher base scores to compete with in-network content.' : 'In-network tweets avoid the 25% out-of-network penalty.'} +
+
+ `; +} + +/** + * Stage 5: Mixing & Serving + */ +function renderMixingAndServing() { + const finalScore = scoreHistory[scoreHistory.length - 1].score; + const baseScore = scoreHistory[0].score; + const totalChange = ((finalScore - baseScore) / baseScore) * 100; + + return ` +
+

Stage 5: Mixing & Serving

+

The final stage inserts ads, promoted tweets, and modules before serving your timeline.

+ +
+
+

Final Ranking

+
Rank #${estimateRank(finalScore)}
+

+ Based on the final score of ${finalScore.toFixed(3)}, this tweet would rank approximately #${estimateRank(finalScore)} in your timeline of ~100-200 tweets. +

+
+ +
+

Survival Rate

+
${estimateRank(finalScore) <= 100 ? '✓ Survived' : '✗ Filtered Out'}
+

+ Only ~50-100 tweets make it to your final timeline. ${estimateRank(finalScore) <= 100 ? 'This tweet made it!' : 'This tweet was filtered out in the final ranking.'} +

+
+ +
+

Score Evolution

+
+
+ Base Score: + ${baseScore.toFixed(3)} +
+
+
+ Final Score: + + ${finalScore.toFixed(3)} + (${totalChange >= 0 ? '+' : ''}${totalChange.toFixed(1)}%) + +
+
+
+ +
+

Mixing & Ads

+

+ Twitter inserts ads (~10% of timeline), promoted tweets, "Who to Follow" modules, and topic suggestions. Your organic timeline is interspersed with monetization elements. +

+
+
+ +
+ Key Insight: ${totalChange < -20 ? 'This tweet lost significant score through filters.' : totalChange > 0 ? 'This tweet maintained its strong score.' : 'This tweet survived with moderate scoring.'} +
+
+ `; +} + +/** + * Estimate rank based on score (rough heuristic) + */ +function estimateRank(score) { + if (score > 5) return Math.floor(Math.random() * 10) + 1; + if (score > 3) return Math.floor(Math.random() * 30) + 10; + if (score > 2) return Math.floor(Math.random() * 50) + 30; + if (score > 1) return Math.floor(Math.random() * 70) + 80; + return Math.floor(Math.random() * 100) + 150; +} + +/** + * Update score tracker + */ +function updateScoreTracker() { + if (scoreHistory.length === 0) return; + + scoreBreakdown.innerHTML = scoreHistory.map((entry, index) => ` +
+ ${entry.stage} + ${entry.score.toFixed(3)} +
+ `).join('
'); +} + +/** + * Show final summary + */ +function showFinalSummary() { + const finalScore = scoreHistory[scoreHistory.length - 1].score; + const rank = estimateRank(finalScore); + const survived = rank <= 100; + + const summaryText = ` + ${currentScenario.name} completed the pipeline with a final score of ${finalScore.toFixed(3)}, + ranking approximately #${rank} in your timeline. +

+ ${survived + ? '✓ This tweet survived the 96% rejection rate and would appear in your timeline.' + : '✗ This tweet was filtered out in the final ranking and would not appear in your timeline.'} +

+ ${currentScenario.network === 'out-of-network' && !survived + ? 'The 25% out-of-network penalty significantly reduced its competitiveness.' + : currentScenario.authorPosition > 1 && !survived + ? 'The author diversity penalty reduced its ranking below the visibility threshold.' + : survived && currentScenario.network === 'in-network' + ? 'In-network status and strong engagement predictions helped it survive.' + : 'Try exploring different scenarios to see how network status and engagement affect outcomes.'} + `; + + document.getElementById('summary-text').innerHTML = summaryText; + finalSummary.style.display = 'block'; + finalSummary.scrollIntoView({ behavior: 'smooth', block: 'start' }); + + nextStageBtn.style.display = 'none'; +} diff --git a/docs/js/reinforcement-loop.js b/docs/js/reinforcement-loop.js new file mode 100644 index 000000000..208d91a09 --- /dev/null +++ b/docs/js/reinforcement-loop.js @@ -0,0 +1,596 @@ +/** + * The Reinforcement Loop Machine + * + * Shows step-by-step how the feedback loop creates drift: + * Profile → Candidates → Scoring → Feed → Engagement → Profile Update → repeat + * + * Based on: + * - Multiplicative scoring (ApproximateCosineSimilarity.scala:84-94) + * - Weekly InterestedIn updates (InterestedInFromKnownFor.scala:59) + * - L2 normalization (SimClustersEmbedding.scala:59-72) + */ + +// State +let currentWeek = 0; +let currentStage = 0; +let profile = { ai: 0.60, cooking: 0.40 }; +let history = []; + +// Configuration +const DRIFT_RATE = 0.015; // Medium engagement +const STAGES = ['profile', 'candidates', 'scoring', 'feed', 'engagement', 'update']; + +// DOM elements +const loopAiSlider = document.getElementById('loop-ai'); +const loopCookingSlider = document.getElementById('loop-cooking'); +const startLoopBtn = document.getElementById('start-loop-btn'); +const loopContainer = document.getElementById('loop-container'); +const stageContainer = document.getElementById('stage-container'); +const nextStageBtn = 
document.getElementById('next-stage-btn'); +const restartBtn = document.getElementById('restart-btn'); +const currentWeekDisplay = document.getElementById('current-week'); +const historyContainer = document.getElementById('history-container'); + +// Initialize +window.addEventListener('DOMContentLoaded', () => { + initializeSliders(); + attachEventListeners(); +}); + +/** + * Initialize sliders with normalization + */ +function initializeSliders() { + updateLoopDisplays(); + + [loopAiSlider, loopCookingSlider].forEach(slider => { + slider.addEventListener('input', () => { + normalizeLoopProfile(); + updateLoopDisplays(); + }); + }); +} + +/** + * Normalize loop profile to 100% + */ +function normalizeLoopProfile() { + const ai = parseInt(loopAiSlider.value); + const cooking = parseInt(loopCookingSlider.value); + const total = ai + cooking; + + if (total !== 100) { + const normAi = Math.round((ai / total) * 100); + const normCooking = 100 - normAi; + + loopAiSlider.value = normAi; + loopCookingSlider.value = normCooking; + + profile = { ai: normAi / 100, cooking: normCooking / 100 }; + } else { + profile = { ai: ai / 100, cooking: cooking / 100 }; + } +} + +/** + * Update loop displays + */ +function updateLoopDisplays() { + const aiPercent = Math.round(profile.ai * 100); + const cookingPercent = Math.round(profile.cooking * 100); + + document.getElementById('loop-ai-display').textContent = `${aiPercent}%`; + document.getElementById('loop-cooking-display').textContent = `${cookingPercent}%`; + document.getElementById('loop-total').textContent = `${aiPercent + cookingPercent}%`; +} + +/** + * Attach event listeners + */ +function attachEventListeners() { + startLoopBtn.addEventListener('click', startLoop); + nextStageBtn.addEventListener('click', nextStage); + restartBtn.addEventListener('click', restart); +} + +/** + * Start the loop + */ +function startLoop() { + // Reset state + currentWeek = 0; + currentStage = 0; + history = [{ week: 0, ai: profile.ai, cooking: 
profile.cooking }]; + + // Show loop container + loopContainer.style.display = 'block'; + loopContainer.scrollIntoView({ behavior: 'smooth', block: 'start' }); + + // Render first stage + renderStage(); + updateProgress(); +} + +/** + * Advance to next stage + */ +function nextStage() { + currentStage++; + + if (currentStage >= STAGES.length) { + // Completed first loop, show 4-week projection + showProjection(); + } else { + renderStage(); + updateProgress(); + } +} + +/** + * Calculate drift for one week + */ +function calculateWeeklyDrift(currentProfile) { + const advantage = currentProfile.ai / currentProfile.cooking; + const imbalance = Math.abs(currentProfile.ai - currentProfile.cooking); + const slowdown = 1 - (imbalance * 0.5); + const drift = DRIFT_RATE * advantage * slowdown; + + const newAi = Math.min(0.95, currentProfile.ai + drift); + const newCooking = 1 - newAi; // L2 normalization + + return { ai: newAi, cooking: newCooking }; +} + +/** + * Show 4-week projection after completing one loop + */ +function showProjection() { + nextStageBtn.textContent = 'Show 6-Month Projection →'; + nextStageBtn.onclick = extendToSixMonths; + + // Calculate 4 weeks of drift + const projection = [history[0]]; // Week 0 + let currentProfile = { ai: profile.ai, cooking: profile.cooking }; + + for (let week = 1; week <= 4; week++) { + currentProfile = calculateWeeklyDrift(currentProfile); + projection.push({ week, ai: currentProfile.ai, cooking: currentProfile.cooking }); + } + + const initialAi = Math.round(projection[0].ai * 100); + const week4Ai = Math.round(projection[4].ai * 100); + const totalDrift = week4Ai - initialAi; + + stageContainer.innerHTML = ` +
+

Loop Complete - 4-Week Projection

+

You've experienced one complete loop. Now let's see how this compounds over 4 weeks:

+ +
+
+ Week 0 (Starting): + + ${initialAi}% AI / + ${100 - initialAi}% Cooking + +
+
+
+ Week 4 (After 4 loops): + + ${week4Ai}% AI / + ${100 - week4Ai}% Cooking + +
+
+ Total drift: ${totalDrift > 0 ? '+' : ''}${totalDrift} percentage points in just 4 weeks +
+
+ +

Week-by-Week Breakdown

+
+ ${projection.map(entry => { + const aiPercent = Math.round(entry.ai * 100); + const cookingPercent = Math.round(entry.cooking * 100); + return ` +
+
Week ${entry.week}
+
+
+ ${aiPercent}% +
+
+ ${cookingPercent}% +
+
+
+ `; + }).join('')} +
+ +
+

The compounding effect: Each week, the loop repeats. Your profile updates based on your engagement, which was determined by your feed, which was determined by your profile. The imbalance grows automatically.

+

Key insight: You didn't change your behavior at all! You consistently engaged with what you saw. The algorithm's multiplicative scoring and L2 normalization created this drift.

+
+
+ `; + + // Store projection for potential 6-month view + window.fullProjection = projection; + + // Update history display + history = projection; + updateHistory(); +} + +/** + * Extend projection to 6 months (24 weeks) + */ +function extendToSixMonths() { + const projection = [...window.fullProjection]; + let currentProfile = { + ai: projection[projection.length - 1].ai, + cooking: projection[projection.length - 1].cooking + }; + + // Calculate weeks 5-24 + for (let week = 5; week <= 24; week++) { + currentProfile = calculateWeeklyDrift(currentProfile); + projection.push({ week, ai: currentProfile.ai, cooking: currentProfile.cooking }); + } + + const initialAi = Math.round(projection[0].ai * 100); + const week24Ai = Math.round(projection[24].ai * 100); + const totalDrift = week24Ai - initialAi; + + stageContainer.innerHTML = ` +
+

6-Month Projection (24 Weeks)

+

Here's the long-term effect of the reinforcement loop:

+ +
+
+ Week 0 (Starting): + + ${initialAi}% AI / + ${100 - initialAi}% Cooking + +
+
+
+ Week 24 (6 months later): + + ${week24Ai}% AI / + ${100 - week24Ai}% Cooking + +
+
+ Total drift: ${totalDrift > 0 ? '+' : ''}${totalDrift} percentage points over 6 months +
+
+ +

Complete Timeline

+
+ ${projection.filter(p => p.week % 4 === 0 || p.week === 1).map(entry => { + const aiPercent = Math.round(entry.ai * 100); + const cookingPercent = Math.round(entry.cooking * 100); + return ` +
+
Week ${entry.week}
+
+
+ ${aiPercent}% +
+
+ ${cookingPercent}% +
+
+
+ `; + }).join('')} +
+ +
+

Filter Bubble Lock-In

+

After 6 months, you've drifted from ${initialAi}/${100-initialAi} to ${week24Ai}/${100-week24Ai}. ${week24Ai >= 75 ? 'Your feed is now a monoculture - the minority interest has nearly disappeared.' : 'The drift continues accelerating as the imbalance grows.'}

+

This isn't because you changed. In this simplified model, drift follows directly from the update rule: an imbalanced profile keeps reinforcing itself.

+
+
+ `; + + nextStageBtn.style.display = 'none'; + restartBtn.style.display = 'block'; + + // Update history display + history = projection; + updateHistory(); +} + +/** + * Restart the loop + */ +function restart() { + nextStageBtn.style.display = 'block'; + restartBtn.style.display = 'none'; + historyContainer.style.display = 'none'; + + loopContainer.style.display = 'none'; + window.scrollTo({ top: 0, behavior: 'smooth' }); +} + +/** + * Render current stage + */ +function renderStage() { + const stage = STAGES[currentStage]; + const aiPercent = Math.round(profile.ai * 100); + const cookingPercent = Math.round(profile.cooking * 100); + + let html = ''; + + switch (stage) { + case 'profile': + html = ` +
+

Stage 1: Your Profile

+

This is your current InterestedIn profile - the algorithm's understanding of what you care about:

+ +
+
+
+ ■ AI/Tech + ${aiPercent}% +
+
+
+
+
+ ■ Cooking + ${cookingPercent}% +
+
+
+
+ +
+

What this means: The algorithm will use these weights to score tweets. AI content gets multiplied by ${(profile.ai).toFixed(2)}, Cooking by ${(profile.cooking).toFixed(2)}.

+ ${currentWeek > 0 ? `

Change from Week ${currentWeek - 1}: AI ${profile.ai > history[currentWeek - 1].ai ? '↑' : '↓'} ${Math.abs((profile.ai - history[currentWeek - 1].ai) * 100).toFixed(1)}%, Cooking ${profile.cooking > history[currentWeek - 1].cooking ? '↑' : '↓'} ${Math.abs((profile.cooking - history[currentWeek - 1].cooking) * 100).toFixed(1)}%

` : ''} +
+
+ `; + break; + + case 'candidates': + const aiCandidates = Math.round(profile.ai * 1600); + const cookingCandidates = Math.round(profile.cooking * 1600); + + html = ` +
+

Stage 2: Fetch Candidates

+

The algorithm fetches ~1,600 candidate tweets from your clusters, proportional to your interests:

+ +
+
+ ■ AI/Tech: + ${aiCandidates} tweets +
+
+ ■ Cooking: + ${cookingCandidates} tweets +
+
+ +
+

What this means: Before any scoring happens, the algorithm already fetched ${aiPercent}% AI content and ${cookingPercent}% Cooking content. Your profile determines what's even in the pool!

+
+
+ `; + break; + + case 'scoring': + const aiScore = (0.85 * profile.ai).toFixed(3); + const cookingScore = (0.85 * profile.cooking).toFixed(3); + const scoreAdvantage = (aiScore / cookingScore).toFixed(2); + + html = ` +
+

Stage 3: Score Tweets

+

Each tweet gets scored by multiplying base quality × your cluster interest:

+ +
+
+
+ AI Tweet: "New breakthrough in transformer architecture..." +
+
+ Base Quality: 0.85 + × Your AI Interest: ${profile.ai.toFixed(2)} + = Score: ${aiScore} +
+
+ +
+
+ Cooking Tweet: "Made the perfect sourdough after 3 years..." +
+
+ Base Quality: 0.85 + × Your Cooking Interest: ${profile.cooking.toFixed(2)} + = Score: ${cookingScore} +
+
+
+ +
+

What this means: Despite equal quality (0.85), the AI tweet scores ${scoreAdvantage}x higher due to your cluster interests. This determines what ranks at the top of your feed.

+
+
+ `; + break; + + case 'feed': + html = ` +
+

Stage 4: Build Your Feed

+

The algorithm sorts tweets by score and builds your feed. The composition matches your profile:

+ +
+
+
+ ${aiPercent}% + AI/Tech +
+
+
+
+ ${cookingPercent}% + Cooking +
+
+
+ +
+

What this means: Because AI content scored higher, it dominates your feed. You'll see ${aiPercent}% AI tweets and ${cookingPercent}% Cooking tweets. This isn't random - it's a direct result of the multiplicative scoring.

+
+
+ `; + break; + + case 'engagement': + html = ` +
+

Stage 5: You Engage

+

You engage with what you see. Since ${aiPercent}% of your feed is AI, ${aiPercent}% of your engagements are with AI content:

+ + + +
+

Critical insight: You didn't change your preferences! You just engaged with what was shown to you. The algorithm controlled what you saw, which determined what you engaged with.

+
+
+ `; + break; + + case 'update': + const oldAi = currentWeek > 0 ? history[currentWeek - 1].ai : history[0].ai; + const oldCooking = currentWeek > 0 ? history[currentWeek - 1].cooking : history[0].cooking; + const advantage = profile.ai / profile.cooking; + const imbalance = Math.abs(profile.ai - profile.cooking); + const slowdown = 1 - (imbalance * 0.5); + const drift = DRIFT_RATE * advantage * slowdown; + const newAi = Math.min(0.95, profile.ai + drift); + const newCooking = 1 - newAi; + + html = ` +
+

Stage 6: Update Your Profile

+

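Based on your engagement pattern, the algorithm updates your InterestedIn profile. This happens via weekly batch jobs. Strictly speaking, L2 normalization fixes the length of the interest vector rather than its sum; this simulation simplifies that so your interests always sum to 100%: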

+ +
+
+ Previous Profile: + + ${Math.round(profile.ai * 100)}% AI / + ${Math.round(profile.cooking * 100)}% Cooking + +
+
+
+ New Profile: + + ${Math.round(newAi * 100)}% AI / + ${Math.round(newCooking * 100)}% Cooking + +
+
+ AI: ${newAi > profile.ai ? '+' : ''}${((newAi - profile.ai) * 100).toFixed(1)}%, + Cooking: ${newCooking > profile.cooking ? '+' : ''}${((newCooking - profile.cooking) * 100).toFixed(1)}% +
+
+ +
+

The feedback loop: AI increased because you engaged more with AI. Cooking decreased because interests must sum to 100% (zero-sum). This new profile becomes the input for Week ${currentWeek + 1}, and the cycle repeats.

+

This is drift! Small changes compound week after week, pushing you toward monoculture.

+
+
+ `; + break; + } + + stageContainer.innerHTML = html; +} + +/** + * Update progress tracker + */ +function updateProgress() { + const progressItems = document.querySelectorAll('.progress-item'); + + progressItems.forEach((item, index) => { + if (index < currentStage) { + item.classList.add('completed'); + item.classList.remove('active'); + } else if (index === currentStage) { + item.classList.add('active'); + item.classList.remove('completed'); + } else { + item.classList.remove('active', 'completed'); + } + }); +} + +/** + * Update history timeline + */ +function updateHistory() { + if (history.length <= 1) return; + + historyContainer.style.display = 'block'; + + const timeline = document.getElementById('history-timeline'); + timeline.innerHTML = history.map((entry, index) => { + const aiPercent = Math.round(entry.ai * 100); + const cookingPercent = Math.round(entry.cooking * 100); + + return ` +
+
Week ${entry.week}
+
+
+ ${aiPercent}% +
+
+ ${cookingPercent}% +
+
+
+ `; + }).join(''); + + // Analysis + const initialAi = Math.round(history[0].ai * 100); + const currentAi = Math.round(history[history.length - 1].ai * 100); + const drift = currentAi - initialAi; + // currentWeek is never incremented, so read the elapsed weeks from the last history entry + const weeksElapsed = history[history.length - 1].week; + + document.getElementById('drift-analysis').innerHTML = ` + Your profile drifted from ${initialAi}% AI / ${100 - initialAi}% Cooking + to ${currentAi}% AI / ${100 - currentAi}% Cooking over ${weeksElapsed} weeks. + That's a ${drift} percentage point shift toward AI, happening automatically through the reinforcement loop. + ${drift > 15 ? 'You\'re entering a filter bubble - the minority interest is fading fast!' : 'The drift is accelerating as the imbalance grows.'} + `; +} diff --git a/docs/parts/reference.html b/docs/parts/reference.html new file mode 100644 index 000000000..7c09fb239 --- /dev/null +++ b/docs/parts/reference.html @@ -0,0 +1,396 @@ + + + + + + Reference & Glossary - How Twitter's Algorithm Really Works + + + + + +
+

Reference & Glossary

+ +

Technical terminology, code references, and verification guide for the interactive documentation

+ +
+

Contents

+ +

Glossary: Algorithm Components

+ + +

Reference Sections

+ +
+ +
+ +

Glossary: Building Blocks of the Algorithm

+ +

The Twitter algorithm is built from many interconnected systems. Here's what each piece does, explained intuitively rather than technically.

+ +
+

Reading this glossary: Each entry explains what the system does and why it exists. Think of these as tools in a toolbox - each serves a specific purpose in the larger recommendation pipeline.

+
+ +

Heavy Ranker

+ +

What it is: The main machine learning model that scores tweets.

+ +

How to think about it: Imagine a judge at a competition who can predict 15 different ways the audience might react to each performance. The Heavy Ranker looks at a tweet and predicts: "There's a 5% chance you'll like this, 2% chance you'll reply, 0.1% chance you'll click 'not interested'," and so on. Each prediction gets a weight (a reply is weighted 13.5 versus 0.5 for a like, so 27x as much), and the weighted sum becomes the tweet's base score, which later filters and penalties can still adjust.
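
The weighted sum described above can be sketched in a few lines. This is a minimal illustration, not the production scorer: the weights are the ones quoted in this documentation (only four of the 15 engagement types are shown), while the probabilities and the names `ENGAGEMENT_WEIGHTS`, `examplePredictions`, and `weightedScore` are assumptions made up for the example.

```javascript
// Engagement weights as quoted in this documentation (a subset of the 15 types).
const ENGAGEMENT_WEIGHTS = {
  'Reply': 13.5,
  'Retweet': 1.0,
  'Favorite': 0.5,
  'Negative Feedback': -74.0
};

// Hypothetical predicted probabilities for one tweet (illustrative values only).
const examplePredictions = {
  'Reply': 0.02,
  'Retweet': 0.01,
  'Favorite': 0.05,
  'Negative Feedback': 0.001
};

// Weighted sum: score = sum over engagement types of probability * weight.
function weightedScore(predictions, weights) {
  let score = 0;
  for (const [type, weight] of Object.entries(weights)) {
    // Missing predictions are treated as probability 0.
    score += (predictions[type] || 0) * weight;
  }
  return score;
}

const exampleScore = weightedScore(examplePredictions, ENGAGEMENT_WEIGHTS);
console.log(exampleScore.toFixed(3));
```

With these made-up numbers the 2% reply prediction contributes 0.27, more than ten times the 0.025 contributed by the 5% like prediction, while a 0.1% chance of negative feedback already subtracts 0.074.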

+ +

Why it exists: Scoring thousands of tweets per user is computationally expensive. The Heavy Ranker is "heavy" because it's thorough - it uses a neural network with ~48 million parameters to make highly accurate predictions. But you can only afford to run something this expensive on a pre-filtered set of candidates.

+ +

Architecture: Uses MaskNet (see below) - a special neural network design that predicts all 15 engagement types simultaneously while sharing knowledge between predictions.

+ +

Code: External repo recap

+

Weights: HomeGlobalParams.scala:786-1028

+ +

Light Ranker

+ +

What it is: A faster, simpler scoring model embedded in the search index.

+ +

How to think about it: If Heavy Ranker is a detailed film critic analyzing every aspect of a movie, Light Ranker is a quick star rating. It's a basic logistic regression model that runs inside the search index (Earlybird) to quickly score millions of tweets and pick the top few thousand worth sending to Heavy Ranker.

+ +

Why it exists: You can't run Heavy Ranker on a billion tweets - it would take too long and cost too much. Light Ranker is the bouncer that gets the candidate pool down from millions to thousands in milliseconds.

+ +

Trade-off: Fast but less accurate. Uses only ~20 features vs Heavy Ranker's ~6,000 features.
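
The two-stage idea (cheap model first, expensive model only on the survivors) can be sketched as follows. Everything specific here is hypothetical: the three feature names, their weights, the bias, and the helper names `lightScore` and `preFilter` are invented for illustration and are not taken from Earlybird.

```javascript
// Logistic function: maps a linear score to a value in (0, 1).
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Hypothetical light model: a handful of features with made-up weights.
const LIGHT_WEIGHTS = { recency: 1.2, authorReputation: 0.8, hasMedia: 0.3 };
const LIGHT_BIAS = -1.0;

// Logistic regression: sigmoid of (bias + sum of weight * feature).
function lightScore(features) {
  let z = LIGHT_BIAS;
  for (const [name, w] of Object.entries(LIGHT_WEIGHTS)) {
    z += w * (features[name] || 0);
  }
  return sigmoid(z);
}

// Keep only the k best candidates for the expensive second stage.
function preFilter(candidates, k) {
  return [...candidates]
    .sort((a, b) => lightScore(b.features) - lightScore(a.features))
    .slice(0, k);
}

// Toy candidate pool (illustrative values only).
const pool = [
  { id: 'a', features: { recency: 0.9, authorReputation: 0.7, hasMedia: 1 } },
  { id: 'b', features: { recency: 0.1, authorReputation: 0.2, hasMedia: 0 } },
  { id: 'c', features: { recency: 0.5, authorReputation: 0.9, hasMedia: 0 } }
];

const shortlist = preFilter(pool, 2).map(c => c.id);
console.log(shortlist);
```

The design point is that `lightScore` touches only a few numbers per tweet, so it can run over a very large pool; only the shortlist would then be handed to a heavier model.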

+ +

Code: earlybird

+ +

TwHIN (Twitter Heterogeneous Information Network)

+ +

What it is: A giant knowledge graph that represents everything on Twitter (users, tweets, topics, communities) as connected points in mathematical space.

+ +

How to think about it: Imagine a 3D map where every user is a point, every tweet is a point, and every topic is a point. Similar things are close together. If you like sci-fi movies and engage with certain accounts, you'll be positioned near other sci-fi fans. TwHIN can then say "show this person tweets from that nearby cluster they haven't seen yet."

+ +

Why it exists: Finding relevant content from people you don't follow is hard. TwHIN solves this by representing similarity mathematically - it can find "users similar to you" or "tweets similar to what you engage with" by measuring geometric distance in this abstract space.
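To make "measuring geometric distance" concrete, here is a small sketch of nearest-neighbour lookup in an embedding space. The three-dimensional vectors and item names are invented for illustration, and cosine similarity is an assumed choice of metric; real embeddings have many more dimensions and are searched with approximate indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, embeddings, k):
    """Return the names of the k embeddings most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine_similarity(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical values).
tweet_embeddings = {
    "scifi_review": [0.9, 0.1, 0.0],
    "space_news": [0.8, 0.3, 0.1],
    "pasta_recipe": [0.0, 0.2, 0.9],
}
user = [1.0, 0.2, 0.0]
print(nearest(user, tweet_embeddings, k=2))
```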

+ +

Heterogeneous means: The graph includes different types of things (users, tweets, topics, hashtags) all in one unified mathematical representation.

+ +

Code: recos and related embeddings

+ +

SimClusters

+ +

What it is: A system that divides X into ~145,000 interest-based communities and represents both users and tweets as membership in these communities.

+ +

How to think about it: Instead of saying "Alice follows Bob and Carol," SimClusters says "Alice is 60% in the AI cluster, 30% in the cooking cluster, and 10% in the gardening cluster." Tweets are described the same way. Then matching is simple: show people tweets from clusters they belong to.

+ +

Why it exists: Communities are more stable than individual follow relationships, and they're much more efficient to compute with. Rather than comparing you to millions of individual users, the algorithm can compare your cluster membership to tweet cluster scores.
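The comparison between a user's cluster membership and a tweet's cluster scores can be sketched as a sparse dot product. The cluster names and tweet scores below are made up; Alice's memberships reuse the 60/30/10 example from above.

```python
def cluster_match(user_clusters, tweet_clusters):
    """Sum of user_score * tweet_score over the clusters both sides share."""
    return sum(
        weight * tweet_clusters[cluster]
        for cluster, weight in user_clusters.items()
        if cluster in tweet_clusters
    )


alice = {"ai": 0.6, "cooking": 0.3, "gardening": 0.1}
ai_tweet = {"ai": 0.9, "cooking": 0.1}   # hypothetical tweet cluster scores
garden_tweet = {"gardening": 0.8}

print(cluster_match(alice, ai_tweet))
print(cluster_match(alice, garden_tweet))
```

Only clusters present on both sides contribute, so the cost scales with the handful of clusters a user or tweet belongs to rather than with all ~145,000.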

+ +

The gravitational pull effect: Because scoring uses multiplication (your_cluster_score × tweet_cluster_score), your strongest cluster keeps getting stronger. If you're 60% AI and 40% cooking today, engaging slightly more with AI content makes you 65% AI, which makes AI content score even higher, which makes you engage more with AI... and six months later you're 76% AI.
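The feedback loop can be illustrated with a toy simulation. The update rule and the learning rate are assumptions chosen only to show the direction of the effect; they are not taken from the code and will not reproduce the exact percentages quoted above.

```python
def step(shares, learning_rate=0.1):
    """One feedback round: each cluster grows in proportion to its own share.

    Hypothetical rule: a cluster's share is boosted by (1 + learning_rate * share),
    then all shares are renormalised so they sum to 1.
    """
    boosted = {c: s * (1.0 + learning_rate * s) for c, s in shares.items()}
    total = sum(boosted.values())
    return {c: v / total for c, v in boosted.items()}


shares = {"ai": 0.6, "cooking": 0.4}
history = [shares["ai"]]
for _ in range(26):  # e.g. 26 weekly updates, about six months
    shares = step(shares)
    history.append(shares["ai"])

print([round(x, 3) for x in history[::13]])
```

Whatever the exact rule, any update in which a larger share earns a larger boost pushes the leading cluster further ahead each round.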

+ +

How clusters are created: X analyzes the follow graph using community detection algorithms to discover ~145,000 natural communities. Your interests (InterestedIn) are calculated from your engagement history with a 100-day half-life, updated weekly. See the Cluster Explorer interactive to understand how you're categorized.
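A 100-day half-life means an engagement counts half as much once it is 100 days old. The sketch below assumes plain exponential decay and a simple normalisation step; both are illustrative assumptions, and the function names are hypothetical.

```python
def decay_weight(age_days, half_life_days=100.0):
    """Weight of an engagement that happened age_days ago."""
    return 0.5 ** (age_days / half_life_days)


def interest_scores(engagements):
    """Turn (cluster, age_days) pairs into normalised interest shares."""
    totals = {}
    for cluster, age_days in engagements:
        totals[cluster] = totals.get(cluster, 0.0) + decay_weight(age_days)
    norm = sum(totals.values())
    return {cluster: value / norm for cluster, value in totals.items()}


engagements = [("ai", 0), ("cooking", 100), ("cooking", 200)]
print(interest_scores(engagements))
```

In this example one engagement from today outweighs two older ones: the cooking engagements contribute 0.5 and 0.25, against 1.0 for the fresh AI engagement.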

+ +

Code: simclusters_v2

+ +

UTEG (User-Tweet-Entity-Graph)

+ +

What it is: An in-memory graph database that tracks recent engagement patterns to make real-time recommendations.

+ +

How to think about it: UTEG is like a short-term memory system. It remembers "in the last 24 hours, people similar to you engaged with these tweets." It's built using GraphJet (see below), which keeps a live graph in RAM that can answer queries in milliseconds.

+ +

Why it exists: Some recommendation systems (like SimClusters) are based on long-term patterns and update slowly. UTEG captures what's happening right now - trending topics, breaking news, viral content. It provides the "fresh" recommendations that complement the more stable systems.

+ +

Graph traversal: To find recommendations, UTEG does graph walks: "You liked tweet A → Other people who liked A also liked B → Show you tweet B."
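The walk described above can be sketched with plain dictionaries. This is an illustration of the idea only: the data is invented, and the real system runs on GraphJet rather than Python sets.

```python
from collections import Counter


def recommend(user, likes_by_user, k=3):
    """Two-hop walk: my likes -> people who liked the same -> their other likes."""
    seen = likes_by_user[user]
    counts = Counter()
    for other, liked in likes_by_user.items():
        if other == user or not (seen & liked):
            continue  # skip myself and users with no overlap
        for tweet in liked - seen:
            counts[tweet] += 1  # one vote per co-liker
    return [tweet for tweet, _ in counts.most_common(k)]


likes_by_user = {
    "you": {"A"},
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"D"},
}
print(recommend("you", likes_by_user))
```

Tweet D never appears because its only fan shares no likes with you. B ranks above C because two co-likers vouch for it.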

+ +

Code: user_tweet_entity_graph

+ +

GraphJet

+ +

What it is: An in-memory graph database optimized for real-time recommendations.

+ +

How to think about it: A traditional database stores data on disk and reads it when needed (slow). GraphJet keeps the entire graph in RAM (fast) and is optimized for the specific types of queries Twitter needs: "given this user, find related tweets" or "given this tweet, find similar users."

+ +

Why it exists: Speed. When you refresh your timeline, Twitter has ~200 milliseconds to gather candidates, score them, and serve the results. GraphJet can traverse millions of graph edges in memory in just a few milliseconds.

+ +

Trade-off: RAM is expensive and limited, so GraphJet only stores recent data (typically last 24-48 hours of engagement).

+ +

Code: Open-sourced separately at GraphJet

+ +

Earlybird

+ +

What it is: Twitter's real-time search index - a specialized database optimized for finding tweets by keywords, authors, or engagement patterns.

+ +

How to think about it: When you search for "machine learning" on Twitter, Earlybird finds matching tweets in milliseconds even though there are billions of tweets. For the recommendation algorithm, Earlybird serves as the main source of in-network candidates (tweets from people you follow).

+ +

Why it exists: Traditional databases aren't fast enough for Twitter's scale. Earlybird is custom-built for one purpose: extremely fast tweet retrieval with ranking. It includes the Light Ranker (see above) built directly into the index so it can return already-scored candidates.

+ +

Real-time means: New tweets are indexed within seconds, so Earlybird always has the latest content.

+ +

Code: search

+ +

Real Graph

+ +

What it is: A system that predicts the strength of relationships between users based on interaction patterns, not just follow relationships.

+ +

How to think about it: You might follow 500 people, but you only regularly interact with 20 of them. Real Graph identifies those 20 by tracking who you reply to, whose profiles you visit, and whose tweets you engage with. It creates a weighted graph in which the strength of each edge represents the strength of the relationship.

+ +

Why it exists: Following someone is a weak signal. The algorithm needs to know who you actually care about. Real Graph provides this by analyzing behavior: "You follow both @alice and @bob, but you reply to Alice 10x more often, so Alice gets 10x more weight in your recommendations."
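A rough sketch of turning interaction counts into edge strengths follows. It replaces the prediction step with a fixed weighted count, and the per-interaction weights are hypothetical values chosen only for illustration.

```python
# Hypothetical per-interaction weights, chosen only for illustration.
INTERACTION_WEIGHTS = {"reply": 3.0, "profile_visit": 1.0, "like": 0.5}


def edge_strength(counts):
    """Weighted sum of interaction counts for one relationship."""
    return sum(INTERACTION_WEIGHTS[kind] * n for kind, n in counts.items())


def strongest_ties(interactions, k):
    """Return the k accounts with the highest edge strength."""
    ranked = sorted(
        interactions,
        key=lambda account: edge_strength(interactions[account]),
        reverse=True,
    )
    return ranked[:k]


interactions = {
    "alice": {"reply": 10, "like": 4},
    "bob": {"reply": 1, "like": 4},
    "carol": {"profile_visit": 2},
}
print(strongest_ties(interactions, k=2))
```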

+ +

Used for: Prioritizing in-network content, finding follow recommendations, and scoring out-of-network candidates based on similarity to your real connections.

+ +

Code: interaction_graph

+ +

Tweet Mixer

+ +

What it is: A coordination service that gathers out-of-network tweet candidates from multiple sources and combines them.

+ +

How to think about it: Tweet Mixer is like a talent scout that asks multiple agencies (TwHIN, SimClusters, UTEG, FRS) for their best recommendations, then combines those lists into one unified candidate pool to send to the Heavy Ranker.

+ +

Why it exists: Each recommendation system has different strengths - UTEG finds trending content, SimClusters finds thematic matches, TwHIN finds geometric similarity. Tweet Mixer orchestrates these systems and ensures you get a diverse mix of out-of-network candidates rather than duplicates from the same source.

+ +

Does NOT score: Tweet Mixer just fetches and combines. The actual scoring happens later in the Heavy Ranker.
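The "fetch and combine" step might look roughly like this. Round-robin interleaving is an assumed merge strategy picked for illustration, and the candidate IDs are invented; note that no scores appear anywhere.

```python
from itertools import zip_longest


def mix(sources):
    """Round-robin merge of per-source candidate lists, dropping duplicates."""
    seen = set()
    mixed = []
    for round_of_candidates in zip_longest(*sources.values()):
        for tweet_id in round_of_candidates:
            if tweet_id is not None and tweet_id not in seen:
                seen.add(tweet_id)
                mixed.append(tweet_id)
    return mixed


sources = {
    "twhin": [101, 102, 103],
    "simclusters": [102, 201],
    "uteg": [301, 101, 302],
}
print(mix(sources))
```

Taking one candidate from each source in turn keeps any single source from crowding out the others before the Heavy Ranker sees the pool.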

+ +

Code: tweet-mixer

+ +

Navi

+ +

What it is: A high-performance inference engine that runs machine learning models in production.

+ +

How to think about it: Training a neural network happens offline in Python/TensorFlow. But when it's time to actually score tweets for millions of users, you need something blazing fast. Navi is a Rust-based serving system optimized for running the Heavy Ranker model with minimal latency.

+ +

Why it exists: Python is too slow for production inference at Twitter's scale. Navi serves the trained model from a Rust runtime that can score thousands of tweets per second with single-digit millisecond latency.

+ +

Trade-off: More complex to deploy and maintain than standard TensorFlow Serving, but much faster.

+ +

Code: Proprietary, but referenced in NaviModelScorer.scala

+ +

Product Mixer

+ +

What it is: A framework for building content feeds - provides reusable components for fetching candidates, scoring, filtering, and mixing content.

+ +

How to think about it: Building a recommendation timeline involves many common steps: fetch candidates, hydrate features, run ML models, apply filters, insert ads, etc. Product Mixer provides these as Lego blocks so teams can assemble feeds without reimplementing everything from scratch.

+ +

Why it exists: Twitter has multiple feeds (For You, Following, Lists, Search, Notifications). Product Mixer lets them share code and ensure consistency while customizing each feed's specific logic.

+ +

Pipeline structure: Product Mixer uses a pipeline model where each stage's output feeds into the next stage, making the data flow explicit and testable.
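The pipeline model can be sketched as a list of functions, each consuming the previous stage's output. The stage names and the toy scoring rule are hypothetical; Product Mixer itself is a Scala framework with far richer stage types.

```python
def hydrate_features(candidates):
    """Attach a (hypothetical) feature to each candidate."""
    return [{**c, "length": len(c["text"])} for c in candidates]


def score(candidates):
    """Toy scorer: longer text scores higher."""
    return [{**c, "score": c["length"] / 10.0} for c in candidates]


def filter_low_scores(candidates):
    """Drop candidates below an arbitrary threshold."""
    return [c for c in candidates if c["score"] >= 1.0]


def sort_by_score(candidates):
    return sorted(candidates, key=lambda c: c["score"], reverse=True)


PIPELINE = [hydrate_features, score, filter_low_scores, sort_by_score]


def run_pipeline(candidates, stages=PIPELINE):
    """Feed each stage's output into the next stage."""
    for stage in stages:
        candidates = stage(candidates)
    return candidates


feed = run_pipeline([
    {"id": 1, "text": "short"},
    {"id": 2, "text": "a considerably longer tweet"},
    {"id": 3, "text": "medium length"},
])
print([c["id"] for c in feed])
```

Because every stage has the same list-in, list-out shape, each one can be tested on its own or swapped out per feed.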

+ +

Code: product-mixer

+ +

MaskNet

+ +

What it is: A neural network architecture designed for multi-task learning - predicting multiple related outcomes simultaneously.

+ +

How to think about it: Traditional models predict one thing ("will you like this?"). MaskNet predicts 15 things at once (like, reply, retweet, report, etc.) while sharing knowledge between tasks. The insight is that all these predictions are related - if someone is likely to reply, they're probably also likely to like - so the model can learn more efficiently by predicting them together.

+ +

Why it exists: Training 15 separate models would be inefficient and they'd miss shared patterns. MaskNet uses "shared towers" (neural network layers that all tasks use) and "task-specific towers" (layers unique to each prediction), getting the best of both worlds.

+ +

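The mask part: In the published MaskNet design, the "mask" is an instance-guided mask. A small network computes per-input weights that are multiplied element-wise into the feature embeddings and hidden layers, so the model can emphasize the features that matter most for each individual input.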

+ +

Code: External repo recap

+ +

FSBoundedParam (Feature Switch)

+ +

What it is: A configuration system that lets Twitter tune algorithm parameters without deploying new code.

+ +

How to think about it: Hardcoded values like val penalty = 0.75 require a code deployment to change. FSBoundedParam defines parameters like OutOfNetworkPenalty(default=0.75, min=0.0, max=1.0) that can be adjusted through a dashboard. Twitter can run A/B tests or tune values in real-time without touching code.

+ +

Why it exists: Algorithm optimization is experimental. Twitter needs to test "what if we change the out-of-network penalty from 0.75 to 0.80?" dozens of times per week. FSBoundedParam makes this safe (the bounds prevent catastrophically bad values) and fast (no deployment required).
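A bounded, remotely overridable parameter can be sketched as follows. The class name, the `resolve` method, and the choice to clamp out-of-range values (rather than reject them) are assumptions for illustration, not the actual FSBoundedParam API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedParam:
    """Hypothetical stand-in for a bounded feature-switch parameter."""
    name: str
    default: float
    lower: float
    upper: float

    def resolve(self, overrides):
        """Return the override if present, clamped into [lower, upper]."""
        value = overrides.get(self.name, self.default)
        return max(self.lower, min(self.upper, value))


OUT_OF_NETWORK_PENALTY = BoundedParam(
    "out_of_network_penalty", default=0.75, lower=0.0, upper=1.0
)

print(OUT_OF_NETWORK_PENALTY.resolve({}))
print(OUT_OF_NETWORK_PENALTY.resolve({"out_of_network_penalty": 0.80}))
print(OUT_OF_NETWORK_PENALTY.resolve({"out_of_network_penalty": 7.5}))
```

The last call shows the safety property: a mistyped override of 7.5 cannot push the value outside the declared range.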

+ +

Important implication: Most weights, penalties, and thresholds in the algorithm are FSBoundedParams. The March 2023 open-source code shows the structure and formulas, but Twitter can tune the parameters without us seeing the changes.

+ +

Code: Used throughout, defined in param

+ +

TweepCred

+ +

What it is: A reputation score for users based on their follower graph quality, using PageRank-like algorithms.

+ +

How to think about it: Not all followers are equal. A verified account with 1M engaged followers has higher TweepCred than a bot farm with 1M fake followers. TweepCred measures "how much does the Twitter network trust/value this user?" by looking at who follows them and the quality of those followers.

+ +

Why it exists: Follower count alone is easily gamed. TweepCred provides a more robust measure of influence by analyzing the graph structure. It's used to boost high-quality accounts and filter low-quality ones (the SLOP filter removes users with TweepCred below a threshold).
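To show why graph structure beats raw follower counts, here is a textbook PageRank power iteration on a tiny follow graph. It is a generic sketch, not the TweepCred implementation; the damping factor, iteration count, and account names are assumptions.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """follows maps each follower to the list of accounts they follow."""
    users = set(follows) | {v for targets in follows.values() for v in targets}
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for user in users:
            targets = follows.get(user, [])
            if targets:
                # A follower passes its rank on to the accounts it follows.
                share = damping * rank[user] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Account that follows nobody: spread its rank evenly.
                for other in users:
                    new_rank[other] += damping * rank[user] / n
        rank = new_rank
    return rank


follows = {
    "a": ["star"], "b": ["star"], "c": ["star"],
    "star": ["friend"], "friend": ["star"],
    "e": ["d"],
}
ranks = pagerank(follows)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Here `friend` and `d` each have exactly one follower, but `friend` ranks far higher because its follower is the well-followed `star`.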

+ +

Verified accounts: Get a ~100x TweepCred multiplier, which partly explains why verified accounts dominate recommendations.

+ +

Code: tweepcred

+ +

FRS (Follow Recommendations Service)

+ +

What it is: A service that recommends users you might want to follow.

+ +

How to think about it: FRS analyzes your follow graph and engagement patterns to suggest accounts similar to those you already follow or engage with. But it has a dual purpose: it also feeds into timeline recommendations by showing you tweets from accounts it thinks you should follow before you actually follow them.

+ +

Why it exists: Growing your follow graph improves your timeline quality (more in-network candidates). But FRS also serves as a candidate source - "here are tweets from people you don't follow but should."

+ +

Cluster reinforcement: FRS recommends users from your strongest SimClusters, which accelerates the gravitational pull effect. If you're 60% AI cluster, FRS recommends more AI accounts, you follow them, which makes you even more AI-cluster-heavy.

+ +

Code: follow-recommendations-service

+ +

User Signal Service (USS)

+ +

What it is: A centralized platform for collecting, storing, and serving user behavior signals.

+ +

How to think about it: Every action you take on Twitter (like, reply, click, scroll, dwell time) generates a signal. Rather than having every recommendation system separately track these signals, USS centralizes them. When the algorithm needs to know "what has this user engaged with recently?", it queries USS.

+ +

Why it exists: Reduces duplication and ensures consistency. Multiple systems use the same signals (favorites, follows, etc.), so centralizing this in USS means one source of truth.

+ +

Real-time and batch: USS provides both real-time signals (recent clicks in the last hour) and batch signals (aggregated engagement over weeks/months).

+ +

Code: user-signal-service

+ +
+ +

Code Evolution Timeline

+ +

Twitter's algorithm was open-sourced in two major releases:

+ +
+

March 2023: Architecture Skeleton

+

~300 files showing the 5-stage pipeline structure, basic candidate sources, and core concepts. The HomeGlobalParams.scala file contained only 86 lines with basic configuration—no engagement weights, no ML integration configs.

+ +

September 2025: Complete Implementation

+

+762 new files adding 161 feature hydrators, 56 filters, 29 scorers, complete ML integration, and full parameter definitions. The HomeGlobalParams.scala file expanded to 1,479 lines with all engagement weight parameters defined.

+
+ +

What This Means

+ +

Our investigation analyzes a composite system:

+
    +
  • Architecture: March 2023 foundation (5-stage pipeline)
  • +
  • Implementation: September 2025 details (161 hydrators, 56 filters)
  • +
  • Values: External sources (ML repo, engineering blogs)
  • +
+ +

Important: Parameter definitions exist with default = 0.0, but actual production values come from Twitter's internal configuration system. The code shows structure and formulas; external documentation provides values.

+ +

Core findings remain valid: The fundamental mechanisms (multiplicative scoring, exponential decay, 0.75x out-of-network penalty, 140-day feedback fatigue) are unchanged. The September 2025 release added detail and confirmed the architecture we analyzed.

+ +
+ +

How to Verify Our Claims

+ +

Every finding in this investigation can be verified. Here's how:

+ +

1. Get the Code

+
git clone https://github.com/twitter/the-algorithm.git
+cd the-algorithm
+ +

2. Navigate to Referenced Files

+

We provide file paths like:

+

HomeGlobalParams.scala:786-1028

+ +

To view this:

+
cd home-mixer/server/src/main/scala/com/twitter/home_mixer/param/
+sed -n '786,1028p' HomeGlobalParams.scala
+ +

3. Check Implementation Date

+

To see when code was last modified:

+
git blame path/to/file.scala | grep -A5 "pattern"
+ +

4. Verify Calculations

+

We show calculations like:

+
Tweet score = 0.5 × P(favorite) + 13.5 × P(reply)
+
+Example:
+P(favorite) = 0.1 (10% chance)
+P(reply) = 0.02 (2% chance)
+
+Score = 0.5 × 0.1 + 13.5 × 0.02
+      = 0.05 + 0.27
+      = 0.32
+ +

You can verify these against the code references we provide.
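The arithmetic can also be checked with a few lines of code. This sketch uses only the two weights quoted in the example (0.5 for a favorite, 13.5 for a reply); the function name is hypothetical and the other engagement types are left out.

```python
# The two weights quoted in the example above; other engagement types omitted.
WEIGHTS = {"favorite": 0.5, "reply": 13.5}


def tweet_score(probabilities, weights=WEIGHTS):
    """Weighted sum of predicted engagement probabilities."""
    return sum(weights[kind] * p for kind, p in probabilities.items())


score = tweet_score({"favorite": 0.1, "reply": 0.02})
print(round(score, 2))  # 0.32
```

Swapping in other probabilities shows how strongly the reply term dominates: doubling P(reply) adds 0.27 to the score, while doubling P(favorite) adds only 0.05.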

+ +

5. Cross-Reference Documentation

+

Twitter published some official explanations:

+ + +

Our analysis adds detail and implications beyond what Twitter officially documented.

+ +
+ +

File Index: Where to Find Things

+ +

Main Pipeline

+

Entry point: home-mixer/server/src/main/scala/com/twitter/home_mixer/product/for_you/ForYouProductPipelineConfig.scala

+

Scoring orchestration: home-mixer/server/src/main/scala/com/twitter/home_mixer/product/scored_tweets/ScoredTweetsProductPipelineConfig.scala

+ +

Engagement Weights

+

All 15 weight parameters: home-mixer/server/src/main/scala/com/twitter/home_mixer/param/HomeGlobalParams.scala:786-1028

+

Engagement type definitions: home-mixer/server/src/main/scala/com/twitter/home_mixer/model/PredictedScoreFeature.scala:62-336

+ +

Filters and Penalties

+

"Not interested" filtering: home-mixer/.../filter/FeedbackFatigueFilter.scala

+

140-day penalty calculation: home-mixer/.../scorer/FeedbackFatigueScorer.scala

+

Author diversity exponential decay: home-mixer/.../scorer/AuthorBasedListwiseRescoringProvider.scala:54

+

Out-of-network 0.75x multiplier: home-mixer/.../scorer/RescoringFactorProvider.scala:45-57

+ +

Candidate Sources

+

Earlybird search index: src/java/com/twitter/search/

+

UTEG: src/scala/com/twitter/recos/user_tweet_entity_graph/

+

Out-of-network coordination: tweet-mixer/

+

FRS: follow-recommendations-service/

+ +

SimClusters and Communities

+

Community detection and embeddings: src/scala/com/twitter/simclusters_v2/

+

Approximate nearest neighbor search: simclusters-ann/

+ +

User Signals

+

Complete list of 20+ tracked signals: RETREIVAL_SIGNALS.md

+

Signal collection and serving: user-signal-service/

+

Real-time action stream: unified_user_actions/

+ +
+ +

Further Reading

+ +

Official Sources

+ + +
+ +
+

Questions or corrections? This is a living document. If you find errors or have questions about our analysis, please open an issue or submit a pull request on GitHub.

+
+ +
+ + + + +