LinkedIn Feed Ranking Architecture
Heavyweight ML scoring with online/offline feature pipelines
Key Insight
Two-tower ranking models separate retrieval (speed) from scoring (accuracy), making billion-scale personalization feasible.
Request Journey
How It Works
1. Connection publishes post
2. Features extracted (engagement history, network strength, recency)
3. Offline GBDT model scores post for each potential viewer
4. Online ranking re-scores top candidates at request time
5. Feed assembled with diversity rules (no 2 posts from same person back-to-back)
6. Impressions logged; clicks and likes fed back to retrain the model
The Problem
LinkedIn must rank and personalize a feed for 1 billion members, each with a unique professional network, industry context, and content preferences. The candidate pool includes millions of new posts daily: professional articles, job updates, company news, and engagement bait that must be filtered out. Ranking must balance multiple objectives: relevance to the member, content quality, creator incentives, and business metrics like ad revenue and session time. A poor feed drives professionals to competing platforms.
The Solution
LinkedIn's feed uses a multi-stage ranking pipeline: candidate generation retrieves the top-K relevant posts from followed connections and suggested content → a lightweight first-pass model prunes to hundreds of candidates → a heavyweight second-pass model (gradient-boosted trees + deep neural networks) scores each candidate on 1,000+ features → diversity injection prevents monotonous feeds → a final business-rules layer applies ad insertion and content-type quotas. Features are served from a real-time feature store with sub-5ms latency.
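The stages above can be sketched end-to-end. This is a minimal illustrative skeleton, not LinkedIn's implementation: the function bodies are stubs standing in for real ML models and services, and the cut-off sizes (1,000 retrieved, 500 pruned, a 25-item feed) are assumptions.

```python
from typing import List, Tuple

# Stubbed stage functions; a real system calls ML models and remote services.
def retrieve_candidates(member_id: str, k: int) -> List[str]:
    # Stand-in for ANN retrieval over precomputed post embeddings.
    return [f"post_{i}" for i in range(k)]

def light_rank(member_id: str, posts: List[str], keep: int) -> List[str]:
    # Cheap first-pass model prunes thousands of candidates to hundreds.
    return posts[:keep]

def heavy_rank(member_id: str, posts: List[str]) -> List[Tuple[str, float]]:
    # Heavyweight second-pass model scores each candidate (dummy scores here).
    return sorted(((p, 1.0 / (i + 1)) for i, p in enumerate(posts)),
                  key=lambda pair: -pair[1])

def rank_feed(member_id: str, feed_size: int = 25) -> List[str]:
    candidates = light_rank(member_id, retrieve_candidates(member_id, 1000), 500)
    scored = heavy_rank(member_id, candidates)
    # Diversity rules and business-rules layers would re-rank here.
    return [post for post, _ in scored[:feed_size]]

print(len(rank_feed("member_42")))  # 25
```

The point of the skeleton is the funnel shape: each stage is cheaper per candidate than the next, so expensive scoring only ever touches a few hundred posts.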
Scale at a Glance
1B+
Members
Hundreds of Millions
Feed Requests/Day
1,000+
Ranking Features
<5ms
Feature Serving Latency
Deep Dive
Two-Tower Retrieval Model: Speed vs. Accuracy
LinkedIn's candidate generation uses a two-tower neural network: one tower encodes the member (profile, interests, network) into an embedding, and the other tower encodes each post (content, author, engagement signals) into an embedding. The member's embedding is used for approximate nearest-neighbor search against millions of post embeddings to retrieve the top-K (~1,000) candidates. This separation is key: post embeddings are precomputed offline, so retrieval at serving time is just a vector similarity search, completing in milliseconds even over millions of candidates. The two-tower architecture sacrifices fine-grained member-post interaction features for massive speed.
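A brute-force stand-in for the retrieval step illustrates the idea: hypothetical post embeddings (as the post tower would precompute offline) are scored against a member embedding by dot product. The vectors here are invented; production systems replace this exact scan with an approximate nearest-neighbor index.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Precomputed offline by the post tower (illustrative 3-d vectors).
post_embeddings = {
    "post_a": [0.9, 0.1, 0.0],
    "post_b": [0.1, 0.8, 0.1],
    "post_c": [0.2, 0.2, 0.9],
}

def retrieve_top_k(member_embedding, k=2):
    # At serving time, retrieval is just a similarity search. Production
    # systems use an ANN index (e.g. HNSW) instead of this exact scan.
    scored = sorted(post_embeddings.items(),
                    key=lambda kv: dot(member_embedding, kv[1]),
                    reverse=True)
    return [post for post, _ in scored[:k]]

# Member tower output for one request (illustrative values).
print(retrieve_top_k([0.8, 0.1, 0.1]))  # ['post_a', 'post_c']
```

Because the two towers never see each other's raw inputs, no feature can depend on a specific member-post pair, which is exactly the accuracy the second-pass ranker buys back.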
Heavyweight Second-Pass Ranking
The second-pass model scores the top few hundred candidates using gradient-boosted decision trees (GBDTs) and deep neural networks consuming 1,000+ features. Features span multiple categories: member features (seniority, industry, activity level), post features (content type, author authority, freshness), interaction features (has the member engaged with this author before, content-topic affinity), and context features (time of day, device, session depth). GBDTs handle sparse categorical features well, while neural networks capture complex feature interactions. The model predicts multiple engagement signals (probability of like, comment, share, and long dwell), combined via a multi-objective function.
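The multi-objective combination can be illustrated with a weighted sum over the predicted probabilities. The weights below are invented for illustration, not LinkedIn's production values; real systems tune them against business metrics.

```python
# Hypothetical multi-objective combination: each head of the model predicts
# one engagement probability, and a weighted sum yields the final score.
def combined_score(preds, weights=None):
    # Illustrative weights: rarer, higher-effort actions weigh more.
    weights = weights or {"like": 1.0, "comment": 3.0, "share": 5.0, "dwell": 2.0}
    return sum(weights[k] * preds.get(k, 0.0) for k in weights)

post_preds = {"like": 0.20, "comment": 0.05, "share": 0.01, "dwell": 0.30}
print(round(combined_score(post_preds), 2))  # 1.0
```

Weighting comments and shares above likes is one common way to express that a feed should reward conversation, though it is also where engagement and quality objectives start to conflict.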
Online/Offline Feature Pipeline
LinkedIn's ranking models consume features from two pipelines. The offline pipeline computes batch features daily using Spark: a member's industry affinity scores, author authority metrics, content topic distributions, and network graph features. These are materialized into a key-value feature store (Venice). The online pipeline computes real-time features per request: current session behavior, trending topics, recent engagement signals, and freshness decay. The feature store serves both pipelines with sub-5ms p99 latency. Ensuring consistency between training features (computed offline) and serving features (computed in real-time) is a major engineering challenge that LinkedIn addresses via feature logging and monitoring.
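A minimal sketch of the feature-logging idea, under assumed data shapes: the exact feature values assembled at serving time are logged with the request, so training later reads what the model actually saw rather than recomputing features offline (where they can drift). The store layout and names here are illustrative, not Venice's API.

```python
import time

feature_log = []  # in production this would be an event stream, not a list

def serve_features(member_id, post_id, store):
    # Merge batch (offline) and per-request (online) features for scoring.
    features = {**store["offline"].get(member_id, {}),
                **store["online"].get(post_id, {})}
    # Log exactly what was served; training joins on this log.
    feature_log.append({"member": member_id, "post": post_id,
                        "ts": time.time(), "features": features})
    return features

store = {
    "offline": {"m1": {"industry_affinity": 0.8}},  # batch, e.g. daily Spark job
    "online": {"p1": {"freshness_decay": 0.95}},    # computed per request
}
feats = serve_features("m1", "p1", store)
print(sorted(feats))  # ['freshness_decay', 'industry_affinity']
```

The design choice this captures: consistency comes from logging the served values, not from trying to make two independent computation paths agree bit-for-bit.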
Diversity Injection: Preventing Filter Bubbles
After scoring, LinkedIn applies diversity injection to prevent feeds from becoming monotonous. Without intervention, the ranking model would fill feeds with the highest-scoring content type (often viral engagement-bait posts). Diversity rules enforce variety across dimensions: content type (articles, images, videos, job posts), author diversity (don't show 5 posts from the same person), topic diversity (mix industry news with career advice), and network distance (blend first-degree connection posts with suggested content). These rules are applied via a post-processing diversification algorithm that re-ranks the scored list subject to diversity constraints.
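A greedy diversification pass over the scored list might look like the sketch below. The author-based rules (no two consecutive posts from one author, at most two posts per author) are illustrative stand-ins; real constraints also span content type, topic, and network distance.

```python
def diversify(scored_posts, max_per_author=2):
    # scored_posts is already sorted by relevance score, descending.
    feed, author_counts = [], {}
    deferred = list(scored_posts)
    while deferred:
        for i, post in enumerate(deferred):
            author = post["author"]
            same_as_prev = bool(feed) and feed[-1]["author"] == author
            if not same_as_prev and author_counts.get(author, 0) < max_per_author:
                # Highest-scored candidate that satisfies the constraints wins.
                feed.append(deferred.pop(i))
                author_counts[author] = author_counts.get(author, 0) + 1
                break
        else:
            break  # no remaining candidate satisfies the constraints
    return feed

posts = [{"id": 1, "author": "alice"}, {"id": 2, "author": "alice"},
         {"id": 3, "author": "bob"}, {"id": 4, "author": "alice"}]
print([p["id"] for p in diversify(posts)])  # [1, 3, 2]
```

Note how post 4 is dropped entirely: the constraints deliberately trade pure relevance for variety, which is the quality-diversity trade-off named in the weaknesses below.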
Feed Quality and Anti-Viral Measures
LinkedIn actively demotes content that optimizes for engagement at the expense of quality, such as "engagement bait" polls designed to game the algorithm. A quality scoring model evaluates content integrity: is this post informative or clickbait? Is the author an authority on this topic? Does it contribute to professional discourse? Posts that receive engagement primarily from people outside the author's professional network are flagged as potentially viral-but-low-quality. LinkedIn also applies velocity controls: rapidly viral posts are held for human review before being amplified further. This reflects LinkedIn's explicit product decision to prioritize professional value over engagement metrics.
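A velocity control could be sketched as a simple rate gate. Both the signal (engagement relative to the author's follower count) and the threshold are assumptions for illustration, not LinkedIn's actual heuristics.

```python
def should_hold_for_review(engagements_last_hour, follower_count,
                           rate_threshold=0.5):
    # Assumed heuristic: engagement far out of proportion to the author's
    # own network is a viral signal; hold the post for human review
    # instead of letting the ranker amplify it further.
    rate = engagements_last_hour / max(follower_count, 1)
    return rate > rate_threshold

print(should_hold_for_review(900, 1000))  # True: 0.9 engagements/follower
print(should_hold_for_review(50, 1000))   # False: 0.05 is unremarkable
```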
Architecture Diagram
LinkedIn Feed Ranking Architecture (simplified overview)
Core Concepts
Two-Tower Model
Feature Store
Candidate Retrieval
GBDTs
Online/Offline Features
Diversity Injection
Tradeoffs & Design Decisions
Every architectural decision is a tradeoff. Here's what you gain and what you give up.
Strengths
- Two-tower retrieval enables millisecond candidate generation over millions of posts
- 1,000+ feature second-pass model captures nuanced member-post affinity signals
- Real-time feature store enables sub-5ms feature serving with offline/online consistency
- Diversity injection prevents filter bubbles and promotes balanced professional content
Weaknesses
- Two-tower architecture sacrifices fine-grained member-post interaction features for retrieval speed
- Multi-objective ranking optimization can create conflicting signals between engagement and quality
- Feature store consistency between training and serving is a persistent engineering challenge
- Diversity constraints necessarily degrade pure relevance scores, creating a quality-diversity trade-off
FAANG Interview Questions
Interview Prep: These questions appear in FAANG system design rounds. Focus on tradeoffs, not just what the system does.
These are real system design interview questions asked at Google, Meta, Amazon, Apple, Netflix, and Microsoft. Study the architecture above before attempting.
Q1. Design a news feed ranking system for a professional social network with 1B members. What's your high-level architecture?
Q2. Explain the two-tower model architecture. Why separate member and post encoding into different towers?
Q3. How would you build a feature store that serves 1,000+ features with sub-5ms latency and ensures training-serving consistency?
Q4. Your feed ranking model optimizes for engagement but users complain about clickbait. How would you incorporate content quality?
Q5. Design a diversity injection algorithm that ensures feed variety across content type, author, and topic dimensions.
Research Papers & Further Reading
The LinkedIn Feed: A Recommender System for Professionals
LinkedIn Engineering