MLSys Case Studies
TokenMixer-Large: Scaling Ranking Models
Embedding Features in Weights to Kill Retrieval Latency
A Blueprint for Scaling Recommender Systems
Decoupling Compute from Sequence Length in CTR Scaling
LinkedIn Semantic Search
Deep Neural Networks for YouTube Recommendations
LinkedIn's MixLM: 10x Faster LLM Ranking via Embedding Injection
xAI Recommendation System Deep Dive
Meta's GEM: Bringing LLM-Scale Architectures to Ads Recommendation
Engineering Airbnb's Embedding-Based Retrieval System
vLLM @ LinkedIn
Deep dive into "Memory for LLMs" architectures
Pinterest recommendation system evolutions through the years
Long sequence for recommendation systems
How LinkedIn built its GenAI platform
Compound AI systems
Near real-time personalization at LinkedIn
TikTok Real Time Recommendation algorithm scales to billions
Uber optimal feature discovery
Netflix ML platform
Reddit's ML Model Deployment and Serving Architecture
Meta AI platform
DoorDash monitoring
Uber model deployment