Case study · 2024-12-10 · 13 min read

Meta's GEM: Bringing LLM-Scale Architectures to Ads Recommendation

How Meta integrated LLM-scale architectures into their ads recommendation system through the GEM (Generative Embeddings Model) framework.

Tags: Meta, ads, LLM, recommendations, GEM

Introduction

Meta's GEM (Generative Embeddings Model) applies LLM-scale architectures to ads recommendation. This case study examines how Meta achieved significant improvements in ad relevance while staying within strict latency requirements.

The Challenge

Ads recommendation at Meta's scale presents unique challenges:

  • Billions of daily predictions across Facebook and Instagram
  • Strict latency SLAs (single-digit milliseconds)
  • Complex multi-stakeholder objectives (users, advertisers, platform)

GEM Architecture

Foundation Model Approach

GEM treats ads recommendation as a generative modeling problem:

  1. Pre-training: Learn rich representations from ad content and user interactions
  2. Fine-tuning: Adapt to specific prediction tasks
  3. Efficient inference: Deploy with optimized serving
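The pre-train/fine-tune split can be illustrated with a toy sketch: a frozen "pretrained" encoder provides representations, and only a small click-prediction head is trained on top. All names, dimensions, and the synthetic data here are illustrative assumptions, not Meta's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 stand-in: a "pretrained" encoder whose weights stay frozen.
W_frozen = rng.normal(size=(10, 4))

def encode(x):
    return np.tanh(x @ W_frozen)  # fixed pretrained representation

# Stage 2: fine-tune only a small click-prediction head on top.
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(float)   # synthetic click labels
H = encode(X)
w = np.zeros(4)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w)))
    w -= 0.5 * H.T @ (p - y) / len(y)   # gradient step on the head only

acc = float(((1.0 / (1.0 + np.exp(-(H @ w))) > 0.5) == y).mean())
```

Because the encoder is frozen, adaptation to a new prediction task touches only a tiny fraction of the parameters, which is what makes stage 2 cheap relative to stage 1.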

Model Components

  • Transformer encoder: Process ad creative and metadata
  • User sequence model: Capture temporal patterns
  • Cross-attention layers: Model user-ad interactions
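As an illustration of the third component, here is a minimal single-head cross-attention sketch in NumPy, in which encoded ad tokens attend over a user's interaction sequence. Shapes, names, and dimensions are illustrative, not GEM's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(ad_tokens, user_seq):
    """Each ad token attends over the user's interaction sequence.
    ad_tokens: (num_ad_tokens, d); user_seq: (seq_len, d)."""
    d = ad_tokens.shape[-1]
    scores = ad_tokens @ user_seq.T / np.sqrt(d)  # (num_ad_tokens, seq_len)
    weights = softmax(scores, axis=-1)            # rows sum to 1
    return weights @ user_seq                     # (num_ad_tokens, d)

ad = rng.normal(size=(4, 16))     # encoded ad creative + metadata tokens
user = rng.normal(size=(32, 16))  # user's recent interaction embeddings
out = cross_attention(ad, user)   # shape (4, 16)
```

The key design point is asymmetry: queries come from the ad side and keys/values from the user side, so the output is an ad representation conditioned on this specific user's history.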

Technical Deep Dive

Scaling Embeddings

GEM uses massive embedding tables:

  • Trillions of parameters in embedding layers
  • Distributed storage across GPU clusters
  • Gradient compression for efficient training
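A toy sketch of two of the infrastructure ideas above: modulo sharding of an embedding table across devices, and top-k gradient sparsification before communication. The placement scheme, sizes, and function names are illustrative assumptions, not Meta's implementation.

```python
import numpy as np

NUM_SHARDS = 4          # stand-in for GPUs/hosts holding embedding partitions
ROWS_PER_SHARD = 1000
DIM = 8

rng = np.random.default_rng(0)
shards = [rng.normal(size=(ROWS_PER_SHARD, DIM)) for _ in range(NUM_SHARDS)]

def lookup(feature_ids):
    """Route each feature id to its shard (modulo placement), fetch the row."""
    rows = []
    for fid in feature_ids:
        shard_idx, local_row = fid % NUM_SHARDS, fid // NUM_SHARDS
        rows.append(shards[shard_idx][local_row])
    return np.stack(rows)

def topk_compress(grad, k):
    """Keep only the k largest-magnitude entries of a gradient
    before communicating it (a simple sparsification scheme)."""
    idx = np.argsort(np.abs(grad))[-k:]
    return idx, grad[idx]

emb = lookup([3, 17, 2048])                       # (3, 8) embedding batch
idx, vals = topk_compress(rng.normal(size=DIM), k=2)
```

Real systems use smarter placement (load balancing, hot-row replication) and add error feedback so the dropped gradient mass is not lost, but the routing-plus-sparsification structure is the same.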

Multi-Task Learning

The model jointly optimizes:

  • Click prediction (CTR)
  • Conversion prediction (CVR)
  • Long-term value estimation
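A jointly weighted loss over these three heads might look like the following sketch; the task weights and the squared-error LTV term are illustrative assumptions, not GEM's actual objective.

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy for probability predictions p against labels y."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).mean())

def multi_task_loss(preds, labels, weights=(1.0, 1.0, 0.5)):
    """Weighted sum of CTR loss, CVR loss, and a squared-error LTV term."""
    w_ctr, w_cvr, w_ltv = weights
    return (w_ctr * bce(preds["ctr"], labels["ctr"])
            + w_cvr * bce(preds["cvr"], labels["cvr"])
            + w_ltv * float(np.mean((preds["ltv"] - labels["ltv"]) ** 2)))

preds = {"ctr": np.array([0.2, 0.8]), "cvr": np.array([0.1, 0.3]),
         "ltv": np.array([1.0, 2.0])}
labels = {"ctr": np.array([0.0, 1.0]), "cvr": np.array([0.0, 1.0]),
          "ltv": np.array([1.2, 1.8])}
loss = multi_task_loss(preds, labels)
```

Sharing a backbone across tasks is what gives the regularization effect noted later: each head's gradient constrains the representations the other heads must also use.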

Serving Optimization

Optimizations:

  • Embedding caching
  • Model quantization
  • Batched inference
  • Hardware acceleration
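Of these, model quantization is the easiest to illustrate: a symmetric int8 scheme stores weights in a quarter of the float32 memory at a small accuracy cost. This is a generic sketch of the technique, not Meta's serving stack.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization to int8."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 64)).astype(np.float32)
q, s = quantize_int8(w)
err = float(np.abs(w - dequantize(q, s)).max())  # bounded by the scale
```

With per-channel scales and int8 matmul kernels, the same idea also speeds up inference, since 4x less memory traffic typically dominates serving latency at this batch size regime.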

Results

  • X% improvement in ads relevance metrics
  • Maintained latency within strict SLAs
  • Reduced model complexity vs. ensemble approaches

Lessons for ML Engineers

  1. Generative approaches can improve discriminative tasks
  2. Scale requires careful infrastructure investment
  3. Multi-task learning provides natural regularization

Dive deeper into ads systems in our Ads Systems at Scale course.

Want to Go Deeper?

This article is part of our comprehensive curriculum on building ML systems at scale. Explore our full courses for hands-on learning.