About This Survey

This is a comprehensive survey of Joint-Embedding Predictive Architectures (JEPA), the family of self-supervised learning architectures proposed by Yann LeCun that learn by predicting in latent space rather than pixel space. The survey covers all known JEPA variants, from the foundational 2022 paper through the latest world models and robotic applications in 2026.
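To make "predicting in latent space" concrete, here is a minimal numerical sketch of the JEPA idea: a context encoder, an EMA-updated target encoder, and a predictor, with the loss computed between latent vectors rather than pixels. All dimensions, the linear encoders, and the stand-in optimizer step are illustrative toys, not any real JEPA configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only)
D_IN, D_LAT = 16, 8

# Context encoder, target encoder, and predictor as simple linear maps
W_ctx = rng.normal(size=(D_IN, D_LAT))
W_tgt = W_ctx.copy()                    # target encoder starts as a copy
W_pred = rng.normal(size=(D_LAT, D_LAT))

def jepa_loss(x_context, x_target):
    """Predict the target's latent from the context's latent.

    The loss lives entirely in latent space: no pixels are reconstructed.
    """
    z_ctx = x_context @ W_ctx   # encode the visible context
    z_tgt = x_target @ W_tgt    # encode the masked target (no gradient in practice)
    z_hat = z_ctx @ W_pred      # predictor bridges context latents -> target latents
    return np.mean((z_hat - z_tgt) ** 2)

def ema_update(tau=0.99):
    """Target encoder slowly tracks the context encoder (exponential moving average)."""
    global W_tgt
    W_tgt = tau * W_tgt + (1 - tau) * W_ctx

x_ctx = rng.normal(size=(4, D_IN))      # batch of context views
x_tgt = rng.normal(size=(4, D_IN))      # batch of target views
loss = jepa_loss(x_ctx, x_tgt)

# Stand-in for an optimizer step (real training backpropagates the latent loss)
W_ctx = W_ctx - 0.01 * rng.normal(size=W_ctx.shape)
ema_update()
```

The EMA target encoder is one of the design choices the survey's PhD-level articles analyze in depth: it prevents the trivial collapse where both encoders map everything to the same latent.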

Dual-Level Explanations

Every JEPA variant is explained at two levels:

  1. Accessible Level — Step-by-step explanations using analogies and intuitions. No math prerequisites. Suitable for engineers, students, and anyone curious about how JEPA works.
  2. PhD Level — Full mathematical treatment with equations, derivations, proofs, and analysis of why each design choice works. Written for researchers and practitioners who need the complete picture.

Visual Approach

Inspired by Maarten Grootendorst's visual guides, every article includes:

  • Training diagrams — Complete architecture showing encoders, predictors, loss functions, and gradient flow
  • Inference diagrams — How the trained model is used for downstream tasks
  • Component diagrams — Detailed breakdowns of masking strategies, predictor architectures, and loss landscapes

Generation Pipeline

Each article goes through an autonomous write–review–fix loop:

  1. Paper content loaded from arXiv and source repositories
  2. Claude generates article HTML with equations, code, and SVG diagrams
  3. GPT-5.4 reviews the draft across 8 weighted dimensions
  4. If weighted score < 8.5, Claude revises based on reviewer feedback
  5. Up to 4 revision rounds per article

Author

Remigiusz Kinas

Research engineer focused on self-supervised learning and autonomous AI systems.