Encyclopedia

Every method as a short, self-contained card. ★ marks entries central to a biological world model.

Foundations (2)

The position papers and energy-based ideas that define what a world model is and why prediction in latent space matters.

A Path Towards Autonomous Machine Intelligence ★ (2022)

LeCun's blueprint for world-model-driven autonomous agents and the birth of the JEPA idea.

Introduction to Latent Variable Energy-Based Models (2023)

A tutorial grounding JEPA in latent-variable energy-based modeling and inference-as-optimization.

Core Architectures (4)

The canonical JEPA models for images and video that established the recipe.

The first image JEPA: predict latent representations of target blocks from one context block, no augmentations.

Self-supervised video representations by predicting masked spatiotemporal features in latent space.

A shared encoder learning optical flow (motion) and content features jointly via multi-task joint-embedding.

V-JEPA 2 extended with dense and deep self-supervision plus multimodal tokenizers for dense features.

Theory & Analysis (12)

Why JEPAs work, when they collapse, and what they provably learn.

A theory of JEPAs pinpointing the isotropic Gaussian as the optimal embedding, enforced by SIGReg.

A lightweight single-GPU library unifying energy-based JEPAs for images, video, and planning.

JEPAs Focus on Slow Features (2022)

JEPAs preferentially encode slowly varying factors, linking latent prediction to slow feature analysis.

How JEPA Avoids Noisy Features (2024)

Deep linear self-distillation analysis explaining JEPA's implicit bias against noisy, unpredictable features.

Connecting JEPA with Contrastive SSL (2024)

Formally relates the non-contrastive JEPA objective to contrastive self-supervised learning.

Why and How Auxiliary Tasks Improve JEPA (2025)

Characterizes when and why auxiliary objectives improve JEPA pretraining and representations.

Image World Models (IWM) (2024)

Image World Models generalize I-JEPA by conditioning the latent predictor on input transformations.

LiDAR: Sensing Linear Probing Performance (2023)

A label-free metric predicting linear-probe quality of joint-embedding SSL models from embedding rank.

Variance-Invariance-Covariance regularization: an explicit anti-collapse objective reused across JEPA variants.
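The three terms named in this card can be made concrete with a small sketch. The code below is a minimal, pure-Python illustration of a variance-invariance-covariance objective (commonly known as VICReg), not a reference implementation: the function name `vicreg_loss`, the weights `inv_w`, `var_w`, `cov_w`, the target standard deviation `gamma`, and `eps` are assumptions chosen for illustration, and normalization details vary between write-ups.

```python
import math

def vicreg_loss(za, zb, inv_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """za, zb: two views' embeddings, each a list of N vectors of D floats (N >= 2).

    All weights and constants are illustrative assumptions.
    """
    n, d = len(za), len(za[0])

    # Invariance: mean squared error between the two views' embeddings.
    inv = sum((a - b) ** 2
              for ra, rb in zip(za, zb)
              for a, b in zip(ra, rb)) / (n * d)

    def var_and_cov(z):
        # Center each embedding dimension over the batch.
        means = [sum(row[j] for row in z) / n for j in range(d)]
        centered = [[row[j] - means[j] for j in range(d)] for row in z]
        # Unbiased D x D covariance matrix of the batch.
        cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
                for j in range(d)] for i in range(d)]
        # Variance: hinge that pushes each dimension's std up to gamma.
        var_term = sum(max(0.0, gamma - math.sqrt(cov[j][j] + eps))
                       for j in range(d)) / d
        # Covariance: penalize squared off-diagonal entries (decorrelation).
        cov_term = sum(cov[i][j] ** 2
                       for i in range(d) for j in range(d) if i != j) / d
        return var_term, cov_term

    va, ca = var_and_cov(za)
    vb, cb = var_and_cov(zb)
    return inv_w * inv + var_w * (va + vb) + cov_w * (ca + cb)


collapsed = [[1.0, 1.0]] * 3                                   # every sample identical
spread = [[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]    # decorrelated, std > 1
loss_collapsed = vicreg_loss(collapsed, collapsed)
loss_spread = vicreg_loss(spread, spread)
```

In this toy setup the collapsed batch is penalized by the variance term even though its invariance term is zero, which is the sense in which the objective is explicitly anti-collapse.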

Understanding SSL Dynamics without Contrastive Pairs (2021)

DirectPred analysis explaining how non-contrastive SSL avoids collapse via predictor and EMA dynamics.
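For readers unfamiliar with the EMA mechanism this card refers to, here is a minimal sketch of an exponential-moving-average target update as commonly used in non-contrastive SSL. It is an illustration only: the function name `ema_update`, the flat list-of-floats weight representation, and the momentum `tau=0.99` are assumptions, and the sketch does not reproduce the paper's analysis.

```python
def ema_update(target, online, tau=0.99):
    """Return new target weights as tau * target + (1 - tau) * online.

    target, online: equal-length lists of floats (hypothetical flat weights).
    tau: momentum in [0, 1]; values near 1 make the target change slowly.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


# With the online weights held fixed, the target drifts toward them geometrically.
target = [0.0, 0.0]
online = [1.0, -1.0]
for _ in range(500):
    target = ema_update(target, online)
```

The slowly moving target gives the online branch a stable regression target; in practice the target branch also receives no gradient.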

Recasts the JEPA objective in a variational, probabilistic framework over latent representations.

Gaussian Joint Embeddings (2026)

Studies Gaussian embedding distributions as the target for joint-embedding self-supervised learning.

World Models, Robotics & Planning (12)

Action-conditioned latent models that predict consequences and plan — the heart of world modeling.

V-JEPA 2 ★ (2025)

Internet-scale video world model whose action-conditioned variant enables zero-shot robot planning.

What Drives Success in Physical Planning with JEPA World Models? (2025)

An empirical dissection of the design choices that make JEPA world models actually plan well.

ACT-JEPA ★ (2025)

A JEPA jointly predicting action and observation latents for sample-efficient policy learning.

Value-Guided Action Planning with JEPA World Models (2025)

Guides action search in a JEPA world model with learned values, sharpening long-horizon planning.

Couples a vision-language-action model with a JEPA latent world model for grounded, predictive control.

Causal-JEPA ★ (2026)

Object-centric JEPA with entity-level latent masking for counterfactual reasoning and efficient planning.

Learning Invariant Visual Representations for Planning with JEPA World Models (2026)

Studies which invariances in JEPA representations help versus hurt downstream planning.

LeWorldModel ★ (2026)

Stable, teacher-free action-conditioned latent world model learned end-to-end from pixels via SIGReg.

Hierarchical Planning with Latent World Models (2026)

Hierarchical, multi-timescale planning in latent world models for long-horizon, compute-cheap control.

stable-worldmodel (2026)

A standardized, reproducible research stack for training, planning with, and evaluating world models.

When Does LeJEPA Learn a World Model? ★ (2026)

Theory of when a JEPA provably recovers latent world variables up to rotation — read as design guidance.

Pairs dense JEPA latent dynamics with a vision-language 'thinker' for long-horizon semantic guidance.

Biology & Drug Discovery (4)

Cells, genomes, proteins — latent world models for the drug-discovery pipeline.

BioJEPA-AC ★ (2026)

Action-conditioned JEPA building a world model for cells: predict how cell states respond to perturbations.

Cell-JEPA ★ (2026)

Masked latent prediction for single-cell transcriptomics; a robust state encoder, not a perturbation model.

JEPA-DNA ★ (2026)

Model-agnostic continual training grounding genomic foundation models with a global-embedding JEPA loss.

ProteinJEPA ★ (2026)

Latent prediction complements protein LMs; JEPA-only collapses, but MLM + masked-position JEPA wins.

Graphs & Molecules (2)

Latent prediction over graph structure, including molecular and polymer graphs.

Adapts the JEPA principle to graph-level SSL by predicting masked-subgraph embeddings.

Polymer-JEPA ★ (2025)

JEPA pretraining on polymer molecular graphs; largest gains in the scarce-label early-discovery regime.

Medical Imaging & Biosignals (8)

Ultrasound, echo, X-ray, EEG, ECG and brain dynamics as latent predictive foundation models.

S-JEPA (Signal-JEPA) (2024)

Signal-JEPA for EEG and brain-computer interfaces with seamless cross-dataset transfer via spatial attention.

A JEPA foundation model for brain dynamics with gradient positioning and spatiotemporal masking.

JEPA for ECG Classification (2024)

JEPA pretraining of latent ECG features boosts downstream ECG classification.

From Video to EEG: Adapting JEPA to Brain Signals (2025)

Transfers the JEPA video recipe to EEG, adapting joint-embedding prediction to brain signal analysis.

Multimodal JEPA for Imaging and Clinical Signatures ★ (2025)

Multimodal JEPA jointly embedding imaging and clinical data for mechanism-to-endpoint fusion.

A JEPA encoder for chest radiographs via latent prediction over X-ray images.

EchoJEPA ★ (2026)

Latent predictive foundation model for echo video; 18M echos, improved LVEF/RVSP, pediatric zero-shot.

JEPA for medical ultrasound; masked latent prediction beats pixel reconstruction on low-SNR speckle.

Audio & Speech (6)

Spectrogram and waveform JEPAs for general audio, music and speech.

JEPA adapted to audio spectrograms via curriculum masked latent prediction.

Design Choices in JEPA for General Audio (2024)

Systematic study of masking, encoders, and targets for general-audio JEPA.

Predicts compatibility of musical stems in a shared embedding space.

General audio representation learning following the JEPA latent-prediction recipe.

JEPA on raw waveforms for robust, augmentation-free audio foundation models.

JEPA-style latent prediction for distilling audio knowledge into lip-reading models.

3D & Point Clouds (3)

Self-supervised latent prediction over 3D shapes, scenes and point clouds.

JEPA for point clouds with a learned sequencer ordering point patches.

Latent masked-prediction pretraining for 3D object and scene representations.

Predicts 3D representations from 2D images for efficient cross-modal 2D->3D learning.

Time Series & Tabular (7)

Forecasting, anomaly detection and augmentation-free representation for sequences and tables.

Couples a JEPA latent space with prior-fitted networks for in-context forecasting.

T-JEPA (Trajectory Similarity) (2024)

Self-supervised trajectory embeddings for similarity search via latent prediction.

T-JEPA (Tabular) (2024)

Augmentation-free JEPA for tabular data via latent prediction over feature subsets.

Joint Embeddings Go Temporal (2025)

Extends joint-embedding self-supervision to temporal structure in time series.

Koopman Invariants in JEPAs (2025)

Explains emergent time-series clustering in JEPA embeddings via Koopman invariants.

Multi-resolution JEPA for predicting anomalies in time series.

Giving Sensors a Voice: Multimodal JEPA for Time Series (2026)

Multimodal JEPA producing semantic embeddings for sensor time series.

Earth Observation (4)

Remote-sensing and satellite JEPAs spanning resolutions and modalities.

Predicting Gradient is Better: SSL for SAR ATR with a JEPA (2023)

JEPA for SAR target recognition that predicts gradient-domain features rather than raw pixels.

One JEPA-style Earth-observation model spanning many resolutions, scales, and sensor modalities.

A JEPA tailored to efficient large-scale remote-sensing image retrieval via latent prediction.

Cross-modal predictive alignment extending JEPA to multi-sensor remote-sensing retrieval.

Language & Multimodal (3)

JEPA objectives for text, recommendation and text-image systems.

An energy-based JEPA aligning text and image embeddings for multimodal systems.

Adds a JEPA embedding-prediction objective to LLM training alongside next-token prediction.

JEPA for sequential recommendation, predicting masked item representations in language-embedding space.

Generative Modeling (3)

Using the JEPA objective for denoising and conditional generation.

D-JEPA (Denoising with JEPA) (2024)

Denoising with a JEPA: generative modeling cast as autoregressive denoising in embedding space.

Improving JEPA with Diffusion Noise (2025)

Improves JEPA by injecting diffusion-style noise into the joint-embedding prediction objective.

JEPA-T: text-conditioned joint-embedding prediction for controllable image generation.