JEPAwiki
EB-JEPA: A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures
Date: 2026-02-05
Modality: multi
Authors: Basile Terver, Randall Balestriero, Megi Dervishi, David Fan + 7 more
Tags: library, educational, practical, image, video, planning
Source: Full text

EB-JEPA

An open-source library that makes JEPA accessible for research and education. Provides modular, self-contained implementations of the entire JEPA pipeline — from image SSL to video prediction to action-conditioned world models — all trainable on a single GPU.
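The core JEPA training loop can be sketched in a few lines. This is an illustrative minimal sketch in PyTorch, not EB-JEPA's actual API: a context encoder and predictor are trained to match the embedding produced by an EMA-updated target encoder, with the loss computed entirely in latent space. All class and variable names here are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sketch of one JEPA training step (names are illustrative,
# not the EB-JEPA API): predict the target encoder's embedding of the
# full input from a masked "context" view, then update the target
# encoder as an exponential moving average of the online encoder.

class TinyEncoder(nn.Module):
    def __init__(self, dim_in=32, dim_out=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 64), nn.ReLU(), nn.Linear(64, dim_out)
        )

    def forward(self, x):
        return self.net(x)

encoder = TinyEncoder()
target_encoder = TinyEncoder()
target_encoder.load_state_dict(encoder.state_dict())
for p in target_encoder.parameters():
    p.requires_grad_(False)                # target gets no gradients
predictor = nn.Linear(16, 16)

opt = torch.optim.AdamW(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3
)

x = torch.randn(8, 32)                     # a batch of inputs
x_ctx = x * (torch.rand_like(x) > 0.5)     # crude "context" view (random masking)

z_ctx = predictor(encoder(x_ctx))          # predicted target embedding
with torch.no_grad():
    z_tgt = target_encoder(x)              # actual target embedding

loss = F.smooth_l1_loss(z_ctx, z_tgt)      # latent-space prediction loss
opt.zero_grad()
loss.backward()
opt.step()

tau = 0.99                                 # EMA update of the target encoder
with torch.no_grad():
    for p_t, p in zip(target_encoder.parameters(), encoder.parameters()):
        p_t.mul_(tau).add_(p, alpha=1 - tau)
```

The same skeleton underlies all three examples; what changes is the input modality and what the predictor conditions on.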

Three progressively complex examples

  1. Image representation learning: self-supervised JEPA on images. Achieves 91% probing accuracy on CIFAR-10.
  2. Video prediction: multi-step prediction in latent space on Moving MNIST. Demonstrates how image SSL principles scale to temporal modeling.
  3. Action-conditioned world model: learns to predict effects of control inputs. Achieves 97% planning success rate on Two Rooms navigation task.
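The video and world-model examples both rest on rolling the predictor forward in latent space; planning then reduces to searching over action sequences. A minimal sketch of this idea, using random shooting as the planner (all names and the planner choice are our assumptions, not EB-JEPA's implementation):

```python
import torch
import torch.nn as nn

# Illustrative sketch (not EB-JEPA's API): an action-conditioned
# predictor is rolled out K steps in latent space; planning samples
# candidate action sequences and keeps the one whose final latent
# lands closest to the encoded goal (random-shooting planning).

latent_dim, action_dim, K, n_candidates = 16, 4, 5, 64
predictor = nn.Linear(latent_dim + action_dim, latent_dim)

def rollout(z0, actions):
    """actions: (K, B, action_dim); returns the latent after K steps."""
    z = z0
    for k in range(actions.shape[0]):
        z = predictor(torch.cat([z, actions[k]], dim=-1))
    return z

z0 = torch.randn(1, latent_dim).expand(n_candidates, -1)  # current state latent
z_goal = torch.randn(1, latent_dim)                       # goal state latent

candidates = torch.randn(K, n_candidates, action_dim)     # sampled action plans
z_final = rollout(z0, candidates)                         # (n_candidates, latent_dim)
best = torch.cdist(z_final, z_goal).argmin()              # candidate closest to goal
plan = candidates[:, best]                                # (K, action_dim) chosen plan
```

Because prediction happens in latent space, no pixel-space decoder is needed at any point in the rollout or the plan evaluation.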

Design principles

  • Modular architecture: reusable components (encoders, predictors, regularizers, planners) that can be recombined
  • Single-GPU training: each example runs in a few hours on one GPU
  • Educational: clear documentation and code structure to teach JEPA principles
  • Comprehensive ablations: reveals critical importance of each regularization component for preventing collapse
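A common family of collapse-preventing regularizers in joint-embedding methods is the VICReg-style variance/covariance penalty; the sketch below shows that form as an assumption, not necessarily the exact regularizer EB-JEPA ablates. The variance term stops embedding dimensions from collapsing to constants, and the covariance term decorrelates them.

```python
import torch

# Sketch of a VICReg-style anti-collapse regularizer (a common choice
# in the literature; not confirmed to be EB-JEPA's exact form).

def variance_covariance_reg(z, gamma=1.0, eps=1e-4):
    z = z - z.mean(dim=0)                      # center over the batch
    std = torch.sqrt(z.var(dim=0) + eps)
    var_loss = torch.relu(gamma - std).mean()  # hinge: push each dim's std above gamma
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)                  # (d, d) batch covariance
    off_diag = cov - torch.diag(torch.diag(cov))
    cov_loss = (off_diag ** 2).sum() / d       # penalize cross-dimension correlation
    return var_loss, cov_loss

z = torch.randn(128, 16)                       # a batch of embeddings
var_loss, cov_loss = variance_covariance_reg(z)
```

Dropping either term is the kind of ablation that exposes collapse: without the variance hinge, the encoder can minimize the prediction loss by mapping everything to a constant.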

Why it matters

Production JEPA codebases (V-JEPA 2, etc.) are designed for large-scale training and are hard to navigate. EB-JEPA bridges the gap between theory and practice, providing a low barrier to entry for the JEPA framework.
