Non-record: JEPA v3 — span-masked I-JEPA + VICReg, val_bpb 1.2321 (#1581)
Open
aiejvn wants to merge 1 commit into openai:main from
Conversation
Builds on PR #1330 (JEPA v2 — why same-sequence next-k JEPA collapses in causal LMs). Two additions:
- **Span-masked JEPA:** The context encoder sees target spans replaced with a learned mask embedding (`jepa_mask_emb`) rather than the actual tokens; the target encoder sees the full unmasked sequence. This makes prediction genuinely hard: the context encoder cannot recover the target token from its own input and must rely on surrounding context. Bigram hash contributions are explicitly zeroed at masked positions to prevent the Cantor hash from leaking token identity. Span lengths are sampled from Geometric(mean=16) with 4 spans per sequence (~6% masked per step).
- **VICReg anti-collapse regularization:** Variance hinge and off-diagonal covariance penalty (V-JEPA style) are applied to the predictor-side representations at masked positions. This prevents the predictor from collapsing to a single point or low-rank subspace independently of the span masking. Target-side VICReg terms are monitored as diagnostics only — no gradient.
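A minimal sketch of the span-masking scheme, assuming PyTorch-style dense token and bigram-hash embeddings; the function names and tensor shapes are illustrative, not the PR's actual API. It samples 4 spans with Geometric(mean=16) lengths, substitutes the learned mask embedding at masked positions, and zeroes the bigram-hash contribution there:

```python
import torch

def sample_span_mask(seq_len: int, n_spans: int = 4, mean_span: int = 16) -> torch.Tensor:
    """Boolean mask of target spans: n_spans spans with Geometric(mean=mean_span)
    lengths, as in the PR (4 spans, mean length 16, ~6% masked per step)."""
    mask = torch.zeros(seq_len, dtype=torch.bool)
    p = 1.0 / mean_span  # Geometric counts failures; the +1 below gives mean = mean_span
    for _ in range(n_spans):
        length = int(torch.distributions.Geometric(probs=torch.tensor(p)).sample()) + 1
        start = int(torch.randint(0, max(seq_len - length, 1), (1,)))
        mask[start:start + length] = True
    return mask

def mask_context_inputs(tok_emb, bigram_emb, mask, mask_emb):
    """Build the context-encoder input: replace token embeddings at masked
    positions with the learned mask embedding, and zero the bigram-hash
    contribution there so the hash cannot leak target-token identity."""
    x = torch.where(mask[:, None], mask_emb, tok_emb)
    return x + bigram_emb * (~mask)[:, None]
```

The target encoder would consume the plain `tok_emb + bigram_emb` sum, so only the context side is ever blinded to the span contents.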
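The VICReg terms can be sketched as follows, assuming representations are gathered into an `(N, D)` batch of masked-position vectors; coefficients and the exact reduction are assumptions, not the PR's values. The target-side diagnostics would call the same function on detached tensors:

```python
import torch

def vicreg_penalty(z: torch.Tensor, var_target: float = 1.0, eps: float = 1e-4):
    """VICReg-style anti-collapse terms on predictor-side representations z: (N, D).
    The variance hinge pushes each feature's std above var_target; the covariance
    term penalizes squared off-diagonal entries of the feature covariance."""
    z = z - z.mean(dim=0)                      # center features over the batch
    std = torch.sqrt(z.var(dim=0) + eps)
    var_loss = torch.relu(var_target - std).mean()
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)                  # (D, D) feature covariance
    off_diag = cov - torch.diag(torch.diagonal(cov))
    cov_loss = off_diag.pow(2).sum() / d
    return var_loss, cov_loss
```

A fully collapsed batch (all rows identical) maximizes the variance hinge while a well-spread batch drives it to zero, which is exactly the failure mode the regularizer guards against independently of the span masking.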
**Optimizer bug fix (v2 regression):** In v2, `JEPAPredictor` and `jepa_mask_emb` were absent from all three optimizer groups — only `base_model.blocks` was iterated (verifiable in b4a428b). The predictor was frozen at zero-init for the entire v2 run. Fixed by explicitly routing predictor matrix params to Muon and scalar params to Adam.

Non-record reason: trained ~20 hr on 1× AWS A10G.
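The fixed routing can be sketched as below. Muon itself is external to PyTorch, so only the grouping is shown; `split_param_groups` is a hypothetical helper name, and the ndim threshold reflects the stated matrix-vs-scalar split. The key point is iterating the full model's parameters rather than a single submodule, which is what the v2 code failed to do:

```python
import torch
import torch.nn as nn

def split_param_groups(model: nn.Module):
    """Route every trainable parameter: matrices (ndim >= 2) to the Muon group,
    vectors/scalars (ndim < 2, e.g. biases and jepa_mask_emb) to the Adam group.
    Walking named_parameters() of the whole model — not just base_model.blocks —
    is what prevents the v2 bug of silently skipping the predictor."""
    muon_params, adam_params = [], []
    for _, p in model.named_parameters():
        if not p.requires_grad:
            continue
        (muon_params if p.ndim >= 2 else adam_params).append(p)
    n_trainable = sum(1 for p in model.parameters() if p.requires_grad)
    assert len(muon_params) + len(adam_params) == n_trainable  # nothing dropped
    return muon_params, adam_params
```

The closing assertion is cheap and would have caught the v2 regression at startup instead of after a full training run.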