
feat: consolidate moonshot stack into clean submission-ready training script#7

Draft
Copilot wants to merge 5 commits into main from copilot/create-clean-submission-ready-pr

Conversation


Copilot AI commented Apr 3, 2026

Multiple open PRs developed the moonshot feature stack in parallel branches that were never merged to main. This PR consolidates the best artifacts from all branches into a single coherent submission.

Changes

train_gpt_mlx_kl.py — upgraded to full moonshot build (1284 → 1848 lines)

Replaces the stripped-down main-branch version with the feature-complete build from PR #4 (copilot/continue-verify-and-merge-changes), adding:

  • EngramLite — gated multi-head bigram+trigram hash logit bias; replaces BigramHash when ENGRAM_LITE_ENABLED=1
  • SkipGramHash — non-adjacent token pair logit bias (SKIPGRAM_HASH_SIZE>0)
  • BackoffNgramMixer — causal Laplace-smoothed n-gram LM mixed at eval time; zero artifact cost, never serialized (NGRAM_MIXER_ENABLED=1)
  • Complementary Training — per-token CE weighting that down-weights bigram-easy tokens, forcing neural capacity toward hard tokens (COMPLEMENT_ALPHA=0.5)
  • GPTQ-lite per-row scale quantization (USE_GPTQ_LITE=1)
  • SmearGate, partial RoPE, depth-aware LN scale 1/√(layer+1), XSA on last N layers
  • Sliding-window eval + LoRA test-time training (TTT) during evaluation
  • All features default OFF — no impact on existing baseline runs
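
Of the features above, the BackoffNgramMixer is the most self-contained. As a rough sketch of the idea (the class name comes from the PR description, but the counting, backoff, and mixing details below are assumptions, not the actual code in train_gpt_mlx_kl.py):

```python
from collections import defaultdict

class BackoffNgramMixer:
    """Sketch: causal, Laplace-smoothed backoff n-gram LM mixed with the
    neural model's probabilities at eval time. Illustrative only."""

    def __init__(self, vocab_size, max_order=4, alpha=0.25, laplace=1.0):
        self.vocab_size = vocab_size
        self.max_order = max_order      # cf. NGRAM_MAX_ORDER
        self.alpha = alpha              # cf. NGRAM_ALPHA: weight on the n-gram term
        self.laplace = laplace
        # counts[n-1][context_tuple][token] -> count, for order-n n-grams
        self.counts = [defaultdict(lambda: defaultdict(int)) for _ in range(max_order)]
        self.totals = [defaultdict(int) for _ in range(max_order)]

    def observe(self, tokens):
        """Update counts causally from a token stream (no future leakage)."""
        for i, tok in enumerate(tokens):
            for n in range(1, self.max_order + 1):
                if i < n - 1:
                    continue
                ctx = tuple(tokens[i - n + 1:i])
                self.counts[n - 1][ctx][tok] += 1
                self.totals[n - 1][ctx] += 1

    def prob(self, context, token):
        """Back off to the highest order whose context has been seen."""
        for n in range(self.max_order, 0, -1):
            ctx = tuple(context[-(n - 1):]) if n > 1 else ()
            total = self.totals[n - 1].get(ctx, 0)
            if total > 0:
                c = self.counts[n - 1][ctx].get(token, 0)
                return (c + self.laplace) / (total + self.laplace * self.vocab_size)
        return 1.0 / self.vocab_size   # nothing observed yet: uniform

    def mix(self, model_prob, context, token):
        """Linear interpolation with the neural model's probability."""
        return (1 - self.alpha) * model_prob + self.alpha * self.prob(context, token)

# Toy usage on a 4-token vocabulary
m = BackoffNgramMixer(vocab_size=4, max_order=2, alpha=0.25)
m.observe([0, 1, 0, 1, 0, 1])
```

Because the counts live only in eval-time memory and are never written to the checkpoint, a mixer like this adds zero artifact cost, consistent with the "never serialized" claim above.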

Moonshot invocation:

ENGRAM_LITE_ENABLED=1 COMPLEMENT_ALPHA=0.5 NGRAM_MIXER_ENABLED=1 \
NGRAM_ALPHA=0.25 NGRAM_MAX_ORDER=4 python3 train_gpt_mlx_kl.py
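
The COMPLEMENT_ALPHA flag in the invocation above controls the Complementary Training weighting. A minimal sketch of one plausible weighting rule (the function name and the exact formula are assumptions; only the idea of down-weighting bigram-easy tokens comes from the PR description):

```python
import numpy as np

def complementary_ce_weights(ce_bigram, alpha=0.5):
    """Sketch: per-token loss weights that shrink for tokens a cheap bigram
    model already predicts well. ce_bigram is the bigram model's per-token
    cross-entropy in nats; exp(-CE) is its probability of the true token.
    Weights are normalized to mean 1 so the overall loss scale is unchanged.
    """
    raw = 1.0 - alpha * np.exp(-ce_bigram)   # easy token (low CE) -> small weight
    return raw / raw.mean()

# Toy example: tokens 0 and 2 are bigram-easy, tokens 1 and 3 are hard
ce_bigram = np.array([0.05, 3.0, 0.1, 4.0])
ce_neural = np.array([0.5, 2.0, 0.4, 2.5])
w = complementary_ce_weights(ce_bigram, alpha=0.5)
weighted_loss = float((w * ce_neural).mean())
```

With alpha=0.5 the easy tokens keep at least half their weight, so the neural model still sees every position; it is simply nudged to spend capacity where the bigram bias cannot help.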

pg_novel_ideas.md — added from PR #1 brainstorm branch

Comprehensive analysis of 8 approaches to sub-1.10 BPB, a ranked idea list, a dead-idea evidence table, and POC stubs. Previously existed only on copilot/brainstorm-novel-approaches.

CLAUDE.md

  • Fixed broken venv activation path (missing /bin/)
  • Added moonshot smoke test and full H100 run commands

- Upgrade train_gpt_mlx_kl.py to feature-complete version from PR #4:
  EngramLite, SkipGram, BackoffNgramMixer, Complementary Training,
  SmearGate, partial RoPE, LN scale, XSA, GPTQ-lite, TTT, sliding eval
- Add pg_novel_ideas.md comprehensive analysis from brainstorm branch
- Update module docstring to list all 17 innovations
- Fix CLAUDE.md venv activation path and add moonshot smoke test command
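
Among the features listed in the commit message, the "GPTQ-lite" per-row scale quantization is simple enough to sketch. The snippet below shows one common per-row scheme (symmetric int8 with one scale per output row); the function names are illustrative and the actual scheme in train_gpt_mlx_kl.py may differ:

```python
import numpy as np

def quantize_per_row(w, bits=8):
    """Sketch: symmetric per-row quantization. Each row of the weight
    matrix gets one float scale; values are rounded to signed integers."""
    qmax = 2 ** (bits - 1) - 1                       # 127 for int8
    scales = np.abs(w).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)      # guard all-zero rows
    q = np.clip(np.round(w / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return q.astype(np.float32) * scales

w = np.array([[0.5, -1.0, 0.25],
              [2.0,  0.0, -4.0]], dtype=np.float32)
q, s = quantize_per_row(w)
w_hat = dequantize(q, s)
```

Storing one scale per row (rather than per tensor) keeps rows with very different magnitudes from drowning each other out, at the cost of a few extra floats in the artifact.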

Agent-Logs-Url: https://github.com/kailean/parameter-golf/sessions/a0c7ea6e-8952-4355-8557-7137e4a94e4c

Co-authored-by: kailean <49617037+kailean@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Create a clean submission ready pull request" to "feat: consolidate moonshot stack into clean submission-ready training script" on Apr 3, 2026
Copilot AI requested a review from kailean April 3, 2026 21:53
- Fix orphaned clip_grad_tree function body by adding proper def line
- Remove verbose section separator comment blocks (17+ instances)
- Compact 26-line module docstring to 2-line summary
- Trim multi-line docstrings to single lines throughout
- Remove redundant inline comments that restate the code
- Remove unnecessary blank lines within function bodies
- Compact Hyperparameters class by removing section comment headers

All functionality, logic, algorithms, and class/function signatures preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: kailean <49617037+kailean@users.noreply.github.com>
