feat: consolidate moonshot stack into clean submission-ready training script #7
Draft
- Upgrade train_gpt_mlx_kl.py to feature-complete version from PR #4: EngramLite, SkipGram, BackoffNgramMixer, Complementary Training, SmearGate, partial RoPE, LN scale, XSA, GPTQ-lite, TTT, sliding eval
- Add pg_novel_ideas.md comprehensive analysis from brainstorm branch
- Update module docstring to list all 17 innovations
- Fix CLAUDE.md venv activation path and add moonshot smoke test command

Agent-Logs-Url: https://github.com/kailean/parameter-golf/sessions/a0c7ea6e-8952-4355-8557-7137e4a94e4c
Co-authored-by: kailean <49617037+kailean@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Create a clean submission ready pull request" to "feat: consolidate moonshot stack into clean submission-ready training script" on Apr 3, 2026
- Fix orphaned clip_grad_tree function body by adding proper def line
- Remove verbose section separator comment blocks (17+ instances)
- Compact 26-line module docstring to 2-line summary
- Trim multi-line docstrings to single lines throughout
- Remove redundant inline comments that restate the code
- Remove unnecessary blank lines within function bodies
- Compact Hyperparameters class by removing section comment headers

All functionality, logic, algorithms, and class/function signatures preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: kailean <49617037+kailean@users.noreply.github.com>
Multiple open PRs developed the moonshot feature stack in parallel branches that were never merged to main. This PR consolidates the best artifacts from all branches into a single coherent submission.

Changes

train_gpt_mlx_kl.py: upgraded to full moonshot build (1284 → 1848 lines)

Replaces the stripped-down main-branch version with the feature-complete build from PR #4 (copilot/continue-verify-and-merge-changes), adding:

- EngramLite (ENGRAM_LITE_ENABLED=1)
- SkipGram (SKIPGRAM_HASH_SIZE>0)
- BackoffNgramMixer (NGRAM_MIXER_ENABLED=1)
- Complementary Training (COMPLEMENT_ALPHA=0.5)
- GPTQ-lite (USE_GPTQ_LITE=1)
- LN scale 1/√(layer+1), XSA on last N layers

Moonshot invocation:

pg_novel_ideas.md: added from PR #1 brainstorm branch

Comprehensive analysis of 8 approaches to sub-1.10 BPB, ranked idea list, dead-idea evidence table, and POC stubs. Previously only existed on copilot/brainstorm-novel-approaches.

CLAUDE.md

Fixes the venv activation path (…/bin/) and adds the moonshot smoke test command.
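
For readers unfamiliar with this style of configuration, the feature toggles named in the train_gpt_mlx_kl.py section could be consumed roughly as in the sketch below. It is an illustration only and is not taken from the script: the MoonshotFlags class, its field names, the "0" defaults, and the zero-based layer index in ln_scale are all assumptions. Only the environment variable names and the 1/√(layer+1) formula come from this PR.

```python
import math
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MoonshotFlags:
    """Hypothetical container; the script's real Hyperparameters class may differ."""

    engram_lite: bool
    skipgram_hash_size: int
    ngram_mixer: bool
    complement_alpha: float
    use_gptq_lite: bool

    @classmethod
    def from_env(cls, env=None):
        # Fall back to the process environment when no mapping is supplied.
        env = os.environ if env is None else env
        # Defaults of "0" (feature off) are an assumption, not the script's.
        return cls(
            engram_lite=env.get("ENGRAM_LITE_ENABLED", "0") == "1",
            skipgram_hash_size=int(env.get("SKIPGRAM_HASH_SIZE", "0")),
            ngram_mixer=env.get("NGRAM_MIXER_ENABLED", "0") == "1",
            complement_alpha=float(env.get("COMPLEMENT_ALPHA", "0.0")),
            use_gptq_lite=env.get("USE_GPTQ_LITE", "0") == "1",
        )

    @property
    def skipgram_enabled(self):
        # Mirrors the SKIPGRAM_HASH_SIZE>0 condition from the PR description.
        return self.skipgram_hash_size > 0


def ln_scale(layer):
    """Scale 1/sqrt(layer + 1); a zero-based layer index is assumed."""
    return 1.0 / math.sqrt(layer + 1)


if __name__ == "__main__":
    # The hash size 4096 is an arbitrary illustrative value.
    flags = MoonshotFlags.from_env(
        {"ENGRAM_LITE_ENABLED": "1", "SKIPGRAM_HASH_SIZE": "4096"}
    )
    print(flags)
    print([round(ln_scale(i), 3) for i in range(4)])
```

Passing an explicit mapping to from_env keeps the parsing testable without mutating the real process environment. Under the zero-based assumption, the scale is 1.0 for the first layer and shrinks with depth.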