ARC-AGI Code Golf: Evolving the Shortest Programs

Evolved Python solutions for the NeurIPS 2025 Google Code Golf Championship using LLM-driven evolutionary optimization.

Progress Summary

Metric	Value
Solved	72 / 400 (18.0%)
Total Score	163,412 points
Avg Score/Task	2,270 points
% of Winner Avg	94.4% (winner: 2,405 pts/task)
Projected Final	~908,000 points (see PROJECTION.md)

Solved Problems (72)

Task	Pattern	Bytes	Score	Solution
`4c4377d9`	Vertical flip concat	24	2,476	solution.py
`3c9b0459`	180° rotation	40	2,460	solution.py
`44f52bb0`	Horizontal symmetry check	46	2,454	solution.py
`d631b094`	Collect non-zero cells	47	2,453	solution.py
`25d8a9c8`	Row uniformity check	50	2,450	solution.py
`0520fde7`	Grid AND comparison	57	2,443	solution.py
`22eb0ac0`	Matching edge markers fill	57	2,443	solution.py
`0d3d703e`	Color mapping (LUT)	58	2,442	solution.py
`1b2d62fb`	Conditional grid coloring	58	2,442	solution.py
`007bbfb7`	Outer product grid	65	2,435	solution.py
`2281f1f4`	Row/column intersection fill	67	2,433	solution.py
`28bf18c6`	Extract + duplicate shape	67	2,433	solution.py
`29c11459`	Horizontal line splitting	68	2,432	solution.py
`1e0a9b12`	Gravity (drop cells)	69	2,431	solution.py
`27a28665`	Pattern shape classification	70	2,430	solution.py
`017c7c7b`	Extend pattern + double	80	2,420	solution.py
`3428a4f5`	XOR halves by separator	88	2,412	solution.py
`1bfc4729`	Dual frame pattern	108	2,392	solution.py
`1fad071e`	Count 2x2 blue blocks	109	2,391	solution.py
`22168020`	Fill between endpoints	112	2,388	solution.py
`05269061`	Diagonal color cycle	113	2,387	solution.py
`1190e5a7`	Count cells by grid lines	124	2,376	solution.py
`137eaa0f`	Symmetry reflection	130	2,370	solution.py
`1cf80156`	Bounding box extraction	130	2,370	solution.py
`2bee17df`	Cross line fill	132	2,368	solution.py
`2204b7a8`	Border region coloring	137	2,363	solution.py
`08ed6ac7`	Column rank labeling	142	2,358	solution.py
`363442ee`	Fill bottom row pattern	144	2,356	solution.py
`28e73c20`	Spiral maze generation	149	2,351	solution.py
`3ac3eb23`	Diagonal checkerboard	150	2,350	solution.py
`2013d3e2`	Symmetry axis extraction	152	2,348	solution.py
`4258a5f9`	3×3 box around 5s	160	2,340	solution.py
`09629e4f`	Fill grid segments	170	2,330	solution.py
`239be575`	Small pattern movement	170	2,330	solution.py
`10fcaaa3`	2x2 tiling + diagonal 8s	174	2,326	solution.py
`1f876c06`	Diagonal line propagation	174	2,326	solution.py
`3aa6fb7a`	L-shaped 8s + corner 1	178	2,322	solution.py
`1f85a75f`	Extract rare color region	182	2,318	solution.py
`2dc579da`	Extract quadrant with anomaly	189	2,311	solution.py
`23581191`	Cross lines intersection	198	2,302	solution.py
`1e32b0e9`	Grid template completion	201	2,299	solution.py
`23b5c85d`	Smallest colored rectangle	201	2,299	solution.py
`0ca9ddb6`	Color spread from seeds	207	2,293	solution.py
`1caeab9d`	Line intersection marking	207	2,293	solution.py
`253bf280`	Connect 8s with 3s	207	2,293	solution.py
`178fcbfb`	Extend markers to lines	217	2,283	solution.py
`00d62c1b`	Fill enclosed regions	219	2,281	solution.py
`3de23699`	Extract marker-bounded region	225	2,275	solution.py
`0a938d79`	Alternating stripe pattern	237	2,263	solution.py
`3bdb4ada`	Middle row stripe	239	2,261	solution.py
`0dfd9992`	Color substitution pairs	239	2,261	solution.py
`0962bcdd`	T-junction detection	241	2,259	solution.py
`1c786137`	Corner rectangle frames	249	2,251	solution.py
`1f0c79e5`	Diagonal ray extension	261	2,239	solution.py
`025d127b`	Parallelogram to rect	266	2,234	solution.py
`1f642eb9`	Marker position projection	266	2,234	solution.py
`32597951`	Extract repeating tile	274	2,226	solution.py
`11852cab`	4-fold rotational symmetry	280	2,220	solution.py
`05f2a901`	Move shape to reference	326	2,174	solution.py
`06df4c85`	Grid line completion	378	2,122	solution.py
`1a07d186`	Line projection	434	2,066	solution.py
`0b148d64`	Quadrant extraction	454	2,046	solution.py
`2bcee788`	Color replacement by marker	465	2,035	solution.py
`22233c11`	Diagonal corner marking	474	2,026	solution.py
`045e512c`	Pattern replication	486	2,014	solution.py
`150deff5`	Grid extraction borders	494	2,006	solution.py
`228f6490`	Shape-to-hole matching	520	1,980	solution.py
`a64e4611`	Largest rectangle + cross	523	1,977	solution.py
`39a8645d`	Most frequent shape	526	1,974	solution.py
`2dd70a9a`	U-shape connector	673	1,827	solution.py
`1b60fb0c`	Segment extraction	1,026	1,474	solution.py
`0e206a2e`	Rotated template placement	1,135	1,365	solution.py

Unsolved Problems (328)

Analyzed Tasks (14)

Task ID	Pattern	Est. Difficulty
`234bbc79`	Bounding box intersection	Very Hard
`25d487eb`	Container fill with color	Hard
`25ff71a9`	Pattern isolation	Hard
`264363fd`	Grid region coloring	Very Hard
`272f95fa`	Grid cell quadrant coloring	Very Hard
`29623171`	Grid cell fill by quadrant	Hard
`29ec7d0e`	Fill missing pattern	Hard
`2c608aff`	Connect marked cross lines	Hard
`2dee498d`	Shape replication	Hard
`31aa019c`	Vertical background lines	Hard
`321b1fc6`	Find unique odd pattern	Hard
`3345333e`	Shape copy across shape	Hard
`3618c87e`	Grid splitting with marker	Hard
`3631a71a`	Remove colored block	Hard

Remaining Tasks (314)

Full list of remaining tasks:

Task ID	Task ID	Task ID	Task ID
`36d67576`	`36fdfd69`	`3906de3d`	`39e1d7f9`
`3af2c5a8`	`3bd67248`	`3befdf3e`	`3e980e27`
`3eda0437`	`3f7978a0`	`40853293`	`4093f84a`
`41e4d17e`	`4290ef0e`	`42a50994`	`4347f46a`
`444801d8`	`445eab21`	`447fd412`	`44d8ac46`
`4522001f`	`4612dd53`	`46442a0e`	`469497ad`
`46f33fce`	`47c1f68c`	`484b58aa`	`48d8fb45`
`4938f0c2`	`496994bd`	`49d1d64f`	`4be741c5`
`4c5c2cf0`	`50846271`	`508bd3b6`
`50cb2852`	`5117e062`	`5168d44c`	`539a4f51`
`53b68214`	`543a7ed5`	`54d82841`	`54d9e175`
`5521c0d9`	`5582e5ca`	`5614dbcf`	`56dc2b01`
`56ff96f3`	`57aa92db`	`5ad4f10b`	`5bd6f4ac`
`5c0a986e`	`5c2c9af4`	`5daaa586`	`60b61512`
`6150a2bd`	`623ea044`	`62c24649`	`63613498`
`6430c8c4`	`6455b5f5`	`662c240a`	`67385a82`
`673ef223`	`6773b310`	`67a3c6ac`	`67a423a3`
`67e8384a`	`681b3aeb`	`6855a6e4`	`68b16354`
`694f12f3`	`6a1e5592`	`6aa20dc0`	`6b9890af`
`6c434453`	`6cdd2623`	`6cf79266`	`6d0160f0`
`6d0aefbc`	`6d58a25d`	`6d75e8bb`	`6e02f1e3`
`6e19193c`	`6e82a1ae`	`6ecd11f4`	`6f8cd79b`
`6fa7a44f`	`72322fa7`	`72ca375d`	`73251a56`
`7447852a`	`7468f01a`	`746b3537`	`74dd1130`
`75b8110e`	`760b3cac`	`776ffc46`	`77fdfe62`
`780d0b14`	`7837ac64`	`794b24be`	`7b6016b9`
`7b7f7511`	`7c008303`	`7ddcd7ec`	`7df24a62`
`7e0986d6`	`7f4411dc`	`7fe24cdd`	`80af3007`
`810b9b61`	`82819916`	`83302e8f`	`834ec97d`
`8403a5d5`	`846bdb03`	`855e0971`	`85c4e7cd`
`868de0fa`	`8731374e`	`88a10436`	`88a62173`
`890034e9`	`8a004b2b`	`8be77c9e`	`8d5021e8`
`8d510a79`	`8e1813be`	`8e5a5113`	`8eb1be9a`
`8efcae92`	`8f2ea7aa`	`90c28cc7`	`90f3ed37`
`913fb3ed`	`91413438`	`91714a58`	`9172f3a0`
`928ad970`	`93b581b8`	`941d9a10`	`94f9d214`
`952a094c`	`9565186b`	`95990924`	`963e52fc`
`97999447`	`97a05b5b`	`98cf29f8`	`995c5fa3`
`99b1bc43`	`99fa7670`	`9aec4887`	`9af7a82c`
`9d9215db`	`9dfd6313`	`9ecd008a`	`9edfc990`
`9f236235`	`a1570a43`	`a2fd1cf0`	`a3325580`
`a3df8b1e`	`a416b8f3`	`a48eeaf7`	`a5313dff`
`a5f85a15`	`a61ba2ce`	`a61f2674`	`a65b410d`
`a68b268e`	`a699fb00`	`a740d043`	`a78176bb`
`a79310a0`	`a85d4709`	`a87f7484`	`a8c38be5`
`a8d7556c`	`a9f96cdd`	`aabf363d`	`aba27056`
`ac0a08a4`	`ae3edfdc`	`ae4f1146`	`aedd82e4`
`af902bf9`	`b0c4d837`	`b190f7f5`	`b1948b0a`
`b230c067`	`b27ca6d3`	`b2862040`	`b527c5c6`
`b548a754`	`b60334d2`	`b6afb2da`	`b7249182`
`b775ac94`	`b782dc8a`	`b8825c91`	`b8cdaf2b`
`b91ae062`	`b94a9452`	`b9b7f026`	`ba26e723`
`ba97ae07`	`bb43febb`	`bbc9ae5d`	`bc1d5164`
`bd4472b8`	`bda2d7a6`	`bdad9b1f`	`be94b721`
`beb8660c`	`c0f76784`	`c1d99e64`	`c3e719e8`
`c3f564a4`	`c444b776`	`c59eb873`	`c8cbb738`
`c8f0f002`	`c909285e`	`c9e6f938`	`c9f8e694`
`caa06a1f`	`cbded52d`	`cce03e0d`	`cdecee7f`
`ce22a75a`	`ce4f8723`	`ce602527`	`ce9e57f2`
`cf98881b`	`d037b0a7`	`d06dbe63`	`d07ae81c`
`d0f5fe59`	`d10ecb37`	`d13f3404`	`d22278a0`
`d23f8c26`	`d2abd087`	`d364b489`	`d406998b`
`d43fd935`	`d4469b4b`	`d4a91cb9`	`d4f3cd78`
`d511f180`	`d5d6de2d`	`d687bc17`
`d6ad076f`	`d89b689b`	`d8c310e9`	`d90796e8`
`d9f24cd1`	`d9fac9be`	`dae9d2b5`	`db3e9e38`
`db93a21d`	`dbc1a6ce`	`dc0a314f`	`dc1df850`
`dc433765`	`ddf7fa4f`	`de1cd16c`	`ded97339`
`e179c5f4`	`e21d9049`	`e26a3af2`	`e3497940`
`e40b9e2f`	`e48d4e1a`	`e5062a87`	`e509e548`
`e50d258f`	`e6721834`	`e73095fd`	`e76a88a6`
`e8593010`	`e8dc4411`	`e9614598`	`e98196ab`
`e9afcf9a`	`ea32f347`	`ea786f4a`	`eb281b96`
`eb5a1d5d`	`ec883f72`	`ecdecbb3`	`ed36ccf7`
`ef135b50`	`f15e1fac`	`f1cefba8`	`f25fbde4`
`f25ffba3`	`f2829549`	`f35d900a`	`f5b8619d`
`f76d97a5`	`f8a8fe49`	`f8b3ba0a`	`f8c80d96`
`f8ff0b80`	`f9012d9b`	`fafffa47`	`fcb5c309`
`fcc82909`	`feca6190`	`ff28f65a`	`ff805c23`

Solution Standards

All solutions must include documentation. See CONTRIBUTING.md for requirements.

Each solution directory must contain:

solution.py - The golfed Python code
README.md - Pattern description, algorithm explanation, and golf tricks used

Directory Structure

Each solved task has its own subdirectory:

code-golf/
├── 0520fde7/           # Grid AND comparison (57 bytes)
├── 017c7c7b/           # Extend pattern + double (80 bytes)
├── 00d62c1b/           # Fill enclosed regions (219 bytes)
├── a64e4611/           # Largest rectangle task (523 bytes)
├── evaluator.py        # Scoring and validation
├── tasks/              # All 400 ARC-AGI tasks (legacy)
└── README.md           # This file

Each task directory contains:

solution.py - Golfed Python solution
task.json - ARC-AGI task definition
README.md - Task-specific notes and evolution history

Why This Problem Matters

The Competition

The NeurIPS 2025 - Google Code Golf Championship challenged participants to write the shortest possible Python programs that correctly solve 400 ARC-AGI tasks.

Detail	Value
Prize Pool	$100,000
Tasks	400 (ARC-AGI public training set)
Scoring	`max(1, 2500 - bytes)` per correct solution
Maximum Score	1,000,000 (400 × 2500)
Deadline	October 30, 2025

Why Code Golf is Hard

Code golf is a unique optimization challenge:

Correctness is binary - A solution that fails ANY test case scores 0.001
Every byte matters - Saving 1 byte = +1 point
Semantic equivalence required - Transformations must preserve behavior
Language mastery needed - Exploiting Python quirks and shortcuts
Algorithm selection critical - Sometimes a completely different approach is shorter

Why This Matters for Evolution

Code golf is an ideal testbed for LLM-driven evolution because:

Clear fitness function: Byte count (lower = better)
Automatic verification: Run tests to check correctness
Rich mutation space: Syntax tricks, algorithm changes, refactoring
Transferable learnings: Tricks discovered on one task apply to others

The Evolution Approach

Unlike performance optimization (where we measure ops/sec), code golf evolution optimizes for minimum byte count:

fitness = correctness × (2500 - bytes) / 2500
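
The fitness formula above can be written as a small helper. This is a minimal sketch that assumes correctness is a boolean; the function name `fitness` is illustrative and is not taken from evaluator.py.

```python
def fitness(correct: bool, byte_count: int) -> float:
    """Fitness as defined above: correctness x (2500 - bytes) / 2500."""
    # An incorrect program gets zero fitness regardless of its length.
    return int(correct) * (2500 - byte_count) / 2500


long_but_right = fitness(True, 238)   # a correct 238-byte program
short_but_wrong = fitness(False, 24)  # an incorrect 24-byte program
```

Because correctness multiplies the whole expression, a shorter program only ever wins if it is still correct.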

Three-Stage Pipeline

┌─────────────────────────────────────────────────────────────────┐
│  Code Golf Evolution Pipeline                                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │ Stage 1:     │───▶│ Stage 2:     │───▶│ Stage 3:     │       │
│  │ Find Correct │    │ Apply Known  │    │ Discover New │       │
│  │ Solution     │    │ Tricks       │    │ Approaches   │       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│                                                                  │
│  "Make it work"      "Make it short"     "Make it shorter"      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
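
Stages 2 and 3 share one acceptance rule: a mutated program replaces the current champion only if it still passes every example and is strictly shorter. The sketch below illustrates that rule under stated assumptions: candidates arrive as source strings that define `solve`, the helper names `passes` and `select_shortest` are hypothetical, and the LLM mutation step itself is left out. The example pair is a toy mirror task, not one of the repository's tasks.

```python
import copy


def passes(src, examples):
    """True if the program in `src` maps every input to its expected output."""
    ns = {}
    try:
        exec(src, ns)  # defines solve() in a scratch namespace
        return all(ns["solve"](copy.deepcopy(i)) == o for i, o in examples)
    except Exception:
        return False  # crashes and a missing solve() count as failures


def select_shortest(champion, candidates, examples):
    """Keep the shortest source (in UTF-8 bytes) that is still correct."""
    best = champion
    for src in candidates:
        if len(src.encode()) < len(best.encode()) and passes(src, examples):
            best = src
    return best


# Toy task: append a vertically mirrored copy of the grid.
examples = [([[1, 2], [3, 4]], [[1, 2], [3, 4], [3, 4], [1, 2]])]
champion = "def solve(g):\n return g + g[::-1]"
candidates = [
    "solve=lambda g:g+g[::-1]",  # shorter and still correct
    "solve=lambda g:g+g",        # shorter still, but wrong
]
best = select_shortest(champion, candidates, examples)
```

The second candidate is rejected even though it is the shortest, which is the "correctness is binary" constraint in action.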

Golf Tricks Library

Tricks discovered during evolution, applicable to other tasks:

Structural Tricks

Trick	Before	After	Saves
Lambda over def	`def f(x):\n return E`	`f=lambda x:E`	~6 bytes
Star unpacking	`[0]+r+[0]`	`[0,*r,0]`	1 byte
Walrus reuse	`a=[0]*w;g=[a,...,a]`	`g=[a:=[0]*w,...,a]`	1 byte
Trailing comma	`s+=[(a,b),(c,d)]`	`s+=(a,b),(c,d),`	1 byte
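
The structural tricks only pay off if the short spelling behaves identically. The following sketch checks each pair from the table on small sample values and measures the byte counts; the variable names are illustrative.

```python
r, w = [1, 2], 3

# Star unpacking: same list, one byte shorter.
assert [0] + r + [0] == [0, *r, 0]

# Walrus reuse: bind the row while building the grid.
# Note that both rows alias the same list object in either spelling.
a = [0] * w; g1 = [a, a]
g2 = [b := [0] * w, b]
assert g1 == g2 and g2[0] is g2[1]

# Trailing comma: += accepts any iterable, so a bare tuple replaces the list.
s1 = []; s1 += [(1, 2), (3, 4)]
s2 = []; s2 += (1, 2), (3, 4),
assert s1 == s2

# Lambda over def: identical behaviour for a single-expression body.
def f(x):
    return x * 2
h = lambda x: x * 2
assert f(3) == h(3)

# Byte counts of the golfed spellings from the table.
sizes = {
    src: len(src)
    for src in ("[0]+r+[0]", "[0,*r,0]", "def f(x):\n return E", "f=lambda x:E")
}
```

The walrus form needs Python 3.8 or later, which matches the prerequisite listed under Quick Start.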

Comparison Tricks

Trick	Before	After	Saves
Chain bounds	`0<=a and a<H`	`H>a>=0`	6 bytes
Zero check (non-negative x)	`x==0`	`x<1`	1 byte
Nonzero check (non-negative x)	`x!=0`	`x>0`	1 byte
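
The comparison shortcuts rely on values being non-negative integers, which holds for ARC colours (0 to 9) and for indices already known to be in range. A small check of where the shortcuts agree with the long forms and where they break:

```python
H = 5

# Chained bounds agree with the explicit `and` form for every integer tested.
for a in range(-3, 9):
    assert (0 <= a and a < H) == (H > a >= 0)

# `x<1` and `x>0` match `x==0` and `x!=0` for every ARC colour value.
for x in range(10):
    assert (x == 0) == (x < 1)
    assert (x != 0) == (x > 0)

# Counter-example: a negative value breaks both shortcuts.
x = -1
broken = ((x == 0) != (x < 1), (x != 0) != (x > 0))
```

If a solution uses a negative marker value, these two shortcuts are no longer safe.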

Algorithm Tricks

Trick	Description	Savings
Padding for flood fill	Add border of 0s, start from corner	~20 bytes
Smart marker values	Choose markers that simplify final lookup	2-4 bytes
Direct row iteration	`for r in g` vs `for i in range(len(g))`	7+ bytes
Tuple indices	`(0,1,2)` instead of `range(3)`	2 bytes
max() for best	`b=max(b,new)` vs `if new>b:b=new`	5+ bytes
Unified dimension swap	`O[(v,i)[z]][(i,v)[z]]` for row/col toggling	10+ bytes
Merged loops	Combine histogram + rect finding in one pass	15+ bytes
`[*map(list,G)]`	Shorter deep copy than `[r[:]for r in G]`	2 bytes
`I=range` alias	The definition costs 8 bytes and each use saves 4, so it pays off from the third use	4 bytes/use
`x and Y or Z`	Shorter than `Y if x else Z` for truthy Y	2 bytes
E with default	`E(i,L,z,d=0):d or F` for edge-case bypass	2+ bytes
Algorithm swap	O(n⁴) brute-force can be shorter than O(n²)	70+ bytes
`-~x` for `x+1`	Bitwise-not trick: `-~(c-a)` equals `c-a+1`; pays off where `x+1` would need parentheses, e.g. `(x+1)*y` becomes `-~x*y`	1-2 bytes
`[0,]` fallback	Shorter than `[(0,)]` for empty fallback	2 bytes
Range truthiness	`if L` works for empty `range()` checks	4+ bytes
Lists in tuples	`[I(f),I(j+1,C)]` vs `[(I(f),f),...]` pairs	10+ bytes
Merged conditionals	`I(i-(i>A),i+(i<B)+1)` combines 3 checks	37 bytes
Tuple iteration	`for A,B,P,M,z in(t1),(t2):` vs list concat	7 bytes
Single list comp	`[f()for...for v in L]` flattens nested loops	22 bytes
Tuple vs range	`(i,i-(i>a),i+(i<b))` vs `I(i-(i>a),...)`	2 bytes
`*P` unpacking	`for a,b,*P,z in...` captures middle elements	2 bytes
1D array	`O=sum(G,[])` + `O[r*C+j]` vs `O=[*map(list,G)]` + `O[r][j]`	3 bytes
~-any trick	`~-any(x for...)` vs `all(x<1for...)`	1 byte
[0] fallback	`or[0]` vs `or[0,]` for single-element fallback	1 byte
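
Several of the algorithm tricks are small enough to verify directly. The sketch below checks a handful of them on a sample grid, including the case where `x and Y or Z` stops being equivalent to the conditional expression; the grid and variable names are illustrative.

```python
G = [[1, 0, 2], [0, 3, 0]]
C = len(G[0])  # number of columns

# Row-wise copy: new row lists, so edits do not leak back into G.
O = [*map(list, G)]
O[0][0] = 9
assert G[0][0] == 1

# 1D array: flatten once, then index with r*C+j instead of [r][j].
flat = sum(G, [])
assert all(flat[r * C + j] == G[r][j] for r in range(2) for j in range(C))

# -~x equals x+1 for ints and binds tighter than *, so no parentheses needed.
x = 4
assert -~x == x + 1 and -~x * 3 == (x + 1) * 3

# `x and Y or Z` matches `Y if x else Z` only while Y is truthy.
assert (1 and 7 or 5) == (7 if 1 else 5)
assert (1 and 0 or 5) != (0 if 1 else 5)  # a falsy Y falls through to Z

# Range truthiness: an empty range is falsy, a non-empty one is truthy.
assert not range(3, 3) and range(3)
```

The `x and Y or Z` caveat matters in ARC because colour 0 is falsy, so the trick is only safe when Y can never be 0.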

Quick Start

Prerequisites

Python 3.8+

Evaluate a Solution

cd showcase/code-golf

# Evaluate single task
python evaluator.py 00d62c1b solutions/00d62c1b.py

# Example output (recorded for an earlier 238-byte version; the solved table lists 219 bytes):
# {
#   "task_id": "00d62c1b",
#   "fitness": 0.9048,
#   "score": 2262,
#   "byte_count": 238,
#   "correct": true
# }

Evolve a Solution

# Use the /evolve-size skill
/evolve shortest Python solution for ARC task <task_id>

Technical Details

Scoring Formula

For each of the 400 tasks:

score = max(1, 2500 - byte_count) if correct else 0.001

Maximum per task: 2500 (0 bytes - impossible)
Practical ceiling: roughly 2,450 for most tasks (a 50-byte solution); the shortest solution here is 24 bytes (2,476 points)
Incorrect solutions: 0.001 (effectively zero)
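
As a quick check, the scoring rule can be expressed as a function and compared against rows of the solved table. The function name `score` is illustrative.

```python
def score(byte_count: int, correct: bool) -> float:
    """Per-task score: max(1, 2500 - byte_count) if correct, else 0.001."""
    return max(1, 2500 - byte_count) if correct else 0.001


# Rows from the solved table reproduce under this formula.
assert score(24, True) == 2476     # 4c4377d9
assert score(1135, True) == 1365   # 0e206a2e

# Length stops mattering past 2,499 bytes, and correctness dominates.
assert score(5000, True) == 1
assert score(24, False) == 0.001
```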

Solution Format

Each solution must define a solve function:

def solve(grid):
    # grid: List[List[int]] - input grid
    # return: List[List[int]] - output grid
    ...  # task-specific transformation goes here
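
To make the format concrete, here is a sketch of one transformation (a 180° rotation) written twice: once readably and once in golfed form bound to the required name `solve`. It is an illustration of the format only and is not claimed to be the contents of any solution.py in this repository.

```python
# Readable form: rotate the grid by 180 degrees.
def solve_readable(grid):
    flipped_rows = grid[::-1]                   # reverse the row order
    return [row[::-1] for row in flipped_rows]  # reverse each row

# Golfed form of the same transformation, bound to the required name.
SRC = "solve=lambda g:[r[::-1]for r in g[::-1]]"
ns = {}
exec(SRC, ns)
solve = ns["solve"]

grid = [[1, 2, 3], [4, 5, 6]]
assert solve(grid) == solve_readable(grid) == [[6, 5, 4], [3, 2, 1]]

# Size of the golfed source in bytes, which is what the scoring formula charges.
byte_count = len(SRC.encode())
```

Assigning a lambda to `solve` satisfies the same interface as a `def`, which is why the "Lambda over def" trick applies to almost every solution.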

Constraints

Python Standard Library only (no numpy, scipy, etc.)
Self-contained (no imports from other files)
Must pass all train AND test examples
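
The "train AND test" rule can be checked mechanically. The sketch below is not the repository's evaluator.py, whose internals are not shown here. It assumes the standard ARC task layout (a dict with `train` and `test` lists of `{"input", "output"}` pairs) and a hypothetical `evaluate` helper that returns the same fields as the example output under Quick Start. The task and its ID `toy` are made up for illustration.

```python
import copy
import json


def evaluate(task_id, task, src):
    """Score solution source `src` against every train and test pair."""
    ns = {}
    try:
        exec(src, ns)  # the source must define solve()
        correct = all(
            ns["solve"](copy.deepcopy(p["input"])) == p["output"]
            for p in task["train"] + task["test"]
        )
    except Exception:
        correct = False  # any crash counts as an incorrect solution
    n = len(src.encode())
    return {
        "task_id": task_id,
        "fitness": round(correct * (2500 - n) / 2500, 4),
        "score": max(1, 2500 - n) if correct else 0.001,
        "byte_count": n,
        "correct": correct,
    }


# Toy transpose task in ARC layout.
task = {
    "train": [{"input": [[1, 2], [3, 4]], "output": [[1, 3], [2, 4]]}],
    "test": [{"input": [[5, 6, 7]], "output": [[5], [6], [7]]}],
}
result = evaluate("toy", task, "solve=lambda g:[*map(list,zip(*g))]")
print(json.dumps(result, indent=2))
```

Running the solution on a deep copy keeps one example from corrupting the next if a golfed solution mutates its input in place.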

File Structure

showcase/code-golf/
├── README.md                    # This file
├── evaluator.py                 # Scoring and validation harness
├── tasks/                       # 400 ARC-AGI task JSONs
│   ├── 00d62c1b.json
│   ├── 0520fde7.json
│   └── ...
├── solutions/                   # Evolved Python solutions
│   ├── 00d62c1b.py             # 238 bytes (champion)
│   ├── 0520fde7.py             # 57 bytes (champion)
│   ├── a64e4611.py             # 541 bytes (champion)
│   └── 017c7c7b.py             # 54 bytes (baseline)
└── mutations/                   # Evolution logs
    ├── arc_fill_enclosed_regions.md
    ├── 0520fde7_evolution.md
    └── a64e4611_evolution.md

Reproducing Results

Step 1: Verify Existing Solutions

cd showcase/code-golf
python evaluator.py 00d62c1b solutions/00d62c1b.py
python evaluator.py 0520fde7 solutions/0520fde7.py

Step 2: Evolve a New Task

# Pick an unsolved task
ls tasks/ | head -20

# Evolve it
/evolve shortest Python solution for ARC task <task_id>

Step 3: Verify Improvement

python evaluator.py <task_id> solutions/<task_id>.py

What Works

Padding approach for flood-fill problems - dramatically simplifies boundary logic
Lambda over def - saves 6+ bytes in most cases
Direct iteration (for r in g) over index iteration (for i in range(len(g)))
Lookup tables - usually shorter than arithmetic formulas
Chain comparisons - H>a>=0<=b<W saves multiple and operators
Smart marker values - choose values that simplify final mapping
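
The padding approach is easiest to see in readable form. The sketch below is an ungolfed illustration and not the repository's solution: it assumes 0 is the background colour, uses 4-connectivity, and takes a hypothetical `fill` parameter for the replacement colour. It also uses star unpacking, a chained bounds check, tuple extension of the stack, and a marker value (-1) chosen so the final mapping stays simple.

```python
def fill_enclosed(grid, fill=4):
    """Recolour background cells (0) that cannot reach the grid border."""
    h, w = len(grid), len(grid[0])
    # Pad with a one-cell border of 0s so the outside becomes a single
    # connected region and the search can start from the corner.
    p = [[0] * (w + 2)] + [[0, *r, 0] for r in grid] + [[0] * (w + 2)]
    stack = [(0, 0)]
    while stack:  # iterative search, so no recursion limit to worry about
        i, j = stack.pop()
        if h + 2 > i >= 0 <= j < w + 2 and p[i][j] == 0:
            p[i][j] = -1  # marker: reachable from outside
            stack += (i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)
    # Strip the padding: markers go back to 0, leftover 0s were enclosed.
    return [[fill if c == 0 else max(c, 0) for c in r[1:-1]] for r in p[1:-1]]


grid = [
    [0, 3, 3, 3, 0],
    [0, 3, 0, 3, 0],
    [0, 3, 3, 3, 0],
]
filled = fill_enclosed(grid)
```

Without the padding, the search would need one start point per border cell; with it, a single corner start reaches every outside cell.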

What Doesn't Work

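Recursion - deep flood fills can exceed Python's default recursion limit, and raising it with sys.setrecursionlimit costs an import plus a call
Sets for stacks - |= syntax longer than tuple extension
Bitwise tricks - often need parentheses, so the result is the same length or longer
String lookups - indexing a string returns a one-character str, not an int, so the saving is lost to int()
Complex formulas - lookup tables usually shorter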

Competition Status

Metric	Current	Projected (Conservative)	Projected (Optimistic)	Winner
Tasks solved	72	400	400	400
Total score	163,412	~908,000	~920,000	962,070
Avg pts/task	2,270	2,270	2,300	2,405
% of winner	94.4%	94.4%	95.6%	100%
Est. Place	-	~100th	~80th	1st

Winner: Code Golf International (962,070 pts) - Final Leaderboard

Projection Methods

Conservative: Current average (2,270 pts/task) × 400 = ~908,000 pts → ~100th place
Optimistic (tier-weighted): Maintain tier averages = ~918,000 pts → ~80th place
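
The conservative projection is plain arithmetic on the figures in the Progress Summary and Competition Status tables. This sketch reproduces it, with variable names chosen for illustration:

```python
TASKS = 400
solved, total = 72, 163_412   # from the Progress Summary
winner_total = 962_070        # from the Competition Status table

avg = total / solved                 # current average per solved task
conservative = round(avg) * TASKS    # assume the average holds for all 400
pct_of_winner = conservative / winner_total

print(round(avg), conservative, f"{pct_of_winner:.1%}")
```

The assumption that the average holds is the weak point: the tasks solved first tend to be the shorter ones, so later tasks may pull the average down.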

See PROJECTION.md for detailed tier breakdowns.

This showcase demonstrates the /evolve-size capability. The techniques transfer to any code golf challenge.

References

NeurIPS 2025 - Google Code Golf Championship
ARC Prize - The ARC-AGI benchmark
Competition Details
François Chollet's announcement

Deterministic Reproduction

No external data files required (tasks embedded in tasks/)
No network requests during evaluation
Deterministic scoring (byte count is exact)
Same results every run
