forked from skyzh/tiny-llm
new test_1_week_1_day_1 #1
Open
I0-OVI wants to merge 66 commits into I0-OVI:main from skyzh:main
Conversation
Signed-off-by: Alex Chi <[email protected]>
Signed-off-by: Alex Chi Z <[email protected]>
* test: add a test case to cover week_1_day_3_task3
  Closes: #23
  Signed-off-by: Jiawei Zhao <[email protected]>
* fmt
  Signed-off-by: Alex Chi Z <[email protected]>
--------
Signed-off-by: Jiawei Zhao <[email protected]>
Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Alex Chi Z <[email protected]>
Add KV cache module imports to both tiny_llm and tiny_llm_ref packages to enable KV cache functionality. Include a comprehensive test suite for week 2 day 1 covering embedding operations, model inference with KV cache, and sequential token generation with offset support.
- Add KV cache imports to __init__.py files
- Create test_week_2_day_1.py with task 2-4 test coverage
- Support multiple Qwen2 model variants (0.5B, 1.5B, 7B)
- Include embedding call and as_linear functionality tests
- Add sequential generation tests with proper cache management
Signed-off-by: Alex Chi Z <[email protected]>
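The "sequential token generation with offset support" these tests cover can be sketched with a toy cache. This is a minimal sketch in pure Python; the class and method names are illustrative, not the package's actual API:

```python
class TinyKvCache:
    """Toy KV cache: appends new keys/values and tracks the offset
    (how many token positions have already been processed)."""

    def __init__(self):
        self.keys, self.values = [], []

    @property
    def offset(self):
        # Number of positions already cached; the next token's position.
        return len(self.keys)

    def update_and_fetch(self, new_keys, new_values):
        # Append the new step's keys/values and return the full history,
        # which attention then runs over.
        self.keys.extend(new_keys)
        self.values.extend(new_values)
        return self.keys, self.values


cache = TinyKvCache()
# Prefill with a 3-token prompt, then decode tokens one at a time;
# each decode step only feeds the single new token's key/value.
cache.update_and_fetch(["k0", "k1", "k2"], ["v0", "v1", "v2"])
for step in range(2):
    offset = cache.offset            # positions already cached
    keys, values = cache.update_and_fetch([f"k{offset}"], [f"v{offset}"])
print(cache.offset)  # → 5
```

The point the tests exercise is that generation with the cache at a nonzero offset must produce the same logits as recomputing attention over the whole prefix from scratch.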
Extract the string replacement operation outside the f-string expression to avoid a backslash in the f-string expression part, which is not allowed in Python syntax.
- Move the .replace('\n', ' ') operation to a separate variable
- Improves code readability and fixes a SyntaxError
Signed-off-by: Alex Chi Z <[email protected]>
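The fix described in this commit can be sketched as follows (variable names are illustrative; before Python 3.12, a backslash inside an f-string expression is a SyntaxError):

```python
# Broken form (shown as a comment, since it would not even parse on
# Python < 3.12):  message = f"output: {text.replace('\n', ' ')}"
text = "hello\nworld"

# Fix: perform the replacement outside the f-string first,
# then interpolate the resulting variable.
flattened = text.replace("\n", " ")
message = f"output: {flattened}"
print(message)  # → output: hello world
```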
Refer to another commit, because you can't find the RMSNorm implementation in the current mlx-llm repo (it has been replaced by the mlx fast implementation).
* Possible typo in week1-01-attention
  Hello, was going through the book! I'm not 100% sure of this, but after going through the tests for day1-task2, it looks like the w_qkv matrices and the w_o matrix have their shapes reversed. I confirmed by checking the mlx.nn.layers.linear.Linear weight, which has shape `[Output, Input]`. Since w_qkv's output is H x D and its input is E, the shape should be `[H x D, E]`.
* Oops, fix another typo
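The `[Output, Input]` convention above can be checked with a small shape sketch (pure Python; the values of E, H, D are illustrative, standing for embedding dimension, head count, and head dimension as in the comment above):

```python
# Linear layers store their weight as [output_features, input_features],
# so the forward pass is y = x @ W.T.
E, H, D = 6, 2, 4       # embedding dim, num heads, head dim (illustrative)

w_q_shape = (H * D, E)  # projects an E-dim input to an (H*D)-dim output
w_o_shape = (E, H * D)  # projects concatenated heads back to E dims


def matmul_t_shape(a, b):
    """Shape of a @ b.T, given a: [m, k] and b: [n, k]."""
    m, k = a
    n, k2 = b
    assert k == k2, "inner dimensions must match"
    return (m, n)


# x: [seq_len, E]; after x @ w_q.T the result is [seq_len, H*D]
q_shape = matmul_t_shape((5, E), w_q_shape)
print(q_shape)  # → (5, 8)
```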
* Revert "fix: Use non-traditional RoPE in Qwen2 test case. (#56)"
  This reverts commit bf3383d.
* Update week1-03-gqa.md with RoPE note and test command
  Added a note about using non-traditional RoPE and the testing command.
--------
Co-authored-by: Alex Chi Z. <[email protected]>
Signed-off-by: Alex Chi Z <[email protected]>
Resolves #50; applies the patch from there and updates pyproject / the lockfile to specify a newer version of mlx.
* Add CI for reference solution / building extensions
* Adjust tests to run build-ext-ref before testing
* Add sshx for debugging
* Fix nanobind in CMake
* Change when the workflow runs
This test requires the latest version of mlx, 0.29.1, since support for this was merged into mlx only a week ago: ml-explore/mlx#2564. I verified that the other tests still pass with the version upgrade.
* Add tests for week 2, day 6 - continuous batching
* Download model weights in GitHub Actions
* add speculative decoding
  Signed-off-by: Alex Chi Z <[email protected]>
* update readme
  Signed-off-by: Alex Chi Z <[email protected]>
--------
Signed-off-by: Alex Chi Z <[email protected]>
Signed-off-by: Connor1996 <[email protected]>
Co-authored-by: Yangchen Ye <[email protected]>
* docs: add instruction to download Qwen2-1.5B model
Signed-off-by: Connor1996 <[email protected]>
Extract the newline character to a variable to avoid a backslash in the f-string expression part.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude <[email protected]>
Signed-off-by: KKKZOZ <[email protected]>
- Add a complete quantized_matmul_impl_typed template function for CPU, which supports float16, float32, and bfloat16 data types
- Add float32 test cases for quantized_matmul
- Adjust the float32 tolerance in test utils for better precision
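A quantized matmul like the one this commit implements can be sketched as a dequantize-then-multiply reference. This is a minimal sketch in pure Python assuming per-row affine quantization (w = q * scale + bias); the actual C++ template, its grouping scheme, and its signature are not shown in this excerpt:

```python
def dequantize(q, scale, bias):
    """Affine dequantization of one weight row: w_k = q_k * scale + bias."""
    return [qi * scale + bias for qi in q]


def quantized_matmul(x, q_rows, scales, biases):
    """y[i][j] = sum_k x[i][k] * w[j][k], where weight row j is stored
    quantized as (q_rows[j], scales[j], biases[j])."""
    out = []
    for row_x in x:
        out_row = []
        for q, s, b in zip(q_rows, scales, biases):
            w = dequantize(q, s, b)  # recover the float row on the fly
            out_row.append(sum(xk * wk for xk, wk in zip(row_x, w)))
        out.append(out_row)
    return out


# Tiny example: integer-quantized weight rows with per-row scale/bias.
x = [[1.0, 2.0]]
q_rows = [[2, 4], [1, 3]]   # quantized weights
scales = [0.5, 1.0]          # per-row scales
biases = [0.0, -1.0]         # per-row biases
y = quantized_matmul(x, q_rows, scales, biases)
print(y)  # → [[5.0, 4.0]]
```

A typed implementation (float16/float32/bfloat16, as the commit describes) would do the same arithmetic with the accumulation type chosen per dtype, which is why the commit also loosens the float32 test tolerance.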
I just use the os library to check whether the backend is mlx or pytorch, and there is a specific function and test for each case.
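That dispatch can be sketched as follows. This is a hedged sketch: the environment-variable name `TINY_LLM_BACKEND` and the function names are illustrative, not taken from the PR:

```python
import os


def get_backend():
    # Read the backend choice from the environment, defaulting to mlx.
    # The variable name here is hypothetical.
    return os.environ.get("TINY_LLM_BACKEND", "mlx")


def run_backend_test(backend):
    # Dispatch to a backend-specific test, as the comment describes.
    if backend == "mlx":
        return "running mlx-specific test"
    elif backend == "pytorch":
        return "running pytorch-specific test"
    raise ValueError(f"unknown backend: {backend}")


os.environ["TINY_LLM_BACKEND"] = "pytorch"
result = run_backend_test(get_backend())
print(result)  # → running pytorch-specific test
```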