21 Jun 19:12

scouzi1966

dedad02

afm v0.9.13 Latest

afm v0.9.13

OpenAI-compatible local LLM inference for Apple Silicon (MLX + Apple Foundation Models).

Highlights since v0.9.12 (73 commits)

New models

cohere2_moe — Cohere North-Mini-Code (30B-A3B MoE). Correct across streaming, non-streaming, prefix-cache, and concurrency (#139).

⚡ Speculative decoding (quality-preserving)

--mtp — Qwen3.6 self-speculative decoding via the in-model MTP head → ~+52% decode.
--eagle3 <drafter> — dense Gemma4-31B EAGLE3 drafter → ~+30% decode.
Both work in streaming and non-streaming modes. Output is bit-exact to greedy on short generations and near-greedy on long ones.
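To illustrate why speculative decoding can reproduce greedy output, here is a toy sketch of the draft-then-verify loop. It is not afm's implementation: `target` and `draft` are hypothetical stand-ins for the full model and the drafter (MTP head or EAGLE3), and the toy functions at the bottom exist only to exercise the loop. A real engine verifies the whole draft in one batched forward pass, which is where the speedup comes from; this sketch calls the target once per token and so demonstrates correctness only.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token context to its greedy next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain autoregressive greedy decoding, used as the reference."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out[len(prompt):]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Draft up to k tokens, then keep only what the target would have produced."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The drafter proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks each proposed token in order.
        for token in proposal:
            expected = target(out)
            # Always commit the target's choice: on a match this is the drafted
            # token, on a mismatch it is the correction and the rest is dropped.
            out.append(expected)
            if token != expected:
                break
    return out[len(prompt):]


def toy_target(ctx: List[int]) -> int:
    """Toy deterministic 'model' (illustration only)."""
    return (sum(ctx) * 31 + 7) % 50


def toy_draft(ctx: List[int]) -> int:
    """Toy drafter that agrees with the target only some of the time."""
    return toy_target(ctx) if len(ctx) % 3 else 0
```

Because every committed token is the target's own greedy choice, the result matches `greedy_decode` regardless of how often the drafter is wrong; a poor drafter only reduces the number of tokens accepted per round.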

APIs & agent-friendliness

/v1/embeddings on the main server (Apple NaturalLanguage) (#132, #133).
Mid-stream cancel + /v1/tokenize / /v1/count_tokens + /openapi.json & /docs (#126).
vLLM-namespaced /metrics + Grafana dashboard (#122).
Apple-native Vision OCR and Speech transcription HTTP APIs (thanks @jesserobbins).
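As a rough client-side illustration of the `/v1/embeddings` endpoint, the sketch below builds the request without sending it. Several things here are assumptions rather than facts from these notes: the base URL and port, the placeholder model name, and that the request and response bodies follow the OpenAI embeddings shape (`model` and `input` in, a `data` list with `index` and `embedding` out).

```python
import json
import urllib.request

# Assumption: adjust to wherever your afm server is listening.
BASE_URL = "http://127.0.0.1:9999"


def build_embeddings_request(texts, model="placeholder-model", base_url=BASE_URL):
    """Build (but do not send) an OpenAI-style POST /v1/embeddings request."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings_response(payload):
    """Return the vectors from an OpenAI-style response, ordered by input index."""
    rows = sorted(payload["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]
```

To actually call a running server, pass the request to `urllib.request.urlopen`, decode the JSON response, and hand it to `parse_embeddings_response`.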

Performance & platform

Backported mlx-swift 0.31.3 adaptive-block SDPA → ~+10% decode at 16k context (the mlx-swift pin stays at 0.30.3).
Eager <think>-tag streaming + Metal-kernel prewarm. Swift 6 language mode migration.

Fixes

--no-think / server-default enable_thinking=false now actually disables thinking on reasoning models (was a silent no-op).
MTP reject path retains the committed token in the KV/GDN cache (fixes garbled output on longer generations).

Known limitations

--no-think + high --concurrent can corrupt output (#140). Default behavior unaffected; use lower concurrency or omit --no-think.
MTP is bit-exact to greedy on short generations; longer ones stay greedy-quality but may differ token-for-token.

Install

brew tap scouzi1966/afm && brew install scouzi1966/afm/afm   # or: brew upgrade afm
pip install macafm

SHA256 (afm-v0.9.13-arm64.tar.gz): 443bf74650fece15f7ce02663f6d5dd14a7b638c937f80262e426903a6abf42b
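If you download the tarball manually, you can check it against the published digest before unpacking. This is a minimal sketch using only the Python standard library; the local file path is whatever you saved the asset as, and the function names are illustrative.

```python
import hashlib
import hmac

# Digest published above for afm-v0.9.13-arm64.tar.gz.
EXPECTED_SHA256 = "443bf74650fece15f7ce02663f6d5dd14a7b638c937f80262e426903a6abf42b"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA-256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected.strip().lower())
```

For example, `verify_download("afm-v0.9.13-arm64.tar.gz")` should return `True` for an intact download of that asset.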

Contributors

jesserobbins and 16k

21 Jun 00:33

scouzi1966

nightly-20260621-97e6683

97e6683

afm-next (20260621 · 97e6683) Pre-release

Nightly build from main branch.

Commit: 97e6683
Date: 20260621
Version: 0.9.13-next.97e6683.20260621

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.
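The nightly version strings on this page appear to follow the pattern `<base>-next.<commit>.<YYYYMMDD>`. That layout is inferred from the examples here, not from a documented scheme, so treat the sketch below as a convenience for scripts that need to compare or sort nightlies. The type and function names are illustrative.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class NightlyVersion(NamedTuple):
    base: str      # e.g. "0.9.13"
    commit: str    # short or full commit hash
    built: date    # build date


# Inferred layout: <major.minor.patch>-next.<hex commit>.<YYYYMMDD>
_NIGHTLY = re.compile(r"^(\d+\.\d+\.\d+)-next\.([0-9a-f]{7,40})\.(\d{8})$")


def parse_nightly(version: str) -> NightlyVersion:
    """Split a nightly version string into base version, commit, and build date."""
    match = _NIGHTLY.match(version)
    if match is None:
        raise ValueError(f"not a nightly version string: {version!r}")
    base, commit, stamp = match.groups()
    return NightlyVersion(base, commit, datetime.strptime(stamp, "%Y%m%d").date())
```

Sorting parsed values by their `built` field orders nightlies chronologically, which the raw strings do not guarantee because the commit hash comes before the date.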

Changes since last build (`a92a0fe`)

fix(test): repair EmbeddingsControllerTests against resolver-based init (97e6683)
feat(model): add cohere2_moe (Cohere North-Mini-Code) with all-mode correctness (#139) (4e46bc5)
docs(skill): nightly Step 4b also bumps the pinned pip wheel example in README (d2437d8)
docs(README): bump pinned pip nightly example to dev20260614 (2880dd4)
Update nightly release link to 20260614-a92a0fe (22e4aff)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

pip

pip install --extra-index-url https://maclocal-ai.pages.dev/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://maclocal-ai.pages.dev/afm/wheels/simple/ macafm-next   # nightly

14 Jun 16:12

scouzi1966

nightly-20260614-a92a0fe

a92a0fe

afm-next (20260614 · a92a0fe) Pre-release

Nightly build from main branch.

Commit: a92a0fe
Date: 20260614
Version: 0.9.13-next.a92a0fe.20260614

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.

Changes since last build (`5aad36d`)

feat(server): serve /v1/embeddings on the main server + advertise it (#132, #133) (895d20f)
docs(README): point pip wheel index at maclocal-ai.pages.dev (ef94c4e)
docs(README): highlight lossless speculative decoding (MTP + EAGLE3) (32ef421)
Update nightly release link to 20260613-5aad36d (75f43a9)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

pip

pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next   # nightly

13 Jun 11:18

scouzi1966

nightly-20260613-5aad36d

5aad36d

afm-next (20260613 · 5aad36d) Pre-release

Nightly build from main branch.

Commit: 5aad36d
Date: 20260613
Version: 0.9.13-next.5aad36d.20260613

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.

Changes since last build (`4bd0ec62a2cc39e4d69463f5ff8f1c119a3759ed`)

Merge speculative-decoding (MTP/EAGLE3 + streaming + benchmarks) into main (5aad36d)
docs: add "which flag for which model" decision section (4300baf)
merge afm-opt: PR #134 build/metallib + whitespace review fixes (0b3d120)
fix(build): address PR #134 review — metallib install guard, debug symlink, probe, whitespace (248a8b5)
fix(spec-stream): address PR #135 review — think injection, error propagation, cancellation (dea202c)
docs(bench): streaming spec-decode retest results (MTP + EAGLE3) (689cb83)
feat(spec): streaming support for MTP and EAGLE3 fast paths (e88011d)
docs: decode-optimization feature guide + release notes / social copy (e0935fd)
perf(eagle3)+bench: lossless bs=2 fast path; afm vs mlx-vlm verify-fidelity (f518a25)
bench(eagle3): afm vs mlx-vlm EAGLE3 head-to-head on dense Gemma4-31B (f535e21)
bench(qwen36): 2026-06-06 MLX engine re-run (latest) + MTP head-to-head (f5a080a)
feat(eagle3): P2/P3 — --eagle3 CLI + service routing, +22% decode on Gemma4-31B (81bd262)
feat(eagle3): P1 greedy speculative loop — output identical to greedy AR (7226736)
fix(gemma4): proportional RoPE on full-attention layers (was stock RoPE) (5a81b9b)
docs: document /v1/embeddings API and list embed/speech in help card (#131) (8f6a999)
feat(eagle3): P0 — Swift Gemma4Eagle3Drafter, bit-exact vs Python reference (e3f082c)
feat(eagle3): P0 reference-capture for the dense Gemma4-31B EAGLE3 port (4f8c9a4)
docs(eagle3): phased afm/Swift port plan for dense Gemma4-31B EAGLE3 (+25% validated) (80ea9b5)
bench(gemma4): dense 31B flips it — EAGLE3 +25% (vs MoE all-negative) (551dbb8)
bench(gemma4): spec-decode validation — all 3 methods SLOWER than AR on MoE (negative) (09720b0)
docs(mtp): record the +52% win; note --mtp-depth now vestigial (ade3ae2)
perf(mtp): rewrite loop after mlx-lm PR #990 — +52% decode vs AR (was +6%) (9d86fea)
docs(mtp): record final implementation result (runnable, +6.5% at depth 1) (b6dc626)
perf(mtp): depth-1 default beats AR (+6.5%); vectorized acceptance + instrumentation (05e288f)
feat(mtp): P2 runnable in afm via --mtp — correct, perf WIP (b628326)
feat(mtp): P2 — MTP self-speculative generator, output identical to greedy AR (11da477)
feat(mtp): P1 — GatedDeltaNet cache rollback, bit-exact (the make-or-break gate) (4474222)
chore(mtp): point P0 test + capture at the cache-root model location (9eeaf77)
feat(mtp): P0 — Swift Qwen3_5MTPHead, bit-exact vs Python reference (d2ab0af)
feat(mtp): P0 reference-capture harness for the Swift MTP head port (ad86bfe)
docs(mtp): phased afm/Swift port plan for MTP self-speculative decoding (b7042b8)
bench(ollama): use qwen3.6:27b-mlx (MLX tag), not the failing GGUF default (999ccb4)
bench(qwen36): rerun full 7-engine cross-engine suite + refresh plots/results (e72db8e)
fix(metallib): include random (RNG) kernel — was crashing sampled generation (401484c)
wip(bench): SDPA backport report/plot + metallib RNG-guard fix; harness fixes (9ef222f)
perf(mlx/sdpa): backport 0.31.3 adaptive-block 2-pass SDPA — decode@16k ~+10% (7d180f8)
docs(deps): reconfirm mlx-swift 0.30.3 pin — 0.31.3 still has long-context SDPA regression (ddc2c97)
perf(stream): eager think-tag emission — cut reasoning TTFT ~610ms -> ~346ms (f1343a6)
perf(mlx): prewarm Metal kernels on server startup (faster cold first-token) (33c247d)
test: Qwen3.6-27B local-engine performance benchmark (afm vs 6 engines) (259b5f0)
fix(swift6): box command+error across the CFRunLoop Task (Swift 6.3.2) (e569a75)
build: migrate to Swift 6 language mode (#130) (1a6ffc1)
feat(build): add --install flag + verify binary paths (7144459)
docs: prominent one-command build-from-source section in README (ac3d303)
build: add root-level build.sh entry point for clone-and-build (781cfb8)
fix(toolcall): prevent server crash on scalar JSON arg value (#128) (#129) (c55d54c)
test: post-merge validation reports for ab90ac1 (proof for #127) (cfb2676)
feat(agent): T1.4-T1.7 — cancel + tokenize + OpenAPI (rerun, stacked-merge correction) (#126) (ab90ac1)
feat(metrics): vLLM-namespaced /metrics + Grafana dashboard (#122) (a8dbffa)
feat(agent): Tier-0 promotion + Tier-1 quick wins (request id, stream usage, parallel_tool_calls) (#123) (ed68189)
docs(claude): require proof before labeling test failures pre-existing (46e2698)
Bump version to 0.9.13 for next dev cycle (1cbd60f)
README: move Install section above the fold (86429a8)
Release v0.9.12: promote nightly to stable (7dfc4f6)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

pip

pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next   # nightly

08 May 02:21

scouzi1966

v0.9.12

4bd0ec6

afm 0.9.12

Apple Foundation Models + MLX local models — OpenAI-compatible API, WebUI, all Swift.

Changes since v0.9.11

fix: resolve executable path via _NSGetExecutablePath, not argv[0] (4bd0ec6)
Add nightly test results for 2026-05-02 (7af438e) (4cbbe06)
Update nightly release link to 20260502-a589c50 (7af438e)
feat(embeddings): add /v1/embeddings backed by Apple NaturalLanguage (#119) (a589c50)
Feature: Expand vision API with barcode, classify, and saliency modes (#114) (b7ffe18)
Add on-device speech transcription and TTS (#113) (34001ec)
fix: afm -w falls back to ephemeral port when 9999 is busy (#116) (1e9f22c)
README: surface "What's new in afm-next" above Install (779e89e)
skill(promote-nightly): validate on staging tap before touching production (53e58fa)
README: remove staging tap from public docs (87c105e)
README: document installing previous versions of afm (3dae2dd)
Bump version to 0.9.12 for next dev cycle (7a23874)
Release v0.9.11: promote nightly to stable (94fdc35)
skill(promote-nightly): verify Apple framework links and bundle id in Step 5h (2717cdd)
Credit @jesserobbins for Vision OCR and Speech transcription (36e0194)
Fix publish-next.sh tap-staging: use full ${VERSION} not ${DATE} (90fd1d3)
Update nightly release link to 20260418-9c3225e (a08f840)

Install / Upgrade via Homebrew

Fresh install:

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm

Upgrade:

brew upgrade afm

Install via PyPI

pip install macafm==0.9.12

Contributors

jesserobbins

02 May 22:52

scouzi1966

nightly-20260502-a589c50

a589c50

afm-next (20260502 · a589c50) Pre-release

Nightly build from main branch.

Commit: a589c50
Date: 20260502
Version: 0.9.12-next.a589c50.20260502

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.

Changes since last build (`9c3225e`)

feat(embeddings): add /v1/embeddings backed by Apple NaturalLanguage (#119) (a589c50)
Feature: Expand vision API with barcode, classify, and saliency modes (#114) (b7ffe18)
Add on-device speech transcription and TTS (#113) (34001ec)
fix: afm -w falls back to ephemeral port when 9999 is busy (#116) (1e9f22c)
README: surface "What's new in afm-next" above Install (779e89e)
skill(promote-nightly): validate on staging tap before touching production (53e58fa)
README: remove staging tap from public docs (87c105e)
README: document installing previous versions of afm (3dae2dd)
Bump version to 0.9.12 for next dev cycle (7a23874)
Release v0.9.11: promote nightly to stable (94fdc35)
skill(promote-nightly): verify Apple framework links and bundle id in Step 5h (2717cdd)
Credit @jesserobbins for Vision OCR and Speech transcription (36e0194)
Fix publish-next.sh tap-staging: use full ${VERSION} not ${DATE} (90fd1d3)
Update nightly release link to 20260418-9c3225e (a08f840)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

pip

pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next   # nightly

Contributors

jesserobbins

20 Apr 15:50

scouzi1966

v0.9.11

9c3225e

afm 0.9.11

Apple Foundation Models + MLX local models — OpenAI-compatible API, WebUI, all Swift.

Changes since v0.9.10

Bump version baseline to 0.9.11 (post-v0.9.10 stable) (9b3f3bf)
Fix macOS 26 Speech Recognition SIGABRT: embed Info.plist via linker (b39cd60)
Resolve merge conflicts for PR #107 (e139985)
Address second round of Vision OCR review feedback (9039eb5)
Close taskRef/onCancel race window (8c10027)
Fix recognition task cancellation leak and lock race (4f19247)
Fix data race in speech recognition timeout (34406fd)
Address speech transcription review feedback (8b3a3f0)
Add on-device audio transcription via Apple Speech framework (7e18c90)
Address Vision OCR review feedback (38b16c5)
Fix Vision OCR in webui: bypass Foundation Model 4096 token limit (0292f3c)
Address Vision OCR review feedback (5159ea2)
Document Vision OCR API (b1d0f9c)
Add Vision OCR API and stabilize tests (7ab80b6)
Release v0.9.10: promote nightly to stable (332c8c2)
feat: versioned Homebrew formulae for afm and afm-next (#102) (7cff3df)
fix: handle 1D logits in TopPSampler (#100) (#101) (36ee874)
chore: bump wheel version to 0.9.10.dev20260408 (212d945)
Add nightly test results for 2026-04-07 (4f24281) (272cc13)
Update nightly release link to 20260408-628c2bb (4f24281)

Install / Upgrade via Homebrew

Fresh install:

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm

Upgrade:

brew upgrade afm

Install via PyPI

pip install macafm==0.9.11

18 Apr 15:43

scouzi1966

nightly-20260418-9c3225e

9c3225e

afm-next (20260418 · 9c3225e) Pre-release

Nightly build from main branch.

Commit: 9c3225e
Date: 20260418
Version: 0.9.11-next.9c3225e.20260418

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.

🙏 Acknowledgement

Huge thanks to first-time contributor @jesserobbins — this cycle landed two substantial features from him: the Apple Vision OCR HTTP API (#104) and Apple Speech transcription (#107). Both lift afm's Apple-native capabilities from CLI-only into first-class HTTP APIs compatible with the OpenAI-style surface that third-party clients already speak. Contributions of this size and quality from a new contributor are rare and appreciated.

Highlights

Apple Vision OCR HTTP API — POST /v1/vision/ocr for files, multipart uploads, base64, data URLs, and OpenAI-style image content parts. Multi-page PDF support, structured document/page/block/table output, Foundation chat auto-OCR integration. Contributed by @jesserobbins (#104).
Apple Speech transcription — on-device audio transcription via the Speech framework. New afm speech -f <file> CLI, POST /v1/audio/transcriptions API, chat input_audio content parts. Supports WAV/MP3/M4A/CAF/AIFF. Contributed by @jesserobbins (#107).
macOS 26 privacy fix — binary now embeds NSSpeechRecognitionUsageDescription via -sectcreate __TEXT __info_plist, so Speech Recognition actually works instead of SIGABRT'ing the process. First invocation from Terminal prompts for permission as expected (no Developer ID required). (#108)
Versioned Homebrew formulae — pinned nightly formulae afm-next@<full-version>.rb generated alongside the rolling afm-next.rb so users can brew install a specific nightly build. (#102)
TopPSampler 1D-logits crash fix — no longer crashes when concurrent batching meets top_p<1. (#100 / #101)

Changes since last build (`628c2bb`)

Fix publish-next.sh tap-staging: use full ${VERSION} not ${DATE} (90fd1d3)
Update nightly release link to 20260418-9c3225e (a08f840)
Bump version baseline to 0.9.11 (post-v0.9.10 stable) (9b3f3bf)
Fix macOS 26 Speech Recognition SIGABRT: embed Info.plist via linker (b39cd60)
Merge PR #107 (b7ecdbd, e139985) — speech transcription
Speech transcription hardening (8c10027, 4f19247, 34406fd, 8b3a3f0, 7e18c90)
Vision OCR + webui bypass (a3b60a5, 9039eb5, 38b16c5, 0292f3c, a1dda2b, 5159ea2, b1d0f9c, 7ab80b6)
feat: versioned Homebrew formulae for afm and afm-next (#102) (7cff3df)
fix: handle 1D logits in TopPSampler (#100) (#101) (36ee874)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

# Pinned to this exact nightly:
brew install scouzi1966/afm/afm-next@0.9.11-next.9c3225e.20260418

pip

pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next   # nightly

Contributors

jesserobbins

08 Apr 03:34

scouzi1966

v0.9.10

628c2bb

afm 0.9.10

Apple Foundation Models + MLX local models — OpenAI-compatible API, WebUI, all Swift.

Highlights

Gemma 4 support — text, vision-language, and MoE variants with tool calling
Gemma 4 concurrent batch mode — ~10x throughput via new BatchRotatingKVCache for sliding-window attention
Server-level --guided-json now actually constrains MLX requests (#97)
Concurrent / prefix-cache stability — resolved radix cache SIGTRAP on wrapped RotatingKVCache (#94), batched prefill lazy-graph overflow, and Metal buffer lifecycle issues under long runs (#88)
Performance — removed container.perform lock and actor serialization bottlenecks, raised SSE multiplex batch limit to 200, pipeline timing instrumentation via AFM_DEBUG=1
Request correlation IDs for end-to-end tracing across the server → scheduler → MLX path
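For readers unfamiliar with rotating caches for sliding-window attention, the toy sketch below shows the core idea: a fixed-size ring buffer that overwrites the oldest entry once the window is full. It is not afm's `BatchRotatingKVCache`; the class, its fields, and the use of plain Python values instead of key/value tensors are all simplifications for illustration.

```python
class RotatingCache:
    """Toy ring buffer keeping only the most recent `window` entries."""

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.slots = [None] * window
        self.total = 0  # number of entries ever written

    def append(self, item) -> None:
        # After the buffer wraps, new entries overwrite the oldest slot.
        self.slots[self.total % self.window] = item
        self.total += 1

    def visible_len(self) -> int:
        # The attention mask should cover at most `window` entries,
        # no matter how many have been written in total.
        return min(self.total, self.window)

    def in_order(self) -> list:
        # Physical slot order differs from logical order once wrapped,
        # so read back starting from the oldest surviving entry.
        start = self.total - self.visible_len()
        return [self.slots[i % self.window] for i in range(start, self.total)]
```

With a window of 3, appending the values 0 through 4 leaves 2, 3, 4 visible in logical order, even though they sit in the underlying slots as 3, 4, 2.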

Fixes since v0.9.9

--guided-json server flag now applied to every request (fixed in #97)
Gemma 4 batch mode, structured tool history, and Metal buffer lifecycle (#88)
Gemma 4 streaming + non-streaming tool call type coercion (array/object/int) (#87)
Radix cache SIGTRAP for wrapped RotatingKVCache (#94)
Root-cause batched prefill crash caused by MLX lazy graph overflow
Snapshot prefill state to prevent decode mutation corruption
BatchRotatingKVCache mask totalLen after circular buffer wrap
Flatten anyOf/oneOf nullable schemas for Jinja template safety
Structured output streaming regression
Queue requests instead of rejecting with server_busy
Homebrew libexec search path for metallib
Test harness: timestamped logs, format validation, Codex ARG_MAX, baseline tagging, spec extraction (#98)
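The nullable-schema fix above concerns tool schemas of the form `anyOf: [X, {"type": "null"}]`. As a sketch of what flattening such a schema can look like (the exact rules afm applies are not described here, so this is one plausible interpretation), the function below collapses an `anyOf`/`oneOf` that has exactly one non-null variant into that variant. It treats every nested dict as a schema, which is a simplification.

```python
from typing import Any

NULL_SCHEMA = {"type": "null"}


def flatten_nullable(schema: Any) -> Any:
    """Collapse anyOf/oneOf [X, null] into X, recursively; other schemas pass through."""
    if isinstance(schema, list):
        return [flatten_nullable(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    result = {key: flatten_nullable(value) for key, value in schema.items()}
    for keyword in ("anyOf", "oneOf"):
        variants = result.get(keyword)
        if not isinstance(variants, list):
            continue
        non_null = [variant for variant in variants if variant != NULL_SCHEMA]
        has_null = len(non_null) < len(variants)
        if has_null and len(non_null) == 1 and isinstance(non_null[0], dict):
            del result[keyword]
            # Sibling keys such as "description" win over the variant's own keys.
            result = {**non_null[0], **result}
    return result
```

Unions with two or more non-null variants are left untouched, since there is no single type to collapse them into.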

Known issues

TopPSampler + concurrent mode crash: on this exact commit, requests with 0 < top_p < 1 that reach the BatchScheduler (concurrent mode, llama.cpp WebUI default) abort with a fatal [squeeze] axis 0 error. The fix is on main (PR #101) and will ship in v0.9.11 or the next nightly. Workaround: use top_p=1.0 or temperature=0 in concurrent mode for now, or point the WebUI at the non-concurrent single-sequence server path.
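Clients that cannot easily change their defaults could apply the workaround just before sending a request. The helper below is a hypothetical client-side sketch: it assumes the request body is an OpenAI-style dict with optional `top_p` and `temperature` fields, and it forces `top_p` to 1.0 unless the request is already greedy (`temperature` of 0).

```python
def apply_top_p_workaround(request: dict) -> dict:
    """Return a copy of the request that avoids 0 < top_p < 1 in concurrent mode."""
    patched = dict(request)  # shallow copy; the caller's dict is left unchanged
    if patched.get("temperature") == 0:
        # temperature=0 is listed as a safe workaround on its own.
        return patched
    top_p = patched.get("top_p")
    if top_p is None or 0 < top_p < 1:
        # Set top_p explicitly so the outcome does not depend on a server default.
        patched["top_p"] = 1.0
    return patched
```

This is only needed on v0.9.10 itself; builds that include the fix from PR #101 should not require it.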

Install / Upgrade via Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm

Upgrade:

brew upgrade afm

Pin to this version specifically:

brew install scouzi1966/afm/afm@0.9.10

Install via PyPI

pip install macafm==0.9.10

08 Apr 00:57

scouzi1966

nightly-20260408-628c2bb

628c2bb

afm-next (20260408 · 628c2bb) Pre-release

Nightly build from main branch.

Commit: 628c2bb
Date: 20260408
Version: 0.9.10-next.628c2bb.20260408

This is an unstable development build. For the latest stable release, use brew install scouzi1966/afm/afm.

Changes since last build (`2b647b2`)

fix: --guided-json server flag and per-test spec extraction (#97, #98) (628c2bb)
fix: generate-report.py reads from RESULTS_FILE env var, not hardcoded path (6b36714)
fix: timestamp all temp files to prevent overwrites between runs (f10f72b)
fix: codex per-test scoring — local outside function, unbound var (5142185)
fix: handle unset AFM_BIN with set -u (e48e0f5)
fix: mlx-model-test.sh defaults to local build over PATH (2f88460)
fix: smart analysis reporting — codex ARG_MAX, format validation, baseline tagging (f582ac5)
refactor: address PR #95 review — extract helper, improve readability (2c7d205)
fix: skip radix save for wrapped RotatingKVCache to prevent SIGTRAP (#94) (48f88a8)
fix: Gemma 4 batch mode, structured tool history, and Metal buffer lifecycle (#88) (d5573d1)
Add nightly test results for 2026-04-04 (4 models) (44c048e)
fix: Gemma 4 tool call type coercion (array/object/int) (2aa2abd)
fix: flatten anyOf/oneOf nullable schemas for Jinja template safety (ce66763)
Merge feature/gemma4-batch-kvcache: 10x throughput for Gemma 4 (bc9de80)
Fix structured output streaming regression (4afb1ff)
fix: address PR review — empty cache merge safety, slot wait cancellation (731ca38)
refactor: raise SSE multiplex batch limit to 200, extract as constant (55ea028)
fix: root-cause batched prefill crash — MLX lazy graph overflow (329b114)
refactor: improve updateConcat alloc pattern, keep individual prefill (3d5f9e1)
feat: add request correlation ID for end-to-end tracing (ea12c89)
perf: add pipeline timing instrumentation (AFM_DEBUG=1) (65c093c)
perf: remove container.perform lock and actor serialization bottlenecks (a289c42)
fix: snapshot prefill state to prevent decode mutation corruption (6e16a21)
fix: BatchRotatingKVCache mask totalLen after circular buffer wrap (f650d65)
fix: updateConcat alloc size, debug prefillBatch B>=3 crash (38159f0)
debug: add SDPA shape logging and BatchRotatingKV tracing (22461ac)
Bypass adaptive XML for Gemma 4 tool calls (45910d7)
feat: BatchRotatingKVCache — Gemma 4 concurrent batch mode working (4006a11)
WIP: BatchRotatingKVCache — B=1 works, B>=2 segfaults in SDPA (c04eafc)
WIP: BatchRotatingKVCache for Gemma 4 batch mode (0f94cca)
refactor: extract magic numbers to named constants, add coding rule (4e9edf3)
fix: queue requests instead of rejecting with server_busy (a73b820)
fix: patch pin check matched wrong line (swift-docc-plugin) (16cf3da)
Fix Gemma 4 handling and consolidate repo skills (f0bd9c7)
chore: bump wheel version to 0.9.10.dev20260403, add test reports (7c38a5a)
Update nightly release link to 20260403-2b647b2 (9098e31)

Install / Upgrade

Homebrew

brew tap scouzi1966/afm
brew install scouzi1966/afm/afm-next    # fresh install
brew upgrade afm-next                    # upgrade existing
brew reinstall afm-next                  # force reinstall (same version, new build)

pip

pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next

Switching between stable and nightly

# Homebrew
brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
brew unlink afm-next && brew link afm                      # switch back to stable

# pip
pip install macafm          # stable
pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next   # nightly

Releases: scouzi1966/maclocal-api
