A single-page AAC (augmentative & alternative communication) speech device. Tap one word at a time; the backend predicts kid-safe next-word candidates so a non-speaking kid can build sentences fast.
Built for an 11-year-old named Adrian.
| Talk mode | Game mode |
|---|---|
| ![]() | ![]() |
The app has two modes — switch between them with the 🗣 talk / 🎮 game tabs at the bottom.
In talk mode:
- Tap a word in the grid to add it to the sentence strip at the top.
- After each tap, the grid refreshes with likely next words. Keep tapping until your sentence is done.
- ▶ play — speaks the whole sentence out loud (voice is muted by default — turn it on under ⚙️ settings → voice → on).
- ⌫ backspace — removes the last word.
- ✕ clear — wipes the sentence.
- ✨ more words — opens a bottom sheet with ~30 alternatives if the 8-button grid doesn't have what you want. (This is the only path that calls the LLM.)
- add a word or phrase input under the strip — type a new phrase like `i love octopus` and hit `+`. It's added to the trie immediately and shows up in future predictions.
- 📖 dictionary — live-filter the ~1.4k kid-safe word list and tap any word to add it.
- all done pill — appears when the trie thinks the sentence has reached a natural ending; tap it to speak + reset.
In game mode, a target phrase appears at the top (e.g. "i want pizza"). Tap the words in order to match it. Score, level, and lives carry across rounds, which makes it handy for practicing word order and reading without the pressure of free-form talking. The same grid + prediction pipeline powers both modes.
Settings live under ⚙️:
- buttons — 4 / 6 / 8 / 10 grid size
- voice — on / off (default: off, so the play button is silent until you turn it on)
- speed — TTS rate, 0.6×–1.4×
- game difficulty — easy / normal / hard
The whole product is two files:
- `next-word-predictor.html` — the entire frontend. Vanilla HTML/CSS/JS, no build step, served as static.
- `server.py` — FastAPI backend. Walks a hand-authored phrase trie (`phrases.json`) for the regular grid; only calls an LLM (OpenRouter / Qwen) when the user opens the "more words" sheet and the trie ran short.
Frontend lives on :8899, backend on :8900. Port 4001 is intentionally untouched.
pip install -r requirements.txt
# Put your key in .env (either works):
echo 'OPENROUTER_KEY=sk-or-...' >> .env
# or:
echo 'GROQ_API_KEY=gsk_...' >> .env
./start.sh # boots static (8899) + backend (8900) in background
# ./start.sh fg # foreground backend, useful for iterating on server.py

Open http://localhost:8899/next-word-predictor.html.
Watch the backend:
tail -f /tmp/aac-backend.log
curl http://localhost:8900/health

`POST /predict {sentence: [...words], n: 8} → {words: [...]}` is the whole product.
- Walk the trie. `_walk_trie(sentence)` descends `phrases.json` token-by-token. The keys at the resting node are the candidate next words.
- Top up if short from the trie's top-level openers.
- LLM only for "more words." When `n > 8` (the bottom sheet asks for 30) AND the trie is still short, call OpenRouter once. The 8-button regular grid almost never touches the LLM.
- Sort + filter. Alphabetical, drop the just-tapped word, drop the `DENY` set, enforce `_ok_word` (alphabetic, 1–14 chars). `required` word (game mode) is pinned first.
- Last-resort fallback to a small hardcoded list.
- `done_prob` drives the "all done" pill — derived from trie shape (resting node has `""` child → 0.6).
- Log every `/predict` + `/pick` + `/add_phrase` to `predictions.jsonl`.
phrases.json is the source of truth. /add_phrase deep-merges new phrases into it at runtime; future predictions see them immediately.
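As a rough picture of what "deep-merge" means here, the sketch below folds one phrase into a nested-dict trie in place. It assumes the same shape as the `""` end marker described above; the function name `add_phrase`, the lowercase-and-split tokenizer, and the validation rule are illustrative guesses, and writing the result back to `phrases.json` is omitted.

```python
def add_phrase(trie: dict, phrase: str) -> list[str]:
    """Tokenize, validate, and merge one phrase into a nested-dict trie."""
    tokens = phrase.lower().split()
    # Assumed validation: every token alphabetic and at most 14 characters.
    if not tokens or not all(t.isalpha() and len(t) <= 14 for t in tokens):
        raise ValueError(f"rejected phrase: {phrase!r}")
    node = trie
    for tok in tokens:
        # Reuse an existing branch when there is one, create it otherwise.
        node = node.setdefault(tok, {})
    node.setdefault("", {})   # mark that a phrase can end at this node
    return tokens


trie = {"i": {"love": {"you": {"": {}}}}}
add_phrase(trie, "i love octopus")
# "octopus" now sits next to "you" under i -> love; "you" is untouched.
```

Because the merge only ever adds keys, existing phrases survive, and a walk over the same dict sees the new branch on the very next request.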
| Method | Path | What it does |
|---|---|---|
| POST | `/predict` | `{sentence, n, required}` → `{words, done_prob, debug}` |
| POST | `/pick` | log offer→pick |
| POST | `/add_phrase` | tokenize, validate, deep-merge into the trie, persist |
| GET | `/dictionary?q=…&limit=…` | prefix then substring matches over `dictionary.json` |
| GET | `/health` | `{ok, model, trie_openers, dict_size, …}` |
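The "prefix then substring" ordering for `/dictionary` can be pictured as below. This is a sketch under assumptions: the function name `search_dictionary`, the default `limit`, and the behaviour for an empty query are not taken from `server.py`.

```python
def search_dictionary(words: list[str], q: str, limit: int = 20) -> list[str]:
    """Return words starting with q first, then words merely containing q."""
    q = q.strip().lower()
    if not q:
        return words[:limit]          # assumed behaviour for an empty query
    prefix = [w for w in words if w.startswith(q)]
    substring = [w for w in words if q in w and not w.startswith(q)]
    return (prefix + substring)[:limit]


words = ["doctor", "octopus", "october", "cat"]
print(search_dictionary(words, "oct"))   # ['octopus', 'october', 'doctor']
```

Putting prefix hits first means the word a kid is most likely typing toward shows up at the top, while substring hits still catch words like "doctor".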
# Manual predict:
curl -s -X POST http://localhost:8900/predict \
-H 'Content-Type: application/json' \
-d '{"sentence":["i","want"],"n":8}' | jq
# Add a phrase at runtime (no restart):
curl -s -X POST http://localhost:8900/add_phrase \
-H 'Content-Type: application/json' \
-d '{"phrase":"i love octopus"}' | jq
# Search the kid-safe dictionary:
curl -s 'http://localhost:8900/dictionary?q=oct&limit=20' | jq

- Add canned phrases offline: drop a JSON chunk into `phrase_chunks/`, run `python3 merge_phrases.py` to rebuild `phrases.json`, restart the backend.
- Add one phrase at runtime: use the in-app "add a word or phrase" input, the 📖 dictionary modal, or `POST /add_phrase`.
- Add to the kid-safe dictionary: edit `dictionary.json` (flat lowercase array), restart.
- Change the model: `MODEL_NAME` in `server.py`. (The endpoint variable is named `GROQ_URL` for historical reasons but points at OpenRouter.)
promptfooconfig.predict.yaml exercises /predict directly — backend must be running on :8900:
OPENROUTER_KEY=... npx promptfoo@latest eval -c promptfooconfig.predict.yaml

python3 analyze_picks.py # per-prefix offer/pick patterns from predictions.jsonl

next-word-predictor.html # the frontend (one file, no build)
server.py # FastAPI backend
phrases.json # ~19k phrase trie, hand-authored + runtime-mutated
dictionary.json # ~1.4k kid-safe words
phrase_chunks/ # source JSON chunks merged into phrases.json
merge_phrases.py # offline builder
analyze_picks.py # offer/pick analysis over predictions.jsonl
predictions.jsonl # runtime log (don't hand-edit)
promptfooconfig.predict.yaml
start.sh # the only blessed way to run the stack
requirements.txt
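For a sense of what a per-prefix pick analysis over `predictions.jsonl` might look like, here is a small sketch. The record schema is not documented here, so the field names `event`, `sentence`, and `picked` are hypothetical, as is the function name `pick_counts`; check the real log lines before reusing any of it.

```python
import json
from collections import Counter, defaultdict


def pick_counts(lines) -> dict:
    """Map each sentence prefix to a Counter of the words picked after it."""
    by_prefix = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue                          # tolerate blank lines
        rec = json.loads(line)
        if rec.get("event") != "pick":        # hypothetical field and value
            continue
        prefix = " ".join(rec.get("sentence", []))
        by_prefix[prefix][rec["picked"]] += 1
    return by_prefix


# Made-up log lines in the assumed schema, for illustration only.
sample = [
    '{"event": "pick", "sentence": ["i", "want"], "picked": "pizza"}',
    '{"event": "predict", "sentence": ["i"]}',
    '{"event": "pick", "sentence": ["i", "want"], "picked": "pizza"}',
    '{"event": "pick", "sentence": ["i", "want"], "picked": "water"}',
]
print(pick_counts(sample)["i want"].most_common(1))   # [('pizza', 2)]
```

Passing an open file handle instead of `sample` works the same way, since the function only iterates over lines.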
There is no test suite, no linter, no build step. That is on purpose.

