extropolis/say-it

say it! — AAC Kids

A single-page AAC (augmentative & alternative communication) speech device. Tap one word at a time; the backend predicts kid-safe next-word candidates so a non-speaking kid can build sentences fast.

Built for an 11-year-old named Adrian.

[Screenshots: talk mode (talk view) | game mode (game view)]

How to use the app

The app has two modes — switch between them with the 🗣 talk / 🎮 game tabs at the bottom.

🗣 Talk mode (build a sentence)

  1. Tap a word in the grid to add it to the sentence strip at the top.
  2. After each tap, the grid refreshes with likely next words. Keep tapping until your sentence is done.
  3. ▶ play — speaks the whole sentence out loud (voice is muted by default — turn it on under ⚙️ settings → voice → on).
  4. ⌫ backspace — removes the last word.
  5. ✕ clear — wipes the sentence.
  6. ✨ more words — opens a bottom sheet with ~30 alternatives if the 8-button grid doesn't have what you want. (This is the only path that calls the LLM.)
  7. add a word or phrase input under the strip — type a new phrase like i love octopus and hit +. It's added to the trie immediately and shows up in future predictions.
  8. 📖 dictionary — live-filter the ~1.4k kid-safe word list and tap any word to add it.
  9. all done pill — appears when the trie thinks the sentence has reached a natural ending; tap it to speak + reset.

🎮 Game mode (practice prompts)

A target phrase appears at the top (e.g. "i want pizza"). Tap the words in order to match it. Score, level, and lives track across rounds — handy for practicing word order and reading without the pressure of free-form talking. The same grid + prediction pipeline powers both modes.

⚙️ Settings (top-right gear)

  • buttons — 4 / 6 / 8 / 10 grid size
  • voice — on / off (default: off, so the play button is silent until you turn it on)
  • speed — TTS rate, 0.6×–1.4×
  • game difficulty — easy / normal / hard

What it is

The whole product is two files:

  • next-word-predictor.html — the entire frontend. Vanilla HTML/CSS/JS, no build step, served as static.
  • server.py — FastAPI backend. Walks a hand-authored phrase trie (phrases.json) for the regular grid; only calls an LLM (OpenRouter / Qwen) when the user opens the "more words" sheet and the trie ran short.

Frontend lives on :8899, backend on :8900. Port 4001 is intentionally untouched.

Quick start

pip install -r requirements.txt

# Put your key in .env (either works):
echo 'OPENROUTER_KEY=sk-or-...' >> .env
# or:
echo 'GROQ_API_KEY=gsk_...' >> .env

./start.sh           # boots static (8899) + backend (8900) in background
# ./start.sh fg      # foreground backend, useful for iterating on server.py

Open http://localhost:8899/next-word-predictor.html.

Watch the backend:

tail -f /tmp/aac-backend.log
curl http://localhost:8900/health

How it works

POST /predict {sentence: [...words], n: 8} → {words: [...]} is the whole product.

  1. Walk the trie. _walk_trie(sentence) descends phrases.json token-by-token. The keys at the resting node are the candidate next words.
  2. Top up. If the walk comes up short, fill the remaining slots from the trie's top-level openers.
  3. LLM only for "more words." When n > 8 (the bottom sheet asks for 30) AND the trie is still short, call OpenRouter once. The 8-button regular grid almost never touches the LLM.
  4. Sort + filter. Alphabetical, drop the just-tapped word, drop the DENY set, enforce _ok_word (alphabetic, 1–14 chars). The required word (game mode) is pinned first.
  5. Last-resort fallback to a small hardcoded list.
  6. done_prob drives the "all done" pill — derived from trie shape (resting node has "" child → 0.6).
  7. Log every /predict + /pick + /add_phrase to predictions.jsonl.
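
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in server.py: the tiny TRIE, the DENY and FALLBACK contents, the function names, and the 0.0 done_prob for nodes without an "" child are all assumptions made for the example. The LLM call itself is left out; the sketch only reports when the conditions in step 3 would trigger it.

```python
from typing import Dict, List, Optional

# Hypothetical miniature of phrases.json: nested dicts keyed by word.
# An "" key marks a node where a phrase can end (see step 6).
TRIE: Dict[str, dict] = {
    "i": {
        "want": {"pizza": {"": {}}, "water": {"": {}}, "to": {"play": {"": {}}}},
        "love": {"you": {"": {}}},
    },
    "help": {"": {}, "me": {"": {}}},
}
DENY = {"stupid"}                               # assumed stand-in for the DENY set
FALLBACK = ["i", "want", "help", "yes", "no"]   # assumed last-resort list


def ok_word(w: str) -> bool:
    """Alphabetic and 1-14 characters long, as described for _ok_word."""
    return w.isalpha() and 1 <= len(w) <= 14


def walk_trie(sentence: List[str]) -> Optional[dict]:
    """Descend the trie token by token; None if the path does not exist."""
    node = TRIE
    for tok in sentence:
        node = node.get(tok)
        if node is None:
            return None
    return node


def predict(sentence: List[str], n: int = 8, required: Optional[str] = None) -> dict:
    node = walk_trie(sentence)
    cands = [k for k in (node or {}) if k]          # step 1: keys at the resting node
    if len(cands) < n:                              # step 2: top up from openers
        cands += [k for k in TRIE if k and k not in cands]
    llm_needed = n > 8 and len(cands) < n           # step 3: condition only, no call
    last = sentence[-1] if sentence else None
    words = sorted({w for w in cands                # step 4: sort + filter
                    if w != last and w not in DENY and ok_word(w)})
    if required and ok_word(required):              # game mode: pin required first
        words = [required] + [w for w in words if w != required]
    if not words:                                   # step 5: last-resort fallback
        words = list(FALLBACK)
    done_prob = 0.6 if node is not None and "" in node else 0.0  # step 6
    return {"words": words[:n], "done_prob": done_prob, "llm_needed": llm_needed}
```

With this toy trie, predict(["help"]) offers "i" and "me" with a done_prob of 0.6, because the resting node has an "" child.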

phrases.json is the source of truth. /add_phrase deep-merges new phrases into it at runtime; future predictions see them immediately.
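
A deep-merge of one phrase into a nested-dict trie might look like the sketch below. It assumes the trie is a plain nested dict with an "" child as the end-of-phrase marker, and that validation follows the same alphabetic, 1–14 character rule used for predictions. The function names and the persistence step are hypothetical, not taken from server.py.

```python
import json
from typing import List


def add_phrase(trie: dict, phrase: str) -> List[str]:
    """Tokenize, validate, and merge a phrase into the trie in place."""
    tokens = phrase.lower().split()
    if not tokens:
        raise ValueError("empty phrase")
    for tok in tokens:
        if not (tok.isalpha() and 1 <= len(tok) <= 14):
            raise ValueError(f"invalid word: {tok!r}")
    node = trie
    for tok in tokens:
        # setdefault keeps any existing subtree, so sibling phrases survive.
        node = node.setdefault(tok, {})
    node.setdefault("", {})  # assumed end-of-phrase marker
    return tokens


def persist(trie: dict, path: str) -> None:
    """Write the trie back to disk (hypothetical persistence step)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(trie, f, ensure_ascii=False, indent=1)
```

Because the merge walks existing nodes rather than replacing them, adding "i love octopus" to a trie that already holds "i want pizza" leaves the "want" branch untouched.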

Endpoints

Method  Path                       What it does
POST    /predict                   {sentence, n, required} → {words, done_prob, debug}
POST    /pick                      log offer→pick
POST    /add_phrase                tokenize, validate, deep-merge into the trie, persist
GET     /dictionary?q=…&limit=…    prefix then substring matches over dictionary.json
GET     /health                    {ok, model, trie_openers, dict_size, …}

# Manual predict:
curl -s -X POST http://localhost:8900/predict \
  -H 'Content-Type: application/json' \
  -d '{"sentence":["i","want"],"n":8}' | jq

# Add a phrase at runtime (no restart):
curl -s -X POST http://localhost:8900/add_phrase \
  -H 'Content-Type: application/json' \
  -d '{"phrase":"i love octopus"}' | jq

# Search the kid-safe dictionary:
curl -s 'http://localhost:8900/dictionary?q=oct&limit=20' | jq
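
The "prefix then substring" ordering described for /dictionary could be implemented along these lines. This is a sketch under assumptions: the function name is hypothetical, the word list is passed in rather than loaded from dictionary.json, and returning nothing for an empty query is a guess at the behavior.

```python
from typing import List


def search_dictionary(words: List[str], q: str, limit: int = 20) -> List[str]:
    """Return prefix matches first, then other substring matches, capped at limit."""
    q = q.strip().lower()
    if not q:
        return []  # assumed behavior for an empty query
    prefix = [w for w in words if w.startswith(q)]
    substring = [w for w in words if q in w and not w.startswith(q)]
    return (prefix + substring)[:limit]
```

For a query like "oct", words that start with it (such as "octopus") come before words that merely contain it (such as "doctor").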

Editing

  • Add canned phrases offline: drop a JSON chunk into phrase_chunks/, run python3 merge_phrases.py to rebuild phrases.json, restart the backend.
  • Add one phrase at runtime: use the in-app "add a word or phrase" input, the 📖 dictionary modal, or POST /add_phrase.
  • Add to the kid-safe dictionary: edit dictionary.json (flat lowercase array), restart.
  • Change the model: MODEL_NAME in server.py. (The endpoint variable is named GROQ_URL for historical reasons but points at OpenRouter.)

Evals

promptfooconfig.predict.yaml exercises /predict directly — backend must be running on :8900:

OPENROUTER_KEY=... npx promptfoo@latest eval -c promptfooconfig.predict.yaml

Offline analysis

python3 analyze_picks.py     # per-prefix offer/pick patterns from predictions.jsonl
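
As a rough idea of what a per-prefix pick analysis involves, the sketch below counts which word was picked after each sentence prefix. The record layout (the "event", "sentence", "offered", and "picked" fields) is entirely hypothetical, since the actual predictions.jsonl schema is not documented here, and this is not the code in analyze_picks.py.

```python
import json
from collections import Counter, defaultdict
from typing import Dict, Iterable

# Hypothetical sample records; the real log schema may differ.
SAMPLE_LOG = """\
{"event": "predict", "sentence": ["i", "want"], "words": ["pizza", "to", "water"]}
{"event": "pick", "sentence": ["i", "want"], "offered": ["pizza", "to", "water"], "picked": "pizza"}
{"event": "pick", "sentence": ["i", "want"], "offered": ["pizza", "to", "water"], "picked": "pizza"}
{"event": "pick", "sentence": ["i"], "offered": ["love", "want"], "picked": "want"}
"""


def pick_counts(lines: Iterable[str]) -> Dict[str, Counter]:
    """Map each sentence prefix to a Counter of the words picked after it."""
    by_prefix: Dict[str, Counter] = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec.get("event") != "pick":   # ignore /predict and /add_phrase records
            continue
        by_prefix[" ".join(rec["sentence"])][rec["picked"]] += 1
    return by_prefix


counts = pick_counts(SAMPLE_LOG.splitlines())
```

Prefixes where the picked word is rarely among the first few offered are natural candidates for new canned phrases.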

Files

next-word-predictor.html   # the frontend (one file, no build)
server.py                  # FastAPI backend
phrases.json               # ~19k phrase trie, hand-authored + runtime-mutated
dictionary.json            # ~1.4k kid-safe words
phrase_chunks/             # source JSON chunks merged into phrases.json
merge_phrases.py           # offline builder
analyze_picks.py           # offer/pick analysis over predictions.jsonl
predictions.jsonl          # runtime log (don't hand-edit)
promptfooconfig.predict.yaml
start.sh                   # the only blessed way to run the stack
requirements.txt

There is no test suite, no linter, no build step. That is on purpose.
