Integrate Speculative Decoding into Lemonade by sawansri · Pull Request #1638 · lemonade-sdk/lemonade

sawansri · 2026-04-15T06:15:58Z

STILL IN DRAFT

Resolves #1419

Adds UI controls for speculative decoding

Also includes backend logic to resolve model paths when a checkpoint or model name is given.
This enables recipes that reference draft models: paths are not user-specific and are instead resolved by Lemonade.
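
As a rough illustration of the name-to-path idea described above, the sketch below resolves a draft-model reference against a lookup table. `ModelRegistry`, `resolve_draft_model`, and the example model name are hypothetical and not taken from the PR; the real code goes through Lemonade's model manager, which is not shown in this thread.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for Lemonade's model manager:
// model name -> locally downloaded GGUF file.
using ModelRegistry = std::map<std::string, std::string>;

// Returns true if `s` ends with `suffix`.
bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Resolve a draft-model reference to a local GGUF path.
// - A value already ending in ".gguf" is treated as a path and kept as-is.
// - Anything else is looked up as a model name in the registry.
// Returns false (leaving `out` untouched) when the name is unknown.
bool resolve_draft_model(const std::string& ref,
                         const ModelRegistry& registry,
                         std::string& out) {
    if (ends_with(ref, ".gguf")) {
        out = ref;
        return true;
    }
    auto it = registry.find(ref);
    if (it == registry.end()) {
        return false;
    }
    out = it->second;
    return true;
}
```

Because the recipe stores only the name, the same recipe can be shared between machines whose model directories differ.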

Copilot

Pull request overview

Integrates speculative decoding configuration into Lemonade by adding WebUI controls for llama.cpp speculative settings and backend handling to resolve draft model references (checkpoint/model name) into local GGUF paths at load time.

Changes:

Adds speculative decoding controls (type/presets/advanced flags) to the Model Options modal and new CSS styling for the panel.
Adds backend logic to normalize/validate llama-server custom args: resolve --model-draft to a local GGUF and force --no-mmproj when speculative decoding is enabled.
Introduces a ModelManager::resolve_checkpoint_path() helper to resolve checkpoint strings via recipe-specific path resolution.
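
The "force `--no-mmproj` when speculative decoding is enabled" step from the list above could look roughly like the following. This is a sketch under assumptions: the set of flags that counts as "speculative decoding is on" is illustrative and not exhaustive, and the helper names are invented for this example rather than copied from `llamacpp_server.cpp`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative, non-exhaustive list of flags treated as enabling
// speculative decoding. The list actually used by the PR is not shown here.
static const std::vector<std::string> kSpecFlags = {
    "--model-draft", "--draft-max", "--draft-min"};

// True if `flag` appears as a token in `args`.
bool has_flag(const std::vector<std::string>& args, const std::string& flag) {
    return std::find(args.begin(), args.end(), flag) != args.end();
}

// True if any of the speculative decoding flags is present.
bool uses_speculative_decoding(const std::vector<std::string>& args) {
    for (const auto& flag : kSpecFlags) {
        if (has_flag(args, flag)) return true;
    }
    return false;
}

// Returns a copy of `args` with --no-mmproj appended when a speculative
// flag is present and --no-mmproj is not already there.
std::vector<std::string> force_no_mmproj(std::vector<std::string> args) {
    if (uses_speculative_decoding(args) && !has_flag(args, "--no-mmproj")) {
        args.push_back("--no-mmproj");
    }
    return args;
}
```

Applying the function twice gives the same result as applying it once, so it is safe to run on args that were already normalized.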

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/cpp/server/model_manager.cpp | Adds `resolve_checkpoint_path()` wrapper around recipe path resolution. |
| src/cpp/include/lemon/model_manager.h | Declares the new checkpoint resolution helper API. |
| src/cpp/server/backends/llamacpp_server.cpp | Resolves `--model-draft`, detects speculative decoding, and forces `--no-mmproj` when needed. |
| src/app/src/renderer/ModelOptionsModal.tsx | Adds speculative decoding UI/presets and rewrites how model export & submission commits options. |
| src/app/styles.css | Adds styling for the speculative decoding panel layout/responsiveness. |

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

sawansri requested a review from Copilot April 15, 2026 06:17

Copilot started reviewing on behalf of sawansri April 15, 2026 06:18

Copilot AI reviewed Apr 15, 2026

Comment thread src/cpp/server/backends/llamacpp_server.cpp

Comment thread src/cpp/server/backends/llamacpp_server.cpp Outdated

Comment thread src/app/src/renderer/ModelOptionsModal.tsx Outdated

sawansri added 6 commits April 15, 2026 08:55

Add speculative decoding UI and draft checkpoint resolution

15bbfe3

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

llamacpp: resolve --model-draft non-.gguf values via checkpoint or model name

2871bef

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

llamacpp: reject --model-draft values that are missing or next flag

da80e78

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>
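
The check named in the commit title above (rejecting a `--model-draft` whose value is missing or is actually the next flag) might be sketched as follows. The function name and error strings are hypothetical, and treating any token that starts with `-` as "the next flag" is an assumption made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns an empty string when every --model-draft is followed by a usable
// value, otherwise a human-readable error message.
std::string validate_model_draft(const std::vector<std::string>& args) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] != "--model-draft") continue;

        // Case 1: the flag is the last token, so no value follows it.
        if (i + 1 >= args.size()) {
            return "--model-draft requires a value";
        }

        // Case 2: the next token is empty or looks like another flag.
        const std::string& value = args[i + 1];
        if (value.empty() || value[0] == '-') {
            return "--model-draft requires a value, got '" + value + "'";
        }

        ++i;  // Skip the value so it is not inspected as a flag.
    }
    return "";
}
```

Failing early with a clear message avoids passing a malformed command line on to llama-server.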

llamacpp: force --no-mmproj for all speculative decoding flags

bcd7468

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

move flag parsing to frontend

5bfebe2

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

fix spec decoding numerical field rendering

597b090

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

sawansri force-pushed the sawansri/spec-decode-ui branch from 368378a to 597b090 on April 15, 2026 16:25

sawansri added 7 commits April 15, 2026 09:52

Fix draft checkpoint fallback for non-GGUF resolutions

e1d038c

Make llama args tokenization idempotent with escapes

8754aa0
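
One way to get the property named in the commit title above is to pair a tokenizer with a serializer that escapes exactly what the tokenizer unescapes, so that splitting and re-joining an args string any number of times gives the same result. The sketch below is an assumption about the approach, not the PR's code (the PR's argument editing lives in the TypeScript UI); `tokenize_args` and `join_args` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split on unquoted whitespace. Double quotes group text into one token and
// a backslash makes the next character literal.
std::vector<std::string> tokenize_args(const std::string& input) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;
    bool has_token = false;  // Distinguishes an empty token ("") from no token.
    for (std::size_t i = 0; i < input.size(); ++i) {
        char c = input[i];
        if (c == '\\' && i + 1 < input.size()) {
            current += input[++i];  // Take the escaped character verbatim.
            has_token = true;
        } else if (c == '"') {
            in_quotes = !in_quotes;
            has_token = true;
        } else if (!in_quotes && std::isspace(static_cast<unsigned char>(c))) {
            if (has_token) {
                tokens.push_back(current);
                current.clear();
                has_token = false;
            }
        } else {
            current += c;
            has_token = true;
        }
    }
    if (has_token) tokens.push_back(current);
    return tokens;
}

// Inverse of tokenize_args: escape quotes and backslashes, and wrap tokens
// that are empty or contain whitespace in double quotes.
std::string join_args(const std::vector<std::string>& tokens) {
    std::string out;
    for (std::size_t t = 0; t < tokens.size(); ++t) {
        if (t > 0) out += ' ';
        const std::string& tok = tokens[t];
        bool needs_quotes = tok.empty();
        std::string escaped;
        for (char c : tok) {
            if (c == '"' || c == '\\') escaped += '\\';
            if (std::isspace(static_cast<unsigned char>(c))) needs_quotes = true;
            escaped += c;
        }
        out += needs_quotes ? '"' + escaped + '"' : escaped;
    }
    return out;
}
```

With this pairing, `tokenize_args(join_args(tokens))` returns the original tokens, so a string that has been serialized once is unchanged by further tokenize/join passes.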

ui: simplify speculative decoding args editing

3eb993b

llamacpp: resolve --model-draft by model id only

0723548

test(endpoints): cover --model-draft model-id validation

0e108dc

remove = from llamacpp args

3fbfeb5

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

llamacpp: remove backend --flag=value handling

f1ae42a
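
The two commits above suggest that args reach the backend in `--flag value` form rather than `--flag=value`. If a caller still needed to accept the `=` form, a normalization step along these lines could run before the args are handed over. This is a hedged sketch: `split_equals_args` is a hypothetical helper and not something the thread shows.

```cpp
#include <string>
#include <vector>

// Split "--flag=value" tokens into {"--flag", "value"}. Tokens that do not
// start with "--" (for example a path containing '=') are left untouched.
std::vector<std::string> split_equals_args(const std::vector<std::string>& args) {
    std::vector<std::string> out;
    for (const auto& arg : args) {
        const std::string::size_type eq = arg.find('=');
        const bool is_long_flag = arg.size() > 2 && arg.compare(0, 2, "--") == 0;
        if (is_long_flag && eq != std::string::npos) {
            out.push_back(arg.substr(0, eq));   // The flag name.
            out.push_back(arg.substr(eq + 1));  // Everything after the first '='.
        } else {
            out.push_back(arg);
        }
    }
    return out;
}
```

Keeping a single canonical form means later checks, such as looking for `--model-draft` as a whole token, only have one shape to handle.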

sawansri requested a review from Copilot April 15, 2026 18:52

Copilot started reviewing on behalf of sawansri April 15, 2026 18:53

Copilot AI reviewed Apr 15, 2026

Comment thread src/cpp/server/backends/llamacpp_server.cpp Outdated

Comment thread src/app/src/renderer/ModelOptionsModal.tsx

Comment thread src/app/src/renderer/ModelOptionsModal.tsx

Comment thread src/cpp/server/backends/llamacpp_server.cpp

sawansri added 2 commits April 15, 2026 12:50

fix spec args serialization and draft submit state

9707bb3

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

llamacpp: keep pre-spec custom arg override semantics

fb86d37

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

sawansri wants to merge 15 commits into main from sawansri/spec-decode-ui

2 participants
