Integrate Speculative Decoding into Lemonade #1638

Draft
sawansri wants to merge 15 commits into main from sawansri/spec-decode-ui

Conversation

@sawansri
Collaborator

STILL IN DRAFT

Resolves #1419

Adds UI controls for speculative decoding

Also includes backend logic to resolve model paths when a checkpoint or model name is given.
This makes it possible to create recipes with draft models: the paths are not user-specific and are instead resolved by Lemonade.

Contributor

Copilot AI left a comment

Pull request overview

Integrates speculative decoding configuration into Lemonade by adding WebUI controls for llama.cpp speculative settings and backend handling to resolve draft model references (checkpoint/model name) into local GGUF paths at load time.

Changes:

  • Adds speculative decoding controls (type/presets/advanced flags) to the Model Options modal and new CSS styling for the panel.
  • Adds backend logic to normalize/validate llama-server custom args: resolve --model-draft to a local GGUF and force --no-mmproj when speculative decoding is enabled.
  • Introduces a ModelManager::resolve_checkpoint_path() helper to resolve checkpoint strings via recipe-specific path resolution.
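The argument normalization described in the second bullet can be sketched roughly as follows. This is an illustration, not the code in the PR: `normalize_spec_args` and `DraftResolver` are hypothetical names, the resolver stands in for whatever `ModelManager::resolve_checkpoint_path()` actually does, and the sketch assumes that the presence of `--model-draft` is what marks speculative decoding as enabled.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical resolver type: maps a checkpoint or model name to a local GGUF
// path and returns an empty string when nothing matches.
using DraftResolver = std::function<std::string(const std::string&)>;

// Returns a normalized copy of the custom llama-server arguments (sketch only).
// - The value following --model-draft is replaced by the resolved local path.
// - --no-mmproj is appended once if a draft model is configured.
std::vector<std::string> normalize_spec_args(std::vector<std::string> args,
                                             const DraftResolver& resolve) {
    bool speculative = false;    // assumption: --model-draft implies speculation
    bool has_no_mmproj = false;  // avoid adding the flag twice

    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "--no-mmproj") {
            has_no_mmproj = true;
        } else if (args[i] == "--model-draft" && i + 1 < args.size()) {
            speculative = true;
            const std::string resolved = resolve(args[i + 1]);
            if (!resolved.empty()) {
                args[i + 1] = resolved;  // unresolved values are left untouched
            }
            ++i;  // skip the value that was just handled
        }
    }

    if (speculative && !has_no_mmproj) {
        args.push_back("--no-mmproj");
    }
    return args;
}
```

Passing the resolver in as a callback keeps the sketch independent of the model manager. A caller would supply a lambda that wraps the real lookup, and arguments without `--model-draft` come back unchanged.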

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/cpp/server/model_manager.cpp | Adds resolve_checkpoint_path() wrapper around recipe path resolution. |
| src/cpp/include/lemon/model_manager.h | Declares the new checkpoint resolution helper API. |
| src/cpp/server/backends/llamacpp_server.cpp | Resolves --model-draft, detects speculative decoding, and forces --no-mmproj when needed. |
| src/app/src/renderer/ModelOptionsModal.tsx | Adds speculative decoding UI/presets and rewrites how model export & submission commits options. |
| src/app/styles.css | Adds styling for the speculative decoding panel layout/responsiveness. |

Comment thread src/cpp/server/backends/llamacpp_server.cpp
Comment thread src/cpp/server/backends/llamacpp_server.cpp Outdated
Comment thread src/app/src/renderer/ModelOptionsModal.tsx Outdated
Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>
…del name

Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>
@sawansri sawansri force-pushed the sawansri/spec-decode-ui branch from 368378a to 597b090 Compare April 15, 2026 16:25
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.


Comment thread src/cpp/server/backends/llamacpp_server.cpp Outdated
Comment thread src/app/src/renderer/ModelOptionsModal.tsx
Comment thread src/app/src/renderer/ModelOptionsModal.tsx
Comment thread src/cpp/server/backends/llamacpp_server.cpp
Signed-off-by: Sawan Srivastava <sawan1210@gmail.com>

Development

Successfully merging this pull request may close these issues.

Feature Request: Integrate Speculative Decoding for Token Generation Speedup (Default & Configurable)
