Orb - Agentic RP Frontend

Problem Statement

LLMs suffer from stylistic inertia in long roleplay sessions. Once a tone, pacing, or prose style is established over several turns, the model tends to perpetuate it regardless of narrative shifts. A lighthearted conversation that turns tragic will often retain the cadence and vocabulary of the earlier tone because the weight of prior context anchors the model's generation.

Static system prompts cannot solve this. The system prompt is written once and does not adapt to evolving scenes.

Solution Overview

An agentic middleware layer sits between the user and the model. It intercepts each user message, runs a short analytical pass to "read the room," then dynamically assembles prompt directives that shape the model's writing before the actual roleplay generation happens.

The user never sees the agentic layer. The writer model doesn't know it's being directed. The result is a roleplay session that naturally adapts its style, tone, and pacing as the narrative evolves.

Notable Features

Clear direction for Writer: Grounding the story + actively steering the writing style = better output
Customizability: Customizable prompt injection that's automatically used by Director model
Anti-slop: Get rid of overused words, phrases, and patterns often seen in LLM outputs
Anti-repetition: Detect various types of repetition from outputs and surgically fix them
Length Guard: Actively or passively protect from length degradation as context grows
Super-regenerate: Normal regens may give samey outputs, ask for a different take
Magic Rewrite: Rewrite the target message in a user-defined direction
Compress History: Summarize chat context and move it to a new conversation
Mobile-compatibility: UI for mobile devices
Integrated TTS: Easy Text-to-speech that supports multiple providers
Character Browser: Fetch character cards from various sites on the Internet

Architecture

Three-Pass Design

The system uses a three-pass architecture, with the agent and writer optionally being the same or different models:

Director Pass - Tool-calling phase where the LLM selects moods, plot direction, and potentially rewrites user prompts
Writer Pass - Story generation phase where the LLM writes the actual roleplay response
Editor Pass - A ReAct loop - Self-audit for slop and length optimization phase. This is surgical, errors will be programmatically detected, the model only needs to write replacement for targeted sentences

Single and Dual Model Modes

In most local setups, the user doesn't have enough resource to load more than one model at a time. Single-Model Mode addresses this by using the same model for both writing and agentic tasks. KV cache is respected by design so prompt reprocessing is avoided.

For the best experience, use Dual-Model Mode. Some harnesses are dropped in this mode so the models should perform better.

KV Cache Reuse Strategy

For optimal KV cache reuse, the following will remain consistent across passes:

1. System Prompt

The system prompt (character card, instructions, etc.) is identical across all passes
Built once and reused forever
Includes character description, scenario, example dialogue, and additional instructions

2. Chat History

The conversation history (previous messages) is identical across all passes
Maintains exact same message content, attachments, and ordering

3. Tool Schemas

The same tool definitions must be sent in each LLM call for kv cache reuse
Inconsistent tool schemas break KV cache alignment

Design Principles

Prioritize small models - if a feature fails half of the time on Gemma-4-26B4A, it probably doesn't belong here
Only use agentic functionalities when absolutely needed - we will not have useless tools like dice_roll
Algorithm-first - if something can be done with an algorithm, don't use LLMs. Avoid making LLMs eyeball for errors
Keep agentic scope small to reduce hallucination, avoid giving agents too much freedom of choice

Drawbacks

Speed: Multiple passes will obviously have a longer time to final response
Cost: Neligible cost increase, which comes naturally with multiple passes, somewhat alleviated by KV cache reuse strategy

Requirements

A model with solid tool/function calling capabilities (recommended: Gemma 4)
OpenAI-compatible LLM inference backend API that supports prompt-caching
Python 3.9+

Wiki

Full documentation is at https://orbfrontend.github.io/Orb/

Contributing & Discussions

Check this out before opening a PR: https://github.com/OrbFrontend/Orb/blob/main/CONTRIBUTING.md

Ideas, help requests, and questions go here: https://github.com/OrbFrontend/Orb/discussions

Name		Name	Last commit message	Last commit date
Latest commit History 746 Commits
.github/workflows		.github/workflows
backend		backend
docs		docs
frontend		frontend
scripts		scripts
tests		tests
.flake8		.flake8
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Orb.png		Orb.png
README.md		README.md
TOFIX.txt		TOFIX.txt
biome.json		biome.json
lefthook.yml		lefthook.yml
mkdocs.yml		mkdocs.yml
package.json		package.json
pyrightconfig.json		pyrightconfig.json
pytest.ini		pytest.ini
requirements-dev.txt		requirements-dev.txt
requirements-docs.txt		requirements-docs.txt
requirements.txt		requirements.txt
run_unix.sh		run_unix.sh
run_windows.bat		run_windows.bat
update_unix.sh		update_unix.sh
update_windows.bat		update_windows.bat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Orb - Agentic RP Frontend

Problem Statement

Solution Overview

Notable Features

Architecture

Three-Pass Design

Single and Dual Model Modes

KV Cache Reuse Strategy

1. System Prompt

2. Chat History

3. Tool Schemas

Design Principles

Drawbacks

Requirements

Wiki

Contributing & Discussions

About

Uh oh!

Releases 19

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Orb - Agentic RP Frontend

Problem Statement

Solution Overview

Notable Features

Architecture

Three-Pass Design

Single and Dual Model Modes

KV Cache Reuse Strategy

1. System Prompt

2. Chat History

3. Tool Schemas

Design Principles

Drawbacks

Requirements

Wiki

Contributing & Discussions

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 19

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages