Releases: mcowger/plexus

2026.06.05.1

05 Jun 07:50
2382527

Plexus Inference V2 Beta

inference-v2 replaces Plexus's hand-written request/response transformation with @earendil-works/pi-ai for chat-style inference, while Plexus retains auth, routing, quotas, cooldowns, failover, and logging.

What Changed

Old path: client -> Plexus Transformer -> UnifiedChatRequest -> Dispatcher -> upstream -> client

New path: client -> inference-v2 parser -> pi-ai Context -> pi-ai stream/complete -> inference-v2 serializer -> client

Opting In

Option 1: Explicit beta routes -- prefix any stable path with /beta:

POST /beta/v1/chat/completions
POST /beta/v1/messages
POST /beta/v1/responses
POST /beta/v1beta/models/{model}:generateContent
POST /beta/v1beta/models/{model}:streamGenerateContent
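
As a minimal sketch of the prefix rule above, the helper below maps a stable path to its explicit beta route. The function names and the base URL in the usage note are hypothetical and not part of Plexus; only the `/beta` prefix and the listed paths come from these notes.

```python
# Hypothetical client-side helper; only the "/beta" prefix comes from the notes.
BETA_PREFIX = "/beta"


def to_beta_path(stable_path: str) -> str:
    """Prefix a stable path with /beta, leaving already-prefixed paths alone."""
    if not stable_path.startswith("/"):
        raise ValueError("expected an absolute path such as /v1/messages")
    if stable_path == BETA_PREFIX or stable_path.startswith(BETA_PREFIX + "/"):
        return stable_path
    return BETA_PREFIX + stable_path


def beta_url(base_url: str, stable_path: str) -> str:
    """Join a gateway base URL (an assumption) with the beta route."""
    return base_url.rstrip("/") + to_beta_path(stable_path)
```

For example, `beta_url("http://localhost:4000", "/v1/messages")` yields `http://localhost:4000/beta/v1/messages`; the host and port are placeholders for wherever your gateway listens.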

Option 2: Beta-enabled API key -- admins can mark a key as beta in Access Control. Requests to normal stable paths then route through inference-v2 automatically.

Required Provider Config

Each routed target must have both fields set:

  • provider.pi_ai_provider
  • provider.models[model].pi_ai_model_id

If no beta-compatible target exists, the request fails. There is no silent fallback to the legacy path.
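
The check below is a sketch of how you might audit a provider entry for the two required fields before testing. It assumes the provider config has been loaded into a plain dict whose keys mirror the field paths above; the function name and the example values are hypothetical.

```python
def missing_pi_ai_hints(provider: dict, model: str) -> list[str]:
    """Return the required pi-ai fields that are absent or empty for one target."""
    missing = []
    if not provider.get("pi_ai_provider"):
        missing.append("provider.pi_ai_provider")
    # Assumed shape: provider["models"] maps model names to per-model settings.
    model_cfg = (provider.get("models") or {}).get(model) or {}
    if not model_cfg.get("pi_ai_model_id"):
        missing.append(f"provider.models[{model}].pi_ai_model_id")
    return missing


# Placeholder values for illustration only.
example_provider = {
    "pi_ai_provider": "example-provider",
    "models": {"example-model": {"pi_ai_model_id": "example-model-id"}},
}
```

An empty list means the target carries both hints; anything else names the field to fill in.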

Routing

  • Normal routing applies first (aliases, priorities, policy, cooldowns, concurrency).
  • Targets without valid pi-ai hints are filtered out.
  • Failover works between beta-compatible targets only -- never from pi-ai to Transformer.
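
The routing rules above can be pictured with the following sketch. It is not the Plexus dispatcher: the flattened target dicts, the `call` callback, and both exception classes are assumptions made for illustration.

```python
class NoBetaTargetError(Exception):
    """Raised when no target carries valid pi-ai hints (hypothetical name)."""


class UpstreamError(Exception):
    """Stand-in for a failed upstream call (hypothetical name)."""


def is_beta_compatible(target: dict) -> bool:
    # Assumed flattened target shape with both hints on the same dict.
    return bool(target.get("pi_ai_provider")) and bool(target.get("pi_ai_model_id"))


def dispatch_beta(ordered_targets, call):
    """Try beta-compatible targets in the order normal routing produced.

    ordered_targets: targets already ordered by aliases, priorities, policy, etc.
    call: callable(target) that returns a response or raises UpstreamError.
    """
    candidates = [t for t in ordered_targets if is_beta_compatible(t)]
    if not candidates:
        # No silent fallback to the legacy path: the request fails.
        raise NoBetaTargetError("no beta-compatible target for this request")
    last_error = None
    for target in candidates:
        try:
            return call(target)
        except UpstreamError as exc:
            last_error = exc  # fail over to the next beta-compatible target
    raise last_error
```

Targets without hints are dropped before any upstream call, so failover can only move between beta-compatible targets.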

Coverage

Tested and expected to work, both streaming and non-streaming: OpenAI chat completions, Anthropic messages, OpenAI Responses API, and Gemini generateContent/streamGenerateContent. All gateway features (auth, quotas, failover, debug logs, cost tracking, etc.) remain active.

In request logs, the pi (π) icon marks inference-v2 requests.

Known Caveats

  • OAuth providers are not included yet.
  • Embeddings, transcriptions, speech, and image APIs stay on the legacy path.
  • Custom providers are untested.
  • Same-format passthrough is not used -- requests are always parsed through pi-ai Context.
  • Some provider-specific fields may be dropped if not represented in pi-ai Context.
  • Multiple Anthropic system blocks with cache-control may not preserve exact behavior.

Bug Reports

Include:

  • Client name/version, API format, streaming or not, route used (/beta/... or beta key).
  • Provider name, pi_ai_provider, pi_ai_model_id, whether the π icon appeared on the usage row.
  • x-request-id header value.
  • Usage log row, debug log (raw/transformed request and response), client response, server log lines for that request ID.
  • Secrets removed (API keys, tokens, credentials, cookies).

The most useful reports identify a concrete mismatch: wrong transformed request, malformed client response, missing tool calls, wrong token usage, invalid stream framing, etc.
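
Before attaching logs, one way to strip secrets is a small redaction pass like the sketch below. The header list and the bearer-token pattern are assumptions about what your captures may contain, not an exhaustive or official list, so review the output by hand before posting.

```python
import re

# Assumed set of credential-bearing headers; extend it for your own setup.
SENSITIVE_HEADERS = {
    "authorization",
    "proxy-authorization",
    "x-api-key",
    "x-goog-api-key",
    "cookie",
    "set-cookie",
}

# Matches "Bearer <token>" anywhere in free text, case-insensitively.
_BEARER = re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+")


def redact_headers(headers: dict) -> dict:
    """Replace values of sensitive headers; keep others such as x-request-id."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def redact_text(text: str) -> str:
    """Mask bearer tokens that appear inside raw log lines."""
    return _BEARER.sub(r"\1 [REDACTED]", text)
```

The `x-request-id` value is deliberately left intact, since the report needs it to correlate server log lines.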

Recommended First Tests

  1. Simple prompt, streaming and non-streaming.
  2. Tool use.
  3. Thinking/reasoning options.
  4. Responses API with previous_response_id.

dev-de0c432a6f70b9841ed83a76327700e9619d5e4f

05 Jun 19:37
de0c432

Development pre-release from commit de0c432

Built from: main at de0c432a6f70b9841ed83a76327700e9619d5e4f

dev-2382527da0af267cf201a57a04d3cf9e80f9075e

05 Jun 07:47
2382527

Development pre-release from commit 2382527

Built from: main at 2382527da0af267cf201a57a04d3cf9e80f9075e

dev-925405f6db7b1dad8f4587885a095bf67df168d7

04 Jun 15:29
925405f

Development pre-release from commit 925405f

Built from: main at 925405f6db7b1dad8f4587885a095bf67df168d7

dev-5e20bf4ada0512978ec216a5e994960f50f0bfb0

04 Jun 15:33
5e20bf4

Development pre-release from commit 5e20bf4

Built from: main at 5e20bf4ada0512978ec216a5e994960f50f0bfb0

dev-06728bc764d29892095d5d8e948fa51d56b2ba1d

04 Jun 15:29
06728bc

Development pre-release from commit 06728bc

Built from: main at 06728bc764d29892095d5d8e948fa51d56b2ba1d

2026.06.03.2

03 Jun 16:52

Overview

This release adds scheduled model discovery for providers, alias matching in the import modal, and fixes for bulk selection in the model manager.

✨ New Features

  • Provider model autosync: Plexus can now automatically discover and add new models from your providers on a schedule you control. Each provider has its own toggle and interval setting, so you decide which providers sync and how often. (#554)
  • Smarter model importing with alias matching and suppression: The import modal now suggests matching aliases when orphaned models are detected, so you can link them to existing entries instead of creating duplicates. You can also suppress individual models you never want to import; suppressed models stay out of your way. (#553)

🐛 Bug Fixes

  • Fixed model bulk selection actions: Select All and Clear now work reliably across the model manager. (#552)
  • Fixed Auto Add modal losing its search context: The modal now correctly keeps your search term when opened from an existing model alias. (#552)
  • Fixed model autosync scheduler config caching: Provider autosync settings now reload correctly instead of getting stuck with stale values.
  • Prevented local Ollama API keys from being sent to the public catalog: Autosync keeps your local credentials local.

2026.06.03.1

03 Jun 04:37

Overview

This release contains minor internal maintenance to the CI pipeline with no user-facing changes.


🔧 Infrastructure maintenance: Cleaned up unused outputs and removed an unnecessary verification job from the Docker build workflow.

dev-f3fdc7c90ade07a209202634fdf1fc6f81bf1727

03 Jun 16:14
f3fdc7c

Development pre-release from commit f3fdc7c

Built from: main at f3fdc7c90ade07a209202634fdf1fc6f81bf1727

dev-a2f8e0b14617b18f1a81a2d9523d09b05941e0e5

03 Jun 17:20

Development pre-release from commit a2f8e0b

Built from: main at a2f8e0b14617b18f1a81a2d9523d09b05941e0e5