[pull] main from danny-avila:main #167
Merged
* fix auth recovery singleflight
* add auth recovery e2e coverage
* handle invalid auth redirect timestamp

* fix: Reject preliminary parent follow-ups
* chore: Sort frontend imports
* fix: Narrow preliminary parent detection
* fix: Preserve refused submit state
* fix: Propagate refused submit result

* fix: prevent empty agents endpoint selection
* fix: sort endpoint item imports

Keep the Google Gen AI SDK aligned with the latest 2.x release. Updates the declared range in both backend manifests (api, packages/api) and regenerates the lockfile to resolve @google/genai to 2.8.0. No application code changes: the sole consumer (api/app/clients/tools/structured/GeminiImageGen.js) uses the stable `GoogleGenAI` constructor and `models.generateContent` API, and the upstream changelog records no breaking changes to those between 2.0 and 2.8. Closes #13551
)

* 🧰 fix: Flatten union schemas for Gemini/Vertex MCP tool compatibility

`@langchain/google-common`'s `zod_to_gemini_parameters` throws "Gemini cannot handle union types" on any genuine `anyOf`/`oneOf` (e.g. discriminated unions), so MCP tools shipping union-typed schemas crash on the Google endpoint while working fine on OpenAI/Claude.

Add `flattenJsonSchemaUnions` (packages/api) to collapse unions to their first non-null member and multi-entry `type` arrays to a single nullable type, and apply it in `createToolInstance`'s existing `isGoogle` branch so only the Google/Vertex path is affected. Lossy by design, mirroring the existing empty-object fallback.

Closes #13612

* 🩹 fix: Address Codex review: preserve fields, strip null enums, cover definitions path

- Preserve parent-level `properties`/`required` when collapsing a union: merge the chosen branch into the parent instead of overwriting, so args declared outside the union (e.g. always-required fields) still reach Gemini.
- Drop the `null` member from `enum` when a union/type-array makes a field nullable, keeping Gemini's required homogeneous-enum invariant.
- Propagate the Google-flattened schema to the definitions/deferred-tool path: thread `provider` into `loadToolDefinitions` and flatten there, and store the flattened schema on `mcpJsonSchema` so `extractMCPToolDefinition` no longer emits raw unions on Google/Vertex.

* 🎨 style: Sort imports in tools/definitions per import-order check

* ♊ feat: Broaden union flatten into a full Gemini schema sanitizer

The union flatten alone wasn't enough: real GitHub MCP tools on Gemini also 400 with `Invalid value ... (TYPE_STRING), true`, because Gemini's function-calling Schema (https://ai.google.dev/api/caching#Schema) accepts only a restricted JSON Schema subset, and `enum` is `Type.STRING`-only.

Rename `flattenJsonSchemaUnions` → `sanitizeGeminiSchema` and broaden it (one pass, Gemini-gated) to cover the documented subset:

- Keep only string `enum` values; drop the keyword for non-string types (fixes the reported boolean-enum 400, incl. boolean `const` normalized to `enum: [true]`).
- `const` → single-value string enum, or drop if non-string.
- Merge `allOf` intersections; fold `exclusiveMinimum`/`exclusiveMaximum` into `minimum`/`maximum`.
- Strip unsupported keywords: `additionalProperties`, `default`, `$schema`, `$id`.
- (Existing) collapse `anyOf`/`oneOf`, multi-entry `type` arrays, nullable.

Grounded in Google's Schema docs rather than reverse-engineered from 400s. Verified end-to-end against the real `@langchain/google-common` converter. Complements danny-avila/agents#232 (langchain bump), which defers schema flattening to LibreChat.

* 🩹 fix: Gate enum retention on the effective (collapsed) type

Codex review: a mixed-type enum like `type: ['integer','string'], enum: [1,'auto']` collapsed the type to `integer` but still kept the string value `'auto'`, yielding `{type:'integer', enum:['auto']}`, a non-string type with an enum, which Gemini rejects. Keep `enum` only when the effective collapsed type is string (or unset), and stamp `type: 'string'` on a surviving typeless enum (e.g. a string `const` discriminator) so it satisfies Gemini's Type.STRING enum requirement.

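The collapsing rules described in these commits can be sketched roughly as follows. This is a simplified illustration, not the actual `sanitizeGeminiSchema` from packages/api: the `JsonSchema` type and the `flattenUnionsSketch` name are hypothetical, and `const`, `allOf`, exclusive bounds, and keyword stripping are left out.

```typescript
// Hypothetical, simplified schema shape; not LibreChat's actual types.
type JsonSchema = {
  type?: string | string[];
  nullable?: boolean;
  enum?: unknown[];
  anyOf?: JsonSchema[];
  oneOf?: JsonSchema[];
  properties?: Record<string, JsonSchema>;
  required?: string[];
  items?: JsonSchema;
};

function flattenUnionsSketch(schema: JsonSchema): JsonSchema {
  const { anyOf, oneOf, ...rest } = schema;
  const out: JsonSchema = { ...rest };

  // 1. Collapse anyOf/oneOf to the first non-null member and merge it into
  //    the parent, so parent-level properties/required are preserved.
  const union = anyOf ?? oneOf;
  if (union && union.length > 0) {
    const chosen: JsonSchema = union.filter((m) => m.type !== 'null')[0] ?? {};
    const branch = flattenUnionsSketch(chosen);
    const properties = { ...out.properties, ...branch.properties };
    const required = Array.from(
      new Set([...(out.required ?? []), ...(branch.required ?? [])]),
    );
    Object.assign(out, branch);
    if (Object.keys(properties).length > 0) out.properties = properties;
    if (required.length > 0) out.required = required;
    if (union.some((m) => m.type === 'null')) out.nullable = true;
  }

  // 2. Collapse a multi-entry `type` array to its first non-null entry.
  const types = out.type;
  if (Array.isArray(types)) {
    const nonNull = types.filter((t) => t !== 'null');
    if (nonNull.length < types.length) out.nullable = true;
    const first = nonNull[0];
    if (first !== undefined) out.type = first;
    else delete out.type;
  }

  // 3. Keep `enum` only when the effective type is string (or unset), keep
  //    only its string members, and stamp `type: 'string'` on the survivor.
  if (out.enum) {
    const strings = out.enum.filter((v): v is string => typeof v === 'string');
    const stringOrUnset = out.type === undefined || out.type === 'string';
    if (stringOrUnset && strings.length > 0) {
      out.enum = strings;
      out.type = 'string';
    } else {
      delete out.enum;
    }
  }

  // 4. Recurse into nested schemas.
  if (out.properties) {
    const next: Record<string, JsonSchema> = {};
    for (const [key, value] of Object.entries(out.properties)) {
      next[key] = flattenUnionsSketch(value);
    }
    out.properties = next;
  }
  if (out.items) out.items = flattenUnionsSketch(out.items);
  return out;
}
```

Under these assumptions, the mixed-type case from the last commit, `flattenUnionsSketch({ type: ['integer', 'string'], enum: [1, 'auto'] })`, yields `{ type: 'integer' }` with the enum dropped. The sketch is lossy in the same way the commits describe: every union member after the first non-null one is discarded.
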
* 📖 feat: Add Claude Fable 5 Support

Claude Fable 5 (`claude-fable-5`) is Anthropic's most capable widely released model (GA 2026-06-09). Its naming drops the opus/sonnet/haiku tier, so LibreChat's name-parsing helpers miss it; this teaches them the Mythos-class family (Fable / Mythos) and registers the model.

- Add `parseMythosClassVersion` and route Fable/Mythos through `supportsAdaptiveThinking`, `omitsThinkingByDefault`, `omitsSamplingParameters`, and `supportsContext1m`
- Extend the Bedrock detection regexes (beta headers + adaptive-thinking branch) and `checkPromptCacheSupport` to match `claude-(fable|mythos)`
- Return 128K max output for Fable/Mythos in `maxOutputTokens.reset`/`set`
- Register `claude-fable-5` in shared Anthropic + Bedrock model lists, 1M context / 128K output token maps, and $10/$50 pricing with 12.5/1 cache rates (`claude-mythos-5` added to token + pricing maps only, since it is limited-availability)
- Update `.env.example` and the Vertex `librechat.example.yaml` examples
- Add parallel tests across tokens, Anthropic llm config, the Bedrock parser, and tx pricing

* 🧹 refactor: Centralize Mythos-class detection; address review feedback

- Add `isMythosClassModel` + `MYTHOS_CLASS_FAMILIES` in schemas.ts as the single source of truth for the Fable/Mythos family; route every gate (adaptive thinking, omit-thinking, omit-sampling, 1M context, prompt cache, 128K max-output reset/set) through it. A future sibling class is now a one-line edit.
- [Codex P2] Exclude Mythos-class from getBedrockAnthropicBetaHeaders: Fable/Mythos ship 128K output + fine-grained tool streaming by default, and the legacy output-128k-2025-02-19 beta is 3.7-Sonnet-only on Bedrock and risks request rejection. They still get adaptive thinking + effort.
- [Copilot] Add Mythos 5 test parity (name variations, cache rates, pinned $10/$50) in tx.spec; add Mythos context/max-output/name-match in tokens.spec; fix the stale claude-3-7-sonnet-only comment in bedrock.ts.
- Add isMythosClassModel unit tests covering all declared families.

* 📝 docs: Clarify Mythos-class Bedrock requirements; correct beta-omit rationale

Verified live against Bedrock (acct 951834775723, us-west-2):

- anthropic.claude-fable-5 IS a real Bedrock catalog model, INFERENCE_PROFILE-only exactly like the existing anthropic.claude-opus-4-7/4-8 and claude-sonnet-4-6 default entries (refutes the "invalid model id" review claim).
- Mythos-class also requires opting into Anthropic data sharing (Bedrock Data Retention API) before invocation.

Changes:

- .env.example: note that Mythos-class (Fable/Mythos) is inference-profile-only on Bedrock and needs the data-sharing opt-in.
- bedrock.ts: reword the beta-omit comment to the verified rationale: output-128k / fine-grained-tool-streaming are built-in/no-op for the 4.7+ generation, so omitting them is lossless (dropped the unverified "Bedrock may reject" wording).

* 🔄 refactor: Reorganize imports in schemas.ts and tx.spec.ts

- Moved `TFeedback` and `Tools` imports to the top of `schemas.ts` for better readability.
- Adjusted import order in `tx.spec.ts` to maintain consistency and improve clarity.

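The "single source of truth" refactor above might look something like the following sketch. The names `isMythosClassModel` and `MYTHOS_CLASS_FAMILIES` come from the commit message, but their bodies here are assumptions built from the `claude-(fable|mythos)` pattern it mentions; `maxOutputTokensSketch` is hypothetical, and 128000 is an assumed reading of "128K".

```typescript
// Assumed shape: one list of family names drives every gate.
const MYTHOS_CLASS_FAMILIES = ['fable', 'mythos'] as const;

// Built from the family list, so adding a sibling class is a one-line edit.
const MYTHOS_CLASS_PATTERN = new RegExp(
  `claude-(${MYTHOS_CLASS_FAMILIES.join('|')})`,
  'i',
);

function isMythosClassModel(model: string): boolean {
  return MYTHOS_CLASS_PATTERN.test(model);
}

// Hypothetical gate routed through the helper: Mythos-class models get the
// larger output cap, everything else keeps the caller-supplied fallback.
function maxOutputTokensSketch(model: string, fallback: number): number {
  const MYTHOS_MAX_OUTPUT = 128000; // assumption: "128K" taken as 128,000
  return isMythosClassModel(model) ? MYTHOS_MAX_OUTPUT : fallback;
}
```

Because the pattern is unanchored, it also matches provider-prefixed IDs such as `anthropic.claude-fable-5`, which is how the Bedrock entries in this change are named.
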
)

When a skill is primed fresh this turn (manual $-popover or always-apply) AND also appears in history as a `skill` tool_call, its SKILL.md body was injected twice: once by injectSkillPrimes and once reconstructed by formatAgentMessages.

- add `collectFreshSkillPrimeNames` helper (packages/api): union of manual + always-apply prime names
- client.js: pass the set as `skipSkillBodyNames` to formatAgentMessages for both the initialMessages and memoryMessages paths so the body reconstructs once. Names not primed this turn still reconstruct (sticky manual re-prime).

Requires `@librechat/agents` with `skipSkillBodyNames` support; the published dist silently ignores the unknown option until upgraded.

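As a rough illustration of the deduplication idea, the helper could be little more than a set union. The `SkillPrime` shape, the `Sketch`-suffixed function names, and the parameters are hypothetical; the real `collectFreshSkillPrimeNames` signature is not shown in this PR text.

```typescript
// Hypothetical shape of a primed skill; only the name matters here.
type SkillPrime = { name: string };

// Union of the skills primed manually and via always-apply this turn.
function collectFreshSkillPrimeNamesSketch(
  manual: SkillPrime[] = [],
  alwaysApply: SkillPrime[] = [],
): Set<string> {
  const names = new Set<string>();
  for (const prime of manual) names.add(prime.name);
  for (const prime of alwaysApply) names.add(prime.name);
  return names;
}

// Hypothetical consumer: a history `skill` tool_call reconstructs its body
// only when that skill was NOT already primed fresh this turn.
function shouldReconstructSkillBodySketch(
  skillName: string,
  skipSkillBodyNames: Set<string>,
): boolean {
  return !skipSkillBodyNames.has(skillName);
}
```

With this shape, a skill that appears in both lists is recorded once, and a skill present only in history still reconstructs, matching the "sticky manual re-prime" behavior described above.
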
* 🌊 fix: Stream File Authoring Previews from Partial Tool Args
* 🧹 chore: Sort Imports in FileAuthoringCall
* 👁️ feat: Keep File Authoring Input Visible After Completion

* ⏳ fix: Extend and decouple MCP OAuth flow timeouts

The OAuth auth button disappeared after 2 minutes (the internal OAuth handling timeout) while the flow state lived for 3 minutes, leaving users who didn't click immediately stuck in an unrecoverable re-auth loop. The handling timeouts also reused the connection/init timeout, so a short initTimeout would shrink the OAuth window further.

- Add MCP_OAUTH_HANDLING_TIMEOUT (10m) and MCP_OAUTH_FLOW_TTL (15m) to mcpConfig
- Decouple the reactive/proactive OAuth waits from initTimeout/connectionTimeout
- Use OAUTH_FLOW_TTL for the FlowStateManager TTL and the UI status window
- Ensure the flow TTL outlives the handling timeout, fixing the "Flow state not found" race
- Remove dead FLOW_TTL constant and document new env vars

Fixes #13615

* ⏳ fix: Coordinate OAuth pending window with handling timeout

Address Codex review: the extended OAuth wait was still capped by other timeouts that were not updated.

- Align PENDING_STALE_MS (button validity + pending-flow reuse window) with MCP_OAUTH_HANDLING_TIMEOUT so a flow stays reusable for the full wait instead of 2 minutes (Finding 3)
- Clamp MCP_OAUTH_FLOW_TTL to never fall below the handling timeout so a callback near the deadline still finds its flow state (Finding 2)
- Floor attemptToConnect's timeout to the handling window for OAuth servers so the reactive in-connect OAuth wait is not killed by the 30s connection timeout (Finding 1)
- Update flow staleness tests to reference the threshold symbolically

* ⏳ fix: Align OAuth window across status, action flows, and client polling

Address Codex round 2: extending the server wait exposed three more windows that were still capped or now over-extended.

- checkOAuthFlowStatus reports a PENDING flow as active only within the usable PENDING_STALE_MS window, not the longer Keyv retention TTL, so the connect button reappears instead of a stuck 'connecting' state
- Give Action (custom tool) OAuth its own FlowStateManager on the prior 3-minute TTL so the longer MCP OAuth TTL can't leave an action tool call waiting up to 15 minutes
- Extend the MCP server-card client polling to the 10-minute handling window so a user who completes OAuth after 3 minutes is still picked up

* 🧪 test: Make stale-flow CSRF test track PENDING_STALE_MS

The CSRF-fallback stale-flow test hardcoded a 3-minute age, which is now within the 10-minute PENDING_STALE_MS window and was wrongly treated as active. Derive the age from PENDING_STALE_MS so it tracks the constant.

* ⏳ fix: Add grace buffers and surface OAuth timeout to the client

Address Codex round 3 (near-deadline edges):

- Clamp MCP_OAUTH_FLOW_TTL to handling timeout + 60s grace (not equality), so flow state outlives the wait instead of expiring at the same instant
- Extend attemptToConnect's OAuth floor by a 60s grace so a user who authorizes near the deadline still gets the post-OAuth reconnect
- Surface OAUTH_HANDLING_TIMEOUT on the connection-status response and have the client poll for the configured window instead of a hardcoded 10 minutes, so a tuned server deadline isn't capped on the client

* ⏳ fix: Refresh client OAuth timeout from the first status refetch

If the connection-status cache is empty when polling starts, the client captured the 10-minute fallback and never picked up a tuned oauthTimeout. Re-read it after each refetch so a longer configured deadline is honored even on a cold cache.

* 📝 refactor: Type oauthTimeout on MCPConnectionStatusResponse

Declare the oauthTimeout field on the shared response type in data-provider instead of an ad-hoc inline cast in the client hook, and replace the pre-existing 'as any' on the status query read with the typed getQueryData. Type-level only; no runtime change.

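The relationship between the three windows settled on by these commits can be sketched as below. The env var names and the 10m/15m/60s values come from the commit messages; the function names, the millisecond units, and the parsing behavior are assumptions for illustration only.

```typescript
const MINUTE_MS = 60 * 1000;
const GRACE_MS = 60 * 1000; // the 60s grace buffer from round 3

// Assumed parsing: a positive number of milliseconds, else the default.
function parseMsSketch(raw: string | undefined, fallback: number): number {
  const value = Number(raw);
  return raw !== undefined && Number.isFinite(value) && value > 0 ? value : fallback;
}

function resolveOAuthWindowsSketch(env: Record<string, string | undefined>) {
  const handlingTimeout = parseMsSketch(env['MCP_OAUTH_HANDLING_TIMEOUT'], 10 * MINUTE_MS);
  const requestedTtl = parseMsSketch(env['MCP_OAUTH_FLOW_TTL'], 15 * MINUTE_MS);

  return {
    handlingTimeout,
    // Flow state must outlive the wait, so clamp to handling timeout + grace.
    flowTtl: Math.max(requestedTtl, handlingTimeout + GRACE_MS),
    // The button validity / pending-flow reuse window tracks the handling timeout.
    pendingStaleMs: handlingTimeout,
    // OAuth servers get a connect timeout floored to the handling window + grace.
    oauthConnectFloor: handlingTimeout + GRACE_MS,
  };
}
```

With no overrides this gives a 10-minute handling timeout and a 15-minute flow TTL. If an operator raises the handling timeout to 20 minutes without touching the TTL, the clamp lifts the TTL to 21 minutes so a callback arriving near the deadline still finds its flow state.
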
* fix: Resolve MCP Runtime User Placeholders
* fix: Harden MCP Runtime Placeholder Connections
* fix: Update MCP Source Tag Test Expectations
* fix: Complete MCP Runtime Placeholder Reinit
* fix: Harden MCP Request Scoped Runtime Configs
* fix: Align MCP OAuth Tests With Domain Policy
* fix: Harden MCP Runtime Resolution Edges
* fix: Avoid MCP Runtime Reprocessing Pitfalls
* fix: Reuse MCP Request Scoped Tool Discovery
* fix: Validate MCP Body Runtime Fields

* 🛡️ refactor: Harden runtime placeholder edges from review

- Warn at inspection when a trusted server URL contains runtime placeholders but no domain allowlist restricts the resolved target
- Document the three resolution sites that must stay in sync so the validated config always matches the connected one
- Note the per-call connect cost of ephemeral GRAPH/BODY connections
- Drop the no-op removeUserConnection in callTool's ephemeral cleanup; ephemeral connections are never stored, and removing the entry could orphan a still-connected cached connection after a config change

* 🪪 fix: Cover oauth_headers, Graph URL gating, and request-scoped reconnects

Address Codex review:

- Resolve runtime placeholders in oauth_headers (processMCPEnv + Graph pre-pass) and include the field in placeholder detection, so OAuth discovery/token requests no longer send literals; consolidate the detection field lists into one helper
- Defer the early domain gate when the URL still carries a Graph placeholder (resolved async later); the authoritative assertResolvedRuntimeConfigAllowed check still enforces policy
- Bypass the 10s reconnect throttle for request-scoped servers, which re-fetch tool definitions on every message by design

…ts (#13636)

* 🛰️ feat: Add GPT-5.5 + Frontier OpenAI Models, Drop Deprecated Defaults

* 🛰️ fix: Address Codex Review on OpenAI Model Refresh

- Replace nonexistent gpt-5.5-chat-latest with the actual chat-latest alias; register its context window, output cap, pricing, and cache rates, and pin explicit rates for legacy gpt-5.x-chat-latest aliases so the new chat-latest key cannot out-match their cheaper pricing
- Add long-context premium tiers (>272K input) for gpt-5.5 and gpt-5.4
- Disable streaming for pro reasoning models (o1-pro, gpt-5.x-pro), which OpenAI does not support, with spec coverage

* 🛰️ fix: Address Codex Round-2 Review and CI Spec Failure

- Allow chat-latest through the official OpenAI fetched-model filter
- Export isProReasoningModel and drop unsupported sampling parameters for versioned pro models (gpt-5.4-pro, gpt-5.5-pro), which the versioned-model exemption previously let through
- Honor the pro-model streaming disable in both agent chat-completions routes, which decide SSE from model_parameters before llmConfig exists
- Update models.spec default-list assertions for the refreshed defaults and cover chat-latest filter retention

* 🛰️ fix: Address Codex Round-3 Review

- Convert max_tokens for chat-latest, which the gpt-[5-9] guard missed
- Drop snake_case sampling params (top_p, logit_bias, penalties) in the reasoning-model exclusion list so addParams-sourced values are removed
- Add createOpenAIAggregatorHandlers and wire them into the agent chat-completions service's non-streaming branch, which previously ran with no handlers and always returned an empty aggregated response

* 🛰️ ci: Fix Import Order Drift and Controller Spec Mock

- Sort type import first in service.spec.ts per import-order convention
- Register isProReasoningModel in the openai controller spec's @librechat/api mock factory, whose enumerated exports left the new helper undefined and broke the non-streaming flow under test

* 🛰️ chore: Trim Scope to Model Catalog Changes

Revert the OpenAI endpoint and agent handler changes (pro-model streaming, sampling exclusions, non-streaming aggregation); that surface is moving out of LibreChat into the agents SDK and belongs in its own change. Keep the model list, token windows, pricing, and the fetched-model filter for chat-latest.

* 🛰️ fix: Correct GPT-5.4 Context Windows and Pro Long-Context Pricing

- Set gpt-5.4 and gpt-5.4-pro context to the documented 1,050,000 window; 272K is the long-context pricing breakpoint, not the cap, and using it truncated prompts before they could reach that tier
- Add gpt-5.4-pro long-context premium rates ($60/$270 above 272K) per its model page; gpt-5.5-pro documents no long-context tier

* 🛰️ fix: Add gpt-5.4-nano and gpt-5.5-pro Long-Context Pricing

- Register gpt-5.4-nano ($0.20/$1.25, cached $0.02, 400K context) in the model list, pricing, cache, and token maps; the longest-match fallback billed it at gpt-5.4's $2.50/$15
- Add gpt-5.5-pro long-context premium rates ($60/$270 above 272K); the pricing table lists the tier even though the model page omits it

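Two pricing behaviors mentioned above, the longest-match fallback and the long-context premium tier, can be sketched together. This is an illustration only: the `Rate` type and function names are hypothetical, the gpt-5.4-pro base rates are placeholders not taken from this PR, and "272K" is assumed to mean 272,000 input tokens. The gpt-5.4 and gpt-5.4-nano rates and the $60/$270 premium are from the commit messages.

```typescript
type Rate = {
  prompt: number; // USD per 1M input tokens
  completion: number; // USD per 1M output tokens
  longContext?: { threshold: number; prompt: number; completion: number };
};

const RATES_SKETCH: Record<string, Rate> = {
  'gpt-5.4': { prompt: 2.5, completion: 15 },
  'gpt-5.4-nano': { prompt: 0.2, completion: 1.25 },
  'gpt-5.4-pro': {
    prompt: 30, // placeholder base rate, NOT from the source
    completion: 180, // placeholder base rate, NOT from the source
    longContext: { threshold: 272000, prompt: 60, completion: 270 },
  },
};

// Longest-match lookup: the longest key contained in the model name wins.
function findRateKeySketch(model: string, table: Record<string, Rate>): string | undefined {
  let best: string | undefined;
  for (const key of Object.keys(table)) {
    if (model.includes(key) && (best === undefined || key.length > best.length)) {
      best = key;
    }
  }
  return best;
}

// Input rate for a request, switching to the premium tier above the threshold.
function promptRateSketch(
  model: string,
  inputTokens: number,
  table: Record<string, Rate> = RATES_SKETCH,
): number | undefined {
  const key = findRateKeySketch(model, table);
  if (key === undefined) return undefined;
  const rate = table[key];
  if (rate === undefined) return undefined;
  const tier = rate.longContext;
  return tier !== undefined && inputTokens > tier.threshold ? tier.prompt : rate.prompt;
}
```

The sketch also reproduces the billing bug the last commit fixes: if the `gpt-5.4-nano` key is missing from the table, the longest remaining match for that model is `gpt-5.4`, so it is charged at $2.50 instead of $0.20.
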
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)