Releases: llamastack/llama-stack
v0.7.0
What's Changed
- fix: exclude informational checks from ci-status aggregation by @leseb in #5105
- feat: add Responses API test coverage analyzer and conformance annotations by @leseb in #5101
- refactor!: remove fine_tuning API by @leseb in #5104
- fix!: remove duplicate dataset_id parameter in append-rows endpoint by @eoinfennessy in #4849
- fix: Multi-worker cache synchronization for vector stores by @elinacse in #5076
- feat: Add integration test for service_tier with openai client by @gyliu513 in #5103
- feat: test responses API integration tests against Azure AI Foundry by @iamemilio in #5107
- fix(security): add path traversal and header injection defenses by @rhdedgar in #5086
- feat!: Part 2 - implement inline neural rerank for RAG by @r3v5 in #4877
- feat: add provider compatibility matrix for Responses API by @leseb in #5113
- perf: lazy-load braintrust autoevals to reduce idle memory (~63MB) by @leseb in #5078
- feat: add provider version tracking to compatibility matrix by @leseb in #5115
- perf: lazy-load torch in embedding_mixin to reduce startup memory by @leseb in #5116
- perf: lazy-load torch and transformers in prompt_guard by @leseb in #5117
- perf: lazy-load numpy, faiss, and sqlite_vec in vector_io providers by @leseb in #5118
- fix(CI): reduce Mergify PR update frequency by @gyliu513 in #5106
- feat: Add support for filters in PGVector and replace f-string usage in table name by @franciscojavierarceo in #5111
- fix: bump pyjwt to 2.12.0 (CVE-2026-32597) by @eoinfennessy in #5127
- fix(inference): improve chat completions OpenAI conformance by @cdoern in #5108
- fix(storage): resolve asyncio event loop mismatch via operation deferral by @derekhiggins in #5130
- fix(ci): use RELEASE_PAT and PRs in post-release workflow by @cdoern in #5132
- chore: bump fallback_version to 0.6.1.dev0 by @cdoern in #5136
- fix: remove UV_EXTRA_INDEX_URL from Release branch ci by @cdoern in #5138
- fix(ci): add uv lock to post-release workflow to update stale lockfile by @cdoern in #5139
- chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.11.6 to 1.13.0 by @dependabot[bot] in #5148
- chore(github-deps): bump docker/setup-buildx-action from 3.12.0 to 4.0.0 by @dependabot[bot] in #5142
- chore(github-deps): bump astral-sh/setup-uv from 7.3.1 to 7.5.0 by @dependabot[bot] in #5143
- feat(blog): Agentic flows tutorial by @raghotham in #5035
- chore(github-deps): bump docker/login-action from 3.7.0 to 4.0.0 by @dependabot[bot] in #5146
- chore(github-deps): bump llamastack/llama-stack from ce063ac to 2157c09 by @dependabot[bot] in #5145
- feat: Add OpenAI client integration test for top_logprobs by @gyliu513 in #5124
- ci(mergify): skip conflict comments on stale PRs by @leseb in #5156
- feat: Add stream_options parameter support by @gyliu513 in #4815
- feat: promote connector API from v1alpha to v1beta by @leseb in #5129
- refactor: replace LiteLLM with OpenAI mixin for WatsonX provider by @leseb in #5133
- fix: optimize connector listing by @gyliu513 in #5164
- feat: Add OpenAI client integration test for incomplete_details by @gyliu513 in #5157
- refactor!: rename meta-reference providers to builtin by @leseb in #5131
- feat!: eliminate /files/{file_id} GET differences by @r3v5 in #5154
- feat: Add OpenAI client integration test for reasoning effort by @gyliu513 in #5170
- fix: replace blocking requests calls with async httpx in remote providers by @gyliu513 in #5162
- fix: remove references to defunct inline::builtin inference provider by @leseb in #5174
- fix(vertexai): use SDK-native model names instead of stripping prefixes by @major in #5169
- docs: add multi-tenant isolation example for conversations and responses by @jaideepr97 in #5176
- fix: Remove duplicate decode by @gyliu513 in #5177
- refactor: decouple file_search from legacy knowledge_search tool_groups by @leseb in #5175
- feat: add configurable asyncpg connection pool settings by @iamemilio in #5160
- chore: remove unused LiteLLMOpenAIMixin by @mattf in #5159
- fix: Disable asyncpg OTel auto-instrumentation to prevent duplicate DB spans by @iamemilio in #5158
- refactor!: rename knowledge_search to file_search across codebase by @leseb in #5186
- fix: re-enable external provider module test by @cdoern in #5182
- feat: add WatsonX Responses API integration test recordings by @leseb in #5120
- feat: Add metrics for vector io by @gyliu513 in #5096
- refactor: rename rag-runtime provider and builtin::rag toolgroup to file-search by @leseb in #5187
- feat: auto-record integration tests on PRs with multi-provider support by @cdoern in #5123
- fix: update recording workflow action SHAs to include skip-commit support by @cdoern in #5199
- fix: support workflow_dispatch in commit-recordings via PR metadata artifact by @cdoern in #5202
- fix: bump pyasn1 to 0.6.3 (CVE-2026-30922) by @eoinfennessy in #5207
- docs: Add post about Responses API in Llama Stack by @jwm4 in #5196
- fix: support fork PRs in commit-recordings workflow by @cdoern in #5204
- fix: clean up artifacts before cloning fork PR branch by @cdoern in #5212
- fix: handle both artifact structures for recordings copy by @cdoern in #5214
- chore: rename bug template by @leseb in #5210
- fix: only comment on PR when recordings are actually pushed by @cdoern in #5218
- fix: prevent OTel context leak in fire-and-forget background tasks by @iamemilio in #5168
- fix: provider_data_var context leak by @jaideepr97 in #5227
- chore: Update formatting in CONTRIBUTING.md by @raghotham in #5231
- chore(github-deps): bump actions/cache from 5.0.3 to 5.0.4 by @dependabot[bot] in #5241
- chore(github-deps): bump actions/upload-artifact from 4.6.2 to 7.0.0 by @dependabot[bot] in #5235
- chore(github-deps): bump docker/build-push-action from 6.19.2 to 7.0.0 by @dependabot[bot] in #5236
- chore(github-deps): update llamastack/llama-stack requirement to 700b202 by @dependabot[bot] in #5239
- chore(github-deps): bump docker/setup-qemu-action from 3.7.0 to 4.0.0 by @dependabot[bot] in #5234
- feat!: BREAKING CHANGE: make sentence_transformers trust_remote_code configurable, default to False by @derekhiggins in #4602
- docs: add architecture documentation and module-level READMEs by @leseb in #5213
- refactor!: remove tool_groups from public API and auto-register from provider specs by @leseb in #4997
- docs: add AGENTS.md with gui...
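
Several perf entries in this release describe lazy-loading heavy dependencies (torch, transformers, numpy, faiss) to reduce startup or idle memory. The general pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `LazyModule` is a hypothetical helper, and the standard-library `json` module stands in for a heavy dependency.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


class LazyModule:
    """Hypothetical helper: defer importing a module until first attribute access."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._module: Optional[ModuleType] = None

    def _load(self) -> ModuleType:
        # Import on first use only; later calls reuse the cached module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return self._module

    def __getattr__(self, attr: str) -> Any:
        # Called only for attributes not found on the wrapper itself,
        # so _name, _module, and _load never trigger an import.
        return getattr(self._load(), attr)


# "json" is a stand-in for a heavy dependency such as torch.
json_lazy = LazyModule("json")
print(json_lazy._module is None)    # nothing imported by the wrapper yet
print(json_lazy.dumps({"a": 1}))    # first use triggers the import
```

The import cost is paid only by code paths that actually touch the dependency. In exchange, a missing package surfaces as an import error at first use rather than at startup.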
v0.6.1
What's Changed
- fix: remove UV_EXTRA_INDEX_URL from Release branch ci (backport #5138) by @mergify[bot] in #5140
- chore: update llama-stack-client to ^0.6.0 in UI lockfile by @cdoern in #5137
- fix(storage): resolve asyncio event loop mismatch via operation deferral (#5130) by @derekhiggins in #5135
- feat(blog): Agentic flows tutorial (backport #5035) by @mergify[bot] in #5167
- fix: milvus hybrid ranker usage (backport #5312) by @mergify[bot] in #5368
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's Changed
- chore: update convert_tooldef_to_openai_tool to match its usage by @mattf in #4837
- feat!: improve consistency of post-training API endpoints by @eoinfennessy in #4606
- fix: Arbitrary file write via a non-default configuration by @VaishnaviHire in #4844
- chore: reduce uses of models.llama.datatypes by @mattf in #4847
- docs: add technical release steps and improvements to RELEASE_PROCESS.md by @cdoern in #4792
- chore: bump fallback version to 0.5.1 by @cdoern in #4846
- fix: Exclude null 'strict' field in function tools to prevent OpenAI … by @gyliu513 in #4795
- chore(test): add test to verify responses params make it to backend service by @mattf in #4850
- chore: revert "fix: disable together banner (#4517)" by @mattf in #4856
- fix: update together to work with latest api.together.xyz service (circa feb 2026) by @mattf in #4857
- chore(github-deps): bump astral-sh/setup-uv from 7.2.0 to 7.3.0 by @dependabot[bot] in #4867
- chore(github-deps): bump github/codeql-action from 4.32.0 to 4.32.2 by @dependabot[bot] in #4861
- chore(github-deps): bump actions/cache from 5.0.2 to 5.0.3 by @dependabot[bot] in #4859
- chore(github-deps): bump llamastack/llama-stack from 76bcb66 to c518b35 by @dependabot[bot] in #4858
- fix(ci): ensure oasdiff is available for openai-coverage hook by @EleanorWho in #4835
- fix: Deprecate items when create conversation by @gyliu513 in #4765
- chore: refactor chunking to use configurable tiktoken encoding and document tokenizer limits by @mattf in #4870
- chore: prune unused parts of models packages (checkpoint, tokenizer, prompt templates, datatypes) by @mattf in #4871
- chore: prune unused utils from utils.memory.vector_store by @mattf in #4873
- fix: Escape special characters in auto-generated provider documentati… by @gyliu513 in #4822
- chore(docs): Use starter for opentelemetry integration test by @gyliu513 in #4875
- fix: kvstore should call shutdown but not close by @gyliu513 in #4872
- fix: uvicorn log ambiguity by @cdoern in #4522
- chore(github-deps): bump actions/checkout from 4.2.2 to 6.0.2 by @dependabot[bot] in #4865
- chore: cleanup mypy excludes by @mattf in #4876
- feat: add integration test for max_output_tokens by @gyliu513 in #4825
- chore(test): add test to verify responses params make it to backend s… by @gyliu513 in #4852
- ci: add Docker image publishing to release workflow by @cdoern in #4882
- feat: add ProcessFileRequest model to file_processors API by @alinaryan in #4885
- docs: update responses api known limitations doc by @jaideepr97 in #4845
- fix(vector_io): align Protocol signatures with request models by @skamenan7 in #4747
- fix: add _ExceptionTranslatingRoute to prevent keep-alive breakage on Linux by @iamemilio in #4886
- docs: add release notes for version 0.5 by @rhuss in #4855
- fix(ci): disable uv cache cleanup when UV_NO_CACHE is set by @cdoern in #4889
- feat: Add truncation parameter support by @gyliu513 in #4813
- chore(ci): bump pinned action commit hashes in integration-tests.yml by @cdoern in #4895
- docs: Add README for running observability test by @gyliu513 in #4884
- fix: update rerank routing to match params by @mattf in #4900
- feat: Add prompt_cache_key parameter support by @gyliu513 in #4775
- chore: add rerank support to recorder by @mattf in #4903
- feat: add rerank support to vllm inference provider by @mattf in #4902
- fix(inference): use flat response message model for chat/completions by @cdoern in #4891
- feat: add llama cpp server remote inference provider by @Bobbins228 in #4382
- fix: Remove pillow as direct dependency by @VaishnaviHire in #4901
- fix: pre-commit run -a by @mattf in #4907
- fix(ci): Removed kotlin from preview builds by @gyliu513 in #4910
- feat: Add service_tier parameter support by @gyliu513 in #4816
- chore(github-deps): bump github/codeql-action from 4.32.2 to 4.32.3 by @dependabot[bot] in #4918
- chore(github-deps): bump docker/login-action from 3.4.0 to 3.7.0 by @dependabot[bot] in #4916
- chore(github-deps): bump llamastack/llama-stack from c7cdb40 to 4c1b03b by @dependabot[bot] in #4915
- chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.10.0 to 1.11.6 by @dependabot[bot] in #4913
- chore(github-deps): bump docker/build-push-action from 6.15.0 to 6.19.2 by @dependabot[bot] in #4912
- fix(vertexai): raise descriptive error on auth failure instead of silent empty string by @major in #4909
- fix: resolve StorageConfig default env vars at construction time by @major in #4897
- feat: Add incomplete_details response property by @gyliu513 in #4812
- feat(client-sdks): add OpenAPI Generator tooling by @aegeiger in #4874
- fix(vector_io): eliminate duplicate call for vector store registration by @r3v5 in #4925
- test(vertexai): add unit tests for VertexAI inference adapter by @major in #4927
- feat: introduce new how-to blog by @cdoern in #4794
- chore: remove reference to non-existent WeaviateRequestProviderData by @mattf in #4937
- feat: standardized error types with HTTP status codes by @iamemilio in #4878
- feat: add opentelemetry-distro to core dependencies by @Artemon-line in #4935
- feat(ci): Add nightly job for doc build by @gyliu513 in #4911
- fix: Ensure user isolation for stored conversations and responses by @jaideepr97 in #4834
- fix: align chat completion usage schema with OpenAI spec by @cdoern in #4930
- fix: allow conversation item type to be omitted by @mattf in #4948
- feat: Enable inline PyPDF file_processors provider by @alinaryan in #4743
- feat: add support for /responses background parameter by @cdoern in #4824
- feat(vector_io): Implement Contextual Retrieval for improved RAG search quality by @r-bit-rry in #4750
- chore: use SecretStr for x-llamastack-provider-data keys by @mattf in #4939
- chore: remove unused vector store utils by @mattf in #4961
- feat: auto-identify embedding models for vllm by @mattf in #4975
- chore(github-deps): bump llamastack/llama-stack from 4c1b03b to 7d9786b by @dependabot[bot] in #4971
- chore(github-deps): bump actions/checkout from 6.0.1 to 6.0.2 by @dependabot[bot] in #4969
- chore(github-deps): bump actions/cache from 4.2.0 to 5.0.3 by @dependabot[bot] in #4963
- chore(github-deps): bump github/codeql-action from 4.32.3 to 4.32.4 by @dependabot...
v0.5.2
What's Changed
- chore: bump llama-stack-client to 0.5.1 by @cdoern in #4957
- ci: add arm64 image manifest publishing to release workflow by @rhdedgar in #5006
- feat(ci): automate post-release and pre-release version management (backport #4938) by @mergify[bot] in #5032
- fix(llama-guard): less strict parsing of safety categories (backport #5045) by @mergify[bot] in #5053
- fix: OCI26ai sql query patches (backport #5046) by @mergify[bot] in #5054
Full Changelog: v0.5.1...v0.5.2
v0.5.1
What's Changed
- fix: [release-0.5.x] Arbitrary file write via a non-default configuration (#4844) by @VaishnaviHire in #4869
- fix(vertexai): raise descriptive error on auth failure instead of silent empty string (backport #4909) by @mergify[bot] in #4923
- fix: resolve StorageConfig default env vars at construction time (backport #4897) by @mergify[bot] in #4924
- feat: add opentelemetry-distro to core dependencies (backport #4935) by @mergify[bot] in #4943
- fix(vector_io): eliminate duplicate call for vector store registration (backport #4925) by @mergify[bot] in #4941
- chore: bump version to 0.5.1 for release by @cdoern in #4955
Full Changelog: v0.5.0...v0.5.1
v0.4.5
What's Changed
- chore: bump llama-stack-client to 0.4.4 in UI lockfile by @cdoern in #4791
- fix: MCP CPU spike by using context manager for session cleanup by @derekhiggins in #4851
- fix(vector_io): eliminate duplicate call for vector store registration (backport #4925) by @mergify[bot] in #4944
- chore: bump version to 0.4.5 for release by @cdoern in #4954
Full Changelog: v0.4.4...v0.4.5
v0.5.0
What's Changed
- docs: Added a new oci-llamastack notebook for how to build agents with OCI and llama stack by @omaryashraf5 in #4418
- docs: Add guide to migrating from Agents to Responses by @jwm4 in #4375
- feat: convert models API to use a FastAPI router by @nathan-weinberg in #4407
- chore: update mcp dependency constraint to >=1.23.0 by @derekhiggins in #4457
- feat(ci): added codeql scanning workflow by @gmatuz in #4462
- feat: migrate Conversations API to FastAPI router by @leseb in #4342
- fix(faiss): add backward compatibility for EmbeddedChunk deserialization by @leseb in #4463
- fix: Removed duplicate parameters from integration test by @gyliu513 in #4461
- fix: removed scan on push by @gmatuz in #4466
- chore: Document release process by @raghotham in #4470
- feat: build ARM64-based UBI starter image by @rhdedgar in #4474
- chore: add "Discussion" issue template by @nathan-weinberg in #4469
- chore: Updated test integration guide by @gyliu513 in #4460
- fix: update CONTRIBUTING.md to reflect pre-commit version used in CI by @eoinfennessy in #4468
- fix: skip resources with empty IDs from conditional env vars in config processing by @Elbehery in #4455
- fix: Fix Vector Store Integration Tests by @franciscojavierarceo in #4472
- chore: Delete CHANGELOG.md by @terrytangyuan in #4480
- ci: run ARM64 builds on nightly schedule only by @rhdedgar in #4479
- chore(github-deps): bump actions/checkout from 4.3.1 to 6.0.1 by @dependabot[bot] in #4491
- chore(github-deps): bump astral-sh/setup-uv from 7.1.6 to 7.2.0 by @dependabot[bot] in #4490
- chore(github-deps): bump docker/setup-qemu-action from 3.2.0 to 3.7.0 by @dependabot[bot] in #4489
- chore(github-deps): bump github/codeql-action from 3.31.9 to 4.31.9 by @dependabot[bot] in #4488
- chore(github-deps): bump stainless-api/upload-openapi-spec-action from 1.9.0 to 1.10.0 by @dependabot[bot] in #4487
- chore: Add backwards compatibility for Milvus Chunks by @franciscojavierarceo in #4484
- fix: aiohttp HTTP Parser auto_decompress feature susceptible to zip bomb by @leseb in #4494
- chore: Add backwards compatibility for qdrant chunks by @Ygnas in #4495
- chore: Updated CONTRIBUTING guidance for integration test by @gyliu513 in #4459
- fix: fonttools security advisory by @leseb in #4503
- chore: Add backwards compatibility for pgvector chunks by @Ygnas in #4506
- refactor!: change image_name to distro_name in StackConfig by @cdoern in #4396
- fix: Add backwards compatibility for sqlite-vec, chroma, and weaviate chunks by @ChristianZaccaria in #4502
- fix: urllib3 vulnerable to decompression-bomb safeguard bypass by @leseb in #4512
- fix: disable together banner by @cdoern in #4517
- chore: switch to monthly minor release by @leseb in #4518
- chore: change discussion template label by @nathan-weinberg in #4525
- chore: add maintenance policy to release doc by @leseb in #4514
- docs: fixed outdated links for api overview, routed to the updated links by @lalexandrh in #4524
- chore: upgrade virtualenv by @raghotham in #4585
- chore: resync client dep with main by @leseb in #4591
- fix: llama-stack-api packaging by @cdoern in #4593
- docs: add guidance for contributing new providers by @leseb in #4478
- feat: migrate post_training API to FastAPI router by @eoinfennessy in #4496
- fix(memory/rag): remove file:// uri prefix by @r-bit-rry in #4286
- fix: benchmark registration via registered_resources config by @leseb in #4600
- feat(api): migrate Eval API to FastAPI router (#4345) by @r-bit-rry in #4425
- feat: convert shields API to use a FastAPI router by @nathan-weinberg in #4412
- feat: convert datasetio API to use a FastAPI router by @nathan-weinberg in #4400
- feat: Elasticsearch integration for VectorIO by @ezimuel in #4007
- fix: enable vector store registration from config with OpenAI metadata by @are-ces in #4616
- docs: Update RAG Agent Documentation using vector_stores by @robinnarsinghranabhat in #4485
- chore(github-deps): bump github/codeql-action from 4.31.9 to 4.31.10 by @dependabot[bot] in #4640
- chore(github-deps): bump actions/cache from 5.0.1 to 5.0.2 by @dependabot[bot] in #4639
- chore(github-deps): bump docker/setup-buildx-action from 3.11.1 to 3.12.0 by @dependabot[bot] in #4638
- chore(github-deps): bump actions/setup-node from 6.1.0 to 6.2.0 by @dependabot[bot] in #4637
- fix: update responses limitations doc to track latest state by @iamemilio in #4392
- feat: Convert scoring API to use a FastAPI router by @gyliu513 in #4521
- fix: Removed unused para for test score by @gyliu513 in #4645
- fix: fix list-deps quoting in deps-only output by @gyliu513 in #4653
- feat(api): Implement connector support via static configuration by @jaideepr97 in #4263
- fix: default ollama URL in Quickstart was incorrect in 2 places by @damian0815 in #4646
- fix: unregister function first before register by @gyliu513 in #4473
- feat!: migrate safety API to FastAPI router by @r-bit-rry in #4643
- feat: convert prompts API to use a FastAPI router by @nathan-weinberg in #4649
- feat: add scheduled CI workflow for release branches by @cdoern in #4510
- fix: Fix redundant MCP tools/list calls by @jwm4 in #4634
- docs: Move demo script to step 3 for quick start by @gyliu513 in #4661
- feat: convert scoring_functions API to use FastAPI router. by @EleanorWho in #4599
- feat(ci): add Bedrock integration tests with record/replay by @skamenan7 in #4292
- feat: Core Changes for default embedding dims by @rriley99-oci in #4671
- fix: use SecretStr for AWS credentials by @eoinfennessy in #4681
- feat: Implemented reasoning.effort parameter in LLS Responses by @Nehanth in #4633
- fix: file_search_call results missing document attributes/metadata by @are-ces in #4680
- fix!: usage input_token_details and output_token_details are not optional by @mattf in #4690
- fix: completed_at is required output by @mattf in #4692
- fix: store is required output by @mattf in #4693
- docs: update contrib guidelines on PR reviews by @leseb in #4676
- docs: require test plan with script and output for API PRs by @leseb in #4659
- feat: Add OpenAI API conformance coverage analyzer by @leseb in #4668
- feat!: use global vertext API endpoint by @ktdreyer in #4674
- fix: Concurrent calls into SentenceTransformer() cause failures of cl...
v0.4.4
What's Changed
- fix: Enable session polling during streaming responses (backport #4738) by @mergify[bot] in #4756
- feat: add scheduled CI workflow for release branches (backport #4510) by @mergify[bot] in #4769
- fix: make release-branch-scheduled-ci compatible with older branches (backport #4753) by @mergify[bot] in #4767
- fix: pass branch explicitly to install-llama-stack-client action (backport #4759) by @mergify[bot] in #4763
- fix: llama-stack-api packaging by @cdoern in #4777
- feat(ci): unify PyPI/npm release workflow with dry-run support (backport #4774) by @mergify[bot] in #4785
- fix: install setuptools-scm in CI (backport #4782) by @mergify[bot] in #4786
- build: bump llama-stack-client to 0.4.4 for release by @cdoern in #4787
- fix: override version from release tag for all packages (backport #4788) by @mergify[bot] in #4789
Full Changelog: v0.4.3...v0.4.4
v0.4.3
What's Changed
- fix: enable vector store registration from config with OpenAI metadata (backport #4616) by @mergify[bot] in #4631
- fix: Fix redundant MCP tools/list calls (backport #4634) by @mergify[bot] in #4663
- fix: file_search_call results missing document attributes/metadata (backport #4680) by @mergify[bot] in #4686
- fix: Concurrent calls into SentenceTransformer() cause failures of client.vector_stores.file_batches.create() (backport #4636) by @mergify[bot] in #4698
- feat: Add shutdown functionality to LlamaStackAsLibraryClient and AsyncLlamaStackAsLibraryClient (backport #4642) by @mergify[bot] in #4733
- feat(PGVector): implement automatic creation of vector extension during initialization of PGVectorVectorIOAdapter (backport #4660) by @mergify[bot] in #4740
Full Changelog: v0.4.2...v0.4.3
v0.4.2
What's Changed
- fix: disable together banner (backport #4517) by @mergify[bot] in #4519
- fix: llama-stack-api packaging (backport #4593) by @mergify[bot] in #4596
- fix(memory/rag): remove file:// uri prefix (backport #4286) by @mergify[bot] in #4603
- fix: benchmark registration via registered_resources config (backport #4600) by @mergify[bot] in #4604
Full Changelog: v0.4.1...v0.4.2