[Fix] Cache - Avoiding expensive operations when cache isn't available #15182

AlexsanderHamir · 2025-10-03T22:58:50Z

Title

[Fix] Cache - Avoiding expensive operations when cache isn't available

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR.

I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
I have added a screenshot of my new test passing locally
My PR passes all unit tests on `make test-unit`
My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🧹 Refactoring

Changes

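Check whether a cache is available before doing any cache-related work, so that requests which cannot use the cache skip the expensive operations entirely. Client code now handles a None response from the caching handler.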
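The idea can be illustrated with a minimal sketch. Everything here is hypothetical and only stands in for the real code: `CachingHandlerSketch`, `_make_key`, `get_cached`, `store`, and `complete` are illustrative names, not LiteLLM's actual API, and hashing the request is just an assumed example of "expensive work".

```python
import hashlib
import json
from typing import Any, Dict, Optional


class CachingHandlerSketch:
    """Hypothetical stand-in for a caching handler; not LiteLLM's actual class."""

    def __init__(self, cache: Optional[Dict[str, Any]] = None) -> None:
        self.cache = cache      # None means caching is disabled
        self.keys_computed = 0  # counts how often the costly step runs

    def _make_key(self, request: Dict[str, Any]) -> str:
        # Assumed example of expensive work: serialize and hash the request.
        self.keys_computed += 1
        payload = json.dumps(request, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get_cached(self, request: Dict[str, Any]) -> Optional[Any]:
        # Availability check comes first: no key is computed without a cache.
        if self.cache is None:
            return None
        return self.cache.get(self._make_key(request))

    def store(self, request: Dict[str, Any], response: Any) -> None:
        # Same guard on the write path.
        if self.cache is None:
            return
        self.cache[self._make_key(request)] = response


def complete(handler: CachingHandlerSketch, request: Dict[str, Any]) -> Any:
    # The caller treats None as "nothing cached" and falls through.
    cached = handler.get_cached(request)
    if cached is not None:
        return cached
    response = {"echo": request["prompt"]}  # placeholder for the real model call
    handler.store(request, response)
    return response
```

With `cache=None`, `complete` never calls `_make_key`, which is the property the change is after: the non-cached path pays nothing for the cache machinery.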

Performance Gains

With DB

Before

Type	Name	# Requests	# Fails	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS
POST	/chat/completions	102040	2	500	830	1200	532.05	67	2527	398	778.4
Custom	LiteLLM Overhead Duration (ms)	102038	0	47	71	97	50.42	4	1438	0	778.4
	Aggregated	204078	2	290	700	1100	291.24	4	2527	199	1556.8

After

Type	Name	# Requests	# Fails	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS	Current Failures/s
POST	/chat/completions	101891	2	280	520	1000	310.07	105	53491	398	939.1	0.1
Custom	LiteLLM Overhead Duration (ms)	101889	0	25	45	60	27.27	1	851	0	939	0
	Aggregated	203780	2	130	420	870	168.67	1	53491	199	1878.1	0.1

With DB + Redis

Before

Type	Name	# Requests	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS
POST	/chat/completions	35569	1900	3500	4400	2052.36	254	8315	398	330.2
Custom	LiteLLM Overhead Duration (ms)	35569	300	770	1200	356.51	27	2156	0	330.2
	Aggregated	71138	920	3100	4000	1204.44	27	8315	199	660.4

After

Type	Name	# Requests	Median (ms)	95%ile (ms)	99%ile (ms)	Average (ms)	Min (ms)	Max (ms)	Average size (bytes)	Current RPS
POST	/chat/completions	21000	1700	2800	3400	1744.71	84	5383	398	424.1
Custom	LiteLLM Overhead Duration (ms)	21000	270	650	1100	311.15	7	1413	0	424.1
	Aggregated	42000	820	2600	3100	1027.93	7	5383	199	848.2


vercel · 2025-10-03T22:58:55Z


Project	Deployment	Preview	Comments	Updated (UTC)
litellm	Error			Oct 4, 2025 3:19pm

ishaan-jaff

lgtm

Fixed `TypeError: typing.Any cannot be used with isinstance()` that was occurring in the caching handler when checking cached streaming responses. The issue was caused by CustomStreamWrapper being aliased to `typing.Any` at runtime through the TYPE_CHECKING conditional import pattern. When the code attempted to use isinstance(cached_result, CustomStreamWrapper) at lines 222 and 338, it failed because Python's isinstance() cannot be used with typing.Any. Solution: Import CustomStreamWrapper at runtime separately from the TYPE_CHECKING block, while keeping a type alias for static type checking. This allows isinstance checks to work properly while maintaining type hints.
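A minimal sketch of the failure and the fix described above, assuming a locally defined `StreamWrapperStandIn` class in place of the real `CustomStreamWrapper` (so the "runtime import" is simulated by a normal class definition). The names `BrokenAlias`, `is_stream_broken`, and `is_stream_fixed` are hypothetical.

```python
from typing import TYPE_CHECKING, Any


class StreamWrapperStandIn:
    """Hypothetical stand-in for CustomStreamWrapper."""


# Problematic pattern: the name is a real class only for static type checkers.
if TYPE_CHECKING:
    BrokenAlias = StreamWrapperStandIn
else:
    BrokenAlias = Any  # at runtime the name is just typing.Any


def is_stream_broken(obj: object) -> bool:
    # Raises TypeError at runtime because BrokenAlias is typing.Any.
    return isinstance(obj, BrokenAlias)


def is_stream_fixed(obj: object) -> bool:
    # Fix: isinstance() receives the real class, which exists at runtime.
    return isinstance(obj, StreamWrapperStandIn)


if __name__ == "__main__":
    try:
        is_stream_broken(StreamWrapperStandIn())
    except TypeError as exc:
        print(f"broken check failed: {exc}")
    print(is_stream_fixed(StreamWrapperStandIn()))
```

The trade-off is that the class must be importable at runtime, so this approach only works when importing it does not reintroduce the circular import that the TYPE_CHECKING guard was presumably there to avoid.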

Optimize cache performance by avoiding expensive operations when cach…

ff8d53c

…ing is disabled - Moved cache availability checks before expensive operations to improve performance for non-cached requests - Updated client code to handle None responses from caching handler

clean hot path

e119b84

vercel bot had a problem deploying to Preview October 4, 2025 01:52 Failure

ishaan-jaff approved these changes Oct 4, 2025


vercel bot had a problem deploying to Preview October 4, 2025 02:12 Failure

fix: remove unnecessary type checking

c66aa4e

vercel bot had a problem deploying to Preview October 4, 2025 15:19 Failure

ishaan-jaff merged commit 5d22229 into main Oct 4, 2025
38 of 50 checks passed
