I'm trying to understand how/when LLM calls get cached, especially when using the OpenAI API.
I've looked in the docs, but can't find details.
Ideally, in development, I'd like to be able to cache/memoize calls to the API. For example, suppose an LMQL program requests multiple completions, and I change the later part of the program but leave the early part unchanged. In that case, it seems like the early requests to the API could be cached? This is especially true when passing a `seed`, which is now supported by the API.
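To make concrete the kind of memoization I have in mind, here is a minimal sketch. It is not anything LMQL provides; the `cached_completion` wrapper, the cache keying, and the model name are my own assumptions, using the current `openai` Python SDK:

```python
import hashlib
import json
from openai import OpenAI

client = OpenAI()
_cache: dict[str, str] = {}

def cached_completion(**kwargs) -> str:
    # Hypothetical wrapper: key the cache on the full request payload.
    # With a fixed `seed`, identical requests should yield (mostly)
    # identical outputs, so replaying the unchanged early part of a
    # program can skip the API entirely.
    key = hashlib.sha256(
        json.dumps(kwargs, sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        response = client.chat.completions.create(**kwargs)
        _cache[key] = response.choices[0].message.content
    return _cache[key]

# Second call with identical arguments is served from the cache.
text = cached_completion(
    model="gpt-4o-mini",  # assumed model, for illustration only
    messages=[{"role": "user", "content": "Say hello."}],
    seed=42,
)
```

In practice one would presumably want this cache persisted to disk so it survives across development runs, but the in-memory dict shows the idea.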