Feature request
Hi,
The /v1/generate endpoint returns a request_id as part of the JSON response. I assume that when finished is set to false, I can somehow use this request ID to query for the rest of the output later. However, the OpenAPI documentation served at http://127.0.0.1:3000/ does not appear to specify which endpoint to use for that. Or am I mistaken, and is this not possible?
I am using the latest ghcr.io/bentoml/openllm Docker image, started like this:

docker run --rm -it -p 3000:3000 --platform linux/x86_64 ghcr.io/bentoml/openllm start facebook/opt-1.3b --backend pt
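For reference, here is a minimal sketch of the call in question against that server. The exact request body schema is my assumption (the real schema is in the server's OpenAPI docs); the fields I am asking about are request_id and finished in the JSON response:

# Sketch only: the JSON body below is an assumed minimal payload,
# not the documented schema for /v1/generate.
curl -s -X POST http://127.0.0.1:3000/v1/generate \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "Hello"}'

When finished comes back as false, it is unclear which endpoint, if any, accepts the returned request_id to fetch the remaining output.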
Kind regards,
Alexander
Motivation
This would let me find out how the request_id can be used to follow up on incomplete generations, if that is possible at all.
Other
No response