The data store is not populated with the required Pod details when the InferenceModel and InferencePool CRs are added after EPP is started:
Skipping reconciling EndpointSlice because the InferencePool is not available yet: InferencePool hasn't been initialized yet
...
===DEBUG: Current Pods and metrics: []
Add the example InferenceModel and InferencePool CRs:
reconciling InferencePool default/vllm-llama2-7b-pool
reconciling InferenceModel default/inferencemodel-sample
Incoming pool ref {inference.networking.x-k8s.io InferencePool vllm-llama2-7b-pool}, server pool name: vllm-llama2-7b-pool
Adding/Updating inference model: tweet-summary
===DEBUG: Current Pods and metrics: []
Recreate the EndpointSlice for the example service and the data store reflects the required Pod details:
EndpointSlice reconciliation should be triggered whenever an InferencePool CRUD operation occurs, since the EndpointSlice reconciler manages the internal Pod state, which depends on InferencePool details such as targetPortNumber.
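One way to wire this up with controller-runtime would be a `Watches` clause on the EndpointSlice controller that maps InferencePool events to EndpointSlice reconcile requests. The mapping function below is a self-contained sketch of that idea (the `Request` type mirrors controller-runtime's `reconcile.Request`; the function name and the namespace-based mapping are my assumptions, not the project's actual code):

```go
package main

import "fmt"

// Request identifies an object to reconcile; it mirrors the shape of
// controller-runtime's reconcile.Request (NamespacedName).
type Request struct {
	Namespace, Name string
}

// mapPoolToEndpointSlices is a hypothetical map function for
// handler.EnqueueRequestsFromMapFunc: on any InferencePool CRUD event it
// re-enqueues every known EndpointSlice in the pool's namespace, so the
// internal Pod state is rebuilt with the now-available pool details
// (e.g. targetPortNumber).
func mapPoolToEndpointSlices(poolNamespace string, slices []Request) []Request {
	var out []Request
	for _, s := range slices {
		if s.Namespace == poolNamespace {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	slices := []Request{
		{Namespace: "default", Name: "vllm-llama2-7b-abc12"},
		{Namespace: "other", Name: "unrelated-slice"},
	}
	for _, r := range mapPoolToEndpointSlices("default", slices) {
		fmt.Printf("enqueue %s/%s\n", r.Namespace, r.Name)
	}
}
```

In a real controller this mapping would live behind `handler.EnqueueRequestsFromMapFunc` in the controller builder's `Watches(...)` call, so that creating the InferencePool after EPP startup re-drives EndpointSlice reconciliation instead of requiring the EndpointSlice to be recreated by hand.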
I've also been hitting this issue recently; it leads to an HTTP/1.1 429 Too Many Requests response, and I have to restart the external processor to recover. Is there any progress? If not, I would be happy to contribute. :)
BTW, I think we should also check the namespace of the EndpointSlice, instead of only checking whether the service name matches the owner label.
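To illustrate the suggestion: EndpointSlices carry their owning Service's name in the well-known `kubernetes.io/service-name` label, but that label alone is ambiguous across namespaces. A minimal sketch of the stricter check (the function name is mine, not the project's):

```go
package main

import "fmt"

// serviceNameLabel is the well-known EndpointSlice label that names the
// owning Service (kubernetes.io/service-name).
const serviceNameLabel = "kubernetes.io/service-name"

// endpointSliceOwnedByService is a hypothetical predicate: it accepts an
// EndpointSlice only when BOTH the owner label and the namespace match the
// target Service, since the same Service name may exist in many namespaces.
func endpointSliceOwnedByService(sliceNamespace string, sliceLabels map[string]string, svcNamespace, svcName string) bool {
	return sliceLabels[serviceNameLabel] == svcName && sliceNamespace == svcNamespace
}

func main() {
	labels := map[string]string{serviceNameLabel: "vllm-llama2-7b"}
	fmt.Println(endpointSliceOwnedByService("default", labels, "default", "vllm-llama2-7b"))
	fmt.Println(endpointSliceOwnedByService("other", labels, "default", "vllm-llama2-7b"))
}
```

Without the namespace comparison, a slice for a same-named Service in another namespace would be wrongly picked up by the reconciler.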