Description
Motivations
- Observed Impacts
  - When a new router starts under immediate, continuous load, response times are much longer than they should be
  - For example, a request with a nominal 800 ms response time takes 2-8 seconds for a minute or more
  - The response times then gradually drop toward the nominal response time
  - If the load is briefly stopped and then restored, the response time immediately drops to the nominal response time
- Scenario
  - A request arrives but an instance has just disconnected
  - The average number of requests per instance is exactly equal to the targeted number of requests per instance
  - Instead of waiting a few milliseconds for an instance to connect, the request is immediately pushed into an over-busy instance, which hurts the response time more than waiting for a new instance to connect would
- Challenge
  - As requests on the overloaded instances complete, will those instances grab the queued request that is trying to wait for a new instance?
Setup
- MaxConcurrentRequests: 10
- InstanceCountMultiplier: 5
- TargetRequestsPerLambda: 2
- NominalResponseTime: 800 ms
- ColdStartTime: 12,000 ms
Example
- T0: 2 requests arrive every second
- No instances available for 12 seconds (`oha -z 10m -q 2`)
- T12: 24 requests pile up, 1st instance connects, 10 requests are sent to 1st instance, leaving 14 pending
- T13: 2 more requests arrive, 16 pending, 2nd instance connects, 10 requests are sent to 2nd instance, leaving 6 pending
- T14: 2 more requests arrive, 8 pending, 3rd instance connects, 8 requests are sent to 3rd instance, leaving 0 pending
- T15: 2 more requests arrive, 4th instance connects, 2 requests sent to 4th instance, leaving 0 pending
- T16: 2 more requests arrive, 5th instance connects, 2 requests sent to 5th instance, leaving 0 pending
- ...
- T20: 10 requests from 1st instance finish
- This explains slow response times until ~30 seconds
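
The backlog arithmetic in the timeline above can be replayed with a short script. This is only a sketch for checking the numbers, not the router's actual dispatch code: the function name `simulate`, the `Step` record, and the assumption that exactly one instance connects per second starting at T12 are illustrative and taken from the example, not from the implementation.

```python
from dataclasses import dataclass


@dataclass
class Step:
    t: int               # seconds since load started
    pending_before: int  # queued requests when the instance connects
    dispatched: int      # requests sent to the newly connected instance
    pending_after: int   # queued requests left afterwards


def simulate(arrival_rate=2, cold_start_s=12, max_concurrent=10, connects=5):
    """Replay the example: no instances for cold_start_s seconds, then
    one instance connects per second (assumption from the timeline)."""
    steps = []
    # Backlog built up while no instance was available.
    pending = arrival_rate * cold_start_s
    for i in range(connects):
        t = cold_start_s + i
        if i > 0:
            # One more second of arrivals before the next instance connects.
            pending += arrival_rate
        # A new instance takes as much of the backlog as MaxConcurrentRequests allows.
        dispatched = min(pending, max_concurrent)
        steps.append(Step(t, pending, dispatched, pending - dispatched))
        pending -= dispatched
    return steps


if __name__ == "__main__":
    for s in simulate():
        print(f"T{s.t}: {s.pending_before} pending, "
              f"{s.dispatched} dispatched, {s.pending_after} left")
```

With the Setup values, the first two instances each start with 10 concurrent requests and the third with 8, all well above the TargetRequestsPerLambda of 2, which is consistent with the slow responses described above.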