That being said, retry algorithms (at least robust ones) for internet protocols are normally written with an exponential escalation of wait time (such as 1, 2, 4, 8, 16, 32, ... seconds). In that case, a user may want to specify at which point to give up and log a failure, for example via --max-retries and/or --max-wait.
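For illustration only, a minimal sketch of what such a policy could look like; the max_retries/max_wait parameters mirror the proposed --max-retries/--max-wait flags and download_with_backoff is a hypothetical helper, not existing cautious-robot code:

```python
import time
import requests

# Status codes that should trigger a retry (same set discussed below).
RETRY_STATUSES = {429, 500, 502, 503, 504}

def download_with_backoff(url, max_retries=5, max_wait=60):
    """Retry with exponential backoff (1, 2, 4, ... seconds), capped at max_wait."""
    for attempt in range(max_retries + 1):
        response = requests.get(url, timeout=30)
        if response.status_code not in RETRY_STATUSES:
            return response
        if attempt == max_retries:
            break
        # Exponential escalation, capped so a single wait never exceeds max_wait.
        time.sleep(min(2 ** attempt, max_wait))
    # Give up; the caller can log the failure (index, filename, URL).
    return response
```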
Reply: it waits after a failed attempt if the response is any of the following: 429, 500, 502, 503, or 504. It doesn't wait between successful downloads. It retries the designated responses up to a maximum number of times, and otherwise just logs the response in the error log (along with the index, filename, and URL).
Setting a maximum wait time on a request would probably be a good idea as well. urllib3.request seems to handle much of this when also passed a Retry object. @thompsonmj had also suggested HTTPAdapter as an option that also uses Retry.
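A rough sketch of how Retry and HTTPAdapter could be wired together with requests; the specific numbers and the example URL are illustrative, not a recommendation:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry on the same status codes noted above, with exponentially
# increasing waits between attempts (controlled by backoff_factor).
retry = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount("https://", adapter)
session.mount("http://", adapter)

# The timeout argument caps how long a single request may wait
# (connect timeout, read timeout); Retry handles the re-attempts.
response = session.get("https://example.org/image.jpg", timeout=(5, 30))
```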
Seems reasonable to use HTTPAdapter, since it's already using requests. Must also consider streaming interruption, as noted here.
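On the streaming point, one possible way to guard a chunked download against mid-stream interruption (function and path names here are hypothetical):

```python
import os
import requests

def stream_to_file(session, url, dest_path):
    """Stream a download in chunks and surface interruptions to the caller.

    A partial file is removed so a later retry (or a re-run that skips
    existing files) does not mistake it for a complete download.
    """
    try:
        with session.get(url, stream=True, timeout=(5, 30)) as response:
            response.raise_for_status()
            with open(dest_path, "wb") as f:
                for chunk in response.iter_content(chunk_size=8192):
                    f.write(chunk)
    except (requests.exceptions.ChunkedEncodingError,
            requests.exceptions.ConnectionError,
            requests.exceptions.Timeout):
        # The stream was cut off mid-download; discard the partial file.
        if os.path.exists(dest_path):
            os.remove(dest_path)
        raise
```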
Sometimes when downloading files we end up reaching a threshold where our IP address gets blocked for a while by a remote server. In that case you typically have to wait for a few hours. I wouldn't expect or want the command to wait in this scenario. For that scenario, can we re-run the cautious-robot command and have it skip already downloaded images?
Right now I believe it relies on adjusting the start index to avoid re-downloading the image. However, I could add a line here checking for the image:
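For example, a check along these lines could skip images that are already on disk; out_dir and filename are hypothetical stand-ins for however cautious-robot builds its output paths:

```python
from pathlib import Path

def needs_download(out_dir: str, filename: str) -> bool:
    """Return False if the image already exists from a previous run."""
    return not (Path(out_dir) / filename).exists()
```

In the download loop, that would let a re-run of the same command pick up where it left off without adjusting the start index.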
Originally posted by @hlapp in #1 (comment)