Fix ES delay_on_retry behavior + clarify config example messaging #435
Merged
| # | ||
| ## The amount of time to wait between requests after a failure occurs. | ||
| ## Default: 2000 | ||
| #elasticsearch.delay_on_retry: 10000 |
Member
So default was 10000 seconds?
Contributor
Author
I think it was meant to be in ms, but all of our other time-based config settings are in seconds, so aligning it with how the other configs work was worthwhile.
artem-shelkovnikov
approved these changes
May 7, 2026
| try += 1 | ||
| if try < max_tries | ||
| wait_time = @retry_delay**try | ||
| # Exponential backoff: the first retry waits @retry_delay seconds, and each |
May want an upper limit here. What if max retries is 20? 2^20 seconds is about 12 days.
Contributor
Author
max_tries is set by default to 3, and is settable as a config value: elasticsearch.retry_on_failure: 3.
I am good to let this one be because it's adjustable anyways, and the delay on retry behavior is documented as doubling each time
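For readers who do want the upper limit raised in this thread, a minimal sketch of a capped doubling backoff follows. The method name `capped_wait_time` and the `max_delay` parameter are hypothetical and are not part of this PR or of the Crawler's configuration; the PR itself leaves the delay uncapped.

```ruby
# Hypothetical helper, not part of the Crawler codebase.
# retry_delay: base delay in seconds (what elasticsearch.delay_on_retry supplies)
# try_count:   1 for the first retry, 2 for the second, and so on
# max_delay:   assumed upper bound in seconds; no such setting exists in this PR
def capped_wait_time(retry_delay, try_count, max_delay: 300)
  # Doubling backoff as described in the PR, clamped to max_delay.
  uncapped = retry_delay * (2**(try_count - 1))
  [uncapped, max_delay].min
end

# With the default of 3 tries the cap never applies for a 2 second delay.
(1..3).each do |try_count|
  puts "retry #{try_count}: wait #{capped_wait_time(2, try_count)}s"
end
```

With a small `retry_on_failure` value the cap is a no-op, which matches the author's reasoning for leaving it out; it only matters when the retry count is configured much higher.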
Closes #380
This PR fixes the delay_on_retry behavior within the Crawler's Elasticsearch client.
Previously, we were calculating
retry_delay ** try_count
where ** in Ruby is an exponent operation, not multiplication. So, a retry delay of 10s grows tenfold with each retry, which is too aggressive.
The new calculation correctly doubles the retry backoff time (here ^ denotes exponentiation):
retry_delay * (2^(try_count-1))
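The difference between the two formulas can be sketched in Ruby as below. The method names `old_wait_time` and `new_wait_time` are hypothetical and chosen only for illustration; just the two expressions come from the description above. Note that in Ruby `^` is bitwise XOR, so the doubling formula is written with `**`.

```ruby
# Hypothetical helper names; only the formulas are taken from the PR description.

# Old behavior: the delay is raised to the power of the attempt number.
def old_wait_time(retry_delay, try_count)
  retry_delay**try_count
end

# New behavior: the first retry waits retry_delay seconds and each
# subsequent retry waits twice as long as the previous one.
def new_wait_time(retry_delay, try_count)
  retry_delay * (2**(try_count - 1))
end

# Compare both schedules for a 10 second delay over three tries.
(1..3).each do |try_count|
  old_s = old_wait_time(10, try_count)
  new_s = new_wait_time(10, try_count)
  puts "try #{try_count}: old=#{old_s}s new=#{new_s}s"
end
# try 1: old=10s new=10s
# try 2: old=100s new=20s
# try 3: old=1000s new=40s
```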
The description of this config setting has been updated in elasticsearch.yaml.example as well.
Checklists
Pre-Review Checklist
… (crawler.yml.example and elasticsearch.yml.example)
… (v0.1.0)
… make notice if any dependencies have been added
Related Pull Requests
Release Note
Fix the elasticsearch.delay_on_retry config setting to correctly double the retry backoff with every failure instead of raising the specified retry backoff time to the power of the attempt number.