fix(nodeadm): increase IMDS retries #2112

Merged 1 commit on Jan 8, 2025

cartermckinnon
Member

Description of changes:

This increases the number of retry attempts for IMDS requests from 15 to 60. We use exponential backoff with a max delay of 1 second, so this results in about 1 minute of retries. We've seen rare failures where the network did not come up until after the previous 15 retries had been exhausted.
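
The "about 1 minute" figure can be checked with a rough calculation. The sketch below is not nodeadm's code: it assumes a hypothetical 50 ms initial delay that doubles on each retry, no jitter, and the 1 second cap mentioned above. The function name `total_backoff_ms` is made up for illustration.

```shell
#!/usr/bin/env bash
# Rough estimate of the total time spent waiting between retry attempts.
# Assumptions (not taken from nodeadm): 50 ms initial delay, doubling
# each time, no jitter. Only the 1000 ms cap comes from the PR text.

# total_backoff_ms ATTEMPTS BASE_MS CAP_MS
total_backoff_ms() {
  local attempts=$1 delay=$2 cap=$3 total=0 i
  # One wait precedes each retry, so N attempts involve N-1 waits.
  for ((i = 1; i < attempts; i++)); do
    if ((delay > cap)); then
      delay=$cap
    fi
    total=$((total + delay))
    delay=$((delay * 2))
  done
  echo "$total"
}

echo "15 attempts: $(total_backoff_ms 15 50 1000) ms"  # 15 attempts: 10550 ms
echo "60 attempts: $(total_backoff_ms 60 50 1000) ms"  # 60 attempts: 55550 ms
```

Under these assumptions the cap is reached after a handful of retries, so the total is dominated by the number of attempts: roughly 10 seconds for 15 attempts and roughly 55 seconds for 60. Jitter and per-request timeouts would shift the real figure.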

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Member

@ndbaker1 ndbaker1 left a comment

thanks!

@ndbaker1
Member

ndbaker1 commented Jan 8, 2025

any value in exercising a longer downtime in the e2e test?

# configure without launching the imds mock service
IMDS_MOCK_ONLY_CONFIGURE=true mock::aws
if nodeadm init --skip run; then
  echo "bootstrap should not succeed when EC2 IMDS APIs are not reachable."
  exit 1
fi
# start the imds mock part way into the initialization to mimic
# delayed availability of IMDS
{ sleep 10 && AWS_MOCK_ONLY_CONFIGURE=true mock::aws; } &
nodeadm init --skip run
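
For readers without the test harness, the delayed-availability pattern in the snippet above can be sketched in a self-contained form. Everything here is a hypothetical stand-in: a marker file plays the role of the IMDS mock, and `retry_until_ready` plays the role of `nodeadm init`. The delays are shortened so the sketch finishes in a couple of seconds.

```shell
#!/usr/bin/env bash
# Self-contained illustration of testing a retry budget against a
# dependency that comes up late. All names are hypothetical stand-ins.

# retry_until_ready MARKER ATTEMPTS
# Polls for MARKER up to ATTEMPTS times, waiting 0.2 s after each miss.
retry_until_ready() {
  local marker=$1 attempts=$2 i
  for ((i = 1; i <= attempts; i++)); do
    [[ -e "$marker" ]] && return 0
    sleep 0.2
  done
  return 1
}

workdir="$(mktemp -d)"
marker="$workdir/ready"

# With the dependency absent and a small budget, the call should fail.
if retry_until_ready "$marker" 2; then
  echo "should not succeed while the dependency is down"
  exit 1
fi

# Bring the dependency up part way into the second run.
{ sleep 1 && touch "$marker"; } &

# A budget of 20 attempts (about 4 s) outlasts the 1 s downtime.
if retry_until_ready "$marker" 20; then
  result=ok
else
  result=failed
fi

wait
rm -rf "$workdir"
echo "$result"  # ok
```

Lengthening the background `sleep` toward the retry budget is the kind of "longer downtime" case the question raises: the test only passes while the downtime stays below the total retry window.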

@cartermckinnon cartermckinnon merged commit 3bbee38 into main Jan 8, 2025
11 checks passed
@cartermckinnon cartermckinnon deleted the increase-imds-retries branch January 8, 2025 19:22
@cartermckinnon cartermckinnon changed the title from "fix: increase IMDS retries" to "fix(nodeadm): increase IMDS retries" Jan 8, 2025
2 participants