Commit b57e27d

stackhpc-ci authored and Alex-Welsh committed
ci: Add better/longer retries to AIO workflow TF
1 parent 9afda8e commit b57e27d

1 file changed, +10 -2 lines changed

.github/workflows/stackhpc-all-in-one.yml

Lines changed: 10 additions & 2 deletions
@@ -190,15 +190,23 @@ jobs:
       - name: Terraform Apply
         id: tf_apply
         run: |
-          for attempt in $(seq 5); do
+          # Try up to 6 times to create the infrastructure, destroying and retrying if it fails.
+          # If it fails 3 times, wait 2 hours before trying again.
+          # The cloud is likely just at capacity, so wait until other jobs finish.
+          for attempt in $(seq 6); do
             if terraform apply -auto-approve; then
               echo "Created infrastructure on attempt $attempt"
               exit 0
             fi
             echo "Failed to create infrastructure on attempt $attempt"
             sleep 10
             terraform destroy -auto-approve
-            sleep 60
+            if [ "$attempt" -eq 3 ]; then
+              echo "Sleeping for 2 hours after 3 failed attempts..."
+              sleep 7200
+            else
+              sleep $(shuf -i 60-180 -n 1)
+            fi
           done
           echo "Failed to create infrastructure after $attempt attempts"
           exit 1
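
The retry schedule in this change can be tried without touching any infrastructure. The sketch below is not part of the commit: `pause_for`, `retry_apply`, and the `SLEEP_CMD` override are hypothetical names introduced here for illustration. It keeps the same schedule (a 2-hour pause after the third failure, otherwise a random 60-180 second pause) but, unlike the workflow, it omits the fixed `sleep 10` before cleanup and skips the pause after the final attempt. It assumes GNU `shuf` is available.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the retry schedule; function names are not from the workflow.

# pause_for ATTEMPT: print the number of seconds to wait after a failed attempt.
# Attempt 3 gets the 2-hour pause; any other attempt gets 60-180 s of random jitter.
pause_for() {
  local attempt=$1
  if [ "$attempt" -eq 3 ]; then
    echo 7200
  else
    shuf -i 60-180 -n 1
  fi
}

# retry_apply MAX APPLY_CMD CLEANUP_CMD: run APPLY_CMD up to MAX times,
# running CLEANUP_CMD and then pausing after each failure.
# SLEEP_CMD defaults to sleep; set SLEEP_CMD=: for a dry run with no waiting.
retry_apply() {
  local max=$1 apply=$2 cleanup=$3 attempt
  for attempt in $(seq "$max"); do
    if "$apply"; then
      echo "Succeeded on attempt $attempt"
      return 0
    fi
    echo "Failed on attempt $attempt"
    "$cleanup"
    # Assumption of this sketch: no pause after the last attempt,
    # since nothing is left to retry.
    if [ "$attempt" -lt "$max" ]; then
      "${SLEEP_CMD:-sleep}" "$(pause_for "$attempt")"
    fi
  done
  echo "Failed after $max attempts"
  return 1
}

# Example wiring (commented out so the sketch has no side effects):
# tf_apply()   { terraform apply -auto-approve; }
# tf_destroy() { terraform destroy -auto-approve; }
# retry_apply 6 tf_apply tf_destroy
```

Passing the apply and cleanup steps as function names keeps the schedule separate from Terraform, so the loop can be exercised with stub functions before it is wired into a workflow step.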
