@@ -3,6 +3,29 @@ aws-parallelcluster-node CHANGELOG
 
 This file is used to list changes made in each version of the aws-parallelcluster-node package.
 
+2.6.1
+-----
+
+**ENHANCEMENTS**
+- Improved the management of SQS messages and retries to speed up recovery times when failures occur.
+
+**CHANGES**
+- Do not launch a replacement for an unhealthy or unresponsive node until that node is terminated. This makes the cluster slower
+  at provisioning new nodes when failures occur but prevents any temporary over-scaling with respect to the expected
+  capacity.
+- Increase the parallelism used to start `slurmd` on compute nodes joining the cluster from 10 to 30.
+- Reduce the verbosity of messages logged by the node daemons.
+- Do not dump logs to `/home/logs` when nodewatcher encounters a failure and terminates the node. CloudWatch can be
+  used to debug such failures.
+- Reduce the number of retries for failed REMOVE events in sqswatcher.
+
+**BUG FIXES**
+- Fixed a bug in the ordering and retrying of SQS messages that, under certain heavy-load conditions, was causing
+  the scheduler configuration to be left in an inconsistent state.
+- Delete from the queue the REMOVE events that are discarded due to a hostname collision with another event fetched as part
+  of the same `sqswatcher` iteration.
+
+
 2.6.0
 -----
 
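
Reviewer note: the `slurmd` parallelism change above (10 to 30) amounts to raising a cap on how many node start-ups run concurrently. The sketch below only illustrates that idea with a bounded thread pool. It is not the package's implementation: `start_slurmd`, `start_all`, and `MAX_PARALLEL_STARTS` are hypothetical names, and the start function is a placeholder that performs no remote action.

```python
from concurrent.futures import ThreadPoolExecutor

# Value taken from the changelog entry (previously 10). The constant name is hypothetical.
MAX_PARALLEL_STARTS = 30


def start_slurmd(hostname):
    """Placeholder for starting slurmd on one node.

    A real implementation would contact the node; this stub only
    reports success so the sketch stays self-contained.
    """
    return hostname, True


def start_all(hostnames, start_fn=start_slurmd, max_workers=MAX_PARALLEL_STARTS):
    """Run start_fn for every hostname, with at most max_workers calls in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input hostnames.
        return list(pool.map(start_fn, hostnames))


if __name__ == "__main__":
    nodes = ["node-{0}".format(i) for i in range(100)]
    results = start_all(nodes)
    failed = [host for host, ok in results if not ok]
    print("started:", len(results) - len(failed), "failed:", len(failed))
```

A higher cap shortens the time for a large batch of nodes to join, at the cost of more simultaneous connections from the head node, which is presumably why the value is a tunable limit rather than unbounded.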