Open
Description
Last week I gave an EESSI tutorial, and running the examples on a vanilla AWS instance was lightning fast from a cold start. In contrast, my runs inside a fresh Magic Castle cluster I brought up today were very slow: the initial run of TensorFlow took 10 minutes (and 36s when repeating the run).
The main difference I can think of is the response time of the different Stratum 1 servers. Is there any way we can run a speed check on our Stratum 1s to make sure they are operating as fast as we expect them to?
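As a starting point, here is a rough sketch of what such a check could look like: it times an HTTP fetch of the repository manifest (`.cvmfspublished`) from each Stratum 1 and reports the best time per host. The hostnames and repository name below are hypothetical placeholders, and the helper names (`manifest_url`, `time_fetch`, `probe`) are made up for illustration. It also assumes the servers are reachable directly over plain HTTP, without a proxy in between.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# Hypothetical placeholders: substitute the real Stratum 1 hostnames
# and the real CernVM-FS repository name.
STRATUM1_HOSTS = ["stratum1-a.example.org", "stratum1-b.example.org"]
REPOSITORY = "repo.example.org"


def manifest_url(host: str, repository: str) -> str:
    """Build the URL of the repository manifest served by a Stratum 1."""
    return f"http://{host}/cvmfs/{repository}/.cvmfspublished"


def http_fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download the URL and return the response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def time_fetch(url: str, fetch: Callable[[str], bytes] = http_fetch) -> Optional[float]:
    """Return the seconds taken to fetch the URL, or None on a network error."""
    start = time.perf_counter()
    try:
        fetch(url)
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return None
    return time.perf_counter() - start


def probe(
    hosts: Iterable[str],
    repository: str,
    fetch: Callable[[str], bytes] = http_fetch,
    repeats: int = 3,
) -> Dict[str, Optional[float]]:
    """Fetch the manifest `repeats` times per host and keep the best time.

    A host maps to None if every attempt failed.
    """
    results: Dict[str, Optional[float]] = {}
    for host in hosts:
        url = manifest_url(host, repository)
        timings = [time_fetch(url, fetch) for _ in range(repeats)]
        successful = [t for t in timings if t is not None]
        results[host] = min(successful) if successful else None
    return results
```

Calling `probe(STRATUM1_HOSTS, REPOSITORY)` from both the AWS instance and a Magic Castle node would give a per-host comparison. Note that the manifest is a small file, so this mostly measures latency; a cold-start TensorFlow run pulls many objects, so a fuller check would probably also need to time larger downloads.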