Skip to content

Speed-test/health-check for Stratum-1's #151

Open
@ocaisa

Description

@ocaisa

Last week I gave an EESSI tutorial and running the examples on a vanilla instance on AWS was lightning fast from a cold start. In contrast, my runs inside a fresh Magic Castle cluster I brought up today were very slow, it took 10 minutes for the initial run of Tensorflow (and 36s when repeating the run).

The main difference I can think of is the response time from the different S1's. Is there any way we can do a speedcheck for our Stratum 1's to make sure they are operating as fast as we expect them to?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions