Error running basic use with trainer.py #342

@bgheneti

Hi, thanks for this repo and your amazing work!

I am trying to get the Basic Use example for validation working, after installing the repo on an instance, and am running into some issues.

Running the following command seems to produce an error:

python trainer.py fit --model ClayMAEModule --data ClayDataModule --config configs/config.yaml --trainer.fast_dev_run=True

error: Parser key "model":
Unable to load config 'ClayMAEModule'
  - Parser key "model": Unable to load config "ClayMAEModule"

So instead I run the command without the --model and --data flags:

python trainer.py fit --config configs/config.yaml --trainer.fast_dev_run=True

but get the following error:

[rank0]:     n_train, n_test = _validate_shuffle_split(
[rank0]:                       ^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/banti_understory_ai/miniforge3/envs/claymodel/lib/python3.11/site-packages/sklearn/model_selection/_split.py", line 2481, in _validate_shuffle_split
[rank0]:     raise ValueError(
[rank0]: ValueError: With n_samples=0, test_size=0.19999999999999996 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.
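
The n_samples=0 in the traceback suggests that the train/validation split received an empty list of samples, which would be consistent with no data files being found. A minimal sketch for checking this before launching training is below. The helper name count_files and the "*.npz" file pattern are assumptions made for illustration, not something taken from the repo or its config.

```python
from pathlib import Path


def count_files(data_dir: str, pattern: str = "*.npz") -> int:
    """Count regular files under data_dir (searched recursively) matching pattern.

    Returns 0 when data_dir does not exist or is not a directory.
    The default pattern is an assumption about how the data is stored.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return 0
    return sum(1 for path in root.rglob(pattern) if path.is_file())
```

Calling count_files("/fsx") on the training machine and getting 0 would point to a missing or empty data directory rather than a problem in the trainer itself.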

As a note, I also changed devices to 1 and num_nodes to 1 in config.yaml to accommodate my machine.

Looking at config.yaml, it seems like I need data in the correct location (/fsx). Is this correct? If so, which data do I need to obtain (just one platform or all of them), and how should I structure it there? Any help getting acclimated is appreciated 🙂
