looper.write_sample_yaml pre-submit function assumes the sample has sample_name as an attribute #541

Open
nleroy917 opened this issue Jan 28, 2025 · 7 comments

Comments

@nleroy917
Member

I was trying to submit jobs by leveraging the write_sample_yaml pre-submit function. This was my pipeline_interface.yaml:

pipeline_name: cranger-atac
pipeline_type: sample
pre_submit:
  python_functions:
    - looper.write_sample_yaml
command_template: >
  python {looper.piface_dir}/run.py {looper.output_dir}/submission/{sample.id}_sample.yaml

However, before any submission scripts are written, I get the following stack trace:

Traceback (most recent call last):
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/bin/looper", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/cli_pydantic.py", line 343, in main
    return run_looper(args, parser, test_args=test_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/cli_pydantic.py", line 265, in run_looper
    return run(
           ^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/looper.py", line 475, in __call__
    curr_pl_fails = cndtr.add_sample(sample, rerun=rerun)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/conductor.py", line 372, in add_sample
    self.submit()
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/conductor.py", line 409, in submit
    script = self.write_script(self._pool, self._curr_size)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/conductor.py", line 633, in write_script
    namespaces = _exec_pre_submit(pl_iface, namespaces)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/conductor.py", line 734, in _exec_pre_submit
    _update_namespaces(namespaces, func(namespaces))
                                   ^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/plugins.py", line 156, in write_sample_yaml
    sample["sample_yaml_path"] = _get_yaml_path(
                                 ^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/looper/conductor.py", line 74, in _get_yaml_path
    or f"{namespaces['sample'][SAMPLE_NAME_ATTR]}"
          ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/sfs/gpfs/tardis/home/xjk5hp/code/scripts/model-training/atacformer-scatlas-base/.venv/lib/python3.11/site-packages/peppy/simple_attr_map.py", line 26, in __getitem__
    return self._mapped_attr[item]
           ~~~~~~~~~~~~~~~~~^^^^^^
KeyError: 'sample_name'

You can follow the breadcrumbs and look into this _get_yaml_path function to see that it uses a hard-coded sample name attribute key that comes from peppy.const. I believe this is a bug, correct? I've renamed my sample_table_index column name to id.

I can work around the issue by reverting my sample_table_index to sample_name.
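For illustration, here is a minimal sketch of the failure mode described above. It uses a plain dict as a stand-in for the peppy sample object, and `yaml_name_hardcoded` is a hypothetical helper, not looper's actual `_get_yaml_path`. The only thing it is meant to show is that a hard-coded `sample_name` key fails once the index column is renamed to `id`.

```python
# Assumption: stand-in for the hard-coded constant imported from peppy.const.
SAMPLE_NAME_ATTR = "sample_name"

# Assumption: a plain dict standing in for a sample whose
# sample_table_index column was renamed from "sample_name" to "id".
sample = {"id": "sample1", "file": "sample1.bed"}


def yaml_name_hardcoded(sample):
    """Hypothetical helper that always reads the 'sample_name' key."""
    return f"{sample[SAMPLE_NAME_ATTR]}_sample.yaml"


try:
    yaml_name_hardcoded(sample)
    missing_key = None
except KeyError as exc:
    # The lookup fails because the sample has no 'sample_name' attribute.
    missing_key = exc.args[0]

print(missing_key)
```

With the index column left as `sample_name`, the same lookup would succeed, which matches the workaround above.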

@nsheff
Contributor

nsheff commented Jan 28, 2025

yep that's a bug

@donaldcampbelljr
Contributor

Needs to reference the sample_table_index:
something along the lines of:
sample[self.prj.sample_table_index]

@nleroy917
Member Author

I apologize for going for style points and renaming my sample table indexes

@donaldcampbelljr
Contributor

or f"{namespaces['sample']['_project'].sample_table_index}"

Getting test failures when making the change, however, so it requires a bit more digging.

donaldcampbelljr added a commit that referenced this issue Jan 28, 2025
@donaldcampbelljr
Contributor

Whoops, I forgot to use that as an index:
namespaces['sample'][namespaces['sample']['_project'].sample_table_index]

So the above solution works, and all tests pass with the above commit on the dev branch.
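A rough sketch of the difference between the two attempts, assuming a plain dict for the sample namespace and a `SimpleNamespace` standing in for the project object (both are stand-ins for illustration, not the real looper or peppy types). The first attempt yields the name of the index column; the second uses that name as a key to get the sample's actual identifier.

```python
from types import SimpleNamespace

# Assumption: stand-in for the project object, exposing sample_table_index.
project = SimpleNamespace(sample_table_index="id")

# Assumption: stand-in for the namespaces mapping passed to pre-submit functions.
namespaces = {"sample": {"id": "sample1", "_project": project}}


def index_column_name(namespaces):
    """First attempt: returns the index column's name, not the sample's value."""
    return f"{namespaces['sample']['_project'].sample_table_index}"


def sample_identifier(namespaces):
    """Second attempt: uses the index column's name as the lookup key."""
    sample = namespaces["sample"]
    return sample[sample["_project"].sample_table_index]


column = index_column_name(namespaces)
identifier = sample_identifier(namespaces)
print(column, identifier)
```

Because the key is read from the project rather than hard-coded, the same lookup also covers projects that keep the default `sample_name` index.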

@nleroy917
Member Author

Checking now... in the meantime, does line 61 need to change?


@nleroy917
Member Author

Was able to run my pipeline using dev and it works!
