@ainefairbrother
Contributor

@ainefairbrother ainefairbrother commented Oct 3, 2025

This PR addresses ticket ENSVAR_6893.

  • The main goal is to add consistent, detailed logging so that it is clear why particular URN IDs failed during a run. Previously, IDs failed without clear log messages, and the cause was hard to determine without re-running those IDs.
  • Secondarily, it makes the metadata extraction (extract_metadata.py) more robust to subtly differing file formats, which can occur because the MaveDB data release has not yet stabilised.
  • It also fixes the mapping file matching pattern in import_from_files.sh, which was too liberal.
  • Updates nextflow.config to produce more detailed tracing, to help compare memory allocated against memory actually used.
  • Updates the memory calculation in variant_recoder.nf to reduce and cap the memory requested, based on traces from previous runs.
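
For reviewers, here is a rough sketch of the per-URN logging idea from the first bullet. It is illustrative only and not the code in this PR: the logger name `mavedb_import`, the function `process_urns`, and the `process_one` callback are all hypothetical names chosen for the example. The point is that each failing URN is logged together with a reason, and the reasons are collected so they can be reported at the end of a run.

```python
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("mavedb_import")


def process_urns(urns, process_one):
    """Run process_one on each URN and record why any of them failed.

    Returns a dict mapping each failed URN to a short reason string.
    """
    failures = {}
    for urn in urns:
        try:
            process_one(urn)
        except Exception as exc:
            # Broad catch on purpose: one bad URN should not stop the run,
            # and every failure should leave a reason in the log.
            reason = f"{type(exc).__name__}: {exc}"
            failures[urn] = reason
            logger.error("URN %s failed: %s", urn, reason)
        else:
            logger.info("URN %s processed", urn)
    return failures
```

With something along these lines, the set of failed IDs and their reasons is available from the returned dict as well as from the log, so a failed ID should not need to be re-run just to find out what went wrong.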

@ainefairbrother ainefairbrother marked this pull request as ready for review October 27, 2025 10:03