Summary
The README advertises TSV → PROTAX-GPU conversion via scripts/convert.py, but the script cannot be run from the command line. It crashes on import due to hardcoded paths, and its CLI entry point performs no conversion.
Affected files
README.md L29 ("Compatible with TSV and PROTAX input format")
README.md L38 (convert.py — Converts .TSV to PROTAX-GPU format)
scripts/convert.py
Details
Crashes on import. Lines 398–404 run at module top level with hardcoded developer paths:
test = Path("/home/roy/Downloads/taxonomy.tsv")
test_ref = Path("/home/roy/Downloads/sequences.tsv")
read_jax_model(test_ref) # crashes here → np.load("8M_tax.npz") at L302
The crash occurs inside read_jax_model at L302, which tries to load "8M_tax.npz", a hardcoded intermediate file that doesn't exist outside the original developer's environment. The test_ref argument is passed in, but the function fails before it is ever used.
CLI entry point does nothing. The argparse block (L409–424) only prints "converting taxonomy..." / "converting model..."; it never calls convert_tsv(), assign_tax(), or convert_sequences().
Unfinished pieces.
trim_subtaxa() — # TODO doesn't work yet (L106–107)
convert_sequences() — # TODO remove hardcoded values (L260)
read_jax_model() — # TODO: remove this leftover test code (L343)
assign_tax() depends on a hardcoded refs.npz intermediate file (L218)
Net effect: the advertised TSV pipeline is both undocumented and non-functional — there is no working path to convert your own TSV data.
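The import-time failure described above can be reproduced in isolation. The following is a minimal sketch, not the actual contents of scripts/convert.py: the module source, the read_model name, and the missing file name are all hypothetical stand-ins. It shows that a call placed at module top level runs on import, before any CLI block is reached.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical stand-in for scripts/convert.py: a function that opens a
# hardcoded file, plus a call to that function at module top level.
MODULE_SOURCE = '''
def read_model(path):
    # The argument is ignored; the hardcoded file is opened first.
    with open("definitely_missing_8M_tax.npz", "rb") as f:
        return f.read()

read_model("sequences.tsv")  # top-level call: executes on import

if __name__ == "__main__":
    print("CLI block reached")
'''


def import_from_source(source: str):
    """Write the source to a temporary file and import it as a module."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "toy_convert.py"
        path.write_text(source)
        spec = importlib.util.spec_from_file_location("toy_convert", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs every top-level statement
        return module


def import_error_name(source: str) -> str:
    """Return the name of the exception raised on import, if any."""
    try:
        import_from_source(source)
    except Exception as exc:
        return type(exc).__name__
    return "no error"


print(import_error_name(MODULE_SOURCE))
```

Because the failing call sits outside any function or guard, the argparse block is never reached, regardless of which flags are passed.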
Steps to reproduce
- git clone the repo and install per the README.
- Run python scripts/convert.py --taxonomy <some.tsv> (or run it with no args).
Expected behavior
The script converts the given TSV taxonomy/sequence files into the .npz format the package consumes.
Actual behavior
FileNotFoundError: [Errno 2] No such file or directory: '8M_tax.npz'
(raised at import time from the top-level read_jax_model(test_ref) call at L404, which tries to load the hardcoded "8M_tax.npz" intermediate file at L302). Even if that file existed, the --taxonomy/--model flags only print a message and perform no conversion.
Environment
- Repo commit: 392614a
- Python: 3.12
Proposed fix
- Short term: remove or gate the top-level execution code so the script does not crash on import, and note in the README that convert.py is a work in progress.
- Long term: wire the argparse block to the conversion functions, remove the hardcoded paths, and document the expected TSV format.
- Either way: clarify whether convert.py is the intended way to produce the taxonomy .npz, or whether it is deprecated.
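One possible shape for the fix is sketched below, under stated assumptions. The convert_tsv and convert_sequences functions here are stubs, since their real signatures are not shown in this issue. The --out-dir flag and the output file names are hypothetical. Only --taxonomy and --model come from the existing script.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


# Hypothetical stubs. The real functions live in scripts/convert.py and
# may take different arguments.
def convert_tsv(tsv_path: Path, out_path: Path) -> str:
    return f"taxonomy: {tsv_path} -> {out_path}"


def convert_sequences(tsv_path: Path, out_path: Path) -> str:
    return f"model: {tsv_path} -> {out_path}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Convert TSV inputs to .npz")
    parser.add_argument("--taxonomy", type=Path, help="taxonomy TSV file")
    parser.add_argument("--model", type=Path, help="reference sequence TSV file")
    # --out-dir is an assumption made for this sketch.
    parser.add_argument("--out-dir", type=Path, default=Path("."),
                        help="directory for the generated .npz files")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> list[str]:
    parser = build_parser()
    args = parser.parse_args(argv)
    done: list[str] = []
    if args.taxonomy is None and args.model is None:
        parser.print_help()  # nothing requested: show usage instead of crashing
        return done
    if args.taxonomy is not None:
        done.append(convert_tsv(args.taxonomy, args.out_dir / "taxonomy.npz"))
    if args.model is not None:
        done.append(convert_sequences(args.model, args.out_dir / "model.npz"))
    return done


# Nothing runs at import time; all work happens behind this guard.
if __name__ == "__main__":
    for line in main():
        print(line)
```

With this layout the module can be imported safely, and each flag dispatches to a conversion function instead of only printing a message.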