SyntaxWarning for invalid escape sequences in Python 3.12 #1457

Open
1 task done
Simon-Brandt opened this issue Mar 20, 2025 · 1 comment
@Simon-Brandt

Description of bug

In Python >= 3.6, invalid escape sequences for Unicode strings emit a DeprecationWarning, changed to a SyntaxWarning in Python 3.12, to finally become a SyntaxError in a future Python version. SPAdes uses several of these invalid escape sequences across the Python scripts, most notably (or only?) in regular expressions. I obtained two of these warnings for a metaSPAdes run, in:

source_file, line_no = re.match(
'\<doctest (.*\.rst)\[(.*)\]\>',
source_file).groups()

and in:

cookie_re = re.compile("coding[:=]\s*([-\w.]+)")

To my knowledge, < and > have never carried a special meaning in Python's regex flavor and thus never needed escaping, whilst \s and \w are genuine regex character classes, so their backslashes must reach the re module intact. As the SyntaxWarning will eventually become a SyntaxError, SPAdes will break in a future Python version. Since the same problem has already been reported in #1320 and in #1326, but was only selectively fixed in 60ad35e, it may be preferable to go through the code base, find all strings with invalid escape sequences, and fix them, either by doubling the backslashes for Python's parser or, preferably, by marking them as raw strings. If you want, I could try fixing this myself via a pull request.
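The two fixes mentioned above, and one way to hunt for the remaining occurrences, can be sketched as follows. This is only an illustration under stated assumptions: `find_invalid_escapes` is a hypothetical helper and not part of SPAdes or joblib, and it relies on the current behaviour where `compile()` emits a warning for invalid escapes (on a future Python that turns this into a SyntaxError, `compile()` would raise instead).

```python
import re
import warnings


def find_invalid_escapes(source, filename="<string>"):
    """Compile `source` and return (line, message) for each invalid-escape warning.

    Hypothetical helper for illustration; not part of SPAdes or joblib.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of printing or deduplicating it.
        warnings.simplefilter("always")
        compile(source, filename, "exec")
    return [
        (w.lineno, str(w.message))
        for w in caught
        # DeprecationWarning before Python 3.12, SyntaxWarning from 3.12 on.
        if issubclass(w.category, (DeprecationWarning, SyntaxWarning))
        and "invalid escape sequence" in str(w.message)
    ]


# Both snippets are written as raw strings here so that this example
# does not itself trigger the warning it is looking for.
bad_source = r'cookie_re = "coding[:=]\s*([-\w.]+)"' + "\n"
good_source = r'cookie_re = r"coding[:=]\s*([-\w.]+)"' + "\n"

bad_hits = find_invalid_escapes(bad_source)    # at least one hit on line 1
good_hits = find_invalid_escapes(good_source)  # no hits

# A raw string and a string with doubled backslashes are the same pattern.
raw_pattern = r"coding[:=]\s*([-\w.]+)"
doubled_pattern = "coding[:=]\\s*([-\\w.]+)"

# '<' and '>' need no escaping, so the doctest pattern can drop those backslashes.
doctest_re = re.compile(r"<doctest (.*\.rst)\[(.*)\]>")
doctest_groups = doctest_re.match("<doctest example.rst[2]>").groups()
```

To scan a real file, one could pass `find_invalid_escapes(path.read_text(), str(path))` for each `.py` file in the tree; the raw-string form is usually the easier one to review, since the pattern stays identical to what the `re` module sees.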

spades.log

Since my dataset contains sensitive information (including the names of file paths), I cannot upload the spades.log and am only able to provide the following snippets. Since, however, the error should be clear, I hope more data isn't needed.

Command line: /opt/spades/bin/spades.py --meta --threads=32 --memory=200 -1 /path/to/r1.fastq.gz -2 /path/to/r2.fastq.gz -o /path/to/metaspades_out_sample_1
                                                                                
System information:                                                             
  SPAdes version: 4.1.0                                                         
  Python version: 3.12.3                                                        
  OS: Linux-5.4.0-208-generic-x86_64-with-glibc2.39 
== Running: /usr/bin/python3 /opt/spades/share/spades/spades_pipeline/scripts/compress_all.py --input_file /path/to/metaspades_out_sample_1/corrected/corrected.yaml --ext_python_modules_home /opt/spades/share/spades --max_threads 32 --output_dir /path/to/metaspades_out_sample_1/corrected --gzip_output
                                                                                
/opt/spades/share/spades/joblib3/func_inspect.py:51: SyntaxWarning: invalid escape sequence '\<'
  '\<doctest (.*\.rst)\[(.*)\]\>',                                              
/opt/spades/share/spades/joblib3/_memory_helpers.py:10: SyntaxWarning: invalid escape sequence '\s'
  cookie_re = re.compile("coding[:=]\s*([-\w.]+)")   

params.txt

For the params.txt, I replaced the sensitive paths with /path/to.

SPAdes version

SPAdes 4.1.0

Operating System

Linux-5.4.0-208-generic-x86_64-with-glibc2.39

Python Version

Python 3.12.3

Method of SPAdes installation

Manual compilation, virtualized as Docker container, run as Singularity image

No errors reported in spades.log

  • Yes
@andrewprzh
Collaborator

Hi @Simon-Brandt

Thanks a lot for the report!

This comes from a copy of joblib that we imported once, a long time ago. Surprisingly, it is still not fixed in their repository at the moment; otherwise, we could've just updated it.

A pull request is most welcome and appreciated! A quick search in the main folders with Python code (ext/src/python_libs, src/projects/spades/pipeline) didn't turn up any suspicious places, but a second look would be great.

Also, joblib is only used for running gzip in parallel, which, I believe, can be done using inbuilt Python methods, so maybe we don't even need joblib.
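As a rough idea of what running gzip in parallel with only the standard library might look like, here is a minimal sketch. It is an assumption-laden illustration, not the actual logic of compress_all.py: the names `gzip_file` and `gzip_files` and the `max_threads` parameter are hypothetical, and threads are used on the understanding that CPython's zlib bindings release the GIL during compression.

```python
import gzip
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def gzip_file(path):
    """Compress `path` to `<path>.gz` and return the new path.

    Hypothetical helper; the original file is left in place.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        # Stream in chunks so large files are never held in memory at once.
        shutil.copyfileobj(fin, fout)
    return dst


def gzip_files(paths, max_threads=4):
    """Compress several files concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(gzip_file, paths))


# Small demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    inputs = []
    for i in range(3):
        p = Path(tmp) / f"reads_{i}.fastq"
        p.write_bytes(b"@read\nACGT\n+\nIIII\n" * 100)
        inputs.append(p)

    outputs = gzip_files(inputs, max_threads=2)
    output_names = [p.name for p in outputs]

    # Decompress again to confirm the round trip preserves the data.
    restored = []
    for out in outputs:
        with gzip.open(out, "rb") as fh:
            restored.append(fh.read())
    round_trip_ok = restored == [p.read_bytes() for p in inputs]
```

If threads turned out not to scale well enough, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and could be swapped in.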

Best
Andrey
