-
Notifications
You must be signed in to change notification settings - Fork 942
doc: properly handle preformatted blocks #8242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doc: properly handle preformatted blocks #8242
Conversation
\
\preformatted blocks\
\\
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor fix, generally looks good!
tools/fromschema.py
Outdated
@@ -21,7 +21,7 @@ def output_title(title, underline='-', num_leading_newlines=1, num_trailing_newl | |||
|
|||
def esc_underscores(s): | |||
"""Backslash-escape underscores outside of backtick-enclosed spans""" | |||
return ''.join(['\\_' if x == '_' else x for x in re.findall(r'[^`_\\]+|`(?:[^`\\]|\\.)*`|\\.|_', s)]) | |||
return ''.join(['\\_' if x == '_' else x for x in re.findall(r'(?ms:^[ \t]*```.*?^[ \t]*```)|[^`_\\\n]++|`(?:[^`\\]|\\.)*`|\\.|[_\n]', s)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"++" here is wrong:
re.error: multiple repeat at position 40
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Are you using a very old version of Python? The ++
(possessive one-or-more) quantifier works on Python 3.11.12, 3.12.10, and 3.13.3, but it gives the error you quoted on Python 3.10.17. Do you really need to maintain compatibility with ancient versions of Python? Possessive quantifiers avoid needlessly backtracking when we know that backtracking will not find any new matches, although, now that I am looking at this again, I don't think it's going to make any difference in this case since there are no assertions after that repeat, so no backtracking would ever be attempted even if the quantifier were non-possessive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ubuntu 22.04 is on 3.9 :( We currently require >=3.8.1 but now Ubuntu 20.04 is EOL, we could bump to 3.9.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's mind-boggling! 😵 So it is this project's policy to support being built on all non-EOL versions of Ubuntu? Do you expect much overlap between folks who don't keep their OSes up to date and folks who run experimental cryptocurrency software? 🤭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely: we support people running stable OSes. We had people who were bitten by sqlite3 being too old for the last release (I accidentally used a feature which was introduced recently).
This is the difference between hobby software and software products.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<rant>
In my experience, constraining yourself to maintaining compatibility with older releases of your dependencies incurs a technical debt that usually never gets repaid, and your software performs worse and/or is harder to maintain as a result. No one in the future is going to realize, “Ooo, I can finally put that possessive quantifier back in the build script's regex now that the very oldest Ubuntu has died!” Instead, it'll be a matter of, “It still seems to be working, so I'm not going to touch it,” and it'll just be needlessly slower forever. Sad. (I do admit that there's no performance difference in this particular case, but I'm speaking of the policy in general.) It would be preferable to bump the build requirements in the documentation, and if that means people running a three-year-old OS can't build the modern version of CLN, then maybe that will motivate them to stay current. Software moves quickly, and coddling the laggards only enables them to remain laggards indefinitely.</rant>
Anyway, piece spoken. I have dropped the possessive quantifier. The error now disappears on Python 3.8 and 3.9. (I didn't retest with 3.10, but I'd assume it's gone there too.)
While testing this, I also ran into a shell quoting bug in Makefile
, a fix for which I'm including in a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a judgement call. If things are terribly painful, we force everyone to upgrade.
And actually, I do sprinkle FIXMEs in the code for this reason: at some point, particularly when there's a measurable impact, I come across it. I also do drive-by fixes as I'm reading through code for exactly this reason, because these little tidyups add up.
Thankyou.
3e419f0
to
4b88685
Compare
4b88685
to
9e339d3
Compare
Lowdown requires a blank line before all preformatted blocks, or it doesn't recognize them. `tools/md2man.sh` contained some ad-hoc efforts at fixing up some locations where these required blank lines are absent from the output of `tools/fromschema.py`, but it missed some. Instead of playing Whack-a-Mole, use a blanket sed expression to ensure that a blank line precedes _every_ opening ```. `esc_underscores(…)` in `tools/fromschema.py` did not work correctly on strings containing an odd number of backticks, notably the ``` delimiters surrounding preformatted text blocks. Specifically, it was dropping the last backtick since none of the alternatives in the regex matched it. Add a new alternative that matches a whole preformatted block as a single unit. `output_member(…)` in `tools/fromschema.py` was passing each line of a member's description through `esc_underscores(…)` individually, but that breaks preformatted text blocks that are naturally multi-line and leads to mistakenly escaping underscores inside such blocks. Rewrite the code to make use of the `outputs(…)` utility function that joins all the provided lines together before passing the whole text through `esc_underscores(…)`. Drive-by fix a couple of flubbed preformatted blocks in schemas. [ Added shellcheck suppression for md2man.sh --RR ] Changelog-None
When the command string run through $(call VERBOSE,…) contains a single-quote character followed by shell metacharacters, the shell executing the VERBOSE template's $(ECHO) command will attempt to interpret those metacharacters. This could be disastrous, such as if someone were to put in a Makefile recipe: $(call VERBOSE, , $(ECHO) 'Will not `rm -rf ~`') When run with V=1, which causes VERBOSE to be defined as… VERBOSE = $(ECHO) '$(2)'; $(2) …Make would evaluate the above call into: echo 'echo 'Will not `rm -rf ~`''; echo 'Will not `rm -rf ~`' And oops, there goes the neighborhood. The real-world motivating case for this fix is the sed call in the recipe for doc/index.rst in doc/Makefile. It contains a sed expression enclosed in single quotes, containing parentheses. When run through VERBOSE with V=1, the single quotes around the sed expression actually escape _out_ of the single-quoted string that is intended to be the whole command line, and you get: /bin/sh: -c: line 1: syntax error near unexpected token `(' The fix is for VERBOSE to escape any single quotes embedded in the command line argument when echoing it: VERBOSE = $(ECHO) '$(subst ','\'',$(2))'; $(2) Note that this is still wrong, as it will not do the right thing if $(2) happens to begin with a hyphen, but I didn't want to introduce a new "PRINTF" variable (or do something unsavory like calling cat with a here-doc) to squash a bug that currently has no known manifestations. Changelog-None
9e339d3
to
db05c23
Compare
Lowdown requires a blank line before all preformatted blocks, or it doesn't recognize them.
tools/md2man.sh
contained some ad-hoc efforts at fixing up some locations where these required blank lines are absent from the output oftools/fromschema.py
, but it missed some. Instead of playing Whack-a-Mole, use a blanket sed expression to ensure that a blank line precedes every opening ```.esc_underscores(…)
intools/fromschema.py
did not work correctly on strings containing an odd number of backticks, notably the ``` delimiters surrounding preformatted text blocks. Specifically, it was dropping the last backtick since none of the alternatives in the regex matched it. Add a new alternative that matches a whole preformatted block as a single unit.output_member(…)
intools/fromschema.py
was passing each line of a member's description throughesc_underscores(…)
individually, but that breaks preformatted text blocks that are naturally multi-line and leads to mistakenly escaping underscores inside such blocks. Rewrite the code to make use of theoutputs(…)
utility function that joins all the provided lines together before passing the whole text throughesc_underscores(…)
.Drive-by fix a couple of flubbed preformatted blocks in schemas.
Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked: