src/save: Avoid writing CopyFile whenever possible#229
src/save: Avoid writing CopyFile whenever possible#229mmirate wants to merge 9 commits intoCyberShadow:masterfrom
Conversation
If the file isn't plaintext, CopyFile remains mandatory. If there are changes, write an inline unified-patch into a heredoc. If it's a new file, write its contents into a heredoc. If the new file has one or fewer lines, write an "echo" command (possibly with the `-e` flag in case of a file containing plaintext but no newlines). This is essentially CyberShadow#46 but built into `aconfmgr save`. Testing ------- I've been using this branch on my collection of Arch computers, to good effect at dredging up cruft, propagating good ideas, separating out things that are supposed to be different, all while keeping my intent clear and simple; and I believe I've adapted the existing test cases to pass. But I did not yet try shoehorning multiline files into any of the existing test cases with which to exercise the full breadth of emitted heredocs; and I suspect that multiline text files who do not end with a newline, will not be accurately captured. Errata ------ * At one point during development I was using the `-`, as in `<<-'EOF'`, to prepend tabs to the files or patches, to allow better indenting in the typical case where these fragments go inside if-statements etc; but I found it could not then accurately capture any file which was itself indented with tabs, so I dropped that feature, but it might be possible to add it back by checking for leading tab characters. * With great difficulty I can fathom that inlining file contents into the config is undesirable, especially as the line count of the inline content increases. The number of lines to tolerate looks like a bikeshed color, so at the moment there's no limit and no way for the user to add one. I know that adding a new command-line switch is suboptimal because it clutters the program's API surface. What about a new helper that manipulates a global variable? Also Included ------------- * One `read` into a global variable. I believe it would have had no adverse effect upon the status quo had it been allowed to persist. The new feature in this PR, by contrast, is broken unless either this variable is localized or the overwritten variable is renamed. * The "Trim redundant blank parameters" loop would try to index into an empty `args` array, so I added a second loop-condition to prevent it from doing that. Maybe the fact that it tried to do that at all is evidence of something going wrong? I'm not sure. * There are a few branches which have to write their contents to the `$config_save_target` themselves, leaving no `$func` etc to be written. Handling this contingency is what necessitates the nonemptiness check on `$func`.
CyberShadow
left a comment
There was a problem hiding this comment.
Thanks for the contribution!
I think it would be a valuable improvement if we can make this work. However, there's definitely a number of somewhat hairy aspects.
I know that adding a new command-line switch for this is suboptimal because it clutters the program's API surface. What about a new helper that manipulates a global variable?
Agreed; I would also imagine that some control on a path-by-path basis would be useful. Perhaps something in the shape of SetSaveMethod '*' auto?
src/save.bash
Outdated
| fi | ||
| elif | ||
| AconfNeedProgram file file n && | ||
| file "$system_file" | grep -Fw text | grep -qvFw -- 'with escape sequences' && |
There was a problem hiding this comment.
LC_ALL=C is needed if we're going to grep for English strings; however, file is a rather heavy dependency for detecting a binary file. Perhaps grep -qI . will work to do the same?
There was a problem hiding this comment.
grep -qI . almost works; here's the kind of file that makes it misbehave:
sh-5.3$ if grep -qI . /etc/motd; then file /etc/motd; fi
/etc/motd: ASCII text, with escape sequences
There was a problem hiding this comment.
Ah, maybe something like this then:
tr -d '\t\n\r\040-\176\200-\376' < file.txt | head -c 1 | grep -q .This should cover more cases than either approach. with escape sequences only checks for one additional control character.
I would extract this to a new top-level function, also so that it can be overridden from configurations if needed.
| while [[ -z "${args[-1]}" ]] | ||
| while [[ "${#args[*]}" -gt 0 && -z "${args[-1]}" ]] |
There was a problem hiding this comment.
Makes sense; alternatively, it could also be moved inside the if [[ -n $func ]] block.
I like it! It will take me a bit more time but it seems worthwhile to me. Bikeshed colors: the status quo would be called "copy" and my new functionality would be called "auto"? |
Works for me! |
Making `grep` accept \xFF style escape sequences requires using its PCRE engine, which for all I know is less efficient. `awk`, by contrast, can process these entirely within POSIX-compliant form. Errata: maybe I should have `grep -qI` take a second pass over the file in case it contains broken UTF-8.
4cd8c55 to
ed00cc4
Compare
Happy to pick this up from here, BTW. |
|
It seems that this feature is somehow behaving differently under |
If the file isn't plaintext, CopyFile remains mandatory.
If there are changes, write an inline unified-patch into a heredoc. If it's a new file, write its contents into a heredoc. If the new file has one or fewer lines, write an "echo" command (possibly with the
-eflag in case of a file containing plaintext but no newlines).This is essentially #46 but built into
aconfmgr save.Testing
This branch spent years on my backburner due to some frustrating bugs. Recently I fixed those, so that I have been using this branch on my collection of Arch computers. Doing so I've seen good effect at dredging up cruft, propagating good ideas, separating out things that are supposed to be different; all while keeping my intent clear and simple. I believe I've adapted the existing test cases to pass; but I did not yet try shoehorning multiline files into any of the existing test cases with which to exercise the full breadth of emitted heredocs; and I suspect that multiline text files who do not end with a newline, will still not be accurately captured.
Errata
At one point during development I was using the
-, as in<<-'EOF', to prepend tabs to the files or patches, to allow better indenting in the typical case where these fragments go inside if-statements etc; but I found it could not then accurately capture any file which was itself indented with tabs, so I dropped that feature, but it might be possible to add it back by checking for leading tab characters.With great difficulty I can fathom that inlining file contents into the config may be undesirable, especially as the line count of the inline content increases. The number of lines to tolerate looks like a bikeshed color. At the moment I've set no limit and no way for the user to add one. I know that adding a new command-line switch for this is suboptimal because it clutters the program's API surface. What about a new helper that manipulates a global variable?
Also Included
One
readinto a global variable. I believe it would have had no adverse effect upon the status quo had it been allowed to persist. The new feature in this PR, by contrast, is broken unless either this variable is localized or the overwritten variable is renamed.The "Trim redundant blank parameters" loop would try to index into an empty
argsarray, so I added a second loop-condition to prevent it from doing that. Maybe the fact that it tried to do that at all is evidence of something going wrong? I'm not sure.There are a few branches which have to write their contents to the
$config_save_targetthemselves, leaving no$funcetc to be written. Handling this contingency is what necessitates the nonemptiness check on$func.