-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support diff merges option in range diff #1734
base: main
Are you sure you want to change the base?
Support diff merges option in range diff #1734
Conversation
Hi @dscho :) I just came across the need for this today, and remembered you had a WIP PR about it. |
In any case, I'll add it to my build and give it a try. |
@phil-blain Well, missing tests, missing documentation, and missing commit message ;-)
No, althought I recently wished for it but didn't find the time to build it in a worktree so that I could use it. |
ac9ecc8
to
0ae0adc
Compare
@phil-blain well, now it no longer misses those things... |
8baf3fa
to
11361e0
Compare
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Johannes Schindelin via GitGitGadget" <[email protected]>
writes:
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.
Nice addition.
As the "diff of the diff" outer diff part would be two-"blob" diff,
we won't have to worry about this option causing confusions to
users, unlike things like "-U8", to which layer of diffs the option
would apply.
Will queue. Thanks. |
This patch series was integrated into seen via git@992e323. |
On the Git mailing list, Johannes Sixt wrote (reply to this): Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> +--diff-merges=<format>::
> + Instead of ignoring merge commits, generate diffs for them using the
> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> + and include them in the comparison.
> ++
> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> +the context of the `range-diff` command than other formats, so choose wisely!
Can we please be a bit more specific which options are usable or which
are not usable? Perhaps you even mean to say that 'first-parent' is the
only one that makes sense? At a minimum, we should mention that it is
the one that makes most sense (if that is the case).
-- Hannes
|
User |
On the Git mailing list, Johannes Schindelin wrote (reply to this): Hi Hannes
On Fri, 8 Nov 2024, Johannes Sixt wrote:
> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> > +--diff-merges=<format>::
> > + Instead of ignoring merge commits, generate diffs for them using the
> > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > + and include them in the comparison.
> > ++
> > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> > +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable?
There are a couple of reasons:
- Most importantly: I had not even studied that question in depth when I
wrote the original patch, and not even when I submitted the first
iteration.
In my reviews of branch thickets, I used `--diff-merges=1` and it served
my needs well. Therefore I can speak to this mode, but not really to the
other modes.
- Which options are usable or not may lack a universally-correct answer.
One person might find `--diff-merges=separate` confusing while the next
person may find it quite useful.
- If the documentation points out an unsuitable mode, then an obvious
follow-up wish would be to reject this mode in the command, with an
error message. Some users might, however, need precisely this mode, so
I am loathe to do that.
Maybe this compromise: Change the "Note:" to recommend the `first-parent`
mode, with a bit more explanation why that is so (it is consistent with
the idea that a merge is kind of a "meta patch", comprising all the merged
commits' patches, which is equivalent to the first-parent diff).
> Perhaps you even mean to say that 'first-parent' is the only one that
> makes sense?
I would not go _so_ far as to say that.
Before submitting the patch yesterday, I played a little with a few
modes, using the commit graph as per the new t3206 test case.
What that test case range-diffs looks like this:
- * - changeA - changeB ------------
\ \ \
\ conflict \
\ / \
rename - changeA - changeB ----- clean
In other words, the main branch modifies a file's contents twice, the
topic branch renames it first, then makes the same modifications.
The clean merge simply merges the topic, which makes the main branch
tree-same to the topic branch.
The conflicting merge misses the tip commit of the topic branch, i.e.
changeB on the renamed file. The result is tree-same to the clean merge
because changeB on the non-renamed file is already part of the main
branch.
Out of habit, I used `--diff-merge=first-parent` in the new test case,
whose output reiterates what I just said: both merges introduce the same
first-parent diff (renaming the file, no other changes):
1: 1e6226b < -: ------- s/12/B/
2: 043e738 = 1: 6ddec15 merge
What about `separate`?
2: 043e738 = 1: 6ddec15 merge
1: 1e6226b ! 2: 6ddec15 s/12/B/
@@
## Metadata ##
-Author: Thomas Rast <[email protected]>
+Author: A U Thor <[email protected]>
## Commit message ##
- s/12/B/
+ merge
## renamed-file ##
@@ renamed-file: A
This might look confusing at first because there are two merge diffs on
the right-hand side, but only one on the left-hand side. But that is all
good because the diff of the clean merge between merge head and merge is
empty and therefore not shown. And the `range-diff` demonstrates that the
conflicting merge had to recapitulate the tip commit of the `renamed-file`
topic branch.
The `combined` and `dense-combined` modes produce identical output to
`first-parent` in this instance, which is expected.
Now, let's use `remerge`, which should be illuminating with regards to
"evil merges" [*1*], as it shows what was done on top of the automatic
merge:
1: 1e6226b < -: ------- s/12/B/
2: 043e738 < -: ------- merge
-: ------- > 1: 6ddec15 merge
Huh. That is a bit surprising at first. Let's use a higher creation factor
to ask `range-diff` to match correspondences more loosely (I used 999,
which kinda forces even non-sensical pairings):
1: 1e6226b ! 1: 6ddec15 s/12/B/
@@
## Metadata ##
-Author: Thomas Rast <[email protected]>
+Author: A U Thor <[email protected]>
## Commit message ##
- s/12/B/
+ merge
## renamed-file ##
+ remerge CONFLICT (content): Merge conflict in renamed-file
+ index 3f1c0da..25ddf7a 100644
+ --- renamed-file
+ +++ renamed-file
@@ renamed-file: A
9
10
B
+-<<<<<<< 2901f77 (s/12/B/):file
+ B
+-=======
-12
-+B
+->>>>>>> 3ce7af6 (s/11/B/):renamed-file
13
14
15
2: 043e738 < -: ------- merge
(Note that if you repeat this experiment, you will see something different
due to a bug in `--remerge-diff` that I will fix separately.)
Hold on, that looks wrong! It is actually not wrong, `range-diff` just
finds the changeB a better pairing than the merge commit with an empty
diff...
So: This mode can be useful to detect where merge conflicts had to be
resolved.
In short, I think that in `range-diff`'s context all of these modes have
merit, depending on a particular use case. It's just that `first-parent`
is probably the most natural to use in the common case.
> At a minimum, we should mention that it is the one that makes most sense
> (if that is the case).
Yeah, I'll go with that and drop saying anything about other modes.
Ciao,
Johannes
Footnote *1*: We should really find a much better name for this than "evil
merge". There is nothing evil about me having to add `#include
"environment.h"` in v2.45.1^, for example. It was necessary so that the
code would build. Tedious, yes, but not evil. |
11361e0
to
7ddc69e
Compare
The `git log` command already offers support for including diffs for merges, via the `--diff-merges=<format>` option. Let's add corresponding support for `git range-diff`, too. This makes it more convenient to spot differences between iterations of non-linear contributions, where so-called "evil merges" are sometimes necessary and need to be reviewed, too. In my code reviews, I found the `--diff-merges=first-parent` option particularly useful. Signed-off-by: Johannes Schindelin <[email protected]>
7ddc69e
to
d01a395
Compare
On the Git mailing list, Junio C Hamano wrote (reply to this): Johannes Sixt <[email protected]> writes:
> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
>> +--diff-merges=<format>::
>> + Instead of ignoring merge commits, generate diffs for them using the
>> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
>> + and include them in the comparison.
>> ++
>> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
>> +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable? Perhaps you even mean to say that 'first-parent' is the
> only one that makes sense? At a minimum, we should mention that it is
> the one that makes most sense (if that is the case).
Good suggestion.
I am curious to see how well things like "--cc" and "--remerge-diff"
fare in this use case. I suspect that in a case where the original
round forgot to make necessary semantic conflict resolutions (aka
"evil merge") that the new round has, all of them (including the
"--first-parent") would essentially show the same change, but what
is shown in such a case would be more readable with "--first-parent"
by treating a merge as if it were just a single parent commit making
a lot of changes, iow, merely one among other commits with equal
footing. |
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Elijah Newren wrote (reply to this): On Thu, Nov 7, 2024 at 9:20 AM Johannes Schindelin via GitGitGadget
<[email protected]> wrote:
>
> From: Johannes Schindelin <[email protected]>
>
> The `git log` command already offers support for including diffs for
> merges, via the `--diff-merges=<format>` option.
>
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.
>
> In my code reviews, I found the `--diff-merges=first-parent` option
> particularly useful.
Curious. Wouldn't --diff-merges=remerge-diff be more useful if you
are particularly interested in so-called "evil merges" and whether
they remain "evil" (i.e. empty remerge-diff) or gain additional bits
of "evilness" (i.e. more changes shown in the remerge-diff)?
first-parent would seem more like a workaround in such a case. Let me
explain; first, let me refer to the result that you'd get after
merging with no human changes (i.e. a non-evil merge) as a
hypothetical "auto-merge" commit. Now, --diff-merges=first-parent
could generally be broken down as the combination of diff from first
parent to auto-merge + diff from auto-merge to evil-merge (even if the
auto-merge wasn't actually recorded anywhere and is just a theoretical
construct). Now, you aren't looking at a first-parent diff directly,
you are diffing two first-parent diffs. In particular, you are
comparing:
pre-rebase first-parent diff = diff from first parent of merge to
the auto-merge + diff from auto-merge to evil-merge
to
post-rebase first-parent diff = diff from first parent of merge to
the auto-merge + diff from auto-merge to evil-merge
Assuming you didn't drop or insert or modify any commits as part of
the rebase, then the two "diff from first parent of merge to the
auto-merge" should match. Since they match, taking the difference of
these two causes that part to cancel out, meaning you are left just
looking at the differences in the "evilness" of the actual merge. But
if you did make other changes while rebasing, maybe dropping or
tweaking a commit, then suddenly you aren't just looking at
differences in the "evilness" of the actual merge anymore; it's mixed
with those other changes making it more challenging to review and easy
to miss the parts you are looking for. If you want to look for
differences in whether the merge commit in question has changes other
than those that a simple "git merge" would make, remerge-diff seems
like a better choice.
> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
> Support diff merges option in range diff
>
> The git range-diff command does the same with merge commits as git
> rebase: It ignores them.
>
> However, when comparing branch thickets it can be quite illuminating to
> watch out for inadvertent changes in merge commits, in particular when
> some "evil" merges have been replayed, i.e. merges that needed to
> introduce changes outside of the merge conflicts (e.g. when one branch
> changed a function's signature and another branch introduced a caller of
> said function), in case the replayed merge is no longer "evil" and
> therefore potentially incorrect.
>
> Let's introduce support for the --diff-merges option that is passed
> through to those git log commands.
>
> I had a need for this earlier this year and got it working, leaving the
> GitGitGadget PR in a draft mode. Phil Blain found it and kindly
> nerd-sniped me into readying it for submitting, so say thanks to Phil!
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1734
>
> Documentation/git-range-diff.txt | 10 +++++++++-
> builtin/range-diff.c | 11 +++++++++++
> range-diff.c | 15 +++++++++++----
> range-diff.h | 1 +
> t/t3206-range-diff.sh | 16 ++++++++++++++++
> 5 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fbdbe0befeb..a964e856c3c 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
> [verse]
> 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
> [--no-dual-color] [--creation-factor=<factor>]
> - [--left-only | --right-only]
> + [--left-only | --right-only] [--diff-merges=<format>]
> ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
> [[--] <path>...]
>
> @@ -81,6 +81,14 @@ to revert to color all lines according to the outer diff markers
> Suppress commits that are missing from the second specified range
> (or the "right range" when using the `<rev1>...<rev2>` format).
>
> +--diff-merges=<format>::
> + Instead of ignoring merge commits, generate diffs for them using the
> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> + and include them in the comparison.
> ++
> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> +the context of the `range-diff` command than other formats, so choose wisely!
> +
Indeed. :-)
> --[no-]notes[=<ref>]::
> This flag is passed to the `git log` program
> (see linkgit:git-log[1]) that generates the patches.
> diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> index 1b33ab66a7b..e41719e0f0d 100644
> --- a/builtin/range-diff.c
> +++ b/builtin/range-diff.c
> @@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
> {
> struct diff_options diffopt = { NULL };
> struct strvec other_arg = STRVEC_INIT;
> + struct strvec diff_merges_arg = STRVEC_INIT;
> struct range_diff_options range_diff_opts = {
> .creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
> .diffopt = &diffopt,
> @@ -36,6 +37,9 @@ int cmd_range_diff(int argc,
> OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
> N_("notes"), N_("passed to 'git log'"),
> PARSE_OPT_OPTARG),
> + OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> + N_("style"), N_("passed to 'git log'"),
> + PARSE_OPT_OPTARG),
> OPT_BOOL(0, "left-only", &left_only,
> N_("only emit output related to the first range")),
> OPT_BOOL(0, "right-only", &right_only,
> @@ -62,6 +66,12 @@ int cmd_range_diff(int argc,
> if (!simple_color)
> diffopt.use_color = 1;
>
> + /* If `--diff-merges` was specified, imply `--merges` */
> + if (diff_merges_arg.nr) {
> + range_diff_opts.include_merges = 1;
> + strvec_pushv(&other_arg, diff_merges_arg.v);
> + }
> +
> for (i = 0; i < argc; i++)
> if (!strcmp(argv[i], "--")) {
> dash_dash = i;
> @@ -155,6 +165,7 @@ int cmd_range_diff(int argc,
> res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
>
> strvec_clear(&other_arg);
> + strvec_clear(&diff_merges_arg);
> strbuf_release(&range1);
> strbuf_release(&range2);
>
> diff --git a/range-diff.c b/range-diff.c
> index bbb0952264b..9e59733059b 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -38,7 +38,8 @@ struct patch_util {
> * as struct object_id (will need to be free()d).
> */
> static int read_patches(const char *range, struct string_list *list,
> - const struct strvec *other_arg)
> + const struct strvec *other_arg,
> + unsigned int include_merges)
> {
> struct child_process cp = CHILD_PROCESS_INIT;
> struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
> @@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
> size_t size;
> int ret = -1;
>
> - strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> + strvec_pushl(&cp.args, "log", "--no-color", "-p",
> "--reverse", "--date-order", "--decorate=no",
> "--no-prefix", "--submodule=short",
> /*
> @@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
> "--pretty=medium",
> "--show-notes-by-default",
> NULL);
>
> -- Hannes
>
>
> + if (!include_merges)
> + strvec_push(&cp.args, "--no-merges");
> strvec_push(&cp.args, range);
> if (other_arg)
> strvec_pushv(&cp.args, other_arg->v);
> @@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
> }
>
> if (skip_prefix(line, "commit ", &p)) {
> + char *q;
> if (util) {
> string_list_append(list, buf.buf)->util = util;
> strbuf_reset(&buf);
> }
> CALLOC_ARRAY(util, 1);
> + if (include_merges && (q = strstr(p, " (from ")))
> + *q = '\0';
> if (repo_get_oid(the_repository, p, &util->oid)) {
> error(_("could not parse commit '%s'"), p);
> FREE_AND_NULL(util);
> @@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
>
> struct string_list branch1 = STRING_LIST_INIT_DUP;
> struct string_list branch2 = STRING_LIST_INIT_DUP;
> + unsigned int include_merges = range_diff_opts->include_merges;
>
> if (range_diff_opts->left_only && range_diff_opts->right_only)
> res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
>
> - if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
> + if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
> res = error(_("could not parse log for '%s'"), range1);
> - if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
> + if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
> res = error(_("could not parse log for '%s'"), range2);
>
> if (!res) {
> diff --git a/range-diff.h b/range-diff.h
> index 2f69f6a434d..cd85000b5a0 100644
> --- a/range-diff.h
> +++ b/range-diff.h
> @@ -16,6 +16,7 @@ struct range_diff_options {
> int creation_factor;
> unsigned dual_color:1;
> unsigned left_only:1, right_only:1;
> + unsigned include_merges:1;
> const struct diff_options *diffopt; /* may be NULL */
> const struct strvec *other_arg; /* may be NULL */
> };
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 86010931ab6..c18a3fdab83 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
> test_cmp expect actual
> '
>
> +test_expect_success '--diff-merges' '
> + renamed_oid=$(git rev-parse --short renamed-file) &&
> + tree=$(git merge-tree unmodified renamed-file) &&
> + clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
> + clean_oid=$(git rev-parse --short $clean) &&
> + conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
> + conflict_oid=$(git rev-parse --short $conflict) &&
> +
> + git range-diff --diff-merges=1 $clean...$conflict >actual &&
> + cat >expect <<-EOF &&
> + 1: $renamed_oid < -: ------- s/12/B/
> + 2: $clean_oid = 1: $conflict_oid merge
> + EOF
> + test_cmp expect actual
> +'
> +
> test_done
>
> base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
> --
> gitgitgadget
Didn't spot any problems with the patch itself. |
User |
On the Git mailing list, Elijah Newren wrote (reply to this): On Thu, Nov 7, 2024 at 10:54 PM Johannes Sixt <[email protected]> wrote:
>
> Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget:
> > +--diff-merges=<format>::
> > + Instead of ignoring merge commits, generate diffs for them using the
> > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> > + and include them in the comparison.
> > ++
> > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> > +the context of the `range-diff` command than other formats, so choose wisely!
>
> Can we please be a bit more specific which options are usable or which
> are not usable? Perhaps you even mean to say that 'first-parent' is the
> only one that makes sense? At a minimum, we should mention that it is
> the one that makes most sense (if that is the case).
Here's my guesses I mostly typed up last night before getting tired
and going to bed:
separate: makes no sense; showing two diffs for each merge and then
trying to diff the two sets of two diffs seems crazy, especially since
the whole point of rebasing is usually to change the first parent
somewhere in the ancestry; that'll cascade up and cause the diffs to
the second parents to differ wildly and make this format uselessly
noisy in general.
combined: much better than separate, but probably still quite noisy and
confusing, especially since combined-diff and range-diff are both
complicated in that they have two diffs each, and combining the two
means you have a diff of a dual diff. I suspect that might be hard for users to
process.
dense-combined: much better than combined, you at least have the diff
down to the smallest relevant pieces. You may still deal with the
diff of a dual diff problem.
first-parent: useful, but only when keeping the series of patches largely
intact and transplanting them; any insertion, deletion, or
modification of intermediate patches will result in those
modifications being shown
not just for the patch in question but rolled up into the merge commit
and shown there as well, making the diff for the merge very noisy
(possibly to the point of uselessness). When
keeping the series of patches intact and transplanting them, though,
likely a useful mode.
remerge-diff: generally useful, even if you insert, delete, or modify
intermediate patches while rebasing. Since it also shows how conflict
resolutions are done, you'd see how conflicts are resolved
differently. Actually, that might be one drawback to this method;
since conflict markers try to provide labels involving a short commit
ID and the commit message, and those short commit IDs will differ,
this mode would just end up showing how the two merges had different
parents, which you already know. We could probably make this option
much more useful by surpressing the conflict labels, or at least the
short commit ID parts of those labels. If we do that, and take
Johannes' nice fix he sent to the list separately, then I suspect this
mode is generally the most useful one to use with range-diff. |
On the Git mailing list, Elijah Newren wrote (reply to this): On Fri, Nov 8, 2024 at 5:43 AM Johannes Schindelin via GitGitGadget
<[email protected]> wrote:
>
> From: Johannes Schindelin <[email protected]>
>
> The `git log` command already offers support for including diffs for
> merges, via the `--diff-merges=<format>` option.
>
> Let's add corresponding support for `git range-diff`, too. This makes it
> more convenient to spot differences between iterations of non-linear
> contributions, where so-called "evil merges" are sometimes necessary and
> need to be reviewed, too.
>
> In my code reviews, I found the `--diff-merges=first-parent` option
> particularly useful.
>
> Signed-off-by: Johannes Schindelin <[email protected]>
> ---
> Support diff merges option in range diff
>
> The git range-diff command does the same with merge commits as git
> rebase: It ignores them.
>
> However, when comparing branch thickets it can be quite illuminating to
> watch out for inadvertent changes in merge commits, in particular when
> some "evil" merges have been replayed, i.e. merges that needed to
> introduce changes outside of the merge conflicts (e.g. when one branch
> changed a function's signature and another branch introduced a caller of
> said function), in case the replayed merge is no longer "evil" and
> therefore potentially incorrect.
>
> Let's introduce support for the --diff-merges option that is passed
> through to those git log commands.
>
> I had a need for this earlier this year and got it working, leaving the
> GitGitGadget PR in a draft mode. Phil Blain found it and kindly
> nerd-sniped me into readying it for submitting, so say thanks to Phil!
>
> Changes since v1:
>
> * Changed the documentation to recommend first-parent mode instead of
> vaguely talking about various modes' merits.
> * Disallowed the no-arg --diff-merges option (because --diff-merges
> requires an argument).
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1734%2Fdscho%2Fsupport-diff-merges-option-in-range-diff-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1734/dscho/support-diff-merges-option-in-range-diff-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1734
>
> Range-diff vs v1:
>
> 1: 11361e07af8 ! 1: d01a395900b range-diff: optionally include merge commits' diffs in the analysis
> @@ Documentation/git-range-diff.txt: to revert to color all lines according to the
> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> + and include them in the comparison.
> ++
> -+Note: Some of the formats supported by linkgit:git-log[1] make less sense in
> -+the context of the `range-diff` command than other formats, so choose wisely!
> ++Note: In the common case, the `first-parent` mode will be the most natural one
> ++to use, as it is consistent with the idea that a merge is kind of a "meta
> ++patch", comprising all the merged commits' patches into a single one.
> +
> --[no-]notes[=<ref>]::
> This flag is passed to the `git log` program
> @@ builtin/range-diff.c: int cmd_range_diff(int argc,
> N_("notes"), N_("passed to 'git log'"),
> PARSE_OPT_OPTARG),
> + OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> -+ N_("style"), N_("passed to 'git log'"),
> -+ PARSE_OPT_OPTARG),
> ++ N_("style"), N_("passed to 'git log'"), 0),
> OPT_BOOL(0, "left-only", &left_only,
> N_("only emit output related to the first range")),
> OPT_BOOL(0, "right-only", &right_only,
>
>
> Documentation/git-range-diff.txt | 11 ++++++++++-
> builtin/range-diff.c | 10 ++++++++++
> range-diff.c | 15 +++++++++++----
> range-diff.h | 1 +
> t/t3206-range-diff.sh | 16 ++++++++++++++++
> 5 files changed, 48 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index fbdbe0befeb..17a85957877 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
> [verse]
> 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>]
> [--no-dual-color] [--creation-factor=<factor>]
> - [--left-only | --right-only]
> + [--left-only | --right-only] [--diff-merges=<format>]
> ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> )
> [[--] <path>...]
>
> @@ -81,6 +81,15 @@ to revert to color all lines according to the outer diff markers
> Suppress commits that are missing from the second specified range
> (or the "right range" when using the `<rev1>...<rev2>` format).
>
> +--diff-merges=<format>::
> + Instead of ignoring merge commits, generate diffs for them using the
> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1],
> + and include them in the comparison.
> ++
> +Note: In the common case, the `first-parent` mode will be the most natural one
> +to use,
I think we need more wording around "common case"; I believe this
"common case" is when you are diffing against a merely transplanted
series of commits (a series created by rebasing without inserting,
removing, or minimally modifying those commits in the process) that
`first-parent` makes sense. And it only makes sense in that case.
I think `remerge-diff` generally makes sense here, both in the cases
when `first-parent` makes sense and when `first-parent` does not. It
could be improved by suppressing the inclusion of short commit ids
(and maybe also commit messages) in the labels of conflict markers. I
suspect that issue might make `remerge-diff` less useful than
`first-parent` in simple common cases currently, but I think it's the
right thing to build upon for what you are trying to view.
If `remerge-diff` didn't exist, I think I'd always use
`dense-combined` over `first-parent` because of this
merely-transplanted limitation. I suspect dense-combined would
probably be kind of ugly and hard to parse when there is an actual
evil merge, but it'd at least limit what it shows to the evil merge
content.
> as it is consistent with the idea that a merge is kind of a "meta
> +patch", comprising all the merged commits' patches into a single one.
Doesn't this wording of yours naturally lead to the idea that
`first-parent` is a bad choice if patches leading to the merge have
been inserted, removed, or more than minimally modified? It points
out that first-parent is a diff over the whole range, and so
range-diff shows you a diff of the diffs over the whole range. If you
want to look at the "evilness" of merge commits, i.e. the
user-generated portion of the diffs included in by merge commits, you
need either remerge-diff or dense-combined.
> +
> --[no-]notes[=<ref>]::
> This flag is passed to the `git log` program
> (see linkgit:git-log[1]) that generates the patches.
> diff --git a/builtin/range-diff.c b/builtin/range-diff.c
> index 1b33ab66a7b..901de5d133d 100644
> --- a/builtin/range-diff.c
> +++ b/builtin/range-diff.c
> @@ -21,6 +21,7 @@ int cmd_range_diff(int argc,
> {
> struct diff_options diffopt = { NULL };
> struct strvec other_arg = STRVEC_INIT;
> + struct strvec diff_merges_arg = STRVEC_INIT;
> struct range_diff_options range_diff_opts = {
> .creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT,
> .diffopt = &diffopt,
> @@ -36,6 +37,8 @@ int cmd_range_diff(int argc,
> OPT_PASSTHRU_ARGV(0, "notes", &other_arg,
> N_("notes"), N_("passed to 'git log'"),
> PARSE_OPT_OPTARG),
> + OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg,
> + N_("style"), N_("passed to 'git log'"), 0),
> OPT_BOOL(0, "left-only", &left_only,
> N_("only emit output related to the first range")),
> OPT_BOOL(0, "right-only", &right_only,
> @@ -62,6 +65,12 @@ int cmd_range_diff(int argc,
> if (!simple_color)
> diffopt.use_color = 1;
>
> + /* If `--diff-merges` was specified, imply `--merges` */
> + if (diff_merges_arg.nr) {
> + range_diff_opts.include_merges = 1;
> + strvec_pushv(&other_arg, diff_merges_arg.v);
> + }
> +
> for (i = 0; i < argc; i++)
> if (!strcmp(argv[i], "--")) {
> dash_dash = i;
> @@ -155,6 +164,7 @@ int cmd_range_diff(int argc,
> res = show_range_diff(range1.buf, range2.buf, &range_diff_opts);
>
> strvec_clear(&other_arg);
> + strvec_clear(&diff_merges_arg);
> strbuf_release(&range1);
> strbuf_release(&range2);
>
> diff --git a/range-diff.c b/range-diff.c
> index bbb0952264b..9e59733059b 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -38,7 +38,8 @@ struct patch_util {
> * as struct object_id (will need to be free()d).
> */
> static int read_patches(const char *range, struct string_list *list,
> - const struct strvec *other_arg)
> + const struct strvec *other_arg,
> + unsigned int include_merges)
> {
> struct child_process cp = CHILD_PROCESS_INIT;
> struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT;
> @@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list,
> size_t size;
> int ret = -1;
>
> - strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> + strvec_pushl(&cp.args, "log", "--no-color", "-p",
> "--reverse", "--date-order", "--decorate=no",
> "--no-prefix", "--submodule=short",
> /*
> @@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list,
> "--pretty=medium",
> "--show-notes-by-default",
> NULL);
> + if (!include_merges)
> + strvec_push(&cp.args, "--no-merges");
> strvec_push(&cp.args, range);
> if (other_arg)
> strvec_pushv(&cp.args, other_arg->v);
> @@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list,
> }
>
> if (skip_prefix(line, "commit ", &p)) {
> + char *q;
> if (util) {
> string_list_append(list, buf.buf)->util = util;
> strbuf_reset(&buf);
> }
> CALLOC_ARRAY(util, 1);
> + if (include_merges && (q = strstr(p, " (from ")))
> + *q = '\0';
> if (repo_get_oid(the_repository, p, &util->oid)) {
> error(_("could not parse commit '%s'"), p);
> FREE_AND_NULL(util);
> @@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2,
>
> struct string_list branch1 = STRING_LIST_INIT_DUP;
> struct string_list branch2 = STRING_LIST_INIT_DUP;
> + unsigned int include_merges = range_diff_opts->include_merges;
>
> if (range_diff_opts->left_only && range_diff_opts->right_only)
> res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only");
>
> - if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg))
> + if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges))
> res = error(_("could not parse log for '%s'"), range1);
> - if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg))
> + if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges))
> res = error(_("could not parse log for '%s'"), range2);
>
> if (!res) {
> diff --git a/range-diff.h b/range-diff.h
> index 2f69f6a434d..cd85000b5a0 100644
> --- a/range-diff.h
> +++ b/range-diff.h
> @@ -16,6 +16,7 @@ struct range_diff_options {
> int creation_factor;
> unsigned dual_color:1;
> unsigned left_only:1, right_only:1;
> + unsigned include_merges:1;
> const struct diff_options *diffopt; /* may be NULL */
> const struct strvec *other_arg; /* may be NULL */
> };
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 86010931ab6..c18a3fdab83 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule'
> test_cmp expect actual
> '
>
> +test_expect_success '--diff-merges' '
> + renamed_oid=$(git rev-parse --short renamed-file) &&
> + tree=$(git merge-tree unmodified renamed-file) &&
> + clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) &&
> + clean_oid=$(git rev-parse --short $clean) &&
> + conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) &&
> + conflict_oid=$(git rev-parse --short $conflict) &&
> +
> + git range-diff --diff-merges=1 $clean...$conflict >actual &&
> + cat >expect <<-EOF &&
> + 1: $renamed_oid < -: ------- s/12/B/
> + 2: $clean_oid = 1: $conflict_oid merge
> + EOF
> + test_cmp expect actual
> +'
> +
> test_done
>
> base-commit: 777489f9e09c8d0dd6b12f9d90de6376330577a2
> --
> gitgitgadget |
On the Git mailing list, Elijah Newren wrote (reply to this): One more thing...
On Fri, Nov 8, 2024 at 3:04 AM Johannes Schindelin
<[email protected]> wrote:
>
[...]
> Footnote *1*: We should really find a much better name for this than "evil
> merge". There is nothing evil about me having to add `#include
> "environment.h"` in v2.45.1^, for example. It was necessary so that the
> code would build. Tedious, yes, but not evil.
Indeed, I've felt it's problematic for a while too and had to take
time on at least one occasion to mention to someone else that I don't
actually mean "evil" it's just that "evil merge" as a compound term is
the convenient nomenclature we've been using for all of these,
regardless of whether the user modified the merge to resolve syntactic
conflicts, or to resolve semantic conflicts, or to sneak in "evil"
changes. That's particularly odd since the first category is the most
common, and the third (snuck in unrelated changes or "evil changes")
are the most rare. Maybe we should just call these "user-modified
merges" rather than "evil merges"? Any better suggestions? |
The
git range-diff
command does the same with merge commits asgit rebase
: It ignores them.However, when comparing branch thickets it can be quite illuminating to watch out for inadvertent changes in merge commits, in particular when some "evil" merges have been replayed, i.e. merges that needed to introduce changes outside of the merge conflicts (e.g. when one branch changed a function's signature and another branch introduced a caller of said function), in case the replayed merge is no longer "evil" and therefore potentially incorrect.
Let's introduce support for the
--diff-merges
option that is passed through to thosegit log
commands.I had a need for this earlier this year and got it working, leaving the GitGitGadget PR in a draft mode. Phil Blain found it and kindly nerd-sniped me into readying it for submitting, so say thanks to Phil!
Changes since v1:
first-parent
mode instead of vaguely talking about various modes' merits.--diff-merges
option (because--diff-merges
requires an argument).Cc: Philippe Blain [email protected]
cc: Johannes Sixt [email protected]
cc: Elijah Newren [email protected]