-
Notifications
You must be signed in to change notification settings - Fork 141
replace-refs: fix support of qualified replace ref paths #1903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
replace-refs: fix support of qualified replace ref paths #1903
Conversation
There are issues in commit a666d75: |
a666d75
to
373addd
Compare
The enactment of replace refs in replace-object.c#register_replace_ref() explicitly uses the last portion of the ref "path", the substring after the last slash, as the object ID to be replaced. This means that replace refs can be organized into different paths; you can separate replace refs created for different purposes, like "refs/replace/2012-migration/*". This in turn makes it practical to store prepared replace refs in different ref paths on a git server, and have users "map" them, via specific refspecs, into their local repos; different types of replacements can be mapped into different sub-paths of refs/replace/. The only way this didn't "work" is in the commit decoration process, in log-tree.c#add_ref_decoration(), where different logic was used to obtain the replaced object ID, removing the "refs/replace/" prefix only. This inconsistent logic meant that more structured replace ref paths caused a warning to be printed, and the "replaced" decoration to be omitted. Fix this decoration logic to use the same "last part of ref path" logic, fixing spurious warnings (and missing decorations) for users of more structured replace ref paths. Also add tests for qualified replace ref paths. Signed-off-by: Tao Klerks <[email protected]>
373addd
to
d5386da
Compare
/preview |
Preview email sent as [email protected] |
/preview |
Preview email sent as [email protected] |
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Sat, Apr 26, 2025 at 07:10:52AM +0000, Tao Klerks via GitGitGadget wrote:
> From: Tao Klerks <[email protected]>
>
> The enactment of replace refs in
> replace-object.c#register_replace_ref() explicitly uses the last
> portion of the ref "path", the substring after the last slash, as the
> object ID to be replaced. This means that replace refs can be
> organized into different paths; you can separate replace refs created
> for different purposes, like "refs/replace/2012-migration/*". This in
> turn makes it practical to store prepared replace refs in different ref
> paths on a git server, and have users "map" them, via specific
> refspecs, into their local repos; different types of replacements can
> be mapped into different sub-paths of refs/replace/.
I wonder whether this really was an intentional choice or whether it is
simply a bug that led to useful behaviour. In any case, git-replace(1)
itself does not mention your behaviour at all -- it simply talks about
the "name of the replace reference".
> The only way this didn't "work" is in the commit decoration process,
> in log-tree.c#add_ref_decoration(), where different logic was used to
> obtain the replaced object ID, removing the "refs/replace/" prefix
> only. This inconsistent logic meant that more structured replace ref
> paths caused a warning to be printed, and the "replaced" decoration to
> be omitted.
>
> Fix this decoration logic to use the same "last part of ref path"
> logic, fixing spurious warnings (and missing decorations) for users
> of more structured replace ref paths. Also add tests for qualified
> replace ref paths.
If we can agree that this is something that we want we should definitely
amend git-replace(1) to document this new format.
Which raises the question: is this something that we want? Are there any
arguments that would speak against loosening the format of replace refs
now?
The change itself would be backwards compatible, so any old ref that
parsed as replace ref would continue to parse as one. But the bigger
question is what happens when using an old Git version or an alternative
implementation of Git to parse such replace refs. How do those handle
such refs? For old Git versions we know, for alternative implementations
like JGit or libgit2 we don't. My assumption would be that those simply
ignore the new format, which raises the question whether that is okay.
In any case, we should provide good reasoning why this change is okay to
make.
> diff --git a/log-tree.c b/log-tree.c
> index a4d4ab59ca0..6a87724527b 100644
> --- a/log-tree.c
> +++ b/log-tree.c
> @@ -163,10 +163,11 @@ static int add_ref_decoration(const char *refname, const char *referent UNUSED,
>
> if (starts_with(refname, git_replace_ref_base)) {
> struct object_id original_oid;
> + const char *slash = strrchr(refname, '/');
> + const char *hash = slash ? slash + 1 : refname;
> if (!replace_refs_enabled(the_repository))
> return 0;
> - if (get_oid_hex(refname + strlen(git_replace_ref_base),
> - &original_oid)) {
> + if (get_oid_hex(hash, &original_oid)) {
> warning("invalid replace ref %s", refname);
> return 0;
> }
It would probably make sense to pull out a common function that parses
replace refs for us so that we can deduplicate callsites and show that
all of them behave the same now.
Thanks!
Patrick |
User |
On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <[email protected]> writes:
> I wonder whether this really was an intentional choice or whether it is
> simply a bug that led to useful behaviour. In any case, git-replace(1)
> itself does not mention your behaviour at all -- it simply talks about
> the "name of the replace reference".
Yup, I am reasonbly sure that this was not a designed behaviour. If
this were discovered during the original review, I would probably
have suggested tightening the rule to ignore anything with extra
levels in the middle. It can be argued that it is too late, but
with this ...
>> The only way this didn't "work" is in the commit decoration process,
... where the "funny" replace refs are honored in some but not all
situations, it also can be argued that it is highly unlikely anybody
sane is actually depending on this "feature", so such a tightening
may not hurt too much.
The reason why I would slightly prefer tightening the rule, rather
than making the loophole even larger by adjusting the code for "the
commit decoration process", is what it means to have two different
replacement objects for the same object, which would naturally be
prevented from happening if we did not allow extra levels in the
middle. Would one always consistently trump the other? When a
filesystem rebalances, would we just pick one at randomly based
purely on the first one readdir() happened to return? It smells
to lead to nothing but a confused mess.
Maybe the answer is "don't do it, then". The same answer, however,
can be given for creating any extra level between refs/replace and
the object name itself, so... I dunno.
> If we can agree that this is something that we want we should definitely
> amend git-replace(1) to document this new format.
>
> Which raises the question: is this something that we want? Are there any
> arguments that would speak against loosening the format of replace refs
> now?
"What if refs/replace/{a,b,c}/$name point at different objects?"
would be a solid reason why we shouldn't do this, but I personally
do not feel too strongly about it, as "don't do it, then" may be a
reasonable answer for such an insane scenario. |
On the Git mailing list, Tao Klerks wrote (reply to this): On Tue, Apr 29, 2025 at 8:46 PM Junio C Hamano <[email protected]> wrote:
>
> Patrick Steinhardt <[email protected]> writes:
>
> > I wonder whether this really was an intentional choice or whether it is
> > simply a bug that led to useful behaviour.
I don't know that for sure either - although the fact that we *do*
already have sanity checks against multiple replace refs targeting the
same object id leads me to believe it was intentionally allowed.
I tried to explain why this is useful, and why it should be *better*
supported, in the change description: *without* this behavior, replace
refs cannot be "namespaced" - they are always inherently an
undifferentiated pile, and that in turn makes *sharing* of replace
refs completely impractical.
>
> ... where the "funny" replace refs are honored in some but not all
> situations, it also can be argued that it is highly unlikely anybody
> sane is actually depending on this "feature", so such a tightening
> may not hurt too much.
Well, I guess my sanity can be reasonably questioned, but it would
certainly hurt me and my thousand plus users - we use replace refs to
keep nice tidy squash-based history by default, but allow for
different types of "deep blame" by setting up one or more fetch
refspecs that map into "refs/replace/SOMETHING/*"
>
> The reason why I would slightly prefer tightening the rule, rather
> than making the loophole even larger by adjusting the code for "the
> commit decoration process", is what it means to have two different
> replacement objects for the same object, which would naturally be
> prevented from happening if we did not allow extra levels in the
> middle.
This is already accounted for with a hard error: If you end up with
duplicate replace ref targets, many operations fail with a clear
error. Clean up your replace refs and you're off to the races again.
> Would one always consistently trump the other? When a
> filesystem rebalances, would we just pick one at randomly based
> purely on the first one readdir() happened to return? It smells
> to lead to nothing but a confused mess.
Reasonable concern, but already addressed.
>
> Maybe the answer is "don't do it, then". The same answer, however,
> can be given for creating any extra level between refs/replace and
> the object name itself, so... I dunno.
The same answer should probably not be given: One behavior is
*useful*, and one is not.
>
> > If we can agree that this is something that we want we should definitely
> > amend git-replace(1) to document this new format.
I will (propose to) do so.
> >
> > Which raises the question: is this something that we want? Are there any
> > arguments that would speak against loosening the format of replace refs
> > now?
My point here is that it's *not* a loosening - this "loose" behavior
already exists, it works, it is useful, and there are people using it.
At least my thousand users. All I'm proposing here is to show them
less spurious warnings when they run a log with --decorate.
>
> "What if refs/replace/{a,b,c}/$name point at different objects?"
> would be a solid reason why we shouldn't do this, but I personally
> do not feel too strongly about it, as "don't do it, then" may be a
> reasonable answer for such an insane scenario.
That is indeed the current outcome: don't do it or you get a hard (but
nondestructive) error. You can then of course fix it and you're set. |
User |
On the Git mailing list, Junio C Hamano wrote (reply to this): Tao Klerks <[email protected]> writes:
> This is already accounted for with a hard error: If you end up with
> duplicate replace ref targets, many operations fail with a clear
> error. Clean up your replace refs and you're off to the races again.
OK. As long as that safety is there, I offhand see no more reason
to forbid it. As you said, it can lead to a useful use case. |
cc: Patrick Steinhardt [email protected]
cc: Tao Klerks [email protected]