Add rough cut of 'pathbase' #720
Open
This is a rough draft for a "pathbase" feature. It operates like git's interactive rebase but is not interactive. I don't intend for this to be merged as it stands; I'm just looking for comments on whether it could be merged. It would need test cases at least.
After running `git-filter-repo --analyze --one-path-per-blob`, one annotates the resulting `.git/filter-repo/analysis/blob-shas-and-paths.txt` with commands and feeds that back in via `git-filter-repo --pathbase`. This avoids most of the human error inherent in a person transforming a `blob-shas-and-paths.txt` into a file that `--paths-from-file` can handle.

It is a process that most git users will already be familiar with from rebase.
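To make the annotate-and-feed-back idea concrete, here is a minimal Python sketch of the manual transformation this feature is meant to replace: reading an annotated listing and collecting the paths marked for removal. The command words `drop` and `keep` and the exact column layout are assumptions made for illustration (borrowed from rebase's todo list); this is not the parser in this PR.

```python
from typing import List

# Hypothetical annotation verbs; the commands that --pathbase accepts
# are not spelled out here, so these are placeholders.
DROP = "drop"
KEEP = "keep"


def paths_to_drop(annotated_lines: List[str]) -> List[str]:
    """Collect the paths marked for removal in an annotated listing.

    Assumed line shape (one path per blob):
        <command> <sha> <unpacked size> <packed size> <path>
    Lines that do not start with DROP (headers, blanks, unannotated
    or kept rows) are ignored.
    """
    dropped = []
    for line in annotated_lines:
        # Split into at most 5 fields so paths containing spaces survive.
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue
        command, _sha, _unpacked, _packed, path = fields
        if command == DROP:
            dropped.append(path.rstrip("\n"))
    return dropped
```

A list produced this way could be written out one path per line and passed to `git filter-repo --invert-paths --paths-from-file <file>`; the point of `--pathbase` is to skip that intermediate file.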
Unless the extra line in the `blob-shas-and-paths.txt` file causes issues, this is 100% backwards compatible.

I am currently using this to clean a repo that had a lot of build artifacts checked in. I find it better than (re-)creating a pathnames-only file from `blob-shas-and-paths.txt` because I can see what I'm selecting and it's less transformation than making the pathnames-only file.

Part of my reasoning is that this lets me do the annotation slowly, over time and many iterations, while I figure out what needs to come out and what stays.
Yes, this is a different approach than the "blobbase" PR I posted. It turns out that in a lot of cases, I want to remove an item from an erroneously checked-in build products directory, but that item is an art asset (or similar) that should still reside elsewhere in the codebase. Selecting the sha for stripping removes both copies, and I only want to remove one.
Also, the 0-byte deduplication blob sha has a lot of unrelated files pointing to it; those also have to be filtered by path.
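The difference between the two approaches can be sketched in a few lines of Python. The shas and paths below are made up for illustration, and the two helper functions are hypothetical; they only model the selection logic, not how git-filter-repo rewrites history.

```python
from typing import List, Set, Tuple

Entry = Tuple[str, str]  # (blob sha, path)

# Made-up listing: blob "a1" is stored under two paths, one of which
# is a build product and one of which is a legitimate asset.
ENTRIES: List[Entry] = [
    ("a1", "build/out/logo.png"),
    ("a1", "assets/logo.png"),
    ("b2", "build/out/app.o"),
]


def strip_by_sha(entries: List[Entry], shas: Set[str]) -> List[Entry]:
    """Keep only entries whose blob sha was not selected for stripping."""
    return [e for e in entries if e[0] not in shas]


def strip_by_path(entries: List[Entry], paths: Set[str]) -> List[Entry]:
    """Keep only entries whose path was not selected for stripping."""
    return [e for e in entries if e[1] not in paths]
```

Stripping sha `a1` drops `assets/logo.png` along with the build copy, while stripping the two `build/out/` paths leaves the asset in place. The same reasoning applies to the empty blob, which many unrelated paths share.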