git/object.h, branch main

Merge branch 'ps/receive-pack-shallow-optim'

2026-03-03T01:06:53Z

The code to accept shallow "git push" has been optimized.

* ps/receive-pack-shallow-optim:
  commit: use commit graph in `lookup_commit_reference_gently()`
  commit: make `repo_parse_commit_no_graph()` more robust
  commit: avoid parsing non-commits in `lookup_commit_reference_gently()`

commit: avoid parsing non-commits in `lookup_commit_reference_gently()`

2026-02-19T17:34:16Z

The function `lookup_commit_reference_gently()` can be used to look up a committish by object ID. As such, the function knows to peel for example tag objects so that we eventually end up with the commit.

The function is used quite a lot throughout our tree. One such user is "shallow.c" via `assign_shallow_commits_to_refs()`. The intent of this function is to figure out whether a shallow push is missing any objects that are required to satisfy the ref updates, and if so, which of the ref updates is missing objects.

This is done by painting the tree with `UNINTERESTING`. We start painting by calling `refs_for_each_ref()` so that we can mark all existing referenced objects as the boundary of objects that we already have, and which are supposed to be fully connected. The reference tips are then parsed via `lookup_commit_reference_gently()`, and the commit is then marked as uninteresting.

But references may not necessarily point to a committish, and if a lot of them don't, then this step takes a lot of time. This is mostly due to the way that `lookup_commit_reference_gently()` is implemented: before we learn about the type of the object we already call `parse_object()` on the object ID. This has two consequences:

  - We parse all objects, including trees and blobs, even though we don't even need the contents of them.

  - More importantly though, `parse_object()` will cause us to check whether the object ID matches its contents.

Combined this means that we deflate and hash every non-committish object, and that of course ends up being both CPU- and memory-intensive.

Improve the logic so that we first use `peel_object()`. This function won't parse the object for us, and thus it allows us to learn about the object's type before we parse and return it.

The following benchmark pushes a single object from a shallow clone into a repository that has 100,000 refs. These refs were created by listing all objects via `git rev-list(1) --objects --all` and creating refs for a subset of them, so lots of those refs will cover non-commit objects.

    Benchmark 1: git-receive-pack (rev = HEAD~)
      Time (mean ± σ):     62.571 s ±  0.413 s    [User: 58.331 s, System: 4.053 s]
      Range (min … max):   62.191 s … 63.010 s    3 runs

    Benchmark 2: git-receive-pack (rev = HEAD)
      Time (mean ± σ):     38.339 s ±  0.192 s    [User: 36.220 s, System: 1.992 s]
      Range (min … max):   38.176 s … 38.551 s    3 runs

    Summary
      git-receive-pack .

Signed-off-by: Junio C Hamano
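
The difference between "parse first" and "peel first" can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyStore`, `lookup_commit_parse_first()` and `lookup_commit_peel_first()` are hypothetical names, the store is a plain dictionary, and re-hashing the payload stands in for the content check that `parse_object()` performs. It is not git's implementation.

```python
# Toy model only: none of these names exist in git.
import hashlib


class ToyStore:
    def __init__(self):
        self.objects = {}      # oid -> (type, payload)
        self.parse_calls = 0   # how often we paid for a full parse

    def add(self, otype, payload):
        oid = hashlib.sha1(otype.encode() + b"\0" + payload).hexdigest()
        self.objects[oid] = (otype, payload)
        return oid

    def object_type(self, oid):
        """Cheap lookup: only the type, no hashing of the contents."""
        return self.objects[oid][0]

    def parse(self, oid):
        """Expensive lookup: re-hash the contents and compare with the ID."""
        self.parse_calls += 1
        otype, payload = self.objects[oid]
        if hashlib.sha1(otype.encode() + b"\0" + payload).hexdigest() != oid:
            raise ValueError("object ID does not match contents")
        return otype, payload


def lookup_commit_parse_first(store, oid):
    """Old shape: parse whatever the ID names, then look at its type."""
    otype, payload = store.parse(oid)
    while otype == "tag":
        oid = payload.decode()          # in this toy, a tag's payload is its target
        otype, payload = store.parse(oid)
    return oid if otype == "commit" else None


def lookup_commit_peel_first(store, oid):
    """New shape: learn the type cheaply, parse only tags and the final commit."""
    while store.object_type(oid) == "tag":
        _, payload = store.parse(oid)   # tag contents are needed to find the target
        oid = payload.decode()
    if store.object_type(oid) != "commit":
        return None                     # blobs and trees are never parsed
    store.parse(oid)
    return oid


store = ToyStore()
commit = store.add("commit", b"tree and parents would go here")
tag = store.add("tag", commit.encode())
blobs = [store.add("blob", f"file {i}".encode()) for i in range(3)]
ref_tips = [commit, tag] + blobs

old = [lookup_commit_parse_first(store, oid) for oid in ref_tips]
old_parses, store.parse_calls = store.parse_calls, 0
new = [lookup_commit_peel_first(store, oid) for oid in ref_tips]
new_parses = store.parse_calls
```

Both variants return the same commits, but in this toy the peel-first variant never pays the parsing cost for refs that point at blobs.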

revision: add --maximal-only option

2026-01-22T18:58:14Z

When inspecting a range of commits from some set of starting references, it is sometimes useful to learn which commits are not reachable from any other commits in the selected range.

One such application is in the creation of a sequence of bundles for the bundle URI feature. Creating a stack of bundles representing different slices of time includes defining which references to include. If all references are used, then this may be overwhelming or redundant. Instead, selecting commits that are maximal to the range could help define a smaller reference set to use in the bundle header.

Add a new '--maximal-only' option to restrict the output of a revision range to be only the commits that are not reachable from any other commit in the range, based on the reachability definition of the walk.

This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is set as we walk. This does extend the bit range in object.h, but using an earlier bit may collide with another feature.

The tests demonstrate the behavior of the feature with a positive-only range, ranges with negative references, and walk-modifying flags like --first-parent and --exclude-first-parent-only.

Since the --boundary option would not add any results when used with the --maximal-only option, mark them as incompatible.

Signed-off-by: Derrick Stolee
Signed-off-by: Junio C Hamano
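
As a rough illustration of the idea, the following toy walk sets a flag on every parent it reaches from a commit in the range and then reports only the commits that never received the flag. It is a sketch under assumptions: the graph is a plain dictionary, `maximal_only()` is a hypothetical helper, and the bit value chosen for `CHILD_VISITED` here is arbitrary rather than the one used in object.h.

```python
# Toy sketch; the flag value and helper name are assumptions, not git's code.
CHILD_VISITED = 1 << 0  # arbitrary bit for this sketch


def maximal_only(parents, tips, negative=()):
    """Return commits in the range that no other commit in the range reaches.

    parents:  dict mapping commit -> list of parent commits
    tips:     positive starting commits
    negative: commits whose history is excluded from the range
    """
    # Everything reachable from a negative reference is outside the range.
    excluded = set()
    stack = list(negative)
    while stack:
        c = stack.pop()
        if c not in excluded:
            excluded.add(c)
            stack.extend(parents[c])

    flags = {}
    seen = set()
    stack = [t for t in tips if t not in excluded]
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        for p in parents[c]:
            if p in excluded:
                continue
            # Some commit in the range has p as a parent, so p is not maximal.
            flags[p] = flags.get(p, 0) | CHILD_VISITED
            stack.append(p)

    return sorted(c for c in seen if not (flags.get(c, 0) & CHILD_VISITED))


# A is the root; B and C are children of A; D is a child of B.
history = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"]}
```

With tips B, C and D, only C and D are maximal, because B is reachable from D and A is reachable from all of them.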

Merge branch 'kn/maintenance-is-needed'

2025-11-21T17:14:17Z

"git maintenance" command learned "is-needed" subcommand to tell if it is necessary to perform various maintenance tasks.

* kn/maintenance-is-needed:
  maintenance: add 'is-needed' subcommand
  maintenance: add checking logic in `pack_refs_condition()`
  refs: add a `optimize_required` field to `struct ref_storage_be`
  reftable/stack: add function to check if optimization is required
  reftable/stack: return stack segments directly

Merge branch 'ps/ref-peeled-tags-fixes'

2025-11-19T18:55:42Z

Another fix-up to "peeled-tags" topic.

* ps/ref-peeled-tags-fixes:
  object: fix performance regression when peeling tags

Merge branch 'ps/ref-peeled-tags'

2025-11-19T18:55:39Z

Some ref backend storage can hold not just the object name of an annotated tag, but the object name of the object the tag points at. The code to handle this information has been streamlined.

* ps/ref-peeled-tags:
  t7004: do not chdir around in the main process
  ref-filter: fix stale parsed objects
  ref-filter: parse objects on demand
  ref-filter: detect broken tags when dereferencing them
  refs: don't store peeled object IDs for invalid tags
  object: add flag to `peel_object()` to verify object type
  refs: drop infrastructure to peel via iterators
  refs: drop `current_ref_iter` hack
  builtin/show-ref: convert to use `reference_get_peeled_oid()`
  ref-filter: propagate peeled object ID
  upload-pack: convert to use `reference_get_peeled_oid()`
  refs: expose peeled object ID via the iterator
  refs: refactor reference status flags
  refs: fully reset `struct ref_iterator::ref` on iteration
  refs: introduce `.ref` field for the base iterator
  refs: introduce wrapper struct for `each_ref_fn`

maintenance: add checking logic in `pack_refs_condition()`

2025-11-10T17:28:48Z

The 'git-maintenance(1)' command supports an '--auto' flag. With this flag, maintenance tasks run only if certain thresholds are met. The heuristic is defined on a task level, wherein each task defines an 'auto_condition', which states if the task should be run.

The 'pack-refs' task is hard-coded to return 1 because:

1. There was never a way to check if the reference backend needs to be optimized without actually performing the optimization.

2. We can pass in the '--auto' flag to 'git-pack-refs(1)' which would optimize based on heuristics.

The previous commit added a `refs_optimize_required()` function, which can be used to check if a reference backend requires optimization. Use this within `pack_refs_condition()`.

This allows us to add a 'git maintenance is-needed' subcommand which can notify the user if maintenance is needed without actually performing the optimization. Without this change, the reference backend would always state that optimization is needed.

Since we import 'revision.h', we need to remove the definition for 'SEEN' which is duplicated in the included header.

Signed-off-by: Karthik Nayak
Acked-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
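
The change in shape can be pictured with a toy condition check. Everything in this sketch is hypothetical: the `ToyRefBackend` class, its loose-ref counter and the threshold are stand-ins chosen for illustration, and the real heuristics live in the reference backends.

```python
# Toy model of an auto-condition check; names and threshold are hypothetical.
class ToyRefBackend:
    def __init__(self, loose_refs, threshold=100):
        self.loose_refs = loose_refs
        self.threshold = threshold
        self.optimize_calls = 0

    def optimize_required(self):
        """Answer "would optimizing do anything?" without doing it."""
        return self.loose_refs > self.threshold

    def optimize(self):
        self.optimize_calls += 1
        self.loose_refs = 0


def pack_refs_condition_hardcoded(backend):
    return True  # old shape: the task always claims it should run


def pack_refs_condition_checked(backend):
    return backend.optimize_required()  # new shape: ask the backend


tidy = ToyRefBackend(loose_refs=3)
messy = ToyRefBackend(loose_refs=500)
```

In this model a query such as "is maintenance needed?" can be answered for both backends without ever calling `optimize()`.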

object: fix performance regression when peeling tags

2025-11-06T18:54:34Z

Our Bencher dashboards [1] have recently alerted us about a bunch of performance regressions when writing references, specifically with the reftable backend. There is a 3x regression when writing many refs with preexisting refs in the reftable format, and a 10x regression when migrating refs between backends in either of the formats.

Bisecting the issue lands us at 6ec4c0b45b (refs: don't store peeled object IDs for invalid tags, 2025-10-23). The gist of the commit is that we may end up storing peeled objects in both reftables and packed-refs for corrupted tags, where the claimed tagged object type is different than the actual tagged object type. This will then cause us to create the `struct object *` with a wrong type, as well, and obviously nothing good comes out of that.

The fix for this issue was to introduce a new flag to `peel_object()` that causes us to verify the tagged object's type before writing it into the refdb -- if the tag is corrupt, we skip writing the peeled value. To verify whether the peeled value is correct we have to look up the object type via the ODB and compare the actual type with the claimed type, and that additional object lookup is costly.

This also explains why we see the regression only when writing refs with the reftable backend, but we see the regression with both backends when migrating refs:

  - The reftable backend knows to store peeled values in the new table immediately, so it has to try and peel each ref it's about to write to the transaction. So the performance regression is visible for all writes.

  - The files backend only stores peeled values when writing the packed-refs file, so it wouldn't hit the performance regression for normal writes. But on ref migrations we know to write all new values into the packed-refs file immediately, and that's why we see the regression for both backends there.

Taking a step back though reveals an oddity in the new verification logic: we not only verify the _tagged_ object's type, but we also verify the type of the tag itself. But this isn't really needed, as we wouldn't hit the bug in such a case anyway, as we only hit the issue with corrupt tags claiming an invalid type for the tagged object.

The consequence of this is that we now started to look up the target object of every single reference we're about to write, regardless of whether it even is a tag or not. And that is of course quite costly.

Fix the issue by only verifying the type of the tagged objects. This means that we of course still have a performance hit for actual tags. But this only happens for writes anyway, and I'd claim it's preferable to not store corrupted data in the refdb than to be fast here. Rename the flag accordingly to clarify that we only verify the tagged object's type.

This fix brings performance back to previous levels:

    Benchmark 1: baseline
      Time (mean ± σ):      46.0 ms ±   0.4 ms    [User: 40.0 ms, System: 5.7 ms]
      Range (min … max):    45.0 ms …  47.1 ms    54 runs

    Benchmark 2: regression
      Time (mean ± σ):     140.2 ms ±   1.3 ms    [User: 77.5 ms, System: 60.5 ms]
      Range (min … max):   138.0 ms … 142.7 ms    20 runs

    Benchmark 3: fix
      Time (mean ± σ):      46.2 ms ±   0.4 ms    [User: 40.2 ms, System: 5.7 ms]
      Range (min … max):    45.0 ms …  47.3 ms    55 runs

    Summary
      update-ref: baseline
        1.00 ± 0.01 times faster than fix
        3.05 ± 0.04 times faster than regression

[1]: https://bencher.dev/perf/git/plots

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
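
To see why verifying the ref target itself is so much more expensive than verifying only tagged objects, consider a toy count of object database lookups. This is a sketch under assumptions: `ToyOdb`, `peeled_value()` and the 1000-commit, 10-tag workload are made up for illustration, and only single-level tags are modelled.

```python
# Toy lookup counter; all names and the workload are hypothetical.
class ToyOdb:
    def __init__(self, types):
        self.types = types   # oid -> actual object type
        self.lookups = 0

    def type_of(self, oid):
        self.lookups += 1    # each call stands for one costly ODB lookup
        return self.types[oid]


def peeled_value(odb, obj, verify_ref_target):
    """Return the peeled object ID to store, or None if there is none."""
    if verify_ref_target and odb.type_of(obj["oid"]) != obj["type"]:
        return None          # regressed shape: checks every ref target
    if obj["type"] != "tag":
        return None          # nothing to peel
    if odb.type_of(obj["target"]) != obj["target_type"]:
        return None          # corrupt tag: do not store a peeled value
    return obj["target"]


commits = [{"oid": f"c{i}", "type": "commit"} for i in range(1000)]
tags = [
    {"oid": f"t{i}", "type": "tag", "target": f"c{i}", "target_type": "commit"}
    for i in range(10)
]
types = {o["oid"]: o["type"] for o in commits + tags}


def lookups_for(verify_ref_target):
    odb = ToyOdb(types)
    for obj in commits + tags:
        peeled_value(odb, obj, verify_ref_target)
    return odb.lookups
```

In this toy workload, checking every ref target costs 1020 lookups, while checking only the tagged objects costs 10, one per tag.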

object: add flag to `peel_object()` to verify object type

2025-11-04T15:32:25Z

When peeling a tag to a non-tag object we repeatedly call `parse_object()` on the tagged object until we find the first object that isn't a tag. While this feels sensible at first, there is a big catch here: `parse_object()` doesn't actually verify the type of the tagged object.

The relevant code path here eventually ends up in `parse_tag_buffer()`. Here, we parse the various fields of the tag, including the "type". Once we've figured out the type and the tagged object ID, we call one of the `lookup_${type}()` functions for whatever type we have found. There are two possible outcomes in the successful case:

  1. The object is already part of our cached objects. In that case we double-check whether the type we're trying to look up matches the type that was cached.

  2. The object is _not_ part of our cached objects. In that case, we simply create a new object with the expected type, but we don't parse that object.

In the first case we might notice type mismatches, but only in the case where our cache has the object with the correct type. In the second case, we'll blindly assume that the type is correct and then go with it. We'll only notice that the type might be wrong when we try to parse the object at a later point.

Now arguably, we could change `parse_tag_buffer()` to verify the tagged object's type for us. But that would have the effect that such a tag cannot be parsed at all anymore, and we have a small bunch of tests for exactly this case that assert we still can open such tags. So this change does not feel like something we can retroactively tighten, even though one shouldn't ever hit such corrupted tags.

Instead, add a new `flags` field to `peel_object()` that allows the caller to opt in to strict object verification. This will be wired up at a subset of callsites over the next few commits.

Note that this change also inlines `deref_tag_noverify()`. There have only been two callsites of that function, the one we're changing and one in our test helpers. The latter callsite can trivially use `deref_tag()` instead, so by inlining the function we avoid having to pass down the flag.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
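
The two lookup outcomes and the opt-in verification can be mimicked with a toy object cache. This is a sketch, not git's code: `ToyObjectCache`, `toy_peel()` and the flag name `TOY_VERIFY_TAGGED_TYPE` are hypothetical, and the "ODB" is a dictionary of actual types.

```python
# Toy model of the described lookup behaviour; every name is hypothetical.
TOY_VERIFY_TAGGED_TYPE = 1 << 0  # stand-in for the opt-in flag


class ToyObjectCache:
    def __init__(self):
        self.cached = {}  # oid -> type recorded in memory

    def lookup(self, oid, expected_type):
        if oid in self.cached:
            # Outcome 1: already cached, so a mismatch can be noticed.
            if self.cached[oid] != expected_type:
                return None
            return (oid, expected_type)
        # Outcome 2: not cached, so the claimed type is trusted blindly.
        self.cached[oid] = expected_type
        return (oid, expected_type)


def toy_peel(cache, actual_types, tag, flags=0):
    """Peel a single-level tag; optionally verify the tagged object's type."""
    oid, claimed = tag["target"], tag["target_type"]
    if flags & TOY_VERIFY_TAGGED_TYPE and actual_types[oid] != claimed:
        return None  # strict mode: refuse a tag that lies about its target
    return cache.lookup(oid, claimed)


actual_types = {"b1": "blob"}
corrupt_tag = {"target": "b1", "target_type": "commit"}  # claims a blob is a commit

lenient = toy_peel(ToyObjectCache(), actual_types, corrupt_tag)
strict = toy_peel(ToyObjectCache(), actual_types, corrupt_tag, TOY_VERIFY_TAGGED_TYPE)

warm_cache = ToyObjectCache()
warm_cache.lookup("b1", "blob")  # the object was seen earlier with its real type
warm = toy_peel(warm_cache, actual_types, corrupt_tag)
```

Without the flag and with a cold cache, the toy hands back an object carrying the wrong type. The mismatch is only caught when the cache already knows the real type or when the caller opts in to verification.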

last-modified: implement faster algorithm

2025-11-03T15:25:41Z

The current implementation of git-last-modified(1) works by doing a revision walk, and inspecting the diff at each level of that walk to annotate entries remaining in the hashmap of paths. In other words, if the diff at some level touches a path which has not yet been associated with a commit, then that commit becomes associated with the path.

While a perfectly reasonable implementation, it can perform poorly in either one of two scenarios:

  1. There are many entries of interest, in which case there is simply a lot of work to do.

  2. Or, there are (even a few) entries which have not been updated in a long time, and so we must walk through a lot of history in order to find a commit that touches that path.

This patch rewrites the last-modified implementation to address the second point. The idea behind the algorithm is to propagate a set of 'active' paths (a path is 'active' if it does not yet belong to a commit) up to parents and do a truncated revision walk.

The walk is truncated because it does not produce a revision for every change in the original pathspec, but rather only for active paths.

More specifically, consider a priority queue of commits sorted by generation number. First, enqueue the set of boundary commits with all paths in the original spec marked as interesting.

Then, while the queue is not empty, do the following:

  1. Pop an element, say, 'c', off of the queue, making sure that 'c' isn't reachable by anything in the '--not' set.

  2. For each parent 'p' (with index 'parent_i') of 'c', do the following:

     a. Compute the diff between 'c' and 'p'.
     b. Pass any active paths that are TREESAME from 'c' to 'p'.
     c. If 'p' has any active paths, push it onto the queue.

  3. Any path that remains active on 'c' is associated to that commit.

This ends up being equivalent to doing something like 'git log -1 -- $path' for each path simultaneously. But, it allows us to go much faster than the original implementation by limiting the number of diffs we compute, since we can avoid parts of history that would have been considered by the revision walk in the original implementation, but are known to be uninteresting to us because we have already marked all paths in that area to be inactive.

To avoid computing many first-parent diffs, add another trick on top of this and check if all paths active in 'c' are DEFINITELY NOT in c's Bloom filter. Since the commit-graph only stores first-parent diffs in the Bloom filters, we can only apply this trick to first-parent diffs.

Comparing the performance of this new algorithm shows about a 2.5x improvement on git.git:

    Benchmark 1: master   no bloom
      Time (mean ± σ):      2.868 s ±  0.023 s    [User: 2.811 s, System: 0.051 s]
      Range (min … max):    2.847 s …  2.926 s    10 runs

    Benchmark 2: master with bloom
      Time (mean ± σ):     949.9 ms ±  15.2 ms    [User: 907.6 ms, System: 39.5 ms]
      Range (min … max):   933.3 ms … 971.2 ms    10 runs

    Benchmark 3: HEAD     no bloom
      Time (mean ± σ):     782.0 ms ±   6.3 ms    [User: 740.7 ms, System: 39.2 ms]
      Range (min … max):   776.4 ms … 798.2 ms    10 runs

    Benchmark 4: HEAD   with bloom
      Time (mean ± σ):     307.1 ms ±   1.7 ms    [User: 276.4 ms, System: 29.9 ms]
      Range (min … max):   303.7 ms … 309.5 ms    10 runs

    Summary
      HEAD with bloom ran
        2.55 ± 0.02 times faster than HEAD no bloom
        3.09 ± 0.05 times faster than master with bloom
        9.34 ± 0.09 times faster than master no bloom

In short, the existing implementation is comparably fast *with* Bloom filters as the new implementation is *without* Bloom filters. So, most repositories should get a dramatic speed-up by just deploying this (even without computing Bloom filters), and all repositories should get faster still when computing Bloom filters.

When comparing a more extreme example of `git last-modified -- COPYING t`, the new implementation is about 5 times faster:

    Benchmark 1: master
      Time (mean ± σ):      4.372 s ±  0.057 s    [User: 4.286 s, System: 0.062 s]
      Range (min … max):    4.308 s …  4.509 s    10 runs

    Benchmark 2: HEAD
      Time (mean ± σ):     826.3 ms ±  22.3 ms    [User: 784.1 ms, System: 39.2 ms]
      Range (min … max):   810.6 ms … 881.2 ms    10 runs

    Summary
      HEAD ran
        5.29 ± 0.16 times faster than master

As an added benefit, results are more consistent now. For example, the implementation in 'master' gives:

    $ git log --max-count=1 --format=%H -- pkt-line.h
    15df15fe07ef66b51302bb77e393f3c5502629de

    $ git last-modified -- pkt-line.h
    15df15fe07ef66b51302bb77e393f3c5502629de	pkt-line.h

    $ git last-modified | grep pkt-line.h
    5b49c1af03e600c286f63d9d9c9fb01403230b9f	pkt-line.h

With the changes in this patch the results of git-last-modified(1) always match those of `git log --max-count=1`.

One thing to note though: the results might be output in a different order than before. This is not considered to be an issue because it is not documented anywhere that the order is guaranteed.

Based-on-patches-by: Derrick Stolee
Based-on-patches-by: Taylor Blau
Signed-off-by: Taylor Blau
Signed-off-by: Toon Claes
Acked-by: Taylor Blau
[jc: tweaked use of xcalloc() to unbreak coccicheck]
Signed-off-by: Junio C Hamano
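
The queue-based walk described in this commit message can be sketched on a toy history. This is a minimal illustration under assumptions, not the implementation from the patch: commits are dictionaries with a precomputed generation number and a full path-to-blob snapshot, "TREESAME" is reduced to comparing those snapshots, and the '--not' set and the Bloom filter shortcut are left out. The name `toy_last_modified()` is hypothetical. A path handed to one parent is dropped from the set offered to later parents, which is a simplification chosen for this sketch.

```python
# Toy sketch of the active-path walk; data layout and names are assumptions.
import heapq


def toy_last_modified(commits, tip, paths):
    """commits: id -> {"gen": int, "parents": [ids], "tree": {path: blob}}"""
    active = {tip: set(paths)}            # commit -> paths not yet attributed
    queue = [(-commits[tip]["gen"], tip)] # highest generation number first
    queued = {tip}
    result = {}

    while queue:
        _, c = heapq.heappop(queue)
        remaining = active.pop(c)
        tree = commits[c]["tree"]

        for p in commits[c]["parents"]:
            ptree = commits[p]["tree"]
            # Paths that look identical in the parent stay active and move up.
            same = {path for path in remaining if ptree.get(path) == tree.get(path)}
            if same:
                active.setdefault(p, set()).update(same)
                if p not in queued:
                    queued.add(p)
                    heapq.heappush(queue, (-commits[p]["gen"], p))
            remaining -= same

        # Whatever could not be passed to a parent was last changed here.
        for path in remaining:
            result[path] = c

    return result


toy_history = {
    "c1": {"gen": 1, "parents": [], "tree": {"a": 1, "b": 1}},
    "c2": {"gen": 2, "parents": ["c1"], "tree": {"a": 2, "b": 1}},
    "c3": {"gen": 3, "parents": ["c2"], "tree": {"a": 2, "b": 1, "c": 1}},
}
answer = toy_last_modified(toy_history, "c3", ["a", "b", "c"])
```

Because parents are only queued while they still carry active paths, the walk stops as soon as every path has been attributed, rather than visiting the rest of history.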