aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2026-02-24t: disable maintenance where we verify object database structurePatrick Steinhardt
We have a couple of tests that explicitly verify the structure of the object database. Naturally, this structure is dependent on whether or not we run repository maintenance: if it decides to optimize the object database the expected structure is likely to not materialize. Explicitly disable auto-maintenance in such tests so that we are not dependent on decisions made by our maintenance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24t: fix races caused by background maintenancePatrick Steinhardt
Many Git commands spawn git-maintenance(1) to optimize the repository in the background. By default, performing the maintenance is for most of the part asynchronous: we fork the executable and then continue with the rest of our business logic. This is working as expected for our users, but this behaviour is somewhat problematic for our test suite as this is inherently racy. We have many tests that verify the on-disk state of repositories, and those tests may easily race with our background maintenance. In a similar fashion, we may end up with processes that "leak" out of a current test case. Until now this tends to not be much of a problem. Our maintenance uses git-gc(1) by default, which knows to bail out in case there aren't either too many packfiles or too many loose objects. So even if other data structures would need to be optimized, we won't do so unless the object database also needs optimizations. This is about to change though, as a subsequent commit will switch to the "geometric" maintenance strategy as a default. The consequence is that we will run required optimizations even if the object database is well-optimized. And this uncovers races between our test suite and background maintenance all over the place. Disabling maintenance outright in our test suite is not really an option, as it would result in significant divergence from the "real world" and reduce our test coverage. But we've got an alternative up our sleeves: we can ensure that garbage collection runs synchronously by overriding the "maintenance.autoDetach" configuration. Of course that also diverges from the real world, as we now stop testing that background maintenance interacts in a benign way with normal Git commands. But on the other hand this ensures that the maintenance itself does not for example lead to data loss in a more reproducible way. Another concern is that this would make execution of the test suite much slower. But a quick benchmark on my machine demonstrates that this does not seem to be the case: Benchmark 1: meson test (revision = HEAD~) Time (mean ± σ): 131.182 s ± 1.293 s [User: 853.737 s, System: 1160.479 s] Range (min … max): 130.001 s … 132.563 s 3 runs Benchmark 2: meson test (revision = HEAD) Time (mean ± σ): 129.554 s ± 0.507 s [User: 849.040 s, System: 1152.664 s] Range (min … max): 129.000 s … 129.994 s 3 runs Summary meson test (revision = HEAD) ran 1.01 ± 0.01 times faster than meson test (revision = HEAD~) Funny enough, it even seems as if this speeds up test execution ever so slightly, but that may just as well be noise. Introduce a new `GIT_TEST_MAINT_AUTO_DETACH` environment variable that allows us to override the auto-detach behaviour and set that variable in our tests. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-24diffcore-break: avoid segfault with freed entriesHan Young
After we have freed the file pair, we should set the queue reference to null. When computing a diff in a partial clone, there is a chance that we could trigger a prefetch of missing objects when there are freed entries in the global diff queue due to break-rewrites detection. The segfault only occurs if an entry has been freed by break-rewrites and there is an entry to be prefetched. There is a new test in t4067 that trigger the segmentation fault that results in this case. The test explicitly fetch the necessary blobs to trigger the break rewrites, some blobs are left to be prefetched. The fix is to set the queue pointer to NULL after it is freed, the prefetch will skip NULL entries. Signed-off-by: Han Young <hanyang.tony@bytedance.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23doc: diff-options.adoc: show format.noprefix for format-patchKristoffer Haugsbakk
git-format-patch(1) uses `format.noprefix` and ignores `diff.noprefix`. The configuration variable `format.prefix` was added as an “escape hatch”, and “it’s unlikely that anybody really wants format. noprefix=true in the first place.”[1] Based on that there doesn’t seem to be a need to widely advertise this configuration variable. But in any case: the documentation for this option should not claim that it overrides a config that is always ignored. † 1: 8d5213de (format-patch: add format.noprefix option, 2023-03-09) Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23format-patch: make format.noprefix a booleanKristoffer Haugsbakk
The config `format.noprefix` was added in 8d5213de (format-patch: add format.noprefix option, 2023-03-09) to support no-prefix on paths. That was immediately after making git-format-patch(1) not respect `diff.noprefix`.[1] The intent was to mirror `diff.noprefix`. But this config was unintentionally[2] implemented by enabling no-prefix if any kind of value is set. † 1: c169af8f (format-patch: do not respect diff.noprefix, 2023-03-09) † 2: https://lore.kernel.org/all/20260211073553.GA1867915@coredump.intra.peff.net/ Let’s indeed mirror `diff.noprefix` by treating it as a boolean. This is a breaking change. And as far as breaking changes go it is pretty benign: • The documentation claims that this config is equivalent to `diff.noprefix`; this is just a bug fix if the documentation is what defines the application interface • Only users with non-boolean values will run into problems when we try to parse it as a boolean. But what would (1) make them suspect they could do that in the first place, and (2) have motivated them to do it? • Users who have set this to `false` and expect that to mean *enable format.noprefix* (current behavior) will now have the opposite experience. Which is not a reasonable setup. Let’s only offer a breaking change fig leaf by advising about the previous behavior before dying. Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23Merge branch 'ps/object-info-bits-cleanup' into ps/odb-sourcesJunio C Hamano
* ps/object-info-bits-cleanup: odb: convert `odb_has_object()` flags into an enum odb: convert object info flags into an enum odb: drop gaps in object info flag values builtin/fsck: fix flags passed to `odb_has_object()` builtin/backfill: fix flags passed to `odb_has_object()`
2026-02-23Merge branch 'ps/odb-for-each-object' into ps/odb-sourcesJunio C Hamano
* ps/odb-for-each-object: odb: drop unused `for_each_{loose,packed}_object()` functions reachable: convert to use `odb_for_each_object()` builtin/pack-objects: use `packfile_store_for_each_object()` odb: introduce mtime fields for object info requests treewide: drop uses of `for_each_{loose,packed}_object()` treewide: enumerate promisor objects via `odb_for_each_object()` builtin/fsck: refactor to use `odb_for_each_object()` odb: introduce `odb_for_each_object()` packfile: introduce function to iterate through objects packfile: extract function to iterate through objects of a store object-file: introduce function to iterate through objects object-file: extract function to read object info from path odb: fix flags parameter to be unsigned odb: rename `FOR_EACH_OBJECT_*` flags
2026-02-23config: use an enum for typeDerrick Stolee
The --type=<X> option for 'git config' has previously been defined using macros, but using a typed enum is better for tracking the possible values. Move the definition up to make sure it is defined before a macro uses some of its terms. Update the initializer for config_display_options to explicitly set 'type' to TYPE_NONE even though this is implied by a zero value. This assists in knowing that the switch statement added in the previous change has a complete set of cases for a properly-valued enum. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: restructure format_config()Derrick Stolee
The recent changes have replaced the bodies of most if/else-if cases with simple helper method calls. This makes it easy to adapt the structure into a clearer switch statement, leaving a simple if/else in the default case. Make things a little simpler to read by reducing the nesting depth via a new goto statement when we want to skip values. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format colors quietlyDerrick Stolee
Move the logic for formatting color config value into a helper method and use quiet parsing when needed. This removes error messages when parsing a list of config values that do not match color formats. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23color: add color_parse_quietly()Derrick Stolee
When parsing colors, a failed parse leads to an error message due to the result returning error(). To allow for quiet parsing, create color_parse_quietly(). This is in contrast to an ..._gently() version because the original does not die(), so both options are technically 'gentle'. To accomplish this, convert the implementation of color_parse_mem() into a static color_parse_mem_1() helper that adds a 'quiet' parameter. The color_parse_quietly() method can then use this. Since it is a near equivalent to color_parse(), move that method down in the file so they can be nearby while also appearing after color_parse_mem_1(). Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format expiry dates quietlyDerrick Stolee
Move the logic for formatting expiry date config values into a helper method and use quiet parsing when needed. Note that git_config_expiry_date() will show an error on a bad parse and not die() like most other git_config...() parsers. Thus, we use 'quietly' here instead of 'gently'. There is an unfortunate asymmetry in these two parsing methods, but we need to treat a positive response from parse_expiry_date() as an error or we will get incorrect values. This updates the behavior of 'git config list --type=expiry-date' to be quiet when attempting parsing on non-date values. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format paths gentlyDerrick Stolee
Move the logic for formatting path config values into a helper method and use gentle parsing when needed. We need to be careful about how to handle the ':(optional)' macro, which as tested in t1311-config-optional.sh must allow for ignoring a missing path when other multiple values exist, but cause 'git config get' to fail if it is the only possible value and thus no result is output. In the case of our list, we need to omit those values silently. This necessitates the use of the 'gently' parameter here. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format bools or strings in helperDerrick Stolee
Move the logic for formatting bool-or-string config values into a helper. This parsing has always been gentle, so this is not unlocking new behavior. This extraction is only to match the formatting of the other cases that do need a behavior change. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format bools or ints gentlyDerrick Stolee
Move the logic for formatting bool-or-int config values into a helper method and use gentle parsing when needed. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format bools gentlyDerrick Stolee
Move the logic for formatting bool config values into a helper method and use gentle parsing when needed. This makes 'git config list --type=bool' not fail when coming across a non-boolean value. Such unparseable values are filtered out quietly. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: format int64s gentlyDerrick Stolee
Move the logic for formatting int64 config values into a helper method and use gentle parsing when needed. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: make 'git config list --type=<X>' workDerrick Stolee
Previously, the --type=<X> argument to 'git config list' was ignored and did nothing. Now, we add the use of format_config() to the show_all_config() function so each key-value pair is attempted to be parsed. This is our first use of the 'gently' parameter with a nonzero value. When listing multiple values, our initial settings for the output format is different. Add a new init helper to specify the fact that keys should be shown and also add the default delimiters as they were unset in some cases. Our intention is that if there is an error in parsing, then the row is not output. This is necessary to avoid the caller needing to build their own validator to understand the difference between valid, canonicalized types and other raw string values. The raw values will always be available to the user if they do not specify the --type=<X> option. The current behavior is more complicated, including error messages on bad parsing or potentially complete failure of the command. We add tests at this point that demonstrate the current behavior so we can witness the fix in future changes that parse these values quietly and gently. This is a change in behavior! We are starting to respect an option that was previously ignored, leading to potential user confusion. This is probably still a good option, since the --type argument did not change behavior at all previously, so users can get the behavior they expect by removing the --type argument or adding the --no-type argument. t1300-config.sh is updated with the current behavior of this formatting logic to justify the upcoming refactoring of format_config() that will incrementally fix some of these cases to be more user-friendly. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: add 'gently' parameter to format_config()Derrick Stolee
This parameter is set to 0 for all current callers and is UNUSED. However, we will start using this option in future changes and in a critical change that requires gentle parsing (not using die()) to try parsing all values in a list. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23config: move show_all_config()Derrick Stolee
In anticipation of using format_config() in this method, move show_all_config() lower in the file without changes. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_fullref_in()`Patrick Steinhardt
Replace calls to `refs_for_each_fullref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_namespaced_ref()`Patrick Steinhardt
Replace calls to `refs_for_each_namespaced_ref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_glob_ref()`Patrick Steinhardt
Replace calls to `refs_for_each_glob_ref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_glob_ref_in()`Patrick Steinhardt
Replace calls to `refs_for_each_glob_ref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_rawref_in()`Patrick Steinhardt
Replace calls to `refs_for_each_rawref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_rawref()`Patrick Steinhardt
Replace calls to `refs_for_each_rawref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_ref_in()`Patrick Steinhardt
Replace calls to `refs_for_each_ref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: improve verification for-each-ref optionsPatrick Steinhardt
Improve verification of the passed-in for-each-ref options: - Require that the `refs` store must be given. It's arguably very surprising that we simply return successfully in case the ref store is a `NULL` pointer. - When expected to trim ref prefixes we will `BUG()` in case the refname would become empty or in case we're expected to trim a longer prefix than the refname is long. As such, this case is only guaranteed to _not_ `BUG()` in case the caller also specified a prefix. And furthermore, that prefix must end in a trailing slash, as otherwise it may produce an exact match that could lead us to trim to the empty string. An audit shows that there are no callsites that rely on either of these behaviours, so this should not result in a functional change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: generalize `refs_for_each_fullref_in_prefixes()`Patrick Steinhardt
The function `refs_for_each_fullref_in_prefixes()` can be used to iterate over all references part of any of the user-provided prefixes. In contrast to the `prefix` parameter of `refs_for_each_ref_ext()` it knows to handle the case well where multiple of the passed-in prefixes start with a common prefix by computing longest common prefixes and then iterating over those. While we could move this logic into `refs_for_each_ref_ext()`, this one feels somewhat special as we perform multiple iterations. But what we _can_ do is to generalize how this function works: instead of accepting only a small handful of parameters, we can have it accept the full options structure. One obvious exception is that the caller must not provide a prefix via the options. But this case can be easily detected. Refactor the code accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: generalize `refs_for_each_namespaced_ref()`Patrick Steinhardt
The function `refs_for_each_namespaced_ref()` iterates through all references that are part of the current ref namespace. This namespace can be configured by setting the `GIT_NAMESPACE` environment variable and is then retrieved by calling `get_git_namespace()`. If a namespace is configured, then we: - Obviously only yield refs that exist in this namespace. - Rewrite exclude patterns so that they work for the given namespace, if any namespace is currently configured. Port this logic to `refs_for_each_ref_ext()` by adding a new `namespace` field to the options structure. This gives callers more flexibility as they can decide by themselves whether they want to use the globally configured or an arbitrary other namespace. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: speed up `refs_for_each_glob_ref_in()`Patrick Steinhardt
The function `refs_for_each_glob_ref_in()` can be used to iterate through all refs in a specific prefix with globbing. The logic to handle this is currently hosted by `refs_for_each_glob_ref_in()`, which sets up a callback function that knows to filter out refs that _don't_ match the given globbing pattern. The way we do this is somewhat inefficient though: even though the function is expected to only yield refs in the given prefix, we still end up iterating through _all_ references, regardless of whether or not their name matches the given prefix. Extend `refs_for_each_ref_ext()` so that it can handle patterns and adapt `refs_for_each_glob_ref_in()` to use it. This means we continue to use the same callback-based infrastructure to filter individual refs via the globbing pattern, but we can now also use the other functionality of the `_ext()` variant. Most importantly, this means that we now properly handle the prefix. This results in a performance improvement when using a prefix where a significant majority of refs exists outside of the prefix. The following benchmark is an extreme case, with 1 million refs that exist outside the prefix and a single ref that exists inside it: Benchmark 1: git rev-parse --branches=refs/heads/* (rev = HEAD~) Time (mean ± σ): 115.9 ms ± 0.7 ms [User: 113.0 ms, System: 2.4 ms] Range (min … max): 114.9 ms … 117.8 ms 25 runs Benchmark 2: git rev-parse --branches=refs/heads/* (rev = HEAD) Time (mean ± σ): 1.1 ms ± 0.1 ms [User: 0.3 ms, System: 0.7 ms] Range (min … max): 1.0 ms … 2.3 ms 2092 runs Summary git rev-parse --branches=refs/heads/* (rev = HEAD) ran 107.01 ± 6.49 times faster than git rev-parse --branches=refs/heads/* (rev = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: introduce `refs_for_each_ref_ext`Patrick Steinhardt
In the refs subsystem we have a proliferation of functions that all iterate through references. (Almost) all of these functions internally call `do_for_each_ref()` and provide slightly different arguments so that one can control different aspects of its behaviour. This approach doesn't really scale: every time there is a slightly different use case for iterating through refs we create another new function. This combinatorial explosion doesn't make a lot of sense: it leads to confusing interfaces and heightens the maintenance burden. Refactor the code to become more composable by: - Exposing `do_for_each_ref()` as `refs_for_each_ref_ext()`. - Introducing an options structure that lets the caller control individual options. This gives us a much better foundation to build on going forward. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: rename `each_ref_fn`Patrick Steinhardt
Similar to the preceding commit, rename `each_ref_fn` to better match our current best practices around how we name things. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: rename `do_for_each_ref_flags`Patrick Steinhardt
The enum `do_for_each_ref_flags` and its individual values don't match to our current best practices when it comes to naming things. Rename it to `refs_for_each_flag`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: move `do_for_each_ref_flags` further upPatrick Steinhardt
Move the `do_for_each_ref_flags` enum further up. This prepares for subsequent changes, where the flags will be used by more functions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: move `refs_head_ref_namespaced()`Patrick Steinhardt
The function `refs_head_ref_namespaced()` is somewhat special when compared to most of the other functions that take a callback function: while `refs_for_each_*()` functions yield multiple refs, `refs_heasd_ref_namespaced()` will only yield at most the HEAD ref of the current namespace. As such, the function is related to `refs_head_ref()` and not to the for-each functions. Move the function to be located next to `refs_head_ref()` to clarify. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: remove unused `refs_for_each_include_root_ref()`Patrick Steinhardt
Remove the unused `refs_for_each_include_root_ref()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23pack-check: fix verification of large objectsPatrick Steinhardt
It was reported [1] that git-fsck(1) may sometimes run into an infinite loop when processing packfiles. This bug was bisected to c31bad4f7d (packfile: track packs via the MRU list exclusively, 2025-10-30), which refactored our lsit of packfiles to only be tracked via an MRU list, exclusively. This isn't entirely surprising: any caller that iterates through the list of packfiles and then hits `find_pack_entry()`, for example because they read an object from it, may cause the MRU list to be updated. And if the caller is unlucky, this may cause the mentioned infinite loop. While this mechanism is somewhat fragile, it is still surprising that we encounter it when verifying the packfile. We iterate through objects in a given pack one by one and then read them via their offset, and doing this shouldn't ever end up in `find_pack_entry()`. But there is an edge case here: when the object in question is a blob bigger than "core.largeFileThreshold", then we will be careful to not read it into memory. Instead, we read it via an object stream by calling `odb_read_object_stream()`, and that function will perform an object lookup via `odb_read_object_info()`. So in the case where there are at least two blobs in two different packfiles, and both of these blobs exceed "core.largeFileThreshold", then we'll run into an infinite loop because we'll always update the MRU. We could fix this by improving `repo_for_each_pack()` to not update the MRU, and this would address the issue. But the fun part is that using `odb_read_object_stream()` is the wrong thing to do in the first place: it may open _any_ instance of this object, so we ultimately cannot be sure that we even verified the object in our given packfile. Fix this bug by creating the object stream for the packed object directly via `packfile_read_object_stream()`. Add a test that would have caused the infinite loop. [1]: <20260222183710.2963424-1-sandals@crustytoothpaste.net> Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23packfile: expose function to read object stream for an offsetPatrick Steinhardt
The function `packfile_store_read_object_stream()` takes as input an object ID and then constructs a `struct odb_read_stream` from it. In a subsequent commit we'll want to create an object stream for a given combination of packfile and offset though, which is not something that can currently be done. Extract a new function `packfile_read_object_stream()` that makes this functionality available. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23object-file: adapt `stream_object_signature()` to take a streamPatrick Steinhardt
The function `stream_object_signature()` is responsible for verifying whether the given object ID matches the actual hash of the object's contents. In contrast to `check_object_signature()` it does so in a streaming fashion so that we don't have to load the full object into memory. In a subsequent commit we'll want to adapt one of its callsites to pass a preconstructed stream. Prepare for this by accepting a stream as input that the caller needs to assemble. While at it, improve the error reporting in `parse_object_with_flags()` to tell apart the two failure modes. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23t/helper: improve "genrandom" test helperPatrick Steinhardt
The `test-tool genrandom` test helper can be used to generate random data, either as an infinite stream or with a specified number of bytes. The way we handle parsing the number of bytes is lacking though: - We don't have good error handling, so if the caller for example uses `test-tool genrandom 200xyz` then we'll end up generating 200 bytes of random data successfully. - Many callers want to generate e.g. 1 kilobyte or megabyte of data, but they have to either use unwieldy numbers like 1048576, or they have to precompute them. Fix both of these issues by using `git_parse_ulong()` to parse the argument. This function has better error handling, and it knows to handle unit suffixes. Adapt a couple of our tests to use suffixes instead of manual computations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-22object-file.c: avoid container_of() of a NULL containerJunio C Hamano
Even though the "struct odb_transaction" member is at the beginning of the containing "struct odb_transaction_files", i.e., at offset 0, using container_of() to add offset 0 to a NULL pointer gets flagged as a bad behaviour under SANITIZE=undefined. Use container_of_or_null() to work around this issue. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21pack-redundant: fix memory leak when open_pack_index() failsSahitya Chandra
In add_pack(), we allocate l.remaining_objects with llist_init() before calling open_pack_index(). If open_pack_index() fails we return NULL without freeing the allocated list, leaking the memory. Fix by calling llist_free(l.remaining_objects) on the error path before returning. Signed-off-by: Sahitya Chandra <sahityajb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21path: factor out skip_slashes() in normalize_path_copy_len()Pushkar Singh
Extract skip_slashes() to avoid repeating the same is_dir_sep() loop in multiple places inside normalize_path_copy_len(). Keep the dot-component handling inline to preserve the original control flow and readability, as suggested in review. No functional changes. Behavior verified with t0060-path-utils.sh. Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21t2004: use test_path_is_file instead of test -fLambert Duclos-de Guise
Replace 'test -f' with the helper function 'test_path_is_file' to provide better error messages upon failure. Signed-off-by: Lambert Duclos-de Guise <lambertddg@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21replay: prevent the_repository from coming backElijah Newren
Due to the use of DEFAULT_ABBREV, we cannot get rid of our usage of USE_THE_REPOSITORY_VARIABLE. We have removed all other uses of the_repository before, but without removing that definition, they keep coming back. Define the_repository to make it a compilation error so that they don't come back any more; the repo parameter plumbed through the various functions can be used instead. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21merge-ort: prevent the_repository from coming backElijah Newren
Due to the use of DEFAULT_ABBREV, we cannot get rid of our usage of USE_THE_REPOSITORY_VARIABLE. However, we have removed all other uses of the_repository in merge-ort a few times. But they keep coming back. Define the_repository to make it a compilation error so that they don't come back any more. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21merge-ort: replace the_hash_algo with opt->repo->hash_algoElijah Newren
We have a perfectly valid repository available and do not need to use the_hash_algo (a shorthand for the_repository->hash_algo), so use the known repository instead. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21merge-ort: replace the_repository with opt->repoElijah Newren
We have a perfectly valid repository available and do not need to use the_repository. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21merge-ort: pass repository to write_tree()Elijah Newren
In order to get rid of a usage of the_repository, we need to know the value of opt->repo; pass it along to write_tree(). Once we have the repository, though, we no longer need to pass opt->repo->hash_algo->rawsz, we can have write_tree() look up that value itself. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>