git/commit-graph.c, branch main

Merge branch 'ps/commit-graph-overflow-fix'

2026-04-07T21:59:25Z

Fix a regression in writing the commit-graph where commits with dates exceeding 34 bits (beyond year 2514) could cause an underflow and crash Git while writing the generation data overflow chunk.

* ps/commit-graph-overflow-fix:
  commit-graph: fix writing generations with dates exceeding 34 bits

commit-graph: fix writing generations with dates exceeding 34 bits

2026-03-24T15:39:37Z

The `timestamp_t` type is declared as `uintmax_t` and thus typically has 64 bits of precision. Usually, the full precision of such dates is not required: it would be comforting to know that Git is still around in millions of years, but all in all the chance is rather low.

We abuse this fact in the commit-graph: instead of storing the full 64 bits of precision, committer dates only store 34 bits. This is still plenty of headroom, as it means that we can represent dates until year 2514. Commits which are dated beyond that year will simply get a date whose higher bits are masked off.

The result of this is somewhat curious: the committer date will be different depending on whether a commit gets parsed via the commit-graph or via the object database. This isn't really too much of an issue in general though, as we don't typically use the date parsed from the commit-graph in user-facing output.

But with 024b4c9697 (commit: make `repo_parse_commit_no_graph()` more robust, 2026-02-16) it started to become a problem when writing the commit-graph itself. This commit changed `repo_parse_commit_no_graph()` so that we re-parse the commit via the object database in case it was already parsed beforehand via the commit-graph.

The consequence is that we may now operate with two different commit dates at different stages:

  - Initially, we use the 34-bit precision timestamp when writing the generation data chunk. We thus correctly compute the offsets relative to the on-disk timestamp here.

  - Later, when writing the overflow data, we may end up with the full-precision timestamp. When the date is larger than 34 bits the result of this is an underflow when computing the offset.

This causes a mismatch in the number of generation data overflow records we want to write, and that ultimately causes Git to die.

Introduce a new helper function that computes the generation offset for a commit while correctly masking the date to 34 bits. This makes the previously-implicit assumptions about the commit date precision explicit and thus hopefully less fragile going forward. Adapt sites that compute the offset to use the function.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
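
The underflow described above can be illustrated with a minimal sketch. It is not the helper from the patch: the names `SKETCH_DATE_MASK`, `sketch_masked_date()`, `sketch_generation_offset()` and `sketch_naive_offset()` are hypothetical, and the sketch only assumes that the generation value was computed relative to the 34-bit on-disk date.

```c
#include <stdint.h>

/* Mirrors the description above: timestamp_t is declared as uintmax_t. */
typedef uintmax_t timestamp_t;

/* Hypothetical name: keeps the low 34 bits that the commit-graph stores. */
#define SKETCH_DATE_MASK ((((timestamp_t)1) << 34) - 1)

/* The date as it would be read back from the commit-graph. */
static timestamp_t sketch_masked_date(timestamp_t date)
{
	return date & SKETCH_DATE_MASK;
}

/*
 * Hypothetical helper: compute the offset against the masked date, no
 * matter whether the caller holds the 34-bit or the full-precision date.
 */
static timestamp_t sketch_generation_offset(timestamp_t generation,
					    timestamp_t committer_date)
{
	return generation - sketch_masked_date(committer_date);
}

/*
 * The fragile variant: with a full-precision date larger than the
 * generation value, the unsigned subtraction wraps around.
 */
static timestamp_t sketch_naive_offset(timestamp_t generation,
				       timestamp_t committer_date)
{
	return generation - committer_date;
}
```

With a committer date of 2^34 + 100 and a generation value of 150, the masked variant yields an offset of 50, while the naive variant wraps around to a huge value that would be mistaken for an overflow record.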

odb: introduce `struct odb_for_each_object_options`

2026-03-20T20:16:41Z

The `odb_for_each_object()` function only accepts a bitset of flags. In a subsequent commit we'll want to change object iteration to also support iterating over only those objects that have a specific prefix.

While we could of course add the prefix to the function signature, or alternatively introduce a new function, neither of these options really seems sensible.

Instead, introduce a new `struct odb_for_each_object_options` that can be passed to a new `odb_for_each_object_ext()` function. Pass the options structure through to the respective object database sources.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
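
The options-struct pattern can be sketched as follows. This is an illustration under assumptions, not the actual Git API: every `sketch_*` name is hypothetical, object IDs are plain hex strings, and the `prefix` field stands in for the filter that the message says a subsequent commit will add.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct: new filters become new fields. */
struct sketch_for_each_options {
	unsigned flags;
	const char *prefix; /* NULL means "no prefix filter" */
};

typedef int (*sketch_each_fn)(const char *oid_hex, void *data);

/* Extended variant: takes the options struct instead of bare flags. */
static int sketch_for_each_ext(const char *const *oids, size_t nr,
			       const struct sketch_for_each_options *opts,
			       sketch_each_fn fn, void *data)
{
	for (size_t i = 0; i < nr; i++) {
		int ret;
		if (opts->prefix &&
		    strncmp(oids[i], opts->prefix, strlen(opts->prefix)))
			continue; /* does not match the requested prefix */
		ret = fn(oids[i], data);
		if (ret)
			return ret; /* a non-zero return aborts the iteration */
	}
	return 0;
}

/* The flags-only entry point stays a thin wrapper around the new one. */
static int sketch_for_each(const char *const *oids, size_t nr,
			   unsigned flags, sketch_each_fn fn, void *data)
{
	struct sketch_for_each_options opts = { .flags = flags };
	return sketch_for_each_ext(oids, nr, &opts, fn, data);
}

static int sketch_count_cb(const char *oid_hex, void *data)
{
	(void)oid_hex;
	(*(size_t *)data)++;
	return 0;
}

/* Usage example: count the sample IDs that match a prefix. */
static size_t sketch_count_with_prefix(const char *prefix)
{
	static const char *const oids[] = { "abc123", "abd456", "ffe789" };
	struct sketch_for_each_options opts = { .flags = 0, .prefix = prefix };
	size_t count = 0;
	sketch_for_each_ext(oids, 3, &opts, sketch_count_cb, &count);
	return count;
}

/* Usage example: the flags-only wrapper visits every sample ID. */
static size_t sketch_count_all(void)
{
	static const char *const oids[] = { "abc123", "abd456", "ffe789" };
	size_t count = 0;
	sketch_for_each(oids, 3, 0, sketch_count_cb, &count);
	return count;
}
```

The appeal of this shape is that existing callers of the flags-only function remain untouched, while new options only require a new struct member.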

odb: introduce generic object counting

2026-03-12T15:38:43Z

Similar to the preceding commit, introduce counting of objects on the object database level, replacing the logic that we have in `repo_approximate_object_count()`.

Note that the function knows to cache the object count. It's unclear whether this cache is really required as we shouldn't have that many cases where we count objects repeatedly. But to be on the safe side the caching mechanism is retained, with the only exception being that we also have to use the passed flags as part of the caching key.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
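
A cache that uses the flags as its key might look roughly like the sketch below. All names are hypothetical, the "expensive" count is faked with constants, and only a single entry is cached; the actual implementation may differ.

```c
/* Hypothetical stand-in for an object database with a one-entry cache. */
struct sketch_odb {
	int cache_valid;
	unsigned cached_flags;       /* flags the cached count belongs to */
	unsigned long cached_count;
	unsigned computations;       /* how often we had to recount */
};

/* Fake "expensive" count whose result depends on the flags. */
static unsigned long sketch_expensive_count(struct sketch_odb *odb,
					    unsigned flags)
{
	odb->computations++;
	return (flags & 1) ? 10 : 25;
}

/*
 * Return the cached count only if it was computed for the same flags;
 * otherwise recount and remember both the count and the flags.
 */
static unsigned long sketch_count_objects(struct sketch_odb *odb,
					  unsigned flags)
{
	if (!odb->cache_valid || odb->cached_flags != flags) {
		odb->cached_count = sketch_expensive_count(odb, flags);
		odb->cached_flags = flags;
		odb->cache_valid = 1;
	}
	return odb->cached_count;
}

/* Zero-initialized instance for the usage example. */
static struct sketch_odb sketch_db;
```

Without the flags in the key, a count computed for one set of flags could be returned for a request made with different flags.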

odb: embed base source in the "files" backend

2026-03-05T19:45:15Z

The "files" backend is implemented as a pointer in the `struct odb_source`. This contradicts our typical pattern for pluggable backends, as used for example in the ref store or for object database streams, where we typically embed the generic base structure in the specialized implementation. This pattern has a couple of small benefits:

  - We avoid an extra allocation.

  - We hide implementation details in the generic structure.

  - We can easily downcast from a generic backend to the specialized structure and vice versa because the offsets are known at compile time.

  - It becomes trivial to identify locations where we depend on backend specific logic because the cast needs to be explicit.

Refactor our "files" object database source to do the same and embed the `struct odb_source` in the `struct odb_source_files`.

There are still a bunch of sites in our code base where we do have to access internals of the "files" backend. The intent is that those will go away over time, but this will certainly take a while. Meanwhile, provide an `odb_source_files_downcast()` function that can convert a generic source into a "files" source.

As we only have a single source the downcast succeeds unconditionally for now. Eventually though the intent is to make the cast `BUG()` in case the caller requests to downcast a non-"files" backend to a "files" backend.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano
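
The embedding pattern and the explicit downcast can be sketched as below. The struct layouts and all `sketch_*` names are hypothetical and much simpler than the real structures; the `assert()` stands in for the eventual `BUG()` mentioned above, whereas the message states that the current downcast succeeds unconditionally.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_source_type {
	SKETCH_SOURCE_FILES,
	SKETCH_SOURCE_OTHER,
};

/* Generic base structure shared by all backends. */
struct sketch_source {
	enum sketch_source_type type;
	const char *path;
};

/* Specialized backend that embeds the base instead of pointing to it. */
struct sketch_source_files {
	struct sketch_source base;
	int loose_cache_entries; /* backend-specific detail */
};

/*
 * Explicit downcast: the offset of `base` is known at compile time, so
 * the containing structure can be recovered from the base pointer.
 */
static struct sketch_source_files *
sketch_source_files_downcast(struct sketch_source *source)
{
	assert(source->type == SKETCH_SOURCE_FILES);
	return (struct sketch_source_files *)
		((char *)source - offsetof(struct sketch_source_files, base));
}

/* Single allocation holds both the generic and the specialized part. */
static struct sketch_source_files sketch_files_example = {
	.base = { .type = SKETCH_SOURCE_FILES, .path = "objects" },
	.loose_cache_entries = 7,
};
```

Generic code only ever sees `&sketch_files_example.base`; any place that needs backend internals has to go through the downcast, which makes such dependencies easy to find.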

odb: introduce "files" source

2026-03-05T19:45:14Z

Introduce a new "files" object database source. This source encapsulates access to both loose object files and the packfile store, similar to how the "files" backend for refs encapsulates access to loose refs and the packed-refs file.

Note that for now the "files" source is still a direct member of a `struct odb_source`. This architecture will be reversed in the next commit so that the files source contains a `struct odb_source`.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano

Merge branch 'ps/odb-for-each-object' into ps/odb-sources

2026-02-23T21:48:00Z

* ps/odb-for-each-object:
  odb: drop unused `for_each_{loose,packed}_object()` functions
  reachable: convert to use `odb_for_each_object()`
  builtin/pack-objects: use `packfile_store_for_each_object()`
  odb: introduce mtime fields for object info requests
  treewide: drop uses of `for_each_{loose,packed}_object()`
  treewide: enumerate promisor objects via `odb_for_each_object()`
  builtin/fsck: refactor to use `odb_for_each_object()`
  odb: introduce `odb_for_each_object()`
  packfile: introduce function to iterate through objects
  packfile: extract function to iterate through objects of a store
  object-file: introduce function to iterate through objects
  object-file: extract function to read object info from path
  odb: fix flags parameter to be unsigned
  odb: rename `FOR_EACH_OBJECT_*` flags

Merge branch 'ps/commit-list-functions-renamed'

2026-02-13T21:39:25Z

Rename three functions around the commit_list data structure.

* ps/commit-list-functions-renamed:
  commit: rename `free_commit_list()` to conform to coding guidelines
  commit: rename `reverse_commit_list()` to conform to coding guidelines
  commit: rename `copy_commit_list()` to conform to coding guidelines

treewide: drop uses of `for_each_{loose,packed}_object()`

2026-01-26T16:26:07Z

We're using `for_each_loose_object()` and `for_each_packed_object()` at a couple of callsites to enumerate all loose and packed objects, respectively. These functions will be removed in a subsequent commit in favor of the newly introduced `odb_source_loose_for_each_object()` and `packfile_store_for_each_object()` replacements. Prepare for this by refactoring the sites accordingly.

Note that ideally, we'd convert all callsites to use the generic `odb_for_each_object()` function already. But for some callers this is not possible (yet), and it would require some significant refactorings to make this work. Converting these sites will thus be deferred to a later patch series.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano

odb: rename `FOR_EACH_OBJECT_*` flags

2026-01-26T16:26:06Z

Rename the `FOR_EACH_OBJECT_*` flags to have an `ODB_` prefix. This prepares us for a new upcoming `odb_for_each_object()` function and ensures that both the function and its flags have the same prefix.

Signed-off-by: Patrick Steinhardt
Signed-off-by: Junio C Hamano