aboutsummaryrefslogtreecommitdiff
path: root/odb.c
AgeCommit message (Collapse)Author
2026-03-31odb: rename `odb_has_object()` flagsPatrick Steinhardt
Rename `odb_has_object()` flags to be properly prefixed with the function name. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-31odb: use enum for `odb_write_object` flagsPatrick Steinhardt
We've got a couple of functions that accept `odb_write_object()` flags, but all of them accept the flags as an `unsigned` integer. In fact, we don't even have an `enum` for the flags field. Introduce this `enum` and adapt functions accordingly according to our coding style. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-31treewide: use enum for `odb_for_each_object()` flagsPatrick Steinhardt
We've got a couple of callsites where we pass `odb_for_each_object()` flags, but accept an `unsigned` flags field instead of the corresponding enum. Adapt these to accept the enum type instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-20odb: introduce generic `odb_find_abbrev_len()`Patrick Steinhardt
Introduce a new generic `odb_find_abbrev_len()` function as well as source-specific callback functions. This makes the logic to compute the required prefix length to make a given object unique fully pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-20odb: introduce `struct odb_for_each_object_options`Patrick Steinhardt
The `odb_for_each_object()` function only accepts a bitset of flags. In a subsequent commit we'll want to change object iteration to also support iterating over only those objects that have a specific prefix. While we could of course add the prefix to the function signature, or alternatively introduce a new function, both of these options don't really seem to be that sensible. Instead, introduce a new `struct odb_for_each_object_options` that can be passed to a new `odb_for_each_object_ext()` function. Splice through the options structure into the respective object database sources. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-12odb: introduce generic object countingPatrick Steinhardt
Similar to the preceding commit, introduce counting of objects on the object database level, replacing the logic that we have in `repo_approximate_object_count()`. Note that the function knows to cache the object count. It's unclear whether this cache is really required as we shouldn't have that many cases where we count objects repeatedly. But to be on the safe side the caching mechanism is retained, with the only excepting being that we also have to use the passed flags as caching key. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `write_alternate()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `read_alternates()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `write_object_stream()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `write_object()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `freshen_object()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `for_each_object()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `read_object_info()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Note that this function is a bit less straight-forward to convert compared to the other functions. The reason here is that the logic to read an object is: 1. We try to read the object. If it exists we return it. 2. If the object does not exist we reprepare the object database source. 3. We then try reading the object info a second time in case the reprepare caused it to appear. The second read is only supposed to happen for the packfile store though, as reading loose objects is not impacted by repreparing the object database. Ideally, we'd just move this whole logic into the ODB source. But that's not easily possible because we try to avoid the reprepare unless really required, which is after we have found out that no other ODB source contains the object, either. So the logic spans across multiple ODB sources, and consequently we cannot move it into an individual source. Instead, introduce a new flag `OBJECT_INFO_SECOND_READ` that tells the backend that we already tried to look up the object once, and that this time around the ODB source should try to find any new objects that may have surfaced due to an on-disk change. With this flag, the "files" backend can trivially skip trying to re-read the object as a loose object. Furthermore, as we know that we only try the second read via the packfile store, we can skip repreparing loose objects and only reprepare the packfile store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `close()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb/source: make `reprepare()` function pluggablePatrick Steinhardt
Introduce a new callback function in `struct odb_source` to make the function pluggable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb: move reparenting logic into respective subsystemsPatrick Steinhardt
The primary object database source may be initialized with a relative path. When the process changes its current working directory we thus have to update this path and have it point to the same path, but relative to the new working directory. This logic is handled in the object database layer. It consists of three steps: 1. We undo any potential temporary object directory, which are used for transactions. This is done so that we don't end up modifying the temporary object database source that got applied for the transaction. 2. We then iterate through the non-transactional sources and reparent their respective paths. 3. We reapply the temporary object directory, but update its path. All of this logic is heavily tied to how the object database source handles paths in the first place. It's an internal implementation detail, and as sources may not even use an on-disk path at all it is not a mechanism that applies to all potential sources. Refactor the code so that the logic to reparent the sources is hosted by the "files" source and the temporary object directory subsystems, respectively. This logic is easier to reason about, but it also ensures that this logic is handled at the correct level. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb: embed base source in the "files" backendPatrick Steinhardt
The "files" backend is implemented as a pointer in the `struct odb_source`. This contradicts our typical pattern for pluggable backends like we use it for example in the ref store or for object database streams, where we typically embed the generic base structure in the specialized implementation. This pattern has a couple of small benefits: - We avoid an extra allocation. - We hide implementation details in the generic structure. - We can easily downcast from a generic backend to the specialized structure and vice versa because the offsets are known at compile time. - It becomes trivial to identify locations where we depend on backend specific logic because the cast needs to be explicit. Refactor our "files" object database source to do the same and embed the `struct odb_source` in the `struct odb_source_files`. There are still a bunch of sites in our code base where we do have to access internals of the "files" backend. The intent is that those will go away over time, but this will certainly take a while. Meanwhile, provide a `odb_source_files_downcast()` function that can convert a generic source into a "files" source. As we only have a single source the downcast succeeds unconditionally for now. Eventually though the intent is to make the cast `BUG()` in case the caller requests to downcast a non-"files" backend to a "files" backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb: introduce "files" sourcePatrick Steinhardt
Introduce a new "files" object database source. This source encapsulates access to both loose object files and the packfile store, similar to how the "files" backend for refs encapsulates access to loose refs and the packed-refs file. Note that for now the "files" source is still a direct member of a `struct odb_source`. This architecture will be reversed in the next commit so that the files source contains a `struct odb_source`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-05odb: split `struct odb_source` into separate headerPatrick Steinhardt
Subsequent commits will expand the `struct odb_source` to become a generic interface for accessing an object database source. As part of these refactorings we'll add a set of function pointers that will significantly expand the structure overall. Prepare for this by splitting out the `struct odb_source` into a separate header. This keeps the high-level object database interface detached from the low-level object database sources. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23Merge branch 'ps/object-info-bits-cleanup' into ps/odb-sourcesJunio C Hamano
* ps/object-info-bits-cleanup: odb: convert `odb_has_object()` flags into an enum odb: convert object info flags into an enum odb: drop gaps in object info flag values builtin/fsck: fix flags passed to `odb_has_object()` builtin/backfill: fix flags passed to `odb_has_object()`
2026-02-23Merge branch 'ps/odb-for-each-object' into ps/odb-sourcesJunio C Hamano
* ps/odb-for-each-object: odb: drop unused `for_each_{loose,packed}_object()` functions reachable: convert to use `odb_for_each_object()` builtin/pack-objects: use `packfile_store_for_each_object()` odb: introduce mtime fields for object info requests treewide: drop uses of `for_each_{loose,packed}_object()` treewide: enumerate promisor objects via `odb_for_each_object()` builtin/fsck: refactor to use `odb_for_each_object()` odb: introduce `odb_for_each_object()` packfile: introduce function to iterate through objects packfile: extract function to iterate through objects of a store object-file: introduce function to iterate through objects object-file: extract function to read object info from path odb: fix flags parameter to be unsigned odb: rename `FOR_EACH_OBJECT_*` flags
2026-02-12odb: convert `odb_has_object()` flags into an enumPatrick Steinhardt
Following the reason in the preceding commit, convert the `odb_has_object()` flags into an enum. With this change, we would have catched the misuse of `odb_has_object()` that was fixed in a preceding commit as the compiler would have generated a warning: ../builtin/backfill.c:71:9: error: implicit conversion from enumeration type 'enum odb_object_info_flag' to different enumeration type 'enum odb_has_object_flag' [-Werror,-Wenum-conversion] 70 | if (!odb_has_object(ctx->repo->objects, &list->oid[i], | ~~~~~~~~~~~~~~ 71 | OBJECT_INFO_FOR_PREFETCH)) | ^~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-12odb: convert object info flags into an enumPatrick Steinhardt
Convert the object info flags into an enum and adapt all functions that receive these flags as parameters to use the enum instead of an integer. This serves two purposes: - The function signatures become more self-documenting, as callers don't have to wonder which flags they expect. - The compiler can warn when a wrong flag type is passed. Note that the second benefit is somewhat limited. For example, when or-ing multiple enum flags together the result will be an integer, and the compiler will not warn about such use cases. But where it does help is when a single flag of the wrong type is passed, as the compiler would generate a warning in that case. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-02odb: transparently handle common transaction behaviorJustin Tobler
A new ODB transaction is created and returned via `odb_transaction_begin()` and stored in the ODB. Only a single transaction may be pending at a time. If the ODB already has a transaction, the function is expected to return NULL. Similarly, when committing a transaction via `odb_transaction_commit()` the transaction being committed must match the pending transaction and upon commit reset the ODB transaction to NULL. These behaviors apply regardless of the ODB transaction implementation. Move the corresponding logic into `odb_transaction_{begin,commit}()` accordingly. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-02odb: prepare `struct odb_transaction` to become genericJustin Tobler
An ODB transaction handles how objects are stored temporarily and eventually committed. Due to object storage being implemented differently for a given ODB source, the ODB transactions must be implemented in a manner specific to the source the objects are being written to. To provide generic transactions, `struct odb_transaction` is updated to store a commit callback that can be configured to support a specific ODB source. For now `struct odb_transaction_files` is the only transaction type and what is always returned when starting a transaction. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-02object-file: rename transaction functionsJustin Tobler
In a subsequent commit, ODB transactions are made more generic to facilitate each ODB source providing its own transaction handling. Rename `object_file_transaction_{begin,commit}()` to `odb_transaction_files_{begin,commit}()` to better match the future source specific transaction implementation. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-26odb: introduce mtime fields for object info requestsPatrick Steinhardt
There are some use cases where we need to figure out the mtime for objects. Most importantly, this is the case when we want to prune unreachable objects. But getting at that data requires users to manually derive the info either via the loose object's mtime, the packfiles' mtime or via the ".mtimes" file. Introduce a new `struct object_info::mtimep` pointer that allows callers to request an object's mtime. This new field will be used in a subsequent commit. Note that the concept of "mtime" is ambiguous: given an object, it may be stored multiple times in the object database, and each of these instances may have a different mtime. Disambiguating these mtimes is nothing that can happen on the generic ODB layer: the caller may search for the oldest object, the newest object, or even the relation of object mtimes depending on the specific source they are located in. As such, it is the responsibility of the caller to disambiguate mtimes. A consequence of this is that it's most likely incorrect to look up the mtime via `odb_read_object_info()`, as this interface does not give us enough information to disambiguate the mtime. Document this accordingly and tell users to use `odb_for_each_object()` instead. Even with this gotcha though it's sensible to have this request as part of the object info, as the mtime is a property of the object storage format. If we for example had a "black-box" storage backend, we'd still need to be able to query it for the mtime info in a generic way. We could introduce a safety mechanism that for example calls `BUG()` in case we look up the mtime outside of `odb_for_each_object()`. But that feels somewhat heavy-handed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-26odb: introduce `odb_for_each_object()`Patrick Steinhardt
Introduce a new function `odb_for_each_object()` that knows to iterate through all objects part of a given object database. This function is essentially a simple wrapper around the object database sources. Subsequent commits will adapt callers to use this new function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-21Merge branch 'ps/packfile-store-in-odb-source'Junio C Hamano
The packfile_store data structure is moved from object store to odb source. * ps/packfile-store-in-odb-source: packfile: move MIDX into packfile store packfile: refactor `find_pack_entry()` to work on the packfile store packfile: inline `find_kept_pack_entry()` packfile: only prepare owning store in `packfile_store_prepare()` packfile: only prepare owning store in `packfile_store_get_packs()` packfile: move packfile store into object source packfile: refactor misleading code when unusing pack windows packfile: refactor kept-pack cache to work with packfile stores packfile: pass source to `prepare_pack()` packfile: create store via its owning source
2026-01-15Merge branch 'ps/odb-misc-fixes'Junio C Hamano
Miscellaneous fixes on object database layer. * ps/odb-misc-fixes: odb: properly close sources before freeing them builtin/gc: fix condition for whether to write commit graphs
2026-01-15Merge branch 'ps/packfile-store-in-odb-source' into ps/odb-for-each-objectJunio C Hamano
* ps/packfile-store-in-odb-source: packfile: move MIDX into packfile store packfile: refactor `find_pack_entry()` to work on the packfile store packfile: inline `find_kept_pack_entry()` packfile: only prepare owning store in `packfile_store_prepare()` packfile: only prepare owning store in `packfile_store_get_packs()` packfile: move packfile store into object source packfile: refactor misleading code when unusing pack windows packfile: refactor kept-pack cache to work with packfile stores packfile: pass source to `prepare_pack()` packfile: create store via its owning source odb: properly close sources before freeing them builtin/gc: fix condition for whether to write commit graphs
2026-01-09packfile: move MIDX into packfile storePatrick Steinhardt
The multi-pack index still is tracked as a member of the object database source, but ultimately the MIDX is always tied to one specific packfile store. Move the structure into `struct packfile_store` accordingly. This ensures that the packfile store now keeps track of all data related to packfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09packfile: move packfile store into object sourcePatrick Steinhardt
The packfile store is a member of `struct object_database`, which means that we have a single store per database. This doesn't really make much sense though: each source connected to the database has its own set of packfiles, so there is a conceptual mismatch here. This hasn't really caused much of a problem in the past, but with the advent of pluggable object databases this is becoming more of a problem because some of the sources may not even use packfiles in the first place. Move the packfile store down by one level from the object database into the object database source. This ensures that each source now has its own packfile store, and we can eventually start to abstract it away entirely so that the caller doesn't even know what kind of store it uses. Note that we only need to adjust a relatively small number of callers, way less than one might expect. This is because most callers are using `repo_for_each_pack()`, which handles enumeration of all packfiles that exist in the repository. So for now, none of these callers need to be adapted. The remaining callers that iterate through the packfiles directly and that need adjustment are those that are a bit more tangled with packfiles. These will be adjusted over time. Note that this patch only moves the packfile store, and there is still a bunch of functions that seemingly operate on a packfile store but that end up iterating over all sources. These will be adjusted in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09packfile: create store via its owning sourcePatrick Steinhardt
In subsequent patches we're about to move the packfile store from the object database layer into the object database source layer. Once done, we'll have one packfile store per source, where the source is owning the store. Prepare for this future and refactor `packfile_store_new()` to be initialized via an object database source instead of via the object database itself. This refactoring leads to a weird in-between state where the store is owned by the object database but created via the source. But this makes subsequent refactorings easier because we can now start to access the owning source of a given store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07Merge branch 'ps/odb-misc-fixes' into ps/packfile-store-in-odb-sourceJunio C Hamano
* ps/odb-misc-fixes: odb: properly close sources before freeing them builtin/gc: fix condition for whether to write commit graphs
2026-01-07odb: properly close sources before freeing themPatrick Steinhardt
It is possible to hit a memory leak when reading data from a submodule via git-grep(1): Direct leak of 192 byte(s) in 1 object(s) allocated from: #0 0x55555562e726 in calloc (git+0xda726) #1 0x555555964734 in xcalloc ../wrapper.c:154:8 #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2 #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6 #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17 #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3 #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2 #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2 #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4 #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8 #10 0x5555558540bd in odb_read_object ../odb.c:920:6 #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12 #12 0x55555580a13a in grep_source_load ../grep.c:1986:10 #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7 #14 0x555555807574 in grep_source_1 ../grep.c:1625:8 #15 0x555555807322 in grep_source ../grep.c:1837:10 #16 0x5555556a5c58 in run ../builtin/grep.c:208:10 #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) The root caues of this leak is the way we set up and release the submodule: 1. We use `repo_submodule_init()` to initialize a new repository. This repository is stored in `repos_to_free`. 2. We now read data from the submodule repository. 3. We then call `repo_clear()` on the submodule repositories. 4. `repo_clear()` calls `odb_free()`. 5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`. The issue here is the 5th step: we call `odb_free_sources()` _before_ we call `odb_close()`. But `odb_free_sources()` already frees all sources, so the logic that closes them in `odb_close()` now becomes a no-op. As a consequence, we never explicitly close sources at all. Fix the leak by closing the store before we free the sources. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30Merge branch 'jc/object-read-stream-fix'Junio C Hamano
Fix a performance regression in recently graduated topic. * jc/object-read-stream-fix: odb: do not use "blank" substitute for NULL
2025-12-22Merge branch 'ps/odb-alternates-object-sources'Junio C Hamano
Code refactoring around alternate object store. * ps/odb-alternates-object-sources: odb: write alternates via sources odb: read alternates via sources odb: drop forward declaration of `read_info_alternates()` odb: remove mutual recursion when parsing alternates odb: stop splitting alternate in `odb_add_to_alternates_file()` odb: move computation of normalized objdir into `alt_odb_usable()` odb: resolve relative alternative paths when parsing odb: refactor parsing of alternates to be self-contained
2025-12-18odb: do not use "blank" substitute for NULLJunio C Hamano
When various *object_info() functions are given an extended object info structure as NULL by a caller that does not want any details, the code uses a file-scope static blank_oi and passes it down to the helper functions they use, to avoid handling NULL specifically. The ps/object-read-stream topic graduated to 'master' recently however had a bug that assumed that two identically named file-scope static variables in two functions are the same, which of course is not the case. This made "git commit" take 0.38 seconds to 1508 seconds in some case, as reported by Aaron Plattner here: https://lore.kernel.org/git/f4ba7e89-4717-4b36-921f-56537131fd69@nvidia.com/ We _could_ move the blank_oi variable to the global scope in common section to fix this regression, but explicitly handling the NULL is a much safer fix. It would also reduce the chance of errors that somebody accidentally writes into blank_oi, making its contents dirty, which potentially will make subsequent calls into the function misbehave. By explicitly handling NULL input, we no longer have to worry about it. Reported-by: Aaron Plattner <aplattner@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-16Merge branch 'ps/object-read-stream'Junio C Hamano
The "git_istream" abstraction has been revamped to make it easier to interface with pluggable object database design. * ps/object-read-stream: streaming: drop redundant type and size pointers streaming: move into object database subsystem streaming: refactor interface to be object-database-centric streaming: move logic to read packed objects streams into backend streaming: move logic to read loose objects streams into backend streaming: make the `odb_read_stream` definition public streaming: get rid of `the_repository` streaming: rely on object sources to create object stream packfile: introduce function to read object info from a store streaming: move zlib stream into backends streaming: create structure for filtered object streams streaming: create structure for packed object streams streaming: create structure for loose object streams streaming: create structure for in-core object streams streaming: allocate stream inside the backend-specific logic streaming: explicitly pass packfile info when streaming a packed object streaming: propagate final object type via the stream streaming: drop the `open()` callback function streaming: rename `git_istream` into `odb_read_stream`
2025-12-15Merge branch 'ps/object-read-stream' into ps/packfile-store-in-odb-sourceJunio C Hamano
* ps/object-read-stream: streaming: drop redundant type and size pointers streaming: move into object database subsystem streaming: refactor interface to be object-database-centric streaming: move logic to read packed objects streams into backend streaming: move logic to read loose objects streams into backend streaming: make the `odb_read_stream` definition public streaming: get rid of `the_repository` streaming: rely on object sources to create object stream packfile: introduce function to read object info from a store streaming: move zlib stream into backends streaming: create structure for filtered object streams streaming: create structure for packed object streams streaming: create structure for loose object streams streaming: create structure for in-core object streams streaming: allocate stream inside the backend-specific logic streaming: explicitly pass packfile info when streaming a packed object streaming: propagate final object type via the stream streaming: drop the `open()` callback function streaming: rename `git_istream` into `odb_read_stream`
2025-12-11odb: write alternates via sourcesPatrick Steinhardt
Refactor writing of alternates so that the actual business logic is structured around the object database source we want to write the alternate to. Same as with the preceding commit, this will eventually allow us to have different logic for writing alternates depending on the backend used. Note that after the refactoring we start to call `odb_add_alternate_recursively()` unconditionally. This is fine though as we know to skip adding sources that are tracked already. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: read alternates via sourcesPatrick Steinhardt
Adapt how we read alternates so that the interface is structured around the object database source we're reading from. This will eventually allow us to abstract away this behaviour with pluggable object databases so that every format can have its own mechanism for listing alternates. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: drop forward declaration of `read_info_alternates()`Patrick Steinhardt
Now that we have removed the mutual recursion in the preceding commit it is not necessary anymore to have a forward declaration of the `read_info_alternates()` function. Move the function and its dependencies further up so that we can remove it. Note that this commit also removes the function documentation of `read_info_alternates()`. It's unclear what it's documenting, but it for sure isn't documenting the modern behaviour of the function anymore. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: remove mutual recursion when parsing alternatesPatrick Steinhardt
When adding an alternative object database source we not only have to consider the added source itself, but we also have to add _its_ sources to our database. We implement this via mutual recursion: 1. We first call `link_alt_odb_entries()`. 2. `link_alt_odb_entries()` calls `parse_alternates()`. 3. We then add each alternate via `odb_add_alternate_recursively()`. 4. `odb_add_alternate_recursively()` calls `link_alt_odb_entries()` again. This flow is somewhat hard to follow, but more importantly it means that parsing of alternates is somewhat tied to the recursive behaviour. Refactor the function to remove the mutual recursion between adding sources and parsing alternates. The parsing step thus becomes completely oblivious to the fact that there is recursive behaviour going on at all. The recursion is handled by `odb_add_alternate_recursively()` instead, which now recurses with itself. This refactoring allows us to move parsing of alternates into object database sources in a subsequent step. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: stop splitting alternate in `odb_add_to_alternates_file()`Patrick Steinhardt
When calling `odb_add_to_alternates_file()` we know to add the newly added source to the object database in case we have already loaded alternates. This is done so that we can make its objects accessible immediately without having to fully reload all alternates. The way we do this though is to call `link_alt_odb_entries()`, which adds _multiple_ sources to the object database source in case we have newline-separated entries. This behaviour is not documented in the function documentation of `odb_add_to_alternates_file()`, and all callers only ever pass a single directory to it. It's thus entirely surprising and a conceptual mismatch. Fix this issue by directly calling `odb_add_alternate_recursively()` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: move computation of normalized objdir into `alt_odb_usable()`Patrick Steinhardt
The function `alt_odb_usable()` receives as input the object database, the path it's supposed to determine usability for as well as the normalized path of the main object directory of the repository. The last part is derived by the function's caller from the object database. As we already pass the object database to `alt_odb_usable()` it is redundant information. Drop the extra parameter and compute the normalized object directory in the function itself. While at it, rename the function to `odb_is_source_usable()` to align it with modern terminology. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: resolve relative alternative paths when parsingPatrick Steinhardt
Parsing alternates and resolving potential relative paths is currently handled in two separate steps. This has the effect that the logic to retrieve alternates is not entirely self-contained. We want it to be just that though so that we can eventually move the logic to list alternates into the `struct odb_source`. Move the logic to resolve relative alternative paths into `parse_alternates()`. Besides bringing us a step closer towards the above goal, it also neatly separates concerns of generating the list of alternatives and linking them into the object database. Note that we ignore any errors when the relative path cannot be resolved. This isn't really a change in behaviour though: if the path cannot be resolved to a directory then `alt_odb_usable()` still knows to bail out. While at it, rename the function to `odb_add_alternate_recursively()` to more clearly indicate what its intent is and to align it with modern terminology. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-11odb: refactor parsing of alternates to be self-containedPatrick Steinhardt
Parsing of the alternates file and environment variable is currently split up across multiple different functions and is entangled with `link_alt_odb_entries()`, which is responsible for linking the parsed object database sources. This results in two downsides: - We have mutual recursion between parsing alternates and linking them into the object database. This is because we also parse alternates that the newly added sources may have. - We mix up the actual logic to parse the data and to link them into place. Refactor the logic so that parsing of the alternates file is entirely self-contained. Note that this doesn't yet fix the above two issues, but it is a necessary step to get there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: handle recreation of quarantine directoriesPatrick Steinhardt
In the preceding commit we have moved the logic that reparents object database sources on chdir(3p) from "setup.c" into "odb.c". Let's also do the same for any temporary quarantine directories so that the complete reparenting logic is self-contained in "odb.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>