aboutsummaryrefslogtreecommitdiff
path: root/builtin/fetch.c
AgeCommit message (Collapse)Author
2026-03-31odb: rename `odb_has_object()` flagsPatrick Steinhardt
Rename `odb_has_object()` flags to be properly prefixed with the function name. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-12run-command: wean auto_maintenance() functions off the_repositoryBurak Kaan Karaçay
The prepare_auto_maintenance() relies on the_repository to read configurations. Since run_auto_maintenance() calls prepare_auto_maintenance(), it also implicitly depends the_repository. Add 'struct repository *' as a parameter to both functions and update all callers to pass the_repository. With no global repository dependencies left in this file, remove the USE_THE_REPOSITORY_VARIABLE macro. Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Burak Kaan Karaçay <bkkaracay@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-09Merge branch 'ps/refs-for-each'Junio C Hamano
Code refactoring around refs-for-each-* API functions. * ps/refs-for-each: refs: replace `refs_for_each_fullref_in()` refs: replace `refs_for_each_namespaced_ref()` refs: replace `refs_for_each_glob_ref()` refs: replace `refs_for_each_glob_ref_in()` refs: replace `refs_for_each_rawref_in()` refs: replace `refs_for_each_rawref()` refs: replace `refs_for_each_ref_in()` refs: improve verification for-each-ref options refs: generalize `refs_for_each_fullref_in_prefixes()` refs: generalize `refs_for_each_namespaced_ref()` refs: speed up `refs_for_each_glob_ref_in()` refs: introduce `refs_for_each_ref_ext` refs: rename `each_ref_fn` refs: rename `do_for_each_ref_flags` refs: move `do_for_each_ref_flags` further up refs: move `refs_head_ref_namespaced()` refs: remove unused `refs_for_each_include_root_ref()`
2026-03-04Merge branch 'cx/fetch-display-ubfix'Junio C Hamano
Undefined-behaviour fix in "git fetch". * cx/fetch-display-ubfix: fetch: fix wrong evaluation order in URL trailing-slash trimming
2026-02-25fetch: fix wrong evaluation order in URL trailing-slash trimmingcuiweixie
if i == -1, url[i] will be UB. Signed-off-by: cuiweixie <cuiweixie@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_glob_ref()`Patrick Steinhardt
Replace calls to `refs_for_each_glob_ref()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-17fetch-pack: wire up and enable auto filter logicChristian Couder
Previous commits have set up an infrastructure for `--filter=auto` to automatically prepare a partial clone filter based on what the server advertised and the client accepted. Using that infrastructure, let's now enable the `--filter=auto` option in `git clone` and `git fetch` by setting `allow_auto_filter` to 1. Note that these small changes mean that when `git clone --filter=auto` or `git fetch --filter=auto` are used, "auto" is automatically saved as the partial clone filter for the server on the client. Therefore subsequent calls to `git fetch` on the client will automatically use this "auto" mode even without `--filter=auto`. Let's also set `allow_auto_filter` to 1 in `transport.c`, as the transport layer must be able to accept the "auto" filter spec even if the invoking command hasn't fully parsed it yet. When an "auto" filter is requested, let's have the "fetch-pack.c" code in `do_fetch_pack_v2()` compute a filter and send it to the server. In `do_fetch_pack_v2()` the logic also needs to check for the "promisor-remote" capability and call `promisor_remote_reply()` to parse advertised remotes and populate the list of those accepted (and their filters). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-17fetch: make filter_options local to cmd_fetch()Christian Couder
The `struct list_objects_filter_options filter_options` variable used in "builtin/fetch.c" to store the parsed filters specified by `--filter=<filterspec>` is currently a static variable global to the file. As we are going to use it more in a following commit, it could become a bit less easy to understand how it's managed. To avoid that, let's make it clear that it's owned by cmd_fetch() by moving its definition into that function and making it non-static. This requires passing a pointer to it through the prepare_transport(), do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one() functions, but it's quite straightforward. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-25fetch: delay user information post committing of transactionKarthik Nayak
In Git 2.50 and earlier, we would display failure codes and error message as part of the status display: $ git fetch . v1.0.0:refs/heads/foo error: cannot update ref 'refs/heads/foo': trying to write non-commit object f665776185ad074b236c00751d666da7d1977dbe to branch 'refs/heads/foo' From . ! [new tag] v1.0.0 -> foo (unable to update local ref) With the addition of batched updates, this information is no longer shown to the user: $ git fetch . v1.0.0:refs/heads/foo From . * [new tag] v1.0.0 -> foo error: cannot update ref 'refs/heads/foo': trying to write non-commit object f665776185ad074b236c00751d666da7d1977dbe to branch 'refs/heads/foo' Since reference updates are batched and processed together at the end, information around the outcome is not available during individual reference parsing. To overcome this, collate and delay the output to the end. Introduce `ref_update_display_info` which will hold individual update's information and also whether the update failed or succeeded. This finally allows us to iterate over all such updates and print them to the user. Using an dynamic array and strmap does add some overhead to 'git-fetch(1)', but from benchmarking this seems to be not too bad: Benchmark 1: fetch: many refs (refformat = files, refcount = 1000, revision = master) Time (mean ± σ): 42.6 ms ± 1.2 ms [User: 13.1 ms, System: 29.8 ms] Range (min … max): 40.1 ms … 45.8 ms 47 runs Benchmark 2: fetch: many refs (refformat = files, refcount = 1000, revision = HEAD) Time (mean ± σ): 43.1 ms ± 1.2 ms [User: 12.7 ms, System: 30.7 ms] Range (min … max): 40.5 ms … 45.8 ms 48 runs Summary fetch: many refs (refformat = files, refcount = 1000, revision = master) ran 1.01 ± 0.04 times faster than fetch: many refs (refformat = files, refcount = 1000, revision = HEAD) Another approach would be to move the status printing logic to be handled post the transaction being committed. That however would require adding an iterator to the ref transaction that tracks both the outcome (success/failure) and the original refspec information for each update, which is more involved infrastructure work compared to the strmap approach here. Helped-by: Phillip Wood <phillip.wood123@gmail.com> Reported-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-25fetch: utilize rejected ref error detailsKarthik Nayak
In 0e358de64a (fetch: use batched reference updates, 2025-05-19), git-fetch(1) switched to using batched reference updates. This also introduced a regression wherein instead of providing detailed error messages for failed referenced updates, the users were provided generic error messages based on the error type. Similar to the previous commit, switch to using detailed error messages if present for failed reference updates to fix this regression. Reported-by: Elijah Newren <newren@gmail.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-25refs: add rejection detail to the callback functionKarthik Nayak
The previous commit started storing the rejection details alongside the error code for rejected updates. Pass this along to the callback function `ref_transaction_for_each_rejected_update()`. Currently the field is unused, but will be integrated in the upcoming commits. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-23Merge branch 'kn/fix-fetch-backfill-tag-with-batched-ref-updates'Junio C Hamano
"git fetch" that involves fetching tags, when a tag being fetched needs to overwrite existing one, failed to fetch other tags, which has been corrected. * kn/fix-fetch-backfill-tag-with-batched-ref-updates: fetch: fix failed batched updates skipping operations fetch: fix non-conflicting tags not being committed fetch: extract out reference committing logic
2025-12-10fetch: fix failed batched updates skipping operationsKarthik Nayak
Fix a regression introduced with batched updates in 0e358de64a (fetch: use batched reference updates, 2025-05-19) when fetching references. In the `do_fetch()` function, we jump to cleanup if committing the transaction fails, regardless of whether using batched or atomic updates. This skips three subsequent operations: - Update 'FETCH_HEAD' as part of `commit_fetch_head()`. - Add upstream tracking information via `set_upstream()`. - Setting remote 'HEAD' values when `do_set_head` is true. For atomic updates, this is expected behavior. For batched updates, we want to continue with these operations even if some refs fail to update. Skipping `commit_fetch_head()` isn't actually a regression because 'FETCH_HEAD' is already updated via `append_fetch_head()` when not using '--atomic'. However, we add a test to validate this behavior. Skipping the other two operations (upstream tracking and remote HEAD) is a regression. Fix this by only jumping to cleanup when using '--atomic', allowing batched updates to continue with post-fetch operations. Add tests to prevent future regressions. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10fetch: fix non-conflicting tags not being committedKarthik Nayak
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19) updated the 'git-fetch(1)' command to use batched updates. This batches updates to gain performance improvements. When fetching references, each update is added to the transaction. Finally, when committing, individual updates are allowed to fail with reason, while the transaction itself succeeds. One scenario which was missed here, was fetching tags. When fetching conflicting tags, the `fetch_and_consume_refs()` function returns '1', which skipped committing the transaction and directly jumped to the cleanup section. This mean that no updates were applied. This also extends to backfilling tags which is done when fetching specific refspecs which contains tags in their history. Fix this by committing the transaction when we have an error code and not using an atomic transaction. This ensures other references are applied even when some updates fail. The cleanup section is reached with `retcode` set in several scenarios: - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set `retcode` before the transaction is created, so no commit is attempted. - `fetch_and_consume_refs()` and `backfill_tags()` are the primary cases this fix targets, both setting a positive `retcode` to trigger the committing of the transaction. This simplifies error handling and ensures future modifications to `do_fetch()` don't need special handling for batched updates. Add tests to check for this regression. While here, add a missing cleanup from previous test. Reported-by: David Bohman <debohman@gmail.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-30Merge branch 'ja/doc-synopsis-style'Junio C Hamano
Doc mark-up updates. * ja/doc-synopsis-style: doc: pull-fetch-param typofix doc: convert git push to synopsis style doc: convert git pull to synopsis style doc: convert git fetch to synopsis style
2025-11-21fetch: extract out reference committing logicKarthik Nayak
The `do_fetch()` function contains the core of the `git-fetch(1)` logic. Part of this is to fetch and store references. This is done by 1. Creating a reference transaction (non-atomic mode uses batched updates). 2. Adding individual reference updates to the transaction. 3. Committing the transaction. 4. When using batched updates, handling the rejected updates. The following commit, will fix a bug wherein fetching tags with conflicts was causing other reference updates to fail. Fixing this requires utilizing this logic in different regions of the function. In preparation of the follow up commit, extract the committing and rejection handling logic into a separate function called `commit_ref_transaction()`. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19doc: convert git fetch to synopsis styleJean-Noël Avila
- Switch the synopsis to a synopsis block which will automatically format placeholders in italics and keywords in monospace - Use _<placeholder>_ instead of <placeholder> in the description - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: catch conflicts on case-insensitive file-systemsKarthik Nayak
During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <joe.drew@indexexchange.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_int()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_int()`. All callsites are adjusted so that they use `repo_config_get_int(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16Merge branch 'ph/fetch-prune-optim'Junio C Hamano
"git fetch --prune" used to be O(n^2) expensive when there are many refs, which has been corrected. * ph/fetch-prune-optim: clean up interface for refs_warn_dangling_symrefs refs: remove old refs_warn_dangling_symref fetch-prune: optimize dangling-ref reporting
2025-07-15Merge branch 'ps/object-store'Junio C Hamano
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-01clean up interface for refs_warn_dangling_symrefsPhil Hord
The refs_warn_dangling_symrefs interface is a bit fragile as it passes in printf-formatting strings with expectations about the number of arguments. This patch series made it worse by adding a 2nd positional argument. But there are only two call sites, and they both use almost identical display options. Make this safer by moving the format strings into the function that uses them to make it easier to see when the arguments don't match. Pass a prefix string and a dry_run flag so the decision logic can be handled where needed. Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01fetch-prune: optimize dangling-ref reportingPhil Hord
When pruning during `git fetch` we check each pruned ref against the ref_store one at a time to decide whether to report it as dangling. This causes every local ref to be scanned for each ref being pruned. If there are N refs in the repo and M refs being pruned, this code is O(M*N). However, `git remote prune` uses a very similar function that is only O(N*log(M)). Remove the wasteful ref scanning for each pruned ref and use the faster version already available in refs_warn_dangling_symrefs. Change the message to include the original refname since the message is no longer printed immediately after the line that did just print the refname. In a repo with 126,000 refs, where I was pruning 28,000 refs, this code made about 3.6 billion calls to strcmp and consumed 410 seconds of CPU. (Invariably in that time, my remote would timeout and the fetch would fail anyway.) After this change, the same operation completes in under a second. Signed-off-by: Phil Hord <phil.hord@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename `object_directory` to `odb_source`Patrick Steinhardt
The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location. While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example. As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more. Let's preempt this by renaming the structure. There have been a couple alternatives that were discussed: - `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept. - `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate". Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology. In the future, this change allows us to easily introduce for example a `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25Merge branch 'ps/maintenance-ref-lock'Junio C Hamano
"git maintenance" lacked the care "git gc" had to avoid holding onto the repository lock for too long during packing refs, which has been remedied. * ps/maintenance-ref-lock: builtin/maintenance: fix locking race when handling "gc" task builtin/gc: avoid global state in `gc_before_repack()` usage: allow dying without writing an error message builtin/maintenance: fix locking race with refs and reflogs tasks builtin/maintenance: split into foreground and background tasks builtin/maintenance: fix typedef for function pointers builtin/maintenance: extract function to run tasks builtin/maintenance: stop modifying global array of tasks builtin/maintenance: mark "--task=" and "--schedule=" as incompatible builtin/maintenance: centralize configuration of explicit tasks builtin/gc: drop redundant local variable builtin/gc: use designated field initializers for maintenance tasks
2025-06-03usage: allow dying without writing an error messagePatrick Steinhardt
Sometimes code wants to die in a situation where it already has written an error message. To use the same error code as `die()` we have to use `exit(128)`, which is easy to get wrong and leaves magic numbers all over our codebase. Teach `die_message_builtin()` to not print any error when passed a `NULL` pointer as error string. Like this, such users can now call `die(NULL)` to achieve the same result without any hardcoded error codes. Adapt a couple of builtins to use this new pattern to demonstrate that there is a need for such a helper. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19fetch: use batched reference updatesKarthik Nayak
The reference updates performed as a part of 'git-fetch(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an '--atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-fetch(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. Adding support for batched updates is simply modifying the flow to also create a batch update transaction in the non-atomic flow. With the reftable backend there is a 22x performance improvement, when performing 'git-fetch(1)' with 10000 refs: Benchmark 1: fetch: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 3.403 s ± 0.775 s [User: 1.875 s, System: 1.417 s] Range (min … max): 2.454 s … 4.529 s 10 runs Benchmark 2: fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 154.3 ms ± 17.6 ms [User: 102.5 ms, System: 56.1 ms] Range (min … max): 145.2 ms … 220.5 ms 18 runs Summary fetch: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 22.06 ± 5.62 times faster than fetch: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.25x performance improvement: Benchmark 1: fetch: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 605.5 ms ± 9.4 ms [User: 117.8 ms, System: 483.3 ms] Range (min … max): 595.6 ms … 621.5 ms 10 runs Benchmark 2: fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 485.8 ms ± 4.3 ms [User: 91.1 ms, System: 396.7 ms] Range (min … max): 477.6 ms … 494.3 ms 10 runs Summary fetch: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.25 ± 0.02 times faster than fetch: many refs (refformat = files, refcount = 10000, revision = master) With this we'll either be using a regular transaction or a batch update transaction. This helps cleanup some code which is no longer needed as we'll now always have some type of 'ref_transaction' object being propagated. One big change is that earlier, each individual update would propagate a failure. Whereas now, the `ref_transaction_for_each_rejected_update` function is called at the end of the flow to capture the exit status for 'git-fetch(1)' and also to print F/D conflict errors. This does change the order of the errors being printed, but the behavior stays the same. Since transaction errors are now explicitly defined as part of 76e760b999 (refs: introduce enum-based transaction error types, 2025-04-08), utilize them and get rid of custom errors defined within 'builtin/fetch.c'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: avoid unnecessary work when there is no current branchJohannes Schindelin
As pointed out by CodeQL, `branch_get()` may return `NULL`, in which case `branch_has_merge_config()` would return early, but we can even avoid enumerating the refs prefixes in that case, saving even more CPU cycles. Technically, we should enclose these two statements in an `if (branch) {...}` block, but the indentation is already quite deep, therefore I refrained from doing that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-15fetch: carefully clear local variable's address after useJohannes Schindelin
As pointed out by CodeQL, it is a potentially dangerous practice to store local variables' addresses in non-local structs. Yet this is exactly what happens with the `acked_commits` attribute that is used in `cmd_fetch()`: The pointer to a local variable is assigned to it. Now, it is Git's convention that `cmd_*()` functions are essentially only returning just before exiting the process, therefore there is little danger that this attribute is used after the code flow returns from that function. However, code in `cmd_*()` function is often so useful that it gets lifted into a library function, at which point this issue could become a real problem. Let's make sure to clear the `acked_commits` attribute out after it was used, and before the function returns (at which point the address would go stale). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-12Merge branch 'ps/object-store-cleanup'Junio C Hamano
Further code clean-up in the object-store layer. * ps/object-store-cleanup: object-store: drop `repo_has_object_file()` treewide: convert users of `repo_has_object_file()` to `has_object()` object-store: allow fetching objects via `has_object()` object-store: move function declarations to their respective subsystems object-store: move and rename `odb_pack_keep()` object-store: drop `loose_object_path()` object-store: move `struct packed_git` into "packfile.h"
2025-04-29treewide: convert users of `repo_has_object_file()` to `has_object()`Patrick Steinhardt
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement: - The new function has a short-and-sweet name. - More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions. Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`: - `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`. The replacements should be functionally equivalent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/parse-options-integers'Junio C Hamano
Update parse-options API to catch mistakes to pass address of an integral variable of a wrong type/size. * ps/parse-options-integers: parse-options: detect mismatches in integer signedness parse-options: introduce precision handling for `OPTION_UNSIGNED` parse-options: introduce precision handling for `OPTION_INTEGER` parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()` parse-options: support unit factors in `OPT_INTEGER()` global: use designated initializers for options parse: fix off-by-one for minimum signed values
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-24Merge branch 'ps/object-file-cleanup' into ps/object-store-cleanupJunio C Hamano
* ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-17Merge branch 'jk/fetch-follow-remote-head-fix'Junio C Hamano
"git fetch [<remote>]" with only the configured fetch refspec should be the only thing to update refs/remotes/<remote>/HEAD, but the code was overly eager to do so in other cases. * jk/fetch-follow-remote-head-fix: fetch: make set_head() call easier to read fetch: don't ask for remote HEAD if followRemoteHEAD is "never" fetch: only respect followRemoteHEAD with configured refspecs
2025-04-17global: use designated initializers for optionsPatrick Steinhardt
While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields that has one as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro. Callsites that open-code these structure initialization don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides: - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction. - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean. - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites. Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-16Merge branch 'kn/non-transactional-batch-updates'Junio C Hamano
Updating multiple references have only been possible in all-or-none fashion with transactions, but it can be more efficient to batch multiple updates even when some of them are allowed to fail in a best-effort manner. A new "best effort batches of updates" mode has been introduced. * kn/non-transactional-batch-updates: update-ref: add --batch-updates flag for stdin mode refs: support rejection in batch updates during F/D checks refs: implement batch reference update support refs: introduce enum-based transaction error types refs/reftable: extract code from the transaction preparation refs/files: remove duplicate duplicates check refs: move duplicate refname update check to generic layer refs/files: remove redundant check in split_symref_update()
2025-04-16Merge branch 'jt/ref-transaction-abort-fix'Junio C Hamano
A ref transaction corner case fix. * jt/ref-transaction-abort-fix: builtin/fetch: avoid aborting closed reference transaction
2025-04-15Merge branch 'jt/clone-guess-remote-head-fix'Junio C Hamano
"git clone" still gave the message about the default branch name; this message has been turned into an advice message that can be turned off. * jt/clone-guess-remote-head-fix: advice: allow disabling default branch name advice builtin/clone: suppress unexpected default branch advice remote: allow `guess_remote_head()` to suppress advice
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-09fetch: make set_head() call easier to readJeff King
We ignore any error returned from set_head(), but 638060dcb9 (fetch set_head: refactor to use remote directly, 2025-01-26) left its call in a noop "if" conditional as a sort of note-to-self. When c834d1a7ce (fetch: only respect followRemoteHEAD with configured refspecs, 2025-03-18) added a "do_set_head" flag, it was rolled into the same conditional, putting set_head() on the right-hand side of a short-circuit AND. That's not wrong, but it really hides the point of the line, which is (maybe) calling the function. Instead, let's have a full if() block for the flag, and then our comment (with some rewording) will be sufficient to clarify the error handling. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08refs: introduce enum-based transaction error typesKarthik Nayak
Replace preprocessor-defined transaction errors with a strongly-typed enum `ref_transaction_error`. This change: - Improves type safety and function signature clarity. - Makes error handling more explicit and discoverable. - Maintains existing error cases, while adding new error cases for common scenarios. This refactoring paves the way for more comprehensive error handling which we will utilize in the upcoming commits to add batch reference update support. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-25remote: allow `guess_remote_head()` to suppress adviceJustin Tobler
The `repo_default_branch_name()` invoked through `guess_remote_head()` is configured to always display the default branch advice message. Adapt `guess_remote_head()` to accept flags and convert the `all` parameter to a flag. Add the `REMOTE_GUESS_HEAD_QUIET` flag to to enable suppression of advice messages. Call sites are updated accordingly. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-21builtin/fetch: avoid aborting closed reference transactionJustin Tobler
As part of the reference transaction commit phase, the transaction is set to a closed state regardless of whether it was successful of not. Attempting to abort a closed transaction via `ref_transaction_abort()` results in a `BUG()`. In c92abe71df (builtin/fetch: fix leaking transaction with `--atomic`, 2024-08-22), logic to free a transaction after the commit phase is moved to the centralized exit path. In cases where the transaction commit failed, this results in a closed transaction being aborted and signaling a bug. Free the transaction and set it to NULL when the commit fails. This allows the exit path to correctly handle the error without attempting to abort the transaction. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>