aboutsummaryrefslogtreecommitdiff
path: root/builtin/receive-pack.c
AgeCommit message (Collapse)Author
7 daysMerge branch 'ps/odb-cleanup'Junio C Hamano
Various code clean-up around odb subsystem. * ps/odb-cleanup: odb: drop unneeded headers and forward decls odb: rename `odb_has_object()` flags odb: use enum for `odb_write_object` flags odb: rename `odb_write_object()` flags treewide: use enum for `odb_for_each_object()` flags CodingGuidelines: document our style for flags
7 daysMerge branch 'ps/receive-pack-updateinstead-in-worktree'Junio C Hamano
The check in "receive-pack" to prevent a checked out branch from getting updated via updateInstead mechanism has been corrected. * ps/receive-pack-updateinstead-in-worktree: receive-pack: use worktree HEAD for updateInstead t5516: clean up cloned and new-wt in denyCurrentBranch and worktrees test t5516: test updateInstead with worktree and unborn bare HEAD
12 daysMerge branch 'ar/config-hook-cleanups'Junio C Hamano
Code clean-up around the recent "hooks defined in config" topic. * ar/config-hook-cleanups: hook: reject unknown hook names in git-hook(1) hook: show disabled hooks in "git hook list" hook: show config scope in git hook list hook: introduce hook_config_cache_entry for per-hook data t1800: add test to verify hook execution ordering hook: make consistent use of friendly-name in docs hook: replace hook_list_clear() -> string_list_clear_func() hook: detect & emit two more bugs hook: rename cb_data_free/alloc -> hook_data_free/alloc hook: fix minor style issues builtin/receive-pack: properly init receive_hook strbuf hook: move unsorted_string_list_remove() to string-list.[ch]
2026-03-31odb: rename `odb_has_object()` flagsPatrick Steinhardt
Rename `odb_has_object()` flags to be properly prefixed with the function name. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-30receive-pack: use worktree HEAD for updateInsteadPablo Sabater
When a bare repo has linked worktrees, and its HEAD points to an unborn branch, pushing to a wt branch with updateInstead fails and rejects the push, even if the wt is clean. This happens because HEAD is checked only for the bare repo context, instead of the wt. Remove head_has_history and check for worktree->head_oid which does have the correct HEAD of the wt. Update the test added by Runxi's patch to expect success. Signed-off-by: Pablo Sabater <pabloosabaterr@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-25hook: fix minor style issuesAdrian Ratiu
Fix some minor style nits pointed out by Patrick, Junio and Eric: * Use CALLOC_ARRAY instead of xcalloc. * Init struct members during declaration. * Simplify if condition boolean logic. * Missing curly braces in if/else stmts. * Unnecessary header includes. * Capitalization and full-stop in error/warn messages. * Curly brace on separate line when defining struct. * Comment spelling: free'd -> freed. * Sort the included headers. * Blank line fixes to improve readability. These contain no logic changes, the code behaves the same as before. Suggested-by: Eric Sunshine <sunshine@sunshineco.com> Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-25builtin/receive-pack: properly init receive_hook strbufAdrian Ratiu
The run_receive_hook() stack-allocated `struct receive_hook_feed_state` is a template with initial values for child states allocated on the heap for each hook process, by calling receive_hook_feed_state_alloc() when spinning up each hook child. All these values are already initialized to zero, however I forgot to properly initialize the strbuf, which I left NULL. This is more of a code cleanup because in practice it has no effect, the states used by the children are always initialized, however it's good to fix in case someone ends up accidentally dereferencing the NULL pointer in the future. Reported-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-12run-command: wean auto_maintenance() functions off the_repositoryBurak Kaan Karaçay
The prepare_auto_maintenance() relies on the_repository to read configurations. Since run_auto_maintenance() calls prepare_auto_maintenance(), it also implicitly depends the_repository. Add 'struct repository *' as a parameter to both functions and update all callers to pass the_repository. With no global repository dependencies left in this file, remove the USE_THE_REPOSITORY_VARIABLE macro. Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Burak Kaan Karaçay <bkkaracay@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-10Merge branch 'ar/config-hooks'Junio C Hamano
Allow hook commands to be defined (possibly centrally) in the configuration files, and run multiple of them for the same hook event. * ar/config-hooks: hook: add -z option to "git hook list" hook: allow out-of-repo 'git hook' invocations hook: allow event = "" to overwrite previous values hook: allow disabling config hooks hook: include hooks from the config hook: add "git hook list" command hook: run a list of hooks to prepare for multihook support hook: add internal state alloc/free callbacks
2026-03-09Merge branch 'ps/refs-for-each'Junio C Hamano
Code refactoring around refs-for-each-* API functions. * ps/refs-for-each: refs: replace `refs_for_each_fullref_in()` refs: replace `refs_for_each_namespaced_ref()` refs: replace `refs_for_each_glob_ref()` refs: replace `refs_for_each_glob_ref_in()` refs: replace `refs_for_each_rawref_in()` refs: replace `refs_for_each_rawref()` refs: replace `refs_for_each_ref_in()` refs: improve verification for-each-ref options refs: generalize `refs_for_each_fullref_in_prefixes()` refs: generalize `refs_for_each_namespaced_ref()` refs: speed up `refs_for_each_glob_ref_in()` refs: introduce `refs_for_each_ref_ext` refs: rename `each_ref_fn` refs: rename `do_for_each_ref_flags` refs: move `do_for_each_ref_flags` further up refs: move `refs_head_ref_namespaced()` refs: remove unused `refs_for_each_include_root_ref()`
2026-03-09Merge branch 'ar/run-command-hook-take-2'Junio C Hamano
Use the hook API to replace ad-hoc invocation of hook scripts via the run_command() API. * ar/run-command-hook-take-2: builtin/receive-pack: avoid spinning no-op sideband async threads receive-pack: convert receive hooks to hook API receive-pack: convert update hooks to new API run-command: poll child input in addition to output hook: add jobs option reference-transaction: use hook API instead of run-command transport: convert pre-push to hook API hook: allow separate std[out|err] streams hook: convert 'post-rewrite' hook in sequencer.c to hook API hook: provide stdin via callback run-command: add stdin callback for parallelization run-command: add helper for pp child states t1800: add hook output stream tests
2026-03-09Merge branch 'ar/config-hooks' into ar/config-hook-cleanupsJunio C Hamano
* ar/config-hooks: (21 commits) builtin/receive-pack: avoid spinning no-op sideband async threads hook: add -z option to "git hook list" hook: allow out-of-repo 'git hook' invocations hook: allow event = "" to overwrite previous values hook: allow disabling config hooks hook: include hooks from the config hook: add "git hook list" command hook: run a list of hooks to prepare for multihook support hook: add internal state alloc/free callbacks receive-pack: convert receive hooks to hook API receive-pack: convert update hooks to new API run-command: poll child input in addition to output hook: add jobs option reference-transaction: use hook API instead of run-command transport: convert pre-push to hook API hook: allow separate std[out|err] streams hook: convert 'post-rewrite' hook in sequencer.c to hook API hook: provide stdin via callback run-command: add stdin callback for parallelization run-command: add helper for pp child states ...
2026-03-02Merge branch 'ar/run-command-hook-take-2' into ar/config-hooksJunio C Hamano
* ar/run-command-hook-take-2: builtin/receive-pack: avoid spinning no-op sideband async threads
2026-03-02builtin/receive-pack: avoid spinning no-op sideband async threadsAdrian Ratiu
Exit early if the hooks do not exist, to avoid spinning up/down sideband async threads which no-op. It is important to call the hook_exists() API provided by hook.[ch] because it covers both config-defined hooks and the "traditional" hooks from the hookdir. find_hook() only covers the hookdir hooks. The regression happened because the no-op async threads add some additional overhead which can be measured with the receive-refs test of the benchmarks suite [1]. Reproduced using: cd benchmarks/receive-refs && \ ./run --revisions /path/to/git \ fc148b146ad41be71a7852c4867f0773cbfe1ff9~,fc148b146ad41be71a7852c4867f0773cbfe1ff9 \ --parameter-list refformat reftable --parameter-list refcount 10000 1: https://gitlab.com/gitlab-org/data-access/git/benchmarks Fixes: fc148b146ad4 ("receive-pack: convert update hooks to new API") Reported-by: Patrick Steinhardt <ps@pks.im> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> [jc: avoid duplicated hardcoded hook names] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-23refs: replace `refs_for_each_fullref_in()`Patrick Steinhardt
Replace calls to `refs_for_each_fullref_in()` with the newly introduced `refs_for_each_ref_ext()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-19hook: add internal state alloc/free callbacksAdrian Ratiu
Some hooks use opaque structs to keep internal state between callbacks. Because hooks ran sequentially (jobs == 1) with one command per hook, these internal states could be allocated on the stack for each hook run. Next commits add the ability to run multiple commands for each hook, so the states cannot be shared or stored on the stack anymore, especially since down the line we will also enable parallel execution (jobs > 1). Add alloc/free helpers for each hook, doing a "deep" alloc/init & free of their internal opaque struct. The alloc callback takes a context pointer, to initialize the struct at at the time of resource acquisition. These callbacks must always be provided together: no alloc without free and no free without alloc, otherwise a BUG() is triggered. Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-13Merge branch 'cf/c23-const-preserving-strchr-updates-0'Junio C Hamano
ISO C23 redefines strchr and friends that tradiotionally took a const pointer and returned a non-const pointer derived from it to preserve constness (i.e., if you ask for a substring in a const string, you get a const pointer to the substring). Update code paths that used non-const pointer to receive their results that did not have to be non-const to adjust. * cf/c23-const-preserving-strchr-updates-0: gpg-interface: remove an unnecessary NULL initialization global: constify some pointers that are not written to
2026-02-09Merge branch 'kn/ref-batch-output-error-reporting-fix'Junio C Hamano
A handful of code paths that started using batched ref update API (after Git 2.51 or so) lost detailed error output, which have been corrected. * kn/ref-batch-output-error-reporting-fix: fetch: delay user information post committing of transaction receive-pack: utilize rejected ref error details fetch: utilize rejected ref error details update-ref: utilize rejected error details if available refs: add rejection detail to the callback function refs: skip to next ref when current ref is rejected
2026-02-05global: constify some pointers that are not written toCollin Funk
The recent glibc 2.43 release had the following change listed in its NEWS file: For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr, strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return pointers into their input arrays now have definitions as macros that return a pointer to a const-qualified type when the input argument is a pointer to a const-qualified type. When compiling with GCC 15, which defaults to -std=gnu23, this causes many warnings like this: merge-ort.c: In function ‘apply_directory_rename_modifications’: merge-ort.c:2734:36: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 2734 | char *last_slash = strrchr(cur_path, '/'); | ^~~~~~~ This patch fixes the more obvious ones by making them const when we do not write to the returned pointer. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-28receive-pack: convert receive hooks to hook APIEmily Shaffer
This converts the last remaining hooks to the new hook API, for the same benefits as the previous conversions (no need to toggle signals, manage custom struct child_process, call find_hook(), prepares for specifying hooks via configs, etc.). See the previous three commits for a more in-depth explanation of how this all works. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-28receive-pack: convert update hooks to new APIEmily Shaffer
The hook API avoids creating a custom struct child_process and other internal hook plumbing (e.g. calling find_hook()) and prepares for the specification of hooks via configs or running parallel hooks. Execution is still sequential through the run_hooks_opt .jobs == 1, which is the unchanged default for all hooks. When use_sideband==1, the async thread redirects the hook outputs to sideband 2, otherwise it is not used and the hooks write directly to the fds inherited from the main parent process. When .jobs == 1, run-command's poll loop is avoided entirely via the ungroup=1 option like before (this was Jeff's suggestion), achieving the same real-time output performance. When running in parallel, run-command with ungroup=0 will capture and de-interleave the output of each hook, then write to the parent stderr which is redirected via dup2 to the sideband thread, so that each parallel hook output is presented clearly to the client. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-25receive-pack: utilize rejected ref error detailsKarthik Nayak
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19), git-receive-pack(1) switched to using batched reference updates. This also introduced a regression wherein instead of providing detailed error messages for failed referenced updates, the users were provided generic error messages based on the error type. Now that the updates also contain detailed error message, propagate those to the client via 'rp_error'. The detailed error messages can be very verbose, for e.g. in the files backend, when trying to write a non-commit object to a branch, you would see: ! [remote rejected] 3eaec9ccf3a53f168362a6b3fdeb73426fb9813d -> branch (cannot update ref 'refs/heads/branch': trying to write non-commit object 3eaec9ccf3a53f168362a6b3fdeb73426fb9813d to branch 'refs/heads/branch') Here the refname is repeated multiple times due to how error messages are propagated and filled over the code stack. This potentially can be cleaned up in a future commit. Reported-by: Elijah Newren <newren@gmail.com> Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-25refs: add rejection detail to the callback functionKarthik Nayak
The previous commit started storing the rejection details alongside the error code for rejected updates. Pass this along to the callback function `ref_transaction_for_each_rejected_update()`. Currently the field is unused, but will be integrated in the upcoming commits. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-15Revert "Merge branch 'ar/run-command-hook'"Junio C Hamano
This reverts commit f406b8955295d01089ba2baf35eceadff2d11cae, reversing changes made to 1627809eeff75e6ec936fc609e7be46d5eb2fa9e. It seems to have caused a few regressions, two of the three known ones we have proposed solutions for. Let's give ourselves a bit more room to maneuver during the pre-release freeze period and restart once the 2.53 ships.
2026-01-06Merge branch 'ar/run-command-hook'Junio C Hamano
Use hook API to replace ad-hoc invocation of hook scripts with the run_command() API. * ar/run-command-hook: receive-pack: convert receive hooks to hook API receive-pack: convert update hooks to new API hooks: allow callers to capture output run-command: allow capturing of collated output hook: allow overriding the ungroup option reference-transaction: use hook API instead of run-command transport: convert pre-push to hook API hook: convert 'post-rewrite' hook in sequencer.c to hook API hook: provide stdin via callback run-command: add stdin callback for parallelization run-command: add first helper for pp child states
2025-12-28receive-pack: convert receive hooks to hook APIEmily Shaffer
This converts the last remaining hooks to the new hook API, for the same benefits as the previous conversions (no need to toggle signals, manage custom struct child_process, call find_hook(), prepares for specifyinig hooks via configs, etc.). I noticed a performance degradation when processing large amounts of hook input with just 1 line per callback, due to run-command's poll loop, therefore I batched 500 lines per callback, to ensure similar pipe throughput as before and to avoid hook child waiting on stdin. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28receive-pack: convert update hooks to new APIEmily Shaffer
Use the new hook sideband API introduced in the previous commit. The hook API avoids creating a custom struct child_process and other internal hook plumbing (e.g. calling find_hook()) and prepares for the specification of hooks via configs or running parallel hooks. Execution is still sequential through the current hook.[ch] via the run_process_parallel_opts.processes=1 arg. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-05Merge branch 'ps/object-source-management'Junio C Hamano
Code refactoring around object database sources. * ps/object-source-management: odb: handle recreation of quarantine directories odb: handle changing a repository's commondir chdir-notify: add function to unregister listeners odb: handle initialization of sources in `odb_new()` http-push: stop setting up `the_repository` for each reference t/helper: stop setting up `the_repository` repeatedly builtin/index-pack: fix deferred fsck outside repos oidset: introduce `oidset_equal()` odb: move logic to disable ref updates into repo odb: refactor `odb_clear()` to `odb_free()` odb: adopt logic to close object databases setup: convert `set_git_dir()` to have file scope path: move `enter_repo()` into "setup.c"
2025-12-05Merge branch 'jc/optional-path'Junio C Hamano
"git config get --path" segfaulted on an ":(optional)path" that does not exist, which has been corrected. * jc/optional-path: config: really treat missing optional path as not configured config: really pretend missing :(optional) value is not there config: mark otherwise unused function as file-scope static
2025-11-24config: really treat missing optional path as not configuredJunio C Hamano
These callers expect that git_config_pathname() that returns 0 is a signal that the variable they passed has a string they need to act on. But with the introduction of ":(optional)path" earlier, that is no longer the case. If the path specified by the configuration variable is missing, their variable will get a NULL in it, and they need to act on it (often, just refraining from copying it elsewhere). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19path: move `enter_repo()` into "setup.c"Patrick Steinhardt
The function `enter_repo()` is used to enter a repository at a given path. As such it sits way closer to setting up a repository than it does with handling paths, but regardless of that it's located in "path.c" instead of in "setup.c". Move the function into "setup.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24packfile: split up responsibilities of `reprepare_packed_git()`Patrick Steinhardt
In `reprepare_packed_git()` we perform a couple of operations: - We reload alternate object directories. - We clear the loose object cache. - We reprepare packfiles. While the logic is hosted in "packfile.c", it clearly reaches into other subsystems that aren't related to packfiles. Split up the responsibility and introduce `odb_reprepare()` which now becomes responsible for repreparing the whole object database. The existing `reprepare_packed_git()` function is refactored accordingly and only cares about reloading the packfile store now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05Merge branch 'ps/object-file-wo-the-repository'Junio C Hamano
Reduce implicit assumption and dependence on the_repository in the object-file subsystem. * ps/object-file-wo-the-repository: object-file: get rid of `the_repository` in index-related functions object-file: get rid of `the_repository` in `force_object_loose()` object-file: get rid of `the_repository` in `read_loose_object()` object-file: get rid of `the_repository` in loose object iterators object-file: remove declaration for `for_each_file_in_obj_subdir()` object-file: inline `for_each_loose_file_in_objdir_buf()` object-file: get rid of `the_repository` when writing objects odb: introduce `odb_write_object()` loose: write loose objects map via their source object-file: get rid of `the_repository` in `finalize_object_file()` object-file: get rid of `the_repository` in `loose_object_info()` object-file: get rid of `the_repository` when freshening objects object-file: inline `check_and_freshen()` functions object-file: get rid of `the_repository` in `has_loose_object()` object-file: stop using `the_hash_algo` object-file: fix -Wsign-compare warnings
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-17Merge branch 'bc/use-sha256-by-default-in-3.0' into ps/config-wo-the-repositoryJunio C Hamano
* bc/use-sha256-by-default-in-3.0: Enable SHA-256 by default in breaking changes mode help: add a build option for default hash t5300: choose the built-in hash outside of a repo t4042: choose the built-in hash outside of a repo t1007: choose the built-in hash outside of a repo t: default to compile-time default hash if not set setup: use the default algorithm to initialize repo format Use legacy hash for legacy formats builtin: use default hash when outside a repository hash: add a constant for the legacy hash algorithm hash: add a constant for the default hash algorithm
2025-07-16odb: introduce `odb_write_object()`Patrick Steinhardt
We do not have a backend-agnostic way to write objects into an object database. While there is `write_object_file()`, this function is rather specific to the loose object format. Introduce `odb_write_object()` to plug this gap. For now, this function is a simple wrapper around `write_object_file()` and doesn't even use the passed-in object database yet. This will change in subsequent commits, where `write_object_file()` is converted so that it works on top of an `odb_source`. `odb_write_object()` will then become responsible for deciding which source an object shall be written to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'ps/object-store'Junio C Hamano
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-09Merge branch 'ps/object-store' into ps/object-file-wo-the-repositoryJunio C Hamano
* ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-01Use legacy hash for legacy formatsbrian m. carlson
We have a large variety of data formats and protocols where no hash algorithm was defined and the default was assumed to always be SHA-1. Instead of explicitly stating SHA-1, let's use the constant to represent the legacy hash algorithm (which is still SHA-1) so that it's clear for documentary purposes that it's a legacy fallback option and not an intentional choice to use SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: get rid of `the_repository` in `for_each()` functionsPatrick Steinhardt
There are a couple of iterator-style functions that execute a callback for each instance of a given set, all of which currently depend on `the_repository`. Refactor them to instead take an object database as parameter so that we can get rid of this dependency. Rename the functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25receive-pack: handle reference deletions separatelyKarthik Nayak
In 9d2962a7c4 (receive-pack: use batched reference updates, 2025-05-19) we updated the 'git-receive-pack(1)' command to use batched reference updates. One edge case which was missed during this implementation was when a user pushes multiple branches such as: delete refs/heads/branch/conflict create refs/heads/branch Before using batched updates, the references would be applied sequentially and hence no conflicts would arise. With batched updates, while the first update applies, the second fails due to D/F conflict. A similar issue was present in 'git-fetch(1)' and was fixed by separating out reference pruning into a separate transaction in the commit 'fetch: use batched reference updates'. Apply a similar mechanism for 'git-receive-pack(1)' and separate out reference deletions into its own batch. This means 'git-receive-pack(1)' will now use up to two transactions, whereas before using batched updates it would use _at least_ two transactions. So using batched updates is still the better option. Add a test to validate this behavior. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-20builtin/receive-pack: add option to skip connectivity checkJustin Tobler
During git-receive-pack(1), connectivity of the object graph is validated to ensure that the received packfile does not leave the repository in a broken state. This is done via git-rev-list(1) and walking the objects, which can be expensive for large repositories. Generally, this check is critical to avoid an incomplete received packfile from corrupting a repository. Server operators may have additional knowledge though around exactly how Git is being used on the server-side which can be used to facilitate more efficient connectivity computation of incoming objects. For example, if it can be ensured that all objects in a repository are connected and do not depend on any missing objects, the connectivity of newly written objects can be checked by walking the object graph containing only the new objects from the updated tips and identifying the missing objects which represent the boundary between the new objects and the repository. These boundary objects can be checked in the canonical repository to ensure the new objects connect as expected and thus avoid walking the rest of the object graph. Git itself cannot make the guarantees required for such an optimization as it is possible for a repository to contain an unreachable object that references a missing object without the repository being considered corrupt. Introduce the --skip-connectivity-check option for git-receive-pack(1) which bypasses this connectivity check to give more control to the server-side. Note that without proper server-side validation of newly received objects handled outside of Git, usage of this option risks corrupting a repository. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-19receive-pack: use batched reference updatesKarthik Nayak
The reference updates performed as a part of 'git-receive-pack(1)', take place one at a time. For each reference update, a new transaction is created and committed. This is necessary to ensure we can allow individual updates to fail without failing the entire command. The command also supports an 'atomic' mode, which uses a single transaction to update all of the references. But this mode has an all-or-nothing approach, where if a single update fails, all updates would fail. In 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08), we introduced a new mechanism to batch reference updates. Under the hood, this uses a single transaction to perform a batch of reference updates, while allowing only individual updates to fail. Utilize this newly introduced batch update mechanism in 'git-receive-pack(1)'. This provides a significant bump in performance, especially when dealing with repositories with large number of references. With the reftable backend there is a 18x performance improvement, when performing receive-pack with 10000 refs: Benchmark 1: receive: many refs (refformat = reftable, refcount = 10000, revision = master) Time (mean ± σ): 4.276 s ± 0.078 s [User: 0.796 s, System: 3.318 s] Range (min … max): 4.185 s … 4.430 s 10 runs Benchmark 2: receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) Time (mean ± σ): 235.4 ms ± 6.9 ms [User: 75.4 ms, System: 157.3 ms] Range (min … max): 228.5 ms … 254.2 ms 11 runs Summary receive: many refs (refformat = reftable, refcount = 10000, revision = HEAD) ran 18.16 ± 0.63 times faster than receive: many refs (refformat = reftable, refcount = 10000, revision = master) In similar conditions, the files backend sees a 1.21x performance improvement: Benchmark 1: receive: many refs (refformat = files, refcount = 10000, revision = master) Time (mean ± σ): 1.121 s ± 0.021 s [User: 0.128 s, System: 0.975 s] Range (min … max): 1.097 s … 1.156 s 10 runs Benchmark 2: receive: many refs (refformat = files, refcount = 10000, revision = HEAD) Time (mean ± σ): 927.9 ms ± 22.6 ms [User: 99.0 ms, System: 815.2 ms] Range (min … max): 903.1 ms … 978.0 ms 10 runs Summary receive: many refs (refformat = files, refcount = 10000, revision = HEAD) ran 1.21 ± 0.04 times faster than receive: many refs (refformat = files, refcount = 10000, revision = master) As using batched updates requires the error handling to be moved to the end of the flow, create and use a 'struct strset' to track the failed refs and attribute the correct errors to them. This change also uncovers an issue when a client provides multiple updates to the same reference. For example: $ git send-pack remote.git A:foo B:foo Enumerating objects: 3, done. Counting objects: 100% (3/3), done. Delta compression using up to 20 threads Compressing objects: 100% (2/2), done. Writing objects: 100% (3/3), 226 bytes | 226.00 KiB/s, done. Total 3 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0) remote: error: cannot lock ref 'refs/heads/foo': reference already exists To remote.git ! [remote rejected] A -> foo (failed to update ref) ! [remote failure] B -> foo (remote failed to report status) As you can see, the remote runs into an error because it cannot lock the target reference for the second update. Furthermore, the remote complains that the first update has been rejected whereas the second update didn't receive any status update because we failed to lock it. Reading this status message alone a user would probably expect that `foo` has not been updated at all. But that's not the case: while we claim that the ref wasn't updated, it surprisingly points to `A` now. One could argue that this is merely an error in how we report the result of this push. But ultimately, the user's request itself is already broken and doesn't make any sense in the first place and cannot ever lead to a sensible outcome that honors the full request. The conversion to batched transactions fixes the issue because we now try to queue both updates in the same transaction. As such, the transaction itself will notice this conflict and refuse the update altogether before we commit any of the values. Note that this requires changes to a couple of tests in t5408 that happened to exercise this behaviour. Given that the generated output is misleading and given that the user request cannot ever be fully honored this really feels more like a bug than properly designed behaviour. As such, changing the behaviour feels like the right thing to do. Since now reference updates are batched, the 'reference-transaction' hook will be invoked with all updates together. Currently git will 'die' when the hook returns with a non-zero exit status in the 'prepared' stage. For 'git-receive-pack(1)', this allowed users to reject an individual reference update, git would have applied previous updates but immediately abort further execution. This is definitely an incorrect usage of this hook, since the right place to do this would be the 'update' hook. This patch retains the latter behavior, but 'reference-transaction' hook now changes to a all-or-nothing behavior when a non-zero exit status is returned in the 'prepared' stage, since batch updates use a transaction under the hood. This explains the change in 't1416'. Helped-by: Jeff King <peff@peff.net> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-29treewide: convert users of `repo_has_object_file()` to `has_object()`Patrick Steinhardt
As the comment of `repo_has_object_file()` and its `_with_flags()` variant tells us, these functions are considered to be deprecated in favor of `has_object()`. There are a couple of slight benefits in favor of the replacement: - The new function has a short-and-sweet name. - More explicit defaults: `has_object()` doesn't fetch missing objects via promisor remotes, and neither does it reload packfiles if an object wasn't found by default. This ensures that it becomes immediately obvious when a simple object existence check may result in expensive actions. Most importantly though, it is confusing that we have two sets of functions that ultimately do the same thing, but with different defaults. Start sunsetting `repo_has_object_file()` and its `_with_flags()` sibling by replacing all callsites with `has_object()`: - `repo_has_object_file(...)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED | HAS_OBJECT_FETCH_PROMISOR)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., 0)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_SKIP_FETCH_OBJECT)` is equivalent to `has_object(..., HAS_OBJECT_RECHECK_PACKED)`. - `repo_has_object_file_with_flags(..., OBJECT_INFO_QUICK)` is equivalent to `has_object(..., HAS_OBJECT_FETCH_PROMISOR)`. The replacements should be functionally equivalent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-file: split out functions relating to object store subsystemPatrick Steinhardt
While we have the "object-store.h" header, most of the functionality for object stores is actually hosted in "object-file.c". This makes it hard to find relevant functions and causes us to mix up concerns. Split out functions relating to the object store subsystem into a new "object-store.c" file. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>