Age  Commit message  Author
2026-01-15Git 2.53-rc0v2.53.0-rc0Junio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-15Merge branch 'ps/clar-integers'Junio C Hamano
Import newer version of "clar", unit testing framework.

* ps/clar-integers:
  gitattributes: disable blank-at-eof errors for clar test expectations
  t/unit-tests: demonstrate use of integer comparison assertions
  t/unit-tests: update clar to 39f11fe
2026-01-15Merge branch 'kh/replay-invalid-onto-advance'Junio C Hamano
Improve the error message when a bad argument is given to the `--onto` option of "git replay". Test coverage of "git replay" has been improved.

* kh/replay-invalid-onto-advance:
  t3650: add more regression tests for failure conditions
  replay: die if we cannot parse object
  replay: improve code comment and die message
  replay: die descriptively when invalid commit-ish is given
  replay: find *onto only after testing for ref name
  replay: remove dead code and rearrange
2026-01-15Merge branch 'ps/odb-misc-fixes'Junio C Hamano
Miscellaneous fixes to the object database layer.

* ps/odb-misc-fixes:
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-15Merge branch 'pt/t7800-difftool-test-racefix'Junio C Hamano
Test fixup.

* pt/t7800-difftool-test-racefix:
  t7800: fix racy "difftool --dir-diff syncs worktree" test
2026-01-12The 17th batchJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12Merge branch 'js/mailmap-karsten-blees'Junio C Hamano
Mailmap update for Karsten.

* js/mailmap-karsten-blees:
  .mailmap: replace Karsten Blees' default address
2026-01-12Merge branch 'ps/t1300-2021-use-test-path-is-helpers'Junio C Hamano
Test updates.

* ps/t1300-2021-use-test-path-is-helpers:
  t1300: use test helpers instead of `test` command
2026-01-12Merge branch 'rs/commit-stack'Junio C Hamano
Code clean-up that unifies various hand-rolled "list of commit objects" implementations by using the commit_stack API.

* rs/commit-stack:
  commit-reach: use commit_stack
  commit-graph: use commit_stack
  commit: add commit_stack_grow()
  shallow: use commit_stack
  pack-bitmap-write: use commit_stack
  commit: add commit_stack_init()
  test-reach: use commit_stack
  remote: use commit_stack for src_commits
  remote: use commit_stack for sent_tips
  remote: use commit_stack for local_commits
  name-rev: use commit_stack
  midx: use commit_stack
  log: use commit_stack
  revision: export commit_stack
2026-01-12Merge branch 'sb/bundle-uri-without-uri'Junio C Hamano
Diagnose an invalid bundle-URI that lacks the URI entry, instead of crashing.

* sb/bundle-uri-without-uri:
  bundle-uri: validate that bundle entries have a uri
2026-01-12Merge branch 'ja/doc-synopsis-style-more'Junio C Hamano
More doc style updates.

* ja/doc-synopsis-style-more:
  doc: convert git-remote to synopsis style
  doc: convert git stage to use synopsis block
  doc: convert git-status tables to AsciiDoc format
  doc: convert git-status to synopsis style
  doc: fix t0450-txt-doc-vs-help to select only first synopsis block
2026-01-10.mailmap: replace Karsten Blees' default addressJohannes Schindelin
As per a recent email by Karsten, the @dcon.de address no longer works:

    https://lore.kernel.org/git/77e768b2-6693-454f-9e11-fb0acdec703c@gmail.com

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-08The 16th batchJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-08Merge branch 'en/ort-recursive-d-f-conflict-fix'Junio C Hamano
The ort merge machinery hit an assertion failure in a history with criss-cross merges that renamed a directory and a non-directory, which has been corrected.

* en/ort-recursive-d-f-conflict-fix:
  merge-ort: fix corner case recursive submodule/directory conflict handling
2026-01-08Merge branch 'dd/t5403-modernise'Junio C Hamano
Test micro-clean-up.

* dd/t5403-modernise:
  t5403: use test_path_is_file instead of test -f
2026-01-08Merge branch 'ds/diff-lazy-fetch-with-name-only-fix'Junio C Hamano
Running "git diff" with "--name-only" or other options that allow us not to look at the blob contents, while some objects are lazily fetched from a promisor remote, caused a use-after-free, which has been corrected.

* ds/diff-lazy-fetch-with-name-only-fix:
  diff: avoid segfault with freed entries
2026-01-08Merge branch 'rs/tag-wo-the-repository'Junio C Hamano
Code clean-up.

* rs/tag-wo-the-repository:
  tag: stop using the_repository
  tag: support arbitrary repositories in parse_tag()
  tag: support arbitrary repositories in gpg_verify_tag()
  tag: use algo of repo parameter in parse_tag_buffer()
2026-01-07odb: properly close sources before freeing themPatrick Steinhardt
It is possible to hit a memory leak when reading data from a submodule via git-grep(1):

    Direct leak of 192 byte(s) in 1 object(s) allocated from:
        #0 0x55555562e726 in calloc (git+0xda726)
        #1 0x555555964734 in xcalloc ../wrapper.c:154:8
        #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2
        #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6
        #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17
        #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3
        #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2
        #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2
        #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4
        #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8
        #10 0x5555558540bd in odb_read_object ../odb.c:920:6
        #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12
        #12 0x55555580a13a in grep_source_load ../grep.c:1986:10
        #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7
        #14 0x555555807574 in grep_source_1 ../grep.c:1625:8
        #15 0x555555807322 in grep_source ../grep.c:1837:10
        #16 0x5555556a5c58 in run ../builtin/grep.c:208:10
        #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
        #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
        #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)

The root cause of this leak is the way we set up and release the submodule:

  1. We use `repo_submodule_init()` to initialize a new repository. This repository is stored in `repos_to_free`.

  2. We now read data from the submodule repository.

  3. We then call `repo_clear()` on the submodule repositories.

  4. `repo_clear()` calls `odb_free()`.

  5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`.

The issue here is the 5th step: we call `odb_free_sources()` _before_ we call `odb_close()`. But `odb_free_sources()` already frees all sources, so the logic that closes them in `odb_close()` now becomes a no-op. As a consequence, we never explicitly close sources at all.

Fix the leak by closing the store before we free the sources.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
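The ordering problem described in this commit can be illustrated with a minimal sketch. The types and function names below (`struct source`, `struct store`, `store_close()`, `store_free_sources()`, `store_release()`) are hypothetical stand-ins, not Git's real object database API; the sketch only assumes that closing walks the same list that freeing tears down.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not Git's real structures. */
struct source {
	int is_open;            /* 1 while the source holds an open resource */
	struct source *next;
};

struct store {
	struct source *sources; /* singly linked list of sources */
	int closed_count;       /* number of sources closed explicitly */
};

/* Prepend a new, open source to the store. */
static void store_add_source(struct store *s)
{
	struct source *src = malloc(sizeof(*src));
	if (!src)
		abort();
	src->is_open = 1;
	src->next = s->sources;
	s->sources = src;
}

/* Close every source that is still attached to the store. */
static void store_close(struct store *s)
{
	for (struct source *src = s->sources; src; src = src->next) {
		if (src->is_open) {
			src->is_open = 0;
			s->closed_count++;
		}
	}
}

/* Free all list nodes; afterwards store_close() has nothing left to visit. */
static void store_free_sources(struct store *s)
{
	struct source *src = s->sources;
	while (src) {
		struct source *next = src->next;
		free(src);
		src = next;
	}
	s->sources = NULL;
}

/* Release in the safe order: close first, then free. */
static void store_release(struct store *s)
{
	store_close(s);
	store_free_sources(s);
}
```

Calling `store_free_sources()` before `store_close()` leaves `closed_count` at zero in this sketch, which mirrors the "close becomes a no-op" situation the commit message describes.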
2026-01-07builtin/gc: fix condition for whether to write commit graphsPatrick Steinhardt
When performing auto-maintenance we check whether commit graphs need to be generated by counting the number of commits that are reachable by any reference, but not covered by a commit graph. This search is performed by iterating through all references and then doing a depth-first search until we have found enough commits that are not present in the commit graph.

This logic has a memory leak though:

    Direct leak of 16 byte(s) in 1 object(s) allocated from:
        #0 0x55555562e433 in malloc (git+0xda433)
        #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
        #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
        #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
        #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
        #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
        #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
        #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
        #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
        #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
        #11 0x55555575166a in run_builtin ../git.c:506:11
        #12 0x5555557502f0 in handle_builtin ../git.c:779:9
        #13 0x555555751127 in run_argv ../git.c:862:4
        #14 0x55555575007b in cmd_main ../git.c:984:19
        #15 0x5555557523aa in main ../common-main.c:9:11
        #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
        #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
        #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`. This function expects as parameters the item to append and the _tail_ of the list to append. This tail will then be overwritten with the new tail of the list so that it can be used in subsequent calls. But we call it with `commit_list_append(parent->item, &stack)`, so we end up losing everything but the new item.

This issue only surfaces when counting merge commits. Next to being a memory leak, it also shows that we're in fact miscounting as we only respect children of the last parent. All previous parents are discarded, so their children will be disregarded unless they are hit via another reference.

While crafting a test case for the issue I was puzzled that I couldn't establish the proper border at which the auto-condition would be fulfilled. As it turns out, there's another bug: if an object is at the tip of any reference we don't mark it as seen. Consequently, if it is the tip of or reachable via another ref, we'd count that object multiple times.

Fix both of these bugs so that we properly count objects without leaking any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
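The tail-pointer convention that this commit message refers to can be sketched as follows. This is an illustration only: `struct node`, `list_append()` and `build_list()` are hypothetical names, and the sketch models the new tail as a return value, which is an assumption of this example. It is not a reproduction of the fix that the commit applied.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list type for illustration; not Git's struct commit_list. */
struct node {
	int value;
	struct node *next;
};

/*
 * Store a new node at the slot `*tail` points to and return the address
 * of the new node's `next` field, which is the new tail of the list.
 */
static struct node **list_append(int value, struct node **tail)
{
	struct node *n = malloc(sizeof(*n));
	if (!n)
		abort();
	n->value = value;
	n->next = NULL;
	*tail = n;
	return &n->next;
}

/* Count the nodes reachable from `head`. */
static size_t list_length(const struct node *head)
{
	size_t len = 0;
	for (; head; head = head->next)
		len++;
	return len;
}

/* Free every node reachable from `head`. */
static void list_free(struct node *head)
{
	while (head) {
		struct node *next = head->next;
		free(head);
		head = next;
	}
}

/*
 * Correct usage: keep the returned tail and pass it to the next call.
 * Passing `&head` on every call instead would overwrite the head each
 * time, so only the last node would stay reachable and the earlier
 * nodes would leak.
 */
static struct node *build_list(const int *values, size_t n)
{
	struct node *head = NULL;
	struct node **tail = &head;

	for (size_t i = 0; i < n; i++)
		tail = list_append(values[i], tail);
	return head;
}
```

In this sketch the misuse pattern corresponds to always passing the address of the list head, which is the shape of the `commit_list_append(parent->item, &stack)` call quoted above.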
2026-01-06The 15th batchJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06Merge branch 'rs/parse-config-expiry-simplify'Junio C Hamano
Code clean-up.

* rs/parse-config-expiry-simplify:
  config: use git_parse_int() in git_config_get_expiry_in_days()
2026-01-06Merge branch 'ar/run-command-hook'Junio C Hamano
Use hook API to replace ad-hoc invocation of hook scripts with the run_command() API.

* ar/run-command-hook:
  receive-pack: convert receive hooks to hook API
  receive-pack: convert update hooks to new API
  hooks: allow callers to capture output
  run-command: allow capturing of collated output
  hook: allow overriding the ungroup option
  reference-transaction: use hook API instead of run-command
  transport: convert pre-push to hook API
  hook: convert 'post-rewrite' hook in sequencer.c to hook API
  hook: provide stdin via callback
  run-command: add stdin callback for parallelization
  run-command: add first helper for pp child states
2026-01-06Merge branch 'rs/show-branch-prio-queue'Junio C Hamano
Code clean-up.

* rs/show-branch-prio-queue:
  show-branch: use prio_queue
2026-01-06Merge branch 'rs/macos-iconv-workaround'Junio C Hamano
Work around the "iconv" shipped as part of macOS, which is broken when handling stateful ISO/IEC 2022 encoded strings.

* rs/macos-iconv-workaround:
  macOS: use iconv from Homebrew if needed and present
  macOS: make Homebrew use configurable
2026-01-06Merge branch 'bc/checkout-error-message-fix'Junio C Hamano
Message fix.

* bc/checkout-error-message-fix:
  checkout: quote invalid treeish in error message
2026-01-06t3650: add more regression tests for failure conditionsKristoffer Haugsbakk
There isn’t much test coverage for basic failure conditions. Let’s add a few more since these are simple to write and remove if they become obsolete.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06replay: die if we cannot parse objectKristoffer Haugsbakk
`parse_object` can return `NULL`. That will in turn make `repo_peel_to_type` return the same.

Let’s die fast and descriptively with the `*_or_die` variant.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06replay: improve code comment and die messageKristoffer Haugsbakk
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06replay: die descriptively when invalid commit-ish is givenKristoffer Haugsbakk
Giving an invalid commit-ish to `--onto` makes git-replay(1) fail with:

    fatal: Replaying down to root commit is not supported yet!

Going backwards from this point:

1. `onto` is `NULL` from `set_up_replay_mode`;
2. that function in turn calls `peel_committish`; and
3. here we return `NULL` if `repo_get_oid` fails.

Let’s die immediately with a descriptive error message instead.

Doing this also provides us with a descriptive error if we “forget” to provide an argument to `--onto` (but we really do unintentionally):[1]

    $ git replay --onto ^main topic1
    fatal: '^main' is not a valid commit-ish

Note that the `--advance` case won’t be triggered in practice because of the “argument to --advance must be a reference” check (see the previous test, and commit).

† 1: The argument to `--onto` is mandatory and the option parser accepts both `--onto=<name>` (stuck form) and `--onto name`. The latter form makes it easy to unintentionally pass something to the option when you really meant to pass a positional argument.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06replay: find *onto only after testing for ref nameKristoffer Haugsbakk
We are about to make `peel_committish` die when it cannot find a commit-ish instead of returning `NULL`. But that would make e.g. `git replay --advance=refs/non-existent` die with a less descriptive error message; the highest-level error message is that the name does not exist as a ref, not that we cannot find a commit-ish based on the name.

Let’s try to find the ref and only after that try to peel it to a commit-ish.

Also add a regression test to protect this error order from future modifications.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06replay: remove dead code and rearrangeKristoffer Haugsbakk
22d99f01 (replay: add --advance or 'cherry-pick' mode, 2023-11-24) both added `--advance` and made one of `--onto` or `--advance` mandatory. But `determine_replay_mode` claims that there is a third alternative, in which neither `--onto` nor `--advance` was given:

    if (onto_name) {
            ...
    } else if (*advance_name) {
            ...
    } else {
            ...
    }

But this is false: the fallthrough else-block is dead code.

Commit 22d99f01 was iterated upon by several people.[1] The initial author wrote code for a sort of *guess mode*, allowing for shorter commands when that was possible. But the next person instead made one of the aforementioned options mandatory. In turn this code was dead on arrival in git.git.

[1]: https://lore.kernel.org/git/CABPp-BEcJqjD4ztsZo2FTZgWT5ZOADKYEyiZtda+d0mSd1quPQ@mail.gmail.com/

Let’s remove this code. We can also join the if-block with the condition `!*advance_name` into the `*onto` block since we do not set `*advance_name` in this function. It only looked like we might set it since the dead code has this line:

    *advance_name = xstrdup_or_null(last_key);

Let’s also rename the function since we do not determine the replay mode here. We just set up `*onto` and refs to update.

Note that there might be more dead code caused by this *guess mode*. We only concern ourselves with this function for now.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-05t1300: use test helpers instead of `test` commandPushkar Singh
Replace `test -f` and `test -h` checks with `test_path_is_file` and `test_path_is_symlink`. Using the test framework helpers provides clearer diagnostics and keeps tests consistent across the suite.

Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-04t7800: fix racy "difftool --dir-diff syncs worktree" testPaul Tarjan
The "difftool --dir-diff syncs worktree without unstaged change" test fails intermittently on Windows CI, as seen at:

    https://github.com/git/git/actions/runs/20624095002/job/59231745784#step:5:416

The root cause is that the original file content and the replacement content have identical sizes:

  - Original: "main\ntest\na\n" = 12 bytes
  - New: "new content\n" = 12 bytes

When difftool's sync-back mechanism checks for changes, it compares stat data between the temporary index and the modified files. If the modification happens within the same timestamp granularity window and file size stays the same, the change goes undetected.

On Windows, this is more likely to manifest because Git relies on inode changes as a fallback when other stat fields match, but Windows filesystems lack inodes.

This is a real bug that could affect users scripting difftool similarly, as seen at:

    https://github.com/git-for-windows/git/issues/5132

Fix the test by changing the replacement content to "modified content" (17 bytes), ensuring the size difference is detected regardless of timestamp resolution or platform-specific stat behavior.

Note: This fixes the test flakiness but not the underlying issue in difftool's change detection. Other tests with same-size file patterns (t0010-racy-git.sh, t2200-add-update.sh) are not affected because they use normal index operations with proper racy-git detection.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Reviewed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
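To make the size coincidence in this commit concrete, here is a small sketch of a stat-style comparison. The `struct snapshot` type and the `snapshot_of()` and `looks_unchanged()` helpers are hypothetical simplifications (real stat data carries more fields, and this is not difftool's actual code); the sketch only assumes that a comparison limited to timestamp and size cannot tell apart two different contents of equal length written within the same timestamp tick.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified snapshot; real stat data has more fields. */
struct snapshot {
	long mtime_sec;  /* modification time at one-second granularity */
	size_t size;     /* file size in bytes */
};

/* Build a snapshot for a file that holds `content`, written at `mtime_sec`. */
static struct snapshot snapshot_of(const char *content, long mtime_sec)
{
	struct snapshot s = { mtime_sec, strlen(content) };
	return s;
}

/* A change goes unnoticed when both the timestamp and the size match. */
static bool looks_unchanged(struct snapshot a, struct snapshot b)
{
	return a.mtime_sec == b.mtime_sec && a.size == b.size;
}
```

With the strings from the commit message, the original content and "new content\n" are both 12 bytes, so they compare as unchanged within one timestamp tick, while "modified content\n" is 17 bytes and is always detected by the size field.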
2025-12-30The 14th batchJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30Merge branch 'jk/test-curl-updates'Junio C Hamano
Update HTTP tests to adjust for changes in curl 8.18.0.

* jk/test-curl-updates:
  t5563: add missing end-of-line in HTTP header
  t5551: handle trailing slashes in expected cookies output
2025-12-30Merge branch 'jc/object-read-stream-fix'Junio C Hamano
Fix a performance regression in a recently graduated topic.

* jc/object-read-stream-fix:
  odb: do not use "blank" substitute for NULL
2025-12-30Merge branch 'js/test-func-comment-fix'Junio C Hamano
Comment fix.

* js/test-func-comment-fix:
  test_detect_ref_format: fix comment
2025-12-30Merge branch 'gf/clear-path-cache-cleanup'Junio C Hamano
Code clean-up.

* gf/clear-path-cache-cleanup:
  repository: remove duplicate free of cache->squash_msg
2025-12-30Merge branch 'gf/maintenance-is-needed-fix'Junio C Hamano
Brown-paper-bag fix to a recently graduated 'kn/maintenance-is-needed' topic.

* gf/maintenance-is-needed-fix:
  refs: dereference the value of the required pointer
2025-12-30Merge branch 'dk/ci-rust-fix'Junio C Hamano
Build fix.

* dk/ci-rust-fix:
  rust: build correctly without GNU sed
2025-12-30Merge branch 'mh/doc-core-attributesfile'Junio C Hamano
Doc update.

* mh/doc-core-attributesfile:
  docs: note the type of core.attributesfile
2025-12-30Merge branch 'ps/repack-avoid-noop-midx-rewrite'Junio C Hamano
Even when there are no changes in the packfiles and no need to recompute bitmaps, "git repack" recomputed and updated the MIDX file, which has been corrected.

* ps/repack-avoid-noop-midx-rewrite:
  midx-write: skip rewriting MIDX with `--stdin-packs` unless needed
  midx-write: extract function to test whether MIDX needs updating
  midx: fix `BUG()` when getting preferred pack without a reverse index
2025-12-30Merge branch 'js/test-symlink-windows'Junio C Hamano
Prepare test suite for Git for Windows that supports symbolic links.

* js/test-symlink-windows:
  t7800: work around the MSYS path conversion on Windows
  t6423: introduce Windows-specific handling for symlinking to /dev/null
  t1305: skip symlink tests that do not apply to Windows
  t1006: accommodate for symlink support in MSYS2
  t0600: fix incomplete prerequisite for a test case
  t0301: another fix for Windows compatibility
  t0001: handle `diff --no-index` gracefully
  mingw: special-case `open(symlink, O_CREAT | O_EXCL)`
  apply: symbolic links lack a "trustable executable bit"
  t9700: accommodate for Windows paths
2025-12-30Merge branch 'jt/doc-rev-list-filter-provided-objects'Junio C Hamano
Document "rev-list --filter-provided-objects" better.

* jt/doc-rev-list-filter-provided-objects:
  docs: clarify git-rev-list(1) --filter behavior
2025-12-30Merge branch 'jt/repo-struct-more-objinfo'Junio C Hamano
More object database related information is shown in "git repo structure" output.

* jt/repo-struct-more-objinfo:
  builtin/repo: add object disk size info to structure table
  builtin/repo: add disk size info to keyvalue stucture output
  builtin/repo: add inflated object info to structure table
  builtin/repo: add inflated object info to keyvalue structure output
  builtin/repo: humanise count values in structure output
  strbuf: split out logic to humanise byte values
  builtin/repo: group per-type object values into struct
2025-12-30diff: avoid segfault with freed entriesDerrick Stolee
When computing a diff in a partial clone, there is a chance that we could trigger a prefetch of missing objects at the same time as we are freeing entries from the global diff queue.

This is difficult to reproduce, as we need to have some objects be freed from the queue before triggering the prefetch of missing objects. There is a new test in t4067 that does trigger the segmentation fault that results in this case.

The fix is to set the queue pointer to NULL after it is freed, and then to be careful about NULL values in the prefetch.

The more elaborate explanation is that within diffcore_std(), we may skip the initial prefetch due to the output format (--name-only in the test) and go straight to diffcore_skip_stat_unmatch(). In that method, the index entries that have been invalidated by path changes show up as entries but may be deleted because they are not actually content diffs and only have newer timestamps than expected. As those entries are deleted, later entries are checked with diff_filespec_check_stat_unmatch(), which uses diff_queued_diff_prefetch() as the missing_object_cb in its diff options. That can trigger downloading missing objects if the appropriate scenario occurs to trigger a call to diff_populate_filespec(). It's finally within that callback to diff_queued_diff_prefetch() that the segfault occurs.

The test was hard to find because it required some real differences, some not-different files that had a newer modified time, and the order of those files alphabetically was important to trigger the deletion before the prefetch was triggered.

I briefly considered a "lock" member for the diff queue, but it was a much larger diff and introduced many more possible error scenarios.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
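The "set the pointer to NULL after freeing, then skip NULL entries" pattern named in this commit message can be sketched as below. All names (`struct entry`, `struct queue`, `queue_drop()`, `count_prefetch_candidates()`) are hypothetical and much simpler than Git's diff queue; the sketch only assumes that a later pass may walk the queue while some slots have already been released.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not Git's diff queue types. */
struct entry {
	int missing;            /* 1 if the object would need to be fetched */
};

struct queue {
	struct entry **items;
	size_t nr;
};

/* Fill the queue with `nr` entries that are all marked as missing. */
static void queue_init(struct queue *q, size_t nr)
{
	q->items = calloc(nr, sizeof(*q->items));
	if (nr && !q->items)
		abort();
	q->nr = nr;
	for (size_t i = 0; i < nr; i++) {
		q->items[i] = malloc(sizeof(*q->items[i]));
		if (!q->items[i])
			abort();
		q->items[i]->missing = 1;
	}
}

/* Free one entry and clear its slot so later passes can detect that. */
static void queue_drop(struct queue *q, size_t i)
{
	free(q->items[i]);
	q->items[i] = NULL;
}

/* Walk the queue as a prefetch pass would, skipping freed slots. */
static size_t count_prefetch_candidates(const struct queue *q)
{
	size_t count = 0;
	for (size_t i = 0; i < q->nr; i++) {
		const struct entry *e = q->items[i];
		if (!e)
			continue; /* slot was freed earlier; do not dereference */
		if (e->missing)
			count++;
	}
	return count;
}

/* Release everything that is still allocated. */
static void queue_clear(struct queue *q)
{
	for (size_t i = 0; i < q->nr; i++)
		free(q->items[i]); /* free(NULL) is a no-op */
	free(q->items);
	q->items = NULL;
	q->nr = 0;
}
```

Without the assignment in `queue_drop()`, the walking pass in this sketch would dereference a dangling pointer, which is the general shape of the use-after-free described above.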
2025-12-30t5403: use test_path_is_file instead of test -fDeveshi Dwivedi
Replace 'test -f' with the test_path_is_file helper in t5403-post-checkout-hook.sh. This helper provides better error messages when tests fail, making it easier to debug issues.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30merge-ort: fix corner case recursive submodule/directory conflict handlingElijah Newren
At GitHub, a few repositories were triggering errors of the form: git: merge-ort.c:3037: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed. Aborted (core dumped) While these may look similar to both a562d90a350d (merge-ort: fix failing merges in special corner case, 2025-11-03) and f6ecb603ff8a (merge-ort: fix directory rename on top of source of other rename/delete, 2025-08-06) the cause is different and in this case the problem is not an over-conservative assertion, but a bug before the assertion where we did not update all relevant state appropriately. It sadly took me a really long time to figure out how to get a simple reproducer for this one. It doesn't really have that many moving parts, but there are multiple pieces of background information needed to understand it. First of all, when we have two files added at the same path, merge-ort does a two-way merge of those files. If we have two directories added at the same path, we basically do the same thing (taking the union of files, and two-way merging files with the same name). But two-way merging requires components of the same type. We can't merge the contents of a regular file with a directory, or with a symlink, or with a submodule. Nor can any of those other types be merged with each other, e.g. merging a submodule with a directory is a bad idea. When two paths have the same name but their types do not match, merge-ort is forced to move one of them to an alternate filename (using the unique_path() function). Second, if two commits being merged have more than one merge-base, merge-ort will merge the merge-bases to create a virtual merge-base, and use that as the base commit. Third, one of the really important optimizations in merge-ort is trivial tree-level resolution (roughly meaning merging trees without recursing into them). 
This optimization has some nuance to it that is important to the current bug, and to understand it, it helps to first look at the high-level overview of how merge-ort runs; there are basically three high-level functions that the work is divided between:

    collect_merge_info() - walks the top-level trees getting individual paths of interest
    detect_renames() - detect renames between paths in order to match up paths for three-way merging
    process_entries() - does a few things of interest:
      * three-way merging of files,
      * other special handling (e.g. adjusting paths with conflicting types to avoid path collisions)
      * as it finishes handling all the files within a subdirectory, writes out a new tree object for that directory

If it were not for renames, we could just always do tree-level merging whenever the tree on at least one side was unmodified. Unfortunately, we need to recurse into trees to determine whether there are renames. However, we can also do tree-level merging so long as there aren't any *relevant* renames (another merge-ort optimization), which we can determine without recursing into trees. We would also be able to do tree-level merging if we somehow knew a priori what renames existed, by only recursing into the trees which we could otherwise trivially merge if they contained files involved in renames.
That might not seem useful, because we need to find out the renames and we have to recurse into trees to do so, but when you find out that the process_entries() step is more computationally expensive than the collect_merge_info() step, it yields an interesting strategy:

  * run collect_merge_info()
  * run detect_renames()
  * cache the renames
  * restart -- rerun collect_merge_info(), using the cached renames to only recurse into the needed trees
  * we already have the renames cached so no need to re-detect
  * run process_entries() on the reduced list of paths

which was implemented back in 7bee6c100431 (merge-ort: avoid recursing into directories when we don't need to, 2021-07-16).

Crucially, this restarting only occurs if the number of paths we could skip recursing into exceeds the number we still need to recurse into by some safety factor (wanted_factor in handle_deferred_entries()); forgetting this fact is a great way to repeatedly fail to create a minimal testcase for several days and go down alternate wrong paths.

Now, I earlier summarized this optimization as "merging trees without recursing into them", but this optimization does not require that all three sides of history have a directory at a given path. So long as the tree on one side matches the tree in the base version, we can decide to resolve in favor of whatever the other side of history has at that path -- be it a directory, a file, a submodule, or a symlink. Unfortunately, the code in question didn't fully realize this, and was written assuming the base version and both sides would have a directory at the given path, as can be seen by the "ci->filemask == 0" comment in resolve_trivial_directory_merge() that was added as part of 7bee6c100431 (merge-ort: avoid recursing into directories when we don't need to, 2021-07-16). A few additional lines of code are needed to handle cases where we have something other than a directory on the other side of history.
But, knowing that resolve_trivial_directory_merge() doesn't have sufficient state updating logic doesn't show us how to trigger a bug without combining with the other bits of information we provided above. Here's a relevant testcase:

  * branches A & B
  * commit A1: adds "folder" as a directory with files tracked under it
  * commit B1: adds "folder" as a submodule
  * commit A2: merges B1 into A1, keeping "folder" as a directory (and in fact, with no changes to "folder" since A1), discarding the submodule
  * commit B2: merges A1 into B1, keeping "folder" as a submodule (and in fact, with no changes to "folder" since B1), discarding the directory

Here, if we try to merge A2 & B2, the logic proceeds as follows:

  * we have multiple merge-bases: A1 & B1. So we have to merge those to get a virtual merge base.
  * due to "folder" as a directory and "folder" as a submodule, the path collision logic triggers and renames "folder" as a submodule to "folder~Temporary merge branch 2" so we can keep it alongside "folder" as a directory.
  * we now have a virtual merge base (containing both "folder" directory and a "folder~Temporary merge branch 2" submodule) and can now do the outer merge
  * in the first step of the outer merge, we attempt to defer recursing into folder/ as a directory, but find we need to for rename detection.
  * in rename detection, we note that "folder~Temporary merge branch 2" has the same hash as "folder" as a submodule in B2, which means we have an exact rename.
  * after rename detection, we discover no path in folder/ is needed for renames, and so we can cache renames and restart.
  * after restarting, we avoid recursing into "folder/" and realize we can resolve it trivially since it hasn't been modified. The resolution removes "folder/", leaving us only "folder" as a submodule from commit B2.
  * After this point, we should have a rename/delete conflict on "folder~Temporary merge branch 2" -> "folder", but our marking of the merge of "folder" as clean broke our ability to handle that and in fact triggers an assertion in process_renames().

When there was a df_conflict (directory/"file" conflict, where "file" could be submodule or regular file or symlink), ensure resolve_trivial_directory_merge() handles it properly. In particular:

  * do not pre-emptively mark the path as cleanly merged if the remaining path is a file; allow it to be processed in process_entries() later to determine if it was clean
  * clear the parts of dirmask or filemask corresponding to the matching sides of history, since we are resolving those away
  * clear the df_conflict bit afterwards; since we cleared away the two matching sides and only have one side left, that one side can't have a directory/file conflict with itself.

Also add the above minimal testcase showcasing this bug to t6422, **with a sufficient number of paths under the folder/ directory to actually trigger it**. (I wish I could have all those days back from all the wrong paths I went down due to not having enough files under that directory...)

I know this commit has a very high ratio of lines in the commit message to lines of comments, and a relatively high ratio of comments to actual code, but given how long it took me to track down, on the off chance that we ever need to further modify this logic, I wanted it thoroughly documented for future me and for whatever other poor soul might end up needing to read this commit message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
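The restart condition that this commit message stresses (restart only when the skippable paths outnumber the still-needed paths by a safety factor) can be written as a one-line predicate. This is a sketch of one reading of the prose only: the function name `worth_restarting()`, its parameters, and the exact form of the comparison are assumptions and are not taken from merge-ort.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: restart the merge with cached renames only if
 * the number of paths we could skip exceeds the number of paths we still
 * need by at least `factor` times. The comparison form is an assumption.
 */
static bool worth_restarting(size_t skippable, size_t still_needed,
			     size_t factor)
{
	return skippable > still_needed * factor;
}
```

Under this reading, a test directory with only a couple of files never crosses the threshold, which is consistent with the author's remark that the testcase needs a sufficient number of paths under folder/ to trigger the bug.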
2025-12-29tag: stop using the_repositoryRené Scharfe
gpg_verify_tag() shows the passed in object name on error. Both callers provide one. It falls back to abbreviated hashes for future callers that pass in a NULL name. DEFAULT_ABBREV is default_abbrev, which in turn is a global variable that's populated by git_default_config() and only available with USE_THE_REPOSITORY_VARIABLE.

Don't let that hypothetical hold us back from getting rid of the_repository in tag.c. Fall back to full hashes, which are more appropriate for error messages anyway. This allows us to stop setting USE_THE_REPOSITORY_VARIABLE.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29tag: support arbitrary repositories in parse_tag()René Scharfe
Allow callers of parse_tag() to pass in the repository to use. Let most of them pass in the_repository to get the same result as before. One of them has stopped using the_repository in ef9b0370da (sha1-name.c: store and use repo in struct disambiguate_state, 2019-04-16); let it pass in its stored repository.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>