aboutsummaryrefslogtreecommitdiff
path: root/t/t1092-sparse-checkout-compatibility.sh
AgeCommit message (Collapse)Author
2026-02-06merge-ours: integrate with sparse-indexSam Bostock
The merge-ours built-in opens the index to compare it against HEAD. The machinery used to do this (i.e. run_diff_index()) is capable of working with a sparse index, but the start-up sequence of this command does not take the necessary steps, so we end up expanding the index fully before doing the comparison. In order to convince sparse-index.c:is_sparse_index_allowed() to return true, we need to: - Read basic configuration with git_default_config so that global variables like core_apply_sparse_checkout are populated. merge-ours currently does not read configuration at all. - Set command_requires_full_index to 0. With that, the command can work without expanding the index fully before doing its work. Signed-off-by: Sam Bostock <sam@sambostock.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-28ls-files: conditionally leave index sparseDerrick Stolee
When running 'git ls-files' with a pathspec, the index entries get filtered according to that pathspec before iterating over them in show_files(). In 78087097b8 (ls-files: add --sparse option, 2021-12-22), this iteration was prefixed with a check for the '--sparse' option which allows the command to output directory entries; this created a pre-loop call to ensure_full_index(). However, when a user runs 'git ls-files' where the pathspec matches directories that are recursively matched in the sparse-checkout, there are not any sparse directories that match the pathspec so they would not be written to the output. The expansion in this case is just a performance drop for no behavior difference. Replace this global check to expand the index with a check inside the loop for a matched sparse directory. If we see one, then expand the index and continue from the current location. This is safe since the previous entries in the index did not have any sparse directories and thus would remain stable in this expansion. A test in t1092 confirms that this changes the behavior. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-27Merge branch 'ds/sparse-apply-add-p'Junio C Hamano
"git apply" and "git add -i/-p" code paths no longer unnecessarily expand sparse-index while working. * ds/sparse-apply-add-p: p2000: add performance test for patch-mode commands reset: integrate sparse index with --patch git add: make -p/-i aware of sparse index apply: integrate with the sparse index
2025-05-16reset: integrate sparse index with --patchDerrick Stolee
Similar to the previous change for 'git add -p', the reset builtin checked for integration with the sparse index after possibly redirecting its logic toward the interactive logic. This means that the builtin would expand the sparse index to a full one upon read. Move this check earlier within cmd_reset() to improve performance here. Add tests to guarantee that we are not universally expanding the index. Add behavior tests to check that we are doing the same operations as a full index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16git add: make -p/-i aware of sparse indexDerrick Stolee
It is slow to expand a sparse index in-memory due to parsing of trees. We aim to minimize that performance cost when possible. 'git add -p' uses 'git apply' child processes to modify the index, but still there are some expansions that occur. It turns out that control flows out of cmd_add() in the interactive cases before the lines that confirm that the builtin is integrated with the sparse index. Moving that integration point earlier in cmd_add() allows 'git add -i' and 'git add -p' to operate without expanding a sparse index to a full one. Add test cases that confirm that these interactive add options work with the sparse index. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16apply: integrate with the sparse indexDerrick Stolee
The sparse index allows storing directory entries in the index, marked with the skip-wortkree bit and pointing to a tree object. This may be an unexpected data shape for some implementation areas, so we are rolling it out incrementally on a builtin-per-builtin basis. This change enables the sparse index for 'git apply'. The main motivation for this change is that 'git apply' is used as a child process of 'git add -p' and expanding the sparse index for each of those child processes can lead to significant performance issues. The good news is that the actual index manipulation code used by 'git apply' is already integrated with the sparse index, so the only product change is to mark the builtin as allowing the sparse index so it isn't inflated on read. The more involved part of this change is around adding tests that verify how 'git apply' behaves in a sparse-checkout environment and whether or not the index expands in certain operations. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithmElijah Newren
Both of these existed to allow us to reuse all the merge-related tests in the testsuite while easily flipping between the 'recursive' and the 'ort' backends. Now that we have removed merge-recursive and remapped 'recursive' to mean 'ort', we don't need this scaffolding anymore. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: remove TEST_PASSES_SANITIZE_LEAK annotationsPatrick Steinhardt
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-17t: fix typosAndrew Kreimer
Fix typos in documentation, comments, etc. Via codespell. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-10Merge branch 'ps/leakfixes-part-8'Junio C Hamano
More leakfixes. * ps/leakfixes-part-8: (23 commits) builtin/send-pack: fix leaking list of push options remote: fix leaking push reports t/helper: fix leaks in proc-receive helper pack-write: fix return parameter of `write_rev_file_order()` revision: fix leaking saved parents revision: fix memory leaks when rewriting parents midx-write: fix leaking buffer pack-bitmap-write: fix leaking OID array pseudo-merge: fix leaking strmap keys pseudo-merge: fix various memory leaks line-log: fix several memory leaks diff: improve lifecycle management of diff queues builtin/revert: fix leaking `gpg_sign` and `strategy` config t/helper: fix leaking repository in partial-clone helper builtin/clone: fix leaking repo state when cloning with bundle URIs builtin/pack-redundant: fix various memory leaks builtin/stash: fix leaking `pathspec_from_file` submodule: fix leaking submodule entry list wt-status: fix leaking buffer with sparse directories shell: fix leaking strings ...
2024-10-02Merge branch 'ds/sparse-checkout-expansion-advice'Junio C Hamano
When "git sparse-checkout disable" turns a sparse checkout into a regular checkout, the index is fully expanded. This totally expected behaviour however had an "oops, we are expanding the index" advice message, which has been corrected. * ds/sparse-checkout-expansion-advice: sparse-checkout: disable advice in 'disable'
2024-09-30wt-status: fix leaking buffer with sparse directoriesPatrick Steinhardt
When hitting a sparse directory in `wt_status_collect_changes_initial()` we use a `struct strbuf` to assemble the directory's name. We never free that buffer though, causing a memory leak. Fix the leak by releasing the buffer. While at it, move the buffer outside of the loop and reset it to save on some wasteful allocations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23sparse-checkout: disable advice in 'disable'Derrick Stolee
When running 'git sparse-checkout disable' with the sparse index enabled, Git is expected to expand the index into a full index. However, it currently outputs the advice message saying that that is unexpected and likely due to an issue with the working directory. Disable this advice message when in this code path. Establish a pattern for doing a similar removal in the future. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04builtin/cat-file: mark 'git cat-file' sparse-index compatibleKevin Lyles
This change affects how 'git cat-file' works with the index when specifying an object with the ":<path>" syntax (which will give file contents from the index). 'git cat-file' expands a sparse index to a full index any time contents are requested from the index by specifying an object with the ":<path>" syntax. This is true even when the requested file is part of the sparse index, and results in much slower 'git cat-file' operations when working within the sparse index. Mark 'git cat-file' as not needing a full index, so that you only pay the cost of expanding the sparse index to a full index when you request a file outside of the sparse index. Add tests to ensure both that: - 'git cat-file' returns the correct file contents whether or not the file is in the sparse index - 'git cat-file' expands to the full index any time you request something outside of the sparse index Signed-off-by: Kevin Lyles <klyles+github@epic.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04t1092: allow run_on_* functions to use standard inputKevin Lyles
The 'run_on_sparse' and 'run_on_all' functions do not work correctly for commands accepting standard input, because they run the same command multiple times and the first instance consumes it. This also indirectly affects 'test_all_match' and 'test_sparse_match'. To allow these functions to work with commands accepting standard input, first slurp standard input to a temporary file, and then run the command with its standard input redirected from the temporary file. This ensures that each command sees the same contents from its standard input. Note that this does not impact commands that do not read from standard input; they continue to ignore it. Additionally, existing uses of the run_on_* functions do not need to do anything differently, as the standard input of the test environment is already connected to /dev/null. We do not explicitly clean up the input files because they are cleaned up with the rest of the test repositories and their contents may be useful for figuring out which command failed when a test case fails. Signed-off-by: Kevin Lyles <klyles@epic.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22diff-index: integrate with the sparse indexDerrick Stolee
The sparse index allows focusing the index data structure on the files present in the sparse-checkout, leaving only tree entries for directories not within the sparse-checkout. Each builtin needs a repository setting to indicate that it has been tested with the sparse index before Git will allow the index to be loaded into memory in its sparse form. This is a safety precaution. There are still some builtins that haven't been integrated due to the complexity of the integration and the lack of significant use. However, 'git diff-index' was neglected only because of initial data showing low usage. The diff machinery was already integrated and there is no more work to be done there but add some tests to be sure 'git diff-index' behaves as expected. For this purpose, we can follow the testing pattern used in 51ba65b5c35 (diff: enable and test the sparse index, 2021-12-06). One difference here is that we only verify that the sparse index case agrees with the full index case, but do not generate the expected output. The 'git diff' tests use the '--name-status' option to ease the creation of the expected output, but that's not an option for 'diff-index'. Since the underlying diff machinery is the same, a simple comparison is sufficient to give some coverage. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08advice: warn when sparse index expandsDerrick Stolee
Typically, forcing a sparse index to expand to a full index means that Git could not determine the status of a file outside of the sparse-checkout and needed to expand sparse trees into the full list of sparse blobs. This operation can be very slow when the sparse-checkout is much smaller than the full tree at HEAD. When users are in this state, there is usually a modified or untracked file outside of the sparse-checkout mentioned by the output of 'git status'. There are a number of reasons why this is insufficient: 1. Users may not have a full understanding of which files are inside or outside of their sparse-checkout. This is more common in monorepos that manage the sparse-checkout using custom tools that map build dependencies into sparse-checkout definitions. 2. In some cases, an empty directory could exist outside the sparse-checkout and these empty directories are not reported by 'git status' and friends. 3. If the user has '.gitignore' or 'exclude' files, then 'git status' will squelch the warnings and not demonstrate any problems. In order to help users who are in this state, add a new advice message to indicate that a sparse index is expanded to a full index. This message should be written at most once per process, so add a static global 'give_advice_on_expansion' to sparse-index.c. Further, there is a case in 'git sparse-checkout set' that uses the sparse index as an in-memory data structure (even when writing a full index) so we need to disable the message in that kind of case. The t1092-sparse-checkout-compatibility.sh test script compares the behavior of several Git commands across full and sparse repositories, including sparse repositories with and without a sparse index. We need to disable the advice in the sparse-index repo to avoid differences in stderr. By leaving the advice on in the sparse-checkout repo (without the sparse index), we can test the behavior of disabling the advice in convert_to_sparse(). (Indeed, these tests are how that necessity was discovered.) Add a test that reenables the advice and demonstrates that the message is output. The advice message is defined outside of expand_index() to avoid super- wide lines. It is also defined as a macro to avoid compile issues with -Werror=format-security. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-08Merge branch 'jc/test-i18ngrep'Junio C Hamano
Another step to deprecate test_i18ngrep. * jc/test-i18ngrep: tests: teach callers of test_i18ngrep to use test_grep test framework: further deprecate test_i18ngrep
2023-11-02tests: teach callers of test_i18ngrep to use test_grepJunio C Hamano
They are equivalents and the former still exists, so as long as the only change this commit makes are to rewrite test_i18ngrep to test_grep, there won't be any new bug, even if there still are callers of test_i18ngrep remaining in the tree, or when merged to other topics that add new uses of test_i18ngrep. This patch was produced more or less with git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' | xargs perl -p -i -e 's/test_i18ngrep /test_grep /' and a good way to sanity check the result yourself is to run the above in a checkout of c4603c1c (test framework: further deprecate test_i18ngrep, 2023-10-31) and compare the resulting working tree contents with the result of applying this patch to the same commit. You'll see that test_i18ngrep in a few t/lib-*.sh files corrected, in addition to the manual reproduction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11check-attr: integrate with sparse-indexShuqi Liang
Set the requires-full-index to false for "check-attr". Add a test to ensure that the index is not expanded whether the files are outside or inside the sparse-checkout cone when the sparse index is enabled. The `p2000` tests demonstrate a ~63% execution time reduction for 'git check-attr' using a sparse index. Test before after ----------------------------------------------------------------------- 2000.106: git check-attr -a f2/f4/a (full-v3) 0.05 0.05 +0.0% 2000.107: git check-attr -a f2/f4/a (full-v4) 0.05 0.05 +0.0% 2000.108: git check-attr -a f2/f4/a (sparse-v3) 0.04 0.02 -50.0% 2000.109: git check-attr -a f2/f4/a (sparse-v4) 0.04 0.01 -75.0% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11attr.c: read attributes in a sparse directoryShuqi Liang
Before this patch, git check-attr was unable to read the attributes from a .gitattributes file within a sparse directory. The original comment was operating under the assumption that users are only interested in files or directories inside the cones. Therefore, in the original code, in the case of a cone-mode sparse-checkout, we didn't load the .gitattributes file. However, this behavior can lead to missing attributes for files inside sparse directories, causing inconsistencies in file handling. To resolve this, revise 'git check-attr' to allow attribute reading for files in sparse directories from the corresponding .gitattributes files: 1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse to check if a path falls within a sparse directory. 2.If path is inside a sparse directory, employ the value of index_name_pos_sparse() to find the sparse directory containing path and path relative to sparse directory. Proceed to read attributes from the tree OID of the sparse directory using read_attr_from_blob(). 3.If path is not inside a sparse directory,ensure that attributes are fetched from the index blob with read_blob_data_from_index(). Change the test 'check-attr with pathspec outside sparse definition' to 'test_expect_success' to reflect that the attributes inside a sparse directory can now be read. Ensure that the sparse index case works correctly for git check-attr to illustrate the successful handling of attributes within sparse directories. Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-11t1092: add tests for 'git check-attr'Shuqi Liang
Add tests for `git check-attr`, make sure attribute file does get read from index when path is either inside or outside of sparse-checkout definition. Add a test named 'diff --check with pathspec outside sparse definition'. It starts by disabling the trailing whitespace and space-before-tab checks using the core. whitespace configuration option. Then, it specifically re-enables the trailing whitespace check for a file located in a sparse directory by adding a whitespace=trailing-space rule to the .gitattributes file within that directory. Next, create and populate the folder1 directory, and then add the .gitattributes file to the index. Edit the contents of folder1/a, add it to the index, and proceed to "re-sparsify" 'folder1/' with 'git sparse-checkout reapply'. Finally, use 'git diff --check --cached' to compare the 'index vs. HEAD', ensuring the correct application of the attribute rules even when the file's path is outside the sparse-checkout definition. Mark the two tests 'check-attr with pathspec outside sparse definition' and 'diff --check with pathspec outside sparse definition' as 'test_expect_failure' to reflect an existing issue where the attributes inside a sparse directory are ignored. Ensure that the 'check-attr' command fails to read the required attributes to demonstrate this expected failure. Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-23Merge branch 'sl/worktree-sparse'Junio C Hamano
"git worktree" learned to work better with sparse index feature. * sl/worktree-sparse: worktree: integrate with sparse-index
2023-06-13Merge branch 'sl/diff-tree-sparse'Junio C Hamano
"git diff-tree" has been taught to take advantage of the sparse-index feature. * sl/diff-tree-sparse: diff-tree: integrate with sparse index
2023-06-12worktree: integrate with sparse-indexShuqi Liang
The index is read in 'worktree.c' at two points: 1.The 'validate_no_submodules' function, which checks if there are any submodules present in the worktree. 2.The 'check_clean_worktree' function, which verifies if a worktree is 'clean', i.e., there are no untracked or modified but uncommitted files. This is done by running the 'git status' command, and an error message is thrown if the worktree is not clean. Given that 'git status' is already sparse-aware, the function is also sparse-aware. Hence we can just set the requires-full-index to false for "git worktree". Add tests that verify that 'git worktree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~20% execution time reduction for 'git worktree' using a sparse index: (Note:the p2000 test results didn't reflect the huge speedup because of the index reading time is minuscule comparing to the filesystem operations.) Test before after ----------------------------------------------------------------------- 2000.102: git worktree add....(full-v3) 3.15 2.82 -10.5% 2000.103: git worktree add....(full-v4) 3.14 2.84 -9.6% 2000.104: git worktree add....(sparse-v3) 2.59 2.14 -16.4% 2000.105: git worktree add....(sparse-v4) 2.10 1.57 -25.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Acked-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-25Merge branch 'sl/sparse-write-tree-part-2'Junio C Hamano
Fix-up to a topic already graduated to 'master'. * sl/sparse-write-tree-part-2: t1092: update a write-tree test
2023-05-18diff-tree: integrate with sparse indexShuqi Liang
The index is read in 'cmd_diff_tree' at two points: 1. The first index read was added in fd66bcc31ff (diff-tree: read the index so attribute checks work in bare repositories, 2017-12-06) to deal with reading '.gitattributes' content. 77efbb366ab (attr: be careful about sparse directories, 2021-09-08) established that, in a sparse index, we do _not_ try to load a '.gitattributes' file from within a sparse directory. 2. The second index access point is involved in rename detection, specifically when reading from stdin.This was initially added in f0c6b2a2fd9 ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where 'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'. That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in ff7fe37b053 (diff.c: move read_index() code back to the caller, 2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as of 6e0b8ed6d35 (diff.c: do not use a separate "size cache"., 2007-05-07) and nothing about 'detect_rename' otherwise indicates index usage. Hence we can just set the requires-full-index to false for "diff-tree". Add tests that verify that 'git diff-tree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~98% execution time reduction for 'git diff-tree' using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-tree HEAD (full-v3) 0.05 0.04 -20.0% 2000.95: git diff-tree HEAD (full-v4) 0.06 0.05 -16.7% 2000.96: git diff-tree HEAD (sparse-v3) 0.59 0.01 -98.3% 2000.97: git diff-tree HEAD (sparse-v4) 0.61 0.01 -98.4% 2000.98: git diff-tree HEAD -- f2/f4/a (full-v3) 0.05 0.05 +0.0% 2000.99: git diff-tree HEAD -- f2/f4/a (full-v4) 0.05 0.04 -20.0% 2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3) 0.58 0.01 -98.3% 2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4) 0.55 0.01 -98.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09diff-files: integrate with sparse indexShuqi Liang
Remove full index requirement for `git diff-files`. Refactor the ensure_expanded and ensure_not_expanded functions by introducing a common helper function, ensure_index_state. Add test to ensure the index is no expanded in `git diff-files`. The `p2000` tests demonstrate a ~96% execution time reduction for 'git diff-files' and a ~97% execution time reduction for 'git diff-files' for a file using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-files (full-v3) 0.09 0.08 -11.1% 2000.95: git diff-files (full-v4) 0.09 0.09 +0.0% 2000.96: git diff-files (sparse-v3) 0.52 0.02 -96.2% 2000.97: git diff-files (sparse-v4) 0.51 0.02 -96.1% 2000.98: git diff-files -- f2/f4/a (full-v3) 0.06 0.07 +16.7% 2000.99: git diff-files -- f2/f4/a (full-v4) 0.08 0.08 +0.0% 2000.100: git diff-files -- f2/f4/a (sparse-v3) 0.46 0.01 -97.8% 2000.101: git diff-files -- f2/f4/a (sparse-v4) 0.51 0.02 -96.1% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-09t1092: add tests for `git diff-files`Shuqi Liang
Before integrating the 'git diff-files' builtin with the sparse index feature, add tests to t1092-sparse-checkout-compatibility.sh to ensure it currently works with sparse-checkout and will still work with sparse index after that integration. When adding tests against a sparse-checkout definition, we test two modes: all changes are within the sparse-checkout cone and some changes are outside the sparse-checkout cone. In order to have staged changes outside of the sparse-checkout cone, make a directory called 'folder1' and copy `a` into 'folder1/a'. 'folder1/a' is identical to `a` in the base commit. These make 'folder1/a' in the index, while leaving it outside of the sparse-checkout definition. Change content inside 'folder1/a' in order to test 'folder1/a' being present on-disk with modifications. Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-08t1092: update a write-tree testShuqi Liang
* 'on all' in the title of the test 'write-tree on all' was unclear; remove it. * Add a baseline 'test_all_match git write-tree' before making any changes to the index, providing a reference point for the 'write-tree' prior to any modifications. * Add a comparison of the output of 'git status --porcelain=v2' to test the working tree after 'write-tree' exits. * Ensure SKIP_WORKTREE files weren't materialized on disk by using "test_path_is_missing". Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-21Merge branch 'rn/sparse-describe'Junio C Hamano
"git describe --dirty" learns to work better with sparse-index. * rn/sparse-describe: describe: enable sparse index for describe
2023-04-04write-tree: integrate with sparse indexShuqi Liang
Update 'git write-tree' to allow using the sparse-index in memory without expanding to a full one. The recursive algorithm for update_one() was already updated in 2de37c5 (cache-tree: integrate with sparse directory entries, 2021-03-03) to handle sparse directory entries in the index. Hence we can just set the requires-full-index to false for "write-tree". The `p2000` tests demonstrate a ~96% execution time reduction for 'git write-tree' using a sparse index: Test before after ----------------------------------------------------------------- 2000.78: git write-tree (full-v3) 0.34 0.33 -2.9% 2000.79: git write-tree (full-v4) 0.32 0.30 -6.3% 2000.80: git write-tree (sparse-v3) 0.47 0.02 -95.8% 2000.81: git write-tree (sparse-v4) 0.45 0.02 -95.6% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-03describe: enable sparse index for describeRaghul Nanth A
git describe compares the index with the working tree when (and only when) it is run with the "--dirty" flag. This is done by the run_diff_index() function. The function has been made aware of the sparse-index in the series that led to 8d2c3732 (Merge branch 'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the requires-full-index to false for "describe". Performance metrics Test HEAD~1 HEAD ------------------------------------------------------------------------------------------------- 2000.2: git describe --dirty (full-v3) 0.08(0.09+0.01) 0.08(0.06+0.03) +0.0% 2000.3: git describe --dirty (full-v4) 0.09(0.07+0.03) 0.08(0.05+0.04) -11.1% 2000.4: git describe --dirty (sparse-v3) 0.88(0.82+0.06) 0.02(0.01+0.05) -97.7% 2000.5: git describe --dirty (sparse-v4) 0.68(0.60+0.08) 0.02(0.02+0.04) -97.1% 2000.6: echo >>new && git describe --dirty (full-v3) 0.08(0.04+0.05) 0.08(0.05+0.04) +0.0% 2000.7: echo >>new && git describe --dirty (full-v4) 0.08(0.07+0.03) 0.08(0.05+0.04) +0.0% 2000.8: echo >>new && git describe --dirty (sparse-v3) 0.75(0.69+0.07) 0.02(0.03+0.03) -97.3% 2000.9: echo >>new && git describe --dirty (sparse-v4) 0.81(0.73+0.09) 0.02(0.01+0.05) -97.5% Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17Sync with v2.38.1Junio C Hamano
2022-10-10Merge branch 'sy/sparse-grep'Junio C Hamano
"git grep" learned to expand the sparse-index more lazily and on demand in a sparse checkout. * sy/sparse-grep: builtin/grep.c: integrate with sparse index
2022-10-06Sync with 2.37.4Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.36.3Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.35.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01t1092: prepare for changing protocol.file.allowTaylor Blau
Explicitly cloning over the "file://" protocol in t1092 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-23builtin/grep.c: integrate with sparse indexShaoxuan Yuan
Turn on sparse index and remove ensure_full_index(). Before this patch, `git-grep` utilizes the ensure_full_index() method to expand the index and search all the entries. Because this method requires walking all the trees and constructing the index, it is the slow part within the whole command. To achieve better performance, this patch uses grep_tree() to search the sparse directory entries and get rid of the ensure_full_index() method. Why grep_tree() is a better choice over ensure_full_index()? 1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks into every sparse-directory entry (represented by a tree) recursively when looping over the index, and the result of doing so matches the result of expanding the index. 2) grep_tree() utilizes pathspecs to limit the scope of searching. ensure_full_index() always expands the index, which means it will always walk all the trees and blobs in the repo without caring if the user only wants a subset of the content, i.e. using a pathspec. On the other hand, grep_tree() will only search the contents that match the pathspec, and thus possibly walking fewer trees. 3) grep_tree() does not construct and copy back a new index, while ensure_full_index() does. This also saves some time. ---------------- Performance test - Summary: p2000 tests demonstrate a ~71% execution time reduction for `git grep --cached bogus -- "f2/f1/f1/*"` using tree-walking logic. However, notice that this result varies depending on the pathspec given. See below "Command used for testing" for more details. Test HEAD~ HEAD ------------------------------------------------------- 2000.78: git grep ... (full-v3) 0.35 0.39 (≈) 2000.79: git grep ... (full-v4) 0.36 0.30 (≈) 2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%) 2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%) - Command used for testing: git grep --cached bogus -- "f2/f1/f1/*" The reason for specifying a pathspec is that, if we don't specify a pathspec, then grep_tree() will walk all the trees and blobs to find the pattern, and the time consumed doing so is not too different from using the original ensure_full_index() method, which also spends most of the time walking trees. However, when a pathspec is specified, this latest logic will only walk the area of trees enclosed by the pathspec, and the time consumed is reasonably a lot less. Generally speaking, because the performance gain is acheived by walking less trees, which are specified by the pathspec, the HEAD time v.s. HEAD~ time in sparse-v[3|4], should be proportional to "pathspec enclosed area" v.s. "all area", respectively. Namely, the wider the <pathspec> is encompassing, the less the performance difference between HEAD~ and HEAD, and vice versa. That is, if we don't specify a pathspec, the performance difference [1] is indistinguishable: both methods walk all the trees and take generally same amount of time (even with the index construction time included for ensure_full_index()). [1] Performance test result without pathspec (hence walking all trees): Command used: git grep --cached bogus Test HEAD~ HEAD --------------------------------------------------- 2000.78: git grep ... (full-v3) 6.17 5.19 (≈) 2000.79: git grep ... (full-v4) 6.19 5.46 (≈) 2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈) 2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈) -------------------------- NEEDSWORK about submodules There are a few NEEDSWORKs that belong to improvements beyond this topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for more context. The other two NEEDSWORKs in t1092 are also relative. Suggested-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-09Merge branch 'vd/sparse-reset-checkout-fixes'Junio C Hamano
Segfault fix-up to an earlier fix to the topic to teach "git reset" and "git checkout" work better in a sparse checkout. * vd/sparse-reset-checkout-fixes: unpack-trees: fix sparse directory recursion check
2022-09-02unpack-trees: fix sparse directory recursion checkVictoria Dye
Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry *' if one exists in the list of tree(s) being unpacked in 'unpack_callback()'. Currently, 'is_sparse_directory_entry()' is called with the first 'name_entry' in the 'names' list of entries on 'unpack_callback()'. However, this entry may be empty even when other elements of 'names' are not (such as when switching from an orphan branch back to a "normal" branch). As a result, 'is_sparse_directory_entry()' could incorrectly indicate that a sparse directory is *not* actually sparse because the name of the index entry does not match the (empty) 'name_entry' path. Fix the issue by using the existing 'name_entry *p' value in 'unpack_callback()', which points to the first non-empty entry in 'names'. Because 'p' is 'const', also update 'is_sparse_directory_entry()'s 'name_entry *' argument to be 'const'. Finally, add a regression test case. Reported-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22t: detect and signal failure within loopEric Sunshine
Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-22t1092: fix buggy sparse "blame" testEric Sunshine
This test wants to verify that `git blame` errors out when asked to blame a file _not_ in the sparse checkout. However, the very first file it asks to blame _is_ present in the checkout, thus `test_must_fail git blame $file` gives an unexpected result (the "blame" succeeds). This problem went unnoticed because the test invokes `test_must_fail git blame $file` in loop but forgets to break out of the loop early upon failure, thus the failure gets swallowed. Fix the test by having it not ask to blame a file present in the sparse checkout, and instead only blame files not present, as intended. While at it, also add the missing `|| return 1` which allowed this bug to go unnoticed. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08rm: integrate with sparse-indexShaoxuan Yuan
Enable the sparse index within the `git-rm` command. The `p2000` tests demonstrate a ~92% execution time reduction for 'git rm' using a sparse index. Test HEAD~1 HEAD -------------------------------------------------------------------------- 2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9% 2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6% 2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2% 2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7% ---- Also, normalize a behavioral difference of `git-rm` under sparse-index. See related discussion [1]. `git-rm` a sparse-directory entry within a sparse-index enabled repo behaves differently from a sparse directory within a sparse-checkout enabled repo. For example, in a sparse-index repo, where 'folder1' is a sparse-directory entry, `git rm -r --sparse folder1` provides this: rm 'folder1/' Whereas in a sparse-checkout repo *without* sparse-index, doing so provides this: rm 'folder1/0/0/0' rm 'folder1/0/1' rm 'folder1/a' Because `git rm` a sparse-directory entry does not need to expand the index, therefore we should accept the current behavior, which is faster than "expand the sparse-directory entry to match the sparse-checkout situation". Modify a previous test so such difference is not considered as an error. [1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398 Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08rm: expand the index only when necessaryShaoxuan Yuan
Remove the `ensure_full_index()` method so `git-rm` does not always expand the index when the expansion is unnecessary, i.e. when <pathspec> does not have any possibilities to match anything outside of sparse-checkout definition. Expand the index when the <pathspec> needs an expanded index, i.e. the <pathspec> contains wildcard that may need a full-index or the <pathspec> is simply outside of sparse-checkout definition. Notice that the test 'rm pathspec expands index when necessary' in t1092 *is* testing this code change behavior, though it will be marked as 'test_expect_success' only in the next patch, where we officially mark `command_requires_full_index = 0`, so the index does not expand unless we tell it to do so. Notice that because we also want `ensure_full_index` to record the stdout and stderr from Git command, a corresponding modification is also included in this patch. The reason we want the "sparse-index-out" and "sparse-index-err", is that we need to make sure there is no error from Git command itself, so we can rely on the `test_region` result and determine if the index is expanded or not. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08t1092: add tests for `git-rm`Shaoxuan Yuan
Add tests for `git-rm`, make sure it behaves as expected when <pathspec> is both inside or outside of sparse-checkout definition. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rmJunio C Hamano
* vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index
2022-08-08unpack-trees: unpack new trees as sparse directoriesVictoria Dye
If 'unpack_single_entry()' is unpacking a new directory tree (that is, one not already present in the index) into a sparse index, unpack the tree as a sparse directory rather than traversing its contents and unpacking each file individually. This helps keep the sparse index as collapsed as possible in cases such as 'git reset --hard' restoring a outside-of-cone directory removed with 'git rm -r --sparse'. Without this patch, 'unpack_single_entry()' will only unpack a directory into the index as a sparse directory (rather than traversing into it and unpacking its files one-by-one) if an entry with the same name already exists in the index. This patch allows sparse directory unpacking without a matching index entry when the following conditions are met: 1. the directory's path is outside the sparse cone, and 2. there are no children of the directory in the index If a directory meets these requirements (as determined by 'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory index entry and propagates the decision back up to 'unpack_callback()' to prevent unnecessary tree traversal into the unpacked directory. Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08checkout: fix nested sparse directory diff in sparse indexVictoria Dye
Add the 'recursive' diff flag to the local changes reporting done by 'git checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded sparse directories will not be recursed into to report the diff of each file's contents, resulting in the reported local changes including "modified" sparse directories. The same issue was found and fixed for 'git status' in 2c521b0e49 (status: fix nested sparse directory diff in sparse index, 2022-03-01) Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>