aboutsummaryrefslogtreecommitdiff
path: root/t/t1091-sparse-checkout-builtin.sh
AgeCommit message (Collapse)Author
2026-01-21sparse-checkout: optimize string_list construction and add tests to verify ↵Amisha Chhajed
deduplication. Improve O(n^2) complexity to O(n log n) while building a sorted 'string_list' by constructing it unsorted then sorting it followed by removing duplicates. sparse-checkout deduplicates repeated cone-mode patterns, but this behaviour was previously untested, add tests that verify that sparse-checkout file contain each cone pattern only once and sparse-checkout list reports each pattern only once. Signed-off-by: Amisha Chhajed <amishhhaaaa@gmail.com> Acked-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-15t: expand tests around sparse merges and cleanDerrick Stolee
With the current implementation of 'git sparse-checkout clean', we notice that a file that was in a conflicted state does not get cleaned up because of some internal details around the SKIP_WORKTREE bit. This test is documenting the current behavior before we update it in the following change. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-15sparse-checkout: add --verbose option to 'clean'Derrick Stolee
The 'git sparse-checkout clean' subcommand is focused on directories, deleting any tracked sparse directories to clean up the worktree and make the sparse index feature work optimally. However, this directory-focused approach can leave users wondering why those directories exist at all. In my experience, these files are left over due to ignore or exclude patterns, Windows file handles, or possibly merge conflict resolutions. Add a new '--verbose' option for users to see all the files that are being deleted (with '--force') or would be deleted (with '--dry-run'). Based on usage, users may request further context on this list of files for states such as tracked/untracked, unstaged/staged/conflicted, etc. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-12sparse-checkout: match some 'clean' behaviorDerrick Stolee
The 'git sparse-checkout clean' subcommand is somewhat similar to 'git clean' in that it will delete files that should not be in the worktree. The big difference is that it focuses on the directories that should not be in the worktree due to cone-mode sparse-checkout. It also does not discriminate in the kinds of files and focuses on deleting entire directories. However, there are some restrictions that would be good to bring over from 'git clean', specifically how it refuses to do anything without the '-f'/'--force' or '-n'/'--dry-run' arguments. The 'clean.requireForce' config can be set to 'false' to imply '--force'. Add this behavior to avoid accidental deletion of files that cannot be recovered from Git. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-12sparse-checkout: add basics of 'clean' commandDerrick Stolee
When users change their sparse-checkout definitions to add new directories and remove old ones, there may be a few reasons why directories no longer in scope remain (ignored or excluded files still exist, Windows handles are still open, etc.). When these files still exist, the sparse index feature notices that a tracked, but sparse, directory still exists on disk and thus the index expands. This causes a performance hit _and_ the advice printed isn't very helpful. Using 'git clean' isn't enough (generally '-dfx' may be needed) but also this may not be sufficient. Add a new subcommand to 'git sparse-checkout' that removes these tracked-but-sparse directories. The implementation details provide a clear definition of what is happening, but it is difficult to describe this without including the internal implementation details. The core operation converts the index to a sparse index (in memory if not already on disk) and then deletes any directories in the worktree that correspond with a sparse directory entry in that sparse index. In the most common case, this means that a file will be removed if it is contained within a directory that is both tracked and outside of the sparse-checkout definition. However, there can be exceptions depending on the current state of the index: * If the worktree has a modification to a tracked, sparse file, then that file's parent directories will be expanded instead of represented as sparse directories. Siblings of those parent directories may be considered sparse. * If the user staged a sparse file with "git add --sparse", then that file loses the SKIP_WORKTREE bit until the sparse-checkout is reapplied. Until then, that file's parent directories are not represented as sparse directory entries and thus will not be removed. Siblings of those parent directories may be considered sparse. (There may be other reasons why the SKIP_WORKTREE bit was removed for a file and this impact on the sparse directories will apply to those as well.) * If the user has a merge conflict outside of the sparse-checkout definition, then those conflict entries prevent the parent directories from being represented as sparse directory entries and thus are not removed. * The cases above present reasons why certain _file conditions_ will impact which _directories_ are considered sparse. The list of tracked directories that are outside of the sparse-checkout definition but not represented as a sparse directory further reduces the list of files that will be removed. For these complicated reasons, the documentation details a potential list of files that will be "considered for removal" instead of defining the list concretely. The special cases can be handled by resolving conflicts, committing staged changes, and running 'git sparse-checkout reapply' to update the SKIP_WORKTREE bits as expected by the sparse-checkout definition. It is important to make clear that this operation will remove ignored and excluded files which would normally be ignored even by 'git clean -f' unless the '-x' or '-X' option is provided. This is the most extreme method for doing this, but it works when the sparse-checkout is in cone mode and is expected to rescope based on directories, not files. The current implementation always deletes these sparse directories without warning. This is unacceptable for a released version, but those features will be added in changes coming immediately after this one. Note that this will not remove an untracked directory (or any of its contents) if its parent is a tracked directory within the sparse-checkout definition. This is required to prevent removing data created by tools that perform caching operations for editors or build tools. Thus, 'git sparse-checkout clean' is both more aggressive and more careful than 'git clean -fx': * It is more aggressive because it will remove _tracked_ files within the sparse directories. * It is less aggressive because it will leave _untracked_ files that are not contained in sparse directories. These special cases will be handled more explicitly in a future change that expands tests for the 'git sparse-checkout clean' command. We handle some of the modified, staged, and committed states including some impact on 'git status' after cleaning. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-21t: remove TEST_PASSES_SANITIZE_LEAK annotationsPatrick Steinhardt
Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-05sparse-checkout: free duplicate hashmap entriesJeff King
In insert_recursive_pattern(), we create a new pattern_entry to insert into the parent_hashmap. If we find that the same entry already exists in the hashmap, we skip adding the new one. But we forget to free the new one, creating a leak. We can fix it by cleaning up the discarded entry. It would probably be possible to avoid creating it in the first place, but it's non-trivial. We'd have to define a "keydata" struct that lets us compare the existing entries to the broken-out fields. It's probably not worth the complexity, so we'll punt on that for now. There is one subtlety here: our insertion is happening in a loop, with each iteration looking at the pattern we just inserted (hence the "recursive" in the name). So if we skip insertion, what do we look at? The obvious answer is that we should remember the existing duplicate we found and use that. But I _think_ in that case, we probably already have all of the recursive bits already (from when the original entry was added). And so just breaking out of the loop would be correct. But I'm not 100% sure on that; after all, the original leaky code could have done the same break, but it didn't. So I went with the "obvious answer" above, which has no chance of changing the behavior aside from fixing the leak. With this patch, t1091 can now be marked leak-free. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16t/t1*: avoid redundant uses of catBeat Bolli
Signed-off-by: Beat Bolli <dev+git@drbeat.li> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26sparse-checkout: be consistent with end of options markersElijah Newren
93851746 (parse-options: decouple "--end-of-options" and "--", 2023-12-06) updated the world order to make callers of parse-options that set PARSE_OPT_KEEP_UNKNOWN_OPT responsible for deciding what to do with "--end-of-options" they may see after parse_options() returns. This made a previous bug in sparse-checkout more visible; namely, that git sparse-checkout [add|set] --[no-]cone --end-of-options ... would simply treat "--end-of-options" as one of the paths to include in the sparse-checkout. But this was already problematic before; namely, git sparse-checkout [add|set| --[no-]cone --sikp-checks ... would not give an error on the mis-typed "--skip-checks" but instead simply treat "--sikp-checks" as a path or pattern to include in the sparse-checkout, which is highly unfriendly. This behavior began when the command was converted to parse-options in 7bffca95ea (sparse-checkout: add '--stdin' option to set subcommand, 2019-11-21). Back then it was just called KEEP_UNKNOWN. Later it was renamed to KEEP_UNKNOWN_OPT in 99d86d60e5 (parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options, 2022-08-19) to clarify that it was only about dashed options; we always keep non-option arguments. Looking at that original patch, both Peff and I think that the author was simply confused about the mis-named option, and really just wanted to keep the non-option arguments. We never should have used the flag all along (and the other cases were cargo-culted within the file). Remove the erroneous PARSE_OPT_KEEP_UNKNOWN_OPT flag now to fix this bug. Note that this does mean that anyone who might have been using git sparse-checkout [add|set] [--[no-]cone] --foo --bar to request paths or patterns '--foo' and '--bar' will now have to use git sparse-checkout [add|set] [--[no-]cone] -- --foo --bar That makes sparse-checkout more consistent with other git commands, provides users much friendlier error messages and behavior, and is consistent with the all-caps warning in git-sparse-checkout.txt that this command "is experimental...its behavior...will likely change". :-) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02tests: teach callers of test_i18ngrep to use test_grepJunio C Hamano
They are equivalents and the former still exists, so as long as the only change this commit makes are to rewrite test_i18ngrep to test_grep, there won't be any new bug, even if there still are callers of test_i18ngrep remaining in the tree, or when merged to other topics that add new uses of test_i18ngrep. This patch was produced more or less with git grep -l -e 'test_i18ngrep ' 't/t[0-9][0-9][0-9][0-9]-*.sh' | xargs perl -p -i -e 's/test_i18ngrep /test_grep /' and a good way to sanity check the result yourself is to run the above in a checkout of c4603c1c (test framework: further deprecate test_i18ngrep, 2023-10-31) and compare the resulting working tree contents with the result of applying this patch to the same commit. You'll see that test_i18ngrep in a few t/lib-*.sh files corrected, in addition to the manual reproduction. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27builtin/sparse-checkout: add check-rules commandWilliam Sprent
There exists no direct way to interrogate git about which paths are matched by a given set of sparsity rules. It is possible to get this information from git, but it includes checking out the commit that contains the paths, applying the sparse checkout patterns and then using something like 'git ls-files -t' to check if the skip worktree bit is set. This works in some case, but there are cases where it is awkward or infeasible to generate a checkout for this purpose. Exposing the pattern matching of sparse checkout enables more tooling to be built and avoids a situation where tools that want to reason about sparse checkouts start containing parallel implementation of the rules. To accommodate this, add a 'check-rules' subcommand to the 'sparse-checkout' builtin along the lines of the 'git check-ignore' and 'git check-attr' commands. The new command accepts a list of paths on stdin and outputs just the ones the match the sparse checkout. To allow for use in a bare repository and to allow for interrogating about other patterns than the current ones, include a '--rules-file' option which allows the caller to explicitly pass sparse checkout rules in the format accepted by 'sparse-checkout set --stdin'. To allow for reuse of the handling of input patterns for the '--rules-file' flag, modify 'add_patterns_from_input()' to be able to read from a 'FILE' instead of just stdin. To allow for reuse of the logic which decides whether or not rules should be interpreted as cone-mode patterns, split that part out of 'update_modes()' such that can be called without modifying the config. An alternative could have been to create a new 'check-sparsity' command. However, placing it under 'sparse-checkout' allows for a) more easily re-using the sparse checkout pattern matching and cone/non-code mode handling, and b) keeps the documentation for the command next to the experimental warning and the cone-mode discussion. Signed-off-by: William Sprent <williams@unity3d.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-27builtin/sparse-checkout: remove NEED_WORK_TREE flagWilliam Sprent
In preparation for adding a sub-command to 'sparse-checkout' that can be run in a bare repository, remove the 'NEED_WORK_TREE' flag from its entry in the 'commands' array of 'git.c'. To avoid that this changes any behaviour, add calls to 'setup_work_tree()' to all of the 'sparse-checkout' sub-commands and add tests that verify that 'sparse-checkout <cmd>' still fail with a clear error message telling the user that the command needs a work tree. Signed-off-by: William Sprent <williams@unity3d.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05dir: check for single file cone patternsWilliam Sprent
The sparse checkout documentation states that the cone mode pattern set is limited to patterns that either recursively include directories or patterns that match all files in a directory. In the sparse checkout file, the former manifest in the form: /A/B/C/ while the latter become a pair of patterns either in the form: /A/B/ !/A/B/*/ or in the special case of matching the toplevel files: /* !/*/ The 'add_pattern_to_hashsets()' function contains checks which serve to disable cone-mode when non-cone patterns are encountered. However, these do not catch when the pattern list attempts to match a single file or directory, e.g. a pattern in the form: /A/B/C This causes sparse-checkout to exhibit unexpected behaviour when such a pattern is in the sparse-checkout file and cone mode is enabled. Concretely, with the pattern like the above, sparse-checkout, in non-cone mode, will only include the directory or file located at '/A/B/C'. However, with cone mode enabled, sparse-checkout will instead just manifest the toplevel files but not any file located at '/A/B/C'. Relatedly, issues occur when supplying the same kind of filter when partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will correctly just include the objects that match the non-cone pattern matching. Which means that checking out the newly cloned repo with the same filter, but with cone mode enabled, fails due to missing objects. To fix these issues, add a cone mode pattern check that asserts that every pattern is either a directory match or the pattern '/*'. Add a test to verify the new pattern check and modify another to reflect that non-directory patterns are caught earlier. Signed-off-by: William Sprent <williams@unity3d.com> Acked-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-06Sync with 2.36.3Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.35.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.34.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.33.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.31.5Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-06Sync with 2.30.6Taylor Blau
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-01t/t1NNN: allow local submodulesTaylor Blau
To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-04-21tests: stop assuming --no-cone is the default mode for sparse-checkoutElijah Newren
Add an explicit --no-cone to several sparse-checkout invocations in preparation for changing the default to cone mode. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-06Merge branch 'en/sparse-checkout-fixes'Junio C Hamano
Further polishing of "git sparse-checkout". * en/sparse-checkout-fixes: sparse-checkout: reject arguments in cone-mode that look like patterns sparse-checkout: error or warn when given individual files sparse-checkout: pay attention to prefix for {set, add} sparse-checkout: correctly set non-cone mode when expected sparse-checkout: correct reapply's handling of options
2022-02-25Merge branch 'ds/sparse-checkout-requires-per-worktree-config'Junio C Hamano
"git sparse-checkout" wants to work with per-worktree configuration, but did not work well in a worktree attached to a bare repository. * ds/sparse-checkout-requires-per-worktree-config: config: make git_configset_get_string_tmp() private worktree: copy sparse-checkout patterns and config on add sparse-checkout: set worktree-config correctly config: add repo_config_set_worktree_gently() worktree: create init_worktree_config() Documentation: add extensions.worktreeConfig details
2022-02-20sparse-checkout: reject arguments in cone-mode that look like patternsElijah Newren
In sparse-checkout add/set under cone mode, the arguments passed are supposed to be directories rather than gitignore-style patterns. However, given the amount of effort spent in the manual discussing patterns, it is easy for users to assume they need to pass patterns such as /foo/* or !/bar/*/ or perhaps they really do ignore the directory rule and specify a random gitignore-style pattern like *.c To help catch such mistakes, throw an error if any of the positional arguments: * starts with any of '/!' * contains any of '*?[]' Inform users they can pass --skip-checks if they have a directory that really does have such special characters in its name. (We exclude '\' because of sparse-checkout's special handling of backslashes; see the MINGW test in t1091.46.) Reviewed-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-20sparse-checkout: error or warn when given individual filesElijah Newren
The set and add subcommands accept multiple positional arguments. The meaning of these arguments differs slightly in the two modes: Cone mode only accepts directories. If given a file, it would previously treat it as a directory, causing not just the file itself to be included but all sibling files as well -- likely against users' expectations. Throw an error if the specified path is a file in the index. Provide a --skip-checks argument to allow users to override (e.g. for the case when the given path IS a directory on another branch). Non-cone mode accepts general gitignore patterns. There are many reasons to avoid this mode, but one possible reason to use it instead of cone mode: to be able to select individual files within a directory. However, if a file is passed to set/add in non-cone mode, you won't be selecting a single file, you'll be selecting a file with the same name in any directory. Thus users will likely want to prefix any paths they specify with a leading '/' character; warn users if the patterns they specify exactly name a file because it means they are likely missing such a leading slash. Reviewed-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-20sparse-checkout: pay attention to prefix for {set, add}Elijah Newren
In cone mode, non-option arguments to set & add are clearly paths, and as such, we should pay attention to prefix. In non-cone mode, it is not clear that folks intend to provide paths since the inputs are gitignore-style patterns. Paying attention to prefix would prevent folks from doing things like git sparse-checkout add /.gitattributes git sparse-checkout add '/toplevel-dir/*' In fact, the former will result in fatal: '/.gitattributes' is outside repository... while the later will result in fatal: Invalid path '/toplevel-dir': No such file or directory despite the fact that both are valid gitignore-style patterns that would select real files if added to the sparse-checkout file. This might lead people to just use the path without the leading slash, potentially resulting in them grabbing files with the same name throughout the directory hierarchy contrary to their expectations. See also [1] and [2]. Adding prefix seems to just be fraught with error; so for now simply throw an error in non-cone mode when sparse-checkout set/add are run from a subdirectory. [1] https://lore.kernel.org/git/e1934710-e228-adc4-d37c-f706883bd27c@gmail.com/ [2] https://lore.kernel.org/git/CABPp-BHXZ-XLxY0a3wCATfdq=6-EjW62RzbxKAoFPeXfJswD2w@mail.gmail.com/ Helped-by: Junio Hamano <gitster@pobox.com> Reviewed-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-20sparse-checkout: correct reapply's handling of optionsElijah Newren
Commit 4e256731d6 ("sparse-checkout: enable reapply to take --[no-]{cone,sparse-index}", 2021-12-14) made it so that reapply could take additional options but added no tests. Tests would have shown that the feature doesn't work because the initial values are set AFTER parsing the command line options instead of before. Add a test and set the initial value at the appropriate time. Reviewed-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-09Merge branch 'js/sparse-vs-split-index'Junio C Hamano
Mark in various places in the code that the sparse index and the split index features are mutually incompatible. * js/sparse-vs-split-index: split-index: it really is incompatible with the sparse index t1091: disable split index sparse-index: sparse index is disallowed when split index is active
2022-02-09Merge branch 'jt/sparse-checkout-leading-dir-fix'Junio C Hamano
"git sparse-checkout init" failed to write into $GIT_DIR/info directory when the repository was created without one, which has been corrected to auto-create it. * jt/sparse-checkout-leading-dir-fix: sparse-checkout: create leading directory
2022-02-08worktree: copy sparse-checkout patterns and config on addDerrick Stolee
When adding a new worktree, it is reasonable to expect that we want to use the current set of sparse-checkout settings for that new worktree. This is particularly important for repositories where the worktree would become too large to be useful. This is even more important when using partial clone as well, since we want to avoid downloading the missing blobs for files that should not be written to the new worktree. The only way to create such a worktree without this intermediate step of expanding the full worktree is to copy the sparse-checkout patterns and config settings during 'git worktree add'. Each worktree has its own sparse-checkout patterns, and the default behavior when the sparse-checkout file is missing is to include all paths at HEAD. Thus, we need to have patterns from somewhere, they might as well be the current worktree's patterns. These are then modified independently in the future. In addition to the sparse-checkout file, copy the worktree config file if worktree config is enabled and the file exists. This will copy over any important settings to ensure the new worktree behaves the same as the current one. The only exception we must continue to make is that core.bare and core.worktree should become unset in the worktree's config file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-08sparse-checkout: set worktree-config correctlyDerrick Stolee
`git sparse-checkout set/init` enables worktree-specific configuration[*] by setting extensions.worktreeConfig=true, but neglects to perform the additional necessary bookkeeping of relocating `core.bare=true` and `core.worktree` from $GIT_COMMON_DIR/config to $GIT_COMMON_DIR/config.worktree, as documented in git-worktree.txt. As a result of this oversight, these settings, which are nonsensical for secondary worktrees, can cause Git commands to incorrectly consider a worktree bare (in the case of `core.bare`) or operate on the wrong worktree (in the case of `core.worktree`). Fix this problem by taking advantage of the recently-added init_worktree_config() which enables `extensions.worktreeConfig` and takes care of necessary bookkeeping. While at it, for backward-compatibility reasons, also stop upgrading the repository format to "1" since doing so is (unintentionally) not required to take advantage of `extensions.worktreeConfig`, as explained by 11664196ac ("Revert "check_repository_format_gently(): refuse extensions for old repositories"", 2020-07-15). [*] The main reason to use worktree-specific config for the sparse-checkout builtin was to avoid enabling sparse-checkout patterns in one and causing a loss of files in another. If a worktree does not have a sparse-checkout patterns file, then the sparse-checkout logic will not kick in on that worktree. Reported-by: Sean Allred <allred.sean@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-23t1091: disable split indexJohannes Schindelin
In 61feddcdf28 (tests: disable GIT_TEST_SPLIT_INDEX for sparse index tests, 2021-08-26), it was already called out that the split index feature is incompatible with the sparse index feature, and its commit message wondered aloud whether more checks would be required to ensure that the split index and sparse index features aren't enabled at the same time. We are about to introduce such additional checks, and indeed, t1091 would utterly fail with them. Therefore, let's preemptively disable the split index for the entirety of t1091. This partially reverts above-mentioned patch because it covered only one test case whereas we want to cover the entire test script. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-21sparse-checkout: create leading directoryJonathan Tan
When creating the sparse-checkout file, Git does not create the leading directory, "$GIT_DIR/info", if it does not exist. This causes problems if the repository does not have that directory. Therefore, ensure that the leading directory is created. This is the only "open" in builtin/sparse-checkout.c that does not have a leading directory check. (The other one in write_patterns_and_update() does.) Note that the test needs to explicitly specify a template when running "git init" because the default template used in the tests has the "info/" directory included. Helped-by: Jose Lopes <jabolopes@google.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-10Merge branch 'ds/fetch-pull-with-sparse-index'Junio C Hamano
"git fetch" and "git pull" are now declared sparse-index clean. Also "git ls-files" learns the "--sparse" option to help debugging. * ds/fetch-pull-with-sparse-index: test-read-cache: remove --table, --expand options t1091/t3705: remove 'test-tool read-cache --table' t1092: replace 'read-cache --table' with 'ls-files --sparse' ls-files: add --sparse option fetch/pull: use the sparse index
2022-01-10Merge branch 'ds/sparse-checkout-malformed-pattern-fix'Junio C Hamano
Certain sparse-checkout patterns that are valid in non-cone mode led to segfault in cone mode, which has been corrected. * ds/sparse-checkout-malformed-pattern-fix: sparse-checkout: refuse to add to bad patterns sparse-checkout: fix OOM error with mixed patterns sparse-checkout: fix segfault on malformed patterns
2022-01-03Merge branch 'en/sparse-checkout-set'Junio C Hamano
The "init" and "set" subcommands in "git sparse-checkout" have been unified for a better user experience and performance. * en/sparse-checkout-set: sparse-checkout: remove stray trailing space clone: avoid using deprecated `sparse-checkout init` Documentation: clarify/correct a few sparsity related statements git-sparse-checkout.txt: update to document init/set/reapply changes sparse-checkout: enable reapply to take --[no-]{cone,sparse-index} sparse-checkout: enable `set` to initialize sparse-checkout mode sparse-checkout: split out code for tweaking settings config sparse-checkout: disallow --no-stdin as an argument to set sparse-checkout: add sanity-checks on initial sparsity state sparse-checkout: break apart functions for sparse_checkout_(set|add) sparse-checkout: pass use_stdin as a parameter instead of as a global
2021-12-30sparse-checkout: refuse to add to bad patternsDerrick Stolee
When in cone mode sparse-checkout, it is unclear how 'git sparse-checkout add <dir1> ...' should behave if the existing sparse-checkout file does not match the cone mode patterns. Change the behavior to fail with an error message about the existing patterns. Also, all cone mode patterns start with a '/' character, so add that restriction. This is necessary for our example test 'cone mode: warn on bad pattern', but also requires modifying the example sparse-checkout file we use to test the warnings related to recognizing cone mode patterns. This error checking would cause a failure further down the test script because of a test that adds non-cone mode patterns without cleaning them up. Perform that cleanup as part of the test now. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-30sparse-checkout: fix OOM error with mixed patternsDerrick Stolee
Add a test to t1091-sparse-checkout-builtin.sh that would result in an infinite loop and out-of-memory error before this change. The issue relies on having non-cone-mode patterns while trying to modify the patterns in cone-mode. The fix is simple, allowing us to break from the loop when the input path does not contain a slash, as the "dir" pattern we added does not. This is only a fix to the critical out-of-memory error. A better response to such a strange state will follow in a later change. Reported-by: Calbabreaker <calbabreaker@gmail.com> Helped-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-30sparse-checkout: fix segfault on malformed patternsDerrick Stolee
Then core.sparseCheckoutCone is enabled, the sparse-checkout patterns are used to populate two hashsets that accelerate pattern matching. If the user modifies the sparse-checkout file outside of the 'sparse-checkout' builtin, then strange patterns can happen, triggering some error checks. One of these error checks is possible to hit when some special characters exist in a line. A warning message is correctly written to stderr, but then there is additional logic that attempts to remove the line from the hashset and free the data. This leads to a segfault in the 'git sparse-checkout list' command because it iterates over the contents of the hashset, which is now invalid. The fix here is to stop trying to remove from the hashset. In addition, we disable cone mode sparse-checkout because of the malformed data. This results in the pattern-matching working with a possibly-slower algorithm, but using the patterns as they are in the sparse-checkout file. This also changes the behavior of commands such as 'git sparse-checkout list' because the output patterns will be the contents of the sparse-checkout file instead of the list of directories. This is an existing behavior for other types of bad patterns. Add a test that triggers the segfault without the code change. Reported-by: John Burnett <johnburnett@johnburnett.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-22t1091/t3705: remove 'test-tool read-cache --table'Derrick Stolee
Now that 'git ls-files --sparse' exists, we can use it to verify the state of a sparse index instead of 'test-tool read-cache table'. Replace these usages within t1091-sparse-checkout-builtin.sh and t3705-add-sparse-checkout.sh. The important changes are due to the different output format. In t3705, we need to use the '--stage' output to get a file mode and OID, but it also includes a stage value and drops the object type. This leads to some differences in how we handle looking for specific entries. In t1091, the test focuses on enabling the sparse index, so we do not need the --stage flag to demonstrate how the index changes, and instead can use a diff. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: add sanity-checks on initial sparsity stateElijah Newren
Most sparse-checkout subcommands (list, add, reapply) only make sense when already in a sparse state. Add a quick check that will error out early if this is not the case. Also document with a comment why we do not exit early in `disable` even when core.sparseCheckout starts as false. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13t0000-t3999: detect and signal failure within loopEric Sunshine
Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `|| return 1` (or `|| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13Merge branch 'ds/add-rm-with-sparse-index'Junio C Hamano
"git add", "git mv", and "git rm" have been adjusted to avoid updating paths outside of the sparse-checkout definition unless the user specifies a "--sparse" option. * ds/add-rm-with-sparse-index: advice: update message to suggest '--sparse' mv: refuse to move sparse paths rm: skip sparse paths with missing SKIP_WORKTREE rm: add --sparse option add: update --renormalize to skip sparse paths add: update --chmod to skip sparse paths add: implement the --sparse option add: skip tracked paths outside sparse-checkout cone add: fail when adding an untracked sparse file dir: fix pattern matching on dirs dir: select directories correctly t1092: behavior for adding sparse files t3705: test that 'sparse_entry' is unstaged
2021-09-28add: skip tracked paths outside sparse-checkout coneDerrick Stolee
When 'git add' adds a tracked file that is outside of the sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file exists outside of the sparse-checkout cone. This is usually correct, except in the case of a merge conflict outside of the cone. Modify add_pathspec_matched_against_index() to be more careful about paths by checking the sparse-checkout patterns in addition to the SKIP_WORKTREE bit. This causes 'git add' to no longer allow files outside of the cone that removed the SKIP_WORKTREE bit due to a merge conflict. With only this change, users will only be able to add the file after adding the file to the sparse-checkout cone. A later change will allow users to force adding even though the file is outside of the sparse-checkout cone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07tests: disable GIT_TEST_SPLIT_INDEX for sparse index testsSZEDER Gábor
The sparse index and split index features are said to be currently incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might interfere with the test cases exercising the sparse index feature. Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the whole of 't1092-sparse-checkout-compatibility.sh'. There are, however, two other test cases exercising sparse index, namely 'sparse-index enabled and disabled' in 't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse index' in 't7519-status-fsmonitor.sh', and these two could fail with GIT_TEST_SPLIT_INDEX=1 as well [2]. Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two test cases to avoid such interference. Note that this is the minimal change to merely avoid failures when these test cases are run with GIT_TEST_SPLIT_INDEX=1. Interestingly, though, without these changes the 'git sparse-checkout init --cone --sparse-index' commands still succeed even with split index, and set all the necessary configuration variables and create the initial '$GIT_DIR/info/sparse-checkout' file, but the test failures are caused by later sanity checks finding that the index is not in fact a sparse index. This indicates that 'git sparse-checkout init --sparse-index' lacks some error checking and its tests lack coverage related to split index, but fixing those issues is beyond the scope of this patch series. [1] https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@gmail.com/ [2] Neither of these test cases fail at the moment, because GIT_TEST_SPLIT_INDEX=1 is broken and never splits the index, and it broke long before the sparse index feature was added. This patch series is about to fix GIT_TEST_SPLIT_INDEX, and then both test cases mentioned above would fail. (The diff is best viewed with '--ignore-all-space') Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07sparse-checkout: clear tracked sparse dirsDerrick Stolee
When changing the scope of a sparse-checkout using cone mode, we might have some tracked directories go out of scope. The current logic removes the tracked files from within those directories, but leaves the ignored files within those directories. This is a bit unexpected to users who have given input to Git saying they don't need those directories anymore. This is something that is new to the cone mode pattern type: the user has explicitly said "I want these directories and _not_ those directories." The typical sparse-checkout patterns more generally apply to "I want files with with these patterns" so it is natural to leave ignored files as they are. This focus on directories in cone mode provides us an opportunity to change the behavior. Leaving these ignored files in the sparse directories makes it impossible to gain performance benefits in the sparse index. When we track into these directories, we need to know if the files are ignored or not, which might depend on the _tracked_ .gitignore file(s) within the sparse directory. This depends on the indexed version of the file, so the sparse directory must be expanded. We must take special care to look for untracked, non-ignored files in these directories before deleting them. We do not want to delete any meaningful work that the users were doing in those directories and perhaps forgot to add and commit before switching sparse-checkout definitions. Since those untracked files might be code files that generated ignored build output, also do not delete any ignored files from these directories in that case. The users can recover their state by resetting their sparse-checkout definition to include that directory and continue. Alternatively, they can see the warning that is presented and delete the directory themselves to regain the performance they expect. By deleting the sparse directories when changing scope (or running 'git sparse-checkout reapply') we regain these performance benefits as if the repository was in a clean state. Since these ignored files are frequently build output or helper files from IDEs, the users should not need the files now that the tracked files are removed. If the tracked files reappear, then they will have newer timestamps than the build artifacts, so the artifacts will need to be regenerated anyway. Use the sparse-index as a data structure in order to find the sparse directories that can be safely deleted. Re-expand the index to a full one if it was full before. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30sparse-checkout: disable sparse-indexDerrick Stolee
We use 'git sparse-checkout init --cone --sparse-index' to toggle the sparse-index feature. It makes sense to also disable it when running 'git sparse-checkout disable'. This is particularly important because it removes the extensions.sparseIndex config option, allowing other tools to use this Git repository again. This does mean that 'git sparse-checkout init' will not re-enable the sparse-index feature, even if it was previously enabled. While testing this feature, I noticed that the sparse-index was not being written on the first run, but by a second. This was caught by the call to 'test-tool read-cache --table'. This requires adjusting some assignments to core_apply_sparse_checkout and pl.use_cone_patterns in the sparse_checkout_init() logic. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19t[01]*: adjust the references to the default branch name "main"Johannes Schindelin
Carefully excluding t1309, which sees independent development elsewhere at the time of writing, we transition above-mentioned tests to the default branch name `main`. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -e 's/naster/nain/g' -- t[01]*.sh && git checkout HEAD -- t1309\*) Note that t5533 contains a variation of the name `master` (`naster`) that we rename here, too. This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19tests: mark tests relying on the current default for `init.defaultBranch`Johannes Schindelin
In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9]*.sh) \ t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167]*.sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30t1001: use $ZERO_OIDbrian m. carlson
Use $ZERO_OID to make the test hash independent. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>