aboutsummaryrefslogtreecommitdiff
path: root/setup.h
AgeCommit message (Collapse)Author
2026-03-16Merge branch 'ty/setup-error-tightening'Junio C Hamano
While discovering a ".git" directory, the code treats any stat() failure as a sign that a filesystem entity .git does not exist there, and ignores ".git" that is not a "gitdir" file or a directory. The code has been tightened to notice and report filesystem corruption better. * ty/setup-error-tightening: setup: improve error diagnosis for invalid .git files
2026-03-04Merge branch 'kn/ref-location'Junio C Hamano
Allow the directory in which reference backends store their data to be specified. * kn/ref-location: refs: add GIT_REFERENCE_BACKEND to specify reference backend refs: allow reference location in refstorage config refs: receive and use the reference storage payload refs: move out stub modification to generic layer refs: extract out `refs_create_refdir_stubs()` setup: don't modify repo in `create_reference_database()`
2026-03-04setup: improve error diagnosis for invalid .git filesTian Yuchen
'read_gitfile_gently()' treats any non-regular file as 'READ_GITFILE_ERR_NOT_A_FILE' and fails to discern between 'ENOENT' and other stat failures. This flawed error reporting is noted by two 'NEEDSWORK' comments. Address these comments by introducing two new error codes: 'READ_GITFILE_ERR_MISSING'(which groups the "file missing" scenarios together) and 'READ_GITFILE_ERR_IS_A_DIR': 1. Update 'read_gitfile_error_die()' to treat 'IS_A_DIR', 'MISSING', 'NOT_A_FILE' and 'STAT_FAILED' as non-fatal no-ops. This accommodates intentional non-repo scenarios (e.g., GIT_DIR=/dev/null). 2. Explicitly catch 'NOT_A_FILE' and 'STAT_FAILED' during discovery and call 'die()' if 'die_on_error' is set. 3. Unconditionally pass '&error_code' to 'read_gitfile_gently()'. 4. Only invoke 'is_git_directory()' when we explicitly receive 'READ_GITFILE_ERR_IS_A_DIR', avoiding redundant checks. Additionally, audit external callers of 'read_gitfile_gently()' in 'submodule.c' and 'worktree.c' to accommodate the refined error codes. Signed-off-by: Tian Yuchen <a3205153416@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25refs: allow reference location in refstorage configKarthik Nayak
The 'extensions.refStorage' config is used to specify the reference backend for a given repository. Both the 'files' and 'reftable' backends utilize the $GIT_DIR as the reference folder by default in `get_main_ref_store()`. Since the reference backends are pluggable, this means that they could work with out-of-tree reference directories too. Extend the 'refStorage' config to also support taking an URI input, where users can specify the reference backend and the location. Add the required changes to obtain and propagate this value to the individual backends. Add the necessary documentation and tests. Traditionally, for linked worktrees, references were stored in the '$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference storage path, it doesn't make sense to store the main worktree references in the new path, and the linked worktree references in the $GIT_DIR. So, let's store linked worktree references in '$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the necessary files and folders while also adding stubs in the $GIT_DIR path to ensure that it is still considered a Git directory. Ideally, we would want to pass in a `struct worktree *` to individual backends, instead of passing the `gitdir`. This allows them to handle worktree specific logic. Currently, that is not possible since the worktree code is: - Tied to using the global `the_repository` variable. - Is not setup before the reference database during initialization of the repository. Add a TODO in 'refs.c' to ensure we can eventually make that change. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-25setup: don't modify repo in `create_reference_database()`Karthik Nayak
The `create_reference_database()` function is used to create the reference database during initialization of a repository. The function calls `repo_set_ref_storage_format()` to set the repositories reference format. This is an unexpected side-effect of the function. More so because the function is only called in two locations: 1. During git-init(1) where the value is propagated from the `struct repository_format repo_fmt` value. 2. During git-clone(1) where the value is propagated from the `the_repository` value. The former is valid, however the flow already calls `repo_set_ref_storage_format()`, so this effort is simply duplicated. The latter sets the existing value in `the_repository` back to itself. While this is okay for now, introduction of more fields in `repo_set_ref_storage_format()` would cause issues, especially dynamically allocated strings, where we would free/allocate the same string back into `the_repostiory`. To avoid all this confusion, clean up the function to no longer take in and set the repo's reference storage format. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05Merge branch 'ar/submodule-gitdir-tweak'Junio C Hamano
Avoid local submodule repository directory paths overlapping with each other by encoding submodule names before using them as path components. * ar/submodule-gitdir-tweak: submodule: detect conflicts with existing gitdir configs submodule: hash the submodule name for the gitdir path submodule: fix case-folding gitdir filesystem collisions submodule--helper: fix filesystem collisions by encoding gitdir paths builtin/credential-store: move is_rfc3986_unreserved to url.[ch] submodule--helper: add gitdir migration command submodule: allow runtime enabling extensions.submodulePathConfig submodule: introduce extensions.submodulePathConfig builtin/submodule--helper: add gitdir command submodule: always validate gitdirs inside submodule_name_to_gitdir submodule--helper: use submodule_name_to_gitdir in add_submodule
2026-01-12submodule: introduce extensions.submodulePathConfigAdrian Ratiu
The idea of this extension is to abstract away the submodule gitdir path implementation: everyone is expected to use the config and not worry about how the path is computed internally, either in git or other implementations. With this extension enabled, the submodule.<name>.gitdir repo config becomes the single source of truth for all submodule gitdir paths. The submodule.<name>.gitdir config is added automatically for all new submodules when this extension is enabled. Git will throw an error if the extension is enabled and a config is missing, advising users how to migrate. Migration is manual for now. E.g. to add a missing config entry for an existing "foo" module: git config submodule.foo.gitdir .git/modules/foo Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19setup: convert `set_git_dir()` to have file scopePatrick Steinhardt
We don't have any external callers of `set_git_dir()` anymore now that `enter_repo()` has been moved into "setup.c". Remove the declaration and mark the function as static. Note that this change requires us to move the implementation around so that we can avoid adding any new forward declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19path: move `enter_repo()` into "setup.c"Patrick Steinhardt
The function `enter_repo()` is used to enter a repository at a given path. As such it sits way closer to setting up a repository than it does with handling paths, but regardless of that it's located in "path.c" instead of in "setup.c". Move the function into "setup.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01setup: use the default algorithm to initialize repo formatbrian m. carlson
When we define a new repository format with REPOSITORY_FORMAT_INIT, we always use GIT_HASH_SHA1, and this value ends up getting used as the default value to initialize a repository if none of the command line, environment, or config tell us to do otherwise. Because we might not always want to use SHA-1 as the default, let's instead specify the default hash algorithm constant so that we will use whatever the specified default is. However, we also need to continue to read older repositories. If we're in a v0 repository or extensions.objectformat is not set, then we must continue to default to the original hash algorithm: SHA-1. If an algorithm is set explicitly, however, it will override the hash_algo member of the repository_format struct and we'll get the right value. Similarly, if the repository was initialized before Git 0.99.3, then it may lack a core.repositoryformatversion key, and some repositories lack a config file altogether. In both cases, format->version is -1 and we need to assume that SHA-1 is in use. Because clear_repository_format reinitializes the struct repository_format and therefore sets the hash_algo member to the default (which could in the future not be SHA-1), we need to reset this member explicitly. We know, however, that at the point we call read_repository_format, we are actually reading an existing repository and not initializing a new one or operating outside of a repository, so we are not changing the default behavior back to SHA-1 if the default algorithm is different. It is potentially questionable that we ignore all repository configuration if there is a config file but it doesn't have core.repositoryformatversion set, in which case we reset all of the configuration to the default. However, it is unclear what the right thing to do instead with such an old repository is and a simple git init will add the missing entry, so for now, we simply honor what the existing code does and reset the value to the default, simply adding our initialization to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: add `relativeWorktrees` extensionCaleb White
A new extension, `relativeWorktrees`, is added to indicate that at least one worktree in the repository has been linked with relative paths. This ensures older Git versions do not attempt to automatically prune worktrees with relative paths, as they would not not recognize the paths as being valid. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: move `set_git_dir()` and related into setup layerPatrick Steinhardt
The functions `set_git_dir()` and friends are used to set up repositories. As such, they are quite clearly part of the setup subsystem, but still live in "environment.c". Move them over, which also helps to get rid of dependencies on `the_repository` in the environment subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_dir()` accept a repositoryPatrick Steinhardt
The `get_git_dir()` function retrieves the path to the Git directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06refs: convert ref storage format to an enumPatrick Steinhardt
The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-29Sync with 2.44.1Johannes Schindelin
* maint-2.44: (41 commits) Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel ...
2024-04-19Sync with 2.43.4Johannes Schindelin
* maint-2.43: (40 commits) Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories ...
2024-04-19Sync with 2.41.1Johannes Schindelin
* maint-2.41: (38 commits) Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs ...
2024-04-19Sync with 2.40.2Johannes Schindelin
* maint-2.40: (39 commits) Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default ...
2024-03-28Merge branch 'eb/hash-transition'Junio C Hamano
Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...
2024-01-02setup: introduce "extensions.refStorage" extensionPatrick Steinhardt
Introduce a new "extensions.refStorage" extension that allows us to specify the ref storage format used by a repository. For now, the only supported format is the "files" format, but this list will likely soon be extended to also support the upcoming "reftable" format. There have been discussions on the Git mailing list in the past around how exactly this extension should look like. One alternative [1] that was discussed was whether it would make sense to model the extension in such a way that backends are arbitrarily stackable. This would allow for a combined value of e.g. "loose,packed-refs" or "loose,reftable", which indicates that new refs would be written via "loose" files backend and compressed into "packed-refs" or "reftable" backends, respectively. It is arguable though whether this flexibility and the complexity that it brings with it is really required for now. It is not foreseeable that there will be a proliferation of backends in the near-term future, and the current set of existing formats and formats which are on the horizon can easily be configured with the much simpler proposal where we have a single value, only. Furthermore, if we ever see that we indeed want to gain the ability to arbitrarily stack the ref formats, then we can adapt the current extension rather easily. Given that Git clients will refuse any unknown value for the "extensions.refStorage" extension they would also know to ignore a stacked "loose,packed-refs" in the future. So let's stick with the easy proposal for the time being and wire up the extension. [1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02setup: start tracking ref storage formatPatrick Steinhardt
In order to discern which ref storage format a repository is supposed to use we need to start setting up and/or discovering the format. This needs to happen in two separate code paths. - The first path is when we create a repository via `init_db()`. When we are re-initializing a preexisting repository we need to retain the previously used ref storage format -- if the user asked for a different format then this indicates an error and we error out. Otherwise we either initialize the repository with the format asked for by the user or the default format, which currently is the "files" backend. - The second path is when discovering repositories, where we need to read the config of that repository. There is not yet any way to configure something other than the "files" backend, so we can just blindly set the ref storage format to this backend. Wire up this logic so that we have the ref storage format always readily available when needed. As there is only a single backend and because it is not configurable we cannot yet verify that this tracking works as expected via tests, but tests will be added in subsequent commits. To countermand this ommission now though, raise a BUG() in case the ref storage format is not set up properly in `ref_store_init()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12builtin/clone: create the refdb with the correct object formatPatrick Steinhardt
We're currently creating the reference database with a potentially incorrect object format when the remote repository's object format is different from the local default object format. This works just fine for now because the files backend never records the object format anywhere. But this logic will fail with any new reference backend that encodes this information in some form either on-disk or in-memory. The preceding commits have reshuffled code in git-clone(1) so that there is no code path that will access the reference database before we have detected the remote's object format. With these refactorings we can now defer initialization of the reference database until after we have learned the remote's object format and thus initialize it with the correct format from the get-go. These refactorings are required to make git-clone(1) work with the upcoming reftable backend when cloning repositories with the SHA256 object format. This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects the resulting hash format of the empty cloned repository to match the default hash, but now we always end up with a sha1 repository. The problem is that for dumb HTTP fetches, we have no easy way to figure out the remote's hash function except for deriving it based on the hash length of refs in `info/refs`. But as the remote repository is empty we cannot rely on this detection mechanism. Before the change in this commit we already initialized the repository with the default hash function and then left it as-is. With this patch we always use the hash function detected via the remote, where we fall back to "sha1" in case we cannot detect it. Neither the old nor the new behaviour are correct as we second-guess the remote hash function in both cases. But given that this is a rather unlikely edge case (we use the dumb HTTP protocol, the remote repository uses SHA256 and the remote repository is empty), let's simply adapt the test to assert the new behaviour. If we want to properly address this edge case in the future we will have to extend the dumb HTTP protocol so that we can properly detect the hash function for empty repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12setup: allow skipping creation of the refdbPatrick Steinhardt
Allow callers to skip creation of the reference database via a new flag `INIT_DB_SKIP_REFDB`, which is required for git-clone(1) so that we can create it at a later point once the object format has been discovered from the remote repository. Note that we also uplift the call to `create_reference_database()` into `init_db()`, which makes it easier to handle the new flag for us. This changes the order in which we do initialization so that we now set up the Git configuration before we create the reference database. In practice this move should not result in any change in behaviour. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02Merge branch 'ds/scalar-updates' into maint-2.42Junio C Hamano
Scalar updates. * ds/scalar-updates: scalar reconfigure: help users remove buggy repos setup: add discover_git_directory_reason() scalar: add --[no-]src option
2023-10-02repository: implement extensions.compatObjectFormatbrian m. carlson
Add a configuration option to enable updating and reading from compatibility hash maps when git accesses the reposotiry. Call the helper function repo_set_compat_hash_algo with the value that compatObjectFormat is set to. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-29Merge branch 'ds/scalar-updates'Junio C Hamano
Scalar updates. * ds/scalar-updates: scalar reconfigure: help users remove buggy repos setup: add discover_git_directory_reason() scalar: add --[no-]src option
2023-08-28setup: add discover_git_directory_reason()Derrick Stolee
There are many reasons why discovering a Git directory may fail. In particular, 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02) added ownership checks as a security precaution. Callers attempting to set up a Git directory may want to inform the user about the reason for the failure. For that, expose the enum discovery_result from within setup.c and move it into cache.h where discover_git_directory() is defined. I initially wanted to change the return type of discover_git_directory() to be this enum, but several callers rely upon the "zero means success". The two problems with this are: 1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive results are errors. 2. There are multiple successful states; positive results are successful. It is worth noting that GIT_DIR_NONE is not returned, so we remove this option from the enum. We must be careful to keep the successful reasons as positive values, so they are given explicit positive values. Instead of updating all callers immediately, add a new method, discover_git_directory_reason(), and convert discover_git_directory() to be a thin shim on top of it. One thing that is important to note is that discover_git_directory() previously returned -1 on error, so let's continue that into the future. There is only one caller (in scalar.c) that depends on that signedness instead of a non-zero check, so clean that up, too. Because there are extra checks that discover_git_directory_reason() does after setup_git_directory_gently_1(), there are other modes that can be returned for failure states. Add these modes to the enum, but be sure to explicitly add them as BUG() states in the switch of setup_git_directory_gently(). Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21setup: adopt shared init-db & clone codeElijah Newren
The functions init_db() and initialize_repository_version() were shared by builtin/init-db.c and builtin/clone.c, and declared in cache.h. Move these functions, plus their several helpers only used by these functions, to setup.[ch]. Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21setup.h: move declarations for setup.c functions from cache.hElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>