aboutsummaryrefslogtreecommitdiff
path: root/setup.c
AgeCommit message (Collapse)Author
2024-09-19setup: fix a typoAndrew Kreimer
Fix a typo in comments. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: stop storing "core.logAllRefUpdates" globallyPatrick Steinhardt
The value of "core.logAllRefUpdates" is being stored in the global variable `log_all_ref_updates`. This design is somewhat aged nowadays, where it is entirely possible to access multiple repositories in the same process which all have different values for this setting. So using a single global variable to track it is plain wrong. Remove the global variable. Instead, we now provide a new function part of the repo-settings subsystem that parses the value for a specific repository. While that may require us to read the value multiple times, we work around this by reading it once when the ref backends are set up and caching the value there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: move `set_git_dir()` and related into setup layerPatrick Steinhardt
The functions `set_git_dir()` and friends are used to set up repositories. As such, they are quite clearly part of the setup subsystem, but still live in "environment.c". Move them over, which also helps to get rid of dependencies on `the_repository` in the environment subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_work_tree()` accept a repositoryPatrick Steinhardt
The `get_git_work_tree()` function retrieves the path of the work tree of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_object_directory()` accept a repositoryPatrick Steinhardt
The `get_object_directory()` function retrieves the path to the object directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_common_dir()` accept a repositoryPatrick Steinhardt
The `get_git_common_dir()` function retrieves the path to the common directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_dir()` accept a repositoryPatrick Steinhardt
The `get_git_dir()` function retrieves the path to the Git directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-26Merge branch 'jk/mark-unused-parameters'Junio C Hamano
Mark unused parameters as UNUSED to squelch -Wunused warnings. * jk/mark-unused-parameters: t-hashmap: stop calling setup() for t_intern() test scalar: mark unused parameters in dummy function daemon: mark unused parameters in non-posix fallbacks setup: mark unused parameter in config callback test-mergesort: mark unused parameters in trivial callback t-hashmap: mark unused parameters in callback function reftable: mark unused parameters in virtual functions reftable: drop obsolete test function declarations reftable: ignore unused argc/argv in test functions unit-tests: ignore unused argc/argv t/helper: mark more unused argv/argc arguments oss-fuzz: mark unused argv/argc argument refs: mark unused parameters in do_for_each_reflog_helper() refs: mark unused parameters in ref_store fsck callbacks update-ref: mark more unused parameters in parser callbacks imap-send: mark unused parameter in ssl_socket_connect() fallback
2024-08-23Merge branch 'ps/hash-and-ref-format-from-config'Junio C Hamano
The default object hash and ref backend format used to be settable only with explicit command line option to "git init" and environment variables, but now they can be configured in the user's global and system wide configuration. * ps/hash-and-ref-format-from-config: setup: make ref storage format configurable via config setup: make object format configurable via config setup: merge configuration of repository formats t0001: delete repositories when object format tests finish t0001: exercise initialization with ref formats more thoroughly
2024-08-17setup: mark unused parameter in config callbackJeff King
This is logically a continuation of 783a86c142 (config: mark unused callback parameters, 2022-08-19), but this case was introduced much later in 4412a04fe6 (init.templateDir: consider this config setting protected, 2024-03-29). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16setup: make ref storage format configurable via configPatrick Steinhardt
Similar to the preceding commit, introduce a new "init.defaultRefFormat" config that allows the user to globally set the ref storage format used by newly created repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16setup: make object format configurable via configPatrick Steinhardt
The object format for repositories can either be configured explicitly by passing the `--object-format=` option to git-init(1) or git-clone(1), or globally by setting the `GIT_DEFAULT_HASH` environment variable. While the former makes sense, setting random environment variables is not really a good user experience in case someone decides to only use SHA256 repositories. It is only natural to expect for a user that things like this can also be configured via their config. As such, introduce a new config "init.defaultObjectFormat", similar to "init.defaultBranch", that allows the user to configure the default object format when creating new repos. The precedence order now is the following, where the first one wins: 1. The `--object-format=` switch. 2. The `GIT_DEFAULT_HASH` environment variable. 3. The `init.defaultObjectFormat` config variable. This matches the typical precedence order we use in Git. We typically let the environment override the config such that the latter can easily be overridden on an ephemeral basis, for example by scripts. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-16setup: merge configuration of repository formatsPatrick Steinhardt
The configuration of repository formats is split up across two functions `validate_hash_algorithm()` and `validate_ref_storage_format()`. This is fine as-is, but we are about to extend the logic to also read default values from the config. With the logic split across two functions, we would either have to pass in additional parameters read from the config, or read the config multiple times. Both of these options feel a bit unwieldy. Merge the code into a new function `repository_format_configure()` that is responsible for configuring the whole repository's format. Like this, we can easily read the config in a single place, only. Furthermore, move the calls to `repo_set_ref_storage_format()` and `repo_set_hash_algo()` into this new function as well, such that all the logic to configure the repository format is self-contained here. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30safe.directory: normalize the configured pathJunio C Hamano
The pathname of a repository comes from getcwd() and it could be a path aliased via symbolic links, e.g., the real directory may be /home/u/repository but a symbolic link /home/u/repo may point at it, and the clone request may come as "git clone file:///home/u/repo/" A request to check if /home/u/repository is safe would be rejected if the safe.directory configuration allows /home/u/repo/ but not its alias /home/u/repository/. Normalize the paths configured for the safe.directory configuration variable before comparing them with the path being checked. Two and a half things to note, compared to the previous step to normalize the actual path of the suspected repository, are: - A configured safe.directory may be coming from .gitignore in the home directory that may be shared across machines. The path meant to match with an entry may not necessarily exist on all of such machines, so not being able to convert them to real path on this machine is *not* a condition that is worthy of warning. Hence, we ignore a path that cannot be converted to a real path. - A configured safe.directory is essentially a random string that user throws at us, written completely unrelated to the directory the current process happens to be in. Hence it makes little sense to give a non-absolute path. Hence we ignore any non-absolute paths, except for ".". - The safe.directory set to "." was once advertised on the list as a valid workaround for the regression caused by the overly tight safe.directory check introduced in 2.45.1; we treat it to mean "if we are at the top level of a repository, it is OK". (cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>). Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30safe.directory: normalize the checked pathJunio C Hamano
The pathname of a repository comes from getcwd() and it could be a path aliased via symbolic links, e.g., the real directory may be /home/u/repository but a symbolic link /home/u/repo may point at it, and the clone request may come as "git clone file:///home/u/repo/". A request to check if /home/u/repo is safe would be rejected if the safe.directory configuration allows /home/u/repository/ but not its alias /home/u/repo/. Normalize the path being checked before comparing with safe.directory value(s). Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-30safe.directory: preliminary clean-upJunio C Hamano
The paths given in the safe.directory configuration variable are allowed to contain "~user" (which interpolates to user's home directory) and "%(prefix)" (which interpolates to the installation location in RUNTIME_PREFIX-enabled builds, and a call to the git_config_pathname() function is tasked to obtain a copy of the path with these constructs interpolated. The function, when it succeeds, always yields an allocated string in the location given as the out-parameter; even when there is nothing to interpolate in the original, a literal copy is made. The code path that contains this caller somehow made two contradicting and incorrect assumptions of the behaviour when there is no need for interpolation, and was written with extra defensiveness against two phantom risks that do not exist. One wrong assumption was that the function might yield NULL when there is no interpolation. This led to the use of an extra "check" variable, conditionally holding either the interpolated or the original string. The assumption was with us since 8959555c (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02) originally introduced the safe.directory feature. Another wrong assumption was that the function might yield the same pointer as the input when there is no interpolation. This led to a conditional free'ing of the interpolated copy, that the conditional never skipped, as we always received an allocated string. Simplify the code by removing the extra defensiveness. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02Merge branch 'ps/use-the-repository'Junio C Hamano
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help transition the codebase to rely less on the availability of the singleton the_repository instance. * ps/use-the-repository: hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE` t/helper: remove dependency on `the_repository` in "proc-receive" t/helper: fix segfault in "oid-array" command without repository t/helper: use correct object hash in partial-clone helper compat/fsmonitor: fix socket path in networked SHA256 repos replace-object: use hash algorithm from passed-in repository protocol-caps: use hash algorithm from passed-in repository oidset: pass hash algorithm when parsing file http-fetch: don't crash when parsing packfile without a repo hash-ll: merge with "hash.h" refs: avoid include cycle with "repository.h" global: introduce `USE_THE_REPOSITORY_VARIABLE` macro hash: require hash algorithm in `empty_tree_oid_hex()` hash: require hash algorithm in `is_empty_{blob,tree}_oid()` hash: make `is_null_oid()` independent of `the_repository` hash: convert `oidcmp()` and `oideq()` to compare whole hash global: ensure that object IDs are always padded hash: require hash algorithm in `oidread()` and `oidclr()` hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-06-24Merge branch 'tb/precompose-getcwd'Junio C Hamano
We forgot to normalize the result of getcwd() to NFC on macOS where all other paths are normalized, which has been corrected. This still does not address the case where core.precomposeUnicode configuration is not defined globally. * tb/precompose-getcwd: macOS: ls-files path fails if path of workdir is NFD
2024-06-17Merge branch 'ps/ref-storage-migration'Junio C Hamano
A new command has been added to migrate a repository that uses the files backend for its ref storage to use the reftable backend, with limitations. * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-13Merge branch 'ps/ref-storage-migration' into ps/use-the-repositoryJunio C Hamano
* ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version
2024-06-12Merge branch 'jc/safe-directory-leading-path'Junio C Hamano
The safe.directory configuration knob has been updated to optionally allow leading path matches. * jc/safe-directory-leading-path: safe.directory: allow "lead/ing/path/*" match
2024-06-06Merge branch 'ps/leakfixes'Junio C Hamano
Leakfixes. * ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name
2024-06-06refs: convert ref storage format to an enumPatrick Steinhardt
The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-06setup: unset ref storage when reinitializing repository versionPatrick Steinhardt
When reinitializing a repository's version we may end up unsetting the hash algorithm when it matches the default hash algorithm. If we didn't do that then the previously configured value might remain intact. While the same issue exists for the ref storage extension, we don't do this here. This has been fine for most of the part because it is not supported to re-initialize a repository with a different ref storage format anyway. We're about to introduce a new command to migrate ref storages though, so this is about to become an issue there. Prepare for this and unset the ref storage format when reinitializing a repository with the "files" format. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-03Merge branch 'ps/fix-reinit-includeif-onbranch'Junio C Hamano
"git init" in an already created directory, when the user configuration has includeif.onbranch, started to fail recently, which has been corrected. * ps/fix-reinit-includeif-onbranch: setup: fix bug with "includeIf.onbranch" when initializing dir
2024-05-31macOS: ls-files path fails if path of workdir is NFDTorsten Bögershausen
Under macOS, `git ls-files path` does not work (gives an error) if the absolute 'path' contains characters in NFD (decomposed). This happens when core.precomposeunicode is true, which is the most common case. The bug report says: $ cd somewhere # some safe place, /tmp or ~/tmp etc. $ mkdir $'u\xcc\x88' # ü in NFD $ cd ü # or cd $'u\xcc\x88' or cd $'\xc3\xbc' $ git init $ git ls-files $'/somewhere/u\xcc\x88' # NFD fatal: /somewhere/ü: '/somewhere/ü' is outside repository at '/somewhere/ü' $ git ls-files $'/somewhere/\xc3\xbc' # NFC (the same error as above) In the 'fatal:' error message, there are three ü; the 1st and 2nd are in NFC, the 3rd is in NFD. Add test cases that follows the bug report, with the simplification that the 'ü' is replaced by an 'ä', which is already used as NFD and NFC in t3910. The solution is to add a call to precompose_string_if_needed() to this code in setup.c : `work_tree = precompose_string_if_needed(get_git_work_tree());` There is, however, a limitation with this very usage of Git: The (repo) local .gitconfig file is not used, only the global "core.precomposeunicode" is taken into account, if it is set (or not). To set it to true is a good recommendation anyway, and here is the analyzes from Jun T : The problem is the_repository->config->hash_initialized is set to 1 before the_repository->commondir is set to ".git". Due to this, .git/config is never read, and precomposed_unicode is never set to 1 (remains -1). run_builtin() { setup_git_directory() { strbuf_getcwd() { # setup.c:1542 precompose_{strbuf,string}_if_needed() { # precomposed_unicode is still -1 git_congig_get_bool("core.precomposeunicode") { git_config_check_init() { repo_read_config() { git_config_init() { # !!! the_repository->config->hash_initialized=1 # !!! } # does not read .git/config since # the_repository->commondir is still NULL } } } returns without converting to NFC } returns cwd in NFD } setup_discovered_git_dir() { set_git_work_tree(".") { repo_set_worktree() { # this function indirectly calls strbuf_getcwd() # --> precompose_{strbuf,string}_if_needed() --> # {git,repo}_config_get_bool("core.precomposeunicode"), # but does not try to read .git/config since # the_repository->config->hash_initialized # is already set to 1 above. And it will not read # .git/config even if hash_initialized is 0 # since the_repository->commondir is still NULL. the_repository->worktree = NFD } } } setup_git_env() { repo_setup_gitdir() { repo_set_commondir() { # finally commondir is set here the_repository->commondir = ".git" } } } } // END setup_git_directory Reported-by: Jun T <takimoto-j@kba.biglobe.ne.jp> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30Merge branch 'ps/refs-without-the-repository-updates'Junio C Hamano
Further clean-up the refs subsystem to stop relying on the_repository, and instead use the repository associated to the ref_store object. * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-30Merge branch 'ps/undecided-is-not-necessarily-sha1'Junio C Hamano
Before discovering the repository details, We used to assume SHA-1 as the "default" hash function, which has been corrected. Hopefully this will smoke out codepaths that rely on such an unwarranted assumptions. * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes
2024-05-29safe.directory: allow "lead/ing/path/*" matchJunio C Hamano
When safe.directory was introduced in v2.30.3 timeframe, 8959555c (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), it only allowed specific opt-out directories. Immediately after an embargoed release that included the change, 0f85c4a3 (setup: opt-out of check with safe.directory=*, 2022-04-13) was done as a response to loosen the check so that a single '*' can be used to say "I trust all repositories" for folks who host too many repositories to list individually. Let's further loosen the check to allow people to say "everything under this hierarchy is deemed safe" by specifying such a leading directory with "/*" appended to it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27config: clarify memory ownership in `git_config_pathname()`Patrick Steinhardt
The out parameter of `git_config_pathname()` is a `const char **` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char **`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-23Merge branch 'ps/refs-without-the-repository-updates' into ↵Junio C Hamano
ps/ref-storage-migration * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-22setup: fix bug with "includeIf.onbranch" when initializing dirPatrick Steinhardt
It was reported that git-init(1) can fail when initializing an existing directory in case the config contains an "includeIf.onbranch:" condition: $ mkdir repo $ git -c includeIf.onbranch:main.path=nonexistent init repo BUG: refs.c:2056: reference backend is unknown The same error can also be triggered when re-initializing an already existing repository. The bug has been introduced in 173761e21b (setup: start tracking ref storage format, 2023-12-29), which wired up the ref storage format. The root cause is in `init_db()`, which tries to read the config before we have initialized `the_repository` and most importantly its ref storage format. We eventually end up calling `include_by_branch()` and execute `refs_resolve_ref_unsafe()`, but because we have not initialized the ref storage format yet this will trigger the above bug. Interestingly, `include_by_branch()` has a mechanism that will only cause us to resolve the ref when `the_repository->gitdir` is set. This is also the reason why this only happens when we initialize an already existing directory or repository: `gitdir` is set in those cases, but not when creating a new directory. Now there are two ways to address the issue: - We can adapt `include_by_branch()` to also make the code conditional on whether `the_repository->ref_storage_format` is set. - We can shift around code such that we initialize the repository format before we read the config. While the first approach would be safe, it may also cause us to paper over issues where a ref store should have been set up. In our case for example, it may be reasonable to expect that re-initializing the repo will cause the "onbranch:" condition to trigger, but we would not do that if the ref storage format was not set up yet. This also used to work before the above commit that introduced this bug. Rearrange the code such that we set up the repository format before reading the config. This fixes the bug and ensures that "onbranch:" conditions can trigger. Reported-by: Heghedus Razvan <heghedus.razvan@protonmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Tested-by: Heghedus Razvan <heghedus.razvan@protonmail.com> [jc: fixed a test and backported to v2.44.0 codebase] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-20Merge branch 'kn/ref-transaction-symref'Junio C Hamano
Updates to symbolic refs can now be made as a part of ref transaction. * kn/ref-transaction-symref: refs: remove `create_symref` and associated dead code refs: rename `refs_create_symref()` to `refs_update_symref()` refs: use transaction in `refs_create_symref()` refs: add support for transactional symref updates refs: move `original_update_refname` to 'refs.c' refs: support symrefs in 'reference-transaction' hook files-backend: extract out `create_symref_lock()` refs: accept symref values in `ref_transaction_update()`
2024-05-17refs: drop `git_default_branch_name()`Patrick Steinhardt
The `git_default_branch_name()` function is a thin wrapper around `repo_default_branch_name()` with two differences: - We implicitly rely on `the_repository`. - We cache the default branch name. None of the callsites of `git_default_branch_name()` are hot code paths though, so the caching of the branch name is not really required. Refactor the callsites to use `repo_default_branch_name()` instead and drop `git_default_branch_name()`, thus getting rid of one more case where we rely on `the_repository`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17refs: rename `init_db` callback to avoid confusionPatrick Steinhardt
Reference backends have two callbacks `init` and `init_db`. The similarity of these two callbacks has repeatedly confused me whenever I was looking at them, where I always had to look up which of them does what. Rename the `init_db` callback to `create_on_disk`, which should hopefully be clearer. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16Merge branch 'ps/refs-without-the-repository'Junio C Hamano
The refs API lost functions that implicitly assumes to work on the primary ref_store by forcing the callers to pass a ref_store as an argument. * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
2024-05-16Merge branch 'ps/refs-without-the-repository' into ↵Junio C Hamano
ps/refs-without-the-repository-updates * ps/refs-without-the-repository: refs: remove functions without ref store cocci: apply rules to rewrite callers of "refs" interfaces cocci: introduce rules to transform "refs" to pass ref store refs: add `exclude_patterns` parameter to `for_each_fullref_in()` refs: introduce missing functions that accept a `struct ref_store`
2024-05-07cocci: apply rules to rewrite callers of "refs" interfacesPatrick Steinhardt
Apply the rules that rewrite callers of "refs" interfaces to explicitly pass `struct ref_store`. The resulting patch has been applied with the `--whitespace=fix` option. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06path: move `validate_headref()` to its only userPatrick Steinhardt
While `validate_headref()` is only called from `is_git_directory()` in "setup.c", it is currently implemented in "path.c". Move it over such that it becomes clear that it is only really used during setup in order to discover repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-29Sync with 2.44.1Johannes Schindelin
* maint-2.44: (41 commits) Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel ...
2024-04-19Sync with 2.43.4Johannes Schindelin
* maint-2.43: (40 commits) Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories ...
2024-04-19Sync with 2.42.2Johannes Schindelin
* maint-2.42: (39 commits) Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' ...
2024-04-19Sync with 2.41.1Johannes Schindelin
* maint-2.41: (38 commits) Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs ...
2024-04-19Sync with 2.40.2Johannes Schindelin
* maint-2.40: (39 commits) Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default ...
2024-04-19init.templateDir: consider this config setting protectedJohannes Schindelin
The ability to configuring the template directory is a delicate feature: It allows defining hooks that will be run e.g. during a `git clone` operation, such as the `post-checkout` hook. As such, it is of utmost importance that Git would not allow that config setting to be changed during a `git clone` by mistake, allowing an attacker a chance for a Remote Code Execution, allowing attackers to run arbitrary code on unsuspecting users' machines. As a defense-in-depth measure, to prevent minor vulnerabilities in the `git clone` code from ballooning into higher-serverity attack vectors, let's make this a protected setting just like `safe.directory` and friends, i.e. ignore any `init.templateDir` entries from any local config. Note: This does not change the behavior of any recursive clone (modulo bugs), as the local repository config is not even supposed to be written while cloning the superproject, except in one scenario: If a config template is configured that sets the template directory. This might be done because `git clone --recurse-submodules --template=<directory>` does not pass that template directory on to the submodules' initialization. Another scenario where this commit changes behavior is where repositories are _not_ cloned recursively, and then some (intentional, benign) automation configures the template directory to be used before initializing the submodules. So the caveat is that this could theoretically break existing processes. In both scenarios, there is a way out, though: configuring the template directory via the environment variable `GIT_TEMPLATE_DIR`. This change in behavior is a trade-off between security and backwards-compatibility that is struck in favor of security. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17init: refactor the template directory discovery into its own functionJohannes Schindelin
We will need to call this function from `hook.c` to be able to prevent hooks from running that were written as part of a `clone` but did not originate from the template directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-04-17fetch/clone: detect dubious ownership of local repositoriesJohannes Schindelin
When cloning from somebody else's repositories, it is possible that, say, the `upload-pack` command is overridden in the repository that is about to be cloned, which would then be run in the user's context who started the clone. To remind the user that this is a potentially unsafe operation, let's extend the ownership checks we have already established for regular gitdir discovery to extend also to local repositories that are about to be cloned. This protection extends also to file:// URLs. The fixes in this commit address CVE-2024-32004. Note: This commit does not touch the `fetch`/`clone` code directly, but instead the function used implicitly by both: `enter_repo()`. This function is also used by `git receive-pack` (i.e. pushes), by `git upload-archive`, by `git daemon` and by `git http-backend`. In setups that want to serve repositories owned by different users than the account running the service, this will require `safe.*` settings to be configured accordingly. Also note: there are tiny time windows where a time-of-check-time-of-use ("TOCTOU") race is possible. The real solution to those would be to work with `fstat()` and `openat()`. However, the latter function is not available on Windows (and would have to be emulated with rather expensive low-level `NtCreateFile()` calls), and the changes would be quite extensive, for my taste too extensive for the little gain given that embargoed releases need to pay extra attention to avoid introducing inadvertent bugs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2024-03-28Merge branch 'eb/hash-transition'Junio C Hamano
Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...
2024-03-21Merge branch 'jc/safe-implicit-bare'Junio C Hamano
Users with safe.bareRepository=explicit can still work from within $GIT_DIR of a seconary worktree (which resides at .git/worktrees/$name/) of the primary worktree without explicitly specifying the $GIT_DIR environment variable or the --git-dir=<path> option. * jc/safe-implicit-bare: setup: notice more types of implicit bare repositories