aboutsummaryrefslogtreecommitdiff
path: root/repository.c
AgeCommit message (Collapse)Author
2026-03-25Merge branch 'ps/object-counting'Junio C Hamano
The logic to count objects has been cleaned up. * ps/object-counting: odb: introduce generic object counting odb/source: introduce generic object counting object-file: generalize counting objects object-file: extract logic to approximate object count packfile: extract logic to count number of objects odb: stop including "odb/source.h"
2026-03-12Merge branch 'bc/sha1-256-interop-02'Junio C Hamano
The code to maintain mapping between object names in multiple hash functions is being added, written in Rust. * bc/sha1-256-interop-02: object-file-convert: always make sure object ID algo is valid rust: add a small wrapper around the hashfile code rust: add a new binary object map format rust: add functionality to hash an object rust: add a build.rs script for tests rust: fix linking binaries with cargo hash: expose hash context functions to Rust write-or-die: add an fsync component for the object map csum-file: define hashwrite's count as a uint32_t rust: add additional helpers for ObjectID hash: add a function to look up hash algo structs rust: add a hash algorithm abstraction rust: add a ObjectID struct hash: use uint32_t for object_id algorithm conversion: don't crash when no destination algo repository: require Rust support for interoperability
2026-03-12odb: stop including "odb/source.h"Patrick Steinhardt
The "odb.h" header currently includes the "odb/source.h" file. This is somewhat roundabout though: most callers shouldn't have to care about the `struct odb_source`, but should rather use the ODB-level functions. Furthermore, it means that a couple of definitions have to live on the source level even though they should be part of the generic interface. Reverse the relation between "odb/source.h" and "odb.h" and move the enums and typedefs that relate to the generic interfaces back into "odb.h". Add the necessary includes to all files that rely on the transitive include. Suggested-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-10Merge branch 'ar/config-hooks'Junio C Hamano
Allow hook commands to be defined (possibly centrally) in the configuration files, and run multiple of them for the same hook event. * ar/config-hooks: hook: add -z option to "git hook list" hook: allow out-of-repo 'git hook' invocations hook: allow event = "" to overwrite previous values hook: allow disabling config hooks hook: include hooks from the config hook: add "git hook list" command hook: run a list of hooks to prepare for multihook support hook: add internal state alloc/free callbacks
2026-03-05Merge branch 'ob/core-attributesfile-in-repository'Junio C Hamano
The core.attributesfile is intended to be set per repository, but were kept track of by a single global variable in-core, which has been corrected by moving it to per-repository data structure. * ob/core-attributesfile-in-repository: environment: move "branch.autoSetupMerge" into `struct repo_config_values` environment: stop using core.sparseCheckout globally environment: stop storing `core.attributesFile` globally
2026-03-04Merge branch 'kn/ref-location'Junio C Hamano
Allow the directory in which reference backends store their data to be specified. * kn/ref-location: refs: add GIT_REFERENCE_BACKEND to specify reference backend refs: allow reference location in refstorage config refs: receive and use the reference storage payload refs: move out stub modification to generic layer refs: extract out `refs_create_refdir_stubs()` setup: don't modify repo in `create_reference_database()`
2026-02-25refs: allow reference location in refstorage configKarthik Nayak
The 'extensions.refStorage' config is used to specify the reference backend for a given repository. Both the 'files' and 'reftable' backends utilize the $GIT_DIR as the reference folder by default in `get_main_ref_store()`. Since the reference backends are pluggable, this means that they could work with out-of-tree reference directories too. Extend the 'refStorage' config to also support taking an URI input, where users can specify the reference backend and the location. Add the required changes to obtain and propagate this value to the individual backends. Add the necessary documentation and tests. Traditionally, for linked worktrees, references were stored in the '$GIT_DIR/worktrees/<wt_id>' path. But when using an alternate reference storage path, it doesn't make sense to store the main worktree references in the new path, and the linked worktree references in the $GIT_DIR. So, let's store linked worktree references in '$ALTERNATE_REFERENCE_DIR/worktrees/<wt_id>'. To do this, create the necessary files and folders while also adding stubs in the $GIT_DIR path to ensure that it is still considered a Git directory. Ideally, we would want to pass in a `struct worktree *` to individual backends, instead of passing the `gitdir`. This allows them to handle worktree specific logic. Currently, that is not possible since the worktree code is: - Tied to using the global `the_repository` variable. - Is not setup before the reference database during initialization of the repository. Add a TODO in 'refs.c' to ensure we can eventually make that change. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-19hook: include hooks from the configAdrian Ratiu
Teach the hook.[hc] library to parse configs to populate the list of hooks to run for a given event. Multiple commands can be specified for a given hook by providing "hook.<friendly-name>.command = <path-to-hook>" and "hook.<friendly-name>.event = <hook-event>" lines. Hooks will be started in config order of the "hook.<name>.event" lines and will be run sequentially (.jobs == 1) like before. Running the hooks in parallel will be enabled in a future patch. The "traditional" hook from the hookdir is run last, if present. A strmap cache is added to struct repository to avoid re-reading the configs on each rook run. This is useful for hooks like the ref-transaction which gets executed multiple times per process. Examples: $ git config --get-regexp "^hook\." hook.bar.command=~/bar.sh hook.bar.event=pre-commit # Will run ~/bar.sh, then .git/hooks/pre-commit $ git hook run pre-commit Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-17environment: stop storing `core.attributesFile` globallyOlamide Caleb Bello
The `core.attributeFile` config value is parsed in git_default_core_config(), loaded eagerly and stored in the global variable `git_attributes_file`. Storing this value in a global variable can lead to it being overwritten by another repository when more than one Git repository run in the same Git process. Create a new struct `repo_config_values` to hold this value and other repository dependent values parsed by `git_default_config()`. This will ensure the current behaviour remains the same while also enabling the libification of Git. An accessor function 'repo_config_values()' s created to ensure that we do not access an uninitialized repository, or an instance of a different repository than the current one. Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07hash: use uint32_t for object_id algorithmbrian m. carlson
We currently use an int for this value, but we'll define this structure from Rust in a future commit and we want to ensure that our data types are exactly identical. To make that possible, use a uint32_t for the hash algorithm. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-07repository: require Rust support for interoperabilitybrian m. carlson
We'll be implementing some of our interoperability code, like the loose object map, in Rust. While the code currently compiles with the old loose object map format, which is written entirely in C, we'll soon replace that with the Rust-based implementation. Require the use of Rust for compatibility mode and die if it is not supported. Because the repo argument is not used when Rust is missing, cast it to void to silence the compiler warning, which we do not care about. Add a prerequisite in our tests, RUST, that checks if Rust functionality is available and use it in the tests that handle interoperability. This is technically a regression in functionality compared to our existing state, but pack index v3 is not yet implemented and thus the functionality is mostly quite broken, which is why we've recently marked this functionality as experimental. We don't believe anyone is getting useful use out of the interoperability code in its current state, so no actual users should be negatively impacted by this change. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05Merge branch 'ar/submodule-gitdir-tweak'Junio C Hamano
Avoid local submodule repository directory paths overlapping with each other by encoding submodule names before using them as path components. * ar/submodule-gitdir-tweak: submodule: detect conflicts with existing gitdir configs submodule: hash the submodule name for the gitdir path submodule: fix case-folding gitdir filesystem collisions submodule--helper: fix filesystem collisions by encoding gitdir paths builtin/credential-store: move is_rfc3986_unreserved to url.[ch] submodule--helper: add gitdir migration command submodule: allow runtime enabling extensions.submodulePathConfig submodule: introduce extensions.submodulePathConfig builtin/submodule--helper: add gitdir command submodule: always validate gitdirs inside submodule_name_to_gitdir submodule--helper: use submodule_name_to_gitdir in add_submodule
2026-01-12submodule: introduce extensions.submodulePathConfigAdrian Ratiu
The idea of this extension is to abstract away the submodule gitdir path implementation: everyone is expected to use the config and not worry about how the path is computed internally, either in git or other implementations. With this extension enabled, the submodule.<name>.gitdir repo config becomes the single source of truth for all submodule gitdir paths. The submodule.<name>.gitdir config is added automatically for all new submodules when this extension is enabled. Git will throw an error if the extension is enabled and a config is missing, advising users how to migrate. Migration is manual for now. E.g. to add a missing config entry for an existing "foo" module: git config submodule.foo.gitdir .git/modules/foo Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Suggested-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30Merge branch 'gf/clear-path-cache-cleanup'Junio C Hamano
Code clean-up. * gf/clear-path-cache-cleanup: repository: remove duplicate free of cache->squash_msg
2025-12-19repository: remove duplicate free of cache->squash_msgGreg Funni
Thankfully, it is set to NULL, so no security consequences. However, this is still a mistake that must be rectified. Signed-off-by: Greg Funni <gfunni234@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: handle changing a repository's commondirPatrick Steinhardt
The function `repo_set_gitdir()` is called in two situations: - To initialize the repository with its discovered location. As part of this we also set up the new object database. - To update the repository's discovered location in case the process changes its working directory so that we update relative paths. This means we also have to update any relative paths that are potentially used in the object database. In the context of the object database we ideally wouldn't ever have to worry about the second case: if all paths used by our object database sources were absolute, then we wouldn't have to update them. But unfortunately, the paths aren't only used to locate files owned by the given source, but we also use them for reporting purposes. One such example is `repo_get_object_directory()`, where we cannot just change semantics to always return absolute paths, as that is likely to break tooling out there. One solution to this would be to have both a "display path" and an "internal path". This would allow us to use internal paths for all internal matters, but continue to use the potentially-relative display paths so that we don't break compatibility. But converting the codebase to honor this split is quite a messy endeavour, and it wouldn't even help us with the goal to get rid of the need to update the display path on chdir(3p). Another solution would be to rework "setup.c" so that we never have to update paths in the first place. In that case, we'd only initialize the repository once we have figured out final locations for all directories. This would be a significant simplification of that subsystem indeed, but the current logic is so messy that it would take significant investments to get there. Meanwhile though, while object sources may still use relative paths, the best thing we can do is to handle the reparenting of the object source paths in the object database itself. This can be done by registering one callback for each object database so that we get notified whenever the current working directory changes, and we then perform the reparenting ourselves. Ideally, this wouldn't even happen on the object database level, but instead handled by each object database source. But we don't yet have proper pluggable object database sources, so this will need to be handled at a later point in time. The logic itself is rather simple: - We register the callback when creating the object database. - We unregister the callback when releasing it again. - We split up `set_git_dir_1()` so that it becomes possible to skip recreating the object database. This is required because the function is called both when the current working directory changes, but also when we set up the repository. Calling this function without skipping creation of the ODB will result in a bug in case it's already created. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: handle initialization of sources in `odb_new()`Patrick Steinhardt
The logic to set up a new object database is currently distributed across two functions in "repository.c": - In `initialize_repository()` we initialize an empty object database. This object database is not fully initialized and doesn't have any sources attached to it. - The primary object database source is then created in `repo_set_gitdir()`. Ideally though, the logic should be entirely self-contained so that we can iterate more readily on how exactly the sources themselves get set up. Refactor `odb_new()` to handle both allocation and setup of the object database. This ensures that the object database is always initialized and ready for use, and it allows us to change how the sources get set up eventually. Note that `repo_set_gitdir()` still reaches into the sources when the function gets called with an already-initialized object database. This will be fixed in the next commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25odb: move logic to disable ref updates into repoPatrick Steinhardt
Our object database sources have a field `disable_ref_updates`. This field can obviously be set to disable reference updates, but it is somewhat curious that this logic is hosted by the object database. The reason for this is that it was primarily added to keep us from accidentally updating references while an ODB transaction is ongoing. Any objects part of the transaction have not yet been committed to disk, so new references that point to them might get corrupted in case we never end up committing the transaction. As such, whenever we create a new transaction we set up a new temporary ODB source and mark it as disabling reference updates. This has one (and only one?) upside: once we have committed the transaction, the temporary source will be dropped and thus we clean up the disabled reference updates automatically. But other than that, it's somewhat misdesigned: - We can have multiple ODB sources, but only the currently active source inhibits reference updates. - We're mixing concerns of the refdb with the ODB. Arguably, the decision of whether we can update references or not should be handled by the refdb. But that wouldn't be a great fit either, as there can be one refdb per worktree. So we'd again have the same problem that a "global" intent becomes localized to a specific instance. Instead, move the setting into the repository. While at it, convert it into a boolean. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-19odb: refactor `odb_clear()` to `odb_free()`Patrick Steinhardt
The function `odb_clear()` releases all resources allocated to an object database and ensures that all fields become zero'd out. Despite its naming though it doesn't really clear the object database so that it becomes ready for reuse afterwards again -- the caller would first have to reinitialize it, and that contradicts the terminology of "clearing" as we have defined it in our coding guidelines. There isn't really only a reason to have "clearing" semantics, either. There's only a single caller of `odb_clear()`, and that caller also ends up freeing the object database structure itself. Refactor the function to have "freeing" semantics instead, so that the structure itself is also freed, which allows us to drop some useless boilerplate to zero out the structure's members. This refactoring reveals that we're trying to close the commit graph multiple times: once directly via `free_commit_graph()`, and once via `odb_close()`. Drop the former call. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-03odb: introduce `odb_source_new()`Patrick Steinhardt
We have three different locations where we create a new ODB source. Deduplicate the logic via a new `odb_source_new()` function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-18Merge branch 'pw/3.0-commentchar-auto-deprecation'Junio C Hamano
"core.commentChar=auto" that attempts to dynamically pick a suitable comment character is non-workable, as it is too much trouble to support for little benefit, and is marked as deprecated. * pw/3.0-commentchar-auto-deprecation: commit: print advice when core.commentString=auto config: warn on core.commentString=auto breaking-changes: deprecate support for core.commentString=auto
2025-08-26config: warn on core.commentString=autoPhillip Wood
As support for this setting was deprecated in the last commit print a warning (or die when WITH_BREAKING_CHANGES is enabled) if it is set. Avoid bombarding the user with warnings by only printing it (a) when running commands that call "git commit" and (b) only once per command. Some scaffolding is added to repo_read_config() to allow it to detect deprecated config settings and warn about them. As both "core.commentChar" and "core.commentString" set the comment character we record which one of them is used and tailor the warning message appropriately. Note the odd combination of die_message() followed by die(NULL) is to allow the next commit to insert a call to advise() in the middle. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-11odb: store locality in object database sourcesPatrick Steinhardt
Object database sources are classified either as: - Local, which means that the source is the repository's primary source. This is typically ".git/objects". - Non-local, which is everything else. Most importantly this includes alternates and quarantine directories. This locality is often computed ad-hoc by checking whether a given object source is the first one. This works, but it is quite roundabout. Refactor the code so that we store locality when creating the sources in the first place. This makes it both more accessible and robust. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'ps/object-store'Junio C Hamano
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-07repository: move 'repository_format_precious_objects' to repo scopeAyush Chandekar
The 'extensions.preciousObjects' setting when set true, prevents operations that might drop objects from the object storage. This setting is populated in the global variable 'repository_format_precious_objects'. Move this global variable to repo scope by adding it to 'struct repository and also refactor all the occurences accordingly. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: introduce parent pointersPatrick Steinhardt
In subsequent commits we'll get rid of our use of `the_repository` in "odb.c" in favor of explicitly passing in a `struct object_database` or a `struct odb_source`. In some cases though we'll need access to the repository, for example to read a config value from it, but we don't have a way to access the repository owning a specific object database. Introduce parent pointers for `struct object_database` to its owning repository as well as for `struct odb_source` to its owning object database, which will allow us to adapt those use cases. Note that this change requires us to pass through the object database to `link_alt_odb_entry()` so that we can set up the parent pointers for any source there. The callchain is adapted to pass through the object database accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename `object_directory` to `odb_source`Patrick Steinhardt
The `object_directory` structure is used as an access point for a single object directory like ".git/objects". While the structure isn't yet fully self-contained, the intent is for it to eventually contain all information required to access objects in one specific location. While the name "object directory" is a good fit for now, this will change over time as we continue with the agenda to make pluggable object databases a thing. Eventually, objects may not be accessed via any kind of directory at all anymore, but they could instead be backed by any kind of durable storage mechanism. While it seems quite far-fetched for now, it is thinkable that eventually this might even be some form of a database, for example. As such, the current name of this structure will become worse over time as we evolve into the direction of pluggable ODBs. Immediate next steps will start to carve out proper self-contained object directories, which requires us to pass in these object directories as parameters. Based on our modern naming schema this means that those functions should then be named after their subsystem, which means that we would start to bake the current name into the codebase more and more. Let's preempt this by renaming the structure. There have been a couple alternatives that were discussed: - `odb_backend` was discarded because it led to the association that one object database has a single backend, but the model is that one alternate has one backend. Furthermore, "backend" is more about the actual backing implementation and less about the high-level concept. - `odb_alternate` was discarded because it is a bit of a stretch to also call the main object directory an "alternate". Instead, pick `odb_source` as the new name. It makes it sufficiently clear that there can be multiple sources and does not cause confusion when mixed with the already-existing "alternate" terminology. In the future, this change allows us to easily introduce for example a `odb_files_source` and other format-specific implementations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename `raw_object_store` to `object_database`Patrick Steinhardt
The `raw_object_store` structure is the central entry point for reading and writing objects in a repository. The main purpose of this structure is to manage object directories and provide an interface to access and write objects in those object directories. Right now, many of the functions associated with the raw object store implicitly rely on `the_repository` to get access to its `objects` pointer, which is the `raw_object_store`. As we want to generally get rid of using `the_repository` across our codebase we will have to convert this implicit dependency on this global variable into an explicit parameter. This conversion can be done by simply passing in an explicit pointer to a repository and then using its `->objects` pointer. But there is a second effort underway, which is to make the object subsystem more selfcontained so that we can eventually have pluggable object backends. As such, passing in a repository wouldn't make a ton of sense, and the goal is to convert the object store interfaces such that we always pass in a reference to the `raw_object_store` instead. This will expose the `raw_object_store` type to a lot more callers though, which surfaces that this type is named somewhat awkwardly. The "raw_" prefix makes readers wonder whether there is a non-raw variant of the object store, but there isn't. Furthermore, we nowadays want to name functions in a way that they can be clearly attributed to a specific subsystem, but calling them e.g. `raw_object_store_has_object()` is just too unwieldy, even when dropping the "raw_" prefix. Instead, rename the structure to `object_database`. This term is already used a lot throughout our codebase, and it cannot easily be mistaken for "object directories", either. Furthermore, its acronym ODB is already well-known and works well as part of a function's name, like for example `odb_has_object()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28repo-settings: introduce function to clear structPatrick Steinhardt
We don't provide a way to clear a `struct repo_settings`, and instead open-code this in `repo_clear()`. This is mixing up concerns and means that developers have to touch multiple files whenever they add a new field to the structure in case the associated resources need to be released. Provide a new `repo_settings_clear()` function to improve this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-07path: refactor `repo_worktree_path()` family of functionsPatrick Steinhardt
As explained in an earlier commit, we're refactoring path-related functions to provide a consistent interface for computing paths into the commondir, gitdir and worktree. Refactor the "worktree" family of functions accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-02worktree: add `relativeWorktrees` extensionCaleb White
A new extension, `relativeWorktrees`, is added to indicate that at least one worktree in the repository has been linked with relative paths. This ensures older Git versions do not attempt to automatically prune worktrees with relative paths, as they would not not recognize the paths as being valid. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Caleb White <cdwhite3@pm.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-23Merge branch 'ps/environ-wo-the-repository'Junio C Hamano
Code clean-up. * ps/environ-wo-the-repository: (21 commits) environment: stop storing "core.notesRef" globally environment: stop storing "core.warnAmbiguousRefs" globally environment: stop storing "core.preferSymlinkRefs" globally environment: stop storing "core.logAllRefUpdates" globally refs: stop modifying global `log_all_ref_updates` variable branch: stop modifying `log_all_ref_updates` variable repo-settings: track defaults close to `struct repo_settings` repo-settings: split out declarations into a standalone header environment: guard state depending on a repository environment: reorder header to split out `the_repository`-free section environment: move `set_git_dir()` and related into setup layer environment: make `get_git_namespace()` self-contained environment: move object database functions into object layer config: make dependency on repo in `read_early_config()` explicit config: document `read_early_config()` and `read_very_early_config()` environment: make `get_git_work_tree()` accept a repository environment: make `get_graft_file()` accept a repository environment: make `get_index_file()` accept a repository environment: make `get_object_directory()` accept a repository environment: make `get_git_common_dir()` accept a repository ...
2024-09-12environment: make `get_git_work_tree()` accept a repositoryPatrick Steinhardt
The `get_git_work_tree()` function retrieves the path of the work tree of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_graft_file()` accept a repositoryPatrick Steinhardt
The `get_graft_file()` function retrieves the path to the graft file of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_index_file()` accept a repositoryPatrick Steinhardt
The `get_index_file()` function retrieves the path to the index file of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_object_directory()` accept a repositoryPatrick Steinhardt
The `get_object_directory()` function retrieves the path to the object directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_common_dir()` accept a repositoryPatrick Steinhardt
The `get_git_common_dir()` function retrieves the path to the common directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_dir()` accept a repositoryPatrick Steinhardt
The `get_git_dir()` function retrieves the path to the Git directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05object: clear grafts when clearing parsed object poolPatrick Steinhardt
We do not clear grafts part of the parsed object pool when clearing the pool itself, which can lead to memory leaks when a repository is being cleared. Fix this by moving `reset_commit_grafts()` into "object.c" and making it part of the `struct parsed_object_pool` interface such that we can call it from `parsed_object_pool_clear()`. Adapt `parsed_object_pool_new()` to take and store a reference to its owning repository, which is needed by `unparse_commit()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-13Merge branch 'ps/ref-storage-migration' into ps/use-the-repositoryJunio C Hamano
* ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version
2024-06-06refs: convert ref storage format to an enumPatrick Steinhardt
The ref storage format is tracked as a simple unsigned integer, which makes it harder than necessary to discover what that integer actually is or where its values are defined. Convert the ref storage format to instead be an enum. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-30Merge branch 'jc/undecided-is-not-necessarily-sha1-fix'Junio C Hamano
The base topic started to make it an error for a command to leave the hash algorithm unspecified, which revealed a few commands that were not ready for the change. Give users a knob to revert back to the "default is sha-1" behaviour as an escape hatch, and start fixing these breakages. * jc/undecided-is-not-necessarily-sha1-fix: apply: fix uninitialized hash function builtin/hash-object: fix uninitialized hash function builtin/patch-id: fix uninitialized hash function t1517: test commands that are designed to be run outside repository setup: add an escape hatch for "no more default hash algorithm" change
2024-05-30Merge branch 'ps/refs-without-the-repository-updates'Junio C Hamano
Further clean-up the refs subsystem to stop relying on the_repository, and instead use the repository associated to the ref_store object. * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks
2024-05-30Merge branch 'ps/undecided-is-not-necessarily-sha1'Junio C Hamano
Before discovering the repository details, We used to assume SHA-1 as the "default" hash function, which has been corrected. Hopefully this will smoke out codepaths that rely on such an unwarranted assumptions. * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes
2024-05-21setup: add an escape hatch for "no more default hash algorithm" changeJunio C Hamano
Partially revert c8aed5e8 (repository: stop setting SHA1 as the default object hash, 2024-05-07), to keep end-user systems still broken when we have gap in our test coverage but yet give them an escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment variable to "sha1" in order to revert to the previous behaviour, in case we haven't done a thorough job in fixing the fallout from c8aed5e8. After we build confidence, we should remove the escape hatch support, but we are not there yet after only fixing three commands (hash-object, apply, and patch-id) in this series. Due to the way the end-user facing GIT_DEFAULT_HASH environment variable is used in our test suite, we unfortunately cannot reuse it for this purpose. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17refs: retrieve worktree ref stores via associated repositoryPatrick Steinhardt
Similar as with the preceding commit, the worktree ref stores are always looked up via `the_repository`. Also, again, those ref stores are stored in a global map. Refactor the code so that worktrees have a pointer to their repository. Like this, we can move the global map into `struct repository` and stop using `the_repository`. With this change, we can now in theory look up worktree ref stores for repositories other than `the_repository`. In practice, the worktree code will need further changes to look up arbitrary worktrees. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-17refs: pass repo when retrieving submodule ref storePatrick Steinhardt
Looking up submodule ref stores has two deficiencies: - The initialized subrepo will be attributed to `the_repository`. - The submodule ref store will be tracked in a global map. This makes it impossible to have submodule ref stores for a repository other than `the_repository`. Modify the function to accept the parent repository as parameter and move the global map into `struct repository`. Like this it becomes possible to look up submodule ref stores for arbitrary repositories. Note that this also adds a new reference to `the_repository` in `resolve_gitlink_ref()`, which is part of the refs interfaces. This will get adjusted in the next patch. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>