aboutsummaryrefslogtreecommitdiff
path: root/apply.c
AgeCommit message (Collapse)Author
7 daysMerge branch 'jc/whitespace-incomplete-line'Junio C Hamano
Fix whitespace correction for new-style empty context lines. * jc/whitespace-incomplete-line: apply: fix new-style empty context line triggering incomplete-line check
2026-03-30Merge branch 'jw/apply-corrupt-location'Junio C Hamano
"git apply" now reports the name of the input file along with the line number when it encounters a corrupt patch, and correctly resets the line counter when processing multiple patch files. * jw/apply-corrupt-location: apply: report input location in binary and garbage patch errors apply: report input location in header parsing errors apply: report the location of corrupt patches
2026-03-27Merge branch 'mf/apply-p-no-atoi'Junio C Hamano
"git apply -p<n>" parses <n> more carefully now. * mf/apply-p-no-atoi: apply.c: fix -p argument parsing
2026-03-17apply: fix new-style empty context line triggering incomplete-line checkJunio C Hamano
A new-style unified context diff represents an empty context line with an empty line (instead of a line with a single SP on it). The code to check whitespace errors in an incoming patch is designed to omit the first byte of a line (typically SP, "-", or "+") and pass the remainder of the line to the whitespace checker. Usually we do not pass a context line to the whitespace error checker, but when we are correcting errors, we do. This "remove the first byte and send the remainder" strategy of checking a line ended up sending a zero-length string to the whitespace checker when seeing a new-style empty context line, which caused the whitespace checker to say "ah, you do not even have a newline at the end!", leading to an "incomplete line" in the middle of the patch! Fix this by pretending that we got a traditional empty context line when we drive the whitespace checker. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-17apply: report input location in binary and garbage patch errorsJialong Wang
Several binary parsing paths in apply.c still report only line numbers. When more than one patch input is fed to a single invocation, that does not tell the user which input the line belongs to. Report the patch input location for corrupt and unrecognized binary patches, as well as the "patch with only garbage" case, and update the related tests. Signed-off-by: Jialong Wang <jerrywang183@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-17apply: report input location in header parsing errorsJialong Wang
Several header parsing errors in apply.c still report only line numbers. When applying more than one input, that does not tell the user which input the line belongs to. Report the patch input location for these header parsing errors, and update the related tests. While touching parse_git_diff_header(), update the helper state to use the current header line when reporting these errors. Signed-off-by: Jialong Wang <jerrywang183@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-17apply: report the location of corrupt patchesJialong Wang
When parsing a corrupt patch, git apply reports only the line number. That does not tell the user which input the line number refers to. Include the patch input path in the error message so the reported location is easier to use. Reset the line number for each patch input so the reported location stays correct when multiple input files are provided. Add tests for file input, standard input, multiple patch inputs, and existing binary-diff corrupt patch cases. Signed-off-by: Jialong Wang <jerrywang183@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-16apply.c: fix -p argument parsingMirko Faina
"git apply" has an option -p that takes an integer as its argument. Unfortunately the function apply_option_parse_p() in charge of parsing this argument uses atoi() to convert from string to integer, which allows a non-digit after the number (e.g. "1q") to be silently ignored. As a consequence, an argument that does not begin with a digit silently becomes a zero. Despite this command working fine when a non-positive argument is passed, it might be useful for the end user to know that their input contains non-digits that might've been unintended. Replace atoi() with strtol_i() to catch malformed inputs. Signed-off-by: Mirko Faina <mroik@delayed.space> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-04Merge branch 'jr/apply-directory-normalize'Junio C Hamano
"git apply --directory=./un/../normalized/path" now normalizes the given path before using it. * jr/apply-directory-normalize: apply: normalize path in --directory argument
2026-02-27Merge branch 'jc/whitespace-incomplete-line'Junio C Hamano
It does not make much sense to apply the "incomplete-line" whitespace rule to symbolic links, whose contents almost always lack the final newline. "git apply" and "git diff" are now taught to exclude them for a change to symbolic links. * jc/whitespace-incomplete-line: whitespace: symbolic links usually lack LF at the end
2026-02-20apply: normalize path in --directory argumentJoaquim Rocha
When passing a relative path like --directory=./some/sub, the leading "./" caused apply to prepend it literally to patch filenames, resulting in an error (invalid path). There may be more cases like this where users pass some/./path to the directory which can easily be normalized to an acceptable path, so these changes try to normalize the path before using it. Signed-off-by: Joaquim Rocha <joaquim@amutable.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-17Merge branch 'rs/xdiff-wo-the-repository'Junio C Hamano
Reduce dependency on the_repository of xdiff-interface layer. * rs/xdiff-wo-the-repository: xdiff-interface: stop using the_repository
2026-02-10xdiff-interface: stop using the_repositoryRené Scharfe
Use the algorithm-agnostic is_null_oid() and push the dependency of read_mmblob() on the_repository->objects to its callers. This allows it to be used with arbitrary object databases. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05whitespace: symbolic links usually lack LF at the endJunio C Hamano
For a patch that touches a symbolic link, it is perfectly normal that the contents ends with "\ No newline at end of file". The checks introduced recently to detect incomplete lines (i.e., a text file that lack the newline on its final line) should not trigger. Disable the check early for symbolic links, both in "git apply" and "git diff" and test them. For "git apply", we check only when the postimage is a symbolic link regardless of the preimage, and we only care about preimage when applying in reverse. Similarly, "git diff" would warn only when the postimage is a symbolic link, or the preimage when running "git diff -R". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05global: constify some pointers that are not written toCollin Funk
The recent glibc 2.43 release had the following change listed in its NEWS file: For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr, strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return pointers into their input arrays now have definitions as macros that return a pointer to a const-qualified type when the input argument is a pointer to a const-qualified type. When compiling with GCC 15, which defaults to -std=gnu23, this causes many warnings like this: merge-ort.c: In function ‘apply_directory_rename_modifications’: merge-ort.c:2734:36: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 2734 | char *last_slash = strrchr(cur_path, '/'); | ^~~~~~~ This patch fixes the more obvious ones by making them const when we do not write to the returned pointer. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30Merge branch 'js/test-symlink-windows'Junio C Hamano
Prepare test suite for Git for Windows that supports symbolic links. * js/test-symlink-windows: t7800: work around the MSYS path conversion on Windows t6423: introduce Windows-specific handling for symlinking to /dev/null t1305: skip symlink tests that do not apply to Windows t1006: accommodate for symlink support in MSYS2 t0600: fix incomplete prerequisite for a test case t0301: another fix for Windows compatibility t0001: handle `diff --no-index` gracefully mingw: special-case `open(symlink, O_CREAT | O_EXCL)` apply: symbolic links lack a "trustable executable bit" t9700: accommodate for Windows paths
2025-12-18apply: symbolic links lack a "trustable executable bit"Johannes Schindelin
When 0482c32c334b (apply: ignore working tree filemode when !core.filemode, 2023-12-26) fixed `git apply` to stop warning about executable files, it inadvertently changed the code flow also for symbolic links and directories. Let's narrow the scope of the special `!trust_executable_git` code path to apply only to regular files. This is needed to let t4115.5(symlink escape when creating new files) pass on Windows when symbolic link support is enabled in the MSYS2 runtime. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12apply: check and fix incomplete linesJunio C Hamano
The final line of a file that lacks the terminating newline at its end is called an incomplete line. In general they are frowned upon for many reasons (imagine concatenating two files with "cat A B" and what happens when A ends in an incomplete line, for example), and text-oriented tools often mishandle such a line. Implement checks in "git apply" for incomplete lines, which is off by default for backward compatibility's sake, so that "git apply --whitespace={fix,warn,error}" can notice, warn against, and fix them. As one of the new test shows, if you modify contents on an incomplete line in the original and leave the resulting line incomplete, it is still considered a whitespace error, the reasoning being that "you'd better fix it while at it if you are making a change on an incomplete line anyway", which may controversial. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12apply: revamp the parsing of incomplete linesJunio C Hamano
A patch file represents the incomplete line at the end of the file with two lines, one that is the usual "context" with " " as the first letter, "added" with "+" as the first letter, or "removed" with "-" as the first letter that shows the content of the line, plus an extra "\ No newline at the end of file" line that comes immediately after it. Ever since the apply machinery was written, the "git apply" machinery parses "\ No newline at the end of file" line independently, without even knowing what line the incomplete-ness applies to, simply because it does not even remember what the previous line was. This poses a problem if we want to check and warn on an incomplete line. Revamp the code that parses a fragment, to actually drop the '\n' at the end of the incoming patch file that terminates a line, so that check_whitespace() calls made from the code path actually sees an incomplete as incomplete. Note that the result of this parsing is not directly used by the code path that applies the patch. apply_one_fragment() function already checks if each of the patch text it handles is followed by a line that begins with a backslash to drop the newline at the end of the current line it is looking at. In a sense, this patch harmonizes the behaviour of the parsing side to what is already done in the application side. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-05Merge branch 'ps/object-file-wo-the-repository'Junio C Hamano
Reduce implicit assumption and dependence on the_repository in the object-file subsystem. * ps/object-file-wo-the-repository: object-file: get rid of `the_repository` in index-related functions object-file: get rid of `the_repository` in `force_object_loose()` object-file: get rid of `the_repository` in `read_loose_object()` object-file: get rid of `the_repository` in loose object iterators object-file: remove declaration for `for_each_file_in_obj_subdir()` object-file: inline `for_each_loose_file_in_objdir_buf()` object-file: get rid of `the_repository` when writing objects odb: introduce `odb_write_object()` loose: write loose objects map via their source object-file: get rid of `the_repository` in `finalize_object_file()` object-file: get rid of `the_repository` in `loose_object_info()` object-file: get rid of `the_repository` when freshening objects object-file: inline `check_and_freshen()` functions object-file: get rid of `the_repository` in `has_loose_object()` object-file: stop using `the_hash_algo` object-file: fix -Wsign-compare warnings
2025-07-23config: drop `git_config_get_string()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-23config: drop `git_config()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-16odb: introduce `odb_write_object()`Patrick Steinhardt
We do not have a backend-agnostic way to write objects into an object database. While there is `write_object_file()`, this function is rather specific to the loose object format. Introduce `odb_write_object()` to plug this gap. For now, this function is a simple wrapper around `write_object_file()` and doesn't even use the passed-in object database yet. This will change in subsequent commits, where `write_object_file()` is converted so that it works on top of an `odb_source`. `odb_write_object()` will then become responsible for deciding which source an object shall be written to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15Merge branch 'ps/object-store'Junio C Hamano
Code clean-up around object access API. * ps/object-store: odb: rename `read_object_with_reference()` odb: rename `pretend_object_file()` odb: rename `has_object()` odb: rename `repo_read_object_file()` odb: rename `oid_object_info()` odb: trivial refactorings to get rid of `the_repository` odb: get rid of `the_repository` when handling submodule sources odb: get rid of `the_repository` when handling the primary source odb: get rid of `the_repository` in `for_each()` functions odb: get rid of `the_repository` when handling alternates odb: get rid of `the_repository` in `odb_mkstemp()` odb: get rid of `the_repository` in `assert_oid_type()` odb: get rid of `the_repository` in `find_odb()` odb: introduce parent pointers object-store: rename files to "odb.{c,h}" object-store: rename `object_directory` to `odb_source` object-store: rename `raw_object_store` to `object_database`
2025-07-07apply: only write intents to add for new filesRaymond E. Pasco
In the "apply only to files" mode (i.e., neither --index nor --cached mode), the index should not be touched except to record intents to add when --intent-to-add is on. Because having --intent-to-add on sets update_index, to indicate that we may touch the index, we can't rely only on that flag in create_file() (which is called to write both new files and updated files) to decide whether to write an index entry; if we did, we would write an index entry for every file being patched (which would moreover be an intent-to-add entry despite not being a new file, because we are going to turn on the CE_INTENT_TO_ADD flag in add_index_entry() if we enter it here and ita_only is true). To decide whether to touch the index, we need to check the specific reason the index would be updated, rather than merely their aggregate in the update_index flag. Because we have already entered write_out_results() and are performing writes, we know that state->apply is true. If state->check_index is additionally true, we are in --index or --cached mode, which updates the index and should always write, whereas if we are merely in ita_only mode we must only write if the patch is a new file creation patch. Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07apply: read in the index in --intent-to-add modeRaymond E. Pasco
There are three main modes of operation for apply: applying only to the worktree, applying to the worktree and index (--index), and applying only to the index (--cached). The --intent-to-add flag modifies the first of these modes, applying only to the worktree, in a way which touches the index, because intents to add are special index entries. However, since its introduction in cff5dc09ed (apply: add --intent-to-add, 2018-05-26), it has not worked correctly in any but the most trivial (empty repository) cases, because the index is never read in (in apply, this is done in read_apply_cache()) before writing to it. This causes the operation to clobber the old, correct index with a new empty-tree index before writing intent-to-add entries to this empty index; the final result is that the index now records every existing file in the repository as deleted, which is incorrect. This error can be corrected by first reading the index. The update_index flag is correctly set if ita_only is true, because this flag causes the index to be updated. However, if we merely gate the call to read_apply_cache() behind update_index, then it will not be read when state->apply is false, even if it must be checked due to being in --index or --cached mode. Therefore, we instead read the index if it will be either checked or updated, because reading the index is a prerequisite to either. Reported-by: Ryan Hodges <rhodges@cisco.com> Original-patch-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Raymond E. Pasco <ray@ameretat.dev> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `has_object()`Patrick Steinhardt
Rename `has_object()` to `odb_has_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-27apply: set file mode when --reverse creates a deleted fileMark Mentovai
Commit 01aff0a (apply: correctly reverse patch's pre- and post-image mode bits, 2023-12-26) revised reverse_patches() to maintain the desired property that when only one of patch::old_mode and patch::new_mode is set, the mode will be carried in old_mode. That property is generally correct, with one notable exception: when creating a file, only new_mode will be set. Since reversing a deletion results in a creation, new_mode must be set in that case. Omitting handling for this case means that reversing a patch that removes an executable file will not result in the executable permission being set on the re-created file. Existing test coverage for file modes focuses only on mode changes of existing files. Swap old_mode and new_mode in reverse_patches() for what's represented in the patch as a file deletion, as it is transformed into a file creation under reversal. This causes git apply --reverse to set the executable permission properly when re-creating a deleted executable file. Add tests ensuring that git apply sets file modes correctly on file creation, both in the forward and reverse directions. Signed-off-by: Mark Mentovai <mark@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/parse-options-integers'Junio C Hamano
Update parse-options API to catch mistakes to pass address of an integral variable of a wrong type/size. * ps/parse-options-integers: parse-options: detect mismatches in integer signedness parse-options: introduce precision handling for `OPTION_UNSIGNED` parse-options: introduce precision handling for `OPTION_INTEGER` parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()` parse-options: support unit factors in `OPT_INTEGER()` global: use designated initializers for options parse: fix off-by-one for minimum signed values
2025-04-17parse-options: detect mismatches in integer signednessPatrick Steinhardt
It was reported that "t5620-backfill.sh" fails on s390x and sparc64 in a test that exercises the "--min-batch-size" command line option. The symptom was that the option didn't seem to have an effect: we didn't fetch objects with a batch size of 20, but instead fetched all objects at once. As it turns out, the root cause is that `--min-batch-size` uses `OPT_INTEGER()` to parse the command line option. While this macro expects the caller to pass a pointer to an integer, we instead pass a pointer to a `size_t`. This coincidentally works on most platforms, but it breaks apart on the mentioned platforms because they are big endian. This issue isn't specific to git-backfill(1): there are a couple of other places where we have the same type confusion going on. This indicates that the issue really is the interface that the parse-options subsystem provides -- it is simply too easy to get this wrong as there isn't any kind of compiler warning, and things just work on the most common systems. Address the systemic issue by introducing two new build asserts `BARF_UNLESS_SIGNED()` and `BARF_UNLESS_UNSIGNED()`. As the names already hint at, those macros will cause a compiler error when passed a value that is not signed or unsigned, respectively. Adapt `OPT_INTEGER()`, `OPT_UNSIGNED()` as well as `OPT_MAGNITUDE()` to use those asserts. This uncovers a small set of sites where we indeed have the same bug as in git-backfill(1). Adapt all of them to use the correct option. Reported-by: Todd Zullinger <tmz@pobox.com> Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03apply.c: *.txt -> *.adoc fixesTodd Zullinger
Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-10Merge branch 'pw/apply-ulong-overflow-check'Junio C Hamano
"git apply" internally uses unsigned long for line numbers and uses strtoul() to parse numbers on the hunk headers. It however forgot to check parse errors. * pw/apply-ulong-overflow-check: apply: detect overflow when parsing hunk header
2025-01-30apply: detect overflow when parsing hunk headerPhillip Wood
"git apply" uses strtoul() to parse the numbers in the hunk header but silently ignores overflows. As LONG_MAX is a legitimate return value for strtoul() we need to set errno to zero before the call to strtoul() and check that it is still zero afterwards. The error message we display is not particularly helpful as it does not say what was wrong. However, it seems pretty unlikely that users are going to trigger this error in practice and we can always improve it later if needed. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25Merge branch 'ps/apply-leakfix'Junio C Hamano
"git apply" had custom buffer management code that predated before use of strbuf got widespread, which has been updated to use strbuf, which also plugged some memory leaks. * ps/apply-leakfix: apply: refactor `struct image` to use a `struct strbuf` apply: rename members that track line count and allocation length apply: refactor code to drop `line_allocated` apply: introduce macro and function to init images apply: rename functions operating on `struct image` apply: reorder functions to move image-related things together
2024-09-23Merge branch 'ps/environ-wo-the-repository'Junio C Hamano
Code clean-up. * ps/environ-wo-the-repository: (21 commits) environment: stop storing "core.notesRef" globally environment: stop storing "core.warnAmbiguousRefs" globally environment: stop storing "core.preferSymlinkRefs" globally environment: stop storing "core.logAllRefUpdates" globally refs: stop modifying global `log_all_ref_updates` variable branch: stop modifying `log_all_ref_updates` variable repo-settings: track defaults close to `struct repo_settings` repo-settings: split out declarations into a standalone header environment: guard state depending on a repository environment: reorder header to split out `the_repository`-free section environment: move `set_git_dir()` and related into setup layer environment: make `get_git_namespace()` self-contained environment: move object database functions into object layer config: make dependency on repo in `read_early_config()` explicit config: document `read_early_config()` and `read_very_early_config()` environment: make `get_git_work_tree()` accept a repository environment: make `get_graft_file()` accept a repository environment: make `get_index_file()` accept a repository environment: make `get_object_directory()` accept a repository environment: make `get_git_common_dir()` accept a repository ...
2024-09-17apply: refactor `struct image` to use a `struct strbuf`Patrick Steinhardt
The `struct image` uses a character array to track the pre- or postimage of a patch operation. This has multiple downsides: - It is somewhat hard to track memory ownership. In fact, we have several memory leaks in git-apply(1) because we do not (and cannot easily) free the buffer in all situations. - We have to reinvent the wheel and manually implement a lot of functionality that would already be provided by `struct strbuf`. - We have to carefully track whether `update_pre_post_images()` can do an in-place update of the postimage or whether it has to allocate a new buffer for it. This is all rather cumbersome, and especially `update_pre_post_images()` is really hard to understand as a consequence even though what it is doing is rather trivial. Refactor the code to use a `struct strbuf` instead, addressing all of the above. Like this we can easily perform in-place updates in all situations, the logic to perform those updates becomes way simpler and the lifetime of the buffer becomes a ton easier to track. This refactoring also plugs some leaking buffers as a side effect. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17apply: rename members that track line count and allocation lengthPatrick Steinhardt
The `struct image` has two members `nr` and `alloc` that track the number of lines as well as how large its array is. It is somewhat easy to confuse these members with `len` though, which tracks the length of the `buf` member. Rename these members to `line_nr` and `line_alloc` respectively to avoid confusion. This is in line with how we typically name variables that track an array in this way. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17apply: refactor code to drop `line_allocated`Patrick Steinhardt
The `struct image` has two members `line` and `line_allocated`. The former member is the one that should be used throughout the code, whereas the latter one is used to track whether the lines have been allocated or not. In practice, the array of lines is always allocated. The reason why we have `line_allocated` is that `remove_first_line()` will advance the array pointer to drop the first entry, and thus it points into the array instead of to the array header. Refactor the function to use memmove(3P) instead, which allows us to get rid of this double bookkeeping. This is less efficient, but I doubt that this matters much in practice. If this judgement call is found to be wrong at a later point in time we can likely refactor the surrounding loop such that we first calculate the number of leading context lines to remove and then remove them in a single call to memmove(3P). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17apply: introduce macro and function to init imagesPatrick Steinhardt
We're about to convert the `struct image` to gain a `struct strbuf` member, which requires more careful initialization than just memsetting it to zeros. Introduce the `IMAGE_INIT` macro and `image_init()` function to prepare for this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17apply: rename functions operating on `struct image`Patrick Steinhardt
Rename functions operating on `struct image` to have a `image_` prefix to match our modern code style. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-17apply: reorder functions to move image-related things togetherPatrick Steinhardt
While most of the functions relating to `struct image` are relatively close to one another, `fuzzy_matchlines()` sits in between those even though it is rather unrelated. Reorder functions such that `struct image`-related functions are next to each other. While at it, move `clear_image()` to the top such that it is close to the struct definition itself. This makes this lifecycle-related thing easy to discover. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_dir()` accept a repositoryPatrick Steinhardt
The `get_git_dir()` function retrieves the path to the Git directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-09apply: support --ours, --theirs, and --union for three-way mergesAlex Henrie
--ours, --theirs, and --union are already supported in `git merge-file` for automatically resolving conflicts in favor of one version or the other, instead of leaving conflict markers in the file. Support them in `git apply -3` as well because the two commands do the same kind of file-level merges. In case in the future --ours, --theirs, and --union gain a meaning outside of three-way-merges, they do not imply --3way but rather must be specified alongside it. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14Merge branch 'jk/apply-patch-mode-check-fix'Junio C Hamano
The patch parser in 'git apply' has been a bit more lenient against unexpected mode bits, like 100664, recorded on extended header lines. * jk/apply-patch-mode-check-fix: apply: canonicalize modes read from patches
2024-08-05apply: canonicalize modes read from patchesJeff King
Git stores only canonical modes for blobs. So for a regular file, we care about only "100644" or "100755" (depending only on the executable bit), but never modes where the group or other permissions are more exotic. So never "100664", "100700", etc. When a file in the working tree has such a mode, we quietly turn it into one of the two canonical modes, and that's what is stored both in the index and in tree objects. However, we don't canonicalize modes we read from incoming patches in git-apply. These may appear in a few lines: - "old mode" / "new mode" lines for mode changes - "new file mode" lines for newly created files - "deleted file mode" for removing files For "new mode" and for "new file mode", this is harmless. The patch is asking the result to have a certain mode, but: - when we add an index entry (for --index or --cached), it is canonicalized as we create the entry, via create_ce_mode(). - for a working tree file, try_create_file() passes either 0777 or 0666 to open(), so what you get depends only on your umask, not any other bits (aside from the executable bit) in the original mode. However, for "old mode" and "deleted file mode", there is a minor annoyance. We compare the patch's expected preimage mode with the current state. But that current state is always going to be a canonical mode itself: - updating an index entry via --cached will have the canonical mode in the index - for updating a working tree file, check_preimage() runs the mode through ce_mode_from_stat(), which does the usual canonicalization So if the patch feeds a non-canonical mode, it's impossible for it to match, and we will always complain with something like: file has type 100644, expected 100664 Since this is just a warning, the operation proceeds, but it's confusing and annoying. These cases should be pretty rare in practice. Git would never produce a patch with non-canonical modes itself (since it doesn't store them). And while we do accept patches from other programs, all of those lines were invented by Git. So you'd need a program trying to be Git compatible, but not handling canonicalization the same way. Reportedly "quilt" is such a program. We should canonicalize the modes as we read them so that the user never sees the useless warning. A few notes on the tests: - I've covered instances of all lines for completeness, even though the "new mode" / "new file mode" ones behave OK currently. - the tests apply patches to both the index and working tree, and check the result of both. Again, we know that all of these paths canonicalize anyway, but it's giving us extra coverage (although we are even less likely to have such a bug now since we canonicalize up front). - the test patches are missing "index" lines, which is also something Git would never produce. But they don't matter for the test, they do match the case from quilt we saw in the wild, and they avoid some sha1/sha256 complexity. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08Merge branch 'ps/leakfixes-more'Junio C Hamano
More memory leaks have been plugged. * ps/leakfixes-more: (29 commits) builtin/blame: fix leaking ignore revs files builtin/blame: fix leaking prefixed paths blame: fix leaking data for blame scoreboards line-range: plug leaking find functions merge: fix leaking merge bases builtin/merge: fix leaking `struct cmdnames` in `get_strategy()` sequencer: fix memory leaks in `make_script_with_merges()` builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()` apply: fix leaking string in `match_fragment()` sequencer: fix leaking string buffer in `commit_staged_changes()` commit: fix leaking parents when calling `commit_tree_extended()` config: fix leaking "core.notesref" variable rerere: fix various trivial leaks builtin/stash: fix leak in `show_stash()` revision: free diff options builtin/log: fix leaking commit list in git-cherry(1) merge-recursive: fix memory leak when finalizing merge builtin/merge-recursive: fix leaking object ID bases builtin/difftool: plug memory leaks in `run_dir_diff()` object-name: free leaking object contexts ...