aboutsummaryrefslogtreecommitdiff
path: root/diff.c
AgeCommit message (Collapse)Author
7 daysMerge branch 'mm/line-log-use-standard-diff-output'Junio C Hamano
The way the "git log -L<range>:<file>" feature is bolted onto the log/diff machinery is being reworked a bit to make the feature compatible with more diff options, like -S/G. * mm/line-log-use-standard-diff-output: doc: note that -L supports patch formatting and pickaxe options t4211: add tests for -L with standard diff options line-log: route -L output through the standard diff pipeline line-log: fix crash when combined with pickaxe options
2026-03-16line-log: route -L output through the standard diff pipelineMichael Montalbo
`git log -L` has always bypassed the standard diff pipeline. `dump_diff_hacky()` in line-log.c hand-rolls its own diff headers and hunk output, which means most diff formatting options are silently ignored. A NEEDSWORK comment has acknowledged this since the feature was introduced: /* * NEEDSWORK: manually building a diff here is not the Right * Thing(tm). log -L should be built into the diff pipeline. */ Remove `dump_diff_hacky()` and its helpers and route -L output through `builtin_diff()` / `fn_out_consume()`, the same path used by `git diff` and `git log -p`. The mechanism is a pair of callback wrappers that sit between `xdi_diff_outf()` and `fn_out_consume()`, filtering xdiff's output to only the tracked line ranges. To ensure xdiff emits all lines within each range as context, the context length is inflated to span the largest range. Wire up the `-L` implies `--patch` default in revision setup rather than forcing it at output time, so `line_log_print()` is just `diffcore_std()` + `diff_flush()` with no format save/restore. Rename detection is a no-op since pairs are already resolved during the history walk in `queue_diffs()`, but running `diffcore_std()` means `-S`/`-G` (pickaxe), `--orderfile`, and `--diff-filter` now work with `-L`, and `diff_resolve_rename_copy()` sets pair statuses correctly without manual assignment. Switch `diff_filepair_dup()` from `xmalloc` to `xcalloc` so that new fields (including `line_ranges`) are zero-initialized by default. As a result, diff formatting options that were previously silently ignored (e.g. --word-diff, --no-prefix, -w, --color-moved) now work with -L, and output gains `index` lines, `new file mode` headers, and funcname context in `@@` headers. This is a user-visible output change: tools that parse -L output may need to handle the additional header lines. The context-length inflation means xdiff may process more output than needed for very wide line ranges, but benchmarks on files up to 7800 lines show no measurable regression. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-03-10Merge branch 'mm/diff-no-index-find-object'Junio C Hamano
"git diff --no-index --find-object=<object-name>" outside a repository of course wouldn't be able to find the object and died while parsing the command line, which is made to die in a bit more user-friendly way. * mm/diff-no-index-find-object: diff: fix crash with --find-object outside repository
2026-03-09Merge branch 'lp/diff-stat-utf8-display-width-fix'Junio C Hamano
"git log --graph --stat" did not count the display width of colored graph part of its own output correctly, which has been corrected. * lp/diff-stat-utf8-display-width-fix: t4052: test for diffstat width when prefix contains ANSI escape codes diff: handle ANSI escape codes in prefix when calculating diffstat width
2026-03-04Merge branch 'en/merge-ort-almost-wo-the-repository'Junio C Hamano
Mark the marge-ort codebase to prevent more uses of the_repository from getting added. * en/merge-ort-almost-wo-the-repository: replay: prevent the_repository from coming back merge-ort: prevent the_repository from coming back merge-ort: replace the_hash_algo with opt->repo->hash_algo merge-ort: replace the_repository with opt->repo merge-ort: pass repository to write_tree() merge,diff: remove the_repository check before prefetching blobs
2026-03-02diff: fix crash with --find-object outside repositoryMichael Montalbo
When "git diff --find-object=<oid>" is run outside a git repository, the option parsing callback eagerly resolves the OID via repo_get_oid(), which reaches get_main_ref_store() and hits a BUG() assertion because no repository has been set up. Check startup_info->have_repository before attempting to resolve the OID, and return a user-friendly error instead. Signed-off-by: Michael Montalbo <mmontalbo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-27Merge branch 'jc/whitespace-incomplete-line'Junio C Hamano
It does not make much sense to apply the "incomplete-line" whitespace rule to symbolic links, whose contents almost always lack the final newline. "git apply" and "git diff" are now taught to exclude them for a change to symbolic links. * jc/whitespace-incomplete-line: whitespace: symbolic links usually lack LF at the end
2026-02-27diff: handle ANSI escape codes in prefix when calculating diffstat widthLorenzoPegorari
The diffstat width is calculated by taking the terminal width and incorrectly subtracting the `strlen()` of `line_prefix`, instead of the actual display width of `line_prefix`, which may contain ANSI escape codes (e.g., ANSI-colored strings in `log --graph --stat`). Utilize the display width instead, obtained via `utf8_strnwidth()` with the flag `skip_ansi`. Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-21merge,diff: remove the_repository check before prefetching blobsElijah Newren
Prefetching of blobs from promisor remotes was added to diff in 7fbbcb21b162 (diff: batch fetching of missing blobs, 2019-04-05). In that commit, https://lore.kernel.org/git/20190405170934.20441-1-jonathantanmy@google.com/ was squashed into https://lore.kernel.org/git/44de02e584f449481e6fb00cf35d74adf0192e9d.1553895166.git.jonathantanmy@google.com/ without the extra explanation about the squashed changes being added to the commit message; in particular, this explanation from that first link is absent: > Also, prefetch only if the repository being diffed is the_repository > (because we do not support lazy fetching for any other repository > anyway). Then, later, this checking was spread from diff.c to diffcore-rename.c and diffcore-break.c by 95acf11a3dc3 (diff: restrict when prefetching occurs, 2020-04-07) and then further split in d331dd3b0c82 (diffcore-rename: allow different missing_object_cb functions, 2021-06-22). I also copied the logic from prefetching blobs from diff.c to merge-ort.c in 2bff554b23e8 (merge-ort: add prefetching for content merges, 2021-06-22). The reason for all these checks was noted above -- we only supported lazy fetching for the_repository. However, that changed with ef830cc43412 (promisor-remote: teach lazy-fetch in any repo, 2021-06-17), so these checks are now unnecessary. Remove them. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-13Merge branch 'cf/c23-const-preserving-strchr-updates-0'Junio C Hamano
ISO C23 redefines strchr and friends that tradiotionally took a const pointer and returned a non-const pointer derived from it to preserve constness (i.e., if you ask for a substring in a const string, you get a const pointer to the substring). Update code paths that used non-const pointer to receive their results that did not have to be non-const to adjust. * cf/c23-const-preserving-strchr-updates-0: gpg-interface: remove an unnecessary NULL initialization global: constify some pointers that are not written to
2026-02-05whitespace: symbolic links usually lack LF at the endJunio C Hamano
For a patch that touches a symbolic link, it is perfectly normal that the contents ends with "\ No newline at end of file". The checks introduced recently to detect incomplete lines (i.e., a text file that lack the newline on its final line) should not trigger. Disable the check early for symbolic links, both in "git apply" and "git diff" and test them. For "git apply", we check only when the postimage is a symbolic link regardless of the preimage, and we only care about preimage when applying in reverse. Similarly, "git diff" would warn only when the postimage is a symbolic link, or the preimage when running "git diff -R". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05global: constify some pointers that are not written toCollin Funk
The recent glibc 2.43 release had the following change listed in its NEWS file: For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr, strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return pointers into their input arrays now have definitions as macros that return a pointer to a const-qualified type when the input argument is a pointer to a const-qualified type. When compiling with GCC 15, which defaults to -std=gnu23, this causes many warnings like this: merge-ort.c: In function ‘apply_directory_rename_modifications’: merge-ort.c:2734:36: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] 2734 | char *last_slash = strrchr(cur_path, '/'); | ^~~~~~~ This patch fixes the more obvious ones by making them const when we do not write to the returned pointer. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-02-05Merge branch 'lp/diff-stat-utf8-display-width-fix'Junio C Hamano
The computation of column width made by "git diff --stat" was confused when pathnames contain non-ASCII characters. * lp/diff-stat-utf8-display-width-fix: t4073: add test for diffstat paths length when containing UTF-8 chars diff: improve scaling of filenames in diffstat to handle UTF-8 chars
2026-01-16diff: improve scaling of filenames in diffstat to handle UTF-8 charsLorenzoPegorari
The `show_stats()` function tries to scale the filenames in the diffstat to ensure they don't exceed the given `name-width`. It does so by calculating the "display width" of the characters to be dropped, but then advances the filename pointer by that number of bytes. However, the "display width" of a character is not always equal to its byte count. The result is that sometimes, when displaying UTF-8 characters, filenames exceed the given `name-width`, and frequently the bytes of the UTF-8 characters are truncated. The following is an example of the issue, where the 2 files are "HelloHi" and "Hello你好", and `name-width=6`: ...oHi | 0 ...<BD><A0>好 | 0 Make the filename pointer move by the actual number of bytes of the characters to drop from the filename, rather than their display width, using the `utf8_width()` function. Force `len` to not be less than 0 (this happens if the given `name-width` is 2 or less), otherwise an infinite loop is entered. Signed-off-by: LorenzoPegorari <lorenzo.pegorari2002@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30diff: avoid segfault with freed entriesDerrick Stolee
When computing a diff in a partial clone, there is a chance that we could trigger a prefetch of missing objects at the same time as we are freeing entries from the global diff queue. This is difficult to reproduce, as we need to have some objects be freed from the queue before triggering the prefetch of missing objects. There is a new test in t4067 that does trigger the segmentation fault that results in this case. The fix is to set the queue pointer to NULL after it is freed, and then to be careful about NULL values in the prefetch. The more elaborate explanation is that within diffcore_std(), we may skip the initial prefetch due to the output format (--name-only in the test) and go straight to diffcore_skip_stat_unmatch(). In that method, the index entries that have been invalidated by path changes show up as entries but may be deleted because they are not actually content diffs and only newer timestamps than expected. As those entries are deleted, later entries are checked with diff_filespec_check_stat_unmatch(), which uses diff_queued_diff_prefetch() as the missing_object_cb in its diff options. That can trigger downloading missing objects if the appropriate scenario occurs to trigger a call to diff_popoulate_filespec(). It's finally within that callback to diff_queued_diff_prefetch() that the segfault occurs. The test was hard to find because it required some real differences, some not-different files that had a newer modified time, and the order of those files alphabetically was important to trigger the deletion before the prefetch was triggered. I briefly considered a "lock" member for the diff queue, but it was a much larger diff and introduced many more possible error scenarios. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-14Merge branch 'rs/diff-index-find-copies-harder-optim'Junio C Hamano
Halve the memory consumed by artificial filepairs created during "git diff --find-copioes-harder", also making the operation run faster. * rs/diff-index-find-copies-harder-optim: diff-index: don't queue unchanged filepairs with diff_change()
2025-11-30Merge branch 'jc/whitespace-incomplete-line'Junio C Hamano
Both "git apply" and "git diff" learn a new whitespace error class, "incomplete-line". * jc/whitespace-incomplete-line: attr: enable incomplete-line whitespace error for this project diff: highlight and error out on incomplete lines apply: check and fix incomplete lines whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE apply: revamp the parsing of incomplete lines diff: update the way rewrite diff handles incomplete lines diff: call emit_callback ecbdata everywhere diff: refactor output of incomplete line diff: keep track of the type of the last line seen diff: correct suppress_blank_empty hack diff: emit_line_ws_markup() if/else style fix whitespace: correct bit assignment comments
2025-11-30diff-index: don't queue unchanged filepairs with diff_change()René Scharfe
diff_cache() queues unchanged filepairs if the flag find_copies_harder is set, and uses diff_change() for that. This function allocates a filespec for each side, does a few other things that are unnecessary for unchanged filepairs and always sets the diff_flag has_changes, which is simply misleading in this case. Add a new streamlined function for queuing unchanged filepairs and use it in show_modified(), which is called by diff_cache() via oneway_diff() and do_oneway_diff(). It allocates only a single filespec for each filepair and uses it twice with reference counting. This has a measurable effect if there are a lot of them, like in the Linux repo: Benchmark 1: ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder Time (mean ± σ): 31.8 ms ± 0.2 ms [User: 24.2 ms, System: 6.3 ms] Range (min … max): 31.5 ms … 32.3 ms 85 runs Benchmark 2: ./git -C ../linux diff --cached --find-copies-harder Time (mean ± σ): 23.9 ms ± 0.2 ms [User: 18.1 ms, System: 4.6 ms] Range (min … max): 23.5 ms … 24.4 ms 111 runs Summary ./git -C ../linux diff --cached --find-copies-harder ran 1.33 ± 0.01 times faster than ./git_v2.52.0 -C ../linux diff --cached --find-copies-harder Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26Merge branch 'ad/blame-diff-algorithm'Junio C Hamano
"git blame" learns "--diff-algorithm=<algo>" option. * ad/blame-diff-algorithm: blame: make diff algorithm configurable xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-21Merge branch 'rs/diff-quiet-no-rename'Junio C Hamano
As "git diff --quiet" only cares about the existence of any changes, disable rename/copy detection to skip more expensive processing whose result will be discarded anyway. * rs/diff-quiet-no-rename: diff: disable rename detection with --quiet
2025-11-17xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASKAntonin Delpeuch
The XDF_DIFF_ALGORITHM_MASK bit mask only includes bits for the patience and histogram diffs, not for the minimal one. This means that when reseting the diff algorithm to the default one, one needs to separately clear the bit for the minimal diff. There are places in the code that fail to do that: merge-ort.c and builtin/merge-file.c. Add the XDF_NEED_MINIMAL bit to the bit mask, and remove the separate clearing of this bit in the places where it hasn't been forgotten. Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: highlight and error out on incomplete linesJunio C Hamano
Teach "git diff" to highlight "\ No newline at end of file" message as a whitespace error when incomplete-line whitespace error class is in effect. Thanks to the previous refactoring of complete rewrite code path, we can do this at a single place. Unlike whitespace errors in the payload where we need to annotate in line, possibly using colors, the line that has whitespace problems, we have a dedicated line already that can serve as the error message, so paint it as a whitespace error message. Also teach "git diff --check" to notice incomplete lines as whitespace errors and report when incomplete-line whitespace error class is in effect. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12whitespace: allocate a few more bits and define WS_INCOMPLETE_LINEJunio C Hamano
Reserve a few more bits in the diff flags word to be used for future whitespace rules. Add WS_INCOMPLETE_LINE without implementing the behaviour (yet). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: update the way rewrite diff handles incomplete linesJunio C Hamano
The diff_symbol based output framework uses one DIFF_SYMBOL_* enum value per the kind of output lines of "git diff", which corresponds to one output line from the xdiff machinery used internally. Most notably, DIFF_SYMBOL_PLUS and DIFF_SYMBOL_MINUS that correspond to "+" and "-" lines are designed to always take a complete line, even if the output from xdiff machinery may produce "\ No newline at the end of file" immediately after them. But this is not true in the rewrite-diff codepath, which completely bypasses the xdiff machinery. Since the code path feeds the bytes directly from the payload to the output routines, the output layer has to deal with an incomplete line with DIFF_SYMBOL_PLUS and DIFF_SYMBOL_MINUS, which never would see an incomplete line in the normal code paths. This lack of final newline is compensated by an ugly hack for a fabricated DIFF_SYMBOL_NO_LF_EOF token to inject an extra newline to the output to simulate output coming from the xdiff machinery. Revamp the way the complete-rewrite code path feeds the lines to the output layer by treating the last line of the pre/post image when it is an incomplete line specially. This lets us remove the DIFF_SYMBOL_NO_LF_EOF hack and use the usual DIFF_SYMBOL_CONTEXT_INCOMPLETE code path, which will later learn how to handle whitespace errors. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: call emit_callback ecbdata everywhereJunio C Hamano
Everybody else, except for emit_rewrite_lines(), calls the emit_callback data ecbdata. Make sure we call the same thing by the same name for consistency. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: refactor output of incomplete lineJunio C Hamano
Create a helper function that reacts to "\ No newline at the end of file" in preparation for unifying the incomplete line handling in the code path that handles xdiff output and the code path that bypasses xdiff and produces a complete-rewrite patch. Currently the output from the DIFF_SYMBOL_CONTEXT_INCOMPLETE case still (ab)uses the same code as what is used for context lines, but that would change in a later step where we introduce support to treat an incomplete line as a whitespace error. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: keep track of the type of the last line seenJunio C Hamano
The "\ No newline at the end of the file" can come after any of the "-" (deleted preimage line), " " (unchanged line), or "+" (added postimage line). In later steps in this series, we will start treating a change that makes a file to end in an incomplete line as a whitespace error, and we would need to know what the previous line was when we react to "\ No newline" in the diff output. If the previous line was a context (i.e., unchanged) line, the file lacked the final newline before the change, and the change did not touch that line and left it still incomplete, so we do not want to warn in such a case. Teach fn_out_consume() function to keep track of what the previous line was, and prepare an otherwise empty switch statement to let us react differently to "\ No newline" based on that. Note that there is an existing curiosity (read: likely to be a bug) in the code that increments line number in the preimage file every time it sees a line with "\ No newline" on it, regardless of what the previous line was. I left it as-is, because it does not affect the main theme of this series, and more importantly, I do not think it matters, as these numbers are used only to compare them with blank_at_eof_in_{pre,post}image to issue a warning when we see more empty line was added at the end, but by definition, after we see "\ No newline at the end of the file" for an added line, we will not see an added line for the file. An independent audit to ensure that this curious increment can be safely removed would make a good #leftoverbits clean-up (we may even find some code that decrements this counter or over-increments the other quantity this counter is compared with that compensates the effect of this curious increment that hides a bug, in which case we may also need to remove them). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: correct suppress_blank_empty hackJunio C Hamano
The suppress-blank-empty feature abused the CONTEXT_INCOMPLETE symbol that was meant to be used only for "\ No newline at the end of file" code path. The intent of the feature was to turn a context line we receive from xdiff machinery (which always uses ' ' for context lines, even an empty one) and spit it out as a truly empty line. Perform such a conversion very locally at where a line from xdiff that begins with ' ' is handled for output; there are many checks before the control reaches such place that checks the first letter of the diff output line to see if it is a context line, and having to check for '\n' and treat it as a special case is error prone. In order to catch similar hacks in the future, make sure the code path that is meant for "\ No newline" case checks the first byte is indeed a backslash. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12diff: emit_line_ws_markup() if/else style fixJunio C Hamano
Apply the simple rule: if you need {} in one arm of the if/else if/else... cascade, have {} in all of them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12whitespace: correct bit assignment commentsJunio C Hamano
A comment in diff.c claimed that bits up to 12th (counting from 0th) are whitespace rules, and 13th thru 15th are for new/old/context, but it turns out it was miscounting. Correct them, and clarify where the whitespace rule bits come from in the comment. Extend bit assignment comments to cover bits used for color-moved, which weren't described. Also update the way these bit constants are defined to use (1 << N) notation, instead of octal constants, as it tends to make it easier to notice a breakage like this. Sprinkle a few blank lines between logically distinct groups of CPP macro definitions to make them easier to read. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-10diff: disable rename detection with --quietRené Scharfe
Detecting renames and copies improves diff's output. This effort is wasted if we don't show any. Disable detection in that case. This actually fixes the error code when using the options --cached, --find-copies-harder, --no-ext-diff and --quiet together: run_diff_index() indirectly calls diff-lib.c::show_modified(), which queues even non-modified entries using diff_change() because we need them for copy detection. diff_change() sets flags.has_changes, though, which causes diff_can_quit_early() to declare we're done after seeing only the very first entry -- way too soon. Using --cached, --find-copies-harder and --quiet together without --no-ext-diff was not affected even before, as it causes the flag flags.diff_from_contents to be set, which disables the optimization in a different way. Reported-by: D. Ben Knoble <ben.knoble@gmail.com> Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24diff: simplify run_external_diff() quiet logicJeff King
We'd sometimes end up in run_external_diff() to do a dry-run diff (e.g., to find content-level changes for --quiet). We recognize this quiet mode by seeing the lack of DIFF_FORMAT_PATCH in the output format. But since introducing an explicit dry-run check via 3ed5d8bd73 (diff: stop output garbled message in dry run mode, 2025-10-20), this logic can never trigger. We can only get to this function by calling diff_flush_patch(), and that comes from only two places: 1. A dry-run flush comes from diff_flush_patch_quietly(), which is always in dry-run mode (so the other half of our "||" is true anyway). 2. A regular flush comes from diff_flush_patch_all_file_pairs(), which is only called when output_format has DIFF_FORMAT_PATCH in it. So we can simplify our "quiet" condition to just checking dry-run mode (which used to be a specific flag, but recently became just a NULL "file" pointer). And since it's so simple, we can just do that inline. This makes the logic about o->file more obvious, since we handle the NULL and non-stdout cases next to each other. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24diff: drop dry-run redirection to /dev/nullJeff King
As an added protection against dry-run diffs accidentally producing output, we redirect diff_options.file to /dev/null. But as of the previous patch, this now does nothing, since dry-run diffs are implemented by setting "file" to NULL. So we can drop this extra code with no change in behavior. This is effectively a revert of 623f7af284 (diff: restore redirection to /dev/null for diff_from_contents, 2025-10-17) and 3da4413dbc (diff: make sure the other caller of diff_flush_patch_quietly() is silent, 2025-10-22), but: 1. We get a conflict because we already dropped the color_moved handling in an earlier patch. But we just resolve the conflicts to "theirs" (removing all of the code). 2. We retain the test from 623f7af284. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24diff: replace diff_options.dry_run flag with NULL fileJeff King
We introduced a dry_run flag to diff_options in b55e6d36eb (diff: ensure consistent diff behavior with ignore options, 2025-08-08), with the idea that the lower-level diff code could skip output when it is set. As we saw with the bugs fixed by 3ed5d8bd73 (diff: stop output garbled message in dry run mode, 2025-10-20), it is easy to miss spots. In the end, we located all of them by checking where diff_options.file is used. That suggests another possible approach: we can replace the dry_run boolean with a NULL pointer for "file", as we know that using "file" in dry_run mode would always be an error. This turns any missed spots from producing extra output[1] into a segfault. Which is less forgiving, but that is the point: this is indicative of a programming error, and complaining loudly and immediately is good. [1] We protect ourselves against garbled output as a separate step, courtesy of 623f7af284 (diff: restore redirection to /dev/null for diff_from_contents, 2025-10-17). So in that sense this patch can only introduce user-visible errors (since any "bugs" were going to /dev/null before), but the idea is to catch them rather than quietly send garbage to /dev/null. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24diff: drop save/restore of color_moved in dry-run modeJeff King
When running a dry-run content-level diff to check whether a "--quiet" diff has any changes, we have always unset the color_moved variable since the feature was added in 2e2d5ac184 (diff.c: color moved lines differently, 2017-06-30). The reasoning is not given explicitly there, but presumably the idea is that since color_moved requires a lot of extra computation to match lines but does not actually affect the found_changes flag, we want to skip it. Later, in 3da4413dbc (diff: make sure the other caller of diff_flush_patch_quietly() is silent, 2025-10-22) we copied the same idea for other dry-run diffs. But neither spot actually needs to reset this flag at all, because diff_flush_patch() will not ever compute color_moved. Nor could it, as it is only looking at a single file-pair, and we detect moves across files. So color_moved is checked only when we are actually doing real DIFF_FORMAT_PATCH output, and call diff_flush_patch_all_file_pairs(). So we can get rid of these extra lines to save and restore the color_moved flag without changing the behavior at all. (Note that there is no "restore" to drop for the second caller, as we know at that point we are not generating any output and can just leave the feature disabled). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24diff: send external diff output to diff_options.fileJeff King
Diff output usually goes to the process stdout, but it can be redirected with the "--output" option. We store this in the "file" pointer of diff_options, and all of the diff code should write there instead of to stdout. But there's one spot we missed: running an external diff cmd. We don't redirect its output at all, so it just defaults to the stdout of the parent process. We should instead point its stdout at our output file. There are a few caveats to watch out for when doing so: - The stdout field takes a descriptor, not a FILE pointer. We can pull out the descriptor with fileno(). - The run-command API always closes the stdout descriptor we pass to it. So we must duplicate it (otherwise we break the FILE pointer, since it now points to a closed descriptor). - We don't need to worry about closing our dup'd descriptor, since the point is that run-command will do it for us (even in the case of an error). But we do need to make sure we skip the dup() if we set no_stdout (because then run-command will not look at it at all). - When the output is going to stdout, it would not be wrong to dup() the descriptor, but we don't need to. We can skip that extra work with a simple pointer comparison. - It seems like you'd need to fflush() the descriptor before handing off a copy to the child process to prevent out-of-order writes. But that was true even before this patch! It works because run-command always calls fflush(NULL) before running the child. The new test shows the breakage (and fix). The need for duplicating the descriptor doesn't need a new test; that is covered by the later test "GIT_EXTERNAL_DIFF with more than one changed files". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-24Merge branch 'jc/diff-from-contents-fix'Junio C Hamano
The code to squelch output from "git diff -w --name-status" etc. for paths that "git diff -w -p" would have stayed silent leaked output from dry-run patch generation, which has been corrected. * jc/diff-from-contents-fix: diff: make sure the other caller of diff_flush_patch_quietly() is silent
2025-10-24Merge branch 'jk/diff-from-contents-fix'Junio C Hamano
Recently we attempted to improve "git diff -w" and friends to handle cases where patch output would be suppressed, but it introduced a bug that emits unnecessary output, which has been corrected. * jk/diff-from-contents-fix: diff: restore redirection to /dev/null for diff_from_contents
2025-10-23diff: stop output garbled message in dry run modeLidong Yan
Earlier, b55e6d36 (diff: ensure consistent diff behavior with ignore options, 2025-08-08) introduced "dry-run" mode to the diff machinery so that content-based diff filtering (like ignoring space changes or those that match -I<regex>) can first try to produce a patch without emitting any output to see if under the given diff filtering condition we would get any output lines, and a new helper function diff_flush_patch_quietly() was introduced to use the mode to see an individual filepair needs to be shown. However, the solution was not complete. When files are deleted, file modes change, or there are unmerged entries in the index, dry-run mode still produces output because we overlooked these conditions, and as a result, dry-run mode was not quiet. To fix this, return early in emit_diff_symbol_from_struct() if we are in dry-run mode. This function will be called by all the emit functions to output the results. Returning early can avoid diff output when files are deleted or file modes are changed. Stop print message in dry-run mode if we have unmerged entries in index. Discard output of external diff tool in dry-run mode. Signed-off-by: Lidong Yan <yldhome2d2@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-23Merge branch 'jc/diff-from-contents-fix' into ↵Junio C Hamano
ly/diff-name-only-with-diff-from-content * jc/diff-from-contents-fix: diff: make sure the other caller of diff_flush_patch_quietly() is silent
2025-10-23diff: make sure the other caller of diff_flush_patch_quietly() is silentJunio C Hamano
Earlier, we added is a protection for the loop that computes "git diff --quiet -w" to ensure calls to the diff_flush_patch_quietly() helper stays quiet. Do the same for another loop that deals with options like "--name-status" to make calls to the same helper. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-22Merge branch 'jk/diff-from-contents-fix' into ↵Junio C Hamano
ly/diff-name-only-with-diff-from-content * jk/diff-from-contents-fix: diff: restore redirection to /dev/null for diff_from_contents
2025-10-17diff: restore redirection to /dev/null for diff_from_contentsJeff King
In --quiet mode, since we produce only an exit code for "something was changed" and no actual output, we can often get by with just a tree-level diff. However, certain options require us to actually look at the file contents (e.g., if we are ignoring whitespace changes). We have a flag "diff_from_contents" for that, and if it is set we call diff_flush() on each path. To avoid producing any output (since we were asked to be --quiet), we traditionally just redirected the output to /dev/null. That changed in b55e6d36eb (diff: ensure consistent diff behavior with ignore options, 2025-08-08), which replaced that with a "dry_run" flag. In theory, with dry_run set, we should produce no output. But it carries a risk of regression: if we forget to respect dry_run in any of the output paths, we'll accidentally produce output. And indeed, there is at least one such regression in that commit, as it covered only the case where we actually call into xdiff, and not creation or deletion diffs, where we manually generate the headers. We even test this case in t4035, but only with diff-tree, which does not show the bug by default because it does not require diff_from_contents. But git-diff does, because it allows external diff programs by default (so we must dig into each diff filepair to decide if it requires running an external diff that may declare two distinct blobs to actually be the same). We should fix all of those code paths to respect dry_run correctly, but in the meantime we can protect ourselves more fully by restoring the redirection to /dev/null. This gives us an extra layer of protection against regressions dues to other code paths we've missed. Though the original issue was reported with "git diff" (and due to its default of --ext-diff), I've used "diff-tree -w" in the new test. It triggers the same issue, but I think the fact that "-w" implies diff_from_contents is a bit more obvious, and fits in with the rest of t4035. Reported-by: Jake Zimmerman <jake@zimmerman.io> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-29Merge branch 'jk/color-variable-fixes'Junio C Hamano
Some places in the code confused a variable that is *not* a boolean to enable color but is an enum that records what the user requested to do about color. A couple of bugs of this sort have been fixed, while the code has been cleaned up to prevent similar bugs in the future. * jk/color-variable-fixes: config: store want_color() result in a separate bool add-interactive: retain colorbool values longer color: return bool from want_color() color: use git_colorbool enum type to store colorbools pretty: use format_commit_context.auto_color as colorbool diff: stop passing ecbdata->use_color as boolean diff: pass o->use_color directly to fill_metainfo() diff: don't use diff_options.use_color as a strict bool diff: simplify color_moved check when flushing grep: don't treat grep_opt.color as a strict bool color: return enum from git_config_colorbool() color: use GIT_COLOR_* instead of numeric constants
2025-09-16color: use git_colorbool enum type to store colorboolsJeff King
We traditionally used "int" to store and pass around the values defined by "enum git_colorbool" (which were originally just #define macros). Using an int doesn't produce incorrect results, but using the actual enum makes the intent of the code more clear. It would be nice if the compiler could catch cases where we used the enum and an int interchangeably, since it's very easy to accidentally check the boolean true/false of a colorbool like: if (branch_use_color) This is wrong because GIT_COLOR_UNKNOWN and GIT_COLOR_AUTO evaluate to true in C, even though we may ultimately decide not to use color. But C is pretty happy to convert between ints and enums (even with various -Wenum-* warnings). So this sadly doesn't protect us from such mistakes, but it hopefully does make the code easier to read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-16diff: stop passing ecbdata->use_color as booleanJeff King
In emit_hunk_header(), we evaluate ecbdata->color_diff both as a git_colorbool, passing it to diff_get_color(): const char *reset = diff_get_color(ecbdata->color_diff, DIFF_RESET); and as a strict boolean: const char *reverse = ecbdata->color_diff ? GIT_COLOR_REVERSE : ""; At first glance this seems wrong. Usually we store the color decision as a git_colorbool, so the second line would get confused by GIT_COLOR_AUTO (which is boolean true, but may still mean we do not produce color). However, the second line is correct because our caller sets color_diff using want_color(), which collapses the colorbool to a strict true/false boolean. The first line is _also_ correct because of the idempotence of want_color(). Even though diff_get_color() will pass our true/false value through want_color() again, the result will be left untouched. But let's pass through the colorbool itself, which makes it more consistent with the rest of the diff code. We'll need to then call want_color() whenever we treat it as a boolean, but there is only such spot (the one quoted above). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-16diff: pass o->use_color directly to fill_metainfo()Jeff King
We pass the use_color parameter of fill_metainfo() as a strict boolean, using: want_color(o->use_color) && !pgm to derive its value. But then inside the function, we pass it to diff_get_color(), which expects one of the git_colorbool enum values, and so feeds it to want_color() again. Even though want_color() produces a strict 0/1 boolean, this doesn't produce wrong results because want_color() is idempotent. Since GIT_COLOR_ALWAYS and NEVER are defined as 1 and 0, and because want_color() passes through those values, evaluating "want_color(foo)" and "want_color(want_color(foo))" will return the same result. But as part of a longer strategy to align the types we use for storing these values, let's pass through the colorbool directly. To handle the "&&" case here, we'll convert the presence of "pgm" into "NEVER", which arguably makes the intent of the code more clear anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-16diff: don't use diff_options.use_color as a strict boolJeff King
We disable --color-moved if color is not in use at all. This happens in diff_setup_done(), where we set options->color_moved to 0 if options->use_color is not true. But a strict boolean check here is not correct; use_color could be GIT_COLOR_UNKNOWN or GIT_COLOR_AUTO, both of which evaluate to true, even though we may later decide not to show colors. We should be using want_color() to convert that git_colorbool into a true boolean. As it turns out, this does not produce wrong output. Even though we go to the trouble to detect the moved lines, ultimately we get the color values from diff_get_color(), which does check want_color(). And so it returns the empty string for each color, and we "color" the result with nothing. So the output is correct, but there is a small but measurable performance cost to doing the line detection. E.g., in git.git before and after this patch (there are no colors shown because hyperfine redirects output to /dev/null): Benchmark 1: ./git.old log --no-merges -p --color-moved -1000 Time (mean ± σ): 1.019 s ± 0.013 s [User: 0.955 s, System: 0.064 s] Range (min … max): 1.005 s … 1.045 s 10 runs Benchmark 2: ./git.new log --no-merges -p --color-moved -1000 Time (mean ± σ): 982.9 ms ± 14.5 ms [User: 925.8 ms, System: 57.1 ms] Range (min … max): 965.1 ms … 1003.2 ms 10 runs Summary ./git.new log --no-merges -p --color-moved -1000 ran 1.04 ± 0.02 times faster than ./git.old log --no-merges -p --color-moved -1000 Note that the fix is not quite as simple as just calling want_color() from diff_setup_done(). There's a subtle timing issue that goes back to daa0c3d971 (color: delay auto-color decision until point of use, 2011-08-17), the commit that adds want_color() in the first place. As discussed there, we must delay evaluating the colorbool value until all pager setup is complete. So instead, we'll leave the "color_moved" field intact in diff_setup_done(), and modify the point where it is evaluated. Fortunately there is only one such spot that controls whether we run any of the color-moved code at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-16diff: simplify color_moved check when flushingJeff King
In diff_flush_patch_all_file_pairs(), we set o->emitted_symbols if and only if o->color_moved is true. That causes the lower-level routines to fill up o->emitted_symbols, which we then analyze in order to do the actual colorizing. But in that final step, we do: if (o->emitted_symbols) { if (o->color_moved) { ...actual coloring... } ...clean up of emitted_symbols... } The inner "if" will always trigger, since we set emitted_symbols only when doing color_moved (it is a little confusing that it is set inside the diff_options struct, but that is for convenience of passing it to the lower-level routines; we always clear it at the end of flushing, since 48edf3a02a (diff: clear emitted_symbols flag after use, 2019-01-24)). Let's simplify the code a bit by just dropping the inner "if" and running its block unconditionally. In theory the current code might be useful if another feature besides color_moved setup and used emitted_symbols, but it would be easy to refactor later to handle that. And in the meantime, this makes further work in this area easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-16color: use GIT_COLOR_* instead of numeric constantsJeff King
Long ago Git's decision to show color for a subsytem was stored in a tri-state variable: it could be true (1), false (0), or unknown (-1). But since daa0c3d971 (color: delay auto-color decision until point of use, 2011-08-17) we want to carry around a new state, "auto", which bases the decision on the tty-ness of stdout (rather than collapsing that "auto" state to a true/false immediately). That commit introduced a set of GIT_COLOR_* defines to represent each state: UNKNOWN, ALWAYS, NEVER, and AUTO. But it only used the AUTO value, and left alone code using bare 0/1/-1 values. And of course since then we've grown many new spots that use those bare values. Let's switch all of these to use the named constants. That should make the code a bit easier to read, as it is more obvious that we're representing a color decision. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>