<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/grep.c, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2022-01-10T19:52:54Z</updated>
<entry>
<title>Merge branch 'lh/use-gnu-color-in-grep'</title>
<updated>2022-01-10T19:52:54Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-01-10T19:52:54Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=c0450ca09864baae1cd80b746500aaef2eeda956'/>
<id>urn:sha1:c0450ca09864baae1cd80b746500aaef2eeda956</id>
<content type='text'>
The color palette used by "git grep" has been updated to match that
of GNU grep.

* lh/use-gnu-color-in-grep:
  grep: align default colors with GNU grep ones
</content>
</entry>
<entry>
<title>Merge branch 'rs/pcre2-utf'</title>
<updated>2022-01-05T22:01:31Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-01-05T22:01:31Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=c91b0b7c7270dac8c539062d193749e654b7f002'/>
<id>urn:sha1:c91b0b7c7270dac8c539062d193749e654b7f002</id>
<content type='text'>
"git grep --perl-regexp" failed to match UTF-8 characters with
a wildcard when the pattern consists only of ASCII letters, which has
been corrected.

* rs/pcre2-utf:
  grep/pcre2: factor out literal variable
  grep/pcre2: use PCRE2_UTF even with ASCII patterns
</content>
</entry>
<entry>
<title>grep: align default colors with GNU grep ones</title>
<updated>2022-01-05T20:42:54Z</updated>
<author>
<name>Lénaïc Huard</name>
<email>lenaic@lhuard.fr</email>
</author>
<published>2022-01-05T08:18:35Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=b83f99c3990cc9080ec8f5459555069d47e4cdfb'/>
<id>urn:sha1:b83f99c3990cc9080ec8f5459555069d47e4cdfb</id>
<content type='text'>
git-grep shares a lot of options with the standard grep tool.
Like GNU grep, it has coloring options to highlight the matching text,
as well as options to customize the various colored parts.

This patch updates the default git-grep colors to make them match the
GNU grep default ones [1].

It was possible to get the same result by setting the various `color.grep.&lt;slot&gt;`
options, but this patch makes `git grep --color` share the same color scheme as
`grep --color` by default without any user configuration.

[1] https://www.man7.org/linux/man-pages/man1/grep.1.html#ENVIRONMENT

Signed-off-by: Lénaïc Huard &lt;lenaic@lhuard.fr&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep/pcre2: factor out literal variable</title>
<updated>2021-12-20T20:46:39Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-12-18T19:53:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=32e3e8bc551e7b10bbda07110ae7cb15442d0392'/>
<id>urn:sha1:32e3e8bc551e7b10bbda07110ae7cb15442d0392</id>
<content type='text'>
Patterns that contain no wildcards and don't have to be case-folded are
literal.  Give this condition a name to increase the readability of the
boolean expression for enabling the option PCRE2_UTF.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep/pcre2: use PCRE2_UTF even with ASCII patterns</title>
<updated>2021-12-20T20:45:02Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-12-18T19:50:02Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=dc2c44fbb100fa609174d9069a70e2b54b0591ca'/>
<id>urn:sha1:dc2c44fbb100fa609174d9069a70e2b54b0591ca</id>
<content type='text'>
compile_pcre2_pattern() currently uses the option PCRE2_UTF only for
patterns with non-ASCII characters.  Patterns with ASCII wildcards can
match non-ASCII strings, though.  Without that option, PCRE2 mishandles
UTF-8 input: it matches parts of multi-byte characters.  Fix that by
using PCRE2_UTF even for ASCII-only patterns.

This is a remake of the reverted ae39ba431a (grep/pcre2: fix an edge
case concerning ascii patterns and UTF-8 data, 2021-10-15).  The change
to the condition and the test are simplified and more targeted.

Original-patch-by: Hamza Mahfooz &lt;someguy@effective-light.com&gt;
Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>log: let --invert-grep only invert --grep</title>
<updated>2021-12-17T22:13:08Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-12-17T16:48:49Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=794c000267b7bd29024b56e282509a82b31e6fc8'/>
<id>urn:sha1:794c000267b7bd29024b56e282509a82b31e6fc8</id>
<content type='text'>
The option --invert-grep is documented to filter out commits whose
messages match the --grep filters.  However, it also affects the
header matches (--author, --committer), which is not intended.

Move the handling of that option to grep.c, as only the code there can
distinguish matches in the header from those in the message body.  If
--invert-grep is given, enable extended expressions (not the regex
type; we just need git grep's --not to work), negate the body patterns
and check if any of them match by piggy-backing on the collect_hits
mechanism of grep_source_1().

Collecting the matches in struct grep_opt is a bit iffy, but with
"last_shown" we have a precedent for writing state information to that
struct.

Reported-by: Dotan Cohen &lt;dotancohen@gmail.com&gt;
Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"</title>
<updated>2021-11-19T17:10:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-11-19T17:06:36Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e7f3925bed86edf1b79fd18e5600252e445019d1'/>
<id>urn:sha1:e7f3925bed86edf1b79fd18e5600252e445019d1</id>
<content type='text'>
This reverts commit ae39ba431ab861548eb60b4bd2e1d8b8813db76f, as it
breaks "grep" when looking for a string in a non-UTF-8 haystack when
linked with certain versions of the PCREv2 library.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data</title>
<updated>2021-10-15T19:45:39Z</updated>
<author>
<name>Hamza Mahfooz</name>
<email>someguy@effective-light.com</email>
</author>
<published>2021-10-15T16:13:56Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ae39ba431ab861548eb60b4bd2e1d8b8813db76f'/>
<id>urn:sha1:ae39ba431ab861548eb60b4bd2e1d8b8813db76f</id>
<content type='text'>
If we attempt to grep non-ASCII log message text with an ASCII pattern, we
run into the following issue:

    $ git log --color --author='.var.*Bjar' -1 origin/master | grep ^Author
    grep: (standard input): binary file matches

So, to fix this, teach the grep code to use PCRE2_UTF, as long as the log
output is encoded in UTF-8.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Hamza Mahfooz &lt;someguy@effective-light.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: refactor next_match() and match_one_pattern() for external use</title>
<updated>2021-09-29T20:23:11Z</updated>
<author>
<name>Hamza Mahfooz</name>
<email>someguy@effective-light.com</email>
</author>
<published>2021-09-29T11:57:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=3f566c4e695a6df8237c34b7c1f34f0832b7e575'/>
<id>urn:sha1:3f566c4e695a6df8237c34b7c1f34f0832b7e575</id>
<content type='text'>
These changes are made in preparation for the colorization support for the
"git log" subcommands that rely on regex functionality (i.e. "--author",
"--committer" and "--grep"). They are necessary primarily for two reasons.
First, match_one_pattern() expects header lines to be prefixed; in pretty,
however, the prefixes are stripped from the lines because the name-email
pairs need to go through additional parsing before they can be printed.
Second, next_match() doesn't handle the case of
"ctx == GREP_CONTEXT_HEAD" at all. So, teach next_match() how to handle the
new case and move match_one_pattern()'s core logic to
headerless_match_one_pattern() while preserving match_one_pattern()'s uses
that depend on the additional processing.

Signed-off-by: Hamza Mahfooz &lt;someguy@effective-light.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>grep: store grep_source buffer as const</title>
<updated>2021-09-22T18:59:50Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-09-21T03:51:28Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=1e66871608d1f6f4cd66e899ee33755bbf6deafa'/>
<id>urn:sha1:1e66871608d1f6f4cd66e899ee33755bbf6deafa</id>
<content type='text'>
Our grep_buffer() function takes a non-const buffer, which is confusing:
we don't take ownership of nor write to the buffer.

This mostly comes from the fact that the underlying grep_source struct
in which we store the buffer uses a non-const pointer. The memory pointed
to by the struct is sometimes owned by us (for FILE or OID sources), and
sometimes not (for BUF sources).

Let's store it as const, which lets us err on the side of caution (i.e.,
the compiler will warn us if any of our code writes to or tries to free
it).

As a result, we must annotate the one place where we do free it by
casting away the constness. But that's a small price to pay for the
extra safety and clarity elsewhere (and indeed, it already had a comment
explaining why GREP_SOURCE_BUF _didn't_ free it).

And then we can mark grep_buffer() as taking a const buffer.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
