<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit.c, branch v2.41.2</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=v2.41.2</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=v2.41.2'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2023-05-09T23:45:46Z</updated>
<entry>
<title>Merge branch 'en/header-split-cache-h-part-2'</title>
<updated>2023-05-09T23:45:46Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-05-09T23:45:45Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ccd12a3d6cc62f51b746654ae56e26d92f89ba92'/>
<id>urn:sha1:ccd12a3d6cc62f51b746654ae56e26d92f89ba92</id>
<content type='text'>
More header clean-up.

* en/header-split-cache-h-part-2: (22 commits)
  reftable: ensure git-compat-util.h is the first (indirect) include
  diff.h: reduce unnecessary includes
  object-store.h: reduce unnecessary includes
  commit.h: reduce unnecessary includes
  fsmonitor: reduce includes of cache.h
  cache.h: remove unnecessary headers
  treewide: remove cache.h inclusion due to previous changes
  cache,tree: move basic name compare functions from read-cache to tree
  cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c
  hash-ll.h: split out of hash.h to remove dependency on repository.h
  tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h
  dir.h: move DTYPE defines from cache.h
  versioncmp.h: move declarations for versioncmp.c functions from cache.h
  ws.h: move declarations for ws.c functions from cache.h
  match-trees.h: move declarations for match-trees.c functions from cache.h
  pkt-line.h: move declarations for pkt-line.c functions from cache.h
  base85.h: move declarations for base85.c functions from cache.h
  copy.h: move declarations for copy.c functions from cache.h
  server-info.h: move declarations for server-info.c functions from cache.h
  packfile.h: move pack_window and pack_entry from cache.h
  ...
</content>
</entry>
<entry>
<title>Merge branch 'jk/parse-commit-with-malformed-ident'</title>
<updated>2023-05-09T23:45:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-05-09T23:45:44Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=620e92b8454d2569b9ad9a2070fd2edea99895cc'/>
<id>urn:sha1:620e92b8454d2569b9ad9a2070fd2edea99895cc</id>
<content type='text'>
The commit object parser has been taught to be a bit more lenient
to parse timestamps on the author/committer line with a malformed
author/committer ident.

* jk/parse-commit-with-malformed-ident:
  parse_commit(): describe more date-parsing failure modes
  parse_commit(): handle broken whitespace-only timestamp
  parse_commit(): parse timestamp from end of line
  t4212: avoid putting git on left-hand side of pipe
</content>
</entry>
<entry>
<title>parse_commit(): describe more date-parsing failure modes</title>
<updated>2023-04-27T16:31:46Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-04-27T08:17:24Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=90ef0f14eb1410747885806d8e55725053572654'/>
<id>urn:sha1:90ef0f14eb1410747885806d8e55725053572654</id>
<content type='text'>
The previous few commits improved the parsing of dates in malformed
commit objects. But there's one big case left implicit: we may still
feed garbage to parse_timestamp(). This is preferable to trying to be
more strict, but let's document the thinking in a comment.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>parse_commit(): handle broken whitespace-only timestamp</title>
<updated>2023-04-27T15:53:53Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-04-27T08:17:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=089d9adff6408b8f3406e2f46179501337715ae8'/>
<id>urn:sha1:089d9adff6408b8f3406e2f46179501337715ae8</id>
<content type='text'>
The comment in parse_commit_date() claims that parse_timestamp() will
not walk past the end of the buffer we've been given, since it will hit
the newline at "eol" and stop. This is usually true, when dateptr
contains actual numbers to parse. But with a line like:

   committer name &lt;email&gt;   \n

with just whitespace, and no numbers, parse_timestamp() will consume
that newline as part of the leading whitespace, and we may walk past our
"tail" pointer (which itself is set from the "size" parameter passed in
to parse_commit_buffer()).

In practice this can't cause us to walk off the end of an array, because
we always add an extra NUL byte to the end of objects we load from disk
(as a defense against exactly this kind of bug). However, you can see
the behavior in action when "committer" is the final header (which it
usually is, unless there's an encoding) and the subject line can be
parsed as an integer. We walk right past the newline on the committer
line, as well as the "\n\n" separator, and mistake the subject for the
timestamp.

We can solve this by trimming the whitespace ourselves, making sure that
it has some non-whitespace to parse. Note that we need to be a bit
careful about the definition of "whitespace" here, as our isspace()
doesn't match exotic characters like vertical tab or formfeed. We can
work around that by checking for an actual number (see the in-code
comment). This is slightly more restrictive than the current code, but
in practice the results are either the same (we reject "foo" as "0", but
so would parse_timestamp()) or extremely unlikely even for broken
commits (parse_timestamp() would allow "\v123" as "123", but we'll now
make it "0").

I did also allow "-" here, which may be controversial, as we don't
currently support negative timestamps. My reasoning was two-fold. One,
the design of parse_timestamp() is such that we should be able to easily
switch it to handling signed values, and this otherwise creates a
hard-to-find gotcha that anybody doing that work would get tripped up
on. And two, the status quo is that we currently parse them, though the
result of course ends up as a very large unsigned value (which is likely
to just get clamped to "0" for display anyway, since our date routines
can't handle it).

The new test checks the commit parser (via "--until") for both vanilla
spaces and the vertical-tab case. I also added a test to check these
against the pretty-print formatter, which uses split_ident_line().  It's
not subject to the same bug, because it already insists that there be
one or more digits in the timestamp.

Helped-by: Phillip Wood &lt;phillip.wood123@gmail.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>parse_commit(): parse timestamp from end of line</title>
<updated>2023-04-27T15:53:35Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2023-04-27T08:14:09Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ea1615dfdd70fcc49f8567fd6abdf5fcda2fbc0d'/>
<id>urn:sha1:ea1615dfdd70fcc49f8567fd6abdf5fcda2fbc0d</id>
<content type='text'>
To find the committer timestamp, we parse left-to-right looking for the
closing "&gt;" of the email, and then expect the timestamp right after
that. But we've seen some broken cases in the wild where this fails, but
we _could_ find the timestamp with a little extra work. E.g.:

  Name &lt;Name&lt;email&gt;&gt; 123456789 -0500

This means that features that rely on the committer timestamp, like
--since or --until, will treat the commit as happening at time 0 (i.e.,
1970).

This is doubly confusing because the pretty-print parser learned to
handle these in 03818a4a94 (split_ident: parse timestamp from end of
line, 2013-10-14). So printing them via "git show", etc, makes
everything look normal, but --until, etc are still broken (despite the
fact that that commit explicitly mentioned --until!).

So let's use the same trick as 03818a4a94: find the end of the line, and
parse back to the final "&gt;". In theory we could use split_ident_line()
here, but it's actually a bit more strict. In particular, it requires a
valid time-zone token, too. That should be present, of course, but we
wouldn't want to break --until for cases that are working currently.

We might want to teach split_ident_line() to become more lenient there,
but it would require checking its many callers (since right now they can
assume that if date_start is non-NULL, so is tz_start).

So for now we'll just reimplement the same trick in the commit parser.

The test is in t4212, which already covers similar cases, courtesy of
03818a4a94. We'll just adjust the broken commit to munge both the author
and committer timestamps. Note that we could match (author|committer)
here, but alternation can't be used portably in sed. Since we wouldn't
expect to see "&gt;" except as part of an ident line, we can just match
that character on any line.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit.h: reduce unnecessary includes</title>
<updated>2023-04-24T19:47:33Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-22T20:17:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=d4a4f9291d63b48b368f79bce3151bee9ca28009'/>
<id>urn:sha1:d4a4f9291d63b48b368f79bce3151bee9ca28009</id>
<content type='text'>
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove cache.h inclusion due to object.h changes</title>
<updated>2023-04-11T15:52:10Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-11T07:41:56Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=d812c3b6a0b41d48ce68f971d9cec50676048863'/>
<id>urn:sha1:d812c3b6a0b41d48ce68f971d9cec50676048863</id>
<content type='text'>
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Acked-by: Calvin Wan &lt;calvinwan@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object-name.h: move declarations for object-name.c functions from cache.h</title>
<updated>2023-04-11T15:52:09Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-04-11T07:41:49Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=dabab1d6e6c49f3d4a6393a9984e3f6a4e8fb991'/>
<id>urn:sha1:dabab1d6e6c49f3d4a6393a9984e3f6a4e8fb991</id>
<content type='text'>
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Acked-by: Calvin Wan &lt;calvinwan@google.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h</title>
<updated>2023-04-04T15:25:52Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-04-04T15:25:52Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e7dca80692dec7f0717d9ad6d978d0ceab90cf6a'/>
<id>urn:sha1:e7dca80692dec7f0717d9ad6d978d0ceab90cf6a</id>
<content type='text'>
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect &amp; verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
</content>
</entry>
<entry>
<title>cocci: apply the "refs.h" part of "the_repository.pending"</title>
<updated>2023-03-28T14:36:46Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2023-03-28T13:58:54Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=12cb1c10a64170a5d600dd1c6c8abfeec105fb6b'/>
<id>urn:sha1:12cb1c10a64170a5d600dd1c6c8abfeec105fb6b</id>
<content type='text'>
Apply the part of "the_repository.pending.cocci" pertaining to
"refs.h".

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
