<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit-graph.c, branch v2.44.0-rc0</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=v2.44.0-rc0</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=v2.44.0-rc0'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2024-01-26T16:54:45Z</updated>
<entry>
<title>Merge branch 'ps/commit-graph-write-leakfix'</title>
<updated>2024-01-26T16:54:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-01-26T16:54:45Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=f95bafbaed2d9f9c891e04c3680c1aa0da30629e'/>
<id>urn:sha1:f95bafbaed2d9f9c891e04c3680c1aa0da30629e</id>
<content type='text'>
Leakfix.

* ps/commit-graph-write-leakfix:
  commit-graph: fix memory leak when not writing graph
</content>
</entry>
<entry>
<title>Merge branch 'jk/commit-graph-slab-clear-fix'</title>
<updated>2024-01-16T18:11:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-01-16T18:11:58Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=4cc0f8e8fa3694fdef632784592fc8af63022618'/>
<id>urn:sha1:4cc0f8e8fa3694fdef632784592fc8af63022618</id>
<content type='text'>
Clearing in-core repository (happens during e.g., "git fetch
--recurse-submodules" with commit graph enabled) made in-core
commit object in an inconsistent state by discarding the necessary
data from commit-graph too early, which has been corrected.

* jk/commit-graph-slab-clear-fix:
  commit-graph: retain commit slab when closing NULL commit_graph
</content>
</entry>
<entry>
<title>commit-graph: fix memory leak when not writing graph</title>
<updated>2024-01-16T01:08:28Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2023-12-18T10:02:28Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=4efa9308eabba5b474f7ff5b43a8a7b767b6de79'/>
<id>urn:sha1:4efa9308eabba5b474f7ff5b43a8a7b767b6de79</id>
<content type='text'>
When `write_commit_graph()` bails out writing a split commit-graph early
then it may happen that we have already gathered the set of existing
commit-graph file names without yet determining the new merged set of
files. This can result in a memory leak though because we only clear the
preimage of files when we have collected the postimage.

Fix this issue by dropping the condition altogether so that we always
try to free both preimage and postimage filenames. As the context
structure is zero-initialized this simplification is safe to do.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Acked-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/header-cleanup'</title>
<updated>2024-01-08T22:05:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2024-01-08T22:05:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=492ee03f60297e7e83d101f4519ab8abc98782bc'/>
<id>urn:sha1:492ee03f60297e7e83d101f4519ab8abc98782bc</id>
<content type='text'>
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
</content>
</entry>
<entry>
<title>commit-graph: retain commit slab when closing NULL commit_graph</title>
<updated>2024-01-05T16:35:26Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2024-01-05T05:41:42Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=d70f554cdf38b0b05cfaa8e8eb9f80d54a5ae11c'/>
<id>urn:sha1:d70f554cdf38b0b05cfaa8e8eb9f80d54a5ae11c</id>
<content type='text'>
This fixes a regression introduced in ac6d45d11f (commit-graph: move
slab-clearing to close_commit_graph(), 2023-10-03), in which running:

  git -c fetch.writeCommitGraph=true fetch --recurse-submodules

multiple times in a freshly cloned repository causes a segfault. What
happens in the second (and subsequent) runs is this:

  1. We make a "struct commit" for any ref tips which we're storing
     (even if we already have them, they still go into FETCH_HEAD).

     Because the first run will have created a commit graph, we'll find
     those commits in the graph.

     The commit struct is therefore created with a NULL "maybe_tree"
     entry, because we can load its oid from the graph later. But to do
     that we need to remember that we got the commit from the graph,
     which is recorded in a global commit_graph_data_slab object.

  2. Because we're using --recurse-submodules, we'll try to fetch each
     of the possible submodules. That implies creating a separate
     "struct repository" in-process for each submodule, which will
     require a later call to repo_clear().

     The call to repo_clear() calls raw_object_store_clear(), which in
     turn calls close_object_store(), which in turn calls
     close_commit_graph(). And the latter frees the commit graph data
     slab.

  3. Later, when trying to write out a new commit graph, we'll ask for
     their tree oid via get_commit_tree_oid(), which will see that the
     object is parsed but with a NULL maybe_tree field. We'd then
     usually pull it from the graph file, but because the slab was
     cleared, we don't realize that we can do so! We end up returning
     NULL and segfaulting.

     (It seems questionable that we'd write a graph entry for such a
     commit anyway, since we know we already have one. I didn't
     double-check, but that may simply be another side effect of having
     cleared the slab).

The bug is in step (2) above. We should not be clearing the slab when
cleaning up the submodule repository structs. Prior to ac6d45d11f, we
did not do so because it was done inside a helper function that returned
early when it saw NULL. So the behavior change from that commit is that
we'll now _always_ clear the slab via repo_clear(), even if the
repository being closed did not have a commit graph (and thus would have
a NULL commit_graph struct).

The most immediate fix is to add in a NULL check in close_commit_graph(),
making it a true noop when passed in an object_store with a NULL
commit_graph (it's OK to just return early, since the rest of its code
is already a noop when passed NULL). That restores the pre-ac6d45d11f
behavior. And that's what this patch does, along with a test that
exercises it (we already have a test that uses submodules along with
fetch.writeCommitGraph, but the bug only triggers when there is a
subsequent fetch and when that fetch uses --recurse-submodules).

So that fixes the regression in the least-risky way possible.

I do think there's some fragility here that we might want to follow up
on. We have a global commit_graph_data_slab that contains graph
positions, and our global commit structs depend on the that slab
remaining valid. But close_commit_graph() is just about closing _one_
object store's graph. So it's dangerous to call that function and clear
the slab without also throwing away any "struct commit" we might have
parsed that depends on it.

Which at first glance seems like a bug we could already trigger. In the
situation described here, there is no commit graph in the submodule
repository, so our commit graph is NULL (in fact, in our test script
there is no submodule repo at all, so we immediately return from
repo_init() and call repo_clear() only to free up memory). But what
would happen if there was one? Wouldn't we see a non-NULL commit_graph
entry, and then clear the global slab anyway?

The answer is "no", but for very bizarre reasons. Remember that
repo_clear() calls raw_object_store_clear(), which then calls
close_object_store() and thus close_commit_graph(). But before it does
so, raw_object_store_clear() does something else: it frees the commit
graph and sets it to NULL! So by this code path we'll _never_ see a
non-NULL commit_graph struct, and thus never clear the slab.

So it happens to work out. But it still seems questionable to me that we
would clear a global slab (which might still be in use) when closing the
commit graph. This clearing comes from 957ba814bf (commit-graph: when
closing the graph, also release the slab, 2021-09-08), and was fixing a
case where we really did need it to be closed (and in that case we
presumably call close_object_store() more directly).

So I suspect there may still be a bug waiting to happen there, as any
object loaded before the call to close_object_store() may be stranded
with a bogus maybe_tree entry (and thus looking at it after the call
might cause an error). But I'm not sure how to trigger it, nor what the
fix should look like (you probably would need to "unparse" any objects
pulled from the graph). And so this patch punts on that for now in favor
of fixing the recent regression in the most direct way, which should not
have any other fallouts.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary includes in source files</title>
<updated>2023-12-26T20:04:33Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:15:00Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=d57c671a511d885a5cd390e3d6064c37af524a91'/>
<id>urn:sha1:d57c671a511d885a5cd390e3d6064c37af524a91</id>
<content type='text'>
Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: add direct includes currently only pulled in transitively</title>
<updated>2023-12-26T20:04:32Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:14:59Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ec2101abf3ea00a3cbb4c88c14d6658fa6d09984'/>
<id>urn:sha1:ec2101abf3ea00a3cbb4c88c14d6658fa6d09984</id>
<content type='text'>
The next commit will remove a bunch of unnecessary includes, but to do
so, we need some of the lower level direct includes that files rely on
to be explicitly specified.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>treewide: remove unnecessary includes in source files</title>
<updated>2023-12-26T20:04:31Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2023-12-23T17:14:50Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=eea0e59ffbed6e33d171ace5be13cde9faa41639'/>
<id>urn:sha1:eea0e59ffbed6e33d171ace5be13cde9faa41639</id>
<content type='text'>
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ps/commit-graph-less-paranoid'</title>
<updated>2023-12-18T22:10:11Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-12-18T22:10:11Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=66685e85556ad1890d896bc30ba3a7a99bf4dd78'/>
<id>urn:sha1:66685e85556ad1890d896bc30ba3a7a99bf4dd78</id>
<content type='text'>
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications.  The default has been
flipped to disable this pessimization.

* ps/commit-graph-less-paranoid:
  commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
</content>
</entry>
<entry>
<title>Merge branch 'jk/chunk-bounds-more'</title>
<updated>2023-12-10T00:37:48Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2023-12-10T00:37:48Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=34401b7ddbd79c8511e864048bc52d896eac2f22'/>
<id>urn:sha1:34401b7ddbd79c8511e864048bc52d896eac2f22</id>
<content type='text'>
Code clean-up for jk/chunk-bounds topic.

* jk/chunk-bounds-more:
  commit-graph: mark chunk error messages for translation
  commit-graph: drop verify_commit_graph_lite()
  commit-graph: check order while reading fanout chunk
  commit-graph: use fanout value for graph size
  commit-graph: abort as soon as we see a bogus chunk
  commit-graph: clarify missing-chunk error messages
  commit-graph: drop redundant call to "lite" verification
  midx: check consistency of fanout table
  commit-graph: handle overflow in chunk_size checks
</content>
</entry>
</feed>
