<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/pack-bitmap-write.c, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2021-09-01T20:56:43Z</updated>
<entry>
<title>pack-bitmap: read multi-pack bitmaps</title>
<updated>2021-09-01T20:56:43Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-08-31T20:52:21Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=0f533c728418fd3ef6ebcae5240e8df566cdaa72'/>
<id>urn:sha1:0f533c728418fd3ef6ebcae5240e8df566cdaa72</id>
<content type='text'>
This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.

Note that there are currently no writers who write multi-pack bitmaps,
and that this will be implemented in the subsequent commit. Note also
that get_midx_checksum() and get_midx_filename() are made non-static so
they can be called from pack-bitmap.c.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write.c: free existing bitmaps</title>
<updated>2021-08-24T20:21:13Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-08-24T16:15:56Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=1d7f7f242c89947c3f0bc0ade651e614a2f0a38f'/>
<id>urn:sha1:1d7f7f242c89947c3f0bc0ade651e614a2f0a38f</id>
<content type='text'>
When writing a new bitmap, the bitmap writer code attempts to read the
existing bitmap (if one is present). This is done in order to quickly
permute the bits of any bitmaps for commits which appear in the existing
bitmap, and were also selected for the new bitmap.

But since this code was added in 341fa34887 (pack-bitmap-write: use
existing bitmaps, 2020-12-08), the resources associated with opening an
existing bitmap were never released.

It's fine to ignore this, but it's bad hygiene. It will also cause a
problem for the multi-pack-index builtin, which will be responsible not
only for writing bitmaps, but also for expiring any old multi-pack
bitmaps.

If an existing bitmap was reused here, it will also be expired. That
will cause a problem on platforms which require file resources to be
closed before unlinking them, like Windows. Avoid this by ensuring we
close reused bitmaps with free_bitmap_index() before removing them.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write.c: gracefully fail to write non-closed bitmaps</title>
<updated>2021-08-24T20:21:13Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-08-24T16:15:54Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=3ba3d0621b6822e974f06123db9394b33e454ed7'/>
<id>urn:sha1:3ba3d0621b6822e974f06123db9394b33e454ed7</id>
<content type='text'>
The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).

Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).

But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.

As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.

A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>oid_pos(): access table through const pointers</title>
<updated>2021-01-28T20:03:26Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-01-28T06:20:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=8380dcd700e80cd22745bcffb3625f90f943950c'/>
<id>urn:sha1:8380dcd700e80cd22745bcffb3625f90f943950c</id>
<content type='text'>
When we are looking up an oid in an array, we obviously don't need to
write to the array. Let's mark it as const in the function interfaces,
as well as in the local variables we use to derference the void pointer
(note a few cases use pointers-to-pointers, so we mark everything
const).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>hash_pos(): convert to oid_pos()</title>
<updated>2021-01-28T20:02:39Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-01-28T06:19:42Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=45ee13b942b26925b33d827bda2856e1ed0af0b7'/>
<id>urn:sha1:45ee13b942b26925b33d827bda2856e1ed0af0b7</id>
<content type='text'>
All of our callers are actually looking up an object_id, not a bare
hash. Likewise, the arrays they are looking in are actual arrays of
object_id (not just raw bytes of hashes, as we might find in a pack
.idx; those are handled by bsearch_hash()).

Using an object_id gives us more type safety, and makes the callers
slightly shorter. It also gets rid of the word "sha1" from several
access functions, though we could obviously also rename those with
s/sha1/hash/.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ma/sha1-is-a-hash'</title>
<updated>2021-01-15T23:20:29Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-01-15T23:20:29Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=8b327f1784d21c52bc177b5ca75665db388d6367'/>
<id>urn:sha1:8b327f1784d21c52bc177b5ca75665db388d6367</id>
<content type='text'>
Retire more names with "sha1" in it.

* ma/sha1-is-a-hash:
  hash-lookup: rename from sha1-lookup
  sha1-lookup: rename `sha1_pos()` as `hash_pos()`
  object-file.c: rename from sha1-file.c
  object-name.c: rename from sha1-name.c
</content>
</entry>
<entry>
<title>hash-lookup: rename from sha1-lookup</title>
<updated>2021-01-04T21:01:55Z</updated>
<author>
<name>Martin Ågren</name>
<email>martin.agren@gmail.com</email>
</author>
<published>2020-12-31T11:56:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=bc626927575cea80b8bc5fd0dbb6c6439e34e606'/>
<id>urn:sha1:bc626927575cea80b8bc5fd0dbb6c6439e34e606</id>
<content type='text'>
Change all remnants of "sha1" in hash-lookup.c and .h and rename them to
reflect that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren &lt;martin.agren@gmail.com&gt;
Reviewed-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>sha1-lookup: rename `sha1_pos()` as `hash_pos()`</title>
<updated>2021-01-04T21:01:55Z</updated>
<author>
<name>Martin Ågren</name>
<email>martin.agren@gmail.com</email>
</author>
<published>2020-12-31T11:56:22Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=7a7d992d0dbd78cf1fc477cf1e1caf61833b3e41'/>
<id>urn:sha1:7a7d992d0dbd78cf1fc477cf1e1caf61833b3e41</id>
<content type='text'>
Rename this function to reflect that we're not just able to handle SHA-1
these days. There are a few instances of "sha1" left in sha1-lookup.[ch]
after this, but those will be addressed in the next commit.

Signed-off-by: Martin Ågren &lt;martin.agren@gmail.com&gt;
Reviewed-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write: better reuse bitmaps</title>
<updated>2020-12-08T22:49:07Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-12-08T22:05:30Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=f077b0a9860f383a3cd0ce3a0e2a11ff2f27fc65'/>
<id>urn:sha1:f077b0a9860f383a3cd0ce3a0e2a11ff2f27fc65</id>
<content type='text'>
If the old bitmap file contains a bitmap for a given commit, then that
commit does not need help from intermediate commits in its history to
compute its final bitmap. Eject that commit from the walk and insert it
into a separate list of reusable commits that are eventually stored in
the list of commits for computing bitmaps.

This helps the repeat bitmap computation task, even if the selected
commits shift drastically. This helps when a previously-bitmapped commit
exists in the first-parent history of a newly-selected commit. Since we
stop the walk at these commits and we use a first-parent walk, it is
harder to walk "around" these bitmapped commits. It's not impossible,
but we can greatly reduce the computation time for many selected
commits.

             |   runtime (sec)    |   peak heap (GB)   |
             |                    |                    |
             |   from  |   with   |   from  |   with   |
             | scratch | existing | scratch | existing |
  -----------+---------+----------+---------+-----------
  last patch |  88.478 |   53.218 |   2.157 |    2.224 |
  this patch |  86.681 |   16.164 |   2.157 |    2.222 |

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap-write: relax unique revwalk condition</title>
<updated>2020-12-08T22:49:07Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2020-12-08T22:05:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=45f4eeb2919e2c802a6c1f64a2b1c299f7434938'/>
<id>urn:sha1:45f4eeb2919e2c802a6c1f64a2b1c299f7434938</id>
<content type='text'>
The previous commits improved the bitmap computation process for very
long, linear histories with many refs by removing quadratic growth in
how many objects were walked. The strategy of computing "intermediate
commits" using bitmasks for which refs can reach those commits
partitioned the poset of reachable objects so each part could be walked
exactly once. This was effective for linear histories.

However, there was a (significant) drawback: wide histories with many
refs had an explosion of memory costs to compute the commit bitmasks
during the exploration that discovers these intermediate commits. Since
these wide histories are unlikely to repeat walking objects, the benefit
of walking objects multiple times was not expensive before. But now, the
commit walk *before computing bitmaps* is incredibly expensive.

In an effort to discover a happy medium, this change reduces the walk
for intermediate commits to only the first-parent history. This focuses
the walk on how the histories converge, which still has significant
reduction in repeat object walks. It is still possible to create
quadratic behavior in this version, but it is probably less likely in
realistic data shapes.

Here is some data taken on a fresh clone of the kernel:

             |   runtime (sec)    |   peak heap (GB)   |
             |                    |                    |
             |   from  |   with   |   from  |   with   |
             | scratch | existing | scratch | existing |
  -----------+---------+----------+---------+-----------
    original |  64.044 |   83.241 |   2.088 |    2.194 |
  last patch |  45.049 |   37.624 |   2.267 |    2.334 |
  this patch |  88.478 |   53.218 |   2.157 |    2.224 |

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Helped-by: Johannes Schindelin &lt;Johannes.Schindelin@gmx.de&gt;
Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
