<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/pack-bitmap.c, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2021-12-10T22:35:08Z</updated>
<entry>
<title>Merge branch 'jk/test-bitmap-fix'</title>
<updated>2021-12-10T22:35:08Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-12-10T22:35:08Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=a9c84980d0e55fa7802be4b02b12801ed7cd06d6'/>
<id>urn:sha1:a9c84980d0e55fa7802be4b02b12801ed7cd06d6</id>
<content type='text'>
Tighten code for testing pack-bitmap.

* jk/test-bitmap-fix:
  test_bitmap_hashes(): handle repository without bitmaps
</content>
</entry>
<entry>
<title>test_bitmap_hashes(): handle repository without bitmaps</title>
<updated>2021-11-05T18:52:42Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-11-05T09:01:31Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=875da7f061bf141aa6bf2c34afad1cf16d179e17'/>
<id>urn:sha1:875da7f061bf141aa6bf2c34afad1cf16d179e17</id>
<content type='text'>
If prepare_bitmap_git() returns NULL (one easy-to-trigger cause being
that the repository does not have bitmaps at all), then we'll segfault
accessing bitmap_git-&gt;hashes:

  $ t/helper/test-tool bitmap dump-hashes
  Segmentation fault

We should treat this the same as a repository with bitmaps but no
name-hashes, and quietly produce an empty output. The later call to
free_bitmap_index() in the cleanup label is OK, as it treats a NULL
pointer as a noop.
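
As an illustration only (this is not the code from the patch), the
shape of the fix can be sketched as below. The names struct index,
load_index(), free_index(), and dump_hashes() are hypothetical stand-ins
for the real bitmap_index helpers. The sketch only shows a NULL index
taking the same quiet path as an index without name-hashes, and a
cleanup function that accepts NULL.

<![CDATA[
```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bitmap_index. */
struct index {
	unsigned int *hashes; /* NULL when no name-hash cache is present */
	size_t nr;
};

/* Returns NULL when the repository has no bitmaps at all. */
static struct index *load_index(int have_bitmaps, int have_hashes)
{
	struct index *idx;

	if (!have_bitmaps)
		return NULL;
	idx = calloc(1, sizeof(*idx));
	if (!idx)
		abort();
	if (have_hashes) {
		idx->nr = 2;
		idx->hashes = calloc(idx->nr, sizeof(*idx->hashes));
		if (!idx->hashes)
			abort();
	}
	return idx;
}

/* Like free(), a NULL argument is a no-op. */
static void free_index(struct index *idx)
{
	if (!idx)
		return;
	free(idx->hashes);
	free(idx);
}

/* Prints one line per hash; returns how many lines were printed. */
static size_t dump_hashes(int have_bitmaps, int have_hashes)
{
	struct index *idx = load_index(have_bitmaps, have_hashes);
	size_t i, printed = 0;

	/* Check the pointer itself before looking at its members. */
	if (!idx || !idx->hashes)
		goto cleanup;

	for (i = 0; i < idx->nr; i++) {
		printf("%u\n", idx->hashes[i]);
		printed++;
	}

cleanup:
	free_index(idx);
	return printed;
}
```
]]>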

This isn't a big deal in practice, as this function is intended for and
used only by test-tool. It's probably worth fixing to avoid confusion,
but not worth adding coverage for this to the test suite.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap.c: more aggressively free in free_bitmap_index()</title>
<updated>2021-10-28T22:32:14Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-10-26T21:01:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=655b8561d6b10f22f0e7350df9388110667001af'/>
<id>urn:sha1:655b8561d6b10f22f0e7350df9388110667001af</id>
<content type='text'>
The function free_bitmap_index() is somewhat lax in what it frees. There
are two notable examples:

  - While it does call kh_destroy_oid_map on the "bitmaps" map, which
    maps commit OIDs to their corresponding bitmaps, the bitmaps
    themselves are not freed. Note here that we recycle already-freed
    ewah_bitmaps into a pool, but these are handled correctly by
    ewah_pool_free().

  - We never bother to free the extended index's "positions" map, which
    we always allocate in load_bitmap().

Fix both of these.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap.c: don't leak type-level bitmaps</title>
<updated>2021-10-28T22:32:14Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-10-26T21:01:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=022815114a8a57188bc0e8fd622e10d5e22604dc'/>
<id>urn:sha1:022815114a8a57188bc0e8fd622e10d5e22604dc</id>
<content type='text'>
test_bitmap_walk() is used to implement `git rev-list --test-bitmap`,
which compares the result of the on-disk bitmaps with ones generated
on-the-fly during a revision walk.

In fa95666a40 (pack-bitmap.c: harden 'test_bitmap_walk()' to check type
bitmaps, 2021-08-24), we hardened those tests to also check the four
special type-level bitmaps, but never freed those bitmaps. We should
have, since each required an allocation when we EWAH-decompressed them.

Free those, plugging that leak, and free the base (the scratch-pad
bitmap), too.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx.c: write MIDX filenames to strbuf</title>
<updated>2021-10-28T22:32:14Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-10-26T21:01:21Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=60980aed786487e9113f0cb2907dfc75a77d363c'/>
<id>urn:sha1:60980aed786487e9113f0cb2907dfc75a77d363c</id>
<content type='text'>
To ask for the name of a MIDX and its corresponding .rev file, callers
invoke get_midx_filename() and get_midx_rev_filename(), respectively.
These both invoke xstrfmt(), allocating a chunk of memory which must be
freed later on.

This makes callers in pack-bitmap.c somewhat awkward. Consider
midx_bitmap_filename(), which is implemented like:

    return xstrfmt("%s-%s.bitmap",
                   get_midx_filename(midx-&gt;object_dir),
                   hash_to_hex(get_midx_checksum(midx)));

This leaks the second argument to xstrfmt(), which was itself allocated
with xstrfmt(). The caller could assign the results of both
get_midx_filename() and the outer xstrfmt() to temporary variables,
remembering to free() the former before returning. But that involves a
wasteful copy.

Instead, make get_midx_filename() and get_midx_rev_filename() take a
strbuf as an output parameter. This way midx_bitmap_filename() can
manipulate and pass around a temporary buffer which it detaches back to
its caller.

That allows us to implement the function in a way that doesn't leak,
without copying or open-coding get_midx_filename().
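
A rough sketch of the output-parameter idea follows. It is not the
patch itself: struct buf, buf_addf(), buf_detach(), midx_name(), and
bitmap_name() are hypothetical names, with struct buf standing in for
git's strbuf, and the path layout is assumed for illustration. The
point is that one buffer is built up in place and handed to the caller
once, so no intermediate allocation is left behind.

<![CDATA[
```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable buffer standing in for git's strbuf. */
struct buf {
	char *data;
	size_t len;
};

/* Append printf-style formatted text to the buffer. */
static void buf_addf(struct buf *b, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(NULL, 0, fmt, ap); /* measure the output first */
	va_end(ap);
	if (n < 0)
		abort();

	b->data = realloc(b->data, b->len + (size_t)n + 1);
	if (!b->data)
		abort();

	va_start(ap, fmt);
	vsnprintf(b->data + b->len, (size_t)n + 1, fmt, ap);
	va_end(ap);
	b->len += (size_t)n;
}

/* Hand ownership of the string to the caller and reset the buffer. */
static char *buf_detach(struct buf *b)
{
	char *ret = b->data;

	b->data = NULL;
	b->len = 0;
	return ret;
}

/* Output-parameter style: appends to "out" instead of allocating. */
static void midx_name(struct buf *out, const char *object_dir)
{
	buf_addf(out, "%s/pack/multi-pack-index", object_dir);
}

/* Builds the bitmap name in one buffer; the caller frees the result. */
static char *bitmap_name(const char *object_dir, const char *checksum_hex)
{
	struct buf b = { NULL, 0 };

	midx_name(&b, object_dir);
	buf_addf(&b, "-%s.bitmap", checksum_hex);
	return buf_detach(&b);
}
```
]]>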

Update the other callers of get_midx_filename() and
get_midx_rev_filename() accordingly.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'tb/repack-write-midx'</title>
<updated>2021-10-18T22:47:57Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-10-18T22:47:57Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=0b69bb0fb1ebe1a9ab7a3f4bfde5cad82eb892e3'/>
<id>urn:sha1:0b69bb0fb1ebe1a9ab7a3f4bfde5cad82eb892e3</id>
<content type='text'>
"git repack" has been taught to generate multi-pack reachability
bitmaps.

* tb/repack-write-midx:
  test-read-midx: fix leak of bitmap_index struct
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: rename variables that deal with non-kept packs
  builtin/repack.c: keep track of existing packs unconditionally
  midx: preliminary support for `--refs-snapshot`
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: expose `write_midx_file_only()` publicly
</content>
</entry>
<entry>
<title>builtin/repack.c: make largest pack preferred</title>
<updated>2021-09-29T04:20:56Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-09-29T01:55:20Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=6d08b9d4caa230441b7d9e2b4f23deaf9ff74c13'/>
<id>urn:sha1:6d08b9d4caa230441b7d9e2b4f23deaf9ff74c13</id>
<content type='text'>
When repacking into a geometric series and writing a multi-pack bitmap,
it is beneficial to have the largest resulting pack be the preferred
object source in the bitmap's MIDX, since selecting the large packs can
lead to fewer broken delta chains and better compression.

Teach 'git repack' to identify this pack and pass it to the MIDX write
machinery in order to mark it as preferred.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap.c: propagate namehash values from existing bitmaps</title>
<updated>2021-09-14T23:34:17Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-09-14T22:06:04Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=8de300e1f7ebe7099ce6ce60f6c6fb494e6703b2'/>
<id>urn:sha1:8de300e1f7ebe7099ce6ce60f6c6fb494e6703b2</id>
<content type='text'>
When an old bitmap exists while writing a new one, we load it and build
a "reposition" table which maps bit positions of objects from the old
bitmap to their respective positions in the new bitmap. This can help
when we encounter a commit which was selected in both the old and new
bitmap, since we only need to permute its bits (not recompute them from
scratch).
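
To make the "reposition" idea concrete, here is a simplified sketch. It
is not git's implementation: permute_bits() is a hypothetical helper,
and it assumes a bitmap small enough to fit in a single 64-bit word,
with reposition[i] giving the new position of the object at old
position i.

<![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/*
 * reposition[i] is the new bit position of the object at old position i.
 * Assumes nr_old and every new position are below 64, so that a whole
 * bitmap fits in one word.
 */
static uint64_t permute_bits(uint64_t old_bits, const uint32_t *reposition,
			     size_t nr_old)
{
	uint64_t new_bits = 0;
	size_t i;

	for (i = 0; i < nr_old; i++) {
		/* Carry each set bit over to its position in the new order. */
		if (old_bits & ((uint64_t)1 << i))
			new_bits |= (uint64_t)1 << reposition[i];
	}
	return new_bits;
}
```
]]>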

We do not, however, repurpose existing namehash values in the case of
the hash-cache extension. There has been thus far no good reason to do
so, since all of the namehash values for objects in the new bitmap would
be populated during the traversal that was just performed by
pack-objects when generating single-pack reachability bitmaps.

But this isn't the case for multi-pack bitmaps, which are written via
`git multi-pack-index write --bitmap` and do not perform any traversal.
In this case all namehash values are set to zero, but we don't even
bother to check the `pack.writeBitmapHashcache` option anyway, so it
fails to matter.

There are two approaches we could take to fill in non-zero hash-cache
values:

  - have either the multi-pack-index builtin run its own
    traversal to attempt to fill in some values, or let a hypothetical
    caller (like `pack-objects` when `repack` eventually drives the
    `multi-pack-index` builtin) fill in the values they found during
    their traversal

  - or copy any existing namehash values that were stored in an
    existing bitmap to their corresponding positions in the new bitmap

In a system where a repository is generally repacked with `git repack
--geometric=&lt;d&gt;` and occasionally repacked with `git repack -a`, the
hash-cache coverage will tend towards all objects.

Since populating the hash-cache is additive (i.e., doing so only helps
our delta search), any intermediate lack of full coverage is just fine.
So let's start by just propagating any values from the existing
hash-cache if we see one.
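
The propagation step might look roughly like the sketch below. The
names propagate_hashes() and NOT_IN_NEW are hypothetical, and the
sketch assumes a reposition array that maps each old object position to
its new position (or to NOT_IN_NEW when the object is absent from the
new bitmap). Slots that receive no value keep whatever the caller
initialized them to, zero in the usage shown in this message.

<![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

#define NOT_IN_NEW UINT32_MAX /* hypothetical marker: object was dropped */

/*
 * Copy each known name-hash from its old position to its new one.
 * Returns the number of values that were carried over.
 */
static size_t propagate_hashes(const uint32_t *old_hashes, size_t nr_old,
			       const uint32_t *reposition,
			       uint32_t *new_hashes)
{
	size_t i, copied = 0;

	for (i = 0; i < nr_old; i++) {
		if (reposition[i] == NOT_IN_NEW)
			continue;
		new_hashes[reposition[i]] = old_hashes[i];
		copied++;
	}
	return copied;
}
```
]]>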

The next patch will respect the `pack.writeBitmapHashcache` option while
writing MIDX bitmaps, and then test this new behavior.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>t/helper/test-bitmap.c: add 'dump-hashes' mode</title>
<updated>2021-09-14T23:34:17Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2021-09-14T22:06:02Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=a05f02b1d9a1253e11a327c95cd47cbd24317ba6'/>
<id>urn:sha1:a05f02b1d9a1253e11a327c95cd47cbd24317ba6</id>
<content type='text'>
The pack-bitmap writer code is about to learn how to propagate values
from an existing hash-cache. To prepare, teach the test-bitmap helper to
dump the values from a bitmap's hash-cache extension in order to test
those changes.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: drop bitmap_index argument from try_partial_reuse()</title>
<updated>2021-09-10T00:32:40Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-09-09T19:57:21Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=73cd7d9420bb7d75207e8149521db375c789a81c'/>
<id>urn:sha1:73cd7d9420bb7d75207e8149521db375c789a81c</id>
<content type='text'>
Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps,
2021-08-31), we no longer look at the "struct bitmap_index" passed to
try_partial_reuse(). This is because we only handle verbatim reuse from
a single pack: either the pack whose bitmap we're looking at, or the
"preferred" pack of a midx bitmap. And thus the primary item we look at
is the "pack" parameter added by that same commit, and not the
bitmap_git-&gt;pack parameter (which would be NULL for a midx bitmap). It's
our caller, reuse_partial_packfile_from_bitmap(), which decides which
pack to use and passes it in to us.

Drop the unused parameter to prevent confusion.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Reviewed-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
