aboutsummaryrefslogtreecommitdiff
path: root/t
diff options
context:
space:
mode:
authorTaylor Blau <me@ttaylorr.com>2026-02-24 14:00:41 -0500
committerJunio C Hamano <gitster@pobox.com>2026-02-24 11:16:34 -0800
commit9df44a97f165bbdd8d591c6e99c8894ae036b5bd (patch)
treef0fdbc701c8e8e47966817eda9c74cfdcd6e3360 /t
parentdedf71f0b1b9a64d5aa7558ef45b26166d8d66fc (diff)
downloadgit-9df44a97f165bbdd8d591c6e99c8894ae036b5bd.tar.xz
midx: implement MIDX compaction
When managing a MIDX chain with many layers, it is convenient to combine a sequence of adjacent layers into a single layer to prevent the chain from growing too long. While it is conceptually possible to "compact" a sequence of MIDX layers together by running "git multi-pack-index write --stdin-packs", there are a few drawbacks that make this less than desirable: - Preserving the MIDX chain is impossible, since there is no way to write a MIDX layer that contains objects or packs found in an earlier MIDX layer already part of the chain. So callers would have to write an entirely new (non-incremental) MIDX containing only the compacted layers, discarding all other objects/packs from the MIDX. - There is (currently) no way to write a MIDX layer outside of the MIDX chain to work around the above, such that the MIDX chain could be reassembled substituting the compacted layers with the MIDX that was written. - The `--stdin-packs` command-line option does not allow us to specify the order of packs as they appear in the MIDX. Therefore, even if there were workarounds for the previous two challenges, any bitmaps belonging to layers which come after the compacted layer(s) would no longer be valid. This commit introduces a way to compact a sequence of adjacent MIDX layers into a single layer while preserving the MIDX chain, as well as any bitmap(s) in layers which are newer than the compacted ones. Implementing MIDX compaction does not require a significant number of changes to how MIDX layers are written. The main changes are as follows: - Instead of calling `fill_packs_from_midx()`, we call a new function `fill_packs_from_midx_range()`, which walks backwards along the portion of the MIDX chain which we are compacting, and adds packs one layer a time. In order to preserve the pseudo-pack order, the concatenated pack order is preserved, with the exception of preferred packs which are always added first. - After adding entries from the set of packs in the compaction range, `compute_sorted_entries()` must adjust the `pack_int_id`'s for all objects added in each fanout layer to match their original `pack_int_id`'s (as opposed to the index at which each pack appears in `ctx.info`). Note that we cannot reuse `midx_fanout_add_midx_fanout()` directly here, as it unconditionally recurs through the `->base_midx`. Factor out a `_1()` variant that operates on a single layer, reimplement the existing function in terms of it, and use the new variant from `midx_fanout_add_compact()`. Since we are sorting the list of objects ourselves, the order we add them in does not matter. - When writing out the new 'multi-pack-index-chain' file, discard any layers in the compaction range, replacing them with the newly written layer, instead of keeping them and placing the new layer at the end of the chain. This ends up being sufficient to implement MIDX compaction in such a way that preserves bitmaps corresponding to more recent layers in the MIDX chain. The tests for MIDX compaction are so far fairly spartan, since the main interesting behavior here is ensuring that the right packs/objects are selected from each layer, and that the pack order is preserved despite whether or not they are sorted in lexicographic order in the original MIDX chain. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 't')
-rw-r--r--t/meson.build1
-rwxr-xr-xt/t5335-compact-multi-pack-index.sh175
2 files changed, 176 insertions, 0 deletions
diff --git a/t/meson.build b/t/meson.build
index f80e366cff..2421220917 100644
--- a/t/meson.build
+++ b/t/meson.build
@@ -618,6 +618,7 @@ integration_tests = [
't5332-multi-pack-reuse.sh',
't5333-pseudo-merge-bitmaps.sh',
't5334-incremental-multi-pack-index.sh',
+ 't5335-compact-multi-pack-index.sh',
't5351-unpack-large-objects.sh',
't5400-send-pack.sh',
't5401-update-hooks.sh',
diff --git a/t/t5335-compact-multi-pack-index.sh b/t/t5335-compact-multi-pack-index.sh
new file mode 100755
index 0000000000..797ae05c3b
--- /dev/null
+++ b/t/t5335-compact-multi-pack-index.sh
@@ -0,0 +1,175 @@
+#!/bin/sh
+
+test_description='multi-pack-index compaction'
+
+. ./test-lib.sh
+
+GIT_TEST_MULTI_PACK_INDEX=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
+GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0
+
+objdir=.git/objects
+packdir=$objdir/pack
+midxdir=$packdir/multi-pack-index.d
+midx_chain=$midxdir/multi-pack-index-chain
+
+nth_line() {
+ local n="$1"
+ shift
+ awk "NR==$n" "$@"
+}
+
+write_packs () {
+ for c in "$@"
+ do
+ test_commit "$c" &&
+
+ git pack-objects --all --unpacked $packdir/pack-$c &&
+ git prune-packed &&
+
+ git multi-pack-index write --incremental --bitmap || return 1
+ done
+}
+
+test_midx_layer_packs () {
+ local checksum="$1" &&
+ shift &&
+
+ test-tool read-midx $objdir "$checksum" >out &&
+
+ printf "%s\n" "$@" >expect &&
+ # NOTE: do *not* pipe through sort here, we want to ensure the
+ # order of packs is preserved during compaction.
+ grep "^pack-" out | cut -d"-" -f2 >actual &&
+
+ test_cmp expect actual
+}
+
+test_midx_layer_object_uniqueness () {
+ : >objs.all
+ while read layer
+ do
+ test-tool read-midx --show-objects $objdir "$layer" >out &&
+ grep "\.pack$" out | cut -d" " -f1 | sort >objs.layer &&
+ test_stdout_line_count = 0 comm -12 objs.all objs.layer &&
+ cat objs.all objs.layer | sort >objs.tmp &&
+ mv objs.tmp objs.all || return 1
+ done <$midx_chain
+}
+
+test_expect_success 'MIDX compaction with lex-ordered pack names' '
+ git init midx-compact-lex-order &&
+ (
+ cd midx-compact-lex-order &&
+
+ git config maintenance.auto false &&
+
+ write_packs A B C D E &&
+ test_line_count = 5 $midx_chain &&
+
+ git multi-pack-index compact --incremental \
+ "$(nth_line 2 "$midx_chain")" \
+ "$(nth_line 4 "$midx_chain")" &&
+ test_line_count = 3 $midx_chain &&
+
+ test_midx_layer_packs "$(nth_line 1 "$midx_chain")" A &&
+ test_midx_layer_packs "$(nth_line 2 "$midx_chain")" B C D &&
+ test_midx_layer_packs "$(nth_line 3 "$midx_chain")" E &&
+
+ test_midx_layer_object_uniqueness
+ )
+'
+
+test_expect_success 'MIDX compaction with non-lex-ordered pack names' '
+ git init midx-compact-non-lex-order &&
+ (
+ cd midx-compact-non-lex-order &&
+
+ git config maintenance.auto false &&
+
+ write_packs D C A B E &&
+ test_line_count = 5 $midx_chain &&
+
+ git multi-pack-index compact --incremental \
+ "$(nth_line 2 "$midx_chain")" \
+ "$(nth_line 4 "$midx_chain")" &&
+ test_line_count = 3 $midx_chain &&
+
+ test_midx_layer_packs "$(nth_line 1 "$midx_chain")" D &&
+ test_midx_layer_packs "$(nth_line 2 "$midx_chain")" C A B &&
+ test_midx_layer_packs "$(nth_line 3 "$midx_chain")" E &&
+
+ test_midx_layer_object_uniqueness
+ )
+'
+
+test_expect_success 'setup for bogus MIDX compaction scenarios' '
+ git init midx-compact-bogus &&
+ (
+ cd midx-compact-bogus &&
+
+ git config maintenance.auto false &&
+
+ write_packs A B C
+ )
+'
+
+test_expect_success 'MIDX compaction with missing endpoints' '
+ (
+ cd midx-compact-bogus &&
+
+ test_must_fail git multi-pack-index compact --incremental \
+ "<missing>" "<missing>" 2>err &&
+ test_grep "could not find MIDX: <missing>" err &&
+
+ test_must_fail git multi-pack-index compact --incremental \
+ "<missing>" "$(nth_line 2 "$midx_chain")" 2>err &&
+ test_grep "could not find MIDX: <missing>" err &&
+
+ test_must_fail git multi-pack-index compact --incremental \
+ "$(nth_line 2 "$midx_chain")" "<missing>" 2>err &&
+ test_grep "could not find MIDX: <missing>" err
+ )
+'
+
+test_expect_success 'MIDX compaction with reversed endpoints' '
+ (
+ cd midx-compact-bogus &&
+
+ from="$(nth_line 3 "$midx_chain")" &&
+ to="$(nth_line 1 "$midx_chain")" &&
+
+ test_must_fail git multi-pack-index compact --incremental \
+ "$from" "$to" 2>err &&
+
+ test_grep "MIDX $from must be an ancestor of $to" err
+ )
+'
+
+test_expect_success 'MIDX compaction with identical endpoints' '
+ (
+ cd midx-compact-bogus &&
+
+ from="$(nth_line 3 "$midx_chain")" &&
+ to="$(nth_line 3 "$midx_chain")" &&
+
+ test_must_fail git multi-pack-index compact --incremental \
+ "$from" "$to" 2>err &&
+
+ test_grep "MIDX compaction endpoints must be unique" err
+ )
+'
+
+test_expect_success 'MIDX compaction with midx.version=1' '
+ (
+ cd midx-compact-bogus &&
+
+ test_must_fail git -c midx.version=1 multi-pack-index compact \
+ "$(nth_line 1 "$midx_chain")" \
+ "$(nth_line 2 "$midx_chain")" 2>err &&
+
+ test_grep "fatal: cannot perform MIDX compaction with v1 format" err
+ )
+'
+
+test_done