From 1c573cdd7219db5600fb2b5249f7c8835c8d416d Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Fri, 20 May 2022 19:17:38 -0400 Subject: pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles' This structure will be used to communicate the per-object mtimes when writing a cruft pack. Here, we need the full packing_data structure because the mtime information is stored in an array there, not on the individual object_entry's themselves (to avoid paying the overhead in structure width for operations which do not generate a cruft pack). We haven't passed this information down before because one of the two callers (in bulk-checkin.c) does not have a packing_data structure at all. In that case (where no cruft pack will be generated), NULL is passed instead. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'builtin/pack-objects.c') diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 014dcd4bc9..6ac927047c 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1262,7 +1262,8 @@ static void write_pack_file(void) stage_tmp_packfiles(&tmpname, pack_tmp_name, written_list, nr_written, - &pack_idx_opts, hash, &idx_tmp_name); + &to_pack, &pack_idx_opts, hash, + &idx_tmp_name); if (write_bitmap_index) { size_t tmpname_len = tmpname.len; -- cgit v1.3 From fa23090b0c5cc27d0720535e27236fef9b409efb Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Fri, 20 May 2022 19:17:49 -0400 Subject: builtin/pack-objects.c: return from create_object_entry() A new caller in the next commit will want to immediately modify the object_entry structure created by create_object_entry(). Instead of forcing that caller to wastefully look-up the entry we just created, return it from create_object_entry() instead. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) (limited to 'builtin/pack-objects.c') diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 6ac927047c..c6d16872ee 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1516,13 +1516,13 @@ static int want_object_in_pack(const struct object_id *oid, return 1; } -static void create_object_entry(const struct object_id *oid, - enum object_type type, - uint32_t hash, - int exclude, - int no_try_delta, - struct packed_git *found_pack, - off_t found_offset) +static struct object_entry *create_object_entry(const struct object_id *oid, + enum object_type type, + uint32_t hash, + int exclude, + int no_try_delta, + struct packed_git *found_pack, + off_t found_offset) { struct object_entry *entry; @@ -1539,6 +1539,8 @@ static void create_object_entry(const struct object_id *oid, } entry->no_try_delta = no_try_delta; + + return entry; } static const char no_closure_warning[] = N_( -- cgit v1.3 From b757353676d6ffe1ac275366fa8b5b42b5d9727d Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Fri, 20 May 2022 19:17:52 -0400 Subject: builtin/pack-objects.c: --cruft without expiration Teach `pack-objects` how to generate a cruft pack when no objects are dropped (i.e., `--cruft-expiration=never`). Later patches will teach `pack-objects` how to generate a cruft pack that prunes objects. When generating a cruft pack which does not prune objects, we want to collect all unreachable objects into a single pack (noting and updating their mtimes as we accumulate them). Ordinary use will pass the result of a `git repack -A` as a kept pack, so when this patch says "kept pack", readers should think "reachable objects". Generating a non-expiring cruft packs works as follows: - Callers provide a list of every pack they know about, and indicate which packs are about to be removed. - All packs which are going to be removed (we'll call these the redundant ones) are marked as kept in-core. Any packs the caller did not mention (but are known to the `pack-objects` process) are also marked as kept in-core. Packs not mentioned by the caller are assumed to be unknown to them, i.e., they entered the repository after the caller decided which packs should be kept and which should be discarded. Since we do not want to include objects in these "unknown" packs (because we don't know which of their objects are or aren't reachable), these are also marked as kept in-core. - Then, we enumerate all objects in the repository, and add them to our packing list if they do not appear in an in-core kept pack. This results in a new cruft pack which contains all known objects that aren't included in the kept packs. When the kept pack is the result of `git repack -A`, the resulting pack contains all unreachable objects. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- Documentation/git-pack-objects.txt | 30 +++++ builtin/pack-objects.c | 201 +++++++++++++++++++++++++++++++++- object-file.c | 2 +- object-store.h | 2 + t/t5329-pack-objects-cruft.sh | 218 +++++++++++++++++++++++++++++++++++++ 5 files changed, 448 insertions(+), 5 deletions(-) create mode 100755 t/t5329-pack-objects-cruft.sh (limited to 'builtin/pack-objects.c') diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index f8344e1e5b..a9995a932c 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -13,6 +13,7 @@ SYNOPSIS [--no-reuse-delta] [--delta-base-offset] [--non-empty] [--local] [--incremental] [--window=] [--depth=] [--revs [--unpacked | --all]] [--keep-pack=] + [--cruft] [--cruft-expiration=