From 9fec7b213045135655354e864d15894175428d5a Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 1 Sep 2021 15:09:50 +0200 Subject: connected: refactor iterator to return next object ID directly MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- fetch-pack.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) (limited to 'fetch-pack.c') diff --git a/fetch-pack.c b/fetch-pack.c index 0bf7ed7e47..e6ec79f81a 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -1912,16 +1912,15 @@ static void update_shallow(struct fetch_pack_args *args, oid_array_clear(&ref); } -static int iterate_ref_map(void *cb_data, struct object_id *oid) +static const struct object_id *iterate_ref_map(void *cb_data) { struct ref **rm = cb_data; struct ref *ref = *rm; if (!ref) - return -1; /* end of the list */ + return NULL; *rm = ref->next; - oidcpy(oid, &ref->old_oid); - return 0; + return &ref->old_oid; } struct ref *fetch_pack(struct fetch_pack_args *args, -- cgit v1.3 From 62b5a35a33ad6a4537e2ae75a49036e4173fcc87 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 1 Sep 2021 15:09:54 +0200 Subject: fetch-pack: optimize loading of refs via commit graph MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In order to negotiate a packfile, we need to dereference refs to see which commits we have in common with the remote. To do so, we first look up the object's type -- if it's a tag, we peel until we hit a non-tag object. If we hit a commit eventually, then we return that commit. In case the object ID points to a commit directly, we can avoid the initial lookup of the object type by opportunistically looking up the commit via the commit-graph, if available, which gives us a slight speed bump of about 2% in a huge repository with about 2.3M refs: Benchmark #1: HEAD~: git-fetch Time (mean ± σ): 31.634 s ± 0.258 s [User: 28.400 s, System: 5.090 s] Range (min … max): 31.280 s … 31.896 s 5 runs Benchmark #2: HEAD: git-fetch Time (mean ± σ): 31.129 s ± 0.543 s [User: 27.976 s, System: 5.056 s] Range (min … max): 30.172 s … 31.479 s 5 runs Summary 'HEAD: git-fetch' ran 1.02 ± 0.02 times faster than 'HEAD~: git-fetch' In case this fails, we fall back to the old code which peels the objects to a commit. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- fetch-pack.c | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'fetch-pack.c') diff --git a/fetch-pack.c b/fetch-pack.c index e6ec79f81a..a9604f35a3 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -119,6 +119,11 @@ static struct commit *deref_without_lazy_fetch(const struct object_id *oid, { enum object_type type; struct object_info info = { .typep = &type }; + struct commit *commit; + + commit = lookup_commit_in_graph(the_repository, oid); + if (commit) + return commit; while (1) { if (oid_object_info_extended(the_repository, oid, &info, -- cgit v1.3