| field | value | date |
|---|---|---|
| author | Junio C Hamano <gitster@pobox.com> | 2025-02-18 15:30:31 -0800 |
| committer | Junio C Hamano <gitster@pobox.com> | 2025-02-18 15:30:31 -0800 |
| commit | e565f3755342caf1d21e22359eaf09ec11d8c0ae | |
| tree | 5f72183ddfbef4150aabe189127d03248c0e8151 /Documentation/git-backfill.adoc | |
| parent | 03944513488db4a81fdb4c21c3b515e4cb260b05 | |
| parent | 85127bcdeab5ab34f9c738da3fcc88d637f39089 | |

Merge branch 'ds/backfill'

Lazy-loading missing files in a blobless clone on demand is costly
as it tends to be one-blob-at-a-time. "git backfill" is introduced
to help bulk-download necessary files beforehand.

* ds/backfill:
  backfill: assume --sparse when sparse-checkout is enabled
  backfill: add --sparse option
  backfill: add --min-batch-size=<n> option
  backfill: basic functionality and tests
  backfill: add builtin boilerplate

Diffstat (limited to 'Documentation/git-backfill.adoc')
| -rw-r--r-- | Documentation/git-backfill.adoc | 71 |
1 files changed, 71 insertions, 0 deletions
diff --git a/Documentation/git-backfill.adoc b/Documentation/git-backfill.adoc
new file mode 100644
index 0000000000..95623051f7
--- /dev/null
+++ b/Documentation/git-backfill.adoc
@@ -0,0 +1,71 @@
+git-backfill(1)
+===============
+
+NAME
+----
+git-backfill - Download missing objects in a partial clone
+
+
+SYNOPSIS
+--------
+[synopsis]
+git backfill [--min-batch-size=<n>] [--[no-]sparse]
+
+DESCRIPTION
+-----------
+
+Blobless partial clones are created using `git clone --filter=blob:none`
+and then configure the local repository such that the Git client avoids
+downloading blob objects unless they are required for a local operation.
+This initially means that the clone and later fetches download reachable
+commits and trees but no blobs. Later operations that change the `HEAD`
+pointer, such as `git checkout` or `git merge`, may need to download
+missing blobs in order to complete their operation.
+
+In the worst cases, commands that compute blob diffs, such as `git blame`,
+become very slow as they download the missing blobs in single-blob
+requests to satisfy the missing object as the Git command needs it. This
+leads to multiple download requests and no ability for the Git server to
+provide delta compression across those objects.
+
+The `git backfill` command provides a way for the user to request that
+Git downloads the missing blobs (with optional filters) such that the
+missing blobs representing historical versions of files can be downloaded
+in batches. The `backfill` command attempts to optimize the request by
+grouping blobs that appear at the same path, hopefully leading to good
+delta compression in the packfile sent by the server.
+
+In this way, `git backfill` provides a mechanism to break a large clone
+into smaller chunks. Starting with a blobless partial clone with `git
+clone --filter=blob:none` and then running `git backfill` in the local
+repository provides a way to download all reachable objects in several
+smaller network calls than downloading the entire repository at clone
+time.
+
+By default, `git backfill` downloads all blobs reachable from the `HEAD`
+commit. This set can be restricted or expanded using various options.
+
+THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR MAY CHANGE IN THE FUTURE.
+
+
+OPTIONS
+-------
+
+`--min-batch-size=<n>`::
+	Specify a minimum size for a batch of missing objects to request
+	from the server. This size may be exceeded by the last set of
+	blobs seen at a given path. The default minimum batch size is
+	50,000.
+
+`--[no-]sparse`::
+	Only download objects if they appear at a path that matches the
+	current sparse-checkout. If the sparse-checkout feature is enabled,
+	then `--sparse` is assumed and can be disabled with `--no-sparse`.
+
+SEE ALSO
+--------
+linkgit:git-clone[1].
+
+GIT
+---
+Part of the linkgit:git[1] suite
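
The `--min-batch-size=<n>` text in the added documentation says that a batch "may be exceeded by the last set of blobs seen at a given path". The following Python sketch is one way to picture that rule. It is a simplified model, not Git's implementation: the function name `batch_missing_blobs` and the `blobs_by_path` mapping are hypothetical, and it assumes that missing blobs arrive already grouped by path and that the size check happens only after a whole path has been added.

```python
from typing import Dict, Iterator, List


def batch_missing_blobs(
    blobs_by_path: Dict[str, List[str]],
    min_batch_size: int = 50_000,
) -> Iterator[List[str]]:
    """Yield batches of blob IDs, flushing only at path boundaries.

    Hypothetical model of the documented behaviour, for illustration only.
    """
    batch: List[str] = []
    for path, blob_ids in blobs_by_path.items():
        # Keep every blob seen at one path together, so that a server
        # could delta-compress successive versions of the same file.
        batch.extend(blob_ids)
        # The check runs after the whole path is added, which is why a
        # batch may end up larger than min_batch_size.
        if len(batch) >= min_batch_size:
            yield batch
            batch = []
    if batch:
        # Final batch; it may be smaller than the minimum.
        yield batch


if __name__ == "__main__":
    # Made-up blob IDs, grouped by the path they appear at.
    history = {
        "README": ["a1", "a2", "a3"],
        "src/main.c": ["b1", "b2"],
        "src/util.c": ["c1"],
    }
    for batch in batch_missing_blobs(history, min_batch_size=4):
        print(len(batch), batch)
    # prints:
    # 5 ['a1', 'a2', 'a3', 'b1', 'b2']
    # 1 ['c1']
```

With a minimum of 4, the first batch holds 5 blobs because the two versions of `src/main.c` are never split across requests. The last batch is simply whatever remains.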
