path: root/builtin/pack-redundant.c
Age | Commit message | Author
2026-02-21 | pack-redundant: fix memory leak when open_pack_index() fails | Sahitya Chandra
In add_pack(), we allocate l.remaining_objects with llist_init() before calling open_pack_index(). If open_pack_index() fails we return NULL without freeing the allocated list, leaking the memory. Fix by calling llist_free(l.remaining_objects) on the error path before returning. Signed-off-by: Sahitya Chandra <sahityajb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
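The shape of this fix can be sketched in isolation. This is a minimal illustration and not git's code: `struct list`, `list_init()`, `list_free()`, `open_index()` and `add_entry()` are hypothetical stand-ins for `llist_init()`, `llist_free()`, `open_pack_index()` and `add_pack()`, and the `live_lists` counter exists only so the leak can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the linked list used by pack-redundant. */
struct list {
	size_t size;
};

/* Counts live lists so that a leak is observable in this sketch. */
static int live_lists;

static struct list *list_init(void)
{
	struct list *l = calloc(1, sizeof(*l));

	assert(l != NULL);
	live_lists++;
	return l;
}

static void list_free(struct list *l)
{
	if (!l)
		return;
	free(l);
	live_lists--;
}

/* Hypothetical stand-in for the fallible step: non-zero means failure. */
static int open_index(int should_fail)
{
	return should_fail ? -1 : 0;
}

/*
 * Allocate first, then attempt the fallible step. On failure, release
 * what was allocated before returning NULL.
 */
static struct list *add_entry(int should_fail)
{
	struct list *remaining = list_init();

	if (open_index(should_fail)) {
		list_free(remaining); /* the error path must not leak */
		return NULL;
	}
	return remaining;
}
```

Without the `list_free()` call on the error path, every failed `add_entry()` would leave `live_lists` one higher than before.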
2025-10-30 | Merge branch 'ps/remove-packfile-store-get-packs' | Junio C Hamano
Two slightly different ways to get at "all the packfiles" in the API have been cleaned up.

* ps/remove-packfile-store-get-packs:
  packfile: rename `packfile_store_get_all_packs()`
  packfile: introduce macro to iterate through packs
  packfile: drop `packfile_store_get_packs()`
  builtin/grep: simplify how we preload packs
  builtin/gc: convert to use `packfile_store_get_all_packs()`
  object-name: convert to use `packfile_store_get_all_packs()`
2025-10-16 | packfile: introduce macro to iterate through packs | Patrick Steinhardt
We have a bunch of different sites that want to iterate through all packs of a given `struct packfile_store`. This pattern is somewhat verbose and repetitive, which makes it somewhat cumbersome. Introduce a new macro `repo_for_each_pack()` that removes some of the boilerplate. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
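A for-each macro of this kind might look roughly like the following. This is only a sketch under assumptions: `struct pack`, `struct pack_store` and `for_each_pack_in()` are hypothetical names, and the real `repo_for_each_pack()` may take different arguments and do more work (for example, preparing the store first).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified pack and store types. */
struct pack {
	struct pack *next;
	int num_objects;
};

struct pack_store {
	struct pack *packs;
};

/* Expands to a loop header; `p` visits each pack of `store` in turn. */
#define for_each_pack_in(store, p) \
	for ((p) = (store)->packs; (p); (p) = (p)->next)

/* Example call site: the loop boilerplate is hidden behind the macro. */
static int count_objects(const struct pack_store *store)
{
	struct pack *p;
	int total = 0;

	assert(store != NULL);
	for_each_pack_in(store, p)
		total += p->num_objects;
	return total;
}
```

The benefit is at the call sites: each one names the store and the loop variable, and nothing else.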
2025-10-07 | Merge branch 'ps/packfile-store' | Junio C Hamano
Code clean-up around the in-core list of all the pack files and object database(s).

* ps/packfile-store:
  packfile: refactor `get_packed_git_mru()` to work on packfile store
  packfile: refactor `get_all_packs()` to work on packfile store
  packfile: refactor `get_packed_git()` to work on packfile store
  packfile: move `get_multi_pack_index()` into "midx.c"
  packfile: introduce function to load and add packfiles
  packfile: refactor `install_packed_git()` to work on packfile store
  packfile: split up responsibilities of `reprepare_packed_git()`
  packfile: refactor `prepare_packed_git()` to work on packfile store
  packfile: reorder functions to avoid function declaration
  odb: move kept cache into `struct packfile_store`
  odb: move MRU list of packfiles into `struct packfile_store`
  odb: move packfile map into `struct packfile_store`
  odb: move initialization bit into `struct packfile_store`
  odb: move list of packfiles into `struct packfile_store`
  packfile: introduce a new `struct packfile_store`
2025-09-24 | packfile: refactor `get_all_packs()` to work on packfile store | Patrick Steinhardt
The `get_all_packs()` function prepares the packfile store and then returns its packfiles. Refactor it to accept a packfile store instead of a repository to clarify its scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17 | whatchanged: hint about git-log(1) and aliasing | Kristoffer Haugsbakk
There have been quite a few `--i-still-use-this` user reports since Git 2.51.0 was released.[1][2] And it doesn’t seem like they are reading the man page about the git-log(1) equivalent.

Tell them what options to plug into git-log(1), either as a replacement command or as an alias.[3] That template produces almost the same output[4] and is arguably a plug-in replacement.

Concretely, add an optional `hint` argument so that we can use it right after the initial error line.

Also mention the same concrete options in the documentation while we’re at it.

[1]: E.g.,
    • https://lore.kernel.org/git/e1a69dea-bcb6-45fc-83d3-9e50d32c410b@5y5.one/
    • https://lore.kernel.org/git/1011073f-9930-4360-a42f-71eb7421fe3f@chrispalmer.uk/#t
    • https://lore.kernel.org/git/9fcbfcc4-79f9-421f-b9a4-dc455f7db485@acm.org/#t
    • https://lore.kernel.org/git/83241BDE-1E0D-489A-9181-C608E9FCC17B@gmail.com/
[2]: The error message on 2.51.0 does tell them to report it, unconditionally
[3]: We allow aliasing deprecated builtins now for people who are very used to the command name or just like it a lot
[4]: You only get different outputs if you happen to have empty commits (no changes)[5]
[5]: https://lore.kernel.org/git/20250825085428.GA367101@coredump.intra.peff.net/

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-15 | Merge branch 'ps/object-store' | Junio C Hamano
Code clean-up around object access API.

* ps/object-store:
  odb: rename `read_object_with_reference()`
  odb: rename `pretend_object_file()`
  odb: rename `has_object()`
  odb: rename `repo_read_object_file()`
  odb: rename `oid_object_info()`
  odb: trivial refactorings to get rid of `the_repository`
  odb: get rid of `the_repository` when handling submodule sources
  odb: get rid of `the_repository` when handling the primary source
  odb: get rid of `the_repository` in `for_each()` functions
  odb: get rid of `the_repository` when handling alternates
  odb: get rid of `the_repository` in `odb_mkstemp()`
  odb: get rid of `the_repository` in `assert_oid_type()`
  odb: get rid of `the_repository` in `find_odb()`
  odb: introduce parent pointers
  object-store: rename files to "odb.{c,h}"
  object-store: rename `object_directory` to `odb_source`
  object-store: rename `raw_object_store` to `object_database`
2025-07-01 | object-store: rename files to "odb.{c,h}" | Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_source`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25 | Merge branch 'jc/you-still-use-whatchanged' | Junio C Hamano
"git whatchanged", which is longer to type than its modern rough equivalent "git log --raw", outlived its usefulness more than 10 years ago. Plan to deprecate and remove it.

* jc/you-still-use-whatchanged:
  whatschanged: list it in BreakingChanges document
  whatchanged: remove when built with WITH_BREAKING_CHANGES
  whatchanged: require --i-still-use-this
  tests: prepare for a world without whatchanged
  doc: prepare for a world without whatchanged
  you-still-use-that??: help deprecating commands for removal
2025-05-12 | you-still-use-that??: help deprecating commands for removal | Junio C Hamano
Commands slated for removal like "git pack-redundant" now require an explicit "--i-still-use-this" option to run. This is to discourage casual use and surface their pending deprecation to users. The warning message is long, so factor it into a helper function you_still_use_that() to simplify reuse by other commands. Also add a missing test to ensure this enforcement works for "pack-redundant". Helped-by: Elijah Newren <newren@gmail.com> [en: log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15 | object-store: merge "object-store-ll.h" and "object-store.h" | Patrick Steinhardt
2025-01-17 | builtin: send usage() help text to standard output | Junio C Hamano
Using the show_usage_and_exit_if_asked() helper we introduced earlier, fix callers of usage() that want to show the help text when explicitly asked by the end-user. The help text now goes to the standard output stream for them.

These are the bog standard "if we got only '-h', then that is a request for help" callers. Their

    if (argc == 2 && !strcmp(argv[1], "-h"))
        usage(message);

are simply replaced with

    show_usage_and_exit_if_asked(argc, argv, message);

With this, the built-ins tested by t0012 all send their help text to their standard output stream, so the check in t0012 that was half tightened earlier is now fully tightened to insist on standard error stream being empty.

Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
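The "only '-h' was given" check and the choice of output stream can be illustrated as below. This is a hedged sketch, not git's helper: `help_was_asked()` and `show_usage_if_asked()` are hypothetical names, and unlike the helper named in the commit message this version returns to the caller instead of exiting, so that it can be exercised directly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 when the sole argument is "-h", i.e. help was explicitly requested. */
static int help_was_asked(int argc, const char **argv)
{
	assert(argc >= 0);
	return argc == 2 && !strcmp(argv[1], "-h");
}

/*
 * Hypothetical helper: when help was asked for, write the usage text to
 * `out` (standard output for an explicit request) and report that it did.
 * A real caller would then terminate the program.
 */
static int show_usage_if_asked(FILE *out, int argc, const char **argv,
			       const char *usage_text)
{
	if (!help_was_asked(argc, argv))
		return 0;
	fprintf(out, "usage: %s\n", usage_text);
	return 1;
}
```

A caller would pass `stdout` here, keeping standard error reserved for the case where the usage text is shown because of a mistake on the command line.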
2024-12-23 | Merge branch 'ps/build-sign-compare' | Junio C Hamano
Start working to make the codebase buildable with -Wsign-compare.

* ps/build-sign-compare:
  t/helper: don't depend on implicit wraparound
  scalar: address -Wsign-compare warnings
  builtin/patch-id: fix type of `get_one_patchid()`
  builtin/blame: fix type of `length` variable when emitting object ID
  gpg-interface: address -Wsign-comparison warnings
  daemon: fix type of `max_connections`
  daemon: fix loops that have mismatching integer types
  global: trivial conversions to fix `-Wsign-compare` warnings
  pkt-line: fix -Wsign-compare warning on 32 bit platform
  csum-file: fix -Wsign-compare warning on 32-bit platform
  diff.h: fix index used to loop through unsigned integer
  config.mak.dev: drop `-Wno-sign-compare`
  global: mark code units that generate warnings with `-Wsign-compare`
  compat/win32: fix -Wsign-compare warning in "wWinMain()"
  compat/regex: explicitly ignore "-Wsign-compare" warnings
  git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-06 | global: trivial conversions to fix `-Wsign-compare` warnings | Patrick Steinhardt
We have a bunch of loops which iterate up to an unsigned boundary using a signed index, which generates warnings because we compare a signed and unsigned value in the loop condition. Address these sites for trivial cases and enable `-Wsign-compare` warnings for these code units. This patch only adapts those code units where we can drop the `DISABLE_SIGN_COMPARE_WARNINGS` macro in the same step. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
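The kind of trivial conversion described here can be shown on a small loop. This is an illustrative sketch, with `count_nonzero()` being a hypothetical function rather than anything from git: giving the index the same unsigned type as the bound removes the signed/unsigned comparison.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Counts the non-zero entries of an array whose length is unsigned.
 *
 * Writing the loop as `for (int i = 0; i < nr; i++)` would compare a
 * signed index against the unsigned `nr` and trigger -Wsign-compare.
 */
static size_t count_nonzero(const int *values, size_t nr)
{
	size_t count = 0;

	assert(values != NULL || nr == 0);
	/* The index has the same unsigned type as the boundary. */
	for (size_t i = 0; i < nr; i++)
		if (values[i])
			count++;
	return count;
}
```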
2024-12-06 | global: mark code units that generate warnings with `-Wsign-compare` | Patrick Steinhardt
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04 | packfile: pass down repository to `odb_pack_name` | Karthik Nayak
The function `odb_pack_name` currently relies on the global variable `the_repository`. To eliminate global variable usage in `packfile.c`, we should progressively shift the dependency on the_repository to higher layers. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-25 | packfile: drop sha1_pack_index_name() | Jeff King
Like sha1_pack_name() that we dropped in the previous commit, this function uses an error-prone static strbuf and the somewhat misleading name "sha1". The only caller left is in pack-redundant.c. While this command is marked for potential removal in our BreakingChanges document, we still have it for now. But it's simple enough to convert it to use its own strbuf with the underlying odb_pack_name() function, letting us drop the otherwise obsolete function. Note that odb_pack_name() does its own strbuf_reset(), so it's safe to use directly within a loop like this. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-09-30 | builtin/pack-redundant: fix various memory leaks | Patrick Steinhardt
There are various different memory leaks in git-pack-redundant(1), mostly caused by not even trying to free allocated memory. Fix them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 | builtin: remove USE_THE_REPOSITORY_VARIABLE from builtin.h | John Cai
Instead of including USE_THE_REPOSITORY_VARIABLE by default on every builtin, remove it from builtin.h and add it to all the builtins that include builtin.h (by definition, that means all builtins/*.c). Also, remove the include statement for repository.h since it gets brought in through builtin.h. The next step will be to migrate each builtin from having to use the_repository. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-13 | builtin: add a repository parameter for builtin functions | John Cai
In order to reduce the usage of the global the_repository, add a parameter to builtin functions that will get passed a repository variable. This commit uses UNUSED on most of the builtin functions, as subsequent commits will modify the actual builtins to pass the repository parameter down. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 | hash: require hash algorithm in `oidread()` and `oidclr()` | Patrick Steinhardt
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 | hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` | Patrick Steinhardt
Many of our hash functions have two variants, one receiving a `struct git_hash_algo` and one that derives it via `the_repository`. Adapt all of those functions to always require the hash algorithm as input and drop the variants that do not accept one. As those functions are now independent of `the_repository`, we can move them from "hash.h" to "hash-ll.h". Note that both in this and subsequent commits in this series we always just pass `the_repository->hash_algo` as input even if it is obvious that there is a repository in the context that we should be using the hash from instead. This is done to be on the safe side and not introduce any regressions. All callsites should eventually be amended to use a repo passed via parameters, but this is outside the scope of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 | object-store-ll.h: split this header out of object-store.h | Elijah Newren
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h.

After this patch:

    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-06 | Merge branch 'en/header-split-cleanup' | Junio C Hamano
Split key function and data structure definitions out of cache.h to new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 | Merge branch 'jk/unused-post-2.40-part2' | Junio C Hamano
Code clean-up for "-Wunused-parameter" build.

* jk/unused-post-2.40-part2:
  parse-options: drop parse_opt_unknown_cb()
  t/helper: mark unused argv/argc arguments
  mark "argv" as unused when we check argc
  builtins: mark unused prefix parameters
  builtins: annotate always-empty prefix parameters
  builtins: always pass prefix to parse_options()
  fast-import: fix file access when run from subdir
2023-04-04 | Merge branch 'jk/really-deprecate-pack-redundant' | Junio C Hamano
"git pack-redundant" gave a warning when run, as the command has outlived its usefulness long ago and is nominated for future removal. Now we escalate to give an error.

* jk/really-deprecate-pack-redundant:
  pack-redundant: escalate deprecation warning to an error
2023-03-28 | builtins: mark unused prefix parameters | Jeff King
All builtins receive a "prefix" parameter, but it is only useful if they need to adjust filenames given by the user on the command line. For builtins that do not even call parse_options(), they often don't look at the prefix at all, and -Wunused-parameter complains.

Let's annotate those to silence the compiler warning. I gave a quick scan of each of these cases, and it seems like they don't have anything they _should_ be using the prefix for (i.e., there is no hidden bug that we are missing). The only questionable cases I saw were:

  - in git-unpack-file, we create a tempfile which will always be at the root of the repository, even if the command is run from a subdir. Arguably this should be created in the subdir from which we're run (as we report the path only as a relative name). However, nobody has complained, and I'm hesitant to change something that is deep plumbing going back to April 2005 (though I think within our scripts, the sole caller in git-merge-one-file would be OK, as it moves to the toplevel itself).

  - in fetch-pack, local-filesystem remotes are taken as relative to the project root, not the current directory. So:

        git init server.git
        [...put stuff in server.git...]
        git init client.git
        cd client.git
        mkdir subdir
        cd subdir
        git fetch-pack ../../server.git ...

    won't work, as we quietly move to the top of the repository before interpreting the path (so "../server.git" would work). This is weird, but again, nobody has complained and this is how it has always worked. And this is how "git fetch" works, too. Plus it raises questions about how a configured remote like:

        git config remote.origin.url ../server.git

    should behave. I can certainly come up with a reasonable set of behavior, but it may not be worth stirring up complications in a plumbing tool.

So I've left the behavior untouched in both of those cases. If anybody really wants to revisit them, it's easy enough to drop the UNUSED marker. This commit is just about removing them as obstacles to turning on -Wunused-parameter all the time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-23 | pack-redundant: escalate deprecation warning to an error | Jeff King
In c3b58472be2 (pack-redundant: gauge the usage before proposing its removal, 2020-08-25), we added a big, ugly warning when pack-redundant is run. The plan there indicated that we would ratchet that up to an error before finally removing it. Since it has been 2.5 years (and 9 releases) since then, let's continue with the plan.

Note that we did get one bite on the warning, which was somebody asking about alternatives:

  https://lore.kernel.org/git/CAKvOHKAFXQwt4D8yUCCkf_TQL79mYaJ=KAKhtpDNTvHJFuX1NA@mail.gmail.com/

but we didn't undo the ugly warning (and the advice continues to be "use repack -d" instead).

There was also some discussion around the time of the deprecation that pack-redundant was invoked by the bitbake tool, and it still seems to do so now:

  https://git.openembedded.org/bitbake

That use should probably just go away in favor of an occasional repack (which probably even happens via auto-gc after fetch these days).

But since neither of those data points caused us to cancel the deprecation plan by dropping the warning, it seems like we should proceed with the next step.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 | treewide: be explicit about dependence on gettext.h | Elijah Newren
Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 | cache.h: remove dependence on hex.h; make other files include it explicitly | Elijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13 | doc txt & -h consistency: fix mismatching labels | Ævar Arnfjörð Bjarmason
Fix various inconsistencies between command SYNOPSIS and the corresponding -h output where our translatable labels didn't match up. In some cases we need to adjust the prose that follows the SYNOPSIS accordingly, as it refers back to the changed label. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-02 | tree-wide: apply equals-null.cocci | Junio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 | builtin/pack-redundant: avoid casting buffers to struct object_id | brian m. carlson
Now that we need our instances of struct object_id to be zero padded, we can no longer cast unsigned char buffers to be pointers to struct object_id. This file reads data out of the pack objects and then inserts it directly into a linked list item which is a pointer to struct object_id. Instead, let's have the linked list item hold its own struct object_id and copy the data into it. In addition, since these are not really pointers to struct object_id, stop passing them around as such, and call them what they really are: pointers to unsigned char. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
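The difference between pointing into a raw buffer and owning a zero-padded copy can be sketched as follows. Everything here is a simplified assumption for illustration: `struct oid`, `struct item`, `item_new()` and the two size constants are hypothetical stand-ins, not the definitions used in git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RAWSZ 32  /* assumed room for the largest supported hash */
#define SHA1_RAWSZ 20 /* raw size of a SHA-1 hash */

/* Simplified object ID: always MAX_RAWSZ bytes, zero padded. */
struct oid {
	unsigned char hash[MAX_RAWSZ];
};

/* The list item holds its own ID instead of a pointer into pack data. */
struct item {
	struct item *next;
	struct oid oid;
};

/*
 * Copy `rawsz` bytes out of the raw buffer. calloc() zeroes the item, so
 * the bytes after the hash stay zero padded.
 */
static struct item *item_new(const unsigned char *raw, size_t rawsz)
{
	struct item *it;

	assert(rawsz <= MAX_RAWSZ);
	it = calloc(1, sizeof(*it));
	if (!it)
		return NULL;
	memcpy(it->oid.hash, raw, rawsz);
	return it;
}
```

Casting a 20-byte buffer to `struct oid *` would instead let readers of the padding bytes run past the end of the hash into whatever follows it in the buffer.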
2021-03-13 | use CALLOC_ARRAY | René Scharfe
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
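How such a macro infers the element size can be seen in a small sketch. The macro below is named `CALLOC_ARRAY_SKETCH` to make clear it is an approximation written for this example: plain `calloc()` stands in for whatever allocator the real macro wraps, and `struct entry` and `make_entries()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Approximation of the macro: sizeof(*(x)) is derived from the pointer's
 * own type, so the call site never spells out the element type.
 */
#define CALLOC_ARRAY_SKETCH(x, nr) ((x) = calloc((nr), sizeof(*(x))))

struct entry {
	int id;
	char flag;
};

static struct entry *make_entries(size_t nr)
{
	struct entry *entries;

	assert(nr > 0);
	/* Open-coded equivalent: entries = calloc(nr, sizeof(struct entry)); */
	CALLOC_ARRAY_SKETCH(entries, nr);
	assert(entries != NULL);
	return entries;
}
```

If the type of `entries` later changes, the macro form keeps allocating the right size, whereas the open-coded form has to be updated by hand.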
2021-01-25 | Merge branch 'jc/deprecate-pack-redundant' | Junio C Hamano
Warn loudly when the "pack-redundant" command, which has been left stale with almost unusable performance issues, gets used, as we no longer want to recommend its use (just use "repack -d" instead).

* jc/deprecate-pack-redundant:
  pack-redundant: gauge the usage before proposing its removal
2020-12-23 | Merge branch 'jx/pack-redundant-on-single-pack' | Junio C Hamano
"git pack-redundant" when there is only one packfile used to crash, which has been corrected.

* jx/pack-redundant-on-single-pack:
  pack-redundant: fix crash when one packfile in repo
2020-12-16 | pack-redundant: fix crash when one packfile in repo | Jiang Xin
Command `git pack-redundant --all` will crash if there is only one packfile in the repository. This is because, if there is only one packfile in local_packs, `cmp_local_packs` will do nothing and will leave `pl->unique_objects` as uninitialized. Also add testcases for repository with no packfile and one packfile in t5323. Reported-by: Daniel C. Klauer <daniel.c.klauer@web.de> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 | pack-redundant: gauge the usage before proposing its removal | Junio C Hamano
The subcommand is unusably slow and the reason why nobody reports it as a performance bug is suspected to be the absence of users. Let's show a big message that asks the user to tell us that they still care about the command when an attempt is made to run the command, with an escape hatch to override it with a command line option. In a few releases, we may turn it into an error and keep it for a few more releases before finally removing it (during the whole time, the plan to remove it would be interrupted by end users raising their hands). Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 | use size_t to store pack .idx byte offsets | Jeff King
We sometimes store the offset into a pack .idx file as an "unsigned long", but the mmap'd size of a pack .idx file can exceed 4GB. This is sufficient on LP64 systems like Linux, but will be too small on LLP64 systems like Windows, where "unsigned long" is still only 32 bits. Let's use size_t, which is a better type for an offset into a memory buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
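The truncation can be reproduced with fixed-width types. This sketch makes two modelling assumptions so that it behaves the same on every platform: `uint32_t` stands in for a 32-bit `unsigned long` (as on LLP64), and `uint64_t` stands in for `size_t` on a 64-bit system. The function names and the 24-byte entry size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of entry `pos` in a table of fixed-size entries, computed
 * in a 32-bit type. The result wraps modulo 2^32 once it passes 4GB.
 */
static uint32_t entry_offset_32(uint32_t pos, uint32_t entry_size)
{
	assert(entry_size > 0);
	return (uint32_t)(pos * entry_size);
}

/* The same computation in a 64-bit type keeps the full offset. */
static uint64_t entry_offset_64(uint64_t pos, uint64_t entry_size)
{
	assert(entry_size > 0);
	return pos * entry_size;
}
```

For 200,000,000 entries of 24 bytes, the 64-bit result is 4,800,000,000 while the 32-bit result wraps to 505,032,704, which would point at the wrong place in the mapped file.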
2019-04-01 | object-store: rename and expand packed_git's sha1 member | brian m. carlson
This member is used to represent the pack checksum of the pack in question. Expand this member to be GIT_MAX_RAWSZ bytes in length so it works with longer hashes and rename it to be "hash" instead of "sha1". This transformation was made with a change to the definition and the following semantic patch:

    @@
    struct packed_git *E1;
    @@
    - E1->sha1
    + E1->hash

    @@
    struct packed_git E1;
    @@
    - E1.sha1
    + E1.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07 | Merge branch 'sc/pack-redundant' | Junio C Hamano
Update the implementation of pack-redundant for performance in a repository with many packfiles.

* sc/pack-redundant:
  pack-redundant: consistent sort method
  pack-redundant: rename pack_list.all_objects
  pack-redundant: new algorithm to find min packs
  pack-redundant: delete redundant code
  pack-redundant: delay creation of unique_objects
  t5323: test cases for git-pack-redundant
2019-02-04 | pack-redundant: consistent sort method | Jiang Xin
SZEDER reported that test case t5323 produces different results on macOS. This is because `cmp_pack_list_reverse` cannot give a deterministic order when two packs being sorted have the same number of remaining_objects. Change the sorting function so that t5323 produces consistent results. The new algorithm to find redundant packs is a trade-off to save memory resources; its result may differ from that of the old one and may sometimes not be the best. Update t5323 for the new algorithm. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
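One common way to make a sort order independent of the platform's sorting implementation is to add a tie-breaker to the comparison function. The sketch below shows that general technique only. The commit message does not say which secondary key the actual fix uses, so breaking ties by pack name is an assumption made for this example, and `struct pack_info`, `cmp_remaining_desc()` and `sort_packs()` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct pack_info {
	const char *name; /* assumed tie-breaker key, for illustration only */
	size_t remaining; /* number of remaining objects */
};

/*
 * Packs with more remaining objects sort first. Equal counts fall back to
 * comparing names, so the comparator never reports two distinct packs as
 * equal and the result does not depend on how ties are ordered internally.
 */
static int cmp_remaining_desc(const void *a_, const void *b_)
{
	const struct pack_info *a = a_;
	const struct pack_info *b = b_;

	if (a->remaining != b->remaining)
		return a->remaining > b->remaining ? -1 : 1;
	return strcmp(a->name, b->name);
}

static void sort_packs(struct pack_info *packs, size_t nr)
{
	assert(packs != NULL || nr == 0);
	if (nr)
		qsort(packs, nr, sizeof(*packs), cmp_remaining_desc);
}
```

Without the fallback, two packs with the same count may come out in either order, since the C standard leaves the relative order of equal elements in `qsort()` unspecified.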
2019-02-04 | pack-redundant: rename pack_list.all_objects | Jiang Xin
New algorithm uses `pack_list.all_objects` to track remaining objects, so rename it to `pack_list.remaining_objects`. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-04 | pack-redundant: new algorithm to find min packs | Sun Chao
When calling `git pack-redundant --all`, if there are too many local packs and too many redundant objects within them, the overly deep iteration of `get_permutations` will exhaust all the resources, and the `git pack-redundant` process will be killed.

The following script creates a repository with too many redundant packs; running `git pack-redundant --all` in the `test.git` repo will die soon.

    #!/bin/sh

    repo="$(pwd)/test.git"
    work="$(pwd)/test"
    i=1
    max=199

    if test -d "$repo" || test -d "$work"; then
        echo >&2 "ERROR: '$repo' or '$work' already exist"
        exit 1
    fi

    git init -q --bare "$repo"
    git --git-dir="$repo" config gc.auto 0
    git --git-dir="$repo" config transfer.unpackLimit 0
    git clone -q "$repo" "$work" 2>/dev/null

    while :; do
        cd "$work"
        echo "loop $i: $(date +%s)" >$i
        git add $i
        git commit -q -sm "loop $i"
        git push -q origin HEAD:master
        printf "\rCreate pack %4d/%d\t" $i $max
        if test $i -ge $max; then break; fi

        cd "$repo"
        git repack -q
        if test $(($i % 2)) -eq 0; then
            git repack -aq
            pack=$(ls -t $repo/objects/pack/*.pack | head -1)
            touch "${pack%.pack}.keep"
        fi
        i=$((i+1))
    done
    printf "\ndone\n"

To get the `min` unique pack list, we can replace the iteration in the `minimize` function with a new algorithm, which solves this issue:

1. Get the unique and non_unique packs, and add the unique packs to the `min` list.

2. Remove the objects of unique packs from non_unique packs, then each object left in the non_unique packs will have at least two copies.

3. Sort the non_unique packs by object count, more objects first, and add the first non_unique pack to the `min` list.

4. Drop the duplicated objects from other packs in the ordered non_unique pack list, and repeat step 3.

Some test cases will fail on Mac OS X. Mark them; they will be resolved in a later commit.

Original PR and discussions: https://github.com/jiangxin/git/pull/25

Signed-off-by: Sun Chao <sunchao9@huawei.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
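The four steps of the new algorithm can be sketched on a toy representation. This is an illustration under assumptions, not the code from the commit: each pack's objects are modelled as bits of a `uint64_t`, the result is a bitmask of chosen pack indexes, and instead of re-sorting the list on every round the sketch simply picks the pack with the most remaining objects, which selects the same pack as sorting and taking the first one. `select_min_packs()`, `count_bits()` and `MAX_PACKS` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PACKS 16

/* Number of set bits, i.e. the number of objects in a set. */
static int count_bits(uint64_t set)
{
	int n = 0;

	for (; set; set &= set - 1)
		n++;
	return n;
}

/*
 * objs[i] is the set of objects in pack i. Returns a bitmask of the packs
 * chosen for the `min` list; together they contain every object.
 */
static uint32_t select_min_packs(const uint64_t *objs, size_t nr)
{
	uint64_t remaining[MAX_PACKS];
	uint64_t covered = 0;
	uint32_t chosen = 0;
	size_t i, j;

	assert(nr <= MAX_PACKS);

	/* Step 1: a pack is unique if it has an object no other pack has. */
	for (i = 0; i < nr; i++) {
		uint64_t others = 0;

		for (j = 0; j < nr; j++)
			if (j != i)
				others |= objs[j];
		if (objs[i] & ~others) {
			chosen |= 1u << i;
			covered |= objs[i];
		}
	}

	/* Step 2: drop objects already provided by the unique packs. */
	for (i = 0; i < nr; i++)
		remaining[i] = (chosen & (1u << i)) ? 0 : objs[i] & ~covered;

	/* Steps 3 and 4: take the largest pack, remove its objects, repeat. */
	for (;;) {
		size_t best = nr;
		int best_count = 0;

		for (i = 0; i < nr; i++) {
			int c = count_bits(remaining[i]);

			if (c > best_count) {
				best = i;
				best_count = c;
			}
		}
		if (best == nr)
			break; /* nothing left to cover */
		chosen |= 1u << best;
		for (i = 0; i < nr; i++)
			if (i != best)
				remaining[i] &= ~remaining[best];
		remaining[best] = 0;
	}
	return chosen;
}
```

Each round adds at least one pack and empties its remaining set, so the loop runs at most `nr` times. That bounded, greedy selection is what replaces the exhaustive permutation search, at the cost of sometimes picking more packs than the true minimum.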
2019-02-04 | pack-redundant: delete redundant code | Sun Chao
The objects in alt-odb are removed from `all_objects` twice, once in `load_all_objects` and once in `scan_alt_odb_packs`; remove the duplicate from the latter function. Signed-off-by: Sun Chao <sunchao9@huawei.com> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-04 | pack-redundant: delay creation of unique_objects | Jiang Xin
Instead of initializing unique_objects in `add_pack()`, copy it from all_objects in `cmp_two_packs()`, after unwanted objects have been removed from all_objects. This saves memory (nothing is allocated for alt-odb packs) and runs `llist_sorted_difference_inplace()` only once when removing ignored objects and objects in alt-odb in `scan_alt_odb_packs()`. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-04 | various: tighten constness of some local variables | Shahzad Lone
Signed-off-by: Shahzad Lone <shahzadlone@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20 | treewide: use get_all_packs | Derrick Stolee
There are many places in the codebase that want to iterate over all packfiles known to Git. The purposes are wide-ranging, and those that can take advantage of the multi-pack-index already do. So, use get_all_packs() instead of get_packed_git() to be sure we are iterating over all packfiles. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 | pack-redundant: convert linked lists to use struct object_id | brian m. carlson
Convert struct llist_item and the rest of the linked list code to use struct object_id. Add a use of GIT_MAX_HEXSZ to avoid a dependency on a hard-coded constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 | pack-redundant: abstract away hash algorithm | brian m. carlson
Instead of using hard-coded instances of the constant 20, use the_hash_algo to look up the correct constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>