| Age | Commit message (Collapse) | Author |
|
In the preceding commit we refactored `lookup_commit_reference_gently()`
so that it doesn't parse non-commit objects anymore. This has led to a
speedup when git-receive-pack(1) accepts a shallow push into a repo
with lots of refs that point to blobs or trees.
But while this case is now faster, we still have the issue that
accepting pushes with lots of "normal" refs that point to commits are
still slow. This is mostly because we look up the commits via the object
database, and that is rather costly.
Adapt the code to use `repo_parse_commit_gently()` instead of
`parse_object()` to parse the resulting commit object. This function
knows to use the commit-graph to fill in the object, which is way more
cost efficient.
This leads to another significant speedup when accepting shallow pushes.
The following benchmark pushes a single objects from a shallow clone
into a repository with 600,000 references that all point to commits:
Benchmark 1: git-receive-pack (rev = HEAD~)
Time (mean ± σ): 9.179 s ± 0.031 s [User: 8.858 s, System: 0.528 s]
Range (min … max): 9.154 s … 9.213 s 3 runs
Benchmark 2: git-receive-pack (rev = HEAD)
Time (mean ± σ): 2.337 s ± 0.032 s [User: 2.331 s, System: 0.234 s]
Range (min … max): 2.308 s … 2.371 s 3 runs
Summary
git-receive-pack . </tmp/input (rev = HEAD) ran
3.93 ± 0.05 times faster than git-receive-pack (rev = HEAD~)
Also, this again leads to a significant reduction in memory allocations.
Before this change:
HEAP SUMMARY:
in use at exit: 17,524,978 bytes in 22,393 blocks
total heap usage: 33,313 allocs, 10,920 frees, 407,774,251 bytes allocated
And after this change:
HEAP SUMMARY:
in use at exit: 11,534,036 bytes in 12,406 blocks
total heap usage: 13,284 allocs, 878 frees, 15,521,451 bytes allocated
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the next commit we will start to parse more commits via the
commit-graph. This change will lead to a segfault though because we try
to access the tree of a commit via `repo_get_commit_tree()`, but:
- The commit has been parsed via the commit-graph, and thus its
`maybe_tree` field is not yet populated.
- We cannot use the commit-graph to populate the commit's tree because
we're in the process of writing the commit-graph.
The consequence is that we'll get a `NULL` pointer for the tree in
`write_graph_chunk_data()`.
In theory we are already mindful of this situation, as we explicitly use
`repo_parse_commit_no_graph()` to parse the commit without the help of
the commit-graph. But that doesn't do the trick as the commit is already
marked as parsed, so the function will not re-populate it. And as the
commit-graph has been closed, neither will `get_commit_tree_oid()` be
able to load the tree for us.
It seems like this issue can only be hit under artificial circumstances:
the error was hit via `git_test_write_commit_graph_or_die()`, which is
run by git-commit(1) and git-merge(1) in case `GIT_TEST_COMMIT_GRAPH=1`:
$ GIT_TEST_COMMIT_GRAPH=1 meson test t7507-commit-verbose \
--test-args=-ix -i
...
++ git -c commit.verbose=true commit --amend
hint: Waiting for your editor to close the file...
./test-lib.sh: line 1012: 55895 Segmentation fault (core dumped) git -c commit.verbose=true commit --amend
To the best of my knowledge, this is the only case where we end up
writing a commit-graph in the same process that might have already
consulted the commit-graph to look up arbitrary objects. But regardless
of that, this feels like a bigger accident that is just waiting to
happen.
Make the code more robust by extending `repo_parse_commit_no_graph()` to
unparse a commit first in case we detect it's coming from a graph. This
ensures that we will re-read the object without it, and thus we will
populate `maybe_tree` properly.
This fix shouldn't have any performance consequences: the function is
only ever called in the "commit-graph.c" code, and we'll only re-parse
the commit at most once.
Add an exclusion to our Coccinelle rules so that it doesn't complain
about us accessing `maybe_tree` directly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `lookup_commit_reference_gently()` can be used to look up a
committish by object ID. As such, the function knows to peel for example
tag objects so that we eventually end up with the commit.
The function is used quite a lot throughout our tree. One such user is
"shallow.c" via `assign_shallow_commits_to_refs()`. The intent of this
function is to figure out whether a shallow push is missing any objects
that are required to satisfy the ref updates, and if so, which of the
ref updates is missing objects.
This is done by painting the tree with `UNINTERESTING`. We start
painting by calling `refs_for_each_ref()` so that we can mark all
existing referenced objects as the boundary of objects that we already
have, and which are supposed to be fully connected. The reference tips
are then parsed via `lookup_commit_reference_gently()`, and the commit
is then marked as uninteresting.
But references may not necessarily point to a committish, and if a lot
of them aren't then this step takes a lot of time. This is mostly due to
the way that `lookup_commit_reference_gently()` is implemented: before
we learn about the type of the object we already call `parse_object()`
on the object ID. This has two consequences:
- We parse all objects, including trees and blobs, even though we
don't even need the contents of them.
- More importantly though, `parse_object()` will cause us to check
whether the object ID matches its contents.
Combined this means that we deflate and hash every non-committish
object, and that of course ends up being both CPU- and memory-intensive.
Improve the logic so that we first use `peel_object()`. This function
won't parse the object for us, and thus it allows us to learn about the
object's type before we parse and return it.
The following benchmark pushes a single object from a shallow clone into
a repository that has 100,000 refs. These refs were created by listing
all objects via `git rev-list(1) --objects --all` and creating refs for
a subset of them, so lots of those refs will cover non-commit objects.
Benchmark 1: git-receive-pack (rev = HEAD~)
Time (mean ± σ): 62.571 s ± 0.413 s [User: 58.331 s, System: 4.053 s]
Range (min … max): 62.191 s … 63.010 s 3 runs
Benchmark 2: git-receive-pack (rev = HEAD)
Time (mean ± σ): 38.339 s ± 0.192 s [User: 36.220 s, System: 1.992 s]
Range (min … max): 38.176 s … 38.551 s 3 runs
Summary
git-receive-pack . </tmp/input (rev = HEAD) ran
1.63 ± 0.01 times faster than git-receive-pack . </tmp/input (rev = HEAD~)
This leads to a sizeable speedup as we now skip reading and parsing
non-commit objects. Before this change we spent around 40% of the time
in `assign_shallow_commits_to_refs()`, after the change we only spend
around 1.2% of the time in there. Almost the entire remainder of the
time is spent in git-rev-list(1) to perform the connectivity checks.
Despite the speedup though, this also leads to a massive reduction in
allocations. Before:
HEAP SUMMARY:
in use at exit: 352,480,441 bytes in 97,185 blocks
total heap usage: 2,793,820 allocs, 2,696,635 frees, 67,271,456,983 bytes allocated
And after:
HEAP SUMMARY:
in use at exit: 17,524,978 bytes in 22,393 blocks
total heap usage: 33,313 allocs, 10,920 frees, 407,774,251 bytes allocated
Note that when all references refer to commits performance stays roughly
the same, as expected. The following benchmark was executed with 600k
commits:
Benchmark 1: git-receive-pack (rev = HEAD~)
Time (mean ± σ): 9.101 s ± 0.006 s [User: 8.800 s, System: 0.520 s]
Range (min … max): 9.095 s … 9.106 s 3 runs
Benchmark 2: git-receive-pack (rev = HEAD)
Time (mean ± σ): 9.128 s ± 0.094 s [User: 8.820 s, System: 0.522 s]
Range (min … max): 9.019 s … 9.188 s 3 runs
Summary
git-receive-pack (rev = HEAD~) ran
1.00 ± 0.01 times faster than git-receive-pack (rev = HEAD)
This will be improved in the next commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Several tests use a pattern that writes to a temporary file like this:
printf "do something with %d\n" $(test_seq <count>) >tmpfile &&
git do-something --stdin <tmpfile
Other tests use test_seq's -f parameter, but still write to a temporary file:
test_seq -f "do something with %d" <count> >input &&
git do-something --stdin <input
Simplify both of these patterns to
test_seq -f "do something with %d" <count> |
git do-something --stdin
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
wt-status.c still uses the global the_hash_algo even though a repository
instance is already available via struct wt_status.
Replace uses of the_hash_algo with the hash algorithm stored in the
associated repository (s->repo->hash_algo or r->hash_algo).
This removes another dependency on global state and keeps wt-status
consistent with local repository usage.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
wt-status.c uses the global the_repository in several places even when a
repository instance is already available via struct wt_status *s or struct
repository *r.
Replace these uses of the_repository with the repository available in the
local context (i.e. s->repo or r).
The replacements of all the_repository with s->repo are mostly to cases
where a repository instance is already available via struct wt_status *s
and struct repository *r, all functions operating on struct wt_status *s
are only used after s is initialized by wt_status_prepare(), which sets
s->repo from the repository provided by the caller. As a result, s->repo is
guaranteed to be available and consistent whenever these functions are
invoked.
This reduces reliance on global state and keeps wt-status consistent,
though many functions operating on struct wt_status *s are called via
commit.c and it still relies on the_repository, but within wt-status.c the
local repository pointer refers to the same underlying repository object.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some functions in wt-status.c (count_stash_entries(),
read_line_from_git_path(), abbrev_oid_in_line(), and
read_rebase_todolist()) rely on the_repository as they do not have access
to a local repository instance.
Add a struct repository *r parameter to these functions and pass the local
repository instance through the callers, which already have access to it
either directly by struct repository *r or indirectly by struct wt_state
*s (s->repo).
Replace uses of the_repository in these functions with the passed parameter.
Signed-off-by: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* 'pks-meson-fixes' of github.com:pks-gitlab/git-gui:
git-gui: wire up "git-gui--askyesno" with Meson
git-gui: massage "git-gui--askyesno" with "generate-script.sh"
git-gui: prefer shell at "/bin/sh" with Meson
git-gui: fix use of GIT_CEILING_DIRECTORIES
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
|
|
"git format-patch" takes "--from=<user ident>" command line option
and uses the given ident for patch e-mails, but this is not applied
to the cover letter, the option is ignored and the committer ident
of the current user is used. This has been the case ever since
"--from" was introduced in a9080475 (teach format-patch to place
other authors into in-body "From", 2013-07-03).
Teach the make_cover_letter() function to honor the option, instead of
always using the current committer identity. Change variable name from
"committer" to "from" to better reflect the purpose of the variable.
Signed-off-by: Mirko Faina <mroik@delayed.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Doc update.
* jc/doc-rerere-update:
rerere: minor documantation update
|
|
Test updates.
* sd/t7003-test-path-is-helpers:
t7003: modernize path existence checks using test helpers
|
|
Doc update.
* kh/doc-rerere-options-xref:
doc: rerere-options.adoc: link to git-rerere(1)
|
|
Test update.
* bk/t2003-modernise:
t2003: modernize path existence checks using test helpers
|
|
Code clean-up to use the commit_stack API.
* rs/commit-commit-stack:
commit: use commit_stack
|
|
Clean up redundant includes of header files.
* rs/clean-includes:
remove duplicate includes
|
|
Reduce dependency on the_repository of xdiff-interface layer.
* rs/xdiff-wo-the-repository:
xdiff-interface: stop using the_repository
|
|
Code clean-up.
* rs/version-wo-the-repository:
version: stop using the_repository
|
|
"git merge-file" can be run outside a repository, but it ignored
all configuration, even the per-user ones. The command now uses
available configuration files to find its customization.
* yt/merge-file-outside-a-repository:
merge-file: honor merge.conflictStyle outside of a repository
|
|
A handful of documentation pages have been modernized to use the
"synopsis" style.
* ja/doc-synopsis-style-even-more:
doc: convert git-show to synopsis style
doc: fix some style issues in git-clone and for-each-ref-options
doc: finalize git-clone documentation conversion to synopsis style
doc: convert git-submodule to synopsis style
|
|
Allow recording process ID of the process that holds the lock next
to a lockfile for diagnosis.
* pc/lockfile-pid:
lockfile: add PID file for debugging stale locks
|
|
The `core.attributeFile` config value is parsed in
git_default_core_config(), loaded eagerly and stored in the global
variable `git_attributes_file`. Storing this value in a global
variable can lead to it being overwritten by another repository when
more than one Git repository run in the same Git process.
Create a new struct `repo_config_values` to hold this value and
other repository dependent values parsed by `git_default_config()`.
This will ensure the current behaviour remains the same while also
enabling the libification of Git.
An accessor function 'repo_config_values()' s created to ensure
that we do not access an uninitialized repository, or an instance
of a different repository than the current one.
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Usman Akinyemi <usmanakinyemi202@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Olamide Caleb Bello <belkid98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On mobile, long project names in the page header can wrap to multiple lines,
but the fixed 25px header height does not grow with wrapped content.
Switch the header from fixed height to min-height so it expands as needed
while keeping the same baseline height for single-line titles.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On narrow screens, footer text can wrap, but the fixed 22px footer height
and floated footer blocks can cause overflow.
Switch to min-height and add a clearfix on the footer container so it grows
to contain wrapped float content cleanly.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On mobile-sized viewports, gitweb pages in log/commit/blob/diff views can
overflow horizontally due to desktop-oriented paddings and fixed-width
preformatted content.
Extend the existing mobile media query to rebalance those layouts: reduce
or clear paddings in log/commit sections, and allow horizontal scrolling
for preformatted blob/diff content instead of forcing page-wide overflow.
All layout adjustments in this patch are mobile-scoped, except one global
safeguard: set overflow-wrap:anywhere on div.log_body. Log content can
contain escaped or non-breaking text that behaves like a single long token
and can overflow at any viewport width, including desktop.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On narrow screens, the project search input can exceed the available width
and force page-wide horizontal scrolling.
Add a mobile media query and apply side padding to the search container,
then cap the input width to its container with border-box sizing so the
form stays within the viewport.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Without a viewport meta tag, phone browsers render gitweb at desktop
width and scale the whole page down to fit the screen.
Add a viewport meta tag so the layout viewport tracks device width.
This is the baseline needed for mobile CSS fixes in follow-up commits.
Signed-off-by: Rito Rhymes <rito@ritovision.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Previous commits have set up an infrastructure for `--filter=auto` to
automatically prepare a partial clone filter based on what the server
advertised and the client accepted.
Using that infrastructure, let's now enable the `--filter=auto` option
in `git clone` and `git fetch` by setting `allow_auto_filter` to 1.
Note that these small changes mean that when `git clone --filter=auto`
or `git fetch --filter=auto` are used, "auto" is automatically saved
as the partial clone filter for the server on the client. Therefore
subsequent calls to `git fetch` on the client will automatically use
this "auto" mode even without `--filter=auto`.
Let's also set `allow_auto_filter` to 1 in `transport.c`, as the
transport layer must be able to accept the "auto" filter spec even if
the invoking command hasn't fully parsed it yet.
When an "auto" filter is requested, let's have the "fetch-pack.c" code
in `do_fetch_pack_v2()` compute a filter and send it to the server.
In `do_fetch_pack_v2()` the logic also needs to check for the
"promisor-remote" capability and call `promisor_remote_reply()` to
parse advertised remotes and populate the list of those accepted (and
their filters).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `promisor_remote_reply()` function performs two tasks:
1. It uses filter_promisor_remote() to parse the server's
"promisor-remote" advertisement and to mark accepted remotes in the
repository configuration.
2. It assembles a reply string containing the accepted remote names to
send back to the server.
In a following commit, the fetch-pack logic will need to trigger the
side effect (1) to ensure the repository state is correct, but it will
not need to send a reply (2).
To avoid assembling a reply string when it is not needed, let's change
the signature of promisor_remote_reply(). It will now return `void` and
accept a second `char **accepted_out` argument. Only if that argument
is not NULL will a reply string be assembled and returned back to the
caller via that argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Currently, advertised filters are only kept in memory temporarily
during parsing, or persisted to disk if `promisor.storeFields`
contains 'partialCloneFilter'.
In a following commit though, we will add a `--filter=auto` option.
This option will enable the client to use the filters that the server
is suggesting for the promisor remotes the client accepts.
To use them even if `promisor.storeFields` is not configured, these
filters should be stored somewhere for the current session.
Let's add an `advertised_filter` field to `struct promisor_remote`
for that purpose.
To ensure that the filters are available in all cases,
filter_promisor_remote() captures them into a temporary list and
applies them to the `promisor_remote` structs after the potential
configuration reload.
Then the accepted remotes are marked as `accepted` in the repository
state. This ensures that subsequent calls to look up accepted remotes
(like in the filter construction below) actually find them.
In a following commit, we will add a `--filter=auto` option that will
enable a client to use the filters suggested by the server for the
promisor remotes the client accepted.
To enable the client to construct a filter spec based on these filters,
let's also add a `promisor_remote_construct_filter(repo)` function.
This function:
- iterates over all accepted promisor remotes in the repository,
- collects the filters advertised for them (using `advertised_filter`
added in this commit, and
- generates a single filter spec for them.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not by the current command.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
Note that ideally combining "auto" with "auto" could be allowed, but in
practice, it's probably not worth the added code complexity. And if we
really want it, nothing prevents us to allow it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us to do that in future work either.
Also note that the new `allow_auto_filter` flag depends on the command,
not user choices, so it should be reset to the command default when
`struct list_objects_filter_options` instances are reset.
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `--filter=<filter-spec>` option is documented in most commands that
support it except `git fetch`.
Let's fix that and document this option. To ensure consistency across
commands, let's reuse the exact description currently found in
`git clone`.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `struct list_objects_filter_options filter_options` variable used
in "builtin/fetch.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become a
bit less easy to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_fetch() by
moving its definition into that function and making it non-static.
This requires passing a pointer to it through the prepare_transport(),
do_fetch(), backfill_tags(), fetch_one_setup_partial(), and fetch_one()
functions, but it's quite straightforward.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `struct list_objects_filter_options filter_options` variable used
in "builtin/clone.c" to store the parsed filters specified by
`--filter=<filterspec>` is currently a static variable global to the
file.
As we are going to use it more in a following commit, it could become
a bit less easy to understand how it's managed.
To avoid that, let's make it clear that it's owned by cmd_clone() by
moving its definition into that function and making it non-static.
The only additional change to make this work is to pass it as an
argument to checkout(). So it's a small quite cheap cleanup anyway.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A previous commit allowed a server to pass additional fields through
the "promisor-remote" protocol capability after the "name" and "url"
fields, specifically the "partialCloneFilter" and "token" fields.
Another previous commit, c213820c51 (promisor-remote: allow a client
to check fields, 2025-09-08), has made it possible for a client to
decide if it accepts a promisor remote advertised by a server based
on these additional fields.
Often though, it would be interesting for the client to just store in
its configuration files these additional fields passed by the server,
so that it can use them when needed.
For example if a token is necessary to access a promisor remote, that
token could be updated frequently only on the server side and then
passed to all the clients through the "promisor-remote" capability,
avoiding the need to update it on all the clients manually.
Storing the token on the client side makes sure that the token is
available when the client needs to access the promisor remotes for a
lazy fetch.
To allow this, let's introduce a new "promisor.storeFields"
configuration variable.
Note that for a partial clone filter, it's less interesting to have
it stored on the client. This is because a filter should be used
right away and we already pass a `--filter=<filter-spec>` option to
`git clone` when starting a partial clone. Storing the filter could
perhaps still be interesting for information purposes.
Like "promisor.checkFields" and "promisor.sendFields", the new
configuration variable should contain a comma or space separated list
of field names. Only the "partialCloneFilter" and "token" field names
are supported for now.
When a server advertises a promisor remote, for example "foo", along
with for example "token=XXXXX" to a client, and on the client side
"promisor.storeFields" contains "token", then the client will store
XXXXX for the "remote.foo.token" variable in its configuration file
and reload its configuration so it can immediately use this new
configuration variable.
A message is emitted on stderr to warn users when the config is
changed.
Note that even if "promisor.acceptFromServer" is set to "all", a
promisor remote has to be already configured on the client side for
some of its config to be changed. In any case no new remote is
configured and no new URL is stored.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "promisor-remote.c", the fields_sent() and fields_checked()
functions serve similar purposes and contain a small amount of
duplicated code.
As we are going to add a similar function in a following commit,
let's refactor this common code into a new initialize_fields_list()
function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When a shallowed repository gets deepened beyond the beginning of a
merged branch, we may end up with some shallows that are hidden behind
the reachable shallow commits. Added test 'fetching deepen beyond
merged branch' exposes that behaviour.
An example showing the problem based on added test:
0. Whole initial git repo to be cloned from
Graph:
* 033585d (HEAD -> main) Merge branch 'branch'
|\
| * 984f8b1 (branch) five
| * ecb578a four
|/
* 0cb5d20 three
* 2b4e70d two
* 61ba98b one
1. Initial shallow clone --depth=3 (all good)
Shallows:
2b4e70da2a10e1d3231a0ae2df396024735601f1
ecb578a3cf37198d122ae5df7efed9abaca17144
Graph:
* 033585d (HEAD -> main) Merge branch 'branch'
|\
| * 984f8b1 five
| * ecb578a (grafted) four
* 0cb5d20 three
* 2b4e70d (grafted) two
2. Deepen shallow clone with fetch --deepen=1 (NOT OK)
Shallows:
0cb5d204f4ef96ed241feb0f2088c9f4794ba758
61ba98be443fd51c542eb66585a1f6d7e15fcdae
Graph:
* 033585d (HEAD -> main) Merge branch 'branch'
|\
| * 984f8b1 five
| * ecb578a four
|/
* 0cb5d20 (grafted) three
---
Note that second shallow commit 61ba98be443fd51c542eb66585a1f6d7e15fcdae
is not reachable.
On the other hand, it seems that equivalent absolute depth driven
fetches result in all the correct shallows. That led to this proposal,
which unifies absolute and relative deepening in a way that the same
get_shallow_commits() call is used in both cases. The difference is
only that depth is adapted for relative deepening by measuring
equivalent depth of current local shallow commits in the current remote
repo. Thus a new function get_shallows_depth() has been added and the
function get_reachable_list() became redundant / removed.
Same example showing the corrected second step:
2. Deepen shallow clone with fetch --deepen=1 (all good)
Shallow:
61ba98be443fd51c542eb66585a1f6d7e15fcdae
Graph:
* 033585d (HEAD -> main) Merge branch 'branch'
|\
| * 984f8b1 five
| * ecb578a four
|/
* 0cb5d20 three
* 2b4e70d two
* 61ba98b (grafted) one
The get_shallows_depth() function also shares the logic of the
get_shallow_commits() function, but it focuses on counting depth of
each existing shallow commit. The minimum result is stored as
'data->deepen_relative', which is set not to be zero for relative
deepening anyway. That way we can always sum 'data->deepen_relative'
and 'depth' values, because 'data->deepen_relative' is always 0 in
absolute deepening.
To avoid duplicating logic between get_shallows_depth() and
get_shallow_commits(), get_shallow_commits() was modified so that
it is used by get_shallows_depth().
Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The local object_array 'stack' in get_shallow_commits() function
does not free its dynamic elements before the function returns.
As a result elements remain allocated and their reference forgotten.
Also note, that test 'fetching deepen beyond merged branch' added by
'shallow: handling fetch relative-deepen' patch fails without this
correction in linux-leaks and linux-reftable-leaks test runs.
Signed-off-by: Samo Pogačnik <samo_pogacnik@t-2.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
git-cherry(1) links to this command. These two commands are similar and
we also mention it in the “Examples” section now. Let’s link to it.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The utility and usability of git-patch-id(1) was discussed
relatively recently:[1]
Using "git patch-id" is definitely in the "write a script for it"
category. I don't think I've ever used it as-is from the command
line as part of a one-liner. It's very much a command that is
designed purely for scripting, the interface is just odd and baroque
and doesn't really make sense for one-liners.
The typical use of patch-id is to generate two *lists* of patch-ids,
then sort them and use the patch-id as a key to find commits that
look the same.
The command doc *could* use an example, and since it is a mapper command
it makes sense for that example to be a little script.
Mapping the commits of some branch to an upstream ref allows us to
demonstrate generating two lists, sorting them, joining them, and
finally discarding the patch ID lookup column with cut(1).
† 1: https://lore.kernel.org/workflows/CAHk-=wiN+8EUoik4UeAJ-HPSU7hczQP+8+_uP3vtAy_=YfJ9PQ@mail.gmail.com/
Inspired-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Emphasize that you can pass multiple patches or diffs to this command.
git-patch-id(1) is an efficient pID–commit mapper, able to map
thousands of commits in seconds. But discussions on the command
seem to typically[1] use the standard loop-over-rev-list-and-
shell-out pattern:
for commit in rev-list:
prepare a diff from commit | git patch-id
This is unnecessary; we can bulk-process the patches:
git rev-list --no-merges <ref> |
git diff-tree --patch --stdin |
git patch-id --stable
The first version (translated to shell) takes a little over nine
minutes for a commit history of about 78K commits.[2] The other one,
by contrast, takes slightly less than a minute.
Also drop “the” from “standard input”.
† 1: https://stackoverflow.com/a/19758159
† 2: This is `master` of this repository on 2025-10-02
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After deciding on all hunks in a file, the interactive session
advances automatically to the next file if there is another,
or the process ends.
Now using the `--no-auto-advance` flag with `--patch`, the process
does not advance automatically. A user can choose to go to the next
file by pressing '>' or the previous file by pressing '<', before or
after deciding on all hunks in the current file.
After all hunks have been decided in a file, the user can still
rework with the file by applying the options available in the permit
set for that hunk, and after all the decisions, the user presses 'q'
to submit.
After all hunks have been decided, the user can press '?' which will
show the hunk selection summary in the help patch remainder text
including the total hunks, number of hunks marked for use and number
of hunks marked for skip.
This feature is enabled by passing the `--no-auto-advance` flag
to `--patch` option of the subcommands add, stash, reset,
and checkout.
Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When the flag `--no-auto-advance` is used with `--patch`,
if the user has decided `USE` on a hunk in a file, goes to another
file, and then returns to this file and changes the previous
decision on the hunk to `SKIP`, because the patch has already
been applied, the last decision is not registered and the now
SKIPPED hunk is still applied.
Move the logic for applying patches into a function so that we can
reuse this logic to implement the all or non application of the patches
after the user is done with the hunk selection.
Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function `patch_update_file()` takes the add_p_state struct
pointer and the current `struct file_diff` pointer and returns an
int.
When using the `--no-auto-advance` flag, we want to be able to request
the next or previous file from the caller.
Modify the function signature to instead take the index of the
current `file_diff` and the `add_p_state` struct pointer so that we
can compute the `file_diff` from the index while also having
access to the file index. This will help us request the next or
previous file from the caller.
Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When using the interactive add, reset, stash or checkout machinery,
we do not have the option of reworking with a file when selecting
hunks, because the session automatically advances to the next file
or ends if we have just one file.
Introduce the flag `--auto-advance` which auto advances by default,
when interactively selecting patches with the '--patch' option.
However, the `--no-auto-advance` option does not auto advance, thereby
allowing users the option to rework with files.
Signed-off-by: Abraham Samuel Adekunle <abrahamadekunle50@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:
1. Calling strrchr() in a loop is quadratic, since each call has to
call strlen() under the hood to find the end of the string (even
though we know exactly where it is from the last loop iteration).
2. We need a temporary buffer, since we're munging the string with NUL
as we shorten it (which we must do, because strrchr() has no other
way of knowing what we consider the end of the string).
Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.
I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:
git init
git commit --allow-empty -m foo
echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" >.git/packed-refs
time git for-each-ref --format='%(refname:rstrip=-1)'
That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to infect
you with such a garbage ref, as the wire protocol is limited to 64k
pkt-lines. The difference is measurable for me for a 32k-component ref
(about 19ms vs 7ms), so perhaps you could create some chaos by pushing a
lot of them. But we also run into filesystem limits (if the loose
backend is in use), and in practice it seems like there are probably
simpler and more effective ways to waste CPU.
Likewise the extra allocation probably isn't really measurable. In fact,
since our goal is to return an allocated string, we end up having to
make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the need
to handle cleaning it up in the early return path.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're stripping path components from the end of a string, which we do by
assigning a NUL as we parse each component, shortening the string. This
requires an extra temporary buffer to avoid munging our input string.
But the way that we allocate the buffer is unusual. We have an extra
"to_free" variable. Usually this is used when the access variable is
conceptually const, like:
const char *foo;
char *to_free = NULL;
if (...)
foo = to_free = xstrdup(...);
else
foo = some_const_string;
...
free(to_free);
But that's not what's happening here. Our "start" variable always points
to the allocated buffer, and to_free is redundant. Worse, it is marked
as const itself, requiring a cast when we free it.
Let's drop to_free entirely, and mark "start" as non-const, making the
memory handling more clear. As a bonus, this also silences a warning
from glibc-2.43 that our call to strrchr() implicitly strips away the
const-ness of "start".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're walking forward in the string, skipping path components from
left-to-right. So when we've stripped as much as we want, the pointer we
have is a complete NUL-terminated string and we can just return it
(after duplicating it, of course). So there is no need for a temporary
allocated string.
But we do make an extra temporary copy due to f0062d3b74 (ref-filter:
free item->value and item->value->s, 2018-10-18). This is probably from
cargo-culting the technique used in rstrip_ref_components(), which
_does_ need a separate string (since it is stripping from the end and
ties off the temporary string with a NUL).
Let's drop the extra allocation. This is slightly more efficient, but
more importantly makes the code much simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "lstrip" and "rstrip" options to the %(refname) placeholder both
accept a negative length, which asks us to keep that many path
components (rather than stripping that many).
The code to count components and convert the negative value to a
positive was copied from lstrip to rstrip in 1a34728e6b (ref-filter: add
an 'rstrip=<N>' option to atoms which deal with refnames, 2017-01-10).
Let's factor it out into a separate function. This reduces duplication
and also makes the lstrip/rstrip functions much easier to follow, since
the bulk of their code is now the actual stripping.
Note that the computed "remaining" value is currently stored as a
"long", so in theory that's what our function should return. But this is
purely historical. When the variable was added in 0571979bd6 (tag: do
not show ambiguous tag names as "tags/foo", 2016-01-25), we parsed the
value from strtol(), and thus used a long. But these days we take "len"
as an int, and also use an int to count up components. So let's just
consistently use int here. This value could only overflow in a
pathological case (e.g., 4GB worth of "a/a/...") and even then will not
result in out-of-bounds memory access (we keep stripping until we run
out of string to parse).
The minimal Myers diff here is a little hard to read; with --patience
the code movement is shown much more clearly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While we document the values that can be passed to the "--update-refs="
option, we don't give the user any hint what the default behaviour is.
Document it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|