<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/refs/ref-cache.c, branch main</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=main</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=main'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2025-11-04T15:32:25Z</updated>
<entry>
<title>refs: drop infrastructure to peel via iterators</title>
<updated>2025-11-04T15:32:25Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-10-23T07:16:19Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=705114772e0a0741c3288329bd9ac4e11e38db9a'/>
<id>urn:sha1:705114772e0a0741c3288329bd9ac4e11e38db9a</id>
<content type='text'>
Now that the peeled object ID gets propagated via the `struct reference`
there is no need anymore to call into the reference iterator itself to
dereference an object. Remove this infrastructure.

Most of the changes are straight-forward deletions of code. There is one
exception though in `refs/packed-backend.c::write_with_updates()`. Here
we stop peeling the iterator and instead just pass the peeled object ID
of that iterator directly.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: fully reset `struct ref_iterator::ref` on iteration</title>
<updated>2025-11-04T15:32:25Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-10-23T07:16:12Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=4cea0422879f6a64c0f7ad0ddac6d43897a53e94'/>
<id>urn:sha1:4cea0422879f6a64c0f7ad0ddac6d43897a53e94</id>
<content type='text'>
With the introduction of the `struct ref_iterator::ref` field it now is
a whole lot easier to introduce new fields that become accessible to the
caller without having to adapt every single callsite. But there's a
downside: when a new field is introduced we always have to adapt all
backends to set that field.

This isn't something we can avoid in the general case: when the new
field is expected to be populated by all backends we of course cannot
avoid doing so. But new fields may be entirely optional, in which case
we'd still have such churn. And furthermore, it is very easy right now
to leak state from a previous iteration into the next iteration.

Address this issue by ensuring that the reference backends all fully
reset the field on every single iteration. This ensures that no state
from previous iterations can leak into the next one. And it ensures that
any newly introduced fields will be zeroed out by default.

Note that we don't have to explicitly adapt the "files" backend, as it
uses the `cache_ref_iterator` internally. Furthermore, other "wrapping"
iterators like for example the `prefix_ref_iterator` copy around the
whole reference, so these don't need to be adapted either.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: introduce `.ref` field for the base iterator</title>
<updated>2025-11-04T15:32:25Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-10-23T07:16:11Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=89baa52da612dde6da031acfa2cb957d4297d544'/>
<id>urn:sha1:89baa52da612dde6da031acfa2cb957d4297d544</id>
<content type='text'>
The base iterator has a couple of fields that tracks the name, target,
object ID and flags for the current reference. Due to this design we
have to create a new `struct reference` whenever we want to hand over
that reference to the callback function, which is tedious and not very
efficient.

Convert the structure to instead contain a `struct reference` as member.
This member is expected to be populated by the implementations of the
iterator and is handed over to the callback directly.

While at it, simplify `should_pack_ref()` to take a `struct reference`
directly instead of passing its respective fields.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/ref-cache: fix SEGFAULT when seeking in empty directories</title>
<updated>2025-10-01T20:12:24Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-10-01T12:17:29Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=351c6e719ae9c5b97506dde6bc287408b80e87e4'/>
<id>urn:sha1:351c6e719ae9c5b97506dde6bc287408b80e87e4</id>
<content type='text'>
The 'cache_ref_iterator_seek()' function is used to seek the
`ref_iterator` to the desired reference in the ref-cache mechanism. We
use the seeking functionality to implement the '--start-after' flag in
'git-for-each-ref(1)'.

When using the files-backend with packed-refs, it is possible that some
of the refs directories are empty. For e.g. just after repacking, the
'refs/heads' directory would be empty. The ref-cache seek mechanism,
doesn't take this into consideration when descending into a
subdirectory, and makes an out of bounds access, causing SEGFAULT as we
try to access entries within the directory. Fix this by breaking out of
the loop when we enter an empty directory.

Since we start with the base directory of 'refs/' which is never empty,
it is okay to perform this check after the first iteration in the
`do..while` clause.

Add tests which simulate this behavior and also provide coverage over
using the feature over packed-refs.

Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: use 'size_t' instead of int for length</title>
<updated>2025-07-28T21:16:36Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-28T20:20:46Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=65855751d190d66d15bc2c7e17e93fe00d04c539'/>
<id>urn:sha1:65855751d190d66d15bc2c7e17e93fe00d04c539</id>
<content type='text'>
The commit 090eb5336c (refs: selectively set prefix in the seek
functions, 2025-07-15) modified the ref-cache iterator to support
seeking to a specified marker without setting the prefix.

The commit adds and uses an integer 'len' to capture the length of the
seek marker to compare with the entries of a given directory. Since the
type of the variable is 'int', this is met with a typecast of converting
a `strlen` to 'int' so it can be assigned to the 'len' variable.

This is whole operation is a bit wrong:
1. Since the 'len' variable is eventually used in a 'strncmp', it should
have been of type 'size_t'.
2. This also truncates the value provided from 'strlen' to an int, which
could cause a large refname to produce a negative number.

Let's do the correct thing here and simply use 'size_t' for `len`.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: set prefix_state when seeking</title>
<updated>2025-07-24T22:31:09Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-24T22:11:36Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=9201261a70b5d325b344036de15901a4064d66c0'/>
<id>urn:sha1:9201261a70b5d325b344036de15901a4064d66c0</id>
<content type='text'>
In 090eb5336c (refs: selectively set prefix in the seek functions,
2025-07-15) we separated the seeking functionality of reference
iterators from the functionality to set prefix to an iterator. This
allows users of ref iterators to seek to a particular reference to
provide pagination support.

The files-backend, uses the ref-cache iterator to iterate over loose
refs. The iterator tracks directories and entries already processed via
a stack of levels. Each level corresponds to a directory under the files
backend. New levels are added to the stack, and when all entries from a
level is yielded, the corresponding level is popped from the stack.

To accommodate seeking, we need to populate and traverse the levels to
stop the requested seek marker at the appropriate level and its entry
index. Each level also contains a 'prefix_state' which is used for
prefix matching, this allows the iterator to skip levels/entries which
don't match a prefix. The default value of 'prefix_state' is
PREFIX_CONTAINS_DIR, which yields all entries within a level. When
purely seeking without prefix matching, we want to yield all entries.
The commit however, skips setting the value explicitly. This causes the
MemorySanitizer to issue a 'use-of-uninitialized-value' error when
running 't/t6302-for-each-ref-filter'.

Set the value explicitly to avoid to fix the issue.

Reported-by: Kyle Lippincott &lt;spectral@google.com&gt;
Helped-by: Kyle Lippincott &lt;spectral@google.com&gt;
Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs: selectively set prefix in the seek functions</title>
<updated>2025-07-15T18:54:20Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-15T11:28:28Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=2b4648b9190b552e942d93363482bd617a510fc1'/>
<id>urn:sha1:2b4648b9190b552e942d93363482bd617a510fc1</id>
<content type='text'>
The ref iterator exposes a `ref_iterator_seek()` function. The name
suggests that this would seek the iterator to a specific reference in
some ways similar to how `fseek()` works for the filesystem.

However, the function actually sets the prefix for refs iteration. So
further iteration would only yield references which match the particular
prefix. This is a bit confusing.

Let's add a 'flags' field to the function, which when set with the
'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the
iteration in-line with the existing behavior. Otherwise, the reference
backends will simply seek to the specified reference and clears any
previously set prefix. This allows users to start iteration from a
specific reference.

In the packed and reftable backend, since references are available in a
sorted list, the changes are simply setting the prefix if needed. The
changes on the files-backend are a little more involved, since the files
backend uses the 'ref-cache' mechanism. We move out the existing logic
within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()`
which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We
then parse the provided seek string and set the required levels and
their indexes to ensure that seeking is possible.

Helped-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>ref-cache: remove unused function 'find_ref_entry()'</title>
<updated>2025-07-15T18:54:19Z</updated>
<author>
<name>Karthik Nayak</name>
<email>karthik.188@gmail.com</email>
</author>
<published>2025-07-15T11:28:27Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=883a7ea0543426d0ecb172c72c8f706348961605'/>
<id>urn:sha1:883a7ea0543426d0ecb172c72c8f706348961605</id>
<content type='text'>
The 'find_ref_entry' function is no longer used, so remove it.

Signed-off-by: Karthik Nayak &lt;karthik.188@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: implement seeking for ref-cache iterators</title>
<updated>2025-03-12T18:31:20Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:19Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=84e656919cb7237f1b11a948974d0591d9d3434f'/>
<id>urn:sha1:84e656919cb7237f1b11a948974d0591d9d3434f</id>
<content type='text'>
Implement seeking of ref-cache iterators. This is done by splitting most
of the logic to seek iterators out of `cache_ref_iterator_begin()` and
putting it into `cache_ref_iterator_seek()` so that we can reuse the
logic.

Note that we cannot use the optimization anymore where we return an
empty ref iterator when there aren't any references, as otherwise it
wouldn't be possible to reseek the iterator to a different prefix that
may exist. This shouldn't be much of a performance concern though as we
now start to bail out early in case `advance()` sees that there are no
more directories to be searched.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>refs/iterator: separate lifecycle from iteration</title>
<updated>2025-03-12T18:31:18Z</updated>
<author>
<name>Patrick Steinhardt</name>
<email>ps@pks.im</email>
</author>
<published>2025-03-12T15:56:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=cec2b6f55a805c010d2acc81abf4cbc41b712130'/>
<id>urn:sha1:cec2b6f55a805c010d2acc81abf4cbc41b712130</id>
<content type='text'>
The ref and reflog iterators have their lifecycle attached to iteration:
once the iterator reaches its end, it is automatically released and the
caller doesn't have to care about that anymore. When the iterator should
be released before it has been exhausted, callers must explicitly abort
the iterator via `ref_iterator_abort()`.

This lifecycle is somewhat unusual in the Git codebase and creates two
problems:

  - Callsites need to be very careful about when exactly they call
    `ref_iterator_abort()`, as calling the function is only valid when
    the iterator itself still is. This leads to somewhat awkward calling
    patterns in some situations.

  - It is impossible to reuse iterators and re-seek them to a different
    prefix. This feature isn't supported by any iterator implementation
    except for the reftable iterators anyway, but if it was implemented
    it would allow us to optimize cases where we need to search for
    specific references repeatedly by reusing internal state.

Detangle the lifecycle from iteration so that we don't deallocate the
iterator anymore once it is exhausted. Instead, callers are now expected
to always call a newly introduce `ref_iterator_free()` function that
deallocates the iterator and its internal state.

Note that the `dir_iterator` is somewhat special because it does not
implement the `ref_iterator` interface, but is only used to implement
other iterators. Consequently, we have to provide `dir_iterator_free()`
instead of `dir_iterator_release()` as the allocated structure itself is
managed by the `dir_iterator` interfaces, as well, and not freed by
`ref_iterator_free()` like in all the other cases.

While at it, drop the return value of `ref_iterator_abort()`, which
wasn't really required by any of the iterator implementations anyway.
Furthermore, stop calling `base_ref_iterator_free()` in any of the
backends, but instead call it in `ref_iterator_free()`.

Signed-off-by: Patrick Steinhardt &lt;ps@pks.im&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
