| Age | Commit message (Collapse) | Author |
|
We're including a couple of standard headers like <stdint.h> in a bunch
of locations, which makes it hard for a project to plug in their own
logic for making required functionality available. For us this is for
example via "compat/posix.h", which already includes all of the system
headers relevant to us.
Introduce a new "reftable-system.h" header that allows projects to
provide their own headers. This new header is supposed to contain all
the project-specific bits to provide the POSIX-like environment, and some
additional supporting code. With this change, we thus have the following
split in our system-specific code:
- "reftable/reftable-system.h" is the project-specific header that
provides a POSIX-like environment. Every project is expected to
provide their own implementation.
- "reftable/system.h" contains the project-independent definition of
the interfaces that a project needs to implement. This file should
not be touched by a project.
- "reftable/system.c" contains the project-specific implementation of
the interfaces defined in "system.h". Again, every project is
expected to provide their own implementation.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In fd888311fbc (reftable/table: move reading block into block reader,
2025-04-07), we have refactored how reftable blocks are read so that
most of the logic is contained in the "block.c" subsystem itself. Most
importantly, the whole logic to read the data itself is now contained in
that subsystem.
This change caused a significant performance regression though when
reading blocks that aren't of the specific type one is searching for:
Benchmark 1: update-ref: create 100k refs (revision = fd888311fbc~)
Time (mean ± σ): 2.171 s ± 0.028 s [User: 1.189 s, System: 0.977 s]
Range (min … max): 2.117 s … 2.206 s 10 runs
Benchmark 2: update-ref: create 100k refs (revision = fd888311fbc)
Time (mean ± σ): 3.418 s ± 0.030 s [User: 2.371 s, System: 1.037 s]
Range (min … max): 3.377 s … 3.473 s 10 runs
Summary
update-ref: create 100k refs (revision = fd888311fbc~) ran
1.57 ± 0.02 times faster than update-ref: create 100k refs (revision = fd888311fbc)
The root caute of the performance regression is that we changed when
exactly blocks of an uninteresting type are being discarded. Previous to
the refactoring in the mentioned commit we'd load the block data, read
its type, notice that it's not the wanted type and discard the block.
After the commit though we don't discard the block immediately, but we
fully decode it only to realize that it's not the desired type. We then
discard the block again, but have already performed a bunch of pointless
work.
Fix the regression by making `reftable_block_init()` return early in
case the block is not of the desired type. This fixes the performance
hit:
Benchmark 1: update-ref: create 100k refs (revision = HEAD~)
Time (mean ± σ): 2.712 s ± 0.018 s [User: 1.990 s, System: 0.716 s]
Range (min … max): 2.682 s … 2.741 s 10 runs
Benchmark 2: update-ref: create 100k refs (revision = HEAD)
Time (mean ± σ): 1.670 s ± 0.012 s [User: 0.991 s, System: 0.676 s]
Range (min … max): 1.652 s … 1.693 s 10 runs
Summary
update-ref: create 100k refs (revision = HEAD) ran
1.62 ± 0.02 times faster than update-ref: create 100k refs (revision = HEAD~)
Note that the baseline performance is lower than in the original due to
a couple of unrelated performance improvements that have landed since
the original commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Expose a generic iterator over reftable records and expose it via the
public interface. Together with an upcoming iterator for reftable blocks
contained in a table this will allow users to trivially iterate through
blocks and their respective records individually.
This functionality will be used to implement consistency checks for the
reftable backend, which requires more fine-grained control over how we
read data.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While users of the reftable library wouldn't generally require access to
individual blocks in a reftable table, there are valid usecases where
one may require low-level access to them. One such upcoming usecase in
the Git codebase is to implement consistency checks for the reftable
library where we want to verify each block individually.
Create a public interface for reading blocks. The interface isn't yet
complete and lacks e.g. a way to read individual records from a block.
Such missing functionality will be backfilled in subsequent commits.
Note that this change also requires us to expose `reftable_buf`, which
is used by the `reftable_block_first_key()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|