aboutsummaryrefslogtreecommitdiff
path: root/reftable/reftable-writer.h
AgeCommit message (Collapse)Author
13 daysreftable/stack: provide fsync(3p) via system headerPatrick Steinhardt
Users of the reftable library are expected to provide their own function callback in cases they want to sync(3p) data to disk via the reftable write options. But if no such function was provided we end up calling fsync(3p) directly, which may not even be available on some systems. While dropping the explicit call to fsync(3p) would work, it would lead to an unsafe default behaviour where a project may have forgotten to set up the callback function, and that could lead to potential data loss. So this is not a great solution. Instead, drop the callback function and make it mandatory for the project to define fsync(3p). In the case of Git, we can then easily inject our custom implementation via the "reftable-system.h" header so that we continue to use `fsync_component()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysreftable: introduce "reftable-system.h" headerPatrick Steinhardt
We're including a couple of standard headers like <stdint.h> in a bunch of locations, which makes it hard for a project to plug in their own logic for making required functionality available. For us this is for example via "compat/posix.h", which already includes all of the system headers relevant to us. Introduce a new "reftable-system.h" header that allows projects to provide their own headers. This new header is supposed to contain all the project-specific bits to provide the POSIX-like environment, and some additional supporting code. With this change, we thus have the following split in our system-specific code: - "reftable/reftable-system.h" is the project-specific header that provides a POSIX-like environment. Every project is expected to provide their own implementation. - "reftable/system.h" contains the project-independent definition of the interfaces that a project needs to implement. This file should not be touched by a project. - "reftable/system.c" contains the project-specific implementation of the interfaces defined in "system.h". Again, every project is expected to provide their own implementation. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-12reftable/writer: fix type used for number of recordsPatrick Steinhardt
Both `reftable_writer_add_refs()` and `reftable_writer_add_logs()` accept an array of records that should be added to the new table. Callers of this function are expected to also pass the number of such records to the function to tell it how many such records it is supposed to write. But while all callers pass in a `size_t`, which is a sensible choice, the function in fact accepts an `int` as argument, which is less so. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-07reftable: fix formatting of the license headerPatrick Steinhardt
The license headers used across the reftable library doesn't follow our typical coding style for multi-line comments. Fix it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-14Merge branch 'kn/reflog-migration-fix-followup'Junio C Hamano
Code clean-up. * kn/reflog-migration-fix-followup: reftable: prevent 'update_index' changes after adding records refs: use 'uint64_t' for 'ref_update.index' refs: mark `ref_transaction_update_reflog()` as static
2025-01-22reftable: prevent 'update_index' changes after adding recordsKarthik Nayak
The function `reftable_writer_set_limits()` allows updating the 'min_update_index' and 'max_update_index' of a reftable writer. These values are written to both the writer's header and footer. Since the header is written during the first block write, any subsequent changes to the update index would create a mismatch between the header and footer values. The footer would contain the newer values while the header retained the original ones. To protect against this bug, prevent callers from updating these values after any record is written. To do this, modify the function to return an error whenever the limits are modified after any record adds. Check for record adds within `reftable_writer_set_limits()` by checking the `last_key` and `next` variable. The former is updated after each record added, but is reset at certain points. The latter is set after writing the first block. Modify all callers of the function to anticipate a return type and handle it accordingly. Add a unit test to also ensure the function returns the error as expected. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-21reftable/block: adjust type of the restart lengthPatrick Steinhardt
The restart length is tracked as a positive integer even though it cannot ever be negative. Furthermore, it is effectively capped via the MAX_RESTARTS variable. Adjust the type of the variable to be `uint32_t`. While this type is excessive given that MAX_RESTARTS fits into an `uint16_t`, other places already use 32 bit integers for restarts, so this type is being more consistent. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-26reftable/stack: add mechanism to notify callers on reloadPatrick Steinhardt
Reftable stacks are reloaded in two cases: - When calling `reftable_stack_reload()`, if the stat-cache tells us that the stack has been modified. - When committing a reftable addition. While callers can figure out the second case, they do not have a mechanism to figure out whether `reftable_stack_reload()` led to an actual reload of the on-disk data. All they can do is thus to assume that data is always being reloaded in that case. Improve the situation by introducing a new `on_reload()` callback to the reftable options. If provided, the function will be invoked every time the stack has indeed been reloaded. This allows callers to invalidate data that depends on the current stack data. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/stack: stop using `fsync_component()` directlyPatrick Steinhardt
We're executing `fsync_component()` directly in the reftable library so that we can fsync data to disk depending on "core.fsync". But as we're in the process of converting the reftable library to become standalone we cannot use that function in the library anymore. Refactor the code such that users of the library can inject a custom fsync function via the write options. This allows us to get rid of the dependency on "write-or-die.h". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-19reftable/system: stop depending on "hash.h"Patrick Steinhardt
We include "hash.h" in "reftable/system.h" such that we can use hash format IDs as well as the raw size of SHA1 and SHA256. As we are in the process of converting the reftable library to become standalone we of course cannot rely on those constants anymore. Introduce a new `enum reftable_hash` to replace internal uses of the hash format IDs and new constants that replace internal uses of the hash size. Adapt the reftable backend to set up the correct hash function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10Merge branch 'ps/reftable-alloc-failures'Junio C Hamano
The reftable library is now prepared to expect that the memory allocation function given to it may fail to allocate and to deal with such an error. * ps/reftable-alloc-failures: (26 commits) reftable/basics: fix segfault when growing `names` array fails reftable/basics: ban standard allocator functions reftable: introduce `REFTABLE_FREE_AND_NULL()` reftable: fix calls to free(3P) reftable: handle trivial allocation failures reftable/tree: handle allocation failures reftable/pq: handle allocation failures when adding entries reftable/block: handle allocation failures reftable/blocksource: handle allocation failures reftable/iter: handle allocation failures when creating indexed table iter reftable/stack: handle allocation failures in auto compaction reftable/stack: handle allocation failures in `stack_compact_range()` reftable/stack: handle allocation failures in `reftable_new_stack()` reftable/stack: handle allocation failures on reload reftable/reader: handle allocation failures in `reader_init_iter()` reftable/reader: handle allocation failures for unindexed reader reftable/merged: handle allocation failures in `merged_table_init_iter()` reftable/writer: handle allocation failures in `reftable_new_writer()` reftable/writer: handle allocation failures in `writer_index_hash()` reftable/record: handle allocation failures when decoding records ...
2024-10-02reftable/writer: handle allocation failures in `reftable_new_writer()`Patrick Steinhardt
Handle allocation failures in `reftable_new_writer()`. Adapt the function to return an error code to return such failures. While at it, rename it to match our code style as we have to touch up every callsite anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-24refs/reftable: introduce "reftable.lockTimeout"Patrick Steinhardt
When multiple concurrent processes try to update references in a repository they may try to lock the same lockfiles. This can happen even when the updates are non-conflicting and can both be applied, so it doesn't always make sense to abort the transaction immediately. Both the "loose" and "packed" backends thus have a grace period that they wait for the lock to be released that can be controlled via the config values "core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively. The reftable backend doesn't have such a setting yet and instead fails immediately when it sees such a lock. But the exact same concepts apply here as they do apply to the other backends. Introduce a new "reftable.lockTimeout" config that controls how long we may wait for a "tables.list" lock to be released. The default value of this config is 100ms, which is the same default as we have it for the "loose" backend. Note that even though we also lock individual tables, this config really only applies to the "tables.list" file. This is because individual tables are only ever locked when we already hold the "tables.list" lock during compaction. When we observe such a lock we in fact do not want to compact the table at all because it is already in the process of being compacted by a concurrent process. So applying the same timeout here would not make any sense and only delay progress. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13reftable: make the compaction factor configurablePatrick Steinhardt
When auto-compacting, the reftable library packs references such that the sizes of the tables form a geometric sequence. The factor for this geometric sequence is hardcoded to 2 right now. We're about to expose this as a config option though, so let's expose the factor via write options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13reftable: use `uint16_t` to track restart intervalPatrick Steinhardt
The restart interval can at most be `UINT16_MAX` as specified in the technical documentation of the reftable format. Furthermore, it cannot ever be negative. Regardless of that we use an `int` to track the restart interval. Change the type to use an `uint16_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-13reftable: pass opts as constant pointerPatrick Steinhardt
We sometimes pass the refatble write options as value and sometimes as a pointer. This is quite confusing and makes the reader wonder whether the options get modified sometimes. In fact, `reftable_new_writer()` does cause the caller-provided options to get updated when some values aren't set up. This is quite unexpected, but didn't cause any harm until now. Adapt the code so that we do not modify the caller-provided values anymore. While at it, refactor the code to code to consistently pass the options as a constant pointer to clarify that the caller-provided opts will not ever get modified. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-08Merge branch 'ps/reftable-write-optim'Junio C Hamano
Code to write out reftable has seen some optimization and simplification. * ps/reftable-write-optim: reftable/block: reuse compressed array reftable/block: reuse zstream when writing log blocks reftable/writer: reset `last_key` instead of releasing it reftable/writer: unify releasing memory reftable/writer: refactorings for `writer_flush_nonempty_block()` reftable/writer: refactorings for `writer_add_record()` refs/reftable: don't recompute committer ident reftable: remove name checks refs/reftable: skip duplicate name checks refs/reftable: perform explicit D/F check when writing symrefs refs/reftable: fix D/F conflict error message on ref copy
2024-04-08reftable: remove name checksPatrick Steinhardt
In the preceding commit we have disabled name checks in the "reftable" backend. These checks were responsible for verifying multiple things when writing records to the reftable stack: - Detecting file/directory conflicts. Starting with the preceding commits this is now handled by the reftable backend itself via `refs_verify_refname_available()`. - Validating refnames. This is handled by `check_refname_format()` in the generic ref transacton layer. The code in the reftable library is thus not used anymore and likely to bitrot over time. Remove it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-08reftable/stack: expose option to disable auto-compactionJustin Tobler
The reftable stack already has a variable to configure whether or not to run auto-compaction, but it is inaccessible to users of the library. There exist use cases where a caller may want to have more control over auto-compaction. Move the `disable_auto_compact` option into `reftable_write_options` to allow external callers to disable auto-compaction. This will be used in a subsequent commit. Signed-off-by: Justin Tobler <jltobler@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-23reftable: honor core.fsyncJohn Cai
While the reffiles backend honors configured fsync settings, the reftable backend does not. Address this by fsyncing reftable files using the write-or-die api's fsync_component() in two places: when we add additional entries into the table, and when we close the reftable writer. This commits adds a flush function pointer as a new member of reftable_writer because we are not sure that the first argument to the *write function pointer always contains a file descriptor. In the case of strbuf_add_void, the first argument is a buffer. This way, we can pass in a corresponding flush function that knows how to flush depending on which writer is being used. This patch does not contain tests as they will need to wait for another patch to start to exercise the reftable backend. At that point, the tests will be added to observe that fsyncs are happening when the reftable is in use. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-23reftable: rename writer_stats to reftable_writer_statsHan-Wen Nienhuys
This function is part of the reftable API, so it should use the reftable_ prefix Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23reftable: support preset file mode for writingHan-Wen Nienhuys
Create files with mode 0666, so umask works as intended. Provides an override, which is useful to support shared repos (test t1301-shared-repo.sh). Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08reftable: write reftable filesHan-Wen Nienhuys
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>