git/object-store.h, branch gitk-resize-error

Merge branch 'ns/tmp-objdir'

2022-01-04T00:24:15Z

New interface into the tmp-objdir API to help in-core use of the quarantine feature. * ns/tmp-objdir: tmp-objdir: disable ref updates when replacing the primary odb tmp-objdir: new API for creating temporary writable databases

tmp-objdir: disable ref updates when replacing the primary odb

2021-12-08T22:06:46Z

When creating a subprocess with a temporary ODB, we set the GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not to update refs, since the tmp-objdir may go away. Introduce a similar mechanism for in-process temporary ODBs when we call tmp_objdir_replace_primary_odb. Now both mechanisms set the disable_ref_updates flag on the odb, which is queried by the ref_transaction_prepare function. Peff's test case [1] was invoking ref updates via the cachetextconv setting. That particular code silently does nothing when a ref update is forbidden. See the call to notes_cache_put in fill_textconv where errors are ignored. [1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/ Reported-by: Jeff King Signed-off-by: Neeraj Singh Reviewed-by: Elijah Newren Signed-off-by: Junio C Hamano

tmp-objdir: new API for creating temporary writable databases

2021-12-08T22:06:36Z

The tmp_objdir API provides the ability to create temporary object directories, but was designed with the goal of having subprocesses access these object stores, followed by the main process migrating objects from it to the main object store or just deleting it. The subprocesses would view it as their primary datastore and write to it. Here we add the tmp_objdir_replace_primary_odb function that replaces the current process's writable "main" object directory with the specified one. The previous main object directory is restored in either tmp_objdir_migrate or tmp_objdir_destroy. For the --remerge-diff usecase, add a new `will_destroy` flag in `struct object_database` to mark ephemeral object databases that do not require fsync durability. Add 'git prune' support for removing temporary object databases, and make sure that they have a name starting with tmp_ and containing an operation-specific name. Based-on-patch-by: Elijah Newren Signed-off-by: Neeraj Singh Reviewed-by: Elijah Newren Signed-off-by: Junio C Hamano

Merge branch 'ab/fix-commit-error-message-upon-unwritable-object-store'

2021-10-25T23:06:57Z

"git commit" gave duplicated error message when the object store was unwritable, which has been corrected. * ab/fix-commit-error-message-upon-unwritable-object-store: commit: fix duplication regression in permission error output unwritable tests: assert exact error output

Merge branch 'ab/fsck-unexpected-type'

2021-10-25T23:06:56Z

"git fsck" has been taught to report mismatch between expected and actual types of an object better. * ab/fsck-unexpected-type: fsck: report invalid object type-path combinations fsck: don't hard die on invalid object types object-file.c: stop dying in parse_loose_header() object-file.c: return ULHR_TOO_LONG on "header too long" object-file.c: use "enum" return type for unpack_loose_header() object-file.c: simplify unpack_loose_short_header() object-file.c: make parse_loose_header_extended() public object-file.c: return -1, not "status" from unpack_loose_header() object-file.c: don't set "typep" when returning non-zero cat-file tests: test for current --allow-unknown-type behavior cat-file tests: add corrupt loose object test cat-file tests: test for missing/bogus object with -t, -s and -p cat-file tests: move bogus_* variable declarations earlier fsck tests: test for garbage appended to a loose object fsck tests: test current hash/type mismatch behavior fsck tests: refactor one test to use a sub-repo fsck tests: add test for fsck-ing an unknown type

commit: fix duplication regression in permission error output

2021-10-12T18:16:59Z

Fix a regression in the error output emitted when .git/objects can't be written to. Before 9c4d6c0297b (cache-tree: Write updated cache-tree after commit, 2014-07-13) we'd emit only one "insufficient permission" error, now we'll do so again. The cause is rather straightforward, we've got WRITE_TREE_SILENT for the use-case of wanting to prepare an index silently, quieting any permission etc. error output. Then when we attempt to update to that (possibly broken) index we'll run into the same errors again. But with 9c4d6c0297b the gap between the cache-tree API and the object store wasn't closed in terms of asking write_object_file() to be silent. I.e. post-9c4d6c0297b the first call is to prepare_index(), and after that we'll call prepare_to_commit(). We only want verbose error output from the latter. So let's add and use that facility with a corresponding HASH_SILENT flag, its only user is cache-tree.c's update_one(), which will set it if its "WRITE_TREE_SILENT" flag is set. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano

fsck: report invalid object type-path combinations

2021-10-01T22:06:01Z

Improve the error that's emitted in cases where we find a loose object we parse, but which isn't at the location we expect it to be. Before this change we'd prefix the error with a not-a-OID derived from the path at which the object was found, due to an emergent behavior in how we'd end up with an "OID" in these codepaths. Now we'll instead say what object we hashed, and what path it was found at. Before this patch series e.g.: $ git hash-object --stdin -w -t blob >objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 $ git fsck error: garbage at end of loose object 'e69d[...]' error: unable to unpack contents of ./objects/e6/9d[...] error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...] There is currently some weird messaging in the edge case when the two are combined, i.e. because we're not explicitly passing along an error state about this specific scenario from check_stream_oid() via read_loose_object() we'll end up printing the null OID if an object is of an unknown type *and* it can't be unpacked by zlib, e.g.: $ git hash-object --stdin -w -t garbage --literally >objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f $ /usr/bin/git fsck fatal: invalid object type $ ~/g/git/git fsck error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f' error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f [...] I think it's OK to leave that for future improvements, which would involve enum-ifying more error state as we've done with "enum unpack_loose_header_result" in preceding commits. In these increasingly more obscure cases the worst that can happen is that we'll get slightly nonsensical or inapplicable error messages. There's other such potential edge cases, all of which might produce some confusing messaging, but still be handled correctly as far as passing along errors goes. E.g. if check_object_signature() returns and oideq(real_oid, null_oid()) is true, which could happen if it returns -1 due to the read_istream() call having failed. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano

fsck: don't hard die on invalid object types

2021-10-01T22:06:01Z

Change the error fsck emits on invalid object types, such as: $ git hash-object --stdin -w -t garbage --literally From the very ungraceful error of: $ git fsck fatal: invalid object type $ To: $ git fsck error: : object is of unknown type 'garbage': [ other fsck output ] We'll still exit with non-zero, but now we'll finish the rest of the traversal. The tests that's being added here asserts that we'll still complain about other fsck issues (e.g. an unrelated dangling blob). To do this we need to pass down the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE" flag from read_loose_object() through to parse_loose_header(). Since the read_loose_object() function is only used in builtin/fsck.c we can simply change it to accept a "struct object_info" (which contains the OBJECT_INFO_ALLOW_UNKNOWN_TYPE in its flags). See f6371f92104 (sha1_file: add read_loose_object() function, 2017-01-13) for the introduction of read_loose_object(). Since we'll need a "struct strbuf" to hold the "type_name" let's pass it to the for_each_loose_file_in_objdir() callback to avoid allocating a new one for each loose object in the iteration. It also makes the memory management simpler than sticking it in fsck_loose() itself, as we'll only need to strbuf_reset() it, with no need to do a strbuf_release() before each "return". Before this commit we'd never check the "type" if read_loose_object() failed, but now we do. We therefore need to initialize it to OBJ_NONE to be able to tell the difference between e.g. its unpack_loose_header() having failed, and us getting past that and into parse_loose_header(). Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano

.[ch] _INIT macros: use { 0 } for a "zero out" idiom

2021-09-27T21:47:59Z

In C it isn't required to specify that all members of a struct are zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will accomplish that. Let's also change code that provided N zero'd fields to just provide one, and change e.g. "{ NULL }" to "{ 0 }" for consistency. I.e. even if the first member is a pointer let's use "0" instead of "NULL". The point of using "0" consistently is to pick one, and to not have the reader wonder why we're not using the same pattern everywhere. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano

Merge branch 'rs/packfile-bad-object-list-in-oidset'

2021-09-23T20:44:46Z

Replace a handcrafted data structure used to keep track of bad objects in the packfile API by an oidset. * rs/packfile-bad-object-list-in-oidset: packfile: use oidset for bad objects packfile: convert has_packed_and_bad() to object_id packfile: convert mark_bad_packed_object() to object_id midx: inline nth_midxed_pack_entry() oidset: make oidset_size() an inline function