<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/object-store.h, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2022-01-04T00:24:15Z</updated>
<entry>
<title>Merge branch 'ns/tmp-objdir'</title>
<updated>2022-01-04T00:24:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-01-04T00:24:14Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=0dc90d954db852c6604796b8d817365f94e92a16'/>
<id>urn:sha1:0dc90d954db852c6604796b8d817365f94e92a16</id>
<content type='text'>
New interface into the tmp-objdir API to help in-core use of the
quarantine feature.

* ns/tmp-objdir:
  tmp-objdir: disable ref updates when replacing the primary odb
  tmp-objdir: new API for creating temporary writable databases
</content>
</entry>
<entry>
<title>tmp-objdir: disable ref updates when replacing the primary odb</title>
<updated>2021-12-08T22:06:46Z</updated>
<author>
<name>Neeraj Singh</name>
<email>neerajsi@microsoft.com</email>
</author>
<published>2021-12-06T22:05:05Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ecd81dfc79cf12cc21fc0340da8ca8fcc5aa58a7'/>
<id>urn:sha1:ecd81dfc79cf12cc21fc0340da8ca8fcc5aa58a7</id>
<content type='text'>
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.

Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.

Peff's test case [1] was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.

[1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/

Reported-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Neeraj Singh &lt;neerajsi@microsoft.com&gt;
Reviewed-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tmp-objdir: new API for creating temporary writable databases</title>
<updated>2021-12-08T22:06:36Z</updated>
<author>
<name>Neeraj Singh</name>
<email>neerajsi@microsoft.com</email>
</author>
<published>2021-12-06T22:05:04Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=b3cecf49eac00d62e361bf6e6e81392f5a2fb571'/>
<id>urn:sha1:b3cecf49eac00d62e361bf6e6e81392f5a2fb571</id>
<content type='text'>
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it.  The
subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.

Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.

Based-on-patch-by: Elijah Newren &lt;newren@gmail.com&gt;

Signed-off-by: Neeraj Singh &lt;neerajsi@microsoft.com&gt;
Reviewed-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ab/fix-commit-error-message-upon-unwritable-object-store'</title>
<updated>2021-10-25T23:06:57Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-10-25T23:06:57Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=2c428e4205a50cee19669d5b73cc149ec2254a5d'/>
<id>urn:sha1:2c428e4205a50cee19669d5b73cc149ec2254a5d</id>
<content type='text'>
"git commit" gave duplicated error message when the object store
was unwritable, which has been corrected.

* ab/fix-commit-error-message-upon-unwritable-object-store:
  commit: fix duplication regression in permission error output
  unwritable tests: assert exact error output
</content>
</entry>
<entry>
<title>Merge branch 'ab/fsck-unexpected-type'</title>
<updated>2021-10-25T23:06:56Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-10-25T23:06:56Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=061a21d36d807bdcf996f388d5e487d5e1993bbc'/>
<id>urn:sha1:061a21d36d807bdcf996f388d5e487d5e1993bbc</id>
<content type='text'>
"git fsck" has been taught to report mismatch between expected and
actual types of an object better.

* ab/fsck-unexpected-type:
  fsck: report invalid object type-path combinations
  fsck: don't hard die on invalid object types
  object-file.c: stop dying in parse_loose_header()
  object-file.c: return ULHR_TOO_LONG on "header too long"
  object-file.c: use "enum" return type for unpack_loose_header()
  object-file.c: simplify unpack_loose_short_header()
  object-file.c: make parse_loose_header_extended() public
  object-file.c: return -1, not "status" from unpack_loose_header()
  object-file.c: don't set "typep" when returning non-zero
  cat-file tests: test for current --allow-unknown-type behavior
  cat-file tests: add corrupt loose object test
  cat-file tests: test for missing/bogus object with -t, -s and -p
  cat-file tests: move bogus_* variable declarations earlier
  fsck tests: test for garbage appended to a loose object
  fsck tests: test current hash/type mismatch behavior
  fsck tests: refactor one test to use a sub-repo
  fsck tests: add test for fsck-ing an unknown type
</content>
</entry>
<entry>
<title>commit: fix duplication regression in permission error output</title>
<updated>2021-10-12T18:16:59Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-10-12T14:30:49Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=4ef91a2d795c424eda2bec1bfbbd0c813bcc978a'/>
<id>urn:sha1:4ef91a2d795c424eda2bec1bfbbd0c813bcc978a</id>
<content type='text'>
Fix a regression in the error output emitted when .git/objects can't
be written to. Before 9c4d6c0297b (cache-tree: Write updated
cache-tree after commit, 2014-07-13) we'd emit only one "insufficient
permission" error, now we'll do so again.

The cause is rather straightforward, we've got WRITE_TREE_SILENT for
the use-case of wanting to prepare an index silently, quieting any
permission etc. error output. Then when we attempt to update to
that (possibly broken) index we'll run into the same errors again.

But with 9c4d6c0297b the gap between the cache-tree API and the object
store wasn't closed in terms of asking write_object_file() to be
silent. I.e. post-9c4d6c0297b the first call is to prepare_index(),
and after that we'll call prepare_to_commit(). We only want verbose
error output from the latter.

So let's add and use that facility with a corresponding HASH_SILENT
flag, its only user is cache-tree.c's update_one(), which will set it
if its "WRITE_TREE_SILENT" flag is set.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fsck: report invalid object type-path combinations</title>
<updated>2021-10-01T22:06:01Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-10-01T09:16:53Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=96e41f58fe1a5aeadf2bf1c1850c53a1c1144bbc'/>
<id>urn:sha1:96e41f58fe1a5aeadf2bf1c1850c53a1c1144bbc</id>
<content type='text'>
Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.

Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.

Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:

    $ git hash-object --stdin -w -t blob &lt;/dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ mv objects/e6/ objects/e7

Would emit ("[...]" used to abbreviate the OIDs):

    git fsck
    error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
    error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]

Now we'll instead emit:

    error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]

Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:

    $ git hash-object --stdin -w -t garbage --literally &lt;/dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ mv objects/83 objects/84

As noted in an earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:

    $ git fsck
    fatal: invalid object type

Now we'll instead emit sensible error messages:

    $ git fsck
    error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
    error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]

In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.

We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:

    $ git hash-object --stdin -w -t blob &lt;/dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ echo garbage &gt;&gt;objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ git fsck
    error: garbage at end of loose object 'e69d[...]'
    error: unable to unpack contents of ./objects/e6/9d[...]
    error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]

There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:

    $ git hash-object --stdin -w -t garbage --literally &lt;/dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ echo garbage &gt;&gt;objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ /usr/bin/git fsck
    fatal: invalid object type
    $ ~/g/git/git fsck
    error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
    error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    [...]

I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly more obscure cases the worst that can happen is that
we'll get slightly nonsensical or inapplicable error messages.

There's other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>fsck: don't hard die on invalid object types</title>
<updated>2021-10-01T22:06:01Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-10-01T09:16:52Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=31deb28f5e0c85e8bd556ba135e5f0e0926bad7a'/>
<id>urn:sha1:31deb28f5e0c85e8bd556ba135e5f0e0926bad7a</id>
<content type='text'>
Change the error fsck emits on invalid object types, such as:

    $ git hash-object --stdin -w -t garbage --literally &lt;/dev/null
    &lt;OID&gt;

From the very ungraceful error of:

    $ git fsck
    fatal: invalid object type
    $

To:

    $ git fsck
    error: &lt;OID&gt;: object is of unknown type 'garbage': &lt;OID_PATH&gt;
    [ other fsck output ]

We'll still exit with non-zero, but now we'll finish the rest of the
traversal. The tests that's being added here asserts that we'll still
complain about other fsck issues (e.g. an unrelated dangling blob).

To do this we need to pass down the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE"
flag from read_loose_object() through to parse_loose_header(). Since
the read_loose_object() function is only used in builtin/fsck.c we can
simply change it to accept a "struct object_info" (which contains the
OBJECT_INFO_ALLOW_UNKNOWN_TYPE in its flags). See
f6371f92104 (sha1_file: add read_loose_object() function, 2017-01-13)
for the introduction of read_loose_object().

Since we'll need a "struct strbuf" to hold the "type_name" let's pass
it to the for_each_loose_file_in_objdir() callback to avoid allocating
a new one for each loose object in the iteration. It also makes the
memory management simpler than sticking it in fsck_loose() itself, as
we'll only need to strbuf_reset() it, with no need to do a
strbuf_release() before each "return".

Before this commit we'd never check the "type" if read_loose_object()
failed, but now we do. We therefore need to initialize it to OBJ_NONE
to be able to tell the difference between e.g. its
unpack_loose_header() having failed, and us getting past that and into
parse_loose_header().

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>*.[ch] *_INIT macros: use { 0 } for a "zero out" idiom</title>
<updated>2021-09-27T21:47:59Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-09-27T12:54:25Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=9865b6e6a4ca1e895fd473c827cf1822f3bd8249'/>
<id>urn:sha1:9865b6e6a4ca1e895fd473c827cf1822f3bd8249</id>
<content type='text'>
In C it isn't required to specify that all members of a struct are
zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will
accomplish that.

Let's also change code that provided N zero'd fields to just
provide one, and change e.g. "{ NULL }" to "{ 0 }" for
consistency. I.e. even if the first member is a pointer let's use "0"
instead of "NULL". The point of using "0" consistently is to pick one,
and to not have the reader wonder why we're not using the same pattern
everywhere.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'rs/packfile-bad-object-list-in-oidset'</title>
<updated>2021-09-23T20:44:46Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-09-23T20:44:45Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=28caad63d0f58611a22bf5b11e4914496ecb25b1'/>
<id>urn:sha1:28caad63d0f58611a22bf5b11e4914496ecb25b1</id>
<content type='text'>
Replace a handcrafted data structure used to keep track of bad
objects in the packfile API by an oidset.

* rs/packfile-bad-object-list-in-oidset:
  packfile: use oidset for bad objects
  packfile: convert has_packed_and_bad() to object_id
  packfile: convert mark_bad_packed_object() to object_id
  midx: inline nth_midxed_pack_entry()
  oidset: make oidset_size() an inline function
</content>
</entry>
</feed>
