<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/object.c, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2022-01-04T00:24:15Z</updated>
<entry>
<title>Merge branch 'ns/tmp-objdir'</title>
<updated>2022-01-04T00:24:15Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2022-01-04T00:24:14Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=0dc90d954db852c6604796b8d817365f94e92a16'/>
<id>urn:sha1:0dc90d954db852c6604796b8d817365f94e92a16</id>
<content type='text'>
New interface into the tmp-objdir API to help in-core use of the
quarantine feature.

* ns/tmp-objdir:
  tmp-objdir: disable ref updates when replacing the primary odb
  tmp-objdir: new API for creating temporary writable databases
</content>
</entry>
<entry>
<title>tmp-objdir: new API for creating temporary writable databases</title>
<updated>2021-12-08T22:06:36Z</updated>
<author>
<name>Neeraj Singh</name>
<email>neerajsi@microsoft.com</email>
</author>
<published>2021-12-06T22:05:04Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=b3cecf49eac00d62e361bf6e6e81392f5a2fb571'/>
<id>urn:sha1:b3cecf49eac00d62e361bf6e6e81392f5a2fb571</id>
<content type='text'>
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it.  The
subprocesses would view it as their primary datastore and write to it.

Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.

For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.

Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.

Based-on-patch-by: Elijah Newren &lt;newren@gmail.com&gt;

Signed-off-by: Neeraj Singh &lt;neerajsi@microsoft.com&gt;
Reviewed-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object.c: use BUG(...) no die("BUG: ...") in lookup_object_by_type()</title>
<updated>2021-12-07T20:33:58Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-12-07T11:05:54Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=eafd6e7e5585ecf7e0d4bd542d7565fea008795a'/>
<id>urn:sha1:eafd6e7e5585ecf7e0d4bd542d7565fea008795a</id>
<content type='text'>
Adjust code added in 7463064b280 (object.h: add
lookup_object_by_type() function, 2021-06-22) to use the BUG()
function.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'ab/fsck-unexpected-type'</title>
<updated>2021-10-25T23:06:56Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-10-25T23:06:56Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=061a21d36d807bdcf996f388d5e487d5e1993bbc'/>
<id>urn:sha1:061a21d36d807bdcf996f388d5e487d5e1993bbc</id>
<content type='text'>
"git fsck" has been taught to report mismatch between expected and
actual types of an object better.

* ab/fsck-unexpected-type:
  fsck: report invalid object type-path combinations
  fsck: don't hard die on invalid object types
  object-file.c: stop dying in parse_loose_header()
  object-file.c: return ULHR_TOO_LONG on "header too long"
  object-file.c: use "enum" return type for unpack_loose_header()
  object-file.c: simplify unpack_loose_short_header()
  object-file.c: make parse_loose_header_extended() public
  object-file.c: return -1, not "status" from unpack_loose_header()
  object-file.c: don't set "typep" when returning non-zero
  cat-file tests: test for current --allow-unknown-type behavior
  cat-file tests: add corrupt loose object test
  cat-file tests: test for missing/bogus object with -t, -s and -p
  cat-file tests: move bogus_* variable declarations earlier
  fsck tests: test for garbage appended to a loose object
  fsck tests: test current hash/type mismatch behavior
  fsck tests: refactor one test to use a sub-repo
  fsck tests: add test for fsck-ing an unknown type
</content>
</entry>
<entry>
<title>fsck: report invalid object type-path combinations</title>
<updated>2021-10-01T22:06:01Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2021-10-01T09:16:53Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=96e41f58fe1a5aeadf2bf1c1850c53a1c1144bbc'/>
<id>urn:sha1:96e41f58fe1a5aeadf2bf1c1850c53a1c1144bbc</id>
<content type='text'>
Improve the error that's emitted in cases where we find a loose object
we parse, but which isn't at the location we expect it to be.

Before this change we'd prefix the error with a not-a-OID derived from
the path at which the object was found, due to an emergent behavior in
how we'd end up with an "OID" in these codepaths.

Now we'll instead say what object we hashed, and what path it was
found at. Before this patch series e.g.:

    $ git hash-object --stdin -w -t blob &lt;/dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ mv objects/e6/ objects/e7

Would emit ("[...]" used to abbreviate the OIDs):

    git fsck
    error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...])
    error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...]

Now we'll instead emit:

    error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...]

Furthermore, we'll do the right thing when the object type and its
location are bad. I.e. this case:

    $ git hash-object --stdin -w -t garbage --literally &lt;/dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ mv objects/83 objects/84

As noted in an earlier commits we'd simply die early in those cases,
until preceding commits fixed the hard die on invalid object type:

    $ git fsck
    fatal: invalid object type

Now we'll instead emit sensible error messages:

    $ git fsck
    error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...]
    error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...]

In both fsck.c and object-file.c we're using null_oid as a sentinel
value for checking whether we got far enough to be certain that the
issue was indeed this OID mismatch.

We need to add the "object corrupt or missing" special-case to deal
with cases where read_loose_object() will return an error before
completing check_object_signature(), e.g. if we have an error in
unpack_loose_rest() because we find garbage after the valid gzip
content:

    $ git hash-object --stdin -w -t blob &lt;/dev/null
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ echo garbage &gt;&gt;objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
    $ git fsck
    error: garbage at end of loose object 'e69d[...]'
    error: unable to unpack contents of ./objects/e6/9d[...]
    error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...]

There is currently some weird messaging in the edge case when the two
are combined, i.e. because we're not explicitly passing along an error
state about this specific scenario from check_stream_oid() via
read_loose_object() we'll end up printing the null OID if an object is
of an unknown type *and* it can't be unpacked by zlib, e.g.:

    $ git hash-object --stdin -w -t garbage --literally &lt;/dev/null
    8315a83d2acc4c174aed59430f9a9c4ed926440f
    $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ echo garbage &gt;&gt;objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    $ /usr/bin/git fsck
    fatal: invalid object type
    $ ~/g/git/git fsck
    error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f'
    error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f
    [...]

I think it's OK to leave that for future improvements, which would
involve enum-ifying more error state as we've done with "enum
unpack_loose_header_result" in preceding commits. In these
increasingly more obscure cases the worst that can happen is that
we'll get slightly nonsensical or inapplicable error messages.

There's other such potential edge cases, all of which might produce
some confusing messaging, but still be handled correctly as far as
passing along errors goes. E.g. if check_object_signature() returns
and oideq(real_oid, null_oid()) is true, which could happen if it
returns -1 due to the read_istream() call having failed.

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/log-decorate-optim'</title>
<updated>2021-07-28T20:17:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-07-28T20:17:58Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=c9d6d8a1938f86594b33785a8228d9f35ad457c8'/>
<id>urn:sha1:c9d6d8a1938f86594b33785a8228d9f35ad457c8</id>
<content type='text'>
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.

* jk/log-decorate-optim:
  load_ref_decorations(): fix decoration with tags
  add_ref_decoration(): rename s/type/deco_type/
  load_ref_decorations(): avoid parsing non-tag objects
  object.h: add lookup_object_by_type() function
  object.h: expand docstring for lookup_unknown_object()
  log: avoid loading decorations for userformats that don't need it
  pretty.h: update and expand docstring for userformat_find_requirements()
</content>
</entry>
<entry>
<title>speed up alt_odb_usable() with many alternates</title>
<updated>2021-07-08T00:21:12Z</updated>
<author>
<name>Eric Wong</name>
<email>e@80x24.org</email>
</author>
<published>2021-07-07T23:10:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=cf2dc1c238c6fd5f93c315a3045ccf95459701cd'/>
<id>urn:sha1:cf2dc1c238c6fd5f93c315a3045ccf95459701cd</id>
<content type='text'>
With many alternates, the duplicate check in alt_odb_usable()
wastes many cycles doing repeated fspathcmp() on every existing
alternate.  Use a khash to speed up lookups by odb-&gt;path.

Since the kh_put_* API uses the supplied key without
duplicating it, we also take advantage of it to replace both
xstrdup() and strbuf_release() in link_alt_odb_entry() with
strbuf_detach() to avoid the allocation and copy.

In a test repository with 50K alternates and each of those 50K
alternates having one alternate each (for a total of 100K total
alternates); this speeds up lookup of a non-existent blob from
over 16 minutes to roughly 2.7 seconds on my busy workstation.

Note: all underlying git object directories were small and
unpacked with only loose objects and no packs.  Having to load
packs increases times significantly.

Signed-off-by: Eric Wong &lt;e@80x24.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>object.h: add lookup_object_by_type() function</title>
<updated>2021-06-29T03:30:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-06-22T16:06:41Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=7463064b28086c0a765e247bc8336f8e32356494'/>
<id>urn:sha1:7463064b28086c0a765e247bc8336f8e32356494</id>
<content type='text'>
In some cases it's useful for efficiency reasons to get the type of an
object before deciding whether to parse it, but we still want an object
struct. E.g., in reachable.c, bitmaps give us the type, but we just want
to mark flags on each object. Likewise, we may loop over every object
and only parse tags in order to peel them; checking the type first lets
us avoid parsing the non-tags.

But our lookup_blob(), etc, functions make getting an object struct
annoying: we have to call the right function for every type. And we
cannot just use the generic lookup_object(), because it only returns an
already-seen object; it won't allocate a new object struct.

Let's provide a function that dispatches to the correct lookup_*
function based on a run-time type. In fact, reachable.c already has such
a helper, so we'll just make that public.

I did change the return type from "void *" to "struct object *". While
the former is a clever way to avoid casting inside the function, it's
less safe and less informative to people reading the function
declaration.

The next commit will add a new caller.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>lookup_unknown_object(): take a repository argument</title>
<updated>2021-04-13T20:18:46Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2021-04-13T07:16:36Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=45a187cc34b5016842b91ef14b59e40505afff46'/>
<id>urn:sha1:45a187cc34b5016842b91ef14b59e40505afff46</id>
<content type='text'>
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>use CALLOC_ARRAY</title>
<updated>2021-03-14T00:00:09Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2021-03-13T16:17:22Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ca56dadb4b65ccaeab809d80db80a312dc00941a'/>
<id>urn:sha1:ca56dadb4b65ccaeab809d80db80a312dc00941a</id>
<content type='text'>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
