<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/commit.c, branch v2.19.2</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=v2.19.2</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=v2.19.2'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2018-11-21T13:57:47Z</updated>
<entry>
<title>Merge branch 'ds/commit-graph-with-grafts' into maint</title>
<updated>2018-11-21T13:57:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-21T13:57:47Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e60e38a15da202737790e2b2a9e613a9cf6ec092'/>
<id>urn:sha1:e60e38a15da202737790e2b2a9e613a9cf6ec092</id>
<content type='text'>
The recently introduced commit-graph auxiliary data is incompatible
with mechanisms such as replace &amp; grafts that "breaks" immutable
nature of the object reference relationship.  Disable optimizations
based on its use (and updating existing commit-graph) when these
incompatible features are in use in the repository.

* ds/commit-graph-with-grafts:
  commit-graph: close_commit_graph before shallow walk
  commit-graph: not compatible with uninitialized repo
  commit-graph: not compatible with grafts
  commit-graph: not compatible with replace objects
  test-repository: properly init repo
  commit-graph: update design document
  refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
  refs.c: migrate internal ref iteration to pass thru repository argument
</content>
</entry>
<entry>
<title>Merge branch 'jk/trailer-fixes' into maint</title>
<updated>2018-11-21T13:57:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-11-21T13:57:41Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e293824d00c243c6a7afb15324902f693ae751e8'/>
<id>urn:sha1:e293824d00c243c6a7afb15324902f693ae751e8</id>
<content type='text'>
"git interpret-trailers" and its underlying machinery had a buggy
code that attempted to ignore patch text after commit log message,
which triggered in various codepaths that will always get the log
message alone and never get such an input.

* jk/trailer-fixes:
  append_signoff: use size_t for string offsets
  sequencer: ignore "---" divider when parsing trailers
  pretty, ref-filter: format %(trailers) with no_divider option
  interpret-trailers: allow suppressing "---" divider
  interpret-trailers: tighten check for "---" patch boundary
  trailer: pass process_trailer_opts to trailer_info_get()
  trailer: use size_t for iterating trailer list
  trailer: use size_t for string offsets
</content>
</entry>
<entry>
<title>Merge branch 'ds/commit-graph-lockfile-fix'</title>
<updated>2018-09-04T21:31:39Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-09-04T21:31:39Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=0a866db570e520ce7b08d1eceefdeaa9d63b6704'/>
<id>urn:sha1:0a866db570e520ce7b08d1eceefdeaa9d63b6704</id>
<content type='text'>
"git merge-base" in 2.19-rc1 has performance regression when the
(experimental) commit-graph feature is in use, which has been
mitigated.

* ds/commit-graph-lockfile-fix:
  commit: don't use generation numbers if not needed
</content>
</entry>
<entry>
<title>commit: don't use generation numbers if not needed</title>
<updated>2018-08-30T18:17:57Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-30T12:58:09Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=091f4cf3586957c3fd99d4c4c59c569d009137ad'/>
<id>urn:sha1:091f4cf3586957c3fd99d4c4c59c569d009137ad</id>
<content type='text'>
In 3afc679b "commit: use generations in paint_down_to_common()",
the queue in paint_down_to_common() was changed to use a priority
order based on generation number before commit date. This served
two purposes:

 1. When generation numbers are present, the walk guarantees
    correct topological relationships, regardless of clock skew in
    commit dates.

 2. It enables short-circuiting the walk when the min_generation
    parameter is added in d7c1ec3e "commit: add short-circuit to
    paint_down_to_common()". This short-circuit helps commands
    like 'git branch --contains' from needing to walk to a merge
    base when we know the result is false.

The commit message for 3afc679b includes the following sentence:

    This change does not affect the number of commits that are
    walked during the execution of paint_down_to_common(), only
    the order that those commits are inspected.

This statement is incorrect. Because it changes the order in which
the commits are inspected, it changes the order they are added to
the queue, and hence can change the number of loops before the
queue_has_nonstale() method returns true.

This change makes a concrete difference depending on the topology
of the commit graph. For instance, computing the merge-base between
consecutive versions of the Linux kernel has no effect for versions
after v4.9, but 'git merge-base v4.8 v4.9' presents a performance
regression:

    v2.18.0: 0.122s
v2.19.0-rc1: 0.547s
       HEAD: 0.127s

To determine that this was simply an ordering issue, I inserted
a counter within the while loop of paint_down_to_common() and
found that the loop runs 167,468 times in v2.18.0 and 635,579
times in v2.19.0-rc1.

The topology of this case can be described in a simplified way
here:

  v4.9
   |  \
   |   \
  v4.8  \
   | \   \
   |  \   |
  ...  A  B
   |  /  /
   | /  /
   |/__/
   C

Here, the "..." means "a very long line of commits". By generation
number, A and B have generation one more than C. However, A and B
have commit date higher than most of the commits reachable from
v4.8. When the walk reaches v4.8, we realize that it has PARENT1
and PARENT2 flags, so everything it can reach is marked as STALE,
including A. B has only the PARENT1 flag, so is not STALE.

When paint_down_to_common() is run using
compare_commits_by_commit_date, A and B are removed from the queue
early and C is inserted into the queue. At this point, C and the
rest of the queue entries are marked as STALE. The loop then
terminates.

When paint_down_to_common() is run using
compare_commits_by_gen_then_commit_date, B is removed from the
queue only after the many commits reachable from v4.8 are explored.
This causes the loop to run longer. The reason for this regression
is simple: the queue order is intended to not explore a commit
until everything that _could_ reach that commit is explored. From
the information gathered by the original ordering, we have no
guarantee that there is not a commit D reachable from v4.8 that
can also reach B. We gained absolute correctness in exchange for
a performance regression.

The performance regression is probably the worse option, since
these incorrect results in paint_down_to_common() are rare. The
topology required for the performance regression are less rare,
but still require multiple merge commits where the parents differ
greatly in generation number. In our example above, the commit A
is as important as the commit B to demonstrate the problem, since
otherwise the commit C will sit in the queue as non-stale just as
long in both orders.

The solution provided uses the min_generation parameter to decide
if we should use generation numbers in our ordering. When
min_generation is equal to zero, it means that the caller has no
known cutoff for the walk, so we should rely on our commit-date
heuristic as before; this is the case with merge_bases_many().
When min_generation is non-zero, then the caller knows a valuable
cutoff for the short-circuit mechanism; this is the case with
remove_redundant() and in_merge_bases_many().

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'js/larger-timestamps'</title>
<updated>2018-08-27T21:33:44Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-27T21:33:44Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=1392c5d28935a3aa25367df52adf4ca1e3c5724e'/>
<id>urn:sha1:1392c5d28935a3aa25367df52adf4ca1e3c5724e</id>
<content type='text'>
Portability fix.

* js/larger-timestamps:
  commit: use timestamp_t for author_date_slab
</content>
</entry>
<entry>
<title>append_signoff: use size_t for string offsets</title>
<updated>2018-08-23T17:08:51Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-08-23T00:50:51Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=66e83d9b41f7438cb167b9bb54093ebbf0532437'/>
<id>urn:sha1:66e83d9b41f7438cb167b9bb54093ebbf0532437</id>
<content type='text'>
The append_signoff() function takes an "int" to specify the
number of bytes to ignore. Most callers just pass 0, and the
remainder use ignore_non_trailer() to skip over cruft.
That function also returns an int, and uses them internally.

On systems where size_t is larger than an int (i.e., most
64-bit systems), dealing with a ridiculously large commit
message could end up overflowing an int, producing
surprising results (e.g., returning a negative offset, which
would cause us to look outside the original string).

Let's consistently use size_t for these offsets through this
whole stack. As a bonus, this makes the meaning of
"ignore_footer" as an offset (and not a boolean) more clear.
But while we're here, let's also document the interface.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit: use timestamp_t for author_date_slab</title>
<updated>2018-08-21T21:08:18Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-21T20:54:12Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=1820703045f8974bc5320d08a3611f4e29c83bf9'/>
<id>urn:sha1:1820703045f8974bc5320d08a3611f4e29c83bf9</id>
<content type='text'>
The author_date_slab is used to store the author date of a commit
when walking with the --author-date flag in rev-list or log. This
was added as an 'unsigned long' in

	81c6b38b "log: --author-date-order"

Since 'unsigned long' is ambiguous in its bit-ness across platforms
(64-bit in Linux, 32-bit in Windows, for example), most references
to the author dates in commit.c were converted to timestamp_t in

	dddbad72 "timestamp_t: a new data type for timestamps"

However, the slab definition was missed, leading to a mismatch in
the data types in Windows. This would not reveal itself as a bug
unless someone authors a commit after February 2106, but commits
can store anything as their author date.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: not compatible with grafts</title>
<updated>2018-08-21T17:22:51Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2018-08-20T18:24:30Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=20fd6d57996e33c30b6bb030329523d0116f15ec'/>
<id>urn:sha1:20fd6d57996e33c30b6bb030329523d0116f15ec</id>
<content type='text'>
Augment commit_graph_compatible(r) to return false when the given
repository r has commit grafts or is a shallow clone. Test that in these
situations we ignore existing commit-graph files and we do not write new
commit-graph files.

Helped-by: Jakub Narebski &lt;jnareb@gmail.com&gt;
Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jt/commit-graph-per-object-store'</title>
<updated>2018-08-02T22:30:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:47Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=78a72ad4f8fa91adc876b2fc4b18fd370e43136d'/>
<id>urn:sha1:78a72ad4f8fa91adc876b2fc4b18fd370e43136d</id>
<content type='text'>
The singleton commit-graph in-core instance is made per in-core
repository instance.

* jt/commit-graph-per-object-store:
  commit-graph: add repo arg to graph readers
  commit-graph: store graph in struct object_store
  commit-graph: add free_commit_graph
  commit-graph: add missing forward declaration
  object-store: add missing include
  commit-graph: refactor preparing commit graph
</content>
</entry>
<entry>
<title>Merge branch 'sb/object-store-lookup'</title>
<updated>2018-08-02T22:30:42Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-08-02T22:30:42Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff'/>
<id>urn:sha1:3a2a1dc17077a27ad1a89db27cb1b4b374f3b0ff</id>
<content type='text'>
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.

* sb/object-store-lookup: (32 commits)
  commit.c: allow lookup_commit_reference to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: migrate the commit buffer to the parsed object store
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  object: allow object_as_type to handle arbitrary repositories
  tag: add repository argument to deref_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to lookup_tag
  ...
</content>
</entry>
</feed>
