<feed xmlns='http://www.w3.org/2005/Atom'>
<title>git/diffcore.h, branch gitk-resize-error</title>
<subtitle>Fork of git SCM with my patches.</subtitle>
<id>http://git.kilabit.info/git/atom?h=gitk-resize-error</id>
<link rel='self' href='http://git.kilabit.info/git/atom?h=gitk-resize-error'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/'/>
<updated>2021-07-30T16:01:19Z</updated>
<entry>
<title>merge-ort: store filepairs and filespecs in our mem_pool</title>
<updated>2021-07-30T16:01:19Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-07-30T11:47:42Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=f239fff4c183b231134115bc3b38b4f8e3982d3e'/>
<id>urn:sha1:f239fff4c183b231134115bc3b38b4f8e3982d3e</id>
<content type='text'>
For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       198.1 ms ±  2.6 ms     198.5 ms ±  3.4 ms
    mega-renames:     715.8 ms ±  4.0 ms     679.1 ms ±  5.6 ms
    just-one-mega:    276.8 ms ±  4.2 ms     271.9 ms ±  2.8 ms

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore-rename, merge-ort: add wrapper functions for filepair alloc/dealloc</title>
<updated>2021-07-30T16:01:19Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-07-30T11:47:41Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=a8791ef6492bf702ba70352009cbd28fb581c6e2'/>
<id>urn:sha1:a8791ef6492bf702ba70352009cbd28fb581c6e2</id>
<content type='text'>
We want to be able to allocate filespecs and filepairs using a mem_pool.
However, filespec data will still remain outside the pool (perhaps in
the future we could plumb the pool through the various diff APIs to
allocate the filespec data too, but for now we are limiting the scope).
Add some extra functions to allocate these appropriately based on the
non-NULL-ness of opt-&gt;priv-&gt;pool, as well as some extra functions to
handle correctly deallocating the relevant parts of them.  A future
commit will make use of these new functions.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/ort-perf-batch-11'</title>
<updated>2021-06-14T04:33:27Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-06-14T04:33:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=169914ede2aa205c5500c7be0501889a8962dc24'/>
<id>urn:sha1:169914ede2aa205c5500c7be0501889a8962dc24</id>
<content type='text'>
Optimize out repeated rename detection in a sequence of mergy
operations.

* en/ort-perf-batch-11:
  merge-ort, diffcore-rename: employ cached renames when possible
  merge-ort: handle interactions of caching and rename/rename(1to1) cases
  merge-ort: add helper functions for using cached renames
  merge-ort: preserve cached renames for the appropriate side
  merge-ort: avoid accidental API mis-use
  merge-ort: add code to check for whether cached renames can be reused
  merge-ort: populate caches of rename detection results
  merge-ort: add data structures for in-memory caching of rename detection
  t6429: testcases for remembering renames
  fast-rebase: write conflict state to working tree, index, and HEAD
  fast-rebase: change assert() to BUG()
  Documentation/technical: describe remembering renames optimization
  t6423: rename file within directory that other side renamed
</content>
</entry>
<entry>
<title>merge-ort, diffcore-rename: employ cached renames when possible</title>
<updated>2021-05-20T06:40:39Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-05-20T06:09:41Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=25e65b6dd52c987056f1cac00fe6073fbf8ea237'/>
<id>urn:sha1:25e65b6dd52c987056f1cac00fe6073fbf8ea237</id>
<content type='text'>
When there are many renames between the old base of a series of commits
and the new base, the way sequencer.c, merge-recursive.c, and
diffcore-rename.c have traditionally split the work resulted in
redetecting the same renames with each and every commit being
transplanted.  To address this, the last several commits have been
creating a cache of rename detection results, determining when it was
safe to use such a cache in subsequent merge operations, adding helper
functions, and so on.  See the previous half dozen commit messages for
additional discussion of this optimization, particularly the message a
few commits ago entitled "add code to check for whether cached renames
can be reused".  This commit finally ties all of that work together,
modifying the merge algorithm to make use of these cached renames.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:        5.665 s ±  0.129 s     5.622 s ±  0.059 s
    mega-renames:     11.435 s ±  0.158 s    10.127 s ±  0.073 s
    just-one-mega:   494.2  ms ±  6.1  ms   500.3  ms ±  3.8  ms

That's a fairly small improvement, but mostly because the previous
optimizations were so effective for these particular testcases; this
optimization only kicks in when the others don't.  If we undid the
basename-guided rename detection and skip-irrelevant-renames
optimizations, then we'd see that this series by itself improved
performance as follows:

                   Before Basename Series   After Just This Series
    no-renames:      13.815 s ±  0.062 s      5.697 s ±  0.080 s
    mega-renames:  1799.937 s ±  0.493 s    205.709 s ±  0.457 s

Since this optimization kicks in to help accelerate cases where the
previous optimizations do not apply, this last comparison shows that
this cached-renames optimization has the potential to help signficantly
in cases that don't meet the requirements for the other optimizations to
be effective.

The changes made in this optimization also lay some important groundwork
for a future optimization around having collect_merge_info() avoid
recursing into subtrees in more cases.

However, for this optimization to be effective, merge_switch_to_result()
should only be called when the rebase or cherry-pick operation has
either completed or hit a case where the user needs to resolve a
conflict or edit the result.  If it is called after every commit, as
sequencer.c does, then the working tree and index are needlessly updated
with every commit and the cached metadata is tossed, defeating this
optimization.  Refactoring sequencer.c to only call
merge_switch_to_result() at the end of the operation is a bigger
undertaking, and the practical benefits of this optimization will not be
realized until that work is performed.  Since `test-tool fast-rebase`
only updates at the end of the operation, it was used to obtain the
timings above.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'en/ort-perf-batch-10'</title>
<updated>2021-04-16T20:53:33Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-04-16T20:53:33Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=e2e1a03f6b9c0869e64710c94fc43df82a06f0c8'/>
<id>urn:sha1:e2e1a03f6b9c0869e64710c94fc43df82a06f0c8</id>
<content type='text'>
Various rename detection optimization to help "ort" merge strategy
backend.

* en/ort-perf-batch-10:
  diffcore-rename: determine which relevant_sources are no longer relevant
  merge-ort: record the reason that we want a rename for a file
  diffcore-rename: add computation of number of unknown renames
  diffcore-rename: check if we have enough renames for directories early on
  diffcore-rename: only compute dir_rename_count for relevant directories
  merge-ort: record the reason that we want a rename for a directory
  merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
  diffcore-rename: take advantage of "majority rules" to skip more renames
</content>
</entry>
<entry>
<title>Merge branch 'en/ort-perf-batch-9'</title>
<updated>2021-04-08T20:23:26Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-04-08T20:23:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=1b31224e59750f515f7ceb7adab2a7609371327d'/>
<id>urn:sha1:1b31224e59750f515f7ceb7adab2a7609371327d</id>
<content type='text'>
The ort merge backend has been optimized by skipping irrelevant
renames.

* en/ort-perf-batch-9:
  diffcore-rename: avoid doing basename comparisons for irrelevant sources
  merge-ort: skip rename detection entirely if possible
  merge-ort: use relevant_sources to filter possible rename sources
  merge-ort: precompute whether directory rename detection is needed
  merge-ort: introduce wrappers for alternate tree traversal
  merge-ort: add data structures for an alternate tree traversal
  merge-ort: precompute subset of sources for which we need rename detection
  diffcore-rename: enable filtering possible rename sources
</content>
</entry>
<entry>
<title>Merge branch 'en/ort-perf-batch-8'</title>
<updated>2021-03-22T21:00:24Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2021-03-22T21:00:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=dd4048d1c76f067ad7b339dfb3a5f4765f5ae979'/>
<id>urn:sha1:dd4048d1c76f067ad7b339dfb3a5f4765f5ae979</id>
<content type='text'>
Rename detection rework continues.

* en/ort-perf-batch-8:
  diffcore-rename: compute dir_rename_guess from dir_rename_counts
  diffcore-rename: limit dir_rename_counts computation to relevant dirs
  diffcore-rename: compute dir_rename_counts in stages
  diffcore-rename: extend cleanup_dir_rename_info()
  diffcore-rename: move dir_rename_counts into dir_rename_info struct
  diffcore-rename: add function for clearing dir_rename_count
  Move computation of dir_rename_count from merge-ort to diffcore-rename
  diffcore-rename: add a mapping of destination names to their indices
  diffcore-rename: provide basic implementation of idx_possible_rename()
  diffcore-rename: use directory rename guided basename comparisons
</content>
</entry>
<entry>
<title>merge-ort: record the reason that we want a rename for a file</title>
<updated>2021-03-18T21:32:56Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-03-13T22:22:07Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=ec59da6015c457e84953a802fc6484ae2da2d774'/>
<id>urn:sha1:ec59da6015c457e84953a802fc6484ae2da2d774</id>
<content type='text'>
There are two different reasons we might want a rename for a file -- for
three-way content merging or as part of directory rename detection.
Record the reason.  diffcore-rename will potentially be able to filter
some of the ones marked as needed only for directory rename detection,
if it can determine those directory renames based solely on renames
found via exact rename detection and basename-guided rename detection.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>merge-ort: record the reason that we want a rename for a directory</title>
<updated>2021-03-18T21:32:55Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-03-13T22:22:03Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=fb52938eec1cf6ec3169152362fe774849f5ac9b'/>
<id>urn:sha1:fb52938eec1cf6ec3169152362fe774849f5ac9b</id>
<content type='text'>
When one side of history renames a directory, and the other side of
history added files to the old directory, directory rename detection is
used to warn about the location of the added files so the user can
move them to the old directory or keep them with the new one.

This sets up three different types of directories:
  * directories that had new files added to them
  * directories underneath a directory that had new files added to them
  * directories where no new files were added to it or any leading path

Save this information in dirs_removed; the next several commits will
make use of this information.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type</title>
<updated>2021-03-18T21:32:55Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2021-03-13T22:22:02Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/git/commit/?id=a49b55d52e7dcfd628b6328c9098d734ebe7a97d'/>
<id>urn:sha1:a49b55d52e7dcfd628b6328c9098d734ebe7a97d</id>
<content type='text'>
As noted in the previous commit, we want to be able to take advantage of
the "majority rules" portion of directory rename detection to avoid
detecting more renames than necessary.  However, for diffcore-rename to
take advantage of that, it needs to know whether a rename source file
was needed for just directory rename detection reasons, or if it is
wanted for potential three-way content merging.  Modify relevant_sources
from a strset to a strintmap, so we can encode additional information.

We also modify dirs_removed from a strset to a strintmap at the same
time because trying to determine what files are needed for directory
rename detection will require us tracking a bit more information for
each directory.

This commit only changes the types of the two variables from strset to
strintmap; it does not actually store any special values yet and for now
only checks for presence of entries in the strintmap.  Thus, the code is
functionally identical to how it behaved before.  Future commits will
start associating values with each key for these two maps.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
