xdiff: split xrecord_t.ha into line_hash and minimal_perfect_hash

The ha field is serving two different purposes, which makes the code
harder to read. At first glance, it looks like many places assume
there could never be hash collisions between lines of the two input
files. In reality, line_hash is used together with xdl_recmatch() to
ensure correct comparisons of lines, even when collisions occur.

To make this clearer, the old ha field has been split:
  * line_hash: a straightforward hash of a line, independent of any
    external context. Its type is uint64_t, as it comes from a fixed
    width hash function.
  * minimal_perfect_hash: Not a new concept, but now a separate
    field. It comes from the classifier's general-purpose hash table,
    which assigns each line a unique and minimal hash across the two
    files. A size_t is used here because it's meant to be used to
    index an array. This also avoids ` as usize` casts on the Rust
    side when using it to index a slice.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
author: Ezekiel Newren <ezekielnewren@gmail.com> 2025-11-18 22:34:18 +0000
committer: Junio C Hamano <gitster@pobox.com> 2025-11-18 14:53:10 -0800
commit: 6a26019c81faa07ba811541b4cf35be9e8ee1ead (patch)
tree: 1f1c230dd72da18f65389c727faaa6c0d4f9bbc5 /xdiff/xtypes.h
parent: b0d4ae30f5a23fa9da87e9396b78e6442b351ddc (diff)
download: git-6a26019c81faa07ba811541b4cf35be9e8ee1ead.tar.xz
1 files changed, 2 insertions, 1 deletions
diff --git a/xdiff/xtypes.h b/xdiff/xtypes.h
index 354349b523..d4e9cd2e76 100644
--- a/xdiff/xtypes.h
+++ b/xdiff/xtypes.h
@@ -41,7 +41,8 @@ typedef struct s_chastore {
 typedef struct s_xrecord {
 	uint8_t const *ptr;
 	size_t size;
-	unsigned long ha;
+	uint64_t line_hash;
+	size_t minimal_perfect_hash;
 } xrecord_t;
 
 typedef struct s_xdfile {
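
The relationship between the two fields described in the commit message can be sketched roughly as follows. This is an illustration only, not xdiff's code: `demo_record_t`, `demo_line_hash()` (FNV-1a), `demo_recmatch()` (exact byte comparison standing in for xdl_recmatch()), and `demo_classify()` (a linear scan standing in for the classifier's hash table) are all hypothetical names. It assumes the records of both input files have been placed in one array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the two new xrecord_t fields; the type itself is hypothetical. */
typedef struct {
	uint8_t const *ptr;
	size_t size;
	uint64_t line_hash;          /* content hash; collisions are possible */
	size_t minimal_perfect_hash; /* dense id; unique per distinct line */
} demo_record_t;

/* Hypothetical stand-in hash (FNV-1a), not the hash xdiff uses. */
static uint64_t demo_line_hash(uint8_t const *p, size_t n)
{
	uint64_t h = 14695981039346656037ULL;
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* Hypothetical stand-in for xdl_recmatch(): exact byte comparison. */
static int demo_recmatch(demo_record_t const *a, demo_record_t const *b)
{
	return a->size == b->size && !memcmp(a->ptr, b->ptr, a->size);
}

/*
 * Toy classifier. Fills in line_hash for every record, then gives each
 * distinct line the next unused id (0, 1, 2, ...). A hash match alone is
 * not trusted: the contents are compared as well, so a collision cannot
 * merge two different lines. Returns the number of distinct lines.
 */
static size_t demo_classify(demo_record_t *recs, size_t n)
{
	size_t next_id = 0;

	assert(recs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		size_t j;

		recs[i].line_hash = demo_line_hash(recs[i].ptr, recs[i].size);
		for (j = 0; j < i; j++) {
			if (recs[j].line_hash == recs[i].line_hash &&
			    demo_recmatch(&recs[j], &recs[i]))
				break;
		}
		recs[i].minimal_perfect_hash =
			j < i ? recs[j].minimal_perfect_hash : next_id++;
	}
	return next_id;
}
```

Because every id is smaller than the value returned by `demo_classify()`, `minimal_perfect_hash` can index an array of that length directly, which is the reason the commit message gives for choosing size_t.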