runtime: short path for equal pointers in arm64 memequal

If memequal is invoked with the same pointers as arguments it ends up comparing the whole memory contents, instead of just comparing the pointers. This effectively makes an operation that could be O(1) into O(n). All the other architectures already have this optimization in place. For instance, arm64 also have it, in memequal_varlen. Such optimization is very specific, one case that it will probably benefit is programs that rely heavily on interning of strings. goos: darwin goarch: arm64 pkg: bytes │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ Equal/same/1-8 2.678n ± ∞ ¹ 2.400n ± ∞ ¹ -10.38% (p=0.008 n=5) Equal/same/6-8 3.267n ± ∞ ¹ 2.431n ± ∞ ¹ -25.59% (p=0.008 n=5) Equal/same/9-8 2.981n ± ∞ ¹ 2.385n ± ∞ ¹ -19.99% (p=0.008 n=5) Equal/same/15-8 2.974n ± ∞ ¹ 2.390n ± ∞ ¹ -19.64% (p=0.008 n=5) Equal/same/16-8 2.983n ± ∞ ¹ 2.380n ± ∞ ¹ -20.21% (p=0.008 n=5) Equal/same/20-8 3.567n ± ∞ ¹ 2.384n ± ∞ ¹ -33.17% (p=0.008 n=5) Equal/same/32-8 3.568n ± ∞ ¹ 2.385n ± ∞ ¹ -33.16% (p=0.008 n=5) Equal/same/4K-8 78.040n ± ∞ ¹ 2.378n ± ∞ ¹ -96.95% (p=0.008 n=5) Equal/same/4M-8 78713.000n ± ∞ ¹ 2.385n ± ∞ ¹ -100.00% (p=0.008 n=5) Equal/same/64M-8 1348095.000n ± ∞ ¹ 2.381n ± ∞ ¹ -100.00% (p=0.008 n=5) geomean 43.52n 2.390n -94.51% ¹ need >= 6 samples for confidence interval at level 0.95 │ old.txt │ new.txt │ │ B/s │ B/s vs base │ Equal/same/1-8 356.1Mi ± ∞ ¹ 397.3Mi ± ∞ ¹ +11.57% (p=0.008 n=5) Equal/same/6-8 1.711Gi ± ∞ ¹ 2.298Gi ± ∞ ¹ +34.35% (p=0.008 n=5) Equal/same/9-8 2.812Gi ± ∞ ¹ 3.515Gi ± ∞ ¹ +24.99% (p=0.008 n=5) Equal/same/15-8 4.698Gi ± ∞ ¹ 5.844Gi ± ∞ ¹ +24.41% (p=0.008 n=5) Equal/same/16-8 4.995Gi ± ∞ ¹ 6.260Gi ± ∞ ¹ +25.34% (p=0.008 n=5) Equal/same/20-8 5.222Gi ± ∞ ¹ 7.814Gi ± ∞ ¹ +49.63% (p=0.008 n=5) Equal/same/32-8 8.353Gi ± ∞ ¹ 12.496Gi ± ∞ ¹ +49.59% (p=0.008 n=5) Equal/same/4K-8 48.88Gi ± ∞ ¹ 1603.96Gi ± ∞ ¹ +3181.17% (p=0.008 n=5) Equal/same/4M-8 49.63Gi ± ∞ ¹ 1637911.85Gi ± ∞ ¹ +3300381.91% (p=0.008 n=5) Equal/same/64M-8 46.36Gi ± ∞ ¹ 26253069.97Gi ± ∞ ¹ +56626517.99% (p=0.008 n=5) geomean 6.737Gi 122.7Gi +1721.01% ¹ need >= 6 samples for confidence interval at level 0.95 Fixes #64381 Change-Id: I7d423930a688edd88c4ba60d45e097296d9be852 GitHub-Last-Rev: ae8189fafb1cba87b5394f09f971746ae9299273 GitHub-Pull-Request: golang/go#64419 Reviewed-on: https://go-review.googlesource.com/c/go/+/545416 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com>
author: Mauri de Souza Meneguzzo <mauri870@gmail.com> 2023-11-28 18:42:43 +0000
committer: Cherry Mui <cherryyz@google.com> 2024-01-24 16:07:25 +0000
commit: 66b776b0250dd980d8f6aac264b5e3443ec465dc (patch)
tree: bcd8d35ef717a64ba0d9b370eba7ae18cc0b0870
parent: 8b23b7b04234424791e26b8d2d26f61ef1311a9f (diff)
download: go-66b776b0250dd980d8f6aac264b5e3443ec465dc.tar.xz
2 files changed, 8 insertions, 0 deletions
diff --git a/src/bytes/bytes_test.go b/src/bytes/bytes_test.go
index f0733edd3f..5e8cf85fd9 100644
--- a/src/bytes/bytes_test.go
+++ b/src/bytes/bytes_test.go
@@ -629,6 +629,11 @@ func BenchmarkEqual(b *testing.B) {
 	})
 
 	sizes := []int{1, 6, 9, 15, 16, 20, 32, 4 << 10, 4 << 20, 64 << 20}
+
+	b.Run("same", func(b *testing.B) {
+		benchBytes(b, sizes, bmEqual(func(a, b []byte) bool { return Equal(a, a) }))
+	})
+
 	benchBytes(b, sizes, bmEqual(Equal))
 }
 
diff --git a/src/internal/bytealg/equal_arm64.s b/src/internal/bytealg/equal_arm64.s
index d3aabba587..4db9515474 100644
--- a/src/internal/bytealg/equal_arm64.s
+++ b/src/internal/bytealg/equal_arm64.s
@@ -9,6 +9,9 @@
 TEXT runtime·memequal<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-25
 	// short path to handle 0-byte case
 	CBZ	R2, equal
+	// short path to handle equal pointers
+	CMP	R0, R1
+	BEQ	equal
 	B	memeqbody<>(SB)
 equal:
 	MOVD	$1, R0
author	Mauri de Souza Meneguzzo <mauri870@gmail.com>	2023-11-28 18:42:43 +0000
committer	Cherry Mui <cherryyz@google.com>	2024-01-24 16:07:25 +0000
commit	66b776b0250dd980d8f6aac264b5e3443ec465dc (patch)
tree	bcd8d35ef717a64ba0d9b370eba7ae18cc0b0870
parent	8b23b7b04234424791e26b8d2d26f61ef1311a9f (diff)
download	go-66b776b0250dd980d8f6aac264b5e3443ec465dc.tar.xz