diff options
| author | Joel Sing <joel@sing.id.au> | 2022-09-17 16:55:57 +1000 |
|---|---|---|
| committer | Joel Sing <joel@sing.id.au> | 2022-09-19 18:59:50 +0000 |
| commit | fd82718e06a7b8a32becb1751592854d49904075 (patch) | |
| tree | bae05ec1b21f4ea2febeaa9f220ac1acc317a419 /src/internal/bytealg | |
| parent | ceffdc8545c3155b030de9e91d399dc34bd3c678 (diff) | |
| download | go-fd82718e06a7b8a32becb1751592854d49904075.tar.xz | |
internal/bytealg: correct alignment checks for compare/memequal on riscv64
On riscv64 we need 8 byte alignment for 8 byte loads - the existing check
was only ensuring 4 byte alignment, which potentially results in unaligned
loads being performed. Unaligned loads incur a significant performance penality
due to the resulting kernel traps and fix ups.
Adjust BenchmarkCompareBytesBigUnaligned so that this issue would have been
more readily visible.
Updates #50615
name old time/op new time/op delta
CompareBytesBigUnaligned/offset=1-4 6.98ms _ 5% 6.84ms _ 3% ~ (p=0.319 n=5+5)
CompareBytesBigUnaligned/offset=2-4 6.75ms _ 1% 6.99ms _ 4% ~ (p=0.063 n=5+5)
CompareBytesBigUnaligned/offset=3-4 6.84ms _ 1% 6.74ms _ 1% -1.48% (p=0.003 n=5+5)
CompareBytesBigUnaligned/offset=4-4 146ms _ 1% 7ms _ 6% -95.08% (p=0.000 n=5+5)
CompareBytesBigUnaligned/offset=5-4 7.05ms _ 5% 6.75ms _ 1% ~ (p=0.079 n=5+5)
CompareBytesBigUnaligned/offset=6-4 7.11ms _ 5% 6.89ms _ 5% ~ (p=0.177 n=5+5)
CompareBytesBigUnaligned/offset=7-4 7.14ms _ 5% 6.91ms _ 6% ~ (p=0.165 n=5+5)
name old speed new speed delta
CompareBytesBigUnaligned/offset=1-4 150MB/s _ 5% 153MB/s _ 3% ~ (p=0.336 n=5+5)
CompareBytesBigUnaligned/offset=2-4 155MB/s _ 1% 150MB/s _ 4% ~ (p=0.058 n=5+5)
CompareBytesBigUnaligned/offset=3-4 153MB/s _ 1% 156MB/s _ 1% +1.51% (p=0.004 n=5+5)
CompareBytesBigUnaligned/offset=4-4 7.16MB/s _ 1% 145.79MB/s _ 6% +1936.23% (p=0.000 n=5+5)
CompareBytesBigUnaligned/offset=5-4 149MB/s _ 5% 155MB/s _ 1% ~ (p=0.078 n=5+5)
CompareBytesBigUnaligned/offset=6-4 148MB/s _ 5% 152MB/s _ 5% ~ (p=0.175 n=5+5)
CompareBytesBigUnaligned/offset=7-4 147MB/s _ 5% 152MB/s _ 6% ~ (p=0.160 n=5+5)
Change-Id: I2c859e061919db482318ce63b85b808aa973a9ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/431099
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Diffstat (limited to 'src/internal/bytealg')
| -rw-r--r-- | src/internal/bytealg/compare_riscv64.s | 4 | ||||
| -rw-r--r-- | src/internal/bytealg/equal_riscv64.s | 4 |
2 files changed, 4 insertions, 4 deletions
diff --git a/src/internal/bytealg/compare_riscv64.s b/src/internal/bytealg/compare_riscv64.s index 7d2f8d6d0b..e616577d53 100644 --- a/src/internal/bytealg/compare_riscv64.s +++ b/src/internal/bytealg/compare_riscv64.s @@ -58,8 +58,8 @@ use_a_len: BLT X5, X6, loop4_check // Check alignment - if alignment differs we have to do one byte at a time. - AND $3, X10, X7 - AND $3, X12, X8 + AND $7, X10, X7 + AND $7, X12, X8 BNE X7, X8, loop4_check BEQZ X7, loop32_check diff --git a/src/internal/bytealg/equal_riscv64.s b/src/internal/bytealg/equal_riscv64.s index 77202d6075..1e070beb3e 100644 --- a/src/internal/bytealg/equal_riscv64.s +++ b/src/internal/bytealg/equal_riscv64.s @@ -42,8 +42,8 @@ TEXT memequal<>(SB),NOSPLIT|NOFRAME,$0 BLT X12, X23, loop4_check // Check alignment - if alignment differs we have to do one byte at a time. - AND $3, X10, X9 - AND $3, X11, X19 + AND $7, X10, X9 + AND $7, X11, X19 BNE X9, X19, loop4_check BEQZ X9, loop32_check |
