diff options
| author | Archana R <aravind5@in.ibm.com> | 2022-11-03 10:02:04 -0500 |
|---|---|---|
| committer | Lynn Boger <laboger@linux.vnet.ibm.com> | 2022-11-07 19:27:43 +0000 |
| commit | 92c7df116ecbd8f230b48f72eb44fa7de5d13233 (patch) | |
| tree | 970b99b7a5464b39d74d512153b67ad7717c1b85 /src/internal/bytealg | |
| parent | d2b5a6f332b011c75c17bfb99216cc51ac7a0b5f (diff) | |
| download | go-92c7df116ecbd8f230b48f72eb44fa7de5d13233.tar.xz | |
internal/bytealg: add PCALIGN to indexbodyp9 function on ppc64x
Adding PCALIGN in indexbodyp9 function shows
improvements in some SimonWaldherr benchmarks and one of the index
benchmarks on both Power9 and Power10
name old time/op new time/op delta
Contains 19.8ns ± 0% 15.6ns ± 0% -21.24%
ContainsNot 21.3ns ± 0% 18.9ns ± 0% -11.03%
ContainsBytes 19.1ns ± 0% 16.0ns ± 0% -16.54%
Index/10 17.3ns ± 0% 16.1ns ± 0% -7.30%
Index/32 59.6ns ± 0% 59.6ns ± 0% +0.12%
Index/4K 3.68µs ± 0% 3.68µs ± 0% ~
Index/4M 3.74ms ± 0% 3.74ms ± 0% -0.00%
Index/64M 59.8ms ± 0% 59.8ms ± 0% ~
Change-Id: I784e57e0b0f5bac143f57f3a32845219e43d47fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/447595
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Diffstat (limited to 'src/internal/bytealg')
| -rw-r--r-- | src/internal/bytealg/index_ppc64x.s | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/src/internal/bytealg/index_ppc64x.s b/src/internal/bytealg/index_ppc64x.s index 735159cd8e..26205cebaf 100644 --- a/src/internal/bytealg/index_ppc64x.s +++ b/src/internal/bytealg/index_ppc64x.s @@ -660,7 +660,7 @@ index2to16: BGT index2to16tail MOVD $3, R17 // Number of bytes beyond 16 - + PCALIGN $32 index2to16loop: LXVB16X (R7)(R0), V1 // Load next 16 bytes of string into V1 from R7 LXVB16X (R7)(R17), V5 // Load next 16 bytes of string into V5 from R7+3 @@ -738,7 +738,7 @@ short: MTVSRD R10, V8 // Set up shift VSLDOI $8, V8, V8, V8 VSLO V1, V8, V1 // Shift by start byte - + PCALIGN $32 index2to16next: VAND V1, SEPMASK, V2 // Just compare size of sep VCMPEQUBCC V0, V2, V3 // Compare sep and partial string |
