| author | Lynn Boger <laboger@linux.vnet.ibm.com> | 2017-04-18 17:05:31 -0400 |
|---|---|---|
| committer | Lynn Boger <laboger@linux.vnet.ibm.com> | 2017-04-20 18:05:22 +0000 |
| commit | 9248ff46a82aec26164a775f3eba43e3fcfb5651 (patch) | |
| tree | 517b37839d64a13cc077910b9c3e14d74ae178a8 /src/cmd/asm | |
| parent | 865b50c982a2c8b2a790772c6777a53c3f268bab (diff) | |
cmd/compile: add rotates to PPC64.rules
This updates PPC64.rules to generate rotates for ADD, OR, and XOR
operations that combine two opposite shifts of the same value whose
shift counts sum to 32 or 64.
To support this change, the opcodes ROTL and ROTLW were added; they are
used like the rotldi and rotlwi extended mnemonics.
This provides the following improvement in sha3:
benchmark                          old MB/s     new MB/s     speedup
BenchmarkPermutationFunction-8     302.83       376.40       1.24x
BenchmarkSha3_512_MTU-8            98.64        121.92       1.24x
BenchmarkSha3_384_MTU-8            136.80       168.30       1.23x
BenchmarkSha3_256_MTU-8            169.21       211.29       1.25x
BenchmarkSha3_224_MTU-8            179.76       221.19       1.23x
BenchmarkShake128_MTU-8            212.87       263.23       1.24x
BenchmarkShake256_MTU-8            196.62       245.60       1.25x
BenchmarkShake256_16x-8            163.57       194.37       1.19x
BenchmarkShake256_1MiB-8           199.02       248.74       1.25x
BenchmarkSha3_512_1MiB-8           106.55       133.13       1.25x
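The last column of the results above is the second figure divided by the first, rounded to two decimals. A small sketch that recomputes it for three of the rows; the `speedup` helper and the `row` type are hypothetical names introduced for this example only.

```go
package main

import "fmt"

// row holds one benchmark line: name, figure before, figure after.
type row struct {
	name             string
	oldRate, newRate float64
}

// speedup formats newRate/oldRate as a two-decimal ratio such as "1.24x".
func speedup(oldRate, newRate float64) string {
	return fmt.Sprintf("%.2fx", newRate/oldRate)
}

func main() {
	rows := []row{
		{"BenchmarkPermutationFunction-8", 302.83, 376.40}, // 1.24x
		{"BenchmarkShake256_16x-8", 163.57, 194.37},        // 1.19x
		{"BenchmarkSha3_512_1MiB-8", 106.55, 133.13},       // 1.25x
	}
	for _, r := range rows {
		fmt.Printf("%-32s %s\n", r.name, speedup(r.oldRate, r.newRate))
	}
}
```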
Fixes #20030
Change-Id: I484c56f48395d32f53ff3ecb3ac6cb8191cfee44
Reviewed-on: https://go-review.googlesource.com/40992
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Diffstat (limited to 'src/cmd/asm')
| -rw-r--r-- | src/cmd/asm/internal/asm/testdata/ppc64.s | 20 |
1 files changed, 20 insertions, 0 deletions
diff --git a/src/cmd/asm/internal/asm/testdata/ppc64.s b/src/cmd/asm/internal/asm/testdata/ppc64.s
index e266593050..30fb0f2c02 100644
--- a/src/cmd/asm/internal/asm/testdata/ppc64.s
+++ b/src/cmd/asm/internal/asm/testdata/ppc64.s
@@ -582,6 +582,15 @@ label1:
 	CMPB	R2,R2,R1
 
 //
+// rotate extended mnemonics map onto other shift instructions
+//
+
+	ROTL	$12,R2,R3
+	ROTL	R2,R3,R4
+	ROTLW	$9,R2,R3
+	ROTLW	R2,R3,R4
+
+//
 // rotate and mask
 //
 // LRLWM imm ',' rreg ',' imm ',' rreg
@@ -617,6 +626,17 @@ label1:
 
 	RLDIMI	$7, R2, $52, R7
 
+// opcodes for right and left shifts, const and reg shift counts
+
+	SLD	$4, R3, R4
+	SLD	R2, R3, R4
+	SLW	$4, R3, R4
+	SLW	R2, R3, R4
+	SRD	$8, R3, R4
+	SRD	R2, R3, R4
+	SRW	$8, R3, R4
+	SRW	R2, R3, R4
+
 //
 // load/store multiple
 //
