diff options
| author | erifan01 <eric.fang@arm.com> | 2019-01-02 09:14:26 +0000 |
|---|---|---|
| committer | Cherry Zhang <cherryyz@google.com> | 2019-02-27 16:09:33 +0000 |
| commit | dd91269b7c470d07ba0efe1abab85011f41e38bc (patch) | |
| tree | b2fc95755a581c1070cfcfb39011a0cfe148ef1e /test/codegen/mathbits.go | |
| parent | f40cb19affbb3be090b3519f957b5198744022be (diff) | |
| download | go-dd91269b7c470d07ba0efe1abab85011f41e38bc.tar.xz | |
cmd/compile: optimize math/bits Len32 intrinsic on arm64
Arm64 has a 32-bit CLZ instruction CLZW, which can be used for intrinsic Len32.
Function LeadingZeros32 calls Len32, with this change, the assembly code of
LeadingZeros32 becomes more concise.
Go code:
func f32(x uint32) { z = bits.LeadingZeros32(x) }
Before:
"".f32 STEXT size=32 args=0x8 locals=0x0 leaf
0x0000 00000 (test.go:7) TEXT "".f32(SB), LEAF|NOFRAME|ABIInternal, $0-8
0x0004 00004 (test.go:7) MOVWU "".x(FP), R0
0x0008 00008 ($GOROOT/src/math/bits/bits.go:30) CLZ R0, R0
0x000c 00012 ($GOROOT/src/math/bits/bits.go:30) SUB $32, R0, R0
0x0010 00016 (test.go:7) MOVD R0, "".z(SB)
0x001c 00028 (test.go:7) RET (R30)
After:
"".f32 STEXT size=32 args=0x8 locals=0x0 leaf
0x0000 00000 (test.go:7) TEXT "".f32(SB), LEAF|NOFRAME|ABIInternal, $0-8
0x0004 00004 (test.go:7) MOVWU "".x(FP), R0
0x0008 00008 ($GOROOT/src/math/bits/bits.go:30) CLZW R0, R0
0x000c 00012 (test.go:7) MOVD R0, "".z(SB)
0x0018 00024 (test.go:7) RET (R30)
Benchmarks:
name old time/op new time/op delta
LeadingZeros-8 2.53ns ± 0% 2.55ns ± 0% +0.67% (p=0.000 n=10+10)
LeadingZeros8-8 3.56ns ± 0% 3.56ns ± 0% ~ (all equal)
LeadingZeros16-8 3.55ns ± 0% 3.56ns ± 0% ~ (p=0.465 n=10+10)
LeadingZeros32-8 3.55ns ± 0% 2.96ns ± 0% -16.71% (p=0.000 n=10+7)
LeadingZeros64-8 2.53ns ± 0% 2.54ns ± 0% ~ (p=0.059 n=8+10)
Change-Id: Ie5666bb82909e341060e02ffd4e86c0e5d67e90a
Reviewed-on: https://go-review.googlesource.com/c/157000
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Diffstat (limited to 'test/codegen/mathbits.go')
| -rw-r--r-- | test/codegen/mathbits.go | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/test/codegen/mathbits.go b/test/codegen/mathbits.go index 44ab2c02b7..d8b1775b0f 100644 --- a/test/codegen/mathbits.go +++ b/test/codegen/mathbits.go @@ -31,7 +31,7 @@ func LeadingZeros64(n uint64) int { func LeadingZeros32(n uint32) int { // amd64:"BSRQ","LEAQ",-"CMOVQEQ" // s390x:"FLOGR" - // arm:"CLZ" arm64:"CLZ" + // arm:"CLZ" arm64:"CLZW" // mips:"CLZ" return bits.LeadingZeros32(n) } |
