path: root/src/runtime/sys_linux_ppc64x.s
author: Lynn Boger <laboger@linux.vnet.ibm.com> 2018-08-27 17:15:39 -0400
committer: Lynn Boger <laboger@linux.vnet.ibm.com> 2018-10-23 19:22:44 +0000
commit: 6994731ec2babb6a4f2bbbb08dbe649767a25942
tree: e0237884a1aa11ccbb97671db3e6ccd95accabc1 /src/runtime/sys_linux_ppc64x.s
parent: 5c472132bf88cc04c85ad5f848d8a2f77f21b228
internal/bytealg: improve asm for memequal on ppc64x

This includes two changes to the memequal function.

Previously the asm implementation on ppc64x for Equal called the internal
function memequal using a BL, whereas the other asm implementations for
bytes functions on ppc64x used BR. The BR is preferred because the BL
causes the calling function to stack a frame. This changes Equal so it
uses BR and is consistent with the others.

This also uses vsx instructions where possible to improve performance
of the compares for sizes over 32.

Here are results from the sizes affected:

Equal/32     8.40ns ± 0%   7.66ns ± 0%    -8.81%  (p=0.029 n=4+4)
Equal/4K      193ns ± 0%    144ns ± 0%   -25.39%  (p=0.029 n=4+4)
Equal/4M      346µs ± 0%    277µs ± 0%   -20.08%  (p=0.029 n=4+4)
Equal/64M    7.66ms ± 1%   7.27ms ± 0%    -5.10%  (p=0.029 n=4+4)

Change-Id: Ib6ee2cdc3e5d146e2705e3338858b8e965d25420
Reviewed-on: https://go-review.googlesource.com/c/143060
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Diffstat (limited to 'src/runtime/sys_linux_ppc64x.s')
0 files changed, 0 insertions, 0 deletions