| Age | Commit message (Collapse) | Author |
|
This helps simplify the noise when adding ppc codegen tests. ppc64x
is used in other places to indicate something which runs on either
endian.
This helps cleanup existing codegen tests which are mostly
identical between endian variants.
condmove tests are converted as an example.
Change-Id: I2b2d98a9a1859015f62db38d62d9d5d7593435b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/462895
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
|
|
Add a new SSA opcode ISELZ, similar to ISELB to represent a select
of value or 0. Then, merge candidate ISEL opcodes inside the late
lower pass.
This avoids complicating rules within the the lower pass.
Change-Id: I3b14c94b763863aadc834b0e910a85870c131313
Reviewed-on: https://go-review.googlesource.com/c/go/+/442596
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
|
|
Change-Id: I4fd1c307901c265ab9865bf8a74460ddc15e5d14
Reviewed-on: https://go-review.googlesource.com/c/go/+/416735
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Auto-Submit: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
|
|
ISEL is roughly equivalent to CMOV on PPC64. Verify ISEL generation
in all reasonable cases.
Note "ISEL test x y z" is the same as "ISEL !test y x z". test is
always one of LT (0), GT (1), EQ (2), SO (3). Sometimes x and y are
swapped if GE/LE/NE is desired.
Change-Id: Ie1bf029224064e004d855099731fe5e8d05aa990
Reviewed-on: https://go-review.googlesource.com/c/go/+/445215
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
This CL adds rewrite rules for CSETM, CSINC, CSINV, and CSNEG. By adding
these rules, we can save one instruction.
For example,
func test(cond bool, a int) int {
if cond {
a++
}
return a
}
Before:
MOVD "".a+8(RSP), R0
ADD $1, R0, R1
MOVBU "".cond(RSP), R2
CMPW $0, R2
CSEL NE, R1, R0, R0
After:
MOVBU "".cond(RSP), R0
CMPW $0, R0
MOVD "".a+8(RSP), R0
CSINC EQ, R0, R0, R0
This patch is a copy of CL 285694. Co-authored-by: JunchenLi
<junchen.li@arm.com>
Change-Id: Ic1a79e8b8ece409b533becfcb7950f11e7b76f24
Reviewed-on: https://go-review.googlesource.com/c/go/+/302231
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Current many architectures use a rule along the lines of
// Canonicalize the order of arguments to comparisons - helps with CSE.
((CMP|CMPW) x y) && x.ID > y.ID => (InvertFlags ((CMP|CMPW) y x))
to normalize comparisons as much as possible for CSE. Replace the
ID comparison with something less variable across compiler changes.
This helps avoid spurious failures in some of the codegen-comparison
tests (though the current choice of comparison is sensitive to Op
ordering).
Two tests changed to accommodate modified instruction choice.
Change-Id: Ib35f450bd2bae9d4f9f7838ceaf7ec682bcf1e1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/280155
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: Ife4e065246729319c39e57a4fbd8e6f7b37724e1
GitHub-Last-Rev: e71803eaeb366c00f6c156de0b0b2c50927a0e82
GitHub-Pull-Request: golang/go#38527
Reviewed-on: https://go-review.googlesource.com/c/go/+/228901
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
|
|
Ensure that any comparison between two values has the same argument
order. This helps ensure that they can be eliminated during the
lowered CSE pass which will be particularly important if we eliminate
the Greater and Geq ops (see #37316).
Example:
CMP R0, R1
BLT L1
CMP R1, R0 // different order, cannot eliminate
BEQ L2
CMP R0, R1
BLT L1
CMP R0, R1 // same order, can eliminate
BEQ L2
This does have some drawbacks. Notably comparisons might 'flip'
direction in the assembly output after even small changes to the
code or compiler. It should help make optimizations more reliable
however.
compilecmp master -> HEAD
master (218f4572f5): text/template: make reflect.Value indirections more robust
HEAD (f1661fef3e): cmd/compile: canonicalize comparison argument order
platform: linux/amd64
file before after Δ %
api 6063927 6068023 +4096 +0.068%
asm 5191757 5183565 -8192 -0.158%
cgo 4893518 4901710 +8192 +0.167%
cover 5330345 5326249 -4096 -0.077%
fix 3417778 3421874 +4096 +0.120%
pprof 14889456 14885360 -4096 -0.028%
test2json 2848138 2844042 -4096 -0.144%
trace 11746239 11733951 -12288 -0.105%
total 132739173 132722789 -16384 -0.012%
Change-Id: I11736b3fe2a4553f6fc65018f475e88217fa22f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/220425
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This CL performs the branchelim optimization on WASM with its
select instruction. And the total size of pkg/js_wasm decreased
about 80KB by this optimization.
Change-Id: I868eb146120a1cac5c4609c8e9ddb07e4da8a1d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/190957
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Richard Musiol <neelance@gmail.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
arm64 backend
Current compiler reverses operands to work around NaN in
"less than" and "less equal than" comparisons. But if we
want to use "FCMPD/FCMPS $(0.0), Fn" to do some optimization,
the workaround way does not work. Because assembler does
not support instruction "FCMPD/FCMPS Fn, $(0.0)".
This CL sets condition flags for floating-point comparisons
to resolve this problem.
Change-Id: Ia48076a1da95da64596d6e68304018cb301ebe33
Reviewed-on: https://go-review.googlesource.com/c/go/+/164718
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
We want to issue loads as soon as possible, especially when they
are going to miss in the cache. Using a conditional move (CMOV) here:
i := ...
if cond {
i++
}
... = a[i]
means that we have to wait for cond to be computed before the load
is issued. Without a CMOV, if the branch is predicted correctly the
load can be issued in parallel with computing cond.
Even if the branch is predicted incorrectly, maybe the speculative
load is close to the real load, and we get a prefetch for free.
In the worst case, when the prediction is wrong and the address is
way off, we only lose by the time difference between the CMOV
latency (~2 cycles) and the mispredict restart latency (~15 cycles).
We only squash CMOVs that affect load addresses. Results of CMOVs
that are used for other things (store addresses, store values) we
use as before.
Fixes #26306
Change-Id: I82ca14b664bf05e1d45e58de8c4d9c775a127ca1
Reviewed-on: https://go-review.googlesource.com/c/145717
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Change-Id: Ia64535492515f725fe3c4b59ea300363a0c4ce10
Reviewed-on: https://go-review.googlesource.com/107136
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This builds upon the branchelim pass, activating it for amd64 and
lowering CondSelect. Special care is made to FPU instructions for
NaN handling.
Benchmark results on Xeon E5630 (Westmere EP):
name old time/op new time/op delta
BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5)
Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5)
FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5)
FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5)
FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5)
FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5)
FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5)
FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5)
FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5)
GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5)
GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5)
Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5)
Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5)
HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5)
JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4)
JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5)
Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5)
GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5)
RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4)
RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5)
RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5)
RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5)
RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5)
RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5)
RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5)
RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5)
Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5)
Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5)
TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4)
TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5)
name old speed new speed delta
GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5)
GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5)
Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5)
Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5)
JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4)
JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5)
GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5)
RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5)
RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5)
RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5)
RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5)
RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5)
RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5)
RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5)
RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5)
Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5)
Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5)
Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558
Reviewed-on: https://go-review.googlesource.com/100935
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts commit 080187f4f72bd6594e3c2efc35cf51bf61378552.
It broke build of golang.org/x/exp/shiny/iconvg
See issue 24395 for details
Change-Id: Ifd6134f6214e6cee40bd3c63c32941d5fc96ae8b
Reviewed-on: https://go-review.googlesource.com/100755
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This builds upon the branchelim pass, activating it for amd64 and
lowering CondSelect. Special care is made to FPU instructions for
NaN handling.
Benchmark results on Xeon E5630 (Westmere EP):
name old time/op new time/op delta
BinaryTree17-16 4.99s ± 9% 4.66s ± 2% ~ (p=0.095 n=5+5)
Fannkuch11-16 4.93s ± 3% 5.04s ± 2% ~ (p=0.548 n=5+5)
FmtFprintfEmpty-16 58.8ns ± 7% 61.4ns ±14% ~ (p=0.579 n=5+5)
FmtFprintfString-16 114ns ± 2% 114ns ± 4% ~ (p=0.603 n=5+5)
FmtFprintfInt-16 181ns ± 4% 125ns ± 3% -30.90% (p=0.008 n=5+5)
FmtFprintfIntInt-16 263ns ± 2% 217ns ± 2% -17.34% (p=0.008 n=5+5)
FmtFprintfPrefixedInt-16 230ns ± 1% 212ns ± 1% -7.99% (p=0.008 n=5+5)
FmtFprintfFloat-16 411ns ± 3% 344ns ± 5% -16.43% (p=0.008 n=5+5)
FmtManyArgs-16 828ns ± 4% 790ns ± 2% -4.59% (p=0.032 n=5+5)
GobDecode-16 10.9ms ± 4% 10.8ms ± 5% ~ (p=0.548 n=5+5)
GobEncode-16 9.52ms ± 5% 9.46ms ± 2% ~ (p=1.000 n=5+5)
Gzip-16 334ms ± 2% 337ms ± 2% ~ (p=0.548 n=5+5)
Gunzip-16 64.4ms ± 1% 65.0ms ± 1% +1.00% (p=0.008 n=5+5)
HTTPClientServer-16 156µs ± 3% 155µs ± 3% ~ (p=0.690 n=5+5)
JSONEncode-16 21.0ms ± 1% 21.8ms ± 0% +3.76% (p=0.016 n=5+4)
JSONDecode-16 95.1ms ± 0% 95.7ms ± 1% ~ (p=0.151 n=5+5)
Mandelbrot200-16 6.38ms ± 1% 6.42ms ± 1% ~ (p=0.095 n=5+5)
GoParse-16 5.47ms ± 2% 5.36ms ± 1% -1.95% (p=0.016 n=5+5)
RegexpMatchEasy0_32-16 111ns ± 1% 111ns ± 1% ~ (p=0.635 n=5+4)
RegexpMatchEasy0_1K-16 408ns ± 1% 411ns ± 2% ~ (p=0.087 n=5+5)
RegexpMatchEasy1_32-16 103ns ± 1% 104ns ± 1% ~ (p=0.484 n=5+5)
RegexpMatchEasy1_1K-16 659ns ± 2% 652ns ± 1% ~ (p=0.571 n=5+5)
RegexpMatchMedium_32-16 176ns ± 2% 174ns ± 1% ~ (p=0.476 n=5+5)
RegexpMatchMedium_1K-16 58.6µs ± 4% 57.7µs ± 4% ~ (p=0.548 n=5+5)
RegexpMatchHard_32-16 3.07µs ± 3% 3.04µs ± 4% ~ (p=0.421 n=5+5)
RegexpMatchHard_1K-16 89.2µs ± 1% 87.9µs ± 2% -1.52% (p=0.032 n=5+5)
Revcomp-16 575ms ± 0% 587ms ± 2% +2.12% (p=0.032 n=4+5)
Template-16 110ms ± 1% 107ms ± 3% -3.00% (p=0.032 n=5+5)
TimeParse-16 463ns ± 0% 462ns ± 0% ~ (p=0.810 n=5+4)
TimeFormat-16 538ns ± 0% 535ns ± 0% -0.63% (p=0.024 n=5+5)
name old speed new speed delta
GobDecode-16 70.7MB/s ± 4% 71.4MB/s ± 5% ~ (p=0.452 n=5+5)
GobEncode-16 80.7MB/s ± 5% 81.2MB/s ± 2% ~ (p=1.000 n=5+5)
Gzip-16 58.2MB/s ± 2% 57.7MB/s ± 2% ~ (p=0.452 n=5+5)
Gunzip-16 302MB/s ± 1% 299MB/s ± 1% -0.99% (p=0.008 n=5+5)
JSONEncode-16 92.4MB/s ± 1% 89.1MB/s ± 0% -3.63% (p=0.016 n=5+4)
JSONDecode-16 20.4MB/s ± 0% 20.3MB/s ± 1% ~ (p=0.135 n=5+5)
GoParse-16 10.6MB/s ± 2% 10.8MB/s ± 1% +2.00% (p=0.016 n=5+5)
RegexpMatchEasy0_32-16 286MB/s ± 1% 285MB/s ± 3% ~ (p=1.000 n=5+5)
RegexpMatchEasy0_1K-16 2.51GB/s ± 1% 2.49GB/s ± 2% ~ (p=0.095 n=5+5)
RegexpMatchEasy1_32-16 309MB/s ± 1% 307MB/s ± 1% ~ (p=0.548 n=5+5)
RegexpMatchEasy1_1K-16 1.55GB/s ± 2% 1.57GB/s ± 1% ~ (p=0.690 n=5+5)
RegexpMatchMedium_32-16 5.68MB/s ± 2% 5.73MB/s ± 1% ~ (p=0.579 n=5+5)
RegexpMatchMedium_1K-16 17.5MB/s ± 4% 17.8MB/s ± 4% ~ (p=0.500 n=5+5)
RegexpMatchHard_32-16 10.4MB/s ± 3% 10.5MB/s ± 4% ~ (p=0.460 n=5+5)
RegexpMatchHard_1K-16 11.5MB/s ± 1% 11.7MB/s ± 2% +1.57% (p=0.032 n=5+5)
Revcomp-16 442MB/s ± 0% 433MB/s ± 2% -2.05% (p=0.032 n=4+5)
Template-16 17.7MB/s ± 1% 18.2MB/s ± 3% +3.12% (p=0.032 n=5+5)
Change-Id: Ic7cb7374d07da031e771bdcbfdd832fd1b17159c
Reviewed-on: https://go-review.googlesource.com/98695
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
|