| Age | Commit message (Collapse) | Author |
|
loong64 uses ALSLV the compute the lookup index.
Fixes #78575
Change-Id: Ied90a4f811cc19ffec4d304333546d1fa430ccc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/764180
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Lookup tables for switch statements can be generalized to also support
bools, strings, floats, and complex numbers.
Change-Id: Ic3ece41fe2009050fbf08ba6f06ea8a567407974
Reviewed-on: https://go-review.googlesource.com/c/go/+/763320
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Switch statement containing integer constant cases and case bodies just
returning a constant should be optimizable to a simpler and faster table
lookup instead of a jump table.
That is, a switch like this:
switch x {
case 0: return 10
case 1: return 20
case 2: return 30
case 3: return 40
default: return -1
}
Could be optimized to this:
var table = [4]int{10, 20, 30, 40}
if uint(x) < 4 { return table[x] }
return -1
The resulting code is smaller and faster, especially on platforms where
jump tables are not supported.
goos: windows
goarch: arm64
pkg: cmd/compile/internal/test
│ .\old.txt │ .\new.txt │
│ sec/op │ sec/op vs base │
SwitchLookup8Predictable-12 2.708n ± 6% 2.249n ± 5% -16.97% (p=0.000 n=10)
SwitchLookup8Unpredictable-12 8.758n ± 7% 3.272n ± 4% -62.65% (p=0.000 n=10)
SwitchLookup32Predictable-12 2.672n ± 5% 2.373n ± 6% -11.21% (p=0.000 n=10)
SwitchLookup32Unpredictable-12 9.372n ± 7% 3.385n ± 6% -63.89% (p=0.000 n=10)
geomean 4.937n 2.772n -43.84%
Fixes #78203
Change-Id: I74fa3d77ef618412951b2e5c3cb6ebc760ce4ff1
Reviewed-on: https://go-review.googlesource.com/c/go/+/756340
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ia7a955833d761e08c1b8081fb29a2e6317de004c
Reviewed-on: https://go-review.googlesource.com/c/go/+/760681
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Replace \s with a space in backtick-quoted strings
Replace \\s with a space in double-quoted strings
Change-Id: I0c8b249bb12c2c8ca69e683e4bc6f27544fd6094
Reviewed-on: https://go-review.googlesource.com/c/go/+/760680
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Separate patterns in asmcheck by spaces instead of commas.
Many patterns end in comma (like "MOV [$]123,") so separating
patterns by comma is not great; they're already quoted, so spaces are fine.
Also replace all tabs in the assembly lines with spaces before matching.
Finally, replace \$ or \\$ with [$] as the matching idiom.
The effect of all these is to make the patterns look like:
// amd64:"BSFQ" "ORQ [$]256"
instead of the old:
// amd64:"BSFQ","ORQ\t\\$256"
Update all tests as well.
Change-Id: Ia39febe5d7f67ba115846422789e11b185d5c807
Reviewed-on: https://go-review.googlesource.com/c/go/+/716060
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
Following CL 357330, use jump tables on Loong64.
goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
│ old │ new │
│ sec/op │ sec/op vs base │
Switch8Predictable 2.352n ± 0% 2.101n ± 0% -10.65% (p=0.000 n=10)
Switch8Unpredictable 11.99n ± 0% 10.25n ± 0% -14.51% (p=0.000 n=10)
Switch32Predictable 3.153n ± 0% 1.887n ± 1% -40.14% (p=0.000 n=10)
Switch32Unpredictable 12.47n ± 0% 10.22n ± 0% -18.00% (p=0.000 n=10)
SwitchStringPredictable 3.162n ± 0% 3.352n ± 0% +6.01% (p=0.000 n=10)
SwitchStringUnpredictable 14.70n ± 0% 13.31n ± 0% -9.46% (p=0.000 n=10)
SwitchTypePredictable 3.702n ± 0% 2.201n ± 0% -40.55% (p=0.000 n=10)
SwitchTypeUnpredictable 16.18n ± 0% 14.48n ± 0% -10.51% (p=0.000 n=10)
SwitchInterfaceTypePredictable 7.654n ± 0% 9.680n ± 0% +26.47% (p=0.000 n=10)
SwitchInterfaceTypeUnpredictable 22.04n ± 0% 22.44n ± 0% +1.81% (p=0.000 n=10)
geomean 7.441n 6.469n -13.07%
Change-Id: Id6f30fa73349c60fac17670084daee56973a955f
Reviewed-on: https://go-review.googlesource.com/c/go/+/705396
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
|
|
In addition to unsigned loads which already exist.
This helps code that does switches on strings to constant-fold
the switch away when the string being switched on is constant.
Fixes #71699
Change-Id: If3051af0f7255d2a573da6f96b153a987a7f159d
Reviewed-on: https://go-review.googlesource.com/c/go/+/649295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
|
|
It is one less dependent load away, and right next to another
field in the itab we also load as part of the type switch or
type assert.
Change-Id: If7aaa7814c47bd79a6c7ed4232ece0bc1d63550e
Reviewed-on: https://go-review.googlesource.com/c/go/+/533117
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This is the last of the getitab users to receive a cache.
We should now no longer see getitab (and callees) in profiles.
Hopefully.
Change-Id: I2ed72b9943095bbe8067c805da7f08e00706c98c
Reviewed-on: https://go-review.googlesource.com/c/go/+/531055
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
That way we don't need to call into the runtime for every
type assertion (to an interface type).
name old time/op new time/op delta
TypeAssert-24 3.78ns ± 3% 1.00ns ± 1% -73.53% (p=0.000 n=10+8)
Change-Id: I0ba308aaf0f24a5495b4e13c814d35af0c58bfde
Reviewed-on: https://go-review.googlesource.com/c/go/+/529316
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
That way we don't need to call into the runtime when the type being
switched on has been seen many times before.
The cache is just a hash table of a sample of all the concrete types
that have been switched on at that source location. We record the
matching case number and the resulting itab for each concrete input
type.
The caches seldom get large. The only two in a run of all.bash that
get more than 100 entries, even with the sampling rate set to 1, are
test/fixedbugs/issue29264.go, with 101
test/fixedbugs/issue29312.go, with 254
Both happen at the type switch in fmt.(*pp).handleMethods, perhaps
unsurprisingly.
name old time/op new time/op delta
SwitchInterfaceTypePredictable-24 25.8ns ± 2% 2.5ns ± 3% -90.43% (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24 37.5ns ± 2% 11.2ns ± 1% -70.02% (p=0.000 n=10+10)
Change-Id: I4961ac9547b7f15b03be6f55cdcb972d176955eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/526658
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For type switches where the targets are interface types,
call into the runtime once instead of doing a sequence
of assert* calls.
name old time/op new time/op delta
SwitchInterfaceTypePredictable-24 26.6ns ± 1% 25.8ns ± 2% -2.86% (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24 39.3ns ± 1% 37.5ns ± 2% -4.57% (p=0.000 n=10+10)
Not super helpful by itself, but this code organization allows
followon CLs that add caching to the lookups.
Change-Id: I7967f85a99171faa6c2550690e311bea8b54b01c
Reviewed-on: https://go-review.googlesource.com/c/go/+/526657
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For large interface -> concrete type switches, we can use a jump
table on some bits of the type hash instead of a binary search on
the type hash.
name old time/op new time/op delta
SwitchTypePredictable-24 1.99ns ± 2% 1.78ns ± 5% -10.87% (p=0.000 n=10+10)
SwitchTypeUnpredictable-24 11.0ns ± 1% 9.1ns ± 2% -17.55% (p=0.000 n=7+9)
Change-Id: Ida4768e5d62c3ce1c2701288b72664aaa9e64259
Reviewed-on: https://go-review.googlesource.com/c/go/+/521497
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
Usually optimization rules have corresponding priorities, some need to
be run first, some run next, and some run last, which produces the best
code. But currently our optimization rules have no priority, this CL
adds a late lower pass that runs those rules that need to be run at last,
such as split unreasonable constant folding. This pass can be seen as
the second round of the lower pass.
For example:
func foo(a, b uint64) uint64 {
d := a+0x1234568
d1 := b+0x1234568
return d&d1
}
The code generated by the master branch:
0x0004 00004 ADD $19088744, R0, R2 // movz+movk+add
0x0010 00016 ADD $19088744, R1, R1 // movz+movk+add
0x001c 00028 AND R1, R2, R0
This is because the current constant folding optimization rules do not
take into account the range of constants, causing the constant to be
loaded repeatedly. This CL splits these unreasonable constants folding
in the late lower pass. With this CL the generated code:
0x0004 00004 MOVD $19088744, R2 // movz+movk
0x000c 00012 ADD R0, R2, R3
0x0010 00016 ADD R1, R2, R1
0x0014 00020 AND R1, R3, R0
This CL also adds constant folding optimization for ADDS instruction.
In addition, in order not to introduce the codegen regression, an
optimization rule is added to change the addition of a negative number
into a subtraction of a positive number.
go1 benchmarks:
name old time/op new time/op delta
BinaryTree17-8 1.22s ± 1% 1.24s ± 0% +1.56% (p=0.008 n=5+5)
Fannkuch11-8 1.54s ± 0% 1.53s ± 0% -0.69% (p=0.016 n=4+5)
FmtFprintfEmpty-8 14.1ns ± 0% 14.1ns ± 0% ~ (p=0.079 n=4+5)
FmtFprintfString-8 26.0ns ± 0% 26.1ns ± 0% +0.23% (p=0.008 n=5+5)
FmtFprintfInt-8 32.3ns ± 0% 32.9ns ± 1% +1.72% (p=0.008 n=5+5)
FmtFprintfIntInt-8 54.5ns ± 0% 55.5ns ± 0% +1.83% (p=0.008 n=5+5)
FmtFprintfPrefixedInt-8 61.5ns ± 0% 62.0ns ± 0% +0.93% (p=0.008 n=5+5)
FmtFprintfFloat-8 72.0ns ± 0% 73.6ns ± 0% +2.24% (p=0.008 n=5+5)
FmtManyArgs-8 221ns ± 0% 224ns ± 0% +1.22% (p=0.008 n=5+5)
GobDecode-8 1.91ms ± 0% 1.93ms ± 0% +0.98% (p=0.008 n=5+5)
GobEncode-8 1.40ms ± 1% 1.39ms ± 0% -0.79% (p=0.032 n=5+5)
Gzip-8 115ms ± 0% 117ms ± 1% +1.17% (p=0.008 n=5+5)
Gunzip-8 19.4ms ± 1% 19.3ms ± 0% -0.71% (p=0.016 n=5+4)
HTTPClientServer-8 27.0µs ± 0% 27.3µs ± 0% +0.80% (p=0.008 n=5+5)
JSONEncode-8 3.36ms ± 1% 3.33ms ± 0% ~ (p=0.056 n=5+5)
JSONDecode-8 17.5ms ± 2% 17.8ms ± 0% +1.71% (p=0.016 n=5+4)
Mandelbrot200-8 2.29ms ± 0% 2.29ms ± 0% ~ (p=0.151 n=5+5)
GoParse-8 1.35ms ± 1% 1.36ms ± 1% ~ (p=0.056 n=5+5)
RegexpMatchEasy0_32-8 24.5ns ± 0% 24.5ns ± 0% ~ (p=0.444 n=4+5)
RegexpMatchEasy0_1K-8 131ns ±11% 118ns ± 6% ~ (p=0.056 n=5+5)
RegexpMatchEasy1_32-8 22.9ns ± 0% 22.9ns ± 0% ~ (p=0.905 n=4+5)
RegexpMatchEasy1_1K-8 126ns ± 0% 127ns ± 0% ~ (p=0.063 n=4+5)
RegexpMatchMedium_32-8 486ns ± 5% 483ns ± 0% ~ (p=0.381 n=5+4)
RegexpMatchMedium_1K-8 15.4µs ± 1% 15.5µs ± 0% ~ (p=0.151 n=5+5)
RegexpMatchHard_32-8 687ns ± 0% 686ns ± 0% ~ (p=0.103 n=5+5)
RegexpMatchHard_1K-8 20.7µs ± 0% 20.7µs ± 1% ~ (p=0.151 n=5+5)
Revcomp-8 175ms ± 2% 176ms ± 3% ~ (p=1.000 n=5+5)
Template-8 20.4ms ± 6% 20.1ms ± 2% ~ (p=0.151 n=5+5)
TimeParse-8 112ns ± 0% 113ns ± 0% +0.97% (p=0.016 n=5+4)
TimeFormat-8 156ns ± 0% 145ns ± 0% -7.14% (p=0.029 n=4+4)
Change-Id: I3ced26e89041f873ac989586514ccc5ee09f13da
Reviewed-on: https://go-review.googlesource.com/c/go/+/425134
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
|
|
Currently we use a full cmpstring to do the comparison for each
split in the binary search for a string switch.
Instead, split by comparing a single byte of the input string with a
constant. That will give us a much faster split (although it might be
not quite as good a split).
Fixes #53333
R=go1.20
Change-Id: I28c7209342314f367071e4aa1f2beb6ec9ff7123
Reviewed-on: https://go-review.googlesource.com/c/go/+/414894
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Following CL 357330, use jump tables on ARM64.
name old time/op new time/op delta
Switch8Predictable-4 3.41ns ± 0% 3.21ns ± 0% ~ (p=0.079 n=4+5)
Switch8Unpredictable-4 12.0ns ± 0% 9.5ns ± 0% -21.17% (p=0.000 n=5+4)
Switch32Predictable-4 3.06ns ± 0% 2.82ns ± 0% -7.78% (p=0.008 n=5+5)
Switch32Unpredictable-4 13.3ns ± 0% 9.5ns ± 0% -28.87% (p=0.016 n=4+5)
SwitchStringPredictable-4 3.71ns ± 0% 3.21ns ± 0% -13.43% (p=0.000 n=5+4)
SwitchStringUnpredictable-4 14.8ns ± 0% 15.1ns ± 0% +2.37% (p=0.008 n=5+5)
Change-Id: Ia0b85df7ca9273cf70c05eb957225c6e61822fa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/403979
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Change-Id: Ic67f676f5ebe146166a0d3c1d78a802881320e49
Reviewed-on: https://go-review.googlesource.com/c/go/+/400375
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
When compiling expression switches, we try to optimize runs of
constants into binary searches. The ordering used isn't visible to the
application, so it's unimportant as long as we're consistent between
sorting and searching.
For strings, it's much cheaper to compare string lengths than strings
themselves, so instead of ordering strings by "si <= sj", we currently
order them by "len(si) < len(sj) || len(si) == len(sj) && si <= sj"
(i.e., the lexicographical ordering on the 2-tuple (len(s), s)).
However, it's also somewhat cheaper to compare strings for equality
(i.e., ==) than for ordering (i.e., <=). And if there were two or
three string constants of the same length in a switch statement, we
might unnecessarily emit ordering comparisons.
For example, given:
switch s {
case "", "1", "2", "3": // ordered by length then content
goto L
}
we currently compile this as:
if len(s) < 1 || len(s) == 1 && s <= "1" {
if s == "" { goto L }
else if s == "1" { goto L }
} else {
if s == "2" { goto L }
else if s == "3" { goto L }
}
This CL switches to using a 2-level binary search---first on len(s),
then on s itself---so that string ordering comparisons are only needed
when there are 4 or more strings of the same length. (4 being the
cut-off for when using binary search is actually worthwhile.)
So the above switch instead now compiles to:
if len(s) == 0 {
if s == "" { goto L }
} else if len(s) == 1 {
if s == "1" { goto L }
else if s == "2" { goto L }
else if s == "3" { goto L }
}
which is better optimized by walk and SSA. (Notably, because there are
only two distinct lengths and no more than three strings of any
particular length, this example ends up falling back to simply using
linear search.)
Test case by khr@ from CL 195138.
Fixes #33934.
Change-Id: I8eeebcaf7e26343223be5f443d6a97a0daf84f07
Reviewed-on: https://go-review.googlesource.com/c/go/+/195340
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|