| Age | Commit message (Collapse) | Author |
|
Switch statement containing integer constant cases and case bodies just
returning a constant should be optimizable to a simpler and faster table
lookup instead of a jump table.
That is, a switch like this:
switch x {
case 0: return 10
case 1: return 20
case 2: return 30
case 3: return 40
default: return -1
}
Could be optimized to this:
var table = [4]int{10, 20, 30, 40}
if uint(x) < 4 { return table[x] }
return -1
The resulting code is smaller and faster, especially on platforms where
jump tables are not supported.
goos: windows
goarch: arm64
pkg: cmd/compile/internal/test
│ .\old.txt │ .\new.txt │
│ sec/op │ sec/op vs base │
SwitchLookup8Predictable-12 2.708n ± 6% 2.249n ± 5% -16.97% (p=0.000 n=10)
SwitchLookup8Unpredictable-12 8.758n ± 7% 3.272n ± 4% -62.65% (p=0.000 n=10)
SwitchLookup32Predictable-12 2.672n ± 5% 2.373n ± 6% -11.21% (p=0.000 n=10)
SwitchLookup32Unpredictable-12 9.372n ± 7% 3.385n ± 6% -63.89% (p=0.000 n=10)
geomean 4.937n 2.772n -43.84%
Fixes #78203
Change-Id: I74fa3d77ef618412951b2e5c3cb6ebc760ce4ff1
Reviewed-on: https://go-review.googlesource.com/c/go/+/756340
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For type switches where the targets are interface types,
call into the runtime once instead of doing a sequence
of assert* calls.
name old time/op new time/op delta
SwitchInterfaceTypePredictable-24 26.6ns ± 1% 25.8ns ± 2% -2.86% (p=0.000 n=10+10)
SwitchInterfaceTypeUnpredictable-24 39.3ns ± 1% 37.5ns ± 2% -4.57% (p=0.000 n=10+10)
Not super helpful by itself, but this code organization allows
followon CLs that add caching to the lookups.
Change-Id: I7967f85a99171faa6c2550690e311bea8b54b01c
Reviewed-on: https://go-review.googlesource.com/c/go/+/526657
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For large interface -> concrete type switches, we can use a jump
table on some bits of the type hash instead of a binary search on
the type hash.
name old time/op new time/op delta
SwitchTypePredictable-24 1.99ns ± 2% 1.78ns ± 5% -10.87% (p=0.000 n=10+10)
SwitchTypeUnpredictable-24 11.0ns ± 1% 9.1ns ± 2% -17.55% (p=0.000 n=7+9)
Change-Id: Ida4768e5d62c3ce1c2701288b72664aaa9e64259
Reviewed-on: https://go-review.googlesource.com/c/go/+/521497
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
Reorganize the way we rewrite expression switches on strings, so that
jump tables are naturally used for the outer switch on the string length.
The changes to the prove pass in this CL are required so as to not repeat
the test for string length in each case.
name old time/op new time/op delta
SwitchStringPredictable 2.28ns ± 9% 2.08ns ± 5% -9.04% (p=0.000 n=10+10)
SwitchStringUnpredictable 10.5ns ± 1% 9.5ns ± 1% -9.08% (p=0.000 n=9+10)
Update #5496
Update #34381
Change-Id: Ie6846b1dd27f3e472f7c30dfcc598c68d440b997
Reviewed-on: https://go-review.googlesource.com/c/go/+/395714
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Performance is kind of hard to exactly quantify.
One big difference between jump tables and the old binary search
scheme is that there's only 1 branch statement instead of O(n) of
them. That can be both a blessing and a curse, and can make evaluating
jump tables very hard to do.
The single branch can become a choke point for the hardware branch
predictor. A branch table jump must fit all of its state in a single
branch predictor entry (technically, a branch target predictor entry).
With binary search that predictor state can be spread among lots of
entries. In cases where the case selection is repetitive and thus
predictable, binary search can perform better.
The big win for a jump table is that it doesn't consume so much of the
branch predictor's resources. But that benefit is essentially never
observed in microbenchmarks, because the branch predictor can easily
keep state for all the binary search branches in a microbenchmark. So
that benefit is really hard to measure.
So predictable switch microbenchmarks are ~useless - they will almost
always favor the binary search scheme. Fully unpredictable switch
microbenchmarks are better, as they aren't lying to us quite so
much. In a perfectly unpredictable situation, a jump table will expect
to incur 1-1/N branch mispredicts, where a binary search would incur
lg(N)/2 of them. That makes the crossover point at about N=4. But of
course switches in real programs are seldom fully unpredictable, so
we'll use a higher crossover point.
Beyond the branch predictor, jump tables tend to execute more
instructions per switch but have no additional instructions per case,
which also argues for a larger crossover.
As far as code size goes, with this CL cmd/go has a slightly smaller
code segment and a slightly larger overall size (from the jump tables
themselves which live in the data segment).
This is a case where some FDO (feedback-directed optimization) would
be really nice to have. #28262
Some large-program benchmarks might help make the case for this
CL. Especially if we can turn on branch mispredict counters so we can
see how much using jump tables can free up branch prediction resources
that can be gainfully used elsewhere in the program.
name old time/op new time/op delta
Switch8Predictable 1.89ns ± 2% 1.27ns ± 3% -32.58% (p=0.000 n=9+10)
Switch8Unpredictable 9.33ns ± 1% 7.50ns ± 1% -19.60% (p=0.000 n=10+9)
Switch32Predictable 2.20ns ± 2% 1.64ns ± 1% -25.39% (p=0.000 n=10+9)
Switch32Unpredictable 10.0ns ± 2% 7.6ns ± 2% -24.04% (p=0.000 n=10+10)
Fixes #5496
Update #34381
Change-Id: I3ff56011d02be53f605ca5fd3fb96b905517c34f
Reviewed-on: https://go-review.googlesource.com/c/go/+/357330
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|