| Age | Commit message (Collapse) | Author |
|
The SSA generic rewrite rules implement DeMorgan's laws but are
missing the closely related boolean absorption laws:
x & (x | y) == x
x | (x & y) == x
These are fundamental boolean algebra identities (see
https://en.wikipedia.org/wiki/Absorption_law) that hold for all
bit patterns, all widths, signed and unsigned. Both GCC and LLVM
recognize and optimize these patterns at -O2.
Add two generic rules covering all four widths (8, 16, 32, 64).
Commutativity of AND/OR is handled automatically by the rule
engine, so all argument orderings are matched.
The rules eliminate two redundant ALU instructions per occurrence
and fire on real code (defer bit-manipulation patterns in runtime,
testing, go/parser, and third-party packages).
Fixes #78632
Change-Id: Ib59e839081302ad1635e823309d8aec768c25dcf
GitHub-Last-Rev: 23f8296ece08c77fcaeeaf59c2c2d8ce23d1202c
GitHub-Pull-Request: golang/go#78634
Reviewed-on: https://go-review.googlesource.com/c/go/+/765580
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
loong64 uses ALSLV the compute the lookup index.
Fixes #78575
Change-Id: Ied90a4f811cc19ffec4d304333546d1fa430ccc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/764180
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Fixes #78558
I've also added tests to make sure PPC still generate ISEL when
the constant isn't 1.
This is to make sure we aren't generating a sequence that wouldn't
work right now.
But it does not mean we couldn't try to optimize other constants
on PPC64 if a fast sequence exists; for example like arm64's
inline register shifts.
Change-Id: Ic241d593149b7a11533948f5d4c52db357cc134f
Reviewed-on: https://go-review.googlesource.com/c/go/+/763340
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
|
|
Original algorithm merges stores with the first
mergeable store in the chain, but it misses some
cases. Additionally, creating list of STs, which
store data to adjacent memory cells allows merging them
according to the direction of increase of their addresses.
I have already tried another algorithm in CL 698097,
but it was reverted. This algorithm works differently
and fixes bug, generated by variant from another CL.
Fixes #71987, #75365
There are the results of sweet benchmarks
│ base.stat │ opt.stat │
│ sec/op │ sec/op vs base │
ESBuildThreeJS-4 1.088 ± 2% 1.086 ± 1% ~ (p=1.000 n=10)
ESBuildRomeTS-4 263.0m ± 2% 260.8m ± 1% ~ (p=0.105 n=10)
EtcdPut-4 73.08m ± 1% 73.16m ± 1% ~ (p=0.971 n=10)
EtcdSTM-4 414.9m ± 1% 415.4m ± 1% ~ (p=0.393 n=10)
GoBuildKubelet-4 203.3 ± 0% 203.5 ± 0% ~ (p=0.393 n=10)
GoBuildKubeletLink-4 19.06 ± 1% 19.05 ± 0% ~ (p=0.280 n=10)
GoBuildIstioctl-4 156.6 ± 0% 156.6 ± 0% ~ (p=0.796 n=10)
GoBuildIstioctlLink-4 14.16 ± 1% 14.18 ± 1% ~ (p=0.853 n=10)
GoBuildFrontend-4 56.45 ± 1% 56.57 ± 0% ~ (p=0.579 n=10)
GoBuildFrontendLink-4 3.635 ± 1% 3.646 ± 0% ~ (p=0.436 n=10)
GoBuildTsgo-4 103.0 ± 1% 103.4 ± 1% ~ (p=0.529 n=10)
GoBuildTsgoLink-4 1.865 ± 1% 1.860 ± 1% ~ (p=0.684 n=10)
GopherLuaKNucleotide-4 33.55 ± 0% 33.58 ± 0% ~ (p=0.075 n=10)
MarkdownRenderXHTML-4 281.0m ± 0% 280.3m ± 0% -0.23% (p=0.019 n=10)
Tile38QueryLoad-4 970.0µ ± 1% 969.3µ ± 0% ~ (p=0.436 n=10)
geomean 3.128 3.128 -0.01%
Change-Id: Ia548b43601b1bdb1c1723d300a4b8b907ab0c040
Reviewed-on: https://go-review.googlesource.com/c/go/+/760100
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Lookup tables for switch statements can be generalized to also support
bools, strings, floats, and complex numbers.
Change-Id: Ic3ece41fe2009050fbf08ba6f06ea8a567407974
Reviewed-on: https://go-review.googlesource.com/c/go/+/763320
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Change-Id: I27696b1a5fa0593d9f36743efa3559a36d23ec4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/760844
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
- fix a bug where it wouldn't recognize 1<<63 as a power of two
- remove the IsSigned check; there is no such thing as a signed Mul
If the rule works for signed numbers it works for unsigned ones too.
Even if the intermediary steps makes no sense, it ends up wrapping
the right way around in the end.
Change-Id: I86182762aec5eff784e2d9bc49ee028825fb9ea0
Reviewed-on: https://go-review.googlesource.com/c/go/+/760843
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
On amd64 along:
if b { x += 1 } => x += b
We can also implement constants 2 4 and 8:
if b { x += 2 } => x += b * 2
This compiles to a displacement LEA.
Change-Id: Ib00fcc5059acb0ebb346e056c4a656f164cc63df
Reviewed-on: https://go-review.googlesource.com/c/go/+/760841
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Logical ops on uint8/uint16 (AND/OR/XOR) with constants sometimes
materialized the mask via MOVD (often as a negative immediate), even
when the value fit in the UI-immediate range. This prevented the backend
from selecting andi. / ori / xori forms.
This CL makes:
UI-immediate truncation is performed only at the use-site of
logical-immediate ops, and only when the constant does not fit in the
8- or 16-bit unsigned domain (m != uint8(m) / m != uint16(m)).
This avoids negative-mask materialization and enables correct emission of
UI-form logical instructions. Arithmetic SI-immediate instructions (addi, subfic, etc.) and other
use-patterns are unchanged.
Codegen tests are added to ensure the expected andi./ori/xori
patterns appear and that MOVD is not emitted for valid 8/16-bit masks.
Change-Id: I9fcdf4498c4e984c7587814fb9019a75865c4a0d
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/704015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
|
|
Switch statement containing integer constant cases and case bodies just
returning a constant should be optimizable to a simpler and faster table
lookup instead of a jump table.
That is, a switch like this:
switch x {
case 0: return 10
case 1: return 20
case 2: return 30
case 3: return 40
default: return -1
}
Could be optimized to this:
var table = [4]int{10, 20, 30, 40}
if uint(x) < 4 { return table[x] }
return -1
The resulting code is smaller and faster, especially on platforms where
jump tables are not supported.
goos: windows
goarch: arm64
pkg: cmd/compile/internal/test
│ .\old.txt │ .\new.txt │
│ sec/op │ sec/op vs base │
SwitchLookup8Predictable-12 2.708n ± 6% 2.249n ± 5% -16.97% (p=0.000 n=10)
SwitchLookup8Unpredictable-12 8.758n ± 7% 3.272n ± 4% -62.65% (p=0.000 n=10)
SwitchLookup32Predictable-12 2.672n ± 5% 2.373n ± 6% -11.21% (p=0.000 n=10)
SwitchLookup32Unpredictable-12 9.372n ± 7% 3.385n ± 6% -63.89% (p=0.000 n=10)
geomean 4.937n 2.772n -43.84%
Fixes #78203
Change-Id: I74fa3d77ef618412951b2e5c3cb6ebc760ce4ff1
Reviewed-on: https://go-review.googlesource.com/c/go/+/756340
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
If the bool comes from a local operation this is foldable into the comparison.
if a == b {
} else {
x++
}
becomes:
x += !(a == b)
becomes:
x += a != b
If the bool is passed in or loaded rather than being locally computed
this adds an extra XOR ^1 to invert it.
But at worst it should make the math equal to the compute + CMP + CMOV
which is a tie on modern CPUs which can execute CMOV on all int ALUs
and a win on the cheaper or older ones which can't.
Change-Id: Idd2566c7a3826ec432ebfbba7b3898aa0db4b812
Reviewed-on: https://go-review.googlesource.com/c/go/+/760922
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
After CL 760780, commas aren't allowed.
But some CLs that were already in flight don't know that.
Change-Id: I31f586c87def4a9746dc2c055923fce8bad6647e
Reviewed-on: https://go-review.googlesource.com/c/go/+/761620
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Change-Id: I081da8c79f0264118e079af21ff58c511ae37e6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/760682
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
|
|
Change-Id: Ia7a955833d761e08c1b8081fb29a2e6317de004c
Reviewed-on: https://go-review.googlesource.com/c/go/+/760681
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Replace \s with a space in backtick-quoted strings
Replace \\s with a space in double-quoted strings
Change-Id: I0c8b249bb12c2c8ca69e683e4bc6f27544fd6094
Reviewed-on: https://go-review.googlesource.com/c/go/+/760680
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
A bunch of tests had broken yet undetected syntax errors
in their assembly output regexps. Things like mismatched quotes,
using ^ instead of - for negation, etc.
In addition, since CL 716060 using commas as separators between
regexps doesn't work, and ends up just silently dropping every
regexp after the comma.
Fix all these things, and add a test to make sure that we're not
silently dropping regexps on the floor.
After this CL I will do some cleanup to align with CL 716060, like
replacing commas and \s with spaces (which was the point of that CL,
but wasn't consistently rewritten everywhere).
Change-Id: I54f226120a311ead0c6c62eaf5d152ceed106034
Reviewed-on: https://go-review.googlesource.com/c/go/+/760521
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Paul Murphy <paumurph@redhat.com>
Auto-Submit: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Similar to CL 685676 but for XOR.
Change-Id: Ib5ffd4c13348f176a808b3218fdbbafc2c42794f
Reviewed-on: https://go-review.googlesource.com/c/go/+/760921
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Similar to CL 685676 but for OR.
Change-Id: I0ddfd457ed9e8888462306138a251ac48ad42084
Reviewed-on: https://go-review.googlesource.com/c/go/+/760920
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Softfloat doesn't use the hardware instructions.
Followon to CL 739520 + CL 757300
Change-Id: Ic271cd5567c62933d2d0c01d8834f9bf07e31061
Reviewed-on: https://go-review.googlesource.com/c/go/+/760520
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Julian Zhu <jz531210@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Theses test cases search for an AND that gets
optimized away after CL 760307.
Should help to fix riscv64 (coudn't check as
builders don't appear on https://build.golang.org/ )
and loong64 CI on master.
Change-Id: I57e4e5ab7d3003f239355137472585e46493d8dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/760640
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
|
|
The prove pass removes superfluous bit masking. This was meant to
test the edge cases of the ppc64 folding rules which are exactly
the cases the prove pass now removes.
Fixes #78403
Change-Id: I45eeac58e01b42e19b8a06bb0d7af96c616ccbff
Reviewed-on: https://go-review.googlesource.com/c/go/+/760307
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Paul Murphy <paumurph@redhat.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Change-Id: I4dff3ba1462848f408257cbadedf202e62d1ea69
Reviewed-on: https://go-review.googlesource.com/c/go/+/750320
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
On ppc64/ppc64le, rewrite (x + x) << c to x << (c+1) for constant shifts. This removes an ADD, shortens the dependency chain, and reduces code size.
Add rules for both 64-bit (SLDconst) and 32-bit (SLWconst), and extend
test/codegen/shift.go with ppc64x checks to assert a single SLD/SLW and
forbid ADD. Aligns ppc64 with other architectures that already assert
similar codegen in shift.go.
Change-Id: Ie564afbb029a5bd48887b82b0c455ca1dddd5508
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/712000
Reviewed-by: Archana Ravindar <aravinda@redhat.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Updates #75463
Change-Id: Iec51bdedd5a29bbb81ac553ad7e22403e1715ee3
Reviewed-on: https://go-review.googlesource.com/c/go/+/757300
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Look into the following block(s) for a load that can be paired with
the load we're trying to pair up.
This particularly helps the generated equality functions. Instead of doing
MOVD x(R0), R2
MOVD x(R1), R3
CMP R2, R3
BNE noteq
MOVD x+8(R0), R2
MOVD x+8(R1), R3
CMP R2, R3
BNE noteq
we do
LDP x(R0), (R2, R4)
LDP x(R1), (R3, R5)
CMP R2, R3
BNE noteq
CMP R4, R5
BNE noteq
Removes 5296 bytes of code from cmd/go.
Change-Id: I6368686892ac944783c8b07ed7252126d1ef4031
Reviewed-on: https://go-review.googlesource.com/c/go/+/740741
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Absorb unnecessary conversion between float32 and float64
if both src and dst are 32 bit.
Updates #75463
Change-Id: Ia71941223b5cca3fea66b559da7b8f916e63feaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/733621
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Julian Zhu <jz531210@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #77720
Add a generic SSA rewrite that forwards `Load` from a `Move` destination
back to the `Move` source when it is provably safe, so field reads like
`s.h.Value().typ` don’t force a full struct copy.
- Add `Load <- Move` rewrite in `generic.rules` with safety guard:
non-volatile source
- Tweak `fixedbugs/issue22200*` so that it can still trigger the "stack frame too large" error.
- Regenerate `rewritegeneric.go`.
- Add `test/codegen/moveload.go` to assert no `MOVUPS` and direct `MOVBLZX`
in both direct and inlined forms.
Benchmark results (linux/amd64, i7-14700KF):
$ go test cmd/compile/internal/test -run='^$' -bench='MoveLoad' -count=20
Before:
BenchmarkMoveLoadTypViaValue-20 ~76.9 ns/op
BenchmarkMoveLoadTypViaPtr-20 ~1.97 ns/op
After:
BenchmarkMoveLoadTypViaValue-20 ~1.894 ns/op
BenchmarkMoveLoadTypViaPtr-20 ~1.905 ns/op
The rewrite removes the redundant struct copy in
`s.h.Value().typ`, bringing it in line with the direct pointer form.
Change-Id: Iddf2263e390030ba013e0642a695b87c75f899da
Reviewed-on: https://go-review.googlesource.com/c/go/+/748200
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Mark Freeman <markfreeman@google.com>
|
|
Negation on a condition can be eliminated.
Change-Id: I94fab5f019cbaebb2ca589e1d8796a9cb72f3894
Reviewed-on: https://go-review.googlesource.com/c/go/+/748401
Reviewed-by: Xueqi Luo <1824368278@qq.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Julian Zhu <jz531210@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Before this CL, the simdgen contains a sign check to selectively enable
such rules for deduplication purposes. This left out `VPSRL` as it's
only available in unsigned form. This CL fixes that.
It looks like the previous documentation fix to SHA instruction might
not had run go generate, so this CL also contains the generated code for
that fix.
There is also a weird phantom import in
cmd/compile/internal/ssa/issue77582_test.go
This CL also fixes that
The trybot didn't complain?
Change-Id: Ibbf9f789c1a67af1474f0285ab376bc07f17667e
Reviewed-on: https://go-review.googlesource.com/c/go/+/748501
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Not a fix because there are other architectures
still to be done.
Updates #75463.
Change-Id: Ia5233c2b6c5f4439e269950efdd851e72e8e7ff6
Reviewed-on: https://go-review.googlesource.com/c/go/+/730160
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Not a fix because there are other architectures
still to be done.
Updates #75463.
Change-Id: Ifca03975023e4e5d0ffa98d1f877314a1a291be0
Reviewed-on: https://go-review.googlesource.com/c/go/+/729161
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Not a fix because there are other architectures
still to be done.
Updates #75463.
Change-Id: I3d7754ce4a26af0f5c4ef0be1254d164e68f8442
Reviewed-on: https://go-review.googlesource.com/c/go/+/729160
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
The subtle.ConstantTimeSelect function is now intrinsified on a number of
architectures. Add test coverage for code generation.
Change-Id: Iff97510838b39ec2137d64a62d2f516c94710c68
Reviewed-on: https://go-review.googlesource.com/c/go/+/748400
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Xueqi Luo <1824368278@qq.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Ensure that loading the value of one is included and specify the actual
constants for various instructions. Avoid potential ambiguity with
instructions (e.g. use "AND " to avoid matching "ANDI").
Change-Id: I42707bd68fab9e12d984670e64a02ee93667f77c
Reviewed-on: https://go-review.googlesource.com/c/go/+/748920
Reviewed-by: Julian Zhu <julian.oerv@isrc.iscas.ac.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
|
|
When creating a dynamically-sized slice, the compiler attempts to use a
stack-allocated buffer if the slice does not escape and its buffer size
is ≤ 32 bytes.
In this case, the SSA will contain a (OpPhi (OpSliceMake) (OpSliceMake))
value: one OpSliceMake uses the stack-allocated buffer, and the other
uses the heap-allocated buffer. The len and cap arguments for these two
OpSliceMake values are expected to be identical.
This CL enables the prove pass to recognize this scenario and handle
OpSliceLen and OpSliceCap as intended.
Fixes #77375
Change-Id: Id77a2473caf66d366f5c94108aa6cb6d3df5b887
Reviewed-on: https://go-review.googlesource.com/c/go/+/740840
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This change introduces a new SSA pass that converts conditionals
with logical AND into CMP/CCMP instruction sequences on ARM64.
Fixes #71268
Change-Id: I3eba9c05b88ed6eb70350d30f6e805e6a4dddbf1
Reviewed-on: https://go-review.googlesource.com/c/go/+/698099
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
The Go compiler always generates Loong64 binaries, which can execute any
processor of LA364, LA464, LA664 or higher core. And these processors
support unaligned memory access [1].
goos: linux
goarch: loong64
pkg: strings
cpu: Loongson-3A6000 @ 2500.00MHz
| old.txt | new.txt |
| sec/op | sec/op vs base |
StringPrefix3 4.0040n ± 0% 0.4003n ± 0% -90.00% (p=0.000 n=10)
StringPrefix5 4.0040n ± 0% 0.4003n ± 0% -90.00% (p=0.000 n=10)
StringPrefix6 3.6030n ± 0% 0.4002n ± 0% -88.89% (p=0.000 n=10)
StringPrefix7 4.0040n ± 0% 0.4002n ± 0% -90.00% (p=0.000 n=10)
geomean 3.900n 0.4003n -89.74%
goos: linux
goarch: loong64
pkg: strings
cpu: Loongson-3A5000-HV @ 2500.00MHz
| old.txt │ new.txt |
| sec/op │ sec/op vs base |
StringPrefix3 5.6160n ± 0% 0.4011n ± 0% -92.86% (p=0.000 n=10)
StringPrefix5 5.6180n ± 0% 0.4011n ± 0% -92.86% (p=0.000 n=10)
StringPrefix6 5.2170n ± 0% 0.4011n ± 0% -92.31% (p=0.000 n=10)
StringPrefix7 5.6170n ± 0% 0.4009n ± 0% -92.86% (p=0.000 n=10)
geomean 5.514n 0.4010n -92.73%
goos: linux
goarch: loong64
pkg: strings
cpu: Loongson-3B6000M @ 2400.00MHz
| old.txt │ new.txt |
| sec/op │ sec/op vs base |
StringPrefix3 5.0060n ± 0% 0.4223n ± 1% -91.56% (p=0.000 n=10)
StringPrefix5 4.5890n ± 0% 0.4214n ± 0% -90.82% (p=0.000 n=10)
StringPrefix6 4.5890n ± 0% 0.4190n ± 1% -90.87% (p=0.000 n=10)
StringPrefix7 4.5890n ± 0% 0.4226n ± 1% -90.79% (p=0.000 n=10)
geomean 4.690n 0.4213n -91.02%
[1]: https://go.dev/wiki/MinimumRequirements#loong64
Change-Id: I1870080e0122a7d136685e3045699d0cf1e4194d
Reviewed-on: https://go-review.googlesource.com/c/go/+/742260
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
Change-Id: Ibaee223894693ac4e2694bd1748f57a3e76c0b12
Reviewed-on: https://go-review.googlesource.com/c/go/+/735980
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This commit adds two test functions, bitsOptXor1 and bitsOptXor2,
to verify that the compiler correctly optimizes certain bitwise
expression patterns in future CLs.
Change-Id: Idf5bd1ff8653f8fa218604d857639e063546d8e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/736540
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #76449
This saves a single byte for the REX prefix per OpCopy it triggers on.
Change-Id: I1eab364d07354555ba2f23ffd2f9c522d4a04bd0
Reviewed-on: https://go-review.googlesource.com/c/go/+/731640
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Add optimization patterns for MemEq with small constant sizes
(3-32 bytes). These patterns help to avoid runtime calls for
small sizes.
For sizes 3-16, combine two chunks loading and comparison.
For sizes 17-32, combine a 16-byte comparison with the remaining bytes.
This change may increase binary size slightly due to inline expansion,
but improves performance for code with many small memequals,
e.g. DecodehealingTracker benchmark on arm64:
shortname: minio
pkg: github.com/minio/minio/cmd
│ Orig.res │ Uexp.res │
│ sec/op │ sec/op vs base │
DecodehealingTracker-4 842.5n ± 1% 794.0n ± 3% -5.75% (p=0.000 n=10)
AppendMsgResyncTargetsInfo-4 8.472n ± 0% 8.472n ± 0% ~ (p=0.582 n=10)
DataUpdateTracker-4 2.856µ ± 2% 2.804µ ± 3% ~ (p=0.210 n=10)
MarshalMsgdataUsageCacheInfo-4 131.2n ± 1% 131.6n ± 2% ~ (p=0.494 n=10)
geomean 227.4n 223.2n -1.86%
│ Orig.res │ Uexp.res │
│ B/s │ B/s vs base │
DecodehealingTracker-4 352.0Mi ± 1% 373.5Mi ± 3% +6.10% (p=0.000 n=10)
AppendMsgResyncTargetsInfo-4 1.099Gi ± 0% 1.099Gi ± 0% ~ (p=0.183 n=10)
DataUpdateTracker-4 341.8Ki ± 3% 351.6Ki ± 3% ~ (p=0.286 n=10)
geomean 50.95Mi 52.46Mi +2.96%
Change-Id: If3d7e7395656d5f36e3ab303a71044293d17bc3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/688195
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
The bug was first introduced when the compiler is still written in C,
with CL 2254041. The static array was laid out with the wrong context,
causing a stack pointer will be stored in global object.
Fixes #61730
Fixes #77193
Change-Id: I22c8393314d251beb53db537043a63714c84f36a
Reviewed-on: https://go-review.googlesource.com/c/go/+/737821
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
This makes it easier to identify which functions are used for memory
allocation by looking for functions that start with mallocgc. The Size
suffix is added so that the isSpecializedMalloc function in
cmd/compile/internal/ssa can distinguish between the generated functions
and the mallocgcTiny function called by mallocgc, similar to the SC
suffixes for the mallocgcSmallNoScanSC* and mallocgcSmallScanNoHeaderSC*
functons.
Change-Id: I6ad7f15617bf6f18ae5d1bfa2a0b94e86a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/735780
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Also more consistently include commas after constants to increase
accuracy (i.e. "1," cannot inadvertantly match "10")
Change-Id: I480a73859d2e83354b8e9f94bc73c6563976d0e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/733460
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
IsNaN's underlying instruction, VCMPPS (or VCMPPD), takes two
inputs, and computes either of them is NaN. Optimize the Or
pattern to generate two-operand form.
This implements the optimization mentioned in CL 733660.
Change-Id: I13943b377ee384864c913eed320763f333a03e41
Reviewed-on: https://go-review.googlesource.com/c/go/+/733680
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fix a pattern in test/codegen/comparisons.go to use whitespace instead
of a tab. The test needs a whitespace to properly fail if the regalloc
change from CL686655 would be missing (in that case we would have a
spill immediately after call to memequal, which is supposed to be
captured by this pattern).
Change-Id: I5b6fb5e861b9327c7f071660592b8ffa265e0030
Reviewed-on: https://go-review.googlesource.com/c/go/+/727620
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Use Go idiomatic function names, use a common prefix, attempt to
maintain some consistency, avoid naming functions based upon
machine specific instructions and combine a duplicate test that
likely exists due to this confusion.
Change-Id: I996e9efd7497821edef94c1997d4a310d9d79a71
Reviewed-on: https://go-review.googlesource.com/c/go/+/732200
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Joel Sing <joel@sing.id.au>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: Iba4d3ded15d578e97a978780069e70a51a5e944b
Reviewed-on: https://go-review.googlesource.com/c/go/+/732180
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Nicholas Husin <husin@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Joel Sing <joel@sing.id.au>
|
|
Merge List:
+ 2025-12-08 a33bbf1988 weak: fix weak pointer test to correctly iterate over weak pointers after GC
+ 2025-12-08 a88a96330f cmd/cgo: use doc link for cgo.Handle
+ 2025-12-08 276cc4d3db cmd/link: fix AIX builds after recent linker changes
+ 2025-12-08 f2d96272cb runtime/trace: update TestSubscribers to dump traces
+ 2025-12-08 4837bcc92c internal/trace: skip tests for alloc/free experiment by default
+ 2025-12-08 b5f6816cea cmd/link: generate DWARF for moduledata
+ 2025-12-08 44a39c9dac runtime: only run TestNotInGoMetricCallback when building with cgo
+ 2025-12-08 3a6a034cd6 runtime: disable TestNotInGoMetricCallback on FreeBSD + race
+ 2025-12-08 4122d3e9ea runtime: use atomic C types with atomic C functions
+ 2025-12-08 34397865b1 runtime: deflake TestProfBufWakeup
+ 2025-12-08 d4972f6295 runtime: mark getfp as nosplit
+ 2025-12-05 0d0d5c9a82 test/codegen: test negation with add/sub on riscv64
+ 2025-12-05 2e509e61ef cmd/go: convert some more tests to script tests
+ 2025-12-05 c270e71835 cmd/go/internal/vet: skip -fix on pkgs from vendor or non-main mod
+ 2025-12-05 745349712e runtime: don't count nGsyscallNoP for extra Ms in C
+ 2025-12-05 f3d572d96a cmd/go: fix race applying fixes in fix and vet -fix modes
+ 2025-12-05 76345533f7 runtime: expand Pinner documentation
+ 2025-12-05 b133524c0f cmd/go/testdata/script: skip vet_cache in short mode
+ 2025-12-05 96e142ba2b runtime: skip TestArenaCollision if we run out of hints
+ 2025-12-05 fe4952f116 runtime: relax threadsSlack in TestReadMetricsSched
+ 2025-12-05 8947f092a8 runtime: skip mayMoreStackMove in goroutine leak tests
+ 2025-12-05 44cb82449e runtime/race: set missing argument frame for ppc64x atomic And/Or wrappers
+ 2025-12-05 435e61c801 runtime: reject any goroutine leak test failure that failed to execute
+ 2025-12-05 54e5540014 runtime: print output in case of segfault in goroutine leak tests
+ 2025-12-05 9616c33295 runtime: don't specify GOEXPERIMENT=greenteagc in goroutine leak tests
+ 2025-12-05 2244bd7eeb crypto/subtle: add speculation barrier after DIT
+ 2025-12-05 f84f8d86be cmd/compile: fix mis-infer bounds in slice len/cap calculations
+ 2025-12-05 a70addd3b3 all: fix some comment issues
+ 2025-12-05 93b49f773d internal/runtime/maps: clarify probeSeq doc comment
+ 2025-12-04 91267f0a70 all: update vendored x/tools
+ 2025-12-04 a753a9ed54 cmd/internal/fuzztest: move fuzz tests out of cmd/go test suite
+ 2025-12-04 1681c3b67f crypto: use rand.IsDefaultReader instead of comparing to boring.RandReader
+ 2025-12-04 7b67b68a0d cmd/compile: use isUnsignedPowerOfTwo rather than isPowerOfTwo for unsigneds
+ 2025-12-03 2b62144069 all: REVERSE MERGE dev.simd (9ac524a) into master
Change-Id: Ia0cdf06cdde89b6a4db30ed15ed8e0bcbac6ae30
|
|
Also removes a few leftover TODOs and scraps of commented-out code
from simd development.
Updated etetest.sh to make it behave whether amd64 implies the
experiment, or not.
Fixes #76473.
Change-Id: I6d9792214d7f514cb90c21b101dbf7d07c1d0e55
Reviewed-on: https://go-review.googlesource.com/c/go/+/728220
TryBot-Bypass: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|