| Age | Commit message (Collapse) | Author |
|
The debug info generation currently depends on this invariant.
A small update to CL 468455.
Update #58482
Change-Id: Ica305d360d9af04036c604b6a65b683f7cb6e212
Reviewed-on: https://go-review.googlesource.com/c/go/+/468695
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
The actual scheduling of SP early doesn't really matter, but lots of
early spills (of arguments) depend on SP so they can't schedule until
SP does.
Fixes #58482
Change-Id: Ie581fba7cb173d665c11f797f39d824b1c040a2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/468455
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
When scheduling a block, deprioritize values whose results aren't used
until subsequent blocks.
For #58166, this has the effect of pushing the induction variable increment
to the end of the block, past all the other uses of the pre-incremented value.
Do this only with optimizations on. Debuggers have a preference for values
in source code order, which this CL can degrade.
Fixes #58166
Fixes #57976
Change-Id: I40d5885c661b142443c6d4702294c8abe8026c4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463751
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
If OpArgIntReg is incorrectly scheduled, that causes it to be spilled
incorrectly, which causes the argument to not be considered live
at the start of the function.
This is the test for CL 462858
Add a brief mention of why CL 462858 is needed in the scheduling code.
Change-Id: Id199456f88d9ee5ca46d7b0353a3c2049709880e
Reviewed-on: https://go-review.googlesource.com/c/go/+/462899
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
The register allocator doesn't like OpArg coming in between other
OpIntArg operations, as it doesn't put the spills in the right place
in that situation.
This is just a bug in the new scheduler, I didn't copy over the
proper score from the old scheduler correctly.
Change-Id: I3b4ee1754982fb360e99c5864b19e7408d60b5bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/462858
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Convert the scheduling pass from scheduling backwards to scheduling forwards.
Forward scheduling makes it easier to prioritize scheduling values as
soon as they are ready, which is important for things like nil checks,
select ops, etc.
Forward scheduling is also quite a bit clearer. It was originally
backwards because computing uses is tricky, but I found a way to do it
simply and with n lg n complexity. The new scheme also makes it easy
to add new scheduling edges if needed.
Fixes #42673
Update #56568
Change-Id: Ibbb38c52d191f50ce7a94f8c1cbd3cd9b614ea8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/270940
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
The SPanchored opcode is identical to SP, except that it takes a memory
argument so that it (and more importantly, anything that uses it)
must be scheduled at or after that memory argument.
This opcode ensures that a LEAQ of a variable gets scheduled after the
corresponding VARDEF for that variable.
This may lead to less CSE of LEAQ operations. The effect is very small.
The go binary is only 80 bytes bigger after this CL. Usually LEAQs get
folded into load/store operations, so the effect is only for pointerful
types, large enough to need a duffzero, and have their address passed
somewhere. Even then, usually the CSEd LEAQs will be un-CSEd because
the two uses are on different sides of a function call and the LEAQ
ends up being rematerialized at the second use anyway.
Change-Id: Ib893562cd05369b91dd563b48fb83f5250950293
Reviewed-on: https://go-review.googlesource.com/c/go/+/452916
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This needs to be as low as possible while not breaking priority
assumptions of other scores to correctly schedule carry chains.
Prior to the arm64 changes, it was set below ReadTuple. At the time,
this prevented the MulHiLo implementation on PPC64 from occluding
the scheduling of a full carry chain.
Memory scores can also prevent better scheduling, as can be observed
with crypto/internal/edwards25519/field.feMulGeneric.
Fixes #56497
Change-Id: Ia4b54e6dffcce584faf46b1b8d7cea18a3913887
Reviewed-on: https://go-review.googlesource.com/c/go/+/447435
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
We kind of have this mechanism already, just normalizing it and
using it in a bunch of places. Previously a bunch of places cached
slices only for the duration of a single function compilation. Now
we can reuse slices across a whole compiler run.
Use a sync.Pool of powers-of-two sizes. This lets us use not
too much memory, and avoid holding onto memory we're no longer
using when a GC happens.
There's a few different types we need, so generate the code for it.
Generics would be useful here, but we can't use generics in the
compiler because of bootstrapping.
Change-Id: I6cf37e7b7b2e802882aaa723a0b29770511ccd82
Reviewed-on: https://go-review.googlesource.com/c/go/+/444820
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This is a follow up of CL 393656 on arm64.
This CL puts ScoreCarryChainTail before ScoreMemory and after
ScoreReadFlags, so that the scheduling of the carry chain will not
break the scheduling of ScoreVarDef.
Benchmarks:
name old time/op new time/op delta
ScalarMult/P256-8 42.0µs ± 0% 42.0µs ± 0% -0.13% (p=0.032 n=5+5)
ScalarMult/P224-8 135µs ± 0% 96µs ± 0% -29.04% (p=0.008 n=5+5)
ScalarMult/P384-8 573µs ± 1% 355µs ± 0% -38.05% (p=0.008 n=5+5)
ScalarMult/P521-8 1.50ms ± 4% 0.77ms ± 0% -48.78% (p=0.008 n=5+5)
MarshalUnmarshal/P256/Uncompressed-8 505ns ± 1% 506ns ± 0% ~ (p=0.460 n=5+5)
MarshalUnmarshal/P256/Compressed-8 6.75µs ± 0% 6.73µs ± 0% -0.27% (p=0.016 n=5+5)
MarshalUnmarshal/P224/Uncompressed-8 927ns ± 0% 818ns ± 0% -11.76% (p=0.008 n=5+5)
MarshalUnmarshal/P224/Compressed-8 136µs ± 0% 96µs ± 0% -29.58% (p=0.008 n=5+5)
MarshalUnmarshal/P384/Uncompressed-8 1.77µs ± 0% 1.36µs ± 1% -23.14% (p=0.008 n=5+5)
MarshalUnmarshal/P384/Compressed-8 56.5µs ± 0% 31.9µs ± 0% -43.59% (p=0.016 n=5+4)
MarshalUnmarshal/P521/Uncompressed-8 2.91µs ± 0% 2.03µs ± 1% -30.32% (p=0.008 n=5+5)
MarshalUnmarshal/P521/Compressed-8 148µs ± 0% 68µs ± 1% -54.28% (p=0.008 n=5+5)
Change-Id: I4bf4e3265d7e1ee85765ff2bf006ca5a794d4979
Reviewed-on: https://go-review.googlesource.com/c/go/+/432275
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
|
|
This reverts commit 4c414c7673af6b2aedee276d2e62cb2910eb19f3.
Reason for revert: breaks ppc64 build (see issue 55254)
Change-Id: I096ffa0e6535d31d9dd4079b48bb201b20220d76
Reviewed-on: https://go-review.googlesource.com/c/go/+/432196
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This is a follow up of CL 393656 on arm64.
Benchmarks:
name old time/op new time/op delta
ScalarMult/P256-8 42.0µs ± 0% 42.0µs ± 0% -0.13% (p=0.032 n=5+5)
ScalarMult/P224-8 135µs ± 0% 96µs ± 0% -29.04% (p=0.008 n=5+5)
ScalarMult/P384-8 573µs ± 1% 355µs ± 0% -38.05% (p=0.008 n=5+5)
ScalarMult/P521-8 1.50ms ± 4% 0.77ms ± 0% -48.78% (p=0.008 n=5+5)
MarshalUnmarshal/P256/Uncompressed-8 505ns ± 1% 506ns ± 0% ~ (p=0.460 n=5+5)
MarshalUnmarshal/P256/Compressed-8 6.75µs ± 0% 6.73µs ± 0% -0.27% (p=0.016 n=5+5)
MarshalUnmarshal/P224/Uncompressed-8 927ns ± 0% 818ns ± 0% -11.76% (p=0.008 n=5+5)
MarshalUnmarshal/P224/Compressed-8 136µs ± 0% 96µs ± 0% -29.58% (p=0.008 n=5+5)
MarshalUnmarshal/P384/Uncompressed-8 1.77µs ± 0% 1.36µs ± 1% -23.14% (p=0.008 n=5+5)
MarshalUnmarshal/P384/Compressed-8 56.5µs ± 0% 31.9µs ± 0% -43.59% (p=0.016 n=5+4)
MarshalUnmarshal/P521/Uncompressed-8 2.91µs ± 0% 2.03µs ± 1% -30.32% (p=0.008 n=5+5)
MarshalUnmarshal/P521/Compressed-8 148µs ± 0% 68µs ± 1% -54.28% (p=0.008 n=5+5)
Change-Id: I33170360eb8279b998e3c559f7136717fe32e07d
Reviewed-on: https://go-review.googlesource.com/c/go/+/424907
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
A recent change was made for ppc64x to treat ANDCCconst as
a tuple, allowing ANDconst to be removed from the list
of ops. Included in that change were some improvements to the
rules to avoid some extra code, mainly the elimination of a
cmp 0 following an andi. and in some cases the following
isel. While those changes worked for most cases, in a few
cases some extra unnecessary code was generated.
Currently the snippet appears in archive/zip.(*FileHeader).Mode:
ANDCC R4,$1,R5 // andi. r5,r4,1
ANDCC R4,$16,R5 // andi. r5,r4,16
CMPW R5,R0 // cmpw r5,r0
ADDIS $0,$-32768,R5 // lis r5,-32768
OR R5,$511,R5 // ori r5,r5,511
MOVD $438,R6 // li r6,438
ISEL $2,R6,R5,R5 // isel r5,r6,r5,eq
MOVD $-147,R6 // li r6,-147
AND R6,R5,R6 // and r6,r5,r6
ANDCC R4,$1,R4 // andi. r4,r4,1
ISEL $2,R5,R6,R4 // isel r4,r5,r6,eq
The first ANDCC is never used and should not be there.
From the ssa.html file, the scheduler is not putting the Select1
close to the ISEL, which results in the flag being clobbered
before it can be used. By changing the score for a Select0 or Select1
with type Flags, the extra ANDCC does not occur.
Change-Id: I82f4bc7c02afb1c2b1c048dc6995e0b3f9363fb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/424294
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
|
|
Following CL 424454, using command
rg --multiline " the\s{1,}the " *
rg --multiline " the\s{1,}//\s{1,}the " *
all the words "the" that are repeated in comments are found.
Change-Id: I60b769b98f04c927b4c228e10f37faf190964069
Reviewed-on: https://go-review.googlesource.com/c/go/+/423836
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
|
|
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: Id533912c62d8c4e2aa3c124561772b543d685d7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/367041
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
This results in a 1.7-2.4x improvement in native go crypto/elliptic
multiplication operations on PPC64, and similar improvements might
be possible on other architectures which use flags or similar to
represent the carry bit in SSA form.
If it is possible, schedule carry chains independently of each
other to avoid clobbering the carry flag. This is very expensive.
This is done by:
1. Identifying carry bit using, but not creating ops, and lowering
their priority below all other ops which do not need to be
placed at the top of a block. This effectively ensures only
one carry chain will be placed at a time in most important
cases (crypto/elliptic/internal/fiat contains most of them).
2. Raising the priority of carry bit generating ops to schedule
later in a block to ensure they are placed as soon as they
are ready.
Likewise, tuple ops which separate carrying ops are scored
similar to 2 above. This prevents unrelated ops from being
scheduled between carry-dependent operations. This occurs
when unrelated ops are ready to schedule alongside such
tuple ops. This reduces the chances a flag clobbering op
might be placed between two carry-dependent operations.
With PPC64 Add64/Sub64 lowering into SSA and this patch, the net
performance difference in crypto/elliptic benchmarks on P9/ppc64le
are:
name old time/op new time/op delta
ScalarBaseMult/P256 46.3µs ± 0% 46.9µs ± 0% +1.34%
ScalarBaseMult/P224 356µs ± 0% 209µs ± 0% -41.14%
ScalarBaseMult/P384 1.20ms ± 0% 0.57ms ± 0% -52.14%
ScalarBaseMult/P521 3.38ms ± 0% 1.44ms ± 0% -57.27%
ScalarMult/P256 199µs ± 0% 199µs ± 0% -0.17%
ScalarMult/P224 357µs ± 0% 212µs ± 0% -40.56%
ScalarMult/P384 1.20ms ± 0% 0.58ms ± 0% -51.86%
ScalarMult/P521 3.37ms ± 0% 1.44ms ± 0% -57.32%
MarshalUnmarshal/P256/Uncompressed 2.59µs ± 0% 2.52µs ± 0% -2.63%
MarshalUnmarshal/P256/Compressed 2.58µs ± 0% 2.52µs ± 0% -2.06%
MarshalUnmarshal/P224/Uncompressed 1.54µs ± 0% 1.40µs ± 0% -9.42%
MarshalUnmarshal/P224/Compressed 1.54µs ± 0% 1.39µs ± 0% -9.87%
MarshalUnmarshal/P384/Uncompressed 2.40µs ± 0% 1.80µs ± 0% -24.93%
MarshalUnmarshal/P384/Compressed 2.35µs ± 0% 1.81µs ± 0% -23.03%
MarshalUnmarshal/P521/Uncompressed 3.79µs ± 0% 2.58µs ± 0% -31.81%
MarshalUnmarshal/P521/Compressed 3.80µs ± 0% 2.60µs ± 0% -31.67%
Note, P256 uses an asm implementation, thus, little variation is expected.
Updates #40171
Change-Id: I810850e8ff429505424c92d6fe37f99aaa0c6e84
Reviewed-on: https://go-review.googlesource.com/c/go/+/393656
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
|
|
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]
Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.
For #51082.
Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
for block control
In the scheduler we have the logic that if a Value is used as the
block's control, we schedule it at the end, except for Phis and
Args. Even the comment says so, the code doesn't exclude
in-register Args (OpArgXXXReg).
Change to check for score instead, which includes OpArgXXXRegs.
It also includes GetClosurePtr, which must be scheduled early.
We just happen to never use it as block control.
Found when working on ARM64 register ABI. In theory this could
apply to AMD64 as well. But on AMD64 we never use in-register
Value as block control, as conditional branch is always based on
FLAGS, never based on registers, so it doesn't actually cause any
problem.
Change-Id: I167a550309772639574f7468caf91bd805eb74c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/322849
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
OpArgXXXReg values must be scheduled at the very top, as their
registers need to be live at the beginning before any other use
of the register.
Change-Id: Ic76768bb74da402adbe61db3b2d174ecd3f9fffc
Reviewed-on: https://go-review.googlesource.com/c/go/+/306329
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
in progress; doesn't fully work until they are also passed on
register on the caller side.
For #40724.
Change-Id: I29a6680e60bdbe9d132782530214f2a2b51fb8f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/293394
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
StaticLECall (multiple value in +mem, multiple value result +mem) ->
StaticCall (multiple ergister value in +mem,
multiple register-sized-value result +mem) ->
ARCH CallStatic (multiple ergister value in +mem,
multiple register-sized-value result +mem)
But the architecture-dependent stuff is indifferent to whether
it is mem->mem or (mem)->(mem) until Prog generation.
Deal with OpSelectN -> Prog in ssagen/ssa.go, others, as they
appear.
For #40724.
Change-Id: I1d0436f6371054f1881862641d8e7e418e4a6a16
Reviewed-on: https://go-review.googlesource.com/c/go/+/293391
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
These comparisons are fairly arbitrary,
but they should be more stable in the face
of other compiler changes than value ID.
This reduces the number of value ID
comparisons in schedule while running
make.bash from 542,442 to 99,703.
There are lots of changes to generated code
from this change, but they appear to
be overall neutral.
It is possible to further reduce the
number of comparisons in schedule;
I have changes locally that reduce the
number to about 25,000 during make.bash.
However, the changes are increasingly
complex and arcane, and reduce in much less
code churn. Given that the goal is stability,
that suggests that this is a reasonable
place to stop, at least for now.
Change-Id: Ie3a75f84fd3f3fdb102fcd0b29299950ea66b827
Reviewed-on: https://go-review.googlesource.com/c/go/+/229799
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Falling back to comparing Value.ID during scheduling
is undesirable: Not only are we simply hoping for a good
outcome, but the decision we make will be easily perturbed
by other compiler changes, leading to random fluctuations.
This change adds another decision point to the scheduler
by scheduling Values with many uses earlier.
Values with fewer uses are less likely to be spilled for
other reasons, so we should issue them as late as possible
in the hope of avoiding a spill.
This reduces the number of Value ID comparisons
in schedule while running make.bash
from 1,000,844 to 542,442.
As you would expect, this changes a lot of functions,
but the overall trend is positive:
file before after Δ %
api 5237184 5233088 -4096 -0.078%
compile 19926480 19918288 -8192 -0.041%
cover 5281816 5277720 -4096 -0.078%
dist 3711608 3707512 -4096 -0.110%
total 113588440 113567960 -20480 -0.018%
Change-Id: Ic99ebc4c614d4ae3807ce44473ec6b04684388ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/229798
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Based on riscv-go port.
Updates #27532
Change-Id: Ia329daa243db63ff334053b8807ea96b97ce3acf
Reviewed-on: https://go-review.googlesource.com/c/go/+/204631
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Control values are used to choose which successor of a block is
jumped to. Typically a control value takes the form of a 'flags'
value that represents the result of a comparison. Some
architectures however use a variable in a register as a control
value.
Up until now we have managed with a single control value per block.
However some architectures (e.g. s390x and riscv64) have combined
compare-and-branch instructions that take two variables in registers
as parameters. To generate these instructions we need to support 2
control values per block.
This CL allows up to 2 control values to be used in a block in
order to support the addition of compare-and-branch instructions.
I have implemented s390x compare-and-branch instructions in a
different CL.
Passes toolstash-check -all.
Results of compilebench:
name old time/op new time/op delta
Template 208ms ± 1% 209ms ± 1% ~ (p=0.289 n=20+20)
Unicode 83.7ms ± 1% 83.3ms ± 3% -0.49% (p=0.017 n=18+18)
GoTypes 748ms ± 1% 748ms ± 0% ~ (p=0.460 n=20+18)
Compiler 3.47s ± 1% 3.48s ± 1% ~ (p=0.070 n=19+18)
SSA 11.5s ± 1% 11.7s ± 1% +1.64% (p=0.000 n=19+18)
Flate 130ms ± 1% 130ms ± 1% ~ (p=0.588 n=19+20)
GoParser 160ms ± 1% 161ms ± 1% ~ (p=0.211 n=20+20)
Reflect 465ms ± 1% 467ms ± 1% +0.42% (p=0.007 n=20+20)
Tar 184ms ± 1% 185ms ± 2% ~ (p=0.087 n=18+20)
XML 253ms ± 1% 253ms ± 1% ~ (p=0.377 n=20+18)
LinkCompiler 769ms ± 2% 774ms ± 2% ~ (p=0.070 n=19+19)
ExternalLinkCompiler 3.59s ±11% 3.68s ± 6% ~ (p=0.072 n=20+20)
LinkWithoutDebugCompiler 446ms ± 5% 454ms ± 3% +1.79% (p=0.002 n=19+20)
StdCmd 26.0s ± 2% 26.0s ± 2% ~ (p=0.799 n=20+20)
name old user-time/op new user-time/op delta
Template 238ms ± 5% 240ms ± 5% ~ (p=0.142 n=20+20)
Unicode 105ms ±11% 106ms ±10% ~ (p=0.512 n=20+20)
GoTypes 876ms ± 2% 873ms ± 4% ~ (p=0.647 n=20+19)
Compiler 4.17s ± 2% 4.19s ± 1% ~ (p=0.093 n=20+18)
SSA 13.9s ± 1% 14.1s ± 1% +1.45% (p=0.000 n=18+18)
Flate 145ms ±13% 146ms ± 5% ~ (p=0.851 n=20+18)
GoParser 185ms ± 5% 188ms ± 7% ~ (p=0.174 n=20+20)
Reflect 534ms ± 3% 538ms ± 2% ~ (p=0.105 n=20+18)
Tar 215ms ± 4% 211ms ± 9% ~ (p=0.079 n=19+20)
XML 295ms ± 6% 295ms ± 5% ~ (p=0.968 n=20+20)
LinkCompiler 832ms ± 4% 837ms ± 7% ~ (p=0.707 n=17+20)
ExternalLinkCompiler 1.58s ± 8% 1.60s ± 4% ~ (p=0.296 n=20+19)
LinkWithoutDebugCompiler 478ms ±12% 489ms ±10% ~ (p=0.429 n=20+20)
name old object-bytes new object-bytes delta
Template 559kB ± 0% 559kB ± 0% ~ (all equal)
Unicode 216kB ± 0% 216kB ± 0% ~ (all equal)
GoTypes 2.03MB ± 0% 2.03MB ± 0% ~ (all equal)
Compiler 8.07MB ± 0% 8.07MB ± 0% -0.06% (p=0.000 n=20+20)
SSA 27.1MB ± 0% 27.3MB ± 0% +0.89% (p=0.000 n=20+20)
Flate 343kB ± 0% 343kB ± 0% ~ (all equal)
GoParser 441kB ± 0% 441kB ± 0% ~ (all equal)
Reflect 1.36MB ± 0% 1.36MB ± 0% ~ (all equal)
Tar 487kB ± 0% 487kB ± 0% ~ (all equal)
XML 632kB ± 0% 632kB ± 0% ~ (all equal)
name old export-bytes new export-bytes delta
Template 18.5kB ± 0% 18.5kB ± 0% ~ (all equal)
Unicode 7.92kB ± 0% 7.92kB ± 0% ~ (all equal)
GoTypes 35.0kB ± 0% 35.0kB ± 0% ~ (all equal)
Compiler 109kB ± 0% 110kB ± 0% +0.72% (p=0.000 n=20+20)
SSA 137kB ± 0% 138kB ± 0% +0.58% (p=0.000 n=20+20)
Flate 4.89kB ± 0% 4.89kB ± 0% ~ (all equal)
GoParser 8.49kB ± 0% 8.49kB ± 0% ~ (all equal)
Reflect 11.4kB ± 0% 11.4kB ± 0% ~ (all equal)
Tar 10.5kB ± 0% 10.5kB ± 0% ~ (all equal)
XML 16.7kB ± 0% 16.7kB ± 0% ~ (all equal)
name old text-bytes new text-bytes delta
HelloSize 761kB ± 0% 761kB ± 0% ~ (all equal)
CmdGoSize 10.8MB ± 0% 10.8MB ± 0% ~ (all equal)
name old data-bytes new data-bytes delta
HelloSize 10.7kB ± 0% 10.7kB ± 0% ~ (all equal)
CmdGoSize 312kB ± 0% 312kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 122kB ± 0% 122kB ± 0% ~ (all equal)
CmdGoSize 146kB ± 0% 146kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.13MB ± 0% 1.13MB ± 0% ~ (all equal)
CmdGoSize 15.1MB ± 0% 15.1MB ± 0% ~ (all equal)
Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f
Reviewed-on: https://go-review.googlesource.com/c/go/+/196557
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
If a for loop has a simple condition and begins with a simple
"if x { break; }"; we can simply add "!x" to the loop's condition.
While at it, simplify a few assignments to use the common pattern
"x := staticDefault; if cond { x = otherValue(); }".
Finally, simplify a couple of var declarations.
Change-Id: I413982c6abd32905adc85a9a666cb3819139c19f
Reviewed-on: https://go-review.googlesource.com/c/go/+/165342
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
For the entry block, make the "first instruction" be truly
the first instruction. This allows printing of incoming
parameters with Delve.
Also be sure Phis are marked as being at the start of their
block. This is observed to move location list pointers,
and where moved, they become correct.
Leading zero-width instructions include LoweredGetClosurePtr.
Because this instruction is actually architecture-specific,
and it is now tested for in 3 different places, also created
Op.isLoweredGetClosurePtr() to reduce future surprises.
Change-Id: Ic043b7265835cf1790382a74334b5714ae4060af
Reviewed-on: https://go-review.googlesource.com/c/145179
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
There's nothing enforcing ordering between redundant nil checks when
they may occur during the same memory state. Commit to using the
earliest one in source order.
Otherwise the choice of which to remove depends on the ordering of
values in a block (before scheduling). That's unfortunate when trying
to ensure that the compiler doesn't depend on that ordering for
anything.
Update #20178
Change-Id: I2cdd5be10618accd9d91fa07406c90cbd023ffba
Reviewed-on: https://go-review.googlesource.com/c/151517
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
name old time/op new time/op delta
Add-8 1.11ns ± 0% 1.18ns ± 0% +6.31% (p=0.029 n=4+4)
Add32-8 1.02ns ± 0% 1.02ns ± 1% ~ (p=0.333 n=4+5)
Add64-8 1.11ns ± 1% 1.17ns ± 0% +5.79% (p=0.008 n=5+5)
Add64multiple-8 4.35ns ± 1% 0.86ns ± 0% -80.22% (p=0.000 n=5+4)
The individual ops are a bit slower (but still very fast).
Using the ops in carry chains is very fast.
Update #28273
Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The location list for OpArg starts where the OpArg appears;
this is not necessarily as soon as the OpArg coulde be
observed, and it is reasonable for a user to expect that
if a breakpoint is set "on f" then the arguments to f will
be observable where that breakpoint happens to be set (this
may also require setting the breakpoint after the prologue,
but that is another issue).
Change-Id: I0a1b848e50f475e5d8a5fad781241126872a0400
Reviewed-on: https://go-review.googlesource.com/c/142819
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts commit 1a27f048ad25f151d2a17ce7f2d73d0d2dbe94cf.
Reason for revert: Broke the ssacheck and -N-l builders, and the -N-l fix looks like it will take some time and take a different route entirely.
Change-Id: Ie0ac5e86ab7d72a303dfbbc48dfdf1e092d4f61a
Reviewed-on: https://go-review.googlesource.com/121715
Reviewed-by: David Chase <drchase@google.com>
|
|
Given a carefully constructed input, writebarrier would
split a block with the OpAddr in the first half and the
VarDef in the second half which ultimately leads to a
compiler crash because the scheduler is no longer able
to put them in the proper order.
To fix, recognize the implicit dependence of OpAddr on
the VarDef of the same symbol if any exists.
This fix was chosen over making OpAddr take a memory
operand to make the dependence explicit, because this
change is less invasive at this late part of the 1.11
release cycle.
Fixes #26105.
Change-Id: I9b65460673af3af41740ef877d2fca91acd336bc
Reviewed-on: https://go-review.googlesource.com/121436
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This commit adds the wasm architecture to the compile command.
A later commit will contain the corresponding linker changes.
Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4
The following files are generated:
- src/cmd/compile/internal/ssa/opGen.go
- src/cmd/compile/internal/ssa/rewriteWasm.go
- src/cmd/internal/obj/wasm/anames.go
Updates #18892
Change-Id: Ifb4a96a3e427aac2362a1c97967d5667450fba3b
Reviewed-on: https://go-review.googlesource.com/103295
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Allocate two more ssa local worklists on the stack. The initial sizes
are chosen to cover >99% of the calls.
name old time/op new time/op delta
Template 281ms ± 2% 283ms ± 5% ~ (p=0.443 n=18+19)
Unicode 136ms ± 4% 135ms ± 7% ~ (p=0.277 n=20+20)
GoTypes 886ms ± 2% 885ms ± 2% ~ (p=0.862 n=20+20)
Compiler 4.03s ± 2% 4.02s ± 1% ~ (p=0.270 n=19+20)
SSA 9.66s ± 1% 9.64s ± 2% ~ (p=0.253 n=20+20)
Flate 186ms ± 5% 183ms ± 6% ~ (p=0.174 n=20+20)
GoParser 222ms ± 4% 219ms ± 4% ~ (p=0.081 n=20+20)
Reflect 569ms ± 2% 568ms ± 2% ~ (p=0.686 n=19+19)
Tar 258ms ± 4% 256ms ± 3% ~ (p=0.211 n=20+20)
XML 319ms ± 2% 317ms ± 3% ~ (p=0.158 n=18+20)
name old user-time/op new user-time/op delta
Template 396ms ± 6% 392ms ± 6% ~ (p=0.211 n=20+20)
Unicode 212ms ±10% 211ms ± 9% ~ (p=0.904 n=20+20)
GoTypes 1.21s ± 3% 1.21s ± 2% ~ (p=0.183 n=20+20)
Compiler 5.60s ± 2% 5.62s ± 2% ~ (p=0.355 n=18+18)
SSA 14.0s ± 6% 13.9s ± 5% ~ (p=0.678 n=20+20)
Flate 250ms ± 8% 245ms ± 6% ~ (p=0.166 n=19+20)
GoParser 305ms ± 6% 304ms ± 5% ~ (p=0.659 n=20+20)
Reflect 760ms ± 3% 758ms ± 4% ~ (p=0.758 n=20+20)
Tar 362ms ± 6% 357ms ± 5% ~ (p=0.108 n=20+20)
XML 429ms ± 4% 429ms ± 4% ~ (p=0.799 n=20+20)
name old alloc/op new alloc/op delta
Template 39.0MB ± 0% 38.8MB ± 0% -0.55% (p=0.000 n=20+20)
Unicode 29.1MB ± 0% 29.1MB ± 0% -0.06% (p=0.000 n=20+20)
GoTypes 116MB ± 0% 115MB ± 0% -0.50% (p=0.000 n=20+20)
Compiler 493MB ± 0% 491MB ± 0% -0.46% (p=0.000 n=19+20)
SSA 1.40GB ± 0% 1.40GB ± 0% -0.31% (p=0.000 n=19+20)
Flate 25.0MB ± 0% 24.9MB ± 0% -0.60% (p=0.000 n=19+19)
GoParser 30.9MB ± 0% 30.7MB ± 0% -0.66% (p=0.000 n=20+20)
Reflect 77.5MB ± 0% 77.1MB ± 0% -0.52% (p=0.000 n=20+20)
Tar 39.2MB ± 0% 39.0MB ± 0% -0.47% (p=0.000 n=20+20)
XML 44.8MB ± 0% 44.6MB ± 0% -0.45% (p=0.000 n=20+19)
name old allocs/op new allocs/op delta
Template 382k ± 0% 379k ± 0% -0.69% (p=0.000 n=20+19)
Unicode 337k ± 0% 336k ± 0% -0.09% (p=0.000 n=20+20)
GoTypes 1.19M ± 0% 1.18M ± 0% -0.64% (p=0.000 n=20+20)
Compiler 4.60M ± 0% 4.58M ± 0% -0.57% (p=0.000 n=20+20)
SSA 11.5M ± 0% 11.4M ± 0% -0.42% (p=0.000 n=19+20)
Flate 235k ± 0% 233k ± 0% -0.74% (p=0.000 n=20+19)
GoParser 316k ± 0% 313k ± 0% -0.69% (p=0.000 n=20+20)
Reflect 953k ± 0% 946k ± 0% -0.81% (p=0.000 n=20+20)
Tar 391k ± 0% 388k ± 0% -0.61% (p=0.000 n=20+19)
XML 413k ± 0% 411k ± 0% -0.56% (p=0.000 n=20+20)
Change-Id: I7378174e3550b47df4368b24cf24c8ce1b85c906
Reviewed-on: https://go-review.googlesource.com/104656
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
|
|
Compiler instrumentation shows that the cap of the stores slice in the
storeOrder function is almost always 64 or less. Since the slice does
not escape, pre-allocating on the stack a 64-elements one greatly
reduces the number of allocations performed by the function.
name old time/op new time/op delta
Template 289ms ± 5% 283ms ± 3% -1.99% (p=0.000 n=19+20)
Unicode 140ms ± 6% 136ms ± 6% -2.61% (p=0.021 n=19+20)
GoTypes 915ms ± 2% 895ms ± 2% -2.24% (p=0.000 n=19+20)
Compiler 4.15s ± 1% 4.04s ± 2% -2.73% (p=0.000 n=20+20)
SSA 10.0s ± 1% 9.8s ± 2% -2.13% (p=0.000 n=20+20)
Flate 189ms ± 6% 186ms ± 4% -1.75% (p=0.028 n=19+20)
GoParser 229ms ± 5% 224ms ± 4% -2.25% (p=0.001 n=20+19)
Reflect 584ms ± 2% 573ms ± 3% -1.83% (p=0.000 n=18+20)
Tar 265ms ± 3% 261ms ± 3% -1.33% (p=0.021 n=20+20)
XML 328ms ± 2% 321ms ± 2% -2.11% (p=0.000 n=20+20)
name old user-time/op new user-time/op delta
Template 408ms ± 4% 400ms ± 4% -1.98% (p=0.006 n=19+20)
Unicode 216ms ± 9% 216ms ± 7% ~ (p=0.883 n=20+20)
GoTypes 1.25s ± 1% 1.23s ± 3% -1.32% (p=0.002 n=19+20)
Compiler 5.77s ± 1% 5.69s ± 2% -1.47% (p=0.000 n=18+19)
SSA 14.6s ± 5% 14.1s ± 4% -3.45% (p=0.000 n=20+20)
Flate 252ms ± 7% 251ms ± 7% ~ (p=0.659 n=20+20)
GoParser 314ms ± 5% 310ms ± 5% ~ (p=0.165 n=20+20)
Reflect 780ms ± 2% 769ms ± 3% -1.34% (p=0.004 n=19+18)
Tar 365ms ± 7% 367ms ± 5% ~ (p=0.841 n=20+20)
XML 439ms ± 4% 432ms ± 4% -1.45% (p=0.043 n=20+20)
name old alloc/op new alloc/op delta
Template 38.9MB ± 0% 38.8MB ± 0% -0.26% (p=0.000 n=19+20)
Unicode 29.0MB ± 0% 29.0MB ± 0% -0.02% (p=0.001 n=20+19)
GoTypes 115MB ± 0% 115MB ± 0% -0.31% (p=0.000 n=20+20)
Compiler 492MB ± 0% 490MB ± 0% -0.41% (p=0.000 n=20+19)
SSA 1.40GB ± 0% 1.39GB ± 0% -0.48% (p=0.000 n=20+20)
Flate 24.9MB ± 0% 24.9MB ± 0% -0.24% (p=0.000 n=20+20)
GoParser 30.9MB ± 0% 30.8MB ± 0% -0.39% (p=0.000 n=20+20)
Reflect 77.1MB ± 0% 76.8MB ± 0% -0.32% (p=0.000 n=17+20)
Tar 39.1MB ± 0% 39.0MB ± 0% -0.23% (p=0.000 n=20+20)
XML 44.7MB ± 0% 44.6MB ± 0% -0.30% (p=0.000 n=20+18)
name old allocs/op new allocs/op delta
Template 385k ± 0% 382k ± 0% -0.99% (p=0.000 n=20+19)
Unicode 336k ± 0% 336k ± 0% -0.08% (p=0.000 n=19+17)
GoTypes 1.20M ± 0% 1.18M ± 0% -1.11% (p=0.000 n=20+18)
Compiler 4.66M ± 0% 4.59M ± 0% -1.42% (p=0.000 n=19+20)
SSA 11.6M ± 0% 11.5M ± 0% -1.49% (p=0.000 n=20+20)
Flate 237k ± 0% 235k ± 0% -1.00% (p=0.000 n=20+19)
GoParser 319k ± 0% 315k ± 0% -1.12% (p=0.000 n=20+20)
Reflect 960k ± 0% 952k ± 0% -0.92% (p=0.000 n=18+20)
Tar 394k ± 0% 390k ± 0% -0.87% (p=0.000 n=20+20)
XML 418k ± 0% 413k ± 0% -1.18% (p=0.000 n=20+20)
Change-Id: I01b9f45b161379967d7a52e23f39ac30dd90edb0
Reviewed-on: https://go-review.googlesource.com/104415
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Allow the compiler to generate code like CMPQ 16(AX), $7
It's tricky because it's difficult to spill such a comparison during
flagalloc, because the same memory state might not be available at
the restore locations.
Solve this problem by decomposing the compare+load back into its parts
if it needs to be spilled.
The big win is that the write barrier test goes from:
MOVL runtime.writeBarrier(SB), CX
TESTL CX, CX
JNE 60
to
CMPL runtime.writeBarrier(SB), $0
JNE 59
It's one instruction and one byte smaller.
Fixes #19485
Fixes #15245
Update #22460
Binaries are about 0.15% smaller.
Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836
Reviewed-on: https://go-review.googlesource.com/86035
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
|
|
Enhance the one-live-memory-at-a-time check to run during many
more phases of the SSA backend. Also make it work in an interblock
fashion.
Change types.IsMemory to return true for tuples containing a memory type.
Fix trim pass to build the merged phi correctly. Doesn't affect
code but allows the check to pass after trim runs.
Switch the AddTuple* ops to take the memory-containing tuple argument second.
Update #20335
Change-Id: I5b03ef3606b75a9e4f765276bb8b183cdc172b43
Reviewed-on: https://go-review.googlesource.com/43495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Tuple ops are weird. They are essentially a pair of ops,
one which consumes a mem and one which generates a mem (the Select1).
The schedule pass didn't handle these quite right.
Fix the scheduler to include both parts of the paired op in
the store chain. That makes sure that loads are correctly ordered
with respect to the first of the pair.
Add a check for the ssacheck builder, that there is only one
live store at a time. I thought we already had such a check, but
apparently not...
Fixes #20335
Change-Id: I59eb3446a329100af38d22820b1ca2190ca46a78
Reviewed-on: https://go-review.googlesource.com/43294
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This change adds a method to replace expressions
of the form
v.Args[len(v.Args)-1]
so that the code's intention to walk memory arguments
is explicit.
Passes toolstash-check.
Change-Id: I0c80d73bc00989dd3cdf72b4f2c8e1075a2515e0
Reviewed-on: https://go-review.googlesource.com/37757
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
storeOrder visits values in DFS order. It should "break" after
pushing one argument to stack, instead of "continue".
Fixes #19179.
Change-Id: I561afb44213df40ebf8bf7d28e0fd00f22a81ac0
Reviewed-on: https://go-review.googlesource.com/37250
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
SSA's writebarrier pass requires WB store ops are always at the
end of a block. If we move write barrier insertion into SSA and
emits normal Store ops when building SSA, this requirement becomes
impractical -- it will create too many blocks for all the Store
ops.
Redo SSA's writebarrier pass, explicitly order values in store
order, so it no longer needs this requirement.
Updates #17583.
Fixes #19067.
Change-Id: I66e817e526affb7e13517d4245905300a90b7170
Reviewed-on: https://go-review.googlesource.com/36834
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This is a mostly mechanical rename followed by manual fixes where necessary.
Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
|
|
Adjust cmd/compile accordingly.
This will make it easier to replace the underlying implementation.
Change-Id: I33645850bb18c839b24785b6222a9e028617addb
Reviewed-on: https://go-review.googlesource.com/34133
Reviewed-by: David Lazar <lazard@golang.org>
|
|
Change-Id: I632d4aef7295778ba5018d98bcb06a68bcf07ce1
Reviewed-on: https://go-review.googlesource.com/31478
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Before this change a nil check on s390x could be scheduled after the
target pointer has been dereferenced.
Change-Id: I7ea40a4b52f975739f6db183a2794be4981c4e3d
Reviewed-on: https://go-review.googlesource.com/29730
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Get rid of BlockCheck. Josh goaded me into it, and I went
down a rabbithole making it happen.
NilCheck now panics if the pointer is nil and returns void, as before.
BlockCheck is gone, and NilCheck is no longer a Control value for
any block. It just exists (and deadcode knows not to throw it away).
I rewrote the nilcheckelim pass to handle this case. In particular,
there can now be multiple NilCheck ops per block.
I moved all of the arch-dependent nil check elimination done as
part of ssaGenValue into its own proper pass, so we don't have to
duplicate that code for every architecture.
Making the arch-dependent nil check its own pass means I needed
to add a bunch of flags to the opcode table so I could write
the code without arch-dependent ops everywhere.
Change-Id: I419f891ac9b0de313033ff09115c374163416a9f
Reviewed-on: https://go-review.googlesource.com/29120
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.
Fixes #16677.
Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Intrinsified atomic op produces <value,memory>. Make sure this
memory is considered in the store chain calculation.
Fixes #16948.
Change-Id: I029f164b123a7e830214297f8373f06ea0bf1e26
Reviewed-on: https://go-review.googlesource.com/28350
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparison.
- fix floating point rounding.
- handle RetJmp in assembler (which was not handled, as a consequence
Duff's device was disabled in the old backend.)
all.bash now passes with SSA on.
Updates #16359.
Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f
Reviewed-on: https://go-review.googlesource.com/27592
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Support the following:
- Shifts. ARM64 machine instructions only use lowest 6 bits of the
shift (i.e. mod 64). Use conditional selection instruction to
ensure Go semantics.
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- reserve R18 (platform register in ARM64 ABI) and R29 (frame pointer
in ARM64 ABI).
Everything compiles, all.bash passed (with non-SSA test disabled).
Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1
Reviewed-on: https://go-review.googlesource.com/25290
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|