path: root/src/cmd/compile/internal/ssa/schedule.go
Age | Commit message | Author
2023-02-16 | cmd/compile: ensure InitMem comes after Args | Keith Randall
The debug info generation currently depends on this invariant. A small update to CL 468455. Update #58482 Change-Id: Ica305d360d9af04036c604b6a65b683f7cb6e212 Reviewed-on: https://go-review.googlesource.com/c/go/+/468695 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com>
2023-02-15 | cmd/compile: schedule SP earlier | Keith Randall
The actual scheduling of SP early doesn't really matter, but lots of early spills (of arguments) depend on SP so they can't schedule until SP does. Fixes #58482 Change-Id: Ie581fba7cb173d665c11f797f39d824b1c040a2d Reviewed-on: https://go-review.googlesource.com/c/go/+/468455 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>
2023-02-01 | cmd/compile: schedule values with no in-block uses later | Keith Randall
When scheduling a block, deprioritize values whose results aren't used until subsequent blocks. For #58166, this has the effect of pushing the induction variable increment to the end of the block, past all the other uses of the pre-incremented value. Do this only with optimizations on. Debuggers have a preference for values in source code order, which this CL can degrade. Fixes #58166 Fixes #57976 Change-Id: I40d5885c661b142443c6d4702294c8abe8026c4f Reviewed-on: https://go-review.googlesource.com/c/go/+/463751 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-21 | test: test that we schedule OpArgIntReg early | Keith Randall
If OpArgIntReg is incorrectly scheduled, that causes it to be spilled incorrectly, which causes the argument to not be considered live at the start of the function. This is the test for CL 462858. Add a brief mention of why CL 462858 is needed in the scheduling code. Change-Id: Id199456f88d9ee5ca46d7b0353a3c2049709880e Reviewed-on: https://go-review.googlesource.com/c/go/+/462899 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Keith Randall <khr@golang.org>
2023-01-20 | cmd/compile: ensure register args come before on-stack args in schedule | Keith Randall
The register allocator doesn't like OpArg coming in between other OpArgIntReg operations, as it doesn't put the spills in the right place in that situation. This is just a bug in the new scheduler: I didn't copy over the proper score from the old scheduler correctly. Change-Id: I3b4ee1754982fb360e99c5864b19e7408d60b5bc Reviewed-on: https://go-review.googlesource.com/c/go/+/462858 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-20 | cmd/compile: improve scheduling pass | Keith Randall
Convert the scheduling pass from scheduling backwards to scheduling forwards. Forward scheduling makes it easier to prioritize scheduling values as soon as they are ready, which is important for things like nil checks, select ops, etc. Forward scheduling is also quite a bit clearer. It was originally backwards because computing uses is tricky, but I found a way to do it simply and with n lg n complexity. The new scheme also makes it easy to add new scheduling edges if needed. Fixes #42673 Update #56568 Change-Id: Ibbb38c52d191f50ce7a94f8c1cbd3cd9b614ea8b Reviewed-on: https://go-review.googlesource.com/c/go/+/270940 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>
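The forward approach described in this commit can be illustrated with a small, self-contained sketch. It is not the compiler's code: the `value` type, its `score` field, the `ready` heap, and the `schedule` and `describe` functions are hypothetical names chosen for this example. A value becomes ready once all of its in-block arguments have been emitted, and a heap picks the ready value with the lowest score, which keeps the whole pass at roughly n lg n.

```go
package main

import (
	"container/heap"
	"fmt"
)

// value is a hypothetical stand-in for an SSA value inside one block.
type value struct {
	id    int
	score int   // lower scores are scheduled first
	args  []int // ids of same-block values this value depends on
}

// ready is a min-heap of values whose arguments have all been scheduled.
type ready []*value

func (r ready) Len() int      { return len(r) }
func (r ready) Swap(i, j int) { r[i], r[j] = r[j], r[i] }
func (r ready) Less(i, j int) bool {
	if r[i].score != r[j].score {
		return r[i].score < r[j].score
	}
	return r[i].id < r[j].id // deterministic tie-break
}
func (r *ready) Push(x any) { *r = append(*r, x.(*value)) }
func (r *ready) Pop() any {
	old := *r
	v := old[len(old)-1]
	*r = old[:len(old)-1]
	return v
}

// schedule emits value ids front to back, always choosing the
// lowest-scored value among those that are ready.
func schedule(vals []*value) []int {
	byID := make(map[int]*value, len(vals))
	waiting := make(map[int]int, len(vals)) // id -> number of unscheduled args
	users := make(map[int][]int)            // id -> ids of values that use it
	for _, v := range vals {
		byID[v.id] = v
		waiting[v.id] = len(v.args)
		for _, a := range v.args {
			users[a] = append(users[a], v.id)
		}
	}
	r := &ready{}
	for _, v := range vals {
		if waiting[v.id] == 0 {
			heap.Push(r, v)
		}
	}
	order := make([]int, 0, len(vals))
	for r.Len() > 0 {
		v := heap.Pop(r).(*value)
		order = append(order, v.id)
		for _, u := range users[v.id] {
			if waiting[u]--; waiting[u] == 0 {
				heap.Push(r, byID[u])
			}
		}
	}
	return order
}

// describe formats a schedule for printing.
func describe(order []int) string { return fmt.Sprint(order) }

func main() {
	vals := []*value{
		{id: 1, score: 5},
		{id: 2, score: 0}, // e.g. something that should be issued early
		{id: 3, score: 1, args: []int{1}},
		{id: 4, score: 9, args: []int{2, 3}},
	}
	fmt.Println(describe(schedule(vals)))
}
```

In this sketch, adding a new scheduling edge amounts to appending one more id to a value's `args`, which is the kind of flexibility the commit message alludes to.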
2023-01-19 | cmd/compile: add anchored version of SP | Keith Randall
The SPanchored opcode is identical to SP, except that it takes a memory argument so that it (and more importantly, anything that uses it) must be scheduled at or after that memory argument. This opcode ensures that a LEAQ of a variable gets scheduled after the corresponding VARDEF for that variable. This may lead to less CSE of LEAQ operations. The effect is very small. The go binary is only 80 bytes bigger after this CL. Usually LEAQs get folded into load/store operations, so the effect is only for pointerful types, large enough to need a duffzero, and have their address passed somewhere. Even then, usually the CSEd LEAQs will be un-CSEd because the two uses are on different sides of a function call and the LEAQ ends up being rematerialized at the second use anyway. Change-Id: Ib893562cd05369b91dd563b48fb83f5250950293 Reviewed-on: https://go-review.googlesource.com/c/go/+/452916 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Martin Möhrmann <martin@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2022-11-03 | cmd/compile/internal/ssa: re-adjust CarryChainTail scheduling priority | Paul E. Murphy
This needs to be as low as possible while not breaking priority assumptions of other scores to correctly schedule carry chains. Prior to the arm64 changes, it was set below ReadTuple. At the time, this prevented the MulHiLo implementation on PPC64 from occluding the scheduling of a full carry chain. Memory scores can also prevent better scheduling, as can be observed with crypto/internal/edwards25519/field.feMulGeneric. Fixes #56497 Change-Id: Ia4b54e6dffcce584faf46b1b8d7cea18a3913887 Reviewed-on: https://go-review.googlesource.com/c/go/+/447435 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2022-10-31 | cmd/compile: add cache of sizeable objects so they can be reused | Keith Randall
We kind of have this mechanism already, just normalizing it and using it in a bunch of places. Previously a bunch of places cached slices only for the duration of a single function compilation. Now we can reuse slices across a whole compiler run. Use a sync.Pool of powers-of-two sizes. This lets us use not too much memory, and avoid holding onto memory we're no longer using when a GC happens. There are a few different types we need, so generate the code for it. Generics would be useful here, but we can't use generics in the compiler because of bootstrapping. Change-Id: I6cf37e7b7b2e802882aaa723a0b29770511ccd82 Reviewed-on: https://go-review.googlesource.com/c/go/+/444820 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
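As a rough illustration of "a sync.Pool of powers-of-two sizes", the sketch below keeps one pool per size class for a single element type. The names `pools`, `sizeClass`, `allocInt64s`, `freeInt64s` and `describe` are assumptions made for this example and are not the identifiers generated in the compiler; the real code is generated for several element types.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

// pools[c] caches *[]int64 values whose capacity is exactly 1<<c.
// (Hypothetical layout for this sketch.)
var pools [bits.UintSize]sync.Pool

// sizeClass returns the smallest c such that 1<<c >= n.
func sizeClass(n int) int {
	if n <= 1 {
		return 0
	}
	return bits.Len(uint(n - 1))
}

// allocInt64s returns a zeroed slice of length n whose capacity is the
// next power of two, reusing a pooled slice when one is available.
func allocInt64s(n int) []int64 {
	c := sizeClass(n)
	if p, ok := pools[c].Get().(*[]int64); ok {
		s := (*p)[:n]
		for i := range s {
			s[i] = 0 // pooled memory may hold stale data
		}
		return s
	}
	capacity := 1 << c
	return make([]int64, n, capacity)
}

// freeInt64s hands a slice back for reuse. The caller must not touch
// the slice afterwards.
func freeInt64s(s []int64) {
	c := sizeClass(cap(s))
	if cap(s) != 1<<c {
		return // capacity is not a power of two, so it did not come from here
	}
	s = s[:cap(s)]
	pools[c].Put(&s)
}

// describe reports the length and capacity of a slice.
func describe(s []int64) string {
	return fmt.Sprintf("len=%d cap=%d", len(s), cap(s))
}

func main() {
	a := allocInt64s(5)
	fmt.Println(describe(a)) // len=5 cap=8
	freeInt64s(a)
	b := allocInt64s(7) // may or may not reuse a's backing array
	fmt.Println(describe(b)) // len=7 cap=8
}
```

Because the pool is a sync.Pool, cached slices can be dropped at any garbage collection, which matches the commit's point about not holding onto memory that is no longer in use.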
2022-10-08 | cmd/compile: enable carry chain scheduling for arm64 | eric fang
This is a follow up of CL 393656 on arm64.
This CL puts ScoreCarryChainTail before ScoreMemory and after ScoreReadFlags, so that the scheduling of the carry chain will not break the scheduling of ScoreVarDef.
Benchmarks:
name                                  old time/op  new time/op  delta
ScalarMult/P256-8                     42.0µs ± 0%  42.0µs ± 0%  -0.13%   (p=0.032 n=5+5)
ScalarMult/P224-8                     135µs ± 0%   96µs ± 0%    -29.04%  (p=0.008 n=5+5)
ScalarMult/P384-8                     573µs ± 1%   355µs ± 0%   -38.05%  (p=0.008 n=5+5)
ScalarMult/P521-8                     1.50ms ± 4%  0.77ms ± 0%  -48.78%  (p=0.008 n=5+5)
MarshalUnmarshal/P256/Uncompressed-8  505ns ± 1%   506ns ± 0%   ~        (p=0.460 n=5+5)
MarshalUnmarshal/P256/Compressed-8    6.75µs ± 0%  6.73µs ± 0%  -0.27%   (p=0.016 n=5+5)
MarshalUnmarshal/P224/Uncompressed-8  927ns ± 0%   818ns ± 0%   -11.76%  (p=0.008 n=5+5)
MarshalUnmarshal/P224/Compressed-8    136µs ± 0%   96µs ± 0%    -29.58%  (p=0.008 n=5+5)
MarshalUnmarshal/P384/Uncompressed-8  1.77µs ± 0%  1.36µs ± 1%  -23.14%  (p=0.008 n=5+5)
MarshalUnmarshal/P384/Compressed-8    56.5µs ± 0%  31.9µs ± 0%  -43.59%  (p=0.016 n=5+4)
MarshalUnmarshal/P521/Uncompressed-8  2.91µs ± 0%  2.03µs ± 1%  -30.32%  (p=0.008 n=5+5)
MarshalUnmarshal/P521/Compressed-8    148µs ± 0%   68µs ± 1%    -54.28%  (p=0.008 n=5+5)
Change-Id: I4bf4e3265d7e1ee85765ff2bf006ca5a794d4979
Reviewed-on: https://go-review.googlesource.com/c/go/+/432275
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
2022-09-20 | Revert "cmd/compile: enable carry chain scheduling for arm64" | Keith Randall
This reverts commit 4c414c7673af6b2aedee276d2e62cb2910eb19f3. Reason for revert: breaks ppc64 build (see issue 55254) Change-Id: I096ffa0e6535d31d9dd4079b48bb201b20220d76 Reviewed-on: https://go-review.googlesource.com/c/go/+/432196 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com>
2022-09-20 | cmd/compile: enable carry chain scheduling for arm64 | eric fang
This is a follow up of CL 393656 on arm64.
Benchmarks:
name                                  old time/op  new time/op  delta
ScalarMult/P256-8                     42.0µs ± 0%  42.0µs ± 0%  -0.13%   (p=0.032 n=5+5)
ScalarMult/P224-8                     135µs ± 0%   96µs ± 0%    -29.04%  (p=0.008 n=5+5)
ScalarMult/P384-8                     573µs ± 1%   355µs ± 0%   -38.05%  (p=0.008 n=5+5)
ScalarMult/P521-8                     1.50ms ± 4%  0.77ms ± 0%  -48.78%  (p=0.008 n=5+5)
MarshalUnmarshal/P256/Uncompressed-8  505ns ± 1%   506ns ± 0%   ~        (p=0.460 n=5+5)
MarshalUnmarshal/P256/Compressed-8    6.75µs ± 0%  6.73µs ± 0%  -0.27%   (p=0.016 n=5+5)
MarshalUnmarshal/P224/Uncompressed-8  927ns ± 0%   818ns ± 0%   -11.76%  (p=0.008 n=5+5)
MarshalUnmarshal/P224/Compressed-8    136µs ± 0%   96µs ± 0%    -29.58%  (p=0.008 n=5+5)
MarshalUnmarshal/P384/Uncompressed-8  1.77µs ± 0%  1.36µs ± 1%  -23.14%  (p=0.008 n=5+5)
MarshalUnmarshal/P384/Compressed-8    56.5µs ± 0%  31.9µs ± 0%  -43.59%  (p=0.016 n=5+4)
MarshalUnmarshal/P521/Uncompressed-8  2.91µs ± 0%  2.03µs ± 1%  -30.32%  (p=0.008 n=5+5)
MarshalUnmarshal/P521/Compressed-8    148µs ± 0%   68µs ± 1%    -54.28%  (p=0.008 n=5+5)
Change-Id: I33170360eb8279b998e3c559f7136717fe32e07d
Reviewed-on: https://go-review.googlesource.com/c/go/+/424907
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-08-26 | cmd/compile: fix score for Select{0,1} with type flags | Lynn Boger
A recent change was made for ppc64x to treat ANDCCconst as a tuple, allowing ANDconst to be removed from the list of ops. Included in that change were some improvements to the rules to avoid some extra code, mainly the elimination of a cmp 0 following an andi. and in some cases the following isel. While those changes worked for most cases, in a few cases some extra unnecessary code was generated.
Currently the snippet appears in archive/zip.(*FileHeader).Mode:
ANDCC R4,$1,R5       // andi. r5,r4,1
ANDCC R4,$16,R5      // andi. r5,r4,16
CMPW R5,R0           // cmpw r5,r0
ADDIS $0,$-32768,R5  // lis r5,-32768
OR R5,$511,R5        // ori r5,r5,511
MOVD $438,R6         // li r6,438
ISEL $2,R6,R5,R5     // isel r5,r6,r5,eq
MOVD $-147,R6        // li r6,-147
AND R6,R5,R6         // and r6,r5,r6
ANDCC R4,$1,R4       // andi. r4,r4,1
ISEL $2,R5,R6,R4     // isel r4,r5,r6,eq
The first ANDCC is never used and should not be there. From the ssa.html file, the scheduler is not putting the Select1 close to the ISEL, which results in the flag being clobbered before it can be used. By changing the score for a Select0 or Select1 with type Flags, the extra ANDCC does not occur.
Change-Id: I82f4bc7c02afb1c2b1c048dc6995e0b3f9363fb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/424294
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
2022-08-18 | all: remove duplicate "the" words in comments | Abirdcfly
Following CL 424454, using command rg --multiline " the\s{1,}the " * rg --multiline " the\s{1,}//\s{1,}the " * all the words "the" that are repeated in comments are found. Change-Id: I60b769b98f04c927b4c228e10f37faf190964069 Reviewed-on: https://go-review.googlesource.com/c/go/+/423836 Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Filippo Valsorda <filippo@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org>
2022-05-12 | cmd/compile/internal/ssa: add support on loong64 for schedule phase | Xiaodong Liu
Contributors to the loong64 port are: Weining Lu <luweining@loongson.cn> Lei Wang <wanglei@loongson.cn> Lingqin Gong <gonglingqin@loongson.cn> Xiaolin Zhao <zhaoxiaolin@loongson.cn> Meidan Li <limeidan@loongson.cn> Xiaojuan Zhai <zhaixiaojuan@loongson.cn> Qiyuan Pu <puqiyuan@loongson.cn> Guoqi Chen <chenguoqi@loongson.cn> This port has been updated to Go 1.15.6: https://github.com/loongson/go Updates #46229 Change-Id: Id533912c62d8c4e2aa3c124561772b543d685d7d Reviewed-on: https://go-review.googlesource.com/c/go/+/367041 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-05-08 | cmd/compile: schedule carry chain arithmetic disjointly | Paul E. Murphy
This results in a 1.7-2.4x improvement in native go crypto/elliptic multiplication operations on PPC64, and similar improvements might be possible on other architectures which use flags or similar to represent the carry bit in SSA form.
If it is possible, schedule carry chains independently of each other to avoid clobbering the carry flag. This is very expensive.
This is done by:
1. Identifying carry bit using, but not creating ops, and lowering their priority below all other ops which do not need to be placed at the top of a block. This effectively ensures only one carry chain will be placed at a time in most important cases (crypto/elliptic/internal/fiat contains most of them).
2. Raising the priority of carry bit generating ops to schedule later in a block to ensure they are placed as soon as they are ready.
Likewise, tuple ops which separate carrying ops are scored similar to 2 above. This prevents unrelated ops from being scheduled between carry-dependent operations. This occurs when unrelated ops are ready to schedule alongside such tuple ops. This reduces the chances a flag clobbering op might be placed between two carry-dependent operations.
With PPC64 Add64/Sub64 lowering into SSA and this patch, the net performance difference in crypto/elliptic benchmarks on P9/ppc64le are:
name                                old time/op  new time/op  delta
ScalarBaseMult/P256                 46.3µs ± 0%  46.9µs ± 0%  +1.34%
ScalarBaseMult/P224                 356µs ± 0%   209µs ± 0%   -41.14%
ScalarBaseMult/P384                 1.20ms ± 0%  0.57ms ± 0%  -52.14%
ScalarBaseMult/P521                 3.38ms ± 0%  1.44ms ± 0%  -57.27%
ScalarMult/P256                     199µs ± 0%   199µs ± 0%   -0.17%
ScalarMult/P224                     357µs ± 0%   212µs ± 0%   -40.56%
ScalarMult/P384                     1.20ms ± 0%  0.58ms ± 0%  -51.86%
ScalarMult/P521                     3.37ms ± 0%  1.44ms ± 0%  -57.32%
MarshalUnmarshal/P256/Uncompressed  2.59µs ± 0%  2.52µs ± 0%  -2.63%
MarshalUnmarshal/P256/Compressed    2.58µs ± 0%  2.52µs ± 0%  -2.06%
MarshalUnmarshal/P224/Uncompressed  1.54µs ± 0%  1.40µs ± 0%  -9.42%
MarshalUnmarshal/P224/Compressed    1.54µs ± 0%  1.39µs ± 0%  -9.87%
MarshalUnmarshal/P384/Uncompressed  2.40µs ± 0%  1.80µs ± 0%  -24.93%
MarshalUnmarshal/P384/Compressed    2.35µs ± 0%  1.81µs ± 0%  -23.03%
MarshalUnmarshal/P521/Uncompressed  3.79µs ± 0%  2.58µs ± 0%  -31.81%
MarshalUnmarshal/P521/Compressed    3.80µs ± 0%  2.60µs ± 0%  -31.67%
Note, P256 uses an asm implementation, thus, little variation is expected.
Updates #40171
Change-Id: I810850e8ff429505424c92d6fe37f99aaa0c6e84
Reviewed-on: https://go-review.googlesource.com/c/go/+/393656
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
2022-04-11 | all: gofmt main repo | Russ Cox
[This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-05-26 | [dev.typeparams] cmd/compile: do not schedule in-register args late, even for block control | Cherry Mui
In the scheduler we have the logic that if a Value is used as the block's control, we schedule it at the end, except for Phis and Args. Even though the comment says so, the code doesn't exclude in-register Args (OpArgXXXReg). Change to check for score instead, which includes OpArgXXXRegs. It also includes GetClosurePtr, which must be scheduled early. We just happen to never use it as block control. Found when working on ARM64 register ABI. In theory this could apply to AMD64 as well. But on AMD64 we never use in-register Value as block control, as conditional branch is always based on FLAGS, never based on registers, so it doesn't actually cause any problem. Change-Id: I167a550309772639574f7468caf91bd805eb74c6 Reviewed-on: https://go-review.googlesource.com/c/go/+/322849 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: David Chase <drchase@google.com>
2021-03-31 | cmd/compile: schedule in-register OpArg first | Cherry Zhang
OpArgXXXReg values must be scheduled at the very top, as their registers need to be live at the beginning before any other use of the register. Change-Id: Ic76768bb74da402adbe61db3b2d174ecd3f9fffc Reviewed-on: https://go-review.googlesource.com/c/go/+/306329 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>
2021-03-03 | cmd/compile: retrieve Args from registers | David Chase
In progress; doesn't fully work until they are also passed in registers on the caller side. For #40724. Change-Id: I29a6680e60bdbe9d132782530214f2a2b51fb8f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/293394 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-26 | cmd/compile: change StaticCall to return a "Results" | David Chase
StaticLECall (multiple value in +mem, multiple value result +mem) -> StaticCall (multiple register value in +mem, multiple register-sized-value result +mem) -> ARCH CallStatic (multiple register value in +mem, multiple register-sized-value result +mem) But the architecture-dependent stuff is indifferent to whether it is mem->mem or (mem)->(mem) until Prog generation. Deal with OpSelectN -> Prog in ssagen/ssa.go, others, as they appear. For #40724. Change-Id: I1d0436f6371054f1881862641d8e7e418e4a6a16 Reviewed-on: https://go-review.googlesource.com/c/go/+/293391 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-24 | cmd/compile: add more non-ID comparisons to schedule | Josh Bleecher Snyder
These comparisons are fairly arbitrary, but they should be more stable in the face of other compiler changes than value ID. This reduces the number of value ID comparisons in schedule while running make.bash from 542,442 to 99,703. There are lots of changes to generated code from this change, but they appear to be overall neutral. It is possible to further reduce the number of comparisons in schedule; I have changes locally that reduce the number to about 25,000 during make.bash. However, the changes are increasingly complex and arcane, and result in much less code churn. Given that the goal is stability, that suggests that this is a reasonable place to stop, at least for now. Change-Id: Ie3a75f84fd3f3fdb102fcd0b29299950ea66b827 Reviewed-on: https://go-review.googlesource.com/c/go/+/229799 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 | cmd/compile: add Value.Uses comparison during scheduling | Josh Bleecher Snyder
Falling back to comparing Value.ID during scheduling is undesirable: Not only are we simply hoping for a good outcome, but the decision we make will be easily perturbed by other compiler changes, leading to random fluctuations.
This change adds another decision point to the scheduler by scheduling Values with many uses earlier. Values with fewer uses are less likely to be spilled for other reasons, so we should issue them as late as possible in the hope of avoiding a spill.
This reduces the number of Value ID comparisons in schedule while running make.bash from 1,000,844 to 542,442.
As you would expect, this changes a lot of functions, but the overall trend is positive:
file     before     after      Δ       %
api      5237184    5233088    -4096   -0.078%
compile  19926480   19918288   -8192   -0.041%
cover    5281816    5277720    -4096   -0.078%
dist     3711608    3707512    -4096   -0.110%
total    113588440  113567960  -20480  -0.018%
Change-Id: Ic99ebc4c614d4ae3807ce44473ec6b04684388ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/229798
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
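The idea of adding a decision point before falling back to the value ID can be sketched as a comparison function with ordered tie-breaks. This is only an illustration under assumed names (`val`, `before`, `order`); the fields and their priority are simplified and do not mirror the scheduler's actual comparison.

```go
package main

import (
	"fmt"
	"sort"
)

// val is a hypothetical, simplified view of a value to be scheduled.
type val struct {
	id    int
	score int // primary scheduling priority, lower first
	uses  int // number of uses of this value
}

// before reports whether a should be issued ahead of b. Values with
// more uses go earlier; the id is consulted only as a last resort.
func before(a, b val) bool {
	if a.score != b.score {
		return a.score < b.score
	}
	if a.uses != b.uses {
		return a.uses > b.uses
	}
	return a.id < b.id
}

// order returns the ids of vs in issue order, formatted for printing.
func order(vs []val) string {
	sorted := append([]val(nil), vs...)
	sort.Slice(sorted, func(i, j int) bool { return before(sorted[i], sorted[j]) })
	ids := make([]int, len(sorted))
	for i, v := range sorted {
		ids[i] = v.id
	}
	return fmt.Sprint(ids)
}

func main() {
	fmt.Println(order([]val{
		{id: 1, score: 2, uses: 1},
		{id: 2, score: 2, uses: 3},
		{id: 3, score: 1, uses: 0},
		{id: 4, score: 2, uses: 3},
	}))
}
```

Only values 2 and 4 reach the id comparison here, since they tie on both score and use count; every earlier decision is independent of how ids happened to be assigned.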
2020-01-18 | cmd/compile: implement compiler for riscv64 | Joel Sing
Based on riscv-go port. Updates #27532 Change-Id: Ia329daa243db63ff334053b8807ea96b97ce3acf Reviewed-on: https://go-review.googlesource.com/c/go/+/204631 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-10-02 | cmd/compile: allow multiple SSA block control values | Michael Munday
Control values are used to choose which successor of a block is jumped to. Typically a control value takes the form of a 'flags' value that represents the result of a comparison. Some architectures however use a variable in a register as a control value.
Up until now we have managed with a single control value per block. However some architectures (e.g. s390x and riscv64) have combined compare-and-branch instructions that take two variables in registers as parameters. To generate these instructions we need to support 2 control values per block.
This CL allows up to 2 control values to be used in a block in order to support the addition of compare-and-branch instructions. I have implemented s390x compare-and-branch instructions in a different CL.
Passes toolstash-check -all.
Results of compilebench:
name                      old time/op  new time/op  delta
Template                  208ms ± 1%   209ms ± 1%   ~        (p=0.289 n=20+20)
Unicode                   83.7ms ± 1%  83.3ms ± 3%  -0.49%   (p=0.017 n=18+18)
GoTypes                   748ms ± 1%   748ms ± 0%   ~        (p=0.460 n=20+18)
Compiler                  3.47s ± 1%   3.48s ± 1%   ~        (p=0.070 n=19+18)
SSA                       11.5s ± 1%   11.7s ± 1%   +1.64%   (p=0.000 n=19+18)
Flate                     130ms ± 1%   130ms ± 1%   ~        (p=0.588 n=19+20)
GoParser                  160ms ± 1%   161ms ± 1%   ~        (p=0.211 n=20+20)
Reflect                   465ms ± 1%   467ms ± 1%   +0.42%   (p=0.007 n=20+20)
Tar                       184ms ± 1%   185ms ± 2%   ~        (p=0.087 n=18+20)
XML                       253ms ± 1%   253ms ± 1%   ~        (p=0.377 n=20+18)
LinkCompiler              769ms ± 2%   774ms ± 2%   ~        (p=0.070 n=19+19)
ExternalLinkCompiler      3.59s ±11%   3.68s ± 6%   ~        (p=0.072 n=20+20)
LinkWithoutDebugCompiler  446ms ± 5%   454ms ± 3%   +1.79%   (p=0.002 n=19+20)
StdCmd                    26.0s ± 2%   26.0s ± 2%   ~        (p=0.799 n=20+20)
name                      old user-time/op  new user-time/op  delta
Template                  238ms ± 5%   240ms ± 5%   ~        (p=0.142 n=20+20)
Unicode                   105ms ±11%   106ms ±10%   ~        (p=0.512 n=20+20)
GoTypes                   876ms ± 2%   873ms ± 4%   ~        (p=0.647 n=20+19)
Compiler                  4.17s ± 2%   4.19s ± 1%   ~        (p=0.093 n=20+18)
SSA                       13.9s ± 1%   14.1s ± 1%   +1.45%   (p=0.000 n=18+18)
Flate                     145ms ±13%   146ms ± 5%   ~        (p=0.851 n=20+18)
GoParser                  185ms ± 5%   188ms ± 7%   ~        (p=0.174 n=20+20)
Reflect                   534ms ± 3%   538ms ± 2%   ~        (p=0.105 n=20+18)
Tar                       215ms ± 4%   211ms ± 9%   ~        (p=0.079 n=19+20)
XML                       295ms ± 6%   295ms ± 5%   ~        (p=0.968 n=20+20)
LinkCompiler              832ms ± 4%   837ms ± 7%   ~        (p=0.707 n=17+20)
ExternalLinkCompiler      1.58s ± 8%   1.60s ± 4%   ~        (p=0.296 n=20+19)
LinkWithoutDebugCompiler  478ms ±12%   489ms ±10%   ~        (p=0.429 n=20+20)
name                      old object-bytes  new object-bytes  delta
Template                  559kB ± 0%   559kB ± 0%   ~        (all equal)
Unicode                   216kB ± 0%   216kB ± 0%   ~        (all equal)
GoTypes                   2.03MB ± 0%  2.03MB ± 0%  ~        (all equal)
Compiler                  8.07MB ± 0%  8.07MB ± 0%  -0.06%   (p=0.000 n=20+20)
SSA                       27.1MB ± 0%  27.3MB ± 0%  +0.89%   (p=0.000 n=20+20)
Flate                     343kB ± 0%   343kB ± 0%   ~        (all equal)
GoParser                  441kB ± 0%   441kB ± 0%   ~        (all equal)
Reflect                   1.36MB ± 0%  1.36MB ± 0%  ~        (all equal)
Tar                       487kB ± 0%   487kB ± 0%   ~        (all equal)
XML                       632kB ± 0%   632kB ± 0%   ~        (all equal)
name                      old export-bytes  new export-bytes  delta
Template                  18.5kB ± 0%  18.5kB ± 0%  ~        (all equal)
Unicode                   7.92kB ± 0%  7.92kB ± 0%  ~        (all equal)
GoTypes                   35.0kB ± 0%  35.0kB ± 0%  ~        (all equal)
Compiler                  109kB ± 0%   110kB ± 0%   +0.72%   (p=0.000 n=20+20)
SSA                       137kB ± 0%   138kB ± 0%   +0.58%   (p=0.000 n=20+20)
Flate                     4.89kB ± 0%  4.89kB ± 0%  ~        (all equal)
GoParser                  8.49kB ± 0%  8.49kB ± 0%  ~        (all equal)
Reflect                   11.4kB ± 0%  11.4kB ± 0%  ~        (all equal)
Tar                       10.5kB ± 0%  10.5kB ± 0%  ~        (all equal)
XML                       16.7kB ± 0%  16.7kB ± 0%  ~        (all equal)
name                      old text-bytes  new text-bytes  delta
HelloSize                 761kB ± 0%   761kB ± 0%   ~        (all equal)
CmdGoSize                 10.8MB ± 0%  10.8MB ± 0%  ~        (all equal)
name                      old data-bytes  new data-bytes  delta
HelloSize                 10.7kB ± 0%  10.7kB ± 0%  ~        (all equal)
CmdGoSize                 312kB ± 0%   312kB ± 0%   ~        (all equal)
name                      old bss-bytes  new bss-bytes  delta
HelloSize                 122kB ± 0%   122kB ± 0%   ~        (all equal)
CmdGoSize                 146kB ± 0%   146kB ± 0%   ~        (all equal)
name                      old exe-bytes  new exe-bytes  delta
HelloSize                 1.13MB ± 0%  1.13MB ± 0%  ~        (all equal)
CmdGoSize                 15.1MB ± 0%  15.1MB ± 0%  ~        (all equal)
Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f
Reviewed-on: https://go-review.googlesource.com/c/go/+/196557
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
2019-03-08 | all: simplify multiple for loops | Daniel Martí
If a for loop has a simple condition and begins with a simple "if x { break; }"; we can simply add "!x" to the loop's condition. While at it, simplify a few assignments to use the common pattern "x := staticDefault; if cond { x = otherValue(); }". Finally, simplify a couple of var declarations. Change-Id: I413982c6abd32905adc85a9a666cb3819139c19f Reviewed-on: https://go-review.googlesource.com/c/go/+/165342 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
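The two rewrites described in this commit look roughly like the following. The functions `prefixLenBefore`, `prefixLenAfter` and `label` are made up for this illustration and are not taken from the changed files.

```go
package main

import "fmt"

// prefixLenBefore counts leading non-negative values using the
// original shape: a loop whose body starts with "if x { break }".
func prefixLenBefore(xs []int) int {
	n := 0
	for n < len(xs) {
		if xs[n] < 0 {
			break
		}
		n++
	}
	return n
}

// prefixLenAfter is the simplified form: the negated break condition
// is folded into the loop condition.
func prefixLenAfter(xs []int) int {
	n := 0
	for n < len(xs) && xs[n] >= 0 {
		n++
	}
	return n
}

// label uses the "x := staticDefault; if cond { x = otherValue() }"
// pattern instead of an if/else assignment.
func label(n int) string {
	s := "empty"
	if n > 0 {
		s = fmt.Sprintf("%d leading non-negative values", n)
	}
	return s
}

func main() {
	xs := []int{1, 2, -1, 3}
	fmt.Println(label(prefixLenBefore(xs)))
	fmt.Println(label(prefixLenAfter(xs)))
}
```

The rewrite is only safe when the break test has no side effects and is the first statement of the body, which is what the commit message means by "simple".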
2018-11-29 | cmd/compile: begin OpArg and OpPhi location lists at block start | David Chase
For the entry block, make the "first instruction" be truly the first instruction. This allows printing of incoming parameters with Delve. Also be sure Phis are marked as being at the start of their block. This is observed to move location list pointers, and where moved, they become correct. Leading zero-width instructions include LoweredGetClosurePtr. Because this instruction is actually architecture-specific, and it is now tested for in 3 different places, also created Op.isLoweredGetClosurePtr() to reduce future surprises. Change-Id: Ic043b7265835cf1790382a74334b5714ae4060af Reviewed-on: https://go-review.googlesource.com/c/145179 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-11-28 | cmd/compile: order nil checks by source position | Keith Randall
There's nothing enforcing ordering between redundant nil checks when they may occur during the same memory state. Commit to using the earliest one in source order. Otherwise the choice of which to remove depends on the ordering of values in a block (before scheduling). That's unfortunate when trying to ensure that the compiler doesn't depend on that ordering for anything. Update #20178 Change-Id: I2cdd5be10618accd9d91fa07406c90cbd023ffba Reviewed-on: https://go-review.googlesource.com/c/151517 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-25 | cmd/compile: intrinsify math/bits.Add on amd64 | Keith Randall
name             old time/op  new time/op  delta
Add-8            1.11ns ± 0%  1.18ns ± 0%  +6.31%   (p=0.029 n=4+4)
Add32-8          1.02ns ± 0%  1.02ns ± 1%  ~        (p=0.333 n=4+5)
Add64-8          1.11ns ± 1%  1.17ns ± 0%  +5.79%   (p=0.008 n=5+5)
Add64multiple-8  4.35ns ± 1%  0.86ns ± 0%  -80.22%  (p=0.000 n=5+4)
The individual ops are a bit slower (but still very fast). Using the ops in carry chains is very fast.
Update #28273
Change-Id: Id975f76df2b930abf0e412911d327b6c5b1befe5
Reviewed-on: https://go-review.googlesource.com/c/144257
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
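For readers unfamiliar with what a carry chain looks like in source, here is a small sketch using math/bits.Add64, where each call consumes the carry produced by the previous one. The `add256` and `describe` helpers are hypothetical names for this example and are not the benchmark code behind the numbers above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// add256 adds two 256-bit numbers stored as four little-endian 64-bit
// limbs. Each bits.Add64 call feeds its carry-out into the next call,
// forming a carry chain.
func add256(x, y [4]uint64) (z [4]uint64, carry uint64) {
	z[0], carry = bits.Add64(x[0], y[0], 0)
	z[1], carry = bits.Add64(x[1], y[1], carry)
	z[2], carry = bits.Add64(x[2], y[2], carry)
	z[3], carry = bits.Add64(x[3], y[3], carry)
	return z, carry
}

// describe formats a result for printing.
func describe(z [4]uint64, carry uint64) string {
	return fmt.Sprintf("%v carry=%d", z, carry)
}

func main() {
	const max = ^uint64(0)
	x := [4]uint64{max, max, 0, 0}
	y := [4]uint64{1, 0, 0, 0}
	fmt.Println(describe(add256(x, y))) // [0 0 1 0] carry=0
}
```

When such calls are intrinsified, the carry can stay in the processor's carry flag between additions, provided nothing that clobbers the flag is scheduled in between.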
2018-10-23 | cmd/compile: schedule OpArg earlier in blocks for better debugging | David Chase
The location list for OpArg starts where the OpArg appears; this is not necessarily as soon as the OpArg could be observed, and it is reasonable for a user to expect that if a breakpoint is set "on f" then the arguments to f will be observable where that breakpoint happens to be set (this may also require setting the breakpoint after the prologue, but that is another issue). Change-Id: I0a1b848e50f475e5d8a5fad781241126872a0400 Reviewed-on: https://go-review.googlesource.com/c/142819 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2018-06-29 | Revert "cmd/compile: make OpAddr depend on VarDef in storeOrder" | David Chase
This reverts commit 1a27f048ad25f151d2a17ce7f2d73d0d2dbe94cf. Reason for revert: Broke the ssacheck and -N-l builders, and the -N-l fix looks like it will take some time and take a different route entirely. Change-Id: Ie0ac5e86ab7d72a303dfbbc48dfdf1e092d4f61a Reviewed-on: https://go-review.googlesource.com/121715 Reviewed-by: David Chase <drchase@google.com>
2018-06-29 | cmd/compile: make OpAddr depend on VarDef in storeOrder | David Chase
Given a carefully constructed input, writebarrier would split a block with the OpAddr in the first half and the VarDef in the second half which ultimately leads to a compiler crash because the scheduler is no longer able to put them in the proper order. To fix, recognize the implicit dependence of OpAddr on the VarDef of the same symbol if any exists. This fix was chosen over making OpAddr take a memory operand to make the dependence explicit, because this change is less invasive at this late part of the 1.11 release cycle. Fixes #26105. Change-Id: I9b65460673af3af41740ef877d2fca91acd336bc Reviewed-on: https://go-review.googlesource.com/121436 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-05-04 | cmd/compile: add wasm architecture | Richard Musiol
This commit adds the wasm architecture to the compile command. A later commit will contain the corresponding linker changes. Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4 The following files are generated: - src/cmd/compile/internal/ssa/opGen.go - src/cmd/compile/internal/ssa/rewriteWasm.go - src/cmd/internal/obj/wasm/anames.go Updates #18892 Change-Id: Ifb4a96a3e427aac2362a1c97967d5667450fba3b Reviewed-on: https://go-review.googlesource.com/103295 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-04-04  cmd/compile: stack-allocate 2 worklists in order, dom passes  Alberto Donizetti
Allocate two more ssa local worklists on the stack. The initial sizes are chosen to cover >99% of the calls.

name      old time/op       new time/op       delta
Template  281ms ± 2%        283ms ± 5%        ~       (p=0.443 n=18+19)
Unicode   136ms ± 4%        135ms ± 7%        ~       (p=0.277 n=20+20)
GoTypes   886ms ± 2%        885ms ± 2%        ~       (p=0.862 n=20+20)
Compiler  4.03s ± 2%        4.02s ± 1%        ~       (p=0.270 n=19+20)
SSA       9.66s ± 1%        9.64s ± 2%        ~       (p=0.253 n=20+20)
Flate     186ms ± 5%        183ms ± 6%        ~       (p=0.174 n=20+20)
GoParser  222ms ± 4%        219ms ± 4%        ~       (p=0.081 n=20+20)
Reflect   569ms ± 2%        568ms ± 2%        ~       (p=0.686 n=19+19)
Tar       258ms ± 4%        256ms ± 3%        ~       (p=0.211 n=20+20)
XML       319ms ± 2%        317ms ± 3%        ~       (p=0.158 n=18+20)

name      old user-time/op  new user-time/op  delta
Template  396ms ± 6%        392ms ± 6%        ~       (p=0.211 n=20+20)
Unicode   212ms ±10%        211ms ± 9%        ~       (p=0.904 n=20+20)
GoTypes   1.21s ± 3%        1.21s ± 2%        ~       (p=0.183 n=20+20)
Compiler  5.60s ± 2%        5.62s ± 2%        ~       (p=0.355 n=18+18)
SSA       14.0s ± 6%        13.9s ± 5%        ~       (p=0.678 n=20+20)
Flate     250ms ± 8%        245ms ± 6%        ~       (p=0.166 n=19+20)
GoParser  305ms ± 6%        304ms ± 5%        ~       (p=0.659 n=20+20)
Reflect   760ms ± 3%        758ms ± 4%        ~       (p=0.758 n=20+20)
Tar       362ms ± 6%        357ms ± 5%        ~       (p=0.108 n=20+20)
XML       429ms ± 4%        429ms ± 4%        ~       (p=0.799 n=20+20)

name      old alloc/op      new alloc/op      delta
Template  39.0MB ± 0%       38.8MB ± 0%       -0.55%  (p=0.000 n=20+20)
Unicode   29.1MB ± 0%       29.1MB ± 0%       -0.06%  (p=0.000 n=20+20)
GoTypes   116MB ± 0%        115MB ± 0%        -0.50%  (p=0.000 n=20+20)
Compiler  493MB ± 0%        491MB ± 0%        -0.46%  (p=0.000 n=19+20)
SSA       1.40GB ± 0%       1.40GB ± 0%       -0.31%  (p=0.000 n=19+20)
Flate     25.0MB ± 0%       24.9MB ± 0%       -0.60%  (p=0.000 n=19+19)
GoParser  30.9MB ± 0%       30.7MB ± 0%       -0.66%  (p=0.000 n=20+20)
Reflect   77.5MB ± 0%       77.1MB ± 0%       -0.52%  (p=0.000 n=20+20)
Tar       39.2MB ± 0%       39.0MB ± 0%       -0.47%  (p=0.000 n=20+20)
XML       44.8MB ± 0%       44.6MB ± 0%       -0.45%  (p=0.000 n=20+19)

name      old allocs/op     new allocs/op     delta
Template  382k ± 0%         379k ± 0%         -0.69%  (p=0.000 n=20+19)
Unicode   337k ± 0%         336k ± 0%         -0.09%  (p=0.000 n=20+20)
GoTypes   1.19M ± 0%        1.18M ± 0%        -0.64%  (p=0.000 n=20+20)
Compiler  4.60M ± 0%        4.58M ± 0%        -0.57%  (p=0.000 n=20+20)
SSA       11.5M ± 0%        11.4M ± 0%        -0.42%  (p=0.000 n=19+20)
Flate     235k ± 0%         233k ± 0%         -0.74%  (p=0.000 n=20+19)
GoParser  316k ± 0%         313k ± 0%         -0.69%  (p=0.000 n=20+20)
Reflect   953k ± 0%         946k ± 0%         -0.81%  (p=0.000 n=20+20)
Tar       391k ± 0%         388k ± 0%         -0.61%  (p=0.000 n=20+19)
XML       413k ± 0%         411k ± 0%         -0.56%  (p=0.000 n=20+20)

Change-Id: I7378174e3550b47df4368b24cf24c8ce1b85c906 Reviewed-on: https://go-review.googlesource.com/104656 Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
2018-04-03  cmd/compile: stack-allocate values worklist in schedule  Alberto Donizetti
Compiler instrumentation shows that the cap of the stores slice in the storeOrder function is almost always 64 or less. Since the slice does not escape, pre-allocating a 64-element one on the stack greatly reduces the number of allocations performed by the function.

name      old time/op       new time/op       delta
Template  289ms ± 5%        283ms ± 3%        -1.99%  (p=0.000 n=19+20)
Unicode   140ms ± 6%        136ms ± 6%        -2.61%  (p=0.021 n=19+20)
GoTypes   915ms ± 2%        895ms ± 2%        -2.24%  (p=0.000 n=19+20)
Compiler  4.15s ± 1%        4.04s ± 2%        -2.73%  (p=0.000 n=20+20)
SSA       10.0s ± 1%        9.8s ± 2%         -2.13%  (p=0.000 n=20+20)
Flate     189ms ± 6%        186ms ± 4%        -1.75%  (p=0.028 n=19+20)
GoParser  229ms ± 5%        224ms ± 4%        -2.25%  (p=0.001 n=20+19)
Reflect   584ms ± 2%        573ms ± 3%        -1.83%  (p=0.000 n=18+20)
Tar       265ms ± 3%        261ms ± 3%        -1.33%  (p=0.021 n=20+20)
XML       328ms ± 2%        321ms ± 2%        -2.11%  (p=0.000 n=20+20)

name      old user-time/op  new user-time/op  delta
Template  408ms ± 4%        400ms ± 4%        -1.98%  (p=0.006 n=19+20)
Unicode   216ms ± 9%        216ms ± 7%        ~       (p=0.883 n=20+20)
GoTypes   1.25s ± 1%        1.23s ± 3%        -1.32%  (p=0.002 n=19+20)
Compiler  5.77s ± 1%        5.69s ± 2%        -1.47%  (p=0.000 n=18+19)
SSA       14.6s ± 5%        14.1s ± 4%        -3.45%  (p=0.000 n=20+20)
Flate     252ms ± 7%        251ms ± 7%        ~       (p=0.659 n=20+20)
GoParser  314ms ± 5%        310ms ± 5%        ~       (p=0.165 n=20+20)
Reflect   780ms ± 2%        769ms ± 3%        -1.34%  (p=0.004 n=19+18)
Tar       365ms ± 7%        367ms ± 5%        ~       (p=0.841 n=20+20)
XML       439ms ± 4%        432ms ± 4%        -1.45%  (p=0.043 n=20+20)

name      old alloc/op      new alloc/op      delta
Template  38.9MB ± 0%       38.8MB ± 0%       -0.26%  (p=0.000 n=19+20)
Unicode   29.0MB ± 0%       29.0MB ± 0%       -0.02%  (p=0.001 n=20+19)
GoTypes   115MB ± 0%        115MB ± 0%        -0.31%  (p=0.000 n=20+20)
Compiler  492MB ± 0%        490MB ± 0%        -0.41%  (p=0.000 n=20+19)
SSA       1.40GB ± 0%       1.39GB ± 0%       -0.48%  (p=0.000 n=20+20)
Flate     24.9MB ± 0%       24.9MB ± 0%       -0.24%  (p=0.000 n=20+20)
GoParser  30.9MB ± 0%       30.8MB ± 0%       -0.39%  (p=0.000 n=20+20)
Reflect   77.1MB ± 0%       76.8MB ± 0%       -0.32%  (p=0.000 n=17+20)
Tar       39.1MB ± 0%       39.0MB ± 0%       -0.23%  (p=0.000 n=20+20)
XML       44.7MB ± 0%       44.6MB ± 0%       -0.30%  (p=0.000 n=20+18)

name      old allocs/op     new allocs/op     delta
Template  385k ± 0%         382k ± 0%         -0.99%  (p=0.000 n=20+19)
Unicode   336k ± 0%         336k ± 0%         -0.08%  (p=0.000 n=19+17)
GoTypes   1.20M ± 0%        1.18M ± 0%        -1.11%  (p=0.000 n=20+18)
Compiler  4.66M ± 0%        4.59M ± 0%        -1.42%  (p=0.000 n=19+20)
SSA       11.6M ± 0%        11.5M ± 0%        -1.49%  (p=0.000 n=20+20)
Flate     237k ± 0%         235k ± 0%         -1.00%  (p=0.000 n=20+19)
GoParser  319k ± 0%         315k ± 0%         -1.12%  (p=0.000 n=20+20)
Reflect   960k ± 0%         952k ± 0%         -0.92%  (p=0.000 n=18+20)
Tar       394k ± 0%         390k ± 0%         -0.87%  (p=0.000 n=20+20)
XML       418k ± 0%         413k ± 0%         -1.18%  (p=0.000 n=20+20)

Change-Id: I01b9f45b161379967d7a52e23f39ac30dd90edb0 Reviewed-on: https://go-review.googlesource.com/104415 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-26  cmd/compile: implement comparisons directly with memory  Keith Randall
Allow the compiler to generate code like

    CMPQ 16(AX), $7

It's tricky because it's difficult to spill such a comparison during flagalloc, because the same memory state might not be available at the restore locations. Solve this problem by decomposing the compare+load back into its parts if it needs to be spilled.

The big win is that the write barrier test goes from:

    MOVL runtime.writeBarrier(SB), CX
    TESTL CX, CX
    JNE 60

to

    CMPL runtime.writeBarrier(SB), $0
    JNE 59

It's one instruction and one byte smaller.

Fixes #19485 Fixes #15245 Update #22460

Binaries are about 0.15% smaller.

Change-Id: I4fd8d1111b6b9924d52f9a0901ca1b2e5cce0836 Reviewed-on: https://go-review.googlesource.com/86035 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2017-05-15  cmd/compile: better check for single live memory  Keith Randall
Enhance the one-live-memory-at-a-time check to run during many more phases of the SSA backend. Also make it work in an interblock fashion. Change types.IsMemory to return true for tuples containing a memory type. Fix trim pass to build the merged phi correctly. Doesn't affect code but allows the check to pass after trim runs. Switch the AddTuple* ops to take the memory-containing tuple argument second. Update #20335 Change-Id: I5b03ef3606b75a9e4f765276bb8b183cdc172b43 Reviewed-on: https://go-review.googlesource.com/43495 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-11  cmd/compile: fix store chain in schedule pass  Keith Randall
Tuple ops are weird. They are essentially a pair of ops, one which consumes a mem and one which generates a mem (the Select1). The schedule pass didn't handle these quite right. Fix the scheduler to include both parts of the paired op in the store chain. That makes sure that loads are correctly ordered with respect to the first of the pair. Add a check for the ssacheck builder, that there is only one live store at a time. I thought we already had such a check, but apparently not... Fixes #20335 Change-Id: I59eb3446a329100af38d22820b1ca2190ca46a78 Reviewed-on: https://go-review.googlesource.com/43294 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-03-09  cmd/compile: clean up ssa.Value memory arg usage  Philip Hofer
This change adds a method to replace expressions of the form v.Args[len(v.Args)-1] so that the code's intention to walk memory arguments is explicit. Passes toolstash-check. Change-Id: I0c80d73bc00989dd3cdf72b4f2c8e1075a2515e0 Reviewed-on: https://go-review.googlesource.com/37757 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
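As a rough illustration of the pattern this change replaces, the sketch below wraps the "last argument is the memory argument" convention in a method. The `value` type, its `isMem` field, and the `memoryArg` name are all hypothetical stand-ins, not the definitions in cmd/compile/internal/ssa.

```go
package main

import "fmt"

// value is a toy stand-in for an SSA value (hypothetical, for illustration).
type value struct {
	name  string
	isMem bool     // true if this value represents a memory state
	args  []*value // by convention, a memory argument comes last
}

// memoryArg returns the trailing argument if it is a memory value, or nil
// otherwise. Callers no longer spell out v.args[len(v.args)-1] themselves.
func (v *value) memoryArg() *value {
	n := len(v.args)
	if n == 0 || !v.args[n-1].isMem {
		return nil
	}
	return v.args[n-1]
}

func main() {
	mem := &value{name: "mem0", isMem: true}
	ptr := &value{name: "ptr"}
	load := &value{name: "load", args: []*value{ptr, mem}}
	add := &value{name: "add", args: []*value{ptr, ptr}}

	fmt.Println(load.memoryArg() == mem)
	fmt.Println(add.memoryArg() == nil)
}
```

Naming the access makes a walk along the memory chain read as such, which is the intent the commit message states.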
2017-02-21  cmd/compile: fix storeOrder  Cherry Zhang
storeOrder visits values in DFS order. It should "break" after pushing one argument to stack, instead of "continue". Fixes #19179. Change-Id: I561afb44213df40ebf8bf7d28e0fd00f22a81ac0 Reviewed-on: https://go-review.googlesource.com/37250 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
2017-02-17  cmd/compile: redo writebarrier pass  Cherry Zhang
SSA's writebarrier pass requires that WB store ops always be at the end of a block. If we move write barrier insertion into SSA and emit normal Store ops when building SSA, this requirement becomes impractical -- it will create too many blocks for all the Store ops. Redo SSA's writebarrier pass to explicitly order values in store order, so it no longer needs this requirement. Updates #17583. Fixes #19067. Change-Id: I66e817e526affb7e13517d4245905300a90b7170 Reviewed-on: https://go-review.googlesource.com/36834 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-12-08  [dev.inline] cmd/compile/internal/ssa: rename various fields from Line to Pos  Robert Griesemer
This is a mostly mechanical rename followed by manual fixes where necessary. Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842 Reviewed-on: https://go-review.googlesource.com/34137 Reviewed-by: David Lazar <lazard@golang.org>
2016-12-08  [dev.inline] cmd/internal/src: make Pos implementation abstract  Robert Griesemer
Adjust cmd/compile accordingly. This will make it easier to replace the underlying implementation. Change-Id: I33645850bb18c839b24785b6222a9e028617addb Reviewed-on: https://go-review.googlesource.com/34133 Reviewed-by: David Lazar <lazard@golang.org>
2016-11-08  cmd/compile/internal/ssa: add support for GOARCH=mips{,le}  Vladimir Stefanovic
Change-Id: I632d4aef7295778ba5018d98bcb06a68bcf07ce1 Reviewed-on: https://go-review.googlesource.com/31478 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-09-23  cmd/compile: recognize OpS390XLoweredNilCheck as a nil check in the scheduler  Michael Munday
Before this change a nil check on s390x could be scheduled after the target pointer has been dereferenced. Change-Id: I7ea40a4b52f975739f6db183a2794be4981c4e3d Reviewed-on: https://go-review.googlesource.com/29730 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-09-15  cmd/compile: redo nil checks  Keith Randall
Get rid of BlockCheck. Josh goaded me into it, and I went down a rabbithole making it happen. NilCheck now panics if the pointer is nil and returns void, as before. BlockCheck is gone, and NilCheck is no longer a Control value for any block. It just exists (and deadcode knows not to throw it away). I rewrote the nilcheckelim pass to handle this case. In particular, there can now be multiple NilCheck ops per block. I moved all of the arch-dependent nil check elimination done as part of ssaGenValue into its own proper pass, so we don't have to duplicate that code for every architecture. Making the arch-dependent nil check its own pass means I needed to add a bunch of flags to the opcode table so I could write the code without arch-dependent ops everywhere. Change-Id: I419f891ac9b0de313033ff09115c374163416a9f Reviewed-on: https://go-review.googlesource.com/29120 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-09-13  cmd/compile: add SSA backend for s390x and enable by default  Michael Munday
The new SSA backend modifies the ABI slightly: R0 is now a usable general purpose register. Fixes #16677. Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf Reviewed-on: https://go-review.googlesource.com/28978 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-09-01  cmd/compile: fix scheduling of memory-producing tuple ops  Cherry Zhang
Intrinsified atomic op produces <value,memory>. Make sure this memory is considered in the store chain calculation. Fixes #16948. Change-Id: I029f164b123a7e830214297f8373f06ea0bf1e26 Reviewed-on: https://go-review.googlesource.com/28350 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-08-25  cmd/compile: get MIPS64 SSA working  Cherry Zhang
- implement *, /, %, shifts, Zero, Move.
- fix mistakes in comparison.
- fix floating point rounding.
- handle RetJmp in assembler (which was not handled, as a consequence Duff's device was disabled in the old backend.)

all.bash now passes with SSA on.

Updates #16359.

Change-Id: Ia14eed0ed1176b5d800592080c8f53dded7fe73f Reviewed-on: https://go-review.googlesource.com/27592 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-07-27  [dev.ssa] cmd/compile: add more on ARM64 SSA  Cherry Zhang
Support the following:
- Shifts. ARM64 machine instructions only use lowest 6 bits of the shift (i.e. mod 64). Use conditional selection instruction to ensure Go semantics.
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- reserve R18 (platform register in ARM64 ABI) and R29 (frame pointer in ARM64 ABI).

Everything compiles, all.bash passed (with non-SSA test disabled).

Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1 Reviewed-on: https://go-review.googlesource.com/25290 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>