aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/internal/obj/x86
AgeCommit message (Collapse)Author
2023-01-26runtime: use explicit NOFRAME on darwin/amd64qmuntal
This CL marks some darwin assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Change-Id: I797f3909bcf7f7aad304e4ede820c884231e54f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/460235 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-24runtime: use explicit NOFRAME on windows/amd64qmuntal
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #57302 Updates #40044 Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61 Reviewed-on: https://go-review.googlesource.com/c/go/+/459395 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-23runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/i386qmuntal
This CL redesign how we get the TLS pointer on windows/i386. It applies the same changes as done in CL 431775 for windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x14(FS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0xE10(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0xE10 + index*4. The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(FS), AX Notice that this approach will now be implemented in all the supported windows arches. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Change-Id: If4550b0d44694ee6480d4093b851f4991a088b32 Reviewed-on: https://go-review.googlesource.com/c/go/+/454675 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-15cmd/internal/obj: use testenv.Command instead of exec.Command in testsBryan C. Mills
testenv.Command sets a default timeout based on the test's deadline and sends SIGQUIT (where supported) in case of a hang. Change-Id: Ica1a9985f9abb1935434367c9c8ba28fc50f331d Reviewed-on: https://go-review.googlesource.com/c/go/+/450699 Auto-Submit: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Bryan Mills <bcmills@google.com>
2022-11-14runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/amd64qmuntal
This CL redesign how we get the TLS pointer on windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x28(GS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0x1480(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0x1480 + index*8 The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(GS), AX Notice that this approach is also implemented on windows/arm64. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Updates #22192 Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b Reviewed-on: https://go-review.googlesource.com/c/go/+/431775 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2022-11-09all: add missing copyright headercui fliter
Change-Id: Ia5a090953d324f0f8aa9c1808c88125ad5eb6f98 Reviewed-on: https://go-review.googlesource.com/c/go/+/448955 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Bryan Mills <bcmills@google.com>
2022-09-29cmd/internal/obj/x86: return comparison directlycuiweixie
Change-Id: I4b596b252c1785b13c4a166e9ef5f4ae812cd1bc Reviewed-on: https://go-review.googlesource.com/c/go/+/436704 TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-09-20all: replace package ioutil with os and io in srcAndy Pan
For #45557 Change-Id: I56824135d86452603dd4ed4bab0e24c201bb0683 Reviewed-on: https://go-review.googlesource.com/c/go/+/426257 Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Andy Pan <panjf2000@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-04-14cmd/compile: implement jump tablesKeith Randall
Performance is kind of hard to exactly quantify. One big difference between jump tables and the old binary search scheme is that there's only 1 branch statement instead of O(n) of them. That can be both a blessing and a curse, and can make evaluating jump tables very hard to do. The single branch can become a choke point for the hardware branch predictor. A branch table jump must fit all of its state in a single branch predictor entry (technically, a branch target predictor entry). With binary search that predictor state can be spread among lots of entries. In cases where the case selection is repetitive and thus predictable, binary search can perform better. The big win for a jump table is that it doesn't consume so much of the branch predictor's resources. But that benefit is essentially never observed in microbenchmarks, because the branch predictor can easily keep state for all the binary search branches in a microbenchmark. So that benefit is really hard to measure. So predictable switch microbenchmarks are ~useless - they will almost always favor the binary search scheme. Fully unpredictable switch microbenchmarks are better, as they aren't lying to us quite so much. In a perfectly unpredictable situation, a jump table will expect to incur 1-1/N branch mispredicts, where a binary search would incur lg(N)/2 of them. That makes the crossover point at about N=4. But of course switches in real programs are seldom fully unpredictable, so we'll use a higher crossover point. Beyond the branch predictor, jump tables tend to execute more instructions per switch but have no additional instructions per case, which also argues for a larger crossover. As far as code size goes, with this CL cmd/go has a slightly smaller code segment and a slightly larger overall size (from the jump tables themselves which live in the data segment). This is a case where some FDO (feedback-directed optimization) would be really nice to have. #28262 Some large-program benchmarks might help make the case for this CL. Especially if we can turn on branch mispredict counters so we can see how much using jump tables can free up branch prediction resources that can be gainfully used elsewhere in the program. name old time/op new time/op delta Switch8Predictable 1.89ns ± 2% 1.27ns ± 3% -32.58% (p=0.000 n=9+10) Switch8Unpredictable 9.33ns ± 1% 7.50ns ± 1% -19.60% (p=0.000 n=10+9) Switch32Predictable 2.20ns ± 2% 1.64ns ± 1% -25.39% (p=0.000 n=10+9) Switch32Unpredictable 10.0ns ± 2% 7.6ns ± 2% -24.04% (p=0.000 n=10+10) Fixes #5496 Update #34381 Change-Id: I3ff56011d02be53f605ca5fd3fb96b905517c34f Reviewed-on: https://go-review.googlesource.com/c/go/+/357330 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com>
2022-04-11all: gofmt main repoRuss Cox
[This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-01all: remove trailing blank doc comment linesRuss Cox
A future change to gofmt will rewrite // Doc comment. // func f() to // Doc comment. func f() Apply that change preemptively to all doc comments. For #51082. Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d Reviewed-on: https://go-review.googlesource.com/c/go/+/384259 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2021-11-24cmd/internal/obj/x86: modify the threshold of assert loop for span6zhouguangyuan
Fixes: #49716 Change-Id: I7ed73f874c2ee1ee3f31c9c4428ed484167ca803 Reviewed-on: https://go-review.googlesource.com/c/go/+/366094 Reviewed-by: Cherry Mui <cherryyz@google.com> Trust: Heschi Kreinick <heschi@google.com>
2021-11-05cmd/{asm,compile,internal/obj}: add "maymorestack" supportAustin Clements
This adds a debugging hook for optionally calling a "maymorestack" function in the prologue of any function that might call morestack (whether it does at run time or not). The maymorestack function will let us improve lock checking and add debugging modes that stress function preemption and stack growth. Passes toolstash-check -all (except on js/wasm, where toolstash appears to be broken) Fixes #48297. Change-Id: I27197947482b329af75dafb9971fc0d3a52eaf31 Reviewed-on: https://go-review.googlesource.com/c/go/+/359795 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-10-18cmd/asm: report an error when trying to do spectre on 386Keith Randall
The compiler refuses to do spectre mitigation on 386, but the assembler doesn't. Fix that. Fixes #49006 Change-Id: I887b6f7ed7523a47f463706f06ca4c2c6e828b6b Reviewed-on: https://go-review.googlesource.com/c/go/+/356190 Trust: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-10-08cmd/internal/obj: rename MOVBE{LL,QQ,WW} to just MOVBE{L,Q,W}Matthew Dempsky
The double suffix doesn't seem to serve any purpose, and we can keep the old spelling as a backwards compatible alias in cmd/asm. Change-Id: I3f01fc7249fb093ac1b25bd75c1cb9f39b8f62a9 Reviewed-on: https://go-review.googlesource.com/c/go/+/354700 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Trust: Matthew Dempsky <mdempsky@google.com>
2021-08-03[dev.typeparams] runtime,cmd/compile,cmd/link: replace jmpdefer with a loopAustin Clements
Currently, deferreturn runs deferred functions by backing up its return PC to the deferreturn call, and then effectively tail-calling the deferred function (via jmpdefer). The effect of this is that the deferred function appears to be called directly from the deferee, and when it returns, the deferee calls deferreturn again so it can run the next deferred function if necessary. This unusual flow control leads to a large number of special cases and complications all over the tool chain. This used to be necessary because deferreturn copied the deferred function's argument frame directly into its caller's frame and then had to invoke that call as if it had been called from its caller's frame so it could access it arguments. But now that we've simplified defer processing so the runtime only deals with argument-less closures, this approach is no longer necessary. This CL simplifies all of this by making deferreturn simply call deferred functions in a loop. This eliminates the need for jmpdefer, so we can delete a bunch of per-architecture assembly code. This eliminates several special cases on Wasm, since it couldn't support these calling shenanigans directly and thus had to simulate the loop a different way. Now Wasm can largely work the way the other platforms do. This eliminates the per-architecture Ginsnopdefer operation. On PPC64, this was necessary to reload the TOC pointer after the tail call (since TOC pointers in general make tail calls impossible). The tail call is gone, and in the case where we do force a jump to the deferreturn call when recovering from an open-coded defer, we go through gogo (via runtime.recovery), which handles the TOC. On other platforms, we needed a NOP so traceback didn't get confused by seeing the return to the CALL instruction, rather than the usual return to the instruction following the CALL instruction. Now we don't inject a return to the CALL instruction at all, so this NOP is also unnecessary. The one potential effect of this is that deferreturn could now appear in stack traces from deferred functions. However, this could already happen from open-coded defers, so we've long since marked deferreturn as a "wrapper" so it gets elided not only from printed stack traces, but from runtime.Callers*. This is a retry of CL 337652 because we had to back out its parent. There are no changes in this version. Change-Id: I3f54b7fec1d7ccac71cc6cf6835c6a46b7e5fb6c Reviewed-on: https://go-review.googlesource.com/c/go/+/339397 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-07-30[dev.typeparams] Revert "[dev.typeparams] runtime,cmd/compile,cmd/link: ↵Austin Clements
replace jmpdefer with a loop" This reverts CL 227652. I'm reverting CL 337651 and this builds on top of it. Change-Id: I03ce363be44c2a3defff2e43e7b1aad83386820d Reviewed-on: https://go-review.googlesource.com/c/go/+/338709 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-07-30[dev.typeparams] runtime,cmd/compile,cmd/link: replace jmpdefer with a loopAustin Clements
Currently, deferreturn runs deferred functions by backing up its return PC to the deferreturn call, and then effectively tail-calling the deferred function (via jmpdefer). The effect of this is that the deferred function appears to be called directly from the deferee, and when it returns, the deferee calls deferreturn again so it can run the next deferred function if necessary. This unusual flow control leads to a large number of special cases and complications all over the tool chain. This used to be necessary because deferreturn copied the deferred function's argument frame directly into its caller's frame and then had to invoke that call as if it had been called from its caller's frame so it could access it arguments. But now that we've simplified defer processing so the runtime only deals with argument-less closures, this approach is no longer necessary. This CL simplifies all of this by making deferreturn simply call deferred functions in a loop. This eliminates the need for jmpdefer, so we can delete a bunch of per-architecture assembly code. This eliminates several special cases on Wasm, since it couldn't support these calling shenanigans directly and thus had to simulate the loop a different way. Now Wasm can largely work the way the other platforms do. This eliminates the per-architecture Ginsnopdefer operation. On PPC64, this was necessary to reload the TOC pointer after the tail call (since TOC pointers in general make tail calls impossible). The tail call is gone, and in the case where we do force a jump to the deferreturn call when recovering from an open-coded defer, we go through gogo (via runtime.recovery), which handles the TOC. On other platforms, we needed a NOP so traceback didn't get confused by seeing the return to the CALL instruction, rather than the usual return to the instruction following the CALL instruction. Now we don't inject a return to the CALL instruction at all, so this NOP is also unnecessary. The one potential effect of this is that deferreturn could now appear in stack traces from deferred functions. However, this could already happen from open-coded defers, so we've long since marked deferreturn as a "wrapper" so it gets elided not only from printed stack traces, but from runtime.Callers*. Change-Id: Ie9f700cd3fb774f498c9edce363772a868407bf7 Reviewed-on: https://go-review.googlesource.com/c/go/+/337652 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-06-11[dev.typeparams] all: always enable regabig on AMD64Cherry Mui
Always enable regabig on AMD64, which enables the G register and the X15 zero register. Remove the fallback path. Also remove the regabig GOEXPERIMENT. On AMD64 it is always enabled (this CL). Other architectures already have a G register, except for 386, where there are too few registers and it is unlikely that we will reserve one. (If we really do, we can just add a new experiment). Change-Id: I229cac0060f48fe58c9fdaabd38d6fa16b8a0855 Reviewed-on: https://go-review.googlesource.com/c/go/+/327272 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org>
2021-04-16internal/buildcfg: move build configuration out of cmd/internal/objabiRuss Cox
The go/build package needs access to this configuration, so move it into a new package available to the standard library. Change-Id: I868a94148b52350c76116451f4ad9191246adcff Reviewed-on: https://go-review.googlesource.com/c/go/+/310731 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-15runtime,runtime/cgo: save all necessary registers on entry to Go on WindowsAustin Clements
There are several assembly functions that transition from the Windows ABI to the Go ABI. These all need to save all registers that are callee-save in the Windows ABI and caller-save in the Go ABI and prepare the register state for Go. However, they all do this slightly differently and most of them don't save the necessary XMM registers for this transition (which could corrupt them in the C caller). Furthermore, now that we have a carefully specified Go ABI, it's clear that none of these actually get all of the details 100% right. So, unify this code into two macros in a shared header in runtime/cgo/abi_amd64.h that handle all necessary registers and setup and use these macros everywhere on Windows that handles transitions from C to Go. Change-Id: I62f41345a507aad1ca383814ac8b7e2a9ffb821e Reviewed-on: https://go-review.googlesource.com/c/go/+/309769 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-05cmd/compile: untangle Wrapper and ABIWrapper flagsCherry Zhang
Currently, there are Wrapper and ABIWrapper attributes. Wrapper is set when compiler generates an wrapper function (e.g. method wrapper). ABIWrapper is set when compiler generates an ABI wrapper. It also sets Wrapper flag for ABI wrappers. Currently, they have the following meanings: - Wrapper flag hides the frame from (normal) traceback. - Wrapper flag enables special panic+recover adjustment, so it can correctly recover when a wrapper function is deferred. - ABIWrapper flag disables the panic+recover adjustment, because we never defer an ABI wrapper that can recover. This CL changes them to: - Both Wrapper and ABIWrapper flags hide the frame from (normal) traceback. (Setting one is enough.) - Wrapper flag enables special panic+recover adjustment. ABIWrapper flag no longer has effect on this. This makes it clearer if we do want an ABI wrapper that also does the panic+recover adjustment. In the old mechanism we'd have to unset ABIWrapper flag, even if the function is actually an ABI wrapper. In the new mechanism we just need to set both ABIWrapper and Wrapper flags. Updates #40724. Change-Id: I7fbc83f85d23676dc94db51dfda63dcacdf1fc19 Reviewed-on: https://go-review.googlesource.com/c/go/+/307235 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Austin Clements <austin@google.com>
2021-04-05cmd/internal/obj/x86: simplify huge frame prologueAustin Clements
For stack frames larger than StackBig, the stack split prologue needs to guard against potential wraparound. Currently, it carefully arranges to avoid underflow, but this is complicated and requires a special check for StackPreempt. StackPreempt is no longer the only stack poison value, so this check will incorrectly succeed if the stack bound is poisoned with any other value. This CL simplifies the logic of the check, reduces its length, and accounts for any possible poison value by directly checking for underflow. Change-Id: I917a313102d6a21895ef7c4b0f304fb84b292c81 Reviewed-on: https://go-review.googlesource.com/c/go/+/307010 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2021-04-02cmd/internal/obj: use REGENTRYTMP* in a few more placesAustin Clements
There are a few remaining places in obj6 where we hard-code safe-on-entry registers. Fix those to use the consts. For #40724. Change-Id: Ie640521aa67d6c99bc057553dc122160049c6edc Reviewed-on: https://go-review.googlesource.com/c/go/+/307009 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-31cmd/internal/obj/x86: use ABI scratch registers for WRAPPER prologueMichael Anthony Knyszek
Currently the prologue generated for WRAPPER assembly functions uses BX and DI, but these are argument registers in the register-based calling convention. Thus, these end up being clobbered when we want to have an ABIInternal assembly function. Define REGENTRYTMP0 and REGENTRYTMP1, aliases for the dedicated function entry scratch registers R12 and R13, and use those instead. For #40724. Change-Id: Ica78c4ccc67a757359900a66b56ef28c83d88b3e Reviewed-on: https://go-review.googlesource.com/c/go/+/303314 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-18all: explode GOEXPERIMENT=regabi into 5 sub-experimentsAustin Clements
This separates GOEXPERIMENT=regabi into five sub-experiments: regabiwrappers, regabig, regabireflect, regabidefer, and regabiargs. Setting GOEXPERIMENT=regabi now implies the working subset of these (currently, regabiwrappers, regabig, and regabireflect). This simplifies testing, helps derisk the register ABI project, and will also help with performance comparisons. This replaces the -abiwrap flag to the compiler and linker with the regabiwrappers experiment. As part of this, regabiargs now enables registers for all calls in the compiler. Previously, this was statically disabled in regabiEnabledForAllCompilation, but now that we can control it independently, this isn't necessary. For #40724. Change-Id: I5171e60cda6789031f2ef034cc2e7c5d62459122 Reviewed-on: https://go-review.googlesource.com/c/go/+/302070 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
2021-03-16cmd/asm: when dynamic linking, reject code that uses a clobbered R15Keith Randall
The assember uses R15 as scratch space when assembling global variable references in dynamically linked code. If the assembly code uses the clobbered value of R15, report an error. The user is probably expecting some other value in that register. Getting rid of the R15 use isn't very practical (we could save a register to a field in the G maybe, but that gets cumbersome). Fixes #43661 Change-Id: I43f848a3d8b8a28931ec733386b85e6e9a42d8ff Reviewed-on: https://go-review.googlesource.com/c/go/+/283474 Trust: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-09cmd/compile: fix failure to communicate between ABIinfo producer&consumerDavid Chase
ABI info producer and consumer had different ideas for register order for parameters. Includes a test, includes improvements to debugging output. Updates #44816. Change-Id: I4812976f7a6c08d6fc02aac1ec0544b1f141cca6 Reviewed-on: https://go-review.googlesource.com/c/go/+/299570 Trust: David Chase <drchase@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2021-03-04cmd/compile: register abi, morestack work and mole whackingDavid Chase
Morestack works for non-pointer register parameters Within a function body, pointer-typed parameters are correctly tracked. Results still not hooked up. For #40724. Change-Id: Icaee0b51d0da54af983662d945d939b756088746 Reviewed-on: https://go-review.googlesource.com/c/go/+/294410 Trust: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-24docs: fix spellingJohn Bampton
Change-Id: Ib689e5793d9cb372e759c4f34af71f004010c822 GitHub-Last-Rev: d63798388e5dcccb984689b0ae39b87453b97393 GitHub-Pull-Request: golang/go#44259 Reviewed-on: https://go-review.googlesource.com/c/go/+/291949 Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Matthew Dempsky <mdempsky@google.com> Trust: Robert Griesemer <gri@golang.org>
2021-02-19cmd/asm, cmd/link, runtime: introduce FuncInfo flag bitsRuss Cox
The runtime traceback code has its own definition of which functions mark the top frame of a stack, separate from the TOPFRAME bits that exist in the assembly and are passed along in DWARF information. It's error-prone and redundant to have two different sources of truth. This CL provides the actual TOPFRAME bits to the runtime, so that the runtime can use those bits instead of reinventing its own category. This CL also adds a new bit, SPWRITE, which marks functions that write directly to SP (anything but adding and subtracting constants). Such functions must stop a traceback, because the traceback has no way to rederive the SP on entry. Again, the runtime has its own definition which is mostly correct, but also missing some functions. During ordinary goroutine context switches, such functions do not appear on the stack, so the incompleteness in the runtime usually doesn't matter. But profiling signals can arrive at any moment, and the runtime may crash during traceback if it attempts to unwind an SP-writing frame and gets out-of-sync with the actual stack. The runtime contains code to try to detect likely candidates but again it is incomplete. Deriving the SPWRITE bit automatically from the actual assembly code provides the complete truth, and passing it to the runtime lets the runtime use it. This CL is part of a stack adding windows/arm64 support (#36439), intended to land in the Go 1.17 cycle. This CL is, however, not windows/arm64-specific. It is cleanup meant to make the port (and future ports) easier. Change-Id: I227f53b23ac5b3dabfcc5e8ee3f00df4e113cf58 Reviewed-on: https://go-review.googlesource.com/c/go/+/288800 Trust: Russ Cox <rsc@golang.org> Trust: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-08[dev.regabi] cmd/internal/obj/x86: use g register in stack bounds checkCherry Zhang
In ABIInternal context, we can directly use the g register for stack bounds check. Change-Id: I8b1073a3343984a6cd76cf5734ddc4a8cd5dc73f Reviewed-on: https://go-review.googlesource.com/c/go/+/289711 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2021-02-05[dev.regabi] cmd/asm: define g register on AMD64Cherry Zhang
Define g register as R14 on AMD64. It is not used now, but will be in later CLs. The name "R14" is still recognized. Change-Id: I9a066b15bf1051113db8c6640605e350cea397b9 Reviewed-on: https://go-review.googlesource.com/c/go/+/289195 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2021-01-13[dev.regabi] cmd/compile: add code to support register ABI spills around ↵David Chase
morestack calls This is a selected copy from the register ABI experiment CL, focused on the files and data structures that handle spilling around morestack. Unnecessary code from the experiment was removed, other code was adapted. Would it make sense to leave comments in the experiment as pieces are brought over? Experiment CL (for comparison purposes) https://go-review.googlesource.com/c/go/+/28832 Change-Id: I92136f070351d4fcca1407b52ecf9b80898fed95 Reviewed-on: https://go-review.googlesource.com/c/go/+/279520 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>
2020-12-22[dev.regabi] cmd/compile,cmd/link: initial support for ABI wrappersThan McIntosh
Add compiler support for emitting ABI wrappers by creating real IR as opposed to introducing ABI aliases. At the moment these are "no-op" wrappers in the sense that they make a simple call (using the existing ABI) to their target. The assumption here is that once late call expansion can handle both ABI0 and the "new" ABIInternal (register version), it can expand the call to do the right thing. Note that the runtime contains functions that do not strictly follow the rules of the current Go ABI0; this has been handled in most cases by treating these as ABIInternal instead (these changes have been made in previous patches). Generation of ABI wrappers (as opposed to ABI aliases) is currently gated by GOEXPERIMENT=regabi -- wrapper generation is on by default if GOEXPERIMENT=regabi is set and off otherwise (but can be turned on using "-gcflags=all=-abiwrap -ldflags=-abiwrap"). Wrapper generation currently only workd on AMD64; explicitly enabling wrapper for other architectures (via the command line) is not supported. Also in this patch are a few other command line options for debugging (tracing and/or limiting wrapper creation). These will presumably go away at some point. Updates #27539, #40724. Change-Id: I1ee3226fc15a3c32ca2087b8ef8e41dbe6df4a75 Reviewed-on: https://go-review.googlesource.com/c/go/+/270863 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Trust: Than McIntosh <thanm@google.com>
2020-11-02cmd: remove Go115AMD64Cherry Zhang
Always do aligned jumps now. Change-Id: If68a16fe93c9173c83323a9063465c9bd166eeb8 Reviewed-on: https://go-review.googlesource.com/c/go/+/266857 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2020-10-30reflect,runtime: use internal ABI for selected ASM routines, attempt 2Than McIntosh
[This is a roll-forward of CL 262319, with a fix for some Darwin test failures]. Change the definitions of selected runtime assembly routines from ABI0 (the default) to ABIInternal. The ABIInternal def is intended to indicate that these functions don't follow the existing Go runtime ABI. In addition, convert the assembly reference to runtime.main (from runtime.mainPC) to ABIInternal. Finally, for functions such as "runtime.duffzero" that are called directly from generated code, make sure that the compiler looks up the correct ABI version. This is intended to support the register abi work, however these changes should not have any issues even when GOEXPERIMENT=regabi is not in effect. Updates #27539, #40724. Change-Id: Idf507f1c06176073563845239e1a54dad51a9ea9 Reviewed-on: https://go-review.googlesource.com/c/go/+/266638 Trust: Than McIntosh <thanm@google.com> Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org>
2020-10-29cmd/internal/obj/arm64: add CASx/CASPx instructionsfanzha02
This patch adds support for CASx and CASPx atomic instructions. go syntax gnu syntax CASD Rs, (Rn|RSP), Rt => cas Xs, Xt, (Xn|SP) CASALW Rs, (Rn|RSP), Rt => casal Ws, Wt, (Xn|SP) CASPD (Rs, Rs+1), (Rn|RSP), (Rt, Rt+1) => casp Xs, Xs+1, Xt, Xt+1, (Xn|SP) CASPW (Rs, Rs+1), (Rn|RSP), (Rt, Rt+1) => casp Ws, Ws+1, Wt, Wt+1, (Xn|SP) This patch changes the type of prog.RestArgs from "[]Addr" to "[]struct{Addr, Pos}", Pos is a enum, indicating the position of the operand. This patch also adds test cases. Change-Id: Ib971cfda7890b7aa895d17bab22dea326c7fcaa4 Reviewed-on: https://go-review.googlesource.com/c/go/+/233277 Trust: fannie zhang <Fannie.Zhang@arm.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-16cmd/internal/obj: move LSym.Func into LSym.ExtraRuss Cox
This creates space for a different kind of extension field in LSym without making the struct any larger. (There are many LSym, so we care about keeping the struct small.) Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/243943 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-28cmd/link: consider interface conversions only in reachable codeCherry Zhang
The linker prunes methods that are not directly reachable if the receiver type is never converted to interface. Currently, this "never" is too strong: it is invalidated even if the interface conversion is in an unreachable function. This CL improves it by only considering interface conversions in reachable code. To do that, we introduce a marker relocation R_USEIFACE, which marks the target symbol as UsedInIface if the source symbol is reached. binary size before after cmd/compile 18897528 18887400 cmd/go 13607372 13470652 Change-Id: I66c6b69eeff9ae02d84d2e6f2bc7f1b29dd53910 Reviewed-on: https://go-review.googlesource.com/c/go/+/256797 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Jeremy Faller <jeremy@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>
2020-08-31cmd/compile,cmd/asm: simplify recording of branch targets, take 2Keith Randall
We currently use two fields to store the targets of branches. Some phases use p.To.Val, some use p.Pcond. Rewrite so that every branch instruction uses p.To.Val. p.From.Val is also used in rare instances. Introduce a Pool link for use by arm/arm64, instead of repurposing Pcond. This is a cleanup CL in preparation for some stack frame CLs. Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405 Reviewed-on: https://go-review.googlesource.com/c/go/+/251437 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-28Revert "cmd/compile,cmd/asm: simplify recording of branch targets"Keith Randall
This reverts CL 243318. Reason for revert: Seems to be crashing some builders. Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756 Reviewed-on: https://go-review.googlesource.com/c/go/+/251169 Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-08-27cmd/compile,cmd/asm: simplify recording of branch targetsKeith Randall
We currently use two fields to store the targets of branches. Some phases use p.To.Val, some use p.Pcond. Rewrite so that every branch instruction uses p.To.Val. p.From.Val is also used in rare instances. Introduce a Pool link for use by arm/arm64, instead of repurposing Pcond. This is a cleanup CL in preparation for some stack frame CLs. Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/243318 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-27runtime: framepointers are no longer an experiment - hard code themKeith Randall
I think they are no longer experimental status. Might as well promote them to permanent. Change-Id: Id1259601b3dd2061dd60df86ee48080bfb575d2f Reviewed-on: https://go-review.googlesource.com/c/go/+/249857 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2020-06-04all: fix dead links to inferno-os bitbucket repositoryTobias Klauser
Generated using: perl -i -npe 's#inferno-os/src/default#inferno-os/src/master#' $(git grep -l "inferno-os/src/default" | grep -v vendor) Change-Id: I4b6443bd09a8ea4c8aaeb40a1c73520d1f7ca648 Reviewed-on: https://go-review.googlesource.com/c/go/+/235821 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>
2020-05-06cmd/internal/obj, runtime: preempt & restart some instruction sequencesCherry Zhang
On some architectures, for async preemption the injected call needs to clobber a register (usually REGTMP) in order to return to the preempted function. As a consequence, the PC ranges where REGTMP is live are not preemptible. The uses of REGTMP are usually generated by the assembler, where it needs to load or materialize a large constant or offset that doesn't fit into the instruction. In those cases, REGTMP is not live at the start of the instruction sequence. Instead of giving up preemption in those cases, we could preempt it and restart the sequence when resuming the execution. Basically, this is like reissuing an interrupted instruction, except that here the "instruction" is a Prog that consists of multiple machine instructions. For this to work, we need to generate PC data to mark the start of the Prog. Currently this is only done for ARM64. TODO: the split-stack function prologue is currently not async preemptible. We could use this mechanism, preempt it and restart at the function entry. Change-Id: I37cb282f8e606e7ab6f67b3edfdc6063097b4bd1 Reviewed-on: https://go-review.googlesource.com/c/go/+/208126 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2020-05-01cmd/internal/obj/x86: prevent jumps crossing 32 byte boundariesMark Ryan
This commit adds a new option to the x86 assembler. If the GOAMD64 environment variable is set to alignedjumps (the default) and we're doing a 64 bit build, the assembler will make sure that neither stand alone nor macro-fused jumps will end on or cross 32 byte boundaries. To achieve this, functions are aligned on 32 byte boundaries, rather than 16 bytes, and jump instructions are padded to ensure that they do not cross or end on 32 byte boundaries. Jumps are padded by adding a NOP instruction of the appropriate length before the jump. The commit is likely to result in larger binary sizes when GOAMD64=alignedjumps. On the binaries tested so far, an increase of between 1.4% and 1.5% has been observed. Updates #35881 Co-authored-by: David Chase <drchase@google.com> Change-Id: Ief0722300bc3f987098e4fd92b22b14ad6281d91 Reviewed-on: https://go-review.googlesource.com/c/go/+/219357 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-03-13cmd/asm, cmd/compile, runtime: add -spectre=ret modeRuss Cox
This commit extends the -spectre flag to cmd/asm and adds a new Spectre mitigation mode "ret", which enables the use of retpolines. Retpolines prevent speculation about the target of an indirect jump or call and are described in more detail here: https://support.google.com/faqs/answer/7625886 Change-Id: I4f2cb982fa94e44d91e49bd98974fd125619c93a Reviewed-on: https://go-review.googlesource.com/c/go/+/222661 Reviewed-by: Keith Randall <khr@golang.org>
2019-11-27cmd/internal/obj: mark split-stack prologue nonpreemptibleCherry Zhang
When there are both a synchronous preemption request (by clobbering the stack guard) and an asynchronous one (by signal), the running goroutine may observe the synchronous request first in stack bounds check, and go to the path of calling morestack. If the preemption signal arrives at this point before the call to morestack, the goroutine will be asynchronously preempted, entering the scheduler. When it is resumed, the scheduler clears the preemption request, unclobbers the stack guard. But the resumed goroutine will still call morestack, as it is already on its way. morestack will, as there is no preemption request, double the stack unnecessarily. If this happens multiple times, the stack may grow too big, although only a small amount is actually used. To fix this, we mark the stack bounds check and the call to morestack async-nonpreemptible, starting after the memory instruction (mostly a load, on x86 CMP with memory). Not done for Wasm as it does not support async preemption. Fixes #35470. Change-Id: Ibd7f3d935a3649b80f47539116ec9b9556680cf2 Reviewed-on: https://go-review.googlesource.com/c/go/+/207350 Reviewed-by: David Chase <drchase@google.com>
2019-11-15cmd/internal/obj/x86: mark 2-instruction TLS access nonpreemptibleCherry Zhang
The 2-instruction TLS access sequence MOVQ TLS, BX MOVQ 0(BX)(TLS*1), BX is not async preemptible, as if it is preempted and resumed on a different thread, the TLS address may become invalid. May fix #35349. (This is a rare failure and I haven't been able to reproduce it.) Change-Id: Ie1a366fd0d7d73627dc62ee2de01c0aa09365f2b Reviewed-on: https://go-review.googlesource.com/c/go/+/206903 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>