aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/internal/obj/x86/obj6.go
AgeCommit message (Collapse)Author
2023-01-26runtime: use explicit NOFRAME on darwin/amd64qmuntal
This CL marks some darwin assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Change-Id: I797f3909bcf7f7aad304e4ede820c884231e54f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/460235 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-24runtime: use explicit NOFRAME on windows/amd64qmuntal
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #57302 Updates #40044 Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61 Reviewed-on: https://go-review.googlesource.com/c/go/+/459395 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-23runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/i386qmuntal
This CL redesign how we get the TLS pointer on windows/i386. It applies the same changes as done in CL 431775 for windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x14(FS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0xE10(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0xE10 + index*4. The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(FS), AX Notice that this approach will now be implemented in all the supported windows arches. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Change-Id: If4550b0d44694ee6480d4093b851f4991a088b32 Reviewed-on: https://go-review.googlesource.com/c/go/+/454675 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-14runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/amd64qmuntal
This CL redesign how we get the TLS pointer on windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x28(GS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0x1480(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0x1480 + index*8 The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(GS), AX Notice that this approach is also implemented on windows/arm64. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Updates #22192 Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b Reviewed-on: https://go-review.googlesource.com/c/go/+/431775 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2021-11-05cmd/{asm,compile,internal/obj}: add "maymorestack" supportAustin Clements
This adds a debugging hook for optionally calling a "maymorestack" function in the prologue of any function that might call morestack (whether it does at run time or not). The maymorestack function will let us improve lock checking and add debugging modes that stress function preemption and stack growth. Passes toolstash-check -all (except on js/wasm, where toolstash appears to be broken) Fixes #48297. Change-Id: I27197947482b329af75dafb9971fc0d3a52eaf31 Reviewed-on: https://go-review.googlesource.com/c/go/+/359795 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-06-11[dev.typeparams] all: always enable regabig on AMD64Cherry Mui
Always enable regabig on AMD64, which enables the G register and the X15 zero register. Remove the fallback path. Also remove the regabig GOEXPERIMENT. On AMD64 it is always enabled (this CL). Other architectures already have a G register, except for 386, where there are too few registers and it is unlikely that we will reserve one. (If we really do, we can just add a new experiment). Change-Id: I229cac0060f48fe58c9fdaabd38d6fa16b8a0855 Reviewed-on: https://go-review.googlesource.com/c/go/+/327272 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org>
2021-04-16internal/buildcfg: move build configuration out of cmd/internal/objabiRuss Cox
The go/build package needs access to this configuration, so move it into a new package available to the standard library. Change-Id: I868a94148b52350c76116451f4ad9191246adcff Reviewed-on: https://go-review.googlesource.com/c/go/+/310731 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-15runtime,runtime/cgo: save all necessary registers on entry to Go on WindowsAustin Clements
There are several assembly functions that transition from the Windows ABI to the Go ABI. These all need to save all registers that are callee-save in the Windows ABI and caller-save in the Go ABI and prepare the register state for Go. However, they all do this slightly differently and most of them don't save the necessary XMM registers for this transition (which could corrupt them in the C caller). Furthermore, now that we have a carefully specified Go ABI, it's clear that none of these actually get all of the details 100% right. So, unify this code into two macros in a shared header in runtime/cgo/abi_amd64.h that handle all necessary registers and setup and use these macros everywhere on Windows that handles transitions from C to Go. Change-Id: I62f41345a507aad1ca383814ac8b7e2a9ffb821e Reviewed-on: https://go-review.googlesource.com/c/go/+/309769 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-05cmd/compile: untangle Wrapper and ABIWrapper flagsCherry Zhang
Currently, there are Wrapper and ABIWrapper attributes. Wrapper is set when compiler generates an wrapper function (e.g. method wrapper). ABIWrapper is set when compiler generates an ABI wrapper. It also sets Wrapper flag for ABI wrappers. Currently, they have the following meanings: - Wrapper flag hides the frame from (normal) traceback. - Wrapper flag enables special panic+recover adjustment, so it can correctly recover when a wrapper function is deferred. - ABIWrapper flag disables the panic+recover adjustment, because we never defer an ABI wrapper that can recover. This CL changes them to: - Both Wrapper and ABIWrapper flags hide the frame from (normal) traceback. (Setting one is enough.) - Wrapper flag enables special panic+recover adjustment. ABIWrapper flag no longer has effect on this. This makes it clearer if we do want an ABI wrapper that also does the panic+recover adjustment. In the old mechanism we'd have to unset ABIWrapper flag, even if the function is actually an ABI wrapper. In the new mechanism we just need to set both ABIWrapper and Wrapper flags. Updates #40724. Change-Id: I7fbc83f85d23676dc94db51dfda63dcacdf1fc19 Reviewed-on: https://go-review.googlesource.com/c/go/+/307235 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Austin Clements <austin@google.com>
2021-04-05cmd/internal/obj/x86: simplify huge frame prologueAustin Clements
For stack frames larger than StackBig, the stack split prologue needs to guard against potential wraparound. Currently, it carefully arranges to avoid underflow, but this is complicated and requires a special check for StackPreempt. StackPreempt is no longer the only stack poison value, so this check will incorrectly succeed if the stack bound is poisoned with any other value. This CL simplifies the logic of the check, reduces its length, and accounts for any possible poison value by directly checking for underflow. Change-Id: I917a313102d6a21895ef7c4b0f304fb84b292c81 Reviewed-on: https://go-review.googlesource.com/c/go/+/307010 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2021-04-02cmd/internal/obj: use REGENTRYTMP* in a few more placesAustin Clements
There are a few remaining places in obj6 where we hard-code safe-on-entry registers. Fix those to use the consts. For #40724. Change-Id: Ie640521aa67d6c99bc057553dc122160049c6edc Reviewed-on: https://go-review.googlesource.com/c/go/+/307009 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-31cmd/internal/obj/x86: use ABI scratch registers for WRAPPER prologueMichael Anthony Knyszek
Currently the prologue generated for WRAPPER assembly functions uses BX and DI, but these are argument registers in the register-based calling convention. Thus, these end up being clobbered when we want to have an ABIInternal assembly function. Define REGENTRYTMP0 and REGENTRYTMP1, aliases for the dedicated function entry scratch registers R12 and R13, and use those instead. For #40724. Change-Id: Ica78c4ccc67a757359900a66b56ef28c83d88b3e Reviewed-on: https://go-review.googlesource.com/c/go/+/303314 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-18all: explode GOEXPERIMENT=regabi into 5 sub-experimentsAustin Clements
This separates GOEXPERIMENT=regabi into five sub-experiments: regabiwrappers, regabig, regabireflect, regabidefer, and regabiargs. Setting GOEXPERIMENT=regabi now implies the working subset of these (currently, regabiwrappers, regabig, and regabireflect). This simplifies testing, helps derisk the register ABI project, and will also help with performance comparisons. This replaces the -abiwrap flag to the compiler and linker with the regabiwrappers experiment. As part of this, regabiargs now enables registers for all calls in the compiler. Previously, this was statically disabled in regabiEnabledForAllCompilation, but now that we can control it independently, this isn't necessary. For #40724. Change-Id: I5171e60cda6789031f2ef034cc2e7c5d62459122 Reviewed-on: https://go-review.googlesource.com/c/go/+/302070 Trust: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
2021-03-16cmd/asm: when dynamic linking, reject code that uses a clobbered R15Keith Randall
The assember uses R15 as scratch space when assembling global variable references in dynamically linked code. If the assembly code uses the clobbered value of R15, report an error. The user is probably expecting some other value in that register. Getting rid of the R15 use isn't very practical (we could save a register to a field in the G maybe, but that gets cumbersome). Fixes #43661 Change-Id: I43f848a3d8b8a28931ec733386b85e6e9a42d8ff Reviewed-on: https://go-review.googlesource.com/c/go/+/283474 Trust: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-03-04cmd/compile: register abi, morestack work and mole whackingDavid Chase
Morestack works for non-pointer register parameters Within a function body, pointer-typed parameters are correctly tracked. Results still not hooked up. For #40724. Change-Id: Icaee0b51d0da54af983662d945d939b756088746 Reviewed-on: https://go-review.googlesource.com/c/go/+/294410 Trust: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2021-02-19cmd/asm, cmd/link, runtime: introduce FuncInfo flag bitsRuss Cox
The runtime traceback code has its own definition of which functions mark the top frame of a stack, separate from the TOPFRAME bits that exist in the assembly and are passed along in DWARF information. It's error-prone and redundant to have two different sources of truth. This CL provides the actual TOPFRAME bits to the runtime, so that the runtime can use those bits instead of reinventing its own category. This CL also adds a new bit, SPWRITE, which marks functions that write directly to SP (anything but adding and subtracting constants). Such functions must stop a traceback, because the traceback has no way to rederive the SP on entry. Again, the runtime has its own definition which is mostly correct, but also missing some functions. During ordinary goroutine context switches, such functions do not appear on the stack, so the incompleteness in the runtime usually doesn't matter. But profiling signals can arrive at any moment, and the runtime may crash during traceback if it attempts to unwind an SP-writing frame and gets out-of-sync with the actual stack. The runtime contains code to try to detect likely candidates but again it is incomplete. Deriving the SPWRITE bit automatically from the actual assembly code provides the complete truth, and passing it to the runtime lets the runtime use it. This CL is part of a stack adding windows/arm64 support (#36439), intended to land in the Go 1.17 cycle. This CL is, however, not windows/arm64-specific. It is cleanup meant to make the port (and future ports) easier. Change-Id: I227f53b23ac5b3dabfcc5e8ee3f00df4e113cf58 Reviewed-on: https://go-review.googlesource.com/c/go/+/288800 Trust: Russ Cox <rsc@golang.org> Trust: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-08[dev.regabi] cmd/internal/obj/x86: use g register in stack bounds checkCherry Zhang
In ABIInternal context, we can directly use the g register for stack bounds check. Change-Id: I8b1073a3343984a6cd76cf5734ddc4a8cd5dc73f Reviewed-on: https://go-review.googlesource.com/c/go/+/289711 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2021-01-13[dev.regabi] cmd/compile: add code to support register ABI spills around ↵David Chase
morestack calls This is a selected copy from the register ABI experiment CL, focused on the files and data structures that handle spilling around morestack. Unnecessary code from the experiment was removed, other code was adapted. Would it make sense to leave comments in the experiment as pieces are brought over? Experiment CL (for comparison purposes) https://go-review.googlesource.com/c/go/+/28832 Change-Id: I92136f070351d4fcca1407b52ecf9b80898fed95 Reviewed-on: https://go-review.googlesource.com/c/go/+/279520 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>
2020-12-22[dev.regabi] cmd/compile,cmd/link: initial support for ABI wrappersThan McIntosh
Add compiler support for emitting ABI wrappers by creating real IR as opposed to introducing ABI aliases. At the moment these are "no-op" wrappers in the sense that they make a simple call (using the existing ABI) to their target. The assumption here is that once late call expansion can handle both ABI0 and the "new" ABIInternal (register version), it can expand the call to do the right thing. Note that the runtime contains functions that do not strictly follow the rules of the current Go ABI0; this has been handled in most cases by treating these as ABIInternal instead (these changes have been made in previous patches). Generation of ABI wrappers (as opposed to ABI aliases) is currently gated by GOEXPERIMENT=regabi -- wrapper generation is on by default if GOEXPERIMENT=regabi is set and off otherwise (but can be turned on using "-gcflags=all=-abiwrap -ldflags=-abiwrap"). Wrapper generation currently only workd on AMD64; explicitly enabling wrapper for other architectures (via the command line) is not supported. Also in this patch are a few other command line options for debugging (tracing and/or limiting wrapper creation). These will presumably go away at some point. Updates #27539, #40724. Change-Id: I1ee3226fc15a3c32ca2087b8ef8e41dbe6df4a75 Reviewed-on: https://go-review.googlesource.com/c/go/+/270863 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Trust: Than McIntosh <thanm@google.com>
2020-10-30reflect,runtime: use internal ABI for selected ASM routines, attempt 2Than McIntosh
[This is a roll-forward of CL 262319, with a fix for some Darwin test failures]. Change the definitions of selected runtime assembly routines from ABI0 (the default) to ABIInternal. The ABIInternal def is intended to indicate that these functions don't follow the existing Go runtime ABI. In addition, convert the assembly reference to runtime.main (from runtime.mainPC) to ABIInternal. Finally, for functions such as "runtime.duffzero" that are called directly from generated code, make sure that the compiler looks up the correct ABI version. This is intended to support the register abi work, however these changes should not have any issues even when GOEXPERIMENT=regabi is not in effect. Updates #27539, #40724. Change-Id: Idf507f1c06176073563845239e1a54dad51a9ea9 Reviewed-on: https://go-review.googlesource.com/c/go/+/266638 Trust: Than McIntosh <thanm@google.com> Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org>
2020-10-16cmd/internal/obj: move LSym.Func into LSym.ExtraRuss Cox
This creates space for a different kind of extension field in LSym without making the struct any larger. (There are many LSym, so we care about keeping the struct small.) Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/243943 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-31cmd/compile,cmd/asm: simplify recording of branch targets, take 2Keith Randall
We currently use two fields to store the targets of branches. Some phases use p.To.Val, some use p.Pcond. Rewrite so that every branch instruction uses p.To.Val. p.From.Val is also used in rare instances. Introduce a Pool link for use by arm/arm64, instead of repurposing Pcond. This is a cleanup CL in preparation for some stack frame CLs. Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405 Reviewed-on: https://go-review.googlesource.com/c/go/+/251437 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-28Revert "cmd/compile,cmd/asm: simplify recording of branch targets"Keith Randall
This reverts CL 243318. Reason for revert: Seems to be crashing some builders. Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756 Reviewed-on: https://go-review.googlesource.com/c/go/+/251169 Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-08-27cmd/compile,cmd/asm: simplify recording of branch targetsKeith Randall
We currently use two fields to store the targets of branches. Some phases use p.To.Val, some use p.Pcond. Rewrite so that every branch instruction uses p.To.Val. p.From.Val is also used in rare instances. Introduce a Pool link for use by arm/arm64, instead of repurposing Pcond. This is a cleanup CL in preparation for some stack frame CLs. Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/243318 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-08-27runtime: framepointers are no longer an experiment - hard code themKeith Randall
I think they are no longer experimental status. Might as well promote them to permanent. Change-Id: Id1259601b3dd2061dd60df86ee48080bfb575d2f Reviewed-on: https://go-review.googlesource.com/c/go/+/249857 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2020-06-04all: fix dead links to inferno-os bitbucket repositoryTobias Klauser
Generated using: perl -i -npe 's#inferno-os/src/default#inferno-os/src/master#' $(git grep -l "inferno-os/src/default" | grep -v vendor) Change-Id: I4b6443bd09a8ea4c8aaeb40a1c73520d1f7ca648 Reviewed-on: https://go-review.googlesource.com/c/go/+/235821 Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>
2019-11-27cmd/internal/obj: mark split-stack prologue nonpreemptibleCherry Zhang
When there are both a synchronous preemption request (by clobbering the stack guard) and an asynchronous one (by signal), the running goroutine may observe the synchronous request first in stack bounds check, and go to the path of calling morestack. If the preemption signal arrives at this point before the call to morestack, the goroutine will be asynchronously preempted, entering the scheduler. When it is resumed, the scheduler clears the preemption request, unclobbers the stack guard. But the resumed goroutine will still call morestack, as it is already on its way. morestack will, as there is no preemption request, double the stack unnecessarily. If this happens multiple times, the stack may grow too big, although only a small amount is actually used. To fix this, we mark the stack bounds check and the call to morestack async-nonpreemptible, starting after the memory instruction (mostly a load, on x86 CMP with memory). Not done for Wasm as it does not support async preemption. Fixes #35470. Change-Id: Ibd7f3d935a3649b80f47539116ec9b9556680cf2 Reviewed-on: https://go-review.googlesource.com/c/go/+/207350 Reviewed-by: David Chase <drchase@google.com>
2019-10-25cmd/internal/obj/x86: correct pcsp for ADJSPAustin Clements
The x86 assembler supports an "ADJSP" pseudo-op that compiles to an ADD/SUB from SP. Unfortunately, while this seems perfect for an instruction that would allow obj to continue to track the SP/FP delta, obj currently doesn't do that. As a result, FP-relative references won't work and, perhaps worse, the pcsp table will have the wrong frame size. We don't currently use this instruction in any assembly or generate it in the compiler, but this is a perfect instruction for solving a problem in #24543. This CL makes ADJSP useful by incorporating it into the SP delta logic. One subtlety is that we do generate ADJSP in obj itself to open a function's stack frame. Currently, when preprocess enters the loop to compute the SP delta, it may or may not start at this ADJSP instruction depending on various factors. We clean this up by instead always starting the SP delta at 0 and always starting this loop at the entry to the function. Why not just recognize ADD/SUB of SP? The danger is that could change the meaning of existing code. For example, walltime1 in sys_linux_amd64.s saves SP, SUBs from it, and aligns it. Later, it restores the saved copy and then does a few FP-relative references. Currently obj doesn't know any of this is happening, but that's fine once it gets to the FP-relative references. If we taught obj to recognize the SUB, it would start to miscompile this code. An alternative would be to recognize unknown instructions that write to SP and refuse subsequent FP-relative references, but that's kind of annoying. This passes toolstash -cmp for std on both amd64 and 386. Change-Id: Ic6c6a7cbf980bca904576676c07b44c0aaa9c82d Reviewed-on: https://go-review.googlesource.com/c/go/+/200877 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2019-10-24cmd/compile, cmd/link, runtime: make defers low-cost through inline code and ↵Dan Scales
extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>
2019-10-17cmd/asm: add missing x86 instructionsArtem Alekseev
Instructions added: CLDEMOTE, CLWB, TPAUSE, UMWAIT, UMONITOR. Change-Id: I1ba550d4d5acc41a2fd97068ff5834e0412d3bcf Reviewed-on: https://go-review.googlesource.com/c/go/+/183225 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-10-10all: remove nacl (part 3, more amd64p32)Brad Fitzpatrick
Part 1: CL 199499 (GOOS nacl) Part 2: CL 200077 (amd64p32 files, toolchain) Part 3: stuff that arguably should've been part of Part 2, but I forgot one of my grep patterns when splitting the original CL up into two parts. This one might also have interesting stuff to resurrect for any future x32 ABI support. Updates #30439 Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c Reviewed-on: https://go-review.googlesource.com/c/go/+/200318 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09all: remove the nacl port (part 2, amd64p32 + toolchain)Brad Fitzpatrick
This is part two if the nacl removal. Part 1 was CL 199499. This CL removes amd64p32 support, which might be useful in the future if we implement the x32 ABI. It also removes the nacl bits in the toolchain, and some remaining nacl bits. Updates #30439 Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1 Reviewed-on: https://go-review.googlesource.com/c/go/+/200077 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-04-30cmd/asm: reject BSWAPW on amd64Iskander Sharipov
Since BSWAP operation on 16-bit registers is undefined, forbid the usage of BSWAPW. Users should rely on XCHGB instead. This behavior is consistent with what GAS does. Fixes #29167 Change-Id: I3b31e3dd2acfd039f7564a1c17e6068617bcde8d Reviewed-on: https://go-review.googlesource.com/c/go/+/174312 Run-TryBot: Iskander Sharipov <quasilyte@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-04-03cmd/compile: handle new panicindex/slice names in optimizationsKeith Randall
These new calls should not prevent NOSPLIT promotion, like the old ones. These new calls should not prevent racefuncenter/exit removal. (The latter was already true, as the new calls are not yet lowered to StaticCalls at the point where racefuncenter/exit removal is done.) Add tests to make sure we don't regress (again). Fixes #31219 Change-Id: I3fb6b17cdd32c425829f1e2498defa813a5a9ace Reviewed-on: https://go-review.googlesource.com/c/go/+/170639 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2019-03-29cmd/link/ld,cmd/internal/obj,runtime: make the Android TLS offset dynamicElias Naur
We're going to need a different TLS offset for Android Q, so the static offsets used for 386 and amd64 are no longer viable on Android. Introduce runtime·tls_g and use that for indexing into TLS storage. As an added benefit, we can then merge the TLS setup code for all android GOARCHs. While we're at it, remove a bunch of android special cases no longer needed. Updates #29674 Updates #29249 (perhaps fixes it) Change-Id: I77c7385aec7de8f1f6a4da7c9c79999157e39572 Reviewed-on: https://go-review.googlesource.com/c/go/+/169817 TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-02-15cmd/compile: implement shifts by signed amountsKeith Randall
Allow shifts by signed amounts. Panic if the shift amount is negative. TODO: We end up doing two compares per shift, see Ian's comment https://github.com/golang/go/issues/19113#issuecomment-443241799 that we could do it with a single comparison in the normal case. The prove pass mostly handles this code well. For instance, it removes the <0 check for cases like this: if s >= 0 { _ = x << s } _ = x << len(a) This case isn't handled well yet: _ = x << (y & 0xf) I'll do followon CLs for unhandled cases as needed. Update #19113 R=go1.13 Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1 Reviewed-on: https://go-review.googlesource.com/c/158719 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-01-09cmd/dist, cmd/link, runtime: fix stack size when cross-compiling aix/ppc64Clément Chigot
This commit allows to cross-compiling aix/ppc64. The nosplit limit must twice as large as on others platforms because of AIX syscalls. The stack limit, especially stackGuardMultiplier, was set by cmd/dist during the bootstrap and doesn't depend on GOOS/GOARCH target. Fixes #29572 Change-Id: Id51e38885e1978d981aa9e14972eaec17294322e Reviewed-on: https://go-review.googlesource.com/c/157117 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-05-22cmd/internal/obj: consolidate emitting entry stack mapAustin Clements
The obj package needs to emit the PCDATA to select the entry stack map before calling morestack. Currently this is copied for every architecture. Since we're about to change how this works, consolidate all of these copies into a single helper function. For #24543. Change-Id: Ia92d94de78f8e23fd06dba747c43e03e5989f67b Reviewed-on: https://go-review.googlesource.com/109346 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-05-14cmd/compile: plumb prologueEnd into DWARFDavid Chase
This marks the first instruction after the prologue for consumption by debuggers, specifically Delve, who asked for it. gdb appears to ignore it, lldb appears to use it. The bits for end-of-prologue and beginning-of-epilogue are added to Pos (reducing maximum line number by 4x, to 1048575). They're added in cmd/internal/obj/<ARCH>.go (currently x86 only), so the compiler-proper need not deal with them. The linker currently does nothing with beginning-of-epilogue, but the plumbing exists to make it easier in the future. This also upgrades the line number table to DWARF version 3. This CL includes a regression in the coverage for testdata/i22558.gdb-dbg.nexts, this appears to be a gdb artifact but the fix would be in the preceding CL in the stack. Change-Id: I3bda5f46a0ed232d137ad48f65a14835c742c506 Reviewed-on: https://go-review.googlesource.com/110416 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2018-04-02cmd: remove some unused parametersDaniel Martí
Change-Id: I9d2a4b8df324897e264d30801e95ddc0f0e75f3a Reviewed-on: https://go-review.googlesource.com/102337 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
2018-03-21all: enable c-shared/c-archive support for freebsd/amd64Tim Wright
Fixes #14327 Much of the code is based on the linux/amd64 code that implements these build modes, and code is shared where possible. Change-Id: Ia510f2023768c0edbc863aebc585929ec593b332 Reviewed-on: https://go-review.googlesource.com/93875 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-02-27cmd/internal/obj/x86: add missing legacy instsisharipo
Minimizes the amount of "TODO" stuff in test suite of cmd/asm/internal/asm/testdata/amd64enc.s. Some instructions were already implemented, but test cases for them were commented-out. Does not enable MMX instructions, calls/jumps and some segment registers instructions. -- Affected instructions -- BLENDVPD, BLENDVPS BSWAPW CBW CDQE CLAC CLFLUSHOPT CMPXCHG16B CRC32B, CRC32L, CRC32W CWDE FBLD FBSTP FCMOVB FCMOVBE FCMOVE FCMOVNB FCMOVNBE FCMOVU FCOMI FCOMIP IMUL3L, IMUL3Q, IMUL3W ICEBP, INT INVPCID LARQ LGDT, LIDT, LLDT LMSW LTR LZCNTL, LZCNTQ, LZCNTW MONITOR MOVBELL, MOVBEQQ, MOVBEWW MOVBQZX MOVQ MOVSWW, MOVZWW MWAIT NOPL, NOPW PBLENDVB PEXTRW RDPKRU RDRANDL, RDRANDQ, RDRANDW RDSEEDL, RDSEEDQ, RDSEEDW RDTSCP SAHF SGDT, SIDT SLDTL, SLDTQ, SLDTW SMSWL, SMSWQ, SMSWW STAC STRL, STRQ, STRW SYSENTER, SYSENTER64 SYSEXIT, SYSEXIT64 SHA256RNDS2 TZCNTL, TZCNTQ, TZCNTW UD1, UD2 WRPKRU XRSTOR, XRSTOR64 XRSTORS, XRSTORS64 XSAVE, XSAVE64 XSAVEC, XSAVEC64 XSAVEOPT, XSAVEOPT64 XSAVES, XSAVES64 XSETBV Fixes #6739 Change-Id: I8b125d9a5ea39bb4b9da7e66a63a16f609cef376 Reviewed-on: https://go-review.googlesource.com/97235 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-02-12cmd/internal/obj/x86: adjust SP correctly for tail callsAustin Clements
Currently, tail calls on x86 don't adjust the SP on return, so it's important that the compiler produce a zero-sized frame and disable the frame pointer. However, these constraints aren't necessary. For example, on other architectures it's generally necessary to restore the saved LR before a tail call, so obj simply makes this work. Likewise, on x86, there's no reason we can't simply make this work. Hence, this CL adjusts the compiler to use the same tail call convention for x86 that we use on LR machines by producing a RET with a target, rather than a JMP with a target. In fact, obj already understands this convention for x86 except that it's buggy with non-zero frame sizes. So we also fix this bug obj. As a result of these fixes, the compiler no longer needs to mark wrappers as NoFramePointer since it's now perfectly fine to save the frame pointer. In fact, this eliminates the only use of NoFramePointer in the compiler, which will enable further cleanups. This also fixes what is very nearly, but not quite, a code generation bug. NoFramePointer becomes obj.NOFRAME in the object file, which on ppc64 and s390x means to omit the saved LR. Hence, on these architectures, NoFramePointer (and NOFRAME) is only safe to set on leaf functions. However, on *most* architectures, wrappers aren't necessarily leaf functions because they may call DUFFZERO. We're saved on ppc64 and s390x only because the compiler doesn't have the rules to produce DUFFZERO calls on these architectures. Hence, this only works because the set of LR architectures that implement NOFRAME is disjoint from the set where the compiler produces DUFFZERO operations. (I discovered this whole mess when I attempted to add NOFRAME support to arm.) Change-Id: Icc589aeb86beacb850d0a6a80bd3024974a33947 Reviewed-on: https://go-review.googlesource.com/92035 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-16cmd/compile: fix buglet/typo in DWARF x86 setupThan McIntosh
Fix typo in DWARF register config for GOOARCH=x86; was picking up the AMD64 set, should have been selecting x86 set. Change-Id: I9a4c6f1378baf3cb2f0ad8d60f3ee2f24cd5dc91 Reviewed-on: https://go-review.googlesource.com/77990 Run-TryBot: Than McIntosh <thanm@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2017-09-15cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr)isharipo
This change makes it easier to express instructions with arbitrary number of operands. Rationale: previous approach with operand "hiding" does not scale well, AVX and especially AVX512 have many instructions with 3+ operands. x86 asm backend is updated to handle up to 6 explicit operands. It also fixes issue with 4-th immediate operand type checks. All `ytab` tables are updated accordingly. Changes to non-x86 backends only include these patterns: `p.From3 = X` => `p.SetFrom3(X)` `p.From3.X = Y` => `p.GetFrom3().X = Y` Over time, other backends can adapt Prog.RestArgs and reduce the amount of workarounds. -- Performance -- x/benchmark/build: $ benchstat upstream.bench patched.bench name old time/op new time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old binary-size new binary-size delta Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal) name old build-time/op new build-time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old build-peak-RSS-bytes new build-peak-RSS-bytes delta Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10) name old build-user+sys-time/op new build-user+sys-time/op delta Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10) Microbenchmark shows a slight slowdown. name old time/op new time/op delta AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15) func BenchmarkAMD64asm(b *testing.B) { for i := 0; i < b.N; i++ { TestAMD64EndToEnd(nil) TestAMD64Encoder(nil) } } Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183 Reviewed-on: https://go-review.googlesource.com/63490 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-09-06cmd/asm: add amd64 CLFLUSH instructionisharipo
This is the last instruction I found missing in SSE2 set. It does not reuse 'yprefetch' ytabs due to differences in operands SRC/DST roles: - PREFETCHx: ModRM:r/m(r) -> FROM - CLFLUSH: ModRM:r/m(w) -> TO unaryDst map is extended accordingly. Change-Id: I89e34ebb81cc0ee5f9ebbb1301bad417f7ee437f Reviewed-on: https://go-review.googlesource.com/56833 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Russ Cox <rsc@golang.org>
2017-08-26all: remove some double spaces from commentsDaniel Martí
Went mainly for the ones that make no sense, such as the ones mid-sentence or after commas. Change-Id: Ie245d2c19cc7428a06295635cf6a9482ade25ff0 Reviewed-on: https://go-review.googlesource.com/57293 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-07-27[dev.debug] cmd/compile: better DWARF with optimizations onHeschi Kreinick
Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-05-01cmd/internal/obj/x86: use LEAx rather than ADDx when calling DUFFxxxx via GOTMichael Hudson-Doyle
DUFFZERO on 386 is not marked as clobbering flags, but rewriteToUseGot rewrote "ADUFFZERO $offset" to "MOVL runtime.duffxxx@GOT, CX; ADDL $offset, CX; CALL CX" which does. Luckily the fix is easier than figuring out what the problem was: replace the ADDL $offset, CX with LEAL $offset(CX), CX. On amd64 DUFFZERO clobbers flags, on arm, arm64 and ppc64 ADD does not clobber flags and s390x does not use the duff functions, so I'm fairly confident this is the only fix required. I don't know how to write a test though. Change-Id: I69b0958f5f45771d61db5f5ecb4ded94e8960d4d Reviewed-on: https://go-review.googlesource.com/41821 Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-04-20cmd/internal/obj: eliminate LSym.VersionJosh Bleecher Snyder
There were only two versions, 0 and 1, and the only user of version 1 was the assembler, to indicate that a symbol was static. Rename LSym.Version to Static, and add it to LSym.Attributes. Simplify call-sites. Passes toolstash-check. Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f Reviewed-on: https://go-review.googlesource.com/41201 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>