| Age | Commit message (Collapse) | Author |
|
The assembler isn't handling this correctly for most architectures.
Of course, the two I tried first, arm64 and amd64, worked, so I assumed
other archs could handle it also. Apparently not.
Should fix dashboard failures introduced by CL 751465.
Change-Id: I9fc4f123d11acf3d10cc9806abfb93ec077509a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/752560
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
s390x
This CL spill arg registers before calling morestack, unspill them
after morestack call. It also avoid clobbering the register that
could contain incoming argument values. Change registers on s390x to
avoid regABI arguments.
Update #40724
Change-Id: I67b20552410dd23ef0b86f14b9c5bfed9f9723a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/719421
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
The previous CL made this adjustment unnecessary. The argp field
is no longer used by the runtime.
Change-Id: I3491eeef4103c6653ec345d604c0acd290af9e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/685356
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
This reverts commits
3f3782feed6e0726ddb08afd32dad7d94fbb38c6 (CL 648518)
b386b628521780c048af14a148f373c84e687b26 (CL 668475)
Fixes #73542
Change-Id: I218851c5c0b62700281feb0b3f82b6b9b97b910d
Reviewed-on: https://go-review.googlesource.com/c/go/+/670055
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We currently make some parts of the preamble unpreemptible because
it confuses morestack. See comments in the code.
Instead, have morestack handle those weird cases so we can
remove unpreemptible marks from most places.
This CL makes user functions preemptible everywhere if they have no
write barriers (at least, on x86). In cmd/go the fraction of functions
that need preemptible markings drops from 82% to 36%. Makes the cmd/go
binary 0.3% smaller.
Update #35470
Change-Id: Ic83d5eabfd0f6d239a92e65684bcce7e67ff30bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/648518
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For #59670.
Change-Id: I91448363be2fc678964ce119d85cd5fae34a14da
Reviewed-on: https://go-review.googlesource.com/c/go/+/486975
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
For #59670.
Change-Id: Ie784ba4dd2701e4f455e1abde4a6bfebee4b1387
Reviewed-on: https://go-review.googlesource.com/c/go/+/485496
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
internal/abi"
This reverts commit CL 486379.
Submitted out of order and breaks bootstrap.
Change-Id: Ie20a61cc56efc79a365841293ca4e7352b02d86b
Reviewed-on: https://go-review.googlesource.com/c/go/+/486917
TryBot-Bypass: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
For #59670.
Change-Id: I04a17079b351b9b4999ca252825373c17afb8a88
Reviewed-on: https://go-review.googlesource.com/c/go/+/486379
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Following CL 412474, for the rest of the LR architectures. On
MIPS(32/64), S390X, and RISCV, there is no single instruction that
saves the LR and decrements the SP, so we need to insert an
instruction to save the LR after decrementing the SP.
On ARM(32) and PPC64 we already use a single instruction to save
the LR and decrement the SP.
Updates #53374.
Change-Id: I5a2e211026d95edb0e0f7d084ddb784f8077b86d
Reviewed-on: https://go-review.googlesource.com/c/go/+/413428
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
|
|
And delete now-unused FixedFrameSize methods.
Change-Id: Id257e1647dbeb4eb4ab866c53744010c4efeb953
Reviewed-on: https://go-review.googlesource.com/c/go/+/400819
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
S390X
A "RET f(SB)" wasn't assembled correctly in a leaf function with
non-zero frame size. Follows CL 371034, for MIPS(32/64)(be/le)
and S390X. Other architectures seem to do it right. Add a test.
Change-Id: I41349a7ae9862b924f3a3de2bcb55b782061ce21
Reviewed-on: https://go-review.googlesource.com/c/go/+/371214
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds a debugging hook for optionally calling a "maymorestack"
function in the prologue of any function that might call morestack
(whether it does at run time or not). The maymorestack function will
let us improve lock checking and add debugging modes that stress
function preemption and stack growth.
Passes toolstash-check -all (except on js/wasm, where toolstash
appears to be broken)
Fixes #48297.
Change-Id: I27197947482b329af75dafb9971fc0d3a52eaf31
Reviewed-on: https://go-review.googlesource.com/c/go/+/359795
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
CL 307010 for s390x.
Change-Id: I43e1f93dd01c814417f8ef7480aa82c05b2b6b66
Reviewed-on: https://go-review.googlesource.com/c/go/+/307151
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The runtime traceback code has its own definition of which functions
mark the top frame of a stack, separate from the TOPFRAME bits that
exist in the assembly and are passed along in DWARF information.
It's error-prone and redundant to have two different sources of truth.
This CL provides the actual TOPFRAME bits to the runtime, so that
the runtime can use those bits instead of reinventing its own category.
This CL also adds a new bit, SPWRITE, which marks functions that
write directly to SP (anything but adding and subtracting constants).
Such functions must stop a traceback, because the traceback has no
way to rederive the SP on entry. Again, the runtime has its own definition
which is mostly correct, but also missing some functions. During ordinary
goroutine context switches, such functions do not appear on the stack,
so the incompleteness in the runtime usually doesn't matter.
But profiling signals can arrive at any moment, and the runtime may
crash during traceback if it attempts to unwind an SP-writing frame
and gets out-of-sync with the actual stack. The runtime contains code
to try to detect likely candidates but again it is incomplete.
Deriving the SPWRITE bit automatically from the actual assembly code
provides the complete truth, and passing it to the runtime lets the
runtime use it.
This CL is part of a stack adding windows/arm64
support (#36439), intended to land in the Go 1.17 cycle.
This CL is, however, not windows/arm64-specific.
It is cleanup meant to make the port (and future ports) easier.
Change-Id: I227f53b23ac5b3dabfcc5e8ee3f00df4e113cf58
Reviewed-on: https://go-review.googlesource.com/c/go/+/288800
Trust: Russ Cox <rsc@golang.org>
Trust: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
When a function with non-zero frame size makes a return jump
(RET target), it assembles to, conceptually,
MOV (SP), LR
ADD $framesize, SP
JMP target
We did not clear some fields in the first instruction's Prog.To,
causing it printed like (on ARM)
MOVW.P 4(R13), (R14)(R14)(REG)
Clear the fields to make it print nicer.
Change-Id: I180901aeea41f1ff287d7c6034a6d69005927744
Reviewed-on: https://go-review.googlesource.com/c/go/+/264343
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
|
|
This creates space for a different kind of extension field
in LSym without making the struct any larger.
(There are many LSym, so we care about keeping the struct small.)
Change-Id: Ib16edb9e15f54c2a7351c8b875e19684058711e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/243943
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.
This is a cleanup CL in preparation for some stack frame CLs.
Change-Id: If8239177e4b1ea2bccd0608eb39553d23210d405
Reviewed-on: https://go-review.googlesource.com/c/go/+/251437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This reverts CL 243318.
Reason for revert: Seems to be crashing some builders.
Change-Id: I2ffc59bc5535be60b884b281c8d0eff4647dc756
Reviewed-on: https://go-review.googlesource.com/c/go/+/251169
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
We currently use two fields to store the targets of branches.
Some phases use p.To.Val, some use p.Pcond. Rewrite so that
every branch instruction uses p.To.Val.
p.From.Val is also used in rare instances.
Introduce a Pool link for use by arm/arm64, instead of
repurposing Pcond.
This is a cleanup CL in preparation for some stack frame CLs.
Change-Id: I9055bf0a1d986aff421e47951a1dedc301c846f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/243318
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The optimization that replaces inline markers with pre-existing
instructions assumes that 'Prog' values produced by the compiler are
still reachable after the assembler has run. This was not true on
s390x where the assembler was removing NOP instructions from the
linked list of 'Prog' values. This led to broken inlining data
which in turn caused an infinite loop in the runtime traceback code.
Fix this by stopping the s390x assembler backend removing NOP
values. It does not make any difference to the output of the
assembler because NOP instructions are 0 bytes long anyway.
Fixes #40473.
Change-Id: I9b97c494afaae2d5ed6bca4cd428b4132b5f8133
Reviewed-on: https://go-review.googlesource.com/c/go/+/249448
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts CL 247697.
Reason for revert: This change broke the linux-arm builder.
Change-Id: I8ca0d5b3b2ea0109ffbfadeab1406a1b60e7d18d
Reviewed-on: https://go-review.googlesource.com/c/go/+/248718
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The optimization that replaces inline markers with pre-existing
instructions assumes that 'Prog' values produced by the compiler are
still reachable after the assembler has run. This was not true on
s390x where the assembler was removing NOP instructions from the
linked list of 'Prog' values. This led to broken inlining data
which in turn caused an infinite loop in the runtime traceback code.
Fix this by stopping the s390x assembler backend removing NOP
values. It does not make any difference to the output of the
assembler because NOP instructions are 0 bytes long anyway.
Fixes #40473.
Change-Id: Ib4fabadd1de8adb80421f75950ee9aad2111147a
Reviewed-on: https://go-review.googlesource.com/c/go/+/247697
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
When there are both a synchronous preemption request (by
clobbering the stack guard) and an asynchronous one (by signal),
the running goroutine may observe the synchronous request first
in stack bounds check, and go to the path of calling morestack.
If the preemption signal arrives at this point before the call to
morestack, the goroutine will be asynchronously preempted,
entering the scheduler. When it is resumed, the scheduler clears
the preemption request, unclobbers the stack guard. But the
resumed goroutine will still call morestack, as it is already on
its way. morestack will, as there is no preemption request,
double the stack unnecessarily. If this happens multiple times,
the stack may grow too big, although only a small amount is
actually used.
To fix this, we mark the stack bounds check and the call to
morestack async-nonpreemptible, starting after the memory
instruction (mostly a load, on x86 CMP with memory).
Not done for Wasm as it does not support async preemption.
Fixes #35470.
Change-Id: Ibd7f3d935a3649b80f47539116ec9b9556680cf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/207350
Reviewed-by: David Chase <drchase@google.com>
|
|
For async preemption, we will be using REGTMP as a temporary
register in injected call on S390X, which will clobber it. So any
code that uses REGTMP is not safe for async preemption.
In the assembler backend, we expand a Prog to multiple machine
instructions and use REGTMP as a temporary register if necessary.
These need to be marked unsafe. Unlike ARM64 and MIPS,
instructions on S390X are variable length so we don't use the
length as a condition. Instead, we set a bit on the Prog whenever
REGTMP is used.
Change-Id: Ie5d14068a950f4c7cea51dff2c4a8bdc19ec9348
Reviewed-on: https://go-review.googlesource.com/c/go/+/204105
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
The branch-relative-on-condition (BRC) instruction allows us to use
an immediate to specify under what conditions the branch is taken.
For example, `BRC $7, L1` is equivalent to `BNE L1`. It is sometimes
useful to specify branches in this way when either we don't have
an extended mnemonic for a particular mask value or we want to
generate the condition code mask programmatically.
The new load-on-condition (LOCR and LOCGR) and compare-and-branch
(CRJ, CGRJ, CLRJ, CLGRJ, CIJ, CGIJ, CLIJ and CLGIJ) instructions
provide the same flexibility for conditional loads and combined
compare and branch instructions.
Change-Id: Ic6f5d399b0157e278b39bd3645f4ee0f4df8e5fc
Reviewed-on: https://go-review.googlesource.com/c/go/+/196558
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This commit allows to cross-compiling aix/ppc64. The nosplit limit must
twice as large as on others platforms because of AIX syscalls.
The stack limit, especially stackGuardMultiplier, was set by cmd/dist
during the bootstrap and doesn't depend on GOOS/GOARCH target.
Fixes #29572
Change-Id: Id51e38885e1978d981aa9e14972eaec17294322e
Reviewed-on: https://go-review.googlesource.com/c/157117
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The obj package needs to emit the PCDATA to select the entry stack map
before calling morestack. Currently this is copied for every
architecture. Since we're about to change how this works, consolidate
all of these copies into a single helper function.
For #24543.
Change-Id: Ia92d94de78f8e23fd06dba747c43e03e5989f67b
Reviewed-on: https://go-review.googlesource.com/109346
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Add a compiler intrinsic for getcallerpc on following architectures:
arm
mips mipsle mips64 mips64le
ppc64 ppc64le
s390x
Change-Id: I758f3d4742fc214b206bcd07d90408622c17dbef
Reviewed-on: https://go-review.googlesource.com/110835
Run-TryBot: Wei Xiao <Wei.Xiao@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Before DWARF location lists can be turned on, 3 bugs need
fixing.
This CL addresses two -- lack of register definitions for
various architectures, and bugs on 32-bit platforms.
The third bug comes later.
Passes
GO_GCFLAGS=-dwarflocationlists ./run.bash -no-rebuild
(-no-rebuild because the map dependence causes trouble)
Change-Id: I4223b48ade84763e4b048e4aeb81149f082c7bc7
Reviewed-on: https://go-review.googlesource.com/99255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
When rewriting loads and stores accessing global variables to use the
GOT we were making use of REGTMP (R10). Unfortunately loads and stores
with large offsets (larger than 20-bits) were also using REGTMP,
causing it to be clobbered and subsequently a segmentation fault.
This can be fixed by using REGTMP2 (R11) for the rewrite. This is fine
because REGTMP2 only has a couple of uses in the assembler (division,
high multiplication and storage-to-storage instructions). We didn't
use REGTMP2 originally because it used to be used more frequently,
in particular for stores of constants to memory. However we have now
eliminated those uses.
This was found while writing a test case for CL 63030. That test case
is included in this CL.
Change-Id: I13956f1f3ca258a7c8a7ff0a7570d2848adf7f68
Reviewed-on: https://go-review.googlesource.com/65011
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Rematerializable ops can be inserted after the flagalloc phase,
they must therefore not clobber flags. This CL adds a check to
ensure this doesn't happen and fixes the instances where it
does currently.
amd64: ADDQconst and ADDLconst were recently changed to be
rematerializable in CL 54393 (only in tip, not 1.9). That change
has been reverted.
s390x: MOVDaddr could clobber flags when using dynamic linking due
to a ADD with immediate instruction. Change the code generation to
use LA/LAY instead.
Fixes #21080.
Change-Id: Ia85c882afa2a820a309e93775354b3169ec6d034
Reviewed-on: https://go-review.googlesource.com/63030
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
|
|
This change makes it easier to express instructions
with arbitrary number of operands.
Rationale: previous approach with operand "hiding" does
not scale well, AVX and especially AVX512 have many
instructions with 3+ operands.
x86 asm backend is updated to handle up to 6 explicit operands.
It also fixes issue with 4-th immediate operand type checks.
All `ytab` tables are updated accordingly.
Changes to non-x86 backends only include these patterns:
`p.From3 = X` => `p.SetFrom3(X)`
`p.From3.X = Y` => `p.GetFrom3().X = Y`
Over time, other backends can adapt Prog.RestArgs
and reduce the amount of workarounds.
-- Performance --
x/benchmark/build:
$ benchstat upstream.bench patched.bench
name old time/op new time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old binary-size new binary-size delta
Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal)
name old build-time/op new build-time/op delta
Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10)
name old build-peak-RSS-bytes new build-peak-RSS-bytes delta
Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10)
name old build-user+sys-time/op new build-user+sys-time/op delta
Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10)
Microbenchmark shows a slight slowdown.
name old time/op new time/op delta
AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15)
func BenchmarkAMD64asm(b *testing.B) {
for i := 0; i < b.N; i++ {
TestAMD64EndToEnd(nil)
TestAMD64Encoder(nil)
}
}
Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183
Reviewed-on: https://go-review.googlesource.com/63490
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
There were only two versions, 0 and 1,
and the only user of version 1 was the assembler,
to indicate that a symbol was static.
Rename LSym.Version to Static,
and add it to LSym.Attributes.
Simplify call-sites.
Passes toolstash-check.
Change-Id: Iabd39918f5019cce78f381d13f0481ae09f3871f
Reviewed-on: https://go-review.googlesource.com/41201
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.
There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.
objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.
Fixes #15165.
Fixes #20026.
Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Automated refactoring using github.com/mdempsky/unbed (to rewrite
s.Foo to s.FuncInfo.Foo) and then gorename (to rename the FuncInfo
field to just Func).
Passes toolstash-check -all.
Change-Id: I802c07a1239e0efea058a91a87c5efe12170083a
Reviewed-on: https://go-review.googlesource.com/40670
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Prior to this CL, flags such as NOSPLIT
on ATEXT Progs were stored in From3.Offset.
Some but not all of those flags were also
duplicated into From.Sym.Attribute.
This CL migrates all of those flags into
From.Sym.Attribute and stops creating a From3.
A side-effect of this is that printing an
ATEXT Prog can no longer simply dump From3.Offset.
That's kind of good, since the raw flag value
wasn't very informative anyway, but it did
necessitate a bunch of updates to the cmd/asm tests.
The reason I'm doing this work now is that
avoiding storing flags in both From.Sym and From3.Offset
simplifies some other changes to fix the data
race first described in CL 40254.
This CL almost passes toolstash-check -all.
The only changes are in cases where the assembler
has decided that a function's flags may be altered,
e.g. to make a function with no calls in it NOSPLIT.
Prior to this CL, that information was not printed.
Sample before:
"".Ctz64 t=1 size=63 args=0x10 locals=0x0
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) TEXT "".Ctz64(SB), $0-16
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
Sample after:
"".Ctz64 t=1 nosplit size=63 args=0x10 locals=0x0
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) TEXT "".Ctz64(SB), NOSPLIT, $0-16
0x0000 00000 (/Users/josh/go/tip/src/runtime/internal/sys/intrinsics.go:35) FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB)
Observe the additional "nosplit" in the first line
and the additional "NOSPLIT" in the second line.
Updates #15756
Change-Id: I5c59bd8f3bdc7c780361f801d94a261f0aef3d13
Reviewed-on: https://go-review.googlesource.com/40495
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
CL 39922 made the arm assembler concurrency-safe.
This CL does the same, but for s390x.
The approach is similar: introduce ctxtz to hold
function-local state and thread it through
the assembler as necessary.
One race remains after this CL, similar to CL 40252.
That race is conceptually unrelated to this refactoring,
and will be addressed in a separate CL.
Passes toolstash-check -all.
Updates #15756
Change-Id: Iabf17aa242b70c0b078c2e85dae3d93a5e512372
Reviewed-on: https://go-review.googlesource.com/40371
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
|
|
Updates #19865
Change-Id: I24fbf5d79b5e4cac09c14cfff678a8215397b670
Reviewed-on: https://go-review.googlesource.com/39914
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
CL 38662 changed the x86 assembler to be eagerly
initialized, for a concurrent backend.
This CL puts in place a proper mechanism for doing so,
and switches all architectures to use it.
Passes toolstash-check -all.
Updates #15756
Change-Id: Id2aa527d3a8259c95797d63a2f0d1123e3ca2a1c
Reviewed-on: https://go-review.googlesource.com/39917
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This is a straightforward refactoring,
to reduce the scope of upcoming changes.
The symbol size and AttrLocal=true was not
set universally, but it appears not to matter,
since toolstash -cmp is happy.
Passes toolstash-check -all.
Change-Id: I7f8392f939592d3a1bc6f61dec992f5661f42fca
Reviewed-on: https://go-review.googlesource.com/39791
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
It was simply a wrapper around Link.Lookup.
Unwrap everything.
CL prepared using eg with template:
package p
import "cmd/internal/obj"
func before(ctxt *obj.Link, name string, version int) *obj.LSym {
return obj.Linklookup(ctxt, name, version)
}
func after(ctxt *obj.Link, name string, version int) *obj.LSym {
return ctxt.Lookup(name, version)
}
Then one comment in cmd/asm/internal/asm/parse.go
was manually updated (and gofmt'ed!),
and func Linklookup deleted.
Passes toolstash-check (as a sanity measure).
Change-Id: Icc4d56b0b2b5c8888d3184c1898c48359ea1e638
Reviewed-on: https://go-review.googlesource.com/39715
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The existing bulk Prog allocator is not concurrency-safe.
To allow for concurrency-safe bulk allocation of Progs,
I want to move Prog allocation and caching upstream,
to the clients of cmd/internal/obj.
This is a preliminary enabling refactoring.
After this CL, instead of calling Ctxt.NewProg
throughout the assemblers, we thread through
a newprog function that returns a new Prog.
That function is set up to be Ctxt.NewProg,
so there are no real changes in this CL;
this CL only establishes the plumbing.
Passes toolstash-check -all.
Negligible compiler performance impact.
Updates #15756
name old time/op new time/op delta
Template 213ms ± 3% 214ms ± 4% ~ (p=0.574 n=49+47)
Unicode 90.1ms ± 5% 89.9ms ± 4% ~ (p=0.417 n=50+49)
GoTypes 585ms ± 4% 584ms ± 3% ~ (p=0.466 n=49+49)
SSA 6.50s ± 3% 6.52s ± 2% ~ (p=0.251 n=49+49)
Flate 128ms ± 4% 128ms ± 4% ~ (p=0.673 n=49+50)
GoParser 152ms ± 3% 152ms ± 3% ~ (p=0.810 n=48+49)
Reflect 372ms ± 4% 372ms ± 5% ~ (p=0.778 n=49+50)
Tar 113ms ± 5% 111ms ± 4% -0.98% (p=0.016 n=50+49)
XML 208ms ± 3% 208ms ± 2% ~ (p=0.483 n=47+49)
[Geo mean] 285ms 285ms -0.17%
name old user-ns/op new user-ns/op delta
Template 253M ± 8% 254M ± 9% ~ (p=0.899 n=50+50)
Unicode 106M ± 9% 106M ±11% ~ (p=0.642 n=50+50)
GoTypes 736M ± 4% 740M ± 4% ~ (p=0.121 n=50+49)
SSA 8.82G ± 3% 8.88G ± 2% +0.65% (p=0.006 n=49+48)
Flate 147M ± 4% 147M ± 5% ~ (p=0.844 n=47+48)
GoParser 179M ± 4% 178M ± 6% ~ (p=0.785 n=50+50)
Reflect 443M ± 6% 441M ± 5% ~ (p=0.850 n=48+47)
Tar 126M ± 5% 126M ± 5% ~ (p=0.734 n=50+50)
XML 244M ± 5% 244M ± 5% ~ (p=0.594 n=49+50)
[Geo mean] 341M 341M +0.11%
Change-Id: Ice962f61eb3a524c2db00a166cb582c22caa7d68
Reviewed-on: https://go-review.googlesource.com/39633
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
There is always 128 bytes available below the stackguard. Allow functions
with medium-sized stack frames to use this, potentially allowing them to
avoid growing the stack.
This change makes all architectures use the same calculation as x86.
Change-Id: I2afb1a7c686ae5a933e50903b31ea4106e4cd0a0
Reviewed-on: https://go-review.googlesource.com/38734
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Follow-up to CL 38446.
Passes toolstash-check -all.
Change-Id: I04cadc058cbaa5f396136502c574e5a395a33276
Reviewed-on: https://go-review.googlesource.com/38669
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I9ac274dbfe887675a7820d2f8f87b5887b1c9b0e
Reviewed-on: https://go-review.googlesource.com/38383
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This CL deletes some unnecessary code in objz.go that existed to
support instruction scheduling. It's likely instruction scheduling
will never be done in this part of the backend so this code can
just be deleted.
This file can probably be cleaned up a bit more, but I think this
is a good start.
Passes: go build -toolexec 'toolstash -cmp' -a std.
Change-Id: I1645632ac551a90a4f4be418045c046b488e9469
Reviewed-on: https://go-review.googlesource.com/38394
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The s390x port was based on the ppc64 port and, because of the way the
port was done, inherited some instructions from it. ppc64 supports
3-operand (4-operand for FMADD etc.) floating point instructions
but s390x doesn't (the destination register is always an input) and
so these were emulated.
There is a bug in the emulation of FMADD whereby if the destination
register is also a source for the multiplication it will be
clobbered. This doesn't break any assembly code in the std lib but
could affect future work.
To fix this I have gone through the floating point instructions and
removed all unnecessary 3-/4-operand emulation. The compiler doesn't
need it and assembly writers don't need it, it's just a source of
bugs.
I've also deleted the FNMADD family of emulated instructions. They
aren't used anywhere.
Change-Id: Ic07cedcf141a6a3b43a0c84895460f6cfbf56c04
Reviewed-on: https://go-review.googlesource.com/33350
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The Follow pass in the assembler backend reorders and copies
instructions. This even applies to hand-written assembly code,
which in many cases don't want to be reordered. Now that the
SSA compiler does a good job for laying out instructions, the
benefit of this pass is very little:
AMD64: (old = with Follow, new = without Follow)
name old time/op new time/op delta
BinaryTree17-12 2.78s ± 1% 2.79s ± 1% +0.44% (p=0.000 n=20+19)
Fannkuch11-12 3.11s ± 0% 3.31s ± 1% +6.16% (p=0.000 n=19+19)
FmtFprintfEmpty-12 50.9ns ± 1% 51.6ns ± 3% +1.40% (p=0.000 n=17+20)
FmtFprintfString-12 127ns ± 0% 128ns ± 1% +0.88% (p=0.000 n=17+17)
FmtFprintfInt-12 122ns ± 0% 123ns ± 1% +0.76% (p=0.000 n=20+19)
FmtFprintfIntInt-12 185ns ± 1% 186ns ± 1% +0.65% (p=0.000 n=20+19)
FmtFprintfPrefixedInt-12 192ns ± 1% 202ns ± 1% +4.99% (p=0.000 n=20+19)
FmtFprintfFloat-12 284ns ± 0% 288ns ± 0% +1.33% (p=0.000 n=15+19)
FmtManyArgs-12 807ns ± 0% 804ns ± 0% -0.44% (p=0.000 n=16+18)
GobDecode-12 7.23ms ± 1% 7.21ms ± 1% ~ (p=0.052 n=20+20)
GobEncode-12 6.09ms ± 1% 6.12ms ± 1% +0.41% (p=0.002 n=19+19)
Gzip-12 253ms ± 1% 255ms ± 1% +0.95% (p=0.000 n=18+20)
Gunzip-12 38.4ms ± 0% 38.5ms ± 0% +0.34% (p=0.000 n=17+17)
HTTPClientServer-12 95.4µs ± 2% 96.1µs ± 1% +0.78% (p=0.002 n=19+19)
JSONEncode-12 16.5ms ± 1% 16.6ms ± 1% +1.17% (p=0.000 n=19+19)
JSONDecode-12 54.6ms ± 1% 55.3ms ± 1% +1.23% (p=0.000 n=18+18)
Mandelbrot200-12 4.47ms ± 0% 4.47ms ± 0% +0.06% (p=0.000 n=18+18)
GoParse-12 3.47ms ± 1% 3.47ms ± 1% ~ (p=0.583 n=20+20)
RegexpMatchEasy0_32-12 84.8ns ± 1% 85.2ns ± 2% +0.51% (p=0.022 n=20+20)
RegexpMatchEasy0_1K-12 206ns ± 1% 206ns ± 1% ~ (p=0.770 n=20+20)
RegexpMatchEasy1_32-12 82.8ns ± 1% 83.4ns ± 1% +0.64% (p=0.000 n=20+19)
RegexpMatchEasy1_1K-12 363ns ± 1% 361ns ± 1% -0.48% (p=0.007 n=20+20)
RegexpMatchMedium_32-12 126ns ± 1% 126ns ± 0% +0.72% (p=0.000 n=20+20)
RegexpMatchMedium_1K-12 39.1µs ± 1% 39.8µs ± 0% +1.73% (p=0.000 n=19+19)
RegexpMatchHard_32-12 1.97µs ± 0% 1.98µs ± 1% +0.29% (p=0.005 n=18+20)
RegexpMatchHard_1K-12 59.5µs ± 1% 59.8µs ± 1% +0.36% (p=0.000 n=18+20)
Revcomp-12 442ms ± 1% 445ms ± 2% +0.67% (p=0.000 n=19+20)
Template-12 58.0ms ± 1% 57.5ms ± 1% -0.85% (p=0.000 n=19+19)
TimeParse-12 311ns ± 0% 314ns ± 0% +0.94% (p=0.000 n=20+18)
TimeFormat-12 350ns ± 3% 346ns ± 0% ~ (p=0.076 n=20+19)
[Geo mean] 55.9µs 56.4µs +0.80%
ARM32:
name old time/op new time/op delta
BinaryTree17-4 30.4s ± 0% 30.1s ± 0% -1.14% (p=0.000 n=10+8)
Fannkuch11-4 13.7s ± 0% 13.6s ± 0% -0.75% (p=0.000 n=10+10)
FmtFprintfEmpty-4 664ns ± 1% 651ns ± 1% -1.96% (p=0.000 n=7+8)
FmtFprintfString-4 1.83µs ± 2% 1.77µs ± 2% -3.21% (p=0.000 n=10+10)
FmtFprintfInt-4 1.57µs ± 2% 1.54µs ± 2% -2.25% (p=0.007 n=10+10)
FmtFprintfIntInt-4 2.37µs ± 2% 2.31µs ± 1% -2.68% (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4 2.14µs ± 2% 2.10µs ± 1% -1.83% (p=0.006 n=10+10)
FmtFprintfFloat-4 3.69µs ± 2% 3.74µs ± 1% +1.60% (p=0.000 n=10+10)
FmtManyArgs-4 9.43µs ± 1% 9.17µs ± 1% -2.70% (p=0.000 n=10+10)
GobDecode-4 76.3ms ± 1% 75.5ms ± 1% -1.14% (p=0.003 n=10+10)
GobEncode-4 70.7ms ± 2% 69.0ms ± 1% -2.36% (p=0.000 n=10+10)
Gzip-4 2.64s ± 1% 2.65s ± 0% +0.59% (p=0.002 n=10+10)
Gunzip-4 402ms ± 0% 398ms ± 0% -1.11% (p=0.000 n=10+9)
HTTPClientServer-4 458µs ± 0% 457µs ± 0% ~ (p=0.247 n=10+10)
JSONEncode-4 171ms ± 0% 172ms ± 0% +0.56% (p=0.000 n=10+10)
JSONDecode-4 672ms ± 1% 668ms ± 1% ~ (p=0.105 n=10+10)
Mandelbrot200-4 33.5ms ± 0% 33.5ms ± 0% ~ (p=0.156 n=9+10)
GoParse-4 33.9ms ± 0% 34.0ms ± 0% +0.36% (p=0.031 n=9+9)
RegexpMatchEasy0_32-4 823ns ± 1% 835ns ± 1% +1.49% (p=0.000 n=8+8)
RegexpMatchEasy0_1K-4 3.99µs ± 0% 4.02µs ± 1% +0.92% (p=0.000 n=8+10)
RegexpMatchEasy1_32-4 877ns ± 3% 904ns ± 2% +3.07% (p=0.012 n=10+10)
RegexpMatchEasy1_1K-4 5.99µs ± 0% 5.97µs ± 1% -0.38% (p=0.023 n=8+8)
RegexpMatchMedium_32-4 1.40µs ± 2% 1.40µs ± 2% ~ (p=0.590 n=10+9)
RegexpMatchMedium_1K-4 357µs ± 0% 355µs ± 1% -0.72% (p=0.000 n=7+8)
RegexpMatchHard_32-4 22.3µs ± 0% 22.1µs ± 0% -0.49% (p=0.000 n=8+7)
RegexpMatchHard_1K-4 661µs ± 0% 658µs ± 0% -0.42% (p=0.000 n=8+7)
Revcomp-4 46.3ms ± 0% 46.3ms ± 0% ~ (p=0.393 n=10+10)
Template-4 753ms ± 1% 750ms ± 0% ~ (p=0.211 n=10+9)
TimeParse-4 4.28µs ± 1% 4.22µs ± 1% -1.34% (p=0.000 n=8+10)
TimeFormat-4 9.00µs ± 0% 9.05µs ± 0% +0.59% (p=0.000 n=10+10)
[Geo mean] 538µs 535µs -0.55%
ARM64:
name old time/op new time/op delta
BinaryTree17-8 8.39s ± 0% 8.39s ± 0% ~ (p=0.684 n=10+10)
Fannkuch11-8 5.95s ± 0% 5.99s ± 0% +0.63% (p=0.000 n=10+10)
FmtFprintfEmpty-8 116ns ± 0% 116ns ± 0% ~ (all equal)
FmtFprintfString-8 361ns ± 0% 360ns ± 0% -0.31% (p=0.003 n=8+6)
FmtFprintfInt-8 290ns ± 0% 290ns ± 0% ~ (p=0.620 n=9+9)
FmtFprintfIntInt-8 476ns ± 1% 469ns ± 0% -1.47% (p=0.000 n=10+6)
FmtFprintfPrefixedInt-8 412ns ± 2% 417ns ± 2% +1.39% (p=0.006 n=9+10)
FmtFprintfFloat-8 652ns ± 1% 652ns ± 0% ~ (p=0.161 n=10+8)
FmtManyArgs-8 1.94µs ± 0% 1.94µs ± 2% ~ (p=0.781 n=10+10)
GobDecode-8 17.7ms ± 1% 17.7ms ± 0% ~ (p=0.962 n=10+7)
GobEncode-8 15.6ms ± 0% 15.6ms ± 1% ~ (p=0.063 n=10+10)
Gzip-8 786ms ± 0% 787ms ± 0% ~ (p=0.356 n=10+9)
Gunzip-8 127ms ± 0% 127ms ± 0% +0.08% (p=0.028 n=10+9)
HTTPClientServer-8 198µs ± 6% 198µs ± 7% ~ (p=0.796 n=10+10)
JSONEncode-8 42.5ms ± 0% 42.2ms ± 0% -0.73% (p=0.000 n=9+8)
JSONDecode-8 158ms ± 1% 162ms ± 0% +2.28% (p=0.000 n=10+9)
Mandelbrot200-8 10.1ms ± 0% 10.1ms ± 0% -0.01% (p=0.000 n=10+9)
GoParse-8 8.54ms ± 1% 8.63ms ± 1% +1.06% (p=0.000 n=10+9)
RegexpMatchEasy0_32-8 231ns ± 1% 225ns ± 0% -2.52% (p=0.000 n=9+10)
RegexpMatchEasy0_1K-8 1.63µs ± 0% 1.63µs ± 0% ~ (p=0.170 n=10+10)
RegexpMatchEasy1_32-8 253ns ± 0% 249ns ± 0% -1.41% (p=0.000 n=9+10)
RegexpMatchEasy1_1K-8 2.08µs ± 0% 2.08µs ± 0% -0.32% (p=0.000 n=9+10)
RegexpMatchMedium_32-8 355ns ± 1% 351ns ± 0% -1.04% (p=0.007 n=10+7)
RegexpMatchMedium_1K-8 104µs ± 0% 104µs ± 0% ~ (p=0.148 n=10+10)
RegexpMatchHard_32-8 5.79µs ± 0% 5.79µs ± 0% ~ (p=0.578 n=10+10)
RegexpMatchHard_1K-8 176µs ± 0% 176µs ± 0% ~ (p=0.137 n=10+10)
Revcomp-8 1.37s ± 1% 1.36s ± 1% -0.26% (p=0.023 n=10+10)
Template-8 151ms ± 1% 154ms ± 1% +2.14% (p=0.000 n=9+10)
TimeParse-8 723ns ± 2% 721ns ± 1% ~ (p=0.592 n=10+10)
TimeFormat-8 804ns ± 2% 798ns ± 3% ~ (p=0.344 n=10+10)
[Geo mean] 154µs 154µs -0.02%
Therefore remove this pass. Also reduce text size by 0.5~2%.
Comment out some dead code in runtime/sys_nacl_amd64p32.s
which contains undefined symbols.
Change-Id: I1473986fe5b18b3d2554ce96cdc6f0999b8d955d
Reviewed-on: https://go-review.googlesource.com/36205
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Change-Id: I7585d85907869f5a286b36936dfd035f1e8e9906
Reviewed-on: https://go-review.googlesource.com/34197
Run-TryBot: David Lazar <lazard@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|