| Age | Commit message (Collapse) | Author |
|
Currently this is a raw symbol name with no package component, which is
confusing when seen in profilers or similar tools.
This function does not follow a Go ABI, and thus should not have a Go
function declaration. go vet requires declaration for standard assembly
functions.
CL 176100 removed the package name as part of making vet pass on package
runtime, but simply making the function static via the <> suffix is
sufficient, there is no need to shorten the symbol name.
Change-Id: I6a6a636c6030f1c9a4b8bb330978733bb336b08e
Reviewed-on: https://go-review.googlesource.com/c/go/+/738521
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Implement secret.Do.
- When secret.Do returns:
- Clear stack that is used by the argument function.
- Clear all the registers that might contain secrets.
- On stack growth in secret mode, clear the old stack.
- When objects are allocated in secret mode, mark them and then zero
the marked objects immediately when they are freed.
- If the argument function panics, raise that panic as if it originated
from secret.Do. This removes anything about the secret function
from tracebacks.
For now, this is only implemented on linux for arm64 and amd64.
This is a rebased version of Keith Randalls initial implementation at
CL 600635. I have added arm64 support, signal handling, preemption
handling and dealt with vDSOs spilling into system stacks.
Fixes #21865
Change-Id: I6fbd5a233beeaceb160785e0c0199a5c94d8e520
Co-authored-by: Keith Randall <khr@golang.org>
Reviewed-on: https://go-review.googlesource.com/c/go/+/704615
Reviewed-by: Roland Shoemaker <roland@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
There are a few places in the runtime where new threads enter Go code
with a possibly invalid frame pointer. mstart is the entry point for new
Ms, and rt0_go is the entrypoint for the program. As we try to introduce
frame pointer unwinding in more places (e.g. for heap profiling in CL
540476 or for execution trace events on the system stack in CL 593835),
we see these functions on the stack. We need to ensure that they have
valid frame pointers. These functions are both considered the "top"
(first) frame frame of the call stack, so this CL sets the frame pointer
register to 0 in these functions.
Updates #63630
Change-Id: I6a6a6964a9ebc6f68ba23d2616e5fb6f19677f97
Reviewed-on: https://go-review.googlesource.com/c/go/+/721020
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
(*lfstack).pop and (*spanSet).pop on arm64
When profiling CPU usage LiveKit on AArch64/x86 (AWS), the graphs show
CPU spikes that was repeating in a semi-periodic manner and spikes occur
when the GC(garbage collector) is active.
Our analysis found that the getempty function accounted for 10.54% of the
overhead, which was mainly caused by the work.empty.pop() function. And
listing pop shows that the majority of the time, with a 10.29% overhead,
is spent on atomic.Cas64((*uint64)(head), old, next).
This patch adds a backoff approach to reduce the high overhead of the
atomic operation primarily occurs when contention over a specific memory
address increases, typically with the rise in the number of threads.
Note that on paltforms other than arm64, the initial value of backoff is zero.
This patch rewrites the implementation of procyield() on arm64, which is an
Armv8.0-A compatible delay function using the counter-timer.
The garbage collector benchmark:
│ master │ opt │
│ sec/op │ sec/op vs base │
Garbage/benchmem-MB=64-160 3.782m ± 4% 2.264m ± 2% -40.12% (p=0.000 n=10)
│ user+sys-sec/op │ user+sys-sec/op vs base │
Garbage/benchmem-MB=64-160 433.5m ± 4% 255.4m ± 2% -41.08% (p=0.000 n=10)
Reference for backoff mechianism:
https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/multi-threaded-applications-arm
Change-Id: Ie8128a2243ceacbb82ab2a88941acbb8428bad94
Reviewed-on: https://go-review.googlesource.com/c/go/+/654895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Change-Id: I9f01692373623687e09bee54efebaac0ee361f81
Reviewed-on: https://go-review.googlesource.com/c/go/+/712664
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
procyield will currently loop infinitely if passed 0 on several
platforms. This change sidesteps this bug by renaming procyield to
procyieldAsm, and adding a wrapper named procyield that checks for
cycles == 0. The benefit of this structure is that procyield called
with a constant cycle count of 0 will be inlined and constant folded
away, the expected behavior of a procyield of 0 cycles.
A follow-up change will fix the assembly to not have this footgun
anymore.
Change-Id: I7068abfeb961bc0fa475e216836f7c0e46b38373
Reviewed-on: https://go-review.googlesource.com/c/go/+/712663
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
This reverts commit 719dfcf8a8478d70360bf3c34c0e920be7b32994.
Reason for revert: Causing crashes.
Change-Id: I0b8526dd03d82fa074ce4f97f1789eeac702b3eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/709755
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Instead of storing LR (the return address) at 0(SP) and the FP
(parent's frame pointer) at -8(SP), store them at framesize-8(SP)
and framesize-16(SP), respectively.
We push and pop data onto the stack such that we're never accessing
anything below SP.
The prolog/epilog lengths are unchanged (3 insns for a typical prolog,
2 for a typical epilog).
We use 8 bytes more per frame.
Typical prologue:
STP.W (FP, LR), -16(SP)
MOVD SP, FP
SUB $C, SP
Typical epilogue:
ADD $C, SP
LDP.P 16(SP), (FP, LR)
RET
The previous word where we stored LR, at 0(SP), is now unused.
We could repurpose that slot for storing a local variable.
The new prolog and epilog instructions are recognized by libunwind,
so pc-sampling tools like perf should now be accurate. (TODO: except
maybe after the first RET instruction? Have to look into that.)
Update #73753 (fixes, for arm64)
Update #57302 (Quim thinks this will help on that issue)
Change-Id: I4800036a9a9a08aaaf35d9f99de79a36cf37ebb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/674615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
There is a lot of duplication in how arm64 OSes handle entry points.
Do as amd64, have all the logic in a common function.
Cq-Include-Trybots: luci.golang.try:gotip-darwin-arm64-longtest,gotip-windows-arm64
Change-Id: I370c25c3c4b107b525aba14e9dcac34a02d9872e
Reviewed-on: https://go-review.googlesource.com/c/go/+/706175
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Quim Muntal <quimmuntal@gmail.com>
|
|
For all the static bounds checks in cmd/go, we have:
6877 just a single instruction (the call itself)
139 needs an additional reg-reg move
602 needs an additional constant load
25 needs some other instruction
that's ~90% implemented using just a single instruction.
Reduces the text size of cmd/go by ~0.8%.
Total binary size is just barely smaller, ~0.2%. (The difference
is the new pcdata table.)
Change-Id: I416e9c196f5d8d0e8f08e191e6df3045e11dccbe
Reviewed-on: https://go-review.googlesource.com/c/go/+/682496
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Corollary to CL 669615.
morestack uses the frame pointer from g0.sched.bp. This doesn't really
make any sense. morestack wasn't called by whatever used g0 last, so at
best unwinding will get misleading results.
For #63630.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-arm64-longtest
Change-Id: I6a6a636c3a2994eb88f890c506c96fd899e993a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/669616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
On arm64, systemstack restores the frame pointer from g0.sched to R29
prior to calling the callback. That doesn't really make any sense. The
frame pointer value in g0.sched is some arbitrary BP from a prior
context save, but that is not the caller of systemstack.
amd64 does not do this. In fact, it leaves BP completely unmodified so
frame pointer unwinders like gdb can walk through the systemstack frame
and continue traceback on the caller's stack. Unlike mcall, systemstack
always returns to the original goroutine, so that is safe.
We should do the same on arm64.
For #63630.
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm64-longtest
Change-Id: I6a6a636c35d321dd5d7dc1c4d09e29b55b1ab621
Reviewed-on: https://go-review.googlesource.com/c/go/+/669236
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
On amd64, mcall leaves BP untouched, so the callback will push BP,
connecting the g0 stack to the calling g stack. This seems OK (frame
pointer unwinders like Linux perf can see what user code called into the
scheduler), but the "scheduler" part is problematic.
mcall is used when calling into the scheduler to deschedule the current
goroutine (e.g., in goyield). Once the goroutine is descheduled, it may
be picked up by another M and continue execution. The other thread is
mutating the goroutine stack, but our M still has a frame pointer
pointing to the goroutine stack.
A frame pointer unwinder like Linux perf could get bogus values off of
the mutating stack. Note that though the execution tracer uses
framepointer unwinding, it never unwinds a g0, so it isn't affected.
Clear the frame pointer in mcall so that unwinding always stops at
mcall.
On arm64, mcall stores the frame pointer from g0.sched.bp. This doesn't
really make any sense. mcall wasn't called by whatever used g0 last, so
at best unwinding will get misleading results (e.g., it might look like
cgocallback calls mcall?).
Also clear the frame pointer on arm64.
Other architectures don't use frame pointers.
For #63630.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-arm64-longtest
Change-Id: I6a6a636cb6404f3c95ecabdb969c9b8184615cee
Reviewed-on: https://go-review.googlesource.com/c/go/+/669615
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
It's not used for anything.
Change-Id: I031b3cdfe52b6b1cff4b3cb6713ffe588084542f
Reviewed-on: https://go-review.googlesource.com/c/go/+/652276
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
this removes the old conditional-on-register-value
handshake from the deferproc/deferprocstack logic.
The "line" for the recovery-exit frame itself (not the defers
that it runs) is the closing brace of the function.
Reduces code size slightly (e.g. go command is 0.2% smaller)
Sample output showing effect of this change, also what sort of
code it requires to observe the effect:
```
package main
import "os"
func main() {
g(len(os.Args) - 1) // stack[0]
}
var gi int
var pi *int = &gi
//go:noinline
func g(i int) {
switch i {
case 0:
defer func() {
println("g0", i)
q() // stack[2] if i == 0
}()
for j := *pi; j < 1; j++ {
defer func() {
println("recover0", recover().(string))
}()
}
default:
for j := *pi; j < 1; j++ {
defer func() {
println("g1", i)
q() // stack[2] if i == 1
}()
}
defer func() {
println("recover1", recover().(string))
}()
}
p()
} // stack[1] (deferreturn)
//go:noinline
func p() {
panic("p()")
}
//go:noinline
func q() {
panic("q()") // stack[3]
}
/* Sample output for "./foo foo":
recover1 p()
g1 1
panic: q()
goroutine 1 [running]:
main.q()
.../main.go:46 +0x2c
main.g.func3()
.../main.go:29 +0x48
main.g(0x1?)
.../main.go:37 +0x68
main.main()
.../main.go:6 +0x28
*/
```
Change-Id: Ie39ea62ecc244213500380ea06d44024cadc2317
Reviewed-on: https://go-review.googlesource.com/c/go/+/650795
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Check presence of LSE support on ARM64 chip if we targeted it at compile
time.
Related to #69124
Updates #60905
Fixes #71411
Change-Id: I65e899a28ff64a390182572c0c353aa5931fc85d
Reviewed-on: https://go-review.googlesource.com/c/go/+/645795
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
|
|
This reverts CL 610195.
Reason for revert: SIGILL on macOS. See issue #71411.
Updates #69124, #60905.
Fixes #71411.
Change-Id: Ie0624e516dfb32fb13563327bcd7f557e5cba940
Reviewed-on: https://go-review.googlesource.com/c/go/+/644695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
|
|
Check presence of LSE support on ARM64 chip if we targeted it at compile time.
Related to #69124
Update #60905
Change-Id: I6fe244decbb4982548982e1f88376847721a33c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/610195
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Shu-Chun Weng <scw@google.com>
|
|
Adds runtime.debugPinnerV1 which returns a runtime.Pinner object that
pins itself. This is intended to be used by debuggers in conjunction
with runtime.debugCall to keep heap memory reachable even if it isn't
referenced from anywhere else.
Change-Id: I508ee6a7b103e68df83c96f2e04a0599200300dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/558276
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Error like "morestack on g0" is one of the errors that is very
hard to debug, because often it doesn't print a useful stack trace.
The runtime doesn't directly print a stack trace because it is
a bad stack state to call print. Sometimes the SIGABRT may trigger
a traceback, but sometimes not especially in a cgo binary. Even if
it triggers a traceback it often does not include the stack trace
of the bad stack.
This CL makes it explicitly print a stack trace and throw. The
idea is to have some space as an "emergency" crash stack. When the
stack is in a really bad state, we switch to the crash stack and
do a traceback.
Currently only implemented on AMD64 and ARM64.
TODO: also handle errors like "morestack on gsignal" and bad
systemstack. Also handle other architectures.
Change-Id: Ibfc397202f2bb0737c5cbe99f2763de83301c1c1
Reviewed-on: https://go-review.googlesource.com/c/go/+/419435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Fix spelling errors discovered using https://github.com/codespell-project/codespell. Errors in data files and vendored packages are ignored.
Change-Id: I83c7818222f2eea69afbd270c15b7897678131dc
GitHub-Last-Rev: 3491615b1b82832cc0064f535786546e89aa6184
GitHub-Pull-Request: golang/go#60758
Reviewed-on: https://go-review.googlesource.com/c/go/+/502576
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Change-Id: If13f4d4bc545f78e3eb8c23cf2e63f0eb273d71f
GitHub-Last-Rev: 32ca70f52a5c3dd66f18535c5e595e66afb3903c
GitHub-Pull-Request: golang/go#60703
Reviewed-on: https://go-review.googlesource.com/c/go/+/502055
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The previous name was wrong due to the mistaken assumption that calling
f->g->getcallerpc and f->g->getcallersp would respectively return the
pc/sp at g. However, they are actually referring to their caller's
caller, i.e. f.
Rename getcallerfp to getfp in order to stay consistent with this
naming convention.
Also see discussion on CL 463835.
For #16638
This is a redo of CL 481617 that became necessary because CL 461738
added another call site for getcallerfp().
Change-Id: If0b536e85a6c26061b65e7b5c2859fc31385d025
Reviewed-on: https://go-review.googlesource.com/c/go/+/494857
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
|
|
This reapplies CL 485500, with a fix drafted in CL 492987 incorporated.
CL 485500 is reverted due to #60004 and #60007. #60004 is fixed in
CL 492743. #60007 is fixed in CL 492987 (incorporated in this CL).
[Original CL 485500 description]
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 492987 description]
On the first call into Go from a C thread, currently we set the g0
stack's high bound imprecisely based on the SP. With CL 485500, we
keep the M and don't recompute the stack bounds when it calls into
Go again. If the first call is made when the C thread uses some
deep stack, but a subsequent call is made with a shallower stack,
the SP may be above g0.stack.hi.
This is usually okay as we don't check usually stack.hi. One place
where we do check for stack.hi is in the signal handler, in
adjustSignalStack. In particular, C TSAN delivers signals on the
g0 stack (instead of the usual signal stack). If the SP is above
g0.stack.hi, we don't see it is on the g0 stack, and throws.
This CL makes it get an accurate stack upper bound with the
pthread API (on the platforms where it is available).
Also add some debug print for the "handler not on signal stack"
throw.
Fixes #51676.
Fixes #59294.
Fixes #59678.
Fixes #60007.
Change-Id: Ie51c8e81ade34ec81d69fd7bce1fe0039a470776
Reviewed-on: https://go-review.googlesource.com/c/go/+/495855
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 481617.
Reason for revert: breaks test build on Windows
Change-Id: Ifc1a323b0cc521e7a5a1f7de7b3da667f5fee375
Reviewed-on: https://go-review.googlesource.com/c/go/+/494377
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The previous name was wrong due to the mistaken assumption that calling
f->g->getcallerpc and f->g->getcallersp would respectively return the
pc/sp at g. However, they are actually referring to their caller's
caller, i.e. f.
Rename getcallerfp to getfp in order to stay consistent with this
naming convention.
Also see discussion on CL 463835.
For #16638
Change-Id: I07990645da78819efd3db92f643326652ee516f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/481617
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts CL 485500.
Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455.
Change-Id: I426278d400f7611170918fc07c524cb059b9cc55
Reviewed-on: https://go-review.googlesource.com/c/go/+/492995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Chressie Himpel <chressie@google.com>
|
|
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 485500 description]
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
Fixes #51676.
Fixes #59294.
Fixes #59678.
Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
|
|
For #59670.
Change-Id: I0efa743edc08e48dc8d906803ba45e9f641369db
Reviewed-on: https://go-review.googlesource.com/c/go/+/486977
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
|
|
This reverts commit CL 486381.
Submitted out of order and breaks bootstrap.
Change-Id: Ia472111cb966e884a48f8ee3893b3bf4b4f4f875
Reviewed-on: https://go-review.googlesource.com/c/go/+/486915
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Austin Clements <austin@google.com>
|
|
For #59670.
Change-Id: I4476d6f92663e8a825d063d6e6a7fc9a2ac99d4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/486381
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
We really need 3 mix steps between the data being hashed and the output.
One mix can only spread a 1 bit change to 32 bits. The second mix
can spread to all 128 bits, but the spread is not complete. A third mix
spreads out ~evenly to all 128 bits.
The amd64 version has 3 mix steps.
Fixes #59643
Change-Id: I54ad8686ca42bcffb6d0ec3779d27af682cc96e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/486616
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes #51676.
Fixes #59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Change tracer to use frame pointer unwinding by default on amd64. The
expansion of inline frames is delayed until the stack table is dumped at
the end of the trace. This requires storing the skip argument in the
stack table, which now resides in pcBuf[0]. For stacks that are not
produced by traceStackID (e.g. CPU samples), a logicalStackSentinel
value in pcBuf[0] indicates that no inline expansion is needed.
Add new GODEBUG=tracefpunwindoff=1 option to use the old unwinder if
needed.
Benchmarks show a considerable decrease in CPU overhead when using frame
pointer unwinding for trace events:
GODEBUG=tracefpunwindoff=1 ../bin/go test -run '^$' -bench '.+PingPong' -count 20 -v -trace /dev/null ./runtime | tee tracefpunwindoff1.txt
GODEBUG=tracefpunwindoff=0 ../bin/go test -run '^$' -bench '.+PingPong' -count 20 -v -trace /dev/null ./runtime | tee tracefpunwindoff0.txt
goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
│ tracefpunwindoff1.txt │ tracefpunwindoff0.txt │
│ sec/op │ sec/op vs base │
PingPongHog-32 3782.5n ± 0% 740.7n ± 2% -80.42% (p=0.000 n=20)
For #16638
Change-Id: I2928a2fcd8779a31c45ce0f2fbcc0179641190bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/463835
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes #51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98e5583372e41f65ee4b7ab78b5aff51
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Have the write barrier call return a pointer to a buffer into which
the generated code records pointers that need write barrier treatment.
Change-Id: I7871764298e0aa1513de417010c8d46b296b199e
Reviewed-on: https://go-review.googlesource.com/c/go/+/447781
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Bypass: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Previously, the write barrier calls themselves did the actual
writes to memory. Instead, move those writes out to a common location
that both the wb-enabled and wb-disabled code paths share.
This enables us to optimize the write barrier path without having
to worry about performing the actual writes.
Change-Id: Ia71ab651908ec124cc33141afb52e4ca19733ac6
Reviewed-on: https://go-review.googlesource.com/c/go/+/447780
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Future CLs will remove the invariant that pointers are always put in
the write barrier in pairs.
The behavior of the assembly code changes a bit, where instead of writing
the pointers unconditionally and then checking for overflow, check for
overflow first and then write the pointers.
Also changed the write barrier flush function to not take the src/dst
as arguments.
Change-Id: I2ef708038367b7b82ea67cbaf505a1d5904c775c
Reviewed-on: https://go-review.googlesource.com/c/go/+/447779
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Keith Randall <khr@golang.org>
|
|
On LR architectures, morestack (and morestack_noctxt) are called
with a special calling convention, where the caller doesn't save
LR on stack but passes it as a register, which morestack will save
to g.sched.lr. The stack unwinder currently doesn't understand it,
and would fail to unwind from it. morestack already writes SP (as
it switches stack), but morestack_noctxt (which tailcalls
morestack) doesn't. If a profiling signal lands right in
morestack_noctxt, the unwinder will try to unwind the stack and
go off, and possibly crash.
Marking morestack_noctxt SPWRITE stops the unwinding.
Ideally we could teach the unwinder about the special calling
convention, or change the calling convention to be less special
(so the unwinder doesn't need to fetch a register from the signal
context). This is a stop-gap solution, to stop the unwinder from
crashing.
Fixes #54332.
Change-Id: I75295f2e27ddcf05f1ea0b541aedcb9000ae7576
Reviewed-on: https://go-review.googlesource.com/c/go/+/425396
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Fixes #53837
Change-Id: I4219fe35aac1a88aae2905998fbb1d7db87bbfb2
Reviewed-on: https://go-review.googlesource.com/c/go/+/418734
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Alessandro Arzilli <alessandro.arzilli@gmail.com>
|
|
Change-Id: I63eb42f3ce5ca452279120a5b33518f4ce16be45
GitHub-Last-Rev: a88f2f72bef402344582ae997a4907457002b5df
GitHub-Pull-Request: golang/go#52951
Reviewed-on: https://go-review.googlesource.com/c/go/+/406843
Run-TryBot: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
This CL improves the annotation documentation of the debugCallV2 function
for arm64.
Change-Id: Icc2b52063cf4fe779071039d6a3bca1951108eb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/402514
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL adds support for debugger function calls on linux arm64
platform. The protocol is basically the same as in CL 109699, except for
the following differences:
1, The abi difference which affect parameter passing and frame layout.
2, Stores communication information in R20.
3, The closure register is R26.
4, Use BRK 0 instruction to generate a breakpoint. The saved PC in
sigcontext is the PC where the signal occurred, not the next PC.
In addition, this CL refactors the existing code (which is dedicated to
amd64) for easier multi-arch scaling.
Fixes #50614
Change-Id: I06b14e345cc89aab175f4a5f2287b765da85a86b
Reviewed-on: https://go-review.googlesource.com/c/go/+/395754
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
There are several of places that save and restore the C callee-saved registers,
the operation is the same everywhere, so this CL defines several macros
to do this, which will help reduce code redundancy and unify the operation.
This CL also replaced consecutive MOVD instructions with STP and LDP instructions
in several places where these macros do not apply.
Change-Id: I815f39fe484a9ab9b6bd157dfcbc8ad99c1420fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/374397
Trust: Eric Fang <eric.fang@arm.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Change-Id: I3996fb31789a1f8559348e059cf371774e548a8d
Reviewed-on: https://go-review.googlesource.com/c/go/+/393875
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
regabireflect goexperiment was helpful in the register ABI
development, to control code paths for reflect calls, before the
compiler can generate register ABI everywhere. It is not necessary
for now. Drop it.
Change-Id: I2731197d2f496e29616c426a01045c9b685946a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/393362
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
If register ABI is used, no need to store the arguments to stack.
I forgot them in CL 323937.
Change-Id: I888af2b547a8fc97d13716bc8e8f3acd5c5bc127
Reviewed-on: https://go-review.googlesource.com/c/go/+/351609
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
In asmcgocall() we need to switch to the g0 stack if we're not already on
the g0 stack or the gsignal stack. The prefered way of doing this is to
check gsignal first, then g0, since if we are going to switch to g0 we will
need g0 handy (thus avoiding a second load).
Rewrite/reorder 386 and amd64 to check gsignal first - this shaves a few
assembly instructions off and makes the order consistent with arm, arm64,
mips64 and ppc64. Add missing gsignal checks to mips, riscv64 and s390x.
Change-Id: I1b027bf393c25e0c33e1d8eb80de67e4a0a3f561
Reviewed-on: https://go-review.googlesource.com/c/go/+/335869
Trust: Joel Sing <joel@sing.id.au>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|