author    Cherry Mui <cherryyz@google.com>  2024-07-22 16:23:43 -0400
committer Cherry Mui <cherryyz@google.com>  2024-10-30 19:09:32 +0000
commit    76a8409eb81eda553363783dcdd9d6224368ae0e (patch)
tree      a5a9f69448701e6abeb004db4387d5f783e6d578 /src/runtime/runtime2.go
parent    555ef554602a7d09ec302df7f2e3397815a804ee (diff)
download  go-76a8409eb81eda553363783dcdd9d6224368ae0e.tar.xz
runtime: update and restore g0 stack bounds at cgocallback
Currently, at a cgo callback where there is already a Go frame on the stack (i.e. C->Go->C->Go), we require that at the inner Go callback the SP is within the g0 stack bounds set by a previous callback. This is to prevent the C code from switching stacks while there is a Go frame on the stack, which we don't really support. But the check can also fail when we cannot get accurate stack bounds, e.g. when pthread_getattr_np is not available. Since the stack bounds are then just estimates based on the current SP, if there are multiple C->Go callbacks with varying stack depths, the SP of a later callback may fall outside a previous call's estimate. This leads to a runtime throw in a seemingly reasonable program.

This CL changes it to save the old g0 stack bounds at cgocallback, update the bounds, and restore the old bounds at return. So each callback gets its own stack bounds based on the current SP, and when it returns, the outer callback has its old stack bounds restored.

Also, at a cgo callback when there is no Go frame on the stack, we currently always get new stack bounds. We do this because if we can only get estimated bounds based on the SP, and the stack depth varies a lot between two C->Go calls, the previous estimates may be off and we fall out, or nearly fall out, of the previous bounds. But this causes a performance problem: the pthread API for accurate stack bounds (pthread_getattr_np) is very slow when called on the main thread, so getting the stack bounds on every call significantly slows down repeated C->Go calls on the main thread.

This CL fixes that by "caching" the stack bounds when they are accurate: the next time Go calls into C, if the previous stack bounds are accurate and the current SP is within them, we can be sure it is the same stack and we don't need to update the bounds. This avoids the repeated calls to pthread_getattr_np. If we cannot get accurate bounds, we continue to update the stack bounds based on the SP; that operation is very cheap.

On a Linux/AMD64 machine with glibc:

name                      old time/op  new time/op  delta
CgoCallbackMainThread-8   96.4µs ± 3%  0.1µs ± 2%   -99.92%  (p=0.000 n=10+9)

Fixes #68285.
Fixes #68587.

Change-Id: I3422badd5ad8ff63e1a733152d05fb7a44d5d435
Reviewed-on: https://go-review.googlesource.com/c/go/+/600296
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Diffstat (limited to 'src/runtime/runtime2.go')
-rw-r--r--  src/runtime/runtime2.go | 83
1 file changed, 42 insertions(+), 41 deletions(-)
diff --git a/src/runtime/runtime2.go b/src/runtime/runtime2.go
index 34aefd4c47..5e75e154e6 100644
--- a/src/runtime/runtime2.go
+++ b/src/runtime/runtime2.go
@@ -530,47 +530,48 @@ type m struct {
_ uint32 // align next field to 8 bytes
// Fields not known to debuggers.
- procid uint64 // for debuggers, but offset not hard-coded
- gsignal *g // signal-handling g
- goSigStack gsignalStack // Go-allocated signal handling stack
- sigmask sigset // storage for saved signal mask
- tls [tlsSlots]uintptr // thread-local storage (for x86 extern register)
- mstartfn func()
- curg *g // current running goroutine
- caughtsig guintptr // goroutine running during fatal signal
- p puintptr // attached p for executing go code (nil if not executing go code)
- nextp puintptr
- oldp puintptr // the p that was attached before executing a syscall
- id int64
- mallocing int32
- throwing throwType
- preemptoff string // if != "", keep curg running on this m
- locks int32
- dying int32
- profilehz int32
- spinning bool // m is out of work and is actively looking for work
- blocked bool // m is blocked on a note
- newSigstack bool // minit on C thread called sigaltstack
- printlock int8
- incgo bool // m is executing a cgo call
- isextra bool // m is an extra m
- isExtraInC bool // m is an extra m that is not executing Go code
- isExtraInSig bool // m is an extra m in a signal handler
- freeWait atomic.Uint32 // Whether it is safe to free g0 and delete m (one of freeMRef, freeMStack, freeMWait)
- needextram bool
- traceback uint8
- ncgocall uint64 // number of cgo calls in total
- ncgo int32 // number of cgo calls currently in progress
- cgoCallersUse atomic.Uint32 // if non-zero, cgoCallers in use temporarily
- cgoCallers *cgoCallers // cgo traceback if crashing in cgo call
- park note
- alllink *m // on allm
- schedlink muintptr
- lockedg guintptr
- createstack [32]uintptr // stack that created this thread, it's used for StackRecord.Stack0, so it must align with it.
- lockedExt uint32 // tracking for external LockOSThread
- lockedInt uint32 // tracking for internal lockOSThread
- nextwaitm muintptr // next m waiting for lock
+ procid uint64 // for debuggers, but offset not hard-coded
+ gsignal *g // signal-handling g
+ goSigStack gsignalStack // Go-allocated signal handling stack
+ sigmask sigset // storage for saved signal mask
+ tls [tlsSlots]uintptr // thread-local storage (for x86 extern register)
+ mstartfn func()
+ curg *g // current running goroutine
+ caughtsig guintptr // goroutine running during fatal signal
+ p puintptr // attached p for executing go code (nil if not executing go code)
+ nextp puintptr
+ oldp puintptr // the p that was attached before executing a syscall
+ id int64
+ mallocing int32
+ throwing throwType
+ preemptoff string // if != "", keep curg running on this m
+ locks int32
+ dying int32
+ profilehz int32
+ spinning bool // m is out of work and is actively looking for work
+ blocked bool // m is blocked on a note
+ newSigstack bool // minit on C thread called sigaltstack
+ printlock int8
+ incgo bool // m is executing a cgo call
+ isextra bool // m is an extra m
+ isExtraInC bool // m is an extra m that is not executing Go code
+ isExtraInSig bool // m is an extra m in a signal handler
+ freeWait atomic.Uint32 // Whether it is safe to free g0 and delete m (one of freeMRef, freeMStack, freeMWait)
+ needextram bool
+ g0StackAccurate bool // whether the g0 stack has accurate bounds
+ traceback uint8
+ ncgocall uint64 // number of cgo calls in total
+ ncgo int32 // number of cgo calls currently in progress
+ cgoCallersUse atomic.Uint32 // if non-zero, cgoCallers in use temporarily
+ cgoCallers *cgoCallers // cgo traceback if crashing in cgo call
+ park note
+ alllink *m // on allm
+ schedlink muintptr
+ lockedg guintptr
+ createstack [32]uintptr // stack that created this thread, it's used for StackRecord.Stack0, so it must align with it.
+ lockedExt uint32 // tracking for external LockOSThread
+ lockedInt uint32 // tracking for internal lockOSThread
+ nextwaitm muintptr // next m waiting for lock
mLockProfile mLockProfile // fields relating to runtime.lock contention
profStack []uintptr // used for memory/block/mutex stack traces