aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/proc.go
AgeCommit message (Collapse)Author
2024-05-24internal/runtime/exithook: make safe for concurrent os.ExitRuss Cox
Real programs can call os.Exit concurrently from multiple goroutines. Make internal/runtime/exithook not crash in that case. The throw on panic also now runs in the deferred context, so that we will see the full stack trace that led to the panic. That should give us more visibility into the flaky failures on bugs #55167 and #56197 as well. Fixes #67631. Change-Id: Iefdf71b3a3b52a793ca88d89a9c270eb50ece094 Reviewed-on: https://go-review.googlesource.com/c/go/+/588235 Reviewed-by: Than McIntosh <thanm@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-05-23runtime: move exit hooks into internal/runtime/exithookRuss Cox
This removes a //go:linkname usage in the coverage implementation. For #67401. Change-Id: I0602172c7e372a84465160dbf46d9fa371582fff Reviewed-on: https://go-review.googlesource.com/c/go/+/586259 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-23all: document legacy //go:linkname for modules with ≥200 dependentsRuss Cox
Ignored these linknames which have not worked for a while: github.com/xtls/xray-core: context.newCancelCtx removed in CL 463999 (Feb 2023) github.com/u-root/u-root: funcPC removed in CL 513837 (Jul 2023) tinygo.org/x/drivers: net.useNetdev never existed For #67401. Change-Id: I9293f4ef197bb5552b431de8939fa94988a060ce Reviewed-on: https://go-review.googlesource.com/c/go/+/587576 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-23all: document legacy //go:linkname for modules with ≥500 dependentsRuss Cox
For #67401. Change-Id: I7dd28c3b01a1a647f84929d15412aa43ab0089ee Reviewed-on: https://go-review.googlesource.com/c/go/+/587575 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-23all: document legacy //go:linkname for modules with ≥20,000 dependentsRuss Cox
For #67401. Change-Id: Icc10ede72547d8020c0ba45e89d954822a4b2455 Reviewed-on: https://go-review.googlesource.com/c/go/+/587218 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-21runtime: make profstackdepth a GODEBUG optionFelix Geisendörfer
Allow users to decrease the profiling stack depth back to 32 in case they experience any problems with the new default of 128. Users may also use this option to increase the depth up to 1024. Change-Id: Ieaab2513024915a223239278dd97a6e161dde1cf Reviewed-on: https://go-review.googlesource.com/c/go/+/581917 Reviewed-by: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-05-21runtime: increase profiling stack depth to 128Felix Geisendörfer
The current stack depth limit for alloc, mutex, block, threadcreate and goroutine profiles of 32 frequently leads to truncated stack traces in production applications. Increase the limit to 128 which is the same size used by the execution tracer. Create internal/profilerecord to define variants of the runtime's StackRecord, MemProfileRecord and BlockProfileRecord types that can hold arbitrarily big stack traces. Implement internal profiling APIs based on these new types and use them for creating protobuf profiles and to act as shims for the public profiling APIs using the old types. This will lead to an increase in memory usage for applications that use the impacted profile types and have stack traces exceeding the current limit of 32. Those applications will also experience a slight increase in CPU usage, but this will hopefully soon be mitigated via CL 540476 and 533258 which introduce frame pointer unwinding for the relevant profile types. For #43669. Change-Id: Ie53762e65d0f6295f5d4c7d3c87172d5a052164e Reviewed-on: https://go-review.googlesource.com/c/go/+/572396 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-17runtime: fix coro interactions with thread-locked goroutinesMichael Anthony Knyszek
This change fixes problems with thread-locked goroutines using newcoro/coroswitch/etc. Currently, the coro paths do not consider thread-locked goroutines at all and can quickly result in broken scheduler state or lost/leaked goroutines. One possible fix to these issues is to fall back on goroutine+channel semantics, but that turns out to be fairly complicated to implement and results in significant performance cliffs. More complex thread-lock state donation tricks also result in some fairly complicated state tracking that doesn't seem worth it given the use-cases of iter.Pull (and even then, there will be performance cliffs). This change implements a much simpler, but more restrictive semantics. In particular, thread-lock state is tied to the coro at the first call to newcoro (i.e. iter.Pull). From then on, the invariant is that if the coro has any thread-lock state *or* a goroutine calling into coroswitch has any thread-lock state, that the full gamut of thread-lock state must remain the same as it was when newcoro was called (the full gamut meaning internal and external lock counts as well as the identity of the thread that was locked to). This semantics allows the common cases to be always fast, but comes with a non-orthogonality caveat. Specifically, when iter.Pull is used in conjunction with thread-locked goroutines, complex cases (passing next between goroutines or passing yield between goroutines) are likely to fail. Simple cases, where any number of iter.Pull iterators are used in a straightforward way (nested, in series, etc.) from the same goroutine, will work and will be guaranteed to be fast regardless of thread-lock state. This is a compromise for the near-term and we may consider lifting the restrictions imposed by this CL in the future. Fixes #65889. Fixes #65946. Change-Id: I3fb5791e36a61f5ded50226a229a79d28739b24e Reviewed-on: https://go-review.googlesource.com/c/go/+/583675 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Austin Clements <austin@google.com>
2024-05-17runtime: make use of stringslite.{HasPrefix, HasSuffix}Jes Cok
Change-Id: I7461a892e1591e3bad876f0a718a99e6de2c4659 Reviewed-on: https://go-review.googlesource.com/c/go/+/585435 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-05-16runtime: cleanup crashstack codeMauri de Souza Meneguzzo
This patch removes redundant guards for the crash stack feature, as all supported Go architectures now have a crash stack implementation. Change-Id: I7ffb61c57955778d687073418130b2aaab0ff183 GitHub-Last-Rev: 6ff56d6420093ee8f95c75019557dc8bcec32ed7 GitHub-Pull-Request: golang/go#67202 Reviewed-on: https://go-review.googlesource.com/c/go/+/583416 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
2024-05-13runtime: remove unused codeqiulaidongfeng
Change-Id: Ifb9864704f55e27adfa5c21452fed5a243468d13 GitHub-Last-Rev: 6b000e7314ea47eea6fead60988d6d432bc381f7 GitHub-Pull-Request: golang/go#67013 Reviewed-on: https://go-review.googlesource.com/c/go/+/581376 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-08runtime: fix eagerly typoMichael Pratt
Change-Id: I3150e2d0b9f5590c6da95392b0b51df94b8c20eb Reviewed-on: https://go-review.googlesource.com/c/go/+/584338 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-08runtime: delete pagetrace GOEXPERIMENTMichael Anthony Knyszek
The page tracer's functionality is now captured by the regular execution tracer as an experimental GODEBUG variable. This is a lot more usable and maintainable than the page tracer, which is likely to have bitrotted by this point. There's also no tooling available for the page tracer. Change-Id: I2408394555e01dde75a522e9a489b7e55cf12c8e Reviewed-on: https://go-review.googlesource.com/c/go/+/583379 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-08runtime: move profiling pc buffers to mFelix Geisendörfer
Move profiling pc buffers from being stack allocated to an m field. This is motivated by the next patch, which will increase the default stack depth to 128, which might lead to undesirable stack growth for goroutines that produce profiling events. Additionally, this change paves the way to make the stack depth configurable via GODEBUG. Change-Id: Ifa407f899188e2c7c0a81de92194fdb627cb4b36 Reviewed-on: https://go-review.googlesource.com/c/go/+/574699 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-05-06runtime: delete dead code in the tracerMichael Anthony Knyszek
This code was just missed during the cleanup. There's maybe some merit to keeping OneNewExtraM, but it would still be fairly optimistic. It's trivial to bring back, so delete it for now. Change-Id: I2d033c6daae787e0e8d6b92524f3e59610e2599f Reviewed-on: https://go-review.googlesource.com/c/go/+/583375 Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-06runtime: add crash stack support for mips/mipsleMauri de Souza Meneguzzo
Change-Id: Ic6fcdfb6a9a912a9b1dd268836d2e5ab44d80440 GitHub-Last-Rev: dab6ecc0660d3e0f8e23944286f965a3bb15b4cb GitHub-Pull-Request: golang/go#65305 Reviewed-on: https://go-review.googlesource.com/c/go/+/558698 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
2024-04-23runtime: switch to systemstack before throw in casgstatusMichael Anthony Knyszek
CL 580255 increased the frame size of entersyscall and reentersyscall, which is causing the x/sys repository to fail to build for windows/arm64 because of an overflow of the nosplit stack reservation. Fix this by wrapping the other call to throw in casgstatus in a system stack switch. This is a fatal throw anyway indicating a core runtime invariant is broken, so this path is basically never taken. This cuts off the nosplit frame chain and allows x/sys to build. Change-Id: I00b16c9db3a7467413ed48953c7f8a9a750f000a Reviewed-on: https://go-review.googlesource.com/c/go/+/580775 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-22runtime: reduced struct sizes found via paholeSabyrzhan Tasbolatov
During my research of pahole with Go structs, I've found couple of structs in runtime/ pkg where we can reduce several structs' sizes highligted by pahole tool which detect byte holes and paddings. Overall, there are 80 bytes reduced. Change-Id: I398e5ed6f5b199394307741981cb5ad5b875e98f Reviewed-on: https://go-review.googlesource.com/c/go/+/578795 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Joedian Reid <joedian@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-22runtime: set gp.syscallbp from entersyscallblock_handoffMichael Anthony Knyszek
This was an oversight and is causing a few failures, most notably on Solaris and Illumos, but also occasionally on the Linux builders. Change-Id: I38bd28537ad01d955675f61f9b1d42b9ecdd1ef0 Reviewed-on: https://go-review.googlesource.com/c/go/+/580875 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-22runtime: always acquire M when acquiring locks by rankRhys Hiltner
Profiling of runtime-internal locks checks gp.m.locks to see if it's safe to add a new record to the profile, but direct use of acquireLockRank can change the list of the M's active lock ranks without updating gp.m.locks to match. The runtime's internal rwmutex implementation makes a point of calling acquirem/releasem when manipulating the lock rank list, but the other user of acquireLockRank (the GC's Gscan bit) relied on the GC's invariants to avoid deadlocks. Codify the rwmutex approach by renaming acquireLockRank to acquireLockRankAndM and having it include a call to aquirem. Do the same for release. Fixes #64706 Fixes #66004 Change-Id: Ib76eaa0cc1c45b64861d03345e17e1e843c19713 GitHub-Last-Rev: 160577bdb2bb2a4e869c6fd7e53e3be8fb819182 GitHub-Pull-Request: golang/go#66276 Reviewed-on: https://go-review.googlesource.com/c/go/+/571056 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-04-19runtime: track frame pointer while in syscallMichael Anthony Knyszek
Currently the runtime only tracks the PC and SP upon entering a syscall, but not the FP (BP). This is mainly for historical reasons, and because the tracer (which uses the frame pointer unwinder) does not need it. Until it did, of course, in CL 567076, where the tracer tries to take a stack trace of a goroutine that's in a syscall from afar. It tries to use gp.sched.bp and lots of things go wrong. It *really* should be using the equivalent of gp.syscallbp, which doesn't exist before this CL. This change introduces gp.syscallbp and tracks it. It also introduces getcallerfp which is nice for simplifying some code. Because we now have gp.syscallbp, we can also delete the frame skip count computation in traceLocker.GoSysCall, because it's now the same regardless of whether frame pointer unwinding is used. Fixes #66889. Change-Id: Ib6d761c9566055e0a037134138cb0f81be73ecf7 Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-nocgo Reviewed-on: https://go-review.googlesource.com/c/go/+/580255 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-04-18internal/weak: add package implementing weak pointersMichael Anthony Knyszek
This change adds the internal/weak package, which exposes GC-supported weak pointers to the standard library. This is for the upcoming weak package, but may be useful for other future constructs. For #62483. Change-Id: I4aa8fa9400110ad5ea022a43c094051699ccab9d Reviewed-on: https://go-review.googlesource.com/c/go/+/576297 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-15runtime, cmd/trace: remove code paths that include v1 tracerCarlos Amedee
This change makes the new execution tracer described in #60773, the default tracer. This change attempts to make the smallest amount of changes for a single CL. Updates #66703 For #60773 Change-Id: I3742f3419c54f07d7c020ae5e1c18d29d8bcae6d Reviewed-on: https://go-review.googlesource.com/c/go/+/576256 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-08runtime: account for _Pgcstop in GC CPU pause time in a fine-grained wayMichael Anthony Knyszek
The previous CL, CL 570257, made it so that STW time no longer overlapped with other CPU time tracking. However, what we lost was insight into the CPU time spent _stopping_ the world, which can be just as important. There's pretty much no easy way to measure this indirectly, so this CL implements a direct measurement: whenever a P enters _Pgcstop, it writes down what time it did so. stopTheWorld then accumulates all the time deltas between when it finished stopping the world and each P's stop time into a total additional pause time. The GC pause cases then accumulate this number into the metrics. This should cause minimal additional overhead in stopping the world. GC STWs already take on the order of 10s to 100s of microseconds. Even for 100 Ps, the extra `nanotime` call per P is only 1500ns of additional CPU time. This is likely to be much less in actual pause latency, since it all happens concurrently. Change-Id: Icf190ffea469cd35ebaf0b2587bf6358648c8554 Reviewed-on: https://go-review.googlesource.com/c/go/+/574215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Nicolas Hillegeer <aktau@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-04-08runtime: remove overlap in the GC CPU pause time metricsMichael Anthony Knyszek
Currently the GC CPU pause time metrics start measuring before the STW is complete. This results in a slightly less accurate measurement and creates some overlap with other timings (for example, the idle time of idle Ps) that will cause double-counting. This CL adds a field to worldStop to track the point at which the world actually stopped and uses that as the basis for the GC CPU pause time metrics, basically eliminating this overlap. Note that this will cause Ps in _Pgcstop before the world is fully stopped to be counted as user time. A follow-up CL will fix this discrepancy. Change-Id: I287731f08415ffd97d327f582ddf7e5d2248a6f5 Reviewed-on: https://go-review.googlesource.com/c/go/+/570258 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Nicolas Hillegeer <aktau@google.com>
2024-04-05runtime: take a stack trace during tracing only when we own the stackMichael Anthony Knyszek
Currently, the execution tracer may attempt to take a stack trace of a goroutine whose stack it does not own. For example, if the goroutine is in _Grunnable or _Gwaiting. This is easily fixed in all cases by simply moving the emission of GoStop and GoBlock events to before the casgstatus happens. The goroutine status is what is used to signal stack ownership, and the GC may shrink a goroutine's stack if it can acquire the scan bit. Although this is easily fixed, the interaction here is very subtle, because stack ownership is only implicit in the goroutine's scan status. To make this invariant more maintainable and less error-prone in the future, this change adds a GODEBUG setting that checks, at the point of taking a stack trace, whether the caller owns the goroutine. This check is not quite perfect because there's no way for the stack tracing code to know that the _Gscan bit was acquired by the caller, so for simplicity it assumes that it was the caller that acquired the scan bit. In all other cases however, we can check for ownership precisely. At the very least, this check is sufficient to catch the issue this change is fixing. To make sure this debug check doesn't bitrot, it's always enabled during trace testing. This new mode has actually caught a few other issues already, so this change fixes them. One issue that this debug mode caught was that it's not safe to take a stack trace of a _Gwaiting goroutine that's being unparked. Another much bigger issue this debug mode caught was the fact that the execution tracer could try to take a stack trace of a G that was in _Gwaiting solely to avoid a deadlock in the GC. The execution tracer already has a partial list of these cases since they're modeled as the goroutine just executing as normal in the tracer, but this change takes the list and makes it more formal. In this specific case, we now prevent the GC from shrinking the stacks of goroutines in this state if tracing is enabled. The stack traces from these scenarios are too useful to discard, but there is indeed a race here between the tracer and any attempt to shrink the stack by the GC. Change-Id: I019850dabc8cede202fd6dcc0a4b1f16764209fb Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race Reviewed-on: https://go-review.googlesource.com/c/go/+/573155 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-03-25runtime: migrate internal/atomic to internal/runtimeAndy Pan
For #65355 Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be Reviewed-on: https://go-review.googlesource.com/c/go/+/560155 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com> Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2024-03-22runtime: add tracing for iter.PullMichael Anthony Knyszek
This change resolves a TODO in the coroutine switch implementation (used exclusively by iter.Pull at the moment) to enable tracing. This was blocked on eliminating the atomic load in the tracer's "off" path (completed in the previous CL in this series) and the addition of new tracer events to minimize the overhead of tracing in this circumstance. This change introduces 3 new event types to support coroutine switches: GoCreateBlocked, GoSwitch, and GoSwitchDestroy. GoCreateBlocked needs to be introduced because the goroutine created for the coroutine starts out in a blocked state. There's no way to represent this in the tracer right now, so we need a new event for it. GoSwitch represents the actual coroutine switch, which conceptually consists of a GoUnblock, a GoBlock, and a GoStart event in series (unblocking the next goroutine to run, blocking the current goroutine, and then starting the next goroutine to run). GoSwitchDestroy is closely related to GoSwitch, implementing the same semantics except that GoBlock is replaced with GoDestroy. This is used when exiting the coroutine. The implementation of all this is fairly straightforward, and the trace parser simply translates GoSwitch* into the three constituent events. Because GoSwitch and GoSwitchDestroy imply a GoUnblock and a GoStart, they need to synchronize with other past and future GoStart events to create a correct partial ordering in the trace. Therefore, these events need a sequence number for the goroutine that will be unblocked and started. Also, while implementing this, I noticed that the coroutine implementation is actually buggy with respect to LockOSThread. In fact, it blatantly disregards its invariants without an explicit panic. While such a case is likely to be rare (and inefficient!) we should decide how iter.Pull behaves with respect to runtime.LockOSThread. Lastly, this change also bumps the trace version from Go 1.22 to Go 1.23. We're adding events that are incompatible with a Go 1.22 parser, but Go 1.22 traces are all valid Go 1.23 traces, so the newer parser supports both (and the CL otherwise updates the Go 1.22 definitions of events and such). We may want to reconsider the structure and naming of some of these packages though; it could quickly get confusing. For #61897. Change-Id: I96897a46d5852c02691cde9f957dc6c13ef4d8e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/565937 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-03-15runtime: add crash stack support for 386Mauri de Souza Meneguzzo
Change-Id: Ib787b27670ad0f10bcc94b3ce76e86746997af00 GitHub-Last-Rev: e5abb9a556ae709d30d6ffb5a13805923c215254 GitHub-Pull-Request: golang/go#65934 Reviewed-on: https://go-review.googlesource.com/c/go/+/566715 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-03-08runtime: avoid pp.timers.lock in updateTimerPMaskRuss Cox
The comment in updateTimerPMask is wrong. It says: // Looks like there are no timers, however another P // may be adding one at this very moment. // Take the lock to synchronize. This was my incorrect simplification of the original comment from CL 264477 when I was renaming all the things it mentioned: // Looks like there are no timers, however another P may transiently // decrement numTimers when handling a timerModified timer in // checkTimers. We must take timersLock to serialize with these changes. updateTimerPMask is being called by pidleput, so the P in question is not in use. And other P's cannot add to this P. As the original comment more precisely noted, the problem was that other P's might be calling timers.check, which updates ts.len occasionally while ts is locked, and one of those updates might "leak" an ephemeral len==0 even when the heap is not going to be empty when the P is finally unlocked. The lock/unlock in updateTimerPMask synchronizes to avoid that. But this defeats most of the purpose of using ts.len in the first place. Instead of requiring that synchronization, we can arrange that ts.len only ever shows a "publishable" length, meaning the len(ts.heap) we leave behind during ts.unlock. Having done that, updateTimerPMask can be inlined into pidleput. The big comment on updateTimerPMask explaining how timerpMask works is better placed as the doc comment for timerpMask itself, so move it there. Change-Id: I5442c9bb7f1473b5fd37c43165429d087012e73f Reviewed-on: https://go-review.googlesource.com/c/go/+/568336 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-03-08runtime: introduce timers.lock, timers.unlock methodsRuss Cox
No semantic changes here. Cleaning up for next change. Change-Id: I9706009739677ff9eb893bcc007d805f7877511e Reviewed-on: https://go-review.googlesource.com/c/go/+/568335 Reviewed-by: Austin Clements <austin@google.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-29runtime: move per-P timers state into its own structRuss Cox
Continuing conversion from C to Go, introduce type timers encapsulating all timer heap state, with methods for operations. This should at least be easier to think about, instead of having these fields strewn through the P struct. It should also be easier to test. I am skeptical about the pair of atomic int64 deadlines: I think there are missed wakeups lurking. Having the code in an abstracted API should make it easier to reason through and fix if needed. [This is one CL in a refactoring stack making very small changes in each step, so that any subtle bugs that we miss can be more easily pinpointed to a small change.] Change-Id: If5ea3e0b946ca14076f44c85cbb4feb9eddb4f95 Reviewed-on: https://go-review.googlesource.com/c/go/+/564132 Reviewed-by: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-02-28runtime: move all timer-locking code into time.goRuss Cox
No code changes, only code moves here. Move all code that locks pp.timersLock into time.go so that it is all in one place, for easier abstraction. [This is one CL in a refactoring stack making very small changes in each step, so that any subtle bugs that we miss can be more easily pinpointed to a small change.] Change-Id: I1b59af7780431ec6479440534579deb1a3d9d7a3 Reviewed-on: https://go-review.googlesource.com/c/go/+/564117 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-28runtime: delete clearDeletedTimersRuss Cox
adjusttimers already contains the same logic. Use it instead. This avoids having two copies of the code and is faster. adjusttimers was formerly O(n log n) but is now O(n). clearDeletedTimers was formerly O(n² log n) and is now gone! [This is one CL in a refactoring stack making very small changes in each step, so that any subtle bugs that we miss can be more easily pinpointed to a small change.] Change-Id: I32bf24817a589033dc304b359f8df10ea21f48fc Reviewed-on: https://go-review.googlesource.com/c/go/+/564116 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-27runtime: fix the potential race of idle P's during injectglistAndy Pan
Fixes #63555 Change-Id: I26e7baf58fcb78e66e0feed5725e371e25d657cc Reviewed-on: https://go-review.googlesource.com/c/go/+/550175 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-02-27runtime: disable use of runnext on wasmMichael Pratt
When readying a goroutine, the scheduler typically places the readied goroutine in pp.runnext, which will typically be the next goroutine to run in the schedule. In order to prevent a set of ping-pong goroutines from simply switching back and forth via runnext and starving the rest of the run queue, a goroutine scheduled via runnext shares a time slice (pp.schedtick) with the previous goroutine. sysmon detects "long-running goroutines", which really means Ps using the same pp.schedtick for too long, and preempts them to allow the rest of the run queue to run. Thus this avoids starvation via runnext. However, wasm has no threads, and thus no sysmon. Without sysmon to preempt, the possibility for starvation returns. Avoid this by disabling runnext entirely on wasm. This means that readied goroutines always go on the end of the run queue and thus cannot starve via runnext. Note that this CL doesn't do anything about single long-running goroutines. Without sysmon to preempt them, a single goroutine that fails to yield will starve the run queue indefinitely. For #65178. Change-Id: I10859d088776125a2af8c9cd862b6e071da628b5 Cq-Include-Trybots: luci.golang.try:gotip-js-wasm,gotip-wasip1-wasm_wasmtime,gotip-wasip1-wasm_wazero Reviewed-on: https://go-review.googlesource.com/c/go/+/559798 Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-23runtime: add crash stack support for armMauri de Souza Meneguzzo
Change-Id: Ide4002d1cf82f2daaf7261b367c391dedbbf7719 GitHub-Last-Rev: 80ee248c3e34529e7a522acc97db9fb69c82dffb GitHub-Pull-Request: golang/go#65308 Reviewed-on: https://go-review.googlesource.com/c/go/+/558699 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-02runtime: initialize crashFD to -1Michael Pratt
crashFD defaults to the zero value of (surprise!) zero. Zero is a valid FD, so on the first call to SetCrashOutput we actually close FD 0 since it is a "valid" FD. Initialize crashFD to -1, the sentinel for "no FD". Change-Id: I3b108c60603f2b83b867cbe079f035c159b6a6ca Reviewed-on: https://go-review.googlesource.com/c/go/+/560776 Reviewed-by: Alan Donovan <adonovan@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-02-01runtime: traceAcquire and traceRelease across all P stealsMichael Anthony Knyszek
Currently there are a few places where a P can get stolen where the runtime doesn't traceAcquire and traceRelease across the steal itself. What can happen then is the following scenario: - Thread 1 enters a syscall and writes an event about it. - Thread 2 steals Thread 1's P. - Thread 1 exits the syscall and writes one or more events about it. - Tracing ends (trace.gen is set to 0). - Thread 2 checks to see if it should write an event for the P it just stole, sees that tracing is disabled, and doesn't. This results in broken traces, because there's a missing ProcSteal event. The parser always waits for a ProcSteal to advance a GoSyscallEndBlocked event, and in this case, it never comes. Fixes #65181. Change-Id: I437629499bb7669bf7fe2fc6fc4f64c53002916b Reviewed-on: https://go-review.googlesource.com/c/go/+/560235 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-01-30Revert "runtime: disable use of runnext on wasm"Michael Pratt
This reverts CL 557437. Reason for revert: Appears to have broken wasip1 builders. For #65178. Change-Id: I59c1a310eb56589c768536fe444c1efaf862f8b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/559237 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-01-29runtime: disable use of runnext on wasmMichael Pratt
When readying a goroutine, the scheduler typically places the readied goroutine in pp.runnext, which will typically be the next goroutine to run in the schedule. In order to prevent a set of ping-pong goroutines from simply switching back and forth via runnext and starving the rest of the run queue, a goroutine scheduled via runnext shares a time slice (pp.schedtick) with the previous goroutine. sysmon detects "long-running goroutines", which really means Ps using the same pp.schedtick for too long, and preempts them to allow the rest of the run queue to run. Thus this avoids starvation via runnext. However, wasm has no threads, and thus no sysmon. Without sysmon to preempt, the possibility for starvation returns. Avoid this by disabling runnext entirely on wasm. This means that readied goroutines always go on the end of the run queue and thus cannot starve via runnext. Note that this CL doesn't do anything about single long-running goroutines. Without sysmon to preempt them, a single goroutine that fails to yield will starve the run queue indefinitely. For #65178. Cq-Include-Trybots: luci.golang.try:gotip-js-wasm,gotip-wasip1-wasm_wasmtime,gotip-wasip1-wasm_wazero Change-Id: I7dffe1e72c6586474186b72f8068cff77b661eae Reviewed-on: https://go-review.googlesource.com/c/go/+/557437 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Bryan Mills <bcmills@google.com>
2024-01-27runtime: crash stack support for loong64Mauri de Souza Meneguzzo
Change-Id: Icc2641b888440cc27444b5dfb2b8ff286e6a595d GitHub-Last-Rev: f5772e32e9190ab1eed94fcf2c9e58d6bc0d74d6 GitHub-Pull-Request: golang/go#63923 Reviewed-on: https://go-review.googlesource.com/c/go/+/539536 Reviewed-by: abner chenc <chenguoqi@loongson.cn> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2024-01-25runtime: add crash stack support for s390xMauri de Souza Meneguzzo
Change-Id: Ie923f7bbe5ef22e381ae4f421387fbd570622a28 GitHub-Last-Rev: f8f21635025eb6e26c6994679995ade501e870cf GitHub-Pull-Request: golang/go#63908 Reviewed-on: https://go-review.googlesource.com/c/go/+/539296 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Joel Sing <joel@sing.id.au>
2024-01-15runtime: more godoc linksOlivier Mengué
Change-Id: I8fe66326994894b17ce0eda991bba942844d26b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/541475 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-01-09runtime: replace rwmutexR/W with per-rwmutex lock rankMichael Pratt
CL 549536 intended to decouple the internal implementation of rwmutex from the semantic meaning of an rwmutex read/write lock in the static lock ranking. Unfortunately, it was not thought through well enough. The internals were represented with the rwmutexR and rwmutexW lock ranks. The idea was that the internal lock ranks need not model the higher-level ordering, since those have separate rankings. That is incorrect; rwmutexW is held for the duration of a write lock, so it must be ranked before any lock taken while any write lock is held, which is precisely what we were trying to avoid. This is visible in violations like: 0 : execW 11 0x0 1 : rwmutexW 51 0x111d9c8 2 : fin 30 0x111d3a0 fatal error: lock ordering problem execW < fin is modeled, but rwmutexW < fin is missing. Fix this by eliminating the rwmutexR/W lock ranks shared across different types of rwmutex. Instead require users to define an additional "internal" lock rank to represent the implementation details of rwmutex.rLock. We can avoid an additional "internal" lock rank for rwmutex.wLock because the existing writeRank has the same semantics for semantic and internal locking. i.e., writeRank is held for the duration of a write lock, which is exactly how rwmutex.wLock is used, so we can use writeRank directly on wLock. For #64722. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-staticlockranking Change-Id: Ia572de188a46ba8fe054ae28537648beaa16b12c Reviewed-on: https://go-review.googlesource.com/c/go/+/555055 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2023-12-15runtime: properly model rwmutex in lock rankingMichael Pratt
Currently, lock ranking doesn't really try to model rwmutex. It records the internal locks rLock and wLock, but in a subpar fashion: 1. wLock is held from lock to unlock, so it works OK, but it conflates write locks of all rwmutexes as rwmutexW, rather than allowing different rwmutexes to have different rankings. 2. rLock is an internal implementation detail that is only taken when there is contention in rlock. As as result, the reader lock path is almost never checked. Add proper modeling. rwmutexR and rwmutexW remain as the ranks of the internal locks, which have their own ordering. The new init method is passed the ranks of the higher level lock that this represents, just like lockInit for mutex. execW ordered before MALLOC captures the case from #64722. i.e., there can be allocation between BeforeFork and AfterFork. For #64722. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-staticlockranking Change-Id: I23335b28faa42fb04f1bc9da02fdf54d1616cd28 Reviewed-on: https://go-review.googlesource.com/c/go/+/549536 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-06iter, runtime: add coroutine supportRuss Cox
The exported API is only available with GOEXPERIMENT=rangefunc. This will let Go 1.22 users who want to experiment with rangefuncs access an efficient implementation of iter.Pull and iter.Pull2. For #61897. Change-Id: I6ef5fa8f117567efe4029b7b8b0f4d9b85697fb7 Reviewed-on: https://go-review.googlesource.com/c/go/+/543319 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-06runtime: rename GODEBUG=profileruntimelocks to runtimecontentionstacksMichael Pratt
profileruntimelocks is new in CL 544195, but the name is deceptive. Even with profileruntimelocks=0, runtime-internal locks are still profiled. The actual difference is that call stacks are not collected. Instead all contention is reported at runtime._LostContendedLock. Rename this setting to runtimecontentionstacks to make its name more aligned with its behavior. In addition, for this release the default is profileruntimelocks=0, meaning that users are fairly likely to encounter runtime._LostContendedLock. Rename it to runtime._LostContendedRuntimeLock in an attempt to make it more intuitive that these are runtime locks, not locks in application code. For #57071. Change-Id: I38aac28b2c0852db643d53b1eab3f3bc42a43393 Reviewed-on: https://go-review.googlesource.com/c/go/+/547055 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rhys Hiltner <rhys@justin.tv>
2023-12-05math/rand, math/rand/v2: use ChaCha8 for global randRuss Cox
Move ChaCha8 code into internal/chacha8rand and use it to implement runtime.rand, which is used for the unseeded global source for both math/rand and math/rand/v2. This also affects the calculation of the start point for iteration over very very large maps (when the 32-bit fastrand is not big enough). The benefit is that misuse of the global random number generators in math/rand and math/rand/v2 in contexts where non-predictable randomness is important for security reasons is no longer a security problem, removing a common mistake among programmers who are unaware of the different kinds of randomness. The cost is an extra 304 bytes per thread stored in the m struct plus 2-3ns more per random uint64 due to the more sophisticated algorithm. Using PCG looks like it would cost about the same, although I haven't benchmarked that. Before this, the math/rand and math/rand/v2 global generator was wyrand (https://github.com/wangyi-fudan/wyhash). For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson ALFG was justifiable, since the latter was not any better. But for math/rand/v2, the global generator really should be at least as good as one of the well-studied, specific algorithms provided directly by the package, and it's not. (Wyrand is still reasonable for scheduling and cache decisions.) Good randomness does have a cost: about twice wyrand. Also rationalize the various runtime rand references. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │ │ sec/op │ sec/op vs base │ ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20) PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20) SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20) GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20) GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20) GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20) GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20) Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20) Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20) GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20) IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20) Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20) Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20) Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20) Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20) Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20) Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20) Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20) Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20) Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20) Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20) Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20) Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20) Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20) ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20) NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20) Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20) Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20) Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20) ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20) Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │ │ sec/op │ sec/op vs base │ ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20) PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20) SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20) GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20) GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20) GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20) GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20) Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20) Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20) GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20) IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20) Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20) Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20) Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20) Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20) Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20) Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20) Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20) Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20) Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20) Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20) Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20) ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20) NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20) Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20) Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20) Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20) ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20) Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.386 │ 5cf807d1ea.386 │ │ sec/op │ sec/op vs base │ ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20) PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20) SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20) GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20) GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20) GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20) GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20) Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20) Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20) GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20) IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20) Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20) Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20) Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20) Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20) Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20) Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20) Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20) Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20) Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20) Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20) Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20) Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20) Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20) ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20) NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20) Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20) Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20) Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20) ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20) Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20) For #61716. Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/516860 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-22runtime: throw from the systemstack in wirepMichael Anthony Knyszek
The exitsyscall path, since the introduction of the new execution tracer, stores a just little bit more data in the exitsyscall stack frame, causing a build failure from exceeding the nosplit limit with '-N -l' set on all packages (like Delve does). One of the paths through which this fails is "throw" from wirep, called by a callee of exitsyscall. By switching to the systemstack on this path, we can avoid hitting the nosplit limit, fixing the build. It's also not totally unreasonable to switch to the systemstack for the throws in this function, since the function has to be nosplit anyway. It gives the throw path a bit more wiggle room to dump information than it otherwise would have. Fixes #64113. Change-Id: I56e94e40614a202b8ac2fdc8b8b731493b74e5d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/544535 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>