go - Fork of Go programming language with my patches.

Age	Commit message	Author
2024-04-15	runtime: rename v2 execution tracer files	Carlos Amedee
	This change renames the v2 execution tracer files created as part of Updates #66703 For #60773 Change-Id: I91bfdc08fec4ec68ff3a6e8b5c86f6f8bcae6e6d Reviewed-on: https://go-review.googlesource.com/c/go/+/576257 Auto-Submit: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-04-15	runtime, cmd/trace: remove code paths that include v1 tracer	Carlos Amedee
	This change makes the new execution tracer described in #60773, the default tracer. This change attempts to make the smallest amount of changes for a single CL. Updates #66703 For #60773 Change-Id: I3742f3419c54f07d7c020ae5e1c18d29d8bcae6d Reviewed-on: https://go-review.googlesource.com/c/go/+/576256 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-10	runtime: rewrite traceMap to scale better	Michael Anthony Knyszek
	The existing implementation of traceMap is a hash map with a fixed bucket table size which scales poorly with the number of elements added to the map. After a few thousand elements are in the map, it tends to fall over. Furthermore, cleaning up the trace map is currently non-preemptible, without very good reason. This change replaces the traceMap implementation with a simple append-only concurrent hash-trie. The data structure is incredibly simple and does not suffer at all from the same scaling issues. Because the traceMap no longer has a lock, and the traceRegionAlloc it embeds is not thread-safe, we have to push that lock down. While we're here, this change also makes the fast path for the traceRegionAlloc lock-free. This may not be inherently faster due to contention on the atomic add, but it creates an easy path to sharding the main allocation buffer to reduce contention in the future. (We might want to also consider a fully thread-local allocator that covers both string and stack tables. The only reason a thread-local allocator isn't feasible right now is because each of these has its own region, but we could certainly group them all together.) Change-Id: I8c06d42825c326061a1b8569e322afc4bc2a513a Reviewed-on: https://go-review.googlesource.com/c/go/+/570035 Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Bypass: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com>
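	For readers unfamiliar with the data structure named above, the following is a minimal sketch of an append-only concurrent hash-trie written in ordinary Go. It is not the runtime's traceMap: the names trieMap, node, and put, the 4-way fan-out, and the use of hash/maphash are all assumptions made for illustration. The idea it shows is that each level consumes a few hash bits to pick a child slot, and a new node is published with a single compare-and-swap, so inserts need no lock and nothing is ever removed.

```go
package main

import (
	"fmt"
	"hash/maphash"
	"sync"
	"sync/atomic"
)

// node is one trie entry. Hypothetical layout, not the runtime's.
type node struct {
	key      string
	id       uint64
	children [4]atomic.Pointer[node]
}

// trieMap is an append-only map from a string to a unique ID.
type trieMap struct {
	root   atomic.Pointer[node]
	nextID atomic.Uint64
	seed   maphash.Seed
}

func newTrieMap() *trieMap { return &trieMap{seed: maphash.MakeSeed()} }

// put returns the ID for key, inserting the key if it is absent.
// It is safe for concurrent use and takes no locks.
func (m *trieMap) put(key string) uint64 {
	h := maphash.String(m.seed, key)
	slot := &m.root
	var fresh *node
	for {
		n := slot.Load()
		if n == nil {
			if fresh == nil {
				// An ID may be wasted if another goroutine inserts
				// the same key first; IDs stay unique regardless.
				fresh = &node{key: key, id: m.nextID.Add(1)}
			}
			if slot.CompareAndSwap(nil, fresh) {
				return fresh.id
			}
			continue // lost the race; re-read the slot
		}
		if n.key == key {
			return n.id
		}
		// Descend using the next two hash bits.
		slot = &n.children[h&3]
		h >>= 2
	}
}

func main() {
	m := newTrieMap()
	keys := []string{"main.main", "runtime.gopark", "sync.(*Mutex).Lock"}
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, k := range keys {
				m.put(k)
			}
		}()
	}
	wg.Wait()
	for _, k := range keys {
		fmt.Println(k, m.put(k))
	}
}
```

	Because nodes are never removed or moved, a reader that finds a key can use its ID without further synchronization. Once the hash bits run out the sketch degenerates into a chain through child 0, which stays correct but slow; a production version would handle that case differently.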
2024-04-05	runtime: emit trace stacks for more goroutines in each generation	Michael Anthony Knyszek
	This change adds a new event, GoStatusStack, which is like GoStatus but also carries a stack ID. The purpose of this event is to emit stacks in more places, in particular for goroutines that may never emit a stack-bearing event in a whole generation. This CL targets one specific case: goroutines that were blocked or in a syscall the entire generation. This particular case is handled at the point that we scribble down the goroutine's status before the generation transition. That way, when we're finishing up the generation and emitting events for any goroutines we scribbled down, we have an accurate stack for those goroutines ready to go, and we emit a GoStatusStack instead of a GoStatus event. There's a small drawback with the way we scribble down the stack though: we immediately register it in the stack table instead of tracking the PCs. This means that if a goroutine does run and emit a trace event in between when we scribbled down its stack and the end of the generation, we will have recorded a stack that never actually gets referenced in the trace. This case should be rare. There are two remaining cases where we could emit stacks for goroutines but we don't. One is goroutines that get unblocked but either never run, or run and never block within a generation. We could take a stack trace at the point of unblocking the goroutine, if we're emitting a GoStatus event for it, but unfortunately we don't own the stack at that point. We could obtain ownership by grabbing its _Gscan bit, but that seems a little risky, since we could hold up the goroutine emitting the event for a while. Something to consider for the future. The other remaining case is a goroutine that was runnable when tracing started and began running, but then ran until the end of the generation without getting preempted or blocking. The main issue here is that although the goroutine will have a GoStatus event, it'll only have a GoStart event for it which doesn't emit a stack trace. 
This case is rare, but still certainly possible. I believe the only way to resolve it is to emit a GoStatusStack event instead of a GoStatus event for a goroutine that we're emitting GoStart for. This case is a bit easier than the last one because at the point of emitting GoStart, we have ownership of the goroutine's stack. We may consider dealing with these in the future, but for now, this CL captures a fairly large class of goroutines, so is worth it on its own. Fixes #65634. Change-Id: Ief3b6df5848b426e7ee6794e98dc7ef5f37ab2d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/567076 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-25	runtime: migrate internal/atomic to internal/runtime	Andy Pan
	For #65355 Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be Reviewed-on: https://go-review.googlesource.com/c/go/+/560155 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com> Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2024-03-22	runtime: add tracing for iter.Pull	Michael Anthony Knyszek
	This change resolves a TODO in the coroutine switch implementation (used exclusively by iter.Pull at the moment) to enable tracing. This was blocked on eliminating the atomic load in the tracer's "off" path (completed in the previous CL in this series) and the addition of new tracer events to minimize the overhead of tracing in this circumstance. This change introduces 3 new event types to support coroutine switches: GoCreateBlocked, GoSwitch, and GoSwitchDestroy. GoCreateBlocked needs to be introduced because the goroutine created for the coroutine starts out in a blocked state. There's no way to represent this in the tracer right now, so we need a new event for it. GoSwitch represents the actual coroutine switch, which conceptually consists of a GoUnblock, a GoBlock, and a GoStart event in series (unblocking the next goroutine to run, blocking the current goroutine, and then starting the next goroutine to run). GoSwitchDestroy is closely related to GoSwitch, implementing the same semantics except that GoBlock is replaced with GoDestroy. This is used when exiting the coroutine. The implementation of all this is fairly straightforward, and the trace parser simply translates GoSwitch* into the three constituent events. Because GoSwitch and GoSwitchDestroy imply a GoUnblock and a GoStart, they need to synchronize with other past and future GoStart events to create a correct partial ordering in the trace. Therefore, these events need a sequence number for the goroutine that will be unblocked and started. Also, while implementing this, I noticed that the coroutine implementation is actually buggy with respect to LockOSThread. In fact, it blatantly disregards its invariants without an explicit panic. While such a case is likely to be rare (and inefficient!) we should decide how iter.Pull behaves with respect to runtime.LockOSThread. Lastly, this change also bumps the trace version from Go 1.22 to Go 1.23. 
We're adding events that are incompatible with a Go 1.22 parser, but Go 1.22 traces are all valid Go 1.23 traces, so the newer parser supports both (and the CL otherwise updates the Go 1.22 definitions of events and such). We may want to reconsider the structure and naming of some of these packages though; it could quickly get confusing. For #61897. Change-Id: I96897a46d5852c02691cde9f957dc6c13ef4d8e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/565937 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
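	The translation of GoSwitch and GoSwitchDestroy into three constituent events, as described above, can be sketched as follows. This is an illustration only: the event struct, the kind type, and expandSwitch are hypothetical names, not the internal/trace/v2 API, and the sequence number is simply copied onto the unblock and start events rather than reproducing the parser's real ordering logic.

```go
package main

import "fmt"

// kind names a trace event type. Hypothetical, for illustration.
type kind string

const (
	goUnblock       kind = "GoUnblock"
	goBlock         kind = "GoBlock"
	goStart         kind = "GoStart"
	goDestroy       kind = "GoDestroy"
	goSwitch        kind = "GoSwitch"
	goSwitchDestroy kind = "GoSwitchDestroy"
)

// event is a simplified trace event.
type event struct {
	kind      kind
	goroutine uint64
	seq       uint64 // only meaningful for GoUnblock and GoStart here
}

// expandSwitch turns a coroutine switch into its three constituent
// events: unblock next, stop cur (block or destroy), start next.
func expandSwitch(k kind, cur, next, seq uint64) []event {
	stop := goBlock
	if k == goSwitchDestroy {
		stop = goDestroy
	}
	return []event{
		{kind: goUnblock, goroutine: next, seq: seq},
		{kind: stop, goroutine: cur},
		{kind: goStart, goroutine: next, seq: seq},
	}
}

func main() {
	for _, ev := range expandSwitch(goSwitch, 1, 2, 7) {
		fmt.Printf("%s g=%d seq=%d\n", ev.kind, ev.goroutine, ev.seq)
	}
}
```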
2024-03-08	runtime: make checking if tracing is enabled non-atomic	Michael Anthony Knyszek
	Tracing is currently broken when using iter.Pull from the rangefunc experiment partly because the "tracing is off" fast path in traceAcquire was deemed too expensive to check (an atomic load) during the coroutine switch. This change adds trace.enabled, a non-atomic indicator of whether tracing is enabled. It doubles trace.gen, which is the source of truth on whether tracing is enabled. The semantics around trace.enabled are subtle. When tracing is enabled, we need to be careful to make sure that if gen != 0, goroutines enter the tracer on traceAcquire. This is enforced by making sure trace.enabled is published atomically with trace.gen. The STW takes care of synchronization with most Ms, but there's still sysmon and goroutines exiting syscalls. We need to synchronize with those explicitly anyway, which luckily takes care of trace.enabled as well. When tracing is disabled, it's always OK for trace.enabled to be stale, since traceAcquire will always double-check gen before proceeding. For #61897. Change-Id: I47c2a530fb5339c15e419312fbb1e22d782cd453 Reviewed-on: https://go-review.googlesource.com/c/go/+/565935 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-02-01	runtime: fix trace EvGoStop Gosched reason to match function	Michael Anthony Knyszek
	Currently the stop reason for runtime.Gosched is labeled "runtime.GoSched" which doesn't actually match the function name. Fix the label to match the function name. This change doesn't regenerate the internal/trace/v2 tests, because regenerating the tests breaks summarization tests in internal/trace that rely on very specific details in the example traces that aren't guaranteed. Also, go122-gc-trace.test isn't generated at all, as it turns out. I'll fix this all up in a follow-up CL. For now, just replace runtime.GoSched with runtime.Gosched in the traces so we don't have a problem later if a test wants to look for that string. This change does regenerate the cmd/trace/v2 test, but it turns out the cmd/trace/v2 tests are way too strict about network unblock events, and 3 usually pop up instead of 1 or 2, which is what the test expects. AFAICT this looks plausible to me, so just lift the restriction on "up to 2" events entirely. Change-Id: Id7350132be19119c743c259f2f5250903bf41a04 Reviewed-on: https://go-review.googlesource.com/c/go/+/552275 TryBot-Bypass: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-01-19	runtime: make a much better effort to emit CPU profile in a trace	Michael Anthony Knyszek
	Currently the new execution tracer's handling of CPU profile samples is very best-effort. The same CPU profile buffer is used across generations, leading to a high probability that CPU samples will bleed across generations. Also, the CPU profile buffer (not the trace buffer the samples get written into) isn't guaranteed to be flushed when we close out a generation, nor when tracing stops. This has led to test failures, but can more generally just lead to lost samples. In general, lost samples are considered OK. The CPU profile buffer is only read from every 100 ms, so if it fills up too much before then, old samples will get overwritten. The tests already account for this, and in that sense the CPU profile samples are already best-effort. But with actual CPU profiles, this is really the only condition under which samples are dropped. This CL aims to align CPU profiles better with traces by eliminating all best-effort parts of the implementation aside from the possibility of dropped samples from a full buffer. To achieve this, this CL adds a second CPU profile buffer and has the SIGPROF handler pick which CPU profile buffer to use based on the generation, much like every other part of the tracer. The SIGPROF handler then reads the trace generation, but not before ensuring it can't change: it grabs its own thread's trace seqlock. It's possible that a SIGPROF signal lands while this seqlock is already held by the thread. Luckily this is detectable and the SIGPROF handler can simply elide the locking if this happens (the tracer will already wait until all threads exit their seqlock critical section). Now that there are two CPU profile buffers written to, the read side needs to change. Instead of calling traceAcquire/traceRelease for every single CPU sample event, the trace CPU profile reader goroutine holds this conceptual lock over the entirety of flushing a buffer. This means it can pick the CPU profile buffer for the current generation to flush. 
With all this machinery in place, we're now at a point where all CPU profile samples get divided into either the previous generation or the current generation. This is good, since it means that we're able to emit profile samples into the correct generation, avoiding surprises in the final trace. All that's missing is to flush the CPU profile buffer from the previous generation, once the runtime has moved on from that generation. That is, when the generation counter updates, there may yet be CPU profile samples sitting in the last generation's buffer. So, traceCPUFlush now first flushes the CPU profile buffer, followed by any trace buffers containing CPU profile samples. The end result of all this is that no sample gets left behind unless it gets overwritten in the CPU profile buffer in the first place. CPU profile samples in the trace will now also get attributed to the right generation, since the SIGPROF handler now participates in the tracer's synchronization across trace generations. Fixes #55317. Change-Id: I47719fad164c544eef0bb12f99c8f3c15358e344 Reviewed-on: https://go-review.googlesource.com/c/go/+/555495 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
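	The two-buffer scheme described above can be sketched as follows. This is a single-threaded illustration with hypothetical names (profBuffers, write, flush); indexing the buffers by gen%2 is an assumption, and the seqlock, the signal handler, and the overwrite-when-full behavior are all omitted. It only shows why samples stop bleeding across generations: the writer picks a buffer from the generation it observed, and the reader drains one generation's buffer at a time.

```go
package main

import "fmt"

// sample is a simplified CPU profile sample.
type sample struct {
	gen uint64
	pc  uintptr
}

// profBuffers holds one buffer per in-flight generation.
// Indexing by gen%2 is an assumption of this sketch.
type profBuffers struct {
	bufs [2][]sample
}

// write records a sample into the buffer for the generation the
// writer observed (in the runtime, the SIGPROF handler's role).
func (p *profBuffers) write(gen uint64, pc uintptr) {
	i := gen % 2
	p.bufs[i] = append(p.bufs[i], sample{gen: gen, pc: pc})
}

// flush drains and returns every sample recorded for gen.
func (p *profBuffers) flush(gen uint64) []sample {
	i := gen % 2
	out := p.bufs[i]
	p.bufs[i] = nil
	return out
}

func main() {
	var p profBuffers
	p.write(1, 0x1000)
	p.write(1, 0x1008)
	// The generation advances; new samples go to the other buffer
	// while generation 1's samples are still waiting to be flushed.
	p.write(2, 0x2000)

	fmt.Println("gen 1:", p.flush(1))
	fmt.Println("gen 2:", p.flush(2))
}
```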
2023-11-21	runtime: emit a ProcSteal from entersyscall_gcwait	Michael Anthony Knyszek
	Currently entersyscall_gcwait always emits a ProcStop event. Most of the time, this is correct, since the thread that just put the P into _Psyscall is the same one that is putting it into _Pgcstop. However it's possible for another thread to steal the P, start running a goroutine, and then enter another syscall, putting the P back into _Psyscall. In this case ProcStop is incorrect; the P is getting stolen. This leads to broken traces. Fix this by always emitting a ProcSteal event from entersyscall_gcwait. This means that most of the time a thread will be 'stealing' the proc from itself when it enters this function, but that's theoretically fine. A ProcSteal is really just a fancy ProcStop. Well, it would be if the parser correctly handled a self-steal. This is a minor bug that just never came up before, but it's an update order error (the mState is looked up and modified, but then it's modified again at the end of the function to match newCtx). There's really no reason a self-steal shouldn't be allowed, so fix that up and add a test. Change-Id: Iec3d7639d331e3f2d127f92ce50c2c4a7818fcd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/544215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-11-14	internal/trace/v2: resolve syscall parsing ambiguity	Michael Anthony Knyszek
	After landing the new execution tracer, the Windows builders failed with some new errors. Currently the GoSyscallBegin event has no indicator that it's the target of a ProcSteal event. This can lead to an ambiguous situation that is unresolvable if timestamps are broken. For instance, if the tracer sees the ProcSteal event while a goroutine has been observed to be in a syscall (one that, for instance, did not actually lose its P), it will proceed with the ProcSteal incorrectly. This is a little abstract. For a more concrete example, see the go122-syscall-steal-proc-ambiguous test. This change resolves this ambiguity by interleaving GoSyscallBegin events into how Ps are sequenced. Because a ProcSteal has a sequence number (it has to, it's stopping a P from a distance) it necessarily has to synchronize with a precise ProcStart event. This change basically just extends this synchronization to GoSyscallBegin, so the ProcSteal can't advance until _exactly the right_ syscall has been entered. This change removes the test skip, since it and CL 541695 fix the two main issues observed on Windows platforms. For #60773. Fixes #64061. Change-Id: I069389cd7fe1ea903edf42d79912f6e2bcc23f62 Reviewed-on: https://go-review.googlesource.com/c/go/+/541696 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-10	runtime: add execution tracer v2 behind GOEXPERIMENT=exectracer2	Michael Anthony Knyszek
	This change mostly implements the design described in #60773 and includes a new scalable parser for the new trace format, available in internal/trace/v2. I'll leave this commit message short because this is clearly an enormous CL with a lot of detail. This change does not hook up the new tracer into cmd/trace yet. A follow-up CL will handle that. For #60773. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0 Reviewed-on: https://go-review.googlesource.com/c/go/+/494187 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>