aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/trace2runtime.go
AgeCommit message (Collapse)Author
2024-01-19runtime: make a much better effort to emit CPU profile in a traceMichael Anthony Knyszek
Currently the new execution tracer's handling of CPU profile samples is very best-effort. The same CPU profile buffer is used across generations, leading to a high probability that CPU samples will bleed across generations. Also, because the CPU profile buffer (not the trace buffer the samples get written into) isn't guaranteed to be flushed when we close out a generation, nor when tracing stops. This has led to test failures, but can more generally just lead to lost samples. In general, lost samples are considered OK. The CPU profile buffer is only read from every 100 ms, so if it fills up too much before then, old samples will get overwritten. The tests already account for this, and in that sense the CPU profile samples are already best-effort. But with actual CPU profiles, this is really the only condition under which samples are dropped. This CL aims to align CPU profiles better with traces by eliminating all best-effort parts of the implementation aside from the possibility of dropped samples from a full buffer. To achieve this, this CL adds a second CPU profile buffer and has the SIGPROF handler pick which CPU profile buffer to use based on the generation, much like every other part of the tracer. The SIGPROF handler then reads the trace generation, but not before ensuring it can't change: it grabs its own thread's trace seqlock. It's possible that a SIGPROF signal lands while this seqlock is already held by the thread. Luckily this is detectable and the SIGPROF handler can simply elide the locking if this happens (the tracer will already wait until all threads exit their seqlock critical section). Now that there are two CPU profile buffers written to, the read side needs to change. Instead of calling traceAcquire/traceRelease for every single CPU sample event, the trace CPU profile reader goroutine holds this conceptual lock over the entirety of flushing a buffer. This means it can pick the CPU profile buffer for the current generation to flush. With all this machinery in place, we're now at a point where all CPU profile samples get divided into either the previous generation or the current generation. This is good, since it means that we're able to emit profile samples into the correct generation, avoiding surprises in the final trace. All that's missing is to flush the CPU profile buffer from the previous generation, once the runtime has moved on from that generation. That is, when the generation counter updates, there may yet be CPU profile samples sitting in the last generation's buffer. So, traceCPUFlush now first flushes the CPU profile buffer, followed by any trace buffers containing CPU profile samples. The end result of all this is that no sample gets left behind unless it gets overwritten in the CPU profile buffer in the first place. CPU profile samples in the trace will now also get attributed to the right generation, since the SIGPROF handler now participates in the tracer's synchronization across trace generations. Fixes #55317. Change-Id: I47719fad164c544eef0bb12f99c8f3c15358e344 Reviewed-on: https://go-review.googlesource.com/c/go/+/555495 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-11-21runtime: emit a ProcSteal from entersyscall_gcwaitMichael Anthony Knyszek
Currently entersyscall_gcwait always emits a ProcStop event. Most of the time, this is correct, since the thread that just put the P into _Psyscall is the same one that is putting it into _Pgcstop. However it's possible for another thread to steal the P, start running a goroutine, and then enter another syscall, putting the P back into _Psyscall. In this case ProcStop is incorrect; the P is getting stolen. This leads to broken traces. Fix this by always emitting a ProcSteal event from entersyscall_gcwait. This means that most of the time a thread will be 'stealing' the proc from itself when it enters this function, but that's theoretically fine. A ProcSteal is really just a fancy ProcStop. Well, it would be if the parser correctly handled a self-steal. This is a minor bug that just never came up before, but it's an update order error (the mState is looked up and modified, but then it's modified again at the end of the function to match newCtx). There's really no reason a self-steal shouldn't be allowed, so fix that up and add a test. Change-Id: Iec3d7639d331e3f2d127f92ce50c2c4a7818fcd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/544215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-11-14internal/trace/v2: resolve syscall parsing ambiguityMichael Anthony Knyszek
After landing the new execution tracer, the Windows builders failed with some new errors. Currently the GoSyscallBegin event has no indicator that its the target of a ProcSteal event. This can lead to an ambiguous situation that is unresolvable if timestamps are broken. For instance, if the tracer sees the ProcSteal event while a goroutine has been observed to be in a syscall (one that, for instance, did not actually lose its P), it will proceed with the ProcSteal incorrectly. This is a little abstract. For a more concrete example, see the go122-syscall-steal-proc-ambiguous test. This change resolves this ambiguity by interleaving GoSyscallBegin events into how Ps are sequenced. Because a ProcSteal has a sequence number (it has to, it's stopping a P from a distance) it necessarily has to synchronize with a precise ProcStart event. This change basically just extends this synchronization to GoSyscallBegin, so the ProcSteal can't advance until _exactly the right_ syscall has been entered. This change removes the test skip, since it and CL 541695 fix the two main issues observed on Windows platforms. For #60773. Fixes #64061. Change-Id: I069389cd7fe1ea903edf42d79912f6e2bcc23f62 Reviewed-on: https://go-review.googlesource.com/c/go/+/541696 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-10runtime: add execution tracer v2 behind GOEXPERIMENT=exectracer2Michael Anthony Knyszek
This change mostly implements the design described in #60773 and includes a new scalable parser for the new trace format, available in internal/trace/v2. I'll leave this commit message short because this is clearly an enormous CL with a lot of detail. This change does not hook up the new tracer into cmd/trace yet. A follow-up CL will handle that. For #60773. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0 Reviewed-on: https://go-review.googlesource.com/c/go/+/494187 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>