| Age | Commit message (Collapse) | Author |
|
Change-Id: I0459f05e7f6abd9738813c65d993114e931720d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/731000
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Currently, between the forEachP that ensures every P has a status in the
trace and readying dead Ps for the next generation, there's a big window
where GOMAXPROCS could run and change the number of Ps. In such
circumstances, P state will not get properly prepared for the next
generation, and we may fail to emit a status for a P.
This change addresses the problem by holding worldsema across both
operations. It also moves the status emission and state clearing to
before the generation transition because that makes the code *much*
clearer.
Currently we do both these things after the generation transition
targeting the next-next generation. The reason for this is to avoid an
extra forEachP (because we already handle P status in the STW for
enabling tracing to begin with) but approach is just plain harder to
reason about. Preparing the next generation before transitioning to it,
like we do for goroutines, is much clearer. Furthermore, we also need to
set up the dead P state even if we're stopping the trace, which would
mean duplicating code into both branches (if stopping the trace, then we
go non-preemptible, otherwise we do it under worldsema) which is also
less clear.
Note that with this change we no longer need to emit the P statuses
during the STW, which was probably the worse choice anyway, since
holding up the STW for an O(P) operation is worse than a forEachP we're
going to do anyway. So, this change does away with that too.
Fixes #76572.
Change-Id: Ie7908adff43a8a372cae432bcc2f4e0d6a87d7bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/729023
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
The runtime tracer currently uses a per-M seqlock to indicate whether a
thread is writing to a local trace buffer. The seqlock is updated with
two atomic adds, read-modify-write operations. These are quite
expensive, even though they're completely uncontended.
We can make these operations slightly cheaper by using an atomic store.
The key insight here is that only one thread ever writes to the value at
a time, so only the "write" of the read-modify-write actually matters.
At that point, it doesn't really matter that we have a monotonically
increasing counter. This is made clearer by the fact that nothing other
than basic checks make sure the counter is monotonically increasing:
everything only depends on whether the counter is even or odd.
At that point, all we really need is a flag: an atomic.Bool, which we
can update with an atomic Store, a write-only instruction.
Change-Id: I0cfe39b34c7634554c34c53c0f0e196d125bbc4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/721840
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change is a documentation update for the execution tracer. Far too
much is left to small comments scattered around places. This change
accumulates the big important trace invariants, with rationale, into one
file: trace.go.
Change-Id: I5fd1402a3d8fdf14a0051e305b3a8fb5dfeafcb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/708398
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This change takes the EvEndOfGeneration event and promotes it to a real
event that appears in the trace.
This allows the trace parser to unambiguously identify truncated traces
vs. broken traces. It also makes a lot of the logic around parsing
simpler, because there's no more batch spilling necessary.
Fixes #73904.
Change-Id: I37c359b32b6b5f894825aafc02921adeaacf2595
Reviewed-on: https://go-review.googlesource.com/c/go/+/693398
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ide4daa5eee3fd4f3007d6ef23aa84b8916562c39
Reviewed-on: https://go-review.googlesource.com/c/go/+/684457
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Almost everywhere we stop the world we casGToWaitingForGC to prevent
mutual deadlock with the GC trying to scan our stack. This historically
was only necessary if we weren't stopping the world to change the GC
phase, because what we were worried about was mutual deadlock with mark
workers' use of suspendG. And, they were the only users of suspendG.
In Go 1.22 this changed. The execution tracer began using suspendG, too.
This leads to the possibility of mutual deadlock between the execution
tracer and a goroutine trying to start or end the GC mark phase. The fix
is simple: make the stop-the-world calls for the GC also call
casGToWaitingForGC. This way, suspendG is guaranteed to make progress in
this circumstance, and once it completes, the stop-the-world can
complete as well.
We can take this a step further, though, and move casGToWaitingForGC
into stopTheWorldWithSema, since there's no longer really a place we can
afford to skip this detail.
While we're here, rename casGToWaitingForGC to casGToWaitingForSuspendG,
since the GC is now not the only potential source of mutual deadlock.
Fixes #72740.
Change-Id: I5e3739a463ef3e8173ad33c531e696e46260692f
Reviewed-on: https://go-review.googlesource.com/c/go/+/681501
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Issue #74045 describes a scenario in which gopark is inlined into
readTrace, such that there are no preemption points. This is only a
problem because readTrace spins if trace.shutdown is set, through
traceReaderAvailable. However, trace.shutdown is almost certainly
overkill for traceReaderAvailable. The first condition, checking whether
the reader gen and the flushed gen match, should be sufficient to ensure
the reader wakes up and finishes flushing all buffers. The first
condition is also safe because it guarantees progress. In the case of
shutdown, all the trace work that will be flushed has been flushed, and
so the trace reader will exit into a regular goroutine context when
it's finished. If not shutting down, then the trace reader will release
doneSema, increase readerGen, and then the gopark unlockf will let it
block until new work actually comes in.
Fixes #74045.
Change-Id: Id9b15c277cb731618488771bd484577341b68675
Reviewed-on: https://go-review.googlesource.com/c/go/+/680738
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: I118d1ee09dfd6fd0075f9e5eeeb54441328bff4d
Reviewed-on: https://go-review.googlesource.com/c/go/+/681495
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
This change adds the flight recorder to the trace package.
Flight recording is a technique in which trace data is kept
in a circular buffer and can be flushed upon request. The
implementation will be added in follow-up CLs.
The flight recorder has already been implemented inside of the
golang.org/x/exp/trace package. This copies the current implementation
and modifies it to work within the runtime/trace package.
The changes include:
This adds the ability for multiple consumers (both the execution
tracer and the flight recorder) to subscribe to tracing events. This
change allows us to add multiple consumers without making major
modifications to the runtime. Future optimizations are planned
for this functionality.
This removes the use of byte readers from the process that
parses and processes the trace batches.
This modifies the flight recorder to not parse out the trace
clock frequency, since that requires knowledge of the format that's
unfortunate to encode in yet another place. Right now, the trace clock
frequency is considered stable for the lifetime of the program, so just
grab it directly from the runtime.
This change adds an in-band end-of-generation signal to the internal
implementation of runtime.ReadTrace. The internal implementation is
exported via linkname to runtime/trace, so the flight recorder can
identify exactly when a generation has ended. This signal is also useful
for ensuring that subscribers to runtime trace data always see complete
generations, by starting or stopping data streaming only at generation
boundaries.
For #63185
Change-Id: I5c15345981a6bbe9764a3d623448237e983c64ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/673116
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Replace the per-generation EvEventBatch containing a lone EvFrequency
event with a per-generation EvEventBatch containing a EvSync header
followed by an EvFrequency and EvClockSnapshot event.
The new EvClockSnapshot event contains trace, mono and wall clock
snapshots taken in close time proximity. Ignoring minor resolution
differences, the trace and mono clock are the same on linux, but not on
windows (which still uses a TSC based trace clock).
Emit the new sync batch at the very beginning of every new generation
rather than the end to be in harmony with the internal/trace reader
which emits a sync event at the beginning of every generation as well
and guarantees monotonically increasing event timestamps.
Bump the version of the trace file format to 1.25 since this change is
not backwards compatible.
Update the internal/trace reader implementation to decode the new
events, but do not expose them to the public reader API yet. This is
done in the next CL.
For #69869
Change-Id: I5bfedccdd23dc0adaf2401ec0970cbcc32363393
Reviewed-on: https://go-review.googlesource.com/c/go/+/653575
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Currently, we can only cache regular trace event buffers on each M. As
a result, calling unsafeTraceExpWriter will, in effect, always return
a new trace batch, with all of the overhead that entails.
This extends that cache to support buffers for experimental trace
data. This way, unsafeTraceExpWriter can return a partially used
buffer, which the caller can continue to extend. This gives the caller
control over when these buffers get flushed and reuses all of the
existing trace buffering mechanism.
This also has the consequence of simplifying the experimental batch
infrastructure a bit. Now, traceWriter needs to know the experiment ID
anyway, which means there's no need for a separate traceExpWriter
type.
Change-Id: Idc2100176c5d02e0fbb229dc8aa4aea2b1cf5231
Reviewed-on: https://go-review.googlesource.com/c/go/+/594595
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Note that this depends on the revert of CL 581395 to move zeroVal back.
For #67401.
Change-Id: I507c27c2404ad1348aabf1ffa3740e6b1957495b
Reviewed-on: https://go-review.googlesource.com/c/go/+/587217
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Currently the traceallocfree experiment is missing info in the trace for
interpeting the produced events. Most notably, the base heap address is
missing. While not technically necessary, it is useful for getting an
accurate picture of the program's memory layout, and will be useful for
future trace experiments. Since we want to emit a batch for this, we
should also emit a batch for all the alignment info that's used to
compress the addresses (IDs) produced for the alloc/free events.
This CL distinguishes the different formats of the experimental batches
(note that there's already batches containing type information in this
experiment) by putting a byte at the beginning of each experimental
batch indicating its format.
Change-Id: Ifc4e77a23458713b7d95e0dfa056a29e1629ccd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/586997
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change adds expensive alloc/free events to traces, guarded by a
GODEBUG that can be set at run time by mutating the GODEBUG environment
variable. This supersedes the alloc/free trace deleted in a previous CL.
There are two parts to this CL.
The first part is adding a mechanism for exposing experimental events
through the tracer and trace parser. This boils down to a new
ExperimentalEvent event type in the parser API which simply reveals the
raw event data for the event. Each experimental event can also be
associated with "experimental data" which is associated with a
particular generation. This experimental data is just exposed as a bag
of bytes that supplements the experimental events.
In the runtime, this CL organizes experimental events by experiment.
An experiment is defined by a set of experimental events and a single
special batch type. Batches of this special type are exposed through the
parser's API as the aforementioned "experimental data".
The second part of this CL is defining the AllocFree experiment, which
defines 9 new experimental events covering heap object alloc/frees, span
alloc/frees, and goroutine stack alloc/frees. It also generates special
batches that contain a type table: a mapping of IDs to type information.
Change-Id: I965c00e3dcfdf5570f365ff89d0f70d8aeca219c
Reviewed-on: https://go-review.googlesource.com/c/go/+/583377
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This change renames the v2 execution tracer files created as part of
Updates #66703
For #60773
Change-Id: I91bfdc08fec4ec68ff3a6e8b5c86f6f8bcae6e6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/576257
Auto-Submit: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This change makes the new execution tracer described in #60773, the
default tracer. This change attempts to make the smallest amount of
changes for a single CL.
Updates #66703
For #60773
Change-Id: I3742f3419c54f07d7c020ae5e1c18d29d8bcae6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/576256
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #65355
Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
This change resolves a TODO in the coroutine switch implementation (used
exclusively by iter.Pull at the moment) to enable tracing. This was
blocked on eliminating the atomic load in the tracer's "off" path
(completed in the previous CL in this series) and the addition of new
tracer events to minimize the overhead of tracing in this circumstance.
This change introduces 3 new event types to support coroutine switches:
GoCreateBlocked, GoSwitch, and GoSwitchDestroy.
GoCreateBlocked needs to be introduced because the goroutine created for
the coroutine starts out in a blocked state. There's no way to represent
this in the tracer right now, so we need a new event for it.
GoSwitch represents the actual coroutine switch, which conceptually
consists of a GoUnblock, a GoBlock, and a GoStart event in series
(unblocking the next goroutine to run, blocking the current goroutine,
and then starting the next goroutine to run).
GoSwitchDestroy is closely related to GoSwitch, implementing the same
semantics except that GoBlock is replaced with GoDestroy. This is used
when exiting the coroutine.
The implementation of all this is fairly straightforward, and the trace
parser simply translates GoSwitch* into the three constituent events.
Because GoSwitch and GoSwitchDestroy imply a GoUnblock and a GoStart,
they need to synchronize with other past and future GoStart events to
create a correct partial ordering in the trace. Therefore, these events
need a sequence number for the goroutine that will be unblocked and
started.
Also, while implementing this, I noticed that the coroutine
implementation is actually buggy with respect to LockOSThread. In fact,
it blatantly disregards its invariants without an explicit panic. While
such a case is likely to be rare (and inefficient!) we should decide how
iter.Pull behaves with respect to runtime.LockOSThread.
Lastly, this change also bumps the trace version from Go 1.22 to Go
1.23. We're adding events that are incompatible with a Go 1.22 parser,
but Go 1.22 traces are all valid Go 1.23 traces, so the newer parser
supports both (and the CL otherwise updates the Go 1.22 definitions of
events and such). We may want to reconsider the structure and naming of
some of these packages though; it could quickly get confusing.
For #61897.
Change-Id: I96897a46d5852c02691cde9f957dc6c13ef4d8e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/565937
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
This ensures the trace buffers are as up-to-date as possible right
before crashing. It increases the chance of finding the culprit for the
crash when looking at core dumps, e.g. if slowness is the cause for the
crash (monitor kills process).
Fixes #65319.
Change-Id: Iaf5551911b3b3b01ba65cb8749cf62a411e02d9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/562616
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: I8fe66326994894b17ce0eda991bba942844d26b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/541475
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL adds four new time histogram metrics:
/sched/pauses/stopping/gc:seconds
/sched/pauses/stopping/other:seconds
/sched/pauses/total/gc:seconds
/sched/pauses/total/other:seconds
The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "start" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling startTheWorldWithSema, thus including lock
contention.
These details are minor enough, that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This change mostly implements the design described in #60773 and
includes a new scalable parser for the new trace format, available in
internal/trace/v2. I'll leave this commit message short because this is
clearly an enormous CL with a lot of detail.
This change does not hook up the new tracer into cmd/trace yet. A
follow-up CL will handle that.
For #60773.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-amd64-longtest-race
Change-Id: I5d2aca2cc07580ed3c76a9813ac48ec96b157de0
Reviewed-on: https://go-review.googlesource.com/c/go/+/494187
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Currently the execution tracer synchronizes with itself using very
heavyweight operations. As a result, it's totally fine for most of the
tracer code to look like:
if traceEnabled() {
traceXXX(...)
}
However, if we want to make that synchronization more lightweight (as
issue #60773 proposes), then this is insufficient. In particular, we
need to make sure the tracer can't observe an inconsistency between g
atomicstatus and the event that would be emitted for a particular
g transition. This means making the g status change appear to happen
atomically with the corresponding trace event being written out from the
perspective of the tracer.
This requires a change in API to something more like a lock. While we're
here, we might as well make sure that trace events can *only* be emitted
while this lock is held. This change introduces such an API:
traceAcquire, which returns a value that can emit events, and
traceRelease, which requires the value that was returned by
traceAcquire. In practice, this won't be a real lock, it'll be more like
a seqlock.
For the current tracer, this API is completely overkill and the value
returned by traceAcquire basically just checks trace.enabled. But it's
necessary for the tracer described in #60773 and we can implement that
more cleanly if we do this refactoring now instead of later.
For #60773.
Change-Id: Ibb9ff5958376339fafc2b5180aef65cf2ba18646
Reviewed-on: https://go-review.googlesource.com/c/go/+/515635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Run 'unconvert -safe -apply' (https://github.com/mdempsky/unconvert)
Change-Id: I24b7cd7d286cddce86431d8470d15c5f3f0d1106
GitHub-Last-Rev: 022e75384c08bb899a8951ba0daffa0f2e14d5a7
GitHub-Pull-Request: golang/go#62662
Reviewed-on: https://go-review.googlesource.com/c/go/+/528696
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Now that pcvalue keeps its cache on the M, we can drop all of the
stack-allocated pcvalueCaches and stop carefully passing them around
between lots of operations. This significantly simplifies a fair
amount of code and makes several structures smaller.
This series of changes has no statistically significant effect on any
runtime Stack benchmarks.
I also experimented with making the cache larger, now that the impact
is limited to the M struct, but wasn't able to measure any
improvements.
This is a re-roll of CL 515277
Change-Id: Ia27529302f81c1c92fb9c3a7474739eca80bfca1
Reviewed-on: https://go-review.googlesource.com/c/go/+/520064
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
|
|
This fixes a regression from CL 494181.
The traceEnabled function splits the stack and is being
called by reentersyscall that shouldn't call anything
that splits the stack. Same with traceShuttingDown.
Fixes #61975
Change-Id: I5eca0ba74cfa6acb0259e8400b03c2093cd59dd1
GitHub-Last-Rev: 9e55ae9d7cc700de9757d32a7905127a349d973a
GitHub-Pull-Request: golang/go#61981
Reviewed-on: https://go-review.googlesource.com/c/go/+/519055
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@google.com>
|
|
This reverts CL 515277
Change-Id: Ie10378eed4993cb69f4a9b43a38af32b9d743155
Reviewed-on: https://go-review.googlesource.com/c/go/+/516855
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Now that pcvalue keeps its cache on the M, we can drop all of the
stack-allocated pcvalueCaches and stop carefully passing them around
between lots of operations. This significantly simplifies a fair
amount of code and makes several structures smaller.
This series of changes has no statistically significant effect on any
runtime Stack benchmarks.
I also experimented with making the cache larger, now that the impact
is limited to the M struct, but wasn't able to measure any
improvements.
Change-Id: I4719ebf347c7150a05e887e75a238e23647c20cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/515277
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
For #60806
Change-Id: I1ac18a6c7c703a1d6c4cd80f220059ba0be51e09
GitHub-Last-Rev: d300ca3f316d34f5013be43d01a9a473fe3000b2
GitHub-Pull-Request: golang/go#60834
Reviewed-on: https://go-review.googlesource.com/c/go/+/503356
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Fixes #60254
Change-Id: Ifd9e517a9479e5cd63fd3622b2556989d5f84eb9
Reviewed-on: https://go-review.googlesource.com/c/go/+/499036
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Bypass: Keith Randall <khr@golang.org>
|
|
The previous name was wrong due to the mistaken assumption that calling
f->g->getcallerpc and f->g->getcallersp would respectively return the
pc/sp at g. However, they are actually referring to their caller's
caller, i.e. f.
Rename getcallerfp to getfp in order to stay consistent with this
naming convention.
Also see discussion on CL 463835.
For #16638
This is a redo of CL 481617 that became necessary because CL 461738
added another call site for getcallerfp().
Change-Id: If0b536e85a6c26061b65e7b5c2859fc31385d025
Reviewed-on: https://go-review.googlesource.com/c/go/+/494857
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
|
|
Currently trace.lock is reentrant in a few cases. AFAICT, this was
necessary a long time ago when the trace reader would goparkunlock, and
might flush a trace buffer while parking the goroutine. Today, that's no
longer true, since that always happens without the trace.lock held.
However, traceReadCPU does still rely on this behavior, since it could
get called either with trace.lock held, or without it held. The silver
lining here is that it doesn't *need* trace.lock to be held, so the
trace reader can just drop the lock to call traceReadCPU (this is
probably also nice for letting other goroutines flush while the trace
reader is reading from the CPU log).
Stress-tested with
$ stress ./trace.test -test.run="TestTraceCPUProfile|TestTraceStress|TestTraceStressStartStop"
...
42m0s: 24520 runs so far, 0 failures
Change-Id: I2016292c17fe7384050fcc0c446f5797c4e46437
Reviewed-on: https://go-review.googlesource.com/c/go/+/496296
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change adds traceBlockReason which leaks fewer implementation
details of the tracer to the runtime. Currently, gopark is called with
an explicit trace event, but this leaks details about trace internals
throughout the runtime.
This change will make it easier to change out the trace implementation.
Change-Id: Id633e1704d2c8838c6abd1214d9695537c4ac7db
Reviewed-on: https://go-review.googlesource.com/c/go/+/494185
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
It's called from exitsyscall, which is nosplit.
Change-Id: I3f5f92e044497a88a72c1870beb2bdd15c4263c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/496517
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Currently the trace clock is cputicks() with comments sprinkled in
different places as to which clock to use. Since the execution tracer
redesign will use a different clock, it seems like a good time to clean
that up.
Also, rename the start/end timestamps to be more readable (i.e.
startTime vs. timeStart).
Change-Id: If43533eddd0e5f68885bb75cdbadb38da42e7584
Reviewed-on: https://go-review.googlesource.com/c/go/+/494775
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently STW events are only emitted for GC STWs. There's little reason
why the trace can't contain events for every STW: they're rare so don't
take up much space in the trace, yet being able to see when the world
was stopped is often critical to debugging certain latency issues,
especially when they stem from user-level APIs.
This change adds new "kinds" to the EvGCSTWStart event, renames the
GCSTW events to just "STW," and lets the parser deal with unknown STW
kinds for future backwards compatibility.
But, this change must break trace compatibility, so it bumps the trace
version to Go 1.21.
This change also includes a small cleanup in the trace command, which
previously checked for STW events when deciding whether user tasks
overlapped with a GC. Looking at the source, I don't see a way for STW
events to ever enter the stream that that code looks at, so that
condition has been deleted.
Change-Id: I9a5dc144092c53e92eb6950e9a5504a790ac00cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/494495
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
sysBlockTraced is a subtle and confusing flag.
Currently, it's only used in one place: a condition around whether to
traceGoSysExit when a goroutine is about to start running. That condition
looks like "gp.syscallsp != 0 && gp.trace.sysBlockTraced".
In every case but one, "gp.syscallsp != 0" is equivalent to
"gp.trace.sysBlockTraced."
That one case is where a goroutine is running without a P and racing
with trace start (world is stopped). It switches itself back to
_Grunnable from _Gsyscall before the trace start goroutine notices, such
that the trace start goroutine fails to emit a EvGoInSyscall event for
it (EvGoInSyscall or EvGoSysBlock must precede any EvGoSysExit event).
sysBlockTraced is set unconditionally on every syscall entry and the
trace start goroutine clears it if there was no EvGoInSyscall event
emitted (i.e. did not observe _Gsyscall on the goroutine). That way when
the goroutine-without-a-P wakes up and gets scheduled, it only emits
EvGoSysExit if the flag is set, i.e. trace start didn't _clear_ the
flag.
What makes this confusing is the fact that the flag is set
unconditionally and the code relies on it being *cleared*. Really, all
it's trying to communicate is whether the tracer is aware of a
goroutine's syscall at the point where a goroutine that lost its P
during a syscall is trying to run again.
Therefore, we can replace this flag with a less subtle one:
tracedSyscallEnter. It is set when GoSysCall is traced, indicating on
the goroutine that the tracer is aware of the syscall. Later, if
traceGoSysExit is called, the tracer knows its safe to emit an event
because the tracer is aware of the syscall.
This flag is then also set at trace start, when it emits EvGoInSyscall,
which again, lets the goroutine know the tracer is aware of its syscall.
The flag is cleared by GoSysExit to indicate that the tracer is no
longer aware of any syscalls on the goroutine. It's also cleared by
trace start. This is necessary because a syscall may have been started
while a trace was stopping. If the GoSysExit isn't emitted (because it
races with the trace end STW) then the flag will be left set at the
start of the next trace period, which will result in an erroneous
GoSysExit. Instead, the flag is cleared in the same way sysBlockTraced
is today: if the tracer doesn't notice the goroutine is in a syscall, it
makes that explicit to the goroutine.
A more direct flag to use here would be one that explicitly indicates
whether EvGoInSyscall or EvGoSysBlock specifically were already emitted
for a goroutine. The reason why we don't just do this is because setting
the flag when EvGoSysBlock is emitted would be racy: EvGoSysBlock is
emitted by whatever thread is stealing the P out from under the
syscalling goroutine, so it would need to synchronize with the goroutine
its emitting the event for.
The end result of all this is that the new flag can be managed entirely
within trace.go, hiding another implementation detail about the tracer.
Tested with `stress ./trace.test -test.run="TestTraceStressStartStop"`
which was occasionally failing before the CL in which sysBlockTraced was
added (CL 9132). I also confirmed also that this test is still sensitive
to `EvGoSysExit` by removing the one use of sysBlockTraced. The result
is about a 5% error rate. If there is something very subtly wrong about
how this CL emits `EvGoSysExit`, I would expect to see it as a test
failure. Instead:
53m55s: 200434 runs so far, 0 failures
Change-Id: If1d24ee6b6926eec7e90cdb66039a5abac819d9b
Reviewed-on: https://go-review.googlesource.com/c/go/+/494715
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Just another step to hiding implementation details.
Change-Id: I71b7cc522d18c23f03a9bf32e428279e62b39a89
Reviewed-on: https://go-review.googlesource.com/c/go/+/494192
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
More tightening up of the tracer's interface.
Change-Id: I992141c7f30e5c2d5d77d1fcd6817d35bc6e5f6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/494191
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
More tightening up of the tracer's interface.
While we're here, clarify why waittraceskip isn't included by explaining
what the wait* fields in the M are really for.
Change-Id: I0e7b4cac79fb77a7a0b3ca6b6cc267668e3610bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/494190
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
More tightening up of the tracer's interface.
This increases the size of each G very slightly, which isn't great, but
we stay within the same size class, so actually memory use will be
unchanged.
Change-Id: I7d1f5798edcf437c212beb1e1a2619eab833aafb
Reviewed-on: https://go-review.googlesource.com/c/go/+/494188
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
In the interest of further cleaning up the trace.go API, move the trace
logic in oneNewExtraM into its own function.
Change-Id: I5cf478cb8cd0d301ee3b068347ed48ce768b8882
Reviewed-on: https://go-review.googlesource.com/c/go/+/494186
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This was accidentally left behind when moving the logic to set the skip
sentinel in pcBuf to the caller.
Change-Id: Id7565f6ea4df6b32cf18b99c700bca322998d182
Reviewed-on: https://go-review.googlesource.com/c/go/+/489095
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
|
|
Also, document traceEvFutileWakeup as not currently used.
Change-Id: I75831a43d39b6c6ceb5a9b6320c3ae9455681572
Reviewed-on: https://go-review.googlesource.com/c/go/+/494184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Change-Id: I0b123e65f40570caeee611679d80dc27034d5a52
Reviewed-on: https://go-review.googlesource.com/c/go/+/494183
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
This change introduces the trivial traceEnabled function to help tighten
up the execution tracer's API in preparation for the execution trace
redesign GOEXPERIMENT.
A follow-up change will refactor the runtime to use it.
Change-Id: I19c8728e30aefe543b4a826d95446affa14897e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/494180
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change is in service of hiding more execution trace implementation
details for big changes to come.
Change-Id: I49b9716a7bf285d23c86b58912a05eff4ddc2213
Reviewed-on: https://go-review.googlesource.com/c/go/+/494182
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts CL 481617.
Reason for revert: breaks test build on Windows
Change-Id: Ifc1a323b0cc521e7a5a1f7de7b3da667f5fee375
Reviewed-on: https://go-review.googlesource.com/c/go/+/494377
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The previous name was wrong due to the mistaken assumption that calling
f->g->getcallerpc and f->g->getcallersp would respectively return the
pc/sp at g. However, they are actually referring to their caller's
caller, i.e. f.
Rename getcallerfp to getfp in order to stay consistent with this
naming convention.
Also see discussion on CL 463835.
For #16638
Change-Id: I07990645da78819efd3db92f643326652ee516f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/481617
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|