| Age | Commit message (Collapse) | Author |
|
Change-Id: Ibff8e7a08090d67d9a0625dee068c8d2eefbd190
GitHub-Last-Rev: b90b986121314615cf5d259d76a7be59303de1b0
GitHub-Pull-Request: golang/go#78630
Reviewed-on: https://go-review.googlesource.com/c/go/+/765480
Reviewed-by: Florian Lehner <lehner.florian86@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The memProfCycle struct holds allocation counts and bytes allocated, and
frees and bytes freed. But the memory profile records are already
aggregated by allocation size, which is stored in the "size" field of
the bucket struct. We can derive the bytes allocated/freed using the
counts and the size we already store. Thus we can delete the bytes
fields from memProfCycle, saving 64 bytes per memRecord.
We can do something similar for the profilerecord.MemProfileRecord type.
We just need to know the object size and we can derive the allocated and
freed bytes accordingly.
Change-Id: I103885c2f29471b25283e330674fc16d6a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/760140
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
CL 688335 accidentally introduced a blank line between the Profile doc
comment and the type definition, causing the entire doc to get dropped.
Change-Id: I97b1c0e57d142d7caea6e543a0138ed6dcd1c3fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/743660
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
|
|
Issue #76540 reports failures in this test from the leaked goroutine
count being too small. The test makes an effort to wait for the
goroutines to leak, but there's no guarantee.
This change instead changes TestGoroutineLeakProfileConcurrency to wait
for the number of leaked goroutines to reach at least the minimum
expected before proceeding. This deflakes this particular issue.
The downside of this change is that a failure to detect leaked
goroutines due to a bug will lead to a timeout instead of an instant
failure, but this change makes an effort to log and report that it was
waiting for the goroutines to leak and timed out for that reason, so at
least the failure is more obvious.
Overall, this is still better than random flakes.
While we're here, I also make some minor stylistic changes and document
the helper functions a little more. I also noticed that checkFrames was
using the wrong *testing.T, so this change fixes that too.
Fixes #76540.
Change-Id: I0508855dc39b91f8c6b72d059ce88dbfc68fe748
Reviewed-on: https://go-review.googlesource.com/c/go/+/729280
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Copy LabelSet to an internal package as label.Set, and include (escaped)
labels within goroutine stack dumps.
Labels are added to the goroutine header as quoted key:value pairs, so
the line may get long if there are a lot of labels.
To handle escaping, we add a printescaped function to the
runtime and hook it up to the print function in the compiler with a new
runtime.quoted type that's a sibling to runtime.hex. (in fact, we
leverage some of the machinery from printhex to generate escape
sequences).
The escaping can be improved for printable runes outside basic ASCII
(particularly for languages using non-latin stripts). Additionally,
invalid UTF-8 can be improved.
So we can experiment with the output format make this opt-in via a
a new tracebacklabels GODEBUG var.
Updates #23458
Updates #76349
Change-Id: I08e78a40c55839a809236fff593ef2090c13c036
Reviewed-on: https://go-review.googlesource.com/c/go/+/694119
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
|
|
The CPU profiler reader goroutine has a hard-coded 100ms sleep between
reads of the CPU profile sample buffer. This is done because waking up
the CPU profile reader is not signal-safe on some platforms. As a
consequence, stopping the profiler takes 200ms (one iteration to read
the last samples and one to see the "eof"), and on many-core systems the
reader does not wake up frequently enought to keep up with incoming
data.
This CL removes the sleep where it is safe to do so, following a
suggestion by Austin Clements in the comments on CL 445375. We let the
reader fully block, and wake up the reader when the buffer is over
half-full.
Fixes #63043
Updates #56029
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-linux-arm64-longtest,gotip-linux-386-longtest
Change-Id: I9f7e7e9918a4a6f16e80f6aaf33103126568a81f
Reviewed-on: https://go-review.googlesource.com/c/go/+/610815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
|
|
The goroutine leak profile tests currently rely on a function being
inlined, which results in a slightly different representation in the
pprof proto. This function is not inlined on the noopt builder.
Disable inlining of the one function which could be inlined to align but
the regular and noopt builder versions of this test. We're not
interested in testing the inlining functionality of profiles with this
test, we care about certain stack frames appearing in a certain order.
This also simplifies a bunch of the checking in the test.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-noopt
Change-Id: I28902cc4c9fae32d1e3fa41b93b00c3be4d6074a
Reviewed-on: https://go-review.googlesource.com/c/go/+/720100
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Cleaned up goroutine leak profile extraction:
- removed the acquisition of goroutineProfile semaphore
- inlined the call to saveg when recording stacks instead of using
doRecordGoroutineProfile, which had side-effects over
goroutineProfile fields.
Added regression tests for goroutine leak profiling frontend for binary
and debug=1 profile formats.
Added stress tests for concurrent goroutine and goroutine leak profile requests.
Change-Id: I55c1bcef11e9a7fb7699b4c5a2353e594d3e7173
GitHub-Last-Rev: 5e9eb3b1d80c4d2d9b668a01f6b39a7b42f7bb45
GitHub-Pull-Request: golang/go#76045
Reviewed-on: https://go-review.googlesource.com/c/go/+/714580
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
This change mechanically replaces all occurrences of interface{}
by 'any' (where deemed safe by the 'any' modernizer) throughout
std and cmd, minus their vendor trees.
Since this fix is relatively numerous, it gets its own CL.
Also, 'go generate go/types'.
Change-Id: I14a6b52856c3291c1d27935409bca8d5fd4242a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/719702
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Alan Donovan <adonovan@google.com>
|
|
I think the original depth-1 argument to allocDeep was correct.
Reverted that, and also the change to maxSkip in mprof.go, which was
also incorrect. I think before we were usually passing accidentally in
the loop over matched stacks when we really should usually have been
passing in the previous loop.
Change-Id: I6a6a696463e2baf045b66f418d7afbfcb49258e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/712100
Reviewed-by: Michael Matloob <matloob@google.com>
TryBot-Bypass: Michael Matloob <matloob@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
The test has been causing a lot of flakes on the builders. Skip it while
I'm debugging it.
For #74029
Change-Id: I6a6a696450c23f65bc310a2d0ab61b22dba88f00
Reviewed-on: https://go-review.googlesource.com/c/go/+/712060
TryBot-Bypass: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This change replaces most occurrences (in code as well as in comments) of
errors.As with errors.AsType. It leaves the errors package and vendored
code untouched.
Change-Id: I3bde73f318a0b408bdb8f5a251494af15a13118a
GitHub-Last-Rev: 8aaaa36a5a12d2a6a90c6d51680464e1a3115139
GitHub-Pull-Request: golang/go#75698
Reviewed-on: https://go-review.googlesource.com/c/go/+/708495
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change creates calls to size-specialized malloc functions instead
of calls to newObject when we know the size of the allocation at
compilation time. Most of it is a matter of calling the newObject
function (which will create calls to the size-specialized functions)
rather then the newObjectNonSpecialized function (which won't). In the
newHeapaddr, small, non-pointer case, we'll create a non specialized
newObject and transform that into the appropriate size-specialized
function when we produce the mallocgc in flushPendingHeapAllocations.
We have to update some of the rewrites in generic.rules to also apply to
the size-specialized functions when they apply to newObject.
The messiest thing is we have to adjust the offset we use to save the
memory profiler stack, because the depth of the call to profilealloc is
two frames fewer in the size-specialized malloc functions compared to
when newObject calls mallocgc. A bunch of tests have been adjusted to
account for that.
Change-Id: I6a6a6964c9037fb6719e392c4a498ed700b617d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/707856
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This change improves the concrete type analysis in the devirtualizer,
it not longer relies on ir.Reassigned, it now statically tries to
determine the concrete type of an interface, even when assigned
multiple times, following type assertions and iface conversions.
Alternative to CL 649195
Updates #69521
Fixes #64824
Change-Id: Ib1656e19f3619ab2e1e6b2c78346cc320490b2af
GitHub-Last-Rev: e8fa0b12f0a7b1d7ae00e5edb54ce04d1f702c09
GitHub-Pull-Request: golang/go#71935
Reviewed-on: https://go-review.googlesource.com/c/go/+/652036
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Proposal #74609
Change-Id: I97a754b128aac1bc5b7b9ab607fcd5bb390058c8
GitHub-Last-Rev: 60f2a192badf415112246de8bc6c0084085314f6
GitHub-Pull-Request: golang/go#74622
Reviewed-on: https://go-review.googlesource.com/c/go/+/688335
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: t hepudds <thepudds1460@gmail.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
There are a few places in the code which checks that the running kernel
is greater than or equal to x.y. The check takes a few lines and the
checking code is somewhat distracting.
Let's abstract this check into a simple function, KernelVersionGE,
and convert the users accordingly.
Add a test case (I'm not sure it has much value, can be dropped).
Change-Id: I8ec91dcc7452363361f95e46794701c0ae57d956
Reviewed-on: https://go-review.googlesource.com/c/go/+/700796
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
|
|
Also CL 690655 for golang.org/x/sys.
For #71671
Change-Id: Iceb369dec5affb944a39d07cdabfd7add6f1f319
Reviewed-on: https://go-review.googlesource.com/c/go/+/648795
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Before CL 650697, there was only one system goroutine that could
dynamically change between being a user goroutine and a system
goroutine, and that was the finalizer/cleanup goroutine. In goroutine
profiles, it was handled explicitly. It's status would be checked during
the first STW, and its stack would be recorded. This let the goroutine
profiler completely ignore system goroutines once the world was started
again.
CL 650697 added dedicated cleanup goroutines (there may be more than
one), and with this, the logic for finalizer goroutines no longer
scaled. In that CL, I let the isSystemGoroutine check be dynamic and
dropped the special case, but this was based on incorrect assumptions.
Namely, it's possible for the scheduler to observe, for example, the
finalizer goroutine as a system goroutine and ignore it, but then later
the goroutine profiler itself sees it as a user goroutine. At that point
it's too late and already running. This violates the invariant of the
goroutine profile that all goroutines are handled by the profiler before
they start executing. In practice, the result is that the goroutine
profiler can crash when it checks this invariant (not checking the
invariant means racily reading goroutine stack memory).
The root cause of the problem is that these system goroutines do not
participate in the goroutine profiler's state machine. Normally, when
profiling, goroutines transition from 'absent' to 'in-progress' to
'satisfied'. However with system goroutines, the state machine is
ignored entirely. They always stay in the 'absent' state. This means
that if a goroutine transitions from system to user, it is eligible for
a profile record when it shouldn't be. That transition shouldn't be
allowed to occur with respect to the goroutine profiler, because the
goroutine profiler is trying to snapshot the state of every goroutine.
The fix to this problem is simple: don't ignore system goroutines. Let
them participate in the goroutine profile state machine. Instead, decide
whether or not to record the stack after the goroutine has been acquired
for goroutine profiling. This means if the scheduler observes the
finalizer goroutine as a system goroutine, it will get promoted in the
goroutine profiler's state machine, and no other part of the goroutine
profiler will observe the goroutine again. Simultaneously, the
stack record for the goroutine will be correctly skipped.
Fixes #74090.
Change-Id: Icb9a164a033be22aaa942d19e828e895f700ca74
Reviewed-on: https://go-review.googlesource.com/c/go/+/680477
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL fixes a number of (all true positive) findings of vet's
copylock analyzer patched to treat the Bu{ff,uild}er types
as non-copyable after first use.
This does require imposing an additional indirection
between noder.writer and Encoder since the field is
embedded by value but its constructor now returns a pointer.
Updates golang/go#25907
Updates golang/go#47276
Change-Id: I0b4d77ac12bcecadf06a91709e695365da10766c
Reviewed-on: https://go-review.googlesource.com/c/go/+/635339
Reviewed-by: Robert Findley <rfindley@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Alan Donovan <adonovan@google.com>
|
|
TestMutexBlockFullAggregation aggregates stacks by function, file, and
line number. But there can be multiple function calls on the same line,
giving us different sequences of PCs. This causes the test to spuriously
fail in some cases. Include PCs in the stacks for this test.
Also pick up a small "range over int" modernize suggestion while we're
looking at the test.
Fixes #73641
Change-Id: I50489e19fcf920e27b9eebd9d4b35feb89981cbc
Reviewed-on: https://go-review.googlesource.com/c/go/+/673115
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #73107
Change-Id: I41f3e1bd1fdaca2f0e94151b2320bd569e258a51
Reviewed-on: https://go-review.googlesource.com/c/go/+/671576
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change splits the finalizer and cleanup queues and implements a new
lock-free blocking queue for cleanups. The basic design is as follows:
The cleanup queue is organized in fixed-sized blocks. Individual cleanup
functions are queued, but only whole blocks are dequeued.
Enqueuing cleanups places them in P-local cleanup blocks. These are
flushed to the full list as they get full. Cleanups can only be enqueued
by an active sweeper.
Dequeuing cleanups always dequeues entire blocks from the full list.
Cleanup blocks can be dequeued and executed at any time.
The very last active sweeper in the sweep phase is responsible for
flushing all local cleanup blocks to the full list. It can do this
without any synchronization because the next GC can't start yet, so we
can be very certain that nobody else will be accessing the local blocks.
Cleanup blocks are stored off-heap because the need to be allocated by
the sweeper, which is called from heap allocation paths. As a result,
the GC treats cleanup blocks as roots, just like finalizer blocks.
Flushes to the full list signal to the scheduler that cleanup goroutines
should be awoken. Every time the scheduler goes to wake up a cleanup
goroutine and there were more signals than goroutines to wake, it then
forwards this signal to runtime.AddCleanup, so that it creates another
goroutine the next time it is called, up to gomaxprocs goroutines.
The signals here are a little convoluted, but exist because the sweeper
and the scheduler cannot safely create new goroutines.
For #71772.
For #71825.
Change-Id: Ie839fde2b67e1b79ac1426be0ea29a8d923a62cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/650697
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Go 1.22 promised to remove the setting in a future release once the
semantics of runtime-internal lock contention matched that of
sync.Mutex. That work is done, remove the setting.
Previously reviewed as https://go.dev/cl/585639.
For #66999
Change-Id: I9fe62558ba0ac12824874a0bb1b41efeb7c0853f
Reviewed-on: https://go-review.googlesource.com/c/go/+/668995
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
To simplify the code.
Change-Id: Ib1af5009cc25bb29fd26fdb7b29ff4579f0150aa
GitHub-Last-Rev: f698a8a771ac8c6ecb745ea4c27a7c677c1789d1
GitHub-Pull-Request: golang/go#73255
Reviewed-on: https://go-review.googlesource.com/c/go/+/663735
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ib3be7cee6ca6dce899805aac176ca789eb2fd0f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/661738
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Sean Liao <sean@liao.dev>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Instead of always allocating variable-sized "make" calls on the heap,
allocate a small, constant-sized array on the stack and use that array
as the backing store if it is big enough.
Requires the result of the "make" doesn't escape.
if cap <= K {
var arr [K]E
slice = arr[:len:cap]
} else {
slice = makeslice(E, len, cap)
}
Pretty conservatively for now, K = 32/sizeof(E). The slice header is
already 24 bytes, so wasting 32 bytes of stack if the requested size
is too big isn't that bad. Larger would waste more stack space but
maybe avoid more allocations.
This CL also requires the element type be pointer-free. Maybe we
could relax that at some point, but it is hard. If the element type
has pointers we can get heap->stack pointers (in the case where the
requested size is too big and the slice is heap allocated).
Note that this only handles the case of makeslice called directly from
compiler-generated code. It does not handle slices built in the
runtime on behalf of the program (e.g. in growslice). Some of those
are currently handled by passing in a tmpBuf (e.g. concatstrings),
but we could probably do more.
Change-Id: I8378efad527cd00d25948a80b82a68d88fbd93a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/653856
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Users see this frame in tracebacks and it serves as a hint that what is
running here is a finalizer or cleanup. But runfinq is a rather dense
name. We can give it a more obvious name to help users realize what it
is.
For #73011.
Change-Id: I6a6a636ce9a493fd00d4b4c60c23f2b1c96d3568
Reviewed-on: https://go-review.googlesource.com/c/go/+/660296
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
CL 658035 added TestCgoCallbackPprof, which is consistently failing on
solaris. runtime/pprof maintains a list of platforms where CPU profiling
does not work properly. Since this test requires CPU profiling, skip the
this test on those platforms.
For #72870.
Fixes #72876.
Change-Id: I6a6a636cbf6b16abcbba8771178fe1d001be9d9b
Reviewed-on: https://go-review.googlesource.com/c/go/+/658415
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
We've been slowly moving packages from runtime/internal to
internal/runtime. For now, runtime/internal only has test packages.
It's a good chance to clean up the references to runtime/internal
in the toolchain.
For #65355.
Change-Id: Ie6f9091a44511d0db9946ea6de7a78d3afe9f063
GitHub-Last-Rev: fad32e2e81d11508e734c3c3d3b0c1da583f89f5
GitHub-Pull-Request: golang/go#72137
Reviewed-on: https://go-review.googlesource.com/c/go/+/655515
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
this does result in a little bit more inlining,
cmd/compile text is 0.5% larger,
bent-benchmark text geomeans grow by only 0.02%.
some of our tests make assumptions about inlining.
Change-Id: I999d1798aca5dc64a1928bd434258a61e702951a
Reviewed-on: https://go-review.googlesource.com/c/go/+/655157
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
Heap profiles hide "runtime" frames like runtime.mapassign. This broke
in 1.24 because the map implementation moved to internal/runtime/maps,
and runtime/pprof only considered literal "runtime." when looking for
runtime frames.
It would be nice to use cmd/internal/objabi.PkgSpecial to find runtime
packages, but that is hidden away in cmd.
Fixes #71174.
Change-Id: I6a6a636cb42aa17539e47da16854bd3fd8cb1bfe
Reviewed-on: https://go-review.googlesource.com/c/go/+/641775
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This change catches an additional error message to trigger skipping
the test when the underlying system is failing.
Fixes #62352
Change-Id: I5c12b20f3e9023597ff89fc905c0646a80ec4811
Reviewed-on: https://go-review.googlesource.com/c/go/+/637995
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I07e7c8eaa5bd4bac0d576b2f2f4cd3f81b0b77a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/630055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
Switch labelMap from map[string]string to use LabelSet as a data
structure. Optimize Labels() for the case where the keys are given in
sorted order without duplicates.
This is primarily motivated by reducing the overhead of distributed
tracing systems that use pprof labels. We have encountered cases where
users complained about the overhead relative to the rest of our
distributed tracing library code. Additionally, we see this as an
opportunity to free up hundreds of CPU cores across our fleet.
A secondary motivation is eBPF profilers that try to access pprof
labels. The current map[string]string requires them to implement Go map
access in eBPF, which is non-trivial. With the enablement of swiss maps,
this complexity is only increasing. The slice data structure introduced
in this CL will greatly lower the implementation complexity for eBPF
profilers in the future. But to be clear: This change does not imply
that the pprof label mechanism is now a stable ABI. They are still an
implementation detail and may change again in the future.
goos: darwin
goarch: arm64
pkg: runtime/pprof
cpu: Apple M1 Max
│ baseline.txt │ patch1.txt │
│ sec/op │ sec/op vs base │
Labels/set-one-10 153.50n ± 3% 75.00n ± 1% -51.14% (p=0.000 n=10)
Labels/merge-one-10 187.8n ± 1% 128.8n ± 1% -31.42% (p=0.000 n=10)
Labels/overwrite-one-10 193.1n ± 2% 102.0n ± 1% -47.18% (p=0.000 n=10)
Labels/ordered/set-many-10 502.6n ± 4% 146.1n ± 2% -70.94% (p=0.000 n=10)
Labels/ordered/merge-many-10 516.3n ± 2% 238.1n ± 1% -53.89% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 569.3n ± 4% 247.6n ± 2% -56.51% (p=0.000 n=10)
Labels/unordered/set-many-10 488.9n ± 2% 308.3n ± 3% -36.94% (p=0.000 n=10)
Labels/unordered/merge-many-10 523.6n ± 1% 258.5n ± 1% -50.64% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 571.4n ± 1% 412.1n ± 2% -27.89% (p=0.000 n=10)
geomean 366.8n 186.9n -49.05%
│ baseline.txt │ patch1b.txt │
│ B/op │ B/op vs base │
Labels/set-one-10 424.0 ± 0% 104.0 ± 0% -75.47% (p=0.000 n=10)
Labels/merge-one-10 424.0 ± 0% 200.0 ± 0% -52.83% (p=0.000 n=10)
Labels/overwrite-one-10 424.0 ± 0% 136.0 ± 0% -67.92% (p=0.000 n=10)
Labels/ordered/set-many-10 1344.0 ± 0% 392.0 ± 0% -70.83% (p=0.000 n=10)
Labels/ordered/merge-many-10 1184.0 ± 0% 712.0 ± 0% -39.86% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 1056.0 ± 0% 712.0 ± 0% -32.58% (p=0.000 n=10)
Labels/unordered/set-many-10 1344.0 ± 0% 712.0 ± 0% -47.02% (p=0.000 n=10)
Labels/unordered/merge-many-10 1184.0 ± 0% 712.0 ± 0% -39.86% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 1.031Ki ± 0% 1.008Ki ± 0% -2.27% (p=0.000 n=10)
geomean 843.1 405.1 -51.95%
│ baseline.txt │ patch1b.txt │
│ allocs/op │ allocs/op vs base │
Labels/set-one-10 5.000 ± 0% 3.000 ± 0% -40.00% (p=0.000 n=10)
Labels/merge-one-10 5.000 ± 0% 5.000 ± 0% ~ (p=1.000 n=10) ¹
Labels/overwrite-one-10 5.000 ± 0% 4.000 ± 0% -20.00% (p=0.000 n=10)
Labels/ordered/set-many-10 8.000 ± 0% 3.000 ± 0% -62.50% (p=0.000 n=10)
Labels/ordered/merge-many-10 8.000 ± 0% 5.000 ± 0% -37.50% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 7.000 ± 0% 4.000 ± 0% -42.86% (p=0.000 n=10)
Labels/unordered/set-many-10 8.000 ± 0% 4.000 ± 0% -50.00% (p=0.000 n=10)
Labels/unordered/merge-many-10 8.000 ± 0% 5.000 ± 0% -37.50% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 7.000 ± 0% 5.000 ± 0% -28.57% (p=0.000 n=10)
geomean 6.640 4.143 -37.60%
¹ all samples are equal
Change-Id: Ie68e960a25c2d97bcfb6239dc481832fa8a39754
Reviewed-on: https://go-review.googlesource.com/c/go/+/574516
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This PR will use test.Skip to bypass a test run for which the vmmap
subprocess appears to hang before the test times out.
In addition it catches a different error message from vmmap that can
occur due to transient resource shortages and triggers a retry for
this additional case.
Fixes #62352
Change-Id: I3ae749e5cd78965c45b1b7c689b896493aa37ba0
Reviewed-on: https://go-review.googlesource.com/c/go/+/560935
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #65328
Change-Id: I11242be93a95e117a6758ac037e143c3b38aa71c
Reviewed-on: https://go-review.googlesource.com/c/go/+/597980
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
Add several benchmarks for pprof labels to analyze the impact of
follow-up CLs.
Change-Id: Ifae39cfe83ec93858fce9e3af6c1be024ba76736
Reviewed-on: https://go-review.googlesource.com/c/go/+/574515
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
The TestProfilerStackDepth/heap test can spuriously fail if the profiler
happens to capture a stack with an allocation several frames deep into
runtime code. The pprof API hides runtime frames at the leaf-end of
stacks, but those frames still count against the profiler's stack depth
limit. The test checks only the first stack it finds with the desired
prefix and fails if it's not deep enough or doesn't have the right root
frame. So it can fail in that scenario, even though the implementation
isn't really broken.
Relax the test to check that there is at least one stack with desired
prefix, depth, and root frame.
Fixes #70112
Change-Id: I337fb3cccd1ddde76530b03aa1ec0f9608aa4112
Reviewed-on: https://go-review.googlesource.com/c/go/+/623998
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
First, skip all the allocation count tests.
In some cases this aligns with existing skips for -race, but in others
we've got new issues. These are debug modes, so some performance loss is
expected, and this is clearly no worse than today where the tests fail.
Next, skip internal linking and static linking tests for msan and asan.
With asan we get an explicit failure that neither are supported by the C
and/or Go compilers. With msan, we only get the Go compiler telling us
internal linking is unavailable. With static linking, we segfault
instead. Filed #70080 to track that.
Next, skip some malloc tests with asan that don't quite work because of
the redzone.
This is because of some sizeclass assumptions that get broken with the
redzone and the fact that the tiny allocator is effectively disabled
(again, due to the redzone).
Next, skip some runtime/pprof tests with asan, because of extra
allocations.
Next, skip some malloc tests with asan that also fail because of extra
allocations.
Next, fix up memstats accounting for arenas when asan is enabled. There
is a bug where more is added to the stats than subtracted. This also
simplifies the accounting a little.
Next, skip race tests with msan or asan enabled; they're mutually
incompatible.
Fixes #70054.
Fixes #64256.
Fixes #64257.
For #70079.
For #70080.
Change-Id: I99c02a0b9d621e44f1f918b307aa4a4944c3ec60
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15
Reviewed-on: https://go-review.googlesource.com/c/go/+/622855
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
|
|
Goroutine profiles require checking in with the profiler before any
goroutine starts running. coroswitch is a place where a goroutine may
start running, but where we do not check in with the profiler, which
leads to crashes. Fix this by checking in with the profiler the same way
execute does.
Fixes #69998.
Change-Id: Idef6dd31b70a73dd1c967b56c307c7a46a26ba73
Reviewed-on: https://go-review.googlesource.com/c/go/+/622016
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This reverts commit 5b0f8596b766afae9dd1f117a4a5dcfbbf1b80f1.
Reason for revert: This CL breaks gotip-linux-amd64-noopt builder.
Change-Id: I3950211f05c90e4955c0785409b796987741a9f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617715
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
While working on CL 611241 and CL 616375, I introduced a bug that wasn't
caught by any test. CL 611241 added more inline expansion at sample time
for block/mutex profile stacks collected via frame pointer unwinding.
CL 616375 then changed how inline expansion for those stacks is done at
reporting time. So some frames passed through multiple rounds of inline
expansion, and this lead to duplicate stack frames in some cases. The
stacks from TestBlockMutexProfileInlineExpansion looked like
sync.(*Mutex).Unlock
runtime/pprof.inlineF
runtime/pprof.inlineE
runtime/pprof.inlineD
runtime/pprof.inlineD
runtime.goexit
after those two CLs, and in particular after CL 616375. Note the extra
inlineD frame. The test didn't catch that since it was only looking for
a few frames in the stacks rather than checking the entire stacks.
This CL makes that test stricter by checking the entire expected stacks
rather than just a portion of the stacks.
Change-Id: I0acc739d826586e9a63a081bb98ef512d72cdc9a
Reviewed-on: https://go-review.googlesource.com/c/go/+/617235
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Over the years we've had various bugs in pprof stack handling resulting
in appendLocsForStack crashing because stk is too short for a cached
location. i.e., the cached location claims several inlined frames. Those
should always appear together in stk. If some frames are missing from
stk, appendLocsForStack.
If we find this case, replace the slice out of bounds panic with an
explicit panic that contains more context.
Change-Id: I52725a689baf42b8db627ce3e1bc6c654ef245d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617135
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fix a regression introduced in CL 572396 causing goroutine stacks not
getting null terminated.
This bug impacts callers that reuse the []StackRecord slice for multiple
calls to GoroutineProfile. See https://github.com/felixge/fgprof/issues/33
for an example of the problem.
Add a test case to prevent similar regressions in the future. Use null
padding instead of null termination to be consistent with other profile
types and because it's less code to implement. Also fix the
ThreadCreateProfile code path.
Fixes #69243
Change-Id: I0b9414f6c694c304bc03a5682586f619e9bf0588
Reviewed-on: https://go-review.googlesource.com/c/go/+/609815
Reviewed-by: Tim King <taking@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Fix a regression introduced in CL 598515 causing runtime.MutexProfile
stack traces to omit their root frames.
In most cases this was merely causing the `runtime.goexit` frame to go
missing. But in the case of runtime._LostContendedRuntimeLock, an empty
stack trace was being produced.
Add a test that catches this regression by checking for a stack trace
with the `runtime.goexit` frame.
Also fix a separate problem in expandFrame that could cause
out-of-bounds panics when profstackdepth is set to a value below 32.
There is no test for this fix because profstackdepth can't be changed at
runtime right now.
Fixes #69335
Change-Id: I1600fe62548ea84981df0916d25072c3ddf1ea1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/611615
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I013aae68f47d7a37deb44097f80a213d8c7976bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/612655
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Mutex contention events with delay of 0 need more than CL 604355 added:
When deciding which event to store in the M's single available slot,
always choose to drop the zero-delay event. Store an explicit flag for
whether we have an event to store, rather than relying on a non-zero
delay.
And, fix a test of sync.Mutex contention that expects those events to
have non-zero delay. The reporting of non-runtime contention like this
has long allowed zero-delay events, which we see when cputicks has low
resolution.
Fixes #68892
Fixes #68906
Change-Id: Id412141e4eb09724f3ce195899a20d59c92d7b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/606115
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Replace reflect.DeepEqual with slices.Equal/maps.Equal, which is
much faster.
Also remove some unecessary helper functions.
Change-Id: I3e4fa2938fed1598278c9e556cd4fa3b9ed3ad6d
GitHub-Last-Rev: 69bb43fc6e5c4a4a7d028528fe00b43db784464e
GitHub-Pull-Request: golang/go#67603
Reviewed-on: https://go-review.googlesource.com/c/go/+/587815
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
When using frame pointer unwinding, we defer frame skipping and inline
expansion for call stacks until profile reporting time. We can end up
with records which have different stacks if no frames are skipped, but
identical stacks once skipping is taken into account. Returning multiple
records with the same stack (but different values) has broken programs
which rely on the records already being fully aggregated by call stack
when returned from runtime.MutexProfile. This CL addresses the problem
by handling skipping at recording time. We do full inline expansion to
correctly skip the desired number of frames when recording the call
stack, and then handle the rest of inline expansion when reporting the
profile.
The regression test in this CL is adapted from the reproducer in
https://github.com/grafana/pyroscope-go/issues/103, authored by Tolya
Korniltsev.
Fixes #67548
This reapplies CL 595966.
The original version of this CL introduced a bounds error in
MutexProfile and failed to correctly expand inlined frames from that
call. This CL applies the original CL, fixing the bounds error and
adding a test for the output of MutexProfile to ensure the frames are
expanded properly.
Change-Id: I5ef8aafb9f88152a704034065c0742ba767c4dbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/598515
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
This reverts CL 595966.
Reason for revert: This CL contains a bug. See the comment in https://go-review.googlesource.com/c/go/+/595966/8#message-57f4c1f9570b5fe912e06f4ae3b52817962533c0
Change-Id: I48030907ded173ae20a8965bf1b41a713dd06059
Reviewed-on: https://go-review.googlesource.com/c/go/+/598219
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|