| Age | Commit message (Collapse) | Author |
|
It seems like some platforms don't build with cgo at all, like
linux/ppc64.
Change-Id: Ibe5306e79899822e77d42cb3018f9f0c9ca2604c
Reviewed-on: https://go-review.googlesource.com/c/go/+/728140
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
cgo + race is not supported on FreeBSD.
Change-Id: I38abeccaaabfcc104d1d5a077fb99646dc4be792
Reviewed-on: https://go-review.googlesource.com/c/go/+/728120
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
In #76435, it turns out that the new metric
/sched/goroutines/not-in-go:goroutines counts C threads that have called
into Go before (on Linux) as not-in-go goroutines. The reason for this
is that the M is still attached to the C thread on Linux as an
optimization, so we don't go through all the trouble of detaching the M
and, of course, decrementing nGsyscallNoP.
There's an easy fix to this accounting issue. The flag on the M,
isExtraInC, says whether a thread with an extra M attached no longer has
any Go on its (logical) stack. When we take the P from an M in this
state, we simply just don't increment nGsyscallNoP. When it calls back
into Go, we similarly skip the decrement to nGsyscallNoP.
This is more efficient than alternatives, like always updating
nGsyscallNoP in cgocallbackg, since that would add a new
read-modify-write atomic onto that fast path. It does mean we count
threads in C with a P still attached as not-in-go, but this is transient in
most real programs, assuming the thread indeed does not call back into
Go any time soon.
Fixes #76435.
Change-Id: Id05563bacbe35d3fae17d67fb5ed45fa43fa0548
Reviewed-on: https://go-review.googlesource.com/c/go/+/726964
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
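
The metric discussed in the commit message above can be observed from user code through runtime/metrics. The following is a minimal sketch, not the runtime's own test: the helper name readUint64 is hypothetical, and the metric only exists on toolchains new enough to ship it, so the helper reports whether the name is supported rather than assuming it is.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readUint64 reads one uint64-valued metric by name. The second result is
// false when the running toolchain does not support the metric (the sample
// comes back with a kind other than KindUint64).
func readUint64(name string) (uint64, bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return sample[0].Value.Uint64(), true
}

func main() {
	const name = "/sched/goroutines/not-in-go:goroutines"
	if n, ok := readUint64(name); ok {
		fmt.Printf("%s = %d\n", name, n)
	} else {
		fmt.Printf("%s is not supported by this toolchain\n", name)
	}
}
```

Checking the value kind before calling Uint64 matters because Value.Uint64 panics on a sample of any other kind, including an unsupported metric.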
|
|
This change mechanically replaces all occurrences of interface{}
by 'any' (where deemed safe by the 'any' modernizer) throughout
std and cmd, minus their vendor trees.
Since these changes are numerous, they get their own CL.
Also, 'go generate go/types'.
Change-Id: I14a6b52856c3291c1d27935409bca8d5fd4242a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/719702
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Alan Donovan <adonovan@google.com>
|
|
There are just too many flakes resulting from background pollution by
the testing package and other tests. Run in a subprocess where at least
the environment can be more tightly controlled.
Fixes #75049.
Change-Id: Iad59edaaf31268f1fcb77273f01317d963708fa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/707155
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
The two remaining popular flakes in TestReadMetricsSched are with
waiting and not-in-go goroutines. I believe the reason for both of these
is pollution due to other goroutines in the test binary, and we can be a
little bit more robust to them.
In particular, both of these tests spin until there's a particular
condition met in the metrics. Then they both immediately re-read the
metrics. The metrics being checked are not guaranteed to stay that way
in the re-read. For example, for the waiting case, it's possible for
>1000 to be waiting, but then some leftover goroutines from a previous
test wake up off of some condition, causing the number to dip again on
the re-read. This is supported by the fact that in some of these
failures, the test takes very little time to execute (the condition that
there are at least 1000 waiting goroutines is almost immediately
satisfied) but it still fails (the re-read causes us to notice that there
are <1000 waiting goroutines).
This change makes the test more robust by not performing this re-read.
This means more possible false negatives, but the error cases we're
looking for (bad metrics) should still show up with some reasonable
probability when something *is* truly wrong.
For #75049.
Change-Id: Idc3a8030bed8da51f24322efe15f3ff13da05623
Reviewed-on: https://go-review.googlesource.com/c/go/+/705875
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
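
The pattern described in the commit message above (spin until a condition holds, then judge the sample that satisfied it instead of reading again) can be sketched as follows. This is an illustration under assumptions, not the code in TestReadMetricsSched: waitFor and its parameters are hypothetical names, and the read function here is simulated rather than backed by runtime/metrics.

```go
package main

import (
	"fmt"
	"time"
)

// waitFor polls read until ok accepts the value or the timeout elapses.
// It returns the exact value that satisfied ok, so the caller never needs
// a second read that might observe a different state.
func waitFor(read func() uint64, ok func(uint64) bool, timeout time.Duration) (uint64, bool) {
	deadline := time.Now().Add(timeout)
	for {
		v := read()
		if ok(v) {
			return v, true
		}
		if time.Now().After(deadline) {
			return v, false
		}
		time.Sleep(time.Millisecond)
	}
}

func main() {
	// Simulated metric: briefly exceeds 1000, then dips again, as when
	// leftover goroutines from an earlier test wake up.
	values := []uint64{10, 1200, 5}
	i := 0
	read := func() uint64 {
		v := values[i%len(values)]
		i++
		return v
	}
	v, ok := waitFor(read, func(v uint64) bool { return v >= 1000 }, time.Second)
	fmt.Println(v, ok)
}
```

A re-read after waitFor returns would see the simulated value 5 and fail, which is the kind of flake the commit removes.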
|
|
TestReadMetricsSched/running can take some time to enter a steady state
on busy systems. We currently allow only 1 second for that; we should
let it run without a limit until it succeeds or the test times out.
Fixes #75049
Change-Id: I452059e1837caf12a2d2d9cae1f70a0ef2d4f518
Reviewed-on: https://go-review.googlesource.com/c/go/+/702295
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This is used by TestStopTheWorldDeadlock, and the new
TestReadMetricsSched test accidentally modifies this global variable
instead of (as intended) defining a local one.
Change-Id: I7aaece83f285d051ad8b56b7591c76613c39613a
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/696556
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
|
|
Fixes #15490.
Change-Id: I6ce9edc46398030ff639e22d4ca4adebccdfe1b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/690399
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
For #15490.
Change-Id: Ic587dda1f42d613ea131a6b53ce6ba6e6cadf4c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/690398
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This is largely a port of CL 38180.
For #15490.
Change-Id: I2726111e472e81e9f9f0f294df97872c2689f061
Reviewed-on: https://go-review.googlesource.com/c/go/+/690397
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
These metrics are useful for identifying finalizer and cleanup problems,
namely slow finalizers and/or cleanups holding up the queue, which can
lead to a memory leak.
Fixes #72948.
Change-Id: I1bb64a9ca751fcb462c96d986d0346e0c2894c95
Reviewed-on: https://go-review.googlesource.com/c/go/+/690396
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Go 1.22 promised to remove the setting in a future release once the
semantics of runtime-internal lock contention matched that of
sync.Mutex. That work is done, so remove the setting.
Previously reviewed as https://go.dev/cl/585639.
For #66999
Change-Id: I9fe62558ba0ac12824874a0bb1b41efeb7c0853f
Reviewed-on: https://go-review.googlesource.com/c/go/+/668995
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Have the test use the same clock (cputicks) as the profiler, and use the
test's own measurements as hard bounds on the magnitude to expect in the
profile.
Compare the depiction of two users of the same lock: one where the
critical section is fast, one where it is slow. Confirm that the profile
shows the slow critical section as a large source of delay (with #66999
fixed), rather than showing the fast critical section as a large
recipient of delay.
Previously reviewed as https://go.dev/cl/586237.
For #66999
Change-Id: Ic2d78cc29153d5322577d84abdc448e95ed8f594
Reviewed-on: https://go-review.googlesource.com/c/go/+/667616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
|
|
For #70602
Change-Id: I3f723ebc17ef690d5be7f4f948c9dd1f890196fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/658095
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Replace reflect.DeepEqual with slices.Equal/maps.Equal, which is
much faster.
Also remove some unnecessary helper functions.
Change-Id: I3e4fa2938fed1598278c9e556cd4fa3b9ed3ad6d
GitHub-Last-Rev: 69bb43fc6e5c4a4a7d028528fe00b43db784464e
GitHub-Pull-Request: golang/go#67603
Reviewed-on: https://go-review.googlesource.com/c/go/+/587815
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
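
As a rough illustration of the replacement described in the commit message above, the sketch below compares slices and maps with slices.Equal and maps.Equal. The helper names sameNames and sameCounts are hypothetical and do not come from the CL. One behavioral difference worth keeping in mind when making this swap: slices.Equal treats a nil slice and an empty slice as equal, while reflect.DeepEqual does not.

```go
package main

import (
	"fmt"
	"maps"
	"reflect"
	"slices"
)

// sameNames reports whether two string slices have the same length and
// the same elements in the same order.
func sameNames(a, b []string) bool {
	return slices.Equal(a, b)
}

// sameCounts reports whether two maps hold the same key/value pairs.
func sameCounts(a, b map[string]int) bool {
	return maps.Equal(a, b)
}

func main() {
	fmt.Println(sameNames([]string{"a", "b"}, []string{"a", "b"}))
	fmt.Println(sameCounts(map[string]int{"x": 1}, map[string]int{"x": 2}))

	// nil versus empty: the two approaches disagree here.
	var none []string
	fmt.Println(slices.Equal(none, []string{}))
	fmt.Println(reflect.DeepEqual(none, []string{}))
}
```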
|
|
This reverts commit 87e930f7289136fad1310d4b63dd4127e409bac5 (CL 585639)
Reason for revert: This is part of a patch series that changed the
handling of contended lock2/unlock2 calls, reducing the maximum
throughput of contended runtime.mutex values, and causing a performance
regression on applications where that is (or became) the bottleneck.
Updates #66999
Updates #67585
Change-Id: I1e286d2a16d16e4af202cd5dc04b2d9c4ee71b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/589097
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
|
|
This reverts commit f9ba2cff2286d378eca28c841bea8488c69fc30e (CL 586237)
Reason for revert: This is part of a patch series that changed the
handling of contended lock2/unlock2 calls, reducing the maximum
throughput of contended runtime.mutex values, and causing a performance
regression on applications where that is (or became) the bottleneck.
This test verifies that the semantics of the mutex profile for
runtime.mutex values matches that of sync.Mutex values. Without the rest
of the patch series, this test would correctly identify that Go 1.22's
semantics are incorrect (issue #66999).
Updates #66999
Updates #67585
Change-Id: Id06ae01d7bc91c94054c80d273e6530cb2d59d10
Reviewed-on: https://go-review.googlesource.com/c/go/+/589096
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
As of https://go.dev/cl/586796, the runtime/metrics view of internal
mutex contention is sampled at 1 per gTrackingPeriod, rather than either
1 (immediately prior to CL 586796) or the more frequent of
gTrackingPeriod or the mutex profiling rate (Go 1.22). Thus, we no
longer have a real lower bound on the amount of contention that
runtime/metrics will report. Relax the test's expectations again.
For #64253
Change-Id: I94e1d92348a03599a819ec8ac785a0eb3c1ddd73
Reviewed-on: https://go-review.googlesource.com/c/go/+/587515
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
|
|
Have the test use the same clock (cputicks) as the profiler, and use the
test's own measurements as hard bounds on the magnitude to expect in the
profile.
Compare the depiction of two users of the same lock: one where the
critical section is fast, one where it is slow. Confirm that the profile
shows the slow critical section as a large source of delay (with #66999
fixed), rather than showing the fast critical section as a large
recipient of delay.
For #64253
For #66999
Change-Id: I784c8beedc39de564dc8cee42060a5d5ce55de39
Reviewed-on: https://go-review.googlesource.com/c/go/+/586237
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Go 1.22 promised to remove the setting in a future release once the
semantics of runtime-internal lock contention matched that of
sync.Mutex. That work is done, so remove the setting.
For #66999
Change-Id: I3c4894148385adf2756d8754e44d7317305ad758
Reviewed-on: https://go-review.googlesource.com/c/go/+/585639
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Currently freeSpan is called before large object stats are updated when
sweeping large objects. This means heapStats.inHeap might get subtracted
before the large object is added to the largeFree field. The end result
is that the /memory/classes/heap/unused:bytes metric, which subtracts
live objects (alloc-free) from inHeap may overflow.
Fix this by always updating the large object stats before calling
freeSpan.
Fixes #67019.
Change-Id: Ib02bd8dcd1cf8cd1bc0110b6141e74f678c10445
Reviewed-on: https://go-review.googlesource.com/c/go/+/583380
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
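
The overflow described in the commit message above comes from unsigned subtraction wrapping around. The sketch below reproduces only the arithmetic, assuming a simplified formula taken from the description (inHeap minus live bytes, with live bytes as alloc minus free). The function name unused and the numbers are hypothetical and are not the runtime's actual code.

```go
package main

import "fmt"

// unused mirrors the shape of the metric described above:
// inHeap minus live object bytes, where live = alloc - free.
// All values are unsigned, so a negative result wraps around.
func unused(inHeap, alloc, free uint64) uint64 {
	return inHeap - (alloc - free)
}

func main() {
	// Consistent state: a 100-byte heap that is fully live.
	fmt.Println(unused(100, 100, 0))

	// Inconsistent snapshot: a 40-byte large object's span has already
	// been subtracted from inHeap, but free has not been updated yet.
	fmt.Println(unused(60, 100, 0))

	// Consistent again once the free count includes the 40 bytes.
	fmt.Println(unused(60, 100, 40))
}
```

Updating the free count before the in-heap count, as the fix does, means a reader can at worst see a temporarily larger unused value instead of a wrapped one.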
|
|
Fully testing the runtime's profiles and metrics for contention on its
internal mutex values involves comparing two separate clocks (cputicks
for the profile and nanotime for the metric), verifying its fractional
sampling (when MutexProfileRate is greater than 1), and observing a very
small critical section outside of the test's control (semrelease).
Flakiness (#64253) from those parts of the test has led to skipping it
entirely.
But there are portions of the mutex profiling behavior that should have
more consistent behavior: for a mutex under the test's control, the test
and the runtime should be able to agree that the test successfully
induced contention, and should agree on the call stack that caused the
contention. Allow those more consistent parts to run.
For #64253
Change-Id: I7f368d3265a5c003da2765164276fab616eb9959
Reviewed-on: https://go-review.googlesource.com/c/go/+/581296
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Joedian Reid <joedian@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
|
|
This change fixes a possible race with updating metrics and reading
them. The update is intended to be protected by the world being stopped,
but here, it clearly isn't.
Fixing this lets us lower the thresholds in the metrics tests by an
order of magnitude, because the only thing we have to worry about now is
floating point error (the tests were previously written assuming the
floating point error was much higher than it actually was; that turns
out not to be the case, and this bug was the problem instead). However,
this still isn't that tight of a bound; we still want to catch any and
all problems of exactness. For this purpose, this CL adds a test to
check the source-of-truth (in uint64 nanoseconds) that ensures the
totals exactly match.
This means we unfortunately have to take another time measurement, but
for now let's prioritize correctness. A few additional nanoseconds of
STW time won't be terribly noticeable.
Fixes #66212.
Change-Id: Id02c66e8a43c13b1f70e9b268b8a84cc72293bfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/570257
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Nicolas Hillegeer <aktau@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This test was added to check new mutex profile functionality.
Specifically, it checks to make sure that the functionality behind
GODEBUG=runtimecontentionstacks works. The runtime currently tracks
contention from runtime-internal mutexes in mutex profiles, but it does
not record stack traces for them, attributing the time to a dummy
symbol. This GODEBUG enables collecting stacks.
Just disable the test. Even if this functionality breaks, it won't
affect Go users and it'll help keep the builders green. It's fine to
leave the test because this will be revisited in the next dev cycle.
For #64253.
Change-Id: I7938fe0f036fc4e4a0764f030e691e312ec2c9b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/550775
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
profileruntimelocks is new in CL 544195, but the name is deceptive. Even
with profileruntimelocks=0, runtime-internal locks are still profiled.
The actual difference is that call stacks are not collected. Instead all
contention is reported at runtime._LostContendedLock.
Rename this setting to runtimecontentionstacks to make its name more
aligned with its behavior.
In addition, for this release the default is profileruntimelocks=0,
meaning that users are fairly likely to encounter
runtime._LostContendedLock. Rename it to
runtime._LostContendedRuntimeLock in an attempt to make it more
intuitive that these are runtime locks, not locks in application code.
For #57071.
Change-Id: I38aac28b2c0852db643d53b1eab3f3bc42a43393
Reviewed-on: https://go-review.googlesource.com/c/go/+/547055
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
|
|
Most contention on the runtime locks inside semaphores is observed in
runtime.semrelease1, but it can also appear in runtime.semacquire1. When
examining contention profiles in TestRuntimeLockMetricsAndProfile, allow
call stacks that include either.
For #64253
Change-Id: Id4f16af5e9a28615ab5032a3197e8df90f7e382f
Reviewed-on: https://go-review.googlesource.com/c/go/+/544375
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Add runtime-internal locks to the mutex contention profile.
Store up to one call stack responsible for lock contention on the M,
until it's safe to contribute its value to the mprof table. Try to use
that limited local storage space for a relatively large source of
contention, and attribute any contention in stacks we're not able to
store to a sentinel _LostContendedLock function.
Avoid ballooning lock contention while manipulating the mprof table by
attributing to that sentinel function any lock contention experienced
while reporting lock contention.
Guard collecting real call stacks with GODEBUG=profileruntimelocks=1,
since the available data has mixed semantics; we can easily capture an
M's own wait time, but we'd prefer for the profile entry of each
critical section to describe how long it made the other Ms wait. It's
too late in the Go 1.22 cycle to make the required changes to
futex-based locks. When not enabled, attribute the time to the sentinel
function instead.
Fixes #57071
This is a roll-forward of https://go.dev/cl/528657, which was reverted
in https://go.dev/cl/543660
Reason for revert: de-flakes tests (reduces dependence on fine-grained
timers, correctly identifies contention on big-endian futex locks,
attempts to measure contention in the semaphore implementation but only
uses that secondary measurement to finish the test early, skips tests on
single-processor systems)
Change-Id: I31389f24283d85e46ad9ba8d4f514cb9add8dfb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/544195
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
|
|
Change-Id: I179b50ae8e73677d4d408b83424afbbfe6aa17a1
GitHub-Last-Rev: 2e2d9c1e45556155d02db4df381b99f2d1bc5c0e
GitHub-Pull-Request: golang/go#63478
Reviewed-on: https://go-review.googlesource.com/c/go/+/534015
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This reverts commit go.dev/cl/528657.
Reason for revert: broke a lot of builders.
Change-Id: I70c33062020e997c4df67b3eaa2e886cf0da961e
Reviewed-on: https://go-review.googlesource.com/c/go/+/543660
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Add runtime-internal locks to the mutex contention profile.
Store up to one call stack responsible for lock contention on the M,
until it's safe to contribute its value to the mprof table. Try to use
that limited local storage space for a relatively large source of
contention, and attribute any contention in stacks we're not able to
store to a sentinel _LostContendedLock function.
Avoid ballooning lock contention while manipulating the mprof table by
attributing to that sentinel function any lock contention experienced
while reporting lock contention.
Guard collecting real call stacks with GODEBUG=profileruntimelocks=1,
since the available data has mixed semantics; we can easily capture an
M's own wait time, but we'd prefer for the profile entry of each
critical section to describe how long it made the other Ms wait. It's
too late in the Go 1.22 cycle to make the required changes to
futex-based locks. When not enabled, attribute the time to the sentinel
function instead.
Fixes #57071
Change-Id: I3eee0ccbfc20f333b56f20d8725dfd7f3a526b41
Reviewed-on: https://go-review.googlesource.com/c/go/+/528657
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
A follow-up to https://go.dev/cl/534161 -- calls to runtime/trace.Start
and Stop synchronize with the GC, waiting for any in-progress mark phase
to complete. Disable automatic GCs to quiet the system, so we can
observe only the test's intentional pauses.
Change-Id: I6f8106c42528f9bda9afec1c151119783bbc78dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/543075
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL adds four new time histogram metrics:
/sched/pauses/stopping/gc:seconds
/sched/pauses/stopping/other:seconds
/sched/pauses/total/gc:seconds
/sched/pauses/total/other:seconds
The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "stopping" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling stopTheWorldWithSema, thus including lock
contention.
These details are minor enough that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
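
The four histograms named in the commit message above can be read through runtime/metrics. The sketch below is an illustration, not part of the CL: the helper name histogramCount is hypothetical, and it simply totals the bucket counts to show how many pauses each histogram has recorded. On a toolchain that predates these metrics, the helper reports them as unsupported.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
)

// histogramCount returns the total number of samples recorded in a
// Float64Histogram metric. The second result is false when the metric is
// not a histogram or is not supported by the running toolchain.
func histogramCount(name string) (uint64, bool) {
	sample := []metrics.Sample{{Name: name}}
	metrics.Read(sample)
	if sample[0].Value.Kind() != metrics.KindFloat64Histogram {
		return 0, false
	}
	var total uint64
	for _, c := range sample[0].Value.Float64Histogram().Counts {
		total += c
	}
	return total, true
}

func main() {
	runtime.GC() // force at least one GC cycle so GC pauses are recorded

	names := []string{
		"/sched/pauses/stopping/gc:seconds",
		"/sched/pauses/stopping/other:seconds",
		"/sched/pauses/total/gc:seconds",
		"/sched/pauses/total/other:seconds",
	}
	for _, name := range names {
		if n, ok := histogramCount(name); ok {
			fmt.Printf("%s: %d pauses recorded\n", name, n)
		} else {
			fmt.Printf("%s: not supported by this toolchain\n", name)
		}
	}
}
```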
|
|
In the existing implementation, all /gc/scan/* metrics are
always equal to 0 due to the dependency on gcStatDep not being
set. This leads to gcStatAggregate always containing zeros, and
always reporting 0 for those metrics.
Also, add a test to ensure that /gc/scan/* metrics are not empty.
Fixes #62477.
Change-Id: I67497347d50ed5c3ce1719a18714c062ec938cab
Reviewed-on: https://go-review.googlesource.com/c/go/+/525595
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
|
|
/gc/heap/live:bytes may exceed MemStats.HeapAlloc, even when all data is
flushed, because the GC may double-count objects when marking them. This
is an intentional design choice that is largely inconsequential. The
runtime is already robust to it, and the condition is rare.
Fixes #60607.
Change-Id: I4da402efc24327328d2d8780e4e49961b189f0ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/501858
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This test was introduced as a regression test for #60276. However, it
was quite flaky on a number of different platforms because there are
myriad ways the runtime can eat into time one might expect is completely
idle.
This change re-enables the test, but makes it much more resilient.
Because the issue we're testing for is persistent, we now require 10
consecutive failures to count. Any single success counts as a test
success. This change also makes the test's idle time bound more lenient,
allowing for a little bit of time to be eaten up. The regression we're
testing for results in nearly zero idle time being accounted for.
If this is still not good enough to eliminate flakes, this test should
just be deleted.
For #60276.
Fixes #60376.
Change-Id: Icd81f0c9970821b7f386f6d27c8a566fee4d0ff7
Reviewed-on: https://go-review.googlesource.com/c/go/+/498274
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
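
The retry policy described in the commit message above (any single success passes, and only 10 consecutive failures count as a failure) can be sketched as a small helper. The name passesWithRetries and the simulated flaky check are hypothetical; the actual test is structured differently.

```go
package main

import "fmt"

// passesWithRetries runs check up to maxFailures times. A single success
// ends the loop immediately; only maxFailures consecutive failures are
// treated as a real failure.
func passesWithRetries(maxFailures int, check func() bool) bool {
	for i := 0; i < maxFailures; i++ {
		if check() {
			return true
		}
	}
	return false
}

func main() {
	calls := 0
	// Hypothetical flaky check: fails three times, then succeeds.
	flaky := func() bool {
		calls++
		return calls > 3
	}
	fmt.Println(passesWithRetries(10, flaky), calls)
}
```

This policy suits a regression that is persistent when present: a real bug fails every attempt, while transient noise is unlikely to fail ten times in a row.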
|
|
Currently /gc/scan/total:bytes is computed as a separate sum. Compute it
using the same inputs so it's always consistent with the sum of
everything else in /gc/scan/*.
For #56857.
Change-Id: I43d9148a23b1d2eb948ae990193dca1da85df8a3
Reviewed-on: https://go-review.googlesource.com/c/go/+/497880
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This test is fundamentally flaky because of a mismatch between how
internal idle time is calculated and how the test expects it to be
calculated. It's unclear how to resolve this mismatch, given that it's
perfectly valid for a goroutine to remain asleep while background
goroutines (e.g. the scavenger) run. In practice, we might be able to
set some generous lower-bound, but until we can confirm that on the
affected platforms, skip the test as flaky unconditionally.
For #60276.
For #60376.
Change-Id: Iffd5c4be10cf8ae8a6c285b61fcc9173235fbb2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/497876
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently pidleget gets passed "now" from before the M goes into
netpoll, resulting in incorrect accounting of idle CPU time.
lastpoll is also stored with a stale "now": that mistake was introduced in
the same CL that introduced it for pidleget.
Recompute "now" after returning from netpoll.
Also, start tracking idle time on js/wasm at all.
Credit to Rhys Hiltner for the test case.
Fixes #60276.
Change-Id: I5dd677471f74c915dfcf3d01621430876c3ff307
Reviewed-on: https://go-review.googlesource.com/c/go/+/496183
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
For #56857
Change-Id: I7e7d2ea3e6ab59291a4cd867c680605ad75bd21f
Reviewed-on: https://go-review.googlesource.com/c/go/+/497317
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
For #56857
Change-Id: I184d752cc615874ada3d0dbc6ed1bf72c8debd0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/497316
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
For #56857
Change-Id: I0622af974783ab435e91b9fb3c1ba43f256ee4ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/497315
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: Ifaa73b64e5b6a1d37c753e2440b642478d7dfbce
Reviewed-on: https://go-review.googlesource.com/c/go/+/436957
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: hopehook <hopehook@golangcn.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
This change adds a metric to the runtime/metrics package which tracks
total mutex wait time for sync.Mutex and sync.RWMutex. The purpose of
this metric is to be able to quickly get an idea of the total mutex wait
time.
The implementation of this metric piggybacks off of the existing G
runnable tracking infrastructure, as well as the wait reason set on a G
when it goes into _Gwaiting.
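
A minimal sketch of how a program might read this metric, assuming it is
exposed under the name /sync/mutex/wait/total:seconds as a float64 (check
metrics.All() on your Go version). The helper names mutexWaitSeconds and
contend are made up for illustration.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"sync"
	"time"
)

// mutexWaitSeconds returns the cumulative mutex wait time reported by the
// runtime. ok is false if this Go version does not support the metric.
func mutexWaitSeconds() (seconds float64, ok bool) {
	s := []metrics.Sample{{Name: "/sync/mutex/wait/total:seconds"}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindFloat64 {
		return 0, false
	}
	return s[0].Value.Float64(), true
}

// contend makes a few goroutines block on one sync.Mutex for a short time.
func contend() {
	var mu sync.Mutex
	var wg sync.WaitGroup
	mu.Lock()
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // blocks until the holder unlocks below
			mu.Unlock()
		}()
	}
	time.Sleep(20 * time.Millisecond)
	mu.Unlock()
	wg.Wait()
}

func main() {
	before, ok := mutexWaitSeconds()
	if !ok {
		fmt.Println("mutex wait metric not supported by this Go version")
		return
	}
	contend()
	after, _ := mutexWaitSeconds()
	fmt.Printf("mutex wait time grew by about %.3fs\n", after-before)
}
```

The value is cumulative, so callers would normally compare two readings
rather than interpret a single one.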
Fixes #49881.
Change-Id: I4691abf64ac3574bec69b4d7d4428b1573130517
Reviewed-on: https://go-review.googlesource.com/c/go/+/427618
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change adds a breakdown for estimated CPU usage by time. These
estimates are not based on real on-CPU counters, so each metric has a
disclaimer explaining so. They can, however, be more reasonably
compared to a total CPU time metric that this change also adds.
Fixes #47216.
Change-Id: I125006526be9f8e0d609200e193da5a78d9935be
Reviewed-on: https://go-review.googlesource.com/c/go/+/404307
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Josh MacDonald <jmacd@lightstep.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
For #47216.
Change-Id: I1c2cd518e6ff510cc3ac8d8f72fd52eadcabc16c
Reviewed-on: https://go-review.googlesource.com/c/go/+/404306
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
For #47216.
Change-Id: Ib2d48c4583570a2dae9510a52d4c6ffc20161b31
Reviewed-on: https://go-review.googlesource.com/c/go/+/404305
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
Right now we export alloc count metrics via the runtime/metrics package
and mark them as monotonic, but that's not actually true. As an
optimization, the runtime assumes a span is always fully allocated
before being uncached, and updates the accounting as such. In the rare
case that it's wrong, the span has enough information to back out what
did not get allocated.
This change uses 16 bits of padding in the mspan to house another field
that represents the amount of mspan slots filled just as the mspan is
cached. This information is enough to get an exact count, allowing us
to make the metrics truly monotonic.
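
The difference between the two accounting schemes can be illustrated with
a toy model. This is not the runtime's code: the span type, its field names
(including allocAtCache) and the simulate function are all hypothetical,
and the model ignores concurrency entirely.

```go
package main

import "fmt"

// span is a toy stand-in for an mspan.
type span struct {
	nelems       int // total object slots in the span
	allocCount   int // slots actually filled
	allocAtCache int // slots filled at the moment the span was cached
}

// simulate caches a partially filled span, allocates a few objects from it,
// then uncaches it. It returns the allocation counter as an observer would
// see it after each step under both schemes.
func simulate() (optimisticSeen, exactSeen []int) {
	s := &span{nelems: 8, allocCount: 2}
	var optimistic, exact int

	// Cache: the optimistic scheme assumes every free slot will be used;
	// the exact scheme only records where it started.
	optimistic += s.nelems - s.allocCount
	s.allocAtCache = s.allocCount
	optimisticSeen = append(optimisticSeen, optimistic)
	exactSeen = append(exactSeen, exact)

	// Only three objects are actually allocated while the span is cached.
	s.allocCount += 3

	// Uncache: the optimistic scheme backs out the unused slots, which
	// makes the counter go down; the exact scheme adds the real delta.
	optimistic -= s.nelems - s.allocCount
	exact += s.allocCount - s.allocAtCache
	optimisticSeen = append(optimisticSeen, optimistic)
	exactSeen = append(exactSeen, exact)
	return optimisticSeen, exactSeen
}

func main() {
	o, e := simulate()
	fmt.Println("optimistic:", o) // prints "optimistic: [6 3]"
	fmt.Println("exact:     ", e) // prints "exact:      [0 3]"
}
```

Both schemes end at the same total, but only the second never decreases
along the way.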
Change-Id: Iaff3ca43f8745dc1bbb0232372423e014b89b920
Reviewed-on: https://go-review.googlesource.com/c/go/+/377516
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change adds four additional metrics to the runtime/metrics package
to fill in a few gaps with runtime.MemStats that were overlooked. The
biggest one is TotalAlloc, which is impossible to find with the
runtime/metrics package, but this change also adds a few others for
convenience and clarity. For instance, the total number of objects
allocated and freed are technically available via allocs-by-size and
frees-by-size, but it's
onerous to get them (one needs to sum the sample counts in the
histograms).
The four additional metrics are:
- /gc/heap/allocs:bytes -- total bytes allocated (TotalAlloc)
- /gc/heap/allocs:objects -- total objects allocated (Mallocs - [tiny])
- /gc/heap/frees:bytes -- total bytes freed (TotalAlloc-HeapAlloc)
- /gc/heap/frees:objects -- total objects freed (Frees - [tiny])
This change also updates the descriptions of allocs-by-size and
frees-by-size to be more precise.
Change-Id: Iec8c1797a584491e3484b198f2e7f325b68954a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/312431
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Currently tiny allocations are not represented in runtime/metrics, but
they're represented in MemStats (indirectly) via Mallocs. Add them to
runtime/metrics by first merging
memstats.tinyallocs into consistentHeapStats (just for simplicity; it's
monotonic so metrics would still be self-consistent if we just read it
atomically) and then adding /gc/heap/tiny/allocs:objects to the list of
supported metrics.
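
A minimal sketch of reading the new metric next to MemStats.Mallocs. The
helpers tinyAllocs and allocateTiny are hypothetical, and the program
assumes that small pointer-free objects such as int8 are served by the tiny
allocator. The reported count may lag behind recent allocations, so the
sketch only prints the values.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
)

// sink keeps the small objects reachable so they are heap allocated.
var sink []*int8

// tinyAllocs returns the number of tiny allocations reported so far. ok is
// false if the metric is unsupported by this Go version.
func tinyAllocs() (n uint64, ok bool) {
	s := []metrics.Sample{{Name: "/gc/heap/tiny/allocs:objects"}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return s[0].Value.Uint64(), true
}

// allocateTiny allocates n one-byte objects that escape to the heap.
func allocateTiny(n int) {
	for i := 0; i < n; i++ {
		p := new(int8)
		*p = int8(i)
		sink = append(sink, p)
	}
}

func main() {
	before, ok := tinyAllocs()
	if !ok {
		fmt.Println("tiny alloc metric not supported")
		return
	}
	allocateTiny(10000)
	after, _ := tinyAllocs()

	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	fmt.Println("tiny allocs before:", before)
	fmt.Println("tiny allocs after: ", after)
	fmt.Println("MemStats.Mallocs:  ", ms.Mallocs)
}
```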
Change-Id: Ie478006ab942a3e877b4a79065ffa43569722f3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/312909
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|