| Age | Commit message (Collapse) | Author |
|
Copy LabelSet to an internal package as label.Set, and include (escaped)
labels within goroutine stack dumps.
Labels are added to the goroutine header as quoted key:value pairs, so
the line may get long if there are a lot of labels.
To handle escaping, we add a printescaped function to the
runtime and hook it up to the print function in the compiler with a new
runtime.quoted type that's a sibling to runtime.hex. (in fact, we
leverage some of the machinery from printhex to generate escape
sequences).
The escaping can be improved for printable runes outside basic ASCII
(particularly for languages using non-latin stripts). Additionally,
invalid UTF-8 can be improved.
So we can experiment with the output format make this opt-in via a
a new tracebacklabels GODEBUG var.
Updates #23458
Updates #76349
Change-Id: I08e78a40c55839a809236fff593ef2090c13c036
Reviewed-on: https://go-review.googlesource.com/c/go/+/694119
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
|
|
Fixes #73107
Change-Id: I41f3e1bd1fdaca2f0e94151b2320bd569e258a51
Reviewed-on: https://go-review.googlesource.com/c/go/+/671576
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Switch labelMap from map[string]string to use LabelSet as a data
structure. Optimize Labels() for the case where the keys are given in
sorted order without duplicates.
This is primarily motivated by reducing the overhead of distributed
tracing systems that use pprof labels. We have encountered cases where
users complained about the overhead relative to the rest of our
distributed tracing library code. Additionally, we see this as an
opportunity to free up hundreds of CPU cores across our fleet.
A secondary motivation is eBPF profilers that try to access pprof
labels. The current map[string]string requires them to implement Go map
access in eBPF, which is non-trivial. With the enablement of swiss maps,
this complexity is only increasing. The slice data structure introduced
in this CL will greatly lower the implementation complexity for eBPF
profilers in the future. But to be clear: This change does not imply
that the pprof label mechanism is now a stable ABI. They are still an
implementation detail and may change again in the future.
goos: darwin
goarch: arm64
pkg: runtime/pprof
cpu: Apple M1 Max
│ baseline.txt │ patch1.txt │
│ sec/op │ sec/op vs base │
Labels/set-one-10 153.50n ± 3% 75.00n ± 1% -51.14% (p=0.000 n=10)
Labels/merge-one-10 187.8n ± 1% 128.8n ± 1% -31.42% (p=0.000 n=10)
Labels/overwrite-one-10 193.1n ± 2% 102.0n ± 1% -47.18% (p=0.000 n=10)
Labels/ordered/set-many-10 502.6n ± 4% 146.1n ± 2% -70.94% (p=0.000 n=10)
Labels/ordered/merge-many-10 516.3n ± 2% 238.1n ± 1% -53.89% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 569.3n ± 4% 247.6n ± 2% -56.51% (p=0.000 n=10)
Labels/unordered/set-many-10 488.9n ± 2% 308.3n ± 3% -36.94% (p=0.000 n=10)
Labels/unordered/merge-many-10 523.6n ± 1% 258.5n ± 1% -50.64% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 571.4n ± 1% 412.1n ± 2% -27.89% (p=0.000 n=10)
geomean 366.8n 186.9n -49.05%
│ baseline.txt │ patch1b.txt │
│ B/op │ B/op vs base │
Labels/set-one-10 424.0 ± 0% 104.0 ± 0% -75.47% (p=0.000 n=10)
Labels/merge-one-10 424.0 ± 0% 200.0 ± 0% -52.83% (p=0.000 n=10)
Labels/overwrite-one-10 424.0 ± 0% 136.0 ± 0% -67.92% (p=0.000 n=10)
Labels/ordered/set-many-10 1344.0 ± 0% 392.0 ± 0% -70.83% (p=0.000 n=10)
Labels/ordered/merge-many-10 1184.0 ± 0% 712.0 ± 0% -39.86% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 1056.0 ± 0% 712.0 ± 0% -32.58% (p=0.000 n=10)
Labels/unordered/set-many-10 1344.0 ± 0% 712.0 ± 0% -47.02% (p=0.000 n=10)
Labels/unordered/merge-many-10 1184.0 ± 0% 712.0 ± 0% -39.86% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 1.031Ki ± 0% 1.008Ki ± 0% -2.27% (p=0.000 n=10)
geomean 843.1 405.1 -51.95%
│ baseline.txt │ patch1b.txt │
│ allocs/op │ allocs/op vs base │
Labels/set-one-10 5.000 ± 0% 3.000 ± 0% -40.00% (p=0.000 n=10)
Labels/merge-one-10 5.000 ± 0% 5.000 ± 0% ~ (p=1.000 n=10) ¹
Labels/overwrite-one-10 5.000 ± 0% 4.000 ± 0% -20.00% (p=0.000 n=10)
Labels/ordered/set-many-10 8.000 ± 0% 3.000 ± 0% -62.50% (p=0.000 n=10)
Labels/ordered/merge-many-10 8.000 ± 0% 5.000 ± 0% -37.50% (p=0.000 n=10)
Labels/ordered/overwrite-many-10 7.000 ± 0% 4.000 ± 0% -42.86% (p=0.000 n=10)
Labels/unordered/set-many-10 8.000 ± 0% 4.000 ± 0% -50.00% (p=0.000 n=10)
Labels/unordered/merge-many-10 8.000 ± 0% 5.000 ± 0% -37.50% (p=0.000 n=10)
Labels/unordered/overwrite-many-10 7.000 ± 0% 5.000 ± 0% -28.57% (p=0.000 n=10)
geomean 6.640 4.143 -37.60%
¹ all samples are equal
Change-Id: Ie68e960a25c2d97bcfb6239dc481832fa8a39754
Reviewed-on: https://go-review.googlesource.com/c/go/+/574516
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Over the years we've had various bugs in pprof stack handling resulting
in appendLocsForStack crashing because stk is too short for a cached
location. i.e., the cached location claims several inlined frames. Those
should always appear together in stk. If some frames are missing from
stk, appendLocsForStack.
If we find this case, replace the slice out of bounds panic with an
explicit panic that contains more context.
Change-Id: I52725a689baf42b8db627ce3e1bc6c654ef245d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/617135
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
When generic function[a,b] is inlined to the same generic function[b,a]
with different types (not recursion) it is expected to get a pprof with
a single Location with two functions. However due to incorrect check
for generics names using runtime.Frame.Function, the profileBuilder
assumes it is a recursion and emits separate Location.
This change fixes the recursion check for generics functions by using
runtime_expandFinalInlineFrame
Fixes #64641
Change-Id: I3f58818f08ee322b281daa377fa421555ad328c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/549135
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
profileBuilder is using Frame->Function as key for checking if we already
emitted a function. However for generics functions it has dots there [...],
so sometimes for different functions with different generics types,
the profileBuilder emits wrong functions.
Fixes #64528
Change-Id: I8b39245e0b18f4288ce758c912c6748f87cba39a
Reviewed-on: https://go-review.googlesource.com/c/go/+/546815
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
For generic functions, the previous CL makes it record the full
instantiated symbol name in the runtime func table. This CL
changes the pprof package to use that name in CPU profile. This
way, it matches the symbol name the compiler sees, so it can apply
PGO.
TODO: add a test.
Fixes #58712.
Change-Id: If40db01cbef5f73c279adcc9c290a757ef6955b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/491678
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
|
|
- Fix typo in throw error message for arena.
- Correct typos in assembly and Go comments.
- Fix log message in TestTraceCPUProfile.
Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16
GitHub-Last-Rev: d27fad4d3cea81cc7a4ca6917985bcf5fa49b0e0
GitHub-Pull-Request: golang/go#58375
Reviewed-on: https://go-review.googlesource.com/c/go/+/465975
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
I spent quite a while determining the cause of empty stacks in
profiles and reasoning out why this is okay. There isn't a great place
to record this knowledge, but a documentation comment on
appendLocsForStack is better than nothing.
Updates #51550.
Change-Id: I2eefc6ea31f1af885885c3d96199319f45edb4ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/460695
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Now that we plumb the start line to the runtime, we can include in pprof
files. Since runtime.Frame.startLine is not (currently) exported, we
need a runtime helper to get the value.
For #55022.
Updates #56135.
Change-Id: Ifc5b68a7b7170fd7895e4099deb24df7977b22ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/438255
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Change-Id: Icd4062e570559f1d0c69d4bdb9e23412054cf2a6
GitHub-Last-Rev: fbbfbcb54dac88c9a8f5c5c6d210be46f87e27dd
GitHub-Pull-Request: golang/go#55958
Reviewed-on: https://go-review.googlesource.com/c/go/+/436880
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This causes a problem in the test sometimes. With a mapping like:
00400000-00411000 r--p 00000000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test
00411000-00645000 r-xp 00011000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test
The removed code would make the first mapping 0x400000-0x645000. Tests
then grab the first few addresses to use as PCs, thinking they are in
an executable range. But those addresses are really not in an
executable range, causing the tests to fail.
Change-Id: I5a69d0259d1fd70ff9745df1cbad4d54c5898e7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/424295
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
Fixes #43296
Change-Id: Ib277c2e82c95f71a7a9b7fe1b22215ead7a54a88
Reviewed-on: https://go-review.googlesource.com/c/go/+/416975
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
The compiler may choose to inline multiple layers of function call, such
that A calling B calling C may end up with all of the instructions for B
and C written as part of A's function body.
Within that function body, some PCs will represent code from function A.
Some will represent code from function B, and for each of those the
runtime will have an instruction attributable to A that it can report as
its caller. Others will represent code from function C, and for each of
those the runtime will have an instruction attributable to B and an
instruction attributable to A that it can report as callers.
When a profiling signal arrives at an instruction in B (as inlined in A)
that the runtime also uses to describe calls to C, the profileBuilder
ends up with an incorrect cache of allFrames results. That PC should
lead to a location record in the profile that represents the frames
B<-A, but the allFrames cache's view should expand the PC only to the B
frame.
Otherwise, when a profiling signal arrives at an instruction in C (as
inlined in B in A), the PC stack C,B,A can get expanded to the frames
C,B<-A,A as follows: The inlining deck starts empty. The first tryAdd
call proposes PC C and frames C, which the deck accepts. The second
tryAdd call proposes PC B and, due to the incorrect caching, frames B,A.
(A fresh call to allFrames with PC B would return the frame list B.) The
deck accepts that PC and frames. The third tryAdd call proposes PC A and
frames A. The deck rejects those because a call from A to A cannot
possibly have been inlined. This results in a new location record in the
profile representing the frames C<-B<-A (good), as called by A (bad).
The bug is the cached expansion of PC B to frames B<-A. That mapping is
only appropriate for the resulting protobuf-format profile. The cache
needs to reflect the results of a call to allFrames, which expands the
PC B to the single frame B.
For #50996
For #52693
Fixes #52764
Change-Id: I36d080f3c8a05650cdc13ced262189c33b0083b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/404995
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]
Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.
For #51082.
Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
When building the inlining deck, correctly identify which is the last
frame in the deck. Otherwise, when some forms of inlining cause a PC to
expand to multiple frames, the length of the deck's two slices will
diverge.
Fixes #51567
Change-Id: I24e7ba32cb16b167f4307178b3f03c29e5362c4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/391134
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Than McIntosh <thanm@google.com>
|
|
When describing call stacks that include inlined function calls, the
runtime uses "fake" PCs to represent the frames that inlining removed.
Those PCs correspond to real NOP instructions that the compiler inserts
for this purpose.
Describing the call stack in a protobuf-formatted profile requires the
runtime/pprof package to collapse any sequences of fake call sites back
into single PCs, removing the NOPs but retaining their line info.
But because the NOP instructions are part of the function, they can
appear as leaf nodes in a CPU profile. That results in an address that
should sometimes be ignored (when it appears as a call site) and that
sometimes should be present in the profile (when it is observed
consuming CPU time).
When processing a PC address, consider it first as a fake PC to add to
the current inlining deck, and then as a previously-seen (real) PC.
Fixes #50996
Change-Id: I80802369978bd7ac9969839ecfc9995ea4f84ab4
Reviewed-on: https://go-review.googlesource.com/c/go/+/384239
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
profBuf.write uses an index in b.tags for each entry, even if that entry
has no tag (that slice entry just remains 0). profBuf.read similarly
returns a tags slice with exactly as many entries as there are records
in data.
profileBuilder.addCPUData iterates through the tags in lockstep with the
data records. Except in the special case of the first record, where it
forgets to increment tags. Thus the first read of profiling data has all
tags off-by-one.
To help avoid regressions, addCPUData is changed to assert that tags
contains exactly the correct number of tags.
For #50007.
Change-Id: I5f32f93003297be8d6e33ad472c185d924a63256
Reviewed-on: https://go-review.googlesource.com/c/go/+/369741
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Many uses of Index/IndexByte/IndexRune/Split/SplitN
can be written more clearly using the new Cut functions.
Do that. Also rewrite to other functions if that's clearer.
For #46336.
Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448
Reviewed-on: https://go-review.googlesource.com/c/go/+/351711
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
internal/abi.FuncPCABIInternal
All funcPC references are ABIInternal functions. Replace with the
intrinsics.
Change-Id: I2266bb6d2b713eb63b6a09846e9f9c423cab6e9b
Reviewed-on: https://go-review.googlesource.com/c/go/+/322349
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)
Update the Go tree to use the preferred names.
As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.
ReadDir changes are in a separate CL, because they are not a
simple search and replace.
For #42026.
Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: Ife4e065246729319c39e57a4fbd8e6f7b37724e1
GitHub-Last-Rev: e71803eaeb366c00f6c156de0b0b2c50927a0e82
GitHub-Pull-Request: golang/go#38527
Reviewed-on: https://go-review.googlesource.com/c/go/+/228901
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
|
|
Following CL 226818, the compiler will allow inlining a single cycle in
an inline chain. Immediately-recursive functions are still disallowed,
which is what this heuristic refers to.
Add a regression test for this case.
Note that in addition to this check, if the compiler were to inline
multiple cycles via a loop (i.e., rather than appending duplicate code),
much more work would be required here to handle a single address
appearing in multiple different inline frames.
Updates #29737
Change-Id: I88de15cfbeabb9c04381e1c12cc36778623132a5
Reviewed-on: https://go-review.googlesource.com/c/go/+/227346
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
gentraceback generates PCs which are usually following the CALL
instruction. For those that aren't, it fixes up the PCs so that
functions processing the output can unconditionally decrement the PC.
runtime_expandInlineFrames does this unconditional decrement when
looking up the function. However, the fake stack frame generated for
overflow records fails to meet the contract, and decrementing the PC
results in a PC in the previous function. If that function contains
inlined call, runtime_expandInlineFrames will not short-circuit and will
panic trying to look up a PC that doesn't exist.
Note that the added test does not fail at HEAD. It will only fail (with
a panic) if the function preceeding lostProfileEvent contains inlined
function calls. At the moment (on linux/amd64), that is
runtime/pprof.addMaxRSS, which does not.
Fixes #38096
Change-Id: Iad0819f23c566011c920fd9a5b1254719228da0b
Reviewed-on: https://go-review.googlesource.com/c/go/+/225661
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
When generating stacks, the runtime automatically expands inline
functions to inline all inline frames in the stack. However, due to the
stack size limit, the final frame may be truncated in the middle of
several inline frames at the same location.
As-is, we assume that the final frame is a normal function, and emit and
cache a Location for it. If we later receive a complete stack frame, we
will first use the cached Location for the inlined function and then
generate a new Location for the "caller" frame, in violation of the
pprof requirement to merge inlined functions into the same Location.
As a result, we:
1. Nondeterministically may generate a profile with the different stacks
combined or split, depending on which is encountered first. This is
particularly problematic when performing a diff of profiles.
2. When split stacks are generated, we lose the inlining information.
We avoid both of these problems by performing a second expansion of the
last stack frame to recover additional inline frames that may have been
lost. This expansion is a bit simpler than the one done by the runtime
because we don't have to handle skipping, and we know that the last
emitted frame is not an elided wrapper, since it by definition is
already included in the stack.
Fixes #37446
Change-Id: If3ca2af25b21d252cf457cc867dd932f107d4c61
Reviewed-on: https://go-review.googlesource.com/c/go/+/221577
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
The profile proto message builder maintains a location entry cache
that maps a location (possibly involving multiple user frames
that represent inlined function calls) to the location id. We have
been using the first pc of the inlined call sequence as the key of
the cached location entry assuming that, for a given pc, the sequence
of frames representing the inlined call stack is deterministic and
stable. Then, when analyzing the new stack trace, we expected the
exact number of pcs to be present in the captured stack trace upon
the cache hit.
This assumption does not hold, however, in the presence of the stack
trace truncation in the runtime during profiling, and also with the
potential bugs in runtime.
A better fix is to use all the pcs of the inlined call sequece as
the key instead of the first pc. But that is a bigger code change.
This CL avoids the crash assuming the trace was truncated.
Fixes #35538
Change-Id: I8c6bae98bc8b178ee51523c7316f56b1cce6df16
Reviewed-on: https://go-review.googlesource.com/c/go/+/207609
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
tryAdd shouldn't succeed (and accept the new frame) if the last
existing frame on the deck is not an inlined frame.
For example, when we see the followig stack
[300656 300664 300655 300664]
with each PC corresponds to
[{PC:300656 Func:nil Function:runtime.nanotime File:/workdir/go/src/runtime/time_nofake.go Line:19 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300664 Func:0x28dac8 Function:runtime.checkTimers File:/workdir/go/src/runtime/proc.go Line:2623 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300655 Func:nil Function:runtime.nanotime File:/workdir/go/src/runtime/time_nofake.go Line:19 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300664 Func:0x28dac8 Function:runtime.checkTimers File:/workdir/go/src/runtime/proc.go Line:2623 Entry:300416 {0x28dac8 0x386c80}}]
PC:300656 and PC:300664 belong to a single location entry,
but the bug in the current tryAdd logic placed the entire stack into one
location entry.
Also adds tests - this crash is a tricky case to test because I think it
should happen with normal go code. The new TestTryAdd simulates it by
using fake call sequences. The test crashed without the fix.
Update #35538
Change-Id: I6d3483f757abf4c429ab91616e4def90832fc04a
Reviewed-on: https://go-review.googlesource.com/c/go/+/206958
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Unlike function calls, when processing instructions that directly
fault we must not subtract 1 from the pc before looking up the
file/line information.
Since the file/line lookup unconditionally subtracts 1, add 1 to
the faulting instruction PCs to compensate.
Fixes #34123
Change-Id: Ie7361e3d2f84a0d4f48d97e5a9e74f6291ba7a8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/196962
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
|
|
Change-Id: Ie4754fefba6057b1cf558d0096fe0e83355f8eff
Reviewed-on: https://go-review.googlesource.com/c/go/+/205098
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Change-Id: Id270a3477bf1a581755c4311eb12f990aa2260b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/205097
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
The pprof profile proto message expects inlined functions of a PC
to be encoded in one Location entry using multiple Line entries.
https://github.com/google/pprof/blob/5e96527/proto/profile.proto#L177-L184
runtime/pprof has encoded the symbolization information by creating
a Location for each PC found in the stack trace and including info
from all the frames expanded from the PC using runtime.CallersFrames.
This assumes inlined functions are represented as a single PC in the
stack trace. (https://go-review.googlesource.com/41256)
In the recent years, behavior around inlining and the traceback
changed significantly (e.g. https://golang.org/cl/152537,
https://golang.org/issue/29582, and many changes). Now the PCs
in the stack trace represent user frames even including inline
marks. As a result, the profile proto started to allocate a Location
entry for each user frame, lose the inline information (so pprof
presented incorrect results when inlined functions are involved),
and confuse the pprof tool with those PCs made up for inline marks.
This CL attempts to detect inlined call frames from the stack traces
of CPU profiles, and organize the Location information as intended.
Currently, runtime does not provide a reliable and convenient way to
detect inlined call frames and expand user frames from a given externally
recognizable PCs. So we use heuristics to recover the groups
- inlined call frames have nil Func field
- inlined call frames will have the same Entry point
- but must be careful with recursive functions that have the
same Entry point by definition, and non-Go functions that
may lack most of the fields of Frame.
The followup CL will address the issue with other profile types.
Change-Id: I0c9667ab016a3e898d648f31c3f82d84c15398db
Reviewed-on: https://go-review.googlesource.com/c/go/+/204636
Reviewed-by: Keith Randall <khr@golang.org>
|
|
In 1.11 we stored "return addresses" in the result of runtime.Callers.
I changed that behavior in CL 152537 to store an address in the call
instruction itself. This CL reverts that part of 152537.
The change in 152537 was made because we now store pcs of inline marks
in the result of runtime.Callers as well. This CL will now store the
address of the inline mark + 1 in the results of runtime.Callers, so
that the subsequent -1 done in CallersFrames will pick out the correct
inline mark instruction.
This CL means that the results of runtime.Callers can be passed to
runtime.FuncForPC as they were before. There are a bunch of packages
in the wild that take the results of runtime.Callers, subtract 1, and
then call FuncForPC. This CL keeps that pattern working as it did in
1.11.
The changes to runtime/pprof in this CL are exactly a revert of the
changes to that package in 152537 (except the locForPC comment).
Update #29582
Change-Id: I04d232000fb482f0f0ff6277f8d7b9c72e97eb48
Reviewed-on: https://go-review.googlesource.com/c/156657
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Work involved in getting a stack trace is divided between
runtime.Callers and runtime.CallersFrames.
Before this CL, runtime.Callers returns a pc per runtime frame.
runtime.CallersFrames is responsible for expanding a runtime frame
into potentially multiple user frames.
After this CL, runtime.Callers returns a pc per user frame.
runtime.CallersFrames just maps those to user frame info.
Entries in the result of runtime.Callers are now pcs
of the calls (or of the inline marks), not of the instruction
just after the call.
Fixes #29007
Fixes #28640
Update #26320
Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
Reviewed-on: https://go-review.googlesource.com/c/152537
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
If binary file of a running program was deleted or moved, maps
file (/proc/pid/maps) will contain lines that have this binary
filename suffixed with "(deleted)" string. This suffix stayed
as a part of the filename and made remote profiling slightly more
difficult by requiring from a user to rename binary file to
include this suffix.
This change cleans up the filename and removes this suffix and
thus simplify debugging.
Fixes #25740
Change-Id: Ib3c8c3b9ef536c2ac037fcc14e8037fa5c960036
Reviewed-on: https://go-review.googlesource.com/116395
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
Profile's Mapping field is currently populated by reading /proc/self/maps.
On systems where /proc/self/maps is not available, the profile generated
by Go's runtime will not have any Mapping entry. Pprof command then adds
a fake entry and links all Location entries in the profile with the fake
entry to be used during symbolization.
https://github.com/google/pprof/blob/a8644067d5a3c9a6386e7c88fa4a3d9d37877ca3/internal/driver/fetch.go#L437
The fake entry is not enough to suppress the error or warning messages
pprof command produces. We need to tell pprof that Location entries are
symbolized already by Go runtime and pprof does not have to attempt to
perform further symbolization.
In #25743, we made Go runtime mark Mapping entries with HasFunctions=true
when all Location entries from the Mapping entries are successfully
symbolized. This change makes the Go runtime add a fake mapping entry,
otherwise the pprof command tool would add, and set the HasFunctions=true
following the same logic taken when the real mapping information is
available.
Updates #19790.
Fixes #26255. Tested pprof doesn't report the error message any more
for pure Go program.
Change-Id: Ib12b62e15073f5d6c80967e44b3e8709277c11bd
Reviewed-on: https://go-review.googlesource.com/123779
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The pprof tool utilizes attributes of mapping entries
such as HasFunctions to determine whether the profile
includes necessary symbol information.
If none of the attributes is set, pprof tool tries to
read the corresponding binary to use for local symbolization.
If the binary doesn't exist, it prints out error messages.
Go runtime generated profiles without any of the attributes
set so the pprof tool always printed out the error messages.
The error messages became more obvious with the new
terminal support that uses red color for error messages.
Go runtime can symbolize all Go symbols and generate
self-contained profile for pure Go program. Thus, there
is no reason for the pprof tool to look for the copy of
the binary. So, this CL sets one of the attributes
(HasFunctions) true if all PCs in samples look fully
symbolized.
For non-pure Go program, however, it's possible that
symbolization of non-Go PCs is incomplete. In this case,
we need to leave the attributes all false so pprof can attempt
to symbolize using the local copy of the binary if available.
It's hard to determine whether a mapping includes non-Go
code. Instead, this CL checks PCs from collected samples.
If unsuccessful symbolization is observed, it skips setting
the HasFunctions attribute.
Fixes #25743
Change-Id: I5108be45bbc37ab486d145fa03e7ce37d88fad50
Reviewed-on: https://go-review.googlesource.com/118275
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The Go's heap profile contains four kinds of samples
(inuse_space, inuse_objects, alloc_space, and alloc_objects).
The pprof tool by default chooses the inuse_space (the bytes
of live, in-use objects). When analyzing the current memory
usage the choice of inuse_space as the default may be useful,
but in some cases, users are more interested in analyzing the
total allocation statistics throughout the program execution.
For example, when we analyze the memory profile from benchmark
or program test run, we are more likely interested in the whole
allocation history than the live heap snapshot at the end of
the test or benchmark.
The pprof tool provides flags to control which sample type
to be used for analysis. However, it is one of the less-known
features of pprof and we believe it's better to choose the
right type of samples as the default when producing the profile.
This CL introduces a new type of profile, "allocs", which is
the same as the "heap" profile but marks the alloc_space
as the default type unlike heap profiles that use inuse_space
as the default type.
'go test -memprofile=...' command is changed to use the new
"allocs" profile type instead of the traditional "heap" profile.
Fixes #24443
Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9
Reviewed-on: https://go-review.googlesource.com/102696
Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
As found by unparam. Picked the low-hanging fruit, consisting only of
errors that were always nil and results that were never used. Left out
those that were useful for consistency with other func signatures.
Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018
Reviewed-on: https://go-review.googlesource.com/102418
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Currently proto symbolization uses runtime.FuncForPC and assumes each
PC maps to a single frame. This isn't true in the presence of inlining
(even with leaf-only inlining this can get incorrect results).
Change PC symbolization to use runtime.CallersFrames to expand each PC
to all of the frames at that PC.
Change-Id: I8d20dff7495a5de495ae07f569122c225d433ced
Reviewed-on: https://go-review.googlesource.com/41256
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
Proto profile conversion is inconsistent about call vs return PCs in
profile locations. The proto defines locations to be call PCs. This is
what we do when proto-izing CPU profiles, but we fail to convert the
return PCs in memory and count profile stacks to call PCs when
converting them to proto locations.
Fix this in the heap and count profile conversion functions.
TestConvertMemProfile also hard-codes this failure to convert from
return PCs to call PCs, so fix up the addresses in the synthesized
profile to be return PCs while checking that we get call PCs out of
the conversion.
Change-Id: If1fc028b86fceac6d71a2d9fa6c41ff442c89296
Reviewed-on: https://go-review.googlesource.com/42951
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
Profile labels added by the user using pprof.Do, if present will
be in a *labelMap stored in the unsafe.Pointer 'tag' field of
the profile map entry. This change extracts the labels from the tag
field and writes them to the profile proto.
Change-Id: Ic40fdc58b66e993ca91d5d5effe0e04ffbb5bc46
Reviewed-on: https://go-review.googlesource.com/39613
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Change-Id: I72bea1450386100482b4681b20eb9a9af12c7522
Reviewed-on: https://go-review.googlesource.com/41816
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
Delete old TestRuntimeFunctionTrimming, which is testing a dead API
and is now handled in end-to-end tests.
Change-Id: I64fc2991ed4a7690456356b5f6b546f36935bb67
Reviewed-on: https://go-review.googlesource.com/41815
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
The period recorded in CPU profiles is in nanoseconds, but was being
computed incorrectly as hz * 1000. As a result, many absolute times
displayed by pprof were incorrect.
Fix this by computing the period correctly.
Change-Id: I6fadd6d8ad3e57f31e8cc7a25a24fcaec510d8d4
Reviewed-on: https://go-review.googlesource.com/40995
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
This helps systems that maintain an external database mapping
build ID to symbol information for the given binary, especially
in the case where /proc/self/maps lists many different files
(for example, many shared libraries).
Avoid importing debug/elf to avoid dragging in that whole
package (and its dependencies like debug/dwarf) into the
build of every program that generates a profile.
Fixes #19431.
Change-Id: I6d4362a79fe23e4f1726dffb0661d20bb57f766f
Reviewed-on: https://go-review.googlesource.com/37855
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
It's only ever called with the value it was using, but the code was
counterintuitive. Use the parameter instead, like the other funcs near
it.
Found by github.com/mvdan/unparam.
Change-Id: I45855e11d749380b9b2a28e6dd1d5dedf119a19b
Reviewed-on: https://go-review.googlesource.com/37893
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The existing code builds a full profile in memory.
Then it translates that profile into a data structure (in memory).
Then it marshals that data structure into a protocol buffer (in memory).
Then it gzips that marshaled form into the underlying writer.
So there are three copies of the full profile data in memory
at the same time before we're done. This is obviously dumb.
This CL implements a fully streaming conversion from
the original in-memory profile to the underlying writer.
There is now only one copy of the profile in memory.
For the non-CPU profiles, this is optimal, since we have to
have a full copy in memory to start with.
For the CPU profiles, we could still try to bound the profile
size stored in memory and stream fragments out during
the actual profiling, as Go 1.7 did (with a simpler format),
but so far that hasn't been necessary.
Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989
Reviewed-on: https://go-review.googlesource.com/37164
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
The old hash table was a place holder that allocates memory
during every lookup for key generation, even for keys that hit
in the the table.
Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21
Reviewed-on: https://go-review.googlesource.com/37163
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
This doesn't change the functionality of the current code,
but it sets us up for exporting the profiling labels into the profile.
The old code had a hash table of profile samples maintained
during the signal handler, with evictions going into a log.
The new code just logs every sample directly, leaving the
hash-based deduplication to an ordinary goroutine.
The new code also avoids storing the entire profile in two
forms in memory, an unfortunate regression introduced
when binary profile support was added. After this CL the
entire profile is only stored once in memory. We'd still like
to get back down to storing it zero times (streaming it to
the underlying io.Writer).
Change-Id: I0893a1788267c564aa1af17970d47377b2a43457
Reviewed-on: https://go-review.googlesource.com/36712
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|
|
These are very tightly coupled, and internal/protopprof is small.
There's no point to having a separate package.
Change-Id: I2c8aa49c9e18a7128657bf2b05323860151b5606
Reviewed-on: https://go-review.googlesource.com/36711
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|