path: root/src/runtime/pprof/proto.go
Age  Commit message  Author
2025-11-24  runtime: add GODEBUG=tracebacklabels=1 to include pprof labels in tracebacks  David Finkel
Copy LabelSet to an internal package as label.Set, and include (escaped) labels within goroutine stack dumps. Labels are added to the goroutine header as quoted key:value pairs, so the line may get long if there are a lot of labels. To handle escaping, we add a printescaped function to the runtime and hook it up to the print function in the compiler with a new runtime.quoted type that's a sibling to runtime.hex (in fact, we leverage some of the machinery from printhex to generate escape sequences). The escaping can be improved for printable runes outside basic ASCII (particularly for languages using non-Latin scripts). Additionally, handling of invalid UTF-8 can be improved. So that we can experiment with the output format, make this opt-in via a new tracebacklabels GODEBUG var. Updates #23458 Updates #76349 Change-Id: I08e78a40c55839a809236fff593ef2090c13c036 Reviewed-on: https://go-review.googlesource.com/c/go/+/694119 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Alan Donovan <adonovan@google.com>
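A minimal sketch of how the opt-in might be exercised, assuming a toolchain that contains this change. The function name labeledStack and the label request=checkout are made up for illustration; pprof.Do, pprof.Labels, pprof.Label and runtime.Stack are existing APIs. The exact header format is not reproduced here because the commit describes it as experimental.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// labeledStack (hypothetical helper) attaches the pprof label
// request=checkout to the current goroutine, then returns the label value
// read back from the context together with a dump of this goroutine's stack.
func labeledStack() (label string, dump string) {
	labels := pprof.Labels("request", "checkout")
	pprof.Do(context.Background(), labels, func(ctx context.Context) {
		label, _ = pprof.Label(ctx, "request")
		buf := make([]byte, 4096)
		n := runtime.Stack(buf, false) // false: current goroutine only
		dump = string(buf[:n])
	})
	return label, dump
}

func main() {
	label, dump := labeledStack()
	fmt.Println("label:", label)
	// Assumption: when run with GODEBUG=tracebacklabels=1 on a toolchain that
	// includes this change, the goroutine header carries the quoted labels.
	fmt.Print(dump)
}
```

Running the program once with and once without GODEBUG=tracebacklabels=1 in the environment is one way to compare the two header formats.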
2025-05-13  runtime/pprof: return errors from writing profiles  Sean Liao
Fixes #73107 Change-Id: I41f3e1bd1fdaca2f0e94151b2320bd569e258a51 Reviewed-on: https://go-review.googlesource.com/c/go/+/671576 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
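For callers, the practical consequence is that the error returned by Profile.WriteTo is worth checking. A minimal sketch (writeGoroutineProfile is a hypothetical helper name; pprof.Lookup and WriteTo are existing APIs):

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// writeGoroutineProfile (hypothetical helper) writes the goroutine profile
// in the gzip-compressed protobuf format (debug=0) and returns the bytes.
func writeGoroutineProfile() ([]byte, error) {
	p := pprof.Lookup("goroutine")
	if p == nil {
		return nil, fmt.Errorf("goroutine profile not found")
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 0); err != nil {
		// Propagate write failures instead of silently dropping them.
		return nil, fmt.Errorf("writing goroutine profile: %w", err)
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := writeGoroutineProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("profile bytes:", len(data))
}
```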
2024-11-16  runtime/pprof: reduce label overhead  Felix Geisendörfer
Switch labelMap from map[string]string to use LabelSet as a data structure. Optimize Labels() for the case where the keys are given in sorted order without duplicates. This is primarily motivated by reducing the overhead of distributed tracing systems that use pprof labels. We have encountered cases where users complained about the overhead relative to the rest of our distributed tracing library code. Additionally, we see this as an opportunity to free up hundreds of CPU cores across our fleet. A secondary motivation is eBPF profilers that try to access pprof labels. The current map[string]string requires them to implement Go map access in eBPF, which is non-trivial. With the enablement of swiss maps, this complexity is only increasing. The slice data structure introduced in this CL will greatly lower the implementation complexity for eBPF profilers in the future. But to be clear: This change does not imply that the pprof label mechanism is now a stable ABI. They are still an implementation detail and may change again in the future. 
goos: darwin
goarch: arm64
pkg: runtime/pprof
cpu: Apple M1 Max
                                    │ baseline.txt │             patch1.txt              │
                                    │    sec/op    │   sec/op     vs base                │
Labels/set-one-10                     153.50n ± 3%   75.00n ± 1%  -51.14% (p=0.000 n=10)
Labels/merge-one-10                    187.8n ± 1%   128.8n ± 1%  -31.42% (p=0.000 n=10)
Labels/overwrite-one-10                193.1n ± 2%   102.0n ± 1%  -47.18% (p=0.000 n=10)
Labels/ordered/set-many-10             502.6n ± 4%   146.1n ± 2%  -70.94% (p=0.000 n=10)
Labels/ordered/merge-many-10           516.3n ± 2%   238.1n ± 1%  -53.89% (p=0.000 n=10)
Labels/ordered/overwrite-many-10       569.3n ± 4%   247.6n ± 2%  -56.51% (p=0.000 n=10)
Labels/unordered/set-many-10           488.9n ± 2%   308.3n ± 3%  -36.94% (p=0.000 n=10)
Labels/unordered/merge-many-10         523.6n ± 1%   258.5n ± 1%  -50.64% (p=0.000 n=10)
Labels/unordered/overwrite-many-10     571.4n ± 1%   412.1n ± 2%  -27.89% (p=0.000 n=10)
geomean                                366.8n        186.9n       -49.05%

                                    │ baseline.txt │             patch1b.txt             │
                                    │     B/op     │    B/op      vs base                │
Labels/set-one-10                       424.0 ± 0%    104.0 ± 0%  -75.47% (p=0.000 n=10)
Labels/merge-one-10                     424.0 ± 0%    200.0 ± 0%  -52.83% (p=0.000 n=10)
Labels/overwrite-one-10                 424.0 ± 0%    136.0 ± 0%  -67.92% (p=0.000 n=10)
Labels/ordered/set-many-10             1344.0 ± 0%    392.0 ± 0%  -70.83% (p=0.000 n=10)
Labels/ordered/merge-many-10           1184.0 ± 0%    712.0 ± 0%  -39.86% (p=0.000 n=10)
Labels/ordered/overwrite-many-10       1056.0 ± 0%    712.0 ± 0%  -32.58% (p=0.000 n=10)
Labels/unordered/set-many-10           1344.0 ± 0%    712.0 ± 0%  -47.02% (p=0.000 n=10)
Labels/unordered/merge-many-10         1184.0 ± 0%    712.0 ± 0%  -39.86% (p=0.000 n=10)
Labels/unordered/overwrite-many-10    1.031Ki ± 0%  1.008Ki ± 0%   -2.27% (p=0.000 n=10)
geomean                                 843.1         405.1       -51.95%

                                    │ baseline.txt │             patch1b.txt             │
                                    │  allocs/op   │  allocs/op   vs base                │
Labels/set-one-10                       5.000 ± 0%    3.000 ± 0%  -40.00% (p=0.000 n=10)
Labels/merge-one-10                     5.000 ± 0%    5.000 ± 0%        ~ (p=1.000 n=10) ¹
Labels/overwrite-one-10                 5.000 ± 0%    4.000 ± 0%  -20.00% (p=0.000 n=10)
Labels/ordered/set-many-10              8.000 ± 0%    3.000 ± 0%  -62.50% (p=0.000 n=10)
Labels/ordered/merge-many-10            8.000 ± 0%    5.000 ± 0%  -37.50% (p=0.000 n=10)
Labels/ordered/overwrite-many-10        7.000 ± 0%    4.000 ± 0%  -42.86% (p=0.000 n=10)
Labels/unordered/set-many-10            8.000 ± 0%    4.000 ± 0%  -50.00% (p=0.000 n=10)
Labels/unordered/merge-many-10          8.000 ± 0%    5.000 ± 0%  -37.50% (p=0.000 n=10)
Labels/unordered/overwrite-many-10      7.000 ± 0%    5.000 ± 0%  -28.57% (p=0.000 n=10)
geomean                                 6.640         4.143       -37.60%
¹ all samples are equal

Change-Id: Ie68e960a25c2d97bcfb6239dc481832fa8a39754 Reviewed-on: https://go-review.googlesource.com/c/go/+/574516 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
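The idea of a slice-backed label set with a fast path for already-sorted keys can be sketched as follows. This is not the runtime/pprof implementation: the types label and labelSet, the functions newLabelSet and lookup, and the last-value-wins handling of duplicate keys are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// label and labelSet are hypothetical stand-ins: the set keeps its labels
// in a slice sorted by key, with no duplicate keys.
type label struct{ key, value string }

type labelSet struct{ list []label }

// newLabelSet builds a set from key/value pairs. If the keys arrive sorted
// and without duplicates, no sorting or deduplication work is done.
func newLabelSet(args ...string) labelSet {
	if len(args)%2 != 0 {
		panic("uneven number of arguments")
	}
	list := make([]label, 0, len(args)/2)
	sortedNoDupes := true
	for i := 0; i+1 < len(args); i += 2 {
		list = append(list, label{args[i], args[i+1]})
		if i >= 2 && args[i] <= args[i-2] {
			sortedNoDupes = false
		}
	}
	if sortedNoDupes {
		return labelSet{list} // fast path
	}
	// Slow path: stable sort, then keep the last value for each key
	// (assumed policy for this sketch).
	sort.SliceStable(list, func(i, j int) bool { return list[i].key < list[j].key })
	out := list[:0]
	for _, l := range list {
		if n := len(out); n > 0 && out[n-1].key == l.key {
			out[n-1] = l
			continue
		}
		out = append(out, l)
	}
	return labelSet{out}
}

// lookup finds a key by binary search over the sorted slice.
func (s labelSet) lookup(key string) (string, bool) {
	i := sort.Search(len(s.list), func(i int) bool { return s.list[i].key >= key })
	if i < len(s.list) && s.list[i].key == key {
		return s.list[i].value, true
	}
	return "", false
}

func main() {
	s := newLabelSet("b", "2", "a", "1", "b", "3")
	v, ok := s.lookup("b")
	fmt.Println(len(s.list), v, ok)
}
```

A flat slice of key/value pairs is also the kind of layout an external reader can walk without knowing how Go maps are laid out, which is the secondary motivation the commit message gives.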
2024-10-01  runtime/pprof: add context to short stack panic  Michael Pratt
Over the years we've had various bugs in pprof stack handling resulting in appendLocsForStack crashing because stk is too short for a cached location, i.e., the cached location claims several inlined frames. Those should always appear together in stk. If some frames are missing from stk, appendLocsForStack panics with a slice out of bounds. If we find this case, replace the slice out of bounds panic with an explicit panic that contains more context. Change-Id: I52725a689baf42b8db627ce3e1bc6c654ef245d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/617135 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
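The general pattern, checking the length up front and panicking with a descriptive message rather than letting a slice expression fail, might look like the sketch below. takeCachedFrames and its message text are hypothetical and are not taken from appendLocsForStack.

```go
package main

import "fmt"

// takeCachedFrames (hypothetical helper) splits off the cachedPCs leading
// entries of stk that a cached location claims. Instead of letting
// stk[:cachedPCs] fail with a bare "slice bounds out of range", it panics
// with a message that records both lengths.
func takeCachedFrames(stk []uintptr, cachedPCs int) (consumed, rest []uintptr) {
	if len(stk) < cachedPCs {
		panic(fmt.Sprintf("stack too short for cached location: have %d PCs, cached location claims %d (stack: %#x)",
			len(stk), cachedPCs, stk))
	}
	return stk[:cachedPCs], stk[cachedPCs:]
}

func main() {
	defer func() { fmt.Println("recovered:", recover()) }()
	takeCachedFrames([]uintptr{0x1000}, 3)
}
```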
2023-12-13  runtime/pprof: fix inlined generics locations  Tolya Korniltsev
When generic function[a,b] is inlined into the same generic function[b,a] with different types (not recursion), the expected result is a pprof profile with a single Location containing two functions. However, due to an incorrect check of generic function names using runtime.Frame.Function, the profileBuilder assumes it is recursion and emits a separate Location. This change fixes the recursion check for generic functions by using runtime_expandFinalInlineFrame. Fixes #64641 Change-Id: I3f58818f08ee322b281daa377fa421555ad328c9 Reviewed-on: https://go-review.googlesource.com/c/go/+/549135 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-12-11  runtime/pprof: fix generics function names  Tolya Korniltsev
profileBuilder uses Frame.Function as the key for checking whether we have already emitted a function. However, for generic functions that name contains a literal [...], so different functions with different generic types can share one key and the profileBuilder sometimes emits the wrong function. Fixes #64528 Change-Id: I8b39245e0b18f4288ce758c912c6748f87cba39a Reviewed-on: https://go-review.googlesource.com/c/go/+/546815 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
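The collision can be illustrated with a toy deduplication table. Everything here is hypothetical: the frame struct, the SymbolName field, the example names, and assignIDs are stand-ins chosen for the sketch and do not mirror the profileBuilder's actual fields or the exact fix.

```go
package main

import "fmt"

// frame is a hypothetical, trimmed-down stand-in for runtime.Frame.
type frame struct {
	Function   string // display name; instantiations share the "[...]" form
	SymbolName string // assumed full instantiated symbol name (illustrative)
}

// assignIDs hands out one ID per distinct key, the way a profile builder
// deduplicates the functions it has already emitted.
func assignIDs(frames []frame, key func(frame) string) map[string]int {
	ids := make(map[string]int)
	for _, f := range frames {
		k := key(f)
		if _, ok := ids[k]; !ok {
			ids[k] = len(ids) + 1
		}
	}
	return ids
}

// exampleFrames returns two instantiations of one generic function.
func exampleFrames() []frame {
	return []frame{
		{Function: "main.Sum[...]", SymbolName: "main.Sum[go.shape.int]"},
		{Function: "main.Sum[...]", SymbolName: "main.Sum[go.shape.float64]"},
	}
}

func byFunction(f frame) string { return f.Function }
func bySymbol(f frame) string   { return f.SymbolName }

func main() {
	frames := exampleFrames()
	fmt.Println("keyed by Function:  ", len(assignIDs(frames, byFunction)), "entry")
	fmt.Println("keyed by SymbolName:", len(assignIDs(frames, bySymbol)), "entries")
}
```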
2023-05-05  runtime, runtime/pprof: record instantiated symbol name in CPU profile  Cherry Mui
For generic functions, the previous CL makes it record the full instantiated symbol name in the runtime func table. This CL changes the pprof package to use that name in CPU profile. This way, it matches the symbol name the compiler sees, so it can apply PGO. TODO: add a test. Fixes #58712. Change-Id: If40db01cbef5f73c279adcc9c290a757ef6955b6 Reviewed-on: https://go-review.googlesource.com/c/go/+/491678 Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Cherry Mui <cherryyz@google.com>
2023-02-08  runtime: correct typos  Oleksandr Redko
- Fix typo in throw error message for arena.
- Correct typos in assembly and Go comments.
- Fix log message in TestTraceCPUProfile.
Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16 GitHub-Last-Rev: d27fad4d3cea81cc7a4ca6917985bcf5fa49b0e0 GitHub-Pull-Request: golang/go#58375 Reviewed-on: https://go-review.googlesource.com/c/go/+/465975 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-09  runtime/pprof: document possibility of empty stacks  Austin Clements
I spent quite a while determining the cause of empty stacks in profiles and reasoning out why this is okay. There isn't a great place to record this knowledge, but a documentation comment on appendLocsForStack is better than nothing. Updates #51550. Change-Id: I2eefc6ea31f1af885885c3d96199319f45edb4ce Reviewed-on: https://go-review.googlesource.com/c/go/+/460695 Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2022-10-14  runtime/pprof: set Function.start_line field  Michael Pratt
Now that we plumb the start line to the runtime, we can include it in pprof files. Since runtime.Frame.startLine is not (currently) exported, we need a runtime helper to get the value. For #55022. Updates #56135. Change-Id: Ifc5b68a7b7170fd7895e4099deb24df7977b22ea Reviewed-on: https://go-review.googlesource.com/c/go/+/438255 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> Reviewed-by: Austin Clements <austin@google.com>
2022-09-30  all: omit comparison bool constant to simplify code  cui fliter
Change-Id: Icd4062e570559f1d0c69d4bdb9e23412054cf2a6 GitHub-Last-Rev: fbbfbcb54dac88c9a8f5c5c6d210be46f87e27dd GitHub-Pull-Request: golang/go#55958 Reviewed-on: https://go-review.googlesource.com/c/go/+/436880 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-08-23  runtime/pprof: remove round-to-file-start adjustment  Keith Randall
This causes a problem in the test sometimes. With a mapping like:
00400000-00411000 r--p 00000000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test
00411000-00645000 r-xp 00011000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test
The removed code would make the first mapping 0x400000-0x645000. Tests then grab the first few addresses to use as PCs, thinking they are in an executable range. But those addresses are really not in an executable range, causing the tests to fail.
Change-Id: I5a69d0259d1fd70ff9745df1cbad4d54c5898e7b Reviewed-on: https://go-review.googlesource.com/c/go/+/424295 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Keith Randall <khr@golang.org>
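To make the failure concrete, the two maps lines quoted above can be parsed and checked for the executable bit. This is a sketch only: mapsEntry and parseMapsLine are hypothetical names, and the parser handles just the fields shown in the example, not the full /proc/pid/maps format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// mapsEntry is a hypothetical record for one /proc/<pid>/maps line.
type mapsEntry struct {
	lo, hi, offset uint64
	perms, path    string
}

// parseMapsLine parses "lo-hi perms offset dev inode [path]".
func parseMapsLine(line string) (mapsEntry, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return mapsEntry{}, fmt.Errorf("short maps line: %q", line)
	}
	loStr, hiStr, ok := strings.Cut(fields[0], "-")
	if !ok {
		return mapsEntry{}, fmt.Errorf("bad address range: %q", fields[0])
	}
	lo, err := strconv.ParseUint(loStr, 16, 64)
	if err != nil {
		return mapsEntry{}, err
	}
	hi, err := strconv.ParseUint(hiStr, 16, 64)
	if err != nil {
		return mapsEntry{}, err
	}
	off, err := strconv.ParseUint(fields[2], 16, 64)
	if err != nil {
		return mapsEntry{}, err
	}
	e := mapsEntry{lo: lo, hi: hi, offset: off, perms: fields[1]}
	if len(fields) >= 6 {
		e.path = fields[5]
	}
	return e, nil
}

// executable reports whether the mapping has the x permission bit.
func (e mapsEntry) executable() bool { return strings.Contains(e.perms, "x") }

func main() {
	lines := []string{
		"00400000-00411000 r--p 00000000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test",
		"00411000-00645000 r-xp 00011000 fe:01 4459044 /tmp/go-build1710804385/b001/pprof.test",
	}
	for _, l := range lines {
		e, err := parseMapsLine(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%#x-%#x offset=%#x executable=%v\n", e.lo, e.hi, e.offset, e.executable())
	}
}
```

Only the second range is executable, so an address such as 0x400000 taken from a merged 0x400000-0x645000 range would not be a valid PC.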
2022-08-20  runtime/pprof: add memory mapping info for Windows  Egon Elbre
Fixes #43296 Change-Id: Ib277c2e82c95f71a7a9b7fe1b22215ead7a54a88 Reviewed-on: https://go-review.googlesource.com/c/go/+/416975 Run-TryBot: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Reviewed-by: Than McIntosh <thanm@google.com>
2022-05-13  runtime/pprof: fix allFrames cache  Rhys Hiltner
The compiler may choose to inline multiple layers of function call, such that A calling B calling C may end up with all of the instructions for B and C written as part of A's function body. Within that function body, some PCs will represent code from function A. Some will represent code from function B, and for each of those the runtime will have an instruction attributable to A that it can report as its caller. Others will represent code from function C, and for each of those the runtime will have an instruction attributable to B and an instruction attributable to A that it can report as callers. When a profiling signal arrives at an instruction in B (as inlined in A) that the runtime also uses to describe calls to C, the profileBuilder ends up with an incorrect cache of allFrames results. That PC should lead to a location record in the profile that represents the frames B<-A, but the allFrames cache's view should expand the PC only to the B frame. Otherwise, when a profiling signal arrives at an instruction in C (as inlined in B in A), the PC stack C,B,A can get expanded to the frames C,B<-A,A as follows: The inlining deck starts empty. The first tryAdd call proposes PC C and frames C, which the deck accepts. The second tryAdd call proposes PC B and, due to the incorrect caching, frames B,A. (A fresh call to allFrames with PC B would return the frame list B.) The deck accepts that PC and frames. The third tryAdd call proposes PC A and frames A. The deck rejects those because a call from A to A cannot possibly have been inlined. This results in a new location record in the profile representing the frames C<-B<-A (good), as called by A (bad). The bug is the cached expansion of PC B to frames B<-A. That mapping is only appropriate for the resulting protobuf-format profile. The cache needs to reflect the results of a call to allFrames, which expands the PC B to the single frame B. 
For #50996 For #52693 Fixes #52764 Change-Id: I36d080f3c8a05650cdc13ced262189c33b0083b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/404995 Reviewed-by: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Rhys Hiltner <rhys@justin.tv> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-04-11  all: gofmt main repo  Russ Cox
[This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-03-09  runtime/pprof: fix pcDeck's frame indexing  Rhys Hiltner
When building the inlining deck, correctly identify which is the last frame in the deck. Otherwise, when some forms of inlining cause a PC to expand to multiple frames, the length of the deck's two slices will diverge. Fixes #51567 Change-Id: I24e7ba32cb16b167f4307178b3f03c29e5362c4b Reviewed-on: https://go-review.googlesource.com/c/go/+/391134 Reviewed-by: Michael Pratt <mpratt@google.com> Trust: Than McIntosh <thanm@google.com>
2022-03-08  runtime/pprof: check if PC is reused for inlining  Rhys Hiltner
When describing call stacks that include inlined function calls, the runtime uses "fake" PCs to represent the frames that inlining removed. Those PCs correspond to real NOP instructions that the compiler inserts for this purpose. Describing the call stack in a protobuf-formatted profile requires the runtime/pprof package to collapse any sequences of fake call sites back into single PCs, removing the NOPs but retaining their line info. But because the NOP instructions are part of the function, they can appear as leaf nodes in a CPU profile. That results in an address that should sometimes be ignored (when it appears as a call site) and that sometimes should be present in the profile (when it is observed consuming CPU time). When processing a PC address, consider it first as a fake PC to add to the current inlining deck, and then as a previously-seen (real) PC. Fixes #50996 Change-Id: I80802369978bd7ac9969839ecfc9995ea4f84ab4 Reviewed-on: https://go-review.googlesource.com/c/go/+/384239 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2021-12-07  runtime/pprof: consume tag for first CPU record  Michael Pratt
profBuf.write uses an index in b.tags for each entry, even if that entry has no tag (that slice entry just remains 0). profBuf.read similarly returns a tags slice with exactly as many entries as there are records in data. profileBuilder.addCPUData iterates through the tags in lockstep with the data records. Except in the special case of the first record, where it forgets to increment tags. Thus the first read of profiling data has all tags off-by-one. To help avoid regressions, addCPUData is changed to assert that tags contains exactly the correct number of tags. For #50007. Change-Id: I5f32f93003297be8d6e33ad472c185d924a63256 Reviewed-on: https://go-review.googlesource.com/c/go/+/369741 Reviewed-by: Austin Clements <austin@google.com> Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
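The lockstep iteration and the added assertion can be sketched with a simplified record stream. The record layout here (a length word followed by payload words, with the first record acting as a header) and the names sample and pairRecords are assumptions for illustration; they are not the actual profBuf encoding.

```go
package main

import (
	"errors"
	"fmt"
)

// sample pairs one data record with its tag (hypothetical type).
type sample struct {
	payload []uint64
	tag     string
}

// pairRecords walks data and tags in lockstep. Each record starts with its
// total length in words. The first record is treated as a header: it yields
// no sample, but it still consumes its tag slot, which is the step the
// original bug skipped.
func pairRecords(data []uint64, tags []string) ([]sample, error) {
	var out []sample
	first := true
	for len(data) > 0 {
		n := int(data[0])
		if n < 1 || n > len(data) {
			return nil, errors.New("truncated record")
		}
		if len(tags) == 0 {
			return nil, errors.New("fewer tags than records")
		}
		rec, tag := data[1:n], tags[0]
		data, tags = data[n:], tags[1:] // advance both, header included
		if first {
			first = false
			continue
		}
		out = append(out, sample{payload: rec, tag: tag})
	}
	if len(tags) != 0 {
		return nil, errors.New("more tags than records")
	}
	return out, nil
}

func main() {
	data := []uint64{2, 100, 3, 7, 8, 2, 9} // header, then two records
	tags := []string{"", "a", "b"}
	samples, err := pairRecords(data, tags)
	fmt.Println(samples, err)
}
```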
2021-10-06  all: use bytes.Cut, strings.Cut  Russ Cox
Many uses of Index/IndexByte/IndexRune/Split/SplitN can be written more clearly using the new Cut functions. Do that. Also rewrite to other functions if that's clearer. For #46336. Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448 Reviewed-on: https://go-review.googlesource.com/c/go/+/351711 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
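A small before-and-after sketch of the kind of rewrite this describes. The helper names splitWithIndex and splitWithCut and the address-range input are made up for illustration; strings.Index and strings.Cut are the standard library functions.

```go
package main

import (
	"fmt"
	"strings"
)

// splitWithIndex is the older style: find the separator, then slice by hand.
func splitWithIndex(s string) (lo, hi string, ok bool) {
	i := strings.Index(s, "-")
	if i < 0 {
		return s, "", false
	}
	return s[:i], s[i+1:], true
}

// splitWithCut expresses the same split with strings.Cut.
func splitWithCut(s string) (lo, hi string, ok bool) {
	return strings.Cut(s, "-")
}

func main() {
	lo, hi, ok := splitWithCut("00400000-00411000")
	fmt.Println(lo, hi, ok)
}
```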
2021-05-24  [dev.typeparams] runtime/pprof: replace funcPC with internal/abi.FuncPCABIInternal  Cherry Mui
All funcPC references are ABIInternal functions. Replace with the intrinsics. Change-Id: I2266bb6d2b713eb63b6a09846e9f9c423cab6e9b Reviewed-on: https://go-review.googlesource.com/c/go/+/322349 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-12-09  all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp  Russ Cox
As part of #42026, these helpers from io/ioutil were moved to os. (ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.) Update the Go tree to use the preferred names. As usual, code compiled with the Go 1.4 bootstrap toolchain and code vendored from other sources is excluded. ReadDir changes are in a separate CL, because they are not a simple search and replace. For #42026. Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/266365 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
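For reference, a short sketch that uses all four of the preferred os helpers in one round trip. The function name roundTrip and the file name patterns are illustrative only.

```go
package main

import (
	"fmt"
	"os"
)

// roundTrip (hypothetical helper) writes data to a temporary file inside a
// temporary directory and reads it back, using the os helpers that replace
// ioutil.TempDir, ioutil.TempFile, ioutil.WriteFile and ioutil.ReadFile.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "example-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir) // clean up the directory and its contents

	f, err := os.CreateTemp(dir, "profile-*.txt")
	if err != nil {
		return nil, err
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		return nil, err
	}

	if err := os.WriteFile(name, data, 0o644); err != nil {
		return nil, err
	}
	return os.ReadFile(name)
}

func main() {
	got, err := roundTrip([]byte("hello"))
	fmt.Println(string(got), err)
}
```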
2020-04-21  test/codegen, runtime/pprof, runtime: apply fmt  alex-semenyuk
Change-Id: Ife4e065246729319c39e57a4fbd8e6f7b37724e1 GitHub-Last-Rev: e71803eaeb366c00f6c156de0b0b2c50927a0e82 GitHub-Pull-Request: golang/go#38527 Reviewed-on: https://go-review.googlesource.com/c/go/+/228901 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
2020-04-13  runtime/pprof: clarify recursive inline heuristic  Michael Pratt
Following CL 226818, the compiler will allow inlining a single cycle in an inline chain. Immediately-recursive functions are still disallowed, which is what this heuristic refers to. Add a regression test for this case. Note that in addition to this check, if the compiler were to inline multiple cycles via a loop (i.e., rather than appending duplicate code), much more work would be required here to handle a single address appearing in multiple different inline frames. Updates #29737 Change-Id: I88de15cfbeabb9c04381e1c12cc36778623132a5 Reviewed-on: https://go-review.googlesource.com/c/go/+/227346 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2020-03-27  runtime/pprof: increment fake overflow record PC  Michael Pratt
gentraceback generates PCs which are usually following the CALL instruction. For those that aren't, it fixes up the PCs so that functions processing the output can unconditionally decrement the PC. runtime_expandInlineFrames does this unconditional decrement when looking up the function. However, the fake stack frame generated for overflow records fails to meet the contract, and decrementing the PC results in a PC in the previous function. If that function contains an inlined call, runtime_expandInlineFrames will not short-circuit and will panic trying to look up a PC that doesn't exist. Note that the added test does not fail at HEAD. It will only fail (with a panic) if the function preceding lostProfileEvent contains inlined function calls. At the moment (on linux/amd64), that is runtime/pprof.addMaxRSS, which does not. Fixes #38096 Change-Id: Iad0819f23c566011c920fd9a5b1254719228da0b Reviewed-on: https://go-review.googlesource.com/c/go/+/225661 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-03-05  runtime/pprof: expand final stack frame to avoid truncation  Michael Pratt
When generating stacks, the runtime automatically expands inline functions to include all inline frames in the stack. However, due to the stack size limit, the final frame may be truncated in the middle of several inline frames at the same location. As-is, we assume that the final frame is a normal function, and emit and cache a Location for it. If we later receive a complete stack frame, we will first use the cached Location for the inlined function and then generate a new Location for the "caller" frame, in violation of the pprof requirement to merge inlined functions into the same Location. As a result, we:
1. Nondeterministically may generate a profile with the different stacks combined or split, depending on which is encountered first. This is particularly problematic when performing a diff of profiles.
2. When split stacks are generated, we lose the inlining information.
We avoid both of these problems by performing a second expansion of the last stack frame to recover additional inline frames that may have been lost. This expansion is a bit simpler than the one done by the runtime because we don't have to handle skipping, and we know that the last emitted frame is not an elided wrapper, since it by definition is already included in the stack.
Fixes #37446 Change-Id: If3ca2af25b21d252cf457cc867dd932f107d4c61 Reviewed-on: https://go-review.googlesource.com/c/go/+/221577 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
2019-11-22  runtime/pprof: avoid crash due to truncated stack traces  Hana Kim
The profile proto message builder maintains a location entry cache that maps a location (possibly involving multiple user frames that represent inlined function calls) to the location id. We have been using the first pc of the inlined call sequence as the key of the cached location entry assuming that, for a given pc, the sequence of frames representing the inlined call stack is deterministic and stable. Then, when analyzing the new stack trace, we expected the exact number of pcs to be present in the captured stack trace upon the cache hit. This assumption does not hold, however, in the presence of the stack trace truncation in the runtime during profiling, and also with the potential bugs in runtime. A better fix is to use all the pcs of the inlined call sequence as the key instead of the first pc. But that is a bigger code change. This CL avoids the crash assuming the trace was truncated. Fixes #35538 Change-Id: I8c6bae98bc8b178ee51523c7316f56b1cce6df16 Reviewed-on: https://go-review.googlesource.com/c/go/+/207609 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2019-11-14  runtime/pprof: fix the inlined frame merge logic  Hana Kim
tryAdd shouldn't succeed (and accept the new frame) if the last existing frame on the deck is not an inlined frame. For example, when we see the following stack [300656 300664 300655 300664] where each PC corresponds to
[{PC:300656 Func:nil Function:runtime.nanotime File:/workdir/go/src/runtime/time_nofake.go Line:19 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300664 Func:0x28dac8 Function:runtime.checkTimers File:/workdir/go/src/runtime/proc.go Line:2623 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300655 Func:nil Function:runtime.nanotime File:/workdir/go/src/runtime/time_nofake.go Line:19 Entry:300416 {0x28dac8 0x386c80}}]
[{PC:300664 Func:0x28dac8 Function:runtime.checkTimers File:/workdir/go/src/runtime/proc.go Line:2623 Entry:300416 {0x28dac8 0x386c80}}]
PC:300656 and PC:300664 belong to a single location entry, but the bug in the current tryAdd logic placed the entire stack into one location entry. Also adds tests - this crash is a tricky case to test because I think it should happen with normal go code. The new TestTryAdd simulates it by using fake call sequences. The test crashed without the fix.
Update #35538 Change-Id: I6d3483f757abf4c429ab91616e4def90832fc04a Reviewed-on: https://go-review.googlesource.com/c/go/+/206958 Reviewed-by: Keith Randall <khr@golang.org>
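The rule above can be sketched as a merge predicate applied to the example stack. The frame struct (with HasFunc standing in for a non-nil runtime.Frame.Func), canMerge and groupLocations are hypothetical; the additional checks on Entry and on repeated function names follow the heuristics described in the 2019-11-08 CPU profile commit on this page, not the exact tryAdd source.

```go
package main

import "fmt"

// frame is a hypothetical stand-in for the runtime.Frame fields used here.
type frame struct {
	Function string
	Entry    uintptr
	HasFunc  bool // true where runtime.Frame.Func would be non-nil
}

// canMerge reports whether next may join the location that currently ends
// with last.
func canMerge(last, next frame) bool {
	if last.HasFunc {
		return false // last frame is not an inlined frame
	}
	if last.Entry == 0 || next.Entry == 0 {
		return false // possibly a non-Go function lacking Frame fields
	}
	if last.Entry != next.Entry {
		return false // different enclosing function
	}
	if last.Function == next.Function {
		return false // same name and entry: treat as recursion
	}
	return true
}

// groupLocations splits a stack into groups of frames, one per location.
func groupLocations(stack []frame) [][]frame {
	var groups [][]frame
	for _, f := range stack {
		if n := len(groups); n > 0 {
			g := groups[n-1]
			if canMerge(g[len(g)-1], f) {
				groups[n-1] = append(g, f)
				continue
			}
		}
		groups = append(groups, []frame{f})
	}
	return groups
}

// exampleStack mirrors the four frames quoted in the commit message.
func exampleStack() []frame {
	nanotime := frame{Function: "runtime.nanotime", Entry: 300416, HasFunc: false}
	checkTimers := frame{Function: "runtime.checkTimers", Entry: 300416, HasFunc: true}
	return []frame{nanotime, checkTimers, nanotime, checkTimers}
}

func main() {
	for i, g := range groupLocations(exampleStack()) {
		fmt.Println("location", i+1, "has", len(g), "frames")
	}
}
```

With the first check in place the stack splits into two locations of two frames each, instead of one location holding all four.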
2019-11-08  runtime: fix line number for faulting instructions  Keith Randall
Unlike function calls, when processing instructions that directly fault we must not subtract 1 from the pc before looking up the file/line information. Since the file/line lookup unconditionally subtracts 1, add 1 to the faulting instruction PCs to compensate. Fixes #34123 Change-Id: Ie7361e3d2f84a0d4f48d97e5a9e74f6291ba7a8b Reviewed-on: https://go-review.googlesource.com/c/go/+/196962 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
2019-11-08  runtime/pprof: delete unused locForPC  Hana (Hyang-Ah) Kim
Change-Id: Ie4754fefba6057b1cf558d0096fe0e83355f8eff Reviewed-on: https://go-review.googlesource.com/c/go/+/205098 TryBot-Result: Gobot Gobot <gobot@golang.org> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2019-11-08  runtime/pprof: correct inlined function location encoding for non-CPU profiles  Hana (Hyang-Ah) Kim
Change-Id: Id270a3477bf1a581755c4311eb12f990aa2260b5 Reviewed-on: https://go-review.googlesource.com/c/go/+/205097 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-11-08  runtime/pprof: correctly encode inlined functions in CPU profile  Hana (Hyang-Ah) Kim
The pprof profile proto message expects inlined functions of a PC to be encoded in one Location entry using multiple Line entries. https://github.com/google/pprof/blob/5e96527/proto/profile.proto#L177-L184
runtime/pprof has encoded the symbolization information by creating a Location for each PC found in the stack trace and including info from all the frames expanded from the PC using runtime.CallersFrames. This assumes inlined functions are represented as a single PC in the stack trace. (https://go-review.googlesource.com/41256)
In recent years, behavior around inlining and the traceback changed significantly (e.g. https://golang.org/cl/152537, https://golang.org/issue/29582, and many changes). Now the PCs in the stack trace represent user frames even including inline marks. As a result, the profile proto started to allocate a Location entry for each user frame, lose the inline information (so pprof presented incorrect results when inlined functions are involved), and confuse the pprof tool with those PCs made up for inline marks.
This CL attempts to detect inlined call frames from the stack traces of CPU profiles, and organize the Location information as intended. Currently, runtime does not provide a reliable and convenient way to detect inlined call frames and expand user frames from given externally recognizable PCs. So we use heuristics to recover the groups:
- inlined call frames have nil Func field
- inlined call frames will have the same Entry point
- but must be careful with recursive functions that have the same Entry point by definition, and non-Go functions that may lack most of the fields of Frame.
The followup CL will address the issue with other profile types.
Change-Id: I0c9667ab016a3e898d648f31c3f82d84c15398db Reviewed-on: https://go-review.googlesource.com/c/go/+/204636 Reviewed-by: Keith Randall <khr@golang.org>
2019-01-08  runtime: store incremented PC in result of runtime.Callers  Keith Randall
In 1.11 we stored "return addresses" in the result of runtime.Callers. I changed that behavior in CL 152537 to store an address in the call instruction itself. This CL reverts that part of 152537. The change in 152537 was made because we now store pcs of inline marks in the result of runtime.Callers as well. This CL will now store the address of the inline mark + 1 in the results of runtime.Callers, so that the subsequent -1 done in CallersFrames will pick out the correct inline mark instruction. This CL means that the results of runtime.Callers can be passed to runtime.FuncForPC as they were before. There are a bunch of packages in the wild that take the results of runtime.Callers, subtract 1, and then call FuncForPC. This CL keeps that pattern working as it did in 1.11. The changes to runtime/pprof in this CL are exactly a revert of the changes to that package in 152537 (except the locForPC comment). Update #29582 Change-Id: I04d232000fb482f0f0ff6277f8d7b9c72e97eb48 Reviewed-on: https://go-review.googlesource.com/c/156657 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
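The pattern that this CL keeps working, taking a PC from runtime.Callers, subtracting 1, and passing it to runtime.FuncForPC, looks roughly like the sketch below. The function name currentFuncName is made up for the example, and the noinline directive is there only to keep the illustration simple.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentFuncName (hypothetical helper) reports its own name using the
// Callers / subtract 1 / FuncForPC pattern.
//
//go:noinline
func currentFuncName() string {
	pcs := make([]uintptr, 1)
	// skip=1 skips the frame of runtime.Callers itself, so pcs[0] is a PC
	// inside currentFuncName.
	if runtime.Callers(1, pcs) == 0 {
		return ""
	}
	// Subtract 1 so the lookup lands on the call instruction rather than on
	// whatever follows it.
	fn := runtime.FuncForPC(pcs[0] - 1)
	if fn == nil {
		return ""
	}
	return fn.Name()
}

func main() {
	fmt.Println(currentFuncName())
}
```

For new code, runtime.CallersFrames is usually the more convenient way to turn the PCs into function, file and line information, since it also accounts for inlined frames.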
2018-12-28  cmd/compile,runtime: redo mid-stack inlining tracebacks  Keith Randall
Work involved in getting a stack trace is divided between runtime.Callers and runtime.CallersFrames. Before this CL, runtime.Callers returns a pc per runtime frame. runtime.CallersFrames is responsible for expanding a runtime frame into potentially multiple user frames. After this CL, runtime.Callers returns a pc per user frame. runtime.CallersFrames just maps those to user frame info. Entries in the result of runtime.Callers are now pcs of the calls (or of the inline marks), not of the instruction just after the call. Fixes #29007 Fixes #28640 Update #26320 Change-Id: I1c9567596ff73dc73271311005097a9188c3406f Reviewed-on: https://go-review.googlesource.com/c/152537 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2018-09-07  runtime/pprof: remove "deleted" suffix while parsing maps file  Marko Kevac
If the binary file of a running program was deleted or moved, the maps file (/proc/pid/maps) will contain lines that have this binary's filename suffixed with the string "(deleted)". This suffix stayed as a part of the filename and made remote profiling slightly more difficult by requiring the user to rename the binary file to include this suffix. This change cleans up the filename by removing the suffix, which simplifies debugging. Fixes #25740 Change-Id: Ib3c8c3b9ef536c2ac037fcc14e8037fa5c960036 Reviewed-on: https://go-review.googlesource.com/116395 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
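The cleanup amounts to trimming a fixed suffix from the path field. A minimal sketch, where cleanMapsPath is a hypothetical helper and the suffix is assumed to be preceded by a single space as it appears in /proc/pid/maps:

```go
package main

import (
	"fmt"
	"strings"
)

// cleanMapsPath (hypothetical helper) removes the " (deleted)" marker that
// the kernel appends to the path of a mapped file that no longer exists.
func cleanMapsPath(path string) string {
	return strings.TrimSuffix(path, " (deleted)")
}

func main() {
	fmt.Println(cleanMapsPath("/tmp/app (deleted)"))
	fmt.Println(cleanMapsPath("/usr/bin/app"))
}
```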
2018-07-16runtime/pprof: add a fake mapping when /proc/self/maps is unavailableHana (Hyang-Ah) Kim
The profile's Mapping field is currently populated by reading /proc/self/maps. On systems where /proc/self/maps is not available, the profile generated by Go's runtime will not have any Mapping entry. The pprof command then adds a fake entry and links all Location entries in the profile with the fake entry to be used during symbolization. https://github.com/google/pprof/blob/a8644067d5a3c9a6386e7c88fa4a3d9d37877ca3/internal/driver/fetch.go#L437 The fake entry is not enough to suppress the error or warning messages the pprof command produces. We need to tell pprof that Location entries are already symbolized by the Go runtime and that pprof does not have to attempt further symbolization. In #25743, we made the Go runtime mark Mapping entries with HasFunctions=true when all Location entries from the Mapping entries are successfully symbolized. This change makes the Go runtime add the fake mapping entry that the pprof command would otherwise add, and set HasFunctions=true following the same logic used when the real mapping information is available. Updates #19790. Fixes #26255. Tested that pprof no longer reports the error message for a pure Go program. Change-Id: Ib12b62e15073f5d6c80967e44b3e8709277c11bd Reviewed-on: https://go-review.googlesource.com/123779 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-13runtime/pprof: set HasFunctions of mapping entriesHana Kim
The pprof tool uses attributes of mapping entries such as HasFunctions to determine whether the profile includes the necessary symbol information. If none of the attributes is set, the pprof tool tries to read the corresponding binary to use for local symbolization. If the binary doesn't exist, it prints error messages. The Go runtime generated profiles without any of the attributes set, so the pprof tool always printed the error messages. The error messages became more obvious with the new terminal support that uses red for error messages. The Go runtime can symbolize all Go symbols and generate a self-contained profile for a pure Go program. Thus, there is no reason for the pprof tool to look for a copy of the binary. So, this CL sets one of the attributes (HasFunctions) to true if all PCs in samples look fully symbolized. For a non-pure Go program, however, it's possible that symbolization of non-Go PCs is incomplete. In this case, we need to leave the attributes all false so pprof can attempt to symbolize using the local copy of the binary if available. It's hard to determine whether a mapping includes non-Go code. Instead, this CL checks PCs from collected samples. If unsuccessful symbolization is observed, it skips setting the HasFunctions attribute. Fixes #25743 Change-Id: I5108be45bbc37ab486d145fa03e7ce37d88fad50 Reviewed-on: https://go-review.googlesource.com/118275 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
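The rule described above (set HasFunctions only when no sampled PC failed to symbolize) can be sketched as follows. The location type and the hasFunctions helper are hypothetical simplifications, not the runtime's actual data structures, and returning false for an empty sample set is an assumption of this sketch.

```go
package main

import "fmt"

// location is a hypothetical, simplified stand-in for a profile Location.
type location struct {
	pc         uintptr
	symbolized bool // whether the runtime resolved this pc to a function
}

// hasFunctions reports whether a mapping could be marked HasFunctions=true:
// every sampled location must have been symbolized successfully.
// With no locations at all, this sketch conservatively returns false.
func hasFunctions(locs []location) bool {
	if len(locs) == 0 {
		return false
	}
	for _, l := range locs {
		if !l.symbolized {
			return false
		}
	}
	return true
}

func main() {
	pureGo := []location{{0x1000, true}, {0x2000, true}}
	withCgo := []location{{0x1000, true}, {0x7f00, false}}
	fmt.Println(hasFunctions(pureGo), hasFunctions(withCgo))
}
```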
2018-04-24runtime/pprof: introduce "allocs" profileHana (Hyang-Ah) Kim
Go's heap profile contains four kinds of samples (inuse_space, inuse_objects, alloc_space, and alloc_objects). The pprof tool by default chooses inuse_space (the bytes of live, in-use objects). When analyzing the current memory usage, the choice of inuse_space as the default may be useful, but in some cases users are more interested in analyzing the total allocation statistics throughout the program execution. For example, when we analyze the memory profile from a benchmark or a program test run, we are more likely interested in the whole allocation history than in the live heap snapshot at the end of the test or benchmark. The pprof tool provides flags to control which sample type is used for analysis. However, it is one of the lesser-known features of pprof, and we believe it's better to choose the right type of samples as the default when producing the profile. This CL introduces a new type of profile, "allocs", which is the same as the "heap" profile but marks alloc_space as the default type, unlike heap profiles, which use inuse_space as the default type. The 'go test -memprofile=...' command is changed to use the new "allocs" profile type instead of the traditional "heap" profile. Fixes #24443 Change-Id: I012dd4b6dcacd45644d7345509936b8380b6fbd9 Reviewed-on: https://go-review.googlesource.com/102696 Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Russ Cox <rsc@golang.org>
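A program can request the "allocs" profile by name through runtime/pprof. The sketch below assumes Go 1.11 or later, where the profile exists; the helper name writeAllocsProfile is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// writeAllocsProfile is a hypothetical helper that returns the "allocs"
// profile in the gzip-compressed protobuf format (debug level 0).
func writeAllocsProfile() ([]byte, error) {
	p := pprof.Lookup("allocs")
	if p == nil {
		return nil, fmt.Errorf("allocs profile not available")
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := writeAllocsProfile()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("allocs profile: %d bytes\n", len(data))
}
```

Written to a file, the result can be opened with go tool pprof, which should then default to the alloc_space sample type.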
2018-03-24all: remove some unused return parametersDaniel Martí
As found by unparam. Picked the low-hanging fruit, consisting only of errors that were always nil and results that were never used. Left out those that were useful for consistency with other func signatures. Change-Id: I06b52bbd3541f8a5d66659c909bd93cb3e172018 Reviewed-on: https://go-review.googlesource.com/102418 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-05-15runtime/pprof: expand inlined frames in symbolized proto profilesAustin Clements
Currently proto symbolization uses runtime.FuncForPC and assumes each PC maps to a single frame. This isn't true in the presence of inlining (even with leaf-only inlining this can get incorrect results). Change PC symbolization to use runtime.CallersFrames to expand each PC to all of the frames at that PC. Change-Id: I8d20dff7495a5de495ae07f569122c225d433ced Reviewed-on: https://go-review.googlesource.com/41256 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
2017-05-15runtime/pprof: clean up call/return PCs in memory profilesAustin Clements
Proto profile conversion is inconsistent about call vs return PCs in profile locations. The proto defines locations to be call PCs. This is what we do when proto-izing CPU profiles, but we fail to convert the return PCs in memory and count profile stacks to call PCs when converting them to proto locations. Fix this in the heap and count profile conversion functions. TestConvertMemProfile also hard-codes this failure to convert from return PCs to call PCs, so fix up the addresses in the synthesized profile to be return PCs while checking that we get call PCs out of the conversion. Change-Id: If1fc028b86fceac6d71a2d9fa6c41ff442c89296 Reviewed-on: https://go-review.googlesource.com/42951 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
2017-04-28runtime/pprof: propagate profile labels into profile protoMichael Matloob
Profile labels added by the user using pprof.Do, if present, will be in a *labelMap stored in the unsafe.Pointer 'tag' field of the profile map entry. This change extracts the labels from the tag field and writes them to the profile proto. Change-Id: Ic40fdc58b66e993ca91d5d5effe0e04ffbb5bc46 Reviewed-on: https://go-review.googlesource.com/39613 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
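From the user's side, attaching labels with pprof.Do looks roughly like this. The label key "request" and value "checkout" are made-up examples, and the helper name labelInsideDo is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// labelInsideDo is a hypothetical helper. It runs a function under a pprof
// label and reads the label back from the context passed to that function.
func labelInsideDo() (string, bool) {
	var value string
	var ok bool
	labels := pprof.Labels("request", "checkout")
	pprof.Do(context.Background(), labels, func(ctx context.Context) {
		// CPU samples taken while this function runs carry the label.
		value, ok = pprof.Label(ctx, "request")
	})
	return value, ok
}

func main() {
	v, ok := labelInsideDo()
	fmt.Println(v, ok)
}
```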
2017-04-26runtime/pprof: ignore dummy huge page mapping in /proc/self/mapsRuss Cox
Change-Id: I72bea1450386100482b4681b20eb9a9af12c7522 Reviewed-on: https://go-review.googlesource.com/41816 Reviewed-by: Michael Matloob <matloob@golang.org>
2017-04-26runtime/pprof: add /proc/self/maps parsing testRuss Cox
Delete old TestRuntimeFunctionTrimming, which is testing a dead API and is now handled in end-to-end tests. Change-Id: I64fc2991ed4a7690456356b5f6b546f36935bb67 Reviewed-on: https://go-review.googlesource.com/41815 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
2017-04-20runtime/pprof: fix period informationAustin Clements
The period recorded in CPU profiles is in nanoseconds, but was being computed incorrectly as hz * 1000. As a result, many absolute times displayed by pprof were incorrect. Fix this by computing the period correctly. Change-Id: I6fadd6d8ad3e57f31e8cc7a25a24fcaec510d8d4 Reviewed-on: https://go-review.googlesource.com/40995 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com> Reviewed-by: Russ Cox <rsc@golang.org>
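The commit message gives only the incorrect formula. As a sketch of the arithmetic, assuming the period is the number of nanoseconds between samples at a rate of hz samples per second, the two computations compare as follows. Both function names are hypothetical.

```go
package main

import "fmt"

// periodNanos returns the sampling period in nanoseconds for a rate of hz
// samples per second: one second (1e9 ns) divided by the rate.
func periodNanos(hz int64) int64 {
	return 1e9 / hz
}

// buggyPeriod reproduces the incorrect computation quoted in the commit
// message (hz * 1000), for comparison only.
func buggyPeriod(hz int64) int64 {
	return hz * 1000
}

func main() {
	const hz = 100
	fmt.Println(periodNanos(hz), buggyPeriod(hz))
}
```

At 100 Hz the period is 10,000,000 ns, while hz * 1000 gives 100,000, which is why absolute times derived from it were wrong.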
2017-03-08runtime/pprof: add GNU build IDs to Mappings recorded from /proc/self/mapsRuss Cox
This helps systems that maintain an external database mapping build ID to symbol information for the given binary, especially in the case where /proc/self/maps lists many different files (for example, many shared libraries). Avoid importing debug/elf to avoid dragging in that whole package (and its dependencies like debug/dwarf) into the build of every program that generates a profile. Fixes #19431. Change-Id: I6d4362a79fe23e4f1726dffb0661d20bb57f766f Reviewed-on: https://go-review.googlesource.com/37855 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-03-07runtime/pprof: actually use tag parameterDaniel Martí
It's only ever called with the value it was using, but the code was counterintuitive. Use the parameter instead, like the other funcs near it. Found by github.com/mvdan/unparam. Change-Id: I45855e11d749380b9b2a28e6dd1d5dedf119a19b Reviewed-on: https://go-review.googlesource.com/37893 Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-24runtime/pprof: add streaming protobuf encoderRuss Cox
The existing code builds a full profile in memory. Then it translates that profile into a data structure (in memory). Then it marshals that data structure into a protocol buffer (in memory). Then it gzips that marshaled form into the underlying writer. So there are three copies of the full profile data in memory at the same time before we're done. This is obviously dumb. This CL implements a fully streaming conversion from the original in-memory profile to the underlying writer. There is now only one copy of the profile in memory. For the non-CPU profiles, this is optimal, since we have to have a full copy in memory to start with. For the CPU profiles, we could still try to bound the profile size stored in memory and stream fragments out during the actual profiling, as Go 1.7 did (with a simpler format), but so far that hasn't been necessary. Change-Id: Ic36141021857791bf0cd1fce84178fb5e744b989 Reviewed-on: https://go-review.googlesource.com/37164 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
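The idea of streaming can be illustrated with a small hand-rolled protobuf writer that encodes one record at a time into a reused buffer and writes it straight to a gzip stream. This is only a sketch of the general technique, not the encoder added by the CL; the function names and the field numbers 1 and 2 are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// Protobuf wire types used by this sketch.
const (
	wireVarint = 0
	wireBytes  = 2
)

// appendVarint appends x in protobuf base-128 varint encoding.
func appendVarint(b []byte, x uint64) []byte {
	for x >= 0x80 {
		b = append(b, byte(x)|0x80)
		x >>= 7
	}
	return append(b, byte(x))
}

// appendUint64Field appends a varint field: the tag, then the value.
func appendUint64Field(b []byte, field int, x uint64) []byte {
	b = appendVarint(b, uint64(field)<<3|wireVarint)
	return appendVarint(b, x)
}

// writeMessage writes msg to w as a length-delimited field, so each record
// leaves memory as soon as it has been encoded.
func writeMessage(w io.Writer, field int, msg []byte) error {
	hdr := appendVarint(nil, uint64(field)<<3|wireBytes)
	hdr = appendVarint(hdr, uint64(len(msg)))
	if _, err := w.Write(hdr); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

func main() {
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	var buf []byte
	for i := uint64(1); i <= 3; i++ {
		// Reuse buf for every record instead of building the whole message.
		buf = appendUint64Field(buf[:0], 1, i*100)
		if err := writeMessage(zw, 2, buf); err != nil {
			fmt.Println("write failed:", err)
			return
		}
	}
	if err := zw.Close(); err != nil {
		fmt.Println("close failed:", err)
		return
	}
	fmt.Printf("wrote %d compressed bytes\n", out.Len())
}
```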
2017-02-24runtime/pprof: use more efficient hash table for staging profileRuss Cox
The old hash table was a placeholder that allocated memory during every lookup for key generation, even for keys that hit in the table. Change-Id: I4f601bbfd349f0be76d6259a8989c9c17ccfac21 Reviewed-on: https://go-review.googlesource.com/37163 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
2017-02-24runtime/pprof: use new profile buffers for CPU profilingRuss Cox
This doesn't change the functionality of the current code, but it sets us up for exporting the profiling labels into the profile. The old code had a hash table of profile samples maintained during the signal handler, with evictions going into a log. The new code just logs every sample directly, leaving the hash-based deduplication to an ordinary goroutine. The new code also avoids storing the entire profile in two forms in memory, an unfortunate regression introduced when binary profile support was added. After this CL the entire profile is only stored once in memory. We'd still like to get back down to storing it zero times (streaming it to the underlying io.Writer). Change-Id: I0893a1788267c564aa1af17970d47377b2a43457 Reviewed-on: https://go-review.googlesource.com/36712 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Matloob <matloob@golang.org>
2017-02-10runtime/pprof: merge internal/protopprof into pprof packageRuss Cox
These are very tightly coupled, and internal/protopprof is small. There's no point to having a separate package. Change-Id: I2c8aa49c9e18a7128657bf2b05323860151b5606 Reviewed-on: https://go-review.googlesource.com/36711 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>