| Age | Commit message (Collapse) | Author |
|
For #59670
Change-Id: If2b05b1ba30b607b518577b0e11ba5a0b07999c5
GitHub-Last-Rev: a664aa18b5ef674dc2d05c1f7533e1974d265894
GitHub-Pull-Request: golang/go#64906
Reviewed-on: https://go-review.googlesource.com/c/go/+/553276
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Change-Id: I8fe66326994894b17ce0eda991bba942844d26b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/541475
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ifb4844efddcb0369b0302eeab72394eeaf5c8072
Reviewed-on: https://go-review.googlesource.com/c/go/+/540022
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: shuang cui <imcusg@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I3f0b7209621b39cee69566a5cc95e4343b4f1f20
GitHub-Last-Rev: af9dbbe69ad74e8c210254dafa260a886b690853
GitHub-Pull-Request: golang/go#63321
Reviewed-on: https://go-review.googlesource.com/c/go/+/531916
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
The code meant to check if it is the last section, which is
i === len(md.textsectmap)-1. The -1 was missing.
Change-Id: Ifbb9e40df730abe3bec20fde5f56f5c75dfd9e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/527795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Now that pcvalue keeps its cache on the M, we can drop all of the
stack-allocated pcvalueCaches and stop carefully passing them around
between lots of operations. This significantly simplifies a fair
amount of code and makes several structures smaller.
This series of changes has no statistically significant effect on any
runtime Stack benchmarks.
I also experimented with making the cache larger, now that the impact
is limited to the M struct, but wasn't able to measure any
improvements.
This is a re-roll of CL 515277
Change-Id: Ia27529302f81c1c92fb9c3a7474739eca80bfca1
Reviewed-on: https://go-review.googlesource.com/c/go/+/520064
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
|
|
Currently, the pcvalue cache is stack allocated for each operation
that needs to look up a lot of pcvalues. It's not always clear where
to put it, a lot of the time we just pass a nil cache, it doesn't get
reused across operations, and we put a surprising amount of effort
into threading these caches around.
This CL moves it to the M, where it can be long-lived and used by all
pcvalue lookups, and we don't have to carefully thread it across
operations.
This is a re-roll of CL 515276 with a fix for reentrant use of the
pcvalue cache from the signal handler.
Change-Id: Id94c0c0fb3004d1fda1b196790eebd949c621f28
Reviewed-on: https://go-review.googlesource.com/c/go/+/520063
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This would have helped with debugging the failures caused by CL 515276.
Change-Id: Id641949d8bcd763de7f93778ad9bd3fdde95dcb2
Reviewed-on: https://go-review.googlesource.com/c/go/+/520062
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
|
|
This reverts CL 515276.
This broke the longtest builders. For example:
https://build.golang.org/log/351a1a198a6b843b1881c1fb6cdef51f3e413e8b
Change-Id: Ie79067464fe8e226da31721cf127f3efb6011452
Reviewed-on: https://go-review.googlesource.com/c/go/+/516856
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This reverts CL 515277
Change-Id: Ie10378eed4993cb69f4a9b43a38af32b9d743155
Reviewed-on: https://go-review.googlesource.com/c/go/+/516855
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Now that pcvalue keeps its cache on the M, we can drop all of the
stack-allocated pcvalueCaches and stop carefully passing them around
between lots of operations. This significantly simplifies a fair
amount of code and makes several structures smaller.
This series of changes has no statistically significant effect on any
runtime Stack benchmarks.
I also experimented with making the cache larger, now that the impact
is limited to the M struct, but wasn't able to measure any
improvements.
Change-Id: I4719ebf347c7150a05e887e75a238e23647c20cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/515277
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Currently, the pcvalue cache is stack allocated for each operation
that needs to look up a lot of pcvalues. It's not always clear where
to put it, a lot of the time we just pass a nil cache, it doesn't get
reused across operations, and we put a surprising amount of effort
into threading these caches around.
This CL moves it to the M, where it can be long-lived and used by all
pcvalue lookups, and we don't have to carefully thread it across
operations.
Change-Id: I675e583e0daac887c8ef77a402ba792648d96027
Reviewed-on: https://go-review.googlesource.com/c/go/+/515276
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Currently, pcvalue only returns a valid start PC if cache is nil.
We're about to eliminate the cache argument and always use a pcvalue
cache, so make sure the cache stores the start PC and always return it
from pcvalue.
Change-Id: Ie8854af4b7e7ba1c2a17a495d9229320821daa23
Reviewed-on: https://go-review.googlesource.com/c/go/+/515275
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
For generic functions, the previous CL makes it record the full
instantiated symbol name in the runtime func table. This CL
changes the pprof package to use that name in CPU profile. This
way, it matches the symbol name the compiler sees, so it can apply
PGO.
TODO: add a test.
Fixes #58712.
Change-Id: If40db01cbef5f73c279adcc9c290a757ef6955b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/491678
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
|
|
table
For generic functions and methods, we replace the instantiated
shape type parameter name to "...", to make the function name
printed in stack traces looks more user friendly. Currently, this
is done in the binary's runtime func table at link time, and the
runtime has no way to access the full symbol name. This causes
the profile to also contain the replaced name. For PGO, this also
cause the compiler to not be able to find out the original fully
instantiated function name from the profile.
With this CL, we change the linker to record the full name, and
do the name substitution at run time when a printing a function's
name in traceback.
For #58712.
Change-Id: Ia0ea0989a1ec231f3c4fbf59365c9333405396c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/491677
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
internal/abi
We also rename the constants related to unsafe-points: currently, they
follow the same naming scheme as the PCDATA table indexes, but are not
PCDATA table indexes.
For #59670.
Change-Id: I06529fecfae535be5fe7d9ac56c886b9106c74fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/485497
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For #59670.
Change-Id: Ie784ba4dd2701e4f455e1abde4a6bfebee4b1387
Reviewed-on: https://go-review.googlesource.com/c/go/+/485496
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
For #59670.
Change-Id: I517e97ea74cf232e5cfbb77b127fa8804f74d84b
Reviewed-on: https://go-review.googlesource.com/c/go/+/485495
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
|
|
(This is a retry of CL 462035 which was reverted at 474976.
The only change from that CL is the aix fix SRODATA->SNOPTRDATA
at inittask.go:141)
As described here:
https://github.com/golang/go/issues/31636#issuecomment-493271830
"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
and repeat."
Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.
Update #31636
Fixes #57411
RELNOTE=yes
Change-Id: I28c09451d6aa677d7394c179d23c2c02c503fc56
Reviewed-on: https://go-review.googlesource.com/c/go/+/478916
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently, all stack walking logic is in one venerable, large, and
very, very complicated function: runtime.gentraceback. This function
has three distinct operating modes: printing, populating a PC buffer,
or invoking a callback. And it has three different modes of unwinding:
physical Go frames, inlined Go frames, and cgo frames. It also has
several flags. All of this logic is very interwoven.
This CL reimplements the monolithic gentraceback function as an
"unwinder" type with an iterator API. It moves all of the logic for
stack walking into this new type, and gentraceback is now a
much-simplified wrapper around the new unwinder type that still
implements printing, populating a PC buffer, and invoking a callback.
Follow-up CLs will replace uses of gentraceback with direct uses of
unwinder.
Exposing traceback functionality as an iterator API will enable a lot
of follow-up work such as simplifying the open-coded defer
implementation (which should in turn help with #26813 and #37233),
printing the bottom of deep stacks (#7181), and eliminating the small
limit on CPU stacks in profiles (#56029).
Fixes #54466.
Change-Id: I36e046dc423c9429c4f286d47162af61aff49a0d
Reviewed-on: https://go-review.googlesource.com/c/go/+/458218
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
|
|
This converts all places in the runtime that perform inline expansion
to use the new inlineUnwinder abstraction.
For #54466.
Change-Id: I48d996fb6263ed5225bd21d30914a27ae434528d
Reviewed-on: https://go-review.googlesource.com/c/go/+/466099
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
We've replicated the code to expand inlined frames in many places in
the runtime at this point. This CL adds a simple iterator API that
abstracts this out.
We also use this to try out a new idea for structuring tests of
runtime internals: rather than exporting this whole internal data type
and API, we write the test in package runtime and import the few bits
of std we need. The idea is that, for tests of internals, it's easier
to inject public APIs from std than it is to export non-public APIs
from runtime. This is discussed more in #55108.
For #54466.
Change-Id: Iebccc04ff59a1509694a8ac0e0d3984e49121339
Reviewed-on: https://go-review.googlesource.com/c/go/+/466096
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
|
|
For #54466.
Change-Id: I4d8e1953703b6c763e5bd53024da43efcc993489
Reviewed-on: https://go-review.googlesource.com/c/go/+/466095
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
|
|
This reverts commit ce2a609909d9de3391a99a00fe140506f724f933.
aka CL 462035
Reason for revert: this CL is causing some problems in some internal Google programs.
Change-Id: I4476b8d8d2c3d7b5703d1d85c93baebb4b4e5d26
Reviewed-on: https://go-review.googlesource.com/c/go/+/474976
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
As described here:
https://github.com/golang/go/issues/31636#issuecomment-493271830
"Find the lexically earliest package that is not initialized yet,
but has had all its dependencies initialized, initialize that package,
and repeat."
Simplify the runtime a bit, by just computing the ordering required
in the linker and giving a list to the runtime.
Update #31636
Fixes #57411
RELNOTE=yes
Change-Id: I1e4d3878ebe6e8953527aedb730824971d722cac
Reviewed-on: https://go-review.googlesource.com/c/go/+/462035
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The existing runtime_expandFinalInlineFrame implementation doesn't skip trailing wrappers, but
gentraceback does skip wrapper functions.
This change makes runtime_expandFinalInlineFrame handling wrapper functions consistent to gentraceback.
Fixes #58288
Change-Id: I1b0e2c10b0a89bcb1e787b98d27730cb40a34406
Reviewed-on: https://go-review.googlesource.com/c/go/+/465097
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Change-Id: Ib6ea1bd04d9b06542ed2b0f453c718115417c62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449755
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Now that we plumb the start line to the runtime, we can include in pprof
files. Since runtime.Frame.startLine is not (currently) exported, we
need a runtime helper to get the value.
For #55022.
Updates #56135.
Change-Id: Ifc5b68a7b7170fd7895e4099deb24df7977b22ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/438255
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This adds the function "start line number" to runtime._func and
runtime.inlinedCall objects. The "start line number" is the line number
of the func keyword or TEXT directive for assembly.
Subtracting the start line number from PC line number provides the
relative line offset of a PC from the the start of the function. This
helps with source stability by allowing code above the function to move
without invalidating samples within the function.
Encoding start line rather than relative lines directly is convenient
because the pprof format already contains a start line field.
This CL uses a straightforward encoding of explictly including a start
line field in every _func and inlinedCall. It is possible that we could
compress this further in the future. e.g., functions with a prologue
usually have <line of PC 0> == <start line>. In runtime.test, 95% of
functions have <line of PC 0> == <start line>.
According to bent, this is geomean +0.83% binary size vs master and
-0.31% binary size vs 1.19.
Note that //line directives can change the file and line numbers
arbitrarily. The encoded start line is as adjusted by //line directives.
Since this can change in the middle of a function, `line - start line`
offset calculations may not be meaningful if //line directives are in
use.
For #55022.
Change-Id: Iaabbc6dd4f85ffdda294266ef982ae838cc692f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/429638
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Add a new "coverage counter" classification for variables to be used
for storing code coverage counter values (somewhat in the same way
that we identify fuzzer counters). Tagging such variables allows us to
aggregate them in the linker, and to treat updates specially.
Updates #51430.
Change-Id: Ib49fb05736ffece98bcc2f7a7c37e991b7f67bbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/401235
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
To match _func.nameOff.
Change-Id: I75e71cadaa0f7ca8844d1b49950673797b227074
Reviewed-on: https://go-review.googlesource.com/c/go/+/428658
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
Switch to the more Go-style name to match inlinedCall.nameOff.
Change-Id: I2115b27af8309e1ead7d61ecc65fe4fc966030f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/428657
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The meaning of this field is unchanged, this CL simply gives it a more
descriptive name, as func_ makes it sound like a reference to the _func.
Change-Id: I70e54f34bede7636ce4d7b9dd0f7557308f02143
Reviewed-on: https://go-review.googlesource.com/c/go/+/427961
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The parent, file, and line fields are no longer used now that we have
parentPc to find the parent and NOPs in the parent to attach file/line
pcdata to.
Removing these fields reduces the binary size of cmd/go on linux-amd64
by 1.1%.
Fixes #54849.
Change-Id: If58f08622736b2b322288608776f8bedf0c3fd17
Reviewed-on: https://go-review.googlesource.com/c/go/+/427960
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
For #53821.
Change-Id: I93409f377881a3c029b41b0f1fbcef5e21091f2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/419438
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
As it can't appear in user package paths.
There is a hack for handling "go:buildid" and "type:*" on windows/386.
Previously, windows/386 requires underscore prefix on external symbols,
but that's only applied for SHOSTOBJ/SUNDEFEXT or cgo export symbols.
"go.buildid" is STEXT, "type.*" is STYPE, thus they are not prefixed
with underscore.
In external linking mode, the external linker can't resolve them as
external symbols. But we are lucky that they have "." in their name,
so the external linker see them as Forwarder RVA exports. See:
- https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#export-address-table
- https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/pe-dll.c;h=e7b82ba6ffadf74dc1b9ee71dc13d48336941e51;hb=HEAD#l972)
This CL changes "." to ":" in symbols name, so theses symbols can not be
found by external linker anymore. So a hacky way is adding the
underscore prefix for these 2 symbols. I don't have enough knowledge to
verify whether adding the underscore for all STEXT/STYPE symbols are
fine, even if it could be, that would be done in future CL.
Fixes #37762
Change-Id: I92eaaf24c0820926a36e0530fdb07b07af1fcc35
Reviewed-on: https://go-review.googlesource.com/c/go/+/317917
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
A future change to gofmt will rewrite
// Doc comment.
//go:foo
to
// Doc comment.
//
//go:foo
Apply that change preemptively to all comments (not necessarily just doc comments).
For #51082.
Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
With the switch to the register ABI, we now generate wrapper
functions for go statements in many cases. A new goroutine's start
PC now points to the wrapper function. This does not affect
execution, but the runtime tracer uses the start PC and the
function name as the name/label of that goroutine. If the start
function is a named function, using the name of the wrapper loses
that information. Furthur, the tracer's goroutine view groups
goroutines by start PC. For multiple go statements with the same
callee, they are grouped together. With the wrappers, which is
context-dependent as it is a closure, they are no longer grouped.
This CL fixes the problem by providing the underlying unwrapped
PC for tracing. The compiler emits metadata to link the unwrapped
PC to the wrapper function. And the runtime reads that metadata
and record that unwrapped PC for tracing.
(This doesn't work for shared buildmode. Unfortunate.)
TODO: is there a way to test?
Fixes #50622.
Change-Id: Iaa20e1b544111c0255eb0fc04427aab7a5e3b877
Reviewed-on: https://go-review.googlesource.com/c/go/+/384158
Trust: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The code was updated, the comments were not.
Change-Id: If387779f3abd5e8a1b487fe34c33dcf9ce5fa7ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/383495
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Change filepath reference from cmd/internal/ld/symtab.go to
cmd/link/internal/ld/symtab.go.
Change-Id: Icb207a2e2c82d3976787d2d5cfb0f8005696f738
GitHub-Last-Rev: 428d99c6ca97db79b7d8cdf24843df3492a9aeb0
GitHub-Pull-Request: golang/go#49518
Reviewed-on: https://go-review.googlesource.com/c/go/+/363276
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
|
|
For #44167.
Change-Id: I2cd13229d88f630451fabd113b0e5a04841e9e79
Reviewed-on: https://go-review.googlesource.com/c/go/+/309590
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Currently, for stack traces (e.g. at panic or when runtime.Stack
is called), we print argument values from the stack. With register
ABI, we may never store the argument to stack therefore the
argument value on stack may be meaningless. This causes confusion.
This CL makes the compiler keep trace of which argument stack
slots are meaningful. If it is meaningful, it will be printed in
stack traces as before. If it may not be meaningful, it will be
printed as the stack value with a question mark ("?"). In general,
the value could be meaningful on some code paths but not others
depending on the execution, and the compiler couldn't know
statically, so we still print the stack value, instead of not
printing it at all. Also note that if the argument variable is
updated in the function body the printed value may be stale (like
before register ABI) but still considered meaningful.
Arguments passed on stack are always meaningful therefore always
printed without a question mark. Results are never printed, as
before.
(Due to a bug in the compiler we sometimes don't spill args into
their dedicated spill slots (as we should), causing it having
fewer meaningful values than it should be.)
This increases binary sizes a bit:
old new
hello 1129760 1142080 +1.09%
cmd/go 13932320 14088016 +1.12%
cmd/link 6267696 6329168 +0.98%
Fixes #45728.
Change-Id: I308a0402e5c5ab94ca0953f8bd85a56acd28f58c
Reviewed-on: https://go-review.googlesource.com/c/go/+/352057
Trust: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
In findfunc, we first us the relative PC to find the function's
index in functab. When we split text sections, as the external
linker may shift the sections, and the PC may not match the
(virtual) PC we used to build the functab. So the index may be
inaccurate, and we need to do a (forward or backward) linear
search to find the actual entry.
Instead of using the PC directly, we can first compute the
(pre-external-link virtual) relative PC and use that to find the
index in functab. This way, the index will be accurate and we will
not need to do the special backward linear search.
Change-Id: I8ab11c66b7a5a3d79aae00198b98780e10db27b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/354873
Trust: Cherry Mui <cherryyz@google.com>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
As the func table contains the end marker of the text section, we
sometimes need to get that address from an offset. Currently
textAddr doesn't handle that address, as it is not within any
text section. Instead of letting the callers not call textAddr
with the end offset, just handle it more elegantly in textAddr.
For #48837.
Change-Id: I6e97e455f6cb66e9680a7aac6152ba6f4cda2e12
Reviewed-on: https://go-review.googlesource.com/c/go/+/354635
Trust: Cherry Mui <cherryyz@google.com>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
fastrandn is ~50% faster than fastrand() % n.
`ack -v 'fastrand\(\)\s?\%'` finds all modulo on fastrand()
name old time/op new time/op delta
Fastrandn/2 2.86ns ± 0% 1.59ns ± 0% -44.35% (p=0.000 n=9+10)
Fastrandn/3 2.87ns ± 1% 1.59ns ± 0% -44.41% (p=0.000 n=10+9)
Fastrandn/4 2.87ns ± 1% 1.58ns ± 1% -45.10% (p=0.000 n=10+10)
Fastrandn/5 2.86ns ± 1% 1.58ns ± 1% -44.84% (p=0.000 n=10+10)
Change-Id: Ic91f5ca9b9e3b65127bc34792b62fd64fbd13b5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/353269
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Accept a uint32 instead of a uintptr to make call sites simpler.
Do less work in the common case in which len(textsectmap) == 1.
Change-Id: Idd6cdc3fdad7a9356864c83790463b5d3000171b
Reviewed-on: https://go-review.googlesource.com/c/go/+/354132
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
They're only used in a single place.
Instead of calculating the end every time,
calculate it in the linker.
It'd be nice to recalculate baseaddr-vaddr,
but that generates relocations that are too large.
While we're here, remove some pointless uintptr -> uintptr conversions.
Change-Id: I91758f9bff11b365bc3a63fee172dbdc3d90b966
Reviewed-on: https://go-review.googlesource.com/c/go/+/354089
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
The slowest thing that can happen in funcdata is a cache miss
on moduledata.gofunc. Move that memory load earlier.
Also, for better ergonomics when working on this code,
do more calculations as uintptrs.
name old time/op new time/op delta
StackCopyWithStkobj-8 10.5ms ± 5% 9.9ms ± 4% -6.03% (p=0.000 n=15+15)
Change-Id: I590f4449725983c7f8d274c4ac7ed384d9018d85
Reviewed-on: https://go-review.googlesource.com/c/go/+/354134
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
funcspdelta should be inlined: It is a tiny wrapper around another func.
The sanity check prevents that. Condition the sanity check on debugPcln.
While we're here, make the sanity check throw when it fails.
Change-Id: Iec022b8463b13a8e5a6d8479e7ddcb68909d6fe0
Reviewed-on: https://go-review.googlesource.com/c/go/+/354133
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|