path: root/src/runtime
Age | Commit message | Author
2024-10-28 | all: skip and fix various tests with -asan and -msan | Michael Anthony Knyszek

First, skip all the allocation count tests. In some cases this aligns with existing skips for -race, but in others we've got new issues. These are debug modes, so some performance loss is expected, and this is clearly no worse than today, where the tests fail.

Next, skip internal linking and static linking tests for msan and asan. With asan we get an explicit failure that neither is supported by the C and/or Go compilers. With msan, we only get the Go compiler telling us internal linking is unavailable; with static linking, we segfault instead. Filed #70080 to track that.

Next, skip some malloc tests with asan that don't quite work because of the redzone. This is because of some sizeclass assumptions that get broken with the redzone, and the fact that the tiny allocator is effectively disabled (again, due to the redzone).

Next, skip some runtime/pprof tests with asan, because of extra allocations.

Next, skip some malloc tests with asan that also fail because of extra allocations.

Next, fix up memstats accounting for arenas when asan is enabled. There is a bug where more is added to the stats than subtracted. This also simplifies the accounting a little.

Next, skip race tests with msan or asan enabled; they're mutually incompatible.

Fixes #70054. Fixes #64256. Fixes #64257. For #70079. For #70080.

Change-Id: I99c02a0b9d621e44f1f918b307aa4a4944c3ec60
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15
Reviewed-on: https://go-review.googlesource.com/c/go/+/622855
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
2024-10-28 | internal/runtime/maps: small maps point directly to a group | Michael Pratt

If the map contains 8 or fewer entries, it is wasteful to have a directory that points to a table that points to a group. Add a special case that replaces the directory with a direct pointer to a group.

We could theoretically do something similar for single-table maps (no directory, just point directly to a table), but that is left for later.

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap
Change-Id: I6fc04dfc11c31dadfe5b5d6481b4c4abd43d48ed
Reviewed-on: https://go-review.googlesource.com/c/go/+/611188
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-28 | crypto/internal/fips: add service indicator mechanism | Filippo Valsorda
Placed the fipsIndicator field in some 64-bit alignment padding in the g struct to avoid growing per-goroutine memory requirements on 64-bit targets. Fixes #69911 Updates #69536 Change-Id: I176419d0e3814574758cb88a47340a944f405604 Reviewed-on: https://go-review.googlesource.com/c/go/+/620795 Reviewed-by: Roland Shoemaker <roland@golang.org> Reviewed-by: Daniel McCarney <daniel@binaryparadox.net> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Derek Parker <parkerderek86@gmail.com>
2024-10-28 | Revert "crypto/rand: add randcrash=0 GODEBUG" | Filippo Valsorda

A GODEBUG is actually a security risk here: most programs will start to ignore errors from Read because they can't happen (which is the intended behavior), but then if a program is run with GODEBUG=randcrash=0, it will use a partial buffer if an error occurs, which may be catastrophic.

Note that the proposal was accepted without the GODEBUG, which was only added later.

This (partially) reverts CL 608435. I kept the tests.

Updates #66821

Change-Id: I3fd20f9cae0d34115133fe935f0cfc7a741a2662
Reviewed-on: https://go-review.googlesource.com/c/go/+/622115
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2024-10-25 | runtime: fix mallocgc for asan | Michael Anthony Knyszek

This change finally fully fixes mallocgc for asan after the recent refactoring. Here is everything that changed:

Fix the accounting for the alloc header; large objects don't have them.

Mask out extra bits set from unrolling the bitmap for slice backing stores in writeHeapBitsSmall. The redzone in asan mode makes it so that dataSize is no longer an exact multiple of typ.Size_ in this case (a new assumption I have recently discovered), but we didn't mask out any extra bits, so we'd accidentally set bits in other allocations. Oops.

Move the initHeapBits optimization for the 8-byte scan sizeclass on 64-bit platforms up to mallocgc, out from writeHeapBitsSmall. This actually caused a problem with asan when the optimization first landed, but we missed it. The issue was then masked once we started passing the redzone down into writeHeapBitsSmall, since the optimization would no longer erroneously fire on asan. What happened was that dataSize would be 8 (because that was the user-provided alloc size), so we'd skip writing heap bits, but it would turn out the redzone bumped the size class, so we'd actually *have* to write the heap bits for that size class. This is not really a problem now, *but* it caused problems for me when debugging, since I would try to remove the redzone from dataSize and this would trigger this bug again. Ultimately, this whole situation is confusing because the check in writeHeapBitsSmall is *not* the same as the check in initHeapBits. By moving this check up to mallocgc, we can make the checks align better by matching on the sizeclass, so this should be less error-prone in the future.

Change-Id: I1e9819223be23f722f3bf21e63e812f5fb557194
Reviewed-on: https://go-review.googlesource.com/c/go/+/622041
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-25 | runtime: reserve less memory for aligned reservation on sbrk systems | Cherry Mui

Sometimes the runtime needs to reserve memory with a large alignment, which the OS usually won't directly satisfy, so it asks for size+align bytes instead and frees the unaligned portions. On sbrk systems this doesn't work well, as freeing the tail portion doesn't actually return the memory to the OS. Instead, we can simply round the current break up to the alignment, then reserve the given size, without wasting the tail portion.

Also, don't create heap arena hints on sbrk systems. We can only grow the break sequentially, and reserving specific addresses would not succeed anyway.

For #69018.

Change-Id: Iadc2c54d62b00ad7befa5bbf71146523483a8c47
Reviewed-on: https://go-review.googlesource.com/c/go/+/621715
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-25 | runtime: document that Caller and Frame.File always use forward slashes | qmuntal
Document that Caller and Frame.File always use forward slashes as path separators, even on Windows. Fixes #3335 Change-Id: Ic5bbf8a1f14af64277dca4783176cd8f70726b91 Reviewed-on: https://go-review.googlesource.com/c/go/+/603275 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-24 | runtime: uphold goroutine profile invariants in coroswitch | Michael Anthony Knyszek
Goroutine profiles require checking in with the profiler before any goroutine starts running. coroswitch is a place where a goroutine may start running, but where we do not check in with the profiler, which leads to crashes. Fix this by checking in with the profiler the same way execute does. Fixes #69998. Change-Id: Idef6dd31b70a73dd1c967b56c307c7a46a26ba73 Reviewed-on: https://go-review.googlesource.com/c/go/+/622016 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-24 | runtime: fix ASAN poison calculation in mallocgc | Michael Anthony Knyszek

A previous CL broke the ASAN poisoning calculation in mallocgc by not taking into account a possible allocation header, so the beginning of the following allocation could have been poisoned. This mostly isn't a problem in practice, since the following slot would usually just contain an allocation header that programs shouldn't be touching anyway, but if we're poisoning a word past the end of the last object in a span, we could be poisoning a valid heap allocation.

Change-Id: I76a4f59bcef01af513a1640c4c212c0eb6be85b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/622295
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-24 | runtime: support cgo index into pointer-to-array | Ian Lance Taylor
We were missing a case for calling a C function with an index into a pointer-to-array. Fixes #70016 Change-Id: I9c74d629e58722813c1aaa0f0dc225a5a64d111b Reviewed-on: https://go-review.googlesource.com/c/go/+/621576 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-23 | runtime: add checkPtraceScope to skip certain tests | Shuo Wang
When the kernel parameter ptrace_scope is set to 2 or 3, certain test cases in runtime-gdb_test.go will fail. We should skip these tests. Fixes #69932 Change-Id: I685d1217f1521d7f8801680cf6b71d8e7a265188 GitHub-Last-Rev: 063759e04cfc5ea750ed1d381d8586134488a96b GitHub-Pull-Request: golang/go#69933 Reviewed-on: https://go-review.googlesource.com/c/go/+/620857 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-23 | runtime/debug: minor cleanups after CL 384154 | Ian Lance Taylor
Change some vars to consts, remove some unneeded string conversions. Change-Id: Ib12eed11ef080c4b593c8369bb915117e7100045 Reviewed-on: https://go-review.googlesource.com/c/go/+/621838 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2024-10-23 | runtime/debug: document ParseBuildInfo and (*BuildInfo).String | Ian Lance Taylor
For #51026 Fixes #69971 Change-Id: I47f2938d20cbe9462bf738a506baedad4a7006c3 Reviewed-on: https://go-review.googlesource.com/c/go/+/621837 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-23 | runtime: fix typo in error message | changwang ma
Change-Id: I27bf98e84545746d90948dd06c4a7bd70782c49d Reviewed-on: https://go-review.googlesource.com/c/go/+/621895 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-22 | runtime: check LSE support on ARM64 at runtime init | Andrey Bokhanko

Check for LSE support on the ARM64 chip at runtime init if we targeted it at compile time.

Related to #69124
Updates #60905

Change-Id: I6fe244decbb4982548982e1f88376847721a33c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/610195
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Shu-Chun Weng <scw@google.com>
2024-10-21 | cmd/link,runtime: DWARF/gdb support for swiss maps | Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: I6695c0b143560d974b710e1d78e7a7d09278f7cc Reviewed-on: https://go-review.googlesource.com/c/go/+/620215 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 | runtime: (re)use unused linear memory on Wasm | Cherry Mui

CL 476717 adopted the memory management mechanism from Plan 9 to manage Wasm's linear memory. But the Plan 9 code uses the global variables bloc and blocMax to keep track of the runtime's and the OS's sense of the break, whereas the Wasm sbrk function doesn't use those global variables and directly grows the linear memory instead. As a result, if there is any unused portion at the end of the linear memory, the runtime doesn't use it. This CL fixes that by adopting the same mechanism as the Plan 9 code.

In particular, the runtime is not aware of any unused initial memory at startup, so (most of) the extra initial memory set by the linker is not actually used. This CL fixes this as well.

For #69018.

Change-Id: I2ea6a138310627eda5f19a1c76b1e1327362e5f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/621635
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 | runtime,time: use atomic.Int32 for isSending | Michael Anthony Knyszek

This change switches isSending to be an atomic.Int32 instead of an atomic.Uint8. The Int32 version is managed as a counter, which is something we couldn't do with a Uint8 without adding a new intrinsic that may not be available on all architectures. That is, instead of only being able to support 8 concurrent timer firings on the same timer (because we only have 8 independent bits to set for each concurrent firing), we can now have 2^31-1 concurrent timer firings before running into any issues. Just as each bit-set was matched with a clear, here we match increments with decrements to indicate that we're in the "sending on a channel" critical section in the timer code, so we can report the correct result back on Stop or Reset.

We choose an Int32 instead of a Uint32 because it's easier to check for obviously bad values (negative values are always bad), and 2^31-1 concurrent timer firings should be enough for anyone.

Previously, we avoided anything bigger than a Uint8 because we could pack it into some padding in the runtime.timer struct. But it turns out that the type that actually matters, runtime.timeTimer, is exactly 96 bytes in size. This means it's in the next size class up, the 112-byte size class, because of an allocation header. We thus have some free space to work with. This change increases the size of this struct from 96 bytes to 104 bytes.

(I'm not sure if runtime.timer is often allocated directly, but if it is, we get lucky in the same way too. It's exactly 80 bytes in size, which means it's in the 96-byte size class, leaving us with some space to work with.)

Fixes #69969.
Related to #69880 and #69312.

Change-Id: I9fd59cb6a69365c62971d1f225490a65c58f3e77
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/621616
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 | runtime: remove linkname from memhash{32,64} functions | Alfonso Subiotto Marques
Remove linkname directives that are no longer necessary given parquet-go/parquet-go#142 removes the dependency on the `memhash{32,64}` functions. This change also removes references to segmentio/parquet-go since that repository was archived in favor of parquet-go/parquet-go. Updates #67401 Change-Id: Ibafb0c41b39cdb86dac5531f62787fb5cb8d3f01 GitHub-Last-Rev: e14c4e4dfe1023df83339da73eb5dd632d52851b GitHub-Pull-Request: golang/go#67784 Reviewed-on: https://go-review.googlesource.com/c/go/+/589795 Auto-Submit: Ian Lance Taylor <iant@google.com> Commit-Queue: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-21 | runtime: execute publicationBarrier in noscan case for delayed zeroing | Michael Anthony Knyszek
This is a peace-of-mind change to make sure that delayed-zeroed memory (in the large alloc case) is globally visible from the moment the allocation is published back to the caller. The way it's written right now is good enough for the garbage collector (we already have a publication barrier for a nil span.largeType, so the GC will ignore the noscan span) but this might matter for user code on weak memory architectures. Change-Id: I06ac9b95863074e5f09382629083b19bfa87fdb8 Reviewed-on: https://go-review.googlesource.com/c/go/+/619036 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 | runtime: specialize heapSetType | Michael Anthony Knyszek

Last CL we separated mallocgc into several specialized paths. Let's split up heapSetType too. This will make the specialized heapSetType functions inlineable and cut out some branches as well as a function call.

Microbenchmark results at this point in the stack:

                     │ before.out │ after-5.out │
                     │   sec/op   │   sec/op     vs base
Malloc8-4             13.52n ± 3%  12.15n ± 2%  -10.13% (p=0.002 n=6)
Malloc16-4            21.49n ± 2%  18.32n ± 4%  -14.75% (p=0.002 n=6)
MallocTypeInfo8-4     27.12n ± 1%  18.64n ± 2%  -31.30% (p=0.002 n=6)
MallocTypeInfo16-4    28.71n ± 3%  21.63n ± 5%  -24.65% (p=0.002 n=6)
geomean               21.81n       17.31n       -20.64%

Change-Id: I5de9ac5089b9eb49bf563af2a74e6dc564420e05
Reviewed-on: https://go-review.googlesource.com/c/go/+/614795
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 | runtime: refactor mallocgc into several independent codepaths | Michael Anthony Knyszek

Right now mallocgc is a monster of a function. In real programs, we see that a substantial amount of time in mallocgc is spent in mallocgc itself. It's very branch-y, holds a lot of state, and handles quite a few disparate cases, trying to merge them together.

This change breaks apart mallocgc into separate, leaner functions. There's some duplication now, but there are a lot of branches that can be pruned as a result.

There's definitely still more we can do here. heapSetType can be inlined and broken down for each case, since its internals roughly map to each case anyway (done in a follow-up CL). We can probably also do more with the size class lookups, since we know more about the size of the object in each case than before.

Below are the savings for the full stack up until now.

                     │ after-3.out │ after-4.out │
                     │   sec/op    │   sec/op     vs base
Malloc8-4             13.32n ± 2%   12.17n ± 1%   -8.63% (p=0.002 n=6)
Malloc16-4            21.64n ± 3%   19.38n ± 10%  -10.47% (p=0.002 n=6)
MallocTypeInfo8-4     23.15n ± 2%   19.91n ± 2%   -14.00% (p=0.002 n=6)
MallocTypeInfo16-4    25.86n ± 4%   22.48n ± 5%   -13.11% (p=0.002 n=6)
MallocLargeStruct-4   270.0n ± ∞ ¹
geomean               20.38n        30.97n        -11.58%

Change-Id: I681029c0b442f9221c4429950626f06299a5cfe4
Reviewed-on: https://go-review.googlesource.com/c/go/+/614257
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 | runtime: break out the debug.malloc codepaths into functions | Michael Anthony Knyszek
This change breaks out the debug.malloc codepaths into dedicated functions, both for making mallocgc easier to read, and to reduce the function's size (currently all that code is inlined and really doesn't need to be). This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts"). Change-Id: I30b3ab4a1f349ba85b4a1b5b2c399abcdfe4844f Reviewed-on: https://go-review.googlesource.com/c/go/+/617879 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 | runtime: move debug checks behind constant flag in mallocgc | Michael Anthony Knyszek
These debug checks are very occasionally helpful, but they do cost real time. The biggest issue seems to be the bloat of mallocgc due to the "throw" paths. Overall, after some follow-ups, this change cuts about 1ns off of the mallocgc fast path. This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts"). Change-Id: I07c4547ad724b9f94281320846677fb558957721 Reviewed-on: https://go-review.googlesource.com/c/go/+/617878 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 | runtime: rename shouldhelpgc to checkGCTrigger in mallocgc | Michael Anthony Knyszek
shouldhelpgc is a very unhelpful name, because it has nothing to do with assists and solely to do with GC triggering. Name it checkGCTrigger instead, which is much clearer. Change-Id: Id38debd424ddb397376c0cea6e74b3fe94002f71 Reviewed-on: https://go-review.googlesource.com/c/go/+/617877 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 | runtime: recompute assistG before and after malloc | Michael Anthony Knyszek

This change stops tracking assistG across malloc to reduce the number of slots the compiler must keep track of in mallocgc, which adds to register pressure. It also makes the call to deductAssistCredit only happen if the GC is running.

This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts").

Change-Id: I4cfac7f3e8e873ba66ff3b553072737a4707e2c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/617876
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-21 | runtime: use wb flag instead of gcphase for allocate-black check | Michael Anthony Knyszek
This is an allocator microoptimization. There's no reason to check gcphase in general, since it's mostly for debugging anyway. writeBarrier.enabled is set in all the same cases here, and we force one fewer cache line (probably) to be touched during malloc. Conceptually, it also makes a bit more sense. The allocate-black policy is partly informed by the write barrier design. Change-Id: Ia5ff593d64c29cf7f4d1bced3204056566444a98 Reviewed-on: https://go-review.googlesource.com/c/go/+/617875 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21 | runtime: simplify mem profile checking in mallocgc | Michael Anthony Knyszek

Checking whether the current allocation needs to be profiled is currently branch-y and weirdly a lot of code. The branches are generally predictable, but it's a surprising number of instructions. Part of the problem is that MemProfileRate is just a global that can be set at any time, so we need to load it and check certain settings explicitly.

In an ideal world, we would just always subtract from nextSample and have a single branch to take the slow path if we subtract below zero. If MemProfileRate were a function, we could trash all the nextSample values intentionally in each mcache. This would be slow, but MemProfileRate changes rarely while the malloc hot path is, well, hot. Unfortunate...

Although this ideal world is, AFAICT, impossible, we can still get close. If we cache the value of MemProfileRate in each mcache, then we can force malloc to take the slow path whenever MemProfileRate changes. This does require two additional loads, but crucially, these loads are independent of everything else in mallocgc. Furthermore, the branch dependent on those loads is incredibly predictable in practice.

This CL on its own has little-to-no impact on mallocgc. But this codepath is going to be duplicated in several places in the next CL, so it'll pay to simplify it. Also, we're very much trying to remedy a death-by-a-thousand-cuts situation, and malloc is currently still kind of a monster -- it will not help if mallocgc isn't really streamlined itself.

Lastly, there's a nice property now that all nextSample values get immediately re-sampled when MemProfileRate changes.

Change-Id: I6443d0cf9bd7861595584442b675ac1be8ea3455
Reviewed-on: https://go-review.googlesource.com/c/go/+/615815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 | runtime: optimize 8-byte allocation pointer data writing | Michael Anthony Knyszek

This change brings back a minor optimization lost in the Go 1.22 cycle wherein the 8-byte pointer-ful span class spans would have the pointer bitmap written ahead of time in bulk, because there's only one possible pattern.

                    │   before   │    after    │
                    │   sec/op   │   sec/op     vs base
MallocTypeInfo8-4    25.13n ± 1%  23.59n ± 2%  -6.15% (p=0.002 n=6)

Change-Id: I135b84bb1d5b7e678b841b56430930bc73c0a038
Reviewed-on: https://go-review.googlesource.com/c/go/+/614256
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21 | runtime: don't call span.heapBits in writeHeapBitsSmall | Michael Anthony Knyszek

For whatever reason, span.heapBits is kind of slow. It accounts for about a quarter of the cost of writeHeapBitsSmall, which is absurd. We get a nice speed improvement for small allocations by eliminating this call.

                     │   before   │    after    │
                     │   sec/op   │   sec/op     vs base
MallocTypeInfo16-4    29.47n ± 1%  27.02n ± 1%  -8.31% (p=0.002 n=6)

Change-Id: I6270e26902e5a9254cf1503fac81c3c799c59d6a
Reviewed-on: https://go-review.googlesource.com/c/go/+/614255
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-21 | cmd/compile,internal/runtime/maps: add extendible hashing | Michael Pratt
Extendible hashing splits a swisstable map into many swisstables. This keeps grow operations small. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Id91f34af9e686bf35eb8882ee479956ece89e821 Reviewed-on: https://go-review.googlesource.com/c/go/+/604936 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-21 | runtime: revise the documentation comments for netpoll | Shuo Wang
Supplement to CL 511455. Updates #61454 Change-Id: I111cbf297dd9159cffba333d610a7a4542915c55 GitHub-Last-Rev: fe8fa184868d665a4d08d534d3bfb5ea446d12c0 GitHub-Pull-Request: golang/go#69900 Reviewed-on: https://go-review.googlesource.com/c/go/+/620495 Auto-Submit: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-18 | runtime: more thorough map benchmarks | Michael Pratt
Based on the benchmarks in github.com/cockroachlabs/swiss. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I9ad925d3272c671e21ec04eb2da5ebd8f0fc6a28 Reviewed-on: https://go-review.googlesource.com/c/go/+/596295 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-18 | runtime/testdata: fix for C23 nullptr keyword | Joseph Myers

src/runtime/testdata/testprogcgo/threadprof.go contains C code with a variable called nullptr. This conflicts with the nullptr keyword in the C23 revision of the C standard (showing up as gccgo test build failures when updating GCC to use C23 by default when building C code).

Rename that variable to nullpointer to avoid the clash with the keyword (any other name that's not a keyword would work just as well).

Change-Id: Ida5ef371a3f856c611409884e185c3d5ded8e86c
GitHub-Last-Rev: 2ec464703be0507a67a077741789a37511d197e4
GitHub-Pull-Request: golang/go#69927
Reviewed-on: https://go-review.googlesource.com/c/go/+/620955
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-14 | runtime: clarify work.bytesMarked documentation | Austin Clements
Change-Id: If5132400aac0ef00e467958beeaab5e64d053d10 Reviewed-on: https://go-review.googlesource.com/c/go/+/619099 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-14 | all: wire up swisstable maps | Michael Pratt

Use the new SwissTable-based map in internal/runtime/maps as the basis for the runtime map when GOEXPERIMENT=swissmap. Integration is complete enough to pass all.bash.

Notable missing features:

* Race integration / concurrent write detection
* Stack-allocated maps
* Specialized "fast" map variants
* Indirect key / elem

For #54766.

Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639
Reviewed-on: https://go-review.googlesource.com/c/go/+/594596
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-14 | runtime: don't frob isSending for tickers | Ian Lance Taylor
The Ticker Stop and Reset methods don't report a value, so we don't need to track whether they are interrupting a send. This includes a test that used to fail about 2% of the time on my laptop when run under x/tools/cmd/stress. Change-Id: Ic6d14b344594149dd3c24b37bbe4e42e83f9a9ad Reviewed-on: https://go-review.googlesource.com/c/go/+/620136 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-11runtime: reduce syscall.SyscallX stack usageqmuntal
The syscall.SyscallX functions consume a lot of stack space, which is a problem because they are nosplit functions. They used to use less stack space, but CL 563315, which landed in Go 1.23, increased their stack usage by a lot. This CL reduces the stack usage back to the previous level. Fixes #69813. Change-Id: Iddedd28b693c66a258da687389768055c493fc2e Reviewed-on: https://go-review.googlesource.com/c/go/+/618497 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-08internal/runtime/maps: initial swiss table map implementationMichael Pratt
Add a new package that will contain a new "Swiss Table" (https://abseil.io/about/design/swisstables) map implementation, which is intended to eventually replace the existing runtime map implementation. This implementation is based on the fabulous github.com/cockroachdb/swiss package contributed by Peter Mattis. This CL adds a hash map implementation. It supports all the core operations, but does not have incremental growth. For #54766. Change-Id: I52cf371448c3817d471ddb1f5a78f3513565db41 Reviewed-on: https://go-review.googlesource.com/c/go/+/582415 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-07runtime: overwrite startupRand instead of clearing itFilippo Valsorda
AT_RANDOM is unfortunately used by libc before we run (so make sure it's not cleared) but is also available to cgo programs after we do. It would be unfortunate if a cgo program assumed it could use AT_RANDOM but instead found all zeroes there. Change-Id: I82eff34d8cf5a499b439052b7827b8ef7cabc21d Reviewed-on: https://go-review.googlesource.com/c/go/+/608437 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Daniel McCarney <daniel@binaryparadox.net> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
2024-10-07runtime: use arc4random_buf() for readRandomFilippo Valsorda
readRandom doesn't matter on Linux because of startupRand, but it does on Windows and macOS. Windows already uses the same API as crypto/rand. Switch macOS away from the /dev/urandom read. Updates #68278 Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14 Change-Id: Ie8f105e35658a6f10ff68798d14883e3b212eb3e Reviewed-on: https://go-review.googlesource.com/c/go/+/608436 Reviewed-by: Roland Shoemaker <roland@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-07crypto/rand: add randcrash=0 GODEBUGFilippo Valsorda
For #66821 Change-Id: I525c308d6d6243a2bc805e819dcf40b67e52ade5 Reviewed-on: https://go-review.googlesource.com/c/go/+/608435 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Daniel McCarney <daniel@binaryparadox.net> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
2024-10-07crypto/rand: crash program if Read would return an errorFilippo Valsorda
Fixes #66821 Fixes #54980 Change-Id: Ib081f4e4f75c7936fc3f5b31d3bd07cca1c2a55c Reviewed-on: https://go-review.googlesource.com/c/go/+/602497 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Roland Shoemaker <roland@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
2024-10-04runtime: use stringslite.CutPrefix in isExportedRuntimeTobias Klauser
Change-Id: I7cbbe3b9a9f08ac98e3e76be7bda2f7df9c61fb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/617915 Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-03runtime: memclrNoHeapPointers optimization for block alignmentVasily Leonenko
goos: linux
goarch: arm64
pkg: runtime
               │  base.log   │              opt.log               │
               │   sec/op    │   sec/op     vs base               │
Memclr/5-4       3.378n ± 2%   3.376n ± 2%        ~ (p=0.128 n=10)
Memclr/16-4      2.749n ± 1%   2.776n ± 2%   +1.00% (p=0.001 n=10)
Memclr/64-4      4.588n ± 2%   4.184n ± 2%   -8.78% (p=0.000 n=10)
Memclr/256-4     8.758n ± 0%   7.103n ± 0%  -18.90% (p=0.000 n=10)
Memclr/4096-4    58.80n ± 0%   57.43n ± 0%   -2.33% (p=0.000 n=10)
Memclr/65536-4   868.7n ± 1%   861.7n ± 1%   -0.80% (p=0.004 n=10)
Memclr/1M-4      23.08µ ± 6%   23.55µ ± 6%        ~ (p=0.739 n=10)
Memclr/4M-4      219.6µ ± 3%   216.1µ ± 2%        ~ (p=0.123 n=10)
Memclr/8M-4      586.1µ ± 1%   586.4µ ± 2%        ~ (p=0.853 n=10)
Memclr/16M-4     1.312m ± 0%   1.311m ± 1%        ~ (p=0.481 n=10)
Memclr/64M-4     5.332m ± 1%   5.681m ± 0%   +6.55% (p=0.000 n=10)
geomean          1.723µ        1.683µ        -2.31%
Change-Id: Icad625065fb1f30b2a4094f3f1e58b4e9b3d841e Reviewed-on: https://go-review.googlesource.com/c/go/+/616137 Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-03Revert "runtime/pprof: make TestBlockMutexProfileInlineExpansion stricter"Nick Ripley
This reverts commit 5b0f8596b766afae9dd1f117a4a5dcfbbf1b80f1. Reason for revert: This CL breaks gotip-linux-amd64-noopt builder. Change-Id: I3950211f05c90e4955c0785409b796987741a9f4 Reviewed-on: https://go-review.googlesource.com/c/go/+/617715 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-02runtime: clear isSending bit earlierIan Lance Taylor
I've done some more testing of the new isSending field. I'm not able to get more than 2 bits set. That said, with this change it's significantly less likely to have even 2 bits set. The idea here is to clear the bit before possibly locking the channel we are sending the value on, thus avoiding some delay and some serialization. For #69312 Change-Id: I8b5f167f162bbcbcbf7ea47305967f349b62b0f4 Reviewed-on: https://go-review.googlesource.com/c/go/+/617497 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Commit-Queue: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-10-02runtime/pprof: make TestBlockMutexProfileInlineExpansion stricterNick Ripley
While working on CL 611241 and CL 616375, I introduced a bug that wasn't caught by any test. CL 611241 added more inline expansion at sample time for block/mutex profile stacks collected via frame pointer unwinding. CL 616375 then changed how inline expansion for those stacks is done at reporting time. So some frames passed through multiple rounds of inline expansion, and this led to duplicate stack frames in some cases. The stacks from TestBlockMutexProfileInlineExpansion looked like
	sync.(*Mutex).Unlock
	runtime/pprof.inlineF
	runtime/pprof.inlineE
	runtime/pprof.inlineD
	runtime/pprof.inlineD
	runtime.goexit
after those two CLs, and in particular after CL 616375. Note the extra inlineD frame. The test didn't catch that since it was only looking for a few frames in the stacks rather than checking the entire stacks. This CL makes that test stricter by checking the entire expected stacks rather than just a portion of the stacks. Change-Id: I0acc739d826586e9a63a081bb98ef512d72cdc9a Reviewed-on: https://go-review.googlesource.com/c/go/+/617235 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-02runtime: don't acquirem() in vgetrandom unless necessaryJason A. Donenfeld
I noticed in pprof that acquirem() was a bit of a hotspot. It turns out that we can use the same trick that runtime.rand() does, and only acquirem if we're doing something non-nosplit -- in this case, getting a new state -- but otherwise just do getg().m, which is safe because we're inside runtime and don't call split functions.
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                       │   sec/op    │   sec/op     vs base              │
ParallelGetRandom-16     2.651n ± 4%   2.416n ± 7%  -8.87% (p=0.001 n=10)
                       │     B/s     │     B/s      vs base              │
ParallelGetRandom-16    1.406Gi ± 4%  1.542Gi ± 6%  +9.72% (p=0.001 n=10)
Change-Id: Iae075f4e298b923e499cd01adfabacab725a8684 Reviewed-on: https://go-review.googlesource.com/c/go/+/616738 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-01runtime/pprof: add context to short stack panicMichael Pratt
Over the years we've had various bugs in pprof stack handling resulting in appendLocsForStack crashing because stk is too short for a cached location. i.e., the cached location claims several inlined frames. Those should always appear together in stk. If some frames are missing from stk, appendLocsForStack crashes. If we find this case, replace the slice out of bounds panic with an explicit panic that contains more context. Change-Id: I52725a689baf42b8db627ce3e1bc6c654ef245d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/617135 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>