aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/malloc.go
AgeCommit message (Collapse)Author
2025-12-17runtime: keep track of secret allocation sizeDaniel Morsing
During a naive attempt to test the new runtime/secret package, I tried wrapping the entire handshake in a secret.Do call. This lead to a panic because some of the allocator logic had been previously untested. freeSpecial takes p and size, but they can be misleading. They don't refer to the pointer and size of the object with the special attached, but a pointer to the enclosing object and the size of the span element. The previous code did not take this into account and when passing the size to memclr would overwrite nearby objects. Fix by storing the size of the object being cleared inside the special. Fixes #76865. Change-Id: Ifae31f1c8d0609a562a37f37c45aec2f369dc6a5 Reviewed-on: https://go-review.googlesource.com/c/go/+/730361 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-11-26runtime/secret: implement new secret packageDaniel Morsing
Implement secret.Do. - When secret.Do returns: - Clear stack that is used by the argument function. - Clear all the registers that might contain secrets. - On stack growth in secret mode, clear the old stack. - When objects are allocated in secret mode, mark them and then zero the marked objects immediately when they are freed. - If the argument function panics, raise that panic as if it originated from secret.Do. This removes anything about the secret function from tracebacks. For now, this is only implemented on linux for arm64 and amd64. This is a rebased version of Keith Randalls initial implementation at CL 600635. I have added arm64 support, signal handling, preemption handling and dealt with vDSOs spilling into system stacks. Fixes #21865 Change-Id: I6fbd5a233beeaceb160785e0c0199a5c94d8e520 Co-authored-by: Keith Randall <khr@golang.org> Reviewed-on: https://go-review.googlesource.com/c/go/+/704615 Reviewed-by: Roland Shoemaker <roland@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-11-23runtime: fix stale comment for mheap/mallocLidong Yan
mheap use pageAlloc to manage free/scav address space instead of using free/scav treap. The comment on mheap states mheap uses treaps. Update the comment to reflect the use of pageAlloc. In the fallback code when sizeSpecializedMalloc is enabled, heapBitsInSpan is false. Update the comment to reflect that. Change-Id: I50d2993c84e2c0312a925ab0a33065cc4cd41c41 Reviewed-on: https://go-review.googlesource.com/c/go/+/722700 Reviewed-by: Lidong Yan <yldhome2d2@gmail.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-11-14runtime: support runtime.freegc in size-specialized mallocs for noscan objectsthepudds
This CL is part of a set of CLs that attempt to reduce how much work the GC must do. See the design in https://go.dev/design/74299-runtime-freegc This CL updates the smallNoScanStub stub in malloc_stubs.go to reuse heap objects that have been freed by runtime.freegc calls, and generates the corresponding size-specialized code in malloc_generated.go. This CL only adds support in the specialized mallocs for noscan heap objects (objects without pointers). A later CL handles objects with pointers. While we are here, we leave a couple of breadcrumbs in mkmalloc.go on how to do the generation. Updates #74299 Change-Id: I2657622601a27211554ee862fce057e101767a70 Reviewed-on: https://go-review.googlesource.com/c/go/+/715761 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-11-14runtime: add runtime.freegc to reduce GC workthepudds
This CL is part of a set of CLs that attempt to reduce how much work the GC must do. See the design in https://go.dev/design/74299-runtime-freegc This CL adds runtime.freegc: func freegc(ptr unsafe.Pointer, uintptr size, noscan bool) Memory freed via runtime.freegc is made immediately reusable for the next allocation in the same size class, without waiting for a GC cycle, and hence can dramatically reduce pressure on the GC. A sample microbenchmark included below shows strings.Builder operating roughly 2x faster. An experimental modification to reflect to use runtime.freegc and then using that reflect with json/v2 gave reported memory allocation reductions of -43.7%, -32.9%, -21.9%, -22.0%, -1.0% for the 5 official real-world unmarshalling benchmarks from go-json-experiment/jsonbench by the authors of json/v2, covering the CanadaGeometry through TwitterStatus datasets. Note: there is no intent to modify the standard library to have explicit calls to runtime.freegc, and of course such an ability would never be exposed to end-user code. Later CLs in this stack teach the compiler how to automatically insert runtime.freegc calls when it can prove it is safe to do so. (The reflect modification and other experimental changes to the standard library were just that -- experiments. It was very helpful while initially developing runtime.freegc to see more complex uses and closer-to-real-world benchmark results prior to updating the compiler.) This CL only addresses noscan span classes (heap objects without pointers), such as the backing memory for a []byte or string. A follow-on CL adds support for heap objects with pointers. If we update strings.Builder to explicitly call runtime.freegc on its internal buf after a resize operation (but without freeing the usually final incarnation of buf that will be returned to the user as a string), we can see some nice benchmark results on the existing strings benchmarks that call Builder.Write N times and then call Builder.String. Here, the (uncommon) case of a single Builder.Write is not helped (given it never resizes after first alloc if there is only one Write), but the impact grows such that it is up to ~2x faster as there are more resize operations due to more strings.Builder.Write calls: │ disabled.out │ new-free-20.txt │ │ sec/op │ sec/op vs base │ BuildString_Builder/1Write_36Bytes_NoGrow-4 55.82n ± 2% 55.86n ± 2% ~ (p=0.794 n=20) BuildString_Builder/2Write_36Bytes_NoGrow-4 125.2n ± 2% 115.4n ± 1% -7.86% (p=0.000 n=20) BuildString_Builder/3Write_36Bytes_NoGrow-4 224.0n ± 1% 188.2n ± 2% -16.00% (p=0.000 n=20) BuildString_Builder/5Write_36Bytes_NoGrow-4 239.1n ± 9% 205.1n ± 1% -14.20% (p=0.000 n=20) BuildString_Builder/8Write_36Bytes_NoGrow-4 422.8n ± 3% 325.4n ± 1% -23.04% (p=0.000 n=20) BuildString_Builder/10Write_36Bytes_NoGrow-4 436.9n ± 2% 342.3n ± 1% -21.64% (p=0.000 n=20) BuildString_Builder/100Write_36Bytes_NoGrow-4 4.403µ ± 1% 2.381µ ± 2% -45.91% (p=0.000 n=20) BuildString_Builder/1000Write_36Bytes_NoGrow-4 48.28µ ± 2% 21.38µ ± 2% -55.71% (p=0.000 n=20) See the design document for more discussion of the strings.Builder case. For testing, we add tests that attempt to exercise different aspects of the underlying freegc and mallocgc behavior on the reuse path. Validating the assist credit manipulations turned out to be subtle, so a test for that is added in the next CL. There are also invariant checks added, controlled by consts (primarily the doubleCheckReusable const currently). This CL also adds support in runtime.freegc for GODEBUG=clobberfree=1 to immediately overwrite freed memory with 0xdeadbeef, which can help a higher-level test fail faster in the event of a bug, and also the GC specifically looks for that pattern and throws a fatal error if it unexpectedly finds it. A later CL (currently experimental) adds GODEBUG=clobberfree=2, which uses mprotect (or VirtualProtect on Windows) to set freed memory to fault if read or written, until the runtime later unprotects the memory on the mallocgc reuse path. For the cases where a normal allocation is happening without any reuse, some initial microbenchmarks suggest the impact of these changes could be small to negligible (at least with GOAMD64=v3): goos: linux goarch: amd64 pkg: runtime cpu: AMD EPYC 7B13 │ base-512M-v3.bench │ ps16-512M-goamd64-v3.bench │ │ sec/op │ sec/op vs base │ Malloc8-16 11.01n ± 1% 10.94n ± 1% -0.68% (p=0.038 n=20) Malloc16-16 17.15n ± 1% 17.05n ± 0% -0.55% (p=0.007 n=20) Malloc32-16 18.65n ± 1% 18.42n ± 0% -1.26% (p=0.000 n=20) MallocTypeInfo8-16 18.63n ± 0% 18.36n ± 0% -1.45% (p=0.000 n=20) MallocTypeInfo16-16 22.32n ± 0% 22.65n ± 0% +1.50% (p=0.000 n=20) MallocTypeInfo32-16 23.37n ± 0% 23.89n ± 0% +2.23% (p=0.000 n=20) geomean 18.02n 18.01n -0.05% These last benchmark results include the runtime updates to support span classes with pointers (which was originally part of this CL, but later split out for ease of review). Updates #74299 Change-Id: Icceaa0f79f85c70cd1a718f9a4e7f0cf3d77803c Reviewed-on: https://go-review.googlesource.com/c/go/+/673695 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-09-26runtime: use a smaller arena size on WasmCherry Mui
On Wasm, some programs have very small heap. Currently, we use 4 MB arena size (like all other 32-bit platforms). For a very small program, it needs to allocate one heap arena, 4 MB size at a 4 MB aligned address. So we'll need 8 MB of linear memory, whereas only a smaller portion is actually used by the program. On Wasm, samll programs are not uncommon (e.g. WASI plugins), and users are concerned about the memory usage. This CL switches to a smaller arena size, as well as a smaller page allocator chunk size (both are now 512 KB). So the heap will be grown in 512 KB granularity. For a helloworld program, it now uses less than 3 MB of linear memory, instead of 8 MB. Change-Id: Ibd66c1fa6e794a12c00906cbacc8f2e410f196c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/683296 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-09-23runtime: add specialized malloc functions for sizes up to 512 bytesMichael Matloob
This CL adds a generator function in runtime/_mkmalloc to generate specialized mallocgc functions for sizes up throuht 512 bytes. (That's the limit where it's possible to end up in the no header case when there are scan bits, and where the benefits of the specialized functions significantly diminish according to microbenchmarks). If the specializedmalloc GOEXPERIMENT is turned on, mallocgc will call one of these functions in the no header case. malloc_generated.go is the generated file containing the specialized malloc functions. malloc_stubs.go contains the templates that will be stamped to create the specialized malloc functions. malloc_tables_generated contains the tables that mallocgc will use to select the specialized function to call. I've had to update the two stdlib_test.go files to account for the new submodule mkmalloc is in. mprof_test accounts for the changes in the stacks since different functions can be called in some cases. I still need to investigate heapsampling.go. Change-Id: Ia0f68dccdf1c6a200554ae88657cf4d686ace819 Reviewed-on: https://go-review.googlesource.com/c/go/+/665835 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Matloob <matloob@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-09-17runtime: don't enable heap randomization if MSAN or ASAN is enabledRoland Shoemaker
MSAN and ASAN do confusing things to the memory layout, which are likely to conflict with heap base randomization, so if they are enabled, ignore randomizedHeapBase64. We already didn't turn it on when TSAN was enabled. Change-Id: I41e59dfc33d8bb059c208a9595442571fb31eea3 Reviewed-on: https://go-review.googlesource.com/c/go/+/704856 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Roland Shoemaker <roland@golang.org>
2025-08-29runtime/race: add race detector support for linux/riscv64Joel Sing
This enables support for the race detector on linux/riscv64. Fixes #64345 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Change-Id: I98962827e91455404858549b0f9691ee438f104b Reviewed-on: https://go-review.googlesource.com/c/go/+/690497 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-08-26runtime: identify virtual memory layout for riscv64Joel Sing
Identify sv39, sv48 and sv57 based on the system stack address. The current approach to memory allocation is less than ideal on RISC-V hardware that is using sv39 mode. On sv39 we currently end up doing around 85 mmap and 66 munmap, since we are trying to map an unusable range. With this change we do 22 mmap and 0 munmap at runtime initialisation. This will also be necessary to support the race detector on sv39. Updates #64345 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Change-Id: I4f8ba6763b5ecfedfad5438e025d633820e8265c Reviewed-on: https://go-review.googlesource.com/c/go/+/690495 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-24runtime: randomize heap base addressRoland Shoemaker
During initialization, allow randomizing the heap base address by generating a random uint64 and using its bits to randomize various portions of the heap base address. We use the following method to randomize the base address: * We first generate a random heapArenaBytes aligned address that we use for generating the hints. * On the first call to mheap.grow, we then generate a random PallocChunkBytes aligned offset into the mmap'd heap region, which we use as the base for the heap region. * We then mark a random number of pages within the page allocator as allocated. Our final randomized "heap base address" becomes the first byte of the first available page returned by the page allocator. This results in an address with at least heapAddrBits-gc.PageShift-1 bits of entropy. Fixes #27583 Change-Id: Ideb4450a5ff747a132f702d563d2a516dec91a88 Reviewed-on: https://go-review.googlesource.com/c/go/+/674835 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21runtime: add valgrind instrumentationRoland Shoemaker
Add build tag gated Valgrind annotations to the runtime which let it understand how the runtime manages memory. This allows for Go binaries to be run under Valgrind without emitting spurious errors. Instead of adding the Valgrind headers to the tree, and using cgo to call the various Valgrind client request macros, we just add an assembly function which emits the necessary instructions to trigger client requests. In particular we add instrumentation of the memory allocator, using a two-level mempool structure (as described in the Valgrind manual [0]). We also add annotations which allow Valgrind to track which memory we use for stacks, which seems necessary to let it properly function. We describe the memory model to Valgrind as follows: we treat heap arenas as a "pool" created with VALGRIND_CREATE_MEMPOOL_EXT (so that we can use VALGRIND_MEMPOOL_METAPOOL and VALGRIND_MEMPOOL_AUTO_FREE). Within the pool we treat spans as "superblocks", annotated with VALGRIND_MEMPOOL_ALLOC. We then allocate individual objects within spans with VALGRIND_MALLOCLIKE_BLOCK. It should be noted that running binaries under Valgrind can be _quite slow_, and certain operations, such as running the GC, can be _very slow_. It is recommended to run programs with GOGC=off. Additionally, async preemption should be turned off, since it'll cause strange behavior (GODEBUG=asyncpreemptoff=1). Running Valgrind with --leak-check=yes will result in some errors resulting from some things not being marked fully free'd. These likely need more annotations to rectify, but for now it is recommended to run with --leak-check=off. Updates #73602 [0] https://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools Change-Id: I71b26c47d7084de71ef1e03947ef6b1cc6d38301 Reviewed-on: https://go-review.googlesource.com/c/go/+/674077 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21runtime: guarantee checkfinalizers test allocates in a shared tiny blockMichael Anthony Knyszek
Currently the checkfinalizers test (TestDetectCleanupOrFinalizerLeak) only *tries* to ensure the tiny alloc with a cleanup attached shares a block with other objects. However, what it does is insufficient, because it could get unlucky and have the last object allocated be the first object of a new block. This change changes the test to guarantee that a tiny object is not at the start of a fresh block by looking at the alignment of the object's pointer. If the object's pointer is odd, then that's good enough to know that it shares a block with something else, since the blocks themselves are aligned to a much higher power of two. This fixes a failure I've seen on the builders. Fixes #73810. Change-Id: Ieafdbb9cccb0d2dc3659a9a5d9d9233718461635 Reviewed-on: https://go-review.googlesource.com/c/go/+/674655 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-05-20runtime: mark and identify tiny blocks in checkfinalizers modeMichael Anthony Knyszek
This change adds support for identifying cleanups and finalizers attached to tiny blocks to checkfinalizers mode. It also notes a subtle pitfall, which is that the cleanup arg, if tiny-allocated, could end up co-located with the object with the cleanup attached! Oops... For #72949. Change-Id: Icbe0112f7dcfc63f35c66cf713216796a70121ce Reviewed-on: https://go-review.googlesource.com/c/go/+/662037 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-05-20runtime: prevent unnecessary zeroing of large objects with pointersMichael Anthony Knyszek
CL 614257 refactored mallocgc but lost an optimization: if a span for a large object is already backed by memory fresh from the OS (and thus zeroed), we don't need to zero it. CL 614257 unconditionally zeroed spans for large objects that contain pointers. This change restores the optimization from before CL 614257, which seems to matter in some real-world programs. While we're here, let's also fix a hole with the garbage collector being able to observe uninitialized memory of the large object is observed by the conservative scanner before being published. The gory details are in a comment in heapSetTypeLarge. In short, this change makes span.largeType an atomic variable, such that the GC can only observe initialized memory if span.largeType != nil. Fixes #72991. Change-Id: I2048aeb220ab363d252ffda7d980b8788e9674dc Reviewed-on: https://go-review.googlesource.com/c/go/+/659956 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
2025-05-20runtime: only update freeIndexForScan outside of the mark phaseMichael Anthony Knyszek
Currently, it's possible for asynchronous preemption to observe a partially initialized object. The sequence of events goes like this: - The GC is in the mark phase. - Thread T1 is allocating object O1. - Thread T1 zeroes the allocation, runs the publication barrier, and updates freeIndexForScan. It has not yet updated the mark bit on O1. - Thread T2 is conservatively scanning some stack frame. That stack frame has a dead pointer with the same address as O1. - T2 picks up the pointer, checks isFree (which checks freeIndexForScan without an import barrier), and sees that O1 is allocated. It marks and queues O1. - T2 then goes to scan O1, and observes uninitialized memory. Although a publication barrier was executed, T2 did not have an import barrier. T2 may thus observe T1's writes to zero the object out-of-order with the write to freeIndexForScan. Normally this would be impossible if T2 got a pointer to O1 from somewhere written by T1. The publication barrier guarantees that if the read side is data-dependent on the write side then we'd necessarily observe all writes to O1 before T1 published it. However, T2 got the pointer 'out of thin air' by scanning a stack frame with a dead pointer on it. One fix to this problem would be to add the import barrier in the conservative scanner. We would then also need to put freeIndexForScan behind the publication barrier, or make the write to freeIndexForScan exactly that barrier. However, there's a simpler way. We don't actually care if conservative scanning observes a stale freeIndexForScan during the mark phase. Newly-allocated memory is always marked at the point of allocation (the allocate-black policy part of the GC's design). So it doesn't actually matter that if the garbage collector scans that memory or not. This change modifies the allocator to only update freeIndexForScan outside the mark phase. This means freeIndexForScan is essentially a snapshot of freeindex at the point the mark phase started. Because there's no more race between conservative scanning and newly-allocated objects, the complicated scenario above is no longer a possibility. One thing we do have to be careful of is other callers of isFree. Previously freeIndexForScan would always track freeindex, now it no longer does. This change thus introduces isFreeOrNewlyAllocated which is used by the conservative scanner, and uses freeIndexForScan. Meanwhile isFree goes back to using freeindex like it used to. This change also documents the requirement on isFree that the caller must have obtained the pointer not 'out of thin air' but after the object was published. isFree is not currently used anywhere particularly sensitive (heap dump and checkmark mode, where the world is stopped in both cases) so using freeindex is both conceptually simple and also safe. Change-Id: If66b8c536b775971203fb4358c17d711c2944723 Reviewed-on: https://go-review.googlesource.com/c/go/+/672340 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19runtime: move atoi to internal/runtime/strconvMichael Pratt
Moving to a smaller package allows its use in other internal/runtime packages. This isn't internal/strconvlite since it can't be used directly by strconv. For #73193. Change-Id: I6a6a636c9c8b3f06b5fd6c07fe9dd5a7a37d1429 Reviewed-on: https://go-review.googlesource.com/c/go/+/672697 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2025-04-24runtime: fix tag pointers on aixKeith Randall
Clean up tagged pointers a bit. I got the shifts wrong for the weird aix case. Change-Id: I21449fd5973f4651fd1103d3b8be9c2b9b93a490 Reviewed-on: https://go-review.googlesource.com/c/go/+/667715 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23runtime: move some malloc constants to internal/runtime/gcMichael Anthony Knyszek
These constants are needed by some future generator programs. Change-Id: I5dccd009cbb3b2f321523bc0d8eaeb4c82e5df81 Reviewed-on: https://go-review.googlesource.com/c/go/+/655276 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-23runtime: move sizeclass defs to new package internal/runtime/gcMichael Anthony Knyszek
We will want to reference these definitions from new generator programs, and this is a good opportunity to cleanup all these old C-style names. Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de Reviewed-on: https://go-review.googlesource.com/c/go/+/655275 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-02runtime: simplify needzero logickhr@golang.org
We always need to zero allocations with pointers in them. So we don't need some of the mallocs to take a needzero argument. Change-Id: Ideaa7b738873ba6a93addb5169791b42e2baad7c Reviewed-on: https://go-review.googlesource.com/c/go/+/660455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-03-12runtime: remove nextSampleNoFP from plan9Russ Cox
Plan 9 can use floating point now. Change-Id: If721b243daa31853609cb3d2c535d86c106a1ee1 Reviewed-on: https://go-review.googlesource.com/c/go/+/655879 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Russ Cox <rsc@golang.org>
2025-03-04runtime: decorate anonymous memory mappingsLénaïc Huard
Leverage the prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) API to name the anonymous memory areas. This API has been introduced in Linux 5.17 to decorate the anonymous memory areas shown in /proc/<pid>/maps. This is already used by glibc. See: * https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=27dfd1eb907f4615b70c70237c42c552bb4f26a8;hb=HEAD#l2434 * https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/setvmaname.c;h=ea93a5ffbebc9e5a7e32a297138f465724b4725f;hb=HEAD#l63 This can be useful when investigating the memory consumption of a multi-language program. On a 100% Go program, pprof profiler can be used to profile the memory consumption of the program. But pprof is only aware of what happens within the Go world. On a multi-language program, there could be a doubt about whether the suspicious extra-memory consumption comes from the Go part or the native part. With this change, the following Go program: package main import ( "fmt" "log" "os" ) /* #include <stdlib.h> void f(void) { (void)malloc(1024*1024*1024); } */ import "C" func main() { C.f() data, err := os.ReadFile("/proc/self/maps") if err != nil { log.Fatal(err) } fmt.Println(string(data)) } produces this output: $ GLIBC_TUNABLES=glibc.mem.decorate_maps=1 ~/doc/devel/open-source/go/bin/go run . 00400000-00402000 r--p 00000000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00402000-004a4000 r-xp 00002000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 004a4000-00574000 r--p 000a4000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00574000-00575000 r--p 00173000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00575000-00580000 rw-p 00174000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00580000-005a4000 rw-p 00000000 00:00 0 2e075000-2e096000 rw-p 00000000 00:00 0 [heap] c000000000-c000400000 rw-p 00000000 00:00 0 [anon: Go: heap] c000400000-c004000000 ---p 00000000 00:00 0 [anon: Go: heap reservation] 777f40000000-777f40021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f40021000-777f44000000 ---p 00000000 00:00 0 777f44000000-777f44021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f44021000-777f48000000 ---p 00000000 00:00 0 777f48000000-777f48021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f48021000-777f4c000000 ---p 00000000 00:00 0 777f4c000000-777f4c021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f4c021000-777f50000000 ---p 00000000 00:00 0 777f50000000-777f50021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f50021000-777f54000000 ---p 00000000 00:00 0 777f55afb000-777f55afc000 ---p 00000000 00:00 0 777f55afc000-777f562fc000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216378] 777f562fc000-777f562fd000 ---p 00000000 00:00 0 777f562fd000-777f56afd000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216377] 777f56afd000-777f56afe000 ---p 00000000 00:00 0 777f56afe000-777f572fe000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216376] 777f572fe000-777f572ff000 ---p 00000000 00:00 0 777f572ff000-777f57aff000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216375] 777f57aff000-777f57b00000 ---p 00000000 00:00 0 777f57b00000-777f58300000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216374] 777f58300000-777f58400000 rw-p 00000000 00:00 0 [anon: Go: page alloc index] 777f58400000-777f5a400000 rw-p 00000000 00:00 0 [anon: Go: heap index] 777f5a400000-777f6a580000 ---p 00000000 00:00 0 [anon: Go: scavenge index] 777f6a580000-777f6a581000 rw-p 00000000 00:00 0 [anon: Go: scavenge index] 777f6a581000-777f7a400000 ---p 00000000 00:00 0 [anon: Go: scavenge index] 777f7a400000-777f8a580000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f8a580000-777f8a581000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f8a581000-777f9c430000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9c430000-777f9c431000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9c431000-777f9e806000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9e806000-777f9e807000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9e807000-777f9ec00000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ec36000-777f9ecb6000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9ecb6000-777f9ecc6000 rw-p 00000000 00:00 0 [anon: Go: gc bits] 777f9ecc6000-777f9ecd6000 rw-p 00000000 00:00 0 [anon: Go: allspans array] 777f9ecd6000-777f9ece7000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9ece7000-777f9ed67000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ed67000-777f9ed68000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9ed68000-777f9ede7000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ede7000-777f9ee07000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9ee07000-777f9ee0a000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc] 777f9ee0a000-777f9ee2e000 r--p 00000000 00:21 48158213 /usr/lib/libc.so.6 777f9ee2e000-777f9ef9f000 r-xp 00024000 00:21 48158213 /usr/lib/libc.so.6 777f9ef9f000-777f9efee000 r--p 00195000 00:21 48158213 /usr/lib/libc.so.6 777f9efee000-777f9eff2000 r--p 001e3000 00:21 48158213 /usr/lib/libc.so.6 777f9eff2000-777f9eff4000 rw-p 001e7000 00:21 48158213 /usr/lib/libc.so.6 777f9eff4000-777f9effc000 rw-p 00000000 00:00 0 777f9effc000-777f9effe000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc] 777f9f00a000-777f9f04a000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9f04a000-777f9f04c000 r--p 00000000 00:00 0 [vvar] 777f9f04c000-777f9f04e000 r--p 00000000 00:00 0 [vvar_vclock] 777f9f04e000-777f9f050000 r-xp 00000000 00:00 0 [vdso] 777f9f050000-777f9f051000 r--p 00000000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f051000-777f9f07a000 r-xp 00001000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f07a000-777f9f085000 r--p 0002a000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f085000-777f9f087000 r--p 00034000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f087000-777f9f088000 rw-p 00036000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f088000-777f9f089000 rw-p 00000000 00:00 0 7ffc7bfa7000-7ffc7bfc8000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall] The anonymous memory areas are now labelled so that we can see which ones have been allocated by the Go runtime versus which ones have been allocated by the glibc. Fixes #71546 Change-Id: I304e8b4dd7f2477a6da794fd44e9a7a5354e4bf4 Reviewed-on: https://go-review.googlesource.com/c/go/+/646095 Auto-Submit: Alan Donovan <adonovan@google.com> Commit-Queue: Alan Donovan <adonovan@google.com> Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-02-03runtime: fix GODEBUG=gccheckmark=1 and add smoke testMichael Anthony Knyszek
This change fixes GODEBUG=gccheckmark=1 which seems to have bit-rotted. Because the root jobs weren't being reset, it wasn't doing anything. Then, it turned out that checkmark mode would queue up noscan objects in workbufs, which caused it to fail. Then it turned out checkmark mode was broken with user arenas, since their heap arenas are not registered anywhere. Then, it turned out that checkmark mode could just not run properly if the goroutine's preemption flag was set (since sched.gcwaiting is true during the STW). And lastly, it turned out that async preemption could cause erroneous checkmark failures. This change fixes all these issues and adds a simple smoke test to dist to run the runtime tests under gccheckmark, which exercises all of these issues. Fixes #69074. Fixes #69377. Fixes #69376. Change-Id: Iaa0bb7b9e63ed4ba34d222b47510d6292ce168bc Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/608915 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-18runtime: get rid of gc programs for typesKeith Randall
Instead, have the runtime build the gc bitmaps on demand at runtime. Change-Id: If7a245bc62e4bce3ce80972410b0ed307d921abe Reviewed-on: https://go-review.googlesource.com/c/go/+/616255 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-30cmd/compile,runtime: add indirect key/elem to swissmapMichael Pratt
We use the same heuristics as existing maps. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I44bb51483cae2c1714717f1b501850fb9e55a39a Reviewed-on: https://go-review.googlesource.com/c/go/+/616461 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-25runtime: fix mallocgc for asanMichael Anthony Knyszek
This change finally fully fixes mallocgc for asan after the recent refactoring. Here is everything that changed: Fix the accounting for the alloc header; large objects don't have them. Mask out extra bits set from unrolling the bitmap for slice backing stores in writeHeapBitsSmall. The redzone in asan mode makes it so that dataSize is no longer an exact multiple of typ.Size_ in this case (a new assumption I have recently discovered) but we didn't mask out any extra bits, so we'd accidentally set bits in other allocations. Oops. Move the initHeapBits optimization for the 8-byte scan sizeclass on 64-bit platforms up to mallocgc, out from writeHeapBitsSmall. So, this actually caused a problem with asan when the optimization first landed, but we missed it. The issue was then masked once we started passing the redzone down into writeHeapBitsSmall, since the optimization would no longer erroneously fire on asan. What happened was that dataSize would be 8 (because that was the user-provided alloc size) so we'd skip writing heap bits, but it would turn out the redzone bumped the size class, so we'd actually *have* to write the heap bits for that size class. This is not really a problem now *but* it caused problems for me when debugging, since I would try to remove the red zone from dataSize and this would trigger this bug again. Ultimately, this whole situation is confusing because the check in writeHeapBitsSmall is *not* the same as the check in initHeapBits. By moving this check up to mallocgc, we can make the checks align better by matching on the sizeclass, so this should be less error-prone in the future. Change-Id: I1e9819223be23f722f3bf21e63e812f5fb557194 Reviewed-on: https://go-review.googlesource.com/c/go/+/622041 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-25runtime: reserve fewer memory for aligned reservation on sbrk systemsCherry Mui
Sometimes the runtime needs to reserve some memory with a large alignment, which the OS usually won't directly satisfy. So, it asks size+align bytes instead, and frees the unaligned portions. On sbrk systems, this doesn't work that well, as freeing the tail portion doesn't really free the memory to the OS. Instead, we could simply round the current break up, then reserve the given size, without wasting the tail portion. Also, don't create heap arena hints on sbrk systems. We can only grow the break sequentially, and reserving specific addresses would not succeed anyway. For #69018. Change-Id: Iadc2c54d62b00ad7befa5bbf71146523483a8c47 Reviewed-on: https://go-review.googlesource.com/c/go/+/621715 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-24runtime: fix ASAN poison calculation in mallocgcMichael Anthony Knyszek
A previous CL broke the ASAN poisoning calculation in mallocgc by not taking into account a possible allocation header, so the beginning of the following allocation could have been poisoned. This mostly isn't a problem, actually, since the following slot would usually just have an allocation header in it that programs shouldn't be touching anyway, but if we're going a word-past-the-end at the end of a span, we could be poisoning a valid heap allocation. Change-Id: I76a4f59bcef01af513a1640c4c212c0eb6be85b3 Reviewed-on: https://go-review.googlesource.com/c/go/+/622295 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2024-10-23runtime: fix typo in error messagechangwang ma
Change-Id: I27bf98e84545746d90948dd06c4a7bd70782c49d Reviewed-on: https://go-review.googlesource.com/c/go/+/621895 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-21runtime: execute publicationBarrier in noscan case for delayed zeroingMichael Anthony Knyszek
This is a peace-of-mind change to make sure that delayed-zeroed memory (in the large alloc case) is globally visible from the moment the allocation is published back to the caller. The way it's written right now is good enough for the garbage collector (we already have a publication barrier for a nil span.largeType, so the GC will ignore the noscan span) but this might matter for user code on weak memory architectures. Change-Id: I06ac9b95863074e5f09382629083b19bfa87fdb8 Reviewed-on: https://go-review.googlesource.com/c/go/+/619036 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21runtime: specialize heapSetTypeMichael Anthony Knyszek
Last CL we separated mallocgc into several specialized paths. Let's split up heapSetType too. This will make the specialized heapSetType functions inlineable and cut out some branches as well as a function call. Microbenchmark results at this point in the stack: │ before.out │ after-5.out │ │ sec/op │ sec/op vs base │ Malloc8-4 13.52n ± 3% 12.15n ± 2% -10.13% (p=0.002 n=6) Malloc16-4 21.49n ± 2% 18.32n ± 4% -14.75% (p=0.002 n=6) MallocTypeInfo8-4 27.12n ± 1% 18.64n ± 2% -31.30% (p=0.002 n=6) MallocTypeInfo16-4 28.71n ± 3% 21.63n ± 5% -24.65% (p=0.002 n=6) geomean 21.81n 17.31n -20.64% Change-Id: I5de9ac5089b9eb49bf563af2a74e6dc564420e05 Reviewed-on: https://go-review.googlesource.com/c/go/+/614795 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-21runtime: refactor mallocgc into several independent codepathsMichael Anthony Knyszek
Right now mallocgc is a monster of a function. In real programs, we see that a substantial amount of time in mallocgc is spent in mallocgc itself. It's very branch-y, holds a lot of state, and handles quite a few disparate cases, trying to merge them together. This change breaks apart mallocgc into separate, leaner functions. There's some duplication now, but there are a lot of branches that can be pruned as a result. There's definitely still more we can do here. heapSetType can be inlined and broken down for each case, since its internals roughly map to each case anyway (done in a follow-up CL). We can probably also do more with the size class lookups, since we know more about the size of the object in each case than before. Below are the savings for the full stack up until now. │ after-3.out │ after-4.out │ │ sec/op │ sec/op vs base │ Malloc8-4 13.32n ± 2% 12.17n ± 1% -8.63% (p=0.002 n=6) Malloc16-4 21.64n ± 3% 19.38n ± 10% -10.47% (p=0.002 n=6) MallocTypeInfo8-4 23.15n ± 2% 19.91n ± 2% -14.00% (p=0.002 n=6) MallocTypeInfo16-4 25.86n ± 4% 22.48n ± 5% -13.11% (p=0.002 n=6) MallocLargeStruct-4 270.0n ± ∞ ¹ geomean 20.38n 30.97n -11.58% Change-Id: I681029c0b442f9221c4429950626f06299a5cfe4 Reviewed-on: https://go-review.googlesource.com/c/go/+/614257 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21runtime: break out the debug.malloc codepaths into functionsMichael Anthony Knyszek
This change breaks out the debug.malloc codepaths into dedicated functions, both for making mallocgc easier to read, and to reduce the function's size (currently all that code is inlined and really doesn't need to be). This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts"). Change-Id: I30b3ab4a1f349ba85b4a1b5b2c399abcdfe4844f Reviewed-on: https://go-review.googlesource.com/c/go/+/617879 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21runtime: move debug checks behind constant flag in mallocgcMichael Anthony Knyszek
These debug checks are very occasionally helpful, but they do cost real time. The biggest issue seems to be the bloat of mallocgc due to the "throw" paths. Overall, after some follow-ups, this change cuts about 1ns off of the mallocgc fast path. This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts"). Change-Id: I07c4547ad724b9f94281320846677fb558957721 Reviewed-on: https://go-review.googlesource.com/c/go/+/617878 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21runtime: rename shouldhelpgc to checkGCTrigger in mallocgcMichael Anthony Knyszek
shouldhelpgc is a very unhelpful name, because it has nothing to do with assists and solely to do with GC triggering. Name it checkGCTrigger instead, which is much clearer. Change-Id: Id38debd424ddb397376c0cea6e74b3fe94002f71 Reviewed-on: https://go-review.googlesource.com/c/go/+/617877 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21runtime: recompute assistG before and after mallocMichael Anthony Knyszek
This change stops tracking assistG across malloc to reduce number of slots the compiler must keep track of in mallocgc, which adds to register pressure. It also makes the call to deductAssistCredit only happen if the GC is running. This is a microoptimization that on its own changes very little, but together with other optimizations and a breaking up of the various malloc paths will matter all together ("death by a thousand cuts"). Change-Id: I4cfac7f3e8e873ba66ff3b553072737a4707e2c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/617876 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-21runtime: use wb flag instead of gcphase for allocate-black checkMichael Anthony Knyszek
This is an allocator microoptimization. There's no reason to check gcphase in general, since it's mostly for debugging anyway. writeBarrier.enabled is set in all the same cases here, and we force one fewer cache line (probably) to be touched during malloc. Conceptually, it also makes a bit more sense. The allocate-black policy is partly informed by the write barrier design. Change-Id: Ia5ff593d64c29cf7f4d1bced3204056566444a98 Reviewed-on: https://go-review.googlesource.com/c/go/+/617875 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-21runtime: simplify mem profile checking in mallocgcMichael Anthony Knyszek
Checking whether the current allocation needs to be profiled is currently branch-y and weirdly a lot of code. The branches are generally predictable, but it's a surprising number of instructions. Part of the problem is that MemProfileRate is just a global that can be set at any time, so we need to load it and check certain settings explicitly. In an ideal world, we would just always subtract from nextSample and have a single branch to take the slow path if we subtract below zero. If MemProfileRate were a function, we could trash all the nextSample values intentionally in each mcache. This would be slow, but MemProfileRate changes rarely while the malloc hot path is well, hot. Unfortunate... Although this ideal world is, AFAICT, impossible, we can still get close. If we cache the value of MemProfileRate in each mcache, then we can force malloc to take the slow path whenever MemProfileRate changes. This does require two additional loads, but crucially, these loads are independent of everything else in mallocgc. Furthermore, the branch dependent on those loads is incredibly predictable in practice. This CL on its own has little-to-no impact on mallocgc. But this codepath is going to be duplicated in several places in the next CL, so it'll pay to simplify it. Also, we're very much trying to remedy a death-by-a-thousand-cuts situation, and malloc is currently still kind of a monster -- it will not help if mallocgc isn't really streamlined itself. Lastly, there's a nice property now that all nextSample values get immediately re-sampled when MemProfileRate changes. Change-Id: I6443d0cf9bd7861595584442b675ac1be8ea3455 Reviewed-on: https://go-review.googlesource.com/c/go/+/615815 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2024-10-21runtime: don't call span.heapBits in writeHeapBitsSmallMichael Anthony Knyszek
For whatever reason, span.heapBits is kind of slow. It accounts for about a quarter of the cost of writeHeapBitsSmall, which is absurd. We get a nice speed improvement for small allocations by eliminating this call. │ before │ after │ │ sec/op │ sec/op vs base │ MallocTypeInfo16-4 29.47n ± 1% 27.02n ± 1% -8.31% (p=0.002 n=6) Change-Id: I6270e26902e5a9254cf1503fac81c3c799c59d6a Reviewed-on: https://go-review.googlesource.com/c/go/+/614255 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-10-08internal/runtime/maps: initial swiss table map implementationMichael Pratt
Add a new package that will contain a new "Swiss Table" (https://abseil.io/about/design/swisstables) map implementation, which is intended to eventually replace the existing runtime map implementation. This implementation is based on the fabulous github.com/cockroachdb/swiss package contributed by Peter Mattis. This CL adds an hash map implementation. It supports all the core operations, but does not have incremental growth. For #54766. Change-Id: I52cf371448c3817d471ddb1f5a78f3513565db41 Reviewed-on: https://go-review.googlesource.com/c/go/+/582415 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-09-10runtime: update documentation for ios addr bitsWuGuangyao
After this merge: https://go-review.googlesource.com/c/go/+/344401, ios/arm64 was treated as a 64 bit system and the addr bits of ios/arm64 was set to 40 Change-Id: I32d72787d20a3cf952b036e3e887cf5bae2273d8 GitHub-Last-Rev: 8917029fddc4d187b24fad8245fd7eed2bd570ba GitHub-Pull-Request: golang/go#69343 Reviewed-on: https://go-review.googlesource.com/c/go/+/610856 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Tim King <taking@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-07-23runtime,internal: move runtime/internal/sys to internal/runtime/sysDavid Chase
Cleanup and friction reduction For #65355. Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511 Reviewed-on: https://go-review.googlesource.com/c/go/+/600436 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-07-23runtime,internal: move runtime/internal/math to internal/runtime/mathDavid Chase
Cleanup and friction reduction. Updates #65355. Change-Id: I6c4fcd409d044c00d16561fe9ed2257877d73f5b Reviewed-on: https://go-review.googlesource.com/c/go/+/600435 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-05-29all: document legacy //go:linkname for final round of modulesRuss Cox
Add linknames for most modules with ≥50 dependents. Add linknames for a few other modules that we know are important but are below 50. Remove linknames from badlinkname.go that do not merit inclusion (very small number of dependents). We can add them back later if the need arises. Fixes #67401. (For now.) Change-Id: I1e49fec0292265256044d64b1841d366c4106002 Reviewed-on: https://go-review.googlesource.com/c/go/+/587756 Auto-Submit: Russ Cox <rsc@golang.org> TryBot-Bypass: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-29all: document legacy //go:linkname for modules with ≥100 dependentsRuss Cox
For #67401. Change-Id: I015408a3f437c1733d97160ef2fb5da6d4efcc5c Reviewed-on: https://go-review.googlesource.com/c/go/+/587598 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-05-23all: document legacy //go:linkname for modules with ≥200 dependentsRuss Cox
Ignored these linknames which have not worked for a while: github.com/xtls/xray-core: context.newCancelCtx removed in CL 463999 (Feb 2023) github.com/u-root/u-root: funcPC removed in CL 513837 (Jul 2023) tinygo.org/x/drivers: net.useNetdev never existed For #67401. Change-Id: I9293f4ef197bb5552b431de8939fa94988a060ce Reviewed-on: https://go-review.googlesource.com/c/go/+/587576 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-23all: document legacy //go:linkname for modules with ≥500 dependentsRuss Cox
For #67401. Change-Id: I7dd28c3b01a1a647f84929d15412aa43ab0089ee Reviewed-on: https://go-review.googlesource.com/c/go/+/587575 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-23all: document legacy //go:linkname for modules with ≥1,000 dependentsRuss Cox
For #67401. Change-Id: If23a2c07e3dd042a3c439da7088437a330b9caa4 Reviewed-on: https://go-review.googlesource.com/c/go/+/587222 Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-23all: document legacy //go:linkname for modules with ≥2,000 dependentsRuss Cox
For #67401. Change-Id: I3ae93042dffd0683b7e6d6225536ae667749515b Reviewed-on: https://go-review.googlesource.com/c/go/+/587221 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>