path: root/src/runtime/mheap.go
Age | Commit message | Author
2025-06-20 | runtime: fix struct comment | cuishuang
Change-Id: I0c33830b13c8a187ac82504c7653abb8f8cf7530 Reviewed-on: https://go-review.googlesource.com/c/go/+/681655 Reviewed-by: Sean Liao <sean@liao.dev> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Sean Liao <sean@liao.dev>
2025-06-18 | runtime: set mspan limit field early and eagerly | Michael Anthony Knyszek
Currently the mspan limit field is set after allocSpan returns, *after* the span has already been published to the GC (including the conservative scanner). But the limit field is load-bearing, because it's checked to filter out invalid pointers. A stale limit value could cause a crash by having the conservative scanner access allocBits out of bounds. Fix this by setting the mspan limit field before publishing the span. For large objects and arena chunks, we adjust the limit down after allocSpan because we don't have access to the true object's size from allocSpan. However this is safe, since we first initialize the limit to something definitely safe (the actual span bounds) and only adjust it down after. Adjusting it down has the benefit of more precise debug output, but the window in which it's imprecise is also fine because a single object (logically, with arena chunks) occupies the whole span, so the 'invalid' part of the memory will just safely point back to that object. We can't do this for smaller objects because the limit will include space that does *not* contain any valid objects. Fixes #74288. Change-Id: I0a22e5b9bccc1bfdf51d2b73ea7130f1b99c0c7c Reviewed-on: https://go-review.googlesource.com/c/go/+/682655 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
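To illustrate why the limit has to be valid before a span becomes visible to a scanner, here is a minimal sketch. The `span` type, `newSpan`, and `objIndex` are hypothetical simplifications, not the runtime's actual code. The filter rejects any address at or beyond `limit`, so the index it returns stays inside the bitmap only if `limit` is already correct when the span is published.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for the runtime's mspan.
type span struct {
	base, limit uintptr // objects live in [base, limit)
	elemSize    uintptr
	allocBits   []bool // one entry per object slot
}

// newSpan sets limit before the span is returned (that is, before it
// could be "published" to anything that filters pointers with it).
func newSpan(base, elemSize uintptr, nelems int) *span {
	return &span{
		base:      base,
		limit:     base + elemSize*uintptr(nelems),
		elemSize:  elemSize,
		allocBits: make([]bool, nelems),
	}
}

// objIndex returns the object slot that p points into, or false if p is
// outside [base, limit). With a correct limit, idx < len(allocBits).
func (s *span) objIndex(p uintptr) (int, bool) {
	if p < s.base || p >= s.limit {
		return 0, false
	}
	return int((p - s.base) / s.elemSize), true
}

func main() {
	// 170 objects of 48 bytes use 8160 of the span's 8192 bytes.
	s := newSpan(0x1000, 48, 170)
	for _, p := range []uintptr{0x1000, 0x1000 + 8159, 0x1000 + 8160} {
		idx, ok := s.objIndex(p)
		fmt.Printf("%#x -> idx=%d ok=%v\n", p, idx, ok)
	}
}
```

In this sketch the last address lies inside the span's memory but past the final object, so the filter rejects it instead of indexing `allocBits` out of bounds.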
2025-05-29 | runtime, internal/synctest, sync: associate WaitGroups with bubbles | Damien Neil
Add support to internal/synctest for managing associations between arbitrary pointers and synctest bubbles. (Implemented internally to the runtime package by attaching a special to the pointer.) Associate WaitGroups with bubbles. Since WaitGroups don't have a constructor, perform the association when Add is called. All Add calls must be made from within the same bubble, or outside any bubble. When a bubbled goroutine calls WaitGroup.Wait, the wait is durably blocking iff the WaitGroup is associated with the current bubble. Change-Id: I77e2701e734ac2fa2b32b28d5b0c853b7b2825c9 Reviewed-on: https://go-review.googlesource.com/c/go/+/676656 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Damien Neil <dneil@google.com>
2025-05-21 | runtime: use the immortal weak handle map for sbrk mode | Michael Anthony Knyszek
Currently weak pointers break in sbrk mode. We can just use the immortal weak handle map for weak pointers in this case, since nothing is ever freed. Fixes #69729. Change-Id: Ie9fa7e203c22776dc9eb3601c6480107d9ad0c99 Reviewed-on: https://go-review.googlesource.com/c/go/+/674656 Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
2025-05-21 | runtime: add valgrind instrumentation | Roland Shoemaker
Add build tag gated Valgrind annotations to the runtime which let it understand how the runtime manages memory. This allows for Go binaries to be run under Valgrind without emitting spurious errors. Instead of adding the Valgrind headers to the tree, and using cgo to call the various Valgrind client request macros, we just add an assembly function which emits the necessary instructions to trigger client requests. In particular we add instrumentation of the memory allocator, using a two-level mempool structure (as described in the Valgrind manual [0]). We also add annotations which allow Valgrind to track which memory we use for stacks, which seems necessary to let it properly function. We describe the memory model to Valgrind as follows: we treat heap arenas as a "pool" created with VALGRIND_CREATE_MEMPOOL_EXT (so that we can use VALGRIND_MEMPOOL_METAPOOL and VALGRIND_MEMPOOL_AUTO_FREE). Within the pool we treat spans as "superblocks", annotated with VALGRIND_MEMPOOL_ALLOC. We then allocate individual objects within spans with VALGRIND_MALLOCLIKE_BLOCK. It should be noted that running binaries under Valgrind can be _quite slow_, and certain operations, such as running the GC, can be _very slow_. It is recommended to run programs with GOGC=off. Additionally, async preemption should be turned off, since it'll cause strange behavior (GODEBUG=asyncpreemptoff=1). Running Valgrind with --leak-check=yes will result in some errors resulting from some things not being marked fully free'd. These likely need more annotations to rectify, but for now it is recommended to run with --leak-check=off. Updates #73602 [0] https://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools Change-Id: I71b26c47d7084de71ef1e03947ef6b1cc6d38301 Reviewed-on: https://go-review.googlesource.com/c/go/+/674077 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-20 | runtime: mark and identify tiny blocks in checkfinalizers mode | Michael Anthony Knyszek
This change adds support for identifying cleanups and finalizers attached to tiny blocks to checkfinalizers mode. It also notes a subtle pitfall, which is that the cleanup arg, if tiny-allocated, could end up co-located with the object with the cleanup attached! Oops... For #72949. Change-Id: Icbe0112f7dcfc63f35c66cf713216796a70121ce Reviewed-on: https://go-review.googlesource.com/c/go/+/662037 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-05-20 | runtime: annotate checkfinalizers reports with source and type info | Michael Anthony Knyszek
This change adds a new special kind called CheckFinalizer which is used to annotate finalizers and cleanups with extra information about where that cleanup or finalizer came from. For #72949. Change-Id: I3c1ace7bd580293961b7f0ea30345a6ce956d340 Reviewed-on: https://go-review.googlesource.com/c/go/+/662135 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-20 | runtime: add new GODEBUG checkfinalizer | Michael Anthony Knyszek
This new debug mode detects cleanup/finalizer leaks using checkmark mode. It runs a partial GC using only specials as roots. If the GC can find a path from one of these roots back to the object the special is attached to, then the object might never be reclaimed. (The cycle could be broken in the future, but it's almost certainly a bug.) This debug mode is very barebones. It contains no type information and no stack location for where the finalizer or cleanup was created. For #72949. Change-Id: Ibffd64c1380b51f281950e4cfe61f677385d42a5 Reviewed-on: https://go-review.googlesource.com/c/go/+/634599 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
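The reachability idea described above can be sketched on a toy object graph. This is only an illustration under assumed names (`node`, `reaches`); the runtime's checkmark machinery works on real heap metadata and is not shown here. The special's referents (for example a cleanup argument) act as the root, and a path from that root back to the object means the object could keep itself alive.

```go
package main

import "fmt"

// node is a hypothetical heap object with outgoing pointers.
type node struct {
	name string
	refs []*node
}

// reaches reports whether target is reachable from root by following refs.
func reaches(root, target *node) bool {
	seen := map[*node]bool{}
	stack := []*node{root}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if n == target {
			return true
		}
		if seen[n] {
			continue
		}
		seen[n] = true
		stack = append(stack, n.refs...)
	}
	return false
}

func main() {
	obj := &node{name: "obj"}
	// A cleanup argument that indirectly points back at obj would keep
	// obj reachable, so its cleanup might never run.
	leakyArg := &node{name: "arg", refs: []*node{{name: "mid", refs: []*node{obj}}}}
	// An argument with no path back to obj is fine.
	safeArg := &node{name: "fd"}
	fmt.Println(reaches(leakyArg, obj), reaches(safeArg, obj))
}
```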
2025-05-19 | runtime: move atoi to internal/runtime/strconv | Michael Pratt
Moving to a smaller package allows its use in other internal/runtime packages. This isn't internal/strconvlite since it can't be used directly by strconv. For #73193. Change-Id: I6a6a636c9c8b3f06b5fd6c07fe9dd5a7a37d1429 Reviewed-on: https://go-review.googlesource.com/c/go/+/672697 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2025-05-08 | runtime: remove ptr/scalar bitmap metric | khr@golang.org
We don't use this mechanism any more, so the metric will always be zero. Since CL 616255. Update #73628 Change-Id: Ic179927a8bc24e6291876c218d88e8848b057c2a Reviewed-on: https://go-review.googlesource.com/c/go/+/671096 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-08 | runtime: schedule cleanups across multiple goroutines | Michael Anthony Knyszek
This change splits the finalizer and cleanup queues and implements a new lock-free blocking queue for cleanups. The basic design is as follows:

The cleanup queue is organized in fixed-sized blocks. Individual cleanup functions are queued, but only whole blocks are dequeued.

Enqueuing cleanups places them in P-local cleanup blocks. These are flushed to the full list as they get full. Cleanups can only be enqueued by an active sweeper.

Dequeuing cleanups always dequeues entire blocks from the full list. Cleanup blocks can be dequeued and executed at any time.

The very last active sweeper in the sweep phase is responsible for flushing all local cleanup blocks to the full list. It can do this without any synchronization because the next GC can't start yet, so we can be very certain that nobody else will be accessing the local blocks.

Cleanup blocks are stored off-heap because they need to be allocated by the sweeper, which is called from heap allocation paths. As a result, the GC treats cleanup blocks as roots, just like finalizer blocks.

Flushes to the full list signal to the scheduler that cleanup goroutines should be awoken. Every time the scheduler goes to wake up a cleanup goroutine and there were more signals than goroutines to wake, it then forwards this signal to runtime.AddCleanup, so that it creates another goroutine the next time it is called, up to gomaxprocs goroutines.

The signals here are a little convoluted, but exist because the sweeper and the scheduler cannot safely create new goroutines.

For #71772. For #71825. Change-Id: Ie839fde2b67e1b79ac1426be0ea29a8d923a62cc Reviewed-on: https://go-review.googlesource.com/c/go/+/650697 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
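The block-based queue described in this commit can be sketched roughly as follows. This is a simplified model under stated assumptions: it uses a mutex rather than the lock-free structure the commit implements, and `blockSize`, `block`, `queue`, and `local` are hypothetical names chosen for illustration. It shows only the shape of the design: cleanups are enqueued one at a time into a local block, whole blocks move to the full list, and consumers dequeue whole blocks.

```go
package main

import (
	"fmt"
	"sync"
)

const blockSize = 4 // hypothetical; kept small for demonstration

// block holds a fixed-capacity batch of cleanup functions.
type block struct {
	fns []func()
}

// queue is the shared "full list" of blocks ready to run.
type queue struct {
	mu   sync.Mutex
	full []*block
}

// local is a stand-in for a P-local cleanup block.
type local struct {
	q   *queue
	cur *block
}

// enqueue adds one cleanup to the local block and flushes it when full.
func (l *local) enqueue(f func()) {
	if l.cur == nil {
		l.cur = &block{fns: make([]func(), 0, blockSize)}
	}
	l.cur.fns = append(l.cur.fns, f)
	if len(l.cur.fns) == blockSize {
		l.flush()
	}
}

// flush moves the current block, even if only partly filled, to the full list.
func (l *local) flush() {
	if l.cur == nil {
		return
	}
	l.q.mu.Lock()
	l.q.full = append(l.q.full, l.cur)
	l.q.mu.Unlock()
	l.cur = nil
}

// dequeue removes and returns one whole block, or nil if the list is empty.
func (q *queue) dequeue() *block {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.full) == 0 {
		return nil
	}
	b := q.full[0]
	q.full = q.full[1:]
	return b
}

func main() {
	q := &queue{}
	l := &local{q: q}
	ran := 0
	for i := 0; i < 6; i++ {
		l.enqueue(func() { ran++ })
	}
	l.flush() // models the last sweeper flushing leftover local blocks
	for b := q.dequeue(); b != nil; b = q.dequeue() {
		for _, f := range b.fns {
			f()
		}
	}
	fmt.Println("cleanups run:", ran)
}
```

Batching by block means a consumer touches the shared list once per block rather than once per cleanup, which is the property the fixed-size block layout is meant to provide.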
2025-05-02 | runtime: mark and scan small objects in whole spans [green tea] | Michael Anthony Knyszek
Our current parallel mark algorithm suffers from frequent stalls on memory since its access pattern is essentially random. Small objects are the worst offenders, since each one forces pulling in at least one full cache line to access even when the amount to be scanned is far smaller than that. Each object also requires an independent access to per-object metadata. The purpose of this change is to improve garbage collector performance by scanning small objects in batches to obtain better cache locality than our current approach. The core idea behind this change is to defer marking and scanning small objects, and then scan them in batches localized to a span. This change adds scanned bits to each small object (<=512 bytes) span in addition to mark bits. The scanned bits indicate that the object has been scanned. (One way to think of them is "grey" bits and "black" bits in the tri-color mark-sweep abstraction.) Each of these spans is always 8 KiB and if they contain pointers, the pointer/scalar data is already packed together at the end of the span, allowing us to further optimize the mark algorithm for this specific case. When the GC encounters a pointer, it first checks if it points into a small object span. If so, it is first marked in the mark bits, and then the object is queued on a work-stealing P-local queue. This object represents the whole span, and we ensure that a span can only appear at most once in any queue by maintaining an atomic ownership bit for each span. Later, when the pointer is dequeued, we scan every object with a set mark that doesn't have a corresponding scanned bit. If it turns out that was the only object in the mark bits since the last time we scanned the span, we scan just that object directly, essentially falling back to the existing algorithm. noscan objects have no scan work, so they are never queued. Each span's mark and scanned bits are co-located together at the end of the span. 
Since the span is always 8 KiB in size, it can be found with simple pointer arithmetic. Next to the marks and scans we also store the size class, eliminating the need to access the span's mspan altogether. The work-stealing P-local queue is a new source of GC work. If this queue gets full, half of it is dumped to a global linked list of spans to scan. The regular scan queues are always prioritized over this queue to allow time for darts to accumulate. Stealing work from other Ps is a last resort. This change also adds a new debug mode under GODEBUG=gctrace=2 that dumps whole-span scanning statistics by size class on every GC cycle. A future extension to this CL is to use SIMD-accelerated scanning kernels for scanning spans with high mark bit density. For #19112. (Deadlock averted in GOEXPERIMENT.) For #73581. Change-Id: I4bbb4e36f376950a53e61aaaae157ce842c341bc Reviewed-on: https://go-review.googlesource.com/c/go/+/658036 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
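The interplay of mark bits and scanned bits can be sketched with two bitmaps. This is an illustrative model only: `spanBits` is a hypothetical type limited to 64 objects, and it leaves out the ownership bit, the queues, and the real metadata layout. The key step is that a batch scan visits exactly the objects whose mark bit is set and whose scanned bit is not.

```go
package main

import (
	"fmt"
	"math/bits"
)

// spanBits is hypothetical per-span metadata for up to 64 objects.
type spanBits struct {
	marks   uint64 // object has been reached
	scanned uint64 // object has already been scanned
}

// mark sets the mark bit for object i and reports whether it was newly set.
func (s *spanBits) mark(i uint) bool {
	bit := uint64(1) << i
	if s.marks&bit != 0 {
		return false
	}
	s.marks |= bit
	return true
}

// scanPending records every marked-but-unscanned object as scanned and
// returns their indices in ascending order.
func (s *spanBits) scanPending() []int {
	pending := s.marks &^ s.scanned
	s.scanned |= pending
	var out []int
	for pending != 0 {
		out = append(out, bits.TrailingZeros64(pending))
		pending &= pending - 1 // clear the lowest set bit
	}
	return out
}

func main() {
	var s spanBits
	s.mark(3)
	s.mark(10)
	s.mark(3)                    // already marked; no new work
	fmt.Println(s.scanPending()) // objects 3 and 10 are scanned together
	s.mark(5)
	fmt.Println(s.scanPending()) // only the newly marked object 5
}
```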
2025-04-24 | runtime: fix typo in comment | changwang ma
Change-Id: I85f518e36c18f4f0eda8b167750b43cd8c48ecff Reviewed-on: https://go-review.googlesource.com/c/go/+/622675 Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-04-23 | runtime: move sizeclass defs to new package internal/runtime/gc | Michael Anthony Knyszek
We will want to reference these definitions from new generator programs, and this is a good opportunity to clean up all these old C-style names. Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de Reviewed-on: https://go-review.googlesource.com/c/go/+/655275 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-03-04 | runtime: decorate anonymous memory mappings | Lénaïc Huard
Leverage the prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) API to name the anonymous memory areas. This API was introduced in Linux 5.17 to decorate the anonymous memory areas shown in /proc/<pid>/maps. It is already used by glibc. See:
* https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=27dfd1eb907f4615b70c70237c42c552bb4f26a8;hb=HEAD#l2434
* https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/setvmaname.c;h=ea93a5ffbebc9e5a7e32a297138f465724b4725f;hb=HEAD#l63

This can be useful when investigating the memory consumption of a multi-language program. For a 100% Go program, the pprof profiler can be used to profile memory consumption, but pprof is only aware of what happens within the Go world. In a multi-language program, it can be unclear whether suspicious extra memory consumption comes from the Go part or the native part.

With this change, the following Go program:

    package main

    import (
        "fmt"
        "log"
        "os"
    )

    /*
    #include <stdlib.h>

    void f(void)
    {
      (void)malloc(1024*1024*1024);
    }
    */
    import "C"

    func main() {
        C.f()

        data, err := os.ReadFile("/proc/self/maps")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(data))
    }

produces this output:

$ GLIBC_TUNABLES=glibc.mem.decorate_maps=1 ~/doc/devel/open-source/go/bin/go run .
00400000-00402000 r--p 00000000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00402000-004a4000 r-xp 00002000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 004a4000-00574000 r--p 000a4000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00574000-00575000 r--p 00173000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00575000-00580000 rw-p 00174000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name 00580000-005a4000 rw-p 00000000 00:00 0 2e075000-2e096000 rw-p 00000000 00:00 0 [heap] c000000000-c000400000 rw-p 00000000 00:00 0 [anon: Go: heap] c000400000-c004000000 ---p 00000000 00:00 0 [anon: Go: heap reservation] 777f40000000-777f40021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f40021000-777f44000000 ---p 00000000 00:00 0 777f44000000-777f44021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f44021000-777f48000000 ---p 00000000 00:00 0 777f48000000-777f48021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f48021000-777f4c000000 ---p 00000000 00:00 0 777f4c000000-777f4c021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f4c021000-777f50000000 ---p 00000000 00:00 0 777f50000000-777f50021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena] 777f50021000-777f54000000 ---p 00000000 00:00 0 777f55afb000-777f55afc000 ---p 00000000 00:00 0 777f55afc000-777f562fc000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216378] 777f562fc000-777f562fd000 ---p 00000000 00:00 0 777f562fd000-777f56afd000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216377] 777f56afd000-777f56afe000 ---p 00000000 00:00 0 777f56afe000-777f572fe000 rw-p 00000000 
00:00 0 [anon: glibc: pthread stack: 216376] 777f572fe000-777f572ff000 ---p 00000000 00:00 0 777f572ff000-777f57aff000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216375] 777f57aff000-777f57b00000 ---p 00000000 00:00 0 777f57b00000-777f58300000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216374] 777f58300000-777f58400000 rw-p 00000000 00:00 0 [anon: Go: page alloc index] 777f58400000-777f5a400000 rw-p 00000000 00:00 0 [anon: Go: heap index] 777f5a400000-777f6a580000 ---p 00000000 00:00 0 [anon: Go: scavenge index] 777f6a580000-777f6a581000 rw-p 00000000 00:00 0 [anon: Go: scavenge index] 777f6a581000-777f7a400000 ---p 00000000 00:00 0 [anon: Go: scavenge index] 777f7a400000-777f8a580000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f8a580000-777f8a581000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f8a581000-777f9c430000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9c430000-777f9c431000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9c431000-777f9e806000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9e806000-777f9e807000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9e807000-777f9ec00000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ec36000-777f9ecb6000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9ecb6000-777f9ecc6000 rw-p 00000000 00:00 0 [anon: Go: gc bits] 777f9ecc6000-777f9ecd6000 rw-p 00000000 00:00 0 [anon: Go: allspans array] 777f9ecd6000-777f9ece7000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9ece7000-777f9ed67000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ed67000-777f9ed68000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9ed68000-777f9ede7000 ---p 00000000 00:00 0 [anon: Go: page summary] 777f9ede7000-777f9ee07000 rw-p 00000000 00:00 0 [anon: Go: page alloc] 777f9ee07000-777f9ee0a000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc] 777f9ee0a000-777f9ee2e000 r--p 00000000 00:21 48158213 /usr/lib/libc.so.6 777f9ee2e000-777f9ef9f000 r-xp 00024000 00:21 48158213 
/usr/lib/libc.so.6 777f9ef9f000-777f9efee000 r--p 00195000 00:21 48158213 /usr/lib/libc.so.6 777f9efee000-777f9eff2000 r--p 001e3000 00:21 48158213 /usr/lib/libc.so.6 777f9eff2000-777f9eff4000 rw-p 001e7000 00:21 48158213 /usr/lib/libc.so.6 777f9eff4000-777f9effc000 rw-p 00000000 00:00 0 777f9effc000-777f9effe000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc] 777f9f00a000-777f9f04a000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata] 777f9f04a000-777f9f04c000 r--p 00000000 00:00 0 [vvar] 777f9f04c000-777f9f04e000 r--p 00000000 00:00 0 [vvar_vclock] 777f9f04e000-777f9f050000 r-xp 00000000 00:00 0 [vdso] 777f9f050000-777f9f051000 r--p 00000000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f051000-777f9f07a000 r-xp 00001000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f07a000-777f9f085000 r--p 0002a000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f085000-777f9f087000 r--p 00034000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f087000-777f9f088000 rw-p 00036000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2 777f9f088000-777f9f089000 rw-p 00000000 00:00 0 7ffc7bfa7000-7ffc7bfc8000 rw-p 00000000 00:00 0 [stack] ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall] The anonymous memory areas are now labelled so that we can see which ones have been allocated by the Go runtime versus which ones have been allocated by the glibc. Fixes #71546 Change-Id: I304e8b4dd7f2477a6da794fd44e9a7a5354e4bf4 Reviewed-on: https://go-review.googlesource.com/c/go/+/646095 Auto-Submit: Alan Donovan <adonovan@google.com> Commit-Queue: Alan Donovan <adonovan@google.com> Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-02-19 | weak: accept linker-allocated objects to Make | Michael Anthony Knyszek
Currently Make panics when passed a linker-allocated object. This is inconsistent with both runtime.AddCleanup and runtime.SetFinalizer. Not panicking in this case is important so that all pointers can be treated equally by these APIs. Libraries should not have to worry where a pointer came from to still make weak pointers. Supporting this behavior is a bit complex for weak pointers versus finalizers and cleanups. For the latter two, it means a function is never called, so we can just drop everything on the floor. For weak pointers, we still need to produce pointers that compare as per the API. To do this, copy the tiny lock-free trace map implementation and use it to store weak handles for "immortal" objects. These paths in the runtime should be rare, so it's OK if it's not incredibly fast, but we should keep the memory footprint relatively low (at least not have it be any worse than specials), so this change tweaks the map implementation a little bit to ensure that's the case. Fixes #71726. Change-Id: I0c87c9d90656d81659ac8d70f511773d0093ce27 Reviewed-on: https://go-review.googlesource.com/c/go/+/649460 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-03 | runtime: fix GODEBUG=gccheckmark=1 and add smoke test | Michael Anthony Knyszek
This change fixes GODEBUG=gccheckmark=1 which seems to have bit-rotted. Because the root jobs weren't being reset, it wasn't doing anything. Then, it turned out that checkmark mode would queue up noscan objects in workbufs, which caused it to fail. Then it turned out checkmark mode was broken with user arenas, since their heap arenas are not registered anywhere. Then, it turned out that checkmark mode could just not run properly if the goroutine's preemption flag was set (since sched.gcwaiting is true during the STW). And lastly, it turned out that async preemption could cause erroneous checkmark failures. This change fixes all these issues and adds a simple smoke test to dist to run the runtime tests under gccheckmark, which exercises all of these issues. Fixes #69074. Fixes #69377. Fixes #69376. Change-Id: Iaa0bb7b9e63ed4ba34d222b47510d6292ce168bc Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/608915 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-12-09 | runtime: make special offset a uintptr | Michael Anthony Knyszek
Currently specials try to save on space by only encoding the offset from the base of the span in a uint16. This worked fine up until Go 1.24. - Most specials have an offset of 0 (mem profile, finalizers, etc.) - Cleanups do not care about the offset at all, so even if it's wrong, it's OK. - Weak pointers *do* care, but the unique package always makes a new allocation, so the weak pointer handle offset it makes is always zero. With Go 1.24 and general weak pointers now available, nothing is stopping someone from just creating a weak pointer that is >64 KiB offset from the start of an object, and this weak pointer must be distinct from others. Fix this problem by just increasing the size of a special and making the offset a uintptr, to capture all possible offsets. Since we're in the freeze, this is the safest thing to do. Specials aren't so common that I expect a substantial memory increase from this change. In a future release (or if there is a problem) we can almost certainly pack the special's kind and offset together. There was already a bunch of wasted space due to padding, so this would bring us back to the same memory footprint before this change. Also, add tests for equality of basic weak interior pointers. This works, but we really should've had tests for it. Fixes #70739. Change-Id: Ib49a7f8f0f1ec3db4571a7afb0f4d94c8a93aa40 Reviewed-on: https://go-review.googlesource.com/c/go/+/634598 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> Commit-Queue: Michael Knyszek <mknyszek@google.com>
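The truncation problem is easy to reproduce in isolation. In this small sketch, `narrowOffset` is a hypothetical helper that mimics storing a span-relative offset in a 16-bit field; it is not a runtime function. Two distinct offsets that differ by exactly 64 KiB become indistinguishable once narrowed.

```go
package main

import "fmt"

// narrowOffset mimics storing an offset in a uint16 field:
// only the low 16 bits survive.
func narrowOffset(off uintptr) uint16 {
	return uint16(off)
}

func main() {
	small := uintptr(4464)
	large := uintptr(70000) // 4464 + 65536, i.e. more than 64 KiB into the object
	fmt.Println(narrowOffset(small), narrowOffset(large))
	fmt.Println("collide:", narrowOffset(small) == narrowOffset(large))
}
```

Widening the field to `uintptr`, as the commit does, removes the collision at the cost of a larger special.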
2024-11-22 | runtime: properly search for cleanups in cleanup.stop | Carlos Amedee
This change modifies the logic which searches for existing cleanups. The existing search logic sets the next node to the current node in certain conditions. This would cause future searches to loop endlessly. The existing loop could convert non-cleanup specials into cleanups and cause data corruption. This also changes where we release the m while we are adding a cleanup. We are currently holding onto a p-specific gcwork after releasing the m. Change-Id: I0ac0b304f40910549c8df114e523c89d9f0d7a75 Reviewed-on: https://go-review.googlesource.com/c/go/+/630278 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Carlos Amedee <carlos@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
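For reference, a search-and-unlink over a singly linked list that avoids both failure modes (self-referencing `next` pointers and touching nodes of the wrong kind) can be sketched like this. The `special` type, the kind constants, and `removeCleanup` are hypothetical simplifications and do not reproduce the runtime's data structures or locking.

```go
package main

import "fmt"

const (
	kindFinalizer = iota
	kindCleanup
)

// special is a hypothetical, simplified list node.
type special struct {
	next *special
	kind int
	id   uint64
}

// removeCleanup unlinks the first cleanup special with the given id and
// reports whether one was found. Nodes of other kinds are never modified.
func removeCleanup(head **special, id uint64) bool {
	// pp always points at the link that leads to the current node, so
	// unlinking is a single assignment and no node can end up pointing
	// at itself.
	for pp := head; *pp != nil; pp = &(*pp).next {
		s := *pp
		if s.kind == kindCleanup && s.id == id {
			*pp = s.next
			return true
		}
	}
	return false
}

func main() {
	c2 := &special{kind: kindCleanup, id: 2}
	c1 := &special{kind: kindCleanup, id: 1, next: c2}
	head := &special{kind: kindFinalizer, id: 1, next: c1}

	fmt.Println(removeCleanup(&head, 1)) // removes c1, not the finalizer with the same id
	fmt.Println(removeCleanup(&head, 9)) // no such cleanup
	for s := head; s != nil; s = s.next {
		fmt.Println(s.kind, s.id)
	}
}
```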
2024-11-20 | runtime: keep cleanup closure alive across adding the cleanup special | Michael Anthony Knyszek
This is similar to the weak handle bug in #70455. In short, there's a window where a heap-allocated value is only visible through a special that has not been made visible to the GC yet. For #70455. Change-Id: Ic2bb2c60d422a5bc5dab8d971cfc26ff6d7622bc Reviewed-on: https://go-review.googlesource.com/c/go/+/630277 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-20 | runtime: explicitly keep handle alive during getOrAddWeakHandle | Michael Anthony Knyszek
getOrAddWeakHandle is very careful about keeping its input alive across the operation, but not very careful about keeping the heap-allocated handle it creates alive. In fact, there's a window in this function where it is *only* visible via the special. Specifically, the window of time between when the handle is stored in the special and when the special actually becomes visible to the GC. (If we fail to add the special because it already exists, that case is fine. We don't even use the same handle value, but the one we obtain from the attached GC-visible special, *and* we return that value, so it remains live.) Fixes #70455. Change-Id: Iadaff0cfb93bcaf61ba2b05be7fa0519c481de82 Reviewed-on: https://go-review.googlesource.com/c/go/+/630315 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-18 | weak: move internal/weak to weak, and update according to proposal | Michael Anthony Knyszek
The updates are: - API documentation changes. - Removal of the old package documentation discouraging linkname. - Addition of new package documentation with some advice. - Renaming of weak.Pointer.Strong -> weak.Pointer.Value. Fixes #67552. Change-Id: Ifad7e629b6d339dacaf2ca37b459d7f903e31bf8 Reviewed-on: https://go-review.googlesource.com/c/go/+/628455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-11-16 | runtime: implement Stop for AddCleanup | Carlos Amedee
This change adds the implementation for AddCleanup.Stop. It allows the caller to cancel the call to execute the cleanup. Cleanup will not be stopped if the cleanup has already been queued for execution. For #67535 Change-Id: I494b77d344e54d772c41489d172286773c3814e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/627975 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Carlos Amedee <carlos@golang.org>
2024-11-16 | runtime: implement AddCleanup | Carlos Amedee
This change introduces AddCleanup to the runtime package. AddCleanup attaches a cleanup function to a pointer to an object. The Stop method on Cleanups will be implemented in a followup CL. AddCleanup is intended to be an incremental improvement over SetFinalizer and will result in SetFinalizer being deprecated. For #67535 Change-Id: I99645152e3fdcee85fcf42a4f312c6917e8aecb1 Reviewed-on: https://go-review.googlesource.com/c/go/+/627695 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-11-13 | runtime: prevent weak->strong conversions during mark termination | Michael Anthony Knyszek
Currently it's possible for weak->strong conversions to create more GC work during mark termination. When a weak->strong conversion happens during the mark phase, we need to mark the newly-strong pointer, since it may now be the only pointer to that object. In other words, the object could be white. But queueing new white objects creates GC work, and if this happens during mark termination, we could end up violating mark termination invariants. In the parlance of the mark termination algorithm, the weak->strong conversion is a non-monotonic source of GC work, unlike the write barriers (which will eventually only see black objects). This change fixes the problem by forcing weak->strong conversions to block during mark termination. We can do this efficiently by setting a global flag before the ragged barrier that is checked at each weak->strong conversion. If the flag is set, then the conversions block. The ragged barrier ensures that all Ps have observed the flag and that any weak->strong conversions which completed before the ragged barrier have their newly-minted strong pointers visible in GC work queues if necessary. We later unset the flag and wake all the blocked goroutines during the mark termination STW. There are a few subtleties that we need to account for. For one, it's possible that a goroutine which blocked in a weak->strong conversion wakes up only to find it's mark termination time again, so we need to recheck the global flag on wake. We should also stay non-preemptible while performing the check, so that if the check *does* appear as true, it cannot switch back to false while we're actively trying to block. If it switches to false while we try to block, then we'll be stuck in the queue until the following GC. All-in-all, this CL is more complicated than I would have liked, but it's the only idea so far that is clearly correct to me at a high level. 
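The "block while a global flag is set, and recheck on every wakeup" pattern can be sketched in user-level Go with a condition variable. This is an analogy only: the `gate` type is hypothetical, and the mutex here stands in for the non-preemptible check the commit describes, whereas the runtime uses its own flag, ragged barrier, and wait queue.

```go
package main

import (
	"fmt"
	"sync"
)

// gate blocks callers of wait while blocked is true.
type gate struct {
	mu      sync.Mutex
	cond    *sync.Cond
	blocked bool
}

func newGate() *gate {
	g := &gate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

// set changes the flag; clearing it wakes every waiter.
func (g *gate) set(blocked bool) {
	g.mu.Lock()
	g.blocked = blocked
	g.mu.Unlock()
	if !blocked {
		g.cond.Broadcast()
	}
}

// wait returns only once the flag is observed clear. The loop rechecks
// the flag after each wakeup, because it may have been set again in the
// meantime.
func (g *gate) wait() {
	g.mu.Lock()
	for g.blocked {
		g.cond.Wait()
	}
	g.mu.Unlock()
}

func main() {
	g := newGate()
	g.set(true) // analogous to entering mark termination

	done := make(chan struct{})
	go func() {
		g.wait() // analogous to a weak->strong conversion
		close(done)
	}()

	g.set(false) // analogous to leaving mark termination
	<-done
	fmt.Println("conversion proceeded after the flag was cleared")
}
```

Because the flag is checked and the wait begins under the same lock, the flag cannot flip to false between the check and the block, which mirrors the missed-wakeup hazard the commit message calls out.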
This change adds a test which is somewhat invasive as it manipulates mark termination, but hopefully that infrastructure will be useful for debugging, fixing, and regression testing mark termination whenever we do fix it. Fixes #69803. Change-Id: Ie314e6fd357c9e2a07a9be21f217f75f7aba8c4a Reviewed-on: https://go-review.googlesource.com/c/go/+/623615 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-09-04internal/weak: shade pointer in weak-to-strong conversionMichael Anthony Knyszek
There's a bug in the weak-to-strong conversion in that creating the *only* strong pointer to some weakly-held object during the mark phase may result in that object not being properly marked. The exact mechanism for this is that the new strong pointer will always point to a white object (because it was only weakly referenced up until this point) and it can then be stored in a blackened stack, hiding it from the garbage collector. This "hide a white pointer in the stack" problem is pretty much exactly what the Yuasa part of the hybrid write barrier is trying to catch, so we need to do the same thing the write barrier would do: shade the pointer. Added a test and confirmed that it fails with high probability if the pointer shading is missing. Fixes #69210. Change-Id: Iaae64ae95ea7e975c2f2c3d4d1960e74e1bd1c3f Reviewed-on: https://go-review.googlesource.com/c/go/+/610396 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-07-25runtime: allow the tracer to be reentrantMichael Anthony Knyszek
This change allows the tracer to be reentrant by restructuring the internals such that writing an event is atomic with respect to stack growth. Essentially, a core set of functions that are involved in acquiring a trace buffer and writing to it are all marked nosplit. Stack growth is currently the only hidden place where the tracer may be accidentally reentrant, preventing the tracer from being used everywhere. It already lacks write barriers, lacks allocations, and is non-preemptible. This change thus makes the tracer fully reentrant, since the only reentry case it needs to handle is stack growth. Since the invariants needed to attain this are subtle, this change also extends the debugTraceReentrancy debug mode to check these invariants as well. Specifically, the invariants are checked by setting the throwsplit flag. A side benefit of this change is it simplifies the trace event writing API a good bit: there's no need to actually thread the event writer through things, and most callsites look a bit simpler. Change-Id: I7c329fb7a6cb936bd363c44cf882ea0a925132f3 Reviewed-on: https://go-review.googlesource.com/c/go/+/587599 Reviewed-by: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-07-23runtime,internal: move runtime/internal/sys to internal/runtime/sysDavid Chase
Cleanup and friction reduction. For #65355. Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511 Reviewed-on: https://go-review.googlesource.com/c/go/+/600436 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-05-22runtime: skip tracing events that would cause reentrancyMichael Anthony Knyszek
Some of the new experimental events added have a problem in that they might be emitted during stack growth. This is, to my knowledge, the only restriction on the tracer, because the tracer otherwise prevents preemption, avoids allocation, and avoids write barriers. However, the stack can grow from within the tracer. This leads to tracing-during-tracing which can result in lost buffers and broken event streams. (There's a debug mode to get a nice error message, but it's disabled by default.) This change resolves the problem by skipping writing out these new events. This results in the new events sometimes being broken (alloc without a free, free without an alloc) but for now that's OK. Before the freeze begins we just want to fix broken tests; tools interpreting these events will be totally in-house to begin with, and if they have to be a little bit smarter about missing information, that's OK. In the future we'll have a more robust fix for this, but it appears that it's going to require making the tracer fully reentrant. (This is not too hard; either we force flushing all buffers when going reentrant (which is actually somewhat subtle with respect to event ordering) or we isolate down just the actual event writing to be atomic with respect to stack growth. Both are just bigger changes on shared codepaths that are scary to land this late in the release cycle.) Fixes #67379. Change-Id: I46bb7e470e61c64ff54ac5aec5554b828c1ca4be Reviewed-on: https://go-review.googlesource.com/c/go/+/587597 Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-08runtime: delete pagetrace GOEXPERIMENTMichael Anthony Knyszek
The page tracer's functionality is now captured by the regular execution tracer as an experimental GODEBUG variable. This is a lot more usable and maintainable than the page tracer, which is likely to have bitrotted by this point. There's also no tooling available for the page tracer. Change-Id: I2408394555e01dde75a522e9a489b7e55cf12c8e Reviewed-on: https://go-review.googlesource.com/c/go/+/583379 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-08runtime: add traceallocfree GODEBUG for alloc/free events in tracesMichael Anthony Knyszek
This change adds expensive alloc/free events to traces, guarded by a GODEBUG that can be set at run time by mutating the GODEBUG environment variable. This supersedes the alloc/free trace deleted in a previous CL. There are two parts to this CL. The first part is adding a mechanism for exposing experimental events through the tracer and trace parser. This boils down to a new ExperimentalEvent event type in the parser API which simply reveals the raw event data for the event. Each experimental event can also be associated with "experimental data" which is associated with a particular generation. This experimental data is just exposed as a bag of bytes that supplements the experimental events. In the runtime, this CL organizes experimental events by experiment. An experiment is defined by a set of experimental events and a single special batch type. Batches of this special type are exposed through the parser's API as the aforementioned "experimental data". The second part of this CL is defining the AllocFree experiment, which defines 9 new experimental events covering heap object alloc/frees, span alloc/frees, and goroutine stack alloc/frees. It also generates special batches that contain a type table: a mapping of IDs to type information. Change-Id: I965c00e3dcfdf5570f365ff89d0f70d8aeca219c Reviewed-on: https://go-review.googlesource.com/c/go/+/583377 Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-18internal/weak: add package implementing weak pointersMichael Anthony Knyszek
This change adds the internal/weak package, which exposes GC-supported weak pointers to the standard library. This is for the upcoming weak package, but may be useful for other future constructs. For #62483. Change-Id: I4aa8fa9400110ad5ea022a43c094051699ccab9d Reviewed-on: https://go-review.googlesource.com/c/go/+/576297 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
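The internal/weak package cannot be imported from user code, so the sketch below uses the public weak package instead, on the assumption that a toolchain with it (Go 1.24 or later) is available. The node type and the observe function are hypothetical. It shows the basic contract: Value returns the original pointer while a strong reference exists, and may return nil once the object has been collected.

```go
package main

import (
	"fmt"
	"runtime"
	"weak"
)

// node is a hypothetical type for this illustration. It is larger than
// 16 bytes so that it gets its own allocation slot.
type node struct {
	id  int
	pad [64]byte
}

// observe reports whether the weak pointer resolved while a strong
// pointer was held, and whether it was cleared after a forced GC.
func observe() (aliveWhileHeld, clearedAfterGC bool) {
	strong := &node{id: 7}
	wp := weak.Make(strong)

	// Weak-to-strong conversion while the object is still reachable.
	aliveWhileHeld = wp.Value() == strong
	runtime.KeepAlive(strong)

	// Drop the only strong reference and ask for a collection.
	strong = nil
	runtime.GC()
	clearedAfterGC = wp.Value() == nil
	return
}

func main() {
	alive, cleared := observe()
	fmt.Println("alive while held:", alive)
	fmt.Println("cleared after GC:", cleared)
}
```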
2024-04-09runtime: remove the allocheaders GOEXPERIMENTMichael Anthony Knyszek
This change removes the allocheaders GOEXPERIMENT, deleting all the old code and merging mbitmap_allocheaders.go back into mbitmap.go. This change also deletes the SetType benchmarks, which were already broken under the new GOEXPERIMENT (they are harder to set up than before). We weren't really watching these benchmarks at all, and they don't provide additional test coverage. Change-Id: I135497201c3259087c5cd3722ed3fbe24791d25d Reviewed-on: https://go-review.googlesource.com/c/go/+/567200 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-03-25runtime: migrate internal/atomic to internal/runtimeAndy Pan
For #65355 Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be Reviewed-on: https://go-review.googlesource.com/c/go/+/560155 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com> Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2023-12-14runtime: delete todo of the list field for mspanqiulaidongfeng
Change-Id: I10a3308c19da08d2ff0c8077bb74ad888ee04fea GitHub-Last-Rev: 3e95b71384a25e0b29029731b72cf2c7f6a96055 GitHub-Pull-Request: golang/go#64077 Reviewed-on: https://go-review.googlesource.com/c/go/+/541755 Reviewed-by: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2023-11-09runtime: refactor runtime->tracer API to appear more like a lockMichael Anthony Knyszek
Currently the execution tracer synchronizes with itself using very heavyweight operations. As a result, it's totally fine for most of the tracer code to look like: if traceEnabled() { traceXXX(...) } However, if we want to make that synchronization more lightweight (as issue #60773 proposes), then this is insufficient. In particular, we need to make sure the tracer can't observe an inconsistency between g atomicstatus and the event that would be emitted for a particular g transition. This means making the g status change appear to happen atomically with the corresponding trace event being written out from the perspective of the tracer. This requires a change in API to something more like a lock. While we're here, we might as well make sure that trace events can *only* be emitted while this lock is held. This change introduces such an API: traceAcquire, which returns a value that can emit events, and traceRelease, which requires the value that was returned by traceAcquire. In practice, this won't be a real lock, it'll be more like a seqlock. For the current tracer, this API is completely overkill and the value returned by traceAcquire basically just checks trace.enabled. But it's necessary for the tracer described in #60773 and we can implement that more cleanly if we do this refactoring now instead of later. For #60773. Change-Id: Ibb9ff5958376339fafc2b5180aef65cf2ba18646 Reviewed-on: https://go-review.googlesource.com/c/go/+/515635 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-11-09runtime: implement experiment to replace heap bitmap with alloc headersMichael Anthony Knyszek
This change replaces the 1-bit-per-word heap bitmap for most size classes with allocation headers for objects that contain pointers. The header consists of a single pointer to a type. All allocations with headers are treated as implicitly containing one or more instances of the type in the header. As the name implies, headers are usually stored as the first word of an object. There are two additional exceptions to where headers are stored and how they're used. Objects smaller than 512 bytes do not have headers. Instead, a heap bitmap is reserved at the end of spans for objects of this size. A full word of overhead is too much for these small objects. The bitmap is in the same format as the old bitmap, minus the noMorePtrs bits, which are unnecessary. All the objects <512 bytes have a bitmap less than a pointer-word in size, and that was the granularity at which noMorePtrs could stop scanning early anyway. Objects that are larger than 32 KiB (which have their own span) have their headers stored directly in the span, to allow power-of-two-sized allocations to not spill over into an extra page. The full implementation is behind GOEXPERIMENT=allocheaders. The purpose of this change is performance. First and foremost, with headers we no longer have to unroll pointer/scalar data at allocation time for most size classes. Small size classes still need some unrolling, but their bitmaps are small so we can optimize that case fairly well. Larger objects effectively have their pointer/scalar data unrolled on-demand from type data, which is much more compactly represented and results in less TLB pressure. Furthermore, since the headers are usually right next to the object and where we're about to start scanning, we get an additional temporal locality benefit in the data cache when looking up type metadata. 
The pointer/scalar data is now effectively unrolled on-demand, but it's also simpler to unroll than before; that unrolled data is never written anywhere, and for arrays we get the benefit of retreading the same data per element, as opposed to looking it up from scratch for each pointer-word of bitmap. Lastly, because we no longer have a heap bitmap that spans the entire heap, there's a flat 1.5% memory use reduction. This is balanced slightly by some objects possibly being bumped up a size class, but most objects are not tightly optimized to size class sizes so there's some memory to spare, making the header basically free in those cases. See the follow-up CL which turns on this experiment by default for benchmark results. (CL 538217.) Change-Id: I4c9034ee200650d06d8bdecd579d5f7c1bbf1fc5 Reviewed-on: https://go-review.googlesource.com/c/go/+/437955 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-02runtime: split out pointer/scalar metadata from heapArenaMichael Anthony Knyszek
We're going to want to fork this data in the near future for a GOEXPERIMENT, so break it out now. Change-Id: Ia7ded850bb693c443fe439c6b7279dcac638512c Reviewed-on: https://go-review.googlesource.com/c/go/+/537978 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-10-02runtime: use smaller fields for mspan.freeindex and nelemsCherry Mui
mspan.freeindex and nelems can fit into uint16 for all possible values. Use uint16 instead of uintptr. Change-Id: Ifce20751e81d5022be1f6b5cbb5fbe4fd1728b1b Reviewed-on: https://go-review.googlesource.com/c/go/+/451359 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-09-18all: clean unnecessary castsJes Cok
Run 'unconvert -safe -apply' (https://github.com/mdempsky/unconvert) Change-Id: I24b7cd7d286cddce86431d8470d15c5f3f0d1106 GitHub-Last-Rev: 022e75384c08bb899a8951ba0daffa0f2e14d5a7 GitHub-Pull-Request: golang/go#62662 Reviewed-on: https://go-review.googlesource.com/c/go/+/528696 Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-09-13runtime: resolve false sharing for frequent memory allocate workloadsLi Gang
False sharing was observed inside the mheap struct, between arenas and the preceding variables. Pad mheap.arenas and the preceding variables to avoid it. This false sharing gets worse and hurts performance on multi-core systems under workloads that allocate memory frequently. While running MinIO on a 2-socket system (56 cores per socket) with GOGC=1000, we observed HITM>8% (perf c2c) on this cache line. After resolving the false sharing, performance improved by 17%. Improvement verified on MinIO: Server: https://github.com/minio/minio Client: https://github.com/minio/warp Config: single-node MinIO server with 6 ramdisks, TLS disabled, running warp GET requests with 128KB objects and a concurrency of 512. Fixes #62472 Signed-off-by: Li Gang<gang.g.li@intel.com> Change-Id: I9a4a3c97f5bc8cd014c627f92d59d9187ebaaab5 Reviewed-on: https://go-review.googlesource.com/c/go/+/525955 Reviewed-by: Heschi Kreinick <heschi@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-05-19runtime: improve Pinner with gcBitsSven Anderson
This change replaces the statically sized pinnerBits with gcBits-based ones that are copied in each GC cycle if they exist. The pinnerBits now include a second bit per object that indicates whether a pinner counter for multi-pins exists, in order to avoid unnecessary iterations over specials. This is a follow-up to CL 367296. Change-Id: I82e38cecd535e18c3b3ae54b5cc67d3aeeaafcfd Reviewed-on: https://go-review.googlesource.com/c/go/+/493275 Reviewed-by: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2023-05-19runtime: implement Pinner API for object pinningSven Anderson
Some C APIs require the use of structures that contain pointers to buffers (iovec, io_uring, ...). The pointer passing rules would require that these buffers be allocated in C memory, and processing this data with Go libraries would require copying it. In order to provide a zero-copy way to use these C APIs, this CL implements a Pinner API that allows pinning Go objects, which guarantees that the garbage collector does not move these objects while pinned. This allows relaxing the pointer passing rules so that pinned pointers can be stored in C-allocated memory or can be contained in Go memory that is passed to C functions. The Pin() method accepts pointers to objects of any type and unsafe.Pointer. Slices and arrays can be pinned by calling Pin() with the pointer to the first element. Pinning of maps is not supported. If the GC collects an unreachable Pinner that still holds pinned objects, it panics. If Pin() is called with any other non-pointer type, it panics as well. Performance considerations: This change has no impact on the execution time of existing code, because checks are only done in code paths that would otherwise panic. The memory footprint for existing code is one pointer per memory span. Fixes: #46787 Signed-off-by: Sven Anderson <sven@anderson.de> Change-Id: I110031fe789b92277ae45a9455624687bd1c54f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/367296 Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com>
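A minimal sketch of the Pin/Unpin lifecycle, assuming a toolchain that ships runtime.Pinner (Go 1.21 or later). It leaves out the cgo call itself so that it runs without a C toolchain; the comment marks where the pinned pointer would be handed to C. The helper names sumPinned and pinPanics are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// sumPinned pins the backing array of buf through a pointer to its
// first element, uses the buffer, and unpins it on return.
// buf must be non-empty.
func sumPinned(buf []byte) int {
	var p runtime.Pinner
	p.Pin(&buf[0])
	defer p.Unpin()

	// A cgo call that stores &buf[0] in C memory would go here.
	total := 0
	for _, b := range buf {
		total += int(b)
	}
	return total
}

// pinPanics reports whether Pin panics for the given value.
func pinPanics(v any) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	var p runtime.Pinner
	defer p.Unpin()
	p.Pin(v)
	return false
}

func main() {
	fmt.Println("sum:", sumPinned([]byte{1, 2, 3}))
	fmt.Println("Pin(42) panics:", pinPanics(42))
	fmt.Println("Pin(new(int)) panics:", pinPanics(new(int)))
}
```

Every Pin needs a matching Unpin on the same Pinner before the Pinner becomes unreachable, which is why both helpers defer Unpin immediately.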
2023-05-19runtime: add eager scavenging details to GODEBUG=scavtrace=1Michael Anthony Knyszek
Also, clean up atomics on released-per-cycle while we're here. For #57069. Change-Id: I14026e8281f01dea1e8c8de6aa8944712b7b24d9 Reviewed-on: https://go-review.googlesource.com/c/go/+/495916 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-11runtime: replace trace.enabled with traceEnabledMichael Anthony Knyszek
[git-generate] cd src/runtime grep -l 'trace\.enabled' *.go | grep -v "trace.go" | xargs sed -i 's/trace\.enabled/traceEnabled()/g' Change-Id: I14c7821c1134690b18c8abc0edd27abcdabcad72 Reviewed-on: https://go-review.googlesource.com/c/go/+/494181 Run-TryBot: Michael Knyszek <mknyszek@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-04-19runtime: manage huge pages explicitlyMichael Anthony Knyszek
This change makes it so that on Linux the Go runtime explicitly marks page heap memory as either available to be backed by hugepages or not using heuristics based on density. The motivation behind this change is twofold: 1. In default Linux configurations, khugepaged can recoalesce hugepages even after the scavenger breaks them up, resulting in significant overheads for small heaps when they shrink. 2. The Go runtime already has some heuristics about this, but those heuristics appear to have bit-rotted and result in haphazard hugepage management. Unlucky (but otherwise fairly dense) regions of memory end up not backed by huge pages while sparse regions end up accidentally marked MADV_HUGEPAGE and are not later broken up by the scavenger, because it already got the memory it needed from more dense sections (this is more likely to happen with small heaps that go idle). In this change, the runtime uses a new policy: 1. Mark all new memory MADV_HUGEPAGE. 2. Track whether each page chunk (4 MiB) became dense during the GC cycle. Mark those MADV_HUGEPAGE, and hide them from the scavenger. 3. If a chunk is not dense for 1 full GC cycle, make it visible to the scavenger. 4. The scavenger marks a chunk MADV_NOHUGEPAGE before it scavenges it. This policy is intended to try and back memory that is a good candidate for huge pages (high occupancy) with huge pages, and give memory that is not (low occupancy) to the scavenger. Occupancy is defined not just by occupancy at any instant of time, but also occupancy in the near future. It's generally true that by the end of a GC cycle the heap gets quite dense (from the perspective of the page allocator). Because we want scavenging and huge page management to happen together (the right time to MADV_NOHUGEPAGE is just before scavenging in order to break up huge pages and keep them that way) and the cost of applying MADV_HUGEPAGE and MADV_NOHUGEPAGE is somewhat high, the scavenger avoids releasing memory in dense page chunks. 
All this together means the scavenger will now more generally release memory on a ~1 GC cycle delay. Notably this has implications for scavenging to maintain the memory limit and the runtime/debug.FreeOSMemory API. This change makes it so that in these cases all memory is visible to the scavenger regardless of sparseness and delays the page allocator in re-marking this memory with MADV_NOHUGEPAGE for around 1 GC cycle to mitigate churn. The end result of this change should be little-to-no performance difference for dense heaps (MADV_HUGEPAGE works a lot like the default unmarked state) but should allow the scavenger to more effectively take back fragments of huge pages. The main risk here is churn, because MADV_HUGEPAGE usually forces the kernel to immediately back memory with a huge page. That's the reason for the large amount of hysteresis (1 full GC cycle) and why the definition of high density is 96% occupancy. Fixes #55328. Change-Id: I8da7998f1a31b498a9cc9bc662c1ae1a6bf64630 Reviewed-on: https://go-review.googlesource.com/c/go/+/436395 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
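The density tracking in the policy above can be sketched as a small state machine. This is a simplified model, not the runtime's implementation: the chunk type and its methods are hypothetical, the 8 KiB page size is an assumption (giving 512 pages per 4 MiB chunk), and the madvise calls are omitted. It only captures steps 2 and 3: a chunk that reaches 96% occupancy during a cycle is hidden from the scavenger, and a chunk that stays below that for a full cycle becomes visible to it.

```go
package main

import "fmt"

const (
	pagesPerChunk  = 512 // 4 MiB chunk / assumed 8 KiB pages
	densityPercent = 96  // threshold named in the commit message
)

// isDense reports whether inUse pages meet the occupancy threshold.
func isDense(inUse int) bool {
	return inUse*100 >= densityPercent*pagesPerChunk
}

// chunk is a toy model of per-chunk state.
type chunk struct {
	inUse              int
	denseThisCycle     bool
	visibleToScavenger bool
}

// alloc records n newly allocated pages and hides the chunk from the
// scavenger as soon as it becomes dense.
func (c *chunk) alloc(n int) {
	c.inUse += n
	if isDense(c.inUse) {
		c.denseThisCycle = true
		c.visibleToScavenger = false
	}
}

// free records n freed pages.
func (c *chunk) free(n int) { c.inUse -= n }

// endCycle is called once per GC cycle. A chunk that never became
// dense during the cycle is exposed to the scavenger.
func (c *chunk) endCycle() {
	if !c.denseThisCycle {
		c.visibleToScavenger = true
	}
	c.denseThisCycle = isDense(c.inUse)
}

func main() {
	var c chunk
	c.alloc(500)
	c.endCycle()
	fmt.Println("dense cycle, visible:", c.visibleToScavenger)
	c.free(400)
	c.endCycle() // was dense at the start of this cycle, still hidden
	c.endCycle() // a full cycle without density, now visible
	fmt.Println("sparse cycle, visible:", c.visibleToScavenger)
}
```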
2023-04-19runtime: disable huge pages for GC metadata for small heapsMichael Anthony Knyszek
For #55328. Change-Id: I8792161f09906c08d506cc0ace9d07e76ec6baa6 Reviewed-on: https://go-review.googlesource.com/c/go/+/460316 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-08runtime: correct typosOleksandr Redko
- Fix typo in throw error message for arena. - Correct typos in assembly and Go comments. - Fix log message in TestTraceCPUProfile. Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16 GitHub-Last-Rev: d27fad4d3cea81cc7a4ca6917985bcf5fa49b0e0 GitHub-Pull-Request: golang/go#58375 Reviewed-on: https://go-review.googlesource.com/c/go/+/465975 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-28runtime: remove go119MemoryLimitSupport flagKeith Randall
Change-Id: I207480d991c6242a1610795605c5ec6a3b3c59de Reviewed-on: https://go-review.googlesource.com/c/go/+/463225 Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18runtime: add page tracerMichael Knyszek
This change adds a new GODEBUG flag called pagetrace that writes a low-overhead trace of how pages of memory are managed by the Go runtime. The page tracer is kept behind a GOEXPERIMENT flag due to a potential security risk for setuid binaries. Change-Id: I6f4a2447d02693c25214400846a5d2832ad6e5c0 Reviewed-on: https://go-review.googlesource.com/c/go/+/444157 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>