path: root/src/runtime/mem_linux.go
Age  Commit message  Author

2025-10-31  runtime: avoid zeroing scavenged memory  Lance Yang

On Linux, memory returned to the kernel via MADV_DONTNEED is guaranteed to be zero-filled on its next use. This commit leverages this kernel behavior to avoid a redundant software zeroing pass in the runtime, improving performance.

Change-Id: Ia14343b447a2cec7af87644fe8050e23e983c787
GitHub-Last-Rev: 6c8df322836e70922c69ca3c5aac36e4b8a0839a
GitHub-Pull-Request: golang/go#76063
Reviewed-on: https://go-review.googlesource.com/c/go/+/715160
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
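
The kernel behavior this commit relies on can be observed from user space. The following is a minimal sketch, not the runtime's code: it assumes Linux and a private anonymous mapping, uses the standard `syscall` package, and the helper name `zeroAfterDontNeed` is made up for illustration.

```go
package main

import (
	"fmt"
	"syscall"
)

// zeroAfterDontNeed (hypothetical helper) dirties a private anonymous
// mapping, releases it with MADV_DONTNEED, and reports whether every byte
// reads back as zero afterwards.
func zeroAfterDontNeed(size int) (bool, error) {
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return false, err
	}
	defer syscall.Munmap(mem)

	// Dirty every byte so a later zero cannot be a leftover from mmap.
	for i := range mem {
		mem[i] = 0xff
	}
	if err := syscall.Madvise(mem, syscall.MADV_DONTNEED); err != nil {
		return false, err
	}
	// The next access faults in fresh zero-filled pages.
	for _, b := range mem {
		if b != 0 {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	ok, err := zeroAfterDontNeed(1 << 16)
	fmt.Println("zero-filled:", ok, "err:", err)
}
```

Because the kernel already hands back zeroed pages, a second zeroing pass in software over such a range would only repeat work.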

2025-03-04  runtime: decorate anonymous memory mappings  Lénaïc Huard

Leverage the prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) API to name the anonymous memory areas.

This API has been introduced in Linux 5.17 to decorate the anonymous memory areas shown in /proc/<pid>/maps.

This is already used by glibc. See:
* https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=27dfd1eb907f4615b70c70237c42c552bb4f26a8;hb=HEAD#l2434
* https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/setvmaname.c;h=ea93a5ffbebc9e5a7e32a297138f465724b4725f;hb=HEAD#l63

This can be useful when investigating the memory consumption of a multi-language program. On a 100% Go program, pprof profiler can be used to profile the memory consumption of the program. But pprof is only aware of what happens within the Go world. On a multi-language program, there could be a doubt about whether the suspicious extra-memory consumption comes from the Go part or the native part.

With this change, the following Go program:

    package main

    import (
        "fmt"
        "log"
        "os"
    )

    /*
    #include <stdlib.h>

    void f(void)
    {
      (void)malloc(1024*1024*1024);
    }
    */
    import "C"

    func main() {
        C.f()

        data, err := os.ReadFile("/proc/self/maps")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(data))
    }

produces this output:

    $ GLIBC_TUNABLES=glibc.mem.decorate_maps=1 ~/doc/devel/open-source/go/bin/go run .
    00400000-00402000 r--p 00000000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
    00402000-004a4000 r-xp 00002000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
    004a4000-00574000 r--p 000a4000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
    00574000-00575000 r--p 00173000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
    00575000-00580000 rw-p 00174000 00:21 28451768 /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
    00580000-005a4000 rw-p 00000000 00:00 0
    2e075000-2e096000 rw-p 00000000 00:00 0 [heap]
    c000000000-c000400000 rw-p 00000000 00:00 0 [anon: Go: heap]
    c000400000-c004000000 ---p 00000000 00:00 0 [anon: Go: heap reservation]
    777f40000000-777f40021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]
    777f40021000-777f44000000 ---p 00000000 00:00 0
    777f44000000-777f44021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]
    777f44021000-777f48000000 ---p 00000000 00:00 0
    777f48000000-777f48021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]
    777f48021000-777f4c000000 ---p 00000000 00:00 0
    777f4c000000-777f4c021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]
    777f4c021000-777f50000000 ---p 00000000 00:00 0
    777f50000000-777f50021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]
    777f50021000-777f54000000 ---p 00000000 00:00 0
    777f55afb000-777f55afc000 ---p 00000000 00:00 0
    777f55afc000-777f562fc000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216378]
    777f562fc000-777f562fd000 ---p 00000000 00:00 0
    777f562fd000-777f56afd000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216377]
    777f56afd000-777f56afe000 ---p 00000000 00:00 0
    777f56afe000-777f572fe000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216376]
    777f572fe000-777f572ff000 ---p 00000000 00:00 0
    777f572ff000-777f57aff000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216375]
    777f57aff000-777f57b00000 ---p 00000000 00:00 0
    777f57b00000-777f58300000 rw-p 00000000 00:00 0 [anon: glibc: pthread stack: 216374]
    777f58300000-777f58400000 rw-p 00000000 00:00 0 [anon: Go: page alloc index]
    777f58400000-777f5a400000 rw-p 00000000 00:00 0 [anon: Go: heap index]
    777f5a400000-777f6a580000 ---p 00000000 00:00 0 [anon: Go: scavenge index]
    777f6a580000-777f6a581000 rw-p 00000000 00:00 0 [anon: Go: scavenge index]
    777f6a581000-777f7a400000 ---p 00000000 00:00 0 [anon: Go: scavenge index]
    777f7a400000-777f8a580000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f8a580000-777f8a581000 rw-p 00000000 00:00 0 [anon: Go: page alloc]
    777f8a581000-777f9c430000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f9c430000-777f9c431000 rw-p 00000000 00:00 0 [anon: Go: page alloc]
    777f9c431000-777f9e806000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f9e806000-777f9e807000 rw-p 00000000 00:00 0 [anon: Go: page alloc]
    777f9e807000-777f9ec00000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f9ec36000-777f9ecb6000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata]
    777f9ecb6000-777f9ecc6000 rw-p 00000000 00:00 0 [anon: Go: gc bits]
    777f9ecc6000-777f9ecd6000 rw-p 00000000 00:00 0 [anon: Go: allspans array]
    777f9ecd6000-777f9ece7000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata]
    777f9ece7000-777f9ed67000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f9ed67000-777f9ed68000 rw-p 00000000 00:00 0 [anon: Go: page alloc]
    777f9ed68000-777f9ede7000 ---p 00000000 00:00 0 [anon: Go: page summary]
    777f9ede7000-777f9ee07000 rw-p 00000000 00:00 0 [anon: Go: page alloc]
    777f9ee07000-777f9ee0a000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc]
    777f9ee0a000-777f9ee2e000 r--p 00000000 00:21 48158213 /usr/lib/libc.so.6
    777f9ee2e000-777f9ef9f000 r-xp 00024000 00:21 48158213 /usr/lib/libc.so.6
    777f9ef9f000-777f9efee000 r--p 00195000 00:21 48158213 /usr/lib/libc.so.6
    777f9efee000-777f9eff2000 r--p 001e3000 00:21 48158213 /usr/lib/libc.so.6
    777f9eff2000-777f9eff4000 rw-p 001e7000 00:21 48158213 /usr/lib/libc.so.6
    777f9eff4000-777f9effc000 rw-p 00000000 00:00 0
    777f9effc000-777f9effe000 rw-p 00000000 00:00 0 [anon: glibc: loader malloc]
    777f9f00a000-777f9f04a000 rw-p 00000000 00:00 0 [anon: Go: immortal metadata]
    777f9f04a000-777f9f04c000 r--p 00000000 00:00 0 [vvar]
    777f9f04c000-777f9f04e000 r--p 00000000 00:00 0 [vvar_vclock]
    777f9f04e000-777f9f050000 r-xp 00000000 00:00 0 [vdso]
    777f9f050000-777f9f051000 r--p 00000000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2
    777f9f051000-777f9f07a000 r-xp 00001000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2
    777f9f07a000-777f9f085000 r--p 0002a000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2
    777f9f085000-777f9f087000 r--p 00034000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2
    777f9f087000-777f9f088000 rw-p 00036000 00:21 48158204 /usr/lib/ld-linux-x86-64.so.2
    777f9f088000-777f9f089000 rw-p 00000000 00:00 0
    7ffc7bfa7000-7ffc7bfc8000 rw-p 00000000 00:00 0 [stack]
    ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]

The anonymous memory areas are now labelled so that we can see which ones have been allocated by the Go runtime versus which ones have been allocated by the glibc.

Fixes #71546

Change-Id: I304e8b4dd7f2477a6da794fd44e9a7a5354e4bf4
Reviewed-on: https://go-review.googlesource.com/c/go/+/646095
Auto-Submit: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
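
Once the areas are labelled, attributing address space to the Go runtime or to glibc becomes a text-processing job. The sketch below is illustrative only: the helper names `anonOwner`, `mappingSize` and `bytesByOwner` are made up, and it assumes labels of the form `[anon: <owner>: <what>]` as in the output above. Note that it sums mapping sizes (reserved address space), not resident memory.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// anonOwner extracts the owner ("Go", "glibc", ...) from a maps line whose
// last column is a label such as "[anon: Go: heap]".
func anonOwner(line string) (string, bool) {
	line = strings.TrimSpace(line)
	i := strings.Index(line, "[anon: ")
	if i < 0 {
		return "", false
	}
	label := strings.TrimSuffix(line[i+len("[anon: "):], "]")
	owner, _, found := strings.Cut(label, ": ")
	return owner, found
}

// mappingSize parses the leading "start-end" hexadecimal range of a maps line.
func mappingSize(line string) (uint64, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return 0, fmt.Errorf("empty line")
	}
	lo, hi, ok := strings.Cut(fields[0], "-")
	if !ok {
		return 0, fmt.Errorf("no address range in %q", line)
	}
	start, err := strconv.ParseUint(lo, 16, 64)
	if err != nil {
		return 0, err
	}
	end, err := strconv.ParseUint(hi, 16, 64)
	if err != nil {
		return 0, err
	}
	return end - start, nil
}

// bytesByOwner sums the size of every labelled anonymous mapping per owner.
func bytesByOwner(maps string) map[string]uint64 {
	total := make(map[string]uint64)
	for _, line := range strings.Split(maps, "\n") {
		owner, ok := anonOwner(line)
		if !ok {
			continue
		}
		if size, err := mappingSize(line); err == nil {
			total[owner] += size
		}
	}
	return total
}

func main() {
	sample := "c000000000-c000400000 rw-p 00000000 00:00 0 [anon: Go: heap]\n" +
		"777f40000000-777f40021000 rw-p 00000000 00:00 0 [anon: glibc: malloc arena]\n" +
		"2e075000-2e096000 rw-p 00000000 00:00 0 [heap]\n"
	for owner, n := range bytesByOwner(sample) {
		fmt.Printf("%s: %d bytes\n", owner, n)
	}
}
```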

2024-03-25  runtime: migrate internal/atomic to internal/runtime  Andy Pan

For #65355

Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>

2024-03-19  runtime: optimize permission changes with mprotect  Lance Yang

On Linux, both mprotect() and mmap() acquire the mmap_lock (in writer mode), posing scalability challenges.

The mmap_lock (formerly called mmap_sem) is a reader/writer lock that controls access to a process's address space; before making changes there (mapping in a new range, for example), the kernel must acquire that lock.

Page-fault handling must also acquire mmap_lock (in reader mode) to ensure that the address space doesn't change in surprising ways while a fault is being resolved. A process can have a large address space and many threads running (and incurring page faults) concurrently, turning mmap_lock into a significant bottleneck.

While both mmap() and mprotect() are protected by the mmap_lock, the mprotect system call is simpler and runs for a shorter time, so it holds the mmap_lock for less time.

Change-Id: I7f929544904e31eab34d0d8a9e368abe4de64637
GitHub-Last-Rev: 6f27a216b4fb789181d00316561b44358a118b19
GitHub-Pull-Request: golang/go#65038
Reviewed-on: https://go-review.googlesource.com/c/go/+/554935
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

2023-12-05  runtime: add the disablethp GODEBUG setting  Michael Anthony Knyszek

Go 1.21.1 and Go 1.22 have ceased working around an issue with Linux kernel defaults for transparent huge pages that can result in excessive memory overheads. (https://bugzilla.kernel.org/show_bug.cgi?id=93111)

Many Linux distributions disable huge pages altogether these days, so this problem isn't quite as far-reaching as it used to be. Also, the problem only affects Go programs with very particular memory usage patterns.

That being said, because the runtime used to actively deal with this problem (but with some unpredictable behavior), it's preventing users that don't have a lot of control over their execution environment from upgrading to Go beyond Go 1.20.

This change adds a GODEBUG to smooth over the transition. The GODEBUG setting disables transparent huge pages for all heap memory on Linux, which is much more predictable than restoring the old behavior.

Fixes #64332.

Change-Id: I73b1894337f0f0b1a5a17b90da1221e118e0b145
Reviewed-on: https://go-review.googlesource.com/c/go/+/547475
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

2023-08-22  runtime: avoid MADV_HUGEPAGE for heap memory  Michael Anthony Knyszek

Currently the runtime marks all new memory as MADV_HUGEPAGE on Linux and manages its hugepage eligibility status. Unfortunately, the default THP behavior on most Linux distros is that MADV_HUGEPAGE blocks while the kernel eagerly reclaims and compacts memory to allocate a hugepage.

This direct reclaim and compaction is unbounded, and may result in significant application thread stalls. In really bad cases, this can exceed 100s of ms or even seconds.

Really all we want is to undo MADV_NOHUGEPAGE marks and let the default Linux paging behavior take over, but the only way to unmark a region as MADV_NOHUGEPAGE is to also mark it MADV_HUGEPAGE.

The overall strategy of trying to keep hugepages for the heap unbroken however is sound. So instead let's use the new shiny MADV_COLLAPSE if it exists.

MADV_COLLAPSE makes a best-effort synchronous attempt at collapsing the physical memory backing a memory region into a hugepage. We'll use MADV_COLLAPSE where we would've used MADV_HUGEPAGE, and stop using MADV_NOHUGEPAGE altogether.

Because MADV_COLLAPSE is synchronous, it's also important to not re-collapse huge pages if the huge pages are likely part of some large allocation. Although in many cases it's advantageous to back these allocations with hugepages because they're contiguous, eagerly collapsing every hugepage means having to page in at least part of the large allocation.

However, because we won't use MADV_NOHUGEPAGE anymore, we'll no longer handle the fact that khugepaged might come in and back some memory we returned to the OS with a hugepage. I've come to the conclusion that this is basically unavoidable without a new madvise flag and that it's just not a good default. If this change lands, advice about Linux huge page settings will be added to the GC guide.

Verified that this change doesn't regress Sweet, at least not on my machine with:

    /sys/kernel/mm/transparent_hugepage/enabled [always or madvise]
    /sys/kernel/mm/transparent_hugepage/defrag [madvise]
    /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none [0 or 511]

Unfortunately, this workaround means that we only get forced hugepages on Linux 6.1+.

Fixes #61718.

Change-Id: I7f4a7ba397847de29f800a99f9cb66cb2720a533
Reviewed-on: https://go-review.googlesource.com/c/go/+/516795
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
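
To see which of the settings listed above a machine is using, the sysfs files can be read directly. The `enabled` and `defrag` files list every option and put the active one in square brackets (for example `always [madvise] never`), while `max_ptes_none` holds a plain number. The sketch below is only an illustration; `activeTHPOption` is a made-up helper name, and the files may be absent on kernels built without THP support.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// activeTHPOption returns the bracketed choice from a sysfs value such as
// "always [madvise] never". It reports false if nothing is bracketed.
func activeTHPOption(contents string) (string, bool) {
	for _, f := range strings.Fields(contents) {
		if len(f) > 2 && strings.HasPrefix(f, "[") && strings.HasSuffix(f, "]") {
			return f[1 : len(f)-1], true
		}
	}
	return "", false
}

func main() {
	for _, name := range []string{"enabled", "defrag"} {
		path := "/sys/kernel/mm/transparent_hugepage/" + name
		data, err := os.ReadFile(path)
		if err != nil {
			fmt.Println(path, "unavailable:", err)
			continue
		}
		if opt, ok := activeTHPOption(string(data)); ok {
			fmt.Printf("%s = %s\n", path, opt)
		}
	}
}
```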

2023-05-23  runtime: fall back on mmap if madvise is unsupported  Lance Yang

Since Linux 3.18, support for madvise is optional, depending on the setting of the CONFIG_ADVISE_SYSCALLS configuration option.

The Go runtime currently assumes in several places that we do not unmap heap memory; that needs to remain true. So, if madvise is unsupported, we cannot fall back on munmap. AFAIK, the only way to free the pages is to remap the memory region.

For the x86, the system call mmap() is implemented by sys_mmap2() which calls do_mmap2() directly with the same parameters. The main call trace for mmap(v, n, PROT_READ|PROT_WRITE, MAP_ANON|MAP_FIXED|MAP_PRIVATE, -1, 0) is as follows:

```
do_mmap2()
\- do_mmap_pgoff()
   \- get_unmapped_area()
   \- find_vma_prepare()

   // If a VMA was found and it is part of the new mmaping, remove
   // the old mapping as the new one will cover both.
   // Unmap all the pages in the region to be unmapped.
   \- do_munmap()

   // Allocate a VMA from the slab allocator.
   \- kmem_cache_alloc()

   // Link in the new vm_area_struct.
   \- vma_link()
```

So, it's safe to fall back on mmap().

See D.2 https://www.kernel.org/doc/gorman/html/understand/understand021.html

Change-Id: Ia2b4234bc0bf8a4631a9926364598854618fe270
GitHub-Last-Rev: 179f04715442b44cd4b7bf3e6cae3dd9092128e7
GitHub-Pull-Request: golang/go#60218
Reviewed-on: https://go-review.googlesource.com/c/go/+/495081
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
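
The remapping described above can be tried from user space. This is a rough sketch under several assumptions, not the runtime's implementation: it targets 64-bit Linux (where `syscall.SYS_MMAP` takes the six arguments of mmap(2) directly), it issues the raw system call because `syscall.Mmap` does not accept a fixed address, and the helper names `remapFixed` and `remapZeroes` are invented for illustration.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// remapFixed replaces the pages backing mem with a fresh anonymous mapping at
// the same address, using the flags quoted in the commit message. The old
// pages are dropped without the range ever being unmapped.
func remapFixed(mem []byte) error {
	addr := uintptr(unsafe.Pointer(&mem[0]))
	_, _, errno := syscall.Syscall6(syscall.SYS_MMAP,
		addr, uintptr(len(mem)),
		uintptr(syscall.PROT_READ|syscall.PROT_WRITE),
		uintptr(syscall.MAP_ANON|syscall.MAP_FIXED|syscall.MAP_PRIVATE),
		^uintptr(0), // fd = -1
		0)
	if errno != 0 {
		return errno
	}
	return nil
}

// remapZeroes dirties a mapping, remaps it in place, and reports whether the
// range reads back as zero and is still writable.
func remapZeroes(size int) (bool, error) {
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return false, err
	}
	defer syscall.Munmap(mem)

	for i := range mem {
		mem[i] = 0xff
	}
	if err := remapFixed(mem); err != nil {
		return false, err
	}
	for _, b := range mem {
		if b != 0 {
			return false, nil
		}
	}
	mem[0] = 1 // the range must still be usable
	return mem[0] == 1, nil
}

func main() {
	ok, err := remapZeroes(1 << 16)
	fmt.Println("released by remap:", ok, "err:", err)
}
```

The address stays valid throughout, which is the property the commit message asks for: the heap range is never unmapped, only its backing pages are replaced.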

2023-04-26  runtime: add a alignment check  ioworker0

The Linux implementation requires that the address addr be page-aligned, and allows length to be zero.

See Linux notes: https://man7.org/linux/man-pages/man2/madvise.2.html

Change-Id: Ic49960c32991ef12f23de2de76e9689567c82d03
GitHub-Last-Rev: 35e7f8e5cc0b045043a88d9f304ef5bb1e9c1ab2
GitHub-Pull-Request: golang/go#59793
Reviewed-on: https://go-review.googlesource.com/c/go/+/488015
Auto-Submit: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>

2023-04-19  runtime: manage huge pages explicitly  Michael Anthony Knyszek

This change makes it so that on Linux the Go runtime explicitly marks page heap memory as either available to be backed by hugepages or not using heuristics based on density.

The motivation behind this change is twofold:
1. In default Linux configurations, khugepaged can recoalesce hugepages even after the scavenger breaks them up, resulting in significant overheads for small heaps when their heaps shrink.
2. The Go runtime already has some heuristics about this, but those heuristics appear to have bit-rotted and result in haphazard hugepage management. Unlucky (but otherwise fairly dense) regions of memory end up not backed by huge pages while sparse regions end up accidentally marked MADV_HUGEPAGE and are not later broken up by the scavenger, because it already got the memory it needed from more dense sections (this is more likely to happen with small heaps that go idle).

In this change, the runtime uses a new policy:
1. Mark all new memory MADV_HUGEPAGE.
2. Track whether each page chunk (4 MiB) became dense during the GC cycle. Mark those MADV_HUGEPAGE, and hide them from the scavenger.
3. If a chunk is not dense for 1 full GC cycle, make it visible to the scavenger.
4. The scavenger marks a chunk MADV_NOHUGEPAGE before it scavenges it.

This policy is intended to try and back memory that is a good candidate for huge pages (high occupancy) with huge pages, and give memory that is not (low occupancy) to the scavenger. Occupancy is defined not just by occupancy at any instant of time, but also occupancy in the near future. It's generally true that by the end of a GC cycle the heap gets quite dense (from the perspective of the page allocator).

Because we want scavenging and huge page management to happen together (the right time to MADV_NOHUGEPAGE is just before scavenging in order to break up huge pages and keep them that way) and the cost of applying MADV_HUGEPAGE and MADV_NOHUGEPAGE is somewhat high, the scavenger avoids releasing memory in dense page chunks. All this together means the scavenger will now more generally release memory on a ~1 GC cycle delay.

Notably this has implications for scavenging to maintain the memory limit and the runtime/debug.FreeOSMemory API. This change makes it so that in these cases all memory is visible to the scavenger regardless of sparseness and delays the page allocator in re-marking this memory with MADV_NOHUGEPAGE for around 1 GC cycle to mitigate churn.

The end result of this change should be little-to-no performance difference for dense heaps (MADV_HUGEPAGE works a lot like the default unmarked state) but should allow the scavenger to more effectively take back fragments of huge pages. The main risk here is churn, because MADV_HUGEPAGE usually forces the kernel to immediately back memory with a huge page. That's the reason for the large amount of hysteresis (1 full GC cycle) and why the definition of high density is 96% occupancy.

Fixes #55328.

Change-Id: I8da7998f1a31b498a9cc9bc662c1ae1a6bf64630
Reviewed-on: https://go-review.googlesource.com/c/go/+/436395
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
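
The density tracking and the one-cycle hysteresis in the policy above can be modelled as a small state machine. The sketch below is a simplified illustration, not the runtime's data structure: the `chunk` type and its methods are hypothetical, the figure of 512 pages per chunk assumes 8 KiB runtime pages in a 4 MiB chunk, and the madvise calls themselves are left out.

```go
package main

import "fmt"

const (
	pagesPerChunk = 512 // assumption: 4 MiB chunk / 8 KiB runtime page
	densePercent  = 96  // "high density" threshold from the commit message
)

// chunk is a hypothetical stand-in for per-chunk bookkeeping.
type chunk struct {
	inUse          int  // pages currently allocated
	denseThisCycle bool // reached the threshold during the current GC cycle
	scavengable    bool // visible to the scavenger
}

func (c *chunk) isDense() bool {
	return c.inUse*100 >= densePercent*pagesPerChunk
}

// alloc records an allocation; a chunk that becomes dense is hidden from the
// scavenger (step 2 of the policy).
func (c *chunk) alloc(pages int) {
	c.inUse += pages
	if c.isDense() {
		c.denseThisCycle = true
		c.scavengable = false
	}
}

func (c *chunk) free(pages int) { c.inUse -= pages }

// endGCCycle exposes the chunk to the scavenger only if it never became
// dense during the cycle that just ended (step 3), then starts a new cycle.
func (c *chunk) endGCCycle() {
	if !c.denseThisCycle {
		c.scavengable = true
	}
	c.denseThisCycle = c.isDense()
}

func main() {
	c := &chunk{}
	c.alloc(500) // above 96% of 512 pages
	c.free(450)  // the chunk goes sparse within the same cycle
	c.endGCCycle()
	fmt.Println("after cycle 1, scavengable:", c.scavengable)
	c.endGCCycle()
	fmt.Println("after cycle 2, scavengable:", c.scavengable)
}
```

A chunk that was dense at any point in a cycle stays hidden until a whole further cycle passes without it reaching the threshold, which is where the roughly one-cycle delay in releasing memory comes from.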

2023-04-05  runtime: add sysNoHugePage  Michael Anthony Knyszek

Change-Id: Icccafb896de838256a2ec7c3f385e6cbb2b415fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/447360
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-04-05  all: separate doc comment from //go: directives  Russ Cox

A future change to gofmt will rewrite

    // Doc comment.
    //go:foo

to

    // Doc comment.
    //
    //go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-03-31  runtime: add wrappers for sys* functions and consolidate docs  Michael Anthony Knyszek

This change lifts all non-platform-specific code out of sys* functions for each platform up into wrappers, and moves documentation about the OS virtual memory abstraction layer from malloc.go to mem.go, which contains those wrappers.

Change-Id: Ie803e4447403eaafc508b34b53a1a47d6cee9388
Reviewed-on: https://go-review.googlesource.com/c/go/+/393398
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>

2022-01-19  runtime: print error if mmap fails  Ian Lance Taylor

Fixes #49687

Change-Id: Ife7f64f4c98449eaff7327e09bc1fb67acee72c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/379354
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

2021-11-05  runtime: add harddecommit GODEBUG flag  Michael Anthony Knyszek

This change adds a new debug flag that makes the runtime map pages PROT_NONE in sysUnused on Linux, in addition to the usual madvise calls. This behavior mimics the behavior of decommit on Windows, and is helpful in debugging the scavenger. sysUsed is also updated to re-map the pages as PROT_READ|PROT_WRITE, mimicking Windows' explicit commit behavior.

Change-Id: Iaac5fcd0e6920bd1d0e753dd4e7f0c0b128fe842
Reviewed-on: https://go-review.googlesource.com/c/go/+/356612
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
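
A decommit/commit cycle of this kind can be imitated in user space. The sketch below is an approximation under stated assumptions: it changes permissions with mprotect rather than re-mapping the pages as the commit message describes, it assumes Linux and the standard `syscall` package, and the names `hardDecommit`, `recommit` and `decommitCycle` are made up.

```go
package main

import (
	"fmt"
	"syscall"
)

// hardDecommit releases the pages and then makes the range inaccessible, so
// that any stray access faults instead of silently reading zeroes.
func hardDecommit(mem []byte) error {
	if err := syscall.Madvise(mem, syscall.MADV_DONTNEED); err != nil {
		return err
	}
	return syscall.Mprotect(mem, syscall.PROT_NONE)
}

// recommit makes the range readable and writable again.
func recommit(mem []byte) error {
	return syscall.Mprotect(mem, syscall.PROT_READ|syscall.PROT_WRITE)
}

// decommitCycle runs one decommit/commit round trip and returns the first
// byte observed after recommitting.
func decommitCycle(size int) (byte, error) {
	mem, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return 0, err
	}
	defer syscall.Munmap(mem)

	mem[0] = 0xff
	if err := hardDecommit(mem); err != nil {
		return 0, err
	}
	// Touching mem here would raise SIGSEGV; that is the point of the flag.
	if err := recommit(mem); err != nil {
		return 0, err
	}
	first := mem[0]
	mem[0] = 1 // the range is writable again
	return first, nil
}

func main() {
	first, err := decommitCycle(1 << 16)
	fmt.Println("first byte after recommit:", first, "err:", err)
}
```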

2020-10-26  runtime: delineate which memstats are system stats with a type  Michael Anthony Knyszek

This change modifies the type of several mstats fields to be a new type: sysMemStat. This type has the same structure as the fields used to have.

The purpose of this change is to make it very clear which stats may be used in various functions for accounting (usually the platform-specific sys* functions, but there are others). Currently there's an implicit understanding that the *uint64 value passed to these functions is some kind of statistic whose value is atomically managed. This understanding isn't inherently problematic, but we're about to change how some stats (which currently use mSysStatInc and mSysStatDec) work, so we want to make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a method on sysMemStat. Note that those two functions were originally written the way they were because atomic 64-bit adds required a valid G on ARM, but this hasn't been the case for a very long time (since golang.org/cl/14204, but even before then it wasn't clear if mutexes required a valid G anymore). Today we implement 64-bit adds on ARM with a spinlock table.

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
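
The idea of a dedicated statistic type can be illustrated outside the runtime. This is a sketch only: it uses the public `sync/atomic` package rather than the runtime's internal atomics, it omits any overflow checking, and the method set shown (`add`, `load`) and the `accountMap` function are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sysMemStat is a counter that may only be updated atomically. Giving it its
// own type means a plain *uint64 can no longer be passed by accident.
type sysMemStat uint64

// load returns the current value.
func (s *sysMemStat) load() uint64 {
	return atomic.LoadUint64((*uint64)(s))
}

// add adjusts the counter by n, which may be negative.
func (s *sysMemStat) add(n int64) {
	atomic.AddUint64((*uint64)(s), uint64(n))
}

// accountMap stands in for a sys*-style function that takes the statistic to
// charge as an explicit, typed parameter.
func accountMap(n uintptr, stat *sysMemStat) {
	stat.add(int64(n))
}

func main() {
	var heapSys sysMemStat
	accountMap(4096, &heapSys)
	heapSys.add(-1024)
	fmt.Println(heapSys.load())
}
```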

2019-11-04  runtime: clean up power-of-two rounding code with align functions  Michael Anthony Knyszek

This change renames the "round" function to the more appropriately named "alignUp" which rounds an integer up to the next multiple of a power of two.

This change also adds the alignDown function, which is almost like alignUp but rounds down to the previous multiple of a power of two.

With these two functions, we also replace manual rounding code with them where we can.

Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec
Reviewed-on: https://go-review.googlesource.com/c/go/+/190618
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
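
Rounding to a power-of-two multiple needs only a mask. The following is a minimal sketch of the two helpers as described; it assumes `a` is a nonzero power of two and does not guard against overflow near the top of the address space.

```go
package main

import "fmt"

// alignUp rounds n up to the next multiple of a; a must be a power of two.
func alignUp(n, a uintptr) uintptr {
	return (n + a - 1) &^ (a - 1)
}

// alignDown rounds n down to the previous multiple of a; a must be a power
// of two.
func alignDown(n, a uintptr) uintptr {
	return n &^ (a - 1)
}

func main() {
	const page = 4096
	fmt.Println(alignUp(4097, page), alignDown(8191, page))
}
```

Values that are already aligned are returned unchanged by both functions.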

2019-07-30  runtime: add physHugePageShift  Michael Anthony Knyszek

This change adds physHugePageShift which is defined such that 1 << physHugePageShift == physHugePageSize. The purpose of this variable is to avoid doing expensive divisions in key functions, such as (*mspan).hugePages.

This change also does a sweep of any place we might do a division or mod operation with physHugePageSize and turns it into bit shifts and other bitwise operations.

Finally, this change adds a check to mallocinit which ensures that physHugePageSize is always a power of two. osinit might choose to ignore non-powers-of-two for the value and replace it with zero, but mallocinit will fail if it's not a power of two (or zero). It also derives physHugePageShift from physHugePageSize.

This change helps improve the performance of most applications because of how often (*mspan).hugePages is called.

Updates #32828.

Change-Id: I1a6db113d52d563f59ae8fd4f0e130858859e68f
Reviewed-on: https://go-review.googlesource.com/c/go/+/186598
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
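
Deriving the shift and replacing division and modulo with bit operations might look as follows. This is an illustrative sketch: `hugePageShift`, `hugePageIndex` and `hugePageOffset` are hypothetical names, and unlike the runtime check described above, this helper simply reports failure for a size of zero.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hugePageShift returns s such that 1<<s == size. It reports false when size
// is zero or not a power of two.
func hugePageShift(size uintptr) (uint, bool) {
	if size == 0 || size&(size-1) != 0 {
		return 0, false
	}
	return uint(bits.TrailingZeros64(uint64(size))), true
}

// hugePageIndex is addr / size, computed with a shift.
func hugePageIndex(addr uintptr, shift uint) uintptr {
	return addr >> shift
}

// hugePageOffset is addr % size, computed with a mask.
func hugePageOffset(addr, size uintptr) uintptr {
	return addr & (size - 1)
}

func main() {
	const size = 2 << 20 // a 2 MiB huge page, typical on x86-64
	shift, ok := hugePageShift(size)
	addr := uintptr(5<<20 + 123)
	fmt.Println(shift, ok, hugePageIndex(addr, shift), hugePageOffset(addr, size))
}
```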

2019-05-06  runtime: ensure free and unscavenged spans may be backed by huge pages  Michael Anthony Knyszek

This change adds a new sysHugePage function to provide the equivalent of Linux's madvise(MADV_HUGEPAGE) support to the runtime. It then uses sysHugePage to mark a newly-coalesced free span as backable by huge pages to make the freeHugePages approximation a bit more accurate.

The problem being solved here is that if a large free span is composed of many small spans which were coalesced together, then there's a chance that they have had madvise(MADV_NOHUGEPAGE) called on them at some point, which makes freeHugePages less accurate.

For #30333.

Change-Id: Idd4b02567619fc8d45647d9abd18da42f96f0522
Reviewed-on: https://go-review.googlesource.com/c/go/+/173338
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

2019-05-03  runtime: remove sys.HugePageSize  Michael Anthony Knyszek

sys.HugePageSize was superseded in the last commit by physHugePageSize which is determined dynamically by querying the operating system.

For #30333.

Change-Id: I827bfca8bdb347e989cead31564a8fffe56c66ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/173757
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

2019-01-02  runtime: add GODEBUG=madvdontneed=1  Brad Fitzpatrick

Fixes #28466

Change-Id: I05b2e0da09394d111913963b60f2ec865c9b4744
Reviewed-on: https://go-review.googlesource.com/c/155931
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>

2018-09-18  runtime: use MADV_FREE on Linux if available  Tobias Klauser

On Linux, sysUnused currently uses madvise(MADV_DONTNEED) to signal the kernel that a range of allocated memory contains unneeded data. After a successful call, the range (but not the data it contained before the call to madvise) is still available but the first access to that range will unconditionally incur a page fault (needed to 0-fill the range).

A faster alternative is MADV_FREE, available since Linux 4.5. The mechanism is very similar, but the page fault will only be incurred if the kernel, between the call to madvise and the first access, decides to reuse that memory for something else.

In sysUnused, test whether MADV_FREE is supported and fall back to MADV_DONTNEED in case it isn't. This requires making the return value of the madvise syscall available to the caller, so change runtime.madvise to return it.

Fixes #23687

Change-Id: I962c3429000dd9f4a00846461ad128b71201bb04
Reviewed-on: https://go-review.googlesource.com/135395
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

2018-02-15  runtime: remove non-reserved heap logic  Austin Clements

Currently large sysReserve calls on some OSes don't actually reserve the memory, but just check that it can be reserved. This was important when we called sysReserve to "reserve" many gigabytes for the heap up front, but now that we map memory in small increments as we need it, this complication is no longer necessary.

This has one curious side benefit: currently, on Linux, allocations that are large enough to be rejected by mmap wind up freezing the application for a long time before it panics. This happens because sysReserve doesn't reserve the memory, so sysMap calls mmap_fixed, which calls mmap, which fails because the mapping is too large. However, mmap_fixed doesn't inspect *why* mmap fails, so it falls back to probing every page in the desired region individually with mincore before performing an (otherwise dangerous) MAP_FIXED mapping, which will also fail. This takes a long time for a large region. Now this logic is gone, so the mmap failure leads to an immediate panic.

Updates #10460.

Change-Id: I8efe88c611871cdb14f99fadd09db83e0161ca2e
Reviewed-on: https://go-review.googlesource.com/85888
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>

2017-10-18  runtime: separate error result for mmap  Austin Clements

Currently mmap returns an unsafe.Pointer that encodes OS errors as values less than 4096. In practice this is okay, but it borders on being really unsafe: for example, the value has to be checked immediately after return and if stack copying were ever to observe such a value, it would panic. It's also not remotely idiomatic.

Fix this by making mmap return a separate pointer value and error, like a normal Go function.

Updates #22218.

Change-Id: Iefd965095ffc82cc91118872753a5d39d785c3a6
Reviewed-on: https://go-review.googlesource.com/71270
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
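
The difference between the two conventions is easy to show with plain integers. The sketch below is hypothetical throughout: `legacyMmap` and `splitResult` are stand-ins invented for the example, addresses are modelled as `uintptr` values rather than real pointers, and 12 is used as a sample error number (ENOMEM on Linux).

```go
package main

import "fmt"

// legacyMmap models the old convention: a single return value that is either
// an address or, if below 4096, an OS error number.
func legacyMmap(fail bool) uintptr {
	if fail {
		return 12 // sample errno
	}
	return 0x10000 // sample address
}

// splitResult converts the single overloaded value into the separate
// (address, errno) pair that the commit introduces.
func splitResult(ret uintptr) (addr uintptr, errno int) {
	if ret < 4096 {
		return 0, int(ret)
	}
	return ret, 0
}

func main() {
	for _, fail := range []bool{false, true} {
		addr, errno := splitResult(legacyMmap(fail))
		if errno != 0 {
			fmt.Println("mmap failed, errno", errno)
			continue
		}
		fmt.Printf("mapped at %#x\n", addr)
	}
}
```

With separate results, a caller can never mistake an error number for an address, and nothing pointer-typed ever holds a small integer.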

2016-09-06  runtime: don't hard-code physical page size  Austin Clements

Now that the runtime fetches the true physical page size from the OS, make the physical page size used by heap growth a variable instead of a constant. This isn't used in any performance-critical paths, so it shouldn't be an issue.

sys.PhysPageSize is also renamed to sys.DefaultPhysPageSize to make it clear that it's not necessarily the true page size. There are no uses of this constant any more, but we'll keep it around for now.

Updates #12480 and #10180.

Change-Id: I6c23b9df860db309c38c8287a703c53817754f03
Reviewed-on: https://go-review.googlesource.com/25022
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>

2016-07-20  runtime: support smaller physical pages than PhysPageSize  Austin Clements

Most operations need an upper bound on the physical page size, which is what sys.PhysPageSize is for (this is checked at runtime init on Linux). However, a few operations need a *lower* bound on the physical page size. Introduce a "minPhysPageSize" constant to act as this lower bound and use it where it makes sense:

1) In addrspace_free, we have to query each page in the given range. Currently we increment by the upper bound on the physical page size, which means we may skip over pages if the true size is smaller. Worse, we currently pass a result buffer that only has enough room for one page. If there are actually multiple pages in the range passed to mincore, the kernel will overflow this buffer. Fix these problems by incrementing by the lower-bound on the physical page size and by passing "1" for the length, which the kernel will round up to the true physical page size.

2) In the write barrier, the bad pointer check tests for pointers to the first physical page, which are presumably small integers masquerading as pointers. However, if physical pages are smaller than we think, we may have legitimate pointers below sys.PhysPageSize. Hence, use minPhysPageSize for this test since pointers should never fall below that.

In particular, this applies to ARM64 and MIPS. The runtime is configured to use 64kB pages on ARM64, but by default Linux uses 4kB pages. Similarly, the runtime assumes 16kB pages on MIPS, but both 4kB and 16kB kernel configurations are common. This also applies to ARM on systems where the runtime is recompiled to deal with a larger page size. It is also a step toward making the runtime use only a dynamically-queried page size.

Change-Id: I1fdfd18f6e7cbca170cc100354b9faa22fde8a69
Reviewed-on: https://go-review.googlesource.com/25020
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
2016-04-16runtime: check that sysUnused is always physical-page alignedAustin Clements
If sysUnused is passed an address or length that is not aligned to the physical page boundary, the kernel will unmap more memory than the caller wanted. Add a check for this. For #9993. Change-Id: I68ff03032e7b65cf0a853fe706ce21dc7f2aaaf8 Reviewed-on: https://go-review.googlesource.com/22065 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
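A minimal sketch of the kind of alignment check described above. The function name `checkAligned` and the error-returning style are assumptions made for illustration; the commit message only says that a check was added, not what form it takes.

```go
package main

import "fmt"

// checkAligned reports an error if addr or n is not a multiple of
// physPageSize. physPageSize must be a nonzero power of two.
// Hypothetical helper: this sketch returns an error so the check can be
// exercised in ordinary code.
func checkAligned(addr, n, physPageSize uintptr) error {
	if physPageSize == 0 || physPageSize&(physPageSize-1) != 0 {
		return fmt.Errorf("page size %#x is not a power of two", physPageSize)
	}
	// OR-ing addr and n lets one mask test cover both values.
	if (addr|n)&(physPageSize-1) != 0 {
		return fmt.Errorf("unaligned range: addr=%#x len=%#x page=%#x", addr, n, physPageSize)
	}
	return nil
}

func main() {
	fmt.Println(checkAligned(0x2000, 0x1000, 4096)) // aligned address and length
	fmt.Println(checkAligned(0x2100, 0x1000, 4096)) // misaligned address
}
```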
2016-03-02all: single space after period.Brad Fitzpatrick
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedent. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-01all: make copyright headers consistent with one space after periodBrad Fitzpatrick
This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable. Go policy has been single space after periods in comments for some time. The copyright header template at: https://golang.org/doc/contribute.html#copyright also uses a single space. Make them all consistent. Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0 Reviewed-on: https://go-review.googlesource.com/20111 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-12runtime: break out system-specific constants into package sysMichael Matloob
runtime/internal/sys will hold system-, architecture- and config-specific constants. Updates #11647 Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54 Reviewed-on: https://go-review.googlesource.com/16817 Reviewed-by: Russ Cox <rsc@golang.org>
2015-10-02runtime: adjust huge page flags only on huge page granularityAustin Clements
This fixes an issue where the runtime panics with "out of memory" or "cannot allocate memory" even though there's ample memory by reducing the number of memory mappings created by the memory allocator. Commit 7e1b61c worked around issue #8832 where Linux's transparent huge page support could dramatically increase the RSS of a Go process by setting the MADV_NOHUGEPAGE flag on any regions of pages released to the OS with MADV_DONTNEED. This had the side effect of also increasing the number of VMAs (memory mappings) in a Go address space because a separate VMA is needed for every region of the virtual address space with different flags. Unfortunately, by default, Linux limits the number of VMAs in an address space to 65530, and a large heap can quickly reach this limit when the runtime starts scavenging memory. This commit dramatically reduces the number of VMAs. It does this primarily by only adjusting the huge page flag at huge page granularity. With this change, on amd64, even a pessimal heap that alternates between MADV_NOHUGEPAGE and MADV_HUGEPAGE must reach 128GB to reach the VMA limit. Because of this rounding to huge page granularity, this change is also careful to leave large used and unused regions huge page-enabled. This change reduces the maximum number of VMAs during the runtime benchmarks with GODEBUG=scavenge=1 from 692 to 49. Fixes #12233. Change-Id: Ic397776d042f20d53783a1cacf122e2e2db00584 Reviewed-on: https://go-review.googlesource.com/15191 Reviewed-by: Keith Randall <khr@golang.org>
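The 128GB figure above follows from simple arithmetic, sketched here. The 2 MiB huge page size is an assumption (the usual transparent huge page size on amd64; the commit message does not state it), and `heapAtVMALimit` is a hypothetical helper name.

```go
package main

import "fmt"

const (
	maxVMAs      = 65530   // default Linux per-address-space VMA limit cited above
	smallPage    = 4 << 10 // 4 KiB base page
	hugePageSize = 2 << 20 // assumed amd64 huge page size (2 MiB)
)

// heapAtVMALimit returns the heap size, in bytes, at which a pessimal heap
// that changes flags at every boundary of the given granularity would need
// maxVMAs separate mappings.
func heapAtVMALimit(maxVMAs, granularity uint64) uint64 {
	return maxVMAs * granularity
}

func main() {
	perPage := heapAtVMALimit(maxVMAs, smallPage)
	perHuge := heapAtVMALimit(maxVMAs, hugePageSize)

	fmt.Printf("flags per 4 KiB page: limit reached at %d bytes (%.2f MiB)\n",
		perPage, float64(perPage)/(1<<20))
	fmt.Printf("flags per 2 MiB page: limit reached at %d bytes (%.2f GiB)\n",
		perHuge, float64(perHuge)/(1<<30))
}
```

Under these assumptions, adjusting flags at base-page granularity could exhaust the VMA limit with a heap of roughly a quarter of a gigabyte, while huge page granularity pushes that point to roughly 128GB.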
2015-04-24runtime: implement xadduintptr and update system mstats using itSrdjan Petrovic
The motivation is that sysAlloc/Free() currently aren't safe to be called without a valid G, because arm's xadd64() uses locks that require a valid G. The solution here was proposed by Dmitry Vyukov: use xadduintptr() instead of xadd64(), until arm can support xadd64 on all of its architectures (not a trivial task for arm). Change-Id: I250252079357ea2e4360e1235958b1c22051498f Reviewed-on: https://go-review.googlesource.com/9002 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
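As a loose user-level analogue of the approach described above, the sketch below keeps a memory statistic in a pointer-sized word and updates it atomically. It uses `sync/atomic.AddUintptr` from the standard library, not the runtime-internal `xadduintptr`; the names `statAdd` and `statSub` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// statAdd atomically adds n bytes to a pointer-sized counter.
func statAdd(stat *uintptr, n uintptr) {
	atomic.AddUintptr(stat, n)
}

// statSub atomically subtracts n bytes. Adding ^(n-1) is the two's-complement
// way to subtract n from an unsigned value.
func statSub(stat *uintptr, n uintptr) {
	atomic.AddUintptr(stat, ^(n - 1))
}

func main() {
	var sysBytes uintptr // hypothetical counter of bytes obtained from the OS

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			statAdd(&sysBytes, 4096) // simulate an allocation
			statSub(&sysBytes, 1024) // simulate a partial free
		}()
	}
	wg.Wait()

	fmt.Println(atomic.LoadUintptr(&sysBytes))
}
```

A pointer-sized counter can always be updated with the platform's native word-sized atomic, which is what makes it usable where a 64-bit atomic would need a lock on some 32-bit targets.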
2015-02-25runtime: fix build, divide by constant 0 is a compile-time errorKeith Randall
Change-Id: Iee319c9f5375c172fb599da77234c10ccb0fd314 Reviewed-on: https://go-review.googlesource.com/6020 Reviewed-by: Keith Randall <khr@golang.org>
2015-02-25runtime: mark pages we return to kernel as NOHUGEPAGEKeith Randall
We return memory to the kernel with madvise(..., DONTNEED). Also mark returned memory with NOHUGEPAGE to keep the kernel from merging this memory into a huge page, effectively reallocating it. Only known to be a problem on linux/{386,amd64,amd64p32} at the moment. It may come up on other os/arch combinations in the future. Fixes #8832 Change-Id: Ifffc6627a0296926e3f189a8a9b6e4bdb54c79eb Reviewed-on: https://go-review.googlesource.com/5660 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-02-12runtime: remove obsolete SELinux execmem commentBrad Fitzpatrick
We don't have executable memory anymore. Change-Id: I9835f03a7bcd97d809841ecbed8718b3048bfb32 Reviewed-on: https://go-review.googlesource.com/4681 Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Dave Cheney <dave@cheney.net>
2014-12-28runtime: rename gothrow to throwKeith Randall
Rename "gothrow" to "throw" now that the C version of "throw" is no longer needed. This change is purely mechanical except in panic.go where the old version of "throw" has been deleted. sed -i "" 's/[[:<:]]gothrow[[:>:]]/throw/g' runtime/*.go Change-Id: Icf0752299c35958b92870a97111c67bcd9159dc3 Reviewed-on: https://go-review.googlesource.com/2150 Reviewed-by: Minux Ma <minux@golang.org> Reviewed-by: Dave Cheney <dave@cheney.net>
2014-12-23runtime: correct ptrSize test in Linux version of sysReserveIan Lance Taylor
Change-Id: I90a8ca51269528a307e0d6f52436fc7913cd7900 Reviewed-on: https://go-review.googlesource.com/1541 Reviewed-by: Russ Cox <rsc@golang.org>
2014-11-14[dev.cc] all: merge dev.power64 (7667e41f3ced) into dev.ccRuss Cox
This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes. These are the files that had merge conflicts and have been edited by hand: malloc.go mem_linux.go mgc.go os1_linux.go proc1.go panic1.go runtime1.go LGTM=austin R=austin CC=golang-codereviews https://golang.org/cl/174180043
2014-11-11[dev.cc] runtime: convert memory allocator and garbage collector to GoRuss Cox
The conversion was done with an automated tool and then modified only as necessary to make it compile and run. [This CL is part of the removal of C code from package runtime. See golang.org/s/dev.cc for an overview.] LGTM=r R=r CC=austin, dvyukov, golang-codereviews, iant, khr https://golang.org/cl/167540043