path: root/src/runtime/malloc.go
2022-11-18  all: add missing periods in comments  (cui fliter)

Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>

2022-11-15  runtime: make GC see object as allocated after it is initialized  (Cherry Mui)

When the GC is scanning some memory (possibly conservatively) and finds a pointer while, concurrently, another goroutine is allocating an object at the same address, the GC may see the pointer before the object and/or the heap bits are initialized. This may cause the GC to see bad pointers and possibly crash.

To prevent this, we ensure that the scanner can only see the object as allocated after the object and the heap bits are initialized. Currently the allocator uses freeindex to find the next available slot, and that code is coupled with updating the free index to a new slot past it. The scanner also uses freeindex to determine whether an object is allocated, which is somewhat racy. This CL makes the scanner use a different field, which is only updated after the object is initialized (and after a memory barrier).

Fixes #54596.

Change-Id: I2a57a226369926e7192c253dd0d21d3faf22297c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449017
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

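A minimal sketch of this publish-after-initialize pattern, with illustrative names (the span layout, freeIndexForScan, and the helpers here are stand-ins, not the actual runtime code); the atomic store plays the role of the memory barrier that orders initialization before publication:

    package main

    import (
        "sync/atomic"
        "unsafe"
    )

    // Illustrative stand-ins for runtime internals.
    type span struct {
        base             uintptr
        elemSize         uintptr
        freeindex        uintptr        // allocator's cursor: next slot to try
        freeIndexForScan atomic.Uintptr // scanner's view: slots below this are allocated
    }

    // allocate hands out the next slot, but publishes it to the scanner
    // only after the object memory and heap bits have been initialized.
    func (s *span) allocate() unsafe.Pointer {
        i := s.freeindex
        s.freeindex++
        p := unsafe.Pointer(s.base + i*s.elemSize)
        // ... initialize object memory and heap bits here ...
        // The atomic store orders the initialization above before the
        // scanner can observe the slot as allocated.
        s.freeIndexForScan.Store(s.freeindex)
        return p
    }

    func main() {
        s := &span{base: 0x1000, elemSize: 16}
        _ = s.allocate()
    }
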
2022-11-14  Revert "runtime: delay incrementing freeindex in malloc"  (Michael Knyszek)

This reverts commit bed2b7cf41471e1521af5a83ae28bd643eb3e038.

Reason for revert: I clicked submit by accident on the wrong CL.

Change-Id: Iddf128cb62f289d472510eb30466e515068271b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/449501
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>

2022-11-11  runtime: delay incrementing freeindex in malloc  (Cherry Mui)

When the GC is scanning some memory (possibly conservatively) and finds a pointer while, concurrently, another goroutine is allocating an object at the same address, the GC may see the pointer before the object and/or the heap bits are initialized. This may cause the GC to see bad pointers and possibly crash.

To prevent this, we ensure that the scanner can only see the object as allocated after the object and the heap bits are initialized. As the scanner uses freeindex to determine whether an object is allocated, we delay the increment of freeindex until after the initialization. Because finding the next free index and updating the free index past it are currently coupled in some code paths, this needs a small refactoring: in the new code, mspan.nextFreeIndex returns the next free index but does not update it (although allocCache is updated); mallocgc updates it at a later time.

Fixes #54596.

Change-Id: I6dd5ccf743f2d2c46a1ed67c6a8237fe09a71260
Reviewed-on: https://go-review.googlesource.com/c/go/+/427619
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>

2022-10-18  runtime: replace all uses of CtzXX with TrailingZerosXX  (Youlin Feng)

Replace all uses of Ctz64/32/8 with TrailingZeros64/32/8, because they are identical and the Ctz versions were duplicates. The CtzXX functions in the 386 assembly code are also renamed.

Change-Id: I19290204858083750f4be589bb0923393950ae6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/438935
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>

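The allocator uses these intrinsics to pick the next free slot out of a span's allocCache bitmap word. A small, runnable illustration using the equivalent math/bits API (the cache value below is made up):

    package main

    import (
        "fmt"
        "math/bits"
    )

    func main() {
        // allocCache is a 64-bit window of a span's free/used bitmap,
        // with a 1 bit marking a free slot (illustrative value).
        var allocCache uint64 = 0b10110000
        // TrailingZeros64 (the intrinsic formerly exposed as Ctz64)
        // gives the index of the lowest set bit: the next free slot.
        fmt.Println(bits.TrailingZeros64(allocCache)) // 4
        // Clear that bit to consume the slot, then look again.
        allocCache &= allocCache - 1
        fmt.Println(bits.TrailingZeros64(allocCache)) // 5
    }
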
2022-10-17  runtime: remove redundant conversion  (Dan Kortschak)

This appears to have been left over from a C cast during the rewrite of malloc into Go in https://golang.org/cl/108840046.

Change-Id: I88f212089c2bcf79d5881b3e8bf3f94f343331d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/443235
Run-TryBot: Dan Kortschak <dan@kortschak.io>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>

2022-10-12  runtime: add safe arena support to the runtime  (Michael Anthony Knyszek)

This change adds an API to the runtime for arenas. A later CL can potentially export it as an experimental API, but for now, just the runtime implementation will suffice. The purpose of arenas is to improve efficiency, primarily by allowing an application to manually free memory, thereby delaying garbage collection. They come with other potential performance benefits, such as better locality, a better allocation strategy, and better handling of interior pointers by the GC.

This implementation is based on one by danscales@google.com with a few significant differences:

* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB or the heap arena size. This choice is made because in practice 64 MiB appears to be far too large an area for most real-world use cases.
* Arena chunks are not unmapped; instead they are placed on an evacuation list, and once no pointers are left pointing into them, they are allowed to be reused.
* Reusing partially used arena chunks no longer tries to find one used by the same P first; it just takes the first one available.
* To ensure worst-case fragmentation is never worse than 25%, only types and slice backing stores whose sizes are at most 1/4th the size of a chunk may be used. Previously larger sizes, up to the size of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Setting arena chunks to fault is deferred to the end of mark termination (a non-public patch once did this; I don't see a reason not to continue that).

For #51317.

Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>

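This runtime support was later surfaced as the experimental arena package behind GOEXPERIMENT=arenas in Go 1.20. A minimal usage sketch of that experimental API (subject to change or removal):

    // Build with: GOEXPERIMENT=arenas go run .
    package main

    import (
        "arena"
        "fmt"
    )

    type point struct{ x, y int }

    func main() {
        a := arena.NewArena()
        p := arena.New[point](a)             // *point allocated in the arena
        s := arena.MakeSlice[int](a, 0, 128) // slice backing store in the arena
        p.x, p.y = 1, 2
        s = append(s, p.x+p.y)
        fmt.Println(s[0]) // 3
        a.Free() // manually free all memory allocated from the arena
    }
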
2022-10-12  runtime: make (*mheap).sysAlloc more general  (Michael Anthony Knyszek)

This change makes (*mheap).sysAlloc take an explicit list of hints and a boolean as to whether or not any newly created heapArenas should be registered in the full arena list. This is a refactoring in preparation for arenas.

For #51317.

Change-Id: I0584a033fce3fcb60c5d0bc033d5fb8bd23b2378
Reviewed-on: https://go-review.googlesource.com/c/go/+/432078
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>

2022-10-12  runtime: factor out GC assist credit accounting  (Michael Anthony Knyszek)

No-op change in preparation for arenas.

For #51317.

Change-Id: I0777f21763fcd34957b7e709580cf2b7a962ba67
Reviewed-on: https://go-review.googlesource.com/c/go/+/423365
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-09-08  runtime: remove unused scanSize parameter to gcmarknewobject  (Michael Anthony Knyszek)

This was left over from the old pacer, and never removed when the old pacer was removed in Go 1.19.

Change-Id: I79e5f0420c6100c66bd06129a68f5bbab7c1ea8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/429256
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

2022-08-19  runtime: add and use runtime/internal/sys.NotInHeap  (Cuong Manh Le)

Updates #46731

Change-Id: Ic2208c8bb639aa1e390be0d62e2bd799ecf20654
Reviewed-on: https://go-review.googlesource.com/c/go/+/421878
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>

2022-08-16  runtime: redo heap bitmap  (Keith Randall)

[This is a retry of CL 407035 + its revert CL 422395. The content is unchanged.]

Use just 1 bit per word to record the ptr/nonptr bitmap. Use word-sized operations to manipulate the bitmap, so we can operate on up to 64 ptr/nonptr bits at a time.

Use a separate bitmap, one bit per word of the ptr/nonptr bitmap, to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr bits at once, knowing the exact last pointer location is not necessary.

As a follow-on CL, we should make the gcdata bitmap an array of uintptr instead of an array of byte, so we can load 64 bits of it at once. Similarly for the processing of GC programs.

Change-Id: Ica5eb622f5b87e647be64f471d67b02732ef8be6
Reviewed-on: https://go-review.googlesource.com/c/go/+/422634
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>

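To illustrate the payoff of the 1-bit-per-word layout, here is a small, runnable sketch of consuming up to 64 ptr/nonptr bits at a time with math/bits (an illustration of the scheme, not the runtime's actual scan loop):

    package main

    import (
        "fmt"
        "math/bits"
    )

    // scanWords visits the word indices within a 64-word region that
    // hold pointers, given one ptr/nonptr bit per word.
    func scanWords(ptrBits uint64, visit func(i int)) {
        for ptrBits != 0 {
            i := bits.TrailingZeros64(ptrBits) // next pointer word
            visit(i)
            ptrBits &= ptrBits - 1 // clear the bit we just handled
        }
    }

    func main() {
        scanWords(0b1001010, func(i int) {
            fmt.Println("pointer at word", i) // words 1, 3, 6
        })
    }
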
2022-08-09  Revert "runtime: redo heap bitmap"  (Keith Randall)

This reverts commit b589208c8cc6e08239868f47e12c1449cd797bac.

Reason for revert: bug somewhere in this code, causing wasm and maybe linux/386 to fail.

Change-Id: I5e1e501d839584e0219271bb937e94348f83c11f
Reviewed-on: https://go-review.googlesource.com/c/go/+/422395
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-08-08  runtime: redo heap bitmap  (Keith Randall)

Use just 1 bit per word to record the ptr/nonptr bitmap. Use word-sized operations to manipulate the bitmap, so we can operate on up to 64 ptr/nonptr bits at a time.

Use a separate bitmap, one bit per word of the ptr/nonptr bitmap, to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr bits at once, knowing the exact last pointer location is not necessary.

This cleans up the bitmap implementation significantly, which will hopefully make it faster. TODO: measure.

As a follow-on CL, we should make the gcdata bitmap an array of uintptr instead of an array of byte, so we can load 64 bits of it at once. Similarly for the processing of GC programs.

Change-Id: I18151b1876d9543599800dec51e2a1b19df97d49
Reviewed-on: https://go-review.googlesource.com/c/go/+/407035
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>

2022-08-04  runtime: add mayAcquire annotation for finlock  (Austin Clements)

We're missing lock edges to finlock that happen only rarely. Anything that calls mallocgc can potentially trigger sweeping, which can potentially queue a finalizer, which acquires finlock. While this can happen on any malloc, it happens relatively rarely, so we simply haven't seen some of the lock edges that could happen.

Add a mayAcquire annotation to mallocgc to capture the possibility of acquiring finlock. With this change, we add "fin" to the set of "malloc" locks. Several of these edges were already there, but not quite all of them.

This was found by inspecting the rank graph for things that didn't make sense.

For #53789.

Change-Id: Idc10ce6f250596b0c07ba07ac93f2198fb38c22b
Reviewed-on: https://go-review.googlesource.com/c/go/+/418717
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-08-02  runtime: trivial replacements of g in remaining files  (Michael Pratt)

Rename g variables to gp for consistency.

Change-Id: I09ecdc7e8439637bc0e32f9c5f96f515e6436362
Reviewed-on: https://go-review.googlesource.com/c/go/+/418591
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>

2022-05-18  runtime: remove useless constant definition in malloc.go  (Leonard Wang)

Change-Id: I060c867d89a06b5a44fbe77804c19299385802d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/311250
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>

2022-05-03  runtime: split mprof locks  (Rhys Hiltner)

The profiles for memory allocations, sync.Mutex contention, and general blocking store their data in a shared hash table. The bookkeeping work at the end of a garbage collection cycle involves maintenance on each memory allocation record. Previously, a single lock guarded access to the hash table and the contents of all records. When a program has allocated memory at a large number of unique call stacks, the maintenance following every garbage collection can hold that lock for several milliseconds. That can prevent progress on all other goroutines by delaying acquirep's call to mcache.prepareForSweep, which needs the lock in mProf_Free to report when a profiled allocation is no longer in use. With no user goroutines making progress, it is in effect a multi-millisecond GC-related stop-the-world pause.

Split the lock so the call to mProf_Flush no longer delays each P's call to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle, and mProf_Flush uses locks on the memory records' accumulator and their N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it uses a lock on the memory records' N+2 cycle.

The profiles for sync.Mutex contention and general blocking now share a separate lock, and another lock guards insertions to the shared hash table (uncommon in the steady state). Consumers of each type of profile take the matching accumulator lock, so they will observe consistent count and magnitude values for each record.

For #45894

Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa
Reviewed-on: https://go-review.googlesource.com/c/go/+/399956
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-05-03  runtime: move inconsistent memstats into gcController  (Michael Anthony Knyszek)

Fundamentally, all of these memstats exist to serve the runtime in managing memory. For the sake of simpler testing, couple these stats more tightly with the GC.

This CL was mostly done automatically. The fields had to be moved manually, but the references to the fields were updated via

    gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go

For #48409.

Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

2022-05-03  runtime: clean up inconsistent heap stats  (Michael Anthony Knyszek)

The inconsistent heap stats in memstats are a bit messy. Primarily, heap_sys is non-orthogonal with heap_released and heap_inuse. In later CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean this up by replacing heap_sys with an orthogonal metric: heapFree. heapFree represents page heap memory that is free but not released.

I think this change also simplifies a lot of reasoning about these stats; it's much clearer what they mean, and to obtain HeapSys for memstats, we no longer need to do the strange subtraction from heap_sys when allocating specifically non-heap memory from the page heap.

Because we're removing heap_sys, we need to replace it with a sysMemStat for mem.go functions. In this case, heap_released is the most appropriate because we increase it anyway (again, non-orthogonality). In which case, it makes sense for heap_inuse, heap_released, and heapFree to become more uniform, and to just represent them all as sysMemStats.

While we're here and messing with the types of heap_inuse and heap_released, let's also fix their names (and last_heap_inuse's name) to the more modern Go convention of camelCase.

For #48409.

Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b
Reviewed-on: https://go-review.googlesource.com/c/go/+/397677
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>

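Under that orthogonal split, the familiar totals become simple sums. A small runnable sketch with illustrative names (the real fields live in the runtime's stats machinery):

    package main

    import "fmt"

    // The three orthogonal page-heap stats described above.
    type heapStats struct {
        heapInUse    uint64 // memory occupied by in-use spans
        heapFree     uint64 // page-heap memory that is free but not released
        heapReleased uint64 // memory released back to the OS
    }

    // The old heap_sys-style totals are now derived, not tracked directly.
    func (h heapStats) heapSys() uint64  { return h.heapInUse + h.heapFree + h.heapReleased }
    func (h heapStats) heapIdle() uint64 { return h.heapFree + h.heapReleased }

    func main() {
        h := heapStats{heapInUse: 96 << 20, heapFree: 16 << 20, heapReleased: 8 << 20}
        fmt.Println(h.heapSys()>>20, "MiB mapped for the heap") // 120 MiB
    }
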
2022-05-03  runtime: track how much memory is mapped in the Ready state  (Michael Anthony Knyszek)

This change adds a field to memstats called mappedReady that tracks how much memory is in the Ready state at any given time. In essence, it's the total memory usage by the Go runtime (with a few exceptions which are documented): all memory mapped read/write that has either been paged in or soon will be.

To make tracking this not involve the many different stats that track mapped memory, we track this statistic at a very low level. The downside of tracking it at such a low level is that it managed to catch lots of situations where the runtime wasn't fully accounting for memory. This change rectifies these situations by always accounting for memory that's mapped in some way (i.e. always passing a sysMemStat to a mem.go function), with three exceptions. Rectifying these situations means also having the memory mapped during testing be accounted for, so that tests (i.e. ReadMemStats) that ultimately check mappedReady continue to work correctly without special exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is the arenas array for finding heap arena metadata from an address, which is mapped read/write in one large chunk. It's tens of MiB in size. On systems with demand paging, we assume that the whole thing isn't paged in at once (after all, it maps to the whole address space, and it's exceedingly difficult with today's technology to even broach having as much physical memory as the total address space). On systems where we have to commit memory manually, we use a two-level structure. The reason why this is an exception is that we have no mechanism to track what memory is paged in, and we can't just account for the entire thing, because that would *look* like an enormous overhead. Furthermore, this structure is on a few really, really critical paths in the runtime, so doing more explicit tracking isn't really an option. So we explicitly don't, and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean up address space reservations, or otherwise to throw out mappings we don't care about. In this case, we also drop down to a lower level and call sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging facility, and ideally we want it to have as small an impact on the runtime as possible. If we included it in mappedReady calculations, it could cause GC pacing shifts in future CLs, especially if one increases the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would pass nil for a stat to any of the functions in mem.go. As a result, this CL makes sysMemStats mandatory to facilitate better accounting in the future. It's now much easier to grep and find out where accounting is explicitly elided, because one doesn't have to follow the trail of sysMemStat nil pointer values, and can just look at the function name.

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>

2022-04-05  all: separate doc comment from //go: directives  (Russ Cox)

A future change to gofmt will rewrite

    // Doc comment.
    //go:foo

to

    // Doc comment.
    //
    //go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>

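For a concrete case with a real directive such as //go:noinline, the rewritten form keeps the directive out of the rendered doc comment (a sketch):

    package demo

    // Add returns the sum of x and y.
    //
    //go:noinline
    func Add(x, y int) int { return x + y }

The blank comment line is what separates the human-readable documentation from the machine-read directive, so tools like go doc render only the first part.
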
2022-03-31  runtime: add wrappers for sys* functions and consolidate docs  (Michael Anthony Knyszek)

This change lifts all non-platform-specific code out of the sys* functions for each platform up into wrappers, and moves documentation about the OS virtual memory abstraction layer from malloc.go to mem.go, which contains those wrappers.

Change-Id: Ie803e4447403eaafc508b34b53a1a47d6cee9388
Reviewed-on: https://go-review.googlesource.com/c/go/+/393398
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>

2022-03-31  runtime: remove intermediate fields in memstats for ReadMemStats  (Michael Anthony Knyszek)

Currently, the ReadMemStats call chain (really this all happens in readmemstats_m, but that's just a direct call from ReadMemStats) first populates some fields in memstats, then copies those into the final MemStats location. This used to make a lot of sense when memstats' structure aligned with MemStats, and the values were just copied from one to the other. Sometime in the last few releases, we switched to populating MemStats manually because a lot of fields had diverged from their internal representation.

Now we're left with a lot of fields in memstats that pollute the structure: they only exist to be updated for the sake of ReadMemStats. Since we're going to be adding more fields to memstats in further CLs, this is a good opportunity to clean up. As a result of this change, updatememstats, which used to just update the aforementioned intermediate fields in memstats, is no longer necessary, so it is removed.

Change-Id: Ifabfb3ac3002641105af62e9509a6351165dcd87
Reviewed-on: https://go-review.googlesource.com/c/go/+/393397
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

2021-12-06  runtime: set iOS addr space to 40 bits with incremental pagealloc  (Michael Anthony Knyszek)

In iOS <14, the address space is strictly limited to 8 GiB, or 33 bits. As a result, the page allocator also assumes all heap memory lives in this region. This is especially necessary because the page allocator has a PROT_NONE mapping proportional to the size of the usable address space, so this assumption keeps that mapping very small.

However, starting with iOS 14, this restriction is relaxed, and mmap may start returning addresses outside of the <14 range. Today this means that on iOS 14 and later, users experience an error in the page allocator when a heap arena is mapped outside of the old range.

This change increases the ios/arm64 heapAddrBits to 40 while simultaneously making ios/arm64 use the 64-bit pagealloc implementation (with reservations and incremental mapping) to accommodate both iOS versions <14 and 14+. Once iOS <14 is deprecated, we can remove these exceptions and treat ios/arm64 like any other arm64 platform.

This change also makes the BaseChunkIdx expression a little bit easier to read, while we're here.

Fixes #46860.

Change-Id: I13865f799777739109585f14f1cc49d6d57e096b
Reviewed-on: https://go-review.googlesource.com/c/go/+/344401
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>

2021-11-02  runtime, syscall: add calls to asan functions  (fanzha02)

Add explicit address sanitizer instrumentation to the runtime and syscall packages. The compiler does not instrument the runtime package. It does instrument the syscall package, but we need to add a couple of cases that it can't see.

Following the implementation of the asan malloc runtime library, this patch also allocates extra memory as a redzone around each returned memory region and marks the redzone as unaddressable, to detect overflows and underflows.

Updates #44853.

Change-Id: I2753d1cc1296935a66bf521e31ce91e35fcdf798
Reviewed-on: https://go-review.googlesource.com/c/go/+/298614
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>

2021-10-29  runtime: clean up allocation zeroing  (Michael Anthony Knyszek)

Currently, the runtime zeroes allocations in several ways. First, small object spans are always zeroed if they come from mheap, and their slots are zeroed later in mallocgc if needed. Second, large object spans (objects that have their own spans) plumb the need for zeroing down into mheap. Third, large objects that have no pointers have their zeroing delayed until after preemption is re-enabled, but before returning from mallocgc.

All of this has two consequences:

1. Spans for small objects that come from mheap are sometimes unnecessarily zeroed, even if the mallocgc call that created them doesn't need the object slot to be zeroed.
2. This is all messy and difficult to reason about.

This CL simplifies the code, resolving both (1) and (2). First, it recognizes that zeroing in mheap is unnecessary for small object spans; mallocgc and its callees in mcache and mcentral are, by design, *always* able to deal with non-zeroed spans. They must be, for they deal with recycled spans all the time. Once this fact is made clear, the only remaining use of zeroing in mheap is for large objects. As a result, this CL lifts mheap zeroing for large objects into mallocgc, to parallel all the other codepaths in mallocgc. This makes the large object allocation code less surprising.

Next, this CL sets the flag for delayed zeroing explicitly in the one case where it matters, and inverts and renames the flag from isZeroed to delayZeroing. Finally, it adds a check to make sure that only pointer-free allocations take the delayed zeroing codepath, as an extra safety measure.

Benchmark results: https://perf.golang.org/search?q=upload:20211028.8

Inspired by tapir.liu@gmail.com's CL 343470.

Change-Id: I7e1296adc19ce8a02c8d93a0a5082aefb2673e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359477
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>

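The safety check in the last paragraph amounts to a one-line predicate. A trivial runnable sketch with illustrative names (the real check sits inside mallocgc):

    package main

    import "fmt"

    // delayZeroing reports whether zeroing may be postponed until after
    // preemption is re-enabled. Only pointer-free ("noscan") allocations
    // qualify: the GC must never observe uninitialized pointer words.
    func delayZeroing(needzero, noscan bool) bool {
        return needzero && noscan
    }

    func main() {
        fmt.Println(delayZeroing(true, true))  // true: zero on the preemptible path
        fmt.Println(delayZeroing(true, false)) // false: zero before the GC can see it
    }
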
2021-10-07  runtime,sync: using fastrandn instead of modulo reduction  (Meng Zhuo)

fastrandn is ~50% faster than fastrand() % n. `ack -v 'fastrand\(\)\s?\%'` finds all modulo reductions on fastrand().

    name          old time/op  new time/op  delta
    Fastrandn/2   2.86ns ± 0%  1.59ns ± 0%  -44.35%  (p=0.000 n=9+10)
    Fastrandn/3   2.87ns ± 1%  1.59ns ± 0%  -44.41%  (p=0.000 n=10+9)
    Fastrandn/4   2.87ns ± 1%  1.58ns ± 1%  -45.10%  (p=0.000 n=10+10)
    Fastrandn/5   2.86ns ± 1%  1.58ns ± 1%  -44.84%  (p=0.000 n=10+10)

Change-Id: Ic91f5ca9b9e3b65127bc34792b62fd64fbd13b5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/353269
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>

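The speedup comes from replacing the modulo with a multiply-and-shift reduction (Lemire's method), which maps a uniform 32-bit value into [0, n) without a division. A runnable sketch, with math/rand standing in for the runtime's private fastrand:

    package main

    import (
        "fmt"
        "math/rand"
    )

    // fastrandn-style reduction: (x * n) >> 32 lands in [0, n) and is
    // (close to) uniform when x is uniform over the 32-bit range.
    func fastrandn(x, n uint32) uint32 {
        return uint32(uint64(x) * uint64(n) >> 32)
    }

    func main() {
        x := rand.Uint32()           // stand-in for the runtime's fastrand()
        fmt.Println(fastrandn(x, 5)) // a value in [0, 5)
    }
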
2021-09-28  runtime: add mp parameter for getMCache  (Leonard Wang)

Since all callers of getMCache appear to have mp available, we pass the mp to getMCache, removing one call to getg. After this modification, getMCache is also inlined.

Change-Id: Ib7880c118336acc026ecd7c60c5a88722c3ddfc7
Reviewed-on: https://go-review.googlesource.com/c/go/+/349329
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Carlos Amedee <carlos@golang.org>

2021-07-22  [dev.typeparams] all: merge master (798ec73) into dev.typeparams  (Matthew Dempsky)

Merge List:

+ 2021-07-22 798ec73519 runtime: don't clear timerModifiedEarliest if adjustTimers is 0
+ 2021-07-22 fdb45acd1f runtime: move mem profile sampling into m-acquired section
+ 2021-07-21 3e48c0381f reflect: add missing copyright header
+ 2021-07-21 48c88f1b1b reflect: add Value.CanConvert
+ 2021-07-20 9e26569293 cmd/go: don't add C compiler ID to hash for standard library
+ 2021-07-20 d568e6e075 runtime/debug: skip TestPanicOnFault on netbsd/arm

Change-Id: I87e1cd4614bb3b00807f18dfdd02664dcaecaebd

2021-07-22  runtime: move mem profile sampling into m-acquired section  (David Chase)

It was not safe to do mcache profiling updates outside the critical section, but we got lucky because the runtime was not preemptible. Adding chunked memory clearing (CL 270943) created preemption opportunities, which led to corruption of runtime data structures.

Fixes #47304.
Fixes #47302.

Change-Id: I461615470d62328a83ccbac537fbdc6dcde81c85
Reviewed-on: https://go-review.googlesource.com/c/go/+/336449
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>

2021-06-17  [dev.typeparams] runtime: replace Goarch* constants with internal/goarch versions [generated]  (Michael Anthony Knyszek)

[git-generate]
cd src/runtime
gofmt -w -r "sys.Goarch386 -> goarch.Is386" .
gofmt -w -r "sys.GoarchAmd64 -> goarch.IsAmd64" .
gofmt -w -r "sys.GoarchAmd64p32 -> goarch.IsAmd64p32" .
gofmt -w -r "sys.GoarchArm -> goarch.IsArm" .
gofmt -w -r "sys.GoarchArmbe -> goarch.IsArmbe" .
gofmt -w -r "sys.GoarchArm64 -> goarch.IsArm64" .
gofmt -w -r "sys.GoarchArm64be -> goarch.IsArm64be" .
gofmt -w -r "sys.GoarchPpc64 -> goarch.IsPpc64" .
gofmt -w -r "sys.GoarchPpc64le -> goarch.IsPpc64le" .
gofmt -w -r "sys.GoarchMips -> goarch.IsMips" .
gofmt -w -r "sys.GoarchMipsle -> goarch.IsMipsle" .
gofmt -w -r "sys.GoarchMips64 -> goarch.IsMips64" .
gofmt -w -r "sys.GoarchMips64le -> goarch.IsMips64le" .
gofmt -w -r "sys.GoarchMips64p32 -> goarch.IsMips64p32" .
gofmt -w -r "sys.GoarchMips64p32le -> goarch.IsMips64p32le" .
gofmt -w -r "sys.GoarchPpc -> goarch.IsPpc" .
gofmt -w -r "sys.GoarchRiscv -> goarch.IsRiscv" .
gofmt -w -r "sys.GoarchRiscv64 -> goarch.IsRiscv64" .
gofmt -w -r "sys.GoarchS390 -> goarch.IsS390" .
gofmt -w -r "sys.GoarchS390x -> goarch.IsS390x" .
gofmt -w -r "sys.GoarchSparc -> goarch.IsSparc" .
gofmt -w -r "sys.GoarchSparc64 -> goarch.IsSparc64" .
gofmt -w -r "sys.GoarchWasm -> goarch.IsWasm" .
goimports -w *.go

Change-Id: I9d88e1284efabaeb0ee3733cba6286247d078c85
Reviewed-on: https://go-review.googlesource.com/c/go/+/328345
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

2021-06-17  [dev.typeparams] runtime: replace Goos* constants with internal/goos versions [generated]  (Michael Anthony Knyszek)

[git-generate]
cd src/runtime
gofmt -w -r "sys.GoosAix -> goos.IsAix" .
gofmt -w -r "sys.GoosAndroid -> goos.IsAndroid" .
gofmt -w -r "sys.GoosDarwin -> goos.IsDarwin" .
gofmt -w -r "sys.GoosDragonfly -> goos.IsDragonfly" .
gofmt -w -r "sys.GoosFreebsd -> goos.IsFreebsd" .
gofmt -w -r "sys.GoosHurd -> goos.IsHurd" .
gofmt -w -r "sys.GoosIllumos -> goos.IsIllumos" .
gofmt -w -r "sys.GoosIos -> goos.IsIos" .
gofmt -w -r "sys.GoosJs -> goos.IsJs" .
gofmt -w -r "sys.GoosLinux -> goos.IsLinux" .
gofmt -w -r "sys.GoosNacl -> goos.IsNacl" .
gofmt -w -r "sys.GoosNetbsd -> goos.IsNetbsd" .
gofmt -w -r "sys.GoosOpenbsd -> goos.IsOpenbsd" .
gofmt -w -r "sys.GoosPlan9 -> goos.IsPlan9" .
gofmt -w -r "sys.GoosSolaris -> goos.IsSolaris" .
gofmt -w -r "sys.GoosWindows -> goos.IsWindows" .
gofmt -w -r "sys.GoosZos -> goos.IsZos" .
goimports -w *.go

Change-Id: I42bed2907317ed409812e5a3e2897c88a5d36f24
Reviewed-on: https://go-review.googlesource.com/c/go/+/328344
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

2021-06-17  [dev.typeparams] runtime: fix import sort order [generated]  (Michael Anthony Knyszek)

[git-generate]
cd src/runtime
goimports -w *.go

Change-Id: I1387af0f2fd1a213dc2f4c122e83a8db0fcb15f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/329189
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>

2021-06-17  [dev.typeparams] runtime: replace uses of runtime/internal/sys.PtrSize with internal/goarch.PtrSize [generated]  (Michael Anthony Knyszek)

[git-generate]
cd src/runtime/internal/math
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
cd ../..
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go

Change-Id: I43491cdd54d2e06d4d04152b3d213851b7d6d423
Reviewed-on: https://go-review.googlesource.com/c/go/+/328337
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>

2021-06-11  [dev.typeparams] runtime: simplify defer record allocation  (Cherry Mui)

Now that deferred functions are always argumentless and defer records no longer carry arguments, defer records can be fixed size (just the _defer struct). This allows us to simplify the allocation of defer records; specifically, it removes the defer classes and the pools of different-sized defers.

Change-Id: Icc4b16afc23b38262ca9dd1f7369ad40874cf701
Reviewed-on: https://go-review.googlesource.com/c/go/+/326062
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>

2021-05-26  runtime, cmd/link/internal/ld: fix typos  (tyltr)

Change-Id: I558590cef7e2311aadbdcb4088033e350d3aae32
GitHub-Last-Rev: 513944a6238e0e32e2a2c266b70f7d50c9db508d
GitHub-Pull-Request: golang/go#46389
Reviewed-on: https://go-review.googlesource.com/c/go/+/322809
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

2021-04-30  runtime: break up large calls to memclrNoHeapPointers to allow preemption  (David Chase)

If something "huge" is allocated and the zeroing is trivial (no pointers involved), then zero it by chunks in a loop so that preemption can occur, rather than all in a single non-preemptible call. Benchmarking suggests that 256K is the best chunk size.

Updates #42642.

Change-Id: I94015e467eaa098c59870e479d6d83bc88efbfb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/270943
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>

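A runnable sketch of the chunked-clearing strategy (illustrative code, not the runtime's memclrNoHeapPointers path; the 256 KiB constant matches the benchmarking noted above):

    package main

    const chunkBytes = 256 << 10 // chunk size that benchmarked best

    // zeroChunked zeroes a large pointer-free buffer one chunk at a
    // time, so a preemption request can be honored between chunks
    // instead of stalling for the entire clear.
    func zeroChunked(b []byte) {
        for len(b) > 0 {
            n := len(b)
            if n > chunkBytes {
                n = chunkBytes
            }
            chunk := b[:n]
            for i := range chunk {
                chunk[i] = 0
            }
            b = b[n:] // between iterations the goroutine is preemptible
        }
    }

    func main() {
        buf := make([]byte, 1<<22) // a 4 MiB "huge" allocation
        buf[42] = 1
        zeroChunked(buf)
        if buf[42] != 0 {
            panic("not zeroed")
        }
    }
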
2021-04-29  runtime: top align tinyallocs in race mode  (Keith Randall)

Top-align allocations in tinyalloc buckets when in race mode. This makes checkptr checks more reliable, because any code that modifies a pointer past the end of the object will trigger a checkptr error.

No test, because we need -race for this to actually kick in. We could add it to the race detector tests, but the race detector tests are all geared towards race detector reports, not checkptr reports. Mucking with report parsing is more than a test is worth.

Fixes #38872

Change-Id: Ie56f0fbd1a9385539f6631fd1ac40c3de5600154
Reviewed-on: https://go-review.googlesource.com/c/go/+/315029
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>

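One plausible way to compute the top-aligned slot, as a runnable sketch (the actual change is inside mallocgc's tiny allocator): place the object as high in the 16-byte tiny block as its alignment allows, so a write even one byte past the object also escapes the block and checkptr can flag it.

    package main

    import "fmt"

    // topAlignedOffset returns the highest alignment-respecting offset
    // at which an object of the given size still fits in the tiny block.
    func topAlignedOffset(tinySize, size, align uintptr) uintptr {
        return (tinySize - size) &^ (align - 1)
    }

    func main() {
        // A 4-byte, 4-aligned object in a 16-byte block lands at offset
        // 12: it ends exactly at the block boundary, leaving no slack
        // to hide an out-of-bounds write.
        fmt.Println(topAlignedOffset(16, 4, 4)) // 12
    }
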
2021-04-29  runtime: use 4 MiB heap arenas on iOS  (Michael Anthony Knyszek)

iOS arm64 is a 64-bit platform but with a strictly 32-bit address space (technically 33 bits, but the bottom half is unavailable to the application). Since address space is limited, use 4 MiB arenas instead of 64 MiB arenas. No changes are needed to the arena index because it's still relatively small; this change just brings iOS more in line with 32-bit platforms.

Change-Id: I484e2d273d896fd0a57cd5c25012df0aef160290
Reviewed-on: https://go-review.googlesource.com/c/go/+/270538
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>

2021-03-30  runtime: fix typos in comments  (Lizzzcai)

Change-Id: Ia70e8bdc6d2cf1195d7a3b5d33f180ae2db73e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/305369
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Ben Shi <powerman1st@163.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>

2021-03-15  runtime: prepare arenas for use incrementally  (Michael Anthony Knyszek)

This change moves the call of sysMap from (*mheap).sysAlloc into (*mheap).grow, so we only sysMap what we're going to use in the near future (thanks to the curArena mechanism). The purpose of this change is to better support systems with strict overcommit rules, which generally accept reserved memory but not prepared memory (see malloc.go for exact descriptions of these states).

This move requires changing linearAlloc to only optionally map memory. In one case, with mheap.heapArenaAlloc, we do want it to map memory. But now in the other case, with mheap.arena, we don't, because we want grow to take care of it.

The risk with this change is that we may make more syscalls than before on systems with 64 MiB arenas, but because heap growth is relatively rare this is unlikely to be a noticeable issue. We also bound the number of syscalls made by only extending curArena (and thus mapping) by pallocChunkPages*pageSize, which is 4 MiB.

Fixes #42612.

Change-Id: I736df696afe78ddb1a747a896caa0db8726027e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/270537
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>

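A sketch of a linear allocator whose mapping step is optional, in the spirit of the linearAlloc change described above (sysMap here is a printing stub; the real one transitions pages between the Reserved, Prepared, and Ready states):

    package main

    import "fmt"

    func sysMap(base, n uintptr) { fmt.Printf("sysMap(%#x, %d)\n", base, n) } // stub

    // linearAlloc hands out addresses from a reserved region, optionally
    // mapping pages only as its cursor passes them (illustrative sketch).
    type linearAlloc struct {
        next      uintptr // next free byte
        mapped    uintptr // one past the highest mapped byte
        end       uintptr // end of the reserved region
        mapMemory bool    // if false, the caller maps (as (*mheap).grow now does)
    }

    func (l *linearAlloc) alloc(size, align uintptr) uintptr {
        p := (l.next + align - 1) &^ (align - 1)
        if p+size > l.end {
            return 0 // reserved space exhausted
        }
        l.next = p + size
        if l.mapMemory && l.next > l.mapped {
            sysMap(l.mapped, l.next-l.mapped) // map only what we now need
            l.mapped = l.next
        }
        return p
    }

    func main() {
        l := &linearAlloc{next: 0x100000, mapped: 0x100000, end: 0x200000, mapMemory: true}
        fmt.Printf("%#x\n", l.alloc(4096, 8))
    }
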
2020-11-05  runtime: avoid a bit of unneeded work when MemProfileRate==1  (Brad Fitzpatrick)

Change-Id: I1dc355bcaeb0e5fb06a7fddc4cf5db596d22e0b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/236148
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Austin Clements <austin@google.com>

2020-11-02  runtime: make getMCache inlineable  (Michael Anthony Knyszek)

This change moves the responsibility of throwing if an mcache is not available to the caller, because the inlining cost of throw is set very high in the compiler. Even if it were reduced down to the cost of a usual function call, it would still be too expensive, so just move it out. This choice also makes sense in the context of #42339, since we're going to have to handle the case where we don't have an mcache to update stats in a few contexts anyhow.

Also, add getMCache to the list of functions that should be inlined, to prevent future regressions. getMCache is called on the allocation fast path, and because it's not inlined it actually causes a significant regression (~10%) in some microbenchmarks.

Fixes #42305.

Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>

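A compilable sketch of the resulting shape, with hypothetical stand-in types (the real code lives in the runtime and works with the g/m/p structures): the accessor stays tiny and inlinable, and the throw lives in the caller.

    package main

    // Minimal stand-ins so the sketch compiles.
    type mcache struct{}
    type p struct{ mcache *mcache }

    var mcache0 *mcache // bootstrap mcache, before any Ps exist

    func currentP() *p    { return &p{mcache: &mcache{}} }
    func throw(s string)  { panic(s) }

    // getMCache stays tiny so the compiler can inline it on the
    // allocation fast path; note that it returns nil rather than throwing.
    func getMCache(pp *p) *mcache {
        if pp == nil {
            return mcache0 // bootstrapping: no P yet
        }
        return pp.mcache
    }

    func main() {
        c := getMCache(currentP())
        if c == nil {
            // The throw belongs to the caller: throw's high inlining
            // cost would otherwise keep getMCache from being inlined.
            throw("mcache not available")
        }
        _ = c
    }
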
2020-10-30  runtime: add heap lock assertions  (Michael Pratt)

Some functions that required holding the heap lock _or_ stopping the world have been simplified to simply requiring the heap lock. This is conceptually simpler, and taking the heap lock during a world stop is guaranteed not to contend. This was only done on functions already called on the systemstack, to avoid too many extra systemstack calls in the GC.

Updates #40677

Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/250262
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>

2020-10-30  runtime: allocate at desired address when race detector is on  (Cherry Zhang)

Currently, on all supported platforms, the race detector (LLVM TSAN) expects the Go heap to be at 0xc000000000 - 0xe000000000. Move the raceenabled condition first, so we always allocate there. This means that on Linux/ARM64, when the race detector is on, we will allocate at 0xc000000000 - 0xe000000000 instead of 0x4000000000. The old address was meant for a 39-bit VMA, but the race detector only supports a 48-bit VMA anyway, so this is fine.

Change-Id: I51ac8eff68297b37c8c651a93145cc94f83a939d
Reviewed-on: https://go-review.googlesource.com/c/go/+/266372
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

2020-10-26  runtime: move malloc stats into consistentHeapStats  (Michael Anthony Knyszek)

This change moves the mcache-local malloc stats into the consistentHeapStats structure, so the malloc stats can be managed consistently with the memory stats. The one exception here is tinyAllocs, for which moving it into the global stats would incur several atomic writes on the fast path; microbenchmarks for just one CPU core have shown a 50% loss in throughput. Since the tiny allocation count isn't exposed anyway and is always blindly added to both allocs and frees, let it stay inconsistent, and flush the tiny allocation count every so often.

Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/247039
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

2020-10-26  runtime: add helper for getting an mcache in allocation contexts  (Michael Anthony Knyszek)

This change adds a function getMCache, which returns the current P's mcache if it's available, and otherwise tries to get mcache0 if we're bootstrapping. This function will come in handy as we need to replicate this behavior in multiple places in future changes.

Change-Id: I536073d6f6dc6c6390269e613ead9f8bcb6e7f98
Reviewed-on: https://go-review.googlesource.com/c/go/+/246976
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

2020-10-26  runtime: break down memstats.gc_sys  (Michael Anthony Knyszek)

This change breaks gc_sys apart into three distinct pieces. Two of those pieces come from heap_sys, since they're allocated from the page heap. The rest comes from memory mapped from e.g. persistentalloc, which better fits the purpose of a sysMemStat. Also, rename gc_sys to gcMiscSys.

Change-Id: I098789170052511e7b31edbcdc9a53e5c24573f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246973
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

2020-10-26  runtime: delineate which memstats are system stats with a type  (Michael Anthony Knyszek)

This change modifies the type of several mstats fields to be a new type: sysMemStat. This type has the same structure as the fields used to have. The purpose of this change is to make it very clear which stats may be used in various functions for accounting (usually the platform-specific sys* functions, but there are others). Currently there's an implicit understanding that the *uint64 value passed to these functions is some kind of statistic whose value is atomically managed. This understanding isn't inherently problematic, but we're about to change how some stats (which currently use mSysStatInc and mSysStatDec) work, so we want to make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a method on sysMemStat. Note that those two functions were originally written the way they were because atomic 64-bit adds required a valid G on ARM, but this hasn't been the case for a very long time (since golang.org/cl/14204, though even before then it wasn't clear whether mutexes required a valid G anymore). Today we implement 64-bit adds on ARM with a spinlock table.

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>

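A runnable sketch modeled on that description (method names here are illustrative): a distinct type whose updates go through one atomic method, so call sites that account memory are easy to find by grepping for the type.

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // sysMemStat marks a statistic that low-level memory functions may
    // update; the distinct type replaces free-floating *uint64 updates.
    type sysMemStat uint64

    func (s *sysMemStat) load() uint64 { return atomic.LoadUint64((*uint64)(s)) }

    // add replaces separate inc/dec helpers: a negative n subtracts,
    // relying on two's-complement wraparound of the unsigned add.
    func (s *sysMemStat) add(n int64) {
        atomic.AddUint64((*uint64)(s), uint64(n))
    }

    func main() {
        var gcMiscSys sysMemStat
        gcMiscSys.add(4096)
        gcMiscSys.add(-1024)
        fmt.Println(gcMiscSys.load()) // 3072
    }
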