|
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
When the GC is scanning some memory (possibly conservatively) and
finds a pointer while another goroutine is concurrently allocating
an object at the same address, the GC may see the pointer before
the object and/or the heap bits are initialized. This may cause the
GC to see bad pointers and possibly crash.
To prevent this, we ensure that the scanner can only see the object
as allocated after the object and the heap bits are initialized.
Currently the allocator uses freeindex to find the next available
slot, and that code is coupled with updating the free index to a
new slot past it. The scanner also uses the freeindex to determine
whether an object is allocated, which is somewhat racy. This CL
makes the scanner use a different field, which is only updated
after the object is initialized (and after a memory barrier).
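To illustrate, a standalone sketch of the publish-after-initialize idea
(names are illustrative; in the actual CL the new mspan field is called
freeIndexForScan):
    package main

    import "sync/atomic"

    // span is a toy stand-in for mspan: the allocator advances freeindex
    // as before, but the scanner only trusts scanIndex, which is published
    // with an atomic store (the memory barrier) after initialization.
    type span struct {
        freeindex uint64 // allocator's view: next slot to try
        scanIndex uint64 // scanner's view: slots below this are initialized
        slots     [64]uint64
    }

    func (s *span) alloc(val uint64) {
        i := s.freeindex
        s.freeindex++    // allocator bookkeeping may happen early
        s.slots[i] = val // initialize the object (and, in the runtime, heap bits)
        atomic.StoreUint64(&s.scanIndex, i+1) // publish only after initialization
    }

    // isAllocated is what a scanner would use: it never observes a slot
    // as allocated before the slot's contents are visible.
    func (s *span) isAllocated(i uint64) bool {
        return i < atomic.LoadUint64(&s.scanIndex)
    }

    func main() {
        s := new(span)
        s.alloc(42)
        _ = s.isAllocated(0)
    }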
Fixes #54596.
Change-Id: I2a57a226369926e7192c253dd0d21d3faf22297c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449017
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts commit bed2b7cf41471e1521af5a83ae28bd643eb3e038.
Reason for revert: I clicked submit by accident on the wrong CL.
Change-Id: Iddf128cb62f289d472510eb30466e515068271b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/449501
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
When the GC is scanning some memory (possibly conservatively) and
finds a pointer while another goroutine is concurrently allocating
an object at the same address, the GC may see the pointer before
the object and/or the heap bits are initialized. This may cause the
GC to see bad pointers and possibly crash.
To prevent this, we ensure that the scanner can only see the object
as allocated after the object and the heap bits are initialized. As
the scanner uses the freeindex to determine whether an object is
allocated, we delay the increment of freeindex until after the
initialization.
Currently, finding the next free index and updating the free index
to a slot past it are coupled in some code paths, so this needs a
small refactoring. In the new code, mspan.nextFreeIndex returns the
next free index but does not update it (although allocCache is
updated); mallocgc updates the free index at a later time.
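A standalone sketch of the decoupling (illustrative, not the runtime's
exact code; in the runtime, allocCache's set bits mark free slots
relative to freeindex):
    package main

    import "math/bits"

    type span struct {
        allocCache uint64  // 1 bits mark free slots, relative to freeindex
        freeindex  uintptr // slots below this are known-allocated
    }

    // nextFreeIndex finds the next free slot at or after freeindex but,
    // unlike before, does not advance freeindex itself.
    func (s *span) nextFreeIndex() uintptr {
        return s.freeindex + uintptr(bits.TrailingZeros64(s.allocCache))
    }

    func main() {
        s := &span{allocCache: ^uint64(0)}
        i := s.nextFreeIndex()
        // ... initialize the object and heap bits for slot i ...
        s.freeindex = i + 1 // mallocgc advances freeindex only afterwards
    }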
Fixes #54596.
Change-Id: I6dd5ccf743f2d2c46a1ed67c6a8237fe09a71260
Reviewed-on: https://go-review.googlesource.com/c/go/+/427619
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Replace all uses of Ctz64/32/8 with TrailingZeros64/32/8, because they
are identical and would otherwise be duplicated. Also rename the CtzXX
functions in 386 assembly code.
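For reference, the shared semantics, shown via the public math/bits
equivalents (the runtime uses its internal copies):
    package main

    import (
        "fmt"
        "math/bits"
    )

    func main() {
        // TrailingZeros64 counts trailing zero bits, i.e. the index of
        // the lowest set bit - exactly what Ctz64 computed.
        fmt.Println(bits.TrailingZeros64(8)) // 3
        fmt.Println(bits.TrailingZeros64(1)) // 0
        fmt.Println(bits.TrailingZeros64(0)) // 64
    }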
Change-Id: I19290204858083750f4be589bb0923393950ae6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/438935
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
This appears to have been left over from a C cast during the rewrite of
malloc into Go in https://golang.org/cl/108840046.
Change-Id: I88f212089c2bcf79d5881b3e8bf3f94f343331d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/443235
Run-TryBot: Dan Kortschak <dan@kortschak.io>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This change adds an API to the runtime for arenas. A later CL can
potentially export it as an experimental API, but for now, just the
runtime implementation will suffice.
The purpose of arenas is to improve efficiency, primarily by allowing
for an application to manually free memory, thereby delaying garbage
collection. It comes with other potential performance benefits, such as
better locality, a better allocation strategy, and better handling of
interior pointers by the GC.
This implementation is based on one by danscales@google.com with a few
significant differences:
* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB or the heap arena size. This
choice is made because in practice 64 MiB appears to be far too large
an area for most real-world use cases.
* Arena chunks are not unmapped; instead, they're placed on an
evacuation list, and once there are no pointers left pointing into
them, they're allowed to be reused.
* Reusing partially-used arena chunks no longer tries to find one used
by the same P first; it just takes the first one available.
* In order to ensure worst-case fragmentation is never worse than 25%,
only types and slice backing stores whose size is at most 1/4 of a
chunk may be used. Previously, larger sizes, up to the size
of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Arena chunks whose faulting was deferred are set to fault at the end
of mark termination (a non-public patch once did this; I don't see a
reason not to continue that).
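For illustration, a sketch of how such an API could look once exported
(the names follow the experimental arena package that later shipped
behind GOEXPERIMENT=arenas; this CL itself exports nothing):
    package main

    import "arena" // requires GOEXPERIMENT=arenas in later releases

    func main() {
        a := arena.NewArena()
        // Allocations come from the arena's chunks; per the bound above,
        // each allocation may be at most 1/4 of a chunk.
        ints := arena.MakeSlice[int](a, 0, 1024)
        _ = ints
        // Manually free everything in the arena, delaying GC involvement.
        a.Free()
    }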
For #51317.
Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
This change makes (*mheap).sysAlloc take an explicit list of hints and a
boolean indicating whether any newly-created heapArenas should be
registered in the full arena list.
This is a refactoring in preparation for arenas.
For #51317.
Change-Id: I0584a033fce3fcb60c5d0bc033d5fb8bd23b2378
Reviewed-on: https://go-review.googlesource.com/c/go/+/432078
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
No-op change in preparation for arenas.
For #51317.
Change-Id: I0777f21763fcd34957b7e709580cf2b7a962ba67
Reviewed-on: https://go-review.googlesource.com/c/go/+/423365
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This was left over from the old pacer, and never removed when the old
pacer was removed in Go 1.19.
Change-Id: I79e5f0420c6100c66bd06129a68f5bbab7c1ea8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/429256
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Updates #46731
Change-Id: Ic2208c8bb639aa1e390be0d62e2bd799ecf20654
Reviewed-on: https://go-review.googlesource.com/c/go/+/421878
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
[this is a retry of CL 407035 + its revert CL 422395. The content is unchanged]
Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.
Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.
As a follow-on CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.
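To see why checking 64 ptr/nonptr bits at once suffices, a standalone
sketch of scanning with a 1-bit-per-word bitmap:
    package main

    import (
        "fmt"
        "math/bits"
    )

    // scanWords visits each pointer word indicated by a 64-bit chunk of
    // the ptr/nonptr bitmap, using trailing-zeros to skip runs of
    // non-pointer words and clearing the lowest set bit each step.
    func scanWords(ptrBits uint64, visit func(word int)) {
        for ptrBits != 0 {
            visit(bits.TrailingZeros64(ptrBits))
            ptrBits &= ptrBits - 1 // clear the lowest set bit
        }
    }

    func main() {
        scanWords(0b10110, func(w int) { fmt.Println("pointer at word", w) })
    }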
Change-Id: Ica5eb622f5b87e647be64f471d67b02732ef8be6
Reviewed-on: https://go-review.googlesource.com/c/go/+/422634
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
This reverts commit b589208c8cc6e08239868f47e12c1449cd797bac.
Reason for revert: Bug somewhere in this code, causing wasm and maybe linux/386 to fail.
Change-Id: I5e1e501d839584e0219271bb937e94348f83c11f
Reviewed-on: https://go-review.googlesource.com/c/go/+/422395
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.
Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.
This cleans up the bitmap implementation significantly, which will
hopefully make it faster. TODO: measure
As a follow-on CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.
Change-Id: I18151b1876d9543599800dec51e2a1b19df97d49
Reviewed-on: https://go-review.googlesource.com/c/go/+/407035
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
We're missing lock edges to finlock that happen only rarely. Anything
that calls mallocgc can potentially trigger sweeping, which can
potentially queue a finalizer, which acquires finlock. While this can
happen on any malloc, it happens relatively rarely, so we simply
haven't seen some of the lock edges that could happen.
Add a mayAcquire annotation to mallocgc to capture the possibility of
acquiring finlock.
With this change, we add "fin" to the set of "malloc" locks. Several
of these edges were already there, but not quite all of them.
This was found by inspecting the rank graph for things that didn't
make sense.
For #53789.
Change-Id: Idc10ce6f250596b0c07ba07ac93f2198fb38c22b
Reviewed-on: https://go-review.googlesource.com/c/go/+/418717
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Rename g variables to gp for consistency.
Change-Id: I09ecdc7e8439637bc0e32f9c5f96f515e6436362
Reviewed-on: https://go-review.googlesource.com/c/go/+/418591
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
|
|
Change-Id: I060c867d89a06b5a44fbe77804c19299385802d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/311250
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
|
|
The profiles for memory allocations, sync.Mutex contention, and general
blocking store their data in a shared hash table. The bookkeeping work
at the end of a garbage collection cycle involves maintenance on each
memory allocation record. Previously, a single lock guarded access to
the hash table and the contents of all records. When a program has
allocated memory at a large number of unique call stacks, the
maintenance following every garbage collection can hold that lock for
several milliseconds. That can prevent progress on all other goroutines
by delaying acquirep's call to mcache.prepareForSweep, which needs the
lock in mProf_Free to report when a profiled allocation is no longer in
use. With no user goroutines making progress, it is in effect a
multi-millisecond GC-related stop-the-world pause.
Split the lock so the call to mProf_Flush no longer delays each P's call
to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle,
and mProf_Flush uses locks on the memory records' accumulator and their
N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it
uses a lock on the memory records' N+2 cycle. The profiles for
sync.Mutex contention and general blocking now share a separate lock,
and another lock guards insertions to the shared hash table (uncommon in
the steady-state). Consumers of each type of profile take the matching
accumulator lock, so they will observe consistent count and magnitude values
for each record.
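A standalone sketch of the resulting lock structure (names are
illustrative, not the runtime's):
    package main

    import "sync"

    // Splitting the single profile lock so mProf_Flush (GC bookkeeping)
    // no longer blocks mProf_Free (sweep) or mProf_Malloc (allocation):
    type profLocks struct {
        insert    sync.Mutex    // insertions into the shared hash table (uncommon)
        blockSync sync.Mutex    // blocking and sync.Mutex contention profiles
        memActive sync.Mutex    // memory records' accumulator + N cycle (flush, readers)
        memFuture [3]sync.Mutex // future cycles: N+1 (free) and N+2 (malloc)
    }

    func main() {
        var l profLocks
        l.memFuture[1].Lock() // e.g. mProf_Free touching the N+1 cycle
        l.memFuture[1].Unlock()
    }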
For #45894
Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa
Reviewed-on: https://go-review.googlesource.com/c/go/+/399956
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.
This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via
gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go
For #48409.
Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The inconsistent heap stats in memstats are a bit messy. Primarily,
heap_sys is non-orthogonal with heap_released and heap_inuse. In later
CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean
this up by replacing heap_sys with an orthogonal metric: heapFree.
heapFree represents page heap memory that is free but not released.
I think this change also simplifies a lot of reasoning about these
stats; it's much clearer what they mean, and to obtain HeapSys for
memstats, we no longer need to do the strange subtraction from heap_sys
when allocating specifically non-heap memory from the page heap.
Because we're removing heap_sys, we need to replace it with a sysMemStat
for mem.go functions. In this case, heap_released is the most
appropriate because we increase it anyway (again, non-orthogonality).
Given that, it makes sense for heap_inuse, heap_released, and heapFree
to become more uniform, and to just represent them all as sysMemStats.
While we're here and messing with the types of heap_inuse and
heap_released, let's also fix their names (and last_heap_inuse's name)
up to the more modern Go convention of camelCase.
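A worked example of the resulting identity (figures are hypothetical):
with heapFree in place, HeapSys is just the sum of the three orthogonal
stats rather than a special-cased subtraction:
    package main

    import "fmt"

    func main() {
        // Hypothetical byte counts for the three orthogonal page-heap states.
        var heapInUse, heapFree, heapReleased uint64 = 48 << 20, 8 << 20, 8 << 20
        heapSys := heapInUse + heapFree + heapReleased
        fmt.Println(heapSys == 64<<20) // true
    }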
For #48409.
Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b
Reviewed-on: https://go-review.googlesource.com/c/go/+/397677
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with one documented
exception): all memory mapped read/write that has either been paged in
or soon will be.
To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *three* exceptions.
Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.
Let's talk about the exceptions. The first is that the arenas array,
used for finding heap arena metadata from an address, is mapped as read/write in
one large chunk. It's tens of MiB in size. On systems with demand
paging, we assume that the whole thing isn't paged in at once (after
all, it maps to the whole address space, and it's exceedingly difficult
with today's technology to even approach having as much physical memory as
the total address space). On systems where we have to commit memory
manually, we use a two-level structure.
Now, the reason why this is an exception is because we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't track it, and call sysAllocOS to map this memory.
The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, we also drop down to a lower level and call
sysFreeOS to explicitly avoid accounting.
The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we include it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.
As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes sysMemStats mandatory to facilitate better accounting in the
future. It's now much easier to grep and find out where accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.
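A standalone sketch of the resulting layering (names are close to, but
not exactly, the runtime's mem.go): accounting lives in the portable
wrapper, and skipping it requires an explicit *OS call:
    package main

    import (
        "sync/atomic"
        "unsafe"
    )

    type sysMemStat uint64

    func (s *sysMemStat) add(n int64) { atomic.AddUint64((*uint64)(s), uint64(n)) }

    var mappedReady sysMemStat // total memory in the Ready state

    // sysAllocOS stands in for the OS-specific mapping layer; calling it
    // directly is the grep-able way to elide accounting.
    func sysAllocOS(n uintptr) unsafe.Pointer { return nil }

    // sysAlloc now requires a stat: every mapping is accounted somewhere.
    func sysAlloc(n uintptr, sysStat *sysMemStat) unsafe.Pointer {
        sysStat.add(int64(n))
        mappedReady.add(int64(n))
        return sysAllocOS(n)
    }

    func main() {
        var otherSys sysMemStat
        _ = sysAlloc(4096, &otherSys)
    }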
For #48409.
Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
|
|
A future change to gofmt will rewrite
// Doc comment.
//go:foo
to
// Doc comment.
//
//go:foo
Apply that change preemptively to all comments (not necessarily just doc comments).
For #51082.
Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change lifts all non-platform-specific code out of sys* functions
for each platform up into wrappers, and moves documentation about the OS
virtual memory abstraction layer from malloc.go to mem.go, which
contains those wrappers.
Change-Id: Ie803e4447403eaafc508b34b53a1a47d6cee9388
Reviewed-on: https://go-review.googlesource.com/c/go/+/393398
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
|
|
Currently, the ReadMemStats (really this is all happening in
readmemstats_m, but that's just a direct call from ReadMemStats) call
chain first populates some fields in memstats, then copies those into
the final MemStats location. This used to make a lot of sense when
memstats' structure aligned with MemStats, and the values were just
copied from one to the other. Sometime in the last few releases, we switched
to populating the MemStats manually because a lot of fields had diverged
from their internal representation. Now, we're left with a lot of fields
in memstats that pollute the structure: they only exist to be updated
for the sake of ReadMemStats. Since we're going to be adding more fields
to memstats in further CLs, this is a good opportunity to clean up.
As a result of this change, updatememstats, which used to just update
the aforementioned intermediate fields in memstats, is no longer
necessary, so it is removed.
Change-Id: Ifabfb3ac3002641105af62e9509a6351165dcd87
Reviewed-on: https://go-review.googlesource.com/c/go/+/393397
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
In iOS <14, the address space is strictly limited to 8 GiB, or 33 bits.
As a result, the page allocator also assumes all heap memory lives in
this region. This is especially necessary because the page allocator has
a PROT_NONE mapping proportional to the size of the usable address
space, so this keeps that mapping very small.
However, starting with iOS 14, this restriction is relaxed, and mmap may
start returning addresses outside of the <14 range. Today this means
that in iOS 14 and later, users experience an error in the page
allocator when a heap arena is mapped outside of the old range.
This change increases the ios/arm64 heapAddrBits to 40 while
simultaneously making ios/arm64 use the 64-bit pagealloc implementation
(with reservations and incremental mapping) to accommodate both iOS
versions <14 and 14+.
Once iOS <14 is deprecated, we can remove these exceptions and treat
ios/arm64 like any other arm64 platform.
This change also makes the BaseChunkIdx expression a little bit easier
to read, while we're here.
Fixes #46860.
Change-Id: I13865f799777739109585f14f1cc49d6d57e096b
Reviewed-on: https://go-review.googlesource.com/c/go/+/344401
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Add explicit address sanitizer instrumentation to the runtime and
syscall packages. The compiler does not instrument the runtime
package. It does instrument the syscall package, but we need to add
a couple of cases that it can't see.
Following the implementation of the asan malloc runtime library, this
patch also allocates extra memory as a redzone around the returned
memory region, and marks the redzone as unaddressable to detect
overflows and underflows.
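A standalone sketch of the redzone layout (sizes are illustrative; in
the runtime the poisoning is done via the asan runtime hooks):
    package main

    import "fmt"

    // allocSize returns how much to request so a userSize-byte object can
    // be followed by an unaddressable redzone; accesses running off the
    // object's end land in poisoned memory and trip ASAN.
    func allocSize(userSize, redzone uintptr) uintptr {
        return userSize + redzone
    }

    func main() {
        const userSize, redzone = 24, 16
        total := allocSize(userSize, redzone)
        fmt.Printf("request %d bytes: [0,%d) addressable, [%d,%d) poisoned\n",
            total, userSize, userSize, total)
    }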
Updates #44853.
Change-Id: I2753d1cc1296935a66bf521e31ce91e35fcdf798
Reviewed-on: https://go-review.googlesource.com/c/go/+/298614
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>
|
|
Currently, the runtime zeroes allocations in several ways. First, small
object spans are always zeroed if they come from mheap, and their slots
are zeroed later in mallocgc if needed. Second, large object spans
(objects that have their own spans) plumb the need for zeroing down into
mheap. Third, large objects that have no pointers have their zeroing
delayed until after preemption is reenabled, but before returning in
mallocgc.
All of this has two consequences:
1. Spans for small objects that come from mheap are sometimes
unnecessarily zeroed, even if the mallocgc call that created them
doesn't need the object slot to be zeroed.
2. This is all messy and difficult to reason about.
This CL simplifies this code, resolving both (1) and (2). First, it
recognizes that zeroing in mheap is unnecessary for small object spans;
mallocgc and its callees in mcache and mcentral, by design, are *always*
able to deal with non-zeroed spans. They must, for they deal with
recycled spans all the time. Once this fact is made clear, the only
remaining use of zeroing in mheap is for large objects.
As a result, this CL lifts mheap zeroing for large objects into
mallocgc, to parallel all the other codepaths in mallocgc. This makes
the large object allocation code less surprising.
Next, this CL sets the flag for the delayed zeroing explicitly in the one
case where it matters, and inverts and renames the flag from isZeroed to
delayZeroing.
Finally, it adds a check to make sure that only pointer-free allocations
take the delayed zeroing codepath, as an extra safety measure.
Benchmark results: https://perf.golang.org/search?q=upload:20211028.8
Inspired by tapir.liu@gmail.com's CL 343470.
Change-Id: I7e1296adc19ce8a02c8d93a0a5082aefb2673e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359477
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
fastrandn is ~50% faster than fastrand() % n.
`ack -v 'fastrand\(\)\s?\%'` finds all modulo on fastrand()
name old time/op new time/op delta
Fastrandn/2 2.86ns ± 0% 1.59ns ± 0% -44.35% (p=0.000 n=9+10)
Fastrandn/3 2.87ns ± 1% 1.59ns ± 0% -44.41% (p=0.000 n=10+9)
Fastrandn/4 2.87ns ± 1% 1.58ns ± 1% -45.10% (p=0.000 n=10+10)
Fastrandn/5 2.86ns ± 1% 1.58ns ± 1% -44.84% (p=0.000 n=10+10)
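For reference, the reduction fastrandn uses is the multiply-shift trick
(a standalone sketch, with math/rand standing in for the runtime's
fastrand):
    package main

    import (
        "fmt"
        "math/rand"
    )

    // fastrandn maps a uniform 32-bit x into [0, n) via the high 32 bits
    // of x*n - no divide, unlike x % n. The tiny bias for n not a power
    // of two is fine for the runtime's internal uses.
    func fastrandn(x, n uint32) uint32 {
        return uint32(uint64(x) * uint64(n) >> 32)
    }

    func main() {
        fmt.Println(fastrandn(rand.Uint32(), 5)) // always in [0, 5)
    }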
Change-Id: Ic91f5ca9b9e3b65127bc34792b62fd64fbd13b5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/353269
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Since all callers of getMCache appear to have mp available, we pass mp
to getMCache, saving one call to getg. After this modification,
getMCache is also inlined.
Change-Id: Ib7880c118336acc026ecd7c60c5a88722c3ddfc7
Reviewed-on: https://go-review.googlesource.com/c/go/+/349329
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Carlos Amedee <carlos@golang.org>
|
|
Merge List:
+ 2021-07-22 798ec73519 runtime: don't clear timerModifiedEarliest if adjustTimers is 0
+ 2021-07-22 fdb45acd1f runtime: move mem profile sampling into m-acquired section
+ 2021-07-21 3e48c0381f reflect: add missing copyright header
+ 2021-07-21 48c88f1b1b reflect: add Value.CanConvert
+ 2021-07-20 9e26569293 cmd/go: don't add C compiler ID to hash for standard library
+ 2021-07-20 d568e6e075 runtime/debug: skip TestPanicOnFault on netbsd/arm
Change-Id: I87e1cd4614bb3b00807f18dfdd02664dcaecaebd
|
|
It was not safe to do mcache profiling updates outside the critical
section, but we got lucky because the runtime was not preemptible.
Adding chunked memory clearing (CL 270943) created preemption
opportunities, which led to corruption of runtime data structures.
Fixes #47304.
Fixes #47302.
Change-Id: I461615470d62328a83ccbac537fbdc6dcde81c85
Reviewed-on: https://go-review.googlesource.com/c/go/+/336449
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
versions [generated]
[git-generate]
cd src/runtime
gofmt -w -r "sys.Goarch386 -> goarch.Is386" .
gofmt -w -r "sys.GoarchAmd64 -> goarch.IsAmd64" .
gofmt -w -r "sys.GoarchAmd64p32 -> goarch.IsAmd64p32" .
gofmt -w -r "sys.GoarchArm -> goarch.IsArm" .
gofmt -w -r "sys.GoarchArmbe -> goarch.IsArmbe" .
gofmt -w -r "sys.GoarchArm64 -> goarch.IsArm64" .
gofmt -w -r "sys.GoarchArm64be -> goarch.IsArm64be" .
gofmt -w -r "sys.GoarchPpc64 -> goarch.IsPpc64" .
gofmt -w -r "sys.GoarchPpc64le -> goarch.IsPpc64le" .
gofmt -w -r "sys.GoarchMips -> goarch.IsMips" .
gofmt -w -r "sys.GoarchMipsle -> goarch.IsMipsle" .
gofmt -w -r "sys.GoarchMips64 -> goarch.IsMips64" .
gofmt -w -r "sys.GoarchMips64le -> goarch.IsMips64le" .
gofmt -w -r "sys.GoarchMips64p32 -> goarch.IsMips64p32" .
gofmt -w -r "sys.GoarchMips64p32le -> goarch.IsMips64p32le" .
gofmt -w -r "sys.GoarchPpc -> goarch.IsPpc" .
gofmt -w -r "sys.GoarchRiscv -> goarch.IsRiscv" .
gofmt -w -r "sys.GoarchRiscv64 -> goarch.IsRiscv64" .
gofmt -w -r "sys.GoarchS390 -> goarch.IsS390" .
gofmt -w -r "sys.GoarchS390x -> goarch.IsS390x" .
gofmt -w -r "sys.GoarchSparc -> goarch.IsSparc" .
gofmt -w -r "sys.GoarchSparc64 -> goarch.IsSparc64" .
gofmt -w -r "sys.GoarchWasm -> goarch.IsWasm" .
goimports -w *.go
Change-Id: I9d88e1284efabaeb0ee3733cba6286247d078c85
Reviewed-on: https://go-review.googlesource.com/c/go/+/328345
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
versions [generated]
[git-generate]
cd src/runtime
gofmt -w -r "sys.GoosAix -> goos.IsAix" .
gofmt -w -r "sys.GoosAndroid -> goos.IsAndroid" .
gofmt -w -r "sys.GoosDarwin -> goos.IsDarwin" .
gofmt -w -r "sys.GoosDragonfly -> goos.IsDragonfly" .
gofmt -w -r "sys.GoosFreebsd -> goos.IsFreebsd" .
gofmt -w -r "sys.GoosHurd -> goos.IsHurd" .
gofmt -w -r "sys.GoosIllumos -> goos.IsIllumos" .
gofmt -w -r "sys.GoosIos -> goos.IsIos" .
gofmt -w -r "sys.GoosJs -> goos.IsJs" .
gofmt -w -r "sys.GoosLinux -> goos.IsLinux" .
gofmt -w -r "sys.GoosNacl -> goos.IsNacl" .
gofmt -w -r "sys.GoosNetbsd -> goos.IsNetbsd" .
gofmt -w -r "sys.GoosOpenbsd -> goos.IsOpenbsd" .
gofmt -w -r "sys.GoosPlan9 -> goos.IsPlan9" .
gofmt -w -r "sys.GoosSolaris -> goos.IsSolaris" .
gofmt -w -r "sys.GoosWindows -> goos.IsWindows" .
gofmt -w -r "sys.GoosZos -> goos.IsZos" .
goimports -w *.go
Change-Id: I42bed2907317ed409812e5a3e2897c88a5d36f24
Reviewed-on: https://go-review.googlesource.com/c/go/+/328344
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
[git-generate]
cd src/runtime
goimports -w *.go
Change-Id: I1387af0f2fd1a213dc2f4c122e83a8db0fcb15f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/329189
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
internal/goarch.PtrSize [generated]
[git-generate]
cd src/runtime/internal/math
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
cd ../..
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
Change-Id: I43491cdd54d2e06d4d04152b3d213851b7d6d423
Reviewed-on: https://go-review.googlesource.com/c/go/+/328337
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Now that deferred functions are always argumentless and defer records
no longer carry arguments, defer records can be fixed-size (just the
_defer struct). This allows us to simplify the allocation of defer
records; specifically, we can remove the defer classes and the pools
of different-sized defers.
Change-Id: Icc4b16afc23b38262ca9dd1f7369ad40874cf701
Reviewed-on: https://go-review.googlesource.com/c/go/+/326062
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Change-Id: I558590cef7e2311aadbdcb4088033e350d3aae32
GitHub-Last-Rev: 513944a6238e0e32e2a2c266b70f7d50c9db508d
GitHub-Pull-Request: golang/go#46389
Reviewed-on: https://go-review.googlesource.com/c/go/+/322809
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
If something "huge" is allocated, and the zeroing is trivial (no
pointers involved), then zero it in chunks in a loop so that preemption
can occur, rather than all in a single non-preemptible call.
Benchmarking suggests that 256K is the best chunk size.
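A standalone sketch of the chunked clear (the runtime does this with
memclrNoHeapPointers plus a preemption check between chunks):
    package main

    const chunkBytes = 256 << 10 // 256 KiB, per the benchmarks above

    func clearChunked(b []byte) {
        for len(b) > 0 {
            n := len(b)
            if n > chunkBytes {
                n = chunkBytes
            }
            for i := range b[:n] { // stand-in for memclrNoHeapPointers
                b[i] = 0
            }
            b = b[n:]
            // In the runtime, preemption can occur here between chunks.
        }
    }

    func main() {
        clearChunked(make([]byte, 1<<20))
    }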
Updates #42642.
Change-Id: I94015e467eaa098c59870e479d6d83bc88efbfb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/270943
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Top align allocations in tinyalloc buckets when in race mode.
This will make checkptr checks more reliable, because any code
that modifies a pointer past the end of the object will trigger
a checkptr error.
No test, because we need -race for this to actually kick in. We could
add it to the race detector tests, but the race detector tests are all
geared towards race detector reports, not checkptr reports. Mucking
with parsing reports is more than a test is worth.
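A standalone sketch of the placement arithmetic (illustrative; the
runtime's tiny blocks are 16 bytes):
    package main

    import "fmt"

    const tinySize = 16

    // topAlignedOffset places a size-byte object so it ends at the end of
    // its tiny block (rounded down for alignment), so writing one byte
    // past the object leaves the block and checkptr can catch it.
    func topAlignedOffset(size, align uintptr) uintptr {
        return (tinySize - size) &^ (align - 1)
    }

    func main() {
        fmt.Println(topAlignedOffset(4, 4)) // 12: object occupies [12,16)
        fmt.Println(topAlignedOffset(5, 1)) // 11: object occupies [11,16)
    }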
Fixes #38872
Change-Id: Ie56f0fbd1a9385539f6631fd1ac40c3de5600154
Reviewed-on: https://go-review.googlesource.com/c/go/+/315029
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
iOS arm64 is a 64-bit platform but with a strictly 32-bit address space
(technically 33 bits, but the bottom half is unavailable to the
application). Since address space is limited, use 4 MiB arenas instead
of 64 MiB arenas. No changes are needed to the arena index because it's
still relatively small; this change just brings iOS more in line with
32-bit platforms.
Change-Id: I484e2d273d896fd0a57cd5c25012df0aef160290
Reviewed-on: https://go-review.googlesource.com/c/go/+/270538
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: Ia70e8bdc6d2cf1195d7a3b5d33f180ae2db73e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/305369
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Ben Shi <powerman1st@163.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
This change moves the call of sysMap from (*mheap).sysAlloc into
(*mheap).grow, so we only sysMap what we're going to use in the near
future (thanks to the curArena mechanism). The purpose of this change is
to better support systems with strict overcommit rules which generally
accept reserved memory but not prepared memory (see malloc.go for exact
descriptions of these states).
This move requires changing linearAlloc to only optionally map memory.
In one case, with mheap.heapArenaAlloc, we do want it to map memory. But
now in the other case, with mheap.arena, we don't, because we want grow
to take care of it.
The risk with this change is we may make more syscalls than before on
systems with 64 MiB arenas, but because heap growth is relatively rare
this is unlikely to be a noticeable issue. We also bound the number of
syscalls made by only extending curArena (and thus mapping) by
pallocChunkPages*pageSize which is 4 MiB.
Fixes #42612.
Change-Id: I736df696afe78ddb1a747a896caa0db8726027e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/270537
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Change-Id: I1dc355bcaeb0e5fb06a7fddc4cf5db596d22e0b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/236148
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This change moves the responsibility of throwing if an mcache is not
available to the caller, because the inlining cost of throw is set very
high in the compiler. Even if it was reduced down to the cost of a usual
function call, it would still be too expensive, so just move it out.
This choice also makes sense in the context of #42339 since we're going
to have to handle the case where we don't have an mcache to update stats
in a few contexts anyhow.
Also, add getMCache to the list of functions that should be inlined to
prevent future regressions.
getMCache is called on the allocation fast path, and because it's not
inlined, it actually causes a significant regression (~10%) in some
microbenchmarks.
Fixes #42305.
Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Some functions that required holding the heap lock _or_ world stop have
been simplified to simply requiring the heap lock. This is conceptually
simpler and taking the heap lock during world stop is guaranteed to not
contend. This was only done on functions already called on the
systemstack to avoid too many extra systemstack calls in GC.
Updates #40677
Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/250262
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
|
|
Currently, on all supported platforms, the race detector (LLVM
TSAN) expects the Go heap to be at 0xc000000000 - 0xe000000000.
Move the raceenabled condition first, so we always allocate
there.
This means that on Linux/ARM64, when the race detector is on, we will
allocate at 0xc000000000 - 0xe000000000 instead of 0x4000000000.
The old address is meant for 39-bit VMA. But the race detector
only supports 48-bit VMA anyway. So this is fine.
Change-Id: I51ac8eff68297b37c8c651a93145cc94f83a939d
Reviewed-on: https://go-review.googlesource.com/c/go/+/266372
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This change moves the mcache-local malloc stats into the
consistentHeapStats structure so the malloc stats can be managed
consistently with the memory stats. The one exception here is
tinyAllocs, for which moving into the global stats would incur several
atomic writes on the fast path; microbenchmarks for just one CPU core
have shown a 50% loss in throughput. Since the tiny allocation count
isn't exposed anyway and is always blindly added to both allocs and
frees, let that stay inconsistent and flush the tiny allocation count
every so often.
Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/247039
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change adds a function getMCache which returns the current P's
mcache if it's available, and otherwise tries to get mcache0 if we're
bootstrapping. This function will come in handy as we need to replicate
this behavior in multiple places in future changes.
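A standalone sketch of the lookup (stub types; the real function reads
the P from the current g):
    package main

    type mcache struct{}
    type p struct{ mcache *mcache }

    var mcache0 = &mcache{} // bootstrap mcache, used before any P exists

    // getMCache returns the current P's mcache, falling back to mcache0
    // during bootstrapping when no P has been set up yet.
    func getMCache(pp *p) *mcache {
        if pp == nil {
            return mcache0
        }
        return pp.mcache
    }

    func main() {
        _ = getMCache(nil) // bootstrap path
    }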
Change-Id: I536073d6f6dc6c6390269e613ead9f8bcb6e7f98
Reviewed-on: https://go-review.googlesource.com/c/go/+/246976
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change breaks apart gc_sys into three distinct pieces. Two of
those pieces come from heap_sys, since they're allocated from the page
heap. The rest comes from memory mapped via e.g. persistentalloc, which
better fits the purpose of a sysMemStat. Also,
rename gc_sys to gcMiscSys.
Change-Id: I098789170052511e7b31edbcdc9a53e5c24573f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246973
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change modifies the type of several mstats fields to be a new type:
sysMemStat. This type has the same representation as the fields had before.
The purpose of this change is to make it very clear which stats may be
used in various functions for accounting (usually the platform-specific
sys* functions, but there are others). Currently there's an implicit
understanding that the *uint64 value passed to these functions is some
kind of statistic whose value is atomically managed. This understanding
isn't inherently problematic, but we're about to change how some stats
(which currently use mSysStatInc and mSysStatDec) work, so we want to
make it very clear what the various requirements are around "sysStat".
This change also removes mSysStatInc and mSysStatDec in favor of a
method on sysMemStat. Note that those two functions were originally
written the way they were because atomic 64-bit adds required a valid G
on ARM, but this hasn't been the case for a very long time (since
golang.org/cl/14204, but even before then it wasn't clear if mutexes
required a valid G anymore). Today we implement 64-bit adds on ARM with
a spinlock table.
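A standalone sketch of the type (close to, but not exactly, the
runtime's):
    package main

    import "sync/atomic"

    // sysMemStat makes it explicit which stats are atomically-managed
    // accounting values; its method replaces mSysStatInc/mSysStatDec.
    type sysMemStat uint64

    func (s *sysMemStat) load() uint64 { return atomic.LoadUint64((*uint64)(s)) }

    // add accepts negative deltas; this sketch panics on underflow where
    // the runtime would throw.
    func (s *sysMemStat) add(n int64) {
        if val := atomic.AddUint64((*uint64)(s), uint64(n)); int64(val) < 0 {
            panic("sysMemStat underflow")
        }
    }

    func main() {
        var stat sysMemStat
        stat.add(4096)
        stat.add(-4096)
        _ = stat.load()
    }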
Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|