path: root/src/runtime/malloc.go
2016-04-20  runtime: simplify mallocgc flag argument  (Keith Randall)
mallocgc can calculate noscan itself. The only remaining flag argument is needzero, so we just make that a boolean arg. Fixes #15379 Change-Id: I839a70790b2a0c9dbcee2600052bfbd6c8148e20 Reviewed-on: https://go-review.googlesource.com/22290 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-19  runtime: add maxSliceCap  (Josh Bleecher Snyder)
This avoids expensive division calculations for many common slice element sizes.

name                      old time/op  new time/op  delta
MakeSlice-8               51.9ns ± 3%  35.1ns ± 2%  -32.41%  (p=0.000 n=10+10)
GrowSliceBytes-8          44.1ns ± 2%  44.1ns ± 1%  ~        (p=0.984 n=10+10)
GrowSliceInts-8           60.9ns ± 3%  60.9ns ± 3%  ~        (p=0.698 n=10+10)
GrowSlicePtr-8            131ns ± 1%   120ns ± 2%   -8.41%   (p=0.000 n=8+10)
GrowSliceStruct24Bytes-8  111ns ± 2%   103ns ± 3%   -7.23%   (p=0.000 n=8+8)

Change-Id: I2630eb3d73c814db030cad16e620ea7fecbbd312
Reviewed-on: https://go-review.googlesource.com/22223
Reviewed-by: Keith Randall <khr@golang.org>
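The trick, as a standalone sketch (the table length and the maxAlloc value here are illustrative assumptions, not the runtime's exact figures): precompute the maximum capacity for small element sizes so the common path is a table lookup, and divide only for sizes past the end of the table.

    package main

    import "fmt"

    const maxAlloc = 1 << 47 // assumption: an illustrative allocation cap

    // maxElems[i] is the maximum number of i-byte elements that fit in
    // maxAlloc bytes. Index 0 (zero-size elements) is effectively unbounded.
    var maxElems = [...]uintptr{
        ^uintptr(0),
        maxAlloc / 1, maxAlloc / 2, maxAlloc / 3, maxAlloc / 4,
        maxAlloc / 5, maxAlloc / 6, maxAlloc / 7, maxAlloc / 8,
    }

    // maxSliceCap answers common element sizes with a table lookup and
    // falls back to the division only for large element sizes.
    func maxSliceCap(elemsize uintptr) uintptr {
        if elemsize < uintptr(len(maxElems)) {
            return maxElems[elemsize]
        }
        return maxAlloc / elemsize
    }

    func main() {
        fmt.Println(maxSliceCap(1), maxSliceCap(8), maxSliceCap(24))
    }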
2016-04-10  runtime: make execution error panic values implement the Error interface  (Emmanuel Odeke)
Make execution panics implement Error as mandated by https://golang.org/ref/spec#Run_time_panics, instead of panics with strings. Fixes #14965 Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a Reviewed-on: https://go-review.googlesource.com/21214 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
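In practice this means a recovered execution panic can be handled through the standard error interfaces. A small, runnable illustration using only the public API:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        defer func() {
            r := recover()
            // After this change, execution panics carry a value that
            // implements runtime.Error (and hence error), not a bare string.
            if err, ok := r.(runtime.Error); ok {
                fmt.Println("recovered runtime error:", err.Error())
            }
        }()
        var s []int
        _ = s[42] // out-of-range index: panics with a runtime.Error
    }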
2016-03-07  runtime: eliminate unnecessary type conversions  (Matthew Dempsky)
Automated refactoring produced using github.com/mdempsky/unconvert. Change-Id: Iacf871a4f221ef17f48999a464ab2858b2bbaa90 Reviewed-on: https://go-review.googlesource.com/20071 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02  all: single space after period.  (Brad Fitzpatrick)
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedent. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-01-27  runtime: fix upper bound on out-of-memory print  (Russ Cox)
It's possible for arena_start+MaxArena32 to wrap. We do the right thing in the bounds check but not in the print. For #13992 (to fix the print there, not the bug). Change-Id: I4df845d0c03f0f35461b128e4f6765d3ccb71c6d Reviewed-on: https://go-review.googlesource.com/18975 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2016-01-18  runtime: print address as hex in messages  (Shenghou Ma)
Change-Id: I7ccf1b5001d77c4390479f53c0137ab02f98595b Reviewed-on: https://go-review.googlesource.com/18685 Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-12-15  runtime: fix (sometimes major) underestimation of heap_live  (Austin Clements)
Currently, we update memstats.heap_live from mcache.local_cachealloc whenever we lock the heap (e.g., to obtain a fresh span or to release an unused span). However, under the right circumstances, local_cachealloc can accumulate allocations up to the size of the *entire heap* without flushing them to heap_live. Specifically, since span allocations from an mcentral don't lock the heap, if a large number of pages are held in an mcentral and the application continues to use and free objects of that size class (e.g., the BinaryTree17 benchmark), local_cachealloc won't be flushed until the mcentral runs out of spans.

This is a problem because, unlike many of the memory statistics that are purely informative, heap_live is used to determine when the garbage collector should start and how hard it should work.

This commit eliminates local_cachealloc, instead atomically updating heap_live directly. To control contention, we do this only when obtaining a span from an mcentral. Furthermore, we make heap_live conservative: allocating a span assumes that all free slots in that span will be used and accounts for these when the span is allocated, *before* the objects themselves are. This is important because 1) this triggers the GC earlier than necessary rather than potentially too late and 2) this leads to a conservative GC rate rather than a GC rate that is potentially too low.

Alternatively, we could have flushed local_cachealloc when it passed some threshold, but this would require determining a threshold and would cause heap_live to underestimate the true value rather than overestimate.

Fixes #12199.

name                      old time/op  new time/op  delta
BinaryTree17-12           2.88s ± 4%   2.88s ± 1%   ~        (p=0.470 n=19+19)
Fannkuch11-12             2.48s ± 1%   2.48s ± 1%   ~        (p=0.243 n=16+19)
FmtFprintfEmpty-12        50.9ns ± 2%  50.7ns ± 1%  ~        (p=0.238 n=15+14)
FmtFprintfString-12       175ns ± 1%   171ns ± 1%   -2.48%   (p=0.000 n=18+18)
FmtFprintfInt-12          159ns ± 1%   158ns ± 1%   -0.78%   (p=0.000 n=19+18)
FmtFprintfIntInt-12       270ns ± 1%   265ns ± 2%   -1.67%   (p=0.000 n=18+18)
FmtFprintfPrefixedInt-12  235ns ± 1%   234ns ± 0%   ~        (p=0.362 n=18+19)
FmtFprintfFloat-12        309ns ± 1%   308ns ± 1%   -0.41%   (p=0.001 n=18+19)
FmtManyArgs-12            1.10µs ± 1%  1.08µs ± 0%  -1.96%   (p=0.000 n=19+18)
GobDecode-12              7.81ms ± 1%  7.80ms ± 1%  ~        (p=0.425 n=18+19)
GobEncode-12              6.53ms ± 1%  6.53ms ± 1%  ~        (p=0.817 n=19+19)
Gzip-12                   312ms ± 1%   312ms ± 2%   ~        (p=0.967 n=19+20)
Gunzip-12                 42.0ms ± 1%  41.9ms ± 1%  ~        (p=0.172 n=19+19)
HTTPClientServer-12       63.7µs ± 1%  63.8µs ± 1%  ~        (p=0.639 n=19+19)
JSONEncode-12             16.4ms ± 1%  16.4ms ± 1%  ~        (p=0.954 n=19+19)
JSONDecode-12             58.5ms ± 1%  57.8ms ± 1%  -1.27%   (p=0.000 n=18+19)
Mandelbrot200-12          3.86ms ± 1%  3.88ms ± 0%  +0.44%   (p=0.000 n=18+18)
GoParse-12                3.67ms ± 2%  3.66ms ± 1%  -0.52%   (p=0.001 n=18+19)
RegexpMatchEasy0_32-12    100ns ± 1%   100ns ± 0%   ~        (p=0.257 n=19+18)
RegexpMatchEasy0_1K-12    347ns ± 1%   347ns ± 1%   ~        (p=0.527 n=18+18)
RegexpMatchEasy1_32-12    83.7ns ± 2%  83.1ns ± 2%  ~        (p=0.096 n=18+19)
RegexpMatchEasy1_1K-12    509ns ± 1%   505ns ± 1%   -0.75%   (p=0.000 n=18+19)
RegexpMatchMedium_32-12   130ns ± 2%   129ns ± 1%   ~        (p=0.962 n=20+20)
RegexpMatchMedium_1K-12   39.5µs ± 2%  39.4µs ± 1%  ~        (p=0.376 n=20+19)
RegexpMatchHard_32-12     2.04µs ± 0%  2.04µs ± 1%  ~        (p=0.195 n=18+17)
RegexpMatchHard_1K-12     61.4µs ± 1%  61.4µs ± 1%  ~        (p=0.885 n=19+19)
Revcomp-12                540ms ± 2%   542ms ± 4%   ~        (p=0.552 n=19+17)
Template-12               69.6ms ± 1%  71.2ms ± 1%  +2.39%   (p=0.000 n=20+20)
TimeParse-12              357ns ± 1%   357ns ± 1%   ~        (p=0.883 n=18+20)
TimeFormat-12             379ns ± 1%   362ns ± 1%   -4.53%   (p=0.000 n=18+19)
[Geo mean]                62.0µs       61.8µs       -0.44%

name              old time/op  new time/op  delta
XBenchGarbage-12  5.89ms ± 2%  5.81ms ± 2%  -1.41%  (p=0.000 n=19+18)

Change-Id: I96b31cca6ae77c30693a891cff3fe663fa2447a0
Reviewed-on: https://go-review.googlesource.com/17748
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
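The accounting discipline in miniature (a toy model, not the runtime's code; spanBytes is an assumed size): the whole span is charged to the live-heap estimate atomically when the span is handed out, so the estimate can only err on the early-GC side.

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    const spanBytes = 8192 // assumption: illustrative span size

    var heapLive int64 // analogue of memstats.heap_live

    // acquireSpan models an mcache obtaining a span from an mcentral:
    // all of the span's free space is counted as live up front, before
    // any object in it is actually allocated.
    func acquireSpan() {
        atomic.AddInt64(&heapLive, spanBytes)
    }

    func main() {
        acquireSpan()
        acquireSpan()
        fmt.Println("conservative heap_live:", atomic.LoadInt64(&heapLive))
    }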
2015-11-17  runtime: check for updated arena_end overflow  (Austin Clements)
Currently, if an allocation is large enough that arena_end + size overflows (which is not hard to do on 32-bit), we go ahead and call sysReserve with the impossible base and length and depend on this to either directly fail because the kernel can't possibly fulfill the requested mapping (causing mheap.sysAlloc to return nil) or to succeed with a mapping at some other address which will then be rejected as outside the arena. In order to make this less subtle, less dependent on the kernel getting all of this right, and to eliminate the hopeless system call, add an explicit overflow check. Updates #13143. The real issue has been fixed by 0de59c2, but this is a belt-and-suspenders improvement on top of that. It was uncovered by my symbolic modeling of that bug. Change-Id: I85fa868a33286fdcc23cdd7cdf86b19abf1cb2d1 Reviewed-on: https://go-review.googlesource.com/16961 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
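The check itself is the usual unsigned-overflow idiom: if base+size wraps around, the sum compares less than the base. A standalone sketch:

    package main

    import "fmt"

    // fitsWithoutOverflow reports whether extending a reservation at base
    // by size stays representable in a uintptr. On wraparound, base+size
    // is smaller than base, which is exactly what this tests.
    func fitsWithoutOverflow(base, size uintptr) bool {
        return base+size >= base
    }

    func main() {
        fmt.Println(fitsWithoutOverflow(1<<20, 1<<20))        // true
        fmt.Println(fitsWithoutOverflow(^uintptr(0)-10, 100)) // false: wraps
    }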
2015-11-16  runtime: make mcache.tiny a uintptr  (Austin Clements)
mcache.tiny is in non-GC'd memory, but points to heap memory. As a result, there may or may not be write barriers when writing to mcache.tiny. Make it clearer that funny things are going on by making mcache.tiny a uintptr instead of an unsafe.Pointer. Change-Id: I732a5b7ea17162f196a9155154bbaff8d4d00eac Reviewed-on: https://go-review.googlesource.com/16963 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-16  runtime: handle sysReserve returning a pointer below the arena  (Austin Clements)
In mheap.sysAlloc, if an allocation at arena_used would exceed arena_end (but wouldn't yet push us past arena_start+_MaxArena32), it tries to extend the arena reservation by another 256 MB. It extends the arena by calling sysReserve, which, on 32-bit, calls mmap without MAP_FIXED, which means the address is just a hint and the kernel can put the mapping wherever it wants. In particular, mmap may choose an address below arena_start (the kernel also chose arena_start, so there could be lots of space below it). Currently, we don't detect this case and, if it happens, mheap.sysAlloc will corrupt arena_end and arena_used and then return the low pointer to mheap.grow, which will crash when it attempts to index into h_spans with an underflowed index. Fix this by checking not only that p+p_size isn't too high, but also that p isn't too low. Fixes #13143. Change-Id: I8d0f42bd1484460282a83c6f1a6f8f0df7fb2048 Reviewed-on: https://go-review.googlesource.com/16927 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
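The shape of the fix (a sketch with illustrative constants; the real code checks against arena_start and _MaxArena32): a hinted mapping must be rejected if it lands below the arena, not just if it ends above it.

    package main

    import "fmt"

    const maxArena = 2 << 30 // assumption: illustrative 32-bit arena limit

    // mappingUsable reports whether a kernel-chosen mapping [p, p+size)
    // lies inside the arena. The low-side check is the one that was
    // missing: mmap without MAP_FIXED may place the region below
    // arenaStart, since the hint is only a hint.
    func mappingUsable(p, size, arenaStart uintptr) bool {
        return p >= arenaStart && p+size-arenaStart <= maxArena
    }

    func main() {
        const arenaStart = 1 << 28
        fmt.Println(mappingUsable(arenaStart+4096, 1<<20, arenaStart)) // true
        fmt.Println(mappingUsable(arenaStart-4096, 1<<20, arenaStart)) // false: below the arena
    }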
2015-11-16  runtime: avoid stat underflow crash  (Austin Clements)
If the area returned by sysReserve in mheap.sysAlloc is outside the usable arena, we sysFree it. We pass a fake stat pointer to sysFree because we haven't added the allocation to any stat at that point. But the fake stat is 0, so sysFree panics when it decrements the stat, because it underflows. Fix this by setting the fake stat to the allocation size. Updates #13143 (this is a prerequisite to fixing that bug). Change-Id: I61a6c9be19ac1c95863cf6a8435e19790c8bfc9a Reviewed-on: https://go-review.googlesource.com/16926 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-12  runtime: break out system-specific constants into package sys  (Michael Matloob)
runtime/internal/sys will hold system-, architecture- and config- specific constants. Updates #11647 Change-Id: I6db29c312556087a42e8d2bdd9af40d157c56b54 Reviewed-on: https://go-review.googlesource.com/16817 Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-12  runtime: rewrite lots of foo_Bar(f, ...) into f.bar(...)  (Matthew Dempsky)
Applies to types fixAlloc, mCache, mCentral, mHeap, mSpan, and mSpanList. Two special cases: 1. mHeap_Scavenge() previously didn't take an *mheap parameter, so it was specially handled in this CL. 2. mHeap_Free() would have collided with mheap's "free" field, so it's been renamed to (*mheap).freeSpan to parallel its underlying (*mheap).freeSpanLocked method. Change-Id: I325938554cca432c166fe9d9d689af2bbd68de4b Reviewed-on: https://go-review.googlesource.com/16221 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-05  runtime: remove GC start up/shutdown workaround in mallocgc  (Austin Clements)
Currently mallocgc detects if the GC is in a state where it can't assist, but also can't allocate uncontrolled and yields to help out the GC. This was a workaround for periods when we were trying to schedule the GC coordinator. It is no longer necessary because there is no GC coordinator and malloc can always assist with any GC transitions that are necessary. Updates #11970. Change-Id: I4f7beb7013e85e50ae99a3a8b0bb708ba49cbcd4 Reviewed-on: https://go-review.googlesource.com/16392 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-05  runtime: decentralize mark done and mark termination  (Austin Clements)
This moves all of the mark 1 to mark 2 transition and mark termination to the mark done transition function. This means these transitions are now handled on the goroutine that detected mark completion. This also means that the GC coordinator and the background completion barriers are no longer used and various workarounds to yield to the coordinator are no longer necessary. These will be removed in follow-up commits. One consequence of this is that mark workers now need to be preemptible when performing the mark done transition. This allows them to stop the world and to perform the final clean-up steps of GC after restarting the world. They are only made preemptible while performing this transition, so if the worker that findRunnableGCWorker would schedule isn't available, we didn't want to schedule it anyway. Fixes #11970. Change-Id: I9203a2d6287eeff62d589ec02ad9cb1e29ddb837 Reviewed-on: https://go-review.googlesource.com/16391 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-05  runtime: decentralize sweep termination and mark transition  (Austin Clements)
This moves all of GC initialization, sweep termination, and the transition to concurrent marking into the off->mark transition function. This means it's now handled on the goroutine that detected the state exit condition. As a result, malloc no longer needs to Gosched() at the beginning of the GC cycle to prevent over-allocation while the GC is starting up because it will now *help* the GC to start up. The Gosched hack is still necessary during GC shutdown (this is easy to test by enabling gctrace and hitting Ctrl-S to block the gctrace output). At this point, the GC coordinator still handles later phases. This requires a small tweak to how we start the GC coordinator. Currently, starting the GC coordinator is best-effort and may fail if the coordinator is about to park from the previous cycle but hasn't yet. We fix this by replacing the park/ready to wake up the coordinator with a semaphore. This is temporary since the coordinator will be going away in a few commits. Updates #11970. Change-Id: I2c6a11c91e72dfbc59c2d8e7c66146dee9a444fe Reviewed-on: https://go-review.googlesource.com/16357 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-05  runtime: beginning of decentralized off->mark transition  (Austin Clements)
This begins the conversion of the centralized GC coordinator to a decentralized state machine by introducing the internal API that triggers the first state transition from _GCoff to _GCmark (or _GCmarktermination). This change introduces the transition lock, the off->mark transition condition (which is very similar to shouldtriggergc()), and the general structure of a state transition. Since we're doing this conversion in stages, it then falls back to the GC coordinator to actually execute the cycle. We'll start moving logic out of the GC coordinator and into transition functions next. This fixes a minor bug in gcstoptheworld debug mode where passing the heap trigger once could trigger multiple STW GCs. Updates #11970. Change-Id: I964087dd190a639eb5766398f8e1bbf8b352902f Reviewed-on: https://go-review.googlesource.com/16355 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>
2015-10-28  runtime: don't use FP when calling nextSample in the Plan 9 sighandler  (David du Colombier)
In the Go signal handler on Plan 9, when a signal with the _SigThrow flag is received, we call startpanic before printing the stack trace. The startpanic function calls systemstack which calls startpanic_m. In the startpanic_m function, we call allocmcache to allocate _g_.m.mcache. The problem is that allocmcache calls nextSample, which does a floating point operation to return a sampling point for heap profiling. However, Plan 9 doesn't support floating point in the signal handler. This change adds a new function nextSampleNoFP, only called when in the Plan 9 signal handler, which is similar to nextSample, but avoids floating point. Change-Id: Iaa30437aa0f7c8c84d40afbab7567ad3bd5ea2de Reviewed-on: https://go-review.googlesource.com/16307 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
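The idea behind an FP-free fallback (a standalone approximation; the runtime's version uses its internal fast random source): a uniform draw in [0, 2*rate) has mean rate, so the average sampling interval is preserved using integer math only.

    package main

    import (
        "fmt"
        "math/rand"
    )

    // nextSampleNoFP picks the distance to the next heap-profile sample
    // without floating point: a uniform draw in [0, 2*rate) has expected
    // value rate, matching the mean interval of the exponential sampler.
    func nextSampleNoFP(rate int) int32 {
        if rate > 0x3fffffff { // keep 2*rate from overflowing
            rate = 0x3fffffff
        }
        if rate != 0 {
            return int32(rand.Intn(2 * rate))
        }
        return 0
    }

    func main() {
        fmt.Println(nextSampleNoFP(512 * 1024))
    }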
2015-10-27  runtime: eliminate some unnecessary uintptr conversions  (Matthew Dempsky)
arena_{start,used,end} are already uintptr, so no need to convert them to uintptr, much less to convert them to unsafe.Pointer and then to uintptr. No binary change to pkg/linux_amd64/runtime.a. Change-Id: Ia4232ed2a724c44fde7eba403c5fe8e6dccaa879 Reviewed-on: https://go-review.googlesource.com/16339 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-26  runtime: fix tiny allocator  (Matthew Dempsky)
When a new tiny block is allocated because we're allocating an object that won't fit into the current block, mallocgc saves the new block if it has more space leftover than the old block. However, the logic for this was subtly broken in golang.org/cl/2814, resulting in never saving (or consequently reusing) a tiny block. Change-Id: Ib5f6769451fb82877ddeefe75dfe79ed4a04fd40 Reviewed-on: https://go-review.googlesource.com/16330 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
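The repaired comparison, in a self-contained toy (alignment and GC details omitted): when a fresh 16-byte block must be allocated anyway, keep whichever block has the larger remainder, which is exactly the test size < tinyoffset.

    package main

    import "fmt"

    const maxTinySize = 16

    type tinyCache struct {
        tiny       []byte  // current tiny block, nil if none
        tinyoffset uintptr // bytes already used in tiny
    }

    // alloc carves size bytes out of the tiny block, starting a new block
    // when the request doesn't fit. The fixed logic keeps the block with
    // more space left over: the new block wins iff size < tinyoffset.
    func (c *tinyCache) alloc(size uintptr) []byte {
        if c.tiny != nil && c.tinyoffset+size <= maxTinySize {
            p := c.tiny[c.tinyoffset : c.tinyoffset+size]
            c.tinyoffset += size
            return p
        }
        block := make([]byte, maxTinySize)
        if size < c.tinyoffset || c.tiny == nil {
            c.tiny = block // new block has the larger remainder
            c.tinyoffset = size
        }
        return block[:size]
    }

    func main() {
        var c tinyCache
        c.alloc(12) // first block, 4 bytes left
        c.alloc(8)  // doesn't fit; new block kept since 8 < 12
        fmt.Println("offset in retained block:", c.tinyoffset) // 8
    }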
2015-10-21  runtime, syscall: add calls to msan functions  (Ian Lance Taylor)
Add explicit memory sanitizer instrumentation to the runtime and syscall packages. The compiler does not instrument the runtime package. It does instrument the syscall package, but we need to add a couple of cases that it can't see. Change-Id: I2d66073f713fe67e33a6720460d2bb8f72f31394 Reviewed-on: https://go-review.googlesource.com/16164 Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-10-15  runtime: use unsafe.Pointer(x) instead of (unsafe.Pointer)(x)  (Matthew Dempsky)
This isn't C anymore. No binary change to pkg/linux_amd64/runtime.a. Change-Id: I24d66b0f5ac888f432b874aac684b1395e7c8345 Reviewed-on: https://go-review.googlesource.com/15903 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-09  runtime: assist before allocating  (Austin Clements)
Currently, when the mutator allocates, the runtime first allocates the memory and then, if that G has done "enough" allocation, the runtime checks whether the G has assist debt to pay off and, if so, pays it off. This approach leads to under-assisting, where a G can allocate a large region (or many small regions) before paying for it, or can even exit with outstanding debt. This commit flips this around so that a G always acquires enough credit for an allocation before it can perform that allocation. We continue to amortize the cost of assists by requiring that they over-assist when triggered to build up credit for many allocations. Fixes #11967. Change-Id: Idac9f11133b328535667674d837be72c23ebd899 Reviewed-on: https://go-review.googlesource.com/15409 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>
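The inversion, as a toy model (names and the over-assist constant are illustrative assumptions): the G is charged before allocating, and if the charge would leave it in debt it must earn credit first, over-assisting so that most allocations proceed debt-free.

    package main

    import "fmt"

    const overAssist = 64 << 10 // assumption: extra credit earned per assist

    type g struct {
        gcAssistBytes int64 // allocation credit; negative would mean debt
    }

    // allocate charges the G's assist credit *before* the allocation.
    // If the charge would go negative, the G assists first (modeled here
    // as simply earning the missing credit plus an over-assist margin).
    func (gp *g) allocate(size int64) {
        gp.gcAssistBytes -= size
        if gp.gcAssistBytes < 0 {
            work := -gp.gcAssistBytes + overAssist
            gp.gcAssistBytes += work // credit earned by doing scan work
        }
        // ... the actual allocation would happen here ...
    }

    func main() {
        var gp g
        gp.allocate(1024) // triggers an assist, banks extra credit
        gp.allocate(2048) // paid entirely from banked credit
        fmt.Println("remaining credit:", gp.gcAssistBytes)
    }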
2015-10-05  pprof: improve sampling for heap profiling  (Raul Silvera)
The current heap sampling introduces some bias that interferes with unsampling, producing unexpected heap profiles. The solution is to use a Poisson process to generate the sampling points, using the formulas described at https://en.wikipedia.org/wiki/Poisson_process. This fixes #12620. Change-Id: If2400809ed3c41de504dd6cff06be14e476ff96c Reviewed-on: https://go-review.googlesource.com/14590 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Minux Ma <minux@golang.org> Run-TryBot: Minux Ma <minux@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
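Concretely, a Poisson process has exponentially distributed inter-arrival gaps, so the next sampling point can be drawn by inverting the CDF of a uniform variate. A sketch of the math (math/rand stands in for the runtime's random source):

    package main

    import (
        "fmt"
        "math"
        "math/rand"
    )

    // nextSample returns the number of bytes until the next heap sample.
    // Drawing -ln(U) * rate for uniform U in (0, 1] gives exponential
    // gaps, i.e. an unbiased Poisson sampling process with mean rate.
    func nextSample(rate float64) int64 {
        u := rand.Float64()
        for u == 0 { // avoid ln(0)
            u = rand.Float64()
        }
        return int64(-math.Log(u) * rate)
    }

    func main() {
        const memProfileRate = 512 * 1024
        fmt.Println("next sample in", nextSample(memProfileRate), "bytes")
    }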
2015-10-01  runtime: handle sysReserve failure in mHeap_SysAlloc  (Joel Sing)
sysReserve will return nil on failure - correctly handle this case and return nil to the caller. Currently, a failure will result in h.arena_end being set to psize, h.arena_used being set to zero and fun times ensue. On the openbsd/arm builder this has resulted in: runtime: address space conflict: map(0x0) = 0x40946000 fatal error: runtime: address space conflict When it should be reporting out of memory instead. Change-Id: Iba828d5ee48ee1946de75eba409e0cfb04f089d4 Reviewed-on: https://go-review.googlesource.com/15056 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2015-08-05  runtime: don't recheck heap trigger for periodic GC  (Austin Clements)
88e945f introduced a non-speculative double check of the heap trigger before actually starting a concurrent GC. This was necessary to fix a race for heap-triggered GC, but broke sysmon-triggered periodic GC, since the heap check will of course fail for periodically triggered GC. Fix this by telling startGC whether or not this GC was triggered by heap size or a timer and only doing the heap size double check for GCs triggered by heap size. Fixes #12026. Change-Id: I7c3f6ec364545c36d619f2b4b3bf3b758e3bcbd6 Reviewed-on: https://go-review.googlesource.com/13168 Reviewed-by: Russ Cox <rsc@golang.org>
2015-08-04  runtime: make sweep proportional to spans bytes allocated  (Austin Clements)
Proportional concurrent sweep is currently based on a ratio of spans to be swept per bytes of object allocation. However, proportional sweeping is performed during span allocation, not object allocation, in order to minimize contention and overhead. Since objects are allocated from spans after those spans are allocated, the system tends to operate in debt, which means when the next GC cycle starts, there is often sweep debt remaining, so GC has to finish the sweep, which delays the start of the cycle and delays enabling mutator assists.

For example, it's quite likely that many Ps will simultaneously refill their span caches immediately after a GC cycle (because GC flushes the span caches), but at this point, there has been very little object allocation since the end of GC, so very little sweeping is done. The Ps then allocate objects from these cached spans, which drives up the bytes of object allocation, but since these allocations are coming from cached spans, nothing considers whether more sweeping has to happen. If the sweep ratio is high enough (which can happen if the next GC trigger is very close to the retained heap size), this can easily represent a sweep debt of thousands of pages.

Fix this by making proportional sweep proportional to the number of bytes of spans allocated, rather than the number of bytes of objects allocated. Prior to allocating a span, both the small object path and the large object path ensure credit for allocating that span, so the system operates in the black, rather than in the red.

Combined with the previous commit, this should eliminate all sweeping from GC start up. On the stress test in issue #11911, this reduces the time spent sweeping during GC (and delaying start up) by several orders of magnitude:

              mean    99%ile  max
    pre fix   1 ms    11 ms   144 ms
    post fix  270 ns  735 ns  916 ns

Updates #11911.

Change-Id: I89223712883954c9d6ec2a7a51ecb97172097df3
Reviewed-on: https://go-review.googlesource.com/13044
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
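The new ratio is plain arithmetic: pages still to sweep divided by the span bytes that can be allocated before the GC trigger, settled up before each span is handed out. A sketch with assumed numbers:

    package main

    import "fmt"

    // sweepPagesPerByte is the proportional-sweep ratio under this change:
    // pages owed per byte of *span* allocation, so the sweep finishes just
    // as the heap reaches the next GC trigger.
    func sweepPagesPerByte(pagesToSweep, heapLive, trigger int64) float64 {
        distance := trigger - heapLive
        if distance < 1 {
            distance = 1
        }
        return float64(pagesToSweep) / float64(distance)
    }

    func main() {
        ratio := sweepPagesPerByte(2048, 96<<20, 128<<20) // assumed figures
        const spanBytes = 8192
        fmt.Printf("sweep debt per 8 KB span: %.2f pages\n", ratio*spanBytes)
    }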
2015-08-04  runtime: assist the GC during GC startup and shutdown  (Austin Clements)
Currently there are two sensitive periods during which a mutator can allocate past the heap goal but mutator assists can't be enabled: 1) at the beginning of GC between when the heap first passes the heap trigger and sweep termination and 2) at the end of GC between mark termination and when the background GC goroutine parks. During these periods there's no back-pressure or safety net, so a rapidly allocating mutator can allocate past the heap goal. This is exacerbated if there are many goroutines because the GC coordinator is scheduled as any other goroutine, so if it gets preempted during one of these periods, it may stay preempted for a long period (10s or 100s of milliseconds). Normally the mutator does scan work to create back-pressure against allocation, but there is no scan work during these periods. Hence, as a fall back, if a mutator would assist but can't yet, simply yield the CPU. This delays the mutator somewhat, but more importantly gives more CPU time to the GC coordinator for it to complete the transition. This is obviously a workaround. Issue #11970 suggests a far better but far more invasive way to fix this. Updates #11911. (This very nearly fixes the issue, but about once every 15 minutes I get a GC cycle where the assists are enabled but don't do enough work.) Change-Id: I9768b79e3778abd3e06d306596c3bd77f65bf3f1 Reviewed-on: https://go-review.googlesource.com/13026 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
2015-07-11  all: link to https instead of http  (Brad Fitzpatrick)
The one in misc/makerelease/makerelease.go is particularly bad and probably warrants rotating our keys. I didn't update old weekly notes, and reverted some changes involving test code for now, since we're late in the Go 1.5 freeze. Otherwise, the rest are all auto-generated changes, and all manually reviewed. Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d Reviewed-on: https://go-review.googlesource.com/12048 Reviewed-by: Rob Pike <r@golang.org>
2015-06-22  runtime: one more Map{Bits,Spans} before arena_used update  (Austin Clements)
In order to avoid a race with a concurrent write barrier or garbage collector thread, any update to arena_used must be preceded by mapping the corresponding heap bitmap and spans array memory. Otherwise, the concurrent access may observe that a pointer falls within the heap arena, but then attempt to access unmapped memory to look up its span or heap bits. Commit d57c889 fixed all of the places where we updated arena_used immediately before mapping the heap bitmap and spans, but it missed the one place where we update arena_used and depend on later code to update it again and map the bitmap and spans. This creates a window where the original race can still happen. This commit fixes this by mapping the heap bitmap and spans before this arena_used update as well. This code path is only taken when expanding the heap reservation on 32-bit over a hole in the address space, so these extra mmap calls should have negligible impact. Fixes #10212, #11324. Change-Id: Id67795e6c7563eb551873bc401e5cc997aaa2bd8 Reviewed-on: https://go-review.googlesource.com/11340 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-06-19  runtime: ensure GC sees type-safe memory on weak machines  (Austin Clements)
Currently it's possible for the garbage collector to observe uninitialized memory or stale heap bitmap bits on weakly ordered architectures such as ARM and PPC. On such architectures, the stores that zero newly allocated memory and initialize its heap bitmap may move after a store in user code that makes the allocated object observable by the garbage collector. To fix this, add a "publication barrier" (also known as an "export barrier") before returning from mallocgc. This is a store/store barrier that ensures any write done by user code that makes the returned object observable to the garbage collector will be ordered after the initialization performed by mallocgc. No barrier is necessary on the reading side because of the data dependency between loading the pointer and loading the contents of the object. Fixes one of the issues raised in #9984. Change-Id: Ia3d96ad9c5fc7f4d342f5e05ec0ceae700cd17c8 Reviewed-on: https://go-review.googlesource.com/11083 Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Minux Ma <minux@golang.org> Reviewed-by: Martin Capitanio <capnm9@gmail.com> Reviewed-by: Russ Cox <rsc@golang.org>
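The same publish-after-initialize discipline is visible in portable user code, where an atomic store plays the role of the store/store barrier (a user-space analogue, not mallocgc itself):

    package main

    import (
        "fmt"
        "sync/atomic"
        "unsafe"
    )

    type node struct{ val int }

    var shared unsafe.Pointer // pointer other goroutines read

    // publish initializes n completely, then makes it reachable. The
    // atomic store orders the initializing stores before the publishing
    // store, so no observer can see uninitialized contents; readers need
    // no barrier thanks to the data dependency through the pointer.
    func publish(v int) {
        n := &node{val: v}
        atomic.StorePointer(&shared, unsafe.Pointer(n))
    }

    func main() {
        publish(42)
        fmt.Println((*node)(atomic.LoadPointer(&shared)).val)
    }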
2015-06-18  runtime: reduce latency by aggressively ending mark phase  (Rick Hudson)
Some latency regressions have crept into our system over the past few weeks. This CL fixes those by having the mark phase more aggressively blacken objects so that the mark termination phase, a STW phase, has less work to do.

Three approaches were taken when the mark phase believes it has no more work to do, ie all the work buffers are empty. If things have gone well the mark phase is correct and there is in fact little or no work. In that case the following items will take very little time. If the mark phase is wrong this CL will ferret that work out and give the mark phase a chance to deal with it concurrently before mark termination begins.

When the mark phase first appears to be out of work, it does three things:

1) It switches from allocating white to allocating black to reduce the number of unmarked objects reachable only from stacks.

2) It flushes and disables per-P GC work caches so all work must be in globally visible work buffers.

3) It rescans the global roots---the BSS and data segments---so there are fewer objects to blacken during mark termination. We do not rescan stacks at this point, though that could be done in a later CL.

After these steps, it again drains the global work buffers.

On a lightly loaded machine the garbage benchmark has reduced the number of GC cycles with latency > 10 ms from 83 out of 4083 cycles down to 2 out of 3995 cycles. Maximum latency was reduced from 60+ msecs down to 20 ms.

Change-Id: I152285b48a7e56c5083a02e8e4485dd39c990492
Reviewed-on: https://go-review.googlesource.com/10590
Reviewed-by: Austin Clements <austin@google.com>
2015-06-17  cmd/compile: introduce //go:systemstack annotation  (Russ Cox)
//go:systemstack means that the function must run on the system stack. Add one use in runtime as a demonstration. Fixes #9174. Change-Id: I8d4a509cb313541426157da703f1c022e964ace4 Reviewed-on: https://go-review.googlesource.com/10840 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com>
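The annotation is only accepted when compiling package runtime itself; elsewhere the compiler rejects it. Inside the runtime, a use looks roughly like this (illustrative function name, not the one from the CL):

    // Sketch: legal only inside package runtime. The compiler arranges a
    // check so that entering this function on an ordinary goroutine stack
    // is a fatal error.
    //
    //go:systemstack
    func mustRunOnSystemStack() {
        // ... work that must not run on a growable goroutine stack ...
    }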
2015-06-15  runtime: raise maxmem to 512 GB  (Russ Cox)
A workaround for #10460. Change-Id: I607a556561d509db6de047892f886fb565513895 Reviewed-on: https://go-review.googlesource.com/10819 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2015-06-15  runtime: gofmt  (Russ Cox)
Change-Id: I539bdc438f694610a7cd373f7e1451171737cfb3 Reviewed-on: https://go-review.googlesource.com/11084 Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-11  runtime: wait to update arena_used until after mapping bitmap  (Russ Cox)
This avoids a race with gcmarkwb_m that was leading to faults. Fixes #10212. Change-Id: I6fcf8d09f2692227063ce29152cb57366ea22487 Reviewed-on: https://go-review.googlesource.com/10816 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2015-05-13  runtime: add check for malloc in a signal handler  (Brad Fitzpatrick)
Change-Id: Ic8ebbe81eb788626c01bfab238d54236e6e5ef2b Reviewed-on: https://go-review.googlesource.com/9964 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-11  runtime: remove wbshadow mode  (Russ Cox)
The write barrier shadow heap was very useful for developing the write barriers initially, but it's no longer used, clunky, and dragging the rest of the implementation down. The gccheckmark mode will find bugs due to missed barriers when they result in missed marks; wbshadow mode found the missed barriers more aggressively, but it required an entire separate copy of the heap. The gccheckmark mode requires no extra memory, making it more useful in practice.

Compared to previous CL:

name                   old mean               new mean               delta
BinaryTree17           5.91s × (0.96,1.06)    5.72s × (0.97,1.03)    -3.12% (p=0.000)
Fannkuch11             4.32s × (1.00,1.00)    4.36s × (1.00,1.00)    +0.91% (p=0.000)
FmtFprintfEmpty        89.0ns × (0.93,1.10)   86.6ns × (0.96,1.11)   ~      (p=0.077)
FmtFprintfString       298ns × (0.98,1.06)    283ns × (0.99,1.04)    -4.90% (p=0.000)
FmtFprintfInt          286ns × (0.98,1.03)    283ns × (0.98,1.04)    -1.09% (p=0.032)
FmtFprintfIntInt       498ns × (0.97,1.06)    480ns × (0.99,1.02)    -3.65% (p=0.000)
FmtFprintfPrefixedInt  408ns × (0.98,1.02)    396ns × (0.99,1.01)    -3.00% (p=0.000)
FmtFprintfFloat        587ns × (0.98,1.01)    562ns × (0.99,1.01)    -4.34% (p=0.000)
FmtManyArgs            1.94µs × (0.99,1.02)   1.89µs × (0.99,1.01)   -2.85% (p=0.000)
GobDecode              15.8ms × (0.98,1.03)   15.7ms × (0.99,1.02)   ~      (p=0.251)
GobEncode              12.0ms × (0.96,1.09)   11.8ms × (0.98,1.03)   -1.87% (p=0.024)
Gzip                   648ms × (0.99,1.01)    647ms × (0.99,1.01)    ~      (p=0.688)
Gunzip                 143ms × (1.00,1.01)    143ms × (1.00,1.01)    ~      (p=0.203)
HTTPClientServer       90.3µs × (0.98,1.01)   89.1µs × (0.99,1.02)   -1.30% (p=0.000)
JSONEncode             31.6ms × (0.99,1.01)   31.7ms × (0.98,1.02)   ~      (p=0.219)
JSONDecode             107ms × (1.00,1.01)    111ms × (0.99,1.01)    +3.58% (p=0.000)
Mandelbrot200          6.03ms × (1.00,1.01)   6.01ms × (1.00,1.00)   ~      (p=0.077)
GoParse                6.53ms × (0.99,1.03)   6.54ms × (0.99,1.02)   ~      (p=0.585)
RegexpMatchEasy0_32    161ns × (1.00,1.01)    161ns × (0.98,1.05)    ~      (p=0.948)
RegexpMatchEasy0_1K    541ns × (0.99,1.01)    559ns × (0.98,1.01)    +3.32% (p=0.000)
RegexpMatchEasy1_32    138ns × (1.00,1.00)    137ns × (0.99,1.01)    -0.55% (p=0.001)
RegexpMatchEasy1_1K    887ns × (0.99,1.01)    878ns × (0.99,1.01)    -0.98% (p=0.000)
RegexpMatchMedium_32   253ns × (0.99,1.01)    252ns × (0.99,1.01)    -0.39% (p=0.001)
RegexpMatchMedium_1K   72.8µs × (1.00,1.00)   72.7µs × (1.00,1.00)   ~      (p=0.485)
RegexpMatchHard_32     3.85µs × (1.00,1.01)   3.85µs × (1.00,1.01)   ~      (p=0.283)
RegexpMatchHard_1K     117µs × (1.00,1.01)    117µs × (1.00,1.00)    ~      (p=0.175)
Revcomp                922ms × (0.97,1.08)    903ms × (0.98,1.05)    -2.15% (p=0.021)
Template               126ms × (0.99,1.01)    126ms × (0.99,1.01)    ~      (p=0.943)
TimeParse              628ns × (0.99,1.01)    634ns × (0.99,1.01)    +0.92% (p=0.000)
TimeFormat             668ns × (0.99,1.01)    698ns × (0.98,1.03)    +4.53% (p=0.000)

It's nice that the microbenchmarks are the ones helped the most, because those were the ones hurt the most by the conversion from 4-bit to 2-bit heap bitmaps. This CL brings the overall effect of that process to (compared to CL 9706 patch set 1):

name                   old mean               new mean               delta
BinaryTree17           5.87s × (0.94,1.09)    5.72s × (0.97,1.03)    -2.57% (p=0.011)
Fannkuch11             4.32s × (1.00,1.00)    4.36s × (1.00,1.00)    +0.87% (p=0.000)
FmtFprintfEmpty        89.1ns × (0.95,1.16)   86.6ns × (0.96,1.11)   ~      (p=0.090)
FmtFprintfString       283ns × (0.98,1.02)    283ns × (0.99,1.04)    ~      (p=0.681)
FmtFprintfInt          284ns × (0.98,1.04)    283ns × (0.98,1.04)    ~      (p=0.620)
FmtFprintfIntInt       486ns × (0.98,1.03)    480ns × (0.99,1.02)    -1.27% (p=0.002)
FmtFprintfPrefixedInt  400ns × (0.99,1.02)    396ns × (0.99,1.01)    -0.84% (p=0.001)
FmtFprintfFloat        566ns × (0.99,1.01)    562ns × (0.99,1.01)    -0.80% (p=0.000)
FmtManyArgs            1.91µs × (0.99,1.02)   1.89µs × (0.99,1.01)   -1.10% (p=0.000)
GobDecode              15.5ms × (0.98,1.05)   15.7ms × (0.99,1.02)   +1.55% (p=0.005)
GobEncode              11.9ms × (0.97,1.03)   11.8ms × (0.98,1.03)   -0.97% (p=0.048)
Gzip                   648ms × (0.99,1.01)    647ms × (0.99,1.01)    ~      (p=0.627)
Gunzip                 143ms × (1.00,1.00)    143ms × (1.00,1.01)    ~      (p=0.482)
HTTPClientServer       89.2µs × (0.99,1.02)   89.1µs × (0.99,1.02)   ~      (p=0.740)
JSONEncode             32.3ms × (0.97,1.06)   31.7ms × (0.98,1.02)   -1.95% (p=0.002)
JSONDecode             106ms × (0.99,1.01)    111ms × (0.99,1.01)    +4.22% (p=0.000)
Mandelbrot200          6.02ms × (1.00,1.00)   6.01ms × (1.00,1.00)   ~      (p=0.417)
GoParse                6.57ms × (0.97,1.06)   6.54ms × (0.99,1.02)   ~      (p=0.404)
RegexpMatchEasy0_32    162ns × (1.00,1.00)    161ns × (0.98,1.05)    ~      (p=0.088)
RegexpMatchEasy0_1K    561ns × (0.99,1.02)    559ns × (0.98,1.01)    -0.47% (p=0.034)
RegexpMatchEasy1_32    145ns × (0.95,1.04)    137ns × (0.99,1.01)    -5.56% (p=0.000)
RegexpMatchEasy1_1K    864ns × (0.99,1.04)    878ns × (0.99,1.01)    +1.57% (p=0.000)
RegexpMatchMedium_32   255ns × (0.99,1.04)    252ns × (0.99,1.01)    -1.43% (p=0.001)
RegexpMatchMedium_1K   73.9µs × (0.98,1.04)   72.7µs × (1.00,1.00)   -1.55% (p=0.004)
RegexpMatchHard_32     3.92µs × (0.98,1.04)   3.85µs × (1.00,1.01)   -1.80% (p=0.003)
RegexpMatchHard_1K     120µs × (0.98,1.04)    117µs × (1.00,1.00)    -2.13% (p=0.001)
Revcomp                936ms × (0.95,1.08)    903ms × (0.98,1.05)    -3.58% (p=0.002)
Template               130ms × (0.98,1.04)    126ms × (0.99,1.01)    -2.98% (p=0.000)
TimeParse              638ns × (0.98,1.05)    634ns × (0.99,1.01)    ~      (p=0.198)
TimeFormat             674ns × (0.99,1.01)    698ns × (0.98,1.03)    +3.69% (p=0.000)

Change-Id: Ia0e9b50b1d75a3c0c7556184cd966305574fe07c
Reviewed-on: https://go-review.googlesource.com/9706
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-06  runtime: track "scannable" bytes of heap  (Austin Clements)
This tracks the number of scannable bytes in the allocated heap. That is, bytes that the garbage collector must scan before reaching the last pointer field in each object. This will be used to compute a more robust estimate of the GC scan work. Change-Id: I1eecd45ef9cdd65b69d2afb5db5da885c80086bb Reviewed-on: https://go-review.googlesource.com/9695 Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-04  runtime: Reduce calls to shouldtriggergc  (Rick Hudson)
shouldtriggergc is slightly expensive due to the call overhead and the use of an atomic. This CL reduces the number of times one checks if a GC should be done from once at each allocation to once when a span is allocated. Since shouldtriggergc is an important abstraction, simply hand-inlining it (along with its atomic instruction) would lose the abstraction. Change-Id: Ia3210655b4b3d433f77064a21ecb54e4d9d435f7 Reviewed-on: https://go-review.googlesource.com/9403 Reviewed-by: Austin Clements <austin@google.com>
2015-04-27  runtime: replace STW for enabling write barriers with ragged barrier  (Austin Clements)
Currently, we use a full stop-the-world around enabling write barriers. This is to ensure that all Gs have enabled write barriers before any blackening occurs (either in gcBgMarkWorker() or in gcAssistAlloc()). However, there's no need to bring the whole world to a synchronous stop to ensure this. This change replaces the STW with a ragged barrier that ensures each P has individually observed that write barriers should be enabled before GC performs any blackening. Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955 Reviewed-on: https://go-review.googlesource.com/8207 Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-24  runtime: implement xadduintptr and update system mstats using it  (Srdjan Petrovic)
The motivation is that sysAlloc/Free() currently aren't safe to be called without a valid G, because arm's xadd64() uses locks that require a valid G. The solution here was proposed by Dmitry Vyukov: use xadduintptr() instead of xadd64(), until arm can support xadd64 on all of its architectures (not a trivial task for arm). Change-Id: I250252079357ea2e4360e1235958b1c22051498f Reviewed-on: https://go-review.googlesource.com/9002 Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-04-21  runtime: make mcache.local_cachealloc a uintptr  (Austin Clements)
This field used to decrease with sweeps (and potentially go negative). Now it is always zero or positive, so change it to a uintptr so it meshes better with other memory stats. Change-Id: I6a50a956ddc6077eeaf92011c51743cb69540a3c Reviewed-on: https://go-review.googlesource.com/8899 Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21  runtime: proportional mutator assist  (Austin Clements)
Currently, mutator allocation periodically assists the garbage collector by performing a small, fixed amount of scanning work. However, to control heap growth, mutators need to perform scanning work *proportional* to their allocation rate. This change implements proportional mutator assists. This uses the scan work estimate computed by the garbage collector at the beginning of each cycle to compute how much scan work must be performed per allocation byte to complete the estimated scan work by the time the heap reaches the goal size. When allocation triggers an assist, it uses this ratio and the amount allocated since the last assist to compute the assist work, then attempts to steal as much of this work as possible from the background collector's credit, and then performs any remaining scan work itself. Change-Id: I98b2078147a60d01d6228b99afd414ef857e4fba Reviewed-on: https://go-review.googlesource.com/8836 Reviewed-by: Rick Hudson <rlh@golang.org>
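The proportionality constant is just the expected scan work divided by the remaining headroom to the heap goal; each assist then owes that ratio times the bytes allocated since the last assist. In assumed numbers:

    package main

    import "fmt"

    // assistWork sizes a mutator's scan-work debt so the estimated total
    // scan work completes by the time the heap grows to its goal.
    func assistWork(scanWorkExpected, heapGoal, heapLive, allocedBytes int64) int64 {
        headroom := heapGoal - heapLive
        if headroom < 1 {
            headroom = 1
        }
        ratio := float64(scanWorkExpected) / float64(headroom)
        return int64(ratio * float64(allocedBytes))
    }

    func main() {
        // assumption: 64 MB of expected scan work, 32 MB of headroom,
        // 1 MB allocated since this G last assisted.
        owed := assistWork(64<<20, 256<<20, 224<<20, 1<<20)
        fmt.Println("scan work owed:", owed, "bytes")
    }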
2015-04-20  runtime: replace func-based write barrier skipping with type-based  (Russ Cox)
This CL revises CL 7504 to use explicitly uintptr types for the struct fields that are going to be updated sometimes without write barriers. The result is that the fields are now updated *always* without write barriers. This approach has two important properties: 1) Now the GC never looks at the field, so if the missing reference could cause a problem, it will do so all the time, not just when the write barrier is missed at just the right moment. 2) Now a write barrier never happens for the field, avoiding the (correct) detection of inconsistent write barriers when GODEBUG=wbshadow=1. Change-Id: Iebd3962c727c0046495cc08914a8dc0808460e0e Reviewed-on: https://go-review.googlesource.com/9019 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-16  runtime: darwin/arm64 support  (Shenghou Ma)
Change-Id: I3b3f80791a1db4c2b7318f81a115972cd2237f03 Signed-off-by: Shenghou Ma <minux@golang.org> Reviewed-on: https://go-review.googlesource.com/8782 Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-04-10  runtime, cmd/internal/ld: rename themoduledata to firstmoduledata  (Michael Hudson-Doyle)
'themoduledata' doesn't really make sense now we support multiple moduledata objects. Change-Id: I8263045d8f62a42cb523502b37289b0fba054f62 Reviewed-on: https://go-review.googlesource.com/8521 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-06  runtime: track heap bytes marked by GC  (Austin Clements)
This tracks the number of heap bytes marked by a GC cycle. We'll use this information to precisely trigger the next GC cycle. Currently this keeps the work counter in gcWork, and dispose atomically aggregates it into a global work counter. dispose happens relatively infrequently, so the contention on the global counter should be low. If this turns out to be an issue, we can reduce the number of disposes, and if it's still a problem, we can switch to per-P counters. Change-Id: I1bc377cb2e802ef61c2968602b63146d52e7f5db Reviewed-on: https://go-review.googlesource.com/8388 Reviewed-by: Russ Cox <rsc@golang.org>
2015-03-31  runtime, cmd/internal/ld: change runtime to use a single linker symbol  (Michael Hudson-Doyle)
In preparation for being able to run a go program that has code in several objects, this changes from having several linker symbols used by the runtime into having one linker symbol that points at a structure containing the needed data. Multiple object support will construct a linked list of such structures. A follow up will initialize the slices in the themoduledata structure directly from the linker but I was aiming for a minimal diff for now. Change-Id: I613cce35309801cf265a1d5ae5aaca8d689c5cbf Reviewed-on: https://go-review.googlesource.com/7441 Reviewed-by: Ian Lance Taylor <iant@golang.org>