diff options
| author | Austin Clements <austin@google.com> | 2016-03-04 11:58:26 -0500 |
|---|---|---|
| committer | Austin Clements <austin@google.com> | 2016-04-26 23:40:13 +0000 |
| commit | 2a889b9d931e58166350f785b16edc51e28ef19b (patch) | |
| tree | 6cfac1e2a45e4dc7fdb557eca7edca37974e8635 /src/runtime/runtime2.go | |
| parent | 5b765ce310c594276ea919a9cb455cc894fee999 (diff) | |
| download | go-2a889b9d931e58166350f785b16edc51e28ef19b.tar.xz | |
runtime: make stack re-scan O(# dirty stacks)
Currently the stack re-scan during mark termination is O(# stacks)
because we enqueue a root marking job for every goroutine. It takes
~34ns to process this root marking job for a valid (clean) stack, so
at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
portion of this time is spent simply taking the cache miss to check
the gcscanvalid flag, so simply optimizing the path that handles clean
stacks can only improve this so much.
Fix this by keeping an explicit list of goroutines with dirty stacks
that need to be rescanned. When a goroutine first transitions to
running after a stack scan and marks its stack dirty, it adds itself
to this list. We enqueue root marking jobs only for the goroutines in
this list, so this improves stack re-scanning asymptotically by
completely eliminating time spent on clean goroutines.
This reduces mark termination time for 500k idle goroutines from 15ms
to 238µs. Overall performance effect is negligible.
name \ 95%ile-time/markTerm old new delta
IdleGs/gs:500000/gomaxprocs:12 15000µs ± 0% 238µs ± 5% -98.41% (p=0.000 n=10+10)
name old time/op new time/op delta
XBenchGarbage-12 2.30ms ± 3% 2.29ms ± 1% -0.43% (p=0.049 n=17+18)
name old time/op new time/op delta
BinaryTree17-12 2.57s ± 3% 2.59s ± 2% ~ (p=0.141 n=19+20)
Fannkuch11-12 2.09s ± 0% 2.10s ± 1% +0.53% (p=0.000 n=19+19)
FmtFprintfEmpty-12 45.3ns ± 3% 45.2ns ± 2% ~ (p=0.845 n=20+20)
FmtFprintfString-12 129ns ± 0% 127ns ± 0% -1.55% (p=0.000 n=16+16)
FmtFprintfInt-12 123ns ± 0% 119ns ± 1% -3.24% (p=0.000 n=19+19)
FmtFprintfIntInt-12 195ns ± 1% 189ns ± 1% -3.11% (p=0.000 n=17+17)
FmtFprintfPrefixedInt-12 193ns ± 1% 187ns ± 1% -3.06% (p=0.000 n=19+19)
FmtFprintfFloat-12 254ns ± 0% 255ns ± 1% +0.35% (p=0.001 n=14+17)
FmtManyArgs-12 781ns ± 0% 770ns ± 0% -1.48% (p=0.000 n=16+19)
GobDecode-12 7.00ms ± 1% 6.98ms ± 1% ~ (p=0.563 n=19+19)
GobEncode-12 5.91ms ± 1% 5.92ms ± 0% ~ (p=0.118 n=19+18)
Gzip-12 219ms ± 1% 215ms ± 1% -1.81% (p=0.000 n=18+18)
Gunzip-12 37.2ms ± 0% 37.4ms ± 0% +0.45% (p=0.000 n=17+19)
HTTPClientServer-12 76.9µs ± 3% 77.5µs ± 2% +0.81% (p=0.030 n=20+19)
JSONEncode-12 15.0ms ± 0% 14.8ms ± 1% -0.88% (p=0.001 n=15+19)
JSONDecode-12 50.6ms ± 0% 53.2ms ± 2% +5.07% (p=0.000 n=17+19)
Mandelbrot200-12 4.05ms ± 0% 4.05ms ± 1% ~ (p=0.581 n=16+17)
GoParse-12 3.34ms ± 1% 3.30ms ± 1% -1.21% (p=0.000 n=15+20)
RegexpMatchEasy0_32-12 69.6ns ± 1% 69.8ns ± 2% ~ (p=0.566 n=19+19)
RegexpMatchEasy0_1K-12 238ns ± 1% 236ns ± 0% -0.91% (p=0.000 n=17+13)
RegexpMatchEasy1_32-12 69.8ns ± 1% 70.0ns ± 1% +0.23% (p=0.026 n=17+16)
RegexpMatchEasy1_1K-12 371ns ± 1% 363ns ± 1% -2.07% (p=0.000 n=19+19)
RegexpMatchMedium_32-12 107ns ± 2% 106ns ± 1% -0.51% (p=0.031 n=18+20)
RegexpMatchMedium_1K-12 33.0µs ± 0% 32.9µs ± 0% -0.30% (p=0.004 n=16+16)
RegexpMatchHard_32-12 1.70µs ± 0% 1.70µs ± 0% +0.45% (p=0.000 n=16+17)
RegexpMatchHard_1K-12 51.1µs ± 2% 51.4µs ± 1% +0.53% (p=0.000 n=17+19)
Revcomp-12 378ms ± 1% 385ms ± 1% +1.92% (p=0.000 n=19+18)
Template-12 64.3ms ± 2% 65.0ms ± 2% +1.09% (p=0.001 n=19+19)
TimeParse-12 315ns ± 1% 317ns ± 2% ~ (p=0.108 n=18+20)
TimeFormat-12 360ns ± 1% 337ns ± 0% -6.30% (p=0.000 n=18+13)
[Geo mean] 51.8µs 51.6µs -0.48%
Change-Id: Icf8994671476840e3998236e15407a505d4c760c
Reviewed-on: https://go-review.googlesource.com/20700
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Diffstat (limited to 'src/runtime/runtime2.go')
| -rw-r--r-- | src/runtime/runtime2.go | 11 |
1 files changed, 9 insertions, 2 deletions
diff --git a/src/runtime/runtime2.go b/src/runtime/runtime2.go index 0a988ce469..d35b897c3e 100644 --- a/src/runtime/runtime2.go +++ b/src/runtime/runtime2.go @@ -336,7 +336,7 @@ type g struct { paniconfault bool // panic (instead of crash) on unexpected fault address preemptscan bool // preempted g does scan for gc gcscandone bool // g has scanned stack; protected by _Gscan bit in status - gcscanvalid bool // false at start of gc cycle, true if G has not run since last scan + gcscanvalid bool // false at start of gc cycle, true if G has not run since last scan; transition from true to false by calling queueRescan and false to true by calling dequeueRescan throwsplit bool // must not split stack raceignore int8 // ignore race detection events sysblocktraced bool // StartTrace has emitted EvGoInSyscall about this goroutine @@ -354,7 +354,14 @@ type g struct { racectx uintptr waiting *sudog // sudog structures this g is waiting on (that have a valid elem ptr); in lock order - // Per-G gcController state + // Per-G GC state + + // gcRescan is this G's index in work.rescan.list. If this is + // -1, this G is not on the rescan list. + // + // If gcphase != _GCoff and this G is visible to the garbage + // collector, writes to this are protected by work.rescan.lock. + gcRescan int32 // gcAssistBytes is this G's GC assist credit in terms of // bytes allocated. If this is positive, then the G has credit |
