| Age | Commit message | Author |
|
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
|
|
build.
LGTM=dvyukov
R=golang-codereviews, dvyukov
CC=golang-codereviews, khr
https://golang.org/cl/128680043
|
|
Newly allocated memory is subtracted from inuse, while it was never added to inuse.
Span leftovers are subtracted from both inuse and idle,
while they were never added.
Fixes #8544.
Fixes #8430.
LGTM=khr, cookieo9
R=golang-codereviews, khr, cookieo9
CC=golang-codereviews, rlh, rsc
https://golang.org/cl/130200044
|
|
Implement the design described in:
https://docs.google.com/document/d/1v4Oqa0WwHunqlb8C3ObL_uNQw3DfSY-ztoA-4wWbKcg/pub
Summary of the changes:
GC uses "2 bits per word" pointer type info embedded directly into the bitmap.
Scanning of stacks/data/heap is unified.
The old span types go away.
Compiler generates "sparse" 4-bit type info for GC (directly for the GC bitmap).
Linker generates "dense" 2-bit type info for data/bss (the same as stacks use).
Summary of results:
-1680 lines of code total (-1000+ in mgc0.c only)
-25% memory consumption
-3-7% binary size
-15% GC pause time
-7% run time
LGTM=khr
R=golang-codereviews, rsc, christoph, khr
CC=golang-codereviews, rlh
https://golang.org/cl/106260045
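To illustrate the "2 bits per word" idea, here is a small sketch of a dense bitmap that stores one 2-bit code per pointer-sized word. The code values, the type `typeBitmap`, and its methods are hypothetical names chosen for this example; the actual runtime layout is described in the design document linked above and may differ.

```go
package main

import "fmt"

// Hypothetical 2-bit codes, one per pointer-sized word.
const (
	kindDead    uint8 = 0 // word is not live
	kindScalar  uint8 = 1 // word holds non-pointer data
	kindPointer uint8 = 2 // word holds a pointer the GC must follow

	bitsPerWord  = 2
	wordsPerByte = 8 / bitsPerWord
)

// typeBitmap packs four 2-bit codes into each byte.
type typeBitmap []byte

// newTypeBitmap returns a bitmap covering nwords words, all kindDead.
func newTypeBitmap(nwords int) typeBitmap {
	return make(typeBitmap, (nwords+wordsPerByte-1)/wordsPerByte)
}

// set stores the code for the given word index.
func (b typeBitmap) set(word int, kind uint8) {
	i, shift := word/wordsPerByte, uint(word%wordsPerByte)*bitsPerWord
	b[i] = b[i]&^(3<<shift) | (kind&3)<<shift
}

// get returns the code for the given word index.
func (b typeBitmap) get(word int) uint8 {
	i, shift := word/wordsPerByte, uint(word%wordsPerByte)*bitsPerWord
	return (b[i] >> shift) & 3
}

// pointerWords lists the word indices a scanner would have to follow.
func (b typeBitmap) pointerWords(nwords int) []int {
	var out []int
	for w := 0; w < nwords; w++ {
		if b.get(w) == kindPointer {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	// Describe a 6-word object: pointer, scalar, scalar, dead, scalar, pointer.
	layout := []uint8{kindPointer, kindScalar, kindScalar, kindDead, kindScalar, kindPointer}
	bm := newTypeBitmap(len(layout))
	for w, k := range layout {
		bm.set(w, k)
	}
	fmt.Println(bm.pointerWords(len(layout))) // prints [0 5]
}
```

Because the same encoding can describe stacks, data, and heap objects, a single scanning loop like `pointerWords` can serve all three, which is the unification the summary mentions.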
|
|
Pending CL 113120043.
LGTM=dave
R=golang-codereviews, dave
CC=golang-codereviews
https://golang.org/cl/112290043
|
|
Walking the stack by frames is ~3x more expensive
than not, and since it didn't end up being precise,
there is not enough benefit to outweigh the cost.
This is the conservative choice: this CL makes the
stack scanning behavior the same as it was in Go 1.1.
Add benchmarks to package runtime so that we have
them when we re-enable this feature during the
Go 1.3 development.
benchmark                   old ns/op  new ns/op    delta
BenchmarkGoroutineSelect      3194909    1272092  -60.18%
BenchmarkGoroutineBlocking    3120282     866366  -72.23%
BenchmarkGoroutineForRange    3256179     939902  -71.13%
BenchmarkGoroutineIdle        2005571     482982  -75.92%
The Go 1 benchmarks, just to add more data.
As far as I can tell the changes are mainly noise.
benchmark                        old ns/op   new ns/op    delta
BenchmarkBinaryTree17           4409403046  4414734932   +0.12%
BenchmarkFannkuch11             3407708965  3378306120   -0.86%
BenchmarkFmtFprintfEmpty               100          99   -0.60%
BenchmarkFmtFprintfString              242         239   -1.24%
BenchmarkFmtFprintfInt                 204         206   +0.98%
BenchmarkFmtFprintfIntInt              320         316   -1.25%
BenchmarkFmtFprintfPrefixedInt         295         299   +1.36%
BenchmarkFmtFprintfFloat               442         435   -1.58%
BenchmarkFmtManyArgs                  1246        1216   -2.41%
BenchmarkGobDecode                10186951    10051210   -1.33%
BenchmarkGobEncode                16504381    16445650   -0.36%
BenchmarkGzip                    447030885   447056865   +0.01%
BenchmarkGunzip                  111056154   111696305   +0.58%
BenchmarkHTTPClientServer            89973       93040   +3.41%
BenchmarkJSONEncode               28174182    27933893   -0.85%
BenchmarkJSONDecode              106353777   110443817   +3.85%
BenchmarkMandelbrot200             4822289     4806083   -0.34%
BenchmarkGoParse                   6102436     6142734   +0.66%
BenchmarkRegexpMatchEasy0_32           133         132   -0.75%
BenchmarkRegexpMatchEasy0_1K           372         373   +0.27%
BenchmarkRegexpMatchEasy1_32           113         111   -1.77%
BenchmarkRegexpMatchEasy1_1K           964         940   -2.49%
BenchmarkRegexpMatchMedium_32          202         205   +1.49%
BenchmarkRegexpMatchMedium_1K        68862       68858   -0.01%
BenchmarkRegexpMatchHard_32           3480        3407   -2.10%
BenchmarkRegexpMatchHard_1K         108255      112614   +4.03%
BenchmarkRevcomp                 751393035   743929976   -0.99%
BenchmarkTemplate                139637041   135402220   -3.03%
BenchmarkTimeParse                     479         475   -0.84%
BenchmarkTimeFormat                    460         466   +1.30%
benchmark                       old MB/s  new MB/s  speedup
BenchmarkGobDecode                 75.34     76.36    1.01x
BenchmarkGobEncode                 46.50     46.67    1.00x
BenchmarkGzip                      43.41     43.41    1.00x
BenchmarkGunzip                   174.73    173.73    0.99x
BenchmarkJSONEncode                68.87     69.47    1.01x
BenchmarkJSONDecode                18.25     17.57    0.96x
BenchmarkGoParse                    9.49      9.43    0.99x
BenchmarkRegexpMatchEasy0_32      239.58    241.74    1.01x
BenchmarkRegexpMatchEasy0_1K     2749.74   2738.00    1.00x
BenchmarkRegexpMatchEasy1_32      282.49    286.32    1.01x
BenchmarkRegexpMatchEasy1_1K     1062.00   1088.96    1.03x
BenchmarkRegexpMatchMedium_32       4.93      4.86    0.99x
BenchmarkRegexpMatchMedium_1K      14.87     14.87    1.00x
BenchmarkRegexpMatchHard_32         9.19      9.39    1.02x
BenchmarkRegexpMatchHard_1K         9.46      9.09    0.96x
BenchmarkRevcomp                  338.26    341.65    1.01x
BenchmarkTemplate                  13.90     14.33    1.03x
Fixes #6482.
R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/14257043
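The goroutine benchmarks above measure how garbage collection cost grows with the number of goroutine stacks that must be scanned. The following is a rough sketch of that kind of measurement, not the benchmark code added by this CL: `gcTimeWithIdleGoroutines` and `deltaPercent` are hypothetical helpers, and the goroutine counts are arbitrary. `deltaPercent` shows how the delta column is derived from the old and new ns/op figures.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// gcTimeWithIdleGoroutines parks n goroutines on a channel receive and
// returns the average wall-clock time of a forced GC cycle over the given
// number of rounds. Each parked goroutine contributes a stack to scan.
func gcTimeWithIdleGoroutines(n, rounds int) time.Duration {
	stop := make(chan struct{})
	var ready sync.WaitGroup
	ready.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			ready.Done()
			<-stop // stay idle until the measurement is over
		}()
	}
	ready.Wait()

	start := time.Now()
	for i := 0; i < rounds; i++ {
		runtime.GC()
	}
	elapsed := time.Since(start)

	close(stop) // release the idle goroutines
	return elapsed / time.Duration(rounds)
}

// deltaPercent computes the relative change shown in the delta column.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

func main() {
	for _, n := range []int{0, 1000, 10000} {
		fmt.Printf("%6d idle goroutines: %v per GC\n", n, gcTimeWithIdleGoroutines(n, 5))
	}
	// Figures taken from the BenchmarkGoroutineSelect row above.
	fmt.Printf("BenchmarkGoroutineSelect delta: %.2f%%\n", deltaPercent(3194909, 1272092))
}
```

Timings from a sketch like this depend heavily on the Go version and machine, so they are not comparable with the figures in the commit message.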
|
|
Currently many sys allocations are not accounted for in any of the XxxSys stats,
including the GC bitmap, spans table, GC roots blocks, GC finalizer blocks,
iface table, netpoll descriptors and more. Up to ~20% can be unaccounted for.
This change introduces 2 new stats: GCSys and OtherSys, for GC metadata
and all other misc allocations, respectively.
It also ensures that all XxxSys stats sum up to Sys. All sys memory allocation
functions now require the stat for accounting, so that it is impossible to miss something.
Also fix updating of mcache_sys/inuse; they were not updated after deallocation.
test/bench/garbage/parser before:
Sys          670064344
HeapSys      610271232
StackSys         65536
MSpanSys      14204928
MCacheSys        16384
BuckHashSys    1439992
after:
Sys          670064344
HeapSys      610271232
StackSys         65536
MSpanSys      14188544
MCacheSys        16384
BuckHashSys    3194304
GCSys         39198688
OtherSys       3129656
Fixes #5799.
R=rsc, dave, alex.brainman
CC=golang-dev
https://golang.org/cl/12946043
|
|
Allocs of size 16 can bypass the atomic set of the allocated bit, while allocs of size 8 cannot.
Allocs with and without type info hit different paths inside malloc.
Current results on linux/amd64:
BenchmarkMalloc8           50000000  43.6 ns/op
BenchmarkMalloc16          50000000  46.7 ns/op
BenchmarkMallocTypeInfo8   50000000  61.3 ns/op
BenchmarkMallocTypeInfo16  50000000  63.5 ns/op
R=golang-dev, remyoudompheng, minux.ma, bradfitz, iant
CC=golang-dev
https://golang.org/cl/9090045
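Benchmarks of this shape can be approximated outside the runtime package with `testing.Benchmark`. The sketch below is not the code from this CL. It assumes a 64-bit platform, and it uses pointer-free arrays to stand in for allocations without type info and pointer-containing structs for allocations with type info, which is an assumption about what the original benchmarks exercise. All function and variable names are hypothetical.

```go
package main

import (
	"fmt"
	"testing"
)

// Package-level sinks force each allocation to escape to the heap, so the
// compiler cannot turn it into a stack allocation or remove it.
var (
	sink8     *[8]byte
	sink16    *[16]byte
	sinkPtr8  *struct{ p *int }    // 8 bytes on a 64-bit platform
	sinkPtr16 *struct{ p, q *int } // 16 bytes on a 64-bit platform
)

func benchMalloc8(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink8 = new([8]byte)
	}
}

func benchMalloc16(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink16 = new([16]byte)
	}
}

func benchMallocTypeInfo8(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkPtr8 = new(struct{ p *int })
	}
}

func benchMallocTypeInfo16(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sinkPtr16 = new(struct{ p, q *int })
	}
}

// nsPerOp runs one benchmark function and returns its ns/op figure.
func nsPerOp(f func(b *testing.B)) int64 {
	return testing.Benchmark(f).NsPerOp()
}

func main() {
	fmt.Println("Malloc8          ", nsPerOp(benchMalloc8), "ns/op")
	fmt.Println("Malloc16         ", nsPerOp(benchMalloc16), "ns/op")
	fmt.Println("MallocTypeInfo8  ", nsPerOp(benchMallocTypeInfo8), "ns/op")
	fmt.Println("MallocTypeInfo16 ", nsPerOp(benchMallocTypeInfo16), "ns/op")
}
```

The allocator has changed substantially since this commit, so results from a current Go release should not be expected to match the numbers listed above.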
|