| Age | Commit message (Collapse) | Author |
|
Change-Id: I58e00a39cf63df07813d21453f91e68eef6a413c
Reviewed-on: https://go-review.googlesource.com/5635
Reviewed-by: Rob Pike <r@golang.org>
|
|
Change-Id: I3be69a4ebf300ad24b55b5f43fd7ad1f001c762e
Reviewed-on: https://go-review.googlesource.com/4838
Reviewed-by: Rob Pike <r@golang.org>
|
|
Otherwise the exported variable collides with the type Arch.
While we're here, remove arch.dumpit (now in portable code)
and add arch.defframe (forgotten originally, somehow).
Change-Id: I1b3a7dd7e96c5f632dba7cd6c1217b42a2004d72
Reviewed-on: https://go-review.googlesource.com/4644
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently we always create context objects for closures that capture variables.
However, it is completely unnecessary for direct calls of closures
(whether it is func()(), defer func()() or go func()()).
This change transforms any OCALLFUNC(OCLOSURE) to normal function call.
Closed variables become function arguments.
This transformation is especially beneficial for go func(),
because we do not need to allocate context object on heap.
But it makes direct closure calls a bit faster as well (see BenchmarkClosureCall).
On implementation level it required to introduce yet another compiler pass.
However, the pass iterates only over xtop, so it should not be an issue.
Transformation consists of two parts: closure transformation and call site
transformation. We can't run these parts on different sides of escape analysis,
because tree state is inconsistent. We can do both parts during typecheck,
we don't know how to capture variables and don't have call site.
We can't do both parts during walk of OCALLFUNC, because we can walk
OCLOSURE body earlier.
So now capturevars pass only decides how to capture variables
(this info is required for escape analysis). New transformclosure
pass, that runs just before order/walk, does all transformations
of a closure. And later walk of OCALLFUNC(OCLOSURE) transforms call site.
benchmark old ns/op new ns/op delta
BenchmarkClosureCall 4.89 3.09 -36.81%
BenchmarkCreateGoroutinesCapture 1634 1294 -20.81%
benchmark old allocs new allocs delta
BenchmarkCreateGoroutinesCapture 6 2 -66.67%
benchmark old bytes new bytes delta
BenchmarkCreateGoroutinesCapture 176 48 -72.73%
Change-Id: Ic85e1706e18c3235cc45b3c0c031a9c1cdb7a40e
Reviewed-on: https://go-review.googlesource.com/4050
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Consider an interface value i of type I and concrete value c of type C.
Prior to this CL, i==c was evaluated as
I(c) == i
Evaluating I(c) can allocate.
This CL changes the evaluation of i==c to
x, ok := i.(C); ok && x == c
The new generated code is shorter and does not allocate directly.
If C is small, as it is in every instance in the stdlib,
the new code also uses less stack space
and makes one runtime call instead of two.
If C is very large, the original implementation is used.
The cutoff for "very large" is 1<<16,
following the stack vs heap cutoff used elsewhere.
This kind of comparison occurs in 38 places in the stdlib,
mostly in the net and os packages.
benchmark old ns/op new ns/op delta
BenchmarkEqEfaceConcrete 29.5 7.92 -73.15%
BenchmarkEqIfaceConcrete 32.1 7.90 -75.39%
BenchmarkNeEfaceConcrete 29.9 7.90 -73.58%
BenchmarkNeIfaceConcrete 35.9 7.90 -77.99%
Fixes #9370.
Change-Id: I7c4555950bcd6406ee5c613be1f2128da2c9a2b7
Reviewed-on: https://go-review.googlesource.com/2096
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Extend escape analysis to make(map[k]v).
If it does not escape, allocate temp buffer for hmap and one bucket on stack.
There are 75 cases of non-escaping maps in std lib.
benchmark old allocs new allocs delta
BenchmarkConcurrentStmtQuery 16161 15161 -6.19%
BenchmarkConcurrentTxQuery 17658 16658 -5.66%
BenchmarkConcurrentTxStmtQuery 16157 15156 -6.20%
BenchmarkConcurrentRandom 13637 13114 -3.84%
BenchmarkManyConcurrentQueries 22 20 -9.09%
BenchmarkDecodeComplex128Slice 250 188 -24.80%
BenchmarkDecodeFloat64Slice 250 188 -24.80%
BenchmarkDecodeInt32Slice 250 188 -24.80%
BenchmarkDecodeStringSlice 2250 2188 -2.76%
BenchmarkNewEmptyMap 1 0 -100.00%
BenchmarkNewSmallMap 2 0 -100.00%
benchmark old ns/op new ns/op delta
BenchmarkNewEmptyMap 124 55.7 -55.08%
BenchmarkNewSmallMap 317 148 -53.31%
benchmark old allocs new allocs delta
BenchmarkNewEmptyMap 1 0 -100.00%
BenchmarkNewSmallMap 2 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkNewEmptyMap 48 0 -100.00%
BenchmarkNewSmallMap 192 0 -100.00%
Fixes #5449
Change-Id: I24fa66f949d2f138885d9e66a0d160240dc9e8fa
Reviewed-on: https://go-review.googlesource.com/3508
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
|
|
Walk calls it outervalue, racewalk calls it basenod,
isstack does it manually and slightly differently.
Change-Id: Id5b5d32b8faf143fe9d34bd08457bfab6fb33daa
Reviewed-on: https://go-review.googlesource.com/3745
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Support the following conversions in escape analysis:
[]rune("foo")
[]byte("foo")
string([]rune{})
If the result does not escape, allocate temp buffer on stack
and pass it to runtime functions.
Change-Id: I1d075907eab8b0109ad7ad1878104b02b3d5c690
Reviewed-on: https://go-review.googlesource.com/3590
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
This isn't everything, but it's a start.
Still producing bit-identical compiler output.
The semantics of the old back ends is preserved,
even when they are probably buggy.
There are some TODOs in gc/gsubr.c to
remove special cases to preserve bugs in 5g and 8g.
Change-Id: I28ae295fbfc94ef9df43e13ab96bd6fc2f194bc4
Reviewed-on: https://go-review.googlesource.com/3802
Reviewed-by: Austin Clements <austin@google.com>
|
|
eqstring does not need to check the length of the strings.
6g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 7.03 6.14 -12.66%
BenchmarkCompareStringIdentical 3.36 3.04 -9.52%
5g
benchmark old ns/op new ns/op delta
BenchmarkCompareStringEqual 238 232 -2.52%
BenchmarkCompareStringIdentical 90.8 80.7 -11.12%
The equivalent PPC changes are in a separate commit
because I don't have the hardware to test them.
Change-Id: I292874324b9bbd9d24f57a390cfff8b550cdd53c
Reviewed-on: https://go-review.googlesource.com/3955
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Using benchmark from the issue:
benchmark old ns/op new ns/op delta
BenchmarkRangeStringCast 2162 1152 -46.72%
benchmark old allocs new allocs delta
BenchmarkRangeStringCast 1 0 -100.00%
Fixes #2204
Change-Id: I92c5edd2adca4a7b6fba00713a581bf49dc59afe
Reviewed-on: https://go-review.googlesource.com/3790
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Language specification says that variables are captured by reference.
And that is what gc compiler does. However, in lots of cases it is
possible to capture variables by value under the hood without
affecting visible behavior of programs. For example, consider
the following typical pattern:
func (o *Obj) requestMany(urls []string) []Result {
wg := new(sync.WaitGroup)
wg.Add(len(urls))
res := make([]Result, len(urls))
for i := range urls {
i := i
go func() {
res[i] = o.requestOne(urls[i])
wg.Done()
}()
}
wg.Wait()
return res
}
Currently o, wg, res, and i are captured by reference causing 3+len(urls)
allocations (e.g. PPARAM o is promoted to PPARAMREF and moved to heap).
But all of them can be captured by value without changing behavior.
This change implements simple strategy for capturing by value:
if a captured variable is not addrtaken and never assigned to,
then it is captured by value (it is effectively const).
This simple strategy turned out to be very effective:
~80% of all captures in std lib are turned into value captures.
The remaining 20% are mostly in defers and non-escaping closures,
that is, they do not cause allocations anyway.
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 153 126 -17.65%
BenchmarkEncodeDigitsSpeed1e4 91 69 -24.18%
BenchmarkEncodeDigitsSpeed1e5 178 129 -27.53%
BenchmarkEncodeDigitsSpeed1e6 1510 1051 -30.40%
BenchmarkEncodeDigitsDefault1e4 100 75 -25.00%
BenchmarkEncodeDigitsDefault1e5 193 139 -27.98%
BenchmarkEncodeDigitsDefault1e6 1420 985 -30.63%
BenchmarkEncodeDigitsCompress1e4 100 75 -25.00%
BenchmarkEncodeDigitsCompress1e5 193 139 -27.98%
BenchmarkEncodeDigitsCompress1e6 1420 985 -30.63%
BenchmarkEncodeTwainSpeed1e4 109 81 -25.69%
BenchmarkEncodeTwainSpeed1e5 211 151 -28.44%
BenchmarkEncodeTwainSpeed1e6 1588 1097 -30.92%
BenchmarkEncodeTwainDefault1e4 103 77 -25.24%
BenchmarkEncodeTwainDefault1e5 199 143 -28.14%
BenchmarkEncodeTwainDefault1e6 1324 917 -30.74%
BenchmarkEncodeTwainCompress1e4 103 77 -25.24%
BenchmarkEncodeTwainCompress1e5 190 137 -27.89%
BenchmarkEncodeTwainCompress1e6 1327 919 -30.75%
BenchmarkConcurrentDBExec 16223 16220 -0.02%
BenchmarkConcurrentStmtQuery 17687 16182 -8.51%
BenchmarkConcurrentStmtExec 5191 5186 -0.10%
BenchmarkConcurrentTxQuery 17665 17661 -0.02%
BenchmarkConcurrentTxExec 15154 15150 -0.03%
BenchmarkConcurrentTxStmtQuery 17661 16157 -8.52%
BenchmarkConcurrentTxStmtExec 3677 3673 -0.11%
BenchmarkConcurrentRandom 14000 13614 -2.76%
BenchmarkManyConcurrentQueries 25 22 -12.00%
BenchmarkDecodeComplex128Slice 318 252 -20.75%
BenchmarkDecodeFloat64Slice 318 252 -20.75%
BenchmarkDecodeInt32Slice 318 252 -20.75%
BenchmarkDecodeStringSlice 2318 2252 -2.85%
BenchmarkDecode 11 8 -27.27%
BenchmarkEncodeGray 64 56 -12.50%
BenchmarkEncodeNRGBOpaque 64 56 -12.50%
BenchmarkEncodeNRGBA 67 58 -13.43%
BenchmarkEncodePaletted 68 60 -11.76%
BenchmarkEncodeRGBOpaque 64 56 -12.50%
BenchmarkGoLookupIP 153 139 -9.15%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServer 62 59 -4.84%
BenchmarkClientServerParallel4 62 59 -4.84%
BenchmarkClientServerParallel64 62 59 -4.84%
BenchmarkClientServerParallelTLS4 79 76 -3.80%
BenchmarkClientServerParallelTLS64 112 109 -2.68%
BenchmarkCreateGoroutinesCapture 10 6 -40.00%
BenchmarkAfterFunc 1006 1005 -0.10%
Fixes #6632.
Change-Id: I0cd51e4d356331d7f3c5f447669080cd19b0d2ca
Reviewed-on: https://go-review.googlesource.com/3166
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
If result of string(i) does not escape,
allocate a [4]byte temp on stack for it.
Change-Id: If31ce9447982929d5b3b963fd0830efae4247c37
Reviewed-on: https://go-review.googlesource.com/3411
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Currently we always allocate string buffers in heap.
For example, in the following code we allocate a temp string
just for comparison:
if string(byteSlice) == "abc" { ... }
This change extends escape analysis to cover []byte->string
conversions and string concatenation. If the result of operations
does not escape, compiler allocates a small buffer
on stack and passes it to slicebytetostring and concatstrings.
Then runtime uses the buffer if the result fits into it.
Size of the buffer is 32 bytes. There is no fundamental theory
behind this number. Just an observation that on std lib
tests/benchmarks frequency of string allocation is inversely
proportional to string length; and there is significant number
of allocations up to length 32.
benchmark old allocs new allocs delta
BenchmarkFprintfBytes 2 1 -50.00%
BenchmarkDecodeComplex128Slice 318 316 -0.63%
BenchmarkDecodeFloat64Slice 318 316 -0.63%
BenchmarkDecodeInt32Slice 318 316 -0.63%
BenchmarkDecodeStringSlice 2318 2316 -0.09%
BenchmarkStripTags 11 5 -54.55%
BenchmarkDecodeGray 111 102 -8.11%
BenchmarkDecodeNRGBAGradient 200 188 -6.00%
BenchmarkDecodeNRGBAOpaque 165 152 -7.88%
BenchmarkDecodePaletted 319 309 -3.13%
BenchmarkDecodeRGB 166 157 -5.42%
BenchmarkDecodeInterlacing 279 268 -3.94%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
BenchmarkGoLookupIPWithBrokenNameServer 245 226 -7.76%
BenchmarkClientServerParallel4 62 61 -1.61%
BenchmarkClientServerParallel64 62 61 -1.61%
BenchmarkClientServerParallelTLS4 79 78 -1.27%
BenchmarkClientServerParallelTLS64 112 111 -0.89%
benchmark old ns/op new ns/op delta
BenchmarkFprintfBytes 381 311 -18.37%
BenchmarkStripTags 2615 2351 -10.10%
BenchmarkDecodeNRGBAGradient 3715887 3635096 -2.17%
BenchmarkDecodeNRGBAOpaque 3047645 2928644 -3.90%
BenchmarkGoLookupIP 153 135 -11.76%
BenchmarkGoLookupIPNoSuchHost 508 466 -8.27%
Change-Id: I9ec01da816945c3329d7be3c7794b520418c3f99
Reviewed-on: https://go-review.googlesource.com/3120
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
For some reason the current conditions require the type to be "uintptr-shaped".
This cuts off structs and arrays with a pointer.
isdirectiface and width==widthptr is sufficient condition to enable the fast paths.
Change-Id: I11842531e7941365413606cfd6c34c202aa14786
Reviewed-on: https://go-review.googlesource.com/3414
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
This is another case where we can say that the address refers to stack.
We create such temps for OSTRUCTLIT initialization.
This eliminates a handful of write barriers today.
But this come up a prerequisite for another change (capturing vars by value),
otherwise we emit writebarriers in writebarrier itself when
capture writebarrier arguments by value.
Change-Id: Ibba93acd0f5431c5a4c3d90ef1e622cb9a7ff50e
Reviewed-on: https://go-review.googlesource.com/3285
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
- Remove more ? : expressions.
- Use uint32 **hash instead of uint32 *hash[] in function argument.
- Change array.c API to use int, not int32, to match Go's slices.
- Rename strlit to newstrlit, to avoid case-insensitive collision with Strlit.
- Fix a few incorrect printf formats.
- Rename a few variables from 'len' to n or length.
- Eliminate direct string editing building up names like convI2T.
Change-Id: I754cf553402ccdd4963e51b7039f589286219c29
Reviewed-on: https://go-review.googlesource.com/3278
Reviewed-by: Rob Pike <r@golang.org>
|
|
cmd/gc contains symbol references into the back end dirs like 6g.
It also contains a few files that include the back end header files and
are compiled separately for each back end, despite being in cmd/gc.
cmd/gc also defines main, which makes at least one reverse symbol
reference unavoidable. (Otherwise you can't get into back-end code.)
This was all expedient, but it's too tightly coupled, especially for a
program written Go.
Make cmd/gc into a true library, letting the back end define main and
call into cmd/gc after making the necessary references available.
cmd/gc being a real library will ease the transition to Go.
Change-Id: I4fb9a0e2b11a32f1d024b3c56fc3bd9ee458842c
Reviewed-on: https://go-review.googlesource.com/3277
Reviewed-by: Rob Pike <r@golang.org>
|
|
For Austin's framepointer experiment.
Change-Id: I81b6f414943b3578924f853300b9193684f79bf4
Reviewed-on: https://go-review.googlesource.com/2994
Reviewed-by: Austin Clements <austin@google.com>
|
|
The compiler converts 'val, ok = m[key]' to
tmp, ok = <runtime call>
val = *tmp
For lookups of the form '_, ok = m[key]',
the second statement is unnecessary.
By not generating it we save a nil check.
Change-Id: I21346cc195cb3c62e041af8b18770c0940358695
Reviewed-on: https://go-review.googlesource.com/1975
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The equal algorithm used to take the size
equal(p, q *T, size uintptr) bool
With this change, it does not
equal(p, q *T) bool
Similarly for the hash algorithm.
The size is rarely used, as most equal functions know the size
of the thing they are comparing. For instance f32equal already
knows its inputs are 4 bytes in size.
For cases where the size is not known, we allocate a closure
(one for each size needed) that points to an assembly stub that
reads the size out of the closure and calls generic code that
has a size argument.
Reduces the size of the go binary by 0.07%. Performance impact
is not measurable.
Change-Id: I6e00adf3dde7ad2974adbcff0ee91e86d2194fec
Reviewed-on: https://go-review.googlesource.com/2392
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: I5624b509a36650bce6834cf394b9da163abbf8c0
Reviewed-on: https://go-review.googlesource.com/2310
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
A side effect of this change is that when assertI2T writes to the
memory for the T being extracted, it can use typedmemmove
for write barriers.
There are other ways we could have done this, but this one
finishes a TODO in package runtime.
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: Icbc8aabfd8a9b1f00be2e421af0e3b29fa54d01e
Reviewed-on: https://go-review.googlesource.com/2279
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Found with GODEBUG=wbshadow=2 mode.
Eventually that will run automatically, but right now
it still detects other missing write barriers.
Change-Id: I1320d5340a9e421c779f24f3b170e33974e56e4f
Reviewed-on: https://go-review.googlesource.com/2278
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Preparation for replacing many memmove calls in runtime
with typedmemmove, which is a clearer description of what
the routine is doing.
For the same reason, rename writebarriercopy to typedslicecopy.
Change-Id: I6f23bef2c2215509fefba175b16908f76dc7538c
Reviewed-on: https://go-review.googlesource.com/2276
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Calls to goproc/deferproc used to push & pop two extra arguments,
the argument size and the function to call. Now, we allocate space
for those arguments in the outargs section so we don't have to
modify the SP.
Defers now use the stack pointer (instead of the argument pointer)
to identify which frame they are associated with.
A followon CL might simplify funcspdelta and some of the stack
walking code.
Fixes issue #8641
Change-Id: I835ec2f42f0392c5dec7cb0fe6bba6f2aed1dad8
Reviewed-on: https://go-review.googlesource.com/1601
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Change-Id: I3b81f2e9eb29ee6349d758b68fe7951b34f15a81
Reviewed-on: https://go-review.googlesource.com/1974
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
//go:nowritebarrier can only be used in package runtime.
It does not disable write barriers; it is an assertion, checked
by the compiler, that the following function needs no write
barriers.
Change-Id: Id7978b779b66dc1feea39ee6bda9fd4d80280b7c
Reviewed-on: https://go-review.googlesource.com/1224
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
TBR=austin
CC=golang-codereviews
https://golang.org/cl/179290043
|
|
warning: src/cmd/gc/walk.c:1769 set and not used: on
LGTM=rsc
R=rsc, minux
CC=golang-codereviews
https://golang.org/cl/175850043
|
|
This got lost in the change that added the writebarrierfat variants.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/165510043
|
|
Now each C printf, Go print, or Go println is guaranteed
not to be interleaved with other calls of those functions.
This should help when debugging concurrent failures.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/169120043
|
|
writebarrierptr
This CL implements the many multiword write barriers by calling
writebarrierptr, so that only writebarrierptr needs the actual barrier.
In lieu of an actual barrier, writebarrierptr checks that the value
being copied is not a small non-zero integer. This is enough to
shake out bugs where the barrier is being called when it should not
(for non-pointer values). It also found a few tests in sync/atomic
that were being too clever.
This CL adds a write barrier for the memory moved during the
builtin copy function, which I forgot when inserting barriers for Go 1.4.
This CL re-enables some write barriers that were disabled for Go 1.4.
Those were disabled because it is possible to change the generated
code so that they are unnecessary most of the time, but we have not
changed the generated code yet. For safety they must be enabled.
None of this is terribly efficient. We are aiming for correct first.
LGTM=rlh
R=rlh
CC=golang-codereviews
https://golang.org/cl/168770043
|
|
Still passes on amd64.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/165110043
|
|
TBR=austin
CC=golang-codereviews
https://golang.org/cl/162420043
|
|
Fixes #9006.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/167800043
|
|
goprintf is a printf-like print for Go.
It is used in the code generated by 'defer print(...)' and 'go print(...)'.
Normally print(1, 2, 3) turns into
printint(1)
printint(2)
printint(3)
but defer and go need a single function call to give the runtime;
they give the runtime something like goprintf("%d%d%d", 1, 2, 3).
Variadic functions like goprintf cannot be described in the new
type information world, so we have to replace it.
Replace with a custom function, so that defer print(1, 2, 3) turns
into
defer func(a1, a2, a3 int) {
print(a1, a2, a3)
}(1, 2, 3)
(and then the print becomes three different printints as usual).
Fixes #8614.
LGTM=austin
R=austin
CC=golang-codereviews, r
https://golang.org/cl/159700043
|
|
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/164110043
|
|
CL 157910047 introduced code to turn a node representing
a zeroed composite literal into N, the nil Node* pointer
(which represents any zero, not the Go literal nil).
That's great for assignments like x = T{}, but it doesn't work
when T{} is used in a value context like T{}.v or x == T{}.
Fix those.
Should have no effect on performance; confirmed.
The deltas below are noise (compare ns/op):
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 2902919192 2915228424 +0.42%
BenchmarkFannkuch11 2597417605 2630363685 +1.27%
BenchmarkFmtFprintfEmpty 73.7 74.8 +1.49%
BenchmarkFmtFprintfString 196 199 +1.53%
BenchmarkFmtFprintfInt 213 217 +1.88%
BenchmarkFmtFprintfIntInt 336 356 +5.95%
BenchmarkFmtFprintfPrefixedInt 289 294 +1.73%
BenchmarkFmtFprintfFloat 415 416 +0.24%
BenchmarkFmtManyArgs 1281 1271 -0.78%
BenchmarkGobDecode 10271734 10307978 +0.35%
BenchmarkGobEncode 8985021 9079442 +1.05%
BenchmarkGzip 410233227 412266944 +0.50%
BenchmarkGunzip 102114554 103272443 +1.13%
BenchmarkHTTPClientServer 45297 44993 -0.67%
BenchmarkJSONEncode 19499741 19498489 -0.01%
BenchmarkJSONDecode 76436733 74247497 -2.86%
BenchmarkMandelbrot200 4273814 4307292 +0.78%
BenchmarkGoParse 4024594 4028937 +0.11%
BenchmarkRegexpMatchEasy0_32 131 135 +3.05%
BenchmarkRegexpMatchEasy0_1K 328 333 +1.52%
BenchmarkRegexpMatchEasy1_32 115 117 +1.74%
BenchmarkRegexpMatchEasy1_1K 931 948 +1.83%
BenchmarkRegexpMatchMedium_32 216 217 +0.46%
BenchmarkRegexpMatchMedium_1K 72669 72857 +0.26%
BenchmarkRegexpMatchHard_32 3818 3809 -0.24%
BenchmarkRegexpMatchHard_1K 121398 121945 +0.45%
BenchmarkRevcomp 613996550 615145436 +0.19%
BenchmarkTemplate 93678525 93267391 -0.44%
BenchmarkTimeParse 414 411 -0.72%
BenchmarkTimeFormat 396 399 +0.76%
Fixes #8947.
LGTM=r
R=r, dave
CC=golang-codereviews
https://golang.org/cl/162130043
|
|
This brings dev.power64 up-to-date with the current tip of
default. go_bootstrap is still panicking with a bad defer
when initializing the runtime (even on amd64).
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/152570049
|
|
This also removes pkg/runtime/traceback_lr.c, which was ported
to Go in an earlier commit and then moved to
runtime/traceback.go.
Reviewer: rsc@golang.org
rsc: LGTM
|
|
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/160200044
|
|
Both of these forms can avoid writing to the base pointer in x
(in the slice, always, and in the append, most of the time).
For Go 1.5, will need to change the compilation of x = x[0:y]
to avoid writing to the base pointer, so that the elision is safe,
and will need to change the compilation of x = append(x, ...)
to write to the base pointer (through a barrier) only when
growing the underlying array, so that the general elision is safe.
For Go 1.4, elide the write barrier always, a change that should
have equivalent performance characteristics but is much
simpler and therefore safer.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3910526122 3918802545 +0.21%
BenchmarkFannkuch11 3747650699 3732600693 -0.40%
BenchmarkFmtFprintfEmpty 106 98.7 -6.89%
BenchmarkFmtFprintfString 280 269 -3.93%
BenchmarkFmtFprintfInt 296 282 -4.73%
BenchmarkFmtFprintfIntInt 467 470 +0.64%
BenchmarkFmtFprintfPrefixedInt 418 398 -4.78%
BenchmarkFmtFprintfFloat 574 535 -6.79%
BenchmarkFmtManyArgs 1768 1818 +2.83%
BenchmarkGobDecode 14916799 14925182 +0.06%
BenchmarkGobEncode 14110076 13358298 -5.33%
BenchmarkGzip 546609795 542630402 -0.73%
BenchmarkGunzip 136270657 136496277 +0.17%
BenchmarkHTTPClientServer 126574 125245 -1.05%
BenchmarkJSONEncode 30006238 27862354 -7.14%
BenchmarkJSONDecode 106020889 102664600 -3.17%
BenchmarkMandelbrot200 5793550 5818320 +0.43%
BenchmarkGoParse 5437608 5463962 +0.48%
BenchmarkRegexpMatchEasy0_32 192 179 -6.77%
BenchmarkRegexpMatchEasy0_1K 462 460 -0.43%
BenchmarkRegexpMatchEasy1_32 168 153 -8.93%
BenchmarkRegexpMatchEasy1_1K 1420 1280 -9.86%
BenchmarkRegexpMatchMedium_32 338 286 -15.38%
BenchmarkRegexpMatchMedium_1K 107435 98027 -8.76%
BenchmarkRegexpMatchHard_32 5941 4846 -18.43%
BenchmarkRegexpMatchHard_1K 185965 153830 -17.28%
BenchmarkRevcomp 795497458 798447829 +0.37%
BenchmarkTemplate 132091559 134938425 +2.16%
BenchmarkTimeParse 604 608 +0.66%
BenchmarkTimeFormat 551 548 -0.54%
LGTM=r
R=r, dave
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/159960043
|
|
Among other things, *x = T{} does not need a write barrier.
The changes here avoid an unnecessary copy even when
no pointers are involved, so it may have larger effects.
In 6g and 8g, avoid manually repeated STOSQ in favor of
writing explicit MOVs, under the theory that the MOVs
should have fewer dependencies and pipeline better.
Benchmarks compare best of 5 on a 2012 MacBook Pro Core i5
with TurboBoost disabled. Most improvements can be explained
by the changes in this CL.
The effect in Revcomp is real but harder to explain: none of
the instructions in the inner loop changed. I suspect loop
alignment but really have no idea.
benchmark old new delta
BenchmarkBinaryTree17 3809027371 3819907076 +0.29%
BenchmarkFannkuch11 3607547556 3686983012 +2.20%
BenchmarkFmtFprintfEmpty 118 103 -12.71%
BenchmarkFmtFprintfString 289 277 -4.15%
BenchmarkFmtFprintfInt 304 290 -4.61%
BenchmarkFmtFprintfIntInt 507 458 -9.66%
BenchmarkFmtFprintfPrefixedInt 425 408 -4.00%
BenchmarkFmtFprintfFloat 555 555 +0.00%
BenchmarkFmtManyArgs 1835 1733 -5.56%
BenchmarkGobDecode 14738209 14639331 -0.67%
BenchmarkGobEncode 14239039 13703571 -3.76%
BenchmarkGzip 538211054 538701315 +0.09%
BenchmarkGunzip 135430877 134818459 -0.45%
BenchmarkHTTPClientServer 116488 116618 +0.11%
BenchmarkJSONEncode 28923406 29294334 +1.28%
BenchmarkJSONDecode 105779820 104289543 -1.41%
BenchmarkMandelbrot200 5791758 5771964 -0.34%
BenchmarkGoParse 5376642 5310943 -1.22%
BenchmarkRegexpMatchEasy0_32 195 190 -2.56%
BenchmarkRegexpMatchEasy0_1K 477 455 -4.61%
BenchmarkRegexpMatchEasy1_32 170 165 -2.94%
BenchmarkRegexpMatchEasy1_1K 1410 1394 -1.13%
BenchmarkRegexpMatchMedium_32 336 329 -2.08%
BenchmarkRegexpMatchMedium_1K 108979 106328 -2.43%
BenchmarkRegexpMatchHard_32 5854 5821 -0.56%
BenchmarkRegexpMatchHard_1K 185089 182838 -1.22%
BenchmarkRevcomp 834920364 780202624 -6.55%
BenchmarkTemplate 137046937 129728756 -5.34%
BenchmarkTimeParse 600 594 -1.00%
BenchmarkTimeFormat 559 539 -3.58%
LGTM=r
R=r
CC=golang-codereviews, iant, khr, rlh
https://golang.org/cl/157910047
|
|
The racewalk code was not updated for the new write barriers.
Make it more future-proof.
The new write barrier code assumed that +1 pointer would
be aligned properly for any type that might follow, but that's
not true on 32-bit systems where some types are 64-bit aligned.
The only system like that today is nacl/amd64p32.
Insert a dummy pointer so that the ambiguously typed
value is at +2 pointers, which is always max-aligned.
LGTM=r
R=r
CC=golang-codereviews, iant, khr
https://golang.org/cl/158890046
|
|
Assignments of 2-, 3-, and 4-word values were handled
by individual MOV instructions (and for scalars still are).
But if there are pointers involved, those assignments now
go through the write barrier routine. Before this CL, they
went to writebarrierfat, which calls memmove.
Memmove is too much overhead for these small
amounts of data.
Instead, call writebarrierfat{2,3,4}, which are specialized
for the specific amount of data being copied.
Today the write barrier does not care which words are
pointers, so size alone is enough to distinguish the cases.
If we keep these distinctions in Go 1.5 we will need to
expand them for all the pointer-vs-scalar possibilities,
so the current 3 functions will become 3+7+15 = 25,
still not a large burden (we deleted more morestack
functions than that when we dropped segmented stacks).
BenchmarkBinaryTree17 3250972583 3123910344 -3.91%
BenchmarkFannkuch11 3067605223 2964737839 -3.35%
BenchmarkFmtFprintfEmpty 101 96.0 -4.95%
BenchmarkFmtFprintfString 267 235 -11.99%
BenchmarkFmtFprintfInt 261 253 -3.07%
BenchmarkFmtFprintfIntInt 444 402 -9.46%
BenchmarkFmtFprintfPrefixedInt 374 346 -7.49%
BenchmarkFmtFprintfFloat 472 449 -4.87%
BenchmarkFmtManyArgs 1537 1476 -3.97%
BenchmarkGobDecode 13986528 12432985 -11.11%
BenchmarkGobEncode 13120323 12537420 -4.44%
BenchmarkGzip 451925758 437500578 -3.19%
BenchmarkGunzip 113267612 110053644 -2.84%
BenchmarkHTTPClientServer 103151 77100 -25.26%
BenchmarkJSONEncode 25002733 23435278 -6.27%
BenchmarkJSONDecode 94213717 82568789 -12.36%
BenchmarkMandelbrot200 4804246 4713070 -1.90%
BenchmarkGoParse 4646114 4379456 -5.74%
BenchmarkRegexpMatchEasy0_32 163 158 -3.07%
BenchmarkRegexpMatchEasy0_1K 433 391 -9.70%
BenchmarkRegexpMatchEasy1_32 154 138 -10.39%
BenchmarkRegexpMatchEasy1_1K 1481 1132 -23.57%
BenchmarkRegexpMatchMedium_32 282 270 -4.26%
BenchmarkRegexpMatchMedium_1K 92421 86149 -6.79%
BenchmarkRegexpMatchHard_32 5209 4718 -9.43%
BenchmarkRegexpMatchHard_1K 158141 147921 -6.46%
BenchmarkRevcomp 699818791 642222464 -8.23%
BenchmarkTemplate 132402383 108269713 -18.23%
BenchmarkTimeParse 509 478 -6.09%
BenchmarkTimeFormat 462 456 -1.30%
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/156200043
|
|
It is more useful to report all the errors instead of just the first.
LGTM=dave, khr
R=khr, dave
CC=golang-codereviews
https://golang.org/cl/143940043
|
|
During anylit run, nodes such as SLICEARR(statictmp, [:])
may be generated and are expected to be found unchanged by
gen_as_init.
In some walks (in particular walkselect), the statement
may be walked again and lowered to its usual form, leading to a
crash.
Fixes #8017.
Fixes #8024.
Fixes #8058.
LGTM=rsc
R=golang-codereviews, dvyukov, gobot, rsc
CC=golang-codereviews
https://golang.org/cl/112080043
|
|
A write *p = x that needs a write barrier (not all do)
now turns into runtime.writebarrierptr(p, x)
or one of the other variants.
The write barrier implementations are trivial.
The goal here is to emit the calls in the correct places
and to incur the cost of those function calls in the Go 1.4 cycle.
Performance on the Go 1 benchmark suite below.
Remember, the goal is to slow things down (and be correct).
We will look into optimizations in separate CLs, as part of
the process of comparing Go 1.3 against tip in order to make
sure Go 1.4 runs at least as fast as Go 1.3.
benchmark old ns/op new ns/op delta
BenchmarkBinaryTree17 3118336716 3452876110 +10.73%
BenchmarkFannkuch11 3184497677 3211552284 +0.85%
BenchmarkFmtFprintfEmpty 89.9 107 +19.02%
BenchmarkFmtFprintfString 236 287 +21.61%
BenchmarkFmtFprintfInt 246 278 +13.01%
BenchmarkFmtFprintfIntInt 395 458 +15.95%
BenchmarkFmtFprintfPrefixedInt 343 378 +10.20%
BenchmarkFmtFprintfFloat 477 525 +10.06%
BenchmarkFmtManyArgs 1446 1707 +18.05%
BenchmarkGobDecode 14398047 14685958 +2.00%
BenchmarkGobEncode 12557718 12947104 +3.10%
BenchmarkGzip 453462345 472413285 +4.18%
BenchmarkGunzip 114226016 115127398 +0.79%
BenchmarkHTTPClientServer 114689 112122 -2.24%
BenchmarkJSONEncode 24914536 26135942 +4.90%
BenchmarkJSONDecode 86832877 103620289 +19.33%
BenchmarkMandelbrot200 4833452 4898780 +1.35%
BenchmarkGoParse 4317976 4835474 +11.98%
BenchmarkRegexpMatchEasy0_32 150 166 +10.67%
BenchmarkRegexpMatchEasy0_1K 393 402 +2.29%
BenchmarkRegexpMatchEasy1_32 125 142 +13.60%
BenchmarkRegexpMatchEasy1_1K 1010 1236 +22.38%
BenchmarkRegexpMatchMedium_32 232 301 +29.74%
BenchmarkRegexpMatchMedium_1K 76963 102721 +33.47%
BenchmarkRegexpMatchHard_32 3833 5463 +42.53%
BenchmarkRegexpMatchHard_1K 119668 161614 +35.05%
BenchmarkRevcomp 763449047 706768534 -7.42%
BenchmarkTemplate 124954724 134834549 +7.91%
BenchmarkTimeParse 517 511 -1.16%
BenchmarkTimeFormat 501 514 +2.59%
benchmark old MB/s new MB/s speedup
BenchmarkGobDecode 53.31 52.26 0.98x
BenchmarkGobEncode 61.12 59.28 0.97x
BenchmarkGzip 42.79 41.08 0.96x
BenchmarkGunzip 169.88 168.55 0.99x
BenchmarkJSONEncode 77.89 74.25 0.95x
BenchmarkJSONDecode 22.35 18.73 0.84x
BenchmarkGoParse 13.41 11.98 0.89x
BenchmarkRegexpMatchEasy0_32 213.30 191.72 0.90x
BenchmarkRegexpMatchEasy0_1K 2603.92 2542.74 0.98x
BenchmarkRegexpMatchEasy1_32 254.00 224.93 0.89x
BenchmarkRegexpMatchEasy1_1K 1013.53 827.98 0.82x
BenchmarkRegexpMatchMedium_32 4.30 3.31 0.77x
BenchmarkRegexpMatchMedium_1K 13.30 9.97 0.75x
BenchmarkRegexpMatchHard_32 8.35 5.86 0.70x
BenchmarkRegexpMatchHard_1K 8.56 6.34 0.74x
BenchmarkRevcomp 332.92 359.62 1.08x
BenchmarkTemplate 15.53 14.39 0.93x
LGTM=rlh
R=rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/136380043
|
|
This CL adjusts code referring to src/pkg to refer to src.
Immediately after submitting this CL, I will submit
a change doing 'hg mv src/pkg/* src'.
That change will be too large to review with Rietveld
but will contain only the 'hg mv'.
This CL will break the build.
The followup 'hg mv' will fix it.
For more about the move, see golang.org/s/go14nopkg.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/134570043
|