| Age | Commit message | Author |
|
Each URL was manually verified to ensure it did not serve up incorrect
content.
Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
|
|
Propagate values through some wide Zero/Move operations. Among
other things this allows us to optimize some kinds of array
initialization. For example, the following code no longer
requires a temporary be allocated on the stack. Instead it
writes the values directly into the return value.
    func f(i uint32) [4]uint32 {
        return [4]uint32{i, i+1, i+2, i+3}
    }
The return value is unnecessarily cleared but removing that is
probably a task for dead store analysis (I think it needs to
be able to match multiple Store ops to wide Zero ops).
In order to reliably remove stack variables that are rendered
unnecessary by these new rules I've added a new generic version
of the unread autos elimination pass.
These rules are triggered more than 5000 times when building and
testing the standard library.
Updates #15925 (fixes for arrays of up to 4 elements).
Updates #24386 (fixes for up to 4 kept elements).
Updates #24416.
compilebench results:
name old time/op new time/op delta
Template 353ms ± 5% 359ms ± 3% ~ (p=0.143 n=10+10)
Unicode 219ms ± 1% 217ms ± 4% ~ (p=0.740 n=7+10)
GoTypes 1.26s ± 1% 1.26s ± 2% ~ (p=0.549 n=9+10)
Compiler 6.00s ± 1% 6.08s ± 1% +1.42% (p=0.000 n=9+8)
SSA 15.3s ± 2% 15.6s ± 1% +2.43% (p=0.000 n=10+10)
Flate 237ms ± 2% 240ms ± 2% +1.31% (p=0.015 n=10+10)
GoParser 285ms ± 1% 285ms ± 1% ~ (p=0.878 n=8+8)
Reflect 797ms ± 3% 807ms ± 2% ~ (p=0.065 n=9+10)
Tar 334ms ± 0% 335ms ± 4% ~ (p=0.460 n=8+10)
XML 419ms ± 0% 423ms ± 1% +0.91% (p=0.001 n=7+9)
StdCmd 46.0s ± 0% 46.4s ± 0% +0.85% (p=0.000 n=9+9)
name old user-time/op new user-time/op delta
Template 337ms ± 3% 346ms ± 5% ~ (p=0.053 n=9+10)
Unicode 205ms ±10% 205ms ± 8% ~ (p=1.000 n=10+10)
GoTypes 1.22s ± 2% 1.21s ± 3% ~ (p=0.436 n=10+10)
Compiler 5.85s ± 1% 5.93s ± 0% +1.46% (p=0.000 n=10+8)
SSA 14.9s ± 1% 15.3s ± 1% +2.62% (p=0.000 n=10+10)
Flate 229ms ± 4% 228ms ± 6% ~ (p=0.796 n=10+10)
GoParser 271ms ± 3% 275ms ± 4% ~ (p=0.165 n=10+10)
Reflect 779ms ± 5% 775ms ± 2% ~ (p=0.971 n=10+10)
Tar 317ms ± 4% 319ms ± 5% ~ (p=0.853 n=10+10)
XML 404ms ± 4% 409ms ± 5% ~ (p=0.436 n=10+10)
name old alloc/op new alloc/op delta
Template 34.9MB ± 0% 35.0MB ± 0% +0.26% (p=0.000 n=10+10)
Unicode 29.3MB ± 0% 29.3MB ± 0% +0.02% (p=0.000 n=10+10)
GoTypes 115MB ± 0% 115MB ± 0% +0.30% (p=0.000 n=10+10)
Compiler 519MB ± 0% 521MB ± 0% +0.30% (p=0.000 n=10+10)
SSA 1.55GB ± 0% 1.57GB ± 0% +1.34% (p=0.000 n=10+9)
Flate 24.1MB ± 0% 24.2MB ± 0% +0.10% (p=0.000 n=10+10)
GoParser 28.1MB ± 0% 28.1MB ± 0% +0.07% (p=0.000 n=10+10)
Reflect 78.7MB ± 0% 78.7MB ± 0% +0.03% (p=0.000 n=8+10)
Tar 34.4MB ± 0% 34.5MB ± 0% +0.12% (p=0.000 n=10+10)
XML 43.2MB ± 0% 43.2MB ± 0% +0.13% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Template 330k ± 0% 330k ± 0% -0.01% (p=0.017 n=10+10)
Unicode 337k ± 0% 337k ± 0% +0.01% (p=0.000 n=9+10)
GoTypes 1.15M ± 0% 1.15M ± 0% +0.03% (p=0.000 n=10+10)
Compiler 4.77M ± 0% 4.77M ± 0% +0.03% (p=0.000 n=9+10)
SSA 12.5M ± 0% 12.6M ± 0% +1.16% (p=0.000 n=10+10)
Flate 221k ± 0% 221k ± 0% +0.05% (p=0.000 n=9+10)
GoParser 275k ± 0% 275k ± 0% +0.01% (p=0.014 n=10+9)
Reflect 944k ± 0% 944k ± 0% -0.02% (p=0.000 n=10+10)
Tar 324k ± 0% 323k ± 0% -0.12% (p=0.000 n=10+10)
XML 384k ± 0% 384k ± 0% -0.01% (p=0.001 n=10+10)
name old object-bytes new object-bytes delta
Template 476kB ± 0% 476kB ± 0% -0.04% (p=0.000 n=10+10)
Unicode 218kB ± 0% 218kB ± 0% ~ (all equal)
GoTypes 1.58MB ± 0% 1.58MB ± 0% -0.04% (p=0.000 n=10+10)
Compiler 6.25MB ± 0% 6.24MB ± 0% -0.09% (p=0.000 n=10+10)
SSA 15.9MB ± 0% 16.1MB ± 0% +1.22% (p=0.000 n=10+10)
Flate 304kB ± 0% 304kB ± 0% -0.13% (p=0.000 n=10+10)
GoParser 370kB ± 0% 370kB ± 0% -0.00% (p=0.000 n=10+10)
Reflect 1.27MB ± 0% 1.27MB ± 0% -0.12% (p=0.000 n=10+10)
Tar 421kB ± 0% 419kB ± 0% -0.64% (p=0.000 n=10+10)
XML 518kB ± 0% 517kB ± 0% -0.12% (p=0.000 n=10+10)
name old export-bytes new export-bytes delta
Template 16.7kB ± 0% 16.7kB ± 0% ~ (all equal)
Unicode 6.52kB ± 0% 6.52kB ± 0% ~ (all equal)
GoTypes 29.2kB ± 0% 29.2kB ± 0% ~ (all equal)
Compiler 88.0kB ± 0% 88.0kB ± 0% ~ (all equal)
SSA 109kB ± 0% 109kB ± 0% ~ (all equal)
Flate 4.49kB ± 0% 4.49kB ± 0% ~ (all equal)
GoParser 8.10kB ± 0% 8.10kB ± 0% ~ (all equal)
Reflect 7.71kB ± 0% 7.71kB ± 0% ~ (all equal)
Tar 9.15kB ± 0% 9.15kB ± 0% ~ (all equal)
XML 12.3kB ± 0% 12.3kB ± 0% ~ (all equal)
name old text-bytes new text-bytes delta
HelloSize 676kB ± 0% 672kB ± 0% -0.59% (p=0.000 n=10+10)
CmdGoSize 7.26MB ± 0% 7.24MB ± 0% -0.18% (p=0.000 n=10+10)
name old data-bytes new data-bytes delta
HelloSize 10.2kB ± 0% 10.2kB ± 0% ~ (all equal)
CmdGoSize 248kB ± 0% 248kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal)
CmdGoSize 145kB ± 0% 145kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.46MB ± 0% 1.45MB ± 0% -0.31% (p=0.000 n=10+10)
CmdGoSize 14.7MB ± 0% 14.7MB ± 0% -0.17% (p=0.000 n=10+10)
Change-Id: Ic72b0c189dd542f391e1c9ab88a76e9148dc4285
Reviewed-on: https://go-review.googlesource.com/106495
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This commit adds the js/wasm architecture to the runtime package.
WebAssembly does not yet support threads, see
https://github.com/WebAssembly/design/issues/1073. Because of that,
there is no preemption of goroutines and no sysmon goroutine.
Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4
About WebAssembly assembly files: https://docs.google.com/document/d/1GRmy3rA4DiYtBlX-I1Jr_iHykbX8EixC3Mq0TCYqbKc
Updates #18892
Change-Id: I7f12d21b5180500d55ae9fd2f7e926a1731db391
Reviewed-on: https://go-review.googlesource.com/103877
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This is a follow-up of CL 93637. There, when we redirected sync/atomic
to runtime/internal/atomic, a few good implementations of ARM atomics
were lost. This CL brings most of them back, with some improvements.
- Change atomic Store to a plain store with memory barrier, as we
already changed atomic Load to plain load with memory barrier.
- Use native 64-bit atomics on ARMv7, jump to Go implementations
on older machines. But drop the kernel helper. In particular,
for Load64, just do loads, not using Cas on the address being
loaded from, so it also works for read-only memory (since we have
already fixed 32-bit Load).
Change-Id: I725cd65cf945ae5200db81a35be3f251c9f7af14
Reviewed-on: https://go-review.googlesource.com/111315
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This gets us around the kernel helpers on ARMv7.
It is slightly faster than using the kernel helper.
name           old time/op  new time/op  delta
AtomicLoad-4   72.5ns ± 0%  69.5ns ± 0%  -4.08%  (p=0.000 n=9+9)
AtomicStore-4  57.6ns ± 1%  54.4ns ± 0%  -5.58%  (p=0.000 n=10+9)
[Geo mean]     64.6ns       61.5ns       -4.83%
If performance is really critical, we can even do compiler intrinsics
on GOARM=7.
Fixes #23792.
Change-Id: I36497d880890b26bdf01e048b542bd5fd7b17d23
Reviewed-on: https://go-review.googlesource.com/94076
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Implementations of atomics are inherently tricky. It would
be good to have them implemented in a single place, instead of
multiple copies.
Mostly a simple redirect.
On 386, some functions in sync/atomic have better implementations,
which are moved to runtime/internal/atomic.
On ARM, some functions in sync/atomic have better implementations.
They are dropped by this CL, but restored with an improved
version in a follow-up CL. On linux/arm, 64-bit CAS kernel helper
is dropped, as we're trying to move away from kernel helpers.
Fixes #23778.
Change-Id: Icb9e1039acc92adbb2a371c34baaf0b79551c3ea
Reviewed-on: https://go-review.googlesource.com/93637
Reviewed-by: Austin Clements <austin@google.com>
|
|
There are several things combined in this change.
First, eliminate the gobitvector type in favor
of adding a ptrbit method to bitvector.
In non-performance-critical code, use that method.
In performance critical code, though, load the bitvector data
one byte at a time and iterate only over set bits.
To support that, add and use sys.Ctz8.
name                old time/op  new time/op  delta
StackCopyPtr-8      81.8ms ± 5%  78.9ms ± 3%  -3.58%  (p=0.000 n=97+96)
StackCopy-8         65.9ms ± 3%  62.8ms ± 3%  -4.67%  (p=0.000 n=96+92)
StackCopyNoCache-8  105ms ± 3%   102ms ± 3%   -3.38%  (p=0.000 n=96+95)
Change-Id: I00b80f45612708bd440b1a411a57fa6dfa24aa74
Reviewed-on: https://go-review.googlesource.com/109716
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Instead issue a memory barrier on ARMv7 after reading the address.
Fixes #23777
Change-Id: I7aff2ab0246af64b437ebe0b31d4b30d351890d8
Reviewed-on: https://go-review.googlesource.com/94275
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
In #17528 it was discussed (off-topic to the actual issue) to reserve
GOARCH names for the RISC-V architecture. With the first RISC-V
Linux-capable development boards released (e.g. HiFive Unleashed),
Linux distributions being ported to RISC-V (e.g. Debian, Fedora) and
RISC-V support being added to gccgo (CL 96377), it becomes more likely
that Go software (and maybe Go itself) will be ported as well.
Add riscv and riscv64 (which is already used by gccgo), so Go 1.11 will
already recognize "*_riscv{,64}.go" as reserved files.
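As a rough illustration of what a reserved "_GOARCH.go" suffix means, here is a simplified sketch. The helper archFromName and the short reservedArch list are hypothetical; the real matching in go/build also handles GOOS suffixes, GOOS_GOARCH pairs and _test files, none of which are modeled here.

```go
package main

import (
	"fmt"
	"strings"
)

// reservedArch holds a few GOARCH names for illustration only. It is not
// the toolchain's full list.
var reservedArch = map[string]bool{
	"amd64": true, "arm64": true, "riscv": true, "riscv64": true,
}

// archFromName returns the GOARCH named by a trailing _GOARCH.go suffix,
// or "" if the file name carries no such suffix.
func archFromName(name string) string {
	if !strings.HasSuffix(name, ".go") {
		return ""
	}
	base := strings.TrimSuffix(name, ".go")
	i := strings.LastIndex(base, "_")
	if i < 0 {
		return ""
	}
	if arch := base[i+1:]; reservedArch[arch] {
		return arch
	}
	return ""
}

func main() {
	for _, name := range []string{"asm_riscv64.go", "defs_riscv.go", "main.go"} {
		fmt.Printf("%s -> %q\n", name, archFromName(name))
	}
}
```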
Change-Id: I042aab19c68751d82ea513e40f7b1d7e1ad924ea
Reviewed-on: https://go-review.googlesource.com/106256
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This is the first commit of a series that will add WebAssembly
as an architecture target. The design document can be found at
https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4.
The GOARCH name "wasm" is the official abbreviation of WebAssembly.
The GOOS name "js" got chosen because initially the host environment
that executes WebAssembly bytecode will be web browsers and Node.js,
which both use JavaScript to embed WebAssembly. Other GOOS values
may be possible later, see:
https://github.com/WebAssembly/design/blob/master/NonWeb.md
Updates #18892
Change-Id: Ia25b4fa26bba8029c25923f48ad009fd3681933a
Reviewed-on: https://go-review.googlesource.com/102835
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: Ib67a61d5b37af210ff15d60d72bd5238b9c2d0ca
Reviewed-on: https://go-review.googlesource.com/94815
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
faster atomics for ppc64x
This change implements faster atomics for ppc64x based on the ISA 2.07B,
Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some
cases.
Updates #21348
name old time/op new time/op delta
Cond1-16 955ns 856ns -10.33%
Cond2-16 2.38µs 2.03µs -14.59%
Cond4-16 5.90µs 5.44µs -7.88%
Cond8-16 12.1µs 11.1µs -8.42%
Cond16-16 27.0µs 25.1µs -7.04%
Cond32-16 59.1µs 55.5µs -6.14%
LoadMostlyHits/*sync_test.DeepCopyMap-16 22.1ns 24.1ns +9.02%
LoadMostlyHits/*sync_test.RWMutexMap-16 252ns 249ns -1.20%
LoadMostlyHits/*sync.Map-16 16.2ns 16.3ns ~
LoadMostlyMisses/*sync_test.DeepCopyMap-16 22.3ns 22.6ns ~
LoadMostlyMisses/*sync_test.RWMutexMap-16 249ns 247ns -0.51%
LoadMostlyMisses/*sync.Map-16 12.7ns 12.7ns ~
LoadOrStoreBalanced/*sync_test.RWMutexMap-16 1.27µs 1.17µs -7.54%
LoadOrStoreBalanced/*sync.Map-16 1.12µs 1.10µs -2.35%
LoadOrStoreUnique/*sync_test.RWMutexMap-16 1.75µs 1.68µs -3.84%
LoadOrStoreUnique/*sync.Map-16 2.07µs 1.97µs -5.13%
LoadOrStoreCollision/*sync_test.DeepCopyMap-16 15.8ns 15.9ns ~
LoadOrStoreCollision/*sync_test.RWMutexMap-16 496ns 424ns -14.48%
LoadOrStoreCollision/*sync.Map-16 6.07ns 6.07ns ~
Range/*sync_test.DeepCopyMap-16 1.65µs 1.64µs ~
Range/*sync_test.RWMutexMap-16 278µs 288µs +3.75%
Range/*sync.Map-16 2.00µs 2.01µs ~
AdversarialAlloc/*sync_test.DeepCopyMap-16 3.45µs 3.44µs ~
AdversarialAlloc/*sync_test.RWMutexMap-16 226ns 227ns ~
AdversarialAlloc/*sync.Map-16 1.09µs 1.07µs -2.36%
AdversarialDelete/*sync_test.DeepCopyMap-16 553ns 550ns -0.57%
AdversarialDelete/*sync_test.RWMutexMap-16 273ns 274ns ~
AdversarialDelete/*sync.Map-16 247ns 249ns ~
UncontendedSemaphore-16 79.0ns 65.5ns -17.11%
ContendedSemaphore-16 112ns 97ns -13.77%
MutexUncontended-16 3.34ns 2.51ns -24.69%
Mutex-16 266ns 191ns -28.26%
MutexSlack-16 226ns 159ns -29.55%
MutexWork-16 377ns 338ns -10.14%
MutexWorkSlack-16 335ns 308ns -8.20%
MutexNoSpin-16 196ns 184ns -5.91%
MutexSpin-16 710ns 666ns -6.21%
Once-16 1.29ns 1.29ns ~
Pool-16 8.64ns 8.71ns ~
PoolOverflow-16 1.60µs 1.44µs -10.25%
SemaUncontended-16 5.39ns 4.42ns -17.96%
SemaSyntNonblock-16 539ns 483ns -10.42%
SemaSyntBlock-16 413ns 354ns -14.20%
SemaWorkNonblock-16 305ns 258ns -15.36%
SemaWorkBlock-16 266ns 229ns -14.06%
RWMutexUncontended-16 12.9ns 9.7ns -24.80%
RWMutexWrite100-16 203ns 147ns -27.47%
RWMutexWrite10-16 177ns 119ns -32.74%
RWMutexWorkWrite100-16 435ns 403ns -7.39%
RWMutexWorkWrite10-16 642ns 611ns -4.79%
WaitGroupUncontended-16 4.67ns 3.70ns -20.92%
WaitGroupAddDone-16 402ns 355ns -11.54%
WaitGroupAddDoneWork-16 208ns 250ns +20.09%
WaitGroupWait-16 1.21ns 1.21ns ~
WaitGroupWaitWork-16 5.91ns 5.87ns -0.81%
WaitGroupActuallyWait-16 92.2ns 85.8ns -6.91%
Updates #21348
Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3
Reviewed-on: https://go-review.googlesource.com/95175
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
For each replacement, a test case is added to the new 386enc.s file,
with the exception of EMMS, SYSENTER, MFENCE and LFENCE as they
are already covered in amd64enc.s (same on amd64 and 386).
The replacement became less obvious after go vet suggested changes.
Before:
BYTE $0x0f; BYTE $0x7f; BYTE $0x44; BYTE $0x24; BYTE $0x08
Changed to MOVQ (this form is being tested):
MOVQ M0, 8(SP)
Refactored to FP-relative access (go vet advice):
MOVQ M0, val+4(FP)
Change-Id: I56b87cf3371b6ad81ad0cd9db2033aee407b5818
Reviewed-on: https://go-review.googlesource.com/101475
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
|
|
If nil, fault before taking the lock or calling into the kernel.
Change-Id: I013d78a5f9233c2a9197660025f679940655d384
Reviewed-on: https://go-review.googlesource.com/93636
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Updates #23778.
Change-Id: I80e57a15b6e3bbc2e25ea186399ff0e360fc5c21
Reviewed-on: https://go-review.googlesource.com/93635
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
GitHub-Last-Rev: d6a6fa39095cac8a9acfeacbbafd636e1aa9b55b
GitHub-Pull-Request: golang/go#23809
Change-Id: Ife18ba2f982b5e1c30bda32d13dcd441778b986a
Reviewed-on: https://go-review.googlesource.com/93575
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This replaces frame size -4/-8 with the NOFRAME flag in mips and
mips64 assembly.
This was automated with:
sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-[84]/\1|NOFRAME,\2$0/' $(find -name '*_mips*.s')
Plus a manual fix to mkduff.go.
The go binary is identical on both architectures before and after this
change.
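For readers less familiar with sed, the substitution above can be approximated with Go's regexp package. This is a sketch only: the names frameRE and addNoFrame are hypothetical, and the commit itself used sed, not this code.

```go
package main

import (
	"fmt"
	"regexp"
)

// frameRE mirrors the sed pattern: a TEXT line whose flag list ends in an
// upper-case letter, followed by a comma, optional spaces, and a frame
// size of $-8 or $-4.
var frameRE = regexp.MustCompile(`(?m)(^TEXT.*[A-Z]),( *)\$-[84]`)

// addNoFrame appends |NOFRAME to the flags and rewrites the frame size to
// $0. In the replacement template, $$ stands for a literal dollar sign.
func addNoFrame(src string) string {
	return frameRE.ReplaceAllString(src, "${1}|NOFRAME,${2}$$0")
}

func main() {
	fmt.Println(addNoFrame("TEXT ·Load(SB),NOSPLIT,$-8-12"))
}
```

Text after the matched frame size, such as an argument size, is left untouched because the pattern stops at the frame size.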
Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227efb
Reviewed-on: https://go-review.googlesource.com/92044
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
"-8" is not a sensible frame size on arm and we're about to start
rejecting it. Replace it with -4.
Likewise, "-4" is not a sensible frame size on arm64 and we're about
to start rejecting it. Replace it with -8.
Finally, clean up some places we're weirdly inconsistent about using 0
versus -8.
Change-Id: If85e229993d5f7f1f0cfa9852b4e294d053bd784
Reviewed-on: https://go-review.googlesource.com/92038
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: I30e9b938bb19ed4e674c3ea4a1cd389b9c4f0b88
Reviewed-on: https://go-review.googlesource.com/86875
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Suppose you build the Go toolchain in directory A,
move the whole thing to directory B, and then use
it from B to build a new program hello.exe, and then
run hello.exe, and hello.exe crashes with a stack
trace into the standard library.
Long ago, you'd have seen hello.exe print file names
in the A directory tree, even though the files had moved
to the B directory tree. About two years ago we changed
the compiler to write down these files with the name
"$GOROOT" (that literal string) instead of A, so that the
final link from B could replace "$GOROOT" with B,
so that hello.exe's crash would show the correct source
file paths in the stack trace. (golang.org/cl/18200)
Now suppose that you do the same thing but hello.exe
doesn't crash: it prints fmt.Println(runtime.GOROOT()).
And you run hello.exe after clearing $GOROOT from the
environment.
Long ago, you'd have seen hello.exe print A instead of B.
Before this CL, you'd still see hello.exe print A instead of B.
This case is the one instance where a moved toolchain
still divulges its origin. Not anymore. After this CL, hello.exe
will print B, because the linker sets runtime/internal/sys.DefaultGoroot
with the effective GOROOT from link time.
This makes the default result of runtime.GOROOT once again
match the file names recorded in the binary, after two years
of divergence.
With that cleared up, we can reintroduce GOROOT into the
link action ID and also reenable TestExecutableGOROOT/RelocatedExe.
When $GOROOT_FINAL is set during link, it is used
in preference to $GOROOT, as always, but it was easier
to explain the behavior above without introducing that
complication.
Fixes #22155.
Fixes #20284.
Fixes #22475.
Change-Id: Ifdaeb77fd4678fdb337cf59ee25b2cd873ec1016
Reviewed-on: https://go-review.googlesource.com/86835
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The BigEndian constant is only used in boolean context so assign it
boolean constants.
Change-Id: If19d61dd71cdfbffede1d98b401f11e6535fba59
Reviewed-on: https://go-review.googlesource.com/73270
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This change implements the convention for generated code header agreed upon in https://golang.org/s/generatedcode.
Additionally run go generate.
Also update some comments.
Updates #13560
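For illustration, the header convention can be checked with the regular expression given in the convention's text. The helper isGenerated is hypothetical, and this sketch scans every line of the input for simplicity rather than applying any rule about where in the file the header must appear.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the line pattern from the generated-code convention.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether any line of src matches the header pattern.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		if generatedRE.MatchString(sc.Text()) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isGenerated("// Code generated by stringer; DO NOT EDIT.\npackage x\n"))
	fmt.Println(isGenerated("// Written by hand.\npackage x\n"))
}
```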
Change-Id: If45f91b93aaa0d43280c2c4630823bc4d2dc7d3a
Reviewed-on: https://go-review.googlesource.com/60250
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
According to http://infocenter.arm.com:
* ARM Cortex-A53 (Raspberry Pi 3, Pine A64)
* ARM Cortex-A57 (Opteron A1100, Tegra X1)
* ARM Cortex-A72
all have a cache line size of 64 bytes.
Change-Id: I4b333e930792fb1a221b3ca6f395bfa1b7762afa
Reviewed-on: https://go-review.googlesource.com/43250
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
LLV and SCV are 64-bit load-linked and store-conditional. They
were used in runtime as #define WORD. Change them to normal
instruction form.
NOOP is hardware no-op. It was written as WORD $0. Make a name
for it for better disassembly output.
Fixes #12561.
Fixes #18238.
Change-Id: I82c667ce756fa83ef37b034b641e8c4366335e83
Reviewed-on: https://go-review.googlesource.com/40297
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Starting in go1.9, the minimum processor requirement for ppc64 is POWER8. This
means the checks for GOARCH_ppc64 in asm_ppc64x.s can be removed, since we can
assume LBAR and STBCCC instructions (both from ISA 2.06) will always be
available.
Updates #19074
Change-Id: Ib4418169cd9fc6f871a5ab126b28ee58a2f349e2
Reviewed-on: https://go-review.googlesource.com/38406
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Implement math/bits.TrailingZerosX using intrinsics.
Generally reorganize the intrinsic spec a bit.
The intrinsics data structure is now built at init time.
This will make doing the other functions in math/bits easier.
Update sys.CtzX to return int instead of uint{64,32} so it
matches math/bits.TrailingZerosX.
Improve the intrinsics a bit for amd64. We don't need the CMOV
for <64 bit versions.
Update #18616
Change-Id: Ic1c5339c943f961d830ae56f12674d7b29d4ff39
Reviewed-on: https://go-review.googlesource.com/38155
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
This check was originally implemented by Vladimir in
https://go-review.googlesource.com/c/31489/1/src/runtime/internal/atomic/atomic_mipsx.go#30
but removed due to my comment (Sorry!). This CL adds it back.
Fixes #17786.
Change-Id: I7ff4c2539fc9e2afd8199964b587a8ccf093b896
Reviewed-on: https://go-review.googlesource.com/33431
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Updates #17786. Will fix mips(32) when the port is fully landed.
Change-Id: I00d4ff666ec14a38cadbcd52569b347bb5bc8b75
Reviewed-on: https://go-review.googlesource.com/33236
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Add a variant of sync/atomic's TestUnaligned64 to
runtime/internal/atomic.
Skips the test on arm for now where it's currently failing.
Updates #17786
Change-Id: If63f9c1243e9db7b243a95205b2d27f7d1dc1e6e
Reviewed-on: https://go-review.googlesource.com/33159
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: Ice8f3b42194852f7ee8f00f004e80014d1ea119b
Reviewed-on: https://go-review.googlesource.com/32875
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I6288f1fca1ae4c64b3907af700811ee842053020
Reviewed-on: https://go-review.googlesource.com/31472
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I10f36710dd95b9bd31b3b82a3c32edcadb90ffa9
Reviewed-on: https://go-review.googlesource.com/31510
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: I99a48f719fd1a8178fc59787084a074e91c89ac6
Reviewed-on: https://go-review.googlesource.com/31489
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Found by vet.
Change-Id: I1d78454facdd3522509ecfe7c73b21c4602ced8a
Reviewed-on: https://go-review.googlesource.com/32670
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
|
|
These are emulated by the assembler and we don't need them.
Change-Id: I2b07c5315a5b642fdb5e50b468453260ae121164
Reviewed-on: https://go-review.googlesource.com/31758
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Also add an assembly implementation, in case intrinsics are disabled.
Change-Id: Iff0a8a8ce326651bd29f6c403f5ec08dd3629993
Reviewed-on: https://go-review.googlesource.com/28979
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Now that the runtime fetches the true physical page size from the OS,
make the physical page size used by heap growth a variable instead of
a constant. This isn't used in any performance-critical paths, so it
shouldn't be an issue.
sys.PhysPageSize is also renamed to sys.DefaultPhysPageSize to make it
clear that it's not necessarily the true page size. There are no uses
of this constant any more, but we'll keep it around for now.
Updates #12480 and #10180.
Change-Id: I6c23b9df860db309c38c8287a703c53817754f03
Reviewed-on: https://go-review.googlesource.com/25022
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently we assume the physical page size on ARM is 4kB. While this
is usually true, the architecture also supports 16kB and 64kB physical
pages, and Linux (and possibly other OSes) can be configured to use
these larger page sizes.
With Go 1.6, such a configuration could potentially run, but generally
resulted in memory corruption or random panics. With current master,
this configuration will cause the runtime to panic during init on
Linux when it checks the true physical page size (and will still cause
corruption or panics on other OSes).
However, the assumed physical page size only has to be a multiple of
the true physical page size, the scavenger can now deal with large
physical page sizes, and the rest of the runtime can deal with a
larger assumed physical page size than the true size. Hence, there's
little disadvantage to conservatively setting the assumed physical
page size to 64kB on ARM.
This may result in some extra memory use, since we can only return
memory at multiples of the assumed physical page size. However, it is
a simple change that should make Go run on systems configured for
larger page sizes. The following commits will make the runtime query
the actual physical page size from the OS, but this is a simple step
there.
Updates #12480.
Change-Id: I851829595bc9e0c76235c847a7b5f62ad82b5302
Reviewed-on: https://go-review.googlesource.com/25021
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
|
|
The current padding in the 'p' struct is hardcoded at 64 bytes. It should be the
cache line size. On ppc64x, the current value is only okay because sys.CacheLineSize
is wrong at 64 bytes. This change fixes that by making the padding equal to the
cache line size. It also fixes the cache line size for ppc64/ppc64le to 128 bytes.
Fixes #16477
Change-Id: Ib7ec5195685116eb11ba312a064f41920373d4a3
Reviewed-on: https://go-review.googlesource.com/25370
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Inline atomic reads and writes on amd64. There's no reason
to pay the overhead of a call for these.
To keep atomic loads from being reordered, we make them
return a <value,memory> tuple.
Change the meaning of resultInArg0 for tuple-generating ops
to mean the first part of the result tuple, not the second.
This means we can always put the store part of the tuple last,
matching how arguments are laid out. This requires reordering
the outputs of add32carry and sub32carry and their descendents
in various architectures.
benchmark                 old ns/op  new ns/op  delta
BenchmarkAtomicLoad64-8   2.09       0.26       -87.56%
BenchmarkAtomicStore64-8  7.54       5.72       -24.14%
TBD (in a different CL): Cas, Or8, ...
Change-Id: I713ea88e7da3026c44ea5bdb56ed094b20bc5207
Reviewed-on: https://go-review.googlesource.com/27641
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Add missing function prototypes.
Fix function prototypes.
Use FP references instead of SP references.
Fix variable names.
Update comments.
Clean up whitespace. (Not for vet.)
All fairly minor fixes to make vet happy.
Updates #11041
Change-Id: Ifab2cdf235ff61cdc226ab1d84b8467b5ac9446c
Reviewed-on: https://go-review.googlesource.com/27713
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Now that we have ops that can return 2 results, have BSF return a result
and flags. We can then get rid of the redundant comparison and use CMOV
instead of CMOVconst ops.
Get rid of a bunch of the ops we don't use. Ctz{8,16}, plus all the Clzs,
and CMOVNEs. I don't think we'll ever use them, and they would be easy
to add back if needed.
Change-Id: I8858a1d017903474ea7e4002fc76a6a86e7bd487
Reviewed-on: https://go-review.googlesource.com/27630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
They are unused, and vet wants them to have
a function prototype.
Updates #11041
Change-Id: Idedc96ddd3c3cf1b1d2ab6d98796367eab29f032
Reviewed-on: https://go-review.googlesource.com/27492
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I80ccf40cd3930aff908ee64f6dcbe5f5255198d3
Reviewed-on: https://go-review.googlesource.com/24914
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Ctz is a hot-spot in the Go 1.7 memory manager. In SSA it's
implemented as an intrinsic that compiles to a few instructions, but
on the old backend (all architectures other than amd64), it's
implemented as a fairly complex Go function. As a result, switching to
bitmap-based allocation was a significant hit to allocation-heavy
workloads like BinaryTree17 on non-SSA platforms.
For unknown reasons, this hit 386 particularly hard. We can regain a
lot of the lost performance by implementing Ctz in assembly on the
386. This isn't as good as an intrinsic, since it still generates a
function call and prevents useful inlining, but it's much better than
the pure Go implementation:
name old time/op new time/op delta
BinaryTree17-12 3.59s ± 1% 3.06s ± 1% -14.74% (p=0.000 n=19+20)
Fannkuch11-12 3.72s ± 1% 3.64s ± 1% -2.09% (p=0.000 n=17+19)
FmtFprintfEmpty-12 52.3ns ± 3% 52.3ns ± 3% ~ (p=0.829 n=20+19)
FmtFprintfString-12 156ns ± 1% 148ns ± 3% -5.20% (p=0.000 n=18+19)
FmtFprintfInt-12 137ns ± 1% 136ns ± 1% -0.56% (p=0.000 n=19+13)
FmtFprintfIntInt-12 227ns ± 2% 225ns ± 2% -0.93% (p=0.000 n=19+17)
FmtFprintfPrefixedInt-12 210ns ± 1% 208ns ± 1% -0.91% (p=0.000 n=19+17)
FmtFprintfFloat-12 375ns ± 1% 371ns ± 1% -1.06% (p=0.000 n=19+18)
FmtManyArgs-12 995ns ± 2% 978ns ± 1% -1.63% (p=0.000 n=17+17)
GobDecode-12 9.33ms ± 1% 9.19ms ± 0% -1.59% (p=0.000 n=20+17)
GobEncode-12 7.73ms ± 1% 7.73ms ± 1% ~ (p=0.771 n=19+20)
Gzip-12 375ms ± 1% 374ms ± 1% ~ (p=0.141 n=20+18)
Gunzip-12 61.8ms ± 1% 61.8ms ± 1% ~ (p=0.602 n=20+20)
HTTPClientServer-12 87.7µs ± 2% 86.9µs ± 3% -0.87% (p=0.024 n=19+20)
JSONEncode-12 20.2ms ± 1% 20.4ms ± 0% +0.53% (p=0.000 n=18+19)
JSONDecode-12 65.3ms ± 0% 65.4ms ± 1% ~ (p=0.385 n=16+19)
Mandelbrot200-12 4.11ms ± 1% 4.12ms ± 0% +0.29% (p=0.020 n=19+19)
GoParse-12 3.75ms ± 1% 3.61ms ± 2% -3.90% (p=0.000 n=20+20)
RegexpMatchEasy0_32-12 104ns ± 0% 103ns ± 0% -0.96% (p=0.000 n=13+16)
RegexpMatchEasy0_1K-12 805ns ± 1% 803ns ± 1% ~ (p=0.189 n=18+18)
RegexpMatchEasy1_32-12 111ns ± 0% 111ns ± 3% ~ (p=1.000 n=14+19)
RegexpMatchEasy1_1K-12 1.00µs ± 1% 1.00µs ± 1% +0.50% (p=0.003 n=19+19)
RegexpMatchMedium_32-12 133ns ± 2% 133ns ± 2% ~ (p=0.218 n=20+20)
RegexpMatchMedium_1K-12 41.2µs ± 1% 42.2µs ± 1% +2.52% (p=0.000 n=18+16)
RegexpMatchHard_32-12 2.35µs ± 1% 2.38µs ± 1% +1.53% (p=0.000 n=18+18)
RegexpMatchHard_1K-12 70.9µs ± 2% 72.0µs ± 1% +1.42% (p=0.000 n=19+17)
Revcomp-12 1.06s ± 0% 1.05s ± 0% -1.36% (p=0.000 n=20+18)
Template-12 86.2ms ± 1% 84.6ms ± 0% -1.89% (p=0.000 n=20+18)
TimeParse-12 425ns ± 2% 428ns ± 1% +0.77% (p=0.000 n=18+19)
TimeFormat-12 517ns ± 1% 519ns ± 1% +0.43% (p=0.001 n=20+19)
[Geo mean] 74.3µs 73.5µs -1.05%
Prior to this commit, BinaryTree17-12 on 386 was 33% slower than at
the go1.6 tag. With this commit, it's 13% slower.
On arm and arm64, BinaryTree17-12 is only ~5% slower than it was at
go1.6. It may be worth implementing Ctz for them as well.
I consider this change low risk, since the functions it replaces are
simple, very well specified, and well tested.
For #16117.
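To illustrate why Ctz sits on the allocation hot path, here is a minimal sketch of a bitmap scan that uses count-trailing-zeros to find free slots. It is not the runtime's allocator: nextFreeIndex and allocAll are hypothetical names, the "set bit means free" convention is an assumption, and math/bits.TrailingZeros64 (added to the standard library after this commit) stands in for the runtime's internal Ctz.

```go
package main

import (
	"fmt"
	"math/bits"
)

// nextFreeIndex returns the index of the lowest set bit in free, where a
// set bit marks a free slot (an assumed convention), or -1 if none is set.
func nextFreeIndex(free uint64) int {
	if free == 0 {
		return -1
	}
	return bits.TrailingZeros64(free)
}

// allocAll hands out every free slot in ascending order. Each iteration
// performs one count-trailing-zeros, which is why its cost matters.
func allocAll(free uint64) []int {
	var slots []int
	for free != 0 {
		slots = append(slots, nextFreeIndex(free))
		free &= free - 1 // clear the lowest set bit
	}
	return slots
}

func main() {
	fmt.Println(allocAll(0x28)) // bits 3 and 5 are set
}
```

When the count compiles to a single instruction the loop body is tiny; when it is an out-of-line function call, every allocation pays for the call.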
Change-Id: Ic39d851d5aca91330134596effd2dab9689ba066
Reviewed-on: https://go-review.googlesource.com/24640
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This modifies a recent performance improvement to the
And8 and Or8 atomic functions, which required both ppc64le
and ppc64 to use power8 instructions. It has since been
decided that ppc64 (BE) should work on power5 and later.
This change uses power5-compatible instructions for
ppc64 and keeps the power8 instructions for ppc64le.
Fixes #16004
Change-Id: I623c75e8e6fd1fa063a53d250d86cdc9d0890dc7
Reviewed-on: https://go-review.googlesource.com/24181
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The following performance improvements have been made to the
low-level atomic functions for ppc64le & ppc64:
- For those cases containing a lwarx and stwcx (or other sizes):
sync, lwarx, maybe something, stwcx, loop to sync, sync, isync
The sync is moved before (outside) the lwarx/stwcx loop, and the
sync after is removed, so it becomes:
sync, lwarx, maybe something, stwcx, loop to lwarx, isync
- For the Or8 and And8, the shifting and manipulation of the
address to the word aligned version were removed and the
instructions were changed to use lbarx, stbcx instead of
register shifting, xor, then lwarx, stwcx.
- New instructions LWSYNC, LBAR, STBCC were tested and added.
runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode
instead of the WORD encoding.
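The word-based approach that the Or8/And8 item replaces can be sketched in portable Go. This is only an illustration, not the ppc64 assembly: or8InWord and and8InWord are hypothetical names, a compare-and-swap retry loop stands in for the lwarx/stwcx reservation loop, and the byte index is a logical position (0 = least significant) rather than one derived from the address and endianness.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// or8InWord atomically ORs val into byte idx (0 = least significant) of
// *word by operating on the whole 32-bit word.
func or8InWord(word *uint32, idx uint, val uint8) {
	mask := uint32(val) << (8 * idx)
	for {
		old := atomic.LoadUint32(word)
		if atomic.CompareAndSwapUint32(word, old, old|mask) {
			return
		}
		// Another writer changed the word; retry, as a failed stwcx would.
	}
}

// and8InWord atomically ANDs val into byte idx of *word. All bits outside
// the target byte are set in the mask so the other bytes are left unchanged.
func and8InWord(word *uint32, idx uint, val uint8) {
	mask := uint32(val)<<(8*idx) | ^(uint32(0xff) << (8 * idx))
	for {
		old := atomic.LoadUint32(word)
		if atomic.CompareAndSwapUint32(word, old, old&mask) {
			return
		}
	}
}

func main() {
	w := uint32(0x11223344)
	or8InWord(&w, 1, 0x0f)
	and8InWord(&w, 3, 0xf0)
	fmt.Printf("%#x\n", w)
}
```

The shift and mask setup is the extra work that byte-sized reservation instructions such as lbarx and stbcx make unnecessary.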
Fixes #15469
Ran some of the benchmarks in the runtime and sync directories.
Some results varied from run to run but the trend was improvement
based on best times for base and new:
runtime.test:
BenchmarkChanNonblocking-128 0.88 0.89 +1.14%
BenchmarkChanUncontended-128 569 511 -10.19%
BenchmarkChanContended-128 63110 53231 -15.65%
BenchmarkChanSync-128 691 598 -13.46%
BenchmarkChanSyncWork-128 11355 11649 +2.59%
BenchmarkChanProdCons0-128 2402 2090 -12.99%
BenchmarkChanProdCons10-128 1348 1363 +1.11%
BenchmarkChanProdCons100-128 1002 746 -25.55%
BenchmarkChanProdConsWork0-128 2554 2720 +6.50%
BenchmarkChanProdConsWork10-128 1909 1804 -5.50%
BenchmarkChanProdConsWork100-128 1624 1580 -2.71%
BenchmarkChanCreation-128 237 212 -10.55%
BenchmarkChanSem-128 705 667 -5.39%
BenchmarkChanPopular-128 5081190 4497566 -11.49%
BenchmarkCreateGoroutines-128 532 473 -11.09%
BenchmarkCreateGoroutinesParallel-128 35.0 34.7 -0.86%
BenchmarkCreateGoroutinesCapture-128 4923 4200 -14.69%
sync.test:
BenchmarkUncontendedSemaphore-128 112 94.2 -15.89%
BenchmarkContendedSemaphore-128 133 128 -3.76%
BenchmarkMutexUncontended-128 1.90 1.67 -12.11%
BenchmarkMutex-128 353 310 -12.18%
BenchmarkMutexSlack-128 304 283 -6.91%
BenchmarkMutexWork-128 554 541 -2.35%
BenchmarkMutexWorkSlack-128 567 556 -1.94%
BenchmarkMutexNoSpin-128 275 242 -12.00%
BenchmarkMutexSpin-128 1129 1030 -8.77%
BenchmarkOnce-128 1.08 0.96 -11.11%
BenchmarkPool-128 29.8 27.4 -8.05%
BenchmarkPoolOverflow-128 40564 36583 -9.81%
BenchmarkSemaUncontended-128 3.14 2.63 -16.24%
BenchmarkSemaSyntNonblock-128 1087 1069 -1.66%
BenchmarkSemaSyntBlock-128 897 893 -0.45%
BenchmarkSemaWorkNonblock-128 1034 1028 -0.58%
BenchmarkSemaWorkBlock-128 949 886 -6.64%
Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e
Reviewed-on: https://go-review.googlesource.com/22549
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
|
|
Change-Id: Ib29cf7abbbdaed81e918e5e41bca4e9b8da24621
Reviewed-on: https://go-review.googlesource.com/22503
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Use deBruijn sequences to count low-order zeros.
Reorganize bswap to avoid &^, which takes an extra instruction on x86.
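A minimal sketch of both techniques follows. It is not the code from this change: ctz32 and bswap32 are illustrative names, the constant 0x077CB531 is one commonly used 32-bit de Bruijn sequence, and the lookup table is derived at start-up rather than written out by hand.

```go
package main

import "fmt"

// deBruijn32 is a 32-bit de Bruijn sequence: each 5-bit window of it,
// read cyclically, is distinct.
const deBruijn32 = 0x077CB531

// deBruijnIdx32 maps the top 5 bits of (1<<i)*deBruijn32 back to i.
var deBruijnIdx32 [32]byte

func init() {
	for i := uint(0); i < 32; i++ {
		deBruijnIdx32[(uint32(1)<<i*deBruijn32)>>27] = byte(i)
	}
}

// ctz32 counts the low-order zero bits of x and returns 32 for x == 0.
func ctz32(x uint32) int {
	if x == 0 {
		return 32
	}
	low := x & -x // isolate the lowest set bit
	return int(deBruijnIdx32[(low*deBruijn32)>>27])
}

// bswap32 reverses the byte order of x using only AND, OR and shifts,
// so no AND-NOT (&^) is needed.
func bswap32(x uint32) uint32 {
	const c8 = 0x00ff00ff
	x = (x>>8)&c8 | (x&c8)<<8 // swap adjacent bytes
	return x>>16 | x<<16      // swap the 16-bit halves
}

func main() {
	fmt.Println(ctz32(8), ctz32(0))
	fmt.Printf("%#x\n", bswap32(0x11223344))
}
```

Multiplying by the isolated low bit shifts the sequence left by the bit's index, so the top five bits of the product identify that index uniquely.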
Change-Id: I4a5ed9fd16ee6a279d88c067e8a2ba11de821156
Reviewed-on: https://go-review.googlesource.com/22084
Reviewed-by: David Chase <drchase@google.com>
|
|
Make it clear that the point of this function is to store a pointer
*without* a write barrier.
sed -i -e 's/Storep1/StorepNoWB/' $(git grep -l Storep1)
Updates #15270.
Change-Id: Ifad7e17815e51a738070655fe3b178afdadaecf6
Reviewed-on: https://go-review.googlesource.com/21994
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
|