| Age | Commit message (Collapse) | Author |
|
When not in race mode, we redirect the sync/atomic functions to
internal/runtime/atomic with assembly stubs in the sync/atomic
package. Access from assembly is similar to linkname, so they need
export linknames from internal/runtime/atomic. We currently have
export linknames only for the Go functions, which varies from
architecture to architecture. This CL adds export linknames to
assembly functions as well, so they are same on all architectures.
Linknames on assembly functions don't hurt. Also, we may restrict
linkname access to assembly symbols without a push or export
linkname later. This CL prepares for that.
Change-Id: Ic6167a2f6610b9b1b1193e2ea86095ab7506b282
Reviewed-on: https://go-review.googlesource.com/c/go/+/749945
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The implementation here needs to be consistent with ssa.OpLOONG64LoweredAtomicCas{32,64},
which was ignored in CL 613396.
Change-Id: I72e8d2318e0c1935cc3a35ab5098f8a84e48bcd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/699395
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
|
|
This makes the single-byte atomic.Xchg8 operation available on all
GOARCHes, including those without direct / single-instruction support.
Fixes #69735
Change-Id: Icb6aff8f907257db81ea440dc4d29f96b3cff6c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/657936
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Updated comments in go assembler package
Change-Id: I174e344ca45fae6ef70af2e0b29cd783b003b4c2
GitHub-Last-Rev: 8ab37208891e795561a943269ca82b1ce6e7eef5
GitHub-Pull-Request: golang/go#72048
Reviewed-on: https://go-review.googlesource.com/c/go/+/654478
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: PRABHAV DOGRA <prabhavdogra1@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #69735
Change-Id: I2a0336214786e14b9a37834d81a0a0d14231451c
Reviewed-on: https://go-review.googlesource.com/c/go/+/651315
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #69735
Change-Id: Ide6b3077768a96b76078e5d4f6460596b8ff1560
Reviewed-on: https://go-review.googlesource.com/c/go/+/631756
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
For #69735
Change-Id: I34ca2b027494525ab64f94beee89ca373a5031ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/631615
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
In Loongson's new microstructure LA664 (Loongson-3A6000) and later, the atomic
instruction AMSWAP[DB]{B,H} [1] is supported. Therefore, the implementation of
the atomic operation exchange can be selected according to the CPUCFG flag LAM_BH:
AMSWAPDBB(full barrier) instruction is used on new microstructures, and traditional
LL-SC is used on LA464 (Loongson-3A5000) and older microstructures. This can
significantly improve the performance of Go programs on new microstructures.
Because Xchg8 implemented using traditional LL-SC uses too many temporary
registers, it is not suitable for intrinsics.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
BenchmarkXchg8 100000000 10.41 ns/op
BenchmarkXchg8-2 100000000 10.41 ns/op
BenchmarkXchg8-4 100000000 10.41 ns/op
BenchmarkXchg8Parallel 96647592 12.41 ns/op
BenchmarkXchg8Parallel-2 58376136 20.60 ns/op
BenchmarkXchg8Parallel-4 78458899 17.97 ns/op
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
BenchmarkXchg8 38323825 31.23 ns/op
BenchmarkXchg8-2 38368219 31.23 ns/op
BenchmarkXchg8-4 37154156 31.26 ns/op
BenchmarkXchg8Parallel 37908301 31.63 ns/op
BenchmarkXchg8Parallel-2 30413440 39.42 ns/op
BenchmarkXchg8Parallel-4 30737626 39.03 ns/op
For #69735
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
Change-Id: I02ba68f66a2210b6902344fdc9975eb62de728ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/623058
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
In Loongson's new microstructure LA664 (Loongson-3A6000) and later, the atomic
compare-and-exchange instruction AMCAS[DB]{B,W,H,V} [1] is supported. Therefore,
the implementation of the atomic operation compare-and-swap can be selected according
to the CPUCFG flag LAMCAS: AMCASDB(full barrier) instruction is used on new
microstructures, and traditional LL-SC is used on LA464 (Loongson-3A5000) and older
microstructures. This can significantly improve the performance of Go programs on
new microstructures.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
Cas 46.84n ± 0% 22.82n ± 0% -51.28% (p=0.000 n=20)
Cas-2 47.58n ± 0% 29.57n ± 0% -37.85% (p=0.000 n=20)
Cas-4 43.27n ± 20% 25.31n ± 13% -41.50% (p=0.000 n=20)
Cas64 46.85n ± 0% 22.82n ± 0% -51.29% (p=0.000 n=20)
Cas64-2 47.43n ± 0% 29.53n ± 0% -37.74% (p=0.002 n=20)
Cas64-4 43.18n ± 0% 25.28n ± 2% -41.46% (p=0.000 n=20)
geomean 45.82n 25.74n -43.82%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
Cas 50.05n ± 0% 51.26n ± 0% +2.42% (p=0.000 n=20)
Cas-2 52.80n ± 0% 53.11n ± 0% +0.59% (p=0.000 n=20)
Cas-4 55.97n ± 0% 57.31n ± 0% +2.39% (p=0.000 n=20)
Cas64 50.05n ± 0% 51.26n ± 0% +2.42% (p=0.000 n=20)
Cas64-2 52.68n ± 0% 53.11n ± 0% +0.82% (p=0.000 n=20)
Cas64-4 55.96n ± 0% 57.26n ± 0% +2.33% (p=0.000 n=20)
geomean 52.86n 53.83n +1.82%
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
Change-Id: I9b777c63c124fb492f61c903f77061fa2b4e5322
Reviewed-on: https://go-review.googlesource.com/c/go/+/613396
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
on loong64
Use loong64's atomic operation instruction AMANDDB{V,W,W} (full barrier) to implement
And{64,32,8}, AMORDB{V,W,W} (full barrier) to implement Or{64,32,8}.
Intrinsify And{64,32,8} and Or{64,32,8}, And this CL alias all of the And/Or operations
into sync/atomic package.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
And32 27.73n ± 0% 10.81n ± 0% -61.02% (p=0.000 n=20)
And32Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
And64 27.73n ± 0% 10.81n ± 0% -61.02% (p=0.000 n=20)
And64Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
Or32 27.62n ± 0% 10.81n ± 0% -60.86% (p=0.000 n=20)
Or32Parallel 28.96n ± 0% 12.41n ± 0% -57.15% (p=0.000 n=20)
Or64 27.62n ± 0% 10.81n ± 0% -60.86% (p=0.000 n=20)
Or64Parallel 28.97n ± 0% 12.41n ± 0% -57.16% (p=0.000 n=20)
And8 29.15n ± 0% 13.21n ± 0% -54.68% (p=0.000 n=20)
And 27.71n ± 0% 12.82n ± 0% -53.74% (p=0.000 n=20)
And8Parallel 28.99n ± 0% 14.46n ± 0% -50.12% (p=0.000 n=20)
AndParallel 29.12n ± 0% 14.42n ± 0% -50.48% (p=0.000 n=20)
Or8 28.31n ± 0% 12.81n ± 0% -54.75% (p=0.000 n=20)
Or 27.72n ± 0% 12.81n ± 0% -53.79% (p=0.000 n=20)
Or8Parallel 29.03n ± 0% 14.62n ± 0% -49.64% (p=0.000 n=20)
OrParallel 29.12n ± 0% 14.42n ± 0% -50.49% (p=0.000 n=20)
geomean 28.47n 12.58n -55.80%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
And32 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
And32Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
And64 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
And64Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
And8 30.42n ± 0% 14.41n ± 0% -52.63% (p=0.000 n=20)
And 30.02n ± 0% 13.61n ± 0% -54.66% (p=0.000 n=20)
And8Parallel 31.23n ± 0% 15.21n ± 0% -51.30% (p=0.000 n=20)
AndParallel 30.83n ± 0% 14.41n ± 0% -53.26% (p=0.000 n=20)
Or32 30.02n ± 0% 14.81n ± 0% -50.67% (p=0.000 n=20)
Or32Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
Or64 30.02n ± 0% 14.82n ± 0% -50.63% (p=0.000 n=20)
Or64Parallel 30.83n ± 0% 15.61n ± 0% -49.37% (p=0.000 n=20)
Or8 30.02n ± 0% 14.01n ± 0% -53.33% (p=0.000 n=20)
Or 30.02n ± 0% 13.61n ± 0% -54.66% (p=0.000 n=20)
Or8Parallel 30.83n ± 0% 14.81n ± 0% -51.96% (p=0.000 n=20)
OrParallel 30.83n ± 0% 14.41n ± 0% -53.26% (p=0.000 n=20)
geomean 30.47n 14.75n -51.61%
Change-Id: If008ff6a08b51905076f8ddb6e92f8e214d3f7b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/482756
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Use Loong64's atomic operation instruction AMSWAPDB{W,V} (full barrier)
to implement atomic.Xchg{32,64}
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| old.bench | new.bench |
| sec/op | sec/op vs base |
Xchg 26.44n ± 0% 12.01n ± 0% -54.58% (p=0.000 n=20)
Xchg-2 30.10n ± 0% 25.58n ± 0% -15.02% (p=0.000 n=20)
Xchg-4 30.06n ± 0% 24.82n ± 0% -17.43% (p=0.000 n=20)
Xchg64 26.44n ± 0% 12.02n ± 0% -54.54% (p=0.000 n=20)
Xchg64-2 30.10n ± 0% 25.57n ± 0% -15.05% (p=0.000 n=20)
Xchg64-4 30.05n ± 0% 24.80n ± 0% -17.47% (p=0.000 n=20)
geomean 28.81n 19.68n -31.69%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| old.bench | new.bench |
| sec/op | sec/op vs base |
Xchg 25.62n ± 0% 12.41n ± 0% -51.56% (p=0.000 n=20)
Xchg-2 35.01n ± 0% 20.59n ± 0% -41.19% (p=0.000 n=20)
Xchg-4 34.63n ± 0% 19.59n ± 0% -43.42% (p=0.000 n=20)
Xchg64 25.62n ± 0% 12.41n ± 0% -51.56% (p=0.000 n=20)
Xchg64-2 35.01n ± 0% 20.59n ± 0% -41.19% (p=0.000 n=20)
Xchg64-4 34.67n ± 0% 19.59n ± 0% -43.50% (p=0.000 n=20)
geomean 31.44n 17.11n -45.59%
Updates #59120.
Change-Id: Ied74fc20338b63799c6d6eeb122c31b42cff0f7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/481578
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
|
|
Use Loong64's atomic operation instruction AMADDDB{W,V} (full barrier)
to implement atomic.Xadd{32,64}
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
Xadd 27.24n ± 0% 12.01n ± 0% -55.91% (p=0.000 n=20)
Xadd-2 31.93n ± 0% 25.55n ± 0% -19.98% (p=0.000 n=20)
Xadd-4 31.90n ± 0% 24.80n ± 0% -22.26% (p=0.000 n=20)
Xadd64 27.23n ± 0% 12.01n ± 0% -55.89% (p=0.000 n=20)
Xadd64-2 31.93n ± 0% 25.57n ± 0% -19.90% (p=0.000 n=20)
Xadd64-4 31.89n ± 0% 24.80n ± 0% -22.23% (p=0.000 n=20)
geomean 30.27n 19.67n -35.01%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
Xadd 26.02n ± 0% 12.41n ± 0% -52.31% (p=0.000 n=20)
Xadd-2 37.36n ± 0% 20.60n ± 0% -44.86% (p=0.000 n=20)
Xadd-4 37.22n ± 0% 19.59n ± 0% -47.37% (p=0.000 n=20)
Xadd64 26.42n ± 0% 12.41n ± 0% -53.03% (p=0.000 n=20)
Xadd64-2 37.77n ± 0% 20.60n ± 0% -45.46% (p=0.000 n=20)
Xadd64-4 37.78n ± 0% 19.59n ± 0% -48.15% (p=0.000 n=20)
geomean 33.30n 17.11n -48.62%
Change-Id: I982539c2aa04680e9dd11b099ba8d5f215bf9b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/481937
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
|
|
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1]
is a new instruction added by LA664(Loongson 3A6000) and later microarchitectures.
Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and
the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according
to the CPU feature.
The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to
ensure consistency in the order of Store/Load [2].
LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses
the R0 register, and there is no performance difference between the implementations
of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore64-2 19.61n ± 0% 13.61n ± 0% -30.57% (p=0.000 n=20)
AtomicStore64-4 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore-2 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore-4 19.62n ± 0% 13.62n ± 0% -30.58% (p=0.000 n=20)
AtomicStore8 19.61n ± 0% 20.01n ± 0% +2.04% (p=0.000 n=20)
AtomicStore8-2 19.62n ± 0% 20.02n ± 0% +2.01% (p=0.000 n=20)
AtomicStore8-4 19.61n ± 0% 20.02n ± 0% +2.09% (p=0.000 n=20)
geomean 19.61n 15.48n -21.08%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 18.03n ± 0% 12.81n ± 0% -28.93% (p=0.000 n=20)
AtomicStore64-2 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore64-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
geomean 18.01n 12.81n -28.89%
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md
Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097
Reviewed-on: https://go-review.googlesource.com/c/go/+/581356
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
In CL 622075, I introduced code that violated unsafe.Pointer rules
by casting to uintptr and back across statements. This change corrects it.
Change-Id: Ib6f6c08d9ce33aaeaf41f390c7e9f13a7b8cb974
GitHub-Last-Rev: 01cc68a87c8c0ad068c71a911013421f28a8b4ef
GitHub-Pull-Request: golang/go#70129
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm
Reviewed-on: https://go-review.googlesource.com/c/go/+/623755
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
For #69735
Change-Id: I18c0ca15d94a9b1751c1e55459283e01dc114150
GitHub-Last-Rev: dd9a39a5551e5a3415ab765cf271fecdbbe89b4c
GitHub-Pull-Request: golang/go#69924
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm
Reviewed-on: https://go-review.googlesource.com/c/go/+/620855
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
With LDREXB/STREXB now available for the arm assembler we can implement these operations natively. The instructions are armv6k+ but for simplicity I only use them on armv7.
Benchmark results for a raspberry Pi 3 model B+:
goos: linux
goarch: arm
pkg: internal/runtime/atomic
cpu: ARMv7 Processor rev 4 (v7l)
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
And8-4 127.65n ± 0% 68.74n ± 0% -46.15% (p=0.000 n=10)
Change-Id: Ic87f307c35f7d7f56010980302f253056f6d54dc
GitHub-Last-Rev: a7351802fd212704712b37d183435ab14e58f885
GitHub-Pull-Request: golang/go#70002
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm
Reviewed-on: https://go-review.googlesource.com/c/go/+/622075
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #69735
Change-Id: I5b9f57315d693d613dc88dc02c10bee39aeeef76
GitHub-Last-Rev: 690337e5b81a48bdcb808526d0c5f4837e8912b7
GitHub-Pull-Request: golang/go#69923
Reviewed-on: https://go-review.googlesource.com/c/go/+/620756
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
For #69735
Change-Id: I61a2e561684c538eea705e60c8ebda6be3ef31a7
GitHub-Last-Rev: 3c7f4ec845182d3ef1a007319d91027433163db3
GitHub-Pull-Request: golang/go#69751
Reviewed-on: https://go-review.googlesource.com/c/go/+/617595
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This is minor extension of the existing support for 32 and
64 bit types.
For #69735
Change-Id: I6828ec223951d2b692e077dc507b000ac23c32a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/617496
Reviewed-by: Rhys Hiltner <rhys.hiltner@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For #68578
Change-Id: Idecfdbb793f46560dd69287af9170c07cf4ee973
Reviewed-on: https://go-review.googlesource.com/c/go/+/606900
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
The LoadAcquire barrier on Loong64 is "dbar 0x14", using the correct
barrier in Load{8,32,64} implementation can improve performance.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicLoad64 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad64-2 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad64-4 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad 17.220n ± 0% 4.402n ± 0% -74.44% (p=0.000 n=20)
AtomicLoad-2 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad-4 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad8 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad8-2 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
AtomicLoad8-4 17.210n ± 0% 4.402n ± 0% -74.42% (p=0.000 n=20)
geomean 17.21n 4.402n -74.42%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicLoad64 18.82n ± 0% 10.41n ± 0% -44.69% (p=0.000 n=20)
AtomicLoad64-2 18.81n ± 0% 10.41n ± 0% -44.66% (p=0.000 n=20)
AtomicLoad64-4 18.82n ± 0% 10.41n ± 0% -44.69% (p=0.000 n=20)
AtomicLoad 18.81n ± 0% 10.41n ± 0% -44.66% (p=0.000 n=20)
AtomicLoad-2 18.82n ± 0% 10.41n ± 0% -44.69% (p=0.000 n=20)
AtomicLoad-4 18.81n ± 0% 10.42n ± 0% -44.63% (p=0.000 n=20)
AtomicLoad8 18.82n ± 0% 10.41n ± 0% -44.69% (p=0.000 n=20)
AtomicLoad8-2 18.82n ± 0% 10.41n ± 0% -44.70% (p=0.000 n=20)
AtomicLoad8-4 18.82n ± 0% 10.41n ± 0% -44.69% (p=0.000 n=20)
geomean 18.82n 10.41n -44.68%
Change-Id: I9d47c9d6f359c4f2e41035ca656429aade2e7847
Reviewed-on: https://go-review.googlesource.com/c/go/+/581357
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
CL 544455, which added atomic And/Or APIs, raced with CL 585556, which
enabled stricter linkname checking. This caused linkname-related
failures on ARM and MIPS. Fix this by adding the necessary linknames.
We fix one other linkname that got overlooked in CL 585556.
Updates #61395.
Change-Id: I454f0767ce28188e550a61bc39b7e398239bc10e
Reviewed-on: https://go-review.googlesource.com/c/go/+/586516
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
As mentioned in CL 584598, linkname is a mechanism that, when
abused, can break API integrity and even safety of Go programs.
CL 584598 is a first step to restrict the use of linknames, by
implementing a blocklist. This CL takes a step further, tightening
up the restriction by allowing linkname references ("pull") only
when the definition side explicitly opts into it, by having a
linkname on the definition (possibly to itself). This way, it is at
least clear on the definition side that the symbol, despite being
unexported, is accessed outside of the package. Unexported symbols
without linkname can now be actually private. This is similar to
the symbol visibility rule used by gccgo for years (which defines
unexported non-linknamed symbols as C static symbols).
As there can be pull-only linknames in the wild that may be broken
by this change, we currently only enforce this rule for symbols
defined in the standard library. Push linknames are added in the
standard library to allow things build.
Linkname references to external (non-Go) symbols are still allowed,
as their visibility is controlled by the C symbol visibility rules
and enforced by the C (static or dynamic) linker.
Assembly symbols are treated similar to linknamed symbols.
This is controlled by -checklinkname linker flag, currently not
enabled by default. A follow-up CL will enable it by default.
Change-Id: I07344f5c7a02124dbbef0fbc8fec3b666a4b2b0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/585358
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
These primitives will be used by the new And/Or sync/atomic apis.
Implemented for mips/mipsle and mips64/mips64le.
For #61395
Change-Id: Icc604a2b5cdfe72646d47d3c6a0bb49a0fd0d353
GitHub-Last-Rev: 95dca2a9f144c5d96dfa53eaf116e88be5f55040
GitHub-Pull-Request: golang/go#63297
Reviewed-on: https://go-review.googlesource.com/c/go/+/531835
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The local variable may not be 64bit aligned, which caused arm tests
to fail.
Fixes #67077
Change-Id: Ia3ae4abcc90319cb10cd593bdc7994cc6eeb3a28
Reviewed-on: https://go-review.googlesource.com/c/go/+/581916
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For #65355
Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
|