path: root/src/sync/atomic/asm_ppc64x.s
Age | Commit message | Author
2018-05-03 | sync/atomic: redirect many functions to runtime/internal/atomic | Cherry Zhang

The implementation of atomics is inherently tricky. It would be good to have them implemented in a single place, instead of multiple copies.

Mostly a simple redirect.

On 386, some functions in sync/atomic have better implementations, which are moved to runtime/internal/atomic.

On ARM, some functions in sync/atomic have better implementations. They are dropped by this CL, but restored with an improved version in a follow-up CL. On linux/arm, the 64-bit CAS kernel helper is dropped, as we're trying to move away from kernel helpers.

Fixes #23778.

Change-Id: Icb9e1039acc92adbb2a371c34baaf0b79551c3ea
Reviewed-on: https://go-review.googlesource.com/93637
Reviewed-by: Austin Clements <austin@google.com>
2018-03-22 | cmd/compile/internal/ppc64, runtime internal/atomic, sync/atomic: implement faster atomics for ppc64x | Carlos Eduardo Seo

This change implements faster atomics for ppc64x based on the ISA 2.07B, Appendix B.2 recommendations, replacing SYNC/ISYNC by LWSYNC in some cases.

Updates #21348

name                                            old time/op  new time/op  delta
Cond1-16                                        955ns        856ns        -10.33%
Cond2-16                                        2.38µs       2.03µs       -14.59%
Cond4-16                                        5.90µs       5.44µs       -7.88%
Cond8-16                                        12.1µs       11.1µs       -8.42%
Cond16-16                                       27.0µs       25.1µs       -7.04%
Cond32-16                                       59.1µs       55.5µs       -6.14%
LoadMostlyHits/*sync_test.DeepCopyMap-16        22.1ns       24.1ns       +9.02%
LoadMostlyHits/*sync_test.RWMutexMap-16         252ns        249ns        -1.20%
LoadMostlyHits/*sync.Map-16                     16.2ns       16.3ns       ~
LoadMostlyMisses/*sync_test.DeepCopyMap-16      22.3ns       22.6ns       ~
LoadMostlyMisses/*sync_test.RWMutexMap-16       249ns        247ns        -0.51%
LoadMostlyMisses/*sync.Map-16                   12.7ns       12.7ns       ~
LoadOrStoreBalanced/*sync_test.RWMutexMap-16    1.27µs       1.17µs       -7.54%
LoadOrStoreBalanced/*sync.Map-16                1.12µs       1.10µs       -2.35%
LoadOrStoreUnique/*sync_test.RWMutexMap-16      1.75µs       1.68µs       -3.84%
LoadOrStoreUnique/*sync.Map-16                  2.07µs       1.97µs       -5.13%
LoadOrStoreCollision/*sync_test.DeepCopyMap-16  15.8ns       15.9ns       ~
LoadOrStoreCollision/*sync_test.RWMutexMap-16   496ns        424ns        -14.48%
LoadOrStoreCollision/*sync.Map-16               6.07ns       6.07ns       ~
Range/*sync_test.DeepCopyMap-16                 1.65µs       1.64µs       ~
Range/*sync_test.RWMutexMap-16                  278µs        288µs        +3.75%
Range/*sync.Map-16                              2.00µs       2.01µs       ~
AdversarialAlloc/*sync_test.DeepCopyMap-16      3.45µs       3.44µs       ~
AdversarialAlloc/*sync_test.RWMutexMap-16       226ns        227ns        ~
AdversarialAlloc/*sync.Map-16                   1.09µs       1.07µs       -2.36%
AdversarialDelete/*sync_test.DeepCopyMap-16     553ns        550ns        -0.57%
AdversarialDelete/*sync_test.RWMutexMap-16      273ns        274ns        ~
AdversarialDelete/*sync.Map-16                  247ns        249ns        ~
UncontendedSemaphore-16                         79.0ns       65.5ns       -17.11%
ContendedSemaphore-16                           112ns        97ns         -13.77%
MutexUncontended-16                             3.34ns       2.51ns       -24.69%
Mutex-16                                        266ns        191ns        -28.26%
MutexSlack-16                                   226ns        159ns        -29.55%
MutexWork-16                                    377ns        338ns        -10.14%
MutexWorkSlack-16                               335ns        308ns        -8.20%
MutexNoSpin-16                                  196ns        184ns        -5.91%
MutexSpin-16                                    710ns        666ns        -6.21%
Once-16                                         1.29ns       1.29ns       ~
Pool-16                                         8.64ns       8.71ns       ~
PoolOverflow-16                                 1.60µs       1.44µs       -10.25%
SemaUncontended-16                              5.39ns       4.42ns       -17.96%
SemaSyntNonblock-16                             539ns        483ns        -10.42%
SemaSyntBlock-16                                413ns        354ns        -14.20%
SemaWorkNonblock-16                             305ns        258ns        -15.36%
SemaWorkBlock-16                                266ns        229ns        -14.06%
RWMutexUncontended-16                           12.9ns       9.7ns        -24.80%
RWMutexWrite100-16                              203ns        147ns        -27.47%
RWMutexWrite10-16                               177ns        119ns        -32.74%
RWMutexWorkWrite100-16                          435ns        403ns        -7.39%
RWMutexWorkWrite10-16                           642ns        611ns        -4.79%
WaitGroupUncontended-16                         4.67ns       3.70ns       -20.92%
WaitGroupAddDone-16                             402ns        355ns        -11.54%
WaitGroupAddDoneWork-16                         208ns        250ns        +20.09%
WaitGroupWait-16                                1.21ns       1.21ns       ~
WaitGroupWaitWork-16                            5.91ns       5.87ns       -0.81%
WaitGroupActuallyWait-16                        92.2ns       85.8ns       -6.91%

Updates #21348

Change-Id: Ibb9b271d11b308264103829e176c6d9fe8f867d3
Reviewed-on: https://go-review.googlesource.com/95175
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
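For readers checking the delta column, the following is a minimal Go sketch of the relative-change arithmetic, applied to the rounded Mutex-16 figures shown above. The helper name `percentDelta` is hypothetical and not part of any Go tool. The published deltas were presumably computed from unrounded measurements, so recomputing from the rounded values can differ in the last digits (here about -28.20% against the listed -28.26%).

```go
package main

import "fmt"

// percentDelta returns the relative change from oldT to newT in percent.
// A negative result means the new time/op is lower, i.e. faster.
// (Hypothetical helper for illustration only.)
func percentDelta(oldT, newT float64) float64 {
	return (newT - oldT) / oldT * 100
}

func main() {
	// Rounded values copied from the Mutex-16 row: 266ns -> 191ns.
	fmt.Printf("Mutex-16: %+.2f%%\n", percentDelta(266, 191))
}
```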
2016-08-25 | all: fix assembly vet issues | Josh Bleecher Snyder

Add missing function prototypes.
Fix function prototypes.
Use FP references instead of SP references.
Fix variable names.
Update comments.
Clean up whitespace. (Not for vet.)

All fairly minor fixes to make vet happy.

Updates #11041

Change-Id: Ifab2cdf235ff61cdc226ab1d84b8467b5ac9446c
Reviewed-on: https://go-review.googlesource.com/27713
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-05 | sync/atomic, runtime/internal/atomic: improve ppc64x atomics | Lynn Boger

The following performance improvements have been made to the low-level atomic functions for ppc64le & ppc64:

- For those cases containing a lwarx and stwcx (or other sizes):
  sync, lwarx, maybe something, stwcx, loop to sync, sync, isync
  The sync is moved before (outside) the lwarx/stwcx loop, and the sync after is removed, so it becomes:
  sync, lwarx, maybe something, stwcx, loop to lwarx, isync
- For the Or8 and And8, the shifting and manipulation of the address to the word aligned version were removed and the instructions were changed to use lbarx, stbcx instead of register shifting, xor, then lwarx, stwcx.
- New instructions LWSYNC, LBAR, STBCC were tested and added. runtime/atomic_ppc64x.s was changed to use the LWSYNC opcode instead of the WORD encoding.

Fixes #15469

Ran some of the benchmarks in the runtime and sync directories. Some results varied from run to run but the trend was improvement based on best times for base and new:

runtime.test:
BenchmarkChanNonblocking-128           0.88     0.89     +1.14%
BenchmarkChanUncontended-128           569      511      -10.19%
BenchmarkChanContended-128             63110    53231    -15.65%
BenchmarkChanSync-128                  691      598      -13.46%
BenchmarkChanSyncWork-128              11355    11649    +2.59%
BenchmarkChanProdCons0-128             2402     2090     -12.99%
BenchmarkChanProdCons10-128            1348     1363     +1.11%
BenchmarkChanProdCons100-128           1002     746      -25.55%
BenchmarkChanProdConsWork0-128         2554     2720     +6.50%
BenchmarkChanProdConsWork10-128        1909     1804     -5.50%
BenchmarkChanProdConsWork100-128       1624     1580     -2.71%
BenchmarkChanCreation-128              237      212      -10.55%
BenchmarkChanSem-128                   705      667      -5.39%
BenchmarkChanPopular-128               5081190  4497566  -11.49%
BenchmarkCreateGoroutines-128          532      473      -11.09%
BenchmarkCreateGoroutinesParallel-128  35.0     34.7     -0.86%
BenchmarkCreateGoroutinesCapture-128   4923     4200     -14.69%

sync.test:
BenchmarkUncontendedSemaphore-128      112      94.2     -15.89%
BenchmarkContendedSemaphore-128        133      128      -3.76%
BenchmarkMutexUncontended-128          1.90     1.67     -12.11%
BenchmarkMutex-128                     353      310      -12.18%
BenchmarkMutexSlack-128                304      283      -6.91%
BenchmarkMutexWork-128                 554      541      -2.35%
BenchmarkMutexWorkSlack-128            567      556      -1.94%
BenchmarkMutexNoSpin-128               275      242      -12.00%
BenchmarkMutexSpin-128                 1129     1030     -8.77%
BenchmarkOnce-128                      1.08     0.96     -11.11%
BenchmarkPool-128                      29.8     27.4     -8.05%
BenchmarkPoolOverflow-128              40564    36583    -9.81%
BenchmarkSemaUncontended-128           3.14     2.63     -16.24%
BenchmarkSemaSyntNonblock-128          1087     1069     -1.66%
BenchmarkSemaSyntBlock-128             897      893      -0.45%
BenchmarkSemaWorkNonblock-128          1034     1028     -0.58%
BenchmarkSemaWorkBlock-128             949      886      -6.64%

Change-Id: I4403fb29d3cd5254b7b1ce87a216bd11b391079e
Reviewed-on: https://go-review.googlesource.com/22549
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
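The retry structure described in this commit (load, modify, conditional store, loop back to the load on failure) can be sketched at the Go level with a compare-and-swap loop. This is only an analogue, not the ppc64x assembly from the CL: the function name `orUint32` is hypothetical, and Go code cannot choose where sync or isync barriers are placed, so the sketch shows the loop shape rather than the barrier change.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// orUint32 atomically ORs mask into *addr and returns the new value.
// On failure it retries from the load, which mirrors "loop to lwarx"
// in the revised sequence. (Hypothetical helper for illustration.)
func orUint32(addr *uint32, mask uint32) uint32 {
	for {
		old := atomic.LoadUint32(addr) // plays the role of the reserved load
		updated := old | mask          // "maybe something"
		if atomic.CompareAndSwapUint32(addr, old, updated) {
			return updated // conditional store succeeded
		}
		// Another writer got in first; retry from the load.
	}
}

func main() {
	var flags uint32
	var wg sync.WaitGroup
	// Eight goroutines each set a distinct bit concurrently.
	for bit := 0; bit < 8; bit++ {
		wg.Add(1)
		go func(b uint) {
			defer wg.Done()
			orUint32(&flags, 1<<b)
		}(uint(bit))
	}
	wg.Wait()
	fmt.Printf("%#x\n", flags) // prints 0xff
}
```

Because every goroutine retries until its own bit lands, no update is lost regardless of interleaving.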
2016-03-01 | all: make copyright headers consistent with one space after period | Brad Fitzpatrick

This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable.

Go policy has been single space after periods in comments for some time.

The copyright header template at:

    https://golang.org/doc/contribute.html#copyright

also uses a single space.

Make them all consistent.

Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0
Reviewed-on: https://go-review.googlesource.com/20111
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-06-06 | all: use RET instead of RETURN on ppc64 | Austin Clements

All of the architectures except ppc64 have only "RET" for the return mnemonic. ppc64 used to have only "RETURN", but commit cf06ea6 introduced RET as a synonym for RETURN to make ppc64 consistent with the other architectures. However, that commit was never followed up to make the code itself consistent by eliminating uses of RETURN.

This commit replaces all uses of RETURN in the ppc64 assembly with RET.

This was done with

    sed -i 's/\<RETURN\>/RET/' **/*_ppc64x.s

plus one manual change to syscall/asm.s.

Change-Id: I3f6c8d2be157df8841d48de988ee43f3e3087995
Reviewed-on: https://go-review.googlesource.com/10672
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
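The whole-word substitution performed by the sed command can be sketched in Go with the standard regexp package, where `\b` is the ASCII word boundary that corresponds to GNU sed's `\<` and `\>`. The function name `rewriteReturn` and the sample assembly line are hypothetical; the commit itself used sed, not Go.

```go
package main

import (
	"fmt"
	"regexp"
)

// returnWord matches RETURN only as a whole word, so identifiers that
// merely contain it (for example RETURNS) are left untouched.
var returnWord = regexp.MustCompile(`\bRETURN\b`)

// rewriteReturn replaces every whole-word RETURN with RET.
// (Hypothetical helper for illustration.)
func rewriteReturn(src string) string {
	return returnWord.ReplaceAllString(src, "RET")
}

func main() {
	// Made-up assembly fragment, not taken from the Go tree.
	in := "\tMOVD\tR3, ret+8(FP)\n\tRETURN\n"
	fmt.Print(rewriteReturn(in))
}
```

Unlike the sed invocation, which replaces only the first match on each line, `ReplaceAllString` replaces every match; for assembly with one mnemonic per line the result is the same.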
2015-01-06 | runtime, sync/atomic: add write barrier for atomic write of pointer | Russ Cox

Add write barrier to atomic operations manipulating pointers.

In general an atomic write of a pointer word may indicate racy accesses, so there is no strictly safe way to attempt to keep the shadow copy in sync with the real one. Instead, mark the shadow copy as not used.

Redirect sync/atomic pointer routines back to the runtime ones, so that there is only one copy of the write barrier and shadow logic. In time we might consider doing this for most of the sync/atomic functions, but for now only the pointer routines need that treatment.

Found with GODEBUG=wbshadow=1 mode. Eventually that will run automatically, but right now it still detects other missing write barriers.

Change-Id: I852936b9a111a6cb9079cfaf6bd78b43016c0242
Reviewed-on: https://go-review.googlesource.com/2066
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
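For context, the pointer routines in question are the ones user code reaches through calls such as `atomic.StorePointer` and `atomic.LoadPointer`. The sketch below shows a typical caller; the `config` type and the `publish` and `load` helpers are hypothetical. The write barrier described in the commit is applied inside the runtime and is not visible at this level.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// config is a made-up payload type for illustration.
type config struct{ name string }

// current holds a *config and is only accessed through sync/atomic.
var current unsafe.Pointer

// publish atomically replaces the current config. This is the kind of
// atomic pointer write the commit routes through the runtime.
func publish(c *config) {
	atomic.StorePointer(&current, unsafe.Pointer(c))
}

// load atomically reads the current config (nil if none was published).
func load() *config {
	return (*config)(atomic.LoadPointer(&current))
}

func main() {
	publish(&config{name: "v1"})
	fmt.Println(load().name) // prints v1
}
```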
2014-12-05 | all: power64 is now ppc64 | Russ Cox

Fixes #8654.

LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/180600043