path: root/src/internal/runtime
Age  Commit message  Author
2024-11-17  runtime/internal/maps: optimize long string keys for small maps  Keith Randall
For large strings, do a quick equality check on all the slots. Only if more than one passes the quick equality check do we resort to hashing.

                               │    baseline     │             experiment              │
                               │     sec/op      │    sec/op     vs base               │
MegMap-24                        16609.50n ± 1%    13.91n ± 3%   -99.92% (p=0.000 n=10)
MegOneMap-24                     16655.00n ± 0%    12.27n ± 1%   -99.93% (p=0.000 n=10)
MegEqMap-24                         41.31µ ± 1%    25.03µ ± 1%   -39.40% (p=0.000 n=10)
MegEmptyMap-24                      2.034n ± 0%    2.027n ± 2%         ~ (p=0.541 n=10)
MegEmptyMapWithInterfaceKey-24      5.931n ± 2%    5.599n ± 1%    -5.60% (p=0.000 n=10)
MapStringKeysEight_16-24            8.473n ± 7%    8.224n ± 5%         ~ (p=0.315 n=10)
MapStringKeysEight_32-24            8.441n ± 2%    8.147n ± 1%    -3.48% (p=0.002 n=10)
MapStringKeysEight_64-24            8.769n ± 1%    8.517n ± 1%    -2.87% (p=0.000 n=10)
MapStringKeysEight_128-24           10.73n ± 4%    13.57n ± 8%   +26.57% (p=0.000 n=10)
MapStringKeysEight_256-24           12.97n ± 2%    14.35n ± 4%   +10.64% (p=0.001 n=10)
MapStringKeysEight_1M-24         17359.50n ± 3%    13.92n ± 4%   -99.92% (p=0.000 n=10)

Change-Id: I4cc2ea4edab12a4b03236de626c7bcf0f96b6cc0 Reviewed-on: https://go-review.googlesource.com/c/go/+/625905 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
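The quick-equality idea in the commit above can be sketched roughly as follows. This is an illustration only, not the runtime's code: `quickEq`, `lookupSmall`, and the 8-byte probe width are hypothetical names and values chosen for the example. The point is that a cheap check on length and a few bytes at each end usually leaves at most one candidate slot, so the expensive work on a very long key is avoided.

```go
package main

import (
	"fmt"
	"strings"
)

// quickEq is a cheap filter: it never returns false for equal strings,
// but may return true for unequal ones. The probe width is an assumption.
func quickEq(a, b string) bool {
	if len(a) != len(b) {
		return false
	}
	const probe = 8 // bytes compared at each end (assumed value)
	if len(a) <= 2*probe {
		return a == b
	}
	return a[:probe] == b[:probe] && a[len(a)-probe:] == b[len(b)-probe:]
}

// lookupSmall scans the keys of a small map without hashing.
// It returns the index of key, or -1 if it was not found this way.
// needHash reports that more than one slot passed the quick check,
// in which case a real map would fall back to the hashed lookup.
func lookupSmall(keys []string, key string) (idx int, needHash bool) {
	candidate := -1
	for i, k := range keys {
		if !quickEq(k, key) {
			continue
		}
		if candidate >= 0 {
			return -1, true // ambiguous: resort to hashing
		}
		candidate = i
	}
	// A single candidate still needs one full comparison.
	if candidate >= 0 && keys[candidate] == key {
		return candidate, false
	}
	return -1, false
}

func main() {
	pad := strings.Repeat("x", 1<<20)
	keys := []string{pad + "a", pad + "b", "short"}
	fmt.Println(lookupSmall(keys, pad+"b")) // prints: 1 false
}
```

The benchmark rows for 128- and 256-byte keys in the commit suggest the filter is not free for medium-sized keys; the sketch does not try to model that trade-off.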
2024-11-13  internal/runtime/maps: use match to skip non-full slots in iteration  Michael Pratt
Iteration over swissmaps with low load (think map with large hint but only one entry) is significantly regressed vs old maps. See noswiss vs swiss-tip below (+60%).

Currently we visit every single slot and individually check if the slot is full or not. We can do much better by using the control word to find all full slots in a group in a single operation. This lets us skip completely empty groups, for instance.

Always using the control match approach is great for maps with low load, but is a regression for mostly full maps. Mostly full maps have the majority of slots full, so most calls to mapiternext will return the next slot. In that case, doing the full group match on every call is more expensive than checking the individual slot.

Thus we take a hybrid approach: on each call, we first check an individual slot. If that slot is full, we're done. If that slot is non-full, then we fall back to doing full group matches.

This trade-off works well. Both mostly empty and mostly full maps perform nearly as well as doing all matching and all individual, respectively.

The fast path is placed above the slow path loop rather than combined (with some sort of `useMatch` variable) into a single loop to help the compiler's code generation. The compiler really struggles with code generation on a combined loop for some reason, yielding ~15% additional instructions/op.

Comparison with old maps prior to this CL:

                                                 │   noswiss    │             swiss-tip              │
                                                 │    sec/op    │    sec/op     vs base              │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ±  2%   10.64n ±  2%   -7.72% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ±  2%   9.670n ±  5%   -5.01% (p=0.004 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ±  1%   10.15n ±  2%   -5.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ±  2%   6.840n ±  2%  +11.84% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ±  2%   3.892n ±  0%  +61.95% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ±  3%   3.237n ±  1%  +66.81% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ±  2%   60.14n ±  3%   -9.15% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ±  1%  171.35n ±  1%  +76.21% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12            826.1n ± 12%   842.5n ± 10%        ~ (p=0.937 n=6)
geomean                                             17.93n         20.96n        +16.88%

After this CL:

                                                 │   noswiss    │              swiss-cl              │
                                                 │    sec/op    │    sec/op     vs base              │
MapIter/Key=int64/Elem=int64/len=6-12               11.53n ±  2%   10.90n ±  3%   -5.42% (p=0.002 n=6)
MapIter/Key=int64/Elem=int64/len=64-12             10.180n ±  2%   9.719n ±  9%   -4.53% (p=0.043 n=6)
MapIter/Key=int64/Elem=int64/len=65536-12           10.78n ±  1%   10.07n ±  2%   -6.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=6-12        6.116n ±  2%   7.022n ±  1%  +14.82% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=64-12       2.403n ±  2%   1.475n ±  1%  -38.63% (p=0.002 n=6)
MapIterLowLoad/Key=int64/Elem=int64/len=65536-12    1.940n ±  3%   1.210n ±  6%  -37.67% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=6-12                66.20n ±  2%   61.54n ±  2%   -7.02% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=64-12               97.24n ±  1%  110.10n ±  1%  +13.23% (p=0.002 n=6)
MapPop/Key=int64/Elem=int64/len=65536-12            826.1n ± 12%   504.7n ±  6%  -38.91% (p=0.002 n=6)
geomean                                             17.93n         15.29n        -14.74%

For #54766.
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10 Change-Id: Ic07f9df763239e85be57873103df5007144fdaef Reviewed-on: https://go-review.googlesource.com/c/go/+/627156 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
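The hybrid iteration strategy described in the commit above might look roughly like this. It is a sketch under stated assumptions: each group is modeled as one 64-bit control word holding 8 control bytes, a byte with its high bit clear means "full", and `nextFull` and `matchFull` are hypothetical names, not the runtime's functions.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	slotsPerGroup = 8
	msbMask       = 0x8080808080808080 // high bit of every control byte
)

// matchFull returns a word with the high bit set in every byte whose
// slot is full (assumed encoding: full slots have the high bit clear).
func matchFull(ctrl uint64) uint64 { return ^ctrl & msbMask }

// nextFull returns the index of the first full slot at or after pos,
// or -1 if there is none.
func nextFull(groups []uint64, pos int) int {
	if pos >= len(groups)*slotsPerGroup {
		return -1
	}
	g, s := pos/slotsPerGroup, pos%slotsPerGroup

	// Fast path: check just this slot. In a mostly full map it hits.
	if (groups[g]>>(8*uint(s)))&0x80 == 0 {
		return pos
	}

	// Slow path: match whole groups at once, skipping empty ones.
	// Slots before s in the current group were already visited.
	m := matchFull(groups[g]) &^ (uint64(1)<<(8*uint(s)) - 1)
	for {
		if m != 0 {
			return g*slotsPerGroup + bits.TrailingZeros64(m)/8
		}
		g++
		if g == len(groups) {
			return -1
		}
		m = matchFull(groups[g])
	}
}

func main() {
	groups := []uint64{
		0x8080808080808080, // all slots empty
		0x8080808012808080, // only slot 3 is full
	}
	for pos := nextFull(groups, 0); pos >= 0; pos = nextFull(groups, pos+1) {
		fmt.Println("full slot", pos)
	}
}
```

Keeping the fast path outside the loop mirrors the structure the commit message describes; the sketch makes no claim about the resulting machine code.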
2024-11-11  internal/runtime/maps: don't hash twice when deleting  Keith Randall
                     │  baseline   │             experiment             │
                     │   sec/op    │   sec/op     vs base               │
MapDeleteLargeKey-24   312.0n ± 6%   162.3n ± 5%  -47.97% (p=0.000 n=10)

Change-Id: I31f1f8e3c344cf8abf2e9eb4b51b78fcd67b93c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/625906 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-11-11  internal/runtime/maps: get rid of a few obsolete TODOs  Keith Randall
Change-Id: I7b3d95c0861ae2b6e0721b65aa75cda036435e9c Reviewed-on: https://go-review.googlesource.com/c/go/+/625903 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-11  cmd/compiler,internal/runtime/atomic: optimize And{64,32,8} and Or{64,32,8} on loong64  Guoqi Chen
Use loong64's atomic operation instruction AMANDDB{V,W,W} (full barrier) to implement And{64,32,8}, and AMORDB{V,W,W} (full barrier) to implement Or{64,32,8}.

Intrinsify And{64,32,8} and Or{64,32,8}. This CL also aliases all of the And/Or operations into the sync/atomic package.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
               |  bench.old  |              bench.new              |
               |   sec/op    |   sec/op     vs base                |
And32            27.73n ± 0%   10.81n ± 0%  -61.02% (p=0.000 n=20)
And32Parallel    28.96n ± 0%   12.41n ± 0%  -57.15% (p=0.000 n=20)
And64            27.73n ± 0%   10.81n ± 0%  -61.02% (p=0.000 n=20)
And64Parallel    28.96n ± 0%   12.41n ± 0%  -57.15% (p=0.000 n=20)
Or32             27.62n ± 0%   10.81n ± 0%  -60.86% (p=0.000 n=20)
Or32Parallel     28.96n ± 0%   12.41n ± 0%  -57.15% (p=0.000 n=20)
Or64             27.62n ± 0%   10.81n ± 0%  -60.86% (p=0.000 n=20)
Or64Parallel     28.97n ± 0%   12.41n ± 0%  -57.16% (p=0.000 n=20)
And8             29.15n ± 0%   13.21n ± 0%  -54.68% (p=0.000 n=20)
And              27.71n ± 0%   12.82n ± 0%  -53.74% (p=0.000 n=20)
And8Parallel     28.99n ± 0%   14.46n ± 0%  -50.12% (p=0.000 n=20)
AndParallel      29.12n ± 0%   14.42n ± 0%  -50.48% (p=0.000 n=20)
Or8              28.31n ± 0%   12.81n ± 0%  -54.75% (p=0.000 n=20)
Or               27.72n ± 0%   12.81n ± 0%  -53.79% (p=0.000 n=20)
Or8Parallel      29.03n ± 0%   14.62n ± 0%  -49.64% (p=0.000 n=20)
OrParallel       29.12n ± 0%   14.42n ± 0%  -50.49% (p=0.000 n=20)
geomean          28.47n        12.58n       -55.80%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
               |  bench.old  |              bench.new              |
               |   sec/op    |   sec/op     vs base                |
And32            30.02n ± 0%   14.81n ± 0%  -50.67% (p=0.000 n=20)
And32Parallel    30.83n ± 0%   15.61n ± 0%  -49.37% (p=0.000 n=20)
And64            30.02n ± 0%   14.81n ± 0%  -50.67% (p=0.000 n=20)
And64Parallel    30.83n ± 0%   15.61n ± 0%  -49.37% (p=0.000 n=20)
And8             30.42n ± 0%   14.41n ± 0%  -52.63% (p=0.000 n=20)
And              30.02n ± 0%   13.61n ± 0%  -54.66% (p=0.000 n=20)
And8Parallel     31.23n ± 0%   15.21n ± 0%  -51.30% (p=0.000 n=20)
AndParallel      30.83n ± 0%   14.41n ± 0%  -53.26% (p=0.000 n=20)
Or32             30.02n ± 0%   14.81n ± 0%  -50.67% (p=0.000 n=20)
Or32Parallel     30.83n ± 0%   15.61n ± 0%  -49.37% (p=0.000 n=20)
Or64             30.02n ± 0%   14.82n ± 0%  -50.63% (p=0.000 n=20)
Or64Parallel     30.83n ± 0%   15.61n ± 0%  -49.37% (p=0.000 n=20)
Or8              30.02n ± 0%   14.01n ± 0%  -53.33% (p=0.000 n=20)
Or               30.02n ± 0%   13.61n ± 0%  -54.66% (p=0.000 n=20)
Or8Parallel      30.83n ± 0%   14.81n ± 0%  -51.96% (p=0.000 n=20)
OrParallel       30.83n ± 0%   14.41n ± 0%  -53.26% (p=0.000 n=20)
geomean          30.47n        14.75n       -51.61%

Change-Id: If008ff6a08b51905076f8ddb6e92f8e214d3f7b3 Reviewed-on: https://go-review.googlesource.com/c/go/+/482756 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com>
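For readers unfamiliar with the And/Or primitives optimized above, the following sketch shows their semantics as a portable compare-and-swap loop. It is not the loong64 implementation (the commit replaces the operation with a single AMANDDB/AMORDB instruction); `and32` and `or32` are hypothetical helper names built on the public `sync/atomic` package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// and32 atomically performs *addr &= mask and returns the old value.
// A CAS retry loop is used here purely to illustrate the semantics.
func and32(addr *uint32, mask uint32) (old uint32) {
	for {
		old = atomic.LoadUint32(addr)
		if atomic.CompareAndSwapUint32(addr, old, old&mask) {
			return old
		}
	}
}

// or32 atomically performs *addr |= mask and returns the old value.
func or32(addr *uint32, mask uint32) (old uint32) {
	for {
		old = atomic.LoadUint32(addr)
		if atomic.CompareAndSwapUint32(addr, old, old|mask) {
			return old
		}
	}
}

func main() {
	var flags uint32
	var wg sync.WaitGroup
	// Each goroutine sets one distinct bit; no update may be lost.
	for i := 0; i < 32; i++ {
		wg.Add(1)
		go func(bit uint) {
			defer wg.Done()
			or32(&flags, 1<<bit)
		}(uint(i))
	}
	wg.Wait()
	fmt.Printf("%#x\n", atomic.LoadUint32(&flags))
}
```

A native read-modify-write instruction avoids the retry loop entirely, which is one plausible reason for the roughly halved latencies in the tables above.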
2024-11-11  cmd/compiler,internal/runtime/atomic: optimize xchg{32,64} on loong64  Guoqi Chen
Use Loong64's atomic operation instruction AMSWAPDB{W,V} (full barrier) to implement atomic.Xchg{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
          |  old.bench  |              new.bench              |
          |   sec/op    |   sec/op     vs base                |
Xchg        26.44n ± 0%   12.01n ± 0%  -54.58% (p=0.000 n=20)
Xchg-2      30.10n ± 0%   25.58n ± 0%  -15.02% (p=0.000 n=20)
Xchg-4      30.06n ± 0%   24.82n ± 0%  -17.43% (p=0.000 n=20)
Xchg64      26.44n ± 0%   12.02n ± 0%  -54.54% (p=0.000 n=20)
Xchg64-2    30.10n ± 0%   25.57n ± 0%  -15.05% (p=0.000 n=20)
Xchg64-4    30.05n ± 0%   24.80n ± 0%  -17.47% (p=0.000 n=20)
geomean     28.81n        19.68n       -31.69%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
          |  old.bench  |              new.bench              |
          |   sec/op    |   sec/op     vs base                |
Xchg        25.62n ± 0%   12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg-2      35.01n ± 0%   20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg-4      34.63n ± 0%   19.59n ± 0%  -43.42% (p=0.000 n=20)
Xchg64      25.62n ± 0%   12.41n ± 0%  -51.56% (p=0.000 n=20)
Xchg64-2    35.01n ± 0%   20.59n ± 0%  -41.19% (p=0.000 n=20)
Xchg64-4    34.67n ± 0%   19.59n ± 0%  -43.50% (p=0.000 n=20)
geomean     31.44n        17.11n       -45.59%

Updates #59120.
Change-Id: Ied74fc20338b63799c6d6eeb122c31b42cff0f7e Reviewed-on: https://go-review.googlesource.com/c/go/+/481578 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: WANG Xuerui <git@xen0n.name> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-11-08  cmd/compiler,internal/runtime/atomic: optimize xadd{32,64} on loong64  Guoqi Chen
Use Loong64's atomic operation instruction AMADDDB{W,V} (full barrier) to implement atomic.Xadd{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
          |  bench.old  |              bench.new              |
          |   sec/op    |   sec/op     vs base                |
Xadd        27.24n ± 0%   12.01n ± 0%  -55.91% (p=0.000 n=20)
Xadd-2      31.93n ± 0%   25.55n ± 0%  -19.98% (p=0.000 n=20)
Xadd-4      31.90n ± 0%   24.80n ± 0%  -22.26% (p=0.000 n=20)
Xadd64      27.23n ± 0%   12.01n ± 0%  -55.89% (p=0.000 n=20)
Xadd64-2    31.93n ± 0%   25.57n ± 0%  -19.90% (p=0.000 n=20)
Xadd64-4    31.89n ± 0%   24.80n ± 0%  -22.23% (p=0.000 n=20)
geomean     30.27n        19.67n       -35.01%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
          |  bench.old  |              bench.new              |
          |   sec/op    |   sec/op     vs base                |
Xadd        26.02n ± 0%   12.41n ± 0%  -52.31% (p=0.000 n=20)
Xadd-2      37.36n ± 0%   20.60n ± 0%  -44.86% (p=0.000 n=20)
Xadd-4      37.22n ± 0%   19.59n ± 0%  -47.37% (p=0.000 n=20)
Xadd64      26.42n ± 0%   12.41n ± 0%  -53.03% (p=0.000 n=20)
Xadd64-2    37.77n ± 0%   20.60n ± 0%  -45.46% (p=0.000 n=20)
Xadd64-4    37.78n ± 0%   19.59n ± 0%  -48.15% (p=0.000 n=20)
geomean     33.30n        17.11n       -48.62%

Change-Id: I982539c2aa04680e9dd11b099ba8d5f215bf9b32 Reviewed-on: https://go-review.googlesource.com/c/go/+/481937 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
2024-11-07  cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64  Guoqi Chen
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1] is a new instruction added by LA664 (Loongson 3A6000) and later microarchitectures. Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according to the CPU feature.

The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to ensure consistency in the order of Store/Load [2].

LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses the R0 register, and there is no performance difference between the implementations of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
                 |  bench.old  |              bench.new              |
                 |   sec/op    |   sec/op     vs base                |
AtomicStore64      19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore64-2    19.61n ± 0%   13.61n ± 0%  -30.57% (p=0.000 n=20)
AtomicStore64-4    19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore        19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore-2      19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore-4      19.62n ± 0%   13.62n ± 0%  -30.58% (p=0.000 n=20)
AtomicStore8       19.61n ± 0%   20.01n ± 0%   +2.04% (p=0.000 n=20)
AtomicStore8-2     19.62n ± 0%   20.02n ± 0%   +2.01% (p=0.000 n=20)
AtomicStore8-4     19.61n ± 0%   20.02n ± 0%   +2.09% (p=0.000 n=20)
geomean            19.61n        15.48n       -21.08%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
                 |  bench.old  |              bench.new              |
                 |   sec/op    |   sec/op     vs base                |
AtomicStore64      18.03n ± 0%   12.81n ± 0%  -28.93% (p=0.000 n=20)
AtomicStore64-2    18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore64-4    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore        18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore-2      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore-4      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8       18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-2     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-4     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
geomean            18.01n        12.81n       -28.89%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md

Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097 Reviewed-on: https://go-review.googlesource.com/c/go/+/581356 Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2024-11-01  internal/runtime/maps: return after fatal to help register allocator  khr@golang.org
Seems simple, but putting the return after fatal ensures that at the point of the small group loop, no call has happened so the key is still in a register. This ensures that we don't have to restore the key from the stack before the comparison on each iteration. That gets rid of a load from the inner loop.

name                                        old time/op  new time/op  delta
MapAccessHit/Key=int64/Elem=int64/len=6-8   4.01ns ± 6%  3.85ns ± 3%  -3.92%  (p=0.001 n=10+10)

Change-Id: Ia23ac48e6c5522be88f7d9be0ff3489b2dfc52fc Reviewed-on: https://go-review.googlesource.com/c/go/+/624255 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2024-11-01  internal/runtime/maps: clean up put slot calls  khr@golang.org
Use matchEmptyOrDeleted instead of matchEmpty. Streamline the code a bit. TODO: replicate in all the _fast files. Change-Id: I4df16a13a19df3aaae0c42e0c12f20552f08ead6 Reviewed-on: https://go-review.googlesource.com/c/go/+/624055 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-11-01  internal/runtime/maps: use matchEmptyOrDeleted instead of matchEmpty  khr@golang.org
It's a bit more efficient. Change-Id: If813a597516c41fdac6f60e586641d0ee1cde025 Reviewed-on: https://go-review.googlesource.com/c/go/+/623818 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-01  internal/runtime/maps: removed unused convertNonFullToEmptyAndFullToDeleted  Keith Randall
I don't think we have any code that uses this function. Unless it is something for the future. Change-Id: I7e44634f7a9c1d4d64d84c358447ccf213668d92 Reviewed-on: https://go-review.googlesource.com/c/go/+/622077 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-11-01  internal/runtime/maps: simplify emptyOrDeleted condition  Keith Randall
Change-Id: I37e5bba9cd62b2d970754ac24da7e1397ef12fd4 Reviewed-on: https://go-review.googlesource.com/c/go/+/622076 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-31  internal/runtime/atomic: fix uintptr usage in arm And8/Or8  Mauri de Souza Meneguzzo
In CL 622075, I introduced code that violated unsafe.Pointer rules by casting to uintptr and back across statements. This change corrects it. Change-Id: Ib6f6c08d9ce33aaeaf41f390c7e9f13a7b8cb974 GitHub-Last-Rev: 01cc68a87c8c0ad068c71a911013421f28a8b4ef GitHub-Pull-Request: golang/go#70129 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm Reviewed-on: https://go-review.googlesource.com/c/go/+/623755 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-30  internal/runtime/atomic: add Xchg8 for arm  Mauri de Souza Meneguzzo
For #69735 Change-Id: I18c0ca15d94a9b1751c1e55459283e01dc114150 GitHub-Last-Rev: dd9a39a5551e5a3415ab765cf271fecdbbe89b4c GitHub-Pull-Request: golang/go#69924 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm Reviewed-on: https://go-review.googlesource.com/c/go/+/620855 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-30  cmd/compile,internal/runtime/maps: stack allocated maps and small alloc  Michael Pratt
The compiler will stack allocate the Map struct and initial group if possible. Stack maps are initialized inline without calling into the runtime. Small heap allocated maps use makemap_small. These are the same heuristics as existing maps. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I6c371d1309716fd1c38a3212d417b3c76db5c9b9 Reviewed-on: https://go-review.googlesource.com/c/go/+/622042 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-30  internal/runtime/maps: store group across Iter.Next calls  Michael Pratt
A previous CL kept it across loop iterations, but those are more rare than call iterations. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ieea0f1677e357f5e451650b1c697da7f63f3bca1 Reviewed-on: https://go-review.googlesource.com/c/go/+/621116 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30  internal/runtime/maps: avoid table lookup on most Iter.Next calls  Michael Pratt
Speeds up iteration by about 3%. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I3406376fb8db87306d52e665fcee1f33cf610f24 Reviewed-on: https://go-review.googlesource.com/c/go/+/621115 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-30  internal/runtime/maps: optimize small map lookups with int keys  Michael Pratt
Load the field we need from the type once outside the search loop. Get rid of the multiply to compute the slot position. Instead compute the slot position incrementally using addition. Move the hashing later in access2. Based on khr@'s CL 618959. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Id11b5479fa5bc0130a1d8d9e664d0206d24942ea Reviewed-on: https://go-review.googlesource.com/c/go/+/620217 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
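Two of the micro-optimizations listed in the commit above (loading the type field once, and stepping the slot position by addition instead of multiplying) can be sketched like this. The byte-buffer layout, the `layout` struct, and the `find` and `makeSlots` helpers are all hypothetical stand-ins for the runtime's type descriptor and group memory.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// layout stands in for the map's type descriptor (hypothetical).
type layout struct {
	slotSize int // bytes per slot; the key occupies the first 8
}

// makeSlots builds a buffer of slots holding the given keys.
func makeSlots(keys []uint64, slotSize int) []byte {
	buf := make([]byte, len(keys)*slotSize)
	for i, k := range keys {
		binary.LittleEndian.PutUint64(buf[i*slotSize:], k)
	}
	return buf
}

// find returns the index of the slot holding key, or -1.
func find(buf []byte, lay *layout, key uint64) int {
	slotSize := lay.slotSize // loaded once, outside the search loop
	off := 0
	for i := 0; off+8 <= len(buf); i++ {
		if binary.LittleEndian.Uint64(buf[off:]) == key {
			return i
		}
		off += slotSize // incremental addition, no i*slotSize multiply
	}
	return -1
}

func main() {
	lay := &layout{slotSize: 16}
	buf := makeSlots([]uint64{10, 20, 30, 40}, lay.slotSize)
	fmt.Println(find(buf, lay, 30), find(buf, lay, 99)) // prints: 2 -1
}
```

A compiler may well perform this strength reduction on its own in simple loops; the commit indicates that doing it by hand helped in the runtime's case.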
2024-10-30  internal/runtime/maps: use uintptr instead of uint32 for index in group  Michael Pratt
This avoids some zero-extension ops on 64-bit machines. Based on khr@'s CL 619479. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ie9a56da26382dc9e515c613abc8cf6fec3767671 Reviewed-on: https://go-review.googlesource.com/c/go/+/620216 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-30  internal/runtime/maps: cleanup seed usage  Michael Pratt
Keep only a single seed; initialize it; and reset it when the map is empty. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Icc231f70957337a2d0dcd9c7daf9bd3cb4354d71 Reviewed-on: https://go-review.googlesource.com/c/go/+/616466 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30  runtime,internal/runtime/maps: specialized swissmaps  Michael Pratt
Add all the specialized variants that exist for the existing maps. Like the existing maps, the fast variants do not support indirect key/elem. Note that as of this CL, the Get and Put methods on Map/table are effectively dead. They are only reachable from the internal/runtime/maps unit tests. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I95297750be6200f34ec483e4cfc897f048c26db7 Reviewed-on: https://go-review.googlesource.com/c/go/+/616463 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-30  cmd/compile,runtime: add indirect key/elem to swissmap  Michael Pratt
We use the same heuristics as existing maps. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I44bb51483cae2c1714717f1b501850fb9e55a39a Reviewed-on: https://go-review.googlesource.com/c/go/+/616461 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30  runtime: add concurrent write checks to swissmap  Michael Pratt
This is the same design as existing maps. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I5f6ef5fea1e0f0616bcd90eaae7faee4cdac58c6 Reviewed-on: https://go-review.googlesource.com/c/go/+/616460 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-30  internal/runtime/maps: enable race for map functions in internal/runtime/maps  Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Iebc7f5482299cb7c4ecccc4c2eb46b4bc42c5fc3 Reviewed-on: https://go-review.googlesource.com/c/go/+/616459 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-30  internal/runtime/maps: proper capacity hint handling  Michael Pratt
When given a hint size, set the initial capacity large enough to avoid requiring growth in the average case. When not given a hint (or given 0), don't allocate anything at all. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I8844fc652b8d2d4e5136cd56f7e78999a07fe381 Reviewed-on: https://go-review.googlesource.com/c/go/+/616457 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
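One way to turn a size hint into an initial capacity, in the spirit of the commit above, is sketched below. The specifics are assumptions made for illustration and are not taken from the CL: groups of 8 slots, a maximum load factor of 7/8 for multi-group tables, power-of-two capacities, and a single full group for hints of 8 or fewer. `capacityFor` is a hypothetical name.

```go
package main

import "fmt"

const groupSlots = 8 // slots per group (assumed)

// capacityFor returns the number of slots to allocate for a hint so
// that inserting hint elements does not trigger growth, under the
// assumed 7/8 maximum load factor.
func capacityFor(hint int) int {
	if hint <= 0 {
		return 0 // no hint: allocate nothing until the first insert
	}
	if hint <= groupSlots {
		return groupSlots // a single group may fill completely (assumed)
	}
	need := (hint*8 + 6) / 7 // ceil(hint * 8/7)
	c := groupSlots
	for c < need {
		c *= 2 // keep capacity a power of two
	}
	return c
}

func main() {
	for _, hint := range []int{0, 1, 8, 9, 14, 15, 1000} {
		fmt.Printf("hint=%d capacity=%d\n", hint, capacityFor(hint))
	}
}
```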
2024-10-29  runtime: move mapaccess1 and mapassign to internal/runtime/maps  Michael Pratt
This enables manual inlining Map.Get/table.getWithoutKey to create a simple fast path with no calls. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ic208dd4c02c7554f312b85b5fadccaf82b23545c Reviewed-on: https://go-review.googlesource.com/c/go/+/616455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-29  internal/runtime/maps: remove type fields  Michael Pratt
Rather than storing the same type pointer in multiple places, just pass it around. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ia6c74805c7a44125ae473177b317f16c6688e6de Reviewed-on: https://go-review.googlesource.com/c/go/+/622377 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-28  internal/runtime/maps: shift optimizations  Michael Pratt
Masking the shift lets the compiler elide a few instructions for handling a shift of > 63 bits. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I669fe01caa1de1b8521f1f56b6906f3e9066a39b Reviewed-on: https://go-review.googlesource.com/c/go/+/611190 Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
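The effect of masking a shift count, as in the commit above, can be seen in a minimal sketch. Go defines `x >> s` to yield 0 for an unsigned `x` when `s` is 64 or more, so on hardware whose shift instruction wraps the count the compiler must emit extra code for that case. Masking with 63 tells the compiler the count is always in range. The function names are hypothetical.

```go
package main

import "fmt"

// shiftPlain must produce 0 when s >= 64, which may cost a compare
// and a conditional move on some architectures.
func shiftPlain(x uint64, s uint) uint64 { return x >> s }

// shiftMasked promises the count is in [0, 63], so a bare shift
// instruction is enough. It is only equivalent when s < 64.
func shiftMasked(x uint64, s uint) uint64 { return x >> (s & 63) }

func main() {
	fmt.Println(shiftPlain(0xF0, 4), shiftMasked(0xF0, 4)) // prints: 15 15
	fmt.Println(shiftPlain(1, 64), shiftMasked(1, 64))     // prints: 0 1
}
```

The second line of output shows why the mask is only safe when the caller already guarantees a count below 64.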
2024-10-28  internal/runtime/maps: avoid passing unused key return  Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Idee1e021e3cef8f0c031e8f06efbcf6e88918d8a Reviewed-on: https://go-review.googlesource.com/c/go/+/622376 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-28  internal/runtime/maps: linear scan of small map  Michael Pratt
We still use the hash and control word, but loop over all 8 bytes instead of doing the match operation, which ends up being slightly faster when there is only one group. Note that specialized variants added later will avoid hashing at all. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I3bb353b023dd6120b6585e87d3efe2f18ac9e1ef Reviewed-on: https://go-review.googlesource.com/c/go/+/611189 Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
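A linear scan over a single group, as the commit above describes, might be sketched as follows. The group layout, the toy multiplicative `hash`, the 7-bit `h2` fingerprint, and the 0x80 "empty" marker are assumptions for illustration; deletion is omitted to keep the example short.

```go
package main

import "fmt"

const (
	groupSlots = 8
	ctrlEmpty  = 0x80 // high bit set: slot is not full (assumed encoding)
)

// group is a single 8-slot group with one control byte per slot.
type group struct {
	ctrl [groupSlots]uint8
	keys [groupSlots]uint64
	vals [groupSlots]string
}

func newGroup() *group {
	g := &group{}
	for i := range g.ctrl {
		g.ctrl[i] = ctrlEmpty
	}
	return g
}

// hash is a toy hash function used only for this sketch.
func hash(k uint64) uint64 { return k * 0x9E3779B97F4A7C15 }

// h2 extracts the 7 hash bits stored in the control byte.
func h2(h uint64) uint8 { return uint8(h & 0x7f) }

// put inserts or updates k. It reports false if the group is full.
func (g *group) put(k uint64, v string) bool {
	want, free := h2(hash(k)), -1
	for i := range g.ctrl {
		if g.ctrl[i] == want && g.keys[i] == k {
			g.vals[i] = v
			return true
		}
		if g.ctrl[i] == ctrlEmpty && free < 0 {
			free = i
		}
	}
	if free < 0 {
		return false
	}
	g.ctrl[free], g.keys[free], g.vals[free] = want, k, v
	return true
}

// get loops over all 8 control bytes; the key is compared only when
// the control byte matches the key's h2 bits.
func (g *group) get(k uint64) (string, bool) {
	want := h2(hash(k))
	for i := range g.ctrl {
		if g.ctrl[i] == want && g.keys[i] == k {
			return g.vals[i], true
		}
	}
	return "", false
}

func main() {
	g := newGroup()
	g.put(7, "seven")
	fmt.Println(g.get(7))
	fmt.Println(g.get(8))
}
```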
2024-10-28  internal/runtime/maps: small maps point directly to a group  Michael Pratt
If the map contains 8 or fewer entries, it is wasteful to have a directory that points to a table that points to a group. Add a special case that replaces the directory with a direct pointer to a group. We could theoretically do similar for single table maps (no directory, just point directly to a table), but that is left for later. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I6fc04dfc11c31dadfe5b5d6481b4c4abd43d48ed Reviewed-on: https://go-review.googlesource.com/c/go/+/611188 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-28  internal/runtime/maps: speed up modulo  Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ic47721e101f6fee650e6825a5a241fcd12fa0009 Reviewed-on: https://go-review.googlesource.com/c/go/+/611185 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-28  internal/runtime/maps: reuse deleted slots on insert  Michael Pratt
While walking the probe sequence, Put keeps track of the first deleted slot it encountered. If it reaches the end of the probe sequence without finding a match, then it will prefer to use the deleted slot rather than a new empty slot. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: I19356ef6780176506f57b42990ac15dc426f1b14 Reviewed-on: https://go-review.googlesource.com/c/go/+/618016 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org>
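The slot-reuse policy from the commit above can be sketched on a simplified table. To keep the example deterministic it uses plain linear probing over individual slots and the key itself as the hash, which are simplifications and not how the runtime probes; the control byte values and the `table`, `put`, and `del` names are likewise hypothetical.

```go
package main

import "fmt"

const (
	ctrlEmpty   = 0x80
	ctrlDeleted = 0xFE
	ctrlFull    = 0x00 // a real table would keep 7 hash bits here
)

type table struct {
	ctrl []uint8
	keys []uint64
}

func newTable(n int) *table {
	tb := &table{ctrl: make([]uint8, n), keys: make([]uint64, n)}
	for i := range tb.ctrl {
		tb.ctrl[i] = ctrlEmpty
	}
	return tb
}

// start uses the key itself as the hash (toy choice for the sketch).
func (tb *table) start(k uint64) int { return int(k % uint64(len(tb.ctrl))) }

// put returns the slot holding k after the call, or -1 if the table
// is full. It remembers the first deleted slot on the probe sequence
// and prefers it over a fresh empty slot once k is known to be absent.
func (tb *table) put(k uint64) int {
	n, firstDeleted := len(tb.ctrl), -1
	for p, i := 0, tb.start(k); p < n; p, i = p+1, (i+1)%n {
		switch tb.ctrl[i] {
		case ctrlEmpty: // end of the probe sequence: k is not present
			if firstDeleted >= 0 {
				i = firstDeleted
			}
			tb.ctrl[i], tb.keys[i] = ctrlFull, k
			return i
		case ctrlDeleted:
			if firstDeleted < 0 {
				firstDeleted = i
			}
		default:
			if tb.keys[i] == k {
				return i // already present
			}
		}
	}
	if firstDeleted >= 0 {
		tb.ctrl[firstDeleted], tb.keys[firstDeleted] = ctrlFull, k
	}
	return firstDeleted
}

// del marks k's slot as deleted (a tombstone) so later probes continue past it.
func (tb *table) del(k uint64) bool {
	n := len(tb.ctrl)
	for p, i := 0, tb.start(k); p < n; p, i = p+1, (i+1)%n {
		if tb.ctrl[i] == ctrlEmpty {
			return false
		}
		if tb.ctrl[i] == ctrlFull && tb.keys[i] == k {
			tb.ctrl[i] = ctrlDeleted
			return true
		}
	}
	return false
}

func main() {
	tb := newTable(8)
	fmt.Println(tb.put(0), tb.put(8), tb.put(16)) // three colliding keys
	tb.del(8)
	fmt.Println(tb.put(24)) // reuses the slot freed by deleting 8
}
```

The insert must still walk to the end of the probe sequence before reusing the tombstone, because the key might already exist further along.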
2024-10-28  internal/runtime/maps: merge Iter.groupIdx and Iter.slotIdx  Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap Change-Id: Ie21ef0f33f42735eadccd75eeebb3b5e81c2f459 Reviewed-on: https://go-review.googlesource.com/c/go/+/618535 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Commit-Queue: Michael Pratt <mpratt@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-28  internal/runtime/atomic: add arm native implementations of And8/Or8  Mauri de Souza Meneguzzo
With LDREXB/STREXB now available for the arm assembler, we can implement these operations natively. The instructions are armv6k+, but for simplicity I only use them on armv7.

Benchmark results for a Raspberry Pi 3 Model B+:

goos: linux
goarch: arm
pkg: internal/runtime/atomic
cpu: ARMv7 Processor rev 4 (v7l)
         │   old.txt    │               new.txt               │
         │    sec/op    │   sec/op     vs base                │
And8-4     127.65n ± 0%   68.74n ± 0%  -46.15% (p=0.000 n=10)

Change-Id: Ic87f307c35f7d7f56010980302f253056f6d54dc GitHub-Last-Rev: a7351802fd212704712b37d183435ab14e58f885 GitHub-Pull-Request: golang/go#70002 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm Reviewed-on: https://go-review.googlesource.com/c/go/+/622075 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21  cmd/compile,internal/runtime/maps: add extendible hashing  Michael Pratt
Extendible hashing splits a swisstable map into many swisstables. This keeps grow operations small. For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Id91f34af9e686bf35eb8882ee479956ece89e821 Reviewed-on: https://go-review.googlesource.com/c/go/+/604936 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
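As a rough sketch of the idea in the commit above: a directory indexed by the top bits of the hash points at many small tables, and a full table is split in two instead of rehashing the whole map. Everything here is an assumption made for illustration: the names extMap, smallTable and maxPerTable, the multiplicative hash, and the use of a built-in Go map as a stand-in for one swiss table. It is not the layout used by internal/runtime/maps.

```go
package main

import "fmt"

// maxPerTable is a hypothetical capacity limit for one small table.
const maxPerTable = 8

// smallTable stands in for a single swiss table.
type smallTable struct {
	localDepth uint           // number of top hash bits this table is responsible for
	entries    map[uint64]int // placeholder storage, not a real swiss table
}

// extMap is a directory of small tables; len(dir) == 1 << globalDepth.
type extMap struct {
	globalDepth uint
	dir         []*smallTable
}

func newExtMap() *extMap {
	return &extMap{dir: []*smallTable{{entries: map[uint64]int{}}}}
}

// hash is a toy multiplicative hash, chosen so the top bits are well mixed.
func hash(key uint64) uint64 { return key * 0x9E3779B97F4A7C15 }

// index selects a directory entry from the top globalDepth bits of the hash.
// In Go, shifting a uint64 by 64 yields 0, so globalDepth == 0 works too.
func (m *extMap) index(h uint64) uint64 { return h >> (64 - m.globalDepth) }

func (m *extMap) get(key uint64) (int, bool) {
	v, ok := m.dir[m.index(hash(key))].entries[key]
	return v, ok
}

func (m *extMap) put(key uint64, val int) {
	for {
		t := m.dir[m.index(hash(key))]
		if _, exists := t.entries[key]; exists || len(t.entries) < maxPerTable {
			t.entries[key] = val
			return
		}
		m.split(t) // only this one table is rehashed; then retry
	}
}

// split replaces t with two tables that each cover half of t's hash range.
func (m *extMap) split(t *smallTable) {
	if t.localDepth == m.globalDepth {
		// The directory cannot distinguish the halves yet: double it.
		// Old entry i becomes new entries 2i and 2i+1.
		newDir := make([]*smallTable, 2*len(m.dir))
		for i, p := range m.dir {
			newDir[2*i] = p
			newDir[2*i+1] = p
		}
		m.dir = newDir
		m.globalDepth++
	}
	depth := t.localDepth + 1
	left := &smallTable{localDepth: depth, entries: map[uint64]int{}}
	right := &smallTable{localDepth: depth, entries: map[uint64]int{}}
	for k, v := range t.entries {
		// The depth-th bit from the top of the hash picks the half.
		if (hash(k)>>(64-depth))&1 == 0 {
			left.entries[k] = v
		} else {
			right.entries[k] = v
		}
	}
	// Repoint every directory entry that referred to t. A full scan is
	// used for simplicity; the affected entries are in fact contiguous.
	for i := range m.dir {
		if m.dir[i] != t {
			continue
		}
		if (uint64(i)>>(m.globalDepth-depth))&1 == 0 {
			m.dir[i] = left
		} else {
			m.dir[i] = right
		}
	}
}

func main() {
	m := newExtMap()
	for k := uint64(0); k < 1000; k++ {
		m.put(k, int(k)*2)
	}
	v, ok := m.get(123)
	fmt.Println("get(123):", v, ok)
	fmt.Println("directory entries:", len(m.dir))
}
```

In this sketch a split touches at most maxPerTable entries plus the directory, so the cost of a single grow step stays bounded as the map gets larger, which matches the commit's stated goal of keeping grow operations small.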
2024-10-18  internal/runtime/atomic: add Xchg8 for 386  Mauri de Souza Meneguzzo
For #69735 Change-Id: I5b9f57315d693d613dc88dc02c10bee39aeeef76 GitHub-Last-Rev: 690337e5b81a48bdcb808526d0c5f4837e8912b7 GitHub-Pull-Request: golang/go#69923 Reviewed-on: https://go-review.googlesource.com/c/go/+/620756 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-10-14  all: wire up swisstable maps  Michael Pratt
Use the new SwissTable-based map in internal/runtime/maps as the basis for the runtime map when GOEXPERIMENT=swissmap. Integration is complete enough to pass all.bash.

Notable missing features:
* Race integration / concurrent write detection
* Stack-allocated maps
* Specialized "fast" map variants
* Indirect key / elem

For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639 Reviewed-on: https://go-review.googlesource.com/c/go/+/594596 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-09  internal/runtime/maps: support big endian architectures  Michael Pratt
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10 Change-Id: I0a928c4b1e90056c50d2abca8982bdb540c33a34 Reviewed-on: https://go-review.googlesource.com/c/go/+/619035 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2024-10-08  cmd/compile, internal/runtime/atomic: add Xchg8 for arm64  Mauri de Souza Meneguzzo
For #69735 Change-Id: I61a2e561684c538eea705e60c8ebda6be3ef31a7 GitHub-Last-Rev: 3c7f4ec845182d3ef1a007319d91027433163db3 GitHub-Pull-Request: golang/go#69751 Reviewed-on: https://go-review.googlesource.com/c/go/+/617595 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-08  internal/runtime/maps: initial swiss table map implementation  Michael Pratt
Add a new package that will contain a new "Swiss Table" (https://abseil.io/about/design/swisstables) map implementation, which is intended to eventually replace the existing runtime map implementation. This implementation is based on the fabulous github.com/cockroachdb/swiss package contributed by Peter Mattis. This CL adds a hash map implementation. It supports all the core operations, but does not have incremental growth. For #54766. Change-Id: I52cf371448c3817d471ddb1f5a78f3513565db41 Reviewed-on: https://go-review.googlesource.com/c/go/+/582415 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-10-07  cmd/compile: add internal/runtime/atomic.Xchg8 intrinsic for PPC64  Paul E. Murphy
This is a minor extension of the existing support for 32 and 64 bit types. For #69735 Change-Id: I6828ec223951d2b692e077dc507b000ac23c32a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/617496 Reviewed-by: Rhys Hiltner <rhys.hiltner@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Reviewed-by: Keith Randall <khr@google.com>
2024-10-02  internal/runtime/atomic: add Xchg8 for amd64  Rhys Hiltner
For #68578 Change-Id: Idecfdbb793f46560dd69287af9170c07cf4ee973 Reviewed-on: https://go-review.googlesource.com/c/go/+/606900 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com> Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-09-17  runtime: move getclosureptr to internal/runtime/sys  Michael Pratt
Moving these intrinsics to a base package enables other internal/runtime packages to use them. There is no immediate need for getclosureptr outside of runtime, but it is moved for consistency with the other intrinsics. For #54766. Change-Id: Ia68b16a938c8cb84cb222469db28e3a83861be5d Reviewed-on: https://go-review.googlesource.com/c/go/+/613262 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-09-17  runtime: move getcallersp to internal/runtime/sys  Michael Pratt
Moving these intrinsics to a base package enables other internal/runtime packages to use them. For #54766. Change-Id: I45a530422207dd94b5ad4eee51216c9410a84040 Reviewed-on: https://go-review.googlesource.com/c/go/+/613261 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-17  runtime: move getcallerpc to internal/runtime/sys  Michael Pratt
Moving these intrinsics to a base package enables other internal/runtime packages to use them. For #54766. Change-Id: I0b3eded3bb45af53e3eb5bab93e3792e6a8beb46 Reviewed-on: https://go-review.googlesource.com/c/go/+/613260 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-08-26  internal/runtime/sys: fix typo in comment  Kevin Z
just removed a single byte :) Change-Id: Icd734f9f8f22b2ed0d9d0125d18b6d291bb14cd6 GitHub-Last-Rev: 93c0fd00d863c8a992c63f1bc01c0877b1bdff0c GitHub-Pull-Request: golang/go#69056 Reviewed-on: https://go-review.googlesource.com/c/go/+/607878 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-08-01  cmd/compiler,internal/runtime/atomic: optimize Load{64,32,8} on loong64  Guoqi Chen
The LoadAcquire barrier on Loong64 is "dbar 0x14". Using the correct barrier in the Load{8,32,64} implementations improves performance.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000-HV @ 2500.00MHz
                |  bench.old   |              bench.new              |
                |    sec/op    |   sec/op     vs base                |
AtomicLoad64      17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad64-2    17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad64-4    17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad        17.220n ± 0%   4.402n ± 0%  -74.44% (p=0.000 n=20)
AtomicLoad-2      17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad-4      17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad8       17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad8-2     17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
AtomicLoad8-4     17.210n ± 0%   4.402n ± 0%  -74.42% (p=0.000 n=20)
geomean           17.21n         4.402n       -74.42%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000 @ 2500.00MHz
                |  bench.old   |              bench.new              |
                |    sec/op    |   sec/op     vs base                |
AtomicLoad64      18.82n ± 0%    10.41n ± 0%  -44.69% (p=0.000 n=20)
AtomicLoad64-2    18.81n ± 0%    10.41n ± 0%  -44.66% (p=0.000 n=20)
AtomicLoad64-4    18.82n ± 0%    10.41n ± 0%  -44.69% (p=0.000 n=20)
AtomicLoad        18.81n ± 0%    10.41n ± 0%  -44.66% (p=0.000 n=20)
AtomicLoad-2      18.82n ± 0%    10.41n ± 0%  -44.69% (p=0.000 n=20)
AtomicLoad-4      18.81n ± 0%    10.42n ± 0%  -44.63% (p=0.000 n=20)
AtomicLoad8       18.82n ± 0%    10.41n ± 0%  -44.69% (p=0.000 n=20)
AtomicLoad8-2     18.82n ± 0%    10.41n ± 0%  -44.70% (p=0.000 n=20)
AtomicLoad8-4     18.82n ± 0%    10.41n ± 0%  -44.69% (p=0.000 n=20)
geomean           18.82n         10.41n       -44.68%

Change-Id: I9d47c9d6f359c4f2e41035ca656429aade2e7847 Reviewed-on: https://go-review.googlesource.com/c/go/+/581357 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-07-23  runtime,internal: move runtime/internal/sys to internal/runtime/sys  David Chase
Cleanup and friction reduction. For #65355. Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511 Reviewed-on: https://go-review.googlesource.com/c/go/+/600436 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>