| Age | Commit message (Collapse) | Author |
|
Change-Id: Ie7649060db25f1573eeaadd534a600bb24d30572
GitHub-Last-Rev: c617848a4ec9f5c21820982efc95e0ec4ca2510c
GitHub-Pull-Request: golang/go#70134
Reviewed-on: https://go-review.googlesource.com/c/go/+/623757
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
|
|
benchmark:
goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
Ceil 10.810n ± 0% 2.578n ± 0% -76.15% (p=0.000 n=20)
Floor 10.810n ± 0% 2.531n ± 0% -76.59% (p=0.000 n=20)
Trunc 9.606n ± 0% 2.530n ± 0% -73.67% (p=0.000 n=20)
geomean 10.39n 2.546n -75.50%
goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A5000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
Ceil 13.220n ± 0% 7.703n ± 8% -41.73% (p=0.000 n=20)
Floor 12.410n ± 0% 7.248n ± 2% -41.59% (p=0.000 n=20)
Trunc 11.210n ± 0% 7.757n ± 4% -30.80% (p=0.000 n=20)
geomean 12.25n 7.566n -38.25%
Change-Id: I3af51e9852e9cf5f965fed895d68945a2e8675f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/612615
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Follow-up on CL 467555.
Change-Id: I1815b5def656ae4b86c31385ad0737f0465fa2d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/613535
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Tim King <taking@google.com>
|
|
Rather than conditionally assigning ujn, initialise ujn above the
loop to invent the leading 0 for u, then unconditionally load ujn
at the bottom of the loop. This code operates on the basis that
n >= 2, hence j+n-1 is always greater than zero.
Change-Id: I1272ef30c787ed8707ae8421af2adcccc776d389
Reviewed-on: https://go-review.googlesource.com/c/go/+/467555
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Robert Griesemer <gri@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
This CL reapplies CL 504737 and adds integer precision
limitation check, since CL 504737 only checks whether
floating point number is +-Inf or NaN.
This CL is also ~7% faster than CL 504737.
Updates #68322
goos: linux
goarch: riscv64
pkg: math
│ math.old.bench │ math.new.bench │
│ sec/op │ sec/op vs base │
Ceil 54.09n ± 0% 18.72n ± 0% -65.39% (p=0.000 n=10)
Floor 40.72n ± 0% 18.72n ± 0% -54.03% (p=0.000 n=10)
Round 20.73n ± 0% 20.73n ± 0% ~ (p=1.000 n=10)
RoundToEven 24.07n ± 0% 24.07n ± 0% ~ (p=1.000 n=10)
Trunc 38.72n ± 0% 18.72n ± 0% -51.65% (p=0.000 n=10)
geomean 33.56n 20.09n -40.13%
Change-Id: I06cfe2cb9e2535cd705d40b6650a7e71fedd906c
Reviewed-on: https://go-review.googlesource.com/c/go/+/600075
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I535a7aaaf3f9e8a9c0e0c04f8f745ad7445a32f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/611678
Run-TryBot: shuang cui <imcusg@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
|
|
This CL adds trunc,ceil,floor tests for large exact float.
Change-Id: Ib7ffec1d2d50d2ac955398a3dd0fd06d494fcf4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/601095
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
All changes are related to the code, except for the comments in src/regexp/syntax/parse.go and src/slices/slices.go.
Change-Id: I73c5d3c54099749b62210aa7f3182c5eb84bb6a6
GitHub-Last-Rev: 794aa9b0539811d00e1cd42be1e8d9fe9afe0281
GitHub-Pull-Request: golang/go#69170
Reviewed-on: https://go-review.googlesource.com/c/go/+/609678
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
As some callers don't have a testing context, modify testenv.Executable
to accept nil (similar to how testenv.GOROOT works).
Change-Id: I39112a7869933785a26b5cb6520055b3cc42b847
Reviewed-on: https://go-review.googlesource.com/c/go/+/609835
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This provides an assembly implementation of addMulVVW for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ addmulvvw.1 │ addmulvvw.2 │
│ sec/op │ sec/op vs base │
AddMulVVW/1-4 65.49n ± 0% 50.79n ± 0% -22.44% (p=0.000 n=10)
AddMulVVW/2-4 82.81n ± 0% 66.83n ± 0% -19.29% (p=0.000 n=10)
AddMulVVW/3-4 100.20n ± 0% 82.87n ± 0% -17.30% (p=0.000 n=10)
AddMulVVW/4-4 117.50n ± 0% 84.20n ± 0% -28.34% (p=0.000 n=10)
AddMulVVW/5-4 134.9n ± 0% 100.3n ± 0% -25.69% (p=0.000 n=10)
AddMulVVW/10-4 221.7n ± 0% 164.4n ± 0% -25.85% (p=0.000 n=10)
AddMulVVW/100-4 1.794µ ± 0% 1.250µ ± 0% -30.32% (p=0.000 n=10)
AddMulVVW/1000-4 17.42µ ± 0% 12.08µ ± 0% -30.68% (p=0.000 n=10)
AddMulVVW/10000-4 254.9µ ± 0% 214.8µ ± 0% -15.75% (p=0.000 n=10)
AddMulVVW/100000-4 2.569m ± 0% 2.178m ± 0% -15.20% (p=0.000 n=10)
geomean 1.443µ 1.107µ -23.29%
│ addmulvvw.1 │ addmulvvw.2 │
│ B/s │ B/s vs base │
AddMulVVW/1-4 932.0Mi ± 0% 1201.6Mi ± 0% +28.93% (p=0.000 n=10)
AddMulVVW/2-4 1.440Gi ± 0% 1.784Gi ± 0% +23.90% (p=0.000 n=10)
AddMulVVW/3-4 1.785Gi ± 0% 2.158Gi ± 0% +20.87% (p=0.000 n=10)
AddMulVVW/4-4 2.029Gi ± 0% 2.832Gi ± 0% +39.59% (p=0.000 n=10)
AddMulVVW/5-4 2.209Gi ± 0% 2.973Gi ± 0% +34.55% (p=0.000 n=10)
AddMulVVW/10-4 2.689Gi ± 0% 3.626Gi ± 0% +34.86% (p=0.000 n=10)
AddMulVVW/100-4 3.323Gi ± 0% 4.770Gi ± 0% +43.54% (p=0.000 n=10)
AddMulVVW/1000-4 3.421Gi ± 0% 4.936Gi ± 0% +44.27% (p=0.000 n=10)
AddMulVVW/10000-4 2.338Gi ± 0% 2.776Gi ± 0% +18.69% (p=0.000 n=10)
AddMulVVW/100000-4 2.320Gi ± 0% 2.736Gi ± 0% +17.93% (p=0.000 n=10)
geomean 2.109Gi 2.749Gi +30.36%
Change-Id: I6c7ee48233c53ff9b6a5a9002675886cd9bff5af
Reviewed-on: https://go-review.googlesource.com/c/go/+/595400
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This provides an assembly implementation of mulAddVWW for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ muladdvww.1 │ muladdvww.2 │
│ sec/op │ sec/op vs base │
MulAddVWW/1-4 68.18n ± 0% 65.49n ± 0% -3.95% (p=0.000 n=10)
MulAddVWW/2-4 82.81n ± 0% 78.85n ± 0% -4.78% (p=0.000 n=10)
MulAddVWW/3-4 97.49n ± 0% 72.18n ± 0% -25.96% (p=0.000 n=10)
MulAddVWW/4-4 112.20n ± 0% 85.54n ± 0% -23.76% (p=0.000 n=10)
MulAddVWW/5-4 126.90n ± 0% 98.90n ± 0% -22.06% (p=0.000 n=10)
MulAddVWW/10-4 200.3n ± 0% 144.3n ± 0% -27.96% (p=0.000 n=10)
MulAddVWW/100-4 1532.0n ± 0% 860.0n ± 0% -43.86% (p=0.000 n=10)
MulAddVWW/1000-4 14.757µ ± 0% 8.076µ ± 0% -45.27% (p=0.000 n=10)
MulAddVWW/10000-4 204.0µ ± 0% 137.1µ ± 0% -32.77% (p=0.000 n=10)
MulAddVWW/100000-4 2.066m ± 0% 1.382m ± 0% -33.12% (p=0.000 n=10)
geomean 1.311µ 950.0n -27.51%
│ muladdvww.1 │ muladdvww.2 │
│ B/s │ B/s vs base │
MulAddVWW/1-4 895.1Mi ± 0% 932.0Mi ± 0% +4.11% (p=0.000 n=10)
MulAddVWW/2-4 1.440Gi ± 0% 1.512Gi ± 0% +5.02% (p=0.000 n=10)
MulAddVWW/3-4 1.834Gi ± 0% 2.477Gi ± 0% +35.07% (p=0.000 n=10)
MulAddVWW/4-4 2.125Gi ± 0% 2.787Gi ± 0% +31.15% (p=0.000 n=10)
MulAddVWW/5-4 2.349Gi ± 0% 3.013Gi ± 0% +28.28% (p=0.000 n=10)
MulAddVWW/10-4 2.975Gi ± 0% 4.130Gi ± 0% +38.79% (p=0.000 n=10)
MulAddVWW/100-4 3.891Gi ± 0% 6.930Gi ± 0% +78.11% (p=0.000 n=10)
MulAddVWW/1000-4 4.039Gi ± 0% 7.380Gi ± 0% +82.72% (p=0.000 n=10)
MulAddVWW/10000-4 2.922Gi ± 0% 4.346Gi ± 0% +48.74% (p=0.000 n=10)
MulAddVWW/100000-4 2.884Gi ± 0% 4.313Gi ± 0% +49.52% (p=0.000 n=10)
geomean 2.321Gi 3.202Gi +37.95%
Change-Id: If08191607913ce5c7641f34bae8fa5c9dfb44777
Reviewed-on: https://go-review.googlesource.com/c/go/+/595399
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
|
|
This provides an assembly implementation of subVW for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ subvw.1 │ subvw.2 │
│ sec/op │ sec/op vs base │
SubVW/1-4 57.43n ± 0% 41.45n ± 0% -27.82% (p=0.000 n=10)
SubVW/2-4 69.31n ± 0% 48.15n ± 0% -30.53% (p=0.000 n=10)
SubVW/3-4 76.12n ± 0% 54.87n ± 0% -27.92% (p=0.000 n=10)
SubVW/4-4 85.47n ± 0% 56.14n ± 0% -34.32% (p=0.000 n=10)
SubVW/5-4 96.15n ± 0% 62.83n ± 0% -34.65% (p=0.000 n=10)
SubVW/10-4 149.60n ± 0% 89.55n ± 0% -40.14% (p=0.000 n=10)
SubVW/100-4 1115.0n ± 0% 549.3n ± 0% -50.74% (p=0.000 n=10)
SubVW/1000-4 10.732µ ± 0% 5.071µ ± 0% -52.75% (p=0.000 n=10)
SubVW/10000-4 153.0µ ± 0% 103.7µ ± 0% -32.21% (p=0.000 n=10)
SubVW/100000-4 1.542m ± 0% 1.046m ± 0% -32.13% (p=0.000 n=10)
SubVWext/1-4 57.42n ± 0% 41.45n ± 0% -27.81% (p=0.000 n=10)
SubVWext/2-4 69.33n ± 0% 48.15n ± 0% -30.55% (p=0.000 n=10)
SubVWext/3-4 76.12n ± 0% 54.93n ± 0% -27.84% (p=0.000 n=10)
SubVWext/4-4 85.47n ± 0% 56.14n ± 0% -34.32% (p=0.000 n=10)
SubVWext/5-4 96.15n ± 0% 62.83n ± 0% -34.65% (p=0.000 n=10)
SubVWext/10-4 149.60n ± 0% 89.56n ± 0% -40.14% (p=0.000 n=10)
SubVWext/100-4 1115.0n ± 0% 549.3n ± 0% -50.74% (p=0.000 n=10)
SubVWext/1000-4 10.732µ ± 0% 5.061µ ± 0% -52.84% (p=0.000 n=10)
SubVWext/10000-4 152.5µ ± 0% 103.7µ ± 0% -32.02% (p=0.000 n=10)
SubVWext/100000-4 1.533m ± 0% 1.046m ± 0% -31.75% (p=0.000 n=10)
geomean 1.005µ 633.7n -36.92%
│ subvw.1 │ subvw.2 │
│ B/s │ B/s vs base │
SubVW/1-4 132.9Mi ± 0% 184.1Mi ± 0% +38.54% (p=0.000 n=10)
SubVW/2-4 220.1Mi ± 0% 316.9Mi ± 0% +43.95% (p=0.000 n=10)
SubVW/3-4 300.7Mi ± 0% 417.1Mi ± 0% +38.72% (p=0.000 n=10)
SubVW/4-4 357.1Mi ± 0% 543.6Mi ± 0% +52.24% (p=0.000 n=10)
SubVW/5-4 396.7Mi ± 0% 607.2Mi ± 0% +53.03% (p=0.000 n=10)
SubVW/10-4 510.1Mi ± 0% 851.9Mi ± 0% +67.01% (p=0.000 n=10)
SubVW/100-4 684.2Mi ± 0% 1388.9Mi ± 0% +102.99% (p=0.000 n=10)
SubVW/1000-4 710.9Mi ± 0% 1504.5Mi ± 0% +111.63% (p=0.000 n=10)
SubVW/10000-4 498.7Mi ± 0% 735.7Mi ± 0% +47.52% (p=0.000 n=10)
SubVW/100000-4 494.8Mi ± 0% 729.1Mi ± 0% +47.34% (p=0.000 n=10)
SubVWext/1-4 132.9Mi ± 0% 184.1Mi ± 0% +38.53% (p=0.000 n=10)
SubVWext/2-4 220.1Mi ± 0% 316.9Mi ± 0% +44.00% (p=0.000 n=10)
SubVWext/3-4 300.7Mi ± 0% 416.7Mi ± 0% +38.57% (p=0.000 n=10)
SubVWext/4-4 357.1Mi ± 0% 543.6Mi ± 0% +52.24% (p=0.000 n=10)
SubVWext/5-4 396.7Mi ± 0% 607.2Mi ± 0% +53.04% (p=0.000 n=10)
SubVWext/10-4 510.1Mi ± 0% 851.9Mi ± 0% +67.01% (p=0.000 n=10)
SubVWext/100-4 684.2Mi ± 0% 1388.9Mi ± 0% +102.99% (p=0.000 n=10)
SubVWext/1000-4 710.9Mi ± 0% 1507.6Mi ± 0% +112.07% (p=0.000 n=10)
SubVWext/10000-4 500.1Mi ± 0% 735.7Mi ± 0% +47.10% (p=0.000 n=10)
SubVWext/100000-4 497.8Mi ± 0% 729.4Mi ± 0% +46.52% (p=0.000 n=10)
geomean 387.6Mi 614.5Mi +58.51%
Change-Id: I9d7fac719e977710ad9db9121fa298db6df605de
Reviewed-on: https://go-review.googlesource.com/c/go/+/595398
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This provides an assembly implementation of addVW for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ addvw.1 │ addvw.2 │
│ sec/op │ sec/op vs base │
AddVW/1-4 57.43n ± 0% 41.45n ± 0% -27.83% (p=0.000 n=10)
AddVW/2-4 69.31n ± 0% 48.15n ± 0% -30.53% (p=0.000 n=10)
AddVW/3-4 76.12n ± 0% 54.97n ± 0% -27.79% (p=0.000 n=10)
AddVW/4-4 85.47n ± 0% 56.14n ± 0% -34.32% (p=0.000 n=10)
AddVW/5-4 96.16n ± 0% 62.82n ± 0% -34.67% (p=0.000 n=10)
AddVW/10-4 149.60n ± 0% 89.55n ± 0% -40.14% (p=0.000 n=10)
AddVW/100-4 1115.0n ± 0% 549.3n ± 0% -50.74% (p=0.000 n=10)
AddVW/1000-4 10.732µ ± 0% 5.060µ ± 0% -52.85% (p=0.000 n=10)
AddVW/10000-4 151.7µ ± 0% 103.7µ ± 0% -31.63% (p=0.000 n=10)
AddVW/100000-4 1.523m ± 0% 1.050m ± 0% -31.03% (p=0.000 n=10)
AddVWext/1-4 57.42n ± 0% 41.45n ± 0% -27.81% (p=0.000 n=10)
AddVWext/2-4 69.32n ± 0% 48.15n ± 0% -30.54% (p=0.000 n=10)
AddVWext/3-4 76.12n ± 0% 54.87n ± 0% -27.92% (p=0.000 n=10)
AddVWext/4-4 85.47n ± 0% 56.14n ± 0% -34.32% (p=0.000 n=10)
AddVWext/5-4 96.15n ± 0% 62.82n ± 0% -34.66% (p=0.000 n=10)
AddVWext/10-4 149.60n ± 0% 89.55n ± 0% -40.14% (p=0.000 n=10)
AddVWext/100-4 1115.0n ± 0% 549.3n ± 0% -50.74% (p=0.000 n=10)
AddVWext/1000-4 10.732µ ± 0% 5.060µ ± 0% -52.85% (p=0.000 n=10)
AddVWext/10000-4 150.5µ ± 0% 103.7µ ± 0% -31.10% (p=0.000 n=10)
AddVWext/100000-4 1.530m ± 0% 1.049m ± 0% -31.41% (p=0.000 n=10)
geomean 1.003µ 633.9n -36.79%
│ addvw.1 │ addvw.2 │
│ B/s │ B/s vs base │
AddVW/1-4 132.8Mi ± 0% 184.1Mi ± 0% +38.55% (p=0.000 n=10)
AddVW/2-4 220.1Mi ± 0% 316.9Mi ± 0% +43.96% (p=0.000 n=10)
AddVW/3-4 300.7Mi ± 0% 416.4Mi ± 0% +38.48% (p=0.000 n=10)
AddVW/4-4 357.1Mi ± 0% 543.6Mi ± 0% +52.25% (p=0.000 n=10)
AddVW/5-4 396.7Mi ± 0% 607.2Mi ± 0% +53.06% (p=0.000 n=10)
AddVW/10-4 510.1Mi ± 0% 852.0Mi ± 0% +67.02% (p=0.000 n=10)
AddVW/100-4 684.1Mi ± 0% 1389.0Mi ± 0% +103.03% (p=0.000 n=10)
AddVW/1000-4 710.9Mi ± 0% 1507.8Mi ± 0% +112.08% (p=0.000 n=10)
AddVW/10000-4 503.1Mi ± 0% 735.8Mi ± 0% +46.26% (p=0.000 n=10)
AddVW/100000-4 501.0Mi ± 0% 726.5Mi ± 0% +45.00% (p=0.000 n=10)
AddVWext/1-4 132.9Mi ± 0% 184.1Mi ± 0% +38.55% (p=0.000 n=10)
AddVWext/2-4 220.1Mi ± 0% 316.9Mi ± 0% +43.98% (p=0.000 n=10)
AddVWext/3-4 300.7Mi ± 0% 417.1Mi ± 0% +38.73% (p=0.000 n=10)
AddVWext/4-4 357.1Mi ± 0% 543.6Mi ± 0% +52.25% (p=0.000 n=10)
AddVWext/5-4 396.7Mi ± 0% 607.2Mi ± 0% +53.05% (p=0.000 n=10)
AddVWext/10-4 510.1Mi ± 0% 852.0Mi ± 0% +67.02% (p=0.000 n=10)
AddVWext/100-4 684.2Mi ± 0% 1389.0Mi ± 0% +103.02% (p=0.000 n=10)
AddVWext/1000-4 710.9Mi ± 0% 1507.7Mi ± 0% +112.08% (p=0.000 n=10)
AddVWext/10000-4 506.9Mi ± 0% 735.8Mi ± 0% +45.15% (p=0.000 n=10)
AddVWext/100000-4 498.6Mi ± 0% 727.0Mi ± 0% +45.79% (p=0.000 n=10)
geomean 388.3Mi 614.3Mi +58.19%
Change-Id: Ib14a4b8c1d81e710753bbf6dd5546bbca44fe3f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/595397
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Implement the encoding.(Binary|Text)Appender interfaces for "x509.OID".
Implement the encoding.BinaryAppender interface for "rand/v2.PCG" and "rand/v2.ChaCha8".
"rand/v2.ChaCha8.MarshalBinary" alse gains some performance benefits:
│ old │ new │
│ sec/op │ sec/op vs base │
ChaCha8MarshalBinary-8 33.730n ± 2% 9.786n ± 1% -70.99% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8 99.86n ± 1% 17.79n ± 0% -82.18% (p=0.000 n=10)
geomean 58.04n 13.19n -77.27%
│ old │ new │
│ B/op │ B/op vs base │
ChaCha8MarshalBinary-8 48.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8 83.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10)
│ old │ new │
│ allocs/op │ allocs/op vs base │
ChaCha8MarshalBinary-8 1.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)
For #62384
Change-Id: I604bde6dad90a916012909c7260f4bb06dcf5c0a
GitHub-Last-Rev: 78abf9c5dfb74838985637798bcd5cb957541d20
GitHub-Pull-Request: golang/go#68987
Reviewed-on: https://go-review.googlesource.com/c/go/+/607079
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Fix typos in ~30 files
Change-Id: Ie433aea01e7d15944c1e9e103691784876d5c1f9
GitHub-Last-Rev: bbaeb3d1f88a5fa6bbb69607b1bd075f496a7894
GitHub-Pull-Request: golang/go#68964
Reviewed-on: https://go-review.googlesource.com/c/go/+/606955
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Makes calls to the global Seed a no-op. The GODEBUG=randseednop=0
setting can be used to revert this behavior.
Fixes #67273
Change-Id: I79c1b2b23f3bc472fbd6190cb916a9d7583250f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/606055
Auto-Submit: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #62384
Change-Id: I1557704c6a0f9c6f3b9aad001374dd5cdbc99065
GitHub-Last-Rev: c258d18ccedab5feeb481a2431d5647bde7e5c58
GitHub-Pull-Request: golang/go#68893
Reviewed-on: https://go-review.googlesource.com/c/go/+/605758
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
|
|
This provides an assembly implementation of subVV for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ subvv.1 │ subvv.2 │
│ sec/op │ sec/op vs base │
SubVV/1-4 73.46n ± 0% 48.08n ± 0% -34.55% (p=0.000 n=10)
SubVV/2-4 88.13n ± 0% 58.76n ± 0% -33.33% (p=0.000 n=10)
SubVV/3-4 102.80n ± 0% 69.45n ± 0% -32.44% (p=0.000 n=10)
SubVV/4-4 117.50n ± 0% 72.11n ± 0% -38.63% (p=0.000 n=10)
SubVV/5-4 132.20n ± 0% 82.80n ± 0% -37.37% (p=0.000 n=10)
SubVV/10-4 216.3n ± 0% 126.9n ± 0% -41.33% (p=0.000 n=10)
SubVV/100-4 1659.0n ± 0% 886.5n ± 0% -46.56% (p=0.000 n=10)
SubVV/1000-4 16.089µ ± 0% 8.401µ ± 0% -47.78% (p=0.000 n=10)
SubVV/10000-4 244.7µ ± 0% 176.8µ ± 0% -27.74% (p=0.000 n=10)
SubVV/100000-4 2.562m ± 0% 1.871m ± 0% -26.96% (p=0.000 n=10)
geomean 1.436µ 904.4n -37.04%
│ subvv.1 │ subvv.2 │
│ B/s │ B/s vs base │
SubVV/1-4 830.9Mi ± 0% 1269.5Mi ± 0% +52.79% (p=0.000 n=10)
SubVV/2-4 1.353Gi ± 0% 2.029Gi ± 0% +49.99% (p=0.000 n=10)
SubVV/3-4 1.739Gi ± 0% 2.575Gi ± 0% +48.06% (p=0.000 n=10)
SubVV/4-4 2.029Gi ± 0% 3.306Gi ± 0% +62.96% (p=0.000 n=10)
SubVV/5-4 2.254Gi ± 0% 3.600Gi ± 0% +59.67% (p=0.000 n=10)
SubVV/10-4 2.755Gi ± 0% 4.699Gi ± 0% +70.53% (p=0.000 n=10)
SubVV/100-4 3.594Gi ± 0% 6.723Gi ± 0% +87.08% (p=0.000 n=10)
SubVV/1000-4 3.705Gi ± 0% 7.095Gi ± 0% +91.52% (p=0.000 n=10)
SubVV/10000-4 2.436Gi ± 0% 3.372Gi ± 0% +38.39% (p=0.000 n=10)
SubVV/100000-4 2.327Gi ± 0% 3.185Gi ± 0% +36.91% (p=0.000 n=10)
geomean 2.118Gi 3.364Gi +58.84%
Change-Id: I361cb3f4195b27a9f1e9486c9e1fdbeaa94d32b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/595396
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Make math.{Min,Max} intrinsics and implement math.{archMax,archMin}
in hardware.
goos: linux
goarch: loong64
pkg: math
cpu: Loongson-3A6000 @ 2500.00MHz
│ old.bench │ new.bench │
│ sec/op │ sec/op vs base │
Max 7.606n ± 0% 3.087n ± 0% -59.41% (p=0.000 n=20)
Min 7.205n ± 0% 2.904n ± 0% -59.69% (p=0.000 n=20)
MinFloat 37.220n ± 0% 4.802n ± 0% -87.10% (p=0.000 n=20)
MaxFloat 33.620n ± 0% 4.802n ± 0% -85.72% (p=0.000 n=20)
geomean 16.18n 3.792n -76.57%
goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A5000 @ 2500.00MHz
│ old.bench │ new.bench │
│ sec/op │ sec/op vs base │
Max 10.010n ± 0% 7.196n ± 0% -28.11% (p=0.000 n=20)
Min 8.806n ± 0% 7.155n ± 0% -18.75% (p=0.000 n=20)
MinFloat 60.010n ± 0% 7.976n ± 0% -86.71% (p=0.000 n=20)
MaxFloat 56.410n ± 0% 7.980n ± 0% -85.85% (p=0.000 n=20)
geomean 23.37n 7.566n -67.63%
Updates #59120.
Change-Id: I6815d20bc304af3cbf5d6ca8fe0ca1c2ddebea2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/580283
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This provides an assembly implementation of addVV for riscv64,
processing up to four words per loop, resulting in a significant
performance gain.
On a StarFive VisionFive 2:
│ addvv.1 │ addvv.2 │
│ sec/op │ sec/op vs base │
AddVV/1-4 73.45n ± 0% 48.08n ± 0% -34.54% (p=0.000 n=10)
AddVV/2-4 88.14n ± 0% 58.76n ± 0% -33.33% (p=0.000 n=10)
AddVV/3-4 102.80n ± 0% 69.44n ± 0% -32.45% (p=0.000 n=10)
AddVV/4-4 117.50n ± 0% 72.18n ± 0% -38.57% (p=0.000 n=10)
AddVV/5-4 132.20n ± 0% 82.79n ± 0% -37.38% (p=0.000 n=10)
AddVV/10-4 216.3n ± 0% 126.8n ± 0% -41.35% (p=0.000 n=10)
AddVV/100-4 1659.0n ± 0% 885.2n ± 0% -46.64% (p=0.000 n=10)
AddVV/1000-4 16.089µ ± 0% 8.400µ ± 0% -47.79% (p=0.000 n=10)
AddVV/10000-4 245.3µ ± 0% 176.9µ ± 0% -27.88% (p=0.000 n=10)
AddVV/100000-4 2.537m ± 0% 1.873m ± 0% -26.17% (p=0.000 n=10)
geomean 1.435µ 904.5n -36.99%
│ addvv.1 │ addvv.2 │
│ B/s │ B/s vs base │
AddVV/1-4 830.9Mi ± 0% 1269.5Mi ± 0% +52.78% (p=0.000 n=10)
AddVV/2-4 1.353Gi ± 0% 2.029Gi ± 0% +50.00% (p=0.000 n=10)
AddVV/3-4 1.739Gi ± 0% 2.575Gi ± 0% +48.09% (p=0.000 n=10)
AddVV/4-4 2.029Gi ± 0% 3.303Gi ± 0% +62.82% (p=0.000 n=10)
AddVV/5-4 2.254Gi ± 0% 3.600Gi ± 0% +59.69% (p=0.000 n=10)
AddVV/10-4 2.755Gi ± 0% 4.699Gi ± 0% +70.54% (p=0.000 n=10)
AddVV/100-4 3.594Gi ± 0% 6.734Gi ± 0% +87.37% (p=0.000 n=10)
AddVV/1000-4 3.705Gi ± 0% 7.096Gi ± 0% +91.54% (p=0.000 n=10)
AddVV/10000-4 2.430Gi ± 0% 3.369Gi ± 0% +38.65% (p=0.000 n=10)
AddVV/100000-4 2.350Gi ± 0% 3.183Gi ± 0% +35.44% (p=0.000 n=10)
geomean 2.119Gi 3.364Gi +58.71%
Change-Id: I727b3d9f8ab01eada7270046480b1430d56d0a96
Reviewed-on: https://go-review.googlesource.com/c/go/+/595395
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Comment in line 395:
[x₀ < S, so S - x₀ < 0; drop it]
Should be:
[x₀ < S, so S - x₀ > 0; drop it]
The proof is based on S - x₀ > 0, thus it's a typo of comment.
Fixes #68466
Change-Id: I68bb7cb909ba2bfe02a8873f74b57edc6679b72a
GitHub-Last-Rev: 40a2fc80cf22e97e0f535454a9b87b31b2e51421
GitHub-Pull-Request: golang/go#68487
Reviewed-on: https://go-review.googlesource.com/c/go/+/598855
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
This looks way better than the code formatting.
Similar to CL 597656.
Change-Id: I2c8809c1d6f8a8387941567213880662ff649a73
Reviewed-on: https://go-review.googlesource.com/c/go/+/597659
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I3541859bbf3ac4f9317b82a66d21be3d5c4c5a84
Reviewed-on: https://go-review.googlesource.com/c/go/+/597658
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
Fixes #68322
This reverts commit ad377e906a8ee6f27545d83de280206dacec1e58.
Change-Id: Ifa4811e2c679d789cc830dbff5e50301410e24d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/596516
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Commit-Queue: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
Fixes #66358.
Change-Id: Ic9bde88eabfb2a446d32e1dc5ac404a51ef49f11
Reviewed-on: https://go-review.googlesource.com/c/go/+/590635
Auto-Submit: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
Add linknames for most modules with ≥50 dependents.
Add linknames for a few other modules that we know
are important but are below 50.
Remove linknames from badlinkname.go that do not merit
inclusion (very small number of dependents).
We can add them back later if the need arises.
Fixes #67401. (For now.)
Change-Id: I1e49fec0292265256044d64b1841d366c4106002
Reviewed-on: https://go-review.googlesource.com/c/go/+/587756
Auto-Submit: Russ Cox <rsc@golang.org>
TryBot-Bypass: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
In all cases the intent was not to interpret s as a format string.
In one case (go/types), this was a latent bug in production.
(These were uncovered by a new check in vet's printf analyzer.)
Updates #60529
Change-Id: I3e17af7e589be9aec1580783a1b1011c52ec494b
Reviewed-on: https://go-review.googlesource.com/c/go/+/587855
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
For #67401.
Change-Id: Ifea84af92017b405466937f50fb8f28e6893c8cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/587220
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
|
|
Doing this because the slices functions are slightly faster and
slightly easier to use. It also removes one dependency layer.
This CL does not change packages that are used during bootstrap,
as the bootstrap compiler does not have the required slices functions.
It does not change the go/scanner package because the ErrorList
Len, Swap, and Less methods are part of the Go 1 API.
Change-Id: If52899be791c829198e11d2408727720b91ebe8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/587655
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
|
|
Fixes #67059
Closes #67452
Closes #67498
Change-Id: I84eba2ed787a17e9d6aaad2a8a78596e3944909a
Reviewed-on: https://go-review.googlesource.com/c/go/+/587280
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Cleanup all remaining trivial compares against $0 in ppc64x assembly.
In math, SRD ...,Rx; CMP Rx, $0 is further simplified to SRDCC.
Change-Id: Ia2bc204953e32f08ee142bfd06a91965f30f99b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/587016
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Just a cleanup.
Change-Id: Ibeb2c7d447c793086280e612fe5f0f7eeb863f71
Reviewed-on: https://go-review.googlesource.com/c/go/+/582875
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
Replacing Branch Conditional (BC) with its extended mnemonic form of BDNZ and BDZ.
- BC 16, 0, target can be replaced by BDNZ target
- BC 18, 0, target can be replaced by BDZ target
Change-Id: I1259e207f2a40d0b72780d5421f7449ddc006dc5
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/585077
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
CL 585358 adds restrictions to disallow pull-only linknames
(currently off by default). Currently, there are quite some pull-
only linknames in user code in the wild. In order not to break
those, we add push linknames to allow them to be pulled. This CL
includes linknames found in a large code corpus (thanks Matthew
Dempsky and Michael Pratt for the analysis!), that are not
currently linknamed.
Updates #67401.
Change-Id: I32f5fc0c7a6abbd7a11359a025cfa2bf458fe767
Reviewed-on: https://go-review.googlesource.com/c/go/+/586137
Reviewed-by: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I6d0050319c66fb62c817206e646e1a9449dc444c
Reviewed-on: https://go-review.googlesource.com/c/go/+/585715
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
|
|
Change-Id: Id07f16d14133ee539bc2880b39641c42418fa6e2
GitHub-Last-Rev: 7b327d508f677f2476d24f046d25921f4599dd9a
GitHub-Pull-Request: golang/go#67319
Reviewed-on: https://go-review.googlesource.com/c/go/+/585016
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Uint was part of the approved proposal but was inadvertently left
out of Go 1.22. Add for Go 1.23.
Change-Id: Ifaf24447bd70c8524c2fd299eefdf4aa29e49e66
Reviewed-on: https://go-review.googlesource.com/c/go/+/583455
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
Improve the use of addze to avoid unnecessary register
moves on ppc64x.
goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
│ old.out │ new.out │
│ sec/op │ sec/op vs base │
MulAddVWW/1 4.524n ± 3% 4.248n ± 0% -6.10% (p=0.002 n=6)
MulAddVWW/2 5.634n ± 0% 5.283n ± 0% -6.24% (p=0.002 n=6)
MulAddVWW/3 6.406n ± 0% 5.918n ± 0% -7.63% (p=0.002 n=6)
MulAddVWW/4 6.484n ± 0% 5.859n ± 0% -9.64% (p=0.002 n=6)
MulAddVWW/5 7.363n ± 0% 6.766n ± 0% -8.11% (p=0.002 n=6)
MulAddVWW/10 10.920n ± 0% 9.856n ± 0% -9.75% (p=0.002 n=6)
MulAddVWW/100 83.46n ± 0% 66.95n ± 0% -19.78% (p=0.002 n=6)
MulAddVWW/1000 856.0n ± 0% 681.6n ± 0% -20.38% (p=0.002 n=6)
MulAddVWW/10000 8.589µ ± 1% 6.774µ ± 0% -21.14% (p=0.002 n=6)
MulAddVWW/100000 86.22µ ± 0% 67.71µ ± 43% -21.48% (p=0.065 n=6)
geomean 73.34n 63.62n -13.26%
Change-Id: I95d6ac49ff6b64aa678e6896f57af9d85c923aad
Reviewed-on: https://go-review.googlesource.com/c/go/+/579235
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
There is no hyphen between the organization and the number.
For example, https://standards.ieee.org/ieee/754/6210/
shows the string "IEEE 754-2019" and not "IEEE-754-2019".
This assists in searching for "IEEE 754" in documentation
and not missing those using "IEEE-754".
Change-Id: I9a50ede807984ff1e2f17390bc1039f6a5d162e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/575438
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Change-Id: I07c3a498ce1e462c3d1703d77e7d7824e9334651
GitHub-Last-Rev: 2ba8c4c705eaeb0772109ece7296978b62467eb3
GitHub-Pull-Request: golang/go#66312
Reviewed-on: https://go-review.googlesource.com/c/go/+/571636
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
According to the https://go.dev/wiki/CodeReviewComments#receiver-names
Change-Id: Ib8bc57cf6a680e5c75d7346b74e77847945f6939
Reviewed-on: https://go-review.googlesource.com/c/go/+/568635
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
goos: linux
goarch: riscv64
pkg: math
│ floor_old.bench │ floor_new.bench │
│ sec/op │ sec/op vs base │
Ceil 54.12n ± 0% 22.05n ± 0% -59.26% (p=0.000 n=10)
Floor 40.80n ± 0% 22.05n ± 0% -45.96% (p=0.000 n=10)
Round 20.73n ± 0% 20.74n ± 0% ~ (p=0.441 n=10)
RoundToEven 24.07n ± 0% 24.07n ± 0% ~ (p=1.000 n=10)
Trunc 38.73n ± 0% 22.05n ± 0% -43.07% (p=0.000 n=10)
geomean 33.58n 22.17n -33.98%
Change-Id: I24fb9e3bbf8146da253b6791b21377bea1afbd16
Reviewed-on: https://go-review.googlesource.com/c/go/+/504737
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
|
|
It's easier to go look at its documentation when there's a link.
Change-Id: Iad6c1aa1a3f4b9127dc526b4db473239329780d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/563255
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
This updates the assembly implementation of AddMulVVW to
unroll the main loop to do 64 bytes at a time.
The code for addMulVVWx is based on the same code and has
also been updated to improve performance.
goos: linux
goarch: ppc64le
pkg: crypto/internal/bigmod
cpu: POWER10
│ bg.orig.out │ bg.out │
│ sec/op │ sec/op vs base │
ModAdd 116.3n ± 0% 116.9n ± 0% +0.52% (p=0.002 n=6)
ModSub 111.5n ± 0% 111.5n ± 0% 0.00% (p=0.273 n=6)
MontgomeryRepr 2.195µ ± 0% 1.944µ ± 0% -11.44% (p=0.002 n=6)
MontgomeryMul 2.195µ ± 0% 1.943µ ± 0% -11.48% (p=0.002 n=6)
ModMul 4.418µ ± 0% 3.900µ ± 0% -11.72% (p=0.002 n=6)
ExpBig 5.736m ± 0% 5.117m ± 0% -10.78% (p=0.002 n=6)
Exp 5.891m ± 0% 5.237m ± 0% -11.11% (p=0.002 n=6)
geomean 9.901µ 9.094µ -8.15%
goos: linux
goarch: ppc64le
pkg: math/big
cpu: POWER10
│ am.orig.out │ am.out │
│ sec/op │ sec/op vs base │
AddMulVVW/1 4.456n ± 1% 3.565n ± 0% -20.00% (p=0.002 n=6)
AddMulVVW/2 4.875n ± 1% 5.938n ± 1% +21.79% (p=0.002 n=6)
AddMulVVW/3 5.484n ± 0% 5.693n ± 0% +3.80% (p=0.002 n=6)
AddMulVVW/4 6.370n ± 0% 6.065n ± 0% -4.79% (p=0.002 n=6)
AddMulVVW/5 7.321n ± 0% 7.188n ± 0% -1.82% (p=0.002 n=6)
AddMulVVW/10 12.26n ± 8% 11.41n ± 0% -6.97% (p=0.002 n=6)
AddMulVVW/100 100.70n ± 0% 93.58n ± 0% -7.08% (p=0.002 n=6)
AddMulVVW/1000 938.6n ± 0% 845.5n ± 0% -9.92% (p=0.002 n=6)
AddMulVVW/10000 9.459µ ± 0% 8.415µ ± 0% -11.04% (p=0.002 n=6)
AddMulVVW/100000 94.57µ ± 0% 84.01µ ± 0% -11.16% (p=0.002 n=6)
geomean 75.17n 71.21n -5.27%
Change-Id: Idd79f5f02387564f4c2cc28d50b1c12bcd9a400f
Reviewed-on: https://go-review.googlesource.com/c/go/+/557915
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
|
|
Compute median as a + (b-a)/2 instead of (a + b)/2.
Add additional test cases.
Fixes #65025.
Change-Id: Ib716a1036c17f8f33f51e33cedab13512eb7e0be
Reviewed-on: https://go-review.googlesource.com/c/go/+/554617
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Robert Griesemer <gri@google.com>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This commit is aimed at improving the readability and consistency
of the code base. Extraneous newline characters were present after
some return statements, creating unnecessary separation in the code.
Fixes #64610
Change-Id: Ic1b05bf11761c4dff22691c2f1c3755f66d341f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/548316
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This is a replay of CL 516859, after its rollback in CL 543895,
with big-endian systems fixed and the tests disabled on RISC-V
since the compiler is broken there (#64285).
ChaCha8 provides a cryptographically strong generator
alongside PCG, so that people who want stronger randomness
have access to that. On systems with 128-bit vector math
assembly (amd64 and arm64), ChaCha8 runs at about the same
speed as PCG (25% slower on amd64, 2% faster on arm64).
Fixes #64284.
Change-Id: I6290bb8ace28e1aff9a61f805dbe380ccdf25b94
Reviewed-on: https://go-review.googlesource.com/c/go/+/546020
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This reverts commit 6382893890b82bbc0439eade5b132d9d1b3fb3a7.
Reason for revert: Causes failures on big endian platforms and riscv64.
Possibly a bug in the generic implementation.
For #64284.
For #64285.
Change-Id: Ic1bb8533d9641fae28d0337b36d434b9a575cd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/543895
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
|
|
This change introduces new options to set the floating point
mode on ARM targets. The GOARM version number can optionally be
followed by ',hardfloat' or ',softfloat' to select whether to
use hardware instructions or software emulation for floating
point computations, respectively. For example,
GOARM=7,softfloat.
Previously, software floating point support was limited to
GOARM=5. With these options, software floating point is now
extended to all ARM versions, including GOARM=6 and 7. This
change also extends hardware floating point to GOARM=5.
GOARM=5 defaults to softfloat and GOARM=6 and 7 default to
hardfloat.
For #61588
Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94
Reviewed-on: https://go-review.googlesource.com/c/go/+/514907
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|