path: root/src/math/rand
Age | Commit message | Author
2025-10-17 | all: remove unnecessary loop variable copies in tests | Tobias Klauser
Copying the loop variable is no longer necessary since Go 1.22. Change-Id: Iebb21dac44a20ec200567f1d786f105a4ee4999d Reviewed-on: https://go-review.googlesource.com/c/go/+/711640 Reviewed-by: Florian Lehner <lehner.florian86@gmail.com> Auto-Submit: Damien Neil <dneil@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Damien Neil <dneil@google.com> Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-25 | all: surround -test.run arguments with ^$ | qmuntal
If the -test.run value is not surrounded by ^$ then any test that matches the -test.run value will be run. This is normally not the desired behavior, as it can lead to unexpected tests being run. Change-Id: I3447aaebad5156bbef7f263cdb9f6b8c32331324 Reviewed-on: https://go-review.googlesource.com/c/go/+/651956 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
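The matching behavior described in this message can be illustrated with a small sketch. It assumes, as a simplification, that the -test.run value is applied to each test name as an unanchored regular expression; the test names and the helper `selected` are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// selected returns the names matched by pattern. It is a simplified,
// hypothetical stand-in for how an unanchored pattern selects tests.
func selected(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, name := range names {
		if re.MatchString(name) {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	names := []string{"TestSeed", "TestSeedNop", "TestShuffle"}
	fmt.Println(selected("TestSeed", names))   // unanchored: selects two tests
	fmt.Println(selected("^TestSeed$", names)) // anchored: selects exactly one
}
```

Anchoring with ^ and $ turns a substring match into a whole-name match, which is why the anchored pattern no longer picks up TestSeedNop.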
2024-12-04 | math/rand/v2: replace <= 0 with == 0 for Uint function docs | Jorropo
This harmonizes the docs with the (*Rand).Uint* methods. It also makes the behavior clearer: I wasn't sure whether it would somehow try to interpret the uint as a signed number. It does not pull any surprises, so make that clear. Change-Id: I5a87a0a5563dbabfc31e536e40ee69b11f5cb6cf Reviewed-on: https://go-review.googlesource.com/c/go/+/633535 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
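A minimal sketch of the documented condition, assuming Go 1.22 or later (where math/rand/v2 and rand.UintN are available). Because the argument is unsigned, the only invalid bound is zero; the helper name `uintNPanics` is hypothetical.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// uintNPanics reports whether rand.UintN(n) panics for the given bound.
func uintNPanics(n uint) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	rand.UintN(n)
	return false
}

func main() {
	fmt.Println(uintNPanics(0))  // zero is the only invalid bound
	fmt.Println(uintNPanics(10)) // any non-zero bound is accepted
}
```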
2024-11-20 | internal/byteorder: use canonical Go casing in names | Russ Cox
If Be and Le stand for big-endian and little-endian, then they should be BE and LE. Change-Id: I723e3962b8918da84791783d3c547638f1c9e8a9 Reviewed-on: https://go-review.googlesource.com/c/go/+/627376 Reviewed-by: Robert Griesemer <gri@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-03 | math,os,os/*: use testenv.Executable | Kir Kolyshkin
As some callers don't have a testing context, modify testenv.Executable to accept nil (similar to how testenv.GOROOT works). Change-Id: I39112a7869933785a26b5cb6520055b3cc42b847 Reviewed-on: https://go-review.googlesource.com/c/go/+/609835 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-08-21 | crypto/x509,math/rand/v2: implement the encoding.(Binary|Text)Appender | apocelipes
Implement the encoding.(Binary|Text)Appender interfaces for "x509.OID".

Implement the encoding.BinaryAppender interface for "rand/v2.PCG" and "rand/v2.ChaCha8".

"rand/v2.ChaCha8.MarshalBinary" also gains some performance benefits:

                             │ old           │ new                                  │
                             │ sec/op        │ sec/op        vs base                │
ChaCha8MarshalBinary-8         33.730n ± 2%    9.786n ± 1%   -70.99% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8     99.86n ± 1%     17.79n ± 0%   -82.18% (p=0.000 n=10)
geomean                        58.04n          13.19n        -77.27%

                             │ old           │ new                                  │
                             │ B/op          │ B/op          vs base                │
ChaCha8MarshalBinary-8         48.00 ± 0%      0.00 ± 0%     -100.00% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8     83.00 ± 0%      0.00 ± 0%     -100.00% (p=0.000 n=10)

                             │ old           │ new                                  │
                             │ allocs/op     │ allocs/op     vs base                │
ChaCha8MarshalBinary-8         1.000 ± 0%      0.000 ± 0%    -100.00% (p=0.000 n=10)
ChaCha8MarshalBinaryRead-8     2.000 ± 0%      0.000 ± 0%    -100.00% (p=0.000 n=10)

For #62384

Change-Id: I604bde6dad90a916012909c7260f4bb06dcf5c0a
GitHub-Last-Rev: 78abf9c5dfb74838985637798bcd5cb957541d20
GitHub-Pull-Request: golang/go#68987
Reviewed-on: https://go-review.googlesource.com/c/go/+/607079
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
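For context, a hedged sketch of what the binary encoding of a generator is used for: saving a PCG's state and resuming it elsewhere. It deliberately uses MarshalBinary and UnmarshalBinary, which predate this change, so that it also builds on older toolchains; the appender method added here is assumed to produce the same encoding appended to a caller-supplied slice. The helper name `resumeMatches` and the seed values are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// resumeMatches saves a PCG's state after a few draws, restores it into a
// fresh generator, and reports whether both then produce the same values.
func resumeMatches() (bool, error) {
	src := rand.NewPCG(1, 2) // arbitrary example seeds
	for i := 0; i < 5; i++ {
		src.Uint64() // advance the state before saving it
	}
	state, err := src.MarshalBinary()
	if err != nil {
		return false, err
	}
	restored := rand.NewPCG(0, 0) // seeds are overwritten by UnmarshalBinary
	if err := restored.UnmarshalBinary(state); err != nil {
		return false, err
	}
	for i := 0; i < 5; i++ {
		if src.Uint64() != restored.Uint64() {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	ok, err := resumeMatches()
	fmt.Println(ok, err)
}
```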
2024-08-19 | math/rand: make calls to Seed no-op | Paschalis T
Makes calls to the global Seed a no-op. The GODEBUG=randseednop=0 setting can be used to revert this behavior. Fixes #67273 Change-Id: I79c1b2b23f3bc472fbd6190cb916a9d7583250f4 Reviewed-on: https://go-review.googlesource.com/c/go/+/606055 Auto-Submit: Cherry Mui <cherryyz@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
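Code that relied on the global Seed for a reproducible sequence can use a private generator instead. A minimal sketch using the long-standing math/rand API; the helper `firstN` and the seed value 42 are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// firstN returns the first n values in [0, 100) from a private generator
// seeded with seed. It does not touch the global generator.
func firstN(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(100)
	}
	return out
}

func main() {
	a := firstN(42, 5)
	b := firstN(42, 5)
	fmt.Println(a)
	fmt.Println(b) // same seed, so the same sequence as a
}
```

Keeping the seeded generator local also avoids depending on the GODEBUG setting mentioned in the commit message.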
2024-05-23 | std: fix calls to Printf(s) with non-constant s | Alan Donovan
In all cases the intent was not to interpret s as a format string. In one case (go/types), this was a latent bug in production. (These were uncovered by a new check in vet's printf analyzer.) Updates #60529 Change-Id: I3e17af7e589be9aec1580783a1b1011c52ec494b Reviewed-on: https://go-review.googlesource.com/c/go/+/587855 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Russ Cox <rsc@golang.org>
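The class of fix described here can be sketched as follows. This is an illustration, not code from the change itself: the helper `describe` and the sample message are hypothetical. The point is that a string meant as data should be passed as an operand, not as the format.

```go
package main

import "fmt"

// describe formats s as data, never as a format string, so percent signs
// in s are emitted verbatim.
func describe(s string) string {
	return fmt.Sprintf("%s", s)
}

func main() {
	s := "progress: 100%d" // hypothetical message containing a verb-like sequence
	// Passing s as the format itself would make fmt treat "%d" as a verb
	// with a missing operand.
	fmt.Println(describe(s))
	fmt.Println(fmt.Sprint(s)) // equivalent when no formatting is needed
}
```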
2024-05-22 | math/rand/v2: add ChaCha8.Read | Filippo Valsorda
Fixes #67059 Closes #67452 Closes #67498 Change-Id: I84eba2ed787a17e9d6aaad2a8a78596e3944909a Reviewed-on: https://go-review.googlesource.com/c/go/+/587280 Reviewed-by: Roland Shoemaker <roland@golang.org> Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-22 | math/rand/v2: drop pointer receiver on zero-width type | Brad Fitzpatrick
Just a cleanup. Change-Id: Ibeb2c7d447c793086280e612fe5f0f7eeb863f71 Reviewed-on: https://go-review.googlesource.com/c/go/+/582875 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Damien Neil <dneil@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2024-05-16 | math/rand/v2: use max builtin in tests | Tobias Klauser
Change-Id: I6d0050319c66fb62c817206e646e1a9449dc444c Reviewed-on: https://go-review.googlesource.com/c/go/+/585715 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Robert Griesemer <gri@google.com>
2024-05-13 | math/rand/v2, math/big: use internal/byteorder | Mateusz Poliwczak
Change-Id: Id07f16d14133ee539bc2880b39641c42418fa6e2 GitHub-Last-Rev: 7b327d508f677f2476d24f046d25921f4599dd9a GitHub-Pull-Request: golang/go#67319 Reviewed-on: https://go-review.googlesource.com/c/go/+/585016 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-05-07 | math/rand/v2: add Uint | Russ Cox
Uint was part of the approved proposal but was inadvertently left out of Go 1.22. Add for Go 1.23. Change-Id: Ifaf24447bd70c8524c2fd299eefdf4aa29e49e66 Reviewed-on: https://go-review.googlesource.com/c/go/+/583455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com>
2024-03-04 | math/rand, math/rand/v2: rename receiver variables | Oleksandr Redko
Rename the receiver variables according to https://go.dev/wiki/CodeReviewComments#receiver-names. Change-Id: Ib8bc57cf6a680e5c75d7346b74e77847945f6939 Reviewed-on: https://go-review.googlesource.com/c/go/+/568635 Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-02-19 | math/rand/v2: use a doc link for crypto/rand | Daniel Martí
It's easier to go look at its documentation when there's a link. Change-Id: Iad6c1aa1a3f4b9127dc526b4db473239329780d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/563255 Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com>
2023-12-05 | math/rand, math/rand/v2: use ChaCha8 for global rand | Russ Cox
Move ChaCha8 code into internal/chacha8rand and use it to implement runtime.rand, which is used for the unseeded global source for both math/rand and math/rand/v2. This also affects the calculation of the start point for iteration over very very large maps (when the 32-bit fastrand is not big enough).

The benefit is that misuse of the global random number generators in math/rand and math/rand/v2 in contexts where non-predictable randomness is important for security reasons is no longer a security problem, removing a common mistake among programmers who are unaware of the different kinds of randomness.

The cost is an extra 304 bytes per thread stored in the m struct plus 2-3ns more per random uint64 due to the more sophisticated algorithm. Using PCG looks like it would cost about the same, although I haven't benchmarked that.

Before this, the math/rand and math/rand/v2 global generator was wyrand (https://github.com/wangyi-fudan/wyhash). For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson ALFG was justifiable, since the latter was not any better. But for math/rand/v2, the global generator really should be at least as good as one of the well-studied, specific algorithms provided directly by the package, and it's not.

(Wyrand is still reasonable for scheduling and cache decisions.)

Good randomness does have a cost: about twice wyrand.

Also rationalize the various runtime rand references.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
                        │ sec/op           │ sec/op   vs base │
ChaCha8-32                1.862n ± 2%    1.861n ± 2%          ~ (p=0.825 n=20)
PCG_DXSM-32               1.471n ± 1%    1.460n ± 2%          ~ (p=0.153 n=20)
SourceUint64-32           1.636n ± 2%    1.582n ± 1%     -3.30% (p=0.000 n=20)
GlobalInt64-32            2.087n ± 1%    3.663n ± 1%    +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32    0.1042n ± 1%   0.2026n ± 1%   +94.48% (p=0.000 n=20)
GlobalUint64-32           2.263n ± 2%    3.724n ± 1%    +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32   0.1019n ± 1%   0.1973n ± 1%   +93.67% (p=0.000 n=20)
Int64-32                  1.771n ± 1%    1.774n ± 1%          ~ (p=0.449 n=20)
Uint64-32                 1.863n ± 2%    1.866n ± 1%          ~ (p=0.364 n=20)
GlobalIntN1000-32         3.134n ± 3%    4.730n ± 2%    +50.95% (p=0.000 n=20)
IntN1000-32               2.489n ± 1%    2.489n ± 1%          ~ (p=0.683 n=20)
Int64N1000-32             2.521n ± 1%    2.516n ± 1%          ~ (p=0.394 n=20)
Int64N1e8-32              2.479n ± 1%    2.478n ± 2%          ~ (p=0.743 n=20)
Int64N1e9-32              2.530n ± 2%    2.514n ± 2%          ~ (p=0.193 n=20)
Int64N2e9-32              2.501n ± 1%    2.494n ± 1%          ~ (p=0.616 n=20)
Int64N1e18-32             3.227n ± 1%    3.205n ± 1%          ~ (p=0.101 n=20)
Int64N2e18-32             3.647n ± 1%    3.599n ± 1%          ~ (p=0.019 n=20)
Int64N4e18-32             5.135n ± 1%    5.069n ± 2%          ~ (p=0.034 n=20)
Int32N1000-32             2.657n ± 1%    2.637n ± 1%          ~ (p=0.180 n=20)
Int32N1e8-32              2.636n ± 1%    2.636n ± 1%          ~ (p=0.763 n=20)
Int32N1e9-32              2.660n ± 2%    2.638n ± 1%          ~ (p=0.358 n=20)
Int32N2e9-32              2.662n ± 2%    2.618n ± 2%          ~ (p=0.064 n=20)
Float32-32                2.272n ± 2%    2.239n ± 2%          ~ (p=0.194 n=20)
Float64-32                2.272n ± 1%    2.286n ± 2%          ~ (p=0.763 n=20)
ExpFloat64-32             3.762n ± 1%    3.744n ± 1%          ~ (p=0.171 n=20)
NormFloat64-32            3.706n ± 1%    3.655n ± 2%          ~ (p=0.066 n=20)
Perm3-32                  32.93n ± 3%    34.62n ± 1%     +5.13% (p=0.000 n=20)
Perm30-32                 202.9n ± 1%    204.0n ± 1%          ~ (p=0.482 n=20)
Perm30ViaShuffle-32       115.0n ± 1%    114.9n ± 1%          ~ (p=0.358 n=20)
ShuffleOverhead-32        112.8n ± 1%    112.7n ± 1%          ~ (p=0.692 n=20)
Concurrent-32             2.107n ± 0%    3.725n ± 1%    +76.75% (p=0.000 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                        │ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
                        │ sec/op           │ sec/op   vs base │
ChaCha8-8                 2.480n ± 0%    2.429n ± 0%     -2.04% (p=0.000 n=20)
PCG_DXSM-8                2.531n ± 0%    2.530n ± 0%          ~ (p=0.877 n=20)
SourceUint64-8            2.534n ± 0%    2.533n ± 0%          ~ (p=0.732 n=20)
GlobalInt64-8             2.172n ± 1%    4.794n ± 0%   +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8     0.4320n ± 0%   0.9605n ± 0%  +122.32% (p=0.000 n=20)
GlobalUint64-8            2.182n ± 0%    4.770n ± 0%   +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8    0.4307n ± 0%   0.9583n ± 0%  +122.51% (p=0.000 n=20)
Int64-8                   4.107n ± 0%    4.104n ± 0%          ~ (p=0.416 n=20)
Uint64-8                  4.080n ± 0%    4.080n ± 0%          ~ (p=0.052 n=20)
GlobalIntN1000-8          2.814n ± 2%    5.643n ± 0%   +100.50% (p=0.000 n=20)
IntN1000-8                4.141n ± 0%    4.139n ± 0%          ~ (p=0.140 n=20)
Int64N1000-8              4.140n ± 0%    4.140n ± 0%          ~ (p=0.313 n=20)
Int64N1e8-8               4.140n ± 0%    4.139n ± 0%          ~ (p=0.103 n=20)
Int64N1e9-8               4.139n ± 0%    4.140n ± 0%          ~ (p=0.761 n=20)
Int64N2e9-8               4.140n ± 0%    4.140n ± 0%          ~ (p=0.636 n=20)
Int64N1e18-8              5.266n ± 0%    5.326n ± 1%     +1.14% (p=0.001 n=20)
Int64N2e18-8              6.052n ± 0%    6.167n ± 0%     +1.90% (p=0.000 n=20)
Int64N4e18-8              8.826n ± 0%    9.051n ± 0%     +2.55% (p=0.000 n=20)
Int32N1000-8              4.127n ± 0%    4.132n ± 0%     +0.12% (p=0.000 n=20)
Int32N1e8-8               4.126n ± 0%    4.131n ± 0%     +0.12% (p=0.000 n=20)
Int32N1e9-8               4.127n ± 0%    4.132n ± 0%     +0.12% (p=0.000 n=20)
Int32N2e9-8               4.132n ± 0%    4.131n ± 0%          ~ (p=0.017 n=20)
Float32-8                 4.109n ± 0%    4.105n ± 0%          ~ (p=0.379 n=20)
Float64-8                 4.107n ± 0%    4.106n ± 0%          ~ (p=0.867 n=20)
ExpFloat64-8              5.339n ± 0%    5.383n ± 0%     +0.82% (p=0.000 n=20)
NormFloat64-8             5.735n ± 0%    5.737n ± 1%          ~ (p=0.856 n=20)
Perm3-8                   26.65n ± 0%    26.80n ± 1%     +0.58% (p=0.000 n=20)
Perm30-8                  194.8n ± 1%    197.0n ± 0%     +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8        156.6n ± 0%    157.6n ± 1%     +0.61% (p=0.000 n=20)
ShuffleOverhead-8         124.9n ± 0%    125.5n ± 0%     +0.52% (p=0.000 n=20)
Concurrent-8              2.434n ± 3%    5.066n ± 0%   +108.09% (p=0.000 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ bbb48afeb7.386 │ 5cf807d1ea.386 │
                        │ sec/op         │ sec/op vs base │
ChaCha8-32                11.295n ± 1%   4.748n ± 2%    -57.96% (p=0.000 n=20)
PCG_DXSM-32               7.693n ± 1%    7.738n ± 2%          ~ (p=0.542 n=20)
SourceUint64-32           7.658n ± 2%    7.622n ± 2%          ~ (p=0.344 n=20)
GlobalInt64-32            3.473n ± 2%    7.526n ± 2%   +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32    0.3198n ± 0%   0.5444n ± 0%   +70.22% (p=0.000 n=20)
GlobalUint64-32           3.612n ± 0%    7.575n ± 1%   +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32   0.3168n ± 0%   0.5403n ± 0%   +70.51% (p=0.000 n=20)
Int64-32                  7.673n ± 2%    7.789n ± 1%          ~ (p=0.122 n=20)
Uint64-32                 7.773n ± 1%    7.827n ± 2%          ~ (p=0.920 n=20)
GlobalIntN1000-32         6.268n ± 1%    9.581n ± 1%    +52.87% (p=0.000 n=20)
IntN1000-32               10.33n ± 2%    10.45n ± 1%          ~ (p=0.233 n=20)
Int64N1000-32             10.98n ± 2%    11.01n ± 1%          ~ (p=0.401 n=20)
Int64N1e8-32              11.19n ± 2%    10.97n ± 1%          ~ (p=0.033 n=20)
Int64N1e9-32              11.06n ± 1%    11.08n ± 1%          ~ (p=0.498 n=20)
Int64N2e9-32              11.10n ± 1%    11.01n ± 2%          ~ (p=0.995 n=20)
Int64N1e18-32             15.23n ± 2%    15.04n ± 1%          ~ (p=0.973 n=20)
Int64N2e18-32             15.89n ± 1%    15.85n ± 1%          ~ (p=0.409 n=20)
Int64N4e18-32             18.96n ± 2%    19.34n ± 2%          ~ (p=0.048 n=20)
Int32N1000-32             10.46n ± 2%    10.44n ± 2%          ~ (p=0.480 n=20)
Int32N1e8-32              10.46n ± 2%    10.49n ± 2%          ~ (p=0.951 n=20)
Int32N1e9-32              10.28n ± 2%    10.26n ± 1%          ~ (p=0.431 n=20)
Int32N2e9-32              10.50n ± 2%    10.44n ± 2%          ~ (p=0.249 n=20)
Float32-32                13.80n ± 2%    13.80n ± 2%          ~ (p=0.751 n=20)
Float64-32                23.55n ± 2%    23.87n ± 0%          ~ (p=0.408 n=20)
ExpFloat64-32             15.36n ± 1%    15.29n ± 2%          ~ (p=0.316 n=20)
NormFloat64-32            13.57n ± 1%    13.79n ± 1%     +1.66% (p=0.005 n=20)
Perm3-32                  45.70n ± 2%    46.99n ± 2%     +2.81% (p=0.001 n=20)
Perm30-32                 399.0n ± 1%    403.8n ± 1%     +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32       349.0n ± 1%    350.4n ± 1%          ~ (p=0.909 n=20)
ShuffleOverhead-32        322.3n ± 1%    323.8n ± 1%          ~ (p=0.410 n=20)
Concurrent-32             3.331n ± 1%    7.312n ± 1%   +119.50% (p=0.000 n=20)

For #61716.

Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
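The "vs base" figures in the benchmark tables of this commit read as a relative change from the old measurement to the new one. A minimal sketch of that arithmetic, assuming the delta is (new - old) / old; the function name `pctChange` is hypothetical, and the inputs are the rounded GlobalInt64-32 amd64 figures, so the result only approximates the reported +75.54%.

```go
package main

import "fmt"

// pctChange returns the relative change from base to cur, in percent.
func pctChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// GlobalInt64-32 on amd64: 2.087n before, 3.663n after.
	fmt.Printf("%+.1f%%\n", pctChange(2.087, 3.663)) // prints +75.5%
}
```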
2023-12-05 | math/rand/v2: add ChaCha8 | Russ Cox
This is a replay of CL 516859, after its rollback in CL 543895, with big-endian systems fixed and the tests disabled on RISC-V since the compiler is broken there (#64285). ChaCha8 provides a cryptographically strong generator alongside PCG, so that people who want stronger randomness have access to that. On systems with 128-bit vector math assembly (amd64 and arm64), ChaCha8 runs at about the same speed as PCG (25% slower on amd64, 2% faster on arm64). Fixes #64284. Change-Id: I6290bb8ace28e1aff9a61f805dbe380ccdf25b94 Reviewed-on: https://go-review.googlesource.com/c/go/+/546020 Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
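A minimal usage sketch of the two explicit generators mentioned here, assuming Go 1.22 or later. The helpers `chaCha8Draw` and `pcgDraw` and all seed values are hypothetical; a real ChaCha8 seed should come from a proper entropy source, not from a text literal.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// chaCha8Draw returns n values in [0, 100) from a ChaCha8 generator whose
// 32-byte seed is filled from seedText (for illustration only).
func chaCha8Draw(seedText string, n int) []int {
	var seed [32]byte
	copy(seed[:], seedText)
	r := rand.New(rand.NewChaCha8(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.IntN(100)
	}
	return out
}

// pcgDraw returns n values in [0, 100) from a PCG generator seeded with
// the two given words.
func pcgDraw(seed1, seed2 uint64, n int) []int {
	r := rand.New(rand.NewPCG(seed1, seed2))
	out := make([]int, n)
	for i := range out {
		out[i] = r.IntN(100)
	}
	return out
}

func main() {
	fmt.Println(chaCha8Draw("example seed", 5))
	fmt.Println(pcgDraw(1, 2, 5))
}
```

Both generators are deterministic for a given seed, so the choice between them is about the strength of the output, not reproducibility.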
2023-11-20 | Revert "math/rand/v2: add ChaCha8" | Michael Knyszek
This reverts commit 6382893890b82bbc0439eade5b132d9d1b3fb3a7. Reason for revert: Causes failures on big endian platforms and riscv64. Possibly a bug in the generic implementation. For #64284. For #64285. Change-Id: Ic1bb8533d9641fae28d0337b36d434b9a575cd7e Reviewed-on: https://go-review.googlesource.com/c/go/+/543895 Reviewed-by: Heschi Kreinick <heschi@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
2023-11-20 | all: add floating point option for ARM targets | Ludi Rehak
This change introduces new options to set the floating point mode on ARM targets. The GOARM version number can optionally be followed by ',hardfloat' or ',softfloat' to select whether to use hardware instructions or software emulation for floating point computations, respectively. For example, GOARM=7,softfloat. Previously, software floating point support was limited to GOARM=5. With these options, software floating point is now extended to all ARM versions, including GOARM=6 and 7. This change also extends hardware floating point to GOARM=5. GOARM=5 defaults to softfloat and GOARM=6 and 7 default to hardfloat. For #61588 Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94 Reviewed-on: https://go-review.googlesource.com/c/go/+/514907 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-11-19 | math/rand/v2: add ChaCha8 | Russ Cox
ChaCha8 provides a cryptographically strong generator alongside PCG, so that people who want stronger randomness have access to that. On systems with 128-bit vector math assembly (amd64 and arm64), ChaCha8 runs at about the same speed as PCG (25% slower on amd64, 2% faster on arm64).

Obviously all the claimed benchmark variation other than the new ChaCha8 benchmark is a lie.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ afa459a2f0.amd64 │ bbb48afeb7.amd64 │
                        │ sec/op           │ sec/op   vs base │
PCG_DXSM-32               1.488n ± 2%    1.492n ± 2%          ~ (p=0.309 n=20)
ChaCha8-32                               1.861n ± 2%
SourceUint64-32           1.450n ± 3%    1.590n ± 2%     +9.69% (p=0.000 n=20)
GlobalInt64-32            2.067n ± 2%    2.061n ± 1%          ~ (p=0.952 n=20)
GlobalInt64Parallel-32    0.1044n ± 2%   0.1041n ± 1%         ~ (p=0.498 n=20)
GlobalUint64-32           2.085n ± 0%    2.256n ± 2%     +8.23% (p=0.000 n=20)
GlobalUint64Parallel-32   0.1008n ± 1%   0.1018n ± 1%         ~ (p=0.041 n=20)
Int64-32                  1.779n ± 1%    1.779n ± 1%          ~ (p=0.410 n=20)
Uint64-32                 1.854n ± 2%    1.882n ± 1%          ~ (p=0.044 n=20)
GlobalIntN1000-32         3.140n ± 3%    3.115n ± 3%          ~ (p=0.673 n=20)
IntN1000-32               2.496n ± 1%    2.509n ± 1%          ~ (p=0.171 n=20)
Int64N1000-32             2.510n ± 2%    2.493n ± 1%          ~ (p=0.804 n=20)
Int64N1e8-32              2.471n ± 2%    2.521n ± 1%     +1.98% (p=0.003 n=20)
Int64N1e9-32              2.488n ± 2%    2.506n ± 1%          ~ (p=0.663 n=20)
Int64N2e9-32              2.478n ± 2%    2.482n ± 2%          ~ (p=0.533 n=20)
Int64N1e18-32             3.088n ± 1%    3.216n ± 1%     +4.15% (p=0.000 n=20)
Int64N2e18-32             3.493n ± 1%    3.635n ± 2%     +4.05% (p=0.000 n=20)
Int64N4e18-32             5.060n ± 2%    5.122n ± 1%     +1.22% (p=0.000 n=20)
Int32N1000-32             2.620n ± 1%    2.672n ± 1%     +2.00% (p=0.002 n=20)
Int32N1e8-32              2.652n ± 0%    2.646n ± 1%          ~ (p=0.743 n=20)
Int32N1e9-32              2.644n ± 1%    2.660n ± 2%          ~ (p=0.163 n=20)
Int32N2e9-32              2.619n ± 2%    2.652n ± 1%          ~ (p=0.132 n=20)
Float32-32                2.261n ± 1%    2.267n ± 1%          ~ (p=0.516 n=20)
Float64-32                2.241n ± 2%    2.276n ± 1%          ~ (p=0.080 n=20)
ExpFloat64-32             3.716n ± 1%    3.779n ± 1%     +1.68% (p=0.007 n=20)
NormFloat64-32            3.718n ± 1%    3.747n ± 1%          ~ (p=0.011 n=20)
Perm3-32                  34.11n ± 2%    34.23n ± 2%          ~ (p=0.779 n=20)
Perm30-32                 200.6n ± 0%    202.3n ± 2%          ~ (p=0.055 n=20)
Perm30ViaShuffle-32       109.7n ± 1%    115.5n ± 2%     +5.34% (p=0.000 n=20)
ShuffleOverhead-32        107.2n ± 1%    113.3n ± 1%     +5.74% (p=0.000 n=20)
Concurrent-32             2.108n ± 6%    2.107n ± 1%          ~ (p=0.448 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                        │ afa459a2f0.arm64 │ bbb48afeb7.arm64 │
                        │ sec/op           │ sec/op   vs base │
PCG_DXSM-8                2.531n ± 0%    2.529n ± 0%          ~ (p=0.586 n=20)
ChaCha8-8                                2.480n ± 0%
SourceUint64-8            2.531n ± 0%    2.534n ± 0%          ~ (p=0.227 n=20)
GlobalInt64-8             2.177n ± 1%    2.173n ± 1%          ~ (p=0.733 n=20)
GlobalInt64Parallel-8     0.4319n ± 0%   0.4304n ± 0%    -0.32% (p=0.003 n=20)
GlobalUint64-8            2.185n ± 1%    2.185n ± 0%          ~ (p=0.541 n=20)
GlobalUint64Parallel-8    0.4295n ± 1%   0.4294n ± 0%         ~ (p=0.203 n=20)
Int64-8                   4.104n ± 0%    4.107n ± 0%          ~ (p=0.193 n=20)
Uint64-8                  4.080n ± 0%    4.081n ± 0%          ~ (p=0.053 n=20)
GlobalIntN1000-8          2.814n ± 1%    2.814n ± 0%          ~ (p=0.879 n=20)
IntN1000-8                4.140n ± 0%    4.141n ± 0%          ~ (p=0.428 n=20)
Int64N1000-8              4.139n ± 0%    4.140n ± 0%          ~ (p=0.114 n=20)
Int64N1e8-8               4.140n ± 0%    4.140n ± 0%          ~ (p=0.898 n=20)
Int64N1e9-8               4.139n ± 0%    4.140n ± 0%          ~ (p=0.593 n=20)
Int64N2e9-8               4.140n ± 0%    4.139n ± 0%          ~ (p=0.158 n=20)
Int64N1e18-8              5.273n ± 0%    5.274n ± 0%          ~ (p=0.308 n=20)
Int64N2e18-8              6.059n ± 0%    6.058n ± 0%          ~ (p=0.053 n=20)
Int64N4e18-8              8.803n ± 0%    8.800n ± 0%          ~ (p=0.673 n=20)
Int32N1000-8              4.131n ± 0%    4.131n ± 0%          ~ (p=0.342 n=20)
Int32N1e8-8               4.131n ± 0%    4.131n ± 0%          ~ (p=0.091 n=20)
Int32N1e9-8               4.131n ± 0%    4.131n ± 0%          ~ (p=0.273 n=20)
Int32N2e9-8               4.131n ± 0%    4.131n ± 0%          ~ (p=0.425 n=20)
Float32-8                 4.110n ± 0%    4.112n ± 0%          ~ (p=0.203 n=20)
Float64-8                 4.104n ± 0%    4.106n ± 0%          ~ (p=0.409 n=20)
ExpFloat64-8              5.338n ± 0%    5.339n ± 0%          ~ (p=0.037 n=20)
NormFloat64-8             5.731n ± 0%    5.733n ± 0%          ~ (p=0.692 n=20)
Perm3-8                   26.62n ± 0%    26.65n ± 0%     +0.09% (p=0.000 n=20)
Perm30-8                  194.6n ± 2%    194.9n ± 0%          ~ (p=0.141 n=20)
Perm30ViaShuffle-8        156.4n ± 0%    156.5n ± 0%     +0.06% (p=0.000 n=20)
ShuffleOverhead-8         125.8n ± 0%    125.0n ± 0%     -0.64% (p=0.000 n=20)
Concurrent-8              2.654n ± 6%    2.441n ± 6%     -8.06% (p=0.009 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ afa459a2f0.386 │ bbb48afeb7.386 │
                        │ sec/op         │ sec/op vs base │
PCG_DXSM-32               7.793n ± 2%    7.647n ± 1%          ~ (p=0.021 n=20)
ChaCha8-32                               11.48n ± 2%
SourceUint64-32           7.680n ± 1%    7.714n ± 1%          ~ (p=0.713 n=20)
GlobalInt64-32            3.474n ± 3%    3.491n ± 28%         ~ (p=0.337 n=20)
GlobalInt64Parallel-32    0.3253n ± 0%   0.3194n ± 0%    -1.81% (p=0.000 n=20)
GlobalUint64-32           3.433n ± 2%    3.610n ± 2%     +5.14% (p=0.000 n=20)
GlobalUint64Parallel-32   0.3156n ± 0%   0.3164n ± 0%         ~ (p=0.073 n=20)
Int64-32                  7.707n ± 1%    7.824n ± 0%     +1.52% (p=0.005 n=20)
Uint64-32                 7.714n ± 1%    7.732n ± 2%          ~ (p=0.441 n=20)
GlobalIntN1000-32         6.236n ± 1%    6.176n ± 2%          ~ (p=0.499 n=20)
IntN1000-32               10.41n ± 1%    10.31n ± 2%          ~ (p=0.782 n=20)
Int64N1000-32             10.97n ± 2%    11.22n ± 2%     +2.19% (p=0.002 n=20)
Int64N1e8-32              10.98n ± 1%    11.07n ± 1%          ~ (p=0.056 n=20)
Int64N1e9-32              10.95n ± 0%    11.15n ± 2%          ~ (p=0.016 n=20)
Int64N2e9-32              11.11n ± 1%    11.00n ± 1%          ~ (p=0.654 n=20)
Int64N1e18-32             15.18n ± 2%    14.97n ± 2%          ~ (p=0.387 n=20)
Int64N2e18-32             15.61n ± 1%    15.91n ± 1%     +1.92% (p=0.003 n=20)
Int64N4e18-32             19.23n ± 2%    18.98n ± 1%          ~ (p=1.000 n=20)
Int32N1000-32             10.35n ± 1%    10.31n ± 2%          ~ (p=0.081 n=20)
Int32N1e8-32              10.33n ± 1%    10.38n ± 1%          ~ (p=0.335 n=20)
Int32N1e9-32              10.35n ± 1%    10.37n ± 1%          ~ (p=0.497 n=20)
Int32N2e9-32              10.35n ± 1%    10.41n ± 1%          ~ (p=0.605 n=20)
Float32-32                13.57n ± 1%    13.78n ± 2%          ~ (p=0.047 n=20)
Float64-32                22.95n ± 4%    23.43n ± 3%          ~ (p=0.218 n=20)
ExpFloat64-32             15.23n ± 2%    15.46n ± 1%          ~ (p=0.095 n=20)
NormFloat64-32            13.78n ± 1%    13.73n ± 2%          ~ (p=0.031 n=20)
Perm3-32                  46.62n ± 2%    47.46n ± 2%     +1.82% (p=0.004 n=20)
Perm30-32                 400.7n ± 1%    403.5n ± 1%          ~ (p=0.098 n=20)
Perm30ViaShuffle-32       350.5n ± 1%    348.1n ± 2%          ~ (p=0.703 n=20)
ShuffleOverhead-32        326.0n ± 2%    326.2n ± 2%          ~ (p=0.440 n=20)
Concurrent-32             3.290n ± 0%    3.297n ± 4%          ~ (p=0.189 n=20)

For #61716.

Change-Id: Id2a7e1c1db0beb81f563faaefba65fe292497269
Reviewed-on: https://go-review.googlesource.com/c/go/+/516859
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
2023-10-30 | math/rand/v2: delete Mitchell/Reeds source | Russ Cox
These slowdowns are because we are now using PCG instead of the Mitchell/Reeds LFSR for the benchmarks. PCG is in fact a bit slower (but generates statistically far better random numbers).

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.amd64 │ afa459a2f0.amd64 │
                        │ sec/op           │ sec/op   vs base │
PCG_DXSM-32               1.490n ± 0%    1.488n ± 2%          ~ (p=0.408 n=20)
SourceUint64-32           1.352n ± 1%    1.450n ± 3%     +7.21% (p=0.000 n=20)
GlobalInt64-32            2.083n ± 0%    2.067n ± 2%          ~ (p=0.223 n=20)
GlobalInt64Parallel-32    0.1035n ± 1%   0.1044n ± 2%         ~ (p=0.010 n=20)
GlobalUint64-32           2.038n ± 1%    2.085n ± 0%     +2.28% (p=0.000 n=20)
GlobalUint64Parallel-32   0.1006n ± 1%   0.1008n ± 1%         ~ (p=0.733 n=20)
Int64-32                  1.687n ± 2%    1.779n ± 1%     +5.48% (p=0.000 n=20)
Uint64-32                 1.674n ± 2%    1.854n ± 2%    +10.69% (p=0.000 n=20)
GlobalIntN1000-32         3.135n ± 1%    3.140n ± 3%          ~ (p=0.794 n=20)
IntN1000-32               2.478n ± 1%    2.496n ± 1%     +0.73% (p=0.006 n=20)
Int64N1000-32             2.455n ± 1%    2.510n ± 2%     +2.22% (p=0.000 n=20)
Int64N1e8-32              2.467n ± 2%    2.471n ± 2%          ~ (p=0.050 n=20)
Int64N1e9-32              2.454n ± 1%    2.488n ± 2%     +1.39% (p=0.000 n=20)
Int64N2e9-32              2.482n ± 1%    2.478n ± 2%          ~ (p=0.066 n=20)
Int64N1e18-32             3.349n ± 2%    3.088n ± 1%     -7.81% (p=0.000 n=20)
Int64N2e18-32             3.537n ± 1%    3.493n ± 1%     -1.24% (p=0.002 n=20)
Int64N4e18-32             4.917n ± 0%    5.060n ± 2%     +2.91% (p=0.000 n=20)
Int32N1000-32             2.386n ± 1%    2.620n ± 1%     +9.76% (p=0.000 n=20)
Int32N1e8-32              2.366n ± 1%    2.652n ± 0%    +12.11% (p=0.000 n=20)
Int32N1e9-32              2.355n ± 2%    2.644n ± 1%    +12.32% (p=0.000 n=20)
Int32N2e9-32              2.371n ± 1%    2.619n ± 2%    +10.48% (p=0.000 n=20)
Float32-32                2.245n ± 2%    2.261n ± 1%          ~ (p=0.625 n=20)
Float64-32                2.235n ± 1%    2.241n ± 2%          ~ (p=0.393 n=20)
ExpFloat64-32             3.813n ± 3%    3.716n ± 1%     -2.53% (p=0.000 n=20)
NormFloat64-32            3.652n ± 2%    3.718n ± 1%     +1.79% (p=0.006 n=20)
Perm3-32                  33.12n ± 3%    34.11n ± 2%          ~ (p=0.021 n=20)
Perm30-32                 205.1n ± 1%    200.6n ± 0%     -2.17% (p=0.000 n=20)
Perm30ViaShuffle-32       110.8n ± 1%    109.7n ± 1%     -0.99% (p=0.002 n=20)
ShuffleOverhead-32        113.0n ± 1%    107.2n ± 1%     -5.09% (p=0.000 n=20)
Concurrent-32             2.100n ± 0%    2.108n ± 6%          ~ (p=0.103 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
                        │ 01ff938549.arm64 │ afa459a2f0.arm64 │
                        │ sec/op           │ sec/op   vs base │
PCG_DXSM-8                2.531n ± 0%    2.531n ± 0%          ~ (p=0.763 n=20)
SourceUint64-8            2.258n ± 1%    2.531n ± 0%    +12.09% (p=0.000 n=20)
GlobalInt64-8             2.167n ± 0%    2.177n ± 1%          ~ (p=0.213 n=20)
GlobalInt64Parallel-8     0.4310n ± 0%   0.4319n ± 0%         ~ (p=0.027 n=20)
GlobalUint64-8            2.182n ± 1%    2.185n ± 1%          ~ (p=0.683 n=20)
GlobalUint64Parallel-8    0.4297n ± 0%   0.4295n ± 1%         ~ (p=0.941 n=20)
Int64-8                   2.472n ± 1%    4.104n ± 0%    +66.00% (p=0.000 n=20)
Uint64-8                  2.449n ± 1%    4.080n ± 0%    +66.60% (p=0.000 n=20)
GlobalIntN1000-8          2.814n ± 2%    2.814n ± 1%          ~ (p=0.972 n=20)
IntN1000-8                2.998n ± 2%    4.140n ± 0%    +38.09% (p=0.000 n=20)
Int64N1000-8              2.949n ± 2%    4.139n ± 0%    +40.35% (p=0.000 n=20)
Int64N1e8-8               2.953n ± 2%    4.140n ± 0%    +40.22% (p=0.000 n=20)
Int64N1e9-8               2.950n ± 0%    4.139n ± 0%    +40.32% (p=0.000 n=20)
Int64N2e9-8               2.946n ± 2%    4.140n ± 0%    +40.53% (p=0.000 n=20)
Int64N1e18-8              3.779n ± 1%    5.273n ± 0%    +39.52% (p=0.000 n=20)
Int64N2e18-8              4.370n ± 1%    6.059n ± 0%    +38.65% (p=0.000 n=20)
Int64N4e18-8              6.544n ± 1%    8.803n ± 0%    +34.52% (p=0.000 n=20)
Int32N1000-8              2.950n ± 0%    4.131n ± 0%    +40.06% (p=0.000 n=20)
Int32N1e8-8               2.950n ± 2%    4.131n ± 0%    +40.03% (p=0.000 n=20)
Int32N1e9-8               2.951n ± 2%    4.131n ± 0%    +39.99% (p=0.000 n=20)
Int32N2e9-8               2.950n ± 2%    4.131n ± 0%    +40.03% (p=0.000 n=20)
Float32-8                 3.441n ± 0%    4.110n ± 0%    +19.44% (p=0.000 n=20)
Float64-8                 3.442n ± 0%    4.104n ± 0%    +19.24% (p=0.000 n=20)
ExpFloat64-8              4.481n ± 0%    5.338n ± 0%    +19.11% (p=0.000 n=20)
NormFloat64-8             4.725n ± 0%    5.731n ± 0%    +21.28% (p=0.000 n=20)
Perm3-8                   26.55n ± 0%    26.62n ± 0%     +0.28% (p=0.000 n=20)
Perm30-8                  181.9n ± 0%    194.6n ± 2%     +6.98% (p=0.000 n=20)
Perm30ViaShuffle-8        142.9n ± 0%    156.4n ± 0%     +9.45% (p=0.000 n=20)
ShuffleOverhead-8         120.8n ± 2%    125.8n ± 0%     +4.10% (p=0.000 n=20)
Concurrent-8              2.421n ± 6%    2.654n ± 6%     +9.67% (p=0.002 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 01ff938549.386 │ afa459a2f0.386 │
                        │ sec/op         │ sec/op vs base │
PCG_DXSM-32               7.613n ± 1%    7.793n ± 2%     +2.38% (p=0.000 n=20)
SourceUint64-32           2.069n ± 0%    7.680n ± 1%   +271.19% (p=0.000 n=20)
GlobalInt64-32            3.456n ± 1%    3.474n ± 3%          ~ (p=0.654 n=20)
GlobalInt64Parallel-32    0.3252n ± 0%   0.3253n ± 0%         ~ (p=0.952 n=20)
GlobalUint64-32           3.573n ± 1%    3.433n ± 2%     -3.92% (p=0.000 n=20)
GlobalUint64Parallel-32   0.3159n ± 0%   0.3156n ± 0%         ~ (p=0.223 n=20)
Int64-32                  2.562n ± 2%    7.707n ± 1%   +200.74% (p=0.000 n=20)
Uint64-32                 2.592n ± 0%    7.714n ± 1%   +197.65% (p=0.000 n=20)
GlobalIntN1000-32         6.266n ± 2%    6.236n ± 1%          ~ (p=0.039 n=20)
IntN1000-32               4.724n ± 2%    10.410n ± 1%  +120.39% (p=0.000 n=20)
Int64N1000-32             5.490n ± 2%    10.975n ± 2%   +99.89% (p=0.000 n=20)
Int64N1e8-32              5.513n ± 2%    10.980n ± 1%   +99.15% (p=0.000 n=20)
Int64N1e9-32              5.476n ± 1%    10.950n ± 0%   +99.96% (p=0.000 n=20)
Int64N2e9-32              5.501n ± 2%    11.110n ± 1%  +101.96% (p=0.000 n=20)
Int64N1e18-32             9.043n ± 2%    15.180n ± 2%   +67.86% (p=0.000 n=20)
Int64N2e18-32             9.601n ± 2%    15.610n ± 1%   +62.60% (p=0.000 n=20)
Int64N4e18-32             12.00n ± 1%    19.23n ± 2%    +60.14% (p=0.000 n=20)
Int32N1000-32             4.829n ± 2%    10.345n ± 1%  +114.25% (p=0.000 n=20)
Int32N1e8-32              4.825n ± 2%    10.330n ± 1%  +114.09% (p=0.000 n=20)
Int32N1e9-32              4.830n ± 2%    10.350n ± 1%  +114.26% (p=0.000 n=20)
Int32N2e9-32              4.750n ± 2%    10.345n ± 1%  +117.81% (p=0.000 n=20)
Float32-32                10.89n ± 4%    13.57n ± 1%    +24.61% (p=0.000 n=20)
Float64-32                19.60n ± 4%    22.95n ± 4%    +17.12% (p=0.000 n=20)
ExpFloat64-32             12.96n ± 3%    15.23n ± 2%    +17.47% (p=0.000 n=20)
NormFloat64-32            7.516n ± 1%    13.780n ± 1%   +83.34% (p=0.000 n=20)
Perm3-32                  36.78n ± 2%    46.62n ± 2%    +26.72% (p=0.000 n=20)
Perm30-32                 238.9n ± 2%    400.7n ± 1%    +67.73% (p=0.000 n=20)
Perm30ViaShuffle-32       189.7n ± 2%    350.5n ± 1%    +84.79% (p=0.000 n=20)
ShuffleOverhead-32        159.8n ± 1%    326.0n ± 2%   +104.01% (p=0.000 n=20)
Concurrent-32             3.286n ± 1%    3.290n ± 0%          ~ (p=0.743 n=20)

On the other hand, compared to the original "update benchmarks" CL, the cleanups we've made more than compensate for PCG being a bit slower than LFSR, at least on 64-bit x86. ARM64 (Apple M1) is a bit slower: perhaps the 64x64→128 multiply is slower there for some reason. 386 is noticeably slower, but it's also a non-SSA backend.

goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.amd64 │ afa459a2f0.amd64 │
                        │ sec/op           │ sec/op   vs base │
SourceUint64-32           1.555n ± 1%    1.450n ± 3%     -6.78% (p=0.000 n=20)
GlobalInt64-32            2.071n ± 1%    2.067n ± 2%          ~ (p=0.673 n=20)
GlobalInt63Parallel-32    0.1023n ± 1%
GlobalInt64Parallel-32                   0.1044n ± 2%
GlobalUint64-32           5.193n ± 1%    2.085n ± 0%    -59.86% (p=0.000 n=20)
GlobalUint64Parallel-32   0.2341n ± 0%   0.1008n ± 1%   -56.93% (p=0.000 n=20)
Int64-32                  2.056n ± 2%    1.779n ± 1%    -13.47% (p=0.000 n=20)
Uint64-32                 2.077n ± 2%    1.854n ± 2%    -10.74% (p=0.000 n=20)
GlobalIntN1000-32         4.077n ± 2%    3.140n ± 3%    -22.98% (p=0.000 n=20)
IntN1000-32               3.476n ± 2%    2.496n ± 1%    -28.19% (p=0.000 n=20)
Int64N1000-32             3.059n ± 1%    2.510n ± 2%    -17.96% (p=0.000 n=20)
Int64N1e8-32              2.942n ± 1%    2.471n ± 2%    -15.98% (p=0.000 n=20)
Int64N1e9-32              2.932n ± 1%    2.488n ± 2%    -15.14% (p=0.000 n=20)
Int64N2e9-32              2.925n ± 1%    2.478n ± 2%    -15.30% (p=0.000 n=20)
Int64N1e18-32             3.116n ± 1%    3.088n ± 1%          ~ (p=0.013 n=20)
Int64N2e18-32             4.067n ± 1%    3.493n ± 1%    -14.11% (p=0.000 n=20)
Int64N4e18-32             4.054n ± 1%    5.060n ± 2%    +24.80% (p=0.000 n=20)
Int32N1000-32             2.951n ± 1%    2.620n ± 1%    -11.22% (p=0.000 n=20)
Int32N1e8-32              3.102n ± 1%    2.652n ± 0%    -14.50% (p=0.000 n=20)
Int32N1e9-32              3.535n ± 1%    2.644n ± 1%    -25.20% (p=0.000 n=20)
Int32N2e9-32              3.514n ± 1%    2.619n ± 2%    -25.47% (p=0.000 n=20)
Float32-32                2.760n ± 1%    2.261n ± 1%    -18.06% (p=0.000 n=20)
Float64-32                2.284n ± 1%    2.241n ± 2%          ~ (p=0.016 n=20)
ExpFloat64-32             3.757n ± 1%    3.716n ± 1%          ~ (p=0.034 n=20)
NormFloat64-32            3.837n ± 1%    3.718n ± 1%     -3.09% (p=0.000 n=20)
Perm3-32                  35.23n ± 2%    34.11n ± 2%     -3.19% (p=0.000 n=20)
Perm30-32                 208.8n ± 1%    200.6n ± 0%     -3.93% (p=0.000 n=20)
Perm30ViaShuffle-32       111.7n ± 1%    109.7n ± 1%     -1.84% (p=0.000 n=20)
ShuffleOverhead-32        101.1n ± 1%    107.2n ± 1%     +6.03% (p=0.000 n=20)
Concurrent-32             2.108n ± 7%    2.108n ± 6%          ~ (p=0.644 n=20)
PCG_DXSM-32                              1.488n ± 2%

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                        │ 220860f76f.arm64 │ afa459a2f0.arm64 │
                        │ sec/op           │ sec/op   vs base │
SourceUint64-8            2.316n ± 1%    2.531n ± 0%     +9.33% (p=0.000 n=20)
GlobalInt64-8             2.183n ± 1%    2.177n ± 1%          ~ (p=0.533 n=20)
GlobalInt63Parallel-8     0.4331n ± 0%
GlobalInt64Parallel-8                    0.4319n ± 0%
GlobalUint64-8            4.377n ± 2%    2.185n ± 1%    -50.07% (p=0.000 n=20)
GlobalUint64Parallel-8    0.9237n ± 0%   0.4295n ± 1%   -53.50% (p=0.000 n=20)
Int64-8                   2.538n ± 1%    4.104n ± 0%    +61.68% (p=0.000 n=20)
Uint64-8                  2.604n ± 1%    4.080n ± 0%    +56.68% (p=0.000 n=20)
GlobalIntN1000-8          3.857n ± 2%    2.814n ± 1%    -27.04% (p=0.000 n=20)
IntN1000-8                3.822n ± 2%    4.140n ± 0%     +8.32% (p=0.000 n=20)
Int64N1000-8              3.318n ± 0%    4.139n ± 0%    +24.74% (p=0.000 n=20)
Int64N1e8-8               3.349n ± 1%    4.140n ± 0%    +23.64% (p=0.000 n=20)
Int64N1e9-8               3.317n ± 2%    4.139n ± 0%    +24.80% (p=0.000 n=20)
Int64N2e9-8               3.317n ± 2%    4.140n ± 0%    +24.81% (p=0.000 n=20)
Int64N1e18-8              3.542n ± 1%    5.273n ± 0%    +48.85% (p=0.000 n=20)
Int64N2e18-8              5.087n ± 0%    6.059n ± 0%    +19.12% (p=0.000 n=20)
Int64N4e18-8              5.084n ± 0%    8.803n ± 0%    +73.16% (p=0.000 n=20)
Int32N1000-8              3.208n ± 2%    4.131n ± 0%    +28.79% (p=0.000 n=20)
Int32N1e8-8               3.610n ± 1%    4.131n ± 0%    +14.43% (p=0.000 n=20)
Int32N1e9-8               4.235n ± 0%    4.131n ± 0%     -2.44% (p=0.000 n=20)
Int32N2e9-8               4.229n ± 1%    4.131n ± 0%     -2.33% (p=0.000 n=20)
Float32-8                 3.468n ± 0%    4.110n ± 0%    +18.50% (p=0.000 n=20)
Float64-8                 3.447n ± 0%    4.104n ± 0%    +19.05% (p=0.000 n=20)
ExpFloat64-8              4.567n ± 0%    5.338n ± 0%    +16.86% (p=0.000 n=20)
NormFloat64-8             4.821n ± 0%    5.731n ± 0%    +18.89% (p=0.000 n=20)
Perm3-8                   28.89n ± 0%    26.62n ± 0%     -7.84% (p=0.000 n=20)
Perm30-8                  175.7n ± 0%    194.6n ± 2%    +10.76% (p=0.000 n=20)
Perm30ViaShuffle-8        153.5n ± 0%    156.4n ± 0%     +1.86% (p=0.000 n=20)
ShuffleOverhead-8         119.8n ± 1%    125.8n ± 0%     +4.97% (p=0.000 n=20)
Concurrent-8              2.433n ± 3%    2.654n ± 6%     +9.13% (p=0.001 n=20)
PCG_DXSM-8                               2.531n ± 0%

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                        │ 220860f76f.386 │ afa459a2f0.386 │
                        │ sec/op         │ sec/op vs base │
SourceUint64-32           2.370n ± 1%    7.680n ± 1%   +224.05% (p=0.000 n=20)
GlobalInt64-32            3.569n ± 1%    3.474n ± 3%     -2.66% (p=0.001 n=20)
GlobalInt63Parallel-32    0.3221n ± 1%
GlobalInt64Parallel-32                   0.3253n ± 0%
GlobalUint64-32           8.797n ± 10%   3.433n ± 2%    -60.98% (p=0.000 n=20)
GlobalUint64Parallel-32   0.6351n ± 0%   0.3156n ± 0%   -50.31% (p=0.000 n=20)
Int64-32                  2.612n ± 2%    7.707n ± 1%   +195.04% (p=0.000 n=20)
Uint64-32                 3.350n ± 1%    7.714n ± 1%   +130.25% (p=0.000 n=20)
GlobalIntN1000-32         5.892n ± 1%    6.236n ± 1%     +5.82% (p=0.000 n=20)
IntN1000-32               4.546n ± 1%    10.410n ± 1%  +128.97% (p=0.000 n=20)
Int64N1000-32             14.59n ± 1%    10.97n ± 2%    -24.75% (p=0.000 n=20)
Int64N1e8-32              14.76n ± 2%    10.98n ± 1%    -25.58% (p=0.000 n=20)
Int64N1e9-32              16.57n ± 1%    10.95n ± 0%    -33.90% (p=0.000 n=20)
Int64N2e9-32              14.54n ± 1%    11.11n ± 1%    -23.62% (p=0.000 n=20)
Int64N1e18-32             16.14n ± 1%    15.18n ± 2%     -5.95% (p=0.000 n=20)
Int64N2e18-32             18.10n ± 1%    15.61n ± 1%    -13.73% (p=0.000 n=20)
Int64N4e18-32             18.65n ± 1%    19.23n ± 2%     +3.08% (p=0.000 n=20)
Int32N1000-32             3.560n ± 1%    10.345n ± 1%  +190.55% (p=0.000 n=20)
Int32N1e8-32              3.770n ± 2%    10.330n ± 1%  +174.01% (p=0.000 n=20)
Int32N1e9-32              4.098n ± 0%    10.350n ± 1%  +152.53% (p=0.000 n=20)
Int32N2e9-32              4.179n ± 1%    10.345n ± 1%  +147.52% (p=0.000 n=20)
Float32-32                21.18n ± 4%    13.57n ± 1%    -35.93% (p=0.000 n=20)
Float64-32                20.60n ± 2%    22.95n ± 4%    +11.41% (p=0.000 n=20)
ExpFloat64-32             13.07n ± 0%    15.23n ± 2%    +16.48% (p=0.000 n=20)
NormFloat64-32            7.738n ± 2%    13.780n ± 1%   +78.08% (p=0.000 n=20)
Perm3-32                  36.73n ± 1%    46.62n ± 2%    +26.91% (p=0.000 n=20)
Perm30-32                 211.9n ± 1%    400.7n ± 1%    +89.05% (p=0.000 n=20)
Perm30ViaShuffle-32       165.2n ± 1%    350.5n ± 1%   +112.20% (p=0.000 n=20)
ShuffleOverhead-32        133.9n ± 1%    326.0n ± 2%   +143.37% (p=0.000 n=20)
Concurrent-32 3.287n ± 2% 3.290n ± 0% ~ (p=0.365 n=20) PCG_DXSM-32 7.793n ± 2% For #61716. Change-Id: I4e9c0525b5f84a2ac46f23da9e365495e2d05777 Reviewed-on: https://go-review.googlesource.com/c/go/+/502506 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30  math/rand/v2: add PCG-DXSM  Russ Cox
For the original math/rand, we ported Plan 9's random number generator, which was a refinement by Ken Thompson of an algorithm by Don Mitchell and Jim Reeds, which Mitchell in turn recalls as having been derived from an algorithm by Marsaglia. At its core, it is an additive lagged Fibonacci generator (ALFG). Whatever the details of the history, this generator is nowhere near the current state of the art for simple, pseudo-random generators. This CL adds an implementation of Melissa O'Neill's PCG, specifically the variant PCG-DXSM, which she defined after writing the PCG paper and which is now the default in Numpy. The update is slightly slower (a few multiplies and adds, instead of a few adds), but the state is dramatically smaller (2 words instead of 607). The statistical output properties are better too. A followup CL will delete the old generator. PCG is the only change here, so no benchmarks should be affected. Including them anyway as further evidence for caution. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.amd64 │ 01ff938549.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.325n ± 1% 1.352n ± 1% +2.00% (p=0.000 n=20) GlobalInt64-32 2.240n ± 1% 2.083n ± 0% -7.03% (p=0.000 n=20) GlobalInt64Parallel-32 0.1041n ± 1% 0.1035n ± 1% ~ (p=0.064 n=20) GlobalUint64-32 2.072n ± 3% 2.038n ± 1% ~ (p=0.089 n=20) GlobalUint64Parallel-32 0.1008n ± 1% 0.1006n ± 1% ~ (p=0.804 n=20) Int64-32 1.716n ± 1% 1.687n ± 2% ~ (p=0.045 n=20) Uint64-32 1.665n ± 1% 1.674n ± 2% ~ (p=0.878 n=20) GlobalIntN1000-32 3.335n ± 1% 3.135n ± 1% -6.00% (p=0.000 n=20) IntN1000-32 2.484n ± 1% 2.478n ± 1% ~ (p=0.085 n=20) Int64N1000-32 2.502n ± 2% 2.455n ± 1% -1.88% (p=0.002 n=20) Int64N1e8-32 2.484n ± 2% 2.467n ± 2% ~ (p=0.048 n=20) Int64N1e9-32 2.502n ± 0% 2.454n ± 1% -1.92% (p=0.000 n=20) Int64N2e9-32 2.502n ± 0% 2.482n ± 1% -0.76% (p=0.000 n=20) Int64N1e18-32 3.201n ± 1% 3.349n ± 2% +4.62% (p=0.000 n=20) Int64N2e18-32 3.504n ± 1% 3.537n ± 1% ~ 
(p=0.185 n=20) Int64N4e18-32 4.873n ± 1% 4.917n ± 0% +0.90% (p=0.000 n=20) Int32N1000-32 2.639n ± 1% 2.386n ± 1% -9.57% (p=0.000 n=20) Int32N1e8-32 2.686n ± 2% 2.366n ± 1% -11.91% (p=0.000 n=20) Int32N1e9-32 2.636n ± 1% 2.355n ± 2% -10.70% (p=0.000 n=20) Int32N2e9-32 2.660n ± 1% 2.371n ± 1% -10.88% (p=0.000 n=20) Float32-32 2.261n ± 1% 2.245n ± 2% ~ (p=0.752 n=20) Float64-32 2.280n ± 1% 2.235n ± 1% -1.97% (p=0.007 n=20) ExpFloat64-32 3.891n ± 1% 3.813n ± 3% ~ (p=0.087 n=20) NormFloat64-32 3.711n ± 1% 3.652n ± 2% ~ (p=0.021 n=20) Perm3-32 32.60n ± 2% 33.12n ± 3% ~ (p=0.107 n=20) Perm30-32 204.2n ± 0% 205.1n ± 1% ~ (p=0.358 n=20) Perm30ViaShuffle-32 121.7n ± 2% 110.8n ± 1% -8.96% (p=0.000 n=20) ShuffleOverhead-32 106.2n ± 2% 113.0n ± 1% +6.36% (p=0.000 n=20) Concurrent-32 2.190n ± 5% 2.100n ± 0% -4.13% (p=0.001 n=20) PCG_DXSM-32 1.490n ± 0% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 8993506f2f.arm64 │ 01ff938549.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.271n ± 0% 2.258n ± 1% ~ (p=0.167 n=20) GlobalInt64-8 2.161n ± 1% 2.167n ± 0% ~ (p=0.693 n=20) GlobalInt64Parallel-8 0.4303n ± 0% 0.4310n ± 0% ~ (p=0.051 n=20) GlobalUint64-8 2.164n ± 1% 2.182n ± 1% ~ (p=0.042 n=20) GlobalUint64Parallel-8 0.4287n ± 0% 0.4297n ± 0% ~ (p=0.082 n=20) Int64-8 2.478n ± 1% 2.472n ± 1% ~ (p=0.151 n=20) Uint64-8 2.460n ± 1% 2.449n ± 1% ~ (p=0.013 n=20) GlobalIntN1000-8 2.814n ± 2% 2.814n ± 2% ~ (p=0.821 n=20) IntN1000-8 3.003n ± 2% 2.998n ± 2% ~ (p=0.024 n=20) Int64N1000-8 2.954n ± 0% 2.949n ± 2% ~ (p=0.192 n=20) Int64N1e8-8 2.956n ± 0% 2.953n ± 2% ~ (p=0.109 n=20) Int64N1e9-8 3.325n ± 0% 2.950n ± 0% -11.26% (p=0.000 n=20) Int64N2e9-8 2.956n ± 2% 2.946n ± 2% ~ (p=0.027 n=20) Int64N1e18-8 3.780n ± 1% 3.779n ± 1% ~ (p=0.815 n=20) Int64N2e18-8 4.385n ± 0% 4.370n ± 1% ~ (p=0.402 n=20) Int64N4e18-8 6.527n ± 0% 6.544n ± 1% ~ (p=0.140 n=20) Int32N1000-8 2.964n ± 1% 2.950n ± 0% -0.47% (p=0.002 n=20) Int32N1e8-8 2.964n ± 1% 2.950n ± 2% ~ (p=0.013 n=20) Int32N1e9-8 2.963n ± 
2% 2.951n ± 2% ~ (p=0.062 n=20) Int32N2e9-8 2.961n ± 2% 2.950n ± 2% -0.37% (p=0.002 n=20) Float32-8 3.442n ± 0% 3.441n ± 0% ~ (p=0.211 n=20) Float64-8 3.442n ± 0% 3.442n ± 0% ~ (p=0.067 n=20) ExpFloat64-8 4.472n ± 0% 4.481n ± 0% +0.20% (p=0.000 n=20) NormFloat64-8 4.734n ± 0% 4.725n ± 0% -0.19% (p=0.003 n=20) Perm3-8 26.55n ± 0% 26.55n ± 0% ~ (p=0.833 n=20) Perm30-8 181.9n ± 0% 181.9n ± 0% -0.03% (p=0.004 n=20) Perm30ViaShuffle-8 143.1n ± 0% 142.9n ± 0% ~ (p=0.204 n=20) ShuffleOverhead-8 120.6n ± 1% 120.8n ± 2% ~ (p=0.102 n=20) Concurrent-8 2.357n ± 2% 2.421n ± 6% ~ (p=0.016 n=20) PCG_DXSM-8 2.531n ± 0% goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 8993506f2f.386 │ 01ff938549.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.102n ± 2% 2.069n ± 0% ~ (p=0.021 n=20) GlobalInt64-32 3.542n ± 2% 3.456n ± 1% -2.44% (p=0.001 n=20) GlobalInt64Parallel-32 0.3202n ± 0% 0.3252n ± 0% +1.56% (p=0.000 n=20) GlobalUint64-32 3.507n ± 1% 3.573n ± 1% +1.87% (p=0.000 n=20) GlobalUint64Parallel-32 0.3170n ± 1% 0.3159n ± 0% ~ (p=0.167 n=20) Int64-32 2.516n ± 1% 2.562n ± 2% ~ (p=0.016 n=20) Uint64-32 2.544n ± 1% 2.592n ± 0% +1.85% (p=0.000 n=20) GlobalIntN1000-32 6.237n ± 1% 6.266n ± 2% ~ (p=0.268 n=20) IntN1000-32 4.670n ± 2% 4.724n ± 2% ~ (p=0.644 n=20) Int64N1000-32 5.412n ± 1% 5.490n ± 2% ~ (p=0.159 n=20) Int64N1e8-32 5.414n ± 2% 5.513n ± 2% ~ (p=0.129 n=20) Int64N1e9-32 5.473n ± 1% 5.476n ± 1% ~ (p=0.723 n=20) Int64N2e9-32 5.487n ± 1% 5.501n ± 2% ~ (p=0.481 n=20) Int64N1e18-32 8.901n ± 2% 9.043n ± 2% ~ (p=0.330 n=20) Int64N2e18-32 9.521n ± 1% 9.601n ± 2% ~ (p=0.703 n=20) Int64N4e18-32 11.92n ± 1% 12.00n ± 1% ~ (p=0.489 n=20) Int32N1000-32 4.785n ± 1% 4.829n ± 2% ~ (p=0.402 n=20) Int32N1e8-32 4.748n ± 1% 4.825n ± 2% ~ (p=0.218 n=20) Int32N1e9-32 4.810n ± 1% 4.830n ± 2% ~ (p=0.794 n=20) Int32N2e9-32 4.812n ± 1% 4.750n ± 2% ~ (p=0.057 n=20) Float32-32 10.48n ± 4% 10.89n ± 4% ~ (p=0.162 n=20) Float64-32 19.79n ± 3% 19.60n ± 4% ~ (p=0.668 n=20) 
ExpFloat64-32 12.91n ± 3% 12.96n ± 3% ~ (p=1.000 n=20) NormFloat64-32 7.462n ± 1% 7.516n ± 1% ~ (p=0.051 n=20) Perm3-32 35.98n ± 2% 36.78n ± 2% ~ (p=0.033 n=20) Perm30-32 241.5n ± 1% 238.9n ± 2% ~ (p=0.126 n=20) Perm30ViaShuffle-32 187.3n ± 2% 189.7n ± 2% ~ (p=0.387 n=20) ShuffleOverhead-32 160.2n ± 1% 159.8n ± 1% ~ (p=0.256 n=20) Concurrent-32 3.308n ± 3% 3.286n ± 1% ~ (p=0.038 n=20) PCG_DXSM-32 7.613n ± 1% For #61716. Change-Id: Icb274ca1f782504d658305a40159b4ae6a2f3f1d Reviewed-on: https://go-review.googlesource.com/c/go/+/502505 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org>
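As a hedged illustration of the PCG source this commit adds, the sketch below seeds two generators identically through the public math/rand/v2 API and checks that they produce the same stream. The helper name firstN and the seed values are made up for the example. It does not reproduce the internal PCG-DXSM arithmetic.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// firstN is a hypothetical helper: it returns the first n outputs of a
// PCG source seeded with the two 64-bit words (seed1, seed2). Those two
// words are the entire generator state, compared with 607 words for the
// old additive lagged Fibonacci generator.
func firstN(seed1, seed2 uint64, n int) []uint64 {
	src := rand.NewPCG(seed1, seed2)
	out := make([]uint64, n)
	for i := range out {
		out[i] = src.Uint64()
	}
	return out
}

func main() {
	a := firstN(1, 2, 4)
	b := firstN(1, 2, 4)
	fmt.Println(a)
	fmt.Println(b) // identical seeds give an identical stream
}
```

To use the source with the higher-level helpers (IntN, Float64, Shuffle, and so on), wrap it as rand.New(rand.NewPCG(seed1, seed2)).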
2023-10-30  math/rand/v2: simplify Perm  Russ Cox
The compiler says Perm is being inlined into BenchmarkPerm, and yet BenchmarkPerm30ViaShuffle, which you'd think is the same code, still runs significantly faster. The benchmarks are mystifying but this is clearly still a step in the right direction, since BenchmarkPerm30ViaShuffle is still the fastest and we avoid having two copies of that logic. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.amd64 │ 8993506f2f.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.316n ± 2% 1.325n ± 1% ~ (p=0.208 n=20) GlobalInt64-32 2.048n ± 1% 2.240n ± 1% +9.38% (p=0.000 n=20) GlobalInt64Parallel-32 0.1037n ± 1% 0.1041n ± 1% ~ (p=0.774 n=20) GlobalUint64-32 2.039n ± 2% 2.072n ± 3% ~ (p=0.115 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1008n ± 1% ~ (p=0.417 n=20) Int64-32 1.692n ± 2% 1.716n ± 1% ~ (p=0.122 n=20) Uint64-32 1.643n ± 2% 1.665n ± 1% ~ (p=0.062 n=20) GlobalIntN1000-32 3.287n ± 1% 3.335n ± 1% ~ (p=0.147 n=20) IntN1000-32 2.678n ± 2% 2.484n ± 1% -7.24% (p=0.000 n=20) Int64N1000-32 2.684n ± 2% 2.502n ± 2% -6.80% (p=0.000 n=20) Int64N1e8-32 2.663n ± 2% 2.484n ± 2% -6.76% (p=0.000 n=20) Int64N1e9-32 2.633n ± 1% 2.502n ± 0% -4.98% (p=0.000 n=20) Int64N2e9-32 2.657n ± 1% 2.502n ± 0% -5.87% (p=0.000 n=20) Int64N1e18-32 3.125n ± 2% 3.201n ± 1% +2.43% (p=0.000 n=20) Int64N2e18-32 3.476n ± 1% 3.504n ± 1% +0.83% (p=0.009 n=20) Int64N4e18-32 4.795n ± 1% 4.873n ± 1% ~ (p=0.106 n=20) Int32N1000-32 2.485n ± 2% 2.639n ± 1% +6.20% (p=0.000 n=20) Int32N1e8-32 2.457n ± 1% 2.686n ± 2% +9.34% (p=0.000 n=20) Int32N1e9-32 2.452n ± 1% 2.636n ± 1% +7.52% (p=0.000 n=20) Int32N2e9-32 2.453n ± 1% 2.660n ± 1% +8.44% (p=0.000 n=20) Float32-32 2.254n ± 1% 2.261n ± 1% ~ (p=0.888 n=20) Float64-32 2.262n ± 1% 2.280n ± 1% ~ (p=0.040 n=20) ExpFloat64-32 3.777n ± 2% 3.891n ± 1% +3.03% (p=0.000 n=20) NormFloat64-32 3.606n ± 1% 3.711n ± 1% +2.91% (p=0.000 n=20) Perm3-32 33.12n ± 2% 32.60n ± 2% ~ (p=0.045 n=20) Perm30-32 176.1n ± 1% 204.2n ± 0% +15.96% 
(p=0.000 n=20) Perm30ViaShuffle-32 109.3n ± 1% 121.7n ± 2% +11.30% (p=0.000 n=20) ShuffleOverhead-32 112.5n ± 1% 106.2n ± 2% -5.56% (p=0.000 n=20) Concurrent-32 2.099n ± 0% 2.190n ± 5% +4.36% (p=0.001 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ e1bbe739fb.arm64 │ 8993506f2f.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.290n ± 1% 2.271n ± 0% ~ (p=0.015 n=20) GlobalInt64-8 2.180n ± 1% 2.161n ± 1% ~ (p=0.180 n=20) GlobalInt64Parallel-8 0.4294n ± 0% 0.4303n ± 0% +0.19% (p=0.001 n=20) GlobalUint64-8 2.170n ± 1% 2.164n ± 1% ~ (p=0.673 n=20) GlobalUint64Parallel-8 0.4283n ± 0% 0.4287n ± 0% ~ (p=0.128 n=20) Int64-8 2.481n ± 1% 2.478n ± 1% ~ (p=0.867 n=20) Uint64-8 2.464n ± 1% 2.460n ± 1% ~ (p=0.763 n=20) GlobalIntN1000-8 2.814n ± 0% 2.814n ± 2% ~ (p=0.969 n=20) IntN1000-8 2.934n ± 2% 3.003n ± 2% +2.35% (p=0.000 n=20) Int64N1000-8 2.957n ± 1% 2.954n ± 0% ~ (p=0.285 n=20) Int64N1e8-8 2.935n ± 2% 2.956n ± 0% +0.73% (p=0.002 n=20) Int64N1e9-8 2.935n ± 2% 3.325n ± 0% +13.29% (p=0.000 n=20) Int64N2e9-8 2.933n ± 4% 2.956n ± 2% ~ (p=0.163 n=20) Int64N1e18-8 3.781n ± 1% 3.780n ± 1% ~ (p=0.805 n=20) Int64N2e18-8 4.362n ± 0% 4.385n ± 0% ~ (p=0.077 n=20) Int64N4e18-8 6.576n ± 1% 6.527n ± 0% ~ (p=0.024 n=20) Int32N1000-8 2.942n ± 2% 2.964n ± 1% ~ (p=0.073 n=20) Int32N1e8-8 2.941n ± 1% 2.964n ± 1% ~ (p=0.058 n=20) Int32N1e9-8 2.938n ± 2% 2.963n ± 2% +0.87% (p=0.003 n=20) Int32N2e9-8 2.982n ± 2% 2.961n ± 2% ~ (p=0.056 n=20) Float32-8 3.441n ± 0% 3.442n ± 0% ~ (p=0.030 n=20) Float64-8 3.441n ± 0% 3.442n ± 0% +0.03% (p=0.001 n=20) ExpFloat64-8 4.472n ± 0% 4.472n ± 0% ~ (p=0.877 n=20) NormFloat64-8 4.716n ± 0% 4.734n ± 0% +0.38% (p=0.000 n=20) Perm3-8 26.66n ± 0% 26.55n ± 0% -0.39% (p=0.000 n=20) Perm30-8 143.3n ± 0% 181.9n ± 0% +26.97% (p=0.000 n=20) Perm30ViaShuffle-8 142.9n ± 0% 143.1n ± 0% ~ (p=0.669 n=20) ShuffleOverhead-8 121.1n ± 1% 120.6n ± 1% -0.41% (p=0.004 n=20) Concurrent-8 2.379n ± 2% 2.357n ± 2% ~ (p=0.337 n=20) goos: linux goarch: 386 pkg: 
math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ e1bbe739fb.386 │ 8993506f2f.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.087n ± 1% 2.102n ± 2% ~ (p=0.507 n=20) GlobalInt64-32 3.538n ± 2% 3.542n ± 2% ~ (p=0.425 n=20) GlobalInt64Parallel-32 0.3207n ± 1% 0.3202n ± 0% ~ (p=0.963 n=20) GlobalUint64-32 3.543n ± 1% 3.507n ± 1% ~ (p=0.034 n=20) GlobalUint64Parallel-32 0.3170n ± 0% 0.3170n ± 1% ~ (p=0.920 n=20) Int64-32 2.548n ± 1% 2.516n ± 1% ~ (p=0.139 n=20) Uint64-32 2.565n ± 2% 2.544n ± 1% ~ (p=0.394 n=20) GlobalIntN1000-32 6.300n ± 1% 6.237n ± 1% ~ (p=0.029 n=20) IntN1000-32 4.750n ± 0% 4.670n ± 2% ~ (p=0.034 n=20) Int64N1000-32 5.515n ± 2% 5.412n ± 1% -1.86% (p=0.009 n=20) Int64N1e8-32 5.527n ± 0% 5.414n ± 2% -2.05% (p=0.002 n=20) Int64N1e9-32 5.531n ± 2% 5.473n ± 1% ~ (p=0.047 n=20) Int64N2e9-32 5.514n ± 2% 5.487n ± 1% ~ (p=0.298 n=20) Int64N1e18-32 9.059n ± 1% 8.901n ± 2% ~ (p=0.037 n=20) Int64N2e18-32 9.594n ± 1% 9.521n ± 1% ~ (p=0.051 n=20) Int64N4e18-32 12.05n ± 2% 11.92n ± 1% ~ (p=0.357 n=20) Int32N1000-32 4.840n ± 2% 4.785n ± 1% ~ (p=0.189 n=20) Int32N1e8-32 4.832n ± 2% 4.748n ± 1% ~ (p=0.042 n=20) Int32N1e9-32 4.815n ± 2% 4.810n ± 1% ~ (p=0.878 n=20) Int32N2e9-32 4.813n ± 1% 4.812n ± 1% ~ (p=0.542 n=20) Float32-32 10.90n ± 2% 10.48n ± 4% -3.85% (p=0.007 n=20) Float64-32 20.32n ± 4% 19.79n ± 3% ~ (p=0.553 n=20) ExpFloat64-32 12.95n ± 3% 12.91n ± 3% ~ (p=0.909 n=20) NormFloat64-32 7.570n ± 1% 7.462n ± 1% -1.44% (p=0.004 n=20) Perm3-32 37.80n ± 2% 35.98n ± 2% -4.79% (p=0.000 n=20) Perm30-32 214.0n ± 1% 241.5n ± 1% +12.85% (p=0.000 n=20) Perm30ViaShuffle-32 188.7n ± 2% 187.3n ± 2% ~ (p=0.029 n=20) ShuffleOverhead-32 160.8n ± 1% 160.2n ± 1% ~ (p=0.180 n=20) Concurrent-32 3.288n ± 0% 3.308n ± 3% ~ (p=0.037 n=20) For #61716. 
Change-Id: I342b611456c3569520d3c91c849d29eba325d87e Reviewed-on: https://go-review.googlesource.com/c/go/+/502504 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Rob Pike <r@golang.org>
2023-10-30  math/rand/v2: remove bias in ExpFloat64 and NormFloat64  Branden Brown
The original implementation of the ziggurat algorithm was designed for 32-bit random integer inputs. This necessitated reusing some low-order bits for the slice selection and the random coordinate, which introduces statistical bias. The result is that PractRand consistently fails the math/rand normal and exponential sequences (transformed to uniform) within 2 GB of variates.

This change adjusts the ziggurat procedures to use 63-bit random inputs, so that there is no need to reuse bits between the slice and coordinate. This is sufficient for the normal sequence to survive to 256 GB of PractRand testing.

An alternative technique is to recalculate the ziggurats to use 1024 rather than 128 or 256 slices to make full use of 64-bit inputs. This improves the survival of the normal sequence to far beyond 256 GB and additionally provides a 6% performance improvement due to the improved rejection procedure efficiency. However, doing so increases the total size of the ziggurat tables from 4.5 kB to 48 kB.
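The bit-reuse problem described above can be illustrated with a small sketch. Both helpers are hypothetical and only show how the slice index and the coordinate are carved out of the random input. The 7-bit mask assumes a 128-slice ziggurat, and the exact bit positions used by the real code may differ.

```go
package main

import "fmt"

// splitShared mimics a 32-bit scheme: the slice index is taken from the
// low 7 bits of the same word that also serves as the coordinate, so the
// two quantities are correlated.
func splitShared(u uint32) (slice uint32, coord int32) {
	coord = int32(u)
	slice = u & 0x7F // reuses the low bits of coord
	return slice, coord
}

// splitDisjoint takes the coordinate from the low 32 bits of a 64-bit
// input and the slice index from bits above them, so no bit is used twice.
func splitDisjoint(u uint64) (slice uint32, coord int32) {
	coord = int32(uint32(u))
	slice = uint32(u>>32) & 0x7F
	return slice, coord
}

func main() {
	s, c := splitShared(0xDEADBEEF)
	fmt.Println(s == uint32(c)&0x7F) // the slice is fully determined by coord

	// With disjoint bits, changing the coordinate leaves the slice alone.
	s1, _ := splitDisjoint(0x0000000500000000 | 0xDEADBEEF)
	s2, _ := splitDisjoint(0x0000000500000000 | 0x12345678)
	fmt.Println(s1 == s2)
}
```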
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 2703446c2e.amd64 │ e1bbe739fb.amd64                      │
                         │ sec/op           │ sec/op          vs base               │
SourceUint64-32          1.337n ± 1%     1.316n ± 2%     ~        (p=0.024 n=20)
GlobalInt64-32           2.225n ± 2%     2.048n ± 1%     -7.93%   (p=0.000 n=20)
GlobalInt64Parallel-32   0.1043n ± 2%    0.1037n ± 1%    ~        (p=0.587 n=20)
GlobalUint64-32          2.058n ± 1%     2.039n ± 2%     ~        (p=0.030 n=20)
GlobalUint64Parallel-32  0.1009n ± 1%    0.1013n ± 1%    ~        (p=0.984 n=20)
Int64-32                 1.719n ± 2%     1.692n ± 2%     ~        (p=0.085 n=20)
Uint64-32                1.669n ± 1%     1.643n ± 2%     ~        (p=0.049 n=20)
GlobalIntN1000-32        3.321n ± 2%     3.287n ± 1%     ~        (p=0.298 n=20)
IntN1000-32              2.479n ± 1%     2.678n ± 2%     +8.01%   (p=0.000 n=20)
Int64N1000-32            2.477n ± 1%     2.684n ± 2%     +8.38%   (p=0.000 n=20)
Int64N1e8-32             2.490n ± 1%     2.663n ± 2%     +6.99%   (p=0.000 n=20)
Int64N1e9-32             2.458n ± 1%     2.633n ± 1%     +7.12%   (p=0.000 n=20)
Int64N2e9-32             2.486n ± 2%     2.657n ± 1%     +6.90%   (p=0.000 n=20)
Int64N1e18-32            3.215n ± 2%     3.125n ± 2%     -2.78%   (p=0.000 n=20)
Int64N2e18-32            3.588n ± 2%     3.476n ± 1%     -3.15%   (p=0.000 n=20)
Int64N4e18-32            4.938n ± 2%     4.795n ± 1%     -2.91%   (p=0.000 n=20)
Int32N1000-32            2.673n ± 2%     2.485n ± 2%     -7.02%   (p=0.000 n=20)
Int32N1e8-32             2.631n ± 2%     2.457n ± 1%     -6.63%   (p=0.000 n=20)
Int32N1e9-32             2.628n ± 2%     2.452n ± 1%     -6.70%   (p=0.000 n=20)
Int32N2e9-32             2.684n ± 2%     2.453n ± 1%     -8.61%   (p=0.000 n=20)
Float32-32               2.240n ± 2%     2.254n ± 1%     ~        (p=0.878 n=20)
Float64-32               2.253n ± 1%     2.262n ± 1%     ~        (p=0.963 n=20)
ExpFloat64-32            3.677n ± 1%     3.777n ± 2%     +2.71%   (p=0.004 n=20)
NormFloat64-32           3.761n ± 1%     3.606n ± 1%     -4.15%   (p=0.000 n=20)
Perm3-32                 33.55n ± 2%     33.12n ± 2%     ~        (p=0.402 n=20)
Perm30-32                173.2n ± 1%     176.1n ± 1%     +1.67%   (p=0.000 n=20)
Perm30ViaShuffle-32      115.9n ± 1%     109.3n ± 1%     -5.69%   (p=0.000 n=20)
ShuffleOverhead-32       101.9n ± 1%     112.5n ± 1%     +10.35%  (p=0.000 n=20)
Concurrent-32            2.107n ± 6%     2.099n ± 0%     ~        (p=0.051 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                         │ 2703446c2e.arm64 │ e1bbe739fb.arm64                      │
                         │ sec/op           │ sec/op          vs base               │
SourceUint64-8           2.275n ± 0%     2.290n ± 1%     ~        (p=0.044 n=20)
GlobalInt64-8            2.154n ± 1%     2.180n ± 1%     ~        (p=0.068 n=20)
GlobalInt64Parallel-8    0.4298n ± 0%    0.4294n ± 0%    ~        (p=0.079 n=20)
GlobalUint64-8           2.160n ± 1%     2.170n ± 1%     ~        (p=0.129 n=20)
GlobalUint64Parallel-8   0.4286n ± 0%    0.4283n ± 0%    ~        (p=0.350 n=20)
Int64-8                  2.491n ± 1%     2.481n ± 1%     ~        (p=0.330 n=20)
Uint64-8                 2.458n ± 0%     2.464n ± 1%     ~        (p=0.351 n=20)
GlobalIntN1000-8         2.814n ± 2%     2.814n ± 0%     ~        (p=0.325 n=20)
IntN1000-8               2.933n ± 0%     2.934n ± 2%     ~        (p=0.079 n=20)
Int64N1000-8             2.962n ± 1%     2.957n ± 1%     ~        (p=0.259 n=20)
Int64N1e8-8              2.960n ± 1%     2.935n ± 2%     ~        (p=0.276 n=20)
Int64N1e9-8              2.935n ± 2%     2.935n ± 2%     ~        (p=0.984 n=20)
Int64N2e9-8              2.934n ± 0%     2.933n ± 4%     ~        (p=0.463 n=20)
Int64N1e18-8             3.777n ± 1%     3.781n ± 1%     ~        (p=0.516 n=20)
Int64N2e18-8             4.359n ± 1%     4.362n ± 0%     ~        (p=0.256 n=20)
Int64N4e18-8             6.536n ± 1%     6.576n ± 1%     ~        (p=0.224 n=20)
Int32N1000-8             2.937n ± 0%     2.942n ± 2%     ~        (p=0.312 n=20)
Int32N1e8-8              2.937n ± 1%     2.941n ± 1%     ~        (p=0.463 n=20)
Int32N1e9-8              2.936n ± 0%     2.938n ± 2%     ~        (p=0.044 n=20)
Int32N2e9-8              2.938n ± 2%     2.982n ± 2%     ~        (p=0.174 n=20)
Float32-8                3.441n ± 0%     3.441n ± 0%     ~        (p=0.064 n=20)
Float64-8                3.441n ± 0%     3.441n ± 0%     ~        (p=0.826 n=20)
ExpFloat64-8             4.486n ± 0%     4.472n ± 0%     -0.31%   (p=0.000 n=20)
NormFloat64-8            4.721n ± 0%     4.716n ± 0%     ~        (p=0.051 n=20)
Perm3-8                  26.65n ± 0%     26.66n ± 0%     ~        (p=0.080 n=20)
Perm30-8                 143.2n ± 0%     143.3n ± 0%     +0.10%   (p=0.000 n=20)
Perm30ViaShuffle-8       143.0n ± 0%     142.9n ± 0%     ~        (p=0.642 n=20)
ShuffleOverhead-8        120.6n ± 1%     121.1n ± 1%     +0.41%   (p=0.010 n=20)
Concurrent-8             2.399n ± 5%     2.379n ± 2%     ~        (p=0.365 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 2703446c2e.386 │ e1bbe739fb.386                          │
                         │ sec/op         │ sec/op            vs base               │
SourceUint64-32          2.072n ± 2%     2.087n ± 1%     ~        (p=0.440 n=20)
GlobalInt64-32           3.546n ± 27%    3.538n ± 2%     ~        (p=0.101 n=20)
GlobalInt64Parallel-32   0.3211n ± 0%    0.3207n ± 1%    ~        (p=0.753 n=20)
GlobalUint64-32          3.522n ± 2%     3.543n ± 1%     ~        (p=0.071 n=20)
GlobalUint64Parallel-32  0.3172n ± 0%    0.3170n ± 0%    ~        (p=0.507 n=20)
Int64-32                 2.520n ± 2%     2.548n ± 1%     ~        (p=0.267 n=20)
Uint64-32                2.581n ± 1%     2.565n ± 2%     ~        (p=0.143 n=20)
GlobalIntN1000-32        6.171n ± 1%     6.300n ± 1%     ~        (p=0.037 n=20)
IntN1000-32              4.752n ± 2%     4.750n ± 0%     ~        (p=0.984 n=20)
Int64N1000-32            5.429n ± 1%     5.515n ± 2%     ~        (p=0.292 n=20)
Int64N1e8-32             5.469n ± 2%     5.527n ± 0%     ~        (p=0.013 n=20)
Int64N1e9-32             5.489n ± 2%     5.531n ± 2%     ~        (p=0.256 n=20)
Int64N2e9-32             5.492n ± 2%     5.514n ± 2%     ~        (p=0.606 n=20)
Int64N1e18-32            8.927n ± 1%     9.059n ± 1%     ~        (p=0.229 n=20)
Int64N2e18-32            9.622n ± 1%     9.594n ± 1%     ~        (p=0.703 n=20)
Int64N4e18-32            12.03n ± 1%     12.05n ± 2%     ~        (p=0.733 n=20)
Int32N1000-32            4.817n ± 1%     4.840n ± 2%     ~        (p=0.941 n=20)
Int32N1e8-32             4.801n ± 1%     4.832n ± 2%     ~        (p=0.228 n=20)
Int32N1e9-32             4.798n ± 1%     4.815n ± 2%     ~        (p=0.560 n=20)
Int32N2e9-32             4.840n ± 1%     4.813n ± 1%     ~        (p=0.015 n=20)
Float32-32               10.51n ± 4%     10.90n ± 2%     +3.71%   (p=0.007 n=20)
Float64-32               20.33n ± 3%     20.32n ± 4%     ~        (p=0.566 n=20)
ExpFloat64-32            12.59n ± 2%     12.95n ± 3%     +2.86%   (p=0.002 n=20)
NormFloat64-32           7.350n ± 2%     7.570n ± 1%     +2.99%   (p=0.007 n=20)
Perm3-32                 39.29n ± 2%     37.80n ± 2%     -3.79%   (p=0.000 n=20)
Perm30-32                219.1n ± 2%     214.0n ± 1%     -2.33%   (p=0.002 n=20)
Perm30ViaShuffle-32      189.8n ± 2%     188.7n ± 2%     ~        (p=0.147 n=20)
ShuffleOverhead-32       158.9n ± 2%     160.8n ± 1%     ~        (p=0.176 n=20)
Concurrent-32            3.306n ± 3%     3.288n ± 0%     -0.54%   (p=0.005 n=20)

For #61716.

Change-Id: I4c5fe710b310dc075ae21c97d1805bcc20db5050
Reviewed-on: https://go-review.googlesource.com/c/go/+/516275
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
2023-10-30  math/rand/v2: optimize Float32, Float64  Russ Cox
We realized too late after Go 1 that float64(r.Uint64())/(1<<64) is not a correct implementation: it occasionally rounds to 1. The correct implementation is float64(r.Uint64()&(1<<53-1))/(1<<53), but we couldn't change the implementation for compatibility, so we changed it to retry only in the "round to 1" cases.

The change to v2 lets us update the algorithm to the simpler, faster one.

Note that this implementation cannot generate 2⁻⁵⁴, nor 2⁻¹⁰⁰, nor any of the other numbers between 0 and 2⁻⁵³. A slower algorithm could shift some of the probability of generating the two boundary values, 0 and 2⁻⁵³, over to the values in between, but that would be much slower and not necessarily better. In particular, the current implementation has the property that there are uniform gaps between the possible returned floats, which might help stability. Also, the result is often scaled and shifted, like Float64()*X+Y. Multiplying by X>1 would open new gaps, and adding most Y would erase all the distinctions that were introduced.

The only changes to benchmarks should be in Float32 and Float64. The other changes remain a cautionary tale.
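The two formulas quoted above can be compared directly. The sketch below is illustrative only: float64Naive and float64From53 are hypothetical helper names wrapping the expressions from the message, and the all-ones input is chosen because it is the extreme case for rounding.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// float64Naive is the incorrect formula: a uint64 near 2^64 cannot be
// represented exactly as a float64, so the conversion rounds up to 2^64
// and the quotient becomes exactly 1.
func float64Naive(u uint64) float64 {
	return float64(u) / (1 << 64)
}

// float64From53 keeps only the low 53 bits, which a float64 holds exactly,
// so the result is always a multiple of 2^-53 in the range [0, 1).
func float64From53(u uint64) float64 {
	return float64(u&(1<<53-1)) / (1 << 53)
}

func main() {
	maxU := ^uint64(0)
	fmt.Println(float64Naive(maxU) == 1) // the "rounds to 1" failure
	fmt.Println(float64From53(maxU) < 1) // stays strictly below 1

	// An ordinary draw, using the top-level generator.
	fmt.Println(float64From53(rand.Uint64()))
}
```

The smallest nonzero value float64From53 can return is 2⁻⁵³ (for an input whose low 53 bits equal 1), which matches the gap described in the message.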
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 4d84a369d1.amd64 │ 2703446c2e.amd64                      │
                         │ sec/op           │ sec/op          vs base               │
SourceUint64-32          1.348n ± 2%     1.337n ± 1%     ~        (p=0.662 n=20)
GlobalInt64-32           2.082n ± 2%     2.225n ± 2%     +6.87%   (p=0.000 n=20)
GlobalInt64Parallel-32   0.1036n ± 1%    0.1043n ± 2%    ~        (p=0.171 n=20)
GlobalUint64-32          2.077n ± 2%     2.058n ± 1%     ~        (p=0.560 n=20)
GlobalUint64Parallel-32  0.1012n ± 1%    0.1009n ± 1%    ~        (p=0.995 n=20)
Int64-32                 1.750n ± 0%     1.719n ± 2%     -1.74%   (p=0.000 n=20)
Uint64-32                1.707n ± 2%     1.669n ± 1%     -2.20%   (p=0.000 n=20)
GlobalIntN1000-32        3.192n ± 1%     3.321n ± 2%     +4.04%   (p=0.000 n=20)
IntN1000-32              2.462n ± 2%     2.479n ± 1%     ~        (p=0.417 n=20)
Int64N1000-32            2.470n ± 1%     2.477n ± 1%     ~        (p=0.664 n=20)
Int64N1e8-32             2.503n ± 2%     2.490n ± 1%     ~        (p=0.245 n=20)
Int64N1e9-32             2.487n ± 1%     2.458n ± 1%     ~        (p=0.032 n=20)
Int64N2e9-32             2.487n ± 1%     2.486n ± 2%     ~        (p=0.507 n=20)
Int64N1e18-32            3.006n ± 2%     3.215n ± 2%     +6.94%   (p=0.000 n=20)
Int64N2e18-32            3.368n ± 1%     3.588n ± 2%     +6.55%   (p=0.000 n=20)
Int64N4e18-32            4.763n ± 1%     4.938n ± 2%     +3.69%   (p=0.000 n=20)
Int32N1000-32            2.403n ± 1%     2.673n ± 2%     +11.19%  (p=0.000 n=20)
Int32N1e8-32             2.405n ± 1%     2.631n ± 2%     +9.42%   (p=0.000 n=20)
Int32N1e9-32             2.402n ± 2%     2.628n ± 2%     +9.41%   (p=0.000 n=20)
Int32N2e9-32             2.384n ± 1%     2.684n ± 2%     +12.56%  (p=0.000 n=20)
Float32-32               2.641n ± 2%     2.240n ± 2%     -15.18%  (p=0.000 n=20)
Float64-32               2.483n ± 1%     2.253n ± 1%     -9.26%   (p=0.000 n=20)
ExpFloat64-32            3.486n ± 2%     3.677n ± 1%     +5.49%   (p=0.000 n=20)
NormFloat64-32           3.648n ± 1%     3.761n ± 1%     +3.11%   (p=0.000 n=20)
Perm3-32                 33.04n ± 1%     33.55n ± 2%     ~        (p=0.180 n=20)
Perm30-32                171.9n ± 1%     173.2n ± 1%     ~        (p=0.050 n=20)
Perm30ViaShuffle-32      100.3n ± 1%     115.9n ± 1%     +15.55%  (p=0.000 n=20)
ShuffleOverhead-32       102.5n ± 1%     101.9n ± 1%     ~        (p=0.266 n=20)
Concurrent-32            2.101n ± 0%     2.107n ± 6%     ~        (p=0.212 n=20)

goos: darwin
goarch: arm64
pkg: math/rand/v2
cpu: Apple M1
                         │ 4d84a369d1.arm64 │ 2703446c2e.arm64                      │
                         │ sec/op           │ sec/op          vs base               │
SourceUint64-8           2.261n ± 1%     2.275n ± 0%     ~        (p=0.082 n=20)
GlobalInt64-8            2.160n ± 1%     2.154n ± 1%     ~        (p=0.490 n=20)
GlobalInt64Parallel-8    0.4299n ± 0%    0.4298n ± 0%    ~        (p=0.663 n=20)
GlobalUint64-8           2.169n ± 1%     2.160n ± 1%     ~        (p=0.292 n=20)
GlobalUint64Parallel-8   0.4293n ± 1%    0.4286n ± 0%    ~        (p=0.155 n=20)
Int64-8                  2.473n ± 1%     2.491n ± 1%     ~        (p=0.317 n=20)
Uint64-8                 2.453n ± 1%     2.458n ± 0%     ~        (p=0.941 n=20)
GlobalIntN1000-8         2.814n ± 2%     2.814n ± 2%     ~        (p=0.972 n=20)
IntN1000-8               2.933n ± 2%     2.933n ± 0%     ~        (p=0.287 n=20)
Int64N1000-8             2.934n ± 2%     2.962n ± 1%     ~        (p=0.062 n=20)
Int64N1e8-8              2.935n ± 2%     2.960n ± 1%     ~        (p=0.183 n=20)
Int64N1e9-8              2.934n ± 2%     2.935n ± 2%     ~        (p=0.367 n=20)
Int64N2e9-8              2.935n ± 2%     2.934n ± 0%     ~        (p=0.455 n=20)
Int64N1e18-8             3.778n ± 1%     3.777n ± 1%     ~        (p=0.995 n=20)
Int64N2e18-8             4.359n ± 1%     4.359n ± 1%     ~        (p=0.122 n=20)
Int64N4e18-8             6.546n ± 1%     6.536n ± 1%     ~        (p=0.920 n=20)
Int32N1000-8             2.940n ± 2%     2.937n ± 0%     ~        (p=0.149 n=20)
Int32N1e8-8              2.937n ± 2%     2.937n ± 1%     ~        (p=0.620 n=20)
Int32N1e9-8              2.938n ± 0%     2.936n ± 0%     ~        (p=0.046 n=20)
Int32N2e9-8              2.938n ± 2%     2.938n ± 2%     ~        (p=0.455 n=20)
Float32-8                3.486n ± 0%     3.441n ± 0%     -1.28%   (p=0.000 n=20)
Float64-8                3.480n ± 0%     3.441n ± 0%     -1.13%   (p=0.000 n=20)
ExpFloat64-8             4.533n ± 0%     4.486n ± 0%     -1.03%   (p=0.000 n=20)
NormFloat64-8            4.764n ± 0%     4.721n ± 0%     -0.90%   (p=0.000 n=20)
Perm3-8                  26.66n ± 0%     26.65n ± 0%     ~        (p=0.019 n=20)
Perm30-8                 143.4n ± 0%     143.2n ± 0%     -0.17%   (p=0.000 n=20)
Perm30ViaShuffle-8       142.9n ± 0%     143.0n ± 0%     ~        (p=0.522 n=20)
ShuffleOverhead-8        120.7n ± 0%     120.6n ± 1%     ~        (p=0.488 n=20)
Concurrent-8             2.360n ± 2%     2.399n ± 5%     ~        (p=0.062 n=20)

goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
                         │ 4d84a369d1.386 │ 2703446c2e.386                          │
                         │ sec/op         │ sec/op            vs base               │
SourceUint64-32          2.101n ± 2%     2.072n ± 2%     ~        (p=0.273 n=20)
GlobalInt64-32           3.518n ± 2%     3.546n ± 27%    +0.78%   (p=0.007 n=20)
GlobalInt64Parallel-32   0.3206n ± 0%    0.3211n ± 0%    ~        (p=0.386 n=20)
GlobalUint64-32          3.538n ± 1%     3.522n ± 2%     ~        (p=0.331 n=20)
GlobalUint64Parallel-32  0.3231n ± 0%    0.3172n ± 0%    -1.84%   (p=0.000 n=20)
Int64-32                 2.554n ± 2%     2.520n ± 2%     ~        (p=0.465 n=20)
Uint64-32                2.575n ± 2%     2.581n ± 1%     ~        (p=0.213 n=20)
GlobalIntN1000-32        6.292n ± 1%     6.171n ± 1%     ~        (p=0.015 n=20)
IntN1000-32              4.735n ± 1%     4.752n ± 2%     ~        (p=0.635 n=20)
Int64N1000-32            5.489n ± 2%     5.429n ± 1%     ~        (p=0.324 n=20)
Int64N1e8-32             5.528n ± 2%     5.469n ± 2%     ~        (p=0.013 n=20)
Int64N1e9-32             5.438n ± 2%     5.489n ± 2%     ~        (p=0.984 n=20)
Int64N2e9-32             5.474n ± 1%     5.492n ± 2%     ~        (p=0.616 n=20)
Int64N1e18-32            9.053n ± 1%     8.927n ± 1%     ~        (p=0.037 n=20)
Int64N2e18-32            9.685n ± 2%     9.622n ± 1%     ~        (p=0.449 n=20)
Int64N4e18-32            12.18n ± 1%     12.03n ± 1%     ~        (p=0.013 n=20)
Int32N1000-32            4.862n ± 1%     4.817n ± 1%     -0.94%   (p=0.002 n=20)
Int32N1e8-32             4.758n ± 2%     4.801n ± 1%     ~        (p=0.597 n=20)
Int32N1e9-32             4.772n ± 1%     4.798n ± 1%     ~        (p=0.774 n=20)
Int32N2e9-32             4.847n ± 0%     4.840n ± 1%     ~        (p=0.867 n=20)
Float32-32               22.18n ± 4%     10.51n ± 4%     -52.61%  (p=0.000 n=20)
Float64-32               21.21n ± 3%     20.33n ± 3%     -4.17%   (p=0.000 n=20)
ExpFloat64-32            12.39n ± 2%     12.59n ± 2%     ~        (p=0.139 n=20)
NormFloat64-32           7.422n ± 1%     7.350n ± 2%     ~        (p=0.208 n=20)
Perm3-32                 38.00n ± 2%     39.29n ± 2%     +3.38%   (p=0.000 n=20)
Perm30-32                212.7n ± 1%     219.1n ± 2%     +3.03%   (p=0.001 n=20)
Perm30ViaShuffle-32      187.5n ± 2%     189.8n ± 2%     ~        (p=0.457 n=20)
ShuffleOverhead-32       159.7n ± 1%     158.9n ± 2%     ~        (p=0.920 n=20)
Concurrent-32            3.470n ± 0%     3.306n ± 3%     -4.71%   (p=0.000 n=20)

For #61716.

Change-Id: I1933f1f9efd7e6e832d83e7fa5d84398f67d41f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/502503
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30  math/rand/v2: add, optimize N, UintN, Uint32N, Uint64N  Russ Cox
Now that we can break the value stream, we can take advantage of better algorithms that have been suggested since the original code was written. Also optimizes IntN, Int32N, Int64N, Perm (indirectly). All the N variants (IntN, Int32N, Int64N, UintN, N, etc) now return the same values given a Source and parameter n, so that for example uint(r.IntN(10)) and r.UintN(10) and r.N(uint(10)) are completely interchangeable. Int64N4e18 gets slower but that is a near worst case for the algorithm and is extremely unlikely in practice. 32-bit Int32N variants got slower too, by 15-30%, in exchange for speeding up everything on 64-bit systems and consistency across the N functions. Also rename previously missed benchmark GlobalInt63Parallel to GlobalInt64Parallel. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 11ad9fdddc.amd64 │ 4d84a369d1.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.335n ± 1% 1.348n ± 2% ~ (p=0.335 n=20) GlobalInt64-32 2.046n ± 1% 2.082n ± 2% ~ (p=0.310 n=20) GlobalInt63Parallel-32 0.1037n ± 1% GlobalInt64Parallel-32 0.1036n ± 1% GlobalUint64-32 2.075n ± 0% 2.077n ± 2% ~ (p=0.228 n=20) GlobalUint64Parallel-32 0.1013n ± 1% 0.1012n ± 1% ~ (p=0.878 n=20) Int64-32 1.726n ± 2% 1.750n ± 0% +1.39% (p=0.000 n=20) Uint64-32 1.673n ± 1% 1.707n ± 2% +2.03% (p=0.002 n=20) GlobalIntN1000-32 3.895n ± 2% 3.192n ± 1% -18.05% (p=0.000 n=20) IntN1000-32 3.403n ± 1% 2.462n ± 2% -27.65% (p=0.000 n=20) Int64N1000-32 3.053n ± 2% 2.470n ± 1% -19.11% (p=0.000 n=20) Int64N1e8-32 2.718n ± 1% 2.503n ± 2% -7.91% (p=0.000 n=20) Int64N1e9-32 2.712n ± 1% 2.487n ± 1% -8.31% (p=0.000 n=20) Int64N2e9-32 2.690n ± 1% 2.487n ± 1% -7.57% (p=0.000 n=20) Int64N1e18-32 3.084n ± 2% 3.006n ± 2% -2.53% (p=0.000 n=20) Int64N2e18-32 4.026n ± 1% 3.368n ± 1% -16.33% (p=0.000 n=20) Int64N4e18-32 4.049n ± 2% 4.763n ± 1% +17.62% (p=0.000 n=20) Int32N1000-32 2.730n ± 0% 2.403n ± 1% -11.94% (p=0.000 n=20) Int32N1e8-32 2.916n ± 2% 2.405n ± 1% -17.53% (p=0.000 n=20) 
Int32N1e9-32 3.375n ± 1% 2.402n ± 2% -28.83% (p=0.000 n=20) Int32N2e9-32 3.292n ± 1% 2.384n ± 1% -27.58% (p=0.000 n=20) Float32-32 2.673n ± 1% 2.641n ± 2% ~ (p=0.147 n=20) Float64-32 2.485n ± 1% 2.483n ± 1% ~ (p=0.804 n=20) ExpFloat64-32 3.577n ± 2% 3.486n ± 2% -2.57% (p=0.000 n=20) NormFloat64-32 3.797n ± 2% 3.648n ± 1% -3.92% (p=0.000 n=20) Perm3-32 35.79n ± 2% 33.04n ± 1% -7.68% (p=0.000 n=20) Perm30-32 205.1n ± 1% 171.9n ± 1% -16.14% (p=0.000 n=20) Perm30ViaShuffle-32 111.2n ± 2% 100.3n ± 1% -9.76% (p=0.000 n=20) ShuffleOverhead-32 100.5n ± 2% 102.5n ± 1% +1.99% (p=0.007 n=20) Concurrent-32 2.188n ± 5% 2.101n ± 0% ~ (p=0.013 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 11ad9fdddc.arm64 │ 4d84a369d1.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.272n ± 1% 2.261n ± 1% ~ (p=0.172 n=20) GlobalInt64-8 2.155n ± 1% 2.160n ± 1% ~ (p=0.482 n=20) GlobalInt63Parallel-8 0.4352n ± 0% GlobalInt64Parallel-8 0.4299n ± 0% GlobalUint64-8 2.173n ± 1% 2.169n ± 1% ~ (p=0.262 n=20) GlobalUint64Parallel-8 0.4340n ± 0% 0.4293n ± 1% -1.08% (p=0.000 n=20) Int64-8 2.544n ± 1% 2.473n ± 1% -2.83% (p=0.000 n=20) Uint64-8 2.552n ± 1% 2.453n ± 1% -3.90% (p=0.000 n=20) GlobalIntN1000-8 3.856n ± 0% 2.814n ± 2% -27.02% (p=0.000 n=20) IntN1000-8 3.820n ± 0% 2.933n ± 2% -23.22% (p=0.000 n=20) Int64N1000-8 3.219n ± 2% 2.934n ± 2% -8.85% (p=0.000 n=20) Int64N1e8-8 3.221n ± 2% 2.935n ± 2% -8.91% (p=0.000 n=20) Int64N1e9-8 3.276n ± 2% 2.934n ± 2% -10.44% (p=0.000 n=20) Int64N2e9-8 3.217n ± 0% 2.935n ± 2% -8.78% (p=0.000 n=20) Int64N1e18-8 3.502n ± 2% 3.778n ± 1% +7.91% (p=0.000 n=20) Int64N2e18-8 4.968n ± 1% 4.359n ± 1% -12.26% (p=0.000 n=20) Int64N4e18-8 4.963n ± 0% 6.546n ± 1% +31.92% (p=0.000 n=20) Int32N1000-8 3.189n ± 1% 2.940n ± 2% -7.81% (p=0.000 n=20) Int32N1e8-8 3.514n ± 1% 2.937n ± 2% -16.41% (p=0.000 n=20) Int32N1e9-8 4.133n ± 0% 2.938n ± 0% -28.91% (p=0.000 n=20) Int32N2e9-8 4.137n ± 0% 2.938n ± 2% -28.97% (p=0.000 n=20) Float32-8 3.468n ± 1% 3.486n ± 0% +0.52% 
(p=0.000 n=20) Float64-8 3.478n ± 0% 3.480n ± 0% ~ (p=0.063 n=20) ExpFloat64-8 4.563n ± 0% 4.533n ± 0% -0.67% (p=0.000 n=20) NormFloat64-8 4.768n ± 0% 4.764n ± 0% -0.07% (p=0.001 n=20) Perm3-8 28.94n ± 0% 26.66n ± 0% -7.88% (p=0.000 n=20) Perm30-8 175.9n ± 0% 143.4n ± 0% -18.50% (p=0.000 n=20) Perm30ViaShuffle-8 152.6n ± 1% 142.9n ± 0% -6.29% (p=0.000 n=20) ShuffleOverhead-8 119.6n ± 1% 120.7n ± 0% +0.96% (p=0.000 n=20) Concurrent-8 2.452n ± 3% 2.360n ± 2% -3.73% (p=0.007 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 11ad9fdddc.386 │ 4d84a369d1.386 │ │ sec/op │ sec/op vs base │ SourceUint64-32 2.091n ± 1% 2.101n ± 2% ~ (p=0.672 n=20) GlobalInt64-32 3.514n ± 2% 3.518n ± 2% ~ (p=0.723 n=20) GlobalInt63Parallel-32 0.3197n ± 0% GlobalInt64Parallel-32 0.3206n ± 0% GlobalUint64-32 3.542n ± 1% 3.538n ± 1% ~ (p=0.304 n=20) GlobalUint64Parallel-32 0.3218n ± 0% 0.3231n ± 0% ~ (p=0.071 n=20) Int64-32 2.552n ± 2% 2.554n ± 2% ~ (p=0.693 n=20) Uint64-32 2.566n ± 1% 2.575n ± 2% ~ (p=0.606 n=20) GlobalIntN1000-32 5.965n ± 2% 6.292n ± 1% +5.46% (p=0.000 n=20) IntN1000-32 4.652n ± 1% 4.735n ± 1% +1.77% (p=0.000 n=20) Int64N1000-32 14.485n ± 1% 5.489n ± 2% -62.11% (p=0.000 n=20) Int64N1e8-32 14.675n ± 1% 5.528n ± 2% -62.33% (p=0.000 n=20) Int64N1e9-32 16.805n ± 2% 5.438n ± 2% -67.64% (p=0.000 n=20) Int64N2e9-32 14.515n ± 1% 5.474n ± 1% -62.28% (p=0.000 n=20) Int64N1e18-32 16.165n ± 1% 9.053n ± 1% -44.00% (p=0.000 n=20) Int64N2e18-32 17.945n ± 2% 9.685n ± 2% -46.03% (p=0.000 n=20) Int64N4e18-32 18.35n ± 2% 12.18n ± 1% -33.62% (p=0.000 n=20) Int32N1000-32 3.608n ± 1% 4.862n ± 1% +34.77% (p=0.000 n=20) Int32N1e8-32 3.767n ± 1% 4.758n ± 2% +26.31% (p=0.000 n=20) Int32N1e9-32 4.130n ± 2% 4.772n ± 1% +15.54% (p=0.000 n=20) Int32N2e9-32 4.206n ± 1% 4.847n ± 0% +15.24% (p=0.000 n=20) Float32-32 22.18n ± 4% 22.18n ± 4% ~ (p=0.195 n=20) Float64-32 20.75n ± 4% 21.21n ± 3% ~ (p=0.394 n=20) ExpFloat64-32 12.58n ± 3% 12.39n ± 2% ~ (p=0.032 n=20) 
NormFloat64-32 7.920n ± 3% 7.422n ± 1% -6.29% (p=0.000 n=20) Perm3-32 40.27n ± 1% 38.00n ± 2% -5.65% (p=0.000 n=20) Perm30-32 213.2n ± 2% 212.7n ± 1% ~ (p=0.995 n=20) Perm30ViaShuffle-32 164.2n ± 2% 187.5n ± 2% +14.22% (p=0.000 n=20) ShuffleOverhead-32 134.7n ± 2% 159.7n ± 1% +18.52% (p=0.000 n=20) Concurrent-32 3.301n ± 2% 3.470n ± 0% +5.10% (p=0.000 n=20) For #61716. Change-Id: Id1481b04202883cd0b23e21bb58d1bca4e482bd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/502500 Reviewed-by: Rob Pike <r@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
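The interchangeability claim in this commit message can be checked with a short program. This is a minimal sketch, assuming the released math/rand/v2 API (rand.New, rand.NewPCG, and the IntN, UintN, and Int64N methods); the helper name sameDraws is hypothetical. In the released package N is a top-level generic function (rand.N) that draws from the global generator, so it is left out of the seeded comparison.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// sameDraws reports whether IntN, UintN, and Int64N return the same values
// for the same bound when each is driven by an identically seeded generator.
// (Hypothetical helper, written for illustration only.)
func sameDraws(seed1, seed2 uint64, n, draws int) bool {
	a := rand.New(rand.NewPCG(seed1, seed2))
	b := rand.New(rand.NewPCG(seed1, seed2))
	c := rand.New(rand.NewPCG(seed1, seed2))
	for i := 0; i < draws; i++ {
		x := a.IntN(n)
		y := b.UintN(uint(n))
		z := c.Int64N(int64(n))
		if uint(x) != y || int64(x) != z {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameDraws(1, 2, 10, 1000))
}
```

Each generator gets its own PCG source so that the three methods consume identical underlying streams; sharing one generator would interleave the draws and hide the property.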
2023-10-30  math/rand/v2: change Source to use uint64  Russ Cox
This should make Uint64-using functions faster and leave other things alone. It is a mystery why so much got faster. A good cautionary tale not to read too much into minor jitter in the benchmarks. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.amd64 │ 11ad9fdddc.amd64 │ │ sec/op │ sec/op vs base │ SourceUint64-32 1.555n ± 1% 1.335n ± 1% -14.15% (p=0.000 n=20) GlobalInt64-32 2.071n ± 1% 2.046n ± 1% ~ (p=0.016 n=20) GlobalInt63Parallel-32 0.1023n ± 1% 0.1037n ± 1% +1.37% (p=0.002 n=20) GlobalUint64-32 5.193n ± 1% 2.075n ± 0% -60.06% (p=0.000 n=20) GlobalUint64Parallel-32 0.2341n ± 0% 0.1013n ± 1% -56.74% (p=0.000 n=20) Int64-32 2.056n ± 2% 1.726n ± 2% -16.10% (p=0.000 n=20) Uint64-32 2.077n ± 2% 1.673n ± 1% -19.46% (p=0.000 n=20) GlobalIntN1000-32 4.077n ± 2% 3.895n ± 2% -4.45% (p=0.000 n=20) IntN1000-32 3.476n ± 2% 3.403n ± 1% -2.10% (p=0.000 n=20) Int64N1000-32 3.059n ± 1% 3.053n ± 2% ~ (p=0.131 n=20) Int64N1e8-32 2.942n ± 1% 2.718n ± 1% -7.60% (p=0.000 n=20) Int64N1e9-32 2.932n ± 1% 2.712n ± 1% -7.50% (p=0.000 n=20) Int64N2e9-32 2.925n ± 1% 2.690n ± 1% -8.03% (p=0.000 n=20) Int64N1e18-32 3.116n ± 1% 3.084n ± 2% ~ (p=0.425 n=20) Int64N2e18-32 4.067n ± 1% 4.026n ± 1% -1.02% (p=0.007 n=20) Int64N4e18-32 4.054n ± 1% 4.049n ± 2% ~ (p=0.204 n=20) Int32N1000-32 2.951n ± 1% 2.730n ± 0% -7.49% (p=0.000 n=20) Int32N1e8-32 3.102n ± 1% 2.916n ± 2% -6.03% (p=0.000 n=20) Int32N1e9-32 3.535n ± 1% 3.375n ± 1% -4.54% (p=0.000 n=20) Int32N2e9-32 3.514n ± 1% 3.292n ± 1% -6.30% (p=0.000 n=20) Float32-32 2.760n ± 1% 2.673n ± 1% -3.13% (p=0.000 n=20) Float64-32 2.284n ± 1% 2.485n ± 1% +8.80% (p=0.000 n=20) ExpFloat64-32 3.757n ± 1% 3.577n ± 2% -4.78% (p=0.000 n=20) NormFloat64-32 3.837n ± 1% 3.797n ± 2% ~ (p=0.204 n=20) Perm3-32 35.23n ± 2% 35.79n ± 2% ~ (p=0.298 n=20) Perm30-32 208.8n ± 1% 205.1n ± 1% -1.82% (p=0.000 n=20) Perm30ViaShuffle-32 111.7n ± 1% 111.2n ± 2% ~ (p=0.273 n=20) ShuffleOverhead-32 101.1n ± 1% 100.5n ± 2% ~ 
(p=0.878 n=20) Concurrent-32 2.108n ± 7% 2.188n ± 5% ~ (p=0.417 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ 220860f76f.arm64 │ 11ad9fdddc.arm64 │ │ sec/op │ sec/op vs base │ SourceUint64-8 2.316n ± 1% 2.272n ± 1% -1.86% (p=0.000 n=20) GlobalInt64-8 2.183n ± 1% 2.155n ± 1% ~ (p=0.122 n=20) GlobalInt63Parallel-8 0.4331n ± 0% 0.4352n ± 0% +0.48% (p=0.000 n=20) GlobalUint64-8 4.377n ± 2% 2.173n ± 1% -50.35% (p=0.000 n=20) GlobalUint64Parallel-8 0.9237n ± 0% 0.4340n ± 0% -53.02% (p=0.000 n=20) Int64-8 2.538n ± 1% 2.544n ± 1% ~ (p=0.189 n=20) Uint64-8 2.604n ± 1% 2.552n ± 1% -1.98% (p=0.000 n=20) GlobalIntN1000-8 3.857n ± 2% 3.856n ± 0% ~ (p=0.051 n=20) IntN1000-8 3.822n ± 2% 3.820n ± 0% -0.05% (p=0.001 n=20) Int64N1000-8 3.318n ± 0% 3.219n ± 2% -2.98% (p=0.000 n=20) Int64N1e8-8 3.349n ± 1% 3.221n ± 2% -3.79% (p=0.000 n=20) Int64N1e9-8 3.317n ± 2% 3.276n ± 2% -1.24% (p=0.001 n=20) Int64N2e9-8 3.317n ± 2% 3.217n ± 0% -3.01% (p=0.000 n=20) Int64N1e18-8 3.542n ± 1% 3.502n ± 2% -1.16% (p=0.001 n=20) Int64N2e18-8 5.087n ± 0% 4.968n ± 1% -2.33% (p=0.000 n=20) Int64N4e18-8 5.084n ± 0% 4.963n ± 0% -2.39% (p=0.000 n=20) Int32N1000-8 3.208n ± 2% 3.189n ± 1% -0.58% (p=0.001 n=20) Int32N1e8-8 3.610n ± 1% 3.514n ± 1% -2.67% (p=0.000 n=20) Int32N1e9-8 4.235n ± 0% 4.133n ± 0% -2.40% (p=0.000 n=20) Int32N2e9-8 4.229n ± 1% 4.137n ± 0% -2.19% (p=0.000 n=20) Float32-8 3.468n ± 0% 3.468n ± 1% ~ (p=0.350 n=20) Float64-8 3.447n ± 0% 3.478n ± 0% +0.90% (p=0.000 n=20) ExpFloat64-8 4.567n ± 0% 4.563n ± 0% -0.10% (p=0.002 n=20) NormFloat64-8 4.821n ± 0% 4.768n ± 0% -1.09% (p=0.000 n=20) Perm3-8 28.89n ± 0% 28.94n ± 0% +0.17% (p=0.000 n=20) Perm30-8 175.7n ± 0% 175.9n ± 0% +0.14% (p=0.000 n=20) Perm30ViaShuffle-8 153.5n ± 0% 152.6n ± 1% ~ (p=0.010 n=20) ShuffleOverhead-8 119.8n ± 1% 119.6n ± 1% ~ (p=0.147 n=20) Concurrent-8 2.433n ± 3% 2.452n ± 3% ~ (p=0.616 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.386 │ 11ad9fdddc.386 │ │ 
sec/op │ sec/op vs base │ SourceUint64-32 2.370n ± 1% 2.091n ± 1% -11.75% (p=0.000 n=20) GlobalInt64-32 3.569n ± 1% 3.514n ± 2% -1.56% (p=0.000 n=20) GlobalInt63Parallel-32 0.3221n ± 1% 0.3197n ± 0% -0.76% (p=0.000 n=20) GlobalUint64-32 8.797n ± 10% 3.542n ± 1% -59.74% (p=0.000 n=20) GlobalUint64Parallel-32 0.6351n ± 0% 0.3218n ± 0% -49.33% (p=0.000 n=20) Int64-32 2.612n ± 2% 2.552n ± 2% -2.30% (p=0.000 n=20) Uint64-32 3.350n ± 1% 2.566n ± 1% -23.42% (p=0.000 n=20) GlobalIntN1000-32 5.892n ± 1% 5.965n ± 2% ~ (p=0.082 n=20) IntN1000-32 4.546n ± 1% 4.652n ± 1% +2.33% (p=0.000 n=20) Int64N1000-32 14.59n ± 1% 14.48n ± 1% ~ (p=0.652 n=20) Int64N1e8-32 14.76n ± 2% 14.67n ± 1% ~ (p=0.836 n=20) Int64N1e9-32 16.57n ± 1% 16.80n ± 2% ~ (p=0.016 n=20) Int64N2e9-32 14.54n ± 1% 14.52n ± 1% ~ (p=0.533 n=20) Int64N1e18-32 16.14n ± 1% 16.16n ± 1% ~ (p=0.606 n=20) Int64N2e18-32 18.10n ± 1% 17.95n ± 2% ~ (p=0.062 n=20) Int64N4e18-32 18.65n ± 1% 18.35n ± 2% -1.61% (p=0.010 n=20) Int32N1000-32 3.560n ± 1% 3.608n ± 1% +1.33% (p=0.001 n=20) Int32N1e8-32 3.770n ± 2% 3.767n ± 1% ~ (p=0.155 n=20) Int32N1e9-32 4.098n ± 0% 4.130n ± 2% ~ (p=0.016 n=20) Int32N2e9-32 4.179n ± 1% 4.206n ± 1% ~ (p=0.011 n=20) Float32-32 21.18n ± 4% 22.18n ± 4% +4.70% (p=0.003 n=20) Float64-32 20.60n ± 2% 20.75n ± 4% +0.73% (p=0.000 n=20) ExpFloat64-32 13.07n ± 0% 12.58n ± 3% -3.82% (p=0.000 n=20) NormFloat64-32 7.738n ± 2% 7.920n ± 3% ~ (p=0.066 n=20) Perm3-32 36.73n ± 1% 40.27n ± 1% +9.65% (p=0.000 n=20) Perm30-32 211.9n ± 1% 213.2n ± 2% ~ (p=0.262 n=20) Perm30ViaShuffle-32 165.2n ± 1% 164.2n ± 2% ~ (p=0.029 n=20) ShuffleOverhead-32 133.9n ± 1% 134.7n ± 2% ~ (p=0.551 n=20) Concurrent-32 3.287n ± 2% 3.301n ± 2% ~ (p=0.330 n=20) For #61716. 
Change-Id: I8d2f73f87dd3603a0c2ff069988938e0957b6904 Reviewed-on: https://go-review.googlesource.com/c/go/+/502499 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org>
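To illustrate what a uint64-based Source looks like from the caller's side, here is a minimal sketch assuming the released math/rand/v2 interface, where Source requires only a Uint64 method. The type counterSource and the helper draw are hypothetical and exist only for this example; the mixer is a SplitMix64-style function chosen for brevity, not the generator the package ships.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// counterSource is a hypothetical Source used only for illustration.
// It runs a SplitMix64-style mixer over an incrementing counter.
type counterSource struct {
	state uint64
}

// Uint64 is the only method the v2 Source interface requires.
func (s *counterSource) Uint64() uint64 {
	s.state += 0x9e3779b97f4a7c15
	z := s.state
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
	z = (z ^ (z >> 27)) * 0x94d049bb133111eb
	return z ^ (z >> 31)
}

// draw returns k values in [0, n) from a generator built on counterSource.
func draw(seed uint64, n, k int) []int {
	r := rand.New(&counterSource{state: seed})
	out := make([]int, k)
	for i := range out {
		out[i] = r.IntN(n)
	}
	return out
}

func main() {
	fmt.Println(draw(42, 100, 5))
}
```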
2023-10-30  math/rand/v2: update benchmarks  Russ Cox
Change the benchmarks to use the result of the calls, as I found that in certain cases inlining resulted in discarding part of the computation in the benchmark loop. Add various benchmarks that will be relevant in future CLs. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ 220860f76f.amd64 │ │ sec/op │ SourceUint64-32 1.555n ± 1% GlobalInt64-32 2.071n ± 1% GlobalInt63Parallel-32 0.1023n ± 1% GlobalUint64-32 5.193n ± 1% GlobalUint64Parallel-32 0.2341n ± 0% Int64-32 2.056n ± 2% Uint64-32 2.077n ± 2% GlobalIntN1000-32 4.077n ± 2% IntN1000-32 3.476n ± 2% Int64N1000-32 3.059n ± 1% Int64N1e8-32 2.942n ± 1% Int64N1e9-32 2.932n ± 1% Int64N2e9-32 2.925n ± 1% Int64N1e18-32 3.116n ± 1% Int64N2e18-32 4.067n ± 1% Int64N4e18-32 4.054n ± 1% Int32N1000-32 2.951n ± 1% Int32N1e8-32 3.102n ± 1% Int32N1e9-32 3.535n ± 1% Int32N2e9-32 3.514n ± 1% Float32-32 2.760n ± 1% Float64-32 2.284n ± 1% ExpFloat64-32 3.757n ± 1% NormFloat64-32 3.837n ± 1% Perm3-32 35.23n ± 2% Perm30-32 208.8n ± 1% Perm30ViaShuffle-32 111.7n ± 1% ShuffleOverhead-32 101.1n ± 1% Concurrent-32 2.108n ± 7% goos: darwin goarch: arm64 pkg: math/rand/v2 cpu: Apple M1 │ 220860f76f.arm64 │ │ sec/op │ SourceUint64-8 2.316n ± 1% GlobalInt64-8 2.183n ± 1% GlobalInt63Parallel-8 0.4331n ± 0% GlobalUint64-8 4.377n ± 2% GlobalUint64Parallel-8 0.9237n ± 0% Int64-8 2.538n ± 1% Uint64-8 2.604n ± 1% GlobalIntN1000-8 3.857n ± 2% IntN1000-8 3.822n ± 2% Int64N1000-8 3.318n ± 0% Int64N1e8-8 3.349n ± 1% Int64N1e9-8 3.317n ± 2% Int64N2e9-8 3.317n ± 2% Int64N1e18-8 3.542n ± 1% Int64N2e18-8 5.087n ± 0% Int64N4e18-8 5.084n ± 0% Int32N1000-8 3.208n ± 2% Int32N1e8-8 3.610n ± 1% Int32N1e9-8 4.235n ± 0% Int32N2e9-8 4.229n ± 1% Float32-8 3.468n ± 0% Float64-8 3.447n ± 0% ExpFloat64-8 4.567n ± 0% NormFloat64-8 4.821n ± 0% Perm3-8 28.89n ± 0% Perm30-8 175.7n ± 0% Perm30ViaShuffle-8 153.5n ± 0% ShuffleOverhead-8 119.8n ± 1% Concurrent-8 2.433n ± 3% goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 
16-Core Processor │ 220860f76f.386 │ │ sec/op │ SourceUint64-32 2.370n ± 1% GlobalInt64-32 3.569n ± 1% GlobalInt63Parallel-32 0.3221n ± 1% GlobalUint64-32 8.797n ± 10% GlobalUint64Parallel-32 0.6351n ± 0% Int64-32 2.612n ± 2% Uint64-32 3.350n ± 1% GlobalIntN1000-32 5.892n ± 1% IntN1000-32 4.546n ± 1% Int64N1000-32 14.59n ± 1% Int64N1e8-32 14.76n ± 2% Int64N1e9-32 16.57n ± 1% Int64N2e9-32 14.54n ± 1% Int64N1e18-32 16.14n ± 1% Int64N2e18-32 18.10n ± 1% Int64N4e18-32 18.65n ± 1% Int32N1000-32 3.560n ± 1% Int32N1e8-32 3.770n ± 2% Int32N1e9-32 4.098n ± 0% Int32N2e9-32 4.179n ± 1% Float32-32 21.18n ± 4% Float64-32 20.60n ± 2% ExpFloat64-32 13.07n ± 0% NormFloat64-32 7.738n ± 2% Perm3-32 36.73n ± 1% Perm30-32 211.9n ± 1% Perm30ViaShuffle-32 165.2n ± 1% ShuffleOverhead-32 133.9n ± 1% Concurrent-32 3.287n ± 2% For #61716. Change-Id: I2f0938eae4b7bf736a8cd899a99783e731bf2179 Reviewed-on: https://go-review.googlesource.com/c/go/+/502496 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
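The point about using the result of each call can be sketched as follows. This is an illustrative benchmark, not a copy of the ones in the CL: the names Sink, benchmarkUint64, and runBenchmark are assumptions made for the example. The loop accumulates every result and stores the total in a package-level variable, so the compiler cannot discard part of the computation after inlining.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"testing"
)

// Sink is a package-level variable. Storing the accumulated result here
// keeps the compiler from treating the loop body as dead code.
var Sink uint64

// benchmarkUint64 measures Rand.Uint64 while consuming every result.
func benchmarkUint64(b *testing.B) {
	r := rand.New(rand.NewPCG(1, 2))
	var t uint64
	for n := b.N; n > 0; n-- {
		t += r.Uint64()
	}
	Sink = t
}

// runBenchmark runs the benchmark outside "go test" and returns the
// number of iterations the testing package settled on.
func runBenchmark() int {
	return testing.Benchmark(benchmarkUint64).N
}

func main() {
	fmt.Println("iterations:", runBenchmark())
}
```

In a real test file the function would be named BenchmarkUint64 and run with go test -bench; testing.Benchmark is used here only so the sketch runs as a standalone program.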
2023-10-30  math/rand/v2: remove Rand.Seed  Russ Cox
Removing Rand.Seed lets us remove lockedSource as well, along with the ambiguity in globalRand about which source to use. For #61716. Change-Id: Ibe150520dd1e7dd87165eacaebe9f0c2daeaedfd Reviewed-on: https://go-review.googlesource.com/c/go/+/502498 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Rob Pike <r@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2023-10-30  math/rand/v2: clean up regression test  Russ Cox
Add more test cases. Replace -printgolden with -update, which rewrites the files for us. For #61716. Change-Id: I7c4c900ee896042429135a21971a56ebe16b6a66 Reviewed-on: https://go-review.googlesource.com/c/go/+/516858 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-30  math/rand/v2: remove Read  Russ Cox
In math/rand, Read is deprecated. Remove in v2. People should use crypto/rand if they need long strings. For #61716. Change-Id: Ib254b7e1844616e96db60a3a7abb572b0dcb1583 Reviewed-on: https://go-review.googlesource.com/c/go/+/502497 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
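For callers that relied on Read, the replacement the message points to is crypto/rand. A minimal sketch, assuming only the standard crypto/rand.Read function; the helper name randomBytes is hypothetical:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomBytes fills a new slice of length n from the operating system's
// cryptographically secure source.
func randomBytes(n int) ([]byte, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	b, err := randomBytes(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hex.EncodeToString(b))
}
```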
2023-10-30  math/rand/v2: rename various functions  Russ Cox
Int31 -> Int32
Int31n -> Int32N
Int63 -> Int64
Int63n -> Int64N
Intn -> IntN

The 31 and 63 are pedantic and confusing: the functions should be named for the type they return, same as all the others.

The lower-case n is inconsistent with Go's usual CamelCase and especially problematic because we plan to add 'func N'. Capitalize the n.

For #61716. Change-Id: Idb1a005a82f353677450d47fb612ade7a41fde69 Reviewed-on: https://go-review.googlesource.com/c/go/+/516857 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
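The renames can be seen side by side in a small migration sketch. It assumes the released math/rand and math/rand/v2 packages; the import aliases randv1 and randv2 and the helpers rollV1 and rollV2 are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	randv1 "math/rand"
	randv2 "math/rand/v2"
)

// rollV1 uses the old spelling, Intn, and returns a die roll in [1, 6].
func rollV1() int { return randv1.Intn(6) + 1 }

// rollV2 uses the renamed v2 function, IntN, for the same roll.
func rollV2() int { return randv2.IntN(6) + 1 }

func main() {
	fmt.Println(rollV1(), rollV2())

	// The sized variants follow the same pattern:
	// Int63n -> Int64N and Int31n -> Int32N.
	fmt.Println(randv1.Int63n(1000), randv2.Int64N(1000))
	fmt.Println(randv1.Int31n(1000), randv2.Int32N(1000))
}
```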
2023-10-30  math/rand/v2: start of new API  Russ Cox
This is the beginning of the math/rand/v2 package from proposal #61716. Start by copying old API. This CL copies math/rand/* to math/rand/v2 and updates references to math/rand to add v2 throughout. Later CLs will make the v2 changes. For #61716. Change-Id: I1624ccffae3dfa442d4ba2461942decbd076e11b Reviewed-on: https://go-review.googlesource.com/c/go/+/502495 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org>
2023-10-19  math: add available godoc link  cui fliter
Change-Id: I4a6c2ef6fd21355952ab7d8eaad883646a95d364 Reviewed-on: https://go-review.googlesource.com/c/go/+/535087 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com>
2023-08-17  math/big, math/rand: use the built-in max function  chanxuehong
Change-Id: I71a38dd20bfaf2b1aed18892d54eeb017d3d7d66 GitHub-Last-Rev: 8da43b2cbd563ed123690709e519c9f84272b332 GitHub-Pull-Request: golang/go#61955 Reviewed-on: https://go-review.googlesource.com/c/go/+/518595 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com>
2023-04-04  math/rand: clarify Seed deprecation note  Ian Lance Taylor
Fixes #59331 Change-Id: I62156be2f2758c59349c3b02db6cf9140429c9e3 Reviewed-on: https://go-review.googlesource.com/c/go/+/481915 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Bypass: Ian Lance Taylor <iant@google.com> Reviewed-by: Russ Cox <rsc@golang.org>
2023-04-04  all: fix misuses of "a" vs "an"  cui fliter
Fixes the misuse of "a" vs "an", according to English grammatical expectations and using https://www.a-or-an.com/ Change-Id: I53ac724070e3ff3d33c304483fe72c023c7cda47 Reviewed-on: https://go-review.googlesource.com/c/go/+/480536 Run-TryBot: shuang cui <imcusg@gmail.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2023-02-10  math/rand: fix typo in Seed deprecation comment  Valentin Deleplace
Change-Id: I37a9e4362953a711840087e1b7b8d7a25f1a83b7 Reviewed-on: https://go-review.googlesource.com/c/go/+/467275 Reviewed-by: Russ Cox <rsc@golang.org> TryBot-Bypass: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org>
2023-02-09  math/rand: rewrite the math/rand package comment to say what it's good for  Rob Pike
It currently says only what it isn't good for, which is not helpful. Change-Id: I468c7f385c14eaca99788a94d53c30b729ed0944 Reviewed-on: https://go-review.googlesource.com/c/go/+/466276 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com>
2023-02-07  math/rand: use fastrand64 if possible  Ian Lance Taylor
Now that the top-level math/rand functions are auto-seeded by default (issue #54880), use the runtime fastrand64 function when 1) Seed has not been called; 2) the GODEBUG randautoseed=0 is not used.

The benchmarks run quickly and are relatively noisy, but they show significant improvements for parallel calls to the top-level functions.

goos: linux
goarch: amd64
pkg: math/rand
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                           │   /tmp/foo.1    │              /tmp/foo.2               │
                           │     sec/op      │     sec/op      vs base               │
Int63Threadsafe-16             11.605n ±  2%     3.094n ±  1%  -73.34% (p=0.000 n=10)
Int63ThreadsafeParallel-16    67.8350n ±  2%    0.4000n ±  1%  -99.41% (p=0.000 n=10)
Int63Unthreadsafe-16            1.947n ±  3%     1.924n ±  2%        ~ (p=0.189 n=10)
Intn1000-16                     4.295n ±  2%     4.287n ±  3%        ~ (p=0.517 n=10)
Int63n1000-16                   4.379n ±  0%     4.192n ±  2%   -4.27% (p=0.000 n=10)
Int31n1000-16                   3.641n ±  3%     3.506n ±  0%   -3.69% (p=0.000 n=10)
Float32-16                      3.330n ±  7%     3.250n ±  2%   -2.40% (p=0.017 n=10)
Float64-16                      2.194n ±  6%     2.056n ±  4%   -6.31% (p=0.004 n=10)
Perm3-16                        43.39n ±  9%     38.28n ± 12%  -11.77% (p=0.015 n=10)
Perm30-16                       324.4n ±  6%     315.9n ± 19%        ~ (p=0.315 n=10)
Perm30ViaShuffle-16             175.4n ±  1%     143.6n ±  2%  -18.15% (p=0.000 n=10)
ShuffleOverhead-16              223.4n ±  2%     215.8n ±  1%   -3.38% (p=0.000 n=10)
Read3-16                        5.428n ±  3%     5.406n ±  2%        ~ (p=0.780 n=10)
Read64-16                       41.55n ±  5%     40.14n ±  3%   -3.38% (p=0.000 n=10)
Read1000-16                     622.9n ±  4%     594.9n ±  2%   -4.50% (p=0.000 n=10)
Concurrent-16                 136.300n ±  2%     4.647n ± 26%  -96.59% (p=0.000 n=10)
geomean                         23.40n           12.15n        -48.08%

Fixes #49892 Change-Id: Iba75b326145512ab0b7ece233b98ac3d4e1fb504 Reviewed-on: https://go-review.googlesource.com/c/go/+/465037 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com>
2023-01-19  internal/godebug: export non-default-behavior counters in runtime/metrics  Russ Cox
Allow GODEBUG users to report how many times a setting resulted in non-default behavior. Record non-default-behaviors for all existing GODEBUGs. Also rework tests to ensure that runtime is in sync with runtime/metrics.All, and generate docs mechanically from metrics.All. For #56986. Change-Id: Iefa1213e2a5c3f19ea16cd53298c487952ef05a4 Reviewed-on: https://go-review.googlesource.com/c/go/+/453618 TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-16  math/rand: deprecate Seed  Russ Cox
Programs that call Seed and then expect a specific sequence of results from the global random source (using functions such as Int) can be broken when a dependency changes how much it consumes from the global random source. To avoid such breakages, programs that need a specific result sequence should use New(NewSource(seed)) to obtain a random generator that other packages cannot access. Fixes #56319. Change-Id: Idac33991b719d2c71f109f51dacb3467a649e01e Reviewed-on: https://go-review.googlesource.com/c/go/+/451375 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Russ Cox <rsc@golang.org>
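The recommendation above can be sketched with a private generator. This example assumes math/rand's New and NewSource functions; the helper name sequence is hypothetical. Because the generator is local, no other package can consume values from it, so the sequence depends only on the seed.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sequence returns k values in [0, 1000) from a generator that only this
// function can reach, so other packages cannot shift its position.
func sequence(seed int64, k int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, k)
	for i := range out {
		out[i] = r.Intn(1000)
	}
	return out
}

func main() {
	fmt.Println(sequence(1, 5))
	fmt.Println(sequence(1, 5)) // same seed, same sequence
}
```

A *rand.Rand created this way is not safe for concurrent use, unlike the top-level functions, so each goroutine would need its own generator or external locking.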
2022-11-14  internal/godebug: define more efficient API  Russ Cox
We have been expanding our use of GODEBUG for compatibility, and the current implementation forces a tradeoff between freshness and efficiency. It parses the environment variable in full each time it is called, which is expensive. But if clients cache the result, they won't respond to run-time GODEBUG changes, as happened with x509sha1 (#56436). This CL changes the GODEBUG API to provide efficient, up-to-date results. Instead of a single Get function, New returns a *godebug.Setting that itself has a Get method. Clients can save the result of New, which is no more expensive than errors.New, in a global variable, and then call that variable's Get method to get the value. Get costs only two atomic loads in the case where the variable hasn't changed since the last call. Unfortunately, these changes do require importing sync from godebug, which will mean that sync itself will never be able to use a GODEBUG setting. That doesn't seem like such a hardship. If it was really necessary, the runtime could pass a setting to package sync itself at startup, with the caveat that that setting, like the ones used by runtime itself, would not respond to run-time GODEBUG changes. Change-Id: I99a3acfa24fb2a692610af26a5d14bbc62c966ac Reviewed-on: https://go-review.googlesource.com/c/go/+/449504 Run-TryBot: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-27  math/rand: deprecate Read  hopehook
For #20661. Change-Id: I1e638cb619e643eadc210d71f92bd1af7bafc912 Reviewed-on: https://go-review.googlesource.com/c/go/+/436955 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: hopehook <hopehook@golangcn.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com>
2022-10-26  all: remove uses of rand.Seed  Russ Cox
As of CL 443058, calling rand.Seed is no longer necessary, nor is it a particularly good idea. For #54880. Change-Id: If9d70763622c09008599db8c97a90fcbe285c6f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/445395 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-10-25  math/rand: auto-seed global source  Russ Cox
Implement proposal #54880, to automatically seed the global source. The justification for this not being a breaking change is that any use of the global source in a package's init function or exported API clearly must be valid - that is, if a package changes how much randomness it consumes at init time or in an exported API, that clearly isn't the kind of breaking change that requires issuing a v2 of that package. That kind of per-package change in the position of the global source is indistinguishable from seeding the global source differently. So if the per-package change is valid, so is auto-seeding. And then, of course, auto-seeding means that packages will be far less likely to depend on the specific results of the global source and therefore not break when those kinds of per-package changes happen in the future. Seed(1) can be called in programs that need the old sequence from the global source and want to restore the old behavior. Of course, those programs will still be broken by the per-package changes just described, and it would be better for them to allocate local sources rather than continue to use the global one. Fixes #54880. Change-Id: Ib9dc3307b97f7a45587a9cc50d81f919d3edc7ae Reviewed-on: https://go-review.googlesource.com/c/go/+/443058 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Russ Cox <rsc@golang.org>
2022-10-18  math/rand: refactor to delay allocation of global source  Russ Cox
This sets up for delaying the decision of which seed to use, but this CL still keeps the original global Seed(1) semantics. Preparation for #54880. Change-Id: Ibfa9d50ec9023aa755a83852e55168fa7d24b115 Reviewed-on: https://go-review.googlesource.com/c/go/+/443057 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Russ Cox <rsc@golang.org>
2022-09-02  math/rand: document that Source returned by NewSource implements Source64  Jonathan FOng
Fixes #44488 Change-Id: I570950799788678b9dc6e9ddad894973b4611e09 Reviewed-on: https://go-review.googlesource.com/c/go/+/425974 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Joseph Tsai <joetsai@digital-static.net> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joseph Tsai <joetsai@digital-static.net> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: David Chase <drchase@google.com>
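The documented guarantee lets callers type-assert the result of NewSource. A minimal sketch, assuming math/rand's Source and Source64 interfaces; the helpers uint64FromSource and newSourceIsSource64 are hypothetical names for the example:

```go
package main

import (
	"fmt"
	"math/rand"
)

// uint64FromSource returns a 64-bit value from src and reports whether
// the Source64 fast path was available.
func uint64FromSource(src rand.Source) (uint64, bool) {
	if s64, ok := src.(rand.Source64); ok {
		return s64.Uint64(), true
	}
	// Fallback for sources that only provide Int63: combine two draws.
	return uint64(src.Int63())>>31 | uint64(src.Int63())<<32, false
}

// newSourceIsSource64 reports whether the value returned by NewSource
// satisfies Source64.
func newSourceIsSource64() bool {
	_, ok := uint64FromSource(rand.NewSource(1))
	return ok
}

func main() {
	v, fast := uint64FromSource(rand.NewSource(1))
	fmt.Println(v, fast)
}
```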
2022-04-11  all: gofmt main repo  Russ Cox
[This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-01  all: remove trailing blank doc comment lines  Russ Cox
A future change to gofmt will rewrite

	// Doc comment.
	//
	func f()

to

	// Doc comment.
	func f()

Apply that change preemptively to all doc comments.

For #51082. Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d Reviewed-on: https://go-review.googlesource.com/c/go/+/384259 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>