| Age | Commit message (Collapse) | Author |
|
Benchmark on Loongson 3A6000 and 3A5000:
goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
CountSingle/10 13.210n ± 0% 9.984n ± 0% -24.42% (p=0.000 n=15)
CountSingle/32 31.970n ± 1% 7.205n ± 0% -77.46% (p=0.000 n=15)
CountSingle/4K 4039.0n ± 0% 108.7n ± 0% -97.31% (p=0.000 n=15)
CountSingle/4M 4158.9µ ± 0% 117.3µ ± 0% -97.18% (p=0.000 n=15)
CountSingle/64M 68.641m ± 0% 2.585m ± 1% -96.23% (p=0.000 n=15)
geomean 13.72µ 1.189µ -91.34%
| bench.old | bench.new |
| B/s | B/s vs base |
CountSingle/10 722.0Mi ± 0% 955.2Mi ± 0% +32.30% (p=0.000 n=15)
CountSingle/32 954.6Mi ± 1% 4235.4Mi ± 0% +343.68% (p=0.000 n=15)
CountSingle/4K 967.2Mi ± 0% 35947.6Mi ± 0% +3616.64% (p=0.000 n=15)
CountSingle/4M 961.8Mi ± 0% 34092.7Mi ± 0% +3444.71% (p=0.000 n=15)
CountSingle/64M 932.4Mi ± 0% 24757.2Mi ± 1% +2555.24% (p=0.000 n=15)
geomean 902.2Mi 10.17Gi +1054.77%
goos: linux
goarch: loong64
pkg: bytes
cpu: Loongson-3A5000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
CountSingle/10 14.41n ± 0% 12.81n ± 0% -11.10% (p=0.000 n=15)
CountSingle/32 36.230n ± 0% 9.609n ± 0% -73.48% (p=0.000 n=15)
CountSingle/4K 4366.0n ± 0% 165.5n ± 0% -96.21% (p=0.000 n=15)
CountSingle/4M 4464.7µ ± 0% 325.2µ ± 0% -92.72% (p=0.000 n=15)
CountSingle/64M 75.627m ± 0% 8.307m ± 69% -89.02% (p=0.000 n=15)
geomean 15.04µ 2.229µ -85.18%
| bench.old | bench.new |
| B/s | B/s vs base |
CountSingle/10 661.8Mi ± 0% 744.4Mi ± 0% +12.49% (p=0.000 n=15)
CountSingle/32 842.4Mi ± 0% 3176.1Mi ± 0% +277.03% (p=0.000 n=15)
CountSingle/4K 894.7Mi ± 0% 23596.7Mi ± 0% +2537.34% (p=0.000 n=15)
CountSingle/4M 895.9Mi ± 0% 12299.7Mi ± 0% +1272.88% (p=0.000 n=15)
CountSingle/64M 846.3Mi ± 0% 7703.9Mi ± 41% +810.34% (p=0.000 n=15)
geomean 823.3Mi 5.424Gi +574.68%
Change-Id: Ie07592beac61bdb093470c524049ed494df4d703
Reviewed-on: https://go-review.googlesource.com/c/go/+/586055
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Add a simple assembly implementation of Count/CountString for mips64x.
name old sec/op new sec/op vs base
CountSingle/10-4 31.16n ± 0% 41.69n ± 0% +33.79% (p=0.000 n=11)
CountSingle/32-4 69.58n ± 0% 59.61n ± 0% -14.33% (p=0.000 n=11)
CountSingle/4K-4 7.428µ ± 0% 5.153µ ± 0% -30.63% (p=0.000 n=11)
CountSingle/4M-4 7.634m ± 0% 5.300m ± 0% -30.58% (p=0.000 n=11)
CountSingle/64M-4 134.4m ± 0% 100.8m ± 3% -24.99% (p=0.000 n=11)
name old B/s new B/s vs base
CountSingle/10-4 306.1Mi ± 0% 228.8Mi ± 0% -25.25% (p=0.000 n=11)
CountSingle/32-4 438.6Mi ± 0% 512.0Mi ± 0% +16.74% (p=0.000 n=11)
CountSingle/4K-4 525.9Mi ± 0% 758.0Mi ± 0% +44.15% (p=0.000 n=11)
CountSingle/4M-4 523.9Mi ± 0% 754.7Mi ± 0% +44.05% (p=0.000 n=11)
CountSingle/64M-4 476.3Mi ± 0% 635.0Mi ± 0% +33.31% (p=0.000 n=11)
Change-Id: Id5ddbea0d080e2903156ef8dc86c030a8179115b
Reviewed-on: https://go-review.googlesource.com/c/go/+/650995
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
When these packages are released as part of Go 1.18,
Go 1.16 will no longer be supported, so we can remove
the +build tags in these files.
Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs
as defined in src/cmd/dist/buildtool.go, which need to continue
to build with Go 1.4 for now.
Also reverted src/vendor and src/cmd/vendor, which will need
to be updated in their own repos first.
Manual changes in runtime/pprof/mprof_test.go to adjust line numbers.
For #41184.
Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e
Reviewed-on: https://go-review.googlesource.com/c/go/+/344955
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Simple single-byte loop count for now, to be further improved in future
CLs.
Benchmark on linux/riscv64 (HiFive Unleashed):
name old time/op new time/op delta
CountSingle/10-4 190ns ± 1% 145ns ± 1% -23.66% (p=0.000 n=10+9)
CountSingle/32-4 422ns ± 1% 268ns ± 0% -36.43% (p=0.000 n=10+7)
CountSingle/4K-4 43.3µs ± 0% 23.8µs ± 0% -45.09% (p=0.000 n=8+10)
CountSingle/4M-4 54.2ms ± 1% 33.3ms ± 1% -38.48% (p=0.000 n=10+10)
CountSingle/64M-4 1.52s ± 1% 1.20s ± 1% -21.20% (p=0.000 n=9+9)
name old speed new speed delta
CountSingle/10-4 52.7MB/s ± 1% 69.1MB/s ± 1% +31.03% (p=0.000 n=10+9)
CountSingle/32-4 75.9MB/s ± 1% 119.5MB/s ± 0% +57.34% (p=0.000 n=10+8)
CountSingle/4K-4 94.6MB/s ± 0% 172.2MB/s ± 0% +82.10% (p=0.000 n=8+10)
CountSingle/4M-4 77.4MB/s ± 1% 125.8MB/s ± 1% +62.54% (p=0.000 n=10+10)
CountSingle/64M-4 44.2MB/s ± 1% 56.1MB/s ± 1% +26.91% (p=0.000 n=9+9)
Change-Id: I2a6bd50d22d5f598517bb3c5a50066c54280cac5
Reviewed-on: https://go-review.googlesource.com/c/go/+/263541
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
|
|
Add a 'single lane' SIMD implemementation of the single byte count
function for use on machines that support the vector facility. This
allows up to 16 bytes to be counted per loop iteration.
We can probably improve performance further by adding more 'lanes'
(i.e. counting more bytes in parallel) however this will increase
the complexity of the function so I'm not sure it is worth doing
yet.
name old speed new speed delta
pkg:strings goos:linux goarch:s390x
CountByte/10 789MB/s ± 0% 1131MB/s ± 0% +43.44% (p=0.000 n=9+9)
CountByte/32 936MB/s ± 0% 3236MB/s ± 0% +245.87% (p=0.000 n=8+9)
CountByte/4096 1.06GB/s ± 0% 21.26GB/s ± 0% +1907.07% (p=0.000 n=10+10)
CountByte/4194304 1.06GB/s ± 0% 20.54GB/s ± 0% +1838.50% (p=0.000 n=10+10)
CountByte/67108864 1.06GB/s ± 0% 18.31GB/s ± 0% +1629.51% (p=0.000 n=10+10)
pkg:bytes goos:linux goarch:s390x
CountSingle/10 800MB/s ± 0% 986MB/s ± 0% +23.21% (p=0.000 n=9+10)
CountSingle/32 925MB/s ± 0% 2744MB/s ± 0% +196.55% (p=0.000 n=9+10)
CountSingle/4K 1.26GB/s ± 0% 19.44GB/s ± 0% +1445.59% (p=0.000 n=10+10)
CountSingle/4M 1.26GB/s ± 0% 20.28GB/s ± 0% +1510.26% (p=0.000 n=8+10)
CountSingle/64M 1.23GB/s ± 0% 17.78GB/s ± 0% +1350.67% (p=0.000 n=9+10)
Change-Id: I230d57905db92a8fdfc50b1d5be338941ae3a7a1
Reviewed-on: https://go-review.googlesource.com/c/go/+/199979
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Simple single-byte loop count for now, to be further improved in future
CLs.
Benchmark on linux/arm:
name old time/op new time/op delta
CountSingle/10-4 122ns ± 0% 87ns ± 1% -28.41% (p=0.000 n=7+10)
CountSingle/32-4 242ns ± 0% 174ns ± 1% -28.25% (p=0.000 n=10+10)
CountSingle/4K-4 24.2µs ± 1% 15.6µs ± 1% -35.42% (p=0.000 n=10+10)
CountSingle/4M-4 29.6ms ± 1% 21.3ms ± 1% -28.09% (p=0.000 n=10+9)
CountSingle/64M-4 562ms ± 0% 414ms ± 1% -26.23% (p=0.000 n=8+10)
name old speed new speed delta
CountSingle/10-4 81.7MB/s ± 1% 114.5MB/s ± 1% +40.07% (p=0.000 n=10+10)
CountSingle/32-4 132MB/s ± 0% 184MB/s ± 1% +39.39% (p=0.000 n=10+9)
CountSingle/4K-4 170MB/s ± 1% 263MB/s ± 1% +54.86% (p=0.000 n=10+10)
CountSingle/4M-4 142MB/s ± 1% 197MB/s ± 1% +39.07% (p=0.000 n=10+9)
CountSingle/64M-4 119MB/s ± 0% 162MB/s ± 1% +35.55% (p=0.000 n=8+10)
Updates #29001
Change-Id: I42a268215a62044286ec32b548d8e4b86b9570ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/168319
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This adds an asm implementation for the Count function in ppc64x.
The Go code that manipulates a byte at a time is especially
inefficient on ppc64x, so an asm implementation is a significant
improvement.
bytes:
name old time/op new time/op delta
CountSingle/10-8 23.1ns ± 0% 18.6ns ± 0% -19.48% (p=1.000 n=1+1)
CountSingle/32-8 60.4ns ± 0% 19.0ns ± 0% -68.54% (p=1.000 n=1+1)
CountSingle/4K-8 7.29µs ± 0% 0.45µs ± 0% -93.80% (p=1.000 n=1+1)
CountSingle/4M-8 7.49ms ± 0% 0.45ms ± 0% -93.97% (p=1.000 n=1+1)
CountSingle/64M-8 127ms ± 0% 9ms ± 0% -92.53% (p=1.000 n=1+1)
html:
name old time/op new time/op delta
Escape-8 57.5µs ± 0% 36.1µs ± 0% -37.13% (p=1.000 n=1+1)
EscapeNone-8 20.0µs ± 0% 2.0µs ± 0% -90.14% (p=1.000 n=1+1)
Change-Id: Iadbf422c0e9a37b47d2d95fb8c778420f3aabb58
Reviewed-on: https://go-review.googlesource.com/131695
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
|
|
Also move the arm64 CountByte implementation while we're here.
Fixes #19792
Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Move bytes.Count and strings.Count to bytealg.
Update #19792
Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|