| Age | Commit message (Collapse) | Author |
|
When GO386=sse2 we can assume sse2 to be present without
a runtime check. If GO386=softfloat is set we can avoid
the usage of SSE2 even if detected.
This might cause a memcpy, memclr and bytealg slowdown of Go
binaries compiled with softfloat on machines that support
SSE2. Such setups are rare and should use GO386=sse2 instead
if performance matters.
On targets that support SSE2 we avoid the runtime overhead of
dynamic cpu feature dispatch.
The removal of runtime sse2 checks also allows to simplify
internal/cpu further by removing handling of the required
feature option as a followup after this CL.
Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17
Reviewed-on: https://go-review.googlesource.com/c/go/+/344350
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This adds a power9 version of the bytes.Index function
for little endian.
Here is the improvement on power9 for some of the Index
benchmarks:
Index/10 -0.14%
Index/32 -3.19%
Index/4K -12.66%
Index/4M -13.34%
Index/64M -13.17%
Count/10 -0.59%
Count/32 -2.88%
Count/4K -12.63%
Count/4M -13.35%
Count/64M -13.17%
IndexHard1 -23.03%
IndexHard2 -13.01%
IndexHard3 -22.12%
IndexHard4 +0.16%
CountHard1 -23.02%
CountHard2 -13.01%
CountHard3 -22.12%
IndexPeriodic/IndexPeriodic2 -22.85%
IndexPeriodic/IndexPeriodic4 -23.15%
Change-Id: Id72353e2771eba2efbb1544d5f0be65f8a9f0433
Reviewed-on: https://go-review.googlesource.com/c/go/+/311380
Run-TryBot: Carlos Eduardo Seo <carlos.seo@linaro.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
Change-Id: I09ba19e19b195e345a0fe29d542e0d86529b0d31
Reviewed-on: https://go-review.googlesource.com/c/go/+/261359
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
For the case of a pattern containing multi-byte rune, the time complexity of the
previous algorithm is O(nm), and if both input arguments are long, the search
performance will be poor. This CL improves the searching performance for these
cases by using IndexRune, which is mainly implemented with IndexByte and Index.
As IndexByte and Index are specially optimized with some powerful instructions
for short patterns (an UTF8 rune is 1 to 4 bytes), so they can help to reduce the
runtime complexity of IndexAny and LastIndexAny.
Another optimization method is using hash table, however, the actual test results
show that using indexrune is better, and the space complexity is lower.
There are two fast paths in IndexAny and LastIndexAny for cases where the length
of the input arguements are 1, and their locations are not exactly the same, which
is determined based on the actual test results.
Benchmarks on arm64 and amd64:
name old time/op new time/op delta
pkg:strings goos:linux goarch:arm64
IndexAnyASCII/1:1-8 23.7ns ± 3% 28.5ns ± 0% +20.15% (p=0.008 n=5+5)
IndexAnyASCII/1:2-8 18.0ns ± 0% 33.1ns ± 0% +83.67% (p=0.008 n=5+5)
IndexAnyASCII/1:4-8 20.0ns ± 0% 36.0ns ± 0% +80.00% (p=0.029 n=4+4)
IndexAnyASCII/1:8-8 36.1ns ± 0% 36.0ns ± 0% ~ (p=0.095 n=5+4)
IndexAnyASCII/1:16-8 48.1ns ± 0% 36.0ns ± 0% -25.19% (p=0.029 n=4+4)
IndexAnyASCII/1:32-8 72.1ns ± 0% 36.0ns ± 0% -50.01% (p=0.008 n=5+5)
IndexAnyASCII/1:64-8 120ns ± 0% 39ns ± 0% -67.83% (p=0.008 n=5+5)
IndexAnyASCII/16:1-8 73.0ns ± 0% 28.5ns ± 0% -60.95% (p=0.008 n=5+5)
IndexAnyASCII/16:2-8 76.8ns ± 0% 77.0ns ± 0% ~ (p=1.000 n=5+5)
IndexAnyASCII/16:4-8 83.2ns ± 1% 83.0ns ± 0% ~ (p=0.770 n=5+5)
IndexAnyASCII/16:8-8 111ns ± 1% 107ns ± 0% -3.25% (p=0.008 n=5+5)
IndexAnyASCII/16:16-8 139ns ± 1% 137ns ± 0% -1.58% (p=0.008 n=5+5)
IndexAnyASCII/16:32-8 199ns ± 1% 197ns ± 0% -1.20% (p=0.008 n=5+5)
IndexAnyASCII/16:64-8 307ns ± 0% 313ns ± 0% +1.82% (p=0.016 n=5+4)
IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 0% -90.31% (p=0.008 n=5+5)
IndexAnyASCII/256:2-8 678ns ± 0% 683ns ± 0% +0.68% (p=0.008 n=5+5)
IndexAnyASCII/256:4-8 685ns ± 0% 683ns ± 0% -0.29% (p=0.000 n=5+4)
IndexAnyASCII/256:8-8 711ns ± 0% 708ns ± 0% -0.48% (p=0.008 n=5+5)
IndexAnyASCII/256:16-8 740ns ± 0% 740ns ± 0% ~ (p=0.444 n=5+5)
IndexAnyASCII/256:32-8 799ns ± 0% 798ns ± 0% -0.18% (p=0.008 n=5+5)
IndexAnyASCII/256:64-8 910ns ± 0% 914ns ± 0% +0.44% (p=0.016 n=4+5)
IndexAnyUTF8/1:1-8 27.1ns ± 0% 19.0ns ± 0% -29.79% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-8 44.1ns ± 0% 33.0ns ± 0% -25.17% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-8 46.1ns ± 0% 33.1ns ± 0% -28.29% (p=0.016 n=4+5)
IndexAnyUTF8/1:8-8 85.1ns ± 0% 33.0ns ± 0% -61.18% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-8 110ns ± 1% 36ns ± 0% -67.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:32-8 188ns ± 0% 36ns ± 0% -80.85% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-8 332ns ± 0% 39ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/16:1-8 293ns ± 0% 54ns ± 0% -81.56% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-8 563ns ± 0% 349ns ± 0% -37.98% (p=0.008 n=5+5)
IndexAnyUTF8/16:4-8 546ns ± 1% 349ns ± 0% -36.10% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-8 1.22µs ± 0% 0.35µs ± 0% -71.39% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-8 1.63µs ± 1% 0.42µs ± 0% -73.98% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-8 2.87µs ± 0% 0.42µs ± 0% -85.22% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.47µs ± 0% -90.98% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-8 4.26µs ± 0% 0.47µs ± 0% -88.85% (p=0.000 n=4+5)
IndexAnyUTF8/256:2-8 8.62µs ± 0% 5.15µs ± 0% -40.21% (p=0.008 n=5+5)
IndexAnyUTF8/256:4-8 8.25µs ± 0% 5.15µs ± 0% -37.50% (p=0.016 n=5+4)
IndexAnyUTF8/256:8-8 19.2µs ± 1% 5.2µs ± 0% -73.08% (p=0.016 n=5+4)
IndexAnyUTF8/256:16-8 25.6µs ± 1% 6.3µs ± 0% -75.32% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-8 45.6µs ± 0% 6.3µs ± 0% -86.15% (p=0.008 n=5+5)
IndexAnyUTF8/256:64-8 82.4µs ± 0% 7.0µs ± 0% -91.53% (p=0.016 n=5+4)
LastIndexAnyASCII/1:1-8 23.0ns ± 0% 33.5ns ± 0% +45.65% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-8 24.5ns ± 0% 33.5ns ± 0% +36.73% (p=0.016 n=4+5)
LastIndexAnyASCII/1:4-8 27.5ns ± 0% 35.5ns ± 0% +29.09% (p=0.008 n=5+5)
LastIndexAnyASCII/1:8-8 44.5ns ± 0% 35.5ns ± 0% -20.13% (p=0.008 n=5+5)
LastIndexAnyASCII/1:16-8 56.5ns ± 0% 35.5ns ± 0% -37.15% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-8 80.3ns ± 0% 35.5ns ± 0% -55.79% (p=0.000 n=5+4)
LastIndexAnyASCII/1:64-8 129ns ± 0% 40ns ± 0% -68.85% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-8 72.8ns ± 0% 72.7ns ± 0% -0.19% (p=0.016 n=4+5)
LastIndexAnyASCII/16:2-8 75.4ns ± 0% 75.1ns ± 0% ~ (p=0.127 n=5+5)
LastIndexAnyASCII/16:4-8 81.9ns ± 1% 80.2ns ± 0% -2.00% (p=0.008 n=5+5)
LastIndexAnyASCII/16:8-8 110ns ± 1% 108ns ± 0% -1.46% (p=0.008 n=5+5)
LastIndexAnyASCII/16:16-8 138ns ± 1% 134ns ± 0% -3.18% (p=0.008 n=5+5)
LastIndexAnyASCII/16:32-8 198ns ± 0% 197ns ± 0% -0.51% (p=0.008 n=5+5)
LastIndexAnyASCII/16:64-8 309ns ± 0% 313ns ± 0% +1.30% (p=0.008 n=5+5)
LastIndexAnyASCII/256:1-8 652ns ± 0% 653ns ± 0% +0.21% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-8 656ns ± 0% 656ns ± 0% ~ (all equal)
LastIndexAnyASCII/256:4-8 663ns ± 0% 663ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyASCII/256:8-8 691ns ± 0% 690ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/256:16-8 719ns ± 0% 715ns ± 0% -0.53% (p=0.000 n=5+4)
LastIndexAnyASCII/256:32-8 779ns ± 0% 780ns ± 0% +0.13% (p=0.029 n=4+4)
LastIndexAnyASCII/256:64-8 890ns ± 0% 894ns ± 0% +0.45% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:1-8 31.6ns ± 0% 33.5ns ± 0% +6.01% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 33.5ns ± 0% -30.99% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-8 48.6ns ± 0% 33.5ns ± 0% -31.13% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:8-8 89.6ns ± 0% 33.5ns ± 0% -62.56% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-8 113ns ± 1% 36ns ± 0% -68.47% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:32-8 190ns ± 0% 36ns ± 0% -81.26% (p=0.029 n=4+4)
LastIndexAnyUTF8/1:64-8 327ns ± 0% 40ns ± 0% -87.77% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-8 364ns ± 0% 158ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyUTF8/16:2-8 636ns ± 0% 472ns ± 0% -25.79% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:4-8 630ns ± 0% 472ns ± 0% -25.03% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-8 1.28µs ± 0% 0.47µs ± 0% -63.09% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.53µs ± 0% -68.39% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:32-8 2.88µs ± 0% 0.53µs ± 0% -81.72% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:64-8 5.08µs ± 0% 0.57µs ± 0% -88.79% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:1-8 5.41µs ± 0% 2.03µs ± 0% -62.46% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:2-8 9.77µs ± 0% 7.14µs ± 0% -26.97% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-8 9.63µs ± 0% 7.14µs ± 0% -25.86% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-8 20.0µs ± 0% 7.1µs ± 0% -64.30% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-8 26.1µs ± 1% 8.0µs ± 0% -69.40% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-8 45.6µs ± 1% 8.0µs ± 0% -82.51% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-8 80.8µs ± 0% 8.6µs ± 0% -89.33% (p=0.016 n=5+4)
pkg:bytes goos:linux goarch:arm64
IndexAnyASCII/1:1-8 26.2ns ± 1% 26.5ns ± 0% +1.30% (p=0.016 n=5+4)
IndexAnyASCII/1:2-8 18.5ns ± 0% 26.5ns ± 0% +43.24% (p=0.008 n=5+5)
IndexAnyASCII/1:4-8 21.0ns ± 0% 26.5ns ± 0% +26.38% (p=0.008 n=5+5)
IndexAnyASCII/1:8-8 37.5ns ± 0% 26.5ns ± 0% -29.33% (p=0.000 n=5+4)
IndexAnyASCII/1:16-8 49.6ns ± 0% 26.5ns ± 0% -46.49% (p=0.008 n=5+5)
IndexAnyASCII/1:32-8 73.6ns ± 0% 30.1ns ± 0% -59.16% (p=0.008 n=5+5)
IndexAnyASCII/1:64-8 122ns ± 0% 33ns ± 0% -73.23% (p=0.008 n=5+5)
IndexAnyASCII/16:1-8 73.7ns ± 0% 33.4ns ± 0% -54.71% (p=0.008 n=5+5)
IndexAnyASCII/16:2-8 79.1ns ± 0% 78.9ns ± 0% -0.30% (p=0.016 n=4+5)
IndexAnyASCII/16:4-8 84.8ns ± 0% 86.1ns ± 0% +1.58% (p=0.016 n=5+4)
IndexAnyASCII/16:8-8 111ns ± 0% 111ns ± 0% ~ (all equal)
IndexAnyASCII/16:16-8 139ns ± 0% 144ns ± 0% +3.60% (p=0.016 n=4+5)
IndexAnyASCII/16:32-8 196ns ± 0% 207ns ± 0% +5.61% (p=0.016 n=5+4)
IndexAnyASCII/16:64-8 311ns ± 0% 320ns ± 0% +2.89% (p=0.016 n=4+5)
IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 1% -90.35% (p=0.008 n=5+5)
IndexAnyASCII/256:2-8 680ns ± 0% 680ns ± 0% ~ (p=0.444 n=5+5)
IndexAnyASCII/256:4-8 686ns ± 0% 687ns ± 0% ~ (p=0.167 n=5+5)
IndexAnyASCII/256:8-8 713ns ± 0% 712ns ± 0% -0.14% (p=0.008 n=5+5)
IndexAnyASCII/256:16-8 740ns ± 0% 744ns ± 0% +0.54% (p=0.016 n=5+4)
IndexAnyASCII/256:32-8 797ns ± 0% 808ns ± 0% +1.43% (p=0.008 n=5+5)
IndexAnyASCII/256:64-8 912ns ± 0% 921ns ± 0% +0.99% (p=0.016 n=4+5)
IndexAnyUTF8/1:1-8 27.5ns ± 0% 26.5ns ± 0% -3.64% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-8 44.5ns ± 0% 26.5ns ± 0% -40.50% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-8 45.6ns ± 0% 26.5ns ± 0% -41.89% (p=0.000 n=5+4)
IndexAnyUTF8/1:8-8 85.8ns ± 1% 26.5ns ± 0% -69.11% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-8 110ns ± 1% 26ns ± 0% -76.00% (p=0.016 n=5+4)
IndexAnyUTF8/1:32-8 188ns ± 0% 30ns ± 0% -84.04% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-8 333ns ± 0% 33ns ± 0% -90.20% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-8 294ns ± 0% 235ns ± 0% -20.07% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-8 563ns ± 0% 309ns ± 0% -45.12% (p=0.008 n=5+5)
IndexAnyUTF8/16:4-8 558ns ± 1% 309ns ± 0% -44.60% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-8 1.23µs ± 0% 0.31µs ± 0% -74.79% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-8 1.62µs ± 2% 0.31µs ± 0% -80.93% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-8 2.86µs ± 0% 0.38µs ± 0% -86.87% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.42µs ± 0% -91.86% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-8 4.27µs ± 1% 3.30µs ± 1% -22.75% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-8 8.61µs ± 0% 4.45µs ± 0% -48.31% (p=0.016 n=4+5)
IndexAnyUTF8/256:4-8 8.44µs ± 0% 4.45µs ± 0% -47.23% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-8 19.2µs ± 0% 4.5µs ± 0% -76.78% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-8 25.6µs ± 0% 4.5µs ± 0% -82.63% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-8 45.4µs ± 0% 5.5µs ± 0% -87.85% (p=0.016 n=4+5)
IndexAnyUTF8/256:64-8 82.5µs ± 0% 6.2µs ± 0% -92.49% (p=0.008 n=5+5)
LastIndexAnyASCII/1:1-8 23.0ns ± 0% 26.5ns ± 0% +15.02% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-8 24.5ns ± 0% 26.5ns ± 0% +8.16% (p=0.008 n=5+5)
LastIndexAnyASCII/1:4-8 27.8ns ± 0% 26.5ns ± 0% -4.68% (p=0.029 n=4+4)
LastIndexAnyASCII/1:8-8 45.1ns ± 1% 26.5ns ± 0% -41.29% (p=0.000 n=5+4)
LastIndexAnyASCII/1:16-8 57.1ns ± 0% 26.5ns ± 0% -53.61% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-8 81.5ns ± 0% 30.0ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/1:64-8 129ns ± 0% 32ns ± 0% -74.81% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-8 72.6ns ± 0% 72.1ns ± 0% -0.63% (p=0.000 n=4+5)
LastIndexAnyASCII/16:2-8 77.2ns ± 0% 77.2ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/16:4-8 83.1ns ± 0% 83.2ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyASCII/16:8-8 109ns ± 1% 108ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/16:16-8 136ns ± 0% 136ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:32-8 195ns ± 0% 197ns ± 0% +0.82% (p=0.008 n=5+5)
LastIndexAnyASCII/16:64-8 309ns ± 0% 309ns ± 0% ~ (all equal)
LastIndexAnyASCII/256:1-8 653ns ± 0% 657ns ± 0% +0.61% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-8 659ns ± 0% 658ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/256:4-8 664ns ± 0% 663ns ± 0% ~ (p=0.095 n=5+4)
LastIndexAnyASCII/256:8-8 698ns ± 0% 689ns ± 0% -1.29% (p=0.008 n=5+5)
LastIndexAnyASCII/256:16-8 726ns ± 0% 717ns ± 0% -1.24% (p=0.008 n=5+5)
LastIndexAnyASCII/256:32-8 777ns ± 0% 779ns ± 0% ~ (p=0.079 n=5+4)
LastIndexAnyASCII/256:64-8 889ns ± 0% 890ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyUTF8/1:1-8 32.1ns ± 0% 26.5ns ± 0% -17.45% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 26.5ns ± 0% -45.52% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:4-8 49.6ns ± 0% 26.5ns ± 0% -46.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:8-8 91.9ns ± 0% 26.5ns ± 0% -71.18% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-8 114ns ± 1% 26ns ± 0% -76.84% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:32-8 203ns ± 6% 30ns ± 0% -85.25% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-8 330ns ± 0% 33ns ± 0% -90.14% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:1-8 365ns ± 0% 164ns ± 0% -55.04% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:2-8 638ns ± 0% 296ns ± 0% -53.58% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-8 634ns ± 0% 296ns ± 0% -53.31% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-8 1.30µs ± 0% 0.30µs ± 0% -77.18% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.30µs ± 0% -82.19% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:32-8 2.90µs ± 0% 0.38µs ± 0% -87.00% (p=0.029 n=4+4)
LastIndexAnyUTF8/16:64-8 5.10µs ± 0% 0.42µs ± 0% -91.78% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:1-8 5.42µs ± 0% 2.12µs ± 0% -60.92% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-8 9.79µs ± 0% 4.26µs ± 0% -56.47% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-8 9.66µs ± 0% 4.26µs ± 0% -55.87% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-8 20.4µs ± 0% 4.3µs ± 0% -79.10% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-8 26.0µs ± 1% 4.3µs ± 0% -83.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-8 46.0µs ± 0% 5.5µs ± 0% -88.09% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-8 81.1µs ± 0% 6.2µs ± 0% -92.38% (p=0.008 n=5+5)
name old time/op new time/op delta
pkg:strings goos:linux goarch:amd64
IndexAnyASCII/1:1-48 10.0ns ± 0% 13.3ns ± 0% +33.00% (p=0.008 n=5+5)
IndexAnyASCII/1:2-48 11.0ns ± 0% 15.5ns ± 0% +40.55% (p=0.016 n=4+5)
IndexAnyASCII/1:4-48 12.9ns ± 0% 15.4ns ± 0% +19.69% (p=0.008 n=5+5)
IndexAnyASCII/1:8-48 18.6ns ± 0% 15.5ns ± 0% -16.45% (p=0.000 n=4+5)
IndexAnyASCII/1:16-48 30.1ns ± 0% 16.9ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyASCII/1:32-48 53.1ns ± 0% 18.6ns ± 0% -64.95% (p=0.000 n=5+4)
IndexAnyASCII/1:64-48 98.9ns ± 0% 17.4ns ± 0% -82.41% (p=0.000 n=5+4)
IndexAnyASCII/16:1-48 35.0ns ± 0% 14.2ns ± 0% -59.47% (p=0.000 n=5+4)
IndexAnyASCII/16:2-48 35.5ns ± 0% 35.6ns ± 0% ~ (p=0.238 n=5+4)
IndexAnyASCII/16:4-48 40.8ns ± 0% 40.7ns ± 1% ~ (p=0.643 n=5+5)
IndexAnyASCII/16:8-48 50.8ns ± 0% 50.9ns ± 1% ~ (p=1.000 n=4+5)
IndexAnyASCII/16:16-48 64.0ns ± 1% 64.5ns ± 1% ~ (p=0.071 n=5+5)
IndexAnyASCII/16:32-48 98.3ns ± 0% 100.8ns ± 1% +2.52% (p=0.008 n=5+5)
IndexAnyASCII/16:64-48 156ns ± 0% 157ns ± 0% ~ (p=0.238 n=4+5)
IndexAnyASCII/256:1-48 299ns ± 0% 24ns ± 3% -92.12% (p=0.008 n=5+5)
IndexAnyASCII/256:2-48 303ns ± 0% 304ns ± 0% ~ (p=0.762 n=5+5)
IndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.476 n=5+5)
IndexAnyASCII/256:8-48 321ns ± 0% 321ns ± 0% ~ (p=0.429 n=4+5)
IndexAnyASCII/256:16-48 334ns ± 0% 335ns ± 0% ~ (p=0.079 n=5+4)
IndexAnyASCII/256:32-48 367ns ± 0% 365ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyASCII/256:64-48 431ns ± 1% 421ns ± 0% -2.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:1-48 17.2ns ± 0% 10.8ns ± 0% -37.21% (p=0.029 n=4+4)
IndexAnyUTF8/1:2-48 26.7ns ± 0% 15.6ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/1:4-48 28.2ns ± 0% 15.6ns ± 0% -44.68% (p=0.000 n=5+4)
IndexAnyUTF8/1:8-48 48.8ns ± 0% 15.6ns ± 0% -68.03% (p=0.029 n=4+4)
IndexAnyUTF8/1:16-48 58.3ns ± 0% 16.2ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/1:32-48 103ns ± 0% 18ns ± 0% -82.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-48 182ns ± 0% 17ns ± 0% -90.53% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-48 197ns ± 0% 25ns ± 0% -87.34% (p=0.000 n=5+4)
IndexAnyUTF8/16:2-48 348ns ± 0% 163ns ± 0% -53.11% (p=0.000 n=5+4)
IndexAnyUTF8/16:4-48 374ns ± 0% 163ns ± 0% -56.37% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-48 716ns ± 0% 163ns ± 0% -77.22% (p=0.000 n=5+4)
IndexAnyUTF8/16:16-48 859ns ± 0% 175ns ± 0% -79.63% (p=0.000 n=5+4)
IndexAnyUTF8/16:32-48 1.58µs ± 0% 0.20µs ± 0% -87.01% (p=0.029 n=4+4)
IndexAnyUTF8/16:64-48 2.84µs ± 0% 0.19µs ± 1% -93.34% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-48 2.61µs ± 0% 0.27µs ± 0% -89.81% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-48 4.95µs ± 0% 2.23µs ± 0% -54.91% (p=0.016 n=5+4)
IndexAnyUTF8/256:4-48 5.55µs ± 0% 2.23µs ± 0% -59.72% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.2µs ± 0% -79.39% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-48 13.1µs ± 0% 2.5µs ± 0% -81.21% (p=0.016 n=4+5)
IndexAnyUTF8/256:32-48 24.7µs ± 0% 2.8µs ± 0% -88.49% (p=0.008 n=5+5)
IndexAnyUTF8/256:64-48 45.0µs ± 0% 2.6µs ± 1% -94.23% (p=0.008 n=5+5)
LastIndexAnyASCII/1:1-48 13.9ns ± 0% 15.2ns ± 0% +9.35% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-48 14.4ns ± 0% 15.2ns ± 0% +5.56% (p=0.008 n=5+5)
LastIndexAnyASCII/1:4-48 16.7ns ± 0% 15.2ns ± 0% -8.98% (p=0.008 n=5+5)
LastIndexAnyASCII/1:8-48 24.0ns ± 0% 15.2ns ± 0% -36.67% (p=0.008 n=5+5)
LastIndexAnyASCII/1:16-48 35.6ns ± 0% 15.0ns ± 0% -57.82% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-48 68.9ns ± 0% 16.7ns ± 0% -75.75% (p=0.008 n=5+5)
LastIndexAnyASCII/1:64-48 104ns ± 0% 17ns ± 1% -83.81% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-48 35.0ns ± 0% 35.0ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:2-48 35.6ns ± 0% 35.6ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:4-48 41.0ns ± 0% 40.8ns ± 0% -0.49% (p=0.032 n=5+5)
LastIndexAnyASCII/16:8-48 50.9ns ± 0% 50.7ns ± 1% ~ (p=0.397 n=5+5)
LastIndexAnyASCII/16:16-48 64.3ns ± 1% 64.4ns ± 1% ~ (p=1.000 n=4+5)
LastIndexAnyASCII/16:32-48 100ns ± 0% 100ns ± 0% +0.38% (p=0.016 n=4+5)
LastIndexAnyASCII/16:64-48 157ns ± 1% 163ns ± 0% +3.82% (p=0.008 n=5+5)
LastIndexAnyASCII/256:1-48 302ns ± 0% 300ns ± 0% -0.53% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-48 305ns ± 0% 303ns ± 0% -0.66% (p=0.000 n=5+4)
LastIndexAnyASCII/256:4-48 313ns ± 0% 307ns ± 0% -2.04% (p=0.000 n=4+5)
LastIndexAnyASCII/256:8-48 323ns ± 0% 315ns ± 0% -2.48% (p=0.029 n=4+4)
LastIndexAnyASCII/256:16-48 333ns ± 0% 332ns ± 0% -0.30% (p=0.048 n=5+5)
LastIndexAnyASCII/256:32-48 366ns ± 0% 367ns ± 0% ~ (p=0.238 n=4+5)
LastIndexAnyASCII/256:64-48 430ns ± 0% 430ns ± 0% ~ (p=1.000 n=5+5)
LastIndexAnyUTF8/1:1-48 21.1ns ± 0% 13.9ns ± 0% -34.00% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-48 29.5ns ± 0% 13.9ns ± 0% -52.95% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-48 31.6ns ± 0% 13.9ns ± 0% -55.96% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:8-48 51.1ns ± 0% 13.9ns ± 0% -72.81% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-48 58.9ns ± 0% 14.6ns ± 0% -75.23% (p=0.016 n=5+4)
LastIndexAnyUTF8/1:32-48 103ns ± 0% 16ns ± 1% -84.12% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-48 177ns ± 0% 17ns ± 1% -90.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-48 275ns ± 1% 105ns ± 0% -61.85% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:2-48 406ns ± 0% 216ns ± 0% -46.70% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-48 458ns ± 0% 216ns ± 0% -52.75% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:8-48 753ns ± 0% 216ns ± 0% -71.31% (p=0.029 n=4+4)
LastIndexAnyUTF8/16:16-48 902ns ± 0% 221ns ± 0% -75.50% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:32-48 1.57µs ± 0% 0.24µs ± 0% -84.46% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:64-48 2.77µs ± 0% 0.24µs ± 0% -91.22% (p=0.000 n=5+4)
LastIndexAnyUTF8/256:1-48 4.06µs ± 0% 1.53µs ± 0% -62.26% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-48 5.92µs ± 0% 3.04µs ± 0% -48.55% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:4-48 6.82µs ± 0% 3.04µs ± 0% -55.34% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-48 11.5µs ± 0% 3.0µs ± 0% -73.48% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-48 14.1µs ± 0% 3.1µs ± 0% -77.85% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-48 24.5µs ± 0% 3.5µs ± 0% -85.85% (p=0.016 n=5+4)
LastIndexAnyUTF8/256:64-48 44.0µs ± 0% 3.5µs ± 0% -92.12% (p=0.008 n=5+5)
pkg:bytes goos:linux goarch:amd64
IndexAnyASCII/1:1-48 9.56ns ± 0% 11.00ns ± 0% +15.06% (p=0.016 n=5+4)
IndexAnyASCII/1:2-48 11.0ns ± 0% 10.8ns ± 2% -1.64% (p=0.048 n=5+5)
IndexAnyASCII/1:4-48 13.9ns ± 0% 11.0ns ± 1% -21.15% (p=0.008 n=5+5)
IndexAnyASCII/1:8-48 19.6ns ± 0% 10.8ns ± 3% -44.90% (p=0.008 n=5+5)
IndexAnyASCII/1:16-48 31.1ns ± 0% 11.5ns ± 0% -63.02% (p=0.008 n=5+5)
IndexAnyASCII/1:32-48 54.0ns ± 0% 11.8ns ± 0% -78.15% (p=0.000 n=5+4)
IndexAnyASCII/1:64-48 100ns ± 0% 13ns ± 0% -86.89% (p=0.008 n=5+5)
IndexAnyASCII/16:1-48 35.5ns ± 0% 14.8ns ± 0% -58.26% (p=0.008 n=5+5)
IndexAnyASCII/16:2-48 36.2ns ± 1% 36.0ns ± 1% ~ (p=0.087 n=5+5)
IndexAnyASCII/16:4-48 40.3ns ± 1% 39.7ns ± 4% ~ (p=0.175 n=4+5)
IndexAnyASCII/16:8-48 48.7ns ± 5% 45.8ns ± 0% -6.02% (p=0.016 n=5+4)
IndexAnyASCII/16:16-48 64.1ns ±11% 62.1ns ± 1% ~ (p=0.143 n=5+5)
IndexAnyASCII/16:32-48 97.9ns ± 1% 98.3ns ± 1% ~ (p=0.294 n=5+5)
IndexAnyASCII/16:64-48 163ns ± 0% 157ns ± 0% -3.68% (p=0.008 n=5+5)
IndexAnyASCII/256:1-48 389ns ± 0% 25ns ± 0% -93.65% (p=0.000 n=5+4)
IndexAnyASCII/256:2-48 391ns ± 0% 307ns ± 0% -21.48% (p=0.000 n=5+4)
IndexAnyASCII/256:4-48 394ns ± 0% 323ns ± 0% -17.92% (p=0.008 n=5+5)
IndexAnyASCII/256:8-48 402ns ± 0% 323ns ± 0% -19.51% (p=0.008 n=5+5)
IndexAnyASCII/256:16-48 414ns ± 0% 334ns ± 0% -19.32% (p=0.016 n=4+5)
IndexAnyASCII/256:32-48 446ns ± 0% 367ns ± 0% -17.75% (p=0.016 n=5+4)
IndexAnyASCII/256:64-48 511ns ± 0% 424ns ± 0% -17.02% (p=0.008 n=5+5)
IndexAnyUTF8/1:1-48 17.4ns ± 0% 11.0ns ± 0% -36.64% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-48 27.3ns ± 1% 11.0ns ± 0% -59.74% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-48 28.7ns ± 0% 11.0ns ± 0% -61.73% (p=0.008 n=5+5)
IndexAnyUTF8/1:8-48 49.2ns ± 0% 11.0ns ± 0% -77.66% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-48 56.0ns ± 0% 11.5ns ± 0% -79.46% (p=0.000 n=5+4)
IndexAnyUTF8/1:32-48 102ns ± 0% 12ns ± 0% -88.24% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-48 177ns ± 0% 13ns ± 0% -92.51% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-48 212ns ± 0% 112ns ± 0% -47.17% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-48 356ns ± 0% 159ns ± 1% -55.28% (p=0.000 n=4+5)
IndexAnyUTF8/16:4-48 372ns ± 0% 158ns ± 0% -57.47% (p=0.008 n=5+5)
IndexAnyUTF8/16:8-48 712ns ± 0% 159ns ± 1% -77.70% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-48 829ns ± 0% 129ns ± 0% -84.44% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-48 1.55µs ± 0% 0.16µs ± 0% -89.87% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-48 2.77µs ± 0% 0.14µs ± 0% -94.94% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-48 2.85µs ± 0% 1.63µs ± 1% -42.74% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-48 5.14µs ± 1% 2.03µs ± 0% -60.51% (p=0.008 n=5+5)
IndexAnyUTF8/256:4-48 5.56µs ± 0% 2.03µs ± 0% -63.52% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.0µs ± 0% -81.22% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-48 12.9µs ± 0% 1.9µs ± 0% -85.55% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-48 24.2µs ± 0% 2.1µs ± 0% -91.29% (p=0.016 n=5+4)
IndexAnyUTF8/256:64-48 43.7µs ± 0% 2.0µs ± 0% -95.32% (p=0.016 n=5+4)
LastIndexAnyASCII/1:1-48 13.7ns ± 1% 12.8ns ± 0% -6.57% (p=0.016 n=5+4)
LastIndexAnyASCII/1:2-48 14.7ns ± 0% 12.7ns ± 1% -13.33% (p=0.000 n=4+5)
LastIndexAnyASCII/1:4-48 16.9ns ± 0% 12.7ns ± 1% -24.73% (p=0.000 n=4+5)
LastIndexAnyASCII/1:8-48 20.5ns ± 0% 12.7ns ± 0% -37.85% (p=0.000 n=4+5)
LastIndexAnyASCII/1:16-48 28.0ns ± 0% 11.7ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/1:32-48 69.8ns ± 0% 12.4ns ± 0% -82.19% (p=0.008 n=5+5)
LastIndexAnyASCII/1:64-48 73.8ns ± 0% 13.3ns ± 0% -82.03% (p=0.000 n=4+5)
LastIndexAnyASCII/16:1-48 35.5ns ± 0% 35.5ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:2-48 36.0ns ± 0% 36.1ns ± 0% +0.28% (p=0.016 n=4+5)
LastIndexAnyASCII/16:4-48 40.3ns ± 2% 40.0ns ± 6% ~ (p=0.651 n=5+5)
LastIndexAnyASCII/16:8-48 50.3ns ± 0% 50.2ns ± 9% ~ (p=0.175 n=4+5)
LastIndexAnyASCII/16:16-48 62.4ns ± 4% 64.4ns ± 0% +3.28% (p=0.016 n=5+4)
LastIndexAnyASCII/16:32-48 98.9ns ± 0% 98.4ns ± 0% -0.53% (p=0.016 n=5+4)
LastIndexAnyASCII/16:64-48 160ns ± 1% 161ns ± 1% ~ (p=0.325 n=5+5)
LastIndexAnyASCII/256:1-48 300ns ± 0% 301ns ± 0% +0.33% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-48 304ns ± 0% 304ns ± 0% ~ (p=1.000 n=5+5)
LastIndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.556 n=4+5)
LastIndexAnyASCII/256:8-48 320ns ± 0% 321ns ± 0% ~ (p=0.143 n=5+5)
LastIndexAnyASCII/256:16-48 333ns ± 0% 335ns ± 0% +0.60% (p=0.029 n=4+4)
LastIndexAnyASCII/256:32-48 367ns ± 0% 366ns ± 0% ~ (p=0.095 n=4+5)
LastIndexAnyASCII/256:64-48 431ns ± 0% 424ns ± 0% -1.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:1-48 19.7ns ± 1% 11.9ns ± 0% -39.47% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-48 27.6ns ± 1% 11.9ns ± 0% -56.82% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-48 29.9ns ± 0% 11.9ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyUTF8/1:8-48 48.7ns ± 0% 11.9ns ± 0% -75.54% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-48 57.8ns ± 0% 11.4ns ± 0% -80.26% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:32-48 94.7ns ± 0% 12.2ns ± 0% -87.07% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-48 163ns ± 0% 13ns ± 1% -91.93% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-48 258ns ± 0% 88ns ± 0% -65.76% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:2-48 400ns ± 0% 162ns ± 0% -59.38% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-48 415ns ± 0% 162ns ± 0% -60.87% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-48 737ns ± 0% 162ns ± 0% -78.02% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:16-48 882ns ± 0% 128ns ± 0% -85.49% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:32-48 1.47µs ± 0% 0.16µs ± 0% -89.29% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:64-48 2.56µs ± 0% 0.14µs ± 0% -94.41% (p=0.016 n=5+4)
LastIndexAnyUTF8/256:1-48 3.60µs ± 0% 1.23µs ± 0% -65.67% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-48 5.78µs ± 0% 2.18µs ± 0% -62.32% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-48 6.26µs ± 0% 2.18µs ± 0% -65.15% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-48 11.2µs ± 0% 2.2µs ± 0% -80.53% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-48 13.5µs ± 0% 1.9µs ± 0% -86.02% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:32-48 23.0µs ± 0% 2.1µs ± 0% -90.72% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-48 40.5µs ± 0% 2.1µs ± 0% -94.73% (p=0.008 n=5+5)
Change-Id: Ie05e306f8b184b989701868cb161ce8b3f18203b
Reviewed-on: https://go-review.googlesource.com/c/go/+/156998
Run-TryBot: eric fang <eric.fang@arm.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
In order to facilitate optimization of IndexAny and LastIndexAny, this patch moves
three Rabin-Karp related functions indexRabinKarp, hashStr and hashStrRev in strings
package to initernal/bytealg. There are also three functions in the bytes package with
the same names and functions but different parameter types. To highlight this, this
patch also moves them to internal/bytealg and gives them slightly different names.
Related benchmark changes on amd64 and arm64:
name old time/op new time/op delta
pkg:strings goos:linux goarch:amd64
Index-16 14.0ns ± 1% 14.1ns ± 2% ~ (p=0.738 n=5+5)
LastIndex-16 15.5ns ± 1% 15.7ns ± 4% ~ (p=0.897 n=5+5)
pkg:bytes goos:linux goarch:amd64
Index/10-16 26.5ns ± 1% 26.5ns ± 0% ~ (p=0.873 n=5+5)
Index/32-16 26.2ns ± 0% 25.7ns ± 0% -1.68% (p=0.008 n=5+5)
Index/4K-16 5.12µs ± 4% 5.14µs ± 2% ~ (p=0.841 n=5+5)
Index/4M-16 5.44ms ± 3% 5.34ms ± 2% ~ (p=0.056 n=5+5)
Index/64M-16 85.8ms ± 3% 84.6ms ± 0% -1.37% (p=0.016 n=5+5)
name old speed new speed delta
pkg:bytes goos:linux goarch:amd64
Index/10-16 377MB/s ± 1% 377MB/s ± 0% ~ (p=1.000 n=5+5)
Index/32-16 1.22GB/s ± 1% 1.24GB/s ± 0% +1.66% (p=0.008 n=5+5)
Index/4K-16 800MB/s ± 4% 797MB/s ± 2% ~ (p=0.841 n=5+5)
Index/4M-16 771MB/s ± 3% 786MB/s ± 2% ~ (p=0.056 n=5+5)
Index/64M-16 783MB/s ± 3% 793MB/s ± 0% +1.36% (p=0.016 n=5+5)
name old time/op new time/op delta
pkg:strings goos:linux goarch:arm64
Index-8 22.6ns ± 0% 22.5ns ± 0% ~ (p=0.167 n=5+5)
LastIndex-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal)
pkg:bytes goos:linux goarch:arm64
Index/10-8 25.0ns ± 0% 25.0ns ± 0% ~ (all equal)
Index/32-8 160ns ± 0% 160ns ± 0% ~ (all equal)
Index/4K-8 6.26µs ± 0% 6.26µs ± 0% ~ (p=0.167 n=5+5)
Index/4M-8 6.30ms ± 0% 6.31ms ± 0% ~ (p=1.000 n=5+5)
Index/64M-8 101ms ± 0% 101ms ± 0% ~ (p=0.690 n=5+5)
name old speed new speed delta
pkg:bytes goos:linux goarch:arm64
Index/10-8 399MB/s ± 0% 400MB/s ± 0% +0.08% (p=0.008 n=5+5)
Index/32-8 200MB/s ± 0% 200MB/s ± 0% ~ (p=0.127 n=4+5)
Index/4K-8 654MB/s ± 0% 654MB/s ± 0% +0.01% (p=0.016 n=5+5)
Index/4M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.833 n=5+5)
Index/64M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.913 n=5+5)
Change-Id: Icce3bc162bb8613ac36dc963a46c51f8e82ab842
Reviewed-on: https://go-review.googlesource.com/c/go/+/208638
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Add an "offset_" prefix to all cpu feature variable offset constants to
signify that they are not boolean cpu feature variables.
Remove _ from offset constant names.
Change-Id: I6e22a79ebcbe6e2ae54c4ac8764f9260bb3223ff
Reviewed-on: https://go-review.googlesource.com/131215
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Also move the arm64 CountByte implementation while we're here.
Fixes #19792
Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|