aboutsummaryrefslogtreecommitdiff
path: root/src/internal/bytealg/bytealg.go
AgeCommit message (Collapse)Author
2021-08-23all: replace runtime SSE2 detection with GO386 settingMartin Möhrmann
When GO386=sse2 we can assume sse2 to be present without a runtime check. If GO386=softfloat is set we can avoid the usage of SSE2 even if detected. This might cause a memcpy, memclr and bytealg slowdown of Go binaries compiled with softfloat on machines that support SSE2. Such setups are rare and should use GO386=sse2 instead if performance matters. On targets that support SSE2 we avoid the runtime overhead of dynamic cpu feature dispatch. The removal of runtime sse2 checks also allows to simplify internal/cpu further by removing handling of the required feature option as a followup after this CL. Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17 Reviewed-on: https://go-review.googlesource.com/c/go/+/344350 Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2021-04-21internal/bytealg: add power9 version of bytes indexLynn Boger
This adds a power9 version of the bytes.Index function for little endian. Here is the improvement on power9 for some of the Index benchmarks: Index/10 -0.14% Index/32 -3.19% Index/4K -12.66% Index/4M -13.34% Index/64M -13.17% Count/10 -0.59% Count/32 -2.88% Count/4K -12.63% Count/4M -13.35% Count/64M -13.17% IndexHard1 -23.03% IndexHard2 -13.01% IndexHard3 -22.12% IndexHard4 +0.16% CountHard1 -23.02% CountHard2 -13.01% CountHard3 -22.12% IndexPeriodic/IndexPeriodic2 -22.85% IndexPeriodic/IndexPeriodic4 -23.15% Change-Id: Id72353e2771eba2efbb1544d5f0be65f8a9f0433 Reviewed-on: https://go-review.googlesource.com/c/go/+/311380 Run-TryBot: Carlos Eduardo Seo <carlos.seo@linaro.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org> Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-13internal/bytealg: fix typo in IndexRabinKarp{,Bytes} godocTobias Klauser
Change-Id: I09ba19e19b195e345a0fe29d542e0d86529b0d31 Reviewed-on: https://go-review.googlesource.com/c/go/+/261359 Trust: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-03-11strings, bytes: improve IndexAny and LastIndexAny performanceerifan01
For the case of a pattern containing multi-byte rune, the time complexity of the previous algorithm is O(nm), and if both input arguments are long, the search performance will be poor. This CL improves the searching performance for these cases by using IndexRune, which is mainly implemented with IndexByte and Index. As IndexByte and Index are specially optimized with some powerful instructions for short patterns (an UTF8 rune is 1 to 4 bytes), so they can help to reduce the runtime complexity of IndexAny and LastIndexAny. Another optimization method is using hash table, however, the actual test results show that using indexrune is better, and the space complexity is lower. There are two fast paths in IndexAny and LastIndexAny for cases where the length of the input arguements are 1, and their locations are not exactly the same, which is determined based on the actual test results. Benchmarks on arm64 and amd64: name old time/op new time/op delta pkg:strings goos:linux goarch:arm64 IndexAnyASCII/1:1-8 23.7ns ± 3% 28.5ns ± 0% +20.15% (p=0.008 n=5+5) IndexAnyASCII/1:2-8 18.0ns ± 0% 33.1ns ± 0% +83.67% (p=0.008 n=5+5) IndexAnyASCII/1:4-8 20.0ns ± 0% 36.0ns ± 0% +80.00% (p=0.029 n=4+4) IndexAnyASCII/1:8-8 36.1ns ± 0% 36.0ns ± 0% ~ (p=0.095 n=5+4) IndexAnyASCII/1:16-8 48.1ns ± 0% 36.0ns ± 0% -25.19% (p=0.029 n=4+4) IndexAnyASCII/1:32-8 72.1ns ± 0% 36.0ns ± 0% -50.01% (p=0.008 n=5+5) IndexAnyASCII/1:64-8 120ns ± 0% 39ns ± 0% -67.83% (p=0.008 n=5+5) IndexAnyASCII/16:1-8 73.0ns ± 0% 28.5ns ± 0% -60.95% (p=0.008 n=5+5) IndexAnyASCII/16:2-8 76.8ns ± 0% 77.0ns ± 0% ~ (p=1.000 n=5+5) IndexAnyASCII/16:4-8 83.2ns ± 1% 83.0ns ± 0% ~ (p=0.770 n=5+5) IndexAnyASCII/16:8-8 111ns ± 1% 107ns ± 0% -3.25% (p=0.008 n=5+5) IndexAnyASCII/16:16-8 139ns ± 1% 137ns ± 0% -1.58% (p=0.008 n=5+5) IndexAnyASCII/16:32-8 199ns ± 1% 197ns ± 0% -1.20% (p=0.008 n=5+5) IndexAnyASCII/16:64-8 307ns ± 0% 313ns ± 0% +1.82% (p=0.016 n=5+4) IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 0% -90.31% (p=0.008 n=5+5) IndexAnyASCII/256:2-8 678ns ± 0% 683ns ± 0% +0.68% (p=0.008 n=5+5) IndexAnyASCII/256:4-8 685ns ± 0% 683ns ± 0% -0.29% (p=0.000 n=5+4) IndexAnyASCII/256:8-8 711ns ± 0% 708ns ± 0% -0.48% (p=0.008 n=5+5) IndexAnyASCII/256:16-8 740ns ± 0% 740ns ± 0% ~ (p=0.444 n=5+5) IndexAnyASCII/256:32-8 799ns ± 0% 798ns ± 0% -0.18% (p=0.008 n=5+5) IndexAnyASCII/256:64-8 910ns ± 0% 914ns ± 0% +0.44% (p=0.016 n=4+5) IndexAnyUTF8/1:1-8 27.1ns ± 0% 19.0ns ± 0% -29.79% (p=0.008 n=5+5) IndexAnyUTF8/1:2-8 44.1ns ± 0% 33.0ns ± 0% -25.17% (p=0.008 n=5+5) IndexAnyUTF8/1:4-8 46.1ns ± 0% 33.1ns ± 0% -28.29% (p=0.016 n=4+5) IndexAnyUTF8/1:8-8 85.1ns ± 0% 33.0ns ± 0% -61.18% (p=0.008 n=5+5) IndexAnyUTF8/1:16-8 110ns ± 1% 36ns ± 0% -67.27% (p=0.008 n=5+5) IndexAnyUTF8/1:32-8 188ns ± 0% 36ns ± 0% -80.85% (p=0.008 n=5+5) IndexAnyUTF8/1:64-8 332ns ± 0% 39ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/16:1-8 293ns ± 0% 54ns ± 0% -81.56% (p=0.008 n=5+5) IndexAnyUTF8/16:2-8 563ns ± 0% 349ns ± 0% -37.98% (p=0.008 n=5+5) IndexAnyUTF8/16:4-8 546ns ± 1% 349ns ± 0% -36.10% (p=0.000 n=5+4) IndexAnyUTF8/16:8-8 1.22µs ± 0% 0.35µs ± 0% -71.39% (p=0.008 n=5+5) IndexAnyUTF8/16:16-8 1.63µs ± 1% 0.42µs ± 0% -73.98% (p=0.008 n=5+5) IndexAnyUTF8/16:32-8 2.87µs ± 0% 0.42µs ± 0% -85.22% (p=0.008 n=5+5) IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.47µs ± 0% -90.98% (p=0.008 n=5+5) IndexAnyUTF8/256:1-8 4.26µs ± 0% 0.47µs ± 0% -88.85% (p=0.000 n=4+5) IndexAnyUTF8/256:2-8 8.62µs ± 0% 5.15µs ± 0% -40.21% (p=0.008 n=5+5) IndexAnyUTF8/256:4-8 8.25µs ± 0% 5.15µs ± 0% -37.50% (p=0.016 n=5+4) IndexAnyUTF8/256:8-8 19.2µs ± 1% 5.2µs ± 0% -73.08% (p=0.016 n=5+4) IndexAnyUTF8/256:16-8 25.6µs ± 1% 6.3µs ± 0% -75.32% (p=0.008 n=5+5) IndexAnyUTF8/256:32-8 45.6µs ± 0% 6.3µs ± 0% -86.15% (p=0.008 n=5+5) IndexAnyUTF8/256:64-8 82.4µs ± 0% 7.0µs ± 0% -91.53% (p=0.016 n=5+4) LastIndexAnyASCII/1:1-8 23.0ns ± 0% 33.5ns ± 0% +45.65% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-8 24.5ns ± 0% 33.5ns ± 0% +36.73% (p=0.016 n=4+5) LastIndexAnyASCII/1:4-8 27.5ns ± 0% 35.5ns ± 0% +29.09% (p=0.008 n=5+5) LastIndexAnyASCII/1:8-8 44.5ns ± 0% 35.5ns ± 0% -20.13% (p=0.008 n=5+5) LastIndexAnyASCII/1:16-8 56.5ns ± 0% 35.5ns ± 0% -37.15% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-8 80.3ns ± 0% 35.5ns ± 0% -55.79% (p=0.000 n=5+4) LastIndexAnyASCII/1:64-8 129ns ± 0% 40ns ± 0% -68.85% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-8 72.8ns ± 0% 72.7ns ± 0% -0.19% (p=0.016 n=4+5) LastIndexAnyASCII/16:2-8 75.4ns ± 0% 75.1ns ± 0% ~ (p=0.127 n=5+5) LastIndexAnyASCII/16:4-8 81.9ns ± 1% 80.2ns ± 0% -2.00% (p=0.008 n=5+5) LastIndexAnyASCII/16:8-8 110ns ± 1% 108ns ± 0% -1.46% (p=0.008 n=5+5) LastIndexAnyASCII/16:16-8 138ns ± 1% 134ns ± 0% -3.18% (p=0.008 n=5+5) LastIndexAnyASCII/16:32-8 198ns ± 0% 197ns ± 0% -0.51% (p=0.008 n=5+5) LastIndexAnyASCII/16:64-8 309ns ± 0% 313ns ± 0% +1.30% (p=0.008 n=5+5) LastIndexAnyASCII/256:1-8 652ns ± 0% 653ns ± 0% +0.21% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-8 656ns ± 0% 656ns ± 0% ~ (all equal) LastIndexAnyASCII/256:4-8 663ns ± 0% 663ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyASCII/256:8-8 691ns ± 0% 690ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/256:16-8 719ns ± 0% 715ns ± 0% -0.53% (p=0.000 n=5+4) LastIndexAnyASCII/256:32-8 779ns ± 0% 780ns ± 0% +0.13% (p=0.029 n=4+4) LastIndexAnyASCII/256:64-8 890ns ± 0% 894ns ± 0% +0.45% (p=0.008 n=5+5) LastIndexAnyUTF8/1:1-8 31.6ns ± 0% 33.5ns ± 0% +6.01% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 33.5ns ± 0% -30.99% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-8 48.6ns ± 0% 33.5ns ± 0% -31.13% (p=0.000 n=5+4) LastIndexAnyUTF8/1:8-8 89.6ns ± 0% 33.5ns ± 0% -62.56% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-8 113ns ± 1% 36ns ± 0% -68.47% (p=0.000 n=5+4) LastIndexAnyUTF8/1:32-8 190ns ± 0% 36ns ± 0% -81.26% (p=0.029 n=4+4) LastIndexAnyUTF8/1:64-8 327ns ± 0% 40ns ± 0% -87.77% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-8 364ns ± 0% 158ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyUTF8/16:2-8 636ns ± 0% 472ns ± 0% -25.79% (p=0.000 n=5+4) LastIndexAnyUTF8/16:4-8 630ns ± 0% 472ns ± 0% -25.03% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-8 1.28µs ± 0% 0.47µs ± 0% -63.09% (p=0.016 n=5+4) LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.53µs ± 0% -68.39% (p=0.016 n=5+4) LastIndexAnyUTF8/16:32-8 2.88µs ± 0% 0.53µs ± 0% -81.72% (p=0.008 n=5+5) LastIndexAnyUTF8/16:64-8 5.08µs ± 0% 0.57µs ± 0% -88.79% (p=0.008 n=5+5) LastIndexAnyUTF8/256:1-8 5.41µs ± 0% 2.03µs ± 0% -62.46% (p=0.016 n=4+5) LastIndexAnyUTF8/256:2-8 9.77µs ± 0% 7.14µs ± 0% -26.97% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-8 9.63µs ± 0% 7.14µs ± 0% -25.86% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-8 20.0µs ± 0% 7.1µs ± 0% -64.30% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-8 26.1µs ± 1% 8.0µs ± 0% -69.40% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-8 45.6µs ± 1% 8.0µs ± 0% -82.51% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-8 80.8µs ± 0% 8.6µs ± 0% -89.33% (p=0.016 n=5+4) pkg:bytes goos:linux goarch:arm64 IndexAnyASCII/1:1-8 26.2ns ± 1% 26.5ns ± 0% +1.30% (p=0.016 n=5+4) IndexAnyASCII/1:2-8 18.5ns ± 0% 26.5ns ± 0% +43.24% (p=0.008 n=5+5) IndexAnyASCII/1:4-8 21.0ns ± 0% 26.5ns ± 0% +26.38% (p=0.008 n=5+5) IndexAnyASCII/1:8-8 37.5ns ± 0% 26.5ns ± 0% -29.33% (p=0.000 n=5+4) IndexAnyASCII/1:16-8 49.6ns ± 0% 26.5ns ± 0% -46.49% (p=0.008 n=5+5) IndexAnyASCII/1:32-8 73.6ns ± 0% 30.1ns ± 0% -59.16% (p=0.008 n=5+5) IndexAnyASCII/1:64-8 122ns ± 0% 33ns ± 0% -73.23% (p=0.008 n=5+5) IndexAnyASCII/16:1-8 73.7ns ± 0% 33.4ns ± 0% -54.71% (p=0.008 n=5+5) IndexAnyASCII/16:2-8 79.1ns ± 0% 78.9ns ± 0% -0.30% (p=0.016 n=4+5) IndexAnyASCII/16:4-8 84.8ns ± 0% 86.1ns ± 0% +1.58% (p=0.016 n=5+4) IndexAnyASCII/16:8-8 111ns ± 0% 111ns ± 0% ~ (all equal) IndexAnyASCII/16:16-8 139ns ± 0% 144ns ± 0% +3.60% (p=0.016 n=4+5) IndexAnyASCII/16:32-8 196ns ± 0% 207ns ± 0% +5.61% (p=0.016 n=5+4) IndexAnyASCII/16:64-8 311ns ± 0% 320ns ± 0% +2.89% (p=0.016 n=4+5) IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 1% -90.35% (p=0.008 n=5+5) IndexAnyASCII/256:2-8 680ns ± 0% 680ns ± 0% ~ (p=0.444 n=5+5) IndexAnyASCII/256:4-8 686ns ± 0% 687ns ± 0% ~ (p=0.167 n=5+5) IndexAnyASCII/256:8-8 713ns ± 0% 712ns ± 0% -0.14% (p=0.008 n=5+5) IndexAnyASCII/256:16-8 740ns ± 0% 744ns ± 0% +0.54% (p=0.016 n=5+4) IndexAnyASCII/256:32-8 797ns ± 0% 808ns ± 0% +1.43% (p=0.008 n=5+5) IndexAnyASCII/256:64-8 912ns ± 0% 921ns ± 0% +0.99% (p=0.016 n=4+5) IndexAnyUTF8/1:1-8 27.5ns ± 0% 26.5ns ± 0% -3.64% (p=0.008 n=5+5) IndexAnyUTF8/1:2-8 44.5ns ± 0% 26.5ns ± 0% -40.50% (p=0.008 n=5+5) IndexAnyUTF8/1:4-8 45.6ns ± 0% 26.5ns ± 0% -41.89% (p=0.000 n=5+4) IndexAnyUTF8/1:8-8 85.8ns ± 1% 26.5ns ± 0% -69.11% (p=0.008 n=5+5) IndexAnyUTF8/1:16-8 110ns ± 1% 26ns ± 0% -76.00% (p=0.016 n=5+4) IndexAnyUTF8/1:32-8 188ns ± 0% 30ns ± 0% -84.04% (p=0.008 n=5+5) IndexAnyUTF8/1:64-8 333ns ± 0% 33ns ± 0% -90.20% (p=0.008 n=5+5) IndexAnyUTF8/16:1-8 294ns ± 0% 235ns ± 0% -20.07% (p=0.008 n=5+5) IndexAnyUTF8/16:2-8 563ns ± 0% 309ns ± 0% -45.12% (p=0.008 n=5+5) IndexAnyUTF8/16:4-8 558ns ± 1% 309ns ± 0% -44.60% (p=0.000 n=5+4) IndexAnyUTF8/16:8-8 1.23µs ± 0% 0.31µs ± 0% -74.79% (p=0.008 n=5+5) IndexAnyUTF8/16:16-8 1.62µs ± 2% 0.31µs ± 0% -80.93% (p=0.008 n=5+5) IndexAnyUTF8/16:32-8 2.86µs ± 0% 0.38µs ± 0% -86.87% (p=0.008 n=5+5) IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.42µs ± 0% -91.86% (p=0.008 n=5+5) IndexAnyUTF8/256:1-8 4.27µs ± 1% 3.30µs ± 1% -22.75% (p=0.008 n=5+5) IndexAnyUTF8/256:2-8 8.61µs ± 0% 4.45µs ± 0% -48.31% (p=0.016 n=4+5) IndexAnyUTF8/256:4-8 8.44µs ± 0% 4.45µs ± 0% -47.23% (p=0.008 n=5+5) IndexAnyUTF8/256:8-8 19.2µs ± 0% 4.5µs ± 0% -76.78% (p=0.008 n=5+5) IndexAnyUTF8/256:16-8 25.6µs ± 0% 4.5µs ± 0% -82.63% (p=0.008 n=5+5) IndexAnyUTF8/256:32-8 45.4µs ± 0% 5.5µs ± 0% -87.85% (p=0.016 n=4+5) IndexAnyUTF8/256:64-8 82.5µs ± 0% 6.2µs ± 0% -92.49% (p=0.008 n=5+5) LastIndexAnyASCII/1:1-8 23.0ns ± 0% 26.5ns ± 0% +15.02% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-8 24.5ns ± 0% 26.5ns ± 0% +8.16% (p=0.008 n=5+5) LastIndexAnyASCII/1:4-8 27.8ns ± 0% 26.5ns ± 0% -4.68% (p=0.029 n=4+4) LastIndexAnyASCII/1:8-8 45.1ns ± 1% 26.5ns ± 0% -41.29% (p=0.000 n=5+4) LastIndexAnyASCII/1:16-8 57.1ns ± 0% 26.5ns ± 0% -53.61% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-8 81.5ns ± 0% 30.0ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/1:64-8 129ns ± 0% 32ns ± 0% -74.81% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-8 72.6ns ± 0% 72.1ns ± 0% -0.63% (p=0.000 n=4+5) LastIndexAnyASCII/16:2-8 77.2ns ± 0% 77.2ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/16:4-8 83.1ns ± 0% 83.2ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyASCII/16:8-8 109ns ± 1% 108ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/16:16-8 136ns ± 0% 136ns ± 0% ~ (all equal) LastIndexAnyASCII/16:32-8 195ns ± 0% 197ns ± 0% +0.82% (p=0.008 n=5+5) LastIndexAnyASCII/16:64-8 309ns ± 0% 309ns ± 0% ~ (all equal) LastIndexAnyASCII/256:1-8 653ns ± 0% 657ns ± 0% +0.61% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-8 659ns ± 0% 658ns ± 0% ~ (p=0.167 n=5+5) LastIndexAnyASCII/256:4-8 664ns ± 0% 663ns ± 0% ~ (p=0.095 n=5+4) LastIndexAnyASCII/256:8-8 698ns ± 0% 689ns ± 0% -1.29% (p=0.008 n=5+5) LastIndexAnyASCII/256:16-8 726ns ± 0% 717ns ± 0% -1.24% (p=0.008 n=5+5) LastIndexAnyASCII/256:32-8 777ns ± 0% 779ns ± 0% ~ (p=0.079 n=5+4) LastIndexAnyASCII/256:64-8 889ns ± 0% 890ns ± 0% ~ (p=0.444 n=5+5) LastIndexAnyUTF8/1:1-8 32.1ns ± 0% 26.5ns ± 0% -17.45% (p=0.000 n=5+4) LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 26.5ns ± 0% -45.52% (p=0.000 n=5+4) LastIndexAnyUTF8/1:4-8 49.6ns ± 0% 26.5ns ± 0% -46.62% (p=0.008 n=5+5) LastIndexAnyUTF8/1:8-8 91.9ns ± 0% 26.5ns ± 0% -71.18% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-8 114ns ± 1% 26ns ± 0% -76.84% (p=0.000 n=5+4) LastIndexAnyUTF8/1:32-8 203ns ± 6% 30ns ± 0% -85.25% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-8 330ns ± 0% 33ns ± 0% -90.14% (p=0.000 n=4+5) LastIndexAnyUTF8/16:1-8 365ns ± 0% 164ns ± 0% -55.04% (p=0.008 n=5+5) LastIndexAnyUTF8/16:2-8 638ns ± 0% 296ns ± 0% -53.58% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-8 634ns ± 0% 296ns ± 0% -53.31% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-8 1.30µs ± 0% 0.30µs ± 0% -77.18% (p=0.000 n=4+5) LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.30µs ± 0% -82.19% (p=0.008 n=5+5) LastIndexAnyUTF8/16:32-8 2.90µs ± 0% 0.38µs ± 0% -87.00% (p=0.029 n=4+4) LastIndexAnyUTF8/16:64-8 5.10µs ± 0% 0.42µs ± 0% -91.78% (p=0.008 n=5+5) LastIndexAnyUTF8/256:1-8 5.42µs ± 0% 2.12µs ± 0% -60.92% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-8 9.79µs ± 0% 4.26µs ± 0% -56.47% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-8 9.66µs ± 0% 4.26µs ± 0% -55.87% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-8 20.4µs ± 0% 4.3µs ± 0% -79.10% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-8 26.0µs ± 1% 4.3µs ± 0% -83.62% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-8 46.0µs ± 0% 5.5µs ± 0% -88.09% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-8 81.1µs ± 0% 6.2µs ± 0% -92.38% (p=0.008 n=5+5) name old time/op new time/op delta pkg:strings goos:linux goarch:amd64 IndexAnyASCII/1:1-48 10.0ns ± 0% 13.3ns ± 0% +33.00% (p=0.008 n=5+5) IndexAnyASCII/1:2-48 11.0ns ± 0% 15.5ns ± 0% +40.55% (p=0.016 n=4+5) IndexAnyASCII/1:4-48 12.9ns ± 0% 15.4ns ± 0% +19.69% (p=0.008 n=5+5) IndexAnyASCII/1:8-48 18.6ns ± 0% 15.5ns ± 0% -16.45% (p=0.000 n=4+5) IndexAnyASCII/1:16-48 30.1ns ± 0% 16.9ns ± 0% ~ (p=0.079 n=4+5) IndexAnyASCII/1:32-48 53.1ns ± 0% 18.6ns ± 0% -64.95% (p=0.000 n=5+4) IndexAnyASCII/1:64-48 98.9ns ± 0% 17.4ns ± 0% -82.41% (p=0.000 n=5+4) IndexAnyASCII/16:1-48 35.0ns ± 0% 14.2ns ± 0% -59.47% (p=0.000 n=5+4) IndexAnyASCII/16:2-48 35.5ns ± 0% 35.6ns ± 0% ~ (p=0.238 n=5+4) IndexAnyASCII/16:4-48 40.8ns ± 0% 40.7ns ± 1% ~ (p=0.643 n=5+5) IndexAnyASCII/16:8-48 50.8ns ± 0% 50.9ns ± 1% ~ (p=1.000 n=4+5) IndexAnyASCII/16:16-48 64.0ns ± 1% 64.5ns ± 1% ~ (p=0.071 n=5+5) IndexAnyASCII/16:32-48 98.3ns ± 0% 100.8ns ± 1% +2.52% (p=0.008 n=5+5) IndexAnyASCII/16:64-48 156ns ± 0% 157ns ± 0% ~ (p=0.238 n=4+5) IndexAnyASCII/256:1-48 299ns ± 0% 24ns ± 3% -92.12% (p=0.008 n=5+5) IndexAnyASCII/256:2-48 303ns ± 0% 304ns ± 0% ~ (p=0.762 n=5+5) IndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.476 n=5+5) IndexAnyASCII/256:8-48 321ns ± 0% 321ns ± 0% ~ (p=0.429 n=4+5) IndexAnyASCII/256:16-48 334ns ± 0% 335ns ± 0% ~ (p=0.079 n=5+4) IndexAnyASCII/256:32-48 367ns ± 0% 365ns ± 0% ~ (p=0.079 n=4+5) IndexAnyASCII/256:64-48 431ns ± 1% 421ns ± 0% -2.27% (p=0.008 n=5+5) IndexAnyUTF8/1:1-48 17.2ns ± 0% 10.8ns ± 0% -37.21% (p=0.029 n=4+4) IndexAnyUTF8/1:2-48 26.7ns ± 0% 15.6ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/1:4-48 28.2ns ± 0% 15.6ns ± 0% -44.68% (p=0.000 n=5+4) IndexAnyUTF8/1:8-48 48.8ns ± 0% 15.6ns ± 0% -68.03% (p=0.029 n=4+4) IndexAnyUTF8/1:16-48 58.3ns ± 0% 16.2ns ± 0% ~ (p=0.079 n=4+5) IndexAnyUTF8/1:32-48 103ns ± 0% 18ns ± 0% -82.27% (p=0.008 n=5+5) IndexAnyUTF8/1:64-48 182ns ± 0% 17ns ± 0% -90.53% (p=0.008 n=5+5) IndexAnyUTF8/16:1-48 197ns ± 0% 25ns ± 0% -87.34% (p=0.000 n=5+4) IndexAnyUTF8/16:2-48 348ns ± 0% 163ns ± 0% -53.11% (p=0.000 n=5+4) IndexAnyUTF8/16:4-48 374ns ± 0% 163ns ± 0% -56.37% (p=0.000 n=5+4) IndexAnyUTF8/16:8-48 716ns ± 0% 163ns ± 0% -77.22% (p=0.000 n=5+4) IndexAnyUTF8/16:16-48 859ns ± 0% 175ns ± 0% -79.63% (p=0.000 n=5+4) IndexAnyUTF8/16:32-48 1.58µs ± 0% 0.20µs ± 0% -87.01% (p=0.029 n=4+4) IndexAnyUTF8/16:64-48 2.84µs ± 0% 0.19µs ± 1% -93.34% (p=0.008 n=5+5) IndexAnyUTF8/256:1-48 2.61µs ± 0% 0.27µs ± 0% -89.81% (p=0.008 n=5+5) IndexAnyUTF8/256:2-48 4.95µs ± 0% 2.23µs ± 0% -54.91% (p=0.016 n=5+4) IndexAnyUTF8/256:4-48 5.55µs ± 0% 2.23µs ± 0% -59.72% (p=0.008 n=5+5) IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.2µs ± 0% -79.39% (p=0.008 n=5+5) IndexAnyUTF8/256:16-48 13.1µs ± 0% 2.5µs ± 0% -81.21% (p=0.016 n=4+5) IndexAnyUTF8/256:32-48 24.7µs ± 0% 2.8µs ± 0% -88.49% (p=0.008 n=5+5) IndexAnyUTF8/256:64-48 45.0µs ± 0% 2.6µs ± 1% -94.23% (p=0.008 n=5+5) LastIndexAnyASCII/1:1-48 13.9ns ± 0% 15.2ns ± 0% +9.35% (p=0.008 n=5+5) LastIndexAnyASCII/1:2-48 14.4ns ± 0% 15.2ns ± 0% +5.56% (p=0.008 n=5+5) LastIndexAnyASCII/1:4-48 16.7ns ± 0% 15.2ns ± 0% -8.98% (p=0.008 n=5+5) LastIndexAnyASCII/1:8-48 24.0ns ± 0% 15.2ns ± 0% -36.67% (p=0.008 n=5+5) LastIndexAnyASCII/1:16-48 35.6ns ± 0% 15.0ns ± 0% -57.82% (p=0.008 n=5+5) LastIndexAnyASCII/1:32-48 68.9ns ± 0% 16.7ns ± 0% -75.75% (p=0.008 n=5+5) LastIndexAnyASCII/1:64-48 104ns ± 0% 17ns ± 1% -83.81% (p=0.008 n=5+5) LastIndexAnyASCII/16:1-48 35.0ns ± 0% 35.0ns ± 0% ~ (all equal) LastIndexAnyASCII/16:2-48 35.6ns ± 0% 35.6ns ± 0% ~ (all equal) LastIndexAnyASCII/16:4-48 41.0ns ± 0% 40.8ns ± 0% -0.49% (p=0.032 n=5+5) LastIndexAnyASCII/16:8-48 50.9ns ± 0% 50.7ns ± 1% ~ (p=0.397 n=5+5) LastIndexAnyASCII/16:16-48 64.3ns ± 1% 64.4ns ± 1% ~ (p=1.000 n=4+5) LastIndexAnyASCII/16:32-48 100ns ± 0% 100ns ± 0% +0.38% (p=0.016 n=4+5) LastIndexAnyASCII/16:64-48 157ns ± 1% 163ns ± 0% +3.82% (p=0.008 n=5+5) LastIndexAnyASCII/256:1-48 302ns ± 0% 300ns ± 0% -0.53% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-48 305ns ± 0% 303ns ± 0% -0.66% (p=0.000 n=5+4) LastIndexAnyASCII/256:4-48 313ns ± 0% 307ns ± 0% -2.04% (p=0.000 n=4+5) LastIndexAnyASCII/256:8-48 323ns ± 0% 315ns ± 0% -2.48% (p=0.029 n=4+4) LastIndexAnyASCII/256:16-48 333ns ± 0% 332ns ± 0% -0.30% (p=0.048 n=5+5) LastIndexAnyASCII/256:32-48 366ns ± 0% 367ns ± 0% ~ (p=0.238 n=4+5) LastIndexAnyASCII/256:64-48 430ns ± 0% 430ns ± 0% ~ (p=1.000 n=5+5) LastIndexAnyUTF8/1:1-48 21.1ns ± 0% 13.9ns ± 0% -34.00% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-48 29.5ns ± 0% 13.9ns ± 0% -52.95% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-48 31.6ns ± 0% 13.9ns ± 0% -55.96% (p=0.008 n=5+5) LastIndexAnyUTF8/1:8-48 51.1ns ± 0% 13.9ns ± 0% -72.81% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-48 58.9ns ± 0% 14.6ns ± 0% -75.23% (p=0.016 n=5+4) LastIndexAnyUTF8/1:32-48 103ns ± 0% 16ns ± 1% -84.12% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-48 177ns ± 0% 17ns ± 1% -90.62% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-48 275ns ± 1% 105ns ± 0% -61.85% (p=0.000 n=5+4) LastIndexAnyUTF8/16:2-48 406ns ± 0% 216ns ± 0% -46.70% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-48 458ns ± 0% 216ns ± 0% -52.75% (p=0.000 n=4+5) LastIndexAnyUTF8/16:8-48 753ns ± 0% 216ns ± 0% -71.31% (p=0.029 n=4+4) LastIndexAnyUTF8/16:16-48 902ns ± 0% 221ns ± 0% -75.50% (p=0.016 n=5+4) LastIndexAnyUTF8/16:32-48 1.57µs ± 0% 0.24µs ± 0% -84.46% (p=0.008 n=5+5) LastIndexAnyUTF8/16:64-48 2.77µs ± 0% 0.24µs ± 0% -91.22% (p=0.000 n=5+4) LastIndexAnyUTF8/256:1-48 4.06µs ± 0% 1.53µs ± 0% -62.26% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-48 5.92µs ± 0% 3.04µs ± 0% -48.55% (p=0.016 n=4+5) LastIndexAnyUTF8/256:4-48 6.82µs ± 0% 3.04µs ± 0% -55.34% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-48 11.5µs ± 0% 3.0µs ± 0% -73.48% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-48 14.1µs ± 0% 3.1µs ± 0% -77.85% (p=0.008 n=5+5) LastIndexAnyUTF8/256:32-48 24.5µs ± 0% 3.5µs ± 0% -85.85% (p=0.016 n=5+4) LastIndexAnyUTF8/256:64-48 44.0µs ± 0% 3.5µs ± 0% -92.12% (p=0.008 n=5+5) pkg:bytes goos:linux goarch:amd64 IndexAnyASCII/1:1-48 9.56ns ± 0% 11.00ns ± 0% +15.06% (p=0.016 n=5+4) IndexAnyASCII/1:2-48 11.0ns ± 0% 10.8ns ± 2% -1.64% (p=0.048 n=5+5) IndexAnyASCII/1:4-48 13.9ns ± 0% 11.0ns ± 1% -21.15% (p=0.008 n=5+5) IndexAnyASCII/1:8-48 19.6ns ± 0% 10.8ns ± 3% -44.90% (p=0.008 n=5+5) IndexAnyASCII/1:16-48 31.1ns ± 0% 11.5ns ± 0% -63.02% (p=0.008 n=5+5) IndexAnyASCII/1:32-48 54.0ns ± 0% 11.8ns ± 0% -78.15% (p=0.000 n=5+4) IndexAnyASCII/1:64-48 100ns ± 0% 13ns ± 0% -86.89% (p=0.008 n=5+5) IndexAnyASCII/16:1-48 35.5ns ± 0% 14.8ns ± 0% -58.26% (p=0.008 n=5+5) IndexAnyASCII/16:2-48 36.2ns ± 1% 36.0ns ± 1% ~ (p=0.087 n=5+5) IndexAnyASCII/16:4-48 40.3ns ± 1% 39.7ns ± 4% ~ (p=0.175 n=4+5) IndexAnyASCII/16:8-48 48.7ns ± 5% 45.8ns ± 0% -6.02% (p=0.016 n=5+4) IndexAnyASCII/16:16-48 64.1ns ±11% 62.1ns ± 1% ~ (p=0.143 n=5+5) IndexAnyASCII/16:32-48 97.9ns ± 1% 98.3ns ± 1% ~ (p=0.294 n=5+5) IndexAnyASCII/16:64-48 163ns ± 0% 157ns ± 0% -3.68% (p=0.008 n=5+5) IndexAnyASCII/256:1-48 389ns ± 0% 25ns ± 0% -93.65% (p=0.000 n=5+4) IndexAnyASCII/256:2-48 391ns ± 0% 307ns ± 0% -21.48% (p=0.000 n=5+4) IndexAnyASCII/256:4-48 394ns ± 0% 323ns ± 0% -17.92% (p=0.008 n=5+5) IndexAnyASCII/256:8-48 402ns ± 0% 323ns ± 0% -19.51% (p=0.008 n=5+5) IndexAnyASCII/256:16-48 414ns ± 0% 334ns ± 0% -19.32% (p=0.016 n=4+5) IndexAnyASCII/256:32-48 446ns ± 0% 367ns ± 0% -17.75% (p=0.016 n=5+4) IndexAnyASCII/256:64-48 511ns ± 0% 424ns ± 0% -17.02% (p=0.008 n=5+5) IndexAnyUTF8/1:1-48 17.4ns ± 0% 11.0ns ± 0% -36.64% (p=0.008 n=5+5) IndexAnyUTF8/1:2-48 27.3ns ± 1% 11.0ns ± 0% -59.74% (p=0.008 n=5+5) IndexAnyUTF8/1:4-48 28.7ns ± 0% 11.0ns ± 0% -61.73% (p=0.008 n=5+5) IndexAnyUTF8/1:8-48 49.2ns ± 0% 11.0ns ± 0% -77.66% (p=0.008 n=5+5) IndexAnyUTF8/1:16-48 56.0ns ± 0% 11.5ns ± 0% -79.46% (p=0.000 n=5+4) IndexAnyUTF8/1:32-48 102ns ± 0% 12ns ± 0% -88.24% (p=0.008 n=5+5) IndexAnyUTF8/1:64-48 177ns ± 0% 13ns ± 0% -92.51% (p=0.008 n=5+5) IndexAnyUTF8/16:1-48 212ns ± 0% 112ns ± 0% -47.17% (p=0.008 n=5+5) IndexAnyUTF8/16:2-48 356ns ± 0% 159ns ± 1% -55.28% (p=0.000 n=4+5) IndexAnyUTF8/16:4-48 372ns ± 0% 158ns ± 0% -57.47% (p=0.008 n=5+5) IndexAnyUTF8/16:8-48 712ns ± 0% 159ns ± 1% -77.70% (p=0.008 n=5+5) IndexAnyUTF8/16:16-48 829ns ± 0% 129ns ± 0% -84.44% (p=0.008 n=5+5) IndexAnyUTF8/16:32-48 1.55µs ± 0% 0.16µs ± 0% -89.87% (p=0.008 n=5+5) IndexAnyUTF8/16:64-48 2.77µs ± 0% 0.14µs ± 0% -94.94% (p=0.008 n=5+5) IndexAnyUTF8/256:1-48 2.85µs ± 0% 1.63µs ± 1% -42.74% (p=0.008 n=5+5) IndexAnyUTF8/256:2-48 5.14µs ± 1% 2.03µs ± 0% -60.51% (p=0.008 n=5+5) IndexAnyUTF8/256:4-48 5.56µs ± 0% 2.03µs ± 0% -63.52% (p=0.008 n=5+5) IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.0µs ± 0% -81.22% (p=0.008 n=5+5) IndexAnyUTF8/256:16-48 12.9µs ± 0% 1.9µs ± 0% -85.55% (p=0.008 n=5+5) IndexAnyUTF8/256:32-48 24.2µs ± 0% 2.1µs ± 0% -91.29% (p=0.016 n=5+4) IndexAnyUTF8/256:64-48 43.7µs ± 0% 2.0µs ± 0% -95.32% (p=0.016 n=5+4) LastIndexAnyASCII/1:1-48 13.7ns ± 1% 12.8ns ± 0% -6.57% (p=0.016 n=5+4) LastIndexAnyASCII/1:2-48 14.7ns ± 0% 12.7ns ± 1% -13.33% (p=0.000 n=4+5) LastIndexAnyASCII/1:4-48 16.9ns ± 0% 12.7ns ± 1% -24.73% (p=0.000 n=4+5) LastIndexAnyASCII/1:8-48 20.5ns ± 0% 12.7ns ± 0% -37.85% (p=0.000 n=4+5) LastIndexAnyASCII/1:16-48 28.0ns ± 0% 11.7ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyASCII/1:32-48 69.8ns ± 0% 12.4ns ± 0% -82.19% (p=0.008 n=5+5) LastIndexAnyASCII/1:64-48 73.8ns ± 0% 13.3ns ± 0% -82.03% (p=0.000 n=4+5) LastIndexAnyASCII/16:1-48 35.5ns ± 0% 35.5ns ± 0% ~ (all equal) LastIndexAnyASCII/16:2-48 36.0ns ± 0% 36.1ns ± 0% +0.28% (p=0.016 n=4+5) LastIndexAnyASCII/16:4-48 40.3ns ± 2% 40.0ns ± 6% ~ (p=0.651 n=5+5) LastIndexAnyASCII/16:8-48 50.3ns ± 0% 50.2ns ± 9% ~ (p=0.175 n=4+5) LastIndexAnyASCII/16:16-48 62.4ns ± 4% 64.4ns ± 0% +3.28% (p=0.016 n=5+4) LastIndexAnyASCII/16:32-48 98.9ns ± 0% 98.4ns ± 0% -0.53% (p=0.016 n=5+4) LastIndexAnyASCII/16:64-48 160ns ± 1% 161ns ± 1% ~ (p=0.325 n=5+5) LastIndexAnyASCII/256:1-48 300ns ± 0% 301ns ± 0% +0.33% (p=0.008 n=5+5) LastIndexAnyASCII/256:2-48 304ns ± 0% 304ns ± 0% ~ (p=1.000 n=5+5) LastIndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.556 n=4+5) LastIndexAnyASCII/256:8-48 320ns ± 0% 321ns ± 0% ~ (p=0.143 n=5+5) LastIndexAnyASCII/256:16-48 333ns ± 0% 335ns ± 0% +0.60% (p=0.029 n=4+4) LastIndexAnyASCII/256:32-48 367ns ± 0% 366ns ± 0% ~ (p=0.095 n=4+5) LastIndexAnyASCII/256:64-48 431ns ± 0% 424ns ± 0% -1.62% (p=0.008 n=5+5) LastIndexAnyUTF8/1:1-48 19.7ns ± 1% 11.9ns ± 0% -39.47% (p=0.008 n=5+5) LastIndexAnyUTF8/1:2-48 27.6ns ± 1% 11.9ns ± 0% -56.82% (p=0.008 n=5+5) LastIndexAnyUTF8/1:4-48 29.9ns ± 0% 11.9ns ± 0% ~ (p=0.079 n=4+5) LastIndexAnyUTF8/1:8-48 48.7ns ± 0% 11.9ns ± 0% -75.54% (p=0.008 n=5+5) LastIndexAnyUTF8/1:16-48 57.8ns ± 0% 11.4ns ± 0% -80.26% (p=0.008 n=5+5) LastIndexAnyUTF8/1:32-48 94.7ns ± 0% 12.2ns ± 0% -87.07% (p=0.008 n=5+5) LastIndexAnyUTF8/1:64-48 163ns ± 0% 13ns ± 1% -91.93% (p=0.008 n=5+5) LastIndexAnyUTF8/16:1-48 258ns ± 0% 88ns ± 0% -65.76% (p=0.008 n=5+5) LastIndexAnyUTF8/16:2-48 400ns ± 0% 162ns ± 0% -59.38% (p=0.008 n=5+5) LastIndexAnyUTF8/16:4-48 415ns ± 0% 162ns ± 0% -60.87% (p=0.008 n=5+5) LastIndexAnyUTF8/16:8-48 737ns ± 0% 162ns ± 0% -78.02% (p=0.000 n=5+4) LastIndexAnyUTF8/16:16-48 882ns ± 0% 128ns ± 0% -85.49% (p=0.008 n=5+5) LastIndexAnyUTF8/16:32-48 1.47µs ± 0% 0.16µs ± 0% -89.29% (p=0.000 n=4+5) LastIndexAnyUTF8/16:64-48 2.56µs ± 0% 0.14µs ± 0% -94.41% (p=0.016 n=5+4) LastIndexAnyUTF8/256:1-48 3.60µs ± 0% 1.23µs ± 0% -65.67% (p=0.008 n=5+5) LastIndexAnyUTF8/256:2-48 5.78µs ± 0% 2.18µs ± 0% -62.32% (p=0.008 n=5+5) LastIndexAnyUTF8/256:4-48 6.26µs ± 0% 2.18µs ± 0% -65.15% (p=0.008 n=5+5) LastIndexAnyUTF8/256:8-48 11.2µs ± 0% 2.2µs ± 0% -80.53% (p=0.008 n=5+5) LastIndexAnyUTF8/256:16-48 13.5µs ± 0% 1.9µs ± 0% -86.02% (p=0.016 n=4+5) LastIndexAnyUTF8/256:32-48 23.0µs ± 0% 2.1µs ± 0% -90.72% (p=0.008 n=5+5) LastIndexAnyUTF8/256:64-48 40.5µs ± 0% 2.1µs ± 0% -94.73% (p=0.008 n=5+5) Change-Id: Ie05e306f8b184b989701868cb161ce8b3f18203b Reviewed-on: https://go-review.googlesource.com/c/go/+/156998 Run-TryBot: eric fang <eric.fang@arm.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-03-04bytes, strings: moves indexRabinKarp function to internal/bytealgerifan01
In order to facilitate optimization of IndexAny and LastIndexAny, this patch moves three Rabin-Karp related functions indexRabinKarp, hashStr and hashStrRev in strings package to initernal/bytealg. There are also three functions in the bytes package with the same names and functions but different parameter types. To highlight this, this patch also moves them to internal/bytealg and gives them slightly different names. Related benchmark changes on amd64 and arm64: name old time/op new time/op delta pkg:strings goos:linux goarch:amd64 Index-16 14.0ns ± 1% 14.1ns ± 2% ~ (p=0.738 n=5+5) LastIndex-16 15.5ns ± 1% 15.7ns ± 4% ~ (p=0.897 n=5+5) pkg:bytes goos:linux goarch:amd64 Index/10-16 26.5ns ± 1% 26.5ns ± 0% ~ (p=0.873 n=5+5) Index/32-16 26.2ns ± 0% 25.7ns ± 0% -1.68% (p=0.008 n=5+5) Index/4K-16 5.12µs ± 4% 5.14µs ± 2% ~ (p=0.841 n=5+5) Index/4M-16 5.44ms ± 3% 5.34ms ± 2% ~ (p=0.056 n=5+5) Index/64M-16 85.8ms ± 3% 84.6ms ± 0% -1.37% (p=0.016 n=5+5) name old speed new speed delta pkg:bytes goos:linux goarch:amd64 Index/10-16 377MB/s ± 1% 377MB/s ± 0% ~ (p=1.000 n=5+5) Index/32-16 1.22GB/s ± 1% 1.24GB/s ± 0% +1.66% (p=0.008 n=5+5) Index/4K-16 800MB/s ± 4% 797MB/s ± 2% ~ (p=0.841 n=5+5) Index/4M-16 771MB/s ± 3% 786MB/s ± 2% ~ (p=0.056 n=5+5) Index/64M-16 783MB/s ± 3% 793MB/s ± 0% +1.36% (p=0.016 n=5+5) name old time/op new time/op delta pkg:strings goos:linux goarch:arm64 Index-8 22.6ns ± 0% 22.5ns ± 0% ~ (p=0.167 n=5+5) LastIndex-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal) pkg:bytes goos:linux goarch:arm64 Index/10-8 25.0ns ± 0% 25.0ns ± 0% ~ (all equal) Index/32-8 160ns ± 0% 160ns ± 0% ~ (all equal) Index/4K-8 6.26µs ± 0% 6.26µs ± 0% ~ (p=0.167 n=5+5) Index/4M-8 6.30ms ± 0% 6.31ms ± 0% ~ (p=1.000 n=5+5) Index/64M-8 101ms ± 0% 101ms ± 0% ~ (p=0.690 n=5+5) name old speed new speed delta pkg:bytes goos:linux goarch:arm64 Index/10-8 399MB/s ± 0% 400MB/s ± 0% +0.08% (p=0.008 n=5+5) Index/32-8 200MB/s ± 0% 200MB/s ± 0% ~ (p=0.127 n=4+5) Index/4K-8 654MB/s ± 0% 654MB/s ± 0% +0.01% (p=0.016 n=5+5) Index/4M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.833 n=5+5) Index/64M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.913 n=5+5) Change-Id: Icce3bc162bb8613ac36dc963a46c51f8e82ab842 Reviewed-on: https://go-review.googlesource.com/c/go/+/208638 Run-TryBot: eric fang <eric.fang@arm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-08-24all: align cpu feature variable offset namingMartin Möhrmann
Add an "offset_" prefix to all cpu feature variable offset constants to signify that they are not boolean cpu feature variables. Remove _ from offset constant names. Change-Id: I6e22a79ebcbe6e2ae54c4ac8764f9260bb3223ff Reviewed-on: https://go-review.googlesource.com/131215 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-04internal/bytealg: move short string Index implementations into bytealgKeith Randall
Also move the arm64 CountByte implementation while we're here. Fixes #19792 Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e Reviewed-on: https://go-review.googlesource.com/98518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>