| Age | Commit message (Collapse) | Author |
|
For the case of a pattern containing multi-byte rune, the time complexity of the
previous algorithm is O(nm), and if both input arguments are long, the search
performance will be poor. This CL improves the searching performance for these
cases by using IndexRune, which is mainly implemented with IndexByte and Index.
As IndexByte and Index are specially optimized with some powerful instructions
for short patterns (an UTF8 rune is 1 to 4 bytes), so they can help to reduce the
runtime complexity of IndexAny and LastIndexAny.
Another optimization method is using hash table, however, the actual test results
show that using indexrune is better, and the space complexity is lower.
There are two fast paths in IndexAny and LastIndexAny for cases where the length
of the input arguements are 1, and their locations are not exactly the same, which
is determined based on the actual test results.
Benchmarks on arm64 and amd64:
name old time/op new time/op delta
pkg:strings goos:linux goarch:arm64
IndexAnyASCII/1:1-8 23.7ns ± 3% 28.5ns ± 0% +20.15% (p=0.008 n=5+5)
IndexAnyASCII/1:2-8 18.0ns ± 0% 33.1ns ± 0% +83.67% (p=0.008 n=5+5)
IndexAnyASCII/1:4-8 20.0ns ± 0% 36.0ns ± 0% +80.00% (p=0.029 n=4+4)
IndexAnyASCII/1:8-8 36.1ns ± 0% 36.0ns ± 0% ~ (p=0.095 n=5+4)
IndexAnyASCII/1:16-8 48.1ns ± 0% 36.0ns ± 0% -25.19% (p=0.029 n=4+4)
IndexAnyASCII/1:32-8 72.1ns ± 0% 36.0ns ± 0% -50.01% (p=0.008 n=5+5)
IndexAnyASCII/1:64-8 120ns ± 0% 39ns ± 0% -67.83% (p=0.008 n=5+5)
IndexAnyASCII/16:1-8 73.0ns ± 0% 28.5ns ± 0% -60.95% (p=0.008 n=5+5)
IndexAnyASCII/16:2-8 76.8ns ± 0% 77.0ns ± 0% ~ (p=1.000 n=5+5)
IndexAnyASCII/16:4-8 83.2ns ± 1% 83.0ns ± 0% ~ (p=0.770 n=5+5)
IndexAnyASCII/16:8-8 111ns ± 1% 107ns ± 0% -3.25% (p=0.008 n=5+5)
IndexAnyASCII/16:16-8 139ns ± 1% 137ns ± 0% -1.58% (p=0.008 n=5+5)
IndexAnyASCII/16:32-8 199ns ± 1% 197ns ± 0% -1.20% (p=0.008 n=5+5)
IndexAnyASCII/16:64-8 307ns ± 0% 313ns ± 0% +1.82% (p=0.016 n=5+4)
IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 0% -90.31% (p=0.008 n=5+5)
IndexAnyASCII/256:2-8 678ns ± 0% 683ns ± 0% +0.68% (p=0.008 n=5+5)
IndexAnyASCII/256:4-8 685ns ± 0% 683ns ± 0% -0.29% (p=0.000 n=5+4)
IndexAnyASCII/256:8-8 711ns ± 0% 708ns ± 0% -0.48% (p=0.008 n=5+5)
IndexAnyASCII/256:16-8 740ns ± 0% 740ns ± 0% ~ (p=0.444 n=5+5)
IndexAnyASCII/256:32-8 799ns ± 0% 798ns ± 0% -0.18% (p=0.008 n=5+5)
IndexAnyASCII/256:64-8 910ns ± 0% 914ns ± 0% +0.44% (p=0.016 n=4+5)
IndexAnyUTF8/1:1-8 27.1ns ± 0% 19.0ns ± 0% -29.79% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-8 44.1ns ± 0% 33.0ns ± 0% -25.17% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-8 46.1ns ± 0% 33.1ns ± 0% -28.29% (p=0.016 n=4+5)
IndexAnyUTF8/1:8-8 85.1ns ± 0% 33.0ns ± 0% -61.18% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-8 110ns ± 1% 36ns ± 0% -67.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:32-8 188ns ± 0% 36ns ± 0% -80.85% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-8 332ns ± 0% 39ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/16:1-8 293ns ± 0% 54ns ± 0% -81.56% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-8 563ns ± 0% 349ns ± 0% -37.98% (p=0.008 n=5+5)
IndexAnyUTF8/16:4-8 546ns ± 1% 349ns ± 0% -36.10% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-8 1.22µs ± 0% 0.35µs ± 0% -71.39% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-8 1.63µs ± 1% 0.42µs ± 0% -73.98% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-8 2.87µs ± 0% 0.42µs ± 0% -85.22% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.47µs ± 0% -90.98% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-8 4.26µs ± 0% 0.47µs ± 0% -88.85% (p=0.000 n=4+5)
IndexAnyUTF8/256:2-8 8.62µs ± 0% 5.15µs ± 0% -40.21% (p=0.008 n=5+5)
IndexAnyUTF8/256:4-8 8.25µs ± 0% 5.15µs ± 0% -37.50% (p=0.016 n=5+4)
IndexAnyUTF8/256:8-8 19.2µs ± 1% 5.2µs ± 0% -73.08% (p=0.016 n=5+4)
IndexAnyUTF8/256:16-8 25.6µs ± 1% 6.3µs ± 0% -75.32% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-8 45.6µs ± 0% 6.3µs ± 0% -86.15% (p=0.008 n=5+5)
IndexAnyUTF8/256:64-8 82.4µs ± 0% 7.0µs ± 0% -91.53% (p=0.016 n=5+4)
LastIndexAnyASCII/1:1-8 23.0ns ± 0% 33.5ns ± 0% +45.65% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-8 24.5ns ± 0% 33.5ns ± 0% +36.73% (p=0.016 n=4+5)
LastIndexAnyASCII/1:4-8 27.5ns ± 0% 35.5ns ± 0% +29.09% (p=0.008 n=5+5)
LastIndexAnyASCII/1:8-8 44.5ns ± 0% 35.5ns ± 0% -20.13% (p=0.008 n=5+5)
LastIndexAnyASCII/1:16-8 56.5ns ± 0% 35.5ns ± 0% -37.15% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-8 80.3ns ± 0% 35.5ns ± 0% -55.79% (p=0.000 n=5+4)
LastIndexAnyASCII/1:64-8 129ns ± 0% 40ns ± 0% -68.85% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-8 72.8ns ± 0% 72.7ns ± 0% -0.19% (p=0.016 n=4+5)
LastIndexAnyASCII/16:2-8 75.4ns ± 0% 75.1ns ± 0% ~ (p=0.127 n=5+5)
LastIndexAnyASCII/16:4-8 81.9ns ± 1% 80.2ns ± 0% -2.00% (p=0.008 n=5+5)
LastIndexAnyASCII/16:8-8 110ns ± 1% 108ns ± 0% -1.46% (p=0.008 n=5+5)
LastIndexAnyASCII/16:16-8 138ns ± 1% 134ns ± 0% -3.18% (p=0.008 n=5+5)
LastIndexAnyASCII/16:32-8 198ns ± 0% 197ns ± 0% -0.51% (p=0.008 n=5+5)
LastIndexAnyASCII/16:64-8 309ns ± 0% 313ns ± 0% +1.30% (p=0.008 n=5+5)
LastIndexAnyASCII/256:1-8 652ns ± 0% 653ns ± 0% +0.21% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-8 656ns ± 0% 656ns ± 0% ~ (all equal)
LastIndexAnyASCII/256:4-8 663ns ± 0% 663ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyASCII/256:8-8 691ns ± 0% 690ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/256:16-8 719ns ± 0% 715ns ± 0% -0.53% (p=0.000 n=5+4)
LastIndexAnyASCII/256:32-8 779ns ± 0% 780ns ± 0% +0.13% (p=0.029 n=4+4)
LastIndexAnyASCII/256:64-8 890ns ± 0% 894ns ± 0% +0.45% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:1-8 31.6ns ± 0% 33.5ns ± 0% +6.01% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 33.5ns ± 0% -30.99% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-8 48.6ns ± 0% 33.5ns ± 0% -31.13% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:8-8 89.6ns ± 0% 33.5ns ± 0% -62.56% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-8 113ns ± 1% 36ns ± 0% -68.47% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:32-8 190ns ± 0% 36ns ± 0% -81.26% (p=0.029 n=4+4)
LastIndexAnyUTF8/1:64-8 327ns ± 0% 40ns ± 0% -87.77% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-8 364ns ± 0% 158ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyUTF8/16:2-8 636ns ± 0% 472ns ± 0% -25.79% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:4-8 630ns ± 0% 472ns ± 0% -25.03% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-8 1.28µs ± 0% 0.47µs ± 0% -63.09% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.53µs ± 0% -68.39% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:32-8 2.88µs ± 0% 0.53µs ± 0% -81.72% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:64-8 5.08µs ± 0% 0.57µs ± 0% -88.79% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:1-8 5.41µs ± 0% 2.03µs ± 0% -62.46% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:2-8 9.77µs ± 0% 7.14µs ± 0% -26.97% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-8 9.63µs ± 0% 7.14µs ± 0% -25.86% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-8 20.0µs ± 0% 7.1µs ± 0% -64.30% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-8 26.1µs ± 1% 8.0µs ± 0% -69.40% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-8 45.6µs ± 1% 8.0µs ± 0% -82.51% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-8 80.8µs ± 0% 8.6µs ± 0% -89.33% (p=0.016 n=5+4)
pkg:bytes goos:linux goarch:arm64
IndexAnyASCII/1:1-8 26.2ns ± 1% 26.5ns ± 0% +1.30% (p=0.016 n=5+4)
IndexAnyASCII/1:2-8 18.5ns ± 0% 26.5ns ± 0% +43.24% (p=0.008 n=5+5)
IndexAnyASCII/1:4-8 21.0ns ± 0% 26.5ns ± 0% +26.38% (p=0.008 n=5+5)
IndexAnyASCII/1:8-8 37.5ns ± 0% 26.5ns ± 0% -29.33% (p=0.000 n=5+4)
IndexAnyASCII/1:16-8 49.6ns ± 0% 26.5ns ± 0% -46.49% (p=0.008 n=5+5)
IndexAnyASCII/1:32-8 73.6ns ± 0% 30.1ns ± 0% -59.16% (p=0.008 n=5+5)
IndexAnyASCII/1:64-8 122ns ± 0% 33ns ± 0% -73.23% (p=0.008 n=5+5)
IndexAnyASCII/16:1-8 73.7ns ± 0% 33.4ns ± 0% -54.71% (p=0.008 n=5+5)
IndexAnyASCII/16:2-8 79.1ns ± 0% 78.9ns ± 0% -0.30% (p=0.016 n=4+5)
IndexAnyASCII/16:4-8 84.8ns ± 0% 86.1ns ± 0% +1.58% (p=0.016 n=5+4)
IndexAnyASCII/16:8-8 111ns ± 0% 111ns ± 0% ~ (all equal)
IndexAnyASCII/16:16-8 139ns ± 0% 144ns ± 0% +3.60% (p=0.016 n=4+5)
IndexAnyASCII/16:32-8 196ns ± 0% 207ns ± 0% +5.61% (p=0.016 n=5+4)
IndexAnyASCII/16:64-8 311ns ± 0% 320ns ± 0% +2.89% (p=0.016 n=4+5)
IndexAnyASCII/256:1-8 674ns ± 0% 65ns ± 1% -90.35% (p=0.008 n=5+5)
IndexAnyASCII/256:2-8 680ns ± 0% 680ns ± 0% ~ (p=0.444 n=5+5)
IndexAnyASCII/256:4-8 686ns ± 0% 687ns ± 0% ~ (p=0.167 n=5+5)
IndexAnyASCII/256:8-8 713ns ± 0% 712ns ± 0% -0.14% (p=0.008 n=5+5)
IndexAnyASCII/256:16-8 740ns ± 0% 744ns ± 0% +0.54% (p=0.016 n=5+4)
IndexAnyASCII/256:32-8 797ns ± 0% 808ns ± 0% +1.43% (p=0.008 n=5+5)
IndexAnyASCII/256:64-8 912ns ± 0% 921ns ± 0% +0.99% (p=0.016 n=4+5)
IndexAnyUTF8/1:1-8 27.5ns ± 0% 26.5ns ± 0% -3.64% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-8 44.5ns ± 0% 26.5ns ± 0% -40.50% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-8 45.6ns ± 0% 26.5ns ± 0% -41.89% (p=0.000 n=5+4)
IndexAnyUTF8/1:8-8 85.8ns ± 1% 26.5ns ± 0% -69.11% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-8 110ns ± 1% 26ns ± 0% -76.00% (p=0.016 n=5+4)
IndexAnyUTF8/1:32-8 188ns ± 0% 30ns ± 0% -84.04% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-8 333ns ± 0% 33ns ± 0% -90.20% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-8 294ns ± 0% 235ns ± 0% -20.07% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-8 563ns ± 0% 309ns ± 0% -45.12% (p=0.008 n=5+5)
IndexAnyUTF8/16:4-8 558ns ± 1% 309ns ± 0% -44.60% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-8 1.23µs ± 0% 0.31µs ± 0% -74.79% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-8 1.62µs ± 2% 0.31µs ± 0% -80.93% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-8 2.86µs ± 0% 0.38µs ± 0% -86.87% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-8 5.18µs ± 0% 0.42µs ± 0% -91.86% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-8 4.27µs ± 1% 3.30µs ± 1% -22.75% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-8 8.61µs ± 0% 4.45µs ± 0% -48.31% (p=0.016 n=4+5)
IndexAnyUTF8/256:4-8 8.44µs ± 0% 4.45µs ± 0% -47.23% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-8 19.2µs ± 0% 4.5µs ± 0% -76.78% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-8 25.6µs ± 0% 4.5µs ± 0% -82.63% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-8 45.4µs ± 0% 5.5µs ± 0% -87.85% (p=0.016 n=4+5)
IndexAnyUTF8/256:64-8 82.5µs ± 0% 6.2µs ± 0% -92.49% (p=0.008 n=5+5)
LastIndexAnyASCII/1:1-8 23.0ns ± 0% 26.5ns ± 0% +15.02% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-8 24.5ns ± 0% 26.5ns ± 0% +8.16% (p=0.008 n=5+5)
LastIndexAnyASCII/1:4-8 27.8ns ± 0% 26.5ns ± 0% -4.68% (p=0.029 n=4+4)
LastIndexAnyASCII/1:8-8 45.1ns ± 1% 26.5ns ± 0% -41.29% (p=0.000 n=5+4)
LastIndexAnyASCII/1:16-8 57.1ns ± 0% 26.5ns ± 0% -53.61% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-8 81.5ns ± 0% 30.0ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/1:64-8 129ns ± 0% 32ns ± 0% -74.81% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-8 72.6ns ± 0% 72.1ns ± 0% -0.63% (p=0.000 n=4+5)
LastIndexAnyASCII/16:2-8 77.2ns ± 0% 77.2ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/16:4-8 83.1ns ± 0% 83.2ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyASCII/16:8-8 109ns ± 1% 108ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/16:16-8 136ns ± 0% 136ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:32-8 195ns ± 0% 197ns ± 0% +0.82% (p=0.008 n=5+5)
LastIndexAnyASCII/16:64-8 309ns ± 0% 309ns ± 0% ~ (all equal)
LastIndexAnyASCII/256:1-8 653ns ± 0% 657ns ± 0% +0.61% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-8 659ns ± 0% 658ns ± 0% ~ (p=0.167 n=5+5)
LastIndexAnyASCII/256:4-8 664ns ± 0% 663ns ± 0% ~ (p=0.095 n=5+4)
LastIndexAnyASCII/256:8-8 698ns ± 0% 689ns ± 0% -1.29% (p=0.008 n=5+5)
LastIndexAnyASCII/256:16-8 726ns ± 0% 717ns ± 0% -1.24% (p=0.008 n=5+5)
LastIndexAnyASCII/256:32-8 777ns ± 0% 779ns ± 0% ~ (p=0.079 n=5+4)
LastIndexAnyASCII/256:64-8 889ns ± 0% 890ns ± 0% ~ (p=0.444 n=5+5)
LastIndexAnyUTF8/1:1-8 32.1ns ± 0% 26.5ns ± 0% -17.45% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:2-8 48.6ns ± 0% 26.5ns ± 0% -45.52% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:4-8 49.6ns ± 0% 26.5ns ± 0% -46.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:8-8 91.9ns ± 0% 26.5ns ± 0% -71.18% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-8 114ns ± 1% 26ns ± 0% -76.84% (p=0.000 n=5+4)
LastIndexAnyUTF8/1:32-8 203ns ± 6% 30ns ± 0% -85.25% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-8 330ns ± 0% 33ns ± 0% -90.14% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:1-8 365ns ± 0% 164ns ± 0% -55.04% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:2-8 638ns ± 0% 296ns ± 0% -53.58% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-8 634ns ± 0% 296ns ± 0% -53.31% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-8 1.30µs ± 0% 0.30µs ± 0% -77.18% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:16-8 1.66µs ± 0% 0.30µs ± 0% -82.19% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:32-8 2.90µs ± 0% 0.38µs ± 0% -87.00% (p=0.029 n=4+4)
LastIndexAnyUTF8/16:64-8 5.10µs ± 0% 0.42µs ± 0% -91.78% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:1-8 5.42µs ± 0% 2.12µs ± 0% -60.92% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-8 9.79µs ± 0% 4.26µs ± 0% -56.47% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-8 9.66µs ± 0% 4.26µs ± 0% -55.87% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-8 20.4µs ± 0% 4.3µs ± 0% -79.10% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-8 26.0µs ± 1% 4.3µs ± 0% -83.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-8 46.0µs ± 0% 5.5µs ± 0% -88.09% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-8 81.1µs ± 0% 6.2µs ± 0% -92.38% (p=0.008 n=5+5)
name old time/op new time/op delta
pkg:strings goos:linux goarch:amd64
IndexAnyASCII/1:1-48 10.0ns ± 0% 13.3ns ± 0% +33.00% (p=0.008 n=5+5)
IndexAnyASCII/1:2-48 11.0ns ± 0% 15.5ns ± 0% +40.55% (p=0.016 n=4+5)
IndexAnyASCII/1:4-48 12.9ns ± 0% 15.4ns ± 0% +19.69% (p=0.008 n=5+5)
IndexAnyASCII/1:8-48 18.6ns ± 0% 15.5ns ± 0% -16.45% (p=0.000 n=4+5)
IndexAnyASCII/1:16-48 30.1ns ± 0% 16.9ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyASCII/1:32-48 53.1ns ± 0% 18.6ns ± 0% -64.95% (p=0.000 n=5+4)
IndexAnyASCII/1:64-48 98.9ns ± 0% 17.4ns ± 0% -82.41% (p=0.000 n=5+4)
IndexAnyASCII/16:1-48 35.0ns ± 0% 14.2ns ± 0% -59.47% (p=0.000 n=5+4)
IndexAnyASCII/16:2-48 35.5ns ± 0% 35.6ns ± 0% ~ (p=0.238 n=5+4)
IndexAnyASCII/16:4-48 40.8ns ± 0% 40.7ns ± 1% ~ (p=0.643 n=5+5)
IndexAnyASCII/16:8-48 50.8ns ± 0% 50.9ns ± 1% ~ (p=1.000 n=4+5)
IndexAnyASCII/16:16-48 64.0ns ± 1% 64.5ns ± 1% ~ (p=0.071 n=5+5)
IndexAnyASCII/16:32-48 98.3ns ± 0% 100.8ns ± 1% +2.52% (p=0.008 n=5+5)
IndexAnyASCII/16:64-48 156ns ± 0% 157ns ± 0% ~ (p=0.238 n=4+5)
IndexAnyASCII/256:1-48 299ns ± 0% 24ns ± 3% -92.12% (p=0.008 n=5+5)
IndexAnyASCII/256:2-48 303ns ± 0% 304ns ± 0% ~ (p=0.762 n=5+5)
IndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.476 n=5+5)
IndexAnyASCII/256:8-48 321ns ± 0% 321ns ± 0% ~ (p=0.429 n=4+5)
IndexAnyASCII/256:16-48 334ns ± 0% 335ns ± 0% ~ (p=0.079 n=5+4)
IndexAnyASCII/256:32-48 367ns ± 0% 365ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyASCII/256:64-48 431ns ± 1% 421ns ± 0% -2.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:1-48 17.2ns ± 0% 10.8ns ± 0% -37.21% (p=0.029 n=4+4)
IndexAnyUTF8/1:2-48 26.7ns ± 0% 15.6ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/1:4-48 28.2ns ± 0% 15.6ns ± 0% -44.68% (p=0.000 n=5+4)
IndexAnyUTF8/1:8-48 48.8ns ± 0% 15.6ns ± 0% -68.03% (p=0.029 n=4+4)
IndexAnyUTF8/1:16-48 58.3ns ± 0% 16.2ns ± 0% ~ (p=0.079 n=4+5)
IndexAnyUTF8/1:32-48 103ns ± 0% 18ns ± 0% -82.27% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-48 182ns ± 0% 17ns ± 0% -90.53% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-48 197ns ± 0% 25ns ± 0% -87.34% (p=0.000 n=5+4)
IndexAnyUTF8/16:2-48 348ns ± 0% 163ns ± 0% -53.11% (p=0.000 n=5+4)
IndexAnyUTF8/16:4-48 374ns ± 0% 163ns ± 0% -56.37% (p=0.000 n=5+4)
IndexAnyUTF8/16:8-48 716ns ± 0% 163ns ± 0% -77.22% (p=0.000 n=5+4)
IndexAnyUTF8/16:16-48 859ns ± 0% 175ns ± 0% -79.63% (p=0.000 n=5+4)
IndexAnyUTF8/16:32-48 1.58µs ± 0% 0.20µs ± 0% -87.01% (p=0.029 n=4+4)
IndexAnyUTF8/16:64-48 2.84µs ± 0% 0.19µs ± 1% -93.34% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-48 2.61µs ± 0% 0.27µs ± 0% -89.81% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-48 4.95µs ± 0% 2.23µs ± 0% -54.91% (p=0.016 n=5+4)
IndexAnyUTF8/256:4-48 5.55µs ± 0% 2.23µs ± 0% -59.72% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.2µs ± 0% -79.39% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-48 13.1µs ± 0% 2.5µs ± 0% -81.21% (p=0.016 n=4+5)
IndexAnyUTF8/256:32-48 24.7µs ± 0% 2.8µs ± 0% -88.49% (p=0.008 n=5+5)
IndexAnyUTF8/256:64-48 45.0µs ± 0% 2.6µs ± 1% -94.23% (p=0.008 n=5+5)
LastIndexAnyASCII/1:1-48 13.9ns ± 0% 15.2ns ± 0% +9.35% (p=0.008 n=5+5)
LastIndexAnyASCII/1:2-48 14.4ns ± 0% 15.2ns ± 0% +5.56% (p=0.008 n=5+5)
LastIndexAnyASCII/1:4-48 16.7ns ± 0% 15.2ns ± 0% -8.98% (p=0.008 n=5+5)
LastIndexAnyASCII/1:8-48 24.0ns ± 0% 15.2ns ± 0% -36.67% (p=0.008 n=5+5)
LastIndexAnyASCII/1:16-48 35.6ns ± 0% 15.0ns ± 0% -57.82% (p=0.008 n=5+5)
LastIndexAnyASCII/1:32-48 68.9ns ± 0% 16.7ns ± 0% -75.75% (p=0.008 n=5+5)
LastIndexAnyASCII/1:64-48 104ns ± 0% 17ns ± 1% -83.81% (p=0.008 n=5+5)
LastIndexAnyASCII/16:1-48 35.0ns ± 0% 35.0ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:2-48 35.6ns ± 0% 35.6ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:4-48 41.0ns ± 0% 40.8ns ± 0% -0.49% (p=0.032 n=5+5)
LastIndexAnyASCII/16:8-48 50.9ns ± 0% 50.7ns ± 1% ~ (p=0.397 n=5+5)
LastIndexAnyASCII/16:16-48 64.3ns ± 1% 64.4ns ± 1% ~ (p=1.000 n=4+5)
LastIndexAnyASCII/16:32-48 100ns ± 0% 100ns ± 0% +0.38% (p=0.016 n=4+5)
LastIndexAnyASCII/16:64-48 157ns ± 1% 163ns ± 0% +3.82% (p=0.008 n=5+5)
LastIndexAnyASCII/256:1-48 302ns ± 0% 300ns ± 0% -0.53% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-48 305ns ± 0% 303ns ± 0% -0.66% (p=0.000 n=5+4)
LastIndexAnyASCII/256:4-48 313ns ± 0% 307ns ± 0% -2.04% (p=0.000 n=4+5)
LastIndexAnyASCII/256:8-48 323ns ± 0% 315ns ± 0% -2.48% (p=0.029 n=4+4)
LastIndexAnyASCII/256:16-48 333ns ± 0% 332ns ± 0% -0.30% (p=0.048 n=5+5)
LastIndexAnyASCII/256:32-48 366ns ± 0% 367ns ± 0% ~ (p=0.238 n=4+5)
LastIndexAnyASCII/256:64-48 430ns ± 0% 430ns ± 0% ~ (p=1.000 n=5+5)
LastIndexAnyUTF8/1:1-48 21.1ns ± 0% 13.9ns ± 0% -34.00% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-48 29.5ns ± 0% 13.9ns ± 0% -52.95% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-48 31.6ns ± 0% 13.9ns ± 0% -55.96% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:8-48 51.1ns ± 0% 13.9ns ± 0% -72.81% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-48 58.9ns ± 0% 14.6ns ± 0% -75.23% (p=0.016 n=5+4)
LastIndexAnyUTF8/1:32-48 103ns ± 0% 16ns ± 1% -84.12% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-48 177ns ± 0% 17ns ± 1% -90.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-48 275ns ± 1% 105ns ± 0% -61.85% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:2-48 406ns ± 0% 216ns ± 0% -46.70% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-48 458ns ± 0% 216ns ± 0% -52.75% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:8-48 753ns ± 0% 216ns ± 0% -71.31% (p=0.029 n=4+4)
LastIndexAnyUTF8/16:16-48 902ns ± 0% 221ns ± 0% -75.50% (p=0.016 n=5+4)
LastIndexAnyUTF8/16:32-48 1.57µs ± 0% 0.24µs ± 0% -84.46% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:64-48 2.77µs ± 0% 0.24µs ± 0% -91.22% (p=0.000 n=5+4)
LastIndexAnyUTF8/256:1-48 4.06µs ± 0% 1.53µs ± 0% -62.26% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-48 5.92µs ± 0% 3.04µs ± 0% -48.55% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:4-48 6.82µs ± 0% 3.04µs ± 0% -55.34% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-48 11.5µs ± 0% 3.0µs ± 0% -73.48% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-48 14.1µs ± 0% 3.1µs ± 0% -77.85% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:32-48 24.5µs ± 0% 3.5µs ± 0% -85.85% (p=0.016 n=5+4)
LastIndexAnyUTF8/256:64-48 44.0µs ± 0% 3.5µs ± 0% -92.12% (p=0.008 n=5+5)
pkg:bytes goos:linux goarch:amd64
IndexAnyASCII/1:1-48 9.56ns ± 0% 11.00ns ± 0% +15.06% (p=0.016 n=5+4)
IndexAnyASCII/1:2-48 11.0ns ± 0% 10.8ns ± 2% -1.64% (p=0.048 n=5+5)
IndexAnyASCII/1:4-48 13.9ns ± 0% 11.0ns ± 1% -21.15% (p=0.008 n=5+5)
IndexAnyASCII/1:8-48 19.6ns ± 0% 10.8ns ± 3% -44.90% (p=0.008 n=5+5)
IndexAnyASCII/1:16-48 31.1ns ± 0% 11.5ns ± 0% -63.02% (p=0.008 n=5+5)
IndexAnyASCII/1:32-48 54.0ns ± 0% 11.8ns ± 0% -78.15% (p=0.000 n=5+4)
IndexAnyASCII/1:64-48 100ns ± 0% 13ns ± 0% -86.89% (p=0.008 n=5+5)
IndexAnyASCII/16:1-48 35.5ns ± 0% 14.8ns ± 0% -58.26% (p=0.008 n=5+5)
IndexAnyASCII/16:2-48 36.2ns ± 1% 36.0ns ± 1% ~ (p=0.087 n=5+5)
IndexAnyASCII/16:4-48 40.3ns ± 1% 39.7ns ± 4% ~ (p=0.175 n=4+5)
IndexAnyASCII/16:8-48 48.7ns ± 5% 45.8ns ± 0% -6.02% (p=0.016 n=5+4)
IndexAnyASCII/16:16-48 64.1ns ±11% 62.1ns ± 1% ~ (p=0.143 n=5+5)
IndexAnyASCII/16:32-48 97.9ns ± 1% 98.3ns ± 1% ~ (p=0.294 n=5+5)
IndexAnyASCII/16:64-48 163ns ± 0% 157ns ± 0% -3.68% (p=0.008 n=5+5)
IndexAnyASCII/256:1-48 389ns ± 0% 25ns ± 0% -93.65% (p=0.000 n=5+4)
IndexAnyASCII/256:2-48 391ns ± 0% 307ns ± 0% -21.48% (p=0.000 n=5+4)
IndexAnyASCII/256:4-48 394ns ± 0% 323ns ± 0% -17.92% (p=0.008 n=5+5)
IndexAnyASCII/256:8-48 402ns ± 0% 323ns ± 0% -19.51% (p=0.008 n=5+5)
IndexAnyASCII/256:16-48 414ns ± 0% 334ns ± 0% -19.32% (p=0.016 n=4+5)
IndexAnyASCII/256:32-48 446ns ± 0% 367ns ± 0% -17.75% (p=0.016 n=5+4)
IndexAnyASCII/256:64-48 511ns ± 0% 424ns ± 0% -17.02% (p=0.008 n=5+5)
IndexAnyUTF8/1:1-48 17.4ns ± 0% 11.0ns ± 0% -36.64% (p=0.008 n=5+5)
IndexAnyUTF8/1:2-48 27.3ns ± 1% 11.0ns ± 0% -59.74% (p=0.008 n=5+5)
IndexAnyUTF8/1:4-48 28.7ns ± 0% 11.0ns ± 0% -61.73% (p=0.008 n=5+5)
IndexAnyUTF8/1:8-48 49.2ns ± 0% 11.0ns ± 0% -77.66% (p=0.008 n=5+5)
IndexAnyUTF8/1:16-48 56.0ns ± 0% 11.5ns ± 0% -79.46% (p=0.000 n=5+4)
IndexAnyUTF8/1:32-48 102ns ± 0% 12ns ± 0% -88.24% (p=0.008 n=5+5)
IndexAnyUTF8/1:64-48 177ns ± 0% 13ns ± 0% -92.51% (p=0.008 n=5+5)
IndexAnyUTF8/16:1-48 212ns ± 0% 112ns ± 0% -47.17% (p=0.008 n=5+5)
IndexAnyUTF8/16:2-48 356ns ± 0% 159ns ± 1% -55.28% (p=0.000 n=4+5)
IndexAnyUTF8/16:4-48 372ns ± 0% 158ns ± 0% -57.47% (p=0.008 n=5+5)
IndexAnyUTF8/16:8-48 712ns ± 0% 159ns ± 1% -77.70% (p=0.008 n=5+5)
IndexAnyUTF8/16:16-48 829ns ± 0% 129ns ± 0% -84.44% (p=0.008 n=5+5)
IndexAnyUTF8/16:32-48 1.55µs ± 0% 0.16µs ± 0% -89.87% (p=0.008 n=5+5)
IndexAnyUTF8/16:64-48 2.77µs ± 0% 0.14µs ± 0% -94.94% (p=0.008 n=5+5)
IndexAnyUTF8/256:1-48 2.85µs ± 0% 1.63µs ± 1% -42.74% (p=0.008 n=5+5)
IndexAnyUTF8/256:2-48 5.14µs ± 1% 2.03µs ± 0% -60.51% (p=0.008 n=5+5)
IndexAnyUTF8/256:4-48 5.56µs ± 0% 2.03µs ± 0% -63.52% (p=0.008 n=5+5)
IndexAnyUTF8/256:8-48 10.8µs ± 0% 2.0µs ± 0% -81.22% (p=0.008 n=5+5)
IndexAnyUTF8/256:16-48 12.9µs ± 0% 1.9µs ± 0% -85.55% (p=0.008 n=5+5)
IndexAnyUTF8/256:32-48 24.2µs ± 0% 2.1µs ± 0% -91.29% (p=0.016 n=5+4)
IndexAnyUTF8/256:64-48 43.7µs ± 0% 2.0µs ± 0% -95.32% (p=0.016 n=5+4)
LastIndexAnyASCII/1:1-48 13.7ns ± 1% 12.8ns ± 0% -6.57% (p=0.016 n=5+4)
LastIndexAnyASCII/1:2-48 14.7ns ± 0% 12.7ns ± 1% -13.33% (p=0.000 n=4+5)
LastIndexAnyASCII/1:4-48 16.9ns ± 0% 12.7ns ± 1% -24.73% (p=0.000 n=4+5)
LastIndexAnyASCII/1:8-48 20.5ns ± 0% 12.7ns ± 0% -37.85% (p=0.000 n=4+5)
LastIndexAnyASCII/1:16-48 28.0ns ± 0% 11.7ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyASCII/1:32-48 69.8ns ± 0% 12.4ns ± 0% -82.19% (p=0.008 n=5+5)
LastIndexAnyASCII/1:64-48 73.8ns ± 0% 13.3ns ± 0% -82.03% (p=0.000 n=4+5)
LastIndexAnyASCII/16:1-48 35.5ns ± 0% 35.5ns ± 0% ~ (all equal)
LastIndexAnyASCII/16:2-48 36.0ns ± 0% 36.1ns ± 0% +0.28% (p=0.016 n=4+5)
LastIndexAnyASCII/16:4-48 40.3ns ± 2% 40.0ns ± 6% ~ (p=0.651 n=5+5)
LastIndexAnyASCII/16:8-48 50.3ns ± 0% 50.2ns ± 9% ~ (p=0.175 n=4+5)
LastIndexAnyASCII/16:16-48 62.4ns ± 4% 64.4ns ± 0% +3.28% (p=0.016 n=5+4)
LastIndexAnyASCII/16:32-48 98.9ns ± 0% 98.4ns ± 0% -0.53% (p=0.016 n=5+4)
LastIndexAnyASCII/16:64-48 160ns ± 1% 161ns ± 1% ~ (p=0.325 n=5+5)
LastIndexAnyASCII/256:1-48 300ns ± 0% 301ns ± 0% +0.33% (p=0.008 n=5+5)
LastIndexAnyASCII/256:2-48 304ns ± 0% 304ns ± 0% ~ (p=1.000 n=5+5)
LastIndexAnyASCII/256:4-48 311ns ± 0% 311ns ± 0% ~ (p=0.556 n=4+5)
LastIndexAnyASCII/256:8-48 320ns ± 0% 321ns ± 0% ~ (p=0.143 n=5+5)
LastIndexAnyASCII/256:16-48 333ns ± 0% 335ns ± 0% +0.60% (p=0.029 n=4+4)
LastIndexAnyASCII/256:32-48 367ns ± 0% 366ns ± 0% ~ (p=0.095 n=4+5)
LastIndexAnyASCII/256:64-48 431ns ± 0% 424ns ± 0% -1.62% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:1-48 19.7ns ± 1% 11.9ns ± 0% -39.47% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:2-48 27.6ns ± 1% 11.9ns ± 0% -56.82% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:4-48 29.9ns ± 0% 11.9ns ± 0% ~ (p=0.079 n=4+5)
LastIndexAnyUTF8/1:8-48 48.7ns ± 0% 11.9ns ± 0% -75.54% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:16-48 57.8ns ± 0% 11.4ns ± 0% -80.26% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:32-48 94.7ns ± 0% 12.2ns ± 0% -87.07% (p=0.008 n=5+5)
LastIndexAnyUTF8/1:64-48 163ns ± 0% 13ns ± 1% -91.93% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:1-48 258ns ± 0% 88ns ± 0% -65.76% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:2-48 400ns ± 0% 162ns ± 0% -59.38% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:4-48 415ns ± 0% 162ns ± 0% -60.87% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:8-48 737ns ± 0% 162ns ± 0% -78.02% (p=0.000 n=5+4)
LastIndexAnyUTF8/16:16-48 882ns ± 0% 128ns ± 0% -85.49% (p=0.008 n=5+5)
LastIndexAnyUTF8/16:32-48 1.47µs ± 0% 0.16µs ± 0% -89.29% (p=0.000 n=4+5)
LastIndexAnyUTF8/16:64-48 2.56µs ± 0% 0.14µs ± 0% -94.41% (p=0.016 n=5+4)
LastIndexAnyUTF8/256:1-48 3.60µs ± 0% 1.23µs ± 0% -65.67% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:2-48 5.78µs ± 0% 2.18µs ± 0% -62.32% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:4-48 6.26µs ± 0% 2.18µs ± 0% -65.15% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:8-48 11.2µs ± 0% 2.2µs ± 0% -80.53% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:16-48 13.5µs ± 0% 1.9µs ± 0% -86.02% (p=0.016 n=4+5)
LastIndexAnyUTF8/256:32-48 23.0µs ± 0% 2.1µs ± 0% -90.72% (p=0.008 n=5+5)
LastIndexAnyUTF8/256:64-48 40.5µs ± 0% 2.1µs ± 0% -94.73% (p=0.008 n=5+5)
Change-Id: Ie05e306f8b184b989701868cb161ce8b3f18203b
Reviewed-on: https://go-review.googlesource.com/c/go/+/156998
Run-TryBot: eric fang <eric.fang@arm.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
In order to facilitate optimization of IndexAny and LastIndexAny, this patch moves
three Rabin-Karp related functions indexRabinKarp, hashStr and hashStrRev in strings
package to initernal/bytealg. There are also three functions in the bytes package with
the same names and functions but different parameter types. To highlight this, this
patch also moves them to internal/bytealg and gives them slightly different names.
Related benchmark changes on amd64 and arm64:
name old time/op new time/op delta
pkg:strings goos:linux goarch:amd64
Index-16 14.0ns ± 1% 14.1ns ± 2% ~ (p=0.738 n=5+5)
LastIndex-16 15.5ns ± 1% 15.7ns ± 4% ~ (p=0.897 n=5+5)
pkg:bytes goos:linux goarch:amd64
Index/10-16 26.5ns ± 1% 26.5ns ± 0% ~ (p=0.873 n=5+5)
Index/32-16 26.2ns ± 0% 25.7ns ± 0% -1.68% (p=0.008 n=5+5)
Index/4K-16 5.12µs ± 4% 5.14µs ± 2% ~ (p=0.841 n=5+5)
Index/4M-16 5.44ms ± 3% 5.34ms ± 2% ~ (p=0.056 n=5+5)
Index/64M-16 85.8ms ± 3% 84.6ms ± 0% -1.37% (p=0.016 n=5+5)
name old speed new speed delta
pkg:bytes goos:linux goarch:amd64
Index/10-16 377MB/s ± 1% 377MB/s ± 0% ~ (p=1.000 n=5+5)
Index/32-16 1.22GB/s ± 1% 1.24GB/s ± 0% +1.66% (p=0.008 n=5+5)
Index/4K-16 800MB/s ± 4% 797MB/s ± 2% ~ (p=0.841 n=5+5)
Index/4M-16 771MB/s ± 3% 786MB/s ± 2% ~ (p=0.056 n=5+5)
Index/64M-16 783MB/s ± 3% 793MB/s ± 0% +1.36% (p=0.016 n=5+5)
name old time/op new time/op delta
pkg:strings goos:linux goarch:arm64
Index-8 22.6ns ± 0% 22.5ns ± 0% ~ (p=0.167 n=5+5)
LastIndex-8 17.5ns ± 0% 17.5ns ± 0% ~ (all equal)
pkg:bytes goos:linux goarch:arm64
Index/10-8 25.0ns ± 0% 25.0ns ± 0% ~ (all equal)
Index/32-8 160ns ± 0% 160ns ± 0% ~ (all equal)
Index/4K-8 6.26µs ± 0% 6.26µs ± 0% ~ (p=0.167 n=5+5)
Index/4M-8 6.30ms ± 0% 6.31ms ± 0% ~ (p=1.000 n=5+5)
Index/64M-8 101ms ± 0% 101ms ± 0% ~ (p=0.690 n=5+5)
name old speed new speed delta
pkg:bytes goos:linux goarch:arm64
Index/10-8 399MB/s ± 0% 400MB/s ± 0% +0.08% (p=0.008 n=5+5)
Index/32-8 200MB/s ± 0% 200MB/s ± 0% ~ (p=0.127 n=4+5)
Index/4K-8 654MB/s ± 0% 654MB/s ± 0% +0.01% (p=0.016 n=5+5)
Index/4M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.833 n=5+5)
Index/64M-8 665MB/s ± 0% 665MB/s ± 0% ~ (p=0.913 n=5+5)
Change-Id: Icce3bc162bb8613ac36dc963a46c51f8e82ab842
Reviewed-on: https://go-review.googlesource.com/c/go/+/208638
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I83f806e76ef4d268b187bd273d78ceb41b7e8fa5
GitHub-Last-Rev: ee82eaae64536cecb631df328aafe2541f71d3f2
GitHub-Pull-Request: golang/go#36194
Reviewed-on: https://go-review.googlesource.com/c/go/+/211799
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
There is a part in the comment that points to a non-existent file.
It seems to have been overlooked in following PR.
https://go-review.googlesource.com/c/go/+/98518/
Change-Id: I21dbfbd270c654d5cd7fa88d114a356862612d90
Reviewed-on: https://go-review.googlesource.com/c/go/+/210298
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This clarifies meaning of "case folding" Unicode equality with more familiar "case insensitive" wording.
For case folding properties see ftp://ftp.unicode.org/Public/UNIDATA/CaseFolding.txt.
Fixes #33447
Change-Id: I6ee85ab398679bf2a0b7d18693985ff0979d6c5a
GitHub-Last-Rev: accc9159330c61e046d77f77beac62b38bf72c19
GitHub-Pull-Request: golang/go#34434
Reviewed-on: https://go-review.googlesource.com/c/go/+/196717
Reviewed-by: Rob Pike <r@golang.org>
|
|
This is intended to help clear up confusion around the usage of the
Title and ToTitle functions. It includes a link to define title case
to distinguish it from upper case. It also includes an additional
example for the ToTitle function to showcase the difference in behavior
between it and the Title function.
Fixes #33302
Change-Id: I44e62962fb04d0d22966a39eda3a2d16de7a2291
Reviewed-on: https://go-review.googlesource.com/c/go/+/187825
Reviewed-by: Rob Pike <r@golang.org>
|
|
The newly added functions create a copy of their input with all bytes in
invalid UTF-8 byte sequences mapped to the UTF-8 byte sequence
given as replacement parameter.
Fixes #25805
Change-Id: Iaf65f65b40c0581c6bb000f1590408d6628321d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/142003
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
When string letters are all in lower/upper cases, both functions respectively
return original string.
Fixes #30987
Change-Id: Ie8d664f7af5e087f82c1bc156933e9a995645bf4
Reviewed-on: https://go-review.googlesource.com/c/go/+/171735
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
As noted by Brad in CL 170954 for package bytes.
Change-Id: I2772a356299e54ba5b7884d537e6649039adb9be
Reviewed-on: https://go-review.googlesource.com/c/go/+/171198
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
CL 56470 unindented bytes.Fields, but not strings.Fields. Do so now to
make it easier to diff the two functions for potential differences.
Change-Id: Ifef81f50cee64e8277e91efa5ec5521d8d21d3bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/170951
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This change adds a fast path for ASCII strings to both
strings.TrimSpace and bytes.TrimSpace. It doesn't slow down the
non-ASCII path much, if at all.
I added benchmarks for strings.TrimSpace as it didn't have any, and
I fleshed out the benchmarks for bytes.TrimSpace as it just had one
case (for ASCII). The benchmarks (and the code!) are now the same
between the two versions. Below are the benchmark results:
strings.TrimSpace:
name old time/op new time/op delta
TrimSpace/NoTrim-8 18.6ns ± 0% 3.8ns ± 0% -79.53% (p=0.000 n=5+4)
TrimSpace/ASCII-8 33.5ns ± 2% 6.0ns ± 3% -82.05% (p=0.008 n=5+5)
TrimSpace/SomeNonASCII-8 97.1ns ± 1% 88.6ns ± 1% -8.68% (p=0.008 n=5+5)
TrimSpace/JustNonASCII-8 144ns ± 0% 143ns ± 0% ~ (p=0.079 n=4+5)
bytes.TrimSpace:
name old time/op new time/op delta
TrimSpace/NoTrim-8 18.9ns ± 1% 4.1ns ± 1% -78.34% (p=0.008 n=5+5)
TrimSpace/ASCII-8 29.9ns ± 0% 6.3ns ± 1% -79.06% (p=0.008 n=5+5)
TrimSpace/SomeNonASCII-8 91.5ns ± 0% 82.3ns ± 0% -10.03% (p=0.008 n=5+5)
TrimSpace/JustNonASCII-8 150ns ± 0% 150ns ± 0% ~ (all equal)
Fixes #29122
Change-Id: Ica45cd86a219cadf60173ec9db260133cd1d7951
Reviewed-on: https://go-review.googlesource.com/c/go/+/152917
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Cleans things up quite a bit.
There's still a few more, like runtime.cmpstring, which might also
be worth fixing.
Change-Id: Ide18dd621efc129cc686db223f47fa0b044b5580
Reviewed-on: https://go-review.googlesource.com/c/148578
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Change-Id: Ibdca4f7002585b00d7f69d710285a8e0f69c598a
GitHub-Last-Rev: eb8f800c986c8ac4a81705158ecc730c35e1c5c2
GitHub-Pull-Request: golang/go#28659
Reviewed-on: https://go-review.googlesource.com/c/148477
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
When an invalid UTF-8 byte sequence is decoded in a range loop over a string
a utf8.RuneError rune is returned. This is not distinguishable from decoding
the valid '\uFFFD' sequence representing utf8.RuneError from a string without
further checks within the range loop.
The previous Map code did not do any extra checks and would thereby not map
invalid UTF-8 byte sequences correctly when those were mapping to utf8.RuneError.
Fix this by adding the extra checks necessary to distinguish the decoding
of invalid utf8 byte sequences from decoding the sequence for utf8.RuneError
when the mapping of a rune is utf8.RuneError.
This fix does not result in a measureable performance regression:
name old time/op new time/op delta
ByteByteMap 1.05µs ± 3% 1.03µs ± 3% ~ (p=0.118 n=10+10)
Map/identity/ASCII 169ns ± 2% 170ns ± 1% ~ (p=0.501 n=9+10)
Map/identity/Greek 298ns ± 1% 303ns ± 4% ~ (p=0.338 n=10+10)
Map/change/ASCII 323ns ± 3% 325ns ± 4% ~ (p=0.679 n=8+10)
Map/change/Greek 628ns ± 5% 635ns ± 1% ~ (p=0.460 n=10+9)
MapNoChanges 120ns ± 4% 119ns ± 1% ~ (p=0.496 n=10+9)
Fixes #26305
Change-Id: I70e99fa244983c5040756fa4549ac1e8cb6022c3
Reviewed-on: https://go-review.googlesource.com/c/131495
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Credit to Harald Nordgren for the proposal in
https://golang.org/cl/137456 and #27864.
Fixes #27864
Change-Id: I80546683b0623124fe4627a71af88add2f6c1c27
Reviewed-on: https://go-review.googlesource.com/137855
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Map was optimized to use Builder in 45c7d80832, which avoided the []byte
to string converstion. This left the ToUpper and ToLower ASCII fast path
with an extra allocation over Map.
name old time/op new time/op delta
ToUpper/#00-12 3.59ns ± 4% 3.71ns ± 1% ~ (p=0.056 n=5+5)
ToUpper/ONLYUPPER-12 11.8ns ± 2% 10.5ns ± 2% -10.85% (p=0.008 n=5+5)
ToUpper/abc-12 31.8ns ± 1% 25.3ns ± 1% -20.40% (p=0.008 n=5+5)
ToUpper/AbC123-12 46.2ns ± 7% 31.9ns ± 8% -30.89% (p=0.008 n=5+5)
ToUpper/azAZ09_-12 47.1ns ± 8% 32.6ns ± 4% -30.77% (p=0.008 n=5+5)
ToUpper/longStrinGwitHmixofsmaLLandcAps-12 137ns ±15% 104ns ±11% -24.11% (p=0.008 n=5+5)
ToUpper/longɐstringɐwithɐnonasciiⱯchars-12 231ns ± 1% 228ns ± 1% ~ (p=0.079 n=5+5)
ToUpper/ɐɐɐɐɐ-12 207ns ± 3% 206ns ± 1% ~ (p=0.913 n=5+5)
ToUpper/a\u0080\U0010ffff-12 90.8ns ± 1% 89.6ns ± 1% -1.30% (p=0.024 n=5+5)
ToLower/#00-12 3.59ns ± 1% 4.26ns ± 2% +18.66% (p=0.008 n=5+5)
ToLower/abc-12 6.32ns ± 1% 6.62ns ± 1% +4.72% (p=0.008 n=5+5)
ToLower/AbC123-12 45.0ns ±13% 31.5ns ± 4% -29.89% (p=0.008 n=5+5)
ToLower/azAZ09_-12 48.8ns ± 6% 33.2ns ± 3% -31.91% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps-12 149ns ±13% 98ns ± 8% -34.30% (p=0.008 n=5+5)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS-12 237ns ± 4% 237ns ± 2% ~ (p=0.635 n=5+5)
ToLower/ⱭⱭⱭⱭⱭ-12 181ns ± 1% 181ns ± 1% ~ (p=0.762 n=5+5)
ToLower/A\u0080\U0010ffff-12 90.6ns ± 1% 92.5ns ± 1% +2.05% (p=0.016 n=5+5)
name old alloc/op new alloc/op delta
ToUpper/#00-12 0.00B 0.00B ~ (all equal)
ToUpper/ONLYUPPER-12 0.00B 0.00B ~ (all equal)
ToUpper/abc-12 6.00B ± 0% 3.00B ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/AbC123-12 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/azAZ09_-12 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/longStrinGwitHmixofsmaLLandcAps-12 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/longɐstringɐwithɐnonasciiⱯchars-12 48.0B ± 0% 48.0B ± 0% ~ (all equal)
ToUpper/ɐɐɐɐɐ-12 48.0B ± 0% 48.0B ± 0% ~ (all equal)
ToUpper/a\u0080\U0010ffff-12 16.0B ± 0% 16.0B ± 0% ~ (all equal)
ToLower/#00-12 0.00B 0.00B ~ (all equal)
ToLower/abc-12 0.00B 0.00B ~ (all equal)
ToLower/AbC123-12 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.008 n=5+5)
ToLower/azAZ09_-12 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps-12 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.008 n=5+5)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS-12 48.0B ± 0% 48.0B ± 0% ~ (all equal)
ToLower/ⱭⱭⱭⱭⱭ-12 32.0B ± 0% 32.0B ± 0% ~ (all equal)
ToLower/A\u0080\U0010ffff-12 16.0B ± 0% 16.0B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
ToUpper/#00-12 0.00 0.00 ~ (all equal)
ToUpper/ONLYUPPER-12 0.00 0.00 ~ (all equal)
ToUpper/abc-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/AbC123-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/azAZ09_-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/longStrinGwitHmixofsmaLLandcAps-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToUpper/longɐstringɐwithɐnonasciiⱯchars-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
ToUpper/ɐɐɐɐɐ-12 2.00 ± 0% 2.00 ± 0% ~ (all equal)
ToUpper/a\u0080\U0010ffff-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
ToLower/#00-12 0.00 0.00 ~ (all equal)
ToLower/abc-12 0.00 0.00 ~ (all equal)
ToLower/AbC123-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToLower/azAZ09_-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps-12 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
ToLower/ⱭⱭⱭⱭⱭ-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
ToLower/A\u0080\U0010ffff-12 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Updates #26304
Change-Id: I4179e21d5e60d950b925fe3ffc74b376b82812d2
GitHub-Last-Rev: 2c7c3bb75b8fb16fed5f0c8979ee9941675ed6bf
GitHub-Pull-Request: golang/go#27872
Reviewed-on: https://go-review.googlesource.com/137575
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This change compares the first two characters instead of the first one,
and if they match, the entire string is compared. Comparing the first two
characters helps to filter out the case where the first character matches
but the subsequent characters do not match, thereby improving the substring
search speed in this case.
Benchmarks with no effect or minimal impact (less than 5%) is not listed,
the following are improved benchmarks:
On arm64:
strings:
IndexPeriodic/IndexPeriodic16-8 172890.00ns +- 2% 124156.20ns +- 0% -28.19% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic32-8 78092.80ns +- 0% 65138.60ns +- 0% -16.59% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic64-8 42322.20ns +- 0% 34661.60ns +- 0% -18.10% (p=0.008 n=5+5)
bytes:
IndexPeriodic/IndexPeriodic16-8 183468.20ns +- 6% 123759.00ns +- 0% -32.54% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic32-8 84776.40ns +- 0% 63907.80ns +- 0% -24.62% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic64-8 45835.60ns +- 0% 34194.20ns +- 0% -25.40% (p=0.008 n=5+5)
On amd64:
strings:
IndexPeriodic/IndexPeriodic8-16 219499.00ns +- 0% 178123.40ns +- 0% -18.85% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic16-16 109760.20ns +- 0% 88957.80ns +- 0% -18.95% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic32-16 54943.00ns +- 0% 44573.80ns +- 0% -18.87% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic64-16 29804.80ns +- 0% 24417.80ns +- 0% -18.07% (p=0.008 n=5+5)
bytes:
IndexPeriodic/IndexPeriodic8-16 226592.60ns +- 0% 181183.20ns +- 0% -20.04% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic16-16 111432.60ns +- 0% 90634.60ns +- 0% -18.66% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic32-16 55640.60ns +- 0% 45433.00ns +- 0% -18.35% (p=0.008 n=5+5)
IndexPeriodic/IndexPeriodic64-16 30833.00ns +- 0% 24784.20ns +- 0% -19.62% (p=0.008 n=5+5)
Change-Id: I2d9e7e138d29e960d20a203eb74dc2ec976a9d71
Reviewed-on: https://go-review.googlesource.com/131177
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The existing implementation has a bunch of special cases and
suffers an additional allocation for longer arrays. We can replace
this code with a simple implementation using Builder, improve
performance and reduce complexity.
name old time/op new time/op delta
Join/0-8 3.53ns ± 3% 3.72ns ± 2% +5.56% (p=0.000 n=10+10)
Join/1-8 3.94ns ± 4% 3.40ns ± 4% -13.57% (p=0.000 n=10+10)
Join/2-8 57.0ns ± 3% 51.0ns ± 1% -10.48% (p=0.000 n=10+9)
Join/3-8 74.9ns ± 2% 65.5ns ± 4% -12.60% (p=0.000 n=10+10)
Join/4-8 105ns ± 0% 79ns ± 4% -24.63% (p=0.000 n=6+10)
Join/5-8 116ns ± 2% 91ns ± 4% -21.95% (p=0.000 n=10+10)
Join/6-8 131ns ± 1% 104ns ± 1% -20.66% (p=0.000 n=10+10)
Join/7-8 141ns ± 0% 114ns ± 4% -18.82% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
Join/0-8 0.00B 0.00B ~ (all equal)
Join/1-8 0.00B 0.00B ~ (all equal)
Join/2-8 16.0B ± 0% 16.0B ± 0% ~ (all equal)
Join/3-8 32.0B ± 0% 32.0B ± 0% ~ (all equal)
Join/4-8 96.0B ± 0% 48.0B ± 0% -50.00% (p=0.000 n=10+10)
Join/5-8 96.0B ± 0% 48.0B ± 0% -50.00% (p=0.000 n=10+10)
Join/6-8 128B ± 0% 64B ± 0% -50.00% (p=0.000 n=10+10)
Join/7-8 128B ± 0% 64B ± 0% -50.00% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
Join/0-8 0.00 0.00 ~ (all equal)
Join/1-8 0.00 0.00 ~ (all equal)
Join/2-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Join/3-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Join/4-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Join/5-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Join/6-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Join/7-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Change-Id: I866a50e809c398512cb87648c955eaa4bf4d8606
Reviewed-on: https://go-review.googlesource.com/132895
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
name old time/op new time/op delta
Repeat/5x1-4 95.9ns ± 2% 70.1ns ± 2% -26.93% (p=0.000 n=9+10)
Repeat/5x2-4 146ns ± 3% 100ns ± 2% -31.99% (p=0.000 n=10+10)
Repeat/5x6-4 203ns ± 3% 140ns ± 4% -30.77% (p=0.000 n=10+10)
Repeat/10x1-4 139ns ± 3% 92ns ± 4% -34.08% (p=0.000 n=10+10)
Repeat/10x2-4 188ns ± 4% 122ns ± 2% -35.34% (p=0.000 n=10+10)
Repeat/10x6-4 264ns ± 5% 179ns ± 4% -32.15% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
Repeat/5x1-4 10.0B ± 0% 5.0B ± 0% -50.00% (p=0.000 n=10+10)
Repeat/5x2-4 32.0B ± 0% 16.0B ± 0% -50.00% (p=0.000 n=10+10)
Repeat/5x6-4 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10)
Repeat/10x1-4 32.0B ± 0% 16.0B ± 0% -50.00% (p=0.000 n=10+10)
Repeat/10x2-4 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10)
Repeat/10x6-4 128B ± 0% 64B ± 0% -50.00% (p=0.000 n=10+10)
Change-Id: I6619336da636df39c560f6cc481519f48c6e8176
GitHub-Last-Rev: 4b2c73f3bfa0b3789268b9ea6e1ecdb984e8087c
GitHub-Pull-Request: golang/go#25894
Reviewed-on: https://go-review.googlesource.com/118855
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
A number of explicit function literals found through the
unlambda linter are removed.
Fixes #26802
Change-Id: I0b122bdd95e9cb804c77efe20483fdf681c8154e
Reviewed-on: https://go-review.googlesource.com/127756
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Use a builder to avoid the copy when converting the []byte to a string.
name old time/op new time/op delta
ByteByteMap-8 796ns ± 5% 700ns ± 1% -12.00% (p=0.000 n=9+8)
Map/identity/ASCII-8 123ns ± 8% 126ns ± 7% ~ (p=0.194 n=10+10)
Map/identity/Greek-8 198ns ± 2% 204ns ± 5% +2.99% (p=0.008 n=9+10)
Map/change/ASCII-8 266ns ±10% 202ns ± 3% -24.19% (p=0.000 n=10+10)
Map/change/Greek-8 450ns ± 4% 406ns ± 1% -9.73% (p=0.000 n=9+10)
MapNoChanges-8 85.4ns ± 3% 90.2ns ±11% +5.67% (p=0.000 n=9+10)
name old alloc/op new alloc/op delta
ByteByteMap-8 416B ± 0% 208B ± 0% -50.00% (p=0.000 n=10+10)
Map/identity/ASCII-8 0.00B 0.00B ~ (all equal)
Map/identity/Greek-8 0.00B 0.00B ~ (all equal)
Map/change/ASCII-8 128B ± 0% 64B ± 0% -50.00% (p=0.000 n=10+10)
Map/change/Greek-8 160B ± 0% 80B ± 0% -50.00% (p=0.000 n=10+10)
MapNoChanges-8 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
ByteByteMap-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Map/identity/ASCII-8 0.00 0.00 ~ (all equal)
Map/identity/Greek-8 0.00 0.00 ~ (all equal)
Map/change/ASCII-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
Map/change/Greek-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
MapNoChanges-8 0.00 0.00 ~ (all equal)
Fixes #26304
Change-Id: Ideec9dfc29b0b8107f34fc634247081d0031777d
Reviewed-on: https://go-review.googlesource.com/122875
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fixes #26654
Change-Id: I4832c45cad40607b83e1a8a9b562fa12e639b7d9
GitHub-Last-Rev: c9ceedb7d4b4c01f91ea4fe3dc3496e73eed9120
GitHub-Pull-Request: golang/go#26781
Reviewed-on: https://go-review.googlesource.com/127716
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
If one quickly looks at the strings package godoc, reading the name
TrimLeft, one might think it removes a prefix from the string.
The function's godoc does explain its purpose, but it's apparent that it
is not clear enough, as there have been numerous raised issues about
this confusion: #12771 #14657 #18160 #19371 #20085 #25328 #26119. These
questions are also frequent elsewhere on the internet.
Add a very short paragraph to the godoc, to hopefully point new Go
developers in the right direction faster. Do the same thing for
TrimRight and TrimSuffix.
Change-Id: I4dee5ed8dd9fba565b4755bad12ae1ee6d277959
Reviewed-on: https://go-review.googlesource.com/121637
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fix encoding of PAD (U+0080) which has the same value as utf8.RuneSelf
being incorrectly encoded as \x80 in strings.Map due to using <= instead
of a < comparison operator to check one byte encodings for utf8.
Fixes #25242
Change-Id: Ib6c7d1f425a7ba81e431b6d64009e713d94ea3bc
Reviewed-on: https://go-review.googlesource.com/111286
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The existing implementation only considers the special ASCII
case when the lower character is an upper case letter. This
means that most ASCII comparisons use unicode.SimpleFold even
when it is not necessary.
benchmark old ns/op new ns/op delta
BenchmarkEqualFold-8 450 390 -13.33%
Change-Id: I735ca3c30fc0145c186d2a54f31fd39caab2c3fa
Reviewed-on: https://go-review.googlesource.com/110018
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Also move the arm64 CountByte implementation while we're here.
Fixes #19792
Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Move bytes.Count and strings.Count to bytealg.
Update #19792
Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
CL 65851 (bytes) and CL 65910 (strings) “improve[d] readability”
by removing the special case that bypassed the whole function body
when chars == "". In doing so, yes, the function was unindented a
level, which is nice, but the runtime of that case went from O(1) to O(n)
where n = len(s).
I don't know if anyone's code depends on the O(1) behavior in this case,
but quite possibly someone's does.
This CL adds the special case back, with a comment to prevent future
deletions, and without reindenting each function body in full.
Change-Id: I5aba33922b304dd1b8657e6d51d6c937a7f95c81
Reviewed-on: https://go-review.googlesource.com/78112
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Use IndexByte first, as it allows us to skip lots of bytes quickly.
If IndexByte is generating a lot of false positives, switch over to Rabin-Karp.
Experiments for ppc64le
bytes:
name old time/op new time/op delta
IndexPeriodic/IndexPeriodic2-2 1.12ms ± 0% 0.18ms ± 0% -83.54% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic4-2 635µs ± 0% 184µs ± 0% -71.06% (p=0.000 n=9+9)
IndexPeriodic/IndexPeriodic8-2 289µs ± 0% 184µs ± 0% -36.51% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic16-2 133µs ± 0% 183µs ± 0% +37.68% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic32-2 68.3µs ± 0% 70.2µs ± 0% +2.76% (p=0.000 n=10+10)
IndexPeriodic/IndexPeriodic64-2 35.8µs ± 0% 36.6µs ± 0% +2.17% (p=0.000 n=8+10)
strings:
name old time/op new time/op delta
IndexPeriodic/IndexPeriodic2-2 184µs ± 0% 184µs ± 0% +0.11% (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic4-2 184µs ± 0% 184µs ± 0% ~ (p=0.886 n=4+4)
IndexPeriodic/IndexPeriodic8-2 184µs ± 0% 184µs ± 0% ~ (p=0.486 n=4+4)
IndexPeriodic/IndexPeriodic16-2 185µs ± 1% 184µs ± 0% ~ (p=0.343 n=4+4)
IndexPeriodic/IndexPeriodic32-2 184µs ± 0% 69µs ± 0% -62.37% (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic64-2 184µs ± 0% 37µs ± 0% -80.17% (p=0.029 n=4+4)
Fixes #22578
Change-Id: If2a4d8554cb96bfd699b58149d13ac294615f8b8
Reviewed-on: https://go-review.googlesource.com/76070
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
|
|
Handling the ASCII case inline and call unicode.ToLower only
for non-ASCII cases.
Gives good improvements for the ASCII case and minor perf
degrade for non-ASCII case
name old time/op new time/op delta
ToLower/#00 10.8ns ± 1% 9.0ns ± 1% -16.83% (p=0.008 n=5+5)
ToLower/abc 23.3ns ± 4% 12.6ns ± 1% -46.01% (p=0.008 n=5+5)
ToLower/AbC123 91.0ns ± 2% 70.4ns ± 0% -22.59% (p=0.008 n=5+5)
ToLower/azAZ09_ 104ns ± 3% 75ns ± 1% -28.35% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps 254ns ± 4% 157ns ± 0% -38.19% (p=0.016 n=5+4)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS 446ns ± 1% 451ns ± 1% ~ (p=0.056 n=5+5)
ToLower/ⱭⱭⱭⱭⱭ 345ns ± 1% 348ns ± 0% +0.93% (p=0.016 n=5+5)
name old alloc/op new alloc/op delta
ToLower/#00 0.00B 0.00B ~ (all equal)
ToLower/abc 0.00B 0.00B ~ (all equal)
ToLower/AbC123 16.0B ± 0% 16.0B ± 0% ~ (all equal)
ToLower/azAZ09_ 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps 80.0B ± 0% 64.0B ± 0% -20.00% (p=0.008 n=5+5)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS 96.0B ± 0% 96.0B ± 0% ~ (all equal)
ToLower/ⱭⱭⱭⱭⱭ 48.0B ± 0% 48.0B ± 0% ~ (all equal)
Ran on a machine with Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Fixes #17859
Change-Id: Iacc1e6b77e1aedba9447a6e94352606f131ea597
Reviewed-on: https://go-review.googlesource.com/76470
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Handling the ASCII case inline and call unicode.ToUpper only
for non-ascii cases.
Gives good improvements for the ascii case and minor perf
degrade for non-ascii case
name old time/op new time/op delta
ToUpper/#00 11.7ns ± 8% 8.0ns ± 1% -31.95% (p=0.008 n=5+5)
ToUpper/ONLYUPPER 45.6ns ± 5% 19.9ns ± 1% -56.40% (p=0.008 n=5+5)
ToUpper/abc 77.4ns ± 1% 57.0ns ± 1% -26.32% (p=0.008 n=5+5)
ToUpper/AbC123 92.1ns ± 4% 67.7ns ± 2% -26.57% (p=0.008 n=5+5)
ToUpper/azAZ09_ 105ns ± 6% 67ns ± 2% -36.26% (p=0.000 n=5+4)
ToUpper/longStrinGwitHmixofsmaLLandcAps 255ns ± 1% 140ns ± 1% -45.01% (p=0.029 n=4+4)
ToUpper/longɐstringɐwithɐnonasciiⱯchars 440ns ± 1% 447ns ± 0% +1.49% (p=0.016 n=5+4)
ToUpper/ɐɐɐɐɐ 370ns ± 4% 366ns ± 1% ~ (p=0.667 n=5+5)
name old alloc/op new alloc/op delta
ToUpper/#00 0.00B 0.00B ~ (all equal)
ToUpper/ONLYUPPER 0.00B 0.00B ~ (all equal)
ToUpper/abc 16.0B ± 0% 6.0B ± 0% -62.50% (p=0.008 n=5+5)
ToUpper/AbC123 16.0B ± 0% 16.0B ± 0% ~ (all equal)
ToUpper/azAZ09_ 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.008 n=5+5)
ToUpper/longStrinGwitHmixofsmaLLandcAps 80.0B ± 0% 64.0B ± 0% -20.00% (p=0.008 n=5+5)
ToUpper/longɐstringɐwithɐnonasciiⱯchars 96.0B ± 0% 96.0B ± 0% ~ (all equal)
ToUpper/ɐɐɐɐɐ 64.0B ± 0% 64.0B ± 0% ~ (all equal)
Ran on a machine with Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Updates #17859
Change-Id: I0735ac4a4a36e8a8f6cc06f2c16b871f12b4abf9
Reviewed-on: https://go-review.googlesource.com/68370
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This change removes the check of len(chars) > 0 inside the Index and
IndexAny functions which was redundant.
Change-Id: Iffbc0f2b3332c6e31c7514b5f644b6fe7bdcfe0d
Reviewed-on: https://go-review.googlesource.com/65910
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Change-Id: I70b839ff0ae5f015587390a82616ebb1d657d71a
Reviewed-on: https://go-review.googlesource.com/55490
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Increases performance of FieldsFunc by recording the start and end
of the fields in an array. The first 32 fields are saved in a pre-allocated
array on the stack. This avoids the old behavior of iterating over the
input string two times but uses more allocations when more than 32 fields
are encountered.
Additionally code for handling non-ASCII containing strings from Fields is
removed and replaced by a call to the new faster FieldsFunc function.
Overall this still leads to a slowdown for Fields on non-ASCII strings
while speeding up Fields in general.
name old time/op new time/op delta
Fields/ASCII/16 116ns ± 5% 115ns ± 5% ~ (p=0.480 n=10+10)
Fields/ASCII/256 765ns ± 1% 761ns ± 2% ~ (p=0.171 n=10+10)
Fields/ASCII/4096 12.5µs ± 1% 12.7µs ± 1% +1.82% (p=0.000 n=10+10)
Fields/ASCII/65536 226µs ± 1% 226µs ± 2% ~ (p=0.739 n=10+10)
Fields/ASCII/1048576 5.12ms ± 1% 5.12ms ± 1% ~ (p=0.696 n=8+10)
Fields/Mixed/16 172ns ± 1% 233ns ± 1% +35.90% (p=0.000 n=9+10)
Fields/Mixed/256 1.18µs ± 2% 2.45µs ± 1% +107.47% (p=0.000 n=10+10)
Fields/Mixed/4096 20.3µs ± 1% 43.1µs ± 2% +112.41% (p=0.000 n=10+10)
Fields/Mixed/65536 364µs ± 1% 704µs ± 1% +93.56% (p=0.000 n=9+10)
Fields/Mixed/1048576 7.07ms ± 2% 13.34ms ± 4% +88.83% (p=0.000 n=10+10)
FieldsFunc/ASCII/16 274ns ± 1% 188ns ± 3% -31.44% (p=0.000 n=10+10)
FieldsFunc/ASCII/256 3.69µs ± 1% 2.06µs ± 2% -44.26% (p=0.000 n=10+10)
FieldsFunc/ASCII/4096 59.9µs ± 1% 35.3µs ± 2% -41.10% (p=0.000 n=10+10)
FieldsFunc/ASCII/65536 958µs ± 1% 567µs ± 1% -40.82% (p=0.000 n=10+9)
FieldsFunc/ASCII/1048576 16.3ms ± 2% 11.0ms ± 3% -32.52% (p=0.000 n=10+10)
FieldsFunc/Mixed/16 309ns ± 1% 213ns ± 0% -30.98% (p=0.000 n=10+6)
FieldsFunc/Mixed/256 3.83µs ± 1% 2.14µs ± 1% -44.01% (p=0.000 n=10+10)
FieldsFunc/Mixed/4096 66.2µs ± 2% 37.8µs ± 1% -42.85% (p=0.000 n=10+10)
FieldsFunc/Mixed/65536 1.09ms ± 1% 0.63ms ± 1% -42.73% (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576 18.6ms ± 3% 12.0ms ± 2% -35.50% (p=0.000 n=10+10)
Fixes #17856
Fixes #19789
Change-Id: I9f5a560e534566fd81963651f342c8f44cfb0469
Reviewed-on: https://go-review.googlesource.com/42810
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
We initialize fieldStart to 0, then set it to i without ever reading
0, so we might as well just initialize it to i.
Change-Id: I17905b25d54a62b6bc76f915353756ed5eb6972b
Reviewed-on: https://go-review.googlesource.com/52933
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Avelino <t@avelino.xxx>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Apparently people get confused by the fact that
Split("", ",")
returns []{""} instead of []{}.
This is actually just a consequence of the fact that if the separator
sep (2nd argument) is not found the string s (1st argument), then the
Split* functions return a length 1 slice with the string s in it.
Document the general case: if sep is not in s, what you get is a len 1
slice with s in it; unless both s and sep are "", in that case you get
an empty slice of length 0.
Fixes #19726
Change-Id: I64c8220b91acd1e5aa1cc1829199e0cd8c47c404
Reviewed-on: https://go-review.googlesource.com/44950
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
|
|
A for-range loop is simpler and also generally faster nowadays:
TrimASCII/1:1-4 69.2ns ± 1% 72.3ns ± 4% +4.55% (p=0.001 n=8+8)
TrimASCII/1:2-4 114ns ± 4% 104ns ± 3% -8.71% (p=0.000 n=9+8)
TrimASCII/1:4-4 112ns ± 1% 109ns ± 2% -2.57% (p=0.000 n=8+9)
TrimASCII/1:8-4 120ns ± 2% 118ns ± 4% ~ (p=0.097 n=9+9)
TrimASCII/1:16-4 137ns ± 3% 132ns ± 3% -3.82% (p=0.001 n=9+9)
TrimASCII/16:1-4 129ns ± 1% 125ns ± 2% -3.38% (p=0.000 n=8+9)
TrimASCII/16:2-4 167ns ± 3% 159ns ± 1% -4.99% (p=0.000 n=9+8)
TrimASCII/16:4-4 165ns ± 2% 162ns ± 1% -1.91% (p=0.005 n=8+9)
TrimASCII/16:8-4 173ns ± 2% 170ns ± 1% -1.29% (p=0.018 n=9+9)
TrimASCII/16:16-4 188ns ± 2% 186ns ± 2% -1.13% (p=0.022 n=8+9)
TrimASCII/256:1-4 1.06µs ± 1% 0.98µs ± 2% -7.64% (p=0.000 n=8+9)
TrimASCII/256:2-4 1.08µs ± 1% 1.06µs ± 2% -1.95% (p=0.006 n=9+9)
TrimASCII/256:4-4 1.09µs ± 1% 1.07µs ± 3% ~ (p=0.059 n=9+9)
TrimASCII/256:8-4 1.10µs ± 1% 1.07µs ± 2% -2.63% (p=0.000 n=9+8)
TrimASCII/256:16-4 1.10µs ± 1% 1.08µs ± 1% -1.90% (p=0.000 n=8+9)
TrimASCII/4096:1-4 15.8µs ± 1% 14.5µs ± 1% -8.59% (p=0.000 n=9+9)
TrimASCII/4096:2-4 15.6µs ± 1% 15.4µs ± 2% -1.27% (p=0.021 n=8+8)
TrimASCII/4096:4-4 15.6µs ± 1% 15.4µs ± 2% ~ (p=0.094 n=9+9)
TrimASCII/4096:8-4 15.7µs ± 1% 15.8µs ± 6% ~ (p=0.555 n=8+8)
TrimASCII/4096:16-4 15.7µs ± 2% 15.3µs ± 1% -2.64% (p=0.000 n=8+9)
Change-Id: I9b06689b67c0cf2c7ff446fc63a8c44cc5d6a246
Reviewed-on: https://go-review.googlesource.com/32891
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Move optimized Count implementation from bytes to runtime. Use in
both bytes and strings packages.
Add CountByte benchmark to strings.
Strings benchmarks:
name old time/op new time/op delta
CountHard1-4 226µs ± 1% 226µs ± 2% ~ (p=0.247 n=10+10)
CountHard2-4 316µs ± 1% 315µs ± 0% ~ (p=0.133 n=9+10)
CountHard3-4 919µs ± 1% 920µs ± 1% ~ (p=0.968 n=10+9)
CountTorture-4 15.4µs ± 1% 15.7µs ± 1% +2.47% (p=0.000 n=10+9)
CountTortureOverlapping-4 9.60ms ± 0% 9.65ms ± 1% ~ (p=0.247 n=10+10)
CountByte/10-4 26.3ns ± 1% 10.9ns ± 1% -58.71% (p=0.000 n=9+9)
CountByte/32-4 42.7ns ± 0% 14.2ns ± 0% -66.64% (p=0.000 n=10+10)
CountByte/4096-4 3.07µs ± 0% 0.31µs ± 2% -89.99% (p=0.000 n=9+10)
CountByte/4194304-4 3.48ms ± 1% 0.34ms ± 1% -90.09% (p=0.000 n=10+9)
CountByte/67108864-4 55.6ms ± 1% 7.0ms ± 0% -87.49% (p=0.000 n=9+8)
name old speed new speed delta
CountByte/10-4 380MB/s ± 1% 919MB/s ± 1% +142.21% (p=0.000 n=9+9)
CountByte/32-4 750MB/s ± 0% 2247MB/s ± 0% +199.62% (p=0.000 n=10+10)
CountByte/4096-4 1.33GB/s ± 0% 13.32GB/s ± 2% +898.13% (p=0.000 n=9+10)
CountByte/4194304-4 1.21GB/s ± 1% 12.17GB/s ± 1% +908.87% (p=0.000 n=10+9)
CountByte/67108864-4 1.21GB/s ± 1% 9.65GB/s ± 0% +699.29% (p=0.000 n=9+8)
Fixes #19411
Change-Id: I8d2d409f0fa6df6d03b60790aa86e540b4a4e3b0
Reviewed-on: https://go-review.googlesource.com/38693
Reviewed-by: Keith Randall <khr@golang.org>
|
|
- use a string lookup to detect if a single byte is a space character
- determine the exact number of fields for ASCII and
a possibly underestimated number of fields for non ASCII strings
by doing a separate byte for byte scan of the input string
before collecting the fields in an extra pass
- provide a fast path for ASCII only strings when collecting the fields
- avoid utf8.DecodeRuneInString and unicode.IsSpace for ASCII characters
Used golang.org/cl/33108 from Joe Tsai as starting point.
name old time/op new time/op delta
Fields/ASCII/16 284ns ± 1% 116ns ± 2% -59.30% (p=0.000 n=9+10)
Fields/ASCII/256 3.81µs ± 1% 0.80µs ± 1% -79.10% (p=0.000 n=10+10)
Fields/ASCII/4096 61.4µs ± 1% 12.3µs ± 1% -79.96% (p=0.000 n=10+9)
Fields/ASCII/65536 982µs ± 1% 235µs ± 0% -76.04% (p=0.000 n=10+9)
Fields/ASCII/1048576 16.7ms ± 2% 5.4ms ± 1% -67.52% (p=0.000 n=10+10)
Fields/Mixed/16 314ns ± 1% 168ns ± 1% -46.33% (p=0.000 n=9+10)
Fields/Mixed/256 3.92µs ± 1% 1.17µs ± 1% -70.19% (p=0.000 n=10+10)
Fields/Mixed/4096 69.1µs ± 1% 19.0µs ± 1% -72.53% (p=0.000 n=10+10)
Fields/Mixed/65536 1.12ms ± 1% 0.39ms ± 0% -65.37% (p=0.000 n=10+9)
Fields/Mixed/1048576 19.0ms ± 2% 7.3ms ± 4% -61.75% (p=0.000 n=10+9)
name old speed new speed delta
Fields/ASCII/16 56.3MB/s ± 1% 138.1MB/s ± 2% +145.31% (p=0.000 n=9+10)
Fields/ASCII/256 67.1MB/s ± 1% 321.0MB/s ± 1% +378.26% (p=0.000 n=10+10)
Fields/ASCII/4096 66.7MB/s ± 1% 333.0MB/s ± 1% +398.97% (p=0.000 n=10+9)
Fields/ASCII/65536 66.7MB/s ± 1% 278.4MB/s ± 0% +317.39% (p=0.000 n=10+9)
Fields/ASCII/1048576 62.7MB/s ± 2% 192.9MB/s ± 1% +207.82% (p=0.000 n=10+10)
Fields/Mixed/16 51.0MB/s ± 2% 94.9MB/s ± 1% +85.87% (p=0.000 n=10+10)
Fields/Mixed/256 65.4MB/s ± 1% 219.2MB/s ± 1% +235.33% (p=0.000 n=10+10)
Fields/Mixed/4096 59.3MB/s ± 1% 215.7MB/s ± 1% +263.98% (p=0.000 n=10+10)
Fields/Mixed/65536 58.6MB/s ± 1% 169.1MB/s ± 0% +188.73% (p=0.000 n=10+9)
Fields/Mixed/1048576 55.1MB/s ± 2% 144.0MB/s ± 4% +161.44% (p=0.000 n=10+9)
Updates #19789
Updates #17856
Change-Id: If2ce1479542702e9cd65a82a462ba55ac8eb3876
Reviewed-on: https://go-review.googlesource.com/37959
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
The recently updated Count functions declare variables before
special-cased returns.
Change-Id: I8f726118336b7b0ff72117d12adc48b6e37e60ea
Reviewed-on: https://go-review.googlesource.com/39357
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The new Map implementation introduced in golang.org/cl/33201
did not differentiate if an invalid UTF-8 sequence was decoded
or the RuneError rune. It would therefore always advance by
3 bytes (which is the length of the RuneError rune) instead
of 1 for an invalid sequences. This cl adds a check to correctly
determine the length of bytes needed to advance to the next rune.
Fixes #19330.
Change-Id: I1e7f9333f3ef6068ffc64015bb0a9f32b0b7111d
Reviewed-on: https://go-review.googlesource.com/37597
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
name old time/op new time/op delta
ByteByteMap-4 2.03µs ± 2% 1.03µs ± 2% -49.24% (p=0.000 n=10+10)
Map/identity/ASCII-4 246ns ± 0% 158ns ± 0% -35.90% (p=0.000 n=9+9)
Map/identity/Greek-4 367ns ± 1% 273ns ± 1% -25.63% (p=0.000 n=10+10)
Map/change/ASCII-4 582ns ± 1% 324ns ± 1% -44.34% (p=0.000 n=10+10)
Map/change/Greek-4 709ns ± 2% 623ns ± 2% -12.16% (p=0.000 n=10+10)
MapNoChanges-4 171ns ± 1% 111ns ± 1% -35.36% (p=0.000 n=8+10)
Updates #17859
Change-Id: I55d7d261fdc1ce2dcd0ebe23b0fa20b9889bf54c
Reviewed-on: https://go-review.googlesource.com/33201
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Using 'sep' as parameter name for strings functions that take a
separator argument is fine, but for functions like Index or Count that
look for a substring it's better to use 'substr' (like Contains
already does).
Fixes #19039
Change-Id: Idd557409c8fea64ce830ab0e3fec37d3d56a79f0
Reviewed-on: https://go-review.googlesource.com/36874
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The relevant benchmark results on linux/amd64:
bytes:
SplitSingleByteSeparator-4 25.7ms ± 5% 9.1ms ± 4% -64.40% (p=0.000 n=10+10)
SplitMultiByteSeparator-4 13.8ms ±20% 4.3ms ± 8% -69.23% (p=0.000 n=10+10)
SplitNSingleByteSeparator-4 1.88µs ± 9% 0.88µs ± 4% -53.25% (p=0.000 n=10+10)
SplitNMultiByteSeparator-4 4.83µs ±10% 1.32µs ± 9% -72.65% (p=0.000 n=10+10)
strings:
name old time/op new time/op delta
SplitSingleByteSeparator-4 21.4ms ± 8% 8.5ms ± 5% -60.19% (p=0.000 n=10+10)
SplitMultiByteSeparator-4 13.2ms ± 9% 3.9ms ± 4% -70.29% (p=0.000 n=10+10)
SplitNSingleByteSeparator-4 1.54µs ± 5% 0.75µs ± 7% -51.21% (p=0.000 n=10+10)
SplitNMultiByteSeparator-4 3.57µs ± 8% 1.01µs ±11% -71.76% (p=0.000 n=10+10)
Fixes #18973
Change-Id: Ie4bc010c6cc389001e72eab530497c81e5b26f34
Reviewed-on: https://go-review.googlesource.com/36510
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Similar to https://go-review.googlesource.com/28586,
but for package bytes instead of strings.
This provides simpler code and some performance gain.
Also update strings.Count to use the same code.
On AMD64 with heavily optimized Index I see:
name old time/op new time/op delta
Count/10-6 47.3ns ± 0% 36.8ns ± 0% -22.35% (p=0.000 n=10+10)
Count/32-6 286ns ± 0% 38ns ± 0% -86.71% (p=0.000 n=10+10)
Count/4K-6 50.1µs ± 0% 4.4µs ± 0% -91.18% (p=0.000 n=10+10)
Count/4M-6 48.1ms ± 1% 4.5ms ± 0% -90.56% (p=0.000 n=10+9)
Count/64M-6 784ms ± 0% 73ms ± 0% -90.73% (p=0.000 n=10+10)
CountEasy/10-6 28.4ns ± 0% 31.0ns ± 0% +9.23% (p=0.000 n=10+10)
CountEasy/32-6 30.6ns ± 0% 37.0ns ± 0% +20.92% (p=0.000 n=10+10)
CountEasy/4K-6 186ns ± 0% 198ns ± 0% +6.45% (p=0.000 n=9+10)
CountEasy/4M-6 233µs ± 2% 234µs ± 2% ~ (p=0.912 n=10+10)
CountEasy/64M-6 6.70ms ± 0% 6.68ms ± 1% ~ (p=0.762 n=8+10)
name old speed new speed delta
Count/10-6 211MB/s ± 0% 272MB/s ± 0% +28.77% (p=0.000 n=10+9)
Count/32-6 112MB/s ± 0% 842MB/s ± 0% +652.84% (p=0.000 n=10+10)
Count/4K-6 81.8MB/s ± 0% 927.6MB/s ± 0% +1033.63% (p=0.000 n=10+9)
Count/4M-6 87.2MB/s ± 1% 924.0MB/s ± 0% +959.25% (p=0.000 n=10+9)
Count/64M-6 85.6MB/s ± 0% 922.9MB/s ± 0% +978.31% (p=0.000 n=10+10)
CountEasy/10-6 352MB/s ± 0% 322MB/s ± 0% -8.41% (p=0.000 n=10+10)
CountEasy/32-6 1.05GB/s ± 0% 0.87GB/s ± 0% -17.35% (p=0.000 n=9+10)
CountEasy/4K-6 22.0GB/s ± 0% 20.6GB/s ± 0% -6.33% (p=0.000 n=10+10)
CountEasy/4M-6 18.0GB/s ± 2% 18.0GB/s ± 2% ~ (p=0.912 n=10+10)
CountEasy/64M-6 10.0GB/s ± 0% 10.0GB/s ± 1% ~ (p=0.762 n=8+10)
On 386, without asm version of Index:
Count/10-6 57.0ns ± 0% 56.9ns ± 0% -0.11% (p=0.006 n=10+9)
Count/32-6 340ns ± 0% 274ns ± 0% -19.48% (p=0.000 n=10+9)
Count/4K-6 49.5µs ± 0% 37.1µs ± 0% -24.96% (p=0.000 n=10+10)
Count/4M-6 51.1ms ± 0% 38.2ms ± 0% -25.21% (p=0.000 n=10+10)
Count/64M-6 818ms ± 0% 613ms ± 0% -25.07% (p=0.000 n=8+10)
CountEasy/10-6 60.0ns ± 0% 70.4ns ± 0% +17.34% (p=0.000 n=10+10)
CountEasy/32-6 81.1ns ± 0% 94.0ns ± 0% +15.97% (p=0.000 n=9+10)
CountEasy/4K-6 4.37µs ± 0% 4.39µs ± 0% +0.30% (p=0.000 n=10+9)
CountEasy/4M-6 4.43ms ± 0% 4.43ms ± 0% ~ (p=0.579 n=10+10)
CountEasy/64M-6 70.9ms ± 0% 70.9ms ± 0% ~ (p=0.912 n=10+10)
name old speed new speed delta
Count/10-6 176MB/s ± 0% 176MB/s ± 0% +0.10% (p=0.000 n=10+9)
Count/32-6 93.9MB/s ± 0% 116.5MB/s ± 0% +24.06% (p=0.000 n=10+9)
Count/4K-6 82.7MB/s ± 0% 110.3MB/s ± 0% +33.26% (p=0.000 n=10+10)
Count/4M-6 82.1MB/s ± 0% 109.7MB/s ± 0% +33.70% (p=0.000 n=10+10)
Count/64M-6 82.0MB/s ± 0% 109.5MB/s ± 0% +33.46% (p=0.000 n=8+10)
CountEasy/10-6 167MB/s ± 0% 142MB/s ± 0% -14.75% (p=0.000 n=9+10)
CountEasy/32-6 395MB/s ± 0% 340MB/s ± 0% -13.77% (p=0.000 n=10+10)
CountEasy/4K-6 936MB/s ± 0% 934MB/s ± 0% -0.29% (p=0.000 n=10+9)
CountEasy/4M-6 947MB/s ± 0% 946MB/s ± 0% ~ (p=0.591 n=10+10)
CountEasy/64M-6 947MB/s ± 0% 947MB/s ± 0% ~ (p=0.867 n=10+10)
Change-Id: Ia76b247372b6f5b5d23a9f10253a86536a5153b3
Reviewed-on: https://go-review.googlesource.com/36489
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
In a large codebase within Google, there are thousands of uses of:
ContainsAny|IndexAny|LastIndexAny|Trim|TrimLeft|TrimRight
An analysis of their usage shows that over 97% of them only use character
sets consisting of only ASCII symbols.
Uses of ContainsAny|IndexAny|LastIndexAny:
6% are 1 character (e.g., "\n" or " ")
58% are 2-4 characters (e.g., "<>" or "\r\n\t ")
24% are 5-9 characters (e.g., "()[]*^$")
10% are 10+ characters (e.g., "+-=&|><!(){}[]^\"~*?:\\/ ")
We optimize for ASCII sets, which are commonly used to search for
"control" characters in some string. We don't optimize for the
single character scenario since IndexRune or IndexByte could be used.
Uses of Trim|TrimLeft|TrimRight:
71% are 1 character (e.g., "\n" or " ")
14% are 2 characters (e.g., "\r\n")
10% are 3-4 characters (e.g., " \t\r\n")
5% are 10+ characters (e.g., "0123456789abcdefABCDEF")
We optimize for the single character case with a simple closured function
that only checks for that character's value. We optimize for the medium
and larger sets using a 16-byte bit-map representing a set of ASCII characters.
The benchmarks below have the following suffix name "%d:%d" where the first
number is the length of the input and the second number is the length
of the charset.
== bytes package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.09 5.23 +2.75%
BenchmarkIndexAnyASCII/1:2-4 5.81 5.85 +0.69%
BenchmarkIndexAnyASCII/1:4-4 7.22 7.50 +3.88%
BenchmarkIndexAnyASCII/1:8-4 11.0 11.1 +0.91%
BenchmarkIndexAnyASCII/1:16-4 17.5 17.8 +1.71%
BenchmarkIndexAnyASCII/16:1-4 36.0 34.0 -5.56%
BenchmarkIndexAnyASCII/16:2-4 46.6 36.5 -21.67%
BenchmarkIndexAnyASCII/16:4-4 78.0 40.4 -48.21%
BenchmarkIndexAnyASCII/16:8-4 136 47.4 -65.15%
BenchmarkIndexAnyASCII/16:16-4 254 61.5 -75.79%
BenchmarkIndexAnyASCII/256:1-4 542 388 -28.41%
BenchmarkIndexAnyASCII/256:2-4 705 382 -45.82%
BenchmarkIndexAnyASCII/256:4-4 1089 386 -64.55%
BenchmarkIndexAnyASCII/256:8-4 1994 394 -80.24%
BenchmarkIndexAnyASCII/256:16-4 3843 411 -89.31%
BenchmarkIndexAnyASCII/4096:1-4 8522 5873 -31.08%
BenchmarkIndexAnyASCII/4096:2-4 11253 5861 -47.92%
BenchmarkIndexAnyASCII/4096:4-4 17824 5883 -66.99%
BenchmarkIndexAnyASCII/4096:8-4 32053 5871 -81.68%
BenchmarkIndexAnyASCII/4096:16-4 60512 5888 -90.27%
BenchmarkTrimASCII/1:1-4 79.5 70.8 -10.94%
BenchmarkTrimASCII/1:2-4 79.0 105 +32.91%
BenchmarkTrimASCII/1:4-4 79.6 109 +36.93%
BenchmarkTrimASCII/1:8-4 78.8 118 +49.75%
BenchmarkTrimASCII/1:16-4 80.2 132 +64.59%
BenchmarkTrimASCII/16:1-4 243 116 -52.26%
BenchmarkTrimASCII/16:2-4 243 171 -29.63%
BenchmarkTrimASCII/16:4-4 243 176 -27.57%
BenchmarkTrimASCII/16:8-4 241 184 -23.65%
BenchmarkTrimASCII/16:16-4 238 199 -16.39%
BenchmarkTrimASCII/256:1-4 2580 840 -67.44%
BenchmarkTrimASCII/256:2-4 2603 1175 -54.86%
BenchmarkTrimASCII/256:4-4 2572 1188 -53.81%
BenchmarkTrimASCII/256:8-4 2550 1191 -53.29%
BenchmarkTrimASCII/256:16-4 2585 1208 -53.27%
BenchmarkTrimASCII/4096:1-4 39773 12181 -69.37%
BenchmarkTrimASCII/4096:2-4 39946 17231 -56.86%
BenchmarkTrimASCII/4096:4-4 39641 17179 -56.66%
BenchmarkTrimASCII/4096:8-4 39835 17175 -56.88%
BenchmarkTrimASCII/4096:16-4 40229 17215 -57.21%
== strings package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.94 4.97 -16.33%
BenchmarkIndexAnyASCII/1:2-4 5.94 5.55 -6.57%
BenchmarkIndexAnyASCII/1:4-4 7.45 7.21 -3.22%
BenchmarkIndexAnyASCII/1:8-4 10.8 10.6 -1.85%
BenchmarkIndexAnyASCII/1:16-4 17.4 17.2 -1.15%
BenchmarkIndexAnyASCII/16:1-4 36.4 32.2 -11.54%
BenchmarkIndexAnyASCII/16:2-4 49.6 34.6 -30.24%
BenchmarkIndexAnyASCII/16:4-4 77.5 37.9 -51.10%
BenchmarkIndexAnyASCII/16:8-4 138 45.5 -67.03%
BenchmarkIndexAnyASCII/16:16-4 241 59.1 -75.48%
BenchmarkIndexAnyASCII/256:1-4 509 378 -25.74%
BenchmarkIndexAnyASCII/256:2-4 720 381 -47.08%
BenchmarkIndexAnyASCII/256:4-4 1142 384 -66.37%
BenchmarkIndexAnyASCII/256:8-4 1999 391 -80.44%
BenchmarkIndexAnyASCII/256:16-4 3735 403 -89.21%
BenchmarkIndexAnyASCII/4096:1-4 7973 5824 -26.95%
BenchmarkIndexAnyASCII/4096:2-4 11432 5809 -49.19%
BenchmarkIndexAnyASCII/4096:4-4 18327 5819 -68.25%
BenchmarkIndexAnyASCII/4096:8-4 33059 5828 -82.37%
BenchmarkIndexAnyASCII/4096:16-4 59703 5817 -90.26%
BenchmarkTrimASCII/1:1-4 71.9 71.8 -0.14%
BenchmarkTrimASCII/1:2-4 73.3 103 +40.52%
BenchmarkTrimASCII/1:4-4 71.8 106 +47.63%
BenchmarkTrimASCII/1:8-4 71.2 113 +58.71%
BenchmarkTrimASCII/1:16-4 71.6 128 +78.77%
BenchmarkTrimASCII/16:1-4 152 116 -23.68%
BenchmarkTrimASCII/16:2-4 160 168 +5.00%
BenchmarkTrimASCII/16:4-4 172 170 -1.16%
BenchmarkTrimASCII/16:8-4 200 177 -11.50%
BenchmarkTrimASCII/16:16-4 254 193 -24.02%
BenchmarkTrimASCII/256:1-4 1438 864 -39.92%
BenchmarkTrimASCII/256:2-4 1551 1195 -22.95%
BenchmarkTrimASCII/256:4-4 1770 1200 -32.20%
BenchmarkTrimASCII/256:8-4 2195 1216 -44.60%
BenchmarkTrimASCII/256:16-4 3054 1224 -59.92%
BenchmarkTrimASCII/4096:1-4 21726 12557 -42.20%
BenchmarkTrimASCII/4096:2-4 23586 17508 -25.77%
BenchmarkTrimASCII/4096:4-4 26898 17510 -34.90%
BenchmarkTrimASCII/4096:8-4 33714 17595 -47.81%
BenchmarkTrimASCII/4096:16-4 47429 17700 -62.68%
The benchmarks added test the worst case. For IndexAny, that is when the
charset matches none of the input. For Trim, it is when the charset matches
all of the input.
Change-Id: I970874d101a96b33528fc99b165379abe58cf6ea
Reviewed-on: https://go-review.googlesource.com/31593
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Martin Möhrmann <martisch@uos.de>
|
|
In all previous versions of Go, the behavior of IndexRune(s, r)
where r was utf.RuneError was that it would effectively return the
index of any invalid UTF-8 byte sequence (include RuneError).
Optimizations made in http://golang.org/cl/28537 and
http://golang.org/cl/28546 altered this undocumented behavior such
that RuneError would only match on the RuneError rune itself.
Although, the new behavior is arguably reasonable, it did break code
that depended on the previous behavior. Thus, we add special checks
to ensure that we preserve the old behavior.
There is a slight performance hit for correctness:
benchmark old ns/op new ns/op delta
BenchmarkIndexRune/10-4 19.3 21.6 +11.92%
BenchmarkIndexRune/32-4 33.6 35.2 +4.76%
This only occurs on small strings. The performance hit for larger strings
is neglible and not shown.
Fixes #17611
Change-Id: I1d863a741213d46c40b2e1724c41245df52502a5
Reviewed-on: https://go-review.googlesource.com/32123
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I40896fffbffefa359d08abda346933aa996f628d
Reviewed-on: https://go-review.googlesource.com/32124
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This simplifies code and provides performance iprovments:
Similar to https://go-review.googlesource.com/#/c/28577
CountHard1-48 1.74ms ±14% 0.17ms ±14% -90.16% (p=0.000 n=19+19)
CountHard2-48 1.78ms ±15% 0.25ms ±13% -86.10% (p=0.000 n=19+20)
CountHard3-48 1.78ms ±12% 0.80ms ±11% -55.19% (p=0.000 n=17+20)
CountTorture-48 13.5µs ±14% 13.6µs ±11% ~ (p=0.625 n=18+19)
CountTortureOverlapping-48 6.92ms ±13% 8.42ms ±11% +21.72% (p=0.000 n=19+17)
Change-Id: Ief120aee918a66487c76be56e0796871c8502f89
Reviewed-on: https://go-review.googlesource.com/28586
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Panic if Repeat is given a negative count or
if the value of (len(*) * count) is detected
to overflow.
We panic because we cannot change the
signature of Repeat to return an error.
Fixes #16237
Change-Id: I9f5ba031a5b8533db0582d7a672ffb715143f3fb
Reviewed-on: https://go-review.googlesource.com/29954
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|