| Age | Commit message (Collapse) | Author |
|
Also move the arm64 CountByte implementation while we're here.
Fixes #19792
Change-Id: I1e0fdf1e03e3135af84150a2703b58dad1b0d57e
Reviewed-on: https://go-review.googlesource.com/98518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Move bytes.Count and strings.Count to bytealg.
Update #19792
Change-Id: I3e4e14b504a0b71758885bb131e5656e342cf8cb
Reviewed-on: https://go-review.googlesource.com/98495
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Move the IndexByte function from the runtime to a new bytealg package.
The new package will eventually hold all the optimized assembly for
groveling through byte slices and strings. It seems a better home for
this code than randomly keeping it in runtime.
Once this is in, the next step is to move the other functions
(Compare, Equal, ...).
Update #19792
This change seems complicated enough that we might just declare
"not worth it" and abandon. Opinions welcome.
The core assembly is all unchanged, except minor modifications where
the code reads cpu feature bits.
The wrapper functions have been cleaned up as they are now actually
checked by vet.
Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796
Reviewed-on: https://go-review.googlesource.com/98016
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Despite the existing test that locks in the allocation behavior, people
really want a benchmark. So:
BenchmarkBuildString_Builder/1Write_NoGrow-4 20000000 60.4 ns/op 48 B/op 1 allocs/op
BenchmarkBuildString_Builder/3Write_NoGrow-4 10000000 230 ns/op 336 B/op 3 allocs/op
BenchmarkBuildString_Builder/3Write_Grow-4 20000000 102 ns/op 112 B/op 1 allocs/op
BenchmarkBuildString_ByteBuffer/1Write_NoGrow-4 10000000 125 ns/op 160 B/op 2 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_NoGrow-4 5000000 339 ns/op 400 B/op 3 allocs/op
BenchmarkBuildString_ByteBuffer/3Write_Grow-4 5000000 316 ns/op 336 B/op 3 allocs/op
I don't think these allocate-as-fast-as-you-can benchmarks are very
interesting because they're effectively just GC benchmarks, but sure.
If one wants to see that there's 1 fewer allocation, there it is. The
ns/op and B/op numbers will change as the built string size changes.
Updates #18990
Change-Id: Ifccf535bd396217434a0e6989e195105f90132ae
Reviewed-on: https://go-review.googlesource.com/96980
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
|
|
All credit and blame goes to Ian for this suggestion, copied from the
runtime.
Fixes #23382
Updates #7921
Change-Id: I3d5a9ee4ab730c87e0f3feff3e7fceff9bcf9e18
Reviewed-on: https://go-review.googlesource.com/86976
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The Builder's ReadFrom method allows the underlying unsafe slice to
escape, and for callers to subsequently modify memory that had been
unsafely converted into an immutable string.
In the original proposal for Builder (#18990), I'd noted there should
be no Read methods:
> There would be no Reset or Bytes or Truncate or Read methods.
> Nothing that could mutate the []byte once it was unsafely converted
> to a string.
And in my prototype (https://golang.org/cl/37767), I handled ReadFrom
properly, but when https://golang.org/cl/74931 arrived, I missed that
it had a ReadFrom method and approved it.
Because we're so close to the Go 1.10 release, just remove the
ReadFrom method rather than think about possible fixes. It has
marginal utility in a Builder anyway.
Also, fix a separate bug that also allowed mutation of a slice's
backing array after it had been converted into a slice by disallowing
copies of the Builder by value.
Updates #18990
Fixes #23083
Fixes #23084
Change-Id: Id1f860f8a4f5f88b32213cf85108ebc609acb95f
Reviewed-on: https://go-review.googlesource.com/83255
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I235c5bc7ce598047eccc1518984dd27f568046a2
Reviewed-on: https://go-review.googlesource.com/82776
Run-TryBot: Caleb Spare <cespare@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
CL 65851 (bytes) and CL 65910 (strings) “improve[d] readability”
by removing the special case that bypassed the whole function body
when chars == "". In doing so, yes, the function was unindented a
level, which is nice, but the runtime of that case went from O(1) to O(n)
where n = len(s).
I don't know if anyone's code depends on the O(1) behavior in this case,
but quite possibly someone's does.
This CL adds the special case back, with a comment to prevent future
deletions, and without reindenting each function body in full.
Change-Id: I5aba33922b304dd1b8657e6d51d6c937a7f95c81
Reviewed-on: https://go-review.googlesource.com/78112
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Use IndexByte first, as it allows us to skip lots of bytes quickly.
If IndexByte is generating a lot of false positives, switch over to Rabin-Karp.
Experiments for ppc64le
bytes:
name old time/op new time/op delta
IndexPeriodic/IndexPeriodic2-2 1.12ms ± 0% 0.18ms ± 0% -83.54% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic4-2 635µs ± 0% 184µs ± 0% -71.06% (p=0.000 n=9+9)
IndexPeriodic/IndexPeriodic8-2 289µs ± 0% 184µs ± 0% -36.51% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic16-2 133µs ± 0% 183µs ± 0% +37.68% (p=0.000 n=10+9)
IndexPeriodic/IndexPeriodic32-2 68.3µs ± 0% 70.2µs ± 0% +2.76% (p=0.000 n=10+10)
IndexPeriodic/IndexPeriodic64-2 35.8µs ± 0% 36.6µs ± 0% +2.17% (p=0.000 n=8+10)
strings:
name old time/op new time/op delta
IndexPeriodic/IndexPeriodic2-2 184µs ± 0% 184µs ± 0% +0.11% (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic4-2 184µs ± 0% 184µs ± 0% ~ (p=0.886 n=4+4)
IndexPeriodic/IndexPeriodic8-2 184µs ± 0% 184µs ± 0% ~ (p=0.486 n=4+4)
IndexPeriodic/IndexPeriodic16-2 185µs ± 1% 184µs ± 0% ~ (p=0.343 n=4+4)
IndexPeriodic/IndexPeriodic32-2 184µs ± 0% 69µs ± 0% -62.37% (p=0.029 n=4+4)
IndexPeriodic/IndexPeriodic64-2 184µs ± 0% 37µs ± 0% -80.17% (p=0.029 n=4+4)
Fixes #22578
Change-Id: If2a4d8554cb96bfd699b58149d13ac294615f8b8
Reviewed-on: https://go-review.googlesource.com/76070
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
|
|
Handling the ASCII case inline and call unicode.ToLower only
for non-ASCII cases.
Gives good improvements for the ASCII case and minor perf
degrade for non-ASCII case
name old time/op new time/op delta
ToLower/#00 10.8ns ± 1% 9.0ns ± 1% -16.83% (p=0.008 n=5+5)
ToLower/abc 23.3ns ± 4% 12.6ns ± 1% -46.01% (p=0.008 n=5+5)
ToLower/AbC123 91.0ns ± 2% 70.4ns ± 0% -22.59% (p=0.008 n=5+5)
ToLower/azAZ09_ 104ns ± 3% 75ns ± 1% -28.35% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps 254ns ± 4% 157ns ± 0% -38.19% (p=0.016 n=5+4)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS 446ns ± 1% 451ns ± 1% ~ (p=0.056 n=5+5)
ToLower/ⱭⱭⱭⱭⱭ 345ns ± 1% 348ns ± 0% +0.93% (p=0.016 n=5+5)
name old alloc/op new alloc/op delta
ToLower/#00 0.00B 0.00B ~ (all equal)
ToLower/abc 0.00B 0.00B ~ (all equal)
ToLower/AbC123 16.0B ± 0% 16.0B ± 0% ~ (all equal)
ToLower/azAZ09_ 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.008 n=5+5)
ToLower/longStrinGwitHmixofsmaLLandcAps 80.0B ± 0% 64.0B ± 0% -20.00% (p=0.008 n=5+5)
ToLower/LONGⱯSTRINGⱯWITHⱯNONASCIIⱯCHARS 96.0B ± 0% 96.0B ± 0% ~ (all equal)
ToLower/ⱭⱭⱭⱭⱭ 48.0B ± 0% 48.0B ± 0% ~ (all equal)
Ran on a machine with Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Fixes #17859
Change-Id: Iacc1e6b77e1aedba9447a6e94352606f131ea597
Reviewed-on: https://go-review.googlesource.com/76470
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Handling the ASCII case inline and call unicode.ToUpper only
for non-ascii cases.
Gives good improvements for the ascii case and minor perf
degrade for non-ascii case
name old time/op new time/op delta
ToUpper/#00 11.7ns ± 8% 8.0ns ± 1% -31.95% (p=0.008 n=5+5)
ToUpper/ONLYUPPER 45.6ns ± 5% 19.9ns ± 1% -56.40% (p=0.008 n=5+5)
ToUpper/abc 77.4ns ± 1% 57.0ns ± 1% -26.32% (p=0.008 n=5+5)
ToUpper/AbC123 92.1ns ± 4% 67.7ns ± 2% -26.57% (p=0.008 n=5+5)
ToUpper/azAZ09_ 105ns ± 6% 67ns ± 2% -36.26% (p=0.000 n=5+4)
ToUpper/longStrinGwitHmixofsmaLLandcAps 255ns ± 1% 140ns ± 1% -45.01% (p=0.029 n=4+4)
ToUpper/longɐstringɐwithɐnonasciiⱯchars 440ns ± 1% 447ns ± 0% +1.49% (p=0.016 n=5+4)
ToUpper/ɐɐɐɐɐ 370ns ± 4% 366ns ± 1% ~ (p=0.667 n=5+5)
name old alloc/op new alloc/op delta
ToUpper/#00 0.00B 0.00B ~ (all equal)
ToUpper/ONLYUPPER 0.00B 0.00B ~ (all equal)
ToUpper/abc 16.0B ± 0% 6.0B ± 0% -62.50% (p=0.008 n=5+5)
ToUpper/AbC123 16.0B ± 0% 16.0B ± 0% ~ (all equal)
ToUpper/azAZ09_ 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.008 n=5+5)
ToUpper/longStrinGwitHmixofsmaLLandcAps 80.0B ± 0% 64.0B ± 0% -20.00% (p=0.008 n=5+5)
ToUpper/longɐstringɐwithɐnonasciiⱯchars 96.0B ± 0% 96.0B ± 0% ~ (all equal)
ToUpper/ɐɐɐɐɐ 64.0B ± 0% 64.0B ± 0% ~ (all equal)
Ran on a machine with Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Updates #17859
Change-Id: I0735ac4a4a36e8a8f6cc06f2c16b871f12b4abf9
Reviewed-on: https://go-review.googlesource.com/68370
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This is like a write-only subset of bytes.Buffer with an
allocation-free String method.
Fixes #18990.
Change-Id: Icdf7240f4309a52924dc3af04a39ecd737a210f4
Reviewed-on: https://go-review.googlesource.com/74931
Run-TryBot: Caleb Spare <cespare@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: Ifa0384722dd879af7f5edb7b7aaac5ede3cff46d
Reviewed-on: https://go-review.googlesource.com/74690
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This change removes the check of len(chars) > 0 inside the Index and
IndexAny functions which was redundant.
Change-Id: Iffbc0f2b3332c6e31c7514b5f644b6fe7bdcfe0d
Reviewed-on: https://go-review.googlesource.com/65910
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Change-Id: I70b839ff0ae5f015587390a82616ebb1d657d71a
Reviewed-on: https://go-review.googlesource.com/55490
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Increases performance of FieldsFunc by recording the start and end
of the fields in an array. The first 32 fields are saved in a pre-allocated
array on the stack. This avoids the old behavior of iterating over the
input string two times but uses more allocations when more than 32 fields
are encountered.
Additionally code for handling non-ASCII containing strings from Fields is
removed and replaced by a call to the new faster FieldsFunc function.
Overall this still leads to a slowdown for Fields on non-ASCII strings
while speeding up Fields in general.
name old time/op new time/op delta
Fields/ASCII/16 116ns ± 5% 115ns ± 5% ~ (p=0.480 n=10+10)
Fields/ASCII/256 765ns ± 1% 761ns ± 2% ~ (p=0.171 n=10+10)
Fields/ASCII/4096 12.5µs ± 1% 12.7µs ± 1% +1.82% (p=0.000 n=10+10)
Fields/ASCII/65536 226µs ± 1% 226µs ± 2% ~ (p=0.739 n=10+10)
Fields/ASCII/1048576 5.12ms ± 1% 5.12ms ± 1% ~ (p=0.696 n=8+10)
Fields/Mixed/16 172ns ± 1% 233ns ± 1% +35.90% (p=0.000 n=9+10)
Fields/Mixed/256 1.18µs ± 2% 2.45µs ± 1% +107.47% (p=0.000 n=10+10)
Fields/Mixed/4096 20.3µs ± 1% 43.1µs ± 2% +112.41% (p=0.000 n=10+10)
Fields/Mixed/65536 364µs ± 1% 704µs ± 1% +93.56% (p=0.000 n=9+10)
Fields/Mixed/1048576 7.07ms ± 2% 13.34ms ± 4% +88.83% (p=0.000 n=10+10)
FieldsFunc/ASCII/16 274ns ± 1% 188ns ± 3% -31.44% (p=0.000 n=10+10)
FieldsFunc/ASCII/256 3.69µs ± 1% 2.06µs ± 2% -44.26% (p=0.000 n=10+10)
FieldsFunc/ASCII/4096 59.9µs ± 1% 35.3µs ± 2% -41.10% (p=0.000 n=10+10)
FieldsFunc/ASCII/65536 958µs ± 1% 567µs ± 1% -40.82% (p=0.000 n=10+9)
FieldsFunc/ASCII/1048576 16.3ms ± 2% 11.0ms ± 3% -32.52% (p=0.000 n=10+10)
FieldsFunc/Mixed/16 309ns ± 1% 213ns ± 0% -30.98% (p=0.000 n=10+6)
FieldsFunc/Mixed/256 3.83µs ± 1% 2.14µs ± 1% -44.01% (p=0.000 n=10+10)
FieldsFunc/Mixed/4096 66.2µs ± 2% 37.8µs ± 1% -42.85% (p=0.000 n=10+10)
FieldsFunc/Mixed/65536 1.09ms ± 1% 0.63ms ± 1% -42.73% (p=0.000 n=10+10)
FieldsFunc/Mixed/1048576 18.6ms ± 3% 12.0ms ± 2% -35.50% (p=0.000 n=10+10)
Fixes #17856
Fixes #19789
Change-Id: I9f5a560e534566fd81963651f342c8f44cfb0469
Reviewed-on: https://go-review.googlesource.com/42810
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Change-Id: Ia0f0c8ab4f2f9e96faad6d88775ae19ca7fae53c
Reviewed-on: https://go-review.googlesource.com/53790
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Avelino <t@avelino.xxx>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
We initialize fieldStart to 0, then set it to i without ever reading
0, so we might as well just initialize it to i.
Change-Id: I17905b25d54a62b6bc76f915353756ed5eb6972b
Reviewed-on: https://go-review.googlesource.com/52933
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Avelino <t@avelino.xxx>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I6bfe5b914cf11be1cd1f8e61d557cc718725f0be
Reviewed-on: https://go-review.googlesource.com/49013
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I9c0c601ec5957475e949dcc4a8c2116724d01215
Reviewed-on: https://go-review.googlesource.com/48961
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Add a example for string.Compare that return the three possible results.
Change-Id: I103cf39327c1868fb249538d9e22b11865ba4b70
Reviewed-on: https://go-review.googlesource.com/49011
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Change-Id: Ib6a59735381ce744553f1ac96eeb65a194c8da10
Reviewed-on: https://go-review.googlesource.com/48860
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Change-Id: I69d1359d8868d4c5b173e4d831e38cea7dfeb713
Reviewed-on: https://go-review.googlesource.com/48859
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Change-Id: I994f003c97a14d194df5f07dd217c0ff3b214741
Reviewed-on: https://go-review.googlesource.com/48874
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Run-TryBot: Matt Layher <mdlayher@gmail.com>
|
|
Apparently people get confused by the fact that
Split("", ",")
returns []{""} instead of []{}.
This is actually just a consequence of the fact that if the separator
sep (2nd argument) is not found the string s (1st argument), then the
Split* functions return a length 1 slice with the string s in it.
Document the general case: if sep is not in s, what you get is a len 1
slice with s in it; unless both s and sep are "", in that case you get
an empty slice of length 0.
Fixes #19726
Change-Id: I64c8220b91acd1e5aa1cc1829199e0cd8c47c404
Reviewed-on: https://go-review.googlesource.com/44950
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
|
|
A for-range loop is simpler and also generally faster nowadays:
TrimASCII/1:1-4 69.2ns ± 1% 72.3ns ± 4% +4.55% (p=0.001 n=8+8)
TrimASCII/1:2-4 114ns ± 4% 104ns ± 3% -8.71% (p=0.000 n=9+8)
TrimASCII/1:4-4 112ns ± 1% 109ns ± 2% -2.57% (p=0.000 n=8+9)
TrimASCII/1:8-4 120ns ± 2% 118ns ± 4% ~ (p=0.097 n=9+9)
TrimASCII/1:16-4 137ns ± 3% 132ns ± 3% -3.82% (p=0.001 n=9+9)
TrimASCII/16:1-4 129ns ± 1% 125ns ± 2% -3.38% (p=0.000 n=8+9)
TrimASCII/16:2-4 167ns ± 3% 159ns ± 1% -4.99% (p=0.000 n=9+8)
TrimASCII/16:4-4 165ns ± 2% 162ns ± 1% -1.91% (p=0.005 n=8+9)
TrimASCII/16:8-4 173ns ± 2% 170ns ± 1% -1.29% (p=0.018 n=9+9)
TrimASCII/16:16-4 188ns ± 2% 186ns ± 2% -1.13% (p=0.022 n=8+9)
TrimASCII/256:1-4 1.06µs ± 1% 0.98µs ± 2% -7.64% (p=0.000 n=8+9)
TrimASCII/256:2-4 1.08µs ± 1% 1.06µs ± 2% -1.95% (p=0.006 n=9+9)
TrimASCII/256:4-4 1.09µs ± 1% 1.07µs ± 3% ~ (p=0.059 n=9+9)
TrimASCII/256:8-4 1.10µs ± 1% 1.07µs ± 2% -2.63% (p=0.000 n=9+8)
TrimASCII/256:16-4 1.10µs ± 1% 1.08µs ± 1% -1.90% (p=0.000 n=8+9)
TrimASCII/4096:1-4 15.8µs ± 1% 14.5µs ± 1% -8.59% (p=0.000 n=9+9)
TrimASCII/4096:2-4 15.6µs ± 1% 15.4µs ± 2% -1.27% (p=0.021 n=8+8)
TrimASCII/4096:4-4 15.6µs ± 1% 15.4µs ± 2% ~ (p=0.094 n=9+9)
TrimASCII/4096:8-4 15.7µs ± 1% 15.8µs ± 6% ~ (p=0.555 n=8+8)
TrimASCII/4096:16-4 15.7µs ± 2% 15.3µs ± 1% -2.64% (p=0.000 n=8+9)
Change-Id: I9b06689b67c0cf2c7ff446fc63a8c44cc5d6a246
Reviewed-on: https://go-review.googlesource.com/32891
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Implements detection of x86 cpu features that
are used in the go standard library.
Changes all standard library packages to use the new cpu package
instead of using runtime internal variables to check x86 cpu features.
Updates: #15403
Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9
Reviewed-on: https://go-review.googlesource.com/41476
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Move optimized Count implementation from bytes to runtime. Use in
both bytes and strings packages.
Add CountByte benchmark to strings.
Strings benchmarks:
name old time/op new time/op delta
CountHard1-4 226µs ± 1% 226µs ± 2% ~ (p=0.247 n=10+10)
CountHard2-4 316µs ± 1% 315µs ± 0% ~ (p=0.133 n=9+10)
CountHard3-4 919µs ± 1% 920µs ± 1% ~ (p=0.968 n=10+9)
CountTorture-4 15.4µs ± 1% 15.7µs ± 1% +2.47% (p=0.000 n=10+9)
CountTortureOverlapping-4 9.60ms ± 0% 9.65ms ± 1% ~ (p=0.247 n=10+10)
CountByte/10-4 26.3ns ± 1% 10.9ns ± 1% -58.71% (p=0.000 n=9+9)
CountByte/32-4 42.7ns ± 0% 14.2ns ± 0% -66.64% (p=0.000 n=10+10)
CountByte/4096-4 3.07µs ± 0% 0.31µs ± 2% -89.99% (p=0.000 n=9+10)
CountByte/4194304-4 3.48ms ± 1% 0.34ms ± 1% -90.09% (p=0.000 n=10+9)
CountByte/67108864-4 55.6ms ± 1% 7.0ms ± 0% -87.49% (p=0.000 n=9+8)
name old speed new speed delta
CountByte/10-4 380MB/s ± 1% 919MB/s ± 1% +142.21% (p=0.000 n=9+9)
CountByte/32-4 750MB/s ± 0% 2247MB/s ± 0% +199.62% (p=0.000 n=10+10)
CountByte/4096-4 1.33GB/s ± 0% 13.32GB/s ± 2% +898.13% (p=0.000 n=9+10)
CountByte/4194304-4 1.21GB/s ± 1% 12.17GB/s ± 1% +908.87% (p=0.000 n=10+9)
CountByte/67108864-4 1.21GB/s ± 1% 9.65GB/s ± 0% +699.29% (p=0.000 n=9+8)
Fixes #19411
Change-Id: I8d2d409f0fa6df6d03b60790aa86e540b4a4e3b0
Reviewed-on: https://go-review.googlesource.com/38693
Reviewed-by: Keith Randall <khr@golang.org>
|
|
- use a string lookup to detect if a single byte is a space character
- determine the exact number of fields for ASCII and
a possibly underestimated number of fields for non ASCII strings
by doing a separate byte for byte scan of the input string
before collecting the fields in an extra pass
- provide a fast path for ASCII only strings when collecting the fields
- avoid utf8.DecodeRuneInString and unicode.IsSpace for ASCII characters
Used golang.org/cl/33108 from Joe Tsai as starting point.
name old time/op new time/op delta
Fields/ASCII/16 284ns ± 1% 116ns ± 2% -59.30% (p=0.000 n=9+10)
Fields/ASCII/256 3.81µs ± 1% 0.80µs ± 1% -79.10% (p=0.000 n=10+10)
Fields/ASCII/4096 61.4µs ± 1% 12.3µs ± 1% -79.96% (p=0.000 n=10+9)
Fields/ASCII/65536 982µs ± 1% 235µs ± 0% -76.04% (p=0.000 n=10+9)
Fields/ASCII/1048576 16.7ms ± 2% 5.4ms ± 1% -67.52% (p=0.000 n=10+10)
Fields/Mixed/16 314ns ± 1% 168ns ± 1% -46.33% (p=0.000 n=9+10)
Fields/Mixed/256 3.92µs ± 1% 1.17µs ± 1% -70.19% (p=0.000 n=10+10)
Fields/Mixed/4096 69.1µs ± 1% 19.0µs ± 1% -72.53% (p=0.000 n=10+10)
Fields/Mixed/65536 1.12ms ± 1% 0.39ms ± 0% -65.37% (p=0.000 n=10+9)
Fields/Mixed/1048576 19.0ms ± 2% 7.3ms ± 4% -61.75% (p=0.000 n=10+9)
name old speed new speed delta
Fields/ASCII/16 56.3MB/s ± 1% 138.1MB/s ± 2% +145.31% (p=0.000 n=9+10)
Fields/ASCII/256 67.1MB/s ± 1% 321.0MB/s ± 1% +378.26% (p=0.000 n=10+10)
Fields/ASCII/4096 66.7MB/s ± 1% 333.0MB/s ± 1% +398.97% (p=0.000 n=10+9)
Fields/ASCII/65536 66.7MB/s ± 1% 278.4MB/s ± 0% +317.39% (p=0.000 n=10+9)
Fields/ASCII/1048576 62.7MB/s ± 2% 192.9MB/s ± 1% +207.82% (p=0.000 n=10+10)
Fields/Mixed/16 51.0MB/s ± 2% 94.9MB/s ± 1% +85.87% (p=0.000 n=10+10)
Fields/Mixed/256 65.4MB/s ± 1% 219.2MB/s ± 1% +235.33% (p=0.000 n=10+10)
Fields/Mixed/4096 59.3MB/s ± 1% 215.7MB/s ± 1% +263.98% (p=0.000 n=10+10)
Fields/Mixed/65536 58.6MB/s ± 1% 169.1MB/s ± 0% +188.73% (p=0.000 n=10+9)
Fields/Mixed/1048576 55.1MB/s ± 2% 144.0MB/s ± 4% +161.44% (p=0.000 n=10+9)
Updates #19789
Updates #17856
Change-Id: If2ce1479542702e9cd65a82a462ba55ac8eb3876
Reviewed-on: https://go-review.googlesource.com/37959
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
The recently updated Count functions declare variables before
special-cased returns.
Change-Id: I8f726118336b7b0ff72117d12adc48b6e37e60ea
Reviewed-on: https://go-review.googlesource.com/39357
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The new Map implementation introduced in golang.org/cl/33201
did not differentiate if an invalid UTF-8 sequence was decoded
or the RuneError rune. It would therefore always advance by
3 bytes (which is the length of the RuneError rune) instead
of 1 for an invalid sequences. This cl adds a check to correctly
determine the length of bytes needed to advance to the next rune.
Fixes #19330.
Change-Id: I1e7f9333f3ef6068ffc64015bb0a9f32b0b7111d
Reviewed-on: https://go-review.googlesource.com/37597
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
name old time/op new time/op delta
ByteByteMap-4 2.03µs ± 2% 1.03µs ± 2% -49.24% (p=0.000 n=10+10)
Map/identity/ASCII-4 246ns ± 0% 158ns ± 0% -35.90% (p=0.000 n=9+9)
Map/identity/Greek-4 367ns ± 1% 273ns ± 1% -25.63% (p=0.000 n=10+10)
Map/change/ASCII-4 582ns ± 1% 324ns ± 1% -44.34% (p=0.000 n=10+10)
Map/change/Greek-4 709ns ± 2% 623ns ± 2% -12.16% (p=0.000 n=10+10)
MapNoChanges-4 171ns ± 1% 111ns ± 1% -35.36% (p=0.000 n=8+10)
Updates #17859
Change-Id: I55d7d261fdc1ce2dcd0ebe23b0fa20b9889bf54c
Reviewed-on: https://go-review.googlesource.com/33201
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Using 'sep' as parameter name for strings functions that take a
separator argument is fine, but for functions like Index or Count that
look for a substring it's better to use 'substr' (like Contains
already does).
Fixes #19039
Change-Id: Idd557409c8fea64ce830ab0e3fec37d3d56a79f0
Reviewed-on: https://go-review.googlesource.com/36874
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The relevant benchmark results on linux/amd64:
bytes:
SplitSingleByteSeparator-4 25.7ms ± 5% 9.1ms ± 4% -64.40% (p=0.000 n=10+10)
SplitMultiByteSeparator-4 13.8ms ±20% 4.3ms ± 8% -69.23% (p=0.000 n=10+10)
SplitNSingleByteSeparator-4 1.88µs ± 9% 0.88µs ± 4% -53.25% (p=0.000 n=10+10)
SplitNMultiByteSeparator-4 4.83µs ±10% 1.32µs ± 9% -72.65% (p=0.000 n=10+10)
strings:
name old time/op new time/op delta
SplitSingleByteSeparator-4 21.4ms ± 8% 8.5ms ± 5% -60.19% (p=0.000 n=10+10)
SplitMultiByteSeparator-4 13.2ms ± 9% 3.9ms ± 4% -70.29% (p=0.000 n=10+10)
SplitNSingleByteSeparator-4 1.54µs ± 5% 0.75µs ± 7% -51.21% (p=0.000 n=10+10)
SplitNMultiByteSeparator-4 3.57µs ± 8% 1.01µs ±11% -71.76% (p=0.000 n=10+10)
Fixes #18973
Change-Id: Ie4bc010c6cc389001e72eab530497c81e5b26f34
Reviewed-on: https://go-review.googlesource.com/36510
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Similar to https://go-review.googlesource.com/28586,
but for package bytes instead of strings.
This provides simpler code and some performance gain.
Also update strings.Count to use the same code.
On AMD64 with heavily optimized Index I see:
name old time/op new time/op delta
Count/10-6 47.3ns ± 0% 36.8ns ± 0% -22.35% (p=0.000 n=10+10)
Count/32-6 286ns ± 0% 38ns ± 0% -86.71% (p=0.000 n=10+10)
Count/4K-6 50.1µs ± 0% 4.4µs ± 0% -91.18% (p=0.000 n=10+10)
Count/4M-6 48.1ms ± 1% 4.5ms ± 0% -90.56% (p=0.000 n=10+9)
Count/64M-6 784ms ± 0% 73ms ± 0% -90.73% (p=0.000 n=10+10)
CountEasy/10-6 28.4ns ± 0% 31.0ns ± 0% +9.23% (p=0.000 n=10+10)
CountEasy/32-6 30.6ns ± 0% 37.0ns ± 0% +20.92% (p=0.000 n=10+10)
CountEasy/4K-6 186ns ± 0% 198ns ± 0% +6.45% (p=0.000 n=9+10)
CountEasy/4M-6 233µs ± 2% 234µs ± 2% ~ (p=0.912 n=10+10)
CountEasy/64M-6 6.70ms ± 0% 6.68ms ± 1% ~ (p=0.762 n=8+10)
name old speed new speed delta
Count/10-6 211MB/s ± 0% 272MB/s ± 0% +28.77% (p=0.000 n=10+9)
Count/32-6 112MB/s ± 0% 842MB/s ± 0% +652.84% (p=0.000 n=10+10)
Count/4K-6 81.8MB/s ± 0% 927.6MB/s ± 0% +1033.63% (p=0.000 n=10+9)
Count/4M-6 87.2MB/s ± 1% 924.0MB/s ± 0% +959.25% (p=0.000 n=10+9)
Count/64M-6 85.6MB/s ± 0% 922.9MB/s ± 0% +978.31% (p=0.000 n=10+10)
CountEasy/10-6 352MB/s ± 0% 322MB/s ± 0% -8.41% (p=0.000 n=10+10)
CountEasy/32-6 1.05GB/s ± 0% 0.87GB/s ± 0% -17.35% (p=0.000 n=9+10)
CountEasy/4K-6 22.0GB/s ± 0% 20.6GB/s ± 0% -6.33% (p=0.000 n=10+10)
CountEasy/4M-6 18.0GB/s ± 2% 18.0GB/s ± 2% ~ (p=0.912 n=10+10)
CountEasy/64M-6 10.0GB/s ± 0% 10.0GB/s ± 1% ~ (p=0.762 n=8+10)
On 386, without asm version of Index:
Count/10-6 57.0ns ± 0% 56.9ns ± 0% -0.11% (p=0.006 n=10+9)
Count/32-6 340ns ± 0% 274ns ± 0% -19.48% (p=0.000 n=10+9)
Count/4K-6 49.5µs ± 0% 37.1µs ± 0% -24.96% (p=0.000 n=10+10)
Count/4M-6 51.1ms ± 0% 38.2ms ± 0% -25.21% (p=0.000 n=10+10)
Count/64M-6 818ms ± 0% 613ms ± 0% -25.07% (p=0.000 n=8+10)
CountEasy/10-6 60.0ns ± 0% 70.4ns ± 0% +17.34% (p=0.000 n=10+10)
CountEasy/32-6 81.1ns ± 0% 94.0ns ± 0% +15.97% (p=0.000 n=9+10)
CountEasy/4K-6 4.37µs ± 0% 4.39µs ± 0% +0.30% (p=0.000 n=10+9)
CountEasy/4M-6 4.43ms ± 0% 4.43ms ± 0% ~ (p=0.579 n=10+10)
CountEasy/64M-6 70.9ms ± 0% 70.9ms ± 0% ~ (p=0.912 n=10+10)
name old speed new speed delta
Count/10-6 176MB/s ± 0% 176MB/s ± 0% +0.10% (p=0.000 n=10+9)
Count/32-6 93.9MB/s ± 0% 116.5MB/s ± 0% +24.06% (p=0.000 n=10+9)
Count/4K-6 82.7MB/s ± 0% 110.3MB/s ± 0% +33.26% (p=0.000 n=10+10)
Count/4M-6 82.1MB/s ± 0% 109.7MB/s ± 0% +33.70% (p=0.000 n=10+10)
Count/64M-6 82.0MB/s ± 0% 109.5MB/s ± 0% +33.46% (p=0.000 n=8+10)
CountEasy/10-6 167MB/s ± 0% 142MB/s ± 0% -14.75% (p=0.000 n=9+10)
CountEasy/32-6 395MB/s ± 0% 340MB/s ± 0% -13.77% (p=0.000 n=10+10)
CountEasy/4K-6 936MB/s ± 0% 934MB/s ± 0% -0.29% (p=0.000 n=10+9)
CountEasy/4M-6 947MB/s ± 0% 946MB/s ± 0% ~ (p=0.591 n=10+10)
CountEasy/64M-6 947MB/s ± 0% 947MB/s ± 0% ~ (p=0.867 n=10+10)
Change-Id: Ia76b247372b6f5b5d23a9f10253a86536a5153b3
Reviewed-on: https://go-review.googlesource.com/36489
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Updates the s390x-specific files in these packages with the changes
to the amd64-specific files made during the review of CL 31690. I'd
like to keep these files in sync unless there is a reason to
diverge.
Change-Id: Id83e5ce11a45f877bdcc991d02b14416d1a2d8d2
Reviewed-on: https://go-review.googlesource.com/32574
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
IndexByte+compare is faster than indexShortStr in good case, when
first byte is rare, but is more costly in bad cases.
Start with IndexByte and switch to indexShortStr if we encounter
false positives more often than once per 8 bytes.
Benchmark changes for package bytes:
IndexRune/4K-8 416ns ± 0% 86ns ± 0% -79.24% (p=0.000 n=10+10)
IndexRune/4M-8 413µs ± 0% 100µs ± 1% -75.88% (p=0.000 n=10+10)
IndexRune/64M-8 6.73ms ± 0% 2.86ms ± 1% -57.49% (p=0.000 n=10+10)
Index/10-8 8.45ns ± 0% 8.96ns ± 0% +6.04% (p=0.000 n=9+10)
Index/32-8 9.64ns ± 0% 9.51ns ± 0% -1.30% (p=0.000 n=8+9)
Index/4K-8 2.11µs ± 0% 2.12µs ± 0% +0.26% (p=0.000 n=10+10)
Index/4M-8 3.60ms ± 5% 3.59ms ± 7% ~ (p=0.497 n=9+10)
Index/64M-8 57.1ms ± 3% 58.7ms ± 5% ~ (p=0.113 n=9+10)
IndexEasy/10-8 7.10ns ± 1% 7.71ns ± 1% +8.60% (p=0.000 n=10+10)
IndexEasy/32-8 9.29ns ± 1% 9.22ns ± 0% -0.75% (p=0.000 n=9+10)
IndexEasy/4K-8 1.06µs ± 0% 0.08µs ± 0% -92.18% (p=0.000 n=10+10)
IndexEasy/4M-8 1.07ms ± 0% 0.10ms ± 1% -90.74% (p=0.000 n=9+10)
IndexEasy/64M-8 17.3ms ± 0% 2.8ms ± 1% -83.76% (p=0.000 n=10+9)
IndexRune/4K-8 9.84GB/s ± 0% 47.42GB/s ± 0% +381.85% (p=0.000 n=8+10)
IndexRune/4M-8 10.1GB/s ± 0% 42.1GB/s ± 1% +314.56% (p=0.000 n=10+10)
IndexRune/64M-8 10.0GB/s ± 0% 23.4GB/s ± 1% +135.25% (p=0.000 n=10+10)
Index/10-8 1.18GB/s ± 0% 1.12GB/s ± 0% -5.67% (p=0.000 n=10+9)
Index/32-8 3.32GB/s ± 0% 3.36GB/s ± 0% +1.27% (p=0.000 n=10+9)
Index/4K-8 1.94GB/s ± 0% 1.93GB/s ± 0% -0.25% (p=0.000 n=10+9)
Index/4M-8 1.17GB/s ± 5% 1.17GB/s ± 7% ~ (p=0.497 n=9+10)
Index/64M-8 1.17GB/s ± 3% 1.15GB/s ± 6% ~ (p=0.113 n=9+10)
IndexEasy/10-8 1.41GB/s ± 1% 1.30GB/s ± 1% -7.90% (p=0.000 n=10+10)
IndexEasy/32-8 3.45GB/s ± 1% 3.47GB/s ± 0% +0.73% (p=0.000 n=9+10)
IndexEasy/4K-8 3.84GB/s ± 0% 49.16GB/s ± 0% +1178.78% (p=0.000 n=9+10)
IndexEasy/4M-8 3.91GB/s ± 0% 42.19GB/s ± 1% +980.37% (p=0.000 n=9+10)
IndexEasy/64M-8 3.88GB/s ± 0% 23.91GB/s ± 1% +515.76% (p=0.000 n=10+9)
No significant changes in strings.
In regexp I see:
Match/Easy0/32-8 536MB/s ± 1% 540MB/s ± 1% +0.75% (p=0.001 n=9+10)
Match/Easy0/1K-8 1.62GB/s ± 0% 4.42GB/s ± 1% +172.48% (p=0.000 n=10+10)
Match/Easy0/32K-8 1.87GB/s ± 0% 9.07GB/s ± 1% +384.24% (p=0.000 n=7+10)
Match/Easy0/1M-8 1.90GB/s ± 0% 4.83GB/s ± 0% +154.56% (p=0.000 n=8+10)
Match/Easy0/32M-8 1.90GB/s ± 0% 4.53GB/s ± 0% +138.62% (p=0.000 n=7+10)
Compared to in 1.7:
Match/Easy0/32-8 59.5ns ± 0% 59.2ns ± 1% -0.45% (p=0.008 n=9+10)
Match/Easy0/1K-8 226ns ± 1% 231ns ± 1% +2.30% (p=0.000 n=10+10)
Match/Easy0/32K-8 3.73µs ± 2% 3.61µs ± 1% -3.12% (p=0.000 n=10+10)
Match/Easy0/1M-8 206µs ± 1% 217µs ± 0% +5.34% (p=0.000 n=10+10)
Match/Easy0/32M-8 7.03ms ± 1% 7.40ms ± 0% +5.23% (p=0.000 n=10+10)
Fixes #17456
Change-Id: I38b2fabcaed7119cc4bf37007ba7bfe7504c8f9f
Reviewed-on: https://go-review.googlesource.com/31690
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Fixes #17699
Change-Id: I7ea29a3fc2ca13d9d7e3044cbb8ea22e3435d423
Reviewed-on: https://go-review.googlesource.com/32484
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
|
|
Use vector instructions to speed up indexing operations for short
strings (64 bytes or less).
bytes_s390x.go and strings_s390x.go are based on their amd64
equivalents in CL 31690.
bytes package:
name old time/op new time/op delta
Index/10 40.3ns ± 7% 11.3ns ± 4% -72.06% (p=0.000 n=10+10)
Index/32 196ns ± 1% 27ns ± 2% -86.25% (p=0.000 n=10+10)
Index/4K 28.9µs ± 1% 1.5µs ± 2% -94.94% (p=0.000 n=9+9)
Index/4M 30.1ms ± 2% 1.5ms ± 3% -94.94% (p=0.000 n=10+10)
Index/64M 549ms ±13% 28ms ± 3% -94.87% (p=0.000 n=10+9)
IndexEasy/10 18.8ns ±11% 11.5ns ± 2% -38.81% (p=0.000 n=10+10)
IndexEasy/32 23.6ns ± 6% 28.1ns ± 3% +19.29% (p=0.000 n=10+10)
IndexEasy/4K 251ns ± 5% 223ns ± 8% -11.04% (p=0.000 n=10+10)
IndexEasy/4M 318µs ± 9% 266µs ± 8% -16.42% (p=0.000 n=10+10)
IndexEasy/64M 14.7ms ±16% 13.2ms ±11% -10.22% (p=0.001 n=10+10)
strings package:
name old time/op new time/op delta
IndexRune 88.1ns ±16% 28.9ns ± 4% -67.20% (p=0.000 n=10+10)
IndexRuneLongString 456ns ± 7% 34ns ± 3% -92.50% (p=0.000 n=10+10)
IndexRuneFastPath 12.9ns ±14% 11.1ns ± 6% -13.84% (p=0.000 n=10+10)
Index 13.0ns ± 7% 11.3ns ± 4% -13.31% (p=0.000 n=10+10)
IndexHard1 3.38ms ± 9% 0.07ms ± 1% -97.79% (p=0.000 n=10+10)
IndexHard2 3.58ms ± 7% 0.37ms ± 2% -89.78% (p=0.000 n=10+10)
IndexHard3 3.47ms ± 7% 0.75ms ± 1% -78.52% (p=0.000 n=10+10)
IndexHard4 3.56ms ± 6% 1.34ms ± 0% -62.39% (p=0.000 n=9+9)
Change-Id: If36c2afb8c02e80fcaa1cf5ec2abb0a2be08c7d1
Reviewed-on: https://go-review.googlesource.com/32447
Run-TryBot: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
In a large codebase within Google, there are thousands of uses of:
ContainsAny|IndexAny|LastIndexAny|Trim|TrimLeft|TrimRight
An analysis of their usage shows that over 97% of them only use character
sets consisting of only ASCII symbols.
Uses of ContainsAny|IndexAny|LastIndexAny:
6% are 1 character (e.g., "\n" or " ")
58% are 2-4 characters (e.g., "<>" or "\r\n\t ")
24% are 5-9 characters (e.g., "()[]*^$")
10% are 10+ characters (e.g., "+-=&|><!(){}[]^\"~*?:\\/ ")
We optimize for ASCII sets, which are commonly used to search for
"control" characters in some string. We don't optimize for the
single character scenario since IndexRune or IndexByte could be used.
Uses of Trim|TrimLeft|TrimRight:
71% are 1 character (e.g., "\n" or " ")
14% are 2 characters (e.g., "\r\n")
10% are 3-4 characters (e.g., " \t\r\n")
5% are 10+ characters (e.g., "0123456789abcdefABCDEF")
We optimize for the single character case with a simple closured function
that only checks for that character's value. We optimize for the medium
and larger sets using a 16-byte bit-map representing a set of ASCII characters.
The benchmarks below have the following suffix name "%d:%d" where the first
number is the length of the input and the second number is the length
of the charset.
== bytes package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.09 5.23 +2.75%
BenchmarkIndexAnyASCII/1:2-4 5.81 5.85 +0.69%
BenchmarkIndexAnyASCII/1:4-4 7.22 7.50 +3.88%
BenchmarkIndexAnyASCII/1:8-4 11.0 11.1 +0.91%
BenchmarkIndexAnyASCII/1:16-4 17.5 17.8 +1.71%
BenchmarkIndexAnyASCII/16:1-4 36.0 34.0 -5.56%
BenchmarkIndexAnyASCII/16:2-4 46.6 36.5 -21.67%
BenchmarkIndexAnyASCII/16:4-4 78.0 40.4 -48.21%
BenchmarkIndexAnyASCII/16:8-4 136 47.4 -65.15%
BenchmarkIndexAnyASCII/16:16-4 254 61.5 -75.79%
BenchmarkIndexAnyASCII/256:1-4 542 388 -28.41%
BenchmarkIndexAnyASCII/256:2-4 705 382 -45.82%
BenchmarkIndexAnyASCII/256:4-4 1089 386 -64.55%
BenchmarkIndexAnyASCII/256:8-4 1994 394 -80.24%
BenchmarkIndexAnyASCII/256:16-4 3843 411 -89.31%
BenchmarkIndexAnyASCII/4096:1-4 8522 5873 -31.08%
BenchmarkIndexAnyASCII/4096:2-4 11253 5861 -47.92%
BenchmarkIndexAnyASCII/4096:4-4 17824 5883 -66.99%
BenchmarkIndexAnyASCII/4096:8-4 32053 5871 -81.68%
BenchmarkIndexAnyASCII/4096:16-4 60512 5888 -90.27%
BenchmarkTrimASCII/1:1-4 79.5 70.8 -10.94%
BenchmarkTrimASCII/1:2-4 79.0 105 +32.91%
BenchmarkTrimASCII/1:4-4 79.6 109 +36.93%
BenchmarkTrimASCII/1:8-4 78.8 118 +49.75%
BenchmarkTrimASCII/1:16-4 80.2 132 +64.59%
BenchmarkTrimASCII/16:1-4 243 116 -52.26%
BenchmarkTrimASCII/16:2-4 243 171 -29.63%
BenchmarkTrimASCII/16:4-4 243 176 -27.57%
BenchmarkTrimASCII/16:8-4 241 184 -23.65%
BenchmarkTrimASCII/16:16-4 238 199 -16.39%
BenchmarkTrimASCII/256:1-4 2580 840 -67.44%
BenchmarkTrimASCII/256:2-4 2603 1175 -54.86%
BenchmarkTrimASCII/256:4-4 2572 1188 -53.81%
BenchmarkTrimASCII/256:8-4 2550 1191 -53.29%
BenchmarkTrimASCII/256:16-4 2585 1208 -53.27%
BenchmarkTrimASCII/4096:1-4 39773 12181 -69.37%
BenchmarkTrimASCII/4096:2-4 39946 17231 -56.86%
BenchmarkTrimASCII/4096:4-4 39641 17179 -56.66%
BenchmarkTrimASCII/4096:8-4 39835 17175 -56.88%
BenchmarkTrimASCII/4096:16-4 40229 17215 -57.21%
== strings package ==
benchmark old ns/op new ns/op delta
BenchmarkIndexAnyASCII/1:1-4 5.94 4.97 -16.33%
BenchmarkIndexAnyASCII/1:2-4 5.94 5.55 -6.57%
BenchmarkIndexAnyASCII/1:4-4 7.45 7.21 -3.22%
BenchmarkIndexAnyASCII/1:8-4 10.8 10.6 -1.85%
BenchmarkIndexAnyASCII/1:16-4 17.4 17.2 -1.15%
BenchmarkIndexAnyASCII/16:1-4 36.4 32.2 -11.54%
BenchmarkIndexAnyASCII/16:2-4 49.6 34.6 -30.24%
BenchmarkIndexAnyASCII/16:4-4 77.5 37.9 -51.10%
BenchmarkIndexAnyASCII/16:8-4 138 45.5 -67.03%
BenchmarkIndexAnyASCII/16:16-4 241 59.1 -75.48%
BenchmarkIndexAnyASCII/256:1-4 509 378 -25.74%
BenchmarkIndexAnyASCII/256:2-4 720 381 -47.08%
BenchmarkIndexAnyASCII/256:4-4 1142 384 -66.37%
BenchmarkIndexAnyASCII/256:8-4 1999 391 -80.44%
BenchmarkIndexAnyASCII/256:16-4 3735 403 -89.21%
BenchmarkIndexAnyASCII/4096:1-4 7973 5824 -26.95%
BenchmarkIndexAnyASCII/4096:2-4 11432 5809 -49.19%
BenchmarkIndexAnyASCII/4096:4-4 18327 5819 -68.25%
BenchmarkIndexAnyASCII/4096:8-4 33059 5828 -82.37%
BenchmarkIndexAnyASCII/4096:16-4 59703 5817 -90.26%
BenchmarkTrimASCII/1:1-4 71.9 71.8 -0.14%
BenchmarkTrimASCII/1:2-4 73.3 103 +40.52%
BenchmarkTrimASCII/1:4-4 71.8 106 +47.63%
BenchmarkTrimASCII/1:8-4 71.2 113 +58.71%
BenchmarkTrimASCII/1:16-4 71.6 128 +78.77%
BenchmarkTrimASCII/16:1-4 152 116 -23.68%
BenchmarkTrimASCII/16:2-4 160 168 +5.00%
BenchmarkTrimASCII/16:4-4 172 170 -1.16%
BenchmarkTrimASCII/16:8-4 200 177 -11.50%
BenchmarkTrimASCII/16:16-4 254 193 -24.02%
BenchmarkTrimASCII/256:1-4 1438 864 -39.92%
BenchmarkTrimASCII/256:2-4 1551 1195 -22.95%
BenchmarkTrimASCII/256:4-4 1770 1200 -32.20%
BenchmarkTrimASCII/256:8-4 2195 1216 -44.60%
BenchmarkTrimASCII/256:16-4 3054 1224 -59.92%
BenchmarkTrimASCII/4096:1-4 21726 12557 -42.20%
BenchmarkTrimASCII/4096:2-4 23586 17508 -25.77%
BenchmarkTrimASCII/4096:4-4 26898 17510 -34.90%
BenchmarkTrimASCII/4096:8-4 33714 17595 -47.81%
BenchmarkTrimASCII/4096:16-4 47429 17700 -62.68%
The benchmarks added test the worst case. For IndexAny, that is when the
charset matches none of the input. For Trim, it is when the charset matches
all of the input.
Change-Id: I970874d101a96b33528fc99b165379abe58cf6ea
Reviewed-on: https://go-review.googlesource.com/31593
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Martin Möhrmann <martisch@uos.de>
|
|
In all previous versions of Go, the behavior of IndexRune(s, r)
where r was utf.RuneError was that it would effectively return the
index of any invalid UTF-8 byte sequence (include RuneError).
Optimizations made in http://golang.org/cl/28537 and
http://golang.org/cl/28546 altered this undocumented behavior such
that RuneError would only match on the RuneError rune itself.
Although, the new behavior is arguably reasonable, it did break code
that depended on the previous behavior. Thus, we add special checks
to ensure that we preserve the old behavior.
There is a slight performance hit for correctness:
benchmark old ns/op new ns/op delta
BenchmarkIndexRune/10-4 19.3 21.6 +11.92%
BenchmarkIndexRune/32-4 33.6 35.2 +4.76%
This only occurs on small strings. The performance hit for larger strings
is neglible and not shown.
Fixes #17611
Change-Id: I1d863a741213d46c40b2e1724c41245df52502a5
Reviewed-on: https://go-review.googlesource.com/32123
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: I40896fffbffefa359d08abda346933aa996f628d
Reviewed-on: https://go-review.googlesource.com/32124
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This simplifies code and provides performance iprovments:
Similar to https://go-review.googlesource.com/#/c/28577
CountHard1-48 1.74ms ±14% 0.17ms ±14% -90.16% (p=0.000 n=19+19)
CountHard2-48 1.78ms ±15% 0.25ms ±13% -86.10% (p=0.000 n=19+20)
CountHard3-48 1.78ms ±12% 0.80ms ±11% -55.19% (p=0.000 n=17+20)
CountTorture-48 13.5µs ±14% 13.6µs ±11% ~ (p=0.625 n=18+19)
CountTortureOverlapping-48 6.92ms ±13% 8.42ms ±11% +21.72% (p=0.000 n=19+17)
Change-Id: Ief120aee918a66487c76be56e0796871c8502f89
Reviewed-on: https://go-review.googlesource.com/28586
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Panic if Repeat is given a negative count or
if the value of (len(*) * count) is detected
to overflow.
We panic because we cannot change the
signature of Repeat to return an error.
Fixes #16237
Change-Id: I9f5ba031a5b8533db0582d7a672ffb715143f3fb
Reviewed-on: https://go-review.googlesource.com/29954
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
IndexHard4-4 1.50ms ± 2% 0.71ms ± 0% -52.36% (p=0.000 n=20+19)
This also fixes a bug, that caused a string of length 16 to use
two 8-byte comparisons instead of one 16-byte. And adds a test for
cases when partial_match fails.
Change-Id: I1ee8fc4e068bb36c95c45de78f067c822c0d9df0
Reviewed-on: https://go-review.googlesource.com/22551
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
re-implement IndexRune by Index which is well optimized to get
performance gain.
name old time/op new time/op delta
IndexRune-4 30.2ns ± 1% 28.3ns ± 1% -6.22% (p=0.000 n=20+19)
IndexRuneLongString-4 156ns ± 1% 49ns ± 1% -68.72% (p=0.000 n=19+19)
IndexRuneFastPath-4 10.6ns ± 2% 10.0ns ± 1% -6.30% (p=0.000 n=18+18)
Change-Id: Ie663b8f7860ca51892dd4be182fca3caa5f8ae61
Reviewed-on: https://go-review.googlesource.com/28546
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
We already had special cases for 0 and 1. Add 2 and 3 for now too.
To be removed if the compiler is improved later (#6714).
This halves the number of allocations and total bytes allocated via
common filepath.Join calls, improving filepath.Walk performance.
Noticed as part of investigating filepath.Walk in #16399.
Change-Id: If7b1bb85606d4720f3ebdf8de7b1e12ad165079d
Reviewed-on: https://go-review.googlesource.com/25005
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fixes #15689
Change-Id: I56d0103738cc35cd5bc5e77a0e0341c0dd55530e
Reviewed-on: https://go-review.googlesource.com/23440
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
|
|
The 17-31 byte code is broken. Disabled it.
Added a bunch of tests to at least cover the cases
in indexShortStr. I'll channel Brad and wonder why
this CL ever got in without any tests.
Fixes #15679
Change-Id: I84a7b283a74107db865b9586c955dcf5f2d60161
Reviewed-on: https://go-review.googlesource.com/23106
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
CL/19862 (f79b50b8d5bc159561c1dcf7c17e2a0db96a9a11) recently introduced the constants
SeekStart, SeekCurrent, and SeekEnd to the io package. We should use these constants
consistently throughout the code base.
Updates #15269
Change-Id: If7fcaca7676e4a51f588528f5ced28220d9639a2
Reviewed-on: https://go-review.googlesource.com/22097
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@digital-static.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|