aboutsummaryrefslogtreecommitdiff
path: root/src/bytes
AgeCommit message (Collapse)Author
2018-03-02internal/bytealg: move IndexByte asssembly to the new bytealg packageKeith Randall
Move the IndexByte function from the runtime to a new bytealg package. The new package will eventually hold all the optimized assembly for groveling through byte slices and strings. It seems a better home for this code than randomly keeping it in runtime. Once this is in, the next step is to move the other functions (Compare, Equal, ...). Update #19792 This change seems complicated enough that we might just declare "not worth it" and abandon. Opinions welcome. The core assembly is all unchanged, except minor modifications where the code reads cpu feature bits. The wrapper functions have been cleaned up as they are now actually checked by vet. Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796 Reviewed-on: https://go-review.googlesource.com/98016 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-01bytes: add asm version of Index for short strings on arm64Wei Xiao
Currently we have special case for 1-byte strings, this extends it to strings shorter than 9 bytes on arm64. Benchmark results: name old time/op new time/op delta IndexByte/10-32 18.6ns ± 0% 18.1ns ± 0% -2.69% (p=0.008 n=5+5) IndexByte/32-32 16.8ns ± 1% 16.9ns ± 1% ~ (p=0.762 n=5+5) IndexByte/4K-32 464ns ± 0% 464ns ± 0% ~ (all equal) IndexByte/4M-32 528µs ± 1% 506µs ± 1% -4.17% (p=0.008 n=5+5) IndexByte/64M-32 18.7ms ± 0% 18.7ms ± 1% ~ (p=0.730 n=4+5) IndexBytePortable/10-32 33.8ns ± 0% 34.9ns ± 3% ~ (p=0.167 n=5+5) IndexBytePortable/32-32 65.3ns ± 0% 66.1ns ± 2% ~ (p=0.444 n=5+5) IndexBytePortable/4K-32 5.88µs ± 0% 5.88µs ± 0% ~ (p=0.325 n=5+5) IndexBytePortable/4M-32 6.03ms ± 0% 6.03ms ± 0% ~ (p=1.000 n=5+5) IndexBytePortable/64M-32 98.8ms ± 0% 98.9ms ± 0% +0.10% (p=0.008 n=5+5) IndexRune/10-32 57.7ns ± 0% 49.2ns ± 0% -14.73% (p=0.000 n=5+4) IndexRune/32-32 57.7ns ± 0% 58.6ns ± 0% +1.56% (p=0.008 n=5+5) IndexRune/4K-32 511ns ± 0% 513ns ± 0% +0.39% (p=0.008 n=5+5) IndexRune/4M-32 527µs ± 1% 527µs ± 1% ~ (p=0.690 n=5+5) IndexRune/64M-32 18.7ms ± 0% 18.7ms ± 1% ~ (p=0.190 n=4+5) IndexRuneASCII/10-32 23.8ns ± 0% 23.8ns ± 0% ~ (all equal) IndexRuneASCII/32-32 24.3ns ± 0% 24.3ns ± 0% ~ (all equal) IndexRuneASCII/4K-32 468ns ± 0% 468ns ± 0% ~ (all equal) IndexRuneASCII/4M-32 521µs ± 1% 531µs ± 2% +1.91% (p=0.016 n=5+5) IndexRuneASCII/64M-32 18.6ms ± 1% 18.5ms ± 0% ~ (p=0.730 n=5+4) Index/10-32 89.1ns ±13% 25.2ns ± 0% -71.72% (p=0.008 n=5+5) Index/32-32 225ns ± 2% 226ns ± 3% ~ (p=0.683 n=5+5) Index/4K-32 11.9µs ± 0% 11.8µs ± 0% -0.22% (p=0.008 n=5+5) Index/4M-32 12.1ms ± 0% 12.1ms ± 0% ~ (p=0.548 n=5+5) Index/64M-32 197ms ± 0% 197ms ± 0% ~ (p=0.690 n=5+5) IndexEasy/10-32 46.2ns ± 0% 22.1ns ± 8% -52.16% (p=0.008 n=5+5) IndexEasy/32-32 46.2ns ± 0% 47.2ns ± 0% +2.16% (p=0.008 n=5+5) IndexEasy/4K-32 499ns ± 0% 502ns ± 0% +0.44% (p=0.008 n=5+5) IndexEasy/4M-32 529µs ± 2% 529µs ± 1% ~ (p=0.841 n=5+5) IndexEasy/64M-32 18.6ms ± 1% 18.7ms ± 1% ~ (p=0.222 n=5+5) IndexAnyASCII/1:1-32 15.7ns ± 0% 15.7ns ± 0% ~ (all equal) IndexAnyASCII/1:2-32 17.2ns ± 0% 17.2ns ± 0% ~ (all equal) IndexAnyASCII/1:4-32 20.0ns ± 0% 20.0ns ± 0% ~ (all equal) IndexAnyASCII/1:8-32 34.8ns ± 0% 34.8ns ± 0% ~ (all equal) IndexAnyASCII/1:16-32 48.1ns ± 0% 48.1ns ± 0% ~ (all equal) IndexAnyASCII/16:1-32 97.9ns ± 1% 97.7ns ± 0% ~ (p=0.857 n=5+5) IndexAnyASCII/16:2-32 102ns ± 0% 102ns ± 0% ~ (all equal) IndexAnyASCII/16:4-32 116ns ± 1% 116ns ± 1% ~ (p=1.000 n=5+5) IndexAnyASCII/16:8-32 141ns ± 1% 141ns ± 0% ~ (p=0.571 n=5+4) IndexAnyASCII/16:16-32 178ns ± 0% 178ns ± 0% ~ (all equal) IndexAnyASCII/256:1-32 1.09µs ± 0% 1.09µs ± 0% ~ (all equal) IndexAnyASCII/256:2-32 1.09µs ± 0% 1.10µs ± 0% +0.27% (p=0.008 n=5+5) IndexAnyASCII/256:4-32 1.11µs ± 0% 1.11µs ± 0% ~ (p=0.397 n=5+5) IndexAnyASCII/256:8-32 1.10µs ± 0% 1.10µs ± 0% ~ (p=0.444 n=5+5) IndexAnyASCII/256:16-32 1.14µs ± 0% 1.14µs ± 0% ~ (all equal) IndexAnyASCII/4096:1-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=1.000 n=5+5) IndexAnyASCII/4096:2-32 17.0µs ± 0% 17.0µs ± 0% ~ (p=0.159 n=5+4) IndexAnyASCII/4096:4-32 17.1µs ± 0% 17.1µs ± 0% ~ (p=0.921 n=4+5) IndexAnyASCII/4096:8-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=0.460 n=5+5) IndexAnyASCII/4096:16-32 16.5µs ± 0% 16.5µs ± 0% ~ (p=0.794 n=5+4) IndexPeriodic/IndexPeriodic2-32 189µs ± 0% 189µs ± 0% ~ (p=0.841 n=5+5) IndexPeriodic/IndexPeriodic4-32 189µs ± 0% 189µs ± 0% -0.03% (p=0.016 n=5+4) IndexPeriodic/IndexPeriodic8-32 189µs ± 0% 189µs ± 0% ~ (p=0.651 n=5+5) IndexPeriodic/IndexPeriodic16-32 175µs ± 9% 174µs ± 7% ~ (p=1.000 n=5+5) IndexPeriodic/IndexPeriodic32-32 75.1µs ± 0% 75.1µs ± 0% ~ (p=0.690 n=5+5) IndexPeriodic/IndexPeriodic64-32 42.6µs ± 0% 44.7µs ± 0% +4.98% (p=0.008 n=5+5) name old speed new speed delta IndexByte/10-32 538MB/s ± 0% 552MB/s ± 0% +2.65% (p=0.008 n=5+5) IndexByte/32-32 1.90GB/s ± 1% 1.90GB/s ± 1% ~ (p=0.548 n=5+5) IndexByte/4K-32 8.82GB/s ± 0% 8.81GB/s ± 0% ~ (p=0.548 n=5+5) IndexByte/4M-32 7.95GB/s ± 1% 8.29GB/s ± 1% +4.35% (p=0.008 n=5+5) IndexByte/64M-32 3.58GB/s ± 0% 3.60GB/s ± 1% ~ (p=0.730 n=4+5) IndexBytePortable/10-32 296MB/s ± 0% 286MB/s ± 3% ~ (p=0.381 n=4+5) IndexBytePortable/32-32 490MB/s ± 0% 485MB/s ± 2% ~ (p=0.286 n=5+5) IndexBytePortable/4K-32 697MB/s ± 0% 697MB/s ± 0% ~ (p=0.413 n=5+5) IndexBytePortable/4M-32 696MB/s ± 0% 695MB/s ± 0% ~ (p=0.897 n=5+5) IndexBytePortable/64M-32 679MB/s ± 0% 678MB/s ± 0% -0.10% (p=0.008 n=5+5) IndexRune/10-32 173MB/s ± 0% 203MB/s ± 0% +17.24% (p=0.016 n=5+4) IndexRune/32-32 555MB/s ± 0% 546MB/s ± 0% -1.62% (p=0.008 n=5+5) IndexRune/4K-32 8.01GB/s ± 0% 7.98GB/s ± 0% -0.38% (p=0.008 n=5+5) IndexRune/4M-32 7.97GB/s ± 1% 7.95GB/s ± 1% ~ (p=0.690 n=5+5) IndexRune/64M-32 3.59GB/s ± 0% 3.58GB/s ± 1% ~ (p=0.190 n=4+5) IndexRuneASCII/10-32 420MB/s ± 0% 420MB/s ± 0% ~ (p=0.190 n=5+4) IndexRuneASCII/32-32 1.32GB/s ± 0% 1.32GB/s ± 0% ~ (p=0.333 n=5+5) IndexRuneASCII/4K-32 8.75GB/s ± 0% 8.75GB/s ± 0% ~ (p=0.690 n=5+5) IndexRuneASCII/4M-32 8.04GB/s ± 1% 7.89GB/s ± 2% -1.87% (p=0.016 n=5+5) IndexRuneASCII/64M-32 3.61GB/s ± 1% 3.62GB/s ± 0% ~ (p=0.730 n=5+4) Index/10-32 113MB/s ±14% 397MB/s ± 0% +249.76% (p=0.008 n=5+5) Index/32-32 142MB/s ± 2% 141MB/s ± 3% ~ (p=0.794 n=5+5) Index/4K-32 345MB/s ± 0% 346MB/s ± 0% +0.22% (p=0.008 n=5+5) Index/4M-32 345MB/s ± 0% 345MB/s ± 0% ~ (p=0.619 n=5+5) Index/64M-32 341MB/s ± 0% 341MB/s ± 0% ~ (p=0.595 n=5+5) IndexEasy/10-32 216MB/s ± 0% 453MB/s ± 8% +109.60% (p=0.008 n=5+5) IndexEasy/32-32 692MB/s ± 0% 678MB/s ± 0% -2.01% (p=0.008 n=5+5) IndexEasy/4K-32 8.19GB/s ± 0% 8.16GB/s ± 0% -0.45% (p=0.008 n=5+5) IndexEasy/4M-32 7.93GB/s ± 2% 7.93GB/s ± 1% ~ (p=0.841 n=5+5) IndexEasy/64M-32 3.60GB/s ± 1% 3.59GB/s ± 1% ~ (p=0.222 n=5+5) Change-Id: I4ca69378a2df6f9ba748c6a2706953ee1bd07343 Reviewed-on: https://go-review.googlesource.com/96555 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-30bytes: mention strings.Builder in Buffer.String docsBrad Fitzpatrick
Fixes #22778 Change-Id: I37f7a59c15828aa720fe787fff42fb3ef17729c7 Reviewed-on: https://go-review.googlesource.com/80815 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-21bytes: add optimized countByte for arm64Wei Xiao
Use SIMD instructions when counting a single byte. Inspired from runtime IndexByte implementation. Benchmark results of bytes, where 1 byte in every 8 is the one we are looking: name old time/op new time/op delta CountSingle/10-8 96.1ns ± 1% 38.8ns ± 0% -59.64% (p=0.000 n=9+7) CountSingle/32-8 172ns ± 2% 36ns ± 1% -79.27% (p=0.000 n=10+10) CountSingle/4K-8 18.2µs ± 1% 0.9µs ± 0% -95.17% (p=0.000 n=9+10) CountSingle/4M-8 18.4ms ± 0% 0.9ms ± 0% -95.00% (p=0.000 n=10+9) CountSingle/64M-8 284ms ± 4% 19ms ± 0% -93.40% (p=0.000 n=10+10) name old speed new speed delta CountSingle/10-8 104MB/s ± 1% 258MB/s ± 0% +147.99% (p=0.000 n=9+10) CountSingle/32-8 185MB/s ± 1% 897MB/s ± 1% +385.33% (p=0.000 n=9+10) CountSingle/4K-8 225MB/s ± 1% 4658MB/s ± 0% +1967.40% (p=0.000 n=9+10) CountSingle/4M-8 228MB/s ± 0% 4555MB/s ± 0% +1901.71% (p=0.000 n=10+9) CountSingle/64M-8 236MB/s ± 4% 3575MB/s ± 0% +1414.69% (p=0.000 n=10+10) Change-Id: Ifccb51b3c8658c49773fe05147c3cf3aead361e5 Reviewed-on: https://go-review.googlesource.com/71111 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-19bytes: don't use an iota for the readOp constantsDaniel Martí
As per the comments in golang.org/cl/78617. Also leaving a comment here, to make sure noone else thinks to re-introduce the iota like I did. Change-Id: I2a2275998b81896eaa0e9d5ee0197661ebe84acf Reviewed-on: https://go-review.googlesource.com/78676 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-18bytes: make all readOp constants actually typedDaniel Martí
This is a regression introduced in golang.org/cl/28817. That change got rid of the iota, which meant that the type was no longer applied to all the constant names. Re-add the iota starting at -1, simplifying the code and adding the types once more. Change-Id: I38bd0e04f8d298196bccd33651e29f5011401a8d Reviewed-on: https://go-review.googlesource.com/78617 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-15bytes, strings: restore O(1) behavior of IndexAny(s, "") and LastIndexAny(s, "")Russ Cox
CL 65851 (bytes) and CL 65910 (strings) “improve[d] readability” by removing the special case that bypassed the whole function body when chars == "". In doing so, yes, the function was unindented a level, which is nice, but the runtime of that case went from O(1) to O(n) where n = len(s). I don't know if anyone's code depends on the O(1) behavior in this case, but quite possibly someone's does. This CL adds the special case back, with a comment to prevent future deletions, and without reindenting each function body in full. Change-Id: I5aba33922b304dd1b8657e6d51d6c937a7f95c81 Reviewed-on: https://go-review.googlesource.com/78112 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-15bytes: make ExampleTrimLeft and ExampleTrimRight matchRuss Cox
ExampleTrimLeft was inexplicably complex. Change-Id: I13ca81bdeba728bdd632acf82e3a1101d29b9f39 Reviewed-on: https://go-review.googlesource.com/78111 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15bytes: change ExampleReader_Len to use a non-ASCII stringRuss Cox
This should help make clear that Len is not counting runes. Also delete empty string, which doesn't add much. Change-Id: I1602352df1897fef6e855e9db0bababb8ab788ca Reviewed-on: https://go-review.googlesource.com/78110 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15bytes,strings: in generic Index, use mix of IndexByte and Rabin-KarpKeith Randall
Use IndexByte first, as it allows us to skip lots of bytes quickly. If IndexByte is generating a lot of false positives, switch over to Rabin-Karp. Experiments for ppc64le bytes: name old time/op new time/op delta IndexPeriodic/IndexPeriodic2-2 1.12ms ± 0% 0.18ms ± 0% -83.54% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic4-2 635µs ± 0% 184µs ± 0% -71.06% (p=0.000 n=9+9) IndexPeriodic/IndexPeriodic8-2 289µs ± 0% 184µs ± 0% -36.51% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic16-2 133µs ± 0% 183µs ± 0% +37.68% (p=0.000 n=10+9) IndexPeriodic/IndexPeriodic32-2 68.3µs ± 0% 70.2µs ± 0% +2.76% (p=0.000 n=10+10) IndexPeriodic/IndexPeriodic64-2 35.8µs ± 0% 36.6µs ± 0% +2.17% (p=0.000 n=8+10) strings: name old time/op new time/op delta IndexPeriodic/IndexPeriodic2-2 184µs ± 0% 184µs ± 0% +0.11% (p=0.029 n=4+4) IndexPeriodic/IndexPeriodic4-2 184µs ± 0% 184µs ± 0% ~ (p=0.886 n=4+4) IndexPeriodic/IndexPeriodic8-2 184µs ± 0% 184µs ± 0% ~ (p=0.486 n=4+4) IndexPeriodic/IndexPeriodic16-2 185µs ± 1% 184µs ± 0% ~ (p=0.343 n=4+4) IndexPeriodic/IndexPeriodic32-2 184µs ± 0% 69µs ± 0% -62.37% (p=0.029 n=4+4) IndexPeriodic/IndexPeriodic64-2 184µs ± 0% 37µs ± 0% -80.17% (p=0.029 n=4+4) Fixes #22578 Change-Id: If2a4d8554cb96bfd699b58149d13ac294615f8b8 Reviewed-on: https://go-review.googlesource.com/76070 Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
2017-11-04bytes: reduce work in IndexNearPageBoundary testKeith Randall
This test was taking too long on ppc64x. There were a few reasons. The first is that the page size on ppc64x is 64k instead of 4k. That's 16x more work. The second is that the generic Index is pretty bad in this case. It first calls IndexByte which does a bunch of setup work only to find the byte we're looking for at index 0. Then it calls Equal which has to look at the whole string to find a difference on the last byte. To fix, just limit our attention to near the end of the page. Change-Id: I6b8bcbb94652a2da853862acc23803def0c49303 Reviewed-on: https://go-review.googlesource.com/76050 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
2017-11-03bytes: set cap of slices returned by Split and Fields and friendsIan Lance Taylor
This avoids the problem in which appending to a slice returned by Split can affect subsequent slices. Fixes #21149. Change-Id: Ie3df2b9ceeb9605d4625f47d49073c5f348cf0a1 Reviewed-on: https://go-review.googlesource.com/74510 Reviewed-by: Jelte Fennema <github-tech@jeltef.nl> Reviewed-by: Robert Griesemer <gri@golang.org>
2017-11-03bytes: add more page boundary testsKeith Randall
Make sure Index and IndexByte don't read past the queried byte slice. Hopefully will be helpful for CL 33597. Also remove the code which maps/unmaps the Go heap. Much safer to play with protection bits off-heap. Change-Id: I50d73e879b2d83285e1bc7c3e810efe4c245fe75 Reviewed-on: https://go-review.googlesource.com/75890 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-16bytes: add examples of Equal and IndexByteJavier Segura
Change-Id: Ibf3179d0903eb443c89b6d886802c36f8d199898 Reviewed-on: https://go-review.googlesource.com/70933 Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-06bytes: panic in ReadFrom with more information with negative Read countsAfanasev Stanislav
This is only to aid in human debugging, and for that reason we maintain a panic, and not return an error. Fixes #22097 Change-Id: If72e4d1e47ec9125ca7bc97d5fe4cedb7f76ae72 Reviewed-on: https://go-review.googlesource.com/67970 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-04bytes: correct Map documentationGabriel Aszalos
Fix incorrect reference to string instead of byte slice. Change-Id: I95553da32acfbcf5dde9613b07ea38408cb31ae8 Reviewed-on: https://go-review.googlesource.com/68090 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-10-02bytes: explicitly state if a function expects UTF-8-encoded dataTim Cooper
Fixes #21950 Change-Id: I6fa392abd2c3bf6a4f80f14c6b1419470e9a944d Reviewed-on: https://go-review.googlesource.com/66750 Run-TryBot: Rob Pike <r@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org>
2017-09-25bytes: improve readability of IndexAny and LastIndexAny functionsGabriel Aszalos
This change removes the check of len(chars) > 0 inside the Index and IndexAny functions which was redundant. Change-Id: Ic4bf8b8a37d7f040d3ebd81b4fc45fcb386b639a Reviewed-on: https://go-review.googlesource.com/65851 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-25cmd/compile: merge bytes inline test with the restDaniel Martí
In golang.org/cl/42813, a test was added in the bytes package to check if a Buffer method was being inlined, using 'go tool nm'. Now that we have a compiler test that verifies that certain funcs are inlineable, merge it there. Knowing whether the funcs are inlineable is also more reliable than whether or not their symbol appears in the binary, too. For example, under some circumstances, inlineable funcs can't be inlined, such as if closures are used. While at it, add a few more bytes.Buffer methods that are currently inlined and should clearly stay that way. Updates #21851. Change-Id: I62066e32ef5542d37908bd64f90bda51276da4de Reviewed-on: https://go-review.googlesource.com/65658 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-21bytes: add documentation to reader methodsGabriel Aszalos
Some methods that were used to implement various `io` interfaces in the Reader were documented, whereas others were not. This change adds documentation to all the missing methods used to implement these interfaces. Change-Id: I2dac6e328542de3cd87e89510651cd6ba74a7b7d Reviewed-on: https://go-review.googlesource.com/65231 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-20bytes: improve test readabilityGabriel Aszalos
This CL improves the readability of the tests in the bytes package by naming the `data` test variable `testString`, using the same convention as its counterpart, `testBytes`. It additionally removes some type casting which was unnecessary. Change-Id: If38b5606ce8bda0306bae24498f21cb8dbbb6377 Reviewed-on: https://go-review.googlesource.com/64931 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-20bytes: correct message in test logGabriel Aszalos
Change-Id: Ib731874b9a37ff141e4305d8ccfdf7c165155da6 Reviewed-on: https://go-review.googlesource.com/64930 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-09-18bytes: removed unnecessary slicing on copyGabriel Aszalos
Change-Id: Ia42e3479c852a88968947411de8b32e5bcda5ae3 Reviewed-on: https://go-review.googlesource.com/64371 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-17bytes: add example for Len function of ReaderAndrzej Żeżel
Change-Id: If7ecdc57f190f647bfc673bde8e66b4ef12aa906 Reviewed-on: https://go-review.googlesource.com/64190 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-25bytes: Add missing examples to functionsBorja Clemente
Fixes #21570 Change-Id: Ia0734929a04fbce8fdd5fbcb1b7baff9a8bbe39e Reviewed-on: https://go-review.googlesource.com/58030 Reviewed-by: Robert Griesemer <gri@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-22bytes: add examples for TrimLeft and TrimRightMichael Brandenburg
Change-Id: Ib6d94f185dd43568cf97ef267dd51a09f43a402f Reviewed-on: https://go-review.googlesource.com/51391 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-18bytes: clean-up of buffer.goMarvin Stenger
Clean-up changes in no particular order: - use uint8 instead of int for readOp - remove duplicated code in ReadFrom() - introduce (*Buffer).empty() - remove naked returns Change-Id: Ie6e673c20c398f980f8be0448969a36ad4778804 Reviewed-on: https://go-review.googlesource.com/42816 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-18all: unindent some big chunks of codeDaniel Martí
Found with mvdan.cc/unindent. Prioritized the ones with the biggest wins for now. Change-Id: I2b032e45cdd559fc9ed5b1ee4c4de42c4c92e07b Reviewed-on: https://go-review.googlesource.com/56470 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-16bytes: avoid overflow in (*Buffer).Grow and ReadFromBryan C. Mills
fixes #21481 Change-Id: I26717876a1c0ee25a86c81159c6b3c59563dfec6 Reviewed-on: https://go-review.googlesource.com/56230 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-08-14bytes: speed up Fields and FieldsFuncMartin Möhrmann
Applies the optimizations from golang.org/cl/42810 and golang.org/cl/37959 done to the strings package to the bytes package. name old time/op new time/op delta Fields/ASCII/16 417ns ± 4% 118ns ± 3% -71.65% (p=0.000 n=10+10) Fields/ASCII/256 5.95µs ± 3% 0.88µs ± 0% -85.23% (p=0.000 n=10+7) Fields/ASCII/4096 92.3µs ± 1% 12.8µs ± 2% -86.13% (p=0.000 n=10+10) Fields/ASCII/65536 1.49ms ± 1% 0.25ms ± 1% -83.14% (p=0.000 n=10+10) Fields/ASCII/1048576 25.0ms ± 1% 6.5ms ± 2% -74.04% (p=0.000 n=10+10) Fields/Mixed/16 406ns ± 1% 222ns ± 1% -45.24% (p=0.000 n=10+9) Fields/Mixed/256 5.78µs ± 1% 2.27µs ± 1% -60.73% (p=0.000 n=9+10) Fields/Mixed/4096 97.9µs ± 1% 40.5µs ± 3% -58.66% (p=0.000 n=10+10) Fields/Mixed/65536 1.58ms ± 1% 0.69ms ± 1% -56.58% (p=0.000 n=10+10) Fields/Mixed/1048576 26.6ms ± 1% 12.6ms ± 2% -52.44% (p=0.000 n=9+10) FieldsFunc/ASCII/16 395ns ± 1% 188ns ± 1% -52.34% (p=0.000 n=10+10) FieldsFunc/ASCII/256 5.90µs ± 1% 2.00µs ± 1% -66.06% (p=0.000 n=10+10) FieldsFunc/ASCII/4096 92.5µs ± 1% 33.0µs ± 1% -64.34% (p=0.000 n=10+9) FieldsFunc/ASCII/65536 1.48ms ± 1% 0.54ms ± 1% -63.38% (p=0.000 n=10+9) FieldsFunc/ASCII/1048576 25.1ms ± 1% 10.5ms ± 3% -58.24% (p=0.000 n=10+10) FieldsFunc/Mixed/16 401ns ± 1% 205ns ± 2% -48.87% (p=0.000 n=10+10) FieldsFunc/Mixed/256 5.70µs ± 1% 1.98µs ± 1% -65.28% (p=0.000 n=10+10) FieldsFunc/Mixed/4096 97.5µs ± 1% 35.4µs ± 1% -63.65% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 1.57ms ± 1% 0.61ms ± 1% -61.20% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 26.5ms ± 1% 11.4ms ± 2% -56.84% (p=0.000 n=10+10) name old speed new speed delta Fields/ASCII/16 38.4MB/s ± 4% 134.9MB/s ± 3% +251.55% (p=0.000 n=10+10) Fields/ASCII/256 43.0MB/s ± 3% 290.6MB/s ± 1% +575.97% (p=0.000 n=10+8) Fields/ASCII/4096 44.4MB/s ± 1% 320.0MB/s ± 2% +620.90% (p=0.000 n=10+10) Fields/ASCII/65536 44.0MB/s ± 1% 260.7MB/s ± 1% +493.15% (p=0.000 n=10+10) Fields/ASCII/1048576 42.0MB/s ± 1% 161.6MB/s ± 2% +285.21% (p=0.000 n=10+10) Fields/Mixed/16 39.4MB/s ± 1% 71.7MB/s ± 1% +82.20% (p=0.000 n=10+10) Fields/Mixed/256 44.3MB/s ± 1% 112.8MB/s ± 1% +154.64% (p=0.000 n=9+10) Fields/Mixed/4096 41.9MB/s ± 1% 101.2MB/s ± 3% +141.92% (p=0.000 n=10+10) Fields/Mixed/65536 41.5MB/s ± 1% 95.5MB/s ± 1% +130.29% (p=0.000 n=10+10) Fields/Mixed/1048576 39.4MB/s ± 1% 82.9MB/s ± 2% +110.28% (p=0.000 n=9+10) FieldsFunc/ASCII/16 40.5MB/s ± 1% 84.9MB/s ± 2% +109.80% (p=0.000 n=10+10) FieldsFunc/ASCII/256 43.4MB/s ± 1% 127.9MB/s ± 1% +194.58% (p=0.000 n=10+10) FieldsFunc/ASCII/4096 44.3MB/s ± 1% 124.2MB/s ± 1% +180.44% (p=0.000 n=10+9) FieldsFunc/ASCII/65536 44.2MB/s ± 1% 120.6MB/s ± 1% +173.06% (p=0.000 n=10+9) FieldsFunc/ASCII/1048576 41.8MB/s ± 1% 100.2MB/s ± 3% +139.53% (p=0.000 n=10+10) FieldsFunc/Mixed/16 39.8MB/s ± 1% 77.8MB/s ± 2% +95.46% (p=0.000 n=10+10) FieldsFunc/Mixed/256 44.9MB/s ± 1% 129.4MB/s ± 1% +187.97% (p=0.000 n=10+10) FieldsFunc/Mixed/4096 42.0MB/s ± 1% 115.6MB/s ± 1% +175.08% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 41.6MB/s ± 1% 107.3MB/s ± 1% +157.75% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 39.6MB/s ± 1% 91.8MB/s ± 2% +131.72% (p=0.000 n=10+10) name old alloc/op new alloc/op delta Fields/ASCII/16 80.0B ± 0% 80.0B ± 0% ~ (all equal) Fields/ASCII/256 768B ± 0% 768B ± 0% ~ (all equal) Fields/ASCII/4096 9.47kB ± 0% 9.47kB ± 0% ~ (all equal) Fields/ASCII/65536 147kB ± 0% 147kB ± 0% ~ (all equal) Fields/ASCII/1048576 2.27MB ± 0% 2.27MB ± 0% ~ (all equal) Fields/Mixed/16 96.0B ± 0% 96.0B ± 0% ~ (all equal) Fields/Mixed/256 768B ± 0% 768B ± 0% ~ (all equal) Fields/Mixed/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) Fields/Mixed/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) Fields/Mixed/1048576 2.26MB ± 0% 9.61MB ± 0% +324.89% (p=0.000 n=10+10) FieldsFunc/ASCII/16 80.0B ± 0% 80.0B ± 0% ~ (all equal) FieldsFunc/ASCII/256 768B ± 0% 768B ± 0% ~ (all equal) FieldsFunc/ASCII/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) FieldsFunc/ASCII/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) FieldsFunc/ASCII/1048576 2.27MB ± 0% 9.61MB ± 0% +323.72% (p=0.000 n=10+10) FieldsFunc/Mixed/16 96.0B ± 0% 96.0B ± 0% ~ (all equal) FieldsFunc/Mixed/256 768B ± 0% 768B ± 0% ~ (all equal) FieldsFunc/Mixed/4096 9.47kB ± 0% 24.83kB ± 0% +162.16% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 147kB ± 0% 497kB ± 0% +237.24% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 2.26MB ± 0% 9.61MB ± 0% +324.89% (p=0.000 n=10+10) name old allocs/op new allocs/op delta Fields/ASCII/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/4096 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/65536 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/ASCII/1048576 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) Fields/Mixed/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) Fields/Mixed/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) Fields/Mixed/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) FieldsFunc/ASCII/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/ASCII/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/ASCII/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) FieldsFunc/ASCII/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) FieldsFunc/ASCII/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) FieldsFunc/Mixed/16 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/Mixed/256 1.00 ± 0% 1.00 ± 0% ~ (all equal) FieldsFunc/Mixed/4096 1.00 ± 0% 5.00 ± 0% +400.00% (p=0.000 n=10+10) FieldsFunc/Mixed/65536 1.00 ± 0% 12.00 ± 0% +1100.00% (p=0.000 n=10+10) FieldsFunc/Mixed/1048576 1.00 ± 0% 24.00 ± 0% +2300.00% (p=0.000 n=10+10) Change-Id: If1926782decc2f60d3b4b8c41c2ce7d8bdedfd8f Reviewed-on: https://go-review.googlesource.com/55131 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-07-16bytes: add example for (*Buffer).GrowBrian Downs
Change-Id: I04849883dd2e1f6d083e9f57d2a8c1bd7d258953 Reviewed-on: https://go-review.googlesource.com/48878 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com>
2017-06-02bytes: note that NewBuffer take ownership of its argumentAlberto Donizetti
Fixes #19383 Change-Id: Ic84517053ced7794006f6fc65e6f249e97d6cf35 Reviewed-on: https://go-review.googlesource.com/44691 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-05-10internal/cpu: new package to detect cpu featuresMartin Möhrmann
Implements detection of x86 cpu features that are used in the go standard library. Changes all standard library packages to use the new cpu package instead of using runtime internal variables to check x86 cpu features. Updates: #15403 Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9 Reviewed-on: https://go-review.googlesource.com/41476 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-08bytes: skip inline test by defaultMarvin Stenger
The test "TestTryGrowByResliceInlined" introduced in c08ac36 broke the noopt builder as it fails when inlining is disabled. Since there are currently no other options at hand for checking inlined-ness other than looking at emited symbols of the compilation, we for now skip the problem causing test by default and only run it on one specific builder ("linux-amd64"). Also see CL 42813, which introduced the test and contains comments suggesting this temporary solution. Change-Id: I3978ab0831da04876cf873d78959f821c459282b Reviewed-on: https://go-review.googlesource.com/42820 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-07bytes: optimize Buffer's Write, WriteString, WriteByte, and WriteRuneMarvin Stenger
In the common case, the grow method only needs to reslice the internal buffer. Making another function call to grow can be expensive when Write is called very often with small pieces of data (like a byte or rune). Thus, we add a tryGrowByReslice method that is inlineable so that we can avoid an extra call in most cases. name old time/op new time/op delta WriteByte-4 35.5µs ± 0% 17.4µs ± 1% -51.03% (p=0.000 n=19+20) WriteRune-4 55.7µs ± 1% 38.7µs ± 1% -30.56% (p=0.000 n=18+19) BufferNotEmptyWriteRead-4 304µs ± 5% 283µs ± 3% -6.86% (p=0.000 n=19+17) BufferFullSmallReads-4 87.0µs ± 5% 66.8µs ± 2% -23.26% (p=0.000 n=17+17) name old speed new speed delta WriteByte-4 115MB/s ± 0% 235MB/s ± 1% +104.19% (p=0.000 n=19+20) WriteRune-4 221MB/s ± 1% 318MB/s ± 1% +44.01% (p=0.000 n=18+19) Fixes #17857 Change-Id: I08dfb10a1c7e001817729dbfcc951bda12fe8814 Reviewed-on: https://go-review.googlesource.com/42813 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-28bytes: clarify documentation for UnreadByte/RuneRobert Griesemer
Fixes #19522. Change-Id: Ib3cf0336e0bf91580d533704ec1a9d45eb0bf62d Reviewed-on: https://go-review.googlesource.com/42020 Reviewed-by: Rob Pike <r@golang.org>
2017-04-07strings: optimize Count for amd64Josselin Costanzi
Move optimized Count implementation from bytes to runtime. Use in both bytes and strings packages. Add CountByte benchmark to strings. Strings benchmarks: name old time/op new time/op delta CountHard1-4 226µs ± 1% 226µs ± 2% ~ (p=0.247 n=10+10) CountHard2-4 316µs ± 1% 315µs ± 0% ~ (p=0.133 n=9+10) CountHard3-4 919µs ± 1% 920µs ± 1% ~ (p=0.968 n=10+9) CountTorture-4 15.4µs ± 1% 15.7µs ± 1% +2.47% (p=0.000 n=10+9) CountTortureOverlapping-4 9.60ms ± 0% 9.65ms ± 1% ~ (p=0.247 n=10+10) CountByte/10-4 26.3ns ± 1% 10.9ns ± 1% -58.71% (p=0.000 n=9+9) CountByte/32-4 42.7ns ± 0% 14.2ns ± 0% -66.64% (p=0.000 n=10+10) CountByte/4096-4 3.07µs ± 0% 0.31µs ± 2% -89.99% (p=0.000 n=9+10) CountByte/4194304-4 3.48ms ± 1% 0.34ms ± 1% -90.09% (p=0.000 n=10+9) CountByte/67108864-4 55.6ms ± 1% 7.0ms ± 0% -87.49% (p=0.000 n=9+8) name old speed new speed delta CountByte/10-4 380MB/s ± 1% 919MB/s ± 1% +142.21% (p=0.000 n=9+9) CountByte/32-4 750MB/s ± 0% 2247MB/s ± 0% +199.62% (p=0.000 n=10+10) CountByte/4096-4 1.33GB/s ± 0% 13.32GB/s ± 2% +898.13% (p=0.000 n=9+10) CountByte/4194304-4 1.21GB/s ± 1% 12.17GB/s ± 1% +908.87% (p=0.000 n=10+9) CountByte/67108864-4 1.21GB/s ± 1% 9.65GB/s ± 0% +699.29% (p=0.000 n=9+8) Fixes #19411 Change-Id: I8d2d409f0fa6df6d03b60790aa86e540b4a4e3b0 Reviewed-on: https://go-review.googlesource.com/38693 Reviewed-by: Keith Randall <khr@golang.org>
2017-04-03bytes, strings: declare variables inside loop they're used inEric Lagergren
The recently updated Count functions declare variables before special-cased returns. Change-Id: I8f726118336b7b0ff72117d12adc48b6e37e60ea Reviewed-on: https://go-review.googlesource.com/39357 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-04-03all: fix minor misspellingsEric Lagergren
Change-Id: I1f1cfb161640eb8756fb1a283892d06b30b7a8fa Reviewed-on: https://go-review.googlesource.com/39356 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-22bytes: fix typo in commentJosselin Costanzi
Change-Id: Ia739337dc9961422982912cc6a669022559fb991 Reviewed-on: https://go-review.googlesource.com/38365 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-21bytes: add optimized countByte for amd64Josselin Costanzi
Use SSE/AVX2 when counting a single byte. Inspired from runtime indexbyte implementation. Benchmark against previous implementation, where 1 byte in every 8 is the one we are looking for: * On a machine without AVX2 name old time/op new time/op delta CountSingle/10-4 61.8ns ±10% 15.6ns ±11% -74.83% (p=0.000 n=10+10) CountSingle/32-4 100ns ± 4% 17ns ±10% -82.54% (p=0.000 n=10+9) CountSingle/4K-4 9.66µs ± 3% 0.37µs ± 6% -96.21% (p=0.000 n=10+10) CountSingle/4M-4 11.0ms ± 6% 0.4ms ± 4% -96.04% (p=0.000 n=10+10) CountSingle/64M-4 194ms ± 8% 8ms ± 2% -95.64% (p=0.000 n=10+10) name old speed new speed delta CountSingle/10-4 162MB/s ±10% 645MB/s ±10% +297.00% (p=0.000 n=10+10) CountSingle/32-4 321MB/s ± 5% 1844MB/s ± 9% +474.79% (p=0.000 n=10+9) CountSingle/4K-4 424MB/s ± 3% 11169MB/s ± 6% +2533.10% (p=0.000 n=10+10) CountSingle/4M-4 381MB/s ± 7% 9609MB/s ± 4% +2421.88% (p=0.000 n=10+10) CountSingle/64M-4 346MB/s ± 7% 7924MB/s ± 2% +2188.78% (p=0.000 n=10+10) * On a machine with AVX2 name old time/op new time/op delta CountSingle/10-8 37.1ns ± 3% 8.2ns ± 1% -77.80% (p=0.000 n=10+10) CountSingle/32-8 66.1ns ± 3% 9.8ns ± 2% -85.23% (p=0.000 n=10+10) CountSingle/4K-8 7.36µs ± 3% 0.11µs ± 1% -98.54% (p=0.000 n=10+10) CountSingle/4M-8 7.46ms ± 2% 0.15ms ± 2% -97.95% (p=0.000 n=10+9) CountSingle/64M-8 124ms ± 2% 6ms ± 4% -95.09% (p=0.000 n=10+10) name old speed new speed delta CountSingle/10-8 269MB/s ± 3% 1213MB/s ± 1% +350.32% (p=0.000 n=10+10) CountSingle/32-8 484MB/s ± 4% 3277MB/s ± 2% +576.66% (p=0.000 n=10+10) CountSingle/4K-8 556MB/s ± 3% 37933MB/s ± 1% +6718.36% (p=0.000 n=10+10) CountSingle/4M-8 562MB/s ± 2% 27444MB/s ± 3% +4783.43% (p=0.000 n=10+9) CountSingle/64M-8 543MB/s ± 2% 11054MB/s ± 3% +1935.81% (p=0.000 n=10+10) Fixes #19411 Change-Id: Ieaf20b1fabccabe767c55c66e242e86f3617f883 Reviewed-on: https://go-review.googlesource.com/38258 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-02-28bytes: make bytes.Buffer cache-friendlyCarlo Alberto Ferraris
During benchmark of an internal tool we found out that (*Buffer).Reset() was surprisingly showing up in CPU profiles. This CL contains two related changes aimed at speeding up Reset(): 1. Create a fast path for Truncate(0) by moving the logic to Reset() (this makes Reset() a simple leaf func that gets inlined since it gets compiled to 3 MOVx instructions). Accordingly change calls in the rest of the Buffer methods to call Reset() instead of Truncate(0). 2. Reorder the fields in the Buffer struct so that frequently accessed fields are packed together (buf, off, lastRead). This also make them likely to be in the same cacheline. Ideally it would be advisable to have Buffer{} cacheline-aligned, but I couldn't find a way to do this without changing the size of the bootstrap array (but this will cause some regressions, because it will make duffcopy show up in CPU profiles where it wasn't showing up before). go1 benchmarks are not really affected, but some other benchmarks that exercise Buffer more show improvements: name old time/op new time/op delta BinaryTree17-4 2.46s ± 9% 2.43s ± 3% ~ (p=0.982 n=14+14) Fannkuch11-4 2.98s ± 1% 2.90s ± 1% -2.58% (p=0.000 n=15+14) FmtFprintfEmpty-4 45.2ns ± 1% 45.2ns ± 1% ~ (p=0.494 n=14+15) FmtFprintfString-4 76.8ns ± 1% 83.1ns ± 2% +8.23% (p=0.000 n=10+15) FmtFprintfInt-4 78.0ns ± 2% 74.6ns ± 1% -4.46% (p=0.000 n=15+15) FmtFprintfIntInt-4 113ns ± 1% 109ns ± 2% -2.91% (p=0.000 n=13+15) FmtFprintfPrefixedInt-4 152ns ± 2% 143ns ± 2% -6.04% (p=0.000 n=15+14) FmtFprintfFloat-4 224ns ± 1% 222ns ± 2% -1.08% (p=0.001 n=15+14) FmtManyArgs-4 464ns ± 2% 463ns ± 2% ~ (p=0.303 n=14+15) GobDecode-4 6.25ms ± 2% 6.32ms ± 3% +1.20% (p=0.002 n=14+14) GobEncode-4 5.41ms ± 2% 5.41ms ± 2% ~ (p=0.967 n=15+15) Gzip-4 215ms ± 2% 218ms ± 2% +1.35% (p=0.002 n=15+15) Gunzip-4 34.3ms ± 2% 34.2ms ± 2% ~ (p=0.539 n=15+15) HTTPClientServer-4 76.4µs ± 2% 75.4µs ± 1% -1.31% (p=0.000 n=15+15) JSONEncode-4 14.7ms ± 2% 14.6ms ± 3% ~ (p=0.094 n=14+14) JSONDecode-4 48.0ms ± 1% 48.5ms ± 1% +0.92% (p=0.001 n=14+12) Mandelbrot200-4 4.04ms ± 2% 4.06ms ± 1% ~ (p=0.108 n=15+13) GoParse-4 2.99ms ± 2% 3.00ms ± 1% ~ (p=0.130 n=15+13) RegexpMatchEasy0_32-4 78.3ns ± 1% 79.5ns ± 1% +1.51% (p=0.000 n=15+14) RegexpMatchEasy0_1K-4 185ns ± 1% 186ns ± 1% +0.76% (p=0.005 n=15+15) RegexpMatchEasy1_32-4 79.0ns ± 2% 76.7ns ± 1% -2.87% (p=0.000 n=14+15) name old speed new speed delta GobDecode-4 123MB/s ± 2% 121MB/s ± 3% -1.18% (p=0.002 n=14+14) GobEncode-4 142MB/s ± 2% 142MB/s ± 1% ~ (p=0.959 n=15+15) Gzip-4 90.3MB/s ± 2% 89.1MB/s ± 2% -1.34% (p=0.002 n=15+15) Gunzip-4 565MB/s ± 2% 567MB/s ± 2% ~ (p=0.539 n=15+15) JSONEncode-4 132MB/s ± 2% 133MB/s ± 3% ~ (p=0.091 n=14+14) JSONDecode-4 40.4MB/s ± 1% 40.0MB/s ± 1% -0.92% (p=0.001 n=14+12) GoParse-4 19.4MB/s ± 2% 19.3MB/s ± 1% ~ (p=0.121 n=15+13) RegexpMatchEasy0_32-4 409MB/s ± 1% 403MB/s ± 1% -1.47% (p=0.000 n=15+14) RegexpMatchEasy0_1K-4 5.53GB/s ± 1% 5.49GB/s ± 1% -0.86% (p=0.002 n=15+15) RegexpMatchEasy1_32-4 405MB/s ± 2% 417MB/s ± 1% +2.94% (p=0.000 n=14+15) name old time/op new time/op delta PoolsSingle1K-4 34.9ns ± 2% 30.4ns ± 4% -12.80% (p=0.000 n=15+15) PoolsSingle64K-4 36.9ns ± 1% 34.4ns ± 4% -6.72% (p=0.000 n=14+15) PoolsRandomSmall-4 34.8ns ± 3% 29.5ns ± 1% -15.19% (p=0.000 n=15+14) PoolsRandomLarge-4 38.6ns ± 1% 34.3ns ± 3% -11.17% (p=0.000 n=14+15) PoolSingle1K-4 26.1ns ± 1% 21.2ns ± 2% -18.59% (p=0.000 n=15+14) PoolSingle64K-4 26.7ns ± 2% 21.5ns ± 2% -19.72% (p=0.000 n=15+15) MakeSingle1K-4 24.2ns ± 2% 24.3ns ± 3% ~ (p=0.132 n=13+15) MakeSingle64K-4 6.76µs ± 1% 6.96µs ± 5% +2.94% (p=0.002 n=13+13) MakeRandomSmall-4 531ns ± 4% 538ns ± 5% ~ (p=0.066 n=14+15) MakeRandomLarge-4 152µs ± 0% 152µs ± 1% -0.31% (p=0.001 n=14+13) Change-Id: I86d7d9d2cac65335baf62214fbb35ba0fd8f9528 Reviewed-on: https://go-review.googlesource.com/37416 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-08bytes, strings: optimize Split*Aliaksandr Valialkin
The relevant benchmark results on linux/amd64: bytes: SplitSingleByteSeparator-4 25.7ms ± 5% 9.1ms ± 4% -64.40% (p=0.000 n=10+10) SplitMultiByteSeparator-4 13.8ms ±20% 4.3ms ± 8% -69.23% (p=0.000 n=10+10) SplitNSingleByteSeparator-4 1.88µs ± 9% 0.88µs ± 4% -53.25% (p=0.000 n=10+10) SplitNMultiByteSeparator-4 4.83µs ±10% 1.32µs ± 9% -72.65% (p=0.000 n=10+10) strings: name old time/op new time/op delta SplitSingleByteSeparator-4 21.4ms ± 8% 8.5ms ± 5% -60.19% (p=0.000 n=10+10) SplitMultiByteSeparator-4 13.2ms ± 9% 3.9ms ± 4% -70.29% (p=0.000 n=10+10) SplitNSingleByteSeparator-4 1.54µs ± 5% 0.75µs ± 7% -51.21% (p=0.000 n=10+10) SplitNMultiByteSeparator-4 3.57µs ± 8% 1.01µs ±11% -71.76% (p=0.000 n=10+10) Fixes #18973 Change-Id: Ie4bc010c6cc389001e72eab530497c81e5b26f34 Reviewed-on: https://go-review.googlesource.com/36510 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-08bytes: use Index in CountIlya Tocar
Similar to https://go-review.googlesource.com/28586, but for package bytes instead of strings. This provides simpler code and some performance gain. Also update strings.Count to use the same code. On AMD64 with heavily optimized Index I see: name old time/op new time/op delta Count/10-6 47.3ns ± 0% 36.8ns ± 0% -22.35% (p=0.000 n=10+10) Count/32-6 286ns ± 0% 38ns ± 0% -86.71% (p=0.000 n=10+10) Count/4K-6 50.1µs ± 0% 4.4µs ± 0% -91.18% (p=0.000 n=10+10) Count/4M-6 48.1ms ± 1% 4.5ms ± 0% -90.56% (p=0.000 n=10+9) Count/64M-6 784ms ± 0% 73ms ± 0% -90.73% (p=0.000 n=10+10) CountEasy/10-6 28.4ns ± 0% 31.0ns ± 0% +9.23% (p=0.000 n=10+10) CountEasy/32-6 30.6ns ± 0% 37.0ns ± 0% +20.92% (p=0.000 n=10+10) CountEasy/4K-6 186ns ± 0% 198ns ± 0% +6.45% (p=0.000 n=9+10) CountEasy/4M-6 233µs ± 2% 234µs ± 2% ~ (p=0.912 n=10+10) CountEasy/64M-6 6.70ms ± 0% 6.68ms ± 1% ~ (p=0.762 n=8+10) name old speed new speed delta Count/10-6 211MB/s ± 0% 272MB/s ± 0% +28.77% (p=0.000 n=10+9) Count/32-6 112MB/s ± 0% 842MB/s ± 0% +652.84% (p=0.000 n=10+10) Count/4K-6 81.8MB/s ± 0% 927.6MB/s ± 0% +1033.63% (p=0.000 n=10+9) Count/4M-6 87.2MB/s ± 1% 924.0MB/s ± 0% +959.25% (p=0.000 n=10+9) Count/64M-6 85.6MB/s ± 0% 922.9MB/s ± 0% +978.31% (p=0.000 n=10+10) CountEasy/10-6 352MB/s ± 0% 322MB/s ± 0% -8.41% (p=0.000 n=10+10) CountEasy/32-6 1.05GB/s ± 0% 0.87GB/s ± 0% -17.35% (p=0.000 n=9+10) CountEasy/4K-6 22.0GB/s ± 0% 20.6GB/s ± 0% -6.33% (p=0.000 n=10+10) CountEasy/4M-6 18.0GB/s ± 2% 18.0GB/s ± 2% ~ (p=0.912 n=10+10) CountEasy/64M-6 10.0GB/s ± 0% 10.0GB/s ± 1% ~ (p=0.762 n=8+10) On 386, without asm version of Index: Count/10-6 57.0ns ± 0% 56.9ns ± 0% -0.11% (p=0.006 n=10+9) Count/32-6 340ns ± 0% 274ns ± 0% -19.48% (p=0.000 n=10+9) Count/4K-6 49.5µs ± 0% 37.1µs ± 0% -24.96% (p=0.000 n=10+10) Count/4M-6 51.1ms ± 0% 38.2ms ± 0% -25.21% (p=0.000 n=10+10) Count/64M-6 818ms ± 0% 613ms ± 0% -25.07% (p=0.000 n=8+10) CountEasy/10-6 60.0ns ± 0% 70.4ns ± 0% +17.34% (p=0.000 n=10+10) CountEasy/32-6 81.1ns ± 0% 94.0ns ± 0% +15.97% (p=0.000 n=9+10) CountEasy/4K-6 4.37µs ± 0% 4.39µs ± 0% +0.30% (p=0.000 n=10+9) CountEasy/4M-6 4.43ms ± 0% 4.43ms ± 0% ~ (p=0.579 n=10+10) CountEasy/64M-6 70.9ms ± 0% 70.9ms ± 0% ~ (p=0.912 n=10+10) name old speed new speed delta Count/10-6 176MB/s ± 0% 176MB/s ± 0% +0.10% (p=0.000 n=10+9) Count/32-6 93.9MB/s ± 0% 116.5MB/s ± 0% +24.06% (p=0.000 n=10+9) Count/4K-6 82.7MB/s ± 0% 110.3MB/s ± 0% +33.26% (p=0.000 n=10+10) Count/4M-6 82.1MB/s ± 0% 109.7MB/s ± 0% +33.70% (p=0.000 n=10+10) Count/64M-6 82.0MB/s ± 0% 109.5MB/s ± 0% +33.46% (p=0.000 n=8+10) CountEasy/10-6 167MB/s ± 0% 142MB/s ± 0% -14.75% (p=0.000 n=9+10) CountEasy/32-6 395MB/s ± 0% 340MB/s ± 0% -13.77% (p=0.000 n=10+10) CountEasy/4K-6 936MB/s ± 0% 934MB/s ± 0% -0.29% (p=0.000 n=10+9) CountEasy/4M-6 947MB/s ± 0% 946MB/s ± 0% ~ (p=0.591 n=10+10) CountEasy/64M-6 947MB/s ± 0% 947MB/s ± 0% ~ (p=0.867 n=10+10) Change-Id: Ia76b247372b6f5b5d23a9f10253a86536a5153b3 Reviewed-on: https://go-review.googlesource.com/36489 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-01-07all: fix misspellingsshawnps
Change-Id: I429637ca91f7db4144f17621de851a548dc1ce76 Reviewed-on: https://go-review.googlesource.com/34923 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-11-04all: make copyright headers consistent with one space after periodMichael Munday
Continuation of CL 20111. Change-Id: Ie2f62237e6ec316989c021de9b267cc9d6ee6676 Reviewed-on: https://go-review.googlesource.com/32830 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-11-02bytes, strings: update s390x code to match amd64 changesMichael Munday
Updates the s390x-specific files in these packages with the changes to the amd64-specific files made during the review of CL 31690. I'd like to keep these files in sync unless there is a reason to diverge. Change-Id: Id83e5ce11a45f877bdcc991d02b14416d1a2d8d2 Reviewed-on: https://go-review.googlesource.com/32574 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-11-01bytes,strings: use IndexByte more often in Index on AMD64Ilya Tocar
IndexByte+compare is faster than indexShortStr in good case, when first byte is rare, but is more costly in bad cases. Start with IndexByte and switch to indexShortStr if we encounter false positives more often than once per 8 bytes. Benchmark changes for package bytes: IndexRune/4K-8 416ns ± 0% 86ns ± 0% -79.24% (p=0.000 n=10+10) IndexRune/4M-8 413µs ± 0% 100µs ± 1% -75.88% (p=0.000 n=10+10) IndexRune/64M-8 6.73ms ± 0% 2.86ms ± 1% -57.49% (p=0.000 n=10+10) Index/10-8 8.45ns ± 0% 8.96ns ± 0% +6.04% (p=0.000 n=9+10) Index/32-8 9.64ns ± 0% 9.51ns ± 0% -1.30% (p=0.000 n=8+9) Index/4K-8 2.11µs ± 0% 2.12µs ± 0% +0.26% (p=0.000 n=10+10) Index/4M-8 3.60ms ± 5% 3.59ms ± 7% ~ (p=0.497 n=9+10) Index/64M-8 57.1ms ± 3% 58.7ms ± 5% ~ (p=0.113 n=9+10) IndexEasy/10-8 7.10ns ± 1% 7.71ns ± 1% +8.60% (p=0.000 n=10+10) IndexEasy/32-8 9.29ns ± 1% 9.22ns ± 0% -0.75% (p=0.000 n=9+10) IndexEasy/4K-8 1.06µs ± 0% 0.08µs ± 0% -92.18% (p=0.000 n=10+10) IndexEasy/4M-8 1.07ms ± 0% 0.10ms ± 1% -90.74% (p=0.000 n=9+10) IndexEasy/64M-8 17.3ms ± 0% 2.8ms ± 1% -83.76% (p=0.000 n=10+9) IndexRune/4K-8 9.84GB/s ± 0% 47.42GB/s ± 0% +381.85% (p=0.000 n=8+10) IndexRune/4M-8 10.1GB/s ± 0% 42.1GB/s ± 1% +314.56% (p=0.000 n=10+10) IndexRune/64M-8 10.0GB/s ± 0% 23.4GB/s ± 1% +135.25% (p=0.000 n=10+10) Index/10-8 1.18GB/s ± 0% 1.12GB/s ± 0% -5.67% (p=0.000 n=10+9) Index/32-8 3.32GB/s ± 0% 3.36GB/s ± 0% +1.27% (p=0.000 n=10+9) Index/4K-8 1.94GB/s ± 0% 1.93GB/s ± 0% -0.25% (p=0.000 n=10+9) Index/4M-8 1.17GB/s ± 5% 1.17GB/s ± 7% ~ (p=0.497 n=9+10) Index/64M-8 1.17GB/s ± 3% 1.15GB/s ± 6% ~ (p=0.113 n=9+10) IndexEasy/10-8 1.41GB/s ± 1% 1.30GB/s ± 1% -7.90% (p=0.000 n=10+10) IndexEasy/32-8 3.45GB/s ± 1% 3.47GB/s ± 0% +0.73% (p=0.000 n=9+10) IndexEasy/4K-8 3.84GB/s ± 0% 49.16GB/s ± 0% +1178.78% (p=0.000 n=9+10) IndexEasy/4M-8 3.91GB/s ± 0% 42.19GB/s ± 1% +980.37% (p=0.000 n=9+10) IndexEasy/64M-8 3.88GB/s ± 0% 23.91GB/s ± 1% +515.76% (p=0.000 n=10+9) No significant changes in strings. In regexp I see: Match/Easy0/32-8 536MB/s ± 1% 540MB/s ± 1% +0.75% (p=0.001 n=9+10) Match/Easy0/1K-8 1.62GB/s ± 0% 4.42GB/s ± 1% +172.48% (p=0.000 n=10+10) Match/Easy0/32K-8 1.87GB/s ± 0% 9.07GB/s ± 1% +384.24% (p=0.000 n=7+10) Match/Easy0/1M-8 1.90GB/s ± 0% 4.83GB/s ± 0% +154.56% (p=0.000 n=8+10) Match/Easy0/32M-8 1.90GB/s ± 0% 4.53GB/s ± 0% +138.62% (p=0.000 n=7+10) Compared to in 1.7: Match/Easy0/32-8 59.5ns ± 0% 59.2ns ± 1% -0.45% (p=0.008 n=9+10) Match/Easy0/1K-8 226ns ± 1% 231ns ± 1% +2.30% (p=0.000 n=10+10) Match/Easy0/32K-8 3.73µs ± 2% 3.61µs ± 1% -3.12% (p=0.000 n=10+10) Match/Easy0/1M-8 206µs ± 1% 217µs ± 0% +5.34% (p=0.000 n=10+10) Match/Easy0/32M-8 7.03ms ± 1% 7.40ms ± 0% +5.23% (p=0.000 n=10+10) Fixes #17456 Change-Id: I38b2fabcaed7119cc4bf37007ba7bfe7504c8f9f Reviewed-on: https://go-review.googlesource.com/31690 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> Reviewed-by: Keith Randall <khr@golang.org>
2016-11-01bytes, strings: optimize multi-byte index operations on s390xMichael Munday
Use vector instructions to speed up indexing operations for short strings (64 bytes or less). bytes_s390x.go and strings_s390x.go are based on their amd64 equivalents in CL 31690. bytes package: name old time/op new time/op delta Index/10 40.3ns ± 7% 11.3ns ± 4% -72.06% (p=0.000 n=10+10) Index/32 196ns ± 1% 27ns ± 2% -86.25% (p=0.000 n=10+10) Index/4K 28.9µs ± 1% 1.5µs ± 2% -94.94% (p=0.000 n=9+9) Index/4M 30.1ms ± 2% 1.5ms ± 3% -94.94% (p=0.000 n=10+10) Index/64M 549ms ±13% 28ms ± 3% -94.87% (p=0.000 n=10+9) IndexEasy/10 18.8ns ±11% 11.5ns ± 2% -38.81% (p=0.000 n=10+10) IndexEasy/32 23.6ns ± 6% 28.1ns ± 3% +19.29% (p=0.000 n=10+10) IndexEasy/4K 251ns ± 5% 223ns ± 8% -11.04% (p=0.000 n=10+10) IndexEasy/4M 318µs ± 9% 266µs ± 8% -16.42% (p=0.000 n=10+10) IndexEasy/64M 14.7ms ±16% 13.2ms ±11% -10.22% (p=0.001 n=10+10) strings package: name old time/op new time/op delta IndexRune 88.1ns ±16% 28.9ns ± 4% -67.20% (p=0.000 n=10+10) IndexRuneLongString 456ns ± 7% 34ns ± 3% -92.50% (p=0.000 n=10+10) IndexRuneFastPath 12.9ns ±14% 11.1ns ± 6% -13.84% (p=0.000 n=10+10) Index 13.0ns ± 7% 11.3ns ± 4% -13.31% (p=0.000 n=10+10) IndexHard1 3.38ms ± 9% 0.07ms ± 1% -97.79% (p=0.000 n=10+10) IndexHard2 3.58ms ± 7% 0.37ms ± 2% -89.78% (p=0.000 n=10+10) IndexHard3 3.47ms ± 7% 0.75ms ± 1% -78.52% (p=0.000 n=10+10) IndexHard4 3.56ms ± 6% 1.34ms ± 0% -62.39% (p=0.000 n=9+9) Change-Id: If36c2afb8c02e80fcaa1cf5ec2abb0a2be08c7d1 Reviewed-on: https://go-review.googlesource.com/32447 Run-TryBot: Michael Munday <munday@ca.ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-28bytes, strings: optimize for ASCII setsJoe Tsai
In a large codebase within Google, there are thousands of uses of: ContainsAny|IndexAny|LastIndexAny|Trim|TrimLeft|TrimRight An analysis of their usage shows that over 97% of them only use character sets consisting of only ASCII symbols. Uses of ContainsAny|IndexAny|LastIndexAny: 6% are 1 character (e.g., "\n" or " ") 58% are 2-4 characters (e.g., "<>" or "\r\n\t ") 24% are 5-9 characters (e.g., "()[]*^$") 10% are 10+ characters (e.g., "+-=&|><!(){}[]^\"~*?:\\/ ") We optimize for ASCII sets, which are commonly used to search for "control" characters in some string. We don't optimize for the single character scenario since IndexRune or IndexByte could be used. Uses of Trim|TrimLeft|TrimRight: 71% are 1 character (e.g., "\n" or " ") 14% are 2 characters (e.g., "\r\n") 10% are 3-4 characters (e.g., " \t\r\n") 5% are 10+ characters (e.g., "0123456789abcdefABCDEF") We optimize for the single character case with a simple closured function that only checks for that character's value. We optimize for the medium and larger sets using a 16-byte bit-map representing a set of ASCII characters. The benchmarks below have the following suffix name "%d:%d" where the first number is the length of the input and the second number is the length of the charset. == bytes package == benchmark old ns/op new ns/op delta BenchmarkIndexAnyASCII/1:1-4 5.09 5.23 +2.75% BenchmarkIndexAnyASCII/1:2-4 5.81 5.85 +0.69% BenchmarkIndexAnyASCII/1:4-4 7.22 7.50 +3.88% BenchmarkIndexAnyASCII/1:8-4 11.0 11.1 +0.91% BenchmarkIndexAnyASCII/1:16-4 17.5 17.8 +1.71% BenchmarkIndexAnyASCII/16:1-4 36.0 34.0 -5.56% BenchmarkIndexAnyASCII/16:2-4 46.6 36.5 -21.67% BenchmarkIndexAnyASCII/16:4-4 78.0 40.4 -48.21% BenchmarkIndexAnyASCII/16:8-4 136 47.4 -65.15% BenchmarkIndexAnyASCII/16:16-4 254 61.5 -75.79% BenchmarkIndexAnyASCII/256:1-4 542 388 -28.41% BenchmarkIndexAnyASCII/256:2-4 705 382 -45.82% BenchmarkIndexAnyASCII/256:4-4 1089 386 -64.55% BenchmarkIndexAnyASCII/256:8-4 1994 394 -80.24% BenchmarkIndexAnyASCII/256:16-4 3843 411 -89.31% BenchmarkIndexAnyASCII/4096:1-4 8522 5873 -31.08% BenchmarkIndexAnyASCII/4096:2-4 11253 5861 -47.92% BenchmarkIndexAnyASCII/4096:4-4 17824 5883 -66.99% BenchmarkIndexAnyASCII/4096:8-4 32053 5871 -81.68% BenchmarkIndexAnyASCII/4096:16-4 60512 5888 -90.27% BenchmarkTrimASCII/1:1-4 79.5 70.8 -10.94% BenchmarkTrimASCII/1:2-4 79.0 105 +32.91% BenchmarkTrimASCII/1:4-4 79.6 109 +36.93% BenchmarkTrimASCII/1:8-4 78.8 118 +49.75% BenchmarkTrimASCII/1:16-4 80.2 132 +64.59% BenchmarkTrimASCII/16:1-4 243 116 -52.26% BenchmarkTrimASCII/16:2-4 243 171 -29.63% BenchmarkTrimASCII/16:4-4 243 176 -27.57% BenchmarkTrimASCII/16:8-4 241 184 -23.65% BenchmarkTrimASCII/16:16-4 238 199 -16.39% BenchmarkTrimASCII/256:1-4 2580 840 -67.44% BenchmarkTrimASCII/256:2-4 2603 1175 -54.86% BenchmarkTrimASCII/256:4-4 2572 1188 -53.81% BenchmarkTrimASCII/256:8-4 2550 1191 -53.29% BenchmarkTrimASCII/256:16-4 2585 1208 -53.27% BenchmarkTrimASCII/4096:1-4 39773 12181 -69.37% BenchmarkTrimASCII/4096:2-4 39946 17231 -56.86% BenchmarkTrimASCII/4096:4-4 39641 17179 -56.66% BenchmarkTrimASCII/4096:8-4 39835 17175 -56.88% BenchmarkTrimASCII/4096:16-4 40229 17215 -57.21% == strings package == benchmark old ns/op new ns/op delta BenchmarkIndexAnyASCII/1:1-4 5.94 4.97 -16.33% BenchmarkIndexAnyASCII/1:2-4 5.94 5.55 -6.57% BenchmarkIndexAnyASCII/1:4-4 7.45 7.21 -3.22% BenchmarkIndexAnyASCII/1:8-4 10.8 10.6 -1.85% BenchmarkIndexAnyASCII/1:16-4 17.4 17.2 -1.15% BenchmarkIndexAnyASCII/16:1-4 36.4 32.2 -11.54% BenchmarkIndexAnyASCII/16:2-4 49.6 34.6 -30.24% BenchmarkIndexAnyASCII/16:4-4 77.5 37.9 -51.10% BenchmarkIndexAnyASCII/16:8-4 138 45.5 -67.03% BenchmarkIndexAnyASCII/16:16-4 241 59.1 -75.48% BenchmarkIndexAnyASCII/256:1-4 509 378 -25.74% BenchmarkIndexAnyASCII/256:2-4 720 381 -47.08% BenchmarkIndexAnyASCII/256:4-4 1142 384 -66.37% BenchmarkIndexAnyASCII/256:8-4 1999 391 -80.44% BenchmarkIndexAnyASCII/256:16-4 3735 403 -89.21% BenchmarkIndexAnyASCII/4096:1-4 7973 5824 -26.95% BenchmarkIndexAnyASCII/4096:2-4 11432 5809 -49.19% BenchmarkIndexAnyASCII/4096:4-4 18327 5819 -68.25% BenchmarkIndexAnyASCII/4096:8-4 33059 5828 -82.37% BenchmarkIndexAnyASCII/4096:16-4 59703 5817 -90.26% BenchmarkTrimASCII/1:1-4 71.9 71.8 -0.14% BenchmarkTrimASCII/1:2-4 73.3 103 +40.52% BenchmarkTrimASCII/1:4-4 71.8 106 +47.63% BenchmarkTrimASCII/1:8-4 71.2 113 +58.71% BenchmarkTrimASCII/1:16-4 71.6 128 +78.77% BenchmarkTrimASCII/16:1-4 152 116 -23.68% BenchmarkTrimASCII/16:2-4 160 168 +5.00% BenchmarkTrimASCII/16:4-4 172 170 -1.16% BenchmarkTrimASCII/16:8-4 200 177 -11.50% BenchmarkTrimASCII/16:16-4 254 193 -24.02% BenchmarkTrimASCII/256:1-4 1438 864 -39.92% BenchmarkTrimASCII/256:2-4 1551 1195 -22.95% BenchmarkTrimASCII/256:4-4 1770 1200 -32.20% BenchmarkTrimASCII/256:8-4 2195 1216 -44.60% BenchmarkTrimASCII/256:16-4 3054 1224 -59.92% BenchmarkTrimASCII/4096:1-4 21726 12557 -42.20% BenchmarkTrimASCII/4096:2-4 23586 17508 -25.77% BenchmarkTrimASCII/4096:4-4 26898 17510 -34.90% BenchmarkTrimASCII/4096:8-4 33714 17595 -47.81% BenchmarkTrimASCII/4096:16-4 47429 17700 -62.68% The benchmarks added test the worst case. For IndexAny, that is when the charset matches none of the input. For Trim, it is when the charset matches all of the input. Change-Id: I970874d101a96b33528fc99b165379abe58cf6ea Reviewed-on: https://go-review.googlesource.com/31593 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Martin Möhrmann <martisch@uos.de>