| field | value | date |
|---|---|---|
| author | Archana R <aravind5@in.ibm.com> | 2022-01-10 02:15:25 -0600 |
| committer | Lynn Boger <laboger@linux.vnet.ibm.com> | 2022-05-09 12:02:02 +0000 |
| commit | 7ca1e2aa56f23b1766e8f5f65c431d18abd58985 | |
| tree | a436c44b2667fd54620e5ea1be442bf9c2114917 /src/bytes/bytes_test.go | |
| parent | c386269ed8746304b219d5be7d673539ae1e2643 | |
internal/bytealg: optimize index function for ppc64le/power9
Optimize the index2to16 loop by unrolling it by a factor of 4.
Benchmarks show a performance improvement on POWER9 for most
input sizes. Similar improvements are seen on POWER10. Add
tests to verify the change.
| name | old time/op | new time/op | delta |
|---|---|---|---|
| Index/10 | 18.3ns ± 0% | 19.7ns ±25% | ~ |
| Index/32 | 75.3ns ± 0% | 69.2ns ± 0% | -8.22% |
| Index/4K | 5.53µs ± 0% | 3.69µs ± 0% | -33.20% |
| Index/4M | 5.64ms ± 0% | 3.75ms ± 0% | -33.55% |
| Index/64M | 92.9ms ± 0% | 61.6ms ± 0% | -33.69% |
| IndexHard2 | 1.41ms ± 0% | 0.93ms ± 0% | -33.75% |
| CountHard2 | 1.41ms ± 0% | 0.93ms ± 0% | -33.75% |
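
For readers unfamiliar with the technique, the following is a minimal Go sketch of unrolling a substring search loop by 4. It is not the ppc64le assembly changed by this commit, which is not shown on this page. The names `indexUnrolled4` and `matchAt` are hypothetical and exist only for illustration. The idea is that each pass of the main loop tests four consecutive starting positions, so the loop-control overhead is paid once per four candidates, and a short tail loop handles the leftover positions.

```go
package main

import "fmt"

// matchAt reports whether sep occurs in s starting at offset i.
// The caller must ensure i+len(sep) <= len(s).
// Hypothetical helper, introduced only for this sketch.
func matchAt(s, sep []byte, i int) bool {
	for j := range sep {
		if s[i+j] != sep[j] {
			return false
		}
	}
	return true
}

// indexUnrolled4 returns the index of the first occurrence of sep in s,
// or -1 if sep is not present. Hypothetical illustration of 4x unrolling;
// it does not mirror the actual index2to16 assembly routine.
func indexUnrolled4(s, sep []byte) int {
	n := len(sep)
	if n == 0 {
		return 0
	}
	if n > len(s) {
		return -1
	}
	last := len(s) - n // last valid starting position
	i := 0
	// Main loop: four candidate positions per iteration.
	for ; i+3 <= last; i += 4 {
		if matchAt(s, sep, i) {
			return i
		}
		if matchAt(s, sep, i+1) {
			return i + 1
		}
		if matchAt(s, sep, i+2) {
			return i + 2
		}
		if matchAt(s, sep, i+3) {
			return i + 3
		}
	}
	// Tail loop: at most three remaining positions.
	for ; i <= last; i++ {
		if matchAt(s, sep, i) {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(indexUnrolled4([]byte("fofofofooofoboo"), []byte("oo")))
	fmt.Println(indexUnrolled4([]byte("fofofofofoofofoffofoobarfoo"), []byte("foobars")))
}
```

The two calls in `main` reuse inputs from the test cases added by this commit. A match can begin at any offset within a group of four, which is presumably why the new tests place matches at many different starting indexes.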
Change-Id: If9331df6a141a4716724b8cb648d2b91bdf17e5f
Reviewed-on: https://go-review.googlesource.com/c/go/+/377016
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Archana Ravindar <aravind5@in.ibm.com>
Diffstat (limited to 'src/bytes/bytes_test.go')
| -rw-r--r-- | src/bytes/bytes_test.go | 30 |
1 files changed, 30 insertions, 0 deletions
diff --git a/src/bytes/bytes_test.go b/src/bytes/bytes_test.go
index b702efb239..985aa0b147 100644
--- a/src/bytes/bytes_test.go
+++ b/src/bytes/bytes_test.go
@@ -140,6 +140,36 @@ var indexTests = []BinOpTest{
 	{"abc", "c", 2},
 	{"abc", "x", -1},
 	{"barfoobarfooyyyzzzyyyzzzyyyzzzyyyxxxzzzyyy", "x", 33},
+	{"fofofofooofoboo", "oo", 7},
+	{"fofofofofofoboo", "ob", 11},
+	{"fofofofofofoboo", "boo", 12},
+	{"fofofofofofoboo", "oboo", 11},
+	{"fofofofofoooboo", "fooo", 8},
+	{"fofofofofofoboo", "foboo", 10},
+	{"fofofofofofoboo", "fofob", 8},
+	{"fofofofofofofoffofoobarfoo", "foffof", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffof", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofo", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofo", 13},
+	{"fofofofofoofofoffofoobarfoo", "foffofoo", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoo", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofoob", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoob", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofooba", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofooba", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofoobar", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoobar", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofoobarf", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoobarf", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofoobarfo", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoobarfo", 12},
+	{"fofofofofoofofoffofoobarfoo", "foffofoobarfoo", 13},
+	{"fofofofofofofoffofoobarfoo", "foffofoobarfoo", 12},
+	{"fofofofofoofofoffofoobarfoo", "ofoffofoobarfoo", 12},
+	{"fofofofofofofoffofoobarfoo", "ofoffofoobarfoo", 11},
+	{"fofofofofoofofoffofoobarfoo", "fofoffofoobarfoo", 11},
+	{"fofofofofofofoffofoobarfoo", "fofoffofoobarfoo", 10},
+	{"fofofofofoofofoffofoobarfoo", "foobars", -1},
 	{"foofyfoobarfoobar", "y", 4},
 	{"oooooooooooooooooooooo", "r", -1},
 	{"oxoxoxoxoxoxoxoxoxoxoxoy", "oy", 22},
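
Each added entry is a (haystack, separator, expected index) triple. As a rough sketch of how such a table is exercised, the standalone program below checks a few of the new cases against `bytes.Index`. The `binOpTest` type, its field names, and the `checkIndex` helper are assumptions inferred from the shape of the literals in the diff. The real `BinOpTest` definition and the test driver are outside this hunk.

```go
package main

import (
	"bytes"
	"fmt"
)

// binOpTest mirrors the shape of the entries in the diff: two input
// strings and the expected result. Field names are assumed.
type binOpTest struct {
	a, b string
	i    int
}

// sampleTests holds a few of the cases added by this commit, copied verbatim.
var sampleTests = []binOpTest{
	{"fofofofooofoboo", "oo", 7},
	{"fofofofofofoboo", "foboo", 10},
	{"fofofofofofofoffofoobarfoo", "foffof", 12},
	{"fofofofofoofofoffofoobarfoo", "foffof", 13},
	{"fofofofofoofofoffofoobarfoo", "fofoffofoobarfoo", 11},
	{"fofofofofoofofoffofoobarfoo", "foobars", -1},
}

// checkIndex runs bytes.Index over every case and returns a description
// of each mismatch. Hypothetical helper for this sketch.
func checkIndex(tests []binOpTest) []string {
	var failures []string
	for _, tt := range tests {
		got := bytes.Index([]byte(tt.a), []byte(tt.b))
		if got != tt.i {
			failures = append(failures,
				fmt.Sprintf("Index(%q, %q) = %d, want %d", tt.a, tt.b, got, tt.i))
		}
	}
	return failures
}

func main() {
	failures := checkIndex(sampleTests)
	for _, f := range failures {
		fmt.Println("FAIL:", f)
	}
	fmt.Printf("%d cases checked, %d failures\n", len(sampleTests), len(failures))
}
```

The separators in the new cases range from 2 to 16 bytes, which matches the range suggested by the `index2to16` name in the commit message.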
