| Age | Commit message (Collapse) | Author |
|
Copying the loop variable is no longer necessary since Go 1.22.
Change-Id: Iebb21dac44a20ec200567f1d786f105a4ee4999d
Reviewed-on: https://go-review.googlesource.com/c/go/+/711640
Reviewed-by: Florian Lehner <lehner.florian86@gmail.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This lets the inliner do a better job optimizing the mapKeyError call.
goos: linux
goarch: amd64
pkg: runtime
cpu: AMD Ryzen 5 4600G with Radeon Graphics
│ /tmp/before2 │ /tmp/after3 │
│ sec/op │ sec/op vs base │
MapAccessZero/Key=int64-12 1.875n ± 0% 1.875n ± 0% ~ (p=0.506 n=25)
MapAccessZero/Key=int32-12 1.875n ± 0% 1.875n ± 0% ~ (p=0.082 n=25)
MapAccessZero/Key=string-12 1.902n ± 1% 1.902n ± 1% ~ (p=0.256 n=25)
MapAccessZero/Key=mediumType-12 2.816n ± 0% 1.958n ± 0% -30.47% (p=0.000 n=25)
MapAccessZero/Key=bigType-12 2.815n ± 0% 1.935n ± 0% -31.26% (p=0.000 n=25)
MapAccessEmpty/Key=int64-12 1.942n ± 0% 2.109n ± 0% +8.60% (p=0.000 n=25)
MapAccessEmpty/Key=int32-12 2.110n ± 0% 1.940n ± 0% -8.06% (p=0.000 n=25)
MapAccessEmpty/Key=string-12 2.024n ± 0% 2.109n ± 0% +4.20% (p=0.000 n=25)
MapAccessEmpty/Key=mediumType-12 3.157n ± 0% 2.344n ± 0% -25.75% (p=0.000 n=25)
MapAccessEmpty/Key=bigType-12 3.054n ± 0% 2.115n ± 0% -30.75% (p=0.000 n=25)
geomean 2.305n 2.011n -12.75%
Change-Id: Iee83930884dc4c8a791a711aa189a1c93b68d536
Reviewed-on: https://go-review.googlesource.com/c/go/+/663495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
On master, lookups on small Swiss Table maps (<= 8 elements) for
non-specialized key types are seemingly a performance regression
compared to the Go 1.23 map implementation (reported in #70849).
Currently, a linear scan is used for gets in these cases.
This CL changes (*Map).getWithKeySmall to instead use the SIMD or SWAR
match on the control bytes to then jump to candidate matching slots,
with sample results below for a 16-byte key. This especially helps the
hit case when the key is unpredictable, which previously had to scan an
unpredictable number of control bytes to find a candidate slot when the
key is unpredictable.
Separately, other CLs in this stack modify the main Swiss Table
benchmarks to randomize lookup key order (vs. previously most of the
benchmarks had a repeating lookup key ordering, which likely is
predictable until the map is too big). We have sample results for the
randomized key order benchmarks followed by results from the older
benchmarks.
The first table below is with randomized key order. For hits, the older
results get slower as there are more elements. With this CL, we see hits
for unpredictable key ordering (sizes 2-8) get a ~1.7x speedup from
~25ns to ~14ns, with a now consistent lookup time for the different
sizes. (The 1 element size map has a predictable key ordering because
there is only one key, and that reports a modest ~0.5ns or ~3%
performance penalty). Misses for unpredictable key order get a ~1.3x
speedup, from ~13ns to ~10ns, with similar results for the 1 element
size.
│ no-fix-new-bmarks │ fix-with-new-bmarks │
│ sec/op │ sec/op vs base │
MapSmallAccessHit/Key=smallType/Elem=int32/len=1-4 13.26n ± 0% 13.64n ± 0% +2.90% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=2-4 19.47n ± 0% 13.62n ± 0% -30.05% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=3-4 22.23n ± 0% 13.64n ± 0% -38.68% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=4-4 23.98n ± 0% 13.64n ± 0% -43.11% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=5-4 25.02n ± 0% 13.67n ± 0% -45.35% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=6-4 25.77n ± 1% 13.68n ± 2% -46.89% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=7-4 26.38n ± 0% 13.64n ± 0% -48.28% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=8-4 26.31n ± 0% 13.71n ± 21% -47.90% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=1-4 13.055n ± 0% 9.815n ± 0% -24.82% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=2-4 13.070n ± 0% 9.813n ± 0% -24.92% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=3-4 13.060n ± 0% 9.819n ± 0% -24.82% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=4-4 13.075n ± 0% 9.816n ± 0% -24.92% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=5-4 13.060n ± 0% 9.826n ± 0% -24.76% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=6-4 13.095n ± 19% 9.834n ± 31% -24.90% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=7-4 13.075n ± 19% 9.822n ± 27% -24.88% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=8-4 13.11n ± 16% 12.14n ± 19% -7.43% (p=0.000 n=20)
The next table uses the original benchmarks from just before this CL
stack (i.e., without shuffling lookup keys).
With this CL, we see improvement that is directionally similar to the
above results but not as large, presumably because the branches in the
linear scan are fairly predictable with predictable keys. (The numbers
here also include the time from a mod in the benchmark code, which
seemed to take around ~1/3 of CPU time based on spot checking a couple
of examples, vs. the modified benchmarks shown above have removed that
mod).
│ master-8c3e391573 │ just-fix-with-old-bmarks │
│ sec/op │ sec/op vs base │
MapSmallAccessHit/Key=smallType/Elem=int32/len=1-4 20.85n ± 0% 21.69n ± 0% +4.03% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=2-4 21.22n ± 0% 21.70n ± 0% +2.24% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=3-4 21.73n ± 0% 21.71n ± 0% ~ (p=0.158 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=4-4 22.06n ± 0% 21.71n ± 0% -1.56% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=5-4 22.41n ± 0% 21.73n ± 0% -3.01% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=6-4 22.71n ± 0% 21.72n ± 0% -4.38% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=7-4 22.98n ± 0% 21.71n ± 0% -5.53% (p=0.000 n=20)
MapSmallAccessHit/Key=smallType/Elem=int32/len=8-4 23.20n ± 0% 21.72n ± 0% -6.36% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=1-4 19.95n ± 0% 17.30n ± 0% -13.28% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=2-4 19.96n ± 0% 17.31n ± 0% -13.28% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=3-4 19.95n ± 0% 17.29n ± 0% -13.33% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=4-4 19.95n ± 0% 17.30n ± 0% -13.29% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=5-4 19.96n ± 25% 17.32n ± 0% -13.22% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=6-4 19.99n ± 24% 17.29n ± 0% -13.51% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=7-4 19.97n ± 20% 17.34n ± 16% -13.14% (p=0.000 n=20)
MapSmallAccessMiss/Key=smallType/Elem=int32/len=8-4 20.02n ± 11% 17.33n ± 14% -13.44% (p=0.000 n=20)
geomean 21.02n 19.39n -7.78%
See #70849 for additional benchmark results, including results for arm64
(which also means without SIMD support).
Updates #54766
Updates #70700
Fixes #70849
Change-Id: Ic2361bb6fc15b4436d1d1d5be7e4712e547f611b
Reviewed-on: https://go-review.googlesource.com/c/go/+/634396
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It wasn't actually testing what it says it was testing.
A random permutation isn't cyclic. It only probably hits a few
elements before entering a cycle.
Use an algorithm that generates a random cyclic permutation instead.
Fixing the test makes the previous CL look less good. But it still helps.
(Theory: Fixing the test makes it less cache friendly, so there are
more misses all around. That makes the benchmark slower, suppressing
the differences seen. Also fixing the benchmark makes the loop
iteration count less predictable, which hurts the raw loop
implementation somewhat.)
(baseline = tip, experiment = tip+previous CL, noswiss = GOEXPERIMENT=noswissmap)
goos: darwin
goarch: arm64
pkg: runtime
cpu: Apple M2 Ultra
│ baseline │ experiment │
│ sec/op │ sec/op vs base │
MapCycle-24 20.59n ± 4% 18.99n ± 3% -7.77% (p=0.000 n=10)
khr@Mac-Studio src % benchstat noswiss experiment
goos: darwin
goarch: arm64
pkg: runtime
cpu: Apple M2 Ultra
│ noswiss │ experiment │
│ sec/op │ sec/op vs base │
MapCycle-24 16.12n ± 1% 18.99n ± 3% +17.83% (p=0.000 n=10)
Change-Id: I3a4edb814ba97fec020a6698c535ce3a87a9fc67
Reviewed-on: https://go-review.googlesource.com/c/go/+/625900
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
We don't really need the index of the slot we're looking at.
Just keep looking until there are no more filled slots.
This particularly helps when there are only a few filled entries
(packed at the bottom), and we're looking for something that isn't
there. We exit earlier than we would otherwise.
goos: darwin
goarch: arm64
pkg: runtime
cpu: Apple M2 Ultra
│ baseline │ experiment │
│ sec/op │ sec/op vs base │
MapSmallAccessHit/Key=int64/Elem=int64/len=1-24 2.759n ± 0% 2.779n ± 2% ~ (p=0.055 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=2-24 2.862n ± 1% 2.922n ± 1% +2.08% (p=0.000 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=3-24 3.003n ± 0% 3.061n ± 1% +1.91% (p=0.000 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=4-24 3.170n ± 1% 3.188n ± 1% +0.57% (p=0.030 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=5-24 3.387n ± 1% 3.391n ± 1% ~ (p=0.362 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=6-24 3.601n ± 1% 3.584n ± 0% -0.49% (p=0.009 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=7-24 3.785n ± 1% 3.778n ± 3% ~ (p=0.987 n=10)
MapSmallAccessHit/Key=int64/Elem=int64/len=8-24 3.960n ± 1% 3.946n ± 1% ~ (p=0.256 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=0-24 2.004n ± 1%
MapSmallAccessMiss/Key=int64/Elem=int64/len=1-24 5.145n ± 1% 2.411n ± 1% -53.14% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=2-24 5.128n ± 0% 3.313n ± 1% -35.40% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=3-24 5.159n ± 1% 3.690n ± 1% -28.48% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=4-24 5.117n ± 1% 4.466n ± 6% -12.73% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=5-24 5.115n ± 1% 4.308n ± 1% -15.79% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=6-24 5.111n ± 1% 4.538n ± 2% -11.19% (p=0.000 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=7-24 4.896n ± 4% 4.831n ± 1% -1.33% (p=0.001 n=10)
MapSmallAccessMiss/Key=int64/Elem=int64/len=8-24 4.905n ± 1% 5.121n ± 1% +4.40% (p=0.000 n=10)
geomean 3.917n 3.631n -11.11%
Change-Id: Ife26ac457a513af24fa0921b839ee6cd5fed6fba
Reviewed-on: https://go-review.googlesource.com/c/go/+/627717
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
For large strings, do a quick equality check on all the slots.
Only if more than one passes the quick equality check do we
resort to hashing.
│ baseline │ experiment │
│ sec/op │ sec/op vs base │
MegMap-24 16609.50n ± 1% 13.91n ± 3% -99.92% (p=0.000 n=10)
MegOneMap-24 16655.00n ± 0% 12.27n ± 1% -99.93% (p=0.000 n=10)
MegEqMap-24 41.31µ ± 1% 25.03µ ± 1% -39.40% (p=0.000 n=10)
MegEmptyMap-24 2.034n ± 0% 2.027n ± 2% ~ (p=0.541 n=10)
MegEmptyMapWithInterfaceKey-24 5.931n ± 2% 5.599n ± 1% -5.60% (p=0.000 n=10)
MapStringKeysEight_16-24 8.473n ± 7% 8.224n ± 5% ~ (p=0.315 n=10)
MapStringKeysEight_32-24 8.441n ± 2% 8.147n ± 1% -3.48% (p=0.002 n=10)
MapStringKeysEight_64-24 8.769n ± 1% 8.517n ± 1% -2.87% (p=0.000 n=10)
MapStringKeysEight_128-24 10.73n ± 4% 13.57n ± 8% +26.57% (p=0.000 n=10)
MapStringKeysEight_256-24 12.97n ± 2% 14.35n ± 4% +10.64% (p=0.001 n=10)
MapStringKeysEight_1M-24 17359.50n ± 3% 13.92n ± 4% -99.92% (p=0.000 n=10)
Change-Id: I4cc2ea4edab12a4b03236de626c7bcf0f96b6cc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/625905
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Change-Id: I3a3b7da6245a18bf1db0c595008f0eea853ce544
Reviewed-on: https://go-review.googlesource.com/c/go/+/627155
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
│ baseline │ experiment │
│ sec/op │ sec/op vs base │
MapDeleteLargeKey-24 312.0n ± 6% 162.3n ± 5% -47.97% (p=0.000 n=10)
Change-Id: I31f1f8e3c344cf8abf2e9eb4b51b78fcd67b93c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/625906
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Fixes #70008.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-race
Change-Id: I1fd7d1cbda20cc96016c864bcf0696382453e807
Reviewed-on: https://go-review.googlesource.com/c/go/+/623335
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Extendible hashing splits a swisstable map into many swisstables. This
keeps grow operations small.
For #54766.
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: Id91f34af9e686bf35eb8882ee479956ece89e821
Reviewed-on: https://go-review.googlesource.com/c/go/+/604936
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Based on the benchmarks in github.com/cockroachlabs/swiss.
For #54766.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-swissmap
Change-Id: I9ad925d3272c671e21ec04eb2da5ebd8f0fc6a28
Reviewed-on: https://go-review.googlesource.com/c/go/+/596295
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
name old time/op new time/op delta
MegEmptyMapWithInterfaceKey-10 15.5µs ± 0% 0.0µs ± 0% -99.97% (p=0.000 n=20+16)
name old alloc/op new alloc/op delta
MegEmptyMapWithInterfaceKey-10 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
MegEmptyMapWithInterfaceKey-10 0.00 0.00 ~ (all equal)
Change-Id: I46248223100e98b7877464da640075d272c14802
Reviewed-on: https://go-review.googlesource.com/c/go/+/502075
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: xie cui <523516579@qq.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Change-Id: I7f4ac72fe3230d8b7486fab0c925015cefcbe355
GitHub-Last-Rev: 54455839b674f980fb6c3afceb433db4833d340e
GitHub-Pull-Request: golang/go#61544
Reviewed-on: https://go-review.googlesource.com/c/go/+/512376
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
And then revert the bootstrap cmd directories and certain testdata.
And adjust tests as needed.
Not reverting the changes in std that are bootstrapped,
because some of those changes would appear in API docs,
and we want to use any consistently.
Instead, rewrite 'any' to 'interface{}' in cmd/dist for those directories
when preparing the bootstrap copy.
A few files changed as a result of running gofmt -w
not because of interface{} -> any but because they
hadn't been updated for the new //go:build lines.
Fixes #49884.
Change-Id: Ie8045cba995f65bd79c694ec77a1b3d1fe01bb09
Reviewed-on: https://go-review.googlesource.com/c/go/+/368254
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Makes sure the copyright notice is not interpreted as the package level
godoc.
Change-Id: I2afce7c9d620f19d51ec1438b1d0db1774b57146
Reviewed-on: https://go-review.googlesource.com/c/go/+/248760
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
|
|
For map with hint larger than BUCKETSIZE, makemap ignore allocated
bucket and allocate buckets itself. So do not allocate bucket in
this case, save us the cost of zeroing+assignment to the bucket.
name old time/op new time/op delta
NewEmptyMap-12 3.89ns ± 4% 3.88ns ± 2% ~ (p=0.939 n=19+20)
NewSmallMap-12 23.3ns ± 3% 23.1ns ± 2% ~ (p=0.307 n=18+17)
NewEmptyMapHintLessThan8-12 6.43ns ± 3% 6.31ns ± 2% -1.72% (p=0.000 n=19+18)
NewEmptyMapHintGreaterThan8-12 159ns ± 2% 150ns ± 1% -5.79% (p=0.000 n=20+18)
Benchmark run with commit ab7c174 reverted, see #38314.
Fixes #20184
Change-Id: Ic021f57454c3a0dd50601d73bbd77b8faf8d93b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/227458
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Use the following (suboptimal) script to obtain a list of possible
typos:
#!/usr/bin/env sh
set -x
git ls-files |\
grep -e '\.\(c\|cc\|go\)$' |\
xargs -n 1\
awk\
'/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
hunspell -d en_US -l |\
grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
sort -f |\
uniq -c |\
awk '$1 == 1 { print $2; }'
Then, go through the results manually and fix the most obvious typos in
the non-vendored code.
Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Right now we generate hash functions for all types, just in case they
are used as map keys. That's a lot of wasted effort and binary size
for types which will never be used as a map key. Instead, generate
hash functions only for types that we know are map keys.
Just doing that is a bit too simple, since maps with an interface type
as a key might have to hash any concrete key type that implements that
interface. So for that case, implement hashing of such types at
runtime (instead of with generated code). It will be slower, but only
for maps with interface types as keys, and maybe only a bit slower as
the aeshash time probably dominates the dispatch time.
Reorg where we keep the equals and hash functions. Move the hash function
from the key type to the map type, saving a field in every non-map type.
That leaves only one function in the alg structure, so get rid of that and
just keep the equal function in the type descriptor itself.
cmd/go now has 10 generated hash functions, instead of 504. Makes
cmd/go 1.0% smaller. Update #6853.
Speed on non-interface keys is unchanged. Speed on interface keys
is ~20% slower:
name old time/op new time/op delta
MapInterfaceString-8 23.0ns ±21% 27.6ns ±14% +20.01% (p=0.002 n=10+10)
MapInterfacePtr-8 19.4ns ±16% 23.7ns ± 7% +22.48% (p=0.000 n=10+8)
Change-Id: I7c2e42292a46b5d4e288aaec4029bdbb01089263
Reviewed-on: https://go-review.googlesource.com/c/go/+/191198
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
|
|
Divide the "empty" slot state into two, "emptyOne" and "emptyRest".
emptyOne means just that slot is empty. emptyRest means all subsequent
slots in that bucket are empty and the overflow pointer is nil.
When scanning a bucket, we can often stop at emptyRest, reducing
the total work we have to do. (This is similar to how tombstones
work in open addressing.)
Ideally on delete we have to figure out whether to zero the slot
with an emptyOne or emptyRest marker. For now, we choose the safe
but non-optimal choice. (Fix in subsequent CL?)
This is a simpler CL than some others we've tried, including my
CL sequence 11835[5-8] and Ilya's CL 115616.
Update #19495
name old time/op new time/op delta
MegMap 8.96ns ± 2% 8.74ns ± 6% -2.44% (p=0.020 n=10+10)
MegOneMap 8.91ns ± 2% 5.53ns ± 2% -37.99% (p=0.000 n=10+10)
MegEqMap 46.0µs ± 1% 45.8µs ± 3% ~ (p=0.315 n=9+10)
MegEmptyMap 2.50ns ± 0% 2.50ns ± 2% ~ (p=0.957 n=8+10)
SmallStrMap 8.54ns ± 1% 8.71ns ± 2% +2.01% (p=0.000 n=10+10)
MapStringKeysEight_16 8.61ns ± 3% 8.71ns ± 3% +1.20% (p=0.026 n=9+9)
MapStringKeysEight_32 8.54ns ± 2% 8.97ns ± 1% +5.05% (p=0.000 n=10+9)
MapStringKeysEight_64 8.66ns ± 2% 8.99ns ± 2% +3.87% (p=0.000 n=10+10)
MapStringKeysEight_1M 8.57ns ± 2% 8.95ns ± 2% +4.51% (p=0.000 n=10+9)
IntMap 6.69ns ± 1% 7.46ns ± 1% +11.60% (p=0.000 n=9+9)
MapFirst/1 3.69ns ± 1% 3.63ns ± 3% -1.52% (p=0.040 n=10+10)
MapFirst/2 3.70ns ± 2% 3.63ns ± 2% -1.95% (p=0.001 n=9+9)
MapFirst/3 3.74ns ± 2% 3.66ns ± 2% -2.12% (p=0.000 n=8+10)
MapFirst/4 3.71ns ± 2% 3.66ns ± 4% ~ (p=0.073 n=9+10)
MapFirst/5 3.69ns ± 1% 3.62ns ± 2% -1.88% (p=0.000 n=9+10)
MapFirst/6 3.68ns ± 2% 3.62ns ± 1% -1.83% (p=0.001 n=10+9)
MapFirst/7 3.67ns ± 1% 3.60ns ± 1% -1.98% (p=0.000 n=10+8)
MapFirst/8 3.68ns ± 2% 3.61ns ± 2% -1.87% (p=0.000 n=10+10)
MapFirst/9 8.03ns ± 4% 7.89ns ± 2% -1.76% (p=0.007 n=10+10)
MapFirst/10 7.99ns ± 2% 7.86ns ± 3% -1.64% (p=0.009 n=9+10)
MapFirst/11 7.96ns ± 1% 7.80ns ± 2% -2.01% (p=0.000 n=10+10)
MapFirst/12 7.96ns ± 1% 7.82ns ± 1% -1.67% (p=0.000 n=10+10)
MapFirst/13 8.06ns ± 3% 7.92ns ± 3% ~ (p=0.055 n=10+10)
MapFirst/14 7.95ns ± 1% 7.80ns ± 1% -1.88% (p=0.000 n=10+9)
MapFirst/15 8.01ns ± 2% 7.80ns ± 2% -2.57% (p=0.000 n=10+10)
MapFirst/16 8.05ns ± 2% 7.90ns ± 2% -1.84% (p=0.005 n=9+10)
MapMid/1 4.00ns ± 1% 3.94ns ± 2% -1.30% (p=0.021 n=8+9)
MapMid/2 4.39ns ± 2% 4.32ns ± 4% ~ (p=0.128 n=10+10)
MapMid/3 4.40ns ± 2% 4.27ns ± 2% -2.93% (p=0.000 n=10+9)
MapMid/4 4.76ns ± 2% 4.65ns ± 1% -2.26% (p=0.000 n=10+9)
MapMid/5 4.76ns ± 1% 4.65ns ± 1% -2.27% (p=0.000 n=10+10)
MapMid/6 5.11ns ± 2% 4.98ns ± 2% -2.55% (p=0.000 n=10+10)
MapMid/7 5.12ns ± 1% 5.01ns ± 3% -2.02% (p=0.003 n=9+9)
MapMid/8 5.71ns ± 3% 5.97ns ± 1% +4.51% (p=0.000 n=10+9)
MapMid/9 8.72ns ±10% 8.89ns ±10% ~ (p=0.458 n=9+10)
MapMid/10 10.1ns ±15% 9.6ns ± 7% ~ (p=0.080 n=9+10)
MapMid/11 9.88ns ±10% 9.44ns ±11% ~ (p=0.065 n=10+10)
MapMid/12 9.90ns ±13% 10.04ns ± 9% ~ (p=1.000 n=10+8)
MapMid/13 9.67ns ±14% 10.23ns ±10% ~ (p=0.209 n=10+9)
MapMid/14 9.12ns ±14% 9.14ns ±13% ~ (p=0.927 n=10+10)
MapMid/15 9.16ns ±12% 9.15ns ±16% ~ (p=0.955 n=10+10)
MapMid/16 9.37ns ±11% 9.60ns ±23% ~ (p=0.825 n=9+10)
MapLast/1 4.08ns ± 1% 3.92ns ± 0% -3.91% (p=0.000 n=10+9)
MapLast/2 4.37ns ± 1% 4.28ns ± 1% -1.95% (p=0.000 n=10+10)
MapLast/3 4.94ns ± 2% 4.65ns ± 1% -5.79% (p=0.000 n=9+8)
MapLast/4 5.40ns ± 3% 5.02ns ± 2% -7.13% (p=0.000 n=9+9)
MapLast/5 5.88ns ± 2% 5.67ns ± 2% -3.57% (p=0.000 n=10+10)
MapLast/6 6.48ns ± 3% 5.90ns ± 2% -8.89% (p=0.000 n=10+10)
MapLast/7 7.01ns ± 2% 6.27ns ± 5% -10.56% (p=0.000 n=10+10)
MapLast/8 7.60ns ± 2% 6.62ns ± 2% -12.93% (p=0.000 n=9+10)
MapLast/9 10.6ns ± 9% 10.9ns ±15% ~ (p=0.344 n=9+10)
MapLast/10 11.0ns ±12% 10.9ns ±14% ~ (p=0.985 n=10+10)
MapLast/11 11.4ns ±12% 11.8ns ±22% ~ (p=0.671 n=10+10)
MapLast/12 11.6ns ±10% 12.1ns ±19% ~ (p=0.617 n=10+10)
MapLast/13 12.5ns ±23% 11.8ns ±13% ~ (p=0.827 n=10+9)
MapLast/14 10.5ns ±22% 10.4ns ± 5% ~ (p=0.797 n=10+9)
MapLast/15 10.0ns ±15% 10.3ns ±16% ~ (p=0.565 n=10+10)
MapLast/16 10.4ns ±12% 10.5ns ±13% ~ (p=0.889 n=10+9)
MapCycle 22.3ns ± 1% 22.0ns ± 2% -1.43% (p=0.002 n=9+10)
RepeatedLookupStrMapKey32 16.4ns ± 1% 16.6ns ± 1% +1.24% (p=0.000 n=10+9)
RepeatedLookupStrMapKey1M 35.6µs ± 0% 35.4µs ± 1% -0.62% (p=0.002 n=10+10)
NewEmptyMap 5.36ns ± 1% 9.05ns ± 1% +69.02% (p=0.000 n=10+8)
NewSmallMap 51.2ns ± 2% 33.7ns ± 1% -34.22% (p=0.000 n=10+9)
MapIter 83.8ns ± 1% 88.4ns ± 1% +5.55% (p=0.000 n=10+10)
MapIterEmpty 4.32ns ± 3% 5.54ns ± 3% +28.12% (p=0.000 n=10+10)
SameLengthMap 4.31ns ± 1% 4.59ns ± 2% +6.41% (p=0.000 n=9+10)
BigKeyMap 24.2ns ± 2% 24.3ns ± 1% ~ (p=0.432 n=10+10)
BigValMap 24.3ns ± 1% 24.4ns ± 2% ~ (p=0.200 n=10+9)
SmallKeyMap 17.5ns ± 1% 18.5ns ± 2% +5.81% (p=0.000 n=9+10)
MapPopulate/1 29.0ns ± 4% 18.8ns ± 1% -35.27% (p=0.000 n=10+9)
MapPopulate/10 736ns ± 5% 693ns ± 4% -5.92% (p=0.000 n=10+10)
MapPopulate/100 11.3µs ± 2% 10.8µs ± 3% -4.38% (p=0.000 n=10+10)
MapPopulate/1000 139µs ± 8% 132µs ± 4% -5.10% (p=0.002 n=10+10)
MapPopulate/10000 1.21ms ± 5% 1.16ms ± 5% -4.56% (p=0.002 n=10+10)
MapPopulate/100000 12.2ms ± 3% 11.8ms ± 5% ~ (p=0.052 n=10+10)
ComplexAlgMap 73.9ns ± 1% 74.4ns ± 2% ~ (p=0.161 n=9+10)
GoMapClear/Reflexive/1 36.0ns ± 1% 26.9ns ± 2% -25.31% (p=0.000 n=10+10)
GoMapClear/Reflexive/10 35.2ns ± 1% 24.4ns ± 1% -30.62% (p=0.000 n=10+10)
GoMapClear/Reflexive/100 69.6ns ± 2% 59.2ns ± 1% -14.92% (p=0.000 n=10+10)
GoMapClear/Reflexive/1000 1.06µs ± 2% 1.05µs ± 1% -1.16% (p=0.013 n=10+9)
GoMapClear/Reflexive/10000 11.7µs ± 1% 11.7µs ± 1% ~ (p=0.542 n=10+10)
GoMapClear/NonReflexive/1 96.3ns ± 1% 90.0ns ± 1% -6.52% (p=0.000 n=10+10)
GoMapClear/NonReflexive/10 110ns ± 2% 101ns ± 0% -8.10% (p=0.000 n=10+7)
GoMapClear/NonReflexive/100 270ns ± 2% 235ns ± 2% -12.94% (p=0.000 n=10+10)
GoMapClear/NonReflexive/1000 3.02µs ± 2% 2.48µs ± 1% -17.92% (p=0.000 n=10+10)
GoMapClear/NonReflexive/10000 23.7µs ± 1% 19.6µs ± 1% -17.30% (p=0.000 n=10+9)
MapPop100 9.65µs ± 6% 9.18µs ± 8% -4.82% (p=0.008 n=9+10)
MapPop1000 162µs ± 6% 148µs ± 4% -8.67% (p=0.000 n=9+9)
MapPop10000 3.05ms ± 8% 2.82ms ±15% -7.66% (p=0.023 n=10+10)
MapAssign/Int32/256 15.7ns ± 4% 14.6ns ± 2% -7.08% (p=0.000 n=10+10)
MapAssign/Int32/65536 29.8ns ± 1% 30.4ns ± 0% +2.04% (p=0.000 n=10+8)
MapAssign/Int64/256 14.9ns ± 5% 14.8ns ± 4% ~ (p=0.611 n=10+10)
MapAssign/Int64/65536 30.3ns ± 2% 30.4ns ± 1% +0.54% (p=0.046 n=10+9)
MapAssign/Str/256 17.8ns ± 3% 19.8ns ± 4% +11.08% (p=0.000 n=10+10)
MapAssign/Str/65536 35.7ns ± 1% 36.4ns ± 1% +1.82% (p=0.000 n=10+10)
MapOperatorAssign/Int32/256 18.8ns ± 5% 14.6ns ± 3% -22.57% (p=0.000 n=10+10)
MapOperatorAssign/Int32/65536 29.8ns ± 1% 30.5ns ± 1% +2.39% (p=0.000 n=10+10)
MapOperatorAssign/Int64/256 16.6ns ± 4% 15.0ns ± 6% -9.34% (p=0.000 n=10+10)
MapOperatorAssign/Int64/65536 30.1ns ± 1% 31.7ns ± 2% +5.21% (p=0.000 n=10+10)
MapOperatorAssign/Str/256 1.70µs ± 1% 1.61µs ± 2% -5.55% (p=0.000 n=10+8)
MapOperatorAssign/Str/65536 289ns ± 7% 294ns ± 4% ~ (p=0.425 n=10+10)
MapAppendAssign/Int32/256 34.3ns ± 2% 31.0ns ± 3% -9.59% (p=0.000 n=9+9)
MapAppendAssign/Int32/65536 51.8ns ± 3% 47.1ns ±13% -9.17% (p=0.002 n=9+10)
MapAppendAssign/Int64/256 32.5ns ± 8% 31.2ns ± 6% ~ (p=0.065 n=10+10)
MapAppendAssign/Int64/65536 51.4ns ± 4% 47.2ns ±10% -8.07% (p=0.005 n=9+10)
MapAppendAssign/Str/256 105ns ±12% 109ns ± 4% ~ (p=0.138 n=10+8)
MapAppendAssign/Str/65536 101ns ±14% 81ns ± 8% -19.82% (p=0.000 n=10+9)
MapDelete/Int32/100 32.0ns ± 1% 35.0ns ± 2% +9.59% (p=0.000 n=9+10)
MapDelete/Int32/1000 27.0ns ± 3% 30.3ns ± 1% +12.10% (p=0.000 n=10+9)
MapDelete/Int32/10000 29.2ns ± 1% 32.9ns ± 2% +12.80% (p=0.000 n=10+10)
MapDelete/Int64/100 31.5ns ± 1% 35.7ns ± 2% +13.16% (p=0.000 n=10+10)
MapDelete/Int64/1000 27.0ns ± 2% 30.6ns ± 1% +13.21% (p=0.000 n=10+10)
MapDelete/Int64/10000 30.3ns ± 1% 34.4ns ± 3% +13.47% (p=0.000 n=10+10)
MapDelete/Str/100 23.4ns ± 8% 26.7ns ± 6% +14.10% (p=0.000 n=10+9)
MapDelete/Str/1000 31.0ns ± 2% 35.1ns ± 3% +13.19% (p=0.000 n=10+9)
MapDelete/Str/10000 38.8ns ± 1% 43.4ns ± 2% +12.02% (p=0.000 n=9+10)
Change-Id: I564ce0f40936589f0f9b837f7f2bbcca4c4a1070
Reviewed-on: https://go-review.googlesource.com/c/142437
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Martin Möhrmann <martisch@uos.de>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This improves performance for maps with a bucket size
(key+value*8 bytes) larger than 32 bytes and removes loading
a value from the maxElems array for smaller bucket sizes.
name old time/op new time/op delta
MakeMap/[Byte]Byte 93.5ns ± 1% 91.8ns ± 1% -1.83% (p=0.000 n=10+10)
MakeMap/[Int]Int 134ns ± 1% 127ns ± 2% -5.61% (p=0.000 n=9+10)
Updates #21588
Change-Id: I53f77186769c4bd0f2b90f3c6c17df643b060e39
Reviewed-on: https://go-review.googlesource.com/c/143797
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
x = map[string(byteslice)] is already optimized by the compiler to avoid a
string allocation. This CL generalizes this optimization to:
x = map[T1{ ... Tn{..., string(byteslice), ...} ... }]
where T1 to Tn is a nesting of struct and array literals.
Found in a hot code path that used a struct of strings made from []byte
slices to make a map lookup.
There are no uses of the more generalized optimization in the standard library.
Passes toolstash -cmp.
MapStringConversion/32/simple 21.9ns ± 2% 21.9ns ± 3% ~ (p=0.995 n=17+20)
MapStringConversion/32/struct 28.8ns ± 3% 22.0ns ± 2% -23.80% (p=0.000 n=20+20)
MapStringConversion/32/array 28.5ns ± 2% 21.9ns ± 2% -23.14% (p=0.000 n=19+16)
MapStringConversion/64/simple 21.0ns ± 2% 21.1ns ± 3% ~ (p=0.072 n=19+18)
MapStringConversion/64/struct 72.4ns ± 3% 21.3ns ± 2% -70.53% (p=0.000 n=20+20)
MapStringConversion/64/array 72.8ns ± 1% 21.0ns ± 2% -71.13% (p=0.000 n=17+19)
name old allocs/op new allocs/op delta
MapStringConversion/32/simple 0.00 0.00 ~ (all equal)
MapStringConversion/32/struct 0.00 0.00 ~ (all equal)
MapStringConversion/32/array 0.00 0.00 ~ (all equal)
MapStringConversion/64/simple 0.00 0.00 ~ (all equal)
MapStringConversion/64/struct 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20)
MapStringConversion/64/array 1.00 ± 0% 0.00 -100.00% (p=0.000 n=20+20)
Change-Id: I483b4d84d8d74b1025b62c954da9a365e79b7a3a
Reviewed-on: https://go-review.googlesource.com/c/116275
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
replace map clears of the form:
for k := range m {
delete(m, k)
}
(where m is map with key type that is reflexive for ==)
with a new runtime function that clears the maps backing
array with a memclr and reinitializes the hmap struct.
Map key types that for example contain floats are not
replaced by this optimization since NaN keys cannot
be deleted from maps using delete.
name old time/op new time/op delta
GoMapClear/Reflexive/1 92.2ns ± 1% 47.1ns ± 2% -48.89% (p=0.000 n=9+9)
GoMapClear/Reflexive/10 108ns ± 1% 48ns ± 2% -55.68% (p=0.000 n=10+10)
GoMapClear/Reflexive/100 303ns ± 2% 110ns ± 3% -63.56% (p=0.000 n=10+10)
GoMapClear/Reflexive/1000 3.58µs ± 3% 1.23µs ± 2% -65.49% (p=0.000 n=9+10)
GoMapClear/Reflexive/10000 28.2µs ± 3% 10.3µs ± 2% -63.55% (p=0.000 n=9+10)
GoMapClear/NonReflexive/1 121ns ± 2% 124ns ± 7% ~ (p=0.097 n=10+10)
GoMapClear/NonReflexive/10 137ns ± 2% 139ns ± 3% +1.53% (p=0.033 n=10+10)
GoMapClear/NonReflexive/100 331ns ± 3% 334ns ± 2% ~ (p=0.342 n=10+10)
GoMapClear/NonReflexive/1000 3.64µs ± 3% 3.64µs ± 2% ~ (p=0.887 n=9+10)
GoMapClear/NonReflexive/10000 28.1µs ± 2% 28.4µs ± 3% ~ (p=0.247 n=10+10)
Fixes #20138
Change-Id: I181332a8ef434a4f0d89659f492d8711db3f3213
Reviewed-on: https://go-review.googlesource.com/110055
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Rename all map implementation and test files to use "map"
as a file name prefix instead of "hashmap" for the implementation
and "map" for the test file names.
Change-Id: I7b317c1f7a660b95c6d1f1a185866f2839e69446
Reviewed-on: https://go-review.googlesource.com/90336
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|