| Age | Commit message | Author |
|
If two slices start out with the same length and decrease in length by
the same amount on each round of the loop (or in the if block), then
we can conclude that their lengths are always equal.
For example:
	if len(a) != len(b) {
		return
	}
	for len(a) >= 4 {
		a = a[4:]
		b = b[4:] // proved here, omit boundary check
	}
	if len(a) == len(b) { // proved here
		//...
	}
Or, change 'for' to 'if':
	if len(a) != len(b) {
		return
	}
	if len(a) >= 4 {
		a = a[4:]
		b = b[4:]
	}
	if len(a) == len(b) { // proved here
		//...
	}
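For a fuller picture, here is a minimal, self-contained sketch of the
loop form. The function name dot is hypothetical and not part of this
CL. Whether the bounds checks on b are actually removed depends on the
toolchain version; building with -gcflags=-d=ssa/check_bce/debug=1
should list the checks that remain.

```go
package main

import "fmt"

// dot returns the dot product of a and b, consuming four elements per
// iteration. It returns -1 if the lengths differ (an arbitrary choice
// made only for this sketch).
func dot(a, b []int) int {
	if len(a) != len(b) {
		return -1
	}
	sum := 0
	for len(a) >= 4 {
		// len(b) == len(a) >= 4 here, so b[0..3] and b[4:] are in range.
		sum += a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3]
		a = a[4:]
		b = b[4:]
	}
	// Tail: the lengths are still equal, so b[i] is in range.
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	fmt.Println(dot([]int{1, 2, 3, 4, 5}, []int{1, 1, 1, 1, 2})) // prints 20
}
```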
Fixes #75144
Change-Id: I4e5902a02b5cf8fdc122715a7dbd2fb5e9a8f5dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/699155
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Raspberry Pi 5 (Cortex-A76)
│ base.log │ opt.log │
│ sec/op │ sec/op vs base │
MemmoveKnownSize112 3.549n ± 0% 3.652n ± 0% +2.92% (p=0.000 n=10)
MemmoveKnownSize128 3.979n ± 0% 3.617n ± 0% -9.09% (p=0.000 n=10)
MemmoveKnownSize192 7.566n ± 0% 5.074n ± 0% -32.94% (p=0.000 n=10)
MemmoveKnownSize248 8.549n ± 0% 7.184n ± 1% -15.97% (p=0.000 n=10)
MemmoveKnownSize256 10.010n ± 0% 6.827n ± 0% -31.80% (p=0.000 n=10)
MemmoveKnownSize512 19.81n ± 0% 13.59n ± 0% -31.40% (p=0.000 n=10)
MemmoveKnownSize1024 39.66n ± 0% 27.00n ± 0% -31.93% (p=0.000 n=10)
geomean 9.538n 7.392n -22.50%
Change-Id: I7b17408cd0a500ceaa80bc93ffe2f19ddeea9c0d
Reviewed-on: https://go-review.googlesource.com/c/go/+/692315
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #75615
Change-Id: I2c7f0ea1203e8a97749c9f780c29a66050f0159d
Reviewed-on: https://go-review.googlesource.com/c/go/+/710355
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The new conversions can be activated (or bisected) with
-gcflags=all=-d=converthash=PATTERN
where PATTERN is either a hash string or n, qn, y, qy for
no, quietly no, yes, quietly yes.
This CL makes the default pattern be "qn" instead of the
default-default which is an efficient encoding of "qy".
Updates #75834
Change-Id: I88a9fd7880bc999132420c8d0a22a8fdc1e95a2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/711845
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
|
|
This reverts commit 78d75b37992be01326b9bd2666195aaba9bf2ae2.
Reason for revert: we need to do this more carefully, at minimum gated by a module version
(This should follow the softfloat FP conversion revert)
Change-Id: I736bec6cd860285dcc3b11fac85b377a149435c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/711842
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
conversions"
This reverts commit 8d810286b3121b601480426159c04d178fa29166.
Reason for revert: we need to do this more carefully, at minimum gated by a module version
Change-Id: Ia951e2e5ecdd455ea0f17567963c6fab0f4540dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/711840
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
API naming changes.
This CL also removes AddDotProductPairsSaturated.
Change-Id: I02e6d45268704f3ed4eaf62f0ecb7dc936b42124
Reviewed-on: https://go-review.googlesource.com/c/go/+/710935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
The added tests were manually checked to confirm that the peepholes are triggered.
Change-Id: Ibd29eac449869b52c2376f9eafd83410b5266890
Reviewed-on: https://go-review.googlesource.com/c/go/+/710916
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We already have mask-to-bits conversions, and the API of mask loads
and stores is inconsistent with them. Mask loads and stores could also
just be hidden behind peepholes. So this CL removes them; the next CL
will add the peepholes for them.
Change-Id: Ifa7d23fb52bb0efd1785935ead4d703927f16d2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/710915
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We do not lookup/devirtualize such, so we can skip tracking them.
Change-Id: I8bdb0b11c694e4b2326c236093508a356a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/711160
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Change-Id: I5e9e9c89336446720c3c21347969e4126a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/711140
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Fixes #75863
Change-Id: I1e5a0f3880dcd5f820a5b6f4540c49b16a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/711141
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Lasse Folger <lassefolger@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Follow-up to CL 711420.
Change-Id: If577e96f413e46b98dd86d11605de1004637851a
Reviewed-on: https://go-review.googlesource.com/c/go/+/711540
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Issue #49439 was about a deadlock during type inference inside
a type parameter list of a recursive constraint. As a remedy
we disallowed recursive type parameter lists.
In the meantime we have removed support for type inference for
type arguments to generic types; the Go 1.18 generic release
didn't support it.
As a consequence, the fix for #49439, CL 361922, is probably
not needed anymore: cycles through type parameter lists are ok.
Fixes #68162.
For #49439.
Change-Id: Ie9deb3274914d428e8e45071cee5e68abf8afe9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/711420
Commit-Queue: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Bypass: Robert Griesemer <gri@google.com>
|
|
Leave those constant foldings for runtime, similar to how we do it
for NaN generation.
These are the only instances I could find in cmd/compile/..., using
objdump -d ../pkg/tool/darwin_arm64/compile| egrep "(fcvtz|>:)" | grep -B1 fcvt
(There are instances in other places, like runtime and reflect, but I don't
think those places would affect compiler output.)
Change-Id: I4113fe4570115e4765825cf442cb1fde97cf2f27
Reviewed-on: https://go-review.googlesource.com/c/go/+/711281
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This change replaces most occurrences (in code as well as in comments) of
errors.As with errors.AsType. It leaves the errors package and vendored
code untouched.
Change-Id: I3bde73f318a0b408bdb8f5a251494af15a13118a
GitHub-Last-Rev: 8aaaa36a5a12d2a6a90c6d51680464e1a3115139
GitHub-Pull-Request: golang/go#75698
Reviewed-on: https://go-review.googlesource.com/c/go/+/708495
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
add hasFeature, also record maximum feature for a function
Change-Id: I68dd063aad1c1dc0ef5310a9f5d970c03dd31a0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/710695
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
My stab at a bisect-reproducer failed, but I verified that
it fixed the problem described in the issue.
Updates #75834
Change-Id: I9e0dfacd2bbd22cbc557e144920ee3417a48088c
Reviewed-on: https://go-review.googlesource.com/c/go/+/710997
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The ICE seen on loong64 while compiling the `(*gcWork).tryStealSpan`
function was due to an `LoweredAtomicAnd32` op (inlined from the
`(pMask).clear` implementation) being incorrectly assigned an output
register while it shouldn't have. Because the op is of mem type, it has
needRegister() == false; hence in the shuffle phase of regalloc, its
bogus output register has no associated `orig` value recorded. The bug
was introduced in CL 482756, but only recently exposed by CL 696035.
Since the old-style atomic ops need no return value (and are even
documented as such beside the loong64 ssa op definition), just fix the
register info for both.
While at it, add a note in the ssa op definition file about the
architectural necessity of resultNotInArgs for loong64 atomic ops,
because the practice is not seen in several other arches I have
checked.
Updates #75776
Change-Id: I087f51b8a2825d7b00fc3965b0afcc8b02cad277
Reviewed-on: https://go-review.googlesource.com/c/go/+/710475
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
This change creates calls to size-specialized malloc functions instead
of calls to newObject when we know the size of the allocation at
compilation time. Most of it is a matter of calling the newObject
function (which will create calls to the size-specialized functions)
rather than the newObjectNonSpecialized function (which won't). In the
newHeapaddr, small, non-pointer case, we'll create a non-specialized
newObject and transform that into the appropriate size-specialized
function when we produce the mallocgc in flushPendingHeapAllocations.
We have to update some of the rewrites in generic.rules to also apply to
the size-specialized functions when they apply to newObject.
The messiest thing is we have to adjust the offset we use to save the
memory profiler stack, because the depth of the call to profilealloc is
two frames fewer in the size-specialized malloc functions compared to
when newObject calls mallocgc. A bunch of tests have been adjusted to
account for that.
Change-Id: I6a6a6964c9037fb6719e392c4a498ed700b617d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/707856
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
this change is for overflows, so that all the platforms agree.
Change-Id: I9f459353615bf24ef8a5de641063d9ce34986241
Reviewed-on: https://go-review.googlesource.com/c/go/+/708358
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Change-Id: Iff13b4471f94a6a91d8b159603a9338cb9c89747
Reviewed-on: https://go-review.googlesource.com/c/go/+/708295
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Eventual goal is that all the architectures agree, and are
sensible. The test will be build-tagged to exclude
not-yet-handled platforms.
This change also bisects the conversion change in case of bugs.
(`bisect -compile=convert ...`)
Change-Id: I98528666b0a3fde17cbe8d69b612d01da18dce85
Reviewed-on: https://go-review.googlesource.com/c/go/+/691135
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This change improves the concrete type analysis in the devirtualizer.
It no longer relies on ir.Reassigned; it now statically tries to
determine the concrete type of an interface, even when assigned
multiple times, following type assertions and iface conversions.
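As a source-level illustration, the sketch below assigns an interface
variable more than once, always with the same concrete type. The names
Shape, Square and totalArea are hypothetical and not taken from this
CL. Whether these particular calls are devirtualized depends on the
toolchain version; the optimization decisions printed by -gcflags=-m
are one way to check.

```go
package main

import "fmt"

// Shape is a small interface used only for this sketch.
type Shape interface{ Area() int }

// Square is the only concrete type ever stored in the Shape variable below.
type Square struct{ side int }

func (s Square) Area() int { return s.side * s.side }

// totalArea assigns s twice. Every assignment stores a Square, so the
// concrete type behind s is statically known at both call sites.
func totalArea() int {
	var s Shape = Square{2}
	total := s.Area()
	s = Square{3} // reassigned, but still a Square
	total += s.Area()
	return total
}

func main() {
	fmt.Println(totalArea()) // prints 13
}
```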
Alternative to CL 649195
Updates #69521
Fixes #64824
Change-Id: Ib1656e19f3619ab2e1e6b2c78346cc320490b2af
GitHub-Last-Rev: e8fa0b12f0a7b1d7ae00e5edb54ce04d1f702c09
GitHub-Pull-Request: golang/go#71935
Reviewed-on: https://go-review.googlesource.com/c/go/+/652036
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
NaN checks can often be merged into other comparisons by inverting them.
For example, `math.IsNaN(x) || x > 0` is equivalent to `!(x <= 0)`.
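The equivalence can be checked directly. The sketch below compares the
two forms on a handful of inputs, including NaN, both zeros and both
infinities. The function and variable names are hypothetical. It shows
only that the two expressions agree, not what code the compiler
generates for them.

```go
package main

import (
	"fmt"
	"math"
)

// samples covers NaN, infinities, both zeros and ordinary values.
var samples = []float64{
	math.NaN(), math.Inf(-1), -1, math.Copysign(0, -1), 0, 1, math.Inf(1),
}

// nanOrPositive is the original form: an explicit NaN check plus a comparison.
func nanOrPositive(x float64) bool { return math.IsNaN(x) || x > 0 }

// notLessEqualZero is the merged form: any comparison with NaN is false,
// so x <= 0 is false for NaN and the negation is true.
func notLessEqualZero(x float64) bool { return !(x <= 0) }

func main() {
	for _, x := range samples {
		fmt.Println(x, nanOrPositive(x), notLessEqualZero(x))
	}
}
```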
goos: linux
goarch: amd64
pkg: math
cpu: 12th Gen Intel(R) Core(TM) i7-12700T
│ sec/op │ sec/op vs base │
Acos 4.315n ± 0% 4.314n ± 0% ~ (p=0.642 n=10)
Acosh 8.398n ± 0% 7.779n ± 0% -7.37% (p=0.000 n=10)
Asin 4.203n ± 0% 4.211n ± 0% +0.20% (p=0.001 n=10)
Asinh 10.150n ± 0% 9.562n ± 0% -5.79% (p=0.000 n=10)
Atan 2.363n ± 0% 2.363n ± 0% ~ (p=0.801 n=10)
Atanh 8.192n ± 2% 7.685n ± 0% -6.20% (p=0.000 n=10)
Atan2 4.013n ± 0% 4.010n ± 0% ~ (p=0.073 n=10)
Cbrt 4.858n ± 0% 4.755n ± 0% -2.12% (p=0.000 n=10)
Cos 4.596n ± 0% 4.357n ± 0% -5.20% (p=0.000 n=10)
Cosh 5.071n ± 0% 5.071n ± 0% ~ (p=0.585 n=10)
Erf 2.802n ± 1% 2.788n ± 0% -0.54% (p=0.002 n=10)
Erfc 3.087n ± 1% 3.071n ± 0% ~ (p=0.320 n=10)
Erfinv 3.981n ± 0% 3.965n ± 0% -0.41% (p=0.000 n=10)
Erfcinv 3.985n ± 0% 3.977n ± 0% -0.20% (p=0.000 n=10)
ExpGo 8.721n ± 2% 8.252n ± 0% -5.38% (p=0.000 n=10)
Expm1 4.378n ± 0% 4.228n ± 0% -3.43% (p=0.000 n=10)
Exp2 8.313n ± 0% 7.855n ± 0% -5.52% (p=0.000 n=10)
Exp2Go 8.498n ± 2% 7.921n ± 0% -6.79% (p=0.000 n=10)
Mod 15.16n ± 4% 12.20n ± 1% -19.58% (p=0.000 n=10)
Frexp 1.780n ± 2% 1.496n ± 0% -15.96% (p=0.000 n=10)
Gamma 4.378n ± 1% 4.013n ± 0% -8.35% (p=0.000 n=10)
HypotGo 2.655n ± 5% 2.427n ± 1% -8.57% (p=0.000 n=10)
Ilogb 1.912n ± 5% 1.749n ± 0% -8.53% (p=0.000 n=10)
J0 22.43n ± 9% 20.46n ± 0% -8.76% (p=0.000 n=10)
J1 21.03n ± 4% 19.96n ± 0% -5.09% (p=0.000 n=10)
Jn 45.40n ± 1% 42.59n ± 0% -6.20% (p=0.000 n=10)
Ldexp 2.312n ± 1% 1.944n ± 0% -15.94% (p=0.000 n=10)
Lgamma 4.617n ± 1% 4.584n ± 0% -0.73% (p=0.000 n=10)
Log 4.226n ± 0% 4.213n ± 0% -0.31% (p=0.001 n=10)
Logb 1.771n ± 0% 1.775n ± 0% ~ (p=0.097 n=10)
Log1p 5.102n ± 2% 5.001n ± 0% -1.97% (p=0.000 n=10)
Log10 4.407n ± 0% 4.408n ± 0% ~ (p=1.000 n=10)
Log2 2.416n ± 1% 2.138n ± 0% -11.51% (p=0.000 n=10)
Modf 1.669n ± 2% 1.611n ± 0% -3.50% (p=0.000 n=10)
Nextafter32 2.186n ± 0% 2.185n ± 0% ~ (p=0.051 n=10)
Nextafter64 2.182n ± 0% 2.184n ± 0% +0.09% (p=0.016 n=10)
PowInt 11.39n ± 6% 10.68n ± 2% -6.24% (p=0.000 n=10)
PowFrac 26.60n ± 2% 26.12n ± 0% -1.80% (p=0.000 n=10)
Pow10Pos 0.5067n ± 4% 0.5003n ± 1% -1.27% (p=0.001 n=10)
Pow10Neg 0.8552n ± 0% 0.8552n ± 0% ~ (p=0.928 n=10)
Round 1.181n ± 0% 1.182n ± 0% +0.08% (p=0.001 n=10)
RoundToEven 1.709n ± 0% 1.710n ± 0% ~ (p=0.053 n=10)
Remainder 12.54n ± 5% 11.99n ± 2% -4.46% (p=0.000 n=10)
Sin 3.933n ± 5% 3.926n ± 0% -0.17% (p=0.000 n=10)
Sincos 5.672n ± 0% 5.522n ± 0% -2.65% (p=0.000 n=10)
Sinh 5.447n ± 1% 5.444n ± 0% -0.06% (p=0.029 n=10)
Tan 4.061n ± 0% 4.058n ± 0% -0.07% (p=0.005 n=10)
Tanh 5.599n ± 0% 5.595n ± 0% -0.06% (p=0.042 n=10)
Y0 20.75n ± 5% 19.73n ± 1% -4.92% (p=0.000 n=10)
Y1 20.87n ± 2% 19.78n ± 1% -5.20% (p=0.000 n=10)
Yn 44.50n ± 2% 42.04n ± 2% -5.53% (p=0.000 n=10)
geomean 4.989n 4.791n -3.96%
goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
│ sec/op │ sec/op vs base │
Acos 159.9n ± 0% 159.9n ± 0% ~ (p=0.269 n=10)
Acosh 244.7n ± 0% 235.0n ± 0% -3.98% (p=0.000 n=10)
Asin 159.9n ± 0% 159.9n ± 0% ~ (p=0.154 n=10)
Asinh 270.8n ± 0% 261.1n ± 0% -3.60% (p=0.000 n=10)
Atan 119.1n ± 0% 119.1n ± 0% ~ (p=0.347 n=10)
Atanh 260.2n ± 0% 261.8n ± 4% ~ (p=0.459 n=10)
Atan2 186.8n ± 0% 186.8n ± 0% ~ (p=0.487 n=10)
Cbrt 203.5n ± 0% 198.2n ± 0% -2.60% (p=0.000 n=10)
Ceil 31.82n ± 0% 31.81n ± 0% ~ (p=0.714 n=10)
Copysign 4.894n ± 0% 4.893n ± 0% ~ (p=0.161 n=10)
Cos 107.6n ± 0% 103.6n ± 0% -3.76% (p=0.000 n=10)
Cosh 259.0n ± 0% 252.8n ± 0% -2.39% (p=0.000 n=10)
Erf 133.7n ± 0% 133.7n ± 0% ~ (p=0.720 n=10)
Erfc 137.9n ± 0% 137.8n ± 0% -0.04% (p=0.033 n=10)
Erfinv 173.7n ± 0% 168.8n ± 0% -2.82% (p=0.000 n=10)
Erfcinv 173.7n ± 0% 168.8n ± 0% -2.82% (p=0.000 n=10)
Exp 215.3n ± 0% 208.1n ± 0% -3.34% (p=0.000 n=10)
ExpGo 226.7n ± 0% 220.6n ± 0% -2.69% (p=0.000 n=10)
Expm1 164.8n ± 0% 159.0n ± 0% -3.52% (p=0.000 n=10)
Exp2 185.0n ± 0% 182.7n ± 0% -1.22% (p=0.000 n=10)
Exp2Go 198.9n ± 0% 196.5n ± 0% -1.21% (p=0.000 n=10)
Abs 4.894n ± 0% 4.893n ± 0% ~ (p=0.262 n=10)
Dim 16.31n ± 0% 16.31n ± 0% ~ (p=1.000 n=10)
Floor 31.81n ± 0% 31.81n ± 0% ~ (p=0.067 n=10)
Max 26.11n ± 0% 26.10n ± 0% ~ (p=0.080 n=10)
Min 26.10n ± 0% 26.10n ± 0% ~ (p=0.095 n=10)
Mod 337.7n ± 0% 291.9n ± 0% -13.56% (p=0.000 n=10)
Frexp 50.57n ± 0% 42.41n ± 0% -16.13% (p=0.000 n=10)
Gamma 206.3n ± 0% 198.1n ± 0% -4.00% (p=0.000 n=10)
Hypot 94.62n ± 0% 94.61n ± 0% ~ (p=0.437 n=10)
HypotGo 109.3n ± 0% 109.3n ± 0% ~ (p=1.000 n=10)
Ilogb 44.05n ± 0% 44.04n ± 0% -0.02% (p=0.025 n=10)
J0 663.1n ± 0% 663.9n ± 0% +0.13% (p=0.002 n=10)
J1 663.9n ± 0% 666.4n ± 0% +0.38% (p=0.000 n=10)
Jn 1.404µ ± 0% 1.407µ ± 0% +0.21% (p=0.000 n=10)
Ldexp 57.10n ± 0% 48.93n ± 0% -14.30% (p=0.000 n=10)
Lgamma 185.1n ± 0% 187.6n ± 0% +1.32% (p=0.000 n=10)
Log 182.7n ± 0% 170.1n ± 0% -6.87% (p=0.000 n=10)
Logb 46.49n ± 0% 46.49n ± 0% ~ (p=0.675 n=10)
Log1p 184.3n ± 0% 179.4n ± 0% -2.63% (p=0.000 n=10)
Log10 184.3n ± 0% 171.2n ± 0% -7.08% (p=0.000 n=10)
Log2 66.05n ± 0% 57.90n ± 0% -12.34% (p=0.000 n=10)
Modf 34.25n ± 0% 34.24n ± 0% ~ (p=0.163 n=10)
Nextafter32 49.33n ± 1% 48.93n ± 0% -0.81% (p=0.002 n=10)
Nextafter64 43.64n ± 0% 43.23n ± 0% -0.93% (p=0.000 n=10)
PowInt 267.6n ± 0% 251.2n ± 0% -6.11% (p=0.000 n=10)
PowFrac 672.9n ± 0% 637.9n ± 0% -5.19% (p=0.000 n=10)
Pow10Pos 13.87n ± 0% 13.87n ± 0% ~ (p=1.000 n=10)
Pow10Neg 19.58n ± 62% 19.59n ± 62% ~ (p=0.355 n=10)
Round 23.65n ± 0% 23.65n ± 0% ~ (p=1.000 n=10)
RoundToEven 27.73n ± 0% 27.73n ± 0% ~ (p=0.635 n=10)
Remainder 309.9n ± 0% 280.5n ± 0% -9.49% (p=0.000 n=10)
Signbit 13.05n ± 0% 13.05n ± 0% ~ (p=1.000 n=10) ¹
Sin 120.7n ± 0% 120.7n ± 0% ~ (p=1.000 n=10) ¹
Sincos 148.4n ± 0% 143.5n ± 0% -3.30% (p=0.000 n=10)
Sinh 275.6n ± 0% 267.5n ± 0% -2.94% (p=0.000 n=10)
SqrtIndirect 3.262n ± 0% 3.262n ± 0% ~ (p=0.263 n=10)
SqrtLatency 19.57n ± 0% 19.57n ± 0% ~ (p=0.582 n=10)
SqrtIndirectLatency 19.57n ± 0% 19.57n ± 0% ~ (p=1.000 n=10)
SqrtGoLatency 203.2n ± 0% 197.6n ± 0% -2.78% (p=0.000 n=10)
SqrtPrime 4.952µ ± 0% 4.952µ ± 0% -0.01% (p=0.025 n=10)
Tan 153.3n ± 0% 153.3n ± 0% ~ (p=1.000 n=10)
Tanh 280.5n ± 0% 272.4n ± 0% -2.91% (p=0.000 n=10)
Trunc 31.81n ± 0% 31.81n ± 0% ~ (p=1.000 n=10)
Y0 680.1n ± 0% 664.8n ± 0% -2.25% (p=0.000 n=10)
Y1 684.2n ± 0% 669.6n ± 0% -2.14% (p=0.000 n=10)
Yn 1.444µ ± 0% 1.410µ ± 0% -2.35% (p=0.000 n=10)
Float64bits 5.709n ± 0% 5.708n ± 0% ~ (p=0.573 n=10)
Float64frombits 4.893n ± 0% 4.893n ± 0% ~ (p=0.734 n=10)
Float32bits 12.23n ± 0% 12.23n ± 0% ~ (p=0.628 n=10)
Float32frombits 4.893n ± 0% 4.893n ± 0% ~ (p=0.971 n=10)
FMA 4.893n ± 0% 4.893n ± 0% ~ (p=0.736 n=10)
geomean 88.96n 87.05n -2.15%
¹ all samples are equal
Change-Id: I8db8ac7b7b3430b946b89e88dd6c1546804125c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/697360
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Munday <mikemndy@gmail.com>
|
|
analysis for
- is this block only reached through feature checks?
- does the function signature imply AVX-something?
- is there an instruction in this block which implies AVX-something?
and keep track of which features those are. Features =
AVX, AVX2, AVX512, etc.
Has a test.
Change-Id: I0b6f2e87d01ec587818db11cf71fac1e4d500650
Reviewed-on: https://go-review.googlesource.com/c/go/+/706337
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
This reverts commit 719dfcf8a8478d70360bf3c34c0e920be7b32994.
Reason for revert: Causing crashes.
Change-Id: I0b8526dd03d82fa074ce4f97f1789eeac702b3eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/709755
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Instead of storing LR (the return address) at 0(SP) and the FP
(parent's frame pointer) at -8(SP), store them at framesize-8(SP)
and framesize-16(SP), respectively.
We push and pop data onto the stack such that we're never accessing
anything below SP.
The prolog/epilog lengths are unchanged (3 insns for a typical prolog,
2 for a typical epilog).
We use 8 bytes more per frame.
Typical prologue:
STP.W (FP, LR), -16(SP)
MOVD SP, FP
SUB $C, SP
Typical epilogue:
ADD $C, SP
LDP.P 16(SP), (FP, LR)
RET
The previous word where we stored LR, at 0(SP), is now unused.
We could repurpose that slot for storing a local variable.
The new prolog and epilog instructions are recognized by libunwind,
so pc-sampling tools like perf should now be accurate. (TODO: except
maybe after the first RET instruction? Have to look into that.)
Update #73753 (fixes, for arm64)
Update #57302 (Quim thinks this will help on that issue)
Change-Id: I4800036a9a9a08aaaf35d9f99de79a36cf37ebb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/674615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
this will be subsumed by pending changes in local slice
representation, however this was easy and works well.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I5b6eb10d257f04f906be7a8a6f2b6833992a39e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/704876
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708866
Reviewed-by: David Chase <drchase@google.com>
|
|
Currently, we remove stores to local variables that are not read.
We don't do that for arguments. But arguments and locals are
essentially the same. Arguments are passed by value, and are not
expected to be read in the caller's frame. So we can remove the
writes to them as well. One exception is the cgo_unsafe_arg
directive, which makes all the arguments effectively address-taken.
cgo_unsafe_arg implies ABI0, so we just skip ABI0 functions'
arguments.
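A source-level sketch of the kind of store in question follows. The
type big and the function first are hypothetical and not taken from
this CL, and whether the compiler removes this particular store depends
on the toolchain version. The sketch only shows why removing it is
safe: the callee writes to its own copy, which the caller never sees.

```go
package main

import "fmt"

// big is an aggregate argument type used only for this sketch.
type big struct{ v [16]int }

// first reads b.v[0] and then writes b.v[1]. b is a by-value copy of
// the caller's value, and b.v[1] is never read after the write, so the
// write has no observable effect.
func first(b big) int {
	r := b.v[0]
	b.v[1] = 42 // never read again: a candidate dead store
	return r
}

func main() {
	var b big
	b.v[0] = 7
	fmt.Println(first(b), b.v[1]) // prints 7 0: the caller's copy is unchanged
}
```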
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I8999fc50da6a87f22c1ec23e9a0c15483b6f7df8
Reviewed-on: https://go-review.googlesource.com/c/go/+/705815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708865
|
|
This CL fixes a condition for the previous fix CL 704056.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk. Test is
SIMD specific so not included for now.
Change-Id: I1f1f8c6f72870403cb3dff14755c43385dc0c933
Reviewed-on: https://go-review.googlesource.com/c/go/+/705499
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708864
Reviewed-by: David Chase <drchase@google.com>
|
|
the example comes up in chunked reslicing, e.g. A[i:] where i
has a relationship with len(A)-K.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: Ib97dede6cfc7bbbd27b4f384988f741760686604
Reviewed-on: https://go-review.googlesource.com/c/go/+/704875
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708863
Reviewed-by: David Chase <drchase@google.com>
|
|
this helps SIMD, but also helps plain old Go
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: Idcdacd54b6776f5c32b497bc94485052611cfa8d
Reviewed-on: https://go-review.googlesource.com/c/go/+/704756
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708862
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL fixes an issue raised by contributor dominikh@.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk. Test is
SIMD specific so not included for now.
Change-Id: I941b330a6ba6f6c120c69951ddd24933f2f0b3ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/704056
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708861
|
|
Currently, when shuffling registers, if we need to spill a
register, we always create a spill slot of type int64. The type
doesn't actually matter, as long as it is wide enough to hold the
registers. This is no longer true with SIMD registers, which could
be wider than an int64. Create the slot with the proper type
instead.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk. Test is
SIMD specific so not included for now.
Change-Id: I85c82e2532001bfdefe98c9446f2dd18583d49b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/704055
TryBot-Bypass: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708860
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For chunked iterations (useful for, but not exclusive to,
SIMD calculations) it is common to see the combination of
```
for ; i <= len(m)-4; i += 4 {
```
and
```
r0, r1, r2, r3 := m[i], m[i+1], m[i+2], m[i+3]
```
Prove did not handle the case of len-offset1 vs index+offset2
checking, but this change fixes this. There may be other
similar cases yet to handle -- this worked for the chunked
loops for simd, as well as a handful in std.
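Putting the two fragments together gives a complete chunked loop. The
sketch below is illustrative only; the function name sum4 and the tail
loop are assumptions, not code from this CL, and whether all four
bounds checks are removed depends on the toolchain version.

```go
package main

import "fmt"

// sum4 sums m four elements at a time, then handles the remaining tail.
func sum4(m []int) int {
	total := 0
	i := 0
	// The loop condition gives i <= len(m)-4, so i+3 <= len(m)-1 and
	// each of m[i], m[i+1], m[i+2], m[i+3] is in range.
	for ; i <= len(m)-4; i += 4 {
		r0, r1, r2, r3 := m[i], m[i+1], m[i+2], m[i+3]
		total += r0 + r1 + r2 + r3
	}
	// Tail: fewer than four elements remain.
	for ; i < len(m); i++ {
		total += m[i]
	}
	return total
}

func main() {
	fmt.Println(sum4([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10})) // prints 55
}
```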
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I3785df83028d517e5e5763206653b34b2befd3d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/700696
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708859
Reviewed-by: David Chase <drchase@google.com>
|
|
This might have been something that prove could be educated
into figuring out, but this also works, and it also helps
prove downstream.
Adjusted the prove test, because this change moved a message.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I5eabe639eff5db9cd9766a6a8666fdb4973829cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/697715
Commit-Queue: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708858
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
This CL implements the check for a rematerializable value's output
regspec at its rematerialization site. It has some potential problems;
please see the TODO in regalloc.go.
Fixes #70451.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: Ib624b967031776851136554719e939e9bf116b7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/695315
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708857
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This makes the front-end a little bit less temp-happy
when instrumenting, which repairs the "is it a constant?"
test in the simd intrinsic conversion which is otherwise
broken by race detection.
Also, this will perhaps be better code.
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I84b7a45b7bff62bb2c9f9662466b50858d288645
Reviewed-on: https://go-review.googlesource.com/c/go/+/685637
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708856
|
|
The compiler has logic to print different messages on internal
compiler error depending on whether this is a released version of
Go. It hides the panic stack trace if it is a released version. It
does this by checking whether the version has a "go" prefix.
This includes all the released versions. However, for a non-
released build, if there is no explicit version set, cmd/dist now
sets the toolchain version as go1.X-devel_XXX, which makes it be
treated as a released compiler, and causes the stack trace to be
hidden. Change the logic to not match a devel compiler as a
released compiler.
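The intended behavior can be sketched as a small predicate. The helper
isReleaseVersion and the exact condition are assumptions made for
illustration; the actual check in the compiler may be written
differently.

```go
package main

import (
	"fmt"
	"strings"
)

// isReleaseVersion is a hypothetical helper, not the compiler's code.
// It treats a version as released only if it has a "go" prefix and
// does not look like a development build such as go1.X-devel_XXX.
func isReleaseVersion(version string) bool {
	return strings.HasPrefix(version, "go") && !strings.Contains(version, "devel")
}

func main() {
	for _, v := range []string{"go1.25.1", "go1.26-devel_abc123", "devel +abc123"} {
		fmt.Printf("%-22s released=%v\n", v, isReleaseVersion(v))
	}
}
```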
Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.
Change-Id: I5d3b2101527212f825b6e4000b36030c4f83870b
Reviewed-on: https://go-review.googlesource.com/c/go/+/682975
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/708855
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
this is partly to ensure that emulations get inlined
Change-Id: I14f1a591081a4c39b61e48957a1474217ed0a399
Reviewed-on: https://go-review.googlesource.com/c/go/+/705975
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Conflicts:
- src/internal/goexperiment/flags.go
- src/runtime/export_test.go
Merge List:
+ 2025-10-03 adce7f196e cmd/link: support .def file with MSVC clang toolchain
+ 2025-10-03 d5b950399d cmd/cgo: fix unaligned arguments typedmemmove crash on iOS
+ 2025-10-02 53845004d6 net/http/httputil: deprecate ReverseProxy.Director
+ 2025-10-02 bbdff9e8e1 net/http: update bundled x/net/http2 and delete obsolete http2inTests
+ 2025-10-02 4008e07080 io/fs: move path name documentation up to the package doc comment
+ 2025-10-02 0e4e2e6832 runtime: skip TestGoroutineLeakProfile under mayMoreStackPreempt
+ 2025-10-02 f03c392295 runtime: fix aix/ppc64 library initialization
+ 2025-10-02 707454b41f cmd/go: update `go help mod edit` with the tool and ignore sections
+ 2025-10-02 8c68a1c1ab runtime,net/http/pprof: goroutine leak detection by using the garbage collector
+ 2025-10-02 84db201ae1 cmd/compile: propagate len([]T{}) to make builtin to allow stack allocation
+ 2025-10-02 5799c139a7 crypto/tls: rm marshalEncryptedClientHelloConfigList dead code
+ 2025-10-01 633dd1d475 encoding/json: fix Decoder.InputOffset regression in goexperiment.jsonv2
+ 2025-10-01 8ad27fb656 doc/go_spec.html: update date
+ 2025-10-01 3f451f2c54 testing/synctest: fix inverted test failure message in TestContextAfterFunc
+ 2025-10-01 be0fed8a5f cmd/go/testdata/script/test_fuzz_fuzztime.txt: disable
+ 2025-09-30 eb1c7f6e69 runtime: move loong64 library entry point to os-agnostic file
+ 2025-09-30 c9257151e5 runtime: unify ppc64/ppc64le library entry point
+ 2025-09-30 4ff8a457db test/codegen: codify handling of floating point constants on arm64
+ 2025-09-30 fcb893fc4b cmd/compile/internal/ssa: remove redundant "type:" prefix check
+ 2025-09-30 19cc1022ba mime: reduce allocs incurred by ParseMediaType
+ 2025-09-30 08afc50bea mime: extend "builtinTypes" to include a more complete list of common types
+ 2025-09-30 97da068774 cmd/compile: eliminate nil checks on .dict arg
+ 2025-09-30 300d9d2714 runtime: initialise debug settings much earlier in startup process
+ 2025-09-30 a846bb0aa5 errors: add AsType
+ 2025-09-30 7c8166d02d cmd/link/internal/arm64: support Mach-O ARM64_RELOC_SUBTRACTOR in internal linking
+ 2025-09-30 6e95748335 cmd/link/internal/arm64: support Mach-O ARM64_RELOC_POINTER_TO_GOT in internal linking
+ 2025-09-30 742f92063e cmd/compile, runtime: always enable Wasm signext and satconv features
+ 2025-09-30 db10db6be3 internal/poll: remove operation fields from FD
+ 2025-09-29 75c87df58e internal/poll: pass the I/O mode instead of an overlapped object in execIO
+ 2025-09-29 fc88e18b4a crypto/internal/fips140/entropy: add CPU jitter-based entropy source
+ 2025-09-29 db4fade759 crypto/internal/fips140/mlkem: make CAST conditional
+ 2025-09-29 db3cb3fd9a runtime: correct reference to getStackMap in comment
+ 2025-09-29 690fc2fb05 internal/poll: remove buf field from operation
+ 2025-09-29 eaf2345256 cmd/link: use a .def file to mark exported symbols on Windows
+ 2025-09-29 4b77733565 internal/syscall/windows: regenerate GetFileSizeEx
+ 2025-09-29 4e9006a716 crypto/tls: quote protocols in ALPN error message
+ 2025-09-29 047c2ab841 cmd/link: don't pass -Wl,-S on Solaris
+ 2025-09-29 ae8eba071b cmd/link: use correct length for pcln.cutab
+ 2025-09-29 fe3ba74b9e cmd/link: skip TestFlagW on platforms without DWARF symbol table
+ 2025-09-29 d42d56b764 encoding/xml: make use of reflect.TypeAssert
+ 2025-09-29 6d51f93257 runtime: jump instead of branch in netbsd/arm64 entry point
+ 2025-09-28 5500cbf0e4 debug/elf: prevent offset overflow
+ 2025-09-27 34e67623a8 all: fix typos
+ 2025-09-27 af6999e60d cmd/compile: implement jump table on loong64
+ 2025-09-26 63cd912083 os/user: simplify go:build
+ 2025-09-26 53009b26dd runtime: use a smaller arena size on Wasm
+ 2025-09-26 3a5df9d2b2 net/http: add HTTP2Config.StrictMaxConcurrentRequests
+ 2025-09-26 16be34df02 net/http: add more tests of transport connection pool
+ 2025-09-26 3e4540b49d os/user: use getgrouplist on illumos && cgo
+ 2025-09-26 15fbe3480b internal/poll: simplify WriteMsg and ReadMsg on Windows
+ 2025-09-26 16ae11a9e1 runtime: move TestReadMetricsSched to testprog
+ 2025-09-26 459f3a3adc cmd/link: don't pass -Wl,-S on AIX
+ 2025-09-26 4631a2d3c6 cmd/link: skip TestFlagW on AIX
+ 2025-09-26 0f31d742cd cmd/compile: fix ICE with new(<untyped expr>)
+ 2025-09-26 7d7cd6e07b internal/poll: don't call SetFilePointerEx in Seek for overlapped handles
+ 2025-09-26 41cba31e66 mime/multipart: percent-encode CR and LF in header values to avoid CRLF injection
+ 2025-09-26 dd1d597c3a Revert "cmd/internal/obj/loong64: use the MOVVP instruction to optimize prologue"
+ 2025-09-26 45d6bc76af runtime: unify arm64 entry point code
+ 2025-09-25 fdea7da3e6 runtime: use common library entry point on windows amd64/386
+ 2025-09-25 e8a4f508d1 lib/fips140: re-seal v1.0.0
+ 2025-09-25 9b7a328089 crypto/internal/fips140: remove key import PCTs, make keygen PCTs fatal
+ 2025-09-25 7f9ab7203f crypto/internal/fips140: update frozen module version to "v1.0.0"
+ 2025-09-25 fb5719cbda crypto/internal/fips140/ecdsa: make TestingOnlyNewDRBG generic
+ 2025-09-25 56067e31f2 std: remove unused declarations
Change-Id: Iecb28fd62c69fbed59da557f46d31bae55889e2c
|
|
Updates #75620
Change-Id: I6a6a6964af4512e30eb4806e1dc7b0fd0835744f
Reviewed-on: https://go-review.googlesource.com/c/go/+/707255
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Remove redundant "type:" prefix check on symbol names in isFixedLoad,
also refactor some duplicate code into methods.
Change-Id: I8358422596eea8c39d1a30a554bd0aae8b570038
Reviewed-on: https://go-review.googlesource.com/c/go/+/701275
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
The first arg of a generic function is the dictionary. This dictionary
is never nil, but it gets a nil check because the dict arg is treated as
a slice during construction.
cmp.Compare[go.shape.int] was:
00006 (+41) TESTB AX, (AX)
00007 (+52) CMPQ CX, BX
00008 (52) JGT 14
00009 (+55) JGE 12
00010 (+56) MOVL $1, AX
00011 (56) RET
00012 (+58) XORL AX, AX
00013 (58) RET
00014 (+53) MOVQ $-1, AX
00015 (53) RET
Note how the function begins with a TESTB that loads the dict to perform
the nil check.
This CL eliminates that nil check.
For most generic functions, this doesn't matter too much, but generic
functions that never actually use the dictionary (like cmp.Compare) are
not uncommon, so I suspect this might help in hot code to avoid
repeatedly touching the dictionary in memory, and in cases where the
generic function is not inlined (inlining would drop the dict).
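
As a rough illustration of the kind of function this affects, here is a minimal sketch of a generic function whose body never needs type-specific information from its dictionary. The names integer and compare are hypothetical, and the body is simplified relative to cmp.Compare.

```go
package main

import "fmt"

// integer is a hypothetical constraint used only for this sketch.
type integer interface {
	~int | ~int32 | ~int64
}

// compare has the same shape as an integer instantiation of cmp.Compare.
// Its body needs nothing from the dictionary argument, so a nil check
// that loads from the dictionary on entry is pure overhead here.
func compare[T integer](x, y T) int {
	if x < y {
		return -1
	}
	if x > y {
		return +1
	}
	return 0
}

func main() {
	fmt.Println(compare(1, 2), compare(int64(5), int64(5)), compare(int32(9), int32(3)))
}
```

Building such a program with go build -gcflags=-S prints the generated assembly, which is one way to look for the leading TESTB on the shape-instantiated function; the exact output depends on the toolchain version and on inlining decisions.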
compilecmp shows these changes (deduped):
cmp.Compare[go.shape.float64] 73 -> 72 (-1.37%)
cmp.Compare[go.shape.int] 26 -> 24 (-7.69%)
cmp.Compare[go.shape.int32] 25 -> 23 (-8.00%)
cmp.Compare[go.shape.int64] 26 -> 24 (-7.69%)
cmp.Compare[go.shape.string] 142 -> 141 (-0.70%)
cmp.Compare[go.shape.uint16] 26 -> 24 (-7.69%)
cmp.Compare[go.shape.uint] 26 -> 24 (-7.69%)
cmp.Compare[go.shape.uint32] 25 -> 23 (-8.00%)
cmp.Compare[go.shape.uint64] 26 -> 24 (-7.69%)
cmp.Compare[go.shape.uint8] 25 -> 23 (-8.00%)
cmp.Compare[go.shape.uintptr] 26 -> 24 (-7.69%)
cmp.Less[go.shape.float64] 35 -> 34 (-2.86%)
cmp.Less[go.shape.int32] 8 -> 6 (-25.00%)
cmp.Less[go.shape.int64] 9 -> 7 (-22.22%)
cmp.Less[go.shape.int] 9 -> 7 (-22.22%)
cmp.Less[go.shape.string] 112 -> 110 (-1.79%)
cmp.Less[go.shape.uint16] 9 -> 7 (-22.22%)
cmp.Less[go.shape.uint32] 8 -> 6 (-25.00%)
cmp.Less[go.shape.uint64] 9 -> 7 (-22.22%)
internal/synctest.Associate[go.shape.struct 114 -> 113 (-0.88%)
internal/trace.(*dataTable[go.shape.uint64,go.shape.string]).insert 805 -> 791 (-1.74%)
internal/trace.(*dataTable[go.shape.uint64,go.shape.struct 858 -> 852 (-0.70%)
main.(*gState[go.shape.int64]).stop 2111 -> 2085 (-1.23%)
main.(*gState[go.shape.int64]).unblock 941 -> 923 (-1.91%)
runtime.fmax[go.shape.float32] 85 -> 83 (-2.35%)
runtime.fmax[go.shape.float64] 89 -> 87 (-2.25%)
runtime.fmin[go.shape.float32] 85 -> 83 (-2.35%)
runtime.fmin[go.shape.float64] 89 -> 87 (-2.25%)
slices.BinarySearch[go.shape.[]string,go.shape.string] 346 -> 337 (-2.60%)
slices.Concat[go.shape.[]uint8,go.shape.uint8] 462 -> 453 (-1.95%)
slices.ContainsFunc[go.shape.[]*cmd/vendor/github.com/google/pprof/profile.Sample,go.shape.*uint8] 170 -> 169 (-0.59%)
slices.ContainsFunc[go.shape.[]*debug/dwarf.StructField,go.shape.*uint8] 170 -> 169 (-0.59%)
slices.ContainsFunc[go.shape.[]*go/ast.Field,go.shape.*uint8] 170 -> 169 (-0.59%)
slices.ContainsFunc[go.shape.[]string,go.shape.string] 186 -> 181 (-2.69%)
slices.Contains[go.shape.[]*cmd/compile/internal/syntax.BranchStmt,go.shape.*cmd/compile/internal/syntax.BranchStmt] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]cmd/compile/internal/syntax.Type,go.shape.interface 223 -> 219 (-1.79%)
slices.Contains[go.shape.[]crypto/tls.CurveID,go.shape.uint16] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]crypto/tls.SignatureScheme,go.shape.uint16] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]*go/ast.BranchStmt,go.shape.*go/ast.BranchStmt] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]go/types.Type,go.shape.interface 223 -> 219 (-1.79%)
slices.Contains[go.shape.[]int,go.shape.int] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]string,go.shape.string] 223 -> 219 (-1.79%)
slices.Contains[go.shape.[]uint16,go.shape.uint16] 44 -> 42 (-4.55%)
slices.Contains[go.shape.[]uint8,go.shape.uint8] 44 -> 42 (-4.55%)
slices.Insert[go.shape.[]string,go.shape.string] 1189 -> 1170 (-1.60%)
slices.medianCmpFunc[go.shape.struct 1118 -> 1113 (-0.45%)
slices.medianCmpFunc[go.shape.struct 1214 -> 1209 (-0.41%)
slices.medianCmpFunc[go.shape.struct 889 -> 887 (-0.22%)
slices.medianCmpFunc[go.shape.struct 901 -> 874 (-3.00%)
slices.order2Ordered[go.shape.float64] 89 -> 87 (-2.25%)
slices.order2Ordered[go.shape.uint16] 75 -> 70 (-6.67%)
slices.partialInsertionSortOrdered[go.shape.string] 1115 -> 1110 (-0.45%)
slices.partialInsertionSortOrdered[go.shape.uint16] 358 -> 352 (-1.68%)
slices.partitionEqualOrdered[go.shape.int] 208 -> 203 (-2.40%)
slices.partitionEqualOrdered[go.shape.int32] 208 -> 198 (-4.81%)
slices.partitionEqualOrdered[go.shape.int64] 208 -> 203 (-2.40%)
slices.partitionEqualOrdered[go.shape.uint32] 208 -> 198 (-4.81%)
slices.partitionEqualOrdered[go.shape.uint64] 208 -> 203 (-2.40%)
slices.partitionOrdered[go.shape.float64] 538 -> 533 (-0.93%)
slices.partitionOrdered[go.shape.int] 437 -> 427 (-2.29%)
slices.partitionOrdered[go.shape.int64] 437 -> 427 (-2.29%)
slices.partitionOrdered[go.shape.uint16] 447 -> 442 (-1.12%)
slices.partitionOrdered[go.shape.uint64] 437 -> 427 (-2.29%)
slices.rotateCmpFunc[go.shape.struct 1045 -> 1029 (-1.53%)
slices.rotateCmpFunc[go.shape.struct 1205 -> 1163 (-3.49%)
slices.rotateCmpFunc[go.shape.struct 1226 -> 1176 (-4.08%)
slices.rotateCmpFunc[go.shape.struct 1322 -> 1272 (-3.78%)
slices.rotateCmpFunc[go.shape.struct 1419 -> 1400 (-1.34%)
slices.rotateCmpFunc[go.shape.*uint8] 549 -> 538 (-2.00%)
slices.rotateLeft[go.shape.string] 603 -> 588 (-2.49%)
slices.rotateLeft[go.shape.uint8] 255 -> 250 (-1.96%)
slices.siftDownOrdered[go.shape.int] 181 -> 171 (-5.52%)
slices.siftDownOrdered[go.shape.int32] 181 -> 171 (-5.52%)
slices.siftDownOrdered[go.shape.int64] 181 -> 171 (-5.52%)
slices.siftDownOrdered[go.shape.string] 614 -> 592 (-3.58%)
slices.siftDownOrdered[go.shape.uint32] 181 -> 171 (-5.52%)
slices.siftDownOrdered[go.shape.uint64] 181 -> 171 (-5.52%)
time.parseRFC3339[go.shape.string] 1774 -> 1758 (-0.90%)
unique.(*canonMap[go.shape.struct 280 -> 276 (-1.43%)
unique.clone[go.shape.struct 311 -> 293 (-5.79%)
weak.Make[go.shape.6880e4598856efac32416085c0172278cf0fb9e5050ce6518bd9b7f7d1662440] 136 -> 134 (-1.47%)
weak.Make[go.shape.struct 136 -> 134 (-1.47%)
weak.Make[go.shape.uint8] 136 -> 134 (-1.47%)
Change-Id: I43dcea5f2aa37372f773e5edc6a2ef1dee0a8db7
Reviewed-on: https://go-review.googlesource.com/c/go/+/706655
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
AVXAES is a composite feature set; Intel listed it as "AVXAES" in
the XED data instead of separating the features.
The tests will be in the next CL.
Change-Id: I89c97261f2228b2fdafb48f63e82ef6239bdd5ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/706055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
These features have been standardized since at least Wasm 2.0.
Always enable them.
The corresponding GOWASM settings are now no-ops.
Change-Id: I0e59f21696a69a4e289127988aad629a720b002b
Reviewed-on: https://go-review.googlesource.com/c/go/+/707855
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Following CL 357330, use jump tables on Loong64.
goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
│ old │ new │
│ sec/op │ sec/op vs base │
Switch8Predictable 2.352n ± 0% 2.101n ± 0% -10.65% (p=0.000 n=10)
Switch8Unpredictable 11.99n ± 0% 10.25n ± 0% -14.51% (p=0.000 n=10)
Switch32Predictable 3.153n ± 0% 1.887n ± 1% -40.14% (p=0.000 n=10)
Switch32Unpredictable 12.47n ± 0% 10.22n ± 0% -18.00% (p=0.000 n=10)
SwitchStringPredictable 3.162n ± 0% 3.352n ± 0% +6.01% (p=0.000 n=10)
SwitchStringUnpredictable 14.70n ± 0% 13.31n ± 0% -9.46% (p=0.000 n=10)
SwitchTypePredictable 3.702n ± 0% 2.201n ± 0% -40.55% (p=0.000 n=10)
SwitchTypeUnpredictable 16.18n ± 0% 14.48n ± 0% -10.51% (p=0.000 n=10)
SwitchInterfaceTypePredictable 7.654n ± 0% 9.680n ± 0% +26.47% (p=0.000 n=10)
SwitchInterfaceTypeUnpredictable 22.04n ± 0% 22.44n ± 0% +1.81% (p=0.000 n=10)
geomean 7.441n 6.469n -13.07%
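
As a sketch of the kind of code the benchmarks above exercise: a switch over a dense range of integer constants is the typical candidate for a jump table. The function classify and its return values are hypothetical; whether a given switch is actually lowered to a jump table depends on the compiler's own heuristics.

```go
package main

import "fmt"

// classify switches over eight consecutive integer cases. Without a jump
// table the compiler emits a sequence of compares and branches; with one
// it can index a table of branch targets using x directly.
func classify(x int) int {
	switch x {
	case 0:
		return 3
	case 1:
		return 14
	case 2:
		return 15
	case 3:
		return 92
	case 4:
		return 65
	case 5:
		return 35
	case 6:
		return 89
	case 7:
		return 79
	default:
		return -1
	}
}

func main() {
	for _, x := range []int{0, 3, 7, 100} {
		fmt.Println(x, classify(x))
	}
}
```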
Change-Id: Id6f30fa73349c60fac17670084daee56973a955f
Reviewed-on: https://go-review.googlesource.com/c/go/+/705396
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
|
|
These are in the same style as the 32-bit select-from-pair,
including the grouped variant. This does not quite capture
the full awesome power of VSHUFPD, which can select
differently in each group; that will be some other,
more complex method.
Change-Id: I807ddd7c1256103b5b0d7c5d60bd70b185e3aaf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/705695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Using this name until a better one appears:
x.Select128FromPair(3, 2, y)
Includes tests for the constant and variable cases.
Checks for unexpected immediates (using the zeroing flag,
which is not supported for this intrinsic) and panics.
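
The intended behavior can be modeled in scalar Go as follows. This is a sketch under assumptions, not the intrinsic itself: vec256 and select128FromPair are hypothetical names, and the index convention (0 and 1 pick the low and high 128-bit halves of x, 2 and 3 pick those of y) is assumed rather than taken from the API documentation.

```go
package main

import "fmt"

// vec256 is a hypothetical stand-in for a 256-bit vector, viewed as four
// 64-bit lanes (two lanes per 128-bit half).
type vec256 [4]uint64

// select128FromPair builds a result whose low half is the 128-bit half
// numbered lo and whose high half is the half numbered hi, assuming the
// numbering x.low=0, x.high=1, y.low=2, y.high=3.
func select128FromPair(x vec256, lo, hi uint8, y vec256) vec256 {
	if lo > 3 || hi > 3 {
		panic("select128FromPair: half index out of range")
	}
	halves := [4][2]uint64{
		{x[0], x[1]}, {x[2], x[3]},
		{y[0], y[1]}, {y[2], y[3]},
	}
	l, h := halves[lo], halves[hi]
	return vec256{l[0], l[1], h[0], h[1]}
}

func main() {
	x := vec256{0, 1, 2, 3}
	y := vec256{4, 5, 6, 7}
	// Under the assumed numbering this swaps the two halves of y.
	fmt.Println(select128FromPair(x, 3, 2, y))
}
```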
Change-Id: I9249475d6572968c127b4ee9e00328d717c07578
Reviewed-on: https://go-review.googlesource.com/c/go/+/705496
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|