| Age | Commit message (Collapse) | Author |
|
Per discussion on issue #75885, a type parameter on the RHS of an alias
declaration must not be declared in the same declaration (but it may be
declared by an enclosing function). This relaxes the spec slightly and
allows for (pre-existing) test cases.
Add a corresponding check to the type checker (there was no check for
type parameters on the RHS of alias declarations at all, before).
Fixes #75884.
Fixes #75885.
Change-Id: I1e5675978e6423d626c068829d4bf5e90035ea82
Reviewed-on: https://go-review.googlesource.com/c/go/+/721820
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We can already stack allocate the backing store during append if the
resulting backing store doesn't escape. See CL 664299.
This CL enables us to often stack allocate the backing store during
append *even if* the result escapes. Typically, for code like:
func f(n int) []int {
var r []int
for i := range n {
r = append(r, i)
}
return r
}
the backing store for r escapes, but only by returning it.
Could we operate with r on the stack for most of its lifeime,
and only move it to the heap at the return point?
The current implementation of append will need to do an allocation
each time it calls growslice. This will happen on the 1st, 2nd, 4th,
8th, etc. append calls. The allocations done by all but the
last growslice call will then immediately be garbage.
We'd like to avoid doing some of those intermediate allocations
if possible. We rewrite the above code by introducing a move2heap
operation:
func f(n int) []int {
var r []int
for i := range n {
r = append(r, i)
}
r = move2heap(r)
return r
}
Using the move2heap runtime function, which does:
move2heap(r):
If r is already backed by heap storage, return r.
Otherwise, copy r to the heap and return the copy.
Now we can treat the backing store of r allocated at the
append site as not escaping. Previous stack allocation
optimizations now apply, which can use a fixed-size
stack-allocated backing store for r when appending.
See the description in cmd/compile/internal/slice/slice.go
for how we ensure that this optimization is safe.
Change-Id: I81f36e58bade2241d07f67967d8d547fff5302b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/707755
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3A6000 @ 2500.00MHz
| old.txt | new.txt |
| sec/op | sec/op vs base |
ClearFat256 6.406n ± 0% 3.329n ± 1% -48.03% (p=0.000 n=10)
ClearFat512 12.810n ± 0% 7.607n ± 0% -40.62% (p=0.000 n=10)
ClearFat1024 25.62n ± 0% 14.01n ± 0% -45.32% (p=0.000 n=10)
ClearFat1032 26.02n ± 0% 14.28n ± 0% -45.14% (p=0.000 n=10)
ClearFat1040 26.02n ± 0% 14.41n ± 0% -44.62% (p=0.000 n=10)
MemclrKnownSize192 4.804n ± 0% 2.827n ± 0% -41.15% (p=0.000 n=10)
MemclrKnownSize248 6.561n ± 0% 4.371n ± 0% -33.38% (p=0.000 n=10)
MemclrKnownSize256 6.406n ± 0% 3.335n ± 0% -47.94% (p=0.000 n=10)
geomean 11.41n 6.453n -43.45%
goos: linux
goarch: loong64
pkg: runtime
cpu: Loongson-3C5000 @ 2200.00MHz
| old.txt | new.txt |
| sec/op | sec/op vs base |
ClearFat256 14.570n ± 0% 7.284n ± 0% -50.01% (p=0.000 n=10)
ClearFat512 29.13n ± 0% 14.57n ± 0% -49.98% (p=0.000 n=10)
ClearFat1024 58.26n ± 0% 29.15n ± 0% -49.97% (p=0.000 n=10)
ClearFat1032 58.73n ± 0% 29.15n ± 0% -50.36% (p=0.000 n=10)
ClearFat1040 59.18n ± 0% 29.26n ± 0% -50.56% (p=0.000 n=10)
MemclrKnownSize192 10.930n ± 0% 5.466n ± 0% -49.99% (p=0.000 n=10)
MemclrKnownSize248 14.110n ± 0% 6.772n ± 0% -52.01% (p=0.000 n=10)
MemclrKnownSize256 14.570n ± 0% 7.285n ± 0% -50.00% (p=0.000 n=10)
geomean 25.75n 12.78n -50.36%
Change-Id: I88d7b6ae2f6fc3f095979f24fb83ff42a9d2d42e
Reviewed-on: https://go-review.googlesource.com/c/go/+/720940
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
CL 715840 deferred popping from the object path during handling of
grouped declaration statements, which leaves extra objects on the
path since this executes in a loop.
Surprisingly, no test exercised this. This change fixes this small
bug and adds a supporting test.
Fixes #76366
Change-Id: I7fc038b39d3871eea3e60855c46614b463bcfa4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/722060
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The continue used to make sense since I first wrote this patch with a loop,
for testing the commutativity of the add.
This was refactored to just try both but I forgot to fix the continue.
Change-Id: I91466a052d5d8ee7193084a71faf69bd27e36d2a
Reviewed-on: https://go-review.googlesource.com/c/go/+/721204
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
|
|
Change-Id: Icf5db366b311b5f88809dd07f22cf4bfdead516c
Reviewed-on: https://go-review.googlesource.com/c/go/+/721203
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
|
|
For this code:
func f() (int, int) {
return 0, 0
}
We currently generate on arm64:
MOVD ZR, R0
MOVD R0, R1
This CL changes that to
MOVD ZR, R0
MOVD ZR, R1
Probably not a big performance difference, but it makes the generated
code clearer.
A followup-ish CL from 633075 when the zero fixed register was exposed
to the register allocator.
Change-Id: I869a92817dcbbca46c900999fab538e76e10ed05
Reviewed-on: https://go-review.googlesource.com/c/go/+/721440
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Gets rid of an EOR $1 instruction.
Change-Id: Ib032b0cee9ac484329c978af9b1305446f8d5dac
Reviewed-on: https://go-review.googlesource.com/c/go/+/721501
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This CL addressed some API change decisions in the API audit.
Instead of exposing the Intel format, we hide the add part of the
instructions under the peephole, and rename the API as
DotProdQuadruple
Change-Id: I471c0a755174bc15dd83bdc0f757d6356b92d835
Reviewed-on: https://go-review.googlesource.com/c/go/+/721420
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL addressed some naming changes decided in API audit.
Before After
SHA1Msg1 SHA1Message1, Remove signed
SHA1Msg2 SHA1Message2, Remove signed
SHA1NextE SHA1NextE, Remove signed
SHA1Round4 SHA1FourRounds, Remove signed
SHA256Msg1 SHA256Message1, Remove signed
SHA256Msg2 SHA256Message2, Remove signed
SHA256Rounds2 SHA256TwoRounds, Remove signed
Change-Id: If2cead113f37a9044bc5c65e78fa9d124e318005
Reviewed-on: https://go-review.googlesource.com/c/go/+/721003
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
riscv64
Make use of compressed instructions on riscv64 - add a compress
pass to the end of the assembler, which replaces non-compressed
instructions with compressed alternatives if possible.
Provide a `compressinstructions` compiler and assembler debug
flag, such that the compression pass can be disabled via
`-asmflags=all=-d=compressinstructions=0` and
`-gcflags=all=-d=compressinstructions=0`. Note that this does
not prevent the explicit use of compressed instructions via
assembly.
Note that this does not make use of compressed control transfer
instructions - this will be implemented in later changes.
Reduces the text size of a hello world binary by ~121KB
and reduces the text size of the go binary on riscv64 by ~1.21MB
(between 8-10% in both cases).
Updates #71105
Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Change-Id: I24258353688554042c2a836deed4830cc673e985
Reviewed-on: https://go-review.googlesource.com/c/go/+/523478
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Gets rid of some sign extensions.
Change-Id: Ie67ef36b4ca1cd1a2cd9fa5d84578db553578a22
Reviewed-on: https://go-review.googlesource.com/c/go/+/721241
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This CL changed AESEncryptRound and AESDecryptRound to
AESEncryptOneRound and AESDecryptOneRound.
This CL also adds the 512-bit version of some AES instructions.
Change-Id: Ia851a008cce2145b1ff193a89e172862060a725d
Reviewed-on: https://go-review.googlesource.com/c/go/+/721280
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL named VPALIGNR ConcatShiftBytes[Grouped].
Change-Id: I46c6703085efb0613deefa512de9911b4fdf6bc4
Reviewed-on: https://go-review.googlesource.com/c/go/+/714440
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL fixed an error left by CL 718160.
Change-Id: I442ea59bc1ff0dda2914d1858dd5ebe93e2818dc
Reviewed-on: https://go-review.googlesource.com/c/go/+/720281
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For v = x-y:
if y >= 0 then v <= x
if y <= x then v >= 0
(With appropriate guards against overflow/underflow.)
Fixes #76304
Change-Id: I8f8f1254156c347fa97802bd057a8379676720ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/720740
Reviewed-by: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
flowLimit no longer needs to return whether it updated any information,
after CL 714920 which got rid of the iterate-until-done paradigm.
Change-Id: I0c5f592578ff27c27586c1f8b8a8d9071d94846d
Reviewed-on: https://go-review.googlesource.com/c/go/+/720720
Reviewed-by: Youlin Feng <fengyoulin@live.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
For #72850
Change-Id: I07e64f05c82a34b1dadb9a72e16f5045e68cbd24
Reviewed-on: https://go-review.googlesource.com/c/go/+/720642
Auto-Submit: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
If the struct is a bunch of 0-sized fields and one pointer field.
Merged revert-of-revert for 4 CLs.
original revert
681937 695016
693415 694996
693615 695015
694195 694995
Fixes #74092
Update #74888
Update #74908
Update #74935
(updated issues are bugs in the last attempt at this)
Change-Id: I32246d49b8bac3bb080972dc06ab432a5480d560
Reviewed-on: https://go-review.googlesource.com/c/go/+/714421
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
The rule
(SLTI [x] (ORI [y] _)) && y >= 0 && int64(y) >= int64(x) => (MOVDconst [0])
is incorrect as it only generates correct code if the unknown value
being compared is >= 0. If the unknown value is < 0 the rule will
incorrectly produce a constant value of 0, whereas the code optimized
away by the rule would have produced a value of 1.
A new test that causes the faulty rule to generate incorrect code
is also added to ensure that the error does not return.
Change-Id: I69224e0776596f1b9538acf9dacf9009d305f966
Reviewed-on: https://go-review.googlesource.com/c/go/+/720220
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
goos: linux
goarch: riscv64
pkg: cmd/compile/internal/test
cpu: Spacemit(R) X60
│ /root/mul.base.log │ /root/mul.new.log │
│ sec/op │ sec/op vs base │
MulNeg 6.426µ ± 0% 4.501µ ± 0% -29.96% (p=0.000 n=10)
Mul2Neg 9.000µ ± 0% 6.431µ ± 0% -28.54% (p=0.000 n=10)
Mul2 1.263µ ± 0% 1.263µ ± 0% ~ (p=1.000 n=10)
MulNeg2 1.577µ ± 0% 1.577µ ± 0% ~ (p=0.211 n=10)
geomean 3.276µ 2.756µ -15.89%
goos: linux
goarch: amd64
pkg: cmd/compile/internal/test
cpu: AMD EPYC 7532 32-Core Processor
│ /root/base │ /root/new │
│ sec/op │ sec/op vs base │
MulNeg 691.9n ± 1% 319.4n ± 0% -53.83% (p=0.000 n=10)
Mul2Neg 630.0n ± 0% 629.6n ± 0% -0.07% (p=0.000 n=10)
Mul2 438.1n ± 0% 438.1n ± 0% ~ (p=0.728 n=10)
MulNeg2 439.3n ± 0% 439.4n ± 0% ~ (p=0.656 n=10)
geomean 538.2n 443.6n -17.58%
Change-Id: Ice8e6c8d1e8e3009ba8a0b1b689205174e199019
Reviewed-on: https://go-review.googlesource.com/c/go/+/720180
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Optimize comparisons with constants that only differ by 1 bit (i.e.
a power of 2). For example:
x == 4 || x == 6 -> x|2 == 6
x != 1 && x != 5 -> x|4 != 5
Change-Id: Ic61719e5118446d21cf15652d9da22f7d95b2a15
Reviewed-on: https://go-review.googlesource.com/c/go/+/719420
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This change shouldn't have any impact on codegen. It removes some
unnecessary int8 and int64 casts from the riscv64 rewrite rules.
It also removes explicit typing where the types already match:
`(OldOp <t>) => (NewOp <t>)` is the same as `(OldOp) => (NewOp)`.
Change-Id: Ic02b65da8f548c8b9ad9ccb6627a03b7bf6f050f
Reviewed-on: https://go-review.googlesource.com/c/go/+/719220
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Commit-Queue: Junyang Shao <shaojunyang@google.com>
|
|
The type checker assigns types to objects. An object can be in
1 of 3 states, which are mapped to colors. This is stored as a
field on the object.
- white : type not known, awaiting processing
- grey : type pending, in processing
- black : type known, done processing
With the addition of Checker.objPathIdx, which maps objects to
a path index, presence in the map could signal grey coloring.
White and black coloring can be signaled by presence of
object.typ (a simple nil check).
This change removes the object.color field and its associated
methods, replacing it with an equivalent use of Checker.objPathIdx.
Checker.objPathIdx is integrated into the existing push and pop
methods, while slightly simplifying their signatures.
Note that the concept of object coloring remains the same - we
are merely changing and simplifying the mechanism which represents
the colors.
Change-Id: I91fb5e9a59dcb34c08ffc5b4ebc3f20a400094b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/715840
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Mark Freeman <markfreeman@google.com>
|
|
Conflicts:
- src/cmd/compile/internal/ir/symtab.go
- src/cmd/compile/internal/ssa/prove.go
- src/cmd/compile/internal/ssa/rewriteAMD64.go
- src/cmd/compile/internal/ssagen/intrinsics.go
- src/cmd/compile/internal/typecheck/builtin.go
- src/internal/buildcfg/exp.go
- src/internal/strconv/ftoa.go
- test/codegen/stack.go
Manually resolved some conflicts:
- Use internal/strconv for simd.String, remove internal/ftoa
- prove.go is just copied from the one on the main branch. We
have cherry-picked the changes to prove.go to main branch, so
our copy is identical to an old version of the one on the main
branch. There are CLs landed after our cherry-picks. Just copy
it over to adopt the new code.
Merge List:
+ 2025-11-13 57362e9814 go/types, types2: check for direct cycles as a separate phase
+ 2025-11-13 099e0027bd cmd/go/internal/modfetch: consolidate global vars
+ 2025-11-13 028375323f cmd/go/internal/modfetch/codehost: fix flaky TestReadZip
+ 2025-11-13 4ebf295b0b runtime: prefer to restart Ps on the same M after STW
+ 2025-11-13 625d8e9b9c runtime/pprof: fix goroutine leak profile tests for noopt
+ 2025-11-13 4684a26c26 spec: remove cycle restriction for type parameters
+ 2025-11-13 0f9c8fb29d cmd/asm,cmd/internal/obj/riscv: add support for riscv compressed instructions
+ 2025-11-13 a15d036ce2 cmd/internal/obj/riscv: implement better bit pattern encoding
+ 2025-11-12 abb241a789 cmd/internal/obj/loong64: add {,X}VS{ADD,SUB}.{B/H/W/V}{,U} instructions support
+ 2025-11-12 0929d21978 cmd/go: keep objects alive while stopping cleanups
+ 2025-11-12 f03d06ec1a runtime: fix list test memory management for mayMoreStack
+ 2025-11-12 48127f656b crypto/internal/fips140/sha3: remove outdated TODO
+ 2025-11-12 c3d1d42764 sync/atomic: amend comments for Value.{Swap,CompareAndSwap}
+ 2025-11-12 e0807ba470 cmd/compile: don't clear ptrmask in fillptrmask
+ 2025-11-12 66318d2b4b internal/abi: correctly describe result in Name.Name doc comment
+ 2025-11-12 34aef89366 cmd/compile: use FCLASSD for subnormal checks on riscv64
+ 2025-11-12 0c28789bd7 net/url: disallow raw IPv6 addresses in host
+ 2025-11-12 4e761b9a18 cmd/compile: optimize liveness in stackalloc
+ 2025-11-12 956909ff84 crypto/x509: move BetterTLS suite from crypto/tls
+ 2025-11-12 6525f46707 cmd/link: change shdr and phdr from arrays to slices
+ 2025-11-12 d3aeba1670 runtime: switch p.gcFractionalMarkTime to atomic.Int64
+ 2025-11-12 8873e8bea2 runtime,runtime/pprof: clean up goroutine leak profile writing
+ 2025-11-12 b8b84b789e cmd/go: clarify the -o testflag is only for copying the binary
+ 2025-11-12 c761b26b56 mime: parse media types that contain braces
+ 2025-11-12 65858a146e os/exec: include Cmd.Start in the list of methods that run Cmd
+ 2025-11-11 4bfc3a9d14 std,cmd: go fix -any std cmd
+ 2025-11-11 2263d4aabd runtime: doubly-linked sched.midle list
+ 2025-11-11 046dce0e54 runtime: use new list type for spanSPMCs
+ 2025-11-11 5f11275457 runtime: reusable intrusive doubly-linked list
+ 2025-11-11 951cf0501b internal/trace/testtrace: fix flag name typos
+ 2025-11-11 2750f95291 cmd/go: implement accurate pseudo-versions for Mercurial
+ 2025-11-11 b709a3e8b4 cmd/go/internal/vcweb: cache hg servers
+ 2025-11-11 426ef30ecf cmd/go: implement -reuse for Mercurial repos
+ 2025-11-10 5241d114f5 spec: more precise prose for special case of append
+ 2025-11-10 cdf64106f6 go/types, types2: first argument to append must never be be nil
+ 2025-11-10 a0eb4548cf .gitignore: ignore go test artifacts
+ 2025-11-10 bf58e7845e internal/trace: add "command" to convert text traces to raw
+ 2025-11-10 052c192a4c runtime: fix lock rank for work.spanSPMCs.lock
+ 2025-11-10 bc5ffe5c79 internal/runtime/sys,math/bits: eliminate bounds checks on len8tab
+ 2025-11-10 32f8d6486f runtime: document that tracefpunwindoff applies to some profilers
+ 2025-11-10 1c1c1942ba cmd/go: remove redundant AVX regex in security flag checks
+ 2025-11-10 3b3d6b9e5d cmd/internal/obj/arm64: shorten constant integer loads
+ 2025-11-10 5f4b5f1a19 runtime/msan: use different msan routine for copying
+ 2025-11-10 0fe6c8e8c8 runtime: tweak wording for comment of mcache.flushGen
+ 2025-11-10 95a0e5adc1 sync: don't call Done when f() panics in WaitGroup.Go
+ 2025-11-08 e8ed85d6c2 cmd/go: update goSum if necessary
+ 2025-11-08 b76103c08e cmd/go: output missing GoDebug entries
+ 2025-11-07 47a63a331d cmd/go: rewrite hgrepo1 test repo to be deterministic
+ 2025-11-07 7995751d3a cmd/go: copy git reuse and support repos to hg
+ 2025-11-07 66c7ca7fb3 cmd/go: improve TestScript/reuse_git
+ 2025-11-07 de84ac55c6 cmd/link: clean up some comments to Go standards
+ 2025-11-07 5cd1b73772 runtime: correctly print panics before fatal-ing on defer
+ 2025-11-07 91ca80f970 runtime/cgo: improve error messages after pointer panic
+ 2025-11-07 d36e88f21f runtime: tweak wording for doc
+ 2025-11-06 ad3ccd92e4 cmd/link: move pclntab out of relro section
+ 2025-11-06 43b91e7abd iter: fix a tiny doc comment bug
+ 2025-11-06 48c7fa13c6 Revert "runtime: remove the pc field of _defer struct"
+ 2025-11-05 8111104a21 cmd/internal/obj/loong64: add {,X}VSHUF.{B/H/W/V} instructions support
+ 2025-11-05 2e2072561c cmd/internal/obj/loong64: add {,X}VEXTRINS.{B,H,W,V} instruction support
+ 2025-11-05 01c29d1f0b internal/chacha8rand: replace VORV with instruction VMOVQ on loong64
+ 2025-11-05 f01a1841fd cmd/compile: fix error message on loong64
+ 2025-11-05 8cf7a0b4c9 cmd/link: support weak binding on darwin
+ 2025-11-05 2dd7e94e16 cmd/go: use go.dev instead of golang.org in flag errors
+ 2025-11-05 28f1ad5782 cmd/go: fix TestScript/govcs
+ 2025-11-05 daa220a1c9 cmd/go: silence TLS handshake errors during test
+ 2025-11-05 3ae9e95002 cmd/go: fix TestCgoPkgConfig on darwin with pkg-config installed
+ 2025-11-05 a494a26bc2 cmd/go: fix TestScript/vet_flags
+ 2025-11-05 a8fb94969c cmd/go: fix TestScript/tool_build_as_needed
+ 2025-11-05 04f05219c4 cmd/cgo: skip escape checks if call site has no argument
+ 2025-11-04 9f3a108ee0 os: ignore O_TRUNC errors on named pipes and terminal devices on Windows
+ 2025-11-04 0e1bd8b5f1 cmd/link, runtime: don't store text start in pcHeader
+ 2025-11-04 7347b54727 cmd/link: don't generate .gosymtab section
+ 2025-11-04 6914dd11c0 cmd/link: add and use new SymKind SFirstUnallocated
+ 2025-11-04 f5f14262d0 cmd/link: remove misleading comment
+ 2025-11-04 61de3a9dae cmd/link: remove unused SFILEPATH symbol kind
+ 2025-11-04 8e2bd267b5 cmd/link: add comments for SymKind values
+ 2025-11-04 16705b962e cmd/compile: faster liveness analysis in regalloc
+ 2025-11-04 a5fe6791d7 internal/syscall/windows: fix ReOpenFile sentinel error value
+ 2025-11-04 a7d174ccaa cmd/compile/internal/ssa: simplify riscv64 FCLASSD rewrite rules
+ 2025-11-04 856238615d runtime: amend doc for setPinned
+ 2025-11-04 c7ccbddf22 cmd/compile/internal/ssa: more aggressive on dead auto elim
+ 2025-11-04 75b2bb1d1a cmd/cgo: drop pre-1.18 support
+ 2025-11-04 dd839f1d00 internal/strconv: handle %f with fixedFtoa when possible
+ 2025-11-04 6e165b4d17 cmd/compile: implement Avg64u, Hmul64, Hmul64u for wasm
+ 2025-11-04 9f6590f333 encoding/pem: don't reslice in failure modes
+ 2025-11-03 34fec512ce internal/strconv: extract fixed-precision ftoa from ftoaryu.go
+ 2025-11-03 162ba6cc40 internal/strconv: add tests and benchmarks for ftoaFixed
+ 2025-11-03 9795c7ba22 internal/strconv: fix pow10 off-by-one in exponent result
+ 2025-11-03 ad5e941a45 cmd/internal/obj/loong64: using {xv,v}slli.d to perform copying between vector registers
+ 2025-11-03 dadbac0c9e cmd/internal/obj/loong64: add VPERMI.W, XVPERMI.{W,V,Q} instruction support
+ 2025-11-03 e2c6a2024c runtime: avoid append in printint, printuint
+ 2025-11-03 c93cc603cd runtime: allow Stack to traceback goroutines in syscall _Grunning window
+ 2025-11-03 b5353fd90a runtime: don't panic in castogscanstatus
+ 2025-11-03 43491f8d52 cmd/cgo: use the export'ed file/line in error messages
+ 2025-11-03 aa94fdf0cc cmd/go: link to go.dev/doc/godebug for removed GODEBUG settings
+ 2025-11-03 4d2b03d2fc crypto/tls: add BetterTLS test coverage
+ 2025-11-03 0c4444e13d cmd/internal/obj: support arm64 FMOVQ large offset encoding
+ 2025-11-03 85bec791a0 cmd/go/testdata/script: loosen list_empty_importpath for freebsd
+ 2025-11-03 17b57078ab internal/runtime/cgobench: add cgo callback benchmark
+ 2025-11-03 5f8fdb720c cmd/go: move functions to methods
+ 2025-11-03 0a95856b95 cmd/go: eliminate additional global variable
+ 2025-11-03 f93186fb44 cmd/go/internal/telemetrystats: count cgo usage
+ 2025-11-03 eaf28a27fd runtime: update outdated comments for deferprocStack
+ 2025-11-03 e12d8a90bf all: remove extra space in the comments
+ 2025-11-03 c5559344ac internal/profile: optimize Parse allocs
+ 2025-11-03 5132158ac2 bytes: add Buffer.Peek
+ 2025-11-03 361d51a6b5 runtime: remove the pc field of _defer struct
+ 2025-11-03 00ee1860ce crypto/internal/constanttime: expose intrinsics to the FIPS 140-3 packages
+ 2025-11-02 388c41c412 cmd/go: skip git sha256 tests if git < 2.29
+ 2025-11-01 385dc33250 runtime: prevent time.Timer.Reset(0) from deadlocking testing/synctest tests
+ 2025-10-31 99b724f454 cmd/go: document purego convention
+ 2025-10-31 27937289dc runtime: avoid zeroing scavenged memory
+ 2025-10-30 89dee70484 runtime: prioritize panic output over racefini
+ 2025-10-30 8683bb846d runtime: optimistically CAS atomicstatus directly in enter/exitsyscall
+ 2025-10-30 5b8e850340 runtime: don't track scheduling latency for _Grunning <-> _Gsyscall
+ 2025-10-30 251814e580 runtime: document tracer invariants explicitly
+ 2025-10-30 7244e9221f runtime: eliminate _Psyscall
+ 2025-10-30 5ef19c0d0c strconv: delete divmod1e9
+ 2025-10-30 d32b1f02c3 runtime: delete timediv
+ 2025-10-30 cbbd385cb8 strconv: remove arch-specific decision in formatBase10
+ 2025-10-30 6aca04a73a reflect: correct internal docs for uncommonType
+ 2025-10-30 235b4e729d cmd/compile/internal/ssa: model right shift more precisely
+ 2025-10-30 d44db293f9 go/token: fix a typo in a comment
+ 2025-10-30 cdc6b559ca strconv: remove hand-written divide on 32-bit systems
+ 2025-10-30 1e5bb416d8 cmd/compile: implement bits.Mul64 on 32-bit systems
+ 2025-10-30 38317c44e7 crypto/internal/fips140/aes: fix CTR generator
+ 2025-10-29 3be9a0e014 go/types, types: proceed with correct (invalid) type in case of a selector error
+ 2025-10-29 d2c5fa0814 strconv: remove &0xFF trick in formatBase10
+ 2025-10-29 9bbda7c99d cmd/compile: make prove understand div, mod better
+ 2025-10-29 915c1839fe test/codegen: simplify asmcheck pattern matching
+ 2025-10-29 32ee3f3f73 runtime: tweak example code for gorecover
+ 2025-10-29 da3fb90b23 crypto/internal/fips140/bigmod: fix extendedGCD comment
+ 2025-10-29 9035f7aea5 runtime: use internal/strconv
+ 2025-10-29 49c1da474d internal/itoa, internal/runtime/strconv: delete
+ 2025-10-29 b2a346bbd1 strconv: move all but Quote to internal/strconv
+ 2025-10-28 041f564b3e internal/runtime/gc/scan: avoid memory destination on VPCOMPRESSQ
+ 2025-10-28 81afd3a59b cmd/compile: extend ppc64 MADDLD to match const ADDconst & MULLDconst
+ 2025-10-28 ea50d61b66 cmd/compile: name change isDirect -> isDirectAndComparable
+ 2025-10-28 bd4dc413cd cmd/compile: don't optimize away a panicing interface comparison
+ 2025-10-28 30c047d0d0 cmd/compile: extend loong MOV*idx rules to match ADDshiftLLV
+ 2025-10-28 46e5e2b09a runtime: define PanicBounds in funcdata.h
+ 2025-10-28 3da0356685 crypto/internal/fips140test: collect 300M entropy samples for ESV
+ 2025-10-28 d5953185d5 runtime: amend comments a bit
+ 2025-10-28 12c8d14d94 errors: document that the target of Is must be comparable
+ 2025-10-28 1f4d14e493 go/types, types2: pull up package-level object sort to a separate phase
+ 2025-10-28 b8aa1ee442 go/types, types2: reduce locks held at once in resolveUnderlying
+ 2025-10-28 24af441437 cmd/compile: rewrite proved multiplies by 0 or 1 into CondSelect
+ 2025-10-28 2d33a456c6 cmd/compile: move branchelim supported arches to Config
+ 2025-10-27 2c91c33e88 crypto/subtle,cmd/compile: add intrinsics for ConstantTimeSelect and *Eq
+ 2025-10-27 73d7635fae cmd/compile: add generic rules to remove bool → int → bool roundtrips
+ 2025-10-27 1662d55247 cmd/compile: do not Zext bools to 64bits in amd64 CMOV generation rules
+ 2025-10-27 b8468d8c4e cmd/compile: introduce bytesizeToConst to cleanup switches in prove
+ 2025-10-27 9e25c2f6de cmd/link: internal linking support for windows/arm64
+ 2025-10-27 ff2ebf69c4 internal/runtime/gc/scan: correct size class size check
+ 2025-10-27 9a77aa4f08 cmd/compile: add position info to sccp debug messages
+ 2025-10-27 77dc138030 cmd/compile: teach prove about unsigned rounding-up divide
+ 2025-10-27 a0f33b2887 cmd/compile: change !l.nonzero() into l.maybezero()
+ 2025-10-27 5453b788fd cmd/compile: optimize Add64carry with unused carries into plain Add64
+ 2025-10-27 2ce5aab79e cmd/compile: remove 68857 ModU flowLimit workaround in prove
+ 2025-10-27 a50de4bda7 cmd/compile: remove 68857 min & max flowLimit workaround in prove
+ 2025-10-27 53be78630a cmd/compile: use topo-sort in prove to correctly learn facts while walking once
+ 2025-10-27 dec2b4c83d runtime: avoid bound check in freebsd binuptime
+ 2025-10-27 916e682d51 cmd/internal/obj, cmd/asm: reclassify the offset of memory access operations on loong64
+ 2025-10-27 2835b994fb cmd/go: remove global loader state variable
+ 2025-10-27 139f89226f cmd/go: use local state for telemetry
+ 2025-10-27 8239156571 cmd/go: use tagged switch
+ 2025-10-27 d741483a1f cmd/go: increase stmt threshold on amd64
+ 2025-10-27 a6929cf4a7 cmd/go: removed unused code in toolchain.Exec
+ 2025-10-27 180c07e2c1 go/types, types2: clarify docs for resolveUnderlying
+ 2025-10-27 d8a32f3d4b go/types, types2: wrap Named.fromRHS into Named.rhs
+ 2025-10-27 b2af92270f go/types, types2: verify stateMask transitions in debug mode
+ 2025-10-27 92decdcbaa net/url: further speed up escape and unescape
+ 2025-10-27 5f4ec3541f runtime: remove unused cgoCheckUsingType function
+ 2025-10-27 189f2c08cc time: rewrite IsZero method to use wall and ext fields
+ 2025-10-27 f619b4a00d cmd/go: reorder parameters so that context is first
+ 2025-10-27 f527994c61 sync: update comments for Once.done
+ 2025-10-26 5dcaf9a01b runtime: add GOEXPERIMENT=runtimefree
+ 2025-10-26 d7a52f9369 cmd/compile: use MOV(D|F) with const for Const(64|32)F on riscv64
+ 2025-10-26 6f04a92be3 internal/chacha8rand: provide vector implementation for riscv64
+ 2025-10-26 54e3adc533 cmd/go: use local loader state in test
+ 2025-10-26 ca379b1c56 cmd/go: remove loaderstate dependency
+ 2025-10-26 83a44bde64 cmd/go: remove unused loader state
+ 2025-10-26 7e7cd9de68 cmd/go: remove temporary rf cleanup script
+ 2025-10-26 53ad68de4b cmd/compile: allow unaligned load/store on Wasm
+ 2025-10-25 12ec09f434 cmd/go: use local state object in work.runBuild and work.runInstall
+ 2025-10-24 643f80a11f runtime: add ppc and s390 to 32 build constraints for gccgo
+ 2025-10-24 0afbeb5102 runtime: add ppc and s390 to linux 32 bits syscall build constraints for gccgo
+ 2025-10-24 7b506d106f cmd/go: use local state object in `generate.runGenerate`
+ 2025-10-24 26a8a21d7f cmd/go: use local state object in `env.runEnv`
+ 2025-10-24 f2dd3d7e31 cmd/go: use local state object in `vet.runVet`
+ 2025-10-24 784700439a cmd/go: use local state object in pkg `workcmd`
+ 2025-10-24 69673e9be2 cmd/go: use local state object in `tool.runTool`
+ 2025-10-24 2e12c5db11 cmd/go: use local state object in `test.runTest`
+ 2025-10-24 fe345ff2ae cmd/go: use local state object in `modget.runGet`
+ 2025-10-24 d312e27e8b cmd/go: use local state object in pkg `modcmd`
+ 2025-10-24 ea9cf26aa1 cmd/go: use local state object in `list.runList`
+ 2025-10-24 9926e1124e cmd/go: use local state object in `bug.runBug`
+ 2025-10-24 2c4fd7b2cd cmd/go: use local state object in `run.runRun`
+ 2025-10-24 ade9f33e1f cmd/go: add loaderstate as field on `mvsReqs`
+ 2025-10-24 ccf4192a31 cmd/go: make ImportMissingError work with local state
+ 2025-10-24 f5403f15f0 debug/pe: check for zdebug_gdb_scripts section in testDWARF
+ 2025-10-24 a26f860fa4 runtime: use 32-bit hash for maps on Wasm
+ 2025-10-24 747fe2efed encoding/json/v2: fix typo in documentation about errors.AsType
+ 2025-10-24 94f47fc03f cmd/link: remove pointless assignment in SetSymAlign
+ 2025-10-24 e6cff69051 crypto/x509: move constraint checking after chain building
+ 2025-10-24 f5f69a3de9 encoding/json/jsontext: avoid pinning application data in pools
+ 2025-10-24 a6a59f0762 encoding/json/v2: use slices.Sort directly
+ 2025-10-24 0d3dab9b1d crypto/x509: simplify candidate chain filtering
+ 2025-10-24 29046398bb cmd/go: refactor injection of modload.LoaderState
+ 2025-10-24 c18fa69e52 cmd/go: make ErrNoModRoot work with local state
+ 2025-10-24 296ecc918d cmd/go: add modload.State parameter to AllowedFunc
+ 2025-10-24 c445a61e52 cmd/go: add loaderstate as field on `QueryMatchesMainModulesError`
+ 2025-10-24 6ac40051d3 cmd/go: remove module loader state from ccompile
+ 2025-10-24 6a5a452528 cmd/go: inject vendor dir into builder struct
+ 2025-10-23 dfac972233 crypto/pbkdf2: add missing error return value in example
+ 2025-10-23 47bf8f073e unique: fix inconsistent panic prefix in canonmap cleanup path
+ 2025-10-23 03bd43e8bb go/types, types2: rename Named.resolve to unpack
+ 2025-10-23 9fcdc814b2 go/types, types2: rename loaded namedState to lazyLoaded
+ 2025-10-23 8401512a9b go/types, types2: rename complete namedState to hasMethods
+ 2025-10-23 cf826bfcb4 go/types, types2: set t.underlying exactly once in resolveUnderlying
+ 2025-10-23 c4e910895b net/url: speed up escape and unescape
+ 2025-10-23 3f6ac3a10f go/build: use slices.Equal
+ 2025-10-23 839da71f89 encoding/pem: properly calculate end indexes
+ 2025-10-23 39ed968832 cmd: update golang.org/x/arch for riscv64 disassembler
+ 2025-10-23 ca448191c9 all: replace Split in loops with more efficient SplitSeq
+ 2025-10-23 107fcb70de internal/goroot: replace HasPrefix+TrimPrefix with CutPrefix
+ 2025-10-23 8378276d66 strconv: optimize int-to-decimal and use consistently
+ 2025-10-23 e5688d0bdd cmd/internal/obj/riscv: simplify validation and encoding of raw instructions
+ 2025-10-22 77fc27972a doc/next: improve new(expr) release note
+ 2025-10-22 d94a8c56ad runtime: cleanup pagetrace
+ 2025-10-22 02728a2846 crypto/internal/fips140test: add entropy SHA2-384 testing
+ 2025-10-22 f92e01c117 runtime/cgo: fix cgoCheckArg description
+ 2025-10-22 50586182ab runtime: use backoff and ISB instruction to reduce contention in (*lfstack).pop and (*spanSet).pop on arm64
+ 2025-10-22 1ff59f3dd3 strconv: clean up powers-of-10 table, tests
+ 2025-10-22 7c9fa4d5e9 cmd/go: check if build output should overwrite files with renames
+ 2025-10-22 557b4d6e0f comment: change slice to string in function comment/help
+ 2025-10-22 d09a8c8ef4 go/types, types2: simplify locking in Named.resolveUnderlying
+ 2025-10-22 5a42af7f6c go/types, types2: in resolveUnderlying, only compute path when needed
+ 2025-10-22 4bdb55b5b8 go/types, types2: rename Named.under to Named.resolveUnderlying
+ 2025-10-21 29d43df8ab go/build, cmd/go: use ast.ParseDirective for go:embed
+ 2025-10-21 4e695dd634 go/ast: add ParseDirective for parsing directive comments
+ 2025-10-21 06e57e60a7 go/types, types2: only report version errors if new(expr) is ok otherwise
+ 2025-10-21 6c3d0d259f path/filepath: reword documentation for Rel
+ 2025-10-21 39fd61ddb0 go/types, types2: guard Named.underlying with Named.mu
+ 2025-10-21 4a0115c886 runtime,syscall: implement and use syscalln on darwin
+ 2025-10-21 261c561f5a all: gofmt -w
+ 2025-10-21 c9c78c06ef strconv: embed testdata in test
+ 2025-10-21 8f74f9daf4 sync: re-enable race even when panicking
+ 2025-10-21 8a6c64f4fe syscall: use rawSyscall6 to call ptrace in forkAndExecInChild
+ 2025-10-21 4620db72d2 runtime: use timer_settime64 on 32-bit Linux
+ 2025-10-21 b31dc77cea os: support deleting read-only files in RemoveAll on older Windows versions
+ 2025-10-21 46cc532900 cmd/compile/internal/ssa: fix typo in comment
+ 2025-10-21 2163a58021 crypto/internal/fips140/entropy: increase AllocsPerRun iterations
+ 2025-10-21 306eacbc11 cmd/go/testdata/script: disable list_empty_importpath test on Windows
+ 2025-10-21 a5a249d6a6 all: eliminate unnecessary type conversions
+ 2025-10-21 694182d77b cmd/internal/obj/ppc64: improve large prologue generation
+ 2025-10-21 b0dcb95542 cmd/compile: leave the horses alone
+ 2025-10-21 9a5a1202f4 runtime: clean dead architectures from go:build constraint
+ 2025-10-21 8539691d0c crypto/internal/fips140/entropy: move to crypto/internal/entropy/v1.0.0
+ 2025-10-20 99cf4d671c runtime: save lasx and lsx registers in loong64 async preemption
+ 2025-10-20 79ae97fe9b runtime: make procyieldAsm no longer loop infinitely if passed 0
+ 2025-10-20 f838faffe2 runtime: wrap procyield assembly and check for 0
+ 2025-10-20 ee4d2c312d runtime/trace: dump test traces on validation failure
+ 2025-10-20 7b81a1e107 net/url: reduce allocs in Encode
+ 2025-10-20 e425176843 cmd/asm: fix typo in comment
+ 2025-10-20 dc9a3e2a65 runtime: fix generation skew with trace reentrancy
+ 2025-10-20 df33c17091 runtime: add _Gdeadextra status
+ 2025-10-20 7503856d40 cmd/go: inject loaderstate into matcher function
+ 2025-10-20 d57c3fd743 cmd/go: inject State parameter into `work.runInstall`
+ 2025-10-20 e94a5008f6 cmd/go: inject State parameter into `work.runBuild`
+ 2025-10-20 d9e6f95450 cmd/go: inject State parameter into `workcmd.runSync`
+ 2025-10-20 9769a61e64 cmd/go: inject State parameter into `modget.runGet`
+ 2025-10-20 f859799ccf cmd/go: inject State parameter into `modcmd.runVerify`
+ 2025-10-20 0f820aca29 cmd/go: inject State parameter into `modcmd.runVendor`
+ 2025-10-20 92aa3e9e98 cmd/go: inject State parameter into `modcmd.runInit`
+ 2025-10-20 e176dff41c cmd/go: inject State parameter into `modcmd.runDownload`
+ 2025-10-20 e7c66a58d5 cmd/go: inject State parameter into `toolchain.Select`
+ 2025-10-20 4dc3dd9a86 cmd/go: add loaderstate to Switcher
+ 2025-10-20 bcf7da1595 cmd/go: convert functions to methods
+ 2025-10-20 0d3044f965 cmd/go: make Reset work with any State instance
+ 2025-10-20 386d81151d cmd/go: make setState work with any State instance
+ 2025-10-20 a420aa221e cmd/go: inject State parameter into `tool.runTool`
+ 2025-10-20 441e7194a4 cmd/go: inject State parameter into `test.runTest`
+ 2025-10-20 35e8309be2 cmd/go: inject State parameter into `list.runList`
+ 2025-10-20 29a81624f7 cmd/go: inject state parameter into `fmtcmd.runFmt`
+ 2025-10-20 f7eaea02fd cmd/go: inject state parameter into `clean.runClean`
+ 2025-10-20 58a8fdb6cf cmd/go: inject State parameter into `bug.runBug`
+ 2025-10-20 8d0bef7ffe runtime: add linkname documentation and guidance
+ 2025-10-20 3e43f48cb6 encoding/asn1: use reflect.TypeAssert to improve performance
+ 2025-10-20 4ad5585c2c runtime: fix _rt0_ppc64x_lib on aix
+ 2025-10-17 a5f55a441e cmd/fix: add modernize and inline analyzers
+ 2025-10-17 80876f4b42 cmd/go/internal/vet: tweak help doc
+ 2025-10-17 b5aefe07e5 all: remove unnecessary loop variable copies in tests
+ 2025-10-17 5137c473b6 go/types, types2: remove references to under function in comments
+ 2025-10-17 dbbb1bfc91 all: correct name for comments
+ 2025-10-17 0983090171 encoding/pem: properly decode strange PEM data
+ 2025-10-17 36863d6194 runtime: unify riscv64 library entry point
+ 2025-10-16 0c14000f87 go/types, types2: remove under(Type) in favor of Type.Underlying()
+ 2025-10-16 1099436f1b go/types, types2: change and enforce lifecycle of Named.fromRHS and Named.underlying fields
+ 2025-10-16 41f5659347 go/types, types2: remove superfluous unalias call (minor cleanup)
+ 2025-10-16 e7351c03c8 runtime: use DC ZVA instead of its encoding in WORD in arm64 memclr
+ 2025-10-16 6cbe0920c4 cmd: update to x/tools@7d9453cc
+ 2025-10-15 45eee553e2 cmd/internal/obj: move ARM64RegisterExtension from cmd/asm/internal/arch
+ 2025-10-15 27f9a6705c runtime: increase repeat count for alloc test
+ 2025-10-15 b68cebd809 net/http/httptest: record failed ResponseWriter writes
+ 2025-10-15 f1fed742eb cmd: fix three printf problems reported by newest vet
+ 2025-10-15 0984dcd757 cmd/compile: fix an error in comments
+ 2025-10-15 31f82877e8 go/types, types2: fix misleading internal comment
+ 2025-10-15 6346349f56 cmd/compile: replace angle brackets with square
+ 2025-10-15 284379cdfc cmd/compile: remove rematerializable values from live set across calls
+ 2025-10-15 519ae514ab cmd/compile: eliminate bound check for slices of the same length
+ 2025-10-15 b5a29cca48 cmd/distpack: add fix tool to inventory
+ 2025-10-15 bb5eb51715 runtime/pprof: fix errors in pprof_test
+ 2025-10-15 5c9a26c7f8 cmd/compile: use arm64 neon in LoweredMemmove/LoweredMemmoveLoop
+ 2025-10-15 61d1ff61ad cmd/compile: use block starting position for phi line number
+ 2025-10-15 5b29875c8e cmd/go: inject State parameter into `run.runRun`
+ 2025-10-15 5113496805 runtime/pprof: skip flaky test TestProfilerStackDepth/heap for now
+ 2025-10-15 36086e85f8 cmd/go: create temporary cleanup script
+ 2025-10-14 7056c71d32 cmd/compile: disable use of new saturating float-to-int conversions
+ 2025-10-14 6d5b13793f Revert "cmd/compile: make 386 float-to-int conversions match amd64"
+ 2025-10-14 bb2a14252b Revert "runtime: adjust softfloat corner cases to match amd64/arm64"
+ 2025-10-14 3bc9d9fa83 Revert "cmd/compile: make wasm match other platforms for FP->int32/64 conversions"
+ 2025-10-14 ee5af46172 encoding/json: avoid misleading errors under goexperiment.jsonv2
+ 2025-10-14 11d3d2f77d cmd/internal/obj/arm64: add support for PAC instructions
+ 2025-10-14 4dbf1a5a4c cmd/compile/internal/devirtualize: do not track assignments to non-PAUTO
+ 2025-10-14 0ddb5ed465 cmd/compile/internal/devirtualize: use FatalfAt instead of Fatalf where possible
+ 2025-10-14 0a239bcc99 Revert "net/url: disallow raw IPv6 addresses in host"
+ 2025-10-14 5a9ef44bc0 cmd/compile/internal/devirtualize: fix OCONVNOP assertion
+ 2025-10-14 3765758b96 go/types, types2: minor cleanup (remove TODO)
+ 2025-10-14 f6b9d56aff crypto/internal/fips140/entropy: fix benign race
+ 2025-10-14 60f6d2f623 crypto/internal/fips140/entropy: support SHA-384 sizes for ACVP tests
+ 2025-10-13 6fd8e88d07 encoding/json/v2: restrict presence of default options
+ 2025-10-13 1abc6b0204 go/types, types2: permit type cycles through type parameter lists
+ 2025-10-13 9fdd6904da strconv: add tests that Java once mishandled
+ 2025-10-13 9b8742f2e7 cmd/compile: don't depend on arch-dependent conversions in the compiler
+ 2025-10-13 0e64ee1286 encoding/json/v2: report EOF for top-level values in UnmarshalDecode
+ 2025-10-13 6bcd97d9f4 all: replace calls to errors.As with errors.AsType
+ 2025-10-11 1cd71689f2 crypto/x509: rework fix for CVE-2025-58187
+ 2025-10-11 8aa1efa223 cmd/link: in TestFallocate, only check number of blocks on Darwin
+ 2025-10-10 b497a29d25 encoding/json: fix regression in quoted numbers under goexperiment.jsonv2
+ 2025-10-10 48bb7a6114 cmd/compile: repair bisection behavior for float-to-unsigned conversion
+ 2025-10-10 e8a53538b4 runtime: fail TestGoroutineLeakProfile on data race
+ 2025-10-10 e3be2d1b2b net/url: disallow raw IPv6 addresses in host
+ 2025-10-10 aced4c79a2 net/http: strip request body headers on POST to GET redirects
+ 2025-10-10 584a89fe74 all: omit unnecessary reassignment
+ 2025-10-10 69e8279632 net/http: set cookie host to Request.Host when available
+ 2025-10-10 6f4c63ba63 cmd/go: unify "go fix" and "go vet"
+ 2025-10-10 955a5a0dc5 runtime: support arm64 Neon in async preemption
+ 2025-10-10 5368e77429 net/http: run TestRequestWriteTransport with fake time to avoid flakes
+ 2025-10-09 c53cb642de internal/buildcfg: enable greenteagc experiment for loong64
+ 2025-10-09 954fdcc51a cmd/compile: declare no output register for loong64 LoweredAtomic{And,Or}32 ops
+ 2025-10-09 19a30ea3f2 cmd/compile: call generated size-specialized malloc functions directly
+ 2025-10-09 80f3bb5516 reflect: remove timeout in TestChanOfGC
+ 2025-10-09 9db7e30bb4 net/url: allow IP-literals with IPv4-mapped IPv6 addresses
+ 2025-10-09 8d810286b3 cmd/compile: make wasm match other platforms for FP->int32/64 conversions
+ 2025-10-09 b9f3accdcf runtime: adjust softfloat corner cases to match amd64/arm64
+ 2025-10-09 78d75b3799 cmd/compile: make 386 float-to-int conversions match amd64
+ 2025-10-09 0e466a8d1d cmd/compile: modify float-to-[u]int so that amd64 and arm64 match
+ 2025-10-08 4837fbe414 net/http/httptest: check whether response bodies are allowed
+ 2025-10-08 ee163197a8 path/filepath: return cleaned path from Rel
+ 2025-10-08 de9da0de30 cmd/compile/internal/devirtualize: improve concrete type analysis
+ 2025-10-08 ae094a1397 crypto/internal/fips140test: make entropy file pair names match
+ 2025-10-08 941e5917c1 runtime: cleanup comments from asm_ppc64x.s improvements
+ 2025-10-08 d945600d06 cmd/gofmt: change -d to exit 1 if diffs exist
+ 2025-10-08 d4830c6130 cmd/internal/obj: fix Link.Diag printf errors
+ 2025-10-08 e1ca1de123 net/http: format pprof.go
+ 2025-10-08 e5d004c7a8 net/http: update HTTP/2 documentation to reference new config features
+ 2025-10-08 97fd6bdecc cmd/compile: fuse NaN checks with other comparisons
+ 2025-10-07 78b43037dc cmd/go: refactor usage of `workFilePath`
+ 2025-10-07 bb1ca7ae81 cmd/go, testing: add TB.ArtifactDir and -artifacts flag
+ 2025-10-07 1623927730 cmd/go: refactor usage of `requirements`
+ 2025-10-07 a1661e776f Revert "crypto/internal/fips140/subtle: add assembly implementation of xorBytes for mips64x"
+ 2025-10-07 cb81270113 Revert "crypto/internal/fips140/subtle: add assembly implementation of xorBytes for mipsx"
+ 2025-10-07 f2d0d05d28 cmd/go: refactor usage of `MainModules`
+ 2025-10-07 f7a68d3804 archive/tar: set a limit on the size of GNU sparse file 1.0 regions
+ 2025-10-07 463165699d net/mail: avoid quadratic behavior in mail address parsing
+ 2025-10-07 5ede095649 net/textproto: avoid quadratic complexity in Reader.ReadResponse
+ 2025-10-07 5ce8cd16f3 encoding/pem: make Decode complexity linear
+ 2025-10-07 f6f4e8b3ef net/url: enforce stricter parsing of bracketed IPv6 hostnames
+ 2025-10-07 7dd54e1fd7 runtime: make work.spanSPMCs.all doubly-linked
+ 2025-10-07 3ee761739b runtime: free spanQueue on P destroy
+ 2025-10-07 8709a41d5e encoding/asn1: prevent memory exhaustion when parsing using internal/saferio
+ 2025-10-07 9b9d02c5a0 net/http: add httpcookiemaxnum GODEBUG option to limit number of cookies parsed
+ 2025-10-07 3fc4c79fdb crypto/x509: improve domain name verification
+ 2025-10-07 6e4007e8cf crypto/x509: mitigate DoS vector when intermediate certificate contains DSA public key
+ 2025-10-07 6f7926589d cmd/go: refactor usage of `modRoots`
+ 2025-10-07 11d5484190 runtime: fix self-deadlock on sbrk platforms
+ 2025-10-07 2e52060084 cmd/go: refactor usage of `RootMode`
+ 2025-10-07 f86ddb54b5 cmd/go: refactor usage of `ForceUseModules`
+ 2025-10-07 c938051dd0 Revert "cmd/compile: redo arm64 LR/FP save and restore"
+ 2025-10-07 6469954203 runtime: assert p.destroy runs with GC not running
+ 2025-10-06 4c0fd3a2b4 internal/goexperiment: remove the synctest GOEXPERIMENT
+ 2025-10-06 c1e6e49d5d fmt: reduce Errorf("x") allocations to match errors.New("x")
+ 2025-10-06 7fbf54bfeb internal/buildcfg: enable greenteagc experiment by default
+ 2025-10-06 7bfeb43509 cmd/go: refactor usage of `initialized`
+ 2025-10-06 1d62e92567 test/codegen: make sure assignment results are used.
+ 2025-10-06 4fca79833f runtime: delete redundant code in the page allocator
+ 2025-10-06 719dfcf8a8 cmd/compile: redo arm64 LR/FP save and restore
+ 2025-10-06 f3312124c2 runtime: remove batching from spanSPMC free
+ 2025-10-06 24416458c2 cmd/go: export type State
+ 2025-10-06 c2fb15164b testing/synctest: remove Run
+ 2025-10-06 ac2ec82172 runtime: bump thread count slack for TestReadMetricsSched
+ 2025-10-06 e74b224b7c crypto/tls: streamline BoGo testing w/ -bogo-local-dir
+ 2025-10-06 3a05e7b032 spec: close tag
+ 2025-10-03 2a71af11fc net/url: improve URL docs
+ 2025-10-03 ee5369b003 cmd/link: add LIBRARY statement only with -buildmode=cshared
+ 2025-10-03 1bca4c1673 cmd/compile: improve slicemask removal
+ 2025-10-03 38b26f29f1 cmd/compile: remove stores to unread parameters
+ 2025-10-03 003b5ce1bc cmd/compile: fix SIMD const rematerialization condition
+ 2025-10-03 d91148c7a8 cmd/compile: enhance prove to infer bounds in slice len/cap calculations
+ 2025-10-03 20c9377e47 cmd/compile: enhance the chunked indexing case to include reslicing
+ 2025-10-03 ad3db2562e cmd/compile: handle rematerialized op for incompatible reg constraint
+ 2025-10-03 18cd4a1fc7 cmd/compile: use the right type for spill slot
+ 2025-10-03 1caa95acfa cmd/compile: enhance prove to deal with double-offset IsInBounds checks
+ 2025-10-03 ec70d19023 cmd/compile: rewrite to elide Slicemask from len==c>0 slicing
+ 2025-10-03 10e7968849 cmd/compile: accounts rematerialize ops's output reginfo
+ 2025-10-03 ab043953cb cmd/compile: minor tweak for race detector
+ 2025-10-03 ebb72bef44 cmd/compile: don't treat devel compiler as a released compiler
+ 2025-10-03 c54dc1418b runtime: support valgrind (but not asan) in specialized malloc functions
+ 2025-10-03 a7917eed70 internal/buildcfg: enable specializedmalloc experiment
+ 2025-10-03 630799c6c9 crypto/tls: add flag to render HTML BoGo report
Change-Id: I6bf904c523a77ee7d3dea9c8ae72292f8a5f2ba5
|
|
declaredType seems a better name for this function because it is reached
when processing a (Named or Alias) type declaration. In both cases, the
fromRHS field (rather than the underlying field) will be set.
Change-Id: Ibb1cc338e3b0632dc63f9cf2fd9d64cef6f1aaa5
Reviewed-on: https://go-review.googlesource.com/c/go/+/695955
Auto-Submit: Mark Freeman <markfreeman@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
A direct cycle is the most basic form of cycle, where no type literal or
predeclared type is reached. It is formed by a series of only TypeNames.
To illustrate, type T T is a direct cycle, but type T [1]T and type T *T
are not. Likewise, the below is also a direct cycle:
type A B
type B C
type C = A
Direct cycles are handled explicitly as part of resolveUnderlying, since
they are the only cycle which can prevent reaching an underlying type.
If we move this check to an earlier compiler phase, we can simplify
resolveUnderlying.
This is the first of (hopefully) several cycle kinds to be moved into a
preliminary phase, with the goal of simplifying the main type-checking
pass. For that reason, the bulk of the logic is placed in cycles.go.
CL based on an earlier version by Mark Freeman.
Change-Id: I3044c383278deb6acb8767c498d8cb68099ba8ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/717343
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It is only ever called with a newly allocated slice.
This clearing code dates back to the C version of the compiler,
in which the function started like this:
static void
gengcmask(Type *t, uint8 gcmask[16])
{
...
memset(gcmask, 0, 16);
if(!haspointers(t))
return;
That memset was required for C, but not for Go.
Change-Id: I6fceb99b2dc8682685dca2e4289fcd58e2e5a0e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/718340
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Only implemented for 64 bit floating point operations for now.
goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
│ sec/op │ sec/op vs base │
Acos 154.1n ± 0% 154.1n ± 0% ~ (p=0.303 n=10)
Acosh 215.8n ± 6% 226.7n ± 0% ~ (p=0.439 n=10)
Asin 149.2n ± 1% 149.2n ± 0% ~ (p=0.700 n=10)
Asinh 262.1n ± 0% 258.5n ± 0% -1.37% (p=0.000 n=10)
Atan 99.48n ± 0% 99.49n ± 0% ~ (p=0.836 n=10)
Atanh 244.9n ± 0% 243.8n ± 0% -0.43% (p=0.002 n=10)
Atan2 158.2n ± 1% 153.3n ± 0% -3.10% (p=0.000 n=10)
Cbrt 186.8n ± 0% 181.1n ± 0% -3.03% (p=0.000 n=10)
Ceil 36.71n ± 1% 36.71n ± 0% ~ (p=0.434 n=10)
Copysign 6.531n ± 1% 6.526n ± 0% ~ (p=0.268 n=10)
Cos 98.19n ± 0% 95.40n ± 0% -2.84% (p=0.000 n=10)
Cosh 233.1n ± 0% 222.6n ± 0% -4.50% (p=0.000 n=10)
Erf 122.5n ± 0% 114.2n ± 0% -6.78% (p=0.000 n=10)
Erfc 126.0n ± 1% 116.6n ± 0% -7.46% (p=0.000 n=10)
Erfinv 138.8n ± 0% 138.6n ± 0% ~ (p=0.082 n=10)
Erfcinv 140.0n ± 0% 139.7n ± 0% ~ (p=0.359 n=10)
Exp 193.3n ± 0% 184.2n ± 0% -4.68% (p=0.000 n=10)
ExpGo 204.8n ± 0% 194.5n ± 0% -5.03% (p=0.000 n=10)
Expm1 152.5n ± 1% 145.0n ± 0% -4.92% (p=0.000 n=10)
Exp2 174.5n ± 0% 164.2n ± 0% -5.85% (p=0.000 n=10)
Exp2Go 184.4n ± 1% 175.4n ± 0% -4.88% (p=0.000 n=10)
Abs 4.912n ± 0% 4.914n ± 0% ~ (p=0.283 n=10)
Dim 15.50n ± 1% 15.52n ± 1% ~ (p=0.331 n=10)
Floor 36.89n ± 1% 36.76n ± 1% ~ (p=0.325 n=10)
Max 31.05n ± 1% 31.17n ± 1% ~ (p=0.628 n=10)
Min 31.01n ± 0% 31.06n ± 0% ~ (p=0.767 n=10)
Mod 294.1n ± 0% 245.6n ± 0% -16.52% (p=0.000 n=10)
Frexp 44.86n ± 1% 35.20n ± 0% -21.53% (p=0.000 n=10)
Gamma 195.8n ± 0% 185.4n ± 1% -5.29% (p=0.000 n=10)
Hypot 84.91n ± 0% 84.54n ± 1% -0.43% (p=0.006 n=10)
HypotGo 96.70n ± 0% 95.42n ± 1% -1.32% (p=0.000 n=10)
Ilogb 45.03n ± 0% 35.07n ± 1% -22.10% (p=0.000 n=10)
J0 634.5n ± 0% 627.2n ± 0% -1.16% (p=0.000 n=10)
J1 644.5n ± 0% 636.9n ± 0% -1.18% (p=0.000 n=10)
Jn 1.357µ ± 0% 1.344µ ± 0% -0.92% (p=0.000 n=10)
Ldexp 49.89n ± 0% 39.96n ± 0% -19.90% (p=0.000 n=10)
Lgamma 186.6n ± 0% 184.3n ± 0% -1.21% (p=0.000 n=10)
Log 150.4n ± 0% 141.1n ± 0% -6.15% (p=0.000 n=10)
Logb 46.70n ± 0% 35.89n ± 0% -23.15% (p=0.000 n=10)
Log1p 164.1n ± 0% 163.9n ± 0% ~ (p=0.122 n=10)
Log10 153.1n ± 0% 143.5n ± 0% -6.24% (p=0.000 n=10)
Log2 58.83n ± 0% 49.75n ± 0% -15.43% (p=0.000 n=10)
Modf 40.82n ± 1% 40.78n ± 0% ~ (p=0.239 n=10)
Nextafter32 49.15n ± 0% 48.93n ± 0% -0.44% (p=0.011 n=10)
Nextafter64 43.33n ± 0% 43.23n ± 0% ~ (p=0.228 n=10)
PowInt 269.4n ± 0% 243.8n ± 0% -9.49% (p=0.000 n=10)
PowFrac 618.0n ± 0% 571.7n ± 0% -7.48% (p=0.000 n=10)
Pow10Pos 13.09n ± 0% 13.05n ± 0% -0.31% (p=0.003 n=10)
Pow10Neg 30.99n ± 1% 30.99n ± 0% ~ (p=0.173 n=10)
Round 23.73n ± 0% 23.65n ± 0% -0.36% (p=0.011 n=10)
RoundToEven 27.87n ± 0% 27.73n ± 0% -0.48% (p=0.003 n=10)
Remainder 282.1n ± 0% 249.6n ± 0% -11.52% (p=0.000 n=10)
Signbit 11.46n ± 0% 11.42n ± 0% -0.39% (p=0.003 n=10)
Sin 115.2n ± 0% 113.2n ± 0% -1.74% (p=0.000 n=10)
Sincos 140.6n ± 0% 138.6n ± 0% -1.39% (p=0.000 n=10)
Sinh 252.0n ± 0% 241.4n ± 0% -4.21% (p=0.000 n=10)
SqrtIndirect 4.909n ± 0% 4.893n ± 0% -0.34% (p=0.021 n=10)
SqrtLatency 19.57n ± 1% 19.57n ± 0% ~ (p=0.087 n=10)
SqrtIndirectLatency 19.64n ± 0% 19.57n ± 0% -0.36% (p=0.025 n=10)
SqrtGoLatency 198.1n ± 0% 197.4n ± 0% -0.35% (p=0.014 n=10)
SqrtPrime 5.733µ ± 0% 5.725µ ± 0% ~ (p=0.116 n=10)
Tan 149.1n ± 0% 146.8n ± 0% -1.54% (p=0.000 n=10)
Tanh 248.2n ± 1% 238.1n ± 0% -4.05% (p=0.000 n=10)
Trunc 36.86n ± 0% 36.70n ± 0% -0.43% (p=0.029 n=10)
Y0 638.2n ± 0% 633.6n ± 0% -0.71% (p=0.000 n=10)
Y1 641.8n ± 0% 636.1n ± 0% -0.87% (p=0.000 n=10)
Yn 1.358µ ± 0% 1.345µ ± 0% -0.92% (p=0.000 n=10)
Float64bits 5.721n ± 0% 5.709n ± 0% -0.22% (p=0.044 n=10)
Float64frombits 4.905n ± 0% 4.893n ± 0% ~ (p=0.266 n=10)
Float32bits 12.27n ± 0% 12.23n ± 0% ~ (p=0.122 n=10)
Float32frombits 4.909n ± 0% 4.893n ± 0% -0.32% (p=0.024 n=10)
FMA 6.556n ± 0% 6.526n ± 0% ~ (p=0.283 n=10)
geomean 86.82n 83.75n -3.54%
Change-Id: I522297a79646d76543d516accce291f5a3cea337
Reviewed-on: https://go-review.googlesource.com/c/go/+/717560
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
The stackalloc code needs to run a liveness pass to build the
interference graph between stack slots. Because the values that we need
liveness over is so sparse, we can optimize the analysis by using a path
exploration algorithm rather than a iterative dataflow one
In local testing, this cuts 74.05 ms of CPU time off a build of cmd/compile.
Change-Id: I765ace87d5e8aae177e65eb63da482e3d698bea7
Reviewed-on: https://go-review.googlesource.com/c/go/+/718540
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This change mechanically replaces all occurrences of interface{}
by 'any' (where deemed safe by the 'any' modernizer) throughout
std and cmd, minus their vendor trees.
Since this fix is relatively numerous, it gets its own CL.
Also, 'go generate go/types'.
Change-Id: I14a6b52856c3291c1d27935409bca8d5fd4242a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/719702
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Alan Donovan <adonovan@google.com>
|
|
This CL generates optimizations for masked variant of AVX512
instructions for patterns:
x.Op(y).Merge(z, mask) => OpMasked(z, x, y mask), where OpMasked is
resultInArg0.
Change-Id: Ife7ccc9ddbf76ae921a085bd6a42b965da9bc179
Reviewed-on: https://go-review.googlesource.com/c/go/+/718160
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: Junyang Shao <shaojunyang@google.com>
|
|
The current implementation followed the spec faithfully for
the special case for append. Per the spec:
As a special case, append also accepts a first argument assignable to
type []byte with a second argument of string type followed by ... .
As it happens, nil is assignable to []byte, so append(nil, ""...)
didn't get an error message but a subsequent assertion failed.
This CL ensures that the first argument to append is never nil and
always a slice. We should make the spec more precise (separate CL).
Fixes #76220.
Change-Id: I581d11827a75afbb257077814beea813d4fe2441
Reviewed-on: https://go-review.googlesource.com/c/go/+/718860
Auto-Submit: Robert Griesemer <gri@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Brett Howell <devbrett90@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For 512-bits they are unchanged. This CL adds the optimization rules for
128/256-bits under feature check.
This CL also fixed a bug for masked load variant of instructions and
make them zeroing by default as well.
Change-Id: I6fe395541c0cd509984a81841420e71c3af732f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/717822
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
These should really be machine ops only.
Change-Id: Idcc611719eff068153d88c5162dd2e0883e5e0ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/717821
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This reverts commit 361d51a6b58bccaab0559e06737c918018a7a5fa.
Reason for revert: Breaks some tests inside Google (on arm64?)
Change-Id: Iaea45fdcf9b4f9d36553687ca7f479750fe559da
Reviewed-on: https://go-review.googlesource.com/c/go/+/718066
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Youlin Feng <fengyoulin@live.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Change-Id: I90428330b17ab9f93ae94a77cefc24464e225df5
Reviewed-on: https://go-review.googlesource.com/c/go/+/717700
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Instead of iterating until dataflow stabilization, fill in values known to be
live at the beginning of loop headers. This reduces the number of passes
over the CFG from the depth of the loopnest to just 2.
In a test instrumented version of this change, run against
cmd/compile/internal/ssa, it brought the time spent in liveness analysis
down to 150.52ms from 225.49ms on my machine.
Change-Id: Ic72762eedfd1f10b1ba74c430ed62ab4ebd3ec5c
Reviewed-on: https://go-review.googlesource.com/c/go/+/695255
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
We don't need to check that the bit patterns of the constants
match, it is sufficient to just check the constant is equal to the
given value.
While we're here also change the FCLASSD rules to use a bit pattern
for the mask. I think this improves readability, particularly as
more uses of FCLASSD get added (e.g. CL 717560).
These changes should not affect codegen.
Change-Id: I92a6338dc71e6a71e04306f67d7d86016c6e9c47
Reviewed-on: https://go-review.googlesource.com/c/go/+/717580
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Propagate "unread" across OpMoves. If the addr of this auto is only used
by an OpMove as its source arg, and the OpMove's target arg is the addr
of another auto. If the 2nd auto can be eliminated, this one can also be
eliminated.
This CL eliminates unnecessary memory copies and makes the frame smaller
in the following code snippet:
func contains(m map[string][16]int, k string) bool {
_, ok := m[k]
return ok
}
These are the benchmark results followed by the benchmark code:
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Map1Access2Ok-8 9.582n ± 2% 9.226n ± 0% -3.72% (p=0.000 n=20)
Map2Access2Ok-8 13.79n ± 1% 10.24n ± 1% -25.77% (p=0.000 n=20)
Map3Access2Ok-8 68.68n ± 1% 12.65n ± 1% -81.58% (p=0.000 n=20)
package main_test
import "testing"
var (
m1 = map[int]int{}
m2 = map[int][16]int{}
m3 = map[int][256]int{}
)
func init() {
for i := range 1000 {
m1[i] = i
m2[i] = [16]int{15:i}
m3[i] = [256]int{255:i}
}
}
func BenchmarkMap1Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m1[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
func BenchmarkMap2Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m2[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
func BenchmarkMap3Access2Ok(b *testing.B) {
for i := range b.N {
_, ok := m3[i%1000]
if !ok {
b.Errorf("%d not found", i)
}
}
}
Fixes #75398
Change-Id: If75e9caaa50d460efc31a94565b9ba28c8158771
Reviewed-on: https://go-review.googlesource.com/c/go/+/702875
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This lets us remove useAvg and useHmul from the division rules.
The compiler is simpler and the generated code is faster.
goos: wasip1
goarch: wasm
pkg: internal/strconv
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
AppendFloat/Decimal 192.8n ± 1% 194.6n ± 0% +0.91% (p=0.000 n=10)
AppendFloat/Float 328.6n ± 0% 279.6n ± 0% -14.93% (p=0.000 n=10)
AppendFloat/Exp 335.6n ± 1% 289.2n ± 1% -13.80% (p=0.000 n=10)
AppendFloat/NegExp 336.0n ± 0% 289.1n ± 1% -13.97% (p=0.000 n=10)
AppendFloat/LongExp 332.4n ± 0% 285.2n ± 1% -14.20% (p=0.000 n=10)
AppendFloat/Big 348.2n ± 0% 300.1n ± 0% -13.83% (p=0.000 n=10)
AppendFloat/BinaryExp 137.4n ± 0% 138.2n ± 0% +0.55% (p=0.001 n=10)
AppendFloat/32Integer 193.3n ± 1% 196.5n ± 0% +1.66% (p=0.000 n=10)
AppendFloat/32ExactFraction 283.3n ± 0% 268.9n ± 1% -5.08% (p=0.000 n=10)
AppendFloat/32Point 279.9n ± 0% 266.5n ± 0% -4.80% (p=0.000 n=10)
AppendFloat/32Exp 300.1n ± 0% 288.3n ± 1% -3.90% (p=0.000 n=10)
AppendFloat/32NegExp 288.2n ± 1% 277.9n ± 1% -3.59% (p=0.000 n=10)
AppendFloat/32Shortest 261.7n ± 0% 250.2n ± 0% -4.39% (p=0.000 n=10)
AppendFloat/32Fixed8Hard 173.3n ± 1% 158.9n ± 1% -8.31% (p=0.000 n=10)
AppendFloat/32Fixed9Hard 180.0n ± 0% 167.9n ± 2% -6.70% (p=0.000 n=10)
AppendFloat/64Fixed1 167.1n ± 0% 149.6n ± 1% -10.50% (p=0.000 n=10)
AppendFloat/64Fixed2 162.4n ± 1% 146.5n ± 0% -9.73% (p=0.000 n=10)
AppendFloat/64Fixed2.5 165.5n ± 0% 149.4n ± 1% -9.70% (p=0.000 n=10)
AppendFloat/64Fixed3 166.4n ± 1% 150.2n ± 0% -9.74% (p=0.000 n=10)
AppendFloat/64Fixed4 163.7n ± 0% 149.6n ± 1% -8.62% (p=0.000 n=10)
AppendFloat/64Fixed5Hard 182.8n ± 1% 167.1n ± 1% -8.61% (p=0.000 n=10)
AppendFloat/64Fixed12 222.2n ± 0% 208.8n ± 0% -6.05% (p=0.000 n=10)
AppendFloat/64Fixed16 197.6n ± 1% 181.7n ± 0% -8.02% (p=0.000 n=10)
AppendFloat/64Fixed12Hard 194.5n ± 0% 181.0n ± 0% -6.99% (p=0.000 n=10)
AppendFloat/64Fixed17Hard 205.1n ± 1% 191.9n ± 0% -6.44% (p=0.000 n=10)
AppendFloat/64Fixed18Hard 6.269µ ± 0% 6.643µ ± 0% +5.97% (p=0.000 n=10)
AppendFloat/64FixedF1 211.7n ± 1% 197.0n ± 0% -6.95% (p=0.000 n=10)
AppendFloat/64FixedF2 189.4n ± 0% 174.2n ± 0% -8.08% (p=0.000 n=10)
AppendFloat/64FixedF3 169.0n ± 0% 154.9n ± 0% -8.32% (p=0.000 n=10)
AppendFloat/Slowpath64 321.2n ± 0% 274.2n ± 1% -14.63% (p=0.000 n=10)
AppendFloat/SlowpathDenormal64 307.4n ± 1% 261.2n ± 0% -15.03% (p=0.000 n=10)
AppendInt 3.367µ ± 1% 3.376µ ± 0% ~ (p=0.517 n=10)
AppendUint 675.5n ± 0% 676.9n ± 0% ~ (p=0.196 n=10)
AppendIntSmall 28.13n ± 1% 28.17n ± 0% +0.14% (p=0.015 n=10)
AppendUintVarlen/digits=1 20.70n ± 0% 20.51n ± 1% -0.89% (p=0.018 n=10)
AppendUintVarlen/digits=2 20.43n ± 0% 20.27n ± 0% -0.81% (p=0.001 n=10)
AppendUintVarlen/digits=3 38.48n ± 0% 37.93n ± 0% -1.43% (p=0.000 n=10)
AppendUintVarlen/digits=4 41.10n ± 0% 38.78n ± 1% -5.62% (p=0.000 n=10)
AppendUintVarlen/digits=5 42.25n ± 1% 42.11n ± 0% -0.32% (p=0.041 n=10)
AppendUintVarlen/digits=6 45.40n ± 1% 43.14n ± 0% -4.98% (p=0.000 n=10)
AppendUintVarlen/digits=7 46.81n ± 1% 46.03n ± 0% -1.66% (p=0.000 n=10)
AppendUintVarlen/digits=8 48.88n ± 1% 46.59n ± 1% -4.68% (p=0.000 n=10)
AppendUintVarlen/digits=9 49.94n ± 2% 49.41n ± 1% -1.06% (p=0.000 n=10)
AppendUintVarlen/digits=10 57.28n ± 1% 56.92n ± 1% -0.62% (p=0.045 n=10)
AppendUintVarlen/digits=11 60.09n ± 1% 58.11n ± 2% -3.30% (p=0.000 n=10)
AppendUintVarlen/digits=12 62.22n ± 0% 61.85n ± 0% -0.59% (p=0.000 n=10)
AppendUintVarlen/digits=13 64.94n ± 0% 62.92n ± 0% -3.10% (p=0.000 n=10)
AppendUintVarlen/digits=14 65.42n ± 1% 65.19n ± 1% -0.34% (p=0.005 n=10)
AppendUintVarlen/digits=15 68.17n ± 0% 66.13n ± 0% -2.99% (p=0.000 n=10)
AppendUintVarlen/digits=16 70.21n ± 1% 70.09n ± 1% ~ (p=0.517 n=10)
AppendUintVarlen/digits=17 72.93n ± 0% 70.49n ± 0% -3.34% (p=0.000 n=10)
AppendUintVarlen/digits=18 73.01n ± 0% 72.75n ± 0% -0.35% (p=0.000 n=10)
AppendUintVarlen/digits=19 79.27n ± 1% 79.49n ± 1% ~ (p=0.671 n=10)
AppendUintVarlen/digits=20 82.18n ± 0% 80.43n ± 1% -2.14% (p=0.000 n=10)
geomean 143.4n 136.0n -5.20%
Change-Id: I8245814a0259ad13cf9225f57db8e9fe3d2e4267
Reviewed-on: https://go-review.googlesource.com/c/go/+/717407
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Fix sorting slices to avoid panic when there are more opsDataImm than
opsData (the problem occurs when generating only a subset of instructions
but it may be better to keep them sorted by their own names anyway).
Change-Id: Iea7fe61259e8416f16c46158d87c84b1d7a3076d
Reviewed-on: https://go-review.googlesource.com/c/go/+/716121
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Junyang Shao <shaojunyang@google.com>
|
|
Since we always can get the address of `CALL runtime.deferreturn(SB)`
from the unwinder, so it is not necessary to record the caller's pc
in the _defer struct. For the stack allocated _defer, this CL makes
the frame smaller.
Change-Id: I0fd347e4bc07cf8a9b954816323df30fc52552b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/716720
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Intrinsifying things inside the module (crypto/internal/fips140/subtle)
is asking for trouble, as the import paths are rewritten by the
GOFIPS140 mechanism, and we might have to support multiple modules
in the future.
Importing crypto/subtle from inside a FIPS 140-3 module is not allowed,
and is basically asking for circular dependencies.
Instead, break off the intrinsics into their own package
(crypto/internal/constanttime), and keep the byte slice operations
in crypto/internal/fips140/subtle. crypto/subtle then becomes a thin
dispatch layer.
Change-Id: I6a6a6964cd5cb5ad06e9d1679201447f5a811da4
Reviewed-on: https://go-review.googlesource.com/c/go/+/716120
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
Prove currently checks for 0 sign bit extraction (x>>63) at the
end of the pass, but it is more general and more useful
(and not really more work) to model right shift during
value range tracking. This handles sign bit extraction (both 0 and -1)
but also makes the value ranges available for proving bounds checks.
'go build -a -gcflags=-d=ssa/prove/debug=1 std'
finds 105 new things to prove.
https://gist.github.com/rsc/8ac41176e53ed9c2f1a664fc668e8336
For example, the compiler now recognizes that this code in
strconv does not need to check the second shift for being ≥ 64.
msb := xHi >> 63
retMantissa := xHi >> (msb + 38)
nor does this code in regexp:
return b < utf8.RuneSelf && specialBytes[b%16]&(1<<(b/16)) != 0
This code in math no longer has a bounds check on the first index:
if 0 <= n && n <= 308 {
return pow10postab32[uint(n)/32] * pow10tab[uint(n)%32]
}
The diff shows one "lost" proof in ycbcr.go but it's not really lost:
the expression was folded to a constant instead, and that only shows
up with debug=2. A diff of that output is at
https://gist.github.com/rsc/9139ed46c6019ae007f5a1ba4bb3250f
Change-Id: I84087311e0a303f00e2820d957a6f8b29ee22519
Reviewed-on: https://go-review.googlesource.com/c/go/+/716140
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL implements Mul64uhilo, Hmul64, Hmul64u, and Avg64u
on 32-bit systems, with the effect that constant division of both
int64s and uint64s can now be emitted directly in all cases,
and also that bits.Mul64 can be intrinsified on 32-bit systems.
Previously, constant division of uint64s by values 0 ≤ c ≤ 0xFFFF were
implemented as uint32 divisions by c and some fixup. After expanding
those smaller constant divisions, the code for i/999 required:
(386) 7 mul, 10 add, 2 sub, 3 rotate, 3 shift (104 bytes)
(arm) 7 mul, 9 add, 3 sub, 2 shift (104 bytes)
(mips) 7 mul, 10 add, 5 sub, 6 shift, 3 sgtu (176 bytes)
For that much code, we might as well use a full 64x64->128 multiply
that can be used for all divisors, not just small ones.
Having done that, the same i/999 now generates:
(386) 4 mul, 9 add, 2 sub, 2 or, 6 shift (112 bytes)
(arm) 4 mul, 8 add, 2 sub, 2 or, 3 shift (92 bytes)
(mips) 4 mul, 11 add, 3 sub, 6 shift, 8 sgtu, 4 or (196 bytes)
The size increase on 386 is due to a few extra register spills.
The size increase on mips is due to add-with-carry being hard.
The new approach is more general, letting us delete the old special case
and guarantee that all int64 and uint64 divisions by constants are
generated directly on 32-bit systems.
This especially speeds up code making heavy use of bits.Mul64 with
a constant argument, which happens in strconv and various crypto
packages. A few examples are benchmarked below.
pkg: cmd/compile/internal/test
benchmark \ host local linux-amd64 s7 linux-386 s7:GOARCH=386
vs base vs base vs base vs base vs base
DivconstI64 ~ ~ ~ -49.66% -21.02%
ModconstI64 ~ ~ ~ -13.45% +14.52%
DivisiblePow2constI64 ~ ~ ~ +0.97% -1.32%
DivisibleconstI64 ~ ~ ~ -20.01% -48.28%
DivisibleWDivconstI64 ~ ~ -1.76% -38.59% -42.74%
DivconstU64/3 ~ ~ ~ -13.82% -4.09%
DivconstU64/5 ~ ~ ~ -14.10% -3.54%
DivconstU64/37 -2.07% -4.45% ~ -19.60% -9.55%
DivconstU64/1234567 ~ ~ ~ -61.55% -56.93%
ModconstU64 ~ ~ ~ -6.25% ~
DivisibleconstU64 ~ ~ ~ -2.78% -7.82%
DivisibleWDivconstU64 ~ ~ ~ +4.23% +2.56%
pkg: math/bits
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
Add ~ ~ ~ ~
Add32 +1.59% ~ ~ ~
Add64 ~ ~ ~ ~
Add64multiple ~ ~ ~ ~
Sub ~ ~ ~ ~
Sub32 ~ ~ ~ ~
Sub64 ~ ~ -9.20% ~
Sub64multiple ~ ~ ~ ~
Mul ~ ~ ~ ~
Mul32 ~ ~ ~ ~
Mul64 ~ ~ -41.58% -53.21%
Div ~ ~ ~ ~
Div32 ~ ~ ~ ~
Div64 ~ ~ ~ ~
pkg: strconv
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
ParseInt/Pos/7bit ~ ~ -11.08% -6.75%
ParseInt/Pos/26bit ~ ~ -13.65% -11.02%
ParseInt/Pos/31bit ~ ~ -14.65% -9.71%
ParseInt/Pos/56bit -1.80% ~ -17.97% -10.78%
ParseInt/Pos/63bit ~ ~ -13.85% -9.63%
ParseInt/Neg/7bit ~ ~ -12.14% -7.26%
ParseInt/Neg/26bit ~ ~ -14.18% -9.81%
ParseInt/Neg/31bit ~ ~ -14.51% -9.02%
ParseInt/Neg/56bit ~ ~ -15.79% -9.79%
ParseInt/Neg/63bit ~ ~ -15.68% -11.07%
AppendFloat/Decimal ~ ~ -7.25% -12.26%
AppendFloat/Float ~ ~ -15.96% -19.45%
AppendFloat/Exp ~ ~ -13.96% -17.76%
AppendFloat/NegExp ~ ~ -14.89% -20.27%
AppendFloat/LongExp ~ ~ -12.68% -17.97%
AppendFloat/Big ~ ~ -11.10% -16.64%
AppendFloat/BinaryExp ~ ~ ~ ~
AppendFloat/32Integer ~ ~ -10.05% -10.91%
AppendFloat/32ExactFraction ~ ~ -8.93% -13.00%
AppendFloat/32Point ~ ~ -10.36% -14.89%
AppendFloat/32Exp ~ ~ -9.88% -13.54%
AppendFloat/32NegExp ~ ~ -10.16% -14.26%
AppendFloat/32Shortest ~ ~ -11.39% -14.96%
AppendFloat/32Fixed8Hard ~ ~ ~ -2.31%
AppendFloat/32Fixed9Hard ~ ~ ~ -7.01%
AppendFloat/64Fixed1 ~ ~ -2.83% -8.23%
AppendFloat/64Fixed2 ~ ~ ~ -7.94%
AppendFloat/64Fixed3 ~ ~ -4.07% -7.22%
AppendFloat/64Fixed4 ~ ~ -7.24% -7.62%
AppendFloat/64Fixed12 ~ ~ -6.57% -4.82%
AppendFloat/64Fixed16 ~ ~ -4.00% -5.81%
AppendFloat/64Fixed12Hard -2.22% ~ -4.07% -6.35%
AppendFloat/64Fixed17Hard -2.12% ~ ~ -3.79%
AppendFloat/64Fixed18Hard -1.89% ~ +2.48% ~
AppendFloat/Slowpath64 -1.85% ~ -14.49% -18.21%
AppendFloat/SlowpathDenormal64 ~ ~ -13.08% -19.41%
pkg: crypto/internal/fips140/nistec/fiat
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
Mul/P224 ~ ~ -29.95% -39.60%
Mul/P384 ~ ~ -37.11% -63.33%
Mul/P521 ~ ~ -26.62% -12.42%
Square/P224 +1.46% ~ -40.62% -49.18%
Square/P384 ~ ~ -45.51% -69.68%
Square/P521 +90.37% ~ -25.26% -11.23%
(The +90% is a separate problem and not real; that much variation
can be seen on that system by running the same binary from two
different files.)
pkg: crypto/internal/fips140/edwards25519
benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386
vs base vs base vs base vs base
EncodingDecoding ~ ~ -34.67% -35.75%
ScalarBaseMult ~ ~ -31.25% -30.29%
ScalarMult ~ ~ -33.45% -32.54%
VarTimeDoubleScalarBaseMult ~ ~ -33.78% -33.68%
Change-Id: Id3c91d42cd01def6731b755e99f8f40c6ad1bb65
Reviewed-on: https://go-review.googlesource.com/c/go/+/716061
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Fixes #76103.
Change-Id: Idc2f5d1d7aeb4a9b468e7c268e3bf5b85d1c3777
Reviewed-on: https://go-review.googlesource.com/c/go/+/716300
Reviewed-by: Robert Griesemer <gri@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Robert Griesemer <gri@google.com>
|
|
This CL introduces new divisible and divmod passes that rewrite
divisibility checks and div, mod, and mul. These happen after
prove, so that prove can make better sense of the code for
deriving bounds, and they must run before decompose, so that
64-bit ops can be lowered to 32-bit ops on 32-bit systems.
And then they need another generic pass as well, to optimize
the generated code before decomposing.
The three opt passes are "opt", "middle opt", and "late opt".
(Perhaps instead they should be "generic", "opt", and "late opt"?)
The "late opt" pass repeats the "middle opt" work on any new code
that has been generated in the interim.
There will not be new divs or mods, but there may be new muls.
The x%c==0 rewrite rules are much simpler now, since they can
match before divs have been rewritten. This has the effect of
applying them more consistently and making the rewrite rules
independent of the exact div rewrites.
Prove is also now charged with marking signed div/mod as
unsigned when the arguments call for it, allowing simpler
code to be emitted in various cases. For example,
t.Seconds()/2 and len(x)/2 are now recognized as unsigned,
meaning they compile to a simple shift (unsigned division),
avoiding the more complex fixup we need for signed values.
https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32
shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std'
output before and after. "Proved Rsh64x64 shifts to zero" is replaced
by the higher-level "Proved Div64 is unsigned" (the shift was in the
signed expansion of div by constant), but otherwise prove is only
finding more things to prove.
One short example, in code that does x[i%len(x)]:
< runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
---
> runtime/mfinal.go:131:34: Proved Div64 is unsigned
> runtime/mfinal.go:131:38: Proved IsInBounds
A longer example:
< crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
---
> crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds
These diffs are due to the smaller opt being better
and taking work away from prove:
< image/jpeg/dct.go:307:5: Proved IsInBounds
< image/jpeg/dct.go:308:5: Proved IsInBounds
...
< image/jpeg/dct.go:442:5: Proved IsInBounds
In the old opt, Mul by 8 was rewritten to Lsh by 3 early.
This CL delays that rule to help prove recognize mods,
but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, computing the length, opt can now do:
(Sub64 (Add (Mul 8 i) 8) (Add (Mul 8 i) 8)) ->
(Add 8 (Sub (Mul 8 i) (Mul 8 i))) ->
(Add 8 (Mul 8 (Sub i i))) ->
(Add 8 (Mul 8 0)) ->
(Add 8 0) ->
8
The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)),
Leaving the multiply as Mul enables using that step; the old
rewrite to Lsh blocked it, leaving prove to figure out the length
and then remove the bounds checks. But now opt can evaluate
the length down to a constant 8 and then constant-fold away
the bounds checks 0 < 8, 1 < 8, and so on. After that,
the compiler has nothing left to prove.
Benchmarks are noisy in general; I checked the assembly for the many
large increases below, and the vast majority are unchanged and
presumably hitting the caches differently in some way.
The divisibility optimizations were not reliably triggering before.
This leads to a very large improvement in some cases, like
DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems
and DivisbleconstU64 on 32-bit systems.
Another way the divisibility optimizations were unreliable before
was incorrectly triggering for x/3, x%3 even though they are
written not to do that. There is a real but small slowdown
in the DivisibleWDivconst benchmarks on Mac because in the cases
used in the benchmark, it is still faster (on Mac) to do the
divisibility check than to remultiply.
This may be worth further study. Perhaps when there is no rotate
(meaning the divisor is odd), the divisibility optimization
should be enabled always. In any event, this CL makes it possible
to study that.
benchmark \ host s7 linux-amd64 mac linux-arm64 linux-ppc64le linux-386 s7:GOARCH=386 linux-arm
vs base vs base vs base vs base vs base vs base vs base vs base
LoadAdd ~ ~ ~ ~ ~ -1.59% ~ ~
ExtShift ~ ~ -42.14% +0.10% ~ +1.44% +5.66% +8.50%
Modify ~ ~ ~ ~ ~ ~ ~ -1.53%
MullImm ~ ~ ~ ~ ~ +37.90% -21.87% +3.05%
ConstModify ~ ~ ~ ~ -49.14% ~ ~ ~
BitSet ~ ~ ~ ~ -15.86% -14.57% +6.44% +0.06%
BitClear ~ ~ ~ ~ ~ +1.78% +3.50% +0.06%
BitToggle ~ ~ ~ ~ ~ -16.09% +2.91% ~
BitSetConst ~ ~ ~ ~ ~ ~ ~ -0.49%
BitClearConst ~ ~ ~ ~ -28.29% ~ ~ -0.40%
BitToggleConst ~ ~ ~ +8.89% -31.19% ~ ~ -0.77%
MulNeg ~ ~ ~ ~ ~ ~ ~ ~
Mul2Neg ~ ~ -4.83% ~ ~ -13.75% -5.92% ~
DivconstI64 ~ ~ ~ ~ ~ -30.12% ~ +0.50%
ModconstI64 ~ ~ -9.94% -4.63% ~ +3.15% ~ +5.32%
DivisiblePow2constI64 -34.49% -12.58% ~ ~ -12.25% ~ ~ ~
DivisibleconstI64 -24.69% -25.06% -0.40% -2.27% -42.61% -3.31% ~ +1.63%
DivisibleWDivconstI64 ~ ~ ~ ~ ~ -17.55% ~ -0.60%
DivconstU64/3 ~ ~ ~ ~ ~ +1.51% ~ ~
DivconstU64/5 ~ ~ ~ ~ ~ ~ ~ ~
DivconstU64/37 ~ ~ -0.18% ~ ~ +2.70% ~ ~
DivconstU64/1234567 ~ ~ ~ ~ ~ ~ ~ +0.12%
ModconstU64 ~ ~ ~ -0.24% ~ -5.10% -1.07% -1.56%
DivisibleconstU64 ~ ~ ~ ~ ~ -29.01% -59.13% -50.72%
DivisibleWDivconstU64 ~ ~ -12.18% -18.88% ~ -5.50% -3.91% +5.17%
DivconstI32 ~ ~ -0.48% ~ -34.69% +89.01% -6.01% -16.67%
ModconstI32 ~ +2.95% -0.33% ~ ~ -2.98% -5.40% -8.30%
DivisiblePow2constI32 ~ ~ ~ ~ ~ ~ ~ -16.22%
DivisibleconstI32 ~ ~ ~ ~ ~ -37.27% -47.75% -25.03%
DivisibleWDivconstI32 -11.59% +5.22% -12.99% -23.83% ~ +45.95% -7.03% -10.01%
DivconstU32 ~ ~ ~ ~ ~ +74.71% +4.81% ~
ModconstU32 ~ ~ +0.53% +0.18% ~ +51.16% ~ ~
DivisibleconstU32 ~ ~ ~ -0.62% ~ -4.25% ~ ~
DivisibleWDivconstU32 -2.77% +5.56% +11.12% -5.15% ~ +48.70% +25.11% -4.07%
DivconstI16 -6.06% ~ -0.33% +0.22% ~ ~ -9.68% +5.47%
ModconstI16 ~ ~ +4.44% +2.82% ~ ~ ~ +5.06%
DivisiblePow2constI16 ~ ~ ~ ~ ~ ~ ~ -0.17%
DivisibleconstI16 ~ ~ -0.23% ~ ~ ~ +4.60% +6.64%
DivisibleWDivconstI16 -1.44% -0.43% +13.48% -5.76% ~ +1.62% -23.15% -9.06%
DivconstU16 +1.61% ~ -0.35% -0.47% ~ ~ +15.59% ~
ModconstU16 ~ ~ ~ ~ ~ -0.72% ~ +14.23%
DivisibleconstU16 ~ ~ -0.05% +3.00% ~ ~ ~ +5.06%
DivisibleWDivconstU16 +52.10% +0.75% +17.28% +4.79% ~ -37.39% +5.28% -9.06%
DivconstI8 ~ ~ -0.34% -0.96% ~ ~ -9.20% ~
ModconstI8 +2.29% ~ +4.38% +2.96% ~ ~ ~ ~
DivisiblePow2constI8 ~ ~ ~ ~ ~ ~ ~ ~
DivisibleconstI8 ~ ~ ~ ~ ~ ~ +6.04% ~
DivisibleWDivconstI8 -26.44% +1.69% +17.03% +4.05% ~ +32.48% -24.90% ~
DivconstU8 -4.50% +14.06% -0.28% ~ ~ ~ +4.16% +0.88%
ModconstU8 ~ ~ +25.84% -0.64% ~ ~ ~ ~
DivisibleconstU8 ~ ~ -5.70% ~ ~ ~ ~ ~
DivisibleWDivconstU8 +49.55% +9.07% ~ +4.03% +53.87% -40.03% +39.72% -3.01%
Mul2 ~ ~ ~ ~ ~ ~ ~ ~
MulNeg2 ~ ~ ~ ~ -11.73% ~ ~ -0.02%
EfaceInteger ~ ~ ~ ~ ~ +18.11% ~ +2.53%
TypeAssert +33.90% +2.86% ~ ~ ~ -1.07% -5.29% -1.04%
Div64UnsignedSmall ~ ~ ~ ~ ~ ~ ~ ~
Div64Small ~ ~ ~ ~ ~ -0.88% ~ +2.39%
Div64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +0.35%
Div64SmallNegDividend ~ ~ ~ ~ ~ -0.84% ~ +3.57%
Div64SmallNegBoth ~ ~ ~ ~ ~ -0.86% ~ +3.55%
Div64Unsigned ~ ~ ~ ~ ~ ~ ~ -0.11%
Div64 ~ ~ ~ ~ ~ ~ ~ +0.11%
Div64NegDivisor ~ ~ ~ ~ ~ -1.29% ~ ~
Div64NegDividend ~ ~ ~ ~ ~ -1.44% ~ ~
Div64NegBoth ~ ~ ~ ~ ~ ~ ~ +0.28%
Mod64UnsignedSmall ~ ~ ~ ~ ~ +0.48% ~ +0.93%
Mod64Small ~ ~ ~ ~ ~ ~ ~ ~
Mod64SmallNegDivisor ~ ~ ~ ~ ~ ~ ~ +1.44%
Mod64SmallNegDividend ~ ~ ~ ~ ~ +0.22% ~ +1.37%
Mod64SmallNegBoth ~ ~ ~ ~ ~ ~ ~ -2.22%
Mod64Unsigned ~ ~ ~ ~ ~ -0.95% ~ +0.11%
Mod64 ~ ~ ~ ~ ~ ~ ~ ~
Mod64NegDivisor ~ ~ ~ ~ ~ ~ ~ -0.02%
Mod64NegDividend ~ ~ ~ ~ ~ ~ ~ ~
Mod64NegBoth ~ ~ ~ ~ ~ ~ ~ -0.02%
MulconstI32/3 ~ ~ ~ -25.00% ~ ~ ~ +47.37%
MulconstI32/5 ~ ~ ~ +33.28% ~ ~ ~ +32.21%
MulconstI32/12 ~ ~ ~ -2.13% ~ ~ ~ -0.02%
MulconstI32/120 ~ ~ ~ +2.93% ~ ~ ~ -0.03%
MulconstI32/-120 ~ ~ ~ -2.17% ~ ~ ~ -0.03%
MulconstI32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03%
MulconstI32/65538 ~ ~ ~ ~ ~ -33.38% ~ +0.04%
MulconstI64/3 ~ ~ ~ +33.35% ~ -0.37% ~ -0.13%
MulconstI64/5 ~ ~ ~ -25.00% ~ -0.34% ~ ~
MulconstI64/12 ~ ~ ~ +2.13% ~ +11.62% ~ +2.30%
MulconstI64/120 ~ ~ ~ -1.98% ~ ~ ~ ~
MulconstI64/-120 ~ ~ ~ +0.75% ~ ~ ~ ~
MulconstI64/65537 ~ ~ ~ ~ ~ +5.61% ~ ~
MulconstI64/65538 ~ ~ ~ ~ ~ +5.25% ~ ~
MulconstU32/3 ~ +0.81% ~ +33.39% ~ +77.92% ~ -32.31%
MulconstU32/5 ~ ~ ~ -24.97% ~ +77.92% ~ -24.47%
MulconstU32/12 ~ ~ ~ +2.06% ~ ~ ~ +0.03%
MulconstU32/120 ~ ~ ~ -2.74% ~ ~ ~ +0.03%
MulconstU32/65537 ~ ~ ~ ~ ~ ~ ~ +0.03%
MulconstU32/65538 ~ ~ ~ ~ ~ -33.42% ~ -0.03%
MulconstU64/3 ~ ~ ~ +33.33% ~ -0.28% ~ +1.22%
MulconstU64/5 ~ ~ ~ -25.00% ~ ~ ~ -0.64%
MulconstU64/12 ~ ~ ~ +2.30% ~ +11.59% ~ +0.14%
MulconstU64/120 ~ ~ ~ -2.82% ~ ~ ~ +0.04%
MulconstU64/65537 ~ +0.37% ~ ~ ~ +5.58% ~ ~
MulconstU64/65538 ~ ~ ~ ~ ~ +5.16% ~ ~
ShiftArithmeticRight ~ ~ ~ ~ ~ -10.81% ~ +0.31%
Switch8Predictable +14.69% ~ ~ ~ ~ -24.85% ~ ~
Switch8Unpredictable ~ -0.58% -3.80% ~ ~ -11.78% ~ -0.79%
Switch32Predictable -10.33% +17.89% ~ ~ ~ +5.76% ~ ~
Switch32Unpredictable -3.15% +1.19% +9.42% ~ ~ -10.30% -5.09% +0.44%
SwitchStringPredictable +70.88% +20.48% ~ ~ ~ +2.39% ~ +0.31%
SwitchStringUnpredictable ~ +3.91% -5.06% -0.98% ~ +0.61% +2.03% ~
SwitchTypePredictable +146.58% -1.10% ~ -12.45% ~ -0.46% -3.81% ~
SwitchTypeUnpredictable +0.46% -0.83% ~ +4.18% ~ +0.43% ~ +0.62%
SwitchInterfaceTypePredictable -13.41% -10.13% +11.03% ~ ~ -4.38% ~ +0.75%
SwitchInterfaceTypeUnpredictable -6.37% -2.14% ~ -3.21% ~ -4.20% ~ +1.08%
Fixes #63110.
Fixes #75954.
Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We noticed some hand-translated code that used
nested functions as the translation of asm macros,
and they were too big to inline, and the resulting
performance was underwhelming. Any such closures
really need to be inlined.
Because Gerrit removed votes from a previous patch
set, and because in offline discussion we realized
that this was actually a hard-to-abuse inlining hack,
I decided to turn it up some more, and also add a
"this one goes to 11" joke. The number is utterly
unprincipled, only "simd is supposed to go fast,
and this is a natural use of closures, and we don't
want there to be issues where it doesn't go fast."
The test verifies that the inlining occurs for a
function that exceeds the current inlining threshold.
Inspection of the generated code shows that it has
the desired effect.
Change-Id: I7a8b57c07d6482e6d98cedaf9622c960f956834d
Reviewed-on: https://go-review.googlesource.com/c/go/+/715740
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Runtime doing its own number formatting dates back to
when runtime was the bottom-most Go package.
Those days are long gone. Use internal/strconv to avoid
duplicating code and also to get better floating-point
formatting:
% go1.24.6 run x.go
+1.234568e+004
% go run x.go
12345.678
%
With accurate floating point it becomes necessary to
introduce separate printers for float32 vs float64 and
for complex64 vs complex128. Otherwise float32(93.7)
prints as 93.69999694824219.
Change-Id: I25ae3f09519342dc3d1dcabf4711651423e00128
Reviewed-on: https://go-review.googlesource.com/c/go/+/716002
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|