path: root/src/cmd/compile/internal
Age | Commit message | Author
2025-08-20 | [dev.simd] cmd/compile: rewrite to elide Slicemask from len==c>0 slicing | David Chase
This might have been something that prove could be educated into figuring out, but this also works, and it also helps prove downstream. Adjusted the prove test, because this change moved a message. Change-Id: I5eabe639eff5db9cd9766a6a8666fdb4973829cb Reviewed-on: https://go-review.googlesource.com/c/go/+/697715 Commit-Queue: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: David Chase <drchase@google.com>
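Why the mask is redundant in this case can be sketched in plain Go. This is an illustration, not the compiler's SSA rule: `slicemask` and `offsetBytes` are hypothetical helpers that model the masking applied to a slice's pointer offset, under the assumption that the mask is 0 for a zero-capacity result and all ones otherwise. When the result length is a constant c > 0, the capacity is at least c, so the mask is always all ones and the AND can be dropped.

```go
package main

import "fmt"

// slicemask is a hypothetical stand-in for the Slicemask op:
// it returns 0 when x == 0 and -1 (all ones) when x > 0.
func slicemask(x int) int {
	// A signed right shift by at least the word size sign-fills in Go.
	return -x >> 63
}

// offsetBytes models the byte offset added to the base pointer when
// slicing at index i, given the capacity of the resulting slice.
func offsetBytes(i, elemSize, newCap int) int {
	return (i * elemSize) & slicemask(newCap)
}

func main() {
	// Zero-capacity result: the offset is forced to 0 so the pointer
	// never points past the end of the backing array.
	fmt.Println(offsetBytes(8, 4, 0))
	// Constant length 4 > 0 implies newCap >= 4: the mask changes nothing.
	fmt.Println(offsetBytes(8, 4, 4) == 8*4)
}
```

With the mask known to be all ones, later passes such as prove see a plain pointer addition, which may be why the message notes that the change also helps downstream.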
2025-08-20 | [dev.simd] simd, cmd/compile: added .Masked() peephole opt for many operations. | David Chase
This should get many of the low-hanging and important fruit. Others can follow later. It needs more testing. Change-Id: Ic186b075987e85c87197ef9e1ca0b4f33ff96697 Reviewed-on: https://go-review.googlesource.com/c/go/+/697515 Reviewed-by: Junyang Shao <shaojunyang@google.com> Commit-Queue: David Chase <drchase@google.com> TryBot-Bypass: David Chase <drchase@google.com>
2025-08-20 | [dev.simd] simd, cmd/compile: sample peephole optimization for .Masked() | David Chase
This is not the end of such peephole optimizations; many more will be needed to cover the many simd operations. Change-Id: I4511f6fac502bc7259c1c4414c96f56eb400c202 Reviewed-on: https://go-review.googlesource.com/c/go/+/697157 TryBot-Bypass: David Chase <drchase@google.com> Commit-Queue: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
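The equivalence such a peephole relies on can be shown with a toy model in plain Go. It does not use the experimental simd package: `vec`, `mask`, `add`, `masked`, and `addMasked` are hypothetical names, and the model assumes that `.Masked()` zeroes the lanes whose mask element is false.

```go
package main

import "fmt"

// vec and mask are toy stand-ins for a 4-lane SIMD vector and its mask.
type vec [4]int32
type mask [4]bool

// add is the unmasked lane-wise operation.
func add(x, y vec) vec {
	var r vec
	for i := range r {
		r[i] = x[i] + y[i]
	}
	return r
}

// masked zeroes every lane whose mask element is false (assumed semantics).
func masked(v vec, m mask) vec {
	var r vec
	for i := range r {
		if m[i] {
			r[i] = v[i]
		}
	}
	return r
}

// addMasked is the fused form that a peephole rule could select in place
// of add followed by masked.
func addMasked(x, y vec, m mask) vec {
	var r vec
	for i := range r {
		if m[i] {
			r[i] = x[i] + y[i]
		}
	}
	return r
}

func main() {
	x, y := vec{1, 2, 3, 4}, vec{10, 20, 30, 40}
	m := mask{true, false, true, false}
	// The two-step form and the fused form agree lane by lane.
	fmt.Println(masked(add(x, y), m) == addMasked(x, y, m))
}
```

Each operation needs its own rule of this shape, which fits the message's remark that many such peepholes are required.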
2025-08-20 | [dev.simd] all: merge master (9de69f6) into dev.simd | Cherry Mui
Merge List:

+ 2025-08-20 9de69f6913 errors: mention Is/As in Join docs
+ 2025-08-20 4afd482812 cmd/go/internal/doc: pass URL fragments separately with -http
+ 2025-08-20 509d5f647f internal/poll: don't call Seek for overlapped Windows handles
+ 2025-08-20 853fc12739 internal/poll: set the correct file offset in FD.Seek for Windows overlapped handles
+ 2025-08-19 bd885401d5 runtime: save and restore all fcc registers in async preempt on loong64
+ 2025-08-19 119546ea4f cmd/go: document install outputs to $GOOS_$GOARCH when cross compiling
+ 2025-08-19 ffa882059c unique: deflake TestCanonMap/LoadOrStore/ConcurrentUnsharedKeys
+ 2025-08-19 1f2e8e03e4 os: fix path in MkdirTemp error message
+ 2025-08-19 5024d0d884 cmd/compile: tweak example command in README
+ 2025-08-19 b80ffb64d8 internal/trace: remove redundant info from Event.String
+ 2025-08-19 c7d8bda459 cmd/compile/internal: make function comments match function names
+ 2025-08-19 de2d741667 internal/trace: use RFC3339Nano for wall clock snapshots in Event.String
+ 2025-08-19 c61db5ebd5 syscall: forkAndExecInChild1: don't reuse pid variable
+ 2025-08-19 07ee3bfc63 cmd/go: use modern pprof flags in documentation
+ 2025-08-18 5a56d8848b cmd/compile: ensure we use allowed registers for input-clobbering instructions
+ 2025-08-18 c3927a47f0 runtime: fix comments in tracetype.go
+ 2025-08-15 77f911e31c internal/trace: emit final sync event for generation in Go 1.26+
+ 2025-08-15 786be1d2bf runtime: don't overwrite global stop channel in tests
+ 2025-08-15 4a7fde922f internal/trace: add end-of-generation signal to trace
+ 2025-08-15 cb814bd5bc net: skip TestIPv4WriteMsgUDPAddrPort on plan9
+ 2025-08-15 78a3968c2c runtime/metrics: add metric for current Go-owned thread count
+ 2025-08-15 ab8121a407 runtime/metrics: add metric for total goroutines created
+ 2025-08-15 13df972f68 runtime/metrics: add metrics for goroutine sched states
+ 2025-08-15 bd07fafb0a runtime: disable stack shrinking for all waiting-for-suspendG cases
+ 2025-08-15 a651e2ea47 runtime: remove duff support for amd64
+ 2025-08-15 e4291e484c runtime: remove duff support for arm64
+ 2025-08-15 15d6dbc05c cmd/compile: use generated loops instead of DUFFCOPY on arm64
+ 2025-08-15 bca3e98b8a cmd/go: test barrier actions
+ 2025-08-15 052fcde9fd internal/runtime: cleaner overflow checker
+ 2025-08-15 3871c0d84d syscall: permit nil destination address in sendmsgN{Inet4,Inet6}
+ 2025-08-14 a8564bd412 runtime: make all synctest bubble violations fatal panics

Change-Id: Ibc94566bc69bcb59b1d79b6fa868610ca2d1d223
2025-08-20 | [dev.simd] simd, cmd/compile: add widening unsigned converts 8->16->32 | David Chase
Change-Id: If0bde7154bd622573375eba5539fd642b8ef9d2f Reviewed-on: https://go-review.googlesource.com/c/go/+/696555 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-19 | [dev.simd] simd: make OpMasked machine ops only | Junyang Shao
Right now the `Op(...).Masked` idiom can be expected to lack many parts, which leaves the API incomplete. But to keep the API smaller, we are removing these ops' frontend types and interfaces for now. The peepholes, together with a new pass that checks the domination relations of CPU feature checks, will select these ops for the right `Op(...).Masked` idiom. Change-Id: I77f72a198b3d8b1880dcb911470db5e0089ac1ca Reviewed-on: https://go-review.googlesource.com/c/go/+/697155 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: Junyang Shao <shaojunyang@google.com>
2025-08-19 | cmd/compile/internal: make function comments match function names | cuishuang
Change-Id: I2ebadb87a4f60487c8f720930a72bc5ef953d7cf Reviewed-on: https://go-review.googlesource.com/c/go/+/696695 Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-19 | [dev.simd] simd, cmd/compile: implement ToMask, unexport asMask. | Junyang Shao
This CL defines the mask semantics more precisely: when converting from a vector to a mask, a mask element is set to true iff the corresponding vector element is nonzero. Change-Id: I331c1c7992dc9e81c211bdc6d73e5eb3b8414506 Reviewed-on: https://go-review.googlesource.com/c/go/+/697056 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
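The conversion rule stated above can be written out on ordinary arrays. This is only a model of the semantics: `toMask` here is a hypothetical free function over a 4-lane array, not the simd package's method.

```go
package main

import "fmt"

// toMask models the vector-to-mask rule on a 4-lane vector: a mask
// element is true iff the corresponding vector element is nonzero.
func toMask(v [4]int32) [4]bool {
	var m [4]bool
	for i, e := range v {
		m[i] = e != 0
	}
	return m
}

func main() {
	// Any nonzero value, including negative ones, selects the lane.
	fmt.Println(toMask([4]int32{0, 1, -1, 0})) // prints [false true true false]
}
```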
2025-08-18 | [dev.simd] simd, cmd/compile: mark BLEND instructions as not-zero-mask | David Chase
Change-Id: Ida9f29423d62a25be41dcf637ffb9275b7cae642 Reviewed-on: https://go-review.googlesource.com/c/go/+/697055 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-18 | [dev.simd] cmd/compile, simd: added methods for "float" GetElem | David Chase
This also required an "always use operation with least OverrideBase" filter in choosing the machine instructions. The order of generated HW operations is slightly modified because the Float version of GetElem appears earlier in the sorted operations list, though it is not chosen to generate the HW Op. Change-Id: I95fa67afca9c8b6f4f18941fdcaf69afdad8055b Reviewed-on: https://go-review.googlesource.com/c/go/+/696375 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-08-18 | cmd/compile: ensure we use allowed registers for input-clobbering instructions | Keith Randall
For instructions which clobber their input register, we make a second copy of the input value so it is still available in a register for future instructions. That second copy might not respect the register input restrictions for the instruction. So the second copy we make here can't actually be used by the instruction: the instruction should use the first copy, and the second copy is the one that will persist beyond the clobber. Fixes #75063 Change-Id: I99acdc63f0c4e54567a174ff7ada601ae4e796b7 Reviewed-on: https://go-review.googlesource.com/c/go/+/697015 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-15 | cmd/compile: use generated loops instead of DUFFCOPY on arm64 | Keith Randall
Change-Id: Ic2aa8959b7fc594b86def70b6c2be38badf7970c Reviewed-on: https://go-review.googlesource.com/c/go/+/679015 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
2025-08-15 | [dev.simd] cmd/compile: make move/load/store dependent only on reg and width | Junyang Shao
This CL improves on its previous CL by implementing move/load/storeByRegWidth. It should not have touched the compilation path of complex128, but as a side effect, moves, loads, and stores of 16-byte SIMD vectors in X0 to X15 are now compiled to MOVUPS instead of VMOVDQU. These functions could be used in MOV*const, but this CL does not do that because we haven't seen problems with them yet. If we do see problems in the future, calling these functions to find the right asm might be handy. Change-Id: I9b76e65eef8155479d3e288402aa96bc29a4f7cb Reviewed-on: https://go-review.googlesource.com/c/go/+/696255 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-15 | [dev.simd] cmd/compile: make (most) move/load/store lowering use reg and width only | Junyang Shao
This CL tries to clean up the move/load/store lowering a bit. After CL 695315 the register information for instructions is expected to be correct for SIMD, but we still need to pick the right instruction during ssa to asm lowering. The code before this CL should be working correctly, but MOVSSconst and MOVSDconst contain duplicated code, which this CL removes. This CL also rewrites move/load/storeByTypeAndReg to use only the width and reg for all non-SIMD types, which is more consistent. Change-Id: I76c14f3d0140bcbd4fbea0df275fee0202a3b7d9 Reviewed-on: https://go-review.googlesource.com/c/go/+/696175 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-14 | [dev.simd] cmd/compile: accounts rematerialize ops's output reginfo | Junyang Shao
This CL implements the check for a rematerializable value's output regspec at its rematerialization site. It has some potential problems; please see the TODO in regalloc.go. Fixes #70451. Change-Id: Ib624b967031776851136554719e939e9bf116b7c Reviewed-on: https://go-review.googlesource.com/c/go/+/695315 Reviewed-by: David Chase <drchase@google.com> TryBot-Bypass: David Chase <drchase@google.com>
2025-08-14 | [dev.simd] all: merge master (924fe98) into dev.simd | Cherry Mui
Conflicts:

- src/cmd/compile/internal/amd64/ssa.go
- src/cmd/compile/internal/ssa/expand_calls.go
- src/cmd/compile/internal/ssagen/ssa.go
- src/internal/buildcfg/exp.go
- src/internal/cpu/cpu.go
- src/internal/cpu/cpu_x86.go
- src/runtime/mkpreempt.go
- src/runtime/preempt_amd64.go
- src/runtime/preempt_amd64.s

Merge List:

+ 2025-08-14 924fe98902 cmd/internal/obj/riscv: add encoding for compressed riscv64 instructions
+ 2025-08-13 320df537cc cmd/compile: emit classify instructions for infinity tests on riscv64
+ 2025-08-13 ca66f907dd cmd/compile: use generated loops instead of DUFFCOPY on amd64
+ 2025-08-13 4b1800e476 encoding/json/v2: cleanup error constructors
+ 2025-08-13 af8870708b encoding/json/v2: fix incorrect marshaling of NaN in float64 any
+ 2025-08-13 0a75e5a07b encoding/json/v2: fix wrong type with cyclic marshal error in map[string]any
+ 2025-08-13 de9b6f9875 cmd/pprof: update vendored github.com/google/pprof
+ 2025-08-13 674c5f0edd os/exec: fix incorrect expansion of ".." in LookPath on plan9
+ 2025-08-13 9bbea0f21a cmd/compile: during regalloc, fixedreg values are always available
+ 2025-08-13 08eef97500 runtime/trace: fix documentation typo
+ 2025-08-13 2fe5d51d04 internal/trace: fix wrong scope for Event.Range or EvGCSweepActive
+ 2025-08-13 9fcb87c352 cmd/compile: teach prove about len's & cap's max based on the element size
+ 2025-08-13 9763ece873 cmd/compile: absorb NEGV into branch on loong64
+ 2025-08-13 f10a82b76f all: update vendored dependencies [generated]
+ 2025-08-13 3bea95b277 cmd/link/internal/ld: remove OpenBSD buildid workaround
+ 2025-08-12 90b7d7aaa2 cmd/compile/internal: optimize multiplication use new operation 'ADDshiftLLV' on loong64
+ 2025-08-12 1b263fc604 runtime/race: restore previous version of LLVM TSAN on macOS
+ 2025-08-12 b266318cf7 cmd/compile/internal/ssa: use BEQ/BNE to optimize the combination of XOR and EQ/NE on loong64
+ 2025-08-12 adbf59525c internal/runtime/gc/scan: avoid -1 index when cache sizes unavailable
+ 2025-08-12 4e182db5fc Revert "cmd/compile: use generated loops instead of DUFFCOPY on amd64"
+ 2025-08-12 d2b3c1a504 internal/trace: clarify which StateTransition events have stacks
+ 2025-08-12 f63e12d0e0 internal/trace: fix Sync.ClockSnapshot comment
+ 2025-08-12 8e317da77d internal/trace: remove unused StateTransition.id field
+ 2025-08-12 f67d8ff34a internal/trace/tracev2: adjust comment for consistency
+ 2025-08-12 fe4d445c36 internal/trace/tracev2: fix EvSTWBegin comment to include stack ID
+ 2025-08-12 750789fab7 internal/trace/internal/testgen: fix missing stacks nframes arg
+ 2025-08-12 889ab74169 internal/runtime/gc/scan: import scan kernel from gclab [green tea]
+ 2025-08-12 182336bf05 net/http: fix data race in client
+ 2025-08-12 f04421ea9a cmd/compile: soften test for 74788
+ 2025-08-12 28aa529c99 cmd/compile: use generated loops instead of DUFFZERO on arm64
+ 2025-08-12 ec9e1176c3 cmd/compile: use generated loops instead of DUFFCOPY on amd64
+ 2025-08-12 d0a64f7969 Revert "cmd/compile/internal/ssa: Use transitive properties for len/cap"
+ 2025-08-12 00a7bdcb55 all: delete aliastypeparams GOEXPERIMENT
+ 2025-08-11 74421a305b Revert "cmd/compile: allow multi-field structs to be stored directly in interfaces"
+ 2025-08-11 c31359138c Revert "cmd/compile: allow StructSelect [x] of interface data fields for x>0"
+ 2025-08-11 7248995b60 Revert "cmd/compile: allow more args in StructMake folding rule"
+ 2025-08-11 caf9fc3ccd Revert "reflect: handle zero-sized fields of directly-stored structures correctly"
+ 2025-08-11 ce3f3e2ae7 cmd/link/internal/ld, internal/syscall/unix: use posix_fallocate on netbsd
+ 2025-08-11 3dbef65bf3 database/sql: allow drivers to override Scan behavior
+ 2025-08-11 2b804abf07 net: context aware Dialer.Dial functions
+ 2025-08-11 6abfe7b0de cmd/dist: require Go 1.24.6 as minimum bootstrap toolchain
+ 2025-08-11 691af6ca28 encoding/json: fix Indent trailing whitespace regression in goexperiment.jsonv2
+ 2025-08-11 925149da20 net/http: add example for CrossOriginProtection
+ 2025-08-11 cf4af0b2f3 encoding/json/v2: fix UnmarshalDecode regression with EOF
+ 2025-08-11 b096ddb9ea internal/runtime/maps: loop invariant code motion with h2(hash) by hand
+ 2025-08-11 a2431776eb net, os, file/filepath, syscall: use slices.Equal in tests
+ 2025-08-11 a7f05b38f7 cmd/compile: convert branch with zero to more optimal branch zero on loong64
+ 2025-08-11 1718828c81 internal/sync: warn about incorrect unsafe usage in HashTrieMap
+ 2025-08-11 084c0f8494 cmd/compile: allow InlMark operations to be speculatively executed
+ 2025-08-10 a62f72f7a7 cmd/compile/internal/ssa: optimise more branches with SGTconst/SGTUconst on loong64
+ 2025-08-08 fbac94a799 internal/sync: rename Store parameter from old to new
+ 2025-08-08 317be4cfeb cmd/compile/internal/staticinit: remove deadcode
+ 2025-08-08 bce5601cbb cmd/go: fix fips doc link
+ 2025-08-08 777d76c4f2 text/template: use sync.OnceValue for builtinFuncs
+ 2025-08-08 0201524c52 math: remove redundant infinity tests
+ 2025-08-08 dcc77f9e3c cmd/go: fix get -tool when multiple packages are provided
+ 2025-08-08 c7b85e9ddc all: update blog link
+ 2025-08-08 a8dd771e13 crypto/tls: check if quic conn can send session ticket
+ 2025-08-08 bdb2d50fdf net: fix WriteMsgUDPAddrPort addr handling on IPv4 sockets
+ 2025-08-08 768c51e368 internal/runtime/maps: remove unused var bitsetDeleted
+ 2025-08-08 b3388569a1 reflect: handle zero-sized fields of directly-stored structures correctly
+ 2025-08-08 d83b16fcb8 internal/bytealg: vector implementation of compare for riscv64
+ 2025-08-07 dd3abf6bc5 internal/bytealg: optimize Index/IndexString on loong64
+ 2025-08-07 73ff6d1480 cmd/internal/obj/loong64: change the immediate range of ALSL{W/WU/V}
+ 2025-08-07 f3606b0825 cmd/compile/internal/ssa: fix typo in LOONG64Ops.go comment
+ 2025-08-07 ee7bb8969a cmd/internal/obj/loong64: add support for FSEL instruction
+ 2025-08-07 1f7ffca171 time: skip TestLongAdjustTimers on plan9 (too slow)
+ 2025-08-06 8282b72d62 runtime/race: update darwin race syso
+ 2025-08-06 dc54d7b607 all: remove support for windows/arm
+ 2025-08-06 e0a1ea431c cmd/compile: make panicBounds stack frame smaller on ppc64
+ 2025-08-06 2747f925dd debug/macho: support reading imported symbols without LC_DYSYMTAB
+ 2025-08-06 025d36917c cmd/internal/testdir: pass -buildid to link command
+ 2025-08-06 f53dcb6280 cmd/internal/testdir: unify link command
+ 2025-08-06 a3895fe9f1 database/sql: avoid closing Rows while scan is in progress
+ 2025-08-06 608e9fac90 go/types, types2: flip on position tracing
+ 2025-08-06 72e8237cc1 cmd/compile: allow more args in StructMake folding rule
+ 2025-08-06 3406a617d9 internal/bytealg: vector implementation of indexbyte for riscv64
+ 2025-08-06 75ea2d05c0 internal/bytealg: vector implementation of equal for riscv64
+ 2025-08-05 17a8be7117 crypto/sha512: use const table for key loading on loong64
+ 2025-08-05 dda9d780e2 crypto/sha256: use const table for key loading on loong64
+ 2025-08-05 5defe8ebb3 internal/chacha8rand: replace WORD with instruction VMOVQ
+ 2025-08-05 4c7362e41c cmd/internal/obj/loong64: add new instructions ALSL{W/WU/V} for loong64
+ 2025-08-05 a552737418 cmd/compile: fold negation into multiplication on loong64
+ 2025-08-05 e1fd4faf91 runtime: fix godoc comment for inVDSOPage
+ 2025-08-05 bcd25c79aa cmd/compile: allow StructSelect [x] of interface data fields for x>0
+ 2025-08-05 b0945a54b5 cmd/dist, internal/platform: mark freebsd/riscv64 broken
+ 2025-08-05 55d961b202 runtime: save AVX2 and AVX-512 state on asynchronous preemption
+ 2025-08-05 af0c4fe2ca runtime: save scalar registers off stack in amd64 async preemption
+ 2025-08-05 e73afaae69 internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"
+ 2025-08-05 cef381ba60 runtime: eliminate global state in mkpreempt.go
+ 2025-08-05 c0025d5e0b go/parser: correct comment in expectedErrors
+ 2025-08-05 4ee0df8c46 cmd: remove dead code
+ 2025-08-05 a2c45f0eb1 runtime: test VDSO symbol hash values
+ 2025-08-05 cd55f86b8d cmd/compile: allow multi-field structs to be stored directly in interfaces
+ 2025-08-05 21ab0128b6 cmd/compile: remove support for old-style bounds check calls
+ 2025-08-05 802d056c78 cmd/compile: move ppc64 over to new bounds check strategy
+ 2025-08-05 a3295df873 cmd/compile/internal/ssa: Use transitive properties for len/cap
+ 2025-08-05 bd082857a5 doc: fix typo in go memory model doc
+ 2025-08-05 2b622b05a9 cmd/compile: remove isUintXPowerOfTwo functions
+ 2025-08-05 72147ffa75 cmd/compile: simplify isUintXPowerOfTwo implementation
+ 2025-08-05 26da1199eb cmd/compile: make isUint{32,64}PowerOfTwo implementations clearer
+ 2025-08-05 5ab9f23977 cmd/compile, runtime: add checkptr instrumentation for unsafe.Add
+ 2025-08-05 fcc036f03b cmd/compile: optimise float <-> int register moves on riscv64

Change-Id: Ie94f29d9b0cc14a52a536866f5abaef27b5c52d7
2025-08-13 | cmd/compile: emit classify instructions for infinity tests on riscv64 | Michael Munday
The 'classify' instruction on RISC-V sets a bit in a mask to indicate the class a floating point value belongs to (e.g. whether the value is an infinity, a normal number, a subnormal number and so on). There are other places this instruction is useful but for now I've just used it for infinity tests. The gains are relatively small (~1-2 instructions per IsInf call) but using FCLASSD does potentially unlock further optimizations. It also reduces the number of loads from memory and the number of moves between general purpose and floating point register files.

goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
                     │   sec/op    │   sec/op     vs base                │
Acos                   159.9n ± 0%   173.7n ± 0%   +8.66% (p=0.000 n=10)
Acosh                  249.8n ± 0%   254.4n ± 0%   +1.86% (p=0.000 n=10)
Asin                   159.9n ± 0%   173.7n ± 0%   +8.66% (p=0.000 n=10)
Asinh                  292.2n ± 0%   283.0n ± 0%   -3.15% (p=0.000 n=10)
Atan                   119.1n ± 0%   119.0n ± 0%   -0.08% (p=0.036 n=10)
Atanh                  265.1n ± 0%   271.6n ± 0%   +2.43% (p=0.000 n=10)
Atan2                  194.9n ± 0%   186.7n ± 0%   -4.23% (p=0.000 n=10)
Cbrt                   216.3n ± 0%   203.1n ± 0%   -6.10% (p=0.000 n=10)
Ceil                   31.82n ± 0%   31.81n ± 0%        ~ (p=0.063 n=10)
Copysign               4.897n ± 0%   4.893n ± 3%   -0.08% (p=0.038 n=10)
Cos                    123.9n ± 0%   107.7n ± 1%  -13.03% (p=0.000 n=10)
Cosh                   293.0n ± 0%   264.6n ± 0%   -9.68% (p=0.000 n=10)
Erf                    150.0n ± 0%   133.8n ± 0%  -10.80% (p=0.000 n=10)
Erfc                   151.8n ± 0%   137.9n ± 0%   -9.16% (p=0.000 n=10)
Erfinv                 173.8n ± 0%   173.8n ± 0%        ~ (p=0.820 n=10)
Erfcinv                173.8n ± 0%   173.8n ± 0%        ~ (p=1.000 n=10)
Exp                    247.7n ± 0%   220.4n ± 0%  -11.04% (p=0.000 n=10)
ExpGo                  261.4n ± 0%   232.5n ± 0%  -11.04% (p=0.000 n=10)
Expm1                  176.2n ± 0%   164.9n ± 0%   -6.41% (p=0.000 n=10)
Exp2                   220.4n ± 0%   190.2n ± 0%  -13.70% (p=0.000 n=10)
Exp2Go                 232.5n ± 0%   204.0n ± 0%  -12.22% (p=0.000 n=10)
Abs                    4.897n ± 0%   4.897n ± 0%        ~ (p=0.726 n=10)
Dim                    16.32n ± 0%   16.31n ± 0%        ~ (p=0.770 n=10)
Floor                  31.84n ± 0%   31.83n ± 0%        ~ (p=0.677 n=10)
Max                    26.11n ± 0%   26.13n ± 0%        ~ (p=0.290 n=10)
Min                    26.10n ± 0%   26.11n ± 0%        ~ (p=0.424 n=10)
Mod                    416.2n ± 0%   337.8n ± 0%  -18.83% (p=0.000 n=10)
Frexp                  63.65n ± 0%   50.60n ± 0%  -20.50% (p=0.000 n=10)
Gamma                  218.8n ± 0%   206.4n ± 0%   -5.62% (p=0.000 n=10)
Hypot                  92.20n ± 0%   94.69n ± 0%   +2.70% (p=0.000 n=10)
HypotGo                107.7n ± 0%   109.3n ± 0%   +1.49% (p=0.000 n=10)
Ilogb                  59.54n ± 0%   44.04n ± 0%  -26.04% (p=0.000 n=10)
J0                     708.9n ± 0%   674.5n ± 0%   -4.86% (p=0.000 n=10)
J1                     707.6n ± 0%   676.1n ± 0%   -4.44% (p=0.000 n=10)
Jn                     1.513µ ± 0%   1.427µ ± 0%   -5.68% (p=0.000 n=10)
Ldexp                  70.20n ± 0%   57.09n ± 0%  -18.68% (p=0.000 n=10)
Lgamma                 201.5n ± 0%   185.3n ± 1%   -8.01% (p=0.000 n=10)
Log                    201.5n ± 0%   182.7n ± 0%   -9.35% (p=0.000 n=10)
Logb                   59.54n ± 0%   46.53n ± 0%  -21.86% (p=0.000 n=10)
Log1p                  178.8n ± 0%   173.9n ± 6%   -2.74% (p=0.021 n=10)
Log10                  201.4n ± 0%   184.3n ± 0%   -8.49% (p=0.000 n=10)
Log2                   79.17n ± 0%   66.07n ± 0%  -16.54% (p=0.000 n=10)
Modf                   34.27n ± 0%   34.25n ± 0%        ~ (p=0.559 n=10)
Nextafter32            49.34n ± 0%   49.37n ± 0%   +0.05% (p=0.040 n=10)
Nextafter64            43.66n ± 0%   43.66n ± 0%        ~ (p=0.869 n=10)
PowInt                 309.1n ± 0%   267.4n ± 0%  -13.49% (p=0.000 n=10)
PowFrac                769.6n ± 0%   677.3n ± 0%  -11.98% (p=0.000 n=10)
Pow10Pos               13.88n ± 0%   13.88n ± 0%        ~ (p=0.811 n=10)
Pow10Neg               19.58n ± 0%   19.57n ± 0%        ~ (p=0.993 n=10)
Round                  23.65n ± 0%   23.66n ± 0%        ~ (p=0.354 n=10)
RoundToEven            27.75n ± 0%   27.75n ± 0%        ~ (p=0.971 n=10)
Remainder              380.0n ± 0%   309.9n ± 0%  -18.45% (p=0.000 n=10)
Signbit                13.06n ± 0%   13.06n ± 0%        ~ (p=1.000 n=10)
Sin                    133.8n ± 0%   120.8n ± 0%   -9.75% (p=0.000 n=10)
Sincos                 160.7n ± 0%   147.7n ± 0%   -8.12% (p=0.000 n=10)
Sinh                   305.9n ± 0%   277.9n ± 0%   -9.17% (p=0.000 n=10)
SqrtIndirect           3.265n ± 0%   3.264n ± 0%        ~ (p=0.546 n=10)
SqrtLatency            19.58n ± 0%   19.58n ± 0%        ~ (p=0.973 n=10)
SqrtIndirectLatency    19.59n ± 0%   19.58n ± 0%        ~ (p=0.370 n=10)
SqrtGoLatency          205.7n ± 0%   202.7n ± 0%   -1.46% (p=0.000 n=10)
SqrtPrime              4.953µ ± 0%   4.954µ ± 0%        ~ (p=0.477 n=10)
Tan                    163.2n ± 0%   150.2n ± 0%   -7.99% (p=0.000 n=10)
Tanh                   312.4n ± 0%   284.2n ± 0%   -9.01% (p=0.000 n=10)
Trunc                  31.83n ± 0%   31.83n ± 0%        ~ (p=0.663 n=10)
Y0                     701.0n ± 0%   669.2n ± 0%   -4.54% (p=0.000 n=10)
Y1                     704.5n ± 0%   672.4n ± 0%   -4.55% (p=0.000 n=10)
Yn                     1.490µ ± 0%   1.422µ ± 0%   -4.60% (p=0.000 n=10)
Float64bits            5.713n ± 0%   5.710n ± 0%        ~ (p=0.926 n=10)
Float64frombits        4.896n ± 0%   4.896n ± 0%        ~ (p=0.663 n=10)
Float32bits            12.25n ± 0%   12.25n ± 0%        ~ (p=0.571 n=10)
Float32frombits        4.898n ± 0%   4.896n ± 0%        ~ (p=0.754 n=10)
FMA                    4.895n ± 0%   4.895n ± 0%        ~ (p=0.745 n=10)
geomean                94.40n        89.43n        -5.27%

Change-Id: I4fe0f2e9f609e38d79463f9ba2519a3f9427432e Reviewed-on: https://go-review.googlesource.com/c/go/+/348389 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-08-13 | cmd/compile: use generated loops instead of DUFFCOPY on amd64 | Keith Randall
This reverts commit 4e182db5fc876564a4f87a0602c58ea0ddc6e37c (CL 695196), which is itself a revert of ec9e1176c3209cf92e73e3deb2d8073fab5ea4d6 (CL 678620). So this CL is exactly the same as CL 678620, but with a regalloc fix (CL 696035) submitted first. Change-Id: I743ab32fa3aa6ef3e1b2b6751a2ef4519139057c Reviewed-on: https://go-review.googlesource.com/c/go/+/696016 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-13 | cmd/compile: during regalloc, fixedreg values are always available | Keith Randall
It is ok to clobber registers that have a copy of a fixedreg value, as that value is always available in its original location later if we need it. (See 14 lines below the change.) This CL will fix the regalloc infinite loop that CL 678620 introduced. That CL requests that the stack pointer value be materialized in a non-stack-pointer register, which is atypical. That condition triggered the infinite loop that this CL fixes. The infinite loop is the compiler trying to reuse that non-stack-pointer register for something else, but then refusing to give it up because it thought that non-stack-pointer register held the last copy of the original SP value. Change-Id: Id604d0937fb9d3753ee273bf1917753d3ef2d5d7 Reviewed-on: https://go-review.googlesource.com/c/go/+/696035 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-13 | [dev.simd] go/types: exclude simd/_gen module from TestStdlib | Austin Clements
We're about to add a small simd/_gen submodule that imports external dependencies. Exclude it from TestStdlib since it won't be able to follow those dependencies. Change-Id: I29a1adc98d141b9c511aa29e1992fab2248747d5 Reviewed-on: https://go-review.googlesource.com/c/go/+/695976 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-13 | [dev.simd] simd: add emulations for missing AVX2 comparisons | David Chase
This also removes the AVX512 versions of the operations that would use the same names but not run on AVX2-only hardware. Includes files generated by simdgen CL 692355. Change-Id: Iff29042245b7688133fed49a03e681e85235b8a8 Reviewed-on: https://go-review.googlesource.com/c/go/+/692335 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
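One common way to emulate missing comparisons is to derive them from the two integer comparisons that AVX2 does provide, equality and signed greater-than. The sketch below models a single 8-bit lane in plain Go. The function names are hypothetical, and this shows the general technique rather than what the CL necessarily does.

```go
package main

import "fmt"

// eq and gtSigned stand for the two primitive lane comparisons.
func eq(a, b int8) bool       { return a == b }
func gtSigned(a, b int8) bool { return a > b }

// The remaining signed comparisons follow by swapping operands or negating.
func less(a, b int8) bool         { return gtSigned(b, a) }
func greaterEqual(a, b int8) bool { return !gtSigned(b, a) }
func lessEqual(a, b int8) bool    { return !gtSigned(a, b) }
func notEqual(a, b int8) bool     { return !eq(a, b) }

// gtUnsigned flips the sign bit of both operands, which maps unsigned
// order onto signed order, and then uses the signed comparison.
func gtUnsigned(a, b uint8) bool {
	return gtSigned(int8(a^0x80), int8(b^0x80))
}

func main() {
	// Exhaustively check the unsigned emulation against the native operator.
	ok := true
	for a := 0; a < 256; a++ {
		for b := 0; b < 256; b++ {
			if gtUnsigned(uint8(a), uint8(b)) != (a > b) {
				ok = false
			}
		}
	}
	fmt.Println(ok)
}
```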
2025-08-13 | [dev.simd] simd, cmd/compile: generated code for Broadcast | David Chase
Generated by simdgen CL 693599. This turned out to require some additional work in other places, including filling in missing methods (use OverwriteBase to get FP versions). Also includes a test. Change-Id: I2efe8967837834745f9cae661d4d4dcbb5390b6f Reviewed-on: https://go-review.googlesource.com/c/go/+/693758 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-13 | [dev.simd] cmd/compile: fix LoadReg so it is aware of register target | David Chase
SIMD code generation created interesting new type/register combinations. Change-Id: I9c9a73bf51f6cb54551db1fdc88f9dd1eef7ab26 Reviewed-on: https://go-review.googlesource.com/c/go/+/695895 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-13 | [dev.simd] cmd/compile: fix isIntrinsic for methods; fix fp <-> gp moves | David Chase
Also includes a handy debugging hook for the inliner. Change-Id: I23d0619506219d21db78c6c801612ff058562142 Reviewed-on: https://go-review.googlesource.com/c/go/+/694118 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-13 | [dev.simd] cmd/compile: generated code from 'fix generated rules for shifts' | David Chase
This code is generated by simdgen CL 695455. Change-Id: I5afdc209a50b49d68e120130e0578e4666bf8749 Reviewed-on: https://go-review.googlesource.com/c/go/+/695475 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-13 | cmd/compile: teach prove about len's & cap's max based on the element size | Jorropo
Change-Id: I88056fada1ff488c199fce54cf737dbdd091214d Reviewed-on: https://go-review.googlesource.com/c/go/+/695095 Auto-Submit: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2025-08-13 | cmd/compile: absorb NEGV into branch on loong64 | Xiaolin Zhao
Removes 132 instructions from the go binary on loong64. Change-Id: Ia02dc305b12f63a64f3f48d120ef852d45cc2a7b Reviewed-on: https://go-review.googlesource.com/c/go/+/695115 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com>
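The identity that makes such an absorption safe is that negation never changes whether a two's-complement value is zero, so a branch on `-x != 0` can test `x` directly. A minimal check in plain Go, with hypothetical function names and no claim about the exact rewrite rules in the CL:

```go
package main

import "fmt"

const minInt64 = -1 << 63

// branchOnNeg models a branch whose condition is computed from -x.
func branchOnNeg(x int64) bool { return -x != 0 }

// branchDirect models the same branch after the negation is absorbed.
func branchDirect(x int64) bool { return x != 0 }

func main() {
	// minInt64 is included because -minInt64 wraps back to minInt64,
	// which is still nonzero.
	for _, x := range []int64{0, 1, -1, 42, minInt64} {
		fmt.Println(x, branchOnNeg(x) == branchDirect(x))
	}
}
```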
2025-08-12cmd/compile/internal: optimize multiplication use new operation ↵limeidan
'ADDshiftLLV' on loong64

goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
                    │     old      │                 new                  │
                    │    sec/op    │    sec/op     vs base                │
MulconstI32/3         0.8004n ± 0%   0.4247n ± 2%  -46.94% (p=0.000 n=10)
MulconstI32/5         0.8005n ± 0%   0.4256n ± 1%  -46.83% (p=0.000 n=10)
MulconstI32/12        1.2010n ± 0%   0.8005n ± 0%  -33.35% (p=0.000 n=10)
MulconstI32/120       0.8090n ± 0%   0.8067n ± 0%   -0.28% (p=0.007 n=10)
MulconstI32/-120      0.8109n ± 0%   0.8072n ± 0%   -0.47% (p=0.000 n=10)
MulconstI32/65537     0.8004n ± 0%   0.8004n ± 0%        ~ (p=1.000 n=10)
MulconstI32/65538     0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.265 n=10)
MulconstI64/3         0.8005n ± 0%   0.4241n ± 1%  -47.02% (p=0.000 n=10)
MulconstI64/5         0.8004n ± 0%   0.4249n ± 1%  -46.91% (p=0.000 n=10)
MulconstI64/12        1.2010n ± 0%   0.8004n ± 0%  -33.36% (p=0.000 n=10)
MulconstI64/120       0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.635 n=10)
MulconstI64/-120      0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.837 n=10)
MulconstI64/65537     0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.837 n=10)
MulconstI64/65538     0.8096n ± 0%   0.8004n ± 0%   -1.14% (p=0.000 n=10)
MulconstU32/3         0.8004n ± 0%   0.4263n ± 1%  -46.75% (p=0.000 n=10)
MulconstU32/5         0.8005n ± 0%   0.4262n ± 1%  -46.76% (p=0.000 n=10)
MulconstU32/12        1.2010n ± 0%   0.8005n ± 0%  -33.35% (p=0.000 n=10)
MulconstU32/120       0.8105n ± 0%   0.8096n ± 0%        ~ (p=0.183 n=10)
MulconstU32/65537     0.8004n ± 0%   0.8004n ± 0%        ~ (p=1.000 n=10)
MulconstU32/65538     0.8005n ± 0%   0.8005n ± 0%        ~ (p=1.000 n=10)
MulconstU64/3         0.8004n ± 0%   0.4265n ± 4%  -46.71% (p=0.000 n=10)
MulconstU64/5         0.8004n ± 0%   0.4256n ± 0%  -46.82% (p=0.000 n=10)
MulconstU64/12        1.2010n ± 0%   0.8004n ± 0%  -33.36% (p=0.000 n=10)
MulconstU64/120       0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.387 n=10)
MulconstU64/65537     0.8005n ± 0%   0.8005n ± 0%        ~ (p=0.265 n=10)
MulconstU64/65538     0.8080n ± 0%   0.8004n ± 0%   -0.93% (p=0.000 n=10)
geomean               0.8539n        0.6597n       -22.74%

Change-Id: Ie33e88985d7639f481bbba540bc917b9f185c357 Reviewed-on: https://go-review.googlesource.com/c/go/+/693855 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
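The benchmark names above (Mulconst.../3, /5, /12) suggest that the change targets multiplication by small constants. A minimal sketch of the underlying arithmetic, assuming the rewrite expresses such products as an add with a left-shifted operand; the function names mul3, mul5 and mul12 are hypothetical and are not taken from the compiler source:

```go
package main

import "fmt"

// mul3 computes x*3 as x + (x << 1): a single add with a shifted operand.
// In Go, << binds tighter than +, so no parentheses are needed.
func mul3(x int64) int64 { return x + x<<1 }

// mul5 computes x*5 as x + (x << 2).
func mul5(x int64) int64 { return x + x<<2 }

// mul12 computes x*12 as (x*3) << 2, i.e. one shifted add plus one shift.
func mul12(x int64) int64 { return (x + x<<1) << 2 }

func main() {
	// Each identity also holds under wrap-around, since both sides
	// are evaluated modulo 2^64.
	for _, x := range []int64{0, 1, -7, 1 << 40} {
		fmt.Println(x, mul3(x) == x*3, mul5(x) == x*5, mul12(x) == x*12)
	}
}
```

This only illustrates why a shift-and-add form can replace a multiply; which instruction sequence the compiler actually emits for each constant is not shown here.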
2025-08-12cmd/compile/internal/ssa: use BEQ/BNE to optimize the combination of XOR and ↵limeidan
EQ/NE on loong64

Reduce the number of go toolchain instructions on loong64 as follows:

file         before    after     Δ        %
go           1599056   1590560   -8496    -0.5313%
gofmt        326188    326104    -84      -0.0258%
asm          563482    561250    -2232    -0.3961%
cgo          488644    485252    -3392    -0.6942%
compile      2504614   2486388   -18226   -0.7277%
cover        526322    523270    -3052    -0.5799%
link         714532    711124    -3408    -0.4770%
preprofile   242316    241112    -1204    -0.4969%
vet          794446    786118    -8328    -1.0483%

Change-Id: I0914889119a28ea672b694529ef54513fbb3f3b5 Reviewed-on: https://go-review.googlesource.com/c/go/+/693875 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com>
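The identity behind combining XOR with EQ/NE can be sketched in Go: x^y is zero exactly when x equals y, so a branch on the XOR result being zero or non-zero can compare the two operands directly. The helper names below are hypothetical and only illustrate the equivalence, not the SSA rules themselves:

```go
package main

import "fmt"

// equalViaXor tests equality by checking whether the XOR of the operands is zero.
// ^ binds tighter than ==, so this parses as (x ^ y) == 0.
func equalViaXor(x, y uint64) bool { return x^y == 0 }

// equalDirect is the plain comparison that a two-register branch can express.
func equalDirect(x, y uint64) bool { return x == y }

func main() {
	pairs := [][2]uint64{{0, 0}, {1, 2}, {42, 42}, {1 << 63, 1}}
	for _, p := range pairs {
		// The two predicates agree for every pair of inputs.
		fmt.Println(p[0], p[1], equalViaXor(p[0], p[1]) == equalDirect(p[0], p[1]))
	}
}
```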
2025-08-12Revert "cmd/compile: use generated loops instead of DUFFCOPY on amd64"Keith Randall
This reverts commit ec9e1176c3209cf92e73e3deb2d8073fab5ea4d6 (CL 678620). Reason for revert: causing regalloc to get into an infinite loop Change-Id: Ie53c58c6126804af6d6883ea4acdcfb632a172bd Reviewed-on: https://go-review.googlesource.com/c/go/+/695196 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2025-08-12cmd/compile: use generated loops instead of DUFFZERO on arm64Keith Randall
Change-Id: Ie0c8263f36d1bcfd0edfc4ea6710ae6c113c4d48 Reviewed-on: https://go-review.googlesource.com/c/go/+/678995 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-12cmd/compile: use generated loops instead of DUFFCOPY on amd64Keith Randall
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i7-12700
                          │     base      │                 exp                  │
                          │    sec/op     │    sec/op     vs base                │
MemmoveKnownSize112-20       1.764n ± 0%    1.247n ± 0%   -29.31% (p=0.000 n=10)
MemmoveKnownSize128-20       1.891n ± 0%    1.405n ± 1%   -25.72% (p=0.000 n=10)
MemmoveKnownSize192-20       2.521n ± 0%    2.114n ± 3%   -16.16% (p=0.000 n=10)
MemmoveKnownSize248-20       4.028n ± 0%    3.877n ± 1%    -3.75% (p=0.000 n=10)
MemmoveKnownSize256-20       3.272n ± 0%    2.961n ± 2%    -9.53% (p=0.000 n=10)
MemmoveKnownSize512-20       6.733n ± 3%    5.936n ± 4%   -11.83% (p=0.000 n=10)
MemmoveKnownSize1024-20     13.905n ± 5%    9.798n ± 9%   -29.54% (p=0.000 n=10)

Change-Id: Icc01cec0d8b072300d749a5ce76f53b3725b5c65 Reviewed-on: https://go-review.googlesource.com/c/go/+/678620 Reviewed-by: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
2025-08-12Revert "cmd/compile/internal/ssa: Use transitive properties for len/cap"Keith Randall
This reverts commit a3295df873bb22b3ba427124b1220370a5ca5cdb (CL 679155) Reason for revert: leads to a very expensive prove pass, see #74974 (Maybe not this CL's fault, just tickling some superlinear behavior.) Change-Id: I75302c04cfc5e1e075aeb80edb73080bfb1efcac Reviewed-on: https://go-review.googlesource.com/c/go/+/695175 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-08-12all: delete aliastypeparams GOEXPERIMENTCherry Mui
Always enable aliastypeparams and remove the GOEXPERIMENT. Change-Id: Ic38fe25b0bba312a7f83f7bb94b57ab75ce0f0c3 Reviewed-on: https://go-review.googlesource.com/c/go/+/691956 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-08-11Revert "cmd/compile: allow multi-field structs to be stored directly in ↵Keith Randall
interfaces" This reverts commit cd55f86b8dcfc139ee5c17d32530ac9e758c8bc0 (CL 681937) Reason for revert: still causing compiler failures on Google test code Change-Id: I5cd482fd607fd060a523257082d48821b5f965d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/695016 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-11Revert "cmd/compile: allow StructSelect [x] of interface data fields for x>0"Keith Randall
This reverts commit bcd25c79aa5675d3e5d28c09715b8147906da006 (CL 693415) Reason for revert: still causing compiler failures on Google test code Change-Id: I887edcff56bde3ffa316f2b629021ad323a357fa Reviewed-on: https://go-review.googlesource.com/c/go/+/694996 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-11Revert "cmd/compile: allow more args in StructMake folding rule"Keith Randall
This reverts commit 72e8237cc11569de2faf9885a1b83d06446533b5 (CL 693615) Reason for revert: still causing compiler failures on Google test code Change-Id: I4a7850c321d95ed7803d56866bb0c524c7a377d3 Reviewed-on: https://go-review.googlesource.com/c/go/+/695015 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-08-11[dev.simd] cmd/compile, simd: update generated filesAustin Clements
This CL is generated by x/arch CL 694860 Change-Id: Ifa7c0e9749b1d9a20f31b70aafe563d7844ce6b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/694917 Auto-Submit: Austin Clements <austin@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-11[dev.simd] cmd/compile, simd: update generated filesAustin Clements
This CL is generated by x/arch CL 694857. Change-Id: I9745fa8c9b2e3f49bd2cff5ff6b5578c0c67bfa1 Reviewed-on: https://go-review.googlesource.com/c/go/+/694915 Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Austin Clements <austin@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-11[dev.simd] cmd/compile: keep track of multiple rule file names in ssa/_genDavid Chase
This was a long-standing "we need to fix this" for simd work, this fixes it. I expect that simd peephole rule files will be coming soon and there will be more errors and we will be happier to have this. Change-Id: Iefffc43e3e2110939f8d406f6e5da7e9e2d55bd9 Reviewed-on: https://go-review.googlesource.com/c/go/+/694455 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-11[dev.simd] cmd/compile, simd: jump table for imm opsJunyang Shao
This CL fixes some errors in prog generation for imm operations; please see the changes in ssa.go for details. This CL also implements the jump table for non-const immediate args. The current implementation exhausts 0-255; the bound-checked version will be in the next CL. This CL is partially generated by CL 694375. Change-Id: I75fe9900430b4fca5b39b0c0958a13b20b1104b7 Reviewed-on: https://go-review.googlesource.com/c/go/+/694395 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
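The idea of a jump table over every possible immediate can be sketched in ordinary Go. This is an illustration only: opWithImm is a hypothetical stand-in for an instruction that needs its second operand encoded as an immediate, and the table of closures merely models one specialized variant per value in 0-255; it is not how the compiler's generated code looks.

```go
package main

import (
	"fmt"
	"math/bits"
)

// opWithImm is a hypothetical stand-in for an operation whose second
// operand must be a constant at the point where the instruction is emitted.
func opWithImm(x uint8, imm uint8) uint8 {
	return bits.RotateLeft8(x, int(imm))
}

// table holds one specialized variant per possible immediate (0-255).
var table [256]func(uint8) uint8

func init() {
	for i := range table {
		imm := uint8(i) // fixed per entry, like a constant baked into each variant
		table[i] = func(x uint8) uint8 { return opWithImm(x, imm) }
	}
}

// dispatch selects the variant for a run-time immediate. Indexing a
// 256-entry array with a uint8 can never go out of range.
func dispatch(x, imm uint8) uint8 { return table[imm](x) }

func main() {
	fmt.Println(dispatch(0x81, 1) == opWithImm(0x81, 1))
}
```

Covering all 256 entries trades code size for the absence of a range check on the index.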
2025-08-11cmd/compile: convert branch with zero to more optimal branch zero on loong64Xiaolin Zhao
This reduces 7500+ instructions from the go toolchain binary on loong64.

file         before    after     Δ       %
asm          555066    554406    -660    -0.1189%
cgo          481480    480764    -716    -0.1487%
compile      2475836   2474776   -1060   -0.0428%
cover        516536    515788    -748    -0.1448%
link         702220    701216    -1004   -0.1430%
preprofile   238626    238122    -504    -0.2112%
vet          792798    791894    -904    -0.1140%
go           1573108   1571676   -1432   -0.0910%
gofmt        320578    320042    -536    -0.1672%
total        7656248   7648684   -7564   -0.0988%

Change-Id: I51b70a1543bc258b7664caa8647e75eecbaf5eed Reviewed-on: https://go-review.googlesource.com/c/go/+/693495 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-11cmd/compile: allow InlMark operations to be speculatively executedMichael Munday
Although InlMark takes a memory argument it ultimately becomes a NOP and therefore is safe to speculatively execute. Fixes #74915 Change-Id: I64317dd433e300ac28de2bcf201845083ec2ac82 Reviewed-on: https://go-review.googlesource.com/c/go/+/693795 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-08-10cmd/compile/internal/ssa: optimise more branches with SGTconst/SGTUconst on ↵Xiaolin Zhao
loong64

Add branches to convert EQZ/NEZ into more optimal branch conditions. This reduces 720 instructions from the go toolchain binary on loong64.

file         before    after     Δ      %
asm          555306    555082    -224   -0.0403%
cgo          481814    481742    -72    -0.0149%
compile      2475686   2475710   +24    +0.0010%
cover        516854    516770    -84    -0.0163%
link         702566    702530    -36    -0.0051%
preprofile   238612    238548    -64    -0.0268%
vet          793140    793060    -80    -0.0101%
go           1573466   1573346   -120   -0.0076%
gofmt        320560    320496    -64    -0.0200%
total        7658004   7657284   -720   -0.0094%

Additionally, rename EQ/NE to EQZ/NEZ to enhance readability.

Change-Id: Ibc876bc8b8d4e81d5c3aaf0b74b60419f3c771b1 Reviewed-on: https://go-review.googlesource.com/c/go/+/693455 Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-08-08cmd/compile/internal/staticinit: remove deadcodeCuong Manh Le
The staticAssignInlinedCall function contains code for handling non-Unified IR. As Unified IR is now the sole format for the frontend, this code is obsolete and can be removed. Change-Id: Iac93a9b59ec6d639851e1b17ba1f75563d8bcda5 Reviewed-on: https://go-review.googlesource.com/c/go/+/694075 Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-08-07cmd/compile/internal/ssa: fix typo in LOONG64Ops.go commentXiaolin Zhao
Change-Id: I680bae7fc1a26c1f249ab833fa8d41e9387b2d50 Reviewed-on: https://go-review.googlesource.com/c/go/+/693456 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2025-08-07[dev.simd] cmd/compile, simd: API interface fixesJunyang Shao
- Absolute -> Abs
- ApproximateReciprocal -> Reciprocal
  - Other derived apis also changed.
- Round -> RoundToEven
  - Other derived apis also changed.
- Drop DotProdBroadcast
- Fused(Mul|Add)(Mul|Add)? -> remove the "Fused"
- MulEvenWiden -> remove 64bit
- MulLow -> Mul, add unit
- PairDotProd -> DotProdPairs
  - make AddDotProdPairs machine ops only - peepholes will be in another CL at dev.simd.
- PopCount -> OnesCount
- Saturated* -> *Saturated
  - Fix (Add|Sub)Saturated uint mappings.
- UnsignedSignedQuadDotProdAccumulate -> AddDotProdQuadruple
  - The "DotProdQuadruple" instruction does not exist, so no peepholes for this.

This CL is generated by CL 694095.

Change-Id: If4110cc04ab96240cf56f2348d35ed2a719687de Reviewed-on: https://go-review.googlesource.com/c/go/+/694115 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-07[dev.simd] cmd/compile, simd: add value conversion ToBits for maskJunyang Shao
This CL is generated by CL 693598. Change-Id: I949d3b3b4e5670cb30f0fb9dc779f7359409b54c Reviewed-on: https://go-review.googlesource.com/c/go/+/693755 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06[dev.simd] cmd/compile, simd: add ExpandJunyang Shao
This CL is generated by CL 693336. Change-Id: Ic1712d49fcad0544fa3c19b0249d8bc65b347104 Reviewed-on: https://go-review.googlesource.com/c/go/+/693375 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-06[dev.simd] cmd/compile, simd: (Set|Get)(Lo|Hi)Junyang Shao
This CL is generated by CL 693335. Change-Id: Ie9adda526573f979ec7e4f535033ba29236cc5cb Reviewed-on: https://go-review.googlesource.com/c/go/+/693355 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>