| Age | Commit message | Author |
|
This might have been something that prove could be educated
into figuring out, but this also works, and it also helps
prove downstream.
Adjusted the prove test, because this change moved a message.
Change-Id: I5eabe639eff5db9cd9766a6a8666fdb4973829cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/697715
Commit-Queue: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
|
|
This should cover much of the low-hanging and important fruit.
Others can follow later.
It needs more testing.
Change-Id: Ic186b075987e85c87197ef9e1ca0b4f33ff96697
Reviewed-on: https://go-review.googlesource.com/c/go/+/697515
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Commit-Queue: David Chase <drchase@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
|
|
This is not the end of such peephole optimizations; there
will need to be many of these for many SIMD operations.
Change-Id: I4511f6fac502bc7259c1c4414c96f56eb400c202
Reviewed-on: https://go-review.googlesource.com/c/go/+/697157
TryBot-Bypass: David Chase <drchase@google.com>
Commit-Queue: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Merge List:
+ 2025-08-20 9de69f6913 errors: mention Is/As in Join docs
+ 2025-08-20 4afd482812 cmd/go/internal/doc: pass URL fragments separately with -http
+ 2025-08-20 509d5f647f internal/poll: don't call Seek for overlapped Windows handles
+ 2025-08-20 853fc12739 internal/poll: set the correct file offset in FD.Seek for Windows overlapped handles
+ 2025-08-19 bd885401d5 runtime: save and restore all fcc registers in async preempt on loong64
+ 2025-08-19 119546ea4f cmd/go: document install outputs to $GOOS_$GOARCH when cross compiling
+ 2025-08-19 ffa882059c unique: deflake TestCanonMap/LoadOrStore/ConcurrentUnsharedKeys
+ 2025-08-19 1f2e8e03e4 os: fix path in MkdirTemp error message
+ 2025-08-19 5024d0d884 cmd/compile: tweak example command in README
+ 2025-08-19 b80ffb64d8 internal/trace: remove redundant info from Event.String
+ 2025-08-19 c7d8bda459 cmd/compile/internal: make function comments match function names
+ 2025-08-19 de2d741667 internal/trace: use RFC3339Nano for wall clock snapshots in Event.String
+ 2025-08-19 c61db5ebd5 syscall: forkAndExecInChild1: don't reuse pid variable
+ 2025-08-19 07ee3bfc63 cmd/go: use modern pprof flags in documentation
+ 2025-08-18 5a56d8848b cmd/compile: ensure we use allowed registers for input-clobbering instructions
+ 2025-08-18 c3927a47f0 runtime: fix comments in tracetype.go
+ 2025-08-15 77f911e31c internal/trace: emit final sync event for generation in Go 1.26+
+ 2025-08-15 786be1d2bf runtime: don't overwrite global stop channel in tests
+ 2025-08-15 4a7fde922f internal/trace: add end-of-generation signal to trace
+ 2025-08-15 cb814bd5bc net: skip TestIPv4WriteMsgUDPAddrPort on plan9
+ 2025-08-15 78a3968c2c runtime/metrics: add metric for current Go-owned thread count
+ 2025-08-15 ab8121a407 runtime/metrics: add metric for total goroutines created
+ 2025-08-15 13df972f68 runtime/metrics: add metrics for goroutine sched states
+ 2025-08-15 bd07fafb0a runtime: disable stack shrinking for all waiting-for-suspendG cases
+ 2025-08-15 a651e2ea47 runtime: remove duff support for amd64
+ 2025-08-15 e4291e484c runtime: remove duff support for arm64
+ 2025-08-15 15d6dbc05c cmd/compile: use generated loops instead of DUFFCOPY on arm64
+ 2025-08-15 bca3e98b8a cmd/go: test barrier actions
+ 2025-08-15 052fcde9fd internal/runtime: cleaner overflow checker
+ 2025-08-15 3871c0d84d syscall: permit nil destination address in sendmsgN{Inet4,Inet6}
+ 2025-08-14 a8564bd412 runtime: make all synctest bubble violations fatal panics
Change-Id: Ibc94566bc69bcb59b1d79b6fa868610ca2d1d223
|
|
Change-Id: If0bde7154bd622573375eba5539fd642b8ef9d2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/696555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
For now, the `Op(...).Masked` idiom will be missing many pieces, which
leaves the API incomplete. To keep the API smaller, though, we are removing
these ops' frontend types and interfaces for now. Peephole rules, plus a new
pass that checks the domination relations of CPU feature checks, will make
sure these ops get picked for the right `Op(...).Masked` idiom.
Change-Id: I77f72a198b3d8b1880dcb911470db5e0089ac1ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/697155
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Junyang Shao <shaojunyang@google.com>
|
|
Change-Id: I2ebadb87a4f60487c8f720930a72bc5ef953d7cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/696695
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL defines the mask semantics more precisely:
when converting from a vector to a mask, a mask element is set to true iff
the corresponding vector element is nonzero.
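
The conversion rule above can be modeled in plain Go. This is only a sketch of the stated semantics; the function name `maskFromVector` and the 4-lane `int32` shape are illustrative assumptions, not the simd package's API.

```go
package main

import "fmt"

// maskFromVector models the vector-to-mask rule: mask element i is
// true iff vector element i is nonzero. Hypothetical helper, not the
// real simd API.
func maskFromVector(v [4]int32) [4]bool {
	var m [4]bool
	for i, x := range v {
		m[i] = x != 0
	}
	return m
}

func main() {
	// Any nonzero lane, positive or negative, yields true.
	fmt.Println(maskFromVector([4]int32{0, 7, -1, 0}))
}
```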
Change-Id: I331c1c7992dc9e81c211bdc6d73e5eb3b8414506
Reviewed-on: https://go-review.googlesource.com/c/go/+/697056
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ida9f29423d62a25be41dcf637ffb9275b7cae642
Reviewed-on: https://go-review.googlesource.com/c/go/+/697055
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This also required an "always use operation with least
OverrideBase" filter in choosing the machine instructions.
The order of generated HW operations is slightly
modified because the Float version of GetElem
appears earlier in the sorted operations list,
though it is not chosen to generate the HW Op.
Change-Id: I95fa67afca9c8b6f4f18941fdcaf69afdad8055b
Reviewed-on: https://go-review.googlesource.com/c/go/+/696375
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For instructions which clobber their input register, we make a second
copy of the input value so it is still available in a register for
future instructions.
That second copy might not respect the register input restrictions
for the instruction, so the instruction can't actually use the second
copy we make here. It should use the first copy; the second copy is
the one that will persist beyond the clobber.
Fixes #75063
Change-Id: I99acdc63f0c4e54567a174ff7ada601ae4e796b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/697015
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ic2aa8959b7fc594b86def70b6c2be38badf7970c
Reviewed-on: https://go-review.googlesource.com/c/go/+/679015
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
This CL improves on the previous CL by implementing
move/load/storeByRegWidth. It should not have touched the compilation
path of complex128, but as a side effect, moves, loads, and stores of
16-byte SIMD vectors in X0 to X15 are now compiled to MOVUPS instead of
VMOVDQU.
These functions could be used in MOV*const, but this CL does not do that
because we haven't seen problems there yet. If problems do show up in the
future, calling these functions to find the right asm might be handy.
Change-Id: I9b76e65eef8155479d3e288402aa96bc29a4f7cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/696255
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
width only
This CL tries to clean up the move/load/store lowering a bit. After CL
695315 the register information for instructions is expected to be
correct for SIMD, but we still need to pick the right instruction during
SSA-to-asm lowering.
The code before this CL should work correctly, but MOVSSconst and
MOVSDconst contain duplicated code; this CL removes it.
This CL also rewrites move/load/storeByTypeAndReg to use only the width
and reg for all non-SIMD types, which is more consistent.
Change-Id: I76c14f3d0140bcbd4fbea0df275fee0202a3b7d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/696175
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL implements the check of a rematerializable value's output
regspec at its rematerialization site. It has some potential problems;
please see the TODO in regalloc.go.
Fixes #70451.
Change-Id: Ib624b967031776851136554719e939e9bf116b7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/695315
Reviewed-by: David Chase <drchase@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
|
|
Conflicts:
- src/cmd/compile/internal/amd64/ssa.go
- src/cmd/compile/internal/ssa/expand_calls.go
- src/cmd/compile/internal/ssagen/ssa.go
- src/internal/buildcfg/exp.go
- src/internal/cpu/cpu.go
- src/internal/cpu/cpu_x86.go
- src/runtime/mkpreempt.go
- src/runtime/preempt_amd64.go
- src/runtime/preempt_amd64.s
Merge List:
+ 2025-08-14 924fe98902 cmd/internal/obj/riscv: add encoding for compressed riscv64 instructions
+ 2025-08-13 320df537cc cmd/compile: emit classify instructions for infinity tests on riscv64
+ 2025-08-13 ca66f907dd cmd/compile: use generated loops instead of DUFFCOPY on amd64
+ 2025-08-13 4b1800e476 encoding/json/v2: cleanup error constructors
+ 2025-08-13 af8870708b encoding/json/v2: fix incorrect marshaling of NaN in float64 any
+ 2025-08-13 0a75e5a07b encoding/json/v2: fix wrong type with cyclic marshal error in map[string]any
+ 2025-08-13 de9b6f9875 cmd/pprof: update vendored github.com/google/pprof
+ 2025-08-13 674c5f0edd os/exec: fix incorrect expansion of ".." in LookPath on plan9
+ 2025-08-13 9bbea0f21a cmd/compile: during regalloc, fixedreg values are always available
+ 2025-08-13 08eef97500 runtime/trace: fix documentation typo
+ 2025-08-13 2fe5d51d04 internal/trace: fix wrong scope for Event.Range or EvGCSweepActive
+ 2025-08-13 9fcb87c352 cmd/compile: teach prove about len's & cap's max based on the element size
+ 2025-08-13 9763ece873 cmd/compile: absorb NEGV into branch on loong64
+ 2025-08-13 f10a82b76f all: update vendored dependencies [generated]
+ 2025-08-13 3bea95b277 cmd/link/internal/ld: remove OpenBSD buildid workaround
+ 2025-08-12 90b7d7aaa2 cmd/compile/internal: optimize multiplication use new operation 'ADDshiftLLV' on loong64
+ 2025-08-12 1b263fc604 runtime/race: restore previous version of LLVM TSAN on macOS
+ 2025-08-12 b266318cf7 cmd/compile/internal/ssa: use BEQ/BNE to optimize the combination of XOR and EQ/NE on loong64
+ 2025-08-12 adbf59525c internal/runtime/gc/scan: avoid -1 index when cache sizes unavailable
+ 2025-08-12 4e182db5fc Revert "cmd/compile: use generated loops instead of DUFFCOPY on amd64"
+ 2025-08-12 d2b3c1a504 internal/trace: clarify which StateTransition events have stacks
+ 2025-08-12 f63e12d0e0 internal/trace: fix Sync.ClockSnapshot comment
+ 2025-08-12 8e317da77d internal/trace: remove unused StateTransition.id field
+ 2025-08-12 f67d8ff34a internal/trace/tracev2: adjust comment for consistency
+ 2025-08-12 fe4d445c36 internal/trace/tracev2: fix EvSTWBegin comment to include stack ID
+ 2025-08-12 750789fab7 internal/trace/internal/testgen: fix missing stacks nframes arg
+ 2025-08-12 889ab74169 internal/runtime/gc/scan: import scan kernel from gclab [green tea]
+ 2025-08-12 182336bf05 net/http: fix data race in client
+ 2025-08-12 f04421ea9a cmd/compile: soften test for 74788
+ 2025-08-12 28aa529c99 cmd/compile: use generated loops instead of DUFFZERO on arm64
+ 2025-08-12 ec9e1176c3 cmd/compile: use generated loops instead of DUFFCOPY on amd64
+ 2025-08-12 d0a64f7969 Revert "cmd/compile/internal/ssa: Use transitive properties for len/cap"
+ 2025-08-12 00a7bdcb55 all: delete aliastypeparams GOEXPERIMENT
+ 2025-08-11 74421a305b Revert "cmd/compile: allow multi-field structs to be stored directly in interfaces"
+ 2025-08-11 c31359138c Revert "cmd/compile: allow StructSelect [x] of interface data fields for x>0"
+ 2025-08-11 7248995b60 Revert "cmd/compile: allow more args in StructMake folding rule"
+ 2025-08-11 caf9fc3ccd Revert "reflect: handle zero-sized fields of directly-stored structures correctly"
+ 2025-08-11 ce3f3e2ae7 cmd/link/internal/ld, internal/syscall/unix: use posix_fallocate on netbsd
+ 2025-08-11 3dbef65bf3 database/sql: allow drivers to override Scan behavior
+ 2025-08-11 2b804abf07 net: context aware Dialer.Dial functions
+ 2025-08-11 6abfe7b0de cmd/dist: require Go 1.24.6 as minimum bootstrap toolchain
+ 2025-08-11 691af6ca28 encoding/json: fix Indent trailing whitespace regression in goexperiment.jsonv2
+ 2025-08-11 925149da20 net/http: add example for CrossOriginProtection
+ 2025-08-11 cf4af0b2f3 encoding/json/v2: fix UnmarshalDecode regression with EOF
+ 2025-08-11 b096ddb9ea internal/runtime/maps: loop invariant code motion with h2(hash) by hand
+ 2025-08-11 a2431776eb net, os, file/filepath, syscall: use slices.Equal in tests
+ 2025-08-11 a7f05b38f7 cmd/compile: convert branch with zero to more optimal branch zero on loong64
+ 2025-08-11 1718828c81 internal/sync: warn about incorrect unsafe usage in HashTrieMap
+ 2025-08-11 084c0f8494 cmd/compile: allow InlMark operations to be speculatively executed
+ 2025-08-10 a62f72f7a7 cmd/compile/internal/ssa: optimise more branches with SGTconst/SGTUconst on loong64
+ 2025-08-08 fbac94a799 internal/sync: rename Store parameter from old to new
+ 2025-08-08 317be4cfeb cmd/compile/internal/staticinit: remove deadcode
+ 2025-08-08 bce5601cbb cmd/go: fix fips doc link
+ 2025-08-08 777d76c4f2 text/template: use sync.OnceValue for builtinFuncs
+ 2025-08-08 0201524c52 math: remove redundant infinity tests
+ 2025-08-08 dcc77f9e3c cmd/go: fix get -tool when multiple packages are provided
+ 2025-08-08 c7b85e9ddc all: update blog link
+ 2025-08-08 a8dd771e13 crypto/tls: check if quic conn can send session ticket
+ 2025-08-08 bdb2d50fdf net: fix WriteMsgUDPAddrPort addr handling on IPv4 sockets
+ 2025-08-08 768c51e368 internal/runtime/maps: remove unused var bitsetDeleted
+ 2025-08-08 b3388569a1 reflect: handle zero-sized fields of directly-stored structures correctly
+ 2025-08-08 d83b16fcb8 internal/bytealg: vector implementation of compare for riscv64
+ 2025-08-07 dd3abf6bc5 internal/bytealg: optimize Index/IndexString on loong64
+ 2025-08-07 73ff6d1480 cmd/internal/obj/loong64: change the immediate range of ALSL{W/WU/V}
+ 2025-08-07 f3606b0825 cmd/compile/internal/ssa: fix typo in LOONG64Ops.go comment
+ 2025-08-07 ee7bb8969a cmd/internal/obj/loong64: add support for FSEL instruction
+ 2025-08-07 1f7ffca171 time: skip TestLongAdjustTimers on plan9 (too slow)
+ 2025-08-06 8282b72d62 runtime/race: update darwin race syso
+ 2025-08-06 dc54d7b607 all: remove support for windows/arm
+ 2025-08-06 e0a1ea431c cmd/compile: make panicBounds stack frame smaller on ppc64
+ 2025-08-06 2747f925dd debug/macho: support reading imported symbols without LC_DYSYMTAB
+ 2025-08-06 025d36917c cmd/internal/testdir: pass -buildid to link command
+ 2025-08-06 f53dcb6280 cmd/internal/testdir: unify link command
+ 2025-08-06 a3895fe9f1 database/sql: avoid closing Rows while scan is in progress
+ 2025-08-06 608e9fac90 go/types, types2: flip on position tracing
+ 2025-08-06 72e8237cc1 cmd/compile: allow more args in StructMake folding rule
+ 2025-08-06 3406a617d9 internal/bytealg: vector implementation of indexbyte for riscv64
+ 2025-08-06 75ea2d05c0 internal/bytealg: vector implementation of equal for riscv64
+ 2025-08-05 17a8be7117 crypto/sha512: use const table for key loading on loong64
+ 2025-08-05 dda9d780e2 crypto/sha256: use const table for key loading on loong64
+ 2025-08-05 5defe8ebb3 internal/chacha8rand: replace WORD with instruction VMOVQ
+ 2025-08-05 4c7362e41c cmd/internal/obj/loong64: add new instructions ALSL{W/WU/V} for loong64
+ 2025-08-05 a552737418 cmd/compile: fold negation into multiplication on loong64
+ 2025-08-05 e1fd4faf91 runtime: fix godoc comment for inVDSOPage
+ 2025-08-05 bcd25c79aa cmd/compile: allow StructSelect [x] of interface data fields for x>0
+ 2025-08-05 b0945a54b5 cmd/dist, internal/platform: mark freebsd/riscv64 broken
+ 2025-08-05 55d961b202 runtime: save AVX2 and AVX-512 state on asynchronous preemption
+ 2025-08-05 af0c4fe2ca runtime: save scalar registers off stack in amd64 async preemption
+ 2025-08-05 e73afaae69 internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"
+ 2025-08-05 cef381ba60 runtime: eliminate global state in mkpreempt.go
+ 2025-08-05 c0025d5e0b go/parser: correct comment in expectedErrors
+ 2025-08-05 4ee0df8c46 cmd: remove dead code
+ 2025-08-05 a2c45f0eb1 runtime: test VDSO symbol hash values
+ 2025-08-05 cd55f86b8d cmd/compile: allow multi-field structs to be stored directly in interfaces
+ 2025-08-05 21ab0128b6 cmd/compile: remove support for old-style bounds check calls
+ 2025-08-05 802d056c78 cmd/compile: move ppc64 over to new bounds check strategy
+ 2025-08-05 a3295df873 cmd/compile/internal/ssa: Use transitive properties for len/cap
+ 2025-08-05 bd082857a5 doc: fix typo in go memory model doc
+ 2025-08-05 2b622b05a9 cmd/compile: remove isUintXPowerOfTwo functions
+ 2025-08-05 72147ffa75 cmd/compile: simplify isUintXPowerOfTwo implementation
+ 2025-08-05 26da1199eb cmd/compile: make isUint{32,64}PowerOfTwo implementations clearer
+ 2025-08-05 5ab9f23977 cmd/compile, runtime: add checkptr instrumentation for unsafe.Add
+ 2025-08-05 fcc036f03b cmd/compile: optimise float <-> int register moves on riscv64
Change-Id: Ie94f29d9b0cc14a52a536866f5abaef27b5c52d7
|
|
The 'classify' instruction on RISC-V sets a bit in a mask to indicate
the class a floating point value belongs to (e.g. whether the value is
an infinity, a normal number, a subnormal number and so on). There are
other places this instruction is useful but for now I've just used it
for infinity tests.
The gains are relatively small (~1-2 instructions per IsInf call) but
using FCLASSD does potentially unlock further optimizations. It also
reduces the number of loads from memory and the number of moves
between general purpose and floating point register files.
goos: linux
goarch: riscv64
pkg: math
cpu: Spacemit(R) X60
│ sec/op │ sec/op vs base │
Acos 159.9n ± 0% 173.7n ± 0% +8.66% (p=0.000 n=10)
Acosh 249.8n ± 0% 254.4n ± 0% +1.86% (p=0.000 n=10)
Asin 159.9n ± 0% 173.7n ± 0% +8.66% (p=0.000 n=10)
Asinh 292.2n ± 0% 283.0n ± 0% -3.15% (p=0.000 n=10)
Atan 119.1n ± 0% 119.0n ± 0% -0.08% (p=0.036 n=10)
Atanh 265.1n ± 0% 271.6n ± 0% +2.43% (p=0.000 n=10)
Atan2 194.9n ± 0% 186.7n ± 0% -4.23% (p=0.000 n=10)
Cbrt 216.3n ± 0% 203.1n ± 0% -6.10% (p=0.000 n=10)
Ceil 31.82n ± 0% 31.81n ± 0% ~ (p=0.063 n=10)
Copysign 4.897n ± 0% 4.893n ± 3% -0.08% (p=0.038 n=10)
Cos 123.9n ± 0% 107.7n ± 1% -13.03% (p=0.000 n=10)
Cosh 293.0n ± 0% 264.6n ± 0% -9.68% (p=0.000 n=10)
Erf 150.0n ± 0% 133.8n ± 0% -10.80% (p=0.000 n=10)
Erfc 151.8n ± 0% 137.9n ± 0% -9.16% (p=0.000 n=10)
Erfinv 173.8n ± 0% 173.8n ± 0% ~ (p=0.820 n=10)
Erfcinv 173.8n ± 0% 173.8n ± 0% ~ (p=1.000 n=10)
Exp 247.7n ± 0% 220.4n ± 0% -11.04% (p=0.000 n=10)
ExpGo 261.4n ± 0% 232.5n ± 0% -11.04% (p=0.000 n=10)
Expm1 176.2n ± 0% 164.9n ± 0% -6.41% (p=0.000 n=10)
Exp2 220.4n ± 0% 190.2n ± 0% -13.70% (p=0.000 n=10)
Exp2Go 232.5n ± 0% 204.0n ± 0% -12.22% (p=0.000 n=10)
Abs 4.897n ± 0% 4.897n ± 0% ~ (p=0.726 n=10)
Dim 16.32n ± 0% 16.31n ± 0% ~ (p=0.770 n=10)
Floor 31.84n ± 0% 31.83n ± 0% ~ (p=0.677 n=10)
Max 26.11n ± 0% 26.13n ± 0% ~ (p=0.290 n=10)
Min 26.10n ± 0% 26.11n ± 0% ~ (p=0.424 n=10)
Mod 416.2n ± 0% 337.8n ± 0% -18.83% (p=0.000 n=10)
Frexp 63.65n ± 0% 50.60n ± 0% -20.50% (p=0.000 n=10)
Gamma 218.8n ± 0% 206.4n ± 0% -5.62% (p=0.000 n=10)
Hypot 92.20n ± 0% 94.69n ± 0% +2.70% (p=0.000 n=10)
HypotGo 107.7n ± 0% 109.3n ± 0% +1.49% (p=0.000 n=10)
Ilogb 59.54n ± 0% 44.04n ± 0% -26.04% (p=0.000 n=10)
J0 708.9n ± 0% 674.5n ± 0% -4.86% (p=0.000 n=10)
J1 707.6n ± 0% 676.1n ± 0% -4.44% (p=0.000 n=10)
Jn 1.513µ ± 0% 1.427µ ± 0% -5.68% (p=0.000 n=10)
Ldexp 70.20n ± 0% 57.09n ± 0% -18.68% (p=0.000 n=10)
Lgamma 201.5n ± 0% 185.3n ± 1% -8.01% (p=0.000 n=10)
Log 201.5n ± 0% 182.7n ± 0% -9.35% (p=0.000 n=10)
Logb 59.54n ± 0% 46.53n ± 0% -21.86% (p=0.000 n=10)
Log1p 178.8n ± 0% 173.9n ± 6% -2.74% (p=0.021 n=10)
Log10 201.4n ± 0% 184.3n ± 0% -8.49% (p=0.000 n=10)
Log2 79.17n ± 0% 66.07n ± 0% -16.54% (p=0.000 n=10)
Modf 34.27n ± 0% 34.25n ± 0% ~ (p=0.559 n=10)
Nextafter32 49.34n ± 0% 49.37n ± 0% +0.05% (p=0.040 n=10)
Nextafter64 43.66n ± 0% 43.66n ± 0% ~ (p=0.869 n=10)
PowInt 309.1n ± 0% 267.4n ± 0% -13.49% (p=0.000 n=10)
PowFrac 769.6n ± 0% 677.3n ± 0% -11.98% (p=0.000 n=10)
Pow10Pos 13.88n ± 0% 13.88n ± 0% ~ (p=0.811 n=10)
Pow10Neg 19.58n ± 0% 19.57n ± 0% ~ (p=0.993 n=10)
Round 23.65n ± 0% 23.66n ± 0% ~ (p=0.354 n=10)
RoundToEven 27.75n ± 0% 27.75n ± 0% ~ (p=0.971 n=10)
Remainder 380.0n ± 0% 309.9n ± 0% -18.45% (p=0.000 n=10)
Signbit 13.06n ± 0% 13.06n ± 0% ~ (p=1.000 n=10)
Sin 133.8n ± 0% 120.8n ± 0% -9.75% (p=0.000 n=10)
Sincos 160.7n ± 0% 147.7n ± 0% -8.12% (p=0.000 n=10)
Sinh 305.9n ± 0% 277.9n ± 0% -9.17% (p=0.000 n=10)
SqrtIndirect 3.265n ± 0% 3.264n ± 0% ~ (p=0.546 n=10)
SqrtLatency 19.58n ± 0% 19.58n ± 0% ~ (p=0.973 n=10)
SqrtIndirectLatency 19.59n ± 0% 19.58n ± 0% ~ (p=0.370 n=10)
SqrtGoLatency 205.7n ± 0% 202.7n ± 0% -1.46% (p=0.000 n=10)
SqrtPrime 4.953µ ± 0% 4.954µ ± 0% ~ (p=0.477 n=10)
Tan 163.2n ± 0% 150.2n ± 0% -7.99% (p=0.000 n=10)
Tanh 312.4n ± 0% 284.2n ± 0% -9.01% (p=0.000 n=10)
Trunc 31.83n ± 0% 31.83n ± 0% ~ (p=0.663 n=10)
Y0 701.0n ± 0% 669.2n ± 0% -4.54% (p=0.000 n=10)
Y1 704.5n ± 0% 672.4n ± 0% -4.55% (p=0.000 n=10)
Yn 1.490µ ± 0% 1.422µ ± 0% -4.60% (p=0.000 n=10)
Float64bits 5.713n ± 0% 5.710n ± 0% ~ (p=0.926 n=10)
Float64frombits 4.896n ± 0% 4.896n ± 0% ~ (p=0.663 n=10)
Float32bits 12.25n ± 0% 12.25n ± 0% ~ (p=0.571 n=10)
Float32frombits 4.898n ± 0% 4.896n ± 0% ~ (p=0.754 n=10)
FMA 4.895n ± 0% 4.895n ± 0% ~ (p=0.745 n=10)
geomean 94.40n 89.43n -5.27%
Change-Id: I4fe0f2e9f609e38d79463f9ba2519a3f9427432e
Reviewed-on: https://go-review.googlesource.com/c/go/+/348389
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This reverts commit 4e182db5fc876564a4f87a0602c58ea0ddc6e37c (CL 695196),
which is itself a revert of
ec9e1176c3209cf92e73e3deb2d8073fab5ea4d6 (CL 678620).
So this CL is exactly the same as CL 678620, but with a regalloc fix
(CL 696035) submitted first.
Change-Id: I743ab32fa3aa6ef3e1b2b6751a2ef4519139057c
Reviewed-on: https://go-review.googlesource.com/c/go/+/696016
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It is ok to clobber registers that have a copy of a fixedreg value,
as that value is always available in its original location later
if we need it. (See 14 lines below the change.)
This CL will fix the regalloc infinite loop that CL 678620 introduced.
That CL requests that the stack pointer value be materialized in a
non-stack-pointer register, which is atypical. That condition
triggered the infinite loop that this CL fixes. The infinite loop is
the compiler trying to reuse that non-stack-pointer register for
something else, but then refusing to give it up because it thought
that non-stack-pointer register held the last copy of the original SP
value.
Change-Id: Id604d0937fb9d3753ee273bf1917753d3ef2d5d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/696035
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We're about to add a small simd/_gen submodule that imports external
dependencies. Exclude it from TestStdlib since it won't be able to
follow those dependencies.
Change-Id: I29a1adc98d141b9c511aa29e1992fab2248747d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/695976
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This also removes AVX512 versions of the operations
that would use the same names, but not run on AVX2-only.
Includes files generated by simdgen CL 692355.
Change-Id: Iff29042245b7688133fed49a03e681e85235b8a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/692335
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Generated by simdgen CL 693599
This turned out to require some additional work in
other places, including filling in missing
methods (use OverwriteBase to get FP versions).
Also includes a test.
Change-Id: I2efe8967837834745f9cae661d4d4dcbb5390b6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/693758
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
SIMD code generation created interesting new type/register
combinations.
Change-Id: I9c9a73bf51f6cb54551db1fdc88f9dd1eef7ab26
Reviewed-on: https://go-review.googlesource.com/c/go/+/695895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Also includes a handy debugging hook for the inliner.
Change-Id: I23d0619506219d21db78c6c801612ff058562142
Reviewed-on: https://go-review.googlesource.com/c/go/+/694118
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
This code is generated by simdgen CL 695455.
Change-Id: I5afdc209a50b49d68e120130e0578e4666bf8749
Reviewed-on: https://go-review.googlesource.com/c/go/+/695475
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Change-Id: I88056fada1ff488c199fce54cf737dbdd091214d
Reviewed-on: https://go-review.googlesource.com/c/go/+/695095
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Removes 132 instructions from the go binary on loong64.
Change-Id: Ia02dc305b12f63a64f3f48d120ef852d45cc2a7b
Reviewed-on: https://go-review.googlesource.com/c/go/+/695115
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
'ADDshiftLLV' on loong64
goos: linux
goarch: loong64
pkg: cmd/compile/internal/test
cpu: Loongson-3A6000-HV @ 2500.00MHz
│ old │ new │
│ sec/op │ sec/op vs base │
MulconstI32/3 0.8004n ± 0% 0.4247n ± 2% -46.94% (p=0.000 n=10)
MulconstI32/5 0.8005n ± 0% 0.4256n ± 1% -46.83% (p=0.000 n=10)
MulconstI32/12 1.2010n ± 0% 0.8005n ± 0% -33.35% (p=0.000 n=10)
MulconstI32/120 0.8090n ± 0% 0.8067n ± 0% -0.28% (p=0.007 n=10)
MulconstI32/-120 0.8109n ± 0% 0.8072n ± 0% -0.47% (p=0.000 n=10)
MulconstI32/65537 0.8004n ± 0% 0.8004n ± 0% ~ (p=1.000 n=10)
MulconstI32/65538 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.265 n=10)
MulconstI64/3 0.8005n ± 0% 0.4241n ± 1% -47.02% (p=0.000 n=10)
MulconstI64/5 0.8004n ± 0% 0.4249n ± 1% -46.91% (p=0.000 n=10)
MulconstI64/12 1.2010n ± 0% 0.8004n ± 0% -33.36% (p=0.000 n=10)
MulconstI64/120 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.635 n=10)
MulconstI64/-120 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.837 n=10)
MulconstI64/65537 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.837 n=10)
MulconstI64/65538 0.8096n ± 0% 0.8004n ± 0% -1.14% (p=0.000 n=10)
MulconstU32/3 0.8004n ± 0% 0.4263n ± 1% -46.75% (p=0.000 n=10)
MulconstU32/5 0.8005n ± 0% 0.4262n ± 1% -46.76% (p=0.000 n=10)
MulconstU32/12 1.2010n ± 0% 0.8005n ± 0% -33.35% (p=0.000 n=10)
MulconstU32/120 0.8105n ± 0% 0.8096n ± 0% ~ (p=0.183 n=10)
MulconstU32/65537 0.8004n ± 0% 0.8004n ± 0% ~ (p=1.000 n=10)
MulconstU32/65538 0.8005n ± 0% 0.8005n ± 0% ~ (p=1.000 n=10)
MulconstU64/3 0.8004n ± 0% 0.4265n ± 4% -46.71% (p=0.000 n=10)
MulconstU64/5 0.8004n ± 0% 0.4256n ± 0% -46.82% (p=0.000 n=10)
MulconstU64/12 1.2010n ± 0% 0.8004n ± 0% -33.36% (p=0.000 n=10)
MulconstU64/120 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.387 n=10)
MulconstU64/65537 0.8005n ± 0% 0.8005n ± 0% ~ (p=0.265 n=10)
MulconstU64/65538 0.8080n ± 0% 0.8004n ± 0% -0.93% (p=0.000 n=10)
geomean 0.8539n 0.6597n -22.74%
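
The kind of strength reduction measured above can be sketched in plain Go. This assumes the shift-and-add operation computes `x + y<<c`; the helper `addShiftLL` and the particular decompositions are illustrative, not the compiler's actual rewrite rules.

```go
package main

import "fmt"

// addShiftLL models a fused shift-and-add: x + y<<c.
// Hypothetical stand-in for the machine operation.
func addShiftLL(x, y int64, c uint) int64 {
	return x + y<<c
}

// Multiplications by small constants expressed with shift-and-add.
func mul3(x int64) int64  { return addShiftLL(x, x, 1) } // x + 2x
func mul5(x int64) int64  { return addShiftLL(x, x, 2) } // x + 4x
func mul12(x int64) int64 { return mul3(x) << 2 }        // (3x) * 4

func main() {
	fmt.Println(mul3(7), mul5(7), mul12(7)) // 21 35 84
}
```

Under this model, multiplying by 3 or 5 takes one fused operation and multiplying by 12 takes two, which is in line with the cases that improve most in the table.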
Change-Id: Ie33e88985d7639f481bbba540bc917b9f185c357
Reviewed-on: https://go-review.googlesource.com/c/go/+/693855
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
EQ/NE on loong64
Reduce the number of go toolchain instructions on loong64 as follows:
file before after Δ %
go 1599056 1590560 -8496 -0.5313%
gofmt 326188 326104 -84 -0.0258%
asm 563482 561250 -2232 -0.3961%
cgo 488644 485252 -3392 -0.6942%
compile 2504614 2486388 -18226 -0.7277%
cover 526322 523270 -3052 -0.5799%
link 714532 711124 -3408 -0.4770%
preprofile 242316 241112 -1204 -0.4969%
vet 794446 786118 -8328 -1.0483%
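
The identity behind this kind of rewrite is that `x ^ y` is zero exactly when `x == y`, so a test of the XOR result against zero can become a direct compare-and-branch on the two inputs. A minimal Go sketch of the equivalence, with illustrative function names:

```go
package main

import "fmt"

// eqViaXor tests equality the long way: XOR, then compare with zero.
func eqViaXor(x, y uint64) bool { return x^y == 0 }

// neViaXor is the corresponding inequality test.
func neViaXor(x, y uint64) bool { return x^y != 0 }

func main() {
	// Both forms agree with the direct comparisons x == y and x != y.
	fmt.Println(eqViaXor(3, 3) == (3 == 3), neViaXor(3, 4) == (3 != 4))
}
```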
Change-Id: I0914889119a28ea672b694529ef54513fbb3f3b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/693875
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This reverts commit ec9e1176c3209cf92e73e3deb2d8073fab5ea4d6 (CL 678620).
Reason for revert: causing regalloc to get into an infinite loop
Change-Id: Ie53c58c6126804af6d6883ea4acdcfb632a172bd
Reviewed-on: https://go-review.googlesource.com/c/go/+/695196
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
|
|
Change-Id: Ie0c8263f36d1bcfd0edfc4ea6710ae6c113c4d48
Reviewed-on: https://go-review.googlesource.com/c/go/+/678995
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
goarch: amd64
cpu: 12th Gen Intel(R) Core(TM) i7-12700
│ base │ exp │
│ sec/op │ sec/op vs base │
MemmoveKnownSize112-20 1.764n ± 0% 1.247n ± 0% -29.31% (p=0.000 n=10)
MemmoveKnownSize128-20 1.891n ± 0% 1.405n ± 1% -25.72% (p=0.000 n=10)
MemmoveKnownSize192-20 2.521n ± 0% 2.114n ± 3% -16.16% (p=0.000 n=10)
MemmoveKnownSize248-20 4.028n ± 0% 3.877n ± 1% -3.75% (p=0.000 n=10)
MemmoveKnownSize256-20 3.272n ± 0% 2.961n ± 2% -9.53% (p=0.000 n=10)
MemmoveKnownSize512-20 6.733n ± 3% 5.936n ± 4% -11.83% (p=0.000 n=10)
MemmoveKnownSize1024-20 13.905n ± 5% 9.798n ± 9% -29.54% (p=0.000 n=10)
Change-Id: Icc01cec0d8b072300d749a5ce76f53b3725b5c65
Reviewed-on: https://go-review.googlesource.com/c/go/+/678620
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
|
|
This reverts commit a3295df873bb22b3ba427124b1220370a5ca5cdb (CL 679155)
Reason for revert: leads to a very expensive prove pass, see #74974
(Maybe not this CL's fault, just tickling some superlinear behavior.)
Change-Id: I75302c04cfc5e1e075aeb80edb73080bfb1efcac
Reviewed-on: https://go-review.googlesource.com/c/go/+/695175
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Daniel Morsing <daniel.morsing@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Always enable aliastypeparams and remove the GOEXPERIMENT.
Change-Id: Ic38fe25b0bba312a7f83f7bb94b57ab75ce0f0c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/691956
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
interfaces"
This reverts commit cd55f86b8dcfc139ee5c17d32530ac9e758c8bc0 (CL 681937)
Reason for revert: still causing compiler failures on Google test code
Change-Id: I5cd482fd607fd060a523257082d48821b5f965d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/695016
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This reverts commit bcd25c79aa5675d3e5d28c09715b8147906da006 (CL 693415)
Reason for revert: still causing compiler failures on Google test code
Change-Id: I887edcff56bde3ffa316f2b629021ad323a357fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/694996
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This reverts commit 72e8237cc11569de2faf9885a1b83d06446533b5 (CL 693615)
Reason for revert: still causing compiler failures on Google test code
Change-Id: I4a7850c321d95ed7803d56866bb0c524c7a377d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/695015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This CL is generated by x/arch CL 694860
Change-Id: Ifa7c0e9749b1d9a20f31b70aafe563d7844ce6b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/694917
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL is generated by x/arch CL 694857.
Change-Id: I9745fa8c9b2e3f49bd2cff5ff6b5578c0c67bfa1
Reviewed-on: https://go-review.googlesource.com/c/go/+/694915
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This was a long-standing "we need to fix this" item
for simd work, and this CL fixes it. I expect that
simd peephole rule files will be coming soon,
bringing more errors, and we will be
happier to have this.
Change-Id: Iefffc43e3e2110939f8d406f6e5da7e9e2d55bd9
Reviewed-on: https://go-review.googlesource.com/c/go/+/694455
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL fixes some errors in prog generation for imm operations; see
the changes in ssa.go for details.
This CL also implements the jump table for a non-const immediate arg. The
current implementation enumerates all of 0-255; the bounds-checked version
will be in the next CL.
This CL is partially generated by CL 694375.
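
As a rough illustration of the jump-table idea described above (not the compiler's actual lowering), the sketch below dispatches a runtime 8-bit value to one of 256 entries, each specialized for a fixed immediate. The names `table`, `op`, and `rotateByVariable` are hypothetical, and a rotate from math/bits stands in for an instruction that needs a constant immediate.

```go
package main

import (
	"fmt"
	"math/bits"
)

// op is a hypothetical stand-in for an operation whose immediate
// must be fixed when the code for it is generated.
type op func(x uint64) uint64

// table has one specialized entry for every possible 8-bit immediate,
// mirroring the "enumerate 0-255" approach.
var table [256]op

func init() {
	for i := range table {
		imm := i // fixed per entry, like an immediate encoded in an instruction
		table[i] = func(x uint64) uint64 { return bits.RotateLeft64(x, imm) }
	}
}

// rotateByVariable selects the entry matching the runtime value of imm.
// A uint8 index into a 256-entry array can never be out of range.
func rotateByVariable(x uint64, imm uint8) uint64 {
	return table[imm](x)
}

func main() {
	fmt.Println(rotateByVariable(1, 4))
}
```

Because every value of a `uint8` has an entry, no range check is needed in this sketch; a table covering fewer values would need one before indexing.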
Change-Id: I75fe9900430b4fca5b39b0c0958a13b20b1104b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/694395
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This removes more than 7500 instructions from the Go toolchain binaries on loong64.
file        before   after    Δ      %
asm         555066   554406   -660   -0.1189%
cgo         481480   480764   -716   -0.1487%
compile     2475836  2474776  -1060  -0.0428%
cover       516536   515788   -748   -0.1448%
link        702220   701216   -1004  -0.1430%
preprofile  238626   238122   -504   -0.2112%
vet         792798   791894   -904   -0.1140%
go          1573108  1571676  -1432  -0.0910%
gofmt       320578   320042   -536   -0.1672%
total       7656248  7648684  -7564  -0.0988%
Change-Id: I51b70a1543bc258b7664caa8647e75eecbaf5eed
Reviewed-on: https://go-review.googlesource.com/c/go/+/693495
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Although InlMark takes a memory argument it ultimately becomes a
NOP and therefore is safe to speculatively execute.
Fixes #74915
Change-Id: I64317dd433e300ac28de2bcf201845083ec2ac82
Reviewed-on: https://go-review.googlesource.com/c/go/+/693795
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
loong64
Add branches to convert EQZ/NEZ into better branch conditions.
This removes 720 instructions from the Go toolchain binaries on loong64.
file        before   after    Δ     %
asm         555306   555082   -224  -0.0403%
cgo         481814   481742   -72   -0.0149%
compile     2475686  2475710  +24   +0.0010%
cover       516854   516770   -84   -0.0163%
link        702566   702530   -36   -0.0051%
preprofile  238612   238548   -64   -0.0268%
vet         793140   793060   -80   -0.0101%
go          1573466  1573346  -120  -0.0076%
gofmt       320560   320496   -64   -0.0200%
total       7658004  7657284  -720  -0.0094%
Additionally, rename EQ/NE to EQZ/NEZ to enhance readability.
Change-Id: Ibc876bc8b8d4e81d5c3aaf0b74b60419f3c771b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/693455
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
The staticAssignInlinedCall function contains code for handling
non-Unified IR. As Unified IR is now the sole format for the frontend,
this code is obsolete and can be removed.
Change-Id: Iac93a9b59ec6d639851e1b17ba1f75563d8bcda5
Reviewed-on: https://go-review.googlesource.com/c/go/+/694075
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Change-Id: I680bae7fc1a26c1f249ab833fa8d41e9387b2d50
Reviewed-on: https://go-review.googlesource.com/c/go/+/693456
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
|
|
- Absolute -> Abs
- ApproximateReciprocal -> Reciprocal
  - Other derived APIs also changed.
- Round -> RoundToEven
  - Other derived APIs also changed.
- Drop DotProdBroadcast
- Fused(Mul|Add)(Mul|Add)? -> remove the "Fused"
- MulEvenWiden -> remove 64bit
- MulLow -> Mul, add unit
- PairDotProd -> DotProdPairs
  - Make AddDotProdPairs machine ops only; peepholes will be in another
    CL at dev.simd.
- PopCount -> OnesCount
- Saturated* -> *Saturated
  - Fix (Add|Sub)Saturated uint mappings.
- UnsignedSignedQuadDotProdAccumulate -> AddDotProdQuadruple
  - The "DotProdQuadruple" instruction does not exist, so no peepholes for
    this.
This CL is generated by CL 694095.
Change-Id: If4110cc04ab96240cf56f2348d35ed2a719687de
Reviewed-on: https://go-review.googlesource.com/c/go/+/694115
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL is generated by CL 693598.
Change-Id: I949d3b3b4e5670cb30f0fb9dc779f7359409b54c
Reviewed-on: https://go-review.googlesource.com/c/go/+/693755
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL is generated by CL 693336.
Change-Id: Ic1712d49fcad0544fa3c19b0249d8bc65b347104
Reviewed-on: https://go-review.googlesource.com/c/go/+/693375
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL is generated by CL 693335.
Change-Id: Ie9adda526573f979ec7e4f535033ba29236cc5cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/693355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|