path: root/src/cmd/compile
Age | Commit message | Author
2025-09-10 | Revert "cmd/compile: improve stp merging for non-sequent cases" | Keith Randall
This reverts commit 4c63d798cb947a3cdd5a5b68f254a73d83eb288f. Reason for revert: Causes miscompilations. See issue 75365. Change-Id: Icd1fcfeb23d2ec524b16eb556030f43875e1c90d Reviewed-on: https://go-review.googlesource.com/c/go/+/702455 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-09-09 | runtime: remove duff support for riscv64 | Meng Zhuo
Change-Id: I987d9f49fbd2650eef4224f72271bf752c54d39c Reviewed-on: https://go-review.googlesource.com/c/go/+/700538 Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
2025-09-09 | cmd/compile: use generated loops instead of DUFFCOPY on riscv64 | Meng Zhuo
MemmoveKnownSize112-4    632.1Mi ± 1%   1288.5Mi ± 0%  +103.85% (p=0.000 n=10)
MemmoveKnownSize128-4    636.1Mi ± 0%   1280.9Mi ± 1%  +101.36% (p=0.000 n=10)
MemmoveKnownSize192-4    645.3Mi ± 0%   1306.9Mi ± 1%  +102.53% (p=0.000 n=10)
MemmoveKnownSize248-4    650.2Mi ± 2%   1312.5Mi ± 1%  +101.87% (p=0.000 n=10)
MemmoveKnownSize256-4    650.7Mi ± 0%   1303.6Mi ± 1%  +100.33% (p=0.000 n=10)
MemmoveKnownSize512-4    658.2Mi ± 1%   1293.9Mi ± 0%   +96.60% (p=0.000 n=10)
MemmoveKnownSize1024-4   662.1Mi ± 0%   1312.6Mi ± 0%   +98.26% (p=0.000 n=10)
Change-Id: I43681ca029880025558b33ddc4295da3947c9b28 Reviewed-on: https://go-review.googlesource.com/c/go/+/700537 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-09-09 | cmd/compile: use generated loops instead of DUFFZERO on riscv64 | Meng Zhuo
MemclrKnownSize112-4    5.602Gi ± 0%   5.601Gi ± 0%  ~ (p=0.363 n=10)
MemclrKnownSize128-4    6.933Gi ± 1%   6.545Gi ± 1%  -5.59% (p=0.000 n=10)
MemclrKnownSize192-4    8.055Gi ± 1%   7.804Gi ± 0%  -3.12% (p=0.000 n=10)
MemclrKnownSize248-4    8.489Gi ± 0%   8.718Gi ± 0%  +2.69% (p=0.000 n=10)
MemclrKnownSize256-4    8.762Gi ± 0%   8.763Gi ± 0%  ~ (p=0.494 n=10)
MemclrKnownSize512-4    9.514Gi ± 1%   9.514Gi ± 0%  ~ (p=0.529 n=10)
MemclrKnownSize1024-4   9.940Gi ± 0%   9.939Gi ± 1%  ~ (p=0.989 n=10)
ClearFat3-4             1.300Gi ± 0%   1.301Gi ± 0%  ~ (p=0.447 n=10)
ClearFat4-4             3.902Gi ± 0%   3.902Gi ± 0%  ~ (p=0.971 n=10)
ClearFat5-4             665.8Mi ± 0%   1331.5Mi ± 0%  +100.01% (p=0.000 n=10)
ClearFat6-4             665.8Mi ± 0%   1330.5Mi ± 0%  +99.82% (p=0.000 n=10)
ClearFat7-4             490.7Mi ± 0%   1331.9Mi ± 0%  +171.45% (p=0.000 n=10)
ClearFat8-4             5.201Gi ± 0%   5.202Gi ± 0%  ~ (p=0.123 n=10)
ClearFat9-4             856.1Mi ± 0%   1331.6Mi ± 0%  +55.54% (p=0.000 n=10)
ClearFat10-4            887.8Mi ± 0%   1331.9Mi ± 0%  +50.03% (p=0.000 n=10)
ClearFat11-4            915.3Mi ± 0%   1331.1Mi ± 0%  +45.42% (p=0.000 n=10)
ClearFat12-4            5.202Gi ± 0%   5.202Gi ± 0%  ~ (p=0.481 n=10)
ClearFat13-4            961.5Mi ± 0%   1331.8Mi ± 0%  +38.50% (p=0.000 n=10)
ClearFat14-4            981.0Mi ± 0%   1331.8Mi ± 0%  +35.76% (p=0.000 n=10)
ClearFat15-4            951.3Mi ± 0%   1331.4Mi ± 0%  +39.96% (p=0.000 n=10)
ClearFat16-4            1.600Gi ± 0%   5.202Gi ± 0%  +225.10% (p=0.000 n=10)
ClearFat18-4            1.018Gi ± 0%   1.300Gi ± 0%  +27.77% (p=0.000 n=10)
ClearFat20-4            2.601Gi ± 0%   4.938Gi ± 12%  +89.87% (p=0.000 n=10)
ClearFat24-4            2.601Gi ± 0%   5.201Gi ± 0%  +99.96% (p=0.000 n=10)
ClearFat32-4            1.982Gi ± 0%   5.203Gi ± 0%  +162.55% (p=0.000 n=10)
ClearFat40-4            3.467Gi ± 0%   4.338Gi ± 0%  +25.11% (p=0.000 n=10)
ClearFat48-4            3.671Gi ± 0%   5.201Gi ± 0%  +41.69% (p=0.000 n=10)
ClearFat56-4            3.640Gi ± 0%   5.201Gi ± 0%  +42.88% (p=0.000 n=10)
ClearFat64-4            2.250Gi ± 0%   5.202Gi ± 0%  +131.25% (p=0.000 n=10)
ClearFat72-4            4.064Gi ± 0%   5.201Gi ± 0%  +27.97% (p=0.000 n=10)
ClearFat128-4           4.496Gi ± 0%   5.203Gi ± 0%  +15.71% (p=0.000 n=10)
ClearFat256-4           4.756Gi ± 0%   5.201Gi ± 0%  +9.36% (p=0.000 n=10)
ClearFat512-4           2.512Gi ± 0%   5.201Gi ± 0%  +107.03% (p=0.000 n=10)
ClearFat1024-4          4.255Gi ± 0%   5.202Gi ± 0%  +22.26% (p=0.000 n=10)
ClearFat1032-4          4.260Gi ± 0%   5.201Gi ± 0%  +22.09% (p=0.000 n=10)
ClearFat1040-4          4.285Gi ± 1%   5.203Gi ± 0%  +21.41% (p=0.000 n=10)
geomean                 2.005Gi        3.020Gi       +50.58%
Change-Id: Iea1da734ff8eaf1b5a2822ae2bdb7f4fd9b65651 Reviewed-on: https://go-review.googlesource.com/c/go/+/699635 Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-09-09 | cmd/compile: simplify zerorange on riscv64 | Meng Zhuo
Drop large zeroing cases, part of removing duff support. Change-Id: Ia2936f649901886f3eb1d7ba1f90e3bf40ea2dee Reviewed-on: https://go-review.googlesource.com/c/go/+/697615 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Julian Zhu <jz531210@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Joel Sing <joel@sing.id.au>
2025-09-09 | cmd/compile/internal/inline: ignore superfluous slicing | Jake Bailey
When slicing, ignore expressions which could be elided, as in slicing starting at 0 or ending at len(v). Fixes #75278 Change-Id: I9c18e29c3d4da9bef89bd25bb261d3cb60e66392 Reviewed-on: https://go-review.googlesource.com/c/go/+/701216 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com>
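As a minimal sketch of what "superfluous slicing" refers to (the function names below are hypothetical and do not come from the CL), the two functions select exactly the same elements, because the slice expression starts at 0 and ends at len(v). How the inliner accounts for such expressions internally is not shown here.

```go
package main

import "fmt"

// sumPlain ranges over v directly.
func sumPlain(v []int) int {
	total := 0
	for _, x := range v {
		total += x
	}
	return total
}

// sumSliced ranges over v[0:len(v)], which denotes the same elements as v.
// Both bounds are of the kind the commit message describes as elidable.
func sumSliced(v []int) int {
	total := 0
	for _, x := range v[0:len(v)] {
		total += x
	}
	return total
}

func main() {
	v := []int{1, 2, 3, 4}
	fmt.Println(sumPlain(v) == sumSliced(v))
}
```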
2025-09-09 | cmd/compile/internal/ssa: expand runtime.memequal for length {3,5,6,7} | Youlin Feng
This CL slightly speeds up strings.HasPrefix when testing constant prefixes of length {3,5,6,7}.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
                 │     old      │                new                 │
                 │    sec/op    │   sec/op     vs base               │
StringPrefix3-8   11.125n ± 2%   8.539n ± 1%  -23.25% (p=0.000 n=20)
StringPrefix5-8   11.170n ± 2%   8.700n ± 1%  -22.11% (p=0.000 n=20)
StringPrefix6-8   11.190n ± 2%   8.655n ± 1%  -22.65% (p=0.000 n=20)
StringPrefix7-8   11.095n ± 1%   8.878n ± 1%  -19.98% (p=0.000 n=20)
Change-Id: I510a80d59cf78680b57d68780d35d212d24030e2 Reviewed-on: https://go-review.googlesource.com/c/go/+/700816 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Auto-Submit: Keith Randall <khr@golang.org>
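To illustrate the idea of expanding a short, fixed-length comparison, here is a hand-written sketch of a 3-byte prefix check built from one 16-bit comparison plus one byte comparison. The helpers load16 and hasPrefix3 are hypothetical, and this is not claimed to be the sequence the compiler actually emits; it only shows that an odd length can be covered by a few fixed-width comparisons instead of a call.

```go
package main

import (
	"fmt"
	"strings"
)

// load16 combines the two bytes of s starting at index i into one 16-bit value.
func load16(s string, i int) uint16 {
	return uint16(s[i]) | uint16(s[i+1])<<8
}

// hasPrefix3 reports whether s begins with the 3-byte string p.
// It compares bytes 0 and 1 as one 16-bit value and byte 2 separately.
func hasPrefix3(s, p string) bool {
	if len(p) != 3 || len(s) < 3 {
		return false
	}
	return load16(s, 0) == load16(p, 0) && s[2] == p[2]
}

func main() {
	for _, s := range []string{"foobar", "fob", "fo", "foo"} {
		// The two results should agree for every input.
		fmt.Println(s, hasPrefix3(s, "foo"), strings.HasPrefix(s, "foo"))
	}
}
```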
2025-09-09 | cmd/compile: improve stp merging for non-sequent cases | Melnikov Denis
The original algorithm merges each store with the first mergeable store in the chain, but it misses some cases. Additionally reordering the stores in the chain in increasing order of memory access allows merging in these cases.
Fixes #71987
Here are the results of the sweet benchmarks and the differences in .text section sizes:
                        │ old.results │            new.results            │
                        │   sec/op    │   sec/op     vs base              │
BleveIndexBatch100-4      7.614 ± 2%    7.548 ± 1%   ~ (p=0.190 n=10)
ESBuildThreeJS-4          821.3m ± 0%   819.0m ± 1%  ~ (p=0.165 n=10)
ESBuildRomeTS-4           206.2m ± 1%   204.4m ± 1%  -0.90% (p=0.023 n=10)
EtcdPut-4                 64.89m ± 1%   64.94m ± 2%  ~ (p=0.684 n=10)
EtcdSTM-4                 318.4m ± 0%   319.2m ± 1%  ~ (p=0.631 n=10)
GoBuildKubelet-4          157.4 ± 0%    157.6 ± 0%   ~ (p=0.105 n=10)
GoBuildKubeletLink-4      12.42 ± 2%    12.41 ± 1%   ~ (p=0.529 n=10)
GoBuildIstioctl-4         124.4 ± 0%    124.4 ± 0%   ~ (p=0.579 n=10)
GoBuildIstioctlLink-4     8.700 ± 1%    8.693 ± 1%   ~ (p=0.912 n=10)
GoBuildFrontend-4         46.52 ± 0%    46.50 ± 0%   ~ (p=0.971 n=10)
GoBuildFrontendLink-4     2.282 ± 1%    2.272 ± 1%   ~ (p=0.529 n=10)
GoBuildTsgo-4             75.02 ± 1%    75.31 ± 1%   ~ (p=0.436 n=10)
GoBuildTsgoLink-4         1.229 ± 1%    1.219 ± 1%   -0.82% (p=0.035 n=10)
GopherLuaKNucleotide-4    34.77 ± 5%    34.31 ± 1%   -1.33% (p=0.015 n=10)
MarkdownRenderXHTML-4     286.6m ± 0%   285.7m ± 1%  ~ (p=0.315 n=10)
Tile38QueryLoad-4         657.2µ ± 1%   660.3µ ± 0%  ~ (p=0.436 n=10)
geomean                   2.570         2.563        -0.24%

Executable          Old .text   New .text   Change
-------------------------------------------------------
benchmark           6504820     6504020     -0.01%
bleve-index-bench   3903860     3903636     -0.01%
esbuild             4801012     4801172     +0.00%
esbuild-bench       1256404     1256340     -0.01%
etcd                9188148     9187076     -0.01%
etcd-bench          6462228     6461524     -0.01%
go                  5924468     5923892     -0.01%
go-build-bench      1282004     1281940     -0.00%
gopher-lua-bench    1639540     1639348     -0.01%
markdown-bench      1478452     1478356     -0.01%
tile38-bench        2753524     2753300     -0.01%
tile38-server       10241380    10240068    -0.01%
Change-Id: Ieb4fdfd656aca458f65fc45938de70550632bd13 Reviewed-on: https://go-review.googlesource.com/c/go/+/698097 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-09-09 | cmd/compile: use constant zero register instead of specialized zero instructions on mips64x | Julian Zhu
Refer to CL 633075, mips64x has a constant zero register that can be used to do this. Change-Id: I7b60f9a9fe0015299f48b9219ba0eddd3c02e07a Reviewed-on: https://go-review.googlesource.com/c/go/+/700935 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-09-09 | cmd/compile: introduce CCMP generation | Ch1n-ch1nless
Introduce a new aux type "ARM64ConditionalParams", which contains the condition code, the NZCV flags, and a constant together with an indicator of whether the constant is used, for CCMP instructions. Updates #71268 Change-Id: I322a6cb7077c9a2c4415893c5eb7ff7692d5a2de Reviewed-on: https://go-review.googlesource.com/c/go/+/698037 Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-09-09 | cmd/compile: fix bounds check report | Keith Randall
For constant-index, variable length situations. Inadvertent shadowing of the yVal variable. Oops. Fixes #75327 Change-Id: I3403066fc39b7664222a3098cf0f22b5761ea66a Reviewed-on: https://go-review.googlesource.com/c/go/+/702015 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
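For readers unfamiliar with this class of bug, the following is a generic illustration of how `:=` inside a nested block can shadow an outer variable in Go. It is not the compiler code that was fixed; the functions and their parameters are hypothetical, and only the variable name yVal is borrowed from the commit message.

```go
package main

import "fmt"

// lengthShadowed intends to update yVal when known is true, but the ":="
// inside the if block declares a new, inner yVal, so the outer one
// keeps its initial value.
func lengthShadowed(known bool, n int) int {
	yVal := -1
	if known {
		yVal := n // shadows the outer yVal
		_ = yVal
	}
	return yVal
}

// lengthFixed assigns to the outer yVal with "=" instead of ":=".
func lengthFixed(known bool, n int) int {
	yVal := -1
	if known {
		yVal = n
	}
	return yVal
}

func main() {
	fmt.Println(lengthShadowed(true, 5), lengthFixed(true, 5))
}
```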
2025-09-08 | [dev.simd] cmd/compile, simd: add VPLZCNT[DQ] | Junyang Shao
Change-Id: Ifd6d8c12deac9c41722fdf2511d860a334e83438 Reviewed-on: https://go-review.googlesource.com/c/go/+/701915 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: Junyang Shao <shaojunyang@google.com>
2025-09-08 | cmd/compile: fold constant in ADDshift op on loong64 | Xiaolin Zhao
Removes 918 instructions from the go binary on loong64.
file         before    after     Δ
go           1633120   1632948   -172
gofmt        323470    323334    -136
asm          568024    568024    -0
cgo          488030    487890    -140
compile      2501050   2500728   -322
cover        530124    530124    -0
link         723532    723520    -12
preprofile   240568    240568    -0
vet          819392    819256    -136
Change-Id: Id4015c66b2073323b7ad257b3ed05bb99f81e9a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/701655 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com>
2025-09-08 | cmd/compile: optimize loads from abi.Type.{Size_,PtrBytes,Kind_} | Jake Bailey
With the previous CL in place, we can now pretty easily optimize a few more loads from abi.Type. I've done Size_, PtrBytes, and Kind_, which are easily calculated. Among std/cmd, this rule fires a number of times:
75 abi.Type field Kind_
50 abi.PtrType field Elem
14 abi.Type field Hash
4 abi.Type field Size_
2 abi.Type field PtrBytes
The other ones that show up when compiling std/cmd are TFlag and GCData, but these are not trivially calculated. Doing TFlag would probably be a decent help given it's often used in things like switches where statically knowing the kind could eliminate a bunch of dead code. Change-Id: Ic7fd2113fa7479af914d06916edbca60cc71819f Reviewed-on: https://go-review.googlesource.com/c/go/+/701298 Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
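As a rough user-level illustration, the sketch below reads a type's size and kind through the reflect package for statically known types. The helper names sizeOf and kindOf are hypothetical, and whether the rewrite rule from this CL actually fires for this exact code depends on the compiler version and on inlining decisions, so treat the connection as an assumption.

```go
package main

import (
	"fmt"
	"reflect"
)

// sizeOf returns the size in bytes of T as reported by its type descriptor.
func sizeOf[T any]() uintptr {
	return reflect.TypeFor[T]().Size()
}

// kindOf returns the reflect.Kind of T as reported by its type descriptor.
func kindOf[T any]() reflect.Kind {
	return reflect.TypeFor[T]().Kind()
}

func main() {
	// T is known at compile time here, so in principle the answers are
	// static properties of the type.
	fmt.Println(sizeOf[uint64](), kindOf[uint64]())
	fmt.Println(kindOf[[]byte]())
}
```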
2025-09-08 | cmd/compile: consolidate logic for rewriting fixed loads | Jake Bailey
Many CLs have worked with this bit of code, extending the cases more and more for various fixed addresses and constants. But, I find that it's getting duplicative, and I don't find the current setup very clear that something like isFixed32 _only_ works for a specific element within the type data. This CL rewrites these rules (pun unintended) into a single set of rewrite rules with shared logic, which stops hardcoding offsets and type compatibility checks. This should open the door to optimizing further type:... field loads, of which most can be done entirely statically but are not yet today outside Hash and Elem. Passes toolstash -cmp. Change-Id: I754138ce1785c6036eada9ed53f0ce2ad2a58b63 Reviewed-on: https://go-review.googlesource.com/c/go/+/701297 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Florian Lehner <lehner.florian86@gmail.com>
2025-09-08 | cmd/compile: simplify zerorange on mips | Julian Zhu
Change-Id: I9feffa3906f1e1e9fd54f24113130322411cc9d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/701155 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2025-09-07 | [dev.simd] cmd/compile: enhance prove to deal with double-offset IsInBounds checks | David Chase
For chunked iterations (useful for, but not exclusive to, SIMD calculations) it is common to see the combination of
```
for ; i <= len(m)-4; i += 4 {
```
and
```
r0, r1, r2, r3 := m[i], m[i+1], m[i+2], m[i+3]
```
Prove did not handle the case of len-offset1 vs index+offset2 checking, but this change fixes this. There may be other similar cases yet to handle -- this worked for the chunked loops for simd, as well as a handful in std. Change-Id: I3785df83028d517e5e5763206653b34b2befd3d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/700696 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
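The two fragments from the commit message can be put together into a complete function. The sketch below is an assumed example (the name sum4 and the tail loop are not from the CL) of the chunked pattern whose index expressions m[i+1], m[i+2], and m[i+3] are bounded by the loop condition i <= len(m)-4.

```go
package main

import "fmt"

// sum4 adds the elements of m, processing four at a time.
func sum4(m []int) int {
	total := 0
	i := 0
	// Inside this loop i+3 <= len(m)-1, so all four accesses are in
	// bounds. This is the len-offset vs index+offset relation that the
	// commit message says prove can now reason about.
	for ; i <= len(m)-4; i += 4 {
		r0, r1, r2, r3 := m[i], m[i+1], m[i+2], m[i+3]
		total += r0 + r1 + r2 + r3
	}
	// Handle the remaining 0 to 3 elements.
	for ; i < len(m); i++ {
		total += m[i]
	}
	return total
}

func main() {
	fmt.Println(sum4([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}))
}
```

One way to look for remaining bounds checks is to build with `-gcflags=-d=ssa/check_bce/debug=1`; which checks are reported depends on the toolchain version in use.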
2025-09-05 | cmd/compile: mark abi.PtrType.Elem sym as used | Jake Bailey
CL 700336 let the compiler see into the abi.PtrType.Elem field, but forgot the MarkTypeSymUsedInInterface to ensure that the symbol is marked as referenced. I am not sure how to write a test for this, but I noticed this when working on further optimizations where I "fixed" this issue and confusingly failed toolstash -cmp, with diffs like:
@@ -70582,6 +70582,7 @@ reflect.groupAndSlotOf<1> STEXT size=696 args=0x20 locals=0x1e0 funcid=0x0 align
  rel 3+0 t=R_USEIFACE type:*reflect.rtype<0>+0
  rel 3+0 t=R_USEIFACE type:*reflect.rtype<0>+0
  rel 3+0 t=R_USEIFACE type:*uint64<0>+0
+ rel 3+0 t=R_USEIFACE type:uint64<0>+0
  rel 71+0 t=R_CALLIND +0
  rel 92+4 t=R_PCREL go:itab.*reflect.rtype,reflect.Type<0>+0
  rel 114+4 t=R_CALL reflect.(*rtype).ptrTo<1>+0
Updates #75203 Change-Id: Ib8de8a32aeb8a7ea6fcf5d728a2e4944ef227ab2 Reviewed-on: https://go-review.googlesource.com/c/go/+/701296 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-09-05 | [dev.simd] cmd/compile, simd: add ClearAVXUpperBits | Cherry Mui
Intended for transitioning from AVX to SSE, this helps early adopters benchmarking. The compiler should take care of that, one day. Change-Id: I9d7413f22f30f8dc0c632e8e806386d9ca8e8308 Reviewed-on: https://go-review.googlesource.com/c/go/+/701199 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: David Chase <drchase@google.com>
2025-09-05 | cmd/compile: optimize loads from readonly globals into constants on loong64 | Xiaolin Zhao
Ref: CL 141118 Update #26498 Change-Id: I9c4ad2bedc4d50bd273bbe9119a898d4fca95e45 Reviewed-on: https://go-review.googlesource.com/c/go/+/700875 Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-09-05 | cmd/compile: simplify specific addition operations using the ADDV16 instruction | Xiaolin Zhao
On loong64, the addi.d instruction can only directly handle 12-bit immediate numbers. If a larger immediate number needs to be processed, it must first be placed in a register, and then the add.d instruction is used to complete the processing of the larger immediate number. If a larger immediate number c satisfies is32Bit(c) && c&0xffff == 0, then the ADDV16 instruction can be used to complete the addition operation. Removes 164 instructions from the go binary on loong64. Change-Id: I404de93cc4eaaa12fe424f5a0d61b03231215d1a Reviewed-on: https://go-review.googlesource.com/c/go/+/700536 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
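The condition quoted in the commit message can be written out as a small predicate. The sketch below re-implements is32Bit locally and adds two hypothetical helpers, fitsImm12 and fitsADDV16; the signed 12-bit range used for addi.d is an assumption based on the "12-bit immediate" wording, and nothing here is the compiler's actual rule code.

```go
package main

import "fmt"

// is32Bit reports whether c fits in a signed 32-bit integer.
func is32Bit(c int64) bool {
	return c == int64(int32(c))
}

// fitsImm12 reports whether c fits in a signed 12-bit immediate
// (assumed range for addi.d).
func fitsImm12(c int64) bool {
	return -2048 <= c && c <= 2047
}

// fitsADDV16 mirrors the condition from the commit message:
// is32Bit(c) && c&0xffff == 0, i.e. a 32-bit constant whose low
// 16 bits are all zero.
func fitsADDV16(c int64) bool {
	return is32Bit(c) && c&0xffff == 0
}

func main() {
	for _, c := range []int64{100, 0x10000, 0x12345, -0x20000, 1 << 32} {
		fmt.Printf("%#x imm12=%v addv16=%v\n", c, fitsImm12(c), fitsADDV16(c))
	}
}
```

Constants that satisfy neither predicate would still need to be materialized in a register first, as the commit message describes.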
2025-09-05 | cmd/compile: simplify zerorange on mips64 | Julian Zhu
Change-Id: I488b55a21eaaf74373c2789a34bf9b3945ced072 Reviewed-on: https://go-review.googlesource.com/c/go/+/700936 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-09-04 | runtime, cmd/compile, cmd/internal/obj: remove duff support for loong64 | limeidan
Change-Id: I44d6452933c8010f7dfbf821a32053f9d1cf151e Reviewed-on: https://go-review.googlesource.com/c/go/+/700096 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2025-09-04 | path{,/filepath}: speed up Match | Julien Cretel
This change adds benchmarks for Match and speeds it up by simplifying scanChunk (to the point of making it inlineable) and eliminating some branches in matchChunk. Here are some benchmark results (no change to allocations): goos: darwin goarch: amd64 pkg: path cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz │ old │ new2 │ │ sec/op │ sec/op vs base │ Match/"abc"_"abc"-8 24.64n ± 6% 21.30n ± 1% -13.53% (p=0.000 n=10) Match/"*"_"abc"-8 9.117n ± 3% 8.345n ± 2% -8.47% (p=0.000 n=10) Match/"*c"_"abc"-8 29.59n ± 3% 29.86n ± 3% ~ (p=0.055 n=10) Match/"a*"_"a"-8 22.88n ± 2% 20.66n ± 1% -9.68% (p=0.000 n=10) Match/"a*"_"abc"-8 23.57n ± 1% 21.59n ± 0% -8.40% (p=0.000 n=10) Match/"a*"_"ab/c"-8 23.18n ± 1% 21.48n ± 1% -7.35% (p=0.000 n=10) Match/"a*/b"_"abc/b"-8 54.96n ± 1% 52.06n ± 0% -5.29% (p=0.000 n=10) Match/"a*/b"_"a/c/b"-8 34.50n ± 3% 31.42n ± 2% -8.91% (p=0.000 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxe/f"-8 125.4n ± 1% 114.9n ± 1% -8.45% (p=0.000 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxexxx/f"-8 151.1n ± 1% 149.4n ± 1% -1.09% (p=0.001 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxe/xxx/f"-8 127.8n ± 4% 115.6n ± 3% -9.51% (p=0.000 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxexxx/fff"-8 155.6n ± 1% 168.0n ± 11% ~ (p=0.143 n=10) Match/"a*b?c*x"_"abxbbxdbxebxczzx"-8 189.2n ± 1% 193.5n ± 6% ~ (p=1.000 n=10) Match/"a*b?c*x"_"abxbbxdbxebxczzy"-8 196.3n ± 1% 190.6n ± 2% -2.93% (p=0.001 n=10) Match/"ab[c]"_"abc"-8 39.50n ± 1% 37.84n ± 1% -4.20% (p=0.000 n=10) Match/"ab[b-d]"_"abc"-8 47.94n ± 1% 45.99n ± 1% -4.06% (p=0.000 n=10) Match/"ab[e-g]"_"abc"-8 49.48n ± 1% 47.57n ± 1% -3.85% (p=0.000 n=10) Match/"ab[^c]"_"abc"-8 41.55n ± 1% 38.01n ± 2% -8.51% (p=0.000 n=10) Match/"ab[^b-d]"_"abc"-8 50.48n ± 1% 46.35n ± 0% -8.17% (p=0.000 n=10) Match/"ab[^e-g]"_"abc"-8 50.12n ± 3% 47.37n ± 2% -5.47% (p=0.000 n=10) Match/"a\\*b"_"a*b"-8 24.59n ± 0% 21.77n ± 1% -11.47% (p=0.000 n=10) Match/"a\\*b"_"ab"-8 26.14n ± 1% 24.52n ± 3% -6.18% (p=0.000 n=10) Match/"a?b"_"a☺b"-8 25.66n ± 2% 23.40n ± 1% -8.81% (p=0.000 n=10) 
Match/"a[^a]b"_"a☺b"-8 41.14n ± 0% 38.97n ± 3% -5.29% (p=0.001 n=10) Match/"a???b"_"a☺b"-8 39.66n ± 6% 36.63n ± 6% -7.63% (p=0.003 n=10) Match/"a[^a][^a][^a]b"_"a☺b"-8 85.07n ± 8% 84.97n ± 1% ~ (p=0.971 n=10) Match/"[a-ζ]*"_"α"-8 49.45n ± 4% 48.34n ± 2% -2.23% (p=0.001 n=10) Match/"*[a-ζ]"_"A"-8 67.67n ± 3% 70.53n ± 2% +4.23% (p=0.000 n=10) Match/"a?b"_"a/b"-8 26.61n ± 5% 25.92n ± 1% -2.59% (p=0.000 n=10) Match/"a*b"_"a/b"-8 29.38n ± 3% 26.61n ± 2% -9.46% (p=0.000 n=10) Match/"[\\]a]"_"]"-8 40.75n ± 3% 39.31n ± 2% -3.52% (p=0.000 n=10) Match/"[\\-]"_"-"-8 30.58n ± 1% 30.53n ± 2% ~ (p=0.565 n=10) Match/"[x\\-]"_"x"-8 41.02n ± 2% 39.32n ± 1% -4.13% (p=0.000 n=10) Match/"[x\\-]"_"-"-8 40.74n ± 2% 39.82n ± 1% -2.25% (p=0.000 n=10) Match/"[x\\-]"_"z"-8 42.19n ± 1% 40.47n ± 1% -4.08% (p=0.001 n=10) Match/"[\\-x]"_"x"-8 40.77n ± 1% 39.61n ± 3% -2.86% (p=0.000 n=10) Match/"[\\-x]"_"-"-8 40.94n ± 2% 39.39n ± 0% -3.79% (p=0.000 n=10) Match/"[\\-x]"_"a"-8 42.12n ± 2% 42.11n ± 1% ~ (p=0.954 n=10) Match/"[]a]"_"]"-8 23.73n ± 2% 21.67n ± 2% -8.72% (p=0.000 n=10) Match/"[-]"_"-"-8 21.28n ± 1% 19.96n ± 1% -6.16% (p=0.000 n=10) Match/"[x-]"_"x"-8 28.96n ± 1% 27.98n ± 3% -3.37% (p=0.000 n=10) Match/"[x-]"_"-"-8 29.46n ± 2% 29.27n ± 7% ~ (p=0.796 n=10) Match/"[x-]"_"z"-8 29.14n ± 0% 30.78n ± 9% ~ (p=0.468 n=10) Match/"[-x]"_"x"-8 23.62n ± 3% 22.08n ± 1% -6.54% (p=0.000 n=10) Match/"[-x]"_"-"-8 23.70n ± 2% 22.11n ± 2% -6.69% (p=0.000 n=10) Match/"[-x]"_"a"-8 23.71n ± 0% 22.09n ± 3% -6.83% (p=0.000 n=10) Match/"\\"_"a"-8 11.845n ± 2% 9.496n ± 3% -19.83% (p=0.000 n=10) Match/"[a-b-c]"_"a"-8 39.85n ± 1% 40.69n ± 1% +2.10% (p=0.000 n=10) Match/"["_"a"-8 18.57n ± 3% 16.47n ± 3% -11.33% (p=0.000 n=10) Match/"[^"_"a"-8 20.30n ± 3% 17.96n ± 1% -11.53% (p=0.000 n=10) Match/"[^bc"_"a"-8 36.81n ± 7% 32.87n ± 2% -10.72% (p=0.000 n=10) Match/"a["_"a"-8 20.11n ± 1% 19.41n ± 1% -3.46% (p=0.000 n=10) Match/"a["_"ab"-8 23.19n ± 2% 21.80n ± 1% -6.02% (p=0.000 n=10) Match/"a["_"x"-8 20.79n ± 1% 19.51n ± 
1% -6.13% (p=0.000 n=10) Match/"a/b["_"x"-8 29.80n ± 0% 28.36n ± 0% -4.82% (p=0.000 n=10) Match/"*x"_"xxx"-8 30.77n ± 2% 28.55n ± 2% -7.23% (p=0.000 n=10) geomean 37.11n 35.15n -5.29% pkg: path/filepath │ old │ new2 │ │ sec/op │ sec/op vs base │ Match/"abc"_"abc"-8 22.79n ± 3% 21.62n ± 1% -5.11% (p=0.000 n=10) Match/"*"_"abc"-8 11.410n ± 3% 9.595n ± 1% -15.91% (p=0.000 n=10) Match/"*c"_"abc"-8 29.69n ± 2% 29.09n ± 0% -2.00% (p=0.001 n=10) Match/"a*"_"a"-8 23.69n ± 1% 21.12n ± 1% -10.83% (p=0.000 n=10) Match/"a*"_"abc"-8 25.38n ± 4% 22.41n ± 5% -11.70% (p=0.000 n=10) Match/"a*"_"ab/c"-8 24.73n ± 1% 21.85n ± 3% -11.65% (p=0.000 n=10) Match/"a*/b"_"abc/b"-8 53.73n ± 2% 53.30n ± 2% -0.82% (p=0.037 n=10) Match/"a*/b"_"a/c/b"-8 32.97n ± 2% 31.64n ± 0% -4.02% (p=0.000 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxe/f"-8 120.7n ± 0% 124.0n ± 1% +2.82% (p=0.000 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxexxx/f"-8 149.4n ± 2% 160.0n ± 6% +7.13% (p=0.009 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxe/xxx/f"-8 121.8n ± 2% 122.8n ± 9% ~ (p=0.898 n=10) Match/"a*b*c*d*e*/f"_"axbxcxdxexxx/fff"-8 149.7n ± 1% 151.2n ± 2% +1.04% (p=0.005 n=10) Match/"a*b?c*x"_"abxbbxdbxebxczzx"-8 187.8n ± 1% 184.8n ± 1% -1.57% (p=0.001 n=10) Match/"a*b?c*x"_"abxbbxdbxebxczzy"-8 195.6n ± 1% 191.2n ± 2% -2.22% (p=0.018 n=10) Match/"ab[c]"_"abc"-8 40.30n ± 4% 37.31n ± 3% -7.41% (p=0.000 n=10) Match/"ab[b-d]"_"abc"-8 47.92n ± 4% 45.92n ± 4% -4.17% (p=0.002 n=10) Match/"ab[e-g]"_"abc"-8 47.58n ± 4% 45.22n ± 0% -4.98% (p=0.000 n=10) Match/"ab[^c]"_"abc"-8 40.33n ± 2% 36.95n ± 2% -8.38% (p=0.000 n=10) Match/"ab[^b-d]"_"abc"-8 48.68n ± 2% 46.49n ± 3% -4.50% (p=0.001 n=10) Match/"ab[^e-g]"_"abc"-8 49.11n ± 2% 48.42n ± 2% -1.42% (p=0.014 n=10) Match/"a\\*b"_"a*b"-8 26.79n ± 9% 23.70n ± 1% -11.53% (p=0.000 n=10) Match/"a\\*b"_"ab"-8 23.84n ± 11% 25.99n ± 1% +9.02% (p=0.015 n=10) Match/"a?b"_"a☺b"-8 25.72n ± 2% 23.70n ± 1% -7.87% (p=0.000 n=10) Match/"a[^a]b"_"a☺b"-8 41.72n ± 2% 39.24n ± 2% -5.94% (p=0.000 n=10) Match/"a???b"_"a☺b"-8 
35.94n ± 1% 35.30n ± 3% -1.78% (p=0.009 n=10) Match/"a[^a][^a][^a]b"_"a☺b"-8 82.17n ± 0% 84.56n ± 3% +2.91% (p=0.000 n=10) Match/"[a-ζ]*"_"α"-8 52.01n ± 3% 49.78n ± 0% -4.30% (p=0.000 n=10) Match/"*[a-ζ]"_"A"-8 65.62n ± 1% 66.91n ± 3% +1.97% (p=0.000 n=10) Match/"a?b"_"a/b"-8 24.60n ± 2% 25.60n ± 1% +4.07% (p=0.001 n=10) Match/"a*b"_"a/b"-8 28.00n ± 1% 26.50n ± 1% -5.37% (p=0.000 n=10) Match/"[\\]a]"_"]"-8 40.72n ± 1% 39.55n ± 2% -2.86% (p=0.002 n=10) Match/"[\\-]"_"-"-8 30.52n ± 2% 30.15n ± 1% -1.18% (p=0.003 n=10) Match/"[x\\-]"_"x"-8 40.69n ± 1% 40.41n ± 2% ~ (p=0.105 n=10) Match/"[x\\-]"_"-"-8 40.50n ± 2% 40.69n ± 2% ~ (p=0.109 n=10) Match/"[x\\-]"_"z"-8 40.80n ± 3% 39.92n ± 2% -2.17% (p=0.002 n=10) Match/"[\\-x]"_"x"-8 40.73n ± 2% 43.78n ± 5% +7.49% (p=0.003 n=10) Match/"[\\-x]"_"-"-8 40.86n ± 2% 39.58n ± 9% ~ (p=0.105 n=10) Match/"[\\-x]"_"a"-8 40.67n ± 0% 40.85n ± 3% ~ (p=0.288 n=10) Match/"[]a]"_"]"-8 22.71n ± 1% 20.35n ± 2% -10.37% (p=0.000 n=10) Match/"[-]"_"-"-8 20.98n ± 2% 19.71n ± 2% -6.10% (p=0.000 n=10) Match/"[x-]"_"x"-8 30.78n ± 1% 26.29n ± 1% -14.56% (p=0.000 n=10) Match/"[x-]"_"-"-8 30.79n ± 2% 26.38n ± 3% -14.29% (p=0.000 n=10) Match/"[x-]"_"z"-8 30.77n ± 0% 26.47n ± 1% -13.97% (p=0.000 n=10) Match/"[-x]"_"x"-8 23.73n ± 1% 21.08n ± 1% -11.19% (p=0.000 n=10) Match/"[-x]"_"-"-8 23.68n ± 1% 21.11n ± 3% -10.81% (p=0.000 n=10) Match/"[-x]"_"a"-8 23.74n ± 3% 21.04n ± 0% -11.37% (p=0.000 n=10) Match/"\\"_"a"-8 11.57n ± 0% 10.35n ± 0% -10.63% (p=0.000 n=10) Match/"[a-b-c]"_"a"-8 42.46n ± 1% 38.97n ± 3% -8.23% (p=0.000 n=10) Match/"["_"a"-8 18.03n ± 7% 15.95n ± 1% -11.51% (p=0.000 n=10) Match/"[^"_"a"-8 19.20n ± 11% 17.96n ± 3% -6.41% (p=0.000 n=10) Match/"[^bc"_"a"-8 32.82n ± 3% 32.34n ± 0% -1.45% (p=0.000 n=10) Match/"a["_"a"-8 19.48n ± 2% 19.48n ± 2% ~ (p=0.670 n=10) Match/"a["_"ab"-8 22.19n ± 3% 22.09n ± 1% ~ (p=0.148 n=10) Match/"a["_"x"-8 20.32n ± 3% 19.66n ± 1% -3.27% (p=0.001 n=10) Match/"a/b["_"x"-8 28.68n ± 1% 28.52n ± 1% ~ (p=0.315 n=10) 
Match/"*x"_"xxx"-8 29.95n ± 3% 29.27n ± 3% -2.27% (p=0.006 n=10) geomean 36.76n 35.10n -4.51% Change-Id: I02d07b7a066e5789587035180fa15ae07a9579a6 GitHub-Last-Rev: 8b314b430920f9636088d0291adf8d518fc56b0a GitHub-Pull-Request: golang/go#75211 Reviewed-on: https://go-review.googlesource.com/c/go/+/700315 Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Kirill Kolyshkin <kolyshkin@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-09-04 | cmd/compile/internal/ssa: load constant values from abi.PtrType.Elem | Youlin Feng
This CL makes the generated code for reflect.TypeFor as simple as an intrinsic function. Fixes #75203 Change-Id: I7bb48787101f07e77ab5c583292e834c28a028d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/700336 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Keith Randall <khr@golang.org>
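As an illustration of why the Elem field of a pointer type matters here, the sketch below obtains a type through the common pointer-then-Elem idiom and compares it with reflect.TypeFor. The helpers elemIdiom and sameType are hypothetical, and the assumption that this idiom is the code path affected by the CL is mine rather than something stated in the commit message.

```go
package main

import (
	"fmt"
	"reflect"
)

// elemIdiom obtains the reflect.Type for T by taking the type of a nil *T
// and asking for its element type. This also works for interface types.
func elemIdiom[T any]() reflect.Type {
	return reflect.TypeOf((*T)(nil)).Elem()
}

// sameType reports whether the idiom and reflect.TypeFor agree for T.
func sameType[T any]() bool {
	return elemIdiom[T]() == reflect.TypeFor[T]()
}

func main() {
	fmt.Println(sameType[int](), sameType[error]())
}
```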
2025-09-04 | cmd/compile: add store to load forwarding rules on riscv64 | Michael Munday
Adds the equivalent integer variants of the floating point rules added in CL 599235. Change-Id: Ibe03c26383059821d99cea2337799e6416b4c581 Reviewed-on: https://go-review.googlesource.com/c/go/+/700175 Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Julian Zhu <jz531210@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2025-09-03 | cmd/compile: export to DWARF types only referenced through interfaces | Alessandro Arzilli
Delve and viewcore use DWARF type DIEs to display and explore the runtime value of interface variables. This has always been slightly problematic since the runtime type of an interface variable might only be reachable through interfaces and thus be missing from debug_info (see issue #46670). Prior to commit f4de2ecf this was not a severe problem since a struct literal caused the allocation of a struct into an autotemp variable, which was then used by dwarfgen to make sure that the DIE for that type would be generated. After f4de2ecf such autotemps are no longer being generated and go1.25.0 ends up having many more instances of interfaces with unreadable runtime type (https://github.com/go-delve/delve/issues/4080). This commit fixes this problem by scanning the relocation of the function symbol and adding to the function's DIE symbol references to all types used by the function to create interfaces. Fixes go-delve/delve#4080 Updates #46670 Change-Id: I3e9db1c0d1662905373239816a72604ac533b09e Reviewed-on: https://go-review.googlesource.com/c/go/+/696955 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Than McIntosh <thanm@golang.org> Reviewed-by: Florian Lehner <lehner.florian86@gmail.com>
2025-09-03 | cmd/compile: use generated loops instead of DUFFCOPY on loong64 | limeidan
Change-Id: If9da2b5681e5d05d7c3d51f003f1fe662d3feaec Reviewed-on: https://go-review.googlesource.com/c/go/+/699855 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-09-03 | cmd/compile: simplify memory load and store operations on loong64 | Xiaolin Zhao
If the memory load and store operations use the same ptr, they are combined into a direct move operation between registers, like riscv64. Change-Id: I889e51a5146aee7f15340114bc4abd12eb6b8a1f Reviewed-on: https://go-review.googlesource.com/c/go/+/700535 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-09-03cmd/compile: simplify the support for 32bit high multiply on loong64Xiaolin Zhao
Removes 152 instructions from the Go binary on loong64. Change-Id: Icf8ead4f4ca965f51add85ac5e45c3cca8916401 Reviewed-on: https://go-review.googlesource.com/c/go/+/700335 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn>
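The commit message does not describe the new lowering, so the following is only a reference model of what a 32-bit high multiply computes: the upper 32 bits of the full 64-bit product. The helper names `hmul32` and `hmul32u` are hypothetical.

```go
package main

import "fmt"

// hmul32 returns the high 32 bits of the signed 64-bit product of x and y.
func hmul32(x, y int32) int32 {
	return int32((int64(x) * int64(y)) >> 32)
}

// hmul32u returns the high 32 bits of the unsigned 64-bit product of x and y.
func hmul32u(x, y uint32) uint32 {
	return uint32((uint64(x) * uint64(y)) >> 32)
}

func main() {
	fmt.Println(hmul32(1<<30, 4))
	fmt.Println(hmul32u(0xFFFFFFFF, 0xFFFFFFFF))
}
```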
2025-09-03[dev.simd] all: merge master (4c4cefc) into dev.simdCherry Mui
Merge List:

+ 2025-09-03 4c4cefc19a cmd/gofmt: simplify logic to process arguments
+ 2025-09-03 925a3cdcd1 unicode/utf8: make DecodeRune{,InString} inlineable
+ 2025-09-03 3e596d448f math: rename Modf parameter int to integer
+ 2025-09-02 2a7f1d47b0 runtime: use one more address bit for tagged pointers
+ 2025-09-02 b09068041a cmd/dist: run racebench tests only in longtest mode
+ 2025-09-02 355370ac52 runtime: add comment for concatstring2
+ 2025-09-02 1eec830f54 go/doc: linkify interface methods
+ 2025-08-31 7bba745820 cmd/compile: use generated loops instead of DUFFZERO on loong64
+ 2025-08-31 882335e2cb cmd/internal/obj/loong64: add LDPTR.{W/D} and STPTR.{W/D} instructions support
+ 2025-08-31 d4b17f5869 internal/runtime/atomic: reset wrong jump target in Cas{,64} on loong64
+ 2025-08-31 6a08e80399 net/http: skip redirecting in ServeMux when URL path for CONNECT is empty
+ 2025-08-29 8bcda6c79d runtime/race: add race detector support for linux/riscv64
+ 2025-08-29 8377adafc5 cmd/cgo: split loadDWARF into two parts
+ 2025-08-29 a7d9d5a80a cmd/cgo: move typedefs and typedefList out of Package
+ 2025-08-29 1d459c4357 all: delete more windows/arm remnants
+ 2025-08-29 27ce6e4e26 cmd/compile: remove sign extension before MULW on riscv64
+ 2025-08-29 84b070bfb1 cmd/compile/internal/ssa: make oneBit function generic
+ 2025-08-29 fe42628dae internal/cpu: inline DebugOptions
+ 2025-08-29 94b7d519bd net: update document on limitation of iprawsock on Windows
+ 2025-08-29 ba9e1ddccf testing: allow specify temp dir by GOTMPDIR environment variable
+ 2025-08-29 9f6936b8da cmd/link: disallow linkname of runtime.addmoduledata
+ 2025-08-29 89d41d254a bytes, strings: speed up TrimSpace
+ 2025-08-29 38204e0872 testing/synctest: call out common issues with tests
+ 2025-08-29 252c901125 os,syscall: pass file flags to CreateFile on Windows
+ 2025-08-29 53515fb0a9 crypto/tls: use hash.Cloner
+ 2025-08-28 13bb48e6fb go/constant: fix complex != unknown comparison
+ 2025-08-28 ba1109feb5 net: remove redundant cgoLookupCNAME return parameter
+ 2025-08-28 f74ed44ed9 net/http/httputil: remove redundant pw.Close() call in DumpRequestOut
+ 2025-08-28 a9689d2e0b time: skip TestLongAdjustTimers in short mode on single CPU systems
+ 2025-08-28 ebc763f76d syscall: only get parent PID if SysProcAttr.Pdeathsig is set
+ 2025-08-28 7f1864b0a8 strings: remove redundant "runs" from string.Fields docstring
+ 2025-08-28 90c21fa5b6 net/textproto: eliminate some bounds checks
+ 2025-08-27 e47d88beae os: return nil slice when ReadDir is used with a file on file_windows
+ 2025-08-27 6b837a64db cmd/internal/obj/loong64: simplify buildop
+ 2025-08-27 765905e3bd debug/elf: don't panic if symtab too small
+ 2025-08-27 2ee4b31242 net/http: Ensure that CONNECT proxied requests respect MaxResponseHeaderBytes
+ 2025-08-27 b21867b1a2 net/http: require exact match for CrossSiteProtection bypass patterns
+ 2025-08-27 d19e377f6e cmd/cgo: make it safe to run gcc in parallel
+ 2025-08-27 49a2f3ed87 net: allow zero value destination address in WriteMsgUDPAddrPort
+ 2025-08-26 afc51ed007 internall/poll: remove bufs field from Windows' poll.operation
+ 2025-08-26 801b74eb95 internal/poll: remove rsa field from Windows' poll.operation
+ 2025-08-26 fa18c547cd syscall: sort Windows env block in StartProcess
+ 2025-08-26 bfd130db02 internal/poll: don't use stack-allocated WSAMsg parameters
+ 2025-08-26 dae9e456ae runtime: identify virtual memory layout for riscv64
+ 2025-08-25 25c2d4109f math: use Trunc to implement Modf
+ 2025-08-25 4e05a070c4 math: implement IsInf using Abs
+ 2025-08-25 1eed4f32a0 math: optimize Signbit implementation slightly
+ 2025-08-25 bd71b94659 cmd/compile/internal: optimizing add+sll rule using ALSLV instruction on loong64
+ 2025-08-25 ea55ca3600 runtime: skip doInit of plugins in runtime.main
+ 2025-08-25 9ae2f1fb57 internal/trace: skip async preempt off tests on low end systems
+ 2025-08-25 bbd5342a62 net: fix cgoResSearch
+ 2025-08-25 ed7f804775 os: set full name for Roots created with Root.OpenRoot
+ 2025-08-25 a21249436b internal/poll: use fdMutex to provide read/write locking on Windows
+ 2025-08-24 44c5956bf7 test/codegen: add Mul2 and DivPow2 test for loong64
+ 2025-08-24 0aa8019e94 test/codegen: add Mul* test for loong64
+ 2025-08-24 83420974b7 test/codegen: add sqrt* abs and copysign test for loong64
+ 2025-08-23 f2db0dca0b net/http/httptest: redirect example.com requests to server
+ 2025-08-22 d86ec92499 internal/syscall/windows: increase internal Windows O_ flags values
+ 2025-08-22 9d3f7fda70 crypto/tls: fix quic comment typo
+ 2025-08-22 78a05c541f internal/poll: don't pass non-nil WSAMsg.Name with 0 namelen on windows
+ 2025-08-22 52c3f73fda runtime/metrics: improve doc
+ 2025-08-22 a076f49757 os: fix Root.MkdirAll to handle race of directory creation
+ 2025-08-22 98238fd495 all: delete remaining windows/arm code
+ 2025-08-21 1ad30844d9 cmd/asm: process forward jump to PCALIGN
+ 2025-08-21 13c082601d internal/poll: permit nil destination address in WriteMsg{Inet4,Inet6}
+ 2025-08-21 9b0a507735 runtime: remove remaining windows/arm files and comments
+ 2025-08-21 1843f1e9c0 cmd/compile: use zero register instead of specialized *zero instructions on loong64
+ 2025-08-21 e0870a0a12 cmd/compile: simplify zerorange on loong64
+ 2025-08-21 fb8bbe46d5 cmd/compile/internal/ssa: eliminate unnecessary extension operations
+ 2025-08-21 9632ba8160 cmd/compile: optimize some patterns into revb2h/revb4h instruction on loong64
+ 2025-08-21 8dcab6f450 syscall: simplify execve handling on libc platforms
+ 2025-08-21 ba840c1bf9 cmd/compile: deduplication in the source code generated by mknode
+ 2025-08-21 fa706ea50f cmd/compile: optimize rule (x + x) << c to x << c+1 on loong64
+ 2025-08-21 ffc85ee1f1 cmd/internal/objabi,cmd/link: add support for additional riscv64 relocations

Change-Id: I3896f74b1a3cc0a52b29ca48767bb0ba84620f71
2025-09-03unicode/utf8: make DecodeRune{,InString} inlineableJulien Cretel
This change makes the fast path for ASCII characters inlineable in DecodeRune and DecodeRuneInString and removes most instances of manual inlining at call sites. Here are some benchmark results (no change to allocations):

goos: darwin
goarch: amd64
pkg: unicode/utf8
cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
                             │     old      │                 new                 │
                             │    sec/op    │    sec/op     vs base               │
DecodeASCIIRune-8              2.4545n ± 2%   0.6253n ± 2%  -74.52% (p=0.000 n=20)
DecodeJapaneseRune-8            3.988n ± 1%    4.023n ± 1%   +0.86% (p=0.050 n=20)
DecodeASCIIRuneInString-8      2.4675n ± 1%   0.6264n ± 2%  -74.61% (p=0.000 n=20)
DecodeJapaneseRuneInString-8    3.992n ± 1%    4.001n ± 1%        ~ (p=0.625 n=20)
geomean                         3.134n         1.585n       -49.43%

Note: when #61502 gets resolved, DecodeRune and DecodeRuneInString should be reverted to their idiomatic implementations.

Fixes #31666 Updates #48195 Change-Id: I4be25c4f52417dc28b3a7bd72f1b04018470f39d GitHub-Last-Rev: 2e352a0045027e059be79cdb60241b5cf35fec71 GitHub-Pull-Request: golang/go#75181 Reviewed-on: https://go-review.googlesource.com/c/go/+/699675 Reviewed-by: Sean Liao <sean@liao.dev> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
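To make "manual inlining at call sites" concrete, the sketch below contrasts a caller that hand-writes the ASCII fast path with one that simply calls `utf8.DecodeRuneInString`. The two counting functions are hypothetical examples, not code from the CL; they should return the same result, and the difference is only in how much the caller has to write.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countRunesManual hand-inlines the ASCII fast path: bytes below
// utf8.RuneSelf are single-byte runes, so the decoder is only called for
// multi-byte sequences.
func countRunesManual(s string) int {
	n := 0
	for i := 0; i < len(s); {
		size := 1
		if s[i] >= utf8.RuneSelf {
			_, size = utf8.DecodeRuneInString(s[i:])
		}
		i += size
		n++
	}
	return n
}

// countRunesDirect relies on the library call alone. If the fast path is
// inlineable, the caller no longer needs the explicit ASCII check.
func countRunesDirect(s string) int {
	n := 0
	for i := 0; i < len(s); {
		_, size := utf8.DecodeRuneInString(s[i:])
		i += size
		n++
	}
	return n
}

func main() {
	s := "héllo, 世界"
	fmt.Println(countRunesManual(s), countRunesDirect(s))
}
```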
2025-09-02[dev.simd] simd, cmd/compile: add Interleave{Hi,Lo} (VPUNPCK*)David Chase
These are building blocks for transpose; their best names are not settled yet. Change-Id: I3800a55de9fa7fde2590ca822894c8a75387dec3 Reviewed-on: https://go-review.googlesource.com/c/go/+/698576 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
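As a scalar model of what an unpack-style interleave does, the sketch below mirrors the 128-bit VPUNPCKLDQ/VPUNPCKHDQ behavior on four 32-bit lanes. The function names are hypothetical, and it is an assumption that the new simd methods behave exactly like this model; it does not use the simd package itself.

```go
package main

import "fmt"

// interleaveLo models VPUNPCKLDQ on one 128-bit lane group: it alternates
// elements from the low halves of a and b.
func interleaveLo(a, b [4]uint32) [4]uint32 {
	return [4]uint32{a[0], b[0], a[1], b[1]}
}

// interleaveHi models VPUNPCKHDQ on one 128-bit lane group: it alternates
// elements from the high halves of a and b.
func interleaveHi(a, b [4]uint32) [4]uint32 {
	return [4]uint32{a[2], b[2], a[3], b[3]}
}

func main() {
	a := [4]uint32{0, 1, 2, 3}
	b := [4]uint32{10, 11, 12, 13}
	fmt.Println(interleaveLo(a, b), interleaveHi(a, b))
}
```

Applying the two operations to pairs of rows groups elements that share a column index, which is why they are useful as steps in a transpose.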
2025-09-02[dev.simd] cmd/compile: add instructions and rewrites for scalar-> vector movesDavid Chase
This required changes to the assembler so that VMOVSS and VMOVSD could handle FP constants. Change-Id: Iaa2f8df71867a3283bc058b7ec691b56a3e73621 Reviewed-on: https://go-review.googlesource.com/c/go/+/698240 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-31cmd/compile: use generated loops instead of DUFFZERO on loong64limeidan
Change-Id: Id43ee4353d4bac96627f8b0f54545cdd3d2a1d1b Reviewed-on: https://go-review.googlesource.com/c/go/+/699695 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn>
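For context, zeroing a value of known size is the kind of source that a backend may lower to a Duff's-device call or to an inline loop. The sketch below is an assumed example; the type `buf` and its 512-byte size are hypothetical, and the commit does not state which sizes are affected.

```go
package main

import "fmt"

// buf is a hypothetical fixed-size buffer whose size is known at compile time.
type buf [512]byte

// zeroBuf clears the whole buffer by assigning the zero value. The backend
// decides whether this becomes inline stores, a generated loop, or a call.
func zeroBuf(p *buf) {
	*p = buf{}
}

func main() {
	var b buf
	for i := range b {
		b[i] = 0xFF
	}
	zeroBuf(&b)
	fmt.Println(b == buf{})
}
```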
2025-08-29cmd/compile: remove sign extension before MULW on riscv64Michael Munday
These sign extensions aren't necessary. Change-Id: Id68ea596a18b30949d4605b2885f02e49e42d8e1 Reviewed-on: https://go-review.googlesource.com/c/go/+/699595 Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
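The reasoning can be checked in plain Go: the low 32 bits of a product depend only on the low 32 bits of the operands, so sign-extending the inputs first cannot change a 32-bit result. The sketch below is a model of that arithmetic fact with hypothetical function names; it is not the rewrite rule from the CL.

```go
package main

import "fmt"

// mulLow32 multiplies two 64-bit values and keeps only the low 32 bits of
// the product, which is the part a 32-bit multiply produces.
func mulLow32(x, y int64) int32 {
	return int32(x * y)
}

// mulLow32Ext first sign-extends the low 32 bits of each operand, then does
// the same multiply. The extra extension does not affect the low 32 bits.
func mulLow32Ext(x, y int64) int32 {
	return int32(int64(int32(x)) * int64(int32(y)))
}

func main() {
	x, y := int64(0x1_2345_6789), int64(-7)
	fmt.Println(mulLow32(x, y) == mulLow32Ext(x, y))
}
```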
2025-08-29cmd/compile/internal/ssa: make oneBit function genericMichael Munday
Allows rewrite rules using oneBit to be made more compact. Change-Id: I986715f77db5b548759d809fe668e1893048f25c Reviewed-on: https://go-review.googlesource.com/c/go/+/699295 Reviewed-by: Youlin Feng <fengyoulin@live.com> Commit-Queue: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
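A minimal sketch of what a generic one-bit test can look like follows. It assumes a constraint over the signed integer types; the actual helper in cmd/compile/internal/ssa may be written differently.

```go
package main

import "fmt"

// oneBit reports whether exactly one bit is set in x. Clearing the lowest
// set bit with x&(x-1) leaves zero only for zero and for single-bit values,
// so zero is excluded explicitly. Wraparound in x-1 makes this work for the
// minimum value of each type as well.
func oneBit[T int8 | int16 | int32 | int64](x T) bool {
	return x&(x-1) == 0 && x != 0
}

func main() {
	fmt.Println(oneBit(int16(1024)), oneBit(int32(6)), oneBit(int8(-128)))
}
```

With one generic function, a rule that previously needed a width-specific helper per type can call the same name for every width.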
2025-08-25cmd/compile/internal: optimizing add+sll rule using ALSLV instruction on loong64limeidan
Reduce the number of go toolchain instructions on loong64 as follows:

file        before    after     Δ        %
go          1573148   1571708   -1,440   -0.0915%
gofmt       320578    320090    -488     -0.1522%
asm         555066    554406    -660     -0.1189%
cgo         481566    480926    -640     -0.1329%
compile     2475962   2473880   -2,082   -0.0841%
cover       516536    515920    -616     -0.1193%
link        702172    701404    -768     -0.1094%
preprofile  238626    238274    -352     -0.1475%
vet         792928    792100    -828     -0.1044%

Change-Id: I61e462726835959c60e1b4e5256d4020202418ab Reviewed-on: https://go-review.googlesource.com/c/go/+/693877 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn>
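The add+sll pattern shows up in ordinary address arithmetic, where an index is scaled by a power of two and added to a base. The sketch below is an assumed example of such source; `addShift` is a hypothetical name, and whether this exact expression is matched by the new rule depends on the shift amounts the instruction supports.

```go
package main

import "fmt"

// addShift computes base + idx*8 as an add of a shifted value. A backend
// with a fused shift-and-add instruction can emit one instruction here
// instead of a separate shift followed by an add.
func addShift(base, idx int64) int64 {
	return base + idx<<3
}

func main() {
	fmt.Println(addShift(100, 5))
}
```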
2025-08-22[dev.simd] cmd/compile: sample peephole optimization for SIMD broadcastDavid Chase
After tinkering and rewrite, this also optimizes some instances of SetElem(0). Change-Id: Ibba2d50a56b68ccf9de517ef24ca52b64c6c5b2c Reviewed-on: https://go-review.googlesource.com/c/go/+/696376 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-22[dev.simd] cmd/compile: remove VPADDD4Cherry Mui
It is from my sample SIMD compilation, not used in the real thing. The actual operation is VPADDD128. Also clean up some of my XXX comments. Change-Id: Ic20a9dd3c8531e25d88ba045ccef70cb856790d8 Reviewed-on: https://go-review.googlesource.com/c/go/+/698475 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-22[dev.simd] cmd/compile: correct register mask of some AVX512 opsCherry Mui
Change-Id: Ifce9d6667955c9b16b1cd78d6dd216a9c568c17a Reviewed-on: https://go-review.googlesource.com/c/go/+/698239 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-22[dev.simd] cmd/compile: use X15 for zero value in AVX contextCherry Mui
With the previous CL, the X15 (aliased with Y15, Z15) register holds the zero value for the whole register width. Use that in AVX context when a zero value is needed. Change-Id: If49b7059bce50c5e86f90bace0eaa830a91fa0fc Reviewed-on: https://go-review.googlesource.com/c/go/+/698238 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> TryBot-Bypass: Cherry Mui <cherryyz@google.com>
2025-08-22[dev.simd] cmd/compile: ensure the whole X15 register is zeroedCherry Mui
On AMD64, we reserve the X15 register as the zero register. Currently we use an SSE instruction to zero it, and we only use it in SSE contexts. When the machine supports AVX, the high bits of the register are not necessarily zeroed. Now that the compiler generates AVX code for SIMD, it would be great to have a zero register in the AVX context. This CL zeroes the whole X15 register if AVX is supported. Change-Id: I4dc803362f2e007b1614b90de435fbb7814cebc7 Reviewed-on: https://go-review.googlesource.com/c/go/+/698237 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-22[dev.simd] cmd/compile, simd: complete AVX2? u?int shufflesJunyang Shao
The names follow this convention:
- If the indices come from a constant, append "Constant" to the name.
- If the indices are used by multiple groups, append "Grouped" to the name.
- If it indexes only the low part, append "Lo"; similarly, append "Hi" for the high part.
Change-Id: I6a58f5dae54c882ebd59f39b5288f6f3f14d957f Reviewed-on: https://go-review.googlesource.com/c/go/+/698296 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-22[dev.simd] cmd/compile, simd: make Permute 128-bit use AVX VPSHUFBJunyang Shao
Change-Id: Ib89f602f797065e411eb0cbc95ccf2748b25fdec Reviewed-on: https://go-review.googlesource.com/c/go/+/698295 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-22[dev.simd] cmd/compile, simd: add packed saturated u?int conversionsJunyang Shao
This CL should complete the conversions between int and uint. Change-Id: I46742a62214f346e014a68b9c72a9b116a127f67 Reviewed-on: https://go-review.googlesource.com/c/go/+/698236 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: David Chase <drchase@google.com> Reviewed-by: David Chase <drchase@google.com>
2025-08-22[dev.simd] cmd/compile, simd: add saturated u?int conversionsJunyang Shao
Change-Id: I0c7f2d7ec31c59c95568ff8d4560989de849427e Reviewed-on: https://go-review.googlesource.com/c/go/+/698235 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
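For readers unfamiliar with the term, a saturated conversion clamps out-of-range inputs to the nearest representable value instead of wrapping. The sketch below is a scalar model of that behavior for a single element; the function name is hypothetical, and it does not use or describe the simd package API.

```go
package main

import (
	"fmt"
	"math"
)

// saturateInt32ToInt16 narrows x to int16, clamping values outside the
// int16 range to the nearest bound rather than discarding high bits.
func saturateInt32ToInt16(x int32) int16 {
	if x > math.MaxInt16 {
		return math.MaxInt16
	}
	if x < math.MinInt16 {
		return math.MinInt16
	}
	return int16(x)
}

func main() {
	fmt.Println(saturateInt32ToInt16(70000), saturateInt32ToInt16(-70000), saturateInt32ToInt16(123))
}
```

A vector version applies the same clamp to every lane independently.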
2025-08-21cmd/compile: use zero register instead of specialized *zero instructions on loong64limeidan
Refer to CL 633075; loong64 has a zero (R0) register that can be used to do this. Change-Id: I846c6bdfcfd6dbfa18338afc13e34e350580ead4 Reviewed-on: https://go-review.googlesource.com/c/go/+/693876 Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org>
2025-08-21cmd/compile: simplify zerorange on loong64limeidan
Following CL 678936, do the same thing on loong64. Change-Id: I156a9110a034878192f64baf8018115424aa5f0a Reviewed-on: https://go-review.googlesource.com/c/go/+/697957 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-21cmd/compile/internal/ssa: eliminate unnecessary extension operationslimeidan
Reduce the number of go toolchain instructions on loong64 as follows:

file        before    after     Δ       %
go          1598706   1597230   -1476   -0.0923%
gofmt       325180    324736    -444    -0.1365%
asm         562538    562098    -440    -0.0782%
cgo         488298    487634    -664    -0.1360%
compile     2504502   2503590   -912    -0.0364%
cover       525976    525312    -664    -0.1262%
link        714182    713226    -956    -0.1339%
preprofile  241308    240988    -320    -0.1326%
vet         794112    793316    -796    -0.1002%

Change-Id: I048ef79518b41e83c53da1a3a6b7edaca7cb63f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/693856 Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Carlos Amedee <carlos@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>