path: root/src/cmd/internal
Age | Commit message | Author
2024-11-14 | cmd/internal/obj/fips: mark R_ADDRPOWR_GOT as a pcrel relocation | Paul E. Murphy
It's actually a TOC-relative relocation, but those are also accepted as pcrel relocations here. This fixes compilation on GOPPC64 <= power9. Change-Id: I235125a76f59ab26c6c753540cfaeb398f9c105d Reviewed-on: https://go-review.googlesource.com/c/go/+/628157 Auto-Submit: Paul Murphy <murp@ibm.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-14 | all: enable FIPS verification code | Russ Cox
Previous CLs committed changes to cmd/compile, cmd/link, and crypto/internal/fips/check behind boolean flags. Turn those flags on, to enable the CLs. This is a separate, trivial CL for easier rollback. For #69536. Change-Id: I68206bae0b7d7ad5c8758267d1a2e68853b63644 Reviewed-on: https://go-review.googlesource.com/c/go/+/626000 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-13 | cmd/internal/obj/wasm: correct return PC for frameless wasmexport wrappers | Cherry Mui
For a wasmexport wrapper, we generate a call to the actual exported Go function, and use the wrapper function's PC 1 as the (fake) return address. This address is not used for returning, which is handled by the Wasm call stack. It is used for stack unwinding, and PC 1 makes it past the prologue and therefore has the right SP delta. But if the function has no arguments and results, the wrapper is frameless, with no prologue, and PC 1 doesn't exist. This causes the unwinder to fail. In this case, we put PC 0, which also has the correct SP delta (0). Fixes #69584. Change-Id: Ic047a6e62100db540b5099cc5a56a1d0f16d58b9 Reviewed-on: https://go-review.googlesource.com/c/go/+/624000 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-13 | cmd/internal/obj: add tool to generate Cnames string | chenguoqi
Add cmd/internal/obj/mkcnames.go to do the generation and update the architecture packages to use it to maintain the Cnames tables. Currently works correctly on arm64, loong64, mips, ppc64, and s390x. Change-Id: I5220b0ba6d8a8a5fcc4d9774731eb2af69a671af Reviewed-on: https://go-review.googlesource.com/c/go/+/622256 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Commit-Queue: Ian Lance Taylor <iant@golang.org>
2024-11-13 | cmd/compile, cmd/link: add FIPS verification support | Russ Cox
For FIPS init-time code+data verification, we need to arrange to put the FIPS symbols into contiguous regions of the executable and then record those sections along with the expected checksum. The cmd/internal/obj changes identify the FIPS symbols and give them distinguished types, which the linker then places in contiguous regions. The linker also writes out information to use at run time to find the FIPS sections, along with the expected hash. See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go for more details. The code is disabled in this commit. CLs 625998 and 625999 add tests. CL 626000 enables the code. For #69536. Change-Id: I48da6db94bc0bea7428c43d4abcf999527bccfcd Reviewed-on: https://go-review.googlesource.com/c/go/+/625997 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
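The verification idea in this message (hash a set of contiguous regions, compare against a checksum recorded earlier) can be sketched roughly as follows. This is only an illustration: the `region` type, the `checksum` and `verify` helpers, and the use of plain SHA-256 are assumptions made for the example, not the scheme implemented in cmd/link or crypto/internal/fips/check.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// region is a hypothetical stand-in for one contiguous FIPS section
// (for example code or read-only data) found at run time.
type region struct {
	name string
	data []byte
}

// checksum hashes every region in order, including its name and length,
// so that moving bytes between regions changes the result.
// Plain SHA-256 is an assumption made for this sketch.
func checksum(regions []region) [32]byte {
	h := sha256.New()
	for _, r := range regions {
		fmt.Fprintf(h, "%s:%d:", r.name, len(r.data))
		h.Write(r.data)
	}
	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// verify recomputes the checksum and compares it with the expected value
// that would have been recorded when the binary was produced.
func verify(regions []region, want [32]byte) bool {
	return checksum(regions) == want
}

func main() {
	regions := []region{
		{name: "text", data: []byte{0x90, 0x90, 0xc3}},
		{name: "rodata", data: []byte("constant table")},
	}
	want := checksum(regions) // recorded ahead of time in this sketch

	fmt.Println("intact:", verify(regions, want))
	regions[0].data[0] ^= 1 // simulate a modified code byte
	fmt.Println("tampered:", verify(regions, want))
}
```

Keeping the regions contiguous is what makes a check like this cheap: the verifier only needs a start, an end, and the expected value for each region.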
2024-11-13 | cmd/internal/obj/loong64: add support of VMOVQ and XVMOVQ | Guoqi Chen
Following the ARM64 implementation, this CL adds support for the following types of SIMD instructions:

1. Move general-purpose register to a vector element, e.g.:
     VMOVQ Rj, <Vd>.<T>[index]
   <T> can have the following values: B, H, W, V
2. Move vector element to general-purpose register, e.g.:
     VMOVQ <Vj>.<T>[index], Rd
   <T> can have the following values: B, BU, H, HU, W, WU, VU
3. Duplicate general-purpose register to vector, e.g.:
     VMOVQ Rj, <Vd>.<T>
   <T> can have the following values: B16, H8, W4, V2, B32, H16, W8, V4
4. Move vector, e.g.:
     XVMOVQ Xj, <Xd>.<T>
   <T> can have the following values: B16, H8, W4, V2, Q1
5. Move vector element to scalar, e.g.:
     XVMOVQ Xj, <Xd>.<T>[index]
     XVMOVQ Xj.<T>[index], Xd
   <T> can have the following values: W, V
6. Move vector element to vector register, e.g.:
     VMOVQ <Vn>.<T>[index], Vn.<T>
   <T> can have the following values: B, H, W, V

This CL only adds syntax and doesn't break any assembly that already exists.

Change-Id: I7656efac6def54da6c5ae182f39c2a21bfdf92bb Reviewed-on: https://go-review.googlesource.com/c/go/+/616258 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-12 | cmd/compile: optimize math/bits.OnesCount{16,32,64} implementation on loong64 | Guoqi Chen
Use Loong64's LSX instruction VPCNT to implement math/bits.OnesCount{16,32,64} and make it intrinsic.

Benchmark results on Loongson 3A5000 and 3A6000 machines:

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000-HV @ 2500.00MHz
             |  bench.old  |              bench.new              |
             |   sec/op    |   sec/op     vs base                |
OnesCount      4.413n ± 0%   1.401n ± 0%  -68.25% (p=0.000 n=10)
OnesCount8     1.364n ± 0%   1.363n ± 0%        ~ (p=0.130 n=10)
OnesCount16    2.112n ± 0%   1.534n ± 0%  -27.37% (p=0.000 n=10)
OnesCount32    4.533n ± 0%   1.529n ± 0%  -66.27% (p=0.000 n=10)
OnesCount64    4.565n ± 0%   1.531n ± 1%  -66.46% (p=0.000 n=10)
geomean        3.048n        1.470n       -51.78%

goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
             |  bench.old   |              bench.new               |
             |   sec/op     |   sec/op      vs base                |
OnesCount      3.553n ± 0%    1.201n ± 0%   -66.20% (p=0.000 n=10)
OnesCount8     0.8021n ± 0%   0.8004n ± 0%   -0.21% (p=0.000 n=10)
OnesCount16    1.216n ± 0%    1.000n ± 0%   -17.76% (p=0.000 n=10)
OnesCount32    3.006n ± 0%    1.035n ± 0%   -65.57% (p=0.000 n=10)
OnesCount64    3.503n ± 0%    1.035n ± 0%   -70.45% (p=0.000 n=10)
geomean        2.053n         1.006n        -51.01%

Change-Id: I07a5b8da2bb48711b896387ec7625145804affc8 Reviewed-on: https://go-review.googlesource.com/c/go/+/620978 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
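For readers unfamiliar with the functions being optimized, here is a small usage sketch of math/bits.OnesCount{16,32,64}. The `popcounts` helper is a name invented for this example; the calls themselves are the standard library API, and callers need no source changes to benefit from an intrinsic.

```go
package main

import (
	"fmt"
	"math/bits"
)

// popcounts returns the number of set bits in the low 16 bits, the low
// 32 bits, and all 64 bits of x, using the three functions named in the
// commit message. The helper itself is hypothetical.
func popcounts(x uint64) (n16, n32, n64 int) {
	return bits.OnesCount16(uint16(x)), bits.OnesCount32(uint32(x)), bits.OnesCount64(x)
}

func main() {
	n16, n32, n64 := popcounts(0xffff0000ffff00ff)
	fmt.Println(n16, n32, n64)
}
```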
2024-11-11 | cmd/compile: allow more types for wasmimport/wasmexport parameters and results | Cherry Mui
As proposed on #66984, this CL allows more types to be used as wasmimport/wasmexport function parameters and results. Specifically, bool, string, and uintptr are now allowed, and also pointer types that point to allowed element types. Allowed element types include sized integer and floating point types (including small integer types like uint8 which are not directly allowed as a parameter type), bool, arrays whose element type is allowed, and structs whose fields all have allowed element types and which also include a structs.HostLayout field. For #66984. Change-Id: Ie5452a1eda21c089780dfb4d4246de6008655c84 Reviewed-on: https://go-review.googlesource.com/c/go/+/626615 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-11 | cmd/internal/obj/loong64: switch Lookup function call to ABIInternal mode | Guoqi Chen
CL 521790 has experimentally enabled RegABI support on Loong64, so it is possible to switch the Lookup function call to ABIInternal mode. Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2 Reviewed-on: https://go-review.googlesource.com/c/go/+/544435 Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-08 | cmd/internal/goobj: regenerate builtinlist | Ian Lance Taylor
CL 622042 added rand as a compiler builtin, but did not update builtinlist. Also update the mkbuiltin comment to refer to the current file location, and add a comment for runtime.rand that it is called from the compiler. For #54766 Change-Id: I99d2c0bb0658da333775afe2ed0447265c845c82 Reviewed-on: https://go-review.googlesource.com/c/go/+/626755 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@golang.org>
2024-11-08 | cmd/asm: use single-instruction forms for all loong64 sign and zero extensions | Xiaolin Zhao
8-bit and 16-bit sign extensions and 32-bit zero extensions were realized with left and right shifts before this change. We now support assembling EXTWB, EXTWH and BSTRPICKV, so each of the three can be done with a single instruction. This patch is a copy of CL 479496. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615 Reviewed-on: https://go-review.googlesource.com/c/go/+/626195 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
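The equivalence this change relies on can be illustrated in plain Go: a left shift followed by a right shift yields the same value as a direct sign or zero extension. The function names below are made up for the sketch, and the comments linking each helper to EXTWB, EXTWH, or BSTRPICKV follow the commit message rather than any inspection of generated code.

```go
package main

import "fmt"

// Shift-pair forms: the two-instruction pattern described as the old approach.
func signExtend8Shifts(x int64) int64    { return x << 56 >> 56 } // arithmetic right shift
func signExtend16Shifts(x int64) int64   { return x << 48 >> 48 }
func zeroExtend32Shifts(x uint64) uint64 { return x << 32 >> 32 } // logical right shift

// Direct forms: the operations that, per the message, map to a single
// instruction each (EXTWB, EXTWH, and BSTRPICKV respectively).
func signExtend8(x int64) int64    { return int64(int8(x)) }
func signExtend16(x int64) int64   { return int64(int16(x)) }
func zeroExtend32(x uint64) uint64 { return uint64(uint32(x)) }

func main() {
	for _, x := range []int64{0x7f, 0x80, 0x1ff, 0x8000, -1} {
		fmt.Println(x,
			signExtend8Shifts(x) == signExtend8(x),
			signExtend16Shifts(x) == signExtend16(x))
	}
	var u uint64 = 0xdeadbeef00000001
	fmt.Println(zeroExtend32Shifts(u) == zeroExtend32(u))
}
```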
2024-11-08 | cmd/internal/obj/loong64: add {V,XV}PCNT.{B,H,W,D} instructions support | Guoqi Chen
Go asm syntax:
    VPCNT{B,H,W,V} VJ, VD
    XVPCNT{B,H,W,V} XJ, XD

Equivalent platform assembler syntax:
    vpcnt.{b,h,w,d} vd, vj
    xvpcnt.{b,h,w,d} xd, xj

Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098 Reviewed-on: https://go-review.googlesource.com/c/go/+/620736 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-11-07 | cmd/internal/objabi, cmd/link: introduce SymKind helper methods | Russ Cox
These will be necessary when we start using the new FIPS symbols. Split into a separate CL so that these refactoring changes can be tested separately from any FIPS-specific changes. Passes golang.org/x/tools/cmd/toolstash/buildall. Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc Reviewed-on: https://go-review.googlesource.com/c/go/+/625996 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-11-07 | cmd/internal/objabi, cmd/link: add FIPS symbol kinds | Russ Cox
Add FIPS symbol kinds that will be needed for FIPS support. This is a separate CL to keep the re-generated changes in the string methods separate from hand-written changes. The separate symbol kinds will let us group the FIPS-related code and data together, so that it can be checksummed at startup, as required by FIPS. It's also separate because it breaks buildall, by changing the on-disk symbol kind enumeration. We want non-buildall changes to be as simple as possible. For #69536. Change-Id: I2d5a238498929fff8b24736ee54330c17323bd86 Reviewed-on: https://go-review.googlesource.com/c/go/+/625995 Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 | cmd/internal/obj: replace obj.Addrel func with LSym.AddRel method | Russ Cox
The old API was to do

    r := obj.Addrel(sym)
    r.Type = this
    r.Off = that
    etc

The new API is:

    sym.AddRel(ctxt, obj.Reloc{Type: this, Off: that, etc})

This new API is more idiomatic and avoids ever having relocations that are only partially constructed. Most importantly, it sets up for sym.AddRel being able to check relocation validity in the future. (Passing ctxt is for use in validity checking.) Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972 Reviewed-on: https://go-review.googlesource.com/c/go/+/625616 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
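The difference between the two call patterns can be shown with a self-contained sketch. The `Reloc`, `LSym`, and `Link` types here are drastically simplified, hypothetical stand-ins for the real cmd/internal/obj types, and `oldAddrel` only imitates the shape of the old function.

```go
package main

import "fmt"

// RelocType and its constants are placeholders for this sketch.
type RelocType int

const (
	R_ADDR RelocType = iota + 1
	R_CALL
)

// Reloc is a simplified relocation record (hypothetical field set).
type Reloc struct {
	Off  int32
	Siz  uint8
	Type RelocType
}

// Link stands in for the assembler context passed as ctxt.
type Link struct{}

// LSym is a simplified symbol carrying a relocation list.
type LSym struct {
	Name string
	R    []Reloc
}

// oldAddrel imitates the old pattern: append a zero Reloc and hand back
// a pointer, so the caller fills it in field by field afterwards.
func oldAddrel(s *LSym) *Reloc {
	s.R = append(s.R, Reloc{})
	return &s.R[len(s.R)-1]
}

// AddRel imitates the new pattern: the relocation arrives fully built,
// so a validity check could run here before it is stored.
func (s *LSym) AddRel(ctxt *Link, rel Reloc) {
	_ = ctxt // a future validity check would consult the context
	s.R = append(s.R, rel)
}

func main() {
	before := &LSym{Name: "old"}
	r := oldAddrel(before)
	r.Type = R_CALL
	r.Off = 4
	r.Siz = 4

	after := &LSym{Name: "new"}
	after.AddRel(&Link{}, Reloc{Type: R_CALL, Off: 4, Siz: 4})

	fmt.Println(before.R[0] == after.R[0])
}
```

With the value-based form there is no window in which the symbol holds a half-initialized relocation, which is the property the message highlights.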
2024-11-07 | cmd/internal/obj/loong64: add {V,XV}SEQ.{B,H,W,D} instructions support | Guoqi Chen
Go asm syntax:
    VSEQ{B,H,W,V} VJ, VK, VD
    XVSEQ{B,H,W,V} XJ, XK, XD

Equivalent platform assembler syntax:
    vseq.{b,h,w,d} vd, vj, vk
    xvseq.{b,h,w,d} xd, xj, xk

Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f Reviewed-on: https://go-review.googlesource.com/c/go/+/616076 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 | cmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions support | Guoqi Chen
This CL adds primitive asm support for Loong64 LSX [1] and LASX [2] by introducing new sets of registers, V0-V31 (C_VREG) and X0-X31 (C_XREG), and 8 new instructions. On Loong64, VLD, XVLD, VST, and XVST implement vector memory access operations using an immediate offset. VLDX, XVLDX, VSTX, and XVSTX implement vector memory access operations using a register offset.

Go asm syntax:
    VMOVQ n(RJ), RV        (128bit vector load)
    XVMOVQ n(RJ), RX       (256bit vector load)
    VMOVQ RV, n(RJ)        (128bit vector store)
    XVMOVQ RX, n(RJ)       (256bit vector store)
    VMOVQ (RJ)(RK), RV     (128bit vector load)
    XVMOVQ (RJ)(RK), RX    (256bit vector load)
    VMOVQ RV, (RJ)(RK)     (128bit vector store)
    XVMOVQ RX, (RJ)(RK)    (256bit vector store)

Equivalent platform assembler syntax:
    vld vd, rj, si12
    xvld xd, rj, si12
    vst vd, rj, si12
    xvst xd, rj, si12
    vldx vd, rj, rk
    xvldx xd, rj, rk
    vstx vd, rj, rk
    xvstx xd, rj, rk

[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit

Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003 Reviewed-on: https://go-review.googlesource.com/c/go/+/616075 Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 | cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64 | Guoqi Chen
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1] are new instructions added in LA664 (Loongson 3A6000) and later microarchitectures. Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according to the CPU feature.

The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to ensure consistency in the order of Store/Load [2].

LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses the R0 register, and there is no performance difference between the implementations of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
                 |  bench.old  |              bench.new              |
                 |   sec/op    |   sec/op     vs base                |
AtomicStore64      19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore64-2    19.61n ± 0%   13.61n ± 0%  -30.57% (p=0.000 n=20)
AtomicStore64-4    19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore        19.61n ± 0%   13.61n ± 0%  -30.60% (p=0.000 n=20)
AtomicStore-2      19.62n ± 0%   13.61n ± 0%  -30.63% (p=0.000 n=20)
AtomicStore-4      19.62n ± 0%   13.62n ± 0%  -30.58% (p=0.000 n=20)
AtomicStore8       19.61n ± 0%   20.01n ± 0%   +2.04% (p=0.000 n=20)
AtomicStore8-2     19.62n ± 0%   20.02n ± 0%   +2.01% (p=0.000 n=20)
AtomicStore8-4     19.61n ± 0%   20.02n ± 0%   +2.09% (p=0.000 n=20)
geomean            19.61n        15.48n       -21.08%

goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
                 |  bench.old  |              bench.new              |
                 |   sec/op    |   sec/op     vs base                |
AtomicStore64      18.03n ± 0%   12.81n ± 0%  -28.93% (p=0.000 n=20)
AtomicStore64-2    18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore64-4    18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore        18.02n ± 0%   12.81n ± 0%  -28.91% (p=0.000 n=20)
AtomicStore-2      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore-4      18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8       18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-2     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
AtomicStore8-4     18.01n ± 0%   12.81n ± 0%  -28.87% (p=0.000 n=20)
geomean            18.01n        12.81n       -28.89%

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md

Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097 Reviewed-on: https://go-review.googlesource.com/c/go/+/581356 Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2024-11-06 | cmd/objdump: add s390x plan9 disasm support | Srinivas Pokala
This CL provides vendor support for s390x disassembler plan9 syntax.

    cd $GOROOT/src/cmd
    go get golang.org/x/arch@master
    go mod tidy
    go mod vendor

For #15255 Change-Id: I20c87510a1aee2d1cf2df58feb535974c4c0e3ef Reviewed-on: https://go-review.googlesource.com/c/go/+/623075 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-02 | cmd/asm: add support for loong64 FMA instructions | Xiaolin Zhao
Add support for assembling the FMA instructions present in the LoongArch base ISA v1.00. This requires adding a new instruction format and making use of a third source operand, which is put in RestArgs[0].

The single-precision instructions have the `.s` suffix in their official mnemonics, and similar Go asm instructions all have an `S` suffix on the other architectures with FMA support, but in this change they instead have an `F` suffix in Go asm because loong64 currently follows the mips backends in the naming convention. This could be changed later because FMA is fully expressible in pure Go, making it unlikely to have to hand-write such assembly in the wild.

Example mapping between actual encoding and Go asm syntax:

    fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd
        (prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd)
    fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd
        (prog.From = fa, prog.Reg = fk and prog.To = fd)

This patch is a copy of CL 477716. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090 Reviewed-on: https://go-review.googlesource.com/c/go/+/623877 Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-11-01 | cmd/asm: add support for loong64 CRC32 instructions | Xiaolin Zhao
This patch is a copy of CL 478595. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/623875 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-01 | cmd/asm: add support for the rest of loong64 unary bitops | Xiaolin Zhao
All remaining unary bitop instructions in the LoongArch v1.00 base ISA are added with this change. While at it, add the missing W suffix to the current CLO/CLZ names. They are not used anywhere as far as we know, so no breakage is expected. Also, stop reusing SLL's instruction format for simplicity, in favor of a new but trivial instruction format case. This patch is a copy of CL 477717. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0 Reviewed-on: https://go-review.googlesource.com/c/go/+/623876 Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-31 | cmd/internal/obj/arm64: make sure prologue and epilogue are pattern matched for small frames | Hao Liu
CL 379075 implemented function prologue/epilogue with STP/LDP. To fix issue #53374, CL 412474 reverted the prologue STP change for small frames, and the LDP in epilogue was kept. The current instructions are:

    prologue:
        MOVD.W R30, -offset(RSP)
        MOVD R29, -8(RSP)
    epilogue:
        LDP -8(RSP), (R29, R30)
        ADD $offset, RSP, RSP

It seems a bit strange, as:
1) The prologue and epilogue are not in the same pattern (either STR-LDR, or STP-LDP).
2) Go Internal ABI defines that R30 is saved at 0(RSP) and R29 is saved at -8(RSP), so we can not use a single STP.W/LDP.P to save/restore LR&FP and adjust SP. Changing the ABI causes too much complexity, and the benefit is not that big.

This patch reverts the small frames' epilogue change in CL 379075. It converts LDP in the epilogue to LDR-LDR. Another solution is to re-apply the STP change in prologue, which requires fixing #53609. This seems the easier and safer solution in the meantime. The new instructions are:

    prologue:
        MOVD.W R30, -offset(RSP)
        MOVD R29, -8(RSP)
    epilogue:
        MOVD -8(RSP), R29
        MOVD.P offset(RSP), R30

The current pattern may cause performance issues in Store-Forwarding on micro-architectures like AmpereOne. Assuming a function call in the middle of such code is short enough that the stores are still around, then the LDP executes and it may wait longer to get the results from separate stores in the Store Buffers than it would from a single STP.

Store-Forwarding aims to improve the efficiency of the processor by allowing data to be forwarded directly from a store operation to a subsequent load operation when certain conditions are met. See the paper: "Memory Barriers: a Hardware View for Software Hackers" (chapter 3.2: Store Forwarding).

The performance of the following ARM64 Linux servers was tested:
1) AmpereOne (ARM v8.6+) from Ampere Computing.
2) Ampere Altra (ARM Neoverse N1) from Ampere Computing.
3) Graviton2 (ARM Neoverse N1) from AWS.
The effect of this change depends on the hardware implementation of store-forwarding. It clearly improves performance on AmpereOne, especially for small functions that are frequently called and return quickly. E.g., JSON Marshal/Unmarshal benchmarks on AmpereOne:

goos: linux
goarch: arm64
pkg: encoding/json
                           │ ampere-one.base │           ampere-one.new            │
                           │     sec/op      │   sec/op      vs base               │
CodeMarshal-8                 882.1µ ±  1%     779.6µ ±  1%  -11.62% (p=0.000 n=10)
CodeMarshalError-8            961.5µ ±  0%     855.7µ ±  1%  -11.01% (p=0.000 n=10)
MarshalBytes/32-8             207.6n ±  1%     187.8n ±  0%   -9.52% (p=0.000 n=10)
MarshalBytes/256-8            501.0n ±  1%     482.6n ±  1%   -3.68% (p=0.000 n=10)
MarshalBytes/4096-8           5.336µ ±  1%     5.074µ ±  1%   -4.92% (p=0.000 n=10)
MarshalBytesError/32-8        242.3µ ±  2%     205.7µ ±  3%  -15.08% (p=0.000 n=10)
MarshalBytesError/256-8       242.4µ ±  1%     205.2µ ±  2%  -15.35% (p=0.000 n=10)
MarshalBytesError/4096-8      247.9µ ±  0%     210.1µ ±  1%  -15.24% (p=0.000 n=10)
MarshalMap-8                  150.8n ±  1%     145.7n ±  0%   -3.35% (p=0.000 n=10)
EncodeMarshaler-8             50.30n ± 26%     54.48n ±  6%        ~ (p=0.739 n=10)
CodeUnmarshal-8               4.796m ±  2%     4.055m ±  1%  -15.45% (p=0.000 n=10)
CodeUnmarshalReuse-8          4.260m ±  1%     3.496m ±  1%  -17.94% (p=0.000 n=10)
UnmarshalString-8             73.89n ±  1%     65.83n ±  1%  -10.91% (p=0.000 n=10)
UnmarshalFloat64-8            60.63n ±  1%     58.66n ± 25%        ~ (p=0.143 n=10)
UnmarshalInt64-8              55.62n ±  1%     53.25n ± 22%        ~ (p=0.468 n=10)
UnmarshalMap-8                255.3n ±  1%     230.3n ±  1%   -9.77% (p=0.000 n=10)
UnmarshalNumber-8             467.2n ±  1%     367.0n ±  0%  -21.43% (p=0.000 n=10)
geomean                       6.224µ           5.605µ         -9.94%

Other ARM64 micro-architectures may not be affected as much by this issue. E.g., benchmarks on Ampere Altra and Graviton2 show slight improvements:

                           │   altra.base    │              altra.new              │
                           │     sec/op      │   sec/op      vs base               │
CodeMarshal-8                 980.1µ ±  1%     977.3µ ±  1%        ~ (p=0.912 n=10)
CodeMarshalError-8            1.109m ±  3%     1.096m ±  5%        ~ (p=0.971 n=10)
MarshalBytes/32-8             246.8n ±  1%     245.4n ±  0%   -0.55% (p=0.002 n=10)
MarshalBytes/256-8            590.9n ±  1%     606.6n ±  1%   +2.67% (p=0.000 n=10)
MarshalBytes/4096-8           6.351µ ±  1%     6.376µ ±  1%        ~ (p=0.183 n=10)
MarshalBytesError/32-8        245.3µ ±  2%     246.1µ ±  2%        ~ (p=0.684 n=10)
MarshalBytesError/256-8       245.5µ ±  1%     248.7µ ±  2%        ~ (p=0.218 n=10)
MarshalBytesError/4096-8      254.2µ ±  1%     254.9µ ±  1%        ~ (p=0.481 n=10)
MarshalMap-8                  152.7n ±  2%     151.5n ±  3%        ~ (p=0.782 n=10)
EncodeMarshaler-8             45.95n ±  7%     42.88n ±  5%   -6.70% (p=0.014 n=10)
CodeUnmarshal-8               5.121m ±  4%     5.125m ±  3%        ~ (p=0.579 n=10)
CodeUnmarshalReuse-8          4.616m ±  3%     4.634m ±  2%        ~ (p=0.529 n=10)
UnmarshalString-8             72.12n ±  2%     72.20n ±  2%        ~ (p=0.912 n=10)
UnmarshalFloat64-8            64.44n ±  5%     63.20n ±  4%        ~ (p=0.393 n=10)
UnmarshalInt64-8              61.49n ±  2%     58.14n ±  4%   -5.45% (p=0.002 n=10)
UnmarshalMap-8                263.6n ±  2%     266.2n ±  1%        ~ (p=0.196 n=10)
UnmarshalNumber-8             464.7n ±  1%     464.0n ±  0%        ~ (p=0.566 n=10)
geomean                       6.617µ           6.575µ         -0.64%

                           │ graviton2.base  │            graviton2.new            │
                           │     sec/op      │   sec/op      vs base               │
CodeMarshal-8                 1.122m ±  0%     1.118m ±  1%        ~ (p=0.052 n=10)
CodeMarshalError-8            1.216m ±  1%     1.214m ±  0%        ~ (p=0.631 n=10)
MarshalBytes/32-8             289.9n ±  0%     280.8n ±  0%   -3.17% (p=0.000 n=10)
MarshalBytes/256-8            675.9n ±  0%     664.7n ±  0%   -1.66% (p=0.000 n=10)
MarshalBytes/4096-8           6.884µ ±  0%     6.885µ ±  0%        ~ (p=0.565 n=10)
MarshalBytesError/32-8        293.1µ ±  2%     288.9µ ±  2%        ~ (p=0.123 n=10)
MarshalBytesError/256-8       296.0µ ±  3%     289.0µ ±  1%   -2.36% (p=0.019 n=10)
MarshalBytesError/4096-8      300.4µ ±  1%     295.6µ ±  0%   -1.60% (p=0.000 n=10)
MarshalMap-8                  168.8n ±  1%     168.8n ±  1%        ~ (p=1.000 n=10)
EncodeMarshaler-8             53.77n ±  8%     50.05n ± 12%        ~ (p=0.579 n=10)
CodeUnmarshal-8               5.875m ±  2%     5.882m ±  1%        ~ (p=0.796 n=10)
CodeUnmarshalReuse-8          5.383m ±  1%     5.366m ±  0%        ~ (p=0.631 n=10)
UnmarshalString-8             74.59n ±  1%     73.99n ±  0%   -0.80% (p=0.001 n=10)
UnmarshalFloat64-8            68.52n ±  7%     64.19n ± 18%        ~ (p=0.868 n=10)
UnmarshalInt64-8              65.32n ± 13%     62.24n ±  8%        ~ (p=0.138 n=10)
UnmarshalMap-8                290.1n ±  0%     291.3n ±  0%   +0.43% (p=0.010 n=10)
UnmarshalNumber-8             514.4n ±  0%     499.4n ±  0%   -2.93% (p=0.000 n=10)
geomean                       7.459µ           7.317µ         -1.91%

Change-Id: If27386fc5f514b76bdaf2012c2ce86cc65f7ca5b Reviewed-on: https://go-review.googlesource.com/c/go/+/621775 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30 | cmd/internal/osinfo: stop importing golang.org/x/sys/unix | Ian Lance Taylor
This is the only non-vendored file that imports x/sys/unix. Switch to fetching the information in this package. Change-Id: I4e54c2cd8b4953066e2bee42922f35c387fb43e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/623435 Auto-Submit: Ian Lance Taylor <iant@google.com> Commit-Queue: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30 | cmd/internal/obj/riscv: update RISC-V instruction table | Joel Sing
Regenerate RISC-V instruction table from the riscv-opcodes repository, due to various changes and shuffling upstream. This has been changed to remove pseudo-instructions, since Go only needs the instruction encodings and including the pseudo-instructions is creating unnecessary complications (for example, the inclusion of ANOP and ARET, as well as strangely named aliases such as AJALPSEUDO/AJALRPSEUDO). Remove pseudo-instructions that are not currently supported by the assembler and add specific handling for RDCYCLE, RDTIME and RDINSTRET, which were previously implemented via the instruction encodings. Change-Id: I78be4506ba6b627eba1f321406081a63bab5b2e6 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Reviewed-on: https://go-review.googlesource.com/c/go/+/616116 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-29 | cmd/internal/obj/ppc64: support for extended mnemonics of BC | Jayanth Krishnamurthy
Add assembler support for BGT, BLT, BLE, BGE, BNE, BVS, BVC, and BEQ. This simplifies the usage of BC constructs like:

    BC 12, 30, LR      <=> BEQ CR7, LR
    BC 12, 2, LR       <=> BEQ CR0, LR
    BC 12, 0, target   <=> BLT CR0, target
    BC 12, 2, target   <=> BEQ CR0, target
    BC 12, 5, target   <=> BGT CR1, target
    BC 12, 30, target  <=> BEQ CR7, target
    BC 4, 6, target    <=> BNE CR1, target
    BC 4, 5, target    <=> BLE CR1, target

Also clean up code based on the above additions.

Change-Id: I02fdb212b6fe3f85ce447e05f4d42118c9ce63b5 Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10 Reviewed-on: https://go-review.googlesource.com/c/go/+/612395 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
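The BC equivalences listed in this message follow a regular pattern: the second operand (BI) selects a condition-register field (BI/4) and a bit within it (BI%4: LT, GT, EQ, SO), while the first operand (BO) selects branch-if-set (12) or branch-if-clear (4). The sketch below decodes that pattern; `extendedMnemonic` is a name invented for illustration and handles only these two BO values, so it is not the assembler's own logic.

```go
package main

import "fmt"

// extendedMnemonic translates the "BO, BI" operands of a BC instruction
// into an extended mnemonic plus condition-register field, for the two
// BO values used in the commit message (12: branch if bit set,
// 4: branch if bit clear). It reports false for anything else.
func extendedMnemonic(bo, bi int) (string, bool) {
	// Index is the bit within the CR field: 0=LT, 1=GT, 2=EQ, 3=SO.
	ifSet := [4]string{"BLT", "BGT", "BEQ", "BVS"}
	ifClear := [4]string{"BGE", "BLE", "BNE", "BVC"}

	if bi < 0 || bi > 31 {
		return "", false
	}
	field, bit := bi/4, bi%4

	var op string
	switch bo {
	case 12:
		op = ifSet[bit]
	case 4:
		op = ifClear[bit]
	default:
		return "", false
	}
	return fmt.Sprintf("%s CR%d", op, field), true
}

func main() {
	for _, c := range [][2]int{{12, 30}, {12, 2}, {12, 0}, {12, 5}, {4, 6}, {4, 5}} {
		m, ok := extendedMnemonic(c[0], c[1])
		fmt.Printf("BC %d, %d <=> %s (ok=%v)\n", c[0], c[1], m, ok)
	}
}
```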
2024-10-29 | cmd/internal/obj/ppc64: add double-decimal arithmetic instructions | Jayanth Krishnamurthy
Assembler support provided for the instructions DADD, DSUB, DMUL, and DDIV. Change-Id: Ic12ba02ce453cb1ca275334ca1924fb2009da767 Reviewed-on: https://go-review.googlesource.com/c/go/+/620856 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-10-29 | cmd/internal/objfile,cmd/objdump: add disassembly support for riscv64 | Joel Sing
Add support to `go tool objdump` for disassembling riscv64 binaries. Revendor to bring in cmd/vendor/golang.org/x/arch/riscv64/riscv64asm, which provides the actual disassembly implementation. Fixes #36738 Change-Id: I0f29968509041c0c5698fc2d6910a6a0bea9d3c0 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Reviewed-on: https://go-review.googlesource.com/c/go/+/622257 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-28 | cmd: use internal/syscall/windows to get Windows version | qmuntal
internal/syscall/windows already provides a function to get the Windows version. There is no need to use golang.org/x/sys/windows for this. Change-Id: If31e9c662b10716ed6c3e9054604366e494345cf Reviewed-on: https://go-review.googlesource.com/c/go/+/622815 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-10-25 | cmd/internal/obj: add prologue_end DWARF stmt for riscv64 | Lin Runze
This patch adds a prologue_end statement to the DWARF info for riscv64, which the Delve debugger uses to skip the stack-split prologue. Change-Id: I4e5d9c26202385f65b3118b16f53f66de9d327f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/620295 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-25cmd/internal/obj/riscv: update references to RISC-V specificationJoel Sing
Update references to version 20240411 of the RISC-V specifications. Reorder and regroup instructions to maintain ordering. Change-Id: Iea2a5d22ad677e04948e9a9325986ad301c03f35 Reviewed-on: https://go-review.googlesource.com/c/go/+/616115 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: David Chase <drchase@google.com>
2024-10-24cmd/internal/obj,cmd/asm: add vector registers to riscv64 assemblerJoel Sing
This adds V0 through V31 as vector registers, which are available on CPUs that support the V extension. Change-Id: Ibffee3f9a2cf1d062638715b3744431d72d451ce Reviewed-on: https://go-review.googlesource.com/c/go/+/595404 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
2024-10-24cmd/internal/obj/riscv: add vector instruction encodingsJoel Sing
Regenerate the riscv instruction encoding table with the V extension enabled. Add constants and names for the resulting 375 instructions. Change-Id: Icce688493aeb1e9880fb76a0618643f57e481273 Reviewed-on: https://go-review.googlesource.com/c/go/+/595403 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-23cmd/asm: add support for LDREXB/STREXBMauri de Souza Meneguzzo
These are 8-bit ARM Load/Store atomics and are available starting from armv6k. See https://developer.arm.com/documentation/dui0379/e/arm-and-thumb-instructions/strex For #69735 Change-Id: I12623433c89070495c178208ee4758b3cdefd368 GitHub-Last-Rev: d6a797836af1dccdcc6e6554725546b386d01615 GitHub-Pull-Request: golang/go#69959 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm Reviewed-on: https://go-review.googlesource.com/c/go/+/621395 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-22cmd/compile: use STP/LDP around morestack on arm64Keith Randall
The spill/restore code around morestack is almost never executed, so we should make it as small as possible. Using 2-register loads/stores makes sense here. Also, the offsets from SP are pretty small so the offset almost always fits in the (smaller than a normal load/store) offset field of the instruction. Makes cmd/go 0.6% smaller. Change-Id: I8845283c1b269a259498153924428f6173bda293 Reviewed-on: https://go-review.googlesource.com/c/go/+/621556 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-21cmd/internal/buildid: skip over GNU build ID from buildid computationCherry Mui
This is similar to CL 618597, but for GNU build ID on ELF. This makes it possible to enable "-B gobuildid" by default on ELF. Updates #41004. For #63934. Change-Id: I4e663a27a2f7824bce994c783fe6d9ce8d1a395a Reviewed-on: https://go-review.googlesource.com/c/go/+/618600 Reviewed-by: Than McIntosh <thanm@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-21cmd/internal/buildid: skip over Mach-O UUID from buildid computationCherry Mui
With the "-B gobuildid" linker option (which will be the default on some platforms), the host build ID (GNU build ID, Mach-O UUID) depends on the Go buildid. If the host build ID is included in the Go buildid computation, it will lead to a convergence problem for the toolchain binaries. So ignore the host build ID in the buildid computation. This CL only handles Mach-O UUID. ELF GNU build ID will be handled later. For #68678. For #63934. Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14,gotip-darwin-arm64_13 Change-Id: Ie8ff20402a1c6083246d25dea391140c75be40d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/618597 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@golang.org>
2024-10-21cmd/internal/macho: new package for handling mach-o files in toolchainCherry Mui
Currently the linker has some code handling and manipulating Mach-O files. Specifically, it augments the debug/macho package with file offset and length, so the content can be handled or updated easily with the file. Move this code to an internal package, so it can be used by other parts of the toolchain, e.g. buildid computation. For #68678. Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14,gotip-darwin-arm64_13 Change-Id: I2311af0a06441b7fd887ca5c6ed9e6fc44670a16 Reviewed-on: https://go-review.googlesource.com/c/go/+/618596 Reviewed-by: Than McIntosh <thanm@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-14all: wire up swisstable mapsMichael Pratt
Use the new SwissTable-based map in internal/runtime/maps as the basis for the runtime map when GOEXPERIMENT=swissmap. Integration is complete enough to pass all.bash. Notable missing features:
* Race integration / concurrent write detection
* Stack-allocated maps
* Specialized "fast" map variants
* Indirect key / elem
For #54766. Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639 Reviewed-on: https://go-review.googlesource.com/c/go/+/594596 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-08cmd/internal/obj: optimize the function stacksplit on loong64Xiaolin Zhao
In the process of stack split checking, loong64 uses the following logic:
if SP > stackguard then goto done, else morestack
The problem is that morestack executes far less often than done, while static branch prediction favors the morestack path, which can cause branch mispredictions. Change the logic to:
if SP <= stackguard then goto morestack, else done
Benchmarks on 3A6000:
goos: linux
goarch: loong64
pkg: fmt
cpu: Loongson-3A6000 @ 2500.00MHz
                              │  bench.old   │             bench.new              │
                              │    sec/op    │   sec/op     vs base               │
SprintfPadding                  418.3n ± 1%    387.0n ± 0%   -7.49% (p=0.000 n=20)
SprintfEmpty                    35.95n ± 0%    35.86n ± 0%   -0.25% (p=0.000 n=20)
SprintfString                   75.02n ± 1%    72.24n ± 0%   -3.71% (p=0.000 n=20)
SprintfTruncateString           165.7n ± 3%    139.9n ± 1%  -15.58% (p=0.000 n=20)
SprintfTruncateBytes            171.0n ± 0%    147.3n ± 0%  -13.83% (p=0.000 n=20)
SprintfSlowParsingPath          90.56n ± 0%    80.85n ± 0%  -10.72% (p=0.000 n=20)
SprintfQuoteString              560.2n ± 0%    509.7n ± 0%   -9.01% (p=0.000 n=20)
SprintfInt                      58.62n ± 0%    56.45n ± 0%   -3.70% (p=0.000 n=20)
SprintfIntInt                   141.7n ± 0%    122.2n ± 0%  -13.73% (p=0.000 n=20)
SprintfPrefixedInt              210.6n ± 0%    208.8n ± 0%   -0.88% (p=0.000 n=20)
SprintfFloat                    282.3n ± 0%    251.8n ± 1%  -10.80% (p=0.000 n=20)
SprintfComplex                  854.1n ± 0%    813.8n ± 0%   -4.71% (p=0.000 n=20)
SprintfBoolean                  76.32n ± 0%    71.14n ± 1%   -6.79% (p=0.000 n=20)
SprintfHexString                218.5n ± 0%    193.4n ± 0%  -11.51% (p=0.000 n=20)
SprintfHexBytes                 321.3n ± 0%    275.0n ± 0%  -14.42% (p=0.000 n=20)
SprintfBytes                    573.5n ± 0%    553.2n ± 1%   -3.54% (p=0.000 n=20)
SprintfStringer                 501.1n ± 1%    446.6n ± 0%  -10.86% (p=0.000 n=20)
SprintfStructure                1.793µ ± 0%    1.683µ ± 0%   -6.16% (p=0.000 n=20)
ManyArgs                        500.0n ± 0%    470.4n ± 0%   -5.92% (p=0.000 n=20)
FprintInt                       67.51n ± 0%    65.71n ± 0%   -2.66% (p=0.000 n=20)
FprintfBytes                    130.9n ± 0%    129.5n ± 1%   -1.11% (p=0.000 n=20)
FprintIntNoAlloc                67.55n ± 0%    65.80n ± 0%   -2.58% (p=0.000 n=20)
ScanInts                        386.3µ ± 0%    346.5µ ± 0%  -10.29% (p=0.000 n=20)
ScanRecursiveInt                25.97m ± 0%    25.93m ± 0%   -0.15% (p=0.038 n=20)
ScanRecursiveIntReaderWrapper   26.07m ± 0%    25.93m ± 0%   -0.53% (p=0.001 n=20)
geomean                         702.6n         653.7n        -6.96%
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
                              │  bench.old   │             bench.new              │
                              │    sec/op    │   sec/op     vs base               │
BinaryTree17                    7.688 ± 1%     7.724 ± 0%    +0.47% (p=0.040 n=20)
Fannkuch11                      2.670 ± 0%     2.645 ± 0%    -0.94% (p=0.000 n=20)
FmtFprintfEmpty                 35.93n ± 0%    37.50n ± 0%   +4.37% (p=0.000 n=20)
FmtFprintfString                56.32n ± 0%    59.74n ± 0%   +6.08% (p=0.000 n=20)
FmtFprintfInt                   64.47n ± 0%    61.26n ± 0%   -4.98% (p=0.000 n=20)
FmtFprintfIntInt                100.30n ± 0%   99.67n ± 0%   -0.63% (p=0.000 n=20)
FmtFprintfPrefixedInt           116.7n ± 0%    119.3n ± 0%   +2.23% (p=0.000 n=20)
FmtFprintfFloat                 234.1n ± 0%    203.4n ± 0%  -13.11% (p=0.000 n=20)
FmtManyArgs                     503.0n ± 0%    467.9n ± 0%   -6.96% (p=0.000 n=20)
GobDecode                       8.125m ± 0%    7.299m ± 0%  -10.17% (p=0.000 n=20)
GobEncode                       8.930m ± 1%    8.581m ± 1%   -3.91% (p=0.000 n=20)
Gzip                            280.0m ± 0%    279.8m ± 0%   -0.10% (p=0.000 n=20)
Gunzip                          33.30m ± 0%    32.48m ± 0%   -2.49% (p=0.000 n=20)
HTTPClientServer                55.43µ ± 0%    54.10µ ± 1%   -2.41% (p=0.000 n=20)
JSONEncode                      10.086m ± 0%   9.055m ± 0%  -10.22% (p=0.000 n=20)
JSONDecode                      49.37m ± 1%    46.22m ± 1%   -6.40% (p=0.000 n=20)
Mandelbrot200                   4.606m ± 0%    4.606m ± 0%        ~ (p=0.280 n=20)
GoParse                         5.010m ± 0%    4.855m ± 0%   -3.09% (p=0.000 n=20)
RegexpMatchEasy0_32             59.09n ± 0%    59.32n ± 0%   +0.39% (p=0.000 n=20)
RegexpMatchEasy0_1K             455.2n ± 0%    453.8n ± 0%   -0.31% (p=0.000 n=20)
RegexpMatchEasy1_32             59.24n ± 0%    60.11n ± 0%   +1.47% (p=0.000 n=20)
RegexpMatchEasy1_1K             555.2n ± 0%    553.9n ± 0%   -0.23% (p=0.000 n=20)
RegexpMatchMedium_32            845.7n ± 0%    775.6n ± 0%   -8.28% (p=0.000 n=20)
RegexpMatchMedium_1K            26.68µ ± 0%    26.48µ ± 0%   -0.78% (p=0.000 n=20)
RegexpMatchHard_32              1.317µ ± 0%    1.326µ ± 0%   +0.68% (p=0.000 n=20)
RegexpMatchHard_1K              41.35µ ± 0%    40.95µ ± 0%   -0.97% (p=0.000 n=20)
Revcomp                         463.0m ± 0%    473.0m ± 0%   +2.15% (p=0.000 n=20)
Template                        83.80m ± 0%    76.26m ± 1%   -9.00% (p=0.000 n=20)
TimeParse                       283.3n ± 0%    260.8n ± 0%   -7.96% (p=0.000 n=20)
TimeFormat                      307.2n ± 0%    290.5n ± 0%   -5.45% (p=0.000 n=20)
geomean                         53.16µ         51.67µ        -2.79%
Change-Id: Iaec2f50db18e9a2b405605f8b92af3683114ea34 Reviewed-on: https://go-review.googlesource.com/c/go/+/616035 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-01cmd/internal/obj: make asmidx error less crypticAustin Clements
It's still pretty cryptic, but at least now instead of printing asm: asmidx: bad address 0/2067/2068 it will print asm: asmidx: bad address 0/BX/SP Change-Id: I1c73c439c94c5b9d3039728db85102a818739db9 Reviewed-on: https://go-review.googlesource.com/c/go/+/611256 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Austin Clements <austin@google.com>
2024-09-27cmd/internal/obj/loong64: mark functions with small stacks NOSPLITXiaolin Zhao
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
                              │  bench.old   │             bench.new              │
                              │    sec/op    │   sec/op     vs base               │
BinaryTree17                    7.728 ± 1%     7.703 ± 1%         ~ (p=0.345 n=15)
Fannkuch11                      2.645 ± 0%     2.429 ± 0%    -8.17% (p=0.000 n=15)
FmtFprintfEmpty                 35.89n ± 0%    35.83n ± 0%   -0.17% (p=0.000 n=15)
FmtFprintfString                59.48n ± 0%    59.45n ± 0%        ~ (p=0.280 n=15)
FmtFprintfInt                   62.04n ± 0%    61.11n ± 0%   -1.50% (p=0.000 n=15)
FmtFprintfIntInt                97.74n ± 0%    96.56n ± 0%   -1.21% (p=0.000 n=15)
FmtFprintfPrefixedInt           116.7n ± 0%    116.6n ± 0%        ~ (p=0.975 n=15)
FmtFprintfFloat                 204.5n ± 0%    203.2n ± 0%   -0.64% (p=0.000 n=15)
FmtManyArgs                     456.2n ± 0%    454.6n ± 0%   -0.35% (p=0.000 n=15)
GobDecode                       7.142m ± 1%    6.960m ± 1%   -2.55% (p=0.000 n=15)
GobEncode                       8.172m ± 1%    8.081m ± 0%   -1.11% (p=0.001 n=15)
Gzip                            279.9m ± 0%    280.1m ± 0%   +0.05% (p=0.011 n=15)
Gunzip                          32.69m ± 0%    32.44m ± 0%   -0.79% (p=0.000 n=15)
HTTPClientServer                53.94µ ± 0%    53.68µ ± 0%   -0.48% (p=0.000 n=15)
JSONEncode                      9.297m ± 0%    9.110m ± 0%   -2.01% (p=0.000 n=15)
JSONDecode                      47.21m ± 0%    47.99m ± 2%   +1.66% (p=0.000 n=15)
Mandelbrot200                   4.601m ± 0%    4.606m ± 0%   +0.11% (p=0.000 n=15)
GoParse                         4.666m ± 0%    4.664m ± 0%        ~ (p=0.512 n=15)
RegexpMatchEasy0_32             59.76n ± 0%    58.92n ± 0%   -1.41% (p=0.000 n=15)
RegexpMatchEasy0_1K             458.1n ± 0%    455.3n ± 0%   -0.61% (p=0.000 n=15)
RegexpMatchEasy1_32             59.36n ± 0%    60.25n ± 0%   +1.50% (p=0.000 n=15)
RegexpMatchEasy1_1K             557.7n ± 0%    566.0n ± 0%   +1.49% (p=0.000 n=15)
RegexpMatchMedium_32            803.0n ± 0%    783.7n ± 0%   -2.40% (p=0.000 n=15)
RegexpMatchMedium_1K            27.29µ ± 0%    26.54µ ± 0%   -2.76% (p=0.000 n=15)
RegexpMatchHard_32              1.388µ ± 2%    1.333µ ± 0%   -3.96% (p=0.000 n=15)
RegexpMatchHard_1K              40.91µ ± 0%    40.96µ ± 0%   +0.12% (p=0.001 n=15)
Revcomp                         474.7m ± 0%    474.2m ± 0%        ~ (p=0.325 n=15)
Template                        77.13m ± 1%    75.70m ± 1%   -1.86% (p=0.000 n=15)
TimeParse                       271.3n ± 0%    271.7n ± 0%   +0.15% (p=0.000 n=15)
TimeFormat                      289.4n ± 0%    290.8n ± 0%   +0.48% (p=0.000 n=15)
geomean                         51.70µ         51.22µ        -0.92%
Change-Id: Ib71b60226251722ae828edb1d7d8b5a27f383570 Reviewed-on: https://go-review.googlesource.com/c/go/+/616098 Reviewed-by: David Chase <drchase@google.com> Auto-Submit: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-27cmd/internal/obj/riscv: rename the iIEncodingMark Ryan
We rename it to iIIEncoding to reflect the fact that instructions that use this encoding take two integer registers. This change will allow us to add a new encoding for I-type instructions that take a single integer register. This new encoding will be used for instructions that modify CSRs. Change-Id: Ic507d0020e18f6aa72353f4d3ffcd0e868261e7a Reviewed-on: https://go-review.googlesource.com/c/go/+/614355 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Joel Sing <joel@sing.id.au> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: David Chase <drchase@google.com>
2024-09-19cmd/objdump: add loong64 disassembler supportGuoqi Chen
This CL provides vendor support for the loong64 disassembler in GNU and Plan 9 syntax.
cd $GOROOT/src/cmd
go get golang.org/x/arch@master
go mod tidy
go mod vendor
Change-Id: Ic8b888de0aa11cba58cbf559f8f69337d1d69309 Reviewed-on: https://go-review.googlesource.com/c/go/+/609015 Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-09-18cmd/internal/obj/s390x: fix m6 field encoding for VSTRC instruction on s390xSrinivas Pokala
M6 field for all extended mnemonics of VSTRC set to zero. This fixes VSTRC codegen to emit the correct encoding and adds test cases for all the extended mnemonics. Fixes #69216 Change-Id: I2a1b7fb61d6bd6444286eab56a506225c90b75e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/612315 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-09-17runtime: move getcallersp to internal/runtime/sysMichael Pratt
Moving these intrinsics to a base package enables other internal/runtime packages to use them. For #54766. Change-Id: I45a530422207dd94b5ad4eee51216c9410a84040 Reviewed-on: https://go-review.googlesource.com/c/go/+/613261 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-17runtime: move getcallerpc to internal/runtime/sysMichael Pratt
Moving these intrinsics to a base package enables other internal/runtime packages to use them. For #54766. Change-Id: I0b3eded3bb45af53e3eb5bab93e3792e6a8beb46 Reviewed-on: https://go-review.googlesource.com/c/go/+/613260 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-09-16cmd/compile/internal/typecheck: remove getcallerpc/sp builtin signatureMichael Pratt
The compiler never does a lookup of these (LookupRuntime), so they aren't needed here. getcallerpc is only used in intrinsification. getcallersp is used in intrinsification and defer handling via a direct OGETCALLERSP op. For #54766. Change-Id: I1666ceef3360a84573ae5b41b1c51d9205de7235 Reviewed-on: https://go-review.googlesource.com/c/go/+/613495 Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-13cmd/internal/obj/loong64: add support for instructions FSCALEB{F/D} and FLOGB{F/D}Xiaolin Zhao
Go asm syntax:
FSCALEB{F/D} FK, FJ, FD
FLOGB{F/D} FJ, FD
Equivalent platform assembler syntax:
fscaleb.{s/d} fd, fj, fk
flogb.{s/d} fd, fj
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html Change-Id: I6cd75c7605adbb572dae86d6470ec7cf20ce0f6c Reviewed-on: https://go-review.googlesource.com/c/go/+/612975 Auto-Submit: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Tim King <taking@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-12cmd/compile,cmd/internal/obj/riscv: always provide ANDN, ORN and XNOR for riscv64Joel Sing
The ANDN, ORN and XNOR RISC-V Zbb extension instructions are easily synthesised. Make them always available by adding support to the riscv64 assembler so that we emit either a two-instruction sequence or, when permitted by the GORISCV64 profile, a single instruction. This means that these instructions can be used unconditionally, simplifying compiler rewrite rules, codegen tests and manually written assembly. Around 180 instructions are removed from the Go binary on riscv64 when built with rva22u64. Change-Id: Ib2d90f2593a306530dc0ed08a981acde4d01be20 Reviewed-on: https://go-review.googlesource.com/c/go/+/611895 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Tim King <taking@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>