path: root/src/cmd/internal/obj
Age | Commit message | Author
2025-01-16 | cmd/internal/obj/wasm, runtime: detect wasmexport call before runtime initialization | Cherry Mui
If a wasmexport function is called from the host before the Go Wasm module is initialized, it will currently most likely fail with a bounds error, because the uninitialized SP is 0 and any SP decrement puts it out of bounds. Because at least some Wasm runtimes don't call _initialize by default, this error can be common, and the bounds error is confusing to users. Therefore, we detect this case and emit a clearer error. Fixes #71240. Updates #65199. Change-Id: I107095f08c76cdceb7781ab0304218eab7029ab6 Reviewed-on: https://go-review.googlesource.com/c/go/+/643115 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-12-12 | cmd/internal/obj: disallow linknamed access to builtin symbols | Cherry Mui
Currently, a symbol reference is counted as a reference to a builtin symbol if the name matches a builtin. Usually builtin references are generated by the compiler, but one could write one manually with linkname. Since the list of builtin functions is subject to change from time to time, we don't want users to depend on their names. So we don't count a linknamed reference as a builtin reference; instead, we count it as a named reference, so that it is checked by the linker. Change-Id: Id3543295185c6bbd73a8cff82afb8f9cb4fd6f71 Reviewed-on: https://go-review.googlesource.com/c/go/+/635755 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
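The classification rule described in this commit can be sketched as follows. This is a minimal illustration, not the code in cmd/internal/obj: the `refKind` type, the `classify` function, and the contents of the `builtins` table are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// refKind is a hypothetical classification of a symbol reference.
type refKind int

const (
	refNamed   refKind = iota // resolved by name, so the linker checks it
	refBuiltin                // resolved through the builtin table
)

// builtins stands in for the toolchain's builtin list.
// The entries are illustrative only.
var builtins = map[string]bool{
	"runtime.newobject":   true,
	"runtime.panicdivide": true,
}

// classify treats a reference as a builtin reference only when the name
// matches a builtin AND the reference was not introduced via linkname.
func classify(name string, linknamed bool) refKind {
	if builtins[name] && !linknamed {
		return refBuiltin
	}
	return refNamed
}

func main() {
	fmt.Println(classify("runtime.newobject", false) == refBuiltin) // compiler-generated
	fmt.Println(classify("runtime.newobject", true) == refNamed)    // written with linkname
}
```

The point of the design is that a linknamed reference falls through to the ordinary named path, where the linker's existing checks apply.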
2024-11-27 | cmd/internal/obj: handle static assembly symbols correctly in FIPS check | Russ Cox
Static symbols don't have the package prefix, so we need to identify them specially. Change-Id: Iaa0456de802478f6a257164e9703f18f8dc7eb50 Reviewed-on: https://go-review.googlesource.com/c/go/+/631975 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-22 | cmd/compile, cmd/link: FIPS fixes for large programs | Russ Cox
1. In cmd/internal/obj, only apply the exclusion list to data symbols. Text symbols are always fine since they can use PC-relative relocations. 2. In cmd/link, only skip trampolines for text symbols in the same package with the same type. Before, all text symbols had type STEXT, but now that there are different sections of STEXT, we can only rely on symbols in the same package in the same section being close enough not to need trampolines. Fixes #70379. Change-Id: Ifad2bdd6001ad3b5b23e641127554e9ec374715e Reviewed-on: https://go-review.googlesource.com/c/go/+/631036 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-21 | cmd/internal/obj/riscv: rework instruction encoding information | Joel Sing
Currently, instruction encoding is a slice of encoding types, which is indexed by a masked version of the riscv64 opcode. Additional information about some instructions (for example, if an instruction has a ternary form and if there is an immediate form for an instruction) is manually specified in other parts of the assembler code. Rework the instruction encoding information so that we use a table driven form, providing additional data for each instruction where relevant. This means that we can simplify other parts of the code by simply looking up the instruction data and reusing minimal logic. Change-Id: I7b3b6c61a4868647edf28bd7dbae2150e043ae00 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Reviewed-on: https://go-review.googlesource.com/c/go/+/622535 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
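As a rough illustration of the table-driven form described above, the sketch below keeps the per-instruction facts (encoding type, immediate form, ternary form) in a single table that other code can look up. The type name `instructionData`, its fields, the use of strings for opcodes, and the `immediateForm` helper are assumptions made for this example; they are not taken from the assembler source.

```go
package main

import "fmt"

// instructionData gathers everything known about one instruction.
// Field names are hypothetical.
type instructionData struct {
	encoding string // encoding type, e.g. "R" or "U"
	immForm  string // immediate form of the instruction, if any
	ternary  bool   // whether a three-operand form exists
}

// instructions is the single source of truth; callers look data up
// here instead of hard-coding per-instruction special cases.
var instructions = map[string]instructionData{
	"ADD": {encoding: "R", immForm: "ADDI", ternary: true},
	"SUB": {encoding: "R", ternary: true},
	"LUI": {encoding: "U"},
}

// immediateForm reports the immediate form of op, if the table has one.
func immediateForm(op string) (string, bool) {
	d, ok := instructions[op]
	if !ok || d.immForm == "" {
		return "", false
	}
	return d.immForm, true
}

func main() {
	if imm, ok := immediateForm("ADD"); ok {
		fmt.Println("ADD has immediate form", imm)
	}
	_, ok := immediateForm("SUB")
	fmt.Println("SUB has immediate form:", ok)
}
```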
2024-11-21 | all: fix some function names and typos in comment | cuishuang
Change-Id: I07e7c8eaa5bd4bac0d576b2f2f4cd3f81b0b77a4 Reviewed-on: https://go-review.googlesource.com/c/go/+/630055 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Russ Cox <rsc@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-11-21 | cmd/internal/obj/ppc64: support for decimal floating point instructions | Jayanth Krishnamurthy jayanth.krishnamurthy@ibm.com
1. Support for decimal arithmetic quad instructions of powerpc: DADDQ, DSUBQ, DMULQ and DDIVQ. 2. Support for decimal compare ordered, unordered, quad instructions of powerpc: DCMPU, DCMPO, DCMPUQ, and DCMPOQ. Change-Id: I32a15a7f0a127b022b1f43d376e0ab0f7e9dd108 Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10 Reviewed-on: https://go-review.googlesource.com/c/go/+/623036 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Paul Murphy <murp@ibm.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-20 | all: rename crypto/internal/fips to crypto/internal/fips140 | Russ Cox
Sometimes we've used the 140 suffix (GOFIPS140, crypto/fips140) and sometimes not (crypto/internal/fips, cmd/go/internal/fips). Use it always, to avoid having to remember which is which. Also, there are other FIPS standards, like AES (FIPS 197), SHA-2 (FIPS 180), and so on, which have nothing to do with FIPS 140. Best to be clear. For #70123. Change-Id: I33b29dabd9e8b2703d2af25e428f88bc81c7c307 Reviewed-on: https://go-review.googlesource.com/c/go/+/630115 Reviewed-by: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Roland Shoemaker <roland@golang.org>
2024-11-19 | cmd/internal/obj/arm64: recognize FIPS static temps as unaligned | Russ Cox
Code like x := [12]byte{1,2,3,4,5,6,7,8,9,10,11,12} stores x in a pair of registers and uses MOVD/MOVWU to load the values from RODATA. The code generator needs to understand not to use the aligned PC-relative relocation for that sequence. In non-FIPS modes, more statictemp optimizations can be applied and this problematic sequence doesn't happen. Fix the decision about whether to assume alignment to match the code used by the linker when deciding what to align. Fixes the linker failure in CL 626437 patch set 5. Change-Id: Iedad862c6faee758d4a2c5120cab2d329265b134 Reviewed-on: https://go-review.googlesource.com/c/go/+/628835 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: Russ Cox <rsc@golang.org>
2024-11-18 | cmd/internal/obj: exclude external test packages from FIPS scope | Russ Cox
Excluding external test packages allows them to use //go:embed, which requires data relocations in data. (Obviously the external test code is testing the FIPS module, not part of it, so this is reasonable.) Change-Id: I4bae71320ccb5faf718c045540a9ba6dd93e378f Reviewed-on: https://go-review.googlesource.com/c/go/+/628735 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-15 | crypto/internal/fips/check: enable windows/arm64 | Filippo Valsorda
Looks like it works. Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64 Change-Id: I4914d5076eccaf1dd850a148070f179edf291c40 Reviewed-on: https://go-review.googlesource.com/c/go/+/627958 Reviewed-by: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Filippo Valsorda <filippo@golang.org>
2024-11-14 | cmd/internal/obj/fips: mark R_ADDRPOWR_GOT as a pcrel relocation | Paul E. Murphy
It's actually a TOC relative relocation, but those are also accepted as pcrel relocations here too. This fixes compilation on GOPPC64 <= power9. Change-Id: I235125a76f59ab26c6c753540cfaeb398f9c105d Reviewed-on: https://go-review.googlesource.com/c/go/+/628157 Auto-Submit: Paul Murphy <murp@ibm.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-14 | all: enable FIPS verification code | Russ Cox
Previous CLs committed changes to cmd/compile, cmd/link, and crypto/internal/fips/check behind boolean flags. Turn those flags on, to enable the CLs. This is a separate, trivial CL for easier rollback. For #69536. Change-Id: I68206bae0b7d7ad5c8758267d1a2e68853b63644 Reviewed-on: https://go-review.googlesource.com/c/go/+/626000 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-13 | cmd/internal/obj/wasm: correct return PC for frameless wasmexport wrappers | Cherry Mui
For a wasmexport wrapper, we generate a call to the actual exported Go function, and use the wrapper function's PC+1 as the (fake) return address. This address is not used for returning, which is handled by the Wasm call stack. It is used for stack unwinding, and PC+1 makes it past the prologue and therefore has the right SP delta. But if the function has no arguments and results, the wrapper is frameless, with no prologue, and PC+1 doesn't exist. This causes the unwinder to fail. In this case, we put PC+0, which also has the correct SP delta (0). Fixes #69584. Change-Id: Ic047a6e62100db540b5099cc5a56a1d0f16d58b9 Reviewed-on: https://go-review.googlesource.com/c/go/+/624000 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
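The choice of fake return address can be sketched as a single decision. This is only an illustration of the rule stated in the commit message; the function name `fakeReturnPC` and its parameters are hypothetical and do not appear in the assembler.

```go
package main

import "fmt"

// fakeReturnPC picks the address recorded as the wrapper's return PC.
// It is used only by the stack unwinder, never for an actual return.
//
// A wrapper with a frame has a prologue, so entry+1 lies past it and
// carries the right SP delta. A frameless wrapper has no prologue and
// entry+1 does not exist, so entry+0 is used (SP delta 0).
func fakeReturnPC(entryPC uint64, frameless bool) uint64 {
	if frameless {
		return entryPC
	}
	return entryPC + 1
}

func main() {
	fmt.Println(fakeReturnPC(100, false)) // wrapper with arguments or results
	fmt.Println(fakeReturnPC(100, true))  // wrapper with no arguments and no results
}
```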
2024-11-13 | cmd/internal/obj: add tool to generate Cnames string | chenguoqi
Add cmd/internal/obj/mkcnames.go to do the generation and update the architecture packages to use it to maintain the Cnames tables. Currently works correctly on arm64, loong64, mips, ppc64, and s390x. Change-Id: I5220b0ba6d8a8a5fcc4d9774731eb2af69a671af Reviewed-on: https://go-review.googlesource.com/c/go/+/622256 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Commit-Queue: Ian Lance Taylor <iant@golang.org>
2024-11-13 | cmd/compile, cmd/link: add FIPS verification support | Russ Cox
For FIPS init-time code+data verification, we need to arrange to put the FIPS symbols into contiguous regions of the executable and then record those sections along with the expected checksum. The cmd/internal/obj changes identify the FIPS symbols and give them distinguished types, which the linker then places in contiguous regions. The linker also writes out information to use at run time to find the FIPS sections, along with the expected hash. See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go for more details. The code is disabled in this commit. CLs 625998 and 625999 add tests. CL 626000 enables the code. For #69536. Change-Id: I48da6db94bc0bea7428c43d4abcf999527bccfcd Reviewed-on: https://go-review.googlesource.com/c/go/+/625997 Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-13 | cmd/internal/obj/loong64: add support of VMOVQ and XVMOVQ | Guoqi Chen
This CL refers to the implementation of ARM64 and adds support for the following types of SIMD instructions: 1. Move general-purpose register to a vector element, e.g.: VMOVQ Rj, <Vd>.<T>[index] <T> can have the following values: B, H, W, V 2. Move vector element to general-purpose register, e.g.: VMOVQ <Vj>.<T>[index], Rd <T> can have the following values: B, BU, H, HU, W, WU, VU 3. Duplicate general-purpose register to vector, e.g.: VMOVQ Rj, <Vd>.<T> <T> can have the following values: B16, H8, W4, V2, B32, H16, W8, V4 4. Move vector, e.g.: XVMOVQ Xj, <Xd>.<T> <T> can have the following values: B16, H8, W4, V2, Q1 5. Move vector element to scalar, e.g.: XVMOVQ Xj, <Xd>.<T>[index] XVMOVQ Xj.<T>[index], Xd <T> can have the following values: W, V 6. Move vector element to vector register, e.g.: VMOVQ <Vn>.<T>[index], Vn.<T> <T> can have the following values: B, H, W, V This CL only adds syntax and doesn't break any assembly that already exists. Change-Id: I7656efac6def54da6c5ae182f39c2a21bfdf92bb Reviewed-on: https://go-review.googlesource.com/c/go/+/616258 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-11 | cmd/compile: allow more types for wasmimport/wasmexport parameters and results | Cherry Mui
As proposed on #66984, this CL allows more types to be used as wasmimport/wasmexport function parameters and results. Specifically, bool, string, and uintptr are now allowed, as are pointer types that point to allowed element types. Allowed element types include sized integer and floating-point types (including small integer types like uint8, which are not directly allowed as a parameter type), bool, arrays whose element type is allowed, and structs whose fields are all of allowed element types and that also include a structs.HostLayout field. For #66984. Change-Id: Ie5452a1eda21c089780dfb4d4246de6008655c84 Reviewed-on: https://go-review.googlesource.com/c/go/+/626615 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-11 | cmd/internal/obj/loong64: switch Lookup function call to ABIInternal mode | Guoqi Chen
CL 521790 has experimentally enabled RegABI support on Loong64, so it is possible to switch the Lookup function call to ABIInternal mode. Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2 Reviewed-on: https://go-review.googlesource.com/c/go/+/544435 Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-08 | cmd/asm: use single-instruction forms for all loong64 sign and zero extensions | Xiaolin Zhao
8-bit and 16-bit sign extensions and 32-bit zero extensions were realized with left and right shifts before this change. We now support assembling EXTWB, EXTWH and BSTRPICKV, so all three can be done with a single insn respectively. This patch is a copy of CL 479496. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615 Reviewed-on: https://go-review.googlesource.com/c/go/+/626195 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
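The equivalence between the old shift-pair sequences and a single extension can be checked in plain Go. The sketch below models only the arithmetic on 64-bit values; the function names are made up for the example, and it says nothing about how the assembler selects EXTWB, EXTWH, or BSTRPICKV.

```go
package main

import "fmt"

// Shift-pair forms, as used before the change: shift left to discard
// the high bits, then shift right (arithmetic for sign extension,
// logical for zero extension) to bring the value back.
func signExt8Shifts(x uint64) int64   { return int64(x<<56) >> 56 }
func signExt16Shifts(x uint64) int64  { return int64(x<<48) >> 48 }
func zeroExt32Shifts(x uint64) uint64 { return x << 32 >> 32 }

// Single-operation forms, which is what one extension instruction computes.
func signExt8(x uint64) int64   { return int64(int8(x)) }
func signExt16(x uint64) int64  { return int64(int16(x)) }
func zeroExt32(x uint64) uint64 { return uint64(uint32(x)) }

func main() {
	x := uint64(0x123456789abcdef0)
	fmt.Println(signExt8Shifts(x) == signExt8(x))
	fmt.Println(signExt16Shifts(x) == signExt16(x))
	fmt.Println(zeroExt32Shifts(x) == zeroExt32(x))
}
```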
2024-11-08 | cmd/internal/obj/loong64: add {V,XV}PCNT.{B,H,W,D} instructions support | Guoqi Chen
Go asm syntax: VPCNT{B,H,W,V} VJ, VD; XVPCNT{B,H,W,V} XJ, XD. Equivalent platform assembler syntax: vpcnt.{b,h,w,d} vd, vj; xvpcnt.{b,h,w,d} xd, xj. Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098 Reviewed-on: https://go-review.googlesource.com/c/go/+/620736 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
2024-11-07 | cmd/internal/objabi, cmd/link: introduce SymKind helper methods | Russ Cox
These will be necessary when we start using the new FIPS symbols. Split into a separate CL so that these refactoring changes can be tested separate from any FIPS-specific changes. Passes golang.org/x/tools/cmd/toolstash/buildall. Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc Reviewed-on: https://go-review.googlesource.com/c/go/+/625996 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-11-07 | cmd/internal/obj: replace obj.Addrel func with LSym.AddRel method | Russ Cox
The old API was to do: r := obj.Addrel(sym); r.Type = this; r.Off = that; etc. The new API is: sym.AddRel(ctxt, obj.Reloc{Type: this, Off: that, etc}). This new API is more idiomatic and avoids ever having relocations that are only partially constructed. Most importantly, it sets up for sym.AddRel being able to check relocation validity in the future. (Passing ctxt is for use in validity checking.) Passes golang.org/x/tools/cmd/toolstash/buildall. Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972 Reviewed-on: https://go-review.googlesource.com/c/go/+/625616 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
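The shape of the new API can be sketched with stand-in types. The `Reloc`, `LSym`, and `Link` types below are simplified, hypothetical mirrors written for this example; they are not the real definitions in cmd/internal/obj, and the `Siz` field and the numeric relocation type are assumptions.

```go
package main

import "fmt"

// Simplified stand-ins for the real types.
type Reloc struct {
	Type int
	Off  int32
	Siz  uint8
}

type Link struct{}

type LSym struct {
	R []Reloc
}

// AddRel appends a fully constructed relocation. Because the caller
// passes a complete value, the symbol never holds a half-filled entry,
// and this is the one place where validity checks (using ctxt) could
// later be added.
func (s *LSym) AddRel(ctxt *Link, r Reloc) {
	s.R = append(s.R, r)
}

func main() {
	ctxt := &Link{}
	sym := &LSym{}
	sym.AddRel(ctxt, Reloc{Type: 1, Off: 8, Siz: 4})
	fmt.Printf("%+v\n", sym.R[0])
}
```

Compared with returning a pointer to a zeroed relocation for the caller to fill in, passing a composite literal keeps construction in one expression.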
2024-11-07 | cmd/internal/obj/loong64: add {V,XV}SEQ.{B,H,W,D} instructions support | Guoqi Chen
Go asm syntax: VSEQ{B,H,W,V} VJ, VK, VD; XVSEQ{B,H,W,V} XJ, XK, XD. Equivalent platform assembler syntax: vseq.{b,h,w,d} vd, vj, vk; xvseq.{b,h,w,d} xd, xj, xk. Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f Reviewed-on: https://go-review.googlesource.com/c/go/+/616076 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 | cmd/internal/obj/loong64: add {V,XV}LD/{V,XV}LDX/{V,XV}ST/{V,XV}STX instructions support | Guoqi Chen
This CL adds primitive asm support for Loong64 LSX [1] and LASX [2] by introducing new sets of registers, V0-V31 (C_VREG) and X0-X31 (C_XREG), and 8 new instructions. On Loong64, VLD, XVLD, VST, and XVST implement vector memory access operations using an immediate offset. VLDX, XVLDX, VSTX, and XVSTX implement vector memory access operations using a register offset.
Go asm syntax:
  VMOVQ n(RJ), RV (128bit vector load)
  XVMOVQ n(RJ), RX (256bit vector load)
  VMOVQ RV, n(RJ) (128bit vector store)
  XVMOVQ RX, n(RJ) (256bit vector store)
  VMOVQ (RJ)(RK), RV (128bit vector load)
  XVMOVQ (RJ)(RK), RX (256bit vector load)
  VMOVQ RV, (RJ)(RK) (128bit vector store)
  XVMOVQ RX, (RJ)(RK) (256bit vector store)
Equivalent platform assembler syntax:
  vld vd, rj, si12
  xvld xd, rj, si12
  vst vd, rj, si12
  xvst xd, rj, si12
  vldx vd, rj, rk
  xvldx xd, rj, rk
  vstx vd, rj, rk
  xvstx xd, rj, rk
[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit
Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003 Reviewed-on: https://go-review.googlesource.com/c/go/+/616075 Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-11-07 | cmd/compiler,internal/runtime/atomic: optimize Store{64,32,8} on loong64 | Guoqi Chen
On Loong64, the AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1] are new instructions added by LA664 (Loongson 3A6000) and later microarchitectures. Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according to the CPU feature. The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to ensure consistency in the order of Store/Load [2]. LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses the R0 register, and there is no performance difference between the implementations of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
                | bench.old    | bench.new                            |
                | sec/op       | sec/op        vs base                |
AtomicStore64     19.61n ± 0%    13.61n ± 0%   -30.60% (p=0.000 n=20)
AtomicStore64-2   19.61n ± 0%    13.61n ± 0%   -30.57% (p=0.000 n=20)
AtomicStore64-4   19.62n ± 0%    13.61n ± 0%   -30.63% (p=0.000 n=20)
AtomicStore       19.61n ± 0%    13.61n ± 0%   -30.60% (p=0.000 n=20)
AtomicStore-2     19.62n ± 0%    13.61n ± 0%   -30.63% (p=0.000 n=20)
AtomicStore-4     19.62n ± 0%    13.62n ± 0%   -30.58% (p=0.000 n=20)
AtomicStore8      19.61n ± 0%    20.01n ± 0%    +2.04% (p=0.000 n=20)
AtomicStore8-2    19.62n ± 0%    20.02n ± 0%    +2.01% (p=0.000 n=20)
AtomicStore8-4    19.61n ± 0%    20.02n ± 0%    +2.09% (p=0.000 n=20)
geomean           19.61n         15.48n        -21.08%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
                | bench.old    | bench.new                            |
                | sec/op       | sec/op        vs base                |
AtomicStore64     18.03n ± 0%    12.81n ± 0%   -28.93% (p=0.000 n=20)
AtomicStore64-2   18.02n ± 0%    12.81n ± 0%   -28.91% (p=0.000 n=20)
AtomicStore64-4   18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
AtomicStore       18.02n ± 0%    12.81n ± 0%   -28.91% (p=0.000 n=20)
AtomicStore-2     18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
AtomicStore-4     18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
AtomicStore8      18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
AtomicStore8-2    18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
AtomicStore8-4    18.01n ± 0%    12.81n ± 0%   -28.87% (p=0.000 n=20)
geomean           18.01n         12.81n        -28.89%
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md
Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097 Reviewed-on: https://go-review.googlesource.com/c/go/+/581356 Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2024-11-02 | cmd/asm: add support for loong64 FMA instructions | Xiaolin Zhao
Add support for assembling the FMA instructions present in the LoongArch base ISA v1.00. This requires adding a new instruction format and making use of a third source operand, which is put in RestArgs[0]. The single-precision instructions have the `.s` suffix in their official mnemonics, and similar Go asm instructions all have the `S` suffix on the other architectures that support FMA, but in this change they instead have the `F` suffix in Go asm because loong64 currently follows the mips backends in naming convention. This could be changed later because FMA is fully expressible in pure Go, making it unlikely that such assembly has to be hand-written in the wild. Example mapping between actual encoding and Go asm syntax: fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd (prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd); fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd (prog.From = fa, prog.Reg = fk and prog.To = fd). This patch is a copy of CL 477716. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090 Reviewed-on: https://go-review.googlesource.com/c/go/+/623877 Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
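The operand reordering in the two example mappings can be sketched as a small string transformation. The helper `goAsmFMADDF` is hypothetical and exists only to illustrate the mapping quoted in the commit message; it is not part of cmd/asm.

```go
package main

import "fmt"

// goAsmFMADDF renders the Go asm form of the platform instruction
//
//	fmadd.s fd, fj, fk, fa
//
// Go asm reverses the operand order and puts the destination last.
// When fj is the same register as fd, the three-operand short form
// is used, matching the second example in the commit message.
func goAsmFMADDF(fd, fj, fk, fa string) string {
	if fj == fd {
		return fmt.Sprintf("FMADDF %s, %s, %s", fa, fk, fd)
	}
	return fmt.Sprintf("FMADDF %s, %s, %s, %s", fa, fk, fj, fd)
}

func main() {
	fmt.Println(goAsmFMADDF("F0", "F1", "F2", "F3")) // four-operand form
	fmt.Println(goAsmFMADDF("F0", "F0", "F2", "F3")) // short form, fj == fd
}
```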
2024-11-01 | cmd/asm: add support for loong64 CRC32 instructions | Xiaolin Zhao
This patch is a copy of CL 478595. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/623875 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-01 | cmd/asm: add support for the rest of loong64 unary bitops | Xiaolin Zhao
All remaining unary bitop instructions in the LoongArch v1.00 base ISA are added with this change. While at it, add the missing W suffix to the current CLO/CLZ names. They are not used anywhere as far as we know, so no breakage is expected. Also, stop reusing SLL's instruction format for simplicity, in favor of a new but trivial instruction format case. This patch is a copy of CL 477717. Co-authored-by: WANG Xuerui <git@xen0n.name> Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0 Reviewed-on: https://go-review.googlesource.com/c/go/+/623876 Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-31 | cmd/internal/obj/arm64: make sure prologue and epilogue are pattern matched for small frames | Hao Liu
CL 379075 implemented function prologue/epilogue with STP/LDP. To fix issue #53374, CL 412474 reverted the prologue STP change for small frames, and the LDP in the epilogue was kept. The current instructions are:
prologue:
  MOVD.W R30, -offset(RSP)
  MOVD R29, -8(RSP)
epilogue:
  LDP -8(RSP), (R29, R30)
  ADD $offset, RSP, RSP
This seems a bit strange, as: 1) The prologue and epilogue are not in the same pattern (either STR-LDR or STP-LDP). 2) The Go internal ABI defines that R30 is saved at 0(RSP) and R29 is saved at -8(RSP), so we cannot use a single STP.W/LDP.P to save/restore LR and FP and adjust SP. Changing the ABI causes too much complexity, and the benefit is not that big.
This patch reverts the small frames' epilogue change in CL 379075. It converts the LDP in the epilogue to LDR-LDR. Another solution is to re-apply the STP change in the prologue, which requires fixing #53609. This seems the easier and safer solution in the meantime. The new instructions are:
prologue:
  MOVD.W R30, -offset(RSP)
  MOVD R29, -8(RSP)
epilogue:
  MOVD -8(RSP), R29
  MOVD.P offset(RSP), R30
The current pattern may cause performance issues in store-forwarding on micro-architectures like AmpereOne. Assuming a function call in the middle of such code is short enough that the stores are still around, the LDP executes and may wait longer to get its results from separate stores in the store buffers than it would from a single STP. Store-forwarding aims to improve the efficiency of the processor by allowing data to be forwarded directly from a store operation to a subsequent load operation when certain conditions are met. See the paper "Memory Barriers: a Hardware View for Software Hackers" (chapter 3.2: Store Forwarding).
The performance of the following ARM64 Linux servers was tested: 1) AmpereOne (ARM v8.6+) from Ampere Computing. 2) Ampere Altra (ARM Neoverse N1) from Ampere Computing. 3) Graviton2 (ARM Neoverse N1) from AWS.
The effect of this change depends the hardware implementation of store-forwarding. It can obviously improve AmpereOne, especially for small functions that are frequently called and returned quickly. E.g., JSON Marshal/Unmarshal benchmarks on AmpereOne: goos: linux goarch: arm64 pkg: encoding/json │ ampere-one.base │ ampere-one.new │ │ sec/op │ sec/op vs base │ CodeMarshal-8 882.1µ ± 1% 779.6µ ± 1% -11.62% (p=0.000 n=10) CodeMarshalError-8 961.5µ ± 0% 855.7µ ± 1% -11.01% (p=0.000 n=10) MarshalBytes/32-8 207.6n ± 1% 187.8n ± 0% -9.52% (p=0.000 n=10) MarshalBytes/256-8 501.0n ± 1% 482.6n ± 1% -3.68% (p=0.000 n=10) MarshalBytes/4096-8 5.336µ ± 1% 5.074µ ± 1% -4.92% (p=0.000 n=10) MarshalBytesError/32-8 242.3µ ± 2% 205.7µ ± 3% -15.08% (p=0.000 n=10) MarshalBytesError/256-8 242.4µ ± 1% 205.2µ ± 2% -15.35% (p=0.000 n=10) MarshalBytesError/4096-8 247.9µ ± 0% 210.1µ ± 1% -15.24% (p=0.000 n=10) MarshalMap-8 150.8n ± 1% 145.7n ± 0% -3.35% (p=0.000 n=10) EncodeMarshaler-8 50.30n ± 26% 54.48n ± 6% ~ (p=0.739 n=10) CodeUnmarshal-8 4.796m ± 2% 4.055m ± 1% -15.45% (p=0.000 n=10) CodeUnmarshalReuse-8 4.260m ± 1% 3.496m ± 1% -17.94% (p=0.000 n=10) UnmarshalString-8 73.89n ± 1% 65.83n ± 1% -10.91% (p=0.000 n=10) UnmarshalFloat64-8 60.63n ± 1% 58.66n ± 25% ~ (p=0.143 n=10) UnmarshalInt64-8 55.62n ± 1% 53.25n ± 22% ~ (p=0.468 n=10) UnmarshalMap-8 255.3n ± 1% 230.3n ± 1% -9.77% (p=0.000 n=10) UnmarshalNumber-8 467.2n ± 1% 367.0n ± 0% -21.43% (p=0.000 n=10) geomean 6.224µ 5.605µ -9.94% Other ARM64 micro-architectures may be not affected so much by such issue. 
E.g., benchmarks on Ampere Altra and Graviton2 show slight improvements: │ altra.base │ altra.new │ │ sec/op │ sec/op vs base │ CodeMarshal-8 980.1µ ± 1% 977.3µ ± 1% ~ (p=0.912 n=10) CodeMarshalError-8 1.109m ± 3% 1.096m ± 5% ~ (p=0.971 n=10) MarshalBytes/32-8 246.8n ± 1% 245.4n ± 0% -0.55% (p=0.002 n=10) MarshalBytes/256-8 590.9n ± 1% 606.6n ± 1% +2.67% (p=0.000 n=10) MarshalBytes/4096-8 6.351µ ± 1% 6.376µ ± 1% ~ (p=0.183 n=10) MarshalBytesError/32-8 245.3µ ± 2% 246.1µ ± 2% ~ (p=0.684 n=10) MarshalBytesError/256-8 245.5µ ± 1% 248.7µ ± 2% ~ (p=0.218 n=10) MarshalBytesError/4096-8 254.2µ ± 1% 254.9µ ± 1% ~ (p=0.481 n=10) MarshalMap-8 152.7n ± 2% 151.5n ± 3% ~ (p=0.782 n=10) EncodeMarshaler-8 45.95n ± 7% 42.88n ± 5% -6.70% (p=0.014 n=10) CodeUnmarshal-8 5.121m ± 4% 5.125m ± 3% ~ (p=0.579 n=10) CodeUnmarshalReuse-8 4.616m ± 3% 4.634m ± 2% ~ (p=0.529 n=10) UnmarshalString-8 72.12n ± 2% 72.20n ± 2% ~ (p=0.912 n=10) UnmarshalFloat64-8 64.44n ± 5% 63.20n ± 4% ~ (p=0.393 n=10) UnmarshalInt64-8 61.49n ± 2% 58.14n ± 4% -5.45% (p=0.002 n=10) UnmarshalMap-8 263.6n ± 2% 266.2n ± 1% ~ (p=0.196 n=10) UnmarshalNumber-8 464.7n ± 1% 464.0n ± 0% ~ (p=0.566 n=10) geomean 6.617µ 6.575µ -0.64% │ graviton2.base │ graviton2.new │ │ sec/op │ sec/op vs base │ CodeMarshal-8 1.122m ± 0% 1.118m ± 1% ~ (p=0.052 n=10) CodeMarshalError-8 1.216m ± 1% 1.214m ± 0% ~ (p=0.631 n=10) MarshalBytes/32-8 289.9n ± 0% 280.8n ± 0% -3.17% (p=0.000 n=10) MarshalBytes/256-8 675.9n ± 0% 664.7n ± 0% -1.66% (p=0.000 n=10) MarshalBytes/4096-8 6.884µ ± 0% 6.885µ ± 0% ~ (p=0.565 n=10) MarshalBytesError/32-8 293.1µ ± 2% 288.9µ ± 2% ~ (p=0.123 n=10) MarshalBytesError/256-8 296.0µ ± 3% 289.0µ ± 1% -2.36% (p=0.019 n=10) MarshalBytesError/4096-8 300.4µ ± 1% 295.6µ ± 0% -1.60% (p=0.000 n=10) MarshalMap-8 168.8n ± 1% 168.8n ± 1% ~ (p=1.000 n=10) EncodeMarshaler-8 53.77n ± 8% 50.05n ± 12% ~ (p=0.579 n=10) CodeUnmarshal-8 5.875m ± 2% 5.882m ± 1% ~ (p=0.796 n=10) CodeUnmarshalReuse-8 5.383m ± 1% 5.366m ± 0% ~ (p=0.631 n=10) 
UnmarshalString-8 74.59n ± 1% 73.99n ± 0% -0.80% (p=0.001 n=10) UnmarshalFloat64-8 68.52n ± 7% 64.19n ± 18% ~ (p=0.868 n=10) UnmarshalInt64-8 65.32n ± 13% 62.24n ± 8% ~ (p=0.138 n=10) UnmarshalMap-8 290.1n ± 0% 291.3n ± 0% +0.43% (p=0.010 n=10) UnmarshalNumber-8 514.4n ± 0% 499.4n ± 0% -2.93% (p=0.000 n=10) geomean 7.459µ 7.317µ -1.91% Change-Id: If27386fc5f514b76bdaf2012c2ce86cc65f7ca5b Reviewed-on: https://go-review.googlesource.com/c/go/+/621775 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30 | cmd/internal/obj/riscv: update RISC-V instruction table | Joel Sing
Regenerate RISC-V instruction table from the riscv-opcodes repository, due to various changes and shuffling upstream. This has been changed to remove pseudo-instructions, since Go only needs the instruction encodings and including the pseudo-instructions is creating unnecessary complications (for example, the inclusion of ANOP and ARET, as well as strangely named aliases such as AJALPSEUDO/AJALRPSEUDO). Remove pseudo-instructions that are not currently supported by the assembler and add specific handling for RDCYCLE, RDTIME and RDINSTRET, which were previously implemented via the instruction encodings. Change-Id: I78be4506ba6b627eba1f321406081a63bab5b2e6 Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64 Reviewed-on: https://go-review.googlesource.com/c/go/+/616116 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-29cmd/internal/obj/ppc64: support for extended mnemonics of BCJayanth Krishnamurthy
Add assembler support for BGT, BLT, BLE, BGE, BNE, BVS, BVC, and BEQ. This simplifies the use of BC constructs such as:
    BC 12, 30, LR     <=> BEQ CR7, LR
    BC 12, 2, LR      <=> BEQ CR0, LR
    BC 12, 0, target  <=> BLT CR0, target
    BC 12, 2, target  <=> BEQ CR0, target
    BC 12, 5, target  <=> BGT CR1, target
    BC 12, 30, target <=> BEQ CR7, target
    BC 4, 6, target   <=> BNE CR1, target
    BC 4, 5, target   <=> BLE CR1, target
Also clean up code based on the above additions. Change-Id: I02fdb212b6fe3f85ce447e05f4d42118c9ce63b5 Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10 Reviewed-on: https://go-review.googlesource.com/c/go/+/612395 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
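The BC equivalences listed in this commit follow a regular pattern, which the sketch below tries to make explicit. It is an illustration only, not the assembler's code: the helper name extendedMnemonic is hypothetical, and the BGE, BVS and BVC rows are an assumption based on the usual PowerPC layout of a CR field (LT, GT, EQ, SO), since the commit message gives no BC example for them.

```go
package main

import "fmt"

// extendedMnemonic is a hypothetical helper that maps the BO and BI
// operands of a BC instruction to an extended mnemonic plus CR field.
// BI selects one of 32 condition register bits: BI/4 is the CR field and
// BI%4 is the bit within that field.
func extendedMnemonic(bo, bi int) (string, bool) {
	// Bit order inside a CR field: LT, GT, EQ, SO.
	onTrue := [4]string{"BLT", "BGT", "BEQ", "BVS"}
	onFalse := [4]string{"BGE", "BLE", "BNE", "BVC"}
	if bi < 0 || bi > 31 {
		return "", false
	}
	field, bit := bi/4, bi%4
	switch bo {
	case 12: // branch if the selected CR bit is set
		return fmt.Sprintf("%s CR%d", onTrue[bit], field), true
	case 4: // branch if the selected CR bit is clear
		return fmt.Sprintf("%s CR%d", onFalse[bit], field), true
	}
	// Other BO values (for example CTR-based forms) are out of scope here.
	return "", false
}

func main() {
	for _, c := range [][2]int{{12, 30}, {12, 2}, {12, 0}, {12, 5}, {4, 6}, {4, 5}} {
		m, _ := extendedMnemonic(c[0], c[1])
		fmt.Printf("BC %d, %d <=> %s\n", c[0], c[1], m)
	}
}
```

Only BO values 12 and 4 are handled, matching the examples in the commit message; anything else is reported as unsupported rather than guessed.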
2024-10-29cmd/internal/obj/ppc64: add double-decimal arithmetic instructionsJayanth Krishnamurthy
Assembler support provided for the instructions DADD, DSUB, DMUL, and DDIV. Change-Id: Ic12ba02ce453cb1ca275334ca1924fb2009da767 Reviewed-on: https://go-review.googlesource.com/c/go/+/620856 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-10-25cmd/internal/obj: add prologue_end DWARF stmt for riscv64Lin Runze
This patch adds a prologue_end statement to the DWARF info for riscv64, which the Delve debugger uses to skip the stack-split prologue. Change-Id: I4e5d9c26202385f65b3118b16f53f66de9d327f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/620295 Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-25cmd/internal/obj/riscv: update references to RISC-V specificationJoel Sing
Update references to version 20240411 of the RISC-V specifications. Reorder and regroup instructions to maintain ordering. Change-Id: Iea2a5d22ad677e04948e9a9325986ad301c03f35 Reviewed-on: https://go-review.googlesource.com/c/go/+/616115 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: David Chase <drchase@google.com>
2024-10-24cmd/internal/obj,cmd/asm: add vector registers to riscv64 assemblerJoel Sing
This adds V0 through V31 as vector registers, which are available on CPUs that support the V extension. Change-Id: Ibffee3f9a2cf1d062638715b3744431d72d451ce Reviewed-on: https://go-review.googlesource.com/c/go/+/595404 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
2024-10-24cmd/internal/obj/riscv: add vector instruction encodingsJoel Sing
Regenerate the riscv instruction encoding table with the V extension enabled. Add constants and names for the resulting 375 instructions. Change-Id: Icce688493aeb1e9880fb76a0618643f57e481273 Reviewed-on: https://go-review.googlesource.com/c/go/+/595403 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-23cmd/asm: add support for LDREXB/STREXBMauri de Souza Meneguzzo
These are 8-bit ARM Load/Store atomics and are available starting from armv6k. See https://developer.arm.com/documentation/dui0379/e/arm-and-thumb-instructions/strex For #69735 Change-Id: I12623433c89070495c178208ee4758b3cdefd368 GitHub-Last-Rev: d6a797836af1dccdcc6e6554725546b386d01615 GitHub-Pull-Request: golang/go#69959 Cq-Include-Trybots: luci.golang.try:gotip-linux-arm Reviewed-on: https://go-review.googlesource.com/c/go/+/621395 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-22cmd/compile: use STP/LDP around morestack on arm64Keith Randall
The spill/restore code around morestack is almost never executed, so we should make it as small as possible. Using 2-register loads/stores makes sense here. Also, the offsets from SP are pretty small so the offset almost always fits in the (smaller than a normal load/store) offset field of the instruction. Makes cmd/go 0.6% smaller. Change-Id: I8845283c1b269a259498153924428f6173bda293 Reviewed-on: https://go-review.googlesource.com/c/go/+/621556 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-10-08cmd/internal/obj: optimize the function stacksplit on loong64Xiaolin Zhao
During the stack-split check, loong64 uses the following logic:
    if SP > stackguard then goto done, else morestack
The problem is that morestack is executed far less often than done, while static branch prediction tends to favor the morestack path, which causes some branch mispredictions. Change the logic to:
    if SP <= stackguard then goto morestack, else done

Benchmarks on 3A6000:

goos: linux
goarch: loong64
pkg: fmt
cpu: Loongson-3A6000 @ 2500.00MHz
                               │  bench.old   │              bench.new              │
                               │    sec/op    │    sec/op     vs base               │
SprintfPadding                   418.3n ± 1%    387.0n ± 0%   -7.49% (p=0.000 n=20)
SprintfEmpty                     35.95n ± 0%    35.86n ± 0%   -0.25% (p=0.000 n=20)
SprintfString                    75.02n ± 1%    72.24n ± 0%   -3.71% (p=0.000 n=20)
SprintfTruncateString            165.7n ± 3%    139.9n ± 1%  -15.58% (p=0.000 n=20)
SprintfTruncateBytes             171.0n ± 0%    147.3n ± 0%  -13.83% (p=0.000 n=20)
SprintfSlowParsingPath           90.56n ± 0%    80.85n ± 0%  -10.72% (p=0.000 n=20)
SprintfQuoteString               560.2n ± 0%    509.7n ± 0%   -9.01% (p=0.000 n=20)
SprintfInt                       58.62n ± 0%    56.45n ± 0%   -3.70% (p=0.000 n=20)
SprintfIntInt                    141.7n ± 0%    122.2n ± 0%  -13.73% (p=0.000 n=20)
SprintfPrefixedInt               210.6n ± 0%    208.8n ± 0%   -0.88% (p=0.000 n=20)
SprintfFloat                     282.3n ± 0%    251.8n ± 1%  -10.80% (p=0.000 n=20)
SprintfComplex                   854.1n ± 0%    813.8n ± 0%   -4.71% (p=0.000 n=20)
SprintfBoolean                   76.32n ± 0%    71.14n ± 1%   -6.79% (p=0.000 n=20)
SprintfHexString                 218.5n ± 0%    193.4n ± 0%  -11.51% (p=0.000 n=20)
SprintfHexBytes                  321.3n ± 0%    275.0n ± 0%  -14.42% (p=0.000 n=20)
SprintfBytes                     573.5n ± 0%    553.2n ± 1%   -3.54% (p=0.000 n=20)
SprintfStringer                  501.1n ± 1%    446.6n ± 0%  -10.86% (p=0.000 n=20)
SprintfStructure                 1.793µ ± 0%    1.683µ ± 0%   -6.16% (p=0.000 n=20)
ManyArgs                         500.0n ± 0%    470.4n ± 0%   -5.92% (p=0.000 n=20)
FprintInt                        67.51n ± 0%    65.71n ± 0%   -2.66% (p=0.000 n=20)
FprintfBytes                     130.9n ± 0%    129.5n ± 1%   -1.11% (p=0.000 n=20)
FprintIntNoAlloc                 67.55n ± 0%    65.80n ± 0%   -2.58% (p=0.000 n=20)
ScanInts                         386.3µ ± 0%    346.5µ ± 0%  -10.29% (p=0.000 n=20)
ScanRecursiveInt                 25.97m ± 0%    25.93m ± 0%   -0.15% (p=0.038 n=20)
ScanRecursiveIntReaderWrapper    26.07m ± 0%    25.93m ± 0%   -0.53% (p=0.001 n=20)
geomean                          702.6n         653.7n        -6.96%

goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
                        │  bench.old   │              bench.new              │
                        │    sec/op    │    sec/op     vs base               │
BinaryTree17               7.688 ± 1%     7.724 ± 0%   +0.47% (p=0.040 n=20)
Fannkuch11                 2.670 ± 0%     2.645 ± 0%   -0.94% (p=0.000 n=20)
FmtFprintfEmpty           35.93n ± 0%    37.50n ± 0%   +4.37% (p=0.000 n=20)
FmtFprintfString          56.32n ± 0%    59.74n ± 0%   +6.08% (p=0.000 n=20)
FmtFprintfInt             64.47n ± 0%    61.26n ± 0%   -4.98% (p=0.000 n=20)
FmtFprintfIntInt         100.30n ± 0%    99.67n ± 0%   -0.63% (p=0.000 n=20)
FmtFprintfPrefixedInt     116.7n ± 0%    119.3n ± 0%   +2.23% (p=0.000 n=20)
FmtFprintfFloat           234.1n ± 0%    203.4n ± 0%  -13.11% (p=0.000 n=20)
FmtManyArgs               503.0n ± 0%    467.9n ± 0%   -6.96% (p=0.000 n=20)
GobDecode                 8.125m ± 0%    7.299m ± 0%  -10.17% (p=0.000 n=20)
GobEncode                 8.930m ± 1%    8.581m ± 1%   -3.91% (p=0.000 n=20)
Gzip                      280.0m ± 0%    279.8m ± 0%   -0.10% (p=0.000 n=20)
Gunzip                    33.30m ± 0%    32.48m ± 0%   -2.49% (p=0.000 n=20)
HTTPClientServer          55.43µ ± 0%    54.10µ ± 1%   -2.41% (p=0.000 n=20)
JSONEncode               10.086m ± 0%    9.055m ± 0%  -10.22% (p=0.000 n=20)
JSONDecode                49.37m ± 1%    46.22m ± 1%   -6.40% (p=0.000 n=20)
Mandelbrot200             4.606m ± 0%    4.606m ± 0%        ~ (p=0.280 n=20)
GoParse                   5.010m ± 0%    4.855m ± 0%   -3.09% (p=0.000 n=20)
RegexpMatchEasy0_32       59.09n ± 0%    59.32n ± 0%   +0.39% (p=0.000 n=20)
RegexpMatchEasy0_1K       455.2n ± 0%    453.8n ± 0%   -0.31% (p=0.000 n=20)
RegexpMatchEasy1_32       59.24n ± 0%    60.11n ± 0%   +1.47% (p=0.000 n=20)
RegexpMatchEasy1_1K       555.2n ± 0%    553.9n ± 0%   -0.23% (p=0.000 n=20)
RegexpMatchMedium_32      845.7n ± 0%    775.6n ± 0%   -8.28% (p=0.000 n=20)
RegexpMatchMedium_1K      26.68µ ± 0%    26.48µ ± 0%   -0.78% (p=0.000 n=20)
RegexpMatchHard_32        1.317µ ± 0%    1.326µ ± 0%   +0.68% (p=0.000 n=20)
RegexpMatchHard_1K        41.35µ ± 0%    40.95µ ± 0%   -0.97% (p=0.000 n=20)
Revcomp                   463.0m ± 0%    473.0m ± 0%   +2.15% (p=0.000 n=20)
Template                  83.80m ± 0%    76.26m ± 1%   -9.00% (p=0.000 n=20)
TimeParse                 283.3n ± 0%    260.8n ± 0%   -7.96% (p=0.000 n=20)
TimeFormat                307.2n ± 0%    290.5n ± 0%   -5.45% (p=0.000 n=20)
geomean                   53.16µ         51.67µ        -2.79%

Change-Id: Iaec2f50db18e9a2b405605f8b92af3683114ea34 Reviewed-on: https://go-review.googlesource.com/c/go/+/616035 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-01cmd/internal/obj: make asmidx error less crypticAustin Clements
It's still pretty cryptic, but at least now instead of printing
    asm: asmidx: bad address 0/2067/2068
it will print
    asm: asmidx: bad address 0/BX/SP
Change-Id: I1c73c439c94c5b9d3039728db85102a818739db9 Reviewed-on: https://go-review.googlesource.com/c/go/+/611256 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Austin Clements <austin@google.com>
2024-09-27cmd/internal/obj/loong64: mark functions with small stacks NOSPLITXiaolin Zhao
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
                        │  bench.old   │              bench.new              │
                        │    sec/op    │    sec/op     vs base               │
BinaryTree17               7.728 ± 1%     7.703 ± 1%        ~ (p=0.345 n=15)
Fannkuch11                 2.645 ± 0%     2.429 ± 0%   -8.17% (p=0.000 n=15)
FmtFprintfEmpty           35.89n ± 0%    35.83n ± 0%   -0.17% (p=0.000 n=15)
FmtFprintfString          59.48n ± 0%    59.45n ± 0%        ~ (p=0.280 n=15)
FmtFprintfInt             62.04n ± 0%    61.11n ± 0%   -1.50% (p=0.000 n=15)
FmtFprintfIntInt          97.74n ± 0%    96.56n ± 0%   -1.21% (p=0.000 n=15)
FmtFprintfPrefixedInt     116.7n ± 0%    116.6n ± 0%        ~ (p=0.975 n=15)
FmtFprintfFloat           204.5n ± 0%    203.2n ± 0%   -0.64% (p=0.000 n=15)
FmtManyArgs               456.2n ± 0%    454.6n ± 0%   -0.35% (p=0.000 n=15)
GobDecode                 7.142m ± 1%    6.960m ± 1%   -2.55% (p=0.000 n=15)
GobEncode                 8.172m ± 1%    8.081m ± 0%   -1.11% (p=0.001 n=15)
Gzip                      279.9m ± 0%    280.1m ± 0%   +0.05% (p=0.011 n=15)
Gunzip                    32.69m ± 0%    32.44m ± 0%   -0.79% (p=0.000 n=15)
HTTPClientServer          53.94µ ± 0%    53.68µ ± 0%   -0.48% (p=0.000 n=15)
JSONEncode                9.297m ± 0%    9.110m ± 0%   -2.01% (p=0.000 n=15)
JSONDecode                47.21m ± 0%    47.99m ± 2%   +1.66% (p=0.000 n=15)
Mandelbrot200             4.601m ± 0%    4.606m ± 0%   +0.11% (p=0.000 n=15)
GoParse                   4.666m ± 0%    4.664m ± 0%        ~ (p=0.512 n=15)
RegexpMatchEasy0_32       59.76n ± 0%    58.92n ± 0%   -1.41% (p=0.000 n=15)
RegexpMatchEasy0_1K       458.1n ± 0%    455.3n ± 0%   -0.61% (p=0.000 n=15)
RegexpMatchEasy1_32       59.36n ± 0%    60.25n ± 0%   +1.50% (p=0.000 n=15)
RegexpMatchEasy1_1K       557.7n ± 0%    566.0n ± 0%   +1.49% (p=0.000 n=15)
RegexpMatchMedium_32      803.0n ± 0%    783.7n ± 0%   -2.40% (p=0.000 n=15)
RegexpMatchMedium_1K      27.29µ ± 0%    26.54µ ± 0%   -2.76% (p=0.000 n=15)
RegexpMatchHard_32        1.388µ ± 2%    1.333µ ± 0%   -3.96% (p=0.000 n=15)
RegexpMatchHard_1K        40.91µ ± 0%    40.96µ ± 0%   +0.12% (p=0.001 n=15)
Revcomp                   474.7m ± 0%    474.2m ± 0%        ~ (p=0.325 n=15)
Template                  77.13m ± 1%    75.70m ± 1%   -1.86% (p=0.000 n=15)
TimeParse                 271.3n ± 0%    271.7n ± 0%   +0.15% (p=0.000 n=15)
TimeFormat                289.4n ± 0%    290.8n ± 0%   +0.48% (p=0.000 n=15)
geomean                   51.70µ         51.22µ        -0.92%

Change-Id: Ib71b60226251722ae828edb1d7d8b5a27f383570 Reviewed-on: https://go-review.googlesource.com/c/go/+/616098 Reviewed-by: David Chase <drchase@google.com> Auto-Submit: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-27cmd/internal/obj/riscv: rename the iIEncodingMark Ryan
We rename it to iIIEncoding to reflect the fact that instructions that use this encoding take two integer registers. This change will allow us to add a new encoding for I-type instructions that take a single integer register. This new encoding will be used for instructions that modify CSRs. Change-Id: Ic507d0020e18f6aa72353f4d3ffcd0e868261e7a Reviewed-on: https://go-review.googlesource.com/c/go/+/614355 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Joel Sing <joel@sing.id.au> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: David Chase <drchase@google.com>
2024-09-18cmd/internal/obj/s390x: fix m6 field encoding for VSTRC instruction on s390xSrinivas Pokala
M6 field for all extended mnemonics of VSTRC set to zero. This fixes VSTRC codegen to emit correctly and adds test cases for all the extended mnemonics. Fixes #69216 Change-Id: I2a1b7fb61d6bd6444286eab56a506225c90b75e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/612315 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
2024-09-13cmd/internal/obj/loong64: add support for instructions FSCALEB{F/D} and ↵Xiaolin Zhao
FLOGB{F/D}
Go asm syntax:
    FSCALEB{F/D} FK, FJ, FD
    FLOGB{F/D} FJ, FD
Equivalent platform assembler syntax:
    fscaleb.{s/d} fd, fj, fk
    flogb.{s/d} fd, fj
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: I6cd75c7605adbb572dae86d6470ec7cf20ce0f6c Reviewed-on: https://go-review.googlesource.com/c/go/+/612975 Auto-Submit: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Tim King <taking@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-12cmd/compile,cmd/internal/obj/riscv: always provide ANDN, ORN and XNOR for ↵Joel Sing
riscv64 The ANDN, ORN and XNOR RISC-V Zbb extension instructions are easily synthesised. Make them always available by adding support to the riscv64 assembler so that it emits either a two-instruction sequence or, when permitted by the GORISCV64 profile, a single instruction. This means that these instructions can be used unconditionally, simplifying compiler rewrite rules, codegen tests and manually written assembly. Around 180 instructions are removed from the Go binary on riscv64 when built with rva22u64. Change-Id: Ib2d90f2593a306530dc0ed08a981acde4d01be20 Reviewed-on: https://go-review.googlesource.com/c/go/+/611895 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Tim King <taking@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
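To see why these three operations are "easily synthesised", the sketch below models each one on 64-bit values, once as a single operation and once as a two-step sequence built from NOT plus AND, OR or XOR. It is a model in Go, not the assembler's code: the function names are hypothetical, and the exact two-instruction sequences the riscv64 assembler emits are an assumption, not taken from the commit.

```go
package main

import "fmt"

// Single-operation forms, following the Zbb definitions:
// andn = x & ^y, orn = x | ^y, xnor = ^(x ^ y).
func andn(x, y uint64) uint64 { return x &^ y }
func orn(x, y uint64) uint64  { return x | ^y }
func xnor(x, y uint64) uint64 { return ^(x ^ y) }

// Two-step forms, as a fallback without Zbb might compute them
// (assumed sequences: one bitwise NOT plus one base logical operation).
func andnTwoStep(x, y uint64) uint64 {
	t := ^y // step 1: invert the second operand
	return x & t
}

func ornTwoStep(x, y uint64) uint64 {
	t := ^y // step 1: invert the second operand
	return x | t
}

func xnorTwoStep(x, y uint64) uint64 {
	t := x ^ y // step 1: exclusive or
	return ^t  // step 2: invert the result
}

func main() {
	x, y := uint64(0b1100), uint64(0b1010)
	fmt.Println(andn(x, y) == andnTwoStep(x, y))
	fmt.Println(orn(x, y) == ornTwoStep(x, y))
	fmt.Println(xnor(x, y) == xnorTwoStep(x, y))
}
```

Because both forms produce the same result, callers can use the mnemonic unconditionally and leave the choice between one and two instructions to the assembler.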
2024-09-06cmd/internal/obj/arm64: Add helpers for span7 passesSebastian Nickolls
Adds helper functions for the literal pooling, large branch handling and code emission stages of the span7 assembler pass. This hides the implementation of the current assembler from the general workflow in span7 to make the implementation easier to change in future. Updates #44734 Change-Id: I8859956b23ad4faebeeff6df28051b098ef90fed Reviewed-on: https://go-review.googlesource.com/c/go/+/595755 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-06cmd/internal: use t.TempDir in testsKir Kolyshkin
Change-Id: I3d4c66793afa3769a8450e2d65093a0f9115596e Reviewed-on: https://go-review.googlesource.com/c/go/+/611043 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-09-05cmd/internal/obj/loong64: add support for instructions ANDN and ORNXiaolin Zhao
Go asm syntax:
    ANDN/ORN RK, RJ, RD
or
    ANDN/ORN RK, RD
Equivalent platform assembler syntax:
    andn/orn rd, rj, rk
or
    andn/orn rd, rd, rk
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: I6d240ecae8f9443811ca450aed3574f13f0f4a81 Reviewed-on: https://go-review.googlesource.com/c/go/+/610475 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Commit-Queue: abner chenc <chenguoqi@loongson.cn> Reviewed-by: abner chenc <chenguoqi@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: abner chenc <chenguoqi@loongson.cn>
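The operand reordering between the Go and platform syntaxes above can be sketched as a small string transformation. This is only an illustration of the mapping stated in the commit message: platformSyntax is a hypothetical helper, and register names are lowercased symbolically (RK becomes rk) rather than translated to real LoongArch register names.

```go
package main

import (
	"fmt"
	"strings"
)

// platformSyntax is a hypothetical helper that rewrites Go-assembler operand
// order (sources first, destination last) into the platform order
// (destination first) for the ANDN/ORN forms shown in the commit message.
func platformSyntax(mnemonic string, operands ...string) (string, error) {
	op := strings.ToLower(mnemonic)
	lower := make([]string, len(operands))
	for i, o := range operands {
		lower[i] = strings.ToLower(o)
	}
	switch len(lower) {
	case 3: // Go: OP RK, RJ, RD  =>  platform: op rd, rj, rk
		return fmt.Sprintf("%s %s, %s, %s", op, lower[2], lower[1], lower[0]), nil
	case 2: // Go: OP RK, RD  =>  platform: op rd, rd, rk
		return fmt.Sprintf("%s %s, %s, %s", op, lower[1], lower[1], lower[0]), nil
	}
	return "", fmt.Errorf("%s: want 2 or 3 operands, got %d", mnemonic, len(operands))
}

func main() {
	for _, args := range [][]string{{"RK", "RJ", "RD"}, {"RK", "RD"}} {
		s, err := platformSyntax("ANDN", args...)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```

In the two-operand form the destination register doubles as the first source, which is why rd appears twice in the platform syntax.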
2024-09-04cmd: use 16 bytes hash when possibleCuong Manh Le
CL 402595 changed all uses of the 16-byte hash to a 32-byte hash by using notsha256. However, since CL 454836, notsha256 is no longer necessary, so this CL reverts those changes to use a 16-byte hash via the cmd/internal/hash package. Updates #51940 Updates #64751 Change-Id: Ic015468ca4a49d0c3b1fb9fdbed93fddef3c838f Reviewed-on: https://go-review.googlesource.com/c/go/+/610598 Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>