| Age | Commit message | Author |
|
initialization
If a wasmexport function is called from the host before
initializing the Go Wasm module, currently it will likely fail
with a bounds error, because the uninitialized SP is 0, and any
SP decrement will make it out of bounds.
As at least some Wasm runtimes don't call _initialize by default,
this error can be common, and the bounds error is confusing to
users. Therefore, we detect this case and emit a clearer error.
Fixes #71240.
Updates #65199.
Change-Id: I107095f08c76cdceb7781ab0304218eab7029ab6
Reviewed-on: https://go-review.googlesource.com/c/go/+/643115
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
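The guard described above can be sketched in plain Go. This is only a model: it assumes the check is a simple test for a zero SP, and the names checkInitialized and errNotInitialized are hypothetical, not taken from the runtime.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInitialized is a hypothetical error used only in this sketch.
var errNotInitialized = errors.New("wasmexport function called before module initialization (call _initialize first)")

// checkInitialized models the guard: an SP of 0 means the Go stack has not
// been set up yet, so any SP decrement would go out of bounds.
func checkInitialized(sp uint32) error {
	if sp == 0 {
		return errNotInitialized
	}
	return nil
}

func main() {
	fmt.Println(checkInitialized(0))      // uninitialized module
	fmt.Println(checkInitialized(0x1000)) // initialized module
}
```

Failing early with a descriptive message replaces the bounds error that the first SP decrement would otherwise trigger.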
|
|
Currently, a symbol reference is counted as a reference to a
builtin symbol if the name matches a builtin. Usually builtin
references are generated by the compiler. But one could manually
write one with linkname. Since the list of builtin functions is
subject to change from time to time, we don't want users to depend
on their names. So we don't count a linknamed reference as a
builtin reference, and instead, count it as a named reference, so
it is checked by the linker.
Change-Id: Id3543295185c6bbd73a8cff82afb8f9cb4fd6f71
Reviewed-on: https://go-review.googlesource.com/c/go/+/635755
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Static symbols don't have the package prefix, so we need to identify
them specially.
Change-Id: Iaa0456de802478f6a257164e9703f18f8dc7eb50
Reviewed-on: https://go-review.googlesource.com/c/go/+/631975
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
1. In cmd/internal/obj, only apply the exclusion list to data symbols.
Text symbols are always fine since they can use PC-relative relocations.
2. In cmd/link, only skip trampolines for text symbols in the same package
with the same type. Before, all text symbols had type STEXT, but now that
there are different sections of STEXT, we can only rely on symbols in the
same package in the same section being close enough not to need
trampolines.
Fixes #70379.
Change-Id: Ifad2bdd6001ad3b5b23e641127554e9ec374715e
Reviewed-on: https://go-review.googlesource.com/c/go/+/631036
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Currently, instruction encoding is a slice of encoding types, which
is indexed by a masked version of the riscv64 opcode. Additional
information about some instructions (for example, if an instruction
has a ternary form and if there is an immediate form for an instruction)
is manually specified in other parts of the assembler code.
Rework the instruction encoding information so that we use a table
driven form, providing additional data for each instruction where
relevant. This means that we can simplify other parts of the code
by simply looking up the instruction data and reusing minimal logic.
Change-Id: I7b3b6c61a4868647edf28bd7dbae2150e043ae00
Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Reviewed-on: https://go-review.googlesource.com/c/go/+/622535
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
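A table-driven layout of this kind might look roughly like the sketch below. All identifiers (opcode, instructionData, immediateForm, and the three sample opcodes) are assumptions made for illustration; the real assembler uses obj.As values and its own encoding types.

```go
package main

import "fmt"

// opcode and encoding are hypothetical stand-ins for the assembler's types.
type opcode int

const (
	opADD opcode = iota
	opADDI
	opSUB
)

type encoding string

const (
	rType encoding = "R"
	iType encoding = "I"
)

// instructionData keeps everything known about one instruction in one place.
type instructionData struct {
	enc     encoding
	immForm opcode // immediate form, valid only if hasImm is true
	hasImm  bool
	ternary bool // whether a three-operand form exists
}

var instructions = map[opcode]instructionData{
	opADD:  {enc: rType, immForm: opADDI, hasImm: true, ternary: true},
	opADDI: {enc: iType, ternary: true},
	opSUB:  {enc: rType, ternary: true},
}

// immediateForm looks up the immediate variant in the table instead of
// hard-coding it in a separate switch statement.
func immediateForm(op opcode) (opcode, bool) {
	d, ok := instructions[op]
	if !ok || !d.hasImm {
		return 0, false
	}
	return d.immForm, true
}

func main() {
	imm, ok := immediateForm(opADD)
	fmt.Println(imm == opADDI, ok)
	_, ok = immediateForm(opSUB)
	fmt.Println(ok)
}
```

Keeping the per-instruction facts in one table means that adding an instruction touches a single entry rather than several switch statements.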
|
|
Change-Id: I07e7c8eaa5bd4bac0d576b2f2f4cd3f81b0b77a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/630055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
1. Support for decimal arithmetic quad instructions of powerpc: DADDQ, DSUBQ, DMULQ
and DDIVQ.
2. Support for decimal compare ordered, unordered, quad instructions of powerpc:
DCMPU, DCMPO, DCMPUQ, and DCMPOQ.
Change-Id: I32a15a7f0a127b022b1f43d376e0ab0f7e9dd108
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/623036
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Paul Murphy <murp@ibm.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Sometimes we've used the 140 suffix (GOFIPS140, crypto/fips140)
and sometimes not (crypto/internal/fips, cmd/go/internal/fips).
Use it always, to avoid having to remember which is which.
Also, there are other FIPS standards, like AES (FIPS 197), SHA-2 (FIPS 180),
and so on, which have nothing to do with FIPS 140. Best to be clear.
For #70123.
Change-Id: I33b29dabd9e8b2703d2af25e428f88bc81c7c307
Reviewed-on: https://go-review.googlesource.com/c/go/+/630115
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
|
|
Code like x := [12]byte{1,2,3,4,5,6,7,8,9,10,11,12} stores x in
a pair of registers and uses MOVD/MOVWU to load the values
from RODATA. The code generator needs to understand not
to use the aligned PC-relative relocation for that sequence.
In non-FIPS modes, more statictemp optimizations can be applied
and this problematic sequence doesn't happen.
Fix the decision about whether to assume alignment to match
the code used by the linker when deciding what to align.
Fixes the linker failure in CL 626437 patch set 5.
Change-Id: Iedad862c6faee758d4a2c5120cab2d329265b134
Reviewed-on: https://go-review.googlesource.com/c/go/+/628835
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Bypass: Russ Cox <rsc@golang.org>
|
|
Excluding external test packages allows them to use
//go:embed, which requires data relocations in data.
(Obviously the external test code is testing the FIPS module,
not part of it, so this is reasonable.)
Change-Id: I4bae71320ccb5faf718c045540a9ba6dd93e378f
Reviewed-on: https://go-review.googlesource.com/c/go/+/628735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Looks like it works.
Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64
Change-Id: I4914d5076eccaf1dd850a148070f179edf291c40
Reviewed-on: https://go-review.googlesource.com/c/go/+/627958
Reviewed-by: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Filippo Valsorda <filippo@golang.org>
|
|
It's actually a TOC-relative relocation, but those are also accepted
as pcrel relocations here. This fixes compilation on GOPPC64 <= power9.
Change-Id: I235125a76f59ab26c6c753540cfaeb398f9c105d
Reviewed-on: https://go-review.googlesource.com/c/go/+/628157
Auto-Submit: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Previous CLs committed changes to cmd/compile, cmd/link,
and crypto/internal/fips/check behind boolean flags.
Turn those flags on, to enable the CLs.
This is a separate, trivial CL for easier rollback.
For #69536.
Change-Id: I68206bae0b7d7ad5c8758267d1a2e68853b63644
Reviewed-on: https://go-review.googlesource.com/c/go/+/626000
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For a wasmexport wrapper, we generate a call to the actual
exported Go function, and use the wrapper function's PC 1 as the
(fake) return address. This address is not used for returning,
which is handled by the Wasm call stack. It is used for stack
unwinding, and PC 1 makes it past the prologue and therefore has
the right SP delta. But if the function has no arguments and
results, the wrapper is frameless, with no prologue, and PC 1
doesn't exist. This causes the unwinder to fail. In this case, we
put PC 0, which also has the correct SP delta (0).
Fixes #69584.
Change-Id: Ic047a6e62100db540b5099cc5a56a1d0f16d58b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/624000
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Add cmd/internal/obj/mkcnames.go to do the generation and update
the architecture packages to use it to maintain the Cnames tables.
Currently works correctly on arm64, loong64, mips, ppc64 and s390x.
Change-Id: I5220b0ba6d8a8a5fcc4d9774731eb2af69a671af
Reviewed-on: https://go-review.googlesource.com/c/go/+/622256
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
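The idea behind such a generator can be sketched as follows. This is not the code in mkcnames.go: it assumes each operand class is declared on its own line with a "C_" prefix, and the function name cnames is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// cnames extracts operand-class names from the text of a const block.
// It assumes one "C_"-prefixed identifier per line, which is a
// simplification made for this sketch.
func cnames(src string) []string {
	var names []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 || !strings.HasPrefix(fields[0], "C_") {
			continue
		}
		names = append(names, strings.TrimPrefix(fields[0], "C_"))
	}
	return names
}

func main() {
	const src = `
const (
	C_NONE = iota
	C_REG
	C_FREG // floating-point register
)`
	fmt.Printf("%q\n", cnames(src))
}
```

Generating the table from the constants keeps the names in step with the declarations, which is the maintenance problem the commit addresses.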
|
|
For FIPS init-time code+data verification, we need to arrange to
put the FIPS symbols into contiguous regions of the executable
and then record those sections along with the expected checksum.
The cmd/internal/obj changes identify the FIPS symbols and give
them distinguished types, which the linker then places in contiguous
regions. The linker also writes out information to use at run time
to find the FIPS sections, along with the expected hash.
See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go
for more details.
The code is disabled in this commit.
CLs 625998 and 625999 add tests.
CL 626000 enables the code.
For #69536.
Change-Id: I48da6db94bc0bea7428c43d4abcf999527bccfcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/625997
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
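The verification step described above amounts to hashing the contiguous regions and comparing the result with the value recorded at link time. The sketch below illustrates that flow only; the section type is hypothetical, and the use of HMAC-SHA256 with an all-zero key is an assumption of this example rather than a statement about the real check.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// section is a hypothetical stand-in for one contiguous FIPS region
// (for example text or read-only data) located at run time.
type section struct {
	name string
	data []byte
}

// sum hashes all sections in order. The hash construction is an assumption
// of this sketch.
func sum(sections []section) []byte {
	h := hmac.New(sha256.New, make([]byte, 32))
	for _, s := range sections {
		h.Write(s.data)
	}
	return h.Sum(nil)
}

// verify compares the recomputed sum with the one recorded at link time.
func verify(sections []section, want []byte) bool {
	return hmac.Equal(sum(sections), want)
}

func main() {
	secs := []section{
		{name: "text", data: []byte{0x90, 0x90}},
		{name: "rodata", data: []byte("const")},
	}
	want := sum(secs) // the linker would record this value
	fmt.Println(verify(secs, want))
	secs[0].data[0] ^= 1 // simulate a modified code byte
	fmt.Println(verify(secs, want))
}
```

Placing the symbols in contiguous regions is what makes a single pass over a few byte ranges sufficient.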
|
|
This CL follows the ARM64 implementation and adds support for the following
types of SIMD instructions:
1. Move general-purpose register to a vector element, e.g.:
VMOVQ Rj, <Vd>.<T>[index]
<T> can have the following values:
B, H, W, V
2. Move vector element to general-purpose register, e.g.:
VMOVQ <Vj>.<T>[index], Rd
<T> can have the following values:
B, BU, H, HU, W, WU, VU
3. Duplicate general-purpose register to vector, e.g.:
VMOVQ Rj, <Vd>.<T>
<T> can have the following values:
B16, H8, W4, V2, B32, H16, W8, V4
4. Move vector, e.g.:
XVMOVQ Xj, <Xd>.<T>
<T> can have the following values:
B16, H8, W4, V2, Q1
5. Move vector element to scalar, e.g.:
XVMOVQ Xj, <Xd>.<T>[index]
XVMOVQ Xj.<T>[index], Xd
<T> can have the following values:
W, V
6. Move vector element to vector register, e.g.:
VMOVQ <Vn>.<T>[index], Vn.<T>
<T> can have the following values:
B, H, W, V
This CL only adds syntax and doesn't break any assembly that already exists.
Change-Id: I7656efac6def54da6c5ae182f39c2a21bfdf92bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/616258
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
As proposed on #66984, this CL allows more types to be used as
wasmimport/wasmexport function parameters and results.
Specifically, bool, string, and uintptr are now allowed, and also
pointer types that point to allowed element types. Allowed element
types include sized integer and floating point types (including
small integer types like uint8 which are not directly allowed as
a parameter type), bool, arrays whose element type is allowed, and
structs whose fields are allowed element types and which also include a
structs.HostLayout field.
For #66984.
Change-Id: Ie5452a1eda21c089780dfb4d4246de6008655c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/626615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
CL 521790 has experimentally enabled RegABI support on Loong64, so it
is possible to switch the Lookup function call to ABIInternal mode.
Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/544435
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
8-bit and 16-bit sign extensions and 32-bit zero extensions were realized
with left and right shifts before this change. We now support assembling
EXTWB, EXTWH and BSTRPICKV, so each of the three can be done with a single
instruction.
This patch is a copy of CL 479496.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615
Reviewed-on: https://go-review.googlesource.com/c/go/+/626195
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
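For reference, the shift-based sequences and the single-step extensions compute the same values. The Go sketch below checks that equivalence; the function names are hypothetical, and the pairing with EXTWB, EXTWH and BSTRPICKV follows the order given in the message.

```go
package main

import "fmt"

// signExtend8Shifts sign-extends the low 8 bits with a left shift followed
// by an arithmetic right shift, the two-instruction pattern used before.
func signExtend8Shifts(x int64) int64 { return (x << 56) >> 56 }

// signExtend16Shifts does the same for the low 16 bits.
func signExtend16Shifts(x int64) int64 { return (x << 48) >> 48 }

// zeroExtend32Shifts zero-extends the low 32 bits with logical shifts.
func zeroExtend32Shifts(x uint64) uint64 { return (x << 32) >> 32 }

func main() {
	x := int64(0x1234_5678_9abc_def0)
	// Each comparison pairs the shift sequence with a plain Go conversion,
	// which is what a single extension instruction computes.
	fmt.Println(signExtend8Shifts(x) == int64(int8(x)))
	fmt.Println(signExtend16Shifts(x) == int64(int16(x)))
	fmt.Println(zeroExtend32Shifts(uint64(x)) == uint64(uint32(x)))
}
```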
|
|
Go asm syntax:
VPCNT{B,H,W,V} VJ, VD
XVPCNT{B,H,W,V} XJ, XD
Equivalent platform assembler syntax:
vpcnt.{b,h,w,d} vd, vj
xvpcnt.{b,h,w,d} xd, xj
Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098
Reviewed-on: https://go-review.googlesource.com/c/go/+/620736
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
|
|
These will be necessary when we start using the new FIPS symbols.
Split into a separate CL so that these refactoring changes can be
tested separate from any FIPS-specific changes.
Passes golang.org/x/tools/cmd/toolstash/buildall.
Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/625996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
|
|
The old API was to do
r := obj.AddRel(sym)
r.Type = this
r.Off = that
etc
The new API is:
sym.AddRel(ctxt, obj.Reloc{Type: this, Off: that, etc})
This new API is more idiomatic and avoids ever having relocations
that are only partially constructed. Most importantly, it sets up
for sym.AddRel being able to check relocation validity in the future.
(Passing ctxt is for use in validity checking.)
Passes golang.org/x/tools/cmd/toolstash/buildall.
Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972
Reviewed-on: https://go-review.googlesource.com/c/go/+/625616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
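The shape of the new API can be illustrated with mock types. Reloc, Sym and Ctxt below are simplified stand-ins, not the real cmd/internal/obj definitions, and the zero-size check is a hypothetical example of the validity checking the message anticipates.

```go
package main

import "fmt"

// Reloc, Sym and Ctxt are simplified mock types for illustration only.
type Reloc struct {
	Type int
	Off  int32
	Siz  uint8
}

type Ctxt struct{}

type Sym struct {
	Name string
	R    []Reloc
}

// AddRel appends a fully constructed relocation. Because the whole value
// arrives at once, this is a natural place for a validity check.
func (s *Sym) AddRel(ctxt *Ctxt, r Reloc) {
	if r.Siz == 0 { // hypothetical check, not from the source
		panic(fmt.Sprintf("%s: relocation with zero size", s.Name))
	}
	s.R = append(s.R, r)
}

func main() {
	ctxt := &Ctxt{}
	sym := &Sym{Name: "example.f"}
	sym.AddRel(ctxt, Reloc{Type: 1, Off: 8, Siz: 4})
	fmt.Println(len(sym.R), sym.R[0].Off)
}
```

With the old two-step style, a relocation exists in a half-filled state between allocation and the last field assignment, so no single point sees the complete value.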
|
|
Go asm syntax:
VSEQ{B,H,W,V} VJ, VK, VD
XVSEQ{B,H,W,V} XJ, XK, XD
Equivalent platform assembler syntax:
vseq.{b,h,w,d} vd, vj, vk
xvseq.{b,h,w,d} xd, xj, xk
Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/616076
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
instructions support
This CL adds primitive asm support for Loong64 LSX [1] and LASX [2] by introducing new
sets of registers V0-V31 (C_VREG) and X0-X31 (C_XREG), and 8 new instructions.
On Loong64, VLD, XVLD, VST and XVST implement vector memory access operations using an
immediate offset. VLDX, XVLDX, VSTX and XVSTX implement vector memory access operations using
a register offset.
Go asm syntax:
VMOVQ n(RJ), RV (128bit vector load)
XVMOVQ n(RJ), RX (256bit vector load)
VMOVQ RV, n(RJ) (128bit vector store)
XVMOVQ RX, n(RJ) (256bit vector store)
VMOVQ (RJ)(RK), RV (128bit vector load)
XVMOVQ (RJ)(RK), RX (256bit vector load)
VMOVQ RV, (RJ)(RK) (128bit vector store)
XVMOVQ RX, (RJ)(RK) (256bit vector store)
Equivalent platform assembler syntax:
vld vd, rj, si12
xvld xd, rj, si12
vst vd, rj, si12
xvst xd, rj, si12
vldx vd, rj, rk
xvldx xd, rj, rk
vstx vd, rj, rk
xvstx xd, rj, rk
[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit
Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003
Reviewed-on: https://go-review.googlesource.com/c/go/+/616075
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1]
are new instructions added by LA664 (Loongson 3A6000) and later microarchitectures.
Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and
the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according
to the CPU feature.
The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to
ensure consistency in the order of Store/Load [2].
LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses
the R0 register, and there is no performance difference between the implementations
of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore64-2 19.61n ± 0% 13.61n ± 0% -30.57% (p=0.000 n=20)
AtomicStore64-4 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore-2 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore-4 19.62n ± 0% 13.62n ± 0% -30.58% (p=0.000 n=20)
AtomicStore8 19.61n ± 0% 20.01n ± 0% +2.04% (p=0.000 n=20)
AtomicStore8-2 19.62n ± 0% 20.02n ± 0% +2.01% (p=0.000 n=20)
AtomicStore8-4 19.61n ± 0% 20.02n ± 0% +2.09% (p=0.000 n=20)
geomean 19.61n 15.48n -21.08%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 18.03n ± 0% 12.81n ± 0% -28.93% (p=0.000 n=20)
AtomicStore64-2 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore64-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
geomean 18.01n 12.81n -28.89%
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md
Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097
Reviewed-on: https://go-review.googlesource.com/c/go/+/581356
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Add support for assembling the FMA instructions present in the LoongArch
base ISA v1.00. This requires adding a new instruction format and making
use of a third source operand, which is put in RestArgs[0].
The single-precision instructions have the `.s` suffix in their official
mnemonics, and similar Go asm instructions all have an `S` suffix on the
other architectures that support FMA, but in this change they instead
have an `F` suffix in Go asm because loong64 currently follows the mips
backends in the naming convention. This could be changed later because
FMA is fully expressible in pure Go, making it unlikely to have to hand-
write such assembly in the wild.
Example mapping between actual encoding and Go asm syntax:
fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd
(prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd)
fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd
(prog.From = fa, prog.Reg = fk and prog.To = fd)
This patch is a copy of CL 477716.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090
Reviewed-on: https://go-review.googlesource.com/c/go/+/623877
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
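The remark that FMA is fully expressible in pure Go refers to math.FMA in the standard library. A minimal sketch follows; the wrapper name fmadd and its parameter names are chosen here to mirror the fj, fk, fa operands above, and whether the compiler lowers the call to a hardware instruction on a given target is not claimed.

```go
package main

import (
	"fmt"
	"math"
)

// fmadd computes fj*fk + fa with a single rounding, which is what
// math.FMA guarantees regardless of how it is implemented.
func fmadd(fj, fk, fa float64) float64 {
	return math.FMA(fj, fk, fa)
}

func main() {
	fmt.Println(fmadd(2, 3, 4))
}
```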
|
|
This patch is a copy of CL 478595.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623875
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
All remaining unary bitop instructions in the LoongArch v1.00 base ISA
are added with this change.
While at it, add the missing W suffix to the current CLO/CLZ names. They
are not used anywhere as far as we know, so no breakage is expected.
Also, stop reusing SLL's instruction format for simplicity, in favor of
a new but trivial instruction format case.
This patch is a copy of CL 477717.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623876
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
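For the renamed CLO/CLZ word forms, the pure-Go equivalents can be written with math/bits. The function names clzw and clow below are illustrative; the sketch assumes the usual meaning of count-leading-zeros and count-leading-ones on a 32-bit word.

```go
package main

import (
	"fmt"
	"math/bits"
)

// clzw counts the leading zero bits of a 32-bit word.
func clzw(x uint32) int { return bits.LeadingZeros32(x) }

// clow counts the leading one bits by inverting the word first.
func clow(x uint32) int { return bits.LeadingZeros32(^x) }

func main() {
	fmt.Println(clzw(0x0000ffff), clow(0xffff0000))
}
```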
|
|
for small frames
CL 379075 implemented function prologue/epilogue with STP/LDP.
To fix issue #53374, CL 412474 reverted the prologue STP change for
small frames, and the LDP in epilogue was kept. The current instructions
are:
prologue:
MOVD.W R30, -offset(RSP)
MOVD R29, -8(RSP)
epilogue:
LDP -8(RSP), (R29, R30)
ADD $offset, RSP, RSP
This is a bit odd, because:
1) The prologue and epilogue do not follow the same pattern (either STR-LDR,
or STP-LDP).
2) Go Internal ABI defines that R30 is saved at 0(RSP) and R29 is saved
at -8(RSP), so we cannot use a single STP.W/LDP.P to save/restore
LR&FP and adjust SP. Changing the ABI causes too much complexity,
and the benefit is not that big.
This patch reverts the small-frame epilogue change in CL 379075. It
converts the LDP in the epilogue to LDR-LDR. Another solution is to re-apply
the STP change in the prologue, which requires fixing #53609. This seems the
easier and safer solution in the meantime. The new instructions are:
prologue:
MOVD.W R30, -offset(RSP)
MOVD R29, -8(RSP)
epilogue:
MOVD -8(RSP), R29
MOVD.P offset(RSP), R30
The current pattern may cause store-forwarding performance issues on
micro-architectures like AmpereOne. If a function call in the
middle of such code is short enough that the stores are still in the
store buffer when the LDP executes, the LDP may wait longer to get its
results from two separate stores than it would from a single STP.
Store-Forwarding aims to improve the efficiency of the processor by
allowing data to be forwarded directly from a store operation to a
subsequent load operation when certain conditions are met. See the
paper: "Memory Barriers: a Hardware View for Software Hackers"
(chapter 3.2: Store Forwarding).
The performance of the following ARM64 Linux servers was tested:
1) AmpereOne (ARM v8.6+) from Ampere Computing.
2) Ampere Altra (ARM Neoverse N1) from Ampere Computing.
3) Graviton2 (ARM Neoverse N1) from AWS.
The effect of this change depends on the hardware implementation of
store-forwarding. It clearly improves AmpereOne, especially for
small functions that are called frequently and return quickly.
E.g., JSON Marshal/Unmarshal benchmarks on AmpereOne:
goos: linux
goarch: arm64
pkg: encoding/json
│ ampere-one.base │ ampere-one.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 882.1µ ± 1% 779.6µ ± 1% -11.62% (p=0.000 n=10)
CodeMarshalError-8 961.5µ ± 0% 855.7µ ± 1% -11.01% (p=0.000 n=10)
MarshalBytes/32-8 207.6n ± 1% 187.8n ± 0% -9.52% (p=0.000 n=10)
MarshalBytes/256-8 501.0n ± 1% 482.6n ± 1% -3.68% (p=0.000 n=10)
MarshalBytes/4096-8 5.336µ ± 1% 5.074µ ± 1% -4.92% (p=0.000 n=10)
MarshalBytesError/32-8 242.3µ ± 2% 205.7µ ± 3% -15.08% (p=0.000 n=10)
MarshalBytesError/256-8 242.4µ ± 1% 205.2µ ± 2% -15.35% (p=0.000 n=10)
MarshalBytesError/4096-8 247.9µ ± 0% 210.1µ ± 1% -15.24% (p=0.000 n=10)
MarshalMap-8 150.8n ± 1% 145.7n ± 0% -3.35% (p=0.000 n=10)
EncodeMarshaler-8 50.30n ± 26% 54.48n ± 6% ~ (p=0.739 n=10)
CodeUnmarshal-8 4.796m ± 2% 4.055m ± 1% -15.45% (p=0.000 n=10)
CodeUnmarshalReuse-8 4.260m ± 1% 3.496m ± 1% -17.94% (p=0.000 n=10)
UnmarshalString-8 73.89n ± 1% 65.83n ± 1% -10.91% (p=0.000 n=10)
UnmarshalFloat64-8 60.63n ± 1% 58.66n ± 25% ~ (p=0.143 n=10)
UnmarshalInt64-8 55.62n ± 1% 53.25n ± 22% ~ (p=0.468 n=10)
UnmarshalMap-8 255.3n ± 1% 230.3n ± 1% -9.77% (p=0.000 n=10)
UnmarshalNumber-8 467.2n ± 1% 367.0n ± 0% -21.43% (p=0.000 n=10)
geomean 6.224µ 5.605µ -9.94%
Other ARM64 micro-architectures may not be affected as much by this
issue. E.g., benchmarks on Ampere Altra and Graviton2 show slight
improvements:
│ altra.base │ altra.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 980.1µ ± 1% 977.3µ ± 1% ~ (p=0.912 n=10)
CodeMarshalError-8 1.109m ± 3% 1.096m ± 5% ~ (p=0.971 n=10)
MarshalBytes/32-8 246.8n ± 1% 245.4n ± 0% -0.55% (p=0.002 n=10)
MarshalBytes/256-8 590.9n ± 1% 606.6n ± 1% +2.67% (p=0.000 n=10)
MarshalBytes/4096-8 6.351µ ± 1% 6.376µ ± 1% ~ (p=0.183 n=10)
MarshalBytesError/32-8 245.3µ ± 2% 246.1µ ± 2% ~ (p=0.684 n=10)
MarshalBytesError/256-8 245.5µ ± 1% 248.7µ ± 2% ~ (p=0.218 n=10)
MarshalBytesError/4096-8 254.2µ ± 1% 254.9µ ± 1% ~ (p=0.481 n=10)
MarshalMap-8 152.7n ± 2% 151.5n ± 3% ~ (p=0.782 n=10)
EncodeMarshaler-8 45.95n ± 7% 42.88n ± 5% -6.70% (p=0.014 n=10)
CodeUnmarshal-8 5.121m ± 4% 5.125m ± 3% ~ (p=0.579 n=10)
CodeUnmarshalReuse-8 4.616m ± 3% 4.634m ± 2% ~ (p=0.529 n=10)
UnmarshalString-8 72.12n ± 2% 72.20n ± 2% ~ (p=0.912 n=10)
UnmarshalFloat64-8 64.44n ± 5% 63.20n ± 4% ~ (p=0.393 n=10)
UnmarshalInt64-8 61.49n ± 2% 58.14n ± 4% -5.45% (p=0.002 n=10)
UnmarshalMap-8 263.6n ± 2% 266.2n ± 1% ~ (p=0.196 n=10)
UnmarshalNumber-8 464.7n ± 1% 464.0n ± 0% ~ (p=0.566 n=10)
geomean 6.617µ 6.575µ -0.64%
│ graviton2.base │ graviton2.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 1.122m ± 0% 1.118m ± 1% ~ (p=0.052 n=10)
CodeMarshalError-8 1.216m ± 1% 1.214m ± 0% ~ (p=0.631 n=10)
MarshalBytes/32-8 289.9n ± 0% 280.8n ± 0% -3.17% (p=0.000 n=10)
MarshalBytes/256-8 675.9n ± 0% 664.7n ± 0% -1.66% (p=0.000 n=10)
MarshalBytes/4096-8 6.884µ ± 0% 6.885µ ± 0% ~ (p=0.565 n=10)
MarshalBytesError/32-8 293.1µ ± 2% 288.9µ ± 2% ~ (p=0.123 n=10)
MarshalBytesError/256-8 296.0µ ± 3% 289.0µ ± 1% -2.36% (p=0.019 n=10)
MarshalBytesError/4096-8 300.4µ ± 1% 295.6µ ± 0% -1.60% (p=0.000 n=10)
MarshalMap-8 168.8n ± 1% 168.8n ± 1% ~ (p=1.000 n=10)
EncodeMarshaler-8 53.77n ± 8% 50.05n ± 12% ~ (p=0.579 n=10)
CodeUnmarshal-8 5.875m ± 2% 5.882m ± 1% ~ (p=0.796 n=10)
CodeUnmarshalReuse-8 5.383m ± 1% 5.366m ± 0% ~ (p=0.631 n=10)
UnmarshalString-8 74.59n ± 1% 73.99n ± 0% -0.80% (p=0.001 n=10)
UnmarshalFloat64-8 68.52n ± 7% 64.19n ± 18% ~ (p=0.868 n=10)
UnmarshalInt64-8 65.32n ± 13% 62.24n ± 8% ~ (p=0.138 n=10)
UnmarshalMap-8 290.1n ± 0% 291.3n ± 0% +0.43% (p=0.010 n=10)
UnmarshalNumber-8 514.4n ± 0% 499.4n ± 0% -2.93% (p=0.000 n=10)
geomean 7.459µ 7.317µ -1.91%
Change-Id: If27386fc5f514b76bdaf2012c2ce86cc65f7ca5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/621775
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Regenerate RISC-V instruction table from the riscv-opcodes repository,
due to various changes and shuffling upstream.
This has been changed to remove pseudo-instructions, since Go only
needs the instruction encodings and including the pseudo-instructions
is creating unnecessary complications (for example, the inclusion
of ANOP and ARET, as well as strangely named aliases such as
AJALPSEUDO/AJALRPSEUDO). Remove pseudo-instructions that are not
currently supported by the assembler and add specific handling for
RDCYCLE, RDTIME and RDINSTRET, which were previously implemented
via the instruction encodings.
Change-Id: I78be4506ba6b627eba1f321406081a63bab5b2e6
Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Reviewed-on: https://go-review.googlesource.com/c/go/+/616116
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The assembler now supports BGT, BLT, BLE, BGE, BNE, BVS, BVC, and BEQ. This simplifies the usage of BC constructs such as
BC 12, 30, LR <=> BEQ CR7, LR
BC 12, 2, LR <=> BEQ CR0, LR
BC 12, 0, target <=> BLT CR0, target
BC 12, 2, target <=> BEQ CR0, target
BC 12, 5, target <=> BGT CR1, target
BC 12, 30, target <=> BEQ CR7, target
BC 4, 6, target <=> BNE CR1, target
BC 4, 5, target <=> BLE CR1, target
Also clean up code based on the above additions.
Change-Id: I02fdb212b6fe3f85ce447e05f4d42118c9ce63b5
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/612395
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
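The correspondence in the list above follows from how BI selects a condition register bit: BI/4 is the CR field and BI%4 is the bit within it (LT, GT, EQ, SO). The Go sketch below reproduces the mapping for BO values 12 and 4 only; extendedMnemonic is a hypothetical helper, not part of the assembler.

```go
package main

import "fmt"

// extendedMnemonic maps a BC instruction's BO and BI operands to the
// extended mnemonic and condition register field. Only BO values 12
// (branch if the bit is set) and 4 (branch if the bit is clear) are handled.
func extendedMnemonic(bo, bi int) (string, bool) {
	if bi < 0 || bi > 31 {
		return "", false
	}
	var names [4]string
	switch bo {
	case 12:
		names = [4]string{"BLT", "BGT", "BEQ", "BVS"}
	case 4:
		names = [4]string{"BGE", "BLE", "BNE", "BVC"}
	default:
		return "", false
	}
	// Each CR field holds four bits in the order LT, GT, EQ, SO.
	return fmt.Sprintf("%s CR%d", names[bi%4], bi/4), true
}

func main() {
	for _, c := range [][2]int{{12, 30}, {12, 0}, {12, 5}, {4, 6}, {4, 5}} {
		m, _ := extendedMnemonic(c[0], c[1])
		fmt.Printf("BC %d, %d <=> %s\n", c[0], c[1], m)
	}
}
```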
|
|
Assembler support provided for the instructions DADD, DSUB, DMUL, and DDIV.
Change-Id: Ic12ba02ce453cb1ca275334ca1924fb2009da767
Reviewed-on: https://go-review.googlesource.com/c/go/+/620856
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This patch adds a prologue_end statement to the DWARF info for riscv64,
which the Delve debugger uses to skip the stack-split prologue.
Change-Id: I4e5d9c26202385f65b3118b16f53f66de9d327f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/620295
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Update references to version 20240411 of the RISC-V specifications.
Reorder and regroup instructions to maintain ordering.
Change-Id: Iea2a5d22ad677e04948e9a9325986ad301c03f35
Reviewed-on: https://go-review.googlesource.com/c/go/+/616115
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds V0 through V31 as vector registers, which are available on CPUs
that support the V extension.
Change-Id: Ibffee3f9a2cf1d062638715b3744431d72d451ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/595404
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
|
|
Regenerate the riscv instruction encoding table with the V extension
enabled. Add constants and names for the resulting 375 instructions.
Change-Id: Icce688493aeb1e9880fb76a0618643f57e481273
Reviewed-on: https://go-review.googlesource.com/c/go/+/595403
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
These are 8-bit ARM Load/Store atomics and are available starting from armv6k.
See https://developer.arm.com/documentation/dui0379/e/arm-and-thumb-instructions/strex
For #69735
Change-Id: I12623433c89070495c178208ee4758b3cdefd368
GitHub-Last-Rev: d6a797836af1dccdcc6e6554725546b386d01615
GitHub-Pull-Request: golang/go#69959
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm
Reviewed-on: https://go-review.googlesource.com/c/go/+/621395
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The spill/restore code around morestack is almost never executed, so
we should make it as small as possible. Using 2-register loads/stores
makes sense here. Also, the offsets from SP are pretty small so the
offset almost always fits in the (smaller than a normal load/store)
offset field of the instruction.
Makes cmd/go 0.6% smaller.
Change-Id: I8845283c1b269a259498153924428f6173bda293
Reviewed-on: https://go-review.googlesource.com/c/go/+/621556
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
During stack split checking, loong64 uses the following
logic: if SP > stackguard then goto done, else morestack
The problem is that morestack is executed far less often than
done, while static branch prediction favors the morestack path,
which leads to branch mispredictions.
Change the logic to:
if SP <= stackguard then goto morestack, else done
benchmarks on 3A6000:
goos: linux
goarch: loong64
pkg: fmt
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
SprintfPadding 418.3n ± 1% 387.0n ± 0% -7.49% (p=0.000 n=20)
SprintfEmpty 35.95n ± 0% 35.86n ± 0% -0.25% (p=0.000 n=20)
SprintfString 75.02n ± 1% 72.24n ± 0% -3.71% (p=0.000 n=20)
SprintfTruncateString 165.7n ± 3% 139.9n ± 1% -15.58% (p=0.000 n=20)
SprintfTruncateBytes 171.0n ± 0% 147.3n ± 0% -13.83% (p=0.000 n=20)
SprintfSlowParsingPath 90.56n ± 0% 80.85n ± 0% -10.72% (p=0.000 n=20)
SprintfQuoteString 560.2n ± 0% 509.7n ± 0% -9.01% (p=0.000 n=20)
SprintfInt 58.62n ± 0% 56.45n ± 0% -3.70% (p=0.000 n=20)
SprintfIntInt 141.7n ± 0% 122.2n ± 0% -13.73% (p=0.000 n=20)
SprintfPrefixedInt 210.6n ± 0% 208.8n ± 0% -0.88% (p=0.000 n=20)
SprintfFloat 282.3n ± 0% 251.8n ± 1% -10.80% (p=0.000 n=20)
SprintfComplex 854.1n ± 0% 813.8n ± 0% -4.71% (p=0.000 n=20)
SprintfBoolean 76.32n ± 0% 71.14n ± 1% -6.79% (p=0.000 n=20)
SprintfHexString 218.5n ± 0% 193.4n ± 0% -11.51% (p=0.000 n=20)
SprintfHexBytes 321.3n ± 0% 275.0n ± 0% -14.42% (p=0.000 n=20)
SprintfBytes 573.5n ± 0% 553.2n ± 1% -3.54% (p=0.000 n=20)
SprintfStringer 501.1n ± 1% 446.6n ± 0% -10.86% (p=0.000 n=20)
SprintfStructure 1.793µ ± 0% 1.683µ ± 0% -6.16% (p=0.000 n=20)
ManyArgs 500.0n ± 0% 470.4n ± 0% -5.92% (p=0.000 n=20)
FprintInt 67.51n ± 0% 65.71n ± 0% -2.66% (p=0.000 n=20)
FprintfBytes 130.9n ± 0% 129.5n ± 1% -1.11% (p=0.000 n=20)
FprintIntNoAlloc 67.55n ± 0% 65.80n ± 0% -2.58% (p=0.000 n=20)
ScanInts 386.3µ ± 0% 346.5µ ± 0% -10.29% (p=0.000 n=20)
ScanRecursiveInt 25.97m ± 0% 25.93m ± 0% -0.15% (p=0.038 n=20)
ScanRecursiveIntReaderWrapper 26.07m ± 0% 25.93m ± 0% -0.53% (p=0.001 n=20)
geomean 702.6n 653.7n -6.96%
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
BinaryTree17 7.688 ± 1% 7.724 ± 0% +0.47% (p=0.040 n=20)
Fannkuch11 2.670 ± 0% 2.645 ± 0% -0.94% (p=0.000 n=20)
FmtFprintfEmpty 35.93n ± 0% 37.50n ± 0% +4.37% (p=0.000 n=20)
FmtFprintfString 56.32n ± 0% 59.74n ± 0% +6.08% (p=0.000 n=20)
FmtFprintfInt 64.47n ± 0% 61.26n ± 0% -4.98% (p=0.000 n=20)
FmtFprintfIntInt 100.30n ± 0% 99.67n ± 0% -0.63% (p=0.000 n=20)
FmtFprintfPrefixedInt 116.7n ± 0% 119.3n ± 0% +2.23% (p=0.000 n=20)
FmtFprintfFloat 234.1n ± 0% 203.4n ± 0% -13.11% (p=0.000 n=20)
FmtManyArgs 503.0n ± 0% 467.9n ± 0% -6.96% (p=0.000 n=20)
GobDecode 8.125m ± 0% 7.299m ± 0% -10.17% (p=0.000 n=20)
GobEncode 8.930m ± 1% 8.581m ± 1% -3.91% (p=0.000 n=20)
Gzip 280.0m ± 0% 279.8m ± 0% -0.10% (p=0.000 n=20)
Gunzip 33.30m ± 0% 32.48m ± 0% -2.49% (p=0.000 n=20)
HTTPClientServer 55.43µ ± 0% 54.10µ ± 1% -2.41% (p=0.000 n=20)
JSONEncode 10.086m ± 0% 9.055m ± 0% -10.22% (p=0.000 n=20)
JSONDecode 49.37m ± 1% 46.22m ± 1% -6.40% (p=0.000 n=20)
Mandelbrot200 4.606m ± 0% 4.606m ± 0% ~ (p=0.280 n=20)
GoParse 5.010m ± 0% 4.855m ± 0% -3.09% (p=0.000 n=20)
RegexpMatchEasy0_32 59.09n ± 0% 59.32n ± 0% +0.39% (p=0.000 n=20)
RegexpMatchEasy0_1K 455.2n ± 0% 453.8n ± 0% -0.31% (p=0.000 n=20)
RegexpMatchEasy1_32 59.24n ± 0% 60.11n ± 0% +1.47% (p=0.000 n=20)
RegexpMatchEasy1_1K 555.2n ± 0% 553.9n ± 0% -0.23% (p=0.000 n=20)
RegexpMatchMedium_32 845.7n ± 0% 775.6n ± 0% -8.28% (p=0.000 n=20)
RegexpMatchMedium_1K 26.68µ ± 0% 26.48µ ± 0% -0.78% (p=0.000 n=20)
RegexpMatchHard_32 1.317µ ± 0% 1.326µ ± 0% +0.68% (p=0.000 n=20)
RegexpMatchHard_1K 41.35µ ± 0% 40.95µ ± 0% -0.97% (p=0.000 n=20)
Revcomp 463.0m ± 0% 473.0m ± 0% +2.15% (p=0.000 n=20)
Template 83.80m ± 0% 76.26m ± 1% -9.00% (p=0.000 n=20)
TimeParse 283.3n ± 0% 260.8n ± 0% -7.96% (p=0.000 n=20)
TimeFormat 307.2n ± 0% 290.5n ± 0% -5.45% (p=0.000 n=20)
geomean 53.16µ 51.67µ -2.79%
Change-Id: Iaec2f50db18e9a2b405605f8b92af3683114ea34
Reviewed-on: https://go-review.googlesource.com/c/go/+/616035
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It's still pretty cryptic, but at least now instead of printing
asm: asmidx: bad address 0/2067/2068
it will print
asm: asmidx: bad address 0/BX/SP
Change-Id: I1c73c439c94c5b9d3039728db85102a818739db9
Reviewed-on: https://go-review.googlesource.com/c/go/+/611256
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
BinaryTree17 7.728 ± 1% 7.703 ± 1% ~ (p=0.345 n=15)
Fannkuch11 2.645 ± 0% 2.429 ± 0% -8.17% (p=0.000 n=15)
FmtFprintfEmpty 35.89n ± 0% 35.83n ± 0% -0.17% (p=0.000 n=15)
FmtFprintfString 59.48n ± 0% 59.45n ± 0% ~ (p=0.280 n=15)
FmtFprintfInt 62.04n ± 0% 61.11n ± 0% -1.50% (p=0.000 n=15)
FmtFprintfIntInt 97.74n ± 0% 96.56n ± 0% -1.21% (p=0.000 n=15)
FmtFprintfPrefixedInt 116.7n ± 0% 116.6n ± 0% ~ (p=0.975 n=15)
FmtFprintfFloat 204.5n ± 0% 203.2n ± 0% -0.64% (p=0.000 n=15)
FmtManyArgs 456.2n ± 0% 454.6n ± 0% -0.35% (p=0.000 n=15)
GobDecode 7.142m ± 1% 6.960m ± 1% -2.55% (p=0.000 n=15)
GobEncode 8.172m ± 1% 8.081m ± 0% -1.11% (p=0.001 n=15)
Gzip 279.9m ± 0% 280.1m ± 0% +0.05% (p=0.011 n=15)
Gunzip 32.69m ± 0% 32.44m ± 0% -0.79% (p=0.000 n=15)
HTTPClientServer 53.94µ ± 0% 53.68µ ± 0% -0.48% (p=0.000 n=15)
JSONEncode 9.297m ± 0% 9.110m ± 0% -2.01% (p=0.000 n=15)
JSONDecode 47.21m ± 0% 47.99m ± 2% +1.66% (p=0.000 n=15)
Mandelbrot200 4.601m ± 0% 4.606m ± 0% +0.11% (p=0.000 n=15)
GoParse 4.666m ± 0% 4.664m ± 0% ~ (p=0.512 n=15)
RegexpMatchEasy0_32 59.76n ± 0% 58.92n ± 0% -1.41% (p=0.000 n=15)
RegexpMatchEasy0_1K 458.1n ± 0% 455.3n ± 0% -0.61% (p=0.000 n=15)
RegexpMatchEasy1_32 59.36n ± 0% 60.25n ± 0% +1.50% (p=0.000 n=15)
RegexpMatchEasy1_1K 557.7n ± 0% 566.0n ± 0% +1.49% (p=0.000 n=15)
RegexpMatchMedium_32 803.0n ± 0% 783.7n ± 0% -2.40% (p=0.000 n=15)
RegexpMatchMedium_1K 27.29µ ± 0% 26.54µ ± 0% -2.76% (p=0.000 n=15)
RegexpMatchHard_32 1.388µ ± 2% 1.333µ ± 0% -3.96% (p=0.000 n=15)
RegexpMatchHard_1K 40.91µ ± 0% 40.96µ ± 0% +0.12% (p=0.001 n=15)
Revcomp 474.7m ± 0% 474.2m ± 0% ~ (p=0.325 n=15)
Template 77.13m ± 1% 75.70m ± 1% -1.86% (p=0.000 n=15)
TimeParse 271.3n ± 0% 271.7n ± 0% +0.15% (p=0.000 n=15)
TimeFormat 289.4n ± 0% 290.8n ± 0% +0.48% (p=0.000 n=15)
geomean 51.70µ 51.22µ -0.92%
Change-Id: Ib71b60226251722ae828edb1d7d8b5a27f383570
Reviewed-on: https://go-review.googlesource.com/c/go/+/616098
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We rename it to iIIEncoding to reflect the fact that instructions
that use this encoding take two integer registers. This change
will allow us to add a new encoding for I-type instructions that
take a single integer register. This new encoding will be used for
instructions that modify CSRs.
Change-Id: Ic507d0020e18f6aa72353f4d3ffcd0e868261e7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/614355
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
M6 field for all extended mnemonics of VSTRC set to zero
This fixes VSTRC codegen so that the instruction is emitted correctly, and
adds test cases for all the extended mnemonics.
Fixes #69216
Change-Id: I2a1b7fb61d6bd6444286eab56a506225c90b75e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/612315
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
FLOGB{F/D}
Go asm syntax:
FSCALEB{F/D} FK, FJ, FD
FLOGB{F/D} FJ, FD
Equivalent platform assembler syntax:
fscaleb.{s/d} fd, fj, fk
flogb.{s/d} fd, fj
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: I6cd75c7605adbb572dae86d6470ec7cf20ce0f6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/612975
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Tim King <taking@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
riscv64
The ANDN, ORN and XNOR RISC-V Zbb extension instructions are easily
synthesised. Make them always available by adding support to the
riscv64 assembler so that we emit either a two-instruction sequence
or a single instruction, when permitted by the GORISCV64 profile.
This means that these instructions can be used unconditionally,
simplifying compiler rewrite rules, codegen tests and manually
written assembly.
Around 180 instructions are removed from the Go binary on riscv64
when built with rva22u64.
Change-Id: Ib2d90f2593a306530dc0ed08a981acde4d01be20
Reviewed-on: https://go-review.googlesource.com/c/go/+/611895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Tim King <taking@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Adds helper functions for the literal pooling, large branch handling
and code emission stages of the span7 assembler pass. This hides the
implementation of the current assembler from the general workflow in
span7 to make the implementation easier to change in future.
Updates #44734
Change-Id: I8859956b23ad4faebeeff6df28051b098ef90fed
Reviewed-on: https://go-review.googlesource.com/c/go/+/595755
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: I3d4c66793afa3769a8450e2d65093a0f9115596e
Reviewed-on: https://go-review.googlesource.com/c/go/+/611043
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Go asm syntax:
ANDN/ORN RK, RJ, RD
or ANDN/ORN RK, RD
Equivalent platform assembler syntax:
andn/orn rd, rj, rk
or andn/orn rd, rd, rk
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: I6d240ecae8f9443811ca450aed3574f13f0f4a81
Reviewed-on: https://go-review.googlesource.com/c/go/+/610475
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Commit-Queue: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
|
|
CL 402595 changed all uses of 16-byte hashes to 32-byte hashes by using
notsha256.
However, since CL 454836, notsha256 is no longer necessary, so this CL
reverts those changes to 16-byte hashes using the cmd/internal/hash package.
Updates #51940
Updates #64751
Change-Id: Ic015468ca4a49d0c3b1fb9fdbed93fddef3c838f
Reviewed-on: https://go-review.googlesource.com/c/go/+/610598
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|