|
It's actually a TOC relative relocation, but those are also accepted
as pcrel relocations here too. This fixes compilation on GOPPC64 <= power9.
Change-Id: I235125a76f59ab26c6c753540cfaeb398f9c105d
Reviewed-on: https://go-review.googlesource.com/c/go/+/628157
Auto-Submit: Paul Murphy <murp@ibm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Previous CLs committed changes to cmd/compile, cmd/link,
and crypto/internal/fips/check behind boolean flags.
Turn those flags on, to enable the CLs.
This is a separate, trivial CL for easier rollback.
For #69536.
Change-Id: I68206bae0b7d7ad5c8758267d1a2e68853b63644
Reviewed-on: https://go-review.googlesource.com/c/go/+/626000
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For a wasmexport wrapper, we generate a call to the actual
exported Go function, and use the wrapper function's PC+1 as the
(fake) return address. This address is not used for returning,
which is handled by the Wasm call stack. It is used for stack
unwinding, and PC+1 makes it past the prologue and therefore has
the right SP delta. But if the function has no arguments and
results, the wrapper is frameless, with no prologue, and PC+1
doesn't exist. This causes the unwinder to fail. In this case, we
put PC+0, which also has the correct SP delta (0).
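The choice of fake return address described above can be sketched as follows. This is only an illustration: fakeReturnPC and its parameters are hypothetical names, not the code in cmd/internal/obj/wasm.

```go
package main

import "fmt"

// fakeReturnPC is a hypothetical helper, not the toolchain's code.
// Given the entry PC of a wasmexport wrapper, it picks the PC that is
// recorded as the (fake) return address used only for stack unwinding.
func fakeReturnPC(entry uint64, frameless bool) uint64 {
	if frameless {
		// No prologue, so entry+1 does not exist. The entry PC itself
		// already has the correct SP delta (0).
		return entry
	}
	// entry+1 is past the prologue and therefore has the right SP delta.
	return entry + 1
}

func main() {
	fmt.Println(fakeReturnPC(0x1000, false))
	fmt.Println(fakeReturnPC(0x1000, true))
}
```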
Fixes #69584.
Change-Id: Ic047a6e62100db540b5099cc5a56a1d0f16d58b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/624000
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Add cmd/internal/obj/mkcnames.go to do the generation and update
the architecture packages to use it to maintain the Cnames tables.
Currently works correctly on arm64, loong64, mips, ppc64 and s390x.
Change-Id: I5220b0ba6d8a8a5fcc4d9774731eb2af69a671af
Reviewed-on: https://go-review.googlesource.com/c/go/+/622256
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
|
|
For FIPS init-time code+data verification, we need to arrange to
put the FIPS symbols into contiguous regions of the executable
and then record those sections along with the expected checksum.
The cmd/internal/obj changes identify the FIPS symbols and give
them distinguished types, which the linker then places in contiguous
regions. The linker also writes out information to use at run time
to find the FIPS sections, along with the expected hash.
See cmd/internal/obj/fips.go and cmd/link/internal/ld/fips.go
for more details.
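As a rough illustration of the idea (hash contiguous regions, compare against a recorded value), here is a minimal sketch. The region type, the helper names, and the use of plain SHA-256 are assumptions made for this example; they are not the linker's or crypto/internal/fips/check's actual data layout or algorithm.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// region is a hypothetical [start, end) byte range inside an image.
type region struct{ start, end int }

// checksum hashes the listed regions of image, in order.
func checksum(image []byte, regions []region) [sha256.Size]byte {
	h := sha256.New()
	for _, r := range regions {
		h.Write(image[r.start:r.end])
	}
	var sum [sha256.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// verify reports whether the regions still hash to the expected value.
func verify(image []byte, regions []region, want [sha256.Size]byte) bool {
	return checksum(image, regions) == want
}

func main() {
	image := []byte("header|fips-code|fips-data|rest")
	regions := []region{{7, 16}, {17, 26}}

	// A link-time step would record this value in the binary.
	want := checksum(image, regions)
	fmt.Println(verify(image, regions, want))

	// Corrupting a byte inside a checked region makes verification fail.
	image[8] ^= 1
	fmt.Println(verify(image, regions, want))
}
```

Keeping the checked symbols contiguous is what lets the check be described by a short list of ranges.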
The code is disabled in this commit.
CLs 625998 and 625999 add tests.
CL 626000 enables the code.
For #69536.
Change-Id: I48da6db94bc0bea7428c43d4abcf999527bccfcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/625997
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Following the ARM64 implementation as a reference, this CL adds support for the
following types of SIMD instructions:
1. Move general-purpose register to a vector element, e.g.:
VMOVQ Rj, <Vd>.<T>[index]
<T> can have the following values:
B, H, W, V
2. Move vector element to general-purpose register, e.g.:
VMOVQ <Vj>.<T>[index], Rd
<T> can have the following values:
B, BU, H, HU, W, WU, VU
3. Duplicate general-purpose register to vector, e.g.:
VMOVQ Rj, <Vd>.<T>
<T> can have the following values:
B16, H8, W4, V2, B32, H16, W8, V4
4. Move vector, e.g.:
XVMOVQ Xj, <Xd>.<T>
<T> can have the following values:
B16, H8, W4, V2, Q1
5. Move vector element to scalar, e.g.:
XVMOVQ Xj, <Xd>.<T>[index]
XVMOVQ Xj.<T>[index], Xd
<T> can have the following values:
W, V
6. Move vector element to vector register, e.g.:
VMOVQ <Vn>.<T>[index], Vn.<T>
<T> can have the following values:
B, H, W, V
This CL only adds syntax and doesn't break any assembly that already exists.
Change-Id: I7656efac6def54da6c5ae182f39c2a21bfdf92bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/616258
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Use Loong64's LSX instruction VPCNT to implement math/bits.OnesCount{16,32,64}
and make it intrinsic.
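For reference, the operation being intrinsified is a population count. The sketch below compares math/bits.OnesCount64 with a portable bit-twiddling version; popcount64 is a name made up for this example and is not claimed to be the fallback the compiler or library uses.

```go
package main

import (
	"fmt"
	"math/bits"
)

// popcount64 is a portable population count written with masks and
// shifts. It is shown only as a reference for what the intrinsic computes.
func popcount64(x uint64) int {
	x -= (x >> 1) & 0x5555555555555555
	x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f
	return int((x * 0x0101010101010101) >> 56)
}

func main() {
	for _, x := range []uint64{0, 1, 0xff, 0xdeadbeef, 1<<64 - 1} {
		// Both columns should agree for every input.
		fmt.Println(bits.OnesCount64(x), popcount64(x))
	}
}
```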
Benchmark results on loongson 3A5000 and 3A6000 machines:
goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A5000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
OnesCount 4.413n ± 0% 1.401n ± 0% -68.25% (p=0.000 n=10)
OnesCount8 1.364n ± 0% 1.363n ± 0% ~ (p=0.130 n=10)
OnesCount16 2.112n ± 0% 1.534n ± 0% -27.37% (p=0.000 n=10)
OnesCount32 4.533n ± 0% 1.529n ± 0% -66.27% (p=0.000 n=10)
OnesCount64 4.565n ± 0% 1.531n ± 1% -66.46% (p=0.000 n=10)
geomean 3.048n 1.470n -51.78%
goos: linux
goarch: loong64
pkg: math/bits
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
OnesCount 3.553n ± 0% 1.201n ± 0% -66.20% (p=0.000 n=10)
OnesCount8 0.8021n ± 0% 0.8004n ± 0% -0.21% (p=0.000 n=10)
OnesCount16 1.216n ± 0% 1.000n ± 0% -17.76% (p=0.000 n=10)
OnesCount32 3.006n ± 0% 1.035n ± 0% -65.57% (p=0.000 n=10)
OnesCount64 3.503n ± 0% 1.035n ± 0% -70.45% (p=0.000 n=10)
geomean 2.053n 1.006n -51.01%
Change-Id: I07a5b8da2bb48711b896387ec7625145804affc8
Reviewed-on: https://go-review.googlesource.com/c/go/+/620978
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
As proposed on #66984, this CL allows more types to be used as
wasmimport/wasmexport function parameters and results.
Specifically, bool, string, and uintptr are now allowed, and also
pointer types that point to allowed element types. Allowed element
types include sized integer and floating point types (including
small integer types like uint8 which are not directly allowed as
a parameter type), bool, arrays whose element type is allowed, and
structs whose fields are all allowed element types and which also
include a structs.HostLayout field.
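The rules above can be modelled with a small reflection-based checker. This is a sketch of the description in this message only, not the compiler's check: paramOK and allowedElem are hypothetical names, and the set of previously allowed parameter types (32- and 64-bit integers and floats, unsafe.Pointer) is taken from the documented wasmimport behavior as an assumption.

```go
package main

import (
	"fmt"
	"reflect"
)

// allowedElem reports whether t may be pointed to by a parameter,
// following the element-type rules described above.
func allowedElem(t reflect.Type) bool {
	switch t.Kind() {
	case reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
		reflect.Float32, reflect.Float64, reflect.Bool:
		return true
	case reflect.Array:
		return allowedElem(t.Elem())
	case reflect.Struct:
		hasHostLayout := false
		for i := 0; i < t.NumField(); i++ {
			ft := t.Field(i).Type
			// Recognize the marker field by package path and name so
			// this sketch does not need to import package structs.
			if ft.PkgPath() == "structs" && ft.Name() == "HostLayout" {
				hasHostLayout = true
				continue
			}
			if !allowedElem(ft) {
				return false
			}
		}
		return hasHostLayout
	}
	return false
}

// paramOK reports whether the dynamic type of v would be accepted as a
// parameter type under the rules sketched here.
func paramOK(v any) bool {
	t := reflect.TypeOf(v)
	if t == nil {
		return false
	}
	switch t.Kind() {
	case reflect.Int32, reflect.Uint32, reflect.Int64, reflect.Uint64,
		reflect.Float32, reflect.Float64, reflect.UnsafePointer,
		reflect.Bool, reflect.String, reflect.Uintptr:
		return true
	case reflect.Pointer:
		return allowedElem(t.Elem())
	}
	return false
}

// noLayout lacks a structs.HostLayout field, so *noLayout is rejected.
type noLayout struct{ x int32 }

func main() {
	fmt.Println(paramOK(true))            // bool parameter
	fmt.Println(paramOK(uint8(0)))        // small integer passed directly
	fmt.Println(paramOK(new(uint8)))      // pointer to a small integer
	fmt.Println(paramOK(new([4]float32))) // pointer to an array of floats
	fmt.Println(paramOK(new(noLayout)))   // pointer to a struct without HostLayout
}
```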
For #66984.
Change-Id: Ie5452a1eda21c089780dfb4d4246de6008655c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/626615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
CL 521790 has experimentally enabled RegABI support on Loong64, so it
is possible to switch the Lookup function call to ABIInternal mode.
Change-Id: I3ae053e20c0791efebe6b6bdc9a1550a11372bc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/544435
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
CL 622042 added rand as a compiler builtin, but did not update builtinlist.
Also update the mkbuiltin comment to refer to the current file location,
and add a comment for runtime.rand that it is called from the compiler.
For #54766
Change-Id: I99d2c0bb0658da333775afe2ed0447265c845c82
Reviewed-on: https://go-review.googlesource.com/c/go/+/626755
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
|
|
8-bit and 16-bit sign extensions and 32-bit zero extensions were realized
with left and right shifts before this change. We now support assembling
EXTWB, EXTWH and BSTRPICKV, so all three can be done with a single insn
respectively.
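The equivalence between the two-shift idiom and a plain extension can be checked in Go itself. The helper names below are made up for this sketch, and which machine instructions a given Go conversion compiles to is up to the compiler.

```go
package main

import "fmt"

// signExtend8Shifts sign-extends the low 8 bits of x with a left shift
// followed by an arithmetic right shift.
func signExtend8Shifts(x int64) int64 { return x << 56 >> 56 }

// signExtend16Shifts does the same for the low 16 bits.
func signExtend16Shifts(x int64) int64 { return x << 48 >> 48 }

// zeroExtend32Shifts zero-extends the low 32 bits of x with a left shift
// followed by a logical right shift.
func zeroExtend32Shifts(x uint64) uint64 { return x << 32 >> 32 }

func main() {
	x := int64(0x123456789abcdef0)
	// Each pair should print the same value twice: the shift idiom
	// and the direct conversion agree.
	fmt.Println(signExtend8Shifts(x), int64(int8(x)))
	fmt.Println(signExtend16Shifts(x), int64(int16(x)))
	fmt.Println(zeroExtend32Shifts(uint64(x)), uint64(uint32(x)))
}
```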
This patch is a copy of CL 479496.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Iee5741dd9ebb25746f51008f3f6c86704339d615
Reviewed-on: https://go-review.googlesource.com/c/go/+/626195
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Go asm syntax:
VPCNT{B,H,W,V} VJ, VD
XVPCNT{B,H,W,V} XJ, XD
Equivalent platform assembler syntax:
vpcnt.{b,h,w,d} vd, vj
xvpcnt.{b,h,w,d} xd, xj
Change-Id: Icec4446b1925745bc3a0bc3f6397d862953b9098
Reviewed-on: https://go-review.googlesource.com/c/go/+/620736
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
|
|
These will be necessary when we start using the new FIPS symbols.
Split into a separate CL so that these refactoring changes can be
tested separate from any FIPS-specific changes.
Passes golang.org/x/tools/cmd/toolstash/buildall.
Change-Id: I73e5873fcb677f1f572f0668b4dc6f3951d822bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/625996
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
|
|
Add FIPS symbol kinds that will be needed for FIPS support.
This is a separate CL to keep the re-generated changes in
the string methods separate from hand-written changes.
The separate symbol kinds will let us group the FIPS-related
code and data together, so that it can be checksummed at
startup, as required by FIPS.
It's also separate because it breaks buildall, by changing the
on-disk symbol kind enumeration. We want non-buildall
changes to be as simple as possible.
For #69536.
Change-Id: I2d5a238498929fff8b24736ee54330c17323bd86
Reviewed-on: https://go-review.googlesource.com/c/go/+/625995
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The old API was to do
r := obj.AddRel(sym)
r.Type = this
r.Off = that
etc
The new API is:
sym.AddRel(ctxt, obj.Reloc{Type: this, Off: that, etc})
This new API is more idiomatic and avoids ever having relocations
that are only partially constructed. Most importantly, it sets up
for sym.AddRel being able to check relocation validity in the future.
(Passing ctxt is for use in validity checking.)
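The difference between the two call shapes can be shown with simplified stand-ins. Link, LSym and Reloc below are mock types written for this sketch (cmd/internal/obj cannot be imported from ordinary programs), and their fields are assumptions, not the real definitions.

```go
package main

import "fmt"

// Link, Reloc and LSym are simplified stand-ins for the types in
// cmd/internal/obj. They exist only for this illustration.
type Link struct{}

type Reloc struct {
	Type int
	Off  int32
}

type LSym struct {
	Name string
	R    []Reloc
}

// oldAddRel mirrors the old pattern: append a zero Reloc and hand back a
// pointer for the caller to fill in field by field.
func oldAddRel(s *LSym) *Reloc {
	s.R = append(s.R, Reloc{})
	return &s.R[len(s.R)-1]
}

// AddRel mirrors the new pattern: the relocation arrives fully built,
// so a validity check could run here before it is recorded.
func (s *LSym) AddRel(ctxt *Link, rel Reloc) {
	s.R = append(s.R, rel)
}

func main() {
	ctxt := &Link{}
	s := &LSym{Name: "example"}

	// Old style: the relocation is briefly visible half-constructed.
	r := oldAddRel(s)
	r.Type = 1
	r.Off = 8

	// New style: one call, one complete value.
	s.AddRel(ctxt, Reloc{Type: 2, Off: 16})

	fmt.Println(len(s.R), s.R[0], s.R[1])
}
```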
Passes golang.org/x/tools/cmd/toolstash/buildall.
Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972
Reviewed-on: https://go-review.googlesource.com/c/go/+/625616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Go asm syntax:
VSEQ{B,H,W,V} VJ, VK, VD
XVSEQ{B,H,W,V} XJ, XK, XD
Equivalent platform assembler syntax:
vseq.{b,h,w,d} vd, vj, vk
xvseq.{b,h,w,d} xd, xj, xk
Change-Id: Ia87277b12c817ebc41a46f4c3d09f4b76995ff2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/616076
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
instructions support
This CL adds primitive asm support for Loong64 LSX [1] and LASX [2] by introducing new
sets of registers V0-V31 (C_VREG) and X0-X31 (C_XREG), and 8 new instructions.
On Loong64, VLD, XVLD, VST, and XVST implement vector memory access operations using an
immediate offset. VLDX, XVLDX, VSTX, and XVSTX implement vector memory access operations
using a register offset.
Go asm syntax:
VMOVQ n(RJ), RV (128bit vector load)
XVMOVQ n(RJ), RX (256bit vector load)
VMOVQ RV, n(RJ) (128bit vector store)
XVMOVQ RX, n(RJ) (256bit vector store)
VMOVQ (RJ)(RK), RV (128bit vector load)
XVMOVQ (RJ)(RK), RX (256bit vector load)
VMOVQ RV, (RJ)(RK) (128bit vector store)
XVMOVQ RX, (RJ)(RK) (256bit vector store)
Equivalent platform assembler syntax:
vld vd, rj, si12
xvld xd, rj, si12
vst vd, rj, si12
xvst xd, rj, si12
vldx vd, rj, rk
xvldx xd, rj, rk
vstx vd, rj, rk
xvstx xd, rj, rk
[1]: LSX: Loongson SIMD Extension, 128bit
[2]: LASX: Loongson Advanced SIMD Extension, 256bit
Change-Id: Ibaf5ddfd29b77670c3c44cc32bead36b2c8b8003
Reviewed-on: https://go-review.googlesource.com/c/go/+/616075
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
On Loong64, AMSWAPDB{W,V} instructions are supported by default, and AMSWAPDB{B,H} [1]
are new instructions added by LA664 (Loongson 3A6000) and later microarchitectures.
Therefore, AMSWAPDB{W,V} (full barrier) is used to implement AtomicStore{32,64}, and
the traditional MOVB or the new AMSWAPDBB is used to implement AtomicStore8 according
to the CPU feature.
The StoreRelease barrier on Loong64 is "dbar 0x12", but it is still necessary to
ensure consistency in the order of Store/Load [2].
LoweredAtomicStorezero{32,64} was removed because on loong64 the constant "0" uses
the R0 register, and there is no performance difference between the implementations
of LoweredAtomicStorezero{32,64} and LoweredAtomicStore{32,64}.
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A5000-HV @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore64-2 19.61n ± 0% 13.61n ± 0% -30.57% (p=0.000 n=20)
AtomicStore64-4 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore 19.61n ± 0% 13.61n ± 0% -30.60% (p=0.000 n=20)
AtomicStore-2 19.62n ± 0% 13.61n ± 0% -30.63% (p=0.000 n=20)
AtomicStore-4 19.62n ± 0% 13.62n ± 0% -30.58% (p=0.000 n=20)
AtomicStore8 19.61n ± 0% 20.01n ± 0% +2.04% (p=0.000 n=20)
AtomicStore8-2 19.62n ± 0% 20.02n ± 0% +2.01% (p=0.000 n=20)
AtomicStore8-4 19.61n ± 0% 20.02n ± 0% +2.09% (p=0.000 n=20)
geomean 19.61n 15.48n -21.08%
goos: linux
goarch: loong64
pkg: internal/runtime/atomic
cpu: Loongson-3A6000 @ 2500.00MHz
| bench.old | bench.new |
| sec/op | sec/op vs base |
AtomicStore64 18.03n ± 0% 12.81n ± 0% -28.93% (p=0.000 n=20)
AtomicStore64-2 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore64-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore 18.02n ± 0% 12.81n ± 0% -28.91% (p=0.000 n=20)
AtomicStore-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-2 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
AtomicStore8-4 18.01n ± 0% 12.81n ± 0% -28.87% (p=0.000 n=20)
geomean 18.01n 12.81n -28.89%
[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=gcc/config/loongarch/sync.md
Change-Id: I4ae5e8dd0e6f026129b6e503990a763ed40c6097
Reviewed-on: https://go-review.googlesource.com/c/go/+/581356
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Qiqi Huang <huangqiqi@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL provides vendor support for s390x disassembler plan9 syntax.
cd $GOROOT/src/cmd
go get golang.org/x/arch@master
go mod tidy
go mod vendor
For #15255
Change-Id: I20c87510a1aee2d1cf2df58feb535974c4c0e3ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/623075
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Add support for assembling the FMA instructions present in the LoongArch
base ISA v1.00. This requires adding a new instruction format and making
use of a third source operand, which is put in RestArgs[0].
The single-precision instructions have the `.s` prefix in their official
mnemonics, and similar Go asm instructions all have `S` prefix for the
other architectures having FMA support, but in this change they instead
have `F` prefix in Go asm because loong64 currently follows the mips
backends in the naming convention. This could be changed later because
FMA is fully expressible in pure Go, making it unlikely to have to hand-
write such assembly in the wild.
Example mapping between actual encoding and Go asm syntax:
fmadd.s fd, fj, fk, fa -> FMADDF fa, fk, fj, fd
(prog.From = fa, prog.Reg = fk, prog.RestArgs[0] = fj and prog.To = fd)
fmadd.s fd, fd, fk, fa -> FMADDF fa, fk, fd
(prog.From = fa, prog.Reg = fk and prog.To = fd)
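The operand reordering in the mapping above can be written out as a tiny helper. goAsmFMADDF is a hypothetical function for illustration only. The math.FMA call shows the value computed, taking fmadd.s to compute fj*fk + fa in single precision (math.FMA itself works on float64).

```go
package main

import (
	"fmt"
	"math"
)

// goAsmFMADDF renders the platform operands of "fmadd.s fd, fj, fk, fa"
// in Go asm operand order. When fd and fj are the same register, the
// shorter three-operand form is used.
func goAsmFMADDF(fd, fj, fk, fa string) string {
	if fd == fj {
		return fmt.Sprintf("FMADDF %s, %s, %s", fa, fk, fd)
	}
	return fmt.Sprintf("FMADDF %s, %s, %s, %s", fa, fk, fj, fd)
}

func main() {
	fmt.Println(goAsmFMADDF("F4", "F1", "F2", "F3"))
	fmt.Println(goAsmFMADDF("F1", "F1", "F2", "F3"))

	// The fused multiply-add itself: x*y + z with a single rounding.
	fmt.Println(math.FMA(2, 3, 4))
}
```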
This patch is a copy of CL 477716.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: I9b4e4c601d6c5a854ee238f085849666e4faf090
Reviewed-on: https://go-review.googlesource.com/c/go/+/623877
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
This patch is a copy of CL 478595.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Ifb6e8183c83a5dfe5dec84e173a74d5de62692a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623875
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
All remaining unary bitop instructions in the LoongArch v1.00 base ISA
are added with this change.
While at it, add the missing W suffix to the current CLO/CLZ names. They
are not used anywhere as far as we know, so no breakage is expected.
Also, stop reusing SLL's instruction format for simplicity, in favor of
a new but trivial instruction format case.
This patch is a copy of CL 477717.
Co-authored-by: WANG Xuerui <git@xen0n.name>
Change-Id: Idbcaca25dda1ed313674ef8b26da722e8d7151c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/623876
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
for small frames
CL 379075 implemented function prologue/epilogue with STP/LDP.
To fix issue #53374, CL 412474 reverted the prologue STP change for
small frames, and the LDP in epilogue was kept. The current instructions
are:
prologue:
MOVD.W R30, -offset(RSP)
MOVD R29, -8(RSP)
epilogue:
LDP -8(RSP), (R29, R30)
ADD $offset, RSP, RSP
It seems a bit strange, as:
1) The prologue and epilogue are not in the same pattern (either STR-LDR,
or STP-LDP).
2) Go Internal ABI defines that R30 is saved at 0(RSP) and R29 is saved
at -8(RSP), so we can not use a single STP.W/LDP.P to save/restore
LR&FP and adjust SP. Changing the ABI causes too much complexity,
and the benefit is not that big.
This patch reverts the small frames' epilogue change in CL 379075. It
converts LDP in the epilogue to LDR-LDR. Another solution is to re-apply
the STP change in the prologue, which requires fixing #53609. This patch's
approach seems the easier and safer solution in the meantime. The new instructions are:
prologue:
MOVD.W R30, -offset(RSP)
MOVD R29, -8(RSP)
epilogue:
MOVD -8(RSP), R29
MOVD.P offset(RSP), R30
The current pattern may cause store-forwarding performance issues on
micro-architectures like AmpereOne. If the function body between the
prologue and epilogue is short enough that the stores are still in the
store buffer when the LDP executes, the LDP may wait longer to collect
its result from two separate stores than it would from a single STP.
Store-Forwarding aims to improve the efficiency of the processor by
allowing data to be forwarded directly from a store operation to a
subsequent load operation when certain conditions are met. See the
paper: "Memory Barriers: a Hardware View for Software Hackers"
(chapter 3.2: Store Forwarding).
The performance of the following ARM64 Linux servers was tested:
1) AmpereOne (ARM v8.6+) from Ampere Computing.
2) Ampere Altra (ARM Neoverse N1) from Ampere Computing.
3) Graviton2 (ARM Neoverse N1) from AWS.
The effect of this change depends on the hardware implementation of
store-forwarding. It clearly improves performance on AmpereOne, especially
for small functions that are called frequently and return quickly.
E.g., JSON Marshal/Unmarshal benchmarks on AmpereOne:
goos: linux
goarch: arm64
pkg: encoding/json
│ ampere-one.base │ ampere-one.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 882.1µ ± 1% 779.6µ ± 1% -11.62% (p=0.000 n=10)
CodeMarshalError-8 961.5µ ± 0% 855.7µ ± 1% -11.01% (p=0.000 n=10)
MarshalBytes/32-8 207.6n ± 1% 187.8n ± 0% -9.52% (p=0.000 n=10)
MarshalBytes/256-8 501.0n ± 1% 482.6n ± 1% -3.68% (p=0.000 n=10)
MarshalBytes/4096-8 5.336µ ± 1% 5.074µ ± 1% -4.92% (p=0.000 n=10)
MarshalBytesError/32-8 242.3µ ± 2% 205.7µ ± 3% -15.08% (p=0.000 n=10)
MarshalBytesError/256-8 242.4µ ± 1% 205.2µ ± 2% -15.35% (p=0.000 n=10)
MarshalBytesError/4096-8 247.9µ ± 0% 210.1µ ± 1% -15.24% (p=0.000 n=10)
MarshalMap-8 150.8n ± 1% 145.7n ± 0% -3.35% (p=0.000 n=10)
EncodeMarshaler-8 50.30n ± 26% 54.48n ± 6% ~ (p=0.739 n=10)
CodeUnmarshal-8 4.796m ± 2% 4.055m ± 1% -15.45% (p=0.000 n=10)
CodeUnmarshalReuse-8 4.260m ± 1% 3.496m ± 1% -17.94% (p=0.000 n=10)
UnmarshalString-8 73.89n ± 1% 65.83n ± 1% -10.91% (p=0.000 n=10)
UnmarshalFloat64-8 60.63n ± 1% 58.66n ± 25% ~ (p=0.143 n=10)
UnmarshalInt64-8 55.62n ± 1% 53.25n ± 22% ~ (p=0.468 n=10)
UnmarshalMap-8 255.3n ± 1% 230.3n ± 1% -9.77% (p=0.000 n=10)
UnmarshalNumber-8 467.2n ± 1% 367.0n ± 0% -21.43% (p=0.000 n=10)
geomean 6.224µ 5.605µ -9.94%
Other ARM64 micro-architectures may not be affected as much by this
issue. E.g., benchmarks on Ampere Altra and Graviton2 show slight
improvements:
│ altra.base │ altra.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 980.1µ ± 1% 977.3µ ± 1% ~ (p=0.912 n=10)
CodeMarshalError-8 1.109m ± 3% 1.096m ± 5% ~ (p=0.971 n=10)
MarshalBytes/32-8 246.8n ± 1% 245.4n ± 0% -0.55% (p=0.002 n=10)
MarshalBytes/256-8 590.9n ± 1% 606.6n ± 1% +2.67% (p=0.000 n=10)
MarshalBytes/4096-8 6.351µ ± 1% 6.376µ ± 1% ~ (p=0.183 n=10)
MarshalBytesError/32-8 245.3µ ± 2% 246.1µ ± 2% ~ (p=0.684 n=10)
MarshalBytesError/256-8 245.5µ ± 1% 248.7µ ± 2% ~ (p=0.218 n=10)
MarshalBytesError/4096-8 254.2µ ± 1% 254.9µ ± 1% ~ (p=0.481 n=10)
MarshalMap-8 152.7n ± 2% 151.5n ± 3% ~ (p=0.782 n=10)
EncodeMarshaler-8 45.95n ± 7% 42.88n ± 5% -6.70% (p=0.014 n=10)
CodeUnmarshal-8 5.121m ± 4% 5.125m ± 3% ~ (p=0.579 n=10)
CodeUnmarshalReuse-8 4.616m ± 3% 4.634m ± 2% ~ (p=0.529 n=10)
UnmarshalString-8 72.12n ± 2% 72.20n ± 2% ~ (p=0.912 n=10)
UnmarshalFloat64-8 64.44n ± 5% 63.20n ± 4% ~ (p=0.393 n=10)
UnmarshalInt64-8 61.49n ± 2% 58.14n ± 4% -5.45% (p=0.002 n=10)
UnmarshalMap-8 263.6n ± 2% 266.2n ± 1% ~ (p=0.196 n=10)
UnmarshalNumber-8 464.7n ± 1% 464.0n ± 0% ~ (p=0.566 n=10)
geomean 6.617µ 6.575µ -0.64%
│ graviton2.base │ graviton2.new │
│ sec/op │ sec/op vs base │
CodeMarshal-8 1.122m ± 0% 1.118m ± 1% ~ (p=0.052 n=10)
CodeMarshalError-8 1.216m ± 1% 1.214m ± 0% ~ (p=0.631 n=10)
MarshalBytes/32-8 289.9n ± 0% 280.8n ± 0% -3.17% (p=0.000 n=10)
MarshalBytes/256-8 675.9n ± 0% 664.7n ± 0% -1.66% (p=0.000 n=10)
MarshalBytes/4096-8 6.884µ ± 0% 6.885µ ± 0% ~ (p=0.565 n=10)
MarshalBytesError/32-8 293.1µ ± 2% 288.9µ ± 2% ~ (p=0.123 n=10)
MarshalBytesError/256-8 296.0µ ± 3% 289.0µ ± 1% -2.36% (p=0.019 n=10)
MarshalBytesError/4096-8 300.4µ ± 1% 295.6µ ± 0% -1.60% (p=0.000 n=10)
MarshalMap-8 168.8n ± 1% 168.8n ± 1% ~ (p=1.000 n=10)
EncodeMarshaler-8 53.77n ± 8% 50.05n ± 12% ~ (p=0.579 n=10)
CodeUnmarshal-8 5.875m ± 2% 5.882m ± 1% ~ (p=0.796 n=10)
CodeUnmarshalReuse-8 5.383m ± 1% 5.366m ± 0% ~ (p=0.631 n=10)
UnmarshalString-8 74.59n ± 1% 73.99n ± 0% -0.80% (p=0.001 n=10)
UnmarshalFloat64-8 68.52n ± 7% 64.19n ± 18% ~ (p=0.868 n=10)
UnmarshalInt64-8 65.32n ± 13% 62.24n ± 8% ~ (p=0.138 n=10)
UnmarshalMap-8 290.1n ± 0% 291.3n ± 0% +0.43% (p=0.010 n=10)
UnmarshalNumber-8 514.4n ± 0% 499.4n ± 0% -2.93% (p=0.000 n=10)
geomean 7.459µ 7.317µ -1.91%
Change-Id: If27386fc5f514b76bdaf2012c2ce86cc65f7ca5b
Reviewed-on: https://go-review.googlesource.com/c/go/+/621775
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This is the only non-vendored file that imports x/sys/unix.
Switch to fetching the information in this package.
Change-Id: I4e54c2cd8b4953066e2bee42922f35c387fb43e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/623435
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Regenerate RISC-V instruction table from the riscv-opcodes repository,
due to various changes and shuffling upstream.
This has been changed to remove pseudo-instructions, since Go only
needs the instruction encodings and including the pseudo-instructions
is creating unnecessary complications (for example, the inclusion
of ANOP and ARET, as well as strangely named aliases such as
AJALPSEUDO/AJALRPSEUDO). Remove pseudo-instructions that are not
currently supported by the assembler and add specific handling for
RDCYCLE, RDTIME and RDINSTRET, which were previously implemented
via the instruction encodings.
Change-Id: I78be4506ba6b627eba1f321406081a63bab5b2e6
Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Reviewed-on: https://go-review.googlesource.com/c/go/+/616116
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Add assembler support for BGT, BLT, BLE, BGE, BNE, BVS, BVC, and BEQ. This simplifies the use of BC constructs such as:
BC 12, 30, LR <=> BEQ CR7, LR
BC 12, 2, LR <=> BEQ CR0, LR
BC 12, 0, target <=> BLT CR0, target
BC 12, 2, target <=> BEQ CR0, target
BC 12, 5, target <=> BGT CR1, target
BC 12, 30, target <=> BEQ CR7, target
BC 4, 6, target <=> BNE CR1, target
BC 4, 5, target <=> BLE CR1, target
Also clean up code based on the above additions.
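The BO/BI pairs in the list above follow the usual Power encoding: BI selects a condition register field (BI/4) and a bit within it (LT, GT, EQ, SO), while BO=12 branches when that bit is set and BO=4 when it is clear. A minimal decoder sketch, with extendedBranch as a hypothetical helper that handles only these two BO values:

```go
package main

import "fmt"

// extendedBranch maps a (BO, BI) pair to the extended mnemonic and CR
// field. Only BO=12 (branch if bit set) and BO=4 (branch if bit clear)
// are handled; ok is false for anything else.
func extendedBranch(bo, bi int) (s string, ok bool) {
	if bi < 0 || bi > 31 {
		return "", false
	}
	var names [4]string // indexed by CR bit: LT, GT, EQ, SO
	switch bo {
	case 12:
		names = [4]string{"BLT", "BGT", "BEQ", "BVS"}
	case 4:
		names = [4]string{"BGE", "BLE", "BNE", "BVC"}
	default:
		return "", false
	}
	return fmt.Sprintf("%s CR%d", names[bi%4], bi/4), true
}

func main() {
	for _, c := range [][2]int{{12, 30}, {12, 0}, {12, 5}, {4, 6}, {4, 5}} {
		s, _ := extendedBranch(c[0], c[1])
		fmt.Printf("BC %d, %d <=> %s\n", c[0], c[1], s)
	}
}
```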
Change-Id: I02fdb212b6fe3f85ce447e05f4d42118c9ce63b5
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-ppc64_power8,gotip-linux-ppc64le_power8,gotip-linux-ppc64le_power9,gotip-linux-ppc64le_power10
Reviewed-on: https://go-review.googlesource.com/c/go/+/612395
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Assembler support provided for the instructions DADD, DSUB, DMUL, and DDIV.
Change-Id: Ic12ba02ce453cb1ca275334ca1924fb2009da767
Reviewed-on: https://go-review.googlesource.com/c/go/+/620856
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Add support to `go tool objdump` for disassembling riscv64 binaries.
Revendor to bring in cmd/vendor/golang.org/x/arch/riscv64/riscv64asm,
which provides the actual disassembly implementation.
Fixes #36738
Change-Id: I0f29968509041c0c5698fc2d6910a6a0bea9d3c0
Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Reviewed-on: https://go-review.googlesource.com/c/go/+/622257
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
internal/syscall/windows already provides a function to get the Windows
version. There is no need to use golang.org/x/sys/windows for this.
Change-Id: If31e9c662b10716ed6c3e9054604366e494345cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/622815
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This patch adds a prologue_end statement to the DWARF info for riscv64,
which the Delve debugger uses to skip the stack-split prologue.
Change-Id: I4e5d9c26202385f65b3118b16f53f66de9d327f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/620295
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Update references to version 20240411 of the RISC-V specifications.
Reorder and regroup instructions to maintain ordering.
Change-Id: Iea2a5d22ad677e04948e9a9325986ad301c03f35
Reviewed-on: https://go-review.googlesource.com/c/go/+/616115
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds V0 through V31 as vector registers, which are available on CPUs
that support the V extension.
Change-Id: Ibffee3f9a2cf1d062638715b3744431d72d451ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/595404
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
|
|
Regenerate the riscv instruction encoding table with the V extension
enabled. Add constants and names for the resulting 375 instructions.
Change-Id: Icce688493aeb1e9880fb76a0618643f57e481273
Reviewed-on: https://go-review.googlesource.com/c/go/+/595403
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: 鹏程汪 <wangpengcheng.pp@bytedance.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
These are 8-bit ARM Load/Store atomics and are available starting from armv6k.
See https://developer.arm.com/documentation/dui0379/e/arm-and-thumb-instructions/strex
For #69735
Change-Id: I12623433c89070495c178208ee4758b3cdefd368
GitHub-Last-Rev: d6a797836af1dccdcc6e6554725546b386d01615
GitHub-Pull-Request: golang/go#69959
Cq-Include-Trybots: luci.golang.try:gotip-linux-arm
Reviewed-on: https://go-review.googlesource.com/c/go/+/621395
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The spill/restore code around morestack is almost never executed, so
we should make it as small as possible. Using 2-register loads/stores
makes sense here. Also, the offsets from SP are pretty small so the
offset almost always fits in the (smaller than a normal load/store)
offset field of the instruction.
Makes cmd/go 0.6% smaller.
Change-Id: I8845283c1b269a259498153924428f6173bda293
Reviewed-on: https://go-review.googlesource.com/c/go/+/621556
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
This is similar to CL 618597, but for GNU build ID on ELF. This
makes it possible to enable "-B gobuildid" by default on ELF.
Updates #41004.
For #63934.
Change-Id: I4e663a27a2f7824bce994c783fe6d9ce8d1a395a
Reviewed-on: https://go-review.googlesource.com/c/go/+/618600
Reviewed-by: Than McIntosh <thanm@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
With the "-B gobuildid" linker option (which will be the default
on some platforms), the host build ID (GNU build ID, Mach-O UUID)
depends on the Go buildid. If the host build ID is included in the
Go buildid computation, it will lead to a convergence problem for
the toolchain binaries. So ignore the host build ID in the buildid
computation.
This CL only handles Mach-O UUID. ELF GNU build ID will be handled
later.
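The idea of leaving the host build ID out of the hash can be sketched as
follows. This is a minimal illustration, not the toolchain's code: the
helper hashExcluding, the offsets and the sample byte slices are all
hypothetical, and the excluded range is simply hashed as zeros.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hashExcluding hashes data as if the bytes in [off, off+n) were zero.
// Hypothetical helper: the range stands in for the host build ID
// (for example a Mach-O UUID) embedded in the binary.
func hashExcluding(data []byte, off, n int) [sha256.Size]byte {
	h := sha256.New()
	h.Write(data[:off])
	h.Write(make([]byte, n)) // zeros in place of the excluded bytes
	h.Write(data[off+n:])
	var sum [sha256.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

func main() {
	// Two images that differ only in the 4-byte "host build ID" at offset 4.
	a := []byte("codeAAAAmore code")
	b := []byte("codeBBBBmore code")
	fmt.Println(bytes.Equal(a, b))                                // false
	fmt.Println(hashExcluding(a, 4, 4) == hashExcluding(b, 4, 4)) // true
}
```

Because the excluded bytes never reach the hash, rewriting the host build
ID after the Go buildid is known does not change the Go buildid again.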
For #68678.
For #63934.
Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14,gotip-darwin-arm64_13
Change-Id: Ie8ff20402a1c6083246d25dea391140c75be40d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/618597
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@golang.org>
|
|
Currently the linker has some code handling and manipulating
Mach-O files. Specifically, it augments the debug/macho package
with file offset and length, so the content can be handled or
updated easily with the file.
Move this code to an internal package, so it can be used by other
parts of the toolchain, e.g. buildid computation.
For #68678.
Cq-Include-Trybots: luci.golang.try:gotip-darwin-amd64_14,gotip-darwin-arm64_13
Change-Id: I2311af0a06441b7fd887ca5c6ed9e6fc44670a16
Reviewed-on: https://go-review.googlesource.com/c/go/+/618596
Reviewed-by: Than McIntosh <thanm@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Use the new SwissTable-based map in internal/runtime/maps as the basis
for the runtime map when GOEXPERIMENT=swissmap.
Integration is complete enough to pass all.bash. Notable missing
features:
* Race integration / concurrent write detection
* Stack-allocated maps
* Specialized "fast" map variants
* Indirect key / elem
For #54766.
Cq-Include-Trybots: luci.golang.try:gotip-linux-ppc64_power10,gotip-linux-amd64-longtest-swissmap
Change-Id: Ie97b656b6d8e05c0403311ae08fef9f51756a639
Reviewed-on: https://go-review.googlesource.com/c/go/+/594596
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
When checking for a stack split, loong64 uses the following
logic: if SP > stackguard then goto done, else morestack
The problem is that morestack is executed far less often than
done, but static branch prediction favors the morestack path,
which leads to branch mispredictions.
Change the logic here to:
if SP <= stackguard then goto morestack, else done
Benchmarks on 3A6000:
goos: linux
goarch: loong64
pkg: fmt
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
SprintfPadding 418.3n ± 1% 387.0n ± 0% -7.49% (p=0.000 n=20)
SprintfEmpty 35.95n ± 0% 35.86n ± 0% -0.25% (p=0.000 n=20)
SprintfString 75.02n ± 1% 72.24n ± 0% -3.71% (p=0.000 n=20)
SprintfTruncateString 165.7n ± 3% 139.9n ± 1% -15.58% (p=0.000 n=20)
SprintfTruncateBytes 171.0n ± 0% 147.3n ± 0% -13.83% (p=0.000 n=20)
SprintfSlowParsingPath 90.56n ± 0% 80.85n ± 0% -10.72% (p=0.000 n=20)
SprintfQuoteString 560.2n ± 0% 509.7n ± 0% -9.01% (p=0.000 n=20)
SprintfInt 58.62n ± 0% 56.45n ± 0% -3.70% (p=0.000 n=20)
SprintfIntInt 141.7n ± 0% 122.2n ± 0% -13.73% (p=0.000 n=20)
SprintfPrefixedInt 210.6n ± 0% 208.8n ± 0% -0.88% (p=0.000 n=20)
SprintfFloat 282.3n ± 0% 251.8n ± 1% -10.80% (p=0.000 n=20)
SprintfComplex 854.1n ± 0% 813.8n ± 0% -4.71% (p=0.000 n=20)
SprintfBoolean 76.32n ± 0% 71.14n ± 1% -6.79% (p=0.000 n=20)
SprintfHexString 218.5n ± 0% 193.4n ± 0% -11.51% (p=0.000 n=20)
SprintfHexBytes 321.3n ± 0% 275.0n ± 0% -14.42% (p=0.000 n=20)
SprintfBytes 573.5n ± 0% 553.2n ± 1% -3.54% (p=0.000 n=20)
SprintfStringer 501.1n ± 1% 446.6n ± 0% -10.86% (p=0.000 n=20)
SprintfStructure 1.793µ ± 0% 1.683µ ± 0% -6.16% (p=0.000 n=20)
ManyArgs 500.0n ± 0% 470.4n ± 0% -5.92% (p=0.000 n=20)
FprintInt 67.51n ± 0% 65.71n ± 0% -2.66% (p=0.000 n=20)
FprintfBytes 130.9n ± 0% 129.5n ± 1% -1.11% (p=0.000 n=20)
FprintIntNoAlloc 67.55n ± 0% 65.80n ± 0% -2.58% (p=0.000 n=20)
ScanInts 386.3µ ± 0% 346.5µ ± 0% -10.29% (p=0.000 n=20)
ScanRecursiveInt 25.97m ± 0% 25.93m ± 0% -0.15% (p=0.038 n=20)
ScanRecursiveIntReaderWrapper 26.07m ± 0% 25.93m ± 0% -0.53% (p=0.001 n=20)
geomean 702.6n 653.7n -6.96%
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
BinaryTree17 7.688 ± 1% 7.724 ± 0% +0.47% (p=0.040 n=20)
Fannkuch11 2.670 ± 0% 2.645 ± 0% -0.94% (p=0.000 n=20)
FmtFprintfEmpty 35.93n ± 0% 37.50n ± 0% +4.37% (p=0.000 n=20)
FmtFprintfString 56.32n ± 0% 59.74n ± 0% +6.08% (p=0.000 n=20)
FmtFprintfInt 64.47n ± 0% 61.26n ± 0% -4.98% (p=0.000 n=20)
FmtFprintfIntInt 100.30n ± 0% 99.67n ± 0% -0.63% (p=0.000 n=20)
FmtFprintfPrefixedInt 116.7n ± 0% 119.3n ± 0% +2.23% (p=0.000 n=20)
FmtFprintfFloat 234.1n ± 0% 203.4n ± 0% -13.11% (p=0.000 n=20)
FmtManyArgs 503.0n ± 0% 467.9n ± 0% -6.96% (p=0.000 n=20)
GobDecode 8.125m ± 0% 7.299m ± 0% -10.17% (p=0.000 n=20)
GobEncode 8.930m ± 1% 8.581m ± 1% -3.91% (p=0.000 n=20)
Gzip 280.0m ± 0% 279.8m ± 0% -0.10% (p=0.000 n=20)
Gunzip 33.30m ± 0% 32.48m ± 0% -2.49% (p=0.000 n=20)
HTTPClientServer 55.43µ ± 0% 54.10µ ± 1% -2.41% (p=0.000 n=20)
JSONEncode 10.086m ± 0% 9.055m ± 0% -10.22% (p=0.000 n=20)
JSONDecode 49.37m ± 1% 46.22m ± 1% -6.40% (p=0.000 n=20)
Mandelbrot200 4.606m ± 0% 4.606m ± 0% ~ (p=0.280 n=20)
GoParse 5.010m ± 0% 4.855m ± 0% -3.09% (p=0.000 n=20)
RegexpMatchEasy0_32 59.09n ± 0% 59.32n ± 0% +0.39% (p=0.000 n=20)
RegexpMatchEasy0_1K 455.2n ± 0% 453.8n ± 0% -0.31% (p=0.000 n=20)
RegexpMatchEasy1_32 59.24n ± 0% 60.11n ± 0% +1.47% (p=0.000 n=20)
RegexpMatchEasy1_1K 555.2n ± 0% 553.9n ± 0% -0.23% (p=0.000 n=20)
RegexpMatchMedium_32 845.7n ± 0% 775.6n ± 0% -8.28% (p=0.000 n=20)
RegexpMatchMedium_1K 26.68µ ± 0% 26.48µ ± 0% -0.78% (p=0.000 n=20)
RegexpMatchHard_32 1.317µ ± 0% 1.326µ ± 0% +0.68% (p=0.000 n=20)
RegexpMatchHard_1K 41.35µ ± 0% 40.95µ ± 0% -0.97% (p=0.000 n=20)
Revcomp 463.0m ± 0% 473.0m ± 0% +2.15% (p=0.000 n=20)
Template 83.80m ± 0% 76.26m ± 1% -9.00% (p=0.000 n=20)
TimeParse 283.3n ± 0% 260.8n ± 0% -7.96% (p=0.000 n=20)
TimeFormat 307.2n ± 0% 290.5n ± 0% -5.45% (p=0.000 n=20)
geomean 53.16µ 51.67µ -2.79%
Change-Id: Iaec2f50db18e9a2b405605f8b92af3683114ea34
Reviewed-on: https://go-review.googlesource.com/c/go/+/616035
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It's still pretty cryptic, but at least now instead of printing
asm: asmidx: bad address 0/2067/2068
it will print
asm: asmidx: bad address 0/BX/SP
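A rough sketch of the difference, assuming a lookup from register number
to name. The table regNames, the helpers regString and badAddress, and the
parameter names scale, index and base are hypothetical; only the two
number/name pairs come from the messages above.

```go
package main

import (
	"fmt"
	"strconv"
)

// regNames is a hypothetical name table. Only the two values from the
// messages above are filled in; a real assembler has a full table.
var regNames = map[int]string{
	2067: "BX",
	2068: "SP",
}

// regString returns the symbolic name if known, else the decimal number.
func regString(r int) string {
	if name, ok := regNames[r]; ok {
		return name
	}
	return strconv.Itoa(r)
}

// badAddress formats the diagnostic with register names. The parameter
// names are assumptions made for this sketch.
func badAddress(scale, index, base int) string {
	return fmt.Sprintf("asm: asmidx: bad address %d/%s/%s",
		scale, regString(index), regString(base))
}

func main() {
	fmt.Println(badAddress(0, 2067, 2068))
}
```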
Change-Id: I1c73c439c94c5b9d3039728db85102a818739db9
Reviewed-on: https://go-review.googlesource.com/c/go/+/611256
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Austin Clements <austin@google.com>
|
|
goos: linux
goarch: loong64
pkg: test/bench/go1
cpu: Loongson-3A6000 @ 2500.00MHz
│ bench.old │ bench.new │
│ sec/op │ sec/op vs base │
BinaryTree17 7.728 ± 1% 7.703 ± 1% ~ (p=0.345 n=15)
Fannkuch11 2.645 ± 0% 2.429 ± 0% -8.17% (p=0.000 n=15)
FmtFprintfEmpty 35.89n ± 0% 35.83n ± 0% -0.17% (p=0.000 n=15)
FmtFprintfString 59.48n ± 0% 59.45n ± 0% ~ (p=0.280 n=15)
FmtFprintfInt 62.04n ± 0% 61.11n ± 0% -1.50% (p=0.000 n=15)
FmtFprintfIntInt 97.74n ± 0% 96.56n ± 0% -1.21% (p=0.000 n=15)
FmtFprintfPrefixedInt 116.7n ± 0% 116.6n ± 0% ~ (p=0.975 n=15)
FmtFprintfFloat 204.5n ± 0% 203.2n ± 0% -0.64% (p=0.000 n=15)
FmtManyArgs 456.2n ± 0% 454.6n ± 0% -0.35% (p=0.000 n=15)
GobDecode 7.142m ± 1% 6.960m ± 1% -2.55% (p=0.000 n=15)
GobEncode 8.172m ± 1% 8.081m ± 0% -1.11% (p=0.001 n=15)
Gzip 279.9m ± 0% 280.1m ± 0% +0.05% (p=0.011 n=15)
Gunzip 32.69m ± 0% 32.44m ± 0% -0.79% (p=0.000 n=15)
HTTPClientServer 53.94µ ± 0% 53.68µ ± 0% -0.48% (p=0.000 n=15)
JSONEncode 9.297m ± 0% 9.110m ± 0% -2.01% (p=0.000 n=15)
JSONDecode 47.21m ± 0% 47.99m ± 2% +1.66% (p=0.000 n=15)
Mandelbrot200 4.601m ± 0% 4.606m ± 0% +0.11% (p=0.000 n=15)
GoParse 4.666m ± 0% 4.664m ± 0% ~ (p=0.512 n=15)
RegexpMatchEasy0_32 59.76n ± 0% 58.92n ± 0% -1.41% (p=0.000 n=15)
RegexpMatchEasy0_1K 458.1n ± 0% 455.3n ± 0% -0.61% (p=0.000 n=15)
RegexpMatchEasy1_32 59.36n ± 0% 60.25n ± 0% +1.50% (p=0.000 n=15)
RegexpMatchEasy1_1K 557.7n ± 0% 566.0n ± 0% +1.49% (p=0.000 n=15)
RegexpMatchMedium_32 803.0n ± 0% 783.7n ± 0% -2.40% (p=0.000 n=15)
RegexpMatchMedium_1K 27.29µ ± 0% 26.54µ ± 0% -2.76% (p=0.000 n=15)
RegexpMatchHard_32 1.388µ ± 2% 1.333µ ± 0% -3.96% (p=0.000 n=15)
RegexpMatchHard_1K 40.91µ ± 0% 40.96µ ± 0% +0.12% (p=0.001 n=15)
Revcomp 474.7m ± 0% 474.2m ± 0% ~ (p=0.325 n=15)
Template 77.13m ± 1% 75.70m ± 1% -1.86% (p=0.000 n=15)
TimeParse 271.3n ± 0% 271.7n ± 0% +0.15% (p=0.000 n=15)
TimeFormat 289.4n ± 0% 290.8n ± 0% +0.48% (p=0.000 n=15)
geomean 51.70µ 51.22µ -0.92%
Change-Id: Ib71b60226251722ae828edb1d7d8b5a27f383570
Reviewed-on: https://go-review.googlesource.com/c/go/+/616098
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
We rename it to iIIEncoding to reflect the fact that instructions
that use this encoding take two integer registers. This change
will allow us to add a new encoding for I-type instructions that
take a single integer register. This new encoding will be used for
instructions that modify CSRs.
Change-Id: Ic507d0020e18f6aa72353f4d3ffcd0e868261e7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/614355
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This CL vendors support for GNU and Plan 9 syntax in the loong64 disassembler.
cd $GOROOT/src/cmd
go get golang.org/x/arch@master
go mod tidy
go mod vendor
Change-Id: Ic8b888de0aa11cba58cbf559f8f69337d1d69309
Reviewed-on: https://go-review.googlesource.com/c/go/+/609015
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
|
|
M6 field for all extended mnemonics of VSTRC set to zero
This fixes VSTRC codegen to emit the correct encoding and adds test cases
for all the extended mnemonics.
Fixes #69216
Change-Id: I2a1b7fb61d6bd6444286eab56a506225c90b75e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/612315
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Moving these intrinsics to a base package enables other internal/runtime
packages to use them.
For #54766.
Change-Id: I45a530422207dd94b5ad4eee51216c9410a84040
Reviewed-on: https://go-review.googlesource.com/c/go/+/613261
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Moving these intrinsics to a base package enables other internal/runtime
packages to use them.
For #54766.
Change-Id: I0b3eded3bb45af53e3eb5bab93e3792e6a8beb46
Reviewed-on: https://go-review.googlesource.com/c/go/+/613260
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The compiler never does a lookup of these (LookupRuntime), so they
aren't needed here.
getcallerpc is only used in intrinsification. getcallersp is used in
intrinsification and defer handling via a direct OGETCALLERSP op.
For #54766.
Change-Id: I1666ceef3360a84573ae5b41b1c51d9205de7235
Reviewed-on: https://go-review.googlesource.com/c/go/+/613495
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
FLOGB{F/D}
Go asm syntax:
FSCALEB{F/D} FK, FJ, FD
FLOGB{F/D} FJ, FD
Equivalent platform assembler syntax:
fscaleb.{s/d} fd, fj, fk
flogb.{s/d} fd, fj
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: I6cd75c7605adbb572dae86d6470ec7cf20ce0f6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/612975
Auto-Submit: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Tim King <taking@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
riscv64
The ANDN, ORN and XNOR RISC-V Zbb extension instructions are easily
synthesised. Make them always available by adding support to the
riscv64 assembler so that we emit either a two-instruction sequence
or, when permitted by the GORISCV64 profile, a single instruction.
This means that these instructions can be used unconditionally,
simplifying compiler rewrite rules, codegen tests and manually
written assembly.
Around 180 instructions are removed from the Go binary on riscv64
when built with rva22u64.
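The equivalence behind the synthesis can be checked in Go. This is only a
sketch of the bitwise identities; the function names are hypothetical, and
the two-step forms are an assumption about what a fallback sequence
computes, not the assembler's actual output.

```go
package main

import "fmt"

// Single-operation forms, matching the Zbb instruction semantics.
func andn(x, y uint64) uint64 { return x &^ y }   // x AND (NOT y)
func orn(x, y uint64) uint64  { return x | ^y }   // x OR (NOT y)
func xnor(x, y uint64) uint64 { return ^(x ^ y) } // NOT (x XOR y)

// Two-step forms, each using one temporary, as a base-ISA fallback might.
func andn2(x, y uint64) uint64 {
	t := ^y // NOT
	return x & t
}

func orn2(x, y uint64) uint64 {
	t := ^y // NOT
	return x | t
}

func xnor2(x, y uint64) uint64 {
	t := x ^ y // XOR
	return ^t
}

func main() {
	x, y := uint64(0xF0F0), uint64(0xFF00)
	fmt.Println(andn(x, y) == andn2(x, y)) // true
	fmt.Println(orn(x, y) == orn2(x, y))   // true
	fmt.Println(xnor(x, y) == xnor2(x, y)) // true
}
```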
Change-Id: Ib2d90f2593a306530dc0ed08a981acde4d01be20
Reviewed-on: https://go-review.googlesource.com/c/go/+/611895
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Tim King <taking@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|