| Age | Commit message (Collapse) | Author |
|
This CL adds the register list support for SVE:
[Z1.B, Z2.B]
[P1.B, P2.B]
[Z1.D]
[Z1.D, Z2.D, Z3.D]
[Z1.D, Z2.D, Z3.D, Z4.D]
This CL is generated by CL 763780.
Change-Id: I92210097a8a7525a5a53a2dce0b7652397275dd6
Reviewed-on: https://go-review.googlesource.com/c/go/+/763820
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL is generated by CL 759800.
The new register patterns are (examples):
Z1.B[5]
Z2[6]
P1[7]
PN1[8]
Change-Id: I5bccc4f1c0474dbd4cd4878bd488f36a7026c7ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/759780
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL integrates a new assembling path specifically designed for SVE
and other modern ARM64 instructions, utilizing generated instruction
tables. It contains the foundational files and modifications to direct
the assembling pipeline to use this new data-driven path.
In a.out.go, it registers new constants for registers and operand types
used by SVE.
A new file inst.go is added, which defines the instruction table data
types and utility functions for the new path. The entry point from the
upstream pipeline is `tryEncode`.
`tryEncode` returns false upon an encoding failure, which allows the
upstream matching logic to handle multiple potential matches. The exact
match is not finalized until an instruction is actually encoded, as
detailed in the comments for `elemEncoders`.
This CL also introduces the core generated tables (`anames_gen.go`,
`encoding_gen.go`, `goops_gen.go`, and `inst_gen.go`) which handle a
wide variety of SVE instructions. A comprehensive end-to-end assembly
test file (`arm64sveenc.s`) is added, containing hundreds of test cases
for these SVE instructions to verify the new encoding path.
To facilitate these encodings, this CL implements handling for operand
types such as AC_ARNG, AC_PREG, AC_PREGZM, and AC_ZREG. Others are left
as TODOs.
The generated files in this CL are produced by the `instgen` tool in CL
755180.
Original author Eric Fang (eric.fang@arm.com, CL 424137)
Change-Id: I483f170c776fcd8edd8b8b04520f9d69ee0855dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/742620
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Change-Id: I36a0f0989d37bef45ea8778da799b56a7e9a0c30
Reviewed-on: https://go-review.googlesource.com/c/go/+/529515
Run-TryBot: shuang cui <imcusg@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
|
|
The previous code treats some operands such as EQ, LT, etc. as special
registers. However, they are not. This CL adds a new AddrType TYPE_SPOPD
and a new class C_SPOPD to support this kind of special operands, and
refactors the relevant code.
This patch is a copy of CL 260861, contributed by Junchen Li(junchen.li@arm.com).
Co-authored-by: Junchen Li(junchen.li@arm.com)
Change-Id: I57b28da458ee3332f610602632e7eda03af435f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/302849
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Eric Fang <eric.fang@arm.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This CL adds system register error checking test cases. There're two kinds of
error test cases:
1. illegal combination.
MRS should be used in this way: MRS <system register>, <general register>.
MSR should be used in this way: MSR <general register>, <system register>.
Error usage examples:
MRS R8, VTCR_EL2 // ERROR "illegal combination"
MSR VTCR_EL2, R8 // ERROR "illegal combination"
2. illegal read or write access.
Error usage examples:
MSR R7, MIDR_EL1 // ERROR "expected writable system register or pstate"
MRS OSLAR_EL1, R3 // ERROR "expected readable system register"
This CL reads system registers readable and writeable property to check whether
they're used with legal read or write access. This property is named AccessFlags
in sysRegEnc.go, and it is automatically generated by modifing the system register
generator.
Change-Id: Ic83d5f372de38d1ecd0df1ca56b354ee157f16b4
Reviewed-on: https://go-review.googlesource.com/c/go/+/194917
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This patch supports the EL0 and EL1 system registers used in MRS/MSR
instructions. This patch refactors the assembler code, allowing the
assembler to read system register information from the automatically
generated sysRegEnc.go file and move existing declared system registers
to the sysRegEnc.go file.
This patch adds 431 system registers, it is worth noting that the number
of special registers is initialized to less than 1024 in the list7.go file.
This CL also adds some test cases to test the newly added system registers.
The test cases are contributed by Dianhong Xu <Dianhong.Xu@arm.com>
Change-Id: Ic09a937eaaeefe82bd08b5dd726808f8ff6cebf6
Reviewed-on: https://go-review.googlesource.com/c/go/+/189577
Reviewed-by: Ben Shi <powerman1st@163.com>
Run-TryBot: Ben Shi <powerman1st@163.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
- Uncomment tests for AVX512 encoder
- Permit instruction suffixes for x86
- Permit limited reg list [reg-reg] syntax for x86 for multi-source ops
- EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.)
- optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216)
Note: suffix formatting implemented with updated CConv function.
Now arch asm backend should register formatting function by
calling RegisterOpSuffix.
Updates #22779
Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8
Reviewed-on: https://go-review.googlesource.com/113315
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The patch adds support for LDR(register offset) instruction.
And add the test cases and negative tests.
Change-Id: I5b32c6a5065afc4571116d4896f7ebec3c0416d3
Reviewed-on: https://go-review.googlesource.com/87955
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This change adds VLD1, VST1, VPMULL{2}, VEXT, VRBIT, VUSHR and VSHL instructions
for supporting AES-GCM implementation later.
Fixes #24400
Change-Id: I556feb88067f195cbe25629ec2b7a817acc58709
Reviewed-on: https://go-review.googlesource.com/101095
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Current ARM64 assembler has no check for the invalid value of both
shift amount and post-index immediate offset of LD1/ST1. This patch
adds the check.
This patch also fixes the printing error of register number equals
to 31, which should be printed as ZR instead of R31. Test cases
are also added.
Change-Id: I476235f3ab3a3fc91fe89c5a3149a4d4529c05c7
Reviewed-on: https://go-review.googlesource.com/100255
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
|
|
Improve runtime memclr_arm64.s using ZVA feature to zero out memory when n
is at least 64 bytes.
Also add DCZID_EL0 system register to use in MRS instruction.
Benchmark results of runtime/Memclr on Amberwing:
name old time/op new time/op delta
Memclr/5 12.7ns ± 0% 12.7ns ± 0% ~ (all equal)
Memclr/16 12.7ns ± 0% 12.2ns ± 1% -4.13% (p=0.000 n=7+8)
Memclr/64 14.0ns ± 0% 14.6ns ± 1% +4.29% (p=0.000 n=7+8)
Memclr/256 23.7ns ± 0% 25.7ns ± 0% +8.44% (p=0.000 n=8+7)
Memclr/4096 204ns ± 0% 74ns ± 0% -63.71% (p=0.000 n=8+8)
Memclr/65536 2.89µs ± 0% 0.84µs ± 0% -70.91% (p=0.000 n=8+8)
Memclr/1M 45.9µs ± 0% 17.0µs ± 0% -62.88% (p=0.000 n=8+8)
Memclr/4M 184µs ± 0% 77µs ± 4% -57.94% (p=0.001 n=6+8)
Memclr/8M 367µs ± 0% 144µs ± 1% -60.72% (p=0.000 n=7+8)
Memclr/16M 734µs ± 0% 293µs ± 1% -60.09% (p=0.000 n=8+8)
Memclr/64M 2.94ms ± 0% 1.23ms ± 0% -58.06% (p=0.000 n=7+8)
GoMemclr/5 8.00ns ± 0% 8.79ns ± 0% +9.83% (p=0.000 n=8+8)
GoMemclr/16 8.00ns ± 0% 7.60ns ± 0% -5.00% (p=0.000 n=8+8)
GoMemclr/64 10.8ns ± 0% 10.4ns ± 0% -3.70% (p=0.000 n=8+8)
GoMemclr/256 20.4ns ± 0% 21.2ns ± 0% +3.92% (p=0.000 n=8+8)
name old speed new speed delta
Memclr/5 394MB/s ± 0% 393MB/s ± 0% -0.28% (p=0.006 n=8+8)
Memclr/16 1.26GB/s ± 0% 1.31GB/s ± 1% +4.07% (p=0.000 n=7+8)
Memclr/64 4.57GB/s ± 0% 4.39GB/s ± 2% -3.91% (p=0.000 n=7+8)
Memclr/256 10.8GB/s ± 0% 10.0GB/s ± 0% -7.95% (p=0.001 n=7+6)
Memclr/4096 20.1GB/s ± 0% 55.3GB/s ± 0% +175.46% (p=0.000 n=8+8)
Memclr/65536 22.6GB/s ± 0% 77.8GB/s ± 0% +243.63% (p=0.000 n=7+8)
Memclr/1M 22.8GB/s ± 0% 61.5GB/s ± 0% +169.38% (p=0.000 n=8+8)
Memclr/4M 22.8GB/s ± 0% 54.3GB/s ± 4% +137.85% (p=0.001 n=6+8)
Memclr/8M 22.8GB/s ± 0% 58.1GB/s ± 1% +154.56% (p=0.000 n=7+8)
Memclr/16M 22.8GB/s ± 0% 57.2GB/s ± 1% +150.54% (p=0.000 n=8+8)
Memclr/64M 22.8GB/s ± 0% 54.4GB/s ± 0% +138.42% (p=0.000 n=7+8)
GoMemclr/5 625MB/s ± 0% 569MB/s ± 0% -8.90% (p=0.000 n=7+8)
GoMemclr/16 2.00GB/s ± 0% 2.10GB/s ± 0% +5.26% (p=0.000 n=8+8)
GoMemclr/64 5.92GB/s ± 0% 6.15GB/s ± 0% +3.83% (p=0.000 n=7+8)
GoMemclr/256 12.5GB/s ± 0% 12.1GB/s ± 0% -3.77% (p=0.000 n=8+7)
Benchmark results of runtime/Memclr on Amberwing without ZVA:
name old time/op new time/op delta
Memclr/5 12.7ns ± 0% 12.8ns ± 0% +0.79% (p=0.008 n=5+5)
Memclr/16 12.7ns ± 0% 12.7ns ± 0% ~ (p=0.444 n=5+5)
Memclr/64 14.0ns ± 0% 14.4ns ± 0% +2.86% (p=0.008 n=5+5)
Memclr/256 23.7ns ± 1% 19.2ns ± 0% -19.06% (p=0.008 n=5+5)
Memclr/4096 203ns ± 0% 119ns ± 0% -41.38% (p=0.008 n=5+5)
Memclr/65536 2.89µs ± 0% 1.66µs ± 0% -42.76% (p=0.008 n=5+5)
Memclr/1M 45.9µs ± 0% 26.2µs ± 0% -42.82% (p=0.008 n=5+5)
Memclr/4M 184µs ± 0% 105µs ± 0% -42.81% (p=0.008 n=5+5)
Memclr/8M 367µs ± 0% 210µs ± 0% -42.76% (p=0.008 n=5+5)
Memclr/16M 734µs ± 0% 420µs ± 0% -42.74% (p=0.008 n=5+5)
Memclr/64M 2.94ms ± 0% 1.69ms ± 0% -42.46% (p=0.008 n=5+5)
GoMemclr/5 8.00ns ± 0% 8.40ns ± 0% +5.00% (p=0.008 n=5+5)
GoMemclr/16 8.00ns ± 0% 8.40ns ± 0% +5.00% (p=0.008 n=5+5)
GoMemclr/64 10.8ns ± 0% 9.6ns ± 0% -11.02% (p=0.008 n=5+5)
GoMemclr/256 20.4ns ± 0% 17.2ns ± 0% -15.69% (p=0.008 n=5+5)
name old speed new speed delta
Memclr/5 393MB/s ± 0% 391MB/s ± 0% -0.64% (p=0.008 n=5+5)
Memclr/16 1.26GB/s ± 0% 1.26GB/s ± 0% -0.55% (p=0.008 n=5+5)
Memclr/64 4.57GB/s ± 0% 4.44GB/s ± 0% -2.79% (p=0.008 n=5+5)
Memclr/256 10.8GB/s ± 0% 13.3GB/s ± 0% +23.07% (p=0.016 n=4+5)
Memclr/4096 20.1GB/s ± 0% 34.3GB/s ± 0% +70.91% (p=0.008 n=5+5)
Memclr/65536 22.7GB/s ± 0% 39.6GB/s ± 0% +74.65% (p=0.008 n=5+5)
Memclr/1M 22.8GB/s ± 0% 40.0GB/s ± 0% +74.88% (p=0.008 n=5+5)
Memclr/4M 22.8GB/s ± 0% 39.9GB/s ± 0% +74.84% (p=0.008 n=5+5)
Memclr/8M 22.9GB/s ± 0% 39.9GB/s ± 0% +74.71% (p=0.008 n=5+5)
Memclr/16M 22.9GB/s ± 0% 39.9GB/s ± 0% +74.64% (p=0.008 n=5+5)
Memclr/64M 22.8GB/s ± 0% 39.7GB/s ± 0% +73.79% (p=0.008 n=5+5)
GoMemclr/5 625MB/s ± 0% 595MB/s ± 0% -4.77% (p=0.000 n=4+5)
GoMemclr/16 2.00GB/s ± 0% 1.90GB/s ± 0% -4.77% (p=0.008 n=5+5)
GoMemclr/64 5.92GB/s ± 0% 6.66GB/s ± 0% +12.48% (p=0.016 n=4+5)
GoMemclr/256 12.5GB/s ± 0% 14.9GB/s ± 0% +18.95% (p=0.008 n=5+5)
Fixes #22948
Change-Id: Iaae4e22391e25b54d299821bb7f8a81ac3986b93
Reviewed-on: https://go-review.googlesource.com/82055
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The current assembler cannot handle PRFM(immediate) instruciton.
The fix creates a prfopfield struct that contains the eight
prefetch operations and the value to use in instruction. And add
the test cases.
Fixes #22932
Change-Id: I621d611bd930ef3c42306a4372447c46d53b2ccf
Reviewed-on: https://go-review.googlesource.com/81675
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Some ARM64-specific instructions (such as SIMD instructions) are not supported.
This patch adds support for the following:
1. Extended register, e.g.:
ADD Rm.<ext>[<<amount], Rn, Rd
<ext> can have the following values:
UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW and SXTX
2. Arrangement for SIMD instructions, e.g.:
VADDP Vm.<T>, Vn.<T>, Vd.<T>
<T> can have the following values:
B8, B16, H4, H8, S2, S4 and D2
3. Width specifier and element index for SIMD instructions, e.g.:
VMOV Vn.<T>[index], Rd // MOV(to general register)
<T> can have the following values:
S and D
4. Register List, e.g.:
VLD1 (Rn), [Vt1.<T>, Vt2.<T>, Vt3.<T>]
5. Register offset variant, e.g.:
VLD1.P (Rn)(Rm), [Vt1.<T>, Vt2.<T>] // Rm is the post-index register
6. Go assembly for ARM64 reference manual
new added instructions are required to have according explanation items in
the manual and items for existed instructions will be added incrementally
For more information about the refinement background, please refer to the
discussion (https://groups.google.com/forum/#!topic/golang-dev/rWgDxCrL4GU)
This patch only adds syntax and doesn't break any assembly that already exists.
Change-Id: I34e90b7faae032820593a0e417022c354a882008
Reviewed-on: https://go-review.googlesource.com/41654
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Each architecture's Rconv function is only used inside its
respective package, so it does not need to be exported.
Change-Id: Ifbd629964d7a9edd66501d7cdf4750621d66d646
Reviewed-on: https://go-review.googlesource.com/39110
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Current V-register range is V32~V63 on arm64. This patch changes it to
V0~V31.
fix #15465.
Change-Id: I90dab42dea46825ec5d7a8321ec4f6550735feb8
Reviewed-on: https://go-review.googlesource.com/22520
Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
Run-TryBot: Aram Hăvărneanu <aram@mgk.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
ARM64 (ARMv8) has 32 general purpose, 64-bit integer registers
(R0-R31), 32 64-bit scalar floating point registers (F0-F31), and
32 128-bit vector registers (unused, V0-V31).
R31 is either the stack pointer (RSP), or the zero register (ZR),
depending on the instruction. Note the distinction between the
hardware stack pointer, RSP, and the virtual stack pointer SP.
The (hardware) stack pointer must be 16-byte aligned at all times;
the RSP register itself must be aligned, offset(RSP) only has to
have natural alignment.
Instructions are fixed-width, and are 32-bit wide. ARM64 supports
ARMv7 too (32-bit ARM), but not in the same process. In general,
there is not much in common between 32-bit ARM and ARM64, it's a
new architecture.
All implementations have floating point instructions.
This change adds a Prog.To3 field analogous to Prog.To. It is used
by exclusive load/store instructions such as STLXR which read from
one register, and write to both a register and a memory address.
STLXRW R1, (R0), R3
This will store the word contained in R1 to the memory address
pointed by R0. R3 will be updated with the status result of the
store. It is used to implement atomic operations.
No other changes are made to the portable Prog and Addr structures.
Change-Id: Ie839029aa5265bbad35769d9689eca11e1c48c47
Reviewed-on: https://go-review.googlesource.com/7046
Reviewed-by: Russ Cox <rsc@golang.org>
|