path: root/src/cmd/internal/obj/arm64/asm7.go
Age | Commit message | Author
39 hours | cmd/asm, cmd/internal/obj/arm64: support memory with extensions in SVE | Junyang Shao
This CL is generated by CL 764800. Supported addressing patterns: (Z7.D.SXTW<<2)(Z6.D), where Z6.D is the base, Z7.D is the indices. SXTW/UXTW represents signed/unsigned extension, << represents LSL. Change-Id: Ifc6c47833d5113be7cfe96943d369ab977b3a6ee Reviewed-on: https://go-review.googlesource.com/c/go/+/764780 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: golang-scoped@luci-project-accounts.iam.gserviceaccount.com <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: Junyang Shao <shaojunyang@google.com>
5 days | cmd/asm, cmd/internal/obj/arm64: support register with index in SVE | Junyang Shao
This CL is generated by CL 759800. The new register patterns are (examples): Z1.B[5] Z2[6] P1[7] PN1[8] Change-Id: I5bccc4f1c0474dbd4cd4878bd488f36a7026c7ca Reviewed-on: https://go-review.googlesource.com/c/go/+/759780 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
7 days | cmd/internal/obj/arm64: add ASIMD cross-lane reduction instructions | Alexander Musman
Add support for ASIMD instructions that reduce a vector to a scalar by operating across all lanes. These use the ASIMDALL encoding class from the ARM architecture specification.

Integer cross-lane reductions (.B8, .B16, .H4, .H8, .S4):
    Signed max/min across lanes: VSMAXV, VSMINV
    Unsigned max/min across lanes: VUMAXV, VUMINV

Floating-point cross-lane reductions (.S4 arrangement):
    FP max/min across lanes: VFMAXV, VFMINV
    FP max/min across lanes (NM): VFMAXNMV, VFMINNMV

Change-Id: I6af4462d26803dfc7c78db2ad9df4284083e31e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/762202
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
7 days | cmd/internal/obj/arm64: add ASIMD miscellaneous unary instructions | Alexander Musman
Add support for ASIMD unary miscellaneous instructions that operate on a single source register. These use the ASIMDMISC encoding class from the ARM architecture specification.

These instructions need some validation for arrangement constraints:
- VNOT only allows .B8/.B16 arrangements
- VCLS/VCLZ do not support D arrangements
- Floating-point variants (VFABS, VFNEG, VFSQRT, VFRINT*) only allow floating-point arrangements (S and D)

New instructions by group:
    Integer absolute/negate: VABS, VNEG
    Floating-point abs/negate: VFABS, VFNEG
    Floating-point sqrt: VFSQRT
    Floating-point round: VFRINTN, VFRINTP, VFRINTM, VFRINTZ
    Saturating abs/negate: VSQABS, VSQNEG
    Bit/count operations: VCLS, VCLZ, VNOT

Change-Id: I62242eda31f82cd34119c7d4f97316a030e7663b
Reviewed-on: https://go-review.googlesource.com/c/go/+/762201
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
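The arrangement constraints listed in the commit above can be read as a small validation table. The following is a minimal sketch of that reading, not the assembler's actual code: the function name `validArrangement` and the use of plain strings for opcodes and arrangements are assumptions made for illustration, and only the three rules stated in the message are modelled.

```go
package main

import (
	"fmt"
	"strings"
)

// validArrangement reports whether the element-size letter of arng (for
// example "B16" or "S4") is acceptable for op under the three constraints
// from the commit message. Hypothetical helper for illustration only.
func validArrangement(op, arng string) bool {
	if arng == "" {
		return false
	}
	elem := arng[0] // 'B', 'H', 'S' or 'D'
	switch {
	case op == "VNOT":
		// VNOT only allows .B8/.B16.
		return elem == 'B'
	case op == "VCLS" || op == "VCLZ":
		// VCLS/VCLZ do not support D arrangements.
		return elem != 'D'
	case strings.HasPrefix(op, "VF"):
		// Floating-point variants only allow S and D arrangements.
		return elem == 'S' || elem == 'D'
	}
	// No constraint is stated in the message for the remaining opcodes.
	return true
}

func main() {
	fmt.Println(validArrangement("VNOT", "B16"))  // true
	fmt.Println(validArrangement("VCLZ", "D2"))   // false
	fmt.Println(validArrangement("VFABS", "H8"))  // false
	fmt.Println(validArrangement("VFSQRT", "S4")) // true
}
```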
7 days | cmd/internal/obj/arm64: add ASIMD arithmetic instructions | Alexander Musman
Add encoding support for ASIMD three-register instructions covering floating-point, saturating, halving, integer multiply/accumulate, min/max (including pairwise variants), and bitwise operations.

These belong to the "Advanced SIMD Three-register (same)" instruction class defined by the ARM architecture, meaning the two source registers use the same element arrangement (e.g., both .S4 or both .D2). In the assembler they share a common encoding path using the ASIMDSAME() macro.

New instructions by group:
    Floating-point arithmetic: VFADD, VFSUB, VFMUL, VFDIV
    Floating-point min/max: VFMAX, VFMAXNM, VFMIN, VFMINNM
    Pairwise floating-point: VFADDP, VFMAXP, VFMINP, VFMAXNMP, VFMINNMP
    Saturating arithmetic: VSQADD, VUQADD, VSQSUB, VUQSUB
    Average (halving add): VSHADD, VSRHADD, VUHADD, VURHADD
    Integer multiply/accum: VMUL, VMLA, VMLS
    Integer min/max: VSMAX, VSMIN
    Pairwise integer min/max: VSMAXP, VSMINP, VUMAXP, VUMINP
    Bitwise: VBIC, VORN

Change-Id: I732c84123ad1f302260514fdfe0d020787da017b
Reviewed-on: https://go-review.googlesource.com/c/go/+/762200
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
7 days | cmd/internal/obj/arm64: add ASIMD shift instructions | Alexander Musman
Add support for ASIMD shift instructions. These use the ASIMDSHF encoding class from the ARM architecture specification, where the shift amount is encoded as an immediate derived from the element size.

Also add ASIMD shifts-by-vector (3-register form) where the shift amount comes from a second vector register. These use the ASIMDSAME encoding class.

New instructions by group:
    Shift by immediate (signed): VSSHR, VSRSHR
    Shift by immediate (saturating): VSQSHL, VUQSHL
    Narrowing shift by immediate: VSHRN, VSHRN2
    Shift by vector (3-reg): VSSHL, VUSHL, VSQSHL, VUQSHL

Change-Id: I039cc16bc01980b04e6940cc1d4670faf5fa7e3c
Reviewed-on: https://go-review.googlesource.com/c/go/+/762180
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
7 days | cmd/internal/obj/arm64: add remaining ASIMD compare instructions | Alexander Musman
Add remaining arm64 ASIMD vector compare instructions. All these instructions produce either all zeroes (false) or all ones (true) bits in each corresponding lane as the result.

Added integer comparison instructions:
- VCMEQ (compare to zero)
- VCMGE, VCMGT (signed, both two-register and compare to zero)
- VCMHI, VCMHS (unsigned two-register compare)
- VCMLE, VCMLT (signed compare to zero)

Added floating-point comparison instructions:
- VFCMEQ, VFCMGE, VFCMGT (both two-register and zero variants)
- VFCMLE, VFCMLT (compare to zero)

Change-Id: I913165d3934f2556c9bdf38c5103ef56d86383ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/721640
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
7 days | cmd/internal/obj: refactor arm64 ASIMD instruction encoding | Alexander Musman
Refactor arm64 ASIMD opcodes to use common helper routines named after their instruction classes from the arm64 XML specification. Add helper routines like ASIMDSAME for instructions with encoding class "asimdsame" in arm64 encodingindex.xml. Helper arguments follow the bitfield order in the specification tables.

For example, the CMEQ instruction entry:

    <tr class="instructiontable" encname="CMEQ_asimdsame_only"...>
      <td bitwidth="1" class="bitfield">1</td>
      <td bitwidth="2" class="bitfield"></td>
      <td bitwidth="5" class="bitfield">10001</td>
      <td class="iformname" iformid="CMEQ_advsimd_reg">CMEQ (register)</td>
      <td class="enctags">Vector</td>
    </tr>

now corresponds to ASIMDSAME(1, 0, 0x11), where each argument matches the corresponding bitfield value in the table.

Change-Id: I024f3eba552906a865841bc1a296f14e3fca73f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/719280
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
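To make the ASIMDSAME(1, 0, 0x11) example concrete, here is a sketch of what such a helper might compute. It assumes the three arguments are the U, size and opcode fields of the Arm "Advanced SIMD three same" encoding, placed at bits 29, 23:22 and 15:11, with fixed bits 28:24 = 01110, bit 21 = 1 and bit 10 = 1. The name `asimdSame` and its signature are hypothetical; the real helper in asm7.go may differ.

```go
package main

import "fmt"

// asimdSame returns the opcode bits of an "Advanced SIMD three same"
// instruction with Q, Rm, Rn and Rd left as zero. The arguments follow the
// bitfield order of the specification table: U (1 bit), size (2 bits),
// opcode (5 bits). Hypothetical stand-in for the assembler's helper.
func asimdSame(u, size, opcode uint32) uint32 {
	const fixed = 0x0e200400 // bits 28:24 = 01110, bit 21 = 1, bit 10 = 1
	return fixed | u<<29 | size<<22 | opcode<<11
}

func main() {
	// CMEQ (register): U = 1, size left open (0 here), opcode = 0b10001.
	fmt.Printf("%#x\n", asimdSame(1, 0, 0x11)) // 0x2e208c00
}
```

Keeping the arguments in table order means an entry can be checked against the XML by reading both left to right, which appears to be the point of the refactoring.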
2026-03-20 | cmd/internal/obj/arm64: new arm64 assembling path for SVE | Junyang Shao
This CL integrates a new assembling path specifically designed for SVE and other modern ARM64 instructions, utilizing generated instruction tables. It contains the foundational files and modifications to direct the assembling pipeline to use this new data-driven path. In a.out.go, it registers new constants for registers and operand types used by SVE. A new file inst.go is added, which defines the instruction table data types and utility functions for the new path. The entry point from the upstream pipeline is `tryEncode`. `tryEncode` returns false upon an encoding failure, which allows the upstream matching logic to handle multiple potential matches. The exact match is not finalized until an instruction is actually encoded, as detailed in the comments for `elemEncoders`. This CL also introduces the core generated tables (`anames_gen.go`, `encoding_gen.go`, `goops_gen.go`, and `inst_gen.go`) which handle a wide variety of SVE instructions. A comprehensive end-to-end assembly test file (`arm64sveenc.s`) is added, containing hundreds of test cases for these SVE instructions to verify the new encoding path. To facilitate these encodings, this CL implements handling for operand types such as AC_ARNG, AC_PREG, AC_PREGZM, and AC_ZREG. Others are left as TODOs. The generated files in this CL are produced by the `instgen` tool in CL 755180. Original author Eric Fang (eric.fang@arm.com, CL 424137) Change-Id: I483f170c776fcd8edd8b8b04520f9d69ee0855dd Reviewed-on: https://go-review.googlesource.com/c/go/+/742620 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
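The message above describes `tryEncode` returning false so that the matching logic can move on to the next potential match. A generic sketch of that control flow follows. Everything in it (the `candidate` type, the `assemble` function, the operand model) is a hypothetical stand-in and does not reflect the actual types in inst.go.

```go
package main

import "fmt"

// candidate stands in for one entry of a generated instruction table.
// encode returns false when the operand cannot be encoded by this entry.
type candidate struct {
	name   string
	encode func(operand int) (uint32, bool)
}

// assemble tries each candidate in order. The match is only final once an
// encoding succeeds, mirroring the "try, and fall through on failure" idea.
func assemble(cands []candidate, operand int) (string, uint32, bool) {
	for _, c := range cands {
		if enc, ok := c.encode(operand); ok {
			return c.name, enc, true
		}
	}
	return "", 0, false
}

func main() {
	cands := []candidate{
		{"imm4", func(v int) (uint32, bool) { return uint32(v), v >= 0 && v < 16 }},
		{"imm8", func(v int) (uint32, bool) { return uint32(v), v >= 0 && v < 256 }},
	}
	name, enc, ok := assemble(cands, 100)
	fmt.Println(name, enc, ok) // imm8 100 true
}
```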
2026-02-27 | cmd/internal/obj: arm64 assembler support unshifted hi for large offsets | Alexander Musman
Extend splitImm24uScaled to support an unshifted hi value (hi <= 0xfff) in addition to the shifted hi value (hi & ^0xfff000 == 0). This allows load/store instructions to handle more offsets using ADD + load/store sequences instead of falling back to the literal pool. This will be used by a subsequent change to add FMOVQ support in SSA form. Change-Id: I78490f5b1a60d49c1d42ad4daefb5d4e6021c965 Reviewed-on: https://go-review.googlesource.com/c/go/+/737320 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
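The two accepted forms of the hi value can be written as a single predicate. This sketch only restates the two conditions quoted in the message; `fitsAddImm` is a hypothetical name, and the real splitImm24uScaled also computes the scaled lo part, which is not modelled here.

```go
package main

import "fmt"

// fitsAddImm reports whether hi can be used as an ADD immediate in either
// form named in the commit message:
//   - unshifted: hi <= 0xfff
//   - shifted:   hi & ^0xfff000 == 0 (a 12-bit value shifted left by 12)
//
// Hypothetical helper for illustration only.
func fitsAddImm(hi int32) bool {
	if hi < 0 {
		return false
	}
	unshifted := hi <= 0xfff
	shifted := hi&^0xfff000 == 0
	return unshifted || shifted
}

func main() {
	fmt.Println(fitsAddImm(0x123))    // true (unshifted)
	fmt.Println(fitsAddImm(0x123000)) // true (shifted)
	fmt.Println(fitsAddImm(0x123456)) // false (needs both parts)
}
```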
2026-02-24 | internal/cpu,cmd/internal/obj/arm64: add SB | Roland Shoemaker
Add the SB (speculation barrier) instruction, and an internal/cpu feature bit to check its availability. Change-Id: I7c2d887ae75598f7c11cc875ec15ec3be76c09f5 Reviewed-on: https://go-review.googlesource.com/c/go/+/729501 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-02-22 | cmd/internal/obj: support arm64 FMOVQ from/to global address | Alexander Musman
Support arm64 FMOVQ from/to global address. Currently there are no global addresses known to be aligned by 16 bytes, and with this CL we will always use R_ADDRARM64 relocation with ADRP+ADD+FMOVQ instructions. Change-Id: I283009eda151d1875cf4457734e79b68a941a6df Reviewed-on: https://go-review.googlesource.com/c/go/+/718001 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-02-17 | cmd/internal/obj: remove ARM64 prefix from encoding helpers | Alexander Musman
Remove the ARM64 prefix from encoding helper functions that were moved to cmd/internal/obj to be used by both cmd/asm and cmd/compile. These functions now use the package prefix and look like: arm64.EncodeRegisterExtension and arm64.RegisterListOffset. Change-Id: I3548a4fce1072083eb2f55310c9f7ca6a8e12253 Reviewed-on: https://go-review.googlesource.com/c/go/+/714320 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-02-17 | cmd/internal/obj: move ARM64RegisterListOffset from cmd/asm/internal/arch | Alexander Musman
Change-Id: Ieaacd8c40495e7dad61a068125b1d0e0cee832c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/713500 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-02-12 | cmd/compile: set alignment for all content-addressable symbols | Ian Lance Taylor
There is nothing particularly special about content-addressable symbols, it's just a place to start. This reduces the size of the tailscaled binary by about 16K. This happens mainly because before this CL the linker's symalign function kicks in for all static composite literals and PCDATA symbols, and gives them an alignment based on their size. If the size happens to be a multiple of 32, it gets an alignment of 32. That wastes space. For #6853 For #36313 Change-Id: I2f049eee8f2463dd2b5e20d7c9a270ac32a31e50 Reviewed-on: https://go-review.googlesource.com/c/go/+/727920 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-11-23 | cmd/internal/obj/arm64, image/gif, runtime, sort: use math/bits to calculate log2 | Axel Wagner
In several places the integer log2 is calculated using loops or similar mechanisms. math/bits.Len* provide a simpler and more efficient mechanism for this. Annoyingly, every usage has slightly different ideas of what "log2" means and how non-positive inputs should be handled. I verified the replacements in each case by comparing the result for inputs from 0 to 1<<16. Change-Id: Ie962a74674802da363e0038d34c06979ccb41cf3 Reviewed-on: https://go-review.googlesource.com/c/go/+/721880 Reviewed-by: Mark Freeman <markfreeman@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
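As an illustration of the replacement described above, the sketch below pairs one loop-based floor(log2) with a `math/bits.Len64` version and compares them over the same input range the author mentions. The function names are made up for this example, and the chosen convention (returning -1 for 0) is only one of the several "log2" meanings the message alludes to.

```go
package main

import (
	"fmt"
	"math/bits"
)

// log2Loop is the kind of loop-based floor(log2(x)) being replaced.
// It returns -1 for x == 0.
func log2Loop(x uint64) int {
	n := -1
	for ; x != 0; x >>= 1 {
		n++
	}
	return n
}

// log2Bits computes the same value with math/bits. bits.Len64(0) is 0,
// so the result for x == 0 is also -1.
func log2Bits(x uint64) int {
	return bits.Len64(x) - 1
}

func main() {
	// Compare both versions for inputs from 0 to 1<<16.
	for x := uint64(0); x <= 1<<16; x++ {
		if log2Loop(x) != log2Bits(x) {
			panic(fmt.Sprintf("mismatch at %d", x))
		}
	}
	fmt.Println(log2Bits(1), log2Bits(1024), log2Bits(1025)) // 0 10 10
}
```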
2025-11-10 | cmd/internal/obj/arm64: shorten constant integer loads | Keith Randall
Large integer constants can take up to 4 instructions to encode. We can encode some large constants with a single instruction, namely those which are bit patterns (repetitions of certain runs of 0s and 1s). Often the constants we want to encode are *close* to those bit patterns, but don't exactly match. For those, we can use 2 instructions, one to load the close-by bit pattern and one to fix up any mismatches. The constants we use to strength reduce divides often fit this pattern. For unsigned divides by 1 through 15, this CL applies to the constant for N=3,5,6,10,12,15. Triggers 17 times in hello world. Change-Id: I623abf32961fb3e74d0a163f6822f0647cd94499 Reviewed-on: https://go-review.googlesource.com/c/go/+/717900 Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
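To see where the "up to 4 instructions" figure comes from, the following sketch counts the non-zero 16-bit chunks of a 64-bit constant, which is the cost of a plain MOVZ followed by MOVKs. It is a deliberately naive model: it ignores MOVN and bitmask immediates, does not implement the two-instruction "close-by pattern plus fix-up" scheme the CL adds, and `naiveMovCount` is a hypothetical name.

```go
package main

import "fmt"

// naiveMovCount returns how many instructions a MOVZ + MOVK sequence needs
// to build v: one per non-zero 16-bit chunk, and at least one overall.
// Illustrative only; the assembler has cheaper encodings for many values.
func naiveMovCount(v uint64) int {
	n := 0
	for s := 0; s < 64; s += 16 {
		if (v>>s)&0xffff != 0 {
			n++
		}
	}
	if n == 0 {
		return 1 // a single MOVZ still loads zero
	}
	return n
}

func main() {
	fmt.Println(naiveMovCount(0x10000))            // 1
	fmt.Println(naiveMovCount(0x0000ffff0000ffff)) // 2
	// A constant one bit away from the repeating pattern 0xaaaa...aaaa.
	fmt.Println(naiveMovCount(0xaaaaaaaaaaaaaaab)) // 4
}
```

The last value is the sort of constant the message has in mind: it differs from a repeating bit pattern only in its low bits, so loading the pattern and then fixing up the mismatch would take two instructions instead of four.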
2025-11-03 | cmd/internal/obj: support arm64 FMOVQ large offset encoding | Alexander Musman
Support arm64 FMOVQ with a large immediate offset, which is encoded using the register offset instruction in opldrr or opstrr. This will help fold immediates into the new SSA ops FMOVQload and FMOVQstore.

For example:

    FMOVQ F0, -20000(R0)

is encoded as follows:

    MOVD 3(PC), R27
    FMOVQ F0, (R0)(R27)
    RET
    ffff b1e0 # constant value

Change-Id: Ib71f92f6ff4b310bda004a440b1df41ffe164523
Reviewed-on: https://go-review.googlesource.com/c/go/+/716960
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-10-21 | all: eliminate unnecessary type conversions | Jes Cok
Found by github.com/mdempsky/unconvert Change-Id: I88ce10390a49ba768a4deaa0df9057c93c1164de GitHub-Last-Rev: 3b0f7e8f74f58340637f33287c238765856b2483 GitHub-Pull-Request: golang/go#75974 Reviewed-on: https://go-review.googlesource.com/c/go/+/712940 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com>
2025-10-15 | cmd/internal/obj: move ARM64RegisterExtension from cmd/asm/internal/arch | Vasily Leonenko
Change-Id: Iab41674953655efa7be3d306dfb3f5be486be501 Reviewed-on: https://go-review.googlesource.com/c/go/+/701455 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2025-10-14 | cmd/internal/obj/arm64: add support for PAC instructions | Bill Roberts
Add support for the Pointer Authentication Code instructions required for the ELF ABI when enabling PAC aware binaries. This allows assembly writers to add PAC instructions where needed to support this ABI. Follow up work is to enable the compiler to emit these instructions in the appropriate places.

The TL;DR for the Linux ABI is that the prologue of a function that pushes the link register (LR) to the stack signs the LR with a key managed by the operating system and hardware using a PAC instruction, like "paciasp". The function epilog, when restoring the LR from the stack, will verify the signature, using an instruction like "autiasp". This helps prevent attackers from modifying the return address on the stack, a common technique for ROP attacks.

Details on PAC can be found here:
- https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/enabling-pac-and-bti-on-aarch64
- https://developer.arm.com/documentation/109576/0100/Pointer-Authentication-Code

The ABI details can be found here:
- https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst

Change-Id: I4516ed1294d19f9ff9d278833d542821b6642aa9
Reviewed-on: https://go-review.googlesource.com/c/go/+/676675
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-10-08 | cmd/internal/obj: fix Link.Diag printf errors | Alan Donovan
go1.26's vet printf checker can associate the printf-wrapper property with local vars and struct fields if they are assigned from a printf-like func literal (CL 706635). This leads to better detection of mistakes. Change-Id: I604be1e200aa1aba75e09d4f36ab68c1dba3b8a3 Reviewed-on: https://go-review.googlesource.com/c/go/+/710195 Auto-Submit: Alan Donovan <adonovan@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-10-07 | Revert "cmd/compile: redo arm64 LR/FP save and restore" | Keith Randall
This reverts commit 719dfcf8a8478d70360bf3c34c0e920be7b32994. Reason for revert: Causing crashes. Change-Id: I0b8526dd03d82fa074ce4f97f1789eeac702b3eb Reviewed-on: https://go-review.googlesource.com/c/go/+/709755 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-10-06 | cmd/compile: redo arm64 LR/FP save and restore | Keith Randall
Instead of storing LR (the return address) at 0(SP) and the FP (parent's frame pointer) at -8(SP), store them at framesize-8(SP) and framesize-16(SP), respectively.

We push and pop data onto the stack such that we're never accessing anything below SP.

The prolog/epilog lengths are unchanged (3 insns for a typical prolog, 2 for a typical epilog).

We use 8 bytes more per frame.

Typical prologue:

    STP.W (FP, LR), -16(SP)
    MOVD SP, FP
    SUB $C, SP

Typical epilogue:

    ADD $C, SP
    LDP.P 16(SP), (FP, LR)
    RET

The previous word where we stored LR, at 0(SP), is now unused. We could repurpose that slot for storing a local variable.

The new prolog and epilog instructions are recognized by libunwind, so pc-sampling tools like perf should now be accurate. (TODO: except maybe after the first RET instruction? Have to look into that.)

Update #73753 (fixes, for arm64)
Update #57302 (Quim thinks this will help on that issue)

Change-Id: I4800036a9a9a08aaaf35d9f99de79a36cf37ebb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/674615
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
2025-08-15 | runtime: remove duff support for arm64 | Keith Randall
Change-Id: Ib290079a77a746a8512cd4638310b24164f6a930 Reviewed-on: https://go-review.googlesource.com/c/go/+/679456 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-05 | cmd: remove dead code | qiulaidongfeng
Fixes #74076 Change-Id: Icc67b3d4e342f329584433bd1250c56ae8f5a73d Reviewed-on: https://go-review.googlesource.com/c/go/+/690635 Reviewed-by: Alan Donovan <adonovan@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: Alan Donovan <adonovan@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Alan Donovan <adonovan@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-04-03 | cmd/internal/obj/arm64: return a bit shift from movcon | Joel Sing
Return the shift in bits from movcon, rather than returning an index. This allows a number of multiplications to be removed, making the code more readable. Scale down to an index only when encoding. Change-Id: I1be91eb526ad95d389e2f8ce97212311551790df Reviewed-on: https://go-review.googlesource.com/c/go/+/650939 Auto-Submit: Joel Sing <joel@sing.id.au> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
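The idea of returning a bit shift rather than an index can be sketched as follows. This is an illustration under assumptions, not a copy of the function in asm7.go: `movconShift` is a hypothetical name, and the test used here (all set bits fall within one aligned 16-bit chunk) is an assumed reading of what movcon classifies.

```go
package main

import "fmt"

// movconShift returns the shift in bits (0, 16, 32 or 48) of the single
// aligned 16-bit chunk that holds every set bit of v, or -1 if the set
// bits span more than one chunk. Hypothetical sketch.
func movconShift(v int64) int {
	for s := 0; s < 64; s += 16 {
		if uint64(v)&^(uint64(0xffff)<<s) == 0 {
			return s
		}
	}
	return -1
}

func main() {
	for _, v := range []int64{0x1234, 0x12340000, 0x10001} {
		s := movconShift(v)
		if s < 0 {
			fmt.Printf("%#x: not a single 16-bit chunk\n", v)
			continue
		}
		// Callers work with the shift directly; the chunk index is only
		// derived at encoding time by scaling down.
		fmt.Printf("%#x: shift %d bits, index %d\n", v, s, s/16)
	}
}
```

With the shift returned directly, expressions such as `v >> s` need no multiplication by 16, which is the readability gain the message describes.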
2025-04-03 | cmd/internal/obj/arm64: deduplicate con32class | Joel Sing
Teach conclass how to handle 32 bit values and deduplicate the code between con32class and conclass. Change-Id: I9c5eea31d443fd4c2ce700c6ea21e1d0bef665b0 Reviewed-on: https://go-review.googlesource.com/c/go/+/650938 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Joel Sing <joel@sing.id.au>
2025-04-03 | cmd/internal/obj/arm64: simplify conclass | Joel Sing
Reduce repetition by pulling some common conversions into variables. Change-Id: I8c1cc806236b5ecdadf90f4507923718fa5de9b6 Reviewed-on: https://go-review.googlesource.com/c/go/+/650937 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-29 | cmd/internal/obj/arm64: factor out constant classification code | Joel Sing
This will allow for further improvements and deduplication. Change-Id: I9374fc2d16168ced06f3fcc9e558a9c85e24fd01 Reviewed-on: https://go-review.googlesource.com/c/go/+/650936 Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-03-26 | cmd/internal/obj/arm64: add support for BTI instruction | Joel Sing
Add support for the `BTI' instruction to the arm64 assembler. This instruction provides Branch Target Identification for targets of indirect branches. A BTI can be marked with a target type of 'C' (call), 'J' (jump) or 'JC' (jump or call). Updates #66054 Change-Id: I1cf31a0382207bb75b9b2deb49ac298a59c00d8a Reviewed-on: https://go-review.googlesource.com/c/go/+/646781 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Marvin Drees <marvin.drees@9elements.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-03-25 | cmd/internal/obj/arm64: move register encoding into opldrr/opstrr | Joel Sing
Rather than having register encoding knowledge in each caller of opldrr/opstrr (and in a separate olsxrr function), pass the registers into opldrr/opstrr and let them handle the encoding. This reduces duplication and improves readability. Change-Id: I50a25263f305d01454f3ff95e8b6e7c76e760ab0 Reviewed-on: https://go-review.googlesource.com/c/go/+/471521 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-25 | cmd/internal/obj/arm64: provide and use an oprrrr function | Joel Sing
Provide a four register version of oprrr, which takes an additional 'ra' register. Use this instead of oprrr where appropriate. Change-Id: I8882957a83c2b08e407f37a37c61864cd920bbc9 Reviewed-on: https://go-review.googlesource.com/c/go/+/471519 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-25 | cmd/internal/obj/arm64: move register encoding into oprrr | Joel Sing
Rather than having register encoding knowledge in each caller of oprrr, pass the registers into oprrr and let it handle the encoding. This reduces duplication and improves readability. Change-Id: Iab6c70f7796b7a8c071419654b8a5686aeee8c1b Reviewed-on: https://go-review.googlesource.com/c/go/+/471518 Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-03-25 | cmd/internal/obj/arm64: replace range checks with isaddcon2 | Joel Sing
isaddcon2 tests for the range 0 <= v <= 0xffffff - replace duplicated range checks with calls to isaddcon2. Change-Id: Ia6f331852ed3d77715b265cb4fcc500579eac711 Reviewed-on: https://go-review.googlesource.com/c/go/+/650935 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2024-11-19 | cmd/internal/obj/arm64: recognize FIPS static temps as unaligned | Russ Cox
Code like x := [12]byte{1,2,3,4,5,6,7,8,9,10,11,12} stores x in a pair of registers and uses MOVD/MOVWU to load the values from RODATA. The code generator needs to understand not to use the aligned PC-relative relocation for that sequence. In non-FIPS modes, more statictemp optimizations can be applied and this problematic sequence doesn't happen. Fix the decision about whether to assume alignment to match the code used by the linker when deciding what to align. Fixes the linker failure in CL 626437 patch set 5. Change-Id: Iedad862c6faee758d4a2c5120cab2d329265b134 Reviewed-on: https://go-review.googlesource.com/c/go/+/628835 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Bypass: Russ Cox <rsc@golang.org>
2024-11-07 | cmd/internal/obj: replace obj.Addrel func with LSym.AddRel method | Russ Cox
The old API was to do

    r := obj.Addrel(sym)
    r.Type = this
    r.Off = that
    etc

The new API is:

    sym.AddRel(ctxt, obj.Reloc{Type: this, Off: that, etc})

This new API is more idiomatic and avoids ever having relocations that are only partially constructed. Most importantly, it sets up for sym.AddRel being able to check relocation validity in the future. (Passing ctxt is for use in validity checking.)

Passes golang.org/x/tools/cmd/toolstash/buildall.

Change-Id: I042ea76e61bb3bf6402f98ca11291a13f4799972
Reviewed-on: https://go-review.googlesource.com/c/go/+/625616
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
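The difference between the two API shapes can be shown with stand-in types. Since cmd/internal/obj cannot be imported from outside the Go tree, the `Reloc` and `LSym` types below are simplified local assumptions, and the `ctxt` parameter of the real method is omitted.

```go
package main

import "fmt"

// Reloc and LSym are simplified stand-ins for the obj package types.
type Reloc struct {
	Type int
	Off  int32
}

type LSym struct {
	R []Reloc
}

// addrel mimics the old style: it appends a zero Reloc and hands back a
// pointer, so the relocation is visible in s.R before it is filled in.
func addrel(s *LSym) *Reloc {
	s.R = append(s.R, Reloc{})
	return &s.R[len(s.R)-1]
}

// AddRel mimics the new style: the caller builds a complete value first,
// which gives the method a single place to validate it later.
func (s *LSym) AddRel(r Reloc) {
	s.R = append(s.R, r)
}

func main() {
	var s LSym

	r := addrel(&s) // old: partially constructed until the fields are set
	r.Type = 1
	r.Off = 4

	s.AddRel(Reloc{Type: 2, Off: 8}) // new: complete on arrival

	fmt.Println(s.R) // [{1 4} {2 8}]
}
```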
2024-09-06 | cmd/internal/obj/arm64: Add helpers for span7 passes | Sebastian Nickolls
Adds helper functions for the literal pooling, large branch handling and code emission stages of the span7 assembler pass. This hides the implementation of the current assembler from the general workflow in span7 to make the implementation easier to change in future. Updates #44734 Change-Id: I8859956b23ad4faebeeff6df28051b098ef90fed Reviewed-on: https://go-review.googlesource.com/c/go/+/595755 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-09-03 | cmd: replace many sort.Interface with slices.Sort and SortFunc | Zxilly
with slices there's no need to implement sort.Interface Change-Id: I59167e78881cb1df89a71e33d738d6aeca7adb71 GitHub-Last-Rev: 507ba84453f7305b6b2bf6317292111c00c93ffe GitHub-Pull-Request: golang/go#68724 Reviewed-on: https://go-review.googlesource.com/c/go/+/602895 Reviewed-by: Ian Lance Taylor <iant@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2024-08-14 | cmd/internal/obj/arm64: Emit UDF instruction for undefined Prog nodes | Sebastian Nickolls
UDF provides a stronger guarantee for generating the Undefined Instruction exception than the current value being emitted. Change-Id: I234cd70ce04f21311959c1061ae24992438105f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/605155 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-07-24 | cmd/internal/obj/arm64: support MSR DIT | Roland Shoemaker
Set the right instruction bits in asmout in order to allow using MSR with DIT and an immediate value. This allows us to avoid using an intermediary register when we want to set DIT (unsetting DIT already worked with the zero register). Ref: https://developer.arm.com/documentation/ddi0602/2024-06/Base-Instructions/MSR--immediate---Move-immediate-value-to-special-register-?lang=en Change-Id: Id049a0b4e0feb534cea992553228f9b5e12ddcea Reviewed-on: https://go-review.googlesource.com/c/go/+/597595 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-15 | cmd/compile, cmd/internal: fine-grained fiddling with loop alignment | David Chase
This appears to be useful only on amd64, and was specifically benchmarked on Apple Silicon and did not produce any benefit there. This CL adds the assembly instruction `PCALIGNMAX align,amount` which aligns to `align` if that can be achieved with `amount` or fewer bytes of padding. (0 means never, but will align the enclosing function.) Specifically, padding is emitted if low-order-address-bits + amount is greater than or equal to align; thus, `PCALIGNMAX 64,63` is the same as `PCALIGN 64` and `PCALIGNMAX 64,0` will never emit any alignment, but will still cause the function itself to be aligned to (at least) 64 bytes. Change-Id: Id51a056f1672f8095e8f755e01f72836c9686aa3 Reviewed-on: https://go-review.googlesource.com/c/go/+/577935 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
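The `PCALIGNMAX align,amount` rule can be expressed as a short padding calculation. The sketch below is one reading of the message, assuming `align` is a power of two and that an already aligned address needs no padding; `alignPadding` is a hypothetical name, not the assembler's function.

```go
package main

import "fmt"

// alignPadding returns how many padding bytes PCALIGNMAX align,amount
// would emit at address pc under the rule from the commit message:
// pad only when low-order-address-bits + amount >= align.
// Assumes align is a power of two. Hypothetical helper.
func alignPadding(pc, align, amount int64) int64 {
	low := pc & (align - 1) // low-order address bits
	if low == 0 {
		return 0 // already aligned
	}
	if low+amount >= align {
		return align - low // needs at most amount bytes
	}
	return 0 // would need more than amount bytes, so skip
}

func main() {
	fmt.Println(alignPadding(0x1030, 64, 16)) // 16
	fmt.Println(alignPadding(0x1030, 64, 15)) // 0
	fmt.Println(alignPadding(0x1001, 64, 63)) // 63, same as PCALIGN 64
	fmt.Println(alignPadding(0x1001, 64, 0))  // 0, never pads
}
```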
2023-12-08cmd/internal/obj/arm64: fix invalid register pair for LDPeric fang
The ZR register can be used in the register pair of LDP, LDPW and LDPSW instructions, but this is currently not allowed. This CL fixes this issue. Change-Id: I8467502de4664214e0b7dad0295c44f6cff16ee6 Reviewed-on: https://go-review.googlesource.com/c/go/+/547815 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Eric Fang <eric.fang@arm.com>
2023-10-18cmd/internal/obj/arm64: replace the migrated url addresscui fliter
Change-Id: I36a0f0989d37bef45ea8778da799b56a7e9a0c30 Reviewed-on: https://go-review.googlesource.com/c/go/+/529515 Run-TryBot: shuang cui <imcusg@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
2023-08-28cmd/internal/obj/arm64: avoid unnecessary pool literal usage for load/store pairsJoel Sing
Implement better classification for load and store pair operations. This in turn allows us to avoid using pool literals when the offset fits in a 24 bit unsigned immediate. In this case, the offset can be calculated using two add immediate instructions, rather than loading the offset from the pool literal and then adding the offset to the base register. This requires the same number of instructions, but avoids a load from memory and does not require the offset to be stored in the literal pool. Updates #59615 Change-Id: I316ec3d54f1d06ae9d930e98d0c32471775fcb26 Reviewed-on: https://go-review.googlesource.com/c/go/+/515615 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
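To illustrate why two add immediate instructions suffice for a 24 bit unsigned offset, here is a minimal sketch. It relies on the fact that an ARM64 ADD immediate takes a 12 bit value, optionally shifted left by 12. The helper name `splitImm24` is hypothetical and is not the assembler's own function.

```go
package main

import "fmt"

// splitImm24 is a hypothetical helper (an assumption for illustration).
// It splits a 24 bit unsigned offset into a high part (12 bits, shifted left
// by 12) and a low part (12 bits). Each part fits the immediate field of one
// ARM64 ADD instruction, so base+off needs two ADDs and no literal pool load.
func splitImm24(off uint32) (hi, lo uint32, ok bool) {
	if off > 0xffffff {
		return 0, 0, false // does not fit in 24 bits
	}
	return off & 0xfff000, off & 0xfff, true
}

func main() {
	hi, lo, ok := splitImm24(0x123456)
	fmt.Printf("hi=%#x lo=%#x ok=%v\n", hi, lo, ok)
}
```

Adding `hi` and then `lo` to the base register yields the full address, which a load or store pair can then use with a zero offset.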
2023-08-25cmd/internal/obj/arm64: load large constants into vector registers from rodataJoel Sing
Load large constants into vector registers from rodata, instead of placing them in the literal pool. This treats VMOVQ/VMOVD/VMOVS the same as FMOVD/FMOVS and makes use of the existing mechanism for storing values in rodata. Two additional instructions are required for a load, however these instructions are used infrequently and already have a high latency. Updates #59615 Change-Id: I54226730267689963d73321e548733ae2d66740e Reviewed-on: https://go-review.googlesource.com/c/go/+/515617 Reviewed-by: Eric Fang <eric.fang@arm.com> Reviewed-by: Carlos Amedee <carlos@golang.org> Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-08-03cmd/internal/obj/arm64: move register encoding into opxrrrJoel Sing
Rather than having register encoding knowledge in each caller of opxrrr, pass the registers into opxrrr and let it handle the encoding. This reduces duplication and improves readability. Change-Id: I202c503465a0169277a0f64340598203c9dcf20c Reviewed-on: https://go-review.googlesource.com/c/go/+/461140 Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-07-31cmd/internal/obj/arm64: improve splitting of 24 bit unsigned scaled immediatesJoel Sing
The previous implementation would limit itself to 0xfff000 | 0xfff << shift, while the maximum possible value is 0xfff000 + 0xfff << shift. In practical terms, this means that an additional ((1 << shift) - 1) * 0x1000 of offset is reachable for operations that use this splitting format. In the case of an 8 byte load/store, this is an additional 0x7000 that can be reached without needing to use the literal pool. Updates #59615 Change-Id: Ice7023104042d31c115eafb9398c2b999bdd6583 Reviewed-on: https://go-review.googlesource.com/c/go/+/512540 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Joel Sing <joel@sing.id.au>
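The arithmetic in this message can be checked with a short sketch. The function `splitScaled` below is a hypothetical illustration of splitting an offset into a shifted 12 bit high part and a scaled 12 bit low part; its name, signature, and error messages are assumptions, not the code from the CL.

```go
package main

import (
	"errors"
	"fmt"
)

// splitScaled is a hypothetical sketch. It splits v into hi (a 12 bit value
// shifted left by 12) and lo (a 12 bit value scaled by 1<<shift) such that
// hi + lo<<shift == v.
func splitScaled(v int64, shift uint) (hi, lo int64, err error) {
	limit := int64(0xfff000) + int64(0xfff)<<shift
	if v < 0 || v > limit {
		return 0, 0, errors.New("offset out of range")
	}
	if v&(int64(1)<<shift-1) != 0 {
		return 0, 0, errors.New("offset is not a multiple of the access size")
	}
	lo = (v >> shift) & 0xfff
	hi = v - lo<<shift
	if hi > 0xfff000 {
		// Clamp hi to its maximum and let lo absorb the remainder.
		hi = 0xfff000
		lo = (v - hi) >> shift
	}
	return hi, lo, nil
}

func main() {
	const shift = 3 // 8 byte load/store
	oldLimit := int64(0xfff000) | int64(0xfff)<<shift
	newLimit := int64(0xfff000) + int64(0xfff)<<shift
	fmt.Printf("old=%#x new=%#x extra=%#x\n", oldLimit, newLimit, newLimit-oldLimit)

	hi, lo, err := splitScaled(newLimit, shift)
	fmt.Printf("hi=%#x lo=%#x err=%v\n", hi, lo, err)
}
```

For `shift = 3` the difference between the two limits is `((1 << 3) - 1) * 0x1000`, the additional 0x7000 mentioned in the message.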
2023-07-31cmd/internal/obj/arm64: avoid unnecessary literal pool usage for movesJoel Sing
In a number of load and store cases, the use of the literal pool can be entirely avoided by simply adding or subtracting the offset from the register. This uses the same number of instructions, while avoiding a load from memory, along with the need for the value to be in the literal pool. Overall this reduces the size of binaries slightly and should have lower overhead. Updates #59615 Change-Id: I9cb6a403dc71e34a46af913f5db87dbf52f8688c Reviewed-on: https://go-review.googlesource.com/c/go/+/512539 Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Joel Sing <joel@sing.id.au>
2023-07-31cmd/internal/obj/arm64: improve classification of loads and storesJoel Sing
Currently, pool literals are added when they are not needed, namely in the case where the offset is a 24 bit unsigned scaled immediate. By improving the classification of loads and stores, we can avoid generating unused pool literals. However, more importantly this provides a basis for further improvement of the load and store code generation. Updates #59615 Change-Id: Ia3bad1709314565a05894a76c434cca2fa4533c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/512538 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gopher Robot <gobot@golang.org>