| Age | Commit message (Collapse) | Author |
|
A recent change to improve shifts was generating some
invalid cases when the rule was based on an AND. The
extended mnemonics CLRLSLDI and CLRLSLWI only allow
certain values for the operands and in the mask case
those values were not being checked properly. This
adds a check to those rules to verify that the
'b' and 'n' values used when an AND was part of the rule
have correct values.
There was a bug in some diag messages in asm9. The
message expected 3 values but only provided 2. Those are
corrected here also.
The test/codegen/shift.go was updated to add a few more
cases to check for the case mentioned here.
Some of the comments that mention the order of operands
in these extended mnemonics were wrong and those have been
corrected.
Fixes #41683.
Change-Id: If5bb860acaa5051b9e0cd80784b2868b85898c31
Reviewed-on: https://go-review.googlesource.com/c/go/+/258138
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
This adds support for the extswsli instruction which combines
extsw followed by a shift.
New benchmark demonstrates the improvement:
name old time/op new time/op delta
ExtShift 1.34µs ± 0% 1.30µs ± 0% -3.15% (p=0.057 n=4+3)
Change-Id: I21b410676fdf15d20e0cbbaa75d7c6dcd3bbb7b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/257017
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
The CL 249758 added `FMOVQ $vcon, Vd` instruction and assembler used
128-bit simd literal-loading to load `$vcon` from pool into 128-bit vector
register `Vd`. Because Go does not have 128-bit integers for now, the
assembler will report an error of `immediate out of range` when
assembleing `FMOVQ $0x123456789abcdef0123456789abcdef, V0` instruction.
This patch lets 128-bit integers take two 64-bit operands, for the high
and low parts separately and adds `VMOVQ $hi, $lo, Vd` instruction to
move `$hi<<64+$lo' into 128-bit register `Vd`.
In addition, this patch renames `FMOVQ/FMOVD/FMOVS` ops to 'VMOVQ/VMOVD/VMOVS'
and uses them to move 128-bit, 64-bit and 32-bit constants into vector
registers, respectively
Update the go doc.
Fixes #40725
Change-Id: Ia3c83bb6463f104d2bee960905053a97299e0a3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/255900
Trust: fannie zhang <Fannie.Zhang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change things so that the -S command line option for the assembler
works the same as -S in the compiler, e.g. you can use -S=2 to
get additional detail.
Change-Id: I7bdfba39a98e67c7ae4b93019e171b188bb99a2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/255717
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
This change adds rules to find pairs of instructions that can
be combined into a single shifts. These instruction sequences
are common in array addressing within loops. Improvements can
be seen in many crypto packages and the hash packages.
These are based on the extended mnemonics found in the ISA
sections C.8.1 and C.8.2.
Some rules in PPC64.rules were moved because the ordering prevented
some matching.
The following results were generated on power9.
hash/crc32:
CRC32/poly=Koopman/size=40/align=0 195ns ± 0% 163ns ± 0% -16.41%
CRC32/poly=Koopman/size=40/align=1 200ns ± 0% 163ns ± 0% -18.50%
CRC32/poly=Koopman/size=512/align=0 1.98µs ± 0% 1.67µs ± 0% -15.46%
CRC32/poly=Koopman/size=512/align=1 1.98µs ± 0% 1.69µs ± 0% -14.80%
CRC32/poly=Koopman/size=1kB/align=0 3.90µs ± 0% 3.31µs ± 0% -15.27%
CRC32/poly=Koopman/size=1kB/align=1 3.85µs ± 0% 3.31µs ± 0% -14.15%
CRC32/poly=Koopman/size=4kB/align=0 15.3µs ± 0% 13.1µs ± 0% -14.22%
CRC32/poly=Koopman/size=4kB/align=1 15.4µs ± 0% 13.1µs ± 0% -14.79%
CRC32/poly=Koopman/size=32kB/align=0 137µs ± 0% 105µs ± 0% -23.56%
CRC32/poly=Koopman/size=32kB/align=1 137µs ± 0% 105µs ± 0% -23.53%
crypto/rc4:
RC4_128 733ns ± 0% 650ns ± 0% -11.32% (p=1.000 n=1+1)
RC4_1K 5.80µs ± 0% 5.17µs ± 0% -10.89% (p=1.000 n=1+1)
RC4_8K 45.7µs ± 0% 40.8µs ± 0% -10.73% (p=1.000 n=1+1)
crypto/sha1:
Hash8Bytes 635ns ± 0% 613ns ± 0% -3.46% (p=1.000 n=1+1)
Hash320Bytes 2.30µs ± 0% 2.18µs ± 0% -5.38% (p=1.000 n=1+1)
Hash1K 5.88µs ± 0% 5.38µs ± 0% -8.62% (p=1.000 n=1+1)
Hash8K 42.0µs ± 0% 37.9µs ± 0% -9.75% (p=1.000 n=1+1)
There are other improvements found in golang.org/x/crypto which are all in the
range of 5-15%.
Change-Id: I193471fbcf674151ffe2edab212799d9b08dfb8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/252097
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
|
|
stack address
Currently, when the offset of "MOVD $offset(Rn), Rd" is a large positive
constant or a negative constant, the assembler will load this offset from
the constant pool.This patch gets rid of the constant pool by encoding the
offset into two ADD instructions if it's a large positive constant or one
SUB instruction if negative. For very large negative offset, it is rarely
used, here we don't optimize this case.
Optimized case 1: MOVD $-0x100000(R7), R0
Before: LDR 0x67670(constant pool), R27; ADD R27.UXTX, R0, R7
After: SUB $0x100000, R7, R0
Optimized case 2: MOVD $0x123468(R7), R0
Before: LDR 0x67670(constant pool), R27; ADD R27.UXTX, R0, R7
After: ADD $0x123000, R7, R27; ADD $0x000468, R27, R0
1. Binary size before/after.
binary size change
pkg/linux_arm64 +4KB
pkg/tool/linux_arm64 no change
go no change
gofmt no change
2. go1 benckmark.
name old time/op new time/op delta
pkg:test/bench/go1 goos:linux goarch:arm64
BinaryTree17-64 7335721401.800000ns +-40% 6264542009.800000ns +-14% ~ (p=0.421 n=5+5)
Fannkuch11-64 3886551822.600000ns +- 0% 3875870590.200000ns +- 0% ~ (p=0.151 n=5+5)
FmtFprintfEmpty-64 82.960000ns +- 1% 83.900000ns +- 2% +1.13% (p=0.048 n=5+5)
FmtFprintfString-64 149.200000ns +- 1% 148.000000ns +- 0% -0.80% (p=0.016 n=5+4)
FmtFprintfInt-64 177.000000ns +- 0% 178.400000ns +- 2% ~ (p=0.794 n=4+5)
FmtFprintfIntInt-64 240.200000ns +- 2% 239.400000ns +- 4% ~ (p=0.302 n=5+5)
FmtFprintfPrefixedInt-64 300.400000ns +- 0% 299.200000ns +- 1% ~ (p=0.119 n=5+5)
FmtFprintfFloat-64 360.000000ns +- 0% 361.600000ns +- 3% ~ (p=0.349 n=4+5)
FmtManyArgs-64 1064.400000ns +- 1% 1061.400000ns +- 0% ~ (p=0.087 n=5+5)
GobDecode-64 12080404.400000ns +- 2% 11637601.000000ns +- 1% -3.67% (p=0.008 n=5+5)
GobEncode-64 8474973.800000ns +- 2% 7977801.600000ns +- 2% -5.87% (p=0.008 n=5+5)
Gzip-64 416501238.400000ns +- 0% 410463405.400000ns +- 0% -1.45% (p=0.008 n=5+5)
Gunzip-64 58088415.200000ns +- 0% 58826209.600000ns +- 0% +1.27% (p=0.008 n=5+5)
HTTPClientServer-64 128660.200000ns +-23% 117840.800000ns +- 8% ~ (p=0.222 n=5+5)
JSONEncode-64 17547746.800000ns +- 4% 17216180.000000ns +- 1% ~ (p=0.222 n=5+5)
JSONDecode-64 80879896.000000ns +- 1% 80063737.200000ns +- 0% -1.01% (p=0.008 n=5+5)
Mandelbrot200-64 5484901.600000ns +- 0% 5483614.400000ns +- 0% ~ (p=0.310 n=5+5)
GoParse-64 6201166.800000ns +- 6% 6150920.600000ns +- 1% ~ (p=0.548 n=5+5)
RegexpMatchEasy0_32-64 135.000000ns +- 0% 139.200000ns +- 7% ~ (p=0.643 n=5+5)
RegexpMatchEasy0_1K-64 484.600000ns +- 2% 483.800000ns +- 2% ~ (p=0.984 n=5+5)
RegexpMatchEasy1_32-64 128.000000ns +- 1% 124.600000ns +- 1% -2.66% (p=0.008 n=5+5)
RegexpMatchEasy1_1K-64 769.400000ns +- 2% 761.400000ns +- 1% ~ (p=0.460 n=5+5)
RegexpMatchMedium_32-64 12.900000ns +- 0% 12.500000ns +- 0% -3.10% (p=0.008 n=5+5)
RegexpMatchMedium_1K-64 57879.200000ns +- 1% 56512.200000ns +- 0% -2.36% (p=0.008 n=5+5)
RegexpMatchHard_32-64 3091.600000ns +- 1% 3071.000000ns +- 0% -0.67% (p=0.048 n=5+5)
RegexpMatchHard_1K-64 92941.200000ns +- 1% 92794.000000ns +- 0% ~ (p=1.000 n=5+5)
Revcomp-64 1695605187.000000ns +-54% 1821697637.400000ns +-47% ~ (p=1.000 n=5+5)
Template-64 112839686.800000ns +- 1% 109964069.200000ns +- 3% ~ (p=0.095 n=5+5)
TimeParse-64 587.000000ns +- 0% 587.000000ns +- 0% ~ (all equal)
TimeFormat-64 586.000000ns +- 1% 584.200000ns +- 1% ~ (p=0.659 n=5+5)
[Geo mean] 81804.262218ns 80694.712973ns -1.36%
name old speed new speed delta
pkg:test/bench/go1 goos:linux goarch:arm64
GobDecode-64 63.6MB/s +- 2% 66.0MB/s +- 1% +3.78% (p=0.008 n=5+5)
GobEncode-64 90.6MB/s +- 2% 96.2MB/s +- 2% +6.23% (p=0.008 n=5+5)
Gzip-64 46.6MB/s +- 0% 47.3MB/s +- 0% +1.47% (p=0.008 n=5+5)
Gunzip-64 334MB/s +- 0% 330MB/s +- 0% -1.25% (p=0.008 n=5+5)
JSONEncode-64 111MB/s +- 4% 113MB/s +- 1% ~ (p=0.222 n=5+5)
JSONDecode-64 24.0MB/s +- 1% 24.2MB/s +- 0% +1.02% (p=0.008 n=5+5)
GoParse-64 9.35MB/s +- 6% 9.42MB/s +- 1% ~ (p=0.571 n=5+5)
RegexpMatchEasy0_32-64 237MB/s +- 0% 231MB/s +- 7% ~ (p=0.690 n=5+5)
RegexpMatchEasy0_1K-64 2.11GB/s +- 2% 2.12GB/s +- 2% ~ (p=1.000 n=5+5)
RegexpMatchEasy1_32-64 250MB/s +- 1% 257MB/s +- 1% +2.63% (p=0.008 n=5+5)
RegexpMatchEasy1_1K-64 1.33GB/s +- 2% 1.35GB/s +- 1% ~ (p=0.548 n=5+5)
RegexpMatchMedium_32-64 77.6MB/s +- 0% 79.8MB/s +- 0% +2.80% (p=0.008 n=5+5)
RegexpMatchMedium_1K-64 17.7MB/s +- 1% 18.1MB/s +- 0% +2.41% (p=0.008 n=5+5)
RegexpMatchHard_32-64 10.4MB/s +- 1% 10.4MB/s +- 0% ~ (p=0.056 n=5+5)
RegexpMatchHard_1K-64 11.0MB/s +- 1% 11.0MB/s +- 0% ~ (p=0.984 n=5+5)
Revcomp-64 188MB/s +-71% 155MB/s +-71% ~ (p=1.000 n=5+5)
Template-64 17.2MB/s +- 1% 17.7MB/s +- 3% ~ (p=0.095 n=5+5)
[Geo mean] 79.2MB/s 79.3MB/s +0.24%
Change-Id: I593ac3e7037afafc3605ad4b0cfb51d5dd88015d
Reviewed-on: https://go-review.googlesource.com/c/go/+/232438
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This CL adds USHLL, USHLL2, UZP1, UZP2, and BIF instructions requested
by #40725. And since UXTL* are aliases of USHLL*, this CL also merges
them into one case.
Updates #40725
Change-Id: I404a4fdaf953319f72eea548175bec1097a2a816
Reviewed-on: https://go-review.googlesource.com/c/go/+/253659
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Enable VBSL, VBIT, VCMTST, VUXTL VUXTL2 and FMOVQ SIMD
instructions required by the issue #40725. And FMOVQ
instrucion is used to move a large constant to a Vn
register.
Add test cases.
Fixes #40725
Change-Id: I1cac1922a0a0165d698a4b73a41f7a5f0a0ad549
Reviewed-on: https://go-review.googlesource.com/c/go/+/249758
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
instructions of arm64
The post-index offset of VLD[1-4]R instructions is decided by the
"size" field not "Q" field, the current assembler uses "Q" fileld
to check the correctness of post-index offset which is not correct.
This patch fixes it.
Fixes #40725
Change-Id: If1cde7f21c6b3ee0e491649eb567700bd1475c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/249757
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Fixing several error message and comment texts of the ARM64 assembler
to use arrangement specifiers of Go's assembly style.
Change-Id: Icdbb14fba7aaede40d57d0d754795b050366a1ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/237859
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
These are the indexed vsx load operations with the
same endian and alignment benefits of {l,st}vx.
Likewise, cleanup redundant comments in op{load,store}x and
fix ISA 3.0 typos nearby.
Change-Id: Ie1ace17c6150cf9168a834e435114028ff6eb07c
Reviewed-on: https://go-review.googlesource.com/c/go/+/249025
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
ARMv8.2-SHA add SHA512 intructions:
1. SHA512H Vm.D2, Vn, Vd
2. SHA512H2 Vm.D2, Vn, Vd
3. SHA512SU0 Vn.D2, Vd.D2
4. SHA512SU1 Vm.D2, Vn.D2, Vd.D2
ARMv8 Architecture Reference Manual C7.2.234-C7.2.234
Change-Id: Ie970fef1bba5312ad466f246035da4c40a1bbb39
Reviewed-on: https://go-review.googlesource.com/c/go/+/180057
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
For content-addressable symbols with relocations, we build a
content hash based on its content and relocations. Depending on
the category of the referenced symbol, we choose different hash
algorithms such that the hash is globally consistent.
For now, we only support content-addressable symbols with
relocations when the current package's import path is known, so
that the symbol names are fully expanded. Otherwise, if the
referenced symbol is a named symbol whose name is not fully
expanded, the hash won't be globally consistent, and can cause
erroneous collisions. This is fine for now, as the deduplication
is just an optimization, not a requirement for correctness (until
we get to type descriptors).
Change-Id: I639e4e03dd749b5d71f0a55c2525926575b1ac30
Reviewed-on: https://go-review.googlesource.com/c/go/+/243142
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
|
|
Change-Id: I446db56b20ef2189e23e225a91a17736c1d11e4c
|
|
When the arrangement specifier is "B16", the 30-bit should be 1 rather than 0.
This CL fixes this error.
Fixes #39445
Change-Id: Ib44881cdb8b3aab855cb30f2c52a085cd73a6a2c
Reviewed-on: https://go-review.googlesource.com/c/go/+/236638
Run-TryBot: eric fang <eric.fang@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Merge conflicts are mostly recently changed nm/objdump output
format and its tests. Resolved easily (mostly just using the
format on master branch).
Change-Id: I99d8410a9a02947ecf027d9cae5762861562baf5
|
|
Most of the docs are in the new wiki page
https://golang.org/wiki/Spectre.
Updates #37419.
Change-Id: I6e8f76670593c089de895e1665b41d874f879df9
Reviewed-on: https://go-review.googlesource.com/c/go/+/236599
Reviewed-by: Austin Clements <austin@google.com>
|
|
Now we have ctxt.IsAsm, use that, instead of passing in a
parameter.
Change-Id: I81dedbe6459424fa9a4c2bfbd9abd83d83f3a107
Reviewed-on: https://go-review.googlesource.com/c/go/+/234492
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
If the package path is known, pass it to the object file writer
so the symbol names are pre-expanded. (We already expand the
package path in debug info.)
Change-Id: I2b2b71edbb98924cbf3c4f9142b7e109e5b7501a
Reviewed-on: https://go-review.googlesource.com/c/go/+/234491
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Change-Id: Ia30d70096e740d012e4d9e070bbc4347805527a7
|
|
The BITCON test, isbitcon, assumes 32-bit constants are expanded
repeatedly, i.e. by copying the low 32 bits to high 32 bits,
instead of zero extending. We already do such expansion in
progedit. In con32class when classifying 32-bit constants, we
should use the expanded constant, instead of zero-extending it.
TODO: we could have better encoding for things like ANDW $-1, Rx.
Fixes #38946.
Change-Id: I37d0c95d744834419db5c897fd1f6c187595c926
Reviewed-on: https://go-review.googlesource.com/c/go/+/232984
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
We are not going to merge to master until Go 1.16 cycle. The old
object support can go now.
Change-Id: I93e6f584974c7749d0a0c2e7a96def35134dc566
Reviewed-on: https://go-review.googlesource.com/c/go/+/231918
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
This commit adds a new option to the x86 assembler. If the
GOAMD64 environment variable is set to alignedjumps (the
default) and we're doing a 64 bit build, the assembler will
make sure that neither stand alone nor macro-fused jumps will
end on or cross 32 byte boundaries. To achieve this, functions
are aligned on 32 byte boundaries, rather than 16 bytes, and
jump instructions are padded to ensure that they do not
cross or end on 32 byte boundaries. Jumps are padded
by adding a NOP instruction of the appropriate length before
the jump.
The commit is likely to result in larger binary sizes when
GOAMD64=alignedjumps. On the binaries tested so far, an
increase of between 1.4% and 1.5% has been observed.
Updates #35881
Co-authored-by: David Chase <drchase@google.com>
Change-Id: Ief0722300bc3f987098e4fd92b22b14ad6281d91
Reviewed-on: https://go-review.googlesource.com/c/go/+/219357
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Adds a few instructions to ppc64enc.s that were missing from the
previous update.
Change-Id: Ieafce39e905cdf4da3bfb00fdd5a39ab28089cb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/230437
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This updates the PPC64.rules file to use the MOD instructions
that are available in power9. Prior to power9 this is done
using a longer sequence with multiply and divide.
Included in this change is removal of the REM* opcode variations
that set the CC or OV bits since their settings are based
on the DIV and are not appropriate for the REM.
Change-Id: Iceed9ce33e128e1911c15592ee674276ce8ba3fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/229761
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This allows more exciting changes to compiler-generated assembly
language that might not be correct for tricky hand-crafted
assembly (e.g., nop padding breaking tables of call or branch
instructions).
Updates #35881
Change-Id: I842b811796076c160180a364564f2844604df3fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/229708
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This change adds some instructions that were missing from the
ppc64 assembler, mostly power9 but a few others from earlier.
Tests in cmd/asm for ppc64 were updated: ppc64.s includes the
new instructions, and ppc64enc.s now includes not only the
new instructions but most ppc64 opcodes to provide a more
complete test of the ppc64 assembler.
The ppc64 instruction set is used for linux/ppc64le,
linux/ppc64, and aix/ppc64.
Change-Id: I8695f89dbca06174847963f4ef869f2e584d5bbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/229479
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Originally on s390x, ADDE does not work when adding numbers from a memory location.
For example: ADDE (R3), R4 will result in a failure.
Since ADDC, ADD and ADDW already supports adding from memory location,
let's support that for ADDE as well.
Change-Id: I7cbe112ea154733a621b948c6a21bbee63fb0c62
Reviewed-on: https://go-review.googlesource.com/c/go/+/229304
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
On s390x, we already have MVCIN opcode in asmz.go,
but we did not use it. This CL uses that opcode and adds MVCIN
instruction.
MVCIN instruction can be used to move data from one storage location
to another while reversing the order of bytes within the field. This
could be useful when transforming data from little-endian to big-endian.
Change-Id: Ifa1a911c0d3442f4a62f91f74ed25b196d01636b
Reviewed-on: https://go-review.googlesource.com/c/go/+/227478
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The only conflict is a modify-deletion conflict in
cmd/link/internal/ld/link.go, where the old error reporter is
deleted in the new linker. Ported to
cmd/link/internal/ld/errors.go.
Change-Id: I5c78f398ea95bc1d7e6579c84dd8252c9f2196b7
|
|
Implement various branch pseudo-instructions for riscv64. These make it easier
to read/write assembly and will also make it easier for the compiler to generate
optimised code.
Change-Id: Ic31a7748c0e1495522ebecf34b440842b8d12c04
Reviewed-on: https://go-review.googlesource.com/c/go/+/226397
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The only merge conflict is the addition of -spectre flag on
master and the addition of -go115newobj flag on dev.link.
Resolved trivially.
Change-Id: I5b46c2b25e140d6c3d8cb129acbd7a248ff03bb9
|
|
Add back the newobj flag, renamed to go115newobj, for feature
gating. The flag defaults to true.
This essentially reverts CL 206398 as well as CL 220060.
The old object format isn't working yet. Will fix in followup CLs.
Change-Id: I1ace2a9cbb1a322d2266972670d27bda4e24adbc
Reviewed-on: https://go-review.googlesource.com/c/go/+/224623
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Provide NEG/NEGW pseudo-instructions, which translate to SUB/SUBW with the
zero register as a source.
Change-Id: I2c1ec1e75611c234c5ee8e39390dd188f8e42bae
Reviewed-on: https://go-review.googlesource.com/c/go/+/221689
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Add a NOT pseudo-instruction that translates to XORI $-1.
Change-Id: I2be4cfe2939e988cd7f8d30260b704701d78475f
Reviewed-on: https://go-review.googlesource.com/c/go/+/221688
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Add support for floating-point classification instructions.
Change-Id: I64463d477b3db0cca16ff7bced64f154011ef4cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/220542
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Use instructions in place of currently used defines.
Updates #36765
Change-Id: I00bb59e77b1aace549d7857cc9721ba2cb4ac6ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/220541
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Add support for Load-Reserved (LR) and Store-Conditional (SC) instructions.
Use instructions in place of currently used defines.
Updates #36765
Change-Id: I77e660639802293ece40cfde4865ac237e3308d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/220540
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Also remove #define's that were previously in use.
Updates #36765
Change-Id: I90b6a8629c78f549012f3f6c5f3b325336182712
Reviewed-on: https://go-review.googlesource.com/c/go/+/220539
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Rework instruction generation so that multiple instructions are generated
from a single obj.Prog, rather than the current approach where obj.Progs
are rewritten. This allows the original obj.Prog to remain intact, before
being converted into an architecture specific instruction form.
This simplifies the code and removes a level of indirection that results
from trying to manipulate obj.Prog.To/obj.Prog.From into forms that match
the instruction encoding. Furthermore, the errors reported make more sense
since it matches up with the actual assembly that was parsed.
Note that the CALL/JMP/JALR type sequences have not yet been migrated to
this framework and will likely be converted at a later time.
Updates #27532
Change-Id: I9fd12562ed1db0a08cfdc32793897d2a1920ebaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/211917
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This commit extends the -spectre flag to cmd/asm and adds
a new Spectre mitigation mode "ret", which enables the use
of retpolines.
Retpolines prevent speculation about the target of an indirect
jump or call and are described in more detail here:
https://support.google.com/faqs/answer/7625886
Change-Id: I4f2cb982fa94e44d91e49bd98974fd125619c93a
Reviewed-on: https://go-review.googlesource.com/c/go/+/222661
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Clean merge.
Change-Id: I2ae070c60c011779a0f0a1344f5b6d45ef10d8a1
|
|
This does some clean up of the ppc64 opcodes to remove names
from the opcode list that don't actually assemble. At one time
names were added to this list to represent opcode "classes" to
organize other opcodes that have the same set of operand
combinations. Since this is not documented, it is confusing as
to which opcodes can be used in an asm file and which can't, and
which opcodes should be supported in the disassembler. It is
clearer for the user if the list of Go opcodes are all opcodes
that can be assembled with names that match the ppc64 opcode
where possible.
I found this when trying to use Go opcode XXLAND in an asm file
which seems like it should map to ppc64 xxland but when used it
gets this error:
go tool asm test_xxland.s
asm: bad r/r, r/r/r or r/r/r/r opcode XXLAND
asm: assembly failed
This change removes the opcodes that are only used for opcode
"classes" and fixes the case statement where they are referenced.
This also fixes XXLAND and XXPERM which are opcodes that should
assemble to their corresponding ppc64 opcode but do not.
Change-Id: I52300db6b22f7f8b3dd3491c3f35a384b943352c
Reviewed-on: https://go-review.googlesource.com/c/go/+/223138
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Adds a new "-p" option to the assembler, for specifying the import
path of the package being compiled. DWARF generation is now conditional
on having a valid package path -- if we don't know the package path,
then don't emit DWARF.
This is intended to lay the groundwork for removing the various
"patchDWARFname" hacks in the linker.
Change-Id: I5f8315c0881791eb8fe1f2ba32f5bb0ae76f6b98
Reviewed-on: https://go-review.googlesource.com/c/go/+/222718
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
|
|
Clean merge.
Change-Id: Ib1c3217641f3f09260133b92f406b63f14a0c51e
|
|
This CL adding primitive asm support of MIPS MSA by introducing
new sets of register W0-W31 (C_WREG) and 12 new instructions:
* VMOV{B,H,W,D} ADDCONST, WREG (Vector load immediate)
* VMOV{B,H,W,D} SOREG, WREG (Vector load)
* VMOV{B,H,W,D} WREG, SOREG (Vector store)
Ref: MIPS Architecture for Programmers Volume IV-j: The MIPS64 SIMD Architecture Module
Change-Id: I3362c59a73c82c94769c18a19a0bee7e5029217d
Reviewed-on: https://go-review.googlesource.com/c/go/+/215723
Run-TryBot: Meng Zhuo <mengzhuo1203@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Clean merge.
Change-Id: I8098f15f249f7a1c054c708d22dac41e5951cddb
|
|
Add support to the asimd instruction rev16 which reverses elements in
16-bit halfwords.
syntax:
VREV16 <Vn>.<T>, <Vd>.<T>
<T> should be either B8 or B16.
Change-Id: I7a7b8e772589c51ca9eb6dca98bab1aac863c6c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/213738
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
It has been a while we have not done this.
Merge conflict resolution:
- deleted/rewritten code modified on master
- CL 214286, ported in CL 217317
(cmd/internal/obj/objfile.go)
- CL 210678, it already includes a fix to new code
(cmd/link/internal/ld/deadcode.go)
- CL 209317, applied in this CL
(cmd/link/internal/loadelf/ldelf.go)
Change-Id: Ie927ea6a1d69ce49e8d03e56148cb2725e377876
|
|
On RISCV64, the U-instructions (AUIPC and LUI) take 20 bits, append 12 bits
of zeros and sign extend to 64-bits. As such, the 20 bit immediate value is
signed not unsigned.
Updates #27532
Change-Id: I725215a1dc500106dbfdc0a4425f3c0b2a6f411e
Reviewed-on: https://go-review.googlesource.com/c/go/+/216257
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|