| Age | Commit message (Collapse) | Author |
|
When rva22u64 is available, we can now use the native rotation instructions
from the Zbb extension. Use these instead of synthesising rotation
instructions.
This provides a significant performance gain for SHA-512, the following
benchmarked on a StarFive VisionFive 2:
│ sha512.rva20u64 │ sha512.rva22u64 │
│ B/s │ B/s vs base │
Hash8Bytes/New-4 859.4Ki ± 0% 1337.9Ki ± 0% +55.68% (p=0.000 n=10)
Hash8Bytes/Sum384-4 888.7Ki ± 1% 1308.6Ki ± 1% +47.25% (p=0.000 n=10)
Hash8Bytes/Sum512-4 869.1Ki ± 0% 1269.5Ki ± 1% +46.07% (p=0.000 n=10)
Hash1K/New-4 19.83Mi ± 0% 29.03Mi ± 0% +46.38% (p=0.000 n=10)
Hash1K/Sum384-4 20.00Mi ± 0% 28.86Mi ± 0% +44.30% (p=0.000 n=10)
Hash1K/Sum512-4 19.93Mi ± 0% 28.72Mi ± 0% +44.11% (p=0.000 n=10)
Hash8K/New-4 23.85Mi ± 0% 34.12Mi ± 0% +43.09% (p=0.000 n=10)
Hash8K/Sum384-4 23.88Mi ± 0% 34.09Mi ± 0% +42.77% (p=0.000 n=10)
Hash8K/Sum512-4 23.87Mi ± 0% 34.07Mi ± 0% +42.71% (p=0.000 n=10)
geomean 7.399Mi 10.78Mi +45.77%
Change-Id: I9dca8e3f311eea101684c806cb998872dc697288
Reviewed-on: https://go-review.googlesource.com/c/go/+/572716
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
Follow up CL 560155
Change-Id: Id9230d79c296452f3741123c75b45c3d3b1be4f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/574295
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Than McIntosh <thanm@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #65355
Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
When GORISCV64 enables rva22u64, use SEXTB for MOVB, SEXTH for MOVH, ZEXTH
for MOVHU and ADDUW for MOVWU. These are single instruction alternatives
to the two instruction shift sequences that are needed otherwise.
Change-Id: Iea5e394f57e238ae8771400a87287c1ee507d44c
Reviewed-on: https://go-review.googlesource.com/c/go/+/572736
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
|
|
for ARM64 targets that support LSE
Remove dynamic checks for atomic instructions for ARM64 targets that support LSE extension.
For #66131
Change-Id: I0ec1b183a3f4ea4c8a537430646e6bc4b4f64271
Reviewed-on: https://go-review.googlesource.com/c/go/+/569536
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Fannie Zhang <Fannie.Zhang@arm.com>
Reviewed-by: Shu-Chun Weng <scw@google.com>
|
|
RtlGetNtVersionNumbers
The RtlGetNtVersionNumbers function is not documented by Microsoft.
Use RtlGetVersion instead, which is documented and available on all
supported versions of Windows.
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest,gotip-windows-arm64
Change-Id: Ibaf0e2c28e673951476c5d863a829fd166705aea
Reviewed-on: https://go-review.googlesource.com/c/go/+/571015
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Add assembler support for Zba, Zbb, Zbs extensions, which are
mandatory in the rva22u64 profile. These can be used to accelerate
address computation and bit manipulation.
Change-Id: Ie90fe6b76b1382cf69984a0e71a72d3cba0e750a
Reviewed-on: https://go-review.googlesource.com/c/go/+/559655
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
The runtime.elf_* symbols are assembly functions which are used
to support the gcc/llvm -Os option when used with cgo.
When compiling Go for shared code, we attempt to strip out the
TOC regenation code added by the go assembler for these symbols.
This causes the symbol to no longer appear as an assembly
function which causes problems later on when handling other
implicit symbols.
Avoid adding a TOC regeneration prologue to these functions
to avoid this issue.
Fixes #66265
Change-Id: Icbf8e4438d177082a57bb228e39b232e7a0d7ada
Reviewed-on: https://go-review.googlesource.com/c/go/+/571835
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
For #51572
Change-Id: I23bb25b8cf1ecb9be25eb6ab9e89cd397b58b3c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/572535
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Change-Id: I6b30ac3e9d15c29197426fb16dc4031056f6bb10
GitHub-Last-Rev: e2dda286f26587726870a5779d6caa0c5abd6750
GitHub-Pull-Request: golang/go#66331
Reviewed-on: https://go-review.googlesource.com/c/go/+/571915
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The purpose of this package is to have a build tagged variant so that
when we're building the bootstrap go command it does not depend on the
net package. (net is a dependency of golang.org/x/telemetry/counter on
Windows).
The TESTGO_TELEMETRY_DIR environment variable used by the go tests to
change the telemetry directory is renamed to TEST_TELEMETRY_DIR to
make it more general to other commands that might want to set it for
the purpose of tests. The test telemetry directory is now set using
telemetry.Start instead of countertest.Open. This also means that the
logic that decides whether to upload counter files is now going to run
from the cmd/go tests (but that's okay because it's aleady been
running when cmd/go has been invoked outside of its tests.
Change-Id: Ic4272e5083facde010482d8b8fc3c95c03564bc9
Reviewed-on: https://go-review.googlesource.com/c/go/+/571096
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
adds on ppc64x
For GOPPC64 < 10 targets, most large 32 bit constants (those
exceeding int16 capacity) can be added using two instructions
instead of 3.
This cannot be done for values greater than 0x7FFF7FFF, so this
must be done during asm preprocessing as the optab matching
rules cannot differentiate this special case.
Likewise, constants 0x8000 <= x < 0x10000 are not converted. The
assembler currently generates 2 instructions sequences for these
constants.
Change-Id: I1ccc839c6c28fc32f15d286b2e52e2d22a2a06d4
Reviewed-on: https://go-review.googlesource.com/c/go/+/568116
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Provide and use rotation pseudo-instructions for riscv64. The RISC-V bitmanip
extension adds support for hardware rotation instructions in the form of ROL,
ROLW, ROR, RORI, RORIW and RORW. These are easily implemented in the assembler
as pseudo-instructions for CPUs that do not support the bitmanip extension.
This approach provides a number of advantages, including reducing the rewrite
rules needed in the compiler, simplifying codegen tests and most importantly,
allowing these instructions to be used in assembly (for example, riscv64
optimised versions of SHA-256 and SHA-512). When bitmanip support is added,
these instruction sequences can simply be replaced with a single instruction
if permitted by the GORISCV64 profile.
Change-Id: Ia23402e1a82f211ac760690deb063386056ae1fa
Reviewed-on: https://go-review.googlesource.com/c/go/+/565015
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
|
|
Adds a new custom attribute to the DIE of captured variables,
containing the offset for the variable inside the closure struct. This
can be used by debuggers to display the contents of a closure variable.
Based on a sample program (delve) this increases the executable size by 0.06%.
benchstat output:
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Template 153.0m ± 6% 152.5m ± 14% ~ (p=0.684 n=10)
Unicode 100.2m ± 15% 104.9m ± 7% ~ (p=0.247 n=10)
GoTypes 943.6m ± 8% 986.2m ± 10% ~ (p=0.280 n=10)
Compiler 97.79m ± 6% 101.63m ± 12% ~ (p=0.393 n=10)
SSA 6.872 ± 37% 9.413 ± 106% ~ (p=0.190 n=10)
Flate 128.0m ± 36% 125.0m ± 56% ~ (p=0.481 n=10)
GoParser 214.9m ± 26% 201.4m ± 68% ~ (p=0.579 n=10)
Reflect 452.6m ± 22% 412.2m ± 74% ~ (p=0.739 n=10)
Tar 166.2m ± 27% 155.9m ± 73% ~ (p=0.393 n=10)
XML 219.3m ± 24% 211.3m ± 76% ~ (p=0.739 n=10)
LinkCompiler 523.2m ± 13% 513.5m ± 47% ~ (p=0.631 n=10)
ExternalLinkCompiler 1.684 ± 2% 1.659 ± 25% ~ (p=0.218 n=10)
LinkWithoutDebugCompiler 304.9m ± 12% 309.1m ± 7% ~ (p=0.631 n=10)
StdCmd 70.76 ± 14% 68.66 ± 53% ~ (p=1.000 n=10)
geomean 511.5m 515.4m +0.77%
│ old.txt │ new.txt │
│ user-sec/op │ user-sec/op vs base │
Template 269.6m ± 13% 292.3m ± 17% ~ (p=0.393 n=10)
Unicode 110.2m ± 8% 101.7m ± 18% ~ (p=0.247 n=10)
GoTypes 2.181 ± 9% 2.356 ± 12% ~ (p=0.280 n=10)
Compiler 119.1m ± 11% 121.9m ± 15% ~ (p=0.481 n=10)
SSA 17.75 ± 52% 26.94 ± 123% ~ (p=0.190 n=10)
Flate 256.2m ± 43% 226.8m ± 73% ~ (p=0.739 n=10)
GoParser 427.0m ± 24% 422.3m ± 72% ~ (p=0.529 n=10)
Reflect 990.5m ± 23% 905.5m ± 75% ~ (p=0.912 n=10)
Tar 307.9m ± 27% 308.9m ± 64% ~ (p=0.393 n=10)
XML 432.8m ± 24% 427.6m ± 89% ~ (p=0.796 n=10)
LinkCompiler 796.9m ± 14% 800.4m ± 56% ~ (p=0.481 n=10)
ExternalLinkCompiler 1.666 ± 4% 1.671 ± 28% ~ (p=0.971 n=10)
LinkWithoutDebugCompiler 316.7m ± 12% 325.6m ± 8% ~ (p=0.579 n=10)
geomean 579.5m 594.0m +2.51%
│ old.txt │ new.txt │
│ text-bytes │ text-bytes vs base │
HelloSize 842.9Ki ± 0% 842.9Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 10.95Mi ± 0% 10.95Mi ± 0% ~ (p=1.000 n=10) ¹
geomean 3.003Mi 3.003Mi +0.00%
¹ all samples are equal
│ old.txt │ new.txt │
│ data-bytes │ data-bytes vs base │
HelloSize 15.08Ki ± 0% 15.08Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 314.7Ki ± 0% 314.7Ki ± 0% ~ (p=1.000 n=10) ¹
geomean 68.88Ki 68.88Ki +0.00%
¹ all samples are equal
│ old.txt │ new.txt │
│ bss-bytes │ bss-bytes vs base │
HelloSize 396.8Ki ± 0% 396.8Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 428.8Ki ± 0% 428.8Ki ± 0% ~ (p=1.000 n=10) ¹
geomean 412.5Ki 412.5Ki +0.00%
¹ all samples are equal
│ old.txt │ new.txt │
│ exe-bytes │ exe-bytes vs base │
HelloSize 1.310Mi ± 0% 1.310Mi ± 0% +0.02% (p=0.000 n=10)
CmdGoSize 16.37Mi ± 0% 16.38Mi ± 0% +0.01% (p=0.000 n=10)
geomean 4.631Mi 4.632Mi +0.02%
Change-Id: Ib416ee2d916ec61ad4a5c26bab09597595f57e04
Reviewed-on: https://go-review.googlesource.com/c/go/+/563816
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Change-Id: I271f25aefaace61935d55a1b6b7c026d022d92a7
GitHub-Last-Rev: 304e3ee979f4fde58184e7035cd5d0d6b50bca74
GitHub-Pull-Request: golang/go#66023
Reviewed-on: https://go-review.googlesource.com/c/go/+/567918
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Similar with what we are doing for -goexperiment.
For #65778
Change-Id: I7dda69512a3ffb491e3de31941ae1c3d34fececf
Reviewed-on: https://go-review.googlesource.com/c/go/+/568156
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
|
|
values
These binary operations can be done in two sequential instructions instead of loading a
constant into REGTMP and doing the binary op.
Change-Id: Ie0ab863f9e81afad140b92b265bca4d3f0fe90b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/565215
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
|
|
Change-Id: I2990701e71a63af7bdd6851b6008dc63cb1c1a83
Reviewed-on: https://go-review.googlesource.com/c/go/+/535616
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently, changing putvar or putAbstractVar involves:
1. changing the abbrevs array to add new abbrevs or modify existing ones
2. changing the DW_ABRV_XXX const block to add the new abbrevs, this const
block must match the changes to the abbrevs array
3. change the code at the start of putvar and putAbstractVar that selects
the abbrev to use
4. change the body of putvar/putAbstractVar to emit the right attributes in
the right sequence
Each change must agree with all other, this is error prone and if an mistake
is made there is no compile time or runtime check detecting it. Erroneous
code will simply produce unreadable debug sections.
This commit adds a mechanism to automatically generate code for abbrev
selection as well as the abbrev definitions based on static examination of
the body of putvar and putAbstractVar.
TestPutVarAbbrevGenerator is responsible for checking that the generated
code is kept updated and will regenerated it by passing the '-generate'
option to it.
benchstat output:
| old.txt | new.txt |
| sec/op | sec/op vs base |
Template 153.8m ± 6% 153.0m ± 6% ~ (p=0.853 n=10)
Unicode 98.98m ± 19% 100.22m ± 15% ~ (p=0.796 n=10)
GoTypes 1013.7m ± 13% 943.6m ± 8% ~ (p=0.353 n=10)
Compiler 98.48m ± 10% 97.79m ± 6% ~ (p=0.353 n=10)
SSA 8.921 ± 31% 6.872 ± 37% ~ (p=0.912 n=10)
Flate 114.3m ± 21% 128.0m ± 36% ~ (p=0.436 n=10)
GoParser 219.0m ± 27% 214.9m ± 26% ~ (p=0.631 n=10)
Reflect 447.5m ± 20% 452.6m ± 22% ~ (p=0.684 n=10)
Tar 166.9m ± 27% 166.2m ± 27% ~ (p=0.529 n=10)
XML 218.6m ± 25% 219.3m ± 24% ~ (p=0.631 n=10)
LinkCompiler 492.7m ± 12% 523.2m ± 13% ~ (p=0.315 n=10)
ExternalLinkCompiler 1.684 ± 3% 1.684 ± 2% ~ (p=0.684 n=10)
LinkWithoutDebugCompiler 296.0m ± 8% 304.9m ± 12% ~ (p=0.579 n=10)
StdCmd 69.59 ± 15% 70.76 ± 14% ~ (p=0.436 n=10)
geomean 516.0m 511.5m -0.87%
| old.txt | new.txt |
| user-sec/op | user-sec/op vs base |
Template 281.5m ± 10% 269.6m ± 13% ~ (p=0.315 n=10)
Unicode 107.3m ± 8% 110.2m ± 8% ~ (p=0.165 n=10)
GoTypes 2.414 ± 16% 2.181 ± 9% ~ (p=0.315 n=10)
Compiler 116.0m ± 16% 119.1m ± 11% ~ (p=0.971 n=10)
SSA 25.47 ± 39% 17.75 ± 52% ~ (p=0.739 n=10)
Flate 205.2m ± 25% 256.2m ± 43% ~ (p=0.393 n=10)
GoParser 456.8m ± 28% 427.0m ± 24% ~ (p=0.912 n=10)
Reflect 960.3m ± 22% 990.5m ± 23% ~ (p=0.280 n=10)
Tar 299.8m ± 27% 307.9m ± 27% ~ (p=0.631 n=10)
XML 425.0m ± 21% 432.8m ± 24% ~ (p=0.353 n=10)
LinkCompiler 768.1m ± 11% 796.9m ± 14% ~ (p=0.631 n=10)
ExternalLinkCompiler 1.713 ± 5% 1.666 ± 4% ~ (p=0.190 n=10)
LinkWithoutDebugCompiler 313.0m ± 9% 316.7m ± 12% ~ (p=0.481 n=10)
geomean 588.6m 579.5m -1.55%
| old.txt | new.txt |
| text-bytes | text-bytes vs base |
HelloSize 842.9Ki ± 0% 842.9Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 10.95Mi ± 0% 10.95Mi ± 0% ~ (p=1.000 n=10) ¹
geomean 3.003Mi 3.003Mi +0.00%
¹ all samples are equal
| old.txt | new.txt |
| data-bytes | data-bytes vs base |
HelloSize 15.08Ki ± 0% 15.08Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 314.7Ki ± 0% 314.7Ki ± 0% ~ (p=1.000 n=10) ¹
geomean 68.88Ki 68.88Ki +0.00%
¹ all samples are equal
| old.txt | new.txt |
| bss-bytes | bss-bytes vs base |
HelloSize 396.8Ki ± 0% 396.8Ki ± 0% ~ (p=1.000 n=10) ¹
CmdGoSize 428.8Ki ± 0% 428.8Ki ± 0% ~ (p=1.000 n=10) ¹
geomean 412.5Ki 412.5Ki +0.00%
¹ all samples are equal
| old.txt | new.txt |
| exe-bytes | exe-bytes vs base |
HelloSize 1.310Mi ± 0% 1.310Mi ± 0% -0.01% (p=0.000 n=10)
CmdGoSize 16.37Mi ± 0% 16.37Mi ± 0% -0.00% (p=0.000 n=10)
geomean 4.631Mi 4.631Mi -0.00%
Change-Id: I7edf37b5a47fd9aceef931ddf2c701e66a7b38b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/563815
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #62047
Change-Id: If7811c1eb9073fb09b7006076998f8b2e1810bfb
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/539975
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
For #65355
Change-Id: I5fefe30dcb520159de565e61dafc74a740fc8730
Reviewed-on: https://go-review.googlesource.com/c/go/+/559715
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This CL adds rounding modes for riscv64 floating point conversion
instructions by suffix with 5 modes: RNE, RTZ, RDN, RUP and RMM.
For example, for round to nearest (RNE), we can use `FCVTLD.RNE`
According to RISCV manual 8.7 and 9.5, we changed these
conversion instructions:
FCVTWS
FCVTLS
FCVTWUS
FCVTLUS
FCVTWD
FCVTLD
FCVTWUD
FCVTLUD
Note: Round towards zero (RTZ) by default for all these instructions above.
Change-Id: I491e522e14d721e24aa7f528ee0c4640c54c5808
Reviewed-on: https://go-review.googlesource.com/c/go/+/504736
Reviewed-by: Joel Sing <joel@sing.id.au>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Combine MOVW/MOVD using C_16CON as they accept any 16 bit
constant.
Remove MULLW/MULLD C_U16CON optab entry. These assemble to
the mulli opcode which only accepts a signed 16 bit constant.
Remove superfluous optab entrys for VSPLTB and VSPLTISB,
as C_S16CON accepts C_U15CON arguments.
Change-Id: Ie20dd07bcedda428fb1dd674474d7dfa67d76dc1
Reviewed-on: https://go-review.googlesource.com/c/go/+/563915
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
|
|
Rename C_LCON, C_SCON, C_ADDCON, C_ANDCON into their aliased names
and remove them.
Change-Id: I8f67cc973f8059e65b81669d91a44500fc136b0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/563097
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Make C_S32CON, C_U32CON, and C_32CON distinct classifiers to allow
more specific matching of 32 bit constants. C_U31CON is added to
support C_S32CON.
Likewise, add C_16CON which is the union of C_S16CON and C_U16CON
classification. This wil allow simplifying MOVD/MOVW optab entries
in a future patch.
Change-Id: I193acc0ded8f3edd91d306e39c3e7e55a9811e04
Reviewed-on: https://go-review.googlesource.com/c/go/+/562346
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The assembler treats C_SBRA and C_LBRA optab classes identically,
combine them into one class to reduce the number of optab classes.
Likewise, C_LBRAPIC is renamed to C_BRAPIC for consistency with
the above change.
Change-Id: I47000e7273cb8f89a4d0621d71433ccbfb7afb70
Reviewed-on: https://go-review.googlesource.com/c/go/+/557916
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
uintptr
Adds the DW_AT_go_runtime_type attribute to the debug_info entry for
unsafe.Pointer (which is special) and fixes the debug_info entry of
uintptr so that its DW_AT_go_runtime_type attribute has the proper
class (it was accidentally using DW_CLS_ADDRESS instead of
DW_CLS_GO_TYPEREF)
Change-Id: I52e18593935fbda9bc425e849f4c7f50e9144ad4
Reviewed-on: https://go-review.googlesource.com/c/go/+/558275
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
with a dot or underscore
Fixes #54221
Change-Id: Id16f553251daf0b7c51f45232a4133f7dfb1ebb9
GitHub-Last-Rev: 675c2bfcbb4fd31da0442dd0e612874934cc0d87
GitHub-Pull-Request: golang/go#65298
Reviewed-on: https://go-review.googlesource.com/c/go/+/558696
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
|
|
The variable represents the RISC-V user-mode application profile for
which to compile. Valid values are rva20u64 (the default) and
rva22u64.
Setting GORISCV64=rva20u64 defines the riscv64.rva20u64 build tag,
sets the internal variable buildcfg.GORISCV64 to 20 and defines the
macro GORISCV64_rva20u64 for use in assembly language code.
Setting GORISCV64=rva22u64 defines the riscv64.rva20u64 and
riscv64.rva22u64 build tags, sets the internal variable
buildcfg.GORISCV64 to 22 and defines the macro GORISCV64_rva22u64
for use in assembly language code.
This patch only provides a mechanism for the compiler and hand-coded
assembly language functions to take advantage of the RISC-V
extensions mandated by the application profiles. Further patches
will be required to get the compiler/assembler and assembly language
functions to actually generate and use these extensions.
Fixes #61476
Change-Id: I9195ae6ee71703cd2112160e89157ab63b8391af
Reviewed-on: https://go-review.googlesource.com/c/go/+/541135
Reviewed-by: M Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Wang Yaduo <wangyaduo@linux.alibaba.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: M Zhuo <mengzhuo1203@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Right shift by 0 has bad semantics. Make sure if we try to right shift by 0,
do a left shift by 0 instead.
CL 549955 handled full instructions with this strange no-op encoding.
This CL handles the shift done to instruction register inputs.
(The former is implemented using the latter, but not until deep
inside the assembler.)
Update #64715
Change-Id: Ibfabb4b13e2595551e58b977162fe005aaaa0ad1
Reviewed-on: https://go-review.googlesource.com/c/go/+/550335
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Right shifts, for some odd reasons, can encode shifts of constant
1-32 instead of 0-31. Left shifts, however, can encode shifts 0-31.
When the shift amount is 0, arm recommends encoding right shifts
using left shifts.
Fixes #64715
Change-Id: Id3825349aa7195028037893dfe01fa0e405eaa51
Reviewed-on: https://go-review.googlesource.com/c/go/+/549955
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
ZR register can be used in register pair of LDP, LDPW and LDPSW
instructions, but now it's not allowed. This CL fixes this issue.
Change-Id: I8467502de4664214e0b7dad0295c44f6cff16ee6
Reviewed-on: https://go-review.googlesource.com/c/go/+/547815
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
|
|
The exported API is only available with GOEXPERIMENT=rangefunc.
This will let Go 1.22 users who want to experiment with rangefuncs
access an efficient implementation of iter.Pull and iter.Pull2.
For #61897.
Change-Id: I6ef5fa8f117567efe4029b7b8b0f4d9b85697fb7
Reviewed-on: https://go-review.googlesource.com/c/go/+/543319
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Fixes #64375
Change-Id: I24ce67ef254db447cdf37a3fda5b5ab5fc782a36
GitHub-Last-Rev: 05590b9e20b31413d455a6e87bc38843e33ff116
GitHub-Pull-Request: golang/go#64376
Reviewed-on: https://go-review.googlesource.com/c/go/+/544757
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
KMCTR encoding arguments incorrect way, which leading illegal instruction wherver we call KMCTR instruction.IBM z13 machine test's TestAESGCM test using gcmASM implementation, which uses KMCTR instruction to encrypt using AES in counter mode and the KIMD instruction for GHASH. z14+ machines onwards uses gcmKMA implementation for the same.
Fixes #63387
Change-Id: I86aeb99573c3f636a71908c99e06a9530655aa5d
Reviewed-on: https://go-review.googlesource.com/c/go/+/535675
Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Add support for PCALIGN directive on riscv.
This directive can be used within Go asm to align instruction
by padding NOP directives.
This patch also adds a test to verify the correctness of the PCALIGN
directive.
Original credit by Cooper Qu (Alibaba)
https://gitee.com/xuantie_riscv/xuantie-patch
Change-Id: I8b6524a2bf81a1baf7c9d04b7da2db6c1a7b428f
Reviewed-on: https://go-review.googlesource.com/c/go/+/541740
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Wang Yaduo <wangyaduo@linux.alibaba.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change refactors the cmd/trace package and adds most of the support
for v2 traces.
The following features of note are missing in this CL and will be
implemented in follow-up CLs:
- The focustask filter for the trace viewer
- The taskid filter for the trace viewer
- The goid filter for the trace viewer
- Pprof profiles
- The MMU graph
- The goroutine analysis pages
- The task analysis pages
- The region analysis pages
This CL makes one notable change to the trace CLI: it makes the -d flag
accept an integer to set the debug mode. For old traces -d != 0 works
just like -d. For new traces -d=1 means the high-level events and -d=2
means the low-level events.
Thanks to Felix Geisendörfer (felix.geisendoerfer@datadoghq.com) for
doing a lot of work on this CL; I picked this up from him and got a
massive headstart as a result.
For #60773.
For #63960.
Change-Id: I3626e22473227c5980134a85f1bb6a845f567b1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/542218
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
|
|
Update #40724
Co-authored-by: Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Change-Id: Ie92da57e29bae0e5cccb2a49a7cbeaf02cbf3a8d
Reviewed-on: https://go-review.googlesource.com/c/go/+/521787
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: David Chase <drchase@google.com>
|
|
regABI arguments
Update #40724
Co-authored-by: Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Change-Id: Ic7e2e7fb4c1d3670e6abbfb817aa6e4e654e08d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/521777
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
|
|
loong64
Updates #58784
Change-Id: Ic98d10a512fea0c3ca321ab52693d9f6775126a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/480875
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: WANG Xuerui <git@xen0n.name>
Reviewed-by: WANG Xuerui <git@xen0n.name>
|
|
device for loong64
Add R21 to the allocatable registers, use R20 and R21 in duff
device. This CL is in preparation for subsequent regABI support.
Updates #40724
Co-authored-by: Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Change-Id: If1661adc0f766925fbe74827a369797f95fa28a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/521775
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This change introduces new options to set the floating point
mode on ARM targets. The GOARM version number can optionally be
followed by ',hardfloat' or ',softfloat' to select whether to
use hardware instructions or software emulation for floating
point computations, respectively. For example,
GOARM=7,softfloat.
Previously, software floating point support was limited to
GOARM=5. With these options, software floating point is now
extended to all ARM versions, including GOARM=6 and 7. This
change also extends hardware floating point to GOARM=5.
GOARM=5 defaults to softfloat and GOARM=6 and 7 default to
hardfloat.
For #61588
Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94
Reviewed-on: https://go-review.googlesource.com/c/go/+/514907
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: I179b50ae8e73677d4d408b83424afbbfe6aa17a1
GitHub-Last-Rev: 2e2d9c1e45556155d02db4df381b99f2d1bc5c0e
GitHub-Pull-Request: golang/go#63478
Reviewed-on: https://go-review.googlesource.com/c/go/+/534015
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
getimpliedreg was used to set a default register in cases where
one was implied but not set by the assembler or compiler.
In most cases with constant values, R0 is implied, and is the value
0 by architectural design. In those cases, R0 is always used, so
treat 0 and REG_R0 as interchangeable in those encodings.
Similarly, the pseudo-register SP or FP is used to in place of the
stack pointer, always R1 on PPC64. Unconditionally set this during
classification of NAME_AUTO and NAME_PARAM as it may be 0.
The case where REGSB might be returned from getimpliedreg is never
used. REGSB is aliased to R2, but in practice it is either R0 or R2
depending on buildmode. See symbolAccess in asm9.go for an example.
Change-Id: I7283e66d5351f56a7fe04cee38714910eaa73cb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/434775
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This halves the size of the xcmp lookup table.
Change-Id: I543fb72709ca45c026e9b7d8084a78f2a8fcd43e
Reviewed-on: https://go-review.googlesource.com/c/go/+/542295
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Jayanth Krishnamurthy <jayanth.krishnamurthy@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
When this happens, panic.
This is a revised version of a check that used #next,
where this one instead uses a per-loop #exit flag,
and catches more problematic iterators.
Updates #56413.
Updates #61405.
Change-Id: I6574f754e475bb67b9236b4f6c25979089f9b629
Reviewed-on: https://go-review.googlesource.com/c/go/+/540263
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This optab matching rule was used to match signed 16 bit values shifted
left by 16 bits. Unsigned 16 bit values greater than 0x7FFF<<16 were
classified as C_U32CON which led to larger than necessary codegen.
Instead, rewrite logical/arithmetic operations in the preprocessor pass
to use the 16 bit shifted immediate operation (e.g ADDIS vs ADD). This
simplifies the optab matching rules, while also minimizing codegen size
for large unsigned values.
Note, ADDIS sign-extends the constant argument, all others do not.
For matching opcodes, this means:
MOVD $is<<16,Rx becomes ADDIS $is,Rx or ORIS $is,Rx
MOVW $is<<16,Rx becomes ADDIS $is,Rx
ADD $is<<16,[Rx,]Ry becomes ADDIS $is[Rx,]Ry
OR $is<<16,[Rx,]Ry becomes ORIS $is[Rx,]Ry
XOR $is<<16,[Rx,]Ry becomes XORIS $is[Rx,]Ry
Change-Id: I1a988d9f52517a04bb8dc2e41d7caf3d5fff867c
Reviewed-on: https://go-review.googlesource.com/c/go/+/536735
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
Currently, instruction validation failure will result in a panic during
encoding. Furthermore, the errors generated do not include the PC or
file/line information that is normally present.
Fix this by:
- Tracking and printing the *obj.Prog associated with the instruction,
including the assembly instruction/opcode if it differs. This provides
the standard PC and file/line prefix, which is also expected by assembly
error end-to-end tests.
- Not proceeding with assembly if errors exist - with the current design,
errors are identified during validation, which is run via preprocess.
Attempts to encode invalid instructions will intentionally panic.
Add some additional riscv64 encoding errors, now that we can actually do so.
Change-Id: I64a7b83680c4d12aebdc96c67f9df625b5ef90d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/523459
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: M Zhuo <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: M Zhuo <mzh@golangcn.org>
|
|
For leaf but nonzero-frame functions.
Currently we're not restoring it properly. We also need to restore
it before popping the stack frame, so that the frame won't get
clobbered by a signal handler in the meantime.
Fixes #63830
Needs a test, but I'm not at all sure how we would actually do that. Leaving for inspiration.
Change-Id: I273a25f2a838f05a959c810145cccc5428eaf164
Reviewed-on: https://go-review.googlesource.com/c/go/+/538635
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Eric Fang <eric.fang@arm.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
|