| Age | Commit message (Collapse) | Author |
|
Change-Id: I9d88c8eb91106de412a9abc6601cdda06537d818
Reviewed-on: https://go-review.googlesource.com/c/go/+/461747
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
We have been expanding our use of GODEBUG for compatibility,
and the current implementation forces a tradeoff between
freshness and efficiency. It parses the environment variable
in full each time it is called, which is expensive. But if clients
cache the result, they won't respond to run-time GODEBUG
changes, as happened with x509sha1 (#56436).
This CL changes the GODEBUG API to provide efficient,
up-to-date results. Instead of a single Get function,
New returns a *godebug.Setting that itself has a Get method.
Clients can save the result of New, which is no more expensive
than errors.New, in a global variable, and then call that
variable's Get method to get the value. Get costs only two
atomic loads in the case where the variable hasn't changed
since the last call.
Unfortunately, these changes do require importing sync
from godebug, which will mean that sync itself will never
be able to use a GODEBUG setting. That doesn't seem like
such a hardship. If it was really necessary, the runtime could
pass a setting to package sync itself at startup, with the
caveat that that setting, like the ones used by runtime itself,
would not respond to run-time GODEBUG changes.
Change-Id: I99a3acfa24fb2a692610af26a5d14bbc62c966ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/449504
Run-TryBot: Russ Cox <rsc@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Rename "Zeus" to "NeoverseV1" for the partnum 0xd40 to be
consistent with the documentation of MIDR_EL1 as described in
https://developer.arm.com/documentation/101427/0101/?lang=en
Change-Id: I2e3d5ec76b953a831cb4ab0438bc1c403648644b
Reviewed-on: https://go-review.googlesource.com/c/go/+/414775
Reviewed-by: Jonathan Swinney <jswinney@amazon.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Eric Fang <eric.fang@arm.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Deduplicate code for parsing system registers - this matches what is done
in golang.org/x/sys/cpu.
Change-Id: If3524eb2e361179c68678f8214230d7068fe4c60
Reviewed-on: https://go-review.googlesource.com/c/go/+/422217
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Change-Id: I1f21654b50d7b0cd8e1f854efe2724b72f067449
Reviewed-on: https://go-review.googlesource.com/c/go/+/422216
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Yuval Pavel Zholkover <paulzhol@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
addresses proposal #53084
required by sha-256 change list developed for #50543
Change-Id: I5454d746fce069a7a4993d70dc5b0a5544f8eeaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/408794
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@google.com>
|
|
The existing value for M1 is 64, which is the same as other arm64 cpus.
But the correct cacheLineSize for M1 should be 128, which can be
verified using the following command:
$ sysctl -a hw | grep cachelinesize
hw.cachelinesize: 128
Fixes #53075
Change-Id: Iaa8330010a4499b9b357c70743d55aed6ddb8588
GitHub-Last-Rev: df87eb9c503c6bc5220a92ef1bc4c4c89ef4658d
GitHub-Pull-Request: golang/go#53076
Reviewed-on: https://go-review.googlesource.com/c/go/+/408576
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Keith Randall <khr@google.com>
|
|
The new M1 cpu (Apple) comes with sha512 hardware
acceleration feature.
Change-Id: I823d1e9b09b472bd21571eee75cc5314cd66b1ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/408836
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
OpenBSD 7.1 onwards expose the aarch64 ISAR0 and ISAR1 registers via sysctl:
$ sysctl machdep
machdep.compatible=apple,j274
machdep.id_aa64isar0=153421459058925856
machdep.id_aa64isar1=1172796674562
Implement CPU feature detection for openbsd/arm64 based on this information.
Fixes #31746
Change-Id: If8a9b2b8fc557e1aaefbcb52f4d1bd9efc43856d
Reviewed-on: https://go-review.googlesource.com/c/go/+/421875
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Run-TryBot: Ian Lance Taylor <iant@google.com>
|
|
Change-Id: Id6a27d0b3f3fc4181a00569bacc578e72b04ce09
GitHub-Last-Rev: 85c063d1a2d62181d16044592a60acf970fe3c86
GitHub-Pull-Request: golang/go#53359
Reviewed-on: https://go-review.googlesource.com/c/go/+/411916
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Peter Zhang <binbin36520@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
|
|
We choose 64 because the L1 Dcache of Loongson 3A5000 CPU is
4-way 256-line 64-byte-per-line.
Change-Id: Ifb9a9f993dd6f75b5adb4ff6e4d93e945b1b2a98
Reviewed-on: https://go-review.googlesource.com/c/go/+/408854
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Alex Rakoczy <alex@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Contributors to the loong64 port are:
Weining Lu <luweining@loongson.cn>
Lei Wang <wanglei@loongson.cn>
Lingqin Gong <gonglingqin@loongson.cn>
Xiaolin Zhao <zhaoxiaolin@loongson.cn>
Meidan Li <limeidan@loongson.cn>
Xiaojuan Zhai <zhaixiaojuan@loongson.cn>
Qiyuan Pu <puqiyuan@loongson.cn>
Guoqi Chen <chenguoqi@loongson.cn>
This port has been updated to Go 1.15.6:
https://github.com/loongson/go
Updates #46229
Change-Id: I39d42e5959391e47bf621b3bdd3d95de72f023cc
Reviewed-on: https://go-review.googlesource.com/c/go/+/342318
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The PPC64 maintainers are testing on P10 hardware, so it is helpful
to report the correct cpu, even if this information is not used
elsewhere yet.
Note, AIX will report the current CPU of the host system, so a
POWER10 will not set the IsPOWER9 flag. This is existing behavior,
and should be fixed in a separate patch.
Change-Id: Iebe23dd96ebe03c8a1c70d1ed2dc1506bad3c330
Reviewed-on: https://go-review.googlesource.com/c/go/+/404394
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Fix up a test to insure that it does the right thing when
"go test -cover" is in effect.
Fixes #52761.
Change-Id: I0c141181e2dcaefd592fb04813f812f2800511da
Reviewed-on: https://go-review.googlesource.com/c/go/+/404715
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
That feature can't be disabled if the microarchitectural version
requires it.
Change-Id: Iad8aaa8089d2f023e9ae5044c6da33224499f09b
Reviewed-on: https://go-review.googlesource.com/c/go/+/392994
Run-TryBot: Keith Randall <khr@golang.org>
Trust: Keith Randall <khr@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Martin Möhrmann <martin@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
e.g., if GOAMD64=v3, don't allow GODEBUG=cpu.XXX=off for XXX which
are required for v3.
Change-Id: Ib58a4c8b13c5464ba476448ba44bbb261218787c
Reviewed-on: https://go-review.googlesource.com/c/go/+/391694
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <martin@golang.org>
|
|
This should always be true, but use the HWCAP2 bit anyways.
Change-Id: Ib164cf05b4c9f0c509f41b7eb339ef32fb63e384
Reviewed-on: https://go-review.googlesource.com/c/go/+/389894
Trust: Paul Murphy <murp@ibm.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
CL 344955 and CL 359476 removed almost all // +build lines, but leaving
some assembly files and generating scripts. Also, some files were added
with // +build lines after CL 359476 was merged. Remove these or rename
files where more appropriate.
For #41184
Change-Id: I7eb85a498ed9788b42a636e775f261d755504ffa
Reviewed-on: https://go-review.googlesource.com/c/go/+/361480
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
Change-Id: Ifdf67e778e88ee70780428aa5479d2e091752a3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/360605
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
When these packages are released as part of Go 1.18,
Go 1.16 will no longer be supported, so we can remove
the +build tags in these files.
Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs
as defined in src/cmd/dist/buildtool.go, which need to continue
to build with Go 1.4 for now.
Also reverted src/vendor and src/cmd/vendor, which will need
to be updated in their own repos first.
Manual changes in runtime/pprof/mprof_test.go to adjust line numbers.
For #41184.
Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e
Reviewed-on: https://go-review.googlesource.com/c/go/+/344955
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
With the removal of SSE2 runtime detection made in
golang.org/cl/344350 we can remove this mechanism as there
are no required features anymore.
For making sure CPUs running a go program support all
the minimal hardware requirements the go runtime should
do feature checks early in the runtime initialization
before it is likely any compiler emitted but unsupported
instructions are used. This is already the case for e.g.
checking MMX support on 386 arch targets.
Change-Id: If7b1cb6f43233841e917d37a18314d06a334a734
Reviewed-on: https://go-review.googlesource.com/c/go/+/354209
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
When GO386=sse2 we can assume sse2 to be present without
a runtime check. If GO386=softfloat is set we can avoid
the usage of SSE2 even if detected.
This might cause a memcpy, memclr and bytealg slowdown of Go
binaries compiled with softfloat on machines that support
SSE2. Such setups are rare and should use GO386=sse2 instead
if performance matters.
On targets that support SSE2 we avoid the runtime overhead of
dynamic cpu feature dispatch.
The removal of runtime sse2 checks also allows to simplify
internal/cpu further by removing handling of the required
feature option as a followup after this CL.
Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17
Reviewed-on: https://go-review.googlesource.com/c/go/+/344350
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
To measure all instructions having been completed before reading
the time stamp counter with RDTSC an instruction sequence that
has instruction stream serializing properties which guarantee
waiting until all previous instructions have been executed is
needed. This does not necessary mean to wait for all stores to
be globally visible.
This CL aims to remove vendor specific logic for determining the
instruction sequence with CPU feature flag checks that are
CPU vendor independent.
For intel LFENCE has the wanted properties at least
since it was introduced together with SSE2 support.
On AMD instruction stream serializing LFENCE is supported by setting
an MSR C001_1029[1]=1 on AMD family 10h/12h/14h/15h/16h/17h processors.
AMD family 0Fh/11h processors support LFENCE as serializing always.
AMD plans support for this MSR and access to this bit for all future processors.
Source: https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf
Reading the MSR to determine LFENCE properties is not always possible
or reliable (hypervisors). The Linux kernel is relying on serializing
LFENCE on AMD CPUs since a commit in July 2019: https://lkml.org/lkml/2019/7/22/295
and the MSR C001_1029 to enable serialization has been set by default
with the Spectre v1 mitigations.
Using an MFENCE on AMD is waiting on previous instructions having been executed
but in addition also flushes store buffers.
To align the serialization properties without runtime detection
of CPU manufacturers we can use the newer RDTSCP instruction which
waits until all previous instructions have been executed.
RDTSCP is available on Intel since around 2008 and on AMD CPUs since
around 2006. Support for RDTSCP can be checked independently
of manufacturer by checking CPUID bits.
Using RDTSCP is the default in Linux to read TSC in program order
when the instruction is available.
https://github.com/torvalds/linux/blob/e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93/arch/x86/include/asm/msr.h#L231
Change-Id: Ifa841843b9abb2816f8f0754a163ebf01385306d
Reviewed-on: https://go-review.googlesource.com/c/go/+/344429
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Martin Möhrmann <martin@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.
For #41184
Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)
Update the Go tree to use the preferred names.
As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.
ReadDir changes are in a separate CL, because they are not a
simple search and replace.
For #42026.
Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Fixes #42747
Change-Id: I6b1679348c77161f075f0678818bb003fc0e8c86
Reviewed-on: https://go-review.googlesource.com/c/go/+/271989
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
auxillary -> auxiliary
Change-Id: I7c29c4a63d236c3688b8e4f5af70650d43cd89c0
GitHub-Last-Rev: d4a18c71a15cf0803bd847225ed5bf898c52e0f3
GitHub-Pull-Request: golang/go#43024
Reviewed-on: https://go-review.googlesource.com/c/go/+/275592
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Keith Randall <khr@golang.org>
|
|
All instructions in the FMA extension on x86 are VEX prefixed.
VEX prefixed instructions generally require OSXSAVE to be enabled.
The execution of FMA instructions emitted by the Go compiler on amd64
will generate an invalid opcode exception if OSXSAVE is not enabled.
Fixes #41022
Change-Id: I49881630e7195c804110a2bd81b5bec8cac31ba8
Reviewed-on: https://go-review.googlesource.com/c/go/+/274479
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Remove all cpu features from the ARM64 struct that are not initialized
to reduce cache lines used and to avoid those features being
accidentially used without actual detection if they are present.
Add missing option to mask the CPUID feature.
Change-Id: I94bf90c0655de1af2218ac72117ac6c52adfc289
Reviewed-on: https://go-review.googlesource.com/c/go/+/267658
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Trust: Martin Möhrmann <moehrmann@google.com>
|
|
Replace the memmove implementation for moves of 17 bytes or larger
with an implementation from ARM optimized software. The moves of 16
bytes or fewer are unchanged, but the registers used are updated to
match the rest of the implementation.
This implementation makes use of new optimizations:
- software pipelined loop for large (>128 byte) moves
- medium size moves (17..128 bytes) have a new implementation
- address realignment when src or dst is unaligned
- preference for aligned src (loads) or dst (stores) depending on CPU
To support preference for aligned loads or aligned stores, a new CPU
flag is added. This flag indicates that the detected micro
architecture performs better with aligned loads. Some tested CPUs did
not exhibit a significant difference and are left with the default
behavior of realigning based on the destination address (stores).
Neoverse N1 (Tested on Graviton 2)
name old time/op new time/op delta
Memmove/0-4 1.88ns ± 1% 1.87ns ± 1% -0.58% (p=0.020 n=10+10)
Memmove/1-4 4.40ns ± 0% 4.40ns ± 0% ~ (all equal)
Memmove/8-4 3.88ns ± 3% 3.80ns ± 0% -1.97% (p=0.001 n=10+9)
Memmove/16-4 3.90ns ± 3% 3.80ns ± 0% -2.49% (p=0.000 n=10+9)
Memmove/32-4 4.80ns ± 0% 4.40ns ± 0% -8.33% (p=0.000 n=9+8)
Memmove/64-4 5.86ns ± 0% 5.00ns ± 0% -14.76% (p=0.000 n=8+8)
Memmove/128-4 8.46ns ± 0% 8.06ns ± 0% -4.62% (p=0.000 n=10+10)
Memmove/256-4 12.4ns ± 0% 12.2ns ± 0% -1.61% (p=0.000 n=10+10)
Memmove/512-4 19.5ns ± 0% 19.1ns ± 0% -2.05% (p=0.000 n=10+10)
Memmove/1024-4 33.7ns ± 0% 33.5ns ± 0% -0.59% (p=0.000 n=10+10)
Memmove/2048-4 62.1ns ± 0% 59.0ns ± 0% -4.99% (p=0.000 n=10+10)
Memmove/4096-4 117ns ± 1% 110ns ± 0% -5.66% (p=0.000 n=10+10)
MemmoveUnalignedDst/64-4 6.41ns ± 0% 5.62ns ± 0% -12.32% (p=0.000 n=10+7)
MemmoveUnalignedDst/128-4 9.40ns ± 0% 8.34ns ± 0% -11.24% (p=0.000 n=10+10)
MemmoveUnalignedDst/256-4 12.8ns ± 0% 12.8ns ± 0% ~ (all equal)
MemmoveUnalignedDst/512-4 20.4ns ± 0% 19.7ns ± 0% -3.43% (p=0.000 n=9+10)
MemmoveUnalignedDst/1024-4 34.1ns ± 0% 35.1ns ± 0% +2.93% (p=0.000 n=9+9)
MemmoveUnalignedDst/2048-4 61.5ns ± 0% 60.4ns ± 0% -1.77% (p=0.000 n=10+10)
MemmoveUnalignedDst/4096-4 122ns ± 0% 113ns ± 0% -7.38% (p=0.002 n=8+10)
MemmoveUnalignedSrc/64-4 7.25ns ± 1% 6.26ns ± 0% -13.64% (p=0.000 n=9+9)
MemmoveUnalignedSrc/128-4 10.5ns ± 0% 9.7ns ± 0% -7.52% (p=0.000 n=10+10)
MemmoveUnalignedSrc/256-4 17.1ns ± 0% 17.3ns ± 0% +1.17% (p=0.000 n=10+10)
MemmoveUnalignedSrc/512-4 27.0ns ± 0% 27.0ns ± 0% ~ (all equal)
MemmoveUnalignedSrc/1024-4 46.7ns ± 0% 35.7ns ± 0% -23.55% (p=0.000 n=10+9)
MemmoveUnalignedSrc/2048-4 85.2ns ± 0% 61.2ns ± 0% -28.17% (p=0.000 n=10+8)
MemmoveUnalignedSrc/4096-4 162ns ± 0% 113ns ± 0% -30.25% (p=0.000 n=10+10)
name old speed new speed delta
Memmove/4096-4 35.2GB/s ± 0% 37.1GB/s ± 0% +5.56% (p=0.000 n=10+9)
MemmoveUnalignedSrc/1024-4 21.9GB/s ± 0% 28.7GB/s ± 0% +30.90% (p=0.000 n=10+10)
MemmoveUnalignedSrc/2048-4 24.0GB/s ± 0% 33.5GB/s ± 0% +39.18% (p=0.000 n=10+9)
MemmoveUnalignedSrc/4096-4 25.3GB/s ± 0% 36.2GB/s ± 0% +43.50% (p=0.000 n=10+7)
Cortex-A72 (Graviton 1)
name old time/op new time/op delta
Memmove/0-4 3.06ns ± 3% 3.08ns ± 1% ~ (p=0.958 n=10+9)
Memmove/1-4 8.72ns ± 0% 7.85ns ± 0% -9.98% (p=0.002 n=8+10)
Memmove/8-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal)
Memmove/16-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal)
Memmove/32-4 8.19ns ± 2% 8.29ns ± 0% ~ (p=0.114 n=10+10)
Memmove/64-4 18.3ns ± 4% 10.0ns ± 0% -45.36% (p=0.000 n=10+10)
Memmove/128-4 14.8ns ± 0% 17.4ns ± 0% +17.77% (p=0.000 n=10+10)
Memmove/256-4 21.8ns ± 0% 23.1ns ± 0% +5.96% (p=0.000 n=10+10)
Memmove/512-4 35.8ns ± 0% 37.2ns ± 0% +3.91% (p=0.000 n=10+10)
Memmove/1024-4 63.7ns ± 0% 67.2ns ± 0% +5.49% (p=0.000 n=10+10)
Memmove/2048-4 126ns ± 0% 123ns ± 0% -2.38% (p=0.000 n=10+10)
Memmove/4096-4 238ns ± 1% 243ns ± 1% +1.93% (p=0.000 n=10+10)
MemmoveUnalignedDst/64-4 19.3ns ± 1% 12.0ns ± 1% -37.49% (p=0.000 n=10+10)
MemmoveUnalignedDst/128-4 17.2ns ± 0% 17.4ns ± 0% +1.16% (p=0.000 n=10+10)
MemmoveUnalignedDst/256-4 28.2ns ± 8% 29.2ns ± 0% ~ (p=0.352 n=10+10)
MemmoveUnalignedDst/512-4 49.8ns ± 3% 48.9ns ± 0% ~ (p=1.000 n=10+10)
MemmoveUnalignedDst/1024-4 89.5ns ± 0% 80.5ns ± 1% -10.02% (p=0.000 n=10+10)
MemmoveUnalignedDst/2048-4 180ns ± 0% 127ns ± 0% -29.44% (p=0.000 n=9+10)
MemmoveUnalignedDst/4096-4 347ns ± 0% 244ns ± 0% -29.59% (p=0.000 n=10+9)
MemmoveUnalignedSrc/128-4 16.1ns ± 0% 21.8ns ± 0% +35.40% (p=0.000 n=10+10)
MemmoveUnalignedSrc/256-4 24.9ns ± 8% 26.6ns ± 0% +6.70% (p=0.015 n=10+10)
MemmoveUnalignedSrc/512-4 39.4ns ± 6% 40.6ns ± 0% ~ (p=0.352 n=10+10)
MemmoveUnalignedSrc/1024-4 72.5ns ± 0% 83.0ns ± 1% +14.44% (p=0.000 n=9+10)
MemmoveUnalignedSrc/2048-4 129ns ± 1% 128ns ± 1% ~ (p=0.179 n=10+10)
MemmoveUnalignedSrc/4096-4 241ns ± 0% 253ns ± 1% +4.99% (p=0.000 n=9+9)
Cortex-A53 (Raspberry Pi 3)
name old time/op new time/op delta
Memmove/0-4 11.0ns ± 0% 11.0ns ± 1% ~ (p=0.294 n=8+10)
Memmove/1-4 29.6ns ± 0% 28.0ns ± 1% -5.41% (p=0.000 n=9+10)
Memmove/8-4 23.5ns ± 0% 22.1ns ± 0% -6.11% (p=0.000 n=8+8)
Memmove/16-4 23.7ns ± 1% 22.1ns ± 0% -6.59% (p=0.000 n=10+8)
Memmove/32-4 27.9ns ± 0% 27.1ns ± 0% -3.13% (p=0.000 n=8+8)
Memmove/64-4 33.8ns ± 0% 31.5ns ± 1% -6.99% (p=0.000 n=8+10)
Memmove/128-4 45.6ns ± 0% 44.2ns ± 1% -3.23% (p=0.000 n=9+10)
Memmove/256-4 69.3ns ± 0% 69.3ns ± 0% ~ (p=0.072 n=8+8)
Memmove/512-4 127ns ± 0% 110ns ± 0% -13.39% (p=0.000 n=8+8)
Memmove/1024-4 222ns ± 0% 205ns ± 1% -7.66% (p=0.000 n=7+10)
Memmove/2048-4 411ns ± 0% 366ns ± 0% -10.98% (p=0.000 n=8+9)
Memmove/4096-4 795ns ± 1% 695ns ± 1% -12.63% (p=0.000 n=10+10)
MemmoveUnalignedDst/64-4 44.0ns ± 0% 40.5ns ± 0% -7.93% (p=0.000 n=8+8)
MemmoveUnalignedDst/128-4 59.6ns ± 0% 54.9ns ± 0% -7.85% (p=0.000 n=9+9)
MemmoveUnalignedDst/256-4 98.2ns ±11% 90.0ns ± 1% ~ (p=0.130 n=10+10)
MemmoveUnalignedDst/512-4 161ns ± 2% 145ns ± 1% -9.96% (p=0.000 n=10+10)
MemmoveUnalignedDst/1024-4 281ns ± 0% 265ns ± 0% -5.65% (p=0.000 n=9+8)
MemmoveUnalignedDst/2048-4 528ns ± 0% 482ns ± 0% -8.73% (p=0.000 n=8+9)
MemmoveUnalignedDst/4096-4 1.02µs ± 1% 0.92µs ± 0% -10.00% (p=0.000 n=10+8)
MemmoveUnalignedSrc/64-4 42.4ns ± 1% 40.5ns ± 0% -4.39% (p=0.000 n=10+8)
MemmoveUnalignedSrc/128-4 57.4ns ± 0% 57.0ns ± 1% -0.75% (p=0.048 n=9+10)
MemmoveUnalignedSrc/256-4 88.1ns ± 1% 89.6ns ± 0% +1.70% (p=0.000 n=9+8)
MemmoveUnalignedSrc/512-4 160ns ± 2% 144ns ± 0% -9.89% (p=0.000 n=10+8)
MemmoveUnalignedSrc/1024-4 286ns ± 0% 266ns ± 1% -6.69% (p=0.000 n=8+10)
MemmoveUnalignedSrc/2048-4 525ns ± 0% 483ns ± 1% -7.96% (p=0.000 n=9+10)
MemmoveUnalignedSrc/4096-4 1.01µs ± 0% 0.92µs ± 1% -9.40% (p=0.000 n=8+10)
Change-Id: Ia1144e9d4dfafdece6e167c5e576bf80f254c8ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/243357
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: eric fang <eric.fang@arm.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Additionally removed unused PPC64.IsPOWER8 CPU feature detection.
Change-Id: I1411b03d396a72e08d6d51f8a1d1bad49eaa720e
Reviewed-on: https://go-review.googlesource.com/c/go/+/266077
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
|
|
Change-Id: I7d9e31c3b342731ddd7329962426fdfc80e9ed87
Reviewed-on: https://go-review.googlesource.com/c/go/+/263803
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
|
|
Supports 386 and amd64 architectures on all operating systems.
Example output:
$ go test -bench=.*
goos: darwin
goarch: amd64
pkg: strconv
cpu: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
BenchmarkAtof64Decimal-4 24431032 46.8 ns/op
...
As the displayed CPU information is only used for information
purposes it is lazily initialized when needed using the new
internal/sysinfo package.
This allows internal/cpu to stay without dependencies and avoid
initialization costs when the CPU information is not needed as
the new code to query the CPU name in internal/cpu can be
dead code eliminated if not used.
Fixes #39214
Change-Id: I77ae5c5d2fed6b28fa78dd45075f9f0a6a7f1bfd
Reviewed-on: https://go-review.googlesource.com/c/go/+/263804
Trust: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
When cpu_no_init.go was created most architectures did not have
code in the doinit function. Currently only mips(le), riscv64 and
wasm do not have empty doinit functions.
Keeping cpu_no_init.go around does not reduce the work to satisfy
the build process when adding support for new architectures.
To support a new architecture a new file or build directive has to
be added to an existing file at any rate to define the constant
CacheLinePadSize. A new empty doinit can then be created in the
new file or the existing doinit can be reused when adding the
additional build directive.
Change-Id: I58a97f8cdf1cf1be85c37f4550c40750358aa031
Reviewed-on: https://go-review.googlesource.com/c/go/+/263801
Trust: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
|
|
Move code to detect and mask arm64 CPU features from
runtime to internal/cpu.
Change-Id: Ib784e2ff056e8def125d68827b852f07a3eff0db
Reviewed-on: https://go-review.googlesource.com/c/go/+/261878
Trust: Martin Möhrmann <moehrmann@google.com>
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
|
|
Change-Id: I038b0fe165931b8ec3ef59f08dc73c8128d56572
Reviewed-on: https://go-review.googlesource.com/c/go/+/261365
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
On macOS, we can do "go build", can exec, and have the source
tree available, so we can enable more tests.
Skip ones that don't work. Most of them are due to that it
requires external linking (for now) and some tests don't work
with external linking (e.g. runtime deadlock detection). For
them, helper functions CanInternalLink/MustInternalLink are
introduced. I still want to have internal linking implemented,
but it is still a good idea to identify which tests don't work
with external linking.
Updates #38485.
Change-Id: I6b14697573cf3f371daf54b9ddd792acf232f2f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/260719
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
HWCap and HWCap2 are no longer linknamed into package runtime. Also,
merge two sentences both starting with "These are..." and don't mention
any file name where archauxv is defined, as it become outdated if
support for a new $GOOS/$GOARCH combination is added. This is e.g.
already the case for arm64, where archauxv is also defined for
freebsd/arm64.
Change-Id: I9314a66633736b12e777869a832d8b79d442a6f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/228057
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Like in x/sys/cpu, use anonymous structs to declare the CPU feature vars
instead of defining single-use types. Also, order the vars
alphabetically.
Change-Id: Iedd3ca51916e3cbb852d2aeed18b3a4c6613e778
Reviewed-on: https://go-review.googlesource.com/c/go/+/221757
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
|
|
Change-Id: Iacdad1758aa15e4703fccef38c08ecb338b95fd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/200579
Run-TryBot: Meng Zhuo <mengzhuo1203@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Based on riscv-go port.
Updates #27532
Change-Id: Ia3aed521d4109e7b73f762c5a3cdacc7cdac430d
Reviewed-on: https://go-review.googlesource.com/c/go/+/204635
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
one of my grep patterns when splitting the original CL up into
two parts.
This one might also have interesting stuff to resurrect for any future
x32 ABI support.
Updates #30439
Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This is part two if the nacl removal. Part 1 was CL 199499.
This CL removes amd64p32 support, which might be useful in the future
if we implement the x32 ABI. It also removes the nacl bits in the
toolchain, and some remaining nacl bits.
Updates #30439
Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/200077
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
You were a useful port and you've served your purpose.
Thanks for all the play.
A subsequent CL will remove amd64p32 (including assembly files and
toolchain bits) and remaining bits. The amd64p32 removal will be
separated into its own CL in case we want to support the Linux x32 ABI
in the future and want our old amd64p32 support as a starting point.
Updates #30439
Change-Id: Ia3a0c7d49804adc87bf52a4dea7e3d3007f2b1cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/199499
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Use the following (suboptimal) script to obtain a list of possible
typos:
#!/usr/bin/env sh
set -x
git ls-files |\
grep -e '\.\(c\|cc\|go\)$' |\
xargs -n 1\
awk\
'/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\
hunspell -d en_US -l |\
grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\
grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\
sort -f |\
uniq -c |\
awk '$1 == 1 { print $2; }'
Then, go through the results manually and fix the most obvious typos in
the non-vendored code.
Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae
Reviewed-on: https://go-review.googlesource.com/c/go/+/193848
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
This CL will check for the Message-Security-Assist Extension 9 facility
which enables the KDSA instruction.
Change-Id: I659aac09726e0999ec652ef1f5983072c8131a48
Reviewed-on: https://go-review.googlesource.com/c/go/+/174529
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This CL changes the internal/cpu API to more closely match the
public version in x/sys/cpu (added in CL 163003). This will make it
easier to update the dependencies of vendored code. The most prominent
renaming is from VE1 to VXE for the vector-enhancements facility 1.
VXE is the mnemonic used for this facility in the HWCAP vector.
Change-Id: I922d6c8bb287900a4bd7af70567e22eac567b5c1
Reviewed-on: https://go-review.googlesource.com/c/164437
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
available)
In the s390x assembly implementation of NIST P-256 curve, utilize faster multiply/square
instructions introduced in the z14. These new instructions are designed for crypto
and are constant time. The algorithm is unchanged except for faster
multiplication when run on a z14 or later. On z13, the original mutiplication
(also constant time) is used.
P-256 performance is critical in many applications, such as Blockchain.
name old time new time delta
BaseMultP256 24396 ns/op 21564 ns/op 1.13x
ScalarMultP256 87546 ns/op 72813 ns/op. 1.20x
Change-Id: I7e6d8b420fac56d5f9cc13c9423e2080df854bac
Reviewed-on: https://go-review.googlesource.com/c/146022
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
|
|
Change internal/cpu feature configuration to use
GODEBUG=cpu.feature1=value,cpu.feature2=value...
instead of GODEBUGCPU=feature1=value,feature2=value... .
This is not a backwards compatibility breaking change
since GODEBUGCPU was introduced in go1.11 as an
undocumented compiler experiment.
Fixes #28757
Change-Id: Ib21b3fed2334baeeb061a722ab1eb513d1137e87
Reviewed-on: https://go-review.googlesource.com/c/149578
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|