aboutsummaryrefslogtreecommitdiff
path: root/src/internal/cpu
AgeCommit message (Collapse)Author
2023-01-23runtime: enable sha512 optimizations on arm64 via hwcaps.Matt Horsnell
Change-Id: I9d88c8eb91106de412a9abc6601cdda06537d818 Reviewed-on: https://go-review.googlesource.com/c/go/+/461747 Reviewed-by: Martin Möhrmann <moehrmann@google.com> Auto-Submit: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org>
2022-11-14internal/godebug: define more efficient APIRuss Cox
We have been expanding our use of GODEBUG for compatibility, and the current implementation forces a tradeoff between freshness and efficiency. It parses the environment variable in full each time it is called, which is expensive. But if clients cache the result, they won't respond to run-time GODEBUG changes, as happened with x509sha1 (#56436). This CL changes the GODEBUG API to provide efficient, up-to-date results. Instead of a single Get function, New returns a *godebug.Setting that itself has a Get method. Clients can save the result of New, which is no more expensive than errors.New, in a global variable, and then call that variable's Get method to get the value. Get costs only two atomic loads in the case where the variable hasn't changed since the last call. Unfortunately, these changes do require importing sync from godebug, which will mean that sync itself will never be able to use a GODEBUG setting. That doesn't seem like such a hardship. If it was really necessary, the runtime could pass a setting to package sync itself at startup, with the caveat that that setting, like the ones used by runtime itself, would not respond to run-time GODEBUG changes. Change-Id: I99a3acfa24fb2a692610af26a5d14bbc62c966ac Reviewed-on: https://go-review.googlesource.com/c/go/+/449504 Run-TryBot: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-08runtime internal/cpu: rename "Zeus" "NeoverseV1".Matthew Horsnell
Rename "Zeus" to "NeoverseV1" for the partnum 0xd40 to be consistent with the documentation of MIDR_EL1 as described in https://developer.arm.com/documentation/101427/0101/?lang=en Change-Id: I2e3d5ec76b953a831cb4ab0438bc1c403648644b Reviewed-on: https://go-review.googlesource.com/c/go/+/414775 Reviewed-by: Jonathan Swinney <jswinney@amazon.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Eric Fang <eric.fang@arm.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2022-09-26internal/cpu: deduplicate arm64 ISAR parsing codeJoel Sing
Deduplicate code for parsing system registers - this matches what is done in golang.org/x/sys/cpu. Change-Id: If3524eb2e361179c68678f8214230d7068fe4c60 Reviewed-on: https://go-review.googlesource.com/c/go/+/422217 Reviewed-by: Meng Zhuo <mzh@golangcn.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-26internal/cpu: enable arm64 SHA512 detection for freebsd/openbsdJoel Sing
Change-Id: I1f21654b50d7b0cd8e1f854efe2724b72f067449 Reviewed-on: https://go-review.googlesource.com/c/go/+/422216 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Meng Zhuo <mzh@golangcn.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Yuval Pavel Zholkover <paulzhol@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-08-15internal/cpu: detect sha-ni instruction support for AMD64ted
addresses proposal #53084 required by sha-256 change list developed for #50543 Change-Id: I5454d746fce069a7a4993d70dc5b0a5544f8eeaf Reviewed-on: https://go-review.googlesource.com/c/go/+/408794 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@google.com>
2022-08-14internal/cpu: fix cpu cacheLineSize for arm64 darwin(a.k.a. M1)Pure White
The existing value for M1 is 64, which is the same as other arm64 cpus. But the correct cacheLineSize for M1 should be 128, which can be verified using the following command: $ sysctl -a hw | grep cachelinesize hw.cachelinesize: 128 Fixes #53075 Change-Id: Iaa8330010a4499b9b357c70743d55aed6ddb8588 GitHub-Last-Rev: df87eb9c503c6bc5220a92ef1bc4c4c89ef4658d GitHub-Pull-Request: golang/go#53076 Reviewed-on: https://go-review.googlesource.com/c/go/+/408576 Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Meng Zhuo <mzh@golangcn.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Martin Möhrmann <martin@golang.org> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Keith Randall <khr@google.com>
2022-08-09internal/cpu: add sha512 for arm64Meng Zhuo
The new M1 cpu (Apple) comes with sha512 hardware acceleration feature. Change-Id: I823d1e9b09b472bd21571eee75cc5314cd66b1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/408836 Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-08internal/cpu: implement CPU feature detection for openbsd/arm64Joel Sing
OpenBSD 7.1 onwards expose the aarch64 ISAR0 and ISAR1 registers via sysctl: $ sysctl machdep machdep.compatible=apple,j274 machdep.id_aa64isar0=153421459058925856 machdep.id_aa64isar1=1172796674562 Implement CPU feature detection for openbsd/arm64 based on this information. Fixes #31746 Change-Id: If8a9b2b8fc557e1aaefbcb52f4d1bd9efc43856d Reviewed-on: https://go-review.googlesource.com/c/go/+/421875 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Joel Sing <joel@sing.id.au> Run-TryBot: Ian Lance Taylor <iant@google.com>
2022-06-14cpu: fix typos in test caseag9920
Change-Id: Id6a27d0b3f3fc4181a00569bacc578e72b04ce09 GitHub-Last-Rev: 85c063d1a2d62181d16044592a60acf970fe3c86 GitHub-Pull-Request: golang/go#53359 Reviewed-on: https://go-review.googlesource.com/c/go/+/411916 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Peter Zhang <binbin36520@gmail.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Robert Griesemer <gri@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com>
2022-05-26internal/cpu: fix cpu cacheLineSize for loong64Guoqi Chen
We choose 64 because the L1 Dcache of Loongson 3A5000 CPU is 4-way 256-line 64-byte-per-line. Change-Id: Ifb9a9f993dd6f75b5adb4ff6e4d93e945b1b2a98 Reviewed-on: https://go-review.googlesource.com/c/go/+/408854 Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Alex Rakoczy <alex@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-05-17internal/{cpu, goarch}: add constant definition for loong64Xiaodong Liu
Contributors to the loong64 port are: Weining Lu <luweining@loongson.cn> Lei Wang <wanglei@loongson.cn> Lingqin Gong <gonglingqin@loongson.cn> Xiaolin Zhao <zhaoxiaolin@loongson.cn> Meidan Li <limeidan@loongson.cn> Xiaojuan Zhai <zhaixiaojuan@loongson.cn> Qiyuan Pu <puqiyuan@loongson.cn> Guoqi Chen <chenguoqi@loongson.cn> This port has been updated to Go 1.15.6: https://github.com/loongson/go Updates #46229 Change-Id: I39d42e5959391e47bf621b3bdd3d95de72f023cc Reviewed-on: https://go-review.googlesource.com/c/go/+/342318 Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-10internal/cpu: report CPU if known on PPC64Paul E. Murphy
The PPC64 maintainers are testing on P10 hardware, so it is helpful to report the correct cpu, even if this information is not used elsewhere yet. Note, AIX will report the current CPU of the host system, so a POWER10 will not set the IsPOWER9 flag. This is existing behavior, and should be fixed in a separate patch. Change-Id: Iebe23dd96ebe03c8a1c70d1ed2dc1506bad3c330 Reviewed-on: https://go-review.googlesource.com/c/go/+/404394 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-05-09internal/cpu: revise test to make it work properly with -coverThan McIntosh
Fix up a test to insure that it does the right thing when "go test -cover" is in effect. Fixes #52761. Change-Id: I0c141181e2dcaefd592fb04813f812f2800511da Reviewed-on: https://go-review.googlesource.com/c/go/+/404715 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2022-03-15internal/cpu: don't run SSE3 disable test if GOAMD64>1Keith Randall
That feature can't be disabled if the microarchitectural version requires it. Change-Id: Iad8aaa8089d2f023e9ae5044c6da33224499f09b Reviewed-on: https://go-review.googlesource.com/c/go/+/392994 Run-TryBot: Keith Randall <khr@golang.org> Trust: Keith Randall <khr@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Martin Möhrmann <martin@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-03-14internal/cpu: disallow disabling options that are required for microarchKeith Randall
e.g., if GOAMD64=v3, don't allow GODEBUG=cpu.XXX=off for XXX which are required for v3. Change-Id: Ib58a4c8b13c5464ba476448ba44bbb261218787c Reviewed-on: https://go-review.googlesource.com/c/go/+/391694 Trust: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Martin Möhrmann <martin@golang.org>
2022-03-07internal/cpu: set PPC64.IsPOWER8Paul E. Murphy
This should always be true, but use the HWCAP2 bit anyways. Change-Id: Ib164cf05b4c9f0c509f41b7eb339ef32fb63e384 Reviewed-on: https://go-review.googlesource.com/c/go/+/389894 Trust: Paul Murphy <murp@ibm.com> Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-11-06all: remove more leftover // +build linesTobias Klauser
CL 344955 and CL 359476 removed almost all // +build lines, but leaving some assembly files and generating scripts. Also, some files were added with // +build lines after CL 359476 was merged. Remove these or rename files where more appropriate. For #41184 Change-Id: I7eb85a498ed9788b42a636e775f261d755504ffa Reviewed-on: https://go-review.googlesource.com/c/go/+/361480 Trust: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-11-02cmd/go, internal/cpu: use internal/godebug in testsBrad Fitzpatrick
Change-Id: Ifdf67e778e88ee70780428aa5479d2e091752a3a Reviewed-on: https://go-review.googlesource.com/c/go/+/360605 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Trust: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-10-28all: go fix -fix=buildtag std cmd (except for bootstrap deps, vendor)Russ Cox
When these packages are released as part of Go 1.18, Go 1.16 will no longer be supported, so we can remove the +build tags in these files. Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs as defined in src/cmd/dist/buildtool.go, which need to continue to build with Go 1.4 for now. Also reverted src/vendor and src/cmd/vendor, which will need to be updated in their own repos first. Manual changes in runtime/pprof/mprof_test.go to adjust line numbers. For #41184. Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e Reviewed-on: https://go-review.googlesource.com/c/go/+/344955 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-10-06internal/cpu: remove option to mark cpu features requiredMartin Möhrmann
With the removal of SSE2 runtime detection made in golang.org/cl/344350 we can remove this mechanism as there are no required features anymore. For making sure CPUs running a go program support all the minimal hardware requirements the go runtime should do feature checks early in the runtime initialization before it is likely any compiler emitted but unsupported instructions are used. This is already the case for e.g. checking MMX support on 386 arch targets. Change-Id: If7b1cb6f43233841e917d37a18314d06a334a734 Reviewed-on: https://go-review.googlesource.com/c/go/+/354209 Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2021-08-23all: replace runtime SSE2 detection with GO386 settingMartin Möhrmann
When GO386=sse2 we can assume sse2 to be present without a runtime check. If GO386=softfloat is set we can avoid the usage of SSE2 even if detected. This might cause a memcpy, memclr and bytealg slowdown of Go binaries compiled with softfloat on machines that support SSE2. Such setups are rare and should use GO386=sse2 instead if performance matters. On targets that support SSE2 we avoid the runtime overhead of dynamic cpu feature dispatch. The removal of runtime sse2 checks also allows to simplify internal/cpu further by removing handling of the required feature option as a followup after this CL. Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17 Reviewed-on: https://go-review.googlesource.com/c/go/+/344350 Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2021-08-23runtime: use RDTSCP for instruction stream serialized read of TSCMartin Möhrmann
To measure all instructions having been completed before reading the time stamp counter with RDTSC an instruction sequence that has instruction stream serializing properties which guarantee waiting until all previous instructions have been executed is needed. This does not necessary mean to wait for all stores to be globally visible. This CL aims to remove vendor specific logic for determining the instruction sequence with CPU feature flag checks that are CPU vendor independent. For intel LFENCE has the wanted properties at least since it was introduced together with SSE2 support. On AMD instruction stream serializing LFENCE is supported by setting an MSR C001_1029[1]=1 on AMD family 10h/12h/14h/15h/16h/17h processors. AMD family 0Fh/11h processors support LFENCE as serializing always. AMD plans support for this MSR and access to this bit for all future processors. Source: https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf Reading the MSR to determine LFENCE properties is not always possible or reliable (hypervisors). The Linux kernel is relying on serializing LFENCE on AMD CPUs since a commit in July 2019: https://lkml.org/lkml/2019/7/22/295 and the MSR C001_1029 to enable serialization has been set by default with the Spectre v1 mitigations. Using an MFENCE on AMD is waiting on previous instructions having been executed but in addition also flushes store buffers. To align the serialization properties without runtime detection of CPU manufacturers we can use the newer RDTSCP instruction which waits until all previous instructions have been executed. RDTSCP is available on Intel since around 2008 and on AMD CPUs since around 2006. Support for RDTSCP can be checked independently of manufacturer by checking CPUID bits. Using RDTSCP is the default in Linux to read TSC in program order when the instruction is available. https://github.com/torvalds/linux/blob/e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93/arch/x86/include/asm/msr.h#L231 Change-Id: Ifa841843b9abb2816f8f0754a163ebf01385306d Reviewed-on: https://go-review.googlesource.com/c/go/+/344429 Reviewed-by: Keith Randall <khr@golang.org> Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org>
2021-05-13all: add //go:build lines to assembly filesTobias Klauser
Don't add them to files in vendor and cmd/vendor though. These will be pulled in by updating the respective dependencies. For #41184 Change-Id: Icc57458c9b3033c347124323f33084c85b224c70 Reviewed-on: https://go-review.googlesource.com/c/go/+/319389 Trust: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
2021-02-20all: go fmt std cmd (but revert vendor)Russ Cox
Make all our package sources use Go 1.17 gofmt format (adding //go:build lines). Part of //go:build change (#41184). See https://golang.org/design/draft-gobuild Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/294430 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-09all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTempRuss Cox
As part of #42026, these helpers from io/ioutil were moved to os. (ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.) Update the Go tree to use the preferred names. As usual, code compiled with the Go 1.4 bootstrap toolchain and code vendored from other sources is excluded. ReadDir changes are in a separate CL, because they are not a simple search and replace. For #42026. Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/266365 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-12-07internal/cpu: add darwin/arm64 CPU feature detection supportMartin Möhrmann
Fixes #42747 Change-Id: I6b1679348c77161f075f0678818bb003fc0e8c86 Reviewed-on: https://go-review.googlesource.com/c/go/+/271989 Trust: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-12-05internal/cpu: fix typo in cpu_arm64.goIkko Ashimine
auxillary -> auxiliary Change-Id: I7c29c4a63d236c3688b8e4f5af70650d43cd89c0 GitHub-Last-Rev: d4a18c71a15cf0803bd847225ed5bf898c52e0f3 GitHub-Pull-Request: golang/go#43024 Reviewed-on: https://go-review.googlesource.com/c/go/+/275592 Reviewed-by: Ian Lance Taylor <iant@golang.org> Trust: Keith Randall <khr@golang.org>
2020-12-03internal/cpu: disable FMA when OSXSAVE is not enabled on x86Martin Möhrmann
All instructions in the FMA extension on x86 are VEX prefixed. VEX prefixed instructions generally require OSXSAVE to be enabled. The execution of FMA instructions emitted by the Go compiler on amd64 will generate an invalid opcode exception if OSXSAVE is not enabled. Fixes #41022 Change-Id: I49881630e7195c804110a2bd81b5bec8cac31ba8 Reviewed-on: https://go-review.googlesource.com/c/go/+/274479 Trust: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-11-05internal/cpu: fix and cleanup ARM64 cpu feature fields and optionsMartin Möhrmann
Remove all cpu features from the ARM64 struct that are not initialized to reduce cache lines used and to avoid those features being accidentially used without actual detection if they are present. Add missing option to mask the CPUID feature. Change-Id: I94bf90c0655de1af2218ac72117ac6c52adfc289 Reviewed-on: https://go-review.googlesource.com/c/go/+/267658 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Trust: Martin Möhrmann <moehrmann@google.com>
2020-11-02runtime: improve memmove performance on arm64Jonathan Swinney
Replace the memmove implementation for moves of 17 bytes or larger with an implementation from ARM optimized software. The moves of 16 bytes or fewer are unchanged, but the registers used are updated to match the rest of the implementation. This implementation makes use of new optimizations: - software pipelined loop for large (>128 byte) moves - medium size moves (17..128 bytes) have a new implementation - address realignment when src or dst is unaligned - preference for aligned src (loads) or dst (stores) depending on CPU To support preference for aligned loads or aligned stores, a new CPU flag is added. This flag indicates that the detected micro architecture performs better with aligned loads. Some tested CPUs did not exhibit a significant difference and are left with the default behavior of realigning based on the destination address (stores). Neoverse N1 (Tested on Graviton 2) name old time/op new time/op delta Memmove/0-4 1.88ns ± 1% 1.87ns ± 1% -0.58% (p=0.020 n=10+10) Memmove/1-4 4.40ns ± 0% 4.40ns ± 0% ~ (all equal) Memmove/8-4 3.88ns ± 3% 3.80ns ± 0% -1.97% (p=0.001 n=10+9) Memmove/16-4 3.90ns ± 3% 3.80ns ± 0% -2.49% (p=0.000 n=10+9) Memmove/32-4 4.80ns ± 0% 4.40ns ± 0% -8.33% (p=0.000 n=9+8) Memmove/64-4 5.86ns ± 0% 5.00ns ± 0% -14.76% (p=0.000 n=8+8) Memmove/128-4 8.46ns ± 0% 8.06ns ± 0% -4.62% (p=0.000 n=10+10) Memmove/256-4 12.4ns ± 0% 12.2ns ± 0% -1.61% (p=0.000 n=10+10) Memmove/512-4 19.5ns ± 0% 19.1ns ± 0% -2.05% (p=0.000 n=10+10) Memmove/1024-4 33.7ns ± 0% 33.5ns ± 0% -0.59% (p=0.000 n=10+10) Memmove/2048-4 62.1ns ± 0% 59.0ns ± 0% -4.99% (p=0.000 n=10+10) Memmove/4096-4 117ns ± 1% 110ns ± 0% -5.66% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 6.41ns ± 0% 5.62ns ± 0% -12.32% (p=0.000 n=10+7) MemmoveUnalignedDst/128-4 9.40ns ± 0% 8.34ns ± 0% -11.24% (p=0.000 n=10+10) MemmoveUnalignedDst/256-4 12.8ns ± 0% 12.8ns ± 0% ~ (all equal) MemmoveUnalignedDst/512-4 20.4ns ± 0% 19.7ns ± 0% -3.43% (p=0.000 n=9+10) MemmoveUnalignedDst/1024-4 34.1ns ± 0% 35.1ns ± 0% +2.93% (p=0.000 n=9+9) MemmoveUnalignedDst/2048-4 61.5ns ± 0% 60.4ns ± 0% -1.77% (p=0.000 n=10+10) MemmoveUnalignedDst/4096-4 122ns ± 0% 113ns ± 0% -7.38% (p=0.002 n=8+10) MemmoveUnalignedSrc/64-4 7.25ns ± 1% 6.26ns ± 0% -13.64% (p=0.000 n=9+9) MemmoveUnalignedSrc/128-4 10.5ns ± 0% 9.7ns ± 0% -7.52% (p=0.000 n=10+10) MemmoveUnalignedSrc/256-4 17.1ns ± 0% 17.3ns ± 0% +1.17% (p=0.000 n=10+10) MemmoveUnalignedSrc/512-4 27.0ns ± 0% 27.0ns ± 0% ~ (all equal) MemmoveUnalignedSrc/1024-4 46.7ns ± 0% 35.7ns ± 0% -23.55% (p=0.000 n=10+9) MemmoveUnalignedSrc/2048-4 85.2ns ± 0% 61.2ns ± 0% -28.17% (p=0.000 n=10+8) MemmoveUnalignedSrc/4096-4 162ns ± 0% 113ns ± 0% -30.25% (p=0.000 n=10+10) name old speed new speed delta Memmove/4096-4 35.2GB/s ± 0% 37.1GB/s ± 0% +5.56% (p=0.000 n=10+9) MemmoveUnalignedSrc/1024-4 21.9GB/s ± 0% 28.7GB/s ± 0% +30.90% (p=0.000 n=10+10) MemmoveUnalignedSrc/2048-4 24.0GB/s ± 0% 33.5GB/s ± 0% +39.18% (p=0.000 n=10+9) MemmoveUnalignedSrc/4096-4 25.3GB/s ± 0% 36.2GB/s ± 0% +43.50% (p=0.000 n=10+7) Cortex-A72 (Graviton 1) name old time/op new time/op delta Memmove/0-4 3.06ns ± 3% 3.08ns ± 1% ~ (p=0.958 n=10+9) Memmove/1-4 8.72ns ± 0% 7.85ns ± 0% -9.98% (p=0.002 n=8+10) Memmove/8-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal) Memmove/16-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal) Memmove/32-4 8.19ns ± 2% 8.29ns ± 0% ~ (p=0.114 n=10+10) Memmove/64-4 18.3ns ± 4% 10.0ns ± 0% -45.36% (p=0.000 n=10+10) Memmove/128-4 14.8ns ± 0% 17.4ns ± 0% +17.77% (p=0.000 n=10+10) Memmove/256-4 21.8ns ± 0% 23.1ns ± 0% +5.96% (p=0.000 n=10+10) Memmove/512-4 35.8ns ± 0% 37.2ns ± 0% +3.91% (p=0.000 n=10+10) Memmove/1024-4 63.7ns ± 0% 67.2ns ± 0% +5.49% (p=0.000 n=10+10) Memmove/2048-4 126ns ± 0% 123ns ± 0% -2.38% (p=0.000 n=10+10) Memmove/4096-4 238ns ± 1% 243ns ± 1% +1.93% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 19.3ns ± 1% 12.0ns ± 1% -37.49% (p=0.000 n=10+10) MemmoveUnalignedDst/128-4 17.2ns ± 0% 17.4ns ± 0% +1.16% (p=0.000 n=10+10) MemmoveUnalignedDst/256-4 28.2ns ± 8% 29.2ns ± 0% ~ (p=0.352 n=10+10) MemmoveUnalignedDst/512-4 49.8ns ± 3% 48.9ns ± 0% ~ (p=1.000 n=10+10) MemmoveUnalignedDst/1024-4 89.5ns ± 0% 80.5ns ± 1% -10.02% (p=0.000 n=10+10) MemmoveUnalignedDst/2048-4 180ns ± 0% 127ns ± 0% -29.44% (p=0.000 n=9+10) MemmoveUnalignedDst/4096-4 347ns ± 0% 244ns ± 0% -29.59% (p=0.000 n=10+9) MemmoveUnalignedSrc/128-4 16.1ns ± 0% 21.8ns ± 0% +35.40% (p=0.000 n=10+10) MemmoveUnalignedSrc/256-4 24.9ns ± 8% 26.6ns ± 0% +6.70% (p=0.015 n=10+10) MemmoveUnalignedSrc/512-4 39.4ns ± 6% 40.6ns ± 0% ~ (p=0.352 n=10+10) MemmoveUnalignedSrc/1024-4 72.5ns ± 0% 83.0ns ± 1% +14.44% (p=0.000 n=9+10) MemmoveUnalignedSrc/2048-4 129ns ± 1% 128ns ± 1% ~ (p=0.179 n=10+10) MemmoveUnalignedSrc/4096-4 241ns ± 0% 253ns ± 1% +4.99% (p=0.000 n=9+9) Cortex-A53 (Raspberry Pi 3) name old time/op new time/op delta Memmove/0-4 11.0ns ± 0% 11.0ns ± 1% ~ (p=0.294 n=8+10) Memmove/1-4 29.6ns ± 0% 28.0ns ± 1% -5.41% (p=0.000 n=9+10) Memmove/8-4 23.5ns ± 0% 22.1ns ± 0% -6.11% (p=0.000 n=8+8) Memmove/16-4 23.7ns ± 1% 22.1ns ± 0% -6.59% (p=0.000 n=10+8) Memmove/32-4 27.9ns ± 0% 27.1ns ± 0% -3.13% (p=0.000 n=8+8) Memmove/64-4 33.8ns ± 0% 31.5ns ± 1% -6.99% (p=0.000 n=8+10) Memmove/128-4 45.6ns ± 0% 44.2ns ± 1% -3.23% (p=0.000 n=9+10) Memmove/256-4 69.3ns ± 0% 69.3ns ± 0% ~ (p=0.072 n=8+8) Memmove/512-4 127ns ± 0% 110ns ± 0% -13.39% (p=0.000 n=8+8) Memmove/1024-4 222ns ± 0% 205ns ± 1% -7.66% (p=0.000 n=7+10) Memmove/2048-4 411ns ± 0% 366ns ± 0% -10.98% (p=0.000 n=8+9) Memmove/4096-4 795ns ± 1% 695ns ± 1% -12.63% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 44.0ns ± 0% 40.5ns ± 0% -7.93% (p=0.000 n=8+8) MemmoveUnalignedDst/128-4 59.6ns ± 0% 54.9ns ± 0% -7.85% (p=0.000 n=9+9) MemmoveUnalignedDst/256-4 98.2ns ±11% 90.0ns ± 1% ~ (p=0.130 n=10+10) MemmoveUnalignedDst/512-4 161ns ± 2% 145ns ± 1% -9.96% (p=0.000 n=10+10) MemmoveUnalignedDst/1024-4 281ns ± 0% 265ns ± 0% -5.65% (p=0.000 n=9+8) MemmoveUnalignedDst/2048-4 528ns ± 0% 482ns ± 0% -8.73% (p=0.000 n=8+9) MemmoveUnalignedDst/4096-4 1.02µs ± 1% 0.92µs ± 0% -10.00% (p=0.000 n=10+8) MemmoveUnalignedSrc/64-4 42.4ns ± 1% 40.5ns ± 0% -4.39% (p=0.000 n=10+8) MemmoveUnalignedSrc/128-4 57.4ns ± 0% 57.0ns ± 1% -0.75% (p=0.048 n=9+10) MemmoveUnalignedSrc/256-4 88.1ns ± 1% 89.6ns ± 0% +1.70% (p=0.000 n=9+8) MemmoveUnalignedSrc/512-4 160ns ± 2% 144ns ± 0% -9.89% (p=0.000 n=10+8) MemmoveUnalignedSrc/1024-4 286ns ± 0% 266ns ± 1% -6.69% (p=0.000 n=8+10) MemmoveUnalignedSrc/2048-4 525ns ± 0% 483ns ± 1% -7.96% (p=0.000 n=9+10) MemmoveUnalignedSrc/4096-4 1.01µs ± 0% 0.92µs ± 1% -9.40% (p=0.000 n=8+10) Change-Id: Ia1144e9d4dfafdece6e167c5e576bf80f254c8ab Reviewed-on: https://go-review.googlesource.com/c/go/+/243357 TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: eric fang <eric.fang@arm.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-29runtime: move ppc64/aix cpu feature detection to internal/cpuMartin Möhrmann
Additionally removed unused PPC64.IsPOWER8 CPU feature detection. Change-Id: I1411b03d396a72e08d6d51f8a1d1bad49eaa720e Reviewed-on: https://go-review.googlesource.com/c/go/+/266077 Trust: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
2020-10-22runtime: move s390x HWCap CPU feature detection to internal/cpuMartin Möhrmann
Change-Id: I7d9e31c3b342731ddd7329962426fdfc80e9ed87 Reviewed-on: https://go-review.googlesource.com/c/go/+/263803 Trust: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
2020-10-20testing: print cpu type as label for benchmarksMartin Möhrmann
Supports 386 and amd64 architectures on all operating systems. Example output: $ go test -bench=.* goos: darwin goarch: amd64 pkg: strconv cpu: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz BenchmarkAtof64Decimal-4 24431032 46.8 ns/op ... As the displayed CPU information is only used for information purposes it is lazily initialized when needed using the new internal/sysinfo package. This allows internal/cpu to stay without dependencies and avoid initialization costs when the CPU information is not needed as the new code to query the CPU name in internal/cpu can be dead code eliminated if not used. Fixes #39214 Change-Id: I77ae5c5d2fed6b28fa78dd45075f9f0a6a7f1bfd Reviewed-on: https://go-review.googlesource.com/c/go/+/263804 Trust: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2020-10-20internal/cpu: make architectures without initialization work explicitMartin Möhrmann
When cpu_no_init.go was created most architectures did not have code in the doinit function. Currently only mips(le), riscv64 and wasm do not have empty doinit functions. Keeping cpu_no_init.go around does not reduce the work to satisfy the build process when adding support for new architectures. To support a new architecture a new file or build directive has to be added to an existing file at any rate to define the constant CacheLinePadSize. A new empty doinit can then be created in the new file or the existing doinit can be reused when adding the additional build directive. Change-Id: I58a97f8cdf1cf1be85c37f4550c40750358aa031 Reviewed-on: https://go-review.googlesource.com/c/go/+/263801 Trust: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
2020-10-20internal/cpu: consolidate arm64 feature detectionMartin Möhrmann
Move code to detect and mask arm64 CPU features from runtime to internal/cpu. Change-Id: Ib784e2ff056e8def125d68827b852f07a3eff0db Reviewed-on: https://go-review.googlesource.com/c/go/+/261878 Trust: Martin Möhrmann <moehrmann@google.com> Trust: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Benny Siegert <bsiegert@gmail.com>
2020-10-13internal/cpu: remove unused arm64 capabilitiesMartin Möhrmann
Change-Id: I038b0fe165931b8ec3ef59f08dc73c8128d56572 Reviewed-on: https://go-review.googlesource.com/c/go/+/261365 Trust: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-10-09all: enable more tests on macOS/ARM64Cherry Zhang
On macOS, we can do "go build", can exec, and have the source tree available, so we can enable more tests. Skip ones that don't work. Most of them are due to that it requires external linking (for now) and some tests don't work with external linking (e.g. runtime deadlock detection). For them, helper functions CanInternalLink/MustInternalLink are introduced. I still want to have internal linking implemented, but it is still a good idea to identify which tests don't work with external linking. Updates #38485. Change-Id: I6b14697573cf3f371daf54b9ddd792acf232f2f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/260719 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>
2020-04-13internal/cpu: unify HWCap/HWCap2 commentsTobias Klauser
HWCap and HWCap2 are no longer linknamed into package runtime. Also, merge two sentences both starting with "These are..." and don't mention any file name where archauxv is defined, as it become outdated if support for a new $GOOS/$GOARCH combination is added. This is e.g. already the case for arm64, where archauxv is also defined for freebsd/arm64. Change-Id: I9314a66633736b12e777869a832d8b79d442a6f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/228057 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-03-02internal/cpu: use anonymous struct for CPU feature varsTobias Klauser
Like in x/sys/cpu, use anonymous structs to declare the CPU feature vars instead of defining single-use types. Also, order the vars alphabetically. Change-Id: Iedd3ca51916e3cbb852d2aeed18b3a4c6613e778 Reviewed-on: https://go-review.googlesource.com/c/go/+/221757 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2020-02-28internal/cpu: add MIPS64x feature detectionMeng Zhuo
Change-Id: Iacdad1758aa15e4703fccef38c08ecb338b95fd7 Reviewed-on: https://go-review.googlesource.com/c/go/+/200579 Run-TryBot: Meng Zhuo <mengzhuo1203@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-11-11internal/cpu,internal/bytealg: add support for riscv64Joel Sing
Based on riscv-go port. Updates #27532 Change-Id: Ia3aed521d4109e7b73f762c5a3cdacc7cdac430d Reviewed-on: https://go-review.googlesource.com/c/go/+/204635 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-10all: remove nacl (part 3, more amd64p32)Brad Fitzpatrick
Part 1: CL 199499 (GOOS nacl) Part 2: CL 200077 (amd64p32 files, toolchain) Part 3: stuff that arguably should've been part of Part 2, but I forgot one of my grep patterns when splitting the original CL up into two parts. This one might also have interesting stuff to resurrect for any future x32 ABI support. Updates #30439 Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c Reviewed-on: https://go-review.googlesource.com/c/go/+/200318 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09all: remove the nacl port (part 2, amd64p32 + toolchain)Brad Fitzpatrick
This is part two if the nacl removal. Part 1 was CL 199499. This CL removes amd64p32 support, which might be useful in the future if we implement the x32 ABI. It also removes the nacl bits in the toolchain, and some remaining nacl bits. Updates #30439 Change-Id: I2475d5bb066d1b474e00e40d95b520e7c2e286e1 Reviewed-on: https://go-review.googlesource.com/c/go/+/200077 Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-10-09all: remove the nacl port (part 1)Brad Fitzpatrick
You were a useful port and you've served your purpose. Thanks for all the play. A subsequent CL will remove amd64p32 (including assembly files and toolchain bits) and remaining bits. The amd64p32 removal will be separated into its own CL in case we want to support the Linux x32 ABI in the future and want our old amd64p32 support as a starting point. Updates #30439 Change-Id: Ia3a0c7d49804adc87bf52a4dea7e3d3007f2b1cd Reviewed-on: https://go-review.googlesource.com/c/go/+/199499 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-09-08all: fix typosAinar Garipov
Use the following (suboptimal) script to obtain a list of possible typos: #!/usr/bin/env sh set -x git ls-files |\ grep -e '\.\(c\|cc\|go\)$' |\ xargs -n 1\ awk\ '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\ hunspell -d en_US -l |\ grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\ grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\ sort -f |\ uniq -c |\ awk '$1 == 1 { print $2; }' Then, go through the results manually and fix the most obvious typos in the non-vendored code. Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae Reviewed-on: https://go-review.googlesource.com/c/go/+/193848 Reviewed-by: Robert Griesemer <gri@golang.org>
2019-04-30internal/cpu: add detection for the new ECDSA and EDDSA capabilities on s390xbill_ofarrell
This CL will check for the Message-Security-Assist Extension 9 facility which enables the KDSA instruction. Change-Id: I659aac09726e0999ec652ef1f5983072c8131a48 Reviewed-on: https://go-review.googlesource.com/c/go/+/174529 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-02-28internal/cpu: change s390x API to match x/sys/cpuMichael Munday
This CL changes the internal/cpu API to more closely match the public version in x/sys/cpu (added in CL 163003). This will make it easier to update the dependencies of vendored code. The most prominent renaming is from VE1 to VXE for the vector-enhancements facility 1. VXE is the mnemonic used for this facility in the HWCAP vector. Change-Id: I922d6c8bb287900a4bd7af70567e22eac567b5c1 Reviewed-on: https://go-review.googlesource.com/c/164437 Reviewed-by: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-05crypto/elliptic: utilize faster z14 multiply/square instructions (when ↵bill_ofarrell
available) In the s390x assembly implementation of NIST P-256 curve, utilize faster multiply/square instructions introduced in the z14. These new instructions are designed for crypto and are constant time. The algorithm is unchanged except for faster multiplication when run on a z14 or later. On z13, the original mutiplication (also constant time) is used. P-256 performance is critical in many applications, such as Blockchain. name old time new time delta BaseMultP256 24396 ns/op 21564 ns/op 1.13x ScalarMultP256 87546 ns/op 72813 ns/op. 1.20x Change-Id: I7e6d8b420fac56d5f9cc13c9423e2080df854bac Reviewed-on: https://go-review.googlesource.com/c/146022 Reviewed-by: Michael Munday <mike.munday@ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Michael Munday <mike.munday@ibm.com>
2018-11-14internal/cpu: move GODEBUGCPU options into GODEBUGMartin Möhrmann
Change internal/cpu feature configuration to use GODEBUG=cpu.feature1=value,cpu.feature2=value... instead of GODEBUGCPU=feature1=value,feature2=value... . This is not a backwards compatibility breaking change since GODEBUGCPU was introduced in go1.11 as an undocumented compiler experiment. Fixes #28757 Change-Id: Ib21b3fed2334baeeb061a722ab1eb513d1137e87 Reviewed-on: https://go-review.googlesource.com/c/149578 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>