aboutsummaryrefslogtreecommitdiff
path: root/src/internal/cpu/cpu.go
AgeCommit message (Collapse)Author
2022-11-08runtime internal/cpu: rename "Zeus" "NeoverseV1".Matthew Horsnell
Rename "Zeus" to "NeoverseV1" for the partnum 0xd40 to be consistent with the documentation of MIDR_EL1 as described in https://developer.arm.com/documentation/101427/0101/?lang=en Change-Id: I2e3d5ec76b953a831cb4ab0438bc1c403648644b Reviewed-on: https://go-review.googlesource.com/c/go/+/414775 Reviewed-by: Jonathan Swinney <jswinney@amazon.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Eric Fang <eric.fang@arm.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2022-08-15internal/cpu: detect sha-ni instruction support for AMD64ted
addresses proposal #53084 required by sha-256 change list developed for #50543 Change-Id: I5454d746fce069a7a4993d70dc5b0a5544f8eeaf Reviewed-on: https://go-review.googlesource.com/c/go/+/408794 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@google.com>
2022-08-09internal/cpu: add sha512 for arm64Meng Zhuo
The new M1 cpu (Apple) comes with sha512 hardware acceleration feature. Change-Id: I823d1e9b09b472bd21571eee75cc5314cd66b1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/408836 Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-10internal/cpu: report CPU if known on PPC64Paul E. Murphy
The PPC64 maintainers are testing on P10 hardware, so it is helpful to report the correct cpu, even if this information is not used elsewhere yet. Note, AIX will report the current CPU of the host system, so a POWER10 will not set the IsPOWER9 flag. This is existing behavior, and should be fixed in a separate patch. Change-Id: Iebe23dd96ebe03c8a1c70d1ed2dc1506bad3c330 Reviewed-on: https://go-review.googlesource.com/c/go/+/404394 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2021-10-06internal/cpu: remove option to mark cpu features requiredMartin Möhrmann
With the removal of SSE2 runtime detection made in golang.org/cl/344350 we can remove this mechanism as there are no required features anymore. For making sure CPUs running a go program support all the minimal hardware requirements the go runtime should do feature checks early in the runtime initialization before it is likely any compiler emitted but unsupported instructions are used. This is already the case for e.g. checking MMX support on 386 arch targets. Change-Id: If7b1cb6f43233841e917d37a18314d06a334a734 Reviewed-on: https://go-review.googlesource.com/c/go/+/354209 Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2021-08-23all: replace runtime SSE2 detection with GO386 settingMartin Möhrmann
When GO386=sse2 we can assume sse2 to be present without a runtime check. If GO386=softfloat is set we can avoid the usage of SSE2 even if detected. This might cause a memcpy, memclr and bytealg slowdown of Go binaries compiled with softfloat on machines that support SSE2. Such setups are rare and should use GO386=sse2 instead if performance matters. On targets that support SSE2 we avoid the runtime overhead of dynamic cpu feature dispatch. The removal of runtime sse2 checks also allows to simplify internal/cpu further by removing handling of the required feature option as a followup after this CL. Change-Id: I90a853a8853a405cb665497c6d1a86556947ba17 Reviewed-on: https://go-review.googlesource.com/c/go/+/344350 Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2021-08-23runtime: use RDTSCP for instruction stream serialized read of TSCMartin Möhrmann
To measure all instructions having been completed before reading the time stamp counter with RDTSC an instruction sequence that has instruction stream serializing properties which guarantee waiting until all previous instructions have been executed is needed. This does not necessary mean to wait for all stores to be globally visible. This CL aims to remove vendor specific logic for determining the instruction sequence with CPU feature flag checks that are CPU vendor independent. For intel LFENCE has the wanted properties at least since it was introduced together with SSE2 support. On AMD instruction stream serializing LFENCE is supported by setting an MSR C001_1029[1]=1 on AMD family 10h/12h/14h/15h/16h/17h processors. AMD family 0Fh/11h processors support LFENCE as serializing always. AMD plans support for this MSR and access to this bit for all future processors. Source: https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf Reading the MSR to determine LFENCE properties is not always possible or reliable (hypervisors). The Linux kernel is relying on serializing LFENCE on AMD CPUs since a commit in July 2019: https://lkml.org/lkml/2019/7/22/295 and the MSR C001_1029 to enable serialization has been set by default with the Spectre v1 mitigations. Using an MFENCE on AMD is waiting on previous instructions having been executed but in addition also flushes store buffers. To align the serialization properties without runtime detection of CPU manufacturers we can use the newer RDTSCP instruction which waits until all previous instructions have been executed. RDTSCP is available on Intel since around 2008 and on AMD CPUs since around 2006. Support for RDTSCP can be checked independently of manufacturer by checking CPUID bits. Using RDTSCP is the default in Linux to read TSC in program order when the instruction is available. https://github.com/torvalds/linux/blob/e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93/arch/x86/include/asm/msr.h#L231 Change-Id: Ifa841843b9abb2816f8f0754a163ebf01385306d Reviewed-on: https://go-review.googlesource.com/c/go/+/344429 Reviewed-by: Keith Randall <khr@golang.org> Trust: Martin Möhrmann <martin@golang.org> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Go Bot <gobot@golang.org>
2020-11-05internal/cpu: fix and cleanup ARM64 cpu feature fields and optionsMartin Möhrmann
Remove all cpu features from the ARM64 struct that are not initialized to reduce cache lines used and to avoid those features being accidentially used without actual detection if they are present. Add missing option to mask the CPUID feature. Change-Id: I94bf90c0655de1af2218ac72117ac6c52adfc289 Reviewed-on: https://go-review.googlesource.com/c/go/+/267658 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Trust: Martin Möhrmann <moehrmann@google.com>
2020-11-02runtime: improve memmove performance on arm64Jonathan Swinney
Replace the memmove implementation for moves of 17 bytes or larger with an implementation from ARM optimized software. The moves of 16 bytes or fewer are unchanged, but the registers used are updated to match the rest of the implementation. This implementation makes use of new optimizations: - software pipelined loop for large (>128 byte) moves - medium size moves (17..128 bytes) have a new implementation - address realignment when src or dst is unaligned - preference for aligned src (loads) or dst (stores) depending on CPU To support preference for aligned loads or aligned stores, a new CPU flag is added. This flag indicates that the detected micro architecture performs better with aligned loads. Some tested CPUs did not exhibit a significant difference and are left with the default behavior of realigning based on the destination address (stores). Neoverse N1 (Tested on Graviton 2) name old time/op new time/op delta Memmove/0-4 1.88ns ± 1% 1.87ns ± 1% -0.58% (p=0.020 n=10+10) Memmove/1-4 4.40ns ± 0% 4.40ns ± 0% ~ (all equal) Memmove/8-4 3.88ns ± 3% 3.80ns ± 0% -1.97% (p=0.001 n=10+9) Memmove/16-4 3.90ns ± 3% 3.80ns ± 0% -2.49% (p=0.000 n=10+9) Memmove/32-4 4.80ns ± 0% 4.40ns ± 0% -8.33% (p=0.000 n=9+8) Memmove/64-4 5.86ns ± 0% 5.00ns ± 0% -14.76% (p=0.000 n=8+8) Memmove/128-4 8.46ns ± 0% 8.06ns ± 0% -4.62% (p=0.000 n=10+10) Memmove/256-4 12.4ns ± 0% 12.2ns ± 0% -1.61% (p=0.000 n=10+10) Memmove/512-4 19.5ns ± 0% 19.1ns ± 0% -2.05% (p=0.000 n=10+10) Memmove/1024-4 33.7ns ± 0% 33.5ns ± 0% -0.59% (p=0.000 n=10+10) Memmove/2048-4 62.1ns ± 0% 59.0ns ± 0% -4.99% (p=0.000 n=10+10) Memmove/4096-4 117ns ± 1% 110ns ± 0% -5.66% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 6.41ns ± 0% 5.62ns ± 0% -12.32% (p=0.000 n=10+7) MemmoveUnalignedDst/128-4 9.40ns ± 0% 8.34ns ± 0% -11.24% (p=0.000 n=10+10) MemmoveUnalignedDst/256-4 12.8ns ± 0% 12.8ns ± 0% ~ (all equal) MemmoveUnalignedDst/512-4 20.4ns ± 0% 19.7ns ± 0% -3.43% (p=0.000 n=9+10) MemmoveUnalignedDst/1024-4 34.1ns ± 0% 35.1ns ± 0% +2.93% (p=0.000 n=9+9) MemmoveUnalignedDst/2048-4 61.5ns ± 0% 60.4ns ± 0% -1.77% (p=0.000 n=10+10) MemmoveUnalignedDst/4096-4 122ns ± 0% 113ns ± 0% -7.38% (p=0.002 n=8+10) MemmoveUnalignedSrc/64-4 7.25ns ± 1% 6.26ns ± 0% -13.64% (p=0.000 n=9+9) MemmoveUnalignedSrc/128-4 10.5ns ± 0% 9.7ns ± 0% -7.52% (p=0.000 n=10+10) MemmoveUnalignedSrc/256-4 17.1ns ± 0% 17.3ns ± 0% +1.17% (p=0.000 n=10+10) MemmoveUnalignedSrc/512-4 27.0ns ± 0% 27.0ns ± 0% ~ (all equal) MemmoveUnalignedSrc/1024-4 46.7ns ± 0% 35.7ns ± 0% -23.55% (p=0.000 n=10+9) MemmoveUnalignedSrc/2048-4 85.2ns ± 0% 61.2ns ± 0% -28.17% (p=0.000 n=10+8) MemmoveUnalignedSrc/4096-4 162ns ± 0% 113ns ± 0% -30.25% (p=0.000 n=10+10) name old speed new speed delta Memmove/4096-4 35.2GB/s ± 0% 37.1GB/s ± 0% +5.56% (p=0.000 n=10+9) MemmoveUnalignedSrc/1024-4 21.9GB/s ± 0% 28.7GB/s ± 0% +30.90% (p=0.000 n=10+10) MemmoveUnalignedSrc/2048-4 24.0GB/s ± 0% 33.5GB/s ± 0% +39.18% (p=0.000 n=10+9) MemmoveUnalignedSrc/4096-4 25.3GB/s ± 0% 36.2GB/s ± 0% +43.50% (p=0.000 n=10+7) Cortex-A72 (Graviton 1) name old time/op new time/op delta Memmove/0-4 3.06ns ± 3% 3.08ns ± 1% ~ (p=0.958 n=10+9) Memmove/1-4 8.72ns ± 0% 7.85ns ± 0% -9.98% (p=0.002 n=8+10) Memmove/8-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal) Memmove/16-4 8.29ns ± 0% 8.29ns ± 0% ~ (all equal) Memmove/32-4 8.19ns ± 2% 8.29ns ± 0% ~ (p=0.114 n=10+10) Memmove/64-4 18.3ns ± 4% 10.0ns ± 0% -45.36% (p=0.000 n=10+10) Memmove/128-4 14.8ns ± 0% 17.4ns ± 0% +17.77% (p=0.000 n=10+10) Memmove/256-4 21.8ns ± 0% 23.1ns ± 0% +5.96% (p=0.000 n=10+10) Memmove/512-4 35.8ns ± 0% 37.2ns ± 0% +3.91% (p=0.000 n=10+10) Memmove/1024-4 63.7ns ± 0% 67.2ns ± 0% +5.49% (p=0.000 n=10+10) Memmove/2048-4 126ns ± 0% 123ns ± 0% -2.38% (p=0.000 n=10+10) Memmove/4096-4 238ns ± 1% 243ns ± 1% +1.93% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 19.3ns ± 1% 12.0ns ± 1% -37.49% (p=0.000 n=10+10) MemmoveUnalignedDst/128-4 17.2ns ± 0% 17.4ns ± 0% +1.16% (p=0.000 n=10+10) MemmoveUnalignedDst/256-4 28.2ns ± 8% 29.2ns ± 0% ~ (p=0.352 n=10+10) MemmoveUnalignedDst/512-4 49.8ns ± 3% 48.9ns ± 0% ~ (p=1.000 n=10+10) MemmoveUnalignedDst/1024-4 89.5ns ± 0% 80.5ns ± 1% -10.02% (p=0.000 n=10+10) MemmoveUnalignedDst/2048-4 180ns ± 0% 127ns ± 0% -29.44% (p=0.000 n=9+10) MemmoveUnalignedDst/4096-4 347ns ± 0% 244ns ± 0% -29.59% (p=0.000 n=10+9) MemmoveUnalignedSrc/128-4 16.1ns ± 0% 21.8ns ± 0% +35.40% (p=0.000 n=10+10) MemmoveUnalignedSrc/256-4 24.9ns ± 8% 26.6ns ± 0% +6.70% (p=0.015 n=10+10) MemmoveUnalignedSrc/512-4 39.4ns ± 6% 40.6ns ± 0% ~ (p=0.352 n=10+10) MemmoveUnalignedSrc/1024-4 72.5ns ± 0% 83.0ns ± 1% +14.44% (p=0.000 n=9+10) MemmoveUnalignedSrc/2048-4 129ns ± 1% 128ns ± 1% ~ (p=0.179 n=10+10) MemmoveUnalignedSrc/4096-4 241ns ± 0% 253ns ± 1% +4.99% (p=0.000 n=9+9) Cortex-A53 (Raspberry Pi 3) name old time/op new time/op delta Memmove/0-4 11.0ns ± 0% 11.0ns ± 1% ~ (p=0.294 n=8+10) Memmove/1-4 29.6ns ± 0% 28.0ns ± 1% -5.41% (p=0.000 n=9+10) Memmove/8-4 23.5ns ± 0% 22.1ns ± 0% -6.11% (p=0.000 n=8+8) Memmove/16-4 23.7ns ± 1% 22.1ns ± 0% -6.59% (p=0.000 n=10+8) Memmove/32-4 27.9ns ± 0% 27.1ns ± 0% -3.13% (p=0.000 n=8+8) Memmove/64-4 33.8ns ± 0% 31.5ns ± 1% -6.99% (p=0.000 n=8+10) Memmove/128-4 45.6ns ± 0% 44.2ns ± 1% -3.23% (p=0.000 n=9+10) Memmove/256-4 69.3ns ± 0% 69.3ns ± 0% ~ (p=0.072 n=8+8) Memmove/512-4 127ns ± 0% 110ns ± 0% -13.39% (p=0.000 n=8+8) Memmove/1024-4 222ns ± 0% 205ns ± 1% -7.66% (p=0.000 n=7+10) Memmove/2048-4 411ns ± 0% 366ns ± 0% -10.98% (p=0.000 n=8+9) Memmove/4096-4 795ns ± 1% 695ns ± 1% -12.63% (p=0.000 n=10+10) MemmoveUnalignedDst/64-4 44.0ns ± 0% 40.5ns ± 0% -7.93% (p=0.000 n=8+8) MemmoveUnalignedDst/128-4 59.6ns ± 0% 54.9ns ± 0% -7.85% (p=0.000 n=9+9) MemmoveUnalignedDst/256-4 98.2ns ±11% 90.0ns ± 1% ~ (p=0.130 n=10+10) MemmoveUnalignedDst/512-4 161ns ± 2% 145ns ± 1% -9.96% (p=0.000 n=10+10) MemmoveUnalignedDst/1024-4 281ns ± 0% 265ns ± 0% -5.65% (p=0.000 n=9+8) MemmoveUnalignedDst/2048-4 528ns ± 0% 482ns ± 0% -8.73% (p=0.000 n=8+9) MemmoveUnalignedDst/4096-4 1.02µs ± 1% 0.92µs ± 0% -10.00% (p=0.000 n=10+8) MemmoveUnalignedSrc/64-4 42.4ns ± 1% 40.5ns ± 0% -4.39% (p=0.000 n=10+8) MemmoveUnalignedSrc/128-4 57.4ns ± 0% 57.0ns ± 1% -0.75% (p=0.048 n=9+10) MemmoveUnalignedSrc/256-4 88.1ns ± 1% 89.6ns ± 0% +1.70% (p=0.000 n=9+8) MemmoveUnalignedSrc/512-4 160ns ± 2% 144ns ± 0% -9.89% (p=0.000 n=10+8) MemmoveUnalignedSrc/1024-4 286ns ± 0% 266ns ± 1% -6.69% (p=0.000 n=8+10) MemmoveUnalignedSrc/2048-4 525ns ± 0% 483ns ± 1% -7.96% (p=0.000 n=9+10) MemmoveUnalignedSrc/4096-4 1.01µs ± 0% 0.92µs ± 1% -9.40% (p=0.000 n=8+10) Change-Id: Ia1144e9d4dfafdece6e167c5e576bf80f254c8ab Reviewed-on: https://go-review.googlesource.com/c/go/+/243357 TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: eric fang <eric.fang@arm.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-02internal/cpu: use anonymous struct for CPU feature varsTobias Klauser
Like in x/sys/cpu, use anonymous structs to declare the CPU feature vars instead of defining single-use types. Also, order the vars alphabetically. Change-Id: Iedd3ca51916e3cbb852d2aeed18b3a4c6613e778 Reviewed-on: https://go-review.googlesource.com/c/go/+/221757 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2020-02-28internal/cpu: add MIPS64x feature detectionMeng Zhuo
Change-Id: Iacdad1758aa15e4703fccef38c08ecb338b95fd7 Reviewed-on: https://go-review.googlesource.com/c/go/+/200579 Run-TryBot: Meng Zhuo <mengzhuo1203@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-09-08all: fix typosAinar Garipov
Use the following (suboptimal) script to obtain a list of possible typos: #!/usr/bin/env sh set -x git ls-files |\ grep -e '\.\(c\|cc\|go\)$' |\ xargs -n 1\ awk\ '/\/\// { gsub(/.*\/\//, ""); print; } /\/\*/, /\*\// { gsub(/.*\/\*/, ""); gsub(/\*\/.*/, ""); }' |\ hunspell -d en_US -l |\ grep '^[[:upper:]]\{0,1\}[[:lower:]]\{1,\}$' |\ grep -v -e '^.\{1,4\}$' -e '^.\{16,\}$' |\ sort -f |\ uniq -c |\ awk '$1 == 1 { print $2; }' Then, go through the results manually and fix the most obvious typos in the non-vendored code. Change-Id: I3cb5830a176850e1a0584b8a40b47bde7b260eae Reviewed-on: https://go-review.googlesource.com/c/go/+/193848 Reviewed-by: Robert Griesemer <gri@golang.org>
2019-04-30internal/cpu: add detection for the new ECDSA and EDDSA capabilities on s390xbill_ofarrell
This CL will check for the Message-Security-Assist Extension 9 facility which enables the KDSA instruction. Change-Id: I659aac09726e0999ec652ef1f5983072c8131a48 Reviewed-on: https://go-review.googlesource.com/c/go/+/174529 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-02-28internal/cpu: change s390x API to match x/sys/cpuMichael Munday
This CL changes the internal/cpu API to more closely match the public version in x/sys/cpu (added in CL 163003). This will make it easier to update the dependencies of vendored code. The most prominent renaming is from VE1 to VXE for the vector-enhancements facility 1. VXE is the mnemonic used for this facility in the HWCAP vector. Change-Id: I922d6c8bb287900a4bd7af70567e22eac567b5c1 Reviewed-on: https://go-review.googlesource.com/c/164437 Reviewed-by: Martin Möhrmann <moehrmann@google.com> Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-12-05crypto/elliptic: utilize faster z14 multiply/square instructions (when ↵bill_ofarrell
available) In the s390x assembly implementation of NIST P-256 curve, utilize faster multiply/square instructions introduced in the z14. These new instructions are designed for crypto and are constant time. The algorithm is unchanged except for faster multiplication when run on a z14 or later. On z13, the original mutiplication (also constant time) is used. P-256 performance is critical in many applications, such as Blockchain. name old time new time delta BaseMultP256 24396 ns/op 21564 ns/op 1.13x ScalarMultP256 87546 ns/op 72813 ns/op. 1.20x Change-Id: I7e6d8b420fac56d5f9cc13c9423e2080df854bac Reviewed-on: https://go-review.googlesource.com/c/146022 Reviewed-by: Michael Munday <mike.munday@ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Michael Munday <mike.munday@ibm.com>
2018-11-14internal/cpu: move GODEBUGCPU options into GODEBUGMartin Möhrmann
Change internal/cpu feature configuration to use GODEBUG=cpu.feature1=value,cpu.feature2=value... instead of GODEBUGCPU=feature1=value,feature2=value... . This is not a backwards compatibility breaking change since GODEBUGCPU was introduced in go1.11 as an undocumented compiler experiment. Fixes #28757 Change-Id: Ib21b3fed2334baeeb061a722ab1eb513d1137e87 Reviewed-on: https://go-review.googlesource.com/c/149578 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-29internal/cpu: remove unused and not required ppc64(le) feature detectionMartin Möhrmann
Minimum Go requirement for ppc64(le) architecture support is POWER8. https://github.com/golang/go/wiki/MinimumRequirements#ppc64-big-endian Reduce CPU features supported in internal/cpu to those needed to test minimum requirements and cpu feature kernel support for ppc64(le). Currently no internal/cpu feature variables are used to guard code from using unsupported instructions. The IsPower9 feature variable and detection is kept as it will soon be used to guard code execution. Reducing the set of detected CPU features for ppc64(le) makes implementing Go support for new operating systems easier as CPU feature detection for ppc64(le) needs operating system support (e.g. hwcap on Linux and getsystemcfg syscall on AIX). Change-Id: Ic4c17b31610970e481cd139c657da46507391d1d Reviewed-on: https://go-review.googlesource.com/c/145117 Run-TryBot: Martin Möhrmann <martisch@uos.de> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-24internal/cpu: add options and warnings for required cpu featuresMartin Möhrmann
Updates #27218 Change-Id: I8603f3a639cdd9ee201c4f1566692e5b88877fc4 Reviewed-on: https://go-review.googlesource.com/c/144107 Run-TryBot: Martin Möhrmann <martisch@uos.de> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-15internal/cpu: add invalid option warnings and support to enable cpu featuresMartin Möhrmann
This CL adds the ability to enable the cpu feature FEATURE by specifying FEATURE=on in GODEBUGCPU. Syntax support to enable cpu features is useful in combination with a preceeding all=off to disable all but some specific cpu features. Example: GODEBUGCPU=all=off,sse3=on This CL implements printing of warnings for invalid GODEBUGCPU settings: - requests enabling features that are not supported with the current CPU - specifying values different than 'on' or 'off' for a feature - settings for unkown cpu feature names Updates #27218 Change-Id: Ic13e5c4c35426a390c50eaa4bd2a408ef2ee21be Reviewed-on: https://go-review.googlesource.com/c/141800 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-10-15internal/cpu: expose ARM feature flags for FMAAkhil Indurti
This change exposes feature flags needed to implement an FMA intrinsic on ARM CPUs via auxv's HWCAP bits. Specifically, it exposes HasVFPv4 to detect if an ARM processor has the fourth version of the vector floating point unit. The relevant instruction for this CL is VFMA, emitted in Go as FMULAD. Updates #26630. Change-Id: Ibbc04fb24c2b4d994f93762360f1a37bc6d83ff7 Reviewed-on: https://go-review.googlesource.com/c/126315 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2018-10-12internal/cpu: use 'off' for disabling cpu capabilities instead of '0'Martin Möhrmann
Updates #27218 Change-Id: I4ce20376fd601b5f958d79014af7eaf89e9de613 Reviewed-on: https://go-review.googlesource.com/c/141818 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-12internal/cpu: enable support for GODEBUGCPU in non-experimental buildsMartin Möhrmann
Enabling GODEBUGCPU without the need to set GOEXPERIMENT=debugcpu enables trybots and builders to run tests for GODEBUGCPU features in upcoming CLs that will implement the new syntax and features for non-experimental GODEBUGCPU support from proposal golang.org/issue/27218. Updates #27218 Change-Id: Icc69e51e736711a86b02b46bd441ffc28423beba Reviewed-on: https://go-review.googlesource.com/c/141817 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-24runtime: replace sys.CacheLineSize by corresponding internal/cpu const and varsMartin Möhrmann
sys here is runtime/internal/sys. Replace uses of sys.CacheLineSize for padding by cpu.CacheLinePad or cpu.CacheLinePadSize. Replace other uses of sys.CacheLineSize by cpu.CacheLineSize. Remove now unused sys.CacheLineSize. Updates #25203 Change-Id: I1daf410fe8f6c0493471c2ceccb9ca0a5a75ed8f Reviewed-on: https://go-review.googlesource.com/126601 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-08-24internal/cpu: add a CacheLinePadSize constantMartin Möhrmann
The new constant CacheLinePadSize can be used to compute best effort alignment of structs to cache lines. e.g. the runtime can use this in the locktab definition: var locktab [57]struct { l spinlock pad [cpu.CacheLinePadSize - unsafe.Sizeof(spinlock{})]byte } Change-Id: I86f6fbfc5ee7436f742776a7d4a99a1d54ffccc8 Reviewed-on: https://go-review.googlesource.com/131237 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-08-24runtime: move arm hardware division support detection to internal/cpuMartin Möhrmann
Assumes mandatory VFP and VFPv3 support to be present by default but not IDIVA if AT_HWCAP is not available. Adds GODEBUGCPU options to disable the use of code paths in the runtime that use hardware support for division. Change-Id: Ida02311bd9b9701de3fc120697e69445bf6c0853 Reviewed-on: https://go-review.googlesource.com/114826 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-08-21runtime: don't use linkname to refer to internal/cpuIan Lance Taylor
The runtime package already imports the internal/cpu package, so there is no reason for it to use go:linkname comments to refer to internal/cpu functions and variables. Since internal/cpu is internal, we can just export those names. Removing the obscurity of go:linkname outweighs the minor additional complexity added to the internal/cpu API. Change-Id: Id89951b7f3fc67cd9bce67ac6d01d44a647a10ad Reviewed-on: https://go-review.googlesource.com/128755 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2018-08-20internal/cpu: add and use cpu.CacheLinePad for padding structsMartin Möhrmann
Add a CacheLinePad struct type to internal/cpu that has a size of CacheLineSize. This can be used for padding structs in order to avoid false sharing. Updates #25203 Change-Id: Icb95ae68d3c711f5f8217140811cad1a1d5be79a Reviewed-on: https://go-review.googlesource.com/116276 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-06-11crypto, internal/cpu: fix s390x AES feature detection and update SHA ↵Michael Munday
implementations Hardware AES support in Go on s390x currently requires ECB, CBC and CTR modes be available. It also requires that either the GHASH or GCM facilities are available. The existing checks missed some of these constraints. While we're here simplify the cpu package on s390x, moving masking code out of assembly and into Go code. Also, update SHA-{1,256,512} implementations to use the cpu package since that is now trivial. Finally I also added a test for internal/cpu on s390x which loads /proc/cpuinfo and checks it against the flags set by internal/cpu. Updates #25822 for changes to vet whitelist. Change-Id: Iac4183f571643209e027f730989c60a811c928eb Reviewed-on: https://go-review.googlesource.com/114397 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-26cmd/compile,go/build,internal/cpu: gofmtElias Naur
I don't know why these files were not formatted. Perhaps because their changes came from Github PRs? Change-Id: Ida8d7b9a36f0d1064caf74ca1911696a247a9bbe Reviewed-on: https://go-review.googlesource.com/114824 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-23crypto/{aes,internal/cipherhw,tls}: use common internal/cpu in place of cipherhwAnit Gandhi
When the internal/cpu package was introduced, the AES package still used the custom crypto/internal/cipherhw package for amd64 and s390x. This change removes that package entirely in favor of directly referencing the cpu feature flags set and exposed by the internal/cpu package. In addition, 5 new flags have been added to the internal/cpu s390x struct for detecting various cipher message (KM) features. Change-Id: I77cdd8bc1b04ab0e483b21bf1879b5801a4ba5f4 GitHub-Last-Rev: a611e3ecb1f480dcbfce3cb0c8c9e4058f56c1a4 GitHub-Pull-Request: golang/go#24766 Reviewed-on: https://go-review.googlesource.com/105695 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22internal/cpu: add experiment to disable CPU features with GODEBUGCPUMartin Möhrmann
Needs the go compiler to be build with GOEXPERIMENT=debugcpu to be active. The GODEBUGCPU environment variable can be used to disable usage of specific processor features in the Go standard library. This is useful for testing and benchmarking different code paths that are guarded by internal/cpu variable checks. Use of processor features can not be enabled through GODEBUGCPU. To disable usage of AVX and SSE41 cpu features on GOARCH amd64 use: GODEBUGCPU=avx=0,sse41=0 The special "all" option can be used to disable all options: GODEBUGCPU=all=0 Updates #12805 Updates #15403 Change-Id: I699c2e6f74d98472b6fb4b1e5ffbf29b15697aab Reviewed-on: https://go-review.googlesource.com/91737 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-04-10internal/cpu,runtime: call cpu.initialize before alginitMeng Zhuo
runtime.alginit needs runtime/support_{aes,ssse3,sse41} feature flag to init aeshash function but internal/cpu.init not be called yet. This CL will call internal/cpu.initialize before runtime.alginit, so that we can move all cpu features related code to internal/cpu. Change-Id: I00b8e403ace3553f8c707563d95f27dade0bc853 Reviewed-on: https://go-review.googlesource.com/104636 Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-03-31internal/cpu: update arm64 cpu featuresMeng Zhuo
Follow the Linux Kernel 4.15 Add Arm64 minimalFeatures test Change-Id: I1c092521ba59b1e4096c27786fa0464f9ef7d311 Reviewed-on: https://go-review.googlesource.com/103636 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-03-02internal/bytealg: move IndexByte asssembly to the new bytealg packageKeith Randall
Move the IndexByte function from the runtime to a new bytealg package. The new package will eventually hold all the optimized assembly for groveling through byte slices and strings. It seems a better home for this code than randomly keeping it in runtime. Once this is in, the next step is to move the other functions (Compare, Equal, ...). Update #19792 This change seems complicated enough that we might just declare "not worth it" and abandon. Opinions welcome. The core assembly is all unchanged, except minor modifications where the code reads cpu feature bits. The wrapper functions have been cleaned up as they are now actually checked by vet. Change-Id: I9fa75bee5d85db3a65b3fd3b7997e60367523796 Reviewed-on: https://go-review.googlesource.com/98016 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-14internal/cpu: detect cpu features in internal/cpu packageFangming.Fang
change hash/crc32 package to use cpu package instead of using runtime internal variables to check crc32 instruction Change-Id: I8f88d2351bde8ed4e256f9adf822a08b9a00f532 Reviewed-on: https://go-review.googlesource.com/76490 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
2017-11-02cmd/internal/obj/x86: add ADX extensionIlya Tocar
Add support for ADX cpuid bit detection and all instructions, implied by that bit (ADOX/ADCX). They are useful for rsa and math/big in general. Change-Id: Idaa93303ead48fd18b9b3da09b3e79de2f7e2193 Reviewed-on: https://go-review.googlesource.com/74850 Run-TryBot: Ilya Tocar <ilya.tocar@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-09-11internal/cpu: add support for x86 FMA cpu feature detectionMartin Möhrmann
Change-Id: I88ea39de01b07e6afa1c187c0df6a258da4aa8e4 Reviewed-on: https://go-review.googlesource.com/62650 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-08-23all: fix easy-to-miss typosAgniva De Sarker
Using the wonderful https://github.com/client9/misspell tool. Change-Id: Icdbc75a5559854f4a7a61b5271bcc7e3f99a1a24 Reviewed-on: https://go-review.googlesource.com/57851 Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-14runtime, internal/cpu: CPU capabilities detection for ppc64xCarlos Eduardo Seo
This change replaces the current runtime capabilities check for ppc64x with the new internal/cpu package. It also adds support for the new POWER9 ISA and capabilities. Updates #15403 Change-Id: I5b64a79e782f8da3603e5529600434f602986292 Reviewed-on: https://go-review.googlesource.com/53830 Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2017-05-10internal/cpu: new package to detect cpu featuresMartin Möhrmann
Implements detection of x86 cpu features that are used in the go standard library. Changes all standard library packages to use the new cpu package instead of using runtime internal variables to check x86 cpu features. Updates: #15403 Change-Id: I2999a10cb4d9ec4863ffbed72f4e021a1dbc4bb9 Reviewed-on: https://go-review.googlesource.com/41476 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>