aboutsummaryrefslogtreecommitdiff
path: root/src/internal/cpu
AgeCommit message (Collapse)Author
2026-01-28cmd/compile, simd: capture VAES instructions and fix AVX512VAES featureJunyang Shao
The code previously filters out VAES-only instructions, this CL added them back. This CL added the VAES feature check following the Intel xed data: XED_ISA_SET_VAES: vaes.7.0.ecx.9 # avx.1.0.ecx.28 This CL also found out that the old AVX512VAES feature check is not checking the correct bits, it also fixes it: XED_ISA_SET_AVX512_VAES_128: vaes.7.0.ecx.9 aes.1.0.ecx.25 avx512f.7.0.ebx.16 avx512vl.7.0.ebx.31 XED_ISA_SET_AVX512_VAES_256: vaes.7.0.ecx.9 aes.1.0.ecx.25 avx512f.7.0.ebx.16 avx512vl.7.0.ebx.31 XED_ISA_SET_AVX512_VAES_512: vaes.7.0.ecx.9 aes.1.0.ecx.25 avx512f.7.0.ebx.16 It restricts to the most strict common set - includes avx512vl for even 512-bits although it doesn't requires it. Change-Id: I4e2f72b312fd2411589fbc12f9ee5c63c09c2e9a Reviewed-on: https://go-review.googlesource.com/c/go/+/738500 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-01-13simd/archsimd: 128- and 256-bit FMA operations do not require AVX-512Austin Clements
Currently, all FMA operations are marked as requiring AVX512, even on smaller vector widths. This is happening because the narrower FMA operations are marked as extension "FMA" in the XED. Since this extension doesn't start with "AVX", we filter them out very early in the XED process. However, this is just a quirk of naming: the FMA feature depends on the AVX feature, so it is part of AVX, even if it doesn't say so on the tin. Fix this by accepting the FMA extension and adding FMA to the table of CPU features. We also tweak internal/cpu slightly do it correctly enforces that the logical FMA feature depends on both the FMA and AVX CPUID flags. This actually *deletes* a lot of generated code because we no longer need the AVX-512 encoding of these 128- and 256-bit operations. Change-Id: I744a18d0be888f536ac034fe88b110347622be7e Reviewed-on: https://go-review.googlesource.com/c/go/+/736160 Auto-Submit: Austin Clements <austin@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-on: https://go-review.googlesource.com/c/go/+/736201 Reviewed-by: Austin Clements <austin@google.com>
2025-12-18internal/cpu: repair VNNI feature checkDavid Chase
This is a pain to test. Also the original test was never executed, because it was wrong. It looks like processors that might lack this features include Intel 11th generation and AMD Zen 4. These might or might not have bit 2 set in the 7th cpuid "leaf" (SM4) which is what the incorrect test was checking; the bug is triggered by ^VNNI & SM4. Apparently the SM4 bit is not usually set, else we would have seen a test failure. The "Lion Cove" microarchitecture (Arrow Lake, Lunar Lake) appears to trigger this problem, it's not clear if there are others. It was hard to verify this from online information. Fixes #76881. Change-Id: I21be6b4f47134d81e89799b0f06f89fcb6563264 Reviewed-on: https://go-review.googlesource.com/c/go/+/731240 TryBot-Bypass: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-09-30[dev.simd] cmd/compile, simd: add AES instructionsJunyang Shao
AVXAES is a composite feature set, Intel did listed it as "AVXAES" in the XED data instead of separating them. The tests will be in the next CL. Change-Id: I89c97261f2228b2fdafb48f63e82ef6239bdd5ca Reviewed-on: https://go-review.googlesource.com/c/go/+/706055 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-09-03[dev.simd] all: merge master (4c4cefc) into dev.simdCherry Mui
Merge List: + 2025-09-03 4c4cefc19a cmd/gofmt: simplify logic to process arguments + 2025-09-03 925a3cdcd1 unicode/utf8: make DecodeRune{,InString} inlineable + 2025-09-03 3e596d448f math: rename Modf parameter int to integer + 2025-09-02 2a7f1d47b0 runtime: use one more address bit for tagged pointers + 2025-09-02 b09068041a cmd/dist: run racebench tests only in longtest mode + 2025-09-02 355370ac52 runtime: add comment for concatstring2 + 2025-09-02 1eec830f54 go/doc: linkify interface methods + 2025-08-31 7bba745820 cmd/compile: use generated loops instead of DUFFZERO on loong64 + 2025-08-31 882335e2cb cmd/internal/obj/loong64: add LDPTR.{W/D} and STPTR.{W/D} instructions support + 2025-08-31 d4b17f5869 internal/runtime/atomic: reset wrong jump target in Cas{,64} on loong64 + 2025-08-31 6a08e80399 net/http: skip redirecting in ServeMux when URL path for CONNECT is empty + 2025-08-29 8bcda6c79d runtime/race: add race detector support for linux/riscv64 + 2025-08-29 8377adafc5 cmd/cgo: split loadDWARF into two parts + 2025-08-29 a7d9d5a80a cmd/cgo: move typedefs and typedefList out of Package + 2025-08-29 1d459c4357 all: delete more windows/arm remnants + 2025-08-29 27ce6e4e26 cmd/compile: remove sign extension before MULW on riscv64 + 2025-08-29 84b070bfb1 cmd/compile/internal/ssa: make oneBit function generic + 2025-08-29 fe42628dae internal/cpu: inline DebugOptions + 2025-08-29 94b7d519bd net: update document on limitation of iprawsock on Windows + 2025-08-29 ba9e1ddccf testing: allow specify temp dir by GOTMPDIR environment variable + 2025-08-29 9f6936b8da cmd/link: disallow linkname of runtime.addmoduledata + 2025-08-29 89d41d254a bytes, strings: speed up TrimSpace + 2025-08-29 38204e0872 testing/synctest: call out common issues with tests + 2025-08-29 252c901125 os,syscall: pass file flags to CreateFile on Windows + 2025-08-29 53515fb0a9 crypto/tls: use hash.Cloner + 2025-08-28 13bb48e6fb go/constant: fix complex != unknown comparison + 2025-08-28 ba1109feb5 net: remove redundant cgoLookupCNAME return parameter + 2025-08-28 f74ed44ed9 net/http/httputil: remove redundant pw.Close() call in DumpRequestOut + 2025-08-28 a9689d2e0b time: skip TestLongAdjustTimers in short mode on single CPU systems + 2025-08-28 ebc763f76d syscall: only get parent PID if SysProcAttr.Pdeathsig is set + 2025-08-28 7f1864b0a8 strings: remove redundant "runs" from string.Fields docstring + 2025-08-28 90c21fa5b6 net/textproto: eliminate some bounds checks + 2025-08-27 e47d88beae os: return nil slice when ReadDir is used with a file on file_windows + 2025-08-27 6b837a64db cmd/internal/obj/loong64: simplify buildop + 2025-08-27 765905e3bd debug/elf: don't panic if symtab too small + 2025-08-27 2ee4b31242 net/http: Ensure that CONNECT proxied requests respect MaxResponseHeaderBytes + 2025-08-27 b21867b1a2 net/http: require exact match for CrossSiteProtection bypass patterns + 2025-08-27 d19e377f6e cmd/cgo: make it safe to run gcc in parallel + 2025-08-27 49a2f3ed87 net: allow zero value destination address in WriteMsgUDPAddrPort + 2025-08-26 afc51ed007 internall/poll: remove bufs field from Windows' poll.operation + 2025-08-26 801b74eb95 internal/poll: remove rsa field from Windows' poll.operation + 2025-08-26 fa18c547cd syscall: sort Windows env block in StartProcess + 2025-08-26 bfd130db02 internal/poll: don't use stack-allocated WSAMsg parameters + 2025-08-26 dae9e456ae runtime: identify virtual memory layout for riscv64 + 2025-08-25 25c2d4109f math: use Trunc to implement Modf + 2025-08-25 4e05a070c4 math: implement IsInf using Abs + 2025-08-25 1eed4f32a0 math: optimize Signbit implementation slightly + 2025-08-25 bd71b94659 cmd/compile/internal: optimizing add+sll rule using ALSLV instruction on loong64 + 2025-08-25 ea55ca3600 runtime: skip doInit of plugins in runtime.main + 2025-08-25 9ae2f1fb57 internal/trace: skip async preempt off tests on low end systems + 2025-08-25 bbd5342a62 net: fix cgoResSearch + 2025-08-25 ed7f804775 os: set full name for Roots created with Root.OpenRoot + 2025-08-25 a21249436b internal/poll: use fdMutex to provide read/write locking on Windows + 2025-08-24 44c5956bf7 test/codegen: add Mul2 and DivPow2 test for loong64 + 2025-08-24 0aa8019e94 test/codegen: add Mul* test for loong64 + 2025-08-24 83420974b7 test/codegen: add sqrt* abs and copysign test for loong64 + 2025-08-23 f2db0dca0b net/http/httptest: redirect example.com requests to server + 2025-08-22 d86ec92499 internal/syscall/windows: increase internal Windows O_ flags values + 2025-08-22 9d3f7fda70 crypto/tls: fix quic comment typo + 2025-08-22 78a05c541f internal/poll: don't pass non-nil WSAMsg.Name with 0 namelen on windows + 2025-08-22 52c3f73fda runtime/metrics: improve doc + 2025-08-22 a076f49757 os: fix Root.MkdirAll to handle race of directory creation + 2025-08-22 98238fd495 all: delete remaining windows/arm code + 2025-08-21 1ad30844d9 cmd/asm: process forward jump to PCALIGN + 2025-08-21 13c082601d internal/poll: permit nil destination address in WriteMsg{Inet4,Inet6} + 2025-08-21 9b0a507735 runtime: remove remaining windows/arm files and comments + 2025-08-21 1843f1e9c0 cmd/compile: use zero register instead of specialized *zero instructions on loong64 + 2025-08-21 e0870a0a12 cmd/compile: simplify zerorange on loong64 + 2025-08-21 fb8bbe46d5 cmd/compile/internal/ssa: eliminate unnecessary extension operations + 2025-08-21 9632ba8160 cmd/compile: optimize some patterns into revb2h/revb4h instruction on loong64 + 2025-08-21 8dcab6f450 syscall: simplify execve handling on libc platforms + 2025-08-21 ba840c1bf9 cmd/compile: deduplication in the source code generated by mknode + 2025-08-21 fa706ea50f cmd/compile: optimize rule (x + x) << c to x << c+1 on loong64 + 2025-08-21 ffc85ee1f1 cmd/internal/objabi,cmd/link: add support for additional riscv64 relocations Change-Id: I3896f74b1a3cc0a52b29ca48767bb0ba84620f71
2025-09-02[dev.simd] internal/cpu: report AVX1 and 2 as supported on macOS 15 Rosetta 2Cherry Mui
Apparently, on macOS 15 or newer, Rosetta 2 supports AVX1 and 2. However, neither CPUID nor the Apple-recommended sysctl says it has AVX. If AVX is used without checking the CPU feature, it may run fine without SIGILL, but the runtime doesn't know AVX is available therefore save and restore its states. This may lead to value corruption. Check if we are running under Rosetta 2 on macOS 15 or newer. If so, report AVX1 and 2 as supported. Change-Id: Ib981379405b1ae28faa378f051096827d760a4cc Reviewed-on: https://go-review.googlesource.com/c/go/+/700055 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com>
2025-08-29internal/cpu: inline DebugOptionsTobias Klauser
internal/cpu.DebugOptions is only ever set in runtime.cpuinit on unix-like platforms. DebugOptions itself is only used in MustHaveDebugOptionsSupport, so inline the GOOS check there. Change-Id: I6a35d6b8afcdadfc59585258002f53c20026116c Reviewed-on: https://go-review.googlesource.com/c/go/+/699775 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Florian Lehner <lehner.florian86@gmail.com>
2025-08-14[dev.simd] all: merge master (924fe98) into dev.simdCherry Mui
Conflicts: - src/cmd/compile/internal/amd64/ssa.go - src/cmd/compile/internal/ssa/expand_calls.go - src/cmd/compile/internal/ssagen/ssa.go - src/internal/buildcfg/exp.go - src/internal/cpu/cpu.go - src/internal/cpu/cpu_x86.go - src/runtime/mkpreempt.go - src/runtime/preempt_amd64.go - src/runtime/preempt_amd64.s Merge List: + 2025-08-14 924fe98902 cmd/internal/obj/riscv: add encoding for compressed riscv64 instructions + 2025-08-13 320df537cc cmd/compile: emit classify instructions for infinity tests on riscv64 + 2025-08-13 ca66f907dd cmd/compile: use generated loops instead of DUFFCOPY on amd64 + 2025-08-13 4b1800e476 encoding/json/v2: cleanup error constructors + 2025-08-13 af8870708b encoding/json/v2: fix incorrect marshaling of NaN in float64 any + 2025-08-13 0a75e5a07b encoding/json/v2: fix wrong type with cyclic marshal error in map[string]any + 2025-08-13 de9b6f9875 cmd/pprof: update vendored github.com/google/pprof + 2025-08-13 674c5f0edd os/exec: fix incorrect expansion of ".." in LookPath on plan9 + 2025-08-13 9bbea0f21a cmd/compile: during regalloc, fixedreg values are always available + 2025-08-13 08eef97500 runtime/trace: fix documentation typo + 2025-08-13 2fe5d51d04 internal/trace: fix wrong scope for Event.Range or EvGCSweepActive + 2025-08-13 9fcb87c352 cmd/compile: teach prove about len's & cap's max based on the element size + 2025-08-13 9763ece873 cmd/compile: absorb NEGV into branch on loong64 + 2025-08-13 f10a82b76f all: update vendored dependencies [generated] + 2025-08-13 3bea95b277 cmd/link/internal/ld: remove OpenBSD buildid workaround + 2025-08-12 90b7d7aaa2 cmd/compile/internal: optimize multiplication use new operation 'ADDshiftLLV' on loong64 + 2025-08-12 1b263fc604 runtime/race: restore previous version of LLVM TSAN on macOS + 2025-08-12 b266318cf7 cmd/compile/internal/ssa: use BEQ/BNE to optimize the combination of XOR and EQ/NE on loong64 + 2025-08-12 adbf59525c internal/runtime/gc/scan: avoid -1 index when cache sizes unavailable + 2025-08-12 4e182db5fc Revert "cmd/compile: use generated loops instead of DUFFCOPY on amd64" + 2025-08-12 d2b3c1a504 internal/trace: clarify which StateTransition events have stacks + 2025-08-12 f63e12d0e0 internal/trace: fix Sync.ClockSnapshot comment + 2025-08-12 8e317da77d internal/trace: remove unused StateTransition.id field + 2025-08-12 f67d8ff34a internal/trace/tracev2: adjust comment for consistency + 2025-08-12 fe4d445c36 internal/trace/tracev2: fix EvSTWBegin comment to include stack ID + 2025-08-12 750789fab7 internal/trace/internal/testgen: fix missing stacks nframes arg + 2025-08-12 889ab74169 internal/runtime/gc/scan: import scan kernel from gclab [green tea] + 2025-08-12 182336bf05 net/http: fix data race in client + 2025-08-12 f04421ea9a cmd/compile: soften test for 74788 + 2025-08-12 28aa529c99 cmd/compile: use generated loops instead of DUFFZERO on arm64 + 2025-08-12 ec9e1176c3 cmd/compile: use generated loops instead of DUFFCOPY on amd64 + 2025-08-12 d0a64f7969 Revert "cmd/compile/internal/ssa: Use transitive properties for len/cap" + 2025-08-12 00a7bdcb55 all: delete aliastypeparams GOEXPERIMENT + 2025-08-11 74421a305b Revert "cmd/compile: allow multi-field structs to be stored directly in interfaces" + 2025-08-11 c31359138c Revert "cmd/compile: allow StructSelect [x] of interface data fields for x>0" + 2025-08-11 7248995b60 Revert "cmd/compile: allow more args in StructMake folding rule" + 2025-08-11 caf9fc3ccd Revert "reflect: handle zero-sized fields of directly-stored structures correctly" + 2025-08-11 ce3f3e2ae7 cmd/link/internal/ld, internal/syscall/unix: use posix_fallocate on netbsd + 2025-08-11 3dbef65bf3 database/sql: allow drivers to override Scan behavior + 2025-08-11 2b804abf07 net: context aware Dialer.Dial functions + 2025-08-11 6abfe7b0de cmd/dist: require Go 1.24.6 as minimum bootstrap toolchain + 2025-08-11 691af6ca28 encoding/json: fix Indent trailing whitespace regression in goexperiment.jsonv2 + 2025-08-11 925149da20 net/http: add example for CrossOriginProtection + 2025-08-11 cf4af0b2f3 encoding/json/v2: fix UnmarshalDecode regression with EOF + 2025-08-11 b096ddb9ea internal/runtime/maps: loop invariant code motion with h2(hash) by hand + 2025-08-11 a2431776eb net, os, file/filepath, syscall: use slices.Equal in tests + 2025-08-11 a7f05b38f7 cmd/compile: convert branch with zero to more optimal branch zero on loong64 + 2025-08-11 1718828c81 internal/sync: warn about incorrect unsafe usage in HashTrieMap + 2025-08-11 084c0f8494 cmd/compile: allow InlMark operations to be speculatively executed + 2025-08-10 a62f72f7a7 cmd/compile/internal/ssa: optimise more branches with SGTconst/SGTUconst on loong64 + 2025-08-08 fbac94a799 internal/sync: rename Store parameter from old to new + 2025-08-08 317be4cfeb cmd/compile/internal/staticinit: remove deadcode + 2025-08-08 bce5601cbb cmd/go: fix fips doc link + 2025-08-08 777d76c4f2 text/template: use sync.OnceValue for builtinFuncs + 2025-08-08 0201524c52 math: remove redundant infinity tests + 2025-08-08 dcc77f9e3c cmd/go: fix get -tool when multiple packages are provided + 2025-08-08 c7b85e9ddc all: update blog link + 2025-08-08 a8dd771e13 crypto/tls: check if quic conn can send session ticket + 2025-08-08 bdb2d50fdf net: fix WriteMsgUDPAddrPort addr handling on IPv4 sockets + 2025-08-08 768c51e368 internal/runtime/maps: remove unused var bitsetDeleted + 2025-08-08 b3388569a1 reflect: handle zero-sized fields of directly-stored structures correctly + 2025-08-08 d83b16fcb8 internal/bytealg: vector implementation of compare for riscv64 + 2025-08-07 dd3abf6bc5 internal/bytealg: optimize Index/IndexString on loong64 + 2025-08-07 73ff6d1480 cmd/internal/obj/loong64: change the immediate range of ALSL{W/WU/V} + 2025-08-07 f3606b0825 cmd/compile/internal/ssa: fix typo in LOONG64Ops.go comment + 2025-08-07 ee7bb8969a cmd/internal/obj/loong64: add support for FSEL instruction + 2025-08-07 1f7ffca171 time: skip TestLongAdjustTimers on plan9 (too slow) + 2025-08-06 8282b72d62 runtime/race: update darwin race syso + 2025-08-06 dc54d7b607 all: remove support for windows/arm + 2025-08-06 e0a1ea431c cmd/compile: make panicBounds stack frame smaller on ppc64 + 2025-08-06 2747f925dd debug/macho: support reading imported symbols without LC_DYSYMTAB + 2025-08-06 025d36917c cmd/internal/testdir: pass -buildid to link command + 2025-08-06 f53dcb6280 cmd/internal/testdir: unify link command + 2025-08-06 a3895fe9f1 database/sql: avoid closing Rows while scan is in progress + 2025-08-06 608e9fac90 go/types, types2: flip on position tracing + 2025-08-06 72e8237cc1 cmd/compile: allow more args in StructMake folding rule + 2025-08-06 3406a617d9 internal/bytealg: vector implementation of indexbyte for riscv64 + 2025-08-06 75ea2d05c0 internal/bytealg: vector implementation of equal for riscv64 + 2025-08-05 17a8be7117 crypto/sha512: use const table for key loading on loong64 + 2025-08-05 dda9d780e2 crypto/sha256: use const table for key loading on loong64 + 2025-08-05 5defe8ebb3 internal/chacha8rand: replace WORD with instruction VMOVQ + 2025-08-05 4c7362e41c cmd/internal/obj/loong64: add new instructions ALSL{W/WU/V} for loong64 + 2025-08-05 a552737418 cmd/compile: fold negation into multiplication on loong64 + 2025-08-05 e1fd4faf91 runtime: fix godoc comment for inVDSOPage + 2025-08-05 bcd25c79aa cmd/compile: allow StructSelect [x] of interface data fields for x>0 + 2025-08-05 b0945a54b5 cmd/dist, internal/platform: mark freebsd/riscv64 broken + 2025-08-05 55d961b202 runtime: save AVX2 and AVX-512 state on asynchronous preemption + 2025-08-05 af0c4fe2ca runtime: save scalar registers off stack in amd64 async preemption + 2025-08-05 e73afaae69 internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512" + 2025-08-05 cef381ba60 runtime: eliminate global state in mkpreempt.go + 2025-08-05 c0025d5e0b go/parser: correct comment in expectedErrors + 2025-08-05 4ee0df8c46 cmd: remove dead code + 2025-08-05 a2c45f0eb1 runtime: test VDSO symbol hash values + 2025-08-05 cd55f86b8d cmd/compile: allow multi-field structs to be stored directly in interfaces + 2025-08-05 21ab0128b6 cmd/compile: remove support for old-style bounds check calls + 2025-08-05 802d056c78 cmd/compile: move ppc64 over to new bounds check strategy + 2025-08-05 a3295df873 cmd/compile/internal/ssa: Use transitive properties for len/cap + 2025-08-05 bd082857a5 doc: fix typo in go memory model doc + 2025-08-05 2b622b05a9 cmd/compile: remove isUintXPowerOfTwo functions + 2025-08-05 72147ffa75 cmd/compile: simplify isUintXPowerOfTwo implementation + 2025-08-05 26da1199eb cmd/compile: make isUint{32,64}PowerOfTwo implementations clearer + 2025-08-05 5ab9f23977 cmd/compile, runtime: add checkptr instrumentation for unsafe.Add + 2025-08-05 fcc036f03b cmd/compile: optimise float <-> int register moves on riscv64 Change-Id: Ie94f29d9b0cc14a52a536866f5abaef27b5c52d7
2025-08-12internal/runtime/gc/scan: import scan kernel from gclab [green tea]Michael Anthony Knyszek
This change imports the AVX512 GC scanning kernel from CL 593938 into a new package, internal/runtime/gc/scan. Credit to Austin Clements for most of this work. I did some cleanup, added support for more size classes to the expanders, and added more testing. I also restructured the code to make it easier and clearer to add new scan kernels for new architectures. For #73581. Change-Id: I76bcbc889fa6cad73ba0084620fae084a5912e6b Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64_avx512,gotip-linux-amd64_avx512-greenteagc Reviewed-on: https://go-review.googlesource.com/c/go/+/655280 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-08-05internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"Austin Clements
This adds detection for the CD and DQ sub-features of x86 AVX-512. Building on these, we also add a "derived" AVX-512 feature that bundles together the basic usable subset of subfeatures. Despite the F in AVX-512-F standing for "foundation", AVX-512-F+BW+DQ+VL together really form the basic usable subset of AVX-512 functionality. These have also all been supported together by almost every CPU, and are guaranteed by GOAMD64=v4, so there's little point in separating them out. This is a cherry-pick of CL 680899 from the dev.simd branch. Change-Id: I34356502bd1853ba2372e48db0b10d55cffe07a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/693396 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Austin Clements <austin@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-04[dev.simd] all: merge master (7a1679d) into dev.simdCherry Mui
Conflicts: - src/cmd/compile/internal/amd64/ssa.go - src/cmd/compile/internal/ssa/rewriteAMD64.go - src/internal/buildcfg/exp.go - src/internal/cpu/cpu.go - src/internal/cpu/cpu_x86.go - src/internal/goexperiment/flags.go Merge List: + 2025-08-04 7a1679d7ae cmd/compile: move s390x over to new bounds check strategy + 2025-08-04 95693816a5 cmd/compile: move riscv64 over to new bounds check strategy + 2025-08-04 d7bd7773eb go/parser: remove safePos + 2025-08-04 4b6cbc377f cmd/cgo/internal/test: use (syntactic) constant for C array bound + 2025-08-03 b2960e3580 cmd/internal/obj/loong64: add {V,XV}{BITCLR/BITSET/BITREV}[I].{B/H/W/D} instructions support + 2025-08-03 abeeef1c08 cmd/compile/internal/test: fix typo in comments + 2025-08-03 d44749b65b cmd/internal/obj/loong64: add [X]VLDREPL.{B/H/W/D} instructions support + 2025-08-03 d6beda863e runtime: add reference to debugPinnerV1 + 2025-08-01 4ab1aec007 cmd/go: modload should use a read-write lock to improve concurrency + 2025-08-01 e666972a67 runtime: deduplicate Windows stdcall + 2025-08-01 ef40549786 runtime,syscall: move loadlibrary and getprocaddress to syscall + 2025-08-01 336931a4ca cmd/go: use os.Rename to move files on Windows + 2025-08-01 eef5f8d930 cmd/compile: enforce that locals are always accessed with SP base register + 2025-08-01 e071617222 cmd/compile: optimize multiplication rules on loong64 + 2025-07-31 eb7f515c4d cmd/compile: use generated loops instead of DUFFZERO on amd64 + 2025-07-31 c0ee2fd4e3 cmd/go: explicitly reject module paths "go" and "toolchain" + 2025-07-30 a4d99770c0 runtime/metrics: add cleanup and finalizer queue metrics + 2025-07-30 70a2ff7648 runtime: add cgo call benchmark + 2025-07-30 69338a335a cmd/go/internal/gover: fix ModIsPrerelease for toolchain versions + 2025-07-30 cedf63616a cmd/compile: add floating point min/max intrinsics on s390x + 2025-07-30 82a1921c3b all: remove redundant Swiss prefixes + 2025-07-30 2ae059ccaf all: remove GOEXPERIMENT=swissmap + 2025-07-30 cc571dab91 cmd/compile: deduplicate instructions when rewrite func results + 2025-07-30 2174a7936c crypto/tls: use standard chacha20-poly1305 cipher suite names + 2025-07-30 8330fb48a6 cmd/compile: move mips32 over to new bounds check strategy + 2025-07-30 9f9d7b50e8 cmd/compile: move mips64 over to new bounds check strategy + 2025-07-30 5216fd570e cmd/compile: move loong64 over to new bounds check strategy + 2025-07-30 89a0af86b8 cmd/compile: allow ops to specify clobbering input registers + 2025-07-30 5e94d72158 cmd/compile: simplify zerorange on arm64 + 2025-07-30 8cd85e602a cmd/compile: check domination of loop return in both controls + 2025-07-30 cefaed0de0 reflect: fix noswiss builder + 2025-07-30 3aa1b00081 regexp: fix compiling alternate patterns of different fold case literals + 2025-07-30 b1e933d955 cmd/compile: avoid extending when already sufficiently masked on loong64 + 2025-07-29 880ca333d7 cmd/compile: removing log2uint32 function + 2025-07-29 1513661dc3 cmd/compile: simplify logX implementations + 2025-07-29 bd94ae8903 cmd/compile: use unsigned power-of-two detector for unsigned mod + 2025-07-29 f3582fc80e cmd/compile: add unsigned power-of-two detector + 2025-07-29 f7d167fe71 internal/abi: move direct/indirect flag from Kind to TFlag + 2025-07-29 e0b07dc22e os/exec: fix incorrect expansion of "", "." and ".." in LookPath + 2025-07-29 25816d401c internal/goexperiment: delete RangeFunc goexperiment + 2025-07-29 7961bf71f8 internal/goexperiment: delete CacheProg goexperiment + 2025-07-29 e15a14c4dd sync: remove synchashtriemap GOEXPERIMENT + 2025-07-29 7dccd6395c cmd/compile: move arm32 over to new bounds check strategy + 2025-07-29 d79405a344 runtime: only deduct assist credit for arenas during GC + 2025-07-29 19a086f716 cmd/go/internal/telemetrystats: count goexperiments + 2025-07-29 aa95ab8215 image: fix formatting of godoc link + 2025-07-29 4c854b7a3e crypto/elliptic: change a variable name that have the same name as keywords + 2025-07-28 b10eb1d042 cmd/compile: simplify zerorange on amd64 + 2025-07-28 f8eae7a3c3 os/user: fix tests to pass on non-english Windows + 2025-07-28 0984264471 internal/poll: remove msg field from Windows' poll.operation + 2025-07-28 d7b4114346 internal/poll: remove rsan field from Windows' poll.operation + 2025-07-28 361b1ab41f internal/poll: remove sa field from Windows' poll.operation + 2025-07-28 9b6bd64e46 internal/poll: remove qty and flags fields from Windows' poll.operation + 2025-07-28 cd3655a824 internal/runtime/maps: fix spelling errors in comments + 2025-07-28 d5dc36af45 runtime: remove openbsd/mips64 related code + 2025-07-28 64ba72474d errors: omit redundant nil check in type assertion for Join + 2025-07-28 e151db3e06 all: omit unnecessary type conversions + 2025-07-28 4569255f8c cmd/compile: cleanup SelectN rules by indexing into args + 2025-07-28 94645d2413 cmd/compile: rewrite cmov(x, x, cond) into x + 2025-07-28 10c5cf68d4 net/http: add proper panic message + 2025-07-28 46b5839231 test/codegen: fix failing condmove wasm tests + 2025-07-28 98f301cf68 runtime,syscall: move SyscallX implementations from runtime to syscall + 2025-07-28 c7ed3a1c5a internal/runtime/syscall/windows: factor out code from runtime + 2025-07-28 e81eac19d3 hash/crc32: fix incorrect checksums with avx512+race + 2025-07-25 6fbad4be75 cmd/compile: remove no-longer-necessary call to calculateDepths + 2025-07-25 5045fdd8ff cmd/compile: fix containsUnavoidableCall computation + 2025-07-25 d28b27cd8e go/types, types2: use nil to represent incomplete explicit aliases + 2025-07-25 7b53d8d06e cmd/compile/internal/types2: add loaded state between loader calls and constraint expansion + 2025-07-25 374e3be2eb os/user: user random name for the test user account + 2025-07-25 1aa154621d runtime: rename scanobject to scanObject + 2025-07-25 41b429881a runtime: duplicate scanobject in greentea and non-greentea files + 2025-07-25 aeb256e98a cmd/compile: remove unused arg from gorecover + 2025-07-25 08376e1a9c runtime: iterate through inlinings when processing recover() + 2025-07-25 c76c3abc54 encoding/json: fix truncated Token error regression in goexperiment.jsonv2 + 2025-07-25 ebdbfccd98 encoding/json/jsontext: preserve buffer capacity in Encoder.Reset + 2025-07-25 91c4f0ccd5 reflect: avoid a bounds check in stack-constrained code + 2025-07-24 3636ced112 encoding/json: fix extra data regression under goexperiment.jsonv2 + 2025-07-24 a6eec8bdc7 encoding/json: reduce error text regressions under goexperiment.jsonv2 + 2025-07-24 0fa88dec1e time: remove redundant uint32 conversion in split + 2025-07-24 ada30b8248 internal/buildcfg: add ability to get GORISCV64 variable in GOGOARCH + 2025-07-24 6f6c6c5782 cmd/internal/obj: rip out argp adjustment for wrapper frames + 2025-07-24 7b50024330 runtime: detect successful recovers differently + 2025-07-24 7b9de668bd unicode/utf8: skip ahead during ascii runs in Valid/ValidString + 2025-07-24 076eae436e cmd/compile: move amd64 and 386 over to new bounds check strategy + 2025-07-24 f703dc5bef cmd/compile: add missing StringLen rule in prove + 2025-07-24 394d0bee8d cmd/compile: move arm64 over to new bounds check strategy + 2025-07-24 3024785b92 cmd/compile,runtime: remember idx+len for bounds check failure with less code + 2025-07-24 741a19ab41 runtime: move bounds check constants to internal/abi + 2025-07-24 ce05ad448f cmd/compile: rewrite condselects into doublings and halvings + 2025-07-24 fcd28070fe cmd/compile: add opt branchelim to rewrite some CondSelect into math + 2025-07-24 f32cf8e4b0 cmd/compile: learn transitive proofs for safe unsigned subs + 2025-07-24 d574856482 cmd/compile: learn transitive proofs for safe negative signed adds + 2025-07-24 1a72920f09 cmd/compile: learn transitive proofs for safe positive signed adds + 2025-07-24 e5f202bb60 cmd/compile: learn transitive proofs for safe unsigned adds + 2025-07-24 bd80f74bc1 cmd/compile: fold shift through AND for slice operations + 2025-07-24 5c45fe1385 internal/runtime/syscall: rename to internal/runtime/syscall/linux + 2025-07-24 592c2db868 cmd/compile: improve loopRotate to handle nested loops + 2025-07-24 dcb479c2f9 cmd/compile: optimize slice bounds checking with SUB/SUBconst comparisons + 2025-07-24 f11599b0b9 internal/poll: remove handle field from Windows' poll.operation + 2025-07-24 f7432e0230 internal/poll: remove fd field from Windows' poll.operation + 2025-07-24 e84ed38641 runtime: add benchmark for small-size memmory operation + 2025-07-24 18dbe5b941 hash/crc32: add AVX512 IEEE CRC32 calculation + 2025-07-24 c641900f72 cmd/compile: prefer base.Fatalf to panic in dwarfgen + 2025-07-24 d71d8aeafd cmd/internal/obj/s390x: add MVCLE instruction + 2025-07-24 b6cf1d94dc runtime: optimize memclr on mips64x + 2025-07-24 a8edd99479 runtime: improvement in memclr for s390x + 2025-07-24 bd04f65511 internal/runtime/exithook: fix a typo + 2025-07-24 5c8624a396 cmd/internal/goobj: make error output clear + 2025-07-24 44d73dfb4e cmd/go/internal/doc: clean up after merge with cmd/internal/doc + 2025-07-24 bd446662dd cmd/internal/doc: merge with cmd/go/internal/doc + 2025-07-24 da8b50c830 cmd/doc: delete + 2025-07-24 6669aa3b14 runtime: randomize heap base address + 2025-07-24 26338a7f69 cmd/compile: use better fatal message for staticValue1 + 2025-07-24 8587ba272e cmd/cgo: compare malloc return value to NULL instead of literal 0 + 2025-07-24 cae45167b7 go/types, types2: better error messages for certain type mismatches + 2025-07-24 2ddf542e4c cmd/compile: use ,ok return idiom for sparsemap.get + 2025-07-24 6505fcbd0a cmd/compile: use generics for sparse map + 2025-07-24 14f5eb7812 cmd/api: rerun updategolden + 2025-07-24 52b6d7f67a runtime: drop NetBSD kernel bug sysmon workaround fixed in NetBSD 9.2 + 2025-07-24 1ebebf1cc1 cmd/go: clean should respect workspaces + 2025-07-24 6536a93547 encoding/json/jsontext: preserve buffer capacity in Decoder.Reset + 2025-07-24 efc37e97c0 cmd/go: always return the cached path from go tool -n + 2025-07-23 98a031193b runtime: check TestUsingVDSO ExitError type assertion + 2025-07-23 6bb42997c8 doc/next: initialize + 2025-07-23 2696a11a97 internal/goversion: update Version to 1.26 + 2025-07-23 489868f776 cmd/link: scope test to linux & net.sendFile + 2025-07-22 71c2bf5513 cmd/compile: fix loclist for heap return vars without optimizations + 2025-07-22 c74399e7f5 net: correct comment for ListenConfig.ListenPacket + 2025-07-22 4ed9943b26 all: go fmt + 2025-07-22 1aaf7422f1 cmd/internal/objabi: remove redundant word in comment + 2025-07-21 d5ec0815e6 runtime: relax TestMemoryLimitNoGCPercent a bit + 2025-07-21 f7cc61e7d7 cmd/compile: for arm64 epilog, do SP increment with a single instruction + 2025-07-21 5dac42363b runtime: fix asan wrapper for riscv64 + 2025-07-21 e5502e0959 cmd/go: check subcommand properties + 2025-07-19 2363897932 cmd/internal/obj: enable got pcrel itype in fips140 for riscv64 + 2025-07-19 e32255fcc0 cmd/compile/internal/ssa: restrict architectures for TestDebugLines_74576 + 2025-07-18 0451816430 os: revert the use of AddCleanup to close files and roots + 2025-07-18 34b70684ba go/types: infer correct type for y in append(bytes, y...) + 2025-07-17 66536242fc cmd/compile/internal/escape: improve DWARF .debug_line numbering for literal rewriting optimizations + 2025-07-16 385000b004 runtime: fix idle time double-counting bug + 2025-07-16 f506ad2644 cmd/compile/internal/escape: speed up analyzing some functions with many closures + 2025-07-16 9c507e7942 cmd/link, runtime: on Wasm, put only function index in method table and func table + 2025-07-16 9782dcfd16 runtime: use 32-bit function index on Wasm + 2025-07-16 c876bf9346 cmd/internal/obj/wasm: use 64-bit instructions for indirect calls + 2025-07-15 b4309ece66 cmd/internal/doc: upgrade godoc pkgsite to 01b046e + 2025-07-15 75a19dbcd7 runtime: use memclrNoHeapPointers to clear inline mark bits + 2025-07-15 6d4a91c7a5 runtime: only clear inline mark bits on span alloc if necessary + 2025-07-15 0c6296ab12 runtime: have mergeInlineMarkBits also clear the inline mark bits + 2025-07-15 397d2117ec runtime: merge inline mark bits with gcmarkBits 8 bytes at a time + 2025-07-15 7dceabd3be runtime/maps: fix typo in group.go comment (instrinsified -> intrinsified) + 2025-07-15 d826bf4d74 os: remove useless error check + 2025-07-14 bb07e55aff runtime: expand GOMAXPROCS documentation + 2025-07-14 9159cd4ec6 encoding/json: decompose legacy options + 2025-07-14 c6556b8eb3 encoding/json/v2: add security section to doc + 2025-07-11 6ebb5f56d9 runtime: gofmt after CL 643897 and CL 662455 + 2025-07-11 1e48ca7020 encoding/json: remove legacy option to EscapeInvalidUTF8 + 2025-07-11 a0a99cb22b encoding/json/v2: report wrapped io.ErrUnexpectedEOF + 2025-07-11 9d04122d24 crypto/rsa: drop contradictory promise to keep PublicKey modulus secret + 2025-07-11 1ca23682dd crypto/rsa: fix documentation formatting + 2025-07-11 4bc3373c8e runtime: turn off large memmove tests under asan/msan Change-Id: I1e32d964eba770b85421efb86b305a2242f24466
2025-07-24hash/crc32: add AVX512 IEEE CRC32 calculationKlaus Post
Benchmark: goos: windows goarch: amd64 pkg: hash/crc32 cpu: AMD Ryzen 9 9950X 16-Core Processor benchmark old MB/s new MB/s speedup BenchmarkCRC32/poly=IEEE/size=15/align=0-32 1081.48 1089.42 1.01x BenchmarkCRC32/poly=IEEE/size=15/align=1-32 1085.87 1082.61 1.00x BenchmarkCRC32/poly=IEEE/size=40/align=0-32 2756.33 2752.37 1.00x BenchmarkCRC32/poly=IEEE/size=40/align=1-32 2758.27 2756.99 1.00x BenchmarkCRC32/poly=IEEE/size=512/align=0-32 18133.44 18076.52 1.00x BenchmarkCRC32/poly=IEEE/size=512/align=1-32 18151.05 18055.41 0.99x BenchmarkCRC32/poly=IEEE/size=1kB/align=0-32 19902.93 48581.07 2.44x BenchmarkCRC32/poly=IEEE/size=1kB/align=1-32 19966.99 48393.25 2.42x BenchmarkCRC32/poly=IEEE/size=4kB/align=0-32 21690.33 51679.25 2.38x BenchmarkCRC32/poly=IEEE/size=4kB/align=1-32 21655.30 51731.22 2.39x BenchmarkCRC32/poly=IEEE/size=32kB/align=0-32 22046.57 46406.90 2.10x BenchmarkCRC32/poly=IEEE/size=32kB/align=1-32 21986.22 46250.66 2.10x AVX512 are enabled above 1KB input size. This rather high limit is due to AVX512 may be slower to ramp up than the regular SSE4 implementation for smaller inputs. This is not reflected in the benchmarks, since consecutive calls means the CPU is "hot". The 'HasAVX512VPCLMULQDQ' name mirrors the one in golang.org/x/sys/cpu Change-Id: Id23685d8e3cc412b6d397a7d70056844bdb79271 Change-Id: Id23685d8e3cc412b6d397a7d70056844bdb79271 GitHub-Last-Rev: 6639f07b9febc7c96a7f3b402a2fd60f7be5e154 GitHub-Pull-Request: golang/go#74701 Reviewed-on: https://go-review.googlesource.com/c/go/+/689435 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2025-07-21[dev.simd] simd, internal/cpu: support more AVX CPU Feature checksJunyang Shao
This CL adds more checks, it also changes HasAVX512GFNI to be exactly checking GFNI instead of being a virtual feature. This CL copies its logic from x/sys/arch. Change-Id: I4612b0409b8a3518928300562ae08bcf123d53a7 Reviewed-on: https://go-review.googlesource.com/c/go/+/688276 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-07-01[dev.simd] internal/cpu: add GFNI feature checkJunyang Shao
This CL amends HasAVX512 flag with GFNI check. This is needed because our SIMD API supports Galois Field operations. Change-Id: I3e957b7b2215d2b7b6b8a7a0ca3e2e60d453b2e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/685295 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-06-13[dev.simd] internal/cpu: add AVX-512-CD and DQ, and derived "basic AVX-512"Austin Clements
This adds detection for the CD and DQ sub-features of x86 AVX-512. Building on these, we also add a "derived" AVX-512 feature that bundles together the basic usable subset of subfeatures. Despite the F in AVX-512-F standing for "foundation", AVX-512-F+BW+DQ+VL together really form the basic usable subset of AVX-512 functionality. These have also all been supported together by almost every CPU, and are guaranteed by GOAMD64=v4, so there's little point in separating them out. Change-Id: I34356502bd1853ba2372e48db0b10d55cffe07a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/680899 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-21internal/cpu: add ARM64.HasSHA3Filippo Valsorda
For #69536 Change-Id: If237226ba03e282443b4fc90484968c903198cb1 Reviewed-on: https://go-review.googlesource.com/c/go/+/616715 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
2025-05-01cmd/compile,internal/cpu,runtime: intrinsify math/bits.OnesCount on riscv64Joel Sing
For riscv64/rva22u64 and above, we can intrinsify math/bits.OnesCount using the CPOP/CPOPW machine instructions. Since the native Go implementation of OnesCount is relatively expensive, it is also worth emitting a check for Zbb support when compiled for rva20u64. On a Banana Pi F3, with GORISCV64=rva22u64: │ oc.1 │ oc.2 │ │ sec/op │ sec/op vs base │ OnesCount-8 16.930n ± 0% 4.389n ± 0% -74.08% (p=0.000 n=10) OnesCount8-8 5.642n ± 0% 5.016n ± 0% -11.10% (p=0.000 n=10) OnesCount16-8 9.404n ± 0% 5.015n ± 0% -46.67% (p=0.000 n=10) OnesCount32-8 13.165n ± 0% 4.388n ± 0% -66.67% (p=0.000 n=10) OnesCount64-8 16.300n ± 0% 4.388n ± 0% -73.08% (p=0.000 n=10) geomean 11.40n 4.629n -59.40% On a Banana Pi F3, compiled with GORISCV64=rva20u64 and with Zbb detection enabled: │ oc.3 │ oc.4 │ │ sec/op │ sec/op vs base │ OnesCount-8 16.930n ± 0% 5.643n ± 0% -66.67% (p=0.000 n=10) OnesCount8-8 5.642n ± 0% 5.642n ± 0% ~ (p=0.447 n=10) OnesCount16-8 10.030n ± 0% 6.896n ± 0% -31.25% (p=0.000 n=10) OnesCount32-8 13.170n ± 0% 5.642n ± 0% -57.16% (p=0.000 n=10) OnesCount64-8 16.300n ± 0% 5.642n ± 0% -65.39% (p=0.000 n=10) geomean 11.55n 5.873n -49.16% On a Banana Pi F3, compiled with GORISCV64=rva20u64 but with Zbb detection disabled: │ oc.3 │ oc.5 │ │ sec/op │ sec/op vs base │ OnesCount-8 16.93n ± 0% 29.47n ± 0% +74.07% (p=0.000 n=10) OnesCount8-8 5.642n ± 0% 5.643n ± 0% ~ (p=0.191 n=10) OnesCount16-8 10.03n ± 0% 15.05n ± 0% +50.05% (p=0.000 n=10) OnesCount32-8 13.17n ± 0% 18.18n ± 0% +38.04% (p=0.000 n=10) OnesCount64-8 16.30n ± 0% 21.94n ± 0% +34.60% (p=0.000 n=10) geomean 11.55n 15.84n +37.16% For hardware without Zbb, this adds ~5ns overhead, while for hardware with Zbb we achieve a performance gain up of up to 11ns. It is worth noting that OnesCount8 is cheap enough that it is preferable to stick with the generic version in this case. Change-Id: Id657e40e0dd1b1ab8cc0fe0f8a68df4c9f2d7da5 Reviewed-on: https://go-review.googlesource.com/c/go/+/660856 Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-14internal/cpu: add a detection for Neoverse(N3, V3, V3ae) coresfanzha02
The memmove implementation relies on the variable runtime.arm64UseAlignedLoads to select fastest code path. Considering Neoverse N3, V3 and V3ae cores prefer aligned loads, this patch adds code to detect them for memmove performance. Change-Id: I7266fc35d8b2c15ff516c592b987bafacb82b620 Reviewed-on: https://go-review.googlesource.com/c/go/+/664038 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2025-03-11internal/cpu: use correct variable when parsing CPU features lamcas and ↵Guoqi Chen
lam_bh on loong64 Change-Id: I5019f4e32243911f735f775bcb3c0dba5adb4162 Reviewed-on: https://go-review.googlesource.com/c/go/+/655395 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-03-11internal/bytealg: optimize Count{,String} in loong64Guoqi Chen
Benchmark on Loongson 3A6000 and 3A5000: goos: linux goarch: loong64 pkg: bytes cpu: Loongson-3A6000 @ 2500.00MHz | bench.old | bench.new | | sec/op | sec/op vs base | CountSingle/10 13.210n ± 0% 9.984n ± 0% -24.42% (p=0.000 n=15) CountSingle/32 31.970n ± 1% 7.205n ± 0% -77.46% (p=0.000 n=15) CountSingle/4K 4039.0n ± 0% 108.7n ± 0% -97.31% (p=0.000 n=15) CountSingle/4M 4158.9µ ± 0% 117.3µ ± 0% -97.18% (p=0.000 n=15) CountSingle/64M 68.641m ± 0% 2.585m ± 1% -96.23% (p=0.000 n=15) geomean 13.72µ 1.189µ -91.34% | bench.old | bench.new | | B/s | B/s vs base | CountSingle/10 722.0Mi ± 0% 955.2Mi ± 0% +32.30% (p=0.000 n=15) CountSingle/32 954.6Mi ± 1% 4235.4Mi ± 0% +343.68% (p=0.000 n=15) CountSingle/4K 967.2Mi ± 0% 35947.6Mi ± 0% +3616.64% (p=0.000 n=15) CountSingle/4M 961.8Mi ± 0% 34092.7Mi ± 0% +3444.71% (p=0.000 n=15) CountSingle/64M 932.4Mi ± 0% 24757.2Mi ± 1% +2555.24% (p=0.000 n=15) geomean 902.2Mi 10.17Gi +1054.77% goos: linux goarch: loong64 pkg: bytes cpu: Loongson-3A5000 @ 2500.00MHz | bench.old | bench.new | | sec/op | sec/op vs base | CountSingle/10 14.41n ± 0% 12.81n ± 0% -11.10% (p=0.000 n=15) CountSingle/32 36.230n ± 0% 9.609n ± 0% -73.48% (p=0.000 n=15) CountSingle/4K 4366.0n ± 0% 165.5n ± 0% -96.21% (p=0.000 n=15) CountSingle/4M 4464.7µ ± 0% 325.2µ ± 0% -92.72% (p=0.000 n=15) CountSingle/64M 75.627m ± 0% 8.307m ± 69% -89.02% (p=0.000 n=15) geomean 15.04µ 2.229µ -85.18% | bench.old | bench.new | | B/s | B/s vs base | CountSingle/10 661.8Mi ± 0% 744.4Mi ± 0% +12.49% (p=0.000 n=15) CountSingle/32 842.4Mi ± 0% 3176.1Mi ± 0% +277.03% (p=0.000 n=15) CountSingle/4K 894.7Mi ± 0% 23596.7Mi ± 0% +2537.34% (p=0.000 n=15) CountSingle/4M 895.9Mi ± 0% 12299.7Mi ± 0% +1272.88% (p=0.000 n=15) CountSingle/64M 846.3Mi ± 0% 7703.9Mi ± 41% +810.34% (p=0.000 n=15) geomean 823.3Mi 5.424Gi +574.68% Change-Id: Ie07592beac61bdb093470c524049ed494df4d703 Reviewed-on: https://go-review.googlesource.com/c/go/+/586055 Reviewed-by: Meidan Li <limeidan@loongson.cn> Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-02-24all: use testenv.Executable instead of os.Executable and os.Args[0]qmuntal
In test files, using testenv.Executable is more reliable than os.Executable or os.Args[0]. Change-Id: I88e577efeabc20d02ada27bf706ae4523129128e Reviewed-on: https://go-review.googlesource.com/c/go/+/651955 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2025-02-05cpu/internal: provide runtime detection of RISC-V extensions on LinuxMark Ryan
Add a RISCV64 variable to cpu/internal that indicates both the presence of RISC-V extensions and performance information about the underlying RISC-V cores. The variable is only populated with non false values on Linux. The detection code relies on the riscv_hwprobe syscall introduced in Linux 6.4. The patch can detect RVV 1.0 and whether the CPU supports fast misaligned accesses. It can only detect RVV 1.0 on a 6.5 kernel or later (without backports). Updates #61416 Change-Id: I2d8289345c885b699afff441d417cae38f6bdc54 Reviewed-on: https://go-review.googlesource.com/c/go/+/522995 Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com>
2024-11-12cmd/compile: optimize math/bits.OnesCount{16,32,64} implementation on loong64Guoqi Chen
Use Loong64's LSX instruction VPCNT to implement math/bits.OnesCount{16,32,64} and make it intrinsic. Benchmark results on loongson 3A5000 and 3A6000 machines: goos: linux goarch: loong64 pkg: math/bits cpu: Loongson-3A5000-HV @ 2500.00MHz | bench.old | bench.new | | sec/op | sec/op vs base | OnesCount 4.413n ± 0% 1.401n ± 0% -68.25% (p=0.000 n=10) OnesCount8 1.364n ± 0% 1.363n ± 0% ~ (p=0.130 n=10) OnesCount16 2.112n ± 0% 1.534n ± 0% -27.37% (p=0.000 n=10) OnesCount32 4.533n ± 0% 1.529n ± 0% -66.27% (p=0.000 n=10) OnesCount64 4.565n ± 0% 1.531n ± 1% -66.46% (p=0.000 n=10) geomean 3.048n 1.470n -51.78% goos: linux goarch: loong64 pkg: math/bits cpu: Loongson-3A6000 @ 2500.00MHz | bench.old | bench.new | | sec/op | sec/op vs base | OnesCount 3.553n ± 0% 1.201n ± 0% -66.20% (p=0.000 n=10) OnesCount8 0.8021n ± 0% 0.8004n ± 0% -0.21% (p=0.000 n=10) OnesCount16 1.216n ± 0% 1.000n ± 0% -17.76% (p=0.000 n=10) OnesCount32 3.006n ± 0% 1.035n ± 0% -65.57% (p=0.000 n=10) OnesCount64 3.503n ± 0% 1.035n ± 0% -70.45% (p=0.000 n=10) geomean 2.053n 1.006n -51.01% Change-Id: I07a5b8da2bb48711b896387ec7625145804affc8 Reviewed-on: https://go-review.googlesource.com/c/go/+/620978 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Meidan Li <limeidan@loongson.cn> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-04internal/cpu: add CPU feature LAMCAS and LAM_BH detection on loong64Guoqi Chen
Change-Id: Ic5580c4ee006d87b3152ae5de7b25fb532c6a33f Reviewed-on: https://go-review.googlesource.com/c/go/+/612976 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Martin Möhrmann <martin@golang.org>
2024-08-29internal/cpu, runtime: make linux/loong64 HWCAP data availableWANG Xuerui
This can be used to toggle runtime usages of ISA extensions as such usages appear. Only the CRC32 bit is exposed for now, as the others are not going to be utilized in the standard library for a while. Change-Id: I774032ca84dc8bcf1c9f17558917315af07c7314 Reviewed-on: https://go-review.googlesource.com/c/go/+/482416 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: xiaodong liu <teaofmoli@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn>
2024-07-24internal/cpu: add DIT detection on arm64Roland Shoemaker
Add support for detecting the DIT feature on ARM64 processors. This mirrors https://go.dev/cl/597377, but using the platform specific semantics. Updates #66450 Change-Id: Ia107e3e3369de7825af70823b485afe2f587358e Reviewed-on: https://go-review.googlesource.com/c/go/+/598335 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2024-07-22runtime: add ERMS-based memmove support for modern CPU platformsTangYang
The current memmove implementation uses REP MOVSB to copy data larger than 2KB when the useAVXmemmove global variable is false and the CPU supports the ERMS feature. This feature is currently only enabled on CPUs in the Sandy Bridge (Client) , Sandy Bridge (Server), Ivy Bridge (Client), and Ivy Bridge (Server) microarchitectures. For modern Intel CPU microarchitectures that support the ERMS feature, such as Ice Lake (Server), Sapphire Rapids , REP MOVSB achieves better performance than the AVX-based copy currently implemented in memmove. Benchstat result: goos: linux goarch: amd64 pkg: runtime cpu: Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz │ ./old.txt │ ./new.txt │ │ sec/op │ sec/op vs base │ Memmove/2048-2 25.24n ± 0% 24.27n ± 0% -3.84% (p=0.000 n=10) Memmove/4096-2 44.87n ± 0% 33.16n ± 1% -26.11% (p=0.000 n=10) geomean 33.65n 28.37n -15.71% │ ./old.txt │ ./new.txt │ │ B/s │ B/s vs base │ Memmove/2048-2 75.56Gi ± 0% 78.59Gi ± 0% +4.02% (p=0.000 n=10) Memmove/4096-2 85.01Gi ± 0% 115.05Gi ± 1% +35.34% (p=0.000 n=10) geomean 80.14Gi 95.09Gi +18.65% Fixes #66958 Change-Id: I1fafd1b51a16752f83ac15047cf3b29422a79d5d GitHub-Last-Rev: 89cf5af32b1b41e1499282058656a8a5c7aed359 GitHub-Pull-Request: golang/go#66959 Reviewed-on: https://go-review.googlesource.com/c/go/+/580735 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-29all: document legacy //go:linkname for modules with ≥100 dependentsRuss Cox
For #67401. Change-Id: I015408a3f437c1733d97160ef2fb5da6d4efcc5c Reviewed-on: https://go-review.googlesource.com/c/go/+/587598 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org>
2024-05-23all: document legacy //go:linkname for modules with ≥500 dependentsRuss Cox
For #67401. Change-Id: I7dd28c3b01a1a647f84929d15412aa43ab0089ee Reviewed-on: https://go-review.googlesource.com/c/go/+/587575 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-22all: document legacy //go:linkname for modules with ≥50,000 dependentsRuss Cox
Note that this depends on the revert of CL 581395 to move zeroVal back. For #67401. Change-Id: I507c27c2404ad1348aabf1ffa3740e6b1957495b Reviewed-on: https://go-review.googlesource.com/c/go/+/587217 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-05-22internal/cpu: remove unused const cpuid_SSE2Tobias Klauser
It's unused since CL 344350. Change-Id: I1aacb9d3db6798aa7013a58603112894e2281002 Reviewed-on: https://go-review.googlesource.com/c/go/+/587035 Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-17all: add push linknames to allow legacy pull linknamesCherry Mui
CL 585358 adds restrictions to disallow pull-only linknames (currently off by default). Currently, there are quite some pull- only linknames in user code in the wild. In order not to break those, we add push linknames to allow them to be pulled. This CL includes linknames found in a large code corpus (thanks Matthew Dempsky and Michael Pratt for the analysis!), that are not currently linknamed. Updates #67401. Change-Id: I32f5fc0c7a6abbd7a11359a025cfa2bf458fe767 Reviewed-on: https://go-review.googlesource.com/c/go/+/586137 Reviewed-by: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-15cmd/link: disallow pull-only linknamesCherry Mui
As mentioned in CL 584598, linkname is a mechanism that, when abused, can break API integrity and even safety of Go programs. CL 584598 is a first step to restrict the use of linknames, by implementing a blocklist. This CL takes a step further, tightening up the restriction by allowing linkname references ("pull") only when the definition side explicitly opts into it, by having a linkname on the definition (possibly to itself). This way, it is at least clear on the definition side that the symbol, despite being unexported, is accessed outside of the package. Unexported symbols without linkname can now be actually private. This is similar to the symbol visibility rule used by gccgo for years (which defines unexported non-linknamed symbols as C static symbols). As there can be pull-only linknames in the wild that may be broken by this change, we currently only enforce this rule for symbols defined in the standard library. Push linknames are added in the standard library to allow things build. Linkname references to external (non-Go) symbols are still allowed, as their visibility is controlled by the C symbol visibility rules and enforced by the C (static or dynamic) linker. Assembly symbols are treated similar to linknamed symbols. This is controlled by -checklinkname linker flag, currently not enabled by default. A follow-up CL will enable it by default. Change-Id: I07344f5c7a02124dbbef0fbc8fec3b666a4b2b0e Reviewed-on: https://go-review.googlesource.com/c/go/+/585358 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Russ Cox <rsc@golang.org>
2024-01-30all: fix typosJes Cok
Found by codespell. Change-Id: I2ebefbbbc8eacf9bc83051407519fa8fd5e4495f GitHub-Last-Rev: 9cc550f3dbd7c85194a58fd6858507db3e2962a9 GitHub-Pull-Request: golang/go#65346 Reviewed-on: https://go-review.googlesource.com/c/go/+/559095 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2023-11-22cmd/asm: fix the KMCTR instruction encoding and argument passingSrinivas Pokala
KMCTR encoding arguments incorrect way, which leading illegal instruction wherver we call KMCTR instruction.IBM z13 machine test's TestAESGCM test using gcmASM implementation, which uses KMCTR instruction to encrypt using AES in counter mode and the KIMD instruction for GHASH. z14+ machines onwards uses gcmKMA implementation for the same. Fixes #63387 Change-Id: I86aeb99573c3f636a71908c99e06a9530655aa5d Reviewed-on: https://go-review.googlesource.com/c/go/+/535675 Reviewed-by: Vishwanatha HD <vishwanatha.hd@ibm.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2023-11-15internal/cpu: detect support of AVX512Achille Roussel
Extracts changes from that were submitted in other CLs to enable AVX512 detection, notably: - https://go-review.googlesource.com/c/go/+/271521 - https://go-review.googlesource.com/c/go/+/379394 - https://go-review.googlesource.com/c/go/+/502476 This change adds properties to the cpu.X86 fields to enable runtime detection of AVX512, and the hasAVX512F, hasAVX512BW, and hasAVX512VL macros to support bypassing runtime checks in assembly code when GOAMD64=v4 is set. Change-Id: Ia7c3f22f1e66bf1de575aba522cb0d0a55ce791f Reviewed-on: https://go-review.googlesource.com/c/go/+/536257 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <martin@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Martin Möhrmann <martin@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Martin Möhrmann <moehrmann@google.com> Commit-Queue: Martin Möhrmann <martin@golang.org> Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2023-10-31internal/cpu: add comments to copied functionsapocelipes
Just as same as other copied functions, like stringsTrimSuffix in "os/executable_procfs.go" Change-Id: I9c9fbd75b009a5ae0e869cf1fddc77c0e08d9a67 GitHub-Last-Rev: 4c18865e15ede0f53121b6845a1879cdd70d1a38 GitHub-Pull-Request: golang/go#63704 Reviewed-on: https://go-review.googlesource.com/c/go/+/537056 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2023-10-31runtime: on arm32, detect whether we have sync instructionsKeith Randall
Make the choice of using these instructions dynamic (triggered by cpu feature detection) rather than static (trigered by GOARM setting). if GOARM>=7, we know we have them. For GOARM=5/6, dynamically dispatch based on auxv information. Update #17082 Update #61588 Change-Id: I8a50481d942f62cf36348998a99225d0d242f8af Reviewed-on: https://go-review.googlesource.com/c/go/+/525637 TryBot-Result: Gopher Robot <gobot@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Run-TryBot: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-12internal/cpu: fix wrong cache line size of riscv64Meng Zhuo
All of riscv CPU using 64B for cache-line size. i.e. U540 of Hifive Unleashed (https://www.sifive.com/boards/hifive-unleashed) Change-Id: I0d72d88ac026f45383c3b3eb3a77233d3c2e4004 Reviewed-on: https://go-review.googlesource.com/c/go/+/526659 Run-TryBot: M Zhuo <mzh@golangcn.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-09-05all: use ^TestName$ regular pattern for invoking a single testDmitri Shuralyov
Use ^ and $ in the -run flag regular expression value when the intention is to invoke a single named test. This removes the reliance on there not being another similarly named test to achieve the intended result. In particular, package syscall has tests named TestUnshareMountNameSpace and TestUnshareMountNameSpaceChroot that both trigger themselves setting GO_WANT_HELPER_PROCESS=1 to run alternate code in a helper process. As a consequence of overlap in their test names, the former was inadvertently triggering one too many helpers. Spotted while reviewing CL 525196. Apply the same change in other places to make it easier for code readers to see that said tests aren't running extraneous tests. The unlikely cases of -run=TestSomething intentionally being used to run all tests that have the TestSomething substring in the name can be better written as -run=^.*TestSomething.*$ or with a comment so it is clear it wasn't an oversight. Change-Id: Iba208aba3998acdbf8c6708e5d23ab88938bfc1e Reviewed-on: https://go-review.googlesource.com/c/go/+/524948 Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Kirill Kolyshkin <kolyshkin@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-09-05cmd/asm: add KMA and KMCTR instructions on s390x.root
This CL is to add assembly instruction mnemonics for the following instructions, mainly used in crypto packages. * KMA - cipher message with authentication * KMCTR - cipher message with counter Fixes #61163 Change-Id: Iff9a69911aeb4fab4bca8755b23a106eaebb2332 Reviewed-on: https://go-review.googlesource.com/c/go/+/515195 Reviewed-by: Carlos Amedee <carlos@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com>
2023-08-11cmd/asm: add KDSA instruction supportSrinivas Pokala
KDSA(Compute Digital Signature Authentication) instruction provides support for the signing and verification of elliptic curves Change-Id: I19996a307162dd4f476a1cfe4f8d1a74a609e6c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/503215 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-02cmd/asm: add s390x crypto related instructionsSrinivas Pokala
This CL add's the following instructions,useful for cipher and message digest operations: * KM - cipher message * KMC - cipher message with chaining * KLMD - compute last message digest * KIMD - compute intermediate message digest Fixes #61163 Change-Id: Ib0636430c3e4888ed61b86c5acae45ee596463ff Reviewed-on: https://go-review.googlesource.com/c/go/+/509075 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2023-04-25internal/cpu: add a detection for Neoverse(N2, V2) coresfanzha02
The memmove implementation relies on the variable runtime.arm64UseAlignedLoads to select fastest code path. Considering Neoverse N2 and V2 cores prefer aligned loads, this patch adds code to detect them for memmove performance. And this patch uses a new variable ARM64.IsNeoverse to represent all Neoverse cores, removing the more specific versions. Change-Id: I9e06eae01a0325a0b604ac6af1e55711dd6133f7 Reviewed-on: https://go-review.googlesource.com/c/go/+/487815 Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Fannie Zhang <Fannie.Zhang@arm.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-18cmd/go: add check for unknown godebug settingRuss Cox
A //go:debug line mentioning an unknown or retired setting should be diagnosed as making the program invalid. Do that. We agreed on this in the proposal but I forgot to implement it. Change-Id: Ie69072a1682d4eeb6866c02adbbb426f608567c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/476280 Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-18internal/cpu: add default osinit for ppc64/ppc64leJoel Sing
This will be used for operating systems other than AIX and Linux (both of which provide a more specific version). Updates #56001 Change-Id: Ia1de994866b66f03c83696faa92d0531a0b75273 Reviewed-on: https://go-review.googlesource.com/c/go/+/473698 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-23runtime: enable sha512 optimizations on arm64 via hwcaps.Matt Horsnell
Change-Id: I9d88c8eb91106de412a9abc6601cdda06537d818 Reviewed-on: https://go-review.googlesource.com/c/go/+/461747 Reviewed-by: Martin Möhrmann <moehrmann@google.com> Auto-Submit: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org>
2022-11-14internal/godebug: define more efficient APIRuss Cox
We have been expanding our use of GODEBUG for compatibility, and the current implementation forces a tradeoff between freshness and efficiency. It parses the environment variable in full each time it is called, which is expensive. But if clients cache the result, they won't respond to run-time GODEBUG changes, as happened with x509sha1 (#56436). This CL changes the GODEBUG API to provide efficient, up-to-date results. Instead of a single Get function, New returns a *godebug.Setting that itself has a Get method. Clients can save the result of New, which is no more expensive than errors.New, in a global variable, and then call that variable's Get method to get the value. Get costs only two atomic loads in the case where the variable hasn't changed since the last call. Unfortunately, these changes do require importing sync from godebug, which will mean that sync itself will never be able to use a GODEBUG setting. That doesn't seem like such a hardship. If it was really necessary, the runtime could pass a setting to package sync itself at startup, with the caveat that that setting, like the ones used by runtime itself, would not respond to run-time GODEBUG changes. Change-Id: I99a3acfa24fb2a692610af26a5d14bbc62c966ac Reviewed-on: https://go-review.googlesource.com/c/go/+/449504 Run-TryBot: Russ Cox <rsc@golang.org> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-08runtime internal/cpu: rename "Zeus" "NeoverseV1".Matthew Horsnell
Rename "Zeus" to "NeoverseV1" for the partnum 0xd40 to be consistent with the documentation of MIDR_EL1 as described in https://developer.arm.com/documentation/101427/0101/?lang=en Change-Id: I2e3d5ec76b953a831cb4ab0438bc1c403648644b Reviewed-on: https://go-review.googlesource.com/c/go/+/414775 Reviewed-by: Jonathan Swinney <jswinney@amazon.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Eric Fang <eric.fang@arm.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2022-09-26internal/cpu: deduplicate arm64 ISAR parsing codeJoel Sing
Deduplicate code for parsing system registers - this matches what is done in golang.org/x/sys/cpu. Change-Id: If3524eb2e361179c68678f8214230d7068fe4c60 Reviewed-on: https://go-review.googlesource.com/c/go/+/422217 Reviewed-by: Meng Zhuo <mzh@golangcn.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Joel Sing <joel@sing.id.au> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org>