path: root/src/runtime
Age  Commit message  Author
2025-07-28  runtime: remove openbsd/mips64 related code  Joel Sing
The openbsd/mips64 port has been broken for many years and it has not been possible to land the changes needed to unbreak it. As such, this port is considered dead and can be decommissioned in order to remove technical debt and allow other changes to be completed. Updates #61546 Change-Id: I9680eab9fb3aa85b83de47c66e9ebaf8c388a3bd Reviewed-on: https://go-review.googlesource.com/c/go/+/649659 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Mark Freeman <mark@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2025-07-28  all: omit unnecessary type conversions  Jes Cok
Found by github.com/mdempsky/unconvert Change-Id: Ib78cceb718146509d96dbb6da87b27dbaeba1306 GitHub-Last-Rev: dedf354811701ce8920c305b6f7aa78914a4171c GitHub-Pull-Request: golang/go#74771 Reviewed-on: https://go-review.googlesource.com/c/go/+/690735 Reviewed-by: Mark Freeman <mark@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2025-07-28  runtime,syscall: move SyscallX implementations from runtime to syscall  qmuntal
There is no need for syscall.Syscall{3,6,9,12,15,18} to be implemented in the runtime and linknamed from syscall. All of them can be implemented using the single runtime.syscall_syscalln function. While here, improve the documentation of syscall.SyscallN. Change-Id: I0e09d42e855d6baf900354c9b7992a4329c4ffc7 Reviewed-on: https://go-review.googlesource.com/c/go/+/690515 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
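The delegation pattern described above can be sketched as follows. This is a minimal illustration, not the actual syscall package code: `syscallN`, `syscall3`, and `syscall6` are hypothetical names, and `syscallN` only records how it was called instead of entering the kernel, so the sketch runs on any platform.

```go
package main

import "fmt"

// syscallN is a hypothetical stand-in for a single variadic entry point.
// It performs no system call: it returns the sum of its arguments, the
// function value it was given, and the argument count, so that the
// delegation from the fixed-arity wrappers can be observed.
func syscallN(fn uintptr, args ...uintptr) (r1, r2 uintptr, n int) {
	var sum uintptr
	for _, a := range args {
		sum += a
	}
	return sum, fn, len(args)
}

// syscall3 is a fixed-arity wrapper that forwards to syscallN.
func syscall3(fn, a1, a2, a3 uintptr) (r1, r2 uintptr, n int) {
	return syscallN(fn, a1, a2, a3)
}

// syscall6 is another fixed-arity wrapper over the same entry point.
func syscall6(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2 uintptr, n int) {
	return syscallN(fn, a1, a2, a3, a4, a5, a6)
}

func main() {
	r1, _, n := syscall3(42, 1, 2, 3)
	fmt.Println(r1, n)
	r1, _, n = syscall6(42, 1, 2, 3, 4, 5, 6)
	fmt.Println(r1, n)
}
```

With this shape, only the variadic function needs a low-level implementation; every fixed-arity variant is ordinary Go code.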
2025-07-28  internal/runtime/syscall/windows: factor out code from runtime  qmuntal
Factor out the code related to doing calls using the Windows stdcall calling convention into a separate package. This will allow us to reuse it in other low-level packages that can't depend on syscall. Updates #51087. Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64,gotip-windows-amd64-longtest,gotip-solaris-amd64 Change-Id: I68640b07091183b50da6bef17406c10a397896e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/689156 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-25  runtime: rename scanobject to scanObject  Michael Anthony Knyszek
This is long overdue. Change-Id: I891b114cb581e82b903c20d1c455bbbdad548fe8 Reviewed-on: https://go-review.googlesource.com/c/go/+/690535 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-07-25  runtime: duplicate scanobject in greentea and non-greentea files  Michael Anthony Knyszek
This change exists to help differentiate profile samples spent on Green Tea and non-Green-Tea GC time in mixed contexts. Change-Id: I8dea340d2d11ba4c410ae939fb5f37020d0b55d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/689477 Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-25  cmd/compile: remove unused arg from gorecover  Keith Randall
We don't need this argument anymore to match up a recover with its corresponding panic. Change-Id: I5d3646cdd766259ee9d3d995a2f215f02e17abc6 Reviewed-on: https://go-review.googlesource.com/c/go/+/685555 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-07-25  runtime: iterate through inlinings when processing recover()  Keith Randall
We care about the wrapper-ness of logical frames, not physical frames. Fixes #73916 Fixes #73917 Fixes #73920 Change-Id: Ia17c8390e71e6c0e13e23dcbb7bc7273ef25da90 Reviewed-on: https://go-review.googlesource.com/c/go/+/685375 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-07-24  cmd/internal/obj: rip out argp adjustment for wrapper frames  Keith Randall
The previous CL made this adjustment unnecessary. The argp field is no longer used by the runtime. Change-Id: I3491eeef4103c6653ec345d604c0acd290af9e8f Reviewed-on: https://go-review.googlesource.com/c/go/+/685356 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-07-24  runtime: detect successful recovers differently  Keith Randall
Use stack unwinding instead of keeping incremental track of the argp of defers that are allowed to recover. It's much simpler, and it lets us get rid of the incremental tracking by wrapper code. (Ripped out in a subsequent CL.) We only need to stack unwind a few frames to get the right answer, and only when recover()ing in a panic situation. It will be more expensive in that case, but cheaper in all others. Change-Id: Id095807db6864b7ac1e1baf09285b77a07c46d19 Reviewed-on: https://go-review.googlesource.com/c/go/+/685355 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
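The language rule that this detection has to enforce can be seen from user code. The sketch below does not show the runtime's unwinding logic; it only demonstrates the rule from the Go specification that recover stops a panic only when called directly by a deferred function. The function names `helper`, `indirect`, and `direct` are made up for this example.

```go
package main

import "fmt"

// helper calls recover one frame too deep: it is called by a deferred
// function rather than being the deferred function itself, so recover
// returns nil here and the panic keeps going.
func helper() any {
	return recover()
}

// indirect panics and first tries to recover through helper, which fails.
// A second deferred function then recovers directly so the program survives.
func indirect() (got any, panicked bool) {
	defer func() {
		// Called directly by a deferred function: this stops the panic.
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	defer func() {
		// Runs first (deferred calls run in LIFO order); helper sees nil.
		got = helper()
	}()
	panic("boom")
}

// direct recovers in the deferred function itself and gets the panic value.
func direct() (got any) {
	defer func() { got = recover() }()
	panic("boom")
}

func main() {
	got, panicked := indirect()
	fmt.Println("indirect recover value:", got, "still panicking afterwards:", panicked)
	fmt.Println("direct recover value:", direct())
}
```

Deciding whether a given recover call is "direct" is what requires inspecting the calling frames, which is why wrapper and inlined frames matter to the runtime.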
2025-07-24  cmd/compile: move amd64 and 386 over to new bounds check strategy  Keith Randall
Change-Id: I13f54f04ccb8452e625dba4249e0d56bafd1fad8 Reviewed-on: https://go-review.googlesource.com/c/go/+/682397 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-07-24  cmd/compile: move arm64 over to new bounds check strategy  Keith Randall
For all the static bounds checks in cmd/go, we have:
  6877 just a single instruction (the call itself)
   139 needs an additional reg-reg move
   602 needs an additional constant load
    25 needs some other instruction
That's ~90% implemented using just a single instruction.
Reduces the text size of cmd/go by ~0.8%. Total binary size is just barely smaller, ~0.2%. (The difference is the new pcdata table.)
Change-Id: I416e9c196f5d8d0e8f08e191e6df3045e11dccbe
Reviewed-on: https://go-review.googlesource.com/c/go/+/682496
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-24  cmd/compile,runtime: remember idx+len for bounds check failure with less code  Keith Randall
Currently we must put the index and length into specific registers so we can call into the runtime to report a bounds check failure. So a typical bounds check call is something like:

    MOVD R3, R0
    MOVD R7, R1
    CALL runtime.panicIndex

or, if for instance the index is constant,

    MOVD $7, R0
    MOVD R9, R1
    CALL runtime.panicIndex

Sometimes the MOVD can be avoided, if the value happens to be in the right register already. But that's not terribly common, and doesn't work at all for constants.

Let's get rid of those MOVD instructions. They pollute the instruction cache and are almost never executed. Instead, we'll encode in a PCDATA table where the runtime should find the index and length. The table encodes, for each index and length, whether it is a constant or in a register, and which register or constant it is. That way, we can avoid all those useless MOVDs. Instead, we can figure out the index and length at runtime. This makes the bounds panic path slower, but that's a good tradeoff.

We can encode registers 0-15 and constants 0-31. Anything outside that range still needs to use an explicit instruction.

This CL is the foundation; follow-on CLs will move each architecture to the new strategy.
Change-Id: I705c511e546e6aac59fed922a8eaed4585e96820
Reviewed-on: https://go-review.googlesource.com/c/go/+/682396
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
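To make the "register 0-15 or constant 0-31" idea concrete, here is a small sketch of one way such an operand could be packed into a few bits. The bit layout (a flag bit plus a 5-bit payload), the `operand` type, and the function names are assumptions invented for this example; the message does not describe the real PCDATA format, and this should not be read as the compiler's encoding.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical layout: bit 5 says "constant", bits 0-4 hold the payload.
const (
	constFlag = 1 << 5
	maxConst  = 31 // constants 0-31 are encodable
	maxReg    = 15 // registers 0-15 are encodable
)

// operand describes where an index or length lives at the failing call.
type operand struct {
	isConst bool
	val     int // constant value or register number
}

// errNeedsInstruction signals that the operand is outside the encodable
// range, so an explicit move instruction would still be required.
var errNeedsInstruction = errors.New("operand needs an explicit instruction")

// encode packs an operand into one byte under the hypothetical layout.
func encode(o operand) (uint8, error) {
	switch {
	case o.isConst && o.val >= 0 && o.val <= maxConst:
		return constFlag | uint8(o.val), nil
	case !o.isConst && o.val >= 0 && o.val <= maxReg:
		return uint8(o.val), nil
	}
	return 0, errNeedsInstruction
}

// decode reverses encode.
func decode(b uint8) operand {
	if b&constFlag != 0 {
		return operand{isConst: true, val: int(b & (constFlag - 1))}
	}
	return operand{val: int(b & (constFlag - 1))}
}

func main() {
	for _, o := range []operand{{true, 7}, {false, 9}, {true, 32}, {false, 16}} {
		b, err := encode(o)
		if err != nil {
			fmt.Printf("%+v: %v\n", o, err)
			continue
		}
		fmt.Printf("%+v -> %#02x -> %+v\n", o, b, decode(b))
	}
}
```

The trade-off mirrors the one in the message: the common path carries no extra instructions, and only the rare out-of-range operand falls back to an explicit move.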
2025-07-24  runtime: move bounds check constants to internal/abi  Keith Randall
For future use by the compiler. Change-Id: Id3da62006b283ac38008261c0ef88aaf71ef5896 Reviewed-on: https://go-review.googlesource.com/c/go/+/682456 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-24  internal/runtime/syscall: rename to internal/runtime/syscall/linux  qmuntal
All code in internal/runtime/syscall is Linux-specific, so better move it to a new linux sub-directory. This way it will be easier to factor out runtime syscall code from other platforms, e.g. Windows. Updates #51087. Change-Id: Idd2a52444b33bf3ad576b47fd232e990cdc8ae75 Reviewed-on: https://go-review.googlesource.com/c/go/+/689155 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-24  runtime: add benchmark for small-size memory operation  Julian Zhu
On RISC-V and MIPS architectures, support for misaligned loads and stores is not mandatory for implementations. Therefore, it's important to handle memory operations involving small sizes or sizes that leave a remainder when divided by 8 or 4. This CL adds some benchmarks for small-size memory operations, to ensure that SSA rules do not generate unaligned access traps on such architectures. Change-Id: I6fcdfdb76e9552d5b10df140fa92568ac9468386 Reviewed-on: https://go-review.googlesource.com/c/go/+/682575 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2025-07-24  runtime: optimize memclr on mips64x  Julian Zhu
Memclr/5-4 49.94n ± 5% 50.51n ± 1% ~ (p=0.331 n=6)
Memclr/16-4 22.71n ± 0% 21.01n ± 2% -7.47% (p=0.002 n=6)
Memclr/64-4 49.70n ± 1% 26.09n ± 1% -47.51% (p=0.002 n=6)
Memclr/256-4 84.23n ± 3% 44.32n ± 2% -47.38% (p=0.002 n=6)
Memclr/4096-4 805.6n ± 1% 220.9n ± 2% -72.57% (p=0.002 n=6)
Memclr/65536-4 12.734µ ± 1% 3.287µ ± 1% -74.19% (p=0.002 n=6)
Memclr/1M-4 209.1µ ± 0% 105.9µ ± 5% -49.34% (p=0.002 n=6)
Memclr/4M-4 838.9µ ± 6% 418.2µ ± 0% -50.15% (p=0.002 n=6)
Memclr/8M-4 1.708m ± 4% 1.108m ± 4% -35.15% (p=0.002 n=6)
Memclr/16M-4 3.458m ± 1% 2.840m ± 3% -17.88% (p=0.002 n=6)
Memclr/64M-4 14.05m ± 0% 11.40m ± 2% -18.87% (p=0.002 n=6)
MemclrUnaligned/0_5-4 50.57n ± 2% 51.00n ± 0% ~ (p=0.063 n=6)
MemclrUnaligned/0_16-4 48.82n ± 8% 22.39n ± 1% -54.14% (p=0.002 n=6)
MemclrUnaligned/0_64-4 52.73n ± 3% 25.29n ± 0% -52.05% (p=0.002 n=6)
MemclrUnaligned/0_256-4 88.41n ± 1% 50.04n ± 7% -43.41% (p=0.002 n=6)
MemclrUnaligned/0_4096-4 802.2n ± 1% 220.4n ± 1% -72.53% (p=0.002 n=6)
MemclrUnaligned/0_65536-4 12.729µ ± 0% 3.341µ ± 6% -73.76% (p=0.002 n=6)
MemclrUnaligned/1_5-4 50.52n ± 0% 50.99n ± 6% +0.93% (p=0.002 n=6)
MemclrUnaligned/1_16-4 71.23n ± 1% 71.78n ± 1% +0.77% (p=0.041 n=6)
MemclrUnaligned/1_64-4 85.11n ± 0% 76.30n ± 1% -10.36% (p=0.002 n=6)
MemclrUnaligned/1_256-4 133.50n ± 2% 91.91n ± 1% -31.15% (p=0.002 n=6)
MemclrUnaligned/1_4096-4 849.7n ± 0% 291.3n ± 2% -65.72% (p=0.002 n=6)
MemclrUnaligned/1_65536-4 12.776µ ± 1% 3.399µ ± 1% -73.40% (p=0.002 n=6)
MemclrUnaligned/4_5-4 44.34n ± 0% 44.52n ± 7% +0.41% (p=0.022 n=6)
MemclrUnaligned/4_16-4 70.68n ± 0% 71.24n ± 4% ~ (p=0.132 n=6)
MemclrUnaligned/4_64-4 81.83n ± 4% 77.98n ± 2% -4.71% (p=0.002 n=6)
MemclrUnaligned/4_256-4 121.15n ± 3% 87.58n ± 0% -27.71% (p=0.002 n=6)
MemclrUnaligned/4_4096-4 837.0n ± 2% 278.8n ± 3% -66.69% (p=0.002 n=6)
MemclrUnaligned/4_65536-4 12.793µ ± 6% 3.373µ ± 3% -73.64% (p=0.002 n=6)
MemclrUnaligned/7_5-4 43.89n ± 2% 43.10n ± 0% -1.80% (p=0.002 n=6)
MemclrUnaligned/7_16-4 73.59n ± 2% 72.95n ± 1% -0.86% (p=0.006 n=6)
MemclrUnaligned/7_64-4 88.67n ± 0% 78.89n ± 1% -11.03% (p=0.002 n=6)
MemclrUnaligned/7_256-4 123.90n ± 1% 85.41n ± 2% -31.07% (p=0.002 n=6)
MemclrUnaligned/7_4096-4 842.8n ± 2% 268.0n ± 0% -68.20% (p=0.002 n=6)
MemclrUnaligned/7_65536-4 12.877µ ± 11% 3.348µ ± 0% -74.00% (p=0.002 n=6)
MemclrUnaligned/0_1M-4 208.4µ ± 5% 104.6µ ± 1% -49.80% (p=0.002 n=6)
MemclrUnaligned/0_4M-4 836.1µ ± 7% 419.3µ ± 2% -49.85% (p=0.002 n=6)
MemclrUnaligned/0_8M-4 1.701m ± 9% 1.136m ± 12% -33.21% (p=0.002 n=6)
MemclrUnaligned/0_16M-4 3.467m ± 16% 2.832m ± 4% -18.30% (p=0.002 n=6)
MemclrUnaligned/0_64M-4 14.05m ± 2% 11.33m ± 2% -19.38% (p=0.002 n=6)
MemclrUnaligned/1_1M-4 208.8µ ± 4% 104.7µ ± 1% -49.85% (p=0.002 n=6)
MemclrUnaligned/1_4M-4 838.0µ ± 0% 418.3µ ± 2% -50.09% (p=0.002 n=6)
MemclrUnaligned/1_8M-4 1.692m ± 1% 1.108m ± 3% -34.53% (p=0.002 n=6)
MemclrUnaligned/1_16M-4 3.463m ± 20% 2.833m ± 6% -18.21% (p=0.002 n=6)
MemclrUnaligned/1_64M-4 14.05m ± 4% 11.35m ± 2% -19.28% (p=0.002 n=6)
MemclrUnaligned/4_1M-4 209.2µ ± 1% 104.7µ ± 7% -49.94% (p=0.002 n=6)
MemclrUnaligned/4_4M-4 836.2µ ± 6% 418.8µ ± 15% -49.91% (p=0.002 n=6)
MemclrUnaligned/4_8M-4 1.702m ± 0% 1.123m ± 4% -34.01% (p=0.002 n=6)
MemclrUnaligned/4_16M-4 3.476m ± 8% 2.804m ± 2% -19.34% (p=0.002 n=6)
MemclrUnaligned/4_64M-4 14.13m ± 25% 11.40m ± 0% -19.33% (p=0.002 n=6)
MemclrUnaligned/7_1M-4 208.9µ ± 8% 104.9µ ± 6% -49.81% (p=0.002 n=6)
MemclrUnaligned/7_4M-4 845.6µ ± 12% 418.2µ ± 7% -50.54% (p=0.002 n=6)
MemclrUnaligned/7_8M-4 1.706m ± 10% 1.101m ± 3% -35.48% (p=0.002 n=6)
MemclrUnaligned/7_16M-4 3.466m ± 3% 2.812m ± 2% -18.86% (p=0.002 n=6)
MemclrUnaligned/7_64M-4 14.08m ± 5% 11.35m ± 18% -19.37% (p=0.002 n=6)
GoMemclr/5-4 49.79n ± 2% 50.34n ± 0% ~ (p=0.394 n=6)
GoMemclr/16-4 21.64n ± 0% 22.04n ± 7% +1.85% (p=0.002 n=6)
GoMemclr/64-4 47.93n ± 4% 23.77n ± 4% -50.41% (p=0.002 n=6)
GoMemclr/256-4 82.77n ± 2% 43.90n ± 0% -46.96% (p=0.002 n=6)
Change-Id: I272967d001809ac4948e4118df6cdd0e0661ab96
Reviewed-on: https://go-review.googlesource.com/c/go/+/682195
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-07-24  runtime: improvement in memclr for s390x  kmvijay
The unrolled loop for sizes >= 4KB is further optimized. Offsets are computed and included in the XC instruction directly. This reduces code size and instruction count and improves performance.

goos: linux
goarch: s390x
pkg: runtime
| Orig_Memclr_for_benchstat_2.log | MM_Memclr_for_benchstat_No_VSTL_3.log |
| sec/op | sec/op vs base |
Memclr/5 1.925n ± 0% 1.925n ± 0% ~ (p=0.211 n=10)
Memclr/16 2.604n ± 13% 2.633n ± 11% ~ (p=0.912 n=10)
Memclr/64 3.598n ± 2% 3.520n ± 5% ~ (p=0.190 n=10)
Memclr/256 3.571n ± 12% 3.538n ± 11% ~ (p=0.739 n=10)
Memclr/4096 15.15n ± 0% 15.14n ± 0% ~ (p=0.204 n=10)
Memclr/65536 226.3n ± 0% 224.9n ± 0% -0.62% (p=0.000 n=10)
Memclr/1M 12.77µ ± 0% 12.60µ ± 0% -1.35% (p=0.000 n=10)
Memclr/4M 51.07µ ± 0% 50.37µ ± 0% -1.38% (p=0.000 n=10)
Memclr/8M 102.1µ ± 0% 100.7µ ± 0% -1.36% (p=0.000 n=10)
Memclr/16M 204.4µ ± 0% 201.6µ ± 0% -1.35% (p=0.000 n=10)
Memclr/64M 965.4µ ± 0% 935.3µ ± 0% -3.12% (p=0.000 n=10)
MemclrUnaligned/0_5 2.671n ± 6% 2.618n ± 0% ~ (p=0.194 n=10)
MemclrUnaligned/0_16 3.143n ± 6% 2.955n ± 8% ~ (p=0.089 n=10)
MemclrUnaligned/0_64 3.622n ± 3% 3.571n ± 2% ~ (p=0.304 n=10)
MemclrUnaligned/0_256 3.712n ± 8% 3.653n ± 5% ~ (p=0.754 n=10)
MemclrUnaligned/0_4096 15.14n ± 0% 15.14n ± 0% ~ (p=1.000 n=10) ¹
MemclrUnaligned/0_65536 231.9n ± 0% 225.2n ± 0% -2.91% (p=0.000 n=10)
MemclrUnaligned/1_5 2.620n ± 8% 2.620n ± 0% ~ (p=0.866 n=10)
MemclrUnaligned/1_16 3.103n ± 7% 2.933n ± 9% ~ (p=0.052 n=10)
MemclrUnaligned/1_64 3.576n ± 3% 3.568n ± 3% ~ (p=0.748 n=10)
MemclrUnaligned/1_256 3.744n ± 9% 3.709n ± 10% ~ (p=0.853 n=10)
MemclrUnaligned/1_4096 26.23n ± 0% 26.23n ± 0% ~ (p=1.000 n=10) ¹
MemclrUnaligned/1_65536 401.1n ± 0% 399.5n ± 0% -0.40% (p=0.000 n=10)
MemclrUnaligned/4_5 2.620n ± 6% 2.623n ± 0% ~ (p=0.985 n=10)
MemclrUnaligned/4_16 3.095n ± 7% 3.005n ± 9% ~ (p=0.247 n=10)
MemclrUnaligned/4_64 3.586n ± 1% 3.578n ± 3% ~ (p=1.000 n=10)
MemclrUnaligned/4_256 3.843n ± 5% 3.742n ± 10% ~ (p=0.971 n=10)
MemclrUnaligned/4_4096 26.23n ± 0% 26.23n ± 0% ~ (p=1.000 n=10)
MemclrUnaligned/4_65536 401.1n ± 0% 399.5n ± 0% -0.41% (p=0.000 n=10)
MemclrUnaligned/7_5 2.634n ± 6% 2.644n ± 4% ~ (p=0.896 n=10)
MemclrUnaligned/7_16 3.119n ± 7% 3.044n ± 9% ~ (p=0.529 n=10)
MemclrUnaligned/7_64 3.568n ± 1% 3.585n ± 3% ~ (p=0.499 n=10)
MemclrUnaligned/7_256 3.741n ± 9% 3.629n ± 6% ~ (p=0.853 n=10)
MemclrUnaligned/7_4096 26.23n ± 0% 26.23n ± 0% ~ (p=1.000 n=10) ¹
MemclrUnaligned/7_65536 401.1n ± 0% 399.4n ± 0% -0.42% (p=0.000 n=10)
MemclrUnaligned/0_1M 12.82µ ± 0% 12.60µ ± 0% -1.70% (p=0.000 n=10)
MemclrUnaligned/0_4M 51.28µ ± 0% 50.37µ ± 0% -1.77% (p=0.000 n=10)
MemclrUnaligned/0_8M 102.5µ ± 0% 100.8µ ± 0% -1.75% (p=0.000 n=10)
MemclrUnaligned/0_16M 205.1µ ± 0% 201.7µ ± 0% -1.62% (p=0.000 n=10)
MemclrUnaligned/0_64M 965.2µ ± 0% 934.7µ ± 0% -3.16% (p=0.000 n=10)
MemclrUnaligned/1_1M 16.02µ ± 0% 15.81µ ± 0% -1.34% (p=0.000 n=10)
MemclrUnaligned/1_4M 64.03µ ± 0% 63.20µ ± 0% -1.29% (p=0.000 n=10)
MemclrUnaligned/1_8M 128.0µ ± 0% 126.4µ ± 0% -1.27% (p=0.000 n=10)
MemclrUnaligned/1_16M 256.3µ ± 0% 253.2µ ± 0% -1.21% (p=0.000 n=10)
MemclrUnaligned/1_64M 1.210m ± 0% 1.187m ± 0% -1.88% (p=0.000 n=10)
MemclrUnaligned/4_1M 16.03µ ± 0% 15.81µ ± 0% -1.37% (p=0.000 n=10)
MemclrUnaligned/4_4M 64.04µ ± 0% 63.20µ ± 0% -1.31% (p=0.000 n=10)
MemclrUnaligned/4_8M 128.0µ ± 0% 126.4µ ± 0% -1.27% (p=0.000 n=10)
MemclrUnaligned/4_16M 256.1µ ± 0% 253.0µ ± 0% -1.20% (p=0.000 n=10)
MemclrUnaligned/4_64M 1.210m ± 0% 1.188m ± 0% -1.81% (p=0.000 n=10)
MemclrUnaligned/7_1M 16.02µ ± 0% 15.81µ ± 0% -1.32% (p=0.000 n=10)
MemclrUnaligned/7_4M 64.06µ ± 0% 63.21µ ± 0% -1.34% (p=0.000 n=10)
MemclrUnaligned/7_8M 128.1µ ± 0% 126.4µ ± 0% -1.29% (p=0.000 n=10)
MemclrUnaligned/7_16M 256.2µ ± 0% 253.2µ ± 0% -1.18% (p=0.000 n=10)
MemclrUnaligned/7_64M 1.210m ± 0% 1.188m ± 0% -1.82% (p=0.000 n=10)
MemclrRange/1K_2K 841.1n ± 1% 879.0n ± 3% +4.51% (p=0.002 n=10)
MemclrRange/2K_8K 1.435µ ± 2% 1.415µ ± 0% -1.39% (p=0.000 n=10)
MemclrRange/4K_16K 1.241µ ± 0% 1.209µ ± 0% -2.58% (p=0.000 n=10)
MemclrRange/160K_228K 19.83µ ± 0% 19.59µ ± 0% -1.22% (p=0.000 n=10)
MemclrKnownSize1 1.732n ± 0% 1.732n ± 0% ~ (p=0.474 n=10)
MemclrKnownSize2 1.925n ± 3% 1.925n ± 1% ~ (p=0.929 n=10)
MemclrKnownSize4 1.732n ± 0% 1.732n ± 0% ~ (p=1.000 n=10) ¹
MemclrKnownSize8 1.732n ± 0% 1.732n ± 0% ~ (p=1.000 n=10)
MemclrKnownSize16 2.413n ± 9% 2.681n ± 14% +11.10% (p=0.004 n=10)
MemclrKnownSize32 3.284n ± 4% 3.328n ± 2% ~ (p=0.671 n=10)
MemclrKnownSize64 4.893n ± 1% 4.882n ± 1% ~ (p=0.591 n=10)
MemclrKnownSize112 5.623n ± 2% 5.596n ± 2% -0.48% (p=0.027 n=10)
MemclrKnownSize128 5.612n ± 1% 5.599n ± 0% ~ (p=0.066 n=10)
MemclrKnownSize192 7.128n ± 1% 7.337n ± 2% +2.93% (p=0.000 n=10)
MemclrKnownSize248 6.740n ± 1% 6.829n ± 3% +1.33% (p=0.005 n=10)
MemclrKnownSize256 3.657n ± 8% 3.512n ± 14% ~ (p=0.436 n=10)
MemclrKnownSize512 3.624n ± 3% 3.982n ± 9% +9.88% (p=0.017 n=10)
MemclrKnownSize1024 4.662n ± 0% 4.680n ± 0% +0.39% (p=0.000 n=10)
MemclrKnownSize4096 15.14n ± 0% 15.15n ± 0% +0.07% (p=0.000 n=10)
MemclrKnownSize512KiB 6.388µ ± 0% 6.309µ ± 0% -1.24% (p=0.000 n=10)
geomean 268.9n 266.9n -0.75%
¹ all samples are equal

Change-Id: I2911866fb82777311ec4219600fb48c85f7bf862
Reviewed-on: https://go-review.googlesource.com/c/go/+/682595
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-24  runtime: randomize heap base address  Roland Shoemaker
During initialization, allow randomizing the heap base address by generating a random uint64 and using its bits to randomize various portions of the heap base address.

We use the following method to randomize the base address:

* We first generate a random heapArenaBytes aligned address that we use for generating the hints.
* On the first call to mheap.grow, we then generate a random PallocChunkBytes aligned offset into the mmap'd heap region, which we use as the base for the heap region.
* We then mark a random number of pages within the page allocator as allocated.

Our final randomized "heap base address" becomes the first byte of the first available page returned by the page allocator. This results in an address with at least heapAddrBits-gc.PageShift-1 bits of entropy.

Fixes #27583
Change-Id: Ideb4450a5ff747a132f702d563d2a516dec91a88
Reviewed-on: https://go-review.googlesource.com/c/go/+/674835
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
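The three randomization steps can be sketched with plain integer arithmetic. This is only an illustration of aligned random offsets: every constant below (address width, arena, chunk, and page sizes) and every function name is an assumption chosen for the example, not a value or API taken from the runtime, and the way the random bits are split across the steps is likewise made up.

```go
package main

import (
	"fmt"
	"math/rand"
)

// All sizes below are illustrative assumptions, not the runtime's values.
const (
	addrBits   = 48       // assumed usable address bits
	arenaBytes = 64 << 20 // assumed arena alignment (64 MiB)
	chunkBytes = 4 << 20  // assumed page-allocator chunk size (4 MiB)
	pageBytes  = 8 << 10  // assumed page size (8 KiB)
)

// alignDown rounds x down to a multiple of align (a power of two).
func alignDown(x, align uint64) uint64 {
	return x &^ (align - 1)
}

// hintBase derives an arena-aligned address below 1<<addrBits (step 1).
func hintBase(r uint64) uint64 {
	return alignDown(r&(1<<addrBits-1), arenaBytes)
}

// chunkOffset derives a chunk-aligned offset into a region of
// regionBytes bytes (step 2).
func chunkOffset(r, regionBytes uint64) uint64 {
	return alignDown(r%regionBytes, chunkBytes)
}

// pageSkip derives a byte offset that skips a random number of whole
// pages within one chunk (step 3).
func pageSkip(r uint64) uint64 {
	return (r % (chunkBytes / pageBytes)) * pageBytes
}

func main() {
	r := rand.Uint64()
	base := hintBase(r)
	off := chunkOffset(r>>20, arenaBytes)
	skip := pageSkip(r >> 40)
	fmt.Printf("hint base %#x, chunk offset %#x, page skip %#x\n", base, off, skip)
	fmt.Printf("first usable byte %#x\n", base+off+skip)
}
```

Each step randomizes a different granularity of the address, which is why the final entropy is bounded by the address width minus the page shift.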
2025-07-24  runtime: drop NetBSD kernel bug sysmon workaround fixed in NetBSD 9.2  Tobias Klauser
The NetBSD releases supported by the NetBSD project as of today are 9.4 and 10.1. The Go project's NetBSD builders are on 9.3. Thus, it is fine to drop the workaround which was only needed for NetBSD before 9.2. Fixes #46495 Cq-Include-Trybots: luci.golang.try:gotip-netbsd-arm64 Change-Id: I3c2ec42fb0f08f7dafdfb7f1dbd97853afc16386 Reviewed-on: https://go-review.googlesource.com/c/go/+/687735 Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com> Auto-Submit: Benny Siegert <bsiegert@gmail.com> Reviewed-by: Benny Siegert <bsiegert@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-23  runtime: check TestUsingVDSO ExitError type assertion  Michael Pratt
Currently this test panics if the error is not an ExitError. We aren't expecting other errors, but we want to continue to the t.Fatal so the error contents actually get logged. For #74672. Change-Id: I6a6a636cee5ddac500ed7ec549340b02944101ab Reviewed-on: https://go-review.googlesource.com/c/go/+/689956 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com>
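The general pattern behind this fix is the two-value ("comma ok") type assertion. The sketch below is not the test's code: `describe` and `errOther` are hypothetical names, and the example uses a plain error instead of running a command so that it stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// errOther is a hypothetical error that is not an *exec.ExitError.
var errOther = errors.New("boom")

// describe reports on err without panicking when it is not an
// *exec.ExitError. The single-value form err.(*exec.ExitError) would
// panic in that case and hide the error's contents.
func describe(err error) string {
	ee, ok := err.(*exec.ExitError)
	if !ok {
		return fmt.Sprintf("unexpected error type %T: %v", err, err)
	}
	return fmt.Sprintf("command exited with code %d", ee.ExitCode())
}

func main() {
	fmt.Println(describe(errOther))
}
```

In a test, the `!ok` branch is where a `t.Fatal`-style call would log the unexpected error instead of crashing on the assertion.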
2025-07-22  all: go fmt  Michael Pratt
Change-Id: I6a6a636c341d4ba3518be7f6806093bbdff11c88 Reviewed-on: https://go-review.googlesource.com/c/go/+/689535 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-21  runtime: relax TestMemoryLimitNoGCPercent a bit  Keith Randall
It seems to be pretty flaky. I've seen: retained=289438024 limit=268435456 bound=285212672 Which is ~4MB over the bound. Not sure why this tends to be darwin-specific, but we'll fix just darwin for now. (It isn't quite darwin-only, as it appeared in #66893. But it is certainly worse on darwin.) Fixes #73136 Update #66893 Change-Id: If609e909bc6c65c2663dd46b7a9bad4fd291c3da Reviewed-on: https://go-review.googlesource.com/c/go/+/689315 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-21  runtime: fix asan wrapper for riscv64  Keith Randall
Make sure we keep the M in X21. The instruction at line 87 needs to be loading from the M, not from the gsignal. Fixes #73979, maybe? Change-Id: I3aba1c35454636138b305bca26e20dbc8e6b1d6e Reviewed-on: https://go-review.googlesource.com/c/go/+/688975 Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com> Reviewed-by: Mark Ryan <markdryan@rivosinc.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-16  runtime: fix idle time double-counting bug  Michael Anthony Knyszek
This change fixes a bug in the accounting of sched.idleTime. In just the case where the GC CPU limiter needs up-to-date data, sched.idleTime is incremented in both the P-idle-time and idle-mark-work paths, but it should only be incremented in the former case. Fixes #74627. Change-Id: If41b03da102d47d25bec48ff750a9da27019b71d Reviewed-on: https://go-review.googlesource.com/c/go/+/687998 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-16  cmd/link, runtime: on Wasm, put only function index in method table and func table  Cherry Mui
The type descriptor's method table contains relative PCs of the methods (relative to the start of the text section) stored as 32-bit offsets. On Wasm, a PC is PC_F<<16 + PC_B, where PC_F is the function index, and PC_B is the block index. When there are more than 65536 functions, the PC will not fit into 32 bits (and making it relative to the section start doesn't help). Since there are no more bits for the function index, and the method table always targets the entry of a method, we put just the PC_F there, and rewrite back to a full PC at run time when we need the PC. This way we can have more than 65536 functions. The func table also contains 32-bit relative PCs, and it also always points to function entries. Do the same there, as well as other places where we use relative text offsets. Also add the relocation type in the relocation overflow error message. Also add a check for functions that are too big on Wasm. If a function has more than 65536 blocks, PC_B will overflow and PC = PC_F<<16 + PC_B will point to the wrong function. Fixes #64856. Change-Id: If9c307e9fb1641f367a5f19c39f88f455805d0bb Reviewed-on: https://go-review.googlesource.com/c/go/+/552835 Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
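The arithmetic behind both overflow cases follows directly from the formula PC = PC_F<<16 + PC_B. The sketch below works it through with ordinary integers; the helper names are made up for this example and are not linker or runtime functions.

```go
package main

import "fmt"

// makePC builds a Wasm-style PC from a function index f and a block
// index b, using the formula from the message: PC = f<<16 + b.
func makePC(f, b uint64) uint64 {
	return f<<16 + b
}

// funcIndex and blockIndex split a PC back into its two parts.
func funcIndex(pc uint64) uint64  { return pc >> 16 }
func blockIndex(pc uint64) uint64 { return pc & 0xffff }

// fitsIn32 reports whether a value can be stored in a 32-bit field.
func fitsIn32(v uint64) bool {
	return v <= 1<<32-1
}

func main() {
	// The last function whose full PC still fits in 32 bits.
	fmt.Println(fitsIn32(makePC(65535, 0)))
	// One more function and the full PC no longer fits.
	fmt.Println(fitsIn32(makePC(65536, 0)))
	// Storing only the function index leaves plenty of room.
	fmt.Println(fitsIn32(65536))
	// A block index of 65536 spills into the function index bits.
	pc := makePC(1, 65536)
	fmt.Println(funcIndex(pc), blockIndex(pc))
}
```

The last case is the reason for the added size check: with too many blocks, the PC silently names the next function.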
2025-07-16  runtime: use 32-bit function index on Wasm  Cherry Mui
Following CL 567896, this is one more place we used only 16 bits for the function index. Change it to load 32 bits. For #64856. Change-Id: I66a78c086e67165604053313751c097a70c50ba9 Reviewed-on: https://go-review.googlesource.com/c/go/+/609118 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-07-15  runtime: use memclrNoHeapPointers to clear inline mark bits  Michael Anthony Knyszek
Clearing the inline mark bits with memclrNoHeapPointers is slightly better than having the compiler insert, e.g. duffzero, since it can take advantage of wider SIMD instructions. duffzero is likely going away, but we know things the compiler doesn't, such as the fact that this memory is nicely aligned. In this particular case, memclrNoHeapPointers does a better job. For #73581. Change-Id: I3918096929acfe6efe6f469fb089ebe04b4acff5 Reviewed-on: https://go-review.googlesource.com/c/go/+/687938 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2025-07-15  runtime: only clear inline mark bits on span alloc if necessary  Michael Anthony Knyszek
This change modifies initInlineMarkBits to only clear mark bits if the span wasn't just freshly allocated from the OS, where we know the bits are already zeroed. This probably doesn't make a huge difference most of the time, but it's an easy optimization and helps rule it out as a source of slowdown. For #73581. Change-Id: I78cd4d8968bb0bf6536c0a38ef9397475c39f0ad Reviewed-on: https://go-review.googlesource.com/c/go/+/687937 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-15  runtime: have mergeInlineMarkBits also clear the inline mark bits  Michael Anthony Knyszek
This is conceptually simpler, as the sweeper doesn't have to worry about clearing them separately. It also doesn't have a use for them. This will also be useful for avoiding unnecessary zeroing in initInlineMarkBits at allocation time. Currently, because it's used in both span allocation and at sweep time, we cannot blindly trust needzero. This change also renames mergeInlineMarkBits to moveInlineMarkBits to make this change in semantics clearer from the name. For #73581. Change-Id: Ib154738a945633b7ff5b2ae27235baa310400139 Reviewed-on: https://go-review.googlesource.com/c/go/+/687936 Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-15  runtime: merge inline mark bits with gcmarkBits 8 bytes at a time  Michael Anthony Knyszek
Currently, with Green Tea GC, we need to copy (really bitwise-or) mark bits back into mspan.gcmarkBits, so that it can propagate to mspan.allocBits at sweep time. This function does actually seem to make sweeping small spans a good bit more expensive, though sweeping is still relatively cheap. There's some low-hanging fruit here though, in that the merge is performed one byte at a time, but this is pretty inefficient. We can almost as easily perform this merge one word at a time instead, which seems to make this operation about 33% faster. For #73581. Change-Id: I170d36e7a2193199c423dcd556cba048ebd698af Reviewed-on: https://go-review.googlesource.com/c/go/+/687935 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
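The difference between a byte-at-a-time and a word-at-a-time bitwise-or merge can be sketched on ordinary byte slices. This is not the runtime's code: `mergeBytes` and `mergeWords` are hypothetical names, the word version assumes both slices have the same length and that the length is a multiple of 8, and it goes through encoding/binary where the runtime would work on raw memory.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// mergeBytes ORs src into dst one byte per iteration.
func mergeBytes(dst, src []byte) {
	for i := range src {
		dst[i] |= src[i]
	}
}

// mergeWords ORs src into dst eight bytes per iteration. It assumes
// len(dst) == len(src) and that the length is a multiple of 8.
func mergeWords(dst, src []byte) {
	for i := 0; i+8 <= len(src); i += 8 {
		d := binary.LittleEndian.Uint64(dst[i:])
		s := binary.LittleEndian.Uint64(src[i:])
		binary.LittleEndian.PutUint64(dst[i:], d|s)
	}
}

func main() {
	src := []byte{0x01, 0, 0, 0, 0, 0, 0, 0x80, 0xf0, 0, 0, 0, 0, 0, 0, 0x0f}
	a := []byte{0x10, 1, 0, 0, 0, 0, 0, 0x01, 0x0f, 0, 0, 0, 0, 0, 2, 0xf0}
	b := append([]byte(nil), a...)

	mergeBytes(a, src)
	mergeWords(b, src)
	fmt.Println("same result:", bytes.Equal(a, b))
}
```

Both loops compute the same bits; the word version simply does one eighth as many iterations.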
2025-07-14  runtime: expand GOMAXPROCS documentation  Michael Pratt
Expand the GOMAXPROCS documentation to include details of how defaults are selected, as this is something that inquisitive minds will want to know. I've added an additional warning that these details may change. While we are here, add a bit more structure to make it easier to find the relevant parts of the documentation. For #73193. Change-Id: I6a6a636cae93237e3e3174822490d51805e70990 Reviewed-on: https://go-review.googlesource.com/c/go/+/685318 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Sean Liao <sean@liao.dev> Reviewed-by: Sean Liao <sean@liao.dev> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
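For readers who want to inspect the value that the documentation describes, the short sketch below queries and temporarily overrides the setting. It says nothing about how the default is chosen, since the message does not spell that out; `currentProcs` and `withProcs` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentProcs reports the current GOMAXPROCS setting without changing
// it. Passing 0 to runtime.GOMAXPROCS is a pure query.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

// withProcs temporarily sets GOMAXPROCS to n, reports the value observed
// while the override is active, and restores the previous setting.
func withProcs(n int) (during int) {
	prev := runtime.GOMAXPROCS(n) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)
	return currentProcs()
}

func main() {
	fmt.Println("current:", currentProcs(), "logical CPUs:", runtime.NumCPU())
	fmt.Println("override:", withProcs(2))
	fmt.Println("restored:", currentProcs())
}
```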
2025-07-11  runtime: gofmt after CL 643897 and CL 662455  Tobias Klauser
Change-Id: I3103325ebe29509c00b129a317b5708aece575a0 Reviewed-on: https://go-review.googlesource.com/c/go/+/687715 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
2025-07-11runtime: turn off large memmove tests under asan/msanKeith Randall
Just like we do for race mode. They are just too slow when running with the sanitizers. Fixes #59448 Change-Id: I86e3e3488ec5c4c29e410955e9dc4cbc99d39b84 Reviewed-on: https://go-review.googlesource.com/c/go/+/687535 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-07-11[dev.simd] all: merge master (88cf0c5) into dev.simdCherry Mui
Merge List: + 2025-07-11 88cf0c5d55 cmd/link: do size fixups after symbol references are loaded + 2025-07-10 7a38975a48 os: trivial comment fix + 2025-07-10 aa5de9ebb5 synctest: fix comments for time.Now() in synctests + 2025-07-10 63ec70d4e1 crypto/cipher: Fix comment punctuation + 2025-07-09 8131635e5a runtime: run TestSignalDuringExec in its own process group + 2025-07-09 67c1704444 crypto/tls: empty server_name conf. ext. from server + 2025-07-08 54c9d77630 cmd/go: disable support for multiple vcs in one module + 2025-07-08 fca43a8436 internal: make struct comment match struct name + 2025-07-08 bb917bb030 cmd/compile: document that nosplit directive is unsafe + 2025-07-08 a5bda585d5 cmd/compile: run fmt on ssa + 2025-07-07 86b5ba7310 internal/trace: only test for sync preemption if async preemption is off + 2025-07-07 ef46e1b164 cmd/internal/doc: fix GOROOT skew and path joining bugs + 2025-07-07 75b43f9a97 runtime: make traceStack testable and add a benchmark + 2025-07-07 20978f46fd crypto/rsa: remove another forgotten note to future self + 2025-07-07 33fb4819f5 cmd/compile/internal/ssa: skip EndSequence entries in TestStmtLines + 2025-07-07 a995269a93 sort: clarify Less doc + 2025-07-03 6c3b5a2798 runtime: correct vdsoSP on S390X + 2025-07-03 dd687c3860 hash: document that Clone may only return ErrUnsupported or a nil error + 2025-07-02 b325151453 cmd/cgo/internal/testsanitizers: skip asan tests when FIPS140 mode is on + 2025-07-02 15d9fe43d6 testing/synctest: explicitly state Run will be removed in Go 1.26 + 2025-07-01 de646d94f7 cmd/go/internal/modindex: apply changes in CL 502615 to modindex package + 2025-07-01 2f653a5a9e crypto/tls: ensure the ECDSA curve matches the signature algorithm + 2025-07-01 6e95fd96cc crypto/ecdsa: fix crypto/x509 godoc links + 2025-07-01 7755a05209 Revert "crypto/internal/fips140/subtle: add assembly implementation of xorBytes for arm" + 2025-07-01 d168ad18e1 slices: update TestIssue68488 to avoid false positives + 2025-07-01 
27ad1f5013 internal/abi: fix comment on NonEmptyInterface + 2025-06-30 86fca3dcb6 encoding/json/jsontext: use bytes.Buffer.AvailableBuffer + 2025-06-30 6bd9944c9a encoding/json/v2: avoid escaping jsonopts.Struct + 2025-06-30 e46d586edd cmd/compile/internal/escape: add debug hash for literal allocation optimizations + 2025-06-30 479b51ee1f cmd/compile/internal/escape: stop disabling literal allocation optimizations when coverage is enabled + 2025-06-30 8002d283e8 crypto/tls: update bogo version + 2025-06-30 fdd7713fe5 internal/goexperiment: fix godoc formatting Change-Id: I074e6c75778890930975925c016004aabca2b9d1
2025-07-09runtime: run TestSignalDuringExec in its own process groupMichael Anthony Knyszek
TestSignalDuringExec sends a SIGWINCH to the whole process group. However, it may execute concurrently with other copies of the runtime tests, especially through `go tool dist`, and gdb version <12.1 has a bug in non-interactive mode where receiving a SIGWINCH causes a crash. This change modifies SignalDuringExec in the testprog to first fork itself into a new process group. To avoid issues with Ctrl+C and the new process group hanging, the new process blocks on a pipe that is passed down to it. This pipe is automatically closed when its parent exits, which should ensure that the subprocess also exits. Fixes #58932. Change-Id: I3906afa28cf8b15d22ae612d071bce7f30fc3e6c Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-noswissmap,gotip-linux-amd64-longtest-aliastypeparams,gotip-linux-amd64-longtest,gotip-linux-386-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/686875 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-07-09[dev.simd] runtime: save Z16-Z31 registers in async preemptJunyang Shao
The register allocator will start using the upper registers soon; this CL enables that. Change-Id: I4d7285e08b95f4e6ebee72594dfbe8d1199f09ed Reviewed-on: https://go-review.googlesource.com/c/go/+/686498 TryBot-Bypass: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Commit-Queue: David Chase <drchase@google.com>
2025-07-08cmd/go: disable support for multiple vcs in one moduleRoland Shoemaker
Removes the somewhat redundant vcs.FromDir, "allowNesting" argument, which was always enabled, and disallow multiple VCS metadata folders being present in a single directory. This makes VCS injection attacks much more difficult. Also adds a GODEBUG, allowmultiplevcs, which re-enables this behavior. Thanks to RyotaK (https://ryotak.net) of GMO Flatt Security Inc for reporting this issue. Fixes #74380 Fixes CVE-2025-4674 Change-Id: I5787d90cdca8deb3aca6f154efb627df1e7d2789 Reviewed-on: https://go-review.googlesource.com/c/go/+/686515 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Commit-Queue: Carlos Amedee <carlos@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org>
2025-07-07runtime: make traceStack testable and add a benchmarkMichael Anthony Knyszek
Change-Id: Ide4daa5eee3fd4f3007d6ef23aa84b8916562c39 Reviewed-on: https://go-review.googlesource.com/c/go/+/684457 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-03runtime: correct vdsoSP on S390XCherry Mui
It should get the caller's SP. The current code gets the address of the first parameter, which is one word above the caller's SP. There is a slot for saving the LR at 0(SP) in the caller's frame. Fixes #62086 (for s390x). Change-Id: Ie8cbfabc8161b98658c884a32e0af72df189ea56 Reviewed-on: https://go-review.googlesource.com/c/go/+/685715 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-06-30[dev.simd] runtime: remove write barrier in xRegRestoreAustin Clements
Currently, there's a write barrier in xRegRestore when it assigns pp.xRegs.cache = gp.xRegs.state. This is bad because that gets called on the asyncPreempt return path, where we have really limited stack space, and we don't currently account for this write barrier. We can't simply mark xRegState as sys.NotInHeap because it's also embedded in runtime.p as register scratch space, and runtime.p is heap allocated. Hence, to fix this, we rename xRegState to just "xRegs" and introduce a wrapper "xRegState" type that embeds xRegs and is itself marked sys.NotInHeap. Then, anywhere we need a manually-managed pointer to register state, we use the new type. To ensure this doesn't happen again in the future, we also mark asyncPreempt2 as go:nowritebarrierrec. Change-Id: I5ff4841e55ff20047ff7d253ab659ab77aeb3391 Reviewed-on: https://go-review.googlesource.com/c/go/+/684836 Auto-Submit: Austin Clements <austin@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-06-30[dev.simd] all: merge master (740857f) into dev.simdCherry Mui
Merge List: + 2025-06-30 740857f529 runtime: stash allpSnapshot on the M + 2025-06-30 9ae38be302 sync: disassociate WaitGroups from bubbles on Wait + 2025-06-30 4731832342 crypto/hmac: wrap ErrUnsupported returned by Clone + 2025-06-30 03ad694dcb runtime: update skips for TestGdbBacktrace + 2025-06-30 9d1cd0b881 iter: add missing type parameter in doc + 2025-06-29 acb914f2c2 cmd/doc: fix -http on Windows + 2025-06-27 b51f1cdb87 runtime: remove arbitrary 5-second timeout in TestNeedmDeadlock + 2025-06-27 f1e6ae2f6f reflect: fix TypeAssert on nil interface values + 2025-06-27 e81c624656 os: use minimal file permissions when opening parent directory in RemoveAll + 2025-06-27 2a22aefa1f encoding/json: add security section to doc + 2025-06-27 742fda9524 runtime: account for missing frame pointer in preamble + 2025-06-27 fdc076ce76 net/http: fix RoundTrip context cancellation for js/wasm + 2025-06-27 d9d2cadd63 encoding/json: fix typo in hotlink for jsontext.PreserveRawStrings + 2025-06-26 0f8ab2db17 cmd/link: permit a larger size BSS reference to a smaller DATA symbol + 2025-06-26 988a20c8c5 cmd/compile/internal/escape: evaluate any side effects when rewriting with literals + 2025-06-25 b5d555991a encoding/json/jsontext: remove Encoder.UnusedBuffer + 2025-06-25 0b4d2eab2f encoding/json/jsontext: rename Encoder.UnusedBuffer as Encoder.AvailableBuffer Change-Id: Iea44ab825bdf087fbe7570df8d2d66d1d3327c31
2025-06-30[dev.simd] runtime: save AVX2 and AVX-512 state on asynchronous preemptionAustin Clements
Based on CL 669415 by shaojunyang@google.com. Change-Id: I574f15c3b18a7179a1573aaf567caf18d8602ef1 Reviewed-on: https://go-review.googlesource.com/c/go/+/680900 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Austin Clements <austin@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-06-30[dev.simd] runtime: save scalar registers off stack in amd64 async preemptionAustin Clements
Asynchronous preemption must save all registers that could be in use by Go code. Currently, it saves all of these to the goroutine stack. As a result, the stack frame requirements of asynchronous preemption can be rather high. On amd64, this requires 368 bytes of stack space, most of which is the XMM registers. Several RISC architectures are around 0.5 KiB. As we add support for SIMD instructions, this is going to become a problem. The AVX-512 register state is 2.5 KiB. This well exceeds the nosplit limit, and even if it didn't, could constrain when we can asynchronously preempt goroutines on small stacks. This CL fixes this by moving pure scalar state stored in non-GP registers off the stack and into an allocated "extended register state" object. To reduce space overhead, we only allocate these objects as needed. While in the theoretical limit, every G could need this register state, in practice very few do at a time. However, we can't allocate when we're in the middle of saving the register state during an asynchronous preemption, so we reserve scratch space on every P to temporarily store the register state, which can then be copied out to an allocated state object later by Go code. This commit only implements this for amd64, since that's where we're about to add much more vector state, but it lays the groundwork for doing this on any architecture that could benefit. Change-Id: I123a95e21c11d5c10942d70e27f84d2d99bbf735 Reviewed-on: https://go-review.googlesource.com/c/go/+/680898 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Austin Clements <austin@google.com>
2025-06-30runtime: stash allpSnapshot on the MMichael Pratt
findRunnable takes a snapshot of allp prior to dropping the P because afterwards procresize may mutate allp without synchronization. procresize is careful to never mutate the contents up to cap(allp), so findRunnable can still safely access the Ps in the slice. Unfortunately, growing allp is problematic. If procresize grows the allp backing array, it drops the reference to the old array. allpSnapshot still refers to the old array, but allpSnapshot is on the system stack in findRunnable, which also likely no longer has a P at all. This means that a future GC will not find the reference and can free the array and use it for another allocation. This would corrupt later reads that findRunnable does from the array. The fix is simple: the M struct itself is reachable by the GC, so we can stash the snapshot in the M to ensure it is visible to the GC. The ugliest part of the CL is the cleanup when we are done with the snapshot because there are so many return/goto top sites. I am tempted to put mp.clearAllpSnapshot() in the caller and at top to make this less error prone, at the expense of extra unnecessary writes. Fixes #74414. Change-Id: I6a6a636c484e4f4b34794fd07910b3fffeca830b Reviewed-on: https://go-review.googlesource.com/c/go/+/684460 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com>
2025-06-30runtime: update skips for TestGdbBacktracelimeidan
We encountered a new type of "no such process" error on loong64, of the form "Couldn't get NT_PRSTATUS registers: No such process.". According to the gdb source code, the NT_PRSTATUS part is not fixed and may be another name, so use a regular expression here to match the possible cases. Updates #50838 Fixes #74389 Change-Id: I3e3f7455b2dc6b8aa10c084f24f6a2a114790855 Reviewed-on: https://go-review.googlesource.com/c/go/+/684195 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: abner chenc <chenguoqi@loongson.cn> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
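To make the idea of matching a family of gdb messages concrete, here is a small sketch. The pattern and the shouldSkip helper are assumptions invented for this example; they are not taken from the test itself, whose actual expression may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// noSuchProcess is a hypothetical pattern: it accepts the message with
// any single-word register-set name (such as NT_PRSTATUS) or with no
// name at all.
var noSuchProcess = regexp.MustCompile(`Couldn't get (\w+ )?registers: No such process\.`)

// shouldSkip reports whether gdb output contains a known
// "no such process" failure. Hypothetical helper for illustration.
func shouldSkip(gdbOutput string) bool {
	return noSuchProcess.MatchString(gdbOutput)
}

func main() {
	fmt.Println(shouldSkip("Couldn't get NT_PRSTATUS registers: No such process."))
	fmt.Println(shouldSkip("Couldn't get registers: No such process."))
	fmt.Println(shouldSkip("Breakpoint 1, main.main ()"))
}
```

Matching on the stable parts of the message, rather than the full literal string, keeps the skip working if gdb reports a different register-set name.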
2025-06-27runtime: remove arbitrary 5-second timeout in TestNeedmDeadlockCherry Mui
The NeedmDeadlock test program currently has a 5-second timeout, which is sort of arbitrary. It is long enough in regular mode (which usually takes 0.0X seconds), but not quite so for configurations like ASAN. Instead of using an arbitrary timeout, just use the test's deadline. The test program is invoked with testenv.Command, which will send it a SIGQUIT before the deadline expires. Fixes #56420 (at least for the asan builder). Change-Id: I0b13651cb07241401837ca2e60eaa1b83275b093 Reviewed-on: https://go-review.googlesource.com/c/go/+/684697 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-06-27runtime: account for missing frame pointer in preambleMichael Anthony Knyszek
If a goroutine is synchronously preempted, then taking a frame-pointer-based stack trace at that preemption will skip PC of the caller of the function which called into morestack. This happens because the frame pointer is pushed to the stack after the preamble, leaving the stack in an odd state for frame pointer unwinding. Deal with this by marking a goroutine as synchronously preempted and using that signal to load the missing PC from the stack. On LR platforms this is available in gp.sched.lr. On non-LR platforms like x86, it's at gp.sched.sp, because there are no args, no locals, and no frame pointer pushed to the SP yet. For #68090. Change-Id: I73a1206d8b84eecb8a96dbe727195da30088f288 Reviewed-on: https://go-review.googlesource.com/c/go/+/684435 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com>
2025-06-25[dev.simd] all: merge master (f8ccda2) into dev.simdCherry Mui
Merge List: + 2025-06-25 f8ccda2e05 runtime: make explicit nil check in (*spanInlineMarkBits).init + 2025-06-25 f069a82998 runtime: note custom GOMAXPROCS even if value doesn't change + 2025-06-24 e515ef8bc2 context: fix typo in context_test.go + 2025-06-24 47b941f445 cmd/link: add one more linkname to the blocklist + 2025-06-24 34cf5f6205 go/types: add test for interface method field type + 2025-06-24 6e618cd42a encoding/json: use zstd compressed testdata + 2025-06-24 fcb9850859 net/http: reduce allocs in CrossOriginProtection.Check + 2025-06-24 11f11f2a00 encoding/json/v2: support ISO 8601 durations + 2025-06-24 62deaf4fb8 doc: fix links to runtime Environment Variables + 2025-06-24 2e9bb62bfe encoding/json/v2: reject unquoted dash as a JSON field name + 2025-06-23 ed7815726d encoding/json/v2: report error on time.Duration without explicit format + 2025-06-23 f866958246 cmd/dist: test encoding/json/... with GOEXPERIMENT=jsonv2 + 2025-06-23 f77a0aa6b6 internal/trace: improve gc-stress test + 2025-06-23 4506796a6e encoding/json/jsontext: consistently use JSON terminology + 2025-06-23 456a90aa16 runtime: add missing unlock in sysReserveAlignedSbrk + 2025-06-23 1cf6386b5e Revert "go/types, types2: don't register interface methods in Info.Types map" + 2025-06-20 49cdf0c42e testing, testing/synctest: handle T.Helper in synctest bubbles + 2025-06-20 3bf1eecbd3 runtime: fix struct comment + 2025-06-20 8ed23a2936 crypto/cipher: fix link to crypto/aes + 2025-06-20 ef60769b46 go/doc: add a golden test that reproduces #62640 + 2025-06-18 8552bcf7c2 cmd/go/internal/fips140: ignore GOEXPERIMENT on error + 2025-06-18 4c7567290c runtime: set mspan limit field early and eagerly + 2025-06-18 c6ac736288 runtime: prevent mutual deadlock between GC stopTheWorld and suspendG + 2025-06-17 53af292aed encoding/json/jsontext: fix spelling error + 2025-06-16 d058254689 cmd/dist: always include variant in package names + 2025-06-16 3254c2bb83 internal/reflectlite: fix comment about meaning 
of flag field + 2025-06-16 816199e421 runtime: don't let readTrace spin on trace.shutdown + 2025-06-16 ea00461b17 internal/trace: make Value follow reflect conventions + 2025-06-13 96a6e147b2 runtime: comment that some linknames are used by runtime/trace + 2025-06-13 644905891f runtime: remove unused unique.runtime_blockUntilEmptyFinalizerQueue + 2025-06-13 683810a368 cmd/link: block new standard library linknames + 2025-06-12 9149876112 all: replace a few user-visible mentions of golang.org and godoc.org + 2025-06-12 934d5f2cf7 internal/trace: end test programs with SIGQUIT + 2025-06-12 5a08865de3 net: remove some BUG entries + 2025-06-11 d166a0b03e encoding/json/jsontext, encoding/json/v2: document experimental nature + 2025-06-11 d4c6effaa7 cmd/compile: add up-to-date test for generated files + 2025-06-10 7fa2c736b3 os: disallow Root.Remove(".") on Plan 9, js, and Windows + 2025-06-10 281cfcfc1b runtime: handle system goroutines later in goroutine profiling + 2025-06-10 4f86f22671 testing/synctest, runtime: avoid panic when using linker-alloc WG from bubble Change-Id: I8bbbf40ce053a80395b08977e21b1f34c67de117
2025-06-25runtime: make explicit nil check in (*spanInlineMarkBits).initMichael Anthony Knyszek
The hugo binary gets slower, potentially dramatically so, with GOEXPERIMENT=greenteagc. The root cause is page mapping churn. The Green Tea code introduced a new implicit nil check on value in a freshly-allocated span to clear some new heap metadata. This nil check would read the fresh memory, causing Linux to back that virtual address space with an RO page. This would then be almost immediately written to, causing Linux to possibly flush the TLB and find memory to replace that read-only page (likely deduplicated as just the zero page). This CL fixes the issue by replacing the implicit nil check, which is a memory read expected to fault if it's truly nil, with an explicit one. The explicit nil check is a branch, and thus makes no reads to memory. The result is that the hugo binary no longer gets slower. No regression test because it doesn't seem possible without access to OS internals, like Linux tracepoints. We briefly experimented with RSS metrics, but they're inconsistent. Some system RSS metrics count the deduplicated zero page, while others (like those produced by /proc/self/smaps) do not. Instead, we'll add a new benchmark to our benchmark suite, separately. For #73581. Fixes #74375. Change-Id: I708321c14749a94ccff55072663012eba18b3b91 Reviewed-on: https://go-review.googlesource.com/c/go/+/684015 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>