aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/compile/internal/walk
AgeCommit message (Collapse)Author
6 dayscmd/compile: remove initContext typeCuong Manh Le
All literals are initialized in init function context, thus it's safe to remove the type and simplify all its usage. Change-Id: If128e915cf63199ab0b23b240e0f31508be8f377 Reviewed-on: https://go-review.googlesource.com/c/go/+/764021 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
7 dayscmd/compile: remove unused inNonInitFunction contextCuong Manh Le
CL 737821 removed the last functional use of inNonInitFunction. This field and its associated logic are now dead code and can be safely deleted. Change-Id: I4023ec7ef5d10c94ef54b9726f18cda863ee2ff4 Reviewed-on: https://go-review.googlesource.com/c/go/+/764020 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
7 dayscmd/compile: support all constant return types in switch lookup tablesqmuntal
Lookup tables for switch statements can be generalized to also support bools, strings, floats, and complex numbers. Change-Id: Ic3ece41fe2009050fbf08ba6f06ea8a567407974 Reviewed-on: https://go-review.googlesource.com/c/go/+/763320 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
12 dayscmd/compile: optimize switch statements using lookup tablesqmuntal
Switch statement containing integer constant cases and case bodies just returning a constant should be optimizable to a simpler and faster table lookup instead of a jump table. That is, a switch like this: switch x { case 0: return 10 case 1: return 20 case 2: return 30 case 3: return 40 default: return -1 } Could be optimized to this: var table = [4]int{10, 20, 30, 40} if uint(x) < 4 { return table[x] } return -1 The resulting code is smaller and faster, especially on platforms where jump tables are not supported. goos: windows goarch: arm64 pkg: cmd/compile/internal/test │ .\old.txt │ .\new.txt │ │ sec/op │ sec/op vs base │ SwitchLookup8Predictable-12 2.708n ± 6% 2.249n ± 5% -16.97% (p=0.000 n=10) SwitchLookup8Unpredictable-12 8.758n ± 7% 3.272n ± 4% -62.65% (p=0.000 n=10) SwitchLookup32Predictable-12 2.672n ± 5% 2.373n ± 6% -11.21% (p=0.000 n=10) SwitchLookup32Unpredictable-12 9.372n ± 7% 3.385n ± 6% -63.89% (p=0.000 n=10) geomean 4.937n 2.772n -43.84% Fixes #78203 Change-Id: I74fa3d77ef618412951b2e5c3cb6ebc760ce4ff1 Reviewed-on: https://go-review.googlesource.com/c/go/+/756340 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-31cmd/compile: ensure map/slice clearing expressions are walkedCuong Manh Le
The order pass ensures that initialization operations for clear(expr) are scheduled. However, if 'expr' is a conversion that the walk pass subsequently optimizes away or transforms, the resulting nodes can be left in an un-walked state. These un-walked nodes reach the SSA backend, which does not expect high-level IR, resulting in an ICE. This change ensures the expression is always walked during the transformation of the 'clear' builtin. Fixes #78410 Change-Id: I1997a28af020f39b2d325a58429eff9495048b1f Reviewed-on: https://go-review.googlesource.com/c/go/+/760981 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2026-03-26cmd/compile: fix missing walk for OAS2RECV nodeCuong Manh Le
When channel receive operator is used in the context that requires conversion to destination type, there's an implicit conversion operator inserted by typecheck. This typecheck-ed node is un-walked, then passing to the backend as-is, causing the ICE. Fixes #78313 Change-Id: Ibbc70cbd2d8069cc7cf81934406aa68c4da2686a Reviewed-on: https://go-review.googlesource.com/c/go/+/758660 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Jakub Ciolek <jakub@ciolek.dev> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-24cmd/compile: handle string literals in FIPS mode consistentlyCherry Mui
There are different code paths for compiling a composite literal, e.g. small vs. large, fully static vs. partially static. Following CL 755600, we need to apply the condition for string literals in FIPS mode consistently in all places. Enhance the test to check that not only does the code compile, the same code inside and outside of FIPS mode produce the same result. If the condition is not consistent in the compiler, it may compile the code, but not all the fields are actually assigned. Fixes #78173. Change-Id: Icaf673bd4798d4312d86c39b147d7fd33b9dae2c Reviewed-on: https://go-review.googlesource.com/c/go/+/756260 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2026-03-20Revert "runtime, cmd/compile: use preemptible memclr for large pointer-free ↵Michael Pratt
clears" This reverts CL 750480. Reason: Adding preemptible memclrNoHeapPointers exposes existing unsafe use of notInHeapSlice, causing crashes. Revert the memclr stack until the underlying issue is fixed. We keep the test added in CL 755942, which is useful regardless. For #78254. Change-Id: I8be3f9a20292b7f294e98e74e5a86c6a204406ae Reviewed-on: https://go-review.googlesource.com/c/go/+/757343 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2026-03-20Revert "cmd/compile: don't call memclrNoHeapPointersPreemptible from nosplit ↵Michael Pratt
functions" This reverts CL 756123. Reason: Adding preemptible memclrNoHeapPointers exposes existing unsafe use of notInHeapSlice, causing crashes. Revert the memclr stack until the underlying issue is fixed. For #78254. Change-Id: Ic5e6eee8e87f219e06bec8610fcc85cd52d634b1 Reviewed-on: https://go-review.googlesource.com/c/go/+/757341 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-03-19cmd/compile: enable promoted field selectors as keys in struct literalsRobert Griesemer
Switch the generated UIR version from V2 to V3. Adjust cmd/compile/internal/types to accept promoted field selectors in composite literals. Fixes #9859. Change-Id: Ie314e28567cfa6cf4c9e962a07b32dd05b06bf5e Reviewed-on: https://go-review.googlesource.com/c/go/+/755740 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Robert Griesemer <gri@google.com> Reviewed-by: Mark Freeman <markfreeman@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2026-03-17cmd/compile: don't call memclrNoHeapPointersPreemptible from nosplit functionsKeith Randall
That function will allow preemption, which could subsequently allow for a stack copy (from shrinking). We can't let that happen inside a nosplit function. Update #78081 Change-Id: I12e77b50bbdcdd1e08e505a863b13cd9e1f814ee Reviewed-on: https://go-review.googlesource.com/c/go/+/756123 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Keith Randall <khr@golang.org>
2026-03-17cmd/compile: don't treat string literal as static data in FIPS modeCherry Mui
Reference a string literal requires a relocation, which is not allowed in static data in FIPS mode, as this would be an absolute relocation, and cannot be properly hashed at both link time and run time. Also, make sure the symbol's FIPS type is set before writing. This ensures relocations are checked in FIPS RODATA symbols. Currently we only call setFIPSType in prewrite if we change the type from a BSS type to a DATA type. But it is possible that the compiler sets the symbol type to RODATA and start writing to it. For #78173. Change-Id: I120a3b28ee3f38e9024479344565f54dff87d430 Reviewed-on: https://go-review.googlesource.com/c/go/+/755600 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Russ Cox <rsc@golang.org>
2026-03-10cmd/compile: fix ICE when checking slice capCuong Manh Le
CL 226737 optimized len check when make slice in common case when len is within range of cap. However, the generated code does not walk the AST for the if condition, causing un-walked nodes passed to the backend. Fixes #78028 Change-Id: I492fb230c10e585dc09391728ef4df2c0058ce12 Reviewed-on: https://go-review.googlesource.com/c/go/+/753100 Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Jakub Ciolek <jakub@ciolek.dev> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com>
2026-03-04runtime, cmd/compile: use preemptible memclr for large pointer-free clears“Muhammad
Large memory clearing operations (via clear() or large slice allocation) currently use non-preemptible assembly loops. This blocks the Garbage Collector from performing a Stop The World (STW) event, leading to significant tail latency or even indefinite hangs in tight loops. This change introduces memclrNoHeapPointersPreemptible, which chunks clears into 256KB blocks with preemption checks. The compiler's walk phase is updated to emit this call for large pointer-free clears. To prevent regressions, SSA rewrite rules are added to ensure that constant-size clears (which are common and small) continue to be inlined into OpZero assembly. Benchmarks on darwin/arm64: - STW with 50MB clear: Improved from 'Hung' to ~500µs max pause. - Small clears (5-64B): No measurable regression. - Large clears (1M-64M): No measurable regression. Fixes #69327 Change-Id: Ide14d6bcdca1f60d6ac95443acb57da9a8822538 Reviewed-on: https://go-review.googlesource.com/c/go/+/750480 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Robert Griesemer <gri@google.com>
2026-02-06runtime: add explicit lower bounds check to decoderuneJonah Uellenberg
decoderune is only called by generated code, so we can guarantee that it's non-negative. This allows eliminating the automatic bounds check in it. To make this work, we need to expand the existing optimization to uints. This generally enables BCE for cases like this: ```go func test(list []int, idx uint64) int { if idx >= uint64(len(list)) { return 0 } list1 := list[idx:] return list1[0] } ``` Change-Id: I86a51b26ca0e63522dec99f7d6efe6bdcd2d6487 GitHub-Last-Rev: 82d44e0a080b53ee02c31ee1f92a8a0acd8d2621 GitHub-Pull-Request: golang/go#76610 Reviewed-on: https://go-review.googlesource.com/c/go/+/725101 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2026-02-04cmd/compile: don't double-walk the map argument of clearKeith Randall
mkcallstmt1 already walks the map argument of clear. mapClear then walks it again, which can cause problems if it is some syntax that is non-idempotent under walk. That is the case for the new way map lookups are being lowered in CL 736020. Fixes #77435 Change-Id: Ib2f6d7f2270308c2462aa276ed4413aaf7799fe3 Reviewed-on: https://go-review.googlesource.com/c/go/+/742120 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2026-02-02internal/maps,cmd/compile/internal/walk: replace calls to mapaccess1* with ↵ArsenySamoylov
mapaccess2* mapaccess1* and mapaccess2* functions share the same implementation and differ only in whether the boolean "found" is returned. This change replaces mapaccess1* calls with mapaccess2*. We can do this transparently, since the call site can safely discard the second (boolean) result. Ideally, mapacces1* functions could be removed entirely, but this change keeps them as thin wrappers for compatibility. Fixes #73196 Change-Id: I07c3423d22ed1095ac3666d00e134c2747b2f9c1 Reviewed-on: https://go-review.googlesource.com/c/go/+/736020 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2026-01-28cmd/compile: simplify slice/array range loops on loong64Guoqi Chen
loong64 supports R+R addressing ({st,ld}x.{b,h,w,d} instructions) and has implemented the relevant lowering rules (only width is 1). Removes 1616 instructions from the go binary on loong64. file before after Δ % asm 575366 575314 -52 -0.0090% cgo 489972 489884 -88 -0.0180% compile 2920418 2920110 -308 -0.0105% cover 540458 540290 -168 -0.0311% fix 865840 865668 -172 -0.0199% link 732858 732662 -196 -0.0267% preprofile 246022 245978 -44 -0.0179% vet 839268 839124 -144 -0.0172% go 1666470 1666114 -356 -0.0214% gofmt 326526 326438 -88 -0.0270% total 9203198 9201582 -1616 -0.0176% Change-Id: If3518547c785764877a6cf987781d43d8b572990 Reviewed-on: https://go-review.googlesource.com/c/go/+/738240 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2026-01-23cmd/compile: simplify AlgType usageKeith Randall
Only walk needs to distinguish different sizes of AMEM. Move the size-distinguishing AlgType there. Change-Id: I0a725b5bd13795a623b3668325f1068579abd340 Reviewed-on: https://go-review.googlesource.com/c/go/+/727461 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2026-01-23cmd/compile: clean up eq and hash implementationsKeith Randall
Use unsafe.Pointer instead of *any as the argument type. Now that we're using signatures, we don't have exact types so we might as well use unsafe.Pointer everywhere. Simplify hash function choice a bit. Change-Id: If1a07091031c4b966fde3a1d66295a04fd5a838c Reviewed-on: https://go-review.googlesource.com/c/go/+/727501 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2026-01-23cmd/compile: redo how equality functions are generatedkhr@golang.org
Instead of generating an equality function for each type that needs it, generate one per "signature". A "signature" is a summary of the comparisons needed to check a type for equality. For instance, the type type S struct { i int32 j uint32 s string e error } Will have the signature "M8SI". M8 = 8 bytes of regular memory S = string I = nonempty interface This way, potentially many types that have the same signature can share the same equality function. The number of generated equality functions in the go binary is reduced from 634 to 286. The go binary is ~1% smaller. The generation of equality functions gets simpler (particularly, how we do inlining of sub-types, unrolling, etc.) and the generated code is probably a bit more efficient. The new function names are kind of weird, but will seldom show up for users. They will appear in cpu profiles, and in tracebacks in the situation where comparisons panic because an interface somewhere in the type being compared contains an uncomparable type (e.g. a slice). Note that this CL only affects generated comparison functions. It does not generally affect generated code for == (except when that code decides to call a comparison function as a subtask). Maybe a TODO for the future. Update #6853 Change-Id: I202bd6424cb6bf7c745a62c9603d4f01dc1a1fc8 Reviewed-on: https://go-review.googlesource.com/c/go/+/725380 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com>
2026-01-22cmd/compile: fix mis-compilation for static array initializationCuong Manh Le
The bug was first introduced when the compiler is still written in C, with CL 2254041. The static array was laid out with the wrong context, causing a stack pointer will be stored in global object. Fixes #61730 Fixes #77193 Change-Id: I22c8393314d251beb53db537043a63714c84f36a Reviewed-on: https://go-review.googlesource.com/c/go/+/737821 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org>
2025-11-24runtime: add GODEBUG=tracebacklabels=1 to include pprof labels in tracebacksDavid Finkel
Copy LabelSet to an internal package as label.Set, and include (escaped) labels within goroutine stack dumps. Labels are added to the goroutine header as quoted key:value pairs, so the line may get long if there are a lot of labels. To handle escaping, we add a printescaped function to the runtime and hook it up to the print function in the compiler with a new runtime.quoted type that's a sibling to runtime.hex. (in fact, we leverage some of the machinery from printhex to generate escape sequences). The escaping can be improved for printable runes outside basic ASCII (particularly for languages using non-latin stripts). Additionally, invalid UTF-8 can be improved. So we can experiment with the output format make this opt-in via a a new tracebacklabels GODEBUG var. Updates #23458 Updates #76349 Change-Id: I08e78a40c55839a809236fff593ef2090c13c036 Reviewed-on: https://go-review.googlesource.com/c/go/+/694119 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Alan Donovan <adonovan@google.com>
2025-11-20cmd/compile: stack allocate backing stores during appendkhr@golang.org
We can already stack allocate the backing store during append if the resulting backing store doesn't escape. See CL 664299. This CL enables us to often stack allocate the backing store during append *even if* the result escapes. Typically, for code like: func f(n int) []int { var r []int for i := range n { r = append(r, i) } return r } the backing store for r escapes, but only by returning it. Could we operate with r on the stack for most of its lifeime, and only move it to the heap at the return point? The current implementation of append will need to do an allocation each time it calls growslice. This will happen on the 1st, 2nd, 4th, 8th, etc. append calls. The allocations done by all but the last growslice call will then immediately be garbage. We'd like to avoid doing some of those intermediate allocations if possible. We rewrite the above code by introducing a move2heap operation: func f(n int) []int { var r []int for i := range n { r = append(r, i) } r = move2heap(r) return r } Using the move2heap runtime function, which does: move2heap(r): If r is already backed by heap storage, return r. Otherwise, copy r to the heap and return the copy. Now we can treat the backing store of r allocated at the append site as not escaping. Previous stack allocation optimizations now apply, which can use a fixed-size stack-allocated backing store for r when appending. See the description in cmd/compile/internal/slice/slice.go for how we ensure that this optimization is safe. Change-Id: I81f36e58bade2241d07f67967d8d547fff5302b8 Reviewed-on: https://go-review.googlesource.com/c/go/+/707755 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-10-30cmd/compile: implement bits.Mul64 on 32-bit systemsRuss Cox
This CL implements Mul64uhilo, Hmul64, Hmul64u, and Avg64u on 32-bit systems, with the effect that constant division of both int64s and uint64s can now be emitted directly in all cases, and also that bits.Mul64 can be intrinsified on 32-bit systems. Previously, constant division of uint64s by values 0 ≤ c ≤ 0xFFFF were implemented as uint32 divisions by c and some fixup. After expanding those smaller constant divisions, the code for i/999 required: (386) 7 mul, 10 add, 2 sub, 3 rotate, 3 shift (104 bytes) (arm) 7 mul, 9 add, 3 sub, 2 shift (104 bytes) (mips) 7 mul, 10 add, 5 sub, 6 shift, 3 sgtu (176 bytes) For that much code, we might as well use a full 64x64->128 multiply that can be used for all divisors, not just small ones. Having done that, the same i/999 now generates: (386) 4 mul, 9 add, 2 sub, 2 or, 6 shift (112 bytes) (arm) 4 mul, 8 add, 2 sub, 2 or, 3 shift (92 bytes) (mips) 4 mul, 11 add, 3 sub, 6 shift, 8 sgtu, 4 or (196 bytes) The size increase on 386 is due to a few extra register spills. The size increase on mips is due to add-with-carry being hard. The new approach is more general, letting us delete the old special case and guarantee that all int64 and uint64 divisions by constants are generated directly on 32-bit systems. This especially speeds up code making heavy use of bits.Mul64 with a constant argument, which happens in strconv and various crypto packages. A few examples are benchmarked below. pkg: cmd/compile/internal/test benchmark \ host local linux-amd64 s7 linux-386 s7:GOARCH=386 vs base vs base vs base vs base vs base DivconstI64 ~ ~ ~ -49.66% -21.02% ModconstI64 ~ ~ ~ -13.45% +14.52% DivisiblePow2constI64 ~ ~ ~ +0.97% -1.32% DivisibleconstI64 ~ ~ ~ -20.01% -48.28% DivisibleWDivconstI64 ~ ~ -1.76% -38.59% -42.74% DivconstU64/3 ~ ~ ~ -13.82% -4.09% DivconstU64/5 ~ ~ ~ -14.10% -3.54% DivconstU64/37 -2.07% -4.45% ~ -19.60% -9.55% DivconstU64/1234567 ~ ~ ~ -61.55% -56.93% ModconstU64 ~ ~ ~ -6.25% ~ DivisibleconstU64 ~ ~ ~ -2.78% -7.82% DivisibleWDivconstU64 ~ ~ ~ +4.23% +2.56% pkg: math/bits benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386 vs base vs base vs base vs base Add ~ ~ ~ ~ Add32 +1.59% ~ ~ ~ Add64 ~ ~ ~ ~ Add64multiple ~ ~ ~ ~ Sub ~ ~ ~ ~ Sub32 ~ ~ ~ ~ Sub64 ~ ~ -9.20% ~ Sub64multiple ~ ~ ~ ~ Mul ~ ~ ~ ~ Mul32 ~ ~ ~ ~ Mul64 ~ ~ -41.58% -53.21% Div ~ ~ ~ ~ Div32 ~ ~ ~ ~ Div64 ~ ~ ~ ~ pkg: strconv benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386 vs base vs base vs base vs base ParseInt/Pos/7bit ~ ~ -11.08% -6.75% ParseInt/Pos/26bit ~ ~ -13.65% -11.02% ParseInt/Pos/31bit ~ ~ -14.65% -9.71% ParseInt/Pos/56bit -1.80% ~ -17.97% -10.78% ParseInt/Pos/63bit ~ ~ -13.85% -9.63% ParseInt/Neg/7bit ~ ~ -12.14% -7.26% ParseInt/Neg/26bit ~ ~ -14.18% -9.81% ParseInt/Neg/31bit ~ ~ -14.51% -9.02% ParseInt/Neg/56bit ~ ~ -15.79% -9.79% ParseInt/Neg/63bit ~ ~ -15.68% -11.07% AppendFloat/Decimal ~ ~ -7.25% -12.26% AppendFloat/Float ~ ~ -15.96% -19.45% AppendFloat/Exp ~ ~ -13.96% -17.76% AppendFloat/NegExp ~ ~ -14.89% -20.27% AppendFloat/LongExp ~ ~ -12.68% -17.97% AppendFloat/Big ~ ~ -11.10% -16.64% AppendFloat/BinaryExp ~ ~ ~ ~ AppendFloat/32Integer ~ ~ -10.05% -10.91% AppendFloat/32ExactFraction ~ ~ -8.93% -13.00% AppendFloat/32Point ~ ~ -10.36% -14.89% AppendFloat/32Exp ~ ~ -9.88% -13.54% AppendFloat/32NegExp ~ ~ -10.16% -14.26% AppendFloat/32Shortest ~ ~ -11.39% -14.96% AppendFloat/32Fixed8Hard ~ ~ ~ -2.31% AppendFloat/32Fixed9Hard ~ ~ ~ -7.01% AppendFloat/64Fixed1 ~ ~ -2.83% -8.23% AppendFloat/64Fixed2 ~ ~ ~ -7.94% AppendFloat/64Fixed3 ~ ~ -4.07% -7.22% AppendFloat/64Fixed4 ~ ~ -7.24% -7.62% AppendFloat/64Fixed12 ~ ~ -6.57% -4.82% AppendFloat/64Fixed16 ~ ~ -4.00% -5.81% AppendFloat/64Fixed12Hard -2.22% ~ -4.07% -6.35% AppendFloat/64Fixed17Hard -2.12% ~ ~ -3.79% AppendFloat/64Fixed18Hard -1.89% ~ +2.48% ~ AppendFloat/Slowpath64 -1.85% ~ -14.49% -18.21% AppendFloat/SlowpathDenormal64 ~ ~ -13.08% -19.41% pkg: crypto/internal/fips140/nistec/fiat benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386 vs base vs base vs base vs base Mul/P224 ~ ~ -29.95% -39.60% Mul/P384 ~ ~ -37.11% -63.33% Mul/P521 ~ ~ -26.62% -12.42% Square/P224 +1.46% ~ -40.62% -49.18% Square/P384 ~ ~ -45.51% -69.68% Square/P521 +90.37% ~ -25.26% -11.23% (The +90% is a separate problem and not real; that much variation can be seen on that system by running the same binary from two different files.) pkg: crypto/internal/fips140/edwards25519 benchmark \ host s7 linux-amd64 linux-386 s7:GOARCH=386 vs base vs base vs base vs base EncodingDecoding ~ ~ -34.67% -35.75% ScalarBaseMult ~ ~ -31.25% -30.29% ScalarMult ~ ~ -33.45% -32.54% VarTimeDoubleScalarBaseMult ~ ~ -33.78% -33.68% Change-Id: Id3c91d42cd01def6731b755e99f8f40c6ad1bb65 Reviewed-on: https://go-review.googlesource.com/c/go/+/716061 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Russ Cox <rsc@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2025-10-29runtime: use internal/strconvRuss Cox
Runtime doing its own number formatting dates back to when runtime was the bottom-most Go package. Those days are long gone. Use internal/strconv to avoid duplicating code and also to get better floating-point formatting: % go1.24.6 run x.go +1.234568e+004 % go run x.go 12345.678 % With accurate floating point it becomes necessary to introduce separate printers for float32 vs float64 and for complex64 vs complex128. Otherwise float32(93.7) prints as 93.69999694824219. Change-Id: I25ae3f09519342dc3d1dcabf4711651423e00128 Reviewed-on: https://go-review.googlesource.com/c/go/+/716002 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-10-03cmd/compile: minor tweak for race detectorDavid Chase
This makes the front-end a little bit less temp-happy when instrumenting, which repairs the "is it a constant?" test in the simd intrinsic conversion which is otherwise broken by race detection. Also, this will perhaps be better code. Cherry-picked from the dev.simd branch. This CL is not necessarily SIMD specific. Apply early to reduce risk. Change-Id: I84b7a45b7bff62bb2c9f9662466b50858d288645 Reviewed-on: https://go-review.googlesource.com/c/go/+/685637 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-on: https://go-review.googlesource.com/c/go/+/708856
2025-08-05cmd/compile, runtime: add checkptr instrumentation for unsafe.AddCuong Manh Le
Fixes #74431 Change-Id: Id651ea0b82599ccaff8816af0a56ddbb149b6f89 Reviewed-on: https://go-review.googlesource.com/c/go/+/692015 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: t hepudds <thepudds1460@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-07-30all: remove redundant Swiss prefixesMichael Pratt
Now that there is only one map implementation we can simplify names. For #54766. Change-Id: I6a6a636cc6a8fc5e7712c27782fc0ced7467b939 Reviewed-on: https://go-review.googlesource.com/c/go/+/691596 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-30all: remove GOEXPERIMENT=swissmapMichael Pratt
For #54766. Change-Id: I6a6a636c40b5fe2e8b0d4a5e23933492bc8bb76e Reviewed-on: https://go-review.googlesource.com/c/go/+/691595 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2025-07-25cmd/compile: remove unused arg from gorecoverKeith Randall
We don't need this argument anymore to match up a recover with its corresponding panic. Change-Id: I5d3646cdd766259ee9d3d995a2f215f02e17abc6 Reviewed-on: https://go-review.googlesource.com/c/go/+/685555 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-06-30cmd/compile/internal/escape: add debug hash for literal allocation optimizationsthepudds
Several CLs earlier in this stack added optimizations to reduce user allocations by recognizing and taking advantage of literals, including CL 649555, CL 649079, and CL 649035. This CL adds debug hashing of those changes, which enables use of the bisect tool, such as 'bisect -compile=literalalloc go test -run=Foo'. This also allows these optimizations to be manually disabled via '-gcflags=all=-d=literalallochash=n'. Updates #71359 Change-Id: I854f7742a6efa5b17d914932d61a32b2297f0c88 Reviewed-on: https://go-review.googlesource.com/c/go/+/675415 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-06-04Revert "cmd/compile: Enable inlining of tail calls"Cherry Mui
This reverts CL 650455 and CL 655816. Reason for revert: it causes #73747. Properly fixing it gets into trickiness with defer/recover, wrapper, and inlining. We're late in the Go 1.25 release cycle. Fixes #73747. Change-Id: Ifb343d522b18fec3fec73a7c886678032ac8e4df Reviewed-on: https://go-review.googlesource.com/c/go/+/678575 Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2025-05-27cmd/compile/internal/walk: use original type for composite literals in addrTempthepudds
When creating a new *ir.Name or *ir.LinksymOffsetExpr to represent a composite literal stored in the read-only data section, we should use the original type of the expression that was found via ir.ReassignOracle.StaticValue. (This is needed because the StaticValue method can traverse through OCONVNOP operations to find its final result.) Otherwise, the compilation may succeed, but the linker might erroneously conclude that a type is not used and prune an itab when it should not, leading to a call at execution-time to runtime.unreachableMethod, which throws "fatal error: unreachable method called. linker bug?". The tests exercise both the case of a zero value struct literal that can be represented by the read-only runtime.zeroVal, which was the case of the simplified example from #73888, and also modifies that example to test the non zero value struct literal case. This CL makes two similar changes for those two cases. We can get either of the tests we are adding to fail independently if we only make a single corresponding change. Fixes #73888 Updates #71359 Change-Id: Ifd91f445cc168ab895cc27f7964a6557d5cc32e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/676517 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-05-21cmd/compile, unique: model data flow of non-string pointersCherry Mui
Currently, hash/maphash.Comparable escapes its parameter if it contains non-string pointers, but does not escape strings or types that contain strings but no other pointers. This is achieved by a compiler intrinsic. unique.Make does something similar: it stores its parameter to a central map, with strings cloned. So from the escape analysis's perspective, the non-string pointers are passed through, whereas string pointers are not. We currently cannot model this type of type-dependent data flow directly in Go. So we do this with a compiler intrinsic. In fact, we can unify this and the intrinsic above. Tests are from Jake Bailey's CL 671955 (thanks!). Fixes #73680. Change-Id: Ia6a78e09dee39f8d9198a16758e4b5322ee2c56a Reviewed-on: https://go-review.googlesource.com/c/go/+/675156 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Jake Bailey <jacob.b.bailey@gmail.com>
2025-05-21cmd/compile/internal/walk: use global zeroVal in interface conversions for ↵thepudds
zero values This is a small-ish adjustment to the change earlier in our stack in CL 649555, which started creating read-only global storage for a composite literal used in an interface conversion and setting the interface data pointer to point to that global storage. In some cases, there are execution-time performance benefits to point to runtime.zeroVal in particular. In reflect, pointer checks against the runtime.zeroVal memory address are used to side-step some work, such as in reflect.Value.Set and reflect.Value.IsZero. In this CL, we therefore dig up the zeroVal symbol, and we use the machinery from earlier in our stack to use a pointer to zeroVal for the interface data pointer if we see examples like: sink = S{} or: s := S{} sink = s CL 649076 (also earlier in our stack) added most of the tests along with debug diagnostics in convert.go to make it easier to test this change. We add a benchmark in reflect to show examples of performance benefit. The left column is our immediately prior CL 649555, and the right is this CL. (The arrays of structs here do not seem to benefit, which we attempt to address in our next CL). goos: linux goarch: amd64 pkg: reflect cpu: Intel(R) Xeon(R) CPU @ 2.80GHz │ cl-649555 │ new │ │ sec/op │ sec/op vs base │ Zero/IsZero/ByteArray/size=16-4 4.176n ± 0% 4.171n ± 0% ~ (p=0.151 n=20) Zero/IsZero/ByteArray/size=64-4 6.921n ± 0% 3.864n ± 0% -44.16% (p=0.000 n=20) Zero/IsZero/ByteArray/size=1024-4 21.210n ± 0% 3.878n ± 0% -81.72% (p=0.000 n=20) Zero/IsZero/BigStruct/size=1024-4 25.505n ± 0% 5.061n ± 0% -80.15% (p=0.000 n=20) Zero/IsZero/SmallStruct/size=16-4 4.188n ± 0% 4.191n ± 0% ~ (p=0.106 n=20) Zero/IsZero/SmallStructArray/size=64-4 8.639n ± 0% 8.636n ± 0% ~ (p=0.973 n=20) Zero/IsZero/SmallStructArray/size=1024-4 79.99n ± 0% 80.06n ± 0% ~ (p=0.213 n=20) Zero/IsZero/Time/size=24-4 7.232n ± 0% 3.865n ± 0% -46.56% (p=0.000 n=20) Zero/SetZero/ByteArray/size=16-4 13.47n ± 0% 13.09n ± 0% -2.78% (p=0.000 n=20) Zero/SetZero/ByteArray/size=64-4 14.14n ± 0% 13.70n ± 0% -3.15% (p=0.000 n=20) Zero/SetZero/ByteArray/size=1024-4 24.22n ± 0% 20.18n ± 0% -16.68% (p=0.000 n=20) Zero/SetZero/BigStruct/size=1024-4 24.24n ± 0% 20.18n ± 0% -16.73% (p=0.000 n=20) Zero/SetZero/SmallStruct/size=16-4 13.45n ± 0% 13.10n ± 0% -2.60% (p=0.000 n=20) Zero/SetZero/SmallStructArray/size=64-4 14.12n ± 0% 13.69n ± 0% -3.05% (p=0.000 n=20) Zero/SetZero/SmallStructArray/size=1024-4 24.62n ± 0% 21.61n ± 0% -12.26% (p=0.000 n=20) Zero/SetZero/Time/size=24-4 13.59n ± 0% 13.40n ± 0% -1.40% (p=0.000 n=20) geomean 14.06n 10.19n -27.54% Finally, here are results from the benchmark example from #71323. Note however that almost all the benefit shown here is from our earlier CL 649555, which is a more general purpose change and eliminates the allocation using a different read-only global than this CL. │ go1.24 │ new │ │ sec/op │ sec/op vs base │ InterfaceAny 112.6000n ± 5% 0.8078n ± 3% -99.28% (p=0.000 n=20) ReflectValue 11.63n ± 2% 11.59n ± 0% ~ (p=0.330 n=20) │ go1.24.out │ new.out │ │ B/op │ B/op vs base │ InterfaceAny 224.0 ± 0% 0.0 ± 0% -100.00% (p=0.000 n=20) ReflectValue 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=20) ¹ │ go1.24.out │ new.out │ │ allocs/op │ allocs/op vs base │ InterfaceAny 1.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=20) ReflectValue 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=20) ¹ Updates #71359 Updates #71323 Change-Id: I64d8cf1a7900f011d2ec59b948388aeda1150676 Reviewed-on: https://go-review.googlesource.com/c/go/+/649078 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com>
2025-05-21cmd/compile/internal/walk: convert composite literals to interfaces without ↵thepudds
allocating Today, this interface conversion causes the struct literal to be heap allocated: var sink any func example1() { sink = S{1, 1} } For basic literals like integers that are directly used in an interface conversion that would otherwise allocate, the compiler is able to use read-only global storage (see #18704). This CL extends that to struct and array literals as well by creating read-only global storage that is able to represent for example S{1, 1}, and then using a pointer to that storage in the interface when the interface conversion happens. A more challenging example is: func example2() { v := S{1, 1} sink = v } In this case, the struct literal is not directly part of the interface conversion, but is instead assigned to a local variable. To still avoid heap allocation in cases like this, in walk we construct a cache that maps from expressions used in interface conversions to earlier expressions that can be used to represent the same value (via ir.ReassignOracle.StaticValue). This is somewhat analogous to how we avoided heap allocation for basic literals in CL 649077 earlier in our stack, though here we also need to do a little more work to create the read-only global. CL 649076 (also earlier in our stack) added most of the tests along with debug diagnostics in convert.go to make it easier to test this change. See the writeup in #71359 for details. Fixes #71359 Fixes #71323 Updates #62653 Updates #53465 Updates #8618 Change-Id: I8924f0c69ff738ea33439bd6af7b4066af493b90 Reviewed-on: https://go-review.googlesource.com/c/go/+/649555 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-05-20cmd/compile/internal/escape: additional constant and zero value tests and ↵thepudds
logging This adds additional logging for the work that walk does to reduce how often an interface conversion results in an allocation. Also, as part of #71359, we will be updating how escape analysis and walk handle basic literals, composite literals, and zero values, so add some tests that uses this new logging. By the end of our CL stack, we address all of these tests. Updates #71359 Change-Id: I43fde8343d9aacaec1e05360417908014a86c8bd Reviewed-on: https://go-review.googlesource.com/c/go/+/649076 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-05-19cmd/compile: align stack-allocated backing stores higher than requiredKeith Randall
Because that's what mallocgc did and some user code came to rely on it. Fixes #73199 Change-Id: I45ca00d2ea448e6729ef9ac4cec3c1eb0ceccc89 Reviewed-on: https://go-review.googlesource.com/c/go/+/666116 Reviewed-by: t hepudds <thepudds1460@gmail.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-04-24cmd/compile: add cast in range loop final value computationKeith Randall
When replacing a loop where the iteration variable has a named type, we need to compute the last iteration value as i = T(len(a)-1), not just i = len(a)-1. Fixes #73491 Change-Id: Ic1cc3bdf8571a40c10060f929a9db8a888de2b70 Reviewed-on: https://go-review.googlesource.com/c/go/+/667815 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@google.com> Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@google.com>
2025-04-21cmd/compile: ensure we evaluate side effects of len() argKeith Randall
For any len() which requires the evaluation of its arg (according to the spec). Update #72844 Change-Id: Id2b0bcc78073a6d5051abd000131dafdf65e7f26 Reviewed-on: https://go-review.googlesource.com/c/go/+/658097 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2025-04-21cmd/compile: don't evaluate side effects of range over arrayKeith Randall
If the thing we're ranging over is an array or ptr to array, and it doesn't have a function call or channel receive in it, then we shouldn't evaluate it. Typecheck the ranged-over value as a constant in that case. That makes the unified exporter replace the range expression with a constant int. Change-Id: I0d4ea081de70d20cf6d1fa8d25ef6cb021975554 Reviewed-on: https://go-review.googlesource.com/c/go/+/659317 Reviewed-by: Junyang Shao <shaojunyang@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com>
2025-04-15cmd/compile/internal/escape: add hash for bisecting stack allocation of ↵thepudds
variable-sized makeslice CL 653856 enabled stack allocation of variable-sized makeslice results. This CL adds debug hashing of that change, plus a debug flag to control the byte threshold used. The debug hashing machinery means we also now have a way to disable just the CL 653856 optimization by doing -gcflags='all=-d=variablemakehash=n' or similar, though the stderr output will then typically have many lines of debug hash output. Using this CL plus the bisect command, I was able to retroactively find one of the lines of code responsible for #73199: $ bisect -compile=variablemake go test -skip TestListWireGuardDrivers [...] bisect: FOUND failing change set --- change set #1 (enabling changes causes failure) ./security_windows.go:1321:38 (variablemake) ./security_windows.go:1321:38 (variablemake) --- Previously, I had tracked down those lines by diffing '-gcflags=-m=1' output and brief code inspection, but seeing the bisect was very nice. This CL also adds a compiler debug flag to control the threshold for stack allocation of variably sized make results. This can help us identify more code that is relying on certain stack allocations. This might be a temporary flag that we delete prior to Go 1.25 (given we would not want people to rely on it), or maybe it might make sense to keep it for some period of time beyond the release of Go 1.25 to help the ecosystem shake out other bugs. Using these two flags together (and picking a threshold of 64 rather than the default of 32), it looks for example like this x/sys/windows code might be relying on stack allocation of a byte slice: $ bisect -compile=variablemake go test -gcflags=-d=variablemakethreshold=64 -skip TestListWireGuardDrivers [...] bisect: FOUND failing change set --- change set #1 (enabling changes causes failure) ./syscall_windows_test.go:1178:16 (variablemake) Updates #73199 Fixes #73253 Change-Id: I160179a0e3c148c3ea86be5c9b6cea8a52c3e5b7 Reviewed-on: https://go-review.googlesource.com/c/go/+/663795 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-04-11cmd/compile: turn off variable-sized make() stack allocation with -NKeith Randall
Give people a way to turn this optimization off. (Currently the constant-sized make() stack allocation is not disabled with -N. Kinda inconsistent, but oh well, probably worse to change it now.) Update #73253 Change-Id: Idb9ffde444f34e70673147fd6a962368904a7a55 Reviewed-on: https://go-review.googlesource.com/c/go/+/664655 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: t hepudds <thepudds1460@gmail.com>
2025-04-04cmd/compile: allow pointer-containing elements in stack allocationsKeith Randall
For variable-sized allocations. Turns out that we already implement the correct escape semantics for this case. Even when the result of the "make" does not escape, everything assigned into it does. Change-Id: Ia123c538d39f2f1e1581c24e4135a65af3821c5e Reviewed-on: https://go-review.googlesource.com/c/go/+/657937 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Robert Griesemer <gri@google.com>
2025-04-04cmd/compile: stack allocate variable-sized makesliceKeith Randall
Instead of always allocating variable-sized "make" calls on the heap, allocate a small, constant-sized array on the stack and use that array as the backing store if it is big enough. Requires the result of the "make" doesn't escape. if cap <= K { var arr [K]E slice = arr[:len:cap] } else { slice = makeslice(E, len, cap) } Pretty conservatively for now, K = 32/sizeof(E). The slice header is already 24 bytes, so wasting 32 bytes of stack if the requested size is too big isn't that bad. Larger would waste more stack space but maybe avoid more allocations. This CL also requires the element type be pointer-free. Maybe we could relax that at some point, but it is hard. If the element type has pointers we can get heap->stack pointers (in the case where the requested size is too big and the slice is heap allocated). Note that this only handles the case of makeslice called directly from compiler-generated code. It does not handle slices built in the runtime on behalf of the program (e.g. in growslice). Some of those are currently handled by passing in a tmpBuf (e.g. concatstrings), but we could probably do more. Change-Id: I8378efad527cd00d25948a80b82a68d88fbd93a1 Reviewed-on: https://go-review.googlesource.com/c/go/+/653856 Reviewed-by: Robert Griesemer <gri@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2025-03-11cmd/compile: Enable inlining of tail callsAlexander Musman
Enable inlining tail calls and do not limit emitting tail calls only to the non-inlineable methods when generating wrappers. This change produces additional code size reduction. Code size difference measured with this change (tried for x86_64): etcd binary: .text section size: 10613393 -> 10593841 (0.18%) total binary size: 33450787 -> 33424307 (0.07%) compile binary: .text section size: 10171025 -> 10126545 (0.43%) total binary size: 28241012 -> 28192628 (0.17%) cockroach binary: .text section size: 83947260 -> 83694140 (0.3%) total binary size: 263799808 -> 263534160 (0.1%) Change-Id: I694f83cb838e64bd4c51f05b7b9f2bf0193bb551 Reviewed-on: https://go-review.googlesource.com/c/go/+/650455 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Keith Randall <khr@golang.org>
2025-02-25cmd/compile, runtime: optimize concatbytesCuong Manh Le
CL 527935 optimized []byte(string1 + string2) to use runtime.concatbytes to prevent concatenating of strings before converting to slices. However, the optimization is implemented without allowing temporary buffer for slice on stack, causing un-necessary allocations. To fix this, optimize concatbytes to use temporary buffer if the result string length fit to the buffer size. Fixes #71943 Change-Id: I1d3c374cd46aad8f83a271b8a5ca79094f9fd8db Reviewed-on: https://go-review.googlesource.com/c/go/+/652395 Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2025-02-13cmd/compile: move []byte->string map key optimization to ssaKeith Randall
If we call slicebytetostring immediately (with no intervening writes) before calling map access or delete functions with the resulting string as the key, then we can just use the ptr/len of the slicebytetostring argument as the key. This avoids an allocation. Fixes #44898 Update #71132 There's old code in cmd/compile/internal/walk/order.go that handles some of these cases. 1. m[string(b)] 2. s := string(b); m[s] 3. m[[2]string{string(b1),string(b2)}] The old code handled cases 1&3. The new code handles cases 1&2. We'll leave the old code around to keep 3 working, although it seems not terribly common. Case 2 happens particularly after inlining, so it is pretty common. Change-Id: I8913226ca79d2c65f4e2bd69a38ac8c976a57e43 Reviewed-on: https://go-review.googlesource.com/c/go/+/640656 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-02-10cmd/compile: avoid ifaceeq call if we know the interface is directKeith Randall
We can just use == if the interface is direct. Fixes #70738 Change-Id: Ia9a644791a370fec969c04c42d28a9b58f16911f Reviewed-on: https://go-review.googlesource.com/c/go/+/635435 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>