<feed xmlns='http://www.w3.org/2005/Atom'>
<title>go/test/codegen/stack.go, branch makepkg</title>
<subtitle>Fork of Go programming language with my patches.</subtitle>
<id>http://git.kilabit.info/go/atom?h=makepkg</id>
<link rel='self' href='http://git.kilabit.info/go/atom?h=makepkg'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/'/>
<updated>2025-10-29T20:55:00Z</updated>
<entry>
<title>test/codegen: simplify asmcheck pattern matching</title>
<updated>2025-10-29T20:55:00Z</updated>
<author>
<name>Russ Cox</name>
<email>rsc@golang.org</email>
</author>
<published>2025-10-27T02:51:14Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=915c1839fe76aef4bea6191282be1e48ef1c64e2'/>
<id>urn:sha1:915c1839fe76aef4bea6191282be1e48ef1c64e2</id>
<content type='text'>
Separate patterns in asmcheck by spaces instead of commas.
Many patterns end in comma (like "MOV [$]123,") so separating
patterns by comma is not great; they're already quoted, so spaces are fine.

Also replace all tabs in the assembly lines with spaces before matching.
Finally, replace \$ or \\$ with [$] as the matching idiom.
The effect of all these is to make the patterns look like:

  	   // amd64:"BSFQ" "ORQ [$]256"

instead of the old:

  	   // amd64:"BSFQ","ORQ\t\\$256"

Update all tests as well.

Change-Id: Ia39febe5d7f67ba115846422789e11b185d5c807
Reviewed-on: https://go-review.googlesource.com/c/go/+/716060
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Alan Donovan &lt;adonovan@google.com&gt;
Reviewed-by: Jorropo &lt;jorropo.pgm@gmail.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile: remove stores to unread parameters</title>
<updated>2025-10-03T19:31:20Z</updated>
<author>
<name>Cherry Mui</name>
<email>cherryyz@google.com</email>
</author>
<published>2025-09-22T14:57:29Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=38b26f29f1f97d24afc852b9f4eee829341ee682'/>
<id>urn:sha1:38b26f29f1f97d24afc852b9f4eee829341ee682</id>
<content type='text'>
Currently, we remove stores to local variables that are not read.
We don't do that for arguments. But arguments and locals are
essentially the same. Arguments are passed by value, and are not
expected to be read in the caller's frame. So we can remove the
writes to them as well. One exception is the cgo_unsafe_arg
directive, which makes all the arguments effectively address-taken.
cgo_unsafe_arg implies ABI0, so we just skip ABI0 functions'
arguments.

Cherry-picked from the dev.simd branch. This CL is not
necessarily SIMD specific. Apply early to reduce risk.

Change-Id: I8999fc50da6a87f22c1ec23e9a0c15483b6f7df8
Reviewed-on: https://go-review.googlesource.com/c/go/+/705815
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
Reviewed-by: Junyang Shao &lt;shaojunyang@google.com&gt;
Reviewed-on: https://go-review.googlesource.com/c/go/+/708865
</content>
</entry>
<entry>
<title>cmd/compile: improve store-to-load forwarding with compatible types</title>
<updated>2025-04-04T15:25:47Z</updated>
<author>
<name>Alexander Musman</name>
<email>alexander.musman@gmail.com</email>
</author>
<published>2025-04-01T15:43:38Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=16a6b71f18a5d05dde1a208a317a75fd652597f0'/>
<id>urn:sha1:16a6b71f18a5d05dde1a208a317a75fd652597f0</id>
<content type='text'>
Improve the compiler's store-to-load forwarding optimization by relaxing the
type comparison condition. Instead of requiring exact type equality (CMPeq),
we now use copyCompatibleType which allows forwarding between compatible
types where safe.

Fix several size comparison bugs in the nested store patterns. Previously,
we were comparing the size of the outer store with the load type,
rather than comparing with the size of the actual store being forwarded
from.

Skip OpConvert in dead store elimination to help get rid of dead stores such
as zeroing slices. OpConvert, like OpInlMark, doesn't really use the memory.

This optimization is particularly beneficial for code that creates slices with
computed pointers, such as the runtime's heapBitsSlice function, where
intermediate calculations were previously causing the compiler to miss
store-to-load forwarding opportunities.

Local sweet run result on an x86_64 laptop:

                       │  Orig.res   │              Hopt.res              │
                       │   sec/op    │   sec/op     vs base               │
BiogoIgor-8               5.303 ± 1%    5.322 ± 1%       ~ (p=0.190 n=10)
BiogoKrishna-8            7.894 ± 1%    7.828 ± 2%       ~ (p=0.190 n=10)
BleveIndexBatch100-8      2.257 ± 1%    2.248 ± 2%       ~ (p=0.529 n=10)
EtcdPut-8                30.12m ± 1%   30.03m ± 1%       ~ (p=0.796 n=10)
EtcdSTM-8                127.1m ± 1%   126.2m ± 0%  -0.74% (p=0.023 n=10)
GoBuildKubelet-8          52.21 ± 0%    52.05 ± 1%       ~ (p=0.063 n=10)
GoBuildKubeletLink-8      4.342 ± 1%    4.305 ± 0%  -0.85% (p=0.000 n=10)
GoBuildIstioctl-8         43.33 ± 0%    43.24 ± 0%  -0.22% (p=0.015 n=10)
GoBuildIstioctlLink-8     4.604 ± 1%    4.598 ± 0%       ~ (p=0.063 n=10)
GoBuildFrontend-8         15.33 ± 0%    15.29 ± 0%       ~ (p=0.143 n=10)
GoBuildFrontendLink-8    740.0m ± 1%   737.7m ± 1%       ~ (p=0.912 n=10)
GopherLuaKNucleotide-8    9.590 ± 1%    9.656 ± 1%       ~ (p=0.165 n=10)
MarkdownRenderXHTML-8    96.97m ± 1%   97.26m ± 2%       ~ (p=0.105 n=10)
Tile38QueryLoad-8        335.9µ ± 1%   335.6µ ± 1%       ~ (p=0.481 n=10)
geomean                   1.336         1.333       -0.22%

Change-Id: I031552623e6d5a3b1b5be8325e6314706e45534f
Reviewed-on: https://go-review.googlesource.com/c/go/+/662075
Reviewed-by: Carlos Amedee &lt;carlos@golang.org&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Auto-Submit: Carlos Amedee &lt;carlos@golang.org&gt;
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
</entry>
<entry>
<title>cmd/compile: soften type matching when allocating stack slots</title>
<updated>2024-02-29T21:29:41Z</updated>
<author>
<name>khr@golang.org</name>
<email>khr@golang.org</email>
</author>
<published>2024-02-20T18:32:26Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=e93041333150e6b74f931119036156938dcd0925'/>
<id>urn:sha1:e93041333150e6b74f931119036156938dcd0925</id>
<content type='text'>
Currently we use pointer equality on types when deciding whether we can
reuse a stack slot. That's too strict, as we don't guarantee pointer
equality for the same type. In particular, it can vary based on whether
PtrTo has been called in the frontend or not.

Instead, use the type's LinkString, which is guaranteed to both be
unique for a type, and to not vary given two different type structures
describing the same type.

Update #65783

Change-Id: I64f55138475f04bfa30cfb819b786b7cc06aebe4
Reviewed-on: https://go-review.googlesource.com/c/go/+/565436
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
Reviewed-by: Matthew Dempsky &lt;mdempsky@google.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Auto-Submit: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Cuong Manh Le &lt;cuong.manhle.vn@gmail.com&gt;
</content>
</entry>
<entry>
<title>test/codegen: combine trivial PPC64 tests into ppc64x</title>
<updated>2023-01-27T18:24:12Z</updated>
<author>
<name>Paul E. Murphy</name>
<email>murp@ibm.com</email>
</author>
<published>2023-01-25T17:53:10Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=0301c6c3512561b85b48d0e167f3e405484f496f'/>
<id>urn:sha1:0301c6c3512561b85b48d0e167f3e405484f496f</id>
<content type='text'>
Use a small python script to consolidate duplicate
ppc64/ppc64le tests into a single ppc64x codegen test.

This makes small assumption that anytime two tests with
for different arch/variant combos exists, those tests
can be combined into a single ppc64x test.

E.x:

  // ppc64le: foo
  // ppc64le/power9: foo
into
  // ppc64x: foo

or

  // ppc64: foo
  // ppc64le: foo
into
  // ppc64x: foo

import glob
import re
files = glob.glob("codegen/*.go")
for file in files:
    with open(file) as f:
        text = [l for l in f]
    i = 0
    while i &lt; len(text):
        first = re.match("\s*// ?ppc64(le)?(/power[89])?:(.*)", text[i])
        if first:
            j = i+1
            while j &lt; len(text):
                second = re.match("\s*// ?ppc64(le)?(/power[89])?:(.*)", text[j])
                if not second:
                    break
                if (not first.group(2) or first.group(2) == second.group(2)) and first.group(3) == second.group(3):
                    text[i] = re.sub(" ?ppc64(le|x)?"," ppc64x",text[i])
                    text=text[:j] + (text[j+1:])
                else:
                    j += 1
        i+=1
    with open(file, 'w') as f:
        f.write("".join(text))

Change-Id: Ic6b009b54eacaadc5a23db9c5a3bf7331b595821
Reviewed-on: https://go-review.googlesource.com/c/go/+/463220
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
Reviewed-by: Lynn Boger &lt;laboger@linux.vnet.ibm.com&gt;
Reviewed-by: Bryan Mills &lt;bcmills@google.com&gt;
Run-TryBot: Paul Murphy &lt;murp@ibm.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
</content>
</entry>
<entry>
<title>test: make codegen tests work with both ABIs</title>
<updated>2021-04-12T21:59:59Z</updated>
<author>
<name>Cherry Zhang</name>
<email>cherryyz@google.com</email>
</author>
<published>2021-04-12T18:00:49Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=263e13d1f7d2d13782c5a63799c9979b9bbfd853'/>
<id>urn:sha1:263e13d1f7d2d13782c5a63799c9979b9bbfd853</id>
<content type='text'>
Some codegen tests were written with the assumption that
arguments and results are in memory, and with a specific stack
layout. With the register ABI, the assumption is no longer true.
Adjust the tests to work with both cases.

- For tests expecting in memory arguments/results, change to use
  global variables or memory-assigned argument/results.

- Allow more registers. E.g. some tests expecting register names
  contain only letters (e.g. AX), but  it can also contain numbers
  (e.g. R10).

- Some instruction selection changes when operate on register vs.
  memory, e.g. ADDQ vs. LEAQ, MOVB vs. MOVL. Accept both.

TODO: mathbits.go and memops.go still need fix.
Change-Id: Ic5932b4b5dd3f5d30ed078d296476b641420c4c5
Reviewed-on: https://go-review.googlesource.com/c/go/+/309335
Trust: Cherry Zhang &lt;cherryyz@google.com&gt;
Run-TryBot: Cherry Zhang &lt;cherryyz@google.com&gt;
TryBot-Result: Go Bot &lt;gobot@golang.org&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
</content>
</entry>
<entry>
<title>misc, runtime, test:  extra tests and benchmarks for defer</title>
<updated>2019-09-25T23:27:16Z</updated>
<author>
<name>Dan Scales</name>
<email>danscales@google.com</email>
</author>
<published>2019-09-24T00:46:38Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=225f484c880a840046129f16102216ee29271e66'/>
<id>urn:sha1:225f484c880a840046129f16102216ee29271e66</id>
<content type='text'>
Add a bunch of extra tests and benchmarks for defer, in preparation for new
low-cost (open-coded) implementation of defers (see #34481),

 - New file defer_test.go that tests a bunch more unusual defer scenarios,
   including things that might have problems for open-coded defers.
 - Additions to callers_test.go actually verifying what the stack trace looks like
   for various panic or panic-recover scenarios.
 - Additions to crash_test.go testing several more crash scenarios involving
   recursive panics.
 - New benchmark in runtime_test.go measuring speed of panic-recover
 - New CGo benchmark in cgo_test.go calling from Go to C back to Go that
   shows defer overhead

Updates #34481

Change-Id: I423523f3e05fc0229d4277dd00073289a5526188
Reviewed-on: https://go-review.googlesource.com/c/go/+/197017
Run-TryBot: Dan Scales &lt;danscales@google.com&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Austin Clements &lt;austin@google.com&gt;
</content>
</entry>
<entry>
<title>Revert "Revert "cmd/compile,runtime: allocate defer records on the stack""</title>
<updated>2019-06-10T16:19:39Z</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2019-06-08T17:20:57Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=8f296f59de0703b0559474beb434a265e277bdca'/>
<id>urn:sha1:8f296f59de0703b0559474beb434a265e277bdca</id>
<content type='text'>
This reverts CL 180761

Reason for revert: Reinstate the stack-allocated defer CL.

There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues.

Issue #32477: Fix has been submitted as CL 181258.
Issue #32498: Possible fix is CL 181377 (not submitted yet).

Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/181378
Reviewed-by: Brad Fitzpatrick &lt;bradfitz@golang.org&gt;
</content>
</entry>
<entry>
<title>Revert "cmd/compile,runtime: allocate defer records on the stack"</title>
<updated>2019-06-05T19:50:09Z</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2019-06-05T18:42:31Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=49200e3f3e61f505acb152e150d054ef1db03b3e'/>
<id>urn:sha1:49200e3f3e61f505acb152e150d054ef1db03b3e</id>
<content type='text'>
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6.

Reason for revert: Seems to still have issues around GC.

Fixes #32452

Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/180761
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Josh Bleecher Snyder &lt;josharian@gmail.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile,runtime: allocate defer records on the stack</title>
<updated>2019-06-04T17:35:20Z</updated>
<author>
<name>Keith Randall</name>
<email>keithr@alum.mit.edu</email>
</author>
<published>2019-04-11T16:50:59Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=fff4f599fe1c21e411a99de5c9b3777d06ce0ce6'/>
<id>urn:sha1:fff4f599fe1c21e411a99de5c9b3777d06ce0ce6</id>
<content type='text'>
When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.

This should make defers like this (which are very common) faster.

This optimization applies to 363 out of the 370 static defer sites
in the cmd/go binary.

name     old time/op  new time/op  delta
Defer-4  52.2ns ± 5%  36.2ns ± 3%  -30.70%  (p=0.000 n=10+10)

Fixes #6980
Update #14939

Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004
Reviewed-on: https://go-review.googlesource.com/c/go/+/171758
Run-TryBot: Keith Randall &lt;khr@golang.org&gt;
TryBot-Result: Gobot Gobot &lt;gobot@golang.org&gt;
Reviewed-by: Austin Clements &lt;austin@google.com&gt;
</content>
</entry>
</feed>
