<feed xmlns='http://www.w3.org/2005/Atom'>
<title>go/test/codegen/mathbits.go, branch makepkg</title>
<subtitle>Fork of Go programming language with my patches.</subtitle>
<id>http://git.kilabit.info/go/atom?h=makepkg</id>
<link rel='self' href='http://git.kilabit.info/go/atom?h=makepkg'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/'/>
<updated>2025-10-29T20:55:00Z</updated>
<entry>
<title>test/codegen: simplify asmcheck pattern matching</title>
<updated>2025-10-29T20:55:00Z</updated>
<author>
<name>Russ Cox</name>
<email>rsc@golang.org</email>
</author>
<published>2025-10-27T02:51:14Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=915c1839fe76aef4bea6191282be1e48ef1c64e2'/>
<id>urn:sha1:915c1839fe76aef4bea6191282be1e48ef1c64e2</id>
<content type='text'>
Separate patterns in asmcheck by spaces instead of commas.
Many patterns end in comma (like "MOV [$]123,") so separating
patterns by comma is not great; they're already quoted, so spaces are fine.

Also replace all tabs in the assembly lines with spaces before matching.
Finally, replace \$ or \\$ with [$] as the matching idiom.
The effect of all these is to make the patterns look like:

  	   // amd64:"BSFQ" "ORQ [$]256"

instead of the old:

  	   // amd64:"BSFQ","ORQ\t\\$256"

Update all tests as well.
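
As a rough illustration of why the new idiom works, the sketch below
replaces tabs with spaces and then applies a pattern with Go's regexp
package. The helper name matchAsm and the sample assembly line are
assumptions made for this example; this is not the actual asmcheck code.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchAsm is a hypothetical helper, not the real asmcheck code: it replaces
// tabs in one line of assembly with spaces and then applies a pattern.
func matchAsm(pattern, asmLine string) bool {
	normalized := strings.ReplaceAll(asmLine, "\t", " ")
	return regexp.MustCompile(pattern).MatchString(normalized)
}

func main() {
	line := "ORQ\t$256, AX" // made-up assembly line for illustration

	// [$] is a character class containing a literal dollar sign, so no
	// backslash escaping is needed inside the quoted pattern.
	fmt.Println(matchAsm(`ORQ [$]256`, line)) // true
	fmt.Println(matchAsm(`BSFQ`, line))       // false
}
```

Because the tab is normalized to a space first, the pattern can use a
plain space where the old form needed \t.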

Change-Id: Ia39febe5d7f67ba115846422789e11b185d5c807
Reviewed-on: https://go-review.googlesource.com/c/go/+/716060
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Alan Donovan &lt;adonovan@google.com&gt;
Reviewed-by: Jorropo &lt;jorropo.pgm@gmail.com&gt;
</content>
</entry>
<entry>
<title>test/codegen: make sure assignment results are used.</title>
<updated>2025-10-06T21:51:23Z</updated>
<author>
<name>Cherry Mui</name>
<email>cherryyz@google.com</email>
</author>
<published>2025-10-04T15:32:33Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=1d62e92567a858b18f4e7e0c24e071c039dd3edf'/>
<id>urn:sha1:1d62e92567a858b18f4e7e0c24e071c039dd3edf</id>
<content type='text'>
Some tests make assignments to an argument without reading it.
With CL 708865, they are treated as dead stores and are removed.
Make sure the results are used.
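
A minimal sketch of the problem, using made-up function names rather
than the real tests: the first function only assigns to its argument, so
the store can be eliminated, while the second returns the value and
keeps the computation alive.

```go
package main

import "fmt"

// deadStore assigns to its argument but never reads it afterwards, so a
// compiler is free to drop the addition as a dead store.
func deadStore(x uint64) {
	x = x + 1
}

// usedResult returns the assigned value, which keeps the addition alive.
func usedResult(x uint64) uint64 {
	x = x + 1
	return x
}

func main() {
	deadStore(41)
	fmt.Println(usedResult(41)) // 42
}
```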

Fixes #75745.
Fixes #75746.

Change-Id: I05580beb1006505ec1550e5fa245b54dcefd10b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/708916
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
</entry>
<entry>
<title>test/codegen: add Mul* test for loong64</title>
<updated>2025-08-25T01:14:22Z</updated>
<author>
<name>Xiaolin Zhao</name>
<email>zhaoxiaolin@loongson.cn</email>
</author>
<published>2025-08-21T02:36:34Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=0aa8019e943ce49afc2b655ea8d5f55cbaa72cef'/>
<id>urn:sha1:0aa8019e943ce49afc2b655ea8d5f55cbaa72cef</id>
<content type='text'>
Change-Id: Ica285212e4884a96fe9738b53cdc789b223bf2e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/697895
Reviewed-by: David Chase &lt;drchase@google.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Carlos Amedee &lt;carlos@golang.org&gt;
Reviewed-by: abner chenc &lt;chenguoqi@loongson.cn&gt;
</content>
</entry>
<entry>
<title>cmd/compile,internal/cpu,runtime: intrinsify math/bits.OnesCount on riscv64</title>
<updated>2025-05-01T12:57:41Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-03-19T12:57:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=4d10d4ad849467f12a1a16a5ade26cc03d8f1a1f'/>
<id>urn:sha1:4d10d4ad849467f12a1a16a5ade26cc03d8f1a1f</id>
<content type='text'>
For riscv64/rva22u64 and above, we can intrinsify math/bits.OnesCount
using the CPOP/CPOPW machine instructions. Since the native Go
implementation of OnesCount is relatively expensive, it is also
worth emitting a check for Zbb support when compiled for rva20u64.
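
The guarded fast path can be pictured roughly as follows. This is only
a sketch in ordinary Go: hasZbb is a hypothetical stand-in for the
runtime feature flag, bits.OnesCount64 stands in for the CPOP
instruction, and genericOnesCount is a deliberately simple fallback, not
the implementation used by math/bits.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hasZbb is a hypothetical stand-in for a runtime CPU feature flag.
var hasZbb = false

// genericOnesCount counts set bits one at a time (deliberately simple).
func genericOnesCount(x uint64) int {
	n := 0
	for x != 0 {
		n += int(x % 2)
		x /= 2
	}
	return n
}

// onesCount mirrors the shape of the guarded fast path: check the flag,
// then use either the fast version or the generic fallback.
func onesCount(x uint64) int {
	if hasZbb {
		return bits.OnesCount64(x) // stands in for a CPOP instruction
	}
	return genericOnesCount(x)
}

func main() {
	fmt.Println(onesCount(0xF0F0)) // 8
	hasZbb = true
	fmt.Println(onesCount(0xF0F0)) // 8
}
```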

On a Banana Pi F3, with GORISCV64=rva22u64:

              │     oc.1     │                oc.2                 │
              │    sec/op    │   sec/op     vs base                │
OnesCount-8     16.930n ± 0%   4.389n ± 0%  -74.08% (p=0.000 n=10)
OnesCount8-8     5.642n ± 0%   5.016n ± 0%  -11.10% (p=0.000 n=10)
OnesCount16-8    9.404n ± 0%   5.015n ± 0%  -46.67% (p=0.000 n=10)
OnesCount32-8   13.165n ± 0%   4.388n ± 0%  -66.67% (p=0.000 n=10)
OnesCount64-8   16.300n ± 0%   4.388n ± 0%  -73.08% (p=0.000 n=10)
geomean          11.40n        4.629n       -59.40%

On a Banana Pi F3, compiled with GORISCV64=rva20u64 and with Zbb
detection enabled:

              │     oc.3     │                oc.4                 │
              │    sec/op    │   sec/op     vs base                │
OnesCount-8     16.930n ± 0%   5.643n ± 0%  -66.67% (p=0.000 n=10)
OnesCount8-8     5.642n ± 0%   5.642n ± 0%        ~ (p=0.447 n=10)
OnesCount16-8   10.030n ± 0%   6.896n ± 0%  -31.25% (p=0.000 n=10)
OnesCount32-8   13.170n ± 0%   5.642n ± 0%  -57.16% (p=0.000 n=10)
OnesCount64-8   16.300n ± 0%   5.642n ± 0%  -65.39% (p=0.000 n=10)
geomean          11.55n        5.873n       -49.16%

On a Banana Pi F3, compiled with GORISCV64=rva20u64 but with Zbb
detection disabled:

              │    oc.3     │                oc.5                 │
              │   sec/op    │   sec/op     vs base                │
OnesCount-8     16.93n ± 0%   29.47n ± 0%  +74.07% (p=0.000 n=10)
OnesCount8-8    5.642n ± 0%   5.643n ± 0%        ~ (p=0.191 n=10)
OnesCount16-8   10.03n ± 0%   15.05n ± 0%  +50.05% (p=0.000 n=10)
OnesCount32-8   13.17n ± 0%   18.18n ± 0%  +38.04% (p=0.000 n=10)
OnesCount64-8   16.30n ± 0%   21.94n ± 0%  +34.60% (p=0.000 n=10)
geomean         11.55n        15.84n       +37.16%

For hardware without Zbb, this adds ~5ns overhead, while for hardware
with Zbb we achieve a performance gain of up to 11ns. It is worth

noting that OnesCount8 is cheap enough that it is preferable to stick
with the generic version in this case.

Change-Id: Id657e40e0dd1b1ab8cc0fe0f8a68df4c9f2d7da5
Reviewed-on: https://go-review.googlesource.com/c/go/+/660856
Reviewed-by: Carlos Amedee &lt;carlos@golang.org&gt;
Reviewed-by: Meng Zhuo &lt;mengzhuo1203@gmail.com&gt;
Reviewed-by: Mark Ryan &lt;markdryan@rivosinc.com&gt;
Reviewed-by: Dmitri Shuralyov &lt;dmitshur@google.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile: intrinsify math/bits.Bswap on riscv64</title>
<updated>2025-05-01T12:57:13Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-03-19T14:09:23Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=90e8b8cdaeb76a57604a461a138c59340daed7ef'/>
<id>urn:sha1:90e8b8cdaeb76a57604a461a138c59340daed7ef</id>
<content type='text'>
For riscv64/rva22u64 and above, we can intrinsify math/bits.Bswap
using the REV8 machine instruction.

On a StarFive VisionFive 2 with GORISCV64=rva22u64:

                 │     rb.1     │                rb.2                 │
                 │    sec/op    │   sec/op     vs base                │
ReverseBytes-4     18.790n ± 0%   4.026n ± 0%  -78.57% (p=0.000 n=10)
ReverseBytes16-4    6.710n ± 0%   5.368n ± 0%  -20.00% (p=0.000 n=10)
ReverseBytes32-4   13.420n ± 0%   5.368n ± 0%  -60.00% (p=0.000 n=10)
ReverseBytes64-4   17.450n ± 0%   4.026n ± 0%  -76.93% (p=0.000 n=10)
geomean             13.11n        4.649n       -64.54%

Change-Id: I26eee34270b1721f7304bb1cddb0fda129b20ece
Reviewed-on: https://go-review.googlesource.com/c/go/+/660855
Reviewed-by: Mark Ryan &lt;markdryan@rivosinc.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Meng Zhuo &lt;mengzhuo1203@gmail.com&gt;
Reviewed-by: Carlos Amedee &lt;carlos@golang.org&gt;
Reviewed-by: Junyang Shao &lt;shaojunyang@google.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile: constant fold 128-bit multiplies</title>
<updated>2025-04-22T17:24:18Z</updated>
<author>
<name>Keith Randall</name>
<email>khr@golang.org</email>
</author>
<published>2025-04-21T19:44:24Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=7d0cb2a2adec493b8ad9d79ef35354c8e20f0213'/>
<id>urn:sha1:7d0cb2a2adec493b8ad9d79ef35354c8e20f0213</id>
<content type='text'>
The full 64x64-&gt;128 multiply comes up when using bits.Mul64.
The 64x64-&gt;64+overflow multiply comes up in unsafe.Slice when using
a constant length.
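
As an illustration of the first case, the sketch below calls bits.Mul64
with two constant operands, which is the kind of expression that can be
folded at compile time. The function name mulConst and the operand
values are made up for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// mulConst multiplies two constants as a full 64x64 to 128-bit product.
// 2^63 * 4 = 2^65, so the high word is 2 and the low word is 0.
func mulConst() (hi, lo uint64) {
	return bits.Mul64(0x8000000000000000, 4)
}

func main() {
	hi, lo := mulConst()
	fmt.Println(hi, lo) // 2 0
}
```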

Change-Id: I298515162ca07d804b2d699d03bc957ca30a4ebc
Reviewed-on: https://go-review.googlesource.com/c/go/+/667175
Reviewed-by: Junyang Shao &lt;shaojunyang@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile: intrinsify math/bits.Len on riscv64</title>
<updated>2025-03-22T01:21:44Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-02-23T13:27:34Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=b70244ff7a043786c211775b68259de6104ff91c'/>
<id>urn:sha1:b70244ff7a043786c211775b68259de6104ff91c</id>
<content type='text'>
For riscv64/rva22u64 and above, we can intrinsify math/bits.Len using the
CLZ/CLZW machine instructions.

On a StarFive VisionFive 2 with GORISCV64=rva22u64:

                 │   clz.b.1   │               clz.b.2               │
                 │   sec/op    │   sec/op     vs base                │
LeadingZeros-4     28.89n ± 0%   12.08n ± 0%  -58.19% (p=0.000 n=10)
LeadingZeros8-4    18.79n ± 0%   14.76n ± 0%  -21.45% (p=0.000 n=10)
LeadingZeros16-4   25.27n ± 0%   14.76n ± 0%  -41.59% (p=0.000 n=10)
LeadingZeros32-4   25.12n ± 0%   12.08n ± 0%  -51.92% (p=0.000 n=10)
LeadingZeros64-4   25.89n ± 0%   12.08n ± 0%  -53.35% (p=0.000 n=10)
geomean            24.55n        13.09n       -46.70%

Change-Id: I0dda684713dbdf5336af393f5ccbdae861c4f694
Reviewed-on: https://go-review.googlesource.com/c/go/+/652321
Reviewed-by: David Chase &lt;drchase@google.com&gt;
Reviewed-by: Meng Zhuo &lt;mengzhuo1203@gmail.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Mark Ryan &lt;markdryan@rivosinc.com&gt;
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
</content>
</entry>
<entry>
<title>cmd/compile: intrinsify math/bits.TrailingZeros on riscv64</title>
<updated>2025-03-16T02:07:53Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-02-23T11:17:53Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=6fb7bdc96d0398fab313586fba6fdc89cc14c679'/>
<id>urn:sha1:6fb7bdc96d0398fab313586fba6fdc89cc14c679</id>
<content type='text'>
For riscv64/rva22u64 and above, we can intrinsify math/bits.TrailingZeros
using the CTZ/CTZW machine instructions.

On a StarFive VisionFive 2 with GORISCV64=rva22u64:

                  │   ctz.b.1    │               ctz.b.2               │
                  │    sec/op    │   sec/op     vs base                │
TrailingZeros-4     25.500n ± 0%   8.052n ± 0%  -68.42% (p=0.000 n=10)
TrailingZeros8-4     14.76n ± 0%   10.74n ± 0%  -27.24% (p=0.000 n=10)
TrailingZeros16-4    26.84n ± 0%   10.74n ± 0%  -59.99% (p=0.000 n=10)
TrailingZeros32-4   25.500n ± 0%   8.052n ± 0%  -68.42% (p=0.000 n=10)
TrailingZeros64-4   25.500n ± 0%   8.052n ± 0%  -68.42% (p=0.000 n=10)
geomean              23.09n        9.035n       -60.88%

Change-Id: I71edf2b988acb7a68e797afda4ee66d7a57d587e
Reviewed-on: https://go-review.googlesource.com/c/go/+/652320
Reviewed-by: Cherry Mui &lt;cherryyz@google.com&gt;
Reviewed-by: Mark Ryan &lt;markdryan@rivosinc.com&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Meng Zhuo &lt;mengzhuo1203@gmail.com&gt;
</content>
</entry>
<entry>
<title>test/codegen: tighten the TrailingZeros64 test on 386</title>
<updated>2025-03-14T22:04:38Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-02-26T10:03:15Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=c1c7e5902fda622d5d5870ed045407a9acd5666b'/>
<id>urn:sha1:c1c7e5902fda622d5d5870ed045407a9acd5666b</id>
<content type='text'>
Make the TrailingZeros64 code generation check more specific for 386.
Just checking for BSFL will match both the generic 64-bit decomposition
and the custom 386 lowering.

Change-Id: I62076f1889af0ef1f29704cba01ab419cae0c6e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/656996
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: David Chase &lt;drchase@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
Auto-Submit: Keith Randall &lt;khr@google.com&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
</content>
</entry>
<entry>
<title>cmd/compile: simplify intrinsification of TrailingZeros16 and TrailingZeros8</title>
<updated>2025-02-27T11:45:44Z</updated>
<author>
<name>Joel Sing</name>
<email>joel@sing.id.au</email>
</author>
<published>2025-02-22T12:26:21Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/go/commit/?id=927fdb7843ce96b42791912b42d0d3e6735e8dde'/>
<id>urn:sha1:927fdb7843ce96b42791912b42d0d3e6735e8dde</id>
<content type='text'>
Decompose Ctz16 and Ctz8 within the SSA rules for LOONG64, MIPS, PPC64
and S390X, rather than having a custom intrinsic. Note that for PPC64 this
actually allows the existing Ctz16 and Ctz8 rules to be used.

Change-Id: I27a5e978f852b9d75396d2a80f5d7dfcb5ef7dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/651816
Reviewed-by: Paul Murphy &lt;murp@ibm.com&gt;
TryBot-Result: Gopher Robot &lt;gobot@golang.org&gt;
LUCI-TryBot-Result: Go LUCI &lt;golang-scoped@luci-project-accounts.iam.gserviceaccount.com&gt;
Reviewed-by: Michael Pratt &lt;mpratt@google.com&gt;
Run-TryBot: Joel Sing &lt;joel@sing.id.au&gt;
Reviewed-by: Keith Randall &lt;khr@golang.org&gt;
Reviewed-by: Keith Randall &lt;khr@google.com&gt;
</content>
</entry>
</feed>
