| field | value | date |
|---|---|---|
| author | David Chase <drchase@google.com> | 2025-09-02 12:10:26 -0400 |
| committer | David Chase <drchase@google.com> | 2025-09-29 08:41:41 -0700 |
| commit | 1c961c2fb281c0335bcfef86ff146f911f9583d4 | |
| tree | 1eb56b66c1416803adf9d0bfdcd1218ed8a14257 /src/runtime/testdata | |
| parent | fe4af1c067dbdb59f8faa5d6f619ec1cb60e70b2 | |
| download | go-1c961c2fb281c0335bcfef86ff146f911f9583d4.tar.xz | |
[dev.simd] simd: use new data movement instructions to do "fast" transposes
This is a combined test, example, and performance comparison. The
generated code still contains a lot of checking that we may be able
to optimize away.
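For readers unfamiliar with the idea being benchmarked, here is a minimal scalar sketch of a plain transpose next to a tiled one. It is not the code from this commit: the commit uses the simd package's data movement instructions, which are not shown here. The function names, the `int32` element type, the out-of-place layout, and the square row-major matrix are all assumptions made for illustration.

```go
package main

import "fmt"

// plainTranspose writes the transpose of the n×n row-major matrix src
// into dst. (Hypothetical helper, not taken from the commit.)
func plainTranspose(dst, src []int32, n int) {
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			dst[j*n+i] = src[i*n+j]
		}
	}
}

// tiledTranspose produces the same result but walks the matrix in
// tile×tile blocks, so each block's reads and writes stay close
// together in memory. (Hypothetical helper, not taken from the commit.)
func tiledTranspose(dst, src []int32, n, tile int) {
	for bi := 0; bi < n; bi += tile {
		iEnd := bi + tile
		if iEnd > n {
			iEnd = n // partial tile at the bottom edge
		}
		for bj := 0; bj < n; bj += tile {
			jEnd := bj + tile
			if jEnd > n {
				jEnd = n // partial tile at the right edge
			}
			for i := bi; i < iEnd; i++ {
				for j := bj; j < jEnd; j++ {
					dst[j*n+i] = src[i*n+j]
				}
			}
		}
	}
}

func main() {
	const n = 5
	src := make([]int32, n*n)
	for i := range src {
		src[i] = int32(i)
	}
	a := make([]int32, n*n)
	b := make([]int32, n*n)
	plainTranspose(a, src, n)
	tiledTranspose(b, src, n, 4)

	// Both variants should agree element for element.
	same := true
	for i := range a {
		if a[i] != b[i] {
			same = false
		}
	}
	fmt.Println("results match:", same)
}
```

A SIMD version would presumably replace the innermost two loops with vector loads, in-register shuffles, and vector stores for each full tile; that part is deliberately left out because the sketch does not rely on any simd API.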
$b/go test -bench=B -benchtime=5x .
goos: linux
goarch: amd64
pkg: simd/internal/simd_test
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
BenchmarkPlainTranspose-88 5 3143116414 ns/op
BenchmarkTiled4Transpose-88 5 1127457328 ns/op
BenchmarkTiled8Transpose-88 5 671788993 ns/op
Benchmark2BlockedTranspose-88 5 1665429657 ns/op
Benchmark3BlockedTranspose-88 5 1208767441 ns/op
Benchmark4BlockedTranspose-88 5 910212696 ns/op
Benchmark5aBlockedTranspose-88 5 939205670 ns/op
Benchmark5bBlockedTranspose-88 5 1018286871 ns/op
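The ns/op figures above are easier to compare as speedups relative to BenchmarkPlainTranspose. The following small sketch does that division; the `result` type and `speedup` helper are hypothetical names introduced here, and the numbers are copied from the output above.

```go
package main

import "fmt"

// result holds one line of benchmark output. (Hypothetical type.)
type result struct {
	name    string
	nsPerOp float64
}

// speedup reports how many times faster r is than the baseline.
func speedup(baseline, r result) float64 {
	return baseline.nsPerOp / r.nsPerOp
}

func main() {
	// Figures copied from the benchmark output in this commit message.
	results := []result{
		{"PlainTranspose", 3143116414},
		{"Tiled4Transpose", 1127457328},
		{"Tiled8Transpose", 671788993},
		{"2BlockedTranspose", 1665429657},
		{"3BlockedTranspose", 1208767441},
		{"4BlockedTranspose", 910212696},
		{"5aBlockedTranspose", 939205670},
		{"5bBlockedTranspose", 1018286871},
	}
	base := results[0]
	for _, r := range results {
		fmt.Printf("%-20s %.2fx\n", r.name, speedup(base, r))
	}
}
```

On these figures BenchmarkTiled8Transpose is the fastest variant, roughly 4.7 times faster than the plain loop. Each benchmark ran only 5 iterations, so small differences between the closer variants should be read with caution.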
Change-Id: I78bae0fd2ff4f511dac4291b898bbb79b0114741
Reviewed-on: https://go-review.googlesource.com/c/go/+/700695
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Diffstat (limited to 'src/runtime/testdata')
0 files changed, 0 insertions, 0 deletions
