author    Keith Randall <khr@golang.org>  2025-04-24 11:10:05 -0700
committer Keith Randall <khr@golang.org>  2025-05-14 18:11:51 -0700
commit    b30fa1bcc411f3a65a6e8f40ff3acdb1526ce0d0 (patch)
tree      7ce412b2995042b07504bf0d67f23940a3a97ae9 /src/runtime
parent    c31a5c571f32f350a0a1b30f2b0e85576096e14c (diff)
download  go-b30fa1bcc411f3a65a6e8f40ff3acdb1526ce0d0.tar.xz
runtime: improve scan inner loop
On every arch except amd64, it is faster to do x&(x-1) than x^(1<<n).
Most archs need 3 instructions for the latter: MOV $1, R; SLL n, R;
ANDN R, x. Maybe 4 if there's no ANDN. Most archs need only 2
instructions to do x&(x-1). It takes 3 on x86/amd64 because NEG only
works in place. Only amd64 can do x^(1<<n) in a single instruction.
(We could on 386 also, but that's currently not implemented.)

Change-Id: I3b74b7a466ab972b20a25dbb21b572baf95c3467
Reviewed-on: https://go-review.googlesource.com/c/go/+/672956
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
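The two forms the commit message compares both clear the lowest set bit of a nonzero mask. A minimal standalone sketch (function names are illustrative, not from the runtime) showing they agree, assuming the mask is nonzero as the scan loop guarantees:

```go
package main

import (
	"fmt"
	"math/bits"
)

// clearViaXor mirrors the amd64 path: find the index of the lowest set
// bit, then flip it with x ^ (1 << i). On amd64 the flip is one BTCQ.
// Only valid for x != 0 (TrailingZeros64(0) is 64).
func clearViaXor(x uint64) uint64 {
	i := bits.TrailingZeros64(x)
	return x ^ uint64(1)<<(i&63)
}

// clearViaSub mirrors the other-arch path: x & (x-1) clears the lowest
// set bit directly, with no bit index needed (SUB then AND).
func clearViaSub(x uint64) uint64 {
	return x & (x - 1)
}

func main() {
	for _, x := range []uint64{0b1011000, 1, ^uint64(0)} {
		fmt.Printf("%#b -> %#b / %#b\n", x, clearViaXor(x), clearViaSub(x))
	}
}
```

Note that the subtraction form is why the non-amd64 branch in the diff below no longer needs `i` to update the mask, though `i` is still computed for the address calculation.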
Diffstat (limited to 'src/runtime')
-rw-r--r--  src/runtime/mbitmap.go  9
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/runtime/mbitmap.go b/src/runtime/mbitmap.go
index 7d528b94b4..f9a4c4ce3d 100644
--- a/src/runtime/mbitmap.go
+++ b/src/runtime/mbitmap.go
@@ -219,8 +219,13 @@ func (tp typePointers) nextFast() (typePointers, uintptr) {
} else {
i = sys.TrailingZeros32(uint32(tp.mask))
}
- // BTCQ
- tp.mask ^= uintptr(1) << (i & (ptrBits - 1))
+ if GOARCH == "amd64" {
+ // BTCQ
+ tp.mask ^= uintptr(1) << (i & (ptrBits - 1))
+ } else {
+ // SUB, AND
+ tp.mask &= tp.mask - 1
+ }
// LEAQ (XX)(XX*8)
return tp, tp.addr + uintptr(i)*goarch.PtrSize
}