| author | Rick Hudson <rlh@golang.org> | 2016-02-24 14:36:30 -0500 |
|---|---|---|
| committer | Rick Hudson <rlh@golang.org> | 2016-04-27 21:54:54 +0000 |
| commit | 4093481523b1e064e998d5d586276db45f4d11a7 (patch) | |
| tree | 29903ce5306ba6185b6e9f0e2bc4cead5c64d31c /src/runtime/malloc.go | |
| parent | 44fe90d0b393c961e3fb1b4c37e93ce268da46bc (diff) | |
| download | go-4093481523b1e064e998d5d586276db45f4d11a7.tar.xz | |
[dev.garbage] runtime: add bit and cache ctz64 (count trailing zero)
Add to each span a 64 bit cache (allocCache) of the allocBits
at freeindex. allocCache is shifted such that the lowest bit
corresponds to the bit at freeindex. allocBits uses a 0 to
indicate an object is free; allocCache, in contrast, uses a 1
to indicate an object is free. This inversion suits
ctz64 (count trailing zeros), which counts the number of 0s
trailing the least significant 1. That count is also the index
of the least significant 1.
Each span maintains a freeindex indicating the boundary
between allocated and unallocated objects. allocCache
is shifted as freeindex is incremented so that the low bit
in allocCache corresponds to the bit at freeindex in the
allocBits array.
Currently ctz64 is written in Go using a for loop, so it is
not very efficient. Use of the hardware instruction will
follow. With this in mind, comparisons of the garbage
benchmark are as follows:
1.6 release: 2.8 seconds
dev.garbage branch: 3.1 seconds
Profiling shows the Go implementation of ctz64 takes up
1% of the total time.
Change-Id: If084ed9c3b1eda9f3c6ab2e794625cb870b8167f
Reviewed-on: https://go-review.googlesource.com/20200
Reviewed-by: Austin Clements <austin@google.com>
Diffstat (limited to 'src/runtime/malloc.go')
| -rw-r--r-- | src/runtime/malloc.go | 15 |
1 file changed, 8 insertions(+), 7 deletions(-)
```diff
diff --git a/src/runtime/malloc.go b/src/runtime/malloc.go
index 6db323a8d3..574ce3dafc 100644
--- a/src/runtime/malloc.go
+++ b/src/runtime/malloc.go
@@ -505,29 +505,30 @@ const (
 func (c *mcache) nextFree(sizeclass int8) (v gclinkptr, shouldhelpgc bool) {
 	s := c.alloc[sizeclass]
 	shouldhelpgc = false
-	freeIndex := s.nextFreeIndex(s.freeindex)
-
+	freeIndex := s.nextFreeIndex()
 	if freeIndex == s.nelems {
 		// The span is full.
-		if uintptr(s.allocCount) != s.nelems {
-			throw("s.allocCount != s.nelems && freeIndex == s.nelems")
+		if uintptr(s.allocCount) > s.nelems {
+			println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
+			throw("s.allocCount > s.nelems && freeIndex == s.nelems")
 		}
 		systemstack(func() {
 			c.refill(int32(sizeclass))
 		})
 		shouldhelpgc = true
 		s = c.alloc[sizeclass]
-		freeIndex = s.nextFreeIndex(s.freeindex)
+
+		freeIndex = s.nextFreeIndex()
 	}
+
 	if freeIndex >= s.nelems {
 		throw("freeIndex is not valid")
 	}

 	v = gclinkptr(freeIndex*s.elemsize + s.base())
-	// Advance the freeIndex.
-	s.freeindex = freeIndex + 1
 	s.allocCount++
 	if uintptr(s.allocCount) > s.nelems {
+		println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems)
 		throw("s.allocCount > s.nelems")
 	}
 	return
```
