| Age | Commit message (Collapse) | Author |
|
This makes it easier to identify which functions are used for memory
allocation by looking for functions that start with mallocgc. The Size
suffix is added so that the isSpecializedMalloc function in
cmd/compile/internal/ssa can distinguish between the generated functions
and the mallocgcTiny function called by mallocgc, similar to the SC
suffixes for the mallocgcSmallNoScanSC* and mallocgcSmallScanNoHeaderSC*
functons.
Change-Id: I6ad7f15617bf6f18ae5d1bfa2a0b94e86a6a6964
Reviewed-on: https://go-review.googlesource.com/c/go/+/735780
Reviewed-by: Michael Matloob <matloob@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-nosizespecializedmalloc,gotip-linux-amd64-nosizespecializedmalloc,gotip-linux-arm64-nosizespecializedmalloc
Change-Id: I6a6a696465004b939c989afc058c4c3e1fb7134f
Reviewed-on: https://go-review.googlesource.com/c/go/+/720401
Auto-Submit: Michael Matloob <matloob@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Matloob <matloob@google.com>
|
|
Separate patterns in asmcheck by spaces instead of commas.
Many patterns end in comma (like "MOV [$]123,") so separating
patterns by comma is not great; they're already quoted, so spaces are fine.
Also replace all tabs in the assembly lines with spaces before matching.
Finally, replace \$ or \\$ with [$] as the matching idiom.
The effect of all these is to make the patterns look like:
// amd64:"BSFQ" "ORQ [$]256"
instead of the old:
// amd64:"BSFQ","ORQ\t\\$256"
Update all tests as well.
Change-Id: Ia39febe5d7f67ba115846422789e11b185d5c807
Reviewed-on: https://go-review.googlesource.com/c/go/+/716060
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Jorropo <jorropo.pgm@gmail.com>
|
|
This change creates calls to size-specialized malloc functions instead
of calls to newObject when we know the size of the allocation at
compilation time. Most of it is a matter of calling the newObject
function (which will create calls to the size-specialized functions)
rather then the newObjectNonSpecialized function (which won't). In the
newHeapaddr, small, non-pointer case, we'll create a non specialized
newObject and transform that into the appropriate size-specialized
function when we produce the mallocgc in flushPendingHeapAllocations.
We have to update some of the rewrites in generic.rules to also apply to
the size-specialized functions when they apply to newObject.
The messiest thing is we have to adjust the offset we use to save the
memory profiler stack, because the depth of the call to profilealloc is
two frames fewer in the size-specialized malloc functions compared to
when newObject calls mallocgc. A bunch of tests have been adjusted to
account for that.
Change-Id: I6a6a6964c9037fb6719e392c4a498ed700b617d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/707856
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Matloob <matloob@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This CL slightly speeds up strings.HasPrefix when testing constant
prefixes of length {3,5,6,7}.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
│ old │ new │
│ sec/op │ sec/op vs base │
StringPrefix3-8 11.125n ± 2% 8.539n ± 1% -23.25% (p=0.000 n=20)
StringPrefix5-8 11.170n ± 2% 8.700n ± 1% -22.11% (p=0.000 n=20)
StringPrefix6-8 11.190n ± 2% 8.655n ± 1% -22.65% (p=0.000 n=20)
StringPrefix7-8 11.095n ± 1% 8.878n ± 1% -19.98% (p=0.000 n=20)
Change-Id: I510a80d59cf78680b57d68780d35d212d24030e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/700816
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Freeman <markfreeman@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
|
|
Ref: CL 141118
Update #26498
Change-Id: I9c4ad2bedc4d50bd273bbe9119a898d4fca95e45
Reviewed-on: https://go-review.googlesource.com/c/go/+/700875
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
This CL optimizes the compilation of string-to-bytes conversion in the
case of string additions.
Fixes #62407
Change-Id: Ic47df758478e5d061880620025c4ec7dbbff8a64
Reviewed-on: https://go-review.googlesource.com/c/go/+/527935
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Tim King <taking@google.com>
|
|
s==s is always true for strings. This comes up in NaN testing in
generic code, where we want x==x to compile completely away except for
float types.
Fixes #60777
Change-Id: I3ce054b5121354de2f9751b010fb409f148cb637
Reviewed-on: https://go-review.googlesource.com/c/go/+/503795
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
Change-Id: Ie04503e61400a793a6a29a4b58795254deabe472
Reviewed-on: https://go-review.googlesource.com/c/go/+/497276
Reviewed-by: Keith Randall <khr@golang.org>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This CL optimizes memory moving with LDP and STP on arm64.
Benchmarks:
name old time/op new time/op delta
ClearFat7-160 1.08ns ± 0% 0.95ns ± 0% -11.41% (p=0.008 n=5+5)
ClearFat8-160 0.84ns ± 0% 0.84ns ± 0% -0.95% (p=0.008 n=5+5)
ClearFat11-160 1.08ns ± 0% 0.95ns ± 0% -11.46% (p=0.008 n=5+5)
ClearFat12-160 0.95ns ± 0% 0.95ns ± 0% ~ (p=0.063 n=4+5)
ClearFat13-160 1.08ns ± 0% 0.95ns ± 0% -11.45% (p=0.008 n=5+5)
ClearFat14-160 1.08ns ± 0% 0.95ns ± 0% -11.47% (p=0.008 n=5+5)
ClearFat15-160 1.24ns ± 0% 0.95ns ± 0% -22.98% (p=0.029 n=4+4)
ClearFat16-160 0.84ns ± 0% 0.83ns ± 0% -0.11% (p=0.008 n=5+5)
ClearFat24-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal)
ClearFat32-160 2.86ns ± 0% 2.86ns ± 0% ~ (p=0.333 n=5+4)
ClearFat40-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal)
ClearFat48-160 3.32ns ± 1% 3.31ns ± 1% ~ (p=0.690 n=5+5)
ClearFat56-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal)
ClearFat64-160 3.25ns ± 1% 3.26ns ± 1% ~ (p=0.841 n=5+5)
ClearFat72-160 2.22ns ± 0% 2.22ns ± 0% ~ (p=0.444 n=5+5)
ClearFat128-160 4.03ns ± 0% 4.04ns ± 0% +0.32% (p=0.008 n=5+5)
ClearFat256-160 6.44ns ± 0% 6.44ns ± 0% +0.08% (p=0.016 n=4+5)
ClearFat512-160 12.2ns ± 0% 12.2ns ± 0% +0.13% (p=0.008 n=5+5)
ClearFat1024-160 24.3ns ± 0% 24.3ns ± 0% ~ (p=0.167 n=5+5)
ClearFat1032-160 24.5ns ± 0% 24.5ns ± 0% ~ (p=0.238 n=4+5)
ClearFat1040-160 29.2ns ± 0% 29.3ns ± 0% +0.34% (p=0.008 n=5+5)
CopyFat7-160 1.43ns ± 0% 1.07ns ± 0% -24.97% (p=0.008 n=5+5)
CopyFat8-160 0.89ns ± 0% 0.89ns ± 0% ~ (p=0.238 n=5+5)
CopyFat11-160 1.43ns ± 0% 1.07ns ± 0% -24.97% (p=0.008 n=5+5)
CopyFat12-160 1.07ns ± 0% 1.07ns ± 0% ~ (p=0.238 n=5+4)
CopyFat13-160 1.43ns ± 0% 1.07ns ± 0% ~ (p=0.079 n=4+5)
CopyFat14-160 1.43ns ± 0% 1.07ns ± 0% -24.95% (p=0.008 n=5+5)
CopyFat15-160 1.79ns ± 0% 1.07ns ± 0% ~ (p=0.079 n=4+5)
CopyFat16-160 1.07ns ± 0% 1.07ns ± 0% ~ (p=0.444 n=5+5)
CopyFat24-160 1.84ns ± 2% 1.67ns ± 0% -9.28% (p=0.008 n=5+5)
CopyFat32-160 3.22ns ± 0% 2.92ns ± 0% -9.40% (p=0.008 n=5+5)
CopyFat64-160 3.64ns ± 0% 3.57ns ± 0% -1.96% (p=0.008 n=5+5)
CopyFat72-160 3.56ns ± 0% 3.11ns ± 0% -12.89% (p=0.008 n=5+5)
CopyFat128-160 5.06ns ± 0% 5.06ns ± 0% +0.04% (p=0.048 n=5+5)
CopyFat256-160 9.13ns ± 0% 9.13ns ± 0% ~ (p=0.659 n=5+5)
CopyFat512-160 17.4ns ± 0% 17.4ns ± 0% ~ (p=0.167 n=5+5)
CopyFat520-160 17.2ns ± 0% 17.3ns ± 0% +0.37% (p=0.008 n=5+5)
CopyFat1024-160 34.1ns ± 0% 34.0ns ± 0% ~ (p=0.127 n=5+5)
CopyFat1032-160 80.9ns ± 0% 34.2ns ± 0% -57.74% (p=0.008 n=5+5)
CopyFat1040-160 94.4ns ± 0% 41.7ns ± 0% -55.78% (p=0.016 n=5+4)
Change-Id: I14186f9f82b0ecf8b6c02191dc5da566b9a21e6c
Reviewed-on: https://go-review.googlesource.com/c/go/+/421654
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
As it can't appear in user package paths.
There is a hack for handling "go:buildid" and "type:*" on windows/386.
Previously, windows/386 requires underscore prefix on external symbols,
but that's only applied for SHOSTOBJ/SUNDEFEXT or cgo export symbols.
"go.buildid" is STEXT, "type.*" is STYPE, thus they are not prefixed
with underscore.
In external linking mode, the external linker can't resolve them as
external symbols. But we are lucky that they have "." in their name,
so the external linker see them as Forwarder RVA exports. See:
- https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#export-address-table
- https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/pe-dll.c;h=e7b82ba6ffadf74dc1b9ee71dc13d48336941e51;hb=HEAD#l972)
This CL changes "." to ":" in symbols name, so theses symbols can not be
found by external linker anymore. So a hacky way is adding the
underscore prefix for these 2 symbols. I don't have enough knowledge to
verify whether adding the underscore for all STEXT/STYPE symbols are
fine, even if it could be, that would be done in future CL.
Fixes #37762
Change-Id: I92eaaf24c0820926a36e0530fdb07b07af1fcc35
Reviewed-on: https://go-review.googlesource.com/c/go/+/317917
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Ref: CL 141118
Update #26498
Change-Id: If4ea55c080b9aa10183eefe81fefbd4072deaf3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/280646
Trust: Meng Zhuo <mzh@golangcn.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Extend the optimization introduced in CL 141118 to the wasm architecture.
And for reference, the rules trigger 212 times while building std and cmd
$GOOS=js GOARCH=wasm gotip build std cmd
$grep -E "Wasm.rules:44(1|2|3|4)" rulelog | wc -l
212
Updates #26498
Change-Id: I153684a2b98589ae812b42268da08b65679e09d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/185477
Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
|
|
For moves >8,<16 bytes, do a move using non-overlapping loads/stores
if it would require no more instructions.
This helps a bit with the case when the move is from a static
constant, because then the code to materialize the value being moved
is smaller.
Change-Id: Ie47a5a7c654afeb4973142b0a9922faea13c9b54
Reviewed-on: https://go-review.googlesource.com/c/146019
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Instead of
MOVB go.string."foo"(SB), AX
do
MOVB $102, AX
When we know the global we're loading from is readonly, we can
do that read at compile time.
I've made this arch-dependent mostly because the cases where this
happens often are memory->memory moves, and those don't get
decomposed until lowering.
Did amd64/386/arm/arm64. Other architectures could follow.
Update #26498
Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
Reviewed-on: https://go-review.googlesource.com/c/141118
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.
[]byte("foo") rewrites to the following pseudocode:
var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]
which generates this assembly:
0x001d 00029 (tmp1.go:9) LEAQ type.[3]uint8(SB), AX
0x0024 00036 (tmp1.go:9) MOVQ AX, (SP)
0x0028 00040 (tmp1.go:9) CALL runtime.newobject(SB)
0x002d 00045 (tmp1.go:9) MOVQ 8(SP), AX
0x0032 00050 (tmp1.go:9) MOVBLZX go.string."foo"+2(SB), CX
0x0039 00057 (tmp1.go:9) MOVWLZX go.string."foo"(SB), DX
0x0040 00064 (tmp1.go:9) MOVW DX, (AX)
0x0043 00067 (tmp1.go:9) MOVB CL, 2(AX)
// Then the slice is b = {AX, 3, 3}
The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores. Next CL...
Update #26498
Fixes #10170
Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Adds a new runtime function to count runes in a string.
Modifies the compiler to detect the pattern len([]rune(string))
and replaces it with the new rune counting runtime function.
RuneCount/lenruneslice/ASCII 27.8ns ± 2% 14.5ns ± 3% -47.70% (p=0.000 n=10+10)
RuneCount/lenruneslice/Japanese 126ns ± 2% 60ns ± 2% -52.03% (p=0.000 n=10+10)
RuneCount/lenruneslice/MixedLength 104ns ± 2% 50ns ± 1% -51.71% (p=0.000 n=10+9)
Fixes #24923
Change-Id: Ie9c7e7391a4e2cca675c5cdcc1e5ce7d523948b9
Reviewed-on: https://go-review.googlesource.com/108985
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|