| Age | Commit message (Collapse) | Author |
|
This CL marks some darwin assembly functions as NOFRAME to avoid relying
on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
This is a second attempt after CL 460235 was reverted.
Change-Id: I790f2108fc01ec193aa32b0bc82362c2344a9f3b
Reviewed-on: https://go-review.googlesource.com/c/go/+/466055
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Introduce a flag in the object file indicating whether a given
function corresponds to a compiler-generated (not user-written) init
function, such as "os.init" or "syscall.init". Add code to the
compiler to fill in the correct value for the flag, and add support to
the loader package in the linker for testing the flag. The new loader
API is currently unused, but will be needed in the next CL in this
stack.
Updates #2559.
Updates #36021.
Updates #14840.
Change-Id: Iea7ad2adda487e4af7a44f062f9817977c53b394
Reviewed-on: https://go-review.googlesource.com/c/go/+/463855
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Instruction formats: rdtime rd, rj
The RDTIME family of instructions are used to read constant frequency timer
information, the stable counter value is written into the general register
rd, and the counter id information is written into the general register rj.
(Note: both of its register operands are outputs).
Ref: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
Change-Id: Ida5bbb28316ef70b5f616dac3e6fa6f2e77875b5
Reviewed-on: https://go-review.googlesource.com/c/go/+/421655
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
|
|
This CL instructs the Go x86 compiler to load the frame pointer address
using a MOV instead of a LEA instruction, being MOV 1 byte shorter:
Before
55 PUSHQ BP
48 8d 2c 24 LEAQ 0(SP), BP
After
55 PUSHQ BP
48 89 e5 MOVQ SP, BP
This reduces the size of the Go toolchain ~0.06%.
Updates #6853
Change-Id: I5557cf34c47e871d264ba0deda9b78338681a12c
Reviewed-on: https://go-review.googlesource.com/c/go/+/463845
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This CL changes how the x86 compiler stores and loads the frame pointer
on each function prologue and epilogue, with the goal to reduce the
final binary size without affecting performance.
The compiler is currently using MOV instructions to load and store BP,
which can take from 5 to 8 bytes each.
This CL changes this approach so it emits PUSH/POP instructions instead,
which always take only 1 byte each (when operating with BP). It can also
avoid using the SUBQ/ADDQ to grow the stack for functions that have
frame pointer but does not have local variables.
On Windows, this CL reduces the go toolchain size from 15,697,920 bytes
to 15,584,768 bytes, a reduction of 0.7%.
Example of epilog and prologue for a function with 0x10 bytes of
local variables:
Before
===
SUBQ $0x18, SP
MOVQ BP, 0x10(SP)
LEAQ 0x10(SP), BP
... function body ...
MOVQ 0x10(SP), BP
ADDQ $0x18, SP
RET
===
After
===
PUSHQ BP
LEAQ 0(SP), BP
SUBQ $0x10, SP
... function body ...
MOVQ ADDQ $0x10, SP
POPQ BP
RET
===
Updates #6853
Change-Id: Ice9e14bbf8dff083c5f69feb97e9a764c3ca7785
Reviewed-on: https://go-review.googlesource.com/c/go/+/462300
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
|
|
Change-Id: If092ae7c72b66f172ae32fa6c7294a7ac250362e
Reviewed-on: https://go-review.googlesource.com/c/go/+/463995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
|
|
PutAbstractFunc doesn't use FnState.Filesym, so it isn't needed, but
more importantly it is misleading. DwarfAbstractFunc is frequently used
on inlined functions from outside the current compilation unit. For
those function, ctxt.fileSymbol returns nil, meaning it probably isn't
safe to use if the original compilation unit could also generate an
abstract func with the correct file symbol.
Change-Id: I0e6c76e41d75ac9ca07e0f775e49d791249e1c5d
Reviewed-on: https://go-review.googlesource.com/c/go/+/458198
Reviewed-by: Than McIntosh <thanm@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
|
|
DW_AT_decl_line provides the line number of function declarations (the
line containing the func keyword). This is the equivalent to CL 429638,
but provided via DWARF.
Note that the file of declarations (DW_AT_decl_file) is already provided
for non-inlined functions. It is omitted for inlined functions because
those DWARF subprograms may be generated outside of their source
compilation unit, where referencing the file table is difficult.
Fixes #57308.
Change-Id: I3ad12e1f366c4465c2a588297988a5825ef7efec
Reviewed-on: https://go-review.googlesource.com/c/go/+/458195
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Zeroing requires a non-K0 mask register be specified.
(gcc enforces this when assembling.)
The non-K0 restriction is already handled by the Yknot0 restriction.
But if the mask register is missing altogether, we misassemble the
instruction.
Fixes #57952
Not sure if this is really worth mentioning in the release notes,
but just in case I'll mark it.
RELNOTE=yes
Change-Id: I8f05d3155503f1f16d1b5ab9d67686fe5b64dfea
Reviewed-on: https://go-review.googlesource.com/c/go/+/463229
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Илья Токарь <tocarip@gmail.com>
Reviewed-by: Iskander Sharipov <quasilyte@gmail.com>
|
|
This reverts CL 460235.
Reason for revert: This breaks darwin 10 and 11
Change-Id: I3c663ebe3b77eba45a006a3ebec5cabe667faa9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/463635
Auto-Submit: Quim Muntal <quimmuntal@gmail.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
This CL marks some darwin assembly functions as NOFRAME to avoid relying
on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
Change-Id: I797f3909bcf7f7aad304e4ede820c884231e54f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/460235
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid
relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions
without stack were also marked as NOFRAME.
Updates #57302
Updates #40044
Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61
Reviewed-on: https://go-review.googlesource.com/c/go/+/459395
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This CL redesign how we get the TLS pointer on windows/i386.
It applies the same changes as done in CL 431775 for windows/amd64.
We were previously reading it from the [TEB] arbitrary data slot,
located at 0x14(FS), which can only hold 1 TLS pointer.
With this CL, we will read the TLS pointer from the TEB TLS slot array,
located at 0xE10(GS). The TLS slot array can hold multiple
TLS pointers, up to 64, so multiple Go runtimes running on the
same thread can coexists with different TLS.
Each new TLS slot has to be allocated via [TlsAlloc],
which returns the slot index. This index can then be used to get the
slot offset from GS with the following formula: 0xE10 + index*4.
The slot index is fixed per Go runtime, so we can store it
in runtime.tls_g and use it latter on to read/update the TLS pointer.
Loading the TLS pointer requires the following asm instructions:
MOVQ runtime.tls_g, AX
MOVQ AX(FS), AX
Notice that this approach will now be implemented in all the supported
windows arches.
[TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block
[TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc
Change-Id: If4550b0d44694ee6480d4093b851f4991a088b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/454675
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently runtime.Breakpoint generates SIGSEGV in s390x.
The solution to this is add new asm instruction BRRK of
type FORMAT_E for the breakpoint exception.
Fixes #52103
Change-Id: I8358a56e428849a5d28d5ade141e1d7310bee084
Reviewed-on: https://go-review.googlesource.com/c/go/+/457456
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
This is the second round to look for spelling mistakes. This time the
manual sifting of the result list was made easier by filtering out
capitalized and camelcase words.
grep -r --include '*.go' -E '^// .*$' . | aspell list | grep -E -x '[A-Za-z]{1}[a-z]*' | sort | uniq
This PR will be imported into Gerrit with the title and first
comment (this text) used to generate the subject and body of
the Gerrit change.
Change-Id: Ie8a2092aaa7e1f051aa90f03dbaf2b9aaf5664a9
GitHub-Last-Rev: fc2bd6e0c51652f13a7588980f1408af8e6080f5
GitHub-Pull-Request: golang/go#57737
Reviewed-on: https://go-review.googlesource.com/c/go/+/461595
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
|
|
Current RISCV64 assembler do not check the invalid shift amount. This CL
adds the check to avoid generating invalid instructions.
Fixes #57755
Change-Id: If33877605e161baefd98c50db1f71641ca057507
Reviewed-on: https://go-review.googlesource.com/c/go/+/461755
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
|
|
These typos were found by executing grep, aspell, sort, and uniq in
a pipe and searching the resulting list manually for possible typos.
grep -r --include '*.go' -E '^// .*$' . | aspell list | sort | uniq
Change-Id: I56281eda3b178968fbf104de1f71316c1feac64f
GitHub-Last-Rev: e91c7cee340fadfa32b0c1773e4e5cd1ca567638
GitHub-Pull-Request: golang/go#57669
Reviewed-on: https://go-review.googlesource.com/c/go/+/460767
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Provide file/line numbers for errors when we have them.
Make the assembler error text closer to the equivalent errors from the compiler.
Abort further processing when we come across errors.
Fixes #53994
Change-Id: I4d6a037d6d713c1329923fce4c1189b5609f3660
Reviewed-on: https://go-review.googlesource.com/c/go/+/455276
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@google.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This CL cleans up the literal pool implementation and inserts an UNDEF
instruction before the literal pool if the last instruction of the
function is not an unconditional jump instruction, RET or ERET
instruction.
Change-Id: Ifecb9e3372478362dde246c1bc9bc8d527a469d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/424134
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently, we judge whether we need to fix up the branch instruction
based on Optab.type_ field, but the type_ field in optab may change.
This CL marks the branch instruction in optab, and checks whether to
do fixing up according to the mark. Depending on the constant parameter
range of the branch instruction, there are two labels, BRANCH14BITS,
BRANCH19BITS. For the 26-bit branch, linker will handle it.
Besides this CL removes the unnecessary alignment of the DWORD
instruction. Because the ISA doesn't require it and no 64-bit load
assume it. The only effect is that there is some performance penalty
for loading from DWORDs if the 8-byte DWORD instruction crosses the
cache line, but this is very rare.
Change-Id: I993902b3fb5ad8e081dd6c441e86bcf581031835
Reviewed-on: https://go-review.googlesource.com/c/go/+/424135
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
|
|
A few new opcodes are added to support ROP mitigation on
Power10.
Change-Id: I13045aebc0b6fb09c64dc234ee5741318670d7ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/425597
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
testenv.Command sets a default timeout based on the test's deadline
and sends SIGQUIT (where supported) in case of a hang.
Change-Id: Ica1a9985f9abb1935434367c9c8ba28fc50f331d
Reviewed-on: https://go-review.googlesource.com/c/go/+/450699
Auto-Submit: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
|
|
This CL redesign how we get the TLS pointer on windows/amd64.
We were previously reading it from the [TEB] arbitrary data slot,
located at 0x28(GS), which can only hold 1 TLS pointer.
With this CL, we will read the TLS pointer from the TEB TLS slot array,
located at 0x1480(GS). The TLS slot array can hold multiple
TLS pointers, up to 64, so multiple Go runtimes running on the
same thread can coexists with different TLS.
Each new TLS slot has to be allocated via [TlsAlloc],
which returns the slot index. This index can then be used to get the
slot offset from GS with the following formula: 0x1480 + index*8
The slot index is fixed per Go runtime, so we can store it
in runtime.tls_g and use it latter on to read/update the TLS pointer.
Loading the TLS pointer requires the following asm instructions:
MOVQ runtime.tls_g, AX
MOVQ AX(GS), AX
Notice that this approach is also implemented on windows/arm64.
[TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block
[TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc
Updates #22192
Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b
Reviewed-on: https://go-review.googlesource.com/c/go/+/431775
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
|
|
Change-Id: Ib6ea1bd04d9b06542ed2b0f453c718115417c62c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449755
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Some object file writer functions are structured like, having a
local variable, setting fields, then passing it to a Write method
which eventually calls io.Writer.Write. As the Write call is an
interface call it escapes the parameter, which in turn causes the
local variable to be heap allocated. To reduce allocation, use
pre-allocated scratch space instead.
Reduce number of allocations in the compiler:
name old allocs/op new allocs/op delta
Template 679k ± 0% 644k ± 0% -5.17% (p=0.000 n=20+20)
Unicode 603k ± 0% 581k ± 0% -3.67% (p=0.000 n=20+20)
GoTypes 3.83M ± 0% 3.63M ± 0% -5.30% (p=0.000 n=20+20)
Compiler 353k ± 0% 342k ± 0% -3.09% (p=0.000 n=18+19)
SSA 31.4M ± 0% 30.4M ± 0% -3.02% (p=0.000 n=20+20)
Flate 397k ± 0% 373k ± 0% -5.92% (p=0.000 n=20+18)
GoParser 777k ± 0% 735k ± 0% -5.37% (p=0.000 n=20+20)
Reflect 2.07M ± 0% 1.90M ± 0% -7.89% (p=0.000 n=18+20)
Tar 605k ± 0% 568k ± 0% -6.26% (p=0.000 n=19+16)
XML 801k ± 0% 766k ± 0% -4.36% (p=0.000 n=20+20)
[Geo mean] 1.18M 1.12M -5.02%
Change-Id: I9d02a72e459e645527196ac54b6ee643a5ea6bd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/449637
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
|
|
AllPos truncates and overwrites its slice-storage input instead
of appending. This makes that clear.
Change-Id: I81653ff49a4a7d14fe9446fd6620943f3b20bbd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/449478
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Change-Id: Ia5a090953d324f0f8aa9c1808c88125ad5eb6f98
Reviewed-on: https://go-review.googlesource.com/c/go/+/448955
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
|
|
Make linkgetlineFromPos and getFileIndexAndLine methods on Link, and
give the former a more descriptive name.
The docs are expanded to make it more clear that these are final
file/line visible in programs.
In getFileSymbolAndLine use ctxt.InnermostPos instead of ctxt.PosTable
direct, which makes it more clear that we want the semantics of
InnermostPos.
Change-Id: I7c3d344dec60407fa54b191be8a09c117cb87dd0
Reviewed-on: https://go-review.googlesource.com/c/go/+/446301
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This CL optimizes the sequence of instructions ADRP+ADD+LD/ST to the
sequence of ADRP+LD/ST(offset). This saves an ADD instruction.
The test result of compilecmp:
name old text-bytes new text-bytes delta
HelloSize 763kB ± 0% 755kB ± 0% -1.06% (p=0.000 n=20+20)
name old data-bytes new data-bytes delta
HelloSize 13.5kB ± 0% 13.5kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 227kB ± 0% 227kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.33MB ± 0% 1.33MB ± 0% -0.02% (p=0.000 n=20+20)
file before after Δ %
addr2line 3760392 3759504 -888 -0.024%
api 5361511 5295351 -66160 -1.234%
asm 5014157 4948674 -65483 -1.306%
buildid 2579949 2579485 -464 -0.018%
cgo 4492817 4491737 -1080 -0.024%
compile 23359229 23156074 -203155 -0.870%
cover 4823337 4756937 -66400 -1.377%
dist 3332850 3331794 -1056 -0.032%
doc 3902649 3836745 -65904 -1.689%
fix 3269708 3268828 -880 -0.027%
link 6510760 6443496 -67264 -1.033%
nm 3670740 3604348 -66392 -1.809%
objdump 4069599 4068967 -632 -0.016%
pack 2374824 2374208 -616 -0.026%
pprof 13874860 13805700 -69160 -0.498%
test2json 2599210 2598530 -680 -0.026%
trace 13231640 13162872 -68768 -0.520%
vet 7360899 7292267 -68632 -0.932%
total 113589131 112775517 -813614 -0.716%
Change-Id: Ie1cf277e149ddd3f352d05fa0753d0ced7e0b894
Reviewed-on: https://go-review.googlesource.com/c/go/+/444715
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
|
|
The existing implementation uses loops to implement bulk memory
operations such as memcpy and memclr. Now that bulk memory operations
have been standardized and are implemented in all major browsers and
engines (see https://webassembly.org/roadmap/), we should use them
to improve performance.
Updates #28360
Change-Id: I28df0e0350287d5e7e1d1c09a4064ea1054e7575
Reviewed-on: https://go-review.googlesource.com/c/go/+/444935
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Richard Musiol <neelance@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Auto-Submit: Richard Musiol <neelance@gmail.com>
Reviewed-by: Richard Musiol <neelance@gmail.com>
|
|
The instruction format of MOVBU is the same with MOVB, this CL deletes
MOVBU from optab for simplicity.
Change-Id: Ib034d6c29dd9793cf3e6f9fa8abff0ed0d931d0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/445295
Run-TryBot: Eric Fang <eric.fang@arm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
When using "MOVD $const, Rx", any 32b constant can be generated in
register quickly. Avoid transforming big uint32 values into a load.
And, fix the instance in runtime.usleep where I discovered this.
Change-Id: I46e156d7edf200f85b5b61162f00223c0ad81fe2
Reviewed-on: https://go-review.googlesource.com/c/go/+/444815
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Change-Id: Ia7e1e3679e03d125feb9708cb05bbd32c4954edb
GitHub-Last-Rev: a62b72ea3edcf2b4f9f378cd03b1ac073ab80c74
GitHub-Pull-Request: golang/go#55957
Reviewed-on: https://go-review.googlesource.com/c/go/+/436879
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: hopehook <hopehook@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This adds the function "start line number" to runtime._func and
runtime.inlinedCall objects. The "start line number" is the line number
of the func keyword or TEXT directive for assembly.
Subtracting the start line number from PC line number provides the
relative line offset of a PC from the the start of the function. This
helps with source stability by allowing code above the function to move
without invalidating samples within the function.
Encoding start line rather than relative lines directly is convenient
because the pprof format already contains a start line field.
This CL uses a straightforward encoding of explictly including a start
line field in every _func and inlinedCall. It is possible that we could
compress this further in the future. e.g., functions with a prologue
usually have <line of PC 0> == <start line>. In runtime.test, 95% of
functions have <line of PC 0> == <start line>.
According to bent, this is geomean +0.83% binary size vs master and
-0.31% binary size vs 1.19.
Note that //line directives can change the file and line numbers
arbitrarily. The encoded start line is as adjusted by //line directives.
Since this can change in the middle of a function, `line - start line`
offset calculations may not be meaningful if //line directives are in
use.
For #55022.
Change-Id: Iaabbc6dd4f85ffdda294266ef982ae838cc692f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/429638
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Change-Id: I375233dc700adbc58a6d4af995d07b352bf85b11
GitHub-Last-Rev: ef129205231b892f61b0135c87bb791a5e1a126c
GitHub-Pull-Request: golang/go#55994
Reviewed-on: https://go-review.googlesource.com/c/go/+/437715
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
Fixes #55832
Change-Id: Ib20279d47c1ca9a383a3c85bb41ca4f550bb0a33
GitHub-Last-Rev: 10af77a2f21397899f69938e6d98bb34b33bfddf
GitHub-Pull-Request: golang/go#55838
Reviewed-on: https://go-review.googlesource.com/c/go/+/433575
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: hopehook <hopehook@golangcn.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Usually optimization rules have corresponding priorities, some need to
be run first, some run next, and some run last, which produces the best
code. But currently our optimization rules have no priority, this CL
adds a late lower pass that runs those rules that need to be run at last,
such as split unreasonable constant folding. This pass can be seen as
the second round of the lower pass.
For example:
func foo(a, b uint64) uint64 {
d := a+0x1234568
d1 := b+0x1234568
return d&d1
}
The code generated by the master branch:
0x0004 00004 ADD $19088744, R0, R2 // movz+movk+add
0x0010 00016 ADD $19088744, R1, R1 // movz+movk+add
0x001c 00028 AND R1, R2, R0
This is because the current constant folding optimization rules do not
take into account the range of constants, causing the constant to be
loaded repeatedly. This CL splits these unreasonable constants folding
in the late lower pass. With this CL the generated code:
0x0004 00004 MOVD $19088744, R2 // movz+movk
0x000c 00012 ADD R0, R2, R3
0x0010 00016 ADD R1, R2, R1
0x0014 00020 AND R1, R3, R0
This CL also adds constant folding optimization for ADDS instruction.
In addition, in order not to introduce the codegen regression, an
optimization rule is added to change the addition of a negative number
into a subtraction of a positive number.
go1 benchmarks:
name old time/op new time/op delta
BinaryTree17-8 1.22s ± 1% 1.24s ± 0% +1.56% (p=0.008 n=5+5)
Fannkuch11-8 1.54s ± 0% 1.53s ± 0% -0.69% (p=0.016 n=4+5)
FmtFprintfEmpty-8 14.1ns ± 0% 14.1ns ± 0% ~ (p=0.079 n=4+5)
FmtFprintfString-8 26.0ns ± 0% 26.1ns ± 0% +0.23% (p=0.008 n=5+5)
FmtFprintfInt-8 32.3ns ± 0% 32.9ns ± 1% +1.72% (p=0.008 n=5+5)
FmtFprintfIntInt-8 54.5ns ± 0% 55.5ns ± 0% +1.83% (p=0.008 n=5+5)
FmtFprintfPrefixedInt-8 61.5ns ± 0% 62.0ns ± 0% +0.93% (p=0.008 n=5+5)
FmtFprintfFloat-8 72.0ns ± 0% 73.6ns ± 0% +2.24% (p=0.008 n=5+5)
FmtManyArgs-8 221ns ± 0% 224ns ± 0% +1.22% (p=0.008 n=5+5)
GobDecode-8 1.91ms ± 0% 1.93ms ± 0% +0.98% (p=0.008 n=5+5)
GobEncode-8 1.40ms ± 1% 1.39ms ± 0% -0.79% (p=0.032 n=5+5)
Gzip-8 115ms ± 0% 117ms ± 1% +1.17% (p=0.008 n=5+5)
Gunzip-8 19.4ms ± 1% 19.3ms ± 0% -0.71% (p=0.016 n=5+4)
HTTPClientServer-8 27.0µs ± 0% 27.3µs ± 0% +0.80% (p=0.008 n=5+5)
JSONEncode-8 3.36ms ± 1% 3.33ms ± 0% ~ (p=0.056 n=5+5)
JSONDecode-8 17.5ms ± 2% 17.8ms ± 0% +1.71% (p=0.016 n=5+4)
Mandelbrot200-8 2.29ms ± 0% 2.29ms ± 0% ~ (p=0.151 n=5+5)
GoParse-8 1.35ms ± 1% 1.36ms ± 1% ~ (p=0.056 n=5+5)
RegexpMatchEasy0_32-8 24.5ns ± 0% 24.5ns ± 0% ~ (p=0.444 n=4+5)
RegexpMatchEasy0_1K-8 131ns ±11% 118ns ± 6% ~ (p=0.056 n=5+5)
RegexpMatchEasy1_32-8 22.9ns ± 0% 22.9ns ± 0% ~ (p=0.905 n=4+5)
RegexpMatchEasy1_1K-8 126ns ± 0% 127ns ± 0% ~ (p=0.063 n=4+5)
RegexpMatchMedium_32-8 486ns ± 5% 483ns ± 0% ~ (p=0.381 n=5+4)
RegexpMatchMedium_1K-8 15.4µs ± 1% 15.5µs ± 0% ~ (p=0.151 n=5+5)
RegexpMatchHard_32-8 687ns ± 0% 686ns ± 0% ~ (p=0.103 n=5+5)
RegexpMatchHard_1K-8 20.7µs ± 0% 20.7µs ± 1% ~ (p=0.151 n=5+5)
Revcomp-8 175ms ± 2% 176ms ± 3% ~ (p=1.000 n=5+5)
Template-8 20.4ms ± 6% 20.1ms ± 2% ~ (p=0.151 n=5+5)
TimeParse-8 112ns ± 0% 113ns ± 0% +0.97% (p=0.016 n=5+4)
TimeFormat-8 156ns ± 0% 145ns ± 0% -7.14% (p=0.029 n=4+4)
Change-Id: I3ced26e89041f873ac989586514ccc5ee09f13da
Reviewed-on: https://go-review.googlesource.com/c/go/+/425134
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
|
|
It's all local to a single file and responsible for 1.7% of total
space allocated summed over compilation of the bent benchmarks.
Showing nodes accounting for 27.16GB, 27.04% of 100.44GB total
Dropped 1622 nodes (cum <= 0.50GB)
Showing top 10 nodes out of 321
flat flat% sum% cum cum%
4.87GB 4.85% 4.85% 4.87GB 4.85% cmd/compile/internal/objw.init
4.79GB 4.77% 9.62% 4.81GB 4.79% runtime.allocm
4.72GB 4.70% 14.32% 4.72GB 4.70% cmd/compile/internal/types.newType
3.10GB 3.09% 17.41% 3.17GB 3.15% cmd/compile/internal/ssagen.InitConfig
1.86GB 1.85% 19.26% 2.61GB 2.60% cmd/compile/internal/ssa.cse
1.72GB 1.71% 20.97% 2.25GB 2.24% cmd/internal/obj.(*Link).traverseFuncAux
1.66GB 1.65% 22.62% 1.66GB 1.65% runtime.malg
1.61GB 1.61% 24.23% 1.64GB 1.63% cmd/compile/internal/ssa.schedule
1.42GB 1.41% 25.64% 1.42GB 1.41% cmd/compile/internal/ir.NewNameAt
Change-Id: Id18ee3b83cb23a7042d59714a0c1ca074e7bc7a8
Reviewed-on: https://go-review.googlesource.com/c/go/+/437297
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: David Chase <drchase@google.com>
|
|
Change-Id: Icd4062e570559f1d0c69d4bdb9e23412054cf2a6
GitHub-Last-Rev: fbbfbcb54dac88c9a8f5c5c6d210be46f87e27dd
GitHub-Pull-Request: golang/go#55958
Reviewed-on: https://go-review.googlesource.com/c/go/+/436880
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Change-Id: I4b596b252c1785b13c4a166e9ef5f4ae812cd1bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/436704
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Change-Id: I5350c6374cd39ce4512d29cd8a341c4996f3b601
Reviewed-on: https://go-review.googlesource.com/c/go/+/436703
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Change-Id: Ia0896bd1edf2558821244fecd1c297b599472f47
GitHub-Last-Rev: cfd1e1091a064cdc38469c02c6c013635d7d437b
GitHub-Pull-Request: golang/go#55944
Reviewed-on: https://go-review.googlesource.com/c/go/+/436637
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This CL removes some opcode placeholders that do not correspond
to any existing instructions and hence create confusion. Some
instructions that are no longer valid like LDMX are also removed.
Any references to this instruction in ISA 3.0 are considered
as documentation errata.
Change-Id: Ib71a657099723bbe1db88873233ee573b5c42fe7
Reviewed-on: https://go-review.googlesource.com/c/go/+/429860
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
Run-TryBot: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
|
|
For #45557
Change-Id: I56824135d86452603dd4ed4bab0e24c201bb0683
Reviewed-on: https://go-review.googlesource.com/c/go/+/426257
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Andy Pan <panjf2000@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This CL adds some optimizaion rules:
1, Converts CMP to CMN, or vice versa, when comparing with a negative
number.
2, For equal and not equal comparisons, CMP can be converted to CMN in
some cases. In theory we could do the same optimization for LT, LE, GT
and GE, but need to account for overflow, this CL doesn't handle them.
There are no noticeable performance changes.
Change-Id: Ia49266c019ab7908ebc9510c2f02e121b1607869
Reviewed-on: https://go-review.googlesource.com/c/go/+/429795
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Eric Fang <eric.fang@arm.com>
|
|
Use ppc64map (from x/arch) to generate ISA 3.1 support for the
assembler. A new file asm9_gtables.go is added which contains
generated code to encode ISA 3.1 instructions, a function to assist
filling out the oprange structure, a lookup table for the fixed
bits of each instructions, and a slice of string name. Generated
functions are shared if their bitwise encoding match, and the
translation from an obj.Prog structure matches.
The generated file is entirely self-contained, and does not require
regenerating any other files for changes within it. If opcodes in
a.out.go are reordered or changed, anames.go must be updated in
the same way as before.
Future improvements could shrink the generated opcode table
to 32 bit entries as there is much less variation of the
encoding of the prefix word, but it is not always identical
for instructions which share a similar encoding of arguments
(e.g PLWA and PLWZ).
Updates #44549
Change-Id: Ie83fa02497c9ad2280678d68391043d3aae63175
Reviewed-on: https://go-review.googlesource.com/c/go/+/419535
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jenny Rakoczy <jenny@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Jenny Rakoczy <jenny@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Jenny Rakoczy <jenny@golang.org>
|
|
This CL adds tests for some of the instructions that were
missing. A minor change was made to asm9.go to ensure EXTSWSLICC
test works.
Change-Id: I95cd096c85778fc93856d213aa4fb14c35228cec
Reviewed-on: https://go-review.googlesource.com/c/go/+/430376
Run-TryBot: Archana Ravindar <aravind5@in.ibm.com>
Reviewed-by: Jenny Rakoczy <jenny@golang.org>
Run-TryBot: Jenny Rakoczy <jenny@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Jenny Rakoczy <jenny@golang.org>
|
|
These can be simplified with the knowledge of how arguments are
assigned to obj.Prog objects on PPC64. If the argument is not
a register type, the Reg argument (a2 in optab) of obj.Prog is
not used, and those arguments are placed into RestArgs (a3, a4, a5
in optab).
This relaxes the special case handling enforced by IsPPC64RLD and
IsPPC64ISEL. Instead, arguments are assigned as noted above, and
incorrect usage of such opcodes is checked by optab rules, not by
the assembler front-end.
Likewise, add support for handling 6 argument opcodes, these do
not exist today, but will be added with ISA 3.1 (Power10).
Finally, to maintain backwards compatibility, some 4-arg opcodes
whose middle arguments are a register and immediate, could swap
these arguments and generate identical machine code. This likely
wasn't intended, but is possible. These are explicitly fixed up
in the backend, and the asm tests are extended to check these.
Change-Id: I5f8190212427dfe8e6f062185bfefb5fa4fd0e75
Reviewed-on: https://go-review.googlesource.com/c/go/+/427516
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
|
|
Allow the assembler frontend to match MMA register arguments added by
ISA 3.1. The prefix "A" (for accumulator) is chosen to identify them.
Updates #44549
Change-Id: I363e7d1103aee19d7966829d2079c3d876621efc
Reviewed-on: https://go-review.googlesource.com/c/go/+/419534
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|