aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/internal/obj
AgeCommit message (Collapse)Author
2023-01-26runtime: use explicit NOFRAME on darwin/amd64qmuntal
This CL marks some darwin assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Change-Id: I797f3909bcf7f7aad304e4ede820c884231e54f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/460235 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-24runtime: use explicit NOFRAME on windows/amd64qmuntal
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #57302 Updates #40044 Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61 Reviewed-on: https://go-review.googlesource.com/c/go/+/459395 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2023-01-23runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/i386qmuntal
This CL redesign how we get the TLS pointer on windows/i386. It applies the same changes as done in CL 431775 for windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x14(FS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0xE10(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0xE10 + index*4. The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(FS), AX Notice that this approach will now be implemented in all the supported windows arches. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Change-Id: If4550b0d44694ee6480d4093b851f4991a088b32 Reviewed-on: https://go-review.googlesource.com/c/go/+/454675 Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-20cmd/internal/obj/s390x, runtime: fix breakpoint in s390xSrinivas Pokala
Currently runtime.Breakpoint generates SIGSEGV in s390x. The solution to this is add new asm instruction BRRK of type FORMAT_E for the breakpoint exception. Fixes #52103 Change-Id: I8358a56e428849a5d28d5ade141e1d7310bee084 Reviewed-on: https://go-review.googlesource.com/c/go/+/457456 Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2023-01-20all: fix typos in go file commentsMarcel Meyer
This is the second round to look for spelling mistakes. This time the manual sifting of the result list was made easier by filtering out capitalized and camelcase words. grep -r --include '*.go' -E '^// .*$' . | aspell list | grep -E -x '[A-Za-z]{1}[a-z]*' | sort | uniq This PR will be imported into Gerrit with the title and first comment (this text) used to generate the subject and body of the Gerrit change. Change-Id: Ie8a2092aaa7e1f051aa90f03dbaf2b9aaf5664a9 GitHub-Last-Rev: fc2bd6e0c51652f13a7588980f1408af8e6080f5 GitHub-Pull-Request: golang/go#57737 Reviewed-on: https://go-review.googlesource.com/c/go/+/461595 Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Robert Griesemer <gri@google.com>
2023-01-19cmd/internal/obj/riscv: add check for invalid shift amount inputWayne Zuo
Current RISCV64 assembler do not check the invalid shift amount. This CL adds the check to avoid generating invalid instructions. Fixes #57755 Change-Id: If33877605e161baefd98c50db1f71641ca057507 Reviewed-on: https://go-review.googlesource.com/c/go/+/461755 Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org>
2023-01-09all: fix typos in go file commentsMarcel Meyer
These typos were found by executing grep, aspell, sort, and uniq in a pipe and searching the resulting list manually for possible typos. grep -r --include '*.go' -E '^// .*$' . | aspell list | sort | uniq Change-Id: I56281eda3b178968fbf104de1f71316c1feac64f GitHub-Last-Rev: e91c7cee340fadfa32b0c1773e4e5cd1ca567638 GitHub-Pull-Request: golang/go#57669 Reviewed-on: https://go-review.googlesource.com/c/go/+/460767 Run-TryBot: Ian Lance Taylor <iant@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-12-05cmd/asm: improve assembler error messagesKeith Randall
Provide file/line numbers for errors when we have them. Make the assembler error text closer to the equivalent errors from the compiler. Abort further processing when we come across errors. Fixes #53994 Change-Id: I4d6a037d6d713c1329923fce4c1189b5609f3660 Reviewed-on: https://go-review.googlesource.com/c/go/+/455276 Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@google.com> Auto-Submit: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18all: add missing periods in commentscui fliter
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29 Reviewed-on: https://go-review.googlesource.com/c/go/+/449757 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joedian Reid <joedian@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-18cmd/internal/obj/arm64: tidy literal pooleric fang
This CL cleans up the literal pool implementation and inserts an UNDEF instruction before the literal pool if the last instruction of the function is not an unconditional jump instruction, RET or ERET instruction. Change-Id: Ifecb9e3372478362dde246c1bc9bc8d527a469d5 Reviewed-on: https://go-review.googlesource.com/c/go/+/424134 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Run-TryBot: Eric Fang <eric.fang@arm.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18cmd/internal/obj/arm64: mark branch instructions in optaberic fang
Currently, we judge whether we need to fix up the branch instruction based on Optab.type_ field, but the type_ field in optab may change. This CL marks the branch instruction in optab, and checks whether to do fixing up according to the mark. Depending on the constant parameter range of the branch instruction, there are two labels, BRANCH14BITS, BRANCH19BITS. For the 26-bit branch, linker will handle it. Besides this CL removes the unnecessary alignment of the DWORD instruction. Because the ISA doesn't require it and no 64-bit load assume it. The only effect is that there is some performance penalty for loading from DWORDs if the 8-byte DWORD instruction crosses the cache line, but this is very rare. Change-Id: I993902b3fb5ad8e081dd6c441e86bcf581031835 Reviewed-on: https://go-review.googlesource.com/c/go/+/424135 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Joedian Reid <joedian@golang.org>
2022-11-16cmd/internal/obj/ppc64: add ISA 3.1B opcodesPaul E. Murphy
A few new opcodes are added to support ROP mitigation on Power10. Change-Id: I13045aebc0b6fb09c64dc234ee5741318670d7ff Reviewed-on: https://go-review.googlesource.com/c/go/+/425597 Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Joedian Reid <joedian@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-15cmd/internal/obj: use testenv.Command instead of exec.Command in testsBryan C. Mills
testenv.Command sets a default timeout based on the test's deadline and sends SIGQUIT (where supported) in case of a hang. Change-Id: Ica1a9985f9abb1935434367c9c8ba28fc50f331d Reviewed-on: https://go-review.googlesource.com/c/go/+/450699 Auto-Submit: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Bryan Mills <bcmills@google.com>
2022-11-14runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/amd64qmuntal
This CL redesign how we get the TLS pointer on windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x28(GS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0x1480(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0x1480 + index*8 The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(GS), AX Notice that this approach is also implemented on windows/arm64. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Updates #22192 Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b Reviewed-on: https://go-review.googlesource.com/c/go/+/431775 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2022-11-11all: fix problematic commentscui fliter
Change-Id: Ib6ea1bd04d9b06542ed2b0f453c718115417c62c Reviewed-on: https://go-review.googlesource.com/c/go/+/449755 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Keith Randall <khr@google.com>
2022-11-10cmd/internal/obj: reduce allocations in object file writingCherry Mui
Some object file writer functions are structured like, having a local variable, setting fields, then passing it to a Write method which eventually calls io.Writer.Write. As the Write call is an interface call it escapes the parameter, which in turn causes the local variable to be heap allocated. To reduce allocation, use pre-allocated scratch space instead. Reduce number of allocations in the compiler: name old allocs/op new allocs/op delta Template 679k ± 0% 644k ± 0% -5.17% (p=0.000 n=20+20) Unicode 603k ± 0% 581k ± 0% -3.67% (p=0.000 n=20+20) GoTypes 3.83M ± 0% 3.63M ± 0% -5.30% (p=0.000 n=20+20) Compiler 353k ± 0% 342k ± 0% -3.09% (p=0.000 n=18+19) SSA 31.4M ± 0% 30.4M ± 0% -3.02% (p=0.000 n=20+20) Flate 397k ± 0% 373k ± 0% -5.92% (p=0.000 n=20+18) GoParser 777k ± 0% 735k ± 0% -5.37% (p=0.000 n=20+20) Reflect 2.07M ± 0% 1.90M ± 0% -7.89% (p=0.000 n=18+20) Tar 605k ± 0% 568k ± 0% -6.26% (p=0.000 n=19+16) XML 801k ± 0% 766k ± 0% -4.36% (p=0.000 n=20+20) [Geo mean] 1.18M 1.12M -5.02% Change-Id: I9d02a72e459e645527196ac54b6ee643a5ea6bd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/449637 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com>
2022-11-10cmd/internal/obj: adjust (*Link).AllPos comment in inl.goDavid Chase
AllPos truncates and overwrites its slice-storage input instead of appending. This makes that clear. Change-Id: I81653ff49a4a7d14fe9446fd6620943f3b20bbd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/449478 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2022-11-09all: add missing copyright headercui fliter
Change-Id: Ia5a090953d324f0f8aa9c1808c88125ad5eb6f98 Reviewed-on: https://go-review.googlesource.com/c/go/+/448955 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Bryan Mills <bcmills@google.com>
2022-10-31cmd/internal/obj: cleanup linkgetlineFromPosMichael Pratt
Make linkgetlineFromPos and getFileIndexAndLine methods on Link, and give the former a more descriptive name. The docs are expanded to make it more clear that these are final file/line visible in programs. In getFileSymbolAndLine use ctxt.InnermostPos instead of ctxt.PosTable direct, which makes it more clear that we want the semantics of InnermostPos. Change-Id: I7c3d344dec60407fa54b191be8a09c117cb87dd0 Reviewed-on: https://go-review.googlesource.com/c/go/+/446301 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-28cmd/internal/obj/arm64: optimize ADRP+ADD+LD/ST to ADRP+LD/ST(offset)eric fang
This CL optimizes the sequence of instructions ADRP+ADD+LD/ST to the sequence of ADRP+LD/ST(offset). This saves an ADD instruction. The test result of compilecmp: name old text-bytes new text-bytes delta HelloSize 763kB ± 0% 755kB ± 0% -1.06% (p=0.000 n=20+20) name old data-bytes new data-bytes delta HelloSize 13.5kB ± 0% 13.5kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 227kB ± 0% 227kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.33MB ± 0% 1.33MB ± 0% -0.02% (p=0.000 n=20+20) file before after Δ % addr2line 3760392 3759504 -888 -0.024% api 5361511 5295351 -66160 -1.234% asm 5014157 4948674 -65483 -1.306% buildid 2579949 2579485 -464 -0.018% cgo 4492817 4491737 -1080 -0.024% compile 23359229 23156074 -203155 -0.870% cover 4823337 4756937 -66400 -1.377% dist 3332850 3331794 -1056 -0.032% doc 3902649 3836745 -65904 -1.689% fix 3269708 3268828 -880 -0.027% link 6510760 6443496 -67264 -1.033% nm 3670740 3604348 -66392 -1.809% objdump 4069599 4068967 -632 -0.016% pack 2374824 2374208 -616 -0.026% pprof 13874860 13805700 -69160 -0.498% test2json 2599210 2598530 -680 -0.026% trace 13231640 13162872 -68768 -0.520% vet 7360899 7292267 -68632 -0.932% total 113589131 112775517 -813614 -0.716% Change-Id: Ie1cf277e149ddd3f352d05fa0753d0ced7e0b894 Reviewed-on: https://go-review.googlesource.com/c/go/+/444715 Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Damien Neil <dneil@google.com>
2022-10-27runtime: add wasm bulk memory operationsGaret Halliday
The existing implementation uses loops to implement bulk memory operations such as memcpy and memclr. Now that bulk memory operations have been standardized and are implemented in all major browsers and engines (see https://webassembly.org/roadmap/), we should use them to improve performance. Updates #28360 Change-Id: I28df0e0350287d5e7e1d1c09a4064ea1054e7575 Reviewed-on: https://go-review.googlesource.com/c/go/+/444935 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Richard Musiol <neelance@gmail.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Richard Musiol <neelance@gmail.com> Reviewed-by: Richard Musiol <neelance@gmail.com>
2022-10-27cmd/internal/obj/arm64: remove AMOVBU from optaberifan01
The instruction format of MOVBU is the same with MOVB, this CL deletes MOVBU from optab for simplicity. Change-Id: Ib034d6c29dd9793cf3e6f9fa8abff0ed0d931d0e Reviewed-on: https://go-review.googlesource.com/c/go/+/445295 Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-10-26cmd/internal/obj/ppc64: generate big uint32 values in registerPaul E. Murphy
When using "MOVD $const, Rx", any 32b constant can be generated in register quickly. Avoid transforming big uint32 values into a load. And, fix the instance in runtime.usleep where I discovered this. Change-Id: I46e156d7edf200f85b5b61162f00223c0ad81fe2 Reviewed-on: https://go-review.googlesource.com/c/go/+/444815 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2022-10-26cmd: remove redundant _cui fliter
Change-Id: Ia7e1e3679e03d125feb9708cb05bbd32c4954edb GitHub-Last-Rev: a62b72ea3edcf2b4f9f378cd03b1ac073ab80c74 GitHub-Pull-Request: golang/go#55957 Reviewed-on: https://go-review.googlesource.com/c/go/+/436879 Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: hopehook <hopehook@golangcn.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-14cmd/compile,cmd/link,runtime: add start line numbers to func metadataMichael Pratt
This adds the function "start line number" to runtime._func and runtime.inlinedCall objects. The "start line number" is the line number of the func keyword or TEXT directive for assembly. Subtracting the start line number from PC line number provides the relative line offset of a PC from the the start of the function. This helps with source stability by allowing code above the function to move without invalidating samples within the function. Encoding start line rather than relative lines directly is convenient because the pprof format already contains a start line field. This CL uses a straightforward encoding of explictly including a start line field in every _func and inlinedCall. It is possible that we could compress this further in the future. e.g., functions with a prologue usually have <line of PC 0> == <start line>. In runtime.test, 95% of functions have <line of PC 0> == <start line>. According to bent, this is geomean +0.83% binary size vs master and -0.31% binary size vs 1.19. Note that //line directives can change the file and line numbers arbitrarily. The encoded start line is as adjusted by //line directives. Since this can change in the middle of a function, `line - start line` offset calculations may not be meaningful if //line directives are in use. For #55022. Change-Id: Iaabbc6dd4f85ffdda294266ef982ae838cc692f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/429638 Run-TryBot: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-10-06all: remove redundant type conversioncui fliter
Change-Id: I375233dc700adbc58a6d4af995d07b352bf85b11 GitHub-Last-Rev: ef129205231b892f61b0135c87bb791a5e1a126c GitHub-Pull-Request: golang/go#55994 Reviewed-on: https://go-review.googlesource.com/c/go/+/437715 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2022-10-05cmd/internal/obj/arm64: add missing operand register in GNU assemblyshaoliming
Fixes #55832 Change-Id: Ib20279d47c1ca9a383a3c85bb41ca4f550bb0a33 GitHub-Last-Rev: 10af77a2f21397899f69938e6d98bb34b33bfddf GitHub-Pull-Request: golang/go#55838 Reviewed-on: https://go-review.googlesource.com/c/go/+/433575 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: hopehook <hopehook@golangcn.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2022-10-05cmd/compile: add late lower pass for last rules to runeric fang
Usually optimization rules have corresponding priorities, some need to be run first, some run next, and some run last, which produces the best code. But currently our optimization rules have no priority, this CL adds a late lower pass that runs those rules that need to be run at last, such as split unreasonable constant folding. This pass can be seen as the second round of the lower pass. For example: func foo(a, b uint64) uint64 { d := a+0x1234568 d1 := b+0x1234568 return d&d1 } The code generated by the master branch: 0x0004 00004 ADD $19088744, R0, R2 // movz+movk+add 0x0010 00016 ADD $19088744, R1, R1 // movz+movk+add 0x001c 00028 AND R1, R2, R0 This is because the current constant folding optimization rules do not take into account the range of constants, causing the constant to be loaded repeatedly. This CL splits these unreasonable constants folding in the late lower pass. With this CL the generated code: 0x0004 00004 MOVD $19088744, R2 // movz+movk 0x000c 00012 ADD R0, R2, R3 0x0010 00016 ADD R1, R2, R1 0x0014 00020 AND R1, R3, R0 This CL also adds constant folding optimization for ADDS instruction. In addition, in order not to introduce the codegen regression, an optimization rule is added to change the addition of a negative number into a subtraction of a positive number. go1 benchmarks: name old time/op new time/op delta BinaryTree17-8 1.22s ± 1% 1.24s ± 0% +1.56% (p=0.008 n=5+5) Fannkuch11-8 1.54s ± 0% 1.53s ± 0% -0.69% (p=0.016 n=4+5) FmtFprintfEmpty-8 14.1ns ± 0% 14.1ns ± 0% ~ (p=0.079 n=4+5) FmtFprintfString-8 26.0ns ± 0% 26.1ns ± 0% +0.23% (p=0.008 n=5+5) FmtFprintfInt-8 32.3ns ± 0% 32.9ns ± 1% +1.72% (p=0.008 n=5+5) FmtFprintfIntInt-8 54.5ns ± 0% 55.5ns ± 0% +1.83% (p=0.008 n=5+5) FmtFprintfPrefixedInt-8 61.5ns ± 0% 62.0ns ± 0% +0.93% (p=0.008 n=5+5) FmtFprintfFloat-8 72.0ns ± 0% 73.6ns ± 0% +2.24% (p=0.008 n=5+5) FmtManyArgs-8 221ns ± 0% 224ns ± 0% +1.22% (p=0.008 n=5+5) GobDecode-8 1.91ms ± 0% 1.93ms ± 0% +0.98% (p=0.008 n=5+5) GobEncode-8 1.40ms ± 1% 1.39ms ± 0% -0.79% (p=0.032 n=5+5) Gzip-8 115ms ± 0% 117ms ± 1% +1.17% (p=0.008 n=5+5) Gunzip-8 19.4ms ± 1% 19.3ms ± 0% -0.71% (p=0.016 n=5+4) HTTPClientServer-8 27.0µs ± 0% 27.3µs ± 0% +0.80% (p=0.008 n=5+5) JSONEncode-8 3.36ms ± 1% 3.33ms ± 0% ~ (p=0.056 n=5+5) JSONDecode-8 17.5ms ± 2% 17.8ms ± 0% +1.71% (p=0.016 n=5+4) Mandelbrot200-8 2.29ms ± 0% 2.29ms ± 0% ~ (p=0.151 n=5+5) GoParse-8 1.35ms ± 1% 1.36ms ± 1% ~ (p=0.056 n=5+5) RegexpMatchEasy0_32-8 24.5ns ± 0% 24.5ns ± 0% ~ (p=0.444 n=4+5) RegexpMatchEasy0_1K-8 131ns ±11% 118ns ± 6% ~ (p=0.056 n=5+5) RegexpMatchEasy1_32-8 22.9ns ± 0% 22.9ns ± 0% ~ (p=0.905 n=4+5) RegexpMatchEasy1_1K-8 126ns ± 0% 127ns ± 0% ~ (p=0.063 n=4+5) RegexpMatchMedium_32-8 486ns ± 5% 483ns ± 0% ~ (p=0.381 n=5+4) RegexpMatchMedium_1K-8 15.4µs ± 1% 15.5µs ± 0% ~ (p=0.151 n=5+5) RegexpMatchHard_32-8 687ns ± 0% 686ns ± 0% ~ (p=0.103 n=5+5) RegexpMatchHard_1K-8 20.7µs ± 0% 20.7µs ± 1% ~ (p=0.151 n=5+5) Revcomp-8 175ms ± 2% 176ms ± 3% ~ (p=1.000 n=5+5) Template-8 20.4ms ± 6% 20.1ms ± 2% ~ (p=0.151 n=5+5) TimeParse-8 112ns ± 0% 113ns ± 0% +0.97% (p=0.016 n=5+4) TimeFormat-8 156ns ± 0% 145ns ± 0% -7.14% (p=0.029 n=4+4) Change-Id: I3ced26e89041f873ac989586514ccc5ee09f13da Reviewed-on: https://go-review.googlesource.com/c/go/+/425134 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Eric Fang <eric.fang@arm.com>
2022-10-01cmd/internal/obj: hoist constant slice out of function called in loopDavid Chase
It's all local to a single file and responsible for 1.7% of total space allocated summed over compilation of the bent benchmarks. Showing nodes accounting for 27.16GB, 27.04% of 100.44GB total Dropped 1622 nodes (cum <= 0.50GB) Showing top 10 nodes out of 321 flat flat% sum% cum cum% 4.87GB 4.85% 4.85% 4.87GB 4.85% cmd/compile/internal/objw.init 4.79GB 4.77% 9.62% 4.81GB 4.79% runtime.allocm 4.72GB 4.70% 14.32% 4.72GB 4.70% cmd/compile/internal/types.newType 3.10GB 3.09% 17.41% 3.17GB 3.15% cmd/compile/internal/ssagen.InitConfig 1.86GB 1.85% 19.26% 2.61GB 2.60% cmd/compile/internal/ssa.cse 1.72GB 1.71% 20.97% 2.25GB 2.24% cmd/internal/obj.(*Link).traverseFuncAux 1.66GB 1.65% 22.62% 1.66GB 1.65% runtime.malg 1.61GB 1.61% 24.23% 1.64GB 1.63% cmd/compile/internal/ssa.schedule 1.42GB 1.41% 25.64% 1.42GB 1.41% cmd/compile/internal/ir.NewNameAt Change-Id: Id18ee3b83cb23a7042d59714a0c1ca074e7bc7a8 Reviewed-on: https://go-review.googlesource.com/c/go/+/437297 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: David Chase <drchase@google.com>
2022-09-30all: omit comparison bool constant to simplify codecui fliter
Change-Id: Icd4062e570559f1d0c69d4bdb9e23412054cf2a6 GitHub-Last-Rev: fbbfbcb54dac88c9a8f5c5c6d210be46f87e27dd GitHub-Pull-Request: golang/go#55958 Reviewed-on: https://go-review.googlesource.com/c/go/+/436880 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-09-29cmd/internal/obj/x86: return comparison directlycuiweixie
Change-Id: I4b596b252c1785b13c4a166e9ef5f4ae812cd1bc Reviewed-on: https://go-review.googlesource.com/c/go/+/436704 TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-09-29cmd/internal/obj: call delete directly without check existcuiweixie
Change-Id: I5350c6374cd39ce4512d29cd8a341c4996f3b601 Reviewed-on: https://go-review.googlesource.com/c/go/+/436703 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-09-29cmd: fix a few function names on commentscui fliter
Change-Id: Ia0896bd1edf2558821244fecd1c297b599472f47 GitHub-Last-Rev: cfd1e1091a064cdc38469c02c6c013635d7d437b GitHub-Pull-Request: golang/go#55944 Reviewed-on: https://go-review.googlesource.com/c/go/+/436637 Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-28cmd/internal/obj/ppc64: remove unnecessary opcodesArchana R
This CL removes some opcode placeholders that do not correspond to any existing instructions and hence create confusion. Some instructions that are no longer valid like LDMX are also removed. Any references to this instruction in ISA 3.0 are considered as documentation errata. Change-Id: Ib71a657099723bbe1db88873233ee573b5c42fe7 Reviewed-on: https://go-review.googlesource.com/c/go/+/429860 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Paul Murphy <murp@ibm.com> Run-TryBot: Archana Ravindar <aravind5@in.ibm.com> Reviewed-by: Benny Siegert <bsiegert@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Archana Ravindar <aravind5@in.ibm.com>
2022-09-20all: replace package ioutil with os and io in srcAndy Pan
For #45557 Change-Id: I56824135d86452603dd4ed4bab0e24c201bb0683 Reviewed-on: https://go-review.googlesource.com/c/go/+/426257 Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: Andy Pan <panjf2000@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-09-20cmd/compile: Add some CMP and CMN optimization rules on arm64eric fang
This CL adds some optimizaion rules: 1, Converts CMP to CMN, or vice versa, when comparing with a negative number. 2, For equal and not equal comparisons, CMP can be converted to CMN in some cases. In theory we could do the same optimization for LT, LE, GT and GE, but need to account for overflow, this CL doesn't handle them. There are no noticeable performance changes. Change-Id: Ia49266c019ab7908ebc9510c2f02e121b1607869 Reviewed-on: https://go-review.googlesource.com/c/go/+/429795 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Eric Fang <eric.fang@arm.com>
2022-09-15cmd/internal/obj/ppc64: add ISA 3.1 instructionsPaul E. Murphy
Use ppc64map (from x/arch) to generate ISA 3.1 support for the assembler. A new file asm9_gtables.go is added which contains generated code to encode ISA 3.1 instructions, a function to assist filling out the oprange structure, a lookup table for the fixed bits of each instructions, and a slice of string name. Generated functions are shared if their bitwise encoding match, and the translation from an obj.Prog structure matches. The generated file is entirely self-contained, and does not require regenerating any other files for changes within it. If opcodes in a.out.go are reordered or changed, anames.go must be updated in the same way as before. Future improvements could shrink the generated opcode table to 32 bit entries as there is much less variation of the encoding of the prefix word, but it is not always identical for instructions which share a similar encoding of arguments (e.g PLWA and PLWZ). Updates #44549 Change-Id: Ie83fa02497c9ad2280678d68391043d3aae63175 Reviewed-on: https://go-review.googlesource.com/c/go/+/419535 Run-TryBot: Paul Murphy <murp@ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Jenny Rakoczy <jenny@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Jenny Rakoczy <jenny@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Jenny Rakoczy <jenny@golang.org>
2022-09-15cmd/asm, cmd/internal/obj/ppc64: increase asm test coverage for PPC64Archana R
This CL adds tests for some of the instructions that were missing. A minor change was made to asm9.go to ensure EXTSWSLICC test works. Change-Id: I95cd096c85778fc93856d213aa4fb14c35228cec Reviewed-on: https://go-review.googlesource.com/c/go/+/430376 Run-TryBot: Archana Ravindar <aravind5@in.ibm.com> Reviewed-by: Jenny Rakoczy <jenny@golang.org> Run-TryBot: Jenny Rakoczy <jenny@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Paul Murphy <murp@ibm.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Jenny Rakoczy <jenny@golang.org>
2022-09-07cmd/asm: improve argument to obj.Prog assignment on PPC64Paul E. Murphy
These can be simplified with the knowledge of how arguments are assigned to obj.Prog objects on PPC64. If the argument is not a register type, the Reg argument (a2 in optab) of obj.Prog is not used, and those arguments are placed into RestArgs (a3, a4, a5 in optab). This relaxes the special case handling enforced by IsPPC64RLD and IsPPC64ISEL. Instead, arguments are assigned as noted above, and incorrect usage of such opcodes is checked by optab rules, not by the assembler front-end. Likewise, add support for handling 6 argument opcodes, these do not exist today, but will be added with ISA 3.1 (Power10). Finally, to maintain backwards compatibility, some 4-arg opcodes whose middle arguments are a register and immediate, could swap these arguments and generate identical machine code. This likely wasn't intended, but is possible. These are explicitly fixed up in the backend, and the asm tests are extended to check these. Change-Id: I5f8190212427dfe8e6f062185bfefb5fa4fd0e75 Reviewed-on: https://go-review.googlesource.com/c/go/+/427516 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
2022-09-07cmd/asm,cmd/internal/obj/ppc64: recognize ppc64 ISA 3.1 MMA registersPaul E. Murphy
Allow the assembler frontend to match MMA register arguments added by ISA 3.1. The prefix "A" (for accumulator) is chosen to identify them. Updates #44549 Change-Id: I363e7d1103aee19d7966829d2079c3d876621efc Reviewed-on: https://go-review.googlesource.com/c/go/+/419534 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Paul Murphy <murp@ibm.com> Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-09-02cmd/internal/obj/riscv: fix comment typoWayne Zuo
Change-Id: Ica74977898f0af8c9abf42a003d8f02dbdc03d34 Reviewed-on: https://go-review.googlesource.com/c/go/+/427994 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joel Sing <joel@sing.id.au> Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2022-08-31cmd/internal/obj/arm64: allow transition from $0 to ZR for MSReric fang
Previously the first operand of MSR could be $0, which would be converted to the ZR register. This is prohibited by CL 404316, this CL restores this instruction format. Change-Id: I5b5be59e76aa58423a0fb96942d1b2a9de62e311 Reviewed-on: https://go-review.googlesource.com/c/go/+/426198 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Run-TryBot: Eric Fang <eric.fang@arm.com>
2022-08-30cmd/asm: add new classification for index memory operands on PPC64Archana R
When a base+displacement kind of operand is given in an index-mode instruction, the assembler does not flag it as an invalid instruction causing the user to get an incorrect encoding of that instruction leading to incorrect execution of the program. Enable assembler to recognize valid and invalid operands used in index mode instructions by classifying SOREG type into two further types XOREG (used uniquely in index addressing mode instructions) and SOREG for instructions working on base+displacement operands. Also cleaned up usage of obj.Addr.Scale on PPC64. Change-Id: Ib4d84343ae57477c6c074f44c4c2749496e11b91 Reviewed-on: https://go-review.googlesource.com/c/go/+/405542 Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Archana Ravindar <aravind5@in.ibm.com>
2022-08-30cmd/internal/obj/loong64: add ROTR, ROTRV instructions supportWayne Zuo
Reference: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html Change-Id: I29adb84eb70bffd963c79ed6957a5197896fb2bf Reviewed-on: https://go-review.googlesource.com/c/go/+/422316 Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2022-08-29cmd/compile/obj/arm64: fix encoding error of FMOVD/FMOVS $0|ZReric fang
Previously the first operand of FMOVD and FMOVS could be $0, which would be converted to the ZR register. This is prohibited by CL 404316, also it broken the encoding of "FMOVD/FMOVS ZR, Rn", this CL restores this instruction format and fixes the encoding issue. Fixes #54655. Fixes #54729. Change-Id: I9c42cd41296bed7ffd601609bd8ecaa27d11e659 Reviewed-on: https://go-review.googlesource.com/c/go/+/425188 Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-23cmd/internal/obj/loong64: add MASKEQZ and MASKNEZ instructions supportWayne Zuo
Change-Id: Ied16c3be47c863a94d46bd568191057ded4b7d0a Reviewed-on: https://go-review.googlesource.com/c/go/+/416734 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
2022-08-23cmd/internal/obj/arm64: remove the transition from $0 to ZReric fang
Previously we convert $0 to the ZR register for some reasons, which causes two problems: 1. Confusion, the special case of the ZR register needs to be considered when dealing with constants. For encoding, some places we encode ZR, and some places we encode $0, although we have converted $0 to ZR. 2. Unexpected instruction format. All instructions that support ZR register operands can be replaced by $0. This patch removes this conversion. Note that this patch may cause previously unintendedly supported instruction formats to no longer be supported. Change-Id: I3d8d2c06711b7614a38191397da7776417f1861c Reviewed-on: https://go-review.googlesource.com/c/go/+/404316 Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-23cmd/compile/internal/ssa: optimize memory moving on arm64eric fang
This CL optimizes memory moving with LDP and STP on arm64. Benchmarks: name old time/op new time/op delta ClearFat7-160 1.08ns ± 0% 0.95ns ± 0% -11.41% (p=0.008 n=5+5) ClearFat8-160 0.84ns ± 0% 0.84ns ± 0% -0.95% (p=0.008 n=5+5) ClearFat11-160 1.08ns ± 0% 0.95ns ± 0% -11.46% (p=0.008 n=5+5) ClearFat12-160 0.95ns ± 0% 0.95ns ± 0% ~ (p=0.063 n=4+5) ClearFat13-160 1.08ns ± 0% 0.95ns ± 0% -11.45% (p=0.008 n=5+5) ClearFat14-160 1.08ns ± 0% 0.95ns ± 0% -11.47% (p=0.008 n=5+5) ClearFat15-160 1.24ns ± 0% 0.95ns ± 0% -22.98% (p=0.029 n=4+4) ClearFat16-160 0.84ns ± 0% 0.83ns ± 0% -0.11% (p=0.008 n=5+5) ClearFat24-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal) ClearFat32-160 2.86ns ± 0% 2.86ns ± 0% ~ (p=0.333 n=5+4) ClearFat40-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal) ClearFat48-160 3.32ns ± 1% 3.31ns ± 1% ~ (p=0.690 n=5+5) ClearFat56-160 2.15ns ± 0% 2.15ns ± 0% ~ (all equal) ClearFat64-160 3.25ns ± 1% 3.26ns ± 1% ~ (p=0.841 n=5+5) ClearFat72-160 2.22ns ± 0% 2.22ns ± 0% ~ (p=0.444 n=5+5) ClearFat128-160 4.03ns ± 0% 4.04ns ± 0% +0.32% (p=0.008 n=5+5) ClearFat256-160 6.44ns ± 0% 6.44ns ± 0% +0.08% (p=0.016 n=4+5) ClearFat512-160 12.2ns ± 0% 12.2ns ± 0% +0.13% (p=0.008 n=5+5) ClearFat1024-160 24.3ns ± 0% 24.3ns ± 0% ~ (p=0.167 n=5+5) ClearFat1032-160 24.5ns ± 0% 24.5ns ± 0% ~ (p=0.238 n=4+5) ClearFat1040-160 29.2ns ± 0% 29.3ns ± 0% +0.34% (p=0.008 n=5+5) CopyFat7-160 1.43ns ± 0% 1.07ns ± 0% -24.97% (p=0.008 n=5+5) CopyFat8-160 0.89ns ± 0% 0.89ns ± 0% ~ (p=0.238 n=5+5) CopyFat11-160 1.43ns ± 0% 1.07ns ± 0% -24.97% (p=0.008 n=5+5) CopyFat12-160 1.07ns ± 0% 1.07ns ± 0% ~ (p=0.238 n=5+4) CopyFat13-160 1.43ns ± 0% 1.07ns ± 0% ~ (p=0.079 n=4+5) CopyFat14-160 1.43ns ± 0% 1.07ns ± 0% -24.95% (p=0.008 n=5+5) CopyFat15-160 1.79ns ± 0% 1.07ns ± 0% ~ (p=0.079 n=4+5) CopyFat16-160 1.07ns ± 0% 1.07ns ± 0% ~ (p=0.444 n=5+5) CopyFat24-160 1.84ns ± 2% 1.67ns ± 0% -9.28% (p=0.008 n=5+5) CopyFat32-160 3.22ns ± 0% 2.92ns ± 0% -9.40% (p=0.008 n=5+5) CopyFat64-160 3.64ns ± 0% 3.57ns ± 0% -1.96% (p=0.008 n=5+5) CopyFat72-160 3.56ns ± 0% 3.11ns ± 0% -12.89% (p=0.008 n=5+5) CopyFat128-160 5.06ns ± 0% 5.06ns ± 0% +0.04% (p=0.048 n=5+5) CopyFat256-160 9.13ns ± 0% 9.13ns ± 0% ~ (p=0.659 n=5+5) CopyFat512-160 17.4ns ± 0% 17.4ns ± 0% ~ (p=0.167 n=5+5) CopyFat520-160 17.2ns ± 0% 17.3ns ± 0% +0.37% (p=0.008 n=5+5) CopyFat1024-160 34.1ns ± 0% 34.0ns ± 0% ~ (p=0.127 n=5+5) CopyFat1032-160 80.9ns ± 0% 34.2ns ± 0% -57.74% (p=0.008 n=5+5) CopyFat1040-160 94.4ns ± 0% 41.7ns ± 0% -55.78% (p=0.016 n=5+4) Change-Id: I14186f9f82b0ecf8b6c02191dc5da566b9a21e6c Reviewed-on: https://go-review.googlesource.com/c/go/+/421654 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Eric Fang <eric.fang@arm.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-18all: remove duplicate "the" words in commentsAbirdcfly
Following CL 424454, using command rg --multiline " the\s{1,}the " * rg --multiline " the\s{1,}//\s{1,}the " * all the words "the" that are repeated in comments are found. Change-Id: I60b769b98f04c927b4c228e10f37faf190964069 Reviewed-on: https://go-review.googlesource.com/c/go/+/423836 Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Filippo Valsorda <filippo@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org>
2022-08-16runtime: process ptr bitmaps one word at a timeKeith Randall
[This is a retry of CL 407036 + its revert CL 422394. The only content change is the 1-line change in cmd/internal/obj/objfile.go.] Read the bitmaps one uintptr at a time instead of one byte at a time. Performance so far: Allocation heavy, no retention: ~30% faster in heapBitsSetType Scan heavy, ~no allocation: ~even in scanobject Change-Id: I04d899e1dbd23e989e9f552cdc1880318779c14c Reviewed-on: https://go-review.googlesource.com/c/go/+/422635 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>