path: root/src/runtime/mkduff.go
Age | Commit message | Author
2023-01-24 | runtime: use explicit NOFRAME on windows/amd64 | qmuntal
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #57302 Updates #40044 Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61 Reviewed-on: https://go-review.googlesource.com/c/go/+/459395 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2022-05-19 | runtime: implement duffzero/duffcopy for linux/loong64 | Xiaodong Liu
Contributors to the loong64 port are: Weining Lu <luweining@loongson.cn> Lei Wang <wanglei@loongson.cn> Lingqin Gong <gonglingqin@loongson.cn> Xiaolin Zhao <zhaoxiaolin@loongson.cn> Meidan Li <limeidan@loongson.cn> Xiaojuan Zhai <zhaixiaojuan@loongson.cn> Qiyuan Pu <puqiyuan@loongson.cn> Guoqi Chen <chenguoqi@loongson.cn> This port has been updated to Go 1.15.6: https://github.com/loongson/go Updates #46229 Change-Id: Ida040e76dc8172f60e6aee1ea2b5bce13ab3581e Reviewed-on: https://go-review.googlesource.com/c/go/+/368077 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
2022-03-30 | runtime: add runtime changes for register ABI on riscv64 | Meng Zhuo
This CL adds
- spill functions used by runtime
- ABIInternal to functions

Adding new stubs_riscv64 file to eliminate vet issues while compiling.

Change-Id: I2a9f6088a1cd2d9708f26b2d97895b4e5f9f87e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/360296
Trust: mzh <mzh@golangcn.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-03-10 | runtime,cmd/compile: change reg duff{zero,copy} for regabi riscv64 | Meng Zhuo
As CL 356519 requires, X8-X23 will be argument registers; however, X10 and X11 are used by the Duff's devices. This CL changes X10 and X11 to X24 and X25 to meet that prerequisite.

Update #40724

Change-Id: Ie9b899afbba7e9a51bb7dacd89e49ca1c1fc33ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/357976
Trust: mzh <mzh@golangcn.org>
Run-TryBot: mzh <mzh@golangcn.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Joel Sing <joel@sing.id.au>
2021-10-28 | all: manual fixups for //go:build vs // +build | Russ Cox
Update many generators, also handle files that were not part of the standard build during 'go fix' in CL 344955. Fixes #41184. Change-Id: I1edc684e8101882dcd11f75c6745c266fccfe9e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/359476 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2021-10-28 | all: go fix -fix=buildtag std cmd (except for bootstrap deps, vendor) | Russ Cox
When these packages are released as part of Go 1.18, Go 1.16 will no longer be supported, so we can remove the +build tags in these files. Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs as defined in src/cmd/dist/buildtool.go, which need to continue to build with Go 1.4 for now. Also reverted src/vendor and src/cmd/vendor, which will need to be updated in their own repos first. Manual changes in runtime/pprof/mprof_test.go to adjust line numbers. For #41184. Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e Reviewed-on: https://go-review.googlesource.com/c/go/+/344955 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Bryan C. Mills <bcmills@google.com>
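The relationship between the old `// +build` lines and the `//go:build` lines that replaced them can be illustrated with the standard `go/build/constraint` package. This is a minimal sketch, not the code that `go fix -fix=buildtag` runs; the helper name `toGoBuild` and the sample constraint lines are made up for illustration.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// toGoBuild (hypothetical helper) parses a legacy "// +build" line and
// renders the same boolean expression in "//go:build" syntax.
func toGoBuild(line string) (string, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return "", err
	}
	return "//go:build " + expr.String(), nil
}

func main() {
	// Sample lines chosen for illustration only.
	for _, line := range []string{
		"// +build ignore",
		"// +build linux,amd64",
	} {
		out, err := toGoBuild(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s  =>  %s\n", line, out)
	}
}
```

In the legacy syntax a comma means AND and a space means OR, which is why the second sample becomes an `&&` expression.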
2021-09-23 | cmd/compile: enable reg args and add duffcopy support on ppc64x | Lynn Boger
This adds support for duffcopy on ppc64x and updates the ssa/config.go file to enable register args and recognize the duffDevice is available on ppc64x. Change-Id: Ifc472cc9cc19c9a80e468fb52078c75f7dd44d36 Reviewed-on: https://go-review.googlesource.com/c/go/+/351490 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2021-06-02 | [dev.typeparams] runtime: mark assembly functions called directly from compiler ABIInternal | Cherry Mui
For functions such as gcWriteBarrier and panicIndexXXX, the compiler generates ABIInternal calls directly. They must not use wrappers, because they follow a special calling convention or the caller's PC is used. Mark them as ABIInternal.

Note that even though they are marked as ABIInternal, they don't actually use the internal ABI, i.e. regabiargs is not honored for now.

Now all.bash passes with GOEXPERIMENT=regabiwrappers (at least on macOS).

Change-Id: I87e41964e6dc4efae03e8eb636ae9fa1d99285bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/323934
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-05-13 | all: add //go:build lines to assembly files | Tobias Klauser
Don't add them to files in vendor and cmd/vendor though. These will be pulled in by updating the respective dependencies. For #41184 Change-Id: Icc57458c9b3033c347124323f33084c85b224c70 Reviewed-on: https://go-review.googlesource.com/c/go/+/319389 Trust: Tobias Klauser <tobias.klauser@gmail.com> Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
2021-02-20 | all: go fmt std cmd (but revert vendor) | Russ Cox
Make all our package sources use Go 1.17 gofmt format (adding //go:build lines). Part of //go:build change (#41184). See https://golang.org/design/draft-gobuild Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/294430 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-02-03 | [dev.regabi] cmd/compile: reserve X15 as zero register on AMD64 | Cherry Zhang
In ABIInternal, reserve X15 as constant zero, and use it to zero memory. (Maybe there can be more use of it?) The register is zeroed when transitioning to ABIInternal from ABI0.

Caveat: using X15 generates longer instructions than using X0. Maybe we want to use X0?

Change-Id: I12d5ee92a01fc0b59dad4e5ab023ac71bc2a8b7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/288093
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2020-12-09 | all: update to use os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp | Russ Cox
As part of #42026, these helpers from io/ioutil were moved to os. (ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.) Update the Go tree to use the preferred names. As usual, code compiled with the Go 1.4 bootstrap toolchain and code vendored from other sources is excluded. ReadDir changes are in a separate CL, because they are not a simple search and replace. For #42026. Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/266365 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
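The renames listed in this commit are mechanical. The sketch below shows the four `os` functions side by side with the `io/ioutil` names they replaced; the helper `roundTrip` and the file names are hypothetical and exist only to exercise the calls.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// roundTrip (hypothetical helper) writes data into a fresh temporary
// directory, reads it back, and cleans up afterwards.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "example-") // was ioutil.TempDir
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	// Named file inside the temporary directory.
	name := filepath.Join(dir, "out.txt")
	if err := os.WriteFile(name, data, 0o644); err != nil { // was ioutil.WriteFile
		return nil, err
	}

	// Anonymous scratch file; the "*" is replaced by a random string.
	f, err := os.CreateTemp(dir, "scratch-*") // was ioutil.TempFile
	if err != nil {
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}

	return os.ReadFile(name) // was ioutil.ReadFile
}

func main() {
	got, err := roundTrip([]byte("duffzero"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read back %d bytes\n", len(got))
}
```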
2020-10-28 | cmd/compile,cmd/internal/obj/riscv,runtime: use Duff's devices on riscv64 | Michał Derkacz
Implement runtime.duffzero and runtime.duffcopy for riscv64. Use obj.ADUFFZERO/obj.ADUFFCOPY for medium size, word aligned zeroing/moving. Change-Id: I42ec622055630c94cb77e286d8d33dbe7c9f846c Reviewed-on: https://go-review.googlesource.com/c/go/+/237797 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Joel Sing <joel@sing.id.au> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-07 | runtime: remove outdated comment in mkduff.go about usage of STOSQ | Martin Möhrmann
Change-Id: I71966cc5def4615d64876165872e5e7f2956b270 Reviewed-on: https://go-review.googlesource.com/c/go/+/253397 Run-TryBot: Martin Möhrmann <martisch@uos.de> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-03-31 | runtime: generate dummy duffcopy | Cherry Zhang
Although duffcopy is not used on PPC64, duff_ppc64x.s and mkduff.go don't match. Make it so. Fixes #38188. Change-Id: Ic6c08e335795ea407880efd449f4229696af7744 Reviewed-on: https://go-review.googlesource.com/c/go/+/226719 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-08-28 | runtime, cmd/compile: implement and use DUFFCOPY on MIPS64 | Meng Zhuo
OS: Linux loongson 3.10.84 mips64el
CPU: Loongson 3A3000 quad core

name                   old time/op    new time/op    delta
BinaryTree17           23.5s ± 1%     23.2s ± 0%     -1.12%  (p=0.008 n=5+5)
Fannkuch11             10.2s ± 0%     10.1s ± 0%     -0.19%  (p=0.008 n=5+5)
FmtFprintfEmpty        450ns ± 0%     446ns ± 1%     -0.89%  (p=0.024 n=5+5)
FmtFprintfString       722ns ± 1%     721ns ± 1%     ~       (p=0.762 n=5+5)
FmtFprintfInt          693ns ± 2%     691ns ± 2%     ~       (p=0.889 n=5+5)
FmtFprintfIntInt       912ns ± 1%     911ns ± 0%     ~       (p=0.722 n=5+5)
FmtFprintfPrefixedInt  1.35µs ± 2%    1.35µs ± 2%    ~       (p=1.000 n=5+5)
FmtFprintfFloat        1.79µs ± 0%    1.78µs ± 0%    ~       (p=0.683 n=5+5)
FmtManyArgs            3.46µs ± 1%    3.48µs ± 1%    ~       (p=0.246 n=5+5)
GobDecode              48.8ms ± 1%    48.6ms ± 0%    ~       (p=0.222 n=5+5)
GobEncode              37.7ms ± 1%    37.4ms ± 1%    ~       (p=0.095 n=5+5)
Gzip                   1.72s ± 1%     1.72s ± 0%     ~       (p=0.905 n=5+4)
Gunzip                 342ms ± 0%     342ms ± 0%     ~       (p=0.421 n=5+5)
HTTPClientServer       219µs ± 1%     219µs ± 1%     ~       (p=1.000 n=5+5)
JSONEncode             89.1ms ± 1%    89.4ms ± 1%    ~       (p=0.222 n=5+5)
JSONDecode             292ms ± 1%     291ms ± 0%     ~       (p=0.421 n=5+5)
Mandelbrot200          15.7ms ± 0%    15.6ms ± 0%    ~       (p=0.690 n=5+5)
GoParse                19.5ms ± 1%    19.6ms ± 1%    ~       (p=0.310 n=5+5)
RegexpMatchEasy0_32    534ns ± 1%     529ns ± 1%     ~       (p=0.056 n=5+5)
RegexpMatchEasy0_1K    2.75µs ± 0%    2.74µs ± 0%    -0.46%  (p=0.008 n=5+5)
RegexpMatchEasy1_32    572ns ± 2%     565ns ± 3%     ~       (p=0.310 n=5+5)
RegexpMatchEasy1_1K    4.15µs ± 0%    4.15µs ± 1%    ~       (p=0.548 n=5+5)
RegexpMatchMedium_32   31.2ns ± 0%    31.1ns ± 0%    -0.45%  (p=0.016 n=5+4)
RegexpMatchMedium_1K   235µs ± 1%     235µs ± 0%     ~       (p=1.000 n=5+5)
RegexpMatchHard_32     13.9µs ± 1%    13.5µs ± 1%    -2.74%  (p=0.008 n=5+5)
RegexpMatchHard_1K     416µs ± 2%     410µs ± 2%     ~       (p=0.056 n=5+5)
Revcomp                6.36s ± 0%     6.34s ± 0%     -0.31%  (p=0.008 n=5+5)
Template               352ms ± 1%     353ms ± 0%     +0.45%  (p=0.032 n=5+5)
TimeParse              2.04µs ± 4%    2.01µs ± 0%    ~       (p=0.056 n=5+5)
TimeFormat             2.97µs ± 0%    2.97µs ± 0%    ~       (p=1.000 n=5+5)

name                   old speed      new speed      delta
GobDecode              15.7MB/s ± 1%  15.8MB/s ± 0%  ~       (p=0.206 n=5+5)
GobEncode              20.4MB/s ± 1%  20.5MB/s ± 1%  ~       (p=0.056 n=5+5)
Gzip                   11.3MB/s ± 1%  11.3MB/s ± 0%  ~       (p=0.841 n=5+4)
Gunzip                 56.7MB/s ± 0%  56.8MB/s ± 0%  ~       (p=0.389 n=5+5)
JSONEncode             21.8MB/s ± 1%  21.7MB/s ± 1%  ~       (p=0.246 n=5+5)
JSONDecode             6.66MB/s ± 0%  6.67MB/s ± 0%  ~       (p=0.857 n=4+5)
GoParse                2.97MB/s ± 1%  2.96MB/s ± 1%  ~       (p=0.238 n=5+5)
RegexpMatchEasy0_32    59.9MB/s ± 1%  60.5MB/s ± 1%  +0.92%  (p=0.032 n=5+5)
RegexpMatchEasy0_1K    372MB/s ± 0%   374MB/s ± 0%   +0.46%  (p=0.008 n=5+5)
RegexpMatchEasy1_32    56.0MB/s ± 2%  56.7MB/s ± 3%  ~       (p=0.310 n=5+5)
RegexpMatchEasy1_1K    247MB/s ± 0%   247MB/s ± 1%   ~       (p=0.548 n=5+5)
RegexpMatchMedium_32   32.0MB/s ± 0%  32.1MB/s ± 0%  ~       (p=0.135 n=5+5)
RegexpMatchMedium_1K   4.35MB/s ± 1%  4.35MB/s ± 1%  ~       (p=0.825 n=5+5)
RegexpMatchHard_32     2.30MB/s ± 1%  2.37MB/s ± 1%  +2.78%  (p=0.008 n=5+5)
RegexpMatchHard_1K     2.47MB/s ± 1%  2.50MB/s ± 2%  ~       (p=0.095 n=5+5)
Revcomp                40.0MB/s ± 0%  40.1MB/s ± 0%  +0.31%  (p=0.016 n=5+5)
Template               5.51MB/s ± 1%  5.49MB/s ± 0%  ~       (p=0.190 n=5+5)

Change-Id: I540a2e4e7992376ce04f93b332f64fc3b6071237
Reviewed-on: https://go-review.googlesource.com/c/go/+/185078
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-06-26 | cmd/compile, runtime: use R20, R21 in ARM64's Duff's devices | Cherry Zhang
Currently we use R16 and R17 for ARM64's Duff's devices. According to the ARM64 ABI, R16 and R17 can be used by the (external) linker as scratch registers in trampolines. So don't use these registers to pass information across functions.

It seems unlikely that calling Duff's devices would need a trampoline in normal cases. But it could happen if the call target is out of the 128 MB direct jump limit.

The choice of R20 and R21 is kind of arbitrary. The register allocator allocates from low-numbered registers. High-numbered registers are chosen because they are unlikely to hold a live value, so using them is unlikely to force a spill.

Fixes #32773.

Change-Id: Id22d555b5afeadd4efcf62797d1580d641c39218
Reviewed-on: https://go-review.googlesource.com/c/go/+/183842
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2018-03-06 | runtime, cmd/compile: use ldp for DUFFCOPY on ARM64 | Meng Zhuo
name         old time/op  new time/op  delta
CopyFat8     2.15ns ± 1%  2.19ns ± 6%  ~        (p=0.171 n=8+9)
CopyFat12    2.15ns ± 0%  2.17ns ± 2%  ~        (p=0.137 n=8+10)
CopyFat16    2.17ns ± 3%  2.15ns ± 0%  ~        (p=0.211 n=10+10)
CopyFat24    2.16ns ± 1%  2.15ns ± 0%  ~        (p=0.087 n=10+10)
CopyFat32    11.5ns ± 0%  12.8ns ± 2%  +10.87%  (p=0.000 n=8+10)
CopyFat64    20.2ns ± 2%  12.9ns ± 0%  -36.11%  (p=0.000 n=10+10)
CopyFat128   37.2ns ± 0%  21.5ns ± 0%  -42.20%  (p=0.000 n=10+10)
CopyFat256   71.6ns ± 0%  38.7ns ± 0%  -45.95%  (p=0.000 n=10+10)
CopyFat512   140ns ± 0%   73ns ± 0%    -47.86%  (p=0.000 n=10+9)
CopyFat520   142ns ± 0%   74ns ± 0%    -47.54%  (p=0.000 n=10+10)
CopyFat1024  277ns ± 0%   141ns ± 0%   -49.10%  (p=0.000 n=10+10)

Change-Id: If54bc571add5db674d5e081579c87e80153d0a5a
Reviewed-on: https://go-review.googlesource.com/97395
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-02-12 | runtime: use NOFRAME on mips and mips64 | Austin Clements
This replaces frame size -4/-8 with the NOFRAME flag in mips and mips64 assembly. This was automated with: sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-[84]/\1|NOFRAME,\2$0/' $(find -name '*_mips*.s') Plus a manual fix to mkduff.go. The go binary is identical on both architectures before and after this change. Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227efb Reviewed-on: https://go-review.googlesource.com/92044 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
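The effect of the sed expression quoted in this commit can be reproduced with Go's `regexp` package. This is a sketch of the same substitution, not the tooling used for the CL; the function name `useNoFrame` and the sample TEXT line are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// negFrame mirrors the sed pattern: a TEXT line whose flag list ends in an
// upper-case letter, followed by a comma, optional spaces, and a frame size
// of $-8 or $-4.
var negFrame = regexp.MustCompile(`^(TEXT.*[A-Z]),( *)\$-[84]`)

// useNoFrame (hypothetical helper) appends |NOFRAME to the flags and
// rewrites the negative frame size to $0. In a Go replacement template,
// "$$" stands for a literal dollar sign.
func useNoFrame(line string) string {
	return negFrame.ReplaceAllString(line, "${1}|NOFRAME,${2}$$0")
}

func main() {
	// Illustrative input lines, not copied from the repository.
	fmt.Println(useNoFrame("TEXT runtime·duffzero(SB), NOSPLIT, $-8-0"))
	fmt.Println(useNoFrame("\tMOVV\tR0, 8(R1)"))
}
```

Lines that do not match the pattern, such as ordinary instructions, pass through unchanged.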
2018-02-12 | runtime: use NOFRAME on arm64 | Austin Clements
This replaces frame size -8 with the NOFRAME flag in arm64 assembly. This was automated with: sed -i -e 's/\(^TEXT.*[A-Z]\),\( *\)\$-8/\1|NOFRAME,\2$0/' $(find -name '*_arm64.s') Plus a manual fix to mkduff.go. The go binary is identical before and after this change. Change-Id: I0310384d1a584118c41d1cd3a042bb8ea7227efa Reviewed-on: https://go-review.googlesource.com/92043 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-25 | cmd/compile: memory clearing optimization for arm64 | Wei Xiao
Use "STP (ZR, ZR), O(R)" instead of "MOVD ZR, O(R)" to implement memory clearing. Also improve assembler support for STP/LDP.

Results (A57@2GHzx8):

benchmark                old ns/op  new ns/op  delta
BenchmarkClearFat8-8     1.00       1.00       +0.00%
BenchmarkClearFat12-8    1.01       1.01       +0.00%
BenchmarkClearFat16-8    1.01       1.01       +0.00%
BenchmarkClearFat24-8    1.52       1.52       +0.00%
BenchmarkClearFat32-8    3.00       2.02       -32.67%
BenchmarkClearFat40-8    3.50       2.52       -28.00%
BenchmarkClearFat48-8    3.50       3.03       -13.43%
BenchmarkClearFat56-8    4.00       3.50       -12.50%
BenchmarkClearFat64-8    4.25       4.00       -5.88%
BenchmarkClearFat128-8   8.01       8.01       +0.00%
BenchmarkClearFat256-8   16.1       16.0       -0.62%
BenchmarkClearFat512-8   32.1       32.0       -0.31%
BenchmarkClearFat1024-8  64.1       64.1       +0.00%

Change-Id: Ie5f5eac271ff685884775005825f206167a5c146
Reviewed-on: https://go-review.googlesource.com/55610
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-08-16 | cmd/compile/internal/ssa: use sse to zero on amd64 | Ilya Tocar
Use 16-byte stores instead of 8-byte stores to zero small blocks. Also switch to duffzero for 65+ bytes only, because for each duffzero call we also save/restore BP, so call requires 4 instructions and replacing it with 4 sse stores doesn't cause code-bloat. Also switch duffzero to use leaq, instead of addq to avoid clobbering flags.

ClearFat8-6     0.54ns ± 0%  0.54ns ± 0%  ~        (all equal)
ClearFat12-6    1.07ns ± 0%  1.07ns ± 0%  ~        (all equal)
ClearFat16-6    1.07ns ± 0%  0.69ns ± 0%  -35.51%  (p=0.001 n=8+9)
ClearFat24-6    1.61ns ± 1%  1.07ns ± 0%  -33.33%  (p=0.000 n=10+10)
ClearFat32-6    2.14ns ± 0%  1.07ns ± 0%  -50.00%  (p=0.001 n=8+9)
ClearFat40-6    2.67ns ± 1%  1.61ns ± 0%  -39.72%  (p=0.000 n=10+8)
ClearFat48-6    3.75ns ± 0%  2.68ns ± 0%  -28.59%  (p=0.000 n=9+9)
ClearFat56-6    4.29ns ± 0%  3.22ns ± 0%  -25.10%  (p=0.000 n=9+9)
ClearFat64-6    4.30ns ± 0%  3.22ns ± 0%  -25.15%  (p=0.000 n=8+8)
ClearFat128-6   7.50ns ± 1%  7.51ns ± 0%  ~        (p=0.767 n=10+9)
ClearFat256-6   13.9ns ± 1%  13.9ns ± 1%  ~        (p=0.257 n=10+10)
ClearFat512-6   26.8ns ± 0%  26.8ns ± 0%  ~        (p=0.467 n=8+8)
ClearFat1024-6  52.5ns ± 0%  52.5ns ± 0%  ~        (p=1.000 n=8+8)

Also shaves ~20kb from go tool:

go_old 10384994
go_new 10364514 [-20480 bytes]

section differences
global text (code) = -20585 bytes (-0.532047%)
read-only data = -302 bytes (-0.018101%)
Total difference -20887 bytes (-0.348731%)

Change-Id: I15854e87544545c1af24775df895e38e16e12694
Reviewed-on: https://go-review.googlesource.com/54410
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2017-06-13 | runtime, unicode: use consistent banner for generated code | Brad Fitzpatrick
Per golang.org/s/generatedcode Updates #nnn Change-Id: Ia7513ef6bd26c20b62b57b29f7770684a315d389 Reviewed-on: https://go-review.googlesource.com/45470 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Matt Layher <mdlayher@gmail.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-12 | runtime: update mkduff legacy comments | Emmanuel Odeke
Update comments for duffzero and duffcopy which referred to legacy locations:
+ cmd/?g/cgen.go
+ cmd/?g/ggen.go

Remnants of the old days when we had 5g, 6g etc. Those locations have since moved to:
+ cmd/compile/internal/<arch>/cgen.go
+ cmd/compile/internal/<arch>/ggen.go

Change-Id: Ie2ea668559d52d42b747260ea69a6d5b3d70e859
Reviewed-on: https://go-review.googlesource.com/29073
Reviewed-by: Russ Cox <rsc@golang.org>
2016-09-27 | runtime, cmd/compile: implement and use DUFFCOPY on ARM64 | Cherry Zhang
Change-Id: I8984eac30e5df78d4b94f19412135d3cc36969f8 Reviewed-on: https://go-review.googlesource.com/29910 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2015-11-12 | runtime: added assembly part of linux/mips64{,le} support | Yao Zhang
Change-Id: I9e94027ef66c88007107de2b2b75c3d7cf1352af Reviewed-on: https://go-review.googlesource.com/14467 Reviewed-by: Minux Ma <minux@golang.org>
2015-10-18 | cmd/internal/obj, runtime: add NOFRAME flag to suppress stack frame set up on ppc64x | Michael Hudson-Doyle
Replace the confusing game where a frame size of $-8 would suppress the implicit setting up of a stack frame with a nice explicit flag. The code to set up the function prologue is still a little confusing but better than it was.

Change-Id: I1d49278ff42c6bc734ebfb079998b32bc53f8d9a
Reviewed-on: https://go-review.googlesource.com/15670
Reviewed-by: Minux Ma <minux@golang.org>
2015-09-22 | runtime: optimize duffcopy on amd64 | Ilya Tocar
Use movups to copy 16 bytes at a time.

Results (haswell):

name            old time/op  new time/op  delta
CopyFat8-48     0.62ns ± 3%  0.63ns ± 3%  ~        (p=0.535 n=20+20)
CopyFat12-48    0.92ns ± 2%  0.93ns ± 3%  ~        (p=0.594 n=17+18)
CopyFat16-48    1.23ns ± 2%  1.23ns ± 2%  ~        (p=0.839 n=20+19)
CopyFat24-48    1.85ns ± 2%  1.84ns ± 0%  -0.48%   (p=0.014 n=19+20)
CopyFat32-48    2.45ns ± 0%  2.45ns ± 1%  ~        (p=1.000 n=16+16)
CopyFat64-48    3.30ns ± 2%  2.14ns ± 1%  -35.00%  (p=0.000 n=20+18)
CopyFat128-48   6.05ns ± 0%  3.98ns ± 0%  -34.22%  (p=0.000 n=18+17)
CopyFat256-48   11.9ns ± 3%  7.7ns ± 0%   -35.87%  (p=0.000 n=20+17)
CopyFat512-48   23.0ns ± 2%  15.1ns ± 2%  -34.52%  (p=0.000 n=20+18)
CopyFat1024-48  44.8ns ± 1%  29.8ns ± 2%  -33.48%  (p=0.000 n=17+19)

Change-Id: I8a78773c656d400726a020894461e00c59f896bf
Reviewed-on: https://go-review.googlesource.com/14836
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
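For readers unfamiliar with the CopyFat benchmark names in these tables, the sketch below shows the general shape of such a measurement: copying a fixed-size array by value in a loop. It is an assumption-laden illustration, not the runtime's own benchmark source; `benchCopyFat64` and `sink` are made-up names, and whether a given size is lowered to duffcopy depends on the compiler version and target architecture.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level destination so the copy stays observable and the
// compiler is less likely to eliminate it.
var sink [64 / 4]uint32

// benchCopyFat64 (hypothetical name) copies a 64-byte array by value on
// every iteration.
func benchCopyFat64(b *testing.B) {
	var x [64 / 4]uint32
	for i := 0; i < b.N; i++ {
		sink = x
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test" and picks
	// the iteration count itself.
	r := testing.Benchmark(benchCopyFat64)
	fmt.Println("CopyFat64:", r)
}
```

The printed timing varies by machine, so it is not comparable with the figures in the commit messages.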
2015-09-16 | runtime: optimize duffzero for amd64. | Ilya Tocar
Use MOVUPS to zero 16 bytes at a time.

Results (haswell):

name             old time/op  new time/op  delta
ClearFat8-48     0.62ns ± 2%  0.62ns ± 1%  ~        (p=0.085 n=20+15)
ClearFat12-48    0.93ns ± 2%  0.93ns ± 2%  ~        (p=0.757 n=19+19)
ClearFat16-48    1.23ns ± 1%  1.23ns ± 1%  ~        (p=0.896 n=19+17)
ClearFat24-48    1.85ns ± 2%  1.84ns ± 0%  -0.51%   (p=0.023 n=20+15)
ClearFat32-48    2.45ns ± 0%  2.46ns ± 2%  ~        (p=0.053 n=17+18)
ClearFat40-48    1.99ns ± 0%  0.92ns ± 2%  -53.54%  (p=0.000 n=19+20)
ClearFat48-48    2.15ns ± 1%  0.92ns ± 2%  -56.93%  (p=0.000 n=19+20)
ClearFat56-48    2.46ns ± 1%  1.23ns ± 0%  -49.98%  (p=0.000 n=19+14)
ClearFat64-48    2.76ns ± 0%  2.14ns ± 1%  -22.21%  (p=0.000 n=17+17)
ClearFat128-48   5.21ns ± 0%  3.99ns ± 0%  -23.46%  (p=0.000 n=17+19)
ClearFat256-48   10.3ns ± 4%  7.7ns ± 0%   -25.37%  (p=0.000 n=20+17)
ClearFat512-48   20.2ns ± 4%  15.0ns ± 1%  -25.58%  (p=0.000 n=20+17)
ClearFat1024-48  39.7ns ± 2%  29.7ns ± 0%  -25.05%  (p=0.000 n=19+19)

Change-Id: I200401eec971b2dd2450c0651c51e378bd982405
Reviewed-on: https://go-review.googlesource.com/14408
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-29 | cmd/internal/gc, cmd/[56789]g: rename stackcopy to blockcopy | Shenghou Ma
To avoid confusion with the runtime concept of copying stack. Change-Id: I33442377b71012c2482c2d0ddd561492c71e70d0 Reviewed-on: https://go-review.googlesource.com/8639 Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-15 | cmd/6g, runtime: improve duffzero throughput | Josh Bleecher Snyder
It is faster to execute

MOVQ AX,(DI)
MOVQ AX,8(DI)
MOVQ AX,16(DI)
MOVQ AX,24(DI)
ADDQ $32,DI

than

STOSQ
STOSQ
STOSQ
STOSQ

However, in order to be able to jump into the middle of a block of MOVQs, the call site needs to pre-adjust DI.

If we're clearing a small area, the cost of that DI pre-adjustment isn't repaid.

This CL switches the DUFFZERO implementation to use a hybrid strategy, in which small clears use STOSQ as before, but large clears use mostly MOVQ/ADDQ blocks.

benchmark              old ns/op  new ns/op  delta
BenchmarkClearFat8     0.55       0.55       +0.00%
BenchmarkClearFat12    0.82       0.83       +1.22%
BenchmarkClearFat16    0.55       0.55       +0.00%
BenchmarkClearFat24    0.82       0.82       +0.00%
BenchmarkClearFat32    2.20       1.94       -11.82%
BenchmarkClearFat40    1.92       1.66       -13.54%
BenchmarkClearFat48    2.21       1.93       -12.67%
BenchmarkClearFat56    3.03       2.20       -27.39%
BenchmarkClearFat64    3.26       2.48       -23.93%
BenchmarkClearFat72    3.57       2.76       -22.69%
BenchmarkClearFat80    3.83       3.05       -20.37%
BenchmarkClearFat88    4.14       3.30       -20.29%
BenchmarkClearFat128   5.54       4.69       -15.34%
BenchmarkClearFat256   9.95       9.09       -8.64%
BenchmarkClearFat512   18.7       17.9       -4.28%
BenchmarkClearFat1024  36.2       35.4       -2.21%

Change-Id: Ic786406d9b3cab68d5a231688f9e66fcd1bd7103
Reviewed-on: https://go-review.googlesource.com/2585
Reviewed-by: Keith Randall <khr@golang.org>
2015-04-02 | runtime: auto-generate duff routines | Josh Bleecher Snyder
This makes it easier to experiment with alternative implementations. While we're here, update the comments. No functional changes. Passes toolstash -cmp. Change-Id: I428535754908f0fdd7cc36c214ddb6e1e60f376e Reviewed-on: https://go-review.googlesource.com/8310 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>