| Age | Commit message | Author |
|
Binutils defaults to exporting all symbols when building a Windows DLL.
To avoid that we were marking symbols with __declspec(dllexport) in
the cgo-generated headers, which instructs ld to export only those
symbols. However, that approach makes the headers hard to reuse when
importing the resulting DLL into other projects, as imported symbols
should be marked with __declspec(dllimport).
A better approach is to generate a .def file listing the symbols to
export, which gets the same effect without having to modify the headers.
Updates #30674
Fixes #56994
Change-Id: I22bd0aa079e2be4ae43b13d893f6b804eaeddabf
Reviewed-on: https://go-review.googlesource.com/c/go/+/705776
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
Reviewed-by: Than McIntosh <thanm@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
When calling into C via cmd/cgo, the generated code calls
_cgo_tsan_acquire / _cgo_tsan_release around the C call to report a
dummy lock to the C/C++ TSAN runtime. This is necessary because the
C/C++ TSAN runtime does not understand synchronization within Go and
would otherwise report false positive race reports. See the comment in
cmd/cgo/out.go for more details.
Various C functions in runtime/cgo also contain manual calls to
_cgo_tsan_acquire/release where necessary to suppress race reports.
However, the cgo symbolizer and cgo traceback functions called from
callCgoSymbolizer and cgoContextPCs, respectively, do not have any
instrumentation [1]. They call directly into user C functions with no
TSAN instrumentation.
This means they have an opportunity to report false race conditions. The
most direct way is via their argument. Both are passed a pointer to a
struct stored on the Go stack, and both write to fields of the struct.
If two calls are passed the same pointer from different threads, the C
TSAN runtime will think this is a race.
This is simple to achieve for the cgo symbolizer function, which the
new regression test does. callCgoSymbolizer is called on the standard
goroutine stack, so the argument is a pointer into the goroutine stack.
If the goroutine moves Ms between two calls, it will look like a race.
On the other hand, cgoContextPCs is called on the system stack. Each M
has a unique system stack, so for it to pass the same argument pointer
on different threads would require the first M to exit, free its stack,
and the same region of address space to be used as the stack for a new
M. Theoretically possible, but quite unlikely.
Both of these are addressed by providing a C wrapper in runtime/cgo that
calls _cgo_tsan_acquire/_cgo_tsan_release around calls to the symbolizer
and traceback functions.
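The wrapper idea can be sketched roughly as follows. This is a minimal illustration, not the actual runtime/cgo code: `tsan_acquire`, `tsan_release`, `symbolizer_arg`, and `call_symbolizer_wrapped` are hypothetical names, and the acquire/release stand-ins only count calls where the real ones would report to the TSAN runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for _cgo_tsan_acquire/_cgo_tsan_release. The real
 * ones report a dummy lock to the C/C++ TSAN runtime; these only count. */
static int acquire_calls;
static int release_calls;

static void tsan_acquire(void) { acquire_calls++; }
static void tsan_release(void) { release_calls++; }

/* Hypothetical argument struct; the real one is stored on the Go stack. */
struct symbolizer_arg {
    unsigned long pc;
    const char *func;
};

typedef void (*symbolizer_fn)(struct symbolizer_arg *);

/* Wrapper: bracket the user callback with acquire/release so that TSAN
 * treats calls made from different threads as synchronized. */
static void call_symbolizer_wrapped(symbolizer_fn fn, struct symbolizer_arg *arg) {
    assert(fn != NULL && arg != NULL);
    tsan_acquire();
    fn(arg); /* user C code writes to *arg here */
    tsan_release();
}

/* Example user symbolizer that writes to the struct it is given. */
static void example_symbolizer(struct symbolizer_arg *arg) {
    arg->func = "example";
}
```

Calling `call_symbolizer_wrapped(example_symbolizer, &arg)` runs the user function exactly once, with one acquire before it and one release after it.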
There is a lot of room for future cleanup here. Most runtime/cgo
functions have manual instrumentation in their C implementation. That
could be removed in favor of instrumentation in the runtime. We could
even theoretically remove the instrumentation from cmd/cgo and move it
to cgocall. None of these are necessary, but may make things more
consistent and easier to follow.
[1] Note that the cgo traceback function called from the signal handler
via x_cgo_callers _does_ have manual instrumentation.
Fixes #73949.
Cq-Include-Trybots: luci.golang.try:gotip-freebsd-amd64,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: I6a6a636c9daa38f7fd00694af76b75cb93ba1886
Reviewed-on: https://go-review.googlesource.com/c/go/+/677955
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
_cgo_beginthread used to retry _beginthread only when it failed with
EACCES, but CL 651995 switched to CreateThread and incorrectly mapped
EACCES to ERROR_NOT_ENOUGH_MEMORY. The correct mapping is
ERROR_ACCESS_DENIED.
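A portable sketch of the retry policy, under stated assumptions: the error values are redefined locally so the code compiles anywhere, `create_thread_fn` is a hypothetical hook that would wrap CreateThread and GetLastError on Windows, and the retry limit is a parameter chosen for illustration rather than the value the runtime uses.

```c
#include <assert.h>
#include <stddef.h>

/* Numeric values of the Windows error codes named above, redefined
 * locally so the sketch does not need <windows.h>. */
enum {
    ERR_SUCCESS = 0,
    ERR_ACCESS_DENIED = 5,    /* ERROR_ACCESS_DENIED */
    ERR_NOT_ENOUGH_MEMORY = 8 /* ERROR_NOT_ENOUGH_MEMORY */
};

/* Hypothetical thread-creation hook: returns ERR_SUCCESS or an error code. */
typedef int (*create_thread_fn)(void *ctx);

/* Retry only on the transient "access denied" failure; give up at once
 * on any other error. max_tries is an assumed parameter. */
static int create_thread_with_retry(create_thread_fn create, void *ctx, int max_tries) {
    int err = ERR_ACCESS_DENIED;
    assert(create != NULL && max_tries > 0);
    for (int tries = 0; tries < max_tries; tries++) {
        err = create(ctx);
        if (err != ERR_ACCESS_DENIED) {
            break; /* success or a non-retryable error */
        }
        /* A real implementation would back off here before retrying. */
    }
    return err;
}

/* Fake creator for demonstration: fails with access-denied until the
 * counter that ctx points to reaches zero. */
static int flaky_create(void *ctx) {
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return ERR_ACCESS_DENIED;
    }
    return ERR_SUCCESS;
}
```

Checking for the wrong code (here, `ERR_NOT_ENOUGH_MEMORY`) in the loop condition would make the first transient failure final, which is the class of bug the message describes.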
Fixes #72814
Fixes #75381
Change-Id: I8ba060114aae4e8249576f11a21eff613caa8001
Reviewed-on: https://go-review.googlesource.com/c/go/+/706075
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
On 386 the C sigaction function assumes that the caller does not set
the SA_RESTORER flag. It does not copy the C sa_restorer field to
the kernel sa_restorer field. The effect is that the kernel sees
the SA_RESTORER flag but a NULL sa_restorer field, and the program
crashes when returning from a signal handler.
On the other hand, the C sigaction function will return the SA_RESTORER
flag and the sa_restorer field stored in the kernel.
This means that if the Go runtime installs a signal handler,
with SA_RESTORER as is required when calling the kernel,
and the Go program calls C code that calls the C sigaction function
to query the current signal handler, that C code will get a result
that it can't pass back to sigaction.
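The query-then-reinstall pattern in question looks roughly like this. It is an illustrative sketch, not the test program from the issue; `requery_and_reinstall` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction */
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Query the current disposition of signo and install it again unchanged.
 * Returns 0 on success, -1 if either sigaction call fails. */
static int requery_and_reinstall(int signo) {
    struct sigaction current;
    assert(signo > 0);
    if (sigaction(signo, NULL, &current) != 0) {
        return -1;
    }
    /* In the 386 scenario described above, `current` can carry the
     * SA_RESTORER flag set by the Go runtime, which the C sigaction
     * function does not expect callers to pass back in. */
    return sigaction(signo, &current, NULL);
}
```

C code that saves and restores handlers in this way is ordinary, which is why the round trip needs to work.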
This CL fixes the problem by using the C sigaction function
for 386 programs that use cgo. This reuses the functionality
used on amd64 and other GOARCHs to support the race detector.
See #75253, or runtime/testdata/testprogcgo/eintr.go, for sample
code that used to fail on 386. No new test case is required;
we just remove the skip we used to have for eintr.go.
Fixes #75253
Change-Id: I803059b1fb9e09e9fbb43f68eccb6a59a92c2991
Reviewed-on: https://go-review.googlesource.com/c/go/+/701375
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
|
|
For programs with very large environments, calling unsetenv(3) for each
environment variable can be very expensive because of CGo overhead, but
clearenv(3) is much faster. The only thing we have to track is whether
GODEBUG is being unset by the operation, which can be done very quickly
without resorting to doing unsetenv(3) for every variable.
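On the C side the idea reduces to one lookup followed by one clearenv(3) call. The sketch below assumes glibc (clearenv is not in POSIX), and `clear_env_tracking_godebug` is a hypothetical name; it is not the code from this change.

```c
#define _DEFAULT_SOURCE /* for clearenv(3) and setenv(3) on glibc */
#include <assert.h>
#include <stdlib.h>

/* Clear the whole environment with a single libc call.
 * Returns 1 if GODEBUG was set beforehand (so the caller knows it has
 * just been unset), 0 if it was not, or -1 if clearenv fails. */
static int clear_env_tracking_godebug(void) {
    int had_godebug = getenv("GODEBUG") != NULL;
    if (clearenv() != 0) {
        return -1;
    }
    assert(getenv("GODEBUG") == NULL); /* postcondition: nothing is left */
    return had_godebug;
}
```

The cost no longer depends on one foreign call per variable: there is one lookup and one clear, however large the environment is.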
This change makes syscall.Clearenv() >98% faster when run in an
environment with as few as 100 environment variables. (Note that due
to golang/go#27217, it is necessary to modify BenchmarkClearenv to use
b.StopTimer() and -benchtime=100x in order to get these benchmark times;
otherwise syscall.Setenv() time is included and the benchmarks show a
more pessimistic 50% performance improvement.)
goos: linux
goarch: amd64
pkg: syscall
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
                  │     before      │                after                │
                  │     sec/op      │   sec/op     vs base                │
Clearenv/100-16       22276.5n ± 5%   285.8n ± 3%  -98.72% (p=0.000 n=10)
Clearenv/1000-16    1414104.0n ± 1%   783.1n ± 8%  -99.94% (p=0.000 n=10)
Clearenv/10000-16  143827.554µ ± 1%   7.591µ ± 5%  -99.99% (p=0.000 n=10)
geomean                 1.655m        1.193µ       -99.93%
The above benchmarks are CGo builds, which incur CGo overhead for
every setenv(3). If you run the same benchmarks for a non-CGo package
(i.e., outside of the "syscall" package), you get slightly more modest
performance improvements:
goos: linux
goarch: amd64
pkg: clearenv_nocgo
cpu: AMD Ryzen 7 7840U w/ Radeon 780M Graphics
                  │    before     │                after                 │
                  │    sec/op     │    sec/op     vs base                │
Clearenv/100-16      1106.0n ± 3%   230.7n ±  8%  -79.14% (p=0.000 n=10)
Clearenv/1000-16    11222.0n ± 1%   305.4n ±  6%  -97.28% (p=0.000 n=10)
Clearenv/10000-16  195676.5n ± 6%   759.9n ± 10%  -99.61% (p=0.000 n=10)
geomean               13.44µ        376.9n        -97.20%
(As above, this requires modifying the benchmarks to use b.StopTimer()
and -benchtime=100x.)
Change-Id: I53b96a75f189e91affbde423c907888b7e0fafcd
GitHub-Last-Rev: f8d7a8140d8490189d726eb380522dccacc5f176
GitHub-Pull-Request: golang/go#70672
Reviewed-on: https://go-review.googlesource.com/c/go/+/633515
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Kirill Kolyshkin <kolyshkin@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
|
|
According to the Loong64 procedure call standard [1], R31 is a static
register and therefore needs to be saved and restored.
Also, the R2 (thread pointer) register has been removed here, as it is
not involved in allocation.
[1]: https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc
Change-Id: I02e5d4bedf131e491f1a262aa3cbc0896cbc9488
Reviewed-on: https://go-review.googlesource.com/c/go/+/700817
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
loong64
This ensures that runtime's signal handlers pass through the TSAN and
MSAN libc interceptors and subsequent calls to the intercepted
sigaction function from C will correctly see them.
Change-Id: I243a70d9dcb6d95a65c8494d5f9f9f09a316c693
Reviewed-on: https://go-review.googlesource.com/c/go/+/654995
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Change-Id: Ie38583d667d579751d643b2da2aa56390b69904c
Reviewed-on: https://go-review.googlesource.com/c/go/+/652255
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
|
|
CL 652181 accidentally missed this iPhone-only code.
For #71961
Change-Id: I567f8bb38958907442e69494da330d5199d11f54
Reviewed-on: https://go-review.googlesource.com/c/go/+/653135
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
It's used by the SWIG CI build, at least, and it's an easy fix.
Fixes #71961
Change-Id: Id21071a5aef216b35ecf0e9cd3e05d08972d92fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/652181
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
_beginthread is intended to be used together with the C runtime.
The cgo runtime doesn't use it, so better use CreateThread directly,
which is the Windows API for creating threads.
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest,gotip-windows-arm64
Change-Id: Ic6cf75f69f62a3babf5e74155da1aac70961886c
Reviewed-on: https://go-review.googlesource.com/c/go/+/651995
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #71566
Change-Id: I6dc365dd799d7b506b4a55895f1736d3dfd4684b
Reviewed-on: https://go-review.googlesource.com/c/go/+/647095
Reviewed-by: Alan Donovan <adonovan@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
|
|
Go has never supported Cygwin as a C compiler, but users get the
following cryptic error message when they try to use it:
implicit declaration of function '_beginthread'
This is because Cygwin doesn't implement _beginthread. Note that
this is not the only problem with Cygwin, but it's the one that
users are most likely to run into first.
This CL improves the error message to make it clear that Cygwin
is not supported, and suggests using MinGW instead.
Fixes #59490
Fixes #36691
Change-Id: Ifeec7a2cb38d7c5f50d6362c95504f72818c6a76
Reviewed-on: https://go-review.googlesource.com/c/go/+/627935
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
It is defined in bionic libc since at least API level 3. Use it.
Updates #68285.
Change-Id: I215c2d61d5612e7c0298b2cb69875690f8fbea66
Reviewed-on: https://go-review.googlesource.com/c/go/+/626275
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Currently, at a cgo callback where there is already a Go frame on
the stack (i.e. C->Go->C->Go), we require that at the inner Go
callback the SP is within the g0's stack bounds set by a previous
callback. This is to prevent the C code from switching stacks while
having a Go frame on the stack, which we don't really support. But
this could also happen when we cannot get accurate stack bounds,
e.g. when pthread_getattr_np is not available. Since the stack
bounds are just estimates based on the current SP, if there are
multiple C->Go callbacks with various stack depths, it is possible
that the SP of a later callback falls out of a previous call's
estimate. This leads to a runtime throw in a seemingly reasonable
program.
This CL changes it to save the old g0 stack bounds at cgocallback,
update the bounds, and restore the old bounds at return. So each
callback will get its own stack bounds based on the current SP,
and when it returns, the outer callback has its old stack
bounds restored.
Also, at a cgo callback when there is no Go frame on the stack,
we currently always get new stack bounds. We do this because if
we can only get estimated bounds based on the SP, and the stack
depth varies a lot between two C->Go calls, the previous
estimates may be off and we fall out or nearly fall out of the
previous bounds. But this causes a performance problem: the
pthread API to get accurate stack bounds (pthread_getattr_np) is
very slow when called on the main thread. Getting the stack bounds
every time significantly slows down repeated C->Go calls on the
main thread.
This CL fixes it by "caching" the stack bounds if they are
accurate. That is, on the second C->Go call, if the previous
stack bounds are accurate and the current SP is in bounds, we can
be sure it is the same stack and we don't need to update the bounds.
This avoids the repeated calls to pthread_getattr_np. If we cannot
get the accurate bounds, we continue to update the stack bounds
based on the SP, and that operation is very cheap.
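The caching rule can be modeled in a few lines. This is a simplified sketch, not the runtime's implementation (which is written in Go): `stack_bounds`, `query_accurate_bounds`, and `update_bounds` are hypothetical names, the slow pthread query is replaced by a stub returning a fixed range, and the cheap SP-based estimate path is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the g0 stack record. */
struct stack_bounds {
    uintptr_t lo;
    uintptr_t hi;
    int accurate; /* nonzero if lo/hi came from the pthread API */
};

/* Counts how often the expensive accurate query runs. */
static int accurate_queries;

/* Stand-in for the slow pthread_getattr_np path: pretends the thread
 * stack is the fixed range (0x100000, 0x200000]. */
static void query_accurate_bounds(struct stack_bounds *b) {
    accurate_queries++;
    b->lo = 0x100000;
    b->hi = 0x200000;
    b->accurate = 1;
}

/* Called on each C->Go entry with the current stack pointer. Skips the
 * slow query when the cached bounds are accurate and still contain sp. */
static void update_bounds(struct stack_bounds *b, uintptr_t sp) {
    assert(b != NULL);
    if (b->accurate && sp > b->lo && sp <= b->hi) {
        return; /* same stack as before; cached bounds remain valid */
    }
    query_accurate_bounds(b);
}
```

With this rule, repeated callbacks on the same thread pay for the accurate query once, and only an SP outside the cached range triggers it again.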
On a Linux/AMD64 machine with glibc:
name                     old time/op  new time/op  delta
CgoCallbackMainThread-8  96.4µs ± 3%   0.1µs ± 2%  -99.92%  (p=0.000 n=10+9)
Fixes #68285.
Fixes #68587.
Change-Id: I3422badd5ad8ff63e1a733152d05fb7a44d5d435
Reviewed-on: https://go-review.googlesource.com/c/go/+/600296
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Rather than explicitly calling pthread_detach.
Fixes #68850
Change-Id: I7b4042283f9feb5383bffd40fae6db6d23217f97
Reviewed-on: https://go-review.googlesource.com/c/go/+/605257
Commit-Queue: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Cleanup and friction reduction
For #65355.
Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511
Reviewed-on: https://go-review.googlesource.com/c/go/+/600436
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
In glibc versions older than 2.32 (before commit 4721f95058),
pthread_getattr_np does not always initialize the `attr` argument,
and when it fails, it results in a NULL pointer dereference in
pthread_attr_destroy down the road.
This is the simplest way to avoid this, and an alternative to CL 585019.
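The message does not spell out the workaround. One plausible shape, assuming the fix is to initialize the attribute object before querying it, is sketched below; `current_stack_bounds` is a hypothetical helper, and the code assumes glibc, where pthread_getattr_np is available.

```c
#define _GNU_SOURCE /* for pthread_getattr_np */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Fetch the current thread's stack range into *lo and *hi.
 * Returns 0 on success or a nonzero error number. */
static int current_stack_bounds(uintptr_t *lo, uintptr_t *hi) {
    pthread_attr_t attr;
    void *addr = NULL;
    size_t size = 0;
    int rc;

    assert(lo != NULL && hi != NULL);

    /* Initialize first so that attr is valid even if the query below
     * fails without touching it; destroying it is then always safe. */
    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_getattr_np(pthread_self(), &attr);
    if (rc == 0) {
        rc = pthread_attr_getstack(&attr, &addr, &size);
    }
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }
    *lo = (uintptr_t)addr; /* lowest address of the stack */
    *hi = (uintptr_t)addr + size;
    return 0;
}
```

The important property is that pthread_attr_destroy never sees an uninitialized object, whichever step fails.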
Updates #65625.
Change-Id: If490fd37020b03eb084ebbdbf9ae0248916426d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/587919
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
|
|
callbackUpdateSystemStack contains a fast path to exit early without
update if SP is already within the g0.stack bounds.
This is not safe, as a subsequent call may have new stack bounds that
only partially overlap the old stack bounds. In this case it is possible
to see an SP that is in the old stack bounds, but very close to the
bottom of the bounds due to the partial overlap. In that case we're very
likely to "run out" of space on the system stack.
We only need to do this on extra Ms, as normal Ms have precise bounds
defined when we allocated the stack.
TSAN annotations are added to x_cgo_getstackbounds because bounds is a
pointer into the Go stack. The stack can be reused when an old thread
exits and a new thread starts, but TSAN can't see the synchronization
there. This isn't a new case, but we are now calling more often.
Fixes #62440.
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Change-Id: I5389050494987b7668d0b317fb92f85e61d798ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/584597
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Converting void* directly to mach_port_t causes newer clang to emit a
void-pointer-to-int-cast warning/error.
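A common way to silence this diagnostic is to cast through an integer type of pointer width first. The sketch below uses a hypothetical `port_t` typedef in place of mach_port_t (assumed here to be a 32-bit unsigned integer) so that it compiles on any platform; it is not the code from this change.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for mach_port_t. */
typedef unsigned int port_t;

/* Recover a port number that was passed through a void* argument.
 * Going via uintptr_t makes the narrowing conversion explicit. */
static port_t port_from_arg(void *arg) {
    uintptr_t bits = (uintptr_t)arg;
    assert(bits <= UINT_MAX); /* the value must fit in port_t */
    return (port_t)bits;
}

/* The matching encode direction. */
static void *arg_from_port(port_t port) {
    return (void *)(uintptr_t)port;
}
```

The round trip `port_from_arg(arg_from_port(p))` yields `p` again for any value that fits in `port_t`.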
Change-Id: I709955d4678bed3f690a8337ce85fd8678d217bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/573415
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
For #65290
Fixes #65971
Change-Id: If15853f287e06b85bb1cb038b3785516d5812f84
Reviewed-on: https://go-review.googlesource.com/c/go/+/567556
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
|
|
Fixes #64553
Change-Id: I7860cd9ba74d70a7d988538ea4df8e122f94cde6
GitHub-Last-Rev: 06164374734aef5b94566930426005ad66d0a5b6
GitHub-Pull-Request: golang/go#64727
Reviewed-on: https://go-review.googlesource.com/c/go/+/550115
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
The default case in x_cgo_getstackbound does not actually get the stack bound of
the current thread, but estimates the bound based on the default stack size. Add
a comment noting this.
Change-Id: I7d886461f0bbc795834bed37b554417cf3837a2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/563376
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
When cross-compiling a cgo program with CC=clang for Linux/ARMv5,
atomic warnings cause build errors, as cgo uses -Werror.
These warnings seem to be harmless and come from the usage of
__atomic_load_n, which is emulated due to the lack of atomic
instructions in armv5.
Fixes #65290
Change-Id: Ie72efb77468f06888f81f15850401dc8ce2c78f9
GitHub-Last-Rev: fbad847b962f6b4599cd843018e79f4b55be097e
GitHub-Pull-Request: golang/go#65588
Reviewed-on: https://go-review.googlesource.com/c/go/+/562348
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This change introduces new options to set the floating point
mode on ARM targets. The GOARM version number can optionally be
followed by ',hardfloat' or ',softfloat' to select whether to
use hardware instructions or software emulation for floating
point computations, respectively. For example,
GOARM=7,softfloat.
Previously, software floating point support was limited to
GOARM=5. With these options, software floating point is now
extended to all ARM versions, including GOARM=6 and 7. This
change also extends hardware floating point to GOARM=5.
GOARM=5 defaults to softfloat and GOARM=6 and 7 default to
hardfloat.
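The syntax and defaults described above can be illustrated with a small parser. This is a sketch for illustration only: the Go toolchain's own parsing is written in Go, and `struct goarm` and `parse_goarm` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct goarm {
    int version;   /* 5, 6, or 7 */
    int softfloat; /* 1 = software floating point, 0 = hardware */
};

/* Parse strings such as "7", "5,hardfloat", or "7,softfloat".
 * Returns 0 on success, -1 on malformed input. */
static int parse_goarm(const char *s, struct goarm *out) {
    assert(s != NULL && out != NULL);
    if (s[0] < '5' || s[0] > '7') {
        return -1;
    }
    out->version = s[0] - '0';
    /* Defaults from the text: 5 is softfloat, 6 and 7 are hardfloat. */
    out->softfloat = (out->version == 5);

    const char *rest = s + 1;
    if (*rest == '\0') {
        return 0; /* no explicit float mode; keep the default */
    }
    if (strcmp(rest, ",softfloat") == 0) {
        out->softfloat = 1;
        return 0;
    }
    if (strcmp(rest, ",hardfloat") == 0) {
        out->softfloat = 0;
        return 0;
    }
    return -1;
}
```

For example, "7,softfloat" parses to version 7 with software floating point, while a bare "7" keeps the hardfloat default.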
For #61588
Change-Id: I23dc86fbd0733b262004a2ed001e1032cf371e94
Reviewed-on: https://go-review.googlesource.com/c/go/+/514907
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: Ifb4844efddcb0369b0302eeab72394eeaf5c8072
Reviewed-on: https://go-review.googlesource.com/c/go/+/540022
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: shuang cui <imcusg@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Currently, set_crosscall2 takes the address of crosscall2 without
using the GOT, which, on some architectures, results in a
PC-relative relocation (e.g. R_AARCH64_ADR_PREL_PG_HI21 on ARM64)
to the crosscall2 symbol. But crosscall2 is dynamically exported,
so the C linker thinks it may bind to a symbol from a different
DSO. Some C linkers may not like a PC-relative relocation to such a
symbol. Use a local trampoline to avoid taking the address of a
dynamically exported symbol.
It may be possible to not dynamically export crosscall2. But this
CL is safer for backport. Later we may remove the trampolines
after unexporting crosscall2, if they are not needed.
Fixes #62556.
Change-Id: Id28457f65ef121d3f87d8189803abc65ed453283
Reviewed-on: https://go-review.googlesource.com/c/go/+/533535
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
The Windows unhandled exception mechanism fails to call the callback
set in SetUnhandledExceptionFilter if the stack can't be correctly
unwound.
Some cgo glue code was not properly chaining the frame pointer, making
the stack unwind fail in case of an exception inside a cgo call.
This CL fixes that and adds a test case to avoid regressions.
Fixes #50951
Change-Id: Ic782b5257fe90b05e3def8dbf0bb8d4ed37a190b
Reviewed-on: https://go-review.googlesource.com/c/go/+/525475
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
The msan feature depends on LLVM. The currently released LLVM 16
already supports the LoongArch architecture, and msan support
for LoongArch64 was added in this patch [1], which has been
merged into the main and release/17.x branches.
[1]: https://reviews.llvm.org/D140528
Change-Id: If537c5ffb1c9d4b3316b9b3794d411953bc5764b
Reviewed-on: https://go-review.googlesource.com/c/go/+/481315
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: abner chenc <chenguoqi@loongson.cn>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: WANG Xuerui <git@xen0n.name>
|
|
Change-Id: Ia63a4604449b5e460e6f54c962fb7d6db2bc6a43
Reviewed-on: https://go-review.googlesource.com/c/go/+/519457
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Change-Id: I3302cc2f0e03014e9497976e36d1c7a381a2f962
Reviewed-on: https://go-review.googlesource.com/c/go/+/518623
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
All openbsd architectures now use the same code; deduplicate accordingly.
Change-Id: I65f1d9bd78c97dbdf552ec95ebba7ec4d04c8d2d
Reviewed-on: https://go-review.googlesource.com/c/go/+/518622
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
All netbsd architectures now use the same code; deduplicate accordingly.
Change-Id: Ieb179fd76885b7af6d388d7f2aee0f9fac6f1264
Reviewed-on: https://go-review.googlesource.com/c/go/+/518621
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
|
|
Much of the gcc_linux_*.c code is identical and duplicated across
architectures. Consolidate code for 386, arm, loong64, mips* and
riscv64, where the only difference is the build tags (386 also
has some non-functional ordering differences).
Change-Id: I14ee9a4cc6b72e165239d196b68b6343efaddf0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/518620
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Most freebsd architectures now use the same code; deduplicate accordingly.
The arm code differs slightly in that it has a compile time check for
ARM_TP_ADDRESS, however this is written in a way that it can be included
for all architectures.
Change-Id: I7f6032b63521d24d0c3b5e0e08d57e32b4f9ddc4
Reviewed-on: https://go-review.googlesource.com/c/go/+/518619
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
This reduces inconsistency with other architectures and will allow
for further code deduplication.
Change-Id: Icf0d02f765546c3193cccaa22c79e632e12d6bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/518616
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Use fatalf consistently on freebsd. Also use it on dragonfly, netbsd
and openbsd.
Change-Id: I8643c0b7bc13c3cb5173209d311d6d297913955b
Reviewed-on: https://go-review.googlesource.com/c/go/+/518615
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
Most architectures have a crosscall1 function that takes a function
pointer, a setg_gcc function pointer and a g pointer. However,
crosscall_386 only takes a function pointer and the call to setg_gcc
is performed in the thread entry function.
Rename crosscall_386 to crosscall1 for consistency with other
architectures, as well as standardising the API - while not strictly
necessary, it will allow for further deduplication as the calling
code becomes more consistent.
Change-Id: I77cf42e1e15e0a4c5802359849a849c32cebd92f
Reviewed-on: https://go-review.googlesource.com/c/go/+/518618
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This reduces inconsistency with other architectures and will allow
for further code deduplication.
Change-Id: I5becbf29af2ef714974b5e338f869281f2b4de8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/518617
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This extends CL 419434 to all Unix targets. Rather than repeating
the code, pull all the similar code into a single function.
CL 419434 description:
For a cgo binary, at startup we set g0's stack bounds using the
address of a local variable (&size) in a C function x_cgo_init and
the stack size from pthread_attr_getstacksize. Normally, &size is
an address within the current stack frame. However, when it is
compiled with ASAN, it may be instrumented to __asan_stack_malloc_0
and the address may not live in the current stack frame, causing
the stack bound to be set incorrectly, e.g. lo > hi.
Use __builtin_frame_address(0) to get the stack address instead.
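A minimal sketch of the technique, assuming a downward-growing stack and a GCC or Clang compiler: `estimate_stack_lo` is a hypothetical helper, and the real code obtains the size from pthread_attr_getstacksize rather than taking it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Estimate the low bound of the current stack given its total size.
 * __builtin_frame_address(0) returns the address of the current
 * function's frame, so it does not depend on where a sanitizer chooses
 * to place an instrumented local variable. */
static uintptr_t estimate_stack_lo(size_t stack_size) {
    uintptr_t frame = (uintptr_t)__builtin_frame_address(0);
    assert(stack_size > 0 && stack_size < frame);
    return frame - stack_size;
}
```

Because the frame address always lies on the real thread stack, the computed low bound stays below the current frame, which avoids the lo > hi failure described above.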
Change-Id: I914a09d32c66a79515b6f700be18c690f3c0c77b
Reviewed-on: https://go-review.googlesource.com/c/go/+/517335
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
Every call from C to Go acquires a mutex to check whether the Go runtime
has been fully initialized. This often does not matter, because the lock
is held only briefly. However, code that makes many parallel calls
from C to Go can cause heavy contention on the mutex.
Since this is an initialization guard, we can double check with an atomic
operation to provide a fast path in case the initialization is done.
With this CL, the program in #60961 goes from ~2.7s to ~1.8s.
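The fast path can be sketched with C11 atomics and a pthread mutex. The names below (`init_done`, `mark_init_done`, `wait_init_done`, `slow_path_entries`) are hypothetical, and the sketch shows the general double-checked pattern rather than the exact runtime/cgo code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t init_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t init_cond = PTHREAD_COND_INITIALIZER;
static atomic_int init_done;  /* 0 until initialization has finished */
static int slow_path_entries; /* demo counter, guarded by init_mu */

/* Called once by the initializing thread. */
static void mark_init_done(void) {
    pthread_mutex_lock(&init_mu);
    atomic_store_explicit(&init_done, 1, memory_order_release);
    pthread_cond_broadcast(&init_cond);
    pthread_mutex_unlock(&init_mu);
}

/* Called on every C->Go entry. */
static void wait_init_done(void) {
    /* Fast path: a single atomic load and no lock once init is done. */
    if (atomic_load_explicit(&init_done, memory_order_acquire)) {
        return;
    }
    /* Slow path: re-check under the mutex and sleep until signalled. */
    pthread_mutex_lock(&init_mu);
    slow_path_entries++;
    while (!atomic_load_explicit(&init_done, memory_order_relaxed)) {
        pthread_cond_wait(&init_cond, &init_mu);
    }
    assert(atomic_load_explicit(&init_done, memory_order_relaxed));
    pthread_mutex_unlock(&init_mu);
}
```

After `mark_init_done()` has run, every later `wait_init_done()` returns from the fast path, so concurrent callers no longer contend on the mutex.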
Fixes #60961
Change-Id: Iba4cabbee3c9bc646e70ef7eb074212ba63fdc04
Reviewed-on: https://go-review.googlesource.com/c/go/+/505455
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reapplies CL 485500, with a fix drafted in CL 492987 incorporated.
CL 485500 was reverted due to #60004 and #60007. #60004 is fixed in
CL 492743. #60007 is fixed in CL 492987 (incorporated in this CL).
[Original CL 485500 description]
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchronization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But needm and dropm are costly because of the signal-related syscalls.
So we no longer call dropm when returning to C, which means the extra M stays bound to the C thread until it exits, avoiding needm and dropm on each C to Go call.
Instead, we call dropm only when the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
Then store the g0 of the current m into the thread-specific value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list when the C thread exits.
When returning to C:
Skip dropm in cgocallback when the pthread variable has been created, so that the extra M will be reused the next time a Go function is invoked from C.
This is purely a performance optimization. The old version, in which needm and dropm happen on each cgo call, is still correct, and we have to keep it on systems with cgo but without pthreads, like Windows.
The speedup is significant. The exact figure depends on the OS and CPU, but in general a simple Go function call from a C thread is about 10x faster.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. 28x faster, from 3395 ns/op to 121 ns/op, on darwin with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. 6.5x faster, from 1495 ns/op to 230 ns/op, on Linux with an Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (On some platforms this may still be a guess, as we don't
know exactly where we are in the C stack, but it is probably
better than simply assuming 32K.)
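The failure mode described above can be illustrated with a toy model. This is a simplified sketch, not runtime code: the names stackBounds, guessBounds, and overflows are hypothetical, the addresses are made up, and the 1024-byte slack above the SP is an assumption of the model.

```go
package main

import "fmt"

// assumedSize is the 32K guess mentioned in the description.
const assumedSize = 32 * 1024

// stackBounds is a simplified model of g0's stack bounds; stacks grow down.
type stackBounds struct{ lo, hi uint64 }

// guessBounds derives bounds from the SP alone, assuming 32K of room below it.
// The 1024 bytes of slack above the SP are an assumption of this model.
func guessBounds(sp uint64) stackBounds {
	return stackBounds{lo: sp - assumedSize, hi: sp + 1024}
}

// overflows reports whether sp has dropped below the recorded lower bound.
func (s stackBounds) overflows(sp uint64) bool { return sp < s.lo }

func main() {
	firstSP := uint64(0x20000000)  // SP at the first C to Go call (made-up address)
	laterSP := firstSP - 64*1024   // C used 64K more stack before the next call

	g0 := guessBounds(firstSP) // computed once and then reused with the bound M

	fmt.Println(g0.overflows(firstSP))                   // false
	fmt.Println(g0.overflows(laterSP))                   // true: stale bounds look like an overflow
	fmt.Println(guessBounds(laterSP).overflows(laterSP)) // false: recomputing per call hid the problem
}
```

Recomputing the guess on every call masked the inaccuracy; once the M stays bound to the thread, the bounds need to reflect the thread's real stack.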
[CL 492987 description]
On the first call into Go from a C thread, currently we set the g0
stack's high bound imprecisely based on the SP. With CL 485500, we
keep the M and don't recompute the stack bounds when it calls into
Go again. If the first call is made when the C thread uses some
deep stack, but a subsequent call is made with a shallower stack,
the SP may be above g0.stack.hi.
This is usually okay as we don't usually check stack.hi. One place
where we do check stack.hi is in the signal handler, in
adjustSignalStack. In particular, C TSAN delivers signals on the
g0 stack (instead of the usual signal stack). If the SP is above
g0.stack.hi, we don't see that it is on the g0 stack, and throw.
This CL makes it get an accurate stack upper bound with the
pthread API (on the platforms where it is available).
Also add some debug print for the "handler not on signal stack"
throw.
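A toy model of the upper-bound problem, under the same caveats as any sketch: g0Stack, boundsFromSP, and contains are hypothetical names, the addresses are invented, and the real check in adjustSignalStack is more involved than a range test.

```go
package main

import "fmt"

// g0Stack is a simplified model of the g0 stack bounds; stacks grow down.
type g0Stack struct{ lo, hi uint64 }

// boundsFromSP models the imprecise upper bound: just above the SP of the first call.
// The 1024 bytes of slack are an assumption of this model.
func boundsFromSP(lo, sp uint64) g0Stack { return g0Stack{lo: lo, hi: sp + 1024} }

// contains reports whether sp lies within the recorded bounds.
func (s g0Stack) contains(sp uint64) bool { return s.lo <= sp && sp < s.hi }

func main() {
	const (
		threadLo = uint64(0x10000000) // hypothetical real bottom of the C thread stack
		threadHi = uint64(0x10800000) // hypothetical real top of the C thread stack
	)
	deepSP := threadLo + 64*1024   // first call into Go happens deep in the stack
	shallowSP := threadHi - 4*1024 // a later call (or signal) arrives near the top

	guessed := boundsFromSP(threadLo, deepSP)
	accurate := g0Stack{lo: threadLo, hi: threadHi} // what a pthread query could provide

	fmt.Println(guessed.contains(shallowSP))  // false: SP is above the guessed hi
	fmt.Println(accurate.contains(shallowSP)) // true
}
```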
Fixes #51676.
Fixes #59294.
Fixes #59678.
Fixes #60007.
Change-Id: Ie51c8e81ade34ec81d69fd7bce1fe0039a470776
Reviewed-on: https://go-review.googlesource.com/c/go/+/495855
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 485500.
Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455.
Change-Id: I426278d400f7611170918fc07c524cb059b9cc55
Reviewed-on: https://go-review.googlesource.com/c/go/+/492995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Chressie Himpel <chressie@google.com>
|
|
Rework this function to closely match the PPC64 crosscall2, but
written in GNU asm. Likewise, fix this to store TOC in the new
frame, not the caller's, as is required by the ELF ABIs.
Change-Id: I8902c74f2607e3436260882a7bea52e72a67b8f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/486335
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Carlos Amedee <carlos@golang.org>
|
|
ELFv1 and ELFv2 declare V20-V31 as nonvolatile (callee save) registers.
Go does not. Preserve them before calling into Go.
Change-Id: If60a6d8f71b51b1136a86eab2b90d964900becd7
Reviewed-on: https://go-review.googlesource.com/c/go/+/480657
Auto-Submit: Carlos Amedee <carlos@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Run-TryBot: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
cgo.NewHandle atomically increments a global uintptr index using
atomic.AddUintptr. Use atomic.Uintptr instead, which is
cleaner and clearer.
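For comparison, a small sketch of the two styles. The variable and function names here are illustrative only and are not taken from runtime/cgo.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var (
	oldIdx uintptr        // old style: plain uintptr updated through atomic.AddUintptr
	newIdx atomic.Uintptr // new style: the type itself only allows atomic access
)

// nextOld returns the next index using the function-based API.
func nextOld() uintptr { return atomic.AddUintptr(&oldIdx, 1) }

// nextNew returns the next index using the method-based API.
func nextNew() uintptr { return newIdx.Add(1) }

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			nextOld()
			nextNew()
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadUintptr(&oldIdx), newIdx.Load()) // prints "100 100"
}
```

With atomic.Uintptr, the compiler rules out accidental non-atomic reads or writes of the counter, since the underlying value is not directly accessible.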
Change-Id: I845b3e4cb8c461e787a9b9bb2a9ceaaef1d21d8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/490775
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speeds up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But needm and dropm are costly because of the signal-related syscalls.
So we no longer call dropm when returning to C, which means the extra M stays bound to the C thread until it exits, avoiding needm and dropm on each C to Go call.
Instead, we call dropm only when the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
Then store the g0 of the current m into the thread-specific value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list when the C thread exits.
When returning to C:
Skip dropm in cgocallback when the pthread variable has been created, so that the extra M will be reused the next time a Go function is invoked from C.
This is purely a performance optimization. The old version, in which needm and dropm happen on each cgo call, is still correct, and we have to keep it on systems with cgo but without pthreads, like Windows.
The speedup is significant. The exact figure depends on the OS and CPU, but in general a simple Go function call from a C thread is about 10x faster.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. 28x faster, from 3395 ns/op to 121 ns/op, on darwin with an Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. 6.5x faster, from 1495 ns/op to 230 ns/op, on Linux with an Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (On some platforms this may still be a guess, as we don't
know exactly where we are in the C stack, but it is probably
better than simply assuming 32K.)
[CL 485500 description]
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchronization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
Fixes #51676.
Fixes #59294.
Fixes #59678.
Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
on PPC64"""
This reverts CL 481075 (a re-apply of previously reverted CL 478917).
Reason for revert: CL 481061 causes C TSAN failures and must be
reverted. See CL 485275. This CL depends on CL 481061.
For #59678.
Change-Id: I4bf7f43d9df1ae28e04cd4065552bcbee82ef13f
Reviewed-on: https://go-review.googlesource.com/c/go/+/485316
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 481795.
Reason for revert: CL 481061 causes C TSAN failures and must be
reverted. See CL 485275. This CL depends on CL 481061.
For #59678.
Change-Id: I5ec1f495154205ebdf19cd44c6e6452a7a3606f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/485315
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|