| Age | Commit message (Collapse) | Author |
|
Windows doesn't require any special handling for cgo threads. They
can be created in the same way as in non-cgo code.
In fact, the code to create threads in runtime and in runtime/cgo is
basically the same, except that the latter does some retries on failure.
Cq-Include-Trybots: luci.golang.try:gotip-windows-amd64-longtest,gotip-windows-amd64-race,gotip-windows-arm64
Change-Id: I49d4de93d4d3b07a4c89e2bfb6b7302c6dfb9877
Reviewed-on: https://go-review.googlesource.com/c/go/+/746300
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Most platforms share the same implementation for x_cgo_init and
crosscall1. Solaris diverges too much and is left for a future CL.
Cq-Include-Trybots: luci.golang.try:gotip-freebsd-amd64,gotip-darwin-amd64_14,gotip-darwin-arm64_15,gotip-netbsd-arm64,gotip-openbsd-ppc64,gotip-solaris-amd64,gotip-linux-ppc64_power10
Change-Id: Ib2eeb6456caa5c055e1ac1907c2fdf63db58dafc
Reviewed-on: https://go-review.googlesource.com/c/go/+/708035
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Junyang Shao <shaojunyang@google.com>
|
|
Almost all pthread-related functions are exactly the same for Unix OSes.
Their implementation can be shared, taking into account the small
differences using standard predefined macros.
Cq-Include-Trybots: luci.golang.try:gotip-freebsd-amd64,gotip-darwin-amd64_14,gotip-darwin-arm64_15,gotip-netbsd-arm64,gotip-openbsd-amd64,gotip-openbsd-ppc64,gotip-solaris-amd64,gotip-linux-ppc64_power10
Change-Id: I8bee25f0619a5b315439cf12d94312c36c3e5a73
Reviewed-on: https://go-review.googlesource.com/c/go/+/707955
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
When calling into C via cmd/cgo, the generated code calls
_cgo_tsan_acquire / _cgo_tsan_release around the C call to report a
dummy lock to the C/C++ TSAN runtime. This is necessary because the
C/C++ TSAN runtime does not understand synchronization within Go and
would otherwise report false positive race reports. See the comment in
cmd/cgo/out.go for more details.
Various C functions in runtime/cgo also contain manual calls to
_cgo_tsan_acquire/release where necessary to suppress race reports.
However, the cgo symbolizer and cgo traceback functions called from
callCgoSymbolizer and cgoContextPCs, respectively, do not have any
instrumentation [1]. They call directly into user C functions with no
TSAN instrumentation.
This means they have an opportunity to report false race conditions. The
most direct way is via their argument. Both are passed a pointer to a
struct stored on the Go stack, and both write to fields of the struct.
If two calls are passed the same pointer from different threads, the C
TSAN runtime will think this is a race.
This is simple to achieve for the cgo symbolizer function, which the
new regression test does. callCgoSymbolizer is called on the standard
goroutine stack, so the argument is a pointer into the goroutine stack.
If the goroutine moves Ms between two calls, it will look like a race.
On the other hand, cgoContextPCs is called on the system stack. Each M
has a unique system stack, so for it to pass the same argument pointer
on different threads would require the first M to exit, free its stack,
and the same region of address space to be used as the stack for a new
M. Theoretically possible, but quite unlikely.
Both of these are addressed by providing a C wrapper in runtime/cgo that
calls _cgo_tsan_acquire/_cgo_tsan_release around calls to the symbolizer
and traceback functions.
There is a lot of room for future cleanup here. Most runtime/cgo
functions have manual instrumentation in their C implementation. That
could be removed in favor of instrumentation in the runtime. We could
even theoretically remove the instrumentation from cmd/cgo and move it
to cgocall. None of these are necessary, but may make things more
consistent and easier to follow.
[1] Note that the cgo traceback function called from the signal handler
via x_cgo_callers _does_ have manual instrumentation.
Fixes #73949.
Cq-Include-Trybots: luci.golang.try:gotip-freebsd-amd64,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Change-Id: I6a6a636c9daa38f7fd00694af76b75cb93ba1886
Reviewed-on: https://go-review.googlesource.com/c/go/+/677955
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
It's used by the SWIG CI build, at least, and it's an easy fix.
Fixes #71961
Change-Id: Id21071a5aef216b35ecf0e9cd3e05d08972d92fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/652181
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
|
|
Rather than explicitly calling pthread_detach.
Fixes #68850
Change-Id: I7b4042283f9feb5383bffd40fae6db6d23217f97
Reviewed-on: https://go-review.googlesource.com/c/go/+/605257
Commit-Queue: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
For #65290
Fixes #65971
Change-Id: If15853f287e06b85bb1cb038b3785516d5812f84
Reviewed-on: https://go-review.googlesource.com/c/go/+/567556
Auto-Submit: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
|
|
When cross-compiling a cgo program with CC=clang for Linux/ARMv5,
atomic warnings cause build errors, as cgo uses -Werror.
These warnings seem to be harmless and come from the usage of
__atomic_load_n, which is emulated due to the lack of atomic
instructions in armv5.
Fixes #65290
Change-Id: Ie72efb77468f06888f81f15850401dc8ce2c78f9
GitHub-Last-Rev: fbad847b962f6b4599cd843018e79f4b55be097e
GitHub-Pull-Request: golang/go#65588
Reviewed-on: https://go-review.googlesource.com/c/go/+/562348
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
|
|
Change-Id: Ia63a4604449b5e460e6f54c962fb7d6db2bc6a43
Reviewed-on: https://go-review.googlesource.com/c/go/+/519457
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
This extends CL 419434 to all Unix targets. Rather than repeating
the code, pull all the similar code into a single function.
CL 419434 description:
For a cgo binary, at startup we set g0's stack bounds using the
address of a local variable (&size) in a C function x_cgo_init and
the stack size from pthread_attr_getstacksize. Normally, &size is
an address within the current stack frame. However, when it is
compiled with ASAN, it may be instrumented to __asan_stack_malloc_0
and the address may not live in the current stack frame, causing
the stack bound to be set incorrectly, e.g. lo > hi.
Using __builtin_frame_address(0) to get the stack address instead.
Change-Id: I914a09d32c66a79515b6f700be18c690f3c0c77b
Reviewed-on: https://go-review.googlesource.com/c/go/+/517335
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
|
|
Every call from C to Go does acquire a mutex to check whether Go runtime
has been fully initialized. This often does not matter, because the lock
is held only briefly. However, with code that does a lot of parallel
calls from C to Go could cause heavy contention on the mutex.
Since this is an initialization guard, we can double check with atomic
operation to provide a fast path in case the initialization is done.
With this CL, program in #60961 reduces from ~2.7s to ~1.8s.
Fixes #60961
Change-Id: Iba4cabbee3c9bc646e70ef7eb074212ba63fdc04
Reviewed-on: https://go-review.googlesource.com/c/go/+/505455
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reapplies CL 485500, with a fix drafted in CL 492987 incorporated.
CL 485500 is reverted due to #60004 and #60007. #60004 is fixed in
CL 492743. #60007 is fixed in CL 492987 (incorporated in this CL).
[Original CL 485500 description]
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 492987 description]
On the first call into Go from a C thread, currently we set the g0
stack's high bound imprecisely based on the SP. With CL 485500, we
keep the M and don't recompute the stack bounds when it calls into
Go again. If the first call is made when the C thread uses some
deep stack, but a subsequent call is made with a shallower stack,
the SP may be above g0.stack.hi.
This is usually okay as we don't check usually stack.hi. One place
where we do check for stack.hi is in the signal handler, in
adjustSignalStack. In particular, C TSAN delivers signals on the
g0 stack (instead of the usual signal stack). If the SP is above
g0.stack.hi, we don't see it is on the g0 stack, and throws.
This CL makes it get an accurate stack upper bound with the
pthread API (on the platforms where it is available).
Also add some debug print for the "handler not on signal stack"
throw.
Fixes #51676.
Fixes #59294.
Fixes #59678.
Fixes #60007.
Change-Id: Ie51c8e81ade34ec81d69fd7bce1fe0039a470776
Reviewed-on: https://go-review.googlesource.com/c/go/+/495855
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 485500.
Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455.
Change-Id: I426278d400f7611170918fc07c524cb059b9cc55
Reviewed-on: https://go-review.googlesource.com/c/go/+/492995
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Chressie Himpel <chressie@google.com>
|
|
This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and
CL 485316 incorporated.
CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 482975 is a followup fix to a C declaration in testprogcgo.
CL 485315 is a followup fix for x_cgo_getstackbound on Illumos.
CL 485316 is a followup cleanup for ppc64 assembly.
[Original CL 481061 description]
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
[CL 485500 description]
CL 479915 passed the G to _cgo_getstackbound for direct updates to
gp.stack.lo. A G can be reused on a new thread after the previous thread
exited. This could trigger the C TSAN race detector because it couldn't
see the synchronization in Go (lockextra) preventing the same G from
being used on multiple threads at the same time.
We work around this by passing the address of a stack variable to
_cgo_getstackbound rather than the G. The stack is generally unique per
thread, so TSAN won't see the same address from multiple threads. Even
if stacks are reused across threads by pthread, C TSAN should see the
synchonization in the stack allocator.
A regression test is added to misc/cgo/testsanitizer.
Fixes #51676.
Fixes #59294.
Fixes #59678.
Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/485500
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
|
|
This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes #51676.
Fixes #59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes #51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98e5583372e41f65ee4b7ab78b5aff51
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Replace deprecated // +build lines by their respective //go:build line
counterpart. Also remove build constraints implied by file name or type.
Change-Id: I8d18cd40071ca28d7654da8f0d22841f43729ca6
Reviewed-on: https://go-review.googlesource.com/c/go/+/460538
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Fixes #29879
Change-Id: Id2061a5eab67bb90a8116dc4f16073be1c9a09a9
GitHub-Last-Rev: 186863ab6aa9481744f276a7afbd87bd53c9f863
GitHub-Pull-Request: golang/go#29900
Reviewed-on: https://go-review.googlesource.com/c/go/+/159178
Reviewed-by: Philipp Stephani <phst@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This commit add port of runtime/cgo for aix/ppc64.
AIX assembly is different from Linux assembly, therefore gcc_ppc64.S
must be redone for AIX.
Change-Id: I780ebab4ef9c4ce912f4c4d521d8c135b1eebf6e
Reviewed-on: https://go-review.googlesource.com/c/go/+/164002
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Fixes #28095
Change-Id: Id8668d52986c9805213e8847f49fe42dfde2e01a
Reviewed-on: https://go-review.googlesource.com/c/140797
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Currently, threads created by the runtime exist until the whole
program exits. For #14592 and #20395, we want to be able to exit and
clean up threads created by the runtime. This commit implements that
mechanism.
The main difficulty is how to clean up the g0 stack. In cgo mode and
on Solaris and Windows where the OS manages thread stacks, we simply
arrange to return from mstart and let the system clean up the thread.
If the runtime allocated the g0 stack, then we use a new exitThread
syscall wrapper that arranges to clear a flag in the M once the stack
can safely be reaped and call the thread termination syscall.
exitThread is based on the existing exit1 wrapper, which was always
meant to terminate the calling thread. However, exit1 has never been
used since it was introduced 9 years ago, so it was broken on several
platforms. exitThread also has the additional complication of having
to flag that the stack is unused, which requires some tricks on
platforms that use the stack for syscalls.
This still leaves the problem of how to reap the unused g0 stacks. For
this, we move the M from allm to a new freem list as part of the M
exiting. Later, allocm scans the freem list, finds Ms that are marked
as done with their stack, removes these from the list and frees their
g0 stacks. This also allows these Ms to be garbage collected.
This CL does not yet use any of this functionality. Follow-up CLs
will. Likewise, there are no new tests in this CL because we'll need
follow-up functionality to test it.
Change-Id: Ic851ee74227b6d39c6fc1219fc71b45d3004bc63
Reviewed-on: https://go-review.googlesource.com/46037
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
OpenBSD 6.0 and later have support for PT_TLS in ld.so(1). Now that OpenBSD
6.1 has been released, OpenBSD 5.9 is no longer officially supported and Go
can start generating PT_TLS for OpenBSD cgo binaries. This also allows us
to remove the workarounds in the OpenBSD cgo runtime.
This change also removes the environ and progname exports - these are now
provided directly by ld.so(1) itself.
Fixes #19932
Change-Id: I42e75ef9feb5dcd4696add5233497e3cbc48ad52
Reviewed-on: https://go-review.googlesource.com/40331
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Update #18146.
Change-Id: Ib447aabae9f203a8b61fb8c984b57d8e2bfe69c2
Reviewed-on: https://go-review.googlesource.com/33894
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: Ie9e6fda675e560234e90b9022526fd689d770818
Reviewed-on: https://go-review.googlesource.com/23610
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Add support for the context function set by runtime.SetCgoTraceback.
The context function was added in CL 17761, without support.
This CL is the support.
This CL has not been tested for real C code, as a working context
function for C code requires unwind support that does not seem to exist.
I wanted to get the CL out before the freeze.
I apologize for the length of this CL. It's mostly plumbing, but
unfortunately the plumbing is processor-specific.
Change-Id: I8ce11a0de9b3dafcc29efd2649d776e93bff0e90
Reviewed-on: https://go-review.googlesource.com/22508
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The special case was because PPC did not support external linking, but
now it does.
Fixes #10410.
Change-Id: I9b024686e0f03da7a44c1c59b41c529802f16ab0
Reviewed-on: https://go-review.googlesource.com/22372
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This is a subset of https://golang.org/cl/20022 with only the copyright
header lines, so the next CL will be smaller and more reviewable.
Go policy has been single space after periods in comments for some time.
The copyright header template at:
https://golang.org/doc/contribute.html#copyright
also uses a single space.
Make them all consistent.
Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0
Reviewed-on: https://go-review.googlesource.com/20111
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This makes "CGO_ENABLED=0 go list runtime/cgo" work,
which fixes the current cmd/go test failure.
Change-Id: Ia55ce3ba1dbb09f618ae5f4c8547722670360f59
Reviewed-on: https://go-review.googlesource.com/19001
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: Ic9744c7716cdd53f27c6e5874230963e5fff0333
Reviewed-on: https://go-review.googlesource.com/8260
Reviewed-by: Minux Ma <minux@golang.org>
|
|
Technically you must initialize static pthread_mutex_t and
pthread_cond_t variables with the appropriate INITIALIZER macro. In
practice the default initializers are zero anyhow, but it's still good
code hygiene.
Change-Id: I517304b16c2c7943b3880855c1b47a9a506b4bdf
Reviewed-on: https://go-review.googlesource.com/9433
Reviewed-by: David Crawshaw <crawshaw@golang.org>
|
|
Related to issue #10410
For some reason, any non-trivial code in _cgo_wait_runtime_init_done
(even fprintf()) will crash that call.
If anybody has any guess why this is happening, please let me know!
For now, I'm clearing the functions for ppc64, as it's currently not used.
Change-Id: I1b11383aaf4f9f9a16f1fd6606842cfeedc9f0b3
Reviewed-on: https://go-review.googlesource.com/8766
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Srdjan Petrovic <spetrovic@google.com>
|
|
This is Part 2 of the change, see Part 1 here: in https://go-review.googlesource.com/#/c/7692/
Suggested by iant@, we use the library initialization entry point to:
- create a new OS thread and run the "regular" runtime init stack on
that thread
- return immediately from the main (i.e., loader) thread
- at the first CGO invocation, we wait for the runtime initialization
to complete.
The above mechanism is implemented only on linux_amd64. Next step is to
support it on linux_arm. Other platforms don't yet support shared library
compiling/linking, but we intend to use the same strategy there as well.
Change-Id: Ib2c81b1b83bee837134084b75a3beecfb8de6bf4
Reviewed-on: https://go-review.googlesource.com/8094
Run-TryBot: Srdjan Petrovic <spetrovic@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|