go - Fork of Go programming language with my patches.

Age	Commit message (Collapse)	Author
2026-02-18	runtime: tell the race detector about synchronization on mainInitDone	Cherry Mui
	CL 743940 puts an atomic bool initialization-done channel, which is a nice optimization. In race mode, however, the atomic bool does not have race instrumentation, so the race detector doesn't know the happens-before edge between the initialization and the cgo callback. To tell the race detector about this synchronization, just use the channel unconditionally in race mode. Channel operations have the proper race instrumentation. Alternatively, we could explicitly instrument the atomic boolean operations. TODO: write a test. Change-Id: I466b20a46cd39d2bbe2149a9009e1a01b15891e2 Reviewed-on: https://go-review.googlesource.com/c/go/+/746581 Reviewed-by: Ian Lance Taylor <iant@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2026-02-10	runtime: put a bool in front of initialization-done channel	Ian Lance Taylor
	Calls to Go functions from threads not started by Go have to wait for Go initialization to be complete. Before this CL they did this by receiving from a channel that is closed when initialization is done. That works well but introduces a channel operation into cgo calls. This is particularly frustrating because the channel is approximately always closed. This CL adds an atomic bool before the channel, so that in the normal case we are just adding a single locked memory load. We still use the channel as a fallback. For #77522 Change-Id: I8f609bf349bb0f836cefa5f6ad6d0c3c7bbfe5e0 Reviewed-on: https://go-review.googlesource.com/c/go/+/743940 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Ian Lance Taylor <iant@golang.org>
2025-12-11	runtime: make goroutines inherit DIT state, don't lock to OS thread	Roland Shoemaker
	When we first implemented DIT (crypto/subtle.WithDataIndependentTiming), we made it so that enabling DIT on a goroutine would lock that goroutine to its current OS thread. This was done to ensure that the DIT state (which is per-thread) would not leak to other goroutines. We also did not make goroutines inherit the DIT state. This change makes goroutines inherit the DIT state from their parent at creation time. It also removes the OS thread locking when enabling DIT on a goroutine. Instead, we now set the DIT state on the OS thread in the scheduler whenever we switch to a goroutine that has DIT enabled, and we unset it when switching to a goroutine that has DIT disabled. We add a new field to G and M, ditEnabled, to track whether the G wants DIT enabled, and whether the M currently has DIT enabled, respectively. When the scheduler executes a goroutine, it checks these fields and enables/disables DIT on the thread as needed. Additionally, cgocallbackg is updated to check if DIT is enabled when being called from C, and sets the G and M fields accordingly. This ensures that if DIT was enabled/disabled in C, the correct state will be reflected in the Go runtime. The behavior as it currently stands is as follows: - The function passed to crypto/subtle.WithDataIndependentTiming will have DIT enabled. - Any goroutine created within that function will inherit DIT enabled for its lifetime. Any goroutine created from subquent goroutines will also inherit DIT enabled for their lifetimes. - Calling into a C function within from a goroutine with DIT enabled will have DIT enabled. - If the C code disables DIT, the goroutine will have DIT re-enabled when returning to Go. - If the C code enables DIT, the goroutine will have DIT disabled when returning to Go if it was not previously enabled. - Calling back into Go code from C will have DIT enabled if it was enabled when calling into C, or if the C code enabled it. Change-Id: I8e91e6df13bb88e56e1036e0e0e5f04efd8eebd3 Reviewed-on: https://go-review.googlesource.com/c/go/+/726382 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-11-24	runtime/cgo: add support for `any` param and return type	Alexandre Daubois
	When using `any` as param or return type of an exported function, we currently have the error `unrecognized Go type any`. `any` is an alias of `interface{}` which is already supported. This would avoid such change: https://github.com/php/frankenphp/pull/1976 Fixes #76340 Change-Id: I301838ff72e99ae78b035a8eff2405f6a145ed1a GitHub-Last-Rev: 7dfbccfa582bbc6e79ed29677391b9ae81a9b5bd GitHub-Pull-Request: golang/go#76325 Reviewed-on: https://go-review.googlesource.com/c/go/+/720960 Reviewed-by: Mark Freeman <markfreeman@google.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-11-21	runtime: go fmt	Michael Pratt
	Change-Id: I6a6a636cf38ddb1dc6f2170361eb4093b81acdfb Reviewed-on: https://go-review.googlesource.com/c/go/+/722521 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-11-07	runtime/cgo: improve error messages after pointer panic	Ariel Otilibili
	This CL improves the error messages after panics due to the sharing of an unpinned Go pointer (or a pointer to an unpinned Go pointer) between Go and C. This occurs when it is: 1. returned from Go to C (through cgoCheckResult) 2. passed as argument to a C function (through cgoCheckPointer). An unpinned Go pointer refers to a memory location that might be moved or freed by the garbage collector. Therefore: - change the signature of cgoCheckArg (it does the real work behind cgoCheckResult and cgoCheckPointer) - change the signature of cgoCheckUnknownPointer (called by cgoCheckArg for checking unexpected pointers) - introduce cgoFormatErr (it is called by cgoCheckArg and cgoCheckUnknownPointer to format panic error messages) - update the cgo pointer tests (add new tests, and a field errTextRegexp to the struct ptrTest) - remove a loop variable in TestPointerChecks (similar to CL 711640). 1. cgoCheckResult When an unpinned Go pointer (or a pointer to an unpinned Go pointer) is returned from Go to C, 1 package main 2 3 /* 4 #include <stdio.h> 5 6 extern void* GoFoo(); 7 8 static void CFoo() { GoFoo();} 9 / 10 import ( 11 "C" 12 ) 13 14 //export GoFoo 15 func GoFoo() map[int]int { 16 return map[int]int{0: 1,} 17 } 18 19 func main() { 20 C.CFoo(); 21 } This error shows up at runtime: panic: runtime error: cgo result is unpinned Go pointer or points to unpinned Go pointer goroutine 1 [running]: main._Cfunc_CFoo() _cgo_gotypes.go:46 +0x3a main.main() /mnt/tmp/parse.go:20 +0xf exit status 2 GoFoo is the faulty Go function; it is not mentioned in the error message. Moreover the error does not say which kind of pointer caused the panic; for instance, a Go map. Retrieve name and file/line of the Go function, as well as the kind of pointer; use them in the error message: panic: runtime error: /mnt/tmp/parse.go:15: result of Go function GoFoo called from cgo is unpinned Go map or points to unpinned Go map goroutine 1 [running]: main._Cfunc_CFoo() _cgo_gotypes.go:46 +0x3a main.main() /mnt/tmp/parse.go:20 +0xf exit status 2 2. cgoCheckPointer When a pointer to an unpinned Go pointer is passed to a C function, 1 package main 2 3 / 4 #include <stdio.h> 5 void foo(void bar) {} 6 / 7 import "C" 8 import "unsafe" 9 10 func main() { 11 m := map[int]int{0: 1,} 12 C.foo(unsafe.Pointer(&m)) 13 } This error shows up at runtime: panic: runtime error: cgo argument has Go pointer to unpinned Go pointer goroutine 1 [running]: main.main.func1(...) /mnt/tmp/cgomap.go:12 main.main() /mnt/tmp/cgomap.go:12 +0x91 exit status 2 Retrieve kind of pointer; use it in the error message. panic: runtime error: argument of cgo function has Go pointer to unpinned Go map goroutine 1 [running]: main.main.func1(...) /mnt/tmp/cgomap.go:12 main.main() /mnt/tmp/cgomap.go:12 +0x9b exit status 2 Link: https://pkg.go.dev/cmd/cgo#hdr-Passing_pointers Suggested-by: Ian Lance Taylor <iant@golang.org> Fixes #75856 Change-Id: Ia72f01df016feeae0cddb2558ced51a1b07e4486 GitHub-Last-Rev: 76257c7dd7c52a3d88234d43ec7f22bda81edcdd GitHub-Pull-Request: golang/go#75894 Reviewed-on: https://go-review.googlesource.com/c/go/+/711801 Reviewed-by: Funda Secgin <fundasecgin73@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com>
2025-10-22	runtime/cgo: fix cgoCheckArg description	Ariel Otilibili
	The description is misleading: cgoCheckArg is called by both cgoCheckPointer and cgoCheckResult. Mention cgoCheckResult in the cgoCheckArg description. Remove extra spaces between words. For #75856 Change-Id: I6780cda76b5cb7b4f9af3fbaa37a6c5099cc8d7d GitHub-Last-Rev: 531928b679ab138bf9bfe47a1bf6470d582776f3 GitHub-Pull-Request: golang/go#75992 Reviewed-on: https://go-review.googlesource.com/c/go/+/713520 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com>
2025-07-29	internal/abi: move direct/indirect flag from Kind to TFlag	Keith Randall
	This info makes more sense in the flags instead of as a high bit of the kind. This makes kind access simpler because we now don't need to mask anything. Cleaned up most direct field accesses to use methods instead. (reflect making new types is the only remaining direct accessor.) IfaceIndir -> !IsDirectIface everywhere. gocore has been updated to handle the new location. So has delve. TODO: any other tools need updating? Change-Id: I123f97a4d4bdd0bff1641ee7e276d1cc0bd7e8eb Reviewed-on: https://go-review.googlesource.com/c/go/+/681936 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-28	internal/runtime/syscall/windows: factor out code from runtime	qmuntal
	Factor out the code related to doing calls using the Windows stdcall calling convention into a separate package. This will allow us to reuse it in other low-level packages that can't depend on syscall. Updates #51087. Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64,gotip-windows-amd64-longtest,gotip-solaris-amd64 Change-Id: I68640b07091183b50da6bef17406c10a397896e9 Reviewed-on: https://go-review.googlesource.com/c/go/+/689156 Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2025-03-14	runtime: only set isExtraInC if there are no Go frames left	Michael Pratt
	mp.isExtraInC is intended to indicate that this M has no Go frames at all; it is entirely executing in C. If there was a cgocallback to Go and then a cgocall to C, such that the leaf frames are C, that is fine. e.g., traceback can handle this fine with SetCgoTraceback (or by simply skipping the C frames). However, we currently mismanage isExtraInC, unconditionally setting it on return from cgocallback. This means that if there are two levels of cgocallback, we end up running Go code with isExtraInC set. 1. C-created thread calls into Go function 1 (via cgocallback). 2. Go function 1 calls into C function 1 (via cgocall). 3. C function 1 calls into Go function 2 (via cgocallback). 4. Go function 2 returns back to C function 1 (returning via the remainder of cgocallback). 5. C function 1 returns back to Go function 1 (returning via the remainder of cgocall). 6. Go function 1 is now running with mp.isExtraInC == true. The fix is simple; only set isExtraInC on return from cgocallback if there are no more Go frames. There can't be more Go frames unless there is an active cgocall out of the Go frames. Fixes #72870. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Change-Id: I6a6a636c4e7ba75a29639d7036c5af3738033467 Reviewed-on: https://go-review.googlesource.com/c/go/+/658035 Reviewed-by: Cherry Mui <cherryyz@google.com> Commit-Queue: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-11-19	crypto/subtle: add DIT closure	Roland Shoemaker
	Add a new function, WithDataIndependentTiming, which takes a function as an argument, and encloses it with calls to set/unset the DIT PSTATE bit on Arm64. Since DIT is OS thread-local, for the duration of the execution of WithDataIndependentTiming, we lock the goroutine to the OS thread, using LockOSThread. For long running operations, this is likely to not be performant, but we expect this to be tightly scoped around cryptographic operations that have bounded execution times. If locking to the OS thread turns out to be too slow, another option is to add a bit to the g state indicating if a goroutine has DIT enabled, and then have the scheduler enable/disable DIT when scheduling a g. Additionally, we add a new GODEBUG, dataindependenttiming, which allows setting DIT for an entire program. Running a program with dataindependenttiming=1 enables DIT for the program during initialization. In an ideal world PSTATE.DIT would be inherited from the parent thread, so we'd only need to set it in the main thread and then all subsequent threads would inherit the value. While this does happen in the Linux kernel [0], it is not the case for darwin [1]. Rather than add complex logic to only set it on darwin for each new thread, we just unconditionally set it in mstart1 and cgocallbackg1 regardless of the OS. DIT will already impose some overhead, and the cost of setting the bit is only ~two instructions (CALL, MSR), so it should be cheap enough. Fixes #66450 Updates #49702 [0] https://github.com/torvalds/linux/blob/e8bdb3c8be08c9a3edc0a373c0aa8729355a0705/arch/arm64/kernel/process.c#L373 [1] https://github.com/apple-oss-distributions/xnu/blob/8d741a5de7ff4191bf97d57b9f54c2f6d4a15585/osfmk/arm64/status.c#L1666 Change-Id: I78eda691ff9254b0415f2b54770e5850a0179749 Reviewed-on: https://go-review.googlesource.com/c/go/+/598336 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-10-30	runtime: update and restore g0 stack bounds at cgocallback	Cherry Mui
	Currently, at a cgo callback where there is already a Go frame on the stack (i.e. C->Go->C->Go), we require that at the inner Go callback the SP is within the g0's stack bounds set by a previous callback. This is to prevent that the C code switches stack while having a Go frame on the stack, which we don't really support. But this could also happen when we cannot get accurate stack bounds, e.g. when pthread_getattr_np is not available. Since the stack bounds are just estimates based on the current SP, if there are multiple C->Go callbacks with various stack depth, it is possible that the SP of a later callback falls out of a previous call's estimate. This leads to runtime throw in a seemingly reasonable program. This CL changes it to save the old g0 stack bounds at cgocallback, update the bounds, and restore the old bounds at return. So each callback will get its own stack bounds based on the current SP, and when it returns, the outer callback has the its old stack bounds restored. Also, at a cgo callback when there is no Go frame on the stack, we currently always get new stack bounds. We do this because if we can only get estimated bounds based on the SP, and the stack depth varies a lot between two C->Go calls, the previous estimates may be off and we fall out or nearly fall out of the previous bounds. But this causes a performance problem: the pthread API to get accurate stack bounds (pthread_getattr_np) is very slow when called on the main thread. Getting the stack bounds every time significantly slows down repeated C->Go calls on the main thread. This CL fixes it by "caching" the stack bounds if they are accurate. I.e. at the second time Go calls into C, if the previous stack bounds are accurate, and the current SP is in bounds, we can be sure it is the same stack and we don't need to update the bounds. This avoids the repeated calls to pthread_getattr_np. If we cannot get the accurate bounds, we continue to update the stack bounds based on the SP, and that operation is very cheap. On a Linux/AMD64 machine with glibc: name old time/op new time/op delta CgoCallbackMainThread-8 96.4µs ± 3% 0.1µs ± 2% -99.92% (p=0.000 n=10+9) Fixes #68285. Fixes #68587. Change-Id: I3422badd5ad8ff63e1a733152d05fb7a44d5d435 Reviewed-on: https://go-review.googlesource.com/c/go/+/600296 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2024-10-24	runtime: support cgo index into pointer-to-array	Ian Lance Taylor
	We were missing a case for calling a C function with an index into a pointer-to-array. Fixes #70016 Change-Id: I9c74d629e58722813c1aaa0f0dc225a5a64d111b Reviewed-on: https://go-review.googlesource.com/c/go/+/621576 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Ian Lance Taylor <iant@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2024-08-27	runtime: store bp on cgocallback as unsafe.Pointer	Michael Anthony Knyszek
	As of CL 580255, the runtime tracks the frame pointer (or base pointer, bp) when entering syscalls, so that we can use fpTracebackPCs on goroutines that are sitting in syscalls. That CL mostly got things right, but missed one very subtle detail. When calling from Go->C->Go, the goroutine stack performing the calls when returning to Go is free to move around in memory due to growth, shrinking, etc. But upon returning back to C, it needs to restore gp.syscall*, including gp.syscallsp and gp.syscallbp. The way syscallsp currently gets updated is automagically: it's stored as an unsafe.Pointer on the stack so that it shows up in a stack map. If the stack ever moves, it'll get updated correctly. But gp.syscallbp isn't saved to the stack as an unsafe.Pointer, but rather as a uintptr, so it never gets updated! As a result, in rare circumstances, fpTracebackPCs can correctly try to use gp.syscallbp as the starting point for the traceback, but the value is stale. This change fixes the problem by just storing gp.syscallbp to the stack on cgocallback as an unsafe.Pointer, like gp.syscallsp. It also adds a comment documenting this subtlety; the lack of explanation for the unsafe.Pointer type on syscallsp meant this detail was missed -- let's not miss it again in the future. Now, we have a fix, what about a test? Unfortunately, testing this is going to be incredibly annoying because the circumstances under which gp.syscallbp are actually used for traceback are non-deterministic and hard to arrange, especially from within testprogcgo where we don't have export_test.go and can't reach into the runtime. So, instead, add a gp.syscallbp check to reentersyscall and entersyscallblock that mirrors the gp.syscallbp consistency check. This probably causes some miniscule slowdown to the syscall path, but it'll catch the issue without having to actually perform a traceback. Fixes #69085. Change-Id: Iaf771758f1666024b854f5fbe2b2c63cbe35b201 Reviewed-on: https://go-review.googlesource.com/c/go/+/608775 Reviewed-by: Nick Ripley <nick.ripley@datadoghq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-07-23	runtime,internal: move runtime/internal/sys to internal/runtime/sys	David Chase
	Cleanup and friction reduction For #65355. Change-Id: Ia14c9dc584a529a35b97801dd3e95b9acc99a511 Reviewed-on: https://go-review.googlesource.com/c/go/+/600436 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org>
2024-05-23	all: document legacy //go:linkname for modules with ≥2,000 dependents	Russ Cox
	For #67401. Change-Id: I3ae93042dffd0683b7e6d6225536ae667749515b Reviewed-on: https://go-review.googlesource.com/c/go/+/587221 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Russ Cox <rsc@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-05-16	runtime: always update stack bounds on cgocallback	Michael Pratt
	callbackUpdateSystemStack contains a fast path to exit early without update if SP is already within the g0.stack bounds. This is not safe, as a subsequent call may have new stack bounds that only partially overlap the old stack bounds. In this case it is possible to see an SP that is in the old stack bounds, but very close to the bottom of the bounds due to the partial overlap. In that case we're very likely to "run out" of space on the system stack. We only need to do this on extra Ms, as normal Ms have precise bounds defined when we allocated the stack. TSAN annotations are added to x_cgo_getstackbounds because bounds is a pointer into the Go stack. The stack can be reused when an old thread exits and a new thread starts, but TSAN can't see the synchronization there. This isn't a new case, but we are now calling more often. Fixes #62440. Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest Change-Id: I5389050494987b7668d0b317fb92f85e61d798ac Reviewed-on: https://go-review.googlesource.com/c/go/+/584597 Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-04-19	runtime: track frame pointer while in syscall	Michael Anthony Knyszek
	Currently the runtime only tracks the PC and SP upon entering a syscall, but not the FP (BP). This is mainly for historical reasons, and because the tracer (which uses the frame pointer unwinder) does not need it. Until it did, of course, in CL 567076, where the tracer tries to take a stack trace of a goroutine that's in a syscall from afar. It tries to use gp.sched.bp and lots of things go wrong. It really should be using the equivalent of gp.syscallbp, which doesn't exist before this CL. This change introduces gp.syscallbp and tracks it. It also introduces getcallerfp which is nice for simplifying some code. Because we now have gp.syscallbp, we can also delete the frame skip count computation in traceLocker.GoSysCall, because it's now the same regardless of whether frame pointer unwinding is used. Fixes #66889. Change-Id: Ib6d761c9566055e0a037134138cb0f81be73ecf7 Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-nocgo Reviewed-on: https://go-review.googlesource.com/c/go/+/580255 Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2024-04-09	runtime: remove the allocheaders GOEXPERIMENT	Michael Anthony Knyszek
	This change removes the allocheaders, deleting all the old code and merging mbitmap_allocheaders.go back into mbitmap.go. This change also deletes the SetType benchmarks which were already broken in the new GOEXPERIMENT (it's harder to set up than before). We weren't really watching these benchmarks at all, and they don't provide additional test coverage. Change-Id: I135497201c3259087c5cd3722ed3fbe24791d25d Reviewed-on: https://go-review.googlesource.com/c/go/+/567200 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2024-04-02	all: use kind* of abi	qiulaidongfeng
	For #59670 Change-Id: Id66e102f13e529dd041b68ce869026a56f0a1b9b GitHub-Last-Rev: 43aa9376f72bc02a9d86518cdc99494a6b2f8573 GitHub-Pull-Request: golang/go#65564 Reviewed-on: https://go-review.googlesource.com/c/go/+/562298 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Austin Clements <austin@google.com>
2024-03-26	runtime: don't call lockOSThread for every syscall call on Windows	qmuntal
	Windows syscall.SyscallN currently calls lockOSThread for every syscall. This can be expensive and produce unnecessary context switches, especially when the syscall is called frequently under high contention. The lockOSThread was necessary to ensure that cgocall wouldn't reschedule the goroutine to a different M, as the syscall return values are reported back in the M struct. This CL instructs cgocall to copy the syscall return values into the the M that will see the caller on return, so the caller no longer needs to call lockOSThread. Updates #58336. Cq-Include-Trybots: luci.golang.try:gotip-windows-arm64,gotip-windows-amd64-longtest Change-Id: If6644fd111dbacab74e7dcee2afa18ca146735da Reviewed-on: https://go-review.googlesource.com/c/go/+/562915 Reviewed-by: Alex Brainman <alex.brainman@gmail.com> Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com> Reviewed-by: Than McIntosh <thanm@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-04	runtime: use .Pointers() instead of manual checking	Pouriya
	Change-Id: Ib78c1513616089f4942297cd17212b1b11871fd5 GitHub-Last-Rev: f97fe5b5bffffe25dc31de7964588640cb70ec41 GitHub-Pull-Request: golang/go#65819 Reviewed-on: https://go-review.googlesource.com/c/go/+/565515 Reviewed-by: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2023-11-09	runtime: implement experiment to replace heap bitmap with alloc headers	Michael Anthony Knyszek
	This change replaces the 1-bit-per-word heap bitmap for most size classes with allocation headers for objects that contain pointers. The header consists of a single pointer to a type. All allocations with headers are treated as implicitly containing one or more instances of the type in the header. As the name implies, headers are usually stored as the first word of an object. There are two additional exceptions to where headers are stored and how they're used. Objects smaller than 512 bytes do not have headers. Instead, a heap bitmap is reserved at the end of spans for objects of this size. A full word of overhead is too much for these small objects. The bitmap is of the same format of the old bitmap, minus the noMorePtrs bits which are unnecessary. All the objects <512 bytes have a bitmap less than a pointer-word in size, and that was the granularity at which noMorePtrs could stop scanning early anyway. Objects that are larger than 32 KiB (which have their own span) have their headers stored directly in the span, to allow power-of-two-sized allocations to not spill over into an extra page. The full implementation is behind GOEXPERIMENT=allocheaders. The purpose of this change is performance. First and foremost, with headers we no longer have to unroll pointer/scalar data at allocation time for most size classes. Small size classes still need some unrolling, but their bitmaps are small so we can optimize that case fairly well. Larger objects effectively have their pointer/scalar data unrolled on-demand from type data, which is much more compactly represented and results in less TLB pressure. Furthermore, since the headers are usually right next to the object and where we're about to start scanning, we get an additional temporal locality benefit in the data cache when looking up type metadata. The pointer/scalar data is now effectively unrolled on-demand, but it's also simpler to unroll than before; that unrolled data is never written anywhere, and for arrays we get the benefit of retreading the same data per element, as opposed to looking it up from scratch for each pointer-word of bitmap. Lastly, because we no longer have a heap bitmap that spans the entire heap, there's a flat 1.5% memory use reduction. This is balanced slightly by some objects possibly being bumped up a size class, but most objects are not tightly optimized to size class sizes so there's some memory to spare, making the header basically free in those cases. See the follow-up CL which turns on this experiment by default for benchmark results. (CL 538217.) Change-Id: I4c9034ee200650d06d8bdecd579d5f7c1bbf1fc5 Reviewed-on: https://go-review.googlesource.com/c/go/+/437955 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-11-08	runtime: clarify error when returning unpinned pointers	Richard Wang
	With the introduction of runtime.Pinner, returning a pointer to a pinned struct that then points to an unpinned Go pointer is correctly caught. However, the error message remained as "cgo result has Go pointer", which should be updated to acknowledge that Go pointers to pinned memory are allowed. This also updates the comments for cgoCheckArg and cgoCheckResult to similarly clarify. Updates #46787 Change-Id: I147bb09e87dfb70a24d6d43e4cf84e8bcc2aff48 GitHub-Last-Rev: 706facb9f2bf28e1f6e575b7626f8feeca1187cf GitHub-Pull-Request: golang/go#62606 Reviewed-on: https://go-review.googlesource.com/c/go/+/527702 Reviewed-by: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-10-04	runtime: unlock OS thread after cgocallbackg1	Cherry Mui
	For cgo callbacks, currently cgocallbackg locks the OS thread and then call cgocallbackg1, which invokes the actual callback, and then unlocks the OS thread in a deferred call. cgocallback then continues assuming we are on the same M. This assumes there is no preemption point between the deferred unlockOSThread and returning to the caller (cgocallbackg). But this is not always true. E.g. when open defer is not used (e.g. PIE or shared build mode on 386), there is a preemption point in deferreturn after invoking the deferred function (when it checks whether there are still defers to run). Instead of relying on and requiring the defer implementation has no preemption point, we move the unlockOSThread to the caller, and ensuring no preemption by setting incgo to true before unlocking. This doesn't cover the panicking path, so we also adds an unlockOSThread there. There we don't need to worry about preemption, because we're panicking out of the callback and we have unwound the g0 stack, instead of reentering cgo. Fixes #62102. Change-Id: I0e0b9f9091be88d01675c0acb7339b81402545be Reviewed-on: https://go-review.googlesource.com/c/go/+/532615 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-09-19	runtime: set stackguard1 on extra M g0	Michael Pratt
	[This is an unmodified redo of CL 527056.] Standard Ms set g0.stackguard1 to the same value as stackguard0 in mstart0. For consistency, extra Ms should do the same for their g0. Do this in needm -> callbackUpdateSystemStack. Background: getg().stackguard1 is used as the stack guard for the stack growth prolouge in functions marked //go:systemstack [1]. User Gs set stackguard1 to ^uintptr(0) so that the check always fail, calling morestackc, which throws to report a //go:systemstack function call on a user stack. g0 setting stackguard1 is unnecessary for this functionality. 0 would be sufficient, as g0 is always allowed to call //go:systemstack functions. However, since we have the check anyway, setting stackguard1 to the actual stack bound is useful to detect actual stack overflows on g0 (though morestackc doesn't detect this case and would report a misleading message about user stacks). [1] cmd/internal/obj calls //go:systemstack functions AttrCFunc. This is a holdover from when the runtime contained actual C functions. But since CL 2275, it has simply meant "pretend this is a C function, which would thus need to use the system stack". Hence the name morestackc. At this point, this terminology is pretty far removed from reality and should probably be updated to something more intuitive. Change-Id: If315677217354465fbbfbd0d406d79be20db0cc3 Reviewed-on: https://go-review.googlesource.com/c/go/+/527716 Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-09-12	runtime: allow update of system stack bounds on callback from C thread	Michael Pratt
	[This is a redo of CL 525455 with the test fixed on darwin by defining _XOPEN_SOURCE, and disabled with android, musl, and openbsd, which do not provide getcontext.] Since CL 495855, Ms are cached for C threads calling into Go, including the stack bounds of the system stack. Some C libraries (e.g., coroutine libraries) do manual stack management and may change stacks between calls to Go on the same thread. Changing the stack if there is more Go up the stack would be problematic. But if the calls are completely independent there is no particular reason for Go to care about the changing stack boundary. Thus, this CL allows the stack bounds to change in such cases. The primary downside here (besides additional complexity) is that normal systems that do not manipulate the stack may not notice unintentional stack corruption as quickly as before. Note that callbackUpdateSystemStack is written to be usable for the initial setup in needm as well as updating the stack in cgocallbackg. Fixes #62440. For #62130. Change-Id: I0fe0134f865932bbaff1fc0da377c35c013bd768 Reviewed-on: https://go-review.googlesource.com/c/go/+/527715 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-11	Revert "runtime: allow update of system stack bounds on callback from C thread"	Michael Pratt
	This reverts CL 525455. The test fails to build on darwin, alpine, and android. For #62440. Change-Id: I39c6b1e16499bd61e0f166de6c6efe7a07961e62 Reviewed-on: https://go-review.googlesource.com/c/go/+/527317 Auto-Submit: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-11	Revert "runtime: set stackguard1 on extra M g0"	Michael Pratt
	This reverts CL 527056. CL 525455 breaks darwin, alpine, and android. This CL must be reverted in order to revert that CL. For #62440. Change-Id: I4e1b16e384b475a605e0214ca36c918d50faa22c Reviewed-on: https://go-review.googlesource.com/c/go/+/527316 Reviewed-by: Cherry Mui <cherryyz@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-09-11	runtime: set stackguard1 on extra M g0	Michael Pratt
	Standard Ms set g0.stackguard1 to the same value as stackguard0 in mstart0. For consistency, extra Ms should do the same for their g0. Do this in needm -> callbackUpdateSystemStack. Background: getg().stackguard1 is used as the stack guard for the stack growth prolouge in functions marked //go:systemstack [1]. User Gs set stackguard1 to ^uintptr(0) so that the check always fail, calling morestackc, which throws to report a //go:systemstack function call on a user stack. g0 setting stackguard1 is unnecessary for this functionality. 0 would be sufficient, as g0 is always allowed to call //go:systemstack functions. However, since we have the check anyway, setting stackguard1 to the actual stack bound is useful to detect actual stack overflows on g0 (though morestackc doesn't detect this case and would report a misleading message about user stacks). [1] cmd/internal/obj calls //go:systemstack functions AttrCFunc. This is a holdover from when the runtime contained actual C functions. But since CL 2275, it has simply meant "pretend this is a C function, which would thus need to use the system stack". Hence the name morestackc. At this point, this terminology is pretty far removed from reality and should probably be updated to something more intuitive. Change-Id: I8d0e5628ce31ac6a189a7d7a4124be85aef89862 Reviewed-on: https://go-review.googlesource.com/c/go/+/527056 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-09-11	runtime: allow update of system stack bounds on callback from C thread	Michael Pratt
	Since CL 495855, Ms are cached for C threads calling into Go, including the stack bounds of the system stack. Some C libraries (e.g., coroutine libraries) do manual stack management and may change stacks between calls to Go on the same thread. Changing the stack if there is more Go up the stack would be problematic. But if the calls are completely independent there is no particular reason for Go to care about the changing stack boundary. Thus, this CL allows the stack bounds to change in such cases. The primary downside here (besides additional complexity) is that normal systems that do not manipulate the stack may not notice unintentional stack corruption as quickly as before. Note that callbackUpdateSystemStack is written to be usable for the initial setup in needm as well as updating the stack in cgocallbackg. Fixes #62440. For #62130. Change-Id: I7841b056acea1111bdae3b718345a3bd3961b4a8 Reviewed-on: https://go-review.googlesource.com/c/go/+/525455 Run-TryBot: Michael Pratt <mpratt@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-08-25	cmd/cgo: add #cgo noescape/nocallback annotations	doujiang24
	When passing pointers of Go objects from Go to C, the cgo command generate _Cgo_use(pN) for the unsafe.Pointer type arguments, so that the Go compiler will escape these object to heap. Since the C function may callback to Go, then the Go stack might grow/shrink, that means the pointers that the C function have will be invalid. After adding the #cgo noescape annotation for a C function, the cgo command won't generate _Cgo_use(pN), and the Go compiler won't force the object escape to heap. After adding the #cgo nocallback annotation for a C function, which means the C function won't callback to Go, if it do callback to Go, the Go process will crash. Fixes #56378 Change-Id: Ifdca070584e0d349c7b12276270e50089e481f7a GitHub-Last-Rev: f1a17b08b0590eca2670e404bbfedad5461df72f GitHub-Pull-Request: golang/go#60399 Reviewed-on: https://go-review.googlesource.com/c/go/+/497837 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-07-18	all: fix typos	Jes Cok
	Change-Id: I510b0a4bf3472d937393800dd57472c30beef329 GitHub-Last-Rev: 8d289b73a37bd86080936423d981d21e152aaa33 GitHub-Pull-Request: golang/go#60960 Reviewed-on: https://go-review.googlesource.com/c/go/+/505398 Auto-Submit: Robert Findley <rfindley@google.com> Reviewed-by: Robert Findley <rfindley@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Robert Findley <rfindley@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-05-19	runtime: implement Pinner API for object pinning	Sven Anderson
	Some C APIs require the use or structures that contain pointers to buffers (iovec, io_uring, ...). The pointer passing rules would require that these buffers are allocated in C memory and to process this data with Go libraries it would need to be copied. In order to provide a zero-copy way to use these C APIs, this CL implements a Pinner API that allows to pin Go objects, which guarantees that the garbage collector does not move these objects while pinned. This allows to relax the pointer passing rules so that pinned pointers can be stored in C allocated memory or can be contained in Go memory that is passed to C functions. The Pin() method accepts pointers to objects of any type and unsafe.Pointer. Slices and arrays can be pinned by calling Pin() with the pointer to the first element. Pinning of maps is not supported. If the GC collects unreachable Pinner holding pinned objects it panics. If Pin() is called with the other non-pointer types it panics as well. Performance considerations: This change has no impact on execution time on existing code, because checks are only done in code paths, that would panic otherwise. The memory footprint on existing code is one pointer per memory span. Fixes: #46787 Signed-off-by: Sven Anderson <sven@anderson.de> Change-Id: I110031fe789b92277ae45a9455624687bd1c54f2 Reviewed-on: https://go-review.googlesource.com/c/go/+/367296 Auto-Submit: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-05-17	runtime/cgo: store M for C-created thread in pthread key	Cherry Mui
	This reapplies CL 485500, with a fix drafted in CL 492987 incorporated. CL 485500 is reverted due to #60004 and #60007. #60004 is fixed in CL 492743. #60007 is fixed in CL 492987 (incorporated in this CL). [Original CL 485500 description] This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and CL 485316 incorporated. CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go calls by binding the M to the C thread. See below for its description. CL 482975 is a followup fix to a C declaration in testprogcgo. CL 485315 is a followup fix for x_cgo_getstackbound on Illumos. CL 485316 is a followup cleanup for ppc64 assembly. CL 479915 passed the G to _cgo_getstackbound for direct updates to gp.stack.lo. A G can be reused on a new thread after the previous thread exited. This could trigger the C TSAN race detector because it couldn't see the synchronization in Go (lockextra) preventing the same G from being used on multiple threads at the same time. We work around this by passing the address of a stack variable to _cgo_getstackbound rather than the G. The stack is generally unique per thread, so TSAN won't see the same address from multiple threads. Even if stacks are reused across threads by pthread, C TSAN should see the synchonization in the stack allocator. A regression test is added to misc/cgo/testsanitizer. [Original CL 481061 description] This reapplies CL 392854, with the followup fixes in CL 479255, CL 479915, and CL 481057 incorporated. CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go calls by binding the M to the C thread. See below for its description. CL 479255 is a followup fix for a small bug in ARM assembly code. CL 479915 is another followup fix to address C to Go calls after the C code uses some stack, but that CL is also buggy. CL 481057, by Michael Knyszek, is a followup fix for a memory leak bug of CL 479915. [Original CL 392854 description] In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls. So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call. Instead, we only dropm while the C thread exits, so the extra M won't leak. When invoking a Go function from C: Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor. And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits. When returning back to C: Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C. This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows. This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread. For the newly added BenchmarkCGoInCThread, some benchmark results: 1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz [CL 479915 description] Currently, when C calls into Go the first time, we grab an M using needm, which sets m.g0's stack bounds using the SP. We don't know how big the stack is, so we simply assume 32K. Previously, when the Go function returns to C, we drop the M, and the next time C calls into Go, we put a new stack bound on the g0 based on the current SP. After CL 392854, we don't drop the M, and the next time C calls into Go, we reuse the same g0, without recomputing the stack bounds. If the C code uses quite a bit of stack space before calling into Go, the SP may be well below the 32K stack bound we assumed, so the runtime thinks the g0 stack overflows. This CL makes needm get a more accurate stack bound from pthread. (In some platforms this may still be a guess as we don't know exactly where we are in the C stack), but it is probably better than simply assuming 32K. [CL 492987 description] On the first call into Go from a C thread, currently we set the g0 stack's high bound imprecisely based on the SP. With CL 485500, we keep the M and don't recompute the stack bounds when it calls into Go again. If the first call is made when the C thread uses some deep stack, but a subsequent call is made with a shallower stack, the SP may be above g0.stack.hi. This is usually okay as we don't check usually stack.hi. One place where we do check for stack.hi is in the signal handler, in adjustSignalStack. In particular, C TSAN delivers signals on the g0 stack (instead of the usual signal stack). If the SP is above g0.stack.hi, we don't see it is on the g0 stack, and throws. This CL makes it get an accurate stack upper bound with the pthread API (on the platforms where it is available). Also add some debug print for the "handler not on signal stack" throw. Fixes #51676. Fixes #59294. Fixes #59678. Fixes #60007. Change-Id: Ie51c8e81ade34ec81d69fd7bce1fe0039a470776 Reviewed-on: https://go-review.googlesource.com/c/go/+/495855 Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2023-05-11	runtime: move per-type types to internal/abi	David Chase
	Change-Id: I1f031f0f83a94bebe41d3978a91a903dc5bcda66 Reviewed-on: https://go-review.googlesource.com/c/go/+/489276 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-11	runtime: redefine _type to abi.Type; add rtype for methods.	David Chase
	Change-Id: I1c478b704d84811caa209006c657dda82d9c4cf9 Reviewed-on: https://go-review.googlesource.com/c/go/+/488435 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2023-05-10	internal/reflectlite, runtime: move more constants and types into internal/abi	David Chase
	Change-Id: If5da1057ead34eb3e4c7f42bbe6ad3d350b97725 Reviewed-on: https://go-review.googlesource.com/c/go/+/484856 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-10	internal/abi: common up ArrayType	David Chase
	This refactoring is more problematic because the client package wrap abi.Type, thus the self-referential fields within ArrayType need to be downcast to the client wrappers in several places. It's not clear to me this is worthwhile; this CL is for additional comment, before I attempt similar changes for other self-referential types. Change-Id: I41e517e6d851b32560c41676b91b76d7eb17c951 Reviewed-on: https://go-review.googlesource.com/c/go/+/466236 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2023-05-05	internal/abi: refactor (basic) type struct into one definition	David Chase
	This touches a lot of files, which is bad, but it is also good, since there's N copies of this information commoned into 1. The new files in internal/abi are copied from the end of the stack; ultimately this will all end up being used. Change-Id: Ia252c0055aaa72ca569411ef9f9e96e3d610889e Reviewed-on: https://go-review.googlesource.com/c/go/+/462995 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org> Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2023-05-05	Revert "runtime/cgo: store M for C-created thread in pthread key"	Chressie Himpel
	This reverts CL 485500. Reason for revert: This breaks internal tests at Google, see b/280861579 and b/280820455. Change-Id: I426278d400f7611170918fc07c524cb059b9cc55 Reviewed-on: https://go-review.googlesource.com/c/go/+/492995 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Chressie Himpel <chressie@google.com>
2023-04-28	runtime/trace: avoid frame pointer unwinding for events during cgocallbackg	Nick Ripley
	The current mp.incgocallback() logic allows for trace events to be recorded using frame pointer unwinding during cgocallbackg when they shouldn't be. Specifically, mp.incgo will be false during the reentersyscall call at the end. It's possible to crash with tracing enabled because of this, if C code which uses the frame pointer register for other purposes calls into Go. This can be seen, for example, by forcing testprogcgo/trace_unix.c to write a garbage value to RBP prior to calling into Go. We can drop the mp.incgo check, and instead conservatively avoid doing frame pointer unwinding if there is any C on the stack. This is the case if mp.ncgo > 0, or if mp.isextra is true (meaning we're coming from a thread created by C). Rename incgocallback to reflect that we're checking if there's any C on the stack. We can also move the ncgo increment in cgocall closer to where the transition to C happens, which lets us use frame pointer unwinding for the entersyscall event during the first Go-to-C call on a stack, when there isn't yet any C on the stack. Fixes #59830. Change-Id: If178a705a9d38d0d2fb19589a9e669cd982d32cd Reviewed-on: https://go-review.googlesource.com/c/go/+/488755 Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Run-TryBot: Nick Ripley <nick.ripley@datadoghq.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-26	runtime/cgo: store M for C-created thread in pthread key	Michael Pratt
	This reapplies CL 481061, with the followup fixes in CL 482975, CL 485315, and CL 485316 incorporated. CL 481061, by doujiang24 <doujiang24@gmail.com>, speed up C to Go calls by binding the M to the C thread. See below for its description. CL 482975 is a followup fix to a C declaration in testprogcgo. CL 485315 is a followup fix for x_cgo_getstackbound on Illumos. CL 485316 is a followup cleanup for ppc64 assembly. [Original CL 481061 description] This reapplies CL 392854, with the followup fixes in CL 479255, CL 479915, and CL 481057 incorporated. CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go calls by binding the M to the C thread. See below for its description. CL 479255 is a followup fix for a small bug in ARM assembly code. CL 479915 is another followup fix to address C to Go calls after the C code uses some stack, but that CL is also buggy. CL 481057, by Michael Knyszek, is a followup fix for a memory leak bug of CL 479915. [Original CL 392854 description] In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls. So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call. Instead, we only dropm while the C thread exits, so the extra M won't leak. When invoking a Go function from C: Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor. And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits. When returning back to C: Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C. This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows. This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread. For the newly added BenchmarkCGoInCThread, some benchmark results: 1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz [CL 479915 description] Currently, when C calls into Go the first time, we grab an M using needm, which sets m.g0's stack bounds using the SP. We don't know how big the stack is, so we simply assume 32K. Previously, when the Go function returns to C, we drop the M, and the next time C calls into Go, we put a new stack bound on the g0 based on the current SP. After CL 392854, we don't drop the M, and the next time C calls into Go, we reuse the same g0, without recomputing the stack bounds. If the C code uses quite a bit of stack space before calling into Go, the SP may be well below the 32K stack bound we assumed, so the runtime thinks the g0 stack overflows. This CL makes needm get a more accurate stack bound from pthread. (In some platforms this may still be a guess as we don't know exactly where we are in the C stack), but it is probably better than simply assuming 32K. [CL 485500 description] CL 479915 passed the G to _cgo_getstackbound for direct updates to gp.stack.lo. A G can be reused on a new thread after the previous thread exited. This could trigger the C TSAN race detector because it couldn't see the synchronization in Go (lockextra) preventing the same G from being used on multiple threads at the same time. We work around this by passing the address of a stack variable to _cgo_getstackbound rather than the G. The stack is generally unique per thread, so TSAN won't see the same address from multiple threads. Even if stacks are reused across threads by pthread, C TSAN should see the synchonization in the stack allocator. A regression test is added to misc/cgo/testsanitizer. Fixes #51676. Fixes #59294. Fixes #59678. Change-Id: Ic62be31a06ee83568215e875a891df37084e08ca Reviewed-on: https://go-review.googlesource.com/c/go/+/485500 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Michael Pratt <mpratt@google.com>
2023-04-17	Revert "runtime/cgo: store M for C-created thread in pthread key"	Michael Pratt
	This reverts CL 481061. Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers race detection on `g->stacklo` because the synchronization is in Go, which isn't instrumented. For #51676. For #59294. For #59678. Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47 Reviewed-on: https://go-review.googlesource.com/c/go/+/485275 TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2023-04-03	runtime/cgo: store M for C-created thread in pthread key	doujiang24
	This reapplies CL 392854, with the followup fixes in CL 479255, CL 479915, and CL 481057 incorporated. CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go calls by binding the M to the C thread. See below for its description. CL 479255 is a followup fix for a small bug in ARM assembly code. CL 479915 is another followup fix to address C to Go calls after the C code uses some stack, but that CL is also buggy. CL 481057, by Michael Knyszek, is a followup fix for a memory leak bug of CL 479915. [Original CL 392854 description] In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls. So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call. Instead, we only dropm while the C thread exits, so the extra M won't leak. When invoking a Go function from C: Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor. And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits. When returning back to C: Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C. This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows. This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread. For the newly added BenchmarkCGoInCThread, some benchmark results: 1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz [CL 479915 description] Currently, when C calls into Go the first time, we grab an M using needm, which sets m.g0's stack bounds using the SP. We don't know how big the stack is, so we simply assume 32K. Previously, when the Go function returns to C, we drop the M, and the next time C calls into Go, we put a new stack bound on the g0 based on the current SP. After CL 392854, we don't drop the M, and the next time C calls into Go, we reuse the same g0, without recomputing the stack bounds. If the C code uses quite a bit of stack space before calling into Go, the SP may be well below the 32K stack bound we assumed, so the runtime thinks the g0 stack overflows. This CL makes needm get a more accurate stack bound from pthread. (In some platforms this may still be a guess as we don't know exactly where we are in the C stack), but it is probably better than simply assuming 32K. Fixes #51676. Fixes #59294. Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494 Reviewed-on: https://go-review.googlesource.com/c/go/+/481061 Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-31	Revert "runtime/cgo: store M for C-created thread in pthread key"	Cherry Mui
	This reverts CL 392854. Reason for revert: caused #59294, which was derived from google internal tests. The attempted fix of #59294 caused more breakage. Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba Reviewed-on: https://go-review.googlesource.com/c/go/+/481060 Reviewed-by: Heschi Kreinick <heschi@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-24	runtime/cgo: store M for C-created thread in pthread key	doujiang24
	In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls. So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call. Instead, we only dropm while the C thread exits, so the extra M won't leak. When invoking a Go function from C: Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor. And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits. When returning back to C: Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C. This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows. This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread. For the newly added BenchmarkCGoInCThread, some benchmark results: 1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz Fixes #51676 Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c GitHub-Last-Rev: 93dc64ad98e5583372e41f65ee4b7ab78b5aff51 GitHub-Pull-Request: golang/go#51679 Reviewed-on: https://go-review.googlesource.com/c/go/+/392854 Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: thepudds <thepudds1460@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-02-16	runtime: reimplement GODEBUG=cgocheck=2 as a GOEXPERIMENT	Keith Randall
	Move this knob from a binary-startup thing to a build-time thing. This will enable followon optmizations to the write barrier. Change-Id: Ic3323348621c76a7dc390c09ff55016b19c43018 Reviewed-on: https://go-review.googlesource.com/c/go/+/447778 Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-11-18	all: add missing periods in comments	cui fliter
	Change-Id: I69065f8adf101fdb28682c55997f503013a50e29 Reviewed-on: https://go-review.googlesource.com/c/go/+/449757 Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Joedian Reid <joedian@golang.org> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-09-01	Revert "runtime: convert ncgocall to atomic type"	Michael Pratt
	This reverts CL 426075. Reason for revert: Import missing from cgocall.go. Change-Id: Iac17e914045b83da30484dbe2a624cde526fb175 Reviewed-on: https://go-review.googlesource.com/c/go/+/427614 Reviewed-by: Heschi Kreinick <heschi@google.com>