path: root/src/runtime/stack.c
Age  Commit message  Author
2014-12-01  [release-branch.go1.4] runtime: fix hang in GC due to shrinkstack vs netpoll race  Russ Cox

««« CL 179680043 / 752cd9199639
runtime: fix hang in GC due to shrinkstack vs netpoll race

During garbage collection, after scanning a stack, we think about shrinking it to reclaim some memory. The shrinking code (called while the world is stopped) checked that the status was Gwaiting or Grunnable and then changed the state to Gcopystack, to essentially lock the stack so that no other GC thread is scanning it. The same locking happens for stack growth (and is more necessary there).

	oldstatus = runtime·readgstatus(gp);
	oldstatus &= ~Gscan;
	if(oldstatus == Gwaiting || oldstatus == Grunnable)
		runtime·casgstatus(gp, oldstatus, Gcopystack); // oldstatus is Gwaiting or Grunnable
	else
		runtime·throw("copystack: bad status, not Gwaiting or Grunnable");

Unfortunately, "stop the world" doesn't stop everything. It stops all normal goroutine execution, but the network polling thread is still blocked in epoll and may wake up. If it does, and it chooses a goroutine to mark runnable, and that goroutine is the one whose stack is shrinking, then it can happen that between readgstatus and casgstatus, the status changes from Gwaiting to Grunnable.

casgstatus assumes that if the status is not what is expected, it is a transient change (like from Gwaiting to Gscanwaiting and back, or like from Gwaiting to Gcopystack and back), and it loops until the status has been restored to the expected value. In this case, the status has changed semi-permanently from Gwaiting to Grunnable - it won't change again until the GC is done and the world can continue, but the GC is waiting for the status to change back. This wedges the program.

To fix, call a special variant of casgstatus that accepts either Gwaiting or Grunnable as valid statuses.

Without the fix but with the extra check+throw in casgstatus, the program below dies in a few seconds (2-10) with GOMAXPROCS=8 on a 2012 Retina MacBook Pro. With the fix, it runs for minutes and minutes.

	package main

	import (
		"io"
		"log"
		"net"
		"runtime"
	)

	func main() {
		const N = 100
		for i := 0; i < N; i++ {
			l, err := net.Listen("tcp", "127.0.0.1:0")
			if err != nil {
				log.Fatal(err)
			}
			ch := make(chan net.Conn, 1)
			go func() {
				var err error
				c1, err := net.Dial("tcp", l.Addr().String())
				if err != nil {
					log.Fatal(err)
				}
				ch <- c1
			}()
			c2, err := l.Accept()
			if err != nil {
				log.Fatal(err)
			}
			c1 := <-ch
			l.Close()
			go netguy(c1, c2)
			go netguy(c2, c1)
			c1.Write(make([]byte, 100))
		}
		for {
			runtime.GC()
		}
	}

	func netguy(r, w net.Conn) {
		buf := make([]byte, 100)
		for {
			bigstack(1000)
			_, err := io.ReadFull(r, buf)
			if err != nil {
				log.Fatal(err)
			}
			w.Write(buf)
		}
	}

	var g int

	func bigstack(n int) {
		var buf [100]byte
		if n > 0 {
			bigstack(n - 1)
		}
		g = int(buf[0]) + int(buf[99])
	}

Fixes #9186.

LGTM=rlh
R=austin, rlh
CC=dvyukov, golang-codereviews, iant, khr, r
https://golang.org/cl/179680043
»»»

TBR=rlh
CC=golang-codereviews
https://golang.org/cl/184030043
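The idea behind the fix (a compare-and-swap that accepts either of two starting statuses) can be sketched in user-level Go. This is only an illustration, not the runtime's code: the status constants, their numeric values, and the function name casToCopystack are all assumptions made for the example, and sync/atomic stands in for the runtime's own atomics.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical status values. The names mirror the commit message;
// the numbers are arbitrary.
const (
	Grunnable uint32 = iota + 1
	Gwaiting
	Gcopystack
)

// casToCopystack is an illustrative stand-in for the "special variant of
// casgstatus". It moves *status to Gcopystack from either Gwaiting or
// Grunnable and returns the status it replaced.
func casToCopystack(status *uint32) uint32 {
	for {
		old := atomic.LoadUint32(status)
		if old != Gwaiting && old != Grunnable {
			panic("copystack: bad status, not Gwaiting or Grunnable")
		}
		// If another thread flipped Gwaiting to Grunnable between the load
		// and the CAS, the CAS fails and the loop re-reads the new value
		// instead of waiting for the old one to come back.
		if atomic.CompareAndSwapUint32(status, old, Gcopystack) {
			return old
		}
	}
}

func main() {
	status := Gwaiting
	// Simulate the netpoll thread marking the goroutine runnable.
	atomic.StoreUint32(&status, Grunnable)
	old := casToCopystack(&status)
	fmt.Println(old == Grunnable, status == Gcopystack)
}
```

The caller gets back the status that was actually replaced, so it can restore the right one after the copy.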
2014-10-29  runtime: fix line number in first stack frame in printed stack trace  Russ Cox
Originally traceback was only used for printing the stack when an unexpected signal came in. In that case, the initial PC is taken from the signal and should be used unaltered. For the callers, the PC is the return address, which might be on the line after the call; we subtract 1 to get to the CALL instruction. Traceback is now used for a variety of things, and for almost all of those the initial PC is a return address, whether from getcallerpc, or gp->sched.pc, or gp->syscallpc. In those cases, we need to subtract 1 from this initial PC, but the traceback code had a hard rule "never subtract 1 from the initial PC", left over from the signal handling days. Change gentraceback to take a flag that specifies whether we are tracing a trap. Change traceback to default to "starting with a return PC", which is the overwhelmingly common case. Add tracebacktrap, like traceback but starting with a trap PC. Use tracebacktrap in signal handlers. Fixes #7690. LGTM=iant, r R=r, iant CC=golang-codereviews https://golang.org/cl/167810044
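The "subtract 1 from a return PC, but not from a trap PC" rule can be illustrated with a toy line table. This is a sketch under assumptions: lineEntry, lineFor, and frameLine are hypothetical names, and the table is a simplification that has nothing to do with the runtime's real pc-to-line encoding.

```go
package main

import "fmt"

// lineEntry says that instructions from startPC up to the next entry's
// startPC belong to the given source line.
type lineEntry struct {
	startPC uintptr
	line    int
}

// lineFor returns the line of the last entry whose startPC is <= pc.
// The table must be sorted by startPC.
func lineFor(table []lineEntry, pc uintptr) int {
	line := 0
	for _, e := range table {
		if e.startPC > pc {
			break
		}
		line = e.line
	}
	return line
}

// frameLine looks up the line for the first frame. A trap PC points at the
// faulting instruction and is used unaltered. A return PC points just past
// the CALL, so 1 is subtracted to land inside the CALL instruction.
func frameLine(table []lineEntry, pc uintptr, trap bool) int {
	if !trap && pc > 0 {
		pc--
	}
	return lineFor(table, pc)
}

func main() {
	// The CALL occupies 0x10..0x14 on line 5; line 6 starts at 0x15,
	// which is also the return address of the call.
	table := []lineEntry{{0x10, 5}, {0x15, 6}}
	fmt.Println(frameLine(table, 0x15, false)) // return PC: line of the call
	fmt.Println(frameLine(table, 0x15, true))  // trap PC: line of the instruction itself
}
```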
2014-10-28  runtime: add GODEBUG invalidptr setting  Russ Cox
Fixes #8861. Fixes #8911. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/165780043
2014-10-08  runtime: zero a few more dead pointers.  Keith Randall
In channels, zeroing of gp.waiting is missed on a closed channel panic. m.morebuf.g is not zeroed. I don't expect the latter causes any problems, but just in case. LGTM=iant R=golang-codereviews, iant CC=golang-codereviews https://golang.org/cl/151610043
2014-10-08  runtime: delay freeing of shrunk stacks until gc is done.  Keith Randall
This change prevents confusion in the garbage collector. The collector wants to make sure that every pointer it finds isn't junk. One of its criteria for junk is that the pointer points into a "free" span. Because the stack shrinker modifies pointers in the heap, there is a race condition between the GC scanner and the shrinker. The GC scanner can see old pointers (pointers to freed stacks). In particular this happens with SudoG.elem pointers. Normally this is not a problem, as pointers into stack spans are ok. But if the freed stack is the last one in its span, the span is marked as "free" instead of "contains stacks". This change makes sure that even if the GC scanner sees an old pointer, the span into which it points is still marked as "contains stacks", and thus the GC doesn't complain about it. This change will make the GC pause a tiny bit slower, as the stack freeing now happens in serial with the mark pause. We could delay the freeing until the mutators start back up, but this is the simplest change for now. TBR=dvyukov CC=golang-codereviews https://golang.org/cl/158750043
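The deferral described above amounts to queuing frees while the collector is running and flushing the queue afterwards. A minimal model, assuming a hypothetical stackFreer type (none of these names come from the runtime, and stacks are represented by plain strings):

```go
package main

import "fmt"

// stackFreer is a hypothetical model of the deferral: while a GC is in
// progress, freed stacks are only queued, so their spans keep looking like
// "contains stacks" to the scanner.
type stackFreer struct {
	gcRunning bool
	pending   []string // stacks queued during the GC
	freed     []string // stacks actually released
}

// free releases the stack immediately unless a GC is running.
func (f *stackFreer) free(stack string) {
	if f.gcRunning {
		f.pending = append(f.pending, stack)
		return
	}
	f.freed = append(f.freed, stack)
}

// finishGC marks the GC as done and releases everything that was queued.
func (f *stackFreer) finishGC() {
	f.gcRunning = false
	f.freed = append(f.freed, f.pending...)
	f.pending = nil
}

func main() {
	f := &stackFreer{gcRunning: true}
	f.free("old stack of g1")
	fmt.Println(len(f.freed), len(f.pending)) // nothing released yet
	f.finishGC()
	fmt.Println(len(f.freed), len(f.pending)) // queue flushed
}
```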
2014-09-30  runtime: fix throwsplit check  Dmitriy Vyukov
Newstack runs on g0, g0->throwsplit is never set. LGTM=rsc R=rsc CC=golang-codereviews, khr https://golang.org/cl/147370043
2014-09-25  cgo: adjust return value location to account for stack copies.  Keith Randall
During a cgo call, the stack can be copied. This copy invalidates the pointer that cgo has into the return value area. To fix this problem, pass the address of the location containing the stack top value (which is in the G struct). For cgo functions which return values, read the stktop before and after the cgo call to compute the adjustment necessary to write the return value. Fixes #8771 LGTM=iant, rsc R=iant, rsc, khr CC=golang-codereviews https://golang.org/cl/144130043
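The adjustment is plain pointer arithmetic: the return slot keeps its distance from the stack top, so it moves by however much the stack top moved. A sketch with made-up addresses (adjustRet and its parameter names are hypothetical, not the cgo runtime's):

```go
package main

import "fmt"

// adjustRet returns the address of the return-value slot after a possible
// stack copy. stktopBefore and stktopAfter are the stack top values read
// before and after the call. Unsigned wraparound makes the sum correct
// whether the stack moved up or down.
func adjustRet(ret, stktopBefore, stktopAfter uintptr) uintptr {
	return ret + (stktopAfter - stktopBefore)
}

func main() {
	// The slot sits 0x10 bytes below a stack top of 0x2000.
	ret := uintptr(0x1ff0)
	fmt.Printf("%#x\n", adjustRet(ret, 0x2000, 0x8000)) // stack moved up
	fmt.Printf("%#x\n", adjustRet(ret, 0x2000, 0x1000)) // stack moved down
	fmt.Printf("%#x\n", adjustRet(ret, 0x2000, 0x2000)) // no copy happened
}
```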
2014-09-24  cmd/cc, cmd/ld, runtime: disallow conservative data/bss objects  Russ Cox

In linker, refuse to write conservative (array of pointers) as the garbage collection type for any variable in the data/bss GC program.

In the linker, attach the Go type to an already-read C declaration during dedup. This gives us Go types for C globals for free as long as the cmd/dist-generated Go code contains the declaration. (Most runtime C declarations have a corresponding Go declaration. Both are bss declarations and so the linker dedups them.)

In cmd/dist, add a few more C files to the auto-Go-declaration list in order to get Go type information for the C declarations into the linker.

In C compiler, mark all non-pointer-containing global declarations and all string data as NOPTR. This allows them to exist in C files without any corresponding Go declaration. Count C function pointers as "non-pointer-containing", since we have no heap-allocated C functions.

In runtime, add NOPTR to the remaining pointer-containing declarations, none of which refer to Go heap objects.

In runtime, also move os.Args and syscall.envs data into runtime-owned variables. Otherwise, in programs that do not import os or syscall, the runtime variables named os.Args and syscall.envs will be missing type information.

I believe that this CL eliminates the final source of conservative GC scanning in non-SWIG Go programs, and therefore...

Fixes #909.

LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/149770043
2014-09-19  runtime: add runtime· prefix to some static variables  Russ Cox
Pure renaming. This will make an upcoming CL have smaller diffs. LGTM=dvyukov, iant R=iant, dvyukov CC=golang-codereviews https://golang.org/cl/142280043
2014-09-17  runtime: free stacks of Gdead goroutines at GC time  Keith Randall
We could probably free the G structures as well, but for the allg list. Leaving that for another day. Fixes #8287 LGTM=rsc R=golang-codereviews, dvyukov, khr, rsc CC=golang-codereviews https://golang.org/cl/145010043
2014-09-17  runtime: print more detail in adjustframe crash  Russ Cox
The logic here is copied from mgc0.c's scanframe. Mostly it is messages although the minsize code is new (and I believe necessary). I am hoping to get more information about the current arm build failures (or, if it's the minsize thing, fix them). TBR=khr R=khr CC=golang-codereviews https://golang.org/cl/143180043
2014-09-16  runtime: use traceback to traverse defer structures  Russ Cox

This makes the GC and the stack copying agree about how to interpret the defer structures. Previously, only the stack copying treated them precisely. This removes an untyped memory allocation and fixes at least three copystack bugs.

To make sure the GC can find the deferred argument frame until it has been copied, keep a Defer on the defer list during its execution.

In addition to making it possible to remove the untyped memory allocation, keeping the Defer on the list fixes two races between copystack and execution of defers (in both gopanic and Goexit). The problem is that once the defer has been taken off the list, a stack copy that happens before the deferred arguments have been copied back to the stack will not update the arguments correctly. The new tests TestDeferPtrsPanic and TestDeferPtrsGoexit (variations on the existing TestDeferPtrs) pass now but failed before this CL.

In addition to those fixes, keeping the Defer on the list helps correct a dangling pointer error during copystack. The traceback routines walk the Defer chain to provide information about where a panic may resume execution. When the executing Defer was not on the Defer chain but instead linked from the Panic chain, the traceback had to walk the Panic chain too. But Panic structs are on the stack and being updated by copystack. Traceback's use of the Panic chain while copystack is updating those structs means that it can follow an updated pointer and find itself reading from the new stack. The new stack is usually all zeros, so it sees an incorrect early end to the chain. The new TestPanicUseStack makes this happen at tip and dies when adjustdefers finds an unexpected argp. The new StackCopyPoison mode causes an earlier bad dereference instead. By keeping the Defer on the list, traceback can avoid walking the Panic chain at all, making it okay for copystack to update the Panics.

We'd have the same problem for any Defers on the stack. There was only one: gopanic's dabort. Since we are not taking the executing Defer off the chain, we can use it to do what dabort was doing, and then there are no Defers on the stack ever, so it is okay for traceback to use the Defer chain even while copystack is executing: copystack cannot modify the Defer chain.

LGTM=khr
R=khr
CC=dvyukov, golang-codereviews, iant, rlh
https://golang.org/cl/141490043
2014-09-12  runtime: tell the truth about BitVector type  Russ Cox
Dmitriy changed all the execution to interpret the BitVector as an array of bytes. Update the declaration and generation of the bitmaps to match, to avoid problems on big-endian machines. LGTM=khr R=khr CC=dvyukov, golang-codereviews https://golang.org/cl/140570044
2014-09-12  runtime: look up arg stackmap for makeFuncStub/methodValueStub during traceback  Russ Cox
makeFuncStub and methodValueStub are used by reflect as generic function implementations. Each call might have different arguments. Extract those arguments from the closure data instead of assuming it is the same each time. Because the argument map is now being extracted from the function itself, we don't need the special cases in reflect.Call anymore, so delete those. Fixes an occasional crash seen when stack copying does not update makeFuncStub's arguments correctly. Will also help make it safe to require stack maps in the garbage collector. Derived from CL 142000044 by khr. LGTM=khr R=khr CC=golang-codereviews https://golang.org/cl/143890044
2014-09-11  runtime: get rid of copyable check - all G frames are copyable.  Keith Randall
Just go ahead and do it, if something is wrong we'll throw. Also rip out cc-generated arg ptr maps, they are useless now. LGTM=rsc R=rsc CC=golang-codereviews https://golang.org/cl/133690045
2014-09-09  runtime: assume precisestack, copystack, StackCopyAlways, ScanStackByFrames  Russ Cox

Commit to stack copying for stack growth.

We're carrying around a surprising amount of cruft from older schemes. I am confident that precise stack scans and stack copying are here to stay.

Delete fallback code for when precise stack info is disabled.
Delete fallback code for when copying stacks is disabled.
Delete fallback code for when StackCopyAlways is disabled.
Delete Stktop chain - there is only one stack segment now.
Delete M.moreargp, M.moreargsize, M.moreframesize, M.cret.
Delete G.writenbuf (unrelated, just dead).
Delete runtime.lessstack, runtime.oldstack.
Delete many amd64 morestack variants.
Delete initialization of morestack frame/arg sizes (shortens split prologue!).

Replace G's stackguard/stackbase/stack0/stacksize/syscallstack/syscallguard/forkstackguard with simple stack bounds (lo, hi).

Update liblink, runtime/cgo for adjustments to G.

LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, iant, r
https://golang.org/cl/137410043
2014-09-08  runtime: let stack copier update Panic structs for us  Russ Cox
It already is updating parts of them; we're just getting lucky retraversing them and not finding much to do. Change argp to a pointer so that it will be updated too. Existing tests break if you apply the change to adjustpanics without also updating the type of argp. LGTM=khr R=khr CC=golang-codereviews https://golang.org/cl/139380043
2014-09-08  runtime: enable StackCopyAlways  Russ Cox
It worked at CL 134660043 on the builders, so I believe it will stick this time. LGTM=bradfitz R=khr, bradfitz CC=golang-codereviews https://golang.org/cl/141280043
2014-09-08  runtime: turn off StackCopyAlways  Russ Cox

windows/amd64 failure: http://build.golang.org/log/1ded5e3ef4bd1226f976e3180772f87e6c918255

# ..\misc\cgo\testso
runtime: copystack: locals size info only for syscall.Syscall
fatal error: split stack not allowed

runtime stack:
runtime.throw(0xa64cc7)
	c:/go/src/runtime/panic.go:395 +0xad fp=0x6fde0 sp=0x6fdb0
runtime.newstack()
	c:/go/src/runtime/stack.c:1001 +0x750 fp=0x6ff20 sp=0x6fde0
runtime.morestack()
	c:/go/src/runtime/asm_amd64.s:306 +0x73 fp=0x6ff28 sp=0x6ff20

goroutine 1 [stack growth, locked to thread]:
runtime.freedefer(0xc0820ce120)
	c:/go/src/runtime/panic.go:162 fp=0xc08201b1a0 sp=0xc08201b198
runtime.deferreturn(0xa69420)
	c:/go/src/runtime/panic.go:211 +0xa8 fp=0xc08201b1e8 sp=0xc08201b1a0
runtime.cgocall_errno(0x498c00, 0xc08201b228, 0x0)
	c:/go/src/runtime/cgocall.go:134 +0x10e fp=0xc08201b210 sp=0xc08201b1e8
syscall.Syscall(0x7786b1d0, 0x2, 0xc0820c85b0, 0xc08201b2d8, 0x0, 0x0, 0x0, 0x0)
	c:/go/src/runtime/syscall_windows.c:74 +0x3c fp=0xc08201b260 sp=0xc08201b210
syscall.findFirstFile1(0xc0820c85b0, 0xc08201b2d8, 0x500000000000000, 0x0, 0x0)
	c:/go/src/syscall/zsyscall_windows.go:340 +0x76 fp=0xc08201b2b0 sp=0xc08201b260
syscall.FindFirstFile(0xc0820c85b0, 0xc08210c500, 0xc0820c85b0, 0x0, 0x0)
	c:/go/src/syscall/syscall_windows.go:907 +0x6a fp=0xc08201b530 sp=0xc08201b2b0
os.openDir(0xc0820b2e40, 0x33, 0x0, 0x0, 0x0)
	c:/go/src/os/file_windows.go:96 +0x110 fp=0xc08201b5e0 sp=0xc08201b530
os.OpenFile(0xc0820b2e40, 0x33, 0x0, 0x0, 0x41, 0x0, 0x0)
	c:/go/src/os/file_windows.go:143 +0x1e9 fp=0xc08201b650 sp=0xc08201b5e0

TBR=khr
CC=golang-codereviews
https://golang.org/cl/138230043
2014-09-08  runtime: enable StackCopyAlways  Russ Cox
Let's see how close we are to this being ready. Will roll back if it breaks any builds in non-trivial ways. LGTM=r, khr R=iant, khr, r CC=golang-codereviews https://golang.org/cl/138200043
2014-09-08  liblink, runtime: diagnose and fix C code running on Go stack  Russ Cox
This CL contains compiler+runtime changes that detect C code running on Go (not g0, not gsignal) stacks, and it contains corrections for what it detected. The detection works by changing the C prologue to use a different stack guard word in the G than Go prologue does. On the g0 and gsignal stacks, that stack guard word is set to the usual stack guard value. But on ordinary Go stacks, that stack guard word is set to ^0, which will make any stack split check fail. The C prologue then calls morestackc instead of morestack, and morestackc aborts the program with a message about running C code on a Go stack. This check catches all C code running on the Go stack except NOSPLIT code. The NOSPLIT code is allowed, so the check is complete. Since it is a dynamic check, the code must execute to be caught. But unlike the static checks we've been using in cmd/ld, the dynamic check works with function pointers and other indirect calls. For example it caught sigpanic being pushed onto Go stacks in the signal handlers. Fixes #8667. LGTM=khr, iant R=golang-codereviews, khr, iant CC=golang-codereviews, r https://golang.org/cl/133700043
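The detection trick relies on the prologue's comparison of the stack pointer against a guard word: with the guard set to ^0, the comparison fails for every possible stack pointer. A toy model, assuming a simplified check of the form "sp <= guard" (gStack, goGuard, cGuard, and splitCheckFails are hypothetical names, not the real G fields or prologue):

```go
package main

import "fmt"

// noCStack is the guard value the commit describes for ordinary Go stacks:
// all bits set, so every stack pointer compares at or below it.
const noCStack = ^uintptr(0)

// gStack is a hypothetical, trimmed-down stand-in for the G struct.
type gStack struct {
	goGuard uintptr // guard word checked by the Go prologue
	cGuard  uintptr // guard word checked by the C prologue
}

// splitCheckFails models a prologue check. The stack grows down, so the
// check fails when sp has reached or passed the guard.
func splitCheckFails(sp, guard uintptr) bool {
	return sp <= guard
}

func main() {
	g0 := gStack{goGuard: 0x1000, cGuard: 0x1000}     // C code allowed here
	user := gStack{goGuard: 0x1000, cGuard: noCStack} // ordinary Go stack
	sp := uintptr(0x4000)

	fmt.Println(splitCheckFails(sp, g0.cGuard))    // C on g0: check passes
	fmt.Println(splitCheckFails(sp, user.goGuard)) // Go on a Go stack: check passes
	fmt.Println(splitCheckFails(sp, user.cGuard))  // C on a Go stack: check always fails
}
```

In the scheme the commit describes, the failing C check leads to morestackc, which aborts the program. NOSPLIT functions have no such check, which is why they are not caught.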
2014-09-08  build: move package sources from src/pkg to src  Russ Cox
Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.
2008-12-05  add support for ref counts to memory allocator.  Russ Cox
mark and sweep, stop the world garbage collector (intermediate step on the way to ref counting). can run pretty with an explicit gc after each file. R=r DELTA=502 (346 added, 143 deleted, 13 changed) OCL=20630 CL=20635
2008-12-04  add stub routines stackalloc() and stackfree().  Russ Cox
run oldstack on g0's stack, just like newstack does, so that oldstack can free the old stack. R=r DELTA=53 (44 added, 0 deleted, 9 changed) OCL=20404 CL=20433