| author | Michael Pratt <mpratt@google.com> | 2022-02-04 17:15:28 -0500 |
|---|---|---|
| committer | Michael Pratt <mpratt@google.com> | 2022-02-15 15:40:35 +0000 |
| commit | 0a5fae2a0e965024f692b95f7e857904a274fcb6 (patch) | |
| tree | 393819d9f85f5be1f54bc480f7c6763859bc8997 /src/syscall/syscall_linux_test.go | |
| parent | 0b321c9a7c0055dfd3f875dea930a28690659211 (diff) | |
runtime, syscall: reimplement AllThreadsSyscall using only signals.

In issue 50113, we see that a thread blocked in a system call can result
in a hang of AllThreadsSyscall. To resolve this, we must send a signal
to these threads to knock them out of the system call long enough to run
the per-thread syscall.

Stepping back, if we need to send signals anyway, it should be possible
to implement this entire mechanism on top of signals. This CL does so,
vastly simplifying the mechanism, both as a direct result of
newly-unnecessary code as well as some ancillary simplifications to make
things simpler to follow.

Major changes:

* The rest of the mechanism is moved to os_linux.go, with fields in mOS
  instead of m itself.
* 'Fixup' fields and functions are renamed to 'perThreadSyscall' so they
  are more precise about their purpose.
* Rather than getting passed a closure, doAllThreadsSyscall takes the
  syscall number and arguments. This avoids a lot of hairy behavior:
  * The closure may potentially only be live in fields in the M,
    hidden from the GC. Not necessary with no closure.
  * The need to loan out the race context. A direct RawSyscall6 call
    does not require any race context.
  * The closure previously conditionally panicked in strange
    locations, like a signal handler. Now we simply throw.
* All manual fixup synchronization with mPark, sysmon, templateThread,
  sigqueue, etc is gone. The core approach is much simpler:
  doAllThreadsSyscall sends a signal to every thread in allm, which
  executes the system call from the signal handler. We use (SIGRTMIN +
  1), aka SIGSETXID, the same signal used by glibc for this purpose. As
  such, we are careful to only handle this signal on non-cgo binaries.
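
The core approach in the list above can be modelled outside the runtime. What follows is a toy sketch, not the runtime's code: goroutines stand in for threads, and `call`, `execute`, and `runOnAll` are hypothetical names introduced only for illustration. It shows the operation described by a number plus plain arguments (no closure), run first on the caller and then on every worker, with each worker's result checked against the caller's. How the real runtime reacts to a mismatch is not modelled here; the sketch simply returns an error.

```go
package main

import (
	"fmt"
	"sync"
)

// call describes an operation by number and plain arguments, with no
// closure state. Hypothetical type, for illustration only.
type call struct {
	num  int
	args [3]int
}

// execute is a hypothetical stand-in for a raw system call: it depends
// only on the values in c.
func execute(c call) (int, error) {
	switch c.num {
	case 1: // toy operation: sum the three arguments
		return c.args[0] + c.args[1] + c.args[2], nil
	default:
		return 0, fmt.Errorf("unknown call number %d", c.num)
	}
}

// runOnAll runs c on the caller first. If that fails, the error is
// returned and no worker is asked to run it. Otherwise every worker
// runs the same call and its result is compared with the caller's.
func runOnAll(workers int, c call) (int, error) {
	want, err := execute(c)
	if err != nil {
		return 0, err
	}

	results := make([]int, workers)
	errs := make([]error, workers)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) { // each goroutine plays the role of one thread
			defer wg.Done()
			results[i], errs[i] = execute(c)
		}(i)
	}
	wg.Wait()

	for i := range results {
		if errs[i] != nil {
			return 0, fmt.Errorf("worker %d failed: %v", i, errs[i])
		}
		if results[i] != want {
			return 0, fmt.Errorf("worker %d returned %d, want %d", i, results[i], want)
		}
	}
	return want, nil
}

func main() {
	r, err := runOnAll(4, call{num: 1, args: [3]int{1, 2, 3}})
	fmt.Println(r, err) // prints "6 <nil>"
}
```

Because `call` carries only values, nothing in it needs to be kept alive for a garbage collector or shared through captured state, which is the property the third bullet attributes to dropping the closure.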

Synchronization with thread creation is a key part of this CL. The
comment near the top of doAllThreadsSyscall describes the required
synchronization semantics and how they are achieved.

Note that current use of allocmLock protects the state mutations of allm
that are also protected by sched.lock. allocmLock is used instead of
sched.lock simply to avoid holding sched.lock for so long.

Fixes #50113

Change-Id: Ic7ea856dc66cf711731540a54996e08fc986ce84
Reviewed-on: https://go-review.googlesource.com/c/go/+/383434
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Diffstat (limited to 'src/syscall/syscall_linux_test.go')
| -rw-r--r-- | src/syscall/syscall_linux_test.go | 71 |
1 files changed, 71 insertions, 0 deletions
diff --git a/src/syscall/syscall_linux_test.go b/src/syscall/syscall_linux_test.go
index 8d828be015..0444b64266 100644
--- a/src/syscall/syscall_linux_test.go
+++ b/src/syscall/syscall_linux_test.go
@@ -15,6 +15,7 @@ import (
 	"sort"
 	"strconv"
 	"strings"
+	"sync"
 	"syscall"
 	"testing"
 	"unsafe"
@@ -565,3 +566,73 @@ func TestSetuidEtc(t *testing.T) {
 		}
 	}
 }
+
+// TestAllThreadsSyscallError verifies that errors are properly returned when
+// the syscall fails on the original thread.
+func TestAllThreadsSyscallError(t *testing.T) {
+	// SYS_CAPGET takes pointers as the first two arguments. Since we pass
+	// 0, we expect to get EFAULT back.
+	r1, r2, err := syscall.AllThreadsSyscall(syscall.SYS_CAPGET, 0, 0, 0)
+	if err == syscall.ENOTSUP {
+		t.Skip("AllThreadsSyscall disabled with cgo")
+	}
+	if err != syscall.EFAULT {
+		t.Errorf("AllThreadSyscall(SYS_CAPGET) got %d, %d, %v, want err %v", r1, r2, err, syscall.EFAULT)
+	}
+}
+
+// TestAllThreadsSyscallBlockedSyscall confirms that AllThreadsSyscall
+// can interrupt threads in long-running system calls. This test will
+// deadlock if this doesn't work correctly.
+func TestAllThreadsSyscallBlockedSyscall(t *testing.T) {
+	if _, _, err := syscall.AllThreadsSyscall(syscall.SYS_PRCTL, PR_SET_KEEPCAPS, 0, 0); err == syscall.ENOTSUP {
+		t.Skip("AllThreadsSyscall disabled with cgo")
+	}
+
+	rd, wr, err := os.Pipe()
+	if err != nil {
+		t.Fatalf("unable to obtain a pipe: %v", err)
+	}
+
+	// Perform a blocking read on the pipe.
+	var wg sync.WaitGroup
+	ready := make(chan bool)
+	wg.Add(1)
+	go func() {
+		data := make([]byte, 1)
+
+		// To narrow the window we have to wait for this
+		// goroutine to block in read, synchronize just before
+		// calling read.
+		ready <- true
+
+		// We use syscall.Read directly to avoid the poller.
+		// This will return when the write side is closed.
+		n, err := syscall.Read(int(rd.Fd()), data)
+		if !(n == 0 && err == nil) {
+			t.Errorf("expected read to return 0, got %d, %s", n, err)
+		}
+
+		// Clean up rd and also ensure rd stays reachable so
+		// it doesn't get closed by GC.
+		rd.Close()
+		wg.Done()
+	}()
+	<-ready
+
+	// Loop here to give the goroutine more time to block in read.
+	// Generally this will trigger on the first iteration anyway.
+	pid := syscall.Getpid()
+	for i := 0; i < 100; i++ {
+		if id, _, e := syscall.AllThreadsSyscall(syscall.SYS_GETPID, 0, 0, 0); e != 0 {
+			t.Errorf("[%d] getpid failed: %v", i, e)
+		} else if int(id) != pid {
+			t.Errorf("[%d] getpid got=%d, want=%d", i, id, pid)
+		}
+		// Provide an explicit opportunity for this goroutine
+		// to change Ms.
+		runtime.Gosched()
+	}
+	wr.Close()
+	wg.Wait()
+}
