| author | Michael Anthony Knyszek <mknyszek@google.com> | 2021-02-16 23:22:08 +0000 |
|---|---|---|
| committer | Michael Knyszek <mknyszek@google.com> | 2021-02-17 22:03:37 +0000 |
| commit | 7f0d01687e030f21e8bdc36dfd9d5aac3a6f4a71 (patch) | |
| tree | 243d09fcac81bf80a58e1356688c80f54291c235 /design | |
| parent | 329650d4723a558c2b76b81b4995fc5c267e6bc1 (diff) | |
| download | go-x-proposal-7f0d01687e030f21e8bdc36dfd9d5aac3a6f4a71.tar.xz | |
design: add user-configurable memory target
For golang/go#44309.
Change-Id: Ibd2f9bed3a1a1da40b5a3d216ccb1f48c9b64c04
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/292789
Reviewed-by: Michael Pratt <mpratt@google.com>
Diffstat (limited to 'design')
| -rw-r--r-- | design/44309-user-configurable-memory-target.md | 489 | ||||
| -rw-r--r-- | design/44309/exceed-heap-target-high-GOGC.png | bin | 0 -> 41474 bytes | |||
| -rw-r--r-- | design/44309/exceed-heap-target.png | bin | 0 -> 45468 bytes | |||
| -rw-r--r-- | design/44309/heavy-step-alloc-high-heap-target.png | bin | 0 -> 45519 bytes | |||
| -rw-r--r-- | design/44309/high-heap-target.png | bin | 0 -> 45448 bytes | |||
| -rw-r--r-- | design/44309/high-noise-high-heap-target.png | bin | 0 -> 57860 bytes | |||
| -rw-r--r-- | design/44309/low-heap-target.png | bin | 0 -> 44493 bytes | |||
| -rw-r--r-- | design/44309/low-noise-high-heap-target.png | bin | 0 -> 48436 bytes | |||
| -rw-r--r-- | design/44309/step-heap-target.png | bin | 0 -> 43822 bytes | |||
| -rw-r--r-- | design/44309/very-low-heap-target.png | bin | 0 -> 46734 bytes |
10 files changed, 489 insertions, 0 deletions
diff --git a/design/44309-user-configurable-memory-target.md b/design/44309-user-configurable-memory-target.md
new file mode 100644
index 0000000..f1c6a44
--- /dev/null
+++ b/design/44309-user-configurable-memory-target.md
@@ -0,0 +1,489 @@
+# User-configurable memory target
+
+Author: Michael Knyszek
+
+Updated: 16 February 2021
+
+## Background
+
+Issue [#23044](https://golang.org/issue/23044) proposed the addition of some
+kind of API to provide a "minimum heap" size; that is, the minimum heap goal
+that the GC would ever set.
+The purpose of a minimum heap size, as explored in that proposal, is as a
+performance optimization: by preventing the heap from shrinking, each GC cycle
+will get longer as the live heap shrinks further beyond the minimum.
+
+While `GOGC` already provides a way for Go users to trade off GC CPU time and
+heap memory use, the argument against setting `GOGC` higher is that a live heap
+spike is potentially dangerous, since the Go GC will use proportionally more
+memory with a high proportional constant.
+Instead, users (including a [high-profile account by
+Twitch](https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap-26c2462549a2/))
+have resorted to using a heap ballast: a large memory allocation that the Go GC
+includes in its live heap size, but that does not actually take up any resident
+pages, according to the OS.
+This technique thus effectively sets a minimum heap size in the runtime.
+The main disadvantage of this technique is portability.
+It relies on implementation-specific behavior, namely that the runtime will not
+touch that new allocation, thereby preventing the OS from backing that space
+with RAM on Unix-like systems.
+It also relies on the Go GC never scanning that allocation.
+This technique is also platform-specific, because on Windows such an allocation
+would always count as committed.
+
+Today, the Go GC already has a fixed minimum heap size of 4 MiB.
+The reasons for this minimum heap size stem largely from a failure to account
+for alternative GC work sources.
+See [the GC pacer problems meta-issue](https://golang.org/issue/42430) for more
+details.
+The problems are resolved by a [proposed GC pacer
+redesign](https://golang.org/issue/44167).
+
+## Design
+
+I propose the addition of the following API to the `runtime/debug` package:
+
+```go
+// SetMemoryTarget provides a hint to the runtime that it can use at least
+// amount bytes of memory. amount is the sum total of in-use Go-related memory
+// that the Go runtime can measure.
+//
+// That explicitly includes:
+// - Space and fragmentation for goroutine stacks.
+// - Space for runtime structures.
+// - The size of the heap, with fragmentation.
+// - Space for global variables (including dynamically-loaded plugins).
+//
+// And it explicitly excludes:
+// - Read-only parts of the Go program, such as machine instructions.
+// - Any non-Go memory present in the process, such as from C or another
+//   language runtime.
+// - Memory required to maintain OS kernel resources that this process has a
+//   handle to.
+// - Memory allocated via low-level functions in the syscall package, like Mmap.
+//
+// The intuition and goal with this definition is the ability to treat the Go
+// part of any system as a black box: runtime overheads and fragmentation that
+// are otherwise difficult to account for are explicitly included.
+// Anything that is difficult or impossible for the runtime to measure portably
+// is excluded. For these cases, the user is encouraged to monitor these
+// sources for their particular system and update the memory target as
+// necessary.
+//
+// The runtime is free to ignore the hint at any time.
+//
+// In practice, the runtime will use this hint to run the garbage collector
+// less frequently by using up any additional memory up-front. Any memory used
+// beyond that will obey the GOGC trade-off.
+//
+// If the GOGC mechanism is turned off, the hint is always ignored.
+//
+// Note that if the memory target is set higher than the amount of memory
+// available on the system, the Go runtime may attempt to use all that memory,
+// and trigger an out-of-memory condition.
+//
+// An amount of 0 will retract the hint. A negative amount will always be
+// ignored.
+//
+// Returns the old memory target, or -1 if none was previously set.
+func SetMemoryTarget(amount int) int
+```
+
+The design of this feature builds off of the [proposed GC pacer
+redesign](https://golang.org/issue/44167).
+
+I propose we move forward with almost exactly what issue
+[#23044](https://golang.org/issue/23044) proposed, namely exposing the heap
+minimum and making it configurable via a runtime API.
+The behavior of `SetMemoryTarget` is thus analogous to the common (but
+non-standard) Java runtime flag `-Xms` (with Adaptive Size Policy disabled).
+With the GC pacer redesign, smooth behavior here should be straightforward to
+ensure, as the troubles here basically boil down to the "high `GOGC`" issue
+mentioned in that design.
+
+There's one missing piece and that's how to turn the hint (which is memory use)
+into a heap goal.
+Because the heap goal includes both stacks and globals, I propose that we
+compute the heap goal as follows:
+
+```
+Heap goal = amount
+    // These are runtime overheads.
+    - MemStats.GCSys
+    - MemStats.MSpanSys
+    - MemStats.MCacheSys
+    - MemStats.BuckHashSys
+    - MemStats.OtherSys
+    - MemStats.StackSys
+    // Fragmentation.
+    - (MemStats.HeapSys-MemStats.HeapInuse)
+    - (MemStats.StackInuse-(unused portions of stack spans))
+```
+
+What this formula leaves us with is a value that should include:
+1. Stack space that is actually allocated for goroutine stacks,
+1. Global variables (so the part of the binary mapped read-write), and
+1. Heap space allocated for objects.
+These are the three factors that go into determining the `GOGC`-based heap goal
+according to the GC pacer redesign.
+
+Note that while at first it might appear that this definition of the heap goal
+will cause significant jitter in what the heap goal is actually set to, runtime
+overheads and fragmentation tend to be remarkably stable over the lifetime of a
+Go process.
+
+In an ideal world, that would be it, but as the API documentation points out,
+there are a number of sources of memory that are unaccounted for that deserve
+more explanation.
+
+Firstly, there are the read-only parts of the binary, like the instructions
+themselves, but these parts' impact on memory use is murkier since the
+operating system tends to de-duplicate this memory between processes.
+Furthermore, on platforms like Linux, this memory is always evictable, down to
+the last available page.
+As a result, I intentionally ignore that factor here.
+If the size of the binary is a factor, unfortunately it will be up to the user
+to subtract out that size from the amount they pass to `SetMemoryTarget`.
+
+The next source of memory is anything non-Go, such as C code (or, say, a Python
+VM) running in the same process.
+These sources also need to be accounted for by the user because this could be
+absolutely anything, and portably interacting with the large number of different
+possibilities is infeasible.
+Luckily, `SetMemoryTarget` is a run-time API that can be made to respond to
+changes in external memory sources that Go could not possibly be aware of, so
+the API recommends updating the target on-line if need be.
+
+Another source of memory use is kernel memory.
+If the Go process holds onto kernel resources that use memory within the kernel
+itself, those are unaccounted for.
+Unfortunately, while this API tries to avoid situations where the user needs to
+make conservative estimates, this is one such case.
+As far as I know, most systems do not associate kernel memory with a process, so
+querying and reacting to this information is just impossible.
+
+The final source of memory is memory that's created by the Go program, but that
+the runtime isn't necessarily aware of, like explicitly `Mmap`'d memory.
+Theoretically the Go runtime could be aware of this specific case, but this is
+tricky to account for in general given the wide range of options that can be
+passed to `mmap`-like functionality on various platforms.
+Sometimes it's worth accounting for it, sometimes not.
+I believe it's best to leave that up to the user.
+
+To validate the design, I ran several [simulations](#simulations) of this
+implementation.
+In general, the runtime is resilient to a changing heap target (even one that
+changes wildly) but shrinking the heap target significantly has the potential to
+cause GC CPU utilization spikes.
+This is by design: the runtime suddenly has much less runway than it thought
+before the change, so it needs to make that up to reach its goal.
+
+The only issue I found with this formulation is the potential for consistent
+undershoot in the case where the heap size is very small, mostly because we
+place a limit on how late a GC cycle can start.
+I think this is OK, and I don't think we should alter our current setting.
+This choice means that in extreme cases, there may be some missed performance.
+But I don't think it's enough to justify the additional complexity.
+
+### Simulations
+
+These simulations were produced by the same tool as those for the [GC pacer
+redesign](https://github.com/golang/go/issues/44167).
+That is,
+[github.com/mknyszek/pacer-model](https://github.com/mknyszek/pacer-model).
+See the GC pacer design document for a list of caveats and assumptions, as well
+as a description of each subplot, though the plots are mostly straightforward.
+
+**Small heap target.**
+
+In this scenario, we set a fairly small target (around 64 MiB) as a baseline.
+This target is fairly close to what `GOGC` would have picked.
+Mid-way through the scenario, the live heap grows a bit.
+
+
+
+Notes:
+- There's a small amount of overshoot when the live heap size changes, which is
+  expected.
+- The pacer is otherwise resilient to changes in the live heap size.
+
+**Very small heap target.**
+
+In this scenario, we set a fairly small target (around 64 MiB) as a baseline.
+This target is much smaller than what `GOGC` would have picked, since the live
+heap grows to around 5 GiB.
+
+
+
+Notes:
+- `GOGC` takes over very quickly.
+
+**Large heap target.**
+
+In this scenario, we set a fairly large target (around 2 GiB).
+This target is fairly far from what `GOGC` would have picked.
+Mid-way through the scenario, the live heap grows a lot.
+
+
+
+Notes:
+- There's a medium amount of overshoot when the live heap size changes, which is
+  expected.
+- The pacer is otherwise resilient to changes in the live heap size.
+
+**Exceed heap target.**
+
+In this scenario, we set a fairly small target (around 64 MiB) as a baseline.
+This target is fairly close to what `GOGC` would have picked.
+Mid-way through the scenario, the live heap grows enough such that we exit the
+memory target regime and enter the `GOGC` regime.
+
+
+
+Notes:
+- There's a small amount of overshoot when the live heap size changes, which is
+  expected.
+- The pacer is otherwise resilient to changes in the live heap size.
+- The pacer smoothly transitions between regimes.
+
+**Exceed heap target with a high GOGC.**
+
+In this scenario, we set a fairly small target (around 64 MiB) as a baseline.
+This target is fairly close to what `GOGC` would have picked.
+Mid-way through the scenario, the live heap grows enough such that we exit the
+memory target regime and enter the `GOGC` regime.
+The `GOGC` value is set very high.
+
+
+
+Notes:
+- There's a small amount of overshoot when the live heap size changes, which is
+  expected.
+- The pacer is otherwise resilient to changes in the live heap size.
+- The pacer smoothly transitions between regimes.
+
+**Change in heap target.**
+
+In this scenario, the heap target is set mid-way through execution, to around
+256 MiB.
+This target is fairly far from what `GOGC` would have picked.
+The live heap stays constant, meanwhile.
+
+
+
+Notes:
+- The pacer is resilient to changes in the heap target.
+- There's no overshoot.
+
+**Noisy heap target.**
+
+In this scenario, the heap target is set once per GC and is somewhat noisy.
+It swings at most 3% around 2 GiB.
+This target is fairly far from what `GOGC` would have picked.
+Mid-way through, the live heap increases.
+
+
+
+Notes:
+- The pacer is resilient to a noisy heap target.
+- There's expected overshoot when the live heap size changes.
+- GC CPU utilization bounces around slightly.
+
+**Very noisy heap target.**
+
+In this scenario, the heap target is set once per GC and is very noisy.
+It swings at most 50% around 2 GiB.
+This target is fairly far from what `GOGC` would have picked.
+Mid-way through, the live heap increases.
+
+
+
+Notes:
+- The pacer is resilient to a noisy heap target.
+- There's expected overshoot when the live heap size changes.
+- GC CPU utilization bounces around, but not much.
+
+**Large heap target with a change in allocation rate.**
+
+In this scenario, we set a fairly large target (around 2 GiB).
+This target is fairly far from what `GOGC` would have picked.
+Mid-way through the simulation, the application begins to suddenly allocate much
+more aggressively.
+
+
+
+Notes:
+- The pacer is resilient to changes in the live heap size.
+- There's no overshoot.
+- There's a spike in utilization that's consistent with other simulations of the
+  GC pacer.
+- The live heap grows due to floating garbage from the high allocation rate
+  causing each GC cycle to start earlier.
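The switch between the memory target regime and the `GOGC` regime that the "Exceed heap target" scenarios exercise can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the runtime's implementation: `heapGoal`, the `regime` type, and its constants are hypothetical names, the target is assumed to have already been converted into a heap goal, the "live heap" is treated as a single number (in the pacer redesign it also covers stacks and globals), and a negative `gogc` stands in for `GOGC=off`.

```go
package main

import "fmt"

// regime names the mechanism that currently determines the heap goal.
// The type and its values are hypothetical, used only for this sketch.
type regime string

const (
	regimeTarget regime = "memory target"
	regimeGOGC   regime = "GOGC"
	regimeOff    regime = "GOGC off"
)

// heapGoal is a hypothetical helper, not a runtime API.
//   - targetGoal: heap goal already derived from the memory target, in bytes
//     (0 means no target is set).
//   - liveHeap: live heap in bytes.
//   - gogc: the GOGC percentage; a negative value stands for GOGC=off.
func heapGoal(targetGoal, liveHeap uint64, gogc int) (uint64, regime) {
	if gogc < 0 {
		// With GOGC off the hint is ignored and there is no proportional goal.
		return 0, regimeOff
	}
	// Proportional goal: live heap times (1 + GOGC/100).
	gogcGoal := liveHeap + liveHeap*uint64(gogc)/100
	if gogcGoal < targetGoal {
		// The target leaves more room than GOGC would: use it up front.
		return targetGoal, regimeTarget
	}
	// Past the target, the usual GOGC trade-off applies.
	return gogcGoal, regimeGOGC
}

func main() {
	const MiB = 1 << 20
	for _, live := range []uint64{16 * MiB, 32 * MiB, 48 * MiB} {
		goal, r := heapGoal(64*MiB, live, 100)
		fmt.Printf("live=%d MiB goal=%d MiB regime=%s\n", live/MiB, goal/MiB, r)
	}
}
```

With a 64 MiB goal and `GOGC=100`, the sketch stays in the target regime until the live heap reaches half the goal, then hands over to `GOGC`; raising `gogc` moves that hand-over point lower.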
+
+### Interactions with other GC mechanisms
+
+Although listed already in the API documentation, there are a few additional
+details I want to consider.
+
+#### GOGC
+
+The design of the new pacer means that switching between the "memory target"
+regime and the `GOGC` regime (the regimes being defined as the mechanism that
+determines the heap goal) is very smooth.
+While the live heap times `1+GOGC/100` is less than the heap goal set by the
+memory target, we are in the memory target regime.
+Otherwise, we are in the `GOGC` regime.
+Notice that as `GOGC` rises to higher and higher values, the range of the memory
+target regime shrinks.
+At infinity, meaning `GOGC=off`, the memory target regime no longer exists.
+
+Therefore, it's very clear to me that the memory target should be completely
+ignored if `GOGC` is set to "off" or a negative value.
+
+#### Memory limit
+
+If we choose to also adopt an API for setting a memory limit in the runtime, it
+would necessarily always need to override a memory target, though both could
+plausibly be active simultaneously.
+If that memory limit interacts with `GOGC` being set to "off," then the rule of
+the memory target being ignored holds; the memory limit effectively acts like a
+target in that circumstance.
+If the two are set to an equal value, that behavior is virtually identical to
+`GOGC` being set to "off" and *only* a memory limit being set.
+Therefore, we need only check that these two cases behave identically.
+Note, however, that the memory target and the memory limit otherwise define
+different regimes, so they're orthogonal.
+While there's a fairly large gap between the two (relative to `GOGC`), the two
+are easy to separate.
+Where it gets tricky is when they're relatively close, and this case would need
+to be tested extensively.
+
+## Risks
+
+The primary risk with this proposal is adding another "knob" to the garbage
+collector, with `GOGC` famously being the only one.
+Lots of language runtimes provide flags and options that alter the behavior of
+the garbage collector, but when the number of flags gets large, maintaining
+every possible configuration becomes a daunting, if not impossible, task because
+the space of possible configurations explodes with each knob.
+
+This risk is a strong reason to be judicious.
+The bar for new knobs is high.
+
+But there are a few good reasons why this might still be important.
+The truth is, this API already exists, but is considered unsupported and is
+otherwise unmaintained.
+The API exists in the form of heap ballasts, a concept we can thank Hyrum's Law
+for.
+It's already possible for an application to "fool" the garbage collector into
+thinking there's more live memory than there actually is.
+The downside is resizing the ballast is never going to be nearly as reactive as
+the garbage collector itself, because it is at the mercy of the runtime
+managing the user application.
+The simple fact is performance-sensitive Go users are going to write this code
+anyway.
+It is worth noting that unlike a memory maximum, for instance, a memory target
+is purely an optimization.
+On the whole, I suspect it's better for the Go ecosystem for there to be a
+single solution to this problem in the standard library, rather than solutions
+that *by construction* will never be as good.
+
+And I believe we can mitigate some of the risks with "knob explosion."
+The memory target, as defined above, has very carefully specified and limited
+interactions with other (potential) GC knobs.
+Going forward I believe a good criterion for the addition of new knobs should be
+that a knob should only be added if it is *only* fully orthogonal with `GOGC`,
+and nothing else.
+
+## Monitoring
+
+I propose adding a new metric to the `runtime/metrics` package to enable
+monitoring of the memory target, since that is a new value that could change at
+runtime.
+I propose the metric name `/memory/config/target:bytes` for this purpose.
+Additionally, it could be useful for an operator to understand which regime the
+Go application is operating in at any given time.
+We currently expose the `/gc/heap/goal:bytes` metric which could theoretically
+be used to determine this, but because of the dynamic nature of the heap goal in
+this regime, it won't be clear which regime the application is in at-a-glance.
+
+Therefore, I propose adding another metric `/memory/goal:bytes`.
+This metric is analogous to `/gc/heap/goal:bytes` but is directly comparable
+with `/memory/config/target:bytes` (that is, it includes additional overheads
+beyond just what goes into the heap goal, it "converts back").
+When this metric "bottoms out" at a flat line, that should serve as a clear
+indicator that the pacer is in the "target" regime.
+This same metric could be reused for a memory limit in the future, where it will
+"top out" at the limit.
+
+## Documentation
+
+This API has an inherent complexity as it directly influences the behavior of
+the Go garbage collector.
+It also deals with memory accounting, a process that is infamously (and
+unfortunately) difficult to wrap one's head around and get right.
+Effective use of this API will come down to having good documentation.
+
+The documentation will have two target audiences: software developers, and
+systems administrators (referred to as "developers" and "operators,"
+respectively).
+
+For both audiences, it's incredibly important to understand exactly what's
+included and excluded in the memory target.
+That is why it is explicitly broken down in the most visible possible place for
+a developer: the documentation for the API itself.
+For the operator, the `runtime/metrics` metric definition should either
+duplicate this documentation, or point to the API.
+This documentation is important for immediate use and understanding, but API
+documentation is never going to be expressive enough.
+I propose also introducing a new document to the `doc` directory in the Go
+source tree that explains common use-cases, extreme scenarios, and what to
+expect in monitoring in these various situations.
+This document should include a list of known bugs and how they might appear in
+monitoring.
+In other words, it should include a more formal and living version of the [GC
+pacer meta-issue](https://golang.org/issues/42430).
+The hard truth is that memory accounting and GC behavior are always going to
+fall short in some cases, and it's immensely useful to be honest and up-front
+about those cases where they're known, while always striving to do better.
+Like every other document in this directory, it would be a living document that
+will grow as new scenarios are discovered, bugs are fixed, and new functionality
+is made available.
+
+## Alternatives considered
+
+Since this is a performance optimization, it's possible to do nothing.
+But as I mentioned in [the risks section](#risks), I think there's a solid
+justification for doing *something*.
+
+Another alternative I considered was to provide better hooks into the runtime to
+allow users to implement equivalent functionality themselves.
+Today, we provide `debug.SetGCPercent` as well as access to a number of runtime
+statistics.
+Thanks to work done for the `runtime/metrics` package, that information is now
+much more efficiently accessible.
+By exposing just the right metric, one could imagine a background goroutine that
+calls `debug.SetGCPercent` in response to polling for metrics.
+The reason why I ended up discarding this alternative, however, is that it
+forces the user to write code that relies on the implementation details of the
+garbage collector.
+For instance, a reasonable implementation of a memory target using the above
+mechanism would be to make an adjustment each time the heap goal changes.
+What if future GC implementations don't have a heap goal?
+Furthermore, the heap goal needs to be sampled; what if GCs are occurring
+rapidly? Should the runtime expose when a GC ends? What if the new GC design is
+fully incremental, and there is no well-defined notion of "GC end"? Suffice it
+to say that in order to keep Go implementations open to new possibilities, we
+should avoid any behavior that exposes implementation details.
+
+## Go 1 backwards compatibility
+
+This change only adds to the Go standard library's API surface, and is therefore
+Go 1 backwards compatible.
+
+## Implementation
+
+Michael Knyszek will implement this.
+1. Implement in the runtime.
+1. Extend the pacer simulation test suite with this use-case in a variety of
+   configurations.
diff --git a/design/44309/exceed-heap-target-high-GOGC.png b/design/44309/exceed-heap-target-high-GOGC.png
Binary files differ
new file mode 100644
index 0000000..601982c
--- /dev/null
+++ b/design/44309/exceed-heap-target-high-GOGC.png
diff --git a/design/44309/exceed-heap-target.png b/design/44309/exceed-heap-target.png
Binary files differ
new file mode 100644
index 0000000..2a695fa
--- /dev/null
+++ b/design/44309/exceed-heap-target.png
diff --git a/design/44309/heavy-step-alloc-high-heap-target.png b/design/44309/heavy-step-alloc-high-heap-target.png
Binary files differ
new file mode 100644
index 0000000..c57bbcd
--- /dev/null
+++ b/design/44309/heavy-step-alloc-high-heap-target.png
diff --git a/design/44309/high-heap-target.png b/design/44309/high-heap-target.png
Binary files differ
new file mode 100644
index 0000000..b69ae21
--- /dev/null
+++ b/design/44309/high-heap-target.png
diff --git a/design/44309/high-noise-high-heap-target.png b/design/44309/high-noise-high-heap-target.png
Binary files differ
new file mode 100644
index 0000000..9989dce
--- /dev/null
+++ b/design/44309/high-noise-high-heap-target.png
diff --git a/design/44309/low-heap-target.png b/design/44309/low-heap-target.png
Binary files differ
new file mode 100644
index 0000000..6b8739d
--- /dev/null
+++ b/design/44309/low-heap-target.png
diff --git a/design/44309/low-noise-high-heap-target.png b/design/44309/low-noise-high-heap-target.png
Binary files differ
new file mode 100644
index 0000000..66d67ab
--- /dev/null
+++ b/design/44309/low-noise-high-heap-target.png
diff --git a/design/44309/step-heap-target.png b/design/44309/step-heap-target.png
Binary files differ
new file mode 100644
index 0000000..af93ed2
--- /dev/null
+++ b/design/44309/step-heap-target.png
diff --git a/design/44309/very-low-heap-target.png b/design/44309/very-low-heap-target.png
Binary files differ
new file mode 100644
index 0000000..c6a3bdd
--- /dev/null
+++ b/design/44309/very-low-heap-target.png
