From a24f4db2a2bd3e897d466a11d269ac7e618a6f8a Mon Sep 17 00:00:00 2001 From: Mark Freeman Date: Mon, 12 May 2025 13:59:27 -0400 Subject: internal/pkgbits, cmd/compile/internal/noder: document string section To understand this change, we begin with a short description of the UIR file format. Every file is a header followed by a series of sections. Each section has a kind, which determines the type of elements it contains. An element is just a collection of one or more primitives, as defined by package pkgbits. Strings have their own section. Elements in the string section contain only string primitives. To use a string, elements in other sections encode a reference to the string section. To illustrate, consider a simple file which exports nothing at all. package p In the meta section, there is an element representing a package stub. In that package stub, a string ("p") represents both the path and name of the package. Again, these are encoded as references. To manage references, every element begins with a reference table. Instead of writing the bytes for "p" directly, the package stub encodes an index in this reference table. At that index, a pair of numbers is stored, indicating: 1. which section 2. which element index within the section Effectively, elements always use *2* layers of indirection; first to the reference table, then to the bytes themselves. With some minor hand-waving, an encoding for the above package is given below, with (S)ections, (E)lements and (P)rimitives denoted. + Header | + Section Ends // each section has 1 element | | + 1 // String is elements [0, 1) | | + 2 // Meta is elements [1, 2) | + Element Ends | | + 1 // "p" is bytes [0, 1) | | + 6 // stub is bytes [1, 6) + Payload | + (S) String | | + (E) String | | | + (P) String { byte } 0x70 // "p" | + (S) Meta | | + (E) Package Stub | | | + Reference Table | | | | + (P) Entry Count uvarint 1 // there is a single entry | | | | + (P) 0th Section uvarint 0 // to String, 0th section | | | | + (P) 0th Index uvarint 0 // to 0th element in String | | | + Internals | | | | + (P) Path uvarint 0 // 0th entry in table | | | | + (P) Name uvarint 0 // 0th entry in table Note that string elements do not have reference tables like other elements. They behave more like a primitive. As this is a bit complicated and getting into details of the UIR file format, we omit some details in the documentation here. The structure will become clearer as we continue documenting. Change-Id: I12a5ce9a34251c5358a20f2f2c4d0f9bd497f4d0 Reviewed-on: https://go-review.googlesource.com/c/go/+/671997 Reviewed-by: Robert Griesemer Auto-Submit: Mark Freeman TryBot-Bypass: Mark Freeman --- src/cmd/compile/internal/noder/doc.go | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) (limited to 'src/cmd/compile/internal/noder') diff --git a/src/cmd/compile/internal/noder/doc.go b/src/cmd/compile/internal/noder/doc.go index 5509b0001a..24590107c2 100644 --- a/src/cmd/compile/internal/noder/doc.go +++ b/src/cmd/compile/internal/noder/doc.go @@ -20,7 +20,7 @@ The payload is a series of sections. Each section has a kind which determines its index in the series. SectionKind = Uint64 . -Payload = SectionString // TODO(markfreeman) Define. +Payload = SectionString SectionMeta SectionPosBase // TODO(markfreeman) Define. SectionPkg // TODO(markfreeman) Define. @@ -40,6 +40,12 @@ accessed using an index relative to the start of the section. // TODO(markfreeman): Rename to SectionIndex. RelIndex = Uint64 . +## String Section +String values are stored as elements in the string section. Elements outside +the string section access string values by reference. + +SectionString = { String } . + ## Meta Section The meta section provides fundamental information for a package. It contains exactly two elements — a public root and a private root. -- cgit v1.3