| Age | Commit message (Collapse) | Author |
|
We did not set a limit on the maximum size of sparse maps in
the old GNU sparse format. Set a limit based on the cumulative
size of the extension blocks used to encode the map (consistent
with how we limit the sparse map size for other formats).
Add an additional limit to the total number of sparse file entries,
regardless of encoding, to all sparse formats.
Thanks to Colin Walters (walters@verbum.org),
Uuganbayar Lkhamsuren (https://github.com/uug4na),
and Jakub Ciolek for reporting this issue.
Fixes #78301
Fixes CVE-2026-32288
Change-Id: I84877345d7b41cc60c58771860ba70e16a6a6964
Reviewed-on: https://go-internal-review.googlesource.com/c/go/+/3901
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/763766
Auto-Submit: David Chase <drchase@google.com>
TryBot-Bypass: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Jakub Ciolek <jakub@ciolek.dev>
|
|
Sparse files in tar archives contain only the non-zero components
of the file. There are several different encodings for sparse
files. When reading GNU tar pax 1.0 sparse files, archive/tar did
not set a limit on the size of the sparse region data. A malicious
archive containing a large number of sparse blocks could cause
archive/tar to read an unbounded amount of data from the archive
into memory.
Since a malicious input can be highly compressable, a small
compressed input could cause very large allocations.
Cap the size of the sparse block data to the same limit used
for PAX headers (1 MiB).
Thanks to Harshit Gupta (Mr HAX) (https://www.linkedin.com/in/iam-harshit-gupta/)
for reporting this issue.
Fixes CVE-2025-58183
Fixes #75677
Change-Id: I70b907b584a7b8676df8a149a1db728ae681a770
Reviewed-on: https://go-internal-review.googlesource.com/c/go/+/2800
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Nicholas Husin <husin@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/709861
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Bypass: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
|
|
Change-Id: I0e55dd68d92c39aba511b55368bf50d929d75f86
GitHub-Last-Rev: 17430140783db8bf3354304c8f28d6826186c6cb
GitHub-Pull-Request: golang/go#66158
Reviewed-on: https://go-review.googlesource.com/c/go/+/569696
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Change-Id: I813aa09f8a65936796469fa637d0f23004d26098
Reviewed-on: https://go-review.googlesource.com/c/go/+/534757
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: shuang cui <imcusg@gmail.com>
|
|
Allow GODEBUG users to report how many times a setting
resulted in non-default behavior.
Record non-default-behaviors for all existing GODEBUGs.
Also rework tests to ensure that runtime is in sync with runtime/metrics.All,
and generate docs mechanically from metrics.All.
For #56986.
Change-Id: Iefa1213e2a5c3f19ea16cd53298c487952ef05a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/453618
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
These are mentioned in the release notes but not the actual doc comments.
Nothing should exist only in release notes.
Change-Id: I8d10f25a2c9b2677231929ba3f393af9034b777b
Reviewed-on: https://go-review.googlesource.com/c/go/+/462195
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
|
|
CL 452616 disables path security checks by default, enabling them
only when GODEBUG=tarinsecurepath=0 or GODEBUG=zipinsecurepath=0
is set. Remove now-obsolete documenation of the path checks.
For #55356
Change-Id: I4ae57534efe9e27368d5e67773a502dd0e56eff4
Reviewed-on: https://go-review.googlesource.com/c/go/+/458875
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
|
|
Change-Id: I4cff6b2a1fed6acdf754539c3c53a61eaa3b3f84
Reviewed-on: https://go-review.googlesource.com/c/go/+/450176
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This change is being made late in the release cycle.
Disable it by default. Insecure path checks may be enabled by setting
GODEBUG=tarinsecurepath=0 or GODEBUG=zipinsecurepath=0.
We can enable this by default in Go 1.21 after publicizing the change
more broadly and giving users a chance to adapt to the change.
For #55356.
Change-Id: I549298b3c85d6c8c7fd607c41de1073083f79b1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/452616
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
|
|
Add GODEBUG=tarinsecurepath=1 and GODEBUG=zipinsecurepath=1 settings
to disable file name validation.
For #55356.
Change-Id: Iaacdc629189493e7ea3537a81660215a59dd40a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/452495
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Change-Id: I69065f8adf101fdb28682c55997f503013a50e29
Reviewed-on: https://go-review.googlesource.com/c/go/+/449757
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Joedian Reid <joedian@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
|
|
Return a distinguishable error when reading an archive file
with a path that is:
- absolute
- escapes the current directory (../a)
- on Windows, a reserved name such as NUL
Users may ignore this error and proceed if they do not need name
sanitization or intend to perform it themselves.
Fixes #25849
Fixes #55356
Change-Id: Ieefa163f00384bc285ab329ea21a6561d39d8096
Reviewed-on: https://go-review.googlesource.com/c/go/+/449937
Reviewed-by: Joseph Tsai <joetsai@digital-static.net>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
|
|
Set a 1MiB limit on special file blocks (PAX headers, GNU long names,
GNU link names), to avoid reading arbitrarily large amounts of data
into memory.
Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting
this issue.
Fixes CVE-2022-2879
For #54853
Change-Id: I85136d6ff1e0af101a112190e027987ab4335680
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/1565555
Reviewed-by: Tatiana Bradley <tatianabradley@google.com>
Run-TryBot: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/439355
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Roland Shoemaker <roland@golang.org>
|
|
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]
Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.
For #51082.
Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Many of the methods inside the archive/tar package don't need to be
exported. Doing so sets a bad precedent that it's OK to export methods
to indicate an internal public API. That's not a good idea in general,
because exported methods increase cognitive load when reading code:
the reader needs to consider whether the exported method might be used
via some external interface or reflection.
This CL should have no externally visible behaviour changes at all.
Change-Id: I94a63de5e6a28e9ac8a283325217349ebce4f308
Reviewed-on: https://go-review.googlesource.com/c/go/+/341410
Reviewed-by: Joe Tsai <joetsai@digital-static.net>
Trust: Joe Tsai <joetsai@digital-static.net>
Trust: Michael Knyszek <mknyszek@google.com>
|
|
The old ioutil references are still valid, but update our code
to reflect best practices and get used to the new locations.
Code compiled with the bootstrap toolchain
(cmd/asm, cmd/dist, cmd/compile, debug/elf)
must remain Go 1.4-compatible and is excluded.
Also excluded vendored code.
For #41190.
Change-Id: I6d86f2bf7bc37a9d904b6cee3fe0c7af6d94d5b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/263142
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
|
|
After golang.org/cl/210124, I wondered if the same error had gone
unnoticed elsewhere. I quickly spotted another dozen mistakes after
reading through the output of:
git grep '\<[Aa]n [bcdfgjklmnpqrtvwyz][a-z]'
Many results are false positives for acronyms like "an mtime", since
it's pronounced "an em-time". However, the total amount of output isn't
that large given how simple the grep pattern is.
Change-Id: Iaa2ca69e42f4587a9e3137d6c5ed758887906ca6
Reviewed-on: https://go-review.googlesource.com/c/go/+/210678
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Zach Jones <zachj1@gmail.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
CL 14624 introduced this label. At that time,
the switch-case had a break to label statement which made this necessary.
But now, the code no longer has a break statement and it directly returns.
Hence, it is no longer necessary to have a label.
Change-Id: Idde0fcc4d2db2d76424679f5acfe33ab8573bce4
Reviewed-on: https://go-review.googlesource.com/96935
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
|
|
Change-Id: I1f25b11fb9b7cd3c09968ed99913dc85db2025ef
Reviewed-on: https://go-review.googlesource.com/94976
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change Reader to promote TypeRegA to TypeReg in headers, unless their
name have a trailing slash which is already promoted to TypeDir. This
will allow client code to handle just TypeReg instead both TypeReg and
TypeRegA.
Change Writer to promote TypeRegA to TypeReg or TypeDir in the headers
depending on whether the name has a trailing slash. This normalization
is motivated by the specification (in pax(1)):
0 represents a regular file. For backwards-compatibility, a
typeflag value of binary zero ( '\0' ) should be recognized as
meaning a regular file when extracting files from the
archive. Archives written with this version of the archive file
format create regular files with a typeflag value of the
ISO/IEC 646:1991 standard IRV '0'.
Fixes #22768.
Change-Id: I149ec55824580d446cdde5a0d7a0457ad7b03466
Reviewed-on: https://go-review.googlesource.com/85656
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Several usages of tar (reasonably) just use the Header.FileInfo
to determine the type of the header. However, the os.FileMode type
is not expressive enough to represent "files" that are not files
at all, but some form of metadata.
Thus, Header{Typeflag: TypeXGlobalHeader}.FileInfo().Mode().IsRegular()
reports true, even though the expected result may have been false.
To reduce (not eliminate) the possibility of failure for such usages,
use the placeholder filename from the global PAX headers.
Thus, in the event the user did not handle special "meta" headers
specifically, they will just be written to disk as a regular file.
As an example use case, the "git archive --format=tgz" command produces
an archive where the first "file" is a global PAX header with the
name "global_pax_header". For users that do not explicitly check
the Header.Typeflag field to ignore such headers, they may end up
extracting a file named "global_pax_header". While it is a bogus file,
it at least does not stop the extraction process.
Updates #22748
Change-Id: I28448b528dcfacb4e92311824c33c71b482f49c9
Reviewed-on: https://go-review.googlesource.com/78355
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This CL removes the following APIs:
type SparseEntry struct{ ... }
type Header struct{ SparseHoles []SparseEntry; ... }
func (*Header) DetectSparseHoles(f *os.File) error
func (*Header) PunchSparseHoles(f *os.File) error
func (*Reader) WriteTo(io.Writer) (int, error)
func (*Writer) ReadFrom(io.Reader) (int, error)
This API was added during the Go1.10 dev cycle, and are safe to remove.
The rationale for reverting is because Header.DetectSparseHoles and
Header.PunchSparseHoles are functionality that probably better belongs in
the os package itself.
The other API like Header.SparseHoles, Reader.WriteTo, and Writer.ReadFrom
perform no OS specific logic and only perform the actual business logic of
reading and writing sparse archives. Since we do know know what the API added to
package os may look like, we preemptively revert these non-OS specific changes
as well by simply commenting them out.
Updates #13548
Updates #22735
Change-Id: I77842acd39a43de63e5c754bfa1c26cc24687b70
Reviewed-on: https://go-review.googlesource.com/78030
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Existing methods regFileReader.LogicalRemaining and regFileReader.PhysicalRemaining have inconsistent reciever names with the previous name
Change-Id: Ief2024716737eaf482c4311f3fdf77d92801c36e
Reviewed-on: https://go-review.googlesource.com/76430
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
|
|
The USTAR format says:
<<<
Implementors should be aware that the previous file format did not include
a mechanism to archive directory type files.
For this reason, the convention of using a filename ending with
<slash> was adopted to specify a directory on the archive.
>>>
In light of this suggestion, make the following changes:
* Writer.WriteHeader refuses to encode a header where a file that
is obviously a file-type has a trailing slash in the name.
* formatter.formatString avoids encoding a trailing slash in the event
that the string is truncated (the full string will be encoded elsewhere,
so stripping the slash is safe).
* Reader.Next treats a TypeRegA (which is the zero value of Typeflag)
as a TypeDir if the name has a trailing slash.
Change-Id: Ibf27aa8234cce2032d92e5e5b28546c2f2ae5ef6
Reviewed-on: https://go-review.googlesource.com/69293
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
s/TypeSymLink/TypeSymlink/g
Change-Id: I2550843248eb27d90684d0036fe2add0b247ae5a
Reviewed-on: https://go-review.googlesource.com/67810
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The interfaces for io.Reader and io.Writer permit calling Read/Write
with an empty buffer. However, this condition is often not well tested
and can lead to bugs in various implementations of io.Reader and io.Writer.
For example, see #22028 for buggy io.Reader in the bzip2 package.
We reduce the likelihood of hitting these bugs by adjusting
regFileReader.Read and regFileWriter.Write to avoid performing
Read and Write calls when the buffer is known to be empty.
Fixes #22029
Change-Id: Ie4a26be53cf87bc4d2abd951fa005db5871cc75c
Reviewed-on: https://go-review.googlesource.com/66111
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
To support the efficient packing and extracting of sparse files,
add two new methods:
func Reader.WriteTo(io.Writer) (int64, error)
func Writer.ReadFrom(io.Reader) (int64, error)
If the current archive entry is sparse and the provided io.{Reader,Writer}
is also an io.Seeker, then use Seek to skip past the holes.
If the last region in a file entry is a hole, then we seek to 1 byte
before the EOF:
* for Reader.WriteTo to write a single byte
to ensure that the resulting filesize is correct.
* for Writer.ReadFrom to read a single byte
to verify that the input filesize is correct.
The downside of this approach is when the last region in the sparse file
is a hole. In the case of Reader.WriteTo, the 1-byte write will cause
the last fragment to have a single chunk allocated.
However, the goal of ReadFrom/WriteTo is *not* the ability to
exactly reproduce sparse files (in terms of the location of sparse holes),
but rather to provide an efficient way to create them.
File systems already impose their own restrictions on how the sparse file
will be created. Some filesystems (e.g., HFS+) don't support sparseness and
seeking forward simply causes the FS to write zeros. Other filesystems
have different chunk sizes, which will cause chunk allocations at boundaries
different from what was in the original sparse file. In either case,
it should not be a normal expectation of users that the location of holes
in sparse files exactly matches the source.
For users that really desire to have exact reproduction of sparse holes,
they can wrap os.File with their own io.WriteSeeker that discards the
final 1-byte write and uses File.Truncate to resize the file to the
correct size.
Other reasons we choose this approach over special-casing *os.File because:
* The Reader already has special-case logic for io.Seeker
* As much as possible, we want to decouple OS-specific logic from
Reader and Writer.
* This allows other abstractions over *os.File to also benefit from
the "skip past holes" logic.
* It is easier to test, since it is harder to mock an *os.File.
Updates #13548
Change-Id: I0a4f293bd53d13d154a946bc4a2ade28a6646f6a
Reviewed-on: https://go-review.googlesource.com/60872
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Use "file" consistently instead of "entry".
Change-Id: Ia81c9665d0d956adb78f7fa49de40cdb87fba000
Reviewed-on: https://go-review.googlesource.com/60150
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Many aspects of the package is woefully undocumented.
With the recent flurry of improvements, the package is now at feature
parity with the GNU and TAR tools. Thoroughly all of the public API
and perform some minor stylistic cleanup in some code segments.
Change-Id: Ic892fd72c587f30dfe91d1b25b88c9c8048cc389
Reviewed-on: https://go-review.googlesource.com/59210
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The PAX specification says the following:
<<<
'g' represents global extended header records for the following files in the archive.
The format of these extended header records shall be as described in pax Extended Header.
Each value shall affect all subsequent files that do not override that value
in their own extended header record and until another global extended header record
is reached that provides another value for the same field.
>>>
This CL adds support for parsing and composing global PAX records,
but intentionally does not provide support for automatically
persisting the global state across files.
Changes made:
* When Reader encounters a TypeXGlobalRecord header, it parses the
PAX records and returns them to the user ad-verbatim. Reader does not
store them in its state, ensuring it has no effect on future Next calls.
* When Writer receives a TypeXGlobalRecord header, it writes the
PAX records to the archive ad-verbatim. It does not store them in
its state, ensuring it has no effect on future WriteHeader calls.
* The restriction regarding empty record values is lifted since this
value is used to represent deletion in global headers.
Why provide raw support only:
* Some archives in the wild have a global header section (often empty)
and it is the user's responsibility to manually read and discard it's body.
The logic added here allows users to more easily skip over these sections.
* For users that do care about global headers, having access to the raw
records allows them to implement the functionality of global headers themselves
and manually persist the global state across files.
* We can still upgrade to a full implementation in the future.
Why we don't provide full support:
* Even though the PAX specification describes their operation in detail,
both the GNU and BSD tar tools (which are the most common implementations)
do not have a consistent interpretation of many details.
* Global headers were a controversial feature in PAX, by admission of the
specification itself:
<<<
The concept of a global extended header (typeflag g) was controversial.
The typeflag g global headers should not be used with interchange media that
could suffer partial data loss in transporting the archive.
>>>
* Having state persist from entry-to-entry complicates the implementation
for a feature that is not widely used and not well supported.
Change-Id: I1d904cacc2623ddcaa91525a5470b7dbe226c7e8
Reviewed-on: https://go-review.googlesource.com/59190
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
|
|
This CL adds the following new publicly visible API:
type Header struct { ...; PAXRecords map[string]string }
The new Header.PAXRecords field is a map of all PAX extended header records.
We suggest (but do not enforce) that users use VENDOR-prefixed keys
according to the following in the PAX specification:
<<<
The standard developers have reserved keyword name space for vendor extensions.
It is suggested that the format to be used is:
VENDOR.keyword
where VENDOR is the name of the vendor or organization in all uppercase letters.
>>>
When reading, the Header.PAXRecords is populated with all PAX records
encountered so far, including basic ones (e.g., "path", "mtime", etc).
When writing, the fields of Header will be merged into PAXRecords,
overwriting any records that may conflict.
Since PAXRecords is a more expressive feature than Xattrs and
is entirely a superset of Xattrs, we mark Xattrs as deprecated,
and steer users towards the new PAXRecords API.
The issue has a discussion about adding a Header.SetPAXRecord method
to help validate records and keep the Header fields in sync.
However, we do not include that in this CL since that helper
method can always be added in the future.
There is no support for global records.
Fixes #14472
Change-Id: If285a52749acc733476cf75a2c7ad15bc1542071
Reviewed-on: https://go-review.googlesource.com/58390
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The Reader and Writer are now at feature parity,
meaning that everything that can be parsed by the Reader,
can also be composed by the Writer.
This position enables us to support selection of the format
in a backwards compatible way, since it ensures that everything
that can be read can also be round-trip written.
As such, we add the following new API:
type Format int
const FormatUnknown Format = 0 ...
type Header struct { ...; Format Format }
The new Header.Format field is populated by the Reader on the
best guess on what the format is. Note that the Reader is very liberal
in what it permits, so a hybrid TAR file using aspects of multiple
formats can still be decoded, but will be reported as FormatUnknown.
Even though Reader has full support for V7 and basic support for STAR,
it will still report those formats as unknown (and the constants for
those formats are not even exported). The reasons for this is because
the Writer has no support for V7 or STAR. Leaving it as unknown allows
the Writer to choose a format usually USTAR or GNU that can encode
the equivalent Header.
When writing, the Header.allowedFormats will take the Format field
into consideration if it is a known format.
Fixes #18710
Change-Id: I00980c475d067c6969d3414e1ff0224fdd89cd49
Reviewed-on: https://go-review.googlesource.com/58230
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This CL is the first step (of two) for adding sparse file support
to the Writer. This CL only refactors the logic of sparse-file handling
in the Reader so that common logic can be easily shared by the Writer.
As a result of this CL, there are some new publicly visible API changes:
type SparseEntry struct { Offset, Length int64 }
type Header struct { ...; SparseHoles []SparseEntry }
A new type is defined to represent a sparse fragment and a new field
Header.SparseHoles is added to represent the sparse holes in a file.
The API intentionally represent sparse files using hole fragments,
rather than data fragments so that the zero value of SparseHoles
naturally represents a normal file (i.e., a file without any holes).
The Reader now populates SparseHoles for sparse files.
It is necessary to export the sparse hole information, otherwise it would
be impossible for the Writer to specify that it is trying to encode
a sparse file, and what it looks like.
Some unexported helper functions were added to common.go:
func validateSparseEntries(sp []SparseEntry, size int64) bool
func alignSparseEntries(src []SparseEntry, size int64) []SparseEntry
func invertSparseEntries(src []SparseEntry, size int64) []SparseEntry
The validation logic that used to be in newSparseFileReader is now moved
to validateSparseEntries so that the Writer can use it in the future.
alignSparseEntries is currently unused by the Reader, but will be used
by the Writer in the future. Since TAR represents sparse files by
only recording the data fragments, we add the invertSparseEntries
function to convert a list of data fragments to a normalized list
of hole fragments (and vice-versa).
Some other high-level changes:
* skipUnread is deleted, where most of it's logic is moved to the
Discard methods on regFileReader and sparseFileReader.
* readGNUSparsePAXHeaders was rewritten to be simpler.
* regFileReader and sparseFileReader were completely rewritten
in simpler and easier to understand logic.
* A bug was fixed in sparseFileReader.Read where it failed to
report an error if the logical size of the file ends before
consuming all of the underlying data.
* The tests for sparse-file support was completely rewritten.
Updates #13548
Change-Id: Ic1233ae5daf3b3f4278fe1115d34a90c4aeaf0c2
Reviewed-on: https://go-review.googlesource.com/56771
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Move all sentinel errors to common.go since some of them are
returned by both the reader and writer and remove errInvalidHeader
since it not used.
Also, consistently use the "tar: " prefix for errors.
Change-Id: I0afb185bbf3db80dfd9595321603924454a4c2f9
Reviewed-on: https://go-review.googlesource.com/55650
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Both GNU and BSD tar do not care if the devmajor and devminor values are
set on entries (like regular files) that aren't character or block devices.
While this is non-sensible, it is more consistent with the Writer to actually
read these fields always. In a vast majority of the cases these will still
be zero. In the rare situation where someone actually cares about these,
at least information was not silently lost.
Change-Id: I6e4ba01cd897a1b13c28b1837e102a4fdeb420ba
Reviewed-on: https://go-review.googlesource.com/55572
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Prior to Go1.8, the Writer had a bug where it would output
an invalid tar file in certain rare situations because the logic
incorrectly believed that the old GNU format had a prefix field.
This is wrong and leads to an output file that mangles the
atime and ctime fields, which are often left unused.
In order to continue reading tar files created by former, buggy
versions of Go, we skeptically parse the atime and ctime fields.
If we are unable to parse them and the prefix field looks like
an ASCII string, then we fallback on the pre-Go1.8 behavior
of treating these fields as the USTAR prefix field.
Note that this will not use the fallback logic for all possible
files generated by a pre-Go1.8 toolchain. If the generated file
happened to have a prefix field that parses as valid
atime and ctime fields (e.g., when they are valid octal strings),
then it is impossible to distinguish between an valid GNU file
and an invalid pre-Go1.8 file.
Fixes #21005
Change-Id: Iebf5c67c08e0e46da6ee41a2e8b339f84030dd90
Reviewed-on: https://go-review.googlesource.com/53635
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
According to the GNU manual, the format is:
<<<
GNU.sparse.size=size
GNU.sparse.numblocks=numblocks
repeat numblocks times
GNU.sparse.offset=offset
GNU.sparse.numbytes=numbytes
end repeat
>>>
The logic in parsePAX converts the repeating sequence of
(offset, numbytes) pairs (which is not PAX compliant) into a single
comma-delimited list of numbers (which is now PAX compliant).
Thus, we validate the following:
* The (offset, numbytes) headers must come in the correct order.
* The ',' delimiter cannot appear in the value.
We do not validate that the value is a parsible decimal since that
will be determined later.
Change-Id: I8d6681021734eb997898227ae8603efb1e17c0c8
Reviewed-on: https://go-review.googlesource.com/31439
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Relevant PAX specification:
<<<
If the <value> field is zero length, it shall delete any header
block field, previously entered extended header value, or
global extended header value of the same name.
>>>
We don't delete global extender headers since the Reader doesn't
even support global headers (which the specification admits was
a controversial feature).
Change-Id: I2125a5c907b23a3dc439507ca90fa5dc47d474a9
Reviewed-on: https://go-review.googlesource.com/31440
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The GNU format does not have a prefix field, so we should make
no attempt to read it. It does however have atime and ctime fields.
Since Go previously placed incorrect values here, we liberally
read the atime and ctime fields and ignore errors so that old tar
files written by Go can at least be partially read.
This fixes half of #12594. The Writer is much harder to fix.
Updates #12594
Change-Id: Ia32845e2f262ee53366cf41dfa935f4d770c7a30
Reviewed-on: https://go-review.googlesource.com/31444
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
* Assert that the format is GNU.
Both GNU and STAR have some form of sparse file support with
incompatible header structures. Worse yet, both formats use the
'S' type flag to indicate the presence of a sparse file.
As such, we should check the format (based on magic numbers)
and fail early.
* Move realsize parsing logic into readOldGNUSparseMap.
This is related to the sparse parsing logic and belongs here.
* Fix the termination condition for parsing sparse fields.
The termination condition for reading the sparse fields
is to simply check if the first byte of the offset field is NULL.
This does not seem to be documented in the GNU manual, but this is
the check done by the both the GNU and BSD implementations:
http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c?id=9a33077a7b7ad7d32815a21dee54eba63b38a81c#n731
https://github.com/libarchive/libarchive/blob/1fa9c7bf90f0862036a99896b0501c381584451a/libarchive/archive_read_support_format_tar.c#L2207
* Fix the parsing of sparse fields to use parseNumeric.
This is what GNU and BSD do. The previous two links show that
GNU and BSD both handle base-256 and base-8.
* Detect truncated streams.
The call to io.ReadFull does not check if the error is io.EOF.
Getting io.EOF in this method is never okay and should always be
converted to io.ErrUnexpectedEOF.
* Simplify the function.
The logic is essentially a do-while loop so we can remove
some redundant code.
Change-Id: Ib2f601b1a283eaec1e41b1d3396d649c80749c4e
Reviewed-on: https://go-review.googlesource.com/28471
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
|
|
Most calls to strconv.ParseInt(x, 10, 0) should really be
calls to strconv.ParseInt(x, 10, 64) in order to ensure that they
do not overflow on 32b architectures.
Furthermore, we should document a bug where Uid and Gid may
overflow on 32b machines since the type is declared as int.
Change-Id: I99c0670b3c2922e4a9806822d9ad37e1a364b2b8
Reviewed-on: https://go-review.googlesource.com/28472
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Move all parse/format related functionality into strconv.go
and thoroughly test them. This also reduces the amount of noise
inside reader.go and writer.go.
There was zero functionality change other than moving code around.
Change-Id: I3bc288d10c20ebb3814b30b75d8acd7be62b85d7
Reviewed-on: https://go-review.googlesource.com/28470
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The use of PAX headers can modify the overall file size, thus the
formerly created regFileReader may be stale.
The relevant PAX specification for this behavior is:
<<<
Any fields in the preceding optional extended header shall override
the associated fields in this header block for this file.
>>>
Where "optional extended header" refers to the preceding PAX header.
Where "this header block" refers to the subsequent USTAR header.
Fixes #15573
Fixes #15564
Change-Id: I83b1c3f05a9ca2d3be38647425ad21a9fe450ee2
Reviewed-on: https://go-review.googlesource.com/28418
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The tar.Reader guarantees stickiness of errors. Ensuring this property means
that the methods of Reader need to be consistent about whose responsibility it
is to actually ensure that errors are sticky.
In this CL, we make it only the responsibility of the exported methods
(Next and Read) to store tr.err. All other methods just return the error as is.
As part of this change, we also check the error value of mergePAX (and test
that it properly detects invalid PAX files). Since the value of mergePAX was
never used before, we change it such that it always returns ErrHeader instead
of strconv.SyntaxError. This keeps it consistent with other usages of strconv
in the same tar package.
Change-Id: Ia1c31da71f1de4c175da89a385dec665d3edd167
Reviewed-on: https://go-review.googlesource.com/28215
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Factor out the regular file handling logic into handleRegularFile
from nextHeader. We will need to reuse this logic when fixing #15573
in a future CL.
Factor out the sparse file handling logic into handleSparseFile.
Currently this logic is split between nextHeader (for GNU sparse
files) and Next (for PAX sparse files). Instead, we move this
related code into a single method.
There is no overall logic change. Thus, no unit tests.
Updates #15573 #15564
Change-Id: I3b8270d8b4e080e77d6c0df6a123d677c82cc466
Reviewed-on: https://go-review.googlesource.com/27454
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The Reader and Writer have hard-coded constants regarding the
offsets and lengths of certain fields in the tar format sprinkled
all over. This makes it harder to verify that the offsets are
correct since a reviewer would need to search for them throughout
the code. Instead, all information about the layout of header
fields should be centralized in one single file. This has the
advantage of being both centralized, and also acting as a form
of documentation about the header struct format.
This method was chosen over using "encoding/binary" since that
method would cause an allocation of a header struct every time
binary.Read was called. This method causes zero allocations and
its logic is no longer than if structs were declared.
Updates #12594
Change-Id: Ic7a0565d2a2cd95d955547ace3b6dea2b57fab34
Reviewed-on: https://go-review.googlesource.com/14669
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Pointed out during review of golang.org/cl/22104.
Change-Id: If8842e7f8146441e918ec6a2b6e893b7cf88615c
Reviewed-on: https://go-review.googlesource.com/22120
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
cmd and runtime were handled separately, and I'm intentionally skipped
syscall. This is the rest of the standard library.
CL generated mechanically with github.com/mdempsky/unconvert.
Change-Id: I9e0eff886974dedc37adb93f602064b83e469122
Reviewed-on: https://go-review.googlesource.com/22104
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Automated change.
Fixes #15269
Change-Id: I8deb2ac0101d3f7c390467ceb0a1561b72edbb2f
Reviewed-on: https://go-review.googlesource.com/21962
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Commit dd5e14a7511465d20c6e95bf54c9b8f999abbbf6 ensured that no data
could be read for header-only files regardless of what the Header.Size
said. We should document this fact in Reader.Read.
Updates #13647
Change-Id: I4df9a2892bc66b49e0279693d08454bf696cfa31
Reviewed-on: https://go-review.googlesource.com/17913
Reviewed-by: Russ Cox <rsc@golang.org>
|