aboutsummaryrefslogtreecommitdiff
path: root/src/regexp/syntax/parse.go
AgeCommit message (Collapse)Author
2025-04-18regexp/syntax: recognize category aliases like \p{Letter}Russ Cox
The Unicode specification defines aliases for some of the general category names. For example the category "L" has alias "Letter". The regexp package supports \p{L} but not \p{Letter}, because there was nothing in the Unicode tables that lets regexp know about Letter. Now that package unicode provides CategoryAliases (see #70780), we can use it to provide \p{Letter} as well. This is the only feature missing from making package regexp suitable for use in a JSON-API Schema implementation. (The official test suite includes usage of aliases like \p{Letter} instead of \p{L}.) For better conformity with Unicode TR18, also accept case-insensitive matches for names and ignore underscores, hyphens, and spaces; and add Any, ASCII, and Assigned. Fixes #70781. Change-Id: I50ff024d99255338fa8d92663881acb47f1e92a5 Reviewed-on: https://go-review.googlesource.com/c/go/+/641377 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Alan Donovan <adonovan@google.com>
2024-09-03all: omit unnecessary 0 in slice expressionnlwkobe30
All changes are related to the code, except for the comments in src/regexp/syntax/parse.go and src/slices/slices.go. Change-Id: I73c5d3c54099749b62210aa7f3182c5eb84bb6a6 GitHub-Last-Rev: 794aa9b0539811d00e1cd42be1e8d9fe9afe0281 GitHub-Pull-Request: golang/go#69170 Reviewed-on: https://go-review.googlesource.com/c/go/+/609678 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
2024-04-02regexp/syntax: cleanup code generation in perl_groups.goOlivier Mengué
Cleanup code generation of perl_groups.go: * Fix the generated code header to follow the standard https://go.dev/s/generatedcode * Apply gofmt as last step of code generation * Add //go:generate lines in parse.go to trigger code generation * Adapt make_perl_groups.pl to handle writing directly to the output file (as we can't use shell redirection in go:generate lines) * use strict; use warnings; Change-Id: I675241da03dd3f6facc9eb9de00999bd9203ad70 Reviewed-on: https://go-review.googlesource.com/c/go/+/555995 Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2024-03-29crypto/tls,regexp: remove always-nil error resultsDaniel Martí
These were harmless, but added unnecessary verbosity to the code. This can happen as a result of refactors: for example, the method sessionState used to return errors in some cases. Change-Id: I4e6dacc01ae6a49b528c672979f95cbb86795a85 Reviewed-on: https://go-review.googlesource.com/c/go/+/528995 Reviewed-by: Leo Isla <islaleo93@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Olivier Mengué <olivier.mengue@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com> Reviewed-by: Quim Muntal <quimmuntal@gmail.com>
2024-03-26regexp/syntax: simplify the codeapocelipes
Use the slices package and the built-in max to simplify the code. There's no noticeable performance change in this modification. Change-Id: I96e46ba8ab1323f1ba0b8c9b827836e217772cf2 GitHub-Last-Rev: f0111ac7e220f7dac03290125a3a83831012f235 GitHub-Pull-Request: golang/go#66511 Reviewed-on: https://go-review.googlesource.com/c/go/+/573978 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: qiulaidongfeng <2645477756@qq.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>
2023-10-03regexp/syntax: use min funcqiulaidongfeng
Change-Id: I679c906057577d4a795c07a2f572b969c3ee14d5 GitHub-Last-Rev: fba371d2d6bfc7fbf3a93a53bb22039acf65d7cf GitHub-Pull-Request: golang/go#63350 Reviewed-on: https://go-review.googlesource.com/c/go/+/532218 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2023-08-16regexp/syntax: use more compact Regexp.String outputRuss Cox
Compact the Regexp.String output. It was only ever intended for debugging, but there are at least some uses in the wild where regexps are built up using regexp/syntax and then formatted using the String method. Compact the output to help that use case. Specifically: - Compact 2-element character class ranges: [a-b] -> [ab]. - Aggregate flags: (?i:A)(?i:B)*(?i:C)|(?i:D)?(?i:E) -> (?i:AB*C|D?E). Fixes #57950. Change-Id: I1161d0e3aa6c3ae5a302677032bb7cd55caae5fb Reviewed-on: https://go-review.googlesource.com/c/go/+/507015 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Rob Pike <r@golang.org> Auto-Submit: Russ Cox <rsc@golang.org>
2023-07-31regexp/syntax: accept (?<name>...) syntax as valid captureMauri de Souza Meneguzzo
Currently the only named capture supported by regexp is (?P<name>a). The syntax (?<name>a) is also widely used and there is currently an effort from the Rust regex and RE2 teams to also accept this syntax. Fixes #58458 Change-Id: If22d44d3a5c4e8133ec68238ab130c151ca7c5c5 GitHub-Last-Rev: 31b50e6ab40cfb0f36df6f570525657d4680017f GitHub-Pull-Request: golang/go#61624 Reviewed-on: https://go-review.googlesource.com/c/go/+/513838 Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
2022-11-02regexp: add ErrLarge errorcuiweixie
For #56041 Change-Id: I6c98458b5c0d3b3636a53ee04fc97221f3fd8bbc Reviewed-on: https://go-review.googlesource.com/c/go/+/444817 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> Auto-Submit: Ian Lance Taylor <iant@google.com> Run-TryBot: xie cui <523516579@qq.com>
2022-10-05regexp: limit size of parsed regexpsRuss Cox
Set a 128 MB limit on the amount of space used by []syntax.Inst in the compiled form corresponding to a given regexp. Also set a 128 MB limit on the rune storage in the *syntax.Regexp tree itself. Thanks to Adam Korczynski (ADA Logics) and OSS-Fuzz for reporting this issue. Fixes CVE-2022-41715. Fixes #55949. Change-Id: Ia656baed81564436368cf950e1c5409752f28e1b Reviewed-on: https://go-review.googlesource.com/c/go/+/439356 Auto-Submit: Roland Shoemaker <roland@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Roland Shoemaker <roland@golang.org> Reviewed-by: Damien Neil <dneil@google.com>
2022-10-02regexp: fix a few function names on commentscui fliter
Change-Id: I192dd34c677e52e16f0ef78e1dae58a78f6d1aac GitHub-Last-Rev: 1638a7468951df72f13fea34855b6a4fcbb08226 GitHub-Pull-Request: golang/go#55967 Reviewed-on: https://go-review.googlesource.com/c/go/+/436885 Run-TryBot: Ian Lance Taylor <iant@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com>
2022-04-22regexp/syntax: rename ErrInvalidDepth to ErrNestingDepthIan Lance Taylor
The proposal accepted the name ErrNestingDepth. For #51684 Change-Id: I702365f19e5e1889dbcc3c971eecff68e0b08727 Reviewed-on: https://go-review.googlesource.com/c/go/+/401854 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-04-22regexp: change ErrInvalidDepth message to match proposalIan Lance Taylor
Also update the file in $GOROOT/api/next to use proposal number. For #51684 Change-Id: I28bfa6bc1cee98a17b13da196d41cda34d968bb0 Reviewed-on: https://go-review.googlesource.com/c/go/+/401076 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Run-TryBot: Ian Lance Taylor <iant@google.com> Auto-Submit: Ian Lance Taylor <iant@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2022-04-11all: gofmt main repoRuss Cox
[This CL is part of a sequence implementing the proposal #51082. The design doc is at https://go.dev/s/godocfmt-design.] Run the updated gofmt, which reformats doc comments, on the main repository. Vendored files are excluded. For #51082. Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407 Reviewed-on: https://go-review.googlesource.com/c/go/+/384268 Reviewed-by: Jonathan Amsterdam <jba@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-04regexp/syntax: add and use ErrInvalidDepthRuss Cox
The fix for #51112 introduced a depth check but used ErrInternalError to avoid introduce new API in a CL that would be backported to earlier releases. New API accepted in proposal #51684. This CL adds a distinct error for this case. For #51112. Fixes #51684. Change-Id: I068fc70aafe4218386a06103d9b7c847fb7ffa65 Reviewed-on: https://go-review.googlesource.com/c/go/+/384617 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01all: remove trailing blank doc comment linesRuss Cox
A future change to gofmt will rewrite // Doc comment. // func f() to // Doc comment. func f() Apply that change preemptively to all doc comments. For #51082. Change-Id: I4023e16cfb0729b64a8590f071cd92f17343081d Reviewed-on: https://go-review.googlesource.com/c/go/+/384259 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org>
2022-02-10regexp/syntax: reject very deeply nested regexps in ParseRuss Cox
The regexp code assumes it can recurse over the structure of a regexp safely. Go's growable stacks make that reasonable for all plausible regexps, but implausible ones can reach the “infinite recursion?” stack limit. This CL limits the depth of any parsed regexp to 1000. That is, the depth of the parse tree is required to be ≤ 1000. Regexps that require deeper parse trees will return ErrInternalError. A future CL will change the error to ErrInvalidDepth, but using ErrInternalError for now avoids introducing new API in point releases when this is backported. Fixes #51112. Change-Id: I97d2cd82195946eb43a4ea8561f5b95f91fb14c5 Reviewed-on: https://go-review.googlesource.com/c/go/+/384616 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2021-10-06all: use bytes.Cut, strings.CutRuss Cox
Many uses of Index/IndexByte/IndexRune/Split/SplitN can be written more clearly using the new Cut functions. Do that. Also rewrite to other functions if that's clearer. For #46336. Change-Id: I68d024716ace41a57a8bf74455c62279bde0f448 Reviewed-on: https://go-review.googlesource.com/c/go/+/351711 Trust: Russ Cox <rsc@golang.org> Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-04-17regexp/syntax: fix comment on p.literal and simplifyRuss Cox
p.literal's doc comment said it returned a value but it doesn't. While we're here, p.newLiteral is only called from p.literal, so simplify the code by merging the two. Change-Id: Ia357937a99f4e7473f0f1ec837113a39eaeb83d4 Reviewed-on: https://go-review.googlesource.com/c/go/+/222659 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-12-01Revert "go/printer: forbid empty line before first comment in block"Joe Tsai
This reverts commit 08f19bbde1b01227fdc2fa2d326e4029bb74dd96. Reason for revert: The changed transformation takes effect on a larger set of code snippets than expected. For example, this: func foo() { // Comment bar() } becomes: func foo() { // Comment bar() } This is an unintended consequence. Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8 Reviewed-on: https://go-review.googlesource.com/81335 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Joe Tsai <joetsai@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-02go/printer: forbid empty line before first comment in blockJoe Tsai
To improve readability when exported fields are removed, forbid the printer from emitting an empty line before the first comment in a const, var, or type block. Also, when printing the "Has filtered or unexported fields." message, add an empty line before it to separate the message from the struct or interfact contents. Before the change: <<< type NamedArg struct { // Name is the name of the parameter placeholder. // // If empty, the ordinal position in the argument list will be // used. // // Name must omit any symbol prefix. Name string // Value is the value of the parameter. // It may be assigned the same value types as the query // arguments. Value interface{} // contains filtered or unexported fields } >>> After the change: <<< type NamedArg struct { // Name is the name of the parameter placeholder. // // If empty, the ordinal position in the argument list will be // used. // // Name must omit any symbol prefix. Name string // Value is the value of the parameter. // It may be assigned the same value types as the query // arguments. Value interface{} // contains filtered or unexported fields } >>> Fixes #18264 Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88 Reviewed-on: https://go-review.googlesource.com/71990 Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>
2017-03-06regexp/syntax: remove unused flags parameterDaniel Martí
Found by github.com/mvdan/unparam. Change-Id: I186d2afd067e97eb05d65c4599119b347f82867d Reviewed-on: https://go-review.googlesource.com/37840 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-06-28unicode: upgrade to version 9.0.0Marcel van Lohuizen
Changes beyond generated tables: - Now supports aliases to handle deprecated property classes. - Some Mongolian letters are now modifiers. Other changes: - strconv: newly generated table to be in sync - regexp/syntax: updated maxFold Fixes #16191 Change-Id: I56bdf21ee2f775f2a82d0465b3772faf5c24cb61 Reviewed-on: https://go-review.googlesource.com/24496 Run-TryBot: Marcel van Lohuizen <mpvl@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-03-02all: single space after period.Brad Fitzpatrick
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence. This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with: $ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])') $ go test go/doc -update Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7 Reviewed-on: https://go-review.googlesource.com/20022 Reviewed-by: Rob Pike <r@golang.org> Reviewed-by: Dave Day <djd@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-01all: make copyright headers consistent with one space after periodBrad Fitzpatrick
This is a subset of https://golang.org/cl/20022 with only the copyright header lines, so the next CL will be smaller and more reviewable. Go policy has been single space after periods in comments for some time. The copyright header template at: https://golang.org/doc/contribute.html#copyright also uses a single space. Make them all consistent. Change-Id: Icc26c6b8495c3820da6b171ca96a74701b4a01b0 Reviewed-on: https://go-review.googlesource.com/20111 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-01-08regexp/syntax: fix factoring of common prefixes in alternationsPaul Wankadia
In the past, `a.*?c|a.*?b` was factored to `a.*?[bc]`. Thus, given "abc" as its input string, the automaton would consume "ab" and then stop (when unanchored) whereas it should consume all of "abc" as per leftmost semantics. Fixes #13812. Change-Id: I67ac0a353d7793b3d0c9c4aaf22d157621dfe784 Reviewed-on: https://go-review.googlesource.com/18357 Reviewed-by: Russ Cox <rsc@golang.org>
2015-12-01regexp/syntax: fix handling of \Q...\ERuss Cox
It's not a group: must handle the inside as a sequence of literal chars, not a single literal string. That is, \Qab\E+ is the same as ab+, not (ab)+. Fixes #11187. Change-Id: I5406d05ccf7efff3a7f15395bdb0cfb2bd23a8ed Reviewed-on: https://go-review.googlesource.com/17233 Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2014-10-20regexp/syntax: fix validity testing of zero repeatsIan Lance Taylor
This is already tested by TestRE2Exhaustive, but the build has not broken because that test is not run when using -test.short. LGTM=rsc R=rsc CC=golang-codereviews https://golang.org/cl/155580043
2014-09-30regexp/syntax: reject large repetitions created by nesting small onesRuss Cox
Fixes #7609. LGTM=r R=r CC=golang-codereviews https://golang.org/cl/150270043
2014-09-08build: move package sources from src/pkg to srcRuss Cox
Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.