From b8e20a54ac6fab854dcf508e7fa645d927dba892 Mon Sep 17 00:00:00 2001 From: Shulhan Date: Thu, 25 Aug 2022 01:27:36 +0700 Subject: all: move all documentation into directory _doc While at it reformat the README, add section for development, add link for Go documentation. --- .gitignore | 2 +- CHANGELOG.adoc | 192 ----------------- README | 79 ++++--- README.adoc | 1 + SPECS.adoc | 606 ---------------------------------------------------- _doc/CHANGELOG.adoc | 192 +++++++++++++++++ _doc/SPECS.adoc | 606 ++++++++++++++++++++++++++++++++++++++++++++++++++++ _doc/index.adoc | 1 + index.adoc | 1 - 9 files changed, 847 insertions(+), 833 deletions(-) delete mode 100644 CHANGELOG.adoc create mode 120000 README.adoc delete mode 100644 SPECS.adoc create mode 100644 _doc/CHANGELOG.adoc create mode 100644 _doc/SPECS.adoc create mode 120000 _doc/index.adoc delete mode 120000 index.adoc diff --git a/.gitignore b/.gitignore index 854ac09..47b1c68 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,6 @@ // SPDX-FileCopyrightText: 2020 M. Shulhan // SPDX-License-Identifier: GPL-3.0-or-later -/*.html +/_doc/*.html /cover.html /cover.out /testdata/*.html diff --git a/CHANGELOG.adoc b/CHANGELOG.adoc deleted file mode 100644 index 86054d8..0000000 --- a/CHANGELOG.adoc +++ /dev/null @@ -1,192 +0,0 @@ -// SPDX-FileCopyrightText: 2021 M. Shulhan -// SPDX-License-Identifier: GPL-3.0-or-later -= Changelog for asciidoctor-go -Shulhan -21 February 2022 -:toc: -:sectanchors: -:sectlinks: - - -[#v0_3_1] -== asciidoctor-go v0.3.1 (2022-08-06) - -[#v0_3_1_chores] -=== Chores - -all: rewrite unit tests for inlineParser using test.Data:: -+ --- -Using string literal for testing string input that may contains backtick -or double quote make the test code become unreadable and hard to modify. - -The test.Data help this by moving the input and expected output into -a file that can we write as is. --- - -all: cleaning up codes:: Use raw string literal whenever possible. 
- -go.mod: update share to v0.40.0:: -+ --- -This update fix some issues related to new line on test.Data. --- - - -[#v0_3_0] -== asciidoctor-go v0.3.0 (2022-07-24) - -This release set the minimum Go version to 1.18. - -[#v0_3_0_breaking_changes] -=== Breaking changes - -all: refactoring handling generate ref ID:: -+ --- -Previously, we always set the ID prefix and separator default to "\_" if -its not set, this cause all of the ID is prefixed with "\_". - -This changes use strict rules when generating ID following the Mozilla -specification [1] and latest AsciiDoc Language [2]. - -The idprefix must be ASCII string. -It must start with "\_", "-", or ASCII letters, otherwise the "\_" will -be added to the beginning. -If one of the character is not valid, it will replaced with "\_". - -The `idseparator` can be empty or single ASCII character ('\_' or '-', -ASCII letter, or digit). -It is used to replace invalid characters in the REF_ID. - -[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/id - -[2] https://docs.asciidoctor.org/asciidoc/latest/sections/id-prefix-and-separator/ --- - -[#v0_3_0_enhancements] -=== Enhancements - -all: sort the generated HTML meta by names:: -+ --- -The order of meta names are "author", "description", "generator", and -then "keywords". --- - -all: store the list of author names under Attributes "author_names":: -+ --- -Previously, to get list of author names, we need to iterate each of -the Authors field. - -This changes set the Attributes "author_names" to list of author full -names, each separated by comma. --- - -all: add default metadata "generator":: -+ --- -The generator metadata contains the library name and its current version. --- - -all: realign all structs:: -+ --- -This is to minimize memory allocation when using asciidoctor. 
--- - -[#v0_3_0_chores] -=== Chores - -all: rewrite test using lib/test.Data:: -+ --- -Previously, to test parser and check the generated HTML, we write AsciDoc -input and expected HTML output using literal string `...`. -The text of input and output sometimes long, take multiple lines, which -makes the test code ugly, hard to write, and read. - -Using lib/test.Data we can write the input and output as the AsciiDoc -markup and the HTML markup as is, simplify writing the test and more -readable. --- - - -[#v0_2_0] -== asciidoctor-go v0.2.0 (2022-03-04) - -This release changes the license of asciidoctor-go from BSD to GPL 3.0 or -later. - -[#v0_2_0_bug_fixes] -=== Bug fixes - -all: fix list check box text get cut one character:: -+ --- -Given the following asciidoc check box markup, - - * [ ] abc - -It will rendereded as "❏ bc" instead of "❏ abc". --- - -[#v0_2_0_chores] -=== Chores - -all: replace bytes.Title and strings.Title with function:: -+ --- -Both of those functions has been deprecated. - -Since the Title function is to convert the adminition string into a -human title (first letter uppercase), we can use a function to do that. -Any unknown admonition will be returned as is. --- - - -[#v0_1_1] -== asciidoctor-go v0.1.1 (2021-12-06) - - -[#v0_1_1_bug_fixes] -=== Bug fixes - -all: fix parsing and rendering cross reference:: -+ --- -Previously, when parsing cross reference we assume that if the string -contains upper-case letter then it's a label so we store it as title -to search for ID later. - -The bug is when ID is set manually and its contains upper-case for -example "[#Id]". - -This changes fix this issue by storing cross reference first, not -assuming it as ID or title, and then when doing rendering we check -whether its ID or title. --- - -all: allow colon ':' and period '.' on the ID:: -+ --- -According to XML spec [1], the colon is allowed as the first and the next -character. While period is only allowed on the next characters. 
- -[1] https://www.w3.org/TR/REC-xml/#NT-Name --- - - -[#v0_1_0] -== asciidoctor-go v0.1.0 (2021-03-06) - -The asciidoctor-go is the Go module to parse the AsciiDoc (TM) markup -and convert it into HTML5. - -This first release bring almost all AsciiDoc syntax except for include -directive, inter-document cross-reference, macros, and non-primary syntax -features. - -I hope this library can be useful for Gophers who need the power of AsciiDoc -in their workflows. diff --git a/README b/README index aff9937..52f975d 100644 --- a/README +++ b/README @@ -11,14 +11,17 @@ The asciidoctor-go is the Go module to parse the https://asciidoctor.org/docs/what-is-asciidoc[AsciiDoc markup^] and convert it into HTML5. -== Specifications +== Documentation + +https://pkg.go.dev/git.sr.ht/~shulhan/asciidoctor-go[Go documentation^]. + +=== Specifications During reverse engineering the AsciiDoc markup, we write the syntax, rules, and format in link:SPECS.html[this document^]. - -== Features +=== Features List of available formatting that are supported on current implementation. Each supported feature is linked to official @@ -30,7 +33,7 @@ The numbered one is based on the old documentation. meta `doctitle`, `showtitle!` and subtitle. ** {url_ref}/document/author-information/[Author information^] ** {url_ref}/document/revision-information/[Revision information^] -** {url_ref}/document/metadata/[Metadata] +** {url_ref}/document/metadata/[Metadata^] * 15. Preamble * 16. Sections ** 16.1. Titles as HTML headings @@ -145,9 +148,31 @@ Additional metadata provides by this library, * `author_names` - list of author full names separated by comma. -== TODO +The following markup will not supported because its functionality is duplicate +with others markup or not secure, + +* 14. Header +** 14.4. Subtitle partitioning. + Rationale: duplicate with 14.1.2 the "Main: sub" format -List of features which will be implemented, +* 23. Tables +** 23.10. Nested tables. 
+ Rationale: nested table is not a good way to present information. + Never should it be. +** Using different cell separator + +* 28. Include Directive +** 28.6. Select Portions of a Document to Include. + Rationale: the parser would need to know the language to be included and + parse the whole source code to check for comments and tags. +** 28.8. Include Content from a URI. + Rationale: security and unreliable network connections. +** 28.9. Caching URI Content + + +=== TODO + +List of features which may be implemented, * 16. Sections ** 16.9. Section styles @@ -174,12 +199,12 @@ List of features which will be implemented, ** 40.1. Passthrough Macros -=== BUGS +==== BUGS Unknown. -=== ENHANCEMENTS +==== ENHANCEMENTS * Create tree that link Include directive. Once the included files changes, the parent should be rendered too. @@ -190,31 +215,7 @@ Unknown. -- -== Not supported - -The following markup will not supported because its functionality is duplicate -with others markup or not secure, - -* 14. Header -** 14.4. Subtitle partitioning. - Rationale: duplicate with 14.1.2 the "Main: sub" format - -* 23. Tables -** 23.10. Nested tables. - Rationale: nested table is not a good way to present information. - Never should it be. -** Using different cell separator - -* 28. Include Directive -** 28.6. Select Portions of a Document to Include. - Rationale: the parser would need to know the language to be included and - parse the whole source code to check for comments and tags. -** 28.8. Include Content from a URI. - Rationale: security and unreliable network connections. -** 28.9. Caching URI Content - - -== Miscellaneous +=== Miscellaneous link:CHANGELOG.html[Changelog]. @@ -223,3 +224,15 @@ library: * link:testdata/test.exp.html[HTML file generated using asciidoctor^] * link:testdata/test.got.html[HTML file using this library^] + + +== Development + +https://git.sr.ht/~shulhan/asciidoctor-go[Repository^]:: +Link to the source code. 
+ +https://lists.sr.ht/~shulhan/asciidoctor-go[Mailing list^]:: +Link to discussion or where to send the patches. + +https://todo.sr.ht/~shulhan/asciidoctor-go[Issues^]:: +Link to open an issue or request for new feature. diff --git a/README.adoc b/README.adoc new file mode 120000 index 0000000..100b938 --- /dev/null +++ b/README.adoc @@ -0,0 +1 @@ +README \ No newline at end of file diff --git a/SPECS.adoc b/SPECS.adoc deleted file mode 100644 index c3f974c..0000000 --- a/SPECS.adoc +++ /dev/null @@ -1,606 +0,0 @@ -// SPDX-FileCopyrightText: 2020 M. Shulhan -// SPDX-License-Identifier: GPL-3.0-or-later -= AsciiDoctor Document Specification -Shulhan -6 June 2020 -:toc: -:url_ref: https://docs.asciidoctor.org/asciidoc/latest - -This document contains grammar of asciidoc document markup language based on -https://asciidoctor.org/docs/user-manual[Asciidoctor User Manual]. - -== About implementation - -We try to follow the document syntax rules, but there are some inconsistencies -we found when the document parsed and rendered to HTML. -For example, the current asciidoctor allow the following inline formatting, - - _A `B_ C` - -to be rendered into the following HTML tree, - - A B C - -This is of course rendered correctly when opened in web browser, but it seems -break the tree. -In the previous implementation, we able to break down it into the following -tree, - - - B - - C - -But its open many inline formatting permutations which make the code more -complex than it should. - -This implementation, - -* use the strict asciidoctor syntax rules which we define in this document. - -* minimize duplicate markup. -** Support only "<<" ">>" syntax, drop "xref:" syntax - - -== Common grammar - ----- -EMPTY = "" - -DQUOTE = %d34 ; " - -WORD = 1*VCHAR ; Sequence of visible character without - ; white spaces. - -STRING = WORD *(WSP WORD) ; Sequence of word with spaces between them. - -LINE = STRING LF ; STRING that end with new line. - -TEXT = 1*LINE ; One or more LINE. 
- -REF_ID = 1*ALPHA *("-" / "_" / ALPHA / DIGIT) ----- - - -== Document header - -{url_ref}/document/header/[Reference^]. - -Document header consist of title and optional authors, a revision, and zero or -more metadata. -The document metadata can be in any order, before or after title, but the -author and revision MUST be after title and in order. - ----- -DOC_HEADER = *(DOC_ATTRIBUTE / COMMENTS) - "=" SP DOC_TITLE LF - (*DOC_ATTRIBUTE) - DOC_AUTHORS LF - (*DOC_ATTRIBUTE) - DOC_REVISION LF - (*DOC_ATTRIBUTE) ----- - -There are no empty line before and after the document header. -An empty line mark as the end of document header. - -=== Title - -{url_ref}/document/title/[Reference^]. - ----- -DOC_TITLE = 1*WORD [DOC_TITLE_SEP SUBTITLE] - -DOC_TITLE_SEP = ":" - -SUBTITLE = 1*WORD ----- - -=== Author information - -{url_ref}/document/author-information/[Reference^]. - ----- -DOC_AUTHORS = MAILBOX *( ";" MAILBOX ) - - MAILBOX = STRING [ "<" EMAIL ">" ] - - EMAIL = WORD "@" WORD "." 1*8ALPHA - ; simplified syntax of email format. ----- - -=== Revision information - -{url_ref}/document/revision-information/[Reference^]. - ----- -DOC_REVISION = DOC_REV_VERSION [ "," DOC_REV_DATE ] - -DOC_REV_VERSION = "v" 1*DIGIT "." 1*DIGIT "." 1*DIGIT - -DOC_REV_DATE = 1*2DIGIT 3*ALPHA 4*DIGIT ----- - -=== Metadata - -{url_ref}/document/metadata/[Reference^]. - -There are also metadata which affect how the document rendered, - ----- -DOC_ATTRIBUTE = ":" DOC_ATTR_KEY ":" *STRING LF - -DOC_ATTR_KEY = ( "toc" / "sectanchors" / "sectlinks" - / "imagesdir" / "data-uri" / *META_KEY ) LF - -META_KEY_CHAR = (A..Z | a..z | 0..9 | '_') - -META_KEY = 1META_KEY_CHAR *(META_KEY_CHAR | '-') ----- - - -=== HTML format - -HTML format for rendering section header, - ----- - ----- - -== Document preamble - -Any content after document title and before the new section is considered as -document preamble and its rendered inside the "content", not "header". - -HTML format, - ----- -
-
-
- {DOC_PREAMBLE} -
-
- ... -
----- - - -== Block - ----- -BLOCK_REF = "[#" REF_ID *["." RoleName] "]" LF ----- - -=== Attribute - ----- -BLOCK_ATTR = "[" ATTR_NAME ("=" ATTR_VALUE) *("," ATTR_OPT) "]" LF - -ATTR_NAME = WORD - -ATTR_VALUE = STRING - -ATTR_OPT = ATTR_NAME ("=") ATTR_VALUE) ----- - - -== Table of contents - -The table of contents (ToC) will be generated if "toc" attribute is set in -document header with the following syntax, - ----- -TOC_ATTR = ":toc:" (TOC_PLACEMENT / TOC_POSITION ) - -TOC_PLACEMENT = ("auto" / "preamble" / "macro") - -TOC_POSITION = ("left" / "right") - -TOC_MACRO = "toc::[]" ----- - -If toc placement is empty it default to "auto", and placed after document -header. -If toc is set to "preamble" it will be set after document preamble. -If toc is set to "macro", it will be set after section title that have -TOC_MACRO. - -=== Title - -By default the ToC element will have the title set to "Table of Contents". -One can change the ToC title using attribute "toc-title", - ----- -TOC_TITLE = ":toc-title:" LINE ----- - -=== Levels - -By default only section level 1 and 2 will be rendered. -One can change it using the attribute "toclevels", - ----- -TOC_LEVELS = ":toclevels:" 1DIGIT ----- - - -== Sections - -Sections or headers group one or more paragraphs or blocks. -Each section is started with '=' character or '#' (markdown). -There are six levels or sections that are allowed in asciidoc, any more than -that will be considered as paragraph. - ----- -SECTION = [BLOCK_REF] - 2*6(EQUAL/HASH) 1*WSP LINE LF ----- - -HTML format, - -HTML class for section is `sectN`, where N is the level, which is equal to -number of '=' minus 1. - ----- -
- {WORD} -
- ... -
-
----- - -=== Section Attributes - -==== idprefix - ----- -":idprefix:" EMPTY / REF_ID ----- - -The idprefix must be ASCII string. -It must start with "\_", "\-", or ASCII letters, otherwise the "\_" will be -prepended. -If one of the character is not valid, it will replaced with "\_". - -==== idseparator - ----- -":idseparator:" EMPTY / "-" / "_" / ALPHA ----- - -The `idseparator` can be empty or single ASCII character ("\_" or "\-", -ASCII letter, or digit). -It is used to replace invalid REF_ID character. - - -== Comments - ----- -COMMENT_SINGLE = "//" LINE - -COMMENT_BLOCK = "////" LF - *LINE - "////" LF - -COMMENTS = *(COMMENT_SINGLE / COMMENT_BLOCK) ----- - -The comment line cannot start with spaces, due to -link:#block_literal[Block literal]. - - -== Block listing - ----- -LISTING_STYLE = "[listing]" LF TEXT LF - -LISTING_BLOCK = "----" LF TEXT "----" LF ----- - - -== Block literal - ----- -LITERAL_PARAGRAPH = 1*WSP TEXT - -LITERAL_STYLE = "[literal]" LF TEXT LF - -LITERAL_BLOCK = "...." LF TEXT "...." LF ----- - -HTML format, - ----- -
-
-
{{TEXT}}
-
-
----- - -Substitution rules, - -* special characters: "<", ">", and "&" -* callouts - - -== Include Directive - ----- -INCLUDE_DIRECTIVE = "include::" PATH "[" ELEMENT_ATTRIBUTE "]" - -PATH = ABSOLUTE_PATH / RELATIVE_PATH - -ABSOLUTE_PATH = "/" WORD *( "/" WORD ) - -RELATIVE_PATH = ( "." / ".." ) "/" WORD * ( "/" WORD ) ----- - -== Images - -=== Inline image - ----- -IMAGE_INLINE = "image:" URL "[" (IMAGE_ATTRS) "]" - -IMAGE_ATTRS = TEXT ("," IMAGE_WIDTH ("," IMAGE_HEIGHT)) *("," IMAGE_OPTS) - -IMAGE_OPTS = IMAGE_OPT_KEY "=" 1*VCHAR - -IMAGE_OPT_KEY = "title" / "float" / "align" / "role" / "link" ----- - -== Video - ----- -BLOCK_VIDEO = "video::" (URL / WORD) "[" ( "youtube" / "vimeo" ) *(BLOCK_ATTR) "]" ----- - - -== Audio - ----- -BLOCK_AUDIO = "audio::" (URL / WORD) "[" - ( "options" "=" DQUOTE *AUDIO_ATTR_OPTIONS DQUOTE ) - "]" - -AUDIO_ATTR_OPTIONS = "autoplay" | "loop" | "controls" | "nocontrols" ----- - - -== Block attributes - ----- -BLOCK_ATTRS = BLOCK_ATTR *( "," BLOCK_ATTR ) - -BLOCK_ATTR = WORD "=" (DQUOTE) WORD (DQUOTE) ----- - - -== Inline formatting - -There are two types of inline formatting: constrained and unconstrained. -The constrained formatting only applicable if the previous character of syntax -begin with non-alphanumeric and end with characters other than alpha-numeric -and underscore. - ----- -FORMAT_BEGIN = WSP / "!" / DQUOTE / "#" / "$" / "%" / "&" / "'" / "(" / ")" - / "*" / "+" / "," / "-" / "." / "/" / - / ":" / ";" / "<" / "=" / ">" / "?" 
/ "@" - / "[" / "\" / "]" / "^" / "_" / "`" - / "{" / "|" / "}" / "~" - -FORMAT_END = FORMAT_BEGIN ----- - -=== Unconstrained bold - ----- -TEXT_UNCONSTRAINED_BOLD = "**" TEXT "**" ----- - -=== Unconstrained italic - ----- -TEXT_UNCONSTRAINED_ITALIC = "__" TEXT "__" ----- - -=== Unconstrained mono - ----- -TEXT_UNCONSTRAINED_MONO = "``" TEXT "``" ----- - -=== Bold - ----- -TEXT_BOLD = FORMAT_BEGIN "*" TEXT "*" FORMAT_END ----- - -=== Italic - ----- -TEXT_ITALIC = FORMAT_BEGIN "_" TEXT "_" FORMAT_END ----- - -=== Monospace - ----- -TEXT_MONO = FORMAT_BEGIN "`" TEXT "`" FORMAT_END ----- - -=== Double quote curve - ----- -TEXT_QUOTE_DOUBLE = QUOTE "`" TEXT "`" QUOTE ----- - -=== Single quote curve - ----- -TEXT_QUOTE_SINGLE = "'`" TEXT "`'" ----- - -=== Subscript - ----- -TEXT_SUBSCRIPT = "~" WORD "~" ----- - -=== Superscript - ----- -TEXT_SUPERSCRIPT = "^" WORD "^" ----- - -=== Attribute reference - ----- -ATTR_REF = "{" META_KEY "}" ----- - -The attribute reference will be replace with document attributes, if its -exist, otherwise it would be considered as normal text. - - -== Passthrough - ----- -PASSTHROUGH_SINGLE = FORMAT_BEGIN "+" TEXT "+" FORMAT_END - -PASSTHROUGH_DOUBLE = "++" TEXT "++" - -PASSTHROUGH_TRIPLE = "+++" TEXT "+++" - -PASSTHROUGH_BLOCK = "++++" LF 1*LINE "++++" LF ----- - - -== URLs - -The URL should end with "[]". - ----- -URL = URL_SCHEME "://" 1*VCHAR ( - "[" URL_TEXT ("," URL_ATTR_TARGET ) ("," URL_ATTR_ROLE ) "]" ) LWSP - -URL_TEXT = TEXT ("^") - -URL_ATTR_TARGET = "window" "=" "_blank" - -URL_ATTR_RILE = "role=" WORD *("," WORD) ----- - - -== Anchor - ----- -ANCHOR_LINE = "[[" REF_ID "]]" LF - -ANCHOR_LINE_SHORT = "[#" REF_ID "]" LF - -ANCHOR_INLINE = "[[" REF_ID "]]" TEXT - -ANCHOR_INLINE_SHORT = "[#" REF_ID "]#" TEXT "#" FORMAT_END. 
----- - -== Cross references - ----- -CROSS_REF_INTERNAL = "<<" REF_ID ("," REF_LABEL) / CROSS_REF_NATURAL ">>" - -CROSS_REF_NATURAL = BLOCK_TITLE ----- - -Rendered HTML, ----- -REF_LABEL / BLOCK_TITLE ----- - -The CROSS_REF_NATURAL only works if the text contains at least one uppercase -or space. - - -== Table - ----- -TABLE = TABLE_SEP LF *ROW LF TABLE_SEP - -TABLE_SEP = "|" 3*"=" - -ROW = 1*CELL - -CELL = CELL_FORMAT "|" TEXT (LF) - -CELL_FORMAT = CELL_DUP / CELL_SPAN_COL/ CELL_SPAN_ROW - / CELL_ALIGN_HOR / CELL_ALIGN_VER / CELL_STYLE - -CELL_DUP = 1*DIGIT "*" - -CELL_SPAN_COL = 1*DIGIT "+" - -CELL_SPAN_ROW = "." 1*DIGIT "+" - -CELL_ALIGN_HOR = "<" / "^" / ">" - -CELL_ALIGN_VER = "." ("<" / "^" / ">") - -CELL_STYLE = "a" / "d" / "e" / "h" / "l" / "m" / "s" / "v" ----- - - -== Inconsistencies and bugs on asciidoctor - -Listing style "[listing]" followed by "...." is become listing block. -Example, ----- -[listing] -.... -This block become listing. -.... ----- - -Image width and height with non-digits characters are allowed, -Example, ----- -image::sunset.jpg[Text,a,b] ----- - -Link with "https" end with '.' works, but "mailto" end with '.' is not -working. -Example, ----- -https://asciidoctor.org. - -mailto:me@example.com. ----- - -Block image with "link" option does not work as expected, ----- -image::{image-sunset}[Block image with attribute ref, link={test-url}]. ----- - -First table row with multiple lines does not considered as header, even -thought it separated by empty line. -Example, - ----- -|=== -|A1 -|B1 - -|A2 -|B2 -|=== ----- diff --git a/_doc/CHANGELOG.adoc b/_doc/CHANGELOG.adoc new file mode 100644 index 0000000..86054d8 --- /dev/null +++ b/_doc/CHANGELOG.adoc @@ -0,0 +1,192 @@ +// SPDX-FileCopyrightText: 2021 M. 
Shulhan +// SPDX-License-Identifier: GPL-3.0-or-later += Changelog for asciidoctor-go +Shulhan +21 February 2022 +:toc: +:sectanchors: +:sectlinks: + + +[#v0_3_1] +== asciidoctor-go v0.3.1 (2022-08-06) + +[#v0_3_1_chores] +=== Chores + +all: rewrite unit tests for inlineParser using test.Data:: ++ +-- +Using string literal for testing string input that may contains backtick +or double quote make the test code become unreadable and hard to modify. + +The test.Data help this by moving the input and expected output into +a file that can we write as is. +-- + +all: cleaning up codes:: Use raw string literal whenever possible. + +go.mod: update share to v0.40.0:: ++ +-- +This update fix some issues related to new line on test.Data. +-- + + +[#v0_3_0] +== asciidoctor-go v0.3.0 (2022-07-24) + +This release set the minimum Go version to 1.18. + +[#v0_3_0_breaking_changes] +=== Breaking changes + +all: refactoring handling generate ref ID:: ++ +-- +Previously, we always set the ID prefix and separator default to "\_" if +its not set, this cause all of the ID is prefixed with "\_". + +This changes use strict rules when generating ID following the Mozilla +specification [1] and latest AsciiDoc Language [2]. + +The idprefix must be ASCII string. +It must start with "\_", "-", or ASCII letters, otherwise the "\_" will +be added to the beginning. +If one of the character is not valid, it will replaced with "\_". + +The `idseparator` can be empty or single ASCII character ('\_' or '-', +ASCII letter, or digit). +It is used to replace invalid characters in the REF_ID. + +[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/id + +[2] https://docs.asciidoctor.org/asciidoc/latest/sections/id-prefix-and-separator/ +-- + +[#v0_3_0_enhancements] +=== Enhancements + +all: sort the generated HTML meta by names:: ++ +-- +The order of meta names are "author", "description", "generator", and +then "keywords". 
+-- + +all: store the list of author names under Attributes "author_names":: ++ +-- +Previously, to get list of author names, we need to iterate each of +the Authors field. + +This changes set the Attributes "author_names" to list of author full +names, each separated by comma. +-- + +all: add default metadata "generator":: ++ +-- +The generator metadata contains the library name and its current version. +-- + +all: realign all structs:: ++ +-- +This is to minimize memory allocation when using asciidoctor. +-- + +[#v0_3_0_chores] +=== Chores + +all: rewrite test using lib/test.Data:: ++ +-- +Previously, to test parser and check the generated HTML, we write AsciDoc +input and expected HTML output using literal string `...`. +The text of input and output sometimes long, take multiple lines, which +makes the test code ugly, hard to write, and read. + +Using lib/test.Data we can write the input and output as the AsciiDoc +markup and the HTML markup as is, simplify writing the test and more +readable. +-- + + +[#v0_2_0] +== asciidoctor-go v0.2.0 (2022-03-04) + +This release changes the license of asciidoctor-go from BSD to GPL 3.0 or +later. + +[#v0_2_0_bug_fixes] +=== Bug fixes + +all: fix list check box text get cut one character:: ++ +-- +Given the following asciidoc check box markup, + + * [ ] abc + +It will rendereded as "❏ bc" instead of "❏ abc". +-- + +[#v0_2_0_chores] +=== Chores + +all: replace bytes.Title and strings.Title with function:: ++ +-- +Both of those functions has been deprecated. + +Since the Title function is to convert the adminition string into a +human title (first letter uppercase), we can use a function to do that. +Any unknown admonition will be returned as is. 
+-- + + +[#v0_1_1] +== asciidoctor-go v0.1.1 (2021-12-06) + + +[#v0_1_1_bug_fixes] +=== Bug fixes + +all: fix parsing and rendering cross reference:: ++ +-- +Previously, when parsing cross reference we assume that if the string +contains upper-case letter then it's a label so we store it as title +to search for ID later. + +The bug is when ID is set manually and its contains upper-case for +example "[#Id]". + +This changes fix this issue by storing cross reference first, not +assuming it as ID or title, and then when doing rendering we check +whether its ID or title. +-- + +all: allow colon ':' and period '.' on the ID:: ++ +-- +According to XML spec [1], the colon is allowed as the first and the next +character. While period is only allowed on the next characters. + +[1] https://www.w3.org/TR/REC-xml/#NT-Name +-- + + +[#v0_1_0] +== asciidoctor-go v0.1.0 (2021-03-06) + +The asciidoctor-go is the Go module to parse the AsciiDoc (TM) markup +and convert it into HTML5. + +This first release bring almost all AsciiDoc syntax except for include +directive, inter-document cross-reference, macros, and non-primary syntax +features. + +I hope this library can be useful for Gophers who need the power of AsciiDoc +in their workflows. diff --git a/_doc/SPECS.adoc b/_doc/SPECS.adoc new file mode 100644 index 0000000..c3f974c --- /dev/null +++ b/_doc/SPECS.adoc @@ -0,0 +1,606 @@ +// SPDX-FileCopyrightText: 2020 M. Shulhan +// SPDX-License-Identifier: GPL-3.0-or-later += AsciiDoctor Document Specification +Shulhan +6 June 2020 +:toc: +:url_ref: https://docs.asciidoctor.org/asciidoc/latest + +This document contains grammar of asciidoc document markup language based on +https://asciidoctor.org/docs/user-manual[Asciidoctor User Manual]. + +== About implementation + +We try to follow the document syntax rules, but there are some inconsistencies +we found when the document parsed and rendered to HTML. 
+For example, the current asciidoctor allow the following inline formatting, + + _A `B_ C` + +to be rendered into the following HTML tree, + + A B C + +This is of course rendered correctly when opened in web browser, but it seems +break the tree. +In the previous implementation, we able to break down it into the following +tree, + + + B + + C + +But its open many inline formatting permutations which make the code more +complex than it should. + +This implementation, + +* use the strict asciidoctor syntax rules which we define in this document. + +* minimize duplicate markup. +** Support only "<<" ">>" syntax, drop "xref:" syntax + + +== Common grammar + +---- +EMPTY = "" + +DQUOTE = %d34 ; " + +WORD = 1*VCHAR ; Sequence of visible character without + ; white spaces. + +STRING = WORD *(WSP WORD) ; Sequence of word with spaces between them. + +LINE = STRING LF ; STRING that end with new line. + +TEXT = 1*LINE ; One or more LINE. + +REF_ID = 1*ALPHA *("-" / "_" / ALPHA / DIGIT) +---- + + +== Document header + +{url_ref}/document/header/[Reference^]. + +Document header consist of title and optional authors, a revision, and zero or +more metadata. +The document metadata can be in any order, before or after title, but the +author and revision MUST be after title and in order. + +---- +DOC_HEADER = *(DOC_ATTRIBUTE / COMMENTS) + "=" SP DOC_TITLE LF + (*DOC_ATTRIBUTE) + DOC_AUTHORS LF + (*DOC_ATTRIBUTE) + DOC_REVISION LF + (*DOC_ATTRIBUTE) +---- + +There are no empty line before and after the document header. +An empty line mark as the end of document header. + +=== Title + +{url_ref}/document/title/[Reference^]. + +---- +DOC_TITLE = 1*WORD [DOC_TITLE_SEP SUBTITLE] + +DOC_TITLE_SEP = ":" + +SUBTITLE = 1*WORD +---- + +=== Author information + +{url_ref}/document/author-information/[Reference^]. + +---- +DOC_AUTHORS = MAILBOX *( ";" MAILBOX ) + + MAILBOX = STRING [ "<" EMAIL ">" ] + + EMAIL = WORD "@" WORD "." 1*8ALPHA + ; simplified syntax of email format. 
+---- + +=== Revision information + +{url_ref}/document/revision-information/[Reference^]. + +---- +DOC_REVISION = DOC_REV_VERSION [ "," DOC_REV_DATE ] + +DOC_REV_VERSION = "v" 1*DIGIT "." 1*DIGIT "." 1*DIGIT + +DOC_REV_DATE = 1*2DIGIT 3*ALPHA 4*DIGIT +---- + +=== Metadata + +{url_ref}/document/metadata/[Reference^]. + +There are also metadata which affect how the document rendered, + +---- +DOC_ATTRIBUTE = ":" DOC_ATTR_KEY ":" *STRING LF + +DOC_ATTR_KEY = ( "toc" / "sectanchors" / "sectlinks" + / "imagesdir" / "data-uri" / *META_KEY ) LF + +META_KEY_CHAR = (A..Z | a..z | 0..9 | '_') + +META_KEY = 1META_KEY_CHAR *(META_KEY_CHAR | '-') +---- + + +=== HTML format + +HTML format for rendering section header, + +---- + +---- + +== Document preamble + +Any content after document title and before the new section is considered as +document preamble and its rendered inside the "content", not "header". + +HTML format, + +---- +
+
+
+ {DOC_PREAMBLE} +
+
+ ... +
+---- + + +== Block + +---- +BLOCK_REF = "[#" REF_ID *["." RoleName] "]" LF +---- + +=== Attribute + +---- +BLOCK_ATTR = "[" ATTR_NAME ("=" ATTR_VALUE) *("," ATTR_OPT) "]" LF + +ATTR_NAME = WORD + +ATTR_VALUE = STRING + +ATTR_OPT = ATTR_NAME ("=") ATTR_VALUE) +---- + + +== Table of contents + +The table of contents (ToC) will be generated if "toc" attribute is set in +document header with the following syntax, + +---- +TOC_ATTR = ":toc:" (TOC_PLACEMENT / TOC_POSITION ) + +TOC_PLACEMENT = ("auto" / "preamble" / "macro") + +TOC_POSITION = ("left" / "right") + +TOC_MACRO = "toc::[]" +---- + +If toc placement is empty it default to "auto", and placed after document +header. +If toc is set to "preamble" it will be set after document preamble. +If toc is set to "macro", it will be set after section title that have +TOC_MACRO. + +=== Title + +By default the ToC element will have the title set to "Table of Contents". +One can change the ToC title using attribute "toc-title", + +---- +TOC_TITLE = ":toc-title:" LINE +---- + +=== Levels + +By default only section level 1 and 2 will be rendered. +One can change it using the attribute "toclevels", + +---- +TOC_LEVELS = ":toclevels:" 1DIGIT +---- + + +== Sections + +Sections or headers group one or more paragraphs or blocks. +Each section is started with '=' character or '#' (markdown). +There are six levels or sections that are allowed in asciidoc, any more than +that will be considered as paragraph. + +---- +SECTION = [BLOCK_REF] + 2*6(EQUAL/HASH) 1*WSP LINE LF +---- + +HTML format, + +HTML class for section is `sectN`, where N is the level, which is equal to +number of '=' minus 1. + +---- +
+ {WORD} +
+ ... +
+
+---- + +=== Section Attributes + +==== idprefix + +---- +":idprefix:" EMPTY / REF_ID +---- + +The idprefix must be ASCII string. +It must start with "\_", "\-", or ASCII letters, otherwise the "\_" will be +prepended. +If one of the character is not valid, it will replaced with "\_". + +==== idseparator + +---- +":idseparator:" EMPTY / "-" / "_" / ALPHA +---- + +The `idseparator` can be empty or single ASCII character ("\_" or "\-", +ASCII letter, or digit). +It is used to replace invalid REF_ID character. + + +== Comments + +---- +COMMENT_SINGLE = "//" LINE + +COMMENT_BLOCK = "////" LF + *LINE + "////" LF + +COMMENTS = *(COMMENT_SINGLE / COMMENT_BLOCK) +---- + +The comment line cannot start with spaces, due to +link:#block_literal[Block literal]. + + +== Block listing + +---- +LISTING_STYLE = "[listing]" LF TEXT LF + +LISTING_BLOCK = "----" LF TEXT "----" LF +---- + + +== Block literal + +---- +LITERAL_PARAGRAPH = 1*WSP TEXT + +LITERAL_STYLE = "[literal]" LF TEXT LF + +LITERAL_BLOCK = "...." LF TEXT "...." LF +---- + +HTML format, + +---- +
+
+
{{TEXT}}
+
+
+----
+
+Substitution rules,
+
+* special characters: "<", ">", and "&"
+* callouts
+
+
+== Include Directive
+
+----
+INCLUDE_DIRECTIVE = "include::" PATH "[" ELEMENT_ATTRIBUTE "]"
+
+PATH = ABSOLUTE_PATH / RELATIVE_PATH
+
+ABSOLUTE_PATH = "/" WORD *( "/" WORD )
+
+RELATIVE_PATH = ( "." / ".." ) "/" WORD *( "/" WORD )
+----
+
+== Images
+
+=== Inline image
+
+----
+IMAGE_INLINE = "image:" URL "[" (IMAGE_ATTRS) "]"
+
+IMAGE_ATTRS = TEXT ("," IMAGE_WIDTH ("," IMAGE_HEIGHT)) *("," IMAGE_OPTS)
+
+IMAGE_OPTS = IMAGE_OPT_KEY "=" 1*VCHAR
+
+IMAGE_OPT_KEY = "title" / "float" / "align" / "role" / "link"
+----
+
+== Video
+
+----
+BLOCK_VIDEO = "video::" (URL / WORD) "[" ( "youtube" / "vimeo" ) *(BLOCK_ATTR) "]"
+----
+
+
+== Audio
+
+----
+BLOCK_AUDIO = "audio::" (URL / WORD) "["
+              ( "options" "=" DQUOTE *AUDIO_ATTR_OPTIONS DQUOTE )
+              "]"
+
+AUDIO_ATTR_OPTIONS = "autoplay" / "loop" / "controls" / "nocontrols"
+----
+
+
+== Block attributes
+
+----
+BLOCK_ATTRS = BLOCK_ATTR *( "," BLOCK_ATTR )
+
+BLOCK_ATTR = WORD "=" (DQUOTE) WORD (DQUOTE)
+----
+
+
+== Inline formatting
+
+There are two types of inline formatting: constrained and unconstrained.
+Constrained formatting applies only if the character before the opening mark
+is non-alphanumeric and the character after the closing mark is neither
+alphanumeric nor an underscore.
+
+----
+FORMAT_BEGIN = WSP / "!" / DQUOTE / "#" / "$" / "%" / "&" / "'" / "(" / ")"
+             / "*" / "+" / "," / "-" / "." / "/"
+             / ":" / ";" / "<" / "=" / ">" / "?" / "@"
+             / "[" / "\" / "]" / "^" / "_" / "`"
+             / "{" / "|" / "}" / "~"
+
+FORMAT_END = FORMAT_BEGIN
+----
+
+=== Unconstrained bold
+
+----
+TEXT_UNCONSTRAINED_BOLD = "**" TEXT "**"
+----
+
+=== Unconstrained italic
+
+----
+TEXT_UNCONSTRAINED_ITALIC = "__" TEXT "__"
+----
+
+=== Unconstrained mono
+
+----
+TEXT_UNCONSTRAINED_MONO = "``" TEXT "``"
+----
+
+=== Bold
+
+----
+TEXT_BOLD = FORMAT_BEGIN "*" TEXT "*" FORMAT_END
+----
+
+=== Italic
+
+----
+TEXT_ITALIC = FORMAT_BEGIN "_" TEXT "_" FORMAT_END
+----
+
+=== Monospace
+
+----
+TEXT_MONO = FORMAT_BEGIN "`" TEXT "`" FORMAT_END
+----
+
+=== Double quote curve
+
+----
+TEXT_QUOTE_DOUBLE = QUOTE "`" TEXT "`" QUOTE
+----
+
+=== Single quote curve
+
+----
+TEXT_QUOTE_SINGLE = "'`" TEXT "`'"
+----
+
+=== Subscript
+
+----
+TEXT_SUBSCRIPT = "~" WORD "~"
+----
+
+=== Superscript
+
+----
+TEXT_SUPERSCRIPT = "^" WORD "^"
+----
+
+=== Attribute reference
+
+----
+ATTR_REF = "{" META_KEY "}"
+----
+
+The attribute reference will be replaced with the document attribute's value if
+it exists; otherwise it is treated as normal text.
+
+
+== Passthrough
+
+----
+PASSTHROUGH_SINGLE = FORMAT_BEGIN "+" TEXT "+" FORMAT_END
+
+PASSTHROUGH_DOUBLE = "++" TEXT "++"
+
+PASSTHROUGH_TRIPLE = "+++" TEXT "+++"
+
+PASSTHROUGH_BLOCK = "++++" LF 1*LINE "++++" LF
+----
+
+
+== URLs
+
+The URL should end with "[]".
+
+----
+URL = URL_SCHEME "://" 1*VCHAR (
+      "[" URL_TEXT ("," URL_ATTR_TARGET ) ("," URL_ATTR_ROLE ) "]" ) LWSP
+
+URL_TEXT = TEXT ("^")
+
+URL_ATTR_TARGET = "window" "=" "_blank"
+
+URL_ATTR_ROLE = "role=" WORD *("," WORD)
+----
+
+
+== Anchor
+
+----
+ANCHOR_LINE = "[[" REF_ID "]]" LF
+
+ANCHOR_LINE_SHORT = "[#" REF_ID "]" LF
+
+ANCHOR_INLINE = "[[" REF_ID "]]" TEXT
+
+ANCHOR_INLINE_SHORT = "[#" REF_ID "]#" TEXT "#" FORMAT_END
+----
+
+== Cross references
+
+----
+CROSS_REF_INTERNAL = "<<" REF_ID ("," REF_LABEL) / CROSS_REF_NATURAL ">>"
+
+CROSS_REF_NATURAL = BLOCK_TITLE
+----
+
+Rendered HTML,
+----
+REF_LABEL / BLOCK_TITLE
+----
+
+The CROSS_REF_NATURAL only works if the text contains at least one uppercase
+letter or space.
+
+
+== Table
+
+----
+TABLE = TABLE_SEP LF *ROW LF TABLE_SEP
+
+TABLE_SEP = "|" 3*"="
+
+ROW = 1*CELL
+
+CELL = CELL_FORMAT "|" TEXT (LF)
+
+CELL_FORMAT = CELL_DUP / CELL_SPAN_COL / CELL_SPAN_ROW
+            / CELL_ALIGN_HOR / CELL_ALIGN_VER / CELL_STYLE
+
+CELL_DUP = 1*DIGIT "*"
+
+CELL_SPAN_COL = 1*DIGIT "+"
+
+CELL_SPAN_ROW = "." 1*DIGIT "+"
+
+CELL_ALIGN_HOR = "<" / "^" / ">"
+
+CELL_ALIGN_VER = "." ("<" / "^" / ">")
+
+CELL_STYLE = "a" / "d" / "e" / "h" / "l" / "m" / "s" / "v"
+----
+
+
+== Inconsistencies and bugs on asciidoctor
+
+Listing style "[listing]" followed by "...." becomes a listing block.
+Example,
+----
+[listing]
+....
+This block become listing.
+....
+----
+
+Image width and height with non-digit characters are allowed.
+Example,
+----
+image::sunset.jpg[Text,a,b]
+----
+
+A link with "https" that ends with '.' works, but "mailto" that ends with '.'
+does not.
+Example,
+----
+https://asciidoctor.org.
+
+mailto:me@example.com.
+----
+
+Block image with "link" option does not work as expected,
+----
+image::{image-sunset}[Block image with attribute ref, link={test-url}].
+----
+
+The first table row with multiple lines is not considered a header, even
+though it is separated by an empty line.
+Example,
+
+----
+|===
+|A1
+|B1
+
+|A2
+|B2
+|===
+----
diff --git a/_doc/index.adoc b/_doc/index.adoc
new file mode 120000
index 0000000..59a23c4
--- /dev/null
+++ b/_doc/index.adoc
@@ -0,0 +1 @@
+../README
\ No newline at end of file
diff --git a/index.adoc b/index.adoc
deleted file mode 120000
index 100b938..0000000
--- a/index.adoc
+++ /dev/null
@@ -1 +0,0 @@
-README
\ No newline at end of file
--
cgit v1.3