| Age | Commit message | Author |
|
Putting "html" under the "net" package makes no sense.
Another reason is to keep the packages flat under the "lib/" directory.
|
|
HTTP requests are now implicitly created with a context.
Any false positive about an unclosed HTTP response body has been
annotated with "nolint:bodyclose".
In the example code, use a consistent "// Output:" comment format by
prefixing it with a single space.
Any comment on code is now also prefixed with a single space.
An error returned without variables now uses [errors.New] instead of
[fmt.Errorf].
Any error returned using [fmt.Errorf] is now wrapped using "%w" instead
of "%s".
Also, check errors using [errors.Is] or [errors.As] instead of the
equal/not-equal operators.
Any statement like "x = x OP y" is now replaced with "x OP= y".
Also, swap statements are simplified using "x, y = y, x".
Any switch statement with a single case is now replaced with an
if-condition.
Any defer in a function or program that calls [os.Exit] is now
replaced by calling the deferred function directly.
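The reason for the last rule is that os.Exit ends the program
immediately, without running deferred functions.
A minimal sketch, assuming a hypothetical helper named finish and a
package variable exit that exists only so the sketch can be exercised
without ending the process:

```go
package main

import (
	"fmt"
	"os"
)

// exit is an assumption made for this sketch: a variable that normally
// points to os.Exit but can be swapped out.
var exit = os.Exit

// finish is a hypothetical helper that cleans up and then exits.
func finish(code int, cleanup func()) {
	// Not "defer cleanup()": a deferred call would never run, because
	// os.Exit does not run deferred functions.
	cleanup()
	exit(code)
}

func main() {
	finish(0, func() { fmt.Println("cleanup done") })
}
```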
Any if-else chain is now replaced with a switch statement, if possible.
|
|
There are several reasons why we are moving away from github.com.
First, the name of the package.
We accidentally named the package "share", a common English word that
does not reflect the content of the repository.
By moving to another repository, we can rename it to a better and unique
name, in this case "pakakeh.go".
Pakakeh is the Minang word for tools, and the ".go" suffix indicates that
the repository is related to the Go programming language.
Second, supporting open source.
The new repository is hosted on sourcehut.org, whose founder is known
to support open source, and all of its services are licensed under the
AGPL, unlike GitHub, which is closed source.
Third, regarding GitHub Copilot.
The GitHub Terms of Service [1] grant GitHub the right to parse any
public content hosted there.
On the one hand, GitHub helps open source flourish; on the other hand,
it has issues regarding scraping copyleft-licensed code [2].
[1]: https://docs.github.com/en/site-policy/github-terms/github-terms-of-service#4-license-grant-to-us
[2]: https://githubcopilotinvestigation.com
|
|
Instead of calling bytes.Replace three times, iterate over the plain
text manually to clean up white space and repeated spaces.
Benchmark result:
name        old time/op    new time/op    delta
Sanitize-8    4.27µs ±10%    2.64µs ±13%  -38.21%  (p=0.000 n=10+10)

name        old alloc/op   new alloc/op   delta
Sanitize-8    4.84kB ± 0%    4.45kB ± 0%   -7.94%  (p=0.000 n=10+10)

name        old allocs/op  new allocs/op  delta
Sanitize-8      13.0 ± 0%       6.0 ± 0%  -53.85%  (p=0.000 n=10+10)
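The idea of a single manual pass can be sketched as below.
This is not the code from the commit: collapseSpaces is a hypothetical
name, and the exact clean-up rules (collapse every run of ASCII white
space into one space, drop leading and trailing space) are an assumption.

```go
package main

import "fmt"

// collapseSpaces is a hypothetical helper. It turns every run of ASCII
// white space into a single space and drops leading and trailing white
// space, in one pass and with one allocation.
func collapseSpaces(in []byte) []byte {
	out := make([]byte, 0, len(in))
	pendingSpace := false
	for _, c := range in {
		switch c {
		case ' ', '\t', '\n', '\v', '\f', '\r':
			// Remember the gap, but only if some text precedes it.
			pendingSpace = len(out) > 0
		default:
			// Emit the remembered gap only once more text follows.
			if pendingSpace {
				out = append(out, ' ')
				pendingSpace = false
			}
			out = append(out, c)
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", collapseSpaces([]byte("  a \n\t b  "))) // "a b"
}
```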
|
|
Since the sanitize package only contains an HTML function, and the html
package already exists, we move the function into the html package.
|
|
While at it, minimize allocations by reusing the input []byte as the
output.
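Reusing the input as output is commonly written with the "in[:0]"
idiom, sketched below with a hypothetical filterInPlace function.
It is safe only when the output never grows past the bytes already
read, and the caller must accept that the input is overwritten.

```go
package main

import "fmt"

// filterInPlace is a hypothetical helper that keeps only the bytes for
// which keep returns true, writing the result over the input's own
// backing array instead of allocating a new slice.
func filterInPlace(in []byte, keep func(byte) bool) []byte {
	out := in[:0] // zero length, same backing array as in
	for _, c := range in {
		if keep(c) {
			// The write index never passes the read index, so no
			// unread byte is overwritten.
			out = append(out, c)
		}
	}
	return out
}

func main() {
	in := []byte("a-b-c")
	out := filterInPlace(in, func(c byte) bool { return c != '-' })
	fmt.Printf("%s\n", out) // abc
}
```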
|
|
Given an input string, NormalizeForID normalizes it into a valid HTML
ID.
The normalization follows the rules of the Mozilla specification [1]:
- it must not contain whitespace (spaces, tabs, etc.),
- only ASCII letters, digits, '_', and '-' should be used, and
- it should start with a letter.
An empty string is normalized to "_".
Any other character is replaced with '_'.
If the input does not start with a letter, it is prefixed with '_',
unless it already starts with '_'.
[1]: https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/id
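The rules above can be sketched as follows.
This is a reading of the description, not the package's code; the
lowercase normalizeForID and the isASCIILetter helper are hypothetical,
and replacing each non-ASCII rune with a single '_' is an assumption.

```go
package main

import "fmt"

// isASCIILetter reports whether r is in a-z or A-Z.
func isASCIILetter(r rune) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}

// normalizeForID is a sketch of the rules described above.
func normalizeForID(in string) string {
	out := make([]byte, 0, len(in)+1)
	for _, r := range in {
		switch {
		case isASCIILetter(r), r >= '0' && r <= '9', r == '_', r == '-':
			out = append(out, byte(r))
		default:
			// White space and any other character become '_'.
			out = append(out, '_')
		}
	}
	if len(out) == 0 {
		return "_"
	}
	// Prefix with '_' when the ID starts with a digit or '-'.
	if first := rune(out[0]); !isASCIILetter(first) && first != '_' {
		out = append([]byte{'_'}, out...)
	}
	return string(out)
}

func main() {
	for _, in := range []string{"", "1st item", "_x", "-x"} {
		fmt.Printf("%q => %q\n", in, normalizeForID(in))
	}
}
```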
|
|
The NodeIterator has the method Next(), which returns the first child
or the next sibling of the current node, iterating from top to bottom.
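A minimal sketch of such an iteration, using a stand-in tree type so
that it runs without x/net/html.
The node and nodeIterator types, their fields, and the choice to climb
back to a parent's sibling when a branch ends are assumptions, not the
package's actual API.

```go
package main

import "fmt"

// node is a stand-in for an HTML node with only the links needed here.
type node struct {
	Data                            string
	Parent, FirstChild, NextSibling *node
}

// appendChild links c as the last child of n.
func (n *node) appendChild(c *node) {
	c.Parent = n
	if n.FirstChild == nil {
		n.FirstChild = c
		return
	}
	last := n.FirstChild
	for last.NextSibling != nil {
		last = last.NextSibling
	}
	last.NextSibling = c
}

// nodeIterator is a hypothetical iterator positioned on a current node.
type nodeIterator struct {
	root, current *node
}

func newNodeIterator(root *node) *nodeIterator {
	return &nodeIterator{root: root, current: root}
}

// Next moves to the first child if there is one, otherwise to the next
// sibling, climbing up through parents when a branch is exhausted.
// It returns nil once every node under root has been visited.
func (it *nodeIterator) Next() *node {
	n := it.current
	if n == nil {
		return nil
	}
	if n.FirstChild != nil {
		it.current = n.FirstChild
		return it.current
	}
	for n != nil && n != it.root && n.NextSibling == nil {
		n = n.Parent
	}
	if n == nil || n == it.root {
		it.current = nil
	} else {
		it.current = n.NextSibling
	}
	return it.current
}

func main() {
	root, a, b := &node{Data: "root"}, &node{Data: "a"}, &node{Data: "b"}
	root.appendChild(a)
	root.appendChild(b)
	a.appendChild(&node{Data: "a1"})

	it := newNodeIterator(root)
	for n := it.Next(); n != nil; n = it.Next() {
		fmt.Println(n.Data) // a, a1, b
	}
}
```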
|
|
The x/net/html package currently provides only bare functionality for
iterating the tree: there is no check for empty nodes, and no function
to get an attribute by name without looping manually.
This package extends it by adding methods to get a node's attribute
by name, get the first non-empty child, and get the next non-empty
sibling.
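The three helpers could look roughly like the sketch below.
It uses local stand-in types whose fields mirror a subset of the
x/net/html Node and Attribute, so it runs with the standard library
alone.
The method names, and the assumption that an "empty" node is a text
node containing only white space, are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type nodeType int

const (
	textNode nodeType = iota
	elementNode
)

// attribute and node are stand-ins for the x/net/html types.
type attribute struct{ Key, Val string }

type node struct {
	Type                    nodeType
	Data                    string
	Attr                    []attribute
	FirstChild, NextSibling *node
}

// attrValue returns the value of the attribute named key, if present.
func (n *node) attrValue(key string) (string, bool) {
	for _, a := range n.Attr {
		if a.Key == key {
			return a.Val, true
		}
	}
	return "", false
}

// isEmpty assumes that "empty" means a white-space-only text node.
func (n *node) isEmpty() bool {
	return n.Type == textNode && strings.TrimSpace(n.Data) == ""
}

// skipEmpty returns the first non-empty node starting at n, following
// sibling links, or nil if there is none.
func skipEmpty(n *node) *node {
	for n != nil && n.isEmpty() {
		n = n.NextSibling
	}
	return n
}

func (n *node) firstNonEmptyChild() *node  { return skipEmpty(n.FirstChild) }
func (n *node) nextNonEmptySibling() *node { return skipEmpty(n.NextSibling) }

func main() {
	// <div>\n  <p class="x"></p> <span></span></div>
	span := &node{Type: elementNode, Data: "span"}
	gap := &node{Type: textNode, Data: " ", NextSibling: span}
	p := &node{Type: elementNode, Data: "p",
		Attr: []attribute{{Key: "class", Val: "x"}}, NextSibling: gap}
	lead := &node{Type: textNode, Data: "\n  ", NextSibling: p}
	div := &node{Type: elementNode, Data: "div", FirstChild: lead}

	first := div.firstNonEmptyChild()
	class, _ := first.attrValue("class")
	fmt.Println(first.Data, class)                // p x
	fmt.Println(first.nextNonEmptySibling().Data) // span
}
```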
|