|
When two or more structs share the same prefix, that is a sign it is time
to group them together.
Also, we will group one command into one package in the future.
|
|
Naming it page_links does not make sense when the result comes from the
brokenlinks command.
|
|
Previously, we only encoded BrokenlinksResult.PageLinks.
The struct may change in the future, so it is better to encode the whole
struct now rather than having to change the output format later.
|
|
The brokenlinks command now has an option "-past-result" that accepts a
path to a JSON file produced by a past run.
If it is set, the program will only scan the pages that had broken links
in that report.
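The flag wiring could look like the sketch below, using the standard flag package. This is hypothetical; the actual command may parse its options differently:

```go
package main

import (
	"flag"
	"fmt"
)

// parsePastResult extracts the -past-result option from the command's
// arguments. Hypothetical helper; the real command may differ.
func parsePastResult(args []string) (string, error) {
	fs := flag.NewFlagSet("brokenlinks", flag.ContinueOnError)
	path := fs.String("past-result", "", "path to a JSON report from a past run")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	path, err := parsePastResult([]string{"-past-result", "report.json"})
	if err != nil {
		panic(err)
	}
	// Only the pages listed in this report would be rescanned.
	fmt.Println("rescanning only pages listed in", path)
}
```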
|
|
The fragment part of a URL, for example "/page#fragment", should be
removed; otherwise the same page will be indexed as a different URL.
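Stripping the fragment is straightforward with net/url — a sketch, with a hypothetical helper name:

```go
package main

import (
	"fmt"
	"net/url"
)

// stripFragment removes the "#fragment" part so that "/page" and
// "/page#section" index as the same URL. Hypothetical helper name.
func stripFragment(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Fragment = ""
	return u.String(), nil
}

func main() {
	s, _ := stripFragment("https://example.com/page#fragment")
	fmt.Println(s) // https://example.com/page
}
```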
|
|
|
|
Any HTML link that points to a domain other than the scanned domain
should not get parsed.
We only check whether the link is valid or not.
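The same-domain check could be done by comparing hostnames — a sketch with an assumed helper name; subdomain handling is also an assumption:

```go
package main

import (
	"fmt"
	"net/url"
)

// isSameDomain reports whether link belongs to the scanned domain.
// Hypothetical helper; it compares exact hostnames, so subdomains
// count as external here.
func isSameDomain(base, link string) bool {
	b, err := url.Parse(base)
	if err != nil {
		return false
	}
	l, err := url.Parse(link)
	if err != nil {
		return false
	}
	return l.Hostname() == b.Hostname()
}

func main() {
	fmt.Println(isSameDomain("https://example.com", "https://example.com/about")) // true
	fmt.Println(isSameDomain("https://example.com", "https://other.org/page"))    // false
}
```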
|
|
For links to images, we can skip parsing them.
|
|
The tests should not require an internet connection to pass.
|
|
It turns out that broken HTML still gets parsed by the "net/html" package.
|
|
The current implementation covers at least 84% of the cases.
Todo,
* CLI for scan
* add more test cases for 100% coverage, including scan on an invalid
  base URL, scan on an invalid HTML page, and scan on an invalid href
  or image src
|