| Age | Commit message | Author |
|
In the struct Link, we add the field Value, which stores the href from
an A element or the src from an IMG element.
This allows us to debug any error during the scan, especially when
joining the path and the link.
|
|
If a parent URL like "/page" returns its body as an HTML page, the URL
should end with a slash so that the relative links inside it work when
joined with the parent URL.
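
The effect of the trailing slash can be seen with a minimal sketch that
uses only the standard [net/url] package.
The helper joinLink and the example.com URLs are hypothetical; they only
illustrate the join, not the actual code in this repository.

```go
package main

import (
	"fmt"
	"net/url"
)

// joinLink resolves a link found inside a page against the URL of that
// page. It is a hypothetical helper, used here only for illustration.
func joinLink(parent, link string) (string, error) {
	base, err := url.Parse(parent)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(link)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	// Without the trailing slash, "page" is treated as a file name and
	// is replaced by the relative link.
	noSlash, _ := joinLink("http://example.com/page", "img.png")

	// With the trailing slash, "page/" is treated as a directory and
	// the relative link is appended to it.
	withSlash, _ := joinLink("http://example.com/page/", "img.png")

	fmt.Println(noSlash)   // http://example.com/img.png
	fmt.Println(withSlash) // http://example.com/page/img.png
}
```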
|
|
TestScan_slow takes around 11 seconds because the test includes
[time.Sleep].
|
|
Each time the scan starts, a new item is added to the queue, or a fetch
starts, print a message to stderr.
This removes the verbose option for a better user experience.
|
|
Previously, we had [jarink.Link], [brokenlinks.Broken], and
[brokenlinks.linkQueue] to store the metadata for a link.
These changes unify them into the struct [jarink.Link].
|
|
Previously, we made the scan logic run in multiple goroutines, with
one channel to push and consume the results and another channel to push
and pop the links to be processed.
That logic was very complicated, making it hard to read and debug.
These changes refactor it to use a single goroutine that pushes and pops
links to and from a slice, used as a queue.
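
The idea of a slice used as a queue in a single goroutine can be
sketched roughly as follows.
The function scanAll, the fetch callback, and the in-memory site map are
assumptions made for this example; the real scan loop also handles
results and waiting links, which are left out here.

```go
package main

import "fmt"

// scanAll processes links one at a time, using a slice as a FIFO queue.
// fetch is a hypothetical stand-in that returns the links found on a
// page.
func scanAll(start string, fetch func(link string) []string) []string {
	var (
		queue = []string{start}
		seen  = map[string]bool{start: true}
		order []string
	)
	for len(queue) > 0 {
		// Pop the first link from the queue.
		link := queue[0]
		queue = queue[1:]
		order = append(order, link)

		// Push every link that has not been queued before.
		for _, next := range fetch(link) {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	return order
}

func main() {
	// A tiny in-memory "site": page path to the links it contains.
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/b", "/c"},
		"/b": {"/"},
	}
	order := scanAll("/", func(link string) []string {
		return site[link]
	})
	fmt.Println(order) // [/ /a /b /c]
}
```

Because only one goroutine touches the queue and the seen map, no
locking or channel is needed in this sketch.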
|
|
On a link with an invalid domain, the scan should break and return the
error immediately.
|
|
Any successful fetch of an external URL will be recorded in the jarink
cache file, located in the user's home cache directory.
For example, on Linux it would be `$HOME/.cache/jarink/cache.json`.
This helps speed up future rescans of the same or a different target
URL by minimizing network requests.
|
|
The test runs a server with several pages, each of which calls
[time.Sleep] with a different duration before returning the response.
This allows us to see how the main scan loop works while waiting
for resultq and listWaitStatus.
|
|
There are two test cases: one for an invalid status code such as "abc",
and one for an unknown status code such as "50".
|
|
|
|
Before the Options are passed to the worker, they should be validated,
including the URL to be scanned.
|
|
The insecure option allows connections to servers with invalid
certificates and does not report them as errors.
|
|
When a link is known to have issues, one can ignore its status code
while scanning for broken links by using the "-ignore-status" option.
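
One possible way to parse the option value is sketched below.
It assumes the value is a comma-separated list of codes and treats a
code as unknown when [http.StatusText] returns an empty string; the
function parseIgnoreStatus and the error messages are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// parseIgnoreStatus parses a comma-separated list of HTTP status codes.
// It rejects values that are not numbers (invalid) and numbers that are
// not registered status codes in net/http (unknown).
func parseIgnoreStatus(value string) ([]int, error) {
	var codes []int
	for _, field := range strings.Split(value, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		code, err := strconv.Atoi(field)
		if err != nil {
			return nil, fmt.Errorf("invalid status code %q", field)
		}
		if http.StatusText(code) == "" {
			return nil, fmt.Errorf("unknown status code %q", field)
		}
		codes = append(codes, code)
	}
	return codes, nil
}

func main() {
	for _, value := range []string{"403, 404", "abc", "50"} {
		codes, err := parseIgnoreStatus(value)
		fmt.Println(codes, err)
	}
}
```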
|
|
When two or more structs share the same prefix, it is time to move them
into their own group.
Also, we will group one command into one package in the future.
|