path: root/brokenlinks_worker.go
Age  Commit message  Author
2025-06-12  all: refactoring, move brokenlinks code to its own package  Shulhan
When two or more structs share the same prefix, it is time to move them into their own group. Also, in the future each command will be grouped into its own package.
2025-06-12  all: rename the json field page_links to broken_links  Shulhan
Naming the field page_links does not make sense when the result comes from the brokenlinks command.
2025-06-11  all: revert to use HTTP GET on external, non-image URL  Shulhan
Using HTTP HEAD on certain pages may return:
* 404, not found, for example on https://support.google.com/accounts/answer/1066447
* 405, method not allowed, for example on https://aur.archlinux.org/packages/rescached-git
For a 405 response code we can check and retry with GET, but for 404 it is impossible to check whether the URL really exists, since 404 means page not found.
2025-06-11  all: check for DNS timeout and retry 5 times  Shulhan
When the call to HTTP HEAD or GET returns an error and the error is a *net.DNSError with Timeout set, retry the call until it succeeds or it has timed out 5 times.
2025-06-05  all: encode the whole BrokenlinksResult struct to JSON  Shulhan
Previously, we only encoded BrokenlinksResult.PageLinks. The struct may change in the future, so it is better to encode the whole struct now rather than change the output later.
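
The difference in output shape can be sketched as below. Only the names BrokenlinksResult and PageLinks come from the commit message; the field type, the `page_links` tag as used at this point in the history, and the two helper functions are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BrokenlinksResult uses the struct name from the commit message.
// The field type and JSON tag are assumptions for this sketch.
type BrokenlinksResult struct {
	PageLinks map[string][]string `json:"page_links"`
}

// encodeLinksOnly models the earlier behavior: only the field is encoded.
func encodeLinksOnly(r BrokenlinksResult) ([]byte, error) {
	return json.Marshal(r.PageLinks)
}

// encodeWhole models the new behavior: the whole struct is encoded,
// so fields added later appear as new keys without changing the shape.
func encodeWhole(r BrokenlinksResult) ([]byte, error) {
	return json.Marshal(r)
}

func main() {
	r := BrokenlinksResult{PageLinks: map[string][]string{
		"https://web.tld/": {"https://web.tld/missing"},
	}}
	before, _ := encodeLinksOnly(r)
	after, _ := encodeWhole(r)
	fmt.Println(string(before)) // {"https://web.tld/":["https://web.tld/missing"]}
	fmt.Println(string(after))  // {"page_links":{"https://web.tld/":["https://web.tld/missing"]}}
}
```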
2025-06-05  all: add option to scan past result  Shulhan
The brokenlinks command now has an option "-past-result" that accepts the path to a JSON file from a past result. If it is set, the program will only scan the pages with broken links listed in that report.
2025-06-01  all: brokenlinks should scan only URL on given path  Shulhan
Previously, if we passed a URL with a path to brokenlinks, for example "web.tld/path", it would scan all of the pages on the website "web.tld". Now it only scans "/path" and its sub paths.
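
The scoping rule can be sketched as a host and path-prefix check. This is one plausible reading, not the repository's implementation: `inScope` and `mustParse` are hypothetical names, and the example URLs carry an explicit scheme so that `url.Parse` fills in the host.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// inScope (hypothetical) reports whether link is on the same host as
// base and its path is the base path itself or one of its sub paths.
func inScope(base, link *url.URL) bool {
	if link.Host != base.Host {
		return false
	}
	prefix := strings.TrimSuffix(base.Path, "/")
	// Require a "/" after the prefix so "/pathology" does not match "/path".
	return link.Path == prefix || strings.HasPrefix(link.Path, prefix+"/")
}

// mustParse parses raw or panics; it keeps the example short.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	base := mustParse("https://web.tld/path")
	for _, raw := range []string{
		"https://web.tld/path/sub", // true
		"https://web.tld/pathology", // false
		"https://web.tld/other",     // false
	} {
		fmt.Println(raw, inScope(base, mustParse(raw)))
	}
}
```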
2025-06-01  all: use separate logs for worker and main program  Shulhan
The worker uses a log with date and time, while the main program does not.
2025-06-01  all: rename the program and repository into jarink  Shulhan
Jarink is a program that helps web administrators maintain their websites. Currently it provides a command to scan for broken links.