<feed xmlns='http://www.w3.org/2005/Atom'>
<title>jarink/brokenlinks/brokenlinks_test.go, branch main</title>
<subtitle>Program to inspect and maintain web sites.</subtitle>
<id>http://git.kilabit.info/jarink/atom?h=main</id>
<link rel='self' href='http://git.kilabit.info/jarink/atom?h=main'/>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/'/>
<updated>2026-02-11T18:04:40Z</updated>
<entry>
<title>brokenlinks: store the anchor or image source in link</title>
<updated>2026-02-11T18:04:40Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-02-11T18:04:40Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=9c7ee77376294e9abd70ca356e26d0ab16ad7466'/>
<id>urn:sha1:9c7ee77376294e9abd70ca356e26d0ab16ad7466</id>
<content type='text'>
In the struct Link, we add the field Value that stores the href from an A
element or the src from an IMG element.
This allows us to debug any error during a scan, especially when joining
the path and the link.
</content>
</entry>
<entry>
<title>brokenlinks: make link that returns HTML always end with slash</title>
<updated>2026-02-11T14:45:06Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-02-11T03:47:42Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=8100b3be0730173a77f1a64f9ac6bc8862a159ac'/>
<id>urn:sha1:8100b3be0730173a77f1a64f9ac6bc8862a159ac</id>
<content type='text'>
If a parent URL like "/page" returns the body as an HTML page, the URL
should end with a slash so that the relative links inside it work when
joined with the parent URL.
</content>
</entry>
<entry>
<title>all: mark and skip the slow test</title>
<updated>2026-01-21T21:15:47Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-01-21T21:15:47Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=a33efc3992f58355eb98d7a5574df955952924b8'/>
<id>urn:sha1:a33efc3992f58355eb98d7a5574df955952924b8</id>
<content type='text'>
TestScan_slow takes around 11 seconds because the test includes
[time.Sleep].
</content>
</entry>
<entry>
<title>brokenlinks: print the progress to stderr</title>
<updated>2026-01-21T18:51:52Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-01-21T18:51:52Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=2a4376d5ddeee82d4ef38f4953453abb43e85220'/>
<id>urn:sha1:2a4376d5ddeee82d4ef38f4953453abb43e85220</id>
<content type='text'>
Each time the scan starts, a new link is added to the queue, or fetching
starts, print a message to stderr.
This removes the verbose options for a better user experience.
</content>
</entry>
<entry>
<title>all: refactoring, use single struct to represent Link</title>
<updated>2026-01-21T18:39:41Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-01-21T18:39:41Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=79eaccc81b85eb92dab9cf18d52662f367903652'/>
<id>urn:sha1:79eaccc81b85eb92dab9cf18d52662f367903652</id>
<content type='text'>
Previously, we had [jarink.Link], [brokenlinks.Broken], and
[brokenlinks.linkQueue] to store the metadata for a link.

These changes unify them into the struct [jarink.Link].
</content>
</entry>
<entry>
<title>brokenlinks: refactoring the logic, simplify the code</title>
<updated>2026-01-21T17:27:18Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2026-01-21T17:27:18Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=26fc8bd3203dae6b4705ada227439c90129bbe36'/>
<id>urn:sha1:26fc8bd3203dae6b4705ada227439c90129bbe36</id>
<content type='text'>
Previously, we made the scan logic run in multiple goroutines, with
one channel to push and consume the results and another channel to push
and pop the links to be processed.
That logic was very complicated, making it hard to read and debug.

These changes refactor it to use a single goroutine that pushes and pops
links to and from a slice, used as a queue.
</content>
</entry>
<entry>
<title>brokenlinks: fix infinite loop on unknown host</title>
<updated>2025-11-20T10:12:19Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2025-07-01T19:07:43Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=5301c666eec35699bbb9024678bb37adc057404c'/>
<id>urn:sha1:5301c666eec35699bbb9024678bb37adc057404c</id>
<content type='text'>
On a link with an invalid domain, the scan should break and return the
error immediately.

</content>
</entry>
<entry>
<title>brokenlinks: implement caching for external URLs</title>
<updated>2025-06-27T05:19:23Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2025-06-21T08:20:01Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=1ca561ed0ecfa59b70a10191ac8e58cde90d126e'/>
<id>urn:sha1:1ca561ed0ecfa59b70a10191ac8e58cde90d126e</id>
<content type='text'>
Any successful fetch of an external URL is recorded in the jarink
cache file, located in the user's home cache directory.
For example, on Linux it would be `$HOME/.cache/jarink/cache.json`.

This helps speed up future rescans of the same or a different target
URL by minimizing network requests.
</content>
</entry>
<entry>
<title>all: add test cases for simulating slow server</title>
<updated>2025-06-18T18:06:48Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2025-06-18T18:06:48Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=8bc8fce1bd80b5a25c452ac5a24b1a1e3f5a4feb'/>
<id>urn:sha1:8bc8fce1bd80b5a25c452ac5a24b1a1e3f5a4feb</id>
<content type='text'>
The test runs a server with several pages, each of which waits for a
different [time.Sleep] duration before returning the response.

This allows us to see how the main scan loop works while waiting
for resultq and listWaitStatus.

</content>
</entry>
<entry>
<title>brokenlinks: add test cases for IgnoreStatus options</title>
<updated>2025-06-17T16:49:18Z</updated>
<author>
<name>Shulhan</name>
<email>ms@kilabit.info</email>
</author>
<published>2025-06-17T16:49:18Z</published>
<link rel='alternate' type='text/html' href='http://git.kilabit.info/jarink/commit/?id=5ac82928702a0f0a7be0ef6e96ab04c39a7e8e9d'/>
<id>urn:sha1:5ac82928702a0f0a7be0ef6e96ab04c39a7e8e9d</id>
<content type='text'>
There are two test cases: one for an invalid status code like "abc",
and one for an unknown status code like "50".

</content>
</entry>
</feed>
