
DeadLinks

A tool for crawling and finding links to URLs which no longer exist. deadlinks supports the HTTP(S) and Gemini protocols, and is intended for periodically checking links on personal websites and blogs.

Library

The deadlinks package is designed to be easily embedded into a process and have its results displayed in something like a status page.

See the godocs for more info.

Command-Line

The command-line utility can be installed using go install:

go install code.betamike.com/mediocregopher/deadlinks/cmd/deadlinks@latest

The -url parameter is required. deadlinks checks the page at the given URL for dead links. The parameter can be specified more than once:

deadlinks -url='https://mediocregopher.com' -url='gemini://mediocregopher.com'

Any links which are dead will be output to stdout as YAML objects, each containing the dead URL, the error encountered, and which pages link to it.

To crawl recursively through links, give one or more regex patterns using the -follow parameter. Any URL which matches a pattern will have its links followed and checked as well (and if any of those linked URLs match a pattern, their links will be checked, and so on):

deadlinks \
    -url='https://mediocregopher.com' -url='gemini://mediocregopher.com' \
    -follow='://mediocregopher.com'

Further options are available which affect the utility's behavior; see deadlinks -h for more.