# DeadLinks

A tool for crawling and finding links to URLs which no longer exist. deadlinks supports the HTTP(s) and gemini protocols, and is intended for periodically checking links on personal websites and blogs.

## Library

The deadlinks package is designed to be easily embedded into a process and have its results displayed in something like a status page.
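
As a rough illustration of what embedding could look like, here is a minimal sketch; the `deadlinks.New` and `Check` calls and the result fields below are assumptions made for the sake of example, not the package's actual API, which is documented in the godocs:

```go
package main

import (
	"context"
	"fmt"

	"code.betamike.com/mediocregopher/deadlinks"
)

func main() {
	ctx := context.Background()

	// NOTE: the constructor and method names used here are assumptions for
	// this sketch; consult the godocs for the real types and signatures.
	dl, err := deadlinks.New(
		ctx,
		[]string{"https://mediocregopher.com"}, // root URLs to check
		[]string{"://mediocregopher.com"},      // follow patterns for recursive crawling
	)
	if err != nil {
		panic(err)
	}

	// Re-crawl and collect any dead links, e.g. on a timer, then render the
	// results into a status page.
	dead, err := dl.Check(ctx)
	if err != nil {
		panic(err)
	}
	for _, r := range dead {
		fmt.Printf("%s: %s\n", r.URL, r.Error)
	}
}
```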

See the godocs for more info.

## Command-Line

The command-line utility can be installed using `go install`:

```
go install code.betamike.com/mediocregopher/deadlinks/cmd/deadlinks
```

The `-url` parameter is required; deadlinks will check the given URL for any dead links. It can be specified more than once:

```
deadlinks -url='https://mediocregopher.com' -url='gemini://mediocregopher.com'
```

Any links which are dead will be output to stdout as YAML objects, each containing the dead URL, the error encountered, and which pages link to it.
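
For example, a dead link might be reported as an object along these lines (the field names and layout here are illustrative only, not a specification of the exact output):

```yaml
- url: https://mediocregopher.com/assets/missing.css
  error: unexpected response code 404
  linkedBy:
    - https://mediocregopher.com/
```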

In order to recursively crawl through links, give one or more regex patterns using the `-follow` parameter. Any URL which matches a pattern will have its links followed and checked as well (and if any of those linked URLs match a pattern, their links will be checked too, and so on):

```
deadlinks \
    -url='https://mediocregopher.com' -url='gemini://mediocregopher.com' \
    -follow='://mediocregopher.com'
```

There are further options available which affect the utility's behavior; see `deadlinks -h` for more.