A tool for crawling and finding links to URLs which no longer exist
Brian Picciano db3e6029b9 Add option to set http user agent 5 months ago

DeadLinks

A tool for crawling and finding links to URLs which no longer exist. deadlinks supports the HTTP(s) and gemini protocols, and is intended for periodically checking links on personal websites and blogs.

Library

The deadlinks package is designed to be easily embedded into a process, so that its results can be displayed in something like a status page.

See the godocs for more info.

Command-Line

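The command-line utility can be installed using go install (recent Go versions require a version suffix such as @latest when the command is run outside of a module):

go install code.betamike.com/mediocregopher/deadlinks/cmd/deadlinks@latest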

The -urls parameter is required. Given one or more comma-separated URLs, it checks the links found at each one and reports any which are dead:

deadlinks -urls 'https://mediocregopher.com,gemini://mediocregopher.com'
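
To make the shape of the -urls value concrete, here is a minimal sketch of how a comma-separated list like the one above could be split and validated. It is not taken from the deadlinks source: `parseURLList` is a hypothetical helper, and restricting schemes to http, https and gemini is an assumption based on the protocols listed in this README.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseURLList is a hypothetical helper (not part of deadlinks). It splits a
// comma-separated list of URLs and rejects any entry whose scheme is not one
// of the protocols this README says are supported.
func parseURLList(list string) ([]*url.URL, error) {
	var urls []*url.URL
	for _, raw := range strings.Split(list, ",") {
		raw = strings.TrimSpace(raw)
		if raw == "" {
			continue // tolerate stray commas
		}
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", raw, err)
		}
		switch u.Scheme {
		case "http", "https", "gemini":
			urls = append(urls, u)
		default:
			return nil, fmt.Errorf("unsupported scheme in %q", raw)
		}
	}
	return urls, nil
}

func main() {
	urls, err := parseURLList("https://mediocregopher.com,gemini://mediocregopher.com")
	if err != nil {
		panic(err)
	}
	for _, u := range urls {
		fmt.Println(u.Scheme, u.Host)
	}
}
```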

Any links which are dead will be output to stdout as YAML objects, each containing the dead URL, the error encountered, and which pages link to it.

To crawl recursively, pass one or more regex patterns using the -patterns parameter. Any URL which matches a pattern will have its own links checked as well (and if any of those linked URLs match a pattern, their links will be checked too, and so on):

deadlinks \
    -urls 'https://mediocregopher.com,gemini://mediocregopher.com' \
    -patterns '://mediocregopher.com'

Further options are available which affect the utility's behavior; see deadlinks -h for details.