Provided by: checkbot_1.80-3_all
Checkbot - WWW Link Verifier
checkbot [--cookies] [--debug] [--file file name] [--help] [--mailto email addresses] [--noproxy list of domains] [--verbose] [--url start URL] [--match match string] [--exclude exclude string] [--proxy proxy URL] [--internal-only] [--ignore ignore string] [--filter substitution regular expression] [--style style file URL] [--note note] [--sleep seconds] [--timeout timeout] [--interval seconds] [--dontwarn HTTP response codes] [--enable-virtual] [--language language code] [--suppress suppression file] [start URLs]
HINTS AND TIPS
Problems with checking FTP links
    Some users may experience consistent problems with checking FTP links. In these cases it may be useful to instruct Net::FTP to use passive FTP mode to check files. This can be done by setting the environment variable FTP_PASSIVE to 1. For example, using the bash shell: "FTP_PASSIVE=1 checkbot ...". See the Net::FTP documentation for more details.

Run-away Checkbot
    In some cases Checkbot literally takes forever to finish. There are two common causes for this problem.

    First, there might be a database application as part of the web site which generates a new page based on links on another page. Since Checkbot tries to travel through all links this will create an infinite number of pages. This kind of run-away effect is usually predictable. It can be avoided by using the --exclude option.

    Second, a server configuration problem can cause a loop in generating URLs for pages that really do not exist. This will result in URLs of the form http://some.server/images/images/images/logo.png, with ever more 'images' included. Checkbot cannot check for this because the server should have indicated that the requested pages do not exist. There is no easy way to solve this other than fixing the offending web server or the broken links.

Problems with https:// links
    The error message

        Can't locate object method "new" via package "LWP::Protocol::https::Socket"

    usually means that the current installation of LWP does not support checking of SSL links (i.e. links starting with https://). This problem can be solved by installing the Crypt::SSLeay module.
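The looping URLs described under "Run-away Checkbot" have a recognizable shape: the same path segment repeated several times in a row. The following is a minimal sketch, assuming a POSIX shell with grep and a plain-text list of URLs, one per line, on standard input. How that list is obtained is left open; it is not a Checkbot feature. The name find_looping_urls is hypothetical, and the threshold of three repetitions is an arbitrary choice.

```shell
# Hypothetical helper: print URLs in which one path segment occurs three
# times in a row, each occurrence followed by a slash,
# e.g. .../images/images/images/logo.png
find_looping_urls() {
    # POSIX basic regular expression; \1 refers back to the first segment.
    grep '/\([^/][^/]*\)/\1/\1/'
}

# Example: only the first URL is printed.
printf '%s\n' \
    'http://some.server/images/images/images/logo.png' \
    'http://some.server/images/logo.png' |
    find_looping_urls
```

Once an offending path is known, it could be passed to --exclude while the server is being fixed. The exact matching rules of --exclude are not covered in this section, so check the option description before relying on a particular pattern.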
The simplest use of Checkbot is to check a set of pages on a server. To check my checkbot pages I would use:

    checkbot http://degraaff.org/checkbot/

Checkbot runs can take some time, so Checkbot can send a notification mail when the run is done:

    checkbot --mailto email@example.com http://degraaff.org/checkbot/

It is possible to check a set of local files without using a web server. This only works for static files but may be useful in some cases.

    checkbot file:///var/www/documents/
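When checking local files, the start URL has to be an absolute file:// URL ending in a slash, as in the last example. The sketch below builds such a URL from a directory name. It assumes a POSIX shell; dir_to_file_url is a hypothetical helper name, and the function does not percent-encode spaces or other special characters in the path.

```shell
# Hypothetical helper: turn an existing directory into a file:// URL
# with a trailing slash. Returns non-zero if the directory is not accessible.
dir_to_file_url() {
    # Resolve to an absolute path in a subshell so the caller's
    # working directory is left unchanged.
    abs=$(CDPATH= cd -- "$1" 2>/dev/null && pwd) || return 1
    printf 'file://%s/\n' "$abs"
}

# Example: show the URL for /usr.
dir_to_file_url /usr
```

The result could then be used as the start URL, for instance: checkbot "$(dir_to_file_url /var/www/documents)".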
This script uses the "LWP" modules.
This script can send mail when "Mail::Send" is present.
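Whether these modules are available can be checked before a run. This is a minimal sketch, assuming perl is on the PATH; have_perl_module is a hypothetical helper name, not part of Checkbot.

```shell
# Hypothetical helper: succeed if perl can load the named module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Report on the modules mentioned above.
for module in LWP Mail::Send; do
    if have_perl_module "$module"; then
        echo "$module: found"
    else
        echo "$module: missing"
    fi
done
```

The same helper can be used for Crypt::SSLeay when https:// links fail with the error described under HINTS AND TIPS.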
Hans de Graaff <firstname.lastname@example.org>