www.scan
www.scan is a simple program that finds dead and damaged links
on your website. Because it can take a long time to run, it is often run
via cron, with the results sent to you by email. To better understand
the email message, as well as the normal output, see the
interpretation page.
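As a rough illustration of the cron setup described above, the snippet below prints a crontab line that would run the scan early every Sunday morning. The schedule and the install path /path/to/www.scan-0.17/www.scan are assumptions made for this example; adjust both to your own system.

```shell
#!/bin/sh
# Crontab fields: minute hour day-of-month month day-of-week command
# Assumption: www.scan was built under /path/to/www.scan-0.17 (placeholder path).
# The schedule (02:30 every Sunday) is only an example.
CRON_LINE='30 2 * * 0 /path/to/www.scan-0.17/www.scan --site=http://www.yoursite.com/ --email=username@yoursite.com'

# Print the line so it can be pasted into the editor opened by `crontab -e`.
printf '%s\n' "$CRON_LINE"
```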
The program checks every page on your site and every reference that
leads off site. Once it has confirmed that an off-site reference works,
it does not follow any further links on the remote site.
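One way to picture this rule is as a simple prefix test on each URL. The sketch below is not taken from www.scan itself; the function name classify and the labels it prints are invented for illustration.

```shell
#!/bin/sh
# Illustrative only: this is not the program's actual code.
SITE="http://www.yoursite.com/"

# Print "crawl" for URLs under SITE (the page is fetched and its links followed)
# and "check-only" for everything else (fetched once, links not followed).
classify() {
    case "$1" in
        "$SITE"*) echo "crawl" ;;
        *)        echo "check-only" ;;
    esac
}

classify "http://www.yoursite.com/docs/index.html"
classify "http://www.example.org/page.html"
```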
This program takes a number of optional parameters.
- --site=http://www.yoursite.com/
This allows you to specify the web site to search.
- --exceptions="http://www.yoursite.com/~name/exceptions.file,..."
This allows you to specify one or more files (accessible via the
web) that list URLs the program should not check. This is useful
for web pages that really exist but that this program
has trouble handling. Separate multiple exception
files with commas.
- --depth=5000
You can specify the number of unprocessed pages that this program
should keep around while processing. This is an aid to keep the
program from exhausting your computer's memory.
- --checkuser
This allows you to check user files. User files are of the form:
http://www.yoursite.com/~username/
If there is a www.scan.exceptions file in the user's home directory,
it is read and processed.
- --email=username@yoursite.com
You can have the results sent to you by email. If you quote the
parameter, you can list multiple email addresses separated by
spaces.
- --sendmail=/usr/lib/sendmail
If you have sendmail in a different place, you can specify where it
is.
- --maxurl=200
You can specify the maximum length of a URL. If a URL is longer than
this limit, the reference is ignored. This guards against
degenerate, automatically generated URLs on some sites.
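The --maxurl rule amounts to a length comparison. The following is a minimal sketch of that idea, not the program's own code; the helper name url_within_limit and the sample calendar URL are hypothetical.

```shell
#!/bin/sh
# Illustrative only: a length filter in the spirit of --maxurl.
MAXURL=200   # same value as the --maxurl example

# Succeeds (exit status 0) when the URL is no longer than MAXURL characters.
url_within_limit() {
    [ "${#1}" -le "$MAXURL" ]
}

# Build a runaway URL of the kind an auto-generated calendar page might emit.
url="http://www.yoursite.com/calendar?month=1"
i=0
while [ "$i" -lt 30 ]; do
    url="$url&next=1"
    i=$((i + 1))
done

for candidate in "http://www.yoursite.com/" "$url"; do
    if url_within_limit "$candidate"; then
        echo "checked"
    else
        echo "ignored"
    fi
done
```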
There are some additional parameters used for debugging. They are not
typically very interesting unless you are debugging the program.
You can find information about these options within the program.
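To show how the options described above fit together, here is a sketch of one combined invocation. It assumes the built executable is ./www.scan in the current directory; the second exceptions file and the second email address are made up for the example.

```shell
#!/bin/sh
# Assumption: the built executable is ./www.scan in the current directory.
SCAN=./www.scan

# Collect the options as positional parameters so that quoting is preserved.
# The second exceptions file and second address are hypothetical.
set -- \
    --site=http://www.yoursite.com/ \
    --exceptions="http://www.yoursite.com/~name/exceptions.file,http://www.yoursite.com/~other/exceptions.file" \
    --depth=5000 \
    --checkuser \
    --email="username@yoursite.com webmaster@yoursite.com" \
    --maxurl=200

# Show the command, then run it only if the executable is present.
echo "$SCAN $*"
if [ -x "$SCAN" ]; then
    "$SCAN" "$@"
fi
```

Quoting the --email value keeps the two space-separated addresses together as a single argument.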
You can pick up the entire archive for
version 0.17.
The archive contains all of the files mentioned on this page.
Once you have the tar file, you will need to extract and build it.
Do something like the following:
cd
mkdir www.scan
cd www.scan
tar xvfz ../www.scan-0.17.tar.gz
cd www.scan-0.17
make