
Appendix D
Configuration Files
The following table describes each field:
Example
The following example illustrates an HTTP scheduled update:
http://www.hp.com\User-Agent: noname user agent\13\3600\5\
This example specifies the URL and request headers, an offset hour of 13 (1 pm), an interval of one hour (3600 seconds), and a recursion depth of 5. This results in updates at 13:00, 14:00, 15:00, and so on. To schedule an update to occur only once a day, use an interval value of 24 hours x 60 minutes x 60 seconds = 86400.
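The arithmetic in the HTTP example can be checked with a short script. This is only a sketch: the helper names `parse_update_entry` and `first_update_times` are hypothetical, and the field order (URL, request headers, offset hour, interval, recursion depth, separated by backslashes) is inferred from the example entry rather than taken from a formal grammar.

```python
from datetime import datetime, timedelta


def parse_update_entry(line):
    """Split one backslash-delimited scheduled update entry into its fields.

    Assumed field order (inferred from the example entry):
    URL \\ Request_headers \\ Offset_hour \\ Interval \\ Recursion_depth
    """
    # Drop a single trailing delimiter, if the entry has one.
    if line.endswith("\\"):
        line = line[:-1]
    fields = line.split("\\")
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    url, headers, offset_hour, interval, depth = fields
    entry = {
        "url": url,
        # Request headers are optional and separated by <CR><LF>.
        "request_headers": [h for h in headers.split("\r\n") if h],
        "offset_hour": int(offset_hour),
        "interval": int(interval),          # seconds
        "recursion_depth": int(depth),
    }
    if not 0 <= entry["offset_hour"] <= 23:
        raise ValueError("offset hour must be in the range 00-23")
    return entry


def first_update_times(entry, count=3):
    """Return the first few update times as HH:MM strings."""
    # The date is arbitrary; only the time of day matters here.
    base = datetime(2000, 1, 1, entry["offset_hour"])
    step = timedelta(seconds=entry["interval"])
    return [(base + i * step).strftime("%H:%M") for i in range(count)]


http_entry = parse_update_entry(
    "http://www.hp.com\\User-Agent: noname user agent\\13\\3600\\5\\"
)
print(first_update_times(http_entry))   # ['13:00', '14:00', '15:00']
print(24 * 60 * 60)                     # 86400, the once-a-day interval
```

The script only models how an entry reads; it does not reproduce how the proxy itself schedules or performs updates.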
The following example illustrates an FTP scheduled update:
ftp://[email protected]/pub/misc/test_file.cc\\18\120\0
This example specifies the FTP request, no request headers, an offset hour of 18 (6 pm), an interval of two minutes (120 seconds), and a recursion depth of 0.
Note that the user must be anonymous and the password must appear in the records.config file using the configuration variable proxy.config.http.ftp.anonymous_passwd.
Specifying URL Regular Expressions (url_regex)
This section describes how to specify a url_regex. Entries of type url_regex within the configuration files use regular expressions to perform a match.
The following table offers examples to illustrate how to create a valid url_regex:
Scheduled update entry fields:

Field             Allowed inputs
URL               HTTP and FTP-based URLs.
Request_headers   (Optional) A <CR><LF> separated list of headers passed in each GET request. You can define any request header that conforms to the HTTP specification. The default is no request header.
Offset_hour       Base hour used to derive the update periods. The range is 00-23 hours.
Interval          The interval, in seconds, at which updates should occur, starting at Offset_hour.
Recursion_depth   The depth to which referenced URLs are recursively updated, starting at the given URL.
url_regex pattern syntax:

Value       Description
x           Matches the character ‘x’.
.           Matches any character.
^           Specifies the beginning of a line.
$           Specifies the end of a line.
[xyz]       A “character class.” In this case, the pattern matches either ‘x’, ‘y’, or ‘z’.
[abj-oZ]    A “character class” with a range. This pattern matches ‘a’, ‘b’, any letter from ‘j’ through ‘o’, or ‘Z’.
[^A-Z]      A “negated character class.” This pattern matches any character except those in the class, in this case any character other than an uppercase letter from ‘A’ through ‘Z’.
r*          Zero or more r’s, where r is any regular expression.
r+          One or more r’s, where r is any regular expression.
r?          Zero or one r’s, where r is any regular expression.
r{2,5}      From two to five r’s, where r is any regular expression.
r{2,}       Two or more r’s, where r is any regular expression.
r{4}        Exactly 4 r’s, where r is any regular expression.
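
The patterns in the table can be tried out with a short script. This sketch uses Python's `re` module as a stand-in, on the assumption that it treats these basic constructs the same way as the proxy's own regular expression engine, which may differ in other respects. The helper `matches` and the sample strings are illustrative, and the literal letter `r` stands in for the sub-expression r in the quantifier rows.

```python
import re


def matches(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


# (pattern, sample text, expected result)
CASES = [
    (r"x", "x", True),              # literal character
    (r".", "q", True),              # any character
    (r"^ab", "abc", True),          # anchored at the beginning
    (r"^ab", "cab", False),
    (r"bc$", "abc", True),          # anchored at the end
    (r"[xyz]", "y", True),          # character class
    (r"[abj-oZ]", "k", True),       # class with a range: 'k' is in j-o
    (r"[abj-oZ]", "c", False),
    (r"[^A-Z]", "a", True),         # negated class
    (r"[^A-Z]", "Q", False),
    (r"^r*$", "", True),            # zero or more
    (r"^r+$", "", False),           # one or more
    (r"^r?$", "rr", False),         # zero or one
    (r"^r{2,5}$", "rrr", True),     # from two to five
    (r"^r{2,5}$", "rrrrrr", False),
    (r"^r{2,}$", "rrrrrrr", True),  # two or more
    (r"^r{4}$", "rrrr", True),      # exactly four
    (r"^r{4}$", "rrr", False),
]

for pattern, text, expected in CASES:
    result = matches(pattern, text)
    status = "ok" if result == expected else "UNEXPECTED"
    print("%-12s %-10r %-5s %s" % (pattern, text, result, status))
```

The quantifier rows are wrapped in ^ and $ so that the whole sample string must satisfy the count; without the anchors, a search would also succeed on a longer run of r’s.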