
Paths and URLs Options
-reparse
Type: Web crawling only.
Forces parsing of all HTML documents already in the collection. You must specify a
starting point with the -start option when you use -reparse.
You can use -reparse when you want to include paths and documents that were
previously skipped due to exclusion or inclusion criteria. Remember to change the
criteria first; otherwise there will be little for Verity Spider to do. This is easy to
overlook when you are using -cmdfile.
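The -start requirement can be checked before a run is launched. The following is a
minimal sketch, not part of the product: the helper name build_reparse_args is
hypothetical, and the executable name vspider is an assumption. Only the -start and
-reparse options come from the text above.

```python
def build_reparse_args(start_url):
    """Build an argument list for a -reparse run (hypothetical helper)."""
    if not start_url:
        # The manual requires a starting point whenever -reparse is used.
        raise ValueError("-reparse requires a -start value")
    # "vspider" is an assumed executable name; adjust it for your installation.
    return ["vspider", "-start", start_url, "-reparse"]


if __name__ == "__main__":
    # Print the command line instead of running it.
    print(" ".join(build_reparse_args("http://www.example.com/")))
```

The list form can be passed to subprocess.run unchanged once the executable name and
any remaining options for your collection have been confirmed.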
-unlimited
Specifies that no limits are placed on Verity Spider if neither -host nor -domain is
specified. The default is to limit based on the host of the first starting point listed.
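The precedence described above can be sketched as a small decision function. This is
an illustration only, not Verity Spider's implementation: the function name
crawl_limit and its return values are hypothetical, and the sketch does not model how
-host or -domain values are actually matched.

```python
from urllib.parse import urlsplit


def crawl_limit(start_urls, hosts=(), domains=(), unlimited=False):
    """Describe the effective crawl limit (hypothetical model of the rule)."""
    if hosts or domains:
        # -host or -domain was given, so -unlimited does not apply.
        return ("explicit", tuple(hosts), tuple(domains))
    if unlimited:
        # Neither -host nor -domain was given and -unlimited is set.
        return ("none",)
    # Default: limit to the host of the first starting point listed.
    return ("host", urlsplit(start_urls[0]).hostname)


if __name__ == "__main__":
    starts = ["http://www.example.com/docs/", "http://other.example.org/"]
    print(crawl_limit(starts))
    print(crawl_limit(starts, unlimited=True))
```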
-virtualhost
Syntax: -virtualhost name_1 [name_n] ...
Specifies that DNS lookups are avoided for the hosts listed. You must use only
complete text strings for hosts. You may not use wildcard expressions. This allows
you to index by alias, such as when multiple Web servers are running on the same
host. You can use regular expressions.
Normally, when Verity Spider resolves host names, it uses DNS lookups to convert
the names to canonical names, of which there can be only one per machine. This
allows for the detection of duplicate documents, to prevent results from being
diluted. In the case of multiple aliased hosts, however, duplication is not a barrier as
documents can be referred to by more than one alias, and yet remain distinct
because of the different alias names.
Example
You may have both marketing.verity.com and sales.verity.com running on the same
host. Each alias has a different document root, although document names such as
index.htm may occur for both. With -virtualhost, both server aliases can be
indexed as distinct sites. Without -virtualhost, they would both be resolved to the
same host name and only the first document encountered from any duplicate pair
would be indexed.
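The effect on duplicate detection can be modeled in a few lines. This is a simplified
sketch, not the product's algorithm: the CANONICAL table stands in for DNS lookups,
the canonical name web1.verity.com is hypothetical, and the (host, path) key is an
assumed stand-in for the real document key.

```python
# Hypothetical DNS table: both aliases resolve to one canonical host name.
CANONICAL = {
    "marketing.verity.com": "web1.verity.com",
    "sales.verity.com": "web1.verity.com",
}


def document_key(host, path, virtual_hosts=()):
    """Return the key used to detect duplicate documents."""
    if host in virtual_hosts:
        # Listed under -virtualhost: skip the lookup and keep the alias.
        return (host, path)
    return (CANONICAL.get(host, host), path)


def index(pages, virtual_hosts=()):
    """Keep only the first page encountered for each document key."""
    seen = {}
    for host, path in pages:
        seen.setdefault(document_key(host, path, virtual_hosts), (host, path))
    return list(seen.values())


if __name__ == "__main__":
    pages = [
        ("marketing.verity.com", "/index.htm"),
        ("sales.verity.com", "/index.htm"),
    ]
    print(index(pages))
    print(index(pages, virtual_hosts=tuple(CANONICAL)))
```

Without the virtual host list, the second index.htm collapses into the first because
both keys share the canonical host. With the list, each alias keeps its own key and
both documents are retained.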
Warning! If you are using Netscape Enterprise Server, and you have specified only the
host name as a virtual host, then Verity Spider will not be able to index the virtual
host site. This is because Verity Spider always adds the domain name to the
document key.