Chapter 8 Verity Spider
By default, each indexing thread uses as much memory as is available from the
system.
-maxnumdoc
Syntax: -maxnumdoc num_docs
Specifies the maximum number of documents to be downloaded or submitted for
indexing. The value for num_docs does not necessarily correspond exactly to the
number of documents indexed. The following factors affect the actual number.
• Whether the value of num_docs falls within a block of documents dictated by -submitsize. If it does, the entire block of documents must be processed.
• Whether retrieved documents are skipped rather than indexed because they are invalid or corrupt.
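The interaction between num_docs and -submitsize can be sketched with a little arithmetic. The following Python function is only an illustration, not part of Verity Spider: it assumes that blocks are counted from the first document and that a partially filled block is always completed. The function name and parameters are hypothetical. It does not model the second factor (invalid or corrupt documents), which can lower the number actually indexed.

```python
import math


def documents_processed(num_docs: int, submit_size: int) -> int:
    """Estimate how many documents are processed for a given -maxnumdoc value.

    Assumption (not stated in the manual): documents are grouped into
    consecutive blocks of submit_size, and a block that has been started
    is always finished.
    """
    if num_docs <= 0 or submit_size <= 0:
        raise ValueError("num_docs and submit_size must be positive")
    # Round num_docs up to the next multiple of submit_size.
    return math.ceil(num_docs / submit_size) * submit_size


if __name__ == "__main__":
    # 500 falls inside the fourth block of 128, so the whole block is processed.
    print(documents_processed(500, 128))
    # 256 lands exactly on a block boundary, so nothing extra is processed.
    print(documents_processed(256, 128))
```

Under these assumptions, a num_docs value that is not a multiple of the -submitsize value is effectively rounded up to the next block boundary.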
-mimemap
Syntax: -mimemap path_and_filename
Specifies a control file (simple ASCII text) that maps file extensions to MIME-types.
This allows you to make custom associations and override defaults.
The format for the control file is:
    #file_ext_no_dot    mime-type
    abc                 application/word
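To check a control file before passing it to -mimemap, a small parser can help. The sketch below is an illustration only and assumes details the manual does not spell out: that the two columns are separated by whitespace, that lines beginning with # are comments, and that blank lines are ignored. The function name parse_mimemap is hypothetical.

```python
def parse_mimemap(text: str) -> dict[str, str]:
    """Parse control-file text into an {extension: mime_type} mapping.

    Assumptions: whitespace-separated columns, '#' starts a comment line,
    blank lines are skipped.
    """
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"expected 'extension mime-type', got: {raw!r}")
        ext, mime = parts
        # The header says "no dot", so strip one defensively if present.
        mapping[ext.lstrip(".").lower()] = mime.strip()
    return mapping


if __name__ == "__main__":
    sample = "#file_ext_no_dot    mime-type\nabc    application/word\n"
    print(parse_mimemap(sample))
```

A line with only one column raises ValueError, which makes a malformed entry visible before the file is used.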
-nocache
Type: Web crawling only
Used with -noindex or -nosubmit, this option disables the caching of files during
Web site indexing. This reduces the demand on your disk space.
Normally, Verity Spider downloads URLs, writes them to a bulk insert file, and
downloads the documents themselves. When indexing occurs, once the -submitsize
value has been reached, the cached files are indexed and then deleted. If you use
-noindex, the bulk insert file is submitted but not processed by Verity Spider, so
the documents are not deleted until another indexing process takes over. This is
usually mkvdk or collsvc, or you can subsequently run Verity Spider again with the
-processbif option.
By using -nocache in conjunction with -noindex or -nosubmit, you avoid storing
files locally at all. Files are downloaded only when indexing actually occurs.
See also -noindex.
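The pairing rule described above can be expressed as a simple check on an option list. This Python sketch is illustrative only: it does not invoke Verity Spider, and the function name is hypothetical. It encodes just the statement that -nocache avoids local storage when combined with -noindex or -nosubmit.

```python
def nocache_avoids_local_storage(options: list[str]) -> bool:
    """Return True if the option list pairs -nocache with -noindex or -nosubmit.

    This mirrors the rule in the text; it says nothing about how Verity
    Spider behaves when -nocache is given on its own.
    """
    has_nocache = "-nocache" in options
    defers_indexing = "-noindex" in options or "-nosubmit" in options
    return has_nocache and defers_indexing


if __name__ == "__main__":
    print(nocache_avoids_local_storage(["-nocache", "-noindex"]))
    print(nocache_avoids_local_storage(["-nocache"]))
```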
-nodupdetect
Type: Web crawling only
Disables checksum-based detection of duplicates when indexing Web sites.
URL-based duplicate detection is still performed.
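The difference between the two kinds of duplicate detection can be illustrated with a short Python sketch. This is not Verity Spider's implementation: the manual does not say which checksum is used, so SHA-256 stands in as an assumption, and the function name, parameter names, and example URLs are hypothetical. Setting use_checksum to False corresponds to the effect described for -nodupdetect, where only URL-based detection remains.

```python
import hashlib


def find_duplicates(pages, use_checksum=True):
    """Return the URLs that would be skipped as duplicates.

    pages is an iterable of (url, content_bytes) pairs. URL-based detection
    always runs; checksum-based detection runs only if use_checksum is True.
    """
    seen_urls = set()
    seen_checksums = set()
    skipped = []
    for url, body in pages:
        # URL-based detection: the same address is never fetched twice.
        if url in seen_urls:
            skipped.append(url)
            continue
        seen_urls.add(url)
        if use_checksum:
            # Checksum-based detection: identical content under another URL.
            digest = hashlib.sha256(body).hexdigest()
            if digest in seen_checksums:
                skipped.append(url)
                continue
            seen_checksums.add(digest)
    return skipped


if __name__ == "__main__":
    pages = [
        ("http://example.com/a", b"same content"),
        ("http://example.com/b", b"same content"),
        ("http://example.com/a", b"same content"),
    ]
    print(find_duplicates(pages))
    print(find_duplicates(pages, use_checksum=False))
```

With checksums enabled, the second URL is skipped because its content matches the first. With checksums disabled, only the repeated URL is skipped.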