Overview
Flow control
When indexing Web sites, Verity Spider distributes requests to Web servers in a
round-robin manner. This means one URL is fetched from each Web server in turn.
With flow control, it is possible that a faster Web site will finish before a slower one.
Regardless, Verity Spider optimizes indexing on every Web server.
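The round-robin distribution described above can be illustrated with a short Python sketch. This is not Verity Spider code; the function name round_robin_order and the example URLs are hypothetical, and the sketch only shows the idea of taking one URL from each Web server in turn.

```python
from collections import deque
from urllib.parse import urlsplit


def round_robin_order(urls):
    """Return URLs reordered so one URL is taken from each server in turn."""
    # Group pending URLs by host, preserving the order in which hosts first appear.
    queues = {}
    for url in urls:
        host = urlsplit(url).netloc
        queues.setdefault(host, deque()).append(url)

    order = []
    hosts = deque(queues)
    while hosts:
        host = hosts.popleft()
        order.append(queues[host].popleft())
        # A host with URLs left goes to the back of the rotation;
        # an exhausted host drops out, so a site can finish before the others.
        if queues[host]:
            hosts.append(host)
    return order
```

Given three URLs on one host and one URL on a second host, the second host is finished after the first pass, and the remaining URLs of the first host follow.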
Verity Spider V3.7 adjusts the number of connections per server depending on the
download bandwidth. When the download bandwidth from a Web server falls below
a certain value, Verity Spider will automatically scale back the number of
connections to that Web server. There will always be at least one connection to a Web
server. When the download bandwidth increases to an acceptable level, Verity Spider
reallocates connections (per the value of the -connections option, which is 4 by
default). You can turn off flow control with the -noflowctrl option.
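As a rough illustration of the behavior described above, the following Python sketch adjusts a per-server connection count. It is a sketch under stated assumptions, not the product's algorithm: the threshold parameter stands in for the unspecified "certain value", and stepping by one connection at a time is an assumption. Only the minimum of one connection and the default limit of 4 come from the text.

```python
def adjust_connections(current, bandwidth, threshold, max_connections=4):
    """Return the new connection count for one Web server.

    threshold is a hypothetical bandwidth floor; the manual says only
    "a certain value". max_connections mirrors the -connections default of 4.
    """
    if bandwidth < threshold:
        # Scale back, but always keep at least one connection.
        return max(1, current - 1)
    # Bandwidth is acceptable: reallocate, up to the configured limit.
    return min(max_connections, current + 1)
```

Calling the function once per measurement interval for each server would shrink the count toward 1 while a server is slow and restore it toward the limit once bandwidth recovers.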
Multithreading
Since version 3.1, Verity Spider has separated the gathering and indexing jobs
into multiple threads for concurrency. Verity Spider V3.7 can open concurrent
connections to Web servers for fetching documents and run concurrent indexing
threads for maximum utilization. This translates to an overall improvement in
throughput. In previous releases, work was done in a round-robin manner, so that at
any given time only one job was running. Spider attends to the Web sites within an
indexing job in a round-robin manner.
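The separation of gathering and indexing into concurrent threads can be sketched in Python as a small producer/consumer pipeline. This is an illustration only; run_pipeline, the fetch and index callables, and the thread counts are hypothetical names and values, not part of Verity Spider.

```python
import queue
import threading


def run_pipeline(urls, fetch, index, gatherers=4, indexers=2):
    """Fetch and index concurrently; return the list of index results."""
    pending = queue.Queue()   # URLs waiting to be fetched
    fetched = queue.Queue()   # (url, document) pairs waiting to be indexed
    results = []
    lock = threading.Lock()

    for url in urls:
        pending.put(url)

    def gather():
        # Each gatherer thread drains the pending queue until it is empty.
        while True:
            try:
                url = pending.get_nowait()
            except queue.Empty:
                return
            fetched.put((url, fetch(url)))

    def do_index():
        # Each indexer thread runs until it receives the None sentinel.
        while True:
            item = fetched.get()
            if item is None:
                return
            entry = index(*item)
            with lock:
                results.append(entry)

    g_threads = [threading.Thread(target=gather) for _ in range(gatherers)]
    i_threads = [threading.Thread(target=do_index) for _ in range(indexers)]
    for t in g_threads + i_threads:
        t.start()
    for t in g_threads:
        t.join()
    # All documents are queued; send one sentinel per indexer thread.
    for _ in i_threads:
        fetched.put(None)
    for t in i_threads:
        t.join()
    return results
```

Because indexing starts as soon as the first document arrives, fetching and indexing overlap instead of running one job at a time. The order of the results is not guaranteed.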
Efficient DNS lookups
Verity Spider V3.7 significantly reduces DNS lookups, which greatly improves
spidering throughput. If spidering is limited by domain or host,
then no DNS lookups are made on hosts that fall outside of that range. Previously,
DNS lookups were made on all candidate URLs.
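The filtering idea can be sketched in Python as a check that runs before any name resolution. This is a minimal sketch, assuming that the restriction is expressed as lists of allowed hosts and domains; needs_dns_lookup and its parameters are hypothetical and do not correspond to Verity Spider options.

```python
from urllib.parse import urlsplit


def needs_dns_lookup(url, allowed_hosts=(), allowed_domains=()):
    """Return True if the URL's host should be resolved at all.

    allowed_hosts and allowed_domains are expected in lowercase.
    """
    host = (urlsplit(url).hostname or "").lower()
    if not allowed_hosts and not allowed_domains:
        # No restriction is configured, so every candidate host is resolved.
        return True
    if host in allowed_hosts:
        return True
    # A host is inside a domain if it equals it or is a subdomain of it.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```

Candidate URLs for which the check returns False are discarded without a DNS query, which is where the saving comes from.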
Proxy handling efficiency
The -noproxy option, which reduces proxy checking for certain hosts, and the
-proxyauth option, which authenticates on proxy servers, allow for much greater
flexibility when dealing with indexing jobs that involve proxy servers and firewalls.
NOTE: Information Server V3.7 does not support retrieving documents for viewing
through secure proxy servers. Do not use -proxyauth for indexing documents that
are to be viewed through Information Server V3.7.