10.4.1. Indexing Performance
Although previous versions of Directory Server achieved extremely high read performance, write
performance was limited by the number of bytes per second that could be written into the storage
manager's transaction log file. Large log files were generated for each LDAP write operation; in
fact, the volume of log data written could easily be 100 times the corresponding number of bytes
changed in the Directory Server. Most of the content in the log files was related to index changes
(ID insert and delete operations).
The secondary index structure was separated into two levels in the old design:
• The ID list structures, which were the province of the Directory Server backend and opaque to the
storage manager.
• The storage manager structures (Btrees), which were opaque to the Directory Server backend code.
Because it had no insight into the internal structure of the ID lists, the storage manager had to treat ID
lists as opaque byte arrays. From the storage manager's perspective, when the content of an ID list
changed, the entire list had changed. For a single ID that was inserted into or deleted from an ID list,
the corresponding number of bytes written to the transaction log was the maximum configured size for
that ID list, about 8 kilobytes. Also, every database page on which the list was stored was marked as
dirty, since the entire list had changed.
In the redesigned index, the storage manager has visibility into the fine-grained index structure, which
optimizes transaction logging so that only the bytes actually changed need to be logged
for any given index modification. The Berkeley DB provides ID list semantics, which are implemented
by the storage manager. The Berkeley API was enhanced to support the insertion and deletion
of individual IDs stored against a common key, with support for duplicate keys, and an optimized
mechanism for the retrieval of the complete ID list for a given key.
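
The duplicate-key model described above can be pictured with a small sketch. This is only an in-memory illustration, not the Berkeley DB API or the Directory Server storage manager; the class name DuplicateKeyIndex, its method names, and the sample key "smith" are hypothetical. It shows individual IDs being inserted and deleted under a common key, and the complete ID list being retrieved for that key.

```python
import bisect
from collections import defaultdict


class DuplicateKeyIndex:
    """Toy stand-in for an index that stores many entry IDs under one key.

    Hypothetical illustration only; it does not reproduce any real API.
    """

    def __init__(self):
        # Each index key maps to a sorted list of entry IDs (the duplicates).
        self._ids = defaultdict(list)

    def insert_id(self, key, entry_id):
        """Insert one ID under key. Returns 1 if stored, 0 if already present."""
        ids = self._ids[key]
        pos = bisect.bisect_left(ids, entry_id)
        if pos < len(ids) and ids[pos] == entry_id:
            return 0  # nothing changed, so nothing would need to be logged
        ids.insert(pos, entry_id)
        return 1

    def delete_id(self, key, entry_id):
        """Delete one ID under key. Returns 1 if removed, 0 if it was absent."""
        ids = self._ids.get(key, [])
        pos = bisect.bisect_left(ids, entry_id)
        if pos < len(ids) and ids[pos] == entry_id:
            del ids[pos]
            if not ids:
                del self._ids[key]  # drop keys that no longer hold any ID
            return 1
        return 0

    def get_id_list(self, key):
        """Return the complete, sorted ID list for key as a new list."""
        return list(self._ids.get(key, []))


index = DuplicateKeyIndex()
index.insert_id("smith", 42)
index.insert_id("smith", 7)
index.insert_id("smith", 42)   # duplicate insert, ignored
index.delete_id("smith", 7)
print(index.get_id_list("smith"))  # [42]
```

Each call touches exactly one ID, which is the property that lets a storage manager with this view of the data log a single-ID change rather than a whole list.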
The storage manager has direct knowledge of the application's intent when changes are made to ID
lists, resulting in several improvements to ID list handling:
• For long ID lists, the number of bytes written to the transaction log for any update to the list is
significantly reduced, from the maximum ID list size (8 kilobytes) to twice the size of one ID (4
bytes).
• For short ID lists, storage efficiency, and in most cases performance, is improved because only the
storage manager metadata needs to be stored, not the ID list metadata.
• The average number of database pages marked as dirty per ID insert or delete operation is very
small because a large number of duplicate keys will fit into each database page.
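
As a rough worked example of the first point, the following sketch compares the index-related log volume for a batch of long-list updates under the two designs. It assumes that "8 kilobytes" means 8192 bytes and reads "(4 bytes)" as the size of a single ID, so each update logs 8 bytes in the new design; record headers and other log overhead are ignored, and the function names are hypothetical.

```python
MAX_ID_LIST_BYTES = 8 * 1024  # assumed: "about 8 kilobytes" taken as 8192 bytes
ID_SIZE_BYTES = 4             # assumed: size of one ID, per the text


def log_bytes_old(updates):
    """Old design: every update logs the whole ID list at its maximum size."""
    return updates * MAX_ID_LIST_BYTES


def log_bytes_new(updates):
    """Redesigned index: every update logs twice the size of one ID."""
    return updates * 2 * ID_SIZE_BYTES


updates = 1000
old = log_bytes_old(updates)
new = log_bytes_new(updates)
print(old, new, old // new)  # 8192000 8000 1024
```

Under these assumptions the log volume for long ID lists drops by roughly three orders of magnitude per update.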
10.4.2. Search Performance
For each entry ID list, there is a size limit that is globally applied to all index keys managed by the
server. In previous versions of Directory Server, this limit was called the All IDs Threshold. Because
maintaining large ID lists in memory can affect performance, the All IDs Threshold set a limit on how
large a single entry ID list could get. When a list hit a certain pre-determined size, the search would
treat it as if the index contained the entire directory.
The difficulty in setting the All IDs Threshold hurt performance. If the threshold was too low, too many
searches examined every entry in the directory. If it was too high, too many large ID lists had to be
maintained in memory.
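
The trade-off can be illustrated with a minimal sketch of a thresholded index. The class ThresholdedIndex, the ALLIDS sentinel, and the choice to switch over when a list reaches the threshold (rather than exceeds it) are assumptions made for illustration, not the server's actual implementation.

```python
ALLIDS = object()  # sentinel meaning "treat this key as matching every entry"


class ThresholdedIndex:
    """Hypothetical index whose ID lists collapse to ALLIDS at a size limit."""

    def __init__(self, all_ids_threshold):
        self.threshold = all_ids_threshold
        self.lists = {}

    def add(self, key, entry_id):
        current = self.lists.get(key, [])
        if current is ALLIDS:
            return  # the list is no longer maintained for this key
        if entry_id not in current:
            current = current + [entry_id]
        # Assumed rule: the list is abandoned once it reaches the threshold.
        if len(current) >= self.threshold:
            self.lists[key] = ALLIDS
        else:
            self.lists[key] = current

    def candidates(self, key, all_entry_ids):
        """Return the entry IDs a search must examine for this key."""
        ids = self.lists.get(key, [])
        if ids is ALLIDS:
            return list(all_entry_ids)  # search examines the whole directory
        return list(ids)


directory = range(1, 11)               # ten entries in a toy directory
idx = ThresholdedIndex(all_ids_threshold=3)
idx.add("a", 1)
idx.add("a", 2)
print(len(idx.candidates("a", directory)))  # 2
idx.add("a", 3)                        # list reaches the threshold
print(len(idx.candidates("a", directory)))  # 10
```

With a low threshold, keys switch to the full scan early; with a high one, the lists kept in memory grow large, which mirrors the tuning difficulty described above.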