5 – Detailed Descriptions of Command Line Tools
Health Check and Baselining Tools
D000006-000 Rev A
❥ automated
In both cases the user should follow the initial setup procedure outlined above
to create a good baseline of the configuration.
In the manual method, the user would run the tools manually when trying to
diagnose problems, or when there is a concern or need to validate the
configuration and health.
In the automated method, the user could run all_analysis or a specific tool
in an automated script (such as a cron job). When run in this mode, the -s
option may prove useful (but care must be taken to avoid excessive saved
failures). In automated mode, the tools should be run no more often than
hourly. For many fabrics, a daily run or a run every few hours would be
sufficient. Since the exit code from each of the tools indicates overall
success or failure, an automated script could easily check the exit status
and, on failure, email the output from the analysis tool to the appropriate
administrators for further analysis and corrective action as needed.
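The exit-status check described above can be sketched in a few lines of
Python. This is a minimal illustration, not part of the Fast Fabric tools:
run_health_check, mail_admins, and the address fabric-admins@example.com
are hypothetical names, and the sketch assumes a local mail command is
available for delivery.

```python
import subprocess
from typing import Callable, Sequence


def run_health_check(
    command: Sequence[str],
    notify: Callable[[str, str], None],
) -> int:
    """Run one analysis tool; pass its output to notify() if it exits non-zero."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error text together with normal output
        text=True,
    )
    if result.returncode != 0:
        subject = f"Fabric health check failed (exit {result.returncode}): {command[0]}"
        notify(subject, result.stdout)
    return result.returncode


def mail_admins(subject: str, body: str) -> None:
    """Hypothetical notifier: pipe the report to the local `mail` command."""
    # Placeholder address (an assumption, not defined by the guide).
    admins = "fabric-admins@example.com"
    subprocess.run(["mail", "-s", subject, admins], input=body, text=True, check=False)
```

A script run from cron could then call
run_health_check(["all_analysis", "-s"], mail_admins) and exit with the
returned status. Passing the notifier as a parameter keeps the exit-code
logic separate from the delivery mechanism, so a site could substitute its
own alerting.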
NOTE:
Running these tools too often can have negative impacts. Among the
potential risks:
❥ Each run adds a potential burden to the SM, fabric and/or switches. For
infrequent runs (hourly or daily) this impact is negligible. However, if the
tools are run very frequently, the impact on fabric and SM performance can be
noticeable.
❥ Runs with the -s option will consume additional disk space for each run that
identifies an error. The amount of disk space will vary depending on fabric
size. For a larger fabric this can be on the order of 1-40 MB. Therefore, care
must be taken not to run the tools too often and to visit and clean out the
FF_ANALYSIS_DIR periodically. If the -s option is used during automated
execution of the health check tools, it may be helpful to also schedule
automated disk space checks (e.g., as a cron job).
❥ Runs coinciding with downtime for selected components (such as servers
that are offline or rebooting) will be considered failures and generate the
resulting failure information. If the runs are not carefully scheduled, this could
be misleading and also waste disk space.
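The automated disk space check suggested above for runs with the -s option
could look roughly like the following. This is a sketch under stated
assumptions: directory_size_bytes and analysis_dir_over_limit are
hypothetical helper names, the caller is expected to supply the directory
that FF_ANALYSIS_DIR refers to at the site, and the size limit is a local
policy choice rather than a value taken from this guide.

```python
import os
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and not path.is_symlink():
                total += path.stat().st_size
    return total


def analysis_dir_over_limit(root: Path, limit_bytes: int) -> bool:
    """Return True when saved analysis output exceeds the chosen limit."""
    return directory_size_bytes(root) > limit_bytes
```

A cron job could call analysis_dir_over_limit with the site's analysis
directory and a chosen threshold, and alert an administrator when it returns
True. The sketch only reports; deciding which saved failures to delete is
left to the administrator.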