
There is a define service block, as shown below, for the aggregate CPU plug-in.
define service{
    use                     local-service
    host_name               localhost
    service_description     Aggregate CPU Stats
    check_command           check_cpu_aggr!0.5!0.8
}
Here, a custom command name is defined; the name is arbitrary. In this example it is called check_cpu_aggr,
and it appears in the check_command directive, the last directive in the block shown above. The name is
followed by an exclamation mark, a number, another exclamation mark, and another number. The '!' separates
the arguments passed to the plugin's -w and -c parameters, and the numbers are the values of those
arguments (that is, the warning and critical levels). In this instance, the warning level is set to 0.5 and
the critical level to 0.8.
5. Adjust the warning and critical levels for this plugin.
6. Save and quit the localhost.cfg file.
7. Restart the Nagios service.
# service nagios restart
8. Switch to the /usr/local/nagios/etc directory.
9. Modify the nrpe.conf configuration file as needed.
a. Define the command name and the path to the plugin that this command executes, including the plugin's
arguments.
The command for the cray_check_cpu plugin is shown below:
command[check_cpu]=/usr/local/nagios/libexec/cray_check_cpu -w 0.5 -c 0.8
In the preceding example, -w is the warning level, whereas -c is the critical level.
b. Update the thresholds as needed.
10. Save and quit the nrpe.conf file.
11. Restart the nrpe service.
# service nrpe restart
12. Repeat this process for every node, and every service on those nodes, that needs adjusting.
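The two threshold formats used in this procedure can be read back with a short script, which may help when checking several nodes after an edit. The following is a minimal sketch in Python, not part of the Nagios or NRPE tooling. The function names parse_check_command, parse_nrpe_command, and thresholds are hypothetical helpers introduced for illustration, and the sketch assumes that -w and -c are each followed by a single numeric value, as in the examples above.

```python
import re
import shlex


def parse_check_command(value):
    """Split a check_command value such as 'check_cpu_aggr!0.5!0.8'
    into the command name and the list of '!'-separated arguments."""
    name, *args = value.split("!")
    return name, args


def parse_nrpe_command(line):
    """Parse a line of the form 'command[name]=/path/to/plugin args...'
    into the command name and the plugin's argument vector."""
    match = re.match(r"^command\[([^\]]+)\]=(.+)$", line.strip())
    if match is None:
        raise ValueError(f"not a command definition: {line!r}")
    return match.group(1), shlex.split(match.group(2))


def thresholds(argv):
    """Return (warning, critical) as floats.
    Assumption: each of -w and -c is followed by exactly one number."""
    warning = float(argv[argv.index("-w") + 1])
    critical = float(argv[argv.index("-c") + 1])
    return warning, critical


if __name__ == "__main__":
    # Service definition format (localhost.cfg)
    print(parse_check_command("check_cpu_aggr!0.5!0.8"))
    # → ('check_cpu_aggr', ['0.5', '0.8'])

    # Command definition format (nrpe.conf)
    name, argv = parse_nrpe_command(
        "command[check_cpu]=/usr/local/nagios/libexec/cray_check_cpu -w 0.5 -c 0.8"
    )
    print(name, thresholds(argv))
    # → check_cpu (0.5, 0.8)
```

The sketch only reads values; it does not modify either configuration file or restart any service.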
4.3 Get Started with Using Grafana
Grafana is a feature-rich metrics, dashboard, and graph editor. It provides information about the utilization of
system resources, such as CPU, memory, and I/O.
Major Grafana components and UI elements include: