Hyper Scale-Out Platform Supporting and Troubleshooting
Accessing the health of an HSP cluster
This chapter describes how to monitor and access the health of your
cluster. It covers:
• Visually monitoring cluster node health
• Monitoring alert conditions using the management interfaces
• Monitoring the run state of HSP resources
Overview
The storage management aspects of the Hyper Scale-Out Platform are
designed for:
• Self-healing: failed internal services are restarted automatically. If
the services do not restart within a set period, the node is rebooted
automatically. If the reboot does not bring the services back online,
the node is marked DOWN and a system alert condition is reported. The
HSP software then retries the reboot a few more times. If those
attempts also fail, the node is marked ERROR and another system alert
condition is reported. All services are stopped on a node in ERROR,
but the node remains powered on for debugging purposes.
• Self-provisioning: new or replaced nodes and disks are automatically
discovered and provisioned for compute and storage.
• Self-rebalancing: data distribution is continually balanced and
optimized across the cluster.
• Automated data protection: three copies of each file are maintained in
the cluster, on different nodes and racks when possible, to maximize
data availability and redundancy.
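The self-healing escalation described above can be read as a small state machine. The following Python sketch is illustrative only and is not HSP code: the function name escalate, the UP state, the alert strings, and the retry limit of 3 are all assumptions (the text says only "a few more times"). DOWN and ERROR are the states named above.

```python
from enum import Enum


class NodeState(Enum):
    UP = "UP"        # assumed name for a healthy node
    DOWN = "DOWN"    # state named in the text
    ERROR = "ERROR"  # state named in the text


# Hypothetical limit: the text only says "a few more times".
MAX_EXTRA_REBOOTS = 3


def escalate(restart_ok, reboot_results, max_extra_reboots=MAX_EXTRA_REBOOTS):
    """Walk the self-healing ladder and return (final_state, alerts).

    restart_ok: True if the failed services came back after a restart.
    reboot_results: iterable of booleans, one per reboot attempt, True if
        that reboot brought the services back online. Missing entries are
        treated as failed reboots.
    """
    alerts = []

    # Step 1: an automatic service restart is tried first.
    if restart_ok:
        return NodeState.UP, alerts

    results = iter(reboot_results)

    # Step 2: the node is rebooted automatically.
    if next(results, False):
        return NodeState.UP, alerts

    # Step 3: the first reboot failed, so the node is marked DOWN
    # and a system alert condition is reported.
    alerts.append("node marked DOWN")

    # Step 4: the reboot is retried a limited number of times.
    for _ in range(max_extra_reboots):
        if next(results, False):
            return NodeState.UP, alerts

    # Step 5: every retry failed, so the node is marked ERROR and a
    # second alert is reported. Services stop; the node stays powered on.
    alerts.append("node marked ERROR")
    return NodeState.ERROR, alerts
```

For example, escalate(False, [False, False, True]) models a node that recovers on its second retry after being marked DOWN, while a node whose reboots all fail ends in ERROR with two alerts recorded.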