Chapter 8. Software stack
Workload management
IBM Platform HPC includes batch job scheduling and workload management
to share the resources of a cluster effectively among multiple users and to
maintain a queue of work that keeps the cluster busy. The included workload
manager provides scheduling functions equivalent to those of IBM Platform
LSF Express Edition, but without the 100-node limit of Express Edition.
Monitoring and reporting
After a cluster is provisioned, IBM Platform HPC provides the means to
monitor the status of the cluster resources and jobs, to display alerts when
there are resource shortages or abnormal conditions, and to produce reports
on the throughput and usage of the cluster.
With these tools, you can quickly see how the cluster resources are being
used, by whom, and how effectively the available capacity is used.
These monitoring facilities are a simplified subset of the facilities that
IBM Platform Application Center provides.
MPI libraries
HPC clusters frequently employ a distributed memory model to divide a
computational problem into elements that can be run in parallel on the hosts
of a cluster. The hosts usually must share progress information and partial
results over the cluster’s interconnect fabric, which is typically
accomplished by using a message passing mechanism. The most widely adopted
standard for this type of message passing is the Message Passing Interface
(MPI) standard. For more information, see this website:
http://www.mpi-forum.org
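The pattern that is described above (split the problem, compute partial results independently, exchange them as messages) can be illustrated with a minimal sketch. This sketch is not MPI and does not use IBM Platform MPI: it assumes Python threads as stand-ins for cluster hosts and a standard-library queue as a stand-in for the interconnect fabric. The function names `worker` and `parallel_sum_of_squares` are hypothetical and chosen only for illustration. Each worker touches only its own chunk of the data and communicates solely by sending a message, which is roughly the role that a gather or reduce operation plays in an MPI program.

```python
import queue
import threading


def worker(rank, chunk, to_root):
    """Compute a partial result on this worker's private chunk and send it."""
    partial = sum(x * x for x in chunk)
    # Sending a message is the only way data leaves the worker.
    to_root.put((rank, partial))


def parallel_sum_of_squares(values, num_workers):
    """Split `values` across workers and combine their partial sums."""
    to_root = queue.Queue()  # stand-in for the interconnect fabric (assumption)
    # Round-robin decomposition: worker r gets elements r, r+n, r+2n, ...
    chunks = [values[r::num_workers] for r in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(r, chunks[r], to_root))
        for r in range(num_workers)
    ]
    for t in threads:
        t.start()
    # The root receives exactly one message per worker, in arrival order.
    partials = dict(to_root.get() for _ in range(num_workers))
    for t in threads:
        t.join()
    return sum(partials.values()), partials
```

For example, `parallel_sum_of_squares(list(range(10)), 4)` splits ten values across four workers and returns the combined total together with the per-worker partial sums, keyed by worker rank. In a real cluster, the workers would be separate processes on separate hosts, and the queue would be replaced by calls into an MPI library.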
IBM Platform HPC includes a robust, commercial implementation of the MPI
standard, IBM Platform MPI. This implementation is already integrated with
the LSF workload manager element of IBM Platform HPC, which gives the
workload scheduler full control over MPI resource scheduling.