Chapter 8. Troubleshooting hardware and software problems

This chapter includes information to diagnose problems associated with the Cluster 1350. The Cluster 1350 is an integrated Linux cluster that includes IBM and third-party hardware and software components, such as server nodes and associated firmware, storage and networking subsystems, Cluster Systems Management (CSM) software, and General Parallel File System (GPFS) software.

Problem resolution involves identifying the failing cluster component and following the applicable problem resolution procedure for that component. This chapter includes information for diagnosing problems down to the component level. When a failing component is identified, you can review the specific product documentation for further actions. Links to applicable product Web sites and online product documentation are provided in this chapter.

Diagnosing hardware and software problems in a cluster environment requires a basic understanding of how the components of the Cluster 1350 function together. The cluster consists of:
• One or more 19-inch racks.
• From 4 to 512 cluster nodes. The nodes are configured to execute customer applications or to provide other services required by the customer, such as a file server, network gateway, or storage server.
• One management node (xSeries 345 or Eserver 325) for cluster systems management and administration.
• A management Ethernet VLAN used for secure traffic for hardware control. The management Ethernet VLAN is used for management traffic only. It is logically isolated for security using the VLAN capability of the Cisco Ethernet switches, and is accessible only from the management node. The cluster VLAN and the management VLAN share the same physical Cisco switches.
• A cluster VLAN used for other management traffic and user traffic. Cisco switches integrated with the cluster are used for the management Ethernet VLAN and the cluster Ethernet VLAN.
• Service processor networks. All nodes in the cluster are connected through serial service processors (xSeries 335) and/or Remote Supervisor Adapter (RSA) devices. The first node in a serial connection must have a Remote Supervisor Adapter, which is connected through Ethernet to the management Ethernet VLAN.
• A terminal server network for remote console, using the In-Reach LX-4000 (32-port, 48-port) terminal server.
Optionally, the customer might elect to include an additional network.
• A high-performance Myrinet 2000 cluster interconnect, or an additional 10/100 Ethernet.
• The customer can elect to configure a subset of cluster nodes with additional external storage. This can also be a Fibre Channel solution (using a FAStT storage subsystem).
• A supported distribution of the Linux operating system.
• Cluster management software, such as CSM. CSM maintains a database of configuration information (tab files) about the nodes that are configured in the Cluster 1350.

To display the node configuration information, use the following CSM command on the management server console:

    lsnode -Al
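
When gathering information for problem diagnosis, it can help to capture the output of the command above so that it can be compared later or attached to a service request. The following Python sketch is one possible way to do that. It is not part of CSM: the function name capture_node_config and its run and which parameters are hypothetical, and the sketch makes no assumption about the layout of the lsnode output, returning it as raw text only.

```python
import shutil
import subprocess
from typing import Callable

# The command shown in the text above; nothing else about lsnode is assumed.
LSNODE_CMD = ["lsnode", "-Al"]


def capture_node_config(
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    which: Callable[[str], object] = shutil.which,
) -> str:
    """Run `lsnode -Al` and return its output as unparsed text.

    `run` and `which` are injectable (hypothetical parameters) so the
    function can be exercised on a machine that has no CSM installation.
    """
    # Fail early with a clear message when the CSM command is not on PATH.
    if which(LSNODE_CMD[0]) is None:
        raise FileNotFoundError(
            "lsnode was not found on PATH; run this on the management server"
        )
    # check=True raises CalledProcessError if lsnode exits with a nonzero status.
    result = run(LSNODE_CMD, capture_output=True, text=True, check=True)
    return result.stdout
```

On the management server, calling capture_node_config() with no arguments and writing the returned string to a dated file gives a snapshot of the node configuration at the time the problem was observed.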

© Copyright IBM Corp. 2004