   Compaq ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Guide for the Compaq ProLiant DL380

Compaq Confidential – Need to Know Required

Writer: Rachel Williams
Project: Compaq ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Guide for the Compaq ProLiant DL380
Comments:
Part Number: 221544-001
File Name: f-ch5 Troubleshooting.doc
Last Saved On: 12/27/00 11:01 AM

Table 5-3
Solving Node-to-Node Communication Problems (continued)

Problem: Alternating root node panics
  Possible Cause: ServerNet I cross-cabled in two-node cluster (and the CI serial cable is not used)
  Action: Power down the cluster. Verify that ServerNet I is cabled between cluster nodes (X to X and Y to Y) as described in Chapter 2 of this guide. Correct the cabling and boot the cluster.

Problem: ServerNet I link exception errors are reported
  Possible Cause: ServerNet I cable is defective
  Action: Verify that the ServerNet I cables are properly connected, do not have bent pins, and are not crimped or compromised in any way. If necessary, replace the ServerNet I cables.
  ServerNet I cables can be damaged if improperly secured or crimped when a rack-mount server on a slide-rail is pushed back into a rack.

  Possible Cause: SPA is defective
  Action: Eliminate the ServerNet I cable as a possible cause. Then, verify the SPA functionality by using the ServerNet I graphical monitor or the ServerNet I command line diagnostic (spam). See the spam(1M) man page for additional information.

continued
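
The X-to-X and Y-to-Y cabling rule in the first row of the table can be expressed as a small check. The sketch below is illustrative only: the `(node 1 port, node 2 port)` tuple format and both function names are assumptions made for this example and are not part of any Compaq or SCO tool.

```python
from typing import List, Tuple

# Hypothetical model: one tuple per ServerNet I cable,
# (port used on node 1, port used on node 2).
Cable = Tuple[str, str]


def find_cross_cabled(cables: List[Cable]) -> List[Cable]:
    """Return every cable whose two ends use different ports (X to Y or Y to X)."""
    return [(a, b) for a, b in cables if a.upper() != b.upper()]


def cabling_ok(cables: List[Cable]) -> bool:
    """True only when there is exactly one X-to-X link and one Y-to-Y link."""
    ends = sorted((a.upper(), b.upper()) for a, b in cables)
    return ends == [("X", "X"), ("Y", "Y")]


if __name__ == "__main__":
    # A cross-cabled two-node cluster: both cables are flagged.
    print(find_cross_cabled([("X", "Y"), ("Y", "X")]))
    # Correct cabling passes the check.
    print(cabling_ok([("X", "X"), ("Y", "Y")]))
```

A check like this only models the rule; the physical verification described in Chapter 2 of the guide is still required.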

Contents of ProLiant DL380 G2

Page 1: ...ProLiant Clusters for SCO UnixWare 7 U/300 Quick Install Guide for the Compaq ProLiant DL380. First Edition (January 2001). Part Number 221544-001. Compaq Computer Corporation ...

Page 2: ...ries. All other product names mentioned herein may be trademarks or registered trademarks of their respective companies. Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information in this document is provided "as is" without warranty of any kind and is subject to change without notice. The warranties for Compaq products are set forth in the express limite...

Page 3: ...re Installation Steps 1-10; Resources for Application Installation 1-11; Other References 1-12; SCO UnixWare 7 NonStop Clusters Documentation 1-13. Chapter 2, Setting Up Cluster Hardware: Assembling the Rack 2-2; Stacking Components 2-3; Transporting Racks 2-4; Setting Up the Cluster Nodes 2-5; Installing the 64-Bit External Storage Fibre Channel HBA and GBIC-SWs 2-5; Installing Internal Disk Drives 2-6; Inst...

Page 4: ...Servers with SmartStart 3-3; Erasing the Configuration 3-3; Configuring the Servers 3-4; Updating Controller Firmware 3-7; Verifying ServerNet I Connections 3-7; Verifying the Local Adapter 3-8; Verifying Node-to-Node Communication 3-9; Installing the Cluster Using Quick Install 3-10; Installing Node 1 3-11; Installing Node 2 3-13; Verifying the Cluster Assembly 3-14; Additional Cluster Setup Tasks 3-14; Regi...

Page 5: ...hared Storage Problems 5-10; Client-to-Cluster Connectivity Problems 5-12; Cluster Resource Problems 5-14; ServerNet I Messages 5-15; ServerNet I SAN Error Messages 5-15; ServerNet I Notice Messages 5-17; ServerNet I Warning Messages 5-19; ServerNet I Panic Messages 5-23; ServerNet I Continuation and Informative Messages 5-27. Appendix A, Software Versions; Appendix B, Quick Install Planning Worksheets; Glossa...

Page 6: ...oldface. A plus sign between two keys indicates that they should be pressed simultaneously. User Input, File Names, Directory Names, Commands, Examples, Screen Elements: These elements appear in a different typeface. Variables: Information supplied by the user appears in italics. Menu Options, Dialog Box Names: These elements appear in initial capital letters. Type: When you are instructed to type information, ty...

Page 7: ...of information. Symbols on Equipment: These symbols may be located on equipment in areas where hazardous conditions may exist. This symbol, in conjunction with any of the following symbols, indicates the presence of a potential hazard. The potential for injury exists if warnings are not observed. Consult your documentation for specific details. This symbol indicates the presence of hazardous energy circui...

Page 8: ...ury from a hot component, allow the surface to cool before touching. These symbols on power supplies or systems indicate that the equipment is supplied by multiple sources of power. WARNING: To reduce the risk of injury from electric shock, remove all power cords to completely disconnect power from the system. Weight in kg / Weight in lb: This symbol indicates that the component exceeds the recommended wei...

Page 9: ...mpaq Technical Support Phone Center. Telephone numbers for worldwide Technical Support Centers are listed on the Compaq website. Access the Compaq website by logging on to the Internet at http://www.compaq.com. Be sure to have the following information available before you call Compaq: Technical support registration number (if applicable); Product serial number; Product model name and number; Applicable erro...

Page 10: ...Compaq Authorized Reseller: For the name of your nearest Compaq authorized reseller: In the United States, call 1-800-345-1518. In Canada, call 1-800-263-5868. Elsewhere, see the Compaq website for locations and telephone numbers. ...

Page 11: ...failures and provides configuration options for load balancing. Clustering is an established technology that can provide the following benefits: availability, scalability, manageability, investment protection, and operational efficiency. The reliability of the SCO UnixWare 7 NonStop Clusters technology ensures that your applications and data are protected from multiple error conditions. For more details on Co...

Page 12: ...r, you must assemble the cluster components, initialize them, and install the cluster software. Hardware Components: Supported cluster hardware components for this Quick Install include ProLiant DL380 servers, the Compaq StorageWorks RAID Array 4100 (RA4100) storage subsystem, the hardware required for the cluster interconnect, public network interface controller (NIC), and the Cluster Integrity (CI) serial cabl...

Page 13: ...redundant array controller; two GBIC-SWs, one in each controller; two 9.1-GB or larger disk drives, one in each slot 0; two multimode Fibre Channel cables. Cluster Interconnect: ProLiant Clusters for SCO UnixWare 7 with the ProLiant DL380 server can use either a high-speed ServerNet I network or a dedicated private Ethernet network to connect the cluster nodes. The cluster nodes use the interconnect data ...

Page 14: ...functions, more than one node trying to behave as the root node is undesirable. NOTE: The CI serial cable may be referred to as the split-brain avoidance (SBA) serial cable in UnixWare software and documentation. Hardware Configuration: The U/300 kit for the ProLiant DL380 server supports the server and storage hardware in specific configurations based on the type of cluster interconnect. The CI serial ca...

Page 15: ...must be on a different subnet from the embedded Ethernet cluster interconnect. Multiple public network controllers can be installed after cluster installation is complete. For a list of certified NICs, see the Compaq High Availability website at http://www.compaq.com/highavailability. Clusters that use ServerNet I interconnect access the public network using the embedded NIC. Each server must be connecte...

Page 16: ...sters software provide the operating environment for the ProLiant Clusters for SCO UnixWare 7. The SCO UnixWare 7 NonStop Clusters software provides the technology to: perform single-system-image operations; perform failover; define and modify cluster members; manually control and administer the cluster; view the current state of the cluster. This software is included in the U/300 kit. NOTE: The U/300 kit ...

Page 17: ...configuration in the following categories: ServerNet I connectivity tests verify that the nodes in the cluster can communicate over X and Y ServerNet I paths. Ethernet connectivity tests verify that the nodes can communicate over the Ethernet cluster interconnect. Storage tests verify the presence of, and minimum configuration requirements of, supported HBAs, array controllers, and external storage...

Page 18: ... CD is required for ServerNet I configurations. You can also use the CD to configure additional hardware. For information concerning SmartStart, refer to the Compaq Server Setup and Management package that comes with your server. The following utilities on the SmartStart CD are used for your cluster: Compaq Array Configuration Utility (ACU): The ACU is an offline tool that is used to configure the array c...

Page 19: ...k drive, storage system, memory, and system processor has a robust set of management capabilities. Compaq Insight Manager notifies the system administrator of impending fault conditions. For information concerning Compaq Insight Manager, refer to the Compaq Server Setup and Management package. See Chapter 4, "Managing Clusters," for more information. Compaq Management Agents and Tools for Servers for SCO Unix...

Page 20: ...in Chapter 2. Setting up external storage hardware components according to the documentation that came with them. Once you have set up the hardware, you must cable the hardware. To set up the external storage, refer to "Setting Up the External Storage Hardware" in Chapter 2. 2. Perform preinstallation tasks. Before beginning any software installation procedures, you must perform a few tasks to prepare for ...

Page 21: ... you through the installation and request the information found in your completed worksheets. Refer to "Installing the Cluster Using Quick Install" in Chapter 3. The following sections offer sources of information and support for application installation and cluster documentation. Resources for Application Installation: Client/server software applications are among the key components of any cluster. Comp...

Page 22: ...ay 4100 User Guide; Compaq StorageWorks RAID Array 4100 Configuration poster; Compaq StorageWorks RAID Array 4000 Redundant Array Controller Configuration poster; Compaq Fibre Channel Storage System User Guide; Compaq StorageWorks Fibre Channel Host Adapter Installation Guide; Compaq Fibre Channel Troubleshooting Guide; Compaq Fibre Channel Storage System Technology. For more information about cluster us...

Page 23: ...ick the book and question mark icon in the toolbar on the UnixWare Desktop to access SCOhelp. A browser displays the main SCOhelp list of topics. Type scohelp at the command line of a desktop terminal (dtterm) to access SCOhelp. A browser displays the main SCOhelp list of topics. Use the following URL to access SCOhelp remotely when the cluster is connected to the public network: http://clustername:457 Sub...

Page 24: ...7 U/300 for the Compaq ProLiant DL380 Quick Install Cluster: Assembling the Rack; Setting Up the Cluster Nodes; Setting Up the External Storage Hardware; Cabling the Components. For specific information about individual components, see the documentation that comes with the component. For information on steps and procedures for setting up cluster hardware, refer to the documentation that comes with the har...

Page 25: ...g height and overhead obstacles; change in slope of the floor or change in other elevation; floor roughness, texture, gaps, and obstacles; floor load capacity. Check the area where the hardware is to be unpacked for the following conditions: adequate proximity to installation area; maneuvering room; room to disassemble the crate; room for the ramp and for rolling the hardware off the crate. Chec...

Page 26: ...rack whenever possible. Install non-flat-panel monitors toward the top of the rack. Install components that require better cooling capacity toward the top of the rack. Purchase the rack stabilizer feet option when offered. The typical stacking order has the UPSs at the bottom and progresses upward according to the following list: UPS; Storage subsystems; Node 1 and node 2; Keyboard/mouse/monitor switch; Mo...

Page 27: ...es are disconnected from any expansion cabinets and that you have labeled the cables for trouble-free reconnection. Protect, coil, and stow the cables in the cabinet base. Confirm that all major cable bundles are well secured. Insert anti-static foam between components in the rack. Wrap the front and rear doors of the rack in bubble wrap before securely closing them. Crate the rack according to the docum...

Page 28: ...t Lights-Out Edition boards can be installed after the cluster Quick Install procedure has completed. Installing the 64-Bit External Storage Fibre Channel HBA and GBIC-SWs: For a redundant, fault-tolerant configuration, the storage system connects to both ProLiant DL380 servers, so an HBA must be installed in each server. Additionally, a GBIC-SW must be installed in each adapter. The following steps expla...

Page 29: ...rnet interconnect to a public network, install an NC3123 NIC into slot 1 of each node using the documentation that comes with the NIC. This step is not needed for clusters that use the ServerNet I interconnect. Installing the ServerNet I Cluster Interconnect: If your ProLiant DL380 cluster uses ServerNet I as the cluster interconnect, you must install the ServerNet I PCI adapter into slot 1 of each ser...

Page 30: ...ompaq StorageWorks RAID Array 4000 controllers (RA4000 controller), one included in the RA4100 storage; two GBIC-SWs, one in each controller; two 9.1-GB or larger disk drives; Fibre Channel cables. To configure the external storage for the U/300 Quick Install cluster, set up the RA4100 storage subsystem according to the following steps: 1. Follow the setup instructions in the documentation that comes with t...

Page 31: ...service and assembly of the cluster, so following appropriate cabling standards is vital to a successful cluster setup. Using Labeling Standards: Proper labeling can prevent improper connections and simplify cluster assembly and service. Make sure to label each server with the correct node labels that are provided. Also label the ends of the following cables: Ethernet crossover cable for cluster interco...

Page 32: ...lude X and Y connections for redundancy. Figure 2-1 shows the ServerNet I adapter connections. [Figure 2-1: ServerNet I PCI adapter connections. Labels: Port X Connector, Port Y Connector, PCI Bus Connector.] IMPORTANT: Cable X and Y to their corresponding counterparts. Do not cable X connections to Y connections. ...

Page 33: ...ect the ServerNet I adapter in node 1 to the ServerNet I adapter in node 2, as shown in Figure 2-2. [Figure 2-2: Example of cabling the cluster interconnect of a cluster that uses ServerNet I interconnect. Labels: Public Network, Node 1, Node 2, CI Serial Cable, X and Y Dedicated ServerNet I Cables.] NOTE: Cabling for the external storage is intentionally not shown. ...

Page 34: ...ble ties. Red ties are used only during shipment and are to be removed during onsite installation. [Figure 2-3: ServerNet I cable labeling suggestion.] To cable the ServerNet I interconnect, follow these steps: 1. Connect a white-labeled ServerNet I X cable to the X connection on the ServerNet I adapter in node 1. 2. Connect the other end of the cable to the corresponding ServerNet I adapter X connection in ...

Page 35: ... embedded NIC of the servers. See Figure 2-2 earlier in this chapter. For interconnects using Ethernet, connect the public LAN Ethernet cable to the NC3123 NIC in slot 1 of the servers. See Figure 2-4. [Figure 2-4: Example of cabling the cluster interconnect of a cluster that uses Ethernet. Labels: Ethernet Crossover Cable, Public Network, Node 1, Node 2, CI Serial Cable.] NOTE: Cabling for the external storage is int...

Page 36: ... cable to the embedded NIC in node 2. Figure 2-4 illustrates the proper cabling. Cabling the CI Serial Cable: IMPORTANT: The CI serial cable is required. To cable the CI serial cable, connect one end of the CI serial cable to serial port connector B in node 1. Connect the other end of the CI serial cable to serial port connector B in node 2. Figure 2-2 illustrates the proper cabling for clusters that use ...

Page 37: ...ler [Figure 2-5: Supported cabling of the RA4100 storage subsystem for the U/300 configuration, viewed from rear. Labels: Fibre Channel Array Controller, GBIC (4 places), SCSI Bus 1 (5 4 3 2 1 0), SCSI Bus 2 (5 4 3 2 1 0), Dual Fiber Optic Cable, Y, X.] 2. Connect the first RAID controller, in the upper slot (rack-mount) or right slot (tower) as viewed from the back of the RA4100, to node 1. Connect the redundant, or second, control...

Page 38: ...ounts of tension or pressure on the Fibre Channel cable body can destroy the connector. Type SC connectors include a white stripe along each side. Verify that the connectors mate with a positive click and that the white stripe is invisible. If the white stripe is visible, the connectors are not properly mated. Do not subject connectors to abrasion, chemical contaminants, or rough handling. Fibre Channel m...

Page 39: ...ge lasts until the UPS batteries approach the end of their holdup period. To connect the UPS power management cable to the ProLiant server nodes: 1. Locate the cable. The UPS power management cable is a 3.66-m (12.00-ft) serial cable included with most Compaq UPSs. 2. Connect one end of the cable to the COM port on the UPS chassis. Connect the other end of the cable to any unused serial port on any ProLian...

Page 40: ...nstall your cluster: Understanding Preinstallation Tasks and Considerations; Configuring the Servers with SmartStart; Updating Controller Firmware; Verifying ServerNet I Connections; Installing the Cluster Using Quick Install; Verifying the Cluster Assembly; Additional Cluster Setup Tasks; Registering the ProLiant Cluster for SCO UnixWare 7; Viewing UnixWare and NonStop Clusters Documentation. For specific ...

Page 41: ...lists the default settings used during the Quick Install installation procedure. These parameters can be modified after the installation is complete by running the International Settings Manager in the System folder of the SCOadmin system administration tool. Table 3-1, Quick Install Default Settings (columns: Parameter, Default): Locale C Standard Keyboard United States C Code set C Time zone Configurable to any...

Page 42: ...e that configuration on each node before using the Quick Install CDs for the ProLiant DL380 server. The following steps must be used separately on each node to erase the configuration. CAUTION: This procedure erases any information currently stored on the node. To prevent information loss, back up important files before attempting installation. NOTE: The SmartStart procedure may prompt you for the Server...

Page 43: ...ase a configuration, power up the server and then insert the SmartStart CD into the CD-ROM drive for node 1. If you erased a configuration according to the preceding steps, begin with step 2. 2. Select the language at the prompt. The Regional Settings screen displays. 3. Select the country and keyboard type from the Regional Settings screen. 4. Set the date, time, and daylight savings time adjustment, if appli...

Page 44: ...ays. 17. Use the arrow keys to select Step 5 Save and exit, and then press Enter. The Step 5 Save and exit window displays. 18. Select Save the configuration and restart the computer, and then press Enter. A Reboot window displays. 19. Press Enter. The Array Configuration Utility loads, and an Unconfigured Controller Wizard screen displays. 20. Click Next. The Select a configuration screen displays with RAID 0 h...

Page 45: ...only a single diskette. A screen for creating the first diskette displays. d. Click Skip. The Firmware Upgrade diskette for the RA4000 Controller displays. e. Insert the formatted diskette into the disk drive, and then click OK. f. Wait for the software to be written to the diskette. g. Remove the diskette from the drive after the software has been written to the diskette. h. Click Skip on each of the remainin...

Page 46: ...g procedures when the firmware is updated: Continue with the "Verifying ServerNet I Connections" section for cluster interconnects using ServerNet I. Continue with the "Installing the Cluster Using Quick Install" section for cluster interconnects using Ethernet. Verifying ServerNet I Connections: NOTE: This section applies only to clusters connected with ServerNet I. If the cluster is connected with Et...

Page 47: ... follow these steps: 1. Have a ServerNet I Verification Utilities diskette for each server. 2. Insert the proper diskette into the server that you want to test. Reboot the node. Wait for the DOS prompt to be displayed. 3. Type spaf at the DOS prompt, and then press Enter. A title screen displays. 4. Press any key to start the test of the ServerNet I links. The following messages display: LINK X IS ALIVE LINK Y ...

Page 48: ...or errors. Persistent errors indicate a problem with the cabling between the nodes. 5. Exit the test by pressing Esc if errors persist. Resolve the problem before continuing. 6. Press Enter on node 1 to begin the loopback test. Test messages similar to the following display: spaf path 0 Loopback 0 Option 1 11 29 14 08 20 600 pages spaf path 1 Loopback 0 Option 1 11 29 14 08 21 1200 pages If a loopback erro...

Page 49: ...values of node2 ic and 10.1.0.2 are provided during installation. IMPORTANT: The public network IP address for node 1 and node 2 and the CVIP address must be on the same Ethernet subnet. The default router must be on the public network subnet. The cluster interconnect IP addresses for node 1 and node 2 must be on a different subnet from the public network. For clusters using Ethernet interconnect, netmask...
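
The addressing rules in the IMPORTANT note can be checked on a planning worksheet before installation. The following is a minimal sketch using Python's standard `ipaddress` module; the function name, the parameter names, and every address in the example except 10.1.0.2 are assumptions for illustration, and the 255.255.255.0 netmask is a placeholder because the guide's netmask value is cut off above.

```python
from ipaddress import ip_address, ip_interface
from typing import List


def check_addressing(node1_pub: str, node2_pub: str, cvip: str, router: str,
                     pub_mask: str, node1_ic: str, node2_ic: str,
                     ic_mask: str) -> List[str]:
    """Return a list of rule violations; an empty list means the plan passes."""
    problems = []
    # Derive each subnet from node 1's address and the matching netmask.
    pub_net = ip_interface(f"{node1_pub}/{pub_mask}").network
    ic_net = ip_interface(f"{node1_ic}/{ic_mask}").network

    # Node 2, the CVIP, and the default router must share the public subnet.
    for label, addr in (("node 2 public address", node2_pub),
                        ("CVIP address", cvip),
                        ("default router", router)):
        if ip_address(addr) not in pub_net:
            problems.append(f"{label} {addr} is not on public subnet {pub_net}")

    # Both interconnect addresses must share one subnet.
    if ip_address(node2_ic) not in ic_net:
        problems.append(f"interconnect address {node2_ic} is not on {ic_net}")

    # The interconnect subnet must differ from the public subnet.
    if ic_net.overlaps(pub_net):
        problems.append("interconnect subnet overlaps the public subnet")
    return problems


if __name__ == "__main__":
    # Hypothetical worksheet values; only 10.1.0.2 appears in the guide.
    print(check_addressing("192.168.1.11", "192.168.1.12", "192.168.1.10",
                           "192.168.1.1", "255.255.255.0",
                           "10.1.0.1", "10.1.0.2", "255.255.255.0"))
```

Returning a list of messages rather than a single boolean lets one pass over the worksheet report every violated rule.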

Page 50: ...er to continue, or power down the system to abort the installation. The software begins to load, and a progress bar indicates the installation progress. When all software has been loaded, several screens request necessary information. 4. Provide the necessary information for the following screens. Each of the screens mentioned in step 3 displays the fields and a brief description of each field as it is se...

Page 51: ...he node 1 hostname and IP address for the cluster interconnect, the node 2 hostname and IP address for the cluster interconnect, and the netmask. The Ethernet cluster interconnect addresses must be on the same network. f. Network configuration: Enter the external network configuration: domain name, CVIP address, netmask, node 1 hostname and IP address for the public network, node 2 hostname and IP address fo...

Page 52: ... together. Insert the CDs into the servers, power up the servers, and follow the procedures for each node at the same time. To install the software on node 2, follow these steps: 1. Power up node 2, and then insert the Quick Install CD for node 2 into the CD-ROM drive. Wait while node 2 boots from the CD. A warning message indicates that data will be lost. 2. Press Enter to continue, or, for clusters using Ethernet i...

Page 53: ...nd storage subsystems. If any resources are missing or if the file system switch is not correctly operating, refer to Chapter 5, "Troubleshooting," of this guide. NOTE: The NSCVU requires the SNMP agents to be running on each node. After booting the cluster, wait 15 minutes for all the agents to start on each node before using the NSCVU. Otherwise, the NSCVU reports errors about unavailable data. Additional Cl...
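
The 15-minute wait in the NOTE can be worked out from the node boot times. The sketch below assumes the wait is counted from the most recent node boot; that interpretation, the function names, and the example timestamps are all illustrative assumptions rather than anything the guide specifies.

```python
from datetime import datetime, timedelta
from typing import Iterable

# Wait stated in the guide's NOTE before the NSCVU should be used.
AGENT_STARTUP_WAIT = timedelta(minutes=15)


def earliest_nscvu_run(node_boot_times: Iterable[datetime]) -> datetime:
    """Time at which the last-booted node has been up for the full wait."""
    return max(node_boot_times) + AGENT_STARTUP_WAIT


def nscvu_ready(node_boot_times: Iterable[datetime], now: datetime) -> bool:
    """True once the wait has elapsed on every node."""
    return now >= earliest_nscvu_run(node_boot_times)


if __name__ == "__main__":
    # Hypothetical boot times for node 1 and node 2.
    boots = [datetime(2001, 1, 15, 9, 0), datetime(2001, 1, 15, 9, 4)]
    print(earliest_nscvu_run(boots))
```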

Page 54: ...itionally, you can access manual pages using the man(1M) command. Access the online documentation in the following ways: Click the book and question mark icon in the toolbar on the UnixWare Desktop to access SCOhelp. A browser displays the main SCOhelp list of topics. Type scohelp at the command line of a desktop terminal (dtterm) to access SCOhelp. A browser displays the main SCOhelp list of topics. Use th...

Page 55: ...ized SCOadmin; Event Processor Subsystem; SCO UnixWare 7 NonStop Clusters Management Suite; Clusterized and cluster-specific command line utilities. Compaq provides the management capabilities customized for use with ProLiant Clusters for SCO UnixWare 7. Compaq management software includes: Clusterized Compaq Insight Manager Support; Uninterruptible Power Supply (UPS) Initiated Shutdown Configuration ...

Page 56: ...plays the main SCOhelp list of topics. Type scohelp at the command line of a desktop terminal (dtterm) to access SCOhelp. A browser displays the main SCOhelp list of topics. Use the following URL to remotely access SCOhelp when the cluster is connected to the public network: http://clustername:457. Substitute the name of your cluster or its CVIP address for clustername. The browser displays the main SCOhelp...
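
The substitution described for the remote SCOhelp URL can be sketched as follows. This assumes the URL reads http://clustername:457, with 457 as the port; the function name and the example cluster name and address are hypothetical.

```python
def scohelp_url(cluster: str) -> str:
    """Build the remote SCOhelp URL from a cluster name or CVIP address."""
    cluster = cluster.strip()
    if not cluster:
        raise ValueError("a cluster name or CVIP address is required")
    # Port 457 is taken from the URL given in the guide.
    return f"http://{cluster}:457"


if __name__ == "__main__":
    print(scohelp_url("mycluster"))     # hypothetical cluster name
    print(scohelp_url("192.168.1.10"))  # hypothetical CVIP address
```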

Page 57: ... Task Scheduler; VERITAS Volume Manager; Virtual Domain User Manager. The following SCOadmin folders provide additional management tools: Clustering Compaq Hardware Networking Software Management. Event Processing Subsystem: The Event Processing Subsystem (EPS) is installed during cluster installation. Use the EPS to configure actions and notifications based on system messages (syslogd). See the SCO UnixWare ...

Page 58: ...NMP agents, the EPS, and Compaq Insight Manager support. See the NCMS Configuration Manager help subsystem for additional information. ServerNet Manager: The SCO UnixWare 7 NonStop Clusters ServerNet Manager provides a graphical user interface to manage the ServerNet I storage area network (SAN). The ServerNet Manager displays the status of ServerNet I connections as well as advanced ServerNet I configura...

Page 59: ...al pages for commands can be viewed using SCOhelp or using the man command from the command line. onnode, onall: Executes a command on a specific node or on all nodes. migrate, kill3: Sends the running process a migrate request signal. cluster: Displays node state information and the software version installed. clusternode_avail: Shows node availability status. clusternode_num: Displays the current node numbe...

Page 60: ...f shutdown. The following SCO UnixWare commands interrogate the cluster node on which they are executed. These commands can be used on any cluster node in conjunction with the onnode command. nfsstat: NFS and RPC kernel interface statistics. rtpm: Real-time performance monitor. psradm: SMP processor administration. psrinfo: SMP processor information. pexbind: Exclusive processor bind operation. NOTE: Refer to t...

Page 61: ...ht Manager Overview: Compaq Insight Manager is the Compaq Win32 application for managing networked devices. Compaq Insight Manager provides intelligent monitoring and alerting capabilities for the critical systems in a distributed enterprise. Compaq Insight Manager consists of a Win32 application and a set of server- or client-based management data collection agents. Key subsystems make system health c...

Page 62: ...q Management Agent software runs on ProLiant cluster servers and interacts with the Compaq Insight Manager XE management server using SNMP and hypertext transfer protocol (HTTP) messaging. Compaq Insight Manager XE gives system administrators control through a visual interface, comprehensive fault and configuration management, and remote management. Administrators can access detailed information about t...

Page 63: ...he Management CD, the package that provides this support is nscvu and is part of the Compaq Management Agents and Tools for Servers for SCO UnixWare 7 NonStop Clusters portion of the CD. For more information, see the description of the Quick Install CDs in Chapter 1, "Clustering Overview," of this guide. UPS Initiated Shutdown: The Quick Install procedure automatically installs the UPS software. On the Mana...

Page 64: ...log and must not be modified. The UPS_SERIAL_PORT parameter identifies: the serial ports to which the UPSs are connected; the combination of UPS signals required to shut down the cluster. The UPS_SERIAL_PORT parameter is set equal to a listing of serial ports that is separated by colons and semicolons. Colon-separated serial ports create a pair of UPSs in which both of the UPSs must signal that they ar...

Page 65: ...of the cluster. In this configuration, both UPSs are combined into a single logical UPS, which results in a UPS_SERIAL_PORT configuration of UPS_SERIAL_PORT dev tty00 1 dev tty00 2, where dev tty00 1 is the device identifier for the node 1 serial port tied to UPS 1 and dev tty00 2 is the device identifier for the node 2 serial port tied to UPS 2. Figure labels: Serial Cable, Serial Cable, UPS 1, Node 1, Node 2, CI Serial ...
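
The colon and semicolon grouping of UPS_SERIAL_PORT can be illustrated with a short parser. This is a sketch under stated assumptions, not the UPS software's own logic: the guide's text confirms only that colon-separated ports form a group in which every UPS must signal, so treating semicolons as separators between independent groups, any one of which can trigger shutdown, is an assumption. The port names `portA`, `portB`, and `portC` are placeholders because the punctuation of the real device identifiers is lost in the excerpt above.

```python
from typing import List, Set


def parse_ups_serial_port(value: str) -> List[List[str]]:
    """Split a UPS_SERIAL_PORT value into groups (';') of ports (':')."""
    groups = []
    for group in value.split(";"):
        ports = [p.strip() for p in group.split(":") if p.strip()]
        if ports:
            groups.append(ports)
    return groups


def shutdown_required(groups: List[List[str]], signalling: Set[str]) -> bool:
    """Assumed rule: shut down when every UPS in at least one group signals."""
    return any(all(port in signalling for port in group) for group in groups)


if __name__ == "__main__":
    # Placeholder port names; real device identifiers are not reproduced here.
    groups = parse_ups_serial_port("portA:portB;portC")
    print(groups)
    print(shutdown_required(groups, {"portA"}))
    print(shutdown_required(groups, {"portA", "portB"}))
```

Keeping parsing separate from the shutdown decision makes it easy to replace the assumed semicolon rule once the full text of the guide is consulted.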

Page 66: ... arise while installing, configuring, testing, and operating Compaq ProLiant Clusters for SCO UnixWare 7, refer to the following troubleshooting sections: Installation Problems; Quick Install Error Messages; Node-to-Node Communication Problems; Shared Storage Problems; Client-to-Cluster Connectivity Problems; Cluster Resource Problems; ServerNet I Messages ...

Page 67: ...r distribution unit (PDU) or UPS circuit breaker. No console output. Keyboard, mouse, or monitor cabling. Verify the cabling for correctness. Console output is from the wrong server. Node selection. Verify that you have the correct server selected through the Keyboard/Monitor/Mouse switchbox. Press the PrintScrn key for a menu of possible connections. Inadequate memory. Verify that the node has at least the mi...

Page 68: ...and controllers are properly installed. Refer to the documentation that comes with the product for more troubleshooting information. Uncertified hardware configuration (server or storage). Verify that the servers and storage subsystems are on the certified hardware list for ProLiant clusters. See the certified hardware listing at http://www.compaq.com/highavailability. StorageWorks RAID Array 4000 redundan...

Page 69: ...g the server with the SmartStart. See Chapter 3 of this guide. No Fibre Channel HBA. Install the Fibre Channel HBA, follow the cabling instructions, and configure the server with the SmartStart. See Chapter 2 and Chapter 3 of this guide. Failed Fibre Channel HBA, GBIC-SW, cable, or StorageWorks RAID Array 4100 storage subsystem. Replace failed component. No disk drives or only 1 in RA4100. Add disk dr...

Page 70: ... licenses for number of processors, users, features, and so on. ServerNet I PCI adapter (SPA) is not correctly functioning. Verify the SPA using the ServerNet I Verification Utility (SVU) as described in Chapter 3 of this guide. New node does not join the cluster. SPA is not correctly cabled or a ServerNet I cable is defective. Verify the ServerNet I connections using the SVU as described in Chapter 3 of this...

Page 71: ...Replace if necessary. Node hardware failure. Disconnect the node from the cluster. Diagnose and repair hardware failures as a stand-alone ProLiant server. Existing node does not rejoin the cluster. Both X and Y ServerNet I cables are damaged. Note: Damage can occur if cables are improperly secured and both cables are severely crimped when a rack-mount server on a slide-rail is pushed back into a rack. Che...

Page 72: ...ed in any way. If necessary, replace the Ethernet crossover cable. After the Ethernet connection is repaired, boot the cluster. Alternating root node panics (RA4100 system). Both ServerNet I connections failed (and the CI serial cable is not used). Power down the cluster. Check the ServerNet I cables to determine that the cables are properly connected, do not have bent pins, and are not crimped or compromised i...

Page 73: ...ing and boot the cluster. ServerNet I link exception errors are reported. ServerNet I cable is defective. Verify that the ServerNet I cables are properly connected, do not have bent pins, and are not crimped or compromised in any way. If necessary, replace the ServerNet I cables. ServerNet I cables can be damaged if improperly secured or crimped when a rack-mount server on a slide-rail is pushed back into...

Page 74: ...ster, verify ServerNet I functionality by using the ServerNet I graphical monitor or the ServerNet I command line diagnostic (spam). See the spam(1M) man page for additional information. Replace the SPA if necessary. Intermittent ServerNet Advanced Interface Logic (SAIL) freeze, link-level self-check errors, or performance degradation. ServerNet I board installed in the wrong slot. Install the ServerNet I boar...

Page 75: ...on Drive problem. Replace the bad drive. Drives in the RA4100 are not recognized. Hardware errors or communications problems, or cluster does not support the disk drive. Use the SCOadmin event viewer to verify that no hardware errors or transport problems exist. Check the event log for disk I/O error messages or indications of problems with communications transport. See the documentation that comes with ...

Page 76: ...at node to diagnose and isolate the problem using the information contained in the user guide for the RA4100 and the Fibre Channel troubleshooting guide. Replace any defective component. Storage performance is marginal on an FC-AL system. Cache modules on the array controllers do not match. Verify that the cache module on each RA4100 controller is properly seated. If necessary, replace one cache module s...

Page 77: ...ing connections. Transmission Control Protocol/Internet Protocol (TCP/IP) is not properly configured. Configure TCP/IP using SCOadmin networking tools. Clients cannot communicate with a node or nodes over Ethernet. Public network interface IP address is invalid. Reconfigure public network interface IP addresses for the network boards within the cluster using SCOadmin networking tools. Cluster Virtual IP C...

Page 78: ... NIC boards on two different nodes have IP addresses on the same subnet as the CVIP address. This configuration allows the CVIP to switch to another public network interface if the primary public network interface is lost. See the SCO UnixWare 7 NonStop Cluster System Administrator's Guide for more information on CVIP. "set_id failed Invalid argument" message displays. Intermittent failure occurs when a...
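The CVIP condition described above (NIC boards on two different nodes with IP addresses on the same subnet as the CVIP address) can be checked mechanically before the cluster is configured. The following is a minimal sketch using Python's standard ipaddress module. The function names, node names, and all addresses are hypothetical placeholders and do not come from this guide.

```python
import ipaddress


def nodes_on_cvip_subnet(cvip, netmask, node_addresses):
    """Return the node names whose public address shares the CVIP subnet.

    cvip and netmask are dotted-quad strings; node_addresses maps a
    (hypothetical) node name to its public IP address.
    """
    # strict=False accepts a host address instead of a network address.
    subnet = ipaddress.ip_network(f"{cvip}/{netmask}", strict=False)
    return sorted(
        name
        for name, addr in node_addresses.items()
        if ipaddress.ip_address(addr) in subnet
    )


def cvip_can_switch(cvip, netmask, node_addresses):
    """True when at least two nodes have an interface on the CVIP subnet."""
    return len(nodes_on_cvip_subnet(cvip, netmask, node_addresses)) >= 2


if __name__ == "__main__":
    # Documentation-range addresses, chosen only for illustration.
    nodes = {"node1": "192.0.2.11", "node2": "192.0.2.12"}
    print(cvip_can_switch("192.0.2.10", "255.255.255.0", nodes))
```

If the check reports fewer than two nodes, the CVIP would have no second public interface to switch to, which is the situation the text above warns about.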

Page 79: ... package that includes a loadable kernel module and requires a node reboot. ALL nodes in the cluster must be rebooted by using the cluster-wide reboot (shutdown -i6). Not all processors seem to be usable. Incorrect UnixWare licenses on a node. Update the UnixWare licenses for that node through the SCOadmin license manager so that all processors are properly licensed. Perform a clusternode_shutdown and reb...

Page 80: ...ed in the SCOhelp online documentation set. See Chapter 4, "Managing Clusters," for information about viewing NonStop Clusters documentation. ServerNet I SAN Error Messages. The ServerNet I PCI Adapter Driver (SPAD) sends ServerNet I SAN messages to the system console and system log for the local node (/var/adm/log/osmlog.n, where n is the local node number). This section lists all of the SPAD messages describe...

Page 81: ...serious problem that can warrant action (Table 5-10). PANIC: Messages that report a catastrophic failure; the SPAD can no longer continue operation. The local node has dropped out of the cluster (Table 5-11). None (blank): Continuation of a previous message or an informative message not associated with a fault (Table 5-12). Most messages include variable strings that are filled in when the event causing the mess...
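When reviewing a saved copy of the system log, it can help to sort SPAD messages by the severity classes described above. The following is a rough sketch only. It assumes the severity appears as a leading token followed by a colon, which this excerpt does not confirm. Only PANIC and the blank severity are visible in the text above; the WARNING and NOTICE names are assumptions.

```python
# Severity prefixes assumed for this sketch. PANIC is named in the text
# above; WARNING and NOTICE are assumptions, as is the "PREFIX:" layout.
SEVERITIES = ("PANIC", "WARNING", "NOTICE")


def classify(line):
    """Return the severity of one log line, or "None" for blank severity."""
    text = line.lstrip()
    for severity in SEVERITIES:
        if text.startswith(severity + ":"):
            return severity
    # No recognized prefix: continuation or purely informative message.
    return "None"


def needs_attention(lines):
    """Return the lines classified as PANIC or WARNING, in original order."""
    return [line for line in lines if classify(line) in ("PANIC", "WARNING")]


if __name__ == "__main__":
    sample = [
        "PANIC: SAIL frozen",
        "Successfully recovered from frozen SAIL",
    ]
    for entry in needs_attention(sample):
        print(entry)
```

Lines with no recognized prefix fall into the blank severity, matching the description of continuation and informative messages.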

Page 82: ... with the target node over the given path (X/Y). If a path is cabled, a success message is expected. If the path is not cabled (for example, X side cabled and Y side not cabled), a failure message is expected for that path. If a barrier transaction fails when it is expected to succeed, there is a problem with the path. Check the cabling and ServerNet I switch (ensure that it is powered up) on the indicated path...
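The rule above for interpreting barrier transaction results can be written as a small decision table. This is an illustrative sketch, not part of the SPAD or any Compaq utility; the function name and the returned strings are made up for the example.

```python
def assess_path(cabled, barrier_succeeded):
    """Interpret one barrier transaction result for a single path (X or Y).

    cabled: True if the path is recorded as cabled.
    barrier_succeeded: True if a success message was reported for the path.
    """
    if cabled and barrier_succeeded:
        return "ok"
    if not cabled and not barrier_succeeded:
        return "expected failure (path not cabled)"
    if cabled and not barrier_succeeded:
        return "problem: check cabling and switch power on this path"
    # Success on a path recorded as uncabled is not covered by the text
    # above; the wording of this result is an assumption of the sketch.
    return "unexpected success: recheck the cabling record"


if __name__ == "__main__":
    # Example: X side cabled and working, Y side not cabled.
    print("X:", assess_path(cabled=True, barrier_succeeded=True))
    print("Y:", assess_path(cabled=False, barrier_succeeded=False))
```

Only the third case, a failure on a cabled path, indicates a fault that calls for the cabling and switch checks.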

Page 83: ...nk exception reporting must be enabled (see spam l on command) for this message to be displayed. None. Successfully recovered from frozen SAIL: Indicates that the SAIL application-specific integrated circuit (ASIC) stopped responding due to an internal self-check problem, usually from being held off of the PCI bus too long. After detecting the self-check, the ASIC is reset and processing resumes. None. Switchi...

Page 84: ...nnnnn int_mode n: This message follows a SAIL ASIC self-check. It indicates that, after the SAIL ASIC self-check recovery procedure completed, the block transfer engine (BTE) status register contained an incorrect value and had to be reinitialized, probably because a second self-check occurred during recovery of the first self-check. None. If the SAIL ASIC self-checks occur too frequently, the node drops fr...

Page 85: ...h for additional SNET messages. Path n still disabled due to link exceptions. Verify that path n is properly cabled. No more warnings regarding path being disabled are printed until the condition is corrected. This series of messages indicates that a continuous burst of link exceptions was detected. As a result, link exception reporting is turned off for this path until the condition is corrected. Check ...

Page 86: ...rom being held off the PCI bus too long. This self-check condition is recoverable. The ASIC is reset and processing resumes. None. However, if this message is frequently repeated, move the SPA to a higher-priority slot on the PCI bus. If that does not help, other PCI boards in the node can be consuming the PCI bus and preventing the SAIL ASIC from obtaining the access that it needs. The SAIL on the SHIP bo...

Page 87: ...pace exhaustion. None. If the node drops out of the cluster later, these messages can be useful in determining what happened. 0xF0nnn low condition on external LSERR input detected; 0xF0nnn illegal burst on the i960 bus detected; 0xF0nnn invalid register access on the i960 bus detected; 0xF0nnn error pulse on the external PCHK input detected; 0xF0nnn data parity error detected; 0xF0nnn address parity error...

Page 88: ...gnostics, replace the SPA. intr_init spawn_daemon_thread failed: Indicates that an attempt to spawn a kernel daemon thread failed. Because this daemon is vital to the SPAD, if it fails to start, the initialization sequence is aborted. Reboot the node into the cluster. SAIL frozen see SAIL state printed above: This message is preceded by a warning message indicating that the SAIL ASIC is frozen and a dump o...

Page 89: ... installed in local node. Run the resource manager (resmgr) command and verify that the SPA (displayed as ship in the report) is listed. ship_init Unknown revision of the SAIL ASIC detected CIN 0xnnnnnnnn: The SAIL ASIC detected is not revision A or revision B. Replace the SPA with a 1.5 (revision E) SPA in the local node. ship_init Unknown revision of the ServerNet I PCI adapter found Rev ID 0xn: Indicates t...

Page 90: ... a software or hardware error has occurred. Reboot the node into the cluster. The PLX_ABORT_ACTIVE bit is set shipintr: Indicates that the PLX chip aborted a PCI operation. A previous SPA read or write operation has been aborted. This condition could lead to unreported data loss or corruption, so SPA operations are halted with a PANIC. Run offline diagnostics. If the SPA passes diagnostics, reboot the node...

Page 91: ...et I Request; Invalid status on ServerNet I Request 0xnnnnnnnn; bte_error Invalid BTE command descriptor; intr_init qintr_map failed; ioint invalid ioaddr 0xnnnnnnnn; ship_init physmap failed; PHYS_TO_VIRT invalid address 0xnnnnnnnn; allocSNdev out of sndev table space; snetConfig bad cmd n; snetOpen invalid mode 0xnnnnnnnn. These messages are all SPAD software errors. If possible, take a crash dump for analysi...

Page 92: ...ss 0xnnnnnnnn Type Interrupt: These are two separate cases of continuation messages. They are followed by information from the access validation and translation (AVT) entry associated with the problem. Usually, this dump of information is accompanied by some other error message indicating the problem. Information from the AVT entry specified by one of these two lines is printed out to help diagnose the s...

Page 93: ... which was not expected. Usually, this packet dump is accompanied by some other error message indicating the problem; the exception packet is dumped to help diagnose what caused the problem. This message is usually seen in conjunction with timeouts on ServerNet I requests. Save this and any accompanying messages for analysis by product support personnel. See the user action for the message accompanying ...

Page 94: ...7.1.1 IP; PTF nsc1011c; PTF nsc1013a; Compaq EFS 7.38a; Compaq Management Agents 4.90; System partition created from the Compaq SmartStart and Support Software CD 4.90. Additional software and versions needed include: Compaq SmartStart and Support Software CD 4.90 or later, to initialize the cluster; Compaq Management CD 4.90 or later, to install the Compaq Insight Manager client software ...

Page 95: ...uster Software for the Compaq ProLiant DL380 server. Fill these worksheets out before you begin the software installation, and use the data where needed in the procedures. Table B-1: Quick Install Data. Columns: Screen, Field, Your Information. a: Read responses from previously saved diskette. b: Date; Time; Time zone (only U.S. time zones are available). c: Cluster name. d: System owner name; System owner login ID; System owne...

Page 96: ...10.1.0.1. Node 2 hostname for the cluster interconnect: node2-ic. Node 2 IP address for the cluster interconnect: 10.1.0.2. Netmask: 255.255.255.0. f: Domain name; CVIP address; Netmask; Node 1 hostname for the public network; Node 1 IP address for the public network; Node 2 hostname for the public network; Node 2 IP address for the public network; Default route. g: SNMP agent configuration: Contact name; Machine lo...
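Before entering the worksheet values, it may be worth confirming that the two cluster interconnect addresses are distinct and fall in the same network under the chosen netmask. The sketch below uses Python's standard ipaddress module with the interconnect values shown in the worksheet (10.1.0.1, 10.1.0.2, netmask 255.255.255.0). The function name is hypothetical, and the check itself is a suggestion rather than a step from this guide.

```python
import ipaddress


def interconnect_ok(addr_1, addr_2, netmask):
    """True when the two addresses differ and share one network."""
    if addr_1 == addr_2:
        return False
    # ip_interface accepts "address/netmask" and derives the network.
    net_1 = ipaddress.ip_interface(f"{addr_1}/{netmask}").network
    net_2 = ipaddress.ip_interface(f"{addr_2}/{netmask}").network
    return net_1 == net_2


if __name__ == "__main__":
    print(interconnect_ok("10.1.0.1", "10.1.0.2", "255.255.255.0"))
```

The same function can be reused for the public network rows once those fields are filled in.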

Page 97: ...le B-2: SCO UnixWare License Worksheet. Columns: Field, Your Information. Node 1 license number; Node 1 license code; Node 1 license data (if necessary); NonStop Cluster Two-Node License; Node 2 license number; Node 2 license code; Node 2 license data (if necessary) ...

Page 98: ...dition that results in both nodes in a two-node cluster trying to operate as the root node. Cluster Membership Service: Cluster Membership Service (CLMS) determines which nodes are a part of the cluster and controls the operating system portion of nodes that join and leave the cluster. Clusterized: The term refers to software that has been modified or designed to work in a cluster software environment. C...

Page 99: ... Interface. Ethernet Crossover Cable: The Ethernet crossover cable provides the node-to-node communication data path for the cluster. FC-AL: See Fibre Channel Arbitrated Loop. Fibre Channel Arbitrated Loop: Fibre Channel Arbitrated Loop (FC-AL) is a communication method between hardware components. HTTP: See Hypertext Transfer Protocol. Hypertext Transfer Protocol: Hypertext transfer protocol (HTTP) is the set ...

Page 100: ...gh-speed, low-latency cluster interconnect that uses a ServerNet I PCI adapter and two ServerNet I cables. Simple Network Management Protocol: The simple network management protocol (SNMP) is a transmission control protocol/IP (TCP/IP) protocol that generally uses the User Datagram Protocol (UDP) to exchange messages between a management information base and a management client residing on a network. Becaus...

Page 101: ... results in both nodes in a two-node cluster trying to operate as the root node. The use of the CI serial cable, which is included in this cluster kit, eliminates the possibility of split brain. Storage Area Network: A storage area network (SAN) is a high-speed, special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with an associated data server on behalf of a la...

Page 102: ... 2-14; Fibre Channel precautions 2-15; keyboard 2-16; labeling 2-8; monitor 2-16; mouse 2-16; public LAN Ethernet 2-12; ServerNet I interconnect 2-9; UPS 2-16; cabling CI serial cable 2-13; components 2-8; Ethernet crossover cable 2-13; keyboard 2-16; monitor 2-16; mouse 2-16; public LAN Ethernet 2-12; ServerNet I interconnect 2-9; standards 2-8; UPS 2-16; cabling cluster nodes Ethernet cluster interconnect illustra...

Page 103: ...des setting up 2-5; cluster-aware applications 1-11; clustering overview 1-2; clusterized agents 4-8; clusters documentation 1-13; SCO UnixWare 7 NonStop Clusters 4-2; commands clusterized 4-5; onnode 4-6; SCO UnixWare 4-6; Compaq authorized reseller xi; Compaq Insight Manager agents 1-7; support 4-7; Compaq Insight Manager XE overview 4-8; support 4-8; Compaq support website 3-7; components cabling 2-8; external...

Page 104: ...ion illustrated 1-5; LAN connection 1-5; ServerNet I configuration illustrated 1-4; hazard symbol viii; hazardous conditions symbols on equipment viii; hazardous energy circuits symbol viii; warning viii; HBA (Host Bus Adapter) installing 2-5; help additional information 1-12; additional sources x; Compaq authorized resellers telephone numbers xi; Compaq support website x; technical support telephone numbers x ...

Page 105: ...c LAN Ethernet 2-12; CI serial cable 2-13; internal disk drive installation considerations 3-2; installing 2-6; investment protection cluster 1-1; K; Keepalive Configuration Manager 4-5; Keepalive Manager 4-4; keyboard cables 2-16; L; labels on cables 2-8; symbols on equipment viii; LAN (local area network) connection hardware components 1-5; licenses obtaining 3-3; software 1-9; local area network See LAN; M; manag...

Page 106: ...alling HBA 2-5; public LAN NIC 2-6; redundant controller 2-7; ServerNet I interconnect cabling 2-11; upgrading controller firmware 3-7; protection investment 1-1; public LAN Ethernet cabling 2-12; NIC installing 2-6; Q; Quick Install data B-1; error messages 5-4; planning worksheets B-1; UPS software 4-9; R; RA4100 storage subsystem cabling illustrated 2-14; rack components stacking 2-3; loading caution 2-3; site ...

Page 107: ...initiated shutdown 4-9; single system image capabilities cluster 1-3; site rack assembling 2-2; SmartStart configuring the servers 3-4; defined 1-8; SmartStart CD 1-8, 3-3; software components cluster 1-5; general installation steps 1-10; installation 3-1, 3-2; licenses 1-9; SCO 1-6; SCOadmin 4-2; updates 3-15; versions A-1; solving problems 5-1; split brain avoiding 1-4; standards cabling 2-8; storage hardware comp...

Page 108: ...are 1-8; upgrading controller firmware procedure 3-7; UPS (uninterruptible power supply) cabling 2-16; defined 1-7; information file 4-10; initiated shutdown 4-9, 4-10; Quick Install 4-9; serial connection 4-10; shutdown notification 4-11; utilities Array Configuration 1-8; NSCVU 1-7; Options ROMPaq 1-8; SVU 1-8; V; verifying clusters 1-7; local ServerNet I adapter 3-8; ServerNet I connections 3-7; verifying node to ...
