
Managing the Compaq ProLiant Clusters HA/F100 and HA/F200  


Compaq Confidential – Need to Know Required

Writer: Bryan Hicks
Project: Compaq ProLiant Clusters HA/F100 and HA/F200 Administrator Guide
Part Number: 380362-003
File Name: f-ch5 Managing the Compaq ProLiant Clusters HAF100 and HAF200.doc
Last Saved On: 8/24/00 12:03 PM

Other Functions

Two helpful functions for using the Redundancy Manager graphical user
interface (GUI) are Refresh and Rescan.

Refresh

Refresh (F5) updates information on the GUI screen, checks for path failures
and path changes, and displays the current configuration. The GUI will not
update automatically. The changes that you have made will not be saved. Use
Refresh to update the main screen to see the current configuration or to see
whether a failure has occurred in the system. Refresh does not affect any
processing or interrupt any of the system’s functions.

Rescan

Rescan checks for new host bus adapters and array controllers and for added or
removed physical drives. Use Rescan after a hot-swap of host bus adapters or
array controllers and after adding or removing physical drives.

NOTE: For every hot replace, a rescan should be run on each machine in a cluster.

1. Select Features from the Main screen.

2. Select Rescan from the Features menu.
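The note above (one rescan per machine for every hot replace) can be tracked
with a simple checklist. The following is only an illustrative sketch: the
function name, node names, and component labels are hypothetical, and the code
does not call Redundancy Manager, where Rescan is still run by hand from the
Features menu on each node.

```python
def rescan_checklist(nodes, hot_replaces):
    """Build one reminder per (hot replace, cluster node) pair.

    nodes and hot_replaces are plain lists of labels chosen by the
    administrator; nothing here talks to the cluster itself.
    """
    tasks = []
    for component in hot_replaces:
        for node in nodes:
            tasks.append(f"After replacing {component}: run Rescan on {node}")
    return tasks


if __name__ == "__main__":
    # Hypothetical two-node cluster with a single controller hot replace.
    for task in rescan_checklist(["Node 1", "Node 2"], ["RA4000 controller"]):
        print(task)
```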

RAID Array 4000 Controller Hot Replace

In an HA/F200 cluster, an RA4000 Controller can be replaced in the
RA4000/4100 without powering down the storage system or taking the cluster
“off-line.” This is called “hot replace.”

1. From the Compaq Redundancy Manager screen, identify which controller
needs to be replaced. Remove the fiber optic cable and GBIC from the
RA4000 controller, then remove the RA4000 controller from the system.
You can remove the active RA4000 controller provided that the storage
system has a standby RA4000 controller ready for the failover operation.

2. Insert the replacement controller and GBIC, then reconnect the fiber
optic cable.

3. Perform the Rescan operation to have Redundancy Manager identify the
new RA4000 controller.
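The condition in step 1 (the active controller may be removed only when a
standby controller is ready for failover) can be written as a small check. This
is a minimal sketch under stated assumptions: the Controller record, its role
and ready fields, and the function name are hypothetical and are not part of
Redundancy Manager. The actual controller state must be read from the
Redundancy Manager screen.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    name: str
    role: str    # "active" or "standby" (hypothetical labels)
    ready: bool  # hypothetical flag: controller is ready for failover


def can_remove_active(controllers):
    """Return True if a standby controller is ready to take over.

    Models only the rule stated in step 1; it does not inspect hardware.
    """
    return any(c.role == "standby" and c.ready for c in controllers)


if __name__ == "__main__":
    pair = [
        Controller("Controller A", "active", True),
        Controller("Controller B", "standby", True),
    ]
    if can_remove_active(pair):
        print("A standby controller is ready; the active controller can be hot replaced.")
    else:
        print("No standby controller is ready; do not remove the active controller.")
```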

Summary of Contents for ProLiant Clusters HA/F100

Page 1: ...ProLiant Clusters HA F100 and HA F200 Administrator Guide Third Edition September 2000 Part Number 380362 003 Compaq Computer Corporation...

Page 2: ...HE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND ANY RISK ARISING OUT OF THE USE OF THIS INFORMATION REMAINS WITH RECIPIENT IN NO EVENT SHALL COMPAQ BE LIABLE FOR ANY...

Page 3: ...ProLiant Cluster HA F200 1 4 Compaq ProLiant Servers 1 6 Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 1 6 Compaq StorageWorks RAID Array 4000 Controller 1 7 Connection In...

Page 4: ...lanning 2 28 Server Capacity 2 29 Shared Storage Capacity 2 31 Static Load Balancing 2 35 Networking Capacity 2 37 Network Considerations 2 37 Network Configuration 2 37 Migrating Network Clients 2 38...

Page 5: ...HA F200 Windows NTS E 4 19 Chapter 5 Managing the Compaq ProLiant Clusters HA F100 and HA F200 Managing a Cluster Without Interrupting Cluster Services 5 2 Managing a Cluster in a Degraded Condition 5...

Page 6: ...Storage 6 6 Client to Cluster Connectivity 6 11 Cluster Groups and Cluster Resource 6 15 Troubleshooting Compaq Redundancy Manager 6 16 Event Logging 6 16 Informational Messages 6 16 Warning Message 6...

Page 7: ...About This Guide vii Expanding Capacity B 8 Other Functions B 9 Troubleshooting Redundancy Manager B 9 Appendix C Software and Firmware Versions Glossary Index...

Page 8: ......

Page 9: ...t they should be pressed simultaneously USER INPUT User input appears in a different typeface and in uppercase FILENAMES File names appear in uppercase italics Menu Options Command Names Dialog Box Na...

Page 10: ...c instructions NOTE Text set off in this manner presents commentary sidelights or interesting points of information Symbols on Equipment These icons may be located on equipment in areas where hazardou...

Page 11: ...ll power cords to completely disconnect power from the system Rack Stability WARNING To reduce the risk of personal injury or damage to the equipment be sure that The leveling jacks are extended to th...

Page 12: ...rt Phone Center Telephone numbers for world wide Technical Support Centers are listed on the Compaq website Access the Compaq website by logging on to the Internet at http www compaq com Be sure to ha...

Page 13: ...Compaq Authorized Reseller For the name of your nearest Compaq authorized reseller In the United States call 1 800 345 1518 In Canada call 1 800 263 5868 Elsewhere see the Compaq website for locations...

Page 14: ...ion of servers and storage that acts as a single system presents a single system image to clients provides protection against system failures and provides configuration options for static load balanci...

Page 15: ...0 4100 storage systems One Compaq StorageWorks RAID Array 4000 Controller per RA4000 4100 storage system One of the following hubs or switches Compaq StorageWorks Fibre Channel Storage Hub 7 or 12 por...

Page 16: ...Service MSCS Compaq SmartStart and Support Software CD Compaq Cluster Verification Utility CCVU Compaq Insight Manager optional Compaq Insight Manager XE optional Compaq Intelligent Cluster Administra...

Page 17: ...n components Two Compaq ProLiant servers One or more Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 RA4000 4100 storage systems Two Compaq StorageWorks RAID Array 4000 Cont...

Page 18: ...e Channel for Windows NT Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000 4100 Compaq Cluster Verification Utility CCVU Compaq Insight Manager optional Compaq Insight Manager XE optiona...

Page 19: ...ing a shared storage subsystem connected to ProLiant servers through Fibre Channel Arbitrated Loop technology NOTE Visit the Compaq High Availability website http www compaq com highavailability to ob...

Page 20: ...pable and manages all of the drives in the RA4000 4100 storage array Each RA4000 4100 is shipped with one controller installed In a HA F100 cluster each array controller is connected to both servers t...

Page 21: ...ster If the maximum number of supported RA4000 4100s currently five are connected to either type of cluster using a 12 port hub there will be unused ports Compaq does not currently support using these...

Page 22: ...ht Manager CIM Array Configuration Utility ACU and the StorageWorks Switch Management Utility For more information refer to the Compaq StorageWorks Fibre Channel FC AL Switch 8 Installation Guide Comp...

Page 23: ...ach server and a cable connecting the adapters The cluster nodes use the interconnect data path to Communicate individual resource and overall cluster status Send and receive heartbeat signals Update...

Page 24: ...ect strategies refer to the White Paper Increasing Availability of Cluster Communications in a Windows NT Cluster available from the Compaq High Availability website http www compaq com highavailabili...

Page 25: ...to achieve this and the method you choose is dependent on your hardware One way is through use of the Redundant NIC Utility available on all Compaq 10 100 Fast Ethernet products The other option is th...

Page 26: ...connect Dedicated Interconnect Using Standard Ethernet Cables and a private Ethernet Hub Standard Ethernet cables can be used to connect the NICs together through a private Ethernet hub to create anot...

Page 27: ...luster nodes Monitor the state of each cluster node Initiate failover and failback events NOTE MSCS will only run with Windows NTS E Previous versions of Windows NT are not supported NOTE The HA F200...

Page 28: ...on the SmartStart and Support Software CD included in the Compaq Server Setup and Management Pack shipped with ProLiant servers SmartStart is the recommended way to configure the Compaq ProLiant Clus...

Page 29: ...etup and Management pack Compaq Support Paq for Microsoft Windows 2000 The Compaq Support Paq for Microsoft Windows 2000 is an advanced software delivery tool that replaces the familiar SSD utility ve...

Page 30: ...through a redundant path allowing applications to continue processing This rerouting is transparent to NTFS Therefore in an HA F200 configuration it is not necessary for MSCS to fail resources over to...

Page 31: ...ces over to the other node Secure Path in combination with redundant hardware components is the basis for the enhanced high availability features of the HA F200 running Windows NTS E Two licenses of S...

Page 32: ...Insight Manager Compaq Insight Manager loaded from the Compaq Management CD that is shipped with each ProLiant server is an easy to use console based software utility for collecting server and cluster...

Page 33: ...ter Monitor relies heavily on the Compaq Insight Manager agents for basic information about system health It also has custom agents that are designed specifically for monitoring cluster health Cluster...

Page 34: ...ications are among the key components of any cluster Compaq is working with its key software partners to ensure that cluster aware applications are available and that the applications work seamlessly...

Page 35: ...all of the cluster components and concepts fit together to meet your information system needs The major topics discussed in this chapter are Planning Considerations Capacity Planning Network Consider...

Page 36: ...the HA F100 Configuration section of this chapter By definition a highly available system is not continuously available and therefore may have single points of failure NOTE The discussion in this cha...

Page 37: ...city memory and processor power to run all applications all applications running on the first node plus all clustered applications running on the other node When designing your cluster so that only on...

Page 38: ...the Marketing server HR clients experience a slight disruption of service while the file shares and print spooler fail over to their secondary server Any jobs that were in the print spooler before the...

Page 39: ...e active example 2 While in a normal state both cluster nodes run at expected performance levels If the Marketing server encounters a failure the market research application and associated data resour...

Page 40: ...ximum availability the file and print server can be unavailable for several hours without impacting revenue In this scenario the order entry database is configured to use the file and print server as...

Page 41: ...y processing data In active standby only one server is processing data active while the other the standby server is in an idle state The standby server must be logged in to the Windows NT or Windows 2...

Page 42: ...distribution instructions for the warehouse With an estimated downtime cost of 1 000 hour the company determines that the cost of a standby server is justified This mission critical active server is...

Page 43: ...d each application depends on software and hardware subsystems For example most applications need a storage subsystem to hold their data files This section is designed to help you understand which sub...

Page 44: ...ication and a Web server a Windows NT or Windows 2000 service NOTE For this example it is assumed that each cluster group can communicate with the other even if they are not executing on the same node...

Page 45: ...r each business function Figure 2 6 Web Sales Order Business Function Web Server Service Cluster Group 1 Resource 1 Dependent Resource 1 Resource 2 Resource 3 Resource 1 Dependent Resource 1 Resource...

Page 46: ...p 1 Database Server Application Cluster Group 2 Network Name IP Address Network Name IP Address Web Server Service Physical Disk Resource contains web pages and web scripts Database Application Physic...

Page 47: ...b Resource 1 Sub Resource 2 Sub Resource 3 Sub Resource 4 Resource 3 Web Server Service Sub Resource 1 Sub Resource 2 Sub Resource 3 Sub Resource 4 Resource 4 N A Sub Resource 1 Sub Resource 2 Sub Res...

Page 48: ...ese features will vary based upon your specific server model The single points of failure described in this section are Cluster interconnect Fibre Channel data paths Non shared disk drives Shared disk...

Page 49: ...onnect NOTE If you are using the ServerNet option as the interconnect the card itself has a built in level of redundancy Each ServerNet PCI adapter has two data ports thereby allowing two separate cab...

Page 50: ...ode Failure of a LAN NIC in a cluster node may have serious repercussions If your cluster is configured with a dedicated interconnect and a single LAN NIC the failure of a LAN NIC will prevent network...

Page 51: ...the Microsoft clustering software to use both the primary and redundant LAN NIC as backup for intracluster communication With this strategy your cluster can continue normal operations without a failov...

Page 52: ...retains its fully redundant status when MSCS is configured to use the other network ports as interconnect backup Failure of the primary interconnect path results in intracluster communications occurr...

Page 53: ...ons occurring over the primary NIC of the redundant pair If the entire interconnect card fails the cluster nodes will still have a working communication path The cluster to LAN communication is fully...

Page 54: ...RA4000 Controller and a Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 RA4000 4100 into which the SCSI disks are placed The RA4000 4100 storage system has two distinct dat...

Page 55: ...Interconnect Figure 2 11 Host bus adapter to storage hub data path Note that the Compaq Insight Manager tools monitor the health of the RA4000 4100 storage system If any part of the Fibre Channel data...

Page 56: ...iant Server ProLiant Server RA4000 4100 storage hub or switch Interconnect Figure 2 12 Hub to RA4000 4100 data path Without access to shared storage clustered applications cannot reach their data or l...

Page 57: ...not used failure of a shared disk drive will disrupt service to all clustered applications and services that depend on the drive Failover of a cluster node will not resolve this failure since neither...

Page 58: ...combination of multiple paths and redundant hardware components provided by the HA F200 offers significantly enhanced high availability over non redundant configurations A single component failure in...

Page 59: ...the interruptions may not even be noticeable The following illustration depicts the HA F200 configuration components Node 1 RA4000 4100 Dedicated Interconnect LAN Node 2 storage hub or switch storage...

Page 60: ...ndby data paths separated by two Fibre Channel storage hubs or FC AL switches Figure 2 14 and Figure 2 15 detail the active and standby paths of the minimum HA F200 configuration RA4000 4100 storage h...

Page 61: ...5 Active hub to storage data path The second active data path runs from the active hub or switch to the RA4000 4100 If this path fails the applications can seamlessly fail over to the standby hub to R...

Page 62: ...des is divided into three generic categories Operating system Nonclustered applications and services Clustered applications and services Figure 2 16 illustrates these categories in the cluster Operati...

Page 63: ...will reside on shared storage see Shared Storage Capacity later in this chapter Server Capacity The capacity needed in each server depends on whether you design your cluster as an active active config...

Page 64: ...p to fail applications and services to Server1 Server1 clustered applications and services if Server1 is set up to fail applications and services to Server2 Processing power memory and nonshared stora...

Page 65: ...ch the application from shared storage The application will execute with the same customizations that existed when executed on the primary node Two factors help to determine the required amount of sha...

Page 66: ...files and program files It might be important for the log file and program files to have a quick recovery time while performance would be a secondary concern Together the files do not take up much cap...

Page 67: ...array Each volume is presented to the operating system as an independent disk drive and can be independently controlled by the cluster software Using the previous example you could configure the two...

Page 68: ...Group Log file s for Database Required Application Capacity 12 GB 4 3 GB Desired Level of Protection RAID 5 RAID 1 RAID Configuration 4 x 4 3 GB 2 x 4 3 GB Required Capacity With RAID 17 2 GB 8 6 GB T...

Page 69: ...applications and data across the data paths through an active active host bus adapter configuration This configuration can increase the functionality of the cluster IMPORTANT Disk load balancing cann...

Page 70: ...tion can accommodate static load balancing because the host bus adapters of one server can be in an active active HBA mode to different storage systems RA4000 4100 RA4000 4100 Server Server A A S A S...

Page 71: ...ffect the corporate LAN The Microsoft clustering software has specific requirements regarding which protocol can be used and how IP address and network name resolution occurs Additionally consider how...

Page 72: ...ce IP addresses DHCP cannot be used to assign IP addresses for virtual servers When configuring DHCP exclude enough static IP addresses from the pool of dynamically leased addresses to account for the...

Page 73: ...ers connect to resources using the cluster network name and file share Client Server Applications Reconfiguration of client applications in a client server environment may also be required Some applic...

Page 74: ...clustered servers after failover Cluster server thresholds and periods Failover of directly connected devices Automatic vs manual failover Failover failback policies Performance After Failover As app...

Page 75: ...k resource Disk1 that is part of a cluster group Group1 You set the restart threshold to 5 and the restart period to 10 If the Disk1 resource fails the Microsoft clustering software will attempt to re...

Page 76: ...ce If Group1 cannot be restarted within the limits of the restart threshold and period the Microsoft clustering software attempts to fail over Group1 to Node 2 If the failover threshold for Group1 is...

Page 77: ...y not be able to provide failover capabilities for them Manual vs Automatic Failback Failback is the act of integrating a failed cluster node back into the cluster Specifically it brings cluster group...

Page 78: ...icies for cluster groups Table 2 4 Group Failover Failback Policy Terms and Definitions Term Definition Failover policy The circumstances the Microsoft clustering software uses to take a group offline...

Page 79: ...ering software will attempt to fail over the group 5 times within a 1 hour period Prevent Prevent automatic failback This setting allows the administrator to fail back a group manually Allow Allow aut...

Page 80: ...ned in previous examples A blank copy of the worksheet is provided in Appendix A Group Failover Failback Policy Worksheet Group Name Web Server Service General Properties Name Web Server Service Descr...

Page 81: ...uration or you are planning to upgrade the operating system of an HA F100 or HA F200 see Chapter 4 for more details The Compaq ProLiant Clusters HA F100 and HA F200 are combinations of several individ...

Page 82: ...2 Installation Guide Compaq StorageWorks Fibre Channel FC AL Switch 8 Installation Guide Documentation received with your operating system Microsoft Windows NT Server 4 0 Enterprise Edition Windows NT...

Page 83: ...ncy Manager Fibre Channel Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000 4100 Compaq Insight Manager optional Compaq Insight Manager XE optional Compaq Intelligent Cluster Administrat...

Page 84: ...cards you will use for client access to the cluster What are the adapter names and IP addresses of the network adapter cards you will use for the dedicated interconnect between the cluster nodes What...

Page 85: ...n on the drives themselves After you have configured the shared drives from one of the cluster nodes it is not necessary to configure the drives from the other cluster node When the Array Configuratio...

Page 86: ...Windows NTS E or Windows 2000 Advanced Server makes dynamic drive letter assignments when drives are added or removed or when the boot order of drive controllers is changed but Disk Administrator or...

Page 87: ...q recommends that Automatic Server Recovery ASR be left at the default values for clustered servers Follow the installation instructions in your Compaq ProLiant Server documentation to set up the hard...

Page 88: ...ardware and configuration For more information refer to your server documentation and the Compaq white paper Where Do I Plug the Cable Solving the Logical Physical Slot Numbering Problem available fro...

Page 89: ...4000 Controller and the Fibre Channel cables Note that the Compaq shared external storage documentation explains how to install these devices for a single server Because clustering requires shared sto...

Page 90: ...Enabled or Disabled as required by the ports in your cluster configuration The ports on the Compaq StorageWorks FC AL 3 Port Expansion Module are configured in a similar fashion by selecting 3 Port Ex...

Page 91: ...the section on the Compaq Array Configuration Utility in the Compaq shared external storage documentation NOTE The Array Configuration Utility runs automatically during an automated SmartStart install...

Page 92: ...mpaq ServerNet option as the server interconnect for your ProLiant Cluster you need the following Two ServerNet PCI adapter cards Two ServerNet cables Follow these steps to install the ServerNet inter...

Page 93: ...s you to configure any certified network card as a possible path for intracluster communication If you are employing a dedicated interconnect use MSCS to configure your LAN network cards to serve as a...

Page 94: ...q Intelligent Cluster Administrator software and documentation Compaq Cluster Verification Utility At least 10 high density diskettes Assisted Integration Using SmartStart Recommended IMPORTANT Prior...

Page 95: ...on the second server When configuring drives through the Array Configuration Utility create a logical drive with 100MB of space to be used as the quorum disk Assisted Integration Installation Steps I...

Page 96: ...torage system Refer to the user guide for the RA4000 or RA4100 for more details After you have completed using the Array Configuration Utility the system will reboot and SmartStart will automatically...

Page 97: ...the node Run Options ROMPaq from diskettes and choose to update the firmware on the array controllers 11 Power down the storage and Node 1 after the firmware update completes 12 Power on the storage a...

Page 98: ...ect Settings from the Start menu c Select Control Panel from the Settings menu d Select Add Remove Programs from the Control Panel e Click Install from the Add Remove Programs page f Click Next from t...

Page 99: ...FQDN to IP address mapping For more detailed information on Secure Path refer to the Secure Path documentation 19 Run the Compaq Cluster Verification Utility CD from your cluster kit to ensure that yo...

Page 100: ...s 2000 Advanced Server run Compaq Support Paq for Windows 2000 to verify that all installed drivers are current This service can be run from the following path on the SmartStart CD x cpqsupsw ntcsp se...

Page 101: ...Cluster Administrator CD 2 Click the Explore button 3 Double click the CICA folder 4 Double click SETUP EXE The Compaq Intelligent Cluster Administrator will begin installation If a previous version o...

Page 102: ...d the software verify creation of the cluster using the following steps 1 Shut down and power down both servers 2 Power down and then power on the RA4000 4100 3 Power up both servers When Windows fini...

Page 103: ...If the cluster is not working correctly see the installation troubleshooting tips in Chapter 6 Verifying Node Failover NOTE Do not run any client activity while testing failover events Follow these s...

Page 104: ...the installation troubleshooting tips in Chapter 6 Verifying Network Client Failover After you have verified that each server is correctly running as a cluster node the next step is to verify that ne...

Page 105: ...dministrator to perform a manual failover of the cluster group that contains the IP address 6 Execute the ping command again after the manual failover completes As soon as the other node brings the cl...

Page 106: ...th basic cluster management and operation It also assumes that you are familiar with the hardware and software configuration details outlined in Chapter 3 of this guide Even though some of the procedu...

Page 107: ...to Appendix C to determine which service packs software and firmware version levels are required for cluster upgrades IMPORTANT These procedures may be updated over time For additional information on...

Page 108: ...rray 4000 4100 Compaq Redundancy Manager Fibre Channel Windows NTS E only Additional GBICs and Fibre Channel cables In addition to the above requirements the following items are needed for any cluster...

Page 109: ...or FC AL switch documentation Compaq StorageWorks Fibre Channel Storage Hub 7 Installation Guide Compaq StorageWorks Fibre Channel Storage Hub 12 Installation Guide Compaq StorageWorks Fibre Channel F...

Page 110: ...ver all cluster resources to node 2 2 Upgrading the operating system and drivers on node 1 3 Failing back all cluster resources to node 1 4 Upgrading node 2 If an RA4000 Controller firmware upgrade is...

Page 111: ...Log on to the node as the administrator NOTE After the upgrade is complete the Cluster Service will fail to start This is because the DNS client for Node 1 has not been set up The problem that occurs...

Page 112: ...and that Node 1 has rejoined the cluster Open Cluster Administrator by clicking Start Programs Administrative Tools Cluster Administrator As the Cluster Administrator opens an error will display This...

Page 113: ...to update the firmware on the RA4000 Controllers d Power down the storage and Node 1 after the firmware update completes e Power on the storage wait for the drives to spin and then power on Node 1 9...

Page 114: ...th nodes 4 Re installing applications on the cluster Use the SmartStart Assisted Integration procedure to configure the servers nodes in this migration procedure CAUTION Installation using SmartStart...

Page 115: ...rating system installation 9 Insert the Windows 2000 CD when prompted Follow the on screen instructions to install Windows 2000 Advanced Server 10 Power down the server insert the Options ROMPaq diske...

Page 116: ...fer to your Windows 2000 Advanced Server documentation 19 Verify that Secure Path is running properly and that the redundant paths are operational See the Secure Path documentation for more informatio...

Page 117: ...ade include 1 Failing over all cluster resources to node 2 2 Adding the redundant loop hardware 3 Upgrading the hardware of node 1 4 Installing Secure Path on node 1 5 Failing back all cluster resourc...

Page 118: ...Information tab The online Array Configuration Utility is installed by running the Compaq Support Paq for Windows 2000 in Step 2 of this migration procedure b Determine the firmware version on newly...

Page 119: ...Run In the dialog box that displays type X dskbldr setup exe where X is the drive letter associated with your CD ROM drive You can also acquire and run the latest Options ROMPaq from the Compaq websi...

Page 120: ...n Process IV HA F100 Windows NTS E to HA F200 Windows 2000 Advanced Server This procedure can be performed while keeping your cluster on line a rolling upgrade provided that the firmware levels of all...

Page 121: ...d if you want to upgrade Windows select Yes c Follow the on screen instructions until you are required to log on to the node d Log on to the node as the administrator NOTE After the Windows 2000 Advan...

Page 122: ...s Microsoft Cluster Server This error will not display when Node 2 is upgraded to Windows 2000 Advanced Server From the error screen select Yes To All to open Cluster Administrator 7 Verify the RA4000...

Page 123: ...e Builder Utility by inserting the SmartStart CD and selecting Start and then Run In the dialog box that displays type X dskbldr setup exe where X is the drive letter associated with your CD ROM drive...

Page 124: ...The cluster must be shut down causing the cluster to be unavailable to clients during the migration IMPORTANT Back up all data before beginning the migration process The basic steps to this upgrade in...

Page 125: ...bre Channel storage hub or FC AL switch See the installation procedures in Chapter 3 IMPORTANT If using the Compaq StorageWorks FC AL Switch 8 be sure to properly set up the Port LIP Propagation Polic...

Page 126: ...y running the Compaq Cluster Verification Utility Instructions for installing and running this utility can be found in Chapter 3 of this guide 9 Verify that Compaq Redundancy Manager is running proper...

Page 127: ...clusters The chapter also details the utilities and programs used in the ongoing management of Compaq ProLiant Clusters HA F100 and HA F200 The topics addressed in this chapter include Managing a Clu...

Page 128: ...ath for Windows 2000 on RAID Array 4000 4100 Compaq Insight Manager Compaq Insight Manager XE Compaq Intelligent Cluster Administrator Microsoft Cluster Administrator Managing a Cluster Without Interr...

Page 129: ...egradation Use Compaq Insight Manager or Compaq Insight Manager XE to determine the problem 2 Determine whether the condition will continue to worsen 3 Determine how critical the problem is a If the p...

Page 130: ...istration and control and it relies heavily on the Compaq Insight Manager Web enabled agents as well as other agents for basic information about system health A full description of the Compaq Insight...

Page 131: ...ompaq Insight Manager tools show the shared logical drives as cluster resources owned by a particular node They show the Fibre Channel hardware as a physical resource of both servers in the cluster Wh...

Page 132: ...0 4100 you are about to remove from the cluster 2 Power off the RA4000 4100 you are about to remove Remove the Gigabit Interface Converter GBIC and the cable from the Fibre Channel storage hub or FC A...

Page 133: ...move the SmartStart CD 8 Boot Node 1 to Windows NTS E or Windows 2000 Advanced Server then run Disk Administrator for Windows NTS E or Disk Management for Windows 2000 Advanced Server to assign perman...

Page 134: ...our storage system to understand these rules Failure to follow these rules may result in loss of data Replacing a Failed Drive The procedure for replacing a failed drive is completed within the RA4000...

Page 135: ...it ACU Remove SmartStart CD 6 Boot Node 1 to Windows NTS E or Windows 2000 Advanced Server then run Disk Administrator for Windows NTS E or Disk Management for Windows 2000 Advanced Server to assign p...

Page 136: ...ode Node 1 for example Fail over to the remaining node any cluster groups that are running on the node being replaced Node 2 2 Open Cluster Administrator on Node 1 Right click Node 2 Select Evict Node...

Page 137: ...process of backing up data will ensure that a company s assets are secure and available when a disaster strikes The cluster itself provides a high degree of application availability but does not prev...

Page 138: ...e performance monitor utility can be used to determine whether either of the cluster nodes is operating at too high a performance level Then use Cluster Administrator to fail over as many cluster grou...

Page 139: ...Channel Host Adapter host bus adapter Compaq StorageWorks RA4000 Controller array controller and Fibre Channel data paths It then reroutes the I O processing This section provides information on how...

Page 140: ...current configuration Changing from Standby to Active Paths To change a path from Standby to Active mode 1 Highlight the Standby path you want to change 2 Select Path from the main screen menu bar 3...

Page 141: ...adapters or array controllers and after adding or removing physical drives. NOTE: For every hot replace, a rescan should be run on each machine in a cluster. 1. Select Features from the Main screen. 2. Selec...

Page 142: ...START menu select Programs then SecurePath and then the SPM submenu 2 Click the SPM application icon Logging on to Secure Path Manager Logging on to SPM incorporates entering user and storage profiles...

Page 143: ...ion password must be the same on each of the Secure Path hosts. Check Save Password if you want SPM to use the saved password automatically each time you log in with this storage profile. Saving an SPM St...

Page 144: ...client SPM access list or password using the Configuration utility you must stop and restart the Agent using the Windows Services Applet located in Control Panel 3 Find and select the Secure Path Age...

Page 145: ...highlight it in the storage system view 2 Drag the drive to the other controller or right click to select the Move To Other Controller action Verifying A Path Choose Verify a Path when you want SPM t...

Page 146: ...3. Disconnect Fibre Channel cable from removed controller. Wait for Secure Path to acknowledge the failed path. 4. Insert replacement controller. Wait for LED 8 to start flashing. This will take about 30 s...

Page 147: ...n agents. Management agents monitor more than 1,000 management parameters. Key subsystems are instrumented to make health, configuration, and performance data available to the agent software. The agents ac...

Page 148: ...tus of all cluster resources From the Compaq Insight Manager Cluster Shared Resources screen you can View address transport protocol and physical ID of all cluster interconnects View the current state...

Page 149: ...Insight Manager XE is Cluster Monitor, a real-time cluster monitoring system for ProLiant Clusters using Microsoft Windows NTS/E or Windows 2000 Advanced Server and MSCS. The combination of Insight Man...

Page 150: ...ring administrators a quick and convenient way to diagnose system status Compaq Insight Manager XE helps you focus on your computing environment from the perspective of Microsoft clusters and their at...

Page 151: ...tifications of changes in cluster status Monitor cluster status by viewing a list of cluster alerts Investigate the sources of specific alerts Browse cluster and component status in a tree hierarchy D...

Page 152: ...hout failing over the cluster You can also check for any cluster destabilizing conditions such as disk thresholds or application slowdowns Compaq Intelligent Cluster Administrator performs three main...

Page 153: ...ting Cluster Configurations Using the Import Export configuration functionality you can Import an archived configuration to the active cluster Export a cluster configuration to an archive and save it...

Page 154: ...rces to their preferred server Pause groups and resources Restructure a group s resource dependency tree Cluster Administrator can run remotely or on a cluster node If Cluster Administrator is install...

Page 155: ...F100 and HA F200 These problems are described in the following troubleshooting categories Installation Node to Node Shared Storage Client to Cluster Connectivity Cluster Groups and Cluster Resources...

Page 156: ...services are running. 2. Check the name resolution of the cluster. It is possible that you are using an incorrect name or that the name is not being properly resolved by WINS or DNS. Cluster Administrator...

Page 157: ...de. Primary IP address is invalid. Verify that the addresses are valid. If DHCP is used to obtain noncluster IP addresses, run IPConfig.exe to ensure the network adapter cards have valid IP addresses. If t...

Page 158: ...cabling of the cluster Troubleshooting Node to Node Problems Table 6 2 describes problems that may be encountered during server to server communication Table 6 2 Solving Node to Node Problems Problem...

Page 159: ...igured properly Verify TCP IP configuration on both nodes No IP connectivity Verify IP connectivity to the cluster address If unable to ping the IP address of the cluster run Cluster Administrator on...

Page 160: ...Reboot cluster nodes after installing MSCS Ensure the drives are recognized Drives in the RA4000 4100 are not recognized Host bus adapter driver is not installed Ensure that the host bus adapter drive...

Page 161: ...ion Utility. 3. If all drives are not recognized by the Array Configuration Utility, verify all Gigabit Interface Converter Shortwave (GBIC-SW) modules are properly seated. 4. Verify that all Fibre Channel c...

Page 162: ...all Fibre Channel cables are properly connected to the GBIC-SW modules. For details on how to connect the GBIC-SW modules and the Fibre cables, see the documentation that came with your storage system...

Page 163: ...atch 2 If they do not match use Options ROMPaq to update the firmware Drive rebuild automatically restarts Failover may have occurred Check to see if a failover has occurred It is normal behavior for...

Page 164: ...fline GBIC SW laser has malfunctioned 1 Refer to the documentation that came with your storage system for instructions on replacing a GBIC SW 2 Manually fail back resources Windows NTS E or Windows 20...

Page 165: ...IP address Network clients communicate with the cluster through TCP IP Table 6 4 Solving Client to Cluster Connectivity Problems Problem Possible Cause Action TCP IP is not configured properly Verify...

Page 166: ...client has TCP IP protocol correctly installed and configured Resource name resolution problem may exist Use NetBT cache Nbtstat exe on the Windows NTS E CD to determine whether the name had been pre...

Page 167: ...iling back to the inaccessible primary server 2 Implement a redundant interconnect LAN strategy Install three PCI network cards per server Set up one as a private interconnect configured for cluster c...

Page 168: ...bling is not damaged or loose on the surviving node 3 Verify that MSCS was able to receive the heartbeat of the surviving node and properly failed over the resources 4 Verify that the failed over grou...

Page 169: ...line Some resources take time to go offline Wait several minutes then check any dependencies that the resource may have Verify that each can be taken offline An IP address added to a cluster group fai...

Page 170: ...Event Viewer refer to your Microsoft operating system documentation Informational Messages Table 6 6 provides a list of informational messages and actions to take using Redundancy Manager Table 6 6 C...

Page 171: ...has exited improperly It is recommended that Redundancy Manager not be run while another program has a lock on the array controller s To stop this instance from starting select the Cancel button To st...

Page 172: ...ctive Path to xxxxxx A logical disk is claimed but no Active Path is selected Click OK for the Redundancy Manager to automatically assign the path for this logical disk Or click Cancel to assign the p...

Page 173: ...a list of error messages and actions to take using Redundancy Manager Table 6 8 Error Messages Message Description Action Another instance has locked the loop This instance is running in Read Only mod...

Page 174: ...k management command to an array controller The lock management command only allows viewing the data No action needed to view the data To configure the data close the other application to unlock the a...

Page 175: ...firmware cannot resolve two array controllers talking to the drive Replace the drives Array controller firmware versions don t match The array controllers have different firmware versions Run the Opti...

Page 176: ...00 Completed worksheets are illustrated in chapters 2 and 3 of this guide Copy these worksheets and use as many as necessary to assist you in planning and designing your cluster configuration The foll...

Page 177: ...2 Sub Resource 3 Sub Resource 4 Resource 2 Sub Resource 1 Sub Resource 2 Sub Resource 3 Sub Resource 4 Resource 3 Sub Resource 1 Sub Resource 2 Sub Resource 3 Sub Resource 4 Resource 4 Sub Resource 1...

Page 178: ...urce 2 Description Required Capacity without RAID Level of Protection Desired RAID Configuration Required Capacity with RAID Disk Resource 3 Disk Resource 4 Description Required Capacity without RAID...

Page 179: ...Failover Failback Policy worksheet to define failover and failback settings for each cluster group Group Failover Failback Policy Worksheet Group Name General Properties Name Description Preferred Ow...

Page 180: ..._________________________________ Password _______________________________________ Domain _________________________________________ Network Adapter Cards that will be used for client access to the clu...

Page 181: ...monitoring tool; it is not a real-time management tool. In a nonclustered environment, Redundancy Manager enables full utilization of the redundant hardware available for use with RA4000/4100 storage sys...

Page 182: ...B 1 shows a single server setup with an RA4000 4100 This setup provides redundant paths to the RA4000 4100 RA4000 4100 LAN Server storage hub or switch storage hub or switch storage hub or switch sto...

Page 183: ...ancy Manager can be configured with multiple paths to a particular storage device. Each path can be defined as an active path (enabling static I/O load balancing) or with one active and one or more stand...

Page 184: ...48 MB of RAM recommended for Windows NTS E or Microsoft Windows NT Server 4 0 1 5 MB reserved disk space VGA color or better At least two Compaq StorageWorks Fibre Channel Host Adapters P or Compaq St...

Page 185: ...ancy Manager If the server is not set up to automatically load when the CD is placed in the CD ROM drive follow these steps to manually install Compaq Redundancy Manager 1 Place the Compaq Redundancy...

Page 186: ...ging Redundancy Manager. Redundancy Manager increases the availability of single-server or clustered systems using the RA4000/4100 storage system. Redundancy Manager can detect failures of the host bus...

Page 187: ...to Active Paths To change a path from Standby to Active mode 1 Highlight the Standby path you want to change 2 Select Path from the main screen menu bar 3 Select Set As Active from the Path menu The s...

Page 188: ...he new drives 3 Run the Array Configuration Utility to configure the drives NOTE You cannot increase the capacity of an existing Windows NT drive volume but you can assign a new drive letter to the ex...

Page 189: ...ened in the system. Refresh does not affect any processing or interrupt any of the system's functions. Rescan: Rescan is used to check for new host bus adapters and array controllers and after adding and...

Page 190: ...are and firmware updates recommended or required for your Compaq ProLiant Cluster Table C 1 Supported Software Firmware Versions Software Firmware Title Version Compaq SmartStart and Support Software...

Page 191: ...l 1 2 or later Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000 4100 3 1 or later Microsoft Windows NT Server 4 0 Service Pack 6a or later Microsoft Windows 2000 Service Pack 1 or later...

Page 192: ...at a time can communicate Array controller A hardware device that facilitates communications between a host and one or more devices organized on an array Also called RA4000 controller Availability A...

Page 193: ...Compaq StorageWorks Fibre Channel Host Bus Adapter P A device that provides an interface between a host system server and storage system or other devices connected on a Fibre Channel arbitrated loop...

Page 194: ...Fibre Channel Array See Compaq StorageWorks RAID Array 4000 4100 Fibre Channel An IEEE standard for providing high speed data transfer among workstations mainframes supercomputers desktop computers s...

Page 195: ...t Protocol Address A number that uniquely identifies a host server so that computer entities can locate and communicate with each other through the transfer of packets IP addresses can be statically o...

Page 196: ...e a system is turned on that verifies components are present and operating Preferred node The principal server an application is configured to operate from Proprietary clustering system Traditionally...

Page 197: ...ault tolerance by storing two sets of duplicate data on a pair of disk drives RAID 4 Data guarding This level involves the use of a single designated drive containing parity data If a drive fails the...

Page 198: ...rverNet: A bidirectional, high-bandwidth, low-latency, redundant-path network interconnect. Service: A data set or operation set exported by application servers to their clients. Shared resource: A type of cl...

Page 199: ...red storage to existing cluster 5 6 application software cluster aware 1 21 Compaq integration technotes 1 21 array creating 2 33 maximum volumes 2 33 optimizing performance 2 33 volume 2 33 Automatic...

Page 200: ...ackup 5 11 installing a new boot drive 5 11 modifying physical cluster resources 5 6 removing shared storage 5 6 replacing a storage drive 5 8 system performance 5 12 Windows NT Performance Monitor 5...

Page 201: ...2 23 Compaq Insight Manager XE cluster components 1 3 cluster management 5 4 cluster monitor cluster specific features 5 24 description 1 20 5 23 managing the interconnect 2 16 Compaq Intelligent Clus...

Page 202: ...able 2 39 D data backup 5 11 dedicated interconnect 1 11 DHCP 2 38 disk resource troubleshooting 6 3 DNS See Domain Name Service Domain Name Service 2 38 drive letters 3 6 drive ownership determining...

Page 203: ...essages 6 16 Insight Management Desktop 1 19 installation Compaq StorageWorks RAID Array 4000 Storage System 3 9 Ethernet hub 3 12 hardware 3 7 interconnect 3 8 3 11 Microsoft Cluster Server 6 3 redun...

Page 204: ...9 failover period 2 42 failover threshold 2 42 restart period 2 41 restart threshold 2 41 N net use command 2 39 network capacity 2 37 clients 5 4 migrating 2 38 troubleshooting 6 11 configurations 2...

Page 205: ...rformance 2 21 2 40 monitoring 2 40 virtual 1 10 server capacity requirements table 2 30 ServerNet installation 3 12 interconnect 1 11 redundancy 2 15 shared resource connecting to 2 39 shared storage...

Page 206: ...iled over group 6 14 client to cluster connectivity 6 11 cluster administrator does not appear in start menu 6 2 cluster group 6 15 cluster resource group 6 15 cluster to LAN communication 6 11 Compaq...

Page 207: ...erating system 1 14 Performance Monitor 5 12 Windows NTS E 1 14 Windows Performance Monitor 2 41 WINS See Windows Internet Name Service worksheet cluster group definition A 2 group failover failback p...
