ClusterPack  

 

Index of Tutorial Sections 

Administrators Guide 

 

1.0 ClusterPack Install QuickStart 

 

1.1 ClusterPack General Overview 

 

1.2 Comprehensive Install Instructions 

 

1.3 Installation and Configuration of Optional Components 

 

1.4 Software Upgrades and Reinstalls 

 

1.5 Golden Image Tasks 

 

1.6 System Maintenance Tasks 

 

1.7 System Monitoring Tasks 

 

1.8 Workload Management Tasks 

 

1.9 System Troubleshooting Tasks 

 

Users Guide 

 

2.1 Job Management Tasks 

 

2.2 File Transfer Tasks 

 

2.3 Miscellaneous Tasks 

 

Tool Overview 

 

3.1 Cluster Management Utility Zone Overview 

 

3.2 ServiceControl Manager (SCM) Overview 

 

3.3 System Inventory Manager Overview 

 

3.4 Application ReStart (AppRS) Overview 

 

3.5 Cluster Management Utility (CMU) Overview 

 

3.6 NAT/IPFilter Overview 

 

3.7 Platform Computing Clusterware Pro V5.1 Overview 

 

3.8 Management Processor (MP) Card Interface Overview 

 

3.9 HP Systems Insight Manager (HPSIM) Overview 

 

Related Documents 

 

4.1 Related Documents 

 

Summary of Contents for 1032

Page 2: ...Copyright 1994-2004 Hewlett-Packard Company Dictionary of Cluster Terms ...

Page 3: ...Run mp_register on the Management Server Step Q12 Power up the Compute Nodes Step Q13 Run compute_config on the Management Server Step Q14 Run finalize_config on the Management Server 1 0 1 How Can I Get My HP UX Cluster Running If you have installed ClusterPack before follow the instructions in this section as a quick reminder You can refer to the detailed instructions for any given step via the ...

Page 4: ...e following software on each Compute Node z HP UX 11i Ignite UX z HP UX 11i V2 0 TCOE Allow the default choices to install ClusterPack requires a homogeneous operating system environment That is all Compute Nodes and the Management Server must have the same release of HP UX installed as well as the same operating environment The Management Server requires at least one LAN connection The manager mu...

Page 5: ...file system space on the Management Server Minimum requirements are listed below z var 4GB z opt 4GB z share 500MB Clusterware edition only For more information see the Comprehensive Instructions for this step References z Step 3 Allocate File System Space Back to Top Step Q4 Obtain a License File z Get the Host ID number of the Management Server z Contact Hewlett Packard Licensing Services to red...
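
The minimum sizes listed above lend themselves to a quick check before installation. The following is only a sketch: the function name `meets_minimum` is hypothetical, sizes are taken in MB, and 4GB is treated as 4096 MB. The size itself would come from a tool such as bdf or df on the Management Server.

```shell
# Hypothetical helper: compare a file system's size (in MB) with the
# minimums listed above (/var 4GB, /opt 4GB, /share 500MB).
meets_minimum() {
    fs=$1
    size_mb=$2
    case "$fs" in
        /var|/opt) min_mb=4096 ;;
        /share)    min_mb=500 ;;   # Clusterware edition only
        *)         echo "no minimum listed for $fs" >&2; return 2 ;;
    esac
    if [ "$size_mb" -ge "$min_mb" ]; then
        echo "$fs: ${size_mb} MB, OK (minimum ${min_mb} MB)"
    else
        echo "$fs: ${size_mb} MB, too small (minimum ${min_mb} MB)"
        return 1
    fi
}
```

For example, `meets_minimum /opt 2048` reports that /opt is too small and returns a non-zero status, which makes the helper usable in a pre-install script.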

Page 6: ...lusterPack on Compute Nodes for the first time DO NOT power up the systems ClusterPack will do that for you automatically If you do accidentally power the compute nodes DO NOT answer the HP UX boot questions For more information see the Comprehensive Instructions for this step References z Step 5 Prepare Hardware Access Back to Top Step Q6 Power Up the Management Server Perform a normal first boot...

Page 7: ...ep 7 Configure the ProCurve Switch Back to Top Step Q8 Copy the License Files to the Management Server Put the files in any convenient directory on the Management Server e g tmp For more information see the Comprehensive Instructions for this step References z Step 8 Copy the License Files to the Management Server Back to Top Step Q9 Install ClusterPack on the Management Server z Mount and registe...

Page 8: ...tory z Whether to configure SCM SysInvMgr or HP SIM software z The LSF admin password Clusterware edition only For more information see the Comprehensive Instructions for this step References z Step 10 Run manager_config on the Management Server Back to Top Step Q11 Run mp_register on the Management Server Provide the following information to the mp_register program about each Management Processor...

Page 9: ...more information see the Comprehensive Instructions for this step References z Step 12 Power up the Compute Nodes Back to Top Step Q13 Run compute_config on the Management Server The compute_config program will register the nodes with various programs For more information see the Comprehensive Instructions for this step References z Step 13 Run compute_config on the Management Server Back to Top S...

Page 10: ...repeat the installation process performing all steps in the order specified For more information see the Comprehensive Instructions for this step References z Step 14 Set up HyperFabric optional Back to Top ...

Page 11: ...benefits z horizontally scalable by adding more nodes z vertically scalable by using larger SMP nodes z fault isolation failure of a single Compute Node will not shutdown the entire cluster system z asymmetry mix and match of different nodes in a cluster z configuration flexibility nodes interconnect z re deployable nodes A compute cluster consists of Compute Nodes that incorporate multiple proces...

Page 12: ...t provide system computing resource and storage capability A ClusterPack cluster is built with HP Integrity servers 2 way or 4 way server platforms based on Intel Itanium 2 based processors and HP s zx1 chipset technologies The HP Integrity rx2600 server powered by Intel Itanium 2 based processors is the industry s first dual processor Itanium 2 based server The rx2600 dramatically improves price ...

Page 13: ...alth monitoring z cluster troubleshooting z cluster tuning z golden image creation and distribution z cluster reconfiguration z cluster system hardware and software inventory management z cluster server nodes consistency checking Distributed resource management z cluster resource scheduling z policy based queues and multiple queue management z job submission monitor and control z user specified jo...

Page 14: ...ed for the system administrators who will be responsible for the initial setup and continuing operation of the cluster The Administrators section of the tutorial covers a range of topics including installation and setup of the ClusterPack software on the cluster creating and managing golden images system maintenance tasks adding users to the cluster adding third party software to the cluster syste...

Page 15: ...f Optional Components z Section 1 4 Software Upgrades and Reinstalls z Section 1 5 Golden Image Tasks It is helpful prior to installation to review and be familiar with several additional sections of the tutorial This material does not need to be completely reviewed but should be read and available during the initial testing of the new cluster z Section 1 6 System Maintenance Tasks z Section 1 7 S...

Page 16: ...ing prerequisites are assumed z HP UX 11i V2 0 TCOE installed on the Management Server z HP UX 11i V2 0 TCOE installed on each Compute Node The following software components must be installed for all features of ClusterPack V2 4 to function effectively z HP UX 11i Ignite UX on the Management Server z HP UX 11i Ignite UX on each Compute Node Back to Top 1 1 5 System Requirements In order to install...

Page 18: ...ent Server Step 12 Power up the Compute Nodes Step 13 Run compute_config on the Management Server Step 14 Set up HyperFabric optional Step 15 Set up InfiniBand optional Step 16 Run finalize_config on the Management Server Step 17 Create a Golden Image of a Compute Node from the Management Server Step 18 Add nodes to the cluster that will receive the Golden Image Step 19 Distribute the Golden Image...

Page 19: ...etails section gives the exact commands you must enter Note The steps in this section have to be followed in the specified order to ensure that everything works correctly Please read all of the following steps BEFORE beginning the installation process Back to Top Step 1 Fill Out the ClusterPack Installation Worksheet Background ClusterPack simplifies the creation and administration of a cluster of...
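
Because the steps must run in the specified order, the ordering of the configuration tools can be sketched as a small driver that stops at the first failure. This is an illustration, not part of ClusterPack: `run_step`, `run_all`, and the `DRY_RUN` variable are hypothetical names, the tools themselves prompt for input interactively, and the hardware and licensing steps are not covered. The tool names and the /opt/clusterpack/bin location are the ones used in this guide.

```shell
#!/bin/sh
# Sketch: run the ClusterPack configuration tools in the documented order
# and stop at the first failure. DRY_RUN=1 (the default here) only prints
# the command that would be run.
CP_BIN=/opt/clusterpack/bin
DRY_RUN=${DRY_RUN:-1}

run_step() {
    echo "==> $CP_BIN/$1"
    if [ "$DRY_RUN" = 1 ]; then
        return 0
    fi
    "$CP_BIN/$1" || { echo "$1 failed; stopping" >&2; return 1; }
}

run_all() {
    # Order follows Steps 10 to 13 and the final finalize_config step.
    for tool in manager_config mp_register clbootnodes compute_config finalize_config; do
        run_step "$tool" || return 1
    done
}
```

Calling `run_all` with the default settings prints the five commands in order without executing anything. Setting `DRY_RUN=0` would run them for real on the Management Server.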

Page 20: ...You will be asked to set it Back to Top Step 2 Install Prerequisites Background ClusterPack works on HP Integrity Servers running HP UX In order to install ClusterPack you must have the Technical Computing Operating Environment TCOE version of HP UX installed You must also have the Ignite UX software which is used for installation Installing Ignite UX on the Compute Nodes makes it possible to create and...

Page 21: ...t Server and the Compute Nodes Or you can install Ignite UX after rebooting by the following method z Using the HP UX 11i V2 0 TCOE DVD mount and register the DVD as a software depot z Install the Ignite UX software on the Management Server using swinstall On the Management Server usr sbin swinstall s source_machine mnt dvdrom Ignite UX Note Allow the default choices to install Back to Top Step 3 ...

Page 22: ...lect the appropriate file system Should be fs0 but may be fs1 Shell fs0 5 Boot HP UX fs0 hpux 6 Interrupt auto boot 7 Boot to single user mode HPUX boot vmunix is 8 Determine the lvol of opt cat etc fstab 9 Look for the lvol that corresponds to opt 10 Extend the file system Use lvol from Step 2 lvextend L 4096 dev vg00 lvol4 May not be lvol4 umount dev vg00 lvol4 This should fail extendfs dev vg00...
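
Steps 8 and 9 above (reading /etc/fstab and finding the lvol that corresponds to /opt) can be done in one command. The helper below is a sketch with a hypothetical name, `lvol_for_mount`. It only reads the file and prints the device, which matters here because the lvol may not be lvol4.

```shell
# Hypothetical helper: print the device that an fstab file maps to a
# mount point. Comment lines are skipped.
# Usage: lvol_for_mount MOUNT_POINT [FSTAB_FILE]
lvol_for_mount() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $1 }' "${2:-/etc/fstab}"
}
```

Running `lvol_for_mount /opt` prints the logical volume to pass to the lvextend step. An empty result means /opt has no entry of its own in the file.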

Page 23: ...icates z If you purchased the ClusterPack Base Edition redeem the Base Edition license certificate z If you purchased the ClusterPack Clusterware Edition redeem the Base Edition certificate and the Clusterware edition certificate Note It may take up to 24 hours to receive license file Plan accordingly Details You will need to contact HP licensing to redeem your license certificates You can call E ...

Page 24: ...e the Management Processors manually by connecting a console to each card Note If you are installing ClusterPack on Compute Nodes for the first time DO NOT power up the systems ClusterPack will do that for you automatically If you do accidentally power the compute nodes DO NOT answer the HP UX boot questions Back to Top Step 6 Power Up the Management Server Background This is the first step in act...

Page 25: ...ect the manual option z Select the IP address field and enter the IP address to be used for the switch Back to Top Step 8 Copy the License Files to the Management Server Background Copy the license files onto the Management Server The license files can be placed in any convenient directory that is accessible to the Management Server During the invocation of the manager_config tool you will be aske...

Page 26: ...ry On the system with the DVD drive i e remote system 1 Mount the DVD mount dev dsk xxx mnt dvdrom 2 Edit the etc exports file DVDs must be mounted read only ro and if required can give root permission to other machines mounting the filesystem root machine_foo machine_bar machine_baz Add a line to etc exports mnt dvdrom ro root local_system 3 Export the file system using all the directives found i...

Page 27: ...for use as a software depot are z Insert DVD into the drive z Mount the DVD drive locally on that system z Register the depot on the DVD using swreg z Check the contents of the DVD using swlist These commands can only be executed as the super user i e root A DVD drive installed in the Management Server can be used for software installations If the Management Server does not include a DVD drive use o...

Page 28: ...te Nodes z Specify how many Compute Nodes are in the cluster and the starting IP address of the first Compute Node This information is used to assign names and IP addresses when Compute Nodes are brought up The first 5 characters of the Management Server s hostname are used for a base for the Compute Nodes For example if the starting IP address is 10 1 1 1 and there are 16 Compute Nodes and the na...
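
The naming scheme described above (a base taken from the first 5 characters of the Management Server's hostname, with IP addresses counted up from the starting address) can be illustrated with a short sketch. Several things here are assumptions made only for illustration: the function name `gen_nodes`, the three-digit suffix, and numbering from 001. The guide's own example is cut off at this point, so check the names that manager_config actually assigns.

```shell
# Hypothetical helper: list "IP name" pairs for COUNT Compute Nodes.
# Assumptions (not confirmed by the guide): three-digit suffix, counting
# from 001, and all addresses fitting in the last octet of the start IP.
# Usage: gen_nodes MGMT_HOSTNAME START_IP COUNT
gen_nodes() {
    base=$(printf '%s' "$1" | cut -c1-5)   # first 5 characters of hostname
    prefix=${2%.*}                         # e.g. 10.1.1
    last=${2##*.}                          # e.g. 1
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s.%d %s%03d\n' "$prefix" "$((last + i))" "$base" "$((i + 1))"
        i=$((i + 1))
    done
}
```

For instance, `gen_nodes luckyhost 10.1.1.1 16` lists sixteen consecutive addresses, each paired with a name built on the base `lucky`.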

Page 29: ...Whether to mount a home directory z The SCM admin password if SCM is configured z The LSF admin password Clusterware edition only Details This tool can be invoked in two ways based on your specific requirements z If you want manager_config to drive the allocation of hostnames and IP addresses of the Compute Nodes in the cluster based on some basic queries invoke opt clusterpack bin manager_config ...

Page 30: ...ou just need to press RETURN to assign those default values Back to Top Step 11 Run mp_register on the Management Server Background A Management Processor MP allows you to remotely monitor and control the state of a Compute Node configuring and registering the MP cards for each Compute Node clbootnodes can be used to automatically answer the first boot questions for each Compute Node ...

Page 31: ...d and shut down the node or gain root access through the console The configuration step configures the MP for telnet or web access only to make future modifications such as adding users simpler to perform mp_register will add each MP and associated IP address to the etc hosts file on the Management Server This file will later get propagated to the Compute Nodes Each MP is assigned a name during th...

Page 32: ...bootnodes clbootnodes will gain console access by using telnet to reach the MP clbootnodes uses a library called Expect to produce the input needed to gain access to the console and step through the boot processes There are times when human intervention is necessary In these cases a message will be displayed explaining why control is being returned to the user The user can then interact with the MP co...

Page 33: ...odes that have a connected MP that you specified in the previous step It will answer the first boot questions for all nodes automatically Provide the following information to the clbootnodes program z Language to use z Host name z Time and time zone settings z Network configuration z Root password Details To run clbootnodes use the following command opt clusterpack bin clbootnodes Before booting the n...

Page 34: ...ded using a The IP address of each Compute Node known by the Management Server makes up the Cluster Network ClusterPack includes a utility to configure additional networks on all of the Compute Nodes These networks like the Cluster Network refer to a logical collection of interfaces IP addresses and not to a physical network However they must share a common netmask The concept of a network is defi...

Page 35: ...ty using clnetworks The network entity must be assigned an extension that forms the aliases to use for the HyperFabric interfaces Use these names when you want to explicitly communicate over the HyperFabric network For example if node002 has a HyperFabric interface with the extension hyp ftp through this network can be achieved using usr bin ftp node002 hyp Notice that this command will only work from ...

Page 36: ... on the management server usr sbin swcopy x enforce_dependencies false s IB driver source var opt clusterpack depot At the end of compute_config if the IB drivers are found in var opt clusterpack depot an option to install the IB drivers on the compute nodes will be given If you choose to install the IB drivers on the compute nodes a second option will be presented The IB drivers can be installed on ...

Page 37: ...her temporary files network directories and host specific configuration files are not included A system image may be referred to as a golden image or a recovery image The different names used to refer to the image reflect the different reasons for creating it Administrators may create a recovery image of a node in the event that the node experiences hardware failure or the file system is accidentally r...

Page 38: ...te Node should be opened for accepting Clusterware job badmin hopen hostname Back to Top Step 18 Add nodes to the cluster that will receive the Golden Image Background This command adds the new node with the specified host name and IP address to the cluster It also reconfigures all of the components of ClusterPack to accommodate the newly added node Details Invoke opt clusterpack bin manager_confi...

Page 39: ... hostname all The keyword all can be used to distribute the image to all of the Compute Nodes in the cluster or a single hostname can be specified sysimage_distribute will reboot each Compute Node for installation with the specified image Back to Top Step 20 Install and Configure the remaining Compute Nodes Background This tool is the driver that installs and configures appropriate components on eve...

Page 40: ...ion checks on the Cluster Management Software and validates the installation It prints out diagnostic error messages if the installation is not successful Overview This program completes the installation and configuration process verifies the Cluster Management Software and validates the installation If it reports diagnostic error messages repeat the installation process performing all steps in the or...

Page 41: ... the cluster nodes Network Address Translation allows communications from inside the cluster to get out without allowing connections from outside to get in NAT rewrites the IP headers of internal packets going out making it appear that they all came from a single IP address which is the external IP address of the entire cluster Reply packets coming back are translated back and forwarded to the app...

Page 42: ...abilities One of the features that it supports is Network Address Translation For your information on HP UX IPFilter please refer to the HP UX IPFilter manual and release notes at docs hp com http docs hp com hpux internet index html IPFilter 9000 For information on NAT features of HP UX IPFilter refer to the public domain how to document No guarantee can be made about the correctness completeness...

Page 43: ...will walk through the steps of setting up HP UX IPFilter pass through all of the packet For more complicated filtering rules please refer to the HP UX IPFilter documentation z Create a file with pass through rules cat EOF tmp filter rules pass in all pass out all EOF cat tmp filter rules pass in all pass out all To create more complicated rules please refer to the HP UX IPFilter documentation http...

Page 44: ...60000 map lan0 192 168 0 0 24 15 99 84 23 32 EOF cat tmp nat rules lan0 interface to the external network NAT IP interface 15 99 84 23 map lan0 192 168 0 0 24 15 99 84 23 32 portmap tcp udp 40000 60000 map lan0 192 168 0 0 24 15 99 84 23 32 Example 2 Map packets from specific Compute Nodes 192 168 0 3 and 192 168 0 4 to a single IP address 15 99 84 23 cat EOF tmp nat rules lan0 interface to the ex...
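
The two rules in Example 1 follow a fixed pattern, so they can be generated for any interface and address pair. The sketch below uses a hypothetical function name, `nat_rules`. The punctuation (`/`, `->`, `:`) is written the way IPFilter's NAT rule syntax expects it, since the flattened text above has lost it. Compare the output against the HP-UX IPFilter documentation before loading it.

```shell
# Hypothetical helper: print the pair of NAT "map" rules from Example 1.
# Usage: nat_rules INTERFACE INTERNAL_NETWORK EXTERNAL_IP
nat_rules() {
    ifname=$1
    internal=$2
    external=$3
    # TCP and UDP traffic is remapped onto ports 40000 to 60000.
    echo "map $ifname $internal -> $external/32 portmap tcp/udp 40000:60000"
    # Everything else (for example ICMP) is mapped without a port range.
    echo "map $ifname $internal -> $external/32"
}
```

With the values from the example, `nat_rules lan0 192.168.0.0/24 15.99.84.23` prints the two lines that the example writes to its NAT rules file.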

Page 45: ...node This will normally be done automatically by compute_config Example In this example lan1 is the private subnet of the Compute Nodes and the Management Server lan1 interface is 192 168 0 1 The following steps should be performed to configure the routing tables in each Compute Node z On each Compute Node issue the command usr sbin route add default 192 168 0 1 1 z On each Compute Node add or mod...

Page 46: ... Server as the home directory for the cluster If an alternate mount point is used it is necessary to perform the following steps before starting the Invoke opt clusterpack bin manager_config on Management Server step z If it is not already set up configure the file server to export the directory you intend to mount as home z Connect the file server to the ProCurve 5308xl switch The file server s connec...

Page 47: ...nt the head node s and the remaining Compute Nodes z Use compute_config to configure the additional network cards to allow the head node s to be accessible outside of the cluster Assign the available network cards publicly accessible IP addresses as appropriate to your local networking configuration Back to Top 1 3 4 Set up TCP CONTROL ClusterPack delivers a package to allow some control of TCP se...

Page 48: ...ome users It is configured by default to allow access to root and lsfadmin ALL root ALL ALL lsfadmin ALL Although the hosts deny file disallows all access the entries in hosts allow override the settings of hosts deny The hosts deny file also does not prevent users from accessing telnet and remsh between Compute Nodes This allows MPI based applications to run when submitted to a ClusterWare Pro queue M...

Page 49: ...verview Overview It is very important to read this entire section before beginning the upgrade or reinstallation process As with the installation ClusterPack uses a three stage process for reinstalling and configuring a ClusterPack managed cluster z Installation and configuration of the Management Server z Installation and configuration of the Compute Nodes z Verification of the Management Server...

Page 50: ...P Integrity server with HP UX 11i Version 2 0 TCOE z Compute Nodes HP Integrity servers with HP UX 11i Version 2 0 TCOE z Cluster Management Software ClusterPack V2 4 The following prerequisites are assumed z HP UX 11i v2 0 TCOE is installed on the Management Server z HP UX 11i v2 0 TCOE is installed on each Compute Node z HP UX 11i Ignite UX on the Management Server z HP UX 11i Ignite UX on each ...

Page 51: ...red Upgrading from Base Edition to Clusterware Edition If you are upgrading from Base Edition to Clusterware Edition you will need to redeem your Clusterware Edition license certificate using the instructions in 1 2 3 Pre Install Checklist You can reuse the ClusterPack license file and specify a location for the Clusterware license file Increasing the size of an existing cluster If you are perform...

Page 52: ...p 3 Invoke opt clusterpack bin compute_config on Management Server This tool is the driver that installs and configures appropriate components on every Compute Node It is invoked with the force install option F as follows opt clusterpack bin compute_config F Back to Top Reinstall Step 4 Invoke opt clusterpack bin finalize_config on Management Server Finalize and validate the installation and confi...

Page 53: ...ackup utilities swinstall s depot_with_V2 4 CPACK BACKUP z Take a backup of the cluster information opt clusterpack bin clbackup f backup_file_name z Copy the backup file to another system for safe keeping z Remove the TCP wrappers on your Compute Nodes clsh usr bin perl p i e s usr lbin tcpd etc inetd conf z Remove the Compute Nodes from the Systems Inventory Manager database opt sysinvmgr bin si...

Page 54: ...LSF queues if in use should be empty of all jobs and the nodes should be idle Instructions for upgrading from V2 3 to V2 4 z Backup the cluster user level data z Install the V2 4 backup utilities swinstall s depot_with_V2 4 CPACK BACKUP z Take a backup of the cluster information opt clusterpack bin clbackup f backup_file_name z Copy the backup file to another system for safe keeping z Install the ...

Page 55: ...z Verify that everything is working as expected opt clusterpack bin finalize_config Back to Top ...

Page 56: ...on files are not included A system image may be referred to as a golden image or a recovery image The different names used to refer to the image reflect the different reasons for creating it Administrators may create a recovery image of a node in the event that the node experiences hardware failure or the file system is accidentally removed or corrupted Administrators may also create a golden imag...

Page 57: ...UX is installed on the system swlist l product Ignite UX If it is not you will need to obtain and install this product first http software hp com Read the man pages for make_sys_image 1m to find out more about creating system images The user can control what files are included in an image through the use of the l g and f arguments to make_sys_image See the man pages for make_sys_image 1m for more ...

Page 58: ...reboot each Compute Node for installation with the specified image If the image was sent to a node that was already part of the cluster that node must have the Compute Node software reconfigured For more information see the Software Upgrades and Reinstalls section compute_config a node name If the image was sent to a node that will be added to the cluster please see the Add Node s to the Cluster u...

Page 59: ...les from that top level directory into the corresponding location in the root file system on the machine If an existing file on the compute node would be overwritten that file will be moved to var opt clusterpack sysfiles save to preserve the file If the CPACK FILES bundle is unconfigured the original files will be restored to their original location clsysfile can be invoked with no options The f...

Page 60: ... SD can be associated with a Golden Image and will be installed on the compute nodes following an installation with that image The software bundles should be swcopy d to var opt clusterpack depot A list of all the bundles that are available in the depot can be found using e usr sbin swlist l bundle var opt clusterpack depot The bundles are associated with an image using the sysimage_register comman...

Page 61: ...of Nodes 1 6 12 Execute remote commands on one or more nodes 1 6 13 Copy files within nodes in a cluster 1 6 14 List a user s process status on one or more cluster nodes 1 6 15 Kill a user s process or all of the user s processes on some all Cluster Nodes 1 6 16 Create a Cluster Group 1 6 17 Remove a Cluster Group 1 6 18 Add Nodes to a Cluster Group 1 6 19 Remove Nodes from a Cluster Group 1 6 20 ...

Page 62: ...ager_config Step 2 Invoke mp_register on Management Server If the host being added to the cluster has an MP interface it should be registered and possibly configured with mp_register opt clusterpack bin mp_register a new_node_name The a option can be repeated when adding multiple hosts at one time The mp_register utility will prompt you for information to configure and or register an MP card for t...

Page 63: ...nfigures appropriate components on every Compute Node It is invoked with the add node option a as follows opt clusterpack bin compute_config a new_node_name This command configures the new node with the specified hostname to serve as a Compute Node in the cluster The a option can be repeated if more than one node needs to be added to the system For more information on the usage of compute_config r...

Page 64: ...be removed from the system For more information on the usage of manager_config refer to the man pages man manager_config Step 2 Invoke opt clusterpack bin compute_config on Management Server This tool is the driver that installs and configures appropriate components on every Compute Node It is invoked with the remove node option r as follows opt clusterpack bin compute_config r node_name The r opt...

Page 65: ...Distributor and then click on Install Software z Select the node s and or node group to install on z This will bring up the swinstall GUI from which you can specify the software source and select the software to be installed References z 3 9 4 How to run HPSIM Web based GUI Using the SCM GUI To add additional software to Compute Nodes using SCM GUI do the following z Under Tools select Software Ma...

Page 66: ...Remove Software z Select the node s and or node group to install on z This will bring up the swinstall GUI from which you can specify the software source and select the software to be installed References z 3 9 4 How to run HPSIM Web based GUI Using the SCM GUI To remove software from Compute Nodes using SCM GUI do the following z Under Tools select Software Management and then double click on Unins...

Page 67: ...eradd 1M for more information useradd Use ypmake to push the new user s account information to the Compute Nodes var yp ypmake Using the HPSIM GUI To add users to the cluster do the following z Select Configure HP UX Configuration and then double click on Accounts for Users and Groups z Select the node s and or node group to install on z This will bring up the user account GUI where you can specif...

Page 68: ...ser use ypmake to push this change to the Compute Nodes var yp ypmake Using the HPSIM GUI To remove users from the cluster do the following z Select Configure HP UX Configuration and then double click on Accounts for Users and Groups z Select the node s and or node group to install on z This will bring up the user account GUI where you can specify the user account to remove References z 3 9 4 How ...

Page 69: ...fy the parameters to change References z 3 9 4 How to run HPSIM Web based GUI z 1 5 1 Create a Golden Image of a Compute Node from the Management Server z 1 5 2 Distribute Golden Image to a set of Compute Nodes Using the SCM GUI z Select one or more nodes z Under Tools select System Administration and then click on System Properties z A SAM System Properties window will appear for each node select...

Page 70: ...erences z 3 9 4 How to run HPSIM Web based GUI Using the SCM GUI To define Compute Node inventories for consistency checks use the SCM GUI to access the Systems Inventory Manager GUI z Select one or more nodes z Under Tools select System Inventory and then click SysInvMgr portal z This launches the Systems Inventory Manager GUI z Using the Systems Inventory Manager GUI Log in as admin Select the F...

Page 71: ...xecute the task Click Schedule to schedule when the task should run Click Run Now to run the task now References z 3 9 4 How to run HPSIM Web based GUI Using the SCM GUI To define Compute Node inventories for consistency checks use the SCM GUI to access the Systems Inventory Manager GUI z Select one or more nodes z Under Tools select System Inventory and then click SysInvMgr portal z This launches...

Page 72: ...based GUI Using the SCM GUI To define Compute Node inventories for consistency checks use the SCM GUI to access the Systems Inventory Manager GUI z Select one or more nodes z Under Tools select System Inventory and then click SysInvMgr portal z This launches the Systems Inventory Manager GUI z Using the Systems Inventory Manager GUI Log in as admin Select the Filter folder Click Create Filter Ente...

Page 73: ...oup sub1 clsh C sub1 uname a z Invoke uname a on node1 and node3 clsh C node1 node3 uname a For more details on the usage of clsh invoke the command man clsh Back to Top 1 6 13 Copy files within nodes in a cluster The clcp command in opt clusterpack bin is used to copy files between cluster nodes Each file or directory argument is either a remote file name of the form h path or cluster path or a l...

Page 74: ... command in opt clusterpack bin is used to produce a ps output that includes the host name A clps command with no arguments lists all the processes associated with the user invoking the command on all Compute Nodes Some examples of clps usage are z List all processes belonging to user joeuser clps u joeuser z List all processes on node3 and node4 clps C node3 node4 a For more details on the usage ...

Page 75: ...n y z Kill a process with PID 2260 on node1 clkill C node1 p 2260 For more details on the usage of clkill invoke the command Back to Top 1 6 16 Create a Cluster Group Groups of Compute Nodes can be created and added to all tools in ClusterPack using opt clusterpack bin clgroup The following example creates a node group cae containing compute cluster nodes lucky000 lucky001 and lucky002 opt cluster...

Page 76: ...in clgroup The following example adds nodes lucky006 and lucky008 to the node group cae opt clusterpack bin clgroup a cae lucky006 lucky008 Groups can also be created or extended using the name of a pre existing group For more details on the usage of clgroup invoke the command man clgroup Back to Top 1 6 19 Remove Nodes from a Cluster Group Compute Nodes can be removed from existing groups in Clus...

Page 77: ...ems can be done in a similar fashion to adding file systems. See Add File Systems to Compute Nodes. From SAM, select the file system you want to remove and select Actions, then Remove. Do this for each node in the cluster. References: • 1.6.20 Add File Systems to Compute Nodes. 1.6.22 How is the ClusterPack license server managed? ClusterPack Base Edition: The ClusterPack Base Edition license ser...

Page 78: ... fully functional Base Edition license manager. All Base Edition license server functions should be used to manage that portion of the license server. Platform Computing's Clusterware Pro V5.1 uses a proprietary licensing scheme. For more information on managing the Clusterware Pro license functionality, see the Platform Computing Clusterware Pro V5.1 Overview. References: • 3.7.5 How do I start ...

Page 79: ...onfig Finalize_config performs a series of tests to determine the overall health of the individual components of the cluster that have been automatically setup and administered by ClusterPack Finalize_config can be run repeatedly without side effects The health of the cluster for accepting and running jobs can also be determined using tools provided as part of Clusterware Pro Using the Clusterware...

Page 80: ... ok status. A more detailed list of STATUS is available in the long report: bhosts -l or bhosts -l hostname. The lsload command provides an instantaneous view of the load state of the Compute Nodes: lsload. A more detailed list of the load information is available in the long report: lsload -l or lsload -l hostname. Common Terms: Both the Web interface and the CLI use the same terms for the health and status ...

Page 81: ... access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 1 7 2 Get an Overview of the Job Queue Status Using the Clusterware Pro V5 1 Web Interface Select the Queues tab An overview of available job queues is displayed The following details are displayed z State The state of the queue Any queue with an Open Active state can ...

Page 82: ...he Clusterware Pro V5 1 Command Line Interface Back to Top 1 7 3 Get details on health of specific Compute Nodes Using the Clusterware Pro V5 1 Web Interface The Hosts Tab located on the left hand side of the screen contains a table showing information about your hosts resources The Detailed View shows the current Stage and Batch State The Detailed View is accessed by selecting View Details There ...

Page 83: ... on it have been suspended It has been locked by the administrator z closed_Busy The host is not accepting new jobs Some load indices have exceeded their thresholds z closed_Excl The host is not accepting jobs until the exclusive job running on it completes z closed_Full The host is not accepting new jobs The configured maximum number of jobs that can run on it has been reached z closed_Wind The h...

Page 84: ...al minutes before the graphs show any information for a given node. Using the Clusterware Pro V5.1 CLI: Resources available for job scheduling can be seen using the following command: bhosts. This will display a report for all the Compute Nodes in the cluster. To get the resource usage for an individual Compute Node, specify the name of the node on the command line: bhosts -l hostname. For more information...

Page 85: ...ing the lshosts command a resource can be specified Only hosts that meet the resource requirement will be displayed lshosts R res_req hostname For example to find all the hosts with at least 4096MB of available memory lshosts R mem 4096 Membership in logical groups defined with the clgroup command can also be given as a resource lshosts R group_name For a full list of currently defined resources u...

Page 87: ...e 1.8.13 Resume a suspended job in a queue. 1.8.14 Resume all suspended jobs owned by a user. 1.8.15 Resume all suspended jobs in a queue. 1.8.1 Add new Job Submission Queues: A new queue can be added to the cluster by editing the file /share/platform/clusterware/conf/lsbatch/<clustername>/configdir/lsb.queues. The name of your cluster can be determined by using the Clusterware Pro V5.1 CLI: lsid. This abov...
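
In LSF, a queue definition in lsb.queues is a stanza delimited by Begin Queue and End Queue. The following is a minimal sketch that writes such a stanza to a scratch file for review instead of touching the live configuration. The queue name cae_short, the priority value, and the description are illustrative assumptions, not values taken from this guide.

```shell
#!/bin/sh
# Sketch only: write a minimal LSF queue stanza to a scratch file for review.
# The queue name, priority, and description are hypothetical placeholders.
write_queue_stanza() {
    name=$1      # queue name to define
    priority=$2  # relative queue priority (integer)
    out=$3       # file that receives the stanza
    cat > "$out" <<EOF
Begin Queue
QUEUE_NAME   = $name
PRIORITY     = $priority
DESCRIPTION  = Example queue added for illustration
End Queue
EOF
}

# Write the stanza to a scratch file and show it.
scratch=${TMPDIR:-/tmp}/lsb.queues.example.$$
write_queue_stanza cae_short 30 "$scratch"
cat "$scratch"
```

Once reviewed, a stanza like this would be added to the cluster's real lsb.queues file by hand, and LSF would then need to be reconfigured before the queue becomes visible.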

Page 88: ...your cluster can be determined by using the Clusterware Pro V5.1 CLI: lsid. Before removing a queue, it should be closed using the Clusterware Pro V5.1 CLI: badmin qclose queue_name. Jobs still executing can be killed or allowed to run to completion before removing the queue. Delete or comment out the queue definitions that you want to remove. After adding, removing, or modifying queues, it is necessary to ...
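
The queue removal sequence can be sketched as a small dry-run script. By default it only prints the LSF commands in order, so it can be read or tested without a cluster; setting DRY_RUN=0 would execute them. The queue name cae_short is a hypothetical example, and the final bqueues call is one assumed way to confirm that the queue no longer appears.

```shell
#!/bin/sh
# Sketch only: print (or, with DRY_RUN=0, execute) the steps for removing a queue.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # dry run: show the command instead of running it
    else
        "$@"
    fi
}

remove_queue_steps() {
    queue=$1
    run badmin qclose "$queue"   # close the queue to new submissions
    # Manual step: delete or comment out the queue's stanza in lsb.queues here.
    run badmin reconfig          # make LSF re-read the queue definitions
    run bqueues                  # list the remaining queues
}

remove_queue_steps cae_short     # hypothetical queue name
```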

Page 89: ...f userids to the queue definition. After adding, removing, or modifying queues, it is necessary to reconfigure LSF to read the new queue information. This is done from the Management Server using the Clusterware Pro V5.1 CLI: badmin reconfig. Verify the queue has been modified by using the Clusterware Pro V5.1 CLI: bqueues -l queue_name. References: • 1.8.1 Add new Job Submission Queues • 3.7.9 How do I acc...

Page 90: ...itted to that queue will only run on nodes that are members of that group. After adding, removing, or modifying queues, it is necessary to reconfigure LSF to read the new queue information. This is done from the Management Server using the Clusterware Pro V5.1 CLI: badmin reconfig. Verify the queue has been modified by using the Clusterware Pro V5.1 CLI: bqueues -l queue_name. References: • 3.7.9 How do I ac...

Page 91: ...ueues controls the pre and post commands associated with each queue The name of your cluster can be determined by using the Clusterware Pro V5 1 CLI lsid Pre execution commands are executed before a job is run from the queue Post execution commands are executed when a job successfully completes execution from the queue This can be useful for acquiring and releasing special resources such as access...

Page 92: ...e Pro V5.1 CLI: bqueues -l queue_name. References: • 1.8.1 Add new Job Submission Queues. 1.8.7 Kill a job in a queue. Using the Clusterware Pro V5.1 CLI: Jobs can be killed using the bkill command: bkill jobid. Users can kill their own jobs. Queue administrators can kill jobs associated with a particular queue. References: • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface ...

Page 93: ...mmand with the -q option: bkill -q queue_name -u all 0. Users can kill their own jobs. Queue administrators can kill jobs associated with a particular queue. References: • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 1.8.10 Suspend a job in a queue. Using the Clusterware Pro V5.1 CLI: bstop jobid. Users can suspend their own jobs. Queue administrators can suspend jobs asso...

Page 94: ...ommand Line Interface. 1.8.12 Suspend all jobs in a queue. Using the Clusterware Pro V5.1 CLI: All of the jobs in a queue can be suspended by a queue administrator using the special 0 job ID: bstop -q queue_name -u all 0. References: • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 1.8.13 Resume a suspended job in a queue. Using the Clusterware Pro V5.1 CLI: br...

Page 95: ...o V5.1 CLI by using the special 0 job ID: bresume -u userid 0. Users can resume their own jobs. Queue administrators can resume jobs associated with a particular queue. References: • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 1.8.15 Resume all suspended jobs in a queue. Using the Clusterware Pro V5.1 CLI: All of the jobs in a queue can be resumed by a queue administr...
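
The kill, suspend, and resume operations on a whole queue share one pattern: the -q and -u all options plus the special job ID 0. A minimal sketch of that pattern follows. It only prints the command line, so nothing is sent to the scheduler; the function name queue_jobs_cmd and the queue name night are hypothetical.

```shell
#!/bin/sh
# Sketch only: build the LSF command that acts on every job in a queue.
# Job ID 0 means "all jobs that match the -q and -u selections".
queue_jobs_cmd() {
    action=$1   # kill, suspend, or resume
    queue=$2    # name of the queue to act on
    case "$action" in
        kill)    tool=bkill ;;
        suspend) tool=bstop ;;
        resume)  tool=bresume ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    echo "$tool -q $queue -u all 0"
}

queue_jobs_cmd suspend night   # hypothetical queue name
```

Printing the command first gives a queue administrator a chance to check the queue name before every job in it is affected.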

Page 96: ...y node that shows a state of unavail or unreach is potentially down and should be checked by a system administrator In order to determine the state of nodes on the cluster the tools should be used Using the Clusterware Pro V5 1 Web Interface The default hosts view is a table showing information about your hosts resources The default view is accessed from View Details There are two different indica...
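
As a rough illustration of the check described here, the filter below reads bhosts output on standard input and prints the hosts whose status is unavail or unreach. It assumes the usual bhosts layout, with a header line first and the host name and status in the first two columns; the function name down_hosts is hypothetical.

```shell
#!/bin/sh
# Sketch only: list hosts that bhosts reports as unavail or unreach.
# Assumes line 1 is the header and columns 1 and 2 are HOST_NAME and STATUS.
down_hosts() {
    awk 'NR > 1 && ($2 == "unavail" || $2 == "unreach") { print $1 }'
}

# Typical use on the Management Server, with the LSF environment set up:
#   bhosts | down_hosts
```

Any host printed by this filter is a candidate for a closer look by a system administrator, not proof that the node is down.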

Page 97: ...e MP interface to view any diagnostic messages from the Compute Node References z 3 7 1 What is Clusterware Pro Back to Top 1 9 3 Bring up a Compute Node with a recovery image Recovery images created with opt clusterpack bin sysimage_create are stored in var opt ignite archives hostname where hostname is the name of the node from which the image was taken The images are stored in files based on th...

Page 98: ... logs for ClusterPack are stored in var opt clusterpack log Back to Top 1 9 5 Bring up the Management Server from a crash After a crash the Management Server state can be checked by running opt clusterpack bin finalize_config Back to Top 1 9 6 Troubleshoot SCM problems There are two common problems that are discussed here For any additional troubleshooting help please see z Planning installing and...

Page 99: ...0 15 42 17 1 08 opt mx lbin mxagent root 23334 1 0 15 42 17 0 59 opt mx lbin mxrmi root 24269 24252 1 01 30 51 pts 0 0 00 grep mx z If AgentConfig is installed and running uninstall it and then reinstall it usr sbin swremove AgentConfig z To install AgentConfig type usr sbin swinstall s CMS var opt mx depot11 AgentConfig z where CMS is the hostname of the Management Server Problem scmgr prints out...

Page 100: ...e Add the new node into groups as appropriate using clgroups Replacing with the same hostname and IP address If the hostname and IP Address from the failed node will be assigned to the replacement node do NOT remove the failed node from the cluster using the r option This will remove the node from any groups that have been setup and it will remove any automated Systems Inventory Manager informatio...

Page 102: ... a submitted job 2 1 11 Kill a submitted job in a queue 2 1 12 Kill all jobs submitted by the user 2 1 13 Kill all jobs submitted by the user in a queue 2 1 14 Suspend a submitted job in a queue 2 1 15 Suspend all jobs submitted by the user 2 1 16 Suspend all jobs submitted by the user in a queue 2 1 17 Resume a suspended job in a queue 2 1 18 Resume all suspended jobs submitted by the user 2 1 19...

Page 103: ...erences z 3 7 6 How do I start and stop the Clusterware Pro V5 1 Web GUI z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface Back to Top 2 1 2 Invoke the Workload Management Interface from the intranet Using the Clusterware Pro V5 1 Web Interface z Go to the following URL in a web browser http management_server 8080 Platform login Login jsp z Enter your Unix user name and password This ...

Page 104: ...specific Compute Nodes in the cluster for more information on using the -f option to transfer files within the cluster. Jobs may be submitted to a Group of Compute Nodes, if the group was created using the clgroup tool, by specifying a resource requirement of the group name: bsub -R group_name command arguments. See the bsub(1) man page for complete syntax: man 1 bsub. References: • 2.2.3 Transfer a file from in...

Page 105: ...ware Pro V5 1 Web Interface From the Jobs tab z Select Job Submit z Enter relevant Job information z Select the Resources tab Enter the group name in the Resource Requirement string field Using the Clusterware Pro V5 1 CLI bsub R group_name cmd Use clinfo to list the current groups and their membership clinfo References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do ...

Page 106: ...n be found on the Queue Tab. Using the Clusterware Pro V5.1 CLI: bmod -sp priority job_ID and bswitch destination_queue job_ID. References: • 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 2.1.7 Check the status of a submitted job. Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab • Select Tools Fin...

Page 107: ...Previous and Next buttons to view more jobs. Using the Clusterware Pro V5.1 CLI: bjobs or bjobs -l. References: • 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 2.1.9 Examine data files during a job run. Using the Clusterware Pro V5.1 CLI: bpeek job_ID. References: • 3.7.9 How do I access the Clusterware Pro V5.1...

Page 108: ...ess the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 1 11 Kill a submitted job in a queue Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Select the job from the Jobs table z Select Jobs Kill Using the Clusterware Pro V5 1 CLI bkill job_ID References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Int...

Page 109: ...terware Pro V5 1 Command Line Interface Back to Top 2 1 13 Kill all jobs submitted by the user in a queue Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Select Tools Find z Select the Advanced tab z Select User from the Field list in the Define Criteria section z Type the user name in the Value field z Click to add to the list z Select Queue from the Field list z Select the queue...

Page 110: ... the Clusterware Pro V5 1 CLI bstop job_ID References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 1 15 Suspend all jobs submitted by the user Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Select Tools Find z Select User from the Field list z Type the user name in the Value fi...

Page 111: ...ion z Type the user name in the Value field z Click z Select Queue from the Field list z Select the queue from the Queue list z Click z Click Find z Click Select All z Click Suspend Using the Clusterware Pro V5 1 CLI bstop u username q queuename 0 References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back t...

Page 112: ... tab z Select User from the Field list in the Define Criteria section z Type the user name in the Value field z Click z Select State from the Field list z Select Suspend from the State list z Click z Click Find z Click Select All z Click Resume Using the Clusterware Pro V5 1 CLI bresume u username 0 References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access t...

Page 113: ...• 3.7.8 How do I access the Clusterware Pro V5.1 Web Interface • 3.7.9 How do I access the Clusterware Pro V5.1 Command Line Interface. 2.1.20 Submit an MPI job in a queue. Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab • Select Job Submit • Enter the number of processors required in the Max Processors field • Complete job data • Click Submit. Using the Clusterware Pro V5.1...

Page 114: ...w do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 1 22 Resume a suspended MPI job Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Select the suspended job from the Jobs table z Select Job Resume Using the Clusterware Pro V5 1 CLI bresume job_ID References z 3 7 8 How do I access the Clusterwar...

Page 116: ...n the cluster 2 2 6 Transfer a file from a node to a set of nodes in the cluster 2 2 1 Transfer a file from intranet to the Management Server in the cluster Using the Clusterware Pro V5 1 Web Interface By default all files transferred using the Web interface will be placed in share platform clusterware tomcat webapps Clusterware users userid From the Jobs tab z Tools Upload Download Files z Comple...

Page 117: ...Compute Nodes in the cluster If the cluster is a guarded cluster this operation must be done in two steps z First FTP the file to the Head node Management Server z Second distribute the file to specific nodes There are two methods that can be used 1 Use Clusterware Pro V5 1 CLI to distribute the file to the specific nodes that need the file bsub f local_file op remote_file Where op is an operator ...

Page 118: ...re the job starts Overwrites the remote file if it exists Then copies the remote file to the local file after the job completes Overwrites the local file bsub f local_file remote_file 2 Copy the file to specific nodes in the cluster using clcp clcp C node1 node3 a input data h date input data For more details on the usage of clcp invoke the command man clcp References z 3 7 9 How do I access the C...
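
The file-staging form of bsub can be sketched as follows. The operator characters did not survive in this copy of the text, so the sketch assumes the standard LSF meanings: ">" copies the local file to the remote file before the job starts, "<" copies the remote file back after the job completes, and "<>" does both. The function name, file names, and job command are hypothetical, and the script only prints the command line.

```shell
#!/bin/sh
# Sketch only: print a bsub command line that stages a file with -f.
# Operator meanings are assumed from standard LSF usage:
#   >   copy local to remote before the job starts
#   <   copy remote to local after the job completes
#   <>  both of the above
bsub_stage_cmd() {
    case "$1" in
        in)   op='>' ;;
        out)  op='<' ;;
        both) op='<>' ;;
        *) echo "usage: bsub_stage_cmd in|out|both local remote command" >&2
           return 1 ;;
    esac
    printf 'bsub -f "%s %s %s" %s\n' "$2" "$op" "$3" "$4"
}

bsub_stage_cmd in input.data /tmp/input.data ./solver   # hypothetical names
```

The quotes around the -f argument matter: without them the shell would treat ">" and "<" as redirections rather than passing them to bsub.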

Page 119: ...luster The clcp command in opt clusterpack bin is used to copy files between cluster nodes Each file or directory argument is either a remote file name of the form h path or cluster path or a local file name containing no characters Some examples of clcp usage are z Update etc checklist on all nodes with the local etc checklist clcp etc checklist h etc checklist clcp etc checklist cluster etc chec...

Page 120: ...For more details on the usage of clcp, invoke the command: man clcp ...

Page 121: ...ion fails to complete 2 3 8 Check impact on the job if a Compute Node crashes 2 3 9 Get a high level view of the status of the Compute Nodes 2 3 1 Run a tool on a set of Compute Nodes A set of multi system aware tools has been provided for use on the cluster To execute a command on multiple hosts follow the examples below z To run a tool on all the Compute Nodes clsh script z To run a tool on host...

Page 122: ... Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 3 2 Check resource usage on a Compute Node Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Select the job from the Jobs table z Select Jobs Monitor z Review the charts Using the Clusterware Pro V5 1 CLI lsload l host_name References z 3 7 8 How do I access the ...

Page 123: ...job completes As long as the application only generates files within its execution directory there is no need for the user to remove temporary files generated by an application In the event AppRS restarts an application on a new set of nodes the original working directories and files created before the migration are not removed This is done in order to be as careful as possible about avoiding data...

Page 124: ...d to the job becomes unavailable or unreachable by the other hosts while the job is executing z The job is explicitly migrated using the LSF command bmig z The user s job exits with exit code 3 For more information on exit values see the HP Application ReStart User s Guide As long as an application can generate restart files and be restarted from those files AppRS will ensure that files marked as ...

Page 125: ...atively the toolset can be used to trigger checkpointing by your application. Using the Clusterware Pro V5.1 Web Interface: From the Jobs tab • Select Jobs Submit • Enter job information • Click Advanced • On the Advanced dialog: Select Checkpoint. Specify a checkpoint period in the every minutes field. Specify a checkpoint directory in the directory field • On the Advanced dialog enter script details...

Page 126: ...he Jobs tab z Review the job states in the Jobs table z Use the Previous and Next buttons to view more Jobs Using the Clusterware Pro V5 1 CLI bjobs job_ID References z 3 7 8 How do I access the Clusterware Pro V5 1 Web Interface z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 3 8 Check impact on the job if a Compute Node crashes In the event that a Compute No...

Page 127: ... jobid References z 3 7 9 How do I access the Clusterware Pro V5 1 Command Line Interface Back to Top 2 3 9 Get a high level view of the status of the Compute Nodes Using the Clusterware Pro V5 1 Web Interface From the Jobs tab z Review the Hosts table z Use the Previous and Next buttons to view more hosts Using the Clusterware Pro V5 1 CLI bhosts References z 3 7 8 How do I access the Clusterware...

Page 128: ...ified nodes. 3.1.10 clinfo: Shows nodes and cluster information. 3.1.11 clgroup: Creates a logical cluster group of nodes. 3.1.12 clbroadcast: Telnet and MP based broadcast commands on cluster nodes. 3.1.13 clpower: Controls remote power operations for cluster nodes. 3.1.1 What is Cluster Management Utility Zone? ClusterPack includes several utilities which can aid both in administrative tasks and in workl...

Page 129: ...he tools mp_register and clbootnodes can be used to register and configure MP interfaces and then use those interfaces to automate the booting of nodes. By default, manager_config interactively asks the user for an IP address range to assign to the Compute Nodes. It is also possible to pass a file containing names and IP addresses to manager_config. The EasyInstall utilities can also be used to add...

Page 130: ...ctive commands on one some or all nodes in the cluster z clcp Copies files to one some all cluster nodes z cluptime Works like ruptime only for all nodes in the cluster z clps Cluster wide ps command z clkill Kills specified processes on specified nodes z clinfo Shows nodes and cluster information z clpower Utility to manage remote power operations on the cluster ex turn the system power on and of...

Page 131: ...p for something on all hosts in the cluster clsh grep pattern files To append something to a file on all machines clsh i cat file addendum To run a command with a five second timeout on all the hosts in the cluster group hp directing output into separate files clsh o t5 C hp date clsh o t5 hp date A cluster name without a C must follow all flag arguments For more details on the usage of clsh invok...

Page 132: ...t this file is different on all hosts The following is a way in which this can be done clcp h etc checklist checklist h vi checklist Make necessary changes clcp checklist h h etc checklist If the CLUSTER environment variable was defined as host0 host1 then the above would map to rcp host0 etc checklist checklist host0 rcp host1 etc checklist checklist host1 vi checklist host0 checklist host1 rcp c...

Page 133: ...the host names with file names of the form YYMMDD TT TT The above might map to rcp host0 usr spool mqueue syslog host0 syslog 921013 14 43 rcp host1 usr spool mqueue syslog host1 syslog 921013 14 43 4 Like rcp clcp can copy many files to the cluster This is done by clcp src1 src2 src3 h or clcp src1 src2 src3 cluster group For more details on the usage of clcp invoke the command man clcp Back to T...

Page 134: ...ps Back to Top 3 1 9 clkill Kills specified processes on specified nodes clps and clkill are the same program with clps producing a ps output that includes the host name and clkill allowing processes to be killed Since using PIDs on a cluster is not feasible given there will be different hosts clkill can kill processes by name The i option should be passed to clkill to allow interactive killing i ...

Page 135: ...rograms Long lines wrap and the cluster name is always given even when there is only one cluster This is the default mode if the output is to a tty device like the user s screen z Long format enabled by the l option The long format is essentially a dump of the internal database maintained by cladmin The cluster name is always output followed by one record per host Each field of the record occurs b...

Page 136: ...removing nodes from a group the nodes to be removed can be specified in terms of a list of individual nodes and or other groups When a previously existing group is specified all members of that group are removed from the group being modified The third form allows the information regarding one or more node groups to be provided in a file The last form lists all the node groups in the compute cluste...

Page 137: ... input keyboard actions will be broadcast in all target windows To send a command to a specific target type directly in the target window and the command is not broadcast clbroadcast is used as follows clbroadcast nodename clbroadcast mp nodename clbroadcast telnet nodename Examples The following command broadcasts to cluster nodes nodea nodeb and nodec using the default telnet interface clbroadca...

Page 138: ... follows clpower options nodelist Examples This command line turns on the power on nodes n3 and n4 clpower on n3 n4 This command line turns off the power to node groups group1 and group2 clpower off C group1 group2 This command line displays the power status of all the nodes in the ClusterPack cluster clpower status This example lights up the unit identifier LED on node n1 clpower uidon n1 For mor...

Page 139: ...buted Task Facility that improves operator efficiency by replicating operations across the nodes or node groups within the ServiceControl Managed Cluster with a single command z Tools designed to deal with a single system single system aware tools like bdf are dispatched to the target systems and their results collected for review This mechanism can also be used to handle custom tools such as user...

Page 140: ...talled prior to installation of ClusterPack References z 4 1 2 HP UX ServiceControl Manager Back to Top 3 2 3 How to Run SCM Web based GUI This release of ClusterPack includes a version of SCM that has a Web based GUI To run the SCM GUI point your Web browser at the following URL https manager_node_address 50000 You must be using a recent version of Internet Explorer or Netscape in order to run th...

Page 141: ...ures of the tool are z You design the grouping of devices in the way that best suits your environment z The GUI s buttons tabs and menus provide quick access to defining devices and groups adding configuring and deleting devices as well as groups schedules and filters collecting data on a scheduled basis or on demand filtering of collected data to isolate specific data comparing collected inventor...

Page 142: ...ties Online help is available by clicking the Help Tab in Systems Inventory Manager GUI References z 4 1 4 HP System Inventory Manager Back to Top 3 3 2 How to invoke Systems Inventory Manager Using the SCM GUI z Under Tools select HP Systems Inventory Manager z Double click on the HP Systems Inventory Manager icon z This launches the Systems Inventory Manager GUI From your web browser at your des...

Page 143: ...uently leaves the restart files inaccessible Using a shared file system does not preclude data loss and can introduce performance degradation Redundant hardware solutions are often financially impractical for large clusters used in technical computing Secondly applications affected by computer failure generally require human detection and intervention in order to be restarted from restart files Va...

Page 144: ...shrc file source share platform clusterware conf cshrc lsf and the following line to their profile file share platform clusterware conf profile lsf References z 2 3 4 Remove temporary files from Compute Nodes z 2 3 5 Prepare application for checkpoint restart z 2 3 6 Restart application from a checkpoint if a Compute Node crashes z AppRS Release Note z AppRS User s Guide Back to Top ...

Page 145: ...roup Administration Menu 3 5 1 What is CMU CMU is designed to manage a large group of Compute Nodes CMU comes with a Graphical User Interface It provides access to all Compute Nodes from a single screen using a single mouse click The CMU main window gives you access to all the menus you need to setup your CMU configuration Back to Top 3 5 2 Command line utilities CMU offers several command line ba...

Page 146: ...f any or through its console port if there is a terminal server 3 Connect to a node by telnet through the management network through its management card if any or through its console port if there is a terminal server z Event handling management Displays a warning message or executes a command when a node becomes unreachable or reachable again Back to Top 3 5 4 Invoking CMU The user must be logged...

Page 147: ...lows multiple non contiguous selections while the Shift key allows contiguous or groups of objects to be selected Back to Top 3 5 5 Stopping CMU To stop CMU left click the mouse on the Quit button in the main CMU window lower right corner Note When stopping CMU saves the current configuration parameters Back to Top 3 5 6 CMU main window Description of the main menu buttons for CMU monitoring and m...

Page 148: ...clicked is not selected it will be added to your selection If it is already selected it will be removed from the selection The selection is composed of all the darker nodes on the window z Select all the nodes of the logical group Double left click on one node of the logical group and all the nodes will be selected z Unselect all the nodes of the logical group Double middle click on one node of th...

Page 149: ... Locator On Switches on the Locator LED of the node This option is only available if the node is an HP Integrity server with a properly registered ECI card z Locator Off Switches off the Locator LED of the node This option is only available if the node is an HP Integrity server with a properly registered ECI card Note If several nodes are selected all the items of the contextual menu are inactivat...

Page 150: ...nagement card password and the same PDU password If a node is linked with both a PDU and a management card the power off will be performed using the management card The PDU will be used only if the management card power off has failed Note If the nodes are not halted they will be powered off by the remotely manageable PDU or by their management card This can damage the file system If unsure use Ha...

Page 151: ...d nodes Telnet connection to the console and through a terminal server if all the selected nodes are connected to a terminal server Telnet connection through the management card if all the selected nodes have a management card Note Telnet connections through the management card are not allowed in a single window mode z Multiple Window If the user chooses the multiple windows mode the command launc...

Page 152: ...sualize the telnet sessions or does not want to crowd the display the user has the option to start the Xterm windows minimized Note The console broadcast displayed Xterm windows are limited by the number of ttys and the display capacity of the X server HP advises the use of a Single Window for performing the broadcast command on a large number of nodes z Remote Connection This feature offers the s...

Page 153: ...s the IP headers of internal packets going out making it appear that they all came from a single IP address which is the external IP address of the entire cluster Reply packets coming back are translated back and forwarded to the appropriate Compute Node Thus the Compute Nodes are allowed to connect to the outside world if needed However outside machines cannot initiate any connection to individua...

Page 154: ...ities One of the features that it supports is Network Address Translation For information on HP UX HPFilter please refer to the HP UX HPFilter manual and release notes at docs hp com http docs hp com hpux internet index html IPFilter 9000 For information on NAT features of HP UX HPFilter refer to the public domain how to document No guarantee can be made about the correctness completeness or appli...

Page 155: ... V5.1 services be refreshed after changes to the configuration are made? 3.7.11 Where can I find more information about using and administering Clusterware Pro V5.1? 3.7.1 What is Clusterware Pro? Platform Computing Clusterware Pro V5.1 is a comprehensive cluster management solution for enterprises looking to maximize the cost-effective, high-performance potential of HP-UX clusters. Platform Computing...

Page 156: ... to your Software License Certificate for contact information. You will need to get the host identification number from the Management Server. The host ID can be found using the uname command: /bin/uname -i. The number returned by the command uname -i must be preceded by a # when making your request. For example, if uname -i returns 2005771344, provide the ID number as #2005771344 in your key request. Note: It may ...

Page 157: ...manent license file: touch /share/platform/clusterware/conf/LSF_license.oem 4. Start the Clusterware Services on the Management Server: /share/platform/clusterware/lbin/cwmgr start Note: These changes will need to be undone in order to use a permanent license key. Please see /share/platform/clusterware/conf/README.demo for more information. References: Step 7 Configure the ProCurve Switch; 2.2.1 Transfer...

Page 158: ...latform/clusterware/1.0/hppa11-64/etc/sbatchd
root 20163 20116 0 Aug 2 0:05 /share/platform/clusterware/1.0/hppa11-64/etc/mbatchd -d /share/pla
root 20110 1 0 Aug 2 0:11 /share/platform/clusterware/1.0/hppa11-64/etc/lim
root 20113 1 0 Aug 2 0:00 /share/platform/clusterware/1.0/hppa11-64/etc/res
On a Compute Node, Clusterware Pro V5.1 uses different services than on the Management Server. The method of chec...
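
The check against the process listing above can be sketched in shell. The function name missing_daemons is hypothetical, and the daemon list contains only the four names visible in the listing (lim, res, sbatchd, mbatchd); it is an assumption that these are the complete set on the Management Server. The sample input lines are illustrative, not captured output.

```shell
#!/bin/sh
# Daemon names taken from the listing above (assumed to be the full set).
DAEMONS="lim res sbatchd mbatchd"

# Read `ps -ef`-style text on stdin and print each daemon with no matching line.
missing_daemons() {
    listing=$(cat)
    for d in $DAEMONS; do
        # Match "/etc/<name>" followed by a space or end of line, so that
        # a short name such as "res" cannot match a longer command name.
        printf '%s\n' "$listing" | grep -Eq "/etc/$d( |\$)" || printf '%s\n' "$d"
    done
}

# Example with a hypothetical two-line listing; on a live system use:
#   ps -ef | missing_daemons
printf '%s\n' \
    'root 20110 1 0 Aug 2 ? 0:11 /share/platform/clusterware/1.0/hppa11-64/etc/lim' \
    'root 20113 1 0 Aug 2 ? 0:00 /share/platform/clusterware/1.0/hppa11-64/etc/res' \
    | missing_daemons
```

An empty result means every listed daemon was found; any name printed is a candidate for a restart.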

Page 159: ...terware/lbin/cwagent start To STOP services on ALL Compute Nodes: Issue the following command on the Management Server as the superuser (i.e. root). On the Management Server: clsh /share/platform/clusterware/lbin/cwagent stop To START services on a single Compute Node: Issue the following command as the superuser (i.e. root). On the Management Server: clsh -C compute_node /share/platform/clusterware/lbin/cwage...

Page 160: ...f the Management Server is rebooted. References: 3.7.5 How do I start and stop the Clusterware Pro V5.1 daemons? 3.7.7 What system resources are required by Clusterware Pro V5.1? The Clusterware Pro V5.1 web server is Tomcat. Tomcat is maintained and distributed by the Apache Software Foundation. Several tools within the ClusterPack solution use the Tomcat web server. 3.7.8 How do I ...

Page 161: ... Clusterware Pro V5.1 services be refreshed after changes to the configuration are made? The services only read the configuration file when they are started up or reconfigured. Any time a change is made to the configuration, the services must either be restarted or reconfigured. Changes include, but are not limited to: adding or removing queues; changing existing queues; adding or removing nodes; reinstal...

Page 162: ...obs will continue to run. Please see "How do I start and stop the Clusterware Pro V5.1 daemons?" for more information. References: 3.7.5 How do I start and stop the Clusterware Pro V5.1 daemons? 3.7.11 Where can I find more information about using and administering Clusterware Pro V5.1? Online reference documents are available for Administering Clusterware Pro and Running Jobs using Clusterware P...

Page 163: ... the system console (serial or LAN) to activate the main MP menu. Enter the cm command to access the command menu. Enter the pc command (power control) to toggle the system power state. Note that no signal is sent to the OS to allow for a graceful shutdown, so the system should be halted prior to using this command to turn off the system. Enter the lc command (LAN configuration) to set IP address, subnet mask, g...

Page 165: ...nd storage environment with a feature-rich, extensible, and secure management tool set. HP SIM also serves as a central access point for ProLiant Essentials, Integrity Essentials, and Storage Essentials software options that deliver targeted functionality for these platforms. 3.9.2 What are the key features of HP Systems Insight Manager? Here are some of the key features of HP SIM: Delivers ...

Page 166: ... Systems Insight Manager is available as part of the HP-UX Operating Environment (and as a web-release HP Software bundle), and must be installed on the Management Server. ClusterPack provides tools to configure HPSIM to manage the ClusterPack cluster. For additional information about configuration, management, or general troubleshooting, please refer to the HPSIM Technical Reference: http://h18013.www1 ...

Page 168: ...5 HP-UX IPFilter; 4.1.6 ClusterPack V2.3; 4.1.7 HP Systems Insight Manager. 4.1.1 HP-UX 11i Operating Environments: HP-UX 11i v2 Operating Environment Document Collection, http://www.docs.hp.com/en/oshpux11iv2.html; Ignite-UX Administration Guide, http://docs.hp.com/en/B2355-90875/index.html; Software Distributor Administration Guide for HP-UX 11i, http://docs.hp.com/en/B2355-90789/index.html. 4.1.2 H...

Page 169: ...otes.pdf; HP Application Restart User's Guide, AppRS User's Guide pdf. 4.1.4 HP System Inventory Manager: Systems Inventory Manager User's Guide, http://docs.hp.com/en/5187-4238/index.html; Systems Inventory Manager Troubleshooting Guide, http://docs.hp.com/en/5187-4239/index.html. 4.1.5 HP-UX IPFilter: HP-UX IPFilter Release Note, http://www.docs.hp.com/hpux/onlinedocs/B9901-90010/B9901-9...

Page 170: ...ClusterPack V2.3: ClusterPack V2.3 Release Note, http://www.docs.hp.com/hpux/onlinedocs/T1843-90009/T1843-90009.htm. 4.1.7 HP Systems Insight Manager: HP Systems Insight Manager Product Information, http://h18013.www1.hp.com/products/servers/management/hpsim/index.html. Dictionary terms: Cluster LAN Switch, Cluster Management Software, Guarded Cluster, Head Node, Interconnect Switch, Management Processor M...

Page 171: ... for system administrators and end users. Guarded Cluster: A cluster where only the Management Server has a network connection to nodes outside of the cluster. All of the Compute Nodes are connected within the cluster on a private subnet (i.e. IP addresses of 10.x.x.x or 192.168.x.x). Head Node: A Head Node provides user access to the cluster. In smaller clusters, the Management Server may als...
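
The private-subnet condition in the Guarded Cluster definition above can be sketched as a shell check. The function name is_cluster_private is hypothetical, and the sketch assumes the definition refers to the private ranges 10.0.0.0/8 and 192.168.0.0/16; the sample addresses are illustrative only.

```shell
#!/bin/sh
# Return success if the dotted-quad address starts with one of the two private
# prefixes assumed here (10. or 192.168.). Hypothetical helper; it does not
# validate that the argument is a well-formed IPv4 address.
is_cluster_private() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify a few sample addresses.
for addr in 10.0.0.12 192.168.1.5 192.0.2.1; do
    if is_cluster_private "$addr"; then
        echo "$addr is on a private subnet"
    else
        echo "$addr is outside the private ranges"
    fi
done
```

In a guarded cluster, every Compute Node address would be expected to pass this check, while the external interface of the Management Server would not.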

Page 172: ...nt functions. Management Server: The Management Server provides a single point of management for all system components in the cluster. In smaller clusters, the Management Server may also serve as a Head Node. References: Head Node. Network Attached Storage (NAS): Network Attached Storage (NAS) attaches directly to Ethernet networks, providing easy installation, low maintenance, and high u...
