This soft copy for use by IBM employees only.

How Customers Can Get ITSO Redbooks  286
IBM Redbook Order Form  287
List of Abbreviations  289
Index  291

viii

SP PD Guide 

Summary of Contents for RS/6000 SP

Page 1: ...SG24-4778-00 RS/6000 SP Problem Determination Guide, December 1996. This soft copy for use by IBM employees only...

Page 3: ...International Technical Support Organization, RS/6000 SP Problem Determination Guide, December 1996, SG24-4778-00. This soft copy for use by IBM employees only...

Page 4: ...poration, International Technical Support Organization, Dept. HYJ, Mail Station P099, 522 South Road, Poughkeepsie, New York 12601-5400. When you send information to IBM, you grant IBM a non-exclusive right to...

Page 5: ...able Values 15 2 2 9 spdata Directory Structure 15 2 2 10 Install PSSP on the Control Workstation 16 2 2 11 PSSP 1 2 18 2 2 12 PSSP 2 1 18 2 2 13 Changes from PSSP 1 2 to PSSP 2 1 19 2 2 14 PSSP Softw...

Page 6: ...2 2 Instance 72 3 2 3 Realm 73 3 2 4 Ticket 73 3 2 5 Key 73 3 2 6 Ticket Granting Ticket 73 3 3 Components 74 3 3 1 ssp authent 2 1 0 2 74 3 3 2 ssp clients 2 1 0 5 74 3 4 Install Process 74 3 4 1 set...

Page 7: ...Node Behavior 103 4 4 3 Secondary Node Behavior 104 4 4 4 Recovery from a Switch Failure 104 4 5 Switch Commands 104 4 5 1 The rc switch Script 108 4 5 2 Switch Initialization 108 4 6 Reviewing Switch...

Page 8: ...Commands 161 6 1 3 Error Log Files 161 6 2 SP Error Logging 162 6 2 1 Install and Configure Error Log 163 6 2 2 AIX Error Log Facility 164 6 2 3 BSD syslog Facility 165 6 2 4 Trimming syslog Files 168...

Page 9: ...Tips 222 9 6 Managing Print Services 225 9 6 1 Using Print Services 225 9 7 Managing NTP 225 9 7 1 Using NTP 226 9 7 2 NTP Hints and Tips 226 Appendix A RS 6000 SP Script Files 229 A 1 The setup_authe...

Page 10: ...This soft copy for use by IBM employees only How Customers Can Get ITSO Redbooks 286 IBM Redbook Order Form 287 List of Abbreviations 289 Index 291 viii SP PD Guide...

Page 11: ...rvices File after Running install_cw Script 43 23 Output from setup_server 68 24 Extract from the etc services File 76 25 Example Using netstat Command 76 26 Example of etc krb srvtab from the Control...

Page 12: ...ocess Overview 195 77 Booting Problems 196 78 System Monitor Problems 197 79 Switch Problems 199 80 System Partitioning Problems 200 81 Copying a System Dump 206 82 Using crash 210 83 Default Values f...

Page 13: ...122 setup_server Script Flow Chart 12 23 250 123 setup_server Script Flow Chart 13 23 251 124 setup_server Script Flow Chart 14 23 252 125 setup_server Script Flow Chart 15 23 253 126 setup_server Sc...

Page 15: ...ies 16 2 Components of the PSSP Install Image 17 3 Output Terms of the lslpp Command 24 4 Components and Sizes of the PSSP PTFset11 28 5 NIM Client Definition Information 55 6 Cabling SP Switch versus...

Page 17: ...and other technical support personnel who deal with SP problems. It is also useful for those who want a more comprehensive understanding of RS/6000 SP components. How This Redbook Is Organized: This redb...

Page 18: ...RS/6000 SP provides several tools that help to identify and isolate problems. In this chapter, these tools will be used along with the Symptom Index in Chapter 3 of the Diagnosis and Messages Guide, GC23...

Page 19: ...s. His expertise has been gained by working closely with customers, resolving installation and post-installation issues. He has written extensively on the SP Switch. A special thanks is extended to Richar...

Page 20: ...Comments Welcome. We want our redbooks to be as helpful as possible. Should you have any comments about this or other redbooks, please send us a note at the following address: redbook@vnet.ibm.com. Your co...

Page 21: ...ree hardware components that make the RS/6000 SP different from conventional RISC machines: the Control Workstation, the Frame, and the High Performance Switch. These components are integrated into the RS/600...

Page 22: ...s network is dedicated to SP traffic only; therefore, you should install a different network adapter to carry user application traffic. 1.1.2 The Frame: All the hardware information flows from the Frame t...

Page 23: ...signed for concurrent maintenance: each node can be removed and repaired without interrupting operations on the other nodes. 1.1.3 The High Performance Switch: The High Performance Switch is one of the m...

Page 24: ...switch. These components will be covered later in this book. The Switch can be partitioned, which creates separate, non-disruptive environments. In this way each partition can be seen and man...

Page 25: ...d like this, dsh -w sp21n01, fails and says that it could not be authenticated on node sp21n01 due to a Kerberos ticket problem, the offending component is clear. But if you enter this command, SDRGetObject...

Page 26: ...not considered suitable to include in a chapter but are useful as reference material...

Page 27: ...nical Presentation SG24 4542 Chapter 3 Installing and Configuring the SP System For the purpose of this book we assume that detailed site planning has been carried out RS 6000 SP hardware has been cor...

Page 28: ...6. PSSP install images have been copied to /spdata/sys1/install/pssplpp and renamed as pssp.installp, and the inutoc script has been run. 7. Basic AIX mksysb image bos.obj.ssp.41 has been installed under /spdat...

Page 29: ...le file is copied to $HOME/.dtprofile the first time a user logs into the desktop. Any lines in sys.dtprofile located between "SYSPROFILE COMMENT START" and "SYSPROFILE COMMENT END" are filtered out during t...

Page 30: ...omponents that are installed on your Control Workstation. If it does not list all of the required components, then install the missing NIM components from the AIX install media before proceeding further...

Page 31: ...IFIER: 000081007414db91 VG STATE: active PP SIZE: 4 megabyte(s) VG PERMISSION: read/write TOTAL PPs: 2166 (8664 megabytes) MAX LVs: 256 FREE PPs: 1852 (7408 megabytes) LVs: 12 USED PPs: 314 (1256 megabytes) OPEN LVs...
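The figures in the volume group listing above are related by simple arithmetic: each megabyte value is the physical partition (PP) count multiplied by the PP size, and the free and used counts add up to the total. A minimal sketch of that check, using only the numbers visible in the listing (the function name is an illustrative assumption):

```python
# Values copied from the volume group listing above.
PP_SIZE_MB = 4          # PP SIZE: 4 megabyte(s)
TOTAL_PPS = 2166
FREE_PPS = 1852
USED_PPS = 314


def pps_to_mb(pp_count, pp_size_mb=PP_SIZE_MB):
    """Convert a count of physical partitions to megabytes."""
    return pp_count * pp_size_mb


if __name__ == "__main__":
    for label, count in (("TOTAL", TOTAL_PPS), ("FREE", FREE_PPS), ("USED", USED_PPS)):
        print(f"{label} PPs: {count} ({pps_to_mb(count)} megabytes)")
    # Free and used partitions should account for every partition in the group.
    print("consistent:", FREE_PPS + USED_PPS == TOTAL_PPS)
```

The same conversion is handy when estimating whether enough free space remains for the install images.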

Page 32: ...cpip -h sp2tr0 -a 9.12.0.37 -m 255.255.255.0 -i tr0 -g 9.12.0.32 -r 16 -s; /usr/sbin/mktcpip -h sp2cw0 -a 9.12.20.99 -m 255.255.255.0 -i en0 -t bnc -s. tr0 is configured first, followed by en0. This is done to make en0...
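When adapter addresses are entered by hand, one quick sanity check is that a gateway lies inside the subnet of the interface it is configured on. The sketch below uses Python's standard ipaddress module with the addresses from the example above; the helper name is an assumption made for illustration and is not part of PSSP or AIX.

```python
import ipaddress


def same_subnet(address, netmask, gateway):
    """Return True if the gateway lies inside the interface's subnet."""
    interface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in interface.network


if __name__ == "__main__":
    # tr0 from the example: address 9.12.0.37, mask 255.255.255.0, gateway 9.12.0.32
    print(same_subnet("9.12.0.37", "255.255.255.0", "9.12.0.32"))
    # The en0 address is on a different subnet from that gateway.
    print(same_subnet("9.12.20.99", "255.255.255.0", "9.12.0.32"))
```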

Page 33: ...p2sw10 9.12.6.11 sp2sw11 9.12.6.12 sp2sw12 9.12.6.13 sp2sw13 9.12.6.14 sp2sw14 9.12.6.15 sp2sw15 9.12.6.16 sp2sw16. Figure 10. /etc/hosts File on the Control Workstation. 2.2.5.1 Verify Control Workstatio...
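Name resolution problems often come down to a host name that appears with two different addresses. As a rough sketch, the hosts-file entries shown above can be parsed and checked for that condition; the function names are illustrative assumptions, and the sample text reuses only the entries visible in the excerpt.

```python
SAMPLE_HOSTS = """\
9.12.6.11 sp2sw11
9.12.6.12 sp2sw12
9.12.6.13 sp2sw13
9.12.6.14 sp2sw14
9.12.6.15 sp2sw15
9.12.6.16 sp2sw16
"""


def parse_hosts(text):
    """Map each host name to the list of addresses it is listed under."""
    table = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()   # drop comments, split on whitespace
        if len(fields) < 2:
            continue                            # blank line or address without a name
        address, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name, []).append(address)
    return table


def ambiguous_names(table):
    """Return the names that resolve to more than one distinct address."""
    return sorted(name for name, addrs in table.items() if len(set(addrs)) > 1)


if __name__ == "__main__":
    hosts = parse_hosts(SAMPLE_HOSTS)
    print(hosts["sp2sw11"])
    print(ambiguous_names(hosts))
```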

Page 34: ...ly to a root user. The possible values range from 40 to 131072. Any increase in this number will take effect immediately. Any decrease in this number will not become effective until the next system boot...
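The rule described above (a valid range of 40 to 131072, increases effective immediately, decreases effective at the next boot) can be captured in a few lines. This is only a sketch of the stated rule; the function and constant names are assumptions, and the code does not query or change any system setting.

```python
MIN_VALUE = 40        # lower bound quoted in the text
MAX_VALUE = 131072    # upper bound quoted in the text


def change_takes_effect(current, requested):
    """Describe when a change to the limit becomes effective."""
    if not MIN_VALUE <= requested <= MAX_VALUE:
        raise ValueError(f"value must be between {MIN_VALUE} and {MAX_VALUE}")
    if requested > current:
        return "immediately"
    if requested < current:
        return "next system boot"
    return "no change"


if __name__ == "__main__":
    print(change_takes_effect(40, 256))    # an increase
    print(change_takes_effect(256, 128))   # a decrease
```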

Page 35: ...blems on page 69 for details of post-installation customization steps. Note: /etc/rc.sp does not exist yet. It will be created after the installation of PSSP software. 2.2.9 /spdata Directory Structure: You...

Page 36: ...NIM configuration data files. /spdata/sys1/install/pssplpp: Location of all PSSP and SP system file sets. /spdata/sys1/sdr: Location of the SDR files. /spdata/sys1/sdr/archives: In this directory are stored t...

Page 37: ...rtitioning Files ssp.top X X. 2.2.10.1 Minimal PSSP Installation: For a minimum installation, the following components are required. Code for installing and monitoring the SP system: ssp.basic. SP user comm...

Page 38: ...Control Workstation to AIX Version 4.1. They also need to create AIX 3.2.5-based partitions before actually starting the upgrade. First Implementation of PSSP 1.2: PSSP 1.2 was IBM's first implementation...

Page 39: ...ife to 30 days for Kerberos, and enhancements to address any host by any of its host names. VSD performance improvements: With standard VSD, database applications could only stripe data across multiple di...

Page 40: ...p installp is now pssp installp spbootins spbootins now has A diagnosis option r diag New syntax for the h option Remote Diagnosis Support has been added in PSSP 2 1 2 2 14 PSSP Software Strategy The...

Page 41: ...components affected by these PTFs. 2.2.15 Obtaining PSSP PTFs Using FixDist. Figure 13. Obtaining PSSP PTFs Using FixDist. The easiest way to obtain AIX fixes and PSSP fixes is to use the tool FixDist. Th...

Page 42: ...Subsystem; ssp.docs 2.1.0.4 COMMITTED SP man and info files; ssp.gui 2.1.0.6 COMMITTED SP System Monitor Graphical; ssp.jm 2.1.0.2 COMMITTED SP Job Manager Package; ssp.public 2.1.0.2 COMMITTED Public Cod...

Page 43: ...so known as PTF or program temporary fix ID) parameter specifies the identifier of an update to an AIX 3.2 formatted fileset. When only the -l (lowercase L) flag is entered, the lslpp command displays the l...

Page 44: ...em. The COMMITTED state means that a commitment has been made to this level of the software. A committed fileset update cannot be rejected, but a committed fileset base level and its updates, regardless o...

Page 45: ...art of PSSP installation. By installing correct levels of PSSP PTFs, you ensure that the PSSP software enhancements and fixes for known problems are incorporated in the PSSP code on your Control Worksta...

Page 46: ...1 bin bin 2186 Apr 9 09 34 ssp sysctl README rw r r 1 bin bin 1487 Apr 9 06 44 ssp sysman README rw r r 1 bin bin 116 Apr 9 09 39 ssp top README IX53362 README files also exist for some products unde...

Page 47: ...following steps to change the inittab file and start the kadmind, kerberos, and hardmon daemons: chitab kadm:2:respawn:/usr/lpp/ssp/kerberos/etc/kadmind; chitab kerb:2:respawn:/usr/lpp/ssp/kerberos/etc/kerbero...

Page 48: ...et11 Component Size: ppe.poe 2.1.0.8 2063360; ppe.vt 2.1.0.4 1428480; ssp.authent 2.1.0.2 380928; ssp.basic 2.1.0.10 17014784; ssp.clients 2.1.0.5 1904640; ssp.csd.cmi 2.1.0.1 158720; ssp.csd.hsd 2.1.0.2 190...
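Before copying a PTF set, it helps to total the component sizes and see which component dominates. The sketch below sums only the six sizes that appear in full in the table excerpt above; the truncated last entry is left out, and the excerpt does not state the unit, so none is assumed here.

```python
# Component name -> size, copied from the complete rows of the table excerpt.
PTF_SIZES = {
    "ppe.poe 2.1.0.8": 2063360,
    "ppe.vt 2.1.0.4": 1428480,
    "ssp.authent 2.1.0.2": 380928,
    "ssp.basic 2.1.0.10": 17014784,
    "ssp.clients 2.1.0.5": 1904640,
    "ssp.csd.cmi 2.1.0.1": 158720,
}


def total_size(sizes):
    """Add up the sizes of all listed components."""
    return sum(sizes.values())


def largest_component(sizes):
    """Return the name of the component with the largest size."""
    return max(sizes, key=sizes.get)


if __name__ == "__main__":
    print("total of listed components:", total_size(PTF_SIZES))
    print("largest component:", largest_component(PTF_SIZES))
```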

Page 49: ...tion was set up. The full path of this script is /usr/lpp/ssp/bin/install_cw. install_cw runs on the Control Workstation and does the following: Configures the Control Workstation. Executes /usr/lpp/ssp/ins...

Page 50: ...ed. 0513-059 The splogd Subsystem has been started. Subsystem PID is 15102. 0513-059 The sysctld Subsystem has been started. Subsystem PID is 16390. Figure 14. Output of Running install_cw Script. Figure 15...

Page 51: ...entication part first. 2.5 Problems with spmon: SP System Monitor refers to four main components. Hardware Monitor: The hardware monitor consists of a set of commands and a daemon to monitor and control t...

Page 52: ...e state and control the hardware. The SP system monitor GUI issues the spmon command internally to control and monitor the hardware. The splogd logging daemon is also a client of the hardware monitor fo...

Page 53: ...Instead of the Expected Information: If the system monitor panels are displaying blanks instead of the expected information after installation, or if the ControllerResponds indicator on the system monit...

Page 54: ...ion is fine. Verify serial port speed: Check the baud rate of the serial port. The System Monitor configures this for you when it starts. Use the stty command as follows to redirect input from the correct...

Page 55: ...ntry that was found and is shown here just as an example entry in the log file. The same messages are also logged under errpt. Check /var/adm/SPlogs/SPdaemon.log for messages containing the word sphwlog...

Page 56: ...tion using klist, request authorization using kinit, and check /spdata/sys1/spmon/hmacls. Check hardware monitor daemon (hardmon): Issue the hardware monitor command using an ID that has monitor authority an...

Page 57: ...exists with permissions rw-r--r--. Check that error record templates for SPMON exist: errpt -t | grep SPMON. If the logging daemon dies: Check for a core dump. If available, run crash to identify the reason for...

Page 58: ...5 1 sp2n15 sp2n15 9 12 20 99 16 1 16 1 sp2n16 sp2n16 9 12 20 99 In this case SDR has information about node 8 2 Check var adm SPlogs spmon and view hmlogfile nnn The following is a listing of the rele...

Page 59: ...ps -eaf | grep hardmon shows that it is running, but another ps -eaf shows it is continually respawning; fuser /dev/tty0 reconfirms this. 2. Check that the sdr daemon is running by using lssrc -g sdr or ps -aef...

Page 60: ...re missing: 1 root.admin vsm; 1 hardmon.sp2cw0 vsm. Even if these lines are added by hand, there may still be a problem starting spmon. Check for and remove any blank lines. 4. Kill hardmon to respawn it wit...

Page 61: ...this on a running SP system. 1. Remove the Kerberos configuration files and the Kerberos server keytab file: rm -i /etc/krb*; rm: Remove /etc/krb-srvtab? y; rm: Remove /etc/krb.conf? y; rm: Remove /etc/krb.realms? y. 2...

Page 62: ...running the install_cw script, check the following. PATH: /usr/lpp/ssp/bin should be exported in your PATH. Authentication: You should be logged in as a valid authenticated user of the system management co...

Page 63: ...lost when you restore the mksysb image on the CWS. There are two different methods you can use to recreate this entry: 1. Run install_cw. This will create the proper node_number entry for the Control Work...

Page 64: ...s: master and client. NIM Master: This role is dedicated to only one machine in the NIM environment. The NIM master is the single point of administration for NIM installations. All NIM-related operations a...

Page 65: ...mation is stored on the master. Within the RS/6000 SP environment, there is a machine object for each node of the RS/6000 SP. They will run as standalone systems after the installation. lsnim -l sshps01 sp...

Page 66: ...ources type spot version 04 server master location usr alloc_count 1 Rstate ready for use prev_state ready for use release 01 if_supported rs6k ent if_supported rs6k fddi if_supported rs6k tok if_supp...

Page 67: ...nce the following tasks: configuration of the network adapters (except the css0 adapter); install additional lpps; call the /tftpboot/tuning.cust script; call the /tftpboot/script.cust script. lsnim -l psspscri...

Page 68: ...a set of commands you can run to do this. A list of some of them follows. lsnim: This command displays information about the NIM environment, including predefined information, the attributes required for a...

Page 69: ...es standalone sphps05 machines standalone sphps06 machines standalone. Without any parameters, the command lsnim displays all NIM objects which have been created on this NIM master. In the RS/6000 SP env...

Page 70: ...essor systems; rs6ksmp: Micro Channel-based symmetric multiprocessor systems; rspc: ISA-bus systems. In the RS/6000 SP environment, only rs6k is relevant. if1: This attribute specifies the interface to be use...

Page 71: ...ndalone NIM machine object newhost. The NIM Network object spnet must exist at this time. Several checks are made, for instance, whether the IP address for hostname speth02 is reachable from the net spnet...

Page 72: ...tandalone clients. In the RS/6000 SP environment, this operation is not used. It requires the nodes to be configured as NIM clients (access to root user granted by an entry in the .rhosts file on the node). The fol...

Page 73: ...fix_query operation can be applied to SPOT resources or NIM clients. When invoked without any optional attributes, information about all installed fixes is displayed on the target. The fixes attribute a...

Page 74: ...b l 3 List Node Boot Install Information Node hostname hdw_enet_addr srvr response install last_install_image last_install_time next_install_image 3 sp2n03 10005AFA082C 1 disk hdisk0 bos obj ssp 41 S...

Page 75: ...sponse of disk. The Cstate attribute for the client (Ready for nim operation), from the output of lsnim -l sp2n03, is correct. Based on the bootp_response for the node, the Cstate attribute is fixed and will...

Page 76: ...llocated to the NIM client. At the same time, sp2n08 is listed in the /spdata/sys1/install/images/bos.obj.ssp.41 entry. 2. To correct this situation, you must issue the rmnfsexp command to remove the client...

Page 77: ...ry. Compare all filesets in this directory with the list given in Step 8 of the installation process in the RS/6000 SP Installation Guide, GC23-3898. This list is defined in the file /usr/lpp/bos.sysmgt/n...

Page 78: ...es yes. The Rstate is ready for use and simages is yes. If the simages attribute is no, then the support images needed to create the SPOT were not available in the lppsource r...

Page 79: ...ver Allocating resources for client sphps01. 0042-001 nim: processing error encountered on "master"; 0042-001 m_allocate: processing error encountered on "master"; 0042-058 m_alloc_spot: unable to allocate pssp...

Page 80: ...llowing steps to debug the installation of your nodes: 1. spbootins -r disk Frame Node NumberOfNodes. This command resets the specified nodes to disk and calls all necessary NIM commands to reset the inst...

Page 81: ...etwork. This mode will be covered later in this chapter in more detail. The command spbootins prepares the installation server to be able to serve NIM installations, including serving as network boot ser...

Page 82: ...mation such as client address, the boot server address, gateway information, the network adapter to use, the bootfile name, and others. 2. Configure the given network device by use of the received IP paramet...

Page 83: ...os_inst during the second phase of /sbin/rc.boot. This script triggers the installation process. If you have problems at the beginning of the installation, look at some of these scripts to figure out...

Page 84: ..._DATA: The name of the bosinst_data file for this installation is stored in this environment variable. NIM_BOS_IMAGE: Here we find the name of the image to be installed on the node. NIM_BOS_FORMAT: One for...

Page 85: ...T_DATA file spcntl aixedu spdata sys1 install images bos obj ssp 41 N IM_BOS_IMAGE file Following is an example of a host info file created for a maintenance network boot Network Install Manager warni...

Page 86: ...failure and how to resolve them. Step 1: Verify Boot Install Server Is Available. Do the following steps to verify that the boot install server is available: 1. Determine the client's boot install server b...

Page 87: ...ck to see if the image is available and the permissions are appropriate. Issue /usr/lpp/ssp/bin/splstdata -b. The next_install_image field lists the name of the image to be installed. If the field for this...

Page 88: ...t information from the SDR setup_server Creating Node arrays for processing setup_server Getting SP Object information from the SDR setup_server Checking to see if this system is an install server set...

Page 89: ...ts back to disk the bootp_response option in the SDR, and resets the NIM machine object for that node. 2.11.1 Why a Node Is Not Being Customized: The reasons why a node is not customized are many. However...

Page 90: ...e to the initial_hostname or reliable_hostname variables on the SDR, so keeping them clear will avoid a lot of problems. Installing the NIM master option in a node that never was a boot install server w...

Page 91: ...rform. Kerberos provides authentication services that allow certain distributed services within the SP system, and between it and other workstations (clients), to securely control access to their services...

Page 92: ...ed and Joe on the Control Workstation; make sure that the users exist on the nodes as well. 2. Use the command /usr/kerberos/bin/kadmin to add a Kerberos user called kerb. 3. Create a new file .klogin in...

Page 93: ...ated with a Kerberos user or service. Keys are stored in the Kerberos database. Keys are used to encrypt the data packets used by Kerberos clients and services. 3.2.6 Ticket-Granting Ticket: When a user e...

Page 94: ...hentication server. Within an authentication realm, there must be at least one authentication server, but you may choose to have more than one. When you configure your realm, you designate one authenticati...

Page 95: ...r providing ticket-granting tickets to clients so that they can access specific server principals. The kerberos daemon listens for requests on the kerberos4/udp port. If this port is not defined in the...

Page 96: ...sed naming convention for the Kerberos daemon services: kerberos 88/tcp # Kerberos; kerberos 88/udp # Kerberos; kerberos-adm 749/tcp # kerberos administration; kerberos-adm 749/udp # kerberos administration; rfile...
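To confirm which Kerberos-related ports a services file actually defines, the entries can be parsed into a lookup table. The following is a minimal sketch that assumes the usual "name port/protocol # comment" layout and the service name kerberos-adm (the hyphen is an assumption, because the extraction dropped punctuation); the function name is illustrative.

```python
SAMPLE_SERVICES = """\
kerberos      88/tcp    # Kerberos
kerberos      88/udp    # Kerberos
kerberos-adm  749/tcp   # kerberos administration
kerberos-adm  749/udp   # kerberos administration
"""


def parse_services(text):
    """Return a dict mapping (service name, protocol) to port number."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()     # strip trailing comments
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        entries[(name, proto)] = int(port)
    return entries


if __name__ == "__main__":
    services = parse_services(SAMPLE_SERVICES)
    print(services.get(("kerberos", "udp")))
    # A service that is not listed simply yields None.
    print(services.get(("kerberos4", "udp")))
```

A lookup that returns None for an expected entry is the cue to inspect the services file on that machine.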

Page 97: ...Note: Without a /.k file, the Kerberos server cannot be started automatically during an unattended reboot of the master server. 3.6.2 $HOME/.klogin: The .klogin file contains a list of principals (name.instanc...

Page 98: ...his file: root@sp21n01 klist -srvtab. Server key file: /etc/krb-srvtab. Service Instance Realm Key Version: rcmd sp21n01 SP21CW0 1. Figure 27. Example of /etc/krb-srvtab from a Node. Note: Always ensure that the...

Page 99: ...21CW0 sp21sw01 SP21CW0 sp21sw02 SP21CW0 sp21sw03 SP21CW0 sp21sw04 SP21CW0 sp21sw05 SP21CW0 sp21sw06 SP21CW0 sp21sw07 SP21CW0 sp21sw08 SP21CW0 sp21sw09 SP21CW0 sp21sw11 SP21CW0 Figure 29 Example of a e...

Page 100: ...317 Could not fetch master key. 26 Apr 96 14:49:23 Shutting down admin server. 26 Apr 96 14:53:17 Kerberos admin server started, PID 22380. Figure 31. Example of admin_server.syslog File. 3.7 Commands: This...

Page 101: ...d their private key versions found in the server key table, usually /etc/krb-srvtab. Following is an example of the output of the klist -srvtab command: root@sp21cw0 klist -srvtab. Server key file: /etc/krb-sr...

Page 102: ...mote command execution and monitoring server, sysctld. Sysctl connects to a remote host's sysctld using TCP/IP, passes keywords and commands to the server, and writes output returned to stdout. Any sysctl...

Page 103: ...r requesting the service in the file. 3.8.3 PATH Variable: Ensure that the /usr/lpp/ssp/rcmd/bin directory is before the /usr/bin directory in the user's local PATH statement. Also make sure that the PATH...

Page 104: ...8.7 Service Key Files: First check that the service key files (/etc/krb-srvtab) exist on all nodes. Use either the ksrvutil or the klist command to show the version numbers of the service keys in the /etc/k...

Page 105: ...fresh inittab 9 stopsrc s hardmon 10 setup_authent 11 spbootins r customize l NODELIST where NODELIST is a comma separated list of node numbers 12 startsrc s hardmon 13 telinit 2 Restart the Kerberos...

Page 107: ...ater on. The switch consists of 3 main hardware components: the switch board containing the switch chips; the switch cables to connect the switch to the nodes and other switches; the switch adapters for e...

Page 108: ...n 8 of these If there are 3 switches then 8 cables will go to the first switch and 8 to the other The number of cables going to each switch decreases to the point where you have 5 switches allowing on...

Page 109: ...frame. The intermediate switch boards provide sufficient routes to ensure that this is an optimum configuration. If a 128-way configuration were supported without the use of intermediate switch boards, th...

Page 110: ...evel of maintenance applied. To check your current level of maintenance, execute lslpp -l | grep ssp. If you have to install the SP Switch, ensure that the following two products are at least at the levels s...
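Checking that a fileset is "at least at" a given level means comparing the dotted levels numerically, field by field; a plain string comparison would rank 2.1.0.10 below 2.1.0.8. The sketch below illustrates the idea. The installed levels are taken from the lslpp excerpt earlier in this summary, while the required levels are hypothetical values chosen only for the example, since the actual minimums are cut off in the text.

```python
def parse_level(level):
    """Turn a dotted level such as '2.1.0.10' into a tuple of integers."""
    return tuple(int(part) for part in level.split("."))


def below_minimum(installed, required):
    """Return the filesets that are missing or older than the required level."""
    return sorted(
        name
        for name, minimum in required.items()
        if name not in installed
        or parse_level(installed[name]) < parse_level(minimum)
    )


# Levels as they appear in the lslpp excerpt elsewhere in this summary.
INSTALLED = {
    "ssp.docs": "2.1.0.4",
    "ssp.gui": "2.1.0.6",
    "ssp.jm": "2.1.0.2",
    "ssp.public": "2.1.0.2",
}

# Hypothetical minimum levels, for illustration only.
REQUIRED = {"ssp.gui": "2.1.0.6", "ssp.jm": "2.1.0.10"}

if __name__ == "__main__":
    print(below_minimum(INSTALLED, REQUIRED))
    # Numeric comparison versus naive string comparison:
    print(parse_level("2.1.0.10") > parse_level("2.1.0.8"), "2.1.0.10" > "2.1.0.8")
```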

Page 111: ...uting. Packet switches with wormhole routing help balance low latency and help relieve congestion problems. Circuit switches open the entire path from sending node to receiving node and keep that...

Page 112: ...s only supposed to be running a single application. With SP Switch, we had time to redesign the fault scenarios, and we made the faults localized. Only the link that experienced the fault is brought down...

Page 113: ...a more convenient time. 150 MBps single-direction bandwidth: The switch provides 150 MBps single-direction bandwidth (300 MBps both directions). This is nominal bandwidth. The effective bandwidth will chan...

Page 114: ...ich are the chosen routes: 1. Port 7 across to SW3, onto that chip through port 4, exiting on port 7 over to SW7, onto the chip on port 7. 2. Port 6 across to SW2, onto that chip through port 4, exiting on por...

Page 115: ...e number is used to build logical number Multiply frame number by 10 U1 to U8 Master switch chip silkscreen on the board N1 to N16 Node ports physical numbering E1 to E16 Switch to switch ports 4 3 Sw...

Page 116: ...not compromised on the switch network. Following is a listing from the /etc/SP directory on the Control Workstation. While the Eclock files reside in this directory at PSSP 2.1, the topology files are mer...

Page 117: ...tb0 5 0 L01-S00-BH-J12 to L01-N6; s 16 2 tb0 6 0 L01-S00-BH-J24 to L01-N7; s 16 3 tb0 7 0 L01-S00-BH-J26 to L01-N8; s 14 3 tb0 8 0 L01-S00-BH-J10 to L01-N9; s 14 2 tb0 9 0 L01-S00-BH-J8 to L01-N10; s 17 0...
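Topology file entries like those above are easier to cross-check against cabling when they are split into fields. The sketch below parses one node-connection line with a regular expression. It assumes the location labels are hyphen-joined (for example L01-S00-BH-J24), and the field names (chip, chip_port, node_index, adapter_port) are this sketch's own reading of the columns rather than official terminology.

```python
import re

# One switch-to-adapter line, e.g. "s 16 2 tb0 6 0 L01-S00-BH-J24 to L01-N7".
NODE_LINE = re.compile(
    r"^s\s+(?P<chip>\d+)\s+(?P<chip_port>\d+)\s+tb0\s+"
    r"(?P<node_index>\d+)\s+(?P<adapter_port>\d+)\s+"
    r"(?P<jack>\S+)\s+to\s+(?P<node>\S+)$"
)


def parse_node_line(line):
    """Split a switch-to-adapter topology line into named fields, or return None."""
    match = NODE_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("chip", "chip_port", "node_index", "adapter_port"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    entry = parse_node_line("s 16 2 tb0 6 0 L01-S00-BH-J24 to L01-N7")
    print(entry["jack"], "connects to", entry["node"])
```

Lines that describe other connection types do not match the pattern and return None, so they can be skipped or handled separately.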

Page 118: ...e that would impact the topology file information is the jack cabling information Table 6 summarizes those differences Table 6 Cabling SP Switch versus HiPS Chip Node HiPS Jacks SP Switch Jacks Chip N...

Page 119: ...a switch in a switch frame, 1000 is added to the logical number to differentiate it from node frame switches. The switch frame switches are numbered 1 through the total number of switch frame switches. I...

Page 120: ...edded in the data cables. Figure 49. The HiPS Clock Subsystem. The clock is driven from the clock card selection logic to the switch card. From there it is redriven to two major branches: one drives the da...

Page 121: ...later used to tune the individual switch port. Each of the eight phases is driven to each of the switch chips. You can see how switch chips 0, 1, 6, and 7 are all redriven from one source, and chips 2, 3, 4, a...

Page 122: ...th length for each non-master board. Hence the number of port choices is reduced to four for each PLL, and for a node switch board only non-node chips may be a source, yielding 2 PLL x 2 internal = 4 ports...

Page 123: ...ontact the backup fail. If the primary detects the backup has failed, it will automatically bring up a new one. The new backup will be chosen such that it is as far away from the primary as possible in t...

Page 124: ...ep applies only on IPL or when the topology file is changed, such as with switch partitioning. Secondary nodes send/receive service packets to/from the primary node only. The secondary node acknowledges...

Page 125: ...le in /etc/SP on the primary node. Estart: Starts the switch. If Estart can find the file expected.top in /etc/SP on the primary node, it will use that to initialize the switch. Otherwise it will transf...

Page 126: ...included. The Worm on the primary then distributes these to the nodes. In this way the link to the node is disabled. The nodes are not able to generate a switch fault by sending a packet to the fenced n...

Page 127: ...you start seeing random ports being reported over time, for example, ports 0, 1, and 7 reported at one time and ports 0, 1, and 6 reported at another time. Another strong indication of this problem is that...

Page 128: ...do it may not be possible to even ping that node Run the following command to resolve this on that node usr lpp ssp css ifconfig css0 switch IP address arp A detailed flow chart of this script can be...

Page 129: ...tion's fabric pool and nodes. Each chip is sent a Read Status service package to check cabling. Each chip is sent an Initialization service package to set the chip ID and specify routes to the primary. E...

Page 130: ...tral to the functioning of the switch. It can be viewed as having two main personalities: Primary Worm and Secondary Worm. There is a third personality with the SP Switch: Primary backup Worm. The primary Worm...

Page 131: ...ponds The spmon GUI for switch_responds has changed for the SP Switch Figure 54 shows the differences between them High Performance Switch SP Switch initial state red yellow on the switch green green...

Page 132: ...5. Calls the device driver to configure the device (build_dds_fail and dd_config_fail). 6. Marks the device css0 as available. 7. Executes POST diagnostics, with the different routines depending on the type o...

Page 133: ...ap which will create a compressed tar image of all the switch logs in the following format var adm SPlogs css hostname date css snap Z Run the command dsh a usr lpp ssp css css snap Then transfer all...

Page 134: ...led frame. The R indicates that the error was on the right-hand side of the link, which in this example is the node or switch adapter side rather than the switch chip port side. When Estart finds problem...

Page 135: ...to diagnose the problem. There are also many normal informational switch messages that you will see in the error report. For instance, when you issue an Estart, you will see a switch fault in the error re...

Page 136: ...re an adapter was expected usually because the previous was a wide node out top messages 1 1 Indicates that a switch resource is uninitialized It says that the initialization code never even got far e...

Page 137: ...makes sense when the node is not populated or when the cable has been removed for some reason. 11. 18 is similar to 17 but occurs much more rarely. 12. 19 is very similar to 7 and was introduced later in...

Page 138: ...e: a. The clock tree was not set up properly by issuing an Eclock with the correct Eclock topology file. b. The clock to this board is broken in this switch assembly, in the cable, or in the switch that sou...

Page 139: ...problem. It tells you which device saw the problem. The problem could be with that device, or it could be with the device to which it is connected. What you are looking for in the flt file is a pattern o...

Page 140: ...ence/unfence. If an on-board port (port 4 through 7) is called, the FRU is the switch assembly that houses the chip that is reporting the fault. If an adapter reports a fault and it has an MSTAT of 1234567...

Page 141: ...on to out.top notation so that you can look in the out.top file to determine the physical location of the connection. The important information in the flt file is: The device ID, which tells you which de...

Page 142: ...y of checking for errors, it is quite thorough, because in order to let an error escape, both chips would have to break in the same way at the same time. In the flt file, the master is noted as chip 0 and...

Page 143: ...ds a problem that it feels may cause a switch fault. True VDC errors are indicated by an MSTAT equal to 000000c0. Adapter/node errors are indicated by the other bits. These indicate either a problem with...

Page 144: ...In addition to the log files, there is always a possibility of errors being recorded in the error log, /var/adm/ras/errlog. These errors can be read by using the error report command, errpt. It is important...

Page 145: ...dures. In this way you can prevent many of the problems that you might otherwise encounter. In addition, it may be of assistance in deciding whether System Partitioning is suitable for the environment or...

Page 146: ...e the following command can be run to check the version of POWERparallel System Support Programs: lslpp -l | grep ssp. If there is a requirement to have more than a single AIX Version 3.2.5 partition, then...

Page 147: ...e how the issue of compatibility of code makes these limitations essential. Even when all the partitions are running the same versions of code, there is still good justification for these limitations. Ty...

Page 148: ...iguration where there are 2 wide nodes occupying the bottom 2 drawers, there are 10 thin nodes, the top drawer is empty, and the layout for the 4-node partition is chosen to contain node slots 1, 2, 5, and...

Page 149: ...le, where you would normally expect to see L01-N2, there is L01-N3. The node numbering in relation to the slot numbers remains consistent, but the physical cabling has changed to enable the connection of...

Page 150: ...ere imposed to prevent the product from becoming enormous in size. It is possible to obtain topology files that will support a configuration that is not shipped as standard in ssp.top by making a special re...

Page 151: ...de is responsible for generating the routing tables which it then distributes to the switch adapters on the nodes for storing in their NVRAM Each partition has its own primary node so that running Est...

Page 152: ...he switch could run indefinitely until an opportunity is available to carry out maintenance Similar considerations can be applied to the 12 way partition but the impact is minimal because 3 out of the...

Page 153: ...ther issue that should be considered in relation to performance when setting up System Partitioning on a multi switch system Take an example of a 5nsb 0isb configuration that is not currently partitio...

Page 154: ...s 1nsb0isb config 8_8 layout 1 syspar 1 spdata sys1 syspar_configs 1nsb0isb config 8_8 layout 1 syspar 2 Within these directories there are three files topology nodelist and custom Actually of these f...

Page 155: ...r_configs topologies any l2 8way6 0isb Note that there are 12 files 6 beginning with any l1 and 6 with any l2 As there are three possible layouts for an 8_8 configuration this explains why there are s...

Page 156: ...cy in the two filenames will cause Estart to fail In this example of an 8_8 configuration using layout 1 the two files that will be loaded into the SDR files directories for each partition should be s...

Page 157: ...tion of encountering any known problems4 If the system was previously installed at AIX Version 3 2 5 it is a good idea to run with the AIX Version 4 1 Control Workstation for a period of time to ensur...

Page 158: ...n mechanism If you are using the etc hosts file for this purpose then simply add the IP address and an appropriate name These hostnames will be used to identify your partitions when carrying out syste...

Page 159: ...nfiguration or Layout Select System Partition Layout Enter Customization Information for a Selected System Partition Apply System Partition Configuration Restore System Partition Configuration F1 Help...

Page 160: ...down or started up 5 3 3 Archiving the SDR Always begin the System Partitioning by archiving the SDR It is possible to return to the original configuration by reapplying it but this could involve a c...

Page 161: ...the IP address for the partition that is described in the layout that is chosen for customization In the preceding example this is layout 1 It is important to know which nodes are going to be in this...

Page 162: ...ctive partitions When the OK message is displayed after Enter has been pressed the data that had been entered will have been placed in a file called custom Based on the example above this will be in t...

Page 163: ...e partition Resolve these issues before you begin System Partitioning because only non fatal errors can be corrected by using this option that passes the F to the verparvsd command By choosing No Disc...

Page 164: ...according to the new partitions if there is one installed Validates that the VSD configuration is consistent with the partition being set up If the apply fails then record the error message that appe...

Page 165: ...output for the process name from the preceding command should look like: /usr/lpp/ssp/bin/sdrd 9.180.6.199 and /usr/lpp/ssp/bin/sdrd 9.180.6.198. Alternatively, the following command can be run to check the sa...

Page 166: ...e the IP address of the primary is the IP alias given to the second partition created that is not the default Following is the etc SDR_dest_info file from a node in the default partition default 9 180...

Page 167: ...DR_dest_info files on the problematic nodes have the correct IP address for the primary entry or equivalent opstation for AIX Version 3 2 5 nodes by referring to the output of the SDRGetObjects comman...
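
Checking the primary entry on a node can be scripted. The sketch below assumes that each line of the file has the form name:address, which should be verified against a real /etc/SDR_dest_info before use. The function name sdr_dest_entry is hypothetical.

```shell
# Hypothetical helper: print the address recorded for one entry
# (for example "default" or "primary") in an SDR_dest_info-style file.
# Assumes "name:address" lines; confirm the layout on your own system.
sdr_dest_entry() {
    # $1 = entry name, $2 = file to read (defaults to /etc/SDR_dest_info)
    awk -F: -v key="$1" '$1 == key { print $2 }' "${2:-/etc/SDR_dest_info}"
}
```

The value printed for primary can then be compared with the IP address reported for the node's partition by SDRGetObjects.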

Page 168: ...fs Each partition has its own directory identified by its associated IP address IP alias or Control Workstation IP address In the example given the following directories exist spdata sys1 sdr partitio...

Page 169: ...199 2 1 0 sppart2 9 180 6 198 3 2 1 sppart2 9 180 6 198 4 3 0 spcws 9 180 6 199 5 4 1 spcws 9 180 6 199 6 5 1 sppart2 9 180 6 198 7 6 1 sppart2 9 180 6 198 8 7 1 spcws 9 180 6 199 9 8 1 spcws 9 180 6...

Page 170: ...shown by the code_version attribute. Here is the same data for the second partition: syspar_name ip_address install_image syspar_dir code_version sppart2 9.180.6.198 default /spdata/sys1/syspar_configs/1n...

Page 171: ...d SDR daemons Moves data from the original Node Adapter host_responds and switch_responds to the newly created ones but only for the affected nodes Opens a custom file in the syspar directory to get t...

Page 172: ...not always signify this since the node may be up but the connectivity on this interface may be lost The heartbeat daemon runs on each node and the Control Workstation At PSSP 2 1 the daemon is called...

Page 173: ...he object class and refresh the host_responds GUI The heartbeat daemon on the Control Workstation will send out regular ping packets known as proclaim packets to any nodes that have been marked as mis...

Page 174: ...ecessary If there is no connectivity across any of the interfaces then check to see what the status is of any users that are already logged in If their sessions are hung that is if there is no respons...

Page 175: ...em name from which to determine which daemon is servicing which partition With these distinct subsystems on the Control Workstation each partition has its own separate and distinct heartbeat ring Thes...

Page 176: ...at daemon can be put into debug mode on the spcws partition by issuing the following commands: export SP_NAME=spcws and /usr/lpp/ssp/bin/hb debug. This heartbeat daemon will now send all stdout to the consol...

Page 177: ...f they are installed The heartbeat daemons will start up with the p1 flag set if they are required to wait for these VSD daemons Otherwise the heartbeat daemons get started with the p0 flag and this o...

Page 179: ...also included in the bos.sysmgt.serv_aid, for example, the sysdumpstart command. To determine if this package is installed, run the command: lslpp -l | grep bos.sysmgt.serv_aid. If the package is installed a...

Page 180: ...system problems the system deletes hardware related entries older than 90 days from the error log and software related entries 30 days after logging See the errclear command lines in the crontab file...

Page 181: ...er Writes an operator message entry to the error log errmsg Implements error logging in applications The errmsg command will list add or delete messages stored in the error message catalog With this c...

Page 182: ...error logging system synchronizing the error logging system and querying the status of the error special file 6 2 SP Error Logging The RS 6000 SP uses both the AIX Error Logging facilities and the BS...

Page 183: ...be identified to the SP authentication services All other log management commands additionally require that the user be defined as a principal in the etc logmgt acl file All users defined in this fil...

Page 184: ...s pdf and pfck include buildTop sysctl bin pdfpfck cmds Include pfps commands include buildTop sysctl bin pfps cmds Include rcmds create class rcmds buildTop samples sysctl rcmds rcmds cmds Include Lo...

Page 185: ...e to list error log messages and syslog messages in one single report To access the syslog menu in SMIT enter smit spsyslog Generating Reports on BSD syslog Log Files The syslogd daemon which logs the...

Page 186: ...Fn File name SID SID level of the file L Line number or function Note You must be Kerberos authenticated to run the psyslrpt command The fast path to invoke for the Generate a Syslog Report SMIT menu...

Page 187: ...e msg_src_list is a semicolon separated list of facility priority where facility is all except mark mark time marks kern user mail daemon auth see syslogd AIX Commands Reference priority is one of fro...

Page 188: ...th a slash, it is interpreted as a file containing a list of nodes to execute the command on. Otherwise, it can be a list of host names. Note: If neither the -a nor -w options are used, psyslrpt defaults to t...
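
The rule for the node argument (a leading slash means a file of node names, anything else is a host list) can be sketched as follows. The function name expand_nodes is hypothetical, and the comma separator for host lists is an assumption.

```shell
# Hypothetical helper: expand a node argument into one host name per line.
# A leading slash means "file containing node names"; otherwise the argument
# is treated as a comma-separated host list (the separator is an assumption).
expand_nodes() {
    case "$1" in
        /*) cat "$1" ;;
        *)  printf '%s\n' "$1" | tr ',' '\n' ;;
    esac
}
```

For example, expand_nodes sp21n7,sp21n8 prints the two host names on separate lines, while expand_nodes /tmp/nodes prints the contents of that file.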

Page 189: ...and the nodes contain an entry to clean up the logs. The following example shows the crontab entries for root on a Control Workstation: 0 11 * * * /usr/bin/errclear -d S,O 30; 0 12 * * * /usr/bin/errclear -d H 90; 01 5 u...

Page 190: ...ist of nodes to execute the command on Otherwise it can be a list of host names The f and p options can be used to control selecting files in the configuration file All files found in the configuratio...

Page 191: ...ompressed tar files created on each target node in the directory var archives arc_weekly tab to the directory var logrepos on the local node issue from the command line splm a gather k archive t spdat...

Page 192: ...where the archive collection will be stored on each node The default is var adm archives k type For the gather function only this option indicates whether a service collection or archive is being coll...

Page 193: ...he node list or add additional collection commands if needed. In this table, the following target nodes need to be edited: allnodes (all nodes in the SP system), amd_server (amd server node, usually the cws), cws...

Page 194: ...r log processing requires node information from the ODM database on each node. Therefore, it is better to use the dsh command in combination with the errpt command and options to view the error log...

Page 195: ...e error using the sequence of the error passed by the notification facility to the EN_MAILLOC Note that this will only happen if the variables EN_RUNDEFAULT and EN_MAILLOC are set 5 EN_pend checks for...

Page 196: ...s spdata sys1 err_methods EN_pend spdata sys1 err_methods EN_hdisk0 dsh w sp21n7 sp21n8 sp21n9 sp21n10 ln s spdata sys1 err_methods EN_pend envs spdata sys1 err_methods EN_hdisk0 envs 4 Add the notifi...

Page 197: ...ontract with IBM Corp CPRY Description Default error notification script for pend errors Syntax example Add as EN_pend 1 Internal Ref None 96 1 1 src ssp logmgt bin EN_pend sysman ssp_r2 4 r2_4t6d6 4...

Page 198: ...v null 2 1 if 0 then CWS usr lpp ssp bin SDRGetObjects SP control_workstation awk NR 1 if CWS then export EN_MAILLOC root CWS fi fi Figure 75 spdata sys1 err_methods EN_pend envs 6 3 1 Error Notificat...

Page 199: ...SP error occurs The error notification will perform an ODM method defined by the administrator when a particular error occurs or a particular process fails The following classifications of errors shou...

Page 200: ...ed should have the en_persistenceflg set to 0 en_name Uniquely identifies the object The creator uses this unique name when removing the object en_persistenceflg Designates whether the error notificat...

Page 201: ...resource name from the error log entry 7 The resource type from the error log entry 8 The resource class from the error log entry 9 The error label from the error log entry Use the en_persistenceflg...
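
A notify method receives these values as positional arguments. The sketch below uses only the positions named in the guide: $6 to $9 from the list above, and $1 as the sequence number, as in the errpt -a -l $1 calls in the sample scripts. The function name summarize_error and the output format are hypothetical.

```shell
# Hypothetical notify-method fragment: build a one-line summary from the
# positional arguments passed by the error notification facility.
# $1 = sequence number, $6 = resource name, $7 = resource type,
# $8 = resource class, $9 = error label.
summarize_error() {
    printf 'seq=%s label=%s resource=%s type=%s class=%s\n' \
        "$1" "$9" "$6" "$7" "$8"
}
```

A line like this could be placed in the subject of the mail sent to the administrator, with the full errpt report in the body.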

Page 202: ...tification object for example tbx_diagerr obj enter the following errnotify en_name tbx_diagerr obj en_persistenceflg 1 en_label HPS_DIAG_ERROR2_ER en_method define_path methods errnot 1 Note that the...
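
A notification object such as the one above is normally written to a stanza file and loaded with odmadd. The sketch below generates such a stanza. The function name make_errnotify_stanza is hypothetical, and the method path in the usage example is a placeholder rather than the path elided in the guide.

```shell
# Hypothetical generator for an errnotify stanza.
# $1 = object name, $2 = error label, $3 = method command line.
# en_persistenceflg is fixed at 1 here, as in the example object.
make_errnotify_stanza() {
    printf 'errnotify:\n'
    printf '\ten_name = "%s"\n' "$1"
    printf '\ten_persistenceflg = 1\n'
    printf '\ten_label = "%s"\n' "$2"
    printf '\ten_method = "%s"\n' "$3"
}
```

For example, make_errnotify_stanza tbx_diagerr.obj HPS_DIAG_ERROR2_ER '/path/to/methods/errnot $1' > /tmp/tbx_diagerr.obj writes a file that could then be passed to odmadd.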

Page 203: ...or Notify Keep the method scripts on each node so you can run them if network or distributed file system problems occur Using File Collections is an excellent way to keep these scripts updated The obj...

Page 204: ...vendors errnotify en_name errnot PEND obj en_persistenceflg 1 en_label PEND en_method define_path methods errnot 1 errnotify en_name errnot Pend obj en_persistenceflg 1 en_label Pend en_method define...

Page 205: ...pt a l 1 tmp errnot Mail the full expanded error report to root controlworkstation This is the user and the hostname that the administrator wants to be notified at They could be anywhere in the system...

Page 206: ...nd to the PID of this script errpt a l 1 tmp errnot Mail the full expanded error report to root controlworkstation This is the user and the hostname that the administrator wants to be notified at They...

Page 207: ...system startup this last error entry is read from NVRAM and added to the error log when errdemon is started errdemon does NOT create an error log entry for the logged error if the error record templa...

Page 208: ...g file When the buffer is full new entries are discarded until space becomes available in the buffer When this situation occurs errdemon creates an error log entry to inform you about the problem You...

Page 209: ...n action field separated by one or more tabs The selector field names a facility and a priority level You can separate facility names with a comma and separate the facility and priority level portions...
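
The layout of a syslog.conf rule described above can be illustrated by splitting one rule into its parts. This sketch assumes a single selector per line (facility names, a period, then the priority) followed by one or more tabs and the action. The function name parse_syslog_rule and the log file path in the example are hypothetical.

```shell
# Hypothetical helper: split one syslog.conf rule into facility list,
# priority, and action. Assumes "facility[,facility].priority<TAB>action".
parse_syslog_rule() {
    printf '%s\n' "$1" | awk -F'\t+' '{
        split($1, sel, /\./)   # selector = facilities "." priority
        printf "facilities=%s priority=%s action=%s\n", sel[1], sel[2], $2
    }'
}
```

Given the rule daemon,auth.notice followed by tabs and /var/log/example.log, the helper reports the two facilities, the notice priority, and the target file.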

Page 210: ...improper login attempts LOG_CRIT and higher priority messages are sent to the system console err LOG_ERR Represents an error condition for example an unsuccessful disk write warning LOG_WARNING Messag...

Page 211: ...rms the following steps 1 It creates directories in var adm that the logging daemon uses Of course this will only be done if they do not already exist 2 It adds an entry to the file etc syslog conf fo...

Page 212: ...ge or both. function: Specifies the program to call when the event occurs. There are two special keywords for function. If function is SP_ERROR_LOG, error logging is performed, provided that syslog is set up an...

Page 213: ...e splogd daemon is set up to be respawnable and to be the only instance of the splogd daemon running on that particular node or Control Workstation Important Do not start splogd from the command line...

Page 215: ...and problems 7 1 1 Booting Process Overview In order to network boot and install nodes they must be defined as NIM clients for the NIM master in NIM jargon or boot install server in SP jargon When the...

Page 216: ..._server This script executes several NIM commands to allocate the necessary NIM resources to each client Note that if you are installing several nodes with several NIM masters setup_server will run in...

Page 217: ...ult to determine which resource mount failed On the NIM master verify that NFS is functioning correctly and that the appropriate resources have been exported correctly If you have a gateway between th...

Page 218: ...sage window is displayed after opening the System Monitor GUI then verify authorization using klist and request authorization using the kinit command If the problem persists check the spdata sys1 spmo...

Page 219: ...t This command is executed when you want to start the switch However this command is not the real one the real one resides in the primary node and is called Estart_sw Chapter 4 The Switch on page 87 d...

Page 220: ...rtitioned when a system partition is applied Nevertheless many components remain systemwide Figure 80 System Partitioning Problems System partitioning problems usually arise from two sides Problems ap...

Page 221: ...your node to hang while it waits for an answer from the serial link This chapter gives a brief overview of system dumps and provides examples with commands and scripts you can use to handle them 8 1 H...

Page 222: ...s that were running in kernel mode at that time System Buffers This is the area where incoming data is stored while awaiting memory allocation from the kernel TTY information Contains information abou...

Page 223: ...device Operator intervention is required when a dump is to be written to the secondary dump device When you install the operating system the dump device is automatically configured for you By default...

Page 224: ...in which the dump occurred If the dump device is not large enough the system will produce a partial dump only It is possible but extremely unlikely that a Support Center can determine the cause of the...

Page 225: ...orkstation. You can initiate a dump either to the primary or secondary dump device by using the following key sequence: Ctrl-Alt-NumPad1. Using the reset button: This procedure works for all system config...

Page 226: ...e changes find the new value on the list If the value does not change then the dump did not complete due to an unexpected error 8 1 2 Copying a System Dump The following section discusses details of c...

Page 227: ...Previous Menu Choice 0 Note You will need to attach an external device to the SP nodes in order to use this option To copy a system dump after rebooting in normal mode do the following If it is the Co...

Page 228: ...epeatedly and this is seriously impacting the use of the machine the dump can be sent to the Support Center through your normal first level support channel for analysis The Support Center will analyze...

Page 229: ...he previously created files Use the r flag to remove previously gathered and saved information Before you send your media to the Support Center make sure you contact the Center and obtain a Problem Ma...

Page 230: ...following output will appear on the screen WARNING dump file does not appear to match namelist Reading in symbols If the dump itself is corrupted in some way then the following will appear on the scre...

Page 231: ...table entry 37 Ignore messages reading cannot read process table entry This shows that the process running at the time of the crash was sysdumpstart and it was running in slot 46 It also shows find w...

Page 232: ...e crash command you know that you have a full dump that can be analyzed You should avoid sending dumps to the Support Center only to find out that the Center cannot do anything about them because they...

Page 233: ...es can be defined as a file collection and changes to any files in that collection can be propagated to all nodes The user admin file collection is provided for propagating user administration files t...

Page 234: ...s 9 2 2 ssp sysman 2 1 0 7 This component includes the following functions User management File collections Auto Mount Daemon Print support Time services 9 3 Managing User Accounts In order to have th...

Page 235: ...me sp21cw0 Home Directory Path home sp21cw0 File Collection Management true File Collection daemon uid 102 File Collection daemon port 8431 SP Accounting Enabled false SP Accounting Active Node Thresh...

Page 236: ...directory for user administration files are also stored in the System Data Repository These default values may be changed However the etc passwd and etc security passwd files are updated from the Con...

Page 237: ...topic 9 4 Managing File Collections on page 218 for further information on the File Collection Step 1 will refresh the Amd mount maps If this command is not run before the new user attempts to login t...

Page 238: ...login Figure 87 Example of Login by Blocked User ID The command spacs_cntrl unblock user1 can be used to regain login access for user1 on a node For further information on User Access Problems see Ch...

Page 239: ...or messages similar to those shown in Figure 89 are most likely a result of mismatched levels of PSSP code on the Control Workstation and the nodes Ensure that PSSP components are at the same level on...

Page 240: ...if it exists contains a list of files to be excluded from all the file collections Verify the contents of the etc ssp server_name file This tells supper from which host it should update files 7 suppe...

Page 241: ...sed to manage NFS mounting of home directories and other directories When configured Amd mounts filesystems on demand when they are first referenced and also unmounts them after a period of inactivity...

Page 242: ...50 26 dev hd2 589824 11936 98 11188 16 usr dev hd9var 8192 5440 34 334 33 var dev hd3 24576 23320 6 40 1 tmp dev hd1 8192 7840 5 18 2 home sp21cw0 tony 40960 39312 5 a sp21cw0 tony sp21cw0 tony 40960...

Page 243: ...fter creating a new Amd map file, use dsh to copy the file to each node. Then run /etc/amd/amd_start -f to restart Amd. This makes it unnecessary to reboot all the nodes that have the new map. 5. If one of t...

Page 244: ...d auto on auto fstype toplvl etc amd amd_start kill TERM 11324 etc amd amd_start Starting Amd on Fri May 10 13 31 04 CDT 1996 etc amd amd_start Started Amd on Fri May 10 13 31 04 CDT 1996 May 10 13 31...

Page 245: ...e print server through rsh as well which could pose a security exposure If print_config secure or open then the following sequence will configure Print Services 1 etc rc sp is started from etc inittab...

Page 246: ...zcat ntp tar Z tar xvf ntp doc xntpdc 8 zcat ntp tar Z tar xvf ntp doc ntpq 8 zcat ntp tar Z tar xvf ntp doc ntpdate 8 zcat ntp tar Z tar xvf ntp doc ntptrace 8 mv ntp doc xntp 8 usr man man8 mv ntp...

Page 247: ...i etc rc ntp done This script will stop the NTP daemon on each node and then restart it This will synchronize the time on the nodes with the Control Workstation Use dsh a rm rhost to delete the rhost...

Page 249: ...SP scripts which can be used as a reference for problem determination A 1 The setup_authent Script The setup_authent script provides the initial setup for the Kerberos authentication services This scr...

Page 250: ...Figure 102. setup_authent Script Flow Chart (2/7)...

Page 251: ...Figure 103. setup_authent Script Flow Chart (3/7)...

Page 252: ...Figure 104. setup_authent Script Flow Chart (4/7)...

Page 253: ...Figure 105. setup_authent Script Flow Chart (5/7)...

Page 254: ...Figure 106. setup_authent Script Flow Chart (6/7)...

Page 255: ...Figure 107. setup_authent Script Flow Chart (7/7)...

Page 256: ...ipt This script is invoked during the installation procedure The CW is installed here creating the directory structure and setting up the process and subsystems infrastructure that will manage the RS...

Page 257: ...Figure 109. install_cw Script Flow Chart (2/3)...

Page 258: ...Figure 110. install_cw Script Flow Chart (3/3)...

Page 259: ...er Script This script is run to check and configure boot install servers It can be run from the Control Workstation or from the nodes which will be used as boot install servers Figure 111 setup_server...

Page 260: ...Figure 112. setup_server Script Flow Chart (2/23)...

Page 261: ...Figure 113. setup_server Script Flow Chart (3/23)...

Page 262: ...Figure 114. setup_server Script Flow Chart (4/23)...

Page 263: ...Figure 115. setup_server Script Flow Chart (5/23)...

Page 264: ...Figure 116. setup_server Script Flow Chart (6/23)...

Page 265: ...Figure 117. setup_server Script Flow Chart (7/23)...

Page 266: ...Figure 118. setup_server Script Flow Chart (8/23)...

Page 267: ...Figure 119. setup_server Script Flow Chart (9/23)...

Page 268: ...Figure 120. setup_server Script Flow Chart (10/23)...

Page 269: ...Figure 121. setup_server Script Flow Chart (11/23)...

Page 270: ...Figure 122. setup_server Script Flow Chart (12/23)...

Page 271: ...Figure 123. setup_server Script Flow Chart (13/23)...

Page 272: ...Figure 124. setup_server Script Flow Chart (14/23)...

Page 273: ...Figure 125. setup_server Script Flow Chart (15/23)...

Page 274: ...Figure 126. setup_server Script Flow Chart (16/23)...

Page 275: ...Figure 127. setup_server Script Flow Chart (17/23)...

Page 276: ...Figure 128. setup_server Script Flow Chart (18/23)...

Page 277: ...Figure 129. setup_server Script Flow Chart (19/23)...

Page 278: ...Figure 130. setup_server Script Flow Chart (20/23)...

Page 279: ...Figure 131. setup_server Script Flow Chart (21/23)...

Page 280: ...Figure 132. setup_server Script Flow Chart (22/23)...

Page 281: ...Figure 133. setup_server Script Flow Chart (23/23)...

Page 282: ...A.4 The rc.switch Script. Figure 134. rc.switch Script Flow Chart (1/8)...

Page 283: ...Figure 135. rc.switch Script Flow Chart (2/8)...

Page 284: ...Figure 136. rc.switch Script Flow Chart (3/8)...

Page 285: ...Figure 137. rc.switch Script Flow Chart (4/8)...

Page 286: ...Figure 138. rc.switch Script Flow Chart (5/8)...

Page 287: ...Figure 139. rc.switch Script Flow Chart (6/8)...

Page 288: ...Figure 140. rc.switch Script Flow Chart (7/8)...

Page 289: ...Figure 141. rc.switch Script Flow Chart (8/8)...

Page 291: ...o represented by SP subsystems that query the SDR to retrieve information about the SP configuration or values for state variables The organization of the SDR database is based on object classes Each...

Page 293: ...he SDR objects that reference the hostname and IP address changes are Adapters Deals with ent css0 tr0 fi0 IP addresses and adapters Frame Deals with the MACN CWS attribute and works with hostnames No...

Page 294: ...e CW hostname RS 6000 SP nodes etc ssp server_name Specifies IP address and hostname of RS 6000 SP server of the boot install adapter servers RS 6000 SP nodes etc ssp server_hostname Specifies IP addr...

Page 295: ...6000 SP nodes have completed the reboot and customized you should verify that the files on RS 6000 SP nodes reflect the new IP address hostname changes The added Install support in AIX 4 1 for NIM req...

Page 296: ...reference the new node client hostname and IP address d Make sure that SP_NAME environment variable is updated or blank e Execute a usr lpp ssp kerberos bin kdestroy to remove any active kerberos tick...

Page 297: ...realms and etc krb srvtab reference the new hostname 7 You need to now create the new source master resources for the sdr hb and hr daemons This may be possible by executing the usr lpp ssp inst_root...

Page 298: ...t to re execute the system partitioning steps that are specified in Chapter 5 of the System Administrators Guide Update other files on the CW that may reflect IP address or hostname changes a Update a...

Page 299: ...e and reliable_hostname point to the correct IP address hostnames The files etc krb conf etc krb realms and etc krb srvtab have the correct hostnames defined 2 You may have to modify the following fil...

Page 301: ...ctor of Licensing IBM Corporation 500 Columbus Avenue Thornwood NY 10594 USA Licensees of this program who wish to have information about it for the purpose of enabling i the exchange of information b...

Page 302: ...sed by IBM Corporation under license UNIX is a registered trademark in the United States and other countries licensed exclusively through X Open Company Limited Microsoft Windows and the Windows 95 lo...

Page 303: ...Networking and Systems Management Redbooks Collection SBOF 7370 SK2T 6022 Transaction Processing and Data Management Redbook SBOF 7240 SK2T 8038 AS 400 Redbooks Collection SBOF 7270 SK2T 2849 RISC Sy...

Page 305: ...nadian users only To get lists of redbooks TOOLS SENDTO WTSCPOK TOOLS REDBOOKS GET REDBOOKS CATALOG TOOLS SENDTO USDIST MKTTOOLS MKTTOOLS GET ITSOCAT TXT TOOLS SENDTO USDIST MKTTOOLS MKTTOOLS GET LIST...

Page 306: ...ent Listserver To initiate the service send an E mail note to announce webster ibmlink ibm com with the keyword subscribe in the body of the note leave the subject line blank IBMMAIL Internet In Unite...

Page 307: ...Company Address City Postal code Country Telephone number Telefax number VAT number Invoice to customer number Credit card number Credit card expiration date Card issued to Signature We accept America...

Page 309: ...Liquid Crystal Display LED Light Emitting Diode LRU Least Recently Used LSC Link Switch Chip LVM Logical Volume Manager MIB Management Information Base MPI Message Passing Interface MPL Message Passin...

Page 311: ...rame 1 disk space requirements 11 Control Workstation continued General Description 1 maximum number of processes 14 number of license users 14 preparing for install 8 prerequisites 10 required steps...

Page 312: ...alization switch 108 install_cw script 236 See also PSSP Scripts ioctl system call 162 ISB 96 See also Switch Isolating problems 195 J jm_config file 157 K Kerberos See Authentication Services L LED C...

Page 313: ...nvironment variable 40 PATH environment variable for PSSP 8 PEND error type 180 penotify command 179 Perf error type 180 PERL See Practical Extraction and Report Language PERM error type 180 Phase Loc...

Page 314: ...t 74 ssp basic 74 90 126 214 ssp css 90 ssp sysman 162 214 ssp top 126 130 Supervisor Card 2 supper See User Management Switch 128 way system example 89 48 way system example 90 Clock files 102 Coexis...

Page 315: ...System Partitioning Applying 144 Archiving the SDR 140 default 146 Directory 134 example 128 Limitations 127 Overview 125 partition name 141 primary 146 Process Overview 139 Restoring the SDR 144 Rul...

Page 316: ...IBML This soft copy for use by IBM employees only Printed in U S A SG24 4778 00...
