Sun StorEdge 3900 and 6900 Series 2.0 Troubleshooting Guide • March 2003

Sun Proprietary/Confidential: Internal Use Only

Related Documentation

Entries are grouped by product. Each entry gives the title followed by its part number in parentheses.

Product: Late-breaking News
• Sun StorEdge 3900 and 6900 Series 2.0 Release Notes (816-5254)

Product: Sun StorEdge 3900 and 6900 series information
• Sun StorEdge 3900 and 6900 Series 2.0 Installation Guide (816-5252)
• Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Guide (816-5253)
• Sun StorEdge 3900 and 6900 Series 2.0 Regulatory and Safety Compliance Manual (816-5257)
• Sun StorEdge 3900 and 6900 Series 2.0 Site Prep Guide (816-5256)

Product: Sun StorEdge T3 and T3+ array
• Sun StorEdge T3+ Array Release Notes (816-4771)
• Sun StorEdge T3+ Array Start Here (816-4768)
• Sun StorEdge T3 and T3+ Array Regulatory and Safety Compliance Manual (816-0774)
• Sun StorEdge T3+ Array Installation and Configuration Manual (816-4769)
• Sun StorEdge T3+ Array Administrator’s Guide (816-4770)
• Sun StorEdge T3 Array Cabinet Installation Guide (806-7979)

Product: Diagnostics
• Storage Automated Diagnostics Environment User’s Guide (816-3142)

Product: Sun StorEdge SAN 4.0 (1 Gb switches)
• Sun StorEdge SAN 4.0 Release Guide to Documentation (816-4470)
• Sun StorEdge SAN 4.0 Release Installation Guide (816-4469)
• Sun StorEdge SAN 4.0 Release Configuration Guide (806-5513)
• Sun StorEdge Network 2 Gb FC Switch-16 FRU Installation (816-5285)
• Sun StorEdge SAN 4.0 Release Notes (816-4472)

Product: Sun StorEdge SAN 4.1 (2 Gb switches)
• Sun StorEdge SAN 4.1 Release Guide to Documentation (817-0061)
• Sun StorEdge SAN 4.1 Release Installation Guide (817-0056)
• Sun StorEdge SAN 4.1 Release Configuration Guide (817-0057)
• Sun StorEdge SAN 4.1 2 Gb Brocade Silkworm Fabric Switch Guide to Documentation (817-0062)
• Sun StorEdge SAN 4.1 2 Gb McData Intrepid Director Switch Guide to Documentation (817-0063)
• Sun StorEdge SAN 4.1 Release Notes (817-0071)

Product: 3Com Ethernet hubs
• SuperStack 3 Baseline Hub 12-Port TP User Guide (3C16440A)
• SuperStack 3 Baseline Hub 24-Port TP User Guide (3C16441A)

Summary of Contents for StorEdge 3900 Series

Page 1: ...c. 4150 Network Circle, Santa Clara, CA 95054 U.S.A. 650-960-1300. Send comments about this document to: docfeedback@sun.com. Sun StorEdge 3900 and 6900 Series 2.0 Troubleshooting Guide. Part No. 816-5255-12, March 2003, Revision A ...

Page 2: ...003 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, United States. All rights reserved. Sun Microsystems, Inc. holds the intellectual property rights relating to the technology embodied in the product described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.su...

Page 3: ...I Accessing Sun Documentation Online XX Sun Welcomes Your Comments XX 1 Introduction 1 Predictive Failure Analysis PFA Capabilities 2 2 General Troubleshooting Procedures 3 High Level Troubleshooting Tasks 3 Host Side Troubleshooting 6 Storage Service Processor Side Troubleshooting 6 Verifying the Configuration Settings 7 To Verify Configuration Settings 7 Clearing the Lock File 10 To Clear the Lo...

Page 4: ...ubleshooting Tools 23 Storage Automated Diagnostic Environment 2 2 23 Example Topology 24 Generating Component Specific Event Grids 25 To Customize an Event Report 25 Microsoft Windows 2000 System Errors 26 Command Line Test Examples 27 qlctest 1M 27 switchtest 1M 28 Monitoring Sun StorEdge T3 and T3 Arrays Using the Explorer Data Collection Utility 29 To Install the Explorer Data Collection Utili...

Page 5: ...fying the Data Host 56 Verifying the Storage Service Processor Side 57 FRU Tests Available for the A3 or B3 FC Link Segment 57 To Isolate the A3 or B3 FC Link 58 Quiescing the I O on the A3 or B3 Link 59 Suspending the I O on the A3 to B3 Link 59 Troubleshooting the A4 or B4 FC Link 60 Verifying the Data Host 62 Sun StorEdge 3900 Series 62 Sun StorEdge 6900 Series 62 FRU Tests Available for the A4...

Page 6: ...ting the Sun StorEdge T3 Array Devices 87 Troubleshooting the T1 or T2 Data Path 88 Notification Events 89 To Verify the Storage Service Processor 92 FRU Tests Available for the T1 or T2 Data Path FRU 93 To Isolate the T1 or T2 Data Path 94 Sun StorEdge T3 Array Event Grid 95 To Use the Sun StorEdge T3 Array Event Grid 95 9 Troubleshooting Virtualization Engine Devices 107 About the Virtualization...

Page 7: ...ices 117 Viewing the Virtualization Engine Map 118 To Failback the Virtualization Engine 120 Manually Clearing and Restoring the SAN Database 123 To Reset the SAN Database on Both Virtualization Engines 124 To Reset the SAN Database on a Single Virtualization Engine 125 Restarting the slicd Daemon 126 To Restart the slicd Daemon 126 Diagnosing a creatediskpools 1M Failure 129 Virtualization Engine...

Page 8: ...ommand Line Interface CLI 142 11 Example of Fault Isolation 147 A Virtualization Engine References 155 SRN Reference 155 SRN SNMP Single Point of Failure Descriptions 159 Port Communication Numbers 160 Virtualization Engine Service Codes 160 B Configuration Utility Error Messages 163 Virtualization Engine Error Messages 164 Switch Error Messages 168 Sun StorEdge T3 Array Partner Group Error Messag...

Page 9: ...Log 26 FIGURE 3 3 Qlogic SANblade Manager HBA Driver and Firmware Versions 33 FIGURE 3 4 QLogic SANblade Manager Diagnostics 34 FIGURE 5 1 Sun StorEdge 3900 Series FC Link Diagram 39 FIGURE 5 2 Sun StorEdge 6900 Series FC Link Diagram 41 FIGURE 5 3 Data Host Notification of Intermittent Problems 43 FIGURE 5 4 Data Host Notification of Severe Link Error 43 FIGURE 5 5 Storage Service Processor Notif...

Page 10: ...10 3 Healthy Sun StorEdge 3900 series system shown using Multipath Configurator 140 FIGURE 10 4 Sun StorEdge 3900 series system with a LUN failover shown using Multipath Configurator 141 FIGURE 10 5 Multipath Configurator Array Properties 141 FIGURE 10 6 Multipath Configurator LUN Properties Detail 142 FIGURE 10 7 Sun StorEdge T3 Array Failover Driver CLI Output for the Sun StorEdge 3900 Series 14...

Page 11: ... FIGURE 11-8 Successful Switch Test Results 153 FIGURE 11-9 Multipath Recovery using the Sun StorEdge T3 Array Multipath Configurator 154 FIGURE 11-10 Recovered Paths 154 ...

Page 13: ...BLE 0 1 setupswitch Exit Values 85 TABLE 8 1 Storage Automated Diagnostic Environment Event Grid for the Sun StorEdge T3 Array 96 TABLE 9 1 Virtualization Engine LEDs 110 TABLE 9 2 LED Diagnostic Codes 111 TABLE 9 3 Speed Activity and Validity of the Link 112 TABLE 9 4 Virtualization Engine Statistical Data 113 TABLE 9 5 Storage Automated Diagnostic Environment Event Grid for Virtualization Engine...

Page 14: ...al Use Only TABLE A 5 Virtualization Engine Service Codes 400 599 Device Side Interface Driver Errors 162 TABLE B 1 Virtualization Engine Error Messages 164 TABLE B 2 Sun StorEdge Network FC Switch Error Messages 168 TABLE B 3 Sun StorEdge T3 Array Error Messages 171 TABLE B 4 Other SUNWsecfg Error Messages 175 ...

Page 15: ...uide is written for Sun™ personnel who have been fully trained on all the components in the configuration. How This Book Is Organized: This book contains the following topics. Chapter 1 introduces the Sun StorEdge 3900 and 6900 series storage subsystems. Chapter 2 offers general troubleshooting guidelines, such as manually halting the I/O and returning paths to production. Chapter 3 presents informatio...

Page 16: ...tipath configurator. Chapter 11 provides an example of fault isolation. It begins with how to discover an error and shows the user the steps that are necessary for resolution. Appendix A provides virtualization engine references, including Service Request Numbers (SRNs) and Simple Network Management Protocol (SNMP) Reference, an SRN/SNMP single point of failure table, and port communication and service code tab...

Page 17: ... AaBbCc123 What you type when contrasted with on screen computer output su Password AaBbCc123 Book titles new words or terms words to be emphasized Read Chapter 6 in the User s Guide These are called class options You must be superuser to do this Command line variable replace with a real name or value To delete a file type rm filename Shell Prompt C shell machine name C shell superuser machine nam...

Page 19: ...Switch Installer s User s Manual SANbox 16 Segmented Loop Fibre Channel Switch Installer s User s Manual 875 3060 875 1881 875 3059 Expansion cabinet Sun StorEdge Expansion Cabinet Installation and Service Manual 805 3067 Storage Server Processor Sun V100 Server User s Guide Netra X1 Server User s Guide Netra X1 Server Hard Disk Drive Installation Guide 806 5980 806 5980 806 7670 Product Title Par...

Page 20: ... purchase a broad selection of Sun documentation, including localized versions, at: http://www.sun.com/documentation. Sun Welcomes Your Comments: Sun is interested in improving its documentation and welcomes your comments and suggestions. You can email your comments to Sun at: docfeedback@sun.com. Please include the part number (816-5255) of your document in the subject line of your email ...

Page 21: ...nal Array Partner Groups Supported with Optional Additional Expansion Cabinet Virtualization Engine Sun StorEdge 3900 series 3900SL2 Sun StorEdge 3910 system Sun StorEdge 3960 system Two 8 port switches Two 16 port switches One to four One to four N A One to five N A Sun StorEdge 6900 series 6910SL3 6960SL3 Sun StorEdge 6910 system Sun StorEdge 6960 system Four 8 port switches Four 16 port switche...

Page 22: ...Many devices, like the Sun StorEdge network FC switch-8 and switch-16 switch and the Sun StorEdge T3 array, cause Storage Automated Diagnostic Environment alerts to be sent if the temperature thresholds are exceeded. This enables Sun-trained personnel to address the problem before the component and enclosure fail. Single Point of Failure (SPOF) notification: Storage Automated Diagnostic Environment notifica...

Page 23: ...level steps you can take to isolate and troubleshoot problems in the Sun StorEdge 3900 and 6900 series. It offers a methodical approach and lists the tools and resources available at each step. Note: A single problem can cause various errors throughout the storage area network (SAN). A good practice is to begin by investigating the devices that have experienced Loss of Communication events in the Storag...

Page 24: ...ecking functionality, determine whether the package or patch is installed. Verify the functionality using one of the following tools: checkdefaultconfig(1M), cfgadm -al output, luxadm(1M) output. Review the multipathing status using the Sun StorEdge Traffic Manager (MPxIO) software or the vxdmp(1M) command. 3. Check the status of a Sun StorEdge T3 array by using one or more of the following methods: Review the Stora...

Page 25: ...cessor you must export X Display. 5. Check the status of the virtualization engine using one or more of the following methods: Review the Storage Automated Diagnostic Environment device monitoring reports. Run the checkve(1M), checkvemap(1M), and showvemap(1M) commands, which check and display the virtualization host and LUN configurations. Refer to the LED status blink codes ("Virtualization Engine LEDs" on pa...

Page 26: ...tware. Restart the application. Host Side Troubleshooting: Host-side troubleshooting refers to the messages and errors that the data host detects. Usually, these messages appear in the /var/adm/messages file. Storage Service Processor Side Troubleshooting: Storage Service Processor-side troubleshooting refers to messages, alerts, and errors that the Storage Automated Diagnostic Environment detects while run...

Page 27: ...ript to check all accessible components. The output is shown in CODE EXAMPLE 2-1. Run the checkswitch(1M), checkt3config(1M), checkve(1M), and checkvemap(1M) scripts from /opt/SUNWsecfg/bin to check the settings on the Sun StorEdge network FC switch-8 and switch-16 switches, the Sun StorEdge T3 array, and the virtualization engine. The scripts check the default configuration files in the /opt/SUNWsecfg/etc director...

Page 28: ...0 Configuration Checking command ver PASS Checking command vol stat PASS Checking command port list PASS Checking command port listmap PASS Checking command sys list FAIL Failure Noted Checking T3 t3b2 Checking t3b2 Configuration Checking command ver PASS Checking command vol stat PASS Checking command port list PASS Checking command port listmap PASS Checking command sys list PASS snip Checking V...
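
The PASS and FAIL results in the excerpt above can be collected with a short script. The following is a minimal sketch in Python, not part of the Sun tooling. The excerpt has lost its punctuation, so the line layout used here ("Checking T3+: name" and "Checking command cmd : RESULT") is an assumption, and the names `failed_checks` and `SAMPLE` are hypothetical.

```python
import re

# Assumed line layouts; the extracted excerpt no longer shows the real punctuation.
DEVICE = re.compile(r"^Checking T3\+?:?\s+(\S+)\s*$")
RESULT = re.compile(r"^Checking command\s+(.+?)\s*:?\s*(PASS|FAIL)\s*$")

# Hypothetical sample modeled on the excerpt (t3b0 fails "sys list").
SAMPLE = """\
Checking T3+: t3b0
Checking command ver : PASS
Checking command vol stat : PASS
Checking command sys list : FAIL
Checking T3+: t3b2
Checking command ver : PASS
Checking command sys list : PASS
"""


def failed_checks(output):
    """Return (device, command) pairs for every line that reports FAIL."""
    device = None
    failures = []
    for line in output.splitlines():
        line = line.strip()
        header = DEVICE.match(line)
        if header:
            device = header.group(1)
            continue
        result = RESULT.match(line)
        if result and result.group(2) == "FAIL":
            failures.append((device, result.group(1)))
    return failures


print(failed_checks(SAMPLE))
```

A failure list like this only points at the array and command to inspect; the detailed comparison still comes from the per-device check scripts.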

Page 29: ...2002 checkt3config t3b0 INFO sys memsize 32 MBytes Mon Jan 7 18 07 51 PST 2002 checkt3config t3b0 INFO cache memsize 256 MBytes Mon Jan 7 18 07 51 PST 2002 checkt3config t3b0 INFO Mon Jan 7 18 07 51 PST 2002 checkt3config t3b0 INFO CURRENT CONFIGURATION Mon Jan 7 18 07 51 PST 2002 checkt3config t3b0 INFO blocksize 16k Mon Jan 7 18 07 51 PST 2002 checkt3config t3b0 INFO cache auto Mon Jan 7 18 07 5...

Page 30: ... virtualization engine changes are accepted. If a process such as savevemap(1M) is running, you cannot remove the lock file using the removelocks(1M) command. This process causes a component to be unavailable. 2. Monitor the /var/adm/log/SEcfglog file to see when the savevemap(1M) process successfully exits. CODE EXAMPLE 2-2 savevemap(1M) Output. When savevemap ve pair EXIT is displayed, the savevemap(1M) proce...
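
The wait in step 2 can be sketched as a simple log check. This is a minimal Python sketch under stated assumptions: the excerpt does not show the exact log line format, so the function only tests whether the most recent line that mentions savevemap also contains EXIT. The function name and both sample logs are hypothetical.

```python
def savevemap_finished(log_text):
    """Return True if the most recent savevemap line in the log reports EXIT."""
    last = None
    for line in log_text.splitlines():
        if "savevemap" in line:
            last = line
    return last is not None and "EXIT" in last


# Hypothetical log contents, for illustration only.
SAMPLE_RUNNING = "Mon Jan 7 18:07:51 PST 2002 savevemap: v1 ENTER\n"
SAMPLE_DONE = SAMPLE_RUNNING + "Mon Jan 7 18:09:02 PST 2002 savevemap: v1 EXIT\n"

print(savevemap_finished(SAMPLE_RUNNING))
print(savevemap_finished(SAMPLE_DONE))
```

In practice the text would be read from the log file named in step 2, and the lock file would be cleared only after the check succeeds.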

Page 31: ...ries FIGURE 2 1 Sun StorEdge 6900 Series Logical View Host with HBA 0 and HBA 1 LUN0 10G Active MPDrive 0 LUN1 10G Active MPDrive1 Virtualization Engine 2 Virtualization Engine 1 SAN Database MPDrive Carved LUNs Masking Switch Switch LUN0 10G Active MPDrive LUN1 10G Active MPDrive1 Virtualization Engine Communications Traffic Switch Switch Storage I O and LUN0 500G Active Master LUN1 500G Passive ...

Page 32: ...y data paths to the master Sun StorEdge T3 array FIGURE 2 2 Primary Data Paths to the Alternate Master Host with HBA 0 and HBA 1 LUN0 10G Active MPDrive 0 LUN1 10G Active MPDrive 1 Switch Virtualization Engine 2 Virtualization Engine 1 Switch LUN1 10G Active MPDrive 1 LUN0 10G Active MPDrive 0 LUN0 500G Active Master LUN1 500G Passive Alternate Master LUN0 500G Passive Master LUN1 500G Active Alte...

Page 33: ... switch switch alternate master controller primary route from HBA 1 From HBA 1 switch virtualization engine 2 switch master controller backend loop to alternate master secondary route from HBA 1 Host with HBA 0 and HBA 1 LUN0 10G Active MPDrive0 LUN1 10G Active MPDrive1 Switch Virtualization Engine 2 Switch LUN0 500G Active Master LUN1 500G Passive Alternate Master Logical Multipath Drive MPDrive ...

Page 34: ...before the second tier of switches. No Sun StorEdge T3 array failure is noted because of the redundant path by way of the Sun StorEdge network FC switch-8 and switch-16 switch T ports. FIGURE 2-4 Path Failure Before the Second Tier of Switches (diagram: host with HBA 0 and HBA 1, switches, SAN database, and 10G active MPDrive LUNs) ...

Page 35: ... network FC switch-8 and switch-16 switches, or in the event that both T ports fail between the switches, the virtualization engine forces a LUN failover of the affected Sun StorEdge T3 array and routes all I/O to its secondary path. From the host side, nothing has changed; all I/O is routed through both HBAs (refer to FIGURE 2-5). [Diagram labels: host with HBA 0 and HBA 1, 10G active MPDrive LUNs] ...

Page 36: ...a Sun StorEdge Traffic Manager MPxIO software problem on a Sun StorEdge 6900 series system usr sbin luxadm display dev rdsk c6t29000060220041F96257354230303052d0s2 DEVICE PROPERTIES for disk dev rdsk c6t29000060220041F96257354230303052d0s2 Status Port A O K Status Port B O K Vendor SUN Product ID SESS01 WWN Node 2a000060220041f4 WWN Port A 2b000060220041f4 WWN Port B 2b000060220041f9 Revision 080C...
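
When checking many devices, the port status lines of a luxadm display listing like the one above can be extracted programmatically. The following Python sketch assumes the usual "Status(Port A): O.K." layout; the excerpt above has lost that punctuation, so the layout, the sample text, and the names `port_status` and `missing_ports` are assumptions for illustration.

```python
import re

# Assumed layout of the status lines in luxadm display output.
STATUS = re.compile(r"^\s*Status\(Port ([A-Z])\):\s*(\S+)\s*$")

# Sample reconstructed from the excerpt; punctuation is assumed.
SAMPLE = """\
DEVICE PROPERTIES for disk: /dev/rdsk/c6t29000060220041F96257354230303052d0s2
  Status(Port A):       O.K.
  Status(Port B):       O.K.
  Vendor:               SUN
  Product ID:           SESS01
"""


def port_status(text):
    """Map each reported port letter to its status string."""
    status = {}
    for line in text.splitlines():
        match = STATUS.match(line)
        if match:
            status[match.group(1)] = match.group(2)
    return status


def missing_ports(status, expected=("A", "B")):
    """List expected ports that are absent or not reported as O.K."""
    return [port for port in expected if status.get(port) != "O.K."]


print(port_status(SAMPLE))
print(missing_ports(port_status(SAMPLE)))
```

A device that reports only Port A, as a single-path listing would, shows up with `["B"]` in the second call.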

Page 37: ...ed in the following sections To Quiesce the I O 1 Determine the path you want to disable 2 Type To Unconfigure the c2 Path 1 Type cfgadm c unconfigure device cfgadm al Ap_Id Type Receptacle Occupant Condition c0 scsi bus connected configured unknown c0 dsk c0t0d0 disk connected configured unknown c0 dsk c0t1d0 disk connected configured unknown c1 scsi bus connected configured unknown c1 dsk c1t6d0...

Page 38: ...ge T3 array logical unit number (LUN) failover. After the failover occurs, replace the cable and proceed with the testing and FRU isolation. After the testing and any FRU replacement are finished, return the controller state back to the default by using virtualization engine failback. Refer to "To Failback the Virtualization Engine" on page 120. cfgadm -c unconfigure c2::2b000060220041f4 cfgadm -al Ap_Id Type ...

Page 39: ...ort listmap. Another, but slower, method is to run the runsecfg script and verify the virtualization engine maps by polling them against a live system. Caution: During the failover, small computer systems interface (SCSI) errors will occur on the data host, and a brief suspension of I/O will occur. To Put the c2 Path Back into Production: 1. Type: 2. Verify that I/O has resumed on all paths. cfgadm -c configure c...

Page 40: ...v vx dmp Disk_1s4 char dev vx rdmp Disk_1s4 privpaths block dev vx dmp Disk_1s3 char dev vx rdmp Disk_1s3 version 2 2 iosize min 512 bytes max 2048 blocks public slice 4 offset 0 len 209698816 private slice 3 offset 1 len 4095 update time 1010434311 seqno 0 6 headers 0 248 configs count 1 len 3004 logs count 1 len 455 Defined regions config priv 000017 000247 000231 copy 01 offset 000000 enabled c...

Page 41: ... Cache Enabled Minimum prefetch 0x0 Maximum prefetch 0x0 Device Type Disk device Path s dev rdsk c20t2B000060220041F4d0s2 devices pci a 2000 pci 2 SUNW qlc 4 fp 0 0 ssd w2b000060220041f4 0 c raw luxadm display dev rdsk c23t2B000060220041F9d0s2 DEVICE PROPERTIES for disk dev rdsk c23t2B000060220041F9d0s2 Status Port A O K Vendor SUN Product ID SESS01 WWN Node 2a000060220041f9 WWN Port A 2b000060220...

Page 42: ... To Put the DMP-Enabled Paths Back into Production: 1. Type: vxdmpadm enable ctlr=cn 2. Verify that the path has been reenabled by typing: vxdmpadm listctlr all ...

Page 43: ...atus of the Sun StorEdge 3900 or 6900 series systems using the Storage Automated Diagnostic Environment utility, version 2.2. The Storage Automated Diagnostic Environment is installed on every Storage Service Processor that ships with the unit. All that is needed is web browser access to the Storage Service Processor. In non-Sun host configurations, such as Microsoft Windows 2000, the Storage Automated ...

Page 44: ...omated Diagnostic Environment topology shown in FIGURE 3-1, the internal components of a Sun StorEdge 3910 system are shown. The view also includes a Solaris host (diag221) and the Storage Service Processor (diag156). The Microsoft Windows 2000 host, which is also connected, does not appear in the view. FIGURE 3-1 Storage Automated Diagnostic Environment Example Topology ...

Page 45: ...agnostic Environment event grid, like the one shown in TABLE 3-1. TABLE 3-1 Event Grid Sorting Criteria. Columns: Category, Component, Event Type, Severity, Action. Category: All (default), Sun StorEdge A3500FC array, Sun StorEdge A5000 array, Agent, Host, Message, Sun Switch, Sun StorEdge T3 array, Tape, Virtualization engine. Component: All (default), Backplane, Controller, Disk, Interface, LUN, Port, Power. Event Type: Agent Deinstall, Agent Install, Alarm FC Alte...

Page 46: ... Windows 2000 errors through the Event Properties System Log. The types of errors that would indicate a Sun StorEdge T3 Array Failover Driver issue have the Source "Jafo". An example is shown in FIGURE 3-2. You should also look for other events, such as any HBA driver-related events (qla2200, for example) or disk-related events. FIGURE 3-2 Microsoft Windows 2000 Event Properties System Log ...

Page 47: ...UNW qlc 3 fp 0 0 devctl run_connect Yes mbox Disable ilb Disable ilb_10 Disable elb Enable qlctest called with options dev devices pci 6 4000 SUNW qlc 3 fp 0 0 devctl run_connect Yes mbox Disable ilb Disable ilb_10 Disable el b Enable qlctest Started Program Version is 4 0 1 Testing qlc0 device at devices pci 6 4000 SUNW qlc 3 fp 0 0 devctl QLC Adapter Chip Revision 1 Risc Revision 3 Frame Buffer ...

Page 48: ...ns and restrictions opt SUNWstade Diags bin switchtest v o dev 2 192 168 0 30 0x0 xfersize 200 switchtest called with options dev 2 192 168 0 30 0x0 xfersize 200 switchtest Started Testing port 2 Using ip_addr 192 168 0 30 fcaddr 0x0 to access this port Chassis Status for Device Switch Power OK Temp OK 23 0c Fan 1 OK Fan 2 OK Testing Device Switch Port 2 Pattern 0x7e7e7e7e Testing Device Switch Po...

Page 49: ...tility you can access the web site with the following URL http webhome eng mdeSW Project Explorer html To Install the Explorer Data Collection Utility on the Storage Service Processor 1 Type 2 When you are prompted for site specific information during the installation process you can optionally click Return to accept the blank defaults Caution Do not accept automatic emailing of the Explorer Data ...

Page 50: ...E 3 3 Editing Switch Information Using vi 4 Type Sun StorEdge T3 array information in the opt SUNWexplo etc t3input txt file 5 Type the password for your specific site CODE EXAMPLE 3 4 Editing Sun StorEdge T3 Array Information Using vi Note xxxx represents Sun StorEdge T3 array passwords vi saninput txt Input file for extended data collection Format is SWITCH SWITCH TYPE PASSWORD LOGIN Valid switc...

Page 51: ...r switch 16 switch and Sun StorEdge T3 array information that you can use for troubleshooting purposes A tar gzip file is put in the opt SUNWexplo output tar gzip file directory You can send the tar gzip file to Sun Solution Center for evaluation The Sun StorEdge network FC switch 8 and switch 16 switch information is placed in the san directory of the tar file Sun StorEdge T3 array information is...

Page 52: ...the HBA manufacturer's utility, such as the QLogic SANblade Manager software provided by QLogic for their HBAs. This software is freely downloadable from QLogic's website, http://www.qlogic.com. Note: Other manufacturers' utilities, such as Emulex's LightPulse utility, are needed for other HBAs, such as Emulex HBAs. Use the QLogic SANblade Manager to extract information about: HBA driver versions, firmware versions, A...

Page 53: ... FIGURE 3-3 QLogic SANblade Manager HBA Driver and Firmware Versions ...

Page 54: ... QLogic SANblade Manager is also useful for viewing a primitive topology and a LUN listing. FIGURE 3-4 QLogic SANblade Manager Diagnostics. Note: Different HBA manufacturers may bundle different features with their tools. The information in this guide assumes that QLogic software is used ...

Page 55: ...gine. Two for each Sun StorEdge T3 array partner group. One for the Ethernet hub that is installed on the second Sun StorEdge Expansion Cabinet in the Sun StorEdge 3960 and 6960 series systems. Note: Information about LED status lights, power information, and front panel settings can be found in the 3Com document SuperStack 3 Baseline Hub 12-Port TP User Guide or SuperStack 3 Baseline Hub 24-Port TP Use...

Page 57: ... Environment GUI. Note: linktest tests both ends of the link segment and enters a guided isolation when a fault is detected. Faults can be detected in one of two ways: when linktest sends an alert on a bad or intermittent link, or when a red link appears on the topology graph, indicating a failure. This chapter contains the following sections: "FC Links" on page 38, "Troubleshooting the A1 or B1 FC Link" on pa...

Page 58: ...hat it is configured to monitor host errors The following diagrams provide troubleshooting information for the basic components and FC links specific to the Sun StorEdge 3900 1 1 series shown in FIGURE 5 1 and the Sun StorEdge 6900 1 1 series shown in FIGURE 5 2 Note An actual Sun StorEdge 3900 or 6900 series configuration could have more Sun StorEdge T3 arrays than are shown in FIGURE 5 1 and FIG...

Page 59: ...ponents and the FC links for a Sun StorEdge 3900 series system.
A1 to B1: HBA to Sun StorEdge network FC switch-8 and switch-16 switch link
A4 to B4: Sun StorEdge network FC switch-8 and switch-16 switch to Sun StorEdge T3 array link
FIGURE 5-1 Sun StorEdge 3900 Series FC Link Diagram (diagram: host with HBA A and HBA B; switches sw1a and sw1b; T3 master and T3 alternate master; links A1, B1, A4, B4) ...

Page 60: ...des FC Link Between These Components
A1 to B1: HBA to Sun StorEdge network FC switch-8 and switch-16 switch link
A2 to B2: Sun StorEdge network FC switch-8 and switch-16 switch to virtualization engine link on the host side
A3 to B3: Sun StorEdge network FC switch-8 and switch-16 switch to the virtualization engine link on the device side
A4 to B4: Sun StorEdge network FC switch-8 and switch-16 switch...

Page 61: ... FIGURE 5-2 Sun StorEdge 6900 Series FC Link Diagram (diagram: host with HBA A and HBA B; switches sw1a, sw1b, sw2a, sw2b; virtualization engines v1a and v1b; T3 master and T3 alternate master; links A1 to A4, B1 to B4, T1, T2) ...

Page 62: ... B1 link is the FC link from the HBA to the switch. What happens when an FC link fails depends on the system. If a problem occurs with the A1 or B1 FC link: In a Sun StorEdge 3900 series system, the Sun StorEdge T3 array will fail over. In a Sun StorEdge 6900 series system, no Sun StorEdge T3 array will fail over, but an error with the FC link can cause a path to go offline ...

Page 63: ... 5mins Last Message diag xxxxx xxx com qlc ID 686697 kern info NOTICE Qlogic qlc 0 Loop OFFLINE Site FSDE LAB Broomfield CO Source diag xxxxx xxx com Severity Normal Category Message Key message diag xxxxx xxx com EventType LogEvent driver MPXIO_offline EventTime 01 08 2002 14 48 02 Found 2 driver MPXIO_offline warning s in logfile var adm messages on diag xxxxx xxx com id 80fee746 Jan 8 14 47 07 ...

Page 64: ...cation Note An A1 or B1 FC link error can cause a port in sw1a or sw1b to change state Site FSDE LAB Broomfield CO Source diag xxxxx xxx com Severity Normal Category Switch Key switch 100000c0dd0057bd EventType StateChangeEvent X port 6 EventTime 01 08 2002 14 54 20 port 6 in SWITCH diag sw1a ip 192 168 0 30 is now Unknown status state changed from Online to Admin ...

Page 65: ...A O K Status Port B O K Vendor SUN Product ID SESS01 WWN Node 2a000060220041f4 WWN Port A 2b000060220041f4 WWN Port B 2b000060220041f9 Revision 080C Serial Num Unsupported Unformatted capacity 102400 000 MBytes Write Cache Enabled Read Cache Enabled Minimum prefetch 0x0 Maximum prefetch 0x0 Device Type Disk device Path s dev rdsk c6t29000060220041F96257354230303052d0s2 devices scsi_vhci ssd g29000...

Page 66: ...rom the Storage Service Processor. The dev option to switchtest is in the following format: Port:IP-Address:FCAddress. The FCAddress can be set to 0x0. Note: If you are testing an A1 or B1 FC link that is connected to an HBA, you must specify a payload of 200 bytes or less. This is a limitation in the HBA application-specific integrated circuit (ASIC). /usr/sbin/cfgadm -al Ap_Id Type Receptacle Occupant Condi...
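
The dev option and the 200-byte payload rule described above can be captured in a small helper that builds the option string before switchtest is invoked. This is a minimal Python sketch, not part of the Storage Automated Diagnostic Environment. The `dev=`, `:` and `|` separators are reconstructed assumptions, because the extracted text has lost its punctuation, and the helper name `switchtest_options` is hypothetical.

```python
import ipaddress

# Payload limit stated in the note above for links connected to an HBA.
HBA_MAX_XFERSIZE = 200


def switchtest_options(port, ip, fcaddr=0x0, xfersize=None, hba_attached=False):
    """Build an option string of the assumed form dev=Port:IP:FCAddr|xfersize=N."""
    ipaddress.IPv4Address(ip)  # raises ValueError for a malformed address
    if hba_attached and (xfersize is None or xfersize > HBA_MAX_XFERSIZE):
        raise ValueError("HBA-attached links need a payload of 200 bytes or less")
    options = f"dev={port}:{ip}:{fcaddr:#x}"
    if xfersize is not None:
        options += f"|xfersize={xfersize}"
    return options


print(switchtest_options(2, "192.168.0.30", xfersize=200, hba_attached=True))
```

Validating the payload before the test runs avoids starting a test that the HBA ASIC limitation would cause to fail for reasons unrelated to the link.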

Page 67: ...m the command line interface CLI opt SUNWstade Diags bin switchtest v o dev 2 192 168 0 30 0 switchtest called with options dev 2 192 168 0 30 0 switchtest Started Testing port 2 Using ip_addr 192 168 0 30 fcaddr 0x0 to access this port Chassis Status for Device Switch Power OK Temp OK 23 0c Fan 1 OK Fan 2 OK 02 06 02 15 09 45 diag Storage Automated Diagnostic Environment MSGID 4001 switchtest WAR...

Page 68: ...to test the entire link.
3. Break the connection by uncabling the link.
4. Insert a loopback connector into the switch port.
5. Rerun switchtest.
   a. If switchtest fails, replace the gigabit interface converter (GBIC) and rerun switchtest.
   b. If switchtest fails again, replace the switch.
6. Insert a loopback connector into the HBA.
7. Run qlctest.
   a. If the qlctest test fails, replace the HBA.
   b. If the qlctest test pas...
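
The decision flow in steps 5 through 7 can be sketched as a small function. This is an illustrative Python sketch only: the two callables stand in for running switchtest and qlctest with loopback connectors, the name `isolate_a1_b1` is hypothetical, and continuing to the HBA test after a switch replacement is an assumption. Step 7b is cut off in the excerpt and is not modeled.

```python
def isolate_a1_b1(switchtest, qlctest):
    """Return the FRUs to replace, following steps 5 to 7a of the excerpt.

    switchtest and qlctest are callables that return True when the
    corresponding loopback test passes.
    """
    replaced = []
    if not switchtest():            # step 5: switch port with loopback connector
        replaced.append("GBIC")     # step 5a: replace the GBIC and rerun
        if not switchtest():
            replaced.append("switch")  # step 5b: second failure
    if not qlctest():               # step 7: HBA with loopback connector
        replaced.append("HBA")      # step 7a
    return replaced


# Example: switchtest fails once and passes after the GBIC swap; qlctest passes.
results = iter([False, True])
print(isolate_a1_b1(lambda: next(results), lambda: True))
```

Keeping the test runs as injected callables lets the same flow be exercised with recorded results before it is wired to real diagnostics.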

Page 69: ...uested the following events be forwarded to you from diag xxxxx xxx com Site FSDE LAB Broomfield CO Source diag226 xxxxx xxx com Severity Normal Category Message Key message diag xxxxx xxx com EventType LogEvent driver Fabric_Warning EventTime 01 08 2002 17 34 47 Found 1 driver Fabric_Warning warning s in logfile var adm messages on diag xxxxx xxx com id 80fee746 Info Fabric warning Jan 8 17 34 36...

Page 70: ...tate changed from Online to Admin Site FSDE LAB Broomfield CO Source diag xxxxx xxx com Severity Normal Category San Key switch 100000c0dd0061bb 1 EventType LinkEvent ITW switch ve EventTime 01 08 2002 17 39 47 ITW ERROR 765 in 11 mins Origin port 1 on switch sw1b 192 168 0 31 Destination port 1 on ve diag v1b 29000060220041f4 Info An invalid transmission word ITW was detected between two componen...

Page 71: ... pci 6 4000 SUNW qlc 2 fp 0 0 devctl CONNECTED devices pci 6 4000 SUNW qlc 3 fp 0 0 devctl CONNECTED usr sbin luxadm display dev rdsk c6t29000060220041F96257354230303052d0s2 DEVICE PROPERTIES for disk dev rdsk c6t29000060220041F96257354230303052d0s2 Status Port A O K Status Port B O K Vendor SUN Product ID SESS01 WWN Node 2a000060220041f9 WWN Port A 2b000060220041f9 WWN Port B 2b000060220041f4 Rev...

Page 72: ...th the switch and the GBIC are tested using the switchtest test. The switchtest test can be used only in conjunction with the loopback connector, and the port cannot be cabled to the virtualization engine while switchtest runs. No virtualization engine tests are available. To Isolate the A2 or B2 FC Link: To isolate the A2 or B2 link, which is the FC link from the first switch to the virtualization engine, only in th...

Page 73: ...ngine-side GBIC, recable the link, and monitor the link for errors.
   b. Replace the cable, recable the link, and monitor the link for errors.
   c. Replace the virtualization engine, restore the virtualization engine settings, recable the link, and monitor the link for errors.
Note: The procedures for restoring virtualization engine settings are in the Sun StorEdge 3900 and 6900 Series 2.0 Reference and Service Gu...

Page 74: ...0fee746 Jan 8 18 24 24 WWN 2b000060220041f9 diag xxxxx xxx com mpxio ID 779286 kern info scsi_vhci ssd g29000060220041f96257354230303053 ssd19 multipath status degraded path pci 6 4000 SUNW qlc 3 fp 0 0 fp1 to target address 2b000060220041f9 1 is offline Jan 8 18 24 24 WWN 2b000060220041f9 diag xxxxx xxx com mpxio ID 779286 kern info scsi_vhci ssd g29000060220041f96257354230303052 ssd18 multipath ...
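
Messages like the ones above can be filtered out of a messages file with a regular expression. The following Python sketch assumes the usual Solaris mpxio message layout; the excerpt has lost its punctuation, so the sample line is a reconstruction, and the names `OFFLINE` and `offline_paths` are hypothetical.

```python
import re

# Assumed layout of an mpxio path-state message.
OFFLINE = re.compile(
    r"\((?P<disk>ssd\d+)\) multipath status: (?P<status>\w+), "
    r"path (?P<path>\S+) \((?P<port>fp\d+)\) "
    r"to target address: (?P<wwn>[0-9a-f]+),(?P<lun>\d+) is (?P<state>\w+)"
)

# Sample line reconstructed from the excerpt; punctuation is assumed.
SAMPLE = (
    "Jan  8 18:24:24 diag mpxio: [ID 779286 kern.info] "
    "/scsi_vhci/ssd@g29000060220041f96257354230303053 (ssd19) "
    "multipath status: degraded, path /pci@6,4000/SUNW,qlc@3/fp@0,0 (fp1) "
    "to target address: 2b000060220041f9,1 is offline\n"
)


def offline_paths(text):
    """Return (disk, port, target WWN, LUN) for every path reported offline."""
    hits = []
    for match in OFFLINE.finditer(text):
        if match.group("state") == "offline":
            hits.append(
                (
                    match.group("disk"),
                    match.group("port"),
                    match.group("wwn"),
                    int(match.group("lun")),
                )
            )
    return hits


print(offline_paths(SAMPLE))
```

Grouping the results by target WWN shows quickly whether several LUNs went offline behind the same virtualization engine port.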

Page 75: ...ffline Action 1 Verify cables GBICs and connections along FC path 2 Check Storage Automated Diagnostic Environment SAN Topology GUI to identify failing segment of the data path 3 Verify correct FC switch configuration Site FSDE LAB Broomfield CO Source diag xxxxx xxx com Severity Normal Category Switch Key switch 100000c0dd00cbfe EventType StateChangeEvent M port 1 EventTime 01 08 2002 18 28 40 po...

Page 76: ...2 210100e08b23fa25 unknown connected unconfigured unknown c2 2b000060220041f4 disk connected configured unknown c3 fc fabric connected configured unknown c3 2b000060220041f9 disk connected configured unusable c3 210100e08b230926 unknown connected unconfigured unknown c4 fc private connected unconfigured unknown c5 fc connected unconfigured unknown usr sbin luxadm e port Found path to 2 HBA ports d...
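
The cfgadm listing above marks one attachment point as unusable. Such entries can be picked out of a full listing with a few lines of Python. This is a minimal sketch: the "::" separator in the Ap_Id column and the hyphenated type names follow the usual cfgadm -al layout but are not visible in the extracted text, so the sample is a reconstruction, and the function names are hypothetical.

```python
# Sample reconstructed from the excerpt; punctuation is assumed.
SAMPLE = """\
Ap_Id                  Type         Receptacle   Occupant     Condition
c2                     fc-fabric    connected    configured   unknown
c2::2b000060220041f4   disk         connected    configured   unknown
c3                     fc-fabric    connected    configured   unknown
c3::2b000060220041f9   disk         connected    configured   unusable
"""

COLUMNS = ("ap_id", "type", "receptacle", "occupant", "condition")


def parse_cfgadm(text):
    """Parse a five-column cfgadm -al listing into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) or fields[0] == "Ap_Id":
            continue  # skip the header and anything that is not a data row
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def unusable(rows):
    """Return the attachment points whose condition is 'unusable'."""
    return [row["ap_id"] for row in rows if row["condition"] == "unusable"]


print(unusable(parse_cfgadm(SAMPLE)))
```

The returned Ap_Id identifies the controller and target WWN, which narrows the search to one side of the FC link.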

Page 77: ...ts and test options. Refer to the Storage Automated Diagnostic Environment User's Guide for more information. FRU Tests Available for the A3 or B3 FC Link Segment: The linktest is not available. Both the switch and the GBIC are tested using the switchtest test. The switchtest test can be used only in conjunction with the loopback connector, and the port cannot be cabled to the virtualization engine while switchtest ...

Page 78: ... connector in to the switch port 4 Run switchtest a If the test fails replace the GBIC and rerun switchtest b If the test fails again replace the switch 5 If the switch or the GBIC shows no errors replace the remaining components in the following order a Replace the virtualization engine side GBIC recable the link and monitor the link for errors b Replace the cable recable the link and monitor the...

Page 79: ...llowing methods to suspend I O while the failover occurs Stop all customer applications that are accessing the Sun StorEdge T3 array Manually pull the link from the Sun StorEdge T3 array to the switch and wait for a Sun StorEdge T3 array LUN failover After the failover occurs replace the cable and proceed with testing and FRU isolation After testing is complete and any FRU replacement is finished ...

Page 80: ...ce diag xxxxx xxx com Severity Warning Category Message DeviceId message diag xxxxx xxx com EventType LogEvent driver MPXIO_offline EventTime 01 29 2002 14 28 06 Found 2 driver MPXIO_offline warning s in logfile var adm messages on diag xxxxx xxx com id 80e4aa60 snip Site FSDE LAB Broomfield CO Source diag xxxxx xxx com Severity Warning Category Message DeviceId message diag xxxxx xxx com EventTyp...

Page 81: ...arning s found in logfile var adm messages t3 on diag id 83060c0c Jan 29 14 12 58 t3b0 ISR1 2 W u2ctr ISP2100 2 Received LOOP DOWN async event Jan 29 14 13 32 t3b0 MNXT 1 W u1ctr starting lun 1 failover Site FSDE LAB Broomfield CO Source diag Severity Warning Category T3message DeviceId t3message 83060c0c EventType LogEvent MessageLog EventTime 01 29 2002 14 11 14 Warning s found in logfile var ad...

Page 82: ...StorEdge T3 array LUNs back to the proper configuration after the failing FRU is replaced This command is issued from the data host Sun StorEdge 6900 Series In a Sun StorEdge 6900 series device the virtualization engine pairs handle the failover and the failover is not noted on the data host All paths remain online and active The failbackt3path command is used and is issued from the Storage Servic...

Page 83: ...2A60003E82Fd0s2 Status Port A O K Status Port B O K Vendor SUN Product ID T300 WWN Node 50020f2000006443 WWN Port A 50020f2300006355 WWN Port B 50020f2300006443 Revision 0118 Serial Num Unsupported Unformatted capacity 488642 000 MBytes Write Cache Enabled Read Cache Enabled Minimum prefetch 0x0 Maximum prefetch 0x0 Device Type Disk device Path s dev rdsk c26t60020F20000064433C3352A60003E82Fd0s2 d...

Page 84: ...e Storage Automated Diagnostic Environment GUI to isolate suspected failing components Alternatively follow these steps 1 Quiesce the I O on the A4 or B4 FC link path 2 Run switchtest 1M to test the entire link re create the problem 3 Break the connection by uncabling the link 4 Insert the loopback connector in to the switch port cfgadm al Ap_Id Type Receptacle Occupant Condition ac0 bank0 memory ...

Page 85: ... 6 If switchtest passes assume that the suspect components are the cable and the Sun StorEdge T3 array controller a Replace the cable b Rerun switchtest 7 If the test fails again replace the Sun StorEdge T3 array controller 8 Return the path to production 9 Return the Sun StorEdge T3 array LUNs to the correct controllers if a failover occurred Determine if failovers occur using the luxadm failover...

Page 87: ...iagnostic Environment Event Grid enables you to sort host events by component category or event type The Storage Automated Diagnostic Environment GUI displays an event grid that describes the severity of the event tells whether action is required provides a description of the event and gives the recommended action Refer to the Storage Automated Diagnostic Environment User s Guide for more informat...

Page 88: ...FIGURE 6-1 Sample Host Event Grid ...

Page 89: ...utput of the luxadm e port Finds the path to 20 HBA ports LUN t300 Alarm Red Y The state of lUN t300 c14t50020F2 300003EE5d0s2 status A on diag xxxxx xxx com The status changed from OK to error target t3 diag244 t3b0 90 0 0 40 The luxadm display reported a change in the port status of one of its paths The Storage Automated Diagnostic Environment tries to find the enclosure corresponding to this pa...

Page 90: ... failure details enclosure PatchInfo New patch and package information were generated Send changes to the output of showrev p and pkginfo enclosure backup The Agent was backed up Backs up the configuration file of the Agent disk_ capacity Alarm Yellow Y Detected that var opt SUNWstade is at or above 98 capacity by typing usr sbin df k var opt SUNWstade Remove unused files and directories to free u...
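
The disk_capacity alarm above is described as firing when /var/opt/SUNWstade is at or above 98 percent capacity, as reported by df -k. The following is a minimal sketch of that check, not the agent's actual implementation. The function names are hypothetical, and the capacity formula (used divided by used plus available) is an assumption chosen to mirror the capacity column that df prints.

```python
import shutil

# Threshold taken from the event grid text (98 percent capacity).
THRESHOLD_PERCENT = 98.0


def capacity_percent(used_bytes, available_bytes):
    """Return capacity as df reports it: used / (used + available) * 100."""
    total = used_bytes + available_bytes
    if total == 0:
        return 0.0
    return 100.0 * used_bytes / total


def is_at_or_above_threshold(used_bytes, available_bytes,
                             threshold=THRESHOLD_PERCENT):
    """True when the file system would trigger the disk_capacity alarm."""
    return capacity_percent(used_bytes, available_bytes) >= threshold


def path_capacity_percent(path):
    """Measure capacity for the file system that holds `path`."""
    usage = shutil.disk_usage(path)
    return capacity_percent(usage.used, usage.free)
```

On a host running the agent, `path_capacity_percent("/var/opt/SUNWstade")` would give the figure to compare against the threshold before removing unused files.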

Page 91: ...tructions for the next four steps 1 Install the SUNWstade package on a new master host 2 Run opt SUNWstade bin ras_install on the new master host 3 Configure the host as the master host 4 Connect to the master server s GUI at http servername 7654 5 Choose System Utilities Recover Config Refer to Chapter 3 of the Storage Automated Diagnostic Environment User s Guide for detailed instructions a In t...

Page 92: ...tall the SUNWstade package on the new host. 5. Run /opt/SUNWstade/bin/ras_install. 6. Configure the host as a slave. 7. Choose Maintenance > General Maintenance > Maintain Hosts. Refer to the maintenance section in Chapter 3 of the Storage Automated Diagnostic Environment User's Guide for detailed instructions. 8. In the Maintain Hosts window, select the new host. 9. Configure the options as needed. 10. Choose Maintenance Topol...

Page 93: ...ction infrastructure The switches are paired to provide redundancy Two switches are used in each Sun StorEdge 3900 series and four switches are used in each Sun StorEdge 6900 series Each Sun StorEdge network FC switch 8 and switch 16 switch is connected by way of an Ethernet to the service network for management and service from the Storage Service Processor These switches can be monitored through...

Page 94: ...00 Series 2.0 Reference and Service Guide for the procedures. Zone Modifications You should not modify the shared zone set on the back-end switches; doing so can cause an error (Error State 50) on the virtualization engine. If you determine, however, that you must modify the shared zone set, follow these steps: 1. Offline the T ports (interswitch links). 2. Offline the virtualization engine ports. 3. Modify the z...

Page 95: ...ave a Sun StorEdge SAN 4 1 infrastructure already in place and functional This includes at a minimum A Solaris host on the SAN management network loaded with SANbox2 Manager Sun StorEdge 2 Gbit 16 port switch network configured in desired topology ring star mesh or cascade with healthy ISL links Diagnosing and Troubleshooting Switch Hardware Problems Note Whereas 1 Gbit switch port numbers are num...

Page 96: ...ng procedures for the Sun StorEdge network FC switch 8 and switch 16 switch hardware refer to the Sun StorEdge SAN 4 1 Release Field Troubleshooting Guide This document covers the Sun StorEdge network FC switch 8 and switch 16 switch and the interconnections HBA GBIC and cables on either side of the switch The Sun StorEdge SAN 4 1 Release Field Troubleshooting Guide also includes an appendix on th...

Page 97: ...vent grid that describes the severity of the event tells whether action is required provides a description of the event and gives the recommended action Refer to the Storage Automated Diagnostic Environment User s Guide for more information To Use the Switch Event Grid 1 From the Storage Automated Diagnostic Environment Help menu select the Event Grid link 2 FIGURE 7 1 shows the Switch Event Grid ...

Page 98: ...link 1 Check the Topology GUI for any link errors 2 Quiesce I O on the link 3 Run linktest on the link to isolate the failing FRU chassis fan Alarm Yellow Y chassis fan 1 status changed from OK None system_ reboot Alarm Yellow Y The uptime of the switch was less than the previous uptime of the switch This could indicate that the switch has been reset either by a user or by the loss of power 1 Chec...

Page 99: ...rnet connectivity to the switch has been lost 1 Check Ethernet connectivity to the switch 2 Verify that the switch is booted correctly with no POST errors 3 Verify that the switch Test Mode is set for normal operations 4 Verify the TCP IP settings on switch by way of Forced PROM Mode access 5 Replace switch if needed switch test Diagnostic Test Red Check Test Manager for failure details TABLE 7 1 ...

Page 100: ... device It creates a detailed description of the device monitored and sends it using any active notifier such as the SunTM Remote Services SRS Net Connect service or email enclosure Location Change Location of switch rasd2 swb0 ip xxx 0 0 40 was changed TABLE 7 1 Storage Automated Diagnostic Environment Event Grid for 1 Gbit Switches Continued Component EventType Severity Action Description Note T...

Page 101: ...itch has logged out of the Fabric connection and has gone offline 1 Verify cables GBICs and connections along the FC path 2 Check the Storage Automated Diagnostic Environment SAN Topology GUI to identify failing segment of the data path 3 Verify the correct FC switch configuration enclosure Statistics Statistics about switch d2 swb1 ipxxx 0 0 41 10002000007a609 TABLE 7 1 Storage Automated Diagnost...

Page 102: ...ssis board Alarm Yellow Y The uptime of the switch was less than the previous uptime of the switch This could indicate that the switch has been reset either by a user or by the loss of power 1 Check to see if the switch has been reset 2 Check the power going to the switch chassis power Alarm Yellow chassis power 1 status changed from OK This event monitors changes in the status of the chassis powe...

Page 103: ...ic Test Red Check Test Manager for failure details enclosure Discovery Discovered a new switch called ras d2 swb1 ip xxx 0 0 41 10002000007a609 Discovery events occur the very first time the agent probes a storage device It creates a detailed description of the device monitored and sends it using any active notifier such as the SunTM Remote Services SRS Net Connect service or email enclosure Locat...

Page 104: ...nection and has gone offline. 1. Verify cables, GBICs, and connections along the FC path. 2. Check the Storage Automated Diagnostic Environment SAN Topology GUI to identify the failing segment of the data path. 3. Verify the correct FC switch configuration. enclosure Statistics Statistics about switch d2 swb1 ipxxx 0 0 41 10002000007a609 TABLE 7-2 Storage Automated Diagnostic Environment Event Grid for 2 Gbit ...

Page 105: ...torEdge network FC switch 8 and switch 16 switches periodically releases new versions of the switch Flash code and the new version will not match the default version 4 WARNING The configuration is not set to the default but the differences are likely supported alternatives The default switch configurations were overridden with valid alternatives which are also supported by the SUNWsecfg configurat...

Page 106: ...Note: If multiple systems are connected to a switch, the switch settings might not match the default settings. ...

Page 107: ...T3 array is used as a building block configured in various ways to provide a storage solution optimized to the host application The array is sometimes called a controller unit which refers to the internal RAID controller on the controller card Arrays without the controller card are called expansion units When connected to a controller unit the expansion unit enables the user to increase storage ca...

Page 108: ...provide redundancy If one of the two links is lost no Sun StorEdge T3 array LUN failover occurs and no pathing failures are detected If both T port links fail a Sun StorEdge T3 array LUN failover occurs as one of the virtualization engines takes control of the I O operations One of the Sun StorEdge T3 array LUNs fail over as all I O is routed to the controlling virtualization engine The host detec...

Page 109: ... Not Available status state changed from Online to Offline INFORMATION A port on the switch has logged out of the fabric and gone offline PROBABLE CAUSE 1 Verify cables GBICs and connections along Fibre Channel path 2 Check Storage Automated Diagnostic Environment SAN Topology GUI to identify failing segment of the data path 3 Verify correct FC switch configuration Site Lab 3286 DSQA1 Broomfield S...

Page 110: ...ion engine has detected a change in status for a Multipath Drive or VLUN, usually meaning a pathing problem to a Sun StorEdge T3 array controller (for changes in Active/Passive paths). 1. Check the Sun StorEdge T3 array for current LUN ownership (port listmap). 2. Use mpdrive failback, if needed, to fail LUNs back to the correct controller. Site Lab 3286 DSQA1 Broomfield Source diag xxxxx xxx com Severi...

Page 111: ...xx xxx com id 809f76b4 INFORMATION Fabric warning Jan 30 11 46 37 WWN 2b00006022004186 diag xxxxx xxx com fp ID 517869 kern warning WARNING fp 2 N_x Port with D_ID 108000 PWWN 2b00006022004186 reappeared in fabric in backup diag xxxxx xxx com Site Lab 3286 DSQA1 Broomfield Source diag xxxxx xxx com Severity Warning Actionable Category Host DeviceId host diag xxxxx xxx com EventType AlarmEvent P hb...

Page 112: ...enu t3b0 1 port listmap port targetid addr_type lun volume owner access u1p1 0 hard 0 vol1 u1 primary u1p1 0 hard 1 vol2 u1 failover u2p1 1 hard 0 vol1 u1 failover u2p1 1 hard 1 vol2 u1 primary MANAGE CONFIGURATION FILES MENU 1 Display Virtualization Engine Map 2 Save Virtualization Engine Map 3 Verify Virtualization Engine Map 4 Help 5 Return Select configuration option above 3 Verifying Virtuali...
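
The port listmap output above can be read mechanically to see which volumes are not on their primary controller. The sketch below parses the seven columns shown in the excerpt (port, targetid, addr_type, lun, volume, owner, access). The function names are hypothetical, and the rule used (a volume has failed over when its owner differs from the controller on its primary path) is an interpretation of the output, not a statement from the guide.

```python
# Rows copied from the port listmap excerpt above.
SAMPLE_LISTMAP = """\
port targetid addr_type lun volume owner access
u1p1 0 hard 0 vol1 u1 primary
u1p1 0 hard 1 vol2 u1 failover
u2p1 1 hard 0 vol1 u1 failover
u2p1 1 hard 1 vol2 u1 primary
"""


def parse_listmap(text):
    """Turn port listmap text into a list of row dictionaries."""
    columns = ("port", "targetid", "addr_type", "lun",
               "volume", "owner", "access")
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a 7-column row.
        if len(fields) != len(columns) or fields[0] == "port":
            continue
        rows.append(dict(zip(columns, fields)))
    return rows


def failed_over_volumes(rows):
    """Return volumes whose owner is not the controller of the primary path."""
    flagged = set()
    for row in rows:
        controller = row["port"].split("p")[0]  # "u2p1" -> "u2"
        if row["access"] == "primary" and row["owner"] != controller:
            flagged.add(row["volume"])
    return sorted(flagged)
```

With the sample rows, vol2 is flagged because its primary path is on u2p1 while u1 currently owns it.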

Page 113: ...itch Port 8 is Offline switchtest failed Remove FC Cable from switch 100000c0dd00b682 port 8 Insert FC loopback cable into switch 100000c0dd00b682 port 8 Continue Isolation switchtest started on switch 100000c0dd00b682 port 8 Estimated test time 14 minute s 01 30 02 11 22 11 diag209 Storage Automated Diagnostic Environment MSGID 6013 switchtest FATAL switch0 Device Switch Port 8 is Offline switcht...

Page 114: ...emaining link I O is automatically routed over the repaired link by the switch after the failed link is replaced and recabled No manual intervention is required If both links have failed and a LUN failover has occurred you must manually run a failbackt3path command to return the paths to their optimal state after you repair and recable the links To Isolate the T1 or T2 Data Path 1 Run linktest fro...

Page 115: ...nostic Environment GUI displays an event grid that describes an event and its severity and tells what if any action should be taken Refer to the Storage Automated Diagnostic Environment User s Guide for more information To Use the Sun StorEdge T3 Array Event Grid 1 From the Storage Automated Diagnostic Environment Help menu click the Event Grid link 2 Select the criteria from the Storage Automated...

Page 116: ...ow Y The vol slice feature is available in Sun StorEdge T3 array firmware version 2.1 and above. This option enables volume slicing of up to 16 LUNs per single Sun StorEdge T3 array or partner group. This feature also enables LUN masking HBA zoning features. This option is disabled by default. To activate the feature, type sys_enable_volslice_on from the Sun StorEdge T3 array command line. disk port Alarm Re...

Page 117: ...e matching firmware with the other loopcard 4 Reenable the loopcard if possible enable u encid 1 2 5 Replace the loopcard if necessary 6 Reenable the disk if possible 7 Replace the disk if necessary power battery Alarm Red Y The state of the batteries in the Sun StorEdge T3 array is not optimal Possible causes are The voltage level on the power supply and the battery have moved out of acceptable t...

Page 118: ...u stat 3 Replace power cooling unit if necessary power temp Alarm Red Y The state of the temperature in the Sun StorEdge T3 array power cooling unit is either too high or is unknown 1 Open a Telnet session to the affected Sun StorEdge T3 array 2 Verify that the power cooling unit state is in fru stat 3 Replace the power cooling unit if necessary log Alarm Red Y This event includes all important er...

Page 119: ...gained oob OutOfBand ib Comm_Lost Down Y Since InBand (ib) monitoring is established using luxadm, the monitoring may not be activated for a particular Sun StorEdge T3 array. 1. Verify luxadm with the command line (luxadm probe, luxadm display). 2. Verify cables, GBICs, and connections along the data path. 3. Check the Storage Automated Diagnostic Environment SAN Topology GUI to identify the failing segment of t...

Page 120: ...cted Sun StorEdge T3 array 2 Verify that the Sun StorEdge T3 array is booted correctly 3 Verify the correct TCP IP settings on the Sun StorEdge T3 array 4 Increase the http timeout 5 Ping timeout in Utilities System System Timeouts The current default timeouts are 10 seconds for ping and 60 seconds for http tokens t3ofdg Diagnostic Test Red The t3ofdg 1M test failed t3test Diagnostic Test Red The ...

Page 121: ...ogy A new controller as identified by its serial number has been installed on the Sun StorEdge T3 array disk Topology A new disk as identified by its serial number has been installed on the Sun StorEdge T3 array interface loopcard Topology A new loopcard as identified by its serial number has been installed on the Sun StorEdge T3 array power Topology A new PCU has been installed on the Sun StorEdg...

Page 122: ...the chassis Replace the loopcard within the 30 minute power shutdown timeframe power Topology Red Y The Sun StorEdge T3 array has reported that a power cooling unit PCU has been removed from the chassis Replace the PCU within the 30 minute shutdown timeframe controller State Change The status of the controller has changed from disabled to ready enabled disk State Change The status of the disk has ...

Page 123: ...ported that a power cooling unit has been disabled. 1. Open a Telnet session to the affected Sun StorEdge T3 array. 2. Verify the controller state with fru_stat and sys_stat. 3. Re-enable the controller if possible (enable u). 4. Run logger dmprstlog from a serial port session on the affected controller. The output from logger goes only to the syslog facility. Review the syslog on the master controller to...

Page 124: ...lid system area on drive 9 Drive not present D Drive disabled is being reconstructed S Drive substituted 3 Replace the disk if necessary interface loopcard State Change Red Y The Sun StorEdge T3 array has indicated that the loopcard is no longer in an optimal state 1 Open a Telnet session to the affected Sun StorEdge T3 array 2 Verify loopcard state with fru stat 3 Verify matching firmware with ot...

Page 125: ...em area on drive 9 Drive not present D Drive disabled is being reconstructed S Drive substituted power State Change Red Y The Sun StorEdge T3 array has reported that a power cooling unit has been disabled A PCU failure can happen due to 1 Power loss 2 The PCU fails 3 The power switch is disrupted 1 Check the power supply and cables 2 Replace PCU if necessary enclosure Statistics Displays statistic...

Page 127: ...Names on page 115 Virtualization Engine Event Grid on page 132 About the Virtualization Engine The virtualization engine supports the multipathing functionality of the Sun StorEdge T3 array Each virtualization engine has physical access to all underlying Sun StorEdge T3 arrays and controls access to half of the Sun StorEdge T3 arrays The virtualization engine has the ability to assume control of a...

Page 128: ... LED readout See Appendix A for the table of codes and related appropriate actions to take In some cases you might not be able to receive SRNs because of communication errors If this occurs you must read the virtualization engine LEDs to determine the problem Retrieving Service Information You can retrieve service information from one of two sources CLI Interface Error Log Analysis Commands Both o...

Page 129: ...p nnn Txxxxx uuuuuuuu SRN mmmmm TimeStamp nnn Txxxxx uuuuuuuu SRN mmmmm TimeStamp nnn Txxxxx uuuuuuuu SRN mmmmm Item Description TimeStamp The time and date when the error occurred nnn The name of the virtualization engine pair v1 or v2 Txxxxx1 The LUN where the error occurred uuuuuuuu The unique ID of the drive or the virtualization engine router SRN mmmmm The SRN defined in numerical order Refer...

Page 130: ...2 Jan 3 10 22 26 v1 29000060 220041F9 SRN 70030 2002 Jan 3 10 25 54 v1 29000060 220041F9 SRN 70030 opt svengine sduc sclrlog TABLE 9 1 Virtualization Engine LEDs LED Color State Description Power Green Solid on The virtualization engine is powered on Status1 Green Solid on Blink service code This is the normal operating mode The number of blinks indicate a decimal number that corresponds to a diag...

Page 131: ...ngine in decimal numbers. Each decimal number is represented by a number of blinks, followed by a medium-duration period (two seconds) of no LED display. TABLE 9-2 lists the status LED code descriptions. The blink code repeats continuously, with a four-second off interval between code sequences. TABLE 9-2 LED Diagnostic Codes Code LED Blink Pattern 0 Fast 1 Once 2 Twice with one second between blinks 3 Th...
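
The blink scheme described above maps each group of blinks to one decimal digit, with a fast blink standing for 0. As a rough illustration only, the sketch below turns a sequence of observed blink groups into a service code string. The function name and the "fast" marker are hypothetical, and the sketch assumes the observer has already separated the groups using the two-second pauses.

```python
def decode_blink_groups(groups):
    """Convert observed blink groups into a decimal service code string.

    Each item is either the string "fast" (digit 0) or an integer blink
    count from 1 to 9 (digits 1 to 9).
    """
    digits = []
    for group in groups:
        if group == "fast":
            digits.append("0")
        elif isinstance(group, int) and 1 <= group <= 9:
            digits.append(str(group))
        else:
            raise ValueError(f"unrecognized blink group: {group!r}")
    return "".join(digits)
```

For example, a fast blink followed by five blinks and then one blink would read as code 051 under these assumptions.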

Page 132: ...LEDs The Ethernet port LEDs indicate the speed activity and validity of the link shown in TABLE 9 3 TABLE 9 3 Speed Activity and Validity of the Link LED Color State Description Speed Amber Solid on Off The link is 100Base TX The link is 10Base T Link Activity Green Solid on Blink A valid link is established Operations including data activity are normal FC Port Host Side Power Switch Power Plug RJ...

Page 133: ... another reading The number of new errors that occurred within that time frame represents the number of link errors TABLE 9 4 Virtualization Engine Statistical Data Count Type Description Link failure count The number of times the virtualization engine s frame manager detects a nonoperational state or other failure of N port initialization protocol Loss of synchronization count The number of times...
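
The procedure above amounts to taking two readings of the counters and subtracting the first from the second. The sketch below shows that arithmetic, using the counter names that appear in the statistics output of this chapter. The function name and the sample readings in the usage note are hypothetical, and treating a decreasing counter as an error is an assumption.

```python
# Counter names as they appear in the virtualization engine statistics.
COUNTERS = (
    "Link Failure Count",
    "Loss of Sync Count",
    "Loss of Signal Count",
    "Protocol Error Count",
    "Invalid Word Count",
    "Invalid CRC Count",
)


def new_link_errors(first_reading, second_reading):
    """Return the counters that increased between two readings.

    Both arguments are dictionaries mapping counter name to value.
    Counters missing from a reading are treated as 0.
    """
    deltas = {}
    for name in COUNTERS:
        delta = second_reading.get(name, 0) - first_reading.get(name, 0)
        if delta < 0:
            # Assumption: a counter that goes down means it was reset
            # between readings, so the comparison is not meaningful.
            raise ValueError(f"{name} decreased between readings")
        if delta:
            deltas[name] = delta
    return deltas
```

If a first reading showed an Invalid Word Count of 8 and a later one showed 20, the function would report 12 new invalid-word errors for that interval.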

Page 134: ...ount 0 Loss of Sync Count 0 Loss of Signal Count 0 Protocol Error Count 0 Invalid Word Count 8 Invalid CRC Count 0 I00001 Device Side FC Vital Statistics Link Failure Count 0 Loss of Sync Count 0 Loss of Signal Count 0 Protocol Error Count 0 Invalid Word Count 139 Invalid CRC Count 0 I00002 Host Side FC Vital Statistics Link Failure Count 0 Loss of Sync Count 0 Loss of Signal Count 0 Protocol Erro...

Page 135: ...eeded to identify this LUN The procedure to obtain the VLUN serial number is detailed next CODE EXAMPLE 9 2 luxadm Output for a Host Device usr sbin luxadm display dev rdsk c4t2B00006022004186d0s2 DEVICE PROPERTIES for disk dev rdsk c4t2B00006022004186d0s2 Status Port A O K Vendor SUN Product ID SESS01 WWN Node 2a00006022004186 WWN Port A 2b00006022004186 Revision 080E Serial Num Unsupported Unfor...

Page 136: ... at the scsi prompt 4 Find the VLUN serial number in the Inquiry displayed list From this screen note that the VLUN number is 62 57 33 4b 30 30 31 48 beginning with the fifth pair of numbers on the third line up to and including the twelfth pair format e c4t2B00006022004186d0 format scsi scsi inquiry Inquiry 00 00 03 12 2b 00 00 02 53 55 4e 20 20 20 20 20 SUN 53 45 53 53 30 31 20 20 20 20 20 20 20...
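
The step above locates the VLUN serial number as the fifth through twelfth hex pairs on the third line of the inquiry output. A minimal sketch of that extraction follows. The function name is hypothetical, and because the third line of the real output is not reproduced in this excerpt, the sample line in the usage note is made up except for the eight serial-number bytes quoted in the text.

```python
def vlun_serial_from_inquiry(lines):
    """Extract the VLUN serial number from format/scsi inquiry output.

    `lines` holds the hex dump lines in order. Each line is assumed to
    start with 16 space-separated hex pairs, optionally followed by an
    ASCII rendering. Returns (hex_string, ascii_string).
    """
    if len(lines) < 3:
        raise ValueError("inquiry output needs at least three lines")
    pairs = lines[2].split()[:16]
    serial_pairs = pairs[4:12]  # fifth through twelfth pair
    if len(serial_pairs) != 8:
        raise ValueError("third line is too short")
    hex_string = "".join(serial_pairs).upper()
    ascii_string = bytes.fromhex(hex_string).decode("ascii")
    return hex_string, ascii_string
```

Given a hypothetical third line such as `30 38 30 45 62 57 33 4b 30 30 31 48 30 30 30 30`, the function returns the hex string 6257334B30303148, which is the form that appears embedded in the VLUN device path elsewhere in this chapter.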

Page 137: ...ev rdsk c6t29000060220041956257334B30303148d0s2 DEVICE PROPERTIES for disk dev rdsk c6t29000060220041956257334B30303148d0s2 Status Port A O K Status Port B O K Vendor SUN Product ID SESS01 WWN Node 2a00006022004195 WWN Port A 2b00006022004195 WWN Port B 2b00006022004186 Revision 080E Serial Num Unsupported Unformatted capacity 56320 000 MBytes Write Cache Enabled Read Cache Enabled Minimum prefetc...

Page 138: ...5 0 t3b00 6257334F30304149 T49152 T16385 VDRV001 55 0 DISK POOL SUMMARY Disk pool RAID MP Drive Size Largest Free Total Free Number of Target GB Block GB Space GB VLUNs t3b00 5 T49152 477 367 367 2 t3b01 5 T49153 477 477 477 0 MULTIPATH DRIVE SUMMARY Disk pool MP Drive T3 Active Controller Serial Target Path WWN Number t3b00 T49152 50020F2300006DFA 60020F2000006DFA t3b01 T49153 50020F230000725B 60...

Page 139: ...020F2000006DFA which you need to perform Sun StorEdge T3 array LUN failback commands Determining the virtualization engine pairs on the system MAIN MENU SUN StorEdge 6910 SYSTEM CONFIGURATION TOOL 1 T3 Configuration Utility 2 Switch Configuration Utility 3 Virtualization Engine Configuration Utility 4 View Logs 5 View Errors 6 Exit Select option above 3 VIRTUALIZATION ENGINE MAIN MENU 1 Manage VLU...

Page 140: ... on page 118. If there has been a failover, the Multipath Drive Summary shows the same Sun StorEdge T3 array active path WWN for all LUNs associated with one Sun StorEdge T3 array, as shown in CODE EXAMPLE 9-3. CODE EXAMPLE 9-3 Multipath Drive Summary 2. If the Sun StorEdge T3 array LUNs have failed over, run the command found in CODE EXAMPLE 9-5 for that specific Sun StorEdge T3 array. Note: The Sun ...
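
The failover test described above (all LUNs of one array showing the same active path WWN) can be sketched as a simple grouping check over the Multipath Drive Summary rows. The function name is hypothetical, and the sketch assumes that a disk pool name is the array name plus a one-character suffix (for example, t3b00 and t3b01 both belong to t3b0), which is inferred from the sample output rather than stated in the guide.

```python
from collections import defaultdict


def arrays_with_failover(summary_rows):
    """Return array names whose disk pools all share one active path WWN.

    Each row is (disk_pool, mp_drive_target, active_path_wwn,
    controller_serial), matching the Multipath Drive Summary columns.
    """
    wwns_by_array = defaultdict(set)
    pools_by_array = defaultdict(set)
    for disk_pool, _target, active_wwn, _serial in summary_rows:
        array = disk_pool[:-1]  # assumption: "t3b00" -> "t3b0"
        wwns_by_array[array].add(active_wwn.upper())
        pools_by_array[array].add(disk_pool)
    return sorted(
        array
        for array, wwns in wwns_by_array.items()
        if len(pools_by_array[array]) > 1 and len(wwns) == 1
    )
```

With the two summary rows shown earlier in this chapter (t3b00 on 50020F2300006DFA and t3b01 on 50020F230000725B), the function returns an empty list, since each pool is on its own controller path.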

Page 141: ...torEdge T3 array and virtualization engines are online In this example a t3b0 should be plugged in to port 2 on a 1 Gbit switch of both sw2a and sw2b port 1 on a 2 Gbit switch b The virtualization engine should be plugged in to port 1 on a 1 Gbit switch of the same two switches port 0 on a 2 Gbit switch Refer to the Sun StorEdge 3900 and 6900 Series 2 0 Reference and Service Guide to determine whi...

Page 142: ...arget Devices 1 Address 0x02 0xef Proxy AL_PA Public Address World Wide Name E8 0010C000 2900006022004195 E4 00110000 2900006022004186 3 TL_Port online offline Not logged in 4 TL_Port online offline Not logged in 5 TL_Port online offline Not logged in 6 TL_Port online offline Not logged in 7 T_Port online online logged in 8 T_Port online online logged in Name Server Port Address Type PortWWN Node ...

Page 143: ...ly Clearing and Restoring the SAN Database It is occasionally necessary to manually clear and restore the SAN database on the virtualization engines Caution This procedure clears the SAN database and removes the configuration of the disk pools multipath drives zoning and VLUNs After you perform this procedure you must restore the virtualization map to the virtualization engine pair using restoreve...

Page 144: ...t to reset the virtualization engines using the following steps 2 To disable the switch ports associated with the vehostname type 3 Open a Telnet session into vehostname and clear the SAN database by entering 9 at the prompt 4 Select Q to exit the telnet session 5 To enable the switch ports associated with the vehostname type 6 To reset the virtualization engine and force it to synchronize with it...

Page 145: ...he password The User Service Utility Menu is displayed 4 Type 9 to clear the SAN database A successful command displays the message SAN database has been cleared An unsuccessful command results in the service code 051 If this occurs repeat Steps 1 through 3 If the command continues to fail replace the virtualization engine 5 To reconnect the virtualization engine s device side FC cables type 6 Typ...

Page 146: ... use Segments identified with 0x5555aa in the address are associated with slicd 3 Remove the segments by typing the following Refer to the ipcrm 1 man page for details The message queues and shared memory and semaphores have been removed ps ef grep slicd ipcs IPC status from running system as of Wed Feb 20 12 48 30 MST 2002 T ID KEY MODE OWNER GROUP Message Queues Shared Memory m 0 0x50000483 rw r...

Page 147: ...n ERROR HALT 50 condition requires that you visually inspect the virtualization engines with firmware revision 8.14 or earlier. For virtualization engines with firmware revision 8.17 or later, you can determine error conditions with the following steps: a. Open a Telnet session into v1_hostname. b. Display the Vital Product Data (VPD) by entering 1 at the prompt. The last line of the output displays any er...

Page 148: ... Product Type FC FC 3 SVE H FC FC 3 router H Firmware Revision 8 017 Official Release Vicom release Apr 11 2002 17 49 16 Loader Revision 2 02 42 Unique ID 00000060 2200418A Unit Serial Number 00250339 PCB Number 00166425 MAC address 0 60 22 3 D1 E3 DIP SW1 00000000 DIP SW2 00000011 1 down 0 up 76543210 76543210 Error None ...

Page 149: ...e following example creatediskpools 1M for t3b0 indicates a missing Sun StorEdge T3 array path Thu May 30 17 35 19 MDT 2002 creatediskpools t3b0 ENTER opt SUNWsecfg bin creatediskpools n t3b0 Thu May 30 17 35 19 MDT 2002 checkslicd v1 ENTER opt SUNWsecfg bin checkslicd n v1 Thu May 30 17 35 21 MDT 2002 checkslicd v1 EXIT There are no eligible drives to create MultiPath drive automatically Thu May ...

Page 150: ...d in 3 TL_Port online offline Not logged in 4 TL_Port online offline Not logged in 5 TL_Port online offline Not logged in 6 TL_Port online offline Not logged in 7 T_Port online online logged in 8 T_Port online online logged in Name Server Port Address Type PortWWN Node WWN FC 4 Types 01 10C000 N 2900006022004195 2800006022004195 SCSI_FCP Here port 2 on sw2a is offline If required ports are offline...

Page 151: ...SUNWsecfg bin creatediskpools n t3b0 Thu May 30 17 40 24 MDT 2002 checkslicd v1 ENTER opt SUNWsecfg bin checkslicd n v1 Thu May 30 17 40 28 MDT 2002 checkslicd v1 EXIT MultiPath found T00000 and T00002 MultiPath found T00001 and T00003 Automatic MultiPath Drive created successfully Thu May 30 17 40 58 MDT 2002 creatediskpools mpdrive T49152 is t3b00 New disk pool name is t3b00 Thu May 30 17 41 17 ...

Page 152: ...nt grid that describes the severity of the event tells whether action is required provides a description of the event and lists the recommended action Refer to the Storage Automated Diagnostic Environment User s Guide Help section for more information To Use the Virtualization Engine Event Grid From the Storage Automated Diagnostic Environment Help menu select the Event Grid link FIGURE 9 3 shows ...

Page 153: ...er if needed volume_add Alarm Yellow A new VLUN was added to the configuration None volume_ delete Alarm Yellow A VLUN was deleted from the configuration None enclosure Alarm log Yellow Port statistics on virtualization engine v1a changed None enclosure Audit Automatic weekly audits send a detailed description of the enclosure to the Sun Network Storage Command Center NSCC None oob OutofBand Comm_...

Page 154: ...ver conditions in the Sun StorEdge T3 array 6 Replace the virtualization engine if necessary oob command Comm_ Lost Down Invalid command or slicd daemon problem 1 Check the status of the slicd daemon 2 Check the power on the virtualization engine 3 Make sure the virtualization engine is booted correctly 4 Verify that the TCP IP settings on the virtualization engine are correct 5 Check the T3 messa...

Page 155: ...y The discovery device found a new virtualization engine called v1a Discovery events occur the first time the agent probes a storage device and creates a detailed description of the device monitored The discovery device sends it using any active notifier such as NetConnect or email TABLE 9 5 Storage Automated Diagnostic Environment Event Grid for Virtualization Engine Continued Component EventType...

Page 157: ...o monitor the host to switch link The Storage Automated Diagnostic Environment running the Storage Service Processor is not able to execute switchtest 1M on switch ports with Microsoft Windows 2000 HBAs currently attached as F ports The FRUs in the host to switch link can be isolated using the HBA utilities on the host and Storage Automated Diagnostic Environment s switchtest on the Storage Servic...

Page 158: ...Troubleshooting Tasks Using Microsoft Windows 2000. Launching the Sun StorEdge T3 Array Failover Driver GUI: From the Microsoft Windows 2000 Advanced Server GUI, click Programs T3 StorEdge Configurator Configurator. FIGURE 10-1 Launching the Sun StorEdge T3 Array Failover Driver ...

Page 159: ...orEdge T3 Array Failover Driver Versions 2.0.0.123 and 2.1.0.104. Note: In FIGURE 10-2, the example on the left shows build number 2.0.130, composed of driver version 2.0.0.123 and application version 2.0.0.125. The same build number might have a different driver version and application version. The example on the right shows build number 2.0.130, composed of driver version 2.1.0.104 and application ve...

Page 160: ...dvanced Server GUI, click Administrative Tools Computer Management Software Environment. 2. Ensure the Jafo driver is in a Running and OK state. 3. Launch the Sun StorEdge T3 Array Failover Driver GUI using the instructions in "Launching the Sun StorEdge T3 Array Failover Driver GUI" on page 138. The Multipath Configurator window is displayed. A healthy Sun StorEdge 3900 series system has a solid line co...

Page 161: ...A system that has experienced a LUN failover has a broken line connecting the HBA to the storage, as shown in FIGURE 10-4. FIGURE 10-4 Sun StorEdge 3900 series system with a LUN failover, shown using Multipath Configurator. 5. To further check the affected Sun StorEdge T3 array: a. Right-click the Sun StorEdge T3 array in the failed path. b. Select Array Properties. FIGURE 10-5 Multipath Configurator Array ...

Page 162: ...nd Line Interface (CLI). Use the jafo_nutil.exe interface, which is available with Sun StorEdge T3 Array Failover Driver version 2.1 and later, to gather information about: the WWN of monitored Sun StorEdge T3 array partner groups; the WWN of individual LUNs; device paths; LUN-to-drive-letter mapping; the status for primary paths, secondary paths, standby paths, and active paths. In addition, you can use the jafo_nu...

Page 163: ...rom OS DEVICE VENDOR Sun Microsystems T3 Disk Array FW_REV 0201 SERIAL 00163874 WWN 60020f20000003d50000000000000000 FO_CAPABLE true MASTER true LUN NAME G WWN 60020f20000003d53cf7c0f500028022 GOOD_PATHS 2 STATE up 1 PATH NAME 5 0 0 0 HBA_NAME Device ScsiPort5 TARGET 0 0 0 TYPE secondary STATE up_standby 3 PATH NAME 4 0 0 0 HBA_NAME Device ScsiPort4 TARGET 0 0 0 TYPE primary STATE up_active 2 CONT...

Page 164: ...rue LUN NAME J WWN 290000602200418f6257335430303177 GOOD_PATHS 2 STATE up 1 PATH NAME 4 0 0 0 HBA_NAME Device ScsiPort4 TARGET 0 0 0 TYPE primary STATE up_active 2 PATH NAME 5 0 0 0 HBA_NAME Device ScsiPort5 TARGET 0 0 0 TYPE primary STATE up_active 2 LUN NAME K WWN 290000602200418f6257335430303178 GOOD_PATHS 2 STATE up 1 PATH NAME 4 0 0 1 HBA_NAME Device ScsiPort4 TARGET 0 0 0 TYPE primary STATE ...

Page 165: ... engine. WWN: The worldwide name of the Master virtualization engine of the partner group. LUN NAME: Microsoft Windows 2000 device letter. WWN: The first 16 digits correspond to the Master virtualization engine WWN from the Device section. The last 16 digits are the VLUN serial number. You can crosscheck the WWN using: the SUNWsecfg virtualization engine maps; the Storage Automated Diagnostic Environment's ...
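The 16 + 16 digit split described for the LUN WWN can be sketched in a few lines of Python. This is an illustrative helper only, not part of the failover driver or SUNWsecfg; the function names `split_lun_wwn` and `lun_belongs_to_engine` are hypothetical, and the sample value is the LUN J WWN from the jafo_nutil listing quoted on the preceding pages.

```python
from typing import Tuple


def split_lun_wwn(wwn: str) -> Tuple[str, str]:
    """Split a 32-hex-digit LUN WWN into (master engine WWN, VLUN serial).

    Hypothetical helper: follows the field description in the text, where the
    first 16 digits are the Master virtualization engine WWN and the last 16
    digits are the VLUN serial number.
    """
    digits = wwn.strip().lower()
    if len(digits) != 32 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"expected 32 hexadecimal digits, got {wwn!r}")
    return digits[:16], digits[16:]


def lun_belongs_to_engine(lun_wwn: str, engine_wwn: str) -> bool:
    """Return True if the LUN WWN prefix matches the given engine WWN."""
    master, _serial = split_lun_wwn(lun_wwn)
    return master == engine_wwn.strip().lower()


if __name__ == "__main__":
    master, serial = split_lun_wwn("290000602200418f6257335430303177")
    print("master engine WWN:", master)
    print("VLUN serial:", serial)
```

Comparing the first half against the WWN shown in the Device section is a quick way to crosscheck that a drive letter maps to the expected virtualization engine pair before consulting the engine maps.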

Page 167: ...r the Error. One of the best ways to discover errors is by using the Storage Automated Diagnostic Environment monitoring system. The Storage Automated Diagnostic Environment should be configured to email alerts and events to a local System Administrator. In FIGURE 11-1, the alert was displayed using the Storage Automated Diagnostic Environment GUI. FIGURE 11-1 Alerts Display Using the Storage Automated...

Page 168: ...heck the Sun StorEdge T3 Array Failover Driver. The next set of diagrams shows the fault as displayed by the driver and the results of drilling down for more details. FIGURE 11-2 Drilling Down for Sun StorEdge T3 Array Failover Driver Fault Detail. Array 1: A solid line connecting the HBA to the storage represents a healthy system. Array 2: A dotted line connecting the HBA to the storage represents a LUN...

Page 169: ... SunBlade. 4. Isolate the components in the path. The components in the path are the HBA, the cable, the switch-side GBIC, and the Sun StorEdge network FC switch itself. To isolate all components, use a combination of the Storage Automated Diagnostic Environment and the HBA utility (QLogic SunBlade). Note: If no HBA utility is present, or if the utilities do not offer diagnostics, a best-guess effort will have ...

Page 170: ...e then run. The HBA passed the tests. Note: The next components that can be isolated are the switch-side GBIC and the Sun StorEdge network FC switch itself. For these components, you can launch the tests using the Storage Automated Diagnostic Environment Diagnose Tests Test From Topology functionality. 5. Again, temporarily remove the cable from the switch port in question, insert a loopback connector plug...

Page 171: ...In the examples shown in FIGURE 11-5, FIGURE 11-6, and FIGURE 11-7, Port 2 on Switch diag156 sw1a was marked with a Red icon, indicating a problem. Note: All tests were run with the default values. FIGURE 11-5 Storage Automated Diagnostic Environment Test from Topology ...

Page 172: ...FIGURE 11-6 Storage Automated Diagnostic Environment Test from Topology Pull-Down Menu. FIGURE 11-7 Storage Automated Diagnostic Environment Test from Topology Test Detail ...

Page 173: ...onnector was re-inserted and the same test was run a second time. FIGURE 11-8 Successful Switch Test Results. On this pass, the test was successful. This indicates that the problem was most likely the switch-side GBIC, which was replaced. 6. Recover the problem with the GBIC or the switch. a. Recable the link between the HBA and switch. b. Use the Sun StorEdge T3 Array Failover Driver GUI for the Sun StorEdg...

Page 174: ...11-9 Multipath Recovery using the Sun StorEdge T3 Array Multipath Configurator. Note: Storage Automated Diagnostic Environment should also post an event noting that the port has gone back online. The Multipath Configurator GUI should show both paths online and handling I/O, as illustrated in FIGURE 11-10. FIGURE 11-10 Recovered Paths ...

Page 175: ... appendix contains the following information: "SRN Reference" on page 155; "SRN/SNMP Single Point of Failure Descriptions" on page 159; "Port Communication Numbers" on page 160; "Virtualization Engine Service Codes" on page 160. SRN Reference: TABLE A-1 provides an explanation of SRNs for the virtualization engine ...

Page 176: ... drive with a new drive. 70005: If the initiator is master, then it has detected a write error on a member within a mirror drive. If a spare drive is available, use it to replace the failed drive. If no spare is available, replace the failed drive with a new drive. 70006: Communication between the virtualization engines has failed. Update the firmware. 70007: The primary drive cannot write to the drive being ...

Page 177: ...ultipath drive failover occurred. Check the multipath drive. 70051: A multipath drive failback occurred. No action is needed. 70098: Instant copy degraded. If no spare is available, replace the failed drive with a new drive. 70099: Degrade because the drive has disappeared. Reinsert the missing drive or replace it with a drive of equal or greater capacity. 7009A: A mirror drive was written to, causing it to ent...

Page 178: ...71010: The status of the SLIC daemon has changed. No action is needed. 72000: The primary and secondary SLIC daemon connection is active. No action is needed. 72001: The virtualization engine failed to read the SAN drive configuration. No action is needed. 72002: The virtualization engine failed to lock on to the SLIC daemon. No action is needed. 72003: The virtualization engine failed to read the SAN SignOn i...
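When SRNs arrive in log messages or email alerts, a small lookup table makes triage faster. The sketch below is an assumption-laden illustration, not a tool shipped with the product: `SRN_TABLE` and `describe_srn` are hypothetical names, and the table holds only the four entries that are fully legible in this excerpt of TABLE A-1. Codes are kept as strings because some SRNs (for example 7009A) contain hexadecimal letters.

```python
# Partial SRN lookup built only from entries legible in this excerpt of
# TABLE A-1; anything else is reported as unknown rather than guessed.
SRN_TABLE = {
    "71010": ("The status of the SLIC daemon has changed.",
              "No action is needed."),
    "72000": ("The primary and secondary SLIC daemon connection is active.",
              "No action is needed."),
    "72001": ("The virtualization engine failed to read the SAN drive configuration.",
              "No action is needed."),
    "72002": ("The virtualization engine failed to lock on to the SLIC daemon.",
              "No action is needed."),
}


def describe_srn(code) -> str:
    """Return a one-line description and action for an SRN (hypothetical helper)."""
    key = str(code).strip().upper()
    entry = SRN_TABLE.get(key)
    if entry is None:
        return f"SRN {key}: not in this excerpt; consult TABLE A-1."
    description, action = entry
    return f"SRN {key}: {description} Action: {action}"


if __name__ == "__main__":
    for srn in ("72000", 71010, "7009a"):
        print(describe_srn(srn))
```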

Page 179: ...020 70030 70051 70025 The IP of the partner s virtualization engine is not reachable Check the Ethernet cabling and connections None 72000 72007 70020 70021 70022 70025 70030 70050 The SAN topology has changed The Global SAN configuration has changed The SAN configuration has changed The IP of the partner virtualization engine is not reachable A physical device is missing A SLIC virtualization eng...

Page 180: ...n engine Virtualization engine 25001. TABLE A-4 Virtualization Engine Service Codes: 0-399 Host-Side Interface Driver Errors. Service Code Number / Cause of Error / Recommended Corrective Action. 005: A PCI bus parity error has occurred. Replace the virtualization engine. 24: The attempt to report one error resulted in another error. Cycle power to the virtualization engine. 40: The database is corrupt. Clear the...

Page 181: ...of the virtualization engine. If necessary, clear the SAN database. If necessary, cycle the virtualization engine power. If necessary, import the SAN zone configuration. 54: The cabling configuration is unauthorized. Check the cabling. 57: Too many HBAs are attempting to log in. Check the cabling. 60: The node mapping table was cleared using SW2. No action required. 62: SW2 settings are incorrect. Correct the SW2 s...

Page 182: ...ive Action. 409: The FC device-side type code is invalid. Cycle the power. If the problem persists, replace the virtualization engine. 434: Cannot continue due to many elastic store errors. Elastic store errors result from a clock mismatch between transmitter and receiver and indicate an unreliable link. This error can also occur if a device in the SAN loses power unexpectedly. Check for the faulty componen...

Page 183: ...is appendix. The error messages are broken out into the following sections: "Virtualization Engine Error Messages" on page 164: TABLE B-1 lists SUNWsecfg error messages specific to the virtualization engine. "Switch Error Messages" on page 168: TABLE B-2 lists SUNWsecfg error messages specific to the Sun StorEdge network FC switch-8 and switch-16 switches. "Sun StorEdge T3 Array Partner Group Error Messages" ...

Page 184: ...ngines to confirm that the configuration locks are set. Common to virtualization engine: "The virtualization engine was unable to obtain a lock on vepair." Another virtualization engine command is updating the configuration. 1. Run listavailable -v, which returns the status of individual virtualization engines. 2. Check for the lock file directly by using ls -la /opt/SUNWsecfg/etc (look for v1 lock or v2 lock). 3...
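The manual lock-file check in step 2 can also be scripted. The following is a minimal sketch under stated assumptions: `find_lock_files` is a hypothetical helper, the default directory is the SUNWsecfg configuration directory as reconstructed from the text, and, because the exact lock-file names are not legible in this excerpt, the match is simply any file whose name contains "lock". It only lists candidates; it never removes anything.

```python
from pathlib import Path
from typing import List


def find_lock_files(config_dir: str = "/opt/SUNWsecfg/etc") -> List[str]:
    """List files in config_dir whose names contain 'lock' (case-insensitive).

    Hypothetical helper. The substring match is an assumption, since the
    exact lock-file names are not recoverable from the excerpt.
    """
    directory = Path(config_dir)
    if not directory.is_dir():
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and "lock" in entry.name.lower()
    )


if __name__ == "__main__":
    locks = find_lock_files()
    if locks:
        print("Possible configuration locks:", ", ".join(locks))
    else:
        print("No lock files found (or directory not present).")
```

A lock that is present while no configuration command is running is a candidate for a lock set in error, which the table says to clear with the documented removelocks command rather than by deleting the file by hand.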

Page 185: ...set properly The server port number is not set properly The host WWN Authentications are not set properly The host IP Authentications are not set properly The Other VEHOST IP is not set properly 1 Log in to the virtualization engine and verify that the device host and network settings are correct 2 Make sure the virtualization engine hardware is not in ERROR 50 mode 3 If required power cycle the v...

Page 186: ...n the rmvezone command 2 If errors still exist run sadapter alias d vepair r initiator a zone n 3 Run savemap n vepair createvlun Invalid disk pool diskpool on vepair or disk pool is unavailable 1 Run the showvemap n vepair command to verify that the disk pool was created properly 2 If the disk pool is unavailable run creatediskpools n t3name 3 If that fails check the Sun StorEdge T3 array for unm...

Page 187: ...ne is unable to properly configure the virtualization engine host vehost The virtualization engine cannot continue the configuration of other components 1 Check the status of the virtualization engine and reset if necessary 2 Run the setdefaultconfig command again setdefaultconfig The setupve 1M command failed 1 Run setupve n ve_hostname v verbose mode 2 Check the errors 3 Run checkve n ve_hostnam...

Page 188: ...switch in question does not appear, check for the existence of the lock file directly by typing ls -la /opt/SUNWsecfg/etc (look for switch lock). 3. If the lock is set in error, use the removelocks -s command to clear it. Due to a non-reentrant interface, there is a single lock file for all switches. Only one can be accessed at a time. Common to all Sun StorEdge network FC switch commands: Unable to determine s...

Page 189: ...with a zoneset and zone s Each zone then has port or WWN members 1 Gbit switches had numbered hard zones only 1 Attempt to activate an existing zone set 2 opt SUNWsecfg flib sanbox2 x switchip get_zoneset_list 3 From that list select the zoneset you want to be active It should be named something similar to hostname_sw1a_zset 4 opt SUNWsecfg flib sanbox2 x switchip activate_zoneset zoneset 5 If you...

Page 190: ...ctly. 1. Wait several minutes. 2. Run ping switch. 3. If errors persist, manually power cycle the switch. setupswitch: "Switch switch timed out after reset." The switch took longer than two minutes to reset after a configuration change. 1. Wait several minutes. 2. Run ping switch. 3. If errors persist, manually power cycle the switch. setupswitch: "Could not set chassis ID on switch switch to cid." This occurs only in a ...
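The wait, ping, then escalate sequence above is a bounded retry loop. The sketch below shows that pattern only; `wait_until_reachable` is a hypothetical name, and the reachability check is passed in as a callable so that no particular ping syntax is assumed. The attempt count and delay are illustrative defaults, not values from the guide.

```python
import time
from typing import Callable


def wait_until_reachable(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping between failed tries.

    Returns True as soon as the probe succeeds. Returns False if every
    attempt fails, which corresponds to the "manually power cycle the
    switch" step in the text.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Stand-in probe for demonstration: succeeds on the third call.
    calls = {"count": 0}

    def fake_probe() -> bool:
        calls["count"] += 1
        return calls["count"] >= 3

    reachable = wait_until_reachable(fake_probe, delay_seconds=0.0)
    print("switch reachable:", reachable)
```

In practice the probe would wrap whatever reachability check is available on the Storage Service Processor; keeping it injectable also makes the loop easy to test without a live switch.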

Page 191: ... Refer to the raid cfg files in opt SUNWsecfg etc to determine if the configuration commands are set up and functioning properly Common to Sun StorEdge T3 array Could not mount volume volume lun config does not match The LUN might have multiple drive failures or corrupted data or parity 1 Replace the failed FRUs 2 Restore the Sun StorEdge T3 array configuration with the restoret3config f n t3_name...

Page 192: ...lume slicing is allowed with this version of firmware Common to Sun StorEdge T3 array Error while opening the Sun StorEdge T3 array Cannot open after resetting the Sun StorEdge T3 array 1 Check the Sun StorEdge T3 master and alternate master network connection 2 Check with Ping t3_name command to determine if the Sun StorEdge T3 array is operating checkt3config The vol init command is being execut...

Page 193: ...he Sun StorEdge T3 and T3 array documentation enablevolslicing Error checking the Sun StorEdge T3 enable volume slicing status 1 Check the Sun StorEdge T3 array firmware level and verify it is 2 01 00 or higher 2 Check if volume slicing is supported and enabled on the Sun StorEdge T3 array enablevolslicing Cannot enable volume slicing The Sun StorEdge T3 array firmware does not support this featur...

Page 194: ...tation rmt3slice An error failed to remove slice slicename Refer to the Sun StorEdge T3 and T3 array documentation rmt3slice An error failed to remove slice from volume volume 1 Check the volume status using the checkt3mount or showt3 command 2 If unmounted use the restoret3config command to mount savet3config While checking the configuration the Sun StorEdge T3 array configuration was not saved 1...

Page 195: ...ould not determine the Sun StorEdge system type. Multiple components might be down and the getcabinet command could not determine the Sun StorEdge series type (3910, 3960, 6910, or 6960). To use the command-line interface (CLI), set the BOXTYPE environment variable to one of the seven values. For example: BOXTYPE=3910; export BOXTYPE. listavailable: The component is unavailable. It is either not found or the conf...
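If the CLI is driven from a script rather than an interactive shell, the same BOXTYPE override can be applied to the child process environment. This is a minimal sketch, assuming a Python wrapper on the host running the commands; `env_with_boxtype` and `NAMED_SERIES_TYPES` are hypothetical names. The text mentions seven valid values but this excerpt names only four, so the allowed set is a parameter rather than a fixed list.

```python
import os
from typing import Dict, Iterable, Mapping, Optional

# Only the series types named in this excerpt; the text says seven values
# exist, so callers can pass a wider `allowed` set if they know the rest.
NAMED_SERIES_TYPES = ("3910", "3960", "6910", "6960")


def env_with_boxtype(
    boxtype,
    base_env: Optional[Mapping[str, str]] = None,
    allowed: Iterable[str] = NAMED_SERIES_TYPES,
) -> Dict[str, str]:
    """Return a copy of the environment with BOXTYPE set (hypothetical helper)."""
    value = str(boxtype)
    if value not in set(allowed):
        raise ValueError(f"unrecognized BOXTYPE {value!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["BOXTYPE"] = value
    return env


if __name__ == "__main__":
    env = env_with_boxtype("3910")
    print("BOXTYPE =", env["BOXTYPE"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when invoking a configuration command, which has the same effect as exporting BOXTYPE in the shell before running it.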

Page 197: ...d line interface; CRC: cyclic redundancy code; DAS: direct attached storage; EOF: end of file; FC: Fibre Channel; FC-ELS: Fibre Channel Extended Link Service; FRU: field-replaceable unit; GBIC: gigabit interface converter; GUI: graphical user interface; HBA: host bus adapter; ISL: inter-switch link; LED: light-emitting diode; LUN: logical unit number; MAC: media access control; NSCC: Network Storage Command Center; PCU: power ...

Page 198: ...everse address resolution protocol; RFE: request for enhancement; RSS: Remote Storage Services; SAN: storage area network; SCSI: small computer system interface; SLIC: Serial Loop IntraConnect; SNMP: simple network management protocol; SPOF: single point of failure; SRN: Service Request Number; SRS: Sun Remote Services; SSP: Storage Service Processor; SVE: storage virtualization engine; TCP/IP: transport control protocol...

Page 199: ...ssor side notification 61 troubleshooting 60 verifying data host 62 verifying Sun StorEdge 3900 series 62 verifying Sun StorEdge 6900 series 62 C c2 path returning to production 19 unconfiguring 17 cfgadm verifying functionality 4 checkdefaultconfig verifying functionality 4 command line test example qlctest 1M 27 switchtest 1M 28 communication loss event 3 configuration settings 23 verifying 7 cr...

Page 200: ...ments before running 30 F failback virtualization engine 120 failback operations 16 failover operations 16 fault isolation examples 147 Fibre Channel link A1 or B1 data host verification 45 A2 to B2 host side verification 51 A3 or B3 host side verification 56 check error status manually 113 FRU tests for A2 or B2 link 52 FRU tests for A3 or B3 link 57 troubleshooting 37 troubleshooting A4 B4 link ...

Page 201: ... Microsoft Windows 2000 troubleshooting 137 viewing system errors 26 Microsoft Windows NT configurations 7 monitoring functions for Sun StorEdge 3900 and 6900 Series 2 multipath configurator array properties 141 healthy configuration 140 with LUN failover 141 N notification Storage Service Processor 44 used in PFA 2 notification events A1 or B1 43 A2 or B2 49 A3 or B3 54 A4 or B4 60 T1 or T2 89 P ...

Page 202: ...Edge 6900 Series I O routed through both HBAs 15 logical view 11 multipathing options 16 primary data paths to alternate master 12 primary data paths to Sun StorEdge T3 array 13 Sun StorEdge Network FC Switch 8 and Switch 16 switch diagnosis of 28 Sun StorEdge network FC switch 8 and switch 16 switch checking status 5 Sun StorEdge T3 array event grid 95 Explorer Data Collection Utility 29 LUN fail...

Page 203: ...tconfig 4 configuration settings 7 data host 45 failover luxadm display 63 host side 51 luxadm output 4 operation of user selected components 57 storage service processor 92 storage service processor side 57 Veritas DMP installations 5 used in troubleshooting 20 Veritas DMP error message for A3 or B3 link 57 viewing virtualization engine map 118 virtualization engine backpanel 112 checking status ...

Page 204: ...Z zone modifications 74 ...
