
GALAXY® AURORA LS CONFIGURATION AND SYSTEM INTEGRATION GUIDE

79

Section 4 Troubleshooting Guide

if there’s rust on the outside, electronic components on the inside could also be rusting – and 
those can’t be cleaned with the Royal Jelly. 

  

 

4.8 Motherboard problems

 

Connectors: Like the plugs that mate with them, many connectors can be damaged, especially the SATA connectors on the motherboard. The connectors to consider when looking for damage are: LED/switch/chassis connections, IPMI socket, RAM sockets, CPU sockets, PCI/PCIe slots, power connections, fan connections, SATA connections, and I2C connections (to the power supply or to the LEDs).

i801: The motherboards we've tested use Intel i801 chips for the sensors. While this is a fairly reliable chip, the symptom you might see if it fails is that all of the sensors go dead simultaneously (assuming there is no software problem), and/or the chip can't be found by the computer.
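The symptom described above (every sensor dead at once, as opposed to only some) can be expressed as a simple triage rule. The following is a minimal sketch in Python, not part of the Aurora LS software: the function name `classify_sensor_failure` and the input format (a mapping of sensor name to reading, with `None` for a sensor that returned nothing) are assumptions made for illustration.

```python
from typing import Mapping, Optional


def classify_sensor_failure(readings: Mapping[str, Optional[float]]) -> str:
    """Triage sensor readings; None marks a sensor that returned no value.

    Hypothetical helper: the sensor names and the input format are
    assumptions, not something the Aurora LS software provides.
    """
    if not readings:
        return "no sensors reported"
    dead = [name for name, value in readings.items() if value is None]
    if not dead:
        return "all sensors alive"
    if len(dead) == len(readings):
        # Every sensor went dead together: rule out software first,
        # then suspect the sensor chip itself.
        return "all sensors dead: suspect sensor chip (after ruling out software)"
    return "some sensors dead: " + ", ".join(sorted(dead))
```

A result of "some sensors dead" points away from the sensor chip and toward the individual sensor, fan, or connection.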

Northbridge: The Northbridge controls the higher-speed functions of the motherboard, such as the on-board VGA (ATI ES1000 or Matrox G200) and RAM. If the on-board VGA dies, the unit can still be operated remotely; however, the only fix is to replace the motherboard. Note that on some motherboards, the Northbridge also controls the PCIe slots.

RAM: RAM can fail. If the amount of memory suddenly decreases, it could indicate a problem with one or more of the memory modules. If a module is failing intermittently, try swapping the modules around and see if the problem goes away. If a module has failed completely, the best way to troubleshoot it is to swap the modules one at a time.
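One way to notice a sudden decrease in memory is to compare the total the operating system reports against a value recorded while the system was healthy. The sketch below assumes a Linux system that exposes the standard `/proc/meminfo` format; the function names and the 5% tolerance are illustrative assumptions, not values from this guide.

```python
import re


def mem_total_kib(meminfo_text: str) -> int:
    """Return the MemTotal value (in KiB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1))


def memory_dropped(baseline_kib: int, current_kib: int,
                   tolerance: float = 0.05) -> bool:
    """True if current memory is more than `tolerance` below the baseline.

    baseline_kib should be a MemTotal value recorded when the system was
    known to be healthy; the 5% default tolerance is an assumption.
    """
    return current_kib < baseline_kib * (1.0 - tolerance)
```

To use it, read the contents of `/proc/meminfo` into a string, pass it to `mem_total_kib`, and compare the result with the recorded baseline. A drop of roughly one module's capacity suggests which size of module to look for when swapping.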

Southbridge: This chip controls the slower-speed functions of the motherboard, such as PCI/32, PCI/x, serial/parallel ports, power management, Ethernet, USB ports, and interfaces with the real-time clock. Typically, if the Southbridge dies, the entire motherboard doesn't function.

CPU: On a motherboard with multiple CPUs, if one CPU fails, the system will typically lock up until it is rebooted, at which point only one CPU might come up. See also Chassis/CPU/Chipset Fans, below.
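After such a reboot, you can confirm whether a CPU failed to come up by counting the physical CPU packages the operating system sees. This is a minimal sketch assuming an x86 Linux system, where `/proc/cpuinfo` lists a `physical id` line for each logical processor; the function names and the expected count are assumptions for illustration.

```python
def count_physical_cpus(cpuinfo_text: str) -> int:
    """Count distinct 'physical id' values in /proc/cpuinfo-style text."""
    ids = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "physical id":
            ids.add(value.strip())
    return len(ids)


def missing_cpus(expected: int, cpuinfo_text: str) -> int:
    """Return how many of the expected CPU packages are not visible."""
    return max(0, expected - count_physical_cpus(cpuinfo_text))
```

On a dual-CPU motherboard, `missing_cpus(2, text)` returning 1 matches the symptom above: the system rebooted but only one CPU came up.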

Chassis/CPU/Chipset Fans: It is important to keep an eye on the chassis fans, as they not only cool the drives but also help cool the motherboard, CPU, and RAM. Depending on the motherboard, there may also be a fan on the Northbridge or Southbridge chip, as well as a fan directly on the CPU. If a chassis fan fails, you should see it in the NumaRAID GUI; however, if a chipset or CPU fan fails, a typical symptom is spontaneous rebooting of the array (not related to software).
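If fan speeds are available as numbers (for example, copied from a monitoring screen), stalled or slow fans can be flagged with a simple threshold check. The sketch below is illustrative only: the function name, the fan names, and the 500 RPM threshold are assumptions, and how the readings are obtained is outside its scope.

```python
from typing import Dict, List


def failed_fans(rpm_by_fan: Dict[str, float], min_rpm: float = 500.0) -> List[str]:
    """Return the names of fans spinning below min_rpm, sorted by name.

    The 500 RPM default is an arbitrary assumption; choose a threshold
    that suits the fans actually installed.
    """
    return sorted(name for name, rpm in rpm_by_fan.items() if rpm < min_rpm)
```

For example, `failed_fans({"FAN1": 4200.0, "FAN2": 0.0})` reports `FAN2`. A chipset or CPU fan that is not monitored will not appear in such a list, which is why spontaneous reboots remain the typical symptom for those fans.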

IPMI Card/On-Board: Typically, the IPMI card either works or it doesn't. If an IPMI card fails, it will show a host of symptoms, such as not appearing in the BIOS, or its Ethernet port or virtual disk not showing up in the OS. However, if the IPMI card is known to be good and works in another system, the fault could lie with the +5V Standby supply, either as it passes through the motherboard or as it comes from the power supply; in other words, a more serious problem.

CMOS Battery: The NumaRAID GUI shows the status of the motherboard's CMOS battery. If the battery gets low (~6% of its normal voltage), you will start to see symptoms of the battery failing, such as an incorrect date and time on the hardware clock and bootup messages saying the battery is low or dead. The battery is very simple and inexpensive to replace. At the time of this writing, SuperMicro boards use CR-2032 3V batteries. Do NOT substitute other models, such as CR-2025.

Summary of Contents for Galaxy Aurora LS Series

Page 1: ...Galaxy Aurora LS Series RAID Storage System Configuration and System Integration Guide Version 2 1 February 2011...
