Sun StorEdge Availability Suite 3.2 Software Troubleshooting Guide • December 2003

Disable pending on diskq %s, try again later

Kernel

A request to disable the disk queue is already in
progress. Verify that the previous request has
completed successfully. If it has, this request is no
longer valid. If it has not, wait for it to finish and,
if it failed, then attempt again to disable the disk
queue.
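
The "try again later" advice can be sketched as a small retry loop. This is a minimal illustration only: `run_disable` is a hypothetical caller-supplied function that issues the disable request and returns the message text it produced; no real command-line syntax is assumed here, and the attempt count and delay are arbitrary.

```python
import time
from typing import Callable

# Leading text of the kernel message described above.
PENDING_MARKER = "Disable pending on diskq"


def disable_when_not_pending(run_disable: Callable[[], str],
                             max_attempts: int = 5,
                             delay_seconds: float = 10.0) -> str:
    """Repeat the request while the output still reports a pending disable.

    run_disable is a hypothetical callable (an assumption of this sketch).
    Returns the first output that does not contain the pending message, so
    the caller can inspect it for other errors. Raises TimeoutError if the
    earlier request is still pending after max_attempts tries.
    """
    for attempt in range(1, max_attempts + 1):
        output = run_disable()
        if PENDING_MARKER not in output:
            return output
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise TimeoutError("previous disable request is still pending")
```

The returned text still has to be checked: if the earlier request succeeded, the repeated request is no longer valid and may report a different error.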

disk service, <ctag>, is active on node "<hostname>" Please re-issue the command on that node

RM

The remote mirror set being operated on is not active
on the current node in the cluster.

disk service, %s, is active on node "%s"; Please re-issue the command on that node

PITC

The iiadm command must be issued on the other node of the
cluster. The disk group that the user is attempting to
operate on is not active on the node where the iiadm
command was issued.
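
Both "disk service ... is active on node" messages name the node where the command should be re-issued. As a rough sketch, a script could extract that node name from the message text. The pattern below is derived only from the two message templates listed above; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the remote mirror and point-in-time copy variants listed above.
# \s+ tolerates the message being wrapped across lines.
ACTIVE_ELSEWHERE = re.compile(
    r'disk service,\s+(?P<ctag>.+?),\s+is active on\s+node\s+"(?P<node>[^"]+)"'
)


def node_to_use(message: str) -> Optional[Tuple[str, str]]:
    """Return (disk service tag, node name) if the message matches, else None."""
    match = ACTIVE_ELSEWHERE.search(message)
    if match is None:
        return None
    return match.group("ctag"), match.group("node")
```

A caller would then log in to the returned node and repeat the same command there.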

diskq name is longer than <MAX> characters

RM

The device name specified for the disk queue volume is
too long for the remote mirror software to accept.

disk queue <diskq2> does not match <diskq1> skipping set

RM

The user tried to enable a set into a group that has a
disk queue, but the user specified a disk queue that
does not match the group’s disk queue.

diskqueue <diskq> is incompatible

RM

The user tried to enable a set into a group that has a
disk queue, but the user specified a disk queue that
does not match the group’s disk queue.

Disk queue %s is already in use

Kernel

The volume for the disk queue being added to the set
or group is already in use as a data volume, bitmap
volume, or disk queue. Use a different volume for the
disk queue.

Disk queue %s operation not possible, set is in replicating mode

Kernel

The user attempted to perform disk queue
maintenance on a set while the set is replicating.

Disk queue does not exist for set %s:%s ==> %s:%s

Kernel

The user attempted to perform disk queue
maintenance on a set that does not have a disk queue.
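
The two maintenance errors above amount to two preconditions on the set. The following sketch models them as a simple check; the function, its parameters, and the order of the checks are assumptions made for illustration and do not describe how the kernel actually evaluates them.

```python
from typing import Optional


def maintenance_blocker(has_disk_queue: bool, replicating: bool) -> Optional[str]:
    """Return the error that would block disk queue maintenance, or None.

    Both arguments describe the set's current state and are assumed to be
    known to the caller; how they are obtained is outside this sketch.
    """
    if not has_disk_queue:
        return "Disk queue does not exist for set"
    if replicating:
        return "Disk queue operation not possible, set is in replicating mode"
    return None
```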

disk queue <diskq> is incompatible with existing queue

RM

The user tried to enable a set into a group that has a
disk queue, but the user specified a disk queue that
does not match the group’s disk queue.

disk queue <diskq> is not in disk group "<ctag>"

RM

The user tried to enable a disk queue that does not
reside in the same cluster resource group in which the
volume and bitmap reside.

Disk queue operations on synchronous sets not allowed

Kernel

An attempt to enable a sync set with a disk queue, or
to add a disk queue to a sync set, has been made. Sync
sets cannot have disk queues attached to them.
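
Several of the entries above describe rules that a disk queue request must satisfy. The sketch below gathers those rules into one pre-flight check so they can be read together. Everything in it is illustrative: the `DiskQueueRequest` fields are a hypothetical model, the returned strings paraphrase the table entries, and the limit behind `<MAX>` is not stated in this table, so it is passed in as `max_name_len`. Treating the group's own queue as exempt from the "already in use" rule is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DiskQueueRequest:
    diskq: str                   # disk queue volume named in the request
    mode: str                    # "sync" or "async"
    group_diskq: Optional[str]   # queue the group already has, if any
    set_ctag: Optional[str]      # resource group of the data and bitmap volumes
    diskq_ctag: Optional[str]    # resource group of the disk queue volume
    volumes_in_use: Set[str] = field(default_factory=set)


def preflight(req: DiskQueueRequest, max_name_len: int) -> List[str]:
    """Return paraphrased table errors that the request would likely trigger."""
    problems = []
    if len(req.diskq) > max_name_len:
        problems.append("diskq name is longer than MAX characters")
    if req.mode == "sync":
        problems.append("disk queue operations on synchronous sets not allowed")
    if req.group_diskq is not None and req.diskq != req.group_diskq:
        problems.append("disk queue does not match the group's disk queue")
    # Assumption: reusing the group's existing queue is not a conflict.
    if req.diskq in req.volumes_in_use and req.diskq != req.group_diskq:
        problems.append("disk queue is already in use")
    if req.set_ctag != req.diskq_ctag:
        problems.append("disk queue is not in the set's disk group")
    return problems
```

An empty list means none of the modeled rules is violated; it does not guarantee that the real command will succeed.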

TABLE 3-1 Error Messages for the Sun StorEdge Availability Suite 3.2 Software (Continued)

Error Message    From    Meaning

Sun StorEdge Availability Suite 3.2 Software Troubleshooting Guide, Part No. 817-3752-10, December 2003, Revision 51. Submit comments about this document at http://www.sun.com/hwdocs/feedback
