• Failure of a TMF audited volume on the primary system
• TMF subsystem failure after which the TMF volume recovery is successful
• TMF file recovery operation on the primary system that is not to a timestamp, first purge,
  or TOMATPOSITION position
• TMF ABORT TRANSACTION with the AVOIDHANGING option on the primary system
RDF cannot recover from the following events:
• TMF file recovery operation to a timestamp, first purge, or TOMATPOSITION on the primary
  system
• TMF subsystem failure after which TMF cannot perform a successful volume recovery
  operation
After a TMF file recovery to a timestamp, first purge, or TOMATPOSITION, or after a TMF
subsystem failure for which volume recovery cannot succeed, the databases or the affected files
on the primary and backup systems must be resynchronized.
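The two lists above amount to a lookup from event type to whether resynchronization is required. The following is a minimal sketch of that lookup in Python. The enum `TmfEvent`, its member names, and the function `needs_resync` are hypothetical names chosen for illustration; they are not part of RDF or TMF.

```python
from enum import Enum, auto


class TmfEvent(Enum):
    """Hypothetical labels for the TMF events listed in the text."""
    AUDITED_VOLUME_FAILURE = auto()
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_OK = auto()
    SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED = auto()
    FILE_RECOVERY_OTHER = auto()            # not to a timestamp, first purge, or TOMATPOSITION
    FILE_RECOVERY_TO_TIMESTAMP = auto()
    FILE_RECOVERY_TO_FIRST_PURGE = auto()
    FILE_RECOVERY_TOMATPOSITION = auto()
    ABORT_TRANSACTION_AVOIDHANGING = auto()


# Events the text says RDF cannot recover from; the primary and backup
# databases (or the affected files) must be resynchronized afterwards.
RESYNC_REQUIRED = frozenset({
    TmfEvent.FILE_RECOVERY_TO_TIMESTAMP,
    TmfEvent.FILE_RECOVERY_TO_FIRST_PURGE,
    TmfEvent.FILE_RECOVERY_TOMATPOSITION,
    TmfEvent.SUBSYSTEM_FAILURE_VOLUME_RECOVERY_FAILED,
})


def needs_resync(event: TmfEvent) -> bool:
    """Return True if the event requires resynchronizing primary and backup."""
    return event in RESYNC_REQUIRED
```

An operations runbook could use such a table to decide whether a TMF incident on the primary system calls for a resynchronization step.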
Communication Line Failures
RDF can recover from communication line failures. When the extractor detects that a
communication line to the backup system is down, it reports the error to the EMS event log. The
extractor attempts to resend data every minute until the line to the backup system is reenabled.
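The retry behavior described above can be pictured as a fixed-interval retry loop. The following Python sketch only illustrates that pattern and is not the extractor's actual implementation. The function `send_until_line_up` and its parameters are hypothetical, the line failure is modeled as a `ConnectionError`, and reporting the error only once is an assumption.

```python
import time
from typing import Callable

RETRY_INTERVAL_SECONDS = 60  # "every minute", per the text


def send_until_line_up(
    send: Callable[[], None],
    log_event: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = RETRY_INTERVAL_SECONDS,
) -> int:
    """Call send() until it succeeds; return the number of attempts made.

    send      -- hypothetical callable that raises ConnectionError while the line is down
    log_event -- hypothetical stand-in for writing a message to the event log
    sleep     -- injectable so the loop can be exercised without real delays
    """
    attempts = 0
    reported = False
    while True:
        attempts += 1
        try:
            send()
            return attempts
        except ConnectionError as exc:
            if not reported:
                # Assumption: the error is reported once, when first detected.
                log_event(f"line to backup system is down: {exc}")
                reported = True
            sleep(interval)
```

Because the loop retries without limit, unsent data accumulates on the primary side for as long as the line stays down.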
Unless you are running the RDF/ZLT product, the failure of the communications line will lead
to the loss of committed transactions if you also lose your primary system and must perform
an RDF Takeover operation before the extractor has caught up. This risk is eliminated with
the RDF/ZLT product and a proper configuration for CommitHold. For further details, see
“Zero Lost Transactions (ZLT)” (page 337).
If you stop RDF on the primary system when the communication line to the backup system is
down, the monitor tries to send a stop message to the processes on the backup system and reports
that the line is down. All of the processes on the backup system continue to run until a STOP
RDF command is issued at the backup system.
NOTE:
If you issue a STOP RDF command on the primary or backup system while the network
is down, you must also issue a STOP RDF command on the other system while the network is
still down.
If you have an RDF network running and the Network Master's RDFNET process encounters a
communications line failure when attempting to perform a network transaction on another
primary node in the RDF network, the failure can increase the work to be performed during
an RDF Takeover operation. Once the communications line comes back up and the RDFNET
process can resume its network transactions, the need for increased takeover work is eliminated.
System Failures
If you lose your primary system and you can recover it without having to perform an RDF
Takeover operation, then no special recovery is required for RDF. After you have restarted your
primary system, restart RDF before you restart your applications.
If you lose your primary system and you need to restart your applications as quickly as possible,
then perform the RDF Takeover operation on your backup system. Details of the various tasks
you need to do after the RDF Takeover are provided further below. If you can eventually
recover your primary system, a discussion further below also explains how to recover the
database on that system and bring it into synchronization with the database on your backup
system, where your applications are now running.
If you lose your backup system, you only need to recover it and then restart RDF on your primary
system as quickly as possible. If the communications line to your backup system has sufficient
bandwidth, then RDF can catch up very quickly.
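The three cases above can be summarized as a small decision procedure. The following Python sketch is illustrative only. The function `recovery_steps` and its arguments are hypothetical, and the returned strings are paraphrases of the text, not RDFCOM commands.

```python
def recovery_steps(lost_system: str, takeover_needed: bool = False) -> list[str]:
    """Return the ordered recovery actions described in the text.

    lost_system     -- "primary" or "backup" (hypothetical labels)
    takeover_needed -- True if applications must be restarted as quickly as
                       possible after losing the primary system
    """
    if lost_system == "backup":
        return ["recover backup system", "restart RDF on primary system"]
    if lost_system == "primary":
        if takeover_needed:
            return ["perform RDF Takeover on backup system"]
        # Primary recovered without a takeover: RDF goes first, applications last.
        return ["restart primary system", "restart RDF", "restart applications"]
    raise ValueError(f"unknown system: {lost_system!r}")
```

The ordering in the no-takeover case reflects the text: RDF is restarted before the applications.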