Replication Server runs in distributed database systems with many other hardware and software components, including Adaptive Servers and other data servers, Replication Agents, LANS, WANs, and client application programs.
Any of these components, including Replication Server, may occasionally fail. Replication Server is a fault-tolerant system, designed with this possibility in mind. During most failures, it waits for the failure to be corrected and then continues its work. When failures require restarting Replication Server or a Replication Agent, the start-up process guarantees that replication is resumed without loss or duplication of data.
Protecting against data loss is the same with a replication system as with a centralized database system. In both cases, there is one definitive version of the data, and you should invest considerable planning and resources to protect it.
In a replication system, if the primary data is protected, all replicate data can ultimately be recovered. A site can replace lost replicate data simply by re-creating its subscriptions.
If replicated transactions are lost at the primary site (as happens, for example, when the primary database is rolled back to a previous dump), consistency may be lost between primary and replicated data.
The backup and recovery methods available in a replication system include preventive and recovery measures. This chapter describes both of them.
Preventive measures include:
Warm standby
Hardware data mirroring (hot standby)
Longer save intervals
Coordinated dumps
Recovery measures include:
Subscription initialization
Subscription reconciliation utility (rs_subcmp)
Database recovery
Restoring coordinated dumps