Wednesday, April 20, 2016

Steps to re-instantiate a failed GoldenGate configuration

A network outage between the systems can sometimes break the synchronization between the databases. GoldenGate replication is quite good at recovering from short outages. However, if the outage is long and the volume of change generated in the source database is high, there can be instances where the GoldenGate replication cannot recover on its own, and the administrator will need to re-instantiate the target environment. In this recipe, we will look at the steps that you can follow to re-sync a GoldenGate configuration after a network outage or an issue with the target database/system.

Getting ready

For this recipe, we will refer to the setup done in the Setting up a simple GoldenGate replication configuration between two single node databases recipe in Chapter 2, Setting up GoldenGate Replication. The scenario demonstrated in this recipe is a replication broken by a long network outage. The trail file area on the source system filled up, causing the Extract to abend. Because of the long outage, some of the archive logs are no longer present on disk on the source server, so the replication cannot be resumed with a simple restart of the Extract/Datapump processes.

How to do it...

The steps to re-instantiate a broken GoldenGate replication configuration are as follows:
  1. Stop the Replicat process in the target environment if it is running by using the following command:
    GGSCI> STOP REPLICAT RGGTEST1
    
  2. Truncate the tables in the target database that you are replicating using the following commands:
    SQL> TRUNCATE TABLE SCOTT.EMP;
    SQL> TRUNCATE TABLE SCOTT.DEPT;
    SQL> TRUNCATE TABLE SCOTT.SALGRADE;
    SQL> TRUNCATE TABLE SCOTT.BONUS;
    
  3. Note down the current timestamp from the source database:
    SQL> SELECT TO_CHAR(SYSDATE,'YYYY-MM-DD HH24:MI:SS') from dual;
    TO_CHAR(SYSDATE,'YY
    -------------------
    2013-02-26 15:50:02
    
  4. Note down the current_scn from the source database as follows:
    SQL> SELECT CURRENT_SCN FROM V$DATABASE;
    CURRENT_SCN
    -----------
      11547294
    
  5. Create a /u01/app/ggate/expdp directory on both servers to hold the Datapump Export, as follows:
    mkdir /u01/app/ggate/expdp
    
  6. Create a directory object in the source database as follows:
    SQL> CREATE DIRECTORY EXPIMP as '/u01/app/ggate/expdp';
    Directory created.
    SQL> GRANT READ, WRITE, EXECUTE on Directory EXPIMP to system;
    Grant succeeded.
    
  7. Run a Datapump Export to back up the tables that we are replicating:
    [oracle@prim1-ol6-112 ggate]$ expdp userid=system schemas=SCOTT FLASHBACK_SCN=11547294 DIRECTORY=EXPIMP DUMPFILE=EXPDP.dmp LOGFILE=EXPDP.log
    Export: Release 11.2.0.3.0 - Production on Tue Feb 26 16:03:39 2013
    Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
    Password: 
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    FLASHBACK automatically enabled to preserve database integrity.
    Starting "SYSTEM"."SYS_EXPORT_SCHEMA_01":  userid=system/******** schemas=SCOTT FLASHBACK_SCN=11547294 DIRECTORY=EXPIMP DUMPFILE=EXPDP.dmp LOGFILE=EXPDP.log 
    Estimate in progress using BLOCKS method...
    Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
    Total estimation using BLOCKS method: 192 KB
    Processing object type SCHEMA_EXPORT/USER
    Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
    Processing object type SCHEMA_EXPORT/ROLE_GRANT
    Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
    Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
    Processing object type SCHEMA_EXPORT/TABLE/TABLE
    Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
    Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
    Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
      exported "SCOTT"."DEPT"      5.929 KB       4 rows
      exported "SCOTT"."EMP"       8.648 KB      16 rows
      exported "SCOTT"."SALGRADE"  5.859 KB       5 rows
      exported "SCOTT"."BONUS"         0 KB       0 rows
    Master table "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
    *********************************************************
    Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_01 is:
      /u01/app/ggate/expdp/EXPDP.dmp
    Job "SYSTEM"."SYS_EXPORT_SCHEMA_01" successfully completed at 16:05:08
    
  8. Create a directory object in the target database:
    SQL> CREATE DIRECTORY EXPIMP AS '/u01/app/ggate/expdp';
    Directory created.
    SQL> GRANT READ,WRITE,EXECUTE ON DIRECTORY EXPIMP TO SYSTEM;
    Grant succeeded.
    
  9. Copy the dump file from the source to the target server as follows:
    scp /u01/app/ggate/expdp/EXPDP.dmp stdby1-ol6-112:/u01/app/ggate/expdp/
    
  10. Import the dump file into the target database as follows:
    [oracle@stdby1-ol6-112 expdp]$ impdp userid=system directory=expimp dumpfile=EXPDP.dmp logfile=IMPDP.log full=y table_exists_action=REPLACE
    Import: Release 11.2.0.3.0 - Production on Tue Feb 26 16:20:51 2013
    Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
    Password: 
    Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    With the Partitioning, OLAP, Data Mining and Real Application Testing options
    Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
    Starting "SYSTEM"."SYS_IMPORT_FULL_01":  userid=system/******** directory=expimp dumpfile=EXPDP.dmp logfile=IMPDP.log full=y table_exists_action=REPLACE 
    Processing object type SCHEMA_EXPORT/USER
    ORA-31684: Object type USER:"SCOTT" already exists
    Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
    Processing object type SCHEMA_EXPORT/ROLE_GRANT
    Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
    Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
    Processing object type SCHEMA_EXPORT/TABLE/TABLE
    Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
      imported "SCOTT"."DEPT"      5.929 KB       4 rows
      imported "SCOTT"."EMP"       8.648 KB      16 rows
      imported "SCOTT"."SALGRADE"  5.859 KB       5 rows
      imported "SCOTT"."BONUS"         0 KB       0 rows
    Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
    Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
    Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
    Job "SYSTEM"."SYS_IMPORT_FULL_01" completed with 1 error(s) at 16:21:05
    
  11. Alter the Extract and Datapump processes on the source server to begin from the timestamp noted in step 3, as follows:
    GGSCI> ALTER EXTRACT EGGTEST1, BEGIN 2013-02-26 15:50:02
    GGSCI> ALTER EXTRACT PGGTEST1, BEGIN 2013-02-26 15:50:02
    
  12. Remove all the trail files from the dirdat directory on the source server except the last generated trail file.
  13. Start Extract and Datapump on server A as follows:
    GGSCI> START EXTRACT EGGTEST1
    GGSCI> START EXTRACT PGGTEST1
    
  14. Start the Replicat process on server B from the SCN noted in step 4 as follows:
    GGSCI> START REPLICAT RGGTEST1, AFTERCSN 11547294
    
  15. Check that the data is getting replicated by looking at the stats of all three processes, as follows:
    GGSCI (prim1-ol6-112.localdomain) 3> STATS EGGTEST1, TOTAL, TABLE SCOTT.*
    Sending STATS request to EXTRACT EGGTEST1 ...
    Start of Statistics at 2013-02-26 22:34:00.
    Output to /u01/app/ggate/dirdat/st:
    Extracting from SCOTT.EMP to SCOTT.EMP:
    *** Total statistics since 2013-02-26 22:01:04 ***
      Total inserts                              0.00
      Total updates                             16.00
      Total deletes                              0.00
      Total discards                             0.00
      Total operations                          16.00
    End of Statistics.
    GGSCI (prim1-ol6-112.localdomain) 4> STATS PGGTEST1, TOTAL, TABLE SCOTT.*
    Sending STATS request to EXTRACT PGGTEST1 ...
    Start of Statistics at 2013-02-26 22:34:18.
    Output to /u01/app/ggate/dirdat/rt:
    Extracting from SCOTT.EMP to SCOTT.EMP:
    *** Total statistics since 2013-02-26 22:01:05 ***
      Total inserts                              0.00
      Total updates                             16.00
      Total deletes                              0.00
      Total discards                             0.00
      Total operations                          16.00
    End of Statistics.
    GGSCI (stdby1-ol6-112.localdomain) 8> STATS RGGTEST1, TOTAL, TABLE SCOTT.*
    Sending STATS request to REPLICAT RGGTEST1 ...
    Start of Statistics at 2013-02-26 22:36:54.
    Replicating from SCOTT.EMP to SCOTT.EMP:
    *** Total statistics since 2013-02-26 22:09:22 ***
      Total inserts                              0.00
      Total updates                             16.00
      Total deletes                              0.00
      Total discards                             0.00
      Total operations                          16.00
    End of Statistics.
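The repositioning in steps 11 to 14 can be scripted. Below is a minimal sketch, assuming the process names (EGGTEST1, PGGTEST1, RGGTEST1) and the timestamp and SCN values from this recipe's example run; on a real system you would source these values from steps 3 and 4 rather than hardcode them.

```shell
#!/bin/sh
# Sketch only (not part of the original recipe): build GGSCI obey files that
# capture the repositioning commands from steps 11-14.
BEGIN_TS='2013-02-26 15:50:02'   # timestamp noted in step 3
START_SCN=11547294               # SCN noted in step 4

# Obey file for the source server: re-point and restart Extract and Datapump
cat > source_resync.oby <<EOF
ALTER EXTRACT EGGTEST1, BEGIN ${BEGIN_TS}
ALTER EXTRACT PGGTEST1, BEGIN ${BEGIN_TS}
START EXTRACT EGGTEST1
START EXTRACT PGGTEST1
EOF

# Obey file for the target server: start Replicat after the noted SCN
cat > target_resync.oby <<EOF
START REPLICAT RGGTEST1, AFTERCSN ${START_SCN}
EOF
```

Each file would then be run from GGSCI with OBEY source_resync.oby on the source and OBEY target_resync.oby on the target, after the import has completed.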
    

How it works...

All GoldenGate processes maintain their checkpoints at the filesystem level (or in a table in the case of Replicat). Using this checkpoint information, a GoldenGate process can resume its job from the point at which it failed. Sometimes, due to a network or other issue, the outage can be so long that resuming the replication is no longer possible. In the case of Extract, as long as all the archive logs that have not yet been mined are still available on disk, GoldenGate should be able to resume the Extract process. If the source system is a very busy OLTP system, keeping the archive logs on disk for the duration of the outage might not be an option, and in that case it may not be feasible to resume the replication. In such scenarios, you will need to re-instantiate the tables that you are replicating in the target database. In this recipe, we have covered the steps that you can follow to re-instantiate the target database objects without causing an outage to the source database.
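The decision point described above, whether Extract can still be resumed, comes down to whether the unmined archive logs are still on disk. A minimal sketch of such a check follows; the log_<sequence>.arc naming pattern is an assumption for illustration, and the first needed sequence would come from the Extract's recovery checkpoint (shown by INFO EXTRACT <name>, SHOWCH in GGSCI).

```shell
#!/bin/sh
# Sketch only: report archive log sequences that Extract would still need but
# that are no longer on disk. Adjust the file name pattern to match your
# database's log_archive_format.
missing_archlogs() {
  # $1 = archive log directory, $2 = first needed sequence, $3 = last sequence
  dir=$1; first=$2; last=$3
  for s in $(seq "$first" "$last"); do
    [ -f "$dir/log_${s}.arc" ] || echo "$s"
  done
}
```

If this prints any sequence numbers, the missing logs cannot be mined, a simple restart of the Extract will not work, and the re-instantiation steps above are needed.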
In this procedure, we first stop the Replicat process on the target system. After this, we clear all the data in the replicated objects by truncating the tables. At this point, we note the timestamp (step 3) and the SCN (step 4) of the source database. In step 7, we run an export of the replicated tables; in this example, we are replicating the objects owned by the SCOTT user. The dump file is then transferred to the target system and imported into the target database. At this point, the objects in the target database have been re-instantiated to their state as of the SCN noted in step 4. We then re-point the Extract and Datapump processes to start from the timestamp noted in step 3. In this scenario, our trail file area had filled up, so we delete all the trail files except the latest one, which the Extract process requires to resume. After this, we start the Extract and Datapump processes. Then we start the Replicat process on the target system and instruct it to start applying changes from the SCN noted in step 4. The replication should now be back in sync, which we verify by checking the stats of all three processes.
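The consistency of this procedure hinges on using the same SCN for the export and for starting the Replicat. One way to make that harder to get wrong is to assemble the Datapump commands from steps 7 and 10 out of a single variable, as in this sketch (schema, directory, and file names follow this recipe; the echo-only style is for illustration):

```shell
#!/bin/sh
# Sketch only: build the expdp/impdp command lines from the SCN noted in
# step 4 so that FLASHBACK_SCN and the later AFTERCSN value always agree.
START_SCN=11547294   # SCN noted in step 4

EXPDP_CMD="expdp userid=system schemas=SCOTT FLASHBACK_SCN=${START_SCN} DIRECTORY=EXPIMP DUMPFILE=EXPDP.dmp LOGFILE=EXPDP.log"
IMPDP_CMD="impdp userid=system directory=expimp dumpfile=EXPDP.dmp logfile=IMPDP.log full=y table_exists_action=REPLACE"

# Echo only here; on a real system you would run EXPDP_CMD on the source
# server and IMPDP_CMD on the target server after copying the dump file.
echo "$EXPDP_CMD"
echo "$IMPDP_CMD"
```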
