Loading data using the file to database utility method
In this recipe we will go through the detailed steps of instantiating a target database from a source database using the GoldenGate file to database utility initial load method. Although this method can be used to load data into a variety of databases, for example Microsoft SQL Server and IBM DB2, we will focus on loading the data into a target Oracle database using SQL*Loader. For this, we will instantiate the EMP table with 1 million rows in the target database while the source table is being updated by various transactions. At the end of the initial load we will also verify that all of these changes have been applied to the target.
Getting ready
For this recipe we will use the following setup:
Create a modified version of the EMP table, which is delivered in the SCOTT demo schema with the Oracle binaries, in both the source and target databases. You can use the following DDL statement to create the EMP table:
CREATE TABLE EMP (
EMPNO NUMBER(10) PRIMARY KEY,
ENAME VARCHAR2(25),
JOB VARCHAR2(25),
MGR NUMBER(10),
HIREDATE DATE,
SAL NUMBER(20,2),
COMM NUMBER(20,2),
DEPTNO NUMBER(10),
CITY VARCHAR2(25));
Use the following script to load 1 million rows into this table:
BEGIN
  FOR I IN 1..250000 LOOP
    INSERT INTO EMP VALUES (0+I,'TOM'||I,'WORKER',7369,TO_DATE('07-11-2012','DD-MM-YYYY'),10000+I,I,10,'LONDON');
    INSERT INTO EMP VALUES (250000+I,'BOB'||I,'DBA',7499,TO_DATE('07-11-2012','DD-MM-YYYY'),260000+I,I,10,'PARIS');
    INSERT INTO EMP VALUES (500000+I,'ALEX'||I,'DEVELOPER',7521,TO_DATE('07-11-2012','DD-MM-YYYY'),500000+I,I,10,'TOKYO');
    INSERT INTO EMP VALUES (750000+I,'SAM'||I,'SALESMAN',7934,TO_DATE('07-11-2012','DD-MM-YYYY'),750000+I,I,10,'BOSTON');
  END LOOP;
  COMMIT;
END;
/
Enable supplemental logging for the EMP table in the source database (example commands for this and the following two grants are shown at the end of this section).
Grant SELECT on the EMP table to the GGATE_ADMIN user in the source database.
Grant INSERT, UPDATE, and DELETE on the EMP table to the GGATE_ADMIN user in the target database.
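The exact commands depend on your environment, but assuming the GGATE_ADMIN credentials and connection strings used throughout this recipe, these three steps could be performed as follows. Enable supplemental logging for the EMP table from GGSCI on the source system:
./ggsci
GGSCI> DBLOGIN USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
GGSCI> ADD TRANDATA SCOTT.EMP
Then grant the privileges from a privileged account (for example SYSTEM), first on the source database:
SQL> GRANT SELECT ON SCOTT.EMP TO GGATE_ADMIN;
and then on the target database:
SQL> GRANT INSERT, UPDATE, DELETE ON SCOTT.EMP TO GGATE_ADMIN;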
How to do it...
The following are the steps for a continuous replication setup:
Follow the steps in the Setting up a simple GoldenGate replication configuration between two single node databases recipe in Chapter 2, Setting up GoldenGate Replication to set up the continuous replication for the EMP table.
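That recipe covers the change-capture Extract, Data Pump, and Replicat in detail. Purely as an illustration of what the source-side capture configuration might look like, here is a minimal sketch; the Extract name EGGTEST1 and the local trail prefix ./dirdat/st are assumptions and may differ from your Chapter 2 setup:
EXTRACT EGGTEST1
USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
EXTTRAIL ./dirdat/st
TABLE SCOTT.EMP;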
Modify the replicat RGGTEST1 configuration to add the HANDLECOLLISIONS parameter as shown in the following command:
./ggsci
STOP REPLICAT RGGTEST1
EDIT PARAMS RGGTEST1
REPLICAT RGGTEST1
USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST1.dsc,APPEND,MEGABYTES 500
ASSUMETARGETDEFS
HANDLECOLLISIONS
MAP SCOTT.EMP, TARGET SCOTT.EMP;
Ensure that you don't start the RGGTEST1 replicat at this stage.
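If you want to double-check that the Replicat is stopped before proceeding, you can query its status from GGSCI:
GGSCI> INFO REPLICAT RGGTEST1
The output should report the process status as STOPPED.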
The following are the steps for setting up an initial load:
Create a GoldenGate initial load Extract process in the source environment as shown in the following command:
./ggsci
EDIT PARAMS EXTINIT
EXTRACT EXTINIT
SOURCEISTABLE
USERID GGATE_ADMIN@DBORATEST, PASSWORD GGATE_ADMIN
FORMATASCII, SQLLOADER
RMTHOST stdby1-ol6-112, MGRPORT 8809
RMTFILE ./dirdat/extinit.dat, PURGE
TABLE SCOTT.EMP;
Add the initial load Extract process to the source manager configuration as shown in the following command:
GGSCI> ADD EXTRACT EXTINIT, SOURCEISTABLE
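Since SOURCEISTABLE Extracts are registered as one-off tasks rather than regular groups, they may not appear in a plain INFO ALL listing; one way to confirm that the task has been added is:
GGSCI> INFO EXTRACT *, TASKS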
Create a GoldenGate initial load Replicat process in the target environment as shown in the following command:
./ggsci
EDIT PARAMS RPTINIT
GENLOADFILES sqlldr.tpl
USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
EXTFILE ./dirdat/extinit.dat
ASSUMETARGETDEFS
MAP SCOTT.EMP, TARGET SCOTT.EMP;
Run the following command to generate the SQL*Loader files:
cd $GG_HOME
./replicat paramfile ./dirprm/rptinit.prm reportfile ./dirrpt/RPTINIT.rpt
The preceding command will generate two files called EMP.run and EMP.ctl in $GG_HOME.
Modify the contents of EMP.ctl and change the name of the EMP table to fully qualified SCOTT.EMP.
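As an illustration of this change (the exact contents generated from sqlldr.tpl can vary between GoldenGate versions), the relevant line in EMP.ctl would go from something like:
INTO TABLE EMP
to the fully qualified form:
INTO TABLE SCOTT.EMP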
Before starting the initial load, check the data in the source database as shown in the following command:
SQL> SELECT CITY,COUNT(*) FROM EMP GROUP BY CITY;
CITY COUNT(*)
--------------------------------
LONDON 250000
PARIS 250000
TOKYO 250000
BOSTON 250000
On the source system, start the EXTINIT Extract process as shown in the following command:
GGSCI> START EXTRACT EXTINIT
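The initial load Extract runs as a one-off task and stops by itself once it has written all the rows to the remote file. One way to see whether it has finished is to review its report from GGSCI:
GGSCI> VIEW REPORT EXTINIT
The end of the report should show the total number of records processed for SCOTT.EMP.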
Run the following DELETE and UPDATE statements in the source system:
SQL> DELETE EMP WHERE EMPNO>650000 and CITY='TOKYO';
SQL> UPDATE EMP SET CITY='SHANGHAI' where CITY='TOKYO';
SQL> COMMIT;
Once the EXTINIT Extract process started earlier has finished, start the SQL*Loader job to load the data into the EMP table as shown in the following command:
cd $GG_HOME
./EMP.run
Wait for the SQL*Loader job to finish, then check the data in the target database as shown in the following command:
SQL> SELECT CITY,COUNT(*) FROM EMP GROUP BY CITY;
CITY COUNT(*)
---------------------------------
LONDON 250000
PARIS 250000
TOKYO 250000
BOSTON 250000
Check the updated state of data in the source environment as shown in the following command:
SQL> SELECT CITY,COUNT(*) FROM EMP GROUP BY CITY;
CITY COUNT(*)
---------------------------------
LONDON 250000
PARIS 250000
SHANGHAI 150000
BOSTON 250000
Start the change delivery Replicat process in the target environment as shown in the following command:
GGSCI> START REPLICAT RGGTEST1
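Purely as a convenience, the standard GGSCI status and statistics commands can be used to watch the Replicat catch up with the captured changes before you verify the data in the next step:
GGSCI> INFO REPLICAT RGGTEST1
GGSCI> STATS REPLICAT RGGTEST1, TOTAL
The statistics should account for the deletes and updates performed on SCOTT.EMP during the initial load.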
Once the Replicat process has processed all of the trail files, check the data in the target database to ensure that all the changes have been copied across, as shown in the following command:
SQL> SELECT CITY,COUNT(*) FROM EMP GROUP BY CITY;
CITY COUNT(*)
--------------------------------
LONDON 250000
PARIS 250000
SHANGHAI 150000
BOSTON 250000
Now that the initial load is complete, we will update the RGGTEST1 Replicat process to turn off HANDLECOLLISIONS as shown in the following command:
GGSCI> SEND REPLICAT RGGTEST1, NOHANDLECOLLISIONS
GGSCI> EDIT PARAMS RGGTEST1
REPLICAT RGGTEST1
USERID GGATE_ADMIN@TGORTEST, PASSWORD GGATE_ADMIN
DISCARDFILE /u01/app/ggate/dirrpt/RGGTEST1.dsc,APPEND,MEGABYTES 500
ASSUMETARGETDEFS
MAP SCOTT.EMP, TARGET SCOTT.EMP;
How it works...
In this recipe we first configure the processes for continuous replication by following the steps from the Setting up a simple GoldenGate replication configuration between two single node databases recipe in Chapter 2, Setting up GoldenGate Replication. Once that is done, we modify the Replicat process to add the HANDLECOLLISIONS parameter. The key thing to note in this procedure is that we do not start the Replicat process at this stage. After this we configure an initial load Extract. The SOURCEISTABLE parameter tells the GoldenGate Extract to read the data from the database itself rather than from the redo logs/archived logs. The FORMATASCII and SQLLOADER parameters cause the extract file to be written on the target server in an ASCII format that SQL*Loader can read. These parameters must be specified before the RMTFILE parameter in the Extract configuration.
We then create an initial load Replicat on the target system. The GENLOADFILES parameter indicates that we want this Replicat process to generate SQL*Loader files. GoldenGate does this by referring to a template file called sqlldr.tpl, which Oracle ships with the GoldenGate binaries.
Then we start the initial load extract, which creates the ASCII file on the target server. Afterwards, we update the data in the source system to generate some changes while the initial load extracts the data from the database.
Then we run the SQL*Loader job to load the data and verify that all of the data from the source has been copied to the target environment.
This is followed by starting the RGGTEST1 Replicat, which applies the DML changes to the target EMP table. Finally, we verify that the data in the target EMP table matches that in the source EMP table.
There's more...
You can use this method to load the data into a non-Oracle target database. For this you will need to change the FORMATASCII parameter as follows:
FORMATASCII, BCP
While generating the loader files on the target system, you will need to specify the appropriate template for the database that you are loading into, for example, db2cntl.tpl for DB2 and bcpfmt.tpl for SQL Server.
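As an example, a Replicat parameter file to generate BCP load files for a SQL Server target might look something like the following sketch; the DSN name mssql_dsn, the credentials, and the dbo.EMP target table are placeholders rather than values taken from this recipe:
GENLOADFILES bcpfmt.tpl
TARGETDB mssql_dsn, USERID ggate_admin, PASSWORD ggate_admin
EXTFILE ./dirdat/extinit.dat
ASSUMETARGETDEFS
MAP SCOTT.EMP, TARGET dbo.EMP;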