In the last Teradata Data Mover (TDM) article (Executing Partial Table Copies with Teradata Data Mover), we discussed creating a TDM job to copy a subset of rows in a table between Teradata systems. This example showed how customers can avoid copying an entire table to the target system when they only want to copy recent changes made to that table. The problem with the example in that article, though, is that the where clause has a hard-coded value in it. Customers will typically want to avoid having hard-coded values in their production TDM partial copy jobs because the subset of rows they want to copy will change every time they want to execute the job. It's possible for customers to just create a new TDM job every time they want to change the where clause, but that could lead to many unnecessary jobs being created in the TDM repository that copy data from the same table. It's much more efficient to create one job that will copy a dynamic subset of rows every time it is executed. Executing the same TDM job repeatedly instead of creating a new job every time rows need to be copied from the same table will eliminate the overhead associated with creating new TDM jobs.

This article will discuss how customers can use TDM to dynamically execute partial table copies with one job.

Nested Selects

The key to creating a dynamic partial table copy job with TDM is to specify a nested select in the where clause for the job. The nested select statement can be very powerful in the way that it keeps the text of the where clause constant, while the meaning of it stays dynamic. For example, let's assume we modify the where clause in the previous article to be the following instead:

WHERE salesdate > (sel salesdate from jg185041.itemsdate)

This where clause will use a control table called jg185041.itemsdate to determine the criteria used to copy a subset of rows from the jg185041.items1 table. The advantage of using this where clause instead of the one specified in the Executing Partial Table Copies with Teradata Data Mover article is that no hard-coded values are used to determine the subset of rows that should be copied to the target system. The customer can change the salesdate value in the jg185041.itemsdate table before executing the same TDM job every time to get different results. This is a very efficient method to manage copying recent changes in a table to the target system on a continuous basis.

Create a Dynamic Partial Copy TDM Job

Let's use the same jg185041.items1 table we created in the Executing Partial Table Copies with Teradata Data Mover article to tie this all together and create a dynamic partial copy TDM job. In this example, let's assume we want to copy all rows that correspond to when sales were made after 2008 without specifying a hard-coded value in the where clause. This of course also assumes that the salesdate column in the jg185041.itemsdate table has a lone value of '2008-12-31' in it. Here is the XML file we will use to create a dynamic partial copy TDM job:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<dmCreate xmlns="http://schemas.teradata.com/dataMover/v2009"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.teradata.com/dataMover/v2009/DataMover.xsd">
    <job_name>jg_partialcopy2_devX</job_name>
    <source_tdpid>dmdev</source_tdpid>
    <source_user>jg185041</source_user>
    <source_password>jg185041</source_password>
    <target_tdpid>dmsmp</target_tdpid>
    <target_user>jg185041</target_user>
    <target_password>jg185041</target_password>
    <force_utility>tptapi</force_utility>
    <log_level>1</log_level>
    <database selection="unselected">
        <name>jg185041</name>
            <table selection="included">
                <name>items1</name>
                    <sql_where_clause><![CDATA[ WHERE salesdate > (sel salesdate from jg185041.itemsdate)]]></sql_where_clause>
                    <key_columns>
                        <key_column>deptId</key_column>
                        <key_column>prodId</key_column>
                        <key_column>salesdate</key_column>
                    </key_columns>
            </table>
    </database>
</dmCreate>

As you can see in the XML file above, the nested select statement in the <sql_where_clause> tag is used to tell TDM that this is a partial table copy job. Since TDM just passes the where clause as-is to the underlying utility used to extract the data, it doesn't know that a nested select is in the where clause. The ARC utility doesn't support nested selects in where clauses (even for PPI tables), so that is why we have to force TDM to use TPTAPI to copy the data instead. If we didn't specify a value for the <force_utility> tag then TDM would have chosen ARC as the default utility to use (because this is a PPI table) and an error would have been reported.

Here is the status output of executing this job:

dm-agent4:/home/jg185041/datamover/testing # datamove status -job_name jg_partialcopy2_devX -output_level 3
Data Mover Command Line 13.10.00.03
Status mode
Command parameters:
- job_name: jg_partialcopy2_devX
- output_level: 3
Permission authorized
Requesting status...
Connected to Daemon version 13.10.00.03

Job Name: jg_partialcopy2_devX
Job Execution Name: jg_partialcopy2_devX-20111229072543EST


TYPE  ID   STATUS                 CURRENT STEP                START TIME        DURATION TIME
-----------------------------------------------------------------------------------------------------------------------------------------------------
Job:  26   COMPLETED_SUCCESSFULLY 5                           12/29/11 7:25 AM  0:0:19


TYPE  ID   STATUS                 STEP TYPE                   START TIME        DURATION TIME
-----------------------------------------------------------------------------------------------------------------------------------------------------
Step: 3    COMPLETED_SUCCESSFULLY MOVE_DEFINITION_BEFORE_LOAD 12/29/11 7:25 AM  0:0:0
Step: 4    COMPLETED_SUCCESSFULLY MOVE_TABLE_DATA             12/29/11 7:25 AM  0:0:16
Step: 5    COMPLETED_SUCCESSFULLY RESOLVE_TABLE_AFTER_LOAD    12/29/11 7:26 AM  0:0:1


TYPE  ID    NAME                         MOVE PHASE     STATUS   TYPE   ROWS   BYTES   AGENT   UTILITY         TIMESTAMP
-----------------------------------------------------------------------------------------------------------------------------------------------------
Task: 228                                PRE_DATA_MOVE  COMPLETE                       Agent1  SQL            12/29/11 7:25 AM
Task: 229   "jg185041"."items1_e286e7_t" MOVING_DATA    LOAD     table  37     931     Agent1  TPTAPI_LOAD    12/29/11 7:26 AM
Task: 229   "jg185041"."items1"          MOVING_DATA    EXPORT   table  37     931     Agent1  TPTAPI_LOAD    12/29/11 7:26 AM
Task: 229   "jg185041"."items1"          MOVING_DATA    COMPLETE table  37     931     Agent1  TPTAPI_LOAD    12/29/11 7:26 AM
Task: 230                                POST_DATA_MOVE COMPLETE                       Agent1  SQL            12/29/11 7:26 AM 

You can see in the status output that 37 rows were copied in this job instead of the full 200 in the source table. You can also see that the TPTAPI_LOAD utility was used to execute this partial table copy. If we wanted to use a different date criteria in the where clause the next time rows from the jg185041.items1 table need to be copied then all we would have to do is change the salesdate value in the jg185041.itemsdate table before executing the same job again.

Discussion
david clough 4 comments Joined 12/09
29 Dec 2011

Yes, good stuff.
What I'm planning to do, however, is generate an XML file every time I want to run it. This will be constructed (probably using Python) and will have the Tables that I want to 'move'.
The reason I'm thinking of doing it this way is that I'll have a control Table, containing all the Tables that need moving, and this set of Tables could change from run to run.
So, having constructed the XML script, we can then just run it.

bbdd.tsunix 6 comments Joined 04/13
03 Sep 2013

I'm interested to install DataMover for copy data from  production nvironment  to development environment. Can you tell me where I can download the software TDM? this software is included in the DISK TTU 14.0???
Regards

You must sign in to leave a comment.