This article describes how to preserve data integrity when a Teradata Parallel Transporter (Teradata PT) export to stream job restarts.

POSSIBLE ISSUE

Teradata PT can export data from a Teradata source table and load the data into a different Teradata target table in a single Teradata PT job without landing the data to a disk file. This sounds great but care must be taken to ensure data integrity.

When the exported data is unordered and there is a job restart during the middle of the export, the target table can have undesirable results at the end of the job.

Why? You might ask. It is because the Teradata Database does not guarantee the same order of the exported data after a restart.

The Stream operator sends the data to the Teradata Database and the data is committed as it is loaded. The Stream operator keeps track the number of records that has been sent. When there is a restart, the Stream operator will resume from the last sent record number.

A job is not checkpoint/restartable when using the TPT Export operator or SQL Selector operator. So after a restart, the SELECT statement in the SelectStmt attribute is sent again and all of the rows will be exported from the Teradata Database again.

This issue occurs when all of the following conditions are met:

  • The job uses the TPT Export or SQL Selector operator to export data from a source table.
  • The job uses the TPT Stream operator to load the data into a target table.
  • The exported data is unordered.
  • A restart occurs in the middle of the export.

This problem can also occur if the source table changes between the original export and the re-issued export after restart.

 

UNDESIRABLE RESULTS

If the target table is a SET table, the two results are:

  • Some rows are not applied on the target table.
  • Some rows are incorrectly inserted into the TPT Stream operator’s error table.

If the target table is a MULTISET table, the two results are:

  • Some rows are not applied on the target table.
  • Some rows are incorrectly applied twice on the target table.

Note: The target table cannot be rolled back because the Stream operator sends the data to the Teradata Database and the Teradata Database applies the data on the target table.

 

HOW TO AVOID THE ISSUE

There are two ways to avoid the issue:

  1. Export the data to a disk file. Then, load the data file to the target table.
  1. Use the TPT Update operator instead of the TPT Stream operator.

Note: Use of the TPT Update operator will still have a problem if the user tried to checkpoint. An export-to-update job cannot have checkpointing because the Export operator cannot restart in the middle of the export. The Export operator exports all of the data or none of the data.

Export the Data to a Disk File

The first way is to export the data to a file. You would need to use the TPT Data Connector operator to write the data to a file. You would need to use the “Write” option for the TPT Data Connector’s OpenMode attribute.

When a restart occurs in the middle of the job, the TPT Data Connector operator rewrites the entire file. When the job completes, the data file will have all of the correct rows as long as the user did not select the “WriteAppend” option for the OpenMode attribute.

The data file can then be loaded into a target table. You would need to use the TPT Data Connector operator to read the data from the file.

When a restart occurs in the middle of the loading, the TPT Data Connector operator guarantees the same order of the data read from the file. When the loading completes, the target table will have the correct result.

Use the TPT Update Operator

The second way is to use the TPT Update operator instead of the TPT Stream operator.

The TPT Update operator is different from the TPT Stream operator.

The TPT Update operator sends the data to the Teradata Database in the acquisition phase. In the acquisition phase, the Teradata Database populates the data to a temporary work table. After the acquisition phase, the Teradata Database applies the data from  the work table to the target table. This is called the application phase.

When the user does not specify a checkpoint and a restart occurs, the TPT Update operator will restart the loading from the beginning and the Teradata Database will populate the work table from the beginning. After the application phase, the target table will have the correct result as long as the user did not specify a checkpoint.

Note: The job is not checkpoint/restartable when using the TPT Export operator or SQL Selector operator.

For information on the Teradata PT operators, see Teradata Parallel Transporter Reference.

For information on Teradata PT examples, see Teradata Parallel Transporter User Guide.

Discussion
krystynytax 5 comments Joined 03/13
20 May 2013

Hi,
I want to copy a database/schema´s user from one machine to another machine because I have development environment and i want to copy the same structure and data from development environment to  production environment which are in different machines. How I can do this ???
I have ABU Aplicattion for the backup
 

krystynytax 5 comments Joined 03/13
20 May 2013

Hi,
i want to know how I can copy a database or user schema from the development environment to the production environment.
I want to do export of all structure of a database and do the import in other database in other machine
I have ABU Aplicattion for the backup
Can i do this ?? Because i saw how copy tables, but i want to copy the all schema of a user /database
 
Thanks

TonyL 43 comments Joined 12/09
20 May 2013

You can try using Teradata Data Mover. See the Teradata Data Mover User Guide.

krystynytax 5 comments Joined 03/13
21 May 2013

Isn´t there  other way for I can do this??? Because the aplication  Teradata Data Mover isn´t free and i can´t pay a license

krystynytax 5 comments Joined 03/13
21 May 2013

Is it possible do export (from development environment) of a user and your objects and then do import in production environment???
Regards

TonyL 43 comments Joined 12/09
21 May 2013

TPT does not support the export of a user database.
TPT does support the export of user tables from one Teradata Database system to another Teradata Database system. See my article at this URL: http://developer.teradata.com/tools/articles/use-teradata-parallel-transporter-to-move-data-without-landing-the-data-to-disk

You must sign in to leave a comment.