TD 14.0 - The other partitioning enhancements
In my blog entries "Teradata Columnar" and "9.2 Quintillion? What's that all about?", I discussed column partitioning and the increased partition limit that are introduced in TD 14.0. But there are other TD 14.0 partitioning enhancements that you may find useful. The following provides a brief introduction to these enhancements; for more detailed information, see the Orange Book: Increased Partition Limit and other Partitioning Enhancements, the Orange Book: Teradata Columnar, and the TD 14.0 manuals.
RANGE_N with BIGINT and TIMESTAMP
RANGE_N function is extended to allow a test value to have
TIMESTAMP(n) [WITH TIME ZONE] data type. With a test value of either of these two data types, the result of the
RANGE_N function has
BIGINT data type. The maximum number of ranges when the test value has
BIGINT data type is
2^63 - 3).
For test values with other data types,
RANGE_N still has
INTEGER data type and the maximum number of partitions is
2^31 - 1).
CREATE TABLE t1 (a INT, ts1 TIMESTAMP(6) WITH TIME ZONE NOT NULL, ...) PARTITION BY RANGE_N(ts1 BETWEEN TIMESTAMP '0001-01-01 00:00:00.000000+00:00' AND TIMESTAMP '9999-12-31 23:23:59.999999+00:00' EACH INTERVAL '0.000001' SECOND);
315,537,895,440,000,000 partitions with each distinct value of
ts1 in a separate partition. It is unlikely you would want to use this fine of granularity (too few rows per partition) but it is allowed.
A more reasonable example might be the following,
CREATE TABLE t2 (a INT, ts1 TIMESTAMP(6) WITH TIME ZONE NOT NULL, ...) PARTITION BY RANGE_N(ts1 BETWEEN TIMESTAMP '0001-01-01 00:00:00.000000+00:00' AND TIMESTAMP '9999-12-31 23:23:59.999999+00:00' EACH INTERVAL '1' DAY);
RANGE_N with TIMESTAMP Cast as DATE
Prior to Teradata 14.0, the
RANGE_N function’s test value could be a
TIMESTAMP column cast as
DATE and the ranges specified by
DATE constants. However, to obtain row partition elimination, a predicate of a query needed to specify the
TIMESTAMP column without casting. Starting with Teradata 14.0, the query can cast the
TIMESTAMP column to
DATE and compare to DATE constants and be able to obtain row partition elimination. Note that
TIMESTAMP supports the TD 13.10
AT clause with a
CAST which can be used to make a partitioning expression deterministic (a subsequent blog entry will discuss this use of the
CREATE MULTISET TABLE ts ( Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL, Start_Time TIMESTAMP(0) NOT NULL, Data INTEGER ) PRIMARY INDEX (Calling_Nbr), PARTITION BY RANGE_N(CAST(Start_Time AS DATE AT LOCAL) BETWEEN DATE '1990-01-01' AND DATE '2030-12-31' EACH INTERVAL '1' DAY ); EXPLAIN SELECT Calling_Nbr, Start_Time FROM t12 WHERE CAST(Start_Time AS DATE) BETWEEN DATE '2008-03-14' AND DATE '2008-03-29'; Explanation --------------------------------------------------------------------------- 1) First, we lock a distinct TEST."pseudo table" for read on a RowHash to prevent global deadlock for TEST.ts. 2) Next, we lock TEST.ts for read. 3) We do an all-AMPs RETRIEVE step from 16 partitions of TEST.ts with a condition of ("(TEST.ts.Start_Time < TIMESTAMP '2008-03-30 00:00:00') AND (TEST.ts.Start_Time >= TIMESTAMP '2008-03-14 00:00:00')") into Spool 1 (all_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 1 row (61 bytes). The estimated time for this step is 0.02 seconds. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
ADD Option and Excess Combined Partitions
ADD option for a row or column partitioning level specifies the number of partitions that can be added to a partitioning level. The number of partitions defined plus the number of partitions specified by the
ADD option defines the maximum number of partitions for that level. Also, excess combined partitions are assigned to levels until no more can be assigned to any level. The default
ADD is initially 10 for a column partitioning level and 0 for a row partitioning level (but may be increased if excess combined partitions are assigned to the partitioning level). Previously, excess combined partitions were only given to the first level and any excess not assignable to the first level were unusable.
PARTITION BY (RANGE_N(p BETWEEN 1 AND 100 EACH 1), RANGE_N(q BETWEEN 1 AND 300 EACH 10) ADD 20)
By specifying the
ADD 20 option for the second level of partitioning,
20 additional partitions besides the
30 defined partitions can be added to this level up to a maximum of
50 partitions total. Without the
ADD option partition, the maximum number of partitions for the second level is
30. If possible, Teradata tries to keep the partitioning as 2-byte partitioning (
65535 combined partitions). The first level is assigned any excess combined partitions; this can be computed by dividing
50 (the maximum number of partitions for the second level); the result is maximum of
1310 partitions for the first level. Since
100 partitions are defined for the first level,
1210 additional partitions are available for level
1 (that is, instead of the usual default of
ADD 0, the first level has a default of
ADD 1210). In this case, after assigning as many excess combined partitions to the first level, there are none left that can be assigned to the second level, so its
ADD remains at
20. In some cases, there can still be some excess combined partitions that can be assigned to other levels.
SHOW JOIN INDEXshow the
ADDoption as needed for a row partitioning level and always shows it for a column partitioning level.
EACH Optional for ALTER TABLE DROP Ranges
Prior to Teradata 14.0, the ranges and partitions were required to match existing ranges after expansion of ranges with
EACH clauses in both the old partitioning expression and the
DROP RANGE clause.
With Teradata 14.0, the ranges and partitions within the specified alter drop ranges are dropped thereby simplifying the dropping of partiitons. Note that an
EACH clause specified in the
DROP RANGE alter ranges has no effect.
For example, assume the following create table has been done:
CREATE TABLE Ordertsz ( o_orderkey INTEGER NOT NULL, o_custkey INTEGER, o_orderstatus CHAR(1) CASESPECIFIC, o_totalprice DECIMAL(13,2) NOT NULL, o_ordertsz TIMESTAMP(6) WITH TIME ZONE NOT NULL, o_comment VARCHAR(79) ) PRIMARY INDEX (o_orderkey) PARTITION BY RANGE_N( o_ordertsz BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2009-12-31 23:59:59.999999+00:00' EACH INTERVAL '1' MONTH) UNIQUE INDEX (o_orderkey);
ALTER TABLE statement is allowed if there are zero, one, or more rows with
o_ordertsz between TIMESTAMP ’2003-01-01 00:00:00.000000+00:00’ and TIMESTAMP ’2003-12-31 23:59:61.999999+00:00’ (those rows, if any, are deleted):
ALTER TABLE Ordertsz MODIFY PRIMARY INDEX (o_orderkey) DROP RANGE BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2003-12-31 23:59:59.999999+00:00' EACH INTERVAL ’1’ MONTH ADD RANGE BETWEEN TIMESTAMP '2010-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00' EACH INTERVAL '1' MONTH WITH DELETE;
ALTER TABLE statement has the same affect since the
EACH clause for the
DROP is not needed:
ALTER TABLE Ordertsz MODIFY PRIMARY INDEX (o_orderkey) DROP RANGE BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2003-12-31 23:59:59.999999+00:00' ADD RANGE BETWEEN TIMESTAMP '2010-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00' EACH INTERVAL '1' MONTH WITH DELETE;
The new partitioning expression for table
Ordertsz for either of the above is the following (the table still has a 2-byte partitioning):
RANGE_N(o_ordertsz BETWEEN TIMESTAMP '2004-01-01 00:00:00.000000+00:00' AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00' EACH INTERVAL '1' MONTH)
Other Miscellaneous Changes
- New form of check constraint text for partitioning (includes number of partitions,
- New dictionary fields in existing data dictionary tables (for example, TVM and its views indicate the number of PI columns and the number of partitioning levels; TVFields and its views indicate if a column is a partitioning column or not).
- Modified and new dictionary views.
- New fields for
- DBQL, XML plans, and QCD include additional fields for partitioning information.