In my blog entries "Teradata Columnar" and "9.2 Quintillion? What's that all about?", I discussed the column partitioning and increased partition limit introduced in TD 14.0.  But there are other TD 14.0 partitioning enhancements that you may find useful.  The following provides a brief introduction to these enhancements; for more detailed information, see the Orange Book: Increased Partition Limit and other Partitioning Enhancements, the Orange Book: Teradata Columnar, and the TD 14.0 manuals.

RANGE_N with BIGINT and TIMESTAMP

The RANGE_N function is extended to allow a test value to have BIGINT or TIMESTAMP(n) [WITH TIME ZONE] data type.  With a test value of either of these two data types, the result of the RANGE_N function has BIGINT data type.  The maximum number of ranges when the test value has BIGINT data type is 9,223,372,036,854,775,805 (2^63 - 3).

For test values with other data types, the result of RANGE_N still has INTEGER data type and the maximum number of ranges is 2,147,483,647 (2^31 - 1).
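As a quick sketch of the BIGINT case (the table and column names here are hypothetical, not from the blog):

```sql
-- Hypothetical sketch: partition by a BIGINT surrogate key.
-- Because the test value is BIGINT, the RANGE_N result is BIGINT,
-- so far more than 2,147,483,647 ranges could be defined if needed.
CREATE TABLE event_log (
    event_id BIGINT NOT NULL,
    payload  VARCHAR(100) )
PRIMARY INDEX (event_id)
PARTITION BY RANGE_N(event_id BETWEEN 1
                              AND     1000000000000
                              EACH    1000000);
```

This defines 1,000,000 ranges of one million values each; the BIGINT limit leaves ample headroom to add more ranges later.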

For example,

CREATE TABLE t1 (a INT, ts1 TIMESTAMP(6) WITH TIME ZONE NOT NULL, ...)
PARTITION BY RANGE_N(ts1 BETWEEN
         TIMESTAMP '0001-01-01 00:00:00.000000+00:00' AND
         TIMESTAMP '9999-12-31 23:59:59.999999+00:00'
     EACH INTERVAL '0.000001' SECOND);

This defines 315,537,895,440,000,000 partitions, with each distinct value of ts1 in a separate partition. It is unlikely you would want granularity this fine (too few rows per partition), but it is allowed.

A more reasonable example might be the following,

CREATE TABLE t2 (a INT, ts1 TIMESTAMP(6) WITH TIME ZONE NOT NULL, ...)
PARTITION BY RANGE_N(ts1 BETWEEN
         TIMESTAMP '0001-01-01 00:00:00.000000+00:00' AND
         TIMESTAMP '9999-12-31 23:59:59.999999+00:00'
     EACH INTERVAL '1' DAY);

RANGE_N with TIMESTAMP Cast as DATE

Prior to Teradata 14.0, the RANGE_N function’s test value could be a TIMESTAMP column cast as DATE, with the ranges specified by DATE constants.  However, to obtain row partition elimination, a query predicate needed to specify the TIMESTAMP column without casting.  Starting with Teradata 14.0, a query can cast the TIMESTAMP column to DATE, compare it to DATE constants, and still obtain row partition elimination.  Note that TIMESTAMP supports the TD 13.10 AT clause with a CAST, which can be used to make a partitioning expression deterministic (a subsequent blog entry will discuss this use of the AT clause).

For example,

CREATE MULTISET TABLE ts (
  Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
  Start_Time TIMESTAMP(0) NOT NULL,
  Data INTEGER )
PRIMARY INDEX (Calling_Nbr),
PARTITION BY RANGE_N(CAST(Start_Time AS DATE AT LOCAL)
  BETWEEN DATE '1990-01-01' AND DATE '2030-12-31' EACH INTERVAL '1' DAY );
 
EXPLAIN SELECT Calling_Nbr, Start_Time FROM ts
  WHERE CAST(Start_Time AS DATE) BETWEEN DATE '2008-03-14'
                                     AND DATE '2008-03-29';
 
Explanation
---------------------------------------------------------------------------
  1) First, we lock a distinct TEST."pseudo table" for read on a
     RowHash to prevent global deadlock for TEST.ts.
  2) Next, we lock TEST.ts for read.
  3) We do an all-AMPs RETRIEVE step from 16 partitions of TEST.ts
     with a condition of ("(TEST.ts.Start_Time < TIMESTAMP '2008-03-30
     00:00:00') AND (TEST.ts.Start_Time >= TIMESTAMP '2008-03-14
     00:00:00')") into Spool 1 (all_amps), which is built locally on
     the AMPs.  The size of Spool 1 is estimated with no confidence to
     be 1 row (61 bytes).  The estimated time for this step is 0.02
     seconds.
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.

ADD Option and Excess Combined Partitions

The new ADD option for a row or column partitioning level specifies the number of partitions that can be added to a partitioning level.  The number of partitions defined plus the number of partitions specified by the ADD option defines the maximum number of partitions for that level.  Also, excess combined partitions are assigned to levels until no more can be assigned to any level.  The default ADD is initially 10 for a column partitioning level and 0 for a row partitioning level (but may be increased if excess combined partitions are assigned to the partitioning level).  Previously, excess combined partitions were given only to the first level, and any excess not assignable to the first level was unusable.

For example,

PARTITION BY (RANGE_N(p BETWEEN 1 AND 100 EACH 1),
              RANGE_N(q BETWEEN 1 AND 300 EACH 10) ADD 20)

By specifying the ADD 20 option for the second level of partitioning, 20 additional partitions besides the 30 defined partitions can be added to this level, up to a maximum of 50 partitions total.  Without the ADD option, the maximum number of partitions for the second level is 30.  If possible, Teradata tries to keep the partitioning as 2-byte partitioning (65535 combined partitions).  The first level is assigned any excess combined partitions; this can be computed by dividing 65535 by 50 (the maximum number of partitions for the second level), giving a maximum of 1310 partitions for the first level.  Since 100 partitions are defined for the first level, 1210 additional partitions are available for level 1 (that is, instead of the usual default of ADD 0, the first level has a default of ADD 1210).  In this case, after assigning as many excess combined partitions as possible to the first level, there are none left that can be assigned to the second level, so its ADD remains at 20.  In some cases, there can still be some excess combined partitions that can be assigned to other levels.

With TD 14.0, the number of partitions for any level can be changed to be between 1 and the level’s maximum.  Previously, this was only allowed for the first level of partitioning (for other levels, the number of partitions could not be changed -- that is, you needed to add as many partitions as you dropped for those levels).
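As a hedged sketch of how the ADD reserve might be consumed later (the table name t3 is hypothetical, and the exact ALTER syntax should be verified against the TD 14.0 manuals):

```sql
-- Hypothetical: t3 was created with
--   PARTITION BY (RANGE_N(p BETWEEN 1 AND 100 EACH 1),
--                 RANGE_N(q BETWEEN 1 AND 300 EACH 10) ADD 20)
-- Add 10 more ranges to the second level, consuming 10 of its
-- 20 ADD partitions -- in TD 14.0 this is possible even for a
-- level other than the first.
ALTER TABLE t3 MODIFY PRIMARY INDEX
  ADD RANGE#L2 BETWEEN 301 AND 400 EACH 10;
```

The RANGE#Ln form identifies which partitioning level the ADD applies to; without the ADD 20 in the original definition, the second level would already be at its maximum of 30 partitions and this alter would not be possible.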
 
SHOW TABLE and SHOW JOIN INDEX show the ADD option as needed for a row partitioning level and always show it for a column partitioning level.

EACH Optional for ALTER TABLE DROP Ranges

Prior to Teradata 14.0, the ranges specified in a DROP RANGE clause had to match existing ranges after expansion of any EACH clauses in both the old partitioning expression and the DROP RANGE clause.

With Teradata 14.0, the ranges and partitions that fall within the specified DROP RANGE ranges are dropped, thereby simplifying the dropping of partitions.  Note that an EACH clause specified in the DROP RANGE ranges has no effect.

For example, assume the following create table has been done:

CREATE TABLE Ordertsz (
    o_orderkey INTEGER NOT NULL,
    o_custkey INTEGER,
    o_orderstatus CHAR(1) CASESPECIFIC,
    o_totalprice DECIMAL(13,2) NOT NULL,
    o_ordertsz TIMESTAMP(6) WITH TIME ZONE NOT NULL,
    o_comment VARCHAR(79) )
  PRIMARY INDEX (o_orderkey)
  PARTITION BY RANGE_N(
      o_ordertsz BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00'
                     AND TIMESTAMP '2009-12-31 23:59:59.999999+00:00'
                    EACH INTERVAL '1' MONTH)
  UNIQUE INDEX (o_orderkey);

The following ALTER TABLE statement is allowed if there are zero, one, or more rows with o_ordertsz between TIMESTAMP '2003-01-01 00:00:00.000000+00:00' and TIMESTAMP '2003-12-31 23:59:59.999999+00:00' (those rows, if any, are deleted):

ALTER TABLE Ordertsz MODIFY PRIMARY INDEX (o_orderkey)
  DROP RANGE BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00'
                 AND TIMESTAMP '2003-12-31 23:59:59.999999+00:00'
                       EACH INTERVAL '1' MONTH
   ADD RANGE BETWEEN TIMESTAMP '2010-01-01 00:00:00.000000+00:00'
                 AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00'
                EACH INTERVAL '1' MONTH
  WITH DELETE;

The following ALTER TABLE statement has the same effect, since the EACH clause for the DROP is not needed:

ALTER TABLE Ordertsz MODIFY PRIMARY INDEX (o_orderkey)
  DROP RANGE BETWEEN TIMESTAMP '2003-01-01 00:00:00.000000+00:00'
                 AND TIMESTAMP '2003-12-31 23:59:59.999999+00:00'
   ADD RANGE BETWEEN TIMESTAMP '2010-01-01 00:00:00.000000+00:00'
                 AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00'
                EACH INTERVAL '1' MONTH
  WITH DELETE;

The new partitioning expression for table Ordertsz for either of the above is the following (the table still has a 2-byte partitioning):

RANGE_N(o_ordertsz BETWEEN TIMESTAMP '2004-01-01 00:00:00.000000+00:00'
                       AND TIMESTAMP '2010-12-31 23:59:59.999999+00:00'
                      EACH INTERVAL '1' MONTH)

Other Miscellaneous Changes

  1. New form of check constraint text for partitioning (includes number of partitions, ADD option, etc.).
  2. New dictionary fields in existing data dictionary tables (for example, TVM and its views indicate the number of PI columns and the number of partitioning levels; TVFields and its views indicate if a column is a partitioning column or not).
  3. Modified and new dictionary views.
  4. New fields for HELP COLUMN statement.
  5. DBQL, XML plans, and QCD include additional fields for partitioning information.
Discussion
virenag 5 comments Joined 09/12
05 Oct 2012

Teradata TPT OUTLIMIT value is ignored. BTEQ RETLIMIT and FEXP OUTLIMIT work fine. Does anyone know anything about TPT OUTLIMIT?
I defined INTEGER OutLimit = 1,
in TPT but it retrieved all rows anyway.

PaulSinclair 29 comments Joined 06/11
11 Oct 2012

Since I am not a TPT expert, I would suggest that you please post this question in the Tools forum -- http://forums.teradata.com/forum/tools -- where it may get better visibility from a TPT expert who may be able to answer your question.

Paul Sinclair

ericsun2 5 comments Joined 06/10
21 Apr 2013

I have a question about partition elimination for VARCHAR based RANGE_N in 14.00 release, here is the sample DDL:

create set table MY_CHAR_RANGE (
 TRACKING_ID INT NOT NULL,
 TRACKING_KEY varchar(20) NOT NULL,
 TRACKING_VALUE VARCHAR(128)
) primary index (TRACKING_ID)
partition by ( range_n( tracking_key between '0','1','14','2','4','8','E',
'Q' AND 'Z', NO RANGE ) );

insert into MY_CHAR_RANGE values (101, '2', 'browser_id=dfdfdfdfdfdf');
insert into MY_CHAR_RANGE values (103, '3', 'order_id=3456');
insert into MY_CHAR_RANGE values (104, '4', 'url=http://www.godaddy.com');
insert into MY_CHAR_RANGE values (105, '5', 'test_id=007');
insert into MY_CHAR_RANGE values (106, '6', 'android');
insert into MY_CHAR_RANGE values (107, '7', '4.0.4');
insert into MY_CHAR_RANGE values (108, '8', 'viewer_info=N/A');
insert into MY_CHAR_RANGE values (109, '9', 'session_id=abef-0239');
insert into MY_CHAR_RANGE values (110, '11', 'N/A');
insert into MY_CHAR_RANGE values (111, 'query', 'everything = ???');
insert into MY_CHAR_RANGE values (112, '0', 'md5=???');

collect stats MY_CHAR_RANGE index(TRACKING_ID);
collect stats MY_CHAR_RANGE column(TRACKING_key);
collect stats MY_CHAR_RANGE column(partition);

I am expecting partition elimination (narrowed to a 2-partition scan) when a filter on TRACKING_KEY is applied. This works in the TD 14.00 VM, but not in the real 14.00.02.11 release:


explain select * from my_char_range where tracking_key in ('4','10');

 *** Help information returned. 14 rows.
 *** Total elapsed time was 1 second.

Explanation
---------------------------------------------------------------------------
  1) First, we lock a distinct SAMPLES."pseudo table" for read on a
     RowHash to prevent global deadlock for SAMPLES.my_char_range.
  2) Next, we lock SAMPLES.my_char_range for read.
  3) We do an all-AMPs RETRIEVE step from 2 partitions of
     SAMPLES.my_char_range with a condition of (
     "(SAMPLES.my_char_range.TRACKING_KEY = '10') OR
     (SAMPLES.my_char_range.TRACKING_KEY = '4')") into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with high confidence to be 1 row (328 bytes).
     The estimated time for this step is 0.03 seconds.
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds.

select partition, tracking_key from my_char_range order by 1,2;

 *** Query completed. 11 rows found. 2 columns returned. 
 *** Total elapsed time was 1 second.

  PARTITION  TRACKING_KEY
-----------  ----------------------------------------
          1  0
          2  11
          4  2
          4  3
          5  4
          5  5
          5  6
          5  7
          6  8
          6  9
          8  query

The partition elimination is also supposed to work in 13.10, but the explain plan still scans all partitions.

  3) We do an all-AMPs RETRIEVE step from SAMPLES.my_char_range by
     way of an all-rows scan with a condition of (
     "(SAMPLES.my_char_range.TRACKING_KEY = '10') OR
     (SAMPLES.my_char_range.TRACKING_KEY = '4')") into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with high confidence to be 2 rows (952 bytes).
     The estimated time for this step is 0.03 seconds.

Can Paul or someone shed some light on this issue please?

PaulSinclair 29 comments Joined 06/11
24 Apr 2013

I tried this on a 13.10 release and a 14.0 release and I was able to get partition elimination. I asked a Teradata software engineer (who developed the char PPI feature) to run it specifically on 14.00.02.11 and he was able to get partition elimination.  So we don't know why you aren't getting partition elimination. You can try the CharPPITrace diagnostic if you can get access to the DBS I/O window or know someone who can look at it for you:

diagnostic CharPPITrace on for session;
explain select * from my_char_range where tracking_key in ('4','10');

 
The reason why partition elimination was not done in your session should be in the resulting trace which is printed to the DBS I/O window.
You may want to contact the support center to help you with this problem.

Paul Sinclair

PaulSinclair 29 comments Joined 06/11
24 Apr 2013

You also want to check with your system administrators if CharPE has been disabled on your system via the DBS Control internal flag:
  157.  DisableCharPE

Paul Sinclair

PaulSinclair 29 comments Joined 06/11
24 Apr 2013

You also may want to check if DBS Control internal flag AcceptReplacementCharacters has been turned on; if it is turned on, this disables character partition elimination. The CharPPITrace would indicate if this is the reason.

Paul Sinclair

ericsun2 5 comments Joined 06/10
07 May 2013

Yes, AcceptReplacementCharacters = TRUE is the root cause, but we have a lot of Unicode strings to load.
Currently, a lot of data imports will fail if we turn off AcceptReplacementCharacters. I am hoping TD can provide a more tolerant client-side library to translate Unicode beyond the BMP.
Thanks Paul, as usual!

PaulSinclair 29 comments Joined 06/11
14 May 2013

In order to ensure consistent evaluation of character partitioning expressions, the internal DBS Control field AcceptReplacementCharacters must be FALSE when doing character partition elimination.  This field being set TRUE has the effect of ignoring failed character conversions from one character set to another.  If a character row-partitioned table was created with a different setting of this flag than subsequent DML requests, wrong results or data corruption may occur.
The restriction exists because new translations between character sets may be added over time, so a translation that did not exist at table creation time may exist at a later time.  If the character partitioning expression includes such a translation, its evaluation may change over time (which might cause rows to go into different partitions than they would have before, and partition elimination may incorrectly eliminate partitions).  To avoid any issues, a character partitioning expression is always evaluated as if AcceptReplacementCharacters is FALSE.  As part of static partition elimination, character partition ranges may be compared to query predicates.  The evaluation of the query predicate must use the setting of AcceptReplacementCharacters, while using the query predicate in character partition elimination must always use a FALSE setting.  But the query predicate must be evaluated/processed with only a single setting in both cases; otherwise inconsistencies could occur.  So, character partition elimination is only allowed when the setting of AcceptReplacementCharacters is FALSE.
Perhaps something could be done to allow character partition elimination when the setting for AcceptReplacementCharacters is TRUE (e.g., detect that no charset translations are taking place), but this would require some investigation.  You may want to request an RFC to relax this restriction.

Paul Sinclair

tt255009 4 comments Joined 11/12
06 Jun 2013

Hi,

I have a question... is row-level locking allowed in a dynamic query in a Teradata stored procedure? I have a test code:

REPLACE PROCEDURE TESTDB.TEST ()
MAIN:
BEGIN

    DECLARE vSQL_Text VARCHAR(5000);

    SET vSQL_Text = 'LOCKING ROW FOR WRITE SELECT COL1 FROM TESTDB.tABLE WHERE col1= 200;';

    /* Exec Dynamic SQL */
    CALL DBC.SysExecSQL(:vSQL_Text);

END MAIN;

CALL TESTDB.TEST();

and if I try to execute it, then "Error 5568: the sql statement is not supported within a stored procedure" is produced.

How can row-level locking be achieved in a stored procedure if the query is dynamically generated?

PaulSinclair 29 comments Joined 06/11
07 Jun 2013

Stored procedure question: I would suggest that you please post this type of question in the Database forum -- http://forums.teradata.com/forum/database -- where the question and answers may get better visibility than under this blog about partitioning. Note that the error is in regard to using a SELECT in the call, not specifically to the locking modifier.

Paul Sinclair

PaulSinclair 29 comments Joined 06/11
08 Oct 2013

If all the values are unique for the column (or set of columns), you can create a USI on the NUPI column (or the columns of the NUPI) even if the column (or set of columns) is defined as the NUPI (whether partitioned or not). However, if there is no partitioning, or all the partitioning columns from the partitioning expressions are included as columns in the NUPI, the table should immediately be altered to have a UNIQUE PRIMARY INDEX (this will drop the now unnecessary USI), since the uniqueness can be enforced by the UPI instead of maintaining a separate USI, and the UPI provides more efficient access than the USI.
There would be an improvement to queries that can use the USI (or UPI) for access -- that is, when the query specifies predicates that are equality to specific values for all the columns in the USI (or UPI).
For the nonpartitioned case with a UPI, the performance improvement for access via the UPI compared to being a NUPI might be minor, though knowing there is only one unique row for a value can sometimes lead to better optimization of the query.
In the partitioned case with the USI, this will be more efficient than probing each partition based on the NUPI (one probe for each nonempty partition vs. one probe on the USI followed by one probe on the base table). Of course, in the partitioned case, UPI access would be the best -- one probe to the correct partition (but, as mentioned above, all the partitioning columns would need to be in the UPI, and this might not be the set of columns on which you want to define uniqueness).

Paul Sinclair

addsmiles 2 comments Joined 01/12
07 Feb 2014

Paul,
Thanks a ton. This and your other partitioning related write ups have been very insightful. I have been helped a great deal.
Regards
PT

suhailmemon84 52 comments Joined 09/10
24 Aug 2014

Hi Paul,
Does timestamp partitioning in TD 14.0 work only with equality conditions? I tried the following example and I could not achieve partition elimination when a <= or >= was used.
 

CREATE MULTISET TABLE ts1 (
  Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
  Start_Time TIMESTAMP(6) with time zone NOT NULL,
  Data INTEGER )
PRIMARY INDEX (Calling_Nbr),
PARTITION BY RANGE_N(Start_Time  BETWEEN TIMESTAMP '1990-01-01 00:00:00.000000+00:00'
                     AND TIMESTAMP '2099-12-31 23:59:59.999999+00:00'
                    EACH INTERVAL '1' day );
  
SELECT *  FROM ts1
  WHERE Start_Time=current_timestamp(6)
;

SELECT *  FROM ts1
  WHERE Start_Time>=current_timestamp(6)
;


SELECT *  FROM ts1
  WHERE Start_Time<=current_timestamp(6)
;  

 
and the explain shows that only the 1st query showed partition elimination; the other 2 did not. Here are the explain outputs too.
 

Explain SELECT *  FROM ts1
  WHERE Start_Time=current_timestamp(6)
;

  1) First, we lock a distinct "pseudo table" for read on a
     RowHash to prevent global deadlock for ts1. 
  2) Next, we lock ts1 for read. 
  3) We do an all-AMPs RETRIEVE step from a single partition of
     ts1 with a condition of ("ts1.Start_Time =
     CURRENT_TIMESTAMP(6)") with a residual condition of (
     "ts1.Start_Time = CURRENT_TIMESTAMP(6)") into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with no confidence to be 24 rows (1,584
     bytes).  The estimated time for this step is 0.03 seconds. 
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds. 


Explain SELECT *  FROM ts1
  WHERE Start_Time>=current_timestamp(6)
;

  1) First, we lock a distinct "pseudo table" for read on a
     RowHash to prevent global deadlock for ts1. 
  2) Next, we lock ts1 for read. 
  3) We do an all-AMPs RETRIEVE step from ts1 by way of an
     all-rows scan with a condition of ("ts1.Start_Time >=
     CURRENT_TIMESTAMP(6)") into Spool 1 (group_amps), which is built
     locally on the AMPs.  The size of Spool 1 is estimated with no
     confidence to be 80 rows (5,280 bytes).  The estimated time for
     this step is 0.03 seconds. 
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds. 


Explain SELECT *  FROM ts1
  WHERE Start_Time<=current_timestamp(6)
;

  1) First, we lock a distinct "pseudo table" for read on a
     RowHash to prevent global deadlock for ts1. 
  2) Next, we lock ts1 for read. 
  3) We do an all-AMPs RETRIEVE step from ts1 by way of an
     all-rows scan with a condition of ("ts1.Start_Time <=
     CURRENT_TIMESTAMP(6)") into Spool 1 (group_amps), which is built
     locally on the AMPs.  The size of Spool 1 is estimated with no
     confidence to be 80 rows (5,280 bytes).  The estimated time for
     this step is 0.03 seconds. 
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds. 

 
Thanks in advance for your response.
 
-Suhail

PaulSinclair 29 comments Joined 06/11
25 Aug 2014

Timestamp partitioning works with nonequality conditions in 14.0 with constants but not with CURRENT_TIMESTAMP. Prior to Teradata 12.0, this was also true for date partitioning and the CURRENT_DATE and DATE functions -- these functions were treated as non-constant, which prevented static partition elimination. In Teradata 12.0, CURRENT_DATE and DATE were changed to be treated as constants and their values substituted so the optimizer could do static partition elimination (the date was marked in the cached plan and the plan was discarded once the date changed). However, CURRENT_TIMESTAMP changes more often than the date, so it is not currently treated as a constant; this allows the plan to be cached but prevents static partition elimination. Note that if you substitute constants for CURRENT_TIMESTAMP, static partition elimination occurs for the inequality cases. For the equality case, partition elimination occurs using deferred partition elimination (which only occurs with equality); deferred partition elimination allows for a cached plan. A future release will allow CURRENT_TIMESTAMP to be treated as a constant by building a specific plan which can't be cached, but then the optimizer has to decide that a specific plan that is not cached is better than a generic plan that is cached.
EXPLAIN SELECT * FROM ts1 WHERE Start_Time>=TIMESTAMP '2014-08-25 14:22:44.000000';
Explanation
---------------------------------------------------------------------------
1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
to prevent global deadlock for PLS.ts1.
2) Next, we lock PLS.ts1 for read.
3) We do an all-AMPs RETRIEVE step from 31175 partitions of PLS.ts1
with a condition of ("PLS.ts1.Start_Time >= TIMESTAMP '2014-08-25
14:22:44.000000+00:00'") into Spool 1 (group_amps), which is built
locally on the AMPs. The size of Spool 1 is estimated with no
confidence to be 2 rows (170 bytes). The estimated time for this
step is 0.07 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.07 seconds.

EXPLAIN SELECT *  FROM ts1  WHERE Start_Time<=TIMESTAMP '2014-08-25 14:22:44.000000';
Explanation
---------------------------------------------------------------------------
  1) First, we lock a distinct PLS."pseudo table" for read on a RowHash
     to prevent global deadlock for PLS.ts1.
  2) Next, we lock PLS.ts1 for read.
  3) We do an all-AMPs RETRIEVE step from 9003 partitions of PLS.ts1
     with a condition of ("PLS.ts1.Start_Time <= TIMESTAMP '2014-08-25
     14:22:44.000000+00:00'") into Spool 1 (group_amps), which is built
     locally on the AMPs.  The size of Spool 1 is estimated with no
     confidence to be 2 rows (170 bytes).  The estimated time for this
     step is 0.07 seconds.
  4) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.07 seconds.

Paul Sinclair

suhailmemon84 52 comments Joined 09/10
29 Aug 2014

Thanks Paul. Do you know which future release will allow treating CURRENT_TIMESTAMP as a constant and facilitate creation of a specific plan?

PaulSinclair 29 comments Joined 06/11
30 Aug 2014

I wish I could provide such information, but until there is an official announcement of a release and its content, and since release content is subject to change, I am unable to provide that information in an open forum. In some cases, your account team may be able to provide tentative information about upcoming releases.

Paul Sinclair

suhailmemon84 52 comments Joined 09/10
2 months ago

Thanks Paul.

MaximeV 1 comment Joined 11/13
1 month ago

Hi, I have a question about partition elimination in 14.00(.06.09) with a table partitioned on a timestamp column.
I've prepared 3 case studies, but first of all we need a materialized calendar table:

CREATE SET TABLE CALN_TEST ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      CALN_DT DATE FORMAT 'YYYY-MM-DD',
      DAY_OF_WEEK INTEGER,
      DAY_OF_MONTH INTEGER,
      DAY_OF_YEAR INTEGER,
      DAY_OF_CALN INTEGER,
      WEEKDAY_OF_MONTH INTEGER,
      WEEK_OF_MONTH INTEGER,
      WEEK_OF_QUARTER INTEGER,
      WEEK_OF_YEAR INTEGER,
      WEEK_OF_CALN INTEGER,
      MONTH_OF_QUARTER INTEGER,
      MONTH_OF_YEAR INTEGER,
      MONTH_OF_CALN INTEGER,
      QUARTER_OF_YEAR INTEGER,
      QUARTER_OF_CALN INTEGER,
      YEAR_OF_CALN INTEGER,
      WEEKBEGIN_DT DATE FORMAT 'YYYY-MM-DD',
      WEEKEND_DT DATE FORMAT 'YYYY-MM-DD',
      MONTHBEGIN_DT DATE FORMAT 'YYYY-MM-DD',
      MONTHEND_DT DATE FORMAT 'YYYY-MM-DD',
      QUARTERBEGIN_DT DATE FORMAT 'YYYY-MM-DD',
      QUARTEREND_DT DATE FORMAT 'YYYY-MM-DD',
      YEARBEGIN_DT DATE FORMAT 'YYYY-MM-DD',
      YEAREND_DT DATE FORMAT 'YYYY-MM-DD')
UNIQUE PRIMARY INDEX ( CALN_DT );

INSERT INTO CALN_TEST
SELECT 
			CALENDAR_DATE,
			DAY_OF_WEEK,
			DAY_OF_MONTH,
			DAY_OF_YEAR,
			DAY_OF_CALENDAR,
			WEEKDAY_OF_MONTH,
			WEEK_OF_MONTH ,
			WEEK_OF_QUARTER,
			WEEK_OF_YEAR,
			WEEK_OF_CALENDAR,
			MONTH_OF_QUARTER,
			MONTH_OF_YEAR,
			MONTH_OF_CALENDAR,
			QUARTER_OF_YEAR,
			QUARTER_OF_CALENDAR,
			YEAR_OF_CALENDAR,
			WEEKBEGIN,
			WEEKEND,
			MONTHBEGIN,
			MONTHEND,
			QUARTERBEGIN,
			QUARTEREND,
			YEARBEGIN,
			YEAREND
FROM SYS_CALENDAR.BUSINESSCALENDAR 
WHERE YEAR_OF_CALENDAR BETWEEN 1990 AND 2050;    

COLLECT STATISTICS ON CALN_TEST INDEX ( CALN_DT );

Next, let's proceed with the case studies.
CASE 1: table partitioned directly on the timestamp column with a 1-day interval:

CREATE MULTISET TABLE ts1 (
  Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
  Start_Time TIMESTAMP(0) WITH TIME ZONE NOT NULL,
  DATA INTEGER )
PRIMARY INDEX (Calling_Nbr),
PARTITION BY RANGE_N(Start_Time  BETWEEN TIMESTAMP '1990-01-01 00:00:00+00:00'
                     AND TIMESTAMP '2099-12-31 23:59:59+00:00'
                    EACH INTERVAL '1' DAY );

Then let's try this query:

EXPLAIN SELECT *  FROM ts1 INNER JOIN CALN_TEST CAL ON ( CAST(ts1.start_time AS DATE) = CAL.CALN_DT)
  WHERE cal.caln_dt='2014-08-25';

Sadly, partition elimination does not occur.
 
CASE 2: like in the blog entry, table partitioned on the timestamp column cast as DATE, still with a 1-day interval:

CREATE MULTISET TABLE ts2 (
  Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
  Start_Time TIMESTAMP(0) WITH TIME ZONE NOT NULL,
  DATA INTEGER )
PRIMARY INDEX (Calling_Nbr),
PARTITION BY RANGE_N(CAST(Start_Time AS DATE AT LOCAL ) BETWEEN DATE '1990-01-01'
                     AND DATE  '2099-12-31'
                     EACH INTERVAL '1' DAY );

Let's try the same query again:

EXPLAIN SELECT *  FROM ts2 INNER JOIN CALN_TEST CAL ON ( CAST(ts2.start_time AS DATE) = CAL.CALN_DT)
  WHERE cal.caln_dt='2014-08-25';

Still, partition elimination does not occur.
 
CASE 3: data redundancy, with a date column added and the table partitioned on this new column:

CREATE MULTISET TABLE ts3 (
  Calling_Nbr VARCHAR(21) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
  Start_Time TIMESTAMP(0) WITH TIME ZONE NOT NULL,
  Start_dt DATE NOT NULL,
  DATA INTEGER )
PRIMARY INDEX (Calling_Nbr),
PARTITION BY RANGE_N(Start_dt BETWEEN DATE '1990-01-01'
                     AND DATE  '2099-12-31'
                    EACH INTERVAL '1' DAY );

Now the query is:

EXPLAIN SELECT *  FROM ts3 INNER JOIN CALN_TEST CAL ON ( ts3.start_dt  = CAL.CALN_DT)
  WHERE cal.caln_dt='2014-08-25';

and now partition elimination occurs and data from a single partition is retrieved.
 
- Does anyone have an explanation for this behaviour?
- Are there any improvements in the latest releases?
- Is it not recommended to partition a table on a timestamp column?
 
Thanks.
 
On a side note, I'll add that I know the query explained is pretty stupid (but replace the caln_dt filter with a month or week restriction and this kind of query can happen a lot, especially with 3rd party BI tools).
 
 
 
