If you are using Teradata on one of the 1xxx or 2xxx platforms, be aware that query demotions are automatically built into your workload management scheme.   Because I’ve had a few questions about what these so-called "query milestones" are and how they actually work, I’d like to explain this functionality and discuss its trade-offs. 

The appliance platforms are intended to be simple and easy to use, with few knobs to turn or decisions to make.  When it comes to workload management, you have four priorities available: Rush, High, Medium, and Low, also known as R, H, M, and L.  These groupings represent the four default priority scheduler performance groups.

All user work will fall into the Medium priority by default.  Users and all the work they submit can be assigned to one of the other performance groups by placing a performance group token (the dollar sign plus the group's first letter: $R, $H, $M, or $L) in the first position of the user's account string.

Modifying a user's account string to begin with the characters "$L" will cause all of that user's queries to execute in the Low performance group.  Similarly, starting the account string with the characters "$R" will cause that user's submitted work to run at the highest priority, Rush.
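
As a concrete sketch of that change (the user name and the text after the performance group token are hypothetical), the assignment is a one-line DDL statement; note that the new account string is picked up by sessions that log on after the change:

    MODIFY USER etl_batch AS ACCOUNT = ('$H_ETL');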

Three of the performance groups, Rush, High, and Medium, come with built-in query milestones, as illustrated in the graphic above.  Query milestones keep track of CPU usage by query and can automatically move a query that starts in one priority to a lower priority, based on CPU consumed per node.  The purpose of a query milestone is to move work out of a high priority when the query proves to run longer than is suitable for that priority.  By keeping only short work in the higher priorities, you can better protect that high-priority work so that it runs more consistently.

Query milestones are particularly useful if a single user submits a mix of short, long, and very long queries.  A query milestone gives all queries a chance to run at the prescribed priority for that user, and then, after a certain amount of CPU usage, demotes the queries that require more CPU.

I want to make a quick digression at this point and mention that in addition to the four priority levels on the Appliance platforms, system throttles and filters can be used to control concurrency and to reject requests that meet specified criteria.  If you are on Teradata 13.10, the "common classification" enhancement allows you to associate specific throttles or filters with queries based on their individual characteristics.  For example, in Teradata 13.10 you can set up a throttle that manages the concurrency of all queries that have an estimated processing time greater than 10 seconds (or whatever time you wish), or that access a specific table or database.  Teradata documentation outlines all the choices for scoping a filter or a throttle rule to suit your purposes.

Throttles in particular can provide a big advantage on the Appliance platform.  By limiting concurrency on lower priority work, you can increase the level of resources available to the higher priority work, boosting its consistency and performance. 

The same priority setup will be used night and day on an Appliance platform.  The same is true of throttles and filters.  Once you activate a throttle rule,  it will remain active through all of your processing windows.  You can, however, set up multiple different throttles to control different situations.

Now let’s get back on track with our main topic and take a closer look at how the query milestones work.  

Priority scheduler lives within the database's PDE level, closely linked to the operating system.  Each node in an MPP system has its own instance of the operating system, and also its own copy of the priority scheduler.  Each copy monitors and controls priorities only for the tasks running on its node.

When a query milestone is defined, it is given a threshold of CPU usage.  This is done for you on the Appliance.  When a query executes, its CPU consumption is checked against this threshold as it is running.  When its CPU usage reaches that threshold on a node, all the tasks contributing to that request will be moved under the control of a lower priority group (on that node).  

For example, when a request that begins in the Rush priority consumes 10 seconds of CPU on a node, it will be moved under the control of the Medium priority on that node.   It could be that tasks on behalf of this request that are active on a different node have already been demoted, or it could be they won’t be demoted for some time, as might be the case if the processing is skewed.   It is perfectly normal for a request on one node to reach a query milestone before the request running on another node.   Each node is taking care of itself.
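
Because DBQL records both the total CPU and the busiest-AMP CPU for each request, you can spot requests whose processing is skewed and therefore likely to reach a milestone on some nodes well before others.  Here is a sketch, assuming DBQL query logging is enabled; the view and column names are the standard DBC.QryLogV ones, HASHAMP()+1 returns the number of AMPs in the system, and the skew factor approaches 1 for evenly distributed work:

    SELECT QueryID,
           AMPCPUTime,     /* CPU seconds summed over all AMPs */
           MaxAmpCPUTime,  /* CPU seconds on the busiest AMP   */
           (MaxAmpCPUTime * (HASHAMP() + 1)) / AMPCPUTime AS CPUSkew
    FROM DBC.QryLogV
    WHERE AMPCPUTime > 0
    ORDER BY CPUSkew DESC;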

If you have experience with TASM, the query milestone functionality I am describing is similar to the exception type called "Tactical CPU Time per Node", which is only supported for workloads with a tactical enforcement priority.  The TASM exception action for such an exception must be to change the workload (a demotion), and the threshold value must be under 5 seconds.

In TASM you have a second option for demoting queries known as “CPU Time Sum over all Nodes” which does not perform the demotion until all nodes combined reach a specified sum of CPU seconds.  

Here's another way to look at the query milestones that come with the Appliance.  Allocation groups are priority scheduler components that underlie each performance group and that carry the actual priority information, such as weight.  The higher the allocation group weight, the higher its priority.

[Graphic: the performance groups and the allocation group weights behind them]

When the session submits a new request, the request-level CPU accumulation for the previous query is zeroed out, and the new request begins to accumulate CPU on each node independently of the CPU accumulated by the previous query in the session.  A demotion lasts only for the life of the current request.

A given request can be demoted more than once.  If a request starts in Rush, it can be demoted to Medium after 10 CPU seconds have been consumed on a node, and then demoted a second time to Low after 200 CPU seconds on the same node.  In both cases the CPU accumulation is counted from when the query began to execute, so in this example the second demotion comes after a further 190 CPU seconds spent in Medium.

The number of nodes in the configuration will influence the speed at which demotions take place in the life of a query. The larger the number of nodes in your configuration, the more CPU seconds are allowed to be consumed overall before a given request will begin to experience demotion.

As an example, let's consider a small configuration and then a large one.  If you have a small two-node Appliance, a request gets to consume about 20 CPU seconds overall (10 CPU seconds per node * 2 nodes) before a demotion takes place on one or both nodes.  On the other hand, if you have a 10-node configuration, the same request gets to consume about 100 CPU seconds overall (10 CPU seconds per node * 10 nodes) before it begins to see demotions happening across nodes.

 Changing the threshold of query milestones is not supported on the Appliance platforms.    If you find the thresholds are not suitable for the type of work you are performing, or if reductions in priority are happening too quickly for you and you would prefer to disable the query milestone functionality, there is an operating system command that can be issued to turn off demotions.  This is an all-or-nothing setting that will disable all defined query milestones, and it will require a restart.

To remove query milestones:  schmon -G DEFAULTSETTINGS

To revert back to query milestones:  schmon -G CLEAR

Remember that a restart will be required for this change to take effect.   And you must have operating system root privileges to execute either of those commands.  The commands must be submitted from the operating system prompt.

When considering whether you want to keep the milestones on or turn them off, ask yourself these questions:

  1. Do most of the queries that you wish to execute at the Rush (highest) and High (next highest) priorities complete using less than 10 seconds of CPU per node?  (The DBQL sketch after this list can help you measure this.)  If so, the query milestones will be helpful to you, because they will get the other, longer work out of the high priority grouping before it slows the short work down very much.
  2. Do most of the queries you expect to run in the Rush and High priorities complete using more than 10 seconds of CPU per node?  If so, the query milestone provides less value, since if most work is demoted, the benefit of having different priorities is reduced.
  3. Do you expect most queries that start in Rush, High, or Medium to require more than 200 CPU seconds per node?  If so, query milestones serve no value, since most of the work will end up running in Low and you will have lost all priority differentiation.
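
To put numbers behind the first two questions, DBQL can approximate per-node CPU for the work you plan to run at Rush or High.  Here is a sketch, assuming DBQL query logging is enabled and a 4-node system (substitute your own node count); AMPCPUTime in DBC.QryLogV is summed across all AMPs, so dividing by the node count approximates CPU seconds per node:

    SELECT CASE WHEN AMPCPUTime / 4 < 10
                THEN 'under 10 CPU sec per node'
                ELSE '10+ CPU sec per node' END AS cpu_band,
           COUNT(*) AS query_count
    FROM DBC.QryLogV
    WHERE AcctString LIKE '$R%' OR AcctString LIKE '$H%'
    GROUP BY 1;
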
Discussion
VasuKillada 12 comments Joined 10/11
26 Apr 2012

Very nice article. I wish it had more information about what other levels of workload management are possible on the TD Appliance; I'm most interested in the 2xxx platforms. First question: can operating periods be implemented? If so, is it through xschmon? How do I use it? I've never used it, just read about it in the utilities manual. Can you please share any guiding document for workload management and best practices? Last question: what kind of questions should I be asking the client/customer to properly configure workload management on an Appliance? In advance, thanks much, Carrie.

Thanks,
Vasu

carrie 431 comments Joined 04/08
27 Apr 2012

Vasudev,

The appliance platforms are intended to be simple, easy to use, with few knobs to turn or decisions to make. For that reason, you cannot change priority settings using any of the varieties of the schmon commands. (You can, however, turn query milestones on and off as explained in the posting above. ) Operating periods cannot be implemented on the appliance platforms.

You can find a manual called Appliance Implementation Guide by clicking on “Hardware Platforms” and then “2650” from the Teradata Information Products web site. There is coverage in that document that describes what workload management is available, in Chapter 13, Task 4 and Task 5.

The Information Products web site is
http://www.info.teradata.com/

Thanks, -Carrie

super25 3 comments Joined 07/11
02 May 2012

It's really cool; I have a better understanding now, but I have a question about removing milestones.
1. If query milestones are removed, do the queries get priority on a first-come basis?
2. Do we really need to remove them, or can we just assign all users to the default account "$M"?

We have 12 nodes and most of our queries run long (more than 1 hr) because they are analysis based. Does that mean we have to remove the priority scheduler?

carrie 431 comments Joined 04/08
04 May 2012

If the query milestone is removed, then no query will get its priority diminished once it has started to run. All queries will run at their original priority until they complete. All queries will receive CPU based on the relative weight of their allocation group. Queries are assigned priorities based on their user. The query milestone does not change the assigned priority of a query as it begins to run, only after it has reached the threshold. Up until that point the query runs at the priority assigned to the user.

You do not need to turn off the query milestones if you are comfortable with the demotion of queries that reach the milestones. The blog posting above describes how they function, but if you would prefer that your queries always run at the priority they were originally assigned, no matter how much CPU they consume, you can turn the milestones off.

Turning the milestones off is not the same thing as running all the queries in Medium. If you ran all the queries in Medium, then you would have no priority differences among them, even when they begin to execute. But with query milestones, the priority differences are preserved up until the point when the threshold of usage is reached, and that is determined query by query. At that point, the priority for that one query is lowered, but the priority of the other active queries is not.

Thanks, -Carrie

VasuKillada 12 comments Joined 10/11
06 May 2012

Thank you much, Carrie. Can TDWM be used to set CPU limits on resource partitions and allocation groups, or are PSA and command-level schmon commands the only way to work with CPU limits on RPs or AGs?

Thanks,
Vasu

carrie 431 comments Joined 04/08
10 May 2012

Vasu,

You are not able to set or change CPU limits on the appliance platforms.

If you are on a platform with full TASM and are on 13.10, you would use Viewpoint Workload Designer to set CPU limits. If you have full TASM and are on a release earlier than 13.10, you could use TDWM. If you are on an EDW platform and you don't have TASM, you could use PSA or schmon commands.

Teradata documentation will tell you how to make CPU settings using any of those approaches.

Thanks, -Carrie

TeraFan 3 comments Joined 01/12
29 Jun 2012

Hi Carrie,

Is changing a query's priority an available feature on an appliance? I've tried updating a $H query to $M in Viewpoint but it's not taking effect.

Thanks

carrie 431 comments Joined 04/08
02 Jul 2012

You can only change the priority (by changing the account string) of a running query if you are NOT using TASM. On the appliance platform you are not using TASM, so it would seem to me that you should be able to do what you are trying to do.

Viewpoint supports making such a change in portlets like Query Monitor. What you do in Viewpoint will cause a PM/API SET SESSION ACCOUNT request to be executed, and that also supports the ability to change priorities of a running query (if you are not using TASM).

If the change is not working for you, you can either manually review the viewpoint.log file and the DBC.sw_event_log table for more information, or engage support for assistance.

Thanks, -Carrie

Yasir Maken 1 comment Joined 09/12
26 Sep 2012

Perfect, the best explanation I could ever come across:) Thanks Carrie..

carrie 431 comments Joined 04/08
26 Sep 2012

You are very welcome! Thank you for the kind feedback. -Carrie

SmarakDas 20 comments Joined 02/12
02 Oct 2012

Hello everyone... I just need to clear up a few things about workload management. Recently, I read about Resource Partitions, Performance Groups, Performance Periods, and Allocation Groups. From what I have understood, the Performance Group is decided by the user's account string, and the Allocation Group is decided by the Performance Period (milestones), with different Performance Periods aligning to different Allocation Groups.

The hierarchy is: Resource Partition -> Performance Group -> Performance Period -> Allocation Group

My questions are:

(a) How is a query allocated to a Resource Partition? (My guess is that since the Performance Group is implicitly decided by the account string, a set of Performance Periods constitutes a Resource Partition; based upon the Performance Period, the Resource Partition is decided.)

(b) Once within a Performance Group, a query is moved, based upon milestones, between different Performance Periods and hence Allocation Groups. My question is: which is the first Performance Period allocated within a Performance Group? Is it something default?

(c) In all these scenarios, where does Workload come into play? Is a Workload equivalent to a Performance Period? Where exactly does Workload fit in the above hierarchy?

Thanks in advance.

carrie 431 comments Joined 04/08
04 Oct 2012

A query is not allocated to a resource partition directly. It is mapped to a performance group. A query is under the control of an allocation group, based on which allocation group its performance group (PG) points to. The resource partition is a collection of performance groups. So whatever PG a query belongs to will dictate the resource partition it is associated with.

A performance period is simply an attribute of a performance group. It allows a query to map to a single performance group, but based on time or CPU usage to be moved under the control of a different allocation group than the default allocation group. The performance period has no independent existence outside a performance group. It is simply a bridge between a performance group and an allocation group. It is a carry-over from previous releases where it had a more important role. Today, you can basically ignore performance periods.

Most performance groups only have a single (default) performance period. For appliance platforms, the performance groups (most of them) have built-in performance periods in order to control the automatic demotions. The first PP is the default one where the query starts; the second PP is for the allocation group after the demotion. However, you don't have to be concerned about performance periods at all or how to define them. In fact, on the appliance, you will not be allowed to add or delete performance periods. If you are using full TASM, you don't need to worry about them either, as you have things like workload exceptions to provide similar functionality.

The TASM workload is a performance group under the covers (at least in the SLES 10 versions). TASM workloads do not know about or get involved in performance periods. So if you are a TASM user, you don't have to be concerned or ever think about performance periods. Just set up workload exceptions if you need them, using whatever exception criteria you like.

Thanks, -Carrie

rfenwick 1 comment Joined 10/04
15 Jan 2013

Hi Carrie,
Am I understanding correctly that it is not possible to turn off milestones at the AG level? That there is no way to reprioritize an executing query (reclassify, in TASM terms)?
My use case is as follows:
A batch script is running late, doing a trunc and load (insert/select) to a critical table. The nature of the query is that it exceeds the milestone and steps down. Because there is other work on the system, I want this one query to be moved to R1 and stay there for just this one day.
Short of throttling all other work into the delay queue, is there nothing that can be done to increase this query's priority?
ruth

Ruth Fenwick

carrie 431 comments Joined 04/08
22 Jan 2013

Hi Ruth,

I am assuming you are talking about an Appliance platform. On the Appliance you can issue a SET SESSION ACCOUNT to change the priority of a running query. It is one of the monitor commands, which are documented in Teradata documentation.

SET SESSION ACCOUNT allows you to change the performance group of the query, which means the query will begin its CPU accumulation all over again. However, the query is still at risk of being demoted when it reaches the new performance group's query milestone threshold.
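
For reference, here is a sketch of the SQL form of the statement (the account string is hypothetical). Issued this way, it changes the account for subsequent requests in your own session; changing a request that is already running under another session goes through the PM/API monitor form that Viewpoint issues:

    SET SESSION ACCOUNT = '$R_CRITICAL' FOR SESSION;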

If query milestones are on, there is no way to keep a query running at a high priority while exempting it from ever being stepped down.

You also have the option of turning off the automatic query milestones if they are not working for you, but then they will be off for all priorities and for all queries.

Thanks, -Carrie

Phil 2 comments Joined 09/09
11 Apr 2013

Carrie,
Great detail about the workload management capabilities on an Appliance. I figured it was not available, but I wanted to check with you to see if we could add a CPU cap on an AG on an appliance. We get some high-PJI queries that take over the system, and we end up having to abort them. I would like to be able to manually move them into a penalty box with a CPU cap and let them complete if the process is business critical and we have not had time to tune it.
 
Thanks
Phil

carrie 431 comments Joined 04/08
12 Apr 2013

Hi Phil,
 
Sorry, CPU limits on allocation groups are not available on the appliance platform.  The closest thing to that is the query milestones described in this posting.  
 
Thanks, -Carrie

rlk3 2 comments Joined 05/09
04 Sep 2013

Hi Carrie
 
I wonder if you can tell us how IDs with the account string "DBC" are handled under this scheme on Appliances?  It seems that such accounts include the users DBC, TDWM, and Viewpoint. 
 
What AG would their queries start in?  And are they subject to demotion??
 
Thanks
 
Ricky

carrie 431 comments Joined 04/08
05 Sep 2013

Hi Ricky,
 
By default, all requests will run in $M (allocation group 2) on the appliance unless their user's account string has been modified to point to one of the other performance groups.   
 
You can change the account string on all those users except for user DBC. So, queries submitted from user DBC will always run in $M.
 
All requests are subject to demotion, no matter who the user is who submits the request.   There is no exemption for queries submitted by DBC. 
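
As a quick way to see which performance group each user's account strings map to, you can query the data dictionary (a sketch, assuming the standard DBC.AccountInfoV view; on the appliance the performance group token is the first two characters of the account name):

    SELECT UserName, AccountName
    FROM DBC.AccountInfoV
    WHERE AccountName LIKE '$%'
    ORDER BY UserName;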
 
Thanks, -Carrie

geethareddy 104 comments Joined 10/11
30 Dec 2013

Hi Carrie,

When you said we can ignore the performance periods now with TASM, is this applicable both for the appliance (we are on a 2690 for dev) and the EDW platform (we are on a 6650 for prod)? I am asking because I am struggling to find a straight answer about the significance of account strings on both appliance and enterprise platforms. By a straight answer I mean: do we still need to focus on dividing the user community into the L, H, M, R performance groups (PGs), keeping in mind that the PG decides the right AG (allocation group), even though we have TASM exceptions and a variety of classification criteria?
If so, is the performance group part of the account string used only for grouping users into one particular workload, and not for allocating resources?
For example: I have 300 users with $L$APPL&S&D&H, and these users are classified into a tactical workload (WL1) based on the 'Account String' criteria.
That workload falls into the tactical resource partition (which gets 50% of system resources). So will those 300 users get resources from the default RP0 via AG 1, or get the tactical resources from RP01?

Id  PartitionName  Weight  Limit
0   Default        25%     none
1   DSS            25%     none
2   Tactical       50%     none

Q2) When we define a CPU limit (System Values) under Priority Distribution, I understand that is the maximum percentage of the system CPU the database can use. For example, if the CPU limit is 95, the DBMS can use 95%; which process can make use of the remaining 5%? And when I mouse over the AGs under the Priority Distribution tab, I notice that all the AGs show 'No CPU Limit' except the AG for the penalty box, which shows 'CPU Limit: 5'. I see the same from schmon below (note the entry for the penalty box AG with a CPU limit of 5). So does that mean the rest of the CPU will be used for penalty box purposes at all times?
Can you please elaborate more on this?

Id  Type  Wght  Lim
1   N       7   none
2   N      13   none
3   N      27   none
4   N      53   none
5   NX    100   none
6   N       3   5
7   N      52   none
9   N      13   none
8   N      26   none
10  N       6   none

Q3) CPU Time (Sum over all Nodes):
For example, I have TactUserID (meant for tactical queries). That ID's queries run under the tactical WL (under tactical AG resources), but due to skew, those queries execute within the same tactical WL for more than an acceptable time. I heard that the CPU Time exception resolves that problem, though it cannot do so effectively for sub-second queries. Can you please explain how it works? When I say it resolves the problem, does this exception identify the query ahead of time, based on the CPU sum calculated on all the ACTIVE nodes, and demote the query to the next lower-level WL?
Tactical CPU Time per Node:
This exception option for tactical queries implies that less-tuned tactical queries will also run at the HIGHEST priority for a time, preferably as little time as possible, before the Teradata Priority Scheduler moves the request to run in an AG with lower weight.
My questions on the above points are:
a) How long will the less-tuned tactical queries run at the same priority? I am thinking this is the opposite of the CPU Time (sum over all nodes) option (because the CPU Time exception does not allow the query to keep running at the same priority, whereas this option does).

b) What kind of advantage are we going to get from this exception option? It is almost like the query milestone feature. I understand that it is meant for customers who cannot tune their queries but still want to run them at a tactical priority.

c) From the name of this exception, does it decide the exception action based on the amount of CPU calculated on a per-node basis? Please explain in which way it considers the aggregated amount of CPU.

 

Thanks,
Geeta

carrie 431 comments Joined 04/08
02 Jan 2014

EDW platforms with TASM are quite different from Appliance platforms.  If you have TASM on the system, then performance periods no longer have a role.  The comments above in this blog posting only apply to Appliance-type platforms, and appliance platforms DO NOT have TASM.  It is only on the appliance platform that you need to divide users into the L, M, H, and R performance groups.  You do not have workloads or workload exceptions with the Appliance; workloads and workload exceptions are only available with TASM.
 
The performance group in the account string is only used in non-TASM systems.  TASM systems use workloads, but non-TASM systems do not, and so the appliance platforms rely on the account string performance group to map to an allocation group.  If you are on a TASM system, and a query classifies to a tactical workload within a tactical resource partition, then that is the workload that will be used by the tactical queries.
 
Unfortunately, Developer Exchange is not equipped to (and I am not able to) provide support at the level you have been seeking recently.  In addition, sub-questions listed within your questions #2 and #3 are beyond the scope of this particular blog posting.   Off-topic questions are usually more appropriate when posted on one of the general Teradata technical forums.
 
To help get answers to those questions, and others you may have, please read the orange book on CPU limits.   It describes how system level and allocation group level CPU limits work, separately and together.   Also read the TASM orange book for more detail on how tactical exceptions work.  Those points are thoroughly documented in those documents.
 
Thanks, -Carrie

geethareddy 104 comments Joined 10/11
02 Jan 2014

Thank you for your comments and suggestions Carrie. I will do.

Thanks,
Geeta

geethareddy 104 comments Joined 10/11
06 Jan 2014

Hi Carrie,
I am just asking this to know what is the worst case scenario it leads to if i keep one or couple of workload definitions in to RP0?
I heard many times that RP0 is strictly intended for internal activities. But what is the downside if we take a share from it?
Please help me understand this.
 

 

Thanks,
Geeta

carrie 431 comments Joined 04/08
08 Jan 2014

Hi Geeta,
 
You can use RP0 for user-assigned work if you wish.  The recommendation to avoid RP0 is made because it's usually seen as cleaner to keep the internal activity that runs very lightly in RP0 separate from user-defined work, so you can more easily monitor each type of work and ensure that the priority given to the internal work is not being shared or diluted by other, less critical work.  I think it's best to use the other four resource partitions for user work if you have the choice, and to avoid using RP0 unless you require a fifth RP for some reason.  Then, if you need to use it, go ahead.
 
Thanks, -Carrie
