Flow Control and Spawned Work
There’s a comment on an earlier blog posting of mine from 2009 titled Controlling the Flow of Work in Teradata. The comment poses a question that is more reasonably answered by making a second posting on flow control.
The question was this: when a spawned work message cannot be placed on the message queue on one AMP, but can on the others, will all AMPs reject the spawned work message? Further, what happens if the spawned request originates from a step where WorkNew is itself in flow control on one of the AMPs? Below is an explanation of how this all works. Check out the earlier Flow Control posting for background.
Flow Control for WorkNew (Work00) messages
When an AMP temporarily closes the door to new work, that AMP is in a state that we call “flow control.” This state often lasts only a fraction of a second, and while it lasts the AMP turns away newly arriving messages. One type of arriving work is the work messages that the dispatcher sends to the AMP for processing. If a number of such messages are already waiting on the message queue for an AWT, and the number waiting has exceeded the message queue length for the WorkNew work type, then that AMP will not be able to accept another such message.
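To make the per-AMP decision concrete, here is a minimal sketch in Python. It is a toy model, not Teradata code: the class name, field names, and the numbers in the usage lines are all made up for illustration, and it treats a full queue as the trigger for flow control.

```python
from dataclasses import dataclass


@dataclass
class AmpWorkTypeState:
    """Toy state for one work type on one AMP (hypothetical names and limits)."""
    free_awts: int      # AWTs currently available to this work type
    queued: int         # messages already waiting on the message queue
    max_queue_len: int  # message queue length allowed for this work type

    def can_accept(self) -> bool:
        # A message is accepted if an AWT is free, or if there is still
        # room to queue it until an AWT becomes free.
        return self.free_awts > 0 or self.queued < self.max_queue_len

    def in_flow_control(self) -> bool:
        # Flow control: the AMP can neither run nor queue another message.
        return not self.can_accept()


# Illustrative values only; these are not Teradata defaults.
busy = AmpWorkTypeState(free_awts=0, queued=20, max_queue_len=20)
idle = AmpWorkTypeState(free_awts=3, queued=0, max_queue_len=20)
```

In this sketch `busy.in_flow_control()` is true because no AWT is free and the queue is full, while `idle` can take the message immediately.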
While each AMP makes this decision independently of other AMPs, the incoming message cannot be accepted by any AMP unless all AMPs can either provide an AWT or queue up the message. If even one AMP is in a state of flow control for that work type, the message is returned to the sender and will be retried. In fact, the full message is not stored in AMP memory until all AMPs can accept it.
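The all-or-nothing behavior described above can be sketched as follows. This is only an illustration under my own assumptions: the function names are hypothetical, each AMP's answer is reduced to a simple True/False, and the retry loop stands in for whatever retry logic the sender actually uses.

```python
from typing import Callable, Sequence


def try_deliver(amps_can_accept: Sequence[bool]) -> bool:
    """Return True only if every AMP can either run or queue the message."""
    return all(amps_can_accept)


def deliver_with_retry(poll: Callable[[], Sequence[bool]], max_attempts: int) -> int:
    """Retry until every AMP accepts the message.

    `poll` is a hypothetical callback returning one accept/reject answer per AMP.
    Returns the attempt number that succeeded, or -1 if none did.
    """
    for attempt in range(1, max_attempts + 1):
        if try_deliver(poll()):
            return attempt
    return -1


# One AMP is in flow control on the first attempt and has recovered by the second.
answers = iter([[True, False, True], [True, True, True]])
succeeded_on = deliver_with_retry(lambda: next(answers), max_attempts=5)
```

Here a single rejecting AMP is enough to send the message back on the first attempt, even though the other two AMPs could have taken it.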
Each work message represents one step in a request, not the entire query. When the message being retried is from the dispatcher, no work on behalf of that request step can start while the step is being retried. However, the request itself may already be partially completed at that point, as previous steps may not have encountered the flow control state and could have completed normally. So flow control can occur in the middle of a query, and if the flow control condition persists, it could unexpectedly delay that query’s execution, even though earlier steps have completed.
Flow Control for WorkOne (Work01) messages
WorkOne messages are messages that represent spawned work. Spawned work happens when a WorkNew message is starting up on all AMPs and the step requires an activity such as row redistribution or table duplication, something that requires more than one AWT per AMP.
Before a message can be sent to all AMPs requesting WorkOne AWTs, all of the AMPs must have successfully acquired a WorkNew AWT for that step. If any of the AMPs queued up the WorkNew message or were in flow control, then the message that wants to spawn WorkOne AWTs on behalf of that step will not be sent.
When all the AMPs have their WorkNew AWTs, the last AMP to get its WorkNew AWT will spawn the message for WorkOne AWTs to all AMPs. If one of the AMPs had been in flow control and unable to process its WorkNew message, then the spawned work message will not be sent. It is only sent when all AMPs have been able to provide a WorkNew AWT.
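A small sketch may help show why the spawned message cannot go out early. The `StepCoordinator` class below is hypothetical and only models the bookkeeping for one request step; it is not how Teradata implements this internally.

```python
class StepCoordinator:
    """Toy tracker for one request step (hypothetical, for illustration only)."""

    def __init__(self, amp_ids):
        self.pending = set(amp_ids)  # AMPs still waiting for a WorkNew AWT
        self.spawned_by = None       # AMP that sent the WorkOne message, if any

    def worknew_acquired(self, amp_id) -> bool:
        """Record that amp_id got its WorkNew AWT.

        Returns True only for the call that completes the set, which is the
        point where the WorkOne message is sent to all AMPs.
        """
        self.pending.discard(amp_id)
        if not self.pending and self.spawned_by is None:
            self.spawned_by = amp_id
            return True
        return False


step = StepCoordinator(amp_ids=[0, 1, 2])
first = step.worknew_acquired(0)   # AMP 1 and AMP 2 still pending
second = step.worknew_acquired(2)  # AMP 1 still pending, e.g. in flow control
last = step.worknew_acquired(1)    # last AMP in, so the spawn happens here
```

As long as any AMP is still waiting for its WorkNew AWT, `pending` is non-empty and nothing is spawned.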
Flow Control Managed by Work Type
Flow control is managed independently for each work type. It is not likely you will go into a state of flow control on more than one work type at a time. This is because if you are already in flow control for WorkNew work types on an AMP, that will make it less likely that there will be a demand for WorkOne work types, as there is a dependency between them. Being in flow control for WorkNew lightens the load on WorkOne AWTs.
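As a rough illustration of tracking flow control per work type, the sketch below keeps a separate queue count and limit for each work type on one AMP. The enum, the function name, and the numbers are my own assumptions, and a full queue is again used as the stand-in for the flow control condition.

```python
from enum import Enum


class WorkType(Enum):
    WORK_NEW = "Work00"
    WORK_ONE = "Work01"


def flow_control_by_type(queued, limits):
    """Return, per work type, whether this AMP's queue for that type is full.

    Both arguments are dicts keyed by WorkType (hypothetical structure).
    """
    return {wt: queued[wt] >= limits[wt] for wt in WorkType}


# Illustrative numbers only: WorkNew is backed up, so little WorkOne work is spawned.
status = flow_control_by_type(
    queued={WorkType.WORK_NEW: 20, WorkType.WORK_ONE: 3},
    limits={WorkType.WORK_NEW: 20, WorkType.WORK_ONE: 20},
)
```

Each work type gets its own answer, so WorkNew can be in flow control on an AMP while WorkOne is not.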