A Quick Guide to Using Short Query Acceleration (SQA) for Faster Queries on Amazon Redshift
This is a cross-posting from https://www.intermix.io/blog/short-query-acceleration/
In November 2017, AWS introduced Short Query Acceleration (SQA) for Amazon Redshift. The promise of SQA is to speed up the execution of short-running queries, with more predictable query execution times. That, of course, sounds amazing. In this post, we’ll cover what SQA is, SQA best practices, and what trade-offs you’re making with SQA.
Table of Contents
- Redshift Users' Biggest Problem: Slow Queries
- How Redshift SQA Works
- 3 Redshift Concepts to Understand before Enabling SQA
- Amazon Redshift WLM Best Practices (Before SQA)
- Does your Redshift cluster need SQA, or can you do it all in WLM?
- The Downside of SQA: Trading Memory for Concurrency
- Summary
Redshift Users’ Biggest Problem: Slow Queries
A common issue with Amazon Redshift is that it slows down as your workload volume increases. Surveys of Redshift users show that “slow queries” and “slow dashboards” are two of the most common complaints. “Slow queries” are enough of an issue for the Redshift product team to launch a feature like SQA.
But using SQA without any other adjustments to your cluster is not a recipe for success. There are other levers to pull first. And then SQA becomes one part of your performance tuning strategy.
How Redshift SQA Works
Short Query Acceleration uses machine learning to predict the execution time of a query. Queries predicted to be short-running are moved to a dedicated ‘short query’ queue for faster processing.
Starting in 2018, SQA is enabled by default for all clusters. Users can still disable SQA under Workload Management in the Redshift console.
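If you prefer to manage that setting in code, the sketch below shows one way to read and flip it through the cluster’s parameter group. This is a minimal, illustrative example rather than an official recipe: the parameter group name my-wlm-params is a placeholder, and it assumes a manual WLM setup where SQA appears as the short_query_queue flag inside the wlm_json_configuration parameter.

```python
# Minimal sketch: inspect and toggle SQA through the cluster's parameter group.
# "my-wlm-params" is a placeholder parameter group name.
import json
import boto3

redshift = boto3.client("redshift")

# Read the current WLM configuration (a JSON array of queue definitions).
resp = redshift.describe_cluster_parameters(ParameterGroupName="my-wlm-params")
wlm = next(p for p in resp["Parameters"]
           if p["ParameterName"] == "wlm_json_configuration")
queues = json.loads(wlm["ParameterValue"])
print("SQA enabled:", any(q.get("short_query_queue") for q in queues))

# Flip the flag wherever it appears and write the configuration back.
for q in queues:
    if "short_query_queue" in q:
        q["short_query_queue"] = False  # set True to re-enable SQA

redshift.modify_cluster_parameter_group(
    ParameterGroupName="my-wlm-params",
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(queues),
    }],
)
```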
3 Redshift Concepts to Understand before Enabling SQA
Whether you choose to disable SQA or not, it is important to understand 3 key Redshift concepts to make a more informed choice:
- Query queues
- User and query groups
- Workload management (WLM)
These concepts represent tools you can use to fine-tune your workloads. Using them in the right way will have a much bigger impact than flying solo with SQA. Adding SQA into the mix is “a” performance tuning setting, but not “the” setting. Let’s take a look at these concepts.
Query Queues
Amazon Redshift routes user queries to queues for processing. By default, there are two queues available: one for superusers, and a default queue for all other users.
The main purpose of the superuser queue is troubleshooting. It can only run one query at a time. You can’t configure it any other way.
The default queue has a concurrency of five, with memory equally split across those five slots. You can change the concurrency of the default queue, and also add more queues.
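You can see how your cluster’s queues are currently configured by querying the WLM system tables. The sketch below is a rough illustration, assuming the redshift_connector driver and placeholder connection details; it reads STV_WLM_SERVICE_CLASS_CONFIG, where service class 5 is the superuser queue and the user-facing queues start at service class 6.

```python
# Inspect the configured WLM queues (service classes): slot count and
# working memory per slot. Connection details are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="analytics",
    user="admin",
    password="****",
)
cur = conn.cursor()
cur.execute("""
    SELECT service_class, name, num_query_tasks AS slots, query_working_mem
    FROM stv_wlm_service_class_config
    WHERE service_class >= 5          -- 5 = superuser queue, 6+ = user queues
    ORDER BY service_class;
""")
for row in cur.fetchall():
    print(row)
```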
User and Query Groups
It’s impractical to assign each individual query to a queue. With user and query groups, you can group queries based on access rights and workload patterns. The grouping then allows you to “bulk-assign” a set of queries to a specific queue at runtime.
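For example, a reporting tool can tag its session with a query group so that WLM routes everything it runs to the matching queue. The group name ‘dashboard’, the table, and the connection details below are placeholders; SET query_group and RESET query_group are the statements doing the routing.

```python
# Route a session's queries to a specific WLM queue by setting its query group.
# "dashboard" is a placeholder group name that the target queue would list.
import redshift_connector

conn = redshift_connector.connect(host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
                                   database="analytics", user="dashboard_user",
                                   password="****")
cur = conn.cursor()
cur.execute("SET query_group TO 'dashboard';")   # queries now land in the matching queue
cur.execute("SELECT count(*) FROM sales WHERE sold_at >= date_trunc('month', current_date);")
print(cur.fetchone())
cur.execute("RESET query_group;")                # back to default routing
```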
Workload Management (WLM)
WLM is where things come together. It’s where you:
- Define your queues
- Associate your user and query groups to a queue
- Set concurrency/slot count and memory for a queue
The point of defining more than one queue is to separate and protect your workloads from each other. The three common workload types are loads, transforms and ad-hoc queries.
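As an illustration of how those pieces fit together, here is a sketch of a wlm_json_configuration value with one queue per workload type. The group names, slot counts, and memory splits are placeholders to tune for your own cluster, not recommendations.

```python
# Illustrative manual WLM layout: one queue per workload type plus the
# default queue. All names and numbers are placeholders.
import json

wlm_config = [
    {"user_group": ["etl_loaders"],  "query_concurrency": 2, "memory_percent_to_use": 20},
    {"user_group": ["transformers"], "query_concurrency": 3, "memory_percent_to_use": 40},
    {"query_group": ["dashboard"],   "query_concurrency": 5, "memory_percent_to_use": 30},
    # Default queue; keep SQA on via the short_query_queue flag.
    {"query_concurrency": 5, "memory_percent_to_use": 10, "short_query_queue": True},
]
print(json.dumps(wlm_config))  # use as the wlm_json_configuration parameter value
```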
Amazon Redshift WLM Best Practices (Before SQA)
To get the most out of SQA, it is important to set up your WLM correctly. Broadly speaking, the 4 recommended best practices for setting up your cluster are:
- Separate your users (one user, one login)
- Group those users by workload patterns (load, transform, ad-hoc)
- Create three WLM queues (one for each workload pattern)
- Assign each group to its corresponding queue
Related Reading: 3 things to avoid when setting up an Amazon Redshift cluster
That’s when you can start fine-tuning your workloads. The two key choices are picking the right (1) slot count and (2) memory percentage for each queue. Slot count determines your concurrency. Giving queries enough memory keeps them from spilling to disk, which degrades performance.
Getting your WLM settings right will allow you to get the most out of your cluster resources. Every query will have a slot to run, with enough memory. It’s the first step to workload scalability.
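A rough way to check whether your memory settings are adequate is to look for queries that spilled to disk. The sketch below assumes the redshift_connector driver and placeholder connection details, and uses the SVL_QUERY_SUMMARY system view, whose is_diskbased flag marks query steps that fell back to disk.

```python
# Find recent queries with disk-based steps -- a sign their queue's memory
# per slot is too small. Connection details are placeholders.
import redshift_connector

conn = redshift_connector.connect(host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
                                   database="analytics", user="admin", password="****")
cur = conn.cursor()
cur.execute("""
    SELECT query, count(*) AS disk_based_steps
    FROM svl_query_summary
    WHERE is_diskbased = 't'
    GROUP BY query
    ORDER BY disk_based_steps DESC
    LIMIT 20;
""")
for query_id, steps in cur.fetchall():
    print(query_id, steps)
```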
Workload Management and Short Query Acceleration
Queries experience a delay when there are not enough concurrency slots available. A short-running query might take 5 seconds to execute. But it takes 2 minutes to return if it’s stuck in the queue behind a query that takes 1 minute and 55 seconds to finish. With the help of SQA, these short queries can hop to the express queue.
But what if you’ve already tuned your WLM for your workloads? That implies that each query will have a concurrency slot available at runtime. There will be no queries stuck in the queue. And that means there’s nothing to accelerate.
In other words, if you fine-tune your cluster by using the WLM, you probably will not need SQA. And the upside of tuning your WLM vs. only using SQA is much bigger. With WLM, every query will run fast. With SQA, only short-running queries may run fast.
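A useful sanity check is whether SQA is picking up any queries at all on your cluster. The sketch below assumes the redshift_connector driver and placeholder connection details; queries handled by SQA are logged in STL_WLM_QUERY under service class 14, the SQA queue.

```python
# Count queries routed through the SQA queue (service class 14) per day.
# Connection details are placeholders.
import redshift_connector

conn = redshift_connector.connect(host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
                                   database="analytics", user="admin", password="****")
cur = conn.cursor()
cur.execute("""
    SELECT trunc(service_class_start_time) AS day, count(*) AS sqa_queries
    FROM stl_wlm_query
    WHERE service_class = 14              -- SQA queue
    GROUP BY 1
    ORDER BY 1 DESC;
""")
for day, n in cur.fetchall():
    print(day, n)
```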
Does your Redshift cluster need SQA, or can you do it all in WLM?
So why did Amazon develop SQA if using workload management means you may not need it?
Turns out that fine-tuning your WLM is hard. The standard way of doing it is a combination of running scripts and guessing. But that may lead to more frustration than results. Therefore, a lot of users fall back to using the default queue coupled with SQA.
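If you want to take some of the guessing out of that loop, a small script over the WLM system tables already tells you where queries wait. The sketch below assumes the redshift_connector driver and placeholder connection details; STL_WLM_QUERY records queue and execution time (in microseconds) per query and service class.

```python
# Compare time spent waiting in the queue vs. executing, per WLM queue.
# Connection details are placeholders; times in STL_WLM_QUERY are microseconds.
import redshift_connector

conn = redshift_connector.connect(host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
                                   database="analytics", user="admin", password="****")
cur = conn.cursor()
cur.execute("""
    SELECT service_class,
           count(*)                          AS queries,
           avg(total_queue_time) / 1000000.0 AS avg_queue_secs,
           avg(total_exec_time)  / 1000000.0 AS avg_exec_secs
    FROM stl_wlm_query
    WHERE service_class >= 6                 -- user-defined queues
    GROUP BY service_class
    ORDER BY avg_queue_secs DESC;
""")
for row in cur.fetchall():
    print(row)
```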
But there’s a huge payoff if you get WLM right. The data team at Remind used Intermix to improve their Redshift WLM performance. Their optimization efforts delivered a 99.9% reduction in queue wait times. And they did so while reducing the number of nodes from 16 to 12.

The Downside of SQA: Trading Memory for Concurrency
But even for a fine-tuned cluster, SQA can still come in useful. A good example is an unusual concurrency spike.
Let’s take an e-commerce company as an example. It’s the first day of a new month. Everybody in the company wants to know “how many new widgets did we sell last month?” And then break that down further by channel, region, customer segment, etc.
To get those insights, a lot of complex, long-running dashboard queries need to be triggered. That may push your query volume above your concurrency limits.
In such a situation, having SQA enabled means your short-running queries will still produce fast results. That’s why it’s generally a good idea to leave SQA enabled by default.
However, using SQA comes at a cost, in the form of reserved memory. Enabling SQA means adding one more queue, and no other queue can use the memory associated with that added queue. That memory would otherwise translate into concurrency for those other queues.
In short, with SQA you are trading off memory for concurrency. The only way to make up for that trade-off is to add more memory by adding more nodes, which translates to higher costs.
Summary
SQA adds an “express queue” to your cluster for short-running queries. But you still need to configure your WLM for your workloads to get blazing fast performance. Otherwise, SQA will not produce the expected results, and may even lead to wasted memory.