“Why is this query running so slowly?”
It’s one of the phrases most commonly heard by DBAs and database developers when dealing with OLTP systems.
Luckily, SQL Server provides a range of native options for determining exactly what’s occurring under the hood. Using execution plans, it’s possible to see the exact roadmap the SQL engine is following to retrieve data. This article will review the basics of reading and interpreting execution plans, then dig deeper into the internal processes and mechanics used by the SQL Server optimizer.
Using SQL Server Management Studio (SSMS), the Actual Execution Plan can be included as a tab in the result set by selecting it as an option under “Query” (Or using the CTRL-M hotkey).
Once activated, the Execution Plan tab is available for any query issues in SSMS. (Note: All queries are performed against AdventureWorks. This is SQL Server's sample database and can be downloaded here.).
The key operation here is a Clustered Index Seek; since the column referenced in the above query has a clustered index, the optimizer can seek directly to that row with minimal effort. This is the simplest possible example, but the plan can become complicated quickly:
Each operation in the SQL query is represented by a corresponding graphic in the execution plan. The higher the percentage, the larger the workload. In this case, the clustered index scan performed on SalesOrderDetail is the majority of the cost of this query. The Sort (ORDER BY) is next, while the rest are nominal. Note that the percentages are estimations rather than exact figures; the total will often go over 100%.
For more detail, you can hover your cursor over each operator:
The key items are the estimated costs; I/O and CPU will reflect the amount of effort required by each to execute the query. Values of .01 demonstrate virtually no workload.
MSDN has an extensive list of all possible operators used in this capacity.
Common Expensive Operators
In general, operators which can seek to individual rows on sorted data are the most efficient.
These operators will be the most expensive:
Table/Index scans: any operator that performs a scan is looking at a row range. The performance hit will scale up as the size of the data set increases.
Hash joins: Used by the optimizer for set-matching operations (joins, intersections, union). The rows from one table are inserted into a hash bucket based on the value of the hash function. Then, rows from the other table are scanned: for each scanned row, the corresponding hash bucket row is checked and the matches are produced as output. Hash joins occur due to a lack of usable indexes, and are CPU-intensive.
Bookmark lookup: Indicates that while an index was used to locate the row(s), additional data that was not included in the index was required. These can often be eliminated by adding the referenced column(s) as included columns in the index.
Sort: A fairly expensive operator; sometimes this can be eliminated by additional indexing.
Actual vs. Estimated Query Plans
In addition to including the actual execution plan as part of the query results, it’s possible to get a quicker estimation of SQL Server’s workload using the estimated execution option:
This is available in SSMS under ‘query’ or by using hotkey ‘CTRL-L’.
This doesn’t execute the query; it only approximates what the optimizer believes will be the query plan. It’s particularly helpful for long-running queries, when waiting for the result set to return would be tedious.
This is also the version that is stored in the query cache (described in detail below).
One of the most critical elements the optimizer uses to determine query plans is the cardinality of the result set. Cardinality refers to the number of elements in a set; in this case, the number of rows in the result sets being returned by the engine.
Its importance lies in the query plan's reliance on cardinality to plot an optimal path through the data. For example, the optimizer will use information about join predicates, selectivity, distinct values, and duplicate rows to create the query plan; all these figures are established by the cardinality estimator.
In SQL Server 2014, the cardinality estimator was re-written to address known issues. Among the most notable changes:
The “Ascending Key” problem has been rectified.
This problem occurred when values were inserted into an indexed column, but statistics had not yet been updated to reflect the inserts. For example, if a date (2015/01/01) was being used as the key value and statistics had last been updated on 2014/12/28, those rows added with 2015/01/01 would not be included in cardinality estimations. Depending on the load frequency, this could result in a significant amount of data rendered invisible to the estimator, potentially generating a less than optimal query plan.
The new estimator also now assumes that there is some correlation amongst filtered predicates on the same table; The Microsoft-provided example uses the make and model in a cars table to demonstrate this. Logically, there is a correlation between the make and model of a car, and the new estimator adds an exponential component to the algorithm it uses to reach this figure now. However, no correlation is assumed on filtered predicates on different tables.
These modifications were based on likely cardinality estimations and sample data in use since SQL Server 2000. They don’t require any action on the part of the developer, but will result in different query plans from those built on previous SQL Server versions.
For an incredibly in-depth look into this topic, a white paper on cardinality estimation in SQL 2014 is available.
Storage of Query Plans
In order to effectively re-use query plans, SQL Server stores plans and statistics internally. Caching plans prevents the need to run through the expensive planning process for queries and procedures that are commonly executed on the server. These can be viewed by combining a handful of built-in SQL Server Dynamic Management Views (DMVs):
This is especially useful when troubleshooting scheduled or off-hours stored procedures and queries; the XML can be opened in SSMS:
It can also be viewed graphically by saving it as a .SQLPLAN file, then opening directly with SSMS:
The cache can be emptied by executing DBCC FREEPROCCACHE on the SQL instance.
Upcoming Features – Query Data Store
In late May 2015, Microsoft announced that a new, interesting feature will be available in SQL Server 2016 (edition-specific details were not available at this time).
The Query Data Store is a service that performs as a historical tracking system for SQL execution plans. As a query plan for a specific stored procedure/query evolves over time, the history of those changes will be tracked in the query data store. In addition, performance metrics will be maintained, along with the ability to force the usage of older plans.
This has the potential to be an excellent troubleshooting tool for determining why queries and procedures are experiencing performance degradation. More details should be forthcoming in mid-2016.