Sunday, November 9, 2014

Newest SQL Server Feature – “Query Store”

Very important disclosure: 
  • Do not try described functionality in production. Do not try it in test also. That functionality is not available for the public yet.
  • I did not try the new functionality either. That means I might not cover everything and not everything covered is covered precisely.
  • Do not build false expectations, some functionality might disappear in a release of the product, but some new functionality might be added.
  • Whole blog post is based on a presentation of Microsoft Principal Architect Conor Cunningham on PASS Summit 2014 in Seattle.
Introduction:
  • You might be trapped in a situation when SQL queries suddenly start to work slowly. Why that is happen? There are some reasons when your cached plan can be invalidated or just simply pushed out of plan cache because of memory pressure. That would cause query plan recompilation and there is no any guarantee that new plan will be the same as the old good one.
  • I've seen that situation a lot for old and current SQL Server versions. These situations are difficult to troubleshoot and getting rid of the bad plan not always possible. Usually, updating statistics is the cure, which caused bad plan invalidation and recompilation to the good one. Since SQL Server 2012, updating statistics causes plan invalidation only if data have significantly changed. That means you have to search for a bad plan and delete it manually. All of that takes time and as I mentioned earlier, there is no guarantee, that newly compiled plan would be the best.
  • Didn't you wish to store your “good plans” in cache forever and not allow SQL Server to push them out? Microsoft have heard yours pray and developed “Query Plan Storage”, which will allow you to manage query plans in the way you want.

New Functionality description:
  • Store the history of plans for each query;
  • Capture runtime statistics of each plan per time unit (default is 1 hour) (Max, Min, Avg, Last, Std dev for duration, IO writes/Reads, CPU waits, memory usage, DOP, rowcount values);
  • Compile regular queries with parameters as “auto-parameterized” queries;
  • Identify queries that have “gotten slower recently”;
  • Allow you to force plans easily;
  • Make sure this works across server restarts, upgrades, and query recompiles;
  • DBA can determine query retention time for Query Store;
Additional functionality:
  • Establishing base lines for query executions;
  • Performance regression testing after server upgrades or migrations;

Facts:
  • Query Texts start at the first character of the first token of the statement; end at last character of last token;
  • Comments before/after do not count;
  • Spaces and comments inside *do* count;
  • Statement_sql_handleis MD5 hash of text;
  • Context_settingscontains one row per unique combination of plan-affecting settings;
  • Different SET options cause multiple “queries” in the Query Store;
  • Plan caching/recompilation behavior unaffected;
  • Statistics are recorded when query execution ends.

New Functionality in my own understanding:
  • Each executed query in your system will not just go to Query Plan Cache, but also will be recorded into Query Store;
  • Along with a query, runtime execution parameters, such as duration, IO, CPU etc. also will be recorded;
  • Query Store will permanently save all plans and stats, which will be available even after server reboot. That means lower time for plan recompilations on busy servers.
  • Plans will be available even after server migration. That means you cannot just reuse “old good plans”, but also compare performance change on the new hardware for individual queries.
  • Query Store might contain more than one plan for a single query;
  • DBA will be able to mark a single the best plan for a particular query to be chosen by SQL engine for execution;
  • Privileges to mark a plan for execution will be available for an Admin role ONLY;
  • New functionality might be available initially in SQL Azure and then implemented in the next version of SQL Server. Currently that functionality is available for SQL Server engine version 13.0.11.152.

I’ve tried to capture as many details as possible and come up with reasonable explanation.

Captured list of new DMVs:
  • sys.query_store_query_text
  • sys.query_store_query
  • sys.query_store_plan
  • sys.query_store_runtime_stats
  • sys.query_store_runtime_stats_interval
  • sys.query_context_settings
Captured lists of new DMVs’ columns:
sys.query_store_query_text
Column name Type Description
query_idbigint*ID of the query text. This is unique value within a (server or database)*
query_sql_textnvarchar(max)Text of the SQL query.
statement_sql_handlevarbinary(64)A token that refers to the statement that the query is part of. MD5 hash to text.
More columns are available

sys.query_store_query
Column name Type Description
query_idbigint*ID of the query. This is unique value within a (server or database)*
query_text_idbigint*ID of the query text. This is the same value as the query_text_id in the sys.query_store_query_text catalog view.
context_settings_idbigint*ID of the query context settings. This is the same value as the context_setting_id in the sys.query_context_setting catalog view. (Refers to context settings such as “quoted identifiers” etc.)*
object_idintID of the object (for example, stored procedure or user-defined function) for this query plan. For ad hoc and prepared batches, this column is zero.
batch_sql_handlevarbinary (64)A token that refers to the batch that the query is part of. For ad hoc and prepared batches, this column is null.

Column is nullable.
query_hashbinary(8)Binary hash value calculated on the query and used to identify queries with similar logic. You can use the query hash to determine the aggregate resource usage for queries that differ only by literal values.
is_internal_querybit*(Is query Distributed, Replicated, etc.?)**
query_parameterization_typesmallint*Query parameterization type. For queries without parameters, this column is zero.
query_parameterization_type_descnvarchar (60)*Description of query parameterization type.
0 = None
1 = User
initial_compile_start_timedatetime3Date and Time when query was initially compiled.
(Format is datetime2 + time zone shift?)**
last_compile_start_timedatetime3Date and Time of query’s last compilation. For query which had no recompilation that parameter is (equals to initial_compile_start_time or NULL)**
More columns are available

sys.query_store_plan
Column name Type Description
plan_idbigint*ID of the plan. This is unique value within a (server)*
query_idbigint*ID of the query. This is the same value as the query_id in the sys.query_store_query catalog view.
plan_group_idintID of the query plan group. This is the same value as the plan_group_id in the sys.query_plan_group catalog view.**
engine_versionnvarchar (128)Version of the instance of SQL Server plan has been compiled, in the form of 'major.minor.build.revision'.
query_plan_hashbinary(8)Binary hash value calculated on the query execution plan and used to identify similar query execution plans. You can use query plan hash to find the cumulative cost of queries with similar execution plans.
Will always be 0x000 when a natively compiled stored procedure queries a memory-optimized table.*
query_plannvarchar (max)Contains the compile-time Showplan representation of the query execution. The Showplan is in text format.*
is_parallel_planbit*Indicates whether the plan uses parallelism.
0 = not used
1 = used
Column is not nullable.*
is_forced_planbit*Indicates whether the plan for particular query is forced to be executed. In case of multiple plans for the same query SQL Engine will execute a plan where that column equals One. (In case all plans have zero, the most recent plan will be executed.)
0 = plan is not forced to be executed
1 = plan is not forced to be executed
Column is not nullable.*
force_failure_countintCounter of cases when is_forced_plan = 1, but by some reason SQL Engine does not use that plan. (f.i. index used by that plan has been modified or deleted and SQL Engine can’t use that plan anymore) *
last_force_failure_reasonsmallintColumn keeps code of the last force failure reason. In case of force_failure_count = 0 it also will be Zero.
Column is not nullable.*
last_force_failure_reason_descnvarchar (60)*Description of force failure reason.
0 = none
count_compilesint*Counter of compiles. Starts from One.
Column is not nullable. **
More columns are available

sys.query_store_runtime_stats
Column name Type Description
runtime_stats_idbigint*ID of the runtime plan stats. This is unique value within a (server or database)*
plan_idbigint*ID of the plan. This is the same value as the plan_id in the sys.query_store_plan catalog view.
runtime_stats_interval_idint**
execution_typesmallintIndicates special type of execution of SQL query. For instance parallel plan. For plans without any specialties column equals Zero.
Column is not nullable.*
execution_type_descnvarchar (60)*Description of query parameterization type.
0 = Regular*
first_execution_timedatetime3Date and Time when plan first time was executed.
(Format is datetime2 + time zone shift?)**
More columns are available
* The best guess.
** No clue.

Below is a possible data schema for the new feature:


How does it work?
Here are the sample steps of performance troubleshooting in the new environment:

  1. Alter database to be covered by Query Store;
  2. Use set of DMV’s to see executed queries and their plans;
  3. Find the bottleneck;
  4. Find the fastest plan, which by some reason was replaced by a slow one;
  5. Mark fastest plan to be used for the query.
Here are queries for the first and the last steps: 
Step #1. Altering database to be covered by Query Store:
ALTER DATABASE MyTestDB
SET QUERY_STORE (interval_length_minutes = 1);
ALTER DATABASE MyTestDB SET QUERY_STORE = ON;

Step #5. Mark fastest plan to be used for the query:
EXEC sp_query_store_force_plan 1, 1;


That stored procedure will set is_forced_plan = 1 in "sys.query_store_plan" DMV. 

For “sp_query_store_force_plan” you have to supply plan_id as a parameter.
(The second parameter is unknown. That can be query_id or context_settings_id

There can be situations when:

-         You made a wrong choice;
-         Forced plan is not the best anymore;
-        Some metadata have changed that causes Forced Plan to fail.

In these cases you can un-force a forced plan by following command:
EXEC sp_query_store_unforce_plan 1, 1;

(I assume that stored procedure is accepting the same parameters as the forcing one).

Afterwords:
I believe hundreds of thousands of SQL Servers in the world struggling because of bad execution plans, by some reason, pushed out good ones.

The new SQL Server “Query Store” feature will solve that problem and definitely be a very useful tool in hands of knowledgeable DBAs.



Please fill free to contact me in case you think I did not understand all new functionalities correctly or by the time when that feature officially go to a release some will be changed or new be added.




1 comment: