Tuesday, September 8, 2015

SQL Server 2016 Query Store

I've already described some of "Query Store" functionality in my blog post "Newest SQL Server Feature – “Query Store”" in last November, when SQL Server 2016 was not available yet and nobody could try it.
Now I want to try that functionality alive and make sure all my expectations are true.

At first will enable Query Store functionality for a database in three easy steps:
Step 1. In SSMS 2016 CTP2 create a test database and go to it's properties:

Step 2. In Properties select the last in the list "Query Store" Option:

Step 3. In "General" section change "Enable" to "True". That will fill all default values.

Same result as Steps 1-2-3 you can get just by running following script:
USE [AdventureWorks2014];
GO
ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE = ON;
GO

Now will review the Case:

Problem Description:

Company "ABC" has a database where they place thousands of sales' orders on daily basis.
At the same time couple of hundreds clients every few seconds sending heavy queries requests to the SQL Server to monitor these sales orders.

Problem Solution:

Because over the time data are changing and after first time server was down, somebody very smart said: "-We have to update statistics!" and somebody added: "-And we will schedule that statistic update to prevent any problems in the future!"

Still have a problem:

Even though statistics have been updated on the regular basis, performance problem occurred even more often now! Why?

Problem Simulation:

I will try to simulate that particular problem using very small AdventureWorks database. To get some visible results I will use one month as a period.

In preparation I will run following script to Enable Query Store, Clear it (just in case), then re-create one additional index, clear query cache (NEVER do it in PRODUCTION) with "UPDATE STATISTICS" to simulate nightly maintenance job:

USE [AdventureWorks2014];
GO
ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE = ON;
GO
ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE CLEAR ALL;
GO
IF Exists (SELECT TOP 1 1 FROM sys.indexes WHERE name = 'IX_SalesOrderHeader_Date')
    DROP INDEX IX_SalesOrderHeader_Date ON Sales.SalesOrderHeader;
GO
CREATE INDEX IX_SalesOrderHeader_Date ON Sales.SalesOrderHeader (OrderDate);
GO
DBCC FREEPROCCACHE;
GO
UPDATE STATISTICS Sales.SalesOrderHeader; 
GO 

Mr. Martin is coming to work pretty early and executes his query first:

DECLARE @SQL NVARCHAR(1000) = '
     SELECT h.AccountNumber FROM Sales.SalesOrderHeader AS h
     INNER JOIN Person.Person AS p ON h.CustomerID = p.BusinessEntityID
     WHERE p.LastName = @LastName and h.OrderDate >= @OrderDate;';
DECLARE @LastName NVARCHAR(100) = N'Martin';
DECLARE @OrderDate DATETIME = '2014-07-01';

SET STATISTICS IO ON
EXECUTE sp_executesql @SQL,  
     N'@LastName NVARCHAR(100), @OrderDate DATETIME',
     @LastName = @LastName, @OrderDate = @OrderDate;
SET STATISTICS IO OFF
You might have noticed that I added "SET STATISTICS IO ON" command to capture IO Stats:
Very nice. In the beginning of the period there were only 2 logical reads because period is still empty.
Maintenance job updated statistics and everything works perfectly. Everybody are happy.

Now let's add some data to a period:
INSERT INTO Sales.SalesOrderHeader(
    RevisionNumber, OrderDate, DueDate, ShipDate, Status,
    OnlineOrderFlag, PurchaseOrderNumber,
    AccountNumber, CustomerID, SalesPersonID, TerritoryID,
    BillToAddressID, ShipToAddressID, ShipMethodID,
    CreditCardID, CreditCardApprovalCode, CurrencyRateID,
    SubTotal, TaxAmt, Freight, Comment,
    rowguid, ModifiedDate)
SELECT     RevisionNumber
    , DATEADD(MONTH,1,OrderDate)
    , DATEADD(MONTH,1,DueDate)
    , DATEADD(MONTH,1,ShipDate)
    , Status, OnlineOrderFlag, PurchaseOrderNumber,
    AccountNumber, CustomerID, SalesPersonID, TerritoryID,
    BillToAddressID, ShipToAddressID, ShipMethodID,
    CreditCardID, CreditCardApprovalCode, CurrencyRateID,
    SubTotal, TaxAmt, Freight, Comment,
    NewId(), ModifiedDate
FROM Sales.SalesOrderHeader
WHERE OrderDate >= '2014-06-01';

Now will re-run Mr. Martin's script. We shouldn't be surprised to see new big numbers:
Number of our reads increased almost 3000 times! That is in a very small database with no activity.
Now imagine busy server with hundreds of clients....
That means DBA team should use Admin connection to bring server alive.

Guess what they do to fix the problem....

UPDATE STATISTICS Sales.SalesOrderHeader;

Let's simulate their activity too and after that run Mr. Martin's query again.
  
What do we see: Number of logical reads dropped to a level SQL Server can handle hundreds of requests.
Did "UPDATE STATISTICS" helped? Yep! ....

Everything looks bright, until the next period when maintenance job updated statistics...

Let's describe the real problem: updating statistics sometimes helps, but sometimes harms.
In order to fix this issue once and for all will use "Query Store".

Open AdventureWorks database where you can notice some new items in the list:

Will briefly review these new reports:
Regressed Queries report contains list of queries in Query Store with their plans. In my case I have only one.
Overall resources consumption report for all of your queries.

And here is our "winner" in the "Top Resource consuming queries":


 You can notice that our slow query #17 has two plans associated with it #17 and #20



Why is that?
The plan #17 was created by SQL Server when there was no Sales orders in July of 2014 and SQL Server decided that it will be less expensive to build a plan using indexing by Order Date and associated Key Lookup. However, when number of records for the period started growing, that plan became more and more expensive.
Updating statistics invalidated that plan and when we run the query again SQL Server created plan #20, which looks for person's Last Name first and then does a full scan over Sales table. For that particular query plan #20 is more appropriate in a long run.

What would you do in these kind of situations in the previous versions of SQL Server:
- You might specify usage of Last Name Index in a hint or use an option to optimize for unknown.
The problem with these old solution is that they are applicable only when you have full control over the SQL code, in case of third party application you have nothing else other than just update statistics when problem occurs.

In SQL Server 2016 we have an option to force particular execution plan to be executed for a query.
Let's press the button "Force Plan" for plan #20.

Now will prove that it is working:
At first Enable actual execution plan (Ctrl+M), Update statistics again and execute our query for the next period - August of 2014:

UPDATE STATISTICS Sales.SalesOrderHeader; 
GO
DECLARE @SQL NVARCHAR(1000) = '
     SELECT h.AccountNumber FROM Sales.SalesOrderHeader AS h
     INNER JOIN Person.Person AS p ON h.CustomerID = p.BusinessEntityID
     WHERE p.LastName = @LastName and h.OrderDate >= @OrderDate;';
DECLARE @LastName NVARCHAR(100) = N'Martin';
DECLARE @OrderDate DATETIME = '2014-08-01';

SET STATISTICS IO ON
EXECUTE sp_executesql @SQL,  
     N'@LastName NVARCHAR(100), @OrderDate DATETIME',
     @LastName = @LastName, @OrderDate = @OrderDate;
SET STATISTICS IO OFF

Here is our plan for the query, which is supposed to be plan #20 forced by "Query Store" manager to be executed. However it isn't. It is very close plan, but it does not have Filtering operation for Sales Data.

That is strange. Let's go to the fourth report provided by query store: "Tracked Queries"

 At first it is completely empty. We have to specify Query #17 to see query stats and plans:

As you can see, SQL Server came up with the third plan #26 for that query.
MSDN says: "When a plan is forced for a particular query, every time SQL Server encounters the query, it tries to force the plan in the optimizer. If plan forcing fails, a XEvent is fired and the optimizer is instructed to optimize in the normal way."

I've tried to look at Extended Events and there was nothing. Absolutely no reason for plan #20 to fail.
I do not know what it can be. It can be CTP 2 bug or maybe a feature.
I've clicked on "Compare Plans" button and got all three plans at glance:

 It seems like plan #26 is just a little bit better than #20 and SQL Server preferred it over, without any firing of "Plan execution Failure" in the extended events.


So far what we've learned about "Query Store":

1. "Query Store" manager collects all query execution plans along with their statistics and visually present it with pretty friendly interface. I may predict that it will shake position of major monitoring tools unless they figure out better strategies.

2. We can force specific plan to be executed for a query.

3. (Undocumented) It looks like when we force a plan SQL Server tries to improve it if possible.

Other cull "Query Store" features will be in my further posts.


2 comments:

  1. ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE CLEAR =on; giving error, it should be
    ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE CLEAR

    ReplyDelete
  2. There was: "ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE CLEAR =ALL;"
    I've fixed it to: "ALTER DATABASE [AdventureWorks2014] SET QUERY_STORE CLEAR ALL;"
    That was a typo.
    Thanks Rakesh.

    ReplyDelete