To prove for the client and for myself necessity of using that hint I've made an experiment, which want to post here.
At first, will go to TestDB and create three tables: tbl_Test with test data, tbl_Counts - there will be inserted counts of records from tbl_Test table, tbl_Flag - communication table to transfer a signal between two SSMS windows.
As a result of that script table tbl_Test has to be populated by 10,000 "Empty" records.
-- # Script 1 - Window 1 - Preparation
use TestDB
GO
-- Drop Test Table if exists
IF EXISTS ( SELECT TOP 1 1 FROM sys.tables WHERE name = 'tbl_Test')
DROP TABLE tbl_Test;
GO
-- Drop Test Table if exists
IF EXISTS ( SELECT TOP 1 1 FROM sys.tables WHERE name = 'tbl_Counts')
DROP TABLE tbl_Counts;
GO
-- Drop Test Table if exists
IF EXISTS ( SELECT TOP 1 1 FROM sys.tables WHERE name = 'tbl_Flag')
DROP TABLE tbl_Flag;
GO
-- Create Test Table
CREATE TABLE tbl_Test(
ID INT IDENTITY (1,1) CONSTRAINT PK_Test PRIMARY KEY,
TestText
VARCHAR(100) NULL,
);
GO
CREATE TABLE tbl_Counts (Counted INT)
GO
CREATE TABLE tbl_Flag(ID INT IDENTITY(1,1))
GO
-- Fill Test records
INSERT INTO tbl_Test(TestText)
SELECT TOP 10000 NULL
FROM sys.messages;
GO
|
After table tbl_Test has been populated I run the second script.
That script is waiting for a signal to collect the data and collects it until another signal.
Data collection is a simple counting of records from the tbl_Test table.
If the number of records won't match 10,000 it will be reported to tbl_Counts table.
-- # Script 2 - Window 1 - Collecting
the data
SET NOCOUNT ON
GO
DECLARE @i INT = 0;
WHILE @i = 0
SELECT @i = COUNT(*) FROM tbl_Flag;
WHILE (SELECT COUNT(*) FROM tbl_Flag) > 0
BEGIN
SELECT @i = COUNT(*) FROM tbl_Test (NOLOCK);
IF @i != 10000
INSERT INTO
tbl_Counts(Counted)
VALUES(@i);
END
GO
SELECT Counted, 10000 - Counted as Diff
FROM tbl_Counts;
GO
|
While the second script is running I open another SSMS tab an start third script.
That script inserts one record into tbl_Flag table to activate data collection and then just does an update of tbl_Test. After update is finished it sends signal back to stop collecting the data.
-- # Script 3 - Window 2 - Changing
test data
INSERT INTO tbl_Flag DEFAULT VALUES;
GO
UPDATE tbl_FF_Test
SET TestText = REPLICATE('A',100);
GO
TRUNCATE TABLE tbl_Flag;
GO
|
After that script has finished I switched to the first tab and got my expected results:
That shows that during "UPDATE" operation my second query captured some situations when number of records is not equal 10,000 records.
That is easy proof how NOLOCK hint can produce incorrect results.
Microsoft recommends to use "READ UNCOMMITTED" transaction isolation level instead of NOLOCK.
Look how how it performs.
At first, run script #1 again to recreate our tables.
Then instead of script #2 run it's modification: #4
-- # Script 4 - Window 1 - Collecting
the data
SET NOCOUNT ON
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
GO
DECLARE @i INT = 0;
WHILE @i = 0
SELECT @i = COUNT(*) FROM tbl_Flag;
WHILE (SELECT COUNT(*) FROM tbl_Flag) > 0
BEGIN
SELECT @i = COUNT(*) FROM tbl_Test;
IF @i != 10000
INSERT INTO
tbl_Counts(Counted)
VALUES(@i);
END
GO
SELECT Counted, 10000 - Counted as Diff
FROM tbl_Counts;
GO
|
Then in another SSMS tab run script #3 and go back for the results. They are the same.
There are several captured records which do not match count 10,000.
That means "READ UNCOMMITTED" works in the same manner as NOLOCK hint and may generate misleading data.
Now will talk about the difference.
You can not use "READ UNCOMMITTED" within a function, but you can use NOLOCK hint.
CREATE FUNCTION dbo.fn_Test()
RETURNS INT AS
BEGIN
RETURN (SELECT COUNT(*) FROM tbl_Test (NOLOCK));
END
|
You have to apply "NOLOCK" hint individually to every table you want.
"READ UNCOMMITTED" will be applied to the whole statement until isolation level will be changed or until closing of connection.
Now is the most important question: Do you have to use "NOLOCK" at all? Is it "silver bullet" or is it evil?
SQL developers often use that hint to avoid unnecessary locking of tables/pages/records during heavy transactional activity to avoid possible deadlocks.
However, as you could see from my examples you can possibly get inaccurate data as a result.
If the result is not important to you then it is OK, but if that result is used in following calculations or important financial reporting then I'd say it is not OK.
I'll try to define some ONLY cases when it would be OK to use "NOLOCK" hint:
1. For development/troubleshooting purposes, when results are not very important;
2. In extremely big and heavy transactional systems with thousands of requests per second SOME user reporting queries can be hinted with "NOLOCK". For instance: it is not big deal if Amazon store you get a duplicate or missing item in a suggestion items line.
3. For a query, which does clustered index seek or scan for historical data, which can't be changed or deleted during the request.
Generally speaking, if you have "NOLOCK" in your code means that you have big problem with performance and you are at risk of having even bigger problem of having incorrect data.
No comments:
Post a Comment