Friday, December 20, 2019

Recovering from Severe Database corruption error 9100

In my previous post I outlined very dangerous SQL Server problem, caused by usually not very harmful commands "PARSE" and "TRY_PARSE": Having "NaN" value for REAL and FLOAT producing severe error.

This post will be about a major issue it causes and on how to fight it.

Will start from generating the problem.

Attention: Do not run that in production!!!
Here is a script to generate the issue:
DROP TABLE IF EXISTS tbl_Real_Test;
GO
CREATE TABLE tbl_Real_Test (ID UNIQUEIDENTIFIER, R REAL, F FLOAT);
GO
INSERT INTO tbl_Real_Test (ID, R, F)
SELECT NewID(), TRY_PARSE(r as REAL), TRY_PARSE(f as FLOAT) 
FROM (VALUES
       ('0','434534776544')
       ,('0.0000323', 'NaN')
       ,('3.3881317890172E+17','')
       ,('','000000000000x000000000000')
       ,('NULL','ABCDEFG')
       ,('Null','--------------------')
       ,('null','!!!!!!!!!!!!!!!!!!!!')
       ,('ABC','345435467655665676545')
       ,('NaN','5.3881317890172E+47')
       ,('Nan','0.0000000000345')
       ,('nan','434.34543543')
) as R(r,f);
GO
SELECT * FROM tbl_Real_Test
WHERE R Is not Null ;

GO

That script will generate following error:
Msg 9100, Level 23, State 2, Line 57
Possible index corruption detected. Run DBCC CHECKDB.

Troubleshooting:

You might try to run "DBCC CHECKDB" against that database like this:
DBCC CHECKDB('Slava_Temp_Test');

DBCC CHECKDB resulted couple of errors and besides of that it made an error Dump into an ERRORLOG file and created four other DUMP files:
Msg 2570, Level 16, State 3, Line 62
Page (1:16200), slot 1 in object ID 2002106173, index ID 0, partition ID 72057594042843136, alloc unit ID 72057594048479232 (type "In-row data"). Column "F" value is out of range for data type "float".  Update column to a legal value.
Msg 2570, Level 16, State 3, Line 62
Page (1:16200), slot 8 in object ID 2002106173, index ID 0, partition ID 72057594042843136, alloc unit ID 72057594048479232 (type "In-row data"). Column "R" value is out of range for data type "real".  Update column to a legal value.
DBCC results for 'tbl_Real_Test'.
There are 9 rows in 1 pages for object "tbl_Real_Test".

CHECKDB found 0 allocation errors and 2 consistency errors in table 'tbl_Real_Test' (object ID 2002106173).

Fixing attempt:

To fix the issue you might try following commands
DBCC CHECKDB ('Slava_Temp_Test' , REPAIR_REBUILD) WITH ALL_ERRORMSGS;
GO
DBCC CHECKDB ('Slava_Temp_Test' , REPAIR_ALLOW_DATA_LOSS) WITH ALL_ERRORMSGS;

GO

However, in order to perform the fix DBCC CHECKDB requires the database to be in a single user mode:
Msg 7919, Level 16, State 3, Line 141
Repair statement not processed. Database needs to be in single user mode.

The problem is that it might be not possible to switch database into a single user mode, because it is part of an Availability Group and experiencing very heavy user activity. The use of conventional SQL Server tool would cause disruption in Production for who knows for how much time.

Solution:

To solve that problem we do not have to switch database in single mode or restore it from a previous backup to eliminate the issue. You just run an update script over your problematic table and columns to replace bogus values, which SQL Server can't interpret correctly, by NULLs:
UPDATE tbl_Real_Test
SET R = CASE CAST(R as BINARY(4))
       WHEN CAST( PARSE('NaN' as REAL) as BINARY(4))
       THEN Null Else R END
       , F = CASE CAST(F as BINARY(8))
       WHEN CAST( PARSE('NaN' as FLOAT) as BINARY(8))
       THEN Null Else F END;
GO
SELECT * FROM tbl_Real_Test
WHERE R Is not Null OR F Is not Null;

GO

The result of this query should be like this:

Disclaimer and Advertisement:

There are probably a lot of other cases, which might cause "Database corruption error 9100" and if that particular solution does not apply to your environment you can contact me for the resolution.


Tuesday, December 17, 2019

Having "NaN" value for REAL and FLOAT producing severe error.

Attention: DO NOT RUN THIS IN PRODUCTION!!!

That blog post is about the real error in SQL Server.
It was reproduced on SQL Server 2016 CU10 and on SQL Server 2019 RTM.

Database in trouble has a table with FLOAT column. It's Front-End application verifies user's input and inserts the data into that column using TRY_PARSE function.
The developer's intention was that any "Not-a-Numeric" or "Out-of-Range" values will be automatically converted to NULL and it will be for user's discretion to verify and fix these values.

However, one of the application users was very educated and instead of empty space, NULL or any other bad not numeric value the user supplied data with value of "NaN" for empty cells, which simply stands for "Not a Numeric".
That action caused a database corruption!

Here is how to reproduce it.

At first will create a table:
USE tempdb;
GO
DROP TABLE IF EXISTS tbl_Real_Float_Test;
GO
CREATE TABLE tbl_Real_Float_Test (ID INT IDENTITY(1,1), R REAL, F FLOAT);
GO

Will insert some bad data:
INSERT INTO tbl_Real_Float_Test(R,F)
SELECT TRY_PARSE('0' as REAL), TRY_PARSE('NaN' as FLOAT);
GO
INSERT INTO tbl_Real_Float_Test(R,F)
SELECT TRY_PARSE('NaN' as REAL), TRY_PARSE('0' as FLOAT);
GO

Now will try to query it:
SELECT R FROM tbl_Real_Float_Test WHERE ID = 1;
GO
SELECT F FROM tbl_Real_Float_Test WHERE ID = 2;
GO
SELECT ID FROM tbl_Real_Float_Test
WHERE R = 0 OR F = 0;
GO

First two queries will return zeroes. The third one will produce an error:
It also will write that error into the ERRORLOG:
Error: 9100, Severity: 23, State: 2. 
Possible index corruption detected. Run DBCC CHECKDB.

For the easiest reproduction of the destructive behavior of "NaN" you can run following statement:
DECLARE @r REAL = PARSE('NaN' as REAL);
PRINT 'We can Assign "NaN" to a variable, but we cannot see it:';
SELECT @r;

You will get an error of:
An error occurred while executing batch. Error message is: Arithmetic Overflow.

Interestingly enough the TRY_PARSE command allows "Culture" specification, which, in some cases, works very well:
DECLARE @Russian REAL = TRY_PARSE('NaN' as REAL USING 'Ru-RU');
DECLARE @US_Real REAL = TRY_PARSE('NaN' as REAL USING 'en-US');
SELECT Russian_Real = @Russian;
SELECT US_Real = @US_Real;

As you can see, the Russian culture treats the TRY_PARSE command correctly and produces "NULL" instead of the error. Besides of Russian it also works for Finnish, Swedish, Arabic, Traditional Chinese and undocumented "sr-Latn-CS".
No any other cultures treat "NaN" as it expected.

If you can reproduce that bug you can help Microsoft to fix it by voting for that bug:  https://feedback.azure.com/forums/908035-sql-server/suggestions/39278515-try-parse-and-parse-produce-an-error-converting-n

Again: DO NOT RUN IT IN PRODUCTION and do not forget to drop your test table!

Monday, December 16, 2019

T-SQL Bitwise Shifting

Unfortunately T-SQL supports only basic logical bitwise operations: AND, OR, XOR & NOT.
However, sometimes you need to do some "Rocket Science", or better say "Neurosurgery" when you need to shift value's internal bits.

When I hit that problem I came up with very easy solution with creation of a little function:
USE tempdb;
GO
CREATE FUNCTION dbo.fn_BitShift(
@Num INT
, @Shift SMALLINT    /* Positive - Right Shift, Negative - Left Shift */
, @Circular BIT      /* 0 - Not Circular Shift, 1 - Circular Shift */
) RETURNS INT AS
BEGIN
       DECLARE @BigNum BIGINT = @Num;

       WHILE @Shift != 0
              SELECT @BigNum = CASE WHEN SIGN(@Shift) > 0
                     THEN (@BigNum - (@BigNum & 1)) / 2
                           + 0x80000000 * (@BigNum & 1) * @Circular
                     ELSE (@BigNum - (@BigNum & 0x80000000)) * 2
                           + SIGN(@BigNum & 0x80000000) * @Circular
                     END, @Shift -= SIGN(@Shift);

       RETURN CAST(SUBSTRING(CAST(@BigNum as BINARY(8)),5,4) as INT);
END

GO

How does it work.

At first, here is some theory in the beginning: "Bit shifts" & "Circular shift"

Would say we need to shift bits one time in the right direction within our integer number "4".
The binary representation of integer "4" is "00000000 00000000 00000000 00000100".
The Right shift will move digit "1" on one space right and we get following binary number: "00000000 00000000 00000000 00000010", which is integer "2".

Here is an example of doing that via the function:
/*------------------------
SELECT dbo.fn_BitShift (4, 1, 0);
------------------------*/

-----------
2

The first parameter is "@Num", which is Integer "4".

The second parameter is number of shifts we want to perform. In our case it is "1".
Also, the Sign of the second parameter indicates the shift direction.
Positive values shifting our integer right and negative values shifting it left.
If we want shift our integer "4" two digits left we specify the second parameter as "-2" and will get "00000000 00000000 00000000 00010000", which is number "16".

Here is the example:
/*------------------------
SELECT dbo.fn_BitShift (4, -2, 0);
------------------------*/

-----------
16

The third variable indicates if we want to do our shift circular. Would say we want to shift our integer "4" three times. with regular shift result will be zero. With circular, the shifted digit will not be lost, but will move in the front and we will get binary number like this: "10000000 00000000 00000000 00000000", which is "-2147483648" in decimals.

Here is the example:
/*------------------------
SELECT dbo.fn_BitShift (4, 3, 0), dbo.fn_BitShift (4, 3, 1)
------------------------*/
            
----------- -----------
0           -2147483648

Because integers are stored within 32 bits and we will try to shift a number 32 times without circulation we will just loose that number. However, with circulation a number will make a whole circle and become itself:

Here is a sample:
/*------------------------
SELECT dbo.fn_BitShift (8, 32, 1), dbo.fn_BitShift (2, -32, 1);
------------------------*/
            
----------- -----------
8           2

Integers "8" & "2" were shifted around in different directions with returning to the same numbers.

If we try to use shift number bigger than 32 it will just continue to shift further. For instance shifting on 33 positions is equivalent to shift on only one position.

Here is a sample:
SELECT dbo.fn_BitShift (8, 33, 1), dbo.fn_BitShift (8, 1, 1)
, dbo.fn_BitShift (2, -33, 1), dbo.fn_BitShift (2, -1, 1);

Here are my testing samples:
/*------------------------
SELECT dbo.fn_BitShift (239623632, -32, 1)
, dbo.fn_BitShift (239623632, -27, 0)
, dbo.fn_BitShift (239623632, -27, 1)
, dbo.fn_BitShift (239623632, 32, 1)
, dbo.fn_BitShift (239623632, 25, 0)
, dbo.fn_BitShift (4, 31, 1)
, dbo.fn_BitShift (4, -31, 1)
, dbo.fn_BitShift (4, -29, 0);
------------------------*/
                                                                                    
----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
239623632   -2147483648 -2139995410 239623632   7           8           2           -2147483648

The limitation for that shifting function is only 32 bits of an integer number.

Thursday, December 5, 2019

Handling Forbidden XML characters in SQL Server

That is very known issue that SQL Server's XML does not accept characters "&", "<" and ">".
There are two more forbidden XML characters " ' " and " " " (single and double quotes), but SQL Server mostly accept them.

The common solution is to replace these characters by their codes.
Would say we have a silly sentence: "Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy."

If we try to replace forbidden characters by their codes we get something like this:

It does not look like readable text.

So, here is the solution:

In case you do not care about special character coding and care ONLY about text visual representation you can replace forbidden symbols by their siblings from other Unicode pages:
"&" - "&" (65286)
"<" - "<" (65308)
">" - ">" (65310)
" ' " - " สน " (697)
" " " - " สบ " (698)

Then we can do replacement before converting to XML like this:
DECLARE @MyText VARCHAR(1000) =
'Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.';
PRINT 'My Text: "' + @MyText + '";';
SET @MyText = REPLACE(REPLACE(REPLACE(REPLACE(@MyText
,'&','&#65286;'),'<','&#65308;'),'>','&#65310;'),'"','&#698;') ;
PRINT 'My Converted XML: "'
+ CAST(CAST('<MyXML>' + @MyText + '</MyXML>' as XML) as VARCHAR(MAX)) + '";';

SELECT CAST('<MyXML>' + @MyText + '</MyXML>' as XML);

The XML results will be like this:

If you try to open that XML in SSMS you'll see it clear:

And if you try to convert it back to VARCHAR you'll get following:

Note: when you convert XML back to VARCHAR, SQL Server will convert forbidden symbols back to the default code page, which might be very convenient.

Caveats:

You might try to replace problematic symbols directly, but in this case you would have to use NVARCHAR data type instead of VARCHAR to preserve Unicode symbols from being auto-converted back to the default code page:
DECLARE @MyText NVARCHAR(1000) =
'Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.';
SET @MyText =
       REPLACE(REPLACE(REPLACE(REPLACE(@MyText,'&',N''),'<',N''),'>',N''),'"',N'สบ');
SELECT @MyText FOR XML PATH('MyXML');

In some of the cases you'd be forced to use direct Unicode characters to avoid placement of escape codes. I'd recommend use of NCHAR command to convert your symbols, then you won't loose "Unicode siblings" if you decide to store your SQL Script in a file:
DECLARE @MyText NVARCHAR(1000) =
'Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.';
SET @MyText = REPLACE(REPLACE(REPLACE(REPLACE(@MyText
,'&',NCHAR(65286)),'<',NCHAR(65308)),'>',NCHAR(65310)),'"',NCHAR(698)) ;
SELECT 1 as Tag, Null as Parent,
       @MyText as [MyXML!1!MyText]
FOR XML EXPLICIT;

Here are the results of that code.


Hope you can use that solution in your work.
Please let me know if you hit any other unexpected issues related to XML symbols, I'd be glad to include more solutions in my blog.