Sunday, November 17, 2013

Documenting AdventureWorks Database for 2012

I'm trying to produce full documentation for AdventureWorks Database.
Unfortunately Blogspot can't comprehend 950 Kb of code.

So, I had to attach it.


AdventureWorks2012_Doc.HTML

Steps to view:
1. Open the link.
2. In the new tab or window choose File -> download (in the Google menu)
3. Open the document with your browser.

Please let me know if you have any suggestion how to improve that document.

Thursday, November 14, 2013

Query to catch possible index duplication

To insure my database does not have redundancy I've checked it for any index duplications.

How can I do it? - Easy: Search the internet!!!

First what I've found was a blog post of SQLAuthority Pinal Dave:
 SQL SERVER – Query to Find Duplicate Indexes – Script to Find Redundant Indexes

Here is the text of the Query Pinal created:
;WITH MyDuplicate AS (SELECT
Sch.[name] AS SchemaName,
Obj.[name] AS TableName,
Idx.[name] AS IndexName,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 1) AS Col1,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 2) AS Col2,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 3) AS Col3,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 4) AS Col4,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 5) AS Col5,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 6) AS Col6,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 7) AS Col7,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 8) AS Col8,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 9) AS Col9,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 10) AS Col10,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 11) AS Col11,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 12) AS Col12,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 13) AS Col13,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 14) AS Col14,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 15) AS Col15,
INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 16) AS Col16
FROM sys.indexes Idx
INNER JOIN sys.objects Obj ON Idx.[object_id] = Obj.[object_id]
INNER JOIN sys.schemas Sch ON Sch.[schema_id] = Obj.[schema_id]
WHERE index_id > 0)
SELECT    MD1.SchemaName, MD1.TableName, MD1.IndexName,
MD2.IndexName AS OverLappingIndex,
MD1.Col1, MD1.Col2, MD1.Col3, MD1.Col4,
MD1.Col5, MD1.Col6, MD1.Col7, MD1.Col8,
MD1.Col9, MD1.Col10, MD1.Col11, MD1.Col12,
MD1.Col13, MD1.Col14, MD1.Col15, MD1.Col16
FROM MyDuplicate MD1
INNER JOIN MyDuplicate MD2 ON MD1.tablename = MD2.tablename
AND MD1.indexname <> MD2.indexname
AND MD1.Col1 = MD2.Col1
AND (MD1.Col2 IS NULL OR MD2.Col2 IS NULL OR MD1.Col2 = MD2.Col2)
AND (
MD1.Col3 IS NULL OR MD2.Col3 IS NULL OR MD1.Col3 = MD2.Col3)
AND (
MD1.Col4 IS NULL OR MD2.Col4 IS NULL OR MD1.Col4 = MD2.Col4)
AND (
MD1.Col5 IS NULL OR MD2.Col5 IS NULL OR MD1.Col5 = MD2.Col5)
AND (
MD1.Col6 IS NULL OR MD2.Col6 IS NULL OR MD1.Col6 = MD2.Col6)
AND (
MD1.Col7 IS NULL OR MD2.Col7 IS NULL OR MD1.Col7 = MD2.Col7)
AND (
MD1.Col8 IS NULL OR MD2.Col8 IS NULL OR MD1.Col8 = MD2.Col8)
AND (
MD1.Col9 IS NULL OR MD2.Col9 IS NULL OR MD1.Col9 = MD2.Col9)
AND (
MD1.Col10 IS NULL OR MD2.Col10 IS NULL OR MD1.Col10 = MD2.Col10)
AND (
MD1.Col11 IS NULL OR MD2.Col11 IS NULL OR MD1.Col11 = MD2.Col11)
AND (
MD1.Col12 IS NULL OR MD2.Col12 IS NULL OR MD1.Col12 = MD2.Col12)
AND (
MD1.Col13 IS NULL OR MD2.Col13 IS NULL OR MD1.Col13 = MD2.Col13)
AND (
MD1.Col14 IS NULL OR MD2.Col14 IS NULL OR MD1.Col14 = MD2.Col14)
AND (
MD1.Col15 IS NULL OR MD2.Col15 IS NULL OR MD1.Col15 = MD2.Col15)
AND (
MD1.Col16 IS NULL OR MD2.Col16 IS NULL OR MD1.Col16 = MD2.Col16)
ORDER BY MD1.SchemaName,MD1.TableName,MD1.IndexName



In order to Test it I run it in "AdventureWorks" Database.
It produced following result:

Good catch!

But I wanted to challenge that query.
What if in one index will be more columns than in another?
I tried following script to generate artificial duplicate:



USE AdventureWorks;

GO

CREATE INDEX TestIndex1_Customer  
ON Sales.Customer (PersonID, TerritoryID, StoreID);

GO

CREATE INDEX TestIndex2_Customer 
ON Sales.Customer (PersonID, TerritoryID, StoreID, AccountNumber);
GO

I tried Dave's script again and it perfectly worked:

 I reviewed the script an come up with worst case scenario when columns within index are mixed
(the guy who creates the duplicate does not look at right order at all):




USE AdventureWorks;

GO

DROP INDEX TestIndex2_Customer ON Sales.Customer

GO

CREATE INDEX TestIndex2_Customer  
ON Sales.Customer (PersonID, StoreID, TerritoryID, AccountNumber);
 

I just switched columns TerritoryID and StoreID and Dave's script did not catch that case.

I understand that sometimes there can be completely eligible case when we need two exactly same indexes with little bit mixed columns, but how often that can happen?

Than I tried to create a copy of one of XML indexes:




USE AdventureWorks;

GO
CREATE XML INDEX [XMLPATH_Person_Demographics2]

ON [Person].[Person] ([Demographics])

USING XML INDEX [PXML_Person_Demographics] FOR PATH

GO


And Dave's query failed again.

I needed something better to catch these cases and I came up with following query:




;WITH ForResearch AS (
SELECT o.object_id, s.name as schemaname,
o.name as tablename, i.name as IndexName,
ic.column_id, ic.key_ordinal, i.index_id, i.type,
IsNull(x.secondary_type_desc, 'PRIMARY') as XML_Type
FROM sys.indexes i
INNER JOIN sys.objects o ON i.object_id = o.object_id
INNER JOIN sys.schemas s ON s.schema_id = o.schema_id
INNER JOIN sys.index_columns ic
ON ic.index_id = i.index_id and ic.object_id = o.object_id
INNER JOIN sys.columns c
ON ic.column_id = c.column_id and c.object_id = o.object_id
LEFT JOIN sys.xml_indexes as x
ON x.index_id = i.index_id and o.object_id = x.object_id
WHERE i.index_id > 0 and s.name != 'sys'
)
SELECT DISTINCT t3.schemaname, t3.tablename, t3.IndexName, t3.object_id
FROM ForResearch  as t3 WHERE Not exists (
SELECT t1.object_id FROM ForResearch as t1
LEFT JOIN ForResearch as t2 ON
t1.object_id = t2.object_id  and
t1.column_id = t2.column_id  and
t1.index_id != t2.index_id and
t1.XML_Type = t2.XML_Type and
(
(t1.key_ordinal != 1 and t2.key_ordinal != 1) or
t1.key_ordinal = t2.key_ordinal
)

WHERE t2.object_id Is Null and
t1.object_id = t3.object_id and
t1.index_id = t3.index_id
);


I tried it against modified "AdventureWorks" database:


Even though my query does not nicely show duplicated columns and corresponding duplicate table I like results it produces. It perfectly caught cases that previous query couldn't.