SQL Server Techniques: Query to catch possible index duplication

To insure my database does not have redundancy I've checked it for any index duplications.

How can I do it? - Easy: Search the internet!!!

First what I've found was a blog post of SQLAuthority Pinal Dave:
SQL SERVER – Query to Find Duplicate Indexes – Script to Find Redundant Indexes

Here is the text of the Query Pinal created:

;WITH MyDuplicate AS (SELECT

Sch.[name] AS SchemaName,

Obj.[name] AS TableName,

Idx.[name] AS IndexName,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 1) AS Col1,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 2) AS Col2,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 3) AS Col3,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 4) AS Col4,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 5) AS Col5,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 6) AS Col6,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 7) AS Col7,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 8) AS Col8,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 9) AS Col9,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 10) AS Col10,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 11) AS Col11,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 12) AS Col12,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 13) AS Col13,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 14) AS Col14,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 15) AS Col15,

INDEX_COL(Sch.[name] + '.' + Obj.[name], Idx.index_id, 16) AS Col16

FROM sys.indexes Idx

INNER JOIN sys.objects Obj ON Idx.[object_id] = Obj.[object_id]

INNER JOIN sys.schemas Sch ON Sch.[schema_id] = Obj.[schema_id]

WHERE index_id > 0)

SELECT    MD1.SchemaName, MD1.TableName, MD1.IndexName,

MD2.IndexName AS OverLappingIndex,

MD1.Col1, MD1.Col2, MD1.Col3, MD1.Col4,

MD1.Col5, MD1.Col6, MD1.Col7, MD1.Col8,

MD1.Col9, MD1.Col10, MD1.Col11, MD1.Col12,

MD1.Col13, MD1.Col14, MD1.Col15, MD1.Col16

FROM MyDuplicate MD1

INNER JOIN MyDuplicate MD2 ON MD1.tablename = MD2.tablename

AND MD1.indexname <> MD2.indexname

AND MD1.Col1 = MD2.Col1

AND (MD1.Col2 IS NULL OR MD2.Col2 IS NULL OR MD1.Col2 = MD2.Col2)

AND (MD1.Col3 IS NULL OR MD2.Col3 IS NULL OR MD1.Col3 = MD2.Col3)

AND (MD1.Col4 IS NULL OR MD2.Col4 IS NULL OR MD1.Col4 = MD2.Col4)

AND (MD1.Col5 IS NULL OR MD2.Col5 IS NULL OR MD1.Col5 = MD2.Col5)

AND (MD1.Col6 IS NULL OR MD2.Col6 IS NULL OR MD1.Col6 = MD2.Col6)

AND (MD1.Col7 IS NULL OR MD2.Col7 IS NULL OR MD1.Col7 = MD2.Col7)

AND (MD1.Col8 IS NULL OR MD2.Col8 IS NULL OR MD1.Col8 = MD2.Col8)

AND (MD1.Col9 IS NULL OR MD2.Col9 IS NULL OR MD1.Col9 = MD2.Col9)

AND (MD1.Col10 IS NULL OR MD2.Col10 IS NULL OR MD1.Col10 = MD2.Col10)

AND (MD1.Col11 IS NULL OR MD2.Col11 IS NULL OR MD1.Col11 = MD2.Col11)

AND (MD1.Col12 IS NULL OR MD2.Col12 IS NULL OR MD1.Col12 = MD2.Col12)

AND (MD1.Col13 IS NULL OR MD2.Col13 IS NULL OR MD1.Col13 = MD2.Col13)

AND (MD1.Col14 IS NULL OR MD2.Col14 IS NULL OR MD1.Col14 = MD2.Col14)

AND (MD1.Col15 IS NULL OR MD2.Col15 IS NULL OR MD1.Col15 = MD2.Col15)

AND (MD1.Col16 IS NULL OR MD2.Col16 IS NULL OR MD1.Col16 = MD2.Col16)

ORDER BY MD1.SchemaName,MD1.TableName,MD1.IndexName

In order to Test it I run it in "AdventureWorks" Database.
It produced following result:

Good catch!

But I wanted to challenge that query.
What if in one index will be more columns than in another?
I tried following script to generate artificial duplicate:

USE AdventureWorks;

CREATE INDEX TestIndex1_Customer

ON Sales.Customer (PersonID, TerritoryID, StoreID);

CREATE INDEX TestIndex2_Customer

ON Sales.Customer (PersonID, TerritoryID, StoreID, AccountNumber);

I tried Dave's script again and it perfectly worked:

I reviewed the script an come up with worst case scenario when columns within index are mixed
(the guy who creates the duplicate does not look at right order at all):

USE AdventureWorks;

DROP INDEX TestIndex2_Customer ON Sales.Customer

CREATE INDEX TestIndex2_Customer

ON Sales.Customer (PersonID, StoreID, TerritoryID, AccountNumber);

I just switched columns TerritoryID and StoreID and Dave's script did not catch that case.

I understand that sometimes there can be completely eligible case when we need two exactly same indexes with little bit mixed columns, but how often that can happen?

Than I tried to create a copy of one of XML indexes:

USE AdventureWorks;

CREATE XML INDEX [XMLPATH_Person_Demographics2]

ON [Person].[Person] ([Demographics])

USING XML INDEX [PXML_Person_Demographics] FOR PATH

And Dave's query failed again.

I needed something better to catch these cases and I came up with following query:

;WITH ForResearch AS (

SELECT o.object_id, s.name as schemaname,

o.name as tablename, i.name as IndexName,

ic.column_id, ic.key_ordinal, i.index_id, i.type,

IsNull(x.secondary_type_desc, 'PRIMARY') as XML_Type

FROM sys.indexes i

INNER JOIN sys.objects o ON i.object_id = o.object_id

INNER JOIN sys.schemas s ON s.schema_id = o.schema_id

INNER JOIN sys.index_columns ic

ON ic.index_id = i.index_id and ic.object_id = o.object_id

INNER JOIN sys.columns c

ON ic.column_id = c.column_id and c.object_id = o.object_id

LEFT JOIN sys.xml_indexes as x

ON x.index_id = i.index_id and o.object_id = x.object_id

WHERE i.index_id > 0 and s.name != 'sys'

)

SELECT DISTINCT t3.schemaname, t3.tablename, t3.IndexName, t3.object_id

FROM ForResearch as t3 WHERE Not exists (

SELECT t1.object_id FROM ForResearch as t1

LEFT JOIN ForResearch as t2 ON

t1.object_id = t2.object_id and

t1.column_id = t2.column_id and

t1.index_id != t2.index_id and

t1.XML_Type = t2.XML_Type and

(

(t1.key_ordinal != 1 and t2.key_ordinal != 1) or

t1.key_ordinal = t2.key_ordinal

)

WHERE t2.object_id Is Null and

t1.object_id = t3.object_id and

t1.index_id = t3.index_id

);

I tried it against modified "AdventureWorks" database:

Even though my query does not nicely show duplicated columns and corresponding duplicate table I like results it produces. It perfectly caught cases that previous query couldn't.

SQL Server Techniques

Thursday, November 14, 2013

Query to catch possible index duplication

No comments:

Post a Comment