As most of the developers working in "Agile", which means without any proper planning, I've created several tables without proper indexing, just couple of foreign keys.
Tables were very narrow with just a few columns and my expectations for data growth were very modest. However, after just a little while I was very surprised when my database showed huge unexpected growth and size of the data became multiple times higher than I've expected.
After very little research I've found and fixed the problem. In this post I'll describe how I've done it.
1. Identifying the Problem.
At firs I run following script to see the biggest tables in my database:
SELECT OBJECT_SCHEMA_NAME(st.object_id) + '.' + OBJECT_NAME(st.object_id) as ObjectName,
SUM(CASE WHEN st.index_id < 2 THEN st.row_count ELSE 0 END) as row_count,
SUM(CASE WHEN st.index_id < 2 THEN st.reserved_page_count ELSE 0 END)*8 as Data_Size_KB,
SUM(CASE WHEN st.index_id < 2 THEN st.used_page_count ELSE 0 END)*8 as Used_Data_Space_KB
FROM sys.dm_db_partition_stats st
LEFT JOIN sys.tables as t ON t.object_id = st.object_id
LEFT JOIN sys.indexes i ON i.OBJECT_ID = st.OBJECT_ID AND i.index_id = 1
WHERE OBJECT_SCHEMA_NAME(st.object_id) != 'sys'
GROUP BY OBJECT_SCHEMA_NAME(st.object_id), OBJECT_NAME(st.object_id), t.name
ORDER BY Data_Size_KB DESC, row_count DESC
Here is the Top of what I've got:
2. Data ResearchAs you can see from previous results, the biggest table is "tbl_Instance_WAIT_Data", where I collect data for waits in my systems. Lets take a look at the table's structure using "sp_help":
EXEC sp_help '[Data_Collection].[tbl_Instance_WAIT_Data]';
As you can see, that table has only 7 columns and data size for one row is only 44 bytes. With simple calculation we can determine that on one SQL Server database page, which is 8 Kb in size, can fit about 150 records. For the current row count of 692096 rows, expected data size is supposed to be no more than 37 Mb, but data usage exceeds 370 Mb (10 times more!) and total space usage is almost 3 Gb (80 times more!!!)
3. Simple FixTo fix that space over usage problem we just need to create a clustered index like this:
CREATE CLUSTERED INDEX CLIX_Instance_WAIT_Data
([Timestamp], Instance_ID, Wait_ID);
And now lets check the space size usage:
EXEC sp_spaceused '[Data_Collection].[tbl_Instance_WAIT_Data]';
As you can see, the data size has reduced to the expected number of ~37 Mb.