What is "COLLATION"? - That is the way how SQL Server interprets ASCII symbols and character codes. It is applicably not only for "English", but for any supported Unicode languages.
Most of the time the default collation always works and nobody even look at it. All new databases will have the same collation and all CHAR/NCHAR/VARCHAR/NCHAR fields.
However, sometimes you have to pay attention to your collation.
There might be a case when you attach database from another server with different collation.
Sometimes your business rules dictate that values have to be case sensitive and should be sorted in certain order. Then you have to change your collation.
At first, how do you know what is default collation on your SQL Server?
Run following command:
SELECT SERVERPROPERTY(N'Collation'); |
If you want, you can change collation for any database.
Following script demonstrates different results of the same query with different database collation and two different ways to retrieve default collation information for a database:
USE [TestDB]; GO SELECT collation_name FROM sys.databases WHERE Name = 'TestDB'; GO SELECT 'A','A' UNION SELECT 'a','a' UNION SELECT 'B','B' UNION SELECT 'b','b'; GO ALTER DATABASE [TestDB] COLLATE SQL_Latin1_General_CP1_CS_AS; GO SELECT DATABASEPROPERTYEX('TestDB', 'Collation') as collation_name; GO SELECT 'A','A' UNION SELECT 'a','a' UNION SELECT 'B','B' UNION SELECT 'b','b'; GO |
Here is a demonstration how collation effects a table:
USE [TestDB]; GO CREATE TABLE tbl_test_Collation( A CHAR(1), B CHAR(1) ); GO INSERT INTO tbl_test_Collation(A, B) SELECT 'A','A' UNION SELECT 'a','a' UNION SELECT 'B','B' UNION SELECT 'b','b'; GO SELECT Name, collation_name FROM sys.columns WHERE object_id = OBJECT_ID('tbl_test_Collation'); GO |
The beauty of SQL Server collation is in ability to set individual collation on every column:
ALTER TABLE tbl_test_Collation ALTER COLUMN A CHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AS; GO ALTER TABLE tbl_test_Collation ALTER COLUMN B CHAR(1) COLLATE Latin1_General_BIN; GO SELECT Name, collation_name FROM sys.columns WHERE object_id = OBJECT_ID('tbl_test_Collation'); GO |
Now will play around with that table. As you noticed, values in column "A" and "B" are identical:
SELECT * FROM tbl_test_Collation WHERE A = 'a'; GO SELECT * FROM tbl_test_Collation WHERE B = 'a'; GO |
We can see that first query, when we filtered by case insensitive column, returned two records and filter on case sensitive column returned only one record.
Ordering, using different collation, is even more interesting:
SELECT * FROM tbl_test_Collation ORDER BY A; GO SELECT * FROM tbl_test_Collation ORDER BY B; GO |
Sometimes we need to JOIN, Link or concatenate character fields with different collation.
Here are couple of examples which will generate errors:
SELECT A + B FROM tbl_test_Collation; GOSELECT * FROM tbl_test_Collation WHERE A = B;GO |
Here is the way to fix these queries:
SELECT A + B COLLATE SQL_Latin1_General_CP1_CI_AS FROM tbl_test_Collation;GO SELECT * FROM tbl_test_Collation WHERE A COLLATE Latin1_General_BIN = B;GO |
However, be very careful joining columns, which do not have the same collation.
For example two queries below looks exactly the same, but return different sets of records:
SELECT * FROM tbl_test_Collation as a INNER JOIN tbl_test_Collation as b ON a.A COLLATE Latin1_General_BIN = b.B; GOSELECT * FROM tbl_test_Collation as a INNER JOIN tbl_test_Collation as b ON a.A = b.B COLLATE SQL_Latin1_General_CP1_CI_AS;GO |
To prevent discrepancy between queries you can use "DATABASE_DEFAULT" option instead of specifying particular collation:
SELECT * FROM tbl_test_Collation as a INNER JOIN tbl_test_Collation as b ON a.A COLLATE DATABASE_DEFAULT = b.B COLLATE DATABASE_DEFAULT;GO |
I want to finish this post by the simple query to retrieve list of all available collations.
In SQL Server 2014 is almost 4K collations. To see only applicable to your language you can use following script
SELECT name, [description] FROM sys.fn_helpcollations() WHERE name like 'Greek%'; |
No comments:
Post a Comment