Monday, August 5, 2019

SQL Server String concatenation behavior.

Pretty recently hit very interesting problem of how SQL Server concatenates strings.

The very same SELECT clause in some cases produced VARCHAR(MAX), but in same cases it cut the result to VARCHAR(8000) ( or NVARCHAR(4000) )

After a research I've discovered following:
1. SQL Server is applying escalation of datatype to "MAX" ONLY in the case if at least one of the concatenated strings already has that type.
2. SQL Server does concatenation operation from left to right and treats the result as VARCHAR(8000) or NVARCHAR(4000) unless it hits the "MAX".

That means that the following query will return 8000 and not 8001:
SELECT LEN(REPLICATE('.',8000) + '1');

If we add to concatenated string "MAX" type value that will switch the result ONLY at the point of that "MAX" type value. So, the following script will return 8001:
SELECT LEN( REPLICATE('0',8000) + '1' + CAST('2' AS VARCHAR(MAX)) );

In order to get correct value for a string we need to have "MAX" type value BEFORE the string reaches size limit of 8000. For instance you can do following, which will return correct number of characters - 8002:
SELECT LEN( REPLICATE('0',8000) + CAST('1' AS VARCHAR(MAX))  + '2' );

So, in order to prevent any misbehavior in the future you can just put EMPTY "MAX" type value as the very first mamber of the concatenation:
SELECT LEN(CAST('' AS VARCHAR(MAX)) + REPLICATE('0',8000) + '1' + '2');

In case of use of variables the solution can be the same:
DECLARE @BigString VARCHAR(MAX);

SET @BigString = '';
SELECT @BigString = @BigString + REPLICATE('0',8000) + '1' + '2';

SELECT LEN(@BigString);

However, be careful with the following solution, which WON'T produce the correct result:
DECLARE @BigString VARCHAR(MAX);

SET @BigString = '';
SELECT @BigString += REPLICATE('0',8000) + '1' + '2';

SELECT LEN(@BigString);



1 comment: