Beware diacritic characters when integrating with SQL

I’m certain you all know the above and practice it regularly. First a little background…

In Dynamics GP we wrote a very basic “CRM like” system using a .NET GP Addin that sits over the top of the SOP module. It introduces the concept of contact records, with a many-to-many relationship to customers/debtors in GP. The list of contacts associated with an account can be viewed from a sales order and debtor card. The contacts are synchronised to MailChimp (SaaS email marketing). Marketing click-throughs and email opens are also synced back to be shown next to the contact record. The contacts are also synchronised with the various ecommerce websites that feed GP, contacts being soft linked to website users.

The website integration means there is a merge required to accommodate new and updated records when users update details on the websites. This is where my oversight came to light. Duplicate records were being created, which turned out to be due to diacritics. Below is an example of a duplicate record.

FirstName
Kristján
Kristjan

We know these two rows are the same person, but the SQL MERGE statement, due to the default collation on the database, does not see them as the same. Instead it sees two distinctly different names and thus creates a new contact record for the second instance, where it should (in our case) be merging changes into the first instance. This is an oversimplified version of what happened, as there are other keys involved and a lot of business rules. Obviously SQL is not doing anything wrong, but it is not our desired behaviour for this particular task.

It is easy to resolve: when comparing records, for our purpose, we override the default collation and use an Accent Insensitive (AI) version instead, for example:

COLLATE Latin1_general_CI_AI

where “AI” at the end of the collation name is the key to the insensitive comparison.

WHERE
t2.FirstName = t1.FirstName COLLATE Latin1_general_CI_AI
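
The effect of an accent-insensitive, case-insensitive comparison can be illustrated outside SQL as well. The following is a minimal Python sketch, not how SQL Server implements its collations: it simply strips combining accent marks after Unicode decomposition and lower-cases the result. The function names fold_accents and same_name are made up for this illustration.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining accent marks removed and case folded.

    Rough stand-in for a CI_AI style comparison key; real collations
    follow their own, more detailed rules.
    """
    # NFD splits "á" into "a" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keep the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def same_name(a: str, b: str) -> bool:
    """Compare two names ignoring accents and case."""
    return fold_accents(a) == fold_accents(b)


print(same_name("Kristján", "Kristjan"))  # True
print("Kristján" == "Kristjan")           # False
```

A key like this could be used to spot likely duplicates in an export before they reach the merge, but the COLLATE override above remains the fix on the SQL side.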

The implementation depends on your own needs. My point for this post is to not forget about this issue when merging data from different sources, where there may be a mixture of diacritic and non-diacritic text entered. Integration of data continues to have its challenges…

SQL bucket filling example

I’m frightened that one day this sample will disappear. It’s from a SQLServerCentral.com thread, “Filling Buckets”, originally posted by “ChrisM@work”.

-- Originally from: 
-- http://www.sqlservercentral.com/Forums/Topic568498-338-1.aspx
DECLARE @AmountToAllocate INT = 21

;WITH Calculator AS (
SELECT
BucketID, TotalSize, Amount,
AmountLeftToAllocate = CASE
WHEN @AmountToAllocate > (TotalSize - Amount) THEN @AmountToAllocate - (TotalSize - Amount)
WHEN @AmountToAllocate < 0 AND ABS(@AmountToAllocate) > Amount THEN Amount + @AmountToAllocate
ELSE 0 END,
NewAmount = CASE
WHEN @AmountToAllocate > (TotalSize - Amount) THEN TotalSize
WHEN @AmountToAllocate < 0 AND ABS(@AmountToAllocate) > Amount THEN 0
ELSE Amount + @AmountToAllocate END
FROM dbo.Buckets
WHERE BucketID = 1
UNION ALL
SELECT
tr.BucketID, tr.TotalSize, tr.Amount,
AmountLeftToAllocate = CASE
WHEN lr.AmountLeftToAllocate > (tr.TotalSize - tr.Amount) THEN lr.AmountLeftToAllocate - (tr.TotalSize - tr.Amount)
WHEN lr.AmountLeftToAllocate < 0 AND ABS(lr.AmountLeftToAllocate) > tr.Amount THEN tr.Amount + lr.AmountLeftToAllocate
ELSE 0 END,
NewAmount = CASE
WHEN lr.AmountLeftToAllocate > (tr.TotalSize - tr.Amount) THEN tr.TotalSize
WHEN lr.AmountLeftToAllocate < 0 AND ABS(lr.AmountLeftToAllocate) > tr.Amount THEN 0
ELSE tr.Amount + lr.AmountLeftToAllocate END
FROM dbo.Buckets tr
INNER JOIN Calculator lr ON lr.BucketID + 1 = tr.BucketID
)
SELECT
BucketID,
TotalSize,
Amount = NewAmount,
OldAmount = Amount
FROM Calculator

It has been my preferred boilerplate/starting place for SQL challenges like stock allocation and bin allocation in Dynamics GP. So to protect it from getting lost on the internet and to aid me finding it quickly when I need it again, I’ve posted it here.
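
To make the logic of the recursive CTE easier to follow, here is a rough Python rendering of the same three-way CASE, walking the buckets in BucketID order and carrying the remainder forward. It is only a sketch for reading alongside the SQL: the function name allocate and the sample bucket rows are invented for illustration, as the contents of dbo.Buckets are not shown in the original thread excerpt.

```python
def allocate(buckets, amount_to_allocate):
    """Spread amount_to_allocate over buckets, mirroring the CTE's CASE logic.

    buckets: list of (bucket_id, total_size, amount) tuples in bucket_id order.
    Returns (rows, amount_left) where each row is
    (bucket_id, total_size, new_amount, old_amount).
    """
    remaining = amount_to_allocate
    rows = []
    for bucket_id, total_size, amount in buckets:
        free = total_size - amount
        if remaining > free:
            # More to add than fits: fill the bucket, carry the rest forward.
            new_amount = total_size
            remaining = remaining - free
        elif remaining < 0 and abs(remaining) > amount:
            # More to remove than the bucket holds: empty it, carry the rest.
            new_amount = 0
            remaining = amount + remaining
        else:
            # Everything left fits in (or comes out of) this bucket.
            new_amount = amount + remaining
            remaining = 0
        rows.append((bucket_id, total_size, new_amount, amount))
    return rows, remaining


# Hypothetical sample data: (BucketID, TotalSize, Amount)
sample = [(1, 10, 3), (2, 10, 0), (3, 10, 5)]
rows, left = allocate(sample, 21)
print(rows)  # [(1, 10, 10, 3), (2, 10, 10, 0), (3, 10, 9, 5)]
print(left)  # 0
```

A negative amount drains the buckets in the same order, which corresponds to the second WHEN branch in each CASE of the SQL.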

How to check if a MS SQL Server linked server object is in use

Linked server objects allow one SQL server to connect to another at the database engine level. A connection is defined when creating the linked server, with credentials that should be used for that connection. As time passes, the challenge is knowing if that old linked server object is still being used by a report or script or import somewhere, as the original reason for its existence and person responsible may be long gone.

The quickest (not easiest) method to find dependencies on a linked server is to delete the linked server and wait for the phone to ring sometime over the next year. The person ranting on the other end will help you identify what it was used for. If the linked server was involved in a complex integration orchestration, or year/month end reporting, then the fallout from this maverick approach may not be pleasant to resolve, nor may the call be in social hours... Instead, carefully identify, then remove or correctly document, the dependencies on the linked server object.

Here are a few tips to help you find those dependencies:

Look for dependencies using SQL inbuilt dependency tracking

Using the script from “List All Objects Using Linked Server”, investigate any objects it returns.

SELECT 
Distinct
referenced_Server_name As LinkedServerName,
referenced_schema_name AS LinkedServerSchema,
referenced_database_name AS LinkedServerDB,
referenced_entity_name As LinkedServerTable,
OBJECT_NAME (referencing_id) AS ObjectUsingLinkedServer
FROM sys.sql_expression_dependencies
WHERE referenced_database_name IS NOT NULL
And referenced_Server_name = 'Enter LinkedServerName here'

Script out objects

Inbuilt dependency checking does not work if the dependency is “hidden” in a dynamically generated (sp_executesql) SQL statement, in SQL jobs, or is used by SQL replication subscriptions etc. If you don’t trust the dependency script above, or have an older SQL Server version, script out all your stored procedures, views, user defined functions, SQL jobs and any other SQL objects that might reference the linked server. Then perform a text search over all the resulting scripts for the linked server name. If you find the text, in turn identify if that object is still in use. Notepad++ and similar advanced text editors provide multiple file search with regular expression support. The regular expression support is handy if the linked server name is also used in other parts of the database, allowing the search to be narrowed down to text patterns likely to be a linked server reference. Also search any source code for applications that may be developed against that database.
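
If you would rather script the search than use an editor, something along these lines may do. It is a minimal Python sketch under several assumptions: the objects have been scripted out as UTF-8 .sql files under one folder, and a linked server reference looks like a four-part name, an OPENQUERY call, or an EXEC ... AT clause. The function names and the server name OLDSRV are placeholders, and the pattern will not catch every possible form (for example names built up by string concatenation).

```python
import re
from pathlib import Path


def build_pattern(server_name: str) -> re.Pattern:
    """Regex for text that is likely a reference to the given linked server."""
    name = re.escape(server_name)
    return re.compile(
        # OPENQUERY(Server, '...')
        rf"(?:\bOPENQUERY\s*\(\s*\[?{name}\]?\s*,"
        # EXEC ('...') AT Server
        rf"|\bAT\s+\[?{name}\]?(?!\w)"
        # Server.database.schema.object (database or schema may be empty)
        rf"|(?<![\w.\[])\[?{name}\]?\s*\.\s*\[?\w*\]?\s*\.\s*\[?\w*\]?\s*\.\s*\[?\w+)",
        re.IGNORECASE,
    )


def find_linked_server_refs(script_dir, server_name):
    """Return (file name, line number, line) for each likely reference."""
    pattern = build_pattern(server_name)
    hits = []
    for path in sorted(Path(script_dir).rglob("*.sql")):
        # Assumes UTF-8 files; adjust the encoding if your scripts differ.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.name, line_no, line.strip()))
    return hits


if __name__ == "__main__":
    # "scripted_objects" and "OLDSRV" are placeholder values.
    for name, line_no, line in find_linked_server_refs("scripted_objects", "OLDSRV"):
        print(f"{name}:{line_no}: {line}")
```

Because the pattern requires the server name to start a four-part name or appear inside OPENQUERY/AT, a column or table that merely shares the name (say dbo.OLDSRV) is not reported.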

Use SQL Profiler

Run SQL Server Profiler for a month against the target server (on a spare machine) to see if the linked server login name appears; this is the login name of the connection set up in the linked server configuration. I recommend a month so that month end scripts and monthly maintenance scripts will be caught, however this won’t catch scripts or reports that are only run at year end or once in a while. The trick here is to make certain you change the linked server connection to use a uniquely identifiable login name, so that it is possible to use the filter option in SQL Profiler to only log events from that LoginName. Changing the login name has to be a considered move, as it will change the security context and may thus affect rights to the destination server objects. These approaches all have downsides, but in less complex scenarios this technique works well.

Before deleting the linked server object

In SSMS, right click the object and script out the linked server. Put the resulting create script somewhere easy to find before deleting it; this way it will be quick to put it back again should an urgent need for it occur, say at year end months later. Having the create script helps get things running and buys you time to fix the offending referencing scripts, or document them correctly if it is decided they should exist after all.

Update all XML element nodes in an XML field of a SQL table

I wrote an electronic invoicing solution for Dynamics GP a long time ago. The XML invoices of various formats are stored against the invoice number in the database. The column was upgraded to the XML type when SQL Server started supporting it in SQL 2005. Although it is possible to cast a varchar to the XML data type, storing it as native XML makes it simpler to work with.

Recently an issue surfaced where the tax amount on the invoice lines needed to be zeroed for a set of invoices, as the XML values had been incorrectly generated. Once the invoices had been corrected (only a dozen), they could be marked for retransmission. But how to quickly and easily update the invoices?

We have been using XML functionality in SQL Server for another project recently, so it seemed natural to address the issue with the inbuilt XML functions of SQL Server.

[Image: a snippet of the "modify replace value of" SQL]

Documents

The nodes in the document all look like this:

<core:TotalTaxAmount>
<core:MonetaryAmount>7.680</core:MonetaryAmount>
</core:TotalTaxAmount>

Whatever the value is, it needs to be zero after the update.

<core:TotalTaxAmount>
<core:MonetaryAmount>0</core:MonetaryAmount>
</core:TotalTaxAmount>
 

Modify() function

The following is a good guide to using the XML modify() statement in TSQL

 
Using this as a guide, I ended up with the XPath looking like this:
UPDATE 
[dbo].[CA_SOP_EINVOICE_Output]
SET
XMLToSend.modify('
replace value of
(//core:TotalTaxAmount/core:MonetaryAmount[text()!="0"])[1]
with ("0")'
)
WHERE
SOPNUMBE=@SOPNUMBE AND
SOPTYPE=@SOPTYPE

This gives us a namespace error due to the core: prefix:
XQuery [dbo.CA_SOP_EINVOICE_Output.XMLToSend.modify()]: 
The name "core" does not denote a namespace.

This is easily resolved by specifying the namespace with
;WITH XMLNAMESPACES('rrn:org.xcbl:schemas/xcbl/v4_0/core/core.xsd' AS core)

Next we get an error of
The target of 'replace value of' must be a non-metadata attribute or an 
element with simple typed content
 
This is due to the schema not defining a type for our node, so just update it as text, by adding text()
(//core:TotalTaxAmount/core:MonetaryAmount[text()!="0"]/text())[1]

Only one replace is allowed at a time, which is a real shame as it turns an otherwise trivial statement into a script. To make it work for this problem, a loop is created that keeps updating the first non-zero node it finds until all the nodes have been set to zero. This then gives us the final script to do the task, as shown below:
 
DECLARE @SOPTYPE smallint
DECLARE @SOPNUMBE CHAR(21)

DECLARE @KeepLooping bit

SELECT @SOPTYPE=3, @SOPNUMBE='[Enter Doc Number]'

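BEGIN TRANSACTION
-- Note: the script ends with ROLLBACK TRANSACTION, so the changes are discarded (a dry run).
-- Switch that to COMMIT TRANSACTION once the results have been checked.

-- Declare the namespace we are using before each select/update.
-- exist() returns 1 for any non-empty result (even a boolean false),
-- so the test is written as a node predicate, matching the check inside the loop.
;WITH XMLNAMESPACES('rrn:org.xcbl:schemas/xcbl/v4_0/core/core.xsd' AS core)
SELECT @KeepLooping= XMLToSend.exist('//core:TotalTaxAmount/core:MonetaryAmount[text()!="0"]')
FROM [dbo].[CA_SOP_EINVOICE_Output]
WHERE
SOPNUMBE=@SOPNUMBE
AND SOPTYPE=@SOPTYPE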

WHILE
(@KeepLooping=1)
BEGIN

-- Do the update on the first non-zero node each time until they are all done
;WITH XMLNAMESPACES('rrn:org.xcbl:schemas/xcbl/v4_0/core/core.xsd' AS core)
UPDATE
[dbo].[CA_SOP_EINVOICE_Output]
SET
XMLToSend.modify('
replace value of
(//core:TotalTaxAmount/core:MonetaryAmount[text()!="0"]/text())[1]
with ("0")'
)
WHERE
SOPNUMBE=@SOPNUMBE AND
SOPTYPE=@SOPTYPE

;WITH XMLNAMESPACES('rrn:org.xcbl:schemas/xcbl/v4_0/core/core.xsd' AS core)
SELECT @KeepLooping=
XMLToSend.exist('//core:TotalTaxAmount/core:MonetaryAmount[text()!="0"]')
FROM [dbo].[CA_SOP_EINVOICE_Output]
WHERE
SOPNUMBE=@SOPNUMBE AND
SOPTYPE=@SOPTYPE

END
ROLLBACK TRANSACTION
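
For comparison, if the XML is handled in application code rather than inside SQL Server, the loop is not needed, because every matching node can be visited in one pass. The following Python sketch uses the standard library ElementTree module and the same core namespace URI as above. The function name zero_tax_amounts and the surrounding Invoice/Line structure in the sample are made up for illustration; the real documents will differ.

```python
import xml.etree.ElementTree as ET

CORE_NS = "rrn:org.xcbl:schemas/xcbl/v4_0/core/core.xsd"
# Keep the "core" prefix when the document is written back out.
ET.register_namespace("core", CORE_NS)


def zero_tax_amounts(xml_text: str) -> str:
    """Set every core:TotalTaxAmount/core:MonetaryAmount value to "0"."""
    root = ET.fromstring(xml_text)
    namespaces = {"core": CORE_NS}
    # ".//" finds matching nodes at any depth below the root element.
    for node in root.findall(".//core:TotalTaxAmount/core:MonetaryAmount", namespaces):
        node.text = "0"
    return ET.tostring(root, encoding="unicode")


# Hypothetical document shape, only the core: elements come from the post.
sample = f"""<Invoice xmlns:core="{CORE_NS}">
  <Line><core:TotalTaxAmount><core:MonetaryAmount>7.680</core:MonetaryAmount></core:TotalTaxAmount></Line>
  <Line><core:TotalTaxAmount><core:MonetaryAmount>1.200</core:MonetaryAmount></core:TotalTaxAmount></Line>
</Invoice>"""

print(zero_tax_amounts(sample))
```

For a dozen invoices already sitting in the database, the T-SQL loop above is still the more direct route, since nothing has to be pulled out and written back.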
 
That was another interesting dip into SQL Server support for XML!