SQL Replication duplicate log reader in SQL replication monitor

It is upgrade time, and Dynamics GP does not like schema objects that are bound by replication during upgrades. During version updates some tables are copied, dropped and recreated, depending on what changes have been made to the schema. If replication is left running on a Dynamics GP company during the upgrade, the upgrade will fail whenever an object being replicated is changed as part of that upgrade.

As a result, SQL replication has to be removed for the upgrade, after first scripting it out as a CREATE script so that it can be rebuilt post upgrade. During a recent upgrade, after rebuilding replication, there were two log readers listed for each publication. Oddly, looking at the properties of the SQL log readers, they both referenced the same source SQL jobs. This caused issues when trying to start and stop them, or when trying to change credentials, as it all got a bit confusing.

This has happened to me a few times now. The following command removes the extra log reader registration.

Select all the readers from the replication distribution database, where they are registered, pick out the higher-numbered duplicates for deletion, then remove them with the DELETE command, using the ID from the table.

SELECT *
FROM [{replicationdistributiondbname}]..[MSlogreader_agents]

DELETE
FROM [{replicationdistributiondbname}]..[MSlogreader_agents]
WHERE id = {duplicate ID #}
Now the table should be left with one reader per publication (unless you have a more complicated replication arrangement going on). 
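If it is not obvious which registrations are the duplicates, a grouping query can pick them out. This is only a sketch and assumes the standard publisher_db, name and job_id columns of MSlogreader_agents:

SELECT id, name, publisher_db, job_id
FROM [{replicationdistributiondbname}]..[MSlogreader_agents]
WHERE publisher_db IN
(
    SELECT publisher_db
    FROM [{replicationdistributiondbname}]..[MSlogreader_agents]
    GROUP BY publisher_db
    HAVING COUNT(*) > 1
)
ORDER BY publisher_db, id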
 
Figure: Replication Monitor showing the reader agents

SSMS locked / frozen after network disconnect or computer sleep

When both the following are true:

  • the machine has been woken from sleep, or the network connection to the SQL Server has been interrupted
  • the SQL Server Management Studio Activity Monitor was open on a tab somewhere in the editor deck in SSMS

then it will no longer be possible to interact with SSMS. It will feel like SQL Server Management Studio has crashed or frozen.

The cause

SSMS appearing to crash after a lost connection or a sleep is caused by the modal dialog that reports the loss of connection for the Activity Monitor popping up behind the main application window somewhere. You cannot see this window by right clicking on the application in the taskbar, but it IS there. Being modal, it prevents interaction with any other windows in the application until it is closed.


The Fix

To get up and running again, press ALT+Tab until you find the pesky dialog window, then click OK on that window to dismiss it. SSMS will spring back to life. This is a regular occurrence for me after the computer sleeps while I have been monitoring the SQL server.

SQL Server 2012 SP2 Replication Log Reader error v.11.00.5058

After moving Dynamics GP to a newly built server, using a backup and restore strategy, the Dynamics GP SQL replication kept causing problems. The following message flooded the error logs of the server. I ended up rebuilding the replication configuration entirely to solve the issue, but there may be some clues in the steps I went through that might help someone else out there. There was a lot more digging and prodding than the summary shown below suggests!

The last step did not log any message! (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)


It was possible to add text logging by going to the “Run Agent” step in the Log Reader SQL job and then adding the following to the list of parameters:

-Publisher [xxxxx] -PublisherDB [xxxxx] -Distributor [xxxxx] -DistributorSecurityMode 1  -Continuous -Output f:\SQLBackups\Reports\ReplicationOutputLog.txt


The ReplicationOutputLog.txt then contained the following:

Validating publisher

            DBMS: Microsoft SQL Server
            Version: 11.00.5058
            catalog name:
            user name: dbo
            API conformance: 0
            SQL conformance: 0
            transaction capable: 1
            read only: F
            identifier quote char: "
            non_nullable_columns: 0
            owner usage: 15
            max table name len: 128
            max column name len: 128
            need long data len:
            max columns in table: 1000
            max columns in index: 16
            max char literal len: 131072
            max statement len: 131072
            max row size: 131072
2015-02-06 08:54:59.278 OLE DB xxxxxx 'xxxxxxx': {call sp_MSensure_single_instance(N'Repl-LogReader-0-xxxx-19', 2)}
2015-02-06 08:54:59.278 Parameter values obtained from agent profile:
            -pollinginterval 5000
            -historyverboselevel 1
            -logintimeout 15
            -querytimeout 1800
            -readbatchsize 500
            -readbatchsize 500000
2015-02-06 08:54:59.278 Status: 0, code: 22037, text: 'The last step did not log any message!'.
2015-02-06 08:54:59.278 The last step did not log any message!

 

We were also getting lots of errors relating to “Only one Log Reader Agent or log-related procedure”…”can connect to a database at a time” (see below for screenshot)


Google revealed some posts about SP1/SP2 hotfixes and cumulative updates for SQL 2012 fixing issues in the area of multiple log reader errors like this. Other posts talked about the database not being upgraded correctly, with new schema differences between versions. My conclusion on reading these posts was that the SQL replication database (distribution) may have been brought over from the previous server (we do not know the full history of whether it was restored and/or replication was rebuilt by scripts). The restored database may not have been correctly upgraded by whatever is in SP1/SP2: both were applied before the restore of our data, so any operations included in SP1/SP2 would not have been applied against it, due to the timeline.

After a few evenings spent removing everything of relevance I could find in the replication distribution database tables, and clearing out and recreating the log reader agent jobs and snapshot agent jobs, the problems were still persisting. After countless recreations of the publications and subscriptions, it felt like remnants of the previous version's replication were clinging on deep in the bowels of replication (amazing how gut feeling develops with experience).

Patience failing and for lack of a silver bullet, the solution was to remove the subscriptions and the publications and disable replication on the server. I then made sure the replication distribution database and its data files were gone, and that no SQL jobs relating to replication were left behind (there were, so those were removed too). I also checked all the nodes under SSMS that relate to replication to ensure nothing was left at all, including on the subscriber (there was a shadow of the subscription left on the subscriber).
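For reference, the tear-down went roughly like the sketch below; the server, database and publication names are placeholders, and the exact sequence will depend on your own topology:

-- run in the publisher (GP company) database; names are placeholders
USE [GPCompanyDB]
EXEC sp_dropsubscription @publication = N'GPPublication', @article = N'all', @subscriber = N'all'
EXEC sp_droppublication @publication = N'GPPublication'
EXEC master.dbo.sp_replicationdboption @dbname = N'GPCompanyDB', @optname = N'publish', @value = N'false'

-- run on the distributor: remove the publisher registration, the distribution database and the distributor itself
EXEC sp_dropdistpublisher @publisher = N'MYSQLSERVER', @no_checks = 1
EXEC sp_dropdistributiondb @database = N'distribution'
EXEC sp_dropdistributor @no_checks = 1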

Then replication was built anew: a new distribution database was created, with a new name for peace of mind, a new publication was created, articles were added and new subscribers were created.

After reinitializing all subscriptions and getting snapshots, everything started working how it should.

I feel lucky that I don’t have a totally mission critical replication setup. It is good that we have two copies of our replicated tables in different schemas, with views fronting them. This allows us to toggle between live and cached data modes, while “messing on with replication”.

Hey, it’s working!


The only thing left to figure out is whether the “one log reader” error is something to worry about, or if perhaps it will simply go away with the next service pack, whenever that is…

Beware DEX_ROW_TS for data synchronisation

DEX_ROW_TS is a SQL Server database field. It is found on some tables in the Dynamics GP ERP system (since V10). The field contains a UTC timestamp that is “stamped” whenever that row in the database is changed, setting it to the current date and time. A blog entry from David Musgrave, Understanding how Microsoft Dynamics GP works with Microsoft SQL Server, says that this was introduced to “help with Business Data Catalog (BDC) indexing for SharePoint searches”.

Run this script from Tim Foster to see all the tables in the GP databases that use it:

SELECT t.name AS table_name,
SCHEMA_NAME(schema_id) AS schema_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%DEX_ROW_TS%'
ORDER BY schema_name, table_name;
 
Executing this script against a test GP2010 installation (including the DYNAMICS and company databases) yields the following tables:
 
Figure: Tables containing DEX_ROW_TS
MC: MC40200
GL: GL00100, GL10000, GL12000, GL20000, GL30000, GL32000
IV: IV00101, IV00107, IV00108, IV10000, IV30200, IV40201, IV40202, IV40800
PM: PM00200, PM00300, PM10000, PM20000, PM30200
POP: POP10100, POP10110, POP30100, POP30110
RM: RM00101, RM00102, RM00301, RM00303, RM10301, RM20101, RM30101
SOP: SOP10100, SOP10200, SOP30200, SOP30300
SY/UPR: SY01200, UPR00100

A database trigger is responsible for keeping the timestamp updated on changes. Below you can see the trigger that works against the table MC40200.
CREATE TRIGGER [dbo].[zDT_MC40200U]
ON [dbo].[MC40200]
AFTER UPDATE
AS
SET NOCOUNT ON
BEGIN
    UPDATE dbo.MC40200
    SET DEX_ROW_TS = GETUTCDATE()
    FROM dbo.MC40200, inserted
    WHERE MC40200.CURNCYID = inserted.CURNCYID
END
SET NOCOUNT OFF

As can be seen from the script above, the timestamp is set to the current server UTC time when a row is updated; there are similar triggers covering row inserts. The time used is the time of the SQL Server, obtained with the T-SQL function GETUTCDATE(). It is worth noting that this date and time carries no time zone offset. It is easy to write SQL scripts that break because of local time zones, so remember to take this into account.
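A minimal example of what can go wrong: compare DEX_ROW_TS against GETUTCDATE(), not GETDATE(), otherwise the window will be out by the server's offset from UTC.

-- Items changed in the last hour; DEX_ROW_TS is UTC so the comparison must be UTC too
SELECT ITEMNMBR, DEX_ROW_TS
FROM IV00101
WHERE DEX_ROW_TS >= DATEADD(HOUR, -1, GETUTCDATE())
-- using GETDATE() here would shift the window by the server's local time zone offset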

Using DEX_ROW_TS for data synchronisation

If the need arises to keep Dynamics GP data replicated to a CRM system, or say an ecommerce website in Azure, or perhaps a data warehouse, then the field becomes very useful. The value of DEX_ROW_TS can be compared against the time of the last sync across all the tables, allowing only the rows that have changed to be selected for sync or pushed to the cloud, as sketched below.
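A sketch of such a delta pull follows; the @LastSyncUtc value is a hypothetical piece of state that the sync process would keep from its previous run:

DECLARE @LastSyncUtc DATETIME = '2015-02-06T00:00:00'  -- stored by the sync process

-- customers changed since the last sync
SELECT CUSTNMBR, CUSTNAME, DEX_ROW_TS
FROM RM00101
WHERE DEX_ROW_TS > @LastSyncUtc

-- items changed since the last sync
SELECT ITEMNMBR, ITEMDESC, DEX_ROW_TS
FROM IV00101
WHERE DEX_ROW_TS > @LastSyncUtc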

Don’t get too excited though; I’m about to put the fly in the ointment. The DEX_ROW_TS field is very useful, but there are issues with relying on it too heavily…

Only some tables have the DEX_ROW_TS column on them

It soon becomes apparent that the DEX_ROW_TS field is not present on every table (as was seen from the results of the earlier database query). Although it exists on many of the common tables used in integrations, it is lacking from many others. Unfortunately, as soon as it is missing from one table that requires synchronisation, another solution needs to be sought for identifying changed rows.

Table IV00105, used in price list setup, is an example of a table lacking the column.


Perhaps, in some cases, it is possible to infer the need to re-sync a record from a change in a related record in a table that does contain the timestamp, as in the sketch below; however, treading down this route means introducing a lot of logic and database work for each update.
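As a hypothetical illustration of that inference, the following re-selects rows in IV00105 (which has no DEX_ROW_TS) whenever the related item master row has changed; it assumes the usual ITEMNMBR and CURNCYID key columns and, of course, only catches changes that also touched the parent row:

DECLARE @LastSyncUtc DATETIME = '2015-02-06T00:00:00'

SELECT c.ITEMNMBR, c.CURNCYID
FROM IV00105 AS c                        -- no DEX_ROW_TS of its own
INNER JOIN IV00101 AS i ON i.ITEMNMBR = c.ITEMNMBR
WHERE i.DEX_ROW_TS > @LastSyncUtc        -- rely on the related item master timestamp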

Tables can update the time stamp in a different table, even if that table didn’t change

YES, that is correct: the time stamp can lie! Tables such as Item Master have their time stamp updated by any change to the Item Quantity table for the same item!

By scripting out the tables and looking at the triggers, it turns out that six tables have cross-table updates of this nature. An operation on a row in these tables will update its own time stamp but, crucially, it will also update the timestamp in another table, even if nothing has changed in that other table’s row!

All the occurrences where an update in one table causes the DEX_ROW_TS column in another table to be updated are listed below. All of these are in the company database; none are in the DYNAMICS database.

Source Table of Trigger | Trigger Type | Time Stamp Updated In Table
IV00102 – Item Quantity Master | INSERT | IV00101 – Item Master
IV00102 – Item Quantity Master | DELETE | IV00101 – Item Master
IV00102 – Item Quantity Master | UPDATE | IV00101 – Item Master
IV00104 – Item Kit Master | INSERT | IV00101 – Item Master
IV00104 – Item Kit Master | UPDATE | IV00101 – Item Master
IV00104 – Item Kit Master | DELETE | IV00101 – Item Master
IV00107 – Item Price List Options | UPDATE | IV40800 – Price Level Setup
IV00108 – Item Price List | INSERT | IV40800 – Price Level Setup
IV00108 – Item Price List | INSERT | IV00107 – Item Price List Options
IV00108 – Item Price List | DELETE | IV40800 – Price Level Setup
IV00108 – Item Price List | DELETE | IV00107 – Item Price List Options
IV00108 – Item Price List | UPDATE | IV40800 – Price Level Setup
IV00108 – Item Price List | UPDATE | IV00107 – Item Price List Options
IV40202 – U of M Schedule Detail Setup | INSERT | IV40201 – U of M Schedule Setup (header)
IV40202 – U of M Schedule Detail Setup | DELETE | IV40201 – U of M Schedule Setup (header)
IV40202 – U of M Schedule Detail Setup | UPDATE | IV40201 – U of M Schedule Setup (header)
RM00102 – Customer Master Address File | INSERT | RM00101 – RM Customer Master
RM00102 – Customer Master Address File | DELETE | RM00101 – RM Customer Master
RM00102 – Customer Master Address File | UPDATE | RM00101 – RM Customer Master

 

Be aware that it is not just synchronisation where this can be an issue. Perhaps there are SQL jobs that send reports when rows have changed in the customer setup records; these could fire misleadingly, because a change to an address in the address table also bumps the timestamp on the customer master record.
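This is easy to demonstrate on a test database; the item number below is a placeholder:

-- note the item master timestamp before
SELECT DEX_ROW_TS FROM IV00101 WHERE ITEMNMBR = 'TESTITEM'

-- touch the quantity record without really changing anything
UPDATE IV00102 SET QTYONHND = QTYONHND WHERE ITEMNMBR = 'TESTITEM'

-- the item master timestamp now shows the current UTC time, even though IV00101 itself was never changed
SELECT DEX_ROW_TS FROM IV00101 WHERE ITEMNMBR = 'TESTITEM'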

SQL Server Change Tracking

If you use SQL Server change tracking, then you may also experience the same misleading update: the cross-table update is logged as an update to both tables by change tracking. At least with change tracking it is possible to query which columns were updated, by using the column bitmask.
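For example, something along these lines will show, for each change, whether DEX_ROW_TS was among the columns updated (a sketch, assuming change tracking has been enabled on IV00101):

DECLARE @LastSyncVersion BIGINT = 0   -- stored by the sync process

SELECT  ct.ITEMNMBR,
        ct.SYS_CHANGE_OPERATION,
        CHANGE_TRACKING_IS_COLUMN_IN_MASK(
            COLUMNPROPERTY(OBJECT_ID('dbo.IV00101'), 'DEX_ROW_TS', 'ColumnId'),
            ct.SYS_CHANGE_COLUMNS) AS DexRowTsInMask
FROM CHANGETABLE(CHANGES dbo.IV00101, @LastSyncVersion) AS ct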

Customer Created Custom SQL Triggers

A blog post by Milan of Merit Solutions highlights another danger introduced by the DEX_ROW_TS triggers; read it here: Dynamics GP Workflow Tips for DEX_ROW_TS.

Essentially, be careful that your own triggers may run twice: the triggers that keep the time stamp updated perform an UPDATE against the same table that fired the trigger (follow me?), so a loop or double execution could accidentally be caused. The blog proposes using (NOT UPDATE (DEX_ROW_TS)) and sp_settriggerorder to prevent this happening and to manage the trigger order, if required.
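A sketch of the idea, applied to a hypothetical custom trigger on the customer master table (the trigger and the logging table are made-up names, not part of GP):

CREATE TRIGGER dbo.MyCustomTrigger_RM00101U
ON dbo.RM00101
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON
    -- the zDT_* trigger re-updates the same row, setting only DEX_ROW_TS;
    -- skipping that pass stops this trigger doing its work a second time
    IF NOT UPDATE(DEX_ROW_TS)
    BEGIN
        INSERT INTO dbo.MyCustomerChangeLog (CUSTNMBR, ChangedOnUtc)
        SELECT CUSTNMBR, GETUTCDATE()
        FROM inserted
    END
END
GO
-- optionally control when the custom trigger fires relative to the zDT trigger
EXEC sp_settriggerorder @triggername = N'dbo.MyCustomTrigger_RM00101U', @order = N'First', @stmttype = N'UPDATE'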