Duplicate sticking sales order numbers in Dynamics GP
This is documentation of the issue we experienced with users entering sales orders getting the lines from another user who is also entering a sales order at the same time. This is a problem that we are no longer seeing with version 2013R2, although I have not yet investigated what has changed, our Visual Studio mods are the same as they were then.
Sequence diagram showing possible cause of bug
Sales transaction entry screen bug write up submitted to MS.
Problem:
· Users working concurrently in SOP Entry are issued with the same Sales document number as each other.
· Users working concurrently in SOP Entry are issued with the same Master document number as each other.
Consequence:
Orders can be overwritten (lost) and corrupted.
Observed behaviour:
We have a large number of telesales staff users who “sit” in SOP entry all day. Often they do so with a new document number, not yet committed in their sop entry screen, awaiting the next call to come in (i.e. tabbed to the document number field and wait).
From several times a day to several times a week, we observe, order lines magically appearing as the user builds the order in the sales transaction entry form. These magically appearing lines, are actually from another user in the room who is entering an order at the same time, who has been allocated the same order number by GP.
If timings are bad, then the order can be lost as the second user saves over the top of the first users order, losing the header info and gaining lines they didn’t notice were there. This is intolerable as with a telesales order, the customers’ details are lost forever as we don’t know who called, as its recorded in the sales header that was just lost.
My Analysis
Using the limited tooling available to me, I may be wrong but narrowing down the issue, it seems to be related to a concurrency SQL issue. This issue is with the way GP handles new document numbers. Normal behaviour is, that if a user goes into the SOP transaction entry form and hits the document number field on the form and the [sopGetNumber] procedure is called to get a new document number. This document number is placed into the form header table buffer (SOP10100) on the GUI.
Should the user exit the screen without saving, then that document number is returned to the number “pool” by an UPDATE SQL statement, updating the relevant document number table SOP40300, setting the next number to be the number the user just didn’t use. In the meantime other orders have been created concurrently by other users. The next SOP document number just set by the UPDATE, may now be way behind the highest document number actually used.
Now a subsequent user takes that SOP document number we just returned, the next SOP document number is incremented by one. Thus the next user to request a document number will find that number is now sitting on a previously used document. This is ok as the seek function will realise it is already used and seek up the numbers for the next gap where there is a gap in SOP10100 record.
However, if two users at the same time start seeking up because the next number is used, to get a new number from a gap in the SOP10100 table, in this scenario, they will both end up at the first available gap. The dex form does not do the handling of the next sop number inside a SQL transaction, thus they both read the same next sop number and step up until a gap is found ending on the same number. (I think)
EConnect (that we also use to create orders) uses an alternative stored procedure to get a new document number, [taGetSopNumber], this procedure implements a forward seek looking for the next available free number (up to I think 1000 steps), within the SQL transaction procedure. The dex form that seems to do it in dex without a transaction (I’ve not got source so don’t know, just looking at sql trace).
The econnect procedure ensures that any gaps in the SOP document numbers are filled, in a very similar manner to the dex form. However as the whole operation is done inside a transaction, thus we don’t get dirty SQL reads. Much safer it seems!
Temp work around
As this was causing a critical issue with lost/corrupt orders, as a short term emergency fix, I introduced a custom table, where I log all allocated document numbers. All new SOP document numbers are checked against this table to see if they have been issued before releasing a new document number. As this table is maintained with table locks and inside a SQL transaction it prevents duplicate numbers from being issued.
Currently I call the econnect get next sopnumber sql stored procedure from the SQL stored procedure dex calls. This takes advantage of its seek functionality and transaction in there and prevent the form doing a seek itself outside of a transaction. I log issued document numbers in my table, and prevent the reuse of numbers in my table, sending the proc off to get another number should it have been used before. This now successfully prevents two people getting the same number as proven by many months in service and as a result, in my mind validates where the bug lies.
I use a locking table as I’ve not had a chance to think through the implications of the delete of the order header from SOP10100 when a user exits the screen outside of transactions, when the users may be scanning for the table header as its being deleted. The locking table and preventing reuse of order numbers was a sledgehammer to crack a nut, but at least I know the nut is definitely smashed open…
Drawback, currently the solution does not allow document numbers to be returned to the “pool” as they will be logged as allocated in the table, they will never be reused, even if the next document number is set back in the next document number table. I cannot see easily a transaction safe way of removing my flag record without the same concurrency issues occurring again that I’m trying to prevent in the first place – although I’ve not given this deep thought yet.
This problem used to occur about once every couple of months on previous versions of GP. However since upgrading to GP2010 we see it sometimes several times a day. However this also coincides with us increasing the activity (people and effort) in the sales team, thus I’m not convinced it is introduced with GP2010 or increase use of the SOP entry form or both.
Master Number
We have seen duplicate Master Numbers too, as has our partner, through previous versions of GP too. This has traditionally be dealt with by a stored proc that reallocates orders with duplicate master numbers to a new master number (having checked the master number is not tying together any documents). This has not caused us so much pain as it does not loose data, but my guess is that the same root cause is here too.
also see following community posts: