Yes you can - SQL Server Table Compression and Dynamics GP


2019-03-01_23-12-19

Today I attended SqlBits in Manchester UK, where there was a session “Performance tuning SQL server on crappy hardware” by Monica Rathbun.

Monica has a the fast and punchy presentation style I enjoyed. Although I had already experienced or knew most of what was covered it was still a good presentation. There was one take away I noted in my notebook to comeback to later. Now back at the hotel I’m having a look.

Row/Page compression - More Data in MEMORY

Monica was promoting the use of COMPRESSION – not just backup compression but ROW/PAGE database compression in the database engine itself.

By compressing the data in the database, the theory goes that you reduce I/O required to move the data around and allow much more relevant data to be held in SQL server’s caches and perhaps the underlying storage system’s caches too. Having more data in memory leads to a more performant system.

For some reason the existence of compression in the database as was something that had slipped under my radar, perhaps because it used to be an Enterprise feature but now its available to me in our SQL2016 Standard Edition.


This is particularly interesting to Dynamics GP users as our database is full of padded CHAR data types, has very wide tables full of only partially used data (depending on modules used) or repeating data in the case of settings flags. Dynamics GP also has many tables full of decimal columns that are all zero, again due to configuration or options in how GP is set up or what modules are active. So from the outset it feels like Dynamics GP would benefit.

“Enabling compression only changes the physical storage format of the data that is associated with a data type but not its syntax or semantics”. This means the compression occurs inside the SQL engine but is transparent to the application interacting with SQL server. There are two levels of compression of interest and available to us. ROW compression takes each data row in the table,

  • It uses variable-length storage format for numeric types (for example integer, decimal, and float) and the types that are based on numeric (for example datetime and money).

  • It stores fixed character strings by using variable-length format by not storing the blank characters.

So imagine how much room can be saved when you consider the fields in Dynamics GP are fixed length!

What is more there is another option, PAGE compression that looks at repeating data within the pages of data stored on the filesystem and compresses that data. As this is over an entire page its more heavy on CPU resources but is great where there is a lot of repeated data down the rows of a table. Wait, repeating data down rows of a column? – That is what we get lots of due to status flags and little used fields in the GP tables that vary little from top to bottom of the table.

Just look at something like Item Master table IV00101 or one of the pricing tables etc. There are distributions and settings that are the same, repeated for all items and are ripe for compression as this leads to repeated content in the pages.

Data repeated down from table IV00101 Item Master


So both the nature of the data in the tables and the use of compressible data types by Dynamics GP sure makes it look good for compression.


Compression does cause more CPU load, but unless you are pulling millions of rows then it seems insignificant, see more here:

https://sqlperformance.com/2017/01/sql-performance/compression-effect-on-performance where it is proved it has little effect.


We can run EXEC sp_estimate_data_compression_savings 'dbo', 'IV00101', NULL, NULL, 'PAGE' ;

This will show us, by sampling a subset of the table, much as the statistics does, how much space should be saved by compressing the table, without having to actually do it. Let try with Item Master in Dynamics GP.

SELECT COUNT(*) from IV00101


EXEC sp_estimate_data_compression_savings 'dbo', 'IV00101', NULL, NULL, 'PAGE' ;

Item Master compression test

So we can see the item master table goes from 81,944KB to 16,304KB that is only 20% of what it was!

No trying it with IV00108  that has SELECT COUNT(*) FROM IV00108 = 6,107,169 rows and we get 779,816 going down to 139,440, that is only 19% of what it was before.

Compression testing with IV00108

So you can see how much saving can be achieved this way, imagine the reduced I/O from having 20% of what used to be read.

Even going down to ROW compression gives you only slightly less compression but less overhead too:

2019-03-02_00-31-51

Only 46% of what it was with row level compression.


Downside

There are not many downsides. The first is technical, compressing the data take up CPU, most SQL servers are not CPU bound in terms of resources, so this should not be an issue. Typically 10-30% increase, so check your current CPU load. As this is a table by table selection, you could tackle only the main most sizable tables in GP to get the majority of the benefit without having to apply compression to every table and index. When the data is written you take a hit on compressing it to reap the rewards later. So tables and indexes with great numbers of inserts per second may cause issues (have to be big loads).

The article below has some good scripts to see what will work and what will not…

https://thomaslarock.com/2018/01/when-to-use-row-or-page-compression-in-sql-server/


Upside

Much smaller data means less I/O more in cache. And more data in memory to make for more efficient queries.


Summary

I am going to gradually add tables to compression and see what happens to CPU usage. The benefits should be substantial in terms of reads so it seems well worth pursuing.


Support

This article would indicate its supported for Dynamics GP, although the tool referenced for choosing tables to compress is no longer available, however it is possible to manually work with the database to turn on compression.

https://blogs.msdn.microsoft.com/nav/2011/07/21/sql-server-data-compression-and-microsoft-dynamics/


This is the reason going to conference is so worth while, this is only one of many things I leant or got reinforced today in the various sessions I attended.

SQL server Optimize for Ad hoc Workloads setting in server advanced properties window

TL;DR Turn this option on, from false to true

Whilst looking at a problem SQL server instance, I was on the diagnostic journey looking at why the query plan cache was getting totally cleared every few minutes. It turned out the server had bad memory setup and SQL server was suffering some bad memory pressure. However I did learn about a setting that I find hard to think of a reason why you’d not want to use it in a typical (what install is typical tim?) setup. This option Optimize for Ad hoc Workloads tells SQL server to not cache the query plan for a query until it sees it twice.

Server properties window - highlighting Optimize for Ad hoc Workloads setting

 

This is important where the SQL query work load is very varied, particularly where ad-hoc, non-stored procedure queries are being ran as they can bloat the query cache. The query cache is used to store the compiled execution plan required to execute a particular query. When a new query is encountered, the plan is calculated and put in the cache to save having to compute the plan again, if the query is seen again. The problem is that with ad-hoc queries they are unlikely to be seen again, thus SQL server is using up memory unnecessarily that could be used for better things.

I understand, but need to verify, that this is also true of ORMs such as Entity framework, that although it does create parameter based SQL to execute, in the SQL it sends to the server, the length of those parameters can vary depending upon the length of the values in those parameters. Thus this can create a large number of query plans for what are essentially very similar queries. (OK technically the length of the searched text can vary the query plan but run with this).

The setting will greatly reduce the number of plans and memory used for them on the server as only if they the query is seen twice will it be fully cached. The first time it is seen a stub is created that is enough to spot if the query is seen a second time, only then will the query plan be cached. Truely ad-hoc queries wont be seen again and space in the cache is saved. The compile time is not really worth worrying about because having to compile the query twice is no big deal as after the second compile, the further 100,000k executions can come from the cache, so proportionally its an efficiency worth having for a tiny, tiny hit of compiling the query plan twice.

There is a good article here on how start looking into if you have bloat,http://www.sqlservercentral.com/articles/SQLServerCentral/91250/

There is a similar setting of Forced Paramatization for details of this setting see this post by Brent Ozar,  https://www.brentozar.com/archive/2018/03/why-multiple-plans-for-one-query-are-bad/ basically this setting forces the server to look more carefully at the plans and infer where parameters would exist if you were to paramatize the query. This is great for reducing plans in the cache but increases processor work as it has to examine queries a lot more to work out where the parameters lie. It can also lead to bad parameter sniffing as before the plans would have been totally bespoke, now they will be shared and the exact parameter can change the optimal plan.

Protocol handler debugging in Dynamics GP (drill down) problems

In previous blog posts we have talked about the way that other applications can drill down into parts of GP by using an direct approach via a pluggable protocol handler that is installed when GP is installed. Sometimes this does not work, to debug if the protocol handler has an end point to talk to follow these instructions.

 

Get Handle.exe

Download Handle.exe which is part of the SysInternals tools that Microsoft acquired some time back, written by Mark Russinovich,

https://docs.microsoft.com/en-us/sysinternals/downloads/handle

Extract the Zip file, if you have 64 bit system use the 64 bit version, otherwise use the plain version.

Handle viewer archive extracted

Use PowerShell to pipe out and decode handle.exe output

Open a PowerShell editor on your machine (e.g. start menu, search powershell, launch Windows PowerShell ISE)

In my case I extracted the handles.exe into the C:\Users\tw\Documents\ folder, you will need to change the path for where you extracted it to…

Execute the following PowerShell Command:

C:\Users\tw\Documents\handle64.exe net.pipe -accepteula

This will accept the user agreement and pipe the output of the handle viewer to powershell for decoding.

This gives the following output.

2018-10-31_14-55-08

You can see that visual studio and Dynamics.exe have named pipes running.

If I close Dynamics GP, then run it again…

2018-10-31_15-03-22

You see that Dynamics .exe is not listed as having named pipes available anymore as it is closed.

2018-10-31_15-15-42

[System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("bmV0LnBpcGU6Ly8rLw=="))

net.pipe://+/

 

Using powershell to list all

[System.IO.Directory]::GetFiles("\\.\\pipe\\")

Diagnosing PowerShell New-GPSystemDatabase error when creating the Dynamics GP system database

The PowerShell modules for GP allow a number of DevOps activities. For example you may install the Dynamics GP Lesson (sample) company TWO and install the system database via these PowerShell modules. I take full advantage of the PowerShell modules when running and building Docker images for Dynamics GP. 

Today whilst doing this I encountered the following error when moving to build the image for a different version of Dynamics GP:

Dynamics GP - Macro failed to complete successfully

Macro failed to complete successfully.

The PowerShell script works by creating a macro file in the GP\data folder, the PowerShell script then runs DexUtils passing the macro file as a command parameter. The macro file tells utilities what it needs to do (in this case create the system database), you can see this macro in the folder below, captured after this error occurred in the Docker container. CreateSysetmDatabase.mac is the macro file that PowerShell is using to create the system database. 

CreateSystemDatabase.mac

We have no idea at this point what the error actually is, as the error description of “Macro failed to complete successfully” is not very helpful. However you will notice also in that folder the “DexMacro.log” file. Lets look in that file by reading it with the command “type DexMacro.log”.

2018-10-25_21-54-55

So below is the output of that command, the contents of the log.

GP SQL server version issue

There we can see the issue, I’m using SQL Server 2017, I already know this is only compatible with GP2018 onwards. However you will notice from the folder name I’m trying to use it with GP2016. The error is badly worded, it says you need to upgrade to server 11 from 14. It actually means you need to use an older version of SQL for this version of Dynamics GP. Something I already knew but my focus wasn’t in this area at the time I was developing the Docker container.


Please comment if you found this helpful – motivates me to blog more!