Why does Dynamics GP use Table Heaps (no clustered indexes)?
First we should define what a table heap is. A heap is a table without a clustered index.
Data is stored in the heap without specifying an order. Usually data is initially stored in the order in which is the rows are inserted into the table, but the Database Engine can move data around in the heap to store the rows efficiently; so the data order cannot be predicted.
There are sometimes good reasons to leave a table as a heap instead of creating a clustered index, but using heaps effectively is an advanced skill. Most tables should have a carefully chosen clustered index unless a good reason exists for leaving the table as a heap.
Generally I would say we are more used to finding a clustered index on a table, but looking below you can see the more normal picture for the indexes of a GP table.
Microsoft Dynamics GP does not use many clustered indexes in its database.
If you read The Microsoft Dynamics® GP Architecture White Paper, then it would seem to be a decision based on research and data:
So which tables? Lets look for tables in the SOP series that have clustered indexes on them:
SELECT
t.name AS table_name,
I.type_desc AS index_type,
I.is_unique AS is_unique_index
FROM sys.tables AS t
INNER JOIN sys.schemas AS s
ON t.schema_id = s.schema_id
INNER JOIN sys.indexes AS I
ON t.object_id = I.object_id
WHERE I.type_desc = 'CLUSTERED'
and t.name like 'Sop%'
order by 1;
Interesting is that only seven of the SOP tables that have clustered indexes, the rest are heaps. So large tables such as SOP30300, that is particularly large, a 8GB table in one of the companies I work with, are a heap. This at first glance seems wrong. in fact the MS advice from Heaps (Tables without Clustered Indexes) is
Do not use a heap when there are no non-clustered indexes and the table is large. In a heap, all rows of the heap must be read to find any row.
I do wonder about the conditions under which the decisions to not use clustered indexes was made, but have to trust the research was accurate (/S).
So the reason? GP didn’t always run on SQL server, it was on ISAM in the old days. The database schema that GP uses owes a lot to that legacy and is the reason for many of the oddities and areas lacking in GP’s use of SQL server and I expect this use of heap tables is also from that legacy and that study showed there were no performance benefits from introducing a clustered index.
There reports of people converting the primary keys into clustered indexes with success. It is possible to write a DML script to go through the database doing this for all tables too. However this is internet forum hearsay, so I cannot vouch for the accuracy of this information nor rigour of the testing. I also fear that the work can be undone when tables are dropped during upgrades, loosing the changes. I'm pretty certain this would also invalidate your support with MS, as all that careful performance tuning that GP database schema has experienced over the years may not be compatible with this action. ;)