Database Blocking Issue - dbo.p_Entity ChangeLog Snapshot;1
I have an SCSM related question - We had a unfortunate outage in our production environment this morning that was caused by data performance and it seems that the culprit was dbo.p_Entity ChangeLog Snapshot;1. After reading more this seems to be part of the SCSM grooming procedures but there's not a lot of information that I was able to ascertain from good ole' Google. By chance, has anyone seen this before?
You are right. The Entity Change Log table is the table that contains the "History" tab for each item. When a field is changed from one value to another, this table records the original value and the new value plus who made the change so it can be displayed.
This table can be HUGE!
Mostly due to the connectors running and each time the connector runs for AD or ConfigMgr, they update every CI record to say it was last updated 30/11/2016 (for example) This adds 1 row to the table for each CI, every time the connector runs. If you run your ConfigMgr connector every hour..... that's a lot of records.
SCSM 2016 offers some respite from this as it is now possible to disable the date modification by the connectors via a setting in the registry.
This just removes a LOT of noise and reduces the table size etc. meaning you will not run in to performance issues like you are seeing.
Hope this answers your question.
I made a program called Advanced History Groomer to attempt to combat the sheer amount of history that builds up within SCSM to purge history records no one cares about! Obviously this is more of a preventative tool than a fix after the database is too full to run grooming jobs within the timespan it has.
SCSM was never properly programmed to handle the issue of running out of space for the Database. Even if the Transaction Log runs out of space, it ends up rolling back SCSM, causing it to not function correctly. I have created documentation regarding a slow way to remove rows, backup the database to then shrink the log file, and repeat the process…. The issue with so many rows to delete means a bigger transaction log. I have a feeling this is why the SAN was getting hit really hard with 900MB/s on the disk. This issue can be prevented by ensuring enough space, because as soon as there isn't enough free space when compared against the transaction log doing a rollback, game over. HOWEVER WITH TOO MANY HISTORY LOGS IN TOO SHORT A TIME, WORKFLOWS WOULD FAIL AND TRY TO RERUN, AFTER ENOUGH PILE-UP, IT BECAME A DDOS ATTACK TO ITSELF AND WHATEVER IT COULD COMMUNICATE WITH.... THAT'S WHAT A VERY POORLY WRITTEN PROGRAM FUNCTION CAN DO, BREAK EVERYTHING IT CAN....
Steps to remove history logs and shrink database log file without triggering a rollback:
WHERE LastModified < '2016-10-16 06:47:26'
WHERE NOT EXISTS (SELECT 1 FROM [dbo].[EntityChangeLog] ECL WHERE ECL.EntityTransactionLogId = ETL.EntityTransactionLogId)
AND NOT EXISTS (SELECT 1 FROM [dbo].[RelatedEntityChangeLog] RECL WHERE RECL.EntityTransactionLogId = ETL.EntityTransactionLogId)
AND ETL.EntityTransactionLogId < dbo.fn_GetEntityChangeLogGroomingWatermark();
Note: I've found the MS workflow groom buildup can be dealt with by setting the history retention to 300 days (or a higher number) so it will be able to rerun the query without making changes and mark itself as successful. If you find SCSM too slow, you could directly edit the value by finding it in the database itself. Using SCSM Entity Explorer, you want to navigate to System.AdminItem --> System.GlobalSetting --> System.SolutionSettings then go to object tab and load objects, you'll see these allow you to see the Administration - Settings items…. However I find it easier to use sql and query the "[ServiceManager].[dbo].[MT_GroomingConfiguration]" table and edit the value from there, remember you'll have to write the value in minutes, so 300 days x 24 hours x 60 minutes == 432000.
Additionally you can make the reserved table size smaller if you use the following command, but it takes a long time.
DBCC DBREINDEX ('EntityChangeLog')