Security Scoping/Performance issues - Catalog Item Groups and Open form performance
We are running into an issue where work items take a few minutes to open (CRs are most heavily affected, as they have the most related activities).
Use Case (All testing done on primary workflow server) SCSM 2012 R2 UR9
We have an Advanced Operator role, named Testing, for a specific user group.
On the Testing role:
- Queues = All access
- Configuration items = All access
- Catalog Item Groups = All access
- Tasks = All access
- Views = All access
- Form Template = All access
When Joe opens the console he can access all work items without issue.
I make one change:
Catalog Item Groups = remove All access and provide access to only the selected groups, then check the box for every listed group.
When Joe opens the console and attempts to open a CR, it takes about 3 minutes for the work item to open before editing can occur.
Effectively, this should not have changed any rights, but it had a major impact on performance.
On top of this, we have an end user role that is scoped to specific catalog item groups, and when an AD group that contains Joe is added to it, the issue occurs.
Best Answer
Tom_Hendricks: Update...
I am paraphrasing here, but the response so far is essentially "don't scope your users' templates or catalog groups and don't have very much data."
Since that (and any actual variation of it) is utterly unacceptable for our requirements, I am hoping that we will be able to at least come up with a plan to mitigate the effects of whatever design flaws would lead someone to suggest this in a non-joking manner.
I did not understand at the time why SCSM 2016 was a performance update without any significant new features over 2012 R2. Now, not only do I have a deep understanding why, but I am hoping they have the same idea for an upcoming UR or perhaps a 2016 R2, and soon.
I will share any good information that comes out of this, as we continue.
Not that this would be available quickly enough for either of us, but this is why I created this feature request: https://community.cireson.com/discussion/2967/replace-scsm-object-scoping-in-portal-for-performance#latest. It would be nice to just bypass this altogether with a better solution. Using AD group membership in the portal seems to work just fine, when it can be used.
Answers
I don't have this issue now, but I can remember something along the same lines happening when we were implementing the system.
However, nowadays we schedule truncation of CI$User, CI$DomainGroup, and LastModified. This is done on a monthly basis, and we don't seem to have experienced the issue since.
I am not saying this is the answer by any means, just that we don't appear to experience it.
That being said, the performance hit you're taking seems quite substantial, and completely out of proportion to the operation that needs to be performed.
I don't know if some kind of misconfiguration could cause this issue, but you could check whether any procedures take longer to run when you have scoped the access in the user role. You can check which queries are eating up the CPU with this: https://www.johnsansom.com/how-to-identify-the-most-costly-sql-server-queries-using-dmvs/
And also this post regarding performance counters could be relevant: https://www.johnsansom.com/how-to-identify-the-most-costly-sql-server-queries-using-dmvs/
But it does mostly sound like a bug.
When access is limited for just one area / object / class, SCSM has to check all objects, since, for example, a CR can contain a CI to which access is limited, or a relationship to an RO to which access is limited. Therefore, to me, it makes sense that performance is affected when using scoped access.
But again, that being said, there's something wrong if it takes ~3 mins to open a CR - that's just not right. If you can, please let us know, when you get more info from MS!
Btw, are you running SCSM 2016 or 2012 R2?
We are not seeing too much of an issue when opening a ticket, but saving for scoped users takes an unacceptable amount of time. We do not scope much other than templates, but have plans to scope many more objects (Service Offerings, SLOs), so this is very concerning. Working with the PrePopulateCI and other settings, for example, has not provided any relief.
We are on 2016, but I do not expect that to make much of a difference when compared to a 2012 R2 environment.
I have a question though: Do you have the Survey App installed/configured? The scoping on one of the Survey roles has me wondering if it could be (partially?) to blame.
Still fighting red tape to get the ticket open.
I am running another add-on that I fear might be compounding the issue.
http://cireson.com/blog/enhancing-activity-management-with-some-orchestrator-and-powershell-magic/
The more activities the work item has, the longer it takes to load. But that could just be the linked work items, not the add-on.
Still no root cause from MS yet, but we are still going back and forth and working through this.
For those who use a SQL 2016 (or 2014) DB:
- Set the compatibility level of the ServiceManager DB to 110 (equivalent to 2012 R2)
- SQL 2014 and SQL 2016 use a different calculation for the Cardinality Estimator than 2012 did. SCSM performs much more poorly with the new calculation, but one can set a trace flag (9481) to revert back to the old calculation, per this link: new-functionality-in-sql-server-2014-part-2-new-cardinality-estimation/ (affects all DBs in the instance).
- The particular SP causing nearly all of the issues is dbo.p_UserRoleSelectAccessToMultipleEntities. It is possible to alter this SP so that the trace flag applies only to it rather than to the entire instance, using OPTION (QUERYTRACEON 9481).
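For anyone handing these options to a DBA, here is a minimal sketch that only assembles the T-SQL text for the three workarounds above. It does not connect to SQL Server or run anything. The function name and dictionary keys are made up for illustration, and the database name defaults to ServiceManager as in the post. Where exactly the query hint belongs inside dbo.p_UserRoleSelectAccessToMultipleEntities is not shown, because the body of that procedure is not quoted in this thread.

```python
def build_cardinality_workarounds(database="ServiceManager",
                                  compat_level=110,
                                  trace_flag=9481):
    """Return the T-SQL text for each workaround; nothing is executed."""
    return {
        # Option 1: run the database at the SQL Server 2012 compatibility level.
        "compat_level": (
            f"ALTER DATABASE [{database}] "
            f"SET COMPATIBILITY_LEVEL = {compat_level};"
        ),
        # Option 2: enable the trace flag globally (-1), which affects every
        # database in the instance. A flag set this way is lost on restart
        # unless it is also added as a -T startup parameter.
        "instance_trace_flag": f"DBCC TRACEON ({trace_flag}, -1);",
        # Option 3: per-statement hint, appended to the affected query
        # inside the stored procedure instead of changing the instance.
        "query_hint": f"OPTION (QUERYTRACEON {trace_flag})",
        # Check which compatibility level the database currently uses.
        "verify": (
            "SELECT name, compatibility_level FROM sys.databases "
            f"WHERE name = N'{database}';"
        ),
    }


if __name__ == "__main__":
    for label, sql in build_cardinality_workarounds().items():
        print(f"-- {label}\n{sql}\n")
```

Options 1 to 3 are alternatives with different blast radii, so it is probably worth trying the narrowest one (the query hint) first and measuring before touching instance-wide settings.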
For everyone:
- There is a registry setting of "GroupCalcPollingIntervalMilliseconds" which defaults to a very low/frequent number that contributes to some of the performance degradation associated with scoping. It can be changed with the following command (you can adjust the milliseconds as desired):
  REG ADD "HKLM\SOFTWARE\Microsoft\System Center\2010\Common" /v "GroupCalcPollingIntervalMilliseconds" /t REG_DWORD /d 600000 /f
- Indexing is a must for not only the ServiceManager DB, but also the ServiceManagement DB for the Cireson portal. Talk with your DBA team about this--you will likely discover a significant number of opportunities to apply indexes that speed up operations that your users complain about.
- Check the number of concurrent users per management server. Basic, right? But be aware that one particular user might show up many multiple times due to having more than one browser open or the API and SDK calls (including some workflows) showing up more than once for separate calls. The calculation for how much CPU, RAM, and how many management servers you should have in your environment is essentially based on this concurrent user count (simplifying a little here). Do not count your users once if the system counts them more than once. Our estimate when we first designed the environment was significantly low due to expecting users to use only one session at a time.
- Check your groups and catalog item groups, and make sure there are no circular references (two groups use each other to determine the inclusion of objects for each other--which is as bad as it sounds). I had one of these--with absolutely no idea how it was allowed to exist or how/when it was created--and correcting it provided an instant performance boost. You probably do not look at these very often, so just have a look to make sure you do not see anything like this in yours, once in a while.
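The circular-reference check in the last tip can be done by eye for a handful of groups, but with many groups a small script may help. Below is a rough sketch that assumes you have already exported the "group includes group" relationships into a plain dictionary. How you export them from SCSM is not covered here, and the function name and the group names in the example are hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(includes: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one circular chain of group names, or None if there is none.

    `includes` maps each group name to the groups it pulls members from.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {group: UNSEEN for group in includes}
    path: List[str] = []

    def visit(group: str) -> Optional[List[str]]:
        state[group] = IN_PROGRESS
        path.append(group)
        for child in includes.get(group, []):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                # We reached a group that is already on the current path.
                return path[path.index(child):] + [child]
            if child_state == UNSEEN:
                found = visit(child)
                if found:
                    return found
        path.pop()
        state[group] = DONE
        return None

    for group in list(includes):
        if state[group] == UNSEEN:
            found = visit(group)
            if found:
                return found
    return None


if __name__ == "__main__":
    # Hypothetical export: two groups that reference each other.
    groups = {
        "Hardware Offerings": ["Laptop Offerings"],
        "Laptop Offerings": ["Hardware Offerings"],
        "Software Offerings": [],
    }
    print(find_cycle(groups))
```

The result lists the chain in order, with the starting group repeated at the end, which should be enough to tell which group definition to correct.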
As I said, I am still working with Microsoft, but we have made significant progress. Hopefully, these tips can help you as well!
Are you referring to users of SCSM 2012 R2?
We are currently building SCSM 2016 running on SQL 2016.
And thanks for the great info. Will review all the points.
This might also apply to SCSM 2012 R2, and I have my own personal suspicions that it does (although I also do not recall if SQL 2016 is even supported for SCSM 2012 R2), but I could not say for sure. If you still have a ticket open, Microsoft would definitely be able to speak to that.
Just to note we are running SCSM 2012 R2 UR9 with SQL 2014