
Tips on improving your Incident management journey!

Gabriel_Lences Customer Advanced IT Monkey ✭✭✭
edited August 2021 in General Discussion

With these few simple steps, we've managed to get down from 1000 incidents per month to 20! 😁

Alright, maaaaaaybe that's not entirely true, but with a few encouraging voices from @Adam_Dzyacky , @Peter_Miklian , @Brad_Zima , @Chris_Chekaluk1 & others, I've decided to share some tips from our latest attempt to improve our Incident management, even if just by a teeny tiny bit. This is gonna be a long one, so hold on tight and grab yourself a cup of coffee while you're at it.

As an intro to all of this, I'd like to explain that up until now we didn't really have any system for "prioritizing" our Incidents (based on urgency and impact). We also knew we wanted to see which Business Services are affected the most, which is why we needed to somehow pair Business Services and Incidents together. Up until now, our Service Desk was the one and only unit responsible for filling in the Affected Business Service after an Incident was created.

This was all fine and dandy, but we wanted to take it a step further, so we wanted to:

  1. Lower the burden a bit on our Service Desk AND
  2. Educate our employees that incidents affect the services we as IT provide for them
  3. Come up with a system where the user could indicate to us WHO is impacted by the Incident and HOW

Regarding the priorities, we previously had a system where a user could control the urgency of the IR by us asking them a question "How urgent is your case?" - Needless to say, more than half of our users always picked "high".

So we as Business Analysts sat down together and agreed on restructuring the Incident form for the users a bit: give it a redesign and a makeover that would not only be intuitive, but would also accomplish the goals above. We also talked with our UX experts to come up with a simple, understandable solution for the users, and we covered a whole lot: the wording of each question, the order in which they should go, the wording of the answers, etc.

And here's the result.

Let's break it down a bit. We created an ARO with the following fields

  1. Description of the IR form - Display only text - So that users know whether they're filling out the correct form, we first wanted them to know what an IR actually IS. We drew a bit of inspiration for the wording from Cireson's own support portal, because we thought it was straight to the point, so thanks Cireson!
  2. Name of the incident - self-explanatory 😉
  3. Which area of your is affected - Required MP Enum List field - Service Classification - Remember the goals: educating our employees that behind everything IT does there is a service at the end of the day, having some sort of mechanism that pairs Services with IRs, AND lowering the burden a bit on the Service Desk? Good! What this field actually allows us to do is group Business Services into categories (so if you have a lot of services, you can group them into a reasonable number of categories), which in turn makes it possible to
  4. Optionally choose the affected service of the selected area - Optional Query Picker field - with a mapped token, so that the Query Picker shows all Business Services where Service Classification = the value chosen in the previous question. This allows the user to choose a specific Affected Service, which should (we still need to see if it actually WILL) in turn lower the pressure on the Service Desk. Now, by all means, the concern that users often don't know which service is affected is totally understandable, and that's the reason for leaving the field optional. We'd rather have it filled out only in some cases than encourage the "Jeez, this is a required field and I don't know what to put here, so I'll just mark the first one that comes up!" behaviour. We also needed a way to present simple, understandable names of the Business Services rather than our own internal names with a lot of IDs like BS201 Desktops & Laptops, so we extended the Business Services class with a simple string field and filled it out with our "user-friendly" names, such as the "Desktops & Laptops" you see on the screenshot (without the ID that would just confuse the user even more).
  5. Who is affected by the incident? - this is a simple dropdown with a few options. After the ticket is saved, it goes through a PowerShell script that maps the user's input to the enum values of IMPACT. Bear in mind that the contents of the dropdown are what we figured could work in our organization, but that doesn't mean they're globally applicable. Every company size is a bit different, so the wording of the values will vary, but you might draw some inspiration. The simple list goes as follows:
    1. Me and / or a few other people - This sets the impact of the IR to Low
    2. Team department - This sets the impact of the IR to Medium
    3. Entire building / single office location - This sets the impact of the IR to High
    4. Offices across multiple locations / companywide - This sets the impact of the IR to Critical
  6. How is the work of the affected users impacted by this incident - again, this is the same principle as above, but this one handles the URGENCY. Fun fact: the wording on this mattered a LOT to us (since we always had HIGH urgency before), so our UX experts, in addition to some quality wording, came up with a bit of "graphical enhancement" so the users can more easily grasp what we're actually asking. We also have a "Critical" value of urgency which only the Service Desk / Analysts are able to set. Users have no way of doing this through the form. And here are the contents of the dropdown:
    1. ●⚬⚬ Able to work almost as usually - Sets the urgency to Low
    2. ●●⚬ Work is partly disrupted - Sets the urgency to Medium
    3. ●●● Not able to work at all - Sets the urgency to High
  7. What are you experiencing? - self-explanatory 😉
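
To make the mapping in fields 5 and 6 concrete, here's a minimal sketch of the lookup logic in Python. This is not the actual PowerShell script (which isn't shown here), just an illustration of the idea. The names IMPACT_BY_ANSWER, URGENCY_BY_ANSWER, Classification and classify are made up for this example, and the answer texts are taken from the dropdowns above (without the dot graphics).

```python
from dataclasses import dataclass

# Answer texts as listed in the form above; the dictionary names are hypothetical.
IMPACT_BY_ANSWER = {
    "Me and / or a few other people": "Low",
    "Team department": "Medium",
    "Entire building / single office location": "High",
    "Offices across multiple locations / companywide": "Critical",
}

# "Critical" urgency is deliberately absent: per the post, only the
# Service Desk / Analysts can set it, never the end user through the form.
URGENCY_BY_ANSWER = {
    "Able to work almost as usually": "Low",
    "Work is partly disrupted": "Medium",
    "Not able to work at all": "High",
}


@dataclass
class Classification:
    impact: str
    urgency: str


def classify(who_is_affected: str, how_is_work_impacted: str) -> Classification:
    """Translate the two dropdown answers into impact and urgency values."""
    try:
        impact = IMPACT_BY_ANSWER[who_is_affected]
        urgency = URGENCY_BY_ANSWER[how_is_work_impacted]
    except KeyError as exc:
        # An unknown answer means the form and the mapping are out of sync.
        raise ValueError(f"Unrecognised form answer: {exc.args[0]!r}") from None
    return Classification(impact=impact, urgency=urgency)


ticket = classify("Team department", "Work is partly disrupted")
print(ticket.impact, ticket.urgency)
```

Keeping the mapping in plain lookup tables means the wording of an answer can change without touching the logic, as long as the table is updated along with the form.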

Now that we got through all of that, let's recap a bit! So we're giving the users the ability to:

  • choose the impact and urgency, and thus set the initial priority
  • optionally set an affected service, narrowed down by grouping the services through a service classification, which shrinks the list down to around 7-10 services per category (around 6-7 categories) from a total of around 60 business services!
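
The narrowing-down part can be sketched roughly like this in Python. It only illustrates the filtering idea behind the Query Picker, not how the portal implements it. The BusinessService type, the services_for function, and every sample value except "BS201 Desktops & Laptops" (including the classification names) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessService:
    internal_name: str   # internal name with ID, e.g. "BS201 Desktops & Laptops"
    friendly_name: str   # the extra string field shown to users
    classification: str  # Service Classification enum value


def services_for(classification: str, services: list[BusinessService]) -> list[str]:
    """Return the user-friendly names of the services in one classification."""
    return sorted(
        s.friendly_name for s in services if s.classification == classification
    )


# Hypothetical sample data; only the first internal name appears in the post.
SERVICES = [
    BusinessService("BS201 Desktops & Laptops", "Desktops & Laptops", "Workplace"),
    BusinessService("BS202 Printers", "Printers", "Workplace"),
    BusinessService("BS301 E-mail", "E-mail", "Communication"),
]

print(services_for("Workplace", SERVICES))
```

The user only ever sees the friendly names of one category at a time, while the internal name with its ID stays available for the analysts.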

Both of these values are checked by the Service Desk when the incident is first created (to see whether the user's inputs really match reality) and corrected if needed. We rolled out this change just a few days ago, so we'll need a bit more time to see whether the users adapt to it well and whether the 3 goals I outlined earlier were accomplished. But judging from the few IRs (around 30-40) that have already come in through the new form, things are looking optimistic, and we're finally making more use of prioritization and Business Services to improve our Incident Management even further! 😊

Whew, an extremely long post that should probably have been a blog post or something, somewhere, but hey! Here we go. Hope this helps any of you out there trying to make small steps to improve your IR management, even if by a tiny bit!

In case you have any further questions, feel free to ask away! ✌


  • Adam_Dzyacky Product Owner Contributor Monkey ✭✭✭✭✭

    Nice work @Gabriel_Lences!

    I think what's so great about this is that, apart from how inherently awesome it is, it lays the groundwork for a host of complementary scenarios, such as using Business Services to drive Change Requests/greater automation. For example, if one were to build an Advanced Request Offering to drive a Change Request, a few prompts that come to mind would be...

    • Upon selecting the Business Service, if it has related non-Closed Incidents, Parent Incidents, or Problems, you could optionally pick which 1/many you are attempting to fix with the Change
    • If the Business Service has Service Contacts, Owners, etc., you could randomly select a handful and use them in a Review Activity for approval of the Change with a PowerShell Activity. See here for some PowerShell to get started in this area.

    Even with the Incident scenario you've laid out, you've really opened the door to some more global automation scenarios around Incident, Change, and Business Service management. Perhaps if some threshold of Incidents is reached for the same Business Service you could:

    • Automatically shift the Status of the Service to "In Maintenance" or something else
    • Automatically create/relate a Problem to the Service (perhaps based on the Status)
    • Trigger notifications to Business Service users if the Priority meets some criteria
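
As a rough illustration of the threshold idea above, here's a small Python sketch that counts recent Incidents per Business Service and reports the services that hit a limit. It's only a sketch under stated assumptions: the function name services_over_threshold, the (created_at, service) tuple shape, the window and threshold values, and the sample data are all hypothetical, and what happens next (changing the Status, relating a Problem, sending notifications) is left out.

```python
from collections import Counter
from datetime import datetime, timedelta


def services_over_threshold(incidents, now, window, threshold):
    """Count incidents per service inside the time window.

    incidents: iterable of (created_at, service_name) tuples, where
    service_name may be None if no affected service was chosen.
    Returns {service_name: count} for services with count >= threshold.
    """
    cutoff = now - window
    counts = Counter(
        service
        for created_at, service in incidents
        if service is not None and created_at >= cutoff
    )
    return {service: n for service, n in counts.items() if n >= threshold}


# Hypothetical sample data.
now = datetime(2021, 8, 20, 12, 0)
incidents = [
    (datetime(2021, 8, 20, 9, 0), "Desktops & Laptops"),
    (datetime(2021, 8, 20, 10, 0), "Desktops & Laptops"),
    (datetime(2021, 8, 20, 11, 0), "Desktops & Laptops"),
    (datetime(2021, 8, 20, 11, 30), "Printers"),
    (datetime(2021, 8, 10, 9, 0), "Printers"),  # outside the 24 h window
    (datetime(2021, 8, 20, 11, 45), None),      # no affected service chosen
]

flagged = services_over_threshold(incidents, now, timedelta(hours=24), threshold=3)
print(flagged)
```

Incidents without an affected service are skipped on purpose, since the field is optional on the form.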
  • Matt_Howard1 Customer Adept IT Monkey ✭✭

    @Gabriel_Lences this is a fantastic write up. We are planning on revamping our service catalog within the next 6 months or so and I really want to drive our portal implementation to be the single point of contact for customers and to make it as pain free as possible. This provides a tremendous amount of information to consider for our own planning. Thank you very much!

    @Adam_Dzyacky I was just having the same conversation with someone this morning about automating changes to services based on X number of incidents in an amount of time.

  • Peter_Miklian Customer Advanced IT Monkey ✭✭✭
    edited August 2021

    @Gabriel_Lences perfect demonstration of how to leverage different system features (MP enums, BS categories, Orchestrator, ...) and team cooperation (end users, analysts, UX, IT, ...) as a whole.

    End users get a clear interface (the involvement of UX is appreciated!) and education. The Service Desk gets more accurate ticket submissions, more relevant data, and less effort for ticket categorization. Analysts get inputs of higher value and more detail in the incident description data for better prioritization of work, saving them time and work on investigation and on returning services to normal.

    It's all prepared for the next steps, which are reporting/KPIs and reviews, which may lead to a lower number of IRs (and wrongly created IRs) and help to have better services with fewer interruptions or quality reductions.

    I'm happy to see this in action and looking forward to the next steps in the countless directions @Adam_Dzyacky projected.
