Everyone emphasises the importance of meaningful measurements to effectively manage IT service delivery, but how do we get it right? As a consultant, I often encounter organisations flooded with metrics that ultimately lack substance. It’s all too easy to get caught up in the excitement of technology (building dashboards and scorecards) only to lose sight of what truly matters. This blog looks at some key things to measure when assessing the overall health of your ITSM practices.
Focus on Value:
Adopting one of ITIL’s guiding principles, ensure that everything your organisation does directly or indirectly contributes value to stakeholders. Begin with a clear and relevant goal or mission statement that supports your processes. Think of your goals as statements of intent—articulating what you aim to achieve in your practice. For example:
• Incident Management: Resolve incidents quickly and with minimal business impact, ensuring nothing is lost, ignored, or overlooked.
• Request Management: Fulfil service requests effectively and efficiently.
• Change Enablement: Deploy changes in a manner that is effective, efficient, and safe.
• Problem Management: Identify the root causes of incidents and provide solutions or workarounds.
• IT Asset Management: Manage, control, and protect your IT asset estate.
• Service Catalogue Management: Create a unified view of IT services so that users can access and use them effectively.
• Knowledge Management: Ensure the right information is available to the right people at the right time to support informed decision-making and improve the efficiency and effectiveness of IT service management processes.
Being clear, concise, and aligned with your mission sets the tone for success, and makes process goals transparent to all stakeholders.

Make Your Metrics Logical:
Make sure your Critical Success Factors or CSFs directly translate to the mission statement and goals you defined in the previous step. Taking incident management as an example, you could derive three critical success factors from the goal stated above, which start breaking down your critical deliverables into smaller chunks. Example CSFs for incident management could include:
• Resolve incidents within SLA
• Mitigate the adverse impact of incidents
• Manage incidents consistently
When defining your CSFs you’re documenting the factors that will support the desired outcome, the conditions needed to create that outcome as well as the assets and capabilities needed to achieve the stated goals and objectives. Now we have a clear mission or goal statement and some underpinning CSFs to make it achievable.
Use KPIs to Manage the Detail:
Key Performance Indicators or KPIs are the next level down from CSFs and start getting into the granularity of measurements and metrics. KPIs focus on performance and can be used at an individual, team and organisational level. Some real-life examples of KPIs include:
• 99.5% of all incidents are resolved within SLA (linking back to the "Resolve incidents within SLA" CSF)
• A Mean Time To Restore Service (MTTRS) of less than 4 hours for all Priority 1 incidents (linking back to the "Mitigate the adverse impact of incidents" CSF)
• Percentage of incidents that are older than 5, 30 and 90 days (backlog management)
• Percentage of major incidents
• Percentage of incidents fixed at the first point of contact (first time fix rate) and escalation rate
Make sure that all your KPIs are directly transferable to their related CSFs so you can see a clear line of progression. The idea is that you should be able to track your metrics from KPIs and statistics back to the vision statement. Having a logical sequence of deliverables ensures everyone knows what is expected of them.
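To make the KPIs above concrete, here is a minimal sketch of how they could be calculated from an incident export. The record layout, field names and sample data are all hypothetical and not taken from any particular ITSM tool, and the backlog percentage assumes the denominator is open incidents, which is only one reasonable reading of that KPI.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical incident records; every field name here is an assumption.
incidents = [
    {"id": "INC001", "priority": 1, "opened": datetime(2024, 5, 1, 9, 0),
     "resolved": datetime(2024, 5, 1, 11, 0), "sla_hours": 4, "first_contact_fix": False},
    {"id": "INC002", "priority": 1, "opened": datetime(2024, 5, 2, 9, 0),
     "resolved": datetime(2024, 5, 2, 15, 0), "sla_hours": 4, "first_contact_fix": False},
    {"id": "INC003", "priority": 3, "opened": datetime(2024, 5, 3, 9, 0),
     "resolved": datetime(2024, 5, 3, 10, 0), "sla_hours": 24, "first_contact_fix": True},
    {"id": "INC004", "priority": 3, "opened": datetime(2024, 4, 1, 9, 0),
     "resolved": None, "sla_hours": 24, "first_contact_fix": False},
    {"id": "INC005", "priority": 4, "opened": datetime(2024, 5, 8, 9, 0),
     "resolved": None, "sla_hours": 72, "first_contact_fix": False},
]


def pct_within_sla(records):
    """Percentage of resolved incidents that were resolved within their SLA."""
    resolved = [r for r in records if r["resolved"] is not None]
    if not resolved:
        return 0.0
    met = sum(1 for r in resolved
              if r["resolved"] - r["opened"] <= timedelta(hours=r["sla_hours"]))
    return 100.0 * met / len(resolved)


def mttrs_hours(records, priority):
    """Mean time to restore service, in hours, for one priority level."""
    durations = [(r["resolved"] - r["opened"]).total_seconds() / 3600
                 for r in records
                 if r["priority"] == priority and r["resolved"] is not None]
    return mean(durations) if durations else None


def pct_open_older_than(records, days, now):
    """Percentage of open incidents older than the given number of days."""
    open_records = [r for r in records if r["resolved"] is None]
    if not open_records:
        return 0.0
    old = sum(1 for r in open_records if now - r["opened"] > timedelta(days=days))
    return 100.0 * old / len(open_records)


def first_time_fix_rate(records):
    """Percentage of resolved incidents fixed at the first point of contact."""
    resolved = [r for r in records if r["resolved"] is not None]
    if not resolved:
        return 0.0
    return 100.0 * sum(1 for r in resolved if r["first_contact_fix"]) / len(resolved)


now = datetime(2024, 5, 10, 9, 0)  # fixed reporting date so results are repeatable
print("Within SLA %:", round(pct_within_sla(incidents), 1))
print("P1 MTTRS (hours):", mttrs_hours(incidents, 1))
for age in (5, 30, 90):
    print(f"Open incidents older than {age} days %:", pct_open_older_than(incidents, age, now))
print("First time fix %:", round(first_time_fix_rate(incidents), 1))
```

Keeping each KPI as its own small function makes it easier to trace a number on a report back to the CSF it supports.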

Compliance Matters:
Being transparent is important, and if your industry has compliance metrics, it may even be a legal requirement. Compliance measurements can be key indicators of potential risk, so we need to be able to report on them. Some compliance metrics could include:
• Number of compliance items open
• Number of compliance items closed
• Percentage of internal audits completed on time
• Percentage of internal audits completed in scope
• Number of external audit comments
• Number of external audit findings
Use XLAs to represent CX:
We all know that SLAs are a solid way to measure customer satisfaction, but they don’t give us the whole picture. SLAs focus on the technical side (uptime, downtime, capacity and performance levels), but what does that mean to the average end user? XLAs or eXperience Level Agreements are a great way to capture customer experience as part of your reporting offering. Rather than focusing on technical details, XLAs focus on people. They’re about understanding what truly matters to our users. What keeps them up at night? What frustrations do they face? XLAs force us to confront those realities and set realistic expectations. XLAs can be used to capture information such as:
• How consistent the user experience is across all service desk platforms. For example, self-help, instant messaging, email or over the phone.
• Service desk approach; how was the issue dealt with? Was the issue understood from a business perspective?
• The knowledge level of support staff
• How empowered our end users feel. Do they feel comfortable engaging with self-service and self-help?
• How proactive the support is. One approach could be to track the number of proactive actions taken, such as event management, preventive maintenance, and proactive communication of potential outages.
Automate As Much As Possible:
There will always be aspects of reports that need to be done manually, such as management commentary, progress on any backlog, and incidents that have failed SLAs. They’re the value-add. However, the more day-to-day aspects, like incident volumes, totals that breached SLA, and outstanding service requests by support team, are good candidates for automation. Automating wherever possible lets you focus your efforts on insights that can drive real improvements.
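As an illustration of the routine figures that lend themselves to automation, here is a minimal sketch that produces incident volumes, SLA breach totals and outstanding requests by team in one pass. The ticket structure, field names and team names are hypothetical; a real export from your ITSM tool will look different.

```python
from collections import Counter

# Hypothetical ticket export; field names and values are assumptions.
tickets = [
    {"type": "incident", "team": "Service Desk", "status": "closed", "sla_breached": False},
    {"type": "incident", "team": "Network", "status": "closed", "sla_breached": True},
    {"type": "request", "team": "Service Desk", "status": "open", "sla_breached": False},
    {"type": "request", "team": "Desktop", "status": "open", "sla_breached": False},
    {"type": "request", "team": "Service Desk", "status": "open", "sla_breached": False},
    {"type": "request", "team": "Desktop", "status": "closed", "sla_breached": False},
]


def routine_summary(records):
    """Build the day-to-day figures that do not need manual commentary."""
    incidents = [r for r in records if r["type"] == "incident"]
    open_requests = [r for r in records
                     if r["type"] == "request" and r["status"] == "open"]
    return {
        "incident_volume": len(incidents),
        "incidents_breached_sla": sum(1 for r in incidents if r["sla_breached"]),
        "open_requests_by_team": dict(Counter(r["team"] for r in open_requests)),
    }


summary = routine_summary(tickets)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the numbers generated automatically, the manual effort can go into the commentary that explains them.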
Much has been written about how GenAI can improve ITSM reports and metrics. GenAI can do a significant amount of heavy lifting, analysing large volumes of data from ITSM tools to identify trends, detect anomalies, and generate real-time insights, ultimately saving time and reducing manual effort. Depending on the tool, GenAI can also visualise key metrics like SLA compliance or incident trends, automate root cause analyses, and offer predictive analytics to improve service resilience. Real-life examples of ITSM toolsets using AI include our friends at Hornbill, who use it to support codeless reporting, and ManageEngine, who use it to deliver a more proactive support model.
Keep Moving Forward:
To ensure continuous improvement, we should integrate it into our service desk metrics. This demonstrates our commitment to quality and helps service levels improve consistently over time. A simple starting point could be establishing an improvement register and committing to adding a specific number of suggestions each month. Other ideas include:
• Re-opened rates: Track the number of incidents reopened after initial resolution so that we can ensure our fix activities are fit for purpose.
• Repeat incident rates: Monitor the frequency of recurring incidents so that we can identify trends and involve problem management if appropriate.
• Knowledge base usage: Analyse how often and which knowledge base articles are accessed and ensure they are kept up to date.
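The first two ideas above can be sketched in a few lines. This is an illustration only: the record fields are hypothetical, and treating "three or more incidents in the same category" as a repeat is an assumed threshold that you would tune for your own environment.

```python
from collections import Counter

# Hypothetical resolved incidents; field names are assumptions.
resolved_incidents = [
    {"id": "INC101", "category": "VPN", "reopen_count": 0},
    {"id": "INC102", "category": "VPN", "reopen_count": 1},
    {"id": "INC103", "category": "Email", "reopen_count": 0},
    {"id": "INC104", "category": "VPN", "reopen_count": 0},
    {"id": "INC105", "category": "Printing", "reopen_count": 2},
]


def reopened_rate(records):
    """Percentage of resolved incidents that were reopened at least once."""
    if not records:
        return 0.0
    reopened = sum(1 for r in records if r["reopen_count"] > 0)
    return 100.0 * reopened / len(records)


def repeat_categories(records, threshold=3):
    """Categories seen at least `threshold` times: candidates for problem management."""
    counts = Counter(r["category"] for r in records)
    return {category: n for category, n in counts.items() if n >= threshold}


print("Reopened rate %:", reopened_rate(resolved_incidents))
print("Repeat categories:", repeat_categories(resolved_incidents))
```

Grouping by category is a deliberately crude way to spot repeats; grouping by affected service or configuration item may suit your data better.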
That’s my take on creating meaningful metrics. What do you think? Please let me know in the comments.