Completing the Picture of Ticket Quality
By Shanu Vashishtha, Deep Learning Engineer, Kahuna Labs
The Completeness Score™ measures how thoroughly the troubleshooting process was documented: the clarity, coverage, and progression of troubleshooting actions. (We covered it in a previous blog post.) But documentation quality is only half the story. A ticket can achieve a high Completeness Score with exemplary documentation, yet still leave a critical question unanswered: Did the documented solution actually work?
This is where the Credibility Score™ becomes essential. While Completeness measures how well the troubleshooting was documented, Credibility measures whether we can trust that the documented solution actually resolved the customer’s problem. Together, these two metrics provide a complete picture of ticket quality: one assesses documentation thoroughness, the other assesses solution reliability.
Why Credibility Matters: The Gap That Completeness Doesn’t Capture
Consider a ticket with a Completeness Score of 4: it documents every troubleshooting step, includes all relevant logs, and provides a clear technical narrative. However, the solution was never confirmed by the customer. Six months later, an engineer follows this well-documented trail, implements the solution, and discovers it doesn’t actually work. Tickets that score highly on Completeness but low on Credibility represent a particularly insidious problem: they appear trustworthy because they’re well-documented, but they lead engineers down incorrect paths. The documentation quality creates false confidence in unreliable solutions.
The Credibility Score evaluates dimensions that Completeness cannot assess: verification, validation, and temporal relevance. Where Completeness asks “is the documentation sufficient to replicate the process?”, Credibility asks “can we trust that this process actually solved the problem?”
When tickets lack credibility, critical problems emerge: engineers waste time implementing solutions that don’t work, knowledge bases become populated with unverified solutions that mislead future interactions, and customer satisfaction suffers.
The Complementary Relationship: Completeness + Credibility
High Completeness + High Credibility: The ideal ticket. Well-documented and verified. These tickets serve as reliable, reusable solution templates.
High Completeness + Low Credibility: Risky tickets. Thoroughly documented but unverified or unreliable. These create false confidence and should be flagged for verification.
Low Completeness + High Credibility: Incomplete but verified. The solution worked, but documentation is insufficient for replication.
Low Completeness + Low Credibility: Poor quality tickets. Neither well-documented nor verified. These should be prioritized for review.
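The four combinations above can be sketched as a small classifier. This is a minimal illustration, not Kahuna's implementation: the names `Quadrant` and `classify` are hypothetical, and it assumes both scores are integers from 1 to 5 with 4 or above counting as "high" (the cutoff used in the filtering example later in this post).

```python
from enum import Enum

# Assumption: a score of 4 or 5 counts as "high" on either metric.
HIGH_THRESHOLD = 4


class Quadrant(Enum):
    IDEAL = "high completeness, high credibility"
    RISKY = "high completeness, low credibility"
    VERIFIED_BUT_THIN = "low completeness, high credibility"
    POOR = "low completeness, low credibility"


def classify(completeness: int, credibility: int,
             threshold: int = HIGH_THRESHOLD) -> Quadrant:
    """Map a (completeness, credibility) pair to one of the four quadrants."""
    for name, score in (("completeness", completeness),
                        ("credibility", credibility)):
        if not 1 <= score <= 5:
            raise ValueError(f"{name} score must be between 1 and 5, got {score}")

    well_documented = completeness >= threshold
    verified = credibility >= threshold

    if well_documented and verified:
        return Quadrant.IDEAL
    if well_documented:
        return Quadrant.RISKY
    if verified:
        return Quadrant.VERIFIED_BUT_THIN
    return Quadrant.POOR
```

For example, `classify(5, 2)` returns `Quadrant.RISKY`, the well-documented but unverified case that should be flagged for verification.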
How LLMs Enable Objective Assessment at Scale
Support organizations process thousands of tickets daily. Manually reviewing each one for credibility is prohibitively expensive and time-consuming. Large Language Models (LLMs) provide a solution by enabling automated, consistent evaluation of ticket credibility using structured prompts with clear rubrics.
The LLM evaluates specific aspects: whether actions were taken to solve the problem, whether the customer confirmed resolution, whether the ticket was reopened, and how recent the ticket is. The system returns structured JSON output containing both a numerical score (1-5) and reasoning, enabling downstream analytics and integration with quality assurance workflows. Thousands of tickets can be evaluated consistently in a fraction of the time required for manual review.
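As a rough sketch of the downstream side, the snippet below validates one structured response before it reaches analytics. The field names `score` and `reasoning` are assumptions (the post only says the output contains a numerical score and reasoning), and `CredibilityResult` and `parse_credibility_response` are hypothetical names used for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CredibilityResult:
    score: int
    reasoning: str


def parse_credibility_response(raw: str) -> CredibilityResult:
    """Parse and validate one LLM response; raise ValueError if malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    # Assumed field names; adjust to whatever the real prompt requests.
    score = data.get("score")
    reasoning = data.get("reasoning")

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    if not isinstance(reasoning, str) or not reasoning.strip():
        raise ValueError("reasoning must be a non-empty string")

    return CredibilityResult(score=score, reasoning=reasoning.strip())
```

Rejecting malformed responses at this boundary keeps out-of-range or missing scores from silently skewing aggregate quality metrics.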
The Credibility Score Rubric
The Credibility Score uses a 5-point scale, like the Completeness Score rubric, but each score signifies something different:
Score 5: High Confidence – Concrete actions were taken, the customer confirmed resolution, the ticket was not reopened, and the ticket is recent. These tickets can be safely used as reference material.
Score 4: Likely Resolved – Actions were taken and the solution appears sound, but the customer didn’t explicitly confirm resolution, or the ticket may be somewhat older. The ticket was not reopened, indicating the solution likely worked.
Score 3: Plausible but Unconfirmed – Some actions may have been taken, but the customer didn’t confirm resolution, or the ticket is older. The solution seems reasonable but lacks explicit validation.
Score 2: Unclear or Partially Grounded – It’s unclear whether concrete actions were taken, or the ticket was reopened suggesting the initial solution didn’t fully work. These tickets shouldn’t be relied upon without additional verification.
Score 1: No Resolution – No clear actions were taken, or the ticket was reopened multiple times. The customer didn’t confirm the resolution, or the ticket is very old. These should be flagged for review.
The evaluation considers four key dimensions: Action taken (whether concrete steps were executed), Customer confirmation (whether the customer explicitly confirmed resolution), Ticket reopening (indicating potential issues with the initial resolution), and Ticket recency (recent tickets are generally more reliable than older ones).
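One way to turn the rubric and the four dimensions into a structured prompt is sketched below. The wording of the prompt, the condensed rubric text, and the function name `build_credibility_prompt` are all illustrative assumptions; the actual prompt used in production is not shown in this post.

```python
# Condensed paraphrase of the rubric above (illustrative wording).
RUBRIC = {
    5: "High Confidence: concrete actions taken, customer confirmed resolution, "
       "not reopened, recent.",
    4: "Likely Resolved: actions taken and solution appears sound, but no explicit "
       "confirmation or somewhat older; not reopened.",
    3: "Plausible but Unconfirmed: some actions may have been taken, but no "
       "confirmation or the ticket is older.",
    2: "Unclear or Partially Grounded: unclear whether concrete actions were taken, "
       "or the ticket was reopened.",
    1: "No Resolution: no clear actions, reopened multiple times, no confirmation, "
       "or very old.",
}

DIMENSIONS = (
    "Action taken",
    "Customer confirmation",
    "Ticket reopening",
    "Ticket recency",
)


def build_credibility_prompt(ticket_text: str) -> str:
    """Assemble a rubric-based prompt asking for a JSON score and reasoning."""
    rubric = "\n".join(f"{score}: {RUBRIC[score]}"
                       for score in sorted(RUBRIC, reverse=True))
    dimensions = "\n".join(f"- {name}" for name in DIMENSIONS)
    return (
        "Rate the credibility of the support ticket below.\n\n"
        f"Consider these dimensions:\n{dimensions}\n\n"
        f"Scoring rubric:\n{rubric}\n\n"
        'Respond with JSON only: {"score": <integer 1-5>, '
        '"reasoning": "<short explanation>"}\n\n'
        f"Ticket:\n{ticket_text}"
    )
```

Keeping the rubric in a single data structure means the same text can drive both the prompt and any documentation of the scale, so the two are less likely to drift apart.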
Operational Applications
The combination of Completeness and Credibility scores enables sophisticated operational optimizations:
Historical Analysis: By filtering for both high Completeness (Score ≥ 4) and high Credibility (Score ≥ 4), Kahuna AI can immediately identify tickets that are both well-documented and verified, transforming historical ticket databases into readily deployable solutions for similar tickets.
Knowledge Base Curation: By requiring both high scores for knowledge base articles, organizations ensure that only thoroughly documented and verified solutions enter the knowledge base, preventing the propagation of well-documented but incorrect solutions.
Quality Assurance: QA teams can prioritize review efforts. Tickets with high Completeness but low Credibility need customer confirmation. Tickets with low Completeness but high Credibility need documentation improvement.
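The filtering and prioritization described above might look like the following in practice. This is a sketch under stated assumptions: `ScoredTicket`, `knowledge_base_candidates`, and `qa_queues` are hypothetical names, the action labels are illustrative, and "high" is taken to mean a score of 4 or above on both metrics.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

HIGH = 4  # Score >= 4 counts as high, matching the filter described above.


@dataclass(frozen=True)
class ScoredTicket:
    ticket_id: str
    completeness: int
    credibility: int


def knowledge_base_candidates(tickets: List[ScoredTicket]) -> List[ScoredTicket]:
    """Keep only tickets that are both well-documented and verified."""
    return [t for t in tickets
            if t.completeness >= HIGH and t.credibility >= HIGH]


def qa_queues(tickets: List[ScoredTicket]) -> Dict[str, List[str]]:
    """Group the remaining tickets by the follow-up they need."""
    queues: Dict[str, List[str]] = defaultdict(list)
    for t in tickets:
        documented = t.completeness >= HIGH
        verified = t.credibility >= HIGH
        if documented and verified:
            continue  # nothing to fix; already a knowledge base candidate
        if documented:
            queues["request customer confirmation"].append(t.ticket_id)
        elif verified:
            queues["improve documentation"].append(t.ticket_id)
        else:
            queues["full review"].append(t.ticket_id)
    return dict(queues)


# Hypothetical sample data for illustration only.
sample = [
    ScoredTicket("T-1", completeness=5, credibility=5),
    ScoredTicket("T-2", completeness=5, credibility=2),
    ScoredTicket("T-3", completeness=2, credibility=4),
    ScoredTicket("T-4", completeness=1, credibility=1),
]
```

With the sample above, only `T-1` passes the knowledge base filter, while `T-2` is routed to customer confirmation and `T-3` to documentation improvement.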
Conclusion
The Credibility Score completes the picture of ticket quality that the Completeness Score begins. Where Completeness measures documentation thoroughness, Credibility measures solution reliability. Together, these two metrics provide a comprehensive assessment that enables sophisticated filtering, prioritization, and knowledge curation.
LLMs enable this objective, scalable assessment by applying consistent evaluation criteria to every ticket. The automated process makes it feasible to evaluate thousands of tickets consistently and cost-effectively, transforming support operations from experience-based models to systematically verified, documented knowledge systems.
For support organizations, it’s not enough to know that a ticket is well-documented; they also need to know whether they can trust that the documented solution actually worked. The Credibility Score answers that question, working in tandem with the Completeness Score to enable better decisions about which tickets to reference, which to review, and which to use as the foundation for knowledge that will help future customers.
