• Why Escalations Happen—and How Predictive AI Can Prevent Them

    By John Ragsdale, SVP Marketing, Kahuna Labs

    A customer reports intermittent failures. There’s a vague error string. A screenshot. A “started happening last week.” Your team replies quickly, asks for logs, shares a few standard steps. The customer responds… slowly. They’re busy. They’re not sure where to find the right file. The thread stretches into a day, then two.

    Meanwhile, the engineer assigned to the case is doing what good engineers do: trying to reconstruct context from fragments. Skimming old cases. Checking release notes. Asking a senior teammate, “Have you seen this before?” The case isn’t “hard” yet—but it’s already drifting. Progress is measured in messages, not evidence.

    Then the calendar pressure hits. A renewal is near. A launch is blocked. Someone senior gets looped in on the customer side. And suddenly the temperature changes: “Can you escalate this?” Not because the issue is impossible (the evidence points to an ordinary code bug) but because confidence is gone. The customer doesn’t feel momentum. Your team doesn’t feel leverage. Everyone is operating with partial information and rising stakes.

    This is what makes escalations so maddening: they often feel like a moment, but they’re really a trajectory—set in motion early, by missed signals that were present long before the escalation email arrived.

    This is where predictive AI becomes less about “responding faster” and more about preventing the conditions that create escalations in the first place.

    The Real Reasons Escalations Happen (Beyond “It’s A Hard Issue”)

    In complex product support, escalations tend to come from a few recurring root causes:

    1) The first few steps are wrong—or delayed

    Most escalations start quietly. The initial triage misses a key diagnostic. The first response is generic. The engineer spends hours recreating context. Customers don’t escalate because the issue is complex—they escalate because they feel uncertainty and slow progress.

    2) Tribal knowledge is inaccessible when it matters

    Your best engineers carry “pattern memory” in their heads: which symptoms imply which root causes, what to ask next, what diagnostic will quickly collapse the search space. When that knowledge stays trapped in people—or buried in raw tickets—other engineers take longer, take more loops, and escalations happen more often.

    3) Customer reality is unique, and documentation can’t keep up

    Even strong knowledge bases cover only a fraction of real-world scenarios because customer environments vary significantly (configs, integrations, versions, constraints). Escalations spike when the support model assumes “one canonical flow” but reality is “a thousand variations.”

    4) Support is operating without a map

    Many teams are effectively navigating a maze without a current floor plan: fragmented tools, inconsistent ticket narratives, missing context, and no shared visibility into how problems typically evolve from first symptom to resolution.

    Signals That Predict Escalation (Often Hours or Days Earlier)

    Escalation risk usually telegraphs itself through patterns like:

    • Stalled progression: multiple back-and-forth cycles without net-new evidence (no new diagnostics, no narrowing hypotheses).
    • High “research load”: long time spent gathering context, searching past tickets, or asking internal SMEs what to do next.
    • Mismatch of actions: customer-doable steps sent as engineer-only tasks (or vice versa), creating delays and frustration.
    • Low-quality precedent: the “similar past tickets” exist, but they’re noisy, incomplete, or not aligned to the current stage of troubleshooting.
    • Version/config sensitivity: the same symptom behaves differently across versions or specific configurations—so generic “best practices” fail.

    Individually, these signals feel like normal variance. Together, they’re a pattern: this case is drifting toward escalation.
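    To make that concrete, here is a minimal sketch of how signals like these might be combined into a single drift score. The field names, weights, and threshold are illustrative assumptions for this post, not Kahuna’s production logic.

    ```python
    from dataclasses import dataclass

    @dataclass
    class CaseSignals:
        loops_without_new_evidence: int  # back-and-forth cycles with no new diagnostics
        research_hours: float            # time spent rebuilding context or asking SMEs
        mismatched_actions: int          # customer-doable steps sent as engineer tasks, etc.
        precedent_quality: float         # 0.0 (noisy, incomplete) .. 1.0 (dense, stage-aligned)
        version_sensitive: bool          # symptom known to vary by version/config

    def drift_score(s: CaseSignals) -> float:
        """Combine individually weak signals into one 0..1 escalation-risk score."""
        score = 0.0
        score += min(s.loops_without_new_evidence, 5) * 0.10  # stalled progression
        score += min(s.research_hours / 8.0, 1.0) * 0.25      # high research load
        score += min(s.mismatched_actions, 3) * 0.10          # mismatch of actions
        score += (1.0 - s.precedent_quality) * 0.20           # low-quality precedent
        score += 0.15 if s.version_sensitive else 0.0         # version/config sensitivity
        return min(score, 1.0)

    # A case that looks like "normal variance" signal by signal can still score high:
    case = CaseSignals(3, 6.0, 1, 0.3, True)
    if drift_score(case) > 0.6:  # threshold is an assumption; tune on your own history
        print("flag for intervention before the customer forces it")
    ```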

    How Predictive AI Prevents Escalations: From “Ticket Handling” to “Path Guidance”

    The most impactful shift is moving from AI that answers questions to AI that understands the troubleshooting journey.

    At Kahuna, the foundation is a Troubleshooting Map™ built from historical ticket journeys—where tickets are reconstructed into step-by-step “snapshots” and clustered into repeatable paths. That means the AI can recognize not only “what this issue is,” but what stage you’re in and what paths usually succeed from here.

    Three preventative strategies become possible:

    1) Predict escalation risk by detecting “drift” early

    When a case starts to diverge from successful historical paths—too many loops, missing diagnostics, delayed next steps—predictive alerts can trigger intervention before the customer forces it. (This is very different from simply routing “angry customers” faster.)

    2) Recommend the next best step with confidence, not guesswork

    Not all guidance deserves automation. Kahuna-style approaches use scoring—Credibility Score™, Completeness Score™, and a Complexity Score for recommended paths—so engineers can see when the system is drawing from dense, high-quality precedent versus thin, ambiguous signals.

    3) Prevent escalation by removing effort, not adding process

    When confidence is high, preventative automation can do the work that typically causes delays: auto-collect diagnostics, propose probing questions, and standardize decision flows—so the case moves forward with momentum and clarity.

    The Preventative Mindset Shift

    Escalation prevention isn’t a “new policy.” It’s a capability. The support models that win in the next era will act less like reactive firefighters and more like orchestrators—using AI to make the invisible visible: the patterns, the paths, the signals, and the next steps that keep cases from ever becoming escalations.

    The goal isn’t to eliminate every escalation. It’s to ensure escalations happen for the right reasons—true novelty and exceptions—not because the system couldn’t see what was coming.

    Escalations have a strong correlation with customer satisfaction with support, and high rates of escalation can impact the likelihood of renewal. Leveraging AI to prevent escalations from happening eliminates a lot of friction in the customer experience, and enables support to be seen as a relationship builder, not an element that drives difficult conversations (and potential cost concessions) come renewal time.

  • Frontline Productivity and the Right AI: Why Context Is the New Competitive Edge

    By John Ragsdale, SVP Marketing, Kahuna Labs

    Every so often, a piece of research captures a shift you can feel happening in the market.
    Constellation Research’s new paper, Augmenting and Accelerating Frontline Productivity, by industry veteran R “Ray” Wang, does exactly that. It’s one of those “Big Idea” moments that crystallizes what many of us have been sensing: the next leap in enterprise performance won’t come from automating the back office — it will come from empowering the frontline.

    The report frames “frontline productivity” as an emerging market category focused on increasing decision velocity—equipping frontline teams with AI that can augment judgment, accelerate actions, and improve consistency and quality at scale. Kahuna Labs is the perfect example of this category.

    The Frontline Is the New Growth Engine

    For decades, innovation has flowed from the top down. Executives got dashboards, managers got analytics, and operations got automation. But the people at the edge of the business — the service engineers, technicians, and customer-facing teams — have too often been left behind.

    That’s starting to change. Ray’s analysis makes a compelling case that AI is reshaping the structure of work itself. The old command-and-control pyramid is collapsing into what he calls the “diamond organization” — smaller teams, more autonomy, and more leverage from digital labor.

    It’s not about replacing people. It’s about giving them decision velocity: the ability to make faster, smarter, and more contextual choices at the moment of truth. And nowhere is that more critical than on the front lines — where a single decision can make or break a customer relationship.

    From Automation to Advice

    Ray’s framework for AI maturity really resonated with me: from augmentation, to acceleration, to automation, to agents, and finally to advisors.

    Most organizations are still stuck somewhere in the middle. We’ve built tools that do more — but we haven’t yet built systems that understand more. The leap from automation to advice is where the real transformation begins.

    That’s when AI stops being a tool for efficiency and starts becoming a partner in judgment. It’s when the machine isn’t just executing instructions but anticipating what a skilled human would do next — using context, history, and intent to guide decisions.

    That’s what frontline productivity in the AI era really means.

    Why Context Is Everything

    Here’s the hard truth: not all AI is capable of delivering a productivity leap.

    Legacy SaaS systems were never designed for frontline work. They live outside the organization’s network, disconnected from the data that makes decisions meaningful — things like customer configuration, product version, or the subtle differences between one client environment and another.

    As Ray puts it, “Legacy SaaS AI lacks contextual relevancy.” Without that, AI can’t deliver precision or trust.

    The future belongs to in-network AI — systems deployed inside the enterprise environment, trained on its own tribal knowledge, and fluent in its unique operating reality. These systems don’t generalize. They personalize. They reason in context.

    That’s what enables what Ray calls decision automation — AI that doesn’t just analyze, but acts, learns, and improves with every interaction.

    From Insight to Impact

    This shift has enormous implications for how we think about productivity. The goal isn’t just “doing more with less.” It’s about achieving what Constellation calls exponential efficiency — breakthroughs that are ten times faster, better, and cheaper, simultaneously.

    And it’s not a theory — we’re seeing it play out in real organizations. When frontline teams gain AI that’s context-aware, predictive, and prescriptive, they stop firefighting and start foresighting. They move from reacting to issues to preventing them.

    Most importantly, they’re free to focus on what humans do best: empathy, creativity, and problem-solving.

    The Human-AI Partnership

    In my conversations with business leaders, I often remind them that AI isn’t the end of human work — it’s the beginning of better human work.

    The question isn’t what can we automate? It’s where do we want humans to shine?

    Ray’s seven-factor model for balancing “machine scale” and “human touch” should be required reading for every executive designing next-generation services. It reminds us that the point isn’t to eliminate people from the process — it’s to elevate them within it.

    AI can manage repetition, volume, and complexity. But humans still own creativity, empathy, and trust. The organizations that thrive in this new era will be the ones that know how to orchestrate both.

    A Moment of Alignment

    Having known Ray for over twenty years, I can say he’s rarely wrong about where the industry is heading. This paper is another example of his ability to see the future a few years early.

    For those of us working to bring AI to the front lines, it’s both validation and motivation.

    The message is clear: the future of productivity belongs to frontline workers. The companies that get there first — with AI that’s deployed in-network, context-aware, and human-centric — will define the next generation of enterprise performance.

    That’s a challenge worth rallying around. And it’s one I’m proud to be part of.

    Here’s a link to access the full report, “Augmenting and Accelerating Frontline Productivity.”

  • How to Reduce Support Backlogs Without Hiring More Engineers

    Eliminating the Quiet Tax of Backlog Management

    By John Ragsdale, SVP Marketing, Kahuna Labs

    Every support leader knows the feeling: you make real progress—then the backlog creeps right back.

    Not because ticket volume exploded overnight. Not because your support engineers suddenly got slower. But because the queue is full of a specific kind of work: long-running, context-heavy cases where the “next step” isn’t obvious, and forward motion depends on reconstructing a story from fragments.

    Those tickets carry a quiet tax. Every time someone picks one up, they start by paying it again: reread the thread, re-assemble context, re-search for precedent, re-ask the same foundational questions, re-check whether anything has changed since the last touch. It’s responsible work, but it’s also compounding work—effort spent rediscovering what the organization already knows.

    That tax rarely shows up in productivity dashboards. But it’s why backlogs persist even when teams are working at full capacity.

    Reducing backlog without hiring more engineers starts with one mindset shift: stop treating every ticket like a standalone investigation. Instead, treat your ticket history as a living body of troubleshooting journeys—so that whenever your organization learns something new, you can apply it to every similar open case and move the queue forward with less effort per ticket.

    1) Start by Attacking the Biggest Hidden Time Sink: Re-Reading and Re-Research

    In complex support environments, the real drag on time isn’t writing customer communications or documenting the case. It’s the time engineers spend reconstructing context:

    • parsing long case histories,
    • hunting for “similar” tickets,
    • figuring out what stage they’re actually in,
    • and deciding what to do next.

    This is why even strong teams can feel stuck. The backlog becomes a knowledge swamp: the answers may exist somewhere in history, but they aren’t easy to find, trust, or apply quickly—especially for engineers who didn’t live the original case.

    The fastest backlog wins come from making progress cheaper: less time spent rebuilding context, more time spent advancing the case.

    2) Break Work Into “Snapshots,” Not Tickets

    Similarity search often fails because tickets are messy: multiple phases, resets, tangents, missing diagnostics, and a lot of human conversation. Comparing entire tickets to entire tickets produces “close” matches that still don’t tell you what to do next.

    A more useful unit is the snapshot: a point-in-time representation of the case—what’s known, what’s been tried, what the customer environment looks like, and what evidence is available right now.

    When you can identify the current snapshot, you can stop asking, “Have we seen this issue before?” and instead ask:

    “Have we seen this state before—and what reliably moves the case forward from here?”

    That shift alone reduces the time spent reinventing the wheel across the queue.
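    As a rough illustration, a snapshot can be modeled as a small state record, and “have we seen this state before?” becomes overlap between state fields rather than whole-ticket text comparison. The schema and similarity measure below are simplified assumptions for the sake of the example.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Snapshot:
        """A point-in-time state of a case (a deliberately minimal schema)."""
        symptoms: set[str]           # what's known
        steps_tried: set[str]        # what's been tried
        environment: dict[str, str]  # e.g. {"version": "4.2", "integration": "sap"}
        evidence: set[str]           # diagnostics/artifacts available right now

    def state_similarity(a: Snapshot, b: Snapshot) -> float:
        """Average Jaccard overlap across state fields, not across raw ticket threads."""
        def jaccard(x: set, y: set) -> float:
            return len(x & y) / len(x | y) if (x | y) else 1.0
        return (jaccard(a.symptoms, b.symptoms)
                + jaccard(a.steps_tried, b.steps_tried)
                + jaccard(a.evidence, b.evidence)
                + jaccard(set(a.environment.items()), set(b.environment.items()))) / 4.0
    ```

    Matching on state like this is what lets precedent answer “what reliably moves the case forward from here,” instead of merely “which old ticket sounds similar.”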

    3) Build Paths, Not Just a Library of Articles

    Traditional knowledge bases struggle with enterprise support because real environments vary constantly. Configurations differ. Integrations behave differently. Versions and edge cases compound. The result: static documentation covers the generic paths, while the backlog fills with everything else.

    What scales better is a Troubleshooting Map™: a set of clustered, repeatable paths that describe how issues actually progress from symptom → diagnostics → resolution.

    Instead of “here’s an article,” you get something closer to:

    • “Tickets that look like this tend to follow these paths,” and
    • “From this snapshot, these next steps usually narrow the journey fastest.”

    This is the difference between a library and a navigation system.

    4) Use Confidence Gating So AI Reduces Effort (Instead of Adding a Review Step)

    AI can help—but only if it doesn’t create extra verification work.

    A practical pattern is confidence-based guidance. Recommendations should be accompanied by clear signals about why they’re being suggested and how reliable they are—based on the quality and density of the underlying precedent.

    When confidence is high, the system can do more: propose the next best diagnostic, standardize customer questions, or automate routine evidence collection. When confidence is low, it should behave differently: flag uncertainty, suggest options, and invite human judgment. This is what Kahuna’s Confidence Score™ automates.

    Backlog reduction depends on this discipline. “AI suggestions” that engineers must validate from scratch don’t save time. Confidence-gated guidance does.
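    A minimal sketch of confidence gating, assuming a recommendation already carries a confidence value derived from the quality and density of its precedent (the thresholds and action labels are illustrative, not the internals of the Confidence Score™):

    ```python
    from dataclasses import dataclass

    @dataclass
    class Recommendation:
        next_step: str
        confidence: float  # 0..1, derived from quality/density of underlying precedent
        rationale: str     # why this step is being suggested

    def act_on(rec: Recommendation, high: float = 0.8, low: float = 0.5) -> str:
        """Gate behavior on confidence so AI removes effort instead of adding review."""
        if rec.confidence >= high:
            # dense, high-quality precedent: do the work automatically
            return f"AUTOMATE: {rec.next_step} ({rec.rationale})"
        if rec.confidence >= low:
            # plausible but thinner precedent: propose, let the engineer decide
            return f"SUGGEST: {rec.next_step} -- review rationale: {rec.rationale}"
        # thin or ambiguous signals: surface uncertainty and invite human judgment
        return f"FLAG: low confidence; present options and escalate to the engineer"
    ```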

    5) The Backlog Multiplier: Apply New Paths to Every Open Ticket

    Here’s the compounding move that changes the economics of backlog work:

    Every time Kahuna identifies new snapshots and new paths on the Troubleshooting Map, that knowledge can be applied to the current queue. Instead of helping only the next ticket that comes in, new insight helps all open tickets that resemble the newly learned snapshot.

    That means you can automate the continuous scan of open/backlogged cases and:

    • identify which tickets match a newly discovered snapshot,
    • recommend the next best step,
    • auto-collect missing diagnostics where appropriate,
    • and prompt the right customer action to unblock progress.

    The impact is immediate: hundreds of hours saved that would otherwise be spent re-reading old threads and re-searching the past to see whether anything new has occurred.

    This is how backlog work stops being linear. Learning becomes a force multiplier.
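    A sketch of that sweep, assuming snapshots and a state-similarity function like the earlier sketch; the ticket fields, the recommended-step object, and the threshold are all illustrative:

    ```python
    def apply_new_path(open_tickets, new_snapshot, recommended_step,
                       similarity, threshold=0.75):
        """Whenever a new snapshot/path is learned, scan the open queue with it.

        similarity: a callable like state_similarity(a, b) -> 0..1 from the
        earlier sketch. Returns proposed actions for matching open tickets.
        """
        actions = []
        for ticket in open_tickets:
            if similarity(ticket.current_snapshot, new_snapshot) < threshold:
                continue  # the new learning doesn't apply to this case
            actions.append({
                "ticket_id": ticket.id,
                "next_step": recommended_step.description,              # next best step
                "auto_collect": recommended_step.missing_diagnostics,   # evidence to gather
                "customer_prompt": recommended_step.customer_action,    # unblock the customer
            })
        return actions
    ```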

    6) Make It a Closed Loop, Not a One-Time Cleanup

    “Backlog blitzes” feel good, but the impact is short-lived. The queue comes back because the dysfunctional dynamics creating the backlog haven’t changed.

    The durable approach is an automated, continual process:

    1. Observe how tickets are actually solved (journeys, not just outcomes).
    2. Distill those journeys into snapshots + paths.
    3. Apply new learning to the open queue automatically.
    4. Improve as outcomes confirm (or correct) the recommended paths.

    Over time, you reduce the percentage of cases that require full ground-up investigation. Engineers spend less time rediscovering known routes and more time handling true exceptions—the work only humans can do.

    Backlogs don’t shrink sustainably by pushing teams harder. They shrink when you remove the quiet tax that keeps expensive work repeating—by turning historical troubleshooting into reusable paths, and applying every new insight across the queue the moment it appears.

  • The Future of Support Engineering

    From Troubleshooter to Orchestrator

    By John Ragsdale, SVP of Marketing, Kahuna Labs

    In many enterprise technology companies today, support engineers remain caught in a reactive loop: the alarms go off, tickets pile up, issues are escalated, patches get built, and everyone breathes a sigh of relief—until the next incident.

    But what if that role could evolve? What if instead of fixing the problem, support engineers became the orchestration engine that stops major problems from ever surfacing, aligns diagnostics and troubleshooting into optimized flows, and elevates the leverage of the team across the entire customer lifecycle?

    In this article I’ll explore why we’re at an inflection point, what’s holding support engineers back, and how you should start thinking about the “orchestrator” role today—so that you’re ready to realize not just incremental improvements, but a big productivity leap.

    The Inflection Point

    The pace and complexity of modern enterprise products has exploded. Not only are today’s products incredibly complex, but heavy configuration means each customer’s environment is different. Often vastly different.

    Support engineering teams remain large cost centers—with engineers spending up to ~70% of their time researching, recreating, and diagnosing rather than innovating or driving value.

    Meanwhile, forward-looking support organizations are being asked: Can you prevent issues? Can you accelerate time to resolution? Can you become strategic partners instead of fire-fighters?

    Agentic AI platforms are now viable for real support engineering rather than just ticket routing or simple chatbots. This sets the stage for transforming the role of the support engineer. Not by replacing them, but by redefining them.

    The Traditional Support Engineer Role: Ripe for Transformation

    The core support approach with multiple tiers of engineers, each level with more experience, has been around for decades. And while the basic structure works, in an AI world, there is a lot of room for transformation.

    What works:

    • Deep product/technical knowledge
    • Ability to diagnose and troubleshoot under pressure
    • Institutional expertise and years of experience
    • Strong customer-facing skills when things go awry

    What’s broken:

    • Much of the “tribal” knowledge lives in heads or ad-hoc tickets; not easily reused or democratized.
    • Engineers spend too much time on researching and recreating issues rather than higher-leverage work (e.g., architecture, root causes, proactive improvements).
    • Systems are reactive: a ticket arrives, triage is performed, a number of fixes are tried, the ticket is eventually closed, and the cycle repeats with the next ticket. Little orchestration; many silos and one-off approaches.
    • Training and ramp time are long; turnover is high; knowledge transfer is weak.
    • Scaling is hard: when volume increases, simply adding more engineers yields diminishing returns.

    The Orchestration Model

    An orchestration-centric support engineering model features the following:

    1. Pre-ticket diagnostics aligned to journey – The path a customer takes, along with dependencies, configuration, telemetry, version, and product build, all feeds into a Troubleshooting Map™, with paths for every conceivable twist and turn of the journey.
    2. Dynamic decision flows – Instead of a static diagnostics checklist, you have decision flows that evolve based on previous ticket outcomes and customer context.
    3. Knowledge reuse and routing leverage – Tribal knowledge is captured and surfaced dynamically to the right engineer at the right time, and constantly improved as new issues and resolutions emerge.
    4. Escalation avoidance / resolution avoidance mindset – Rather than just fixing a problem, the AI enables an engineer to orchestrate prevention, mitigation, learning loops, and continuous improvement.
    5. Engineered force-multiplier – The support engineer acts as the conductor of the diagnostics and resolution engine, rather than doing every step manually. Routine steps are predicted and automated, so the system and team run more efficiently and SEs handle only the exceptions.
    6. Metrics shift – Metrics can move from tickets closed per engineer and time to respond and resolve toward more experiential data, such as auto-resolved tickets and the number of steps and iterations streamlined or eliminated, with proven correlations to customer satisfaction and customer effort, and ultimately a strategic impact on annual recurring revenue (ARR).

    The Time to Act is Now: Getting Beyond Barriers

    In a subscription economy, every interaction with a frontline employee has an impact on long-term account value. And no one has more post-sales interactions with customers than support. B2B support organizations are faced with mounting pressure:

    • Customers demand faster, more proactive support with less downtime and less cost.
    • Engineering teams are being held accountable not just for features shipped but also for post-release support burden.
    • Emerging AI platforms are becoming mature enough to power orchestration and enable complex problem diagnostics, so the focus of AI can move beyond deflection and automating Level 1 issues.
    • Competition in the technical product arena means support is a differentiator, not just a cost center.

    But there remain a lot of barriers to overcome:

    • Data fragmentation: Support history, configuration data, telemetry, and documentation are siloed, fragmented, and incomplete.
    • Tribal knowledge: A lot of learning lives in engineers’ heads and is not accessible, or fragmented across multiple systems and difficult to correlate into meaningful or actionable content.
    • Risk aversion: The idea of automating diagnostics or dynamic flows still triggers caution around “what if the AI is wrong?” The focus needs to be on how to allow human intervention when needed, and letting AI handle known issues/flows on its own.
    • Investment mindset: Many execs still view support as a cost center, and only envision incremental improvements rather than strategic transformation. Support’s role in renewals and ARR is not front of mind for many companies.

    Actionable Steps for Support Leaders Today

    1. Stop obsessing about deflection. Pacesetter companies have made great progress in improving self-service success and deflection. But they are blind to emerging AI approaches that tackle the real expense of support: the complex tickets that may take days, weeks, or even months to resolve.
    2. Map your current “time-to-first-action” metrics — Identify bottlenecks: How long do engineers spend just getting context? How many tickets escalate because of the time required to manually understand, research, and recreate issues?
    3. Surface your diagnostic flows and decision trees — Look for AI that can capture the repeatable paths your engineers take, understand the forward-looking journey for each approach, and enforce standardization for paths with the highest confidence and shortest resolution time.
    4. Systematically capture Tribal Knowledge — Asking SMEs to write more documents is a waste of time. There are too many variables in each decision point to create an all-encompassing knowledge article. Rely on AI to create a Troubleshooting Map that dynamically pivots according to unique customer configurations, handling instructions, and product versions.
    5. Do some internal marketing — Support offers immense strategic value for the organization. They have extensive data on product features that are poorly designed and hard to use. They know typical friction points new customers experience that could be addressed in customer success onboarding to boost adoption. Customer experience with support has a big impact on renewals and upsells. If your company execs don’t realize this, start educating them about support’s strategic value.

    Conclusion

    The traditional model of support engineering—where more engineers equal more throughput—is increasingly unaffordable and unsustainable in complex product environments. The future belongs to teams that think in terms of orchestration, engineer-enablement, and manual intervention only for exception handling.

    Your Support Engineers can become the linchpins of product adoption and customer success rather than the reactive fire-fighters they’re forced to be today. Start the transformation now, and you’ll be ahead of the curve when the next wave of product complexity hits. And I promise you it is coming soon.

  • Rethinking AI Adoption

    How Enterprises and AI Builders Co-Evolve

    By Sudhamsh Goutham Teegala, Engineering @ Kahuna Labs

    Why adopt AI?

    Peer pressure? Hope for miracles? Or proven value demonstrated by competitors?

    For most organizations, the motivation often lies somewhere in that mix. But history tells us that new technologies rarely change a single company – they change industries. Large language model (LLM)-based AI systems are poised to do exactly that for one of the most communication-driven sectors in the enterprise world: customer and product support.

    Support functions have always sat at the intersection of human expertise and complex information. AI, with its ability to interpret, contextualize, and respond in natural language, is uniquely suited to reimagine how support is delivered, scaled, and learned from. The shift is already underway – not just in adoption, but in how enterprises think about the very act of supporting their products and customers.

    Adoption versus Transformation

    When we talk about “adopting AI,” what are we really doing? Are we automating a single process, or are we solving an end-to-end problem?

    New technologies often “grow” alongside their own adoption. As more people use a new technology, both the users and the technology evolve toward each other. Most enterprises today are still in the early stages of this curve – replacing parts of their workflows with AI-enhanced stages or solutions.

    But true transformation happens when the technology matures enough – and the organization becomes ready enough – to reimagine the whole system. In product support, this may soon mean moving away from today’s ticket-driven, reactive models toward predictive, AI-first ecosystems where discovery, diagnosis and resolution flow seamlessly.

    That kind of change can sound unsettling. It challenges familiar roles, teams, and even career paths. Yet, the enterprises best positioned to lead their industries forward will be those that adopt AI as a co-evolutionary force where people, processes, and tools continuously shape each other. The companies that treat AI as a partner in organizational learning, rather than a replacement tool, will be the ones to define the new industry norms.

    The Builder’s Dilemma

    Then there is the parallel challenge for those of us building AI systems for these enterprises. When we design tools for customer and product support, what exactly are we optimizing for?

    Do we start by deeply understanding current workflows – mapping pain points, inefficiencies, and blockers – and then build solutions that fit neatly into those patterns? Or should we focus on the essence of the problem itself, looking beyond the existing process? That might mean asking what information exists (and where), what an ideal resolution looks like, and whether there is a faster AI-native path to get there.

    This is not an academic question. Every conversation we have with our users subtly shapes the direction of our tools, and by extension, the future of their organizations. The questions we ask are not neutral; they influence what our tools become and how our users’ processes evolve.

    Over time, by working with multiple enterprises and observing their support journeys, AI builders begin to see patterns. Some organizations need tools that integrate smoothly into highly complex systems augmenting what already works. Others, especially those in periods of growth or transformation, benefit from AI-first support paradigms that prescribe new ways of working entirely.

    This interplay, between enterprise readiness and AI maturity, is where the real progress happens.

    A Co-Evolutionary Future

    In the end, AI adoption is not about replacing humans or processes; it is about co-evolution. Enterprises evolve their structures and mindsets to make room for new capabilities, while AI builders evolve their systems through deeper understanding of human and organizational context.

    Over time, these feedback loops will redefine what “support” even means. We may move from reactive helpdesk models to more forward-looking predictive recommendation systems. Ultimately, this leads to the creation of proactive, self-correcting ecosystems capable of diagnosing and resolving problems, while also sharing knowledge for broader reuse.

    The companies that thrive in this new era will not simply “adopt” AI – they will grow with it. And those of us building these tools have both the privilege and the responsibility to guide that growth wisely. Because we are not just creating software; we are helping industries discover new ways to think, to solve, and to support.

  • Why New Support Engineers Take Months to Ramp

    How AI Can Shorten Time to Competency

    By John Ragsdale, SVP Marketing, Kahuna Labs

    You’ve just hired a talented new Support Engineer (SE). They’re eager, technically capable, and full of potential. But six months later they’re still shadowing others, still struggling to independently handle moderate tickets, still asking “what diagnostics do I run next?” Meanwhile, backlog persists, escalations happen, and senior resources stay tied up mentoring.

    In this post we’ll unpack why ramp time for new SEs stretches so long in complex technical product environments—and how purpose-built AI and a dynamic Troubleshooting Map™ can shorten that curve dramatically.

    The Anatomy of Long Ramp-Time

    In complex B2B product support, onboarding new support engineers is a lengthy process. Some industry numbers:

    • Baseline competency (able to handle moderate-complexity tickets independently, with minimal supervision): ~4-6 months
    • Full competency (full mastery of the product/stack, able to handle highest-complexity issues, mentor others, work with minimal oversight): ~9-12 months
    • If tooling, training, knowledge capture and flow-charts/decision trees are weak, it could easily stretch beyond 12 months.

    Several core factors make ramping an SE a multi-month process:

    1. Environment complexity – Every customer, every configuration, every version can be different. New engineers must learn not just “the product” but “the customer environments”—of which there are endless variations.
    2. Tribal knowledge – A lot of reasoning lives in senior engineers’ heads, or in (often poorly documented) legacy tickets, not in structured content. New hires have to mine tickets, ask questions, shadow, discover context, and often create their own troubleshooting guides.
    3. Fragmented diagnostics – Many organizations lack clear flows or decision trees for anything beyond common, repetitive customer issues; new engineers must learn by trial-and-error.
    4. High stakes – Because customer uptime and performance are critical, and adhering to service level agreements (SLAs) is table stakes, new engineers often cannot experiment freely; they must wait for approvals or SME guidance, which slows progress and resolution time.
    5. Lack of feedback loops – Without structured real-time feedback, in-flow educational content, and incremental autonomy, new engineers remain “safe novices” rather than being continually upskilled.
    6. Insufficient tooling – If the support engineer lacks tools that surface decision logic, recommended diagnostics, or contextual reasoning, they default to trial and error, resulting in longer resolution times and increased customer effort.

    The result: a prolonged time to reach independent contributor status. And while you wait, you’re paying for senior resources, lost productivity, potentially poor customer experiences, and missed scaling opportunities.

    How AI and a Troubleshooting Map Shorten the Curve

    Here’s how you can flip the script:

    1. Context-aware triage and diagnostics
    When a new engineer receives a ticket, the system presents not just the customer’s description, but troubleshooting steps that are in context of that customer’s configuration, version, past issues, and similar resolved tickets. That context jumpstarts their understanding.

    2. Dynamic decision flows instead of open-ended blank pages
    Rather than “Okay, figure it out”, the engineer sees guided recommendations: “Given these customer attributes, check A → if yes, check B → else check C.” This prevents time lost in researching past tickets and verifying next steps with an SME. (A minimal sketch of such a flow appears after this list.)

    3. Embedded reasoning and learning content
    Every recommended step comes with rationale: “We ask this because in version 4.2 clients with large memory allocations tended to hit X.” Over time this builds the “why”, not just the “what”. And contextual information about the product, particular features, or recent versions supplements SE learning without swivel-chairing to documentation or an eLearning system.

    4. Feedback loops and adaptive learning
    When an SE completes a step, the system captures whether the resolution path was successful, whether additional diagnostics were needed, and updates the Troubleshooting Map. The next new engineer inherits a richer decision space.

    5. In-line quality control
    Each response written to a customer is automatically reviewed for grammar and spelling, professional tone, and empathy, and flagged for potentially risky procedures or customer PII included in the communication. The SE is prompted with a recommended rewrite to review and send. In this way, response quality is proactive and real-time, and potential issues are identified and corrected before they reach the customer.

    6. Progressive autonomy with guardrails
    AI-enabled decisioning guides the SE when they can proceed on their own and when they should escalate, and identifies the best-fit SME in case of questions. This structured autonomy accelerates growth and reduces risk.

    7. Shift metrics to ramp-time, resolution accuracy, and engineer utilization
    By measuring how long until an engineer handles a standard ticket independently, or how many guided vs. un-guided steps are involved at each progression of the ticket, you track ramp effectiveness—not just tickets closed.

    8. Assigning/re-assigning tickets based on evolving complexity
    AI automatically calibrates a new support engineer’s ability to handle issues of a certain complexity level. As they gain experience, AI automatically assigns them issues of increased complexity, so they are continually upskilled.
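    As a minimal sketch of the guided decision flows in point 2 above (the checks, rationale strings, and structure are illustrative, not a real product’s diagnostics):

    ```python
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class FlowNode:
        """One step in a guided decision flow ("check A -> if yes B, else C")."""
        check: str                         # diagnostic the engineer should run
        rationale: str                     # the "why" surfaced to the new SE
        if_yes: Optional["FlowNode"] = None
        if_no: Optional["FlowNode"] = None

    def walk(node: Optional[FlowNode], answers: dict) -> list:
        """Follow recorded answers through the flow; return the guided step list."""
        steps = []
        while node is not None:
            steps.append(f"{node.check}  [{node.rationale}]")
            node = node.if_yes if answers.get(node.check) else node.if_no
        return steps

    # Illustrative flow, echoing the "version 4.2 / large memory" example above:
    flow = FlowNode(
        check="memory allocation > 64GB?",
        rationale="v4.2 clients with large memory allocations tended to hit X",
        if_yes=FlowNode("collect heap diagnostics", "confirms the suspected pattern"),
        if_no=FlowNode("check integration config", "rules out the common misconfig"),
    )
    print(walk(flow, {"memory allocation > 64GB?": True}))
    ```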

    A Roadmap for Support Leaders

    For support leaders introducing AI technology to streamline and automate troubleshooting, with an eye toward understanding the impact to new SEs, consider these recommendations:

    1. Understand the current state. To benchmark improvements in time-to-competency, be sure you have solid data showing current ramp time for new SEs, so improvements are easy to identify and quantify.
    2. Be generous in allowing AI to ingest content. With an in-network solution, there is no risk of data leaving your network. Providing the new AI platform access to any existing troubleshooting guides, Slack or Jira conversations, and any repositories senior engineers have created for their own use will accelerate accurate recommendations by AI.
    3. Make adoption of the new technology an MBO. Introducing new tools for SEs, whether they be new or long-time employees, requires change management. Make sure all SEs understand the impact of the technology on their own performance, the customer experience, and overall impacts to the organization. Make rapid adoption of the new technology a key goal for SEs, and leverage adoption dashboards provided by the vendor to track employee use of the system.
    4. Encourage continuous feedback loops. Even though the AI platform automatically audits each closed ticket to compare actuals to AI recommendations, and continually refines the Troubleshooting Map, verbatim feedback can also be submitted in each ticket. Encourage SEs to submit feedback on missing or unclear steps, or applicable content sources not referenced, to accelerate refinement of recommendations.

    The Business Impact

    ROI for new support AI typically focuses on core support metrics, such as response and resolution time, and customer satisfaction. Providing a Troubleshooting Map for SEs also has a big impact on time-to-competency for new hires. Be sure to include this impact when reporting the business results of AI pilots and production deployments. Shorter ramp-time means:

    • New hires become productive faster, reducing cost of training and supervision.
    • Senior engineers and SMEs free up time for high-value work (product improvements, root-cause elimination, proactive initiatives).
    • The support team scales more linearly: fewer ramp-bottlenecks, better cost leverage per engineer.
    • Customer satisfaction and customer effort improve: more consistent ticket handling, fewer escalations, faster resolution.
    • The entire support function begins to shift from operational burden to strategic asset.

    Conclusion

    Ramping support engineers in the age of complex, configurable products has been notoriously slow. But it doesn’t have to remain that way. By combining structured decision-flows, knowledge orchestration, contextual reasoning and guided autonomy, support leaders can reduce ramp-time, elevate engineer productivity, and fundamentally shift how the support organization contributes to the business.

    If you’re ready to move your support team from “new-hire shadowing” to “engineer orchestration”, the time is now.

  • The Complexity Score™

    Capturing the Nuances of Technical Support Challenges

    By Shanu Vashishtha, Deep Learning Engineer, Kahuna Labs

    Not all support tickets are created equal. A password reset and a multi-system integration failure involving back-and-forth between the OS layer and the application in question both require resolution, but they represent fundamentally different levels of technical challenge. The Complexity Score™ captures these nuances, measuring how technically challenging an issue was to understand, debug, and resolve.

    What Complexity Captures

    The Complexity Score evaluates four key dimensions: whether the issue required specialized domain knowledge (e.g., registry modifications, scripting, platform-specific behavior), whether there was interaction across multiple technical layers (e.g., OS configuration, firmware, networking stack), whether solving it required multiple rounds of hypothesis testing or non-obvious workarounds, and whether the customer could follow and apply the solution independently.

    The score uses a 5-point scale:

    • 1 (Very Simple) – trivial, self-evident issues
    • 2 (Slightly Technical) – straightforward fixes requiring basic knowledge
    • 3 (Moderate Complexity) – issues needing some technical investigation
    • 4 (Complex) – non-obvious root causes requiring multi-layer investigation
    • 5 (Very Complex) – advanced issues requiring deep domain knowledge, creative problem-solving, and multi-system interactions
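    In practice, scoring like this can be delegated to an LLM with a structured prompt. The sketch below assumes a generic call_llm(prompt) -> str client that returns JSON; the prompt wording is an illustration, not Kahuna’s actual rubric prompt.

    ```python
    import json

    RUBRIC = """Rate this support ticket's complexity from 1 to 5:
    1 Very Simple: trivial, self-evident issues
    2 Slightly Technical: straightforward fixes requiring basic knowledge
    3 Moderate Complexity: issues needing some technical investigation
    4 Complex: non-obvious root causes requiring multi-layer investigation
    5 Very Complex: deep domain knowledge, creative problem-solving, multi-system interactions

    Consider: specialized domain knowledge, interaction across technical layers,
    rounds of hypothesis testing, and whether the customer could apply the fix alone.
    Return JSON only: {"score": <1-5>, "reasoning": "<one sentence>"}"""

    def score_complexity(ticket_description: str, call_llm) -> dict:
        """call_llm is a placeholder for whatever LLM client your stack provides."""
        raw = call_llm(f"{RUBRIC}\n\nTicket:\n{ticket_description}")
        return json.loads(raw)  # e.g. {"score": 4, "reasoning": "..."}
    ```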

    Operational Impact: Automation and Focused Resolution

    Complexity scoring enables intelligent ticket routing from the moment a ticket enters the system. Low-complexity tickets (scores 1-2) can typically be resolved with a single knowledge base document. Kahuna AI’s Auto Resolve can retrieve the relevant KB article and provide an automated response, handling routine issues—password resets, simple configuration changes, standard troubleshooting procedures—without human intervention. This frees support teams to focus on more challenging problems.

    High-complexity tickets (scores 4-5) require more sophisticated processing. The Kahuna AI system retrieves similar past tickets that were successfully resolved, prioritizing those with high credibility scores and high completeness scores. These historical solutions provide context and proven approaches for the current issue, and form the core of the recommendations shown to the Support Engineer assigned to the ticket.

    Simultaneously, the deployed system intelligently routes the ticket to engineers with specific domain expertise—whether that’s database optimization, network troubleshooting, or platform-specific integrations. By matching complex problems with the right expertise from the start, organizations achieve faster resolution times and better outcomes.
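    Put together, an illustrative routing policy keyed off the score might look like the following; the ticket fields and action names are assumptions for the example, not Kahuna’s API:

    ```python
    def route(ticket: dict) -> dict:
        """Map a complexity score (1-5) to a handling strategy."""
        c = ticket["complexity"]
        if c <= 2:
            # single-KB-article territory: attempt fully automated resolution
            return {"action": "auto_resolve", "source": "kb_article"}
        if c >= 4:
            # pull proven precedent first, then match to domain expertise
            return {"action": "assign_expert",
                    "expertise": ticket.get("domain", "general"),
                    "precedent_filter": {"min_credibility": 4, "min_completeness": 4}}
        # moderate tickets follow the standard queue
        return {"action": "standard_queue"}

    print(route({"complexity": 5, "domain": "database optimization"}))
    ```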

    LLM-Enabled Scoring

    Large Language Models (LLMs) are uniquely positioned to perform complexity assessment at scale. Their training on publicly available troubleshooting documents, administrative guides, and compatibility lists for products has baked into them a deep understanding of complex technical issues and their resolution patterns. This foundational knowledge enables LLMs to recognize nuanced technical challenges—from multi-layer system interactions to platform-specific behaviors—that would require extensive domain expertise for human evaluators to identify.

    The falling costs of LLM inference make automated complexity scoring economically compelling. Organizations can evaluate thousands of tickets consistently at a fraction of the cost of having support engineers devote their time to manual assessment. When a ticket first enters the system, Kahuna AI analyzes the initial description and assigns a complexity score using structured prompts with the rubric described above, enabling immediate routing decisions before human review.

    This initial filtering transforms support operations: low-complexity tickets trigger automated KB document retrieval and resolution, while high-complexity tickets immediately activate retrieval of similar high-credibility historical tickets and automated intelligent routing to higher tiers of support. The result is faster resolution times, better resource allocation, and improved customer satisfaction—all enabled by leveraging Kahuna AI’s technical understanding and cost efficiency to apply the appropriate resolution strategy from the moment each ticket enters the system.

    Conclusion

    The Complexity Score transforms how support organizations understand and handle technical challenges. By capturing the nuanced differences between routine issues and complex technical problems, it enables intelligent automation for simple cases and focused expertise allocation for difficult ones. LLMs, with their training on technical documentation and falling operational costs, make this assessment feasible at scale—evaluating thousands of tickets consistently and economically.

    The operational impact of this dual approach is clear—it ensures that every ticket receives the appropriate level of attention and resources, leading to faster resolutions, better customer satisfaction, and more efficient support operations. The Complexity Score doesn’t just measure technical difficulty; it enables a smarter, more responsive support system that adapts its strategy to the challenge at hand.

  • The Credibility Score™

    Completing the Picture of Ticket Quality

    By Shanu Vashishtha, Deep Learning Engineer, Kahuna Labs

    The Completeness Score™ measures how thoroughly the troubleshooting process was documented—the clarity, coverage, and progression of troubleshooting actions. (We covered it in a previous blog post.) But documentation quality is only half the story. A ticket can achieve a high Completeness Score with exemplary documentation, yet still leave a critical question unanswered: Did the documented solution actually work?

    This is where the Credibility Score™ becomes essential. While Completeness measures how well the troubleshooting was documented, Credibility measures whether we can trust that the documented solution actually resolved the customer’s problem. Together, these two metrics provide a complete picture of ticket quality: one assesses documentation thoroughness, the other assesses solution reliability.

    Why Credibility Matters: The Gap That Completeness Doesn’t Capture

    Consider a ticket with a Completeness Score of 4: it documents every troubleshooting step, includes all relevant logs, and provides a clear technical narrative. However, the solution was never confirmed by the customer. Six months later, an engineer follows this well-documented trail, implements the solution, and discovers it doesn’t actually work. Tickets that score highly on Completeness but low on Credibility represent a particularly insidious problem: they appear trustworthy because they’re well-documented, but they lead engineers down incorrect paths. The documentation quality creates false confidence in unreliable solutions.

    The Credibility Score evaluates dimensions that Completeness cannot assess: verification, validation, and temporal relevance. Where Completeness asks “is the documentation sufficient to replicate the process?”, Credibility asks “can we trust that this process actually solved the problem?”

    When tickets lack credibility, critical problems emerge: engineers waste time implementing solutions that don’t work, knowledge bases become populated with unverified solutions that mislead future interactions, and customer satisfaction suffers. 

    The Complementary Relationship: Completeness + Credibility

    High Completeness + High Credibility: The ideal ticket. Well-documented and verified. These tickets serve as reliable, reusable solution templates.

    High Completeness + Low Credibility: Risky tickets. Thoroughly documented but unverified or unreliable. These create false confidence and should be flagged for verification.

    Low Completeness + High Credibility: Incomplete but verified. The solution worked, but documentation is insufficient for replication.

    Low Completeness + Low Credibility: Poor quality tickets. Neither well-documented nor verified. These should be prioritized for review.

    How LLMs Enable Objective Assessment at Scale

    Support organizations process thousands of tickets daily. Manually reviewing each one for credibility is prohibitively expensive and time-consuming. Large Language Models (LLMs) provide a solution by enabling automated, consistent evaluation of ticket credibility using structured prompts with clear rubrics.

    The LLM evaluates specific aspects: whether actions were taken to solve the problem, whether the customer confirmed resolution, whether the ticket was reopened, and how recent the ticket is. The system returns structured JSON output containing both a numerical score (1-5) and reasoning, enabling downstream analytics and integration with quality assurance workflows. Thousands of tickets can be evaluated consistently in a fraction of the time required for manual review.
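    A sketch of that evaluation loop, again assuming a generic call_llm(prompt) -> str client; the prompt condenses the four dimensions and the JSON shape described here:

    ```python
    import json

    CREDIBILITY_PROMPT = (
        "Score 1-5 how much this ticket's documented solution can be trusted, "
        "considering: (1) concrete actions taken, (2) customer confirmation of "
        "resolution, (3) whether the ticket was reopened, (4) ticket recency. "
        'Return JSON only: {"score": <1-5>, "reasoning": "<brief justification>"}'
    )

    def credibility_scores(tickets, call_llm):
        """Batch-evaluate tickets; structured output feeds analytics and QA workflows."""
        results = []
        for t in tickets:
            out = json.loads(call_llm(f"{CREDIBILITY_PROMPT}\n\nTicket:\n{t['text']}"))
            results.append({"id": t["id"],
                            "score": int(out["score"]),
                            "reasoning": out["reasoning"]})
        return results
    ```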

    The Credibility Score Rubric

    The credibility score uses a 5-point scale similar to the completeness scoring rubric but differing in what the individual scores signify:

    Score 5: High Confidence – Concrete actions were taken, the customer confirmed resolution, the ticket was not reopened, and the ticket is recent. These tickets can be safely used as reference material.

    Score 4: Likely Resolved – Actions were taken and the solution appears sound, but the customer didn’t explicitly confirm resolution, or the ticket may be somewhat older. The ticket was not reopened, indicating the solution likely worked.

    Score 3: Plausible but Unconfirmed – Some actions may have been taken, but the customer didn’t confirm resolution, or the ticket is older. The solution seems reasonable but lacks explicit validation.

    Score 2: Unclear or Partially Grounded – It’s unclear whether concrete actions were taken, or the ticket was reopened suggesting the initial solution didn’t fully work. These tickets shouldn’t be relied upon without additional verification.

    Score 1: No Resolution – No clear actions were taken, or the ticket was reopened multiple times. The customer didn’t confirm the resolution, or the ticket is very old. These should be flagged for review.

    The evaluation considers four key dimensions: Action taken (whether concrete steps were executed), Customer confirmation (whether the customer explicitly confirmed resolution), Ticket reopening (indicating potential issues with the initial resolution), and Ticket recency (recent tickets are generally more reliable than older ones).

    Operational Applications

    The combination of Completeness and Credibility scores enables sophisticated operational optimizations:

    Historical Analysis: By filtering for both high Completeness (Score ≥ 4) and high Credibility (Score ≥ 4), Kahuna AI can immediately identify tickets that are both well-documented and verified, transforming historical ticket databases into readily deployable solutions for similar tickets.

    Knowledge Base Curation: By requiring both high scores for knowledge base articles, organizations ensure that only thoroughly documented and verified solutions enter the knowledge base, preventing the propagation of well-documented but incorrect solutions.

    Quality Assurance: QA teams can prioritize review efforts. Tickets with high Completeness but low Credibility need customer confirmation. Tickets with low Completeness but high Credibility need documentation improvement.
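    A small sketch of how the two scores combine operationally, bucketing tickets into the four quadrants described earlier; the field names mirror the Score ≥ 4 convention used in this post:

    ```python
    def bucket_by_quality(tickets):
        """Split tickets into the Completeness x Credibility quadrants."""
        templates, risky, needs_docs, review = [], [], [], []
        for t in tickets:
            high_comp = t["completeness"] >= 4
            high_cred = t["credibility"] >= 4
            if high_comp and high_cred:
                templates.append(t)   # reusable solution templates / KB candidates
            elif high_comp:
                risky.append(t)       # well-documented but unverified: flag for verification
            elif high_cred:
                needs_docs.append(t)  # verified but thin: improve documentation
            else:
                review.append(t)      # neither: prioritize for review
        return templates, risky, needs_docs, review
    ```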

    Conclusion

    The Credibility Score completes the picture of ticket quality that the Completeness Score begins. Where Completeness measures documentation thoroughness, Credibility measures solution reliability. Together, these two metrics provide a comprehensive assessment that enables sophisticated filtering, prioritization, and knowledge curation.

    LLMs enable this objective, scalable assessment by applying consistent evaluation criteria to every ticket. The automated process makes it feasible to evaluate thousands of tickets consistently and cost-effectively, transforming support operations from experience-based models to systematically verified, documented knowledge systems.

    For support organizations, the combination of Completeness and Credibility scores provides the complete picture of ticket quality. It’s not enough to know that a ticket is well-documented—organizations need to know whether they can trust that the documented solution actually worked. The Credibility Score answers that question, working in tandem with the Completeness Score to enable better decisions about which tickets to reference, which to review, and which to use as the foundation for knowledge that will help future customers.

  • The Completeness Score™

    Transforming Support Tickets Into Reusable Knowledge Assets

    By Shanu Vashishtha, Deep Learning Engineer, Kahuna Labs

    The Fundamental Challenge

    In traditional support operations, engineers receive tickets without context. They must begin from uncertainty: requesting diagnostics, gathering specifications, and iteratively narrowing the problem space.

    The critical issue emerges after ticket closure. Closed tickets accumulate with highly variable documentation quality. A ticket might contain initial diagnostic questions, a reference to log collection, followed by “Scheduled a Zoom call” and “Closing as resolved”—with no resolution details, verification steps, or technical narrative.

    This represents a fundamental failure of knowledge capture. The resolution information exists but not where organizational processes require it: within the ticket itself. Incomplete documentation translates to duplicated diagnostic effort, extended MTTR, and the inability to build scalable institutional knowledge.

    The Completeness Score™: A Systematic Quality Metric

    Kahuna’s Completeness Score is a 0-5 rating scale that measures how thoroughly the troubleshooting process was documented. The central question: can another engineer, encountering a similar issue six months later, follow the documented diagnostic trail and resolve the issue based solely on the ticket information?

    The Completeness Scale

    Score 0: No Engineer Engagement – No messages from support engineer, or only automated responses. No troubleshooting information available.

    Score 1: Minimal Engagement – Basic acknowledgment or initial information requests without substantive troubleshooting progression. Confirms the issue existed but provides no pathway toward resolution.

    Score 2: Partial Investigation – Some troubleshooting effort evident, but significant gaps in documentation or resolution verification. Provides directional hints but lacks detail for reliable replication.

    Score 3: Standard Documentation – Reasonable troubleshooting with specific steps and technical detail, but missing some elements. Provides a solid starting point, though engineers may need to supplement with additional diagnostics.

    Score 4: Comprehensive Documentation – Thorough step-by-step troubleshooting with clear progression, multiple diagnostic checks, and strong resolution documentation. Provides a reliable playbook for similar issues.

    Score 5: Exemplary Documentation – Complete, professional-grade diagnostic documentation with all relevant artifacts, clear resolution process, and verified outcomes. Represents a reusable solution pattern requiring minimal adaptation.

    Critical Modifier: When evidence exists that a Zoom call, phone call, or remote session occurred but no transcript or summary was documented, the score is reduced by 1 point (floor of 0). Undocumented synchronous work is functionally equivalent to work that never occurred from an organizational knowledge perspective.
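    The modifier is simple enough to state directly in code; a minimal sketch:

    ```python
    def adjusted_completeness(base_score: int, undocumented_session: bool) -> int:
        """Apply the undocumented-synchronous-work penalty with a floor of 0.

        base_score: the 0-5 rubric rating above.
        undocumented_session: evidence of a Zoom/phone/remote session with no
        transcript or summary captured in the ticket.
        """
        score = base_score - 1 if undocumented_session else base_score
        return max(score, 0)

    assert adjusted_completeness(4, True) == 3
    assert adjusted_completeness(0, True) == 0  # floor holds
    ```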

    Operational Applications

    Historical Analysis: Efficient Knowledge Retrieval

    When engineers search historical tickets, result sets frequently contain 50-500 potentially relevant tickets. Without quality metadata, each appears equally promising, requiring sequential evaluation.

    The completeness score enables immediate quality-based filtering:

    – Score 4-5: High-confidence documentation with verified resolutions

    – Score 3: Adequate documentation with some gaps

    – Score 0-2: Insufficient documentation for knowledge transfer

    This transforms historical ticket databases from undifferentiated archives into curated knowledge repositories where high-scoring tickets serve as reusable solution templates.

    Active Ticket Resolution: Pattern Matching

    For newly assigned tickets, similarity searches with completeness scoring return quality-weighted results. Engineers can rapidly assess pattern prevalence, implement validated solution methodologies, identify case-specific variations, and compress resolution timelines from days to hours.

    For example: an engineer searching for “API 503 errors” finds 12 relevant tickets. Filtering for Score ≥ 4 yields 2 high-quality matches. The highest-scoring ticket documents complete diagnostic steps, root cause analysis, resolution implementation, and customer verification—enabling resolution in hours rather than days.

    Proactive Completeness: Real-Time Quality Improvement

    Beyond retrospective analysis, the completeness scoring system operates proactively during the active ticket’s lifecycle. The system calculates completeness scores in real-time as engineers document their work, identifying gaps and prompting support engineers to provide additional documentation before ticket closure.

    This proactive approach creates a win-win scenario: not only do organizations achieve better-documented tickets for audit and knowledge purposes, but this wealth of information significantly increases the accuracy of the Troubleshooting Map moving forward. When tickets contain comprehensive diagnostic narratives, resolution steps, and verification details, the underlying AI systems can more effectively identify patterns, map issue relationships, and generate actionable recommendations for future similar cases.

    Organizational Impact

    Systematic completeness scoring generates cascading benefits:

    1. Documentation Quality Improvement: Objective criteria transform abstract directives into measurable standards
    2. Knowledge Accumulation: High-scoring tickets function as reusable templates, creating compound returns on diagnostic effort
    3. Data-Driven Assessment: Quantitative evaluation of documentation capabilities independent of other performance dimensions
    4. Customer Experience: Reduced time-to-resolution through rapid access to validated solutions
    5. Documentation Debt Visibility: Aggregate metrics reveal systemic documentation issues for targeted improvements

    Conclusion

    The Completeness Score introduces systematic quality metadata to historical ticket data, enabling efficient knowledge retrieval, validated solution reuse, visible quality standards, and organizational learning. This transforms support operations from an experience-based model (knowledge in individual memory) to a documented model (knowledge systematically captured and retrievable).

    The cumulative impact: reduced resolution times, decreased duplicated effort, improved knowledge transfer, and systematic accumulation of institutional expertise in retrievable, actionable form.

    Implementation Note

    The completeness scoring system employs AI-based evaluation to automatically assess every ticket against objective criteria: troubleshooting steps, diagnostic artifacts, hypothesis testing, remote session documentation, knowledge base references, resolution steps, and customer confirmation. This automated approach provides scalability that manual review cannot achieve, making historical knowledge systematically accessible for future diagnostic work.

  • Transforming Support with AI: Build vs. Buy

    Build vs. Buy is a common dilemma with AI projects. There are a lot of factors at work: company culture, privacy concerns, and the cost of buying and implementing technology. In this webinar, Kahuna Labs CEO Sanjeev Gupta dives into the build vs. buy challenge, citing examples from Kahuna Labs customers, to help support leaders navigate this decision and arrive at the best outcome for the company, the department, and ultimately, your customers.

    If you are already building in-house AI, learn how you can achieve the right build+buy collaboration.