Define whether to use Claude for ticket routing
Here are some key indicators that you should use an LLM like Claude instead of traditional ML approaches for your classification task:
- You have limited labeled training data available
- Your classification categories are likely to change or evolve over time
- You need to handle complex, unstructured text inputs
- Your classification rules are based on semantic understanding
- You require interpretable reasoning for classification decisions
- You want to handle edge cases and ambiguous tickets more effectively
- You need multilingual support without maintaining separate models
Build and deploy your LLM support workflow
Understand your current support approach
Before diving into automation, it’s crucial to understand your existing ticketing system. Start by investigating how your support team currently handles ticket routing. Consider questions like:
- What criteria are used to determine what SLA/service offering is applied?
- Is ticket routing used to determine which tier of support or product specialist a ticket goes to?
- Are there any automated rules or workflows already in place? In what cases do they fail?
- How are edge cases or ambiguous tickets handled?
- How does the team prioritize tickets?
Define user intent categories
A well-defined list of user intent categories is crucial for accurate support ticket classification with Claude. Claude’s ability to route tickets effectively within your system is directly proportional to how well-defined your system’s categories are. Here are some example user intent categories and subcategories (a code sketch of this taxonomy follows the list).
Technical issue
- Hardware problem
- Software bug
- Compatibility issue
- Performance problem
Account management
- Password reset
- Account access issues
- Billing inquiries
- Subscription changes
Product information
- Feature inquiries
- Product compatibility questions
- Pricing information
- Availability inquiries
User guidance
- How-to questions
- Feature usage assistance
- Best practices advice
- Troubleshooting guidance
Feedback
- Bug reports
- Feature requests
- General feedback or suggestions
- Complaints
Order-related
Service request
- Installation assistance
- Upgrade requests
- Maintenance scheduling
- Service cancellation
Security concerns
- Data privacy inquiries
- Suspicious activity reports
- Security feature assistance
Compliance and legal
- Regulatory compliance questions
- Terms of service inquiries
- Legal documentation requests
Emergency support
- Critical system failures
- Urgent security issues
- Time-sensitive problems
Training and education
- Product training requests
- Documentation inquiries
- Webinar or workshop information
Integration and API
- Integration assistance
- API usage questions
- Third-party compatibility inquiries
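If you plan to reference these categories in code, one option is to keep the taxonomy in a single data structure that both prompt construction and output validation can share. A minimal sketch using the categories above (names and structure are illustrative):

```python
# Illustrative intent taxonomy; keeping it in one place keeps prompt
# construction and output validation in sync as categories evolve.
INTENT_TAXONOMY: dict[str, list[str]] = {
    "Technical issue": [
        "Hardware problem", "Software bug",
        "Compatibility issue", "Performance problem",
    ],
    "Account management": [
        "Password reset", "Account access issues",
        "Billing inquiries", "Subscription changes",
    ],
    # ... remaining categories follow the same pattern ...
}

VALID_INTENTS = set(INTENT_TAXONOMY)
```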
Establish success criteria
Work with your support team to define clear success criteria with measurable benchmarks, thresholds, and goals. Here are some standard criteria and benchmarks when using LLMs for support ticket routing:
- Classification consistency
- Adaptation speed
- Multilingual handling
- Edge case handling
- Bias mitigation
- Prompt efficiency
- Explainability score
- Routing accuracy
- Time-to-assignment
- Rerouting rate
- First-contact resolution rate
- Average handling time
- Customer satisfaction scores
- Escalation rate
- Agent productivity
- Self-service deflection rate
- Cost per ticket
Choose the right Claude model
The choice of model depends on the trade-offs between cost, accuracy, and response time. Many customers have found `claude-3-5-haiku-20241022` to be an ideal model for ticket routing, as it is the fastest and most cost-effective model in the Claude 3.5 family while still delivering excellent results. If your classification problem requires deep subject matter expertise, a large number of intent categories, or complex reasoning, you may opt for the larger Sonnet model.
Build a strong prompt
Ticket routing is a type of classification task. Claude analyzes the content of a support ticket and classifies it into predefined categories based on the issue type, urgency, required expertise, or other relevant factors. Let’s write a ticket classification prompt. Our initial prompt should contain the contents of the user request and return both the reasoning and the intent; a sketch of such a prompt appears after this list.
- We use Python f-strings to create the prompt template, allowing the `ticket_contents` to be inserted into the `<request>` tags.
- We give Claude a clearly defined role as a classification system that carefully analyzes the ticket content to determine the customer’s core intent and needs.
- We instruct Claude on proper output formatting, in this case to provide its reasoning and analysis inside `<reasoning>` tags, followed by the appropriate classification label inside `<intent>` tags.
- We specify the valid intent categories: “Support, Feedback, Complaint”, “Order Tracking”, and “Refund/Exchange”.
- We include a few examples (a.k.a. few-shot prompting) to illustrate how the output should be formatted, which improves accuracy and consistency.
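Here is a sketch of what such a prompt template might look like. The wording, the single few-shot example, and the helper name `build_classification_prompt` are illustrative; adapt the role description, categories, and examples to your own tickets:

```python
def build_classification_prompt(ticket_contents: str) -> str:
    # f-string template: the ticket text is inserted into <request> tags,
    # and Claude is asked to answer in <reasoning> and <intent> tags.
    return f"""You will be acting as a customer support ticket classification system.
Analyze the ticket to determine the customer's core intent and needs, then classify it.

The valid intents are: "Support, Feedback, Complaint", "Order Tracking", and "Refund/Exchange".

Here is an example of the expected format:
<request>I ordered a laptop two weeks ago and it still has not shipped. Where is it?</request>
<reasoning>The customer is asking about the status of an order they placed.</reasoning>
<intent>Order Tracking</intent>

Now classify this request:
<request>{ticket_contents}</request>

Put your analysis in <reasoning> tags, followed by the classification label in <intent> tags."""
```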
Deploy your prompt
It’s hard to know how well your prompt works without deploying it in a test production setting and running evaluations. Let’s build the deployment structure. Start by defining the method signature for wrapping our call to Claude. We’ll take the method we’ve already begun to write, which has `ticket_contents` as input, and now returns a tuple of `reasoning` and `intent` as output. If you have an existing automation using traditional ML, you’ll want to follow that method signature instead.
The sketch shown after this list:
- Imports the Anthropic library and creates a client instance using your API key.
- Defines a `classify_support_request` function that takes a `ticket_contents` string.
- Sends the `ticket_contents` to Claude for classification using the classification prompt we built above.
- Returns the model’s `reasoning` and `intent` extracted from the response.

Because the code waits for the complete response before parsing the tags, it uses `stream=False` (the default).
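A minimal sketch of this wrapper, assuming the `anthropic` Python SDK and the `build_classification_prompt` helper from the previous sketch (error handling and retries omitted for brevity):

```python
import os
import re

import anthropic

# Create a client using your API key (here read from the environment).
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def classify_support_request(ticket_contents: str) -> tuple[str, str]:
    """Send a ticket to Claude and return (reasoning, intent)."""
    response = client.messages.create(
        model="claude-3-5-haiku-20241022",
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": build_classification_prompt(ticket_contents),
        }],
        stream=False,  # wait for the complete response before parsing
    )
    text = response.content[0].text
    # Extract the contents of the <reasoning> and <intent> tags.
    reasoning_match = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    intent_match = re.search(r"<intent>(.*?)</intent>", text, re.DOTALL)
    reasoning = reasoning_match.group(1).strip() if reasoning_match else ""
    intent = intent_match.group(1).strip() if intent_match else ""
    return reasoning, intent
```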
Evaluate your prompt
Prompting often requires testing and optimization for it to be production ready. To determine the readiness of your solution, evaluate performance based on the success criteria and thresholds you established earlier. To run your evaluation, you will need test cases to run it on. The rest of this guide assumes you have already developed your test cases.
Build an evaluation function
Our example evaluation for this guide measures Claude’s performance along three key metrics:
- Accuracy
- Cost per classification
- Response time

In our evaluation function (sketched after this list):
- We added the `actual_intent` from our test cases into the `classify_support_request` method and set up a comparison to assess whether Claude’s intent classification matches our golden intent classification.
- We extracted usage statistics for the API call to calculate cost based on input and output tokens used.
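A sketch of such an evaluation loop. Rather than modifying `classify_support_request` itself, this version calls the API directly so it can also capture usage statistics and latency; it assumes test cases are dicts with `ticket_contents` and `actual_intent` keys, reuses `client` and `build_classification_prompt` from the sketches above, and uses illustrative per-token prices (check current pricing for your model):

```python
import re
import time

# Illustrative USD prices per token; substitute current rates.
INPUT_COST_PER_TOKEN = 0.80 / 1_000_000
OUTPUT_COST_PER_TOKEN = 4.00 / 1_000_000

def evaluate(test_cases: list[dict]) -> dict:
    correct = 0
    total_cost = 0.0
    total_seconds = 0.0
    for case in test_cases:
        start = time.perf_counter()
        response = client.messages.create(
            model="claude-3-5-haiku-20241022",
            max_tokens=500,
            messages=[{
                "role": "user",
                "content": build_classification_prompt(case["ticket_contents"]),
            }],
        )
        total_seconds += time.perf_counter() - start
        text = response.content[0].text
        match = re.search(r"<intent>(.*?)</intent>", text, re.DOTALL)
        predicted = match.group(1).strip() if match else ""
        # Compare against the golden intent label from the test case.
        correct += int(predicted == case["actual_intent"])
        # Usage statistics give us cost per classification.
        total_cost += (response.usage.input_tokens * INPUT_COST_PER_TOKEN
                       + response.usage.output_tokens * OUTPUT_COST_PER_TOKEN)
    n = len(test_cases)
    return {
        "accuracy": correct / n,
        "avg_cost": total_cost / n,
        "avg_response_seconds": total_seconds / n,
    }
```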
Run your evaluation
A proper evaluation requires clear thresholds and benchmarks to determine what is a good result. The script above will give us the runtime values for accuracy, response time, and cost per classification, but we still need clearly established thresholds. For example:
- Accuracy: 95% (out of 100 tests)
- Cost per classification: 50% reduction on average (across 100 tests) from current routing method
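Checking the runtime values against those thresholds can then be a few lines, assuming the `evaluate` sketch above and a hypothetical baseline cost for your current routing method:

```python
results = evaluate(test_cases)  # test_cases: your golden dataset

BASELINE_COST_PER_TICKET = 0.02  # hypothetical cost of the current routing method

assert results["accuracy"] >= 0.95, f"Accuracy below target: {results['accuracy']:.1%}"
assert results["avg_cost"] <= 0.5 * BASELINE_COST_PER_TICKET, "Cost-reduction target missed"
```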
Improve performance
In complex scenarios, it may be helpful to consider additional strategies to improve performance beyond standard prompt engineering techniques and guardrail implementation strategies. Here are some common scenarios:
Use a taxonomic hierarchy for cases with 20+ intent categories
As the number of classes grows, the number of examples required also expands, potentially making the prompt unwieldy. As an alternative, you can consider implementing a hierarchical classification system using a mixture of classifiers (a sketch follows the pros and cons below).
- Organize your intents in a taxonomic tree structure.
- Create a series of classifiers at every level of the tree, enabling a cascading routing approach.

- Pros - greater nuance and accuracy: You can create different prompts for each parent path, allowing for more targeted and context-specific classification. This can lead to improved accuracy and more nuanced handling of customer requests.
- Cons - increased latency: Be advised that multiple classifiers can lead to increased latency, and we recommend implementing this approach with our fastest model, Haiku.
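A sketch of the cascading approach, assuming a small two-level taxonomy and a hypothetical `classify_with_prompt(ticket_contents, categories)` helper that calls Claude with a category-specific prompt and returns one label:

```python
# Illustrative two-level taxonomy: top-level intents map to subcategories.
TAXONOMY = {
    "Technical issue": ["Hardware problem", "Software bug"],
    "Account management": ["Password reset", "Billing inquiries"],
}

def classify_hierarchically(ticket_contents: str) -> tuple[str, str]:
    # First classifier picks a top-level category at the tree's root...
    parent = classify_with_prompt(ticket_contents, list(TAXONOMY))
    # ...then a parent-specific classifier picks the subcategory. Each parent
    # path can carry its own targeted prompt and few-shot examples.
    child = classify_with_prompt(ticket_contents, TAXONOMY[parent])
    return parent, child
```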
Use vector databases and similarity search retrieval to handle highly variable tickets
Although providing examples is the most effective way to improve performance, if support requests are highly variable, it can be hard to include enough examples in a single prompt. In this scenario, you could employ a vector database to do similarity searches from a dataset of examples and retrieve the most relevant examples for a given query (a retrieval sketch follows). This approach, outlined in detail in our classification recipe, has been shown to improve performance from 71% accuracy to 93% accuracy.
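A sketch of retrieval-based example selection. The `embed` function stands in for your embedding model, and the in-memory list stands in for a real vector database; the retrieved pairs would then be formatted into the few-shot section of the classification prompt:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding model here (e.g. an embeddings API)."""
    raise NotImplementedError

# Precomputed at startup: (embedding, ticket_text, golden_intent) triples.
EXAMPLE_INDEX: list[tuple[np.ndarray, str, str]] = []

def retrieve_examples(ticket_contents: str, k: int = 5) -> list[tuple[str, str]]:
    """Return the k most similar labeled examples by cosine similarity."""
    query = embed(ticket_contents)
    query_norm = np.linalg.norm(query)
    scored = sorted(
        EXAMPLE_INDEX,
        key=lambda item: float(np.dot(query, item[0])
                               / (query_norm * np.linalg.norm(item[0]))),
        reverse=True,
    )
    return [(text, intent) for _, text, intent in scored[:k]]
```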
Account specifically for expected edge cases
Here are some scenarios where Claude may misclassify tickets (there may be others that are unique to your situation). In these scenarios, consider providing explicit instructions or examples in the prompt of how Claude should handle the edge case:
Customers make implicit requests
- Solution: Provide Claude with some real customer examples of these kinds of requests, along with what the underlying intent is. You can get even better results if you include a classification rationale for particularly nuanced ticket intents, so that Claude can better generalize the logic to other tickets.
Claude prioritizes emotion over intent
- Solution: Provide Claude with directions on when to prioritize customer sentiment or not. It can be something as simple as “Ignore all customer emotions. Focus only on analyzing the intent of the customer’s request and what information the customer might be asking for.”
Multiple issues cause issue prioritization confusion
- Solution: Clarify the prioritization of intents so that Claude can better rank the extracted intents and identify the primary concern.
Integrate Claude into your greater support workflow
Proper integration requires that you make some decisions regarding how your Claude-based ticket routing script fits into the architecture of your greater ticket routing system. There are two ways you could do this:
- Push-based: The support ticket system you’re using (e.g. Zendesk) triggers your code by sending a webhook event to your routing service, which then classifies the intent and routes it.
- This approach is more web-scalable but requires you to expose a public endpoint (a webhook sketch follows this list).
- Pull-based: Your code polls for the latest tickets on a given schedule and routes them at pull time.
- This approach is easier to implement but might make unnecessary calls to the support ticket system when the pull frequency is too high or might be overly slow when the pull frequency is too low.
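For the push-based option, a minimal webhook receiver might look like the following Flask sketch. The payload fields and the `route_ticket` call are hypothetical; consult your ticketing system’s webhook documentation for the real shapes and APIs:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/tickets", methods=["POST"])
def handle_ticket_webhook():
    payload = request.get_json()
    ticket_id = payload["ticket_id"]          # hypothetical payload field
    ticket_contents = payload["description"]  # hypothetical payload field

    # Classify with the wrapper defined earlier, then route the ticket.
    reasoning, intent = classify_support_request(ticket_contents)
    route_ticket(ticket_id, intent)  # hypothetical: update the ticketing system
    return {"status": "routed", "intent": intent}, 200
```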