What Actually Goes In A QA Rubric (And What Doesn’t)

I’ve been asked this question approximately 147 times in the last year: “What should we include in our QA rubric?”

The real answer? Way less than you’re probably thinking.

Most people getting started in QA overthink their rubric because they’re trying to measure everything at once. That’s not QA. That’s performance reviews disguised as quality assurance, and it’s why your managers hate doing reviews and your agents feel nitpicked even when their scores are perfect.

Why You’re Overthinking This

Here’s what usually happens:

You decide you need a QA rubric. Great! You pull together stakeholders to figure out what to measure.

Product wants technical accuracy measured. Marketing wants brand voice consistency. Leadership wants efficiency metrics. Your managers want to measure “helpfulness” (whatever that means). Someone suggests adding a category for grammar because Susan in accounting once got a ticket with a typo and hasn’t shut up about it since.

You end up with a 47-point rubric that takes 45 minutes per ticket to complete. Every category is subjective enough that two reviewers score the same ticket completely differently. No one can agree on what “good” looks like anymore.

Three months later, the rubric is abandoned, and you’re back to square one.

Sound familiar?

The 80/20 Rubric (What to Actually Measure)

Here’s the thing: Your rubric should have 3-5 categories maximum.

Not 47. Not 15. Not even 10.

Three. To. Five.

Why? Because if you can’t remember what you’re measuring without looking at the rubric, your reviewers sure as hell won’t either. And if your reviewers can’t remember it, they can’t calibrate on it. Please read that sentence a few more times.

Here’s what actually matters:

1. Accuracy (Did they solve the problem correctly?)

This is non-negotiable. If your agent is confidently giving wrong answers, nothing else matters.

What you’re measuring:

  • Did they understand what the customer was actually asking?
  • Was the solution technically correct?
  • Did they provide complete information, or did they leave gaps?

Why this matters: Wrong answers = customer churn = you don’t have a business anymore.

Impact level: HIGH

2. Communication (Can the customer understand and act on this?)

I don’t care if your agent wrote a technically perfect response if the customer has to read it three times to figure out what they’re supposed to do next.

What you’re measuring:

  • Is it written clearly? (No jargon unless absolutely necessary)
  • Did they set expectations? (What happens next, when, what the customer should do)
  • Is the tone appropriate for your brand and this specific customer? The latter is more important than you might think.

Why this matters: Clarity = fewer follow-ups = efficiency. Also, confused customers leave.

Impact level: MEDIUM

3. Process (Did they follow the workflow correctly?)

This is the “did you do your job the way we agreed you’d do your job” category.

What you’re measuring:

  • Did they use the right tools, macros, or resources?
  • Did they document and tag correctly so the next person can find this?
  • Did they escalate when they should have instead of winging it?

Why this matters: Process adherence = consistency = scalability. If everyone invents their own workflow, you can’t scale.

Impact level: MEDIUM

4. Empathy/Tone (OPTIONAL – only if brand-critical)

Notice I said optional. If you’re not a luxury brand or high-touch service where tone is make-or-break, you probably don’t need this as a separate category. Tone usually shows up naturally in “Communication”.

What you’re measuring:

  • Did they acknowledge the customer’s frustration instead of ignoring it?
  • Did they personalize the response or sound like a bot?

Why this matters: Customer experience, brand perception, making people feel heard.

Impact level: LOW to MEDIUM (unless you’re luxury/high-touch, then HIGH)

5. Efficiency (OPTIONAL – for teams drowning in volume)

Only add this if you’re actively struggling with agents taking 6 messages to resolve something that should take 1.

What you’re measuring:

  • Could this have been resolved faster?
  • Did they use available resources effectively, or did they reinvent the wheel?
  • Did they anticipate questions the customer hadn’t asked yet but was likely to, based on the available information? (Agents who can consistently do this are superheroes, and you should probably pay them more.)

Why this matters: Agent capacity, customer wait times, not burning out your team.

Impact level: MEDIUM


Key principle: If you can’t coach on it, don’t measure it. If you can’t show clear examples of “good vs. bad,” don’t measure it. If it’s not causing actual problems right now, don’t measure it.

What Does NOT Belong in Your Rubric

Let’s talk about the things teams try to shove into QA rubrics that have no business being there.

Grammar and Spelling

Unless the typos are making things genuinely unclear, this is editing, not quality assurance.

I’ve seen teams dock points for missing commas. Commas! Meanwhile, the agent gave a completely wrong answer, but hey, at least they used a semicolon correctly.

Exception: If poor grammar is a pattern that’s affecting customer trust or comprehension, then yes, address it. But that’s a coaching conversation, not a line item on every single review.

Agent Personality (“They’re not friendly enough”)

This is subjective, impossible to calibrate, and feels intensely personal to agents.

You know what’s worse than an agent who’s “not friendly enough”? An agent who’s so busy performing friendliness that they forget to actually solve the problem.

What to measure instead: Tone appropriateness. Was their response professional when it needed to be? Casual when appropriate? That’s measurable. “Friendliness” is not.

Things Agents Can’t Control

If your product has a bug, or your policy sucks, or your knowledge base is out of date, that’s not the agent’s fault. Don’t measure it in QA.

I’ve seen teams mark down agents for not being able to solve issues the product couldn’t handle. That’s not QA. That’s scapegoating.

What to do instead: Track these separately as “systemic issues” that need product/policy fixes. Your QA program should be flagging these for leadership, not punishing agents for working within broken systems.

Everything at Once

“But we need to measure X and Y and Z and—”

No. You don’t. Not all at once.

Start small. Add more later if needed. Most of the time, you won’t need to.

How to Build Your V1 Rubric (In Under 2 Hours)

Stop theorizing. Start doing.

Step 1: Pull 10 tickets. Five good ones, five bad ones. You know which ones they are. You’ve read enough tickets to have opinions.

Step 2: Ask yourself, “What made the good ones good? What made the bad ones bad?” Write it down. I call these “building blocks”; they are behaviors that can be repeated and therefore taught.

Step 3: Group those observations into 3-4 categories. Use the framework I gave you above if you’re stuck.

Step 4: Write 1-2 specific, observable sub-questions per category. Not “Was this good?” but “Did they identify the correct issue?” and “Did they provide a working solution?”

Step 5: Test it on 5 tickets with 2 reviewers. Do the two reviewers score them roughly the same? If not, your criteria are too subjective. Fix them. (A quick sketch of this check follows the steps.)

Step 6: Ship it.
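
That Step 5 check doesn’t need anything fancy. If you’d rather script it than eyeball a spreadsheet, here’s a rough Python sketch; the reviewer scores, ticket IDs, and the 1-point tolerance are made-up placeholders, not gospel.

# Compare two reviewers' total scores (out of 10) on the same 5 test tickets.
reviewer_a = {"T-101": 9, "T-102": 6, "T-103": 8, "T-104": 10, "T-105": 5}
reviewer_b = {"T-101": 8, "T-102": 4, "T-103": 8, "T-104": 9, "T-105": 6}

TOLERANCE = 1  # scores within 1 point of each other count as "roughly the same"

disagreements = []
for ticket, score_a in reviewer_a.items():
    score_b = reviewer_b[ticket]
    if abs(score_a - score_b) > TOLERANCE:
        disagreements.append((ticket, score_a, score_b))

print(f"Agreement: {1 - len(disagreements) / len(reviewer_a):.0%} of tickets")
for ticket, a, b in disagreements:
    print(f"{ticket}: reviewer A gave {a}, reviewer B gave {b} -- too subjective?")

If more than a ticket or two lands in the disagreement list, the fix is almost always rewriting a vague sub-question, not retraining a reviewer.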

Time investment:

  • 30 min: Pull and review tickets
  • 30 min: Draft categories and questions
  • 30 min: Test with real tickets
  • 30 min: Refine and document

Total: 2 hours.

That’s it. You don’t need a committee. You don’t need six rounds of stakeholder feedback. You need something that works, and you need it now.

Here’s an example of what simple looks like:

Category: Accuracy (5 points)
- Did they identify the correct issue? (2 pts)
- Did they provide a working solution? (3 pts)

Category: Communication (3 points)
- Is the response clear and actionable? (2 pts)
- Did they set expectations? (1 pt)

Category: Process (2 points)
- Did they tag/route correctly? (1 pt)
- Did they document key info? (1 pt)

Total: 10 points
Pass threshold: 8/10

Simple. Clear. Usable.
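
A spreadsheet handles this fine. But if you’d rather wire it into a script or an internal tool, here’s a rough Python sketch of that same example rubric as data plus a pass/fail check. The categories, weights, and 8/10 threshold mirror the example above; the function and answer labels are just illustrative.

# The example rubric as data: category -> sub-question -> points.
RUBRIC = {
    "Accuracy": {
        "Identified the correct issue": 2,
        "Provided a working solution": 3,
    },
    "Communication": {
        "Response is clear and actionable": 2,
        "Set expectations": 1,
    },
    "Process": {
        "Tagged/routed correctly": 1,
        "Documented key info": 1,
    },
}
PASS_THRESHOLD = 8  # out of 10

def score_review(answers):
    """answers maps each sub-question to True/False; returns (score, passed)."""
    score = sum(
        points
        for questions in RUBRIC.values()
        for question, points in questions.items()
        if answers.get(question)
    )
    return score, score >= PASS_THRESHOLD

# Example: the agent nailed accuracy and process but forgot to set expectations.
answers = {
    "Identified the correct issue": True,
    "Provided a working solution": True,
    "Response is clear and actionable": True,
    "Set expectations": False,
    "Tagged/routed correctly": True,
    "Documented key info": True,
}
print(score_review(answers))  # (9, True)

Notice that every sub-question is a yes/no you can point to in the ticket. That’s what makes two reviewers land on the same number.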

Real Examples (Because You Asked – A Lot)

Someone in Support Driven recently asked what people use in their rubrics. A founder named Rasmus Chow shared how they break down “Product Knowledge” into specific sub-questions:

  • Did the agent paraphrase or mirror the customer’s inquiry to ensure alignment?
  • Did they proactively gather relevant context from internal and external resources to fully understand the issue?
  • Did they run initial diagnostics using appropriate tools for the specific product and inquiry?

See that? Those aren’t vague qualities like “knows the product.” Those are observable behaviors you can point to in a ticket and say “yes” or “no.”

That’s the level of specificity you want.

And for “Solution Provided,” they ask:

  • Did the agent translate technical terms into clear, simple language instead of just linking to docs?
  • Did they resolve the inquiry with a complete explanation – covering next steps, SLA, and resources – while safeguarding sensitive internal info?

Again: Specific. Observable. You can calibrate on this.

How to Evolve Your Rubric (Don’t Set and Forget)

Your rubric should change as your team does.

Quarter 1: Measure the basics (accuracy, communication, process)

Quarter 2: Add complexity if you need it (efficiency, advanced troubleshooting)

Quarter 3+: Refine based on what you’re actually coaching on

Here’s how you know your rubric needs updating:

  • Reviewers constantly say “this doesn’t apply to this ticket”
  • Everyone’s scoring 95%+ (you’re not measuring the right things)
  • Managers say “this tells me nothing useful”
  • You’re spending more time debating scores than coaching agents

If any of those are true, your rubric is broken. Fix it or burn it.
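
The 95%+ symptom in particular is easy to spot without a dashboard. A rough sketch, assuming you can export recent review scores as percentages (the numbers and the 80% cut-off are placeholders):

# Flag grade inflation: what share of recent reviews score 95% or higher?
scores = [100, 95, 98, 100, 92, 97, 100, 96]  # recent review scores, in %

share_high = sum(1 for s in scores if s >= 95) / len(scores)
if share_high > 0.8:
    print(f"{share_high:.0%} of reviews score 95%+ -- the rubric may be too easy to ace")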

Just Ship It Already

Your rubric doesn’t need to be perfect. It needs to be useful.

Start with 3-4 categories. Test it for a month. Adjust what’s not working. That’s it.

If you’ve been stuck in rubric design hell for six months because you’re waiting for consensus from 47 stakeholders, I have bad news: you’re never going to get consensus. Someone will always want “just one more thing” added.

Stop asking for permission. Build something simple. Use it. Fix what breaks.

That’s how you actually improve quality: by measuring it, not by endlessly theorizing about how you might measure it someday.


Need help getting unstuck? I build and test working QA rubrics in one week. That’s literally what I do.

Book a call here if you want hands-on help building yours.

