I open-sourced 24 QA skills for Claude Code — from spec to release

Originally published at dev.to

TL;DR: I just open-sourced QA Claude Skill, a set of 24 production-grade QA skills for Claude Code covering test design, automation, performance, security, mutation testing, and more. It is dual-licensed: MIT for non-commercial use, with a separate commercial license. GitHub repo: kao273183/qa-claude-skill.

The problem

For two years I've been iterating on a personal Claude Code workspace for QA work: bug reports, test plans, review checklists, regression matrices. It saved me hours every week.

But every time a colleague asked "how do you write a test plan that fast?" — handing them my workspace meant they got dozens of files hard-coded with my JIRA project key, my Slack user ID, my AWS bucket. Useless to anyone else.

So I spent the last two weeks extracting 24 skills into a properly generalized, open-source repo. Drop in your team's IDs via config.json and it works for any team, any stack.

What's in the box

24 skills across 8 categories:

Test Design (8)

test-master · flutter-test-master · test-review · regression-test · speckit-to-tc · tc-version-diff · sheet-md-sync · smoke-test-analyzer

Automation (3)

test-automation · flutter-test-automation · tc-to-pytest

Bug Management (1)

bug-report

Quality Quantification (2)

mutation-testing · property-based-test-gen

Reporting (1)

publish-regression

Performance & Security (3)

performance-test-gen · security-scan · api-contract-test

CI Health (2)

visual-regression-gen · flaky-test-hunter

Quality Specialties (4)

a11y-audit · localization-test · push-notification-test · test-data-factory

What it actually does

Each skill activates on natural language triggers. Some examples:

1. "I want to file a bug"

The bug-report skill walks you through RIDER format (Reproduction / Impact / Device / Expected vs Actual / References), checks JIRA for duplicates, does root-cause analysis from git history, creates the ticket with the right priority, and sends a Slack DM — in one conversation.
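To make the RIDER layout concrete, here is a minimal sketch of how the five sections could be held in a structure and rendered to Markdown. The class name, field names, and heading layout are my own illustration and are not taken from the skill's source; the JIRA and Slack steps are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class RiderReport:
    """Hypothetical container for the five RIDER sections."""
    reproduction: list[str]          # ordered reproduction steps
    impact: str
    device: str
    expected: str
    actual: str
    references: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the report as Markdown, one heading per RIDER section."""
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(self.reproduction, 1))
        refs = "\n".join(f"- {r}" for r in self.references) or "- (none)"
        return (
            f"## Reproduction\n{steps}\n\n"
            f"## Impact\n{self.impact}\n\n"
            f"## Device\n{self.device}\n\n"
            f"## Expected vs Actual\n"
            f"- Expected: {self.expected}\n"
            f"- Actual: {self.actual}\n\n"
            f"## References\n{refs}\n"
        )


# Example values are invented for illustration.
report = RiderReport(
    reproduction=["Open cart", "Tap Checkout"],
    impact="Users cannot complete a purchase",
    device="Pixel 8, Android 14",
    expected="Payment screen opens",
    actual="App crashes",
)
print(report.to_markdown())
```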

2. "Plan tests for this new feature"

test-master reads your JIRA ticket (or your description), scans both iOS and Android repos for affected modules, designs a test pyramid (70% Unit / 20% Integration / 10% UI), generates black-box + white-box test cases in Google Sheets, identifies coverage gaps against existing tests, and builds an automation ROI roadmap.

It also enforces a11y must-checks per UI feature (Dynamic Type / VoiceOver / contrast / touch targets) — no more "we forgot accessibility" at the end of the sprint.
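As a small worked example of the 70/20/10 pyramid split mentioned above, the sketch below divides a planned number of test cases across the three layers. The function name and the rule of giving any rounding remainder to the unit layer are assumptions of mine, not something the skill documents.

```python
# Pyramid shares in whole percent, as stated in the post.
PYRAMID = {"unit": 70, "integration": 20, "ui": 10}


def split_cases(total: int) -> dict[str, int]:
    """Split `total` test cases across the pyramid layers."""
    counts = {layer: total * pct // 100 for layer, pct in PYRAMID.items()}
    # Assumption: cases lost to integer rounding go to the unit layer.
    counts["unit"] += total - sum(counts.values())
    return counts


print(split_cases(40))  # {'unit': 28, 'integration': 8, 'ui': 4}
```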

3. "Are my tests actually catching bugs?"

mutation-testing runs mutmut on your Python backend. It changes < to <=, True to False, or alters numeric literals, then re-runs your pytest suite. If your tests still pass against the broken code, that mutation survived, which means your test cases execute those lines without actually checking their behavior: fake coverage.
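Here is a hand-rolled illustration of what a surviving mutation looks like. It does not call mutmut; the mutant is written by hand, and the function and test names are hypothetical. It only shows why a test that avoids the boundary value cannot tell the original from the mutated code.

```python
def is_minor(age: int) -> bool:
    """Original code: anyone under 18 is a minor."""
    return age < 18


def is_minor_mutant(age: int) -> bool:
    """Hand-written mutant: `<` changed to `<=`."""
    return age <= 18


def weak_test(fn) -> bool:
    """Only checks values far from the boundary."""
    return fn(5) is True and fn(40) is False


def boundary_test(fn) -> bool:
    """Also pins the behaviour at exactly 18."""
    return weak_test(fn) and fn(18) is False


# The weak test passes on both versions, so the mutant survives.
weak_survives = weak_test(is_minor) and weak_test(is_minor_mutant)
# The boundary test passes on the original and fails on the mutant: killed.
boundary_kills = boundary_test(is_minor) and not boundary_test(is_minor_mutant)

print(f"mutant survives weak test: {weak_survives}")
print(f"mutant killed by boundary test: {boundary_kills}")
```

Both versions execute the same line, so line coverage is identical in each case; only the boundary assertion distinguishes them.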

Then property-based-test-gen takes those surviving mutants and generates Hypothesis strategies that run 200 generated inputs per test to close the gap.
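The idea behind property-based testing can be sketched with the standard library alone. This is a stand-in that uses `random` instead of Hypothesis so it runs without extra dependencies (in Hypothesis itself, the example count is set with `@settings(max_examples=200)`). The `clamp` function and the three properties are invented for illustration and are not output from the skill.

```python
import random


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test."""
    return max(low, min(value, high))


def check_clamp_properties(examples: int = 200, seed: int = 0) -> int:
    """Run `examples` random cases; return how many were checked."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(examples):
        low = rng.randint(-1000, 1000)
        high = rng.randint(low, 1000)
        value = rng.randint(-5000, 5000)
        result = clamp(value, low, high)
        # Property 1: the result always lies inside the range.
        assert low <= result <= high
        # Property 2: in-range values come back unchanged.
        if low <= value <= high:
            assert result == value
        # Property 3: clamping twice changes nothing (idempotence).
        assert clamp(result, low, high) == result
    return examples


print(check_clamp_properties(), "random cases passed")
```

Unlike this sketch, Hypothesis also shrinks a failing input to a minimal counterexample, which is a large part of its value.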

4. "Which tests should run on every PR?"

smoke-test-analyzer scans your existing test suite (iOS XCUITest / Android Espresso / pytest), scores each test on 5 weighted criteria (criticality / speed / stability / independence / coverage value), and tiers them:

  • T0 PR Smoke (< 3 min) — runs every PR
  • T1 Daily (< 10 min) — runs nightly
  • T2 Release (< 60 min) — pre-release full regression
  • T3 Manual — exploratory, visual, a11y

Then it generates .xctestplan for iOS or Gradle filters for Android.
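The scoring and tiering step described above could look roughly like the following. The post names the five criteria but not their weights or cut-offs, so every number here is an assumption, as is mapping the lowest scores to T3 (in the post, T3 covers exploratory, visual, and a11y work).

```python
# Assumed weights; the post names the criteria but not their weights.
WEIGHTS = {
    "criticality": 0.30,
    "speed": 0.20,
    "stability": 0.20,
    "independence": 0.15,
    "coverage_value": 0.15,
}

# Assumed score cut-offs, highest tier first.
TIERS = [(0.80, "T0 PR Smoke"), (0.60, "T1 Daily"), (0.40, "T2 Release")]


def score(ratings: dict[str, float]) -> float:
    """Weighted sum of ratings, each expected in the range 0.0 to 1.0."""
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def tier(ratings: dict[str, float]) -> str:
    """Return the first tier whose cut-off the weighted score reaches."""
    total = score(ratings)
    for cutoff, name in TIERS:
        if total >= cutoff:
            return name
    return "T3 Manual"  # simplification: everything below the last cut-off


# Hypothetical ratings for a fast, stable login test.
login_test = {
    "criticality": 1.0,
    "speed": 0.9,
    "stability": 0.9,
    "independence": 0.8,
    "coverage_value": 0.7,
}
print(tier(login_test))
```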

Three modes for any tool stack

Not every team has the same MCP servers installed. Same skills, three modes:

  • full-mcp — You have Atlassian + Slack + Google Workspace MCPs. Auto-creates tickets, sends notifications, writes Sheets.
  • partial-mcp — Some MCPs missing — skills degrade gracefully to Markdown.
  • markdown-only — Solo dev / no MCP / pure documentation flow. Zero external calls.

The markdown-only mode is what makes this actually portable — every skill can still produce useful Markdown reports under .claude/testing/ without external dependencies. Solo developers can use the full suite without setting up anything.
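One way to picture the three modes is as a function of which MCP servers are available. This is a sketch of the decision only; the server identifiers and the function name are hypothetical, and in the repo the mode is driven by config.json rather than by code like this.

```python
# Hypothetical identifiers for the three MCP servers named in the post.
REQUIRED_MCPS = frozenset({"atlassian", "slack", "google-workspace"})


def select_mode(available: set[str]) -> str:
    """Pick a mode name from the set of MCP servers that are installed."""
    present = REQUIRED_MCPS & available
    if present == REQUIRED_MCPS:
        return "full-mcp"        # everything available: tickets, DMs, Sheets
    if present:
        return "partial-mcp"     # some integrations, Markdown for the rest
    return "markdown-only"       # no external calls at all


print(select_mode({"atlassian"}))
```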

6 ready-to-use presets

cp config/presets/full-stack.json     config/config.json   # All MCPs
cp config/presets/jira-only.json      config/config.json   # JIRA only
cp config/presets/markdown-only.json  config/config.json   # Pure docs
cp config/presets/startup.json        config/config.json   # Small startup
cp config/presets/enterprise.json     config/config.json   # 5 team boards
cp config/presets/government.json     config/config.json   # High-compliance

Why I made it bilingual

I'm Taiwanese, and most of the test-engineering content out there is English-first. So every skill ships with:

  • SKILL.md — Traditional Chinese (primary)
  • SKILL.en.md — English mirror
  • concept-zh.md — Beginner intros for unfamiliar concepts (mutation testing, property-based testing, spec-driven dev, test tiering)

The README is in English (primary), Traditional Chinese, and Simplified Chinese.

The license model

I went with a dual license:

  • MIT — Personal use / education / research / non-profits / 30-day evaluation / open-source contributions
  • Commercial — For-profit company internal use, paid products, SaaS, paid consulting

See LICENSE-COMMERCIAL.md for how to obtain a commercial license. I'm doing this case-by-case via GitHub Issues — the goal isn't to monetize aggressively, but to leave space for sustainable enterprise support if it grows.

Quick start

git clone https://github.com/kao273183/qa-claude-skill.git
cd qa-claude-skill
cp config/config.example.json config/config.json   # Edit your IDs
./install.sh

In Claude Code:

Generate test plan for a user login feature

The test-master skill activates and walks you through. Or try:

  • "I want to file a bug — the checkout crashes on Android"
  • "Review these test cases [Google Sheet URL]"
  • "Check if my tests actually catch bugs in src/auth/"

Windows users — there's a PowerShell version (install.ps1) as of v1.3.0.

What's still missing

This is v1.6.2. The roadmap still has:

  • Japanese translation
  • Web UI for editing config.json visually
  • More skills (test-impact-analyzer, oauth-flow-test, websocket-realtime-test, llm-quality-eval...)

PRs welcome. The CONTRIBUTING.md has the template for adding a new skill.

Try it

GitHub: kao273183/qa-claude-skill

I'd love to hear what skills are missing for your team's stack — drop an issue or comment below.

If this saves your team time, you can buy me a coffee ☕ — but a ⭐ on the repo helps more.


This is a community / personal project for Claude Code users — NOT an official Anthropic product.
