AI Code Review Assistant
GitHub Action that uses the Claude API to automatically review pull requests — catches style issues, potential bugs, and suggests improvements before human reviewers look at the code.
This started as a weekend experiment that turned into something my team actually relies on.
Why I Built This
Code reviews are important but they’re a bottleneck. In my team, PRs would sit for hours waiting for someone to review them. The quality of reviews varied — sometimes you’d get detailed feedback, sometimes just a “LGTM.” I wanted something that could do the first pass automatically, catching the obvious stuff so human reviewers could focus on architecture and business logic.
How It Works
A GitHub Action triggers on every PR event. It:
- Collects the diff and relevant context (file history, related tests, PR description)
- Sends it to Claude’s API with a carefully crafted system prompt
- Parses Claude's structured response — feedback categorized as bugs, style issues, performance concerns, or suggestions
- Posts the review as PR comments with inline annotations
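The flow above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code — the function names, payload fields, and model id are all assumptions:

```python
# Hypothetical sketch of the review pipeline; names and fields are
# illustrative, not the project's actual implementation.

def build_review_payload(diff: str, pr_description: str, system_prompt: str) -> dict:
    """Assemble a Messages API request for one pull request."""
    return {
        "model": "claude-3-5-sonnet-20241022",  # assumed model id
        "max_tokens": 2048,
        "system": system_prompt,
        "messages": [{
            "role": "user",
            "content": f"PR description:\n{pr_description}\n\nDiff:\n{diff}",
        }],
    }

def to_inline_comments(review: dict) -> list[dict]:
    """Convert the structured review into GitHub inline-comment payloads,
    one comment per finding, anchored to the flagged file and line."""
    return [
        {
            "path": item["file"],
            "line": item["line"],
            "body": f"**{item['category']}**: {item['message']}",
        }
        for item in review.get("findings", [])
    ]
```

The actual action would send the payload via the Anthropic SDK and post each comment through GitHub's pull request review API; those calls are omitted here to keep the sketch self-contained.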
The system prompt is the secret sauce. I spent a lot of time tuning it to avoid the annoying “you should add a comment here” type feedback that developers ignore. It focuses on actual issues: null reference risks, missing error handling, test coverage gaps, security concerns.
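For illustration, a prompt in this spirit might look like the following. This is a paraphrase of the approach described above, not the project's actual prompt:

```python
# Hypothetical system prompt in the spirit described in the text;
# the project's real prompt is not reproduced here.
SYSTEM_PROMPT = """You are a senior engineer reviewing a pull request.
Report only concrete, actionable issues in these categories:
- bug: null/None dereference risks, off-by-one errors, unhandled edge cases
- style: issues that genuinely hurt readability, not cosmetic nits
- performance: accidental O(n^2) loops, repeated I/O inside loops
- suggestion: missing error handling, test coverage gaps, security concerns

Do NOT tell the author to add comments, and do not restate the diff.
Respond only with JSON of the form:
{"findings": [{"file": ..., "line": ..., "category": ..., "message": ...}]}
"""
```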
Technical Details
- Model: Claude Sonnet for the balance of quality vs. cost
- Prompt caching: Reduces API costs by ~60% since the system prompt and context instructions don’t change between reviews
- Structured outputs: JSON schema for consistent review format
- Configurable rules: Teams can adjust sensitivity, ignore patterns, and customize what gets flagged
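A minimal sketch of how the caching and structured-output pieces fit together: the `cache_control` structure follows Anthropic's prompt-caching API, while the parsing helper and review shape are assumptions for illustration, not the project's code:

```python
import json

def cached_system_block(prompt_text: str) -> list[dict]:
    """System prompt marked for prompt caching: the large, unchanging
    instruction block is tagged so repeated reviews reuse the cached
    prefix instead of paying for it on every PR."""
    return [{
        "type": "text",
        "text": prompt_text,
        "cache_control": {"type": "ephemeral"},
    }]

# Assumed review shape, matching the categories described above.
REVIEW_KEYS = {"file", "line", "category", "message"}
CATEGORIES = {"bug", "style", "performance", "suggestion"}

def parse_review(raw: str) -> list[dict]:
    """Validate the model's JSON against the expected review schema,
    dropping malformed findings rather than failing the whole run."""
    data = json.loads(raw)
    return [
        f for f in data.get("findings", [])
        if REVIEW_KEYS <= f.keys() and f["category"] in CATEGORIES
    ]
```

Dropping malformed findings (instead of raising) is a deliberate choice here: a review with one bad entry is still worth posting.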
What I Learned
- LLMs are surprisingly good at catching logic errors that linters miss
- They’re bad at understanding business context unless you give them very specific instructions
- Developers initially resist AI reviews, then quietly start relying on them
- Prompt caching is essential for cost control — without it, this would cost 2-3x more
Results
- Catches about 30% of issues before human review
- Review turnaround dropped from hours to minutes for the first pass
- Costs less than $50/month for a team of 10 developers
- My team’s code quality metrics actually improved (fewer bugs in production)
Timeline: First prototype November 2024, current version running since January 2025.