Analysis
Most engineering teams have the same quiet problem with code review. The reviews that matter get rushed, and the ones that don't matter eat an afternoon. A senior developer ends up skimming a 600-line pull request between meetings, missing the off-by-one error, and waving through the part that ships to production.
The idea here is simple: let a model do the first pass. It reads the diff the moment a pull request opens, flags what looks wrong, and posts its notes as comments before any human has spent a minute on it. By the time a reviewer shows up, the obvious stuff is already circled.
The catch, and the reason this isn't just "let the AI approve everything," is that the model only suggests. A person still decides what's a real bug and what's noise. That split matters, so it's baked into the design from the start.
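One way to make that split concrete is to have the bot only ever post a comment-type review, never an approval. The sketch below builds a request body for GitHub's "create a review for a pull request" endpoint with the event fixed to COMMENT. The shape of each finding (filename, line, message) is an assumption made for illustration, not something the rest of this build defines.

```python
def build_review_payload(findings: list[dict]) -> dict:
    """Build a review body that suggests but never approves or blocks.

    Assumes each finding is a dict with hypothetical keys:
    "filename", "line" (line number in the new file) and "message".
    """
    return {
        # COMMENT leaves the approve / request-changes decision to a human
        "event": "COMMENT",
        "body": (
            f"Automated first pass: {len(findings)} suggestion(s). "
            "A human reviewer makes the final call."
        ),
        "comments": [
            {
                "path": f["filename"],
                "line": f["line"],
                "side": "RIGHT",  # comment on the new version of the file
                "body": f["message"],
            }
            for f in findings
        ],
    }


example = build_review_payload(
    [{"filename": "app/cart.py", "line": 42, "message": "Possible off-by-one in loop bound."}]
)
```

Because the event is hard-coded, nothing the model returns can turn a suggestion into an approval.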
What follows is the build itself: pulling the diff out of GitHub, feeding it to a review agent, and wiring the result back into a pull request. The code below uses Claude Sonnet 4.6 for the heavy analysis and, where you want a faster and cheaper second opinion, GPT-5.5 Instant for quick checks.
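As a rough map of how those three pieces fit together, here is a minimal sketch of the overall loop. The three callables are placeholders passed in by the caller; their names and signatures are assumptions for illustration, and the real versions are built up step by step in the rest of this piece.

```python
from typing import Callable


def run_first_pass(
    fetch_changes: Callable[[], list[dict]],
    review_file: Callable[[dict], list[dict]],
    post_comment: Callable[[dict], None],
) -> int:
    """Fetch changed files, review each one, post every finding.

    Returns the number of findings posted.
    """
    posted = 0
    for change in fetch_changes():
        for finding in review_file(change):
            post_comment(finding)
            posted += 1
    return posted


# Dry run with stand-in functions, no network calls involved
collected: list[dict] = []
count = run_first_pass(
    fetch_changes=lambda: [{"filename": "a.py"}, {"filename": "b.py"}],
    review_file=lambda change: [{"file": change["filename"], "note": "looks odd"}],
    post_comment=collected.append,
)
```

Passing the steps in as functions keeps the loop easy to test without touching GitHub or a model.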
Prerequisites
- GitHub repository
- GitHub Actions enabled
- Anthropic API key
- Python 3.10+
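A quick preflight check can catch a missing key before the workflow gets as far as calling any API. This is a sketch under two assumptions: the Anthropic key lives in ANTHROPIC_API_KEY (the variable the Anthropic SDK reads by default), and the GitHub token is exposed to the script as GITHUB_TOKEN, which is a naming choice for this example.

```python
import os
import sys
from typing import Mapping, Optional


def missing_prerequisites(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    # Variable names are assumptions; adjust to match your workflow's secrets
    for name in ("ANTHROPIC_API_KEY", "GITHUB_TOKEN"):
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"Missing prerequisite: {problem}")
```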
Step-by-Step Framework
Step 1: PR Diff Extraction
First job is getting the changes out of GitHub in a shape you can work with. The List pull request files endpoint hands back one object per changed file, with the filename, status, line counts, and the raw patch. The code below grabs that, skips anything that was deleted, and breaks each patch into hunks so you keep track of which line numbers changed.
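Before the extractor itself, a quick aside on why the hunk headers matter. A header such as `@@ -10,3 +10,4 @@` says the hunk starts at line 10 in both the old and new file, and from there you can count forward to find the new-file line number of every added line. The helper below is a standalone sketch of that counting; `added_line_numbers` is a hypothetical name and is not part of the extractor module.

```python
import re

# Counts after the comma are optional: a one-line hunk is written "@@ -5 +5 @@"
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_line_numbers(patch: str) -> list[int]:
    """Return the new-file line number of every added line in a unified diff patch."""
    added = []
    new_line = None  # stays None until the first hunk header is seen
    for line in patch.split("\n"):
        header = HUNK_HEADER.match(line)
        if header:
            new_line = int(header.group(3))
            continue
        if new_line is None:
            continue
        if line.startswith("+"):
            added.append(new_line)
            new_line += 1
        elif line.startswith("-") or line.startswith("\\"):
            # Removed lines and "\ No newline at end of file" markers
            # do not occupy a line in the new file
            continue
        else:
            new_line += 1  # context line, present in both versions
    return added


sample_patch = "\n".join([
    "@@ -10,3 +10,4 @@ def total(items):",
    "     result = 0",
    "-    for i in range(len(items) - 1):",
    "+    for i in range(len(items)):",
    "+        # include the last item",
    "         result += items[i]",
])
lines = added_line_numbers(sample_patch)  # [11, 12]
```

Those new-file line numbers are what you need later if you want a comment to land on the exact line the model flagged.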
# code_review/diff_extractor.py
import re

import requests


def fetch_pr_diff(owner: str, repo: str, pr_number: int, token: str) -> list[dict]:
    """Fetch and parse PR diff into structured file changes."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/files"
    headers = {"Authorization": f"token {token}", "Accept": "application/vnd.github.v3+json"}
    # Only the first page is fetched (30 files by default); larger PRs need pagination
    response = requests.get(url, headers=headers, timeout=30)
    # Fail loudly on auth errors, rate limits, or a wrong PR number
    response.raise_for_status()
    files = response.json()

    changes = []
    for f in files:
        # Deleted files have nothing left to review
        if f["status"] == "removed":
            continue
        # Binary and very large files come back without a "patch" field
        patch = f.get("patch", "")
        # Parse hunk headers
        hunks = parse_hunks(patch)
        changes.append({
            "filename": f["filename"],
            "status": f["status"],
            "additions": f["additions"],
            "deletions": f["deletions"],
            "patch": patch,
            "hunks": hunks
        })
    return changes
def parse_hunks(patch: str) -> list[dict]:
    """Parse diff patch into hunks with line numbers."""
    hunks = []
    current_hunk = None
    for line in patch.split("\n"):
        if line.startswith("@@"):
            # New hunk: @@ -old_start,old_count +new_start,new_count @@
            # The counts are optional: a one-line hunk is written "@@ -5 +5 @@"
            match = re.match(r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@", line)
            if match:
                if current_hunk:
                    hunks.append(current_hunk)
                current_hunk = {
                    "old_start": int(match.group(1)),
                    "new_start": int(match.group(3)),
                    "lines": []
                }
        elif current_hunk is not None:
            # Body line: context (" "), addition ("+") or removal ("-")
            current_hunk["lines"].append(line)
    if current_hunk:
        hunks.append(current_hunk)
    return hunks
Step 2: Code Analysis Agent
Now the part that does the reading. This agent takes one file change at a time and asks the model to review it. Working file by file keeps each prompt small, which is what holds the response time down on big pull requests. The client comes straight from the official Anthropic Python SDK, so there's no custom plumbing to maintain.
# code_review/analyzer.py
import json

from anthropic import Anthropic


class CodeReviewAgent:
    def __init__(self):
        # Anthropic() reads the API key from the ANTHROPIC_API_KEY environment variable
        self.client = Anthropic()

    def review_file(self, file_change: dict, repo_context: str = "") -> list[dict]:
        """Review a single file change and return findings."""
        prompt = f"""You are an expert code reviewer. Review this code change carefully.
File: {file_change['filename']}
Status: {file_change['status']}
Lines changed: +{file_change['additions']}/-{file_change['deletions']}
Repository context: {repo_context}
Code diff:


