How-to Guide

How to set up agent approval gates and human review.

Implement tiered approval workflows for AI agent actions: auto-approve low-risk operations, require human confirmation for medium-risk, and enforce multi-person review for high-risk changes.

Decision

Start narrow

Use the article to decide the smallest useful workflow worth testing before expanding the system.

Risk to watch

Hype drift

Avoid turning a practical adoption step into a broad transformation promise nobody can verify.

Proof to collect

Business signal

Write down the owner, data boundary, review point, and measurable outcome before the first build.

TL;DR

Not all agent actions carry the same weight. A tiered approval system auto-approves low-risk operations such as read-only queries, asks one person to confirm medium-risk work like file edits, and holds high-risk changes (deployments, schema changes) for multi-person review. This guide walks through building the whole approval workflow.

Key takeaways

  • Risk tiers: Auto-approve, single approval, multi-approval, emergency stop
  • Detection: Automatic risk scoring based on action type and scope
  • Urgency: Fast-track for incidents; standard flow for normal operations
  • Audit: Every approval decision logged with full context
  • Escalation: Unreviewed requests escalate after configurable timeouts

Analysis

The pitch for AI agents is that they get on with the work so your people don't have to. The catch is that an agent willing to edit a file is, with the same confidence, willing to drop a production table. It doesn't pause to ask whether this one is different.

So the real question for any team running agents isn't "can it do the task?" It's "what happens the moment it tries to do something it shouldn't?" A read-only lookup and a database migration both arrive as the same kind of request. Treat them the same way and you either slow everything to a crawl with sign-offs, or you wave through the one action that takes the business down.

The fix most teams land on is approval gates: a layer that sorts each action by how much damage it could do, then routes it accordingly. Harmless work runs on its own. Risky work waits for a human. Genuinely dangerous work needs more than one set of eyes. Below is a working version of that pattern, in Python, with a Slack hook so approvers can respond where they already are.

One thing worth flagging up front, because it bites people: Slack's interactive Approve and Reject buttons don't reliably work through a plain incoming webhook. The example here shows the message shape, but for live buttons you'll need a proper Slack app. More on that at the end.

Prerequisites

  • Web application for the approval UI (or a Slack/Teams integration)
  • Database to hold approval state
  • Notification system (email, Slack, PagerDuty)
  • Authentication system for approvers

Step-by-Step Framework

Step 1: Risk Classification

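Start by deciding what each action is worth. The classifier reads the action type and scope, then sorts it into one of four tiers. Anything that doesn't match a riskier rule falls through to auto-approve, so read-only work never waits on a person. The flip side is that an action type the rules don't know about, or a single-file write whose estimated impact is wider than local, falls through too, so treat the auto-approve default as something to tighten before production.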

# approval/risk_classifier.py
from enum import Enum
from dataclasses import dataclass

class RiskLevel(Enum):
    AUTO_APPROVE = "auto"
    SINGLE_APPROVAL = "single"
    MULTI_APPROVAL = "multi"
    EMERGENCY_STOP = "emergency"

@dataclass
class AgentAction:
    action_type: str
    target: str
    scope: str  # "single_file", "directory", "database", "infrastructure"
    description: str
    estimated_impact: str  # "none", "local", "service", "organisation"

class RiskClassifier:
    RULES = {
        # Read-only operations
        RiskLevel.AUTO_APPROVE: [
            {"action_type": "read", "scope": "*"},
            {"action_type": "search", "scope": "*"},
            {"action_type": "lint", "scope": "*"},
            {"action_type": "test", "scope": "*"},
        ],
        # File modifications (non-critical)
        RiskLevel.SINGLE_APPROVAL: [
            {"action_type": "write", "scope": "single_file", "estimated_impact": "local"},
            {"action_type": "refactor", "scope": "single_file", "estimated_impact": "local"},
            {"action_type": "generate_tests", "scope": "*"},
        ],
        # Wide-scope or impactful changes
        RiskLevel.MULTI_APPROVAL: [
            {"action_type": "write", "scope": "directory"},
            {"action_type": "migrate", "scope": "*"},
            {"action_type": "deploy", "scope": "*"},
            {"action_type": "modify_schema", "scope": "*"},
            {"action_type": "delete", "scope": "*"},
        ],
        # Critical infrastructure
        RiskLevel.EMERGENCY_STOP: [
            {"action_type": "modify", "target": "production_database"},
            {"action_type": "delete", "target": "production_*"},
            {"action_type": "rotate", "target": "master_key"},
        ]
    }

    def classify(self, action: AgentAction) -> RiskLevel:
        # Check emergency rules first
        for level, rules in [
            (RiskLevel.EMERGENCY_STOP, self.RULES[RiskLevel.EMERGENCY_STOP]),
            (RiskLevel.MULTI_APPROVAL, self.RULES[RiskLevel.MULTI_APPROVAL]),
            (RiskLevel.SINGLE_APPROVAL, self.RULES[RiskLevel.SINGLE_APPROVAL])
        ]:
            for rule in rules:
                if self._matches(action, rule):
                    return level

        return RiskLevel.AUTO_APPROVE

    def _matches(self, action: AgentAction, rule: dict) -> bool:
        for key, pattern in rule.items():
            value = getattr(action, key, "")
            if pattern != "*" and not self._match_pattern(value, pattern):
                return False
        return True

    def _match_pattern(self, value: str, pattern: str) -> bool:
        import fnmatch
        return fnmatch.fnmatch(value, pattern)

The order matters. Emergency rules get checked first, then multi-approval, then single. That way a delete against production_* trips the emergency tier before any looser rule can claim it. The pattern matching leans on Python's standard-library `fnmatch`, which handles shell-style wildcards like production_* out of the box, so you write the rules and the library does the comparison.
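If the fall-through default worries you, one option is to make the classifier fail closed. The sketch below is a cut-down variant, not the classifier above: the names `Tier`, `Action`, `ORDERED_RULES` and the `default` parameter are illustrative assumptions. It keeps the same first-match ordering, lists the auto-approve rules explicitly, and sends anything unmatched to single approval. It also uses `fnmatchcase`, which is case-sensitive on every platform, whereas plain `fnmatch.fnmatch` normalises case on Windows.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Tier(Enum):
    AUTO = "auto"
    SINGLE = "single"
    MULTI = "multi"
    EMERGENCY = "emergency"


@dataclass(frozen=True)
class Action:
    action_type: str
    target: str
    scope: str


# Ordered from most to least restrictive; the first matching rule wins.
# These four rules are a small illustrative subset, not the full rule set.
ORDERED_RULES = [
    (Tier.EMERGENCY, {"action_type": "delete", "target": "production_*"}),
    (Tier.MULTI, {"action_type": "delete"}),
    (Tier.SINGLE, {"action_type": "write", "scope": "single_file"}),
    (Tier.AUTO, {"action_type": "read"}),
]


def classify(action: Action, default: Tier = Tier.SINGLE) -> Tier:
    """Return the tier of the first matching rule, or `default` if none match."""
    for tier, rule in ORDERED_RULES:
        if all(fnmatchcase(getattr(action, key), pattern)
               for key, pattern in rule.items()):
            return tier
    # Nothing matched: fail closed rather than auto-approving the unknown.
    return default
```

With this shape, a new action type the rules have never seen waits for a person until someone adds a rule for it, which trades a little friction for not having to remember every risky verb in advance.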

Step 2: Approval Workflow Engine

Once an action has a risk level, something has to track it from request to decision. The workflow engine creates the request, records who approved it, and closes it out when enough people have signed off. Auto-approved actions skip the queue entirely and return straight away.

# approval/workflow.py
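import uuid
from datetime import datetime, timedelta

# Step 1 lives in approval/risk_classifier.py, so import its types from there.
from .risk_classifier import AgentAction, RiskClassifier, RiskLevel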

class ApprovalRequest:
    def __init__(self, action: AgentAction, risk_level: RiskLevel):
        self.id = str(uuid.uuid4())
        self.action = action
        self.risk_level = risk_level
        self.status = "pending"  # pending, approved, rejected, expired, auto_approved
        self.created_at = datetime.utcnow()
        self.expires_at = self.created_at + timedelta(hours=24)
        self.approvals = []
        self.rejection_reason = None

class ApprovalWorkflow:
    REQUIREMENTS = {
        RiskLevel.AUTO_APPROVE: {"approvers": 0, "timeout_minutes": 0},
        RiskLevel.SINGLE_APPROVAL: {"approvers": 1, "timeout_minutes": 60},
        RiskLevel.MULTI_APPROVAL: {"approvers": 2, "timeout_minutes": 240},
    }

    def __init__(self, db, notifier):
        self.db = db
        self.notifier = notifier
        self.classifier = RiskClassifier()

    async def submit(self, action: AgentAction) -> ApprovalRequest:
        risk = self.classifier.classify(action)

        request = ApprovalRequest(action, risk)

        if risk == RiskLevel.AUTO_APPROVE:
            request.status = "auto_approved"
            return request

        # Save to database
        await self.db.save(request)

        # Notify approvers
        await self.notifier.send_approval_request(request)

        return request

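    async def approve(self, request_id: str, approver_id: str) -> ApprovalRequest:
        request = await self.db.get(request_id)

        if request.status != "pending":
            raise ValueError(f"Request is {request.status}")

        # EMERGENCY_STOP has no entry in REQUIREMENTS, so fail with a clear
        # error instead of a KeyError.
        requirement = self.REQUIREMENTS.get(request.risk_level)
        if requirement is None:
            raise ValueError(f"No approval policy for {request.risk_level.value}")

        # Multi-person review only counts if the approvers are different people.
        if any(a["approver"] == approver_id for a in request.approvals):
            raise ValueError(f"{approver_id} has already approved this request")

        request.approvals.append({
            "approver": approver_id,
            "timestamp": datetime.utcnow()
        })

        if len(request.approvals) >= requirement["approvers"]:
            request.status = "approved"

        # Persist before notifying, so a failed notification cannot lose
        # a recorded approval.
        await self.db.save(request)
        if request.status == "approved":
            await self.notifier.notify_agent(request)
        return request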

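    async def reject(self, request_id: str, approver_id: str, reason: str):
        request = await self.db.get(request_id)
        # Same guard as approve: a decided request cannot be rejected later.
        if request.status != "pending":
            raise ValueError(f"Request is {request.status}")
        request.status = "rejected"
        request.rejected_by = approver_id  # keep who rejected it for the audit trail
        request.rejection_reason = reason
        await self.db.save(request)
        await self.notifier.notify_agent(request)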

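The REQUIREMENTS table is where you tune the trade-off between speed and safety. Single-approval requests carry a 60-minute timeout; multi-approval gets four hours, because rounding up two people takes longer than rounding up one. Each request also carries an expiry 24 hours after it's created, so nothing is meant to sit in the queue forever. As written, though, the engine only records those values: nothing reads timeout_minutes or expires_at, so you still need a scheduled job that escalates overdue requests and marks expired ones. REQUIREMENTS also has no entry for EMERGENCY_STOP, so decide explicitly whether those actions are blocked outright or routed to a separate override process. Notice that approve raises if the request isn't pending any more, which stops a stale Slack button from double-approving something that's already been decided.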
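One way to enforce those timeouts is a periodic sweep over pending requests. The sketch below is a minimal, standalone illustration under stated assumptions: `PendingRequest`, its `escalated` flag, and the `sweep` function are hypothetical names, and the per-request `timeout_minutes` is assumed to have been copied from REQUIREMENTS when the request was created. It works on plain in-memory objects so the logic is easy to test before you wire it to a real database and notifier.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PendingRequest:
    id: str
    created_at: datetime
    timeout_minutes: int   # escalation deadline, e.g. 60 or 240
    expires_at: datetime   # hard expiry, e.g. created_at + 24 hours
    status: str = "pending"
    escalated: bool = False


def sweep(requests: list[PendingRequest], now: datetime) -> tuple[list[str], list[str]]:
    """Update requests in place and return (escalated_ids, expired_ids)."""
    escalated, expired = [], []
    for req in requests:
        if req.status != "pending":
            continue
        if now >= req.expires_at:
            # Past the hard expiry: close the request out.
            req.status = "expired"
            expired.append(req.id)
        elif (not req.escalated
              and now >= req.created_at + timedelta(minutes=req.timeout_minutes)):
            # Past the review timeout but not expired: escalate exactly once.
            req.escalated = True
            escalated.append(req.id)
    return escalated, expired
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the sweep deterministic under test. In production you would call it from a scheduler with `datetime.now(timezone.utc)` and send the escalated IDs to whoever is next in line.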

Step 3: Slack Integration

Most teams don't want approvers logging into a separate dashboard. Pushing the request into Slack, where they already spend their day, is what makes the gate get used instead of bypassed.

# approval/notifiers.py
from .risk_classifier import RiskLevel
from .workflow import ApprovalRequest

class SlackNotifier:
    def __init__(self, webhook_url: str):
        self.webhook_url = webhook_url

    async def send_approval_request(self, request: ApprovalRequest):
        color = {
            RiskLevel.SINGLE_APPROVAL: "warning",
            RiskLevel.MULTI_APPROVAL: "danger",
        }.get(request.risk_level, "#439FE0")  # "info" is not a valid attachment colour; use a hex value

        payload = {
            "attachments": [{
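                "color": color,
                "fallback": f"Approval required: {request.action.description}",
                # Slack requires callback_id on attachments that carry buttons;
                # "agent_approval" is a placeholder value.
                "callback_id": "agent_approval",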
                "title": f"Approval Required: {request.action.action_type}",
                "fields": [
                    {"title": "Action", "value": request.action.description, "short": False},
                    {"title": "Target", "value": request.action.target, "short": True},
                    {"title": "Risk Level", "value": request.risk_level.value, "short": True},
                    {"title": "Request ID", "value": request.id, "short": True},
                ],
                "actions": [
                    {
                        "name": "approve",
                        "text": "Approve",
                        "type": "button",
                        "style": "primary",
                        "value": request.id
                    },
                    {
                        "name": "reject",
                        "text": "Reject",
                        "type": "button",
                        "style": "danger",
                        "value": request.id
                    }
                ]
            }]
        }

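        # requests is blocking, so run it in a worker thread rather than
        # stalling the event loop, and surface HTTP errors instead of dropping them.
        import asyncio
        import requests
        response = await asyncio.to_thread(
            requests.post, self.webhook_url, json=payload, timeout=10
        )
        response.raise_for_status()

    async def notify_agent(self, request: ApprovalRequest):
        # The workflow in Step 2 calls this once a request is decided. How the
        # agent is told (queue, callback, polling) depends on your runtime,
        # so it is left unimplemented here.
        raise NotImplementedError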

The colour mapping does some quiet work here: single-approval messages come through amber (warning), multi-approval comes through red (danger), so an approver reads the stakes before reading a word. The attachment shape itself is sound. Slack documents that interactive buttons live in an actions array inside an attachment, and that attachments accept warning and danger colour values, per its legacy interactive message field guide.

There's a catch, though, and it's the one I flagged at the top. This example POSTs to a plain incoming webhook, and Slack's own docs are blunt that legacy incoming webhooks don't support interactive messages. Drop this code in as-is and the buttons either won't render or won't do anything when clicked. To get working Approve and Reject buttons you need a proper Slack app: post the message with chat.postMessage, turn on interactivity, and point it at an endpoint that catches the button clicks and feeds them back into the approve and reject methods from Step 2. Treat the snippet as the message template, not the whole integration.
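To make the endpoint side of that concrete, here is a hedged sketch of the two things it has to do before calling approve or reject: check that the request really came from Slack, and pull the decision out of the payload. The signing scheme (an HMAC-SHA256 over `v0:{timestamp}:{body}`, sent in the `X-Slack-Signature` header alongside `X-Slack-Request-Timestamp`) follows Slack's documented request verification. The function names are hypothetical, and the payload parsing assumes the legacy attachment-button shape used in the snippet above; Block Kit payloads are laid out differently.

```python
import hashlib
import hmac
import json
import time
from typing import Optional
from urllib.parse import parse_qs


def verify_slack_signature(signing_secret: str, timestamp: str, body: bytes,
                           signature: str, now: Optional[float] = None,
                           max_age_seconds: int = 300) -> bool:
    """Check the X-Slack-Signature header against the raw request body."""
    now = time.time() if now is None else now
    try:
        # Reject old requests to limit replay attacks.
        if abs(now - int(timestamp)) > max_age_seconds:
            return False
    except ValueError:
        return False
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected.encode(), signature.encode())


def parse_button_click(body: bytes) -> tuple[str, str, str]:
    """Return (decision, request_id, slack_user_id) from a form-encoded body."""
    form = parse_qs(body.decode())
    payload = json.loads(form["payload"][0])
    action = payload["actions"][0]
    return action["name"], action["value"], payload["user"]["id"]
```

The decision is the button's `name` ("approve" or "reject") and the request ID is its `value`, matching the attachment above. You would still need to map the Slack user ID to an approver in your own authentication system before trusting the click.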

Do/Don't

Do | Don't
Auto-approve all read-only operations | Require approval for every action
Set expiration timeouts on approval requests | Leave requests open indefinitely
Escalate unreviewed requests after timeout | Let requests sit in queues
Log every approval/rejection with full context | Skip audit logging for "convenience"
Support emergency override with post-hoc review | Block critical incident response
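For the audit-logging row, "full context" can be as simple as one structured record per decision. The sketch below is an assumption about what such a record might hold, not a schema from this guide: `AuditEntry`, its field names, and `audit_line` are hypothetical. It serialises each decision as a single JSON line, which most log pipelines can ingest as-is.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditEntry:
    request_id: str
    decision: str     # "approved", "rejected", "expired" or "auto_approved"
    actor: str        # approver ID, or "system" for automatic decisions
    risk_level: str
    action_type: str
    target: str
    reason: str       # rejection reason or override justification, may be empty
    decided_at: str   # ISO 8601 timestamp in UTC


def audit_line(entry: AuditEntry) -> str:
    """Serialise one decision as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Writing auto-approved actions to the same log matters as much as writing the human decisions, because those are the ones nobody looked at when they happened.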

Conclusion

If you're putting agents anywhere near production, approval gates aren't optional. The tiered approach earns its keep by matching the level of scrutiny to the level of risk: read-only work runs on its own, file edits get one reviewer, and anything that touches schemas or deployments needs two people to agree. Log every decision, expire every request, and keep an emergency override so the gate never gets in the way of a real incident. Build that, and you get most of the speed agents promise without betting the business on a single bad call.

What to do next

  1. Pick the smallest useful workflow that proves the pattern.
  2. Write down the owner, data boundary, review point, and success measure.
  3. Review the result after the first real run and decide whether to scale, change, or stop.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.