AgentGuard — AI Quality Assurance

What is AgentGuard?

AgentGuard is the structural backbone of AI-generated artifacts. It works as an MCP server alongside any AI agent — Claude, Cursor, Windsurf, or any MCP-compatible tool — providing the scaffolding, validation, and self-challenge layer that turns a raw prompt into a production-ready result. When you ask an AI agent to build something, AgentGuard makes sure it builds it right.

Instead of asking an AI agent to produce an artifact in one shot, AgentGuard breaks it into structured steps: skeleton (structure) → contracts (interfaces & schemas) → wiring (dependencies) → logic (implementation) → validation (automated checks). This pipeline applies to software, research papers, strategic plans, technical documentation, and any other artifact an AI agent can generate.

Archetypes are the core concept. They are battle-tested blueprints that encode quality standards for a specific artifact type — think of them as “quality recipes” that define:

Expected project structure (files, folders, purposes)
Tech stack requirements (language, framework, database, testing)
Validation rules (5 automated checks, with auto-fix where a fix is possible)
Quality criteria (adversarial self-challenge for reliability)

Why It Matters

Stop Shipping Spaghetti

Every archetype is a quality contract between the author and your codebase. Code quality, verified by AI — not vibes.

Structured Generation

Instead of asking an AI agent to generate a whole project in one shot, AgentGuard breaks it into 4 levels: skeleton, contracts, wiring, logic. Each level gets its own focused prompt, which guides the agent toward better code.

Validation Gate

Syntax, linting, imports, structure, types. All checked automatically with smart auto-fixes.

Challenge Gate

Code is reviewed against the archetype's quality criteria. Each archetype defines its own checklist, which the AI agent uses to challenge the code it generated.

How It Works (Buyers)

Your journey from discovery to shipping code, in five steps:

1. Browse Marketplace. Discover thousands of community-created archetypes organized by category.
2. Pick an Archetype. Choose based on your project type (API, frontend, CLI, library, etc.).
3. Configure MCP Server. Install AgentGuard as an MCP server. Add to your AI agent's MCP config.
4. Generate Validated Code. One command: generates, validates, self-challenges, returns production-ready code.
5. Ship with Confidence. Code quality verified. Validation and challenge gates passed. Ready for production.

What You Get

Structure Validation

Your code matches the archetype's expected structure. No surprises, no misplaced files.

Self-Challenge

Code that tests itself against your quality criteria. Adversarial review ensures reliability.

Structured Prompts

Each level of generation gets a focused, well-crafted prompt. Better prompts = better code from any AI agent.

Auto-Fix on Failure

Linting, imports, structure issues fixed automatically. Only real failures bubble up.

Why Sell on AgentGuard?

You are an expert in a specific domain: healthcare APIs, e-commerce frontends, data pipelines, whatever. That expertise has real value.

AgentGuard lets you package your best practices as a sellable archetype. Once published, developers discover it, use it, and you earn recurring revenue — passive income from your engineering standards.

How It Works (Authors)

Turn your knowledge into a published archetype in five steps:

1. Define Quality Rules. Write a YAML file with your tech stack, validation rules, and self-challenge criteria.
2. Test with CLI. Run your archetype locally. Generate code, validate, challenge. Iterate until it's perfect.
3. Publish to Marketplace. Upload your archetype via the dashboard. Our team reviews for quality and safety.
4. Set Price & Earn. You set the price. Taxes and fees deducted first, then 20% platform commission. Revenue in your pocket.
5. Track & Optimize. Monitor downloads, revenue, ratings. Iterate based on user feedback. Build your author reputation.

What Makes a Great Archetype

Domain expertise matters. A great archetype:

Solves a real problem. Healthcare APIs need different checks than e-commerce frontends. Your specialty is the advantage.
Has concrete quality criteria. Not "code is clean," but "all endpoints have authentication," "error responses follow RFC 7807," "database uses migrations."
Includes reference patterns. Show developers how to write code that passes your criteria.
Is versioned and maintained. Quality standards evolve. Update your archetype as tech stacks change.

Author Inspiration

Your Engineering Standards, Packaged as Code Quality DNA

Every decision you make about structure, validation, and quality — encode it. Let it live forever.

Build Once, Sell Forever

Create your archetype once. Every developer who discovers it pays you. Passive income stream from your expertise.

Video Tutorials

Watch the AgentGuard Demo Series — 6 videos covering everything from "what is this?" to production migrations. Videos play inline with a floating mini player.

Watch the Demo Series (6 videos, ~25 min total)

What is AgentGuard? Explainer (3:54)
Installation & Setup from Zero (3:44)
Project from Scratch with AI (~6 min)
Automatic Documentation (ADR, PRD) (~5 min)
Migration: Spaghetti → CQRS (~6 min)

Installation

Install AgentGuard as a pip package:

Terminal

pip install rlabs-agentguard

Then configure it as an MCP server in your AI agent. Add to your claude_desktop_config.json (Claude Desktop) or MCP settings (Cursor, Windsurf):

claude_desktop_config.json

{
  "mcpServers": {
    "agentguard": {
      "command": "agentguard-mcp",
      "args": [],
      "env": {
        "AGENTGUARD_API_KEY": "your-key-here"
      }
    }
  }
}
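
If your config file already lists other MCP servers, the AgentGuard entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, assuming the config has the JSON shape shown above. The helper name add_agentguard_server and the "other-tool" entry are hypothetical and not part of AgentGuard.

```python
import json


def add_agentguard_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP config with the agentguard entry added.

    Hypothetical helper: mirrors the JSON shown above and leaves any
    other configured servers untouched.
    """
    merged = json.loads(json.dumps(config))  # cheap deep copy of JSON data
    servers = merged.setdefault("mcpServers", {})
    servers["agentguard"] = {
        "command": "agentguard-mcp",
        "args": [],
        "env": {"AGENTGUARD_API_KEY": api_key},
    }
    return merged


# Example: an existing config that already has one (made-up) server.
existing = {"mcpServers": {"other-tool": {"command": "other-mcp", "args": []}}}
updated = add_agentguard_server(existing, "your-key-here")
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating its input, so you can inspect the result before writing it back to disk.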

Quick Start

Once AgentGuard is configured as an MCP server, your AI agent (Claude, Cursor, etc.) can call these tools directly:

AI Agent Workflow

# Step 1: Get skeleton prompt
> @agentguard skeleton --archetype api_backend --spec "A user auth API"

# Step 2: Generate contracts and wiring
> @agentguard contracts_and_wiring --archetype api_backend --skeleton <output>

# Step 3: Implement logic
> @agentguard logic --archetype api_backend --contracts <output>

# Step 4: Validate code
> @agentguard validate --files <your_code>

# Step 5: Get quality criteria
> @agentguard get_challenge_criteria --archetype api_backend

AgentGuard provides structured prompts and validation — your AI agent generates the actual code using its own AI model.

Pipeline

AgentGuard provides structured MCP tools that guide an AI agent through a 4-level top-down code generation flow (skeleton → contracts → wiring → logic), followed by a validation pass.

MCP Tool Sequence

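# Assumes `client` is an already-connected MCP client session and
# `generated_files` maps file paths to the source code your agent produced.

# Tool 1: Generate file structure prompt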
skeleton = await client.call_tool("skeleton", {
    "archetype": "api_backend",
    "spec": "A user authentication API with JWT"
})

# Tool 2: Generate contracts + wiring
contracts_wiring = await client.call_tool("contracts_and_wiring", {
    "archetype": "api_backend",
    "skeleton": skeleton
})

# Tool 3: Generate implementation logic
logic = await client.call_tool("logic", {
    "archetype": "api_backend",
    "contracts": contracts_wiring
})

# Tool 4: Validate the generated code
validation = await client.call_tool("validate", {
    "archetype": "api_backend",
    "files": generated_files
})

Pipeline Flow

When an AI agent uses AgentGuard tools, the following happens:

L1 — Skeleton: Get a structured prompt for file structure planning (which files to create, their purpose)
L2 — Contracts: Get a prompt to generate function signatures, types, and interfaces for each file
L3 — Wiring: Get a prompt to add import statements and inter-file references so modules connect correctly
L4 — Logic: Get a prompt to implement function bodies with actual business logic
Validation Gate: Run the validate tool to check 5 categories (syntax, lint, imports, structure, types). It auto-fixes what it can and reports the remaining issues.
Challenge Gate: Call the get_challenge_criteria tool to get quality criteria for the archetype and manually review the code against them.

Archetypes

An archetype defines a project type — its tech stack, expected file structure, validation rules, and quality criteria. Archetypes are the key to generating code that matches real-world best practices.

Built-in Archetypes

Archetype	Description	Quality Criteria
api_backend	REST API with FastAPI, auth, DB, tests	15
web_app	Full-stack web app with frontend + backend	20
react_spa	React single-page application	30
cli_tool	Command-line tool with Click	7
library	Publishable Python package	9
script	Standalone utility script	5
debug_backend	Debugging protocol for Python/FastAPI services	custom
debug_frontend	Debugging protocol for React/TypeScript frontends	custom

archetype_usage.py

from agentguard import Archetype

# Load a built-in archetype
arch = Archetype.load("api_backend")

# Inspect it
print(arch.name)            # "api_backend"
print(arch.tech_stack)      # TechStack(language="python", framework="fastapi", ...)
print(arch.pipeline)        # PipelineConfig(max_files=25, parallel=True, ...)
print(arch.self_challenge)  # SelfChallengeConfig(criteria=[...], max_rework=3)

# Load from a custom YAML file
custom = Archetype.from_file("my_archetype.yaml")

Custom Archetypes

Create your own archetype as a YAML file:

my_archetype.yaml

id: django_api
name: django_api
version: "1.0.0"
description: "Django REST Framework API"

tech_stack:
  language: python
  framework: django
  runtime: python3.12
  package_manager: pip
  test_framework: pytest

pipeline:
  max_files: 20
  parallel: true

validation:
  checks:
    - syntax
    - lint
    - imports
    - structure

self_challenge:
  max_rework: 3
  criteria:
    - "All endpoints have proper authentication"
    - "Database models use migrations"
    - "Error responses follow RFC 7807"

💡 The full field reference and annotated examples are in the Archetype Author Guide below.

Top-Down Generation

The core of AgentGuard is a 4-level top-down code generation strategy. Instead of asking an AI agent to generate an entire project in one shot, AgentGuard breaks it into structured, composable levels with focused prompts:

L1: Skeleton

File structure planning. Determines which files to create and their purposes. No code yet — just architecture.

L2: Contracts

Function signatures, types, interfaces. Each file gets class/function declarations with NotImplementedError bodies.

L3: Wiring

Inter-file imports and call chains. Ensures modules reference each other correctly before logic is written.

L4: Logic

Function body implementation. Each stub is filled in with actual business logic, one function at a time.

💡 This top-down approach tends to produce better code because it guides the AI agent to reason about architecture before implementation, much as a human developer would.
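
To make the difference between L2 and L4 concrete, here is a small illustration of one function at both levels. It is a sketch only: the function names, the PBKDF2 parameters, and the password example are assumptions chosen for illustration, not output from AgentGuard.

```python
import hashlib
import hmac
import os


# L2 (contracts): signature, types, and docstring only; the body is a stub.
def hash_password_stub(password: str, salt: bytes) -> bytes:
    """Derive a storage-safe hash from a password and salt."""
    raise NotImplementedError


# L4 (logic): the same signature with the body filled in.
def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a storage-safe hash from a password and salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Compare in constant time to avoid leaking timing information."""
    return hmac.compare_digest(hash_password(password, salt), expected)


salt = os.urandom(16)
stored = hash_password("s3cret", salt)
print(verify_password("s3cret", salt, stored))
```

Because the stub already fixes the signature and return type, any module wired to it at L3 keeps working unchanged once the body is implemented at L4.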

Validation

After code generation, AgentGuard runs 5 automated checks in fast-to-slow order. If the first check (syntax) fails, the pipeline stops early — no point running expensive type-checking on invalid code.

Check	What It Does	Auto-Fix
syntax	Python AST parsing — catches syntax errors	No
lint	Ruff linter — style, unused imports, formatting	Yes (ruff --fix)
imports	Verifies all imports resolve correctly	Yes (removes broken imports)
structure	Checks file structure matches archetype expectations	No
types	Mypy type checking — catches type mismatches	No
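
The fast-to-slow ordering with an early stop can be sketched with the standard library alone. This is not AgentGuard's implementation: it covers only the syntax and imports checks, and the names CheckResult, check_syntax, check_imports, and run_checks are hypothetical.

```python
import ast
import importlib.util
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    issues: list[str]


def check_syntax(files: dict[str, str]) -> CheckResult:
    """Parse each file with the standard-library AST parser."""
    issues = []
    for path, source in files.items():
        try:
            ast.parse(source, filename=path)
        except SyntaxError as exc:
            issues.append(f"{path}:{exc.lineno}: {exc.msg}")
    return CheckResult("syntax", not issues, issues)


def check_imports(files: dict[str, str]) -> CheckResult:
    """Flag absolute imports whose top-level module cannot be located."""
    issues = []
    for path, source in files.items():
        for node in ast.walk(ast.parse(source, filename=path)):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names = [node.module]
            else:
                continue
            for name in names:
                top = name.split(".")[0]
                if importlib.util.find_spec(top) is None:
                    issues.append(f"{path}: cannot resolve import '{name}'")
    return CheckResult("imports", not issues, issues)


def run_checks(files: dict[str, str]) -> list[CheckResult]:
    """Run checks cheapest first and stop as soon as syntax fails."""
    results = [check_syntax(files)]
    if not results[0].passed:
        return results  # later checks need parseable code
    results.append(check_imports(files))
    return results


good = {"main.py": "import json\n\ndef greet(name: str) -> str:\n    return f'Hello {name}'\n"}
bad = {"main.py": "def greet(:\n    pass\n"}
print([r.name for r in run_checks(good)])
print([r.name for r in run_checks(bad)])
```

Note that this sketch resolves imports against the current Python environment, so modules that exist only inside the generated project would be reported as unresolved.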

Using the validate MCP tool

# Call the validate tool
result = await client.call_tool("validate", {
    "archetype": "api_backend",
    "files": {
        "main.py": "def greet(name: str) -> str:\n    return f'Hello {name}'"
    }
})

# Result includes:
# - passed: true/false
# - checks: List of individual check results
# - issues: Any syntax, lint, import, or type errors
# - fixed: Auto-fixed files (if fixable)

Self-Challenge

Self-challenge is an optional quality review step. Each archetype defines its own challenge criteria — these are quality requirements the AI agent can use to evaluate generated code. For example, api_backend has 15 criteria including authentication, error handling, and input validation checks.

challenge_tool_usage.py

# Get challenge criteria for an archetype
criteria = await client.call_tool("get_challenge_criteria", {
    "archetype": "api_backend"
})

# Criteria are a checklist the AI agent can use to review code:
# - All endpoints require authentication
# - Error responses follow RFC 7807
# - Database models use migrations
# - No hardcoded secrets or credentials
# ... and 11 more

# The AI agent uses these to manually review the generated code
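
One way an agent might act on the checklist is a bounded review-and-rework loop. The sketch below assumes the agent supplies two callables, one that returns the criteria a piece of code fails and one that revises the code. The names challenge_loop, toy_review, and toy_rework are hypothetical, and the toy reviewer stands in for the agent's own judgement.

```python
from typing import Callable

Review = Callable[[str, list[str]], list[str]]  # returns the failed criteria
Rework = Callable[[str, list[str]], str]        # returns revised code


def challenge_loop(code: str, criteria: list[str], review: Review,
                   rework: Rework, max_rework: int = 3) -> tuple[str, list[str]]:
    """Review code against criteria, reworking until clean or out of attempts."""
    failed = review(code, criteria)
    attempts = 0
    while failed and attempts < max_rework:
        code = rework(code, failed)
        attempts += 1
        failed = review(code, criteria)
    return code, failed


def toy_review(code: str, criteria: list[str]) -> list[str]:
    # Stand-in for the agent's judgement: one mechanical rule only.
    rule = "No hardcoded secrets or credentials"
    if rule in criteria and 'API_KEY = "' in code:
        return [rule]
    return []


def toy_rework(code: str, failed: list[str]) -> str:
    # Stand-in for the agent's revision step.
    return code.replace('API_KEY = "abc123"', 'API_KEY = os.environ["API_KEY"]')


final_code, remaining = challenge_loop(
    'API_KEY = "abc123"',
    ["No hardcoded secrets or credentials"],
    toy_review,
    toy_rework,
)
print(final_code, remaining)
```

The default of three attempts mirrors the max_rework value shown in the archetype examples. Whatever still fails after the last attempt is returned to the caller rather than hidden.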

Tracing & Usage

When using AgentGuard MCP tools, your AI agent handles all model calls — AgentGuard provides the structured prompts and validation logic. Token consumption depends entirely on your agent's model. You can track tool call durations and usage through the AgentGuard dashboard analytics.

CLI Reference

The CLI provides utility functions for archetype management and local validation:

Terminal

# List available archetypes
agentguard list

# Show archetype details
agentguard info api_backend

# Validate existing files
agentguard validate ./my-project --archetype api_backend

# Validate an archetype YAML file
agentguard validate-archetype my_archetype.yaml

# Start the MCP server (for Claude Desktop, Cursor, etc.)
agentguard mcp-serve

# Reload archetype definitions
agentguard reload-archetypes

Archetype Author Guide

This section is a complete reference for writing, customizing, and publishing archetype YAML files. Archetypes are the primary extension point of AgentGuard — they encode the tech stack, expected structure, validation rules, quality criteria, and configuration for a specific class of software project.

Identity Fields

Field	Type	Required	Constraints
id	string	Yes	Lowercase alphanumeric + underscores, 3–64 chars, starts with letter
name	string	Yes	2–255 characters
description	string	No	Up to 2000 characters
version	string	No	Semver: MAJOR.MINOR.PATCH. Defaults to 1.0.0
maturity	enum	No	prototype | production | enterprise

tech_stack

Declares the canonical tech stack. All values are validated against known identifiers and included in every structured prompt sent to AI agents.

tech_stack block

tech_stack:
  defaults:
    language: "python"       # python | typescript | javascript | go | rust | ...
    framework: "fastapi"     # fastapi | django | express | react | click | none | ...
    database: "postgresql"   # postgresql | mysql | sqlite | mongodb | none | ...
    testing: "pytest"        # pytest | jest | vitest | go_test | ...
    linter: "ruff"           # ruff | eslint | golangci-lint | none | ...
    type_checker: "mypy"     # mypy | pyright | tsc | none
  overridable: true          # allow users to override when using the archetype

⚠️ Tech stack choices must be internally consistent. Pairing language: python with linter: eslint will confuse the AI agent, because the mismatch is passed straight into the prompt context.
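
A consistency check of this kind can be run before publishing. The sketch below is an assumption-heavy illustration: the COMPATIBLE table is a hand-written guess at sensible pairings, not AgentGuard's list of known identifiers, and find_inconsistencies is a hypothetical helper.

```python
# Assumed compatibility table; not AgentGuard's actual list of identifiers.
COMPATIBLE = {
    "python": {
        "linter": {"ruff", "none"},
        "type_checker": {"mypy", "pyright", "none"},
        "testing": {"pytest"},
    },
    "typescript": {
        "linter": {"eslint", "none"},
        "type_checker": {"tsc", "none"},
        "testing": {"jest", "vitest"},
    },
    "go": {
        "linter": {"golangci-lint", "none"},
        "type_checker": {"none"},
        "testing": {"go_test"},
    },
}


def find_inconsistencies(defaults: dict[str, str]) -> list[str]:
    """Return one message per field that does not fit the declared language."""
    language = defaults.get("language")
    rules = COMPATIBLE.get(language)
    if rules is None:
        return []  # unknown language: nothing to compare against
    problems = []
    for field, allowed in rules.items():
        value = defaults.get(field)
        if value is not None and value not in allowed:
            problems.append(
                f"{field}: {value!r} is not a usual choice for language {language!r}"
            )
    return problems


doc_defaults = {
    "language": "python", "framework": "fastapi", "database": "postgresql",
    "testing": "pytest", "linter": "ruff", "type_checker": "mypy",
}
print(find_inconsistencies(doc_defaults))
print(find_inconsistencies({"language": "python", "linter": "eslint"}))
```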

pipeline

Field	Default	Notes
levels	all 4	Must start with skeleton. Omit levels for simpler archetypes.
enable_self_challenge	true	Includes challenge_criteria in the archetype.
enable_structural_validation	true	Checks expected_dirs / expected_files exist.
max_self_challenge_retries	3	Advisory for AI agents evaluating code.
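
The rule that levels must start with skeleton can be expressed as a small check. In the sketch below, the additional requirements that levels keep the canonical order and do not repeat are assumptions made for illustration, and resolve_levels is a hypothetical helper.

```python
from typing import Optional

CANONICAL_LEVELS = ["skeleton", "contracts", "wiring", "logic"]


def resolve_levels(levels: Optional[list[str]] = None) -> list[str]:
    """Return the levels to run, defaulting to all four."""
    if levels is None:
        return list(CANONICAL_LEVELS)
    if not levels or levels[0] != "skeleton":
        raise ValueError("levels must start with 'skeleton'")
    unknown = [name for name in levels if name not in CANONICAL_LEVELS]
    if unknown:
        raise ValueError(f"unknown levels: {unknown}")
    order = [CANONICAL_LEVELS.index(name) for name in levels]
    if order != sorted(set(order)):
        # Assumed rule: keep the canonical order and do not repeat a level.
        raise ValueError("levels must keep the canonical order without repeats")
    return list(levels)


print(resolve_levels())
print(resolve_levels(["skeleton", "logic"]))
```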

context_recipes

Controls what context is included in the structured prompt at each generation level and the recommended token budget per call.

context_recipes block

context_recipes:
  skeleton:
    include: ["spec", "archetype_structure"]
    max_tokens: 3000
  contracts:
    include: ["spec", "skeleton", "reference_patterns"]
    max_tokens: 6000
  wiring:
    include: ["contracts", "skeleton"]
    max_tokens: 8000
  logic:
    include: ["function_stub", "function_tests", "function_deps", "reference_patterns"]
    max_tokens: 5000

Source	Description
spec	The user's natural language spec
archetype_structure	The archetype's structure block as YAML
skeleton	Output from the skeleton level
contracts	Output from the contracts level
function_stub	The function stub being implemented (logic only)
function_tests	Existing tests for the current function
function_deps	Dependencies the function calls
reference_patterns	Reference code patterns defined in the archetype's reference_patterns block
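
How a recipe turns into prompt context can be sketched as follows. This is an illustration under assumptions: the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, the section headers are invented, and build_context is a hypothetical helper rather than AgentGuard's own assembly logic.

```python
RECIPES = {
    "skeleton": {"include": ["spec", "archetype_structure"], "max_tokens": 3000},
    "contracts": {"include": ["spec", "skeleton", "reference_patterns"], "max_tokens": 6000},
}

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def build_context(level: str, sources: dict[str, str], recipes: dict = RECIPES) -> str:
    """Join the sources a recipe includes, trimmed to its token budget."""
    recipe = recipes[level]
    sections = [
        f"## {name}\n{sources[name]}"
        for name in recipe["include"]
        if name in sources  # skip sources that are not available yet
    ]
    text = "\n\n".join(sections)
    budget_chars = recipe["max_tokens"] * CHARS_PER_TOKEN
    return text[:budget_chars]


sources = {
    "spec": "A user auth API",
    "archetype_structure": "src/\n  main.py",
    "skeleton": "ignored at this level",
}
print(build_context("skeleton", sources))
```

Only the sources named in the level's include list reach the prompt, which is why the skeleton output above is left out when building context for the skeleton level itself.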

validation

Check	What it does	Auto-fix
syntax	Python AST parsing — catches syntax errors	No
lint	Ruff linter — style, unused imports	Yes (ruff --fix)
imports	Verifies all imports resolve	Yes
structure	Checks files match expected_files	No
types	Mypy type-checking	No

type_strictness controls mypy mode: off, basic, or strict (--strict flag).

Writing self_challenge Criteria

Each criterion is a checklist item that an AI agent can use to review generated code. Concrete, verifiable statements produce reliable results.

Good

All endpoints match the spec
No hardcoded secrets or credentials
Error handling present on all routes
Database models match data requirements

Bad

Code is clean (subjective)
Good error handling (too vague)
Follows best practices (circular)
Proper structure (undefined)

Schema Validation Rules

id must match ^[a-z][a-z0-9_]{1,62}[a-z0-9]$
version must be valid semver
No path traversal (..) or absolute paths in structure
Self-challenge criteria: 1–500 chars each, max 50 per archetype

💡 Run agentguard validate-archetype my_archetype.yaml to get a full validation report before publishing.
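
The schema rules above are simple enough to approximate locally. The sketch below is not the validator that ships with AgentGuard: the input keys structure_paths and criteria are a flattened, hypothetical shape, and the semver pattern is simplified to MAJOR.MINOR.PATCH without pre-release or build metadata.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath

ID_PATTERN = re.compile(r"^[a-z][a-z0-9_]{1,62}[a-z0-9]$")
# Simplified semver: MAJOR.MINOR.PATCH only, no pre-release or build metadata.
SEMVER_PATTERN = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")
MAX_CRITERIA = 50


def is_safe_path(path: str) -> bool:
    """Reject rooted or drive-qualified paths and any '..' segment."""
    if PurePosixPath(path).anchor or PureWindowsPath(path).anchor:
        return False
    return ".." not in re.split(r"[\\/]", path)


def validate_archetype(doc: dict) -> list[str]:
    """Return a list of rule violations; an empty list means all rules pass."""
    errors = []
    if not ID_PATTERN.fullmatch(str(doc.get("id", ""))):
        errors.append("id: must be 3-64 chars of a-z, 0-9, _ and start with a letter")
    if not SEMVER_PATTERN.fullmatch(str(doc.get("version", "1.0.0"))):
        errors.append("version: must be MAJOR.MINOR.PATCH")
    for path in doc.get("structure_paths", []):  # hypothetical flattened key
        if not is_safe_path(path):
            errors.append(f"structure: unsafe path {path!r}")
    criteria = doc.get("criteria", [])  # hypothetical flattened key
    if len(criteria) > MAX_CRITERIA:
        errors.append(f"self_challenge: at most {MAX_CRITERIA} criteria allowed")
    for index, text in enumerate(criteria):
        if not 1 <= len(text) <= 500:
            errors.append(f"self_challenge: criterion {index} must be 1-500 chars")
    return errors


ok = {
    "id": "django_api",
    "version": "1.0.0",
    "structure_paths": ["src/main.py"],
    "criteria": ["Error responses follow RFC 7807"],
}
bad = {
    "id": "A",
    "version": "1.0",
    "structure_paths": ["../etc/passwd", "/abs"],
    "criteria": [""],
}
print(validate_archetype(ok))
print(validate_archetype(bad))
```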

MCP Tools

AgentGuard is primarily used as an MCP server. It exposes 17 MCP (Model Context Protocol) tools that AI agents can call directly. This is how Claude Desktop, Cursor, Windsurf, and other MCP-compatible agents integrate AgentGuard.

MCP Setup

Terminal

# Install AgentGuard
pip install rlabs-agentguard

# Run the MCP server
agentguard-mcp

Add to your Claude Desktop or Cursor MCP config:

claude_desktop_config.json

{
  "mcpServers": {
    "agentguard": {
      "command": "agentguard-mcp",
      "args": []
    }
  }
}

Available MCP Tools

The tools are organized into categories. The table below lists 14 of the 17:

Category	Tool	Purpose
Agent-Native	skeleton	Get structured prompt for file structure planning (L1)
Agent-Native	contracts	Get prompt for function signatures for a single file (L2)
Agent-Native	contracts_and_wiring	Combined L2+L3 prompt for contracts and imports
Agent-Native	wiring	Get prompt for inter-file import resolution (L3)
Agent-Native	logic	Get prompt for implementing function bodies (L4)
Agent-Native	get_challenge_criteria	Get quality criteria for an archetype
Agent-Native	digest	Summarize generation results with scores
Agent-Native	debug	Structured debugging protocol: hypotheses, fix or escalation
Agent-Native	migrate	Migration plan: digest source → concerns → incompatibilities → port
Utility	validate	Run 5-check validation on code files
Utility	list_archetypes	List all available archetypes
Utility	trace_summary	View execution trace statistics
Pipeline	generate	Run the full generate → validate → challenge pipeline
Pipeline	challenge	Run adversarial self-challenge on code

💡 Agent-native tools return structured prompts that guide the AI agent's own code generation, with no extra API key needed. The agent does the generation; AgentGuard provides the strategy.

Agent Frameworks

AgentGuard works with any MCP-compatible agent framework:

Claude Desktop — Native MCP support. Add agentguard-mcp to your MCP config.
Cursor — MCP support. Configure in Cursor settings.
Windsurf — MCP support. Use MCP tools in your workflows.
Custom Agent Frameworks — Any framework that supports MCP can integrate AgentGuard tools.

Platform Architecture

The AgentGuard Platform is a cloud service that extends the open-source library with team features, analytics, and the archetype marketplace.

Architecture Overview

Open-Source Library

The core engine runs locally on your machine. All code generation, validation, and challenge steps happen client-side. Your code never touches our servers.

Platform API

Optional cloud backend for analytics, usage tracking, and marketplace operations. It receives only metadata (token counts, costs, archetype usage), never your actual code.

Web Dashboard

Visual interface for browsing the marketplace, viewing analytics, managing archetypes, and tracking team usage.

Platform Features

Dashboard — View usage analytics, token consumption, cost breakdowns by model and archetype
Team Management — Invite team members, set roles, share archetype libraries
Archetype Publishing — Submit your custom archetypes to the marketplace for others to use
Usage Reports — Detailed reports on generation quality, validation pass rates, and model comparison

Marketplace

The Archetype Marketplace is where the community shares and discovers project archetypes. Think of it like an app store — but for code generation templates.

For Users (Browsing & Installing)

Register for free on the platform (no credit card required)
Browse archetypes by category (backend, frontend, fullstack, CLI, library, data, devops)
Filter by price (many are free), rating, and download count
Purchase or install free archetypes with one click (one-time purchase, not a subscription)
Use them immediately with the library or CLI: agentguard generate --archetype marketplace/fastapi-celery-redis
Rate and review archetypes to help the community

For Authors (Publishing)

Create your archetype as a YAML file (see Custom Archetypes)
Submit it through the dashboard — our team reviews it to ensure quality and safety
Set your price (or make it free). Taxes and processing fees are deducted first, then a 20% platform commission is applied.
Track downloads, revenue, and ratings on your author dashboard

ℹ️ Pricing: Authors set their own prices. Taxes and payment processing fees are deducted from the sale price first, then a 20% platform commission is applied. Refunds: 10-day refund window if the archetype has not been used.
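
As a worked example of the order of deductions, the sketch below computes an author's payout for a single sale. The sale amounts are made up, the rounding mode is an assumption, and author_payout is a hypothetical helper. Only the 20% rate and the "taxes and fees first" order come from the pricing note above.

```python
from decimal import Decimal, ROUND_HALF_UP

COMMISSION_RATE = Decimal("0.20")  # platform commission stated above
CENT = Decimal("0.01")


def author_payout(price: str, taxes: str, processing_fees: str) -> Decimal:
    """Deduct taxes and fees first, then take the commission on the remainder."""
    net = Decimal(price) - Decimal(taxes) - Decimal(processing_fees)
    commission = (net * COMMISSION_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    return (net - commission).quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical sale: 50.00 price, 4.00 tax, 1.75 processing fee.
print(author_payout("50.00", "4.00", "1.75"))  # 35.40
```

Here the commission is charged on 44.25 (the price after taxes and fees), which gives 8.85, so the author keeps 35.40. Charging the commission on the full 50.00 instead would leave 34.25.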

Quality Assurance

Every archetype published to the marketplace goes through a review process:

Automated validation — YAML schema checks, attribute completeness, criteria coherence
Human review — Our team checks for accuracy, safety, and alignment with stated purpose
Community feedback — Ratings and reviews surface the best archetypes naturally

Customer Journey

AgentGuard is designed to meet developers where they are and grow with them. Here is the intended progression:

Discover & Try (Free)

A developer hears about AgentGuard and installs the library. They can use it locally for free, with their own AI agent (Claude, Cursor, etc.). No account needed.

pip install rlabs-agentguard
# Configure agentguard-mcp in your AI agent's MCP settings
# Free. Forever. No account needed.

Integrate with AI Agent (Free)

The developer connects AgentGuard as an MCP server to their AI agent (Claude Desktop, Cursor, etc.). The agent now has 17 structured tools for code generation.

agentguard mcp-serve
# Claude Desktop or Cursor can now use agentguard tools

Browse the Marketplace (Free to Browse)

The developer needs a project type that isn't in the built-in list. They browse the marketplace and find community-created archetypes — some free, some paid.

Register on the Platform (Free)

The developer creates a free account to access the dashboard, analytics, and marketplace. They can now see cost breakdowns, quality metrics, model comparisons, and purchase community archetypes.

Team Expansion (Free)

Their team adopts AgentGuard. They invite team members to the same free account, set roles, and share archetype libraries and analytics. Multiple developers now use the same quality standards.

Become an Author (Optional)

Power users create their own archetypes and publish them to the marketplace. They set their own prices and earn revenue after taxes, payment processing fees, and a 20% platform commission are deducted, building a passive income stream from their expertise.

💡 At every step, the core engine is free and open-source. Registration is free. The platform adds convenience, analytics, and community access, but you can always use AgentGuard without ever creating an account.

Pricing Model

AgentGuard is free to use. The library, CLI, and MCP tools are free and open-source forever. Registration and platform access are free. The marketplace enables authors to sell archetypes at prices they set.

Registration

Free. No subscriptions, no hidden fees. Create an account to access the dashboard, analytics, and marketplace.

Built-in Archetypes

8+ project archetypes included free with the CLI. Use them with any MCP-compatible AI agent. No additional cost.

Marketplace Archetypes

Browse free and paid archetypes published by the community. Authors set prices. One-time purchases (not subscriptions). 10-day refund window if unused.

Enterprise

For self-hosted deployments, SLAs, and dedicated support, contact [email protected].

Key principle: AgentGuard is the structural layer — your AI agent (Claude, Cursor, Windsurf, or any MCP-compatible tool) provides the generation power. AgentGuard makes sure what gets generated meets the quality bar you defined.