Killer Code

Fed Up and Built My First Sub Agent: UI Visual Validator

Tired of Claude Code's self-deceptive UI-modification "validations," I built this strict visual verification agent: it defaults to failure, trusts only screenshots, and treats every claim of a successful fix with suspicion.


The Breaking Point

I couldn't take it anymore, so I wrote my first sub agent.

The breaking point came when Claude Code became insufferably dumb—stubborn yet insecure. I'd ask it to modify a UI, it'd glance at the code and say "done," but when I took a screenshot, it was a complete mess. Even when I asked it to verify with screenshots after making changes, it would still claim "all good." But when I confronted it about how this could possibly be considered "fixed," it would accurately describe exactly what it did wrong, then proceed to make the same mistake again next time.

After two hours of back-and-forth frustration, I had an epiphany: you need to fight fire with fire.

So I built this agent specifically to combat all forms of "I think I fixed it." Its principle is simple: default to failure, only trust screenshots, no excuses accepted—using strict visual analysis to verify whether UI modifications actually achieve their goals.

Problem Analysis

Claude Code's "Self-Deception" Problem

When using Claude Code for UI development, you've probably encountered these situations:

  • Over-optimism: Claude glances at code and declares "modification complete," while actual results completely miss requirements
  • Visual hallucinations: Even when provided screenshots, Claude still "sees" modification effects that don't exist
  • Repeated mistakes: After clearly pointing out issues, Claude can accurately analyze errors but continues making the same mistakes
  • Lack of critical thinking: Overly confident about its modification results, lacking sufficient self-validation

Why We Need an Independent Validation Agent

Traditional solutions involve having Claude Code validate itself, but this has fundamental problems:

  1. Cognitive bias: Claude easily gets "misled" by its own code modifications
  2. Context pollution: The process of modifying code affects judgment of results
  3. Lack of adversarial thinking: Insufficient "skeptical" attitude to examine results

Solution: Introduce a specialized, skeptical validation agent to create a "checks and balances" mechanism.

Introducing UI Visual Validator Agent

Core Design Philosophy

This agent's design philosophy can be summarized as:

  • Default assumption: modification failed
  • Validation standard: visual evidence
  • Working method: adversarial verification
  • Output requirement: objective description

Key Features

1. Strict Visual Analysis

  • Default assumption that all modifications haven't succeeded until proven otherwise
  • Judgments based entirely on screenshots, ignoring code hints
  • Verification through detailed visual measurements
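The "detailed visual measurements" idea can be sketched in pure Python: locate a target element in a screenshot by color and report its measured bounding box, rather than trusting what the code claims. The synthetic "screenshot," the `RED` target color, and the tolerance are illustrative assumptions, not part of the agent itself:

```python
def measure_by_color(pixels, color, tol=10):
    """pixels: 2D list of (r, g, b) rows. Return (left, top, width, height)
    of the region matching `color`, or None if no pixel matches."""
    xs, ys = [], []
    for y, row in enumerate(pixels):
        for x, p in enumerate(row):
            if all(abs(p[i] - color[i]) <= tol for i in range(3)):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing matched: the default "failed" verdict stands
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)

WHITE, RED = (255, 255, 255), (220, 30, 30)
# Synthetic "screenshot": a 4x2 red button at (2, 1) on a 10x5 white canvas.
shot = [[RED if 2 <= x < 6 and 1 <= y < 3 else WHITE for x in range(10)]
        for y in range(5)]
print(measure_by_color(shot, RED))  # (2, 1, 4, 2)
```

Returning `None` when the element is absent matches the agent's bias: no visual evidence means no success.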

2. Evidence-Driven Validation

  • Only accepts observable visual changes as evidence
  • Rejects "should work" judgments based on code logic
  • Requires specific pixel-level descriptions

3. Reverse Validation Mechanism

  • Actively seeks evidence of modification failure
  • Questions whether apparent "success" truly achieves goals
  • Verifies changes meet specific expected requirements

4. Objective Descriptive Output

  • Starts with "From the screenshot, I observe..."
  • Provides detailed visual measurement data
  • Clearly distinguishes "achieved," "partially achieved," "not achieved"

Workflow

flowchart TD
    A[UI Modification Request] --> B[Claude Code Executes Changes]
    B --> C[Generate Screenshot]
    C --> D[Call UI Visual Validator]
    D --> E{Default Assumption: Modification Failed}
    E --> F[Detailed Visual Analysis]
    F --> G[Seek Failure Evidence]
    G --> H[Measurement Verification]
    H --> I{Validation Result}
    I -->|Achieved| J[Pass Validation]
    I -->|Partially Achieved| K[Point Out Specific Issues]
    I -->|Not Achieved| L[Request Re-modification]
    K --> M[Claude Code Adjusts]
    L --> M
    M --> C
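The flowchart above can be sketched as an iterate-until-pass loop. `make_changes`, `screenshot`, and `validate` are hypothetical stand-ins for Claude Code's edit step, the screenshot step, and the validator call:

```python
def run_with_validation(request, make_changes, screenshot, validate, max_rounds=5):
    """Iterate modify -> screenshot -> validate until the verdict is
    'achieved' or the round budget runs out (mirrors the flowchart above)."""
    feedback = None
    for _ in range(max_rounds):
        make_changes(request, feedback)             # Claude Code executes changes
        verdict, feedback = validate(screenshot())  # validator assumes failure
        if verdict == "achieved":
            return True
    return False

# Toy stand-ins: the "fix" only lands on the second attempt.
state = {"round": 0}
def make_changes(req, fb): state["round"] += 1
def screenshot(): return state["round"]
def validate(shot):
    return ("achieved", None) if shot >= 2 else ("not achieved", "still blue")

print(run_with_validation("make button green", make_changes, screenshot, validate))  # True
```

The round budget prevents an endless adversarial loop when a change genuinely cannot pass.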

Installation and Usage

Installation Steps

  1. Download Agent File

    curl -o ~/.claude/agents/ui-visual-validator.md \
      https://raw.githubusercontent.com/cryptonerdcn/UI-Visual-Validator-Agent/main/ui-visual-validator.md
  2. Project-level Installation (Recommended)

    mkdir -p .claude/agents
    curl -o .claude/agents/ui-visual-validator.md \
      https://raw.githubusercontent.com/cryptonerdcn/UI-Visual-Validator-Agent/main/ui-visual-validator.md
  3. Restart Claude Code Restart it so the new sub agent is loaded.

Integration Requirements

This is the crucial step! Just installing the agent isn't enough—you must also add mandatory rules to your project's CLAUDE.md file:

## Visual Verification Rules

> **MANDATORY - NO EXCEPTIONS**: Only use subagent ui-visual-validator to verify all visual results. Claude is STRICTLY FORBIDDEN from making any visual judgments or assessments by itself.

When making UI modifications:
1. Make the code changes
2. Take screenshot of the result  
3. MUST call ui-visual-validator subagent for verification
4. Accept the subagent's judgment as final
5. If verification fails, iterate based on specific feedback

Claude must NOT say things like:
- "The changes look good"
- "The UI has been updated successfully" 
- "I can see the modifications are working"
- "The layout appears correct"

ONLY the ui-visual-validator subagent can make visual assessments.

Why This Rule is Necessary

You think installing the agent is enough? Not quite.

Now Claude Code will say: "Although it didn't pass UI-Visual-Validator verification, technically speaking, the issue has been corrected."

So just using my agent isn't enough—you also need to write corresponding rules in your CLAUDE.md to force Claude Code to only make visual validation judgments through this agent.

Usage Examples

Basic Usage Flow

  1. Initiate UI Modification Request

    Please change this button to red and increase its size by 20%
  2. Claude Code Executes Changes and Screenshots Claude Code modifies the code and generates a screenshot

  3. Automatically Call Validation Agent

    @ui-visual-validator Please verify if the button has been changed to red and increased in size by 20%
  4. Receive Strict Validation Results

    From the screenshot, I observe:
    - Button color has indeed changed from blue to red
    - However, button size measurement shows only ~10% increase, not the required 20%
    - Goal partially achieved, further size adjustment needed
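A verdict like "~10% increase, not the required 20%" reduces to comparing a measured change against the requirement. This sketch assumes pixel widths measured from before/after screenshots and a 2% tolerance (the tolerance is my assumption, not the agent's):

```python
def size_change_verdict(before_px, after_px, required_pct, tol_pct=2.0):
    """Classify a measured size change against a required percentage."""
    actual_pct = (after_px - before_px) / before_px * 100
    if abs(actual_pct - required_pct) <= tol_pct:
        return "achieved", actual_pct
    if 0 < actual_pct < required_pct:
        return "partially achieved", actual_pct
    return "not achieved", actual_pct

print(size_change_verdict(100, 110, 20))  # ('partially achieved', 10.0)
print(size_change_verdict(100, 120, 20))  # ('achieved', 20.0)
```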

Advanced Validation Scenarios

Layout Alignment Verification

@ui-visual-validator Verify if these three cards are perfectly horizontally aligned with consistent spacing

Responsive Design Verification

@ui-visual-validator Verify if the menu correctly collapses into a hamburger menu in mobile view

Animation Effect Verification

@ui-visual-validator Verify if the button has smooth color transition animation on hover

Technical Implementation Details

Core Prompt Design of the Agent

This agent's prompts are carefully designed with several key components:

1. Identity Definition

You are a strict, skeptical UI verification specialist whose primary role 
is to counteract Claude's tendency toward visual hallucinations and 
overly lenient validation standards.

2. Working Principles

CORE PRINCIPLE: Default assumption that modifications have NOT succeeded 
until proven otherwise through visual evidence.

3. Output Format

Always start responses with: "From the screenshot, I observe..."
Provide specific measurements when relevant.
Clearly state: achieved/partially achieved/not achieved.
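Because the output format is fixed, a caller can extract the verdict mechanically. This parser is a hypothetical convenience, not part of the agent; note it falls back to "not achieved," consistent with the default-to-failure principle:

```python
import re

def parse_verdict(report):
    """Pull the final verdict out of a validator report (format per above).
    Alternation order matters: try the longer phrases before bare 'achieved'."""
    m = re.search(r"\b(partially achieved|not achieved|achieved)\b", report.lower())
    return m.group(1) if m else "not achieved"  # no verdict -> assume failure

print(parse_verdict("From the screenshot, I observe... Goal partially achieved."))
```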

4. Reverse Validation Logic

Look for evidence of failure rather than success.
Question whether apparent differences are the intended differences.
Measure and verify specific requirements.

Comparison with Other Validation Methods

| Validation Method | Claude Self-Validation | UI Visual Validator | Traditional Testing |
| --- | --- | --- | --- |
| Objectivity | Low (cognitive bias) | High (adversarial design) | Medium (limited scenarios) |
| Accuracy | Low (visual hallucinations) | High (strict standards) | High (but limited coverage) |
| Real-time | High | High | Low (requires test writing) |
| Coverage | Wide but inaccurate | Wide and accurate | Narrow but precise |

Real-World Results

Before (Pre-Usage)

User: Change this button to green
Claude Code: Okay, I've changed the button to green.
Actual Result: Button is still blue
User: 🤬

After (Post-Usage)

User: Change this button to green
Claude Code: I've modified the code, now let UI Visual Validator verify the result...
UI Visual Validator: From the screenshot, I observe the button still displays as blue (#007bff), no change toward green detected. Goal not achieved.
Claude Code: Validation failed, let me check CSS priority issues and modify again...
Result: ✅ Button successfully turned green

Real Case Studies

Case 1: Layout Alignment Issues

User request: Make these three cards perfectly horizontally aligned

Claude's first attempt:
- Modified CSS align-items property
- Confidently said "already aligned"

UI Visual Validator verification:
- "From the screenshot, I observe the three card top edges are at different vertical positions"
- "Left card top at 120px, center card at 125px, right card at 118px"
- "Goal not achieved, cards are not horizontally aligned"

Claude's second attempt:
- Discovered inconsistent margin issues
- Unified margin-top: 0 settings

Verification passed: Three cards perfectly aligned ✅
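The alignment check in this case reduces to comparing top edges within a tolerance. The 120/125/118px values come from the case above; the 1px tolerance is an assumption:

```python
def check_alignment(top_edges, tol_px=1):
    """Horizontally aligned iff all top edges sit within tol_px of each other.
    Returns (aligned?, spread in px) so the report can cite the measurement."""
    spread = max(top_edges) - min(top_edges)
    return spread <= tol_px, spread

print(check_alignment([120, 125, 118]))  # (False, 7)
print(check_alignment([120, 120, 120]))  # (True, 0)
```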

Case 2: Color Gradient Effects

User request: Add blue-to-purple gradient effect to button

Claude self-validation:
- "I've added gradient effect, button now shows blue-purple gradient"

UI Visual Validator strict verification:
- "From the screenshot, I observe button displays as single blue color (#0066cc)"
- "No gradient effect or color transition detected"
- "Goal not achieved"

Problem discovered:
- CSS syntax error: `linear-gradient(blue purple)` missing direction parameter
- Corrected to: `linear-gradient(to right, #0066cc, #6600cc)`

Final result: Perfect blue-purple gradient effect ✅
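Detecting "single blue color, no gradient" amounts to checking whether sampled colors across the button actually vary. A crude pure-Python sketch, assuming a row of RGB samples taken across the button (the thresholds are illustrative):

```python
def has_horizontal_gradient(row, min_distinct=3, tol=8):
    """Crude check: a solid fill yields ~1 distinct color across the row;
    a gradient yields many. `row` is a list of (r, g, b) samples."""
    distinct = []
    for p in row:
        if not any(all(abs(p[i] - q[i]) <= tol for i in range(3)) for q in distinct):
            distinct.append(p)
    return len(distinct) >= min_distinct

solid = [(0, 102, 204)] * 20                              # flat #0066cc
gradient = [(5 * i, 102 - 2 * i, 204) for i in range(20)]  # blue -> purple-ish
print(has_horizontal_gradient(solid))     # False
print(has_horizontal_gradient(gradient))  # True
```

A real validator would sample from the screenshot; the point is that the verdict rests on observed pixels, not on the CSS that was written.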

Extensions and Customization

Custom Validation Rules

You can customize validation rules based on project needs. For example, adding validation for specific design systems:

## Project-Specific Validation Rules

### Color Validation
- All buttons must use theme colors from the design system
- Verify hex values match: Primary (#007bff), Secondary (#6c757d), Success (#28a745)

### Spacing Validation  
- All margins and padding must follow 8px grid system
- Verify spacing is multiples of 8: 8px, 16px, 24px, 32px

### Typography Validation
- Verify font sizes match design tokens: 14px, 16px, 18px, 24px, 32px
- Check line heights are consistent with design system ratios
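Rules like the example set above are mechanically checkable once values are extracted from a screenshot or stylesheet. A minimal sketch, with the theme hex values, 8px grid, and font-size tokens taken from the rules above and the `violations` helper being a hypothetical name:

```python
THEME = {"primary": "#007bff", "secondary": "#6c757d", "success": "#28a745"}
GRID = 8
FONT_SIZES = {14, 16, 18, 24, 32}

def violations(styles):
    """styles: {prop: value} extracted for one element.
    Return human-readable violations of the project rules above."""
    out = []
    if styles.get("color", "").lower() not in set(THEME.values()):
        out.append(f"color {styles.get('color')} is not a theme token")
    for prop in ("margin", "padding"):
        v = styles.get(prop)
        if v is not None and v % GRID != 0:
            out.append(f"{prop} {v}px is off the {GRID}px grid")
    fs = styles.get("font-size")
    if fs is not None and fs not in FONT_SIZES:
        out.append(f"font-size {fs}px is not a design token")
    return out

print(violations({"color": "#007bff", "margin": 24, "font-size": 16}))  # []
print(violations({"color": "#ff0000", "margin": 30, "font-size": 15}))
```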

CI/CD Integration

While this agent is primarily for real-time validation during development, you can also consider integrating it into automated workflows:

# .github/workflows/ui-validation.yml
name: UI Visual Validation
on: [pull_request]
jobs:
  visual-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup Node
        uses: actions/setup-node@v2
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm install
      - name: Build and screenshot
        run: npm run build && npm run screenshot
      - name: Visual validation
        run: claude-code --agent ui-visual-validator verify screenshots/

Frequently Asked Questions

Q: Isn't this agent too strict?

A: Strictness is by design! Claude Code itself is too lenient and needs a strict checks-and-balances mechanism. If you think it's too strict, it means it's working as intended.

Q: Can the validation strictness be adjusted?

A: Yes, by modifying the agent's prompts. For example, allowing "close enough" judgments in certain scenarios, but I recommend maintaining default strict standards.

Q: How does this differ from Percy, Chromatic, and other visual regression testing tools?

A:

  • Different purpose: These tools mainly detect unexpected visual changes, while UI Visual Validator focuses on verifying expected modifications succeeded
  • Usage timing: This agent is for real-time validation during development, not pre-deployment regression testing
  • Intelligence level: The agent has understanding and analysis capabilities, can judge whether changes meet specific requirements

Q: What if screenshot quality is poor?

A: The agent will note screenshot quality issues in validation reports, such as blur or incomplete capture, and request clearer screenshots.

Q: Can it verify animation effects?

A: It can verify static animation states (like hover effects), but for complex animation sequences, consider combining with screen recording for verification.

Future Improvements

1. Multi-State Validation Support

The agent currently validates mainly static states; support is planned for:

  • Hover states
  • Focus states
  • Active states
  • Error states

2. Batch Validation Functionality

Support validating multiple UI elements or page states at once:

@ui-visual-validator Verify if all five buttons conform to design specifications

3. Design System Integration

Integration with common design systems (like Material-UI, Ant Design) to automatically verify component compliance with design specifications.

4. AI-Assisted Pixel-Level Measurement

Use computer vision technology for more precise size, spacing, and alignment measurements.

5. Validation Report Generation

Generate structured validation reports including issue statistics and fix recommendations.

Contributing

Issues and Pull Requests to improve this project are welcome!

How to Contribute

  1. Submit Bug Reports

    • Describe encountered problems
    • Provide screenshots and validation logs
    • Explain expected behavior
  2. Feature Requests

    • Detailed description of needed functionality
    • Explain use cases
    • Provide implementation ideas
  3. Code Contributions

    • Fork project and create feature branch
    • Add test cases
    • Update documentation
    • Submit Pull Request

Development Environment Setup

git clone https://github.com/cryptonerdcn/UI-Visual-Validator-Agent.git
cd UI-Visual-Validator-Agent

# Test in your Claude Code project
cp ui-visual-validator.md /path/to/your/project/.claude/agents/

# Restart Claude Code and test

Conclusion

Writing this agent taught me a profound lesson: sometimes the best solution isn't making one AI smarter, but having two AIs check and balance each other.

Claude Code is strong at code writing but indeed has systematic biases in visual validation. By introducing a specialized, skeptical validation agent, we created an effective checks-and-balances mechanism.

This is not just a technical solution, but a work philosophy: in the AI era, we need to learn to design adversarial systems to ensure quality.

If you're also frustrated by Claude Code's "I think I fixed it," try this agent. Trust me, it will significantly improve your UI development efficiency while maintaining sufficient quality standards.


Project Repository: https://github.com/cryptonerdcn/UI-Visual-Validator-Agent

Questions or suggestions? Feel free to open issues on GitHub or @ me on Twitter @cryptonerdcn
