
AI Verification

AI-generated code needs verification. ACT builds verification into every stage of the workflow so problems are caught early, not after deployment.

ACT uses three layers of automated verification: static analysis, automated tests, and visual checks.

Static Analysis

Every phase of /act:workflow:work runs:

```sh
flutter analyze
```

This catches:

  • Type errors and null safety violations
  • Unused imports and variables
  • Lint rule violations
  • Breaking API changes from dependency upgrades

When it runs: After every phase completion, before committing.
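The rules that flutter analyze enforces live in the project's analysis_options.yaml. A minimal sketch of such a file follows; the specific lints shown are illustrative choices, not ACT requirements:

```yaml
# analysis_options.yaml — example configuration, not an ACT requirement
include: package:flutter_lints/flutter.yaml

analyzer:
  language:
    strict-casts: true      # surface implicit downcasts as errors

linter:
  rules:
    - unused_import         # flag imports that are never used
    - avoid_print           # keep debug prints out of committed code
```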

Automated Testing

ACT supports three levels of testing:

Unit tests: Test business logic, state management, and services in isolation.

```sh
flutter test
```
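A unit test at this level might look like the following sketch. The CartTotal service and its behavior are hypothetical, invented for illustration:

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical service under test — not part of ACT itself.
class CartTotal {
  double total(List<double> prices) =>
      prices.fold(0.0, (sum, p) => sum + p);
}

void main() {
  test('sums item prices', () {
    final cart = CartTotal();
    expect(cart.total([1.5, 2.5]), 4.0);
  });

  test('empty cart totals zero', () {
    expect(CartTotal().total([]), 0.0);
  });
}
```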

Widget tests: Test UI behavior, rendering, and interaction at the widget level.
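A widget-level test can pump a single widget and assert on what renders. In this sketch, the Greeting widget and its key are hypothetical:

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Hypothetical widget under test.
class Greeting extends StatelessWidget {
  const Greeting({super.key, required this.name});
  final String name;

  @override
  Widget build(BuildContext context) =>
      Text('Hello, $name', key: const Key('greeting-text'));
}

void main() {
  testWidgets('renders the greeting', (tester) async {
    await tester.pumpWidget(
      const MaterialApp(home: Greeting(name: 'Ada')),
    );

    // A stable Key-based selector is resilient to copy changes.
    expect(find.byKey(const Key('greeting-text')), findsOneWidget);
    expect(find.text('Hello, Ada'), findsOneWidget);
  });
}
```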

End-to-end tests: Test complete user journeys across screens with stable selectors and deterministic test seams. See the Robot Testing playbook.
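An end-to-end journey can be sketched with Flutter's integration_test package. The app entry point, keys, and login flow below are all hypothetical:

```dart
import 'package:flutter_test/flutter_test.dart';
import 'package:integration_test/integration_test.dart';

import 'package:my_app/main.dart' as app; // hypothetical app entry point

void main() {
  IntegrationTestWidgetsFlutterBinding.ensureInitialized();

  testWidgets('user can log in and reach the home screen', (tester) async {
    app.main();
    await tester.pumpAndSettle();

    // Stable Key selectors keep the journey resilient to copy changes.
    await tester.enterText(find.byKey(const Key('email-field')), 'a@b.c');
    await tester.tap(find.byKey(const Key('login-button')));
    await tester.pumpAndSettle();

    expect(find.byKey(const Key('home-screen')), findsOneWidget);
  });
}
```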

When they run: Continuously during implementation. ACT’s TDD discipline encourages tests to be written before implementation, so they run at every RED → GREEN cycle.

Visual Verification

The /flutter-screenshot skill captures screenshots from running apps:

```sh
/flutter-screenshot ./screenshots/home-screen.png
```

Claude reads the screenshot and verifies the UI matches expectations. This catches:

  • Layout issues that pass analysis but look wrong
  • Color and styling mismatches
  • Missing or misplaced UI elements

When it runs: On demand during implementation, typically after UI changes.

Test-Driven Development

ACT encourages vertical-slice TDD for tasks marked with TDD: in the plan:

  1. RED — Write one failing test for the next behavior
  2. GREEN — Write the minimum code to pass that test
  3. REFACTOR — Clean up while all tests remain green
  4. Repeat for the next behavior
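One RED → GREEN iteration might look like this sketch for a hypothetical formatPrice function; the name and behavior are invented for illustration:

```dart
import 'package:flutter_test/flutter_test.dart';

// GREEN: the minimum implementation that passes the test below.
String formatPrice(int cents) =>
    '\$${(cents / 100).toStringAsFixed(2)}';

void main() {
  // RED: this test is written first and fails until
  // formatPrice exists and behaves as expected.
  test('formats cents as dollars', () {
    expect(formatPrice(1999), '\$19.99');
  });
  // Next cycle: add one more failing test (e.g. negative
  // amounts), then the minimum code to make it pass.
}
```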

This discipline ensures:

  • Tests verify real behavior, not imagined behavior
  • Implementation is minimal — no over-engineering
  • Every change is backed by a test

The key difference from typical AI testing: ACT writes one test at a time, not a batch of tests followed by a batch of implementation. This produces honest tests that actually catch regressions.

| Stage | Verification |
| --- | --- |
| Spec | Clarifying questions catch ambiguity early |
| Refine Spec | Adversarial review catches gaps and wrong assumptions |
| Plan | Codebase research ensures plan follows existing patterns |
| Work (each phase) | flutter analyze + flutter test |
| Work (TDD tasks) | RED → GREEN → REFACTOR cycles |
| Work (UI tasks) | Optional screenshot verification |
| Ship | Full test suite + analysis before PR |

When AI can verify its own work through a feedback loop, the quality of the final result improves by 2-3x.

The cost of running flutter analyze and flutter test after each phase is trivial. The cost of shipping broken code is not. ACT errs on the side of verification.