# AI Verification
AI-generated code needs verification. ACT builds verification into every stage of the workflow so problems are caught early, not after deployment.
## The verification stack

ACT uses three layers of automated verification:
### 1. Static analysis

Every phase of `/act:workflow:work` runs:

```
flutter analyze
```

This catches:
- Type errors and null safety violations
- Unused imports and variables
- Lint rule violations
- Breaking API changes from dependency upgrades
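As a sketch of the first category, here is a hypothetical null-safety violation (not from ACT itself) that `flutter analyze` rejects before it can be committed:

```dart
// Hypothetical snippet: fails `flutter analyze` with a null-safety error.
String greet(String? name) {
  // Error: 'toUpperCase' can't be unconditionally accessed because
  // the receiver 'name' can be null. A null check is required.
  return 'Hello, ${name.toUpperCase()}';
}
```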
When it runs: After every phase completion, before committing.
### 2. Automated tests

ACT supports three levels of testing:
#### Unit tests

Test business logic, state management, and services in isolation.
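A minimal unit-test sketch using `flutter_test` (the `Counter` class is a hypothetical stand-in for a real service):

```dart
import 'package:flutter_test/flutter_test.dart';

// Hypothetical service under test; any plain Dart class works the same way.
class Counter {
  int value = 0;
  void increment() => value++;
}

void main() {
  test('increment raises the value by one', () {
    final counter = Counter();
    counter.increment();
    expect(counter.value, 1);
  });
}
```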
```
flutter test
```

#### Widget tests

Test UI behavior, rendering, and interaction at the widget level.
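A widget-test sketch, assuming a trivial widget tree for illustration:

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

void main() {
  testWidgets('renders the greeting text', (tester) async {
    // Hypothetical widget tree; a real test pumps the app's own widget.
    await tester.pumpWidget(
      const MaterialApp(home: Scaffold(body: Text('Hello'))),
    );
    // Assert on what the user would see, not on implementation details.
    expect(find.text('Hello'), findsOneWidget);
  });
}
```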
#### Robot journey tests

Test complete user journeys across screens with stable selectors and deterministic test seams. See the Robot Testing playbook.
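One way stable selectors can look in practice is `Key`-based finders wrapped in a reusable journey step. This is a sketch only; the widget keys (`email_field`, `login_button`) are hypothetical, and the Robot Testing playbook defines the actual pattern:

```dart
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';

// Sketch of one journey step using stable Key-based selectors,
// so the test survives copy and layout changes.
Future<void> logIn(WidgetTester tester) async {
  await tester.enterText(find.byKey(const Key('email_field')), 'a@b.com');
  await tester.tap(find.byKey(const Key('login_button')));
  await tester.pumpAndSettle(); // wait for navigation to finish
}
```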
When they run: Continuously during implementation. ACT’s TDD discipline encourages tests to be written before implementation, so they run at every RED → GREEN cycle.
### 3. Visual verification

The `/flutter-screenshot` skill captures screenshots from running apps:

```
/flutter-screenshot ./screenshots/home-screen.png
```

Claude reads the screenshot and verifies the UI matches expectations. This catches:
- Layout issues that pass analysis but look wrong
- Color and styling mismatches
- Missing or misplaced UI elements
When it runs: On demand during implementation, typically after UI changes.
## TDD discipline

ACT encourages vertical-slice TDD for tasks marked with `TDD:` in the plan:
- RED — Write one failing test for the next behavior
- GREEN — Write the minimum code to pass that test
- REFACTOR — Clean up while all tests remain green
- Repeat for the next behavior
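One RED → GREEN iteration might look like this, using a hypothetical `slugify` behavior as the example:

```dart
import 'package:flutter_test/flutter_test.dart';

// RED: one failing test for the next behavior.
void main() {
  test('slugify lowercases and hyphenates', () {
    expect(slugify('Hello World'), 'hello-world');
  });
}

// GREEN: the minimum code that makes the test pass — no extra
// normalization until a later test demands it.
String slugify(String input) =>
    input.toLowerCase().replaceAll(' ', '-');
```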
This discipline ensures:
- Tests verify real behavior, not imagined behavior
- Implementation is minimal — no over-engineering
- Every change is backed by a test
The key difference from typical AI testing: ACT writes one test at a time, not a batch of tests followed by a batch of implementation. This produces honest tests that actually catch regressions.
## Verification at every stage

| Stage | Verification |
|---|---|
| Spec | Clarifying questions catch ambiguity early |
| Refine Spec | Adversarial review catches gaps and wrong assumptions |
| Plan | Codebase research ensures plan follows existing patterns |
| Work (each phase) | `flutter analyze` + `flutter test` |
| Work (TDD tasks) | RED → GREEN → REFACTOR cycles |
| Work (UI tasks) | Optional screenshot verification |
| Ship | Full test suite + analysis before PR |
## The mantra

> If AI can verify its own work with a feedback loop, it will 2-3x the quality of the final result.
The cost of running `flutter analyze` and `flutter test` after each phase is trivial. The cost of shipping broken code is not. ACT errs on the side of verification.
## Next steps

- Learn about Context Management: keeping AI focused
- See the TDD playbook for hands-on TDD guidance