Philosophy
The problem
AI agents can write code fast. But without guardrails, they produce inconsistent results — code that compiles but doesn’t match what you needed, misses edge cases, or follows patterns that don’t fit your codebase.
The root cause: vague inputs produce vague outputs. Telling an AI “add authentication” gives it too many degrees of freedom. The result is a coin flip between something useful and something you’ll rewrite.
The solution: Spec-First Agentic Coding
ACT is built on a simple insight: the cost of fixing a problem increases the later you catch it.
A specification is roughly 200 lines. The implementation it produces might be 3,000 lines. Catching a wrong assumption in the spec takes minutes. Catching it in the code takes hours.
The framework:
- Define before you build — Write a spec that maps user flows, edge cases, and requirements before any code is generated
- Review at every stage — Humans approve the spec, the plan, and the implementation. AI is a collaborator, not an oracle
- Verify automatically — Static analysis and automated tests catch regressions before they ship
Core principles
Catch problems early
The spec is your cheapest line of defense. A few clarifying questions during spec creation prevent entire features from being built wrong.
AI as collaborator, not oracle
ACT keeps humans in the loop at every stage:
- Spec — you describe what to build, AI asks clarifying questions, you approve the spec
- Plan — AI creates a phased plan, you review the approach
- Work — AI implements, but commits frequently so you can review progress
- Compound — AI captures what it learned, you decide what’s worth keeping
The AI does the heavy lifting. You make the decisions.
AI agents are fast and thorough, but they have blind spots that shape ACT’s design:
- They don’t know what they don’t know — they fill gaps with plausible assumptions instead of asking. The spec stage surfaces these before they become code.
- They optimize locally — each function looks correct in isolation, but the feature as a whole may not cohere. Specs and plans force a holistic view before implementation begins.
- They have no taste — they can’t tell a clean solution from an over-engineered one. Human review at every stage provides the judgment AI lacks.
- They don’t remember — each session starts fresh. ACT’s file-based artifacts (specs, plans, compound docs) serve as persistent memory the AI can reload.
Structure beats talent
A structured workflow gets more out of your effort than an unstructured one — but you still need to bring the effort. ACT provides the structure:
- Specs force you to think through requirements before implementation
- Plans break work into verifiable phases instead of one massive change
- TDD discipline encourages tests to drive implementation, not the other way around
- Commits at checkpoints keep changes reviewable and reversible
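The TDD point can be sketched in miniature. Everything below — the function, the data shape, the test name — is hypothetical and invented for illustration, not taken from ACT itself:

```python
# Red: the test is written first and pins down the requirement.
def test_total_ignores_sold_positions():
    positions = [
        {"symbol": "AAPL", "value": 100.0, "sold": False},
        {"symbol": "TSLA", "value": 50.0, "sold": True},
    ]
    assert portfolio_total(positions) == 100.0

# Green: the simplest implementation that satisfies the test.
def portfolio_total(positions):
    return sum(p["value"] for p in positions if not p["sold"])

test_total_ignores_sold_positions()
```

Writing the test first is what keeps the implementation honest: the agent writes code to satisfy a stated requirement, rather than inventing behavior and rationalizing a test for it afterwards.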
Predictable, high-quality results
The goal isn’t to make AI coding faster — it’s to make it predictable. When you follow the ACT workflow, you get consistent results because:
- Specs eliminate ambiguity
- Plans follow your codebase’s existing patterns
- Verification runs at every step
- Progress is visible through commits and plan checkpoints
What specs are (and aren’t)
ACT specs are not PRDs full of “As a user, I want…” stories. And they’re pragmatic about when to include implementation detail.
Not user stories
Traditional PRDs and acceptance criteria lean on formats like “As a user, I want X so that Y” and Given/When/Then scenarios. Read a dozen of those and you’re fighting to stay awake — the ceremony adds repetition without clarity, and the abstract framing leaves too many implementation decisions unresolved.
ACT specs describe concrete screen flows: “User taps ‘+ Add Investment’ → navigates to full-screen form. User enters symbol, selects type via wheel picker, fills name → taps ‘Add Investment’. Investment persisted → screen pops back.” That’s a walkthrough someone could follow with the app in hand — no ceremony, just what happens.
Requirements are numbered technical decisions, not stories: “Prices reload button is enabled only when selected date is today, in both new and edit modes.” Each one resolves a specific ambiguity. And boundaries are first-class — every spec includes explicit “what NOT to do” sections, which matter as much as the goals when guiding an AI agent.
Pragmatic about detail
ACT specs use whatever notation communicates most clearly. Screen flows for behavior. Numbered requirements for decisions. Code-level detail for data models, file migrations, and mathematical constraints — because for those tasks, code IS the most precise notation.
A valid criticism of spec-driven workflows is that a sufficiently detailed spec becomes code in disguise. The test isn’t whether a spec contains implementation detail — it’s whether the spec has become an exhaustive prose rewrite of the entire implementation. A database schema table is fine. A conversion formula is fine. A full system’s worth of pseudocode algorithms in markdown is not.
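For instance, a spec might legitimately contain a fragment like this (the table and formula are invented for illustration, not taken from a real ACT spec):

```
## Holdings table
| Column         | Type | Notes                          |
| -------------- | ---- | ------------------------------ |
| symbol         | TEXT | uppercase, unique per account  |
| quantity       | REAL | fractional shares allowed      |
| purchase_price | REAL | in the account's base currency |

## Cost basis
cost_basis = quantity × purchase_price, rounded to 2 decimal places
```

That level of precision earns its place; a page of step-by-step pseudocode for every function that touches these columns would not.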
The draft is the thinking tool
The spec command doesn’t try to exhaustively interrogate you upfront. It asks a few targeted questions, then produces a structured draft. The draft itself is where the real thinking happens — you read through 100-200 lines of requirements, user flows, and edge cases, and catch “wait, that’s not right” or “I hadn’t considered that.” Then refine-spec pushes further with adversarial review, surfacing gaps you missed.
This is a deliberate design choice. Thinking happens across multiple stages — not in one marathon interrogation session, but through iterating on a concrete artifact that gets sharper at each step.
Specs work best across architectural boundaries
A typical app feature touches UI, state management, networking, persistence, and error handling — many small decisions across multiple layers. A spec enumerates these cross-cutting concerns in one place, giving you a checklist before you start coding.
Pure algorithmic or business-dense logic is different. A pricing engine, a state machine, or a concurrency controller — these are best expressed directly in code with tests. The code is the most precise notation for that kind of work.
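To make “code as the notation” concrete, here is a minimal sketch — the tiered shipping rule is hypothetical, chosen only to show that a few lines of code plus tests state a business rule more precisely than prose could:

```python
def shipping_cost(weight_kg: float) -> float:
    """Hypothetical tiered pricing rule. The code pins down the
    boundaries exactly; prose like 'heavier parcels cost more' cannot."""
    if weight_kg <= 1.0:
        return 4.99
    if weight_kg <= 5.0:
        return 4.99 + (weight_kg - 1.0) * 1.50
    return 10.99 + (weight_kg - 5.0) * 0.75

# The tests double as the spec's edge cases.
assert shipping_cost(1.0) == 4.99                  # boundary stays in the base tier
assert shipping_cost(5.0) == 4.99 + 4.0 * 1.50     # boundary stays in the middle tier
assert shipping_cost(6.0) == 10.99 + 0.75
```

Trying to capture those three boundary decisions in a prose spec would take more words and leave more ambiguity than the code and its tests do.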
Watch for spec bloat
A good spec should be much shorter than the code it produces. If your spec is approaching the length of the implementation, you’ve crossed from specification into pseudocode. Stop and write real code instead — you’ll get compiler feedback, type checking, and testability that prose can never provide.
Example: a spec file that was produced by ACT
<goal>Fix macOS database export failing with `SqliteException(14): unable to open database file` when using `VACUUM INTO` to a user-selected path.
The macOS App Sandbox prevents SQLite's internal VFS from writing to paths obtained via `NSSavePanel` (file_picker). SQLite operates below the Dart file API and does not inherit security-scoped bookmark access. Users cannot export their portfolio data on macOS.</goal>
<background>
- App has `com.apple.security.app-sandbox` enabled with `com.apple.security.files.user-selected.read-write`
- `FilePicker.platform.saveFile()` returns a security-scoped path the app can access via Dart's `File` API
- `VACUUM INTO ?` uses SQLite's C-level VFS to open the output file, bypassing macOS sandbox bookmark resolution
- iOS already works around this by exporting to `Directory.systemTemp` first, then sharing via `share_plus`
- The app always has write access to its own temp directory
Key files:
- `@folio_tracker/lib/data/database/io/sqlite_file_io_internal.dart` — `exportDatabaseToPath()` (the broken function)
- `@folio_tracker/lib/data/database/io/database_export_service_native.dart` — platform-specific export orchestration
- `@folio_tracker/lib/data/database/app_database.dart` — `exportInto()` public API
- `@folio_tracker/test/data/database/database_export_test.dart` — existing export/import tests
- `@folio_tracker/macos/Runner/Release.entitlements` — sandbox entitlements
</background>
<requirements>
**Functional:**
1. macOS export via Settings > Export Data must produce a valid SQLite file at the user-selected path
2. `exportDatabaseToPath()` must VACUUM INTO a temp file, then copy to the target path using Dart's `File.copy()`
3. The temp file must be cleaned up after copy (success or failure)
4. iOS export flow must remain unchanged (already uses temp + share_plus)
5. Existing tests must continue to pass (tests use temp dirs, so VACUUM INTO works directly there)
**Error Handling:**
6. If the copy from temp to target fails, propagate the error (don't silently swallow)
7. If VACUUM INTO the temp file fails, propagate the original SQLite error
8. Temp file cleanup must happen in a finally block to avoid leaking files
**Edge Cases:**
9. Target file already exists — current code deletes it before export; preserve this behavior
10. Target parent directory doesn't exist — current code creates it; preserve this behavior
</requirements>
<implementation>Modify `exportDatabaseToPath()` in `folio_tracker/lib/data/database/io/sqlite_file_io_internal.dart`:
- VACUUM INTO a temp file (`Directory.systemTemp` + unique name)
- Copy temp file to `outputPath` using `File.copy()`
- Delete temp file in a `finally` block
No changes needed to:
- `app_database.dart` (public API unchanged)
- `database_export_service_native.dart` (calls `exportDatabaseToPath`, which is being fixed)
- Entitlements (existing `files.user-selected.read-write` is sufficient for Dart `File.copy`)
- Tests (temp dir paths work with both approaches)
Avoid:
- Disabling the App Sandbox — breaks macOS security model
- Adding network or file-system-access entitlements — unnecessary
- Changing the public `exportInto` API — no need, fix is internal
</implementation>
<validation>
1. Run existing tests: `cd folio_tracker && flutter test test/data/database/database_export_test.dart` — all must pass
2. Manual macOS test: run the app, go to Settings > Export Data, pick Downloads folder, verify file is created and openable
3. Verify temp file cleanup: after export, no orphaned temp files in `Directory.systemTemp`
**Unit test coverage:**
- Existing tests cover VACUUM INTO + round-trip data integrity (already pass since they use temp dirs)
- No new tests needed — the fix is transparent to callers
</validation>
<done_when>
- macOS export to user-selected paths (e.g., Downloads) succeeds without `SqliteException(14)`
- All existing export/import tests pass
- Temp files are cleaned up after export
- iOS export flow is unaffected
</done_when>

How it works in practice
/act:workflow:spec "add user authentication"
↓ Asks questions, creates detailed specification with user flows and edge cases

/act:workflow:refine-spec ai_specs/auth-spec.md
↓ Roasts the spec for gaps, wrong assumptions, and codebase misalignment

/act:workflow:plan ai_specs/auth-spec.md
↓ Creates phased implementation plan

/act:workflow:work ai_specs/auth-plan.md
↓ Executes plan phase by phase with commits and PR

/act:workflow:compound
↓ Captures reusable session insights

Each stage produces artifacts that inform the next, reducing ambiguity and rework.
Next steps
- See the Workflow Overview for details on each stage
- Try the Quickstart to experience the workflow firsthand