The Foundation Team — Apiary's data pipeline crew

Status: Candidate — awaiting founder verification. Why this page exists: Before any GPU rental, before any fine-tune, before any "spend money" phase, the Foundation Team builds the data layer that everything later depends on.

TL;DR

The Foundation Team is six specialized agents inside Apiary that map every asset across your projects, build a structured foundation database, and prepare the substrate for any later phase (synthesis, fine-tuning, deployment). The team forms at zero cost, runs in bounded sessions, and produces a foundation report the founder reads before approving the next stage.

The six agents

🐝 CARTOGRAPHER — maps the territory
   Scans:    every project codebase the founder lists
   Catalogs: every .md doc, every .ts file, every README
   Output:   asset_map.json with paths + summaries

🐝 ARCHIVIST — catalogs the substrate
   Scans:    memory files, logs, conversation history
   Categorizes: by type, by date, by topic
   Output:   archive_index.json

🐝 STRATEGIST — synthesizes the plan
   Reads:    Cartographer + Archivist output
   Identifies: what we have, what we need, in what order
   Output:   spend_roadmap.md (the master plan)

🐝 ESTIMATOR — costs every step
   Reads:    Strategist's roadmap
   Computes: per-stage cost, time, complexity
   Output:   budget_breakdown.json

🐝 SCRIBE — surfaces everything to the founder
   Reads:    all other agents' output
   Renders:  human-readable dashboard
   Output:   foundation_report.md (the founder's read)

🐝 QUARTERMASTER — manages session spending
   Tracks:   every dollar spent, against every budget
   Enforces: hard caps, session limits
   Output:   spend_log.csv (audit trail)

The roles are deliberately overlapping with classical guild specializations — cartographer, archivist, strategist, estimator, scribe, quartermaster. The taxonomy makes the work legible to the founder.

The bootstrap sequence

PHASE 4.0 — FOUNDATION TEAM FORMATION ($0)
  Team forms inside Apiary's substrate.
  Zero cost — the team formation itself is metadata, not inference.
  TIME: minutes.

PHASE 4.1 — ASSET INVENTORY ($0 to a few dollars)
  Foundation Team builds the foundation DATABASE.
  Location: foundation/foundation.db (SQLite, portable, no server).

  Tables:
    📁 assets         — every doc, every spec, every log
    🗂  gaps           — what's missing, what to fill
    🎯 milestones     — staged deliverables
    💰 budgets        — per-milestone tracking
    📊 retrospectives — compound learning over sessions

  COST: a few dollars in Claude calls (Strategist + Estimator work).
  TIME: about an hour.

PHASE 4.2 — SESSION-BUDGETED DATA CREATION ($20-100 per session)
  Generator agent fills the gaps identified in 4.1.
  Founder runs sessions they can afford. Each session ENDS with a report.
  Each session ADDS to the database.

PHASE 4.3+ — FINE-TUNING + DEPLOYMENT
  Once the foundation DB has enough quality data, the founder
  approves a fine-tune budget. GPU rented. Model trained.
  Distributed via Hugging Face / Ollama / Apiary itself.

Why this team exists

Most "let's fine-tune an LLM" projects fail at step 1: nobody knows what data they already have. The Foundation Team's job is to surface the assets you already own before you spend a dollar acquiring new ones.

A typical pre-Foundation snapshot:

Three project codebases on disk.
Hundreds of conversation transcripts in ~/.claude/projects/.
Dozens of memory files across multiple categories.
Plus README files, spec docs, design notes, prototype code.

The Foundation Team catalogs all of that into a single SQLite database that subsequent phases query. The fine-tune phase doesn't have to re-discover what's available; it reads foundation.db.

The compounding property

Each Foundation Team session ends with a retrospectives row: what worked, what failed, what to do next. Subsequent sessions read those retrospectives before starting. Over time the team gets better at its own job.

This is "self-improving" at the operational layer — the same property the reconsider pillar gives the governance layer. The substrate accumulates learning, not just data.

What the founder gets out of it

After the Foundation Team's first pass:

1. A foundation report. Human-readable, names what was found, what's missing, what the next stage would cost. 2. A foundation database. Queryable, exportable, the source of truth for every later phase. 3. A spend roadmap. Stage-by-stage cost projection. The founder approves stage-by-stage. 4. An audit trail. Every agent action logged in the audit log.

The founder approves the next phase only when they're ready. No phase runs without explicit approval.

Source quotes

"Foundation FIRST, then spend. Each stage is a self-contained group project session. Each stage's cost is bounded. No stage requires the next to start. You stop anytime, resume anytime."