Architecting a Local Agent Hub
The goal: run my workflow from the laptop—capture a 10–15 minute voice unload, transform it with GPT-5, and have agents file themes to knowledge, push tasks to boards, and draft client updates—without me doing the shuttling. I want a local-first hub that orchestrates this flow and enforces confidence thresholds (auto/approve/escalate) with a clear audit trail.
This is the architecture I’d build.
Design Goals (What “good” looks like)
- Local-first. Works offline; sync is optional. Single-file DB, deterministic jobs.
- Protocol-friendly. Speaks MCP for portable tools/connectors.
- Composable agents. Each agent does one thing well; orchestration does the glue.
- Observable. Human-readable audit log; replay any job; dry-run mode.
- Secure by default. Secrets isolated; least-privilege adapters.
- Confidence thresholds. Automatic when safe, review when ambiguous, escalate when risky.
- Zero yak-shaving. Start simple: TUI/CLI first, thin web UI later.
Mental Model
Think of the hub as 5 layers:
- Ingest — raw stuff enters (transcripts, notes, diffs).
- Normalize — convert to a canonical event (JSON) + attachments.
- Orchestrate — rules decide which agents run and how (with thresholds).
- Act — adapters perform side-effects (write markdown, open issues, draft emails).
- Observe — audit, notifications, replay.
High-Level Architecture
- Event Bus (local): append-only `events.sqlite` + `events/*.jsonl` for durability.
- State Store: SQLite (WAL mode) for jobs, rules, runs, artifacts.
- Rule Engine: YAML/JSON rules → compiled to predicates/functions.
- MCP Tooling: agent providers exposed as MCP tools (filesystem, GitHub, Notion, etc.).
- Queue/Workers: simple priority queues (`pending`, `active`, `deadletter`).
- Threshold Gate: auto / needs-review / escalate routing.
- UI: TUI (terminal) for day-1; minimal web UI for review/approve.
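As a sketch of how the Event Bus's dual write might work — JSONL for durable append-only history, SQLite for queries — assuming hypothetical paths and a minimal schema (the real hub would read these from config):

```python
import json
import sqlite3
import time
import uuid
from pathlib import Path

# Hypothetical locations; illustrative only.
DB_PATH = Path("hub.sqlite")
EVENTS_DIR = Path("events")

def append_event(event_type: str, source: str, payload: dict) -> str:
    """Append an event to the JSONL log, then mirror it into SQLite."""
    event = {
        "id": f"evt_{uuid.uuid4().hex[:12]}",
        "type": event_type,
        "source": source,
        "payload": payload,
        "received_at": time.time(),
    }
    # Durable append-only JSONL log, one file per day.
    EVENTS_DIR.mkdir(exist_ok=True)
    day = time.strftime("%Y-%m-%d")
    with open(EVENTS_DIR / f"{day}.jsonl", "a") as f:
        f.write(json.dumps(event) + "\n")
    # Mirror into SQLite (WAL mode) for querying and joins with jobs/artifacts.
    con = sqlite3.connect(DB_PATH)
    con.execute("PRAGMA journal_mode=WAL")
    con.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(id TEXT PRIMARY KEY, type TEXT, source TEXT, payload TEXT, received_at REAL)"
    )
    con.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
        (event["id"], event["type"], event["source"],
         json.dumps(event["payload"]), event["received_at"]),
    )
    con.commit()
    con.close()
    return event["id"]
```

The JSONL file is the source of truth; the SQLite row is a queryable index that can always be rebuilt from it.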
Core Entities (Schema Sketch)
- Event: `{ id, type, source, payload, attachments[], received_at }`
- Job: `{ id, event_id, rule_id, agent, status, confidence, created_at }`
- Artifact: `{ id, job_id, kind, path/hash, preview, created_at }`
- Decision: `{ id, job_id, action: "auto"|"approve"|"reject"|"escalate", by, reason }`
- Rule: `{ id, name, match, plan, thresholds, enabled }`
- Secret: stored via OS keychain; DB holds only references.
SQLite is enough. Keep everything local and commit selected artifacts to a git repo if you want history.
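The entity sketch above translates naturally into SQLite DDL. One possible version — table and column names are illustrative, not the hub's actual schema:

```python
import sqlite3

# Illustrative translation of the schema sketch into SQLite DDL.
SCHEMA = """
CREATE TABLE events (
  id TEXT PRIMARY KEY, type TEXT, source TEXT,
  payload TEXT, attachments TEXT, received_at TEXT
);
CREATE TABLE jobs (
  id TEXT PRIMARY KEY, event_id TEXT REFERENCES events(id),
  rule_id TEXT, agent TEXT, status TEXT, confidence REAL, created_at TEXT
);
CREATE TABLE artifacts (
  id TEXT PRIMARY KEY, job_id TEXT REFERENCES jobs(id),
  kind TEXT, path TEXT, hash TEXT, preview TEXT, created_at TEXT
);
CREATE TABLE decisions (
  id TEXT PRIMARY KEY, job_id TEXT REFERENCES jobs(id),
  action TEXT CHECK (action IN ('auto','approve','reject','escalate')),
  by_whom TEXT, reason TEXT
);
CREATE TABLE rules (
  id TEXT PRIMARY KEY, name TEXT, match TEXT, plan TEXT,
  thresholds TEXT, enabled INTEGER DEFAULT 1
);
"""

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    con = sqlite3.connect(path)
    con.execute("PRAGMA journal_mode=WAL")  # concurrent readers, single writer
    con.executescript(SCHEMA)
    return con
```

The `CHECK` constraint on `decisions.action` means the DB itself rejects any decision outside the four allowed actions.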
Rules: Human-Readable, Diffable
Rules live in `hub/rules/*.yaml`. Example for the voice unload flow:
```yaml
id: rule.voice-unload.v1
name: Voice Unload → Themes/Tasks
match:
  event.type: "transcript.created"
  payload.duration_minutes: ">=8"   # simple numeric predicate
plan:
  - step: "Summarize to canonical structure"
    agent: "gpt5.summarizer"
    input: "{{ event.attachments[0].text }}"
    output: "artifact://unload/{{ event.id }}.summary.json"
  - step: "Persist themes as markdown"
    agent: "fs.markdown"
    input: "{{ artifact('summary').themes }}"
    output: "obsidian://daily/{{ event.date }}-unload.md"
  - step: "Create tasks by domain"
    agent: "tasks.router"
    input: "{{ artifact('summary').tasks }}"
    params:
      mapping:
        dev: "github://org/repo"
        client: "notion://client-board"
        ops: "todo://inbox"
thresholds:
  auto:
    - agent: "fs.markdown"
    - agent: "gpt5.summarizer"
  review:
    - agent: "tasks.router"
  escalate:
    when:
      - "artifact('summary').contains_sensitive == true"
notifications:
  on_review: "notify://me?channel=desktop"
  on_error: "notify://me?channel=desktop"
enabled: true
```
Interpretation
- Always summarize + persist themes automatically.
- Creating tasks goes to review; I approve with one tap.
- If PII/sensitive content is detected, escalate before any action.
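The match clause's string predicates (like `">=8"`) can be evaluated with a few lines of Python. A minimal sketch of such an evaluator — not the hub's real rule engine, and the dotted-path lookup convention is an assumption:

```python
import operator

# Comparison symbols a match predicate may start with; order matters
# (two-character operators must be checked before ">" and "<").
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       ">": operator.gt, "<": operator.lt}

def lookup(envelope: dict, dotted: str):
    """Resolve a dotted path like 'payload.duration_minutes' in a nested dict."""
    node = envelope
    for part in dotted.split("."):
        node = node[part]
    return node

def matches(envelope: dict, match: dict) -> bool:
    """True if every key in the match clause is satisfied by the envelope."""
    for key, expected in match.items():
        actual = lookup(envelope, key)
        for sym in OPS:
            if isinstance(expected, str) and expected.startswith(sym):
                # Numeric predicate such as ">=8".
                if not OPS[sym](float(actual), float(expected[len(sym):])):
                    return False
                break
        else:
            # No operator prefix → exact equality (e.g. event.type).
            if actual != expected:
                return False
    return True
```

A 12-minute transcript matches `">=8"`, a 5-minute one does not; exact-match keys like `event.type` fall through to plain equality.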
Agents & Adapters (Keep Them Small)
- gpt5.summarizer (MCP tool) → input: raw text; output: normalized JSON
  `{ themes[], domains[], tasks{type,domain,title,notes,confidence}, flags{} }`
- fs.markdown → writes Obsidian-friendly MD with frontmatter + backlinks.
- tasks.router → splits tasks by `domain` and `type` into downstream adapters:
  - `github.issues` (scoped repo/project)
  - `notion.tasks` (database + status)
  - `todo.inbox` (local system / Things / Apple Reminders via bridge)
- notify → desktop notifications for review/approve; can also post to a local inbox.
Each adapter runs in a sandbox (separate process) with scoped secrets.
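A sketch of what tasks.router might look like internally, reusing the mapping from the rule above (the adapter-target URIs are the same illustrative placeholders, not a real API):

```python
# Mirror of the rule's params.mapping; illustrative targets only.
MAPPING = {
    "dev": "github://org/repo",
    "client": "notion://client-board",
    "ops": "todo://inbox",
}

def route_tasks(tasks: list[dict], mapping: dict = MAPPING) -> dict:
    """Group normalized tasks by adapter target, ready for dispatch."""
    routed: dict[str, list] = {}
    for task in tasks:
        # Unknown domains fall back to the local inbox rather than failing.
        target = mapping.get(task["domain"], "todo://inbox")
        routed.setdefault(target, []).append(task)
    return routed
```

Each target's batch then becomes one job for the matching adapter, which is what the review queue shows as "I will create N issues in org/repo".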
Confidence Thresholds (How Decisions Happen)
Every agent returns a `confidence` score and optional `risk_flags`. The Threshold Gate applies the rule's policy:
- If the `auto` list contains the agent → run without human input.
- If in `review` → create a Decision record and pause the job until approval.
- If an `escalate` predicate matches → require explicit confirmation with context.
One-tap approvals: The UI shows the diff (“I will create 5 issues in org/repo”) and a single approve button. No modal marathons.
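A minimal Threshold Gate could look like the following. The 0.75 confidence floor is an assumed default for illustration, not part of the rule format above:

```python
CONFIDENCE_FLOOR = 0.75  # assumed default; a real hub would make this per-rule

def gate(agent: str, confidence: float, risk_flags: dict, thresholds: dict) -> str:
    """Route a job to 'auto', 'review', or 'escalate' per the rule's policy."""
    # Escalation predicates win over everything else.
    if risk_flags.get("contains_sensitive"):
        return "escalate"
    auto_agents = {a["agent"] for a in thresholds.get("auto", [])}
    review_agents = {a["agent"] for a in thresholds.get("review", [])}
    if agent in auto_agents and confidence >= CONFIDENCE_FLOOR:
        return "auto"
    if agent in review_agents or confidence < CONFIDENCE_FLOOR:
        return "review"
    # Agents named in neither list default to human review, not auto.
    return "review"
```

The asymmetry is deliberate: an unlisted agent or a low-confidence result always lands in review; only an explicitly whitelisted agent with high confidence runs unattended.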
File/Repo Layout (Local-first)
```
~/agent-hub/
  hub.sqlite          # state store
  events/             # newline-delimited JSON events
  artifacts/          # generated JSON/MD for inspection
  rules/
    voice-unload.yaml
    repo-sync.yaml
  agents/
    gpt5/
    fs/
    tasks/
  secrets/            # references only; actual secrets in OS keychain
  ui/                 # TUI & optional web UI
  logs/               # structured logs
```
Commit `rules/` and selected `artifacts/` to git for change history.
Observability & Replay
- Audit log: append-only entries: `time, event_id, job_id, agent, action, outcome`.
- Replay: pick an `event_id` → re-run it through current rules (great for testing rule changes).
- Dry-run: run agents in “explain mode” → show what would happen; no side effects.
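The audit writer itself is almost trivial: one JSON line per entry, never rewritten. A sketch, with a hypothetical file location:

```python
import json
import time

AUDIT_PATH = "audit.jsonl"  # hypothetical; would live under ~/agent-hub/logs/

def audit(event_id: str, job_id: str, agent: str,
          action: str, outcome: str, path: str = AUDIT_PATH) -> dict:
    """Append one audit entry as a JSON line; the file is append-only."""
    entry = {"time": time.time(), "event_id": event_id, "job_id": job_id,
             "agent": agent, "action": action, "outcome": outcome}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

Replay then reduces to reading the events log, re-matching against current rules, and diffing the new audit lines against the old ones.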
Security Model (Local-First, Least Privilege)
- Secrets stored in OS keychain (Keychain/Pass/Windows Credential Manager). DB holds aliases, not values.
- Each adapter gets scoped tokens only (e.g., a repo-scoped GitHub token).
- Content scanners (PII/keys) run before any “Act” step; if hits → escalate.
- Network egress can be deny-listed by adapter to avoid surprise calls.
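A naive pre-Act scanner might look like this. The patterns are illustrative placeholders; real detection needs a proper secrets/PII scanning library:

```python
import re

# Illustrative patterns only; a real scanner covers far more cases.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def scan(text: str) -> list[str]:
    """Return names of patterns that matched; any hit means escalate."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]
```

The gate treats a non-empty result as a tripped `contains_sensitive` flag, so the job halts before any adapter runs.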
The Human Touchpoints
- Inbox (Review): a queue of pending items (e.g., “Create 7 GitHub issues”). Approve/Reject with a note.
- Timeline: event → jobs → artifacts → outcomes (click through details).
- Health: quick indicators: “MCP connected”, “GitHub ok”, “Notion token expired”.
- Search: by theme/domain across `artifacts/` + Obsidian.
TUI first (fast, reliable); small web UI later for approvals from phone.
Example Artifact (Normalized Summary)
```json
{
  "event_id": "evt_2025-09-19_0932",
  "themes": ["Payments", "StrongStart Courses", "Local Ops"],
  "domains": ["client", "product", "ops"],
  "tasks": [
    {"type": "followup", "domain": "client", "title": "Email Nick about automation runway", "confidence": 0.86},
    {"type": "research", "domain": "product", "title": "Evaluate rituals module outline", "confidence": 0.78},
    {"type": "ops", "domain": "ops", "title": "Price out pressure-washing gear", "confidence": 0.73}
  ],
  "flags": {"contains_sensitive": false}
}
```
This is the single truth everything else reads.
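Because everything reads this artifact, it is worth shape-checking before anything downstream runs. A lightweight sketch — a real setup might use JSON Schema instead:

```python
# Keys every task entry must carry (illustrative minimum).
REQUIRED_TASK_KEYS = {"type", "domain", "title", "confidence"}

def validate_summary(artifact: dict) -> list[str]:
    """Return a list of problems; an empty list means the artifact is usable."""
    problems = []
    for key in ("event_id", "themes", "domains", "tasks", "flags"):
        if key not in artifact:
            problems.append(f"missing field: {key}")
    for i, task in enumerate(artifact.get("tasks", [])):
        missing = REQUIRED_TASK_KEYS - task.keys()
        if missing:
            problems.append(f"task[{i}] missing {sorted(missing)}")
        elif not 0.0 <= task["confidence"] <= 1.0:
            problems.append(f"task[{i}] confidence out of range")
    return problems
```

A failed check is itself an escalation: the summarizer produced something the rest of the pipeline cannot trust.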
Event Flow (Voice Unload Use Case)
- Ingest: `transcript.created` event with text attachment.
- Rule match: `rule.voice-unload.v1` fires.
- Summarize (auto): GPT-5 → normalized artifact.
- Persist (auto): fs.markdown → `2025-09-19-unload.md` in Obsidian vault.
- Route tasks (review): tasks.router proposes 1 GitHub issue, 1 Notion card, 1 local task.
- Approve: one tap; jobs dispatched.
- Notify: desktop ping with a compact summary + links.
If flags trip (sensitive), the plan halts at escalate with context.
Local DX (Developer Experience)
- Bootstrap in minutes: `hub init`, `hub run`, `hub ui`.
- Hot-reload rules: the hub watches `rules/*.yaml`.
- Fixtures: `hub ingest transcript ./samples/day-2025-09-19.txt --as john`.
- Explain: `hub plan evt_... --dry-run` (show agents, targets, thresholds).
- Replay: `hub replay evt_... --rule rule.voice-unload.v2`.
Testing & Reliability
- Golden transcripts: known inputs + expected artifacts; run in CI.
- Contract tests for adapters: mock GitHub/Notion; assert payload shapes.
- Idempotency: every job has an `idempotency_key`; reruns don’t duplicate issues.
- Deadletter queue: broken jobs go here with retry metadata and error snapshots.
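One way to derive an `idempotency_key` is to hash exactly the fields that define a job, so a rerun of the same event through the same rule produces the same key. A possible sketch:

```python
import hashlib
import json

def idempotency_key(rule_id: str, event_id: str, agent: str, payload: dict) -> str:
    """Stable hash of the job-defining fields; identical inputs → identical key,
    so an adapter can skip side effects it has already performed."""
    material = json.dumps(
        {"rule": rule_id, "event": event_id, "agent": agent, "payload": payload},
        sort_keys=True,  # canonical ordering so dict key order never matters
    )
    return hashlib.sha256(material.encode()).hexdigest()[:16]
```

Adapters record the key alongside each created issue/card; a replayed job whose key already exists is a no-op rather than a duplicate.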
Closing Notes
Designing a local-first agent hub isn’t just an exercise in architecture—it’s a way of reclaiming control over how AI fits into my daily work. Instead of scattering prompts and outputs across apps, the hub becomes a single home where rules, agents, and context live together.
The experience has shown me that the value of AI isn’t in the novelty of generation—it’s in the systems we design to receive it. By owning the pipeline end-to-end, from voice unload to task routing, I can finally step back from the manual “doing” and focus on the strategy, review, and momentum that only I can provide.