AI Automation

Personal AI Employee — Autonomous Digital FTE

A production-grade, local-first autonomous AI system built in 4 progressive tiers — from a file-watching foundation to a distributed dual-agent cloud executive. Handles email triage, Odoo invoicing, social media drafting, and CEO briefings with atomic multi-agent coordination and Human-in-the-Loop safety gates.

// Context

Built as a real production tool for autonomous business operations. All 13 unit tests pass. Live E2E verified: task dropped into Obsidian vault → Groq API triage → AI-drafted reply surfaced in Pending_Approval within 60 seconds.

// Problem

Small business owners spend 15-20 hours/week on repetitive administrative tasks — email triage, invoice entry, social media, financial reporting. A human VA costs $4,000–$8,000/month and works 40 hrs/week. Autonomous AI tools either lack safety controls or require expensive cloud infrastructure.

// Solution

A 4-tier autonomous agent system: a lightweight Groq cloud agent handles high-frequency triage 24/7, while a local Claude agent executes sensitive operations (Odoo, Gmail). The two agents coordinate via atomic os.rename() filesystem claims, with no message broker and no database. Every irreversible action requires explicit human approval.
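The claim step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name claim_task and the agent-prefixed file naming are assumptions. It relies on the fact that os.rename() is atomic on POSIX when source and destination are on the same filesystem, so only one of two racing agents can move a given file.

```python
import os
from pathlib import Path
from typing import Optional


def claim_task(task_path: Path, in_progress_dir: Path, agent_id: str) -> Optional[Path]:
    """Try to claim a task by renaming it into In_Progress.

    Returns the new path on success, or None if another agent got there first.
    The `agent_id__filename` naming scheme is a hypothetical convention.
    """
    target = in_progress_dir / f"{agent_id}__{task_path.name}"
    try:
        # The rename either fully succeeds or fails; once the source is gone,
        # every later attempt on the same file raises FileNotFoundError.
        os.rename(task_path, target)
    except FileNotFoundError:
        return None
    return target
```

An agent that receives None simply moves on to the next file in Needs_Action; no lock file or retry loop is needed for the claim itself.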

// Evolution Phases

T1
Bronze — Foundation & Vault (25 pts)

Python 3.12, watchdog, Obsidian Markdown

Established the Obsidian vault with 9 canonical folders (Needs_Action, In_Progress, Pending_Approval, Done, Logs, Plans, Briefings, Approved, Rejected). Built a filesystem watcher that detects new task files, parses YAML front-matter, and routes to skill handlers. All state is plain Markdown — no external database.
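A rough sketch of the parse-and-route step is shown below. It assumes flat key: value front-matter and uses a hand-rolled parser to stay dependency-free; a real vault would likely use a YAML library. The handler names and the HANDLERS table are hypothetical, while the task_type, domain, and priority fields are the ones shown in the screenshots.

```python
from typing import Callable, Dict, Tuple


def parse_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown task file into (metadata, body).

    Only flat `key: value` pairs between two `---` lines are supported.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the file as plain Markdown


# Hypothetical skill handlers keyed by task_type.
def handle_email(meta: Dict[str, str], body: str) -> str:
    return f"email:{meta.get('priority', 'normal')}"


def handle_invoice(meta: Dict[str, str], body: str) -> str:
    return "invoice"


HANDLERS: Dict[str, Callable[[Dict[str, str], str], str]] = {
    "email": handle_email,
    "invoice": handle_invoice,
}


def route(text: str) -> str:
    """Dispatch a task file to its skill handler based on task_type."""
    meta, body = parse_front_matter(text)
    handler = HANDLERS.get(meta.get("task_type", ""))
    return handler(meta, body) if handler else "unrouted"
```

In the full system, a watchdog event handler would call something like route() whenever a new file appears in Needs_Action.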

T2
Silver — Intelligent Assistant (50 pts)

Gmail API, MCP Servers, Claude Sonnet, HITL Workflow

Integrated Gmail via MCP server. Emails are polled every 60s, converted to Markdown action files, triaged by Claude (urgent/routine/ignore), and draft replies written to Pending_Approval. An approval watcher detects human sign-off and triggers execution. No email is sent without explicit human approval.
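One way to picture the approval gate, assuming sign-off means a human moves a draft from Pending_Approval into Approved (the source does not spell out the exact mechanism): the executor only ever reads the Approved folder, so a draft still waiting in Pending_Approval cannot be sent. The function name process_approved and the execute callback are illustrative.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def process_approved(vault: Path, execute: Callable[[str], None]) -> List[str]:
    """Run `execute` on each draft in Approved, then archive it in Done.

    Drafts in Pending_Approval are never touched by this function.
    """
    processed = []
    for draft in sorted((vault / "Approved").glob("*.md")):
        execute(draft.read_text(encoding="utf-8"))
        shutil.move(str(draft), str(vault / "Done" / draft.name))
        processed.append(draft.name)
    return processed
```

Here execute stands in for the real side effect, such as sending the reply through the Gmail MCP server.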

T3
Gold — Autonomous Business Operations (75 pts)

Odoo XML-RPC, Playwright, JSON Audit Log, CEO Briefing

Added Odoo invoice processing (NLP extraction → draft creation → HITL → confirm), social media automation via Playwright (LinkedIn, Instagram, X), weekly CEO briefing generator, financial BI dashboard, structured JSON Lines audit trail with 100MB auto-rotation, and a watchdog process for sentinel auto-restart.
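The audit trail with size-based rotation could look roughly like this. It is a sketch under assumptions: the log_action name, the action field, and the rotated-file naming are hypothetical; the remaining fields follow the ones listed under Highlights (timestamp, actor, target, outcome).

```python
import json
import os
import time
from datetime import datetime, timezone
from pathlib import Path

MAX_BYTES = 100 * 1024 * 1024  # 100MB rotation threshold mentioned in the text


def log_action(log_path: Path, actor: str, action: str, target: str,
               outcome: str, max_bytes: int = MAX_BYTES) -> dict:
    """Append one JSON object per line; rotate the file once it reaches max_bytes."""
    if log_path.exists() and log_path.stat().st_size >= max_bytes:
        # Hypothetical naming: audit.jsonl becomes audit.<nanosecond timestamp>.jsonl
        rotated = log_path.with_name(
            f"{log_path.stem}.{time.time_ns()}{log_path.suffix}"
        )
        os.replace(log_path, rotated)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "target": target,
        "outcome": outcome,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Because each line is a complete JSON object, the log can be tailed, grepped, or loaded line by line without parsing the whole file.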

T4
Platinum — Cloud Executive + Multi-Model Brain (100 pts)

Groq, OpenRouter, PM2, Git Vault Sync, Atomic Claims

Distributed dual-agent architecture: cloud VM runs Groq Llama-3 (always-on via PM2) for triage; local machine runs Claude for sensitive execution. Agents coordinate via atomic os.rename() — no broker needed. Heartbeat TTL reclaims stale tasks. LLM_PROVIDER env var switches between Anthropic/Groq/OpenRouter with zero code changes. 13/13 tests pass.
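The provider switch might be implemented along these lines. Only the LLM_PROVIDER variable comes from the project description; the ProviderConfig type, the API-key variable names, and the default of "anthropic" are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    api_key_env: str  # assumed name of the env var holding the API key


PROVIDERS = {
    "anthropic": ProviderConfig("anthropic", "ANTHROPIC_API_KEY"),
    "groq": ProviderConfig("groq", "GROQ_API_KEY"),
    "openrouter": ProviderConfig("openrouter", "OPENROUTER_API_KEY"),
}


def select_provider(env: Optional[Mapping[str, str]] = None) -> ProviderConfig:
    """Pick the provider named by LLM_PROVIDER (case-insensitive)."""
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if name not in PROVIDERS:
        raise ValueError(f"Unknown LLM_PROVIDER: {name!r}")
    return PROVIDERS[name]
```

A Brain-style class would call select_provider() once at startup and build the matching client, so the rest of the code never branches on the provider.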

// Screenshots

13/13 unit tests passing — Brain routing, atomic claim, domain handoff, vault sync all verified
Email task dropped into vault — YAML front-matter with task_type, domain, and priority parsed instantly
Live E2E log — agent startup → task claimed → Groq API triage → Pending_Approval in under 2 seconds
AI triage result — TRIAGE: routine, professional draft reply, signed 'Processed by AI Employee — Verified by Human'
Published on LinkedIn — real-world post showcasing the system with 64 impressions and community engagement

// Tech Stack

[Python 3.12][Claude Sonnet 4.6][Groq Llama-3][OpenRouter][Anthropic SDK][Playwright][Obsidian Vault][Gmail API][Odoo XML-RPC][PM2][uv][pytest][Git]

// Metrics

  • Operates 168 hours/week (24/7) vs 40 hrs/week for a human VA
  • 95%+ cost reduction: $50–200/month in LLM costs vs $4,000–8,000/month for a human VA
  • End-to-end email triage completed in under 2 seconds via Groq Llama-3
  • 13/13 unit tests pass — Brain routing, atomic claim, domain handoff, vault sync
  • Zero double-processing across 2 agents — guaranteed by os.rename() atomicity
  • Stale tasks auto-reclaimed after 30 minutes via heartbeat TTL thread
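The heartbeat and stale-claim logic can be sketched as below. The 30-second interval and 30-minute TTL come from the text; the function names, the one-heartbeat-file-per-claim layout, and the choice to treat a missing or unreadable heartbeat as stale are assumptions.

```python
import os
import time
from pathlib import Path
from typing import Optional

HEARTBEAT_INTERVAL_S = 30   # a background thread would call write_heartbeat this often
STALE_AFTER_S = 30 * 60     # claims older than this are returned to the queue


def write_heartbeat(path: Path, now: Optional[float] = None) -> None:
    """Record the current Unix timestamp for a claimed task."""
    path.write_text(str(int(time.time() if now is None else now)))


def is_stale(path: Path, now: Optional[float] = None, ttl: int = STALE_AFTER_S) -> bool:
    """True if the last heartbeat is older than ttl (or missing/unreadable)."""
    now = time.time() if now is None else now
    try:
        last = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return True
    return now - last > ttl


def reclaim_if_stale(claimed: Path, heartbeat: Path, needs_action: Path,
                     now: Optional[float] = None) -> bool:
    """Move a stale claimed task back to Needs_Action so another agent can take it."""
    if not is_stale(heartbeat, now):
        return False
    os.rename(claimed, needs_action / claimed.name)
    return True
```

The now parameter exists only to make the timing logic testable without sleeping; in normal operation it is left as None.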

// Highlights

  • Atomic multi-agent coordination via os.rename() — no database, no message broker, pure filesystem
  • Provider-agnostic Brain class: swap Claude ↔ Groq ↔ OpenRouter via single LLM_PROVIDER env var
  • Domain-based security routing: cloud VM holds only Groq key — Odoo and Gmail credentials never leave local machine
  • Human-in-the-Loop as architectural guarantee: no code path bypasses the Pending_Approval gate
  • Heartbeat TTL thread: background thread writes Unix timestamp every 30s; stale claims auto-reclaimed after 30 min
  • Structured JSON Lines audit trail: every external action logged with timestamp, actor, target, and outcome
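
As an illustration of domain-based routing, the check below decides whether an agent may claim a task from its domain front-matter field. The role names "cloud" and "local", the function names, and the fail-closed default are assumptions, not confirmed details of the project.

```python
from typing import Mapping


def owner_for(domain: str) -> str:
    """Map a task's domain to the agent role allowed to handle it.

    Anything not explicitly marked "cloud" falls back to the local agent,
    so an unlabeled or mistyped task never reaches the cloud VM.
    """
    return "cloud" if domain.strip().lower() == "cloud" else "local"


def may_claim(agent_role: str, meta: Mapping[str, str]) -> bool:
    """True if the agent with this role is the owner of the task's domain."""
    return owner_for(meta.get("domain", "")) == agent_role
```

Each agent would run this check before attempting the os.rename() claim, so tasks that need Odoo or Gmail credentials are only ever picked up on the local machine.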

GitHub
Local deployment: GitHub repo and live E2E demo available