vigil

How to build an always-on, autonomous AI operations assistant - and how to make it diligent enough to leave running unattended. A reference architecture plus the doctrine + hooks pattern that keeps an auto-approved agent honest on every turn.

What this is

vigil is a documentation project. It describes, in enough depth to recreate from scratch, the architecture of a persistent "operator" AI: one that lives in a container, talks to its owner over normal chat channels (WhatsApp, Signal, email), and can autonomously write and run code, install packages, schedule tasks, and host the things it builds. There is no daemon to install and nothing to package; the repo is the spec and the example files.

The idea in one paragraph

Drive a coding-agent CLI (the reference uses the Claude CLI) as a subprocess with a resumable session in a container, piping messages in and responses out as JSON. For the always-on brain, the reference respawns the CLI per message with --resume rather than holding one long-lived process; the long-lived approach proved unreliable under 24/7 uptime. Wrap it in a supervisor that handles persistence, health, scheduling, and session rotation, and split all the I/O into a separate transport process so the chat layer can restart without killing the brain. Give the AI a real Linux box with tools rather than a fixed API, and expose extra capabilities as small MCP servers. Then, the load-bearing part: hold the agent to an operating doctrine it reads at session start, kept fresh by a minimal per-turn hook reminder that points back at it, so an auto-approved agent keeps verifying before it acts, refuses workarounds, and never claims "done" without a real run. The architecture makes it reliable; the doctrine makes it diligent.
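
The per-message respawn can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: it assumes the CLI binary is named `claude`, that it accepts `-p`, `--output-format json` and `--resume <session-id>`, and that its JSON output carries `result` and `session_id` fields. The function names (`build_command`, `parse_reply`, `handle_message`) are hypothetical. Verify every flag and field against current upstream docs.

```python
import json
import subprocess
from typing import Optional


def build_command(message: str, session_id: Optional[str] = None) -> list[str]:
    """Build the argv for one per-message invocation of the agent CLI.

    Assumption: the flags below match the installed CLI version.
    """
    argv = ["claude", "-p", message, "--output-format", "json"]
    if session_id:
        # Resume the existing session instead of keeping a process alive.
        argv += ["--resume", session_id]
    return argv


def parse_reply(stdout: str) -> tuple[str, Optional[str]]:
    """Extract (reply text, session id) from the CLI's JSON output.

    Assumption: the payload has "result" and "session_id" keys.
    """
    payload = json.loads(stdout)
    return payload.get("result", ""), payload.get("session_id")


def handle_message(message: str, session_id: Optional[str] = None,
                   run=subprocess.run) -> tuple[str, Optional[str]]:
    """Spawn the CLI once for this message and return its reply.

    `run` is injectable so a supervisor can be tested without the CLI.
    """
    proc = run(build_command(message, session_id),
               capture_output=True, text=True, timeout=600, check=True)
    return parse_reply(proc.stdout)
```

A supervisor would persist the returned session id and pass it back in on the next message; session rotation then amounts to dropping the stored id.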

Autonomous diligence: the load-bearing part

The dangerous moments in an autonomous run are individual actions: a destructive command, an unverified assumption, a "done" that was never tested. Rules buried in a one-time system prompt drift as the conversation grows and never reassert themselves at the moment those actions happen.

The fix is three small pieces, a few kilobytes total:

A doctrine file - the full operating rules, written once: verify before acting, no workarounds, respect specs, don't claim completion you haven't verified, push back when the request is wrong, minimise blast radius, when unsure ask or stop.
A SessionStart hook - fires once per session, tells the agent to read the doctrine in full. Cost: a few hundred characters, once.
A UserPromptSubmit hook - fires on every turn, re-states the non-negotiables and asks the agent to print a pre-response check before any consequential reply. Cost: ~1.5 KB per turn, negligible.
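
The two hooks can be sketched as small payload builders. This is an illustrative sketch, not the example files from the repo: it assumes the hook runner accepts a JSON object on stdout with a `hookSpecificOutput` block containing `hookEventName` and `additionalContext`, and the doctrine path, reminder wording, and function names are all hypothetical.

```python
import json

# Hypothetical location of the doctrine file inside the container.
DOCTRINE_PATH = "/workspace/doctrine.md"

# Short restatement of the rules; the full text lives in the doctrine file.
NON_NEGOTIABLES = (
    "Verify before acting. No workarounds. Respect specs. "
    "Do not claim completion you have not verified. "
    "Push back when the request is wrong. Minimise blast radius. "
    "When unsure, ask or stop."
)

PRE_RESPONSE_CHECK = (
    "Before any consequential reply, print:\n"
    "PRE-RESPONSE CHECK\n"
    "1. Verified, not assumed?\n"
    "2. Completion claims backed by actual runs?\n"
    "3. Relevant specs read and respected?\n"
    "4. Overclaiming / over-engineering / workaround in this reply?\n"
    "5. Pushback warranted?"
)


def _payload(event: str, context: str) -> dict:
    """Wrap context text in the (assumed) hook output shape."""
    return {"hookSpecificOutput": {"hookEventName": event,
                                   "additionalContext": context}}


def session_start_payload(doctrine_path: str = DOCTRINE_PATH) -> dict:
    """Fired once per session: point the agent at the full doctrine."""
    return _payload("SessionStart",
                    f"Read {doctrine_path} in full before doing anything else.")


def user_prompt_payload(doctrine_path: str = DOCTRINE_PATH) -> dict:
    """Fired on every turn: restate the rules and request the check."""
    text = f"{NON_NEGOTIABLES}\nFull rules: {doctrine_path}\n{PRE_RESPONSE_CHECK}"
    return _payload("UserPromptSubmit", text)


def emit(event: str) -> str:
    """Return the JSON string a hook script would print for `event`."""
    builder = session_start_payload if event == "SessionStart" else user_prompt_payload
    return json.dumps(builder())
```

A hook script built on this would do nothing more than `print(emit("UserPromptSubmit"))`, which is why the per-turn cost stays deterministic and involves no model call.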

Re-asserting the discipline right before the agent acts - deterministically, with no model call - is what keeps a capable, auto-approved agent reasoning like a diligent engineer instead of an eager one. Full write-up: AUTONOMOUS-DILIGENCE.md.

The pre-response check

This is the single most useful element. The per-turn reminder asks the agent to print the following before any consequential reply, and the doctrine forbids answering it ritually:

PRE-RESPONSE CHECK
1. Verified, not assumed?
2. Completion claims backed by actual runs?
3. Relevant specs read and respected?
4. Overclaiming / over-engineering / workaround in this reply?
5. Pushback warranted?

It is printed in the output, so a human reviewing the run and the agent's own subsequent reasoning both see it. It raises the cost of the careless shortcut: the agent has to assert, in writing, that it verified before it claims it did.
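
Because the check appears in the output, a reviewer or supervisor could scan replies for it. The sketch below is a hypothetical audit helper, not part of the described setup: it only confirms that a block headed `PRE-RESPONSE CHECK` with items 1 to 5 is present, and cannot tell a considered answer from a ritual one.

```python
import re

CHECK_HEADER = "PRE-RESPONSE CHECK"
EXPECTED_ITEMS = 5  # the check above has five numbered questions

_ITEM = re.compile(r"^\s*(\d+)\.\s+\S")


def find_check_numbers(reply: str) -> list[int]:
    """Return the item numbers following the first check header, in order.

    Returns an empty list if the header is missing.
    """
    lines = reply.splitlines()
    for index, line in enumerate(lines):
        if line.strip() != CHECK_HEADER:
            continue
        numbers: list[int] = []
        for following in lines[index + 1:]:
            match = _ITEM.match(following)
            if match:
                numbers.append(int(match.group(1)))
            elif not following.strip() and not numbers:
                continue  # tolerate blank lines between header and items
            else:
                break  # first non-item line ends the block
        return numbers
    return []


def has_complete_check(reply: str) -> bool:
    """True if the reply contains the header followed by items 1..5."""
    return find_check_numbers(reply) == list(range(1, EXPECTED_ITEMS + 1))
```

A reply that makes a completion claim without a complete check could then be flagged for human review rather than rejected automatically.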

Read the docs

AUTONOMOUS-DILIGENCE.md

The doctrine + hooks pattern that makes autonomy safe to leave running: why a system prompt is not enough, the three pieces, how they wire together, and how to adapt it. Start here.

ARCHITECTURE.md

The full operator architecture: per-invocation agent subprocess with a resumable session, supervisor, transport split, channel multiplexers, voice and media (free local speech-to-text via whisper.cpp, images as multimodal content blocks), an encrypted searchable email database, cost monitoring with a dashboard, MCP tool servers, encrypted persistence, scheduling, resilience, a security section to read twice, and a concrete implementation plan.

Implementation plan

Section 9 of the architecture doc: the deployment checklist. Prerequisites (a host you control, Docker with the host socket mounted, a coding-agent CLI, whatsapp-web.js with a dedicated phone number, an authenticated entry surface, a keyfile) and the bring-up order, email-first, WhatsApp last.

examples/

Drop-in starting points: a generic doctrine.md, the two hook scripts (which emit valid hook JSON), and a settings.json fragment that registers them.
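
For orientation, the settings fragment that registers the two hooks might look like the structure generated below. This is a sketch under assumptions: the nesting (`hooks` → event name → list of groups → `hooks` list of `{"type": "command", "command": ...}`) reflects how the Claude CLI's settings are commonly laid out, and the script paths are hypothetical. Compare it against the real file in examples/ and the current upstream docs before using it.

```python
import json


def hooks_fragment(session_start_cmd: str, user_prompt_cmd: str) -> dict:
    """Build a settings fragment registering one command per hook event.

    Assumption: the settings schema matches the shape produced here.
    """
    def entry(command: str) -> list:
        return [{"hooks": [{"type": "command", "command": command}]}]

    return {
        "hooks": {
            "SessionStart": entry(session_start_cmd),
            "UserPromptSubmit": entry(user_prompt_cmd),
        }
    }


# Hypothetical script locations inside the container.
fragment = hooks_fragment(
    "/workspace/hooks/session_start.sh",
    "/workspace/hooks/user_prompt_submit.sh",
)
settings_json = json.dumps(fragment, indent=2)
```

Printing `settings_json` gives text that can be merged by hand into an existing settings file.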

What is verified vs. described

The diligence layer (doctrine + hooks) is documented from a working, verified setup; the example files are real, redacted copies that emit valid hook JSON. The operator architecture is presented as a reference architecture: a design known to work in this shape, written so you can build your own. Verify every flag, path, and API against current upstream docs before you rely on it.

WhatsApp: use a dedicated number

Linking a headless WhatsApp Web session hands the running agent full read and send access to that account. Provision a separate phone number for the assistant, never your personal WhatsApp. WhatsApp does not sanction unofficial automation and can ban the account; a dedicated number means a ban costs you the assistant, not your personal messaging.

The multiplexer enforces who the agent may talk to in code, both directions: an allowlist drops inbound messages from anyone not explicitly permitted (so strangers cannot even prompt the agent), and an outreach-thread rule stops it cold-messaging arbitrary or model-invented numbers. Linking is done by scanning a QR code from a small hosted page (kept behind your VPN). All detailed in the architecture doc.
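
The two-direction rule can be illustrated with a small policy object. This is a guess at the shape, not the multiplexer's actual code: the class and method names are hypothetical, and the outreach-thread rule is interpreted here as "the agent may only message numbers on the allowlist or numbers for which a thread was explicitly opened outside the model's control". The architecture doc is the authority on the real rule.

```python
def normalize(number: str) -> str:
    """Reduce a phone number to its digits so formats compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


class ContactPolicy:
    """Code-level gate for inbound and outbound chat messages."""

    def __init__(self, allowlist):
        self._allowlist = {normalize(n) for n in allowlist}
        self._outreach_threads = set()

    def accept_inbound(self, sender: str) -> bool:
        """Drop anything from a sender who is not explicitly permitted."""
        return normalize(sender) in self._allowlist

    def open_outreach(self, number: str) -> None:
        """Record an approved outreach thread.

        Assumption: this is called by a trusted path (for example the
        owner's confirmation), never directly by model output.
        """
        self._outreach_threads.add(normalize(number))

    def may_send(self, recipient: str) -> bool:
        """Allow sends only to allowlisted or approved-thread numbers."""
        key = normalize(recipient)
        return key in self._allowlist or key in self._outreach_threads
```

For example, with `ContactPolicy(["+1 555 0100"])`, a message from `+15550100` is accepted, while a send to any other number is refused until `open_outreach` has been called for it.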

Disclaimer / no warranty

This repository is documentation and example configuration, provided as is, without warranty of any kind, express or implied. It describes how to build a system that runs an AI agent with auto-approved tool calls and broad shell, file, and network access. Building or running such a system is inherently risky. You alone are responsible for anything an autonomous agent you build does, including destructive, irreversible, or costly actions. The doctrine and hooks here are a behavioural guardrail, not a security boundary: they reduce careless mistakes but do not sandbox the agent. Real containment is the container/VM boundary and tightly scoped credentials. The author is not liable for any harm, loss, or damages, however caused, arising from following this documentation. Only run an auto-approved configuration inside an isolated environment you fully control and can afford to lose, on accounts and data you own. Full text: repo README, license: MIT.