
Recent shift: the previous subprocess-based SDK and ACP backends (claudesdk, codexsdk, opencodesdk, ACP bridges) have been removed. All AI dispatch now goes through an in-process Go runtime called olium (pkg/olium/). That leaves one unified provider interface, one conversation state, and one place to reason about timeouts and retries.
1. Subcommand surface
vigolium agent is a parent command with informational flags only (--list-templates, --list-agents). Real work happens in subcommands.
| Subcommand | Purpose |
|---|---|
| query | Single-shot prompt (template or inline). Code review, secret hunt, endpoint discovery. |
| autopilot | Agentic scan: autonomous operator with full tool access. |
| swarm | Agentic scan: 10-phase guided pipeline (plan → extension → scan → triage). |
| audit | Unified source-audit dispatcher (drives the embedded vigolium-audit harness, piolium, or both). |
| piolium | Direct piolium harness driver (Pi-native install required). |
| olium | Interactive olium TUI or one-shot non-interactive prompt. |
| session | List or inspect past agent runs (sessions list / detail view). |
query is the only mode that does not orchestrate a scan; it is a one-shot prompt with optional source-code context. autopilot and swarm are the two agentic scan modes.
2. Architecture layers
2.1 Engine (pkg/agent/engine.go)
The engine is the seam between orchestrators and the olium runtime. Its job is to:
- Preflight: validate provider/model selection.
- Build the prompt: load template, parse frontmatter, render with TemplateData.
- Enrich context: pull DB context (previous findings, discovered endpoints, high-risk endpoints, module list, scan stats) through a thread-safe LRU cache that lives for one swarm/autopilot run.
- Dispatch: call the olium engine.
- Retry: exponential backoff on transient errors (default 2 retries, 2-30s backoff with jitter).
- Parse: schema-aware JSON extraction tolerating fences, prose, and type coercion (string ↔ int, object ↔ string body).
- Ingest: write parsed findings/HTTP records to the DB repository.
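
The Retry step above can be sketched as follows. This is a minimal illustration, not the engine's actual code: `backoffDelay`, `withRetry`, and the `errTransient` sentinel are hypothetical names, and the jitter shape (up to 25% subtracted, never below the base delay) is an assumption. Only the defaults (2 retries, 2-30s window) come from the list above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// errTransient is a hypothetical sentinel marking errors worth retrying.
var errTransient = errors.New("transient")

// backoffDelay returns the wait before retry number attempt (0-based):
// base doubled per attempt, capped at max, minus up to 25% random jitter,
// and never below base.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	d -= time.Duration(rand.Int63n(int64(d)/4 + 1))
	if d < base {
		d = base
	}
	return d
}

// withRetry calls fn once, then up to retries more times while it keeps
// returning a transient error. Non-transient errors are returned at once.
func withRetry(ctx context.Context, retries int, base, max time.Duration,
	fn func(context.Context) error) error {
	for attempt := 0; ; attempt++ {
		err := fn(ctx)
		if err == nil || !errors.Is(err, errTransient) {
			return err
		}
		if attempt >= retries {
			return fmt.Errorf("giving up after %d attempts: %w", attempt+1, err)
		}
		select {
		case <-time.After(backoffDelay(attempt, base, max)):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	calls := 0
	// Millisecond delays keep the demo fast; the documented defaults
	// would be withRetry(ctx, 2, 2*time.Second, 30*time.Second, ...).
	err := withRetry(context.Background(), 2, 10*time.Millisecond, 50*time.Millisecond,
		func(context.Context) error {
			calls++
			if calls < 3 {
				return errTransient
			}
			return nil
		})
	fmt.Println("calls:", calls, "err:", err)
}
```

Waiting inside a `select` on `ctx.Done()` lets a cancelled run stop immediately instead of sleeping through the remaining backoff.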

Entry points:
- Engine.Run(ctx, opts): one-shot prompt execution; creates a fresh olium engine.
- Engine.RunOnOliumEngine(ctx, opts, eng): runs against a shared engine instance, preserving conversation prefix for prompt-cache hits across phases.
- Engine.RunSourceAnalysisParallel(ctx, cfg): fan-out source analysis (single explore call → parallel format/extension sub-calls on the same engine).

Concurrency is capped by agent.olium.max_concurrent (default 4).
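
The fan-out in RunSourceAnalysisParallel, together with agent.olium.max_concurrent (default 4), suggests a bounded worker pattern. The sketch below assumes that setting limits the number of in-flight sub-calls; `fanOut` and the sub-call names are hypothetical and stand in for whatever the engine actually dispatches.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut runs fn over every input with at most limit calls in flight
// and returns the results in input order.
func fanOut(inputs []string, limit int, fn func(string) string) []string {
	results := make([]string, len(inputs))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(i int, in string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			results[i] = fn(in)      // each goroutine writes its own index
		}(i, in)
	}
	wg.Wait()
	return results
}

func main() {
	// Illustrative sub-call names; the real ones are not listed here.
	subCalls := []string{"format", "extension-a", "extension-b"}
	out := fanOut(subCalls, 4, func(name string) string {
		return "done:" + name
	})
	fmt.Println(out)
}
```

A buffered channel used as a semaphore keeps the limit explicit without a worker pool, and writing to distinct slice indexes avoids a mutex around the results.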
2.2 Olium runtime (pkg/olium/)
Native, in-process replacement for the old subprocess pool.
- pkg/olium/engine: Engine.Run(ctx, prompt) <-chan Event returns a stream of events: EventTextDelta, EventThinkingDelta, EventToolCall, EventTurnDone (with token usage), EventError. Conversation state (system prompt, tool definitions, prior turns) lives on the engine and is reused across calls when phases share an engine.
- pkg/olium/tool: registry of built-in tools (bash, file ops, grep, fetch, …). Autopilot exposes the full set; swarm uses a smaller set per phase.
- pkg/olium/skill: optional skill files (Markdown SKILL.md packages) that augment the system prompt; loaded from embedded assets and ~/.vigolium/skills/.
- pkg/olium/provider: provider dispatch (eight drivers):

| Provider | Auth source |
|---|---|
| openai-codex-oauth | oauth_cred_path (JSON from codex login) |
| anthropic-api-key | llm_api_key or $ANTHROPIC_API_KEY |
| anthropic-oauth | oauth_token (from claude setup-token); falls back to $ANTHROPIC_API_KEY |
| openai-api-key | llm_api_key or $OPENAI_API_KEY |
| anthropic-cli | shells out to the local claude binary |
| anthropic-vertex | oauth_cred_path (GCP SA JSON or $GOOGLE_APPLICATION_CREDENTIALS) + google_cloud_project / google_cloud_location |
| google-vertex | same GCP creds; routes gemini-* models |
| openai-compatible | custom_provider.base_url (required), custom_provider.api_key (optional), custom_provider.model_id, custom_provider.extra_headers (Ollama / OpenRouter / LM Studio / vLLM / Groq / Together / LocalAI / custom proxies) |

The default provider is openai-compatible with gemma4:latest (a local Ollama endpoint). Providers are configured under agent.olium in vigolium-configs.yaml. The per-call deadline defaults to 10 minutes (call_timeout_sec).
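
To make the event stream and the per-call deadline concrete, here is a hedged sketch of a consumer. The `Event` struct, its fields, `collect`, `collectWithin`, and `fakeRun` are all simplified stand-ins invented for illustration; only the event kind names and the 10-minute default come from this section, and the real pkg/olium/engine types may look different.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"time"
)

// EventKind and Event are simplified stand-ins for the real engine types.
type EventKind int

const (
	EventTextDelta EventKind = iota
	EventThinkingDelta
	EventToolCall
	EventTurnDone
	EventError
)

type Event struct {
	Kind EventKind
	Text string
	Err  error
}

// collect drains the stream until EventTurnDone, EventError, a closed
// channel, or context expiry, and returns the concatenated text deltas.
func collect(ctx context.Context, events <-chan Event) (string, error) {
	var sb strings.Builder
	for {
		select {
		case <-ctx.Done():
			return sb.String(), ctx.Err()
		case ev, ok := <-events:
			if !ok {
				return sb.String(), nil
			}
			switch ev.Kind {
			case EventTextDelta:
				sb.WriteString(ev.Text)
			case EventError:
				return sb.String(), ev.Err
			case EventTurnDone:
				return sb.String(), nil
			}
			// Thinking deltas and tool calls are ignored in this sketch.
		}
	}
}

// collectWithin applies a per-call deadline around collect.
func collectWithin(timeout time.Duration, events <-chan Event) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return collect(ctx, events)
}

// fakeRun is a hypothetical producer standing in for Engine.Run.
func fakeRun(chunks ...string) <-chan Event {
	ch := make(chan Event, len(chunks)+1)
	for _, c := range chunks {
		ch <- Event{Kind: EventTextDelta, Text: c}
	}
	ch <- Event{Kind: EventTurnDone}
	close(ch)
	return ch
}

func main() {
	text, err := collectWithin(10*time.Minute, fakeRun("hel", "lo"))
	fmt.Println(text, err)
}
```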
2.3 Prompt templates
Markdown files with YAML frontmatter, loaded from (in order):
- agent.templates_dir (config dir)
- ~/.vigolium/prompts/
- Embedded (public/presets/prompts/ baked into the binary)

| Schema | Used by | Parsed into |
|---|---|---|
| findings | code review, triage, audit | []AgentFinding → DB |
| http_records | endpoint discovery | []AgentHTTPRecord → DB |
| source_analysis | swarm source-analysis phase | SourceAnalysisResult |
| attack_plan / swarm_plan | swarm plan + extension phases | SwarmPlan |
| triage_result | swarm triage phase | TriageResult |

Templates are rendered with TemplateData, which carries: source code snippets, directory tree, target URL, hostname, previous findings (DB), discovered endpoints (DB), module list/tags, scan stats, and a free-form Extra map for orchestrator-injected hints.
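
The schemas above are filled by the engine's tolerant parse step (fences, surrounding prose, string ↔ int coercion). The sketch below shows one way such extraction could work. It is not the code in pkg/agent/parsing/: `extractJSON`, `FlexInt`, `parseFindings`, and the `finding` struct with its two fields are hypothetical, and the scanner assumes the prose before the payload contains no stray `{` or `[`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
)

// FlexInt decodes from either a JSON number or a quoted number.
type FlexInt int

func (f *FlexInt) UnmarshalJSON(b []byte) error {
	var n int
	if err := json.Unmarshal(b, &n); err == nil {
		*f = FlexInt(n)
		return nil
	}
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return err
	}
	*f = FlexInt(n)
	return nil
}

// finding is an illustrative shape, not the real AgentFinding.
type finding struct {
	Title string  `json:"title"`
	Line  FlexInt `json:"line"`
}

// extractJSON returns the first balanced JSON object or array in text,
// skipping code fences and prose around it. Brackets inside JSON strings
// are ignored.
func extractJSON(text string) (string, error) {
	start, depth := -1, 0
	inString, escaped := false, false
	for i := 0; i < len(text); i++ {
		c := text[i]
		if start < 0 {
			if c == '{' || c == '[' {
				start, depth = i, 1
			}
			continue
		}
		switch {
		case escaped:
			escaped = false
		case inString:
			if c == '\\' {
				escaped = true
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			inString = true
		case c == '{' || c == '[':
			depth++
		case c == '}' || c == ']':
			depth--
			if depth == 0 {
				return text[start : i+1], nil
			}
		}
	}
	return "", errors.New("no complete JSON value found")
}

// parseFindings extracts the JSON payload from a model reply and decodes it.
func parseFindings(reply string) ([]finding, error) {
	raw, err := extractJSON(reply)
	if err != nil {
		return nil, err
	}
	var out []finding
	err = json.Unmarshal([]byte(raw), &out)
	return out, err
}

func main() {
	const fence = "\x60\x60\x60" // three backticks
	reply := "Here are the results:\n" + fence + "json\n" +
		`[{"title": "SQL injection", "line": "42"}]` + "\n" + fence + "\nDone."
	fs, err := parseFindings(reply)
	fmt.Println(fs, err)
}
```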
3. Swarm pipeline
vigolium agent swarm --target ... [--source ...] runs a state-machine pipeline. Each step implements swarmPhaseStep.Run(ctx, *swarmPipelineState).
Phases prefixed native- are pure-Go (no LLM). The pipeline is gated by:
- --only / --skip / --start-from flags (with legacy aliases via NormalizeSwarmPhase)
- intensity preset (SwarmPresets[Quick|Balanced|Deep])
- cfg.SourcePath: empty source skips source-analysis and code-audit
- cfg.Discover, cfg.CodeAudit, cfg.Triage toggles
- checkpoint resume: --resume <session-dir> skips already-completed phases

If an audit mode is set (cfg.Audit != "") and source is provided, an audit runs alongside the pipeline, contributing source-code audit findings without blocking the swarm. Swarm uses the embedded vigolium-audit harness directly; the multi-driver agent audit command layers piolium support on top.
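
The gating rules for the swarm pipeline can be pictured as a filter over an ordered phase list. The sketch below is an assumption-laden illustration: the `gate` struct, `selectPhases`, and the five-phase list are hypothetical (the real pipeline has ten phases and its own ordering), and how the real flags combine is not specified here.

```go
package main

import "fmt"

// gate is a hypothetical bundle of the knobs that decide which phases run.
type gate struct {
	Only      map[string]bool // --only (empty means "no restriction")
	Skip      map[string]bool // --skip
	StartFrom string          // --start-from
	Completed map[string]bool // phases recorded by a resumed checkpoint
	HasSource bool            // false when cfg.SourcePath is empty
}

// selectPhases walks the phases in order and keeps those that pass
// every gate.
func selectPhases(all []string, g gate) []string {
	started := g.StartFrom == ""
	var run []string
	for _, p := range all {
		if p == g.StartFrom {
			started = true
		}
		switch {
		case !started: // before --start-from
		case len(g.Only) > 0 && !g.Only[p]: // not listed in --only
		case g.Skip[p]: // listed in --skip
		case g.Completed[p]: // already done in the resumed session
		case !g.HasSource && (p == "source-analysis" || p == "code-audit"):
		default:
			run = append(run, p)
		}
	}
	return run
}

func main() {
	// Illustrative subset and order, not the real ten-phase list.
	all := []string{"source-analysis", "code-audit", "plan", "native-scan", "triage"}
	fmt.Println(selectPhases(all, gate{Skip: map[string]bool{"triage": true}}))
	fmt.Println(selectPhases(all, gate{StartFrom: "native-scan", HasSource: true}))
}
```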
Plan & extension phases
The master agent receives input records (chunked into MasterBatchSize, default 5) and returns a SwarmPlan.
Generated extensions are written to <session>/extensions/ with sanitized filenames.
Triage loop
After the native scan, if findings exist and triage is enabled, the triage agent receives a fixture (truncated by detail tiers: 15 full-detail / 40 table-with-top-10 / etc.) and emits a TriageResult. If FollowUpScans is non-empty and rescan is enabled, the pipeline loops back to native-scan with targeted modules. The loop is bounded by MaxIterations (default 3) and exits early when all findings have “certain” confidence.
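
The control flow of the triage loop can be sketched as below. The types and the `triageLoop` function are hypothetical simplifications (the real TriageResult carries more than two fields), and counting each scan-plus-triage pass as one iteration is an assumption about how MaxIterations is applied.

```go
package main

import "fmt"

type finding struct {
	Name       string
	Confidence string // "certain" is the only value this document names
}

// triageResult is a trimmed-down stand-in for the real TriageResult.
type triageResult struct {
	Findings      []finding
	FollowUpScans []string // modules to target on the next scan
}

func allCertain(fs []finding) bool {
	for _, f := range fs {
		if f.Confidence != "certain" {
			return false
		}
	}
	return true
}

// triageLoop alternates scan and triage until there is nothing to triage,
// every finding is certain, no follow-up is requested, rescan is disabled,
// or maxIterations passes have run. It returns the last result and the
// number of passes.
func triageLoop(maxIterations int, rescan bool,
	scan func(modules []string) []finding,
	triage func([]finding) triageResult) (triageResult, int) {
	var res triageResult
	var modules []string // nil means the initial, untargeted scan
	for i := 1; i <= maxIterations; i++ {
		found := scan(modules)
		if len(found) == 0 {
			return res, i
		}
		res = triage(found)
		if allCertain(res.Findings) || !rescan || len(res.FollowUpScans) == 0 {
			return res, i
		}
		modules = res.FollowUpScans
	}
	return res, maxIterations
}

func main() {
	scan := func(modules []string) []finding {
		return []finding{{Name: "xss", Confidence: "tentative"}}
	}
	triage := func(fs []finding) triageResult {
		return triageResult{Findings: fs, FollowUpScans: []string{"xss"}}
	}
	_, passes := triageLoop(3, true, scan, triage)
	fmt.Println("passes:", passes)
}
```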
4. Autopilot pipeline
vigolium agent autopilot --target ... [--source ...] is simpler: there are no plan/extension phases. The agent itself decides what to run.
The effort level (low/medium) is picked from the vigolium-audit mode: balanced/deep vigolium-audit → medium effort, otherwise low. The operator stream is captured to <session>/output.md.
Intensity presets (autopilot)
| Intensity | MaxCommands | Timeout | Vigolium-audit mode | Browser |
|---|---|---|---|---|
| quick | 150 | 1h | lite | on |
| balanced | 500 | 6h | balanced | on |
| deep | 1500 | 12h | deep | on |
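
The preset table and the effort rule translate directly into data plus one small function. The names `autopilotPreset`, `autopilotPresets`, and `effortFor` are made up for this sketch; the values are copied from the table above, and the real definitions live in the constants file listed under "Where things live".

```go
package main

import (
	"fmt"
	"time"
)

// autopilotPreset mirrors one row of the intensity table.
type autopilotPreset struct {
	MaxCommands int
	Timeout     time.Duration
	AuditMode   string // vigolium-audit mode
	Browser     bool
}

var autopilotPresets = map[string]autopilotPreset{
	"quick":    {MaxCommands: 150, Timeout: 1 * time.Hour, AuditMode: "lite", Browser: true},
	"balanced": {MaxCommands: 500, Timeout: 6 * time.Hour, AuditMode: "balanced", Browser: true},
	"deep":     {MaxCommands: 1500, Timeout: 12 * time.Hour, AuditMode: "deep", Browser: true},
}

// effortFor maps the vigolium-audit mode to an effort level:
// balanced/deep give medium, anything else gives low.
func effortFor(auditMode string) string {
	switch auditMode {
	case "balanced", "deep":
		return "medium"
	default:
		return "low"
	}
}

func main() {
	for _, name := range []string{"quick", "balanced", "deep"} {
		p := autopilotPresets[name]
		fmt.Printf("%-8s commands=%d timeout=%s effort=%s\n",
			name, p.MaxCommands, p.Timeout, effortFor(p.AuditMode))
	}
}
```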
5. Session directories
Every swarm and autopilot run writes a session dir under agent.sessions_dir (default ~/.vigolium/agent-sessions/<run-uuid>/).
EnsureSessionDir(baseDir, agenticScanUUID) in pkg/agent/pipeline_types.go is the canonical creator.
6. Configuration
All agent settings live under agent in vigolium-configs.yaml.
CLI flags (--provider, --model, --oauth-cred, --oauth-token, --llm-api-key) override the config at runtime. The REST API also accepts per-request BYOK credentials.
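
One way to picture the override rule for credentials is a first-non-empty lookup. This is a sketch under assumptions: the order flag, then config value, then environment variable is inferred from "flags override the config" and from the "llm_api_key or $ANTHROPIC_API_KEY" wording in the provider table, and `firstNonEmpty` / `resolveAPIKey` are hypothetical helpers rather than functions from the codebase.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveAPIKey applies the assumed precedence: CLI flag, then config
// value, then the named environment variable.
func resolveAPIKey(flagValue, configValue, envVar string) string {
	return firstNonEmpty(flagValue, configValue, os.Getenv(envVar))
}

func main() {
	// With no flag and no config value, the environment variable decides.
	key := resolveAPIKey("", "", "ANTHROPIC_API_KEY")
	fmt.Println("key set:", key != "")
}
```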
7. Where things live
| What | Where |
|---|---|
| Subcommand wiring | pkg/cli/agent*.go |
| Swarm orchestrator | pkg/agent/swarm.go, swarm_pipeline.go |
| Autopilot orchestrator | pkg/agent/autopilot_pipeline.go |
| Audit driver dispatcher | pkg/agent/audit_drivers.go, audit_chain.go |
| Vigolium-audit / piolium runners | pkg/agent/audit_agent.go, pkg/piolium/ |
| Engine (prompt → dispatch) | pkg/agent/engine.go |
| Prompt templates / rendering | pkg/agent/prompt/, public/presets/prompts/ |
| Output parsers (JSON-tolerant) | pkg/agent/parsing/ |
| Olium runtime | pkg/olium/engine, tool, skill |
| Olium providers | pkg/olium/provider/ |
| Phase constants & presets | pkg/agent/agenttypes/constants.go |
| Core types | pkg/agent/agenttypes/types.go |
| Public aliases | pkg/agent/aliases.go |
| Config schema | internal/config/agent.go |
8. Quick mental model
- Engine turns a prompt template + DB context into a structured result. One LLM call.
- Orchestrator sequences many engine calls plus native steps (discovery, scan), checkpoints state, and writes a session directory.
- Olium is the agent runtime: it holds conversation state and dispatches to a provider. One olium engine can serve many engine calls cheaply (prompt cache hits).
- Phases are the unit of resumable work. --only, --skip, --start-from, and --resume operate on phase names.
- Intensity is a single knob that hydrates a bundle of toggles (commands, timeout, vigolium-audit mode, discover/audit/triage flags, browser/auth).
