Documentation Index
Fetch the complete documentation index at: https://docs.vigolium.com/llms.txt
Use this file to discover all available pages before exploring further.
Architecture series: Overview · Native Scan · Agentic Scan · Data & Storage · Server & API
1. Multi-Tenancy: the project_uuid spine
All scan data is partitioned by project, a named container with a UUID, optional config overlay, and optional access-control lists. There is no separate database per project; isolation is a project_uuid column on every major table, filtered on every read and stamped on every write.
- Default project:
00000000-0000-0000-0000-000000000001, created duringvigolium init. Used whenever no project is selected. - Selection precedence:
--project-uuid>--project-name>VIGOLIUM_PROJECT_UUID>VIGOLIUM_PROJECT(legacy) > default. On the server, theX-Project-UUIDrequest header plays the same role. - Project config is a partial YAML overlay (same shape as a scanning profile) at
~/.vigolium/projects/<uuid>/config.yaml; only the keys it sets are overridden. - Access control:
allowed_emails/allowed_domainson the project row gate server requests that carryX-User-Email, exact-email list wins, else domain list, else open; a missing email header skips the check; denial is403.VIGOLIUM_PROJECT_READONLY=truedisables all mutatingprojectCLI subcommands.
Tables carrying project_uuid
scans · http_records · findings · scopes · source_repos · oast_interactions · scan_logs
Existing databases are migrated in place, the column is added with the default-project UUID as its default, so pre-multi-tenancy data lands in the default project.
2. The database backend
Vigolium uses the repository pattern over Bun ORM, with two interchangeable backends:| Backend | Driver | Use |
|---|---|---|
| SQLite (default) | sqliteshim → modernc | Single-binary, zero-config, local scans |
| PostgreSQL | pgdriver | Shared/server deployments, concurrent writers |
http_records row self-contained and avoids join fan-out on the hot ingestion path.
Core models (pkg/database/models.go)
HTTPRecord (table http_records), the unit of ingested traffic:
- Identity:
UUID(PK),RequestHash(SHA-256 of the raw request, used for per-source dedup) - Host:
Scheme,Hostname,Port,IP - Request:
Method,Path,URL,RequestHeaders(JSONB),RawRequest,RequestBody - Response:
StatusCode,ResponseHeaders(JSONB),RawResponse,ResponseBody,ResponseTitle,ResponseWords - Derived:
Parameters(JSONB array ofEmbeddedParam),RiskScore,Remarks - Metadata:
Source,SentAt,ReceivedAt,CreatedAt
Finding (table findings), a detected issue:
- Identity:
ID(auto-increment),FindingHash(unique constraint → dedup key, set from theResultEventID) - Module:
ModuleID,ModuleName,Description,Severity,Confidence - Evidence:
MatchedAt(JSONB),ExtractedResults,Request,Response,AdditionalEvidence(merged-duplicate request/response pairs, capped at 10) - Relations:
HTTPRecordUUIDs(JSONB),ScanUUID; thefinding_recordsjunction table is the many-to-many link to HTTP records.
findings table, tagged by source so native and AI results coexist and dedup together.
Converters (pkg/database/converters.go)
The in-memory scan types never touch the DB directly. HTTPRecord.FromHttpRequestResponse() and Finding.FromResultEvent() are the only seam, they generate UUIDs, compute hashes, parse URLs, extract titles, and count words, keeping persistence concerns out of the executor and modules.
3. Write paths
Method (pkg/database/repository.go) | Role |
|---|---|
SaveRecord() / SaveRecordsBatch() | Single vs. bulk INSERT (batch = one transaction) |
SaveFinding() | INSERT … ON CONFLICT (finding_hash) DO NOTHING + evidence append + junction rows |
DeduplicateFindings() | Post-phase grouping: merge findings sharing (module_id, severity, matched_at URL) |
CreateScanWithCursor() / CountRecordsAfterCursor() | Cursor bookkeeping for incremental rescans |
GetRecordsWithResponseBody() | UUID-cursor pagination for batch scanners (e.g. Kingfisher) |
UpdateRiskScores() | Batched CASE/WHEN UPDATE, 500 UUIDs per statement |
Async batched ingestion, RecordWriter
High-throughput ingestion (proxy capture, bulk import, spidering) does not call the repository synchronously. pkg/database/record_writer.go fronts it with a buffered channel:
WriteResult{UUID, Err} back on a per-request result channel, backpressure is the channel capacity, ordering is preserved, and the DB sees large batched transactions instead of a write per request.
SQLite DSN note: the modernc driver needs pragmas in
_pragma=name(value) form; the mattn-style _busy_timeout= is silently ignored. Relevant when tuning concurrent-writer behavior.4. Deduplication
Two layers, by design:- Per-source HTTP-record dedup:
RequestHash(SHA-256 of the raw request) plusDeduplicateRecordsBySourcecollapses re-ingested identical requests within a source. - Finding dedup: the
finding_hashunique constraint prevents exact duplicates at insert time;DeduplicateFindings()runs after a phase to group near-duplicates (same module/severity/URL), folding the extra request/response pairs into the survivor’sAdditionalEvidence(capped). The multi-driverauditcommand runs an additional project-wide findings dedup pass once its drivers exit.
5. Cloud storage (optional)
Storage is disabled by default. When enabled (storage.enabled: true), a single minio-go S3 client talks to GCS (HMAC), S3, or self-hosted MinIO, the driver differs, the rest is identical.
storage.ValidateKey rejects .., backslashes, absolute paths) and project-prefixed server-side, so one bucket safely holds many projects and clients cannot reach outside their own.
| Conventional prefix | Producer |
|---|---|
ugc/<file> | vigolium storage upload (default) |
imports/<base>-<ts>.<ext> | vigolium import --upload |
native-scans/<scan-uuid>/results.tar.gz | vigolium scan --upload-results |
agentic-scans/<run-uuid>/results.tar.gz | vigolium agent … --upload-results |
gs:// URLs are first-class inputs/outputs: vigolium import gs://… downloads-then-imports (detecting archon folders or JSONL inside .tar.gz/.zip), and any export -o gs://… writes locally then uploads on success. The {ts} and {project-uuid} placeholders expand in any -o path. The bundle export format round-trips a full snapshot (JSONL + HTML report + manifest + agent session dirs) that another machine can re-import.
Related
Projects API
Project CLI/API recipes and access-control management.
Storage API
Full
vigolium storage command and gs:// reference.Native Scan
Stage 11 traces a finding from
ResultEvent to row.Configuration
The
storage: and project config YAML blocks.