← Documentation

Deployment

Docker-based deployment, sizing, model configuration and operational practices.


Vera ships as a set of Docker images deployed in EU-region infrastructure — managed by Nex0 in our standard offering, or into the customer's own environment where data-sovereignty requirements demand it.

Services

Service Role
console The operator web console (Next.js).
engine Reasoning API: language layer, solver orchestration, agent runtime.
mcp The MCP server surface.
neo4j Knowledge-graph memory.
postgres Relational store: audit log, rule versions, users, configuration.

A reference docker-compose configuration covers single-site deployments; the services are stateless apart from the two stores and run on Kubernetes without modification for multi-site operators.

Sizing

No GPU is required at any tier.

  • CPU: 4–8 vCPU
  • Memory: 16–32 GB RAM
  • Storage: 50–100 GB (audit history dominates growth)
  • Inference: under 10,000 model calls per month for a typical site — the solver, not the model, does the heavy lifting

Model configuration

The language layer targets any OpenAI-compatible endpoint and is configured per deployment:

  • Hosted API model — fastest to start; only prompt text leaves the deployment.
  • Self-hosted open-weight model — Mistral-family or Teuken-class models via Ollama or vLLM, for full sovereignty. Requires its own inference hardware, separate from the sizing above.

Cutover between models is a configuration change gated on one condition: the candidate model must pass the extraction evaluation suite at parity or better. Decision integrity is unaffected by model choice — verification always happens in the solver.

Environments and the public demo

The public demonstration environment at vera.nex0.tech is the operator console running against a dedicated field-service demonstration workspace. It exists so evaluators can exercise the real pipeline — rule parsing, solver verdicts, unsat cores, audit traces — without touching customer workspaces (demo credentials are available on request). It is not a production deployment and carries no service levels.

Operational practices

  • Backups: nightly snapshots of both stores; audit log is append-only.
  • Upgrades: rolling, with rule-set and schema migrations applied transactionally — a failed migration rolls back, never half-applies.
  • Monitoring: structured logs from every service; solver latency and schema-validation failure rate are the two health signals worth alerting on.

For security architecture, data residency and the disclosure process, see Security.