Deployment

This page is the production-readiness checklist for a Curiosity Workspace deployment. It assumes you've already chosen a platform — for the platform-specific manifests, see Installation, Docker, Kubernetes, and the cloud guides.

Deployment goals

  • Reliability: predictable uptime, fast recovery.
  • Security: TLS, secrets discipline, scoped tokens, ReBAC enforced everywhere.
  • Scalability: handle data growth and query load.
  • Reproducibility: dev → staging → prod promotion is mechanical, not heroic.

Environments

Maintain three named environments with parity in shape but different scale and access:

Environment   Purpose                              Notes
Dev           Engineer-owned, may be on a laptop   Local Docker run, default ports, generated admin password.
Staging       Pre-production validation            Prod-shaped manifest, smaller capacity, isolated secrets, restore drills allowed.
Production    Real users                           Restricted access, change control, no shell access by default.

Promotion path: code/config changes land in source → tested in dev → deployed to staging → validated → promoted to prod.

What to version and promote

Treat these as deployable artifacts and version-control them:

  • Connector code (your data ingestion programs).
  • Custom endpoint and AI tool code (export from the workspace UI, store in git, redeploy on promotion).
  • Custom interface bundles (Tesserae / H5).
  • Schema migrations and ingestion pipeline definitions.
  • Search index configuration (indexed fields, boosts, facets).
  • NLP pipeline configuration (entity capture, embeddings field selection).
  • The deployment manifest itself (Docker Compose / Kubernetes / Helm / Terraform).

The Workspace stores all UI-managed configuration inside the graph, so a configuration export + import lets you snapshot and promote workspaces. See Backup and restore.
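One possible repository layout for these artifacts (directory names are illustrative, not prescribed):

workspace-deploy/
├── connectors/       # data ingestion programs
├── endpoints/        # custom endpoint and AI tool code exported from the UI
├── interfaces/       # Tesserae / H5 bundles
├── schema/           # migrations and ingestion pipeline definitions
├── config/           # search index, NLP pipeline, and workspace configuration exports
└── deploy/           # Docker Compose / Kubernetes / Helm / Terraform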

Production checklist

Before flipping a workspace to "production":

Image and runtime

  • Versioned image tag (curiosityai/curiosity:vX.Y.Z), not :latest.
  • Container memory and CPU sized for embeddings (start at 16 GB / 8 vCPU; bigger for large corpora).
  • Healthcheck on /api/login/check.
  • terminationGracePeriodSeconds ≥ 60 so the workspace can flush before being killed.
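As a sketch, these runtime items map onto a Kubernetes container spec like the following. The resource names are placeholders, and the workspace is assumed to listen on port 8080, matching the proxy example later on this page:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: curiosity-workspace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: curiosity-workspace
  template:
    metadata:
      labels:
        app: curiosity-workspace
    spec:
      terminationGracePeriodSeconds: 60        # let the workspace flush before SIGKILL
      containers:
        - name: workspace
          image: curiosityai/curiosity:v1.2.3  # pinned version tag, never :latest
          resources:
            requests:
              cpu: "8"
              memory: 16Gi
            limits:
              memory: 16Gi
          livenessProbe:
            httpGet:
              path: /api/login/check
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 15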

Storage

  • Persistent volume on SSD-backed block storage, mounted at the MSK_GRAPH_STORAGE path.
  • Separate volume (or directory) for MSK_GRAPH_BACKUP_FOLDER.
  • Backups scheduled, off-host, and tested by restoring to a sandbox.
  • Volume expansion enabled so you can grow without downtime.
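A minimal Kubernetes sketch of this storage layout, assuming MSK_GRAPH_STORAGE and MSK_GRAPH_BACKUP_FOLDER take container paths (the paths, claim name, storage class, and size are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: curiosity-graph
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ssd          # SSD-backed class with allowVolumeExpansion: true
  resources:
    requests:
      storage: 200Gi
---
# In the workspace container spec:
env:
  - name: MSK_GRAPH_STORAGE
    value: /data/graph           # placeholder path on the graph volume
  - name: MSK_GRAPH_BACKUP_FOLDER
    value: /backups              # placeholder path on the backup volume
volumeMounts:
  - name: graph
    mountPath: /data/graph
  - name: backups
    mountPath: /backups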

Networking and TLS

  • TLS terminated at the proxy or inside the container; HSTS enabled.
  • MSK_PUBLIC_ADDRESS set to the user-facing URL.
  • No 0.0.0.0 exposure without an authenticating front-end.
  • Egress allowlist documented (Docker registry, NuGet, your LLM provider).
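For the egress allowlist, a plain Kubernetes NetworkPolicy is one option. Note that NetworkPolicy matches IP ranges, not hostnames, so the CIDR below is a placeholder; hostname-based allowlists for the registry, NuGet, or your LLM provider need an egress proxy or a CNI with FQDN rules:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: workspace-egress
spec:
  podSelector:
    matchLabels:
      app: curiosity-workspace
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector: {}          # allow cluster DNS
      ports:
        - port: 53
          protocol: UDP
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24         # placeholder: resolved provider ranges
      ports:
        - port: 443
          protocol: TCP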

Identity and secrets

  • MSK_ADMIN_PASSWORD set explicitly (default admin/admin never used).
  • MSK_JWT_KEY set explicitly so tokens survive restarts.
  • MSK_GRAPH_MASTER_KEY set explicitly and backed up — losing it means losing encrypted content.
  • All secrets injected from a secret manager (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, Vault).
  • At least one SSO provider configured.
  • Admin sign-in via SSO only; the local admin account disabled after onboarding.
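A sketch of the secret wiring on Kubernetes, with the Secret object itself synced from your secret manager (for example via External Secrets Operator). The Secret name and keys are placeholders; the three MSK_* variables come from this checklist:

env:
  - name: MSK_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: curiosity-secrets
        key: admin-password
  - name: MSK_JWT_KEY
    valueFrom:
      secretKeyRef:
        name: curiosity-secrets
        key: jwt-key
  - name: MSK_GRAPH_MASTER_KEY          # back this key up separately
    valueFrom:
      secretKeyRef:
        name: curiosity-secrets
        key: graph-master-key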

Permissions and tokens

  • Connectors run on dedicated tokens with ingestion scope only.
  • External integrations use endpoint tokens scoped to specific endpoints.
  • Token rotation documented and scheduled.

Observability

  • Stdout logs routed to your aggregator; audit log forwarded to your SIEM.
  • Alerts on liveness, latency regressions, ingestion failures, container restart rate.
  • Per-endpoint and per-tool metrics scraped into your monitoring system.
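If you run Prometheus, the liveness and restart-rate alerts might look like the following sketch; the job label and the kube-state-metrics metric name are assumptions about your monitoring setup:

groups:
  - name: curiosity-workspace
    rules:
      - alert: WorkspaceDown
        expr: up{job="curiosity-workspace"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: Workspace scrape target down; check the /api/login/check healthcheck
      - alert: WorkspaceRestartLoop
        expr: increase(kube_pod_container_status_restarts_total{container="workspace"}[30m]) > 3
        labels:
          severity: warning
        annotations:
          summary: Workspace container restarting repeatedly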

Disaster recovery

  • Documented RPO and RTO targets.
  • Restore drill completed within the past quarter.
  • Secrets manager backups verified.

See the per-page details: Security, Backup and restore, Monitoring, Upgrades and migrations.

Reverse proxy patterns

Most production deployments terminate TLS at a proxy and forward HTTP to the workspace. A minimal NGINX server block:

server {
    listen 443 ssl http2;
    server_name workspace.example.com;

    ssl_certificate     /etc/letsencrypt/live/workspace.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/workspace.example.com/privkey.pem;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    client_max_body_size 100m;    # allow uploads up to 100 MB
    proxy_read_timeout   300s;    # don't cut off long-running API calls

    location / {
        proxy_pass         http://127.0.0.1:8080;   # workspace container's HTTP port
        proxy_set_header   Host              $host;
        proxy_set_header   X-Real-IP         $remote_addr;
        proxy_set_header   X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
    }
}

Set MSK_PUBLIC_ADDRESS=https://workspace.example.com on the workspace so generated links use the proxy's hostname.

Rolling out a change

Recommended sequence for a non-trivial production change (image upgrade, schema migration, new SSO provider, …):

  1. Take a backup of the graph volume.
  2. Apply the change in staging; walk the post-restore validation checklist in Backup and restore.
  3. Promote to production during a low-traffic window.
  4. Watch Monitoring for 30 minutes after the rollout.
  5. Be prepared to roll back: revert the image tag (and configuration), restart, and restore the backup if data shape changed.
