Backup and restore

A Curiosity Workspace deployment has three things you need to be able to restore from cold storage:

  1. The graph — the typed nodes, edges, and indexes that live under MSK_GRAPH_STORAGE. This is the database.
  2. The workspace configuration — search indexes, NLP pipelines, embedding/LLM provider settings, SSO config, scheduled tasks, custom endpoints, AI tools. Stored inside the graph, so it's covered by the graph backup.
  3. The secrets — MSK_ADMIN_PASSWORD, MSK_JWT_KEY, MSK_LICENSE, MSK_GRAPH_MASTER_KEY, model-provider API keys, certificate material. Stored outside the graph in your secret manager.

You need a working copy of all three to restore a Workspace from scratch.

Strategy at a glance

| Tier | Target RPO | Target RTO | Approach |
| --- | --- | --- | --- |
| Local dev | 24h | best-effort | Periodic tar of MSK_GRAPH_STORAGE. |
| Staging | 24h | 1h | Daily volume snapshot + secrets in a secret manager. |
| Production | 1h | 15 min | Hourly snapshot + 15-min journal sync + redundant secrets manager + tested restore drill. |
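The production tier's cadence can be sketched as a crontab; both script names are hypothetical wrappers around the steps described below:

```shell
# Sketch of the production-tier cadence as a crontab.
# Both script names are hypothetical wrappers around the steps below.
0 * * * *     /usr/local/bin/curiosity-snapshot.sh      # hourly snapshot
*/15 * * * *  /usr/local/bin/curiosity-journal-sync.sh  # 15-min journal sync
```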

What to back up

The graph (MSK_GRAPH_STORAGE)

Snapshot the directory MSK_GRAPH_STORAGE points at. The graph supports lock-free reads, so a snapshot taken with a CSI volume snapshot, an EBS snapshot, or a filesystem-level snapshot (LVM, ZFS, Btrfs) is consistent.

If your platform doesn't support snapshots, the workspace can write a consistent point-in-time backup to MSK_GRAPH_BACKUP_FOLDER:

  1. Set MSK_GRAPH_BACKUP_FOLDER to a path that's mounted to durable storage (a separate volume, an S3 bucket via a filesystem driver, etc.).
  2. Schedule a backup task under Settings → Tasks with type Backup. Recommended frequency: hourly for production.
  3. Copy the resulting backup files off-host on a schedule.

The journal (MSK_GRAPH_JOURNAL_FOLDER)

If set, the journal contains the write log used to recover from crashes. Including it in your backup tightens your RPO between snapshots — restore can replay journal entries created after the last snapshot.

Secrets and configuration

The graph backup includes most workspace configuration. Outside of the graph, back up:

  • All MSK_* secrets — MSK_ADMIN_PASSWORD, MSK_JWT_KEY, MSK_LICENSE, MSK_GRAPH_MASTER_KEY, and any provider-side secrets you reference via *_FILE variables.
  • TLS material if certificates are mounted into the container (MSK_CERT_FILE, MSK_CERT_FILE_PRIVATE_KEY).
  • The Docker/Helm/Compose manifest that runs the workspace, so the restored environment looks the same.
  • Custom interface and connector source code — these live in your own git repositories; make sure those are mirrored.

Daily backup procedure (containerized)

#!/usr/bin/env bash
set -euo pipefail

TS=$(date -u +%Y-%m-%dT%H%M%SZ)
DEST=/srv/backups/curiosity/$TS
mkdir -p "$DEST"

# 1) No quiesce needed: graph snapshots are lock-free. Flushing the page
#    cache is still worth doing; under heavy write activity you may also
#    want to pause ingestion for the duration.
docker exec curiosity sync || true

# 2) Snapshot
tar -C /srv/curiosity -czf "$DEST/graph.tar.gz" curiosity

# 3) Capture secrets from the secret manager
vault kv get -format=json secret/curiosity > "$DEST/secrets.json"

# 4) Verify integrity
test -s "$DEST/graph.tar.gz" && test -s "$DEST/secrets.json"

# 5) Off-host
aws s3 cp --recursive "$DEST" "s3://my-curiosity-backups/$TS/"

For Kubernetes deployments, the equivalent is a VolumeSnapshot plus a Secret snapshot. For cloud-managed disks (EBS, Azure Disk, Persistent Disk), use the platform's snapshot API instead of tar.
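The Kubernetes path might be sketched like this; the namespace, PVC, Secret, and VolumeSnapshotClass names are all assumptions:

```shell
# Sketch: CSI snapshot of the graph volume plus an export of the Secret
# holding MSK_* values. All resource names are assumptions.
kubectl apply -f - <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: curiosity-graph-snap
  namespace: curiosity
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: curiosity-graph
EOF

# Export the Secret for off-cluster storage (encrypt it at rest).
kubectl -n curiosity get secret curiosity-secrets -o yaml > curiosity-secrets.yaml
```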

Restore procedure

You're restoring three things in this order: secrets, graph storage, then the running workspace.

  1. Provision secrets in the destination secret manager, matching the names referenced by your manifest.
  2. Restore the graph volume:
    mkdir -p /srv/curiosity
    tar -C /srv/curiosity -xzf /srv/backups/curiosity/<timestamp>/graph.tar.gz
    
  3. Start the workspace with the same MSK_GRAPH_STORAGE path and the same secrets:
    docker run --name curiosity \
      -p 127.0.0.1:8080:8080 \
      -v /srv/curiosity:/data \
      -e MSK_GRAPH_STORAGE=/data/curiosity \
      -e MSK_GRAPH_MASTER_KEY="$(vault kv get -field=master_key secret/curiosity)" \
      -e MSK_JWT_KEY="$(vault kv get -field=jwt_key secret/curiosity)" \
      -e MSK_ADMIN_PASSWORD="$(vault kv get -field=admin_password secret/curiosity)" \
      curiosityai/curiosity:<same-version-as-source>
    
  4. Wait for startup — the workspace replays journal entries before accepting traffic. Watch docker logs -f curiosity.

Restore on the same Workspace version

Always restore onto the same container image version the backup was taken on, then upgrade afterward. Restoring across major versions is not supported.
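To make the same-version rule easy to follow, the backup step can record the running image next to each backup. The container name "curiosity" and the DEST path are assumptions:

```shell
# Sketch: record the exact image alongside each backup so the restore
# can pin it. Container name "curiosity" and DEST are assumptions.
DEST=/srv/backups/curiosity/latest
mkdir -p "$DEST"
docker inspect --format '{{.Config.Image}}' curiosity > "$DEST/image.txt"

# At restore time, substitute the recorded tag for <same-version-as-source>:
#   docker run ... "$(cat "$DEST/image.txt")"
```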

Validating a restore

After restore, walk this checklist before declaring the environment ready:

  • Sign in works with the admin account from the backup.
  • Users and teams are present under Settings → Accounts.
  • Node counts match the source for each major type:
    return Q().StartAt("Ticket").Count();
    
  • Search returns results for a smoke query you know should match.
  • Vector search returns results — embeddings survived the snapshot.
  • SSO works — sign in via each configured provider.
  • Scheduled tasks are present and enabled at the expected cadence.
  • Custom endpoints compile and respond on POST /api/endpoints/run/<name>.
  • AI tools respond inside the chat view.

Restore drills

A restore you've never tested is not a restore. We recommend a quarterly drill for production:

  1. Spin up an isolated staging cluster.
  2. Restore the latest backup into it.
  3. Walk the validation checklist.
  4. Tear it down.

A documented, dated drill — even one page — is what auditors will ask for.
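One low-effort way to keep that record, assuming a log file path:

```shell
#!/usr/bin/env bash
# Sketch: append a dated entry after each successful drill.
# The log path is an assumption.
set -euo pipefail

LOG=${LOG:-/srv/backups/curiosity/drill-log.txt}
mkdir -p "$(dirname "$LOG")"
printf '%s  drill OK: backup restored, validation checklist passed\n' \
  "$(date -u +%F)" >> "$LOG"
```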

Cross-environment migration

The same procedure works for promoting data from staging to production, or for cloning production into a sandbox. Two caveats:

  • Re-encrypt secrets for the destination environment. Don't share MSK_JWT_KEY between environments — a session token minted in one would be valid in the other.
  • Strip personally identifiable information before cloning production into shared dev environments. The graph has no built-in PII redaction; you can run a one-off cleanup endpoint after the restore.
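Minting a fresh signing key for the destination, rather than copying MSK_JWT_KEY across, can be sketched as follows; the 48-byte length and the secret-manager path are assumptions:

```shell
# Sketch: generate a fresh signing key for the destination environment.
# The 48-byte length and the vault path are assumptions.
NEW_JWT_KEY=$(openssl rand -base64 48)
vault kv put secret/curiosity-staging jwt_key="$NEW_JWT_KEY"
```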

© 2026 Curiosity. All rights reserved.