Installation
Curiosity Workspace ships as a single container image (curiosityai/curiosity) plus a Windows installer. This page helps you pick the right delivery model for your environment and links to the platform-specific guide that takes you the rest of the way.
For a runnable local install in under five minutes, jump to Quickstart. For a complete end-to-end developer build, see Build your first enterprise AI app.
Decision tree
| If you need… | Use… |
|---|---|
| Local development on a laptop | Docker — a single docker run or docker compose up. |
| A demo or evaluation on a Windows VM | Windows installer. Easy to install/uninstall as a service. |
| A staging environment on a single VM | Docker with Compose, behind your existing reverse proxy. |
| A production deployment on Kubernetes | Kubernetes, and consult the cloud-specific notes below. |
| A production deployment on AWS | AWS — EC2/EKS, EBS, ALB. |
| A production deployment on Azure | Azure — VM/AKS, Azure Disk, Entra ID. |
| A production deployment on Google Cloud | GCP — Compute Engine/GKE, Persistent Disk. |
| A production deployment on OpenShift | OpenShift. |
| An air-gapped or on-prem deployment | Docker or Kubernetes with a private registry mirror. |
Decisions you should make before installing
Environment tier
Local, staging, and production each impose different defaults.
- Local: bind to loopback, use generated admin password via `MSK_ADMIN_PASSWORD`, persistence on a real volume so you don't lose your work between restarts.
- Staging: prod-shaped manifest, smaller capacity, isolated secrets, restore drills allowed.
- Production: TLS, secrets manager, monitoring, backups, anti-affinity, anti-cohabitation with noisy neighbors.
Storage
The graph and its indexes are I/O sensitive. Always provision SSD or better.
- Capacity = (sum of indexed text fields, in bytes) × ~1.5 + (embedded fields, in bytes) × (embedding dimensions × 4 / chunk size) + journal headroom.
- A starter PVC of 200 GB is sufficient for hundreds of thousands of documents.
- Pin `MSK_GRAPH_BACKUP_FOLDER` to a different volume so a corrupted graph volume doesn't take backups with it.
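The capacity rule of thumb above can be turned into a quick calculation. The sketch below is only an illustration of that formula: the function name, the parameter names, and the reading of "chunk size" as bytes of source text per embedded chunk are assumptions, and the factor of 4 is taken to mean four bytes per embedding dimension.

```python
import math


def estimate_capacity_bytes(
    indexed_text_bytes: int,
    embedded_text_bytes: int,
    embedding_dimensions: int,
    chunk_size_bytes: int,
    journal_headroom_bytes: int = 0,
) -> int:
    """Estimate graph volume size from the rule of thumb in this section.

    Assumption: chunk_size_bytes is the amount of source text (in bytes)
    covered by one embedding vector.
    """
    if chunk_size_bytes <= 0:
        raise ValueError("chunk_size_bytes must be positive")

    # Indexed text fields cost roughly 1.5x their raw size.
    text_part = indexed_text_bytes * 1.5

    # Each chunk stores one vector of (dimensions x 4) bytes.
    vector_bytes_per_chunk = embedding_dimensions * 4
    embedding_part = embedded_text_bytes * vector_bytes_per_chunk / chunk_size_bytes

    return math.ceil(text_part + embedding_part + journal_headroom_bytes)


GB = 10**9
estimate = estimate_capacity_bytes(
    indexed_text_bytes=50 * GB,
    embedded_text_bytes=20 * GB,
    embedding_dimensions=1024,
    chunk_size_bytes=2048,
    journal_headroom_bytes=10 * GB,
)
print(f"Estimated capacity: {estimate / GB:.0f} GB")
```

With these illustrative inputs the text part is 75 GB, the embedding part is 40 GB, and the journal headroom adds 10 GB. Treat the result as a starting point and compare it against real disk usage after an initial ingestion.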
Access model
Decide before opening the workspace to anyone.
- Internal only: workspace reachable through a VPN or private network.
- Public, but authenticating: TLS-terminated reverse proxy, SSO via your IdP, no `admin/admin` default.
- See Security.
Identity
Plan how users will sign in before ingesting production data, so ACLs can be ingested against the right teams.
- Local accounts for evaluation only.
- One or more SSO providers (Microsoft Entra ID, Google, Okta, Auth0, SAML).
- Map IdP groups to Workspace teams.
Observability and backups
- Centralized logs (stdout collector, or `MSK_LOG_PATH` on a mounted volume).
- Backups to off-host storage, with a documented restore drill. See Backup and restore.
- Alerts on liveness, latency regressions, and ingestion failures. See Monitoring.
Prerequisites
- CPU: 4 cores minimum (8+ recommended for production with embeddings).
- RAM: 8 GB minimum (16 GB+ recommended; embedding indexes are memory-resident).
- Storage: SSD with enough space for graph + indexes + 1.5× headroom for backups.
- Network: TCP `8080` (or your chosen `MSK_PORT`) reachable by clients. TLS terminated by a proxy or in-container.
- License: an `MSK_LICENSE` token if you have a commercial license.
- For AI features: an LLM/embedding provider key (OpenAI, Azure OpenAI, Anthropic, or a local OpenAI-compatible server).
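A host can be checked against the CPU, RAM, and storage minimums above before installing. The following is a minimal sketch, not a tool that ships with the product: `check_prerequisites` and `detect_host` are hypothetical helpers, and the required disk figure is something you supply from your own capacity estimate.

```python
import os
import shutil
from typing import List, Optional, Tuple

MIN_CPU_CORES = 4  # minimum from the prerequisites list
MIN_RAM_GB = 8     # minimum from the prerequisites list


def check_prerequisites(
    cpu_cores: int,
    ram_gb: Optional[float],
    free_disk_gb: float,
    required_disk_gb: float,
) -> List[str]:
    """Return human-readable problems; an empty list means the host passes."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores, need at least {MIN_CPU_CORES}")
    # ram_gb is None when detection is not supported; that check is skipped.
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB, need at least {MIN_RAM_GB} GB")
    if free_disk_gb < required_disk_gb:
        problems.append(
            f"Storage: {free_disk_gb:.0f} GB free, need {required_disk_gb:.0f} GB"
        )
    return problems


def detect_host(data_path: str = ".") -> Tuple[int, Optional[float], float]:
    """Read core count, physical RAM, and free disk space for data_path."""
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage(data_path).free / 1e9
    try:
        # os.sysconf is Unix-only; these names are not available everywhere.
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        ram_gb = None
    return cores, ram_gb, free_gb


cores, ram_gb, free_gb = detect_host()
for problem in check_prerequisites(cores, ram_gb, free_gb, required_disk_gb=200):
    print("WARNING:", problem)
```

The script cannot tell whether the disk is an SSD or whether the port is reachable from clients, so those two items still need a manual check.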
First-boot checklist
After the service is running for the first time:
- Open the UI and complete the setup wizard.
- Rotate admin credentials if you didn't already set `MSK_ADMIN_PASSWORD` (never leave defaults in any environment beyond your laptop).
- Set `MSK_JWT_KEY` explicitly so tokens survive restarts.
- Set `MSK_GRAPH_MASTER_KEY` and back it up; you cannot decrypt content without it.
- Configure SSO before inviting real users.
- Create an API token for ingestion connectors and store it in a secret manager.
- Confirm persistence by restarting the service and verifying the workspace state remains.
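The secret-related items in this checklist can be prepared ahead of first boot. The sketch below generates random values for the three variables named above using Python's standard `secrets` module. The helper names are hypothetical, and the value format is an assumption: this page does not state what length or encoding the workspace expects, so confirm that against the configuration reference before relying on it.

```python
import secrets
from typing import Dict

# Variable names come from this page; the value format is an assumption.
SECRET_VARIABLES = ("MSK_ADMIN_PASSWORD", "MSK_JWT_KEY", "MSK_GRAPH_MASTER_KEY")


def generate_first_boot_secrets(nbytes: int = 32) -> Dict[str, str]:
    """Create one independent URL-safe random value per secret variable."""
    return {name: secrets.token_urlsafe(nbytes) for name in SECRET_VARIABLES}


def to_env_lines(values: Dict[str, str]) -> str:
    """Render the values as KEY=value lines, for example for an env file."""
    return "\n".join(f"{name}={value}" for name, value in sorted(values.items()))


generated = generate_first_boot_secrets()
# Print only the names here; send the values straight to a secret manager.
print("Generated:", ", ".join(sorted(generated)))
```

Store the output of `to_env_lines(generated)` in your secret manager instead of a file in the repository, and keep a separate backup of `MSK_GRAPH_MASTER_KEY`, since encrypted content cannot be read without it.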
Post-install validation checklist
- Web UI loads at your workspace URL (using your `MSK_PUBLIC_ADDRESS` from a client machine).
- You can log in with an admin account and, if you configured one, through your IdP.
- TLS is correct end to end (browser shows the expected certificate; HSTS header present if enabled).
- Storage is persistent across restarts.
- Background tasks (indexing/parsing) can run.
- Backup runs successfully and the resulting snapshot restores in a separate environment.
- Logs reach your aggregator.
- Monitoring shows the workspace as healthy.
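The first and third items of this checklist (the UI loads, TLS is valid, HSTS is present) lend themselves to a scripted check from a client machine. This is a minimal sketch using only the Python standard library; the function names and the example URL are placeholders, not part of the product.

```python
import urllib.error
import urllib.request
from typing import Dict, List


def evaluate_response(status: int, headers: Dict[str, str], expect_hsts: bool) -> List[str]:
    """Inspect a response's status and headers; return a list of problems."""
    problems = []
    if status != 200:
        problems.append(f"unexpected HTTP status {status}")
    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower() for name in headers}
    if expect_hsts and "strict-transport-security" not in lowered:
        problems.append("Strict-Transport-Security header is missing")
    return problems


def check_workspace(url: str, expect_hsts: bool = True, timeout: float = 10.0) -> List[str]:
    """Fetch the workspace URL and evaluate the response.

    For https URLs, urlopen verifies the certificate chain and hostname by
    default, so an invalid certificate is reported as a failed request.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return evaluate_response(
                response.status, dict(response.headers.items()), expect_hsts
            )
    except (urllib.error.URLError, OSError) as exc:
        return [f"request failed: {exc}"]


# Placeholder URL: substitute the value of your MSK_PUBLIC_ADDRESS.
# print(check_workspace("https://workspace.example.com"))
```

Run it from a machine outside the deployment network path you want to validate, so the result reflects what real clients see. Login, persistence, backup restore, and log delivery still need their own checks.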
Common installation pitfalls
- Ephemeral storage: running with a non-persistent volume will lose data on restart.
- Reverse proxies and origins: when behind a proxy, set `MSK_PUBLIC_ADDRESS` consistently; otherwise generated links (SSO callbacks, email links) will be wrong.
- Ports and binding: confirm the service binds to the expected interface (`127.0.0.1` for local, `0.0.0.0` for a proxy-fronted deployment).
- `:latest` in production: pin to a versioned image tag (`curiosityai/curiosity:vX.Y.Z`) so upgrades are explicit.
- Missing master key: encrypted properties can't be read after a restart if `MSK_GRAPH_MASTER_KEY` was autogenerated and then lost.
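Several of these pitfalls can be caught by linting deployment settings before rollout. The sketch below assumes you can extract the image reference, bind address, and public address from your own manifests; `lint_settings` is a hypothetical helper, and its rules only mirror the guidance in the list above.

```python
from typing import List, Optional


def lint_settings(
    image: str,
    bind_address: str,
    behind_proxy: bool,
    public_address: Optional[str],
) -> List[str]:
    """Return warnings for the image tag, binding, and public address."""
    warnings = []

    # A tag is whatever follows the last ':' unless that part contains '/',
    # in which case the ':' belonged to a registry port (registry:5000/...).
    _, separator, tag = image.rpartition(":")
    if not separator or "/" in tag:
        warnings.append(f"image '{image}' has no tag and resolves to latest")
    elif tag == "latest":
        warnings.append(f"image '{image}' uses :latest; pin a versioned tag")

    # Generated links depend on the public address when a proxy is in front.
    if behind_proxy and not public_address:
        warnings.append("behind a proxy but MSK_PUBLIC_ADDRESS is not set")

    # Expected interface per the guidance above.
    expected_bind = "0.0.0.0" if behind_proxy else "127.0.0.1"
    if bind_address != expected_bind:
        warnings.append(f"binds to {bind_address}, expected {expected_bind}")

    return warnings


for warning in lint_settings(
    image="curiosityai/curiosity:latest",
    bind_address="0.0.0.0",
    behind_proxy=True,
    public_address=None,
):
    print("WARNING:", warning)
```

The example call prints two warnings, one for the `:latest` tag and one for the missing public address. The binding rule is deliberately strict; relax it if your proxy runs on the same host and reaches the service over loopback.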
Next steps
- Configure the workspace basics: Workspace Configuration
- Walk an end-to-end build: Build your first enterprise AI app
- Promote to production: Deployment checklist