Read-only replicas

A read-only replica is a second Curiosity Workspace process that follows a primary, applies its writes in near-real-time, and serves read traffic. It exists to offload search and graph-query traffic from the primary and to provide a warm standby for failover.

This page covers the architecture, the environment flags, how to bring a replica online, and the operational expectations. For multi-region disaster recovery and snapshot-based DR, see Backup and restore.

Licensed feature

Replicas require the Replicas feature flag on your license. Check Admin → License before planning a deployment. If the feature is disabled, the replica process will refuse to start.

How it works

The replica protocol has three phases:

Register. On startup, the replica generates a unique ReplicaUID and registers itself with the primary over HTTPS (the primary's normal API port). Authentication uses a shared MSK_JWT_KEY between primary and replica.
Catch up. The replica streams the primary's underlying RocksDB files over gRPC (TCP port 42999) into its own local storage. This is a one-shot file copy that happens once per replica lifetime.
Tail. The replica subscribes to the primary's write-ahead log and applies new batches sequentially, typically every 250 ms. It periodically reports the last applied sequence number back to the primary so lag can be observed.

There is no rsync of backup snapshots, no shared block storage, and no requirement that primary and replica share a filesystem. Every replica keeps its own complete copy of the data.
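As an illustration of the tail phase, here is a minimal in-memory sketch. It is not the workspace's implementation: `WalBatch`, `ReplicaTail`, `lag`, and the rule of refusing out-of-order batches are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class WalBatch:
    """Hypothetical WAL batch: a starting sequence number plus its writes."""
    first_seq: int
    writes: list  # list of (key, value) pairs


@dataclass
class ReplicaTail:
    """Toy model of the tail phase: apply batches strictly in sequence order."""
    last_applied_seq: int = 0  # the value a replica would report back
    store: dict = field(default_factory=dict)

    def apply(self, batch: WalBatch) -> None:
        expected = self.last_applied_seq + 1
        if batch.first_seq != expected:
            # Assumption: a gap means the replica must not skip ahead.
            raise ValueError(f"expected seq {expected}, got {batch.first_seq}")
        for key, value in batch.writes:
            self.store[key] = value
            self.last_applied_seq += 1


def lag(primary_seq: int, replica: ReplicaTail) -> int:
    """Lag in WAL sequence numbers, derived from the reported position."""
    return primary_seq - replica.last_applied_seq
```

The point of the sketch is that lag is simply the distance between the primary's latest sequence number and the last one the replica reported as applied.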

What replicas serve

Traffic	Where it goes
Search queries	Replica (round-robin) or primary.
Graph queries (`Q().StartAt(…).Emit()`)	Replica or primary.
Node reads (`GET /api/node/{uid}`)	Replica or primary.
Embedding queries (`/api/embeddings/*`)	Replica or primary.
Writes (any `AddOrUpdate`, `Link`, `Delete`)	Primary only. Replicas reject write requests.
Schema changes	Primary only.
Admin operations (license, SSO config, …)	Primary only.
File uploads	Primary only.
Endpoint execution	Replica if the endpoint is marked read-only; primary otherwise.
AI tool calls	Same as endpoints.

Endpoints can be marked read-only-safe in their code; the front-end and SDK call ReadOnlyCompatible() on those requests to allow routing to replicas. Requests that mutate the graph carry neither flag and go to the primary by default; ForcePrimary() pins a request to the primary explicitly.

Indexing on replicas

Replicas rebuild text and vector indexes locally as the WAL stream arrives — they do not copy index files from the primary. Two consequences:

A freshly-promoted replica is fully self-sufficient. Failing the primary over to it does not lose indexes.
The replica's CPU and disk-I/O cost is comparable to the primary's. Replicas are not "cheap read mirrors" — size them like primaries.

When the primary adds a new index, it pushes a LoadNewIndexOnReplica notification over gRPC; the replica picks it up and begins building locally. The same applies to index removal.

Front-end replication

If the primary is serving a custom front-end (an uploaded Tesserae bundle), replicas mirror it automatically; there is no separate step to push it to each replica. A replica fetches the primary's front-end folder once at boot, then again every time the primary's bundle changes: the primary's front-end hash replicates through the normal WAL stream, and a replica that observes a new hash re-syncs. Only new or changed files are re-fetched, except .html entry points, which are always re-fetched because they carry the injected loader script and hash.

Deploy with curiosity-cli upload-front-end (or UploadNewApplicationInterfaceAsync) against the primary, as described in Development workflow — replicas pick it up without a separate upload.

Environment flags

On the replica

Variable	Required	Value	Notes
`MSK_REPLICA`	yes	`true`	Switches the process into read-only mode.
`MSK_PRIMARY_ADDRESS`	yes	full URL of the primary, e.g. `https://workspace-primary.example.com`	Used for initial registration and WAL streaming.
`MSK_JWT_KEY`	yes	shared secret string	Must match the primary's `MSK_JWT_KEY` exactly. Used for inter-node auth.
`MSK_GRAPH_STORAGE`	yes	local path	Where the replica stores its copy of the data. Must be on local SSD/NVMe.
`MSK_PUBLIC_ADDRESS`	yes	public URL the replica is reachable at	Used in registration so the primary can call the replica back.
`MSK_SERVER_ADDRESS`	recommended	internal URL for primary→replica calls	Falls back to `MSK_PUBLIC_ADDRESS` if unset.
`MSK_GRAPH_MASTER_KEY`	if encryption is on	same value as primary	Replicas cannot read encrypted data without the same key.
`MSK_LICENSE`	yes	license that includes the `Replicas` feature	The replica process exits with `0xDEAD` once a license without the `Replicas` feature is loaded.
Other `MSK_*` settings	as needed	Provider keys, SSO, TLS, etc.	Most settings can be set on both. The replica reads configuration from the graph after the initial catch-up.

On the primary

The primary doesn't need a flag to enable replication — it accepts replica registrations as soon as one shows up with a valid JWT. What it does need:

Variable	Required	Notes
`MSK_JWT_KEY`	yes	Same value the replicas will present.
`MSK_PUBLIC_ADDRESS`	yes	Reachable from each replica.
`MSK_CORS`	if cross-origin	Semicolon-separated list of extra origins to allow. Add each replica's public origin here if the front-end served from the primary needs to make XHR calls to the replicas.

Make sure TCP port 42999 is open from each replica to the primary — that's the gRPC channel the WAL stream uses. It's internal-only; never expose it on the public internet.
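A quick way to confirm the port is open from a replica host is a plain TCP connect. The sketch below checks TCP reachability only; it does not speak gRPC or authenticate, and the helper name `can_reach` is made up for this example.

```python
import socket


def can_reach(host: str, port: int = 42999, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Run it on the replica host against the primary's internal hostname, for example `can_reach("workspace-primary.example.com")`. A `False` result usually points at a firewall or security-group rule rather than at the workspace itself.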

CORS

The replica auto-adds the primary's address to its own CORS allow-list as soon as MSK_PRIMARY_ADDRESS is set — no manual configuration needed for replica → primary traffic.

The primary does not dynamically add replica addresses to its CORS allow-list when they register. If you serve the front-end from the primary and need it to issue cross-origin XHR to the replicas (i.e. the replicas are on different hostnames), add the replica origins to the primary's CORS list with MSK_CORS (semicolon-separated, e.g. MSK_CORS="https://workspace-replica-a.example.com;https://workspace-replica-b.example.com").
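If the replica list is kept in a deployment script, the MSK_CORS value can be derived from it. A minimal sketch follows, assuming the value should contain bare origins (scheme, host, optional port) joined with semicolons as in the example above; `cors_value` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def cors_value(replica_urls: list[str]) -> str:
    """Join the origins (scheme://host[:port]) of the given URLs with ';'."""
    origins = []
    for url in replica_urls:
        parts = urlsplit(url)
        if not parts.scheme or not parts.netloc:
            raise ValueError(f"not an absolute URL: {url!r}")
        origin = f"{parts.scheme}://{parts.netloc}"
        if origin not in origins:  # drop duplicates, keep original order
            origins.append(origin)
    return ";".join(origins)
```

Stripping paths and trailing slashes matters because browsers compare origins, not full URLs.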

When a replica registers, the primary does two things: it re-renders the MSKREPLICAURLS JS global injected into the served HTML, and it broadcasts a REPLICAS_CHANGED WebSocket message so already-connected clients pick up the new URL list.

Bringing a replica online

flowchart LR
    A[Provision host] --> B[Install workspace]
    B --> C[Set MSK_REPLICA + MSK_PRIMARY_ADDRESS + MSK_JWT_KEY]
    C --> D[Start the process]
    D --> E[Replica registers with primary]
    E --> F[Initial catch-up over gRPC]
    F --> G[Tailing the WAL]
    G --> H[Replica serves reads]

Step-by-step

Provision a host

Size it like a primary: same CPU class, same RAM, NVMe storage. A replica that's smaller than the primary will fall behind under load.

Install the workspace binary

Use the same version as the primary. The replica does not currently enforce a version-skew check at startup, so it is the operator's responsibility to keep primary and replica builds in lockstep — mismatched RocksDB or schema versions can cause silent replication failures. See Upgrades and migrations.

Set the replica-specific env vars

export MSK_REPLICA=true
export MSK_PRIMARY_ADDRESS=https://workspace-primary.example.com
export MSK_JWT_KEY="$(cat /etc/curiosity/jwt-key)"   # same as primary
export MSK_GRAPH_STORAGE=/var/lib/curiosity-replica
export MSK_PUBLIC_ADDRESS=https://workspace-replica-a.example.com
export MSK_LICENSE="$(cat /etc/curiosity/license)"

If the primary uses graph encryption, also set MSK_GRAPH_MASTER_KEY to the same value.
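A small preflight check can catch a missing variable before the process starts. This is a sketch only: the required list mirrors the replica table on this page, while the function name `preflight` and the strict `true` comparison are assumptions.

```python
REQUIRED = (
    "MSK_REPLICA",
    "MSK_PRIMARY_ADDRESS",
    "MSK_JWT_KEY",
    "MSK_GRAPH_STORAGE",
    "MSK_PUBLIC_ADDRESS",
    "MSK_LICENSE",
)


def preflight(env: dict[str, str]) -> list[str]:
    """Return the problems found in a replica's environment (empty means OK)."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if env.get("MSK_REPLICA") and env["MSK_REPLICA"] != "true":
        # Assumption: only the literal value from the table is accepted.
        problems.append("MSK_REPLICA must be 'true'")
    address = env.get("MSK_PRIMARY_ADDRESS", "")
    if address and not address.startswith(("http://", "https://")):
        problems.append("MSK_PRIMARY_ADDRESS must be a full URL")
    return problems
```

Call it as `preflight(dict(os.environ))` in the same shell session that will start the service. It does not verify that the JWT key or master key actually match the primary's.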

Start the process

systemctl start curiosity-workspace

Tail the log. You should see, in order:

Registering with primary 'https://workspace-primary.example.com' with Replica ID …
Starting initial catching-up of replica with primary
Fetching file: <name> with <N> bytes (Reason: …)
Done catching up with primary in <N> seconds
Loading read-only graph from storage
Starting front-end sync with primary into <wwwroot-path>
Done front-end sync with primary in <N>ms: fetched <N> of <N> files
Registering with primary 'https://workspace-primary.example.com': public address: '…', internal address: '…'
Starting to replicate WAL...
Destination DB sequence number: <N>. Starting WAL sync...

If catch-up fails, the replica exits with 0xDEAD (decimal 57005) by design. On Linux, exit statuses are truncated to 8 bits, so the shell and systemd report it as 173 (0xAD). Restart after fixing the cause (network, JWT mismatch, version skew). The same exit code is used if the license loaded at startup does not include the Replicas feature.
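To see how far a replica got before it stopped, the expected log sequence can be checked mechanically. The sketch below matches on substrings of the log lines listed above; the exact wording may differ between versions, and `next_missing_milestone` is a hypothetical helper.

```python
MILESTONES = (
    "Registering with primary",
    "Starting initial catching-up of replica with primary",
    "Done catching up with primary",
    "Loading read-only graph from storage",
    "Starting front-end sync with primary",
    "Done front-end sync with primary",
    "Starting to replicate WAL",
    "Starting WAL sync",
)


def next_missing_milestone(log_lines):
    """Return the first milestone not yet seen in order, or None if all appeared."""
    remaining = list(MILESTONES)
    for line in log_lines:
        if remaining and remaining[0] in line:
            remaining.pop(0)
    return remaining[0] if remaining else None
```

Feeding it the captured log of a failed start names the phase to investigate; a result of `None` means the replica reached the WAL tail.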

Verify

GET /api/graph/replica is protected by system-admin authorization, so call it with an admin cookie / bearer token rather than an anonymous curl:

# On the replica
curl -H "Authorization: Bearer <admin-token>" https://workspace-replica-a.example.com/api/graph/replica
# {"isReplica":true,"hasReplicas":false,"replicaUID":"…","replicaName":"replica-…"}

# On the primary
curl -H "Authorization: Bearer <admin-token>" https://workspace-primary.example.com/api/graph/replica
# {"isReplica":false,"hasReplicas":true}

hasReplicas: true on the primary confirms it's tracking changes for at least one replica. replicaName on the replica is a deterministic friendly name derived from the ReplicaUID (e.g. replica-cheerful-orion).
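The two responses can also be compared in a script. A minimal sketch, assuming the JSON bodies have already been fetched with an admin token; `check_roles` is a made-up name and only the fields shown above are inspected.

```python
import json


def check_roles(primary_body: str, replica_body: str) -> list[str]:
    """Compare two /api/graph/replica JSON bodies and list anything unexpected."""
    primary = json.loads(primary_body)
    replica = json.loads(replica_body)
    issues = []
    if primary.get("isReplica"):
        issues.append("primary reports isReplica=true")
    if not primary.get("hasReplicas"):
        issues.append("primary is not tracking any replica")
    if not replica.get("isReplica"):
        issues.append("replica reports isReplica=false")
    if not replica.get("replicaUID"):
        issues.append("replica has no replicaUID")
    return issues
```

An empty list means both nodes report the roles expected after a successful registration.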

Routing reads to replicas

The workspace front-end (the in-browser app) reads a replica URL list and routes read-only-compatible API calls to it round-robin. Two ways to configure it:

Inline in the workspace settings

In Admin → Workspace Configuration, set the ReplicaURLs list. The setting propagates to clients on next page load.

Via the front-end global

Embed the list as a global JavaScript variable on the served page:

<script>
  window.MSKREPLICAURLS = "https://workspace-replica-a.example.com;https://workspace-replica-b.example.com";
</script>

The front-end picks this up at startup and splits on ;. Each entry is health-checked before use; failed replicas are retired and re-tested on a 5–30 minute window.
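The behaviour described here (split on `;`, rotate, skip retired entries) can be modelled in a few lines. This is an illustration, not the front-end's code: `ReplicaPool` is hypothetical, and the fixed 300-second retry delay simply takes the lower end of the 5–30 minute window as an assumption.

```python
import itertools


class ReplicaPool:
    """Round-robin over replica URLs, skipping ones currently retired."""

    def __init__(self, url_list: str, retry_after_s: float = 300.0):
        self.urls = [u.strip() for u in url_list.split(";") if u.strip()]
        self.retry_after_s = retry_after_s  # assumed re-test delay
        self._retired_at = {}               # url -> time it was retired
        self._cycle = itertools.cycle(self.urls)

    def retire(self, url: str, now: float) -> None:
        """Mark a replica as failed at time `now` (seconds)."""
        self._retired_at[url] = now

    def pick(self, now: float):
        """Return the next usable replica URL, or None if none is usable."""
        for _ in range(len(self.urls)):
            url = next(self._cycle)
            retired = self._retired_at.get(url)
            if retired is None or now - retired >= self.retry_after_s:
                self._retired_at.pop(url, None)  # eligible for a re-test
                return url
        return None
```

When `pick` returns `None`, a caller would fall back to the primary, which matches the routing rule for read-only-compatible requests.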

Behind a load balancer

If you put a load balancer in front of the workspace, configure two routes:

/api/** calls marked ReadOnlyCompatible() → primary + replicas, weighted however you want.
Everything else → primary only.

Most teams use the SDK / front-end replica list rather than a load balancer because the routing logic (and health checks) is already there.

Making code replica-aware

The routing layer only sends a request to a replica when it has been explicitly marked as read-only-safe. Anything else stays on the primary. Use the checklists below when writing or auditing code that should benefit from replicas.

The decision the front-end makes for every request is:

Flag on request	Where it goes
neither flag (default)	Primary. Never falls through to a replica.
`.ReadOnlyCompatible()`	A healthy replica if one is reachable; otherwise the primary.
`.ForcePrimary()`	Always the primary. If no primary is reachable, the request fails.
`.ReadOnlyCompatible()` + no primary online	Any reachable replica (so the app can keep reading during a primary outage).
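The table above reduces to a small decision function. The sketch below is a model of the documented rules rather than the router's source; `Flag` and `route` are hypothetical names, and treating an unflagged request as failed when the primary is down is an assumption.

```python
from enum import Enum


class Flag(Enum):
    DEFAULT = "default"
    READ_ONLY_COMPATIBLE = "read_only_compatible"
    FORCE_PRIMARY = "force_primary"


def route(flag: Flag, primary_up: bool, healthy_replicas: int) -> str:
    """Return 'replica', 'primary', or 'fail' for one request."""
    if flag is Flag.READ_ONLY_COMPATIBLE and healthy_replicas > 0:
        # Also covers the primary-outage case: reads keep working.
        return "replica"
    # Default and ForcePrimary requests never fall through to a replica.
    return "primary" if primary_up else "fail"
```

Note that only the read-only-compatible flag ever produces `"replica"`, which is why unmarked reads gain nothing from adding replicas.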

Front-end and SDK request audit

Find every REQ.New(...) call in your front-end / SDK code (e.g. grep -rn "REQ.New" FrontEnd/).
For each call that only reads data (search, node fetch, graph query, download, blob get, etc.), chain .ReadOnlyCompatible() before the verb:
    var doc = await REQ.New("node", uid, "pipeline")
        .WithObjectLiteralResponse()
        .ReadOnlyCompatible() // ← add this
        .GetAsync<Document>();
For each call that mutates data (AddOrUpdate, Link, Delete, schema changes, admin operations) leave the default — the request will route to the primary automatically.
For diagnostic / admin operations that must hit the primary even when a replica would technically accept them (e.g. stopandstore, dump-threads, license operations), chain .ForcePrimary():
    REQ.New("graph", "stopandstore").ForcePrimary().PostAsync().FireAndForget();
Never mix .ReadOnlyCompatible() and .ForcePrimary() on the same request — .ForcePrimary() wins, but the intent is unclear.
Make sure the call goes through REQ (the project's Request helper). Plain fetch(...) / XMLHttpRequest calls bypass the replica router and always hit whatever URL you hard-code.
If your app is served from the primary and you've added replicas on a different hostname, add each replica's origin to the primary's MSK_CORS so the browser can talk to them.

Code endpoints (server-side)

Code endpoints are the unit of routing on the server. A replica will refuse to execute any endpoint that is not marked read-only — it returns HTTP 303 See Other and the client retries against the primary.

Decide whether the endpoint can run entirely against a ReadOnlyGraph. If it writes to the graph in any way (AddOrUpdate, Link, Delete, schema mutations), it cannot be marked read-only.
In Admin → Endpoints, edit the endpoint and turn on the Read Only toggle. This switches the execution scope to ReadOnlyCodeEndpointExecutionScope, which rejects writes at compile time — if the code doesn't compile in that scope, the toggle stays off.
Equivalently, when creating an endpoint via the API, pass isReadOnly: true to POST /api/endpoints/create.
Make the front-end call mark the endpoint invocation as .ReadOnlyCompatible() so it can land on a replica. Without that flag the endpoint will run, but only on the primary.
Test the endpoint against a replica's URL directly (https://<replica>/api/<endpoint-path>) — a 303 response means the server-side ReadOnly flag is still off on the endpoint.
If a read-only endpoint silently degrades to the primary, check the front-end's health-check state — failed replicas are retried only every 5–30 minutes.
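When testing an endpoint directly against a replica, use a client that does not follow redirects, otherwise the 303 is hidden. Python's `http.client` never follows redirects, so it suits this check. In the sketch below the GET method and bearer-token header are assumptions about how your endpoint is called, and `classify` and `probe` are hypothetical helpers.

```python
import http.client
from urllib.parse import urlsplit


def classify(status: int) -> str:
    """Interpret the status a replica returned for an endpoint call."""
    if status == 303:
        return "read-only flag is off on the server"
    if 200 <= status < 300:
        return "ran on the replica"
    return "unexpected status"


def probe(url: str, token: str) -> str:
    """Call a replica endpoint over HTTPS without following redirects."""
    parts = urlsplit(url)
    conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    try:
        headers = {"Authorization": f"Bearer {token}"}
        conn.request("GET", parts.path or "/", headers=headers)
        return classify(conn.getresponse().status)
    finally:
        conn.close()
```

With curl, the equivalent is to leave out `-L` so the redirect is reported instead of followed.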

AI tool calls

AI tools that wrap a code endpoint inherit that endpoint's read-only flag — if the underlying endpoint is read-only, the tool can run on a replica via .ReadOnlyCompatible() on the invocation. Tool definitions themselves (/api/chatai/tools/*) are not currently flagged read-only in the API client, so administration of tools always lands on the primary.

For each custom AI tool you ship, follow the Code endpoints checklist above for the endpoint it wraps.
Don't rely on AI tools to do writes during a primary outage — the primary is the only writer, and a degraded primary will surface as tool-call failures.

Why "marked" and not "inferred"?

Replica routing is opt-in by design. The workspace never tries to guess whether an arbitrary endpoint or front-end call is safe to run on a replica — a wrong guess could silently return stale data or fail a write. ReadOnlyCompatible() (front-end) and the Read Only flag (server-side endpoints) are the two explicit switches that say "I know this is safe."

Health checks

Endpoint	Auth	Returns
`GET /health`	none	Aggregate health report (memory, disk, readiness).
`GET /health/replica`	none	Replica role plus, on the primary, per-replica lag in WAL versions.

/health/replica is unauthenticated (same as /health) so external monitoring can scrape it directly, and the response shape changes depending on the role:

# On the primary
curl https://workspace-primary.example.com/health/replica

{
  "isReplica": false,
  "hasReplicas": true,
  "currentLag": 42,
  "averageLag": 35.2,
  "emaLag": 33.1,
  "recommendedDelayMs": 200,
  "replicas": [
    { "replicaUID": "…", "replicaName": "replica-cheerful-orion", "lagVersions": 42 },
    { "replicaUID": "…", "replicaName": "replica-quiet-fern",     "lagVersions": 17 }
  ]
}

# On a replica
curl https://workspace-replica-a.example.com/health/replica

{
  "isReplica": true,
  "hasReplicas": false,
  "replicaUID": "…",
  "replicaName": "replica-cheerful-orion"
}

Field reference (primary response):

Field	Meaning
`currentLag`	Most recent cluster lag sample, measured in RocksDB WAL sequence numbers.
`averageLag`	Mean lag over the most recent reporting window.
`emaLag`	Exponential moving average of the lag — smoother than `currentLag` for alerting.
`recommendedDelayMs`	Adaptive commit delay the primary is currently applying to keep replicas within reach (ms).
`replicas[]`	Per-replica lag, keyed by `replicaUID`. `lagVersions` is the number of WAL entries the replica is behind.
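For readers unfamiliar with exponential moving averages, the sketch below shows the general technique. The smoothing factor the workspace actually uses is not stated on this page, so `alpha=0.2` is purely an assumption, as is the function name `ema`.

```python
def ema(samples, alpha=0.2):
    """Exponential moving average of a sequence; None for an empty input.

    Each new sample contributes `alpha` of the result and the previous
    average contributes the rest, so a single spike is damped.
    """
    value = None
    for sample in samples:
        if value is None:
            value = float(sample)
        else:
            value = alpha * sample + (1 - alpha) * value
    return value
```

A lag series that sits at 30 and spikes once to 300 moves the average far less than the raw sample does, which is why `emaLag` is the better alerting signal.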

currentLag / averageLag / emaLag only update as replicas check in (every WAL batch, typically every 250 ms). A replica that has stopped checking in keeps its last reported lagVersions until it reconnects.

If replication is not configured on the primary, the lag fields and replicas array are omitted — only isReplica and hasReplicas are returned.

A simple "is the replica caught up?" check is to compare node counts between primary and replica. For stricter monitoring, scrape /health/replica on the primary and alert when emaLag exceeds your tolerance, or when a specific replicaUID disappears from replicas[].
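The stricter check could look like the sketch below, which evaluates a primary's /health/replica body that has already been fetched. The threshold and the set of expected replica UIDs are values you supply; `lag_alerts` is a hypothetical helper, not part of the product.

```python
import json


def lag_alerts(body: str, expected_uids: set[str], max_ema_lag: float) -> list[str]:
    """Evaluate a primary's /health/replica JSON body and return alert messages."""
    report = json.loads(body)
    if report.get("isReplica"):
        return ["endpoint belongs to a replica, not the primary"]
    alerts = []
    # Lag fields are omitted when replication is not configured.
    if report.get("emaLag", 0) > max_ema_lag:
        alerts.append(f"emaLag {report['emaLag']} exceeds {max_ema_lag}")
    seen = {r["replicaUID"] for r in report.get("replicas", [])}
    for uid in sorted(expected_uids - seen):
        alerts.append(f"replica {uid} missing from replicas[]")
    return alerts
```

Because a disconnected replica keeps its last reported `lagVersions`, the missing-UID check and the lag threshold complement each other rather than overlap.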

Lag, errors, and failure modes

Symptom	Likely cause
Replica refuses to start with `0xDEAD` immediately	JWT mismatch, primary unreachable, license missing the `Replicas` feature, or initial catch-up failure. Check the log line before exit.
Replica catches up but lag grows over time	Replica is underpowered. Match the primary's hardware spec.
Replica returns stale results	Expected within the ~250 ms tail interval; under high write load, the lag can grow.
Replica restarts unexpectedly	Process or host crash. The primary detects this and re-initiates the catch-up flow.
`hasReplicas: false` on the primary despite a running replica	Replica isn't registered. Inspect the replica's startup log for registration errors.
Write requests succeed against a replica	They shouldn't: replicas reject writes at the HTTP layer. If a write succeeds, the request is reaching the primary by accident.

Operational recommendations

Minimum two replicas in production. Single-replica setups give no read-load headroom during a replica failure.
Geographic placement. Put replicas in the same network zone as the read traffic they serve. Cross-region replication works but adds latency to every WAL batch.
Same hardware tier as primary. Replicas rebuild indexes locally; underpowered replicas fall behind silently.
JWT rotation. When rotating MSK_JWT_KEY, do it in a maintenance window. The current implementation does not support overlapping keys; replicas must restart after the rotation.
Don't replicate to dev/staging. Replicas are for production read scaling, not for environment promotion. Use Workspace export/import for that.
Backup is still your responsibility. Replicas are not backups. A logical corruption on the primary replicates to every replica within milliseconds. Maintain real backups per Backup and restore.

Failover

Curiosity does not yet ship an automatic primary→replica promotion mechanism. If the primary fails:

Pick the most up-to-date replica (highest applied WAL sequence).
Stop it.
Restart it without MSK_REPLICA and without MSK_PRIMARY_ADDRESS. It boots as a normal primary.
Update DNS / load-balancer config to point clients at the new primary.
Tear down the old primary (or, after repair, re-introduce it as a replica).

Plan and rehearse this procedure. The first time you do it should be in a fire drill, not in a real outage.
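Choosing the promotion candidate can be scripted if your monitoring keeps the last /health/replica snapshot from before the outage. The sketch below assumes such a saved `replicas[]` list exists; `pick_promotion_candidate` is a hypothetical helper, and the snapshot may be stale for any replica that had stopped checking in.

```python
def pick_promotion_candidate(replicas: list[dict]) -> dict:
    """Return the replica entry with the smallest reported lagVersions."""
    if not replicas:
        raise ValueError("no replicas in the snapshot")
    # Lowest lag means the highest applied WAL sequence at snapshot time.
    return min(replicas, key=lambda r: r["lagVersions"])
```

Treat the result as a starting point and confirm on the chosen host that the replica was still tailing the WAL when the primary went down.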

Cost model

Resource	Per-replica cost vs primary
CPU	70–100% of primary (replicas rebuild indexes; the workload is similar).
RAM	70–100% of primary.
Disk	≈ 100% of primary (full copy of the data plus indexes).
Network	Outbound from primary scales with write volume × N replicas.
Embedding / LLM API calls	Zero — replicas do not call external providers; they receive embeddings via the WAL.
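The network row is the one that grows with replica count, so it is worth a back-of-the-envelope estimate. The sketch below just applies the "write volume × N replicas" rule from the table; it ignores the one-time catch-up copy and protocol overhead, and `primary_egress` is a made-up name.

```python
def primary_egress(write_rate: float, replicas: int) -> float:
    """Steady-state replication egress from the primary.

    The result is in the same unit as write_rate (for example MB/s).
    """
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return write_rate * replicas
```

For example, a primary writing 5 MB/s of WAL with three replicas sends roughly 15 MB/s of replication traffic.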

Where to go next

Backup and restore — replicas are not a substitute for backups.
Scaling — when to add replicas vs. scale the primary vertically.
Upgrades and migrations — version compatibility and rolling upgrades.
Configuration reference — full MSK_* flag list.
Monitoring — replica metrics in the built-in dashboards.
Custom front-ends — front-end bundles replicate to every replica automatically.