# Troubleshooting
Connector failure modes are remarkably consistent: most production incidents map to one of the rows below.
| Symptom | Likely cause | Fix |
|---|---|---|
| Duplicate nodes across runs | `[Key]` is non-deterministic (GUIDs, timestamps, auto-increments). | Switch to a source-derived key or a stable hash. See Idempotency. |
| Linked target "not found" until later in the run | Edge created via `Node.FromKey(...)` before the target node was upserted. | Expected behavior: edges become active once both endpoints exist. Just commit before the run ends. |
| Property updates not visible after re-run | Code uses `TryAdd` for a mutable record. | Use `AddOrUpdate` for anything that can change. |
| Connector hangs partway through | Long-running source request with no client-side timeout. | Wrap fetches in `WithCancellation(token)` and set `HttpClient.Timeout`. |
| Auth: 401 Unauthorized | Token expired, wrong workspace, or wrong scope. | Re-mint under API integrations; check the `CURIOSITY_TOKEN` env var for trailing whitespace. |
| Auth: 403 Forbidden | Token is valid but lacks the required scope (e.g., admin is needed for schema bootstrap). | Mint a new token with the missing scope. |
| Connection refused | Wrong endpoint URL, workspace down, or proxy in the way. | Confirm `curl` against `/api/login/check` succeeds from the connector host. |
| CORS errors from a browser-based connector | `MSK_CORS` doesn't include the caller's origin. | Add the origin to `MSK_CORS` and restart the workspace. |
| Memory grows unbounded | Pending nodes are never committed. | Set `graph.SetAutoCommitCost(everyNodes: 10_000)` and call `CommitPendingAsync()` at the end. |
| Stale search results after a backfill | `PauseIndexing` called without `ResumeIndexing` (e.g., a crashed run). | Wrap in try/finally. Trigger a manual reindex from Settings → Search Index. |
| ACLs not enforced for some users | Connector forgot to mirror source restrictions. | Add `RestrictAccessToTeam` / `RestrictAccessToUser` in the mapping function. |
| Two workers racing to write the same node | Parallel ingestion with overlapping shards. | Partition source data by stable key range so each worker owns disjoint keys. |
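Several of the fixes above compose into one run skeleton. The following is a sketch, not a full connector: it assumes a `graph` handle exposing the `SetAutoCommitCost`, `PauseIndexing`, `ResumeIndexing`, and `CommitPendingAsync` members named in the table, while `FetchBatchesAsync` and `MapAndUpsert` are hypothetical placeholders for your source fetch and mapping code.

```csharp
// Sketch only: "graph" is assumed to expose the members named in the table above;
// FetchBatchesAsync and MapAndUpsert are hypothetical placeholders.
var http = new HttpClient { Timeout = TimeSpan.FromSeconds(100) }; // client-side timeout: no silent hangs

graph.SetAutoCommitCost(everyNodes: 10_000);    // bound memory during the backfill
graph.PauseIndexing();                          // bulk-load without per-node reindexing
try
{
    await foreach (var batch in FetchBatchesAsync(http).WithCancellation(token))
        MapAndUpsert(graph, batch);

    await graph.CommitPendingAsync();           // flush the tail before the run ends
}
finally
{
    graph.ResumeIndexing();                     // always resume, even if the run crashed
}
```

The try/finally is the important part: it is what prevents the "stale search results after a backfill" row from ever applying to your connector.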
## Diagnostic queries
When a run looks wrong, these queries (run from Management → Shell) pinpoint the symptom quickly:
```csharp
// Count by type
return Q().EmitSummary();

// Orphan nodes: no edges
return Q().StartAt("SupportCase")
          .Where(c => c.Edges().Count() == 0)
          .Take(10)
          .Emit("N");

// Duplicates: two cases with the same Summary (suspicious)
return Q().StartAt("SupportCase")
          .Emit("N", new[] { "Id", "Summary" });
// ... export to CSV and group in your tool of choice.
```
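Once the `Id`/`Summary` pairs are exported, the grouping step is ordinary code. A minimal sketch with hypothetical sample rows; any summary appearing more than once is a duplicate suspect:

```csharp
using System;
using System.Linq;

class DuplicateCheck
{
    static void Main()
    {
        // Hypothetical exported rows: (Id, Summary) pairs from the query above.
        var rows = new[]
        {
            (Id: "c1", Summary: "Printer jams"),
            (Id: "c2", Summary: "VPN drops"),
            (Id: "c3", Summary: "Printer jams"),
        };

        // Group by Summary; anything with count > 1 is a duplicate suspect.
        var suspects = rows.GroupBy(r => r.Summary)
                           .Where(g => g.Count() > 1)
                           .Select(g => $"{g.Key} x{g.Count()}");

        foreach (var s in suspects)
            Console.WriteLine(s); // prints "Printer jams x2"
    }
}
```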
## Reading the connector log
Stdout from a connector run is the first place to look. Useful events to log explicitly:
- Cursor start (the `lastSync` value).
- Per-batch summary (records fetched, mapped, committed).
- Cursor end + checkpoint write.
- Any unexpected status from the source.
For long-running connectors deployed as scheduled tasks, the logs flow through Management → Logs.
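The per-batch summary works best as a single greppable line. A minimal sketch; the field names are illustrative, not a required format:

```csharp
using System;

static class ConnectorLog
{
    // One line per batch keeps run history easy to grep and diff across runs.
    public static void BatchSummary(int batch, int fetched, int mapped, int committed) =>
        Console.WriteLine(
            $"[{DateTime.UtcNow:O}] batch={batch} fetched={fetched} mapped={mapped} committed={committed}");
}
```

A call like `ConnectorLog.BatchSummary(3, 500, 498, 498)` makes a silent drop visible immediately: `fetched` and `mapped` diverging points at the mapping function, `mapped` and `committed` diverging points at commit behavior.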
## When to escalate
- Workspace returns 500 on `CreateNodeSchemaAsync`: check workspace logs; this usually means the schema collides with an existing one (different attribute set).
- Connector seems to succeed but the data doesn't appear in search: check Settings → Search Index for "rebuild required" and run it once.
## Cross-links
- Idempotency — the cause of most "duplicates" reports.
- Performance — auto-commit, pause indexing, parallel ingestion.
- Access control — for ACL leak / missing-restriction symptoms.
- Custom connector from scratch — full reference loop with checkpointing.