Ingesting Data
Ingestion is the loop that maps source records into graph operations. Three building blocks (TryAdd, AddOrUpdate, and Link) cover every shape of source data; the rest is pacing and observability.
For the end-to-end pattern with checkpointing, see Custom connector from scratch. For high-volume tuning, see Performance.
Connecting and bootstrapping
using var graph = Graph.Connect(
    endpoint: Environment.GetEnvironmentVariable("CURIOSITY_ENDPOINT")!,
    token: Environment.GetEnvironmentVariable("CURIOSITY_TOKEN")!,
    connectorName: "my-connector");
await graph.CreateNodeSchemaAsync<Device>();
await graph.CreateNodeSchemaAsync<SupportCase>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));
Graph.Connect returns an IGraph you keep open for the duration of the run. The connectorName is what shows up in Settings → Tasks so operators can tell sources apart.
Adding nodes
| Method | If node exists | If node missing | Use when |
|---|---|---|---|
| TryAdd | Returns existing node; properties unchanged | Creates new node | Reference data that doesn't change (manufacturers, statuses). |
| AddOrUpdate | Updates properties | Creates new node | Source data that mutates over time (cases, articles). |
| Update | Updates properties | Throws / returns null | When you expect the node to exist and want a hard error otherwise. |
// Stable reference data — TryAdd is cheap.
graph.TryAdd(new Device { Name = "MacBook Pro 14" });
// Mutable source — AddOrUpdate keeps the graph in sync.
var caseNode = graph.AddOrUpdate(new SupportCase
{
    Id = src.ReferenceNumber,
    Summary = src.Summary,
    Content = src.Content,
    Time = src.OpenedAt,
});
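To make the three write semantics in the table concrete, here is a minimal sketch that models them on a plain dictionary. Everything in it is an assumption for illustration: the `store` dictionary, the local functions, and the `bool` return of `Update` are toy stand-ins and say nothing about how the library implements these calls.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory stand-in for the graph: node key -> a single property value.
// This is an illustration only, not the Curiosity API or its implementation.
var store = new Dictionary<string, string>();

// TryAdd: create when missing, leave an existing node untouched.
string TryAdd(string key, string value)
{
    if (!store.ContainsKey(key)) store[key] = value;
    return store[key];
}

// AddOrUpdate: create when missing, overwrite when present.
string AddOrUpdate(string key, string value)
{
    store[key] = value;
    return value;
}

// Update: overwrite only when present. The toy reports a missing node with
// false; per the table, the real call throws or returns null instead.
bool Update(string key, string value)
{
    if (!store.ContainsKey(key)) return false;
    store[key] = value;
    return true;
}

var first = TryAdd("case-1", "open");        // created
var second = TryAdd("case-1", "closed");     // already exists, stays "open"
var third = AddOrUpdate("case-1", "closed"); // overwritten
var updated = Update("case-2", "open");      // no such node

Console.WriteLine($"{first} {second} {third} {updated}");
// prints "open open closed False"
```

The second TryAdd is the case to watch in a real connector: if the source value has changed, TryAdd silently keeps the old one, which is why mutable records belong with AddOrUpdate.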
Linking nodes
graph.Link(from, to, edge, reverse) creates a bi-directional edge.
foreach (var part in source.Parts)
{
    var partNode = graph.TryAdd(new Part { Name = part.Name });
    foreach (var deviceName in part.UsedIn)
    {
        graph.Link(
            partNode,
            Node.FromKey(nameof(Device), deviceName), // no fetch needed
            Edges.PartOf,
            Edges.HasPart);
    }
}
Two patterns to know:
- Node.FromKey(type, key): link to a node by its key without fetching it first. The edge is stored by key and becomes active once both nodes exist. Saves a round-trip per link.
- Bi-directional edges: pass both names so traversal works from either end without re-walking. Curiosity stores it as one edge; the names are just labels for direction.
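The two patterns can be pictured with a small toy model: one stored edge record carrying two labels, which only shows up in traversal once both endpoints exist. The `nodes` set, the `edges` list, the `Outgoing` helper, and the `Type:Key` string format are all hypothetical and chosen for illustration; they are not the library's storage format.

```csharp
using System;
using System.Collections.Generic;

// Toy model: nodes are keys, and each edge is ONE record with two labels.
// Illustration only; this is not how Curiosity stores edges internally.
var nodes = new HashSet<string>();
var edges = new List<(string From, string To, string Forward, string Reverse)>();

// List the labelled edges leaving `key`, reading each record in both
// directions and skipping edges whose endpoints do not both exist yet.
List<string> Outgoing(string key)
{
    var result = new List<string>();
    foreach (var e in edges)
    {
        if (!nodes.Contains(e.From) || !nodes.Contains(e.To)) continue; // dormant
        if (e.From == key) result.Add($"{e.Forward}->{e.To}");
        if (e.To == key) result.Add($"{e.Reverse}->{e.From}");
    }
    return result;
}

nodes.Add("Part:Battery");
// Link by key before the device node has been created.
edges.Add(("Part:Battery", "Device:MacBook Pro 14", "PartOf", "HasPart"));
var before = Outgoing("Part:Battery").Count; // edge is stored but dormant

nodes.Add("Device:MacBook Pro 14");          // the device arrives later
var fromPart = Outgoing("Part:Battery");
var fromDevice = Outgoing("Device:MacBook Pro 14");

Console.WriteLine($"{before} | {fromPart[0]} | {fromDevice[0]}");
// prints "0 | PartOf->Device:MacBook Pro 14 | HasPart->Part:Battery"
```

In this model the order in which nodes and links arrive does not matter, which is the property that lets a connector link to a key it has not ingested yet.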
Committing
Adds and links are batched in memory. Commit them before exiting:
await graph.CommitPendingAsync();
For long-running loops, set an auto-commit threshold so the buffer doesn't grow unbounded:
graph.SetAutoCommitCost(everyNodes: 10_000);
After this, the library auto-commits whenever the pending count exceeds the threshold. You should still call CommitPendingAsync() at the end of the run to flush.
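The need for a final flush is easy to see in a toy model of a thresholded buffer. The sketch below is an assumption-laden illustration, not the library's code: `threshold`, `pending`, `commits`, `Add`, and `Commit` are hypothetical names, and the "commit when the count exceeds the threshold" rule is taken from the description above.

```csharp
using System;
using System.Collections.Generic;

const int threshold = 3;           // hypothetical, stands in for everyNodes
var pending = new List<string>();  // operations buffered in memory
var commits = new List<int>();     // size of each committed batch

// Flush whatever is buffered; a no-op when the buffer is empty.
void Commit()
{
    if (pending.Count == 0) return;
    commits.Add(pending.Count);
    pending.Clear();
}

// Buffer one operation and auto-commit once the count exceeds the threshold.
void Add(string op)
{
    pending.Add(op);
    if (pending.Count > threshold) Commit();
}

for (var i = 0; i < 10; i++) Add($"node-{i}");
var leftover = pending.Count; // 2 operations never crossed the threshold
Commit();                     // the final flush picks them up

Console.WriteLine($"{string.Join(",", commits)} (leftover before flush: {leftover})");
// prints "4,4,2 (leftover before flush: 2)"
```

Without the last Commit() call, the two trailing operations would stay in memory and be lost when the process exits, which is the situation the end-of-run CommitPendingAsync() guards against.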
Observability
A few lines of plain logging save a lot of debugging:
var added = 0;
foreach (var record in batch)
{
    Map(graph, record);
    added++;
}
await graph.CommitPendingAsync();
Console.WriteLine($"[{DateTimeOffset.UtcNow:HH:mm:ss}] committed {added} records (cursor: {lastSync:o})");
For production connectors, ship logs to your aggregator and write a heartbeat node to the graph so operators can dashboard freshness.
Cross-links
- Schemas — defining the types you're writing.
- Access control — mirroring source ACLs at ingest.
- Idempotency — making re-runs safe.
- Performance — auto-commit, pause indexing, batch sizing.
- Custom connector from scratch — end-to-end walkthrough.