Ingesting Data

Ingestion is the loop that maps source records into graph operations. The three building blocks — TryAdd, AddOrUpdate, and Link — cover every shape; the rest is pacing and observability.

For the end-to-end pattern with checkpointing, see Custom connector from scratch. For high-volume tuning, see Performance.

Connecting and bootstrapping

using var graph = Graph.Connect(
    endpoint:      Environment.GetEnvironmentVariable("CURIOSITY_ENDPOINT")!,
    token:         Environment.GetEnvironmentVariable("CURIOSITY_TOKEN")!,
    connectorName: "my-connector");

await graph.CreateNodeSchemaAsync<Device>();
await graph.CreateNodeSchemaAsync<SupportCase>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));

Graph.Connect returns an IGraph you keep open for the duration of the run. The connectorName is what shows up in Settings → Tasks so operators can tell sources apart.

Adding nodes

Method	If node exists	If node missing	Use when
`TryAdd`	Returns existing node; properties unchanged	Creates new node	Reference data that doesn't change (manufacturers, statuses).
`AddOrUpdate`	Updates properties	Creates new node	Source data that mutates over time (cases, articles).
`Update`	Updates properties	Throws / returns null	When you expect the node to exist and want a hard error otherwise.

// Stable reference data — TryAdd is cheap.
graph.TryAdd(new Device { Name = "MacBook Pro 14" });

// Mutable source — AddOrUpdate keeps the graph in sync.
var caseNode = graph.AddOrUpdate(new SupportCase
{
    Id      = src.ReferenceNumber,
    Summary = src.Summary,
    Content = src.Content,
    Time    = src.OpenedAt,
});
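The table's three behaviors can be made concrete with a small in-memory model. This is a sketch only, not the Curiosity API: UpsertModel is a hypothetical class, a plain dictionary stands in for the graph, and a single string stands in for a node's properties. The table leaves open whether Update throws or returns null on a missing node; this sketch assumes it throws.

```csharp
using System;
using System.Collections.Generic;

// Conceptual model only: key -> properties. Not the Curiosity library.
public sealed class UpsertModel
{
    private readonly Dictionary<string, string> _nodes = new();

    // TryAdd: create if missing; otherwise return the stored value untouched.
    public string TryAdd(string key, string props)
    {
        if (_nodes.TryGetValue(key, out var existing)) return existing;
        _nodes[key] = props;
        return props;
    }

    // AddOrUpdate: create if missing; otherwise overwrite the stored value.
    public string AddOrUpdate(string key, string props)
    {
        _nodes[key] = props;
        return props;
    }

    // Update: overwrite only if present. Assumption of this sketch: a missing
    // key is a hard error (the real method may return null instead).
    public string Update(string key, string props)
    {
        if (!_nodes.ContainsKey(key))
            throw new KeyNotFoundException($"No node with key '{key}'.");
        _nodes[key] = props;
        return props;
    }
}
```

With this model, TryAdd("case-1", "v1") followed by TryAdd("case-1", "v2") still yields "v1", while AddOrUpdate("case-1", "v2") replaces it. That difference is why TryAdd suits reference data and AddOrUpdate suits records that change at the source.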

Linking nodes

graph.Link(from, to, edge, reverse) creates a bi-directional edge: edge names the direction from the first node to the second, and reverse names the direction back.

foreach (var part in source.Parts)
{
    var partNode = graph.TryAdd(new Part { Name = part.Name });

    foreach (var deviceName in part.UsedIn)
    {
        graph.Link(
            partNode,
            Node.FromKey(nameof(Device), deviceName),  // no fetch needed
            Edges.PartOf,
            Edges.HasPart);
    }
}

Two patterns to know:

Node.FromKey(type, key) — link to a node by its key without fetching it first. The edge is stored by key and becomes active once both nodes exist. Saves a round-trip per link.
Bi-directional edges — pass both names so traversal works from either end without re-walking. Curiosity stores it as one edge; the names are just labels for direction.
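Both patterns can be illustrated with a toy model. The sketch below is not how Curiosity is implemented; NodeKey, EdgeRecord, and KeyedGraphModel are hypothetical names. It only mirrors the two behaviors described above: an edge recorded by key stays inactive until both endpoints exist, and a single stored edge answers traversal from either end under the matching name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; not Curiosity types.
public readonly record struct NodeKey(string Type, string Key);
public readonly record struct EdgeRecord(NodeKey From, NodeKey To, string Name, string ReverseName);

public sealed class KeyedGraphModel
{
    private readonly HashSet<NodeKey> _nodes = new();
    private readonly List<EdgeRecord> _edges = new();

    public void AddNode(NodeKey node) => _nodes.Add(node);

    // Records the edge by key only; neither endpoint has to exist yet.
    public void Link(NodeKey from, NodeKey to, string name, string reverseName) =>
        _edges.Add(new EdgeRecord(from, to, name, reverseName));

    // An edge counts as active only once both of its endpoints exist.
    public IEnumerable<EdgeRecord> ActiveEdges() =>
        _edges.Where(e => _nodes.Contains(e.From) && _nodes.Contains(e.To));

    // Walks the single stored edge from either end, using the name
    // that applies in that direction.
    public IEnumerable<(string Label, NodeKey Other)> Neighbors(NodeKey node)
    {
        foreach (var e in ActiveEdges())
        {
            if (e.From == node) yield return (e.Name, e.To);
            else if (e.To == node) yield return (e.ReverseName, e.From);
        }
    }
}
```

In this model, linking a Part to a Device key before the Device has been added leaves ActiveEdges() empty. Once the Device is added, the same stored edge shows up as PartOf when walked from the part and as HasPart when walked from the device.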

Committing

Adds and links are batched in memory. Commit them before exiting:

await graph.CommitPendingAsync();

For long-running loops, set an auto-commit threshold so the buffer doesn't grow unbounded:

graph.SetAutoCommitCost(everyNodes: 10_000);

After this, the library auto-commits whenever the pending count exceeds the threshold. You should still call CommitPendingAsync() at the end of the run to flush whatever is left below the threshold.
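To see why the final flush is still needed, here is a minimal model of a threshold-triggered buffer. It is a sketch, not the library's implementation: ThresholdBuffer and ThresholdBufferDemo are hypothetical names, the commit is a plain callback, and the exact boundary (this sketch flushes when the count reaches the threshold) may differ from what the library does.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a pending-operations buffer; not Curiosity code.
public sealed class ThresholdBuffer<T>
{
    private readonly List<T> _pending = new();
    private readonly int _threshold;
    private readonly Action<IReadOnlyList<T>> _commit;

    public ThresholdBuffer(int threshold, Action<IReadOnlyList<T>> commit)
    {
        if (threshold <= 0) throw new ArgumentOutOfRangeException(nameof(threshold));
        _threshold = threshold;
        _commit = commit ?? throw new ArgumentNullException(nameof(commit));
    }

    public int PendingCount => _pending.Count;

    // Queue one operation; flush automatically once the threshold is reached.
    public void Add(T item)
    {
        _pending.Add(item);
        if (_pending.Count >= _threshold) Flush();
    }

    // Commit whatever is pending, even if it is below the threshold.
    public void Flush()
    {
        if (_pending.Count == 0) return;
        _commit(_pending.ToArray());
        _pending.Clear();
    }
}

public static class ThresholdBufferDemo
{
    // Queues `total` items and returns the size of every commit, in order.
    public static List<int> Run(int total, int threshold)
    {
        var commitSizes = new List<int>();
        var buffer = new ThresholdBuffer<int>(threshold, batch => commitSizes.Add(batch.Count));
        for (var i = 0; i < total; i++) buffer.Add(i);
        buffer.Flush(); // the final flush picks up the remainder
        return commitSizes;
    }
}
```

ThresholdBufferDemo.Run(25, 10) produces commits of 10, 10, and 5 items. Without the last Flush() call, those final 5 items would never be committed, which is the situation the end-of-run CommitPendingAsync() guards against.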

Observability

A few lines of plain logging save a lot of debugging:

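var added = 0;
foreach (var record in batch)
{
    Map(graph, record);   // your mapping code: the TryAdd / AddOrUpdate / Link calls for one record
    added++;
}

await graph.CommitPendingAsync();
// lastSync is the sync cursor kept by the surrounding loop (assumed to be a timestamp);
// the "o" format prints it in round-trip ISO 8601 form.
Console.WriteLine($"[{DateTimeOffset.UtcNow:HH:mm:ss}] committed {added} records (cursor: {lastSync:o})");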

For production connectors, ship logs to your aggregator and write a heartbeat node to the graph so operators can monitor data freshness on a dashboard.

Cross-links

Schemas — defining the types you're writing.
Access control — mirroring source ACLs at ingest.
Idempotency — making re-runs safe.
Performance — auto-commit, pause indexing, batch sizing.
Custom connector from scratch — end-to-end walkthrough.