# Connectors
Connectors are the most flexible way to ingest data into Curiosity Workspace. A connector is a program (or integration component) that:
- reads from a source system (files, databases, APIs, event streams)
- maps source records into node schemas and edge schemas
- commits changes into the workspace graph
## Why connectors matter
Connectors are where you encode the “truth” of how your source systems map into your graph model:
- stable keys and deduplication
- relationship creation
- incremental updates and deletes
- enrichment (aliases, normalization, derived fields)
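Stable keys are the foundation for the rest of the list. As a minimal sketch, a connector can derive one deterministic key per source record, so re-ingesting the same record always maps to the same node. The `system:id` format and the normalization rules here are illustrative conventions, not a Curiosity API:

```csharp
public static class StableKeys
{
    // Derive a deterministic key from the source system name and the
    // record's external identifier. Normalizing case and whitespace
    // prevents "ERP:42" and "erp: 42" from becoming two nodes.
    public static string ForRecord(string sourceSystem, string externalId) =>
        $"{sourceSystem.Trim().ToLowerInvariant()}:{externalId.Trim()}";
}
```

Keys derived from mutable fields (display names, titles) break as soon as the source edits them; prefer the source system's own immutable identifier.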
## Minimal connector mapping example (C#)
The demo repository illustrates a common ingestion structure: define schemas, upsert nodes, link edges, then commit.
```csharp
// Define schemas once (or validate they exist)
await graph.CreateNodeSchemaAsync<Device>();
await graph.CreateNodeSchemaAsync<Part>();
await graph.CreateEdgeSchemaAsync(typeof(Edges));

// Upsert nodes by key
var deviceNode = graph.TryAdd(new Device { Name = "iPhone 14 Pro Max" });
var partNode = graph.TryAdd(new Part { Name = "Loudspeaker" });

// Link nodes with an edge (and optionally the inverse edge name)
graph.Link(deviceNode, partNode, Edges.HasPart, Edges.PartOf);

await graph.CommitPendingAsync();
```
This pattern scales well when you add batching, incremental cursors, and observability.
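Batching itself needs nothing from the workspace API. As an illustrative sketch, a generic helper can split any record stream into commit-sized chunks before each commit call (the extension-method shape and the batch size are assumptions, not part of the demo):

```csharp
using System.Collections.Generic;

public static class Batching
{
    // Split a record stream into fixed-size batches so each commit
    // stays small and a failure only re-processes one batch.
    public static IEnumerable<List<T>> InBatchesOf<T>(this IEnumerable<T> source, int size)
    {
        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        if (batch.Count > 0)
            yield return batch; // flush the final partial batch
    }
}
```

A typical loop would then upsert one batch, commit, advance the cursor, and repeat.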
## Connector responsibilities
At minimum, a connector should:
- create/update schemas (or validate schemas exist)
- upsert nodes by stable keys
- create/update edges between nodes
- commit in batches and handle retries
- log ingestion progress and failures
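The retry responsibility can be sketched independently of any workspace API. This illustrative helper retries a failing async operation with exponential backoff; the attempt count and base delay are placeholder defaults, not recommended values:

```csharp
using System;
using System.Threading.Tasks;

public static class Retries
{
    // Run a commit-like operation, retrying transient failures with
    // exponential backoff (200ms, 400ms, 800ms, ...). The final
    // attempt rethrows so the caller can log and halt the run.
    public static async Task RetryAsync(Func<Task> operation, int maxAttempts = 3, int baseDelayMs = 200)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return;
            }
            catch when (attempt < maxAttempts)
            {
                await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}
```

In practice you would also distinguish transient errors (timeouts, rate limits) from permanent ones (schema mismatches), which should fail fast.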
## Designing connector-friendly schemas
Good connector design starts with good schema design:
- each node type has a stable key
- relationships are explicit edges
- large text fields are properties (later indexed for search/embeddings)
See Schema Design.
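As a sketch of these principles, a connector-friendly node type keeps the stable key separate from mutable display fields and models large text as an ordinary property. The property names here are hypothetical, not taken from the demo schemas:

```csharp
// Illustrative node type: one stable key, explicit display fields,
// and a large-text property that can later be indexed for search
// or embeddings.
public sealed class SupportTicket
{
    // Stable key from the source system (never a mutable title).
    public string Key { get; set; } = "";
    public string Title { get; set; } = "";
    // Large text stays a property; relationships to other nodes
    // (reporter, device, part) are edges, not embedded strings.
    public string Description { get; set; } = "";
}
```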
## Incremental ingestion patterns
Choose one:
- Full refresh: rebuild everything on each run (simple, expensive)
- Incremental: ingest only changes since last run (recommended for production)
- Event-driven: apply changes as they happen (fastest, most complex)
For incremental ingestion, track:
- source cursor/watermark (timestamp, sequence)
- deletes (tombstones or periodic reconciliation)
- schema evolution strategy (backfills)
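A watermark can be as simple as a timestamp persisted between runs. This sketch stores it in a local file using the round-trip `"O"` format; the file-based storage is an assumption for illustration, and production connectors often keep the cursor in the workspace itself or a state store:

```csharp
using System;
using System.Globalization;
using System.IO;

public sealed class Watermark
{
    private readonly string _path;
    public Watermark(string path) => _path = path;

    // Load the last successfully ingested timestamp; default to
    // DateTime.MinValue on the first run so everything is picked up.
    public DateTime Load() =>
        File.Exists(_path)
            ? DateTime.Parse(File.ReadAllText(_path), CultureInfo.InvariantCulture,
                             DateTimeStyles.RoundtripKind)
            : DateTime.MinValue;

    // Persist the new watermark only AFTER a successful commit, so a
    // failed run is retried from the previous cursor.
    public void Save(DateTime cursor) =>
        File.WriteAllText(_path, cursor.ToString("O"));
}
```

The ordering rule matters: saving the cursor before the commit succeeds can silently drop records.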
## Testing and validation
After running a connector, validate:
- counts per node type
- edge completeness
- no duplicate keys
- search indexes include the intended fields (if configured)
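The duplicate-key check can be run with plain LINQ over the keys a connector produced. This in-memory sketch is illustrative; at scale you would query the graph rather than materialize all keys:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IngestionChecks
{
    // Return every key that appears more than once. A non-empty
    // result means the key derivation is unstable or the source
    // contains true duplicates that need an explicit dedup rule.
    public static List<string> DuplicateKeys(IEnumerable<string> keys) =>
        keys.GroupBy(k => k)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
}
```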
## Common pitfalls
- Unstable keys cause duplicates and broken links.
- Missing edges make the graph unusable for navigation and graph-filtered search.
- Ingesting everything as text reduces the value of schemas and facets.
## Next steps
- Build ingestion workflows: Ingestion Pipelines
- Learn best practices for modeling: Schema Design