# Schema Design
Schema design is the most important step in building a successful Curiosity Workspace application. It determines:
- how users navigate and explore data
- how search is scoped and filtered
- how AI features can ground and enrich results
- how connectors keep data consistent over time
## The three layers of a good schema
- Entities (nodes): the “things” in your domain
- Relationships (edges): the meaningful links between those things
- Attributes (properties): the descriptive fields used for display, filtering, and retrieval
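As a rough illustration of how the three layers fit together, here is a minimal in-memory sketch. The `Node`, `Edge`, and `Graph` classes are hypothetical names chosen for this example; they are not the Curiosity Workspace API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Node:
    """An entity: one 'thing' in the domain, identified by type and key."""
    node_type: str
    key: str


@dataclass(frozen=True)
class Edge:
    """A relationship: a typed, directed link between two nodes."""
    source: Node
    edge_type: str
    target: Node


@dataclass
class Graph:
    # Attributes are stored per node so they can be displayed, filtered, or retrieved.
    properties: dict[Node, dict[str, Any]] = field(default_factory=dict)
    edges: set[Edge] = field(default_factory=set)

    def add_node(self, node: Node, **props: Any) -> Node:
        # Re-adding the same node merges properties instead of duplicating it.
        self.properties.setdefault(node, {}).update(props)
        return node

    def add_edge(self, source: Node, edge_type: str, target: Node) -> None:
        self.edges.add(Edge(source, edge_type, target))


graph = Graph()
customer = graph.add_node(Node("Customer", "C-1"), name="Acme")
ticket = graph.add_node(Node("Ticket", "T-42"), subject="Login fails")
graph.add_edge(customer, "HasTicket", ticket)
```

Keeping identity (`node_type` plus `key`) separate from attributes is a deliberate choice in this sketch: attributes can change between ingestion runs without changing which node they belong to.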
## Start from user journeys
Ask these questions before you write your first node schema:
- What are the top 5 questions users ask?
- What are the top 5 workflows users execute?
- Which objects do they search for first?
- What do they click on next?
Those answers typically map directly to:
- primary node types
- the edges between them
- the filters and facets you must support
## Keys: pick stable identity early
For each node type, define a stable key:
- Prefer stable IDs from the source system.
- If not available, use a deterministic key strategy (canonicalization + hash).
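- Avoid random IDs unless you will never need to re-run ingestion: a random key changes on every run, so re-ingested records cannot be matched to the nodes they already created.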
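When the source system has no stable ID, the "canonicalization + hash" strategy could look roughly like the following. The helper names, the choice of SHA-256, the 16-character truncation, and the sample field values are all assumptions made for this sketch, not a prescribed Curiosity Workspace convention.

```python
import hashlib
import unicodedata


def canonicalize(value: str) -> str:
    """Normalize a value so trivially different spellings map to the same text."""
    normalized = unicodedata.normalize("NFKC", value)
    # Collapse runs of whitespace, trim the ends, and fold case.
    return " ".join(normalized.split()).casefold()


def deterministic_key(node_type: str, *parts: str) -> str:
    """Build a stable key from the node type and its identifying fields."""
    # The unit separator (0x1F) keeps ("ab", "c") distinct from ("a", "bc").
    canonical = "\x1f".join([node_type, *(canonicalize(p) for p in parts)])
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Truncated for readability; keep more of the digest if collisions are a concern.
    return f"{node_type}:{digest[:16]}"


key_a = deterministic_key("Customer", "Acme  Corp", "acme.example")
key_b = deterministic_key("Customer", " acme corp ", "ACME.example")
assert key_a == key_b  # the same logical record yields the same key on every run
```

Because the key depends only on the canonicalized identifying fields, re-running ingestion over the same source data addresses the same nodes instead of creating new ones.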
## When to make something a node vs a property
Use a property when:
- the value is only displayed or filtered on the current node
- you do not need to navigate to it as an entity
Use a node + edge when:
- you need cross-cutting filters (e.g., status across multiple types)
- you need navigation and context building (“show all tickets for this customer”)
- the value should have its own metadata over time
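To make the trade-off concrete, the sketch below models a status both ways using plain tuples and dicts. The `Task` type and all keys are hypothetical; the point is only that a shared `Status` node gives one lookup that works across node types.

```python
from collections import defaultdict

# Option A: status as a property. Simple, and sufficient when the value is
# only displayed or filtered on the node itself.
ticket = {"type": "Ticket", "key": "T-1", "status": "Open"}

# Option B: status as a shared node, linked by HasStatus edges.
# Nodes are (type, key) pairs; edges are (source, edge_type, target) triples.
open_status = ("Status", "Open")
edges = [
    (("Ticket", "T-1"), "HasStatus", open_status),
    (("Task", "K-7"), "HasStatus", open_status),
    (("Ticket", "T-2"), "HasStatus", ("Status", "Closed")),
]

# Index incoming edges by target so a single lookup answers
# "everything that is Open", regardless of node type.
incoming = defaultdict(list)
for source, edge_type, target in edges:
    incoming[(target, edge_type)].append(source)

open_items = incoming[(open_status, "HasStatus")]
```

With Option A, the same question requires knowing every node type that carries a `status` property and filtering each one separately.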
## Relationship modeling patterns
Common patterns:
- Ownership / membership: Customer -> HasTicket -> Ticket
- Attribution as node: Ticket -> HasStatus -> Status
- Mentions / linking: Document -> Mentions -> Entity
- Bipartite linking: avoid duplicating properties by linking to shared nodes
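The patterns above can be exercised with a small edge list. This is an illustrative sketch only: the node keys (such as `customer:acme`) and the helper function are made up, and a real workspace would run these traversals through its own query layer.

```python
from collections import defaultdict

# (source, edge_type, target) triples, one or more per pattern.
EDGES = [
    ("customer:acme", "HasTicket", "ticket:1"),        # ownership / membership
    ("customer:acme", "HasTicket", "ticket:2"),
    ("ticket:1", "HasStatus", "status:open"),          # attribution as node
    ("ticket:2", "HasStatus", "status:closed"),
    ("document:faq", "Mentions", "entity:vpn"),        # mentions / linking
    ("document:runbook", "Mentions", "entity:vpn"),
]

outgoing = defaultdict(list)
incoming = defaultdict(list)
for source, edge_type, target in EDGES:
    outgoing[(source, edge_type)].append(target)
    incoming[(target, edge_type)].append(source)


def open_tickets_for(customer: str) -> list[str]:
    """Two-hop traversal: Customer -> HasTicket -> Ticket -> HasStatus -> Status."""
    return [
        ticket
        for ticket in outgoing[(customer, "HasTicket")]
        if "status:open" in outgoing[(ticket, "HasStatus")]
    ]


# Bipartite linking: both documents point at one shared entity node, so
# "what else mentions this?" is a reverse lookup rather than a text comparison.
related_documents = incoming[("entity:vpn", "Mentions")]
```

Indexing edges in both directions is what makes the forward question (a customer's tickets) and the reverse question (documents mentioning an entity) equally cheap in this sketch.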
## Schema evolution
Expect schema evolution in real systems:
- add properties as new data becomes available
- introduce new node/edge types for new workflows
- backfill or reparse content when pipelines change
Operational advice:
- version your connector logic and treat schema updates as deployments
- plan reindex and reparse windows for large changes
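One way to treat schema updates as deployments is to diff two schema versions before rollout. The sketch below is hypothetical: it represents a schema as a plain `{type_name: {property, ...}}` mapping and assumes that removals are the "large" changes that need a planned reindex or backfill window, which may not match your own policy.

```python
def diff_schema(
    old: dict[str, set[str]], new: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Compare two schema versions given as {type_name: {property, ...}}."""
    changes: dict[str, list[str]] = {
        "added_types": sorted(new.keys() - old.keys()),
        "removed_types": sorted(old.keys() - new.keys()),
        "added_properties": [],
        "removed_properties": [],
    }
    # Only types present in both versions can have property-level changes.
    for type_name in sorted(old.keys() & new.keys()):
        for prop in sorted(new[type_name] - old[type_name]):
            changes["added_properties"].append(f"{type_name}.{prop}")
        for prop in sorted(old[type_name] - new[type_name]):
            changes["removed_properties"].append(f"{type_name}.{prop}")
    return changes


def needs_planned_window(changes: dict[str, list[str]]) -> bool:
    """Assumption: additive changes are safe to roll out; removals are not."""
    return bool(changes["removed_types"] or changes["removed_properties"])


v1 = {"Ticket": {"subject", "status"}, "Customer": {"name"}}
v2 = {"Ticket": {"subject", "priority"}, "Customer": {"name"}, "Status": {"label"}}

changes = diff_schema(v1, v2)
```

Here `v2` promotes the ticket status to its own node type, so the diff reports a new `Status` type, a new `Ticket.priority` property, and the removal of `Ticket.status`, which this sketch flags as needing a planned window.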
## Next steps
- Learn ingestion patterns that keep schemas consistent: Ingestion Pipelines
- Tune search based on schema decisions: Search Optimization