# Data Flow

Data flow in a Curiosity Workspace describes how raw source data becomes:

- Structured (schemas, properties, graph edges)
- Findable (text and vector indexes)
- Actionable (endpoints, interfaces, AI workflows)

## End-to-end pipeline (conceptual)

1. Ingest
   - A connector/integration reads source records and maps them into node/edge schemas.
2. Persist
   - Nodes and edges are committed into the workspace graph storage.
3. Index
   - Selected fields are indexed for text search and/or embedding search.
4. Parse / enrich (optional)
   - NLP pipelines extract entities/signals from text and can link them into the graph.
5. Serve
   - UI, APIs, and endpoints read from graph + search to deliver experiences.
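
To make the stages concrete, here is a self-contained Python toy of the same flow. Every name and data shape in it is illustrative, not a Curiosity Workspace API:

```python
# Self-contained toy of the five stages; nothing here is a Curiosity API.

# 1. Ingest: map raw source records into node dicts with stable keys.
raw_records = [
    {"id": "TCK-1", "title": "Login fails", "body": "User cannot sign in."},
    {"id": "TCK-2", "title": "Export bug", "body": "CSV export drops rows."},
]
nodes = {f"Ticket/{r['id']}": {"type": "Ticket", **r} for r in raw_records}

# 2. Persist: commit nodes and edges into graph storage (here, a plain dict).
graph = {"nodes": nodes, "edges": []}

# 3. Index: build a trivial inverted text index over selected fields.
text_index: dict[str, set] = {}
for key, node in graph["nodes"].items():
    for token in f"{node['title']} {node['body']}".lower().split():
        text_index.setdefault(token, set()).add(key)

# 4. Parse / enrich (optional): a stand-in for entity extraction + linking.
for key, node in graph["nodes"].items():
    if "csv" in node["body"].lower():
        graph["edges"].append((key, "Mentions", "Format/CSV"))

# 5. Serve: answer a query from graph + index.
print(text_index.get("export", set()))  # {'Ticket/TCK-2'}
```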

## Practical breakdown

### Ingestion (connectors and pipelines)

- Connectors are best when you need full control over mapping, identifiers, and relationship creation.
- Pipelines/integrations are best when your source is a standard system and configuration is enough.
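
As a rough illustration of the mapping work a connector does, here is a Python sketch. The record shape and function are hypothetical, not a real connector API:

```python
# Hypothetical connector mapping: one source record becomes nodes and edges.
# Nothing here is a Curiosity Workspace API; it only shows the pattern.

def map_ticket(record: dict) -> tuple[list[dict], list[tuple]]:
    ticket_key = f"Ticket/{record['id']}"          # stable key drives dedup/updates
    user_key = f"User/{record['reporter_email']}"  # shared key converges tickets on one user node
    nodes = [
        {"key": ticket_key, "type": "Ticket", "title": record["subject"]},
        {"key": user_key, "type": "User", "email": record["reporter_email"]},
    ]
    edges = [(ticket_key, "ReportedBy", user_key)]
    return nodes, edges

nodes, edges = map_ticket(
    {"id": "42", "subject": "Login fails", "reporter_email": "ana@example.com"}
)
print(edges)  # [('Ticket/42', 'ReportedBy', 'User/ana@example.com')]
```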

See Data Integration → Connectors.

### Graph modeling (schemas + keys)

Your data model decisions determine everything downstream:

- Keys determine deduplication and update behavior
- Edges determine navigation and graph-based filtering
- Properties determine searchability and faceting
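
A small sketch of why keys matter, assuming a hypothetical upsert-by-key store (not the workspace's actual storage API): writing the same key twice updates one node instead of creating a duplicate.

```python
# Hypothetical upsert-by-key store: same key means update, new key means insert.
store: dict[str, dict] = {}

def upsert(node: dict) -> None:
    store.setdefault(node["key"], {}).update(node)

upsert({"key": "User/ana@example.com", "name": "Ana"})
upsert({"key": "User/ana@example.com", "title": "Engineer"})  # same key: merged, not duplicated
upsert({"key": "User/bob@example.com", "name": "Bob"})        # new key: second node

print(len(store))                     # 2
print(store["User/ana@example.com"])  # {'key': ..., 'name': 'Ana', 'title': 'Engineer'}
```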

See Data Integration → Schema Design.

### Indexing (text + vector)

- Text search is ideal for identifiers, titles, keywords, and exact terms.
- Vector search is ideal for meaning-based retrieval across longer text.
- Hybrid search combines both for strong recall and precision.
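
One common way hybrid search combines the two result lists is reciprocal rank fusion (RRF). The workspace's own combiner may differ, so treat this as a generic sketch of the idea:

```python
# Generic reciprocal rank fusion: merges ranked result lists from text and
# vector search. Illustrative only; not Curiosity's internal scoring.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

text_hits = ["Ticket/42", "Ticket/7", "Ticket/99"]    # exact-term matches
vector_hits = ["Ticket/7", "Ticket/13", "Ticket/42"]  # meaning-based matches
print(rrf([text_hits, vector_hits]))  # items found by both lists rank first
```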

See Search.

### NLP enrichment (optional)

NLP can add:

- Entity extraction (people, products, IDs, concepts)
- Entity linking (connect extracted entities to existing nodes)
- Derived signals used for filtering or ranking
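
Entity linking, at its simplest, resolves an extracted mention to an existing node key. A toy version (hypothetical data shapes, not the NLP pipeline's actual API) might normalize the surface form and look it up:

```python
# Toy entity linking: resolve extracted mentions to existing node keys.
# Illustrative only; a real NLP pipeline is configured, not hand-written like this.

existing_nodes = {
    "product:acme cloud": "Product/AcmeCloud",
    "person:ana silva": "Person/AnaSilva",
}

def link(mention: str, entity_type: str) -> str | None:
    return existing_nodes.get(f"{entity_type}:{mention.strip().lower()}")

extracted = [("Acme Cloud", "product"), ("Ana Silva", "person"), ("FooBar", "product")]
for mention, etype in extracted:
    print(mention, "->", link(mention, etype))  # unmatched mentions return None
```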

See NLP → Overview.

### AI workflows (optional)

AI features typically rely on:

- Grounding from search + graph (to reduce hallucinations)
- Custom endpoints to orchestrate retrieval, scoring, and business rules
- Interfaces tailored to the workflow (support, investigation, research, etc.)
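
The orchestration pattern behind such an endpoint is roughly "retrieve, then generate": fetch grounding context from search and the graph, apply business rules, and only then call the model. A generic sketch, with every function a hypothetical stand-in:

```python
# Generic retrieve-then-generate sketch. Every function here is a hypothetical
# stub; a real endpoint would call the workspace's search, graph, and LLM APIs.

def search(query: str) -> list[dict]:
    return [{"key": "Ticket/42", "text": "Login fails after password reset."}]

def neighbors(key: str) -> list[str]:
    return ["User/ana@example.com"]  # graph context around each hit

def call_llm(prompt: str) -> str:
    return "stub answer"  # stand-in for the actual model call

def answer(query: str) -> str:
    hits = [h for h in search(query)                    # 1. retrieve candidates
            if not h["key"].startswith("Restricted/")]  # 2. apply business rules
    context = [h["text"] for h in hits]
    context += [f"related: {n}" for h in hits for n in neighbors(h["key"])]
    prompt = "Answer from this context only:\n" + "\n".join(context) + f"\n\nQ: {query}"
    return call_llm(prompt)                             # 3. generate, grounded

print(answer("Why does login fail?"))
```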

See AI & LLMs → Overview and APIs & Extensibility.

## Observability checkpoints

When something “doesn’t work”, validate in this order:

  1. Ingestion: are nodes/edges being created? (counts, keys, errors)
  2. Graph correctness: do expected relationships exist?
  3. Indexing: are fields indexed? is a rebuild required?
  4. Parsing: are pipelines applied to the right fields?
  5. App logic: do endpoints/UI queries match the data model?
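
The first three checkpoints translate naturally into assertions you can script. A toy version against the illustrative dict-based shapes used earlier on this page (not a real diagnostics API):

```python
# Toy checks mirroring checkpoints 1-3 above; data shapes are hypothetical.
# Checkpoints 4-5 are best verified in the pipeline config and endpoint code.

graph = {
    "nodes": {"Ticket/42": {"type": "Ticket", "title": "Login fails"}},
    "edges": [("Ticket/42", "ReportedBy", "User/ana@example.com")],
}
text_index = {"login": {"Ticket/42"}}

assert len(graph["nodes"]) > 0, "Ingestion: no nodes created"                    # 1
assert any(e[1] == "ReportedBy" for e in graph["edges"]), "Graph: edge missing"  # 2
assert "Ticket/42" in text_index.get("login", set()), "Indexing: field missing"  # 3
print("checkpoints 1-3 pass")
```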

## Next steps