# Sample Datasets

Sample datasets are a practical way to evaluate Curiosity Workspace and to teach teams the core patterns:

- ingest data into a graph schema
- configure search and facets
- enable embeddings and AI-assisted workflows
- build endpoints and interfaces

# What makes a good sample dataset

A good sample dataset has:

- Multiple entity types (at least 3–5 node types)
- Relationships that support navigation and filtering (edges)
- Text fields suitable for search (titles, summaries, descriptions)
- A time dimension (timestamps) to test recency and time filters
- Enough volume to see ranking and performance behavior
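
The checklist above can be turned into a quick automated check before you load a dataset. The following is a minimal sketch, not part of Curiosity Workspace: it assumes nodes are plain dictionaries with hypothetical `type`, `text`, and `timestamp` keys, that edges are `(source, label, target)` tuples, and that the thresholds (`min_node_types`, `min_nodes`) are placeholders you would tune for your own evaluation.

```python
from collections import Counter
from datetime import datetime


def check_sample_dataset(nodes, edges, min_node_types=3, min_nodes=1000):
    """Return a pass/fail flag for each criterion in the checklist.

    The data layout and thresholds are illustrative assumptions,
    not a Curiosity Workspace API.
    """
    type_counts = Counter(node["type"] for node in nodes)
    has_text = any((node.get("text") or "").strip() for node in nodes)
    has_time = any(isinstance(node.get("timestamp"), datetime) for node in nodes)
    return {
        "multiple_entity_types": len(type_counts) >= min_node_types,
        "has_relationships": len(edges) > 0,
        "has_text_fields": has_text,
        "has_time_dimension": has_time,
        "enough_volume": len(nodes) >= min_nodes,
    }


# Tiny hypothetical dataset; a real one would be far larger.
nodes = [
    {"id": "t1", "type": "Ticket", "text": "Login fails after password reset",
     "timestamp": datetime(2024, 1, 5)},
    {"id": "p1", "type": "Product", "text": "Identity Service"},
    {"id": "c1", "type": "Customer", "text": "Acme Corp"},
]
edges = [("t1", "ABOUT", "p1"), ("t1", "OPENED_BY", "c1")]

# min_nodes is lowered here only so the toy example passes the volume check.
report = check_sample_dataset(nodes, edges, min_nodes=3)
for criterion, passed in report.items():
    print(f"{criterion}: {'ok' if passed else 'missing'}")
```

A report like this only confirms that the ingredients are present. Whether the relationships and text fields are actually useful still has to be judged against real workflows.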

# Recommended sample dataset categories

- Support and case management
  - tickets/cases, products, customers, teams, statuses
- Compliance and audit
  - policies, controls, evidence, owners, exceptions
- Engineering knowledge base
  - docs, code artifacts, services, incidents, runbooks
- Research
  - papers, authors, topics, citations, institutions
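
To make one of these categories concrete, here is a minimal sketch of how the support and case management entities could be written down as node and edge types before ingestion. The node type names follow the list above; the edge labels (`ABOUT`, `OPENED_BY`, `ASSIGNED_TO`, `HAS_STATUS`, `USES`) and the dictionary layout are hypothetical, chosen only for illustration, and are not a Curiosity Workspace schema format.

```python
# Hypothetical schema description for the support and case management category.
SUPPORT_SCHEMA = {
    "node_types": ["Ticket", "Product", "Customer", "Team", "Status"],
    # Each edge type is (source node type, edge label, target node type).
    "edge_types": [
        ("Ticket", "ABOUT", "Product"),
        ("Ticket", "OPENED_BY", "Customer"),
        ("Ticket", "ASSIGNED_TO", "Team"),
        ("Ticket", "HAS_STATUS", "Status"),
        ("Customer", "USES", "Product"),
    ],
}


def undeclared_node_types(schema):
    """Return node types that an edge refers to but the schema never declares."""
    declared = set(schema["node_types"])
    used = {
        node_type
        for source, _label, target in schema["edge_types"]
        for node_type in (source, target)
    }
    return sorted(used - declared)


def isolated_node_types(schema):
    """Return declared node types that no edge touches (dead ends for navigation)."""
    used = {
        node_type
        for source, _label, target in schema["edge_types"]
        for node_type in (source, target)
    }
    return sorted(set(schema["node_types"]) - used)


print("undeclared:", undeclared_node_types(SUPPORT_SCHEMA))
print("isolated:", isolated_node_types(SUPPORT_SCHEMA))
```

Writing the schema down this way makes it easy to spot node types that nothing links to, which usually means they will not help with navigation or filtering.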

# How to use sample datasets in your docs/testing

Use sample datasets to validate:

- Graph navigation: do the relationships enable the workflows you want?
- Search relevance: can users find the right objects by keywords?
- Semantic recall: can vector search find “similar meaning” content?
- Facet usefulness: do the chosen facets match how users refine results?
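
Two of these checks are easy to quantify once you have a handful of test queries with known relevant items. The sketch below is one possible approach under stated assumptions: the ranked result IDs and the ticket records are made-up sample data, and `recall_at_k` and `facet_counts` are hypothetical helper names rather than functions provided by Curiosity Workspace. The same recall measure can be applied to keyword results and to vector search results.

```python
from collections import Counter


def recall_at_k(results, relevant, k=10):
    """Fraction of the relevant IDs that appear in the top-k ranked results."""
    if not relevant:
        return 0.0
    top_k = set(results[:k])
    return len(top_k & set(relevant)) / len(relevant)


def facet_counts(items, facet):
    """Count how many items fall under each value of a facet field."""
    return Counter(item[facet] for item in items if facet in item)


# Hypothetical ranked result IDs returned for the query "password reset".
results = ["t1", "t7", "t3", "t9"]
# IDs a reviewer marked as relevant for that query.
relevant = {"t1", "t3", "t4"}

# Of the 3 relevant IDs, t1 and t3 appear in the top 3 results.
score = recall_at_k(results, relevant, k=3)
print(f"recall@3 = {score:.2f}")

# Hypothetical ticket records used to inspect a "status" facet.
tickets = [
    {"id": "t1", "status": "open"},
    {"id": "t3", "status": "open"},
    {"id": "t4", "status": "closed"},
]
counts = facet_counts(tickets, "status")
print(dict(counts))
```

A facet whose counts are concentrated in a single value, or spread across hundreds of near-unique values, is unlikely to help users refine results, so the distribution is worth checking alongside recall.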

# Next steps

- Implement ingestion for a dataset: Connectors
- Make it searchable: Search → Text Search