Curiosity - Concepts for developers

A short glossary of the terms used across the developer docs. If you've read the Architecture overview, most of these will already be familiar — this page exists so you can keep one tab open while you build.

Platform layers

Graph : The typed knowledge graph at the core of every workspace. Stores nodes and edges with stable keys and schemas. Used for navigation, faceting, and grounding AI. See Graph Model.
Search : The retrieval engine. Indexes selected fields as text, vectors, or both. Every query is filtered by the calling user's permissions before results return. See Search Model.
AI : Embeddings, entity extraction, LLM orchestration, AI tools, and agents. AI features are always grounded in graph + search retrieval — they don't have an independent data layer. See AI Models.

Schema and data

Node : A typed graph object — Customer, Ticket, Product. Defined as a C# class with [Node] and at least one [Key] property in the Curiosity.Library SDK.
Edge : A typed relationship between two nodes. Edge names are usually constants on a static Edges class. Edges are first-class: you traverse, facet, and search by them.
Key : The stable identifier of a node. Drives idempotent ingestion — two upserts with the same key update one node instead of creating two. Pick keys from the source system whenever possible.
Property : A scalar field on a node (or edge). Used for display, filtering, sorting, and as input to the search index.
Schema : The registered shape of node types and edge types in a workspace. Schemas are registered by the connector at startup and validated on every commit.
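
The effect of a stable key on ingestion can be sketched with a toy in-memory graph. This is an illustration only, not the Curiosity.Library API: `ToyGraph`, `upsert_node`, `link`, and the `ForCustomer` edge name are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """In-memory stand-in for a graph (hypothetical; not the Curiosity SDK)."""
    # Nodes are addressed by (node type, key), so the key alone decides identity.
    nodes: dict = field(default_factory=dict)
    # Edges are (source, edge name, target) triples; a set makes re-adds no-ops.
    edges: set = field(default_factory=set)

    def upsert_node(self, node_type, key, **properties):
        node_id = (node_type, key)
        # Same key: merge properties into the existing node instead of adding one.
        self.nodes.setdefault(node_id, {}).update(properties)
        return node_id

    def link(self, source, edge_name, target):
        self.edges.add((source, edge_name, target))


graph = ToyGraph()
ticket = graph.upsert_node("Ticket", "T-1001", status="Open")
customer = graph.upsert_node("Customer", "acme", name="Acme")
graph.link(ticket, "ForCustomer", customer)

# A second sync run sees the same source record with a changed status.
graph.upsert_node("Ticket", "T-1001", status="Closed")
graph.link(ticket, "ForCustomer", customer)

print(len(graph.nodes), len(graph.edges))
print(graph.nodes[ticket]["status"])
```

Because the ticket keeps the key it has in the source system, the second run updates the existing node and the graph still holds two nodes and one edge.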

Ingestion

Connector : Code that reads from a source system and upserts nodes/edges into the graph. Usually a long-running C# program that uses Curiosity.Library. Connectors can also be written in Python via Curiosity.Library.Python.
Pipeline : The operational lifecycle of a connector (initial load → incremental sync → optional enrichment). See Ingestion Pipelines.
ACL ingestion : The pattern of attaching access-control metadata to a node during ingestion (RestrictAccessToTeam, RestrictAccessToUser, MarkFileAsPrivate). Drives the ReBAC graph that the search and graph engines use to filter at query time.
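
The initial load and incremental sync stages of a pipeline can be sketched as one function driven by a watermark. This is a minimal sketch under assumptions: the `modified` field, the `sync` function, and the watermark approach are hypothetical and chosen for illustration. The source does not say how a Curiosity connector detects changes.

```python
def sync(source_rows, upsert, since=None):
    """Upsert rows modified after `since` and return the new watermark.

    With since=None every row is sent, which corresponds to the initial load.
    """
    watermark = since
    for row in source_rows:
        if since is None or row["modified"] > since:
            upsert(row["id"], row)
            if watermark is None or row["modified"] > watermark:
                watermark = row["modified"]
    return watermark


store = {}      # stands in for the graph, keyed by the source-system id
touched = []    # records which keys each run actually wrote


def tracked_upsert(key, row):
    touched.append(key)
    store[key] = row


rows = [
    {"id": "T-1", "modified": "2024-05-01", "status": "Open"},
    {"id": "T-2", "modified": "2024-05-02", "status": "Open"},
]

# Initial load: no watermark yet, so both rows are written.
watermark = sync(rows, tracked_upsert)

# The source changes one record; the incremental run only touches that one.
rows[0] = {"id": "T-1", "modified": "2024-05-03", "status": "Closed"}
touched.clear()
watermark = sync(rows, tracked_upsert, since=watermark)

print(touched, watermark)
```

ISO-formatted dates compare correctly as strings, which keeps the sketch free of date parsing. Keyed upserts make a repeated run harmless if the watermark is lost.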

Retrieval

Text search : Keyword retrieval (BM25-style) over indexed fields. Best for identifiers, titles, and exact-match terms.
Vector search : Semantic retrieval using embeddings stored in a vector index. Best for paraphrased questions and long descriptive text.
Hybrid search : Combines text and vector retrieval. Typically the right default for enterprise data, which mixes exact identifiers with long descriptive text.
Facet : A property or relationship users can filter results by — Status=Open, Customer=Acme, Product=Pro 14. Related facets traverse one edge and back, which is how "search within this context" works without a JOIN.
SearchRequest : The typed object passed to Graph.CreateSearchAsync / CreateSearchAsUserAsync. Holds query text, type scope (BeforeTypesFacet), an optional target-UID set (TargetUIDs), and ranking options.
Q() : The fluent graph-query chain. Reads as a pipeline: Q().StartAt(...).Out(...).Where(...).Take(...).Emit(...). See Graph Query Language.
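
One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF), sketched below. This is an assumption made for illustration: the source does not state how Curiosity fuses text and vector results, and the function name, the constant `k=60`, and the ticket ids are all hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: score(id) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Keyword retrieval favours the exact identifier match.
text_hits = ["T-1001", "T-2040", "T-0007"]
# Vector retrieval favours the paraphrased description.
vector_hits = ["T-2040", "T-3311", "T-1001"]

fused = reciprocal_rank_fusion([text_hits, vector_hits])
print(fused)
```

Items that appear in both lists rise to the top, while items found by only one retriever are kept but ranked lower.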

Security

ReBAC (Relationship-Based Access Control) : Permissions expressed as paths in the graph: "User U can see Resource R if there exists a path from U via membership/ownership edges to R." The search engine compiles user → team membership into ACL filters at query time. See Access Control Model.
Team / _AccessGroup : A group of users. Resources are typically restricted to a team rather than to individual users so that membership changes don't require re-tagging data.
Public access group : The special UID PUBL1CaccesSgr8up11111. Any node with an _OwnedBy edge to this group is visible to all authenticated users.
API token : A workspace-issued credential used by connectors and external scripts. Scoped to one or more permission groups (ingestion, read, search, etc.). Never use admin tokens from a connector.
Endpoint token : A token scoped to one or more custom endpoint paths. Issued so external systems can call your endpoints without holding broader API rights.
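
The path-based rule behind ReBAC can be sketched as a breadth-first search over membership and ownership edges. This is a toy model, not the engine's implementation: the node ids, the `EDGES` table, and the `can_see` and `filter_results` helpers are hypothetical, and edges are stored in the group-to-resource direction purely to keep the traversal simple.

```python
from collections import deque

# Hypothetical edges: user -> team (membership), team or group -> resource (ownership).
EDGES = {
    "user:ana": ["team:support"],
    "user:bo": ["team:sales"],
    "team:support": ["ticket:T-1001"],
    "team:sales": ["deal:D-77"],
    "group:public": ["doc:handbook"],
}
PUBLIC_GROUP = "group:public"


def can_see(user, resource):
    """True if a path of membership/ownership edges leads from user to resource."""
    # Every authenticated user is treated as also reaching the public group.
    queue = deque([user, PUBLIC_GROUP])
    seen = {user, PUBLIC_GROUP}
    while queue:
        current = queue.popleft()
        if current == resource:
            return True
        for neighbour in EDGES.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def filter_results(user, hits):
    """Keep only the hits the user has a path to."""
    return [hit for hit in hits if can_see(user, hit)]


hits = ["ticket:T-1001", "deal:D-77", "doc:handbook"]
ana_view = filter_results("user:ana", hits)
bo_before = filter_results("user:bo", hits)

# Adding bo to the support team changes what bo sees without re-tagging the ticket.
EDGES["user:bo"].append("team:support")
bo_after = filter_results("user:bo", hits)

print(ana_view, bo_before, bo_after)
```

The last step shows why resources are restricted to teams rather than individual users: a membership change is a single edge, and no resource needs to be touched.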

Extensibility

Custom endpoint : A server-side C# function written inside the workspace, accessible at /api/endpoints/run/<name>. Has full access to the graph, search, and AI runtime. The right place for permission-aware business logic.
AI tool : An annotated C# class ([Tool], [Parameter]) the chat LLM can invoke. Each tool runs in the calling user's security context (scope.CurrentUser). Use scope.AddSnippet(...) so the LLM can cite its sources.
Scheduled task : Server-side code that runs on a cron-style schedule — periodic ingestion, reindex, enrichment, or analytics rollups. See Scheduled Tasks.
Custom interface : A Tesserae/H5 front-end that runs as part of the workspace and talks to your custom endpoints. See Build enterprise AI apps.

Operations

MSK_* environment variables : All container/runtime configuration uses the MSK_ prefix (MSK_GRAPH_STORAGE, MSK_ADMIN_PASSWORD, MSK_PORT, MSK_JWT_KEY, …). See the full list in the Configuration reference.
Re-index : Rebuild the search index from the current graph state. Triggered automatically when you change indexed fields, or manually from Settings → Maintenance. Required after some kinds of schema changes.
Re-parse : Re-run NLP and entity-extraction pipelines on existing content. Required when you change a pipeline definition. See Reindexing and re-embedding.
Snapshot / backup : A point-in-time copy of MSK_GRAPH_STORAGE (graph) plus MSK_GRAPH_BACKUP_FOLDER (rolling backups). See Backup & restore.

Keep reading:

Architecture overview — how all of the above fits together.
API Overview — the concrete API surface inside a connector or endpoint.
Build your first enterprise AI app — apply every term on this page in one tutorial.
