Curiosity

NLP Configuration

The workspace ships an NLP framework that extracts entities, links them to graph nodes, and produces document-level annotations searchable alongside structured data. This section is the operator's reference — what to configure, where to configure it, and how each model fits into a pipeline.

For per-page configuration screenshots, see the workspace UI itself; this section concentrates on the model.

The pipeline shape

flowchart LR
    Doc[(Text<br/>e.g. SupportCase.Content)] --> Tokenizer
    Tokenizer --> POS[Part-of-speech<br/>tagging]
    POS --> Spotter[Spotter<br/>models]
    POS --> Pattern[Pattern<br/>spotters]
    Spotter --> Link[Entity linking]
    Pattern --> Link
    Link --> Post[Post-processing]
    Post --> Graph[(Annotated nodes<br/>+ link edges)]

You bind a pipeline to one or more node-field pairs (e.g. SupportCase.Content). Every ingested or reparsed value of that field runs through the pipeline.
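
The binding idea can be illustrated with a small, self-contained sketch. This is not the workspace API: `Pipeline`, `bind`, `on_ingest`, the two toy stages, and the `SupportCase.Summary` field are hypothetical names chosen for illustration. It only shows how one pipeline might be attached to several node-field pairs and run whenever a bound field receives a value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins; none of these names come from the workspace API.
Stage = Callable[[dict], dict]


@dataclass
class Pipeline:
    stages: List[Stage] = field(default_factory=list)

    def run(self, text: str) -> dict:
        # The state dict carries the text plus whatever each stage adds.
        state = {"text": text}
        for stage in self.stages:
            state = stage(state)
        return state


def tokenize(state: dict) -> dict:
    # Toy whitespace tokenizer; a real tokenizer is more careful.
    return {**state, "tokens": state["text"].split()}


def spot_vocabulary(vocabulary: set) -> Stage:
    # Toy spotter: keeps tokens whose normalized form is in the vocabulary.
    def stage(state: dict) -> dict:
        hits = [t for t in state["tokens"] if t.strip(".,").lower() in vocabulary]
        return {**state, "matches": hits}
    return stage


# Bindings: (node type, field name) -> pipeline.
bindings: Dict[Tuple[str, str], Pipeline] = {}


def bind(pipeline: Pipeline, *pairs: Tuple[str, str]) -> None:
    for pair in pairs:
        bindings[pair] = pipeline


def on_ingest(node_type: str, field_name: str, value: str) -> Optional[dict]:
    # Fields without a binding are left untouched.
    pipeline = bindings.get((node_type, field_name))
    return pipeline.run(value) if pipeline else None


pipeline = Pipeline([tokenize, spot_vocabulary({"router", "firmware"})])
bind(pipeline, ("SupportCase", "Content"), ("SupportCase", "Summary"))
result = on_ingest("SupportCase", "Content", "Router reboots after firmware update.")
print(result["matches"])
```

Keeping the binding separate from the pipeline definition is what lets one pipeline serve several fields without duplicating its configuration.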

When to use which model

| Goal | Model |
|---|---|
| Match a fixed vocabulary that already lives in the graph | Dynamic spotter (auto-syncs) |
| Match a curated alias list you maintain externally | Manual spotter (upload a list) |
| Capture identifier patterns (ticket IDs, part numbers, dates) | Pattern spotter |
| Decide whether a match is correct based on document context | Entity post-processing |
| Link captured terms to existing nodes via _Mentions edges | Entity linking |
| Auto-create new nodes when a pattern hits unknown text | Entity linking → "Create new node" |
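
To make the difference between pattern spotting and entity linking concrete, here is a minimal sketch in plain Python. It does not use the workspace's configuration format: the `TCK-` ticket format, the `PATTERNS` table, and the `spot_patterns` and `link` helpers are assumptions made for illustration. The `create_missing` flag mimics the effect of the "Create new node" option on text that matches a pattern but has no node yet.

```python
import re

# Hypothetical patterns; real identifier formats will differ.
PATTERNS = {
    "TicketId": re.compile(r"\bTCK-\d{4,6}\b"),
    "Date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def spot_patterns(text):
    """Return (entity type, matched text, start offset), sorted by position."""
    hits = []
    for entity_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((entity_type, m.group(), m.start()))
    return sorted(hits, key=lambda h: h[2])


def link(hits, graph, create_missing=False):
    """Link hits to existing nodes; optionally create nodes for unknown text."""
    edges = []
    for entity_type, value, _ in hits:
        key = (entity_type, value)
        if key not in graph:
            if not create_missing:
                continue  # unknown text is dropped when creation is off
            graph.add(key)
        edges.append(("_Mentions", key))
    return edges


# The graph is modelled as a set of (type, key) pairs for this sketch.
graph = {("TicketId", "TCK-10482")}
text = "Duplicate of TCK-10482, reopened as TCK-20001 on 2024-03-18."

hits = spot_patterns(text)
strict_edges = link(hits, set(graph))                   # links known nodes only
lenient_edges = link(hits, graph, create_missing=True)  # also creates nodes
```

With creation off, only the ticket already in the graph gets an edge. With it on, the second ticket and the date become new nodes and are linked as well.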

Programmatic access

The configuration lives inside the workspace's graph (under internal nodes), but you typically author it through the UI. The points where you'll write C# are:

  • Entity post-processing scripts: these run once per match, with Tokens, Document, Graph, and Q() in scope. See Entity post-processing.
  • AI tool annotations: if you want a chat tool to depend on extracted entities, query the _Mentions edges directly with Q().IsRelatedToVia(...).
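
The kind of decision a post-processing script makes can be sketched independently of the scripting API. The example below is plain Python, not a workspace script: real scripts are written in C# against Tokens, Document, Graph, and Q(), none of which appear here. The `keep_match` helper, the context word list, and the "Falcon" product name are hypothetical.

```python
def keep_match(tokens, index, context_words, window=3):
    """Accept the match at tokens[index] only if a context word is nearby."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    neighbours = [
        t.lower().strip(".,")
        for i, t in enumerate(tokens[lo:hi], start=lo)
        if i != index  # skip the matched token itself
    ]
    return any(t in context_words for t in neighbours)


# Words that suggest "Falcon" refers to a product rather than a bird.
product_context = {"firmware", "device", "model", "version"}

doc_a = "Updated the Falcon firmware to the latest version".split()
doc_b = "A falcon nested on the office roof".split()

accepted = keep_match(doc_a, 2, product_context)  # "firmware" is adjacent
rejected = keep_match(doc_b, 1, product_context)  # no product context nearby
```

A window of a few tokens is a deliberately crude heuristic. A real script could instead use whole-document signals or look up related nodes in the graph before accepting a match.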