Curiosity

NLP Configuration

The workspace ships an NLP framework that extracts entities, links them to graph nodes, and produces document-level annotations that are searchable alongside structured data. This section is the operator's reference: what to configure, where to configure it, and how each model fits into a pipeline.

For per-page configuration screenshots, see the workspace UI itself; this section concentrates on the model.

The pipeline shape

flowchart LR
    Doc[(Text<br/>e.g. SupportCase.Content)] --> Tokenizer
    Tokenizer --> POS[Part-of-speech<br/>tagging]
    POS --> Spotter[Spotter<br/>models]
    POS --> Pattern[Pattern<br/>spotters]
    Spotter --> Link[Entity linking]
    Pattern --> Link
    Link --> Post[Post-processing]
    Post --> Graph[(Annotated nodes<br/>+ link edges)]

You bind a pipeline to one or more node-field pairs (e.g. SupportCase.Content). Every ingested or reparsed value of that field runs through the pipeline.
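
The data flow above can be modelled in a few lines. The following is a conceptual Python sketch, not the workspace's API: `Pipeline`, `Match`, `bind`, and `ingest` are hypothetical names, the tokenizer is a plain regex, and part-of-speech tagging is left out. It shows both spotter kinds reading the same tokenized text, and a pipeline firing only for the node-field pairs it is bound to.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Match:
    text: str    # surface form found in the document
    kind: str    # which spotter produced it
    start: int   # character offset in the source text


def tokenize(text):
    """Split text into (token, offset) pairs; a stand-in for a real tokenizer."""
    return [(m.group(), m.start()) for m in re.finditer(r"[A-Za-z0-9-]+", text)]


def gazetteer_spotter(tokens, vocabulary):
    """Emit a match for every token found in a fixed vocabulary (case-insensitive)."""
    vocab = {v.lower() for v in vocabulary}
    return [Match(tok, "spotter", pos) for tok, pos in tokens if tok.lower() in vocab]


def pattern_spotter(tokens, pattern):
    """Emit a match for every token that fully matches a regex."""
    rx = re.compile(pattern)
    return [Match(tok, "pattern", pos) for tok, pos in tokens if rx.fullmatch(tok)]


@dataclass
class Pipeline:
    vocabulary: set
    pattern: str
    bindings: set = field(default_factory=set)  # {(node_type, field_name)}

    def bind(self, node_type, field_name):
        self.bindings.add((node_type, field_name))

    def ingest(self, node_type, field_name, value):
        """Run the stages only if this node-field pair is bound."""
        if (node_type, field_name) not in self.bindings:
            return []
        tokens = tokenize(value)
        matches = gazetteer_spotter(tokens, self.vocabulary)
        matches += pattern_spotter(tokens, self.pattern)
        return sorted(matches, key=lambda m: m.start)


pipeline = Pipeline(vocabulary={"printer", "router"}, pattern=r"TCK-\d{4}")
pipeline.bind("SupportCase", "Content")

found = pipeline.ingest("SupportCase", "Content", "Router fails after TCK-1042 fix")
ignored = pipeline.ingest("SupportCase", "Title", "Router fails")
```

Here `found` holds one gazetteer match and one pattern match in document order, while `ignored` is empty because `SupportCase.Title` was never bound.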

In this section

  • NLP pipelines: Assigning languages, modes, and target fields.
  • Spotter models: Gazetteer extraction (fixed terms / aliases).
  • Pattern spotter: Regex-style extraction for IDs and codes.
  • Entity linking: Wiring extracted terms to graph nodes.
  • Entity post-processing: Custom C# that runs on each match.
  • Advanced extraction: Experiments, reparsing, multilingual setup.

When to use which model

| Goal | Model |
| --- | --- |
| Match a fixed vocabulary that already lives in the graph | Dynamic spotter (auto-syncs) |
| Match a curated alias list you maintain externally | Manual spotter (upload a list) |
| Capture identifier patterns (ticket IDs, part numbers, dates) | Pattern spotter |
| Decide whether a match is correct based on document context | Entity post-processing |
| Link captured terms to existing nodes via _Mentions edges | Entity linking |
| Auto-create new nodes when a pattern hits unknown text | Entity linking → "Create new node" |
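
The two entity-linking goals differ only in what happens when a captured term has no matching node. A minimal Python sketch of that decision, assuming a toy in-memory graph: `Graph`, `link_term`, and the `create_new` flag are hypothetical names for illustration and do not mirror the workspace's configuration schema. Only the `_Mentions` edge name comes from this page.

```python
class Graph:
    """Toy in-memory graph: nodes keyed by (type, key), edges as tuples."""

    def __init__(self):
        self.nodes = set()
        self.edges = []

    def add_node(self, node_type, key):
        self.nodes.add((node_type, key))

    def has_node(self, node_type, key):
        return (node_type, key) in self.nodes


def link_term(graph, document, term, target_type, create_new=False):
    """Link `document` to the node for `term` with a _Mentions edge.

    Returns True if an edge was written. Unknown terms are skipped unless
    create_new is set, in which case the target node is created first.
    """
    if not graph.has_node(target_type, term):
        if not create_new:
            return False
        graph.add_node(target_type, term)
    graph.edges.append((document, "_Mentions", (target_type, term)))
    return True


graph = Graph()
graph.add_node("Part", "PN-100")
doc = ("SupportCase", "case-1")

linked_known = link_term(graph, doc, "PN-100", "Part")
skipped_unknown = link_term(graph, doc, "PN-999", "Part")
created_unknown = link_term(graph, doc, "PN-999", "Part", create_new=True)
```

With linking alone, the unknown part number is dropped; with the hypothetical `create_new` flag set, the node is created and then linked like any other.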

Programmatic access

The configuration lives inside the workspace's graph (under internal nodes), but you typically author it through the UI. The points where you'll write C# are:

  • Entity post-processing scripts: these run once per match with Tokens, Document, Graph, and Q() in scope. See Entity post-processing.
  • AI tool annotations: if a chat tool should depend on extracted entities, query the _Mentions edges directly with Q().IsRelatedToVia(...).
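
The per-match idea behind post-processing can be illustrated without the workspace runtime. The sketch below is Python rather than C#, and it is only a conceptual model: `keep_match`, its parameters, and the context-window rule are hypothetical and are not the signature of an actual post-processing script. It accepts an ambiguous match only when a supporting word appears nearby in the document.

```python
def keep_match(tokens, index, context_words, window=3):
    """Accept the match at tokens[index] if any context word occurs
    within `window` tokens on either side of it."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    nearby = {t.lower() for i, t in enumerate(tokens[lo:hi], start=lo) if i != index}
    return bool(nearby & {w.lower() for w in context_words})


# "Jaguar" is ambiguous: keep it as a vehicle only near vehicle vocabulary.
vehicle_context = {"engine", "car", "dealer"}

doc_a = "The Jaguar engine stalled on startup".split()
doc_b = "A jaguar was seen near the river".split()

keep_a = keep_match(doc_a, doc_a.index("Jaguar"), vehicle_context)
keep_b = keep_match(doc_b, doc_b.index("jaguar"), vehicle_context)
```

The first document keeps the match because "engine" sits next to it; the second rejects it because no vehicle word falls inside the window.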
© 2026 Curiosity. All rights reserved.