# NLP Configuration
The workspace ships an NLP framework that extracts entities, links them to graph nodes, and produces document-level annotations searchable alongside structured data. This section is the operator's reference — what to configure, where to configure it, and how each model fits into a pipeline.
For per-page configuration screenshots, see the workspace UI itself; this section concentrates on the model.
## The pipeline shape
```mermaid
flowchart LR
  Doc[(Text<br/>e.g. SupportCase.Content)] --> Tokenizer
  Tokenizer --> POS[Part-of-speech<br/>tagging]
  POS --> Spotter[Spotter<br/>models]
  POS --> Pattern[Pattern<br/>spotters]
  Spotter --> Link[Entity linking]
  Pattern --> Link
  Link --> Post[Post-processing]
  Post --> Graph[(Annotated nodes<br/>+ link edges)]
```
You bind a pipeline to one or more node-field pairs (e.g. `SupportCase.Content`). Every ingested or reparsed value of that field runs through the pipeline.
## In this section
- NLP pipelines — assigning languages, modes, and target fields.
- Spotter models — gazetteer extraction (fixed terms / aliases).
- Pattern spotter — regex-style extraction for IDs and codes.
- Entity linking — wiring extracted terms to graph nodes.
- Entity post-processing — custom C# that runs on each match.
- Advanced extraction — experiments, reparsing, multilingual setup.
## When to use which model
| Goal | Model |
|---|---|
| Match a fixed vocabulary that already lives in the graph | Dynamic spotter — auto-syncs. |
| Match a curated alias list you maintain externally | Manual spotter — upload a list. |
| Capture identifier patterns (ticket IDs, part numbers, dates) | Pattern spotter. |
| Decide whether a match is correct based on document context | Entity post-processing. |
| Link captured terms to existing nodes via `_Mentions` edges | Entity linking. |
| Auto-create new nodes when a pattern hits unknown text | Entity linking → "Create new node". |
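To make the pattern-spotter row concrete, the sketch below shows the kind of regular expression a pattern spotter would be built around. The ticket-ID shape (`CASE-` plus 4–6 digits) and the surrounding harness are illustrative assumptions, not the framework's API — in practice you would enter only the pattern in the pattern-spotter configuration.

```csharp
using System;
using System.Text.RegularExpressions;

class TicketIdPatternSketch
{
    static void Main()
    {
        // Hypothetical ticket-ID shape: "CASE-" followed by 4 to 6 digits.
        var ticketId = new Regex(@"\bCASE-\d{4,6}\b");

        string text = "Customer referenced CASE-10492 and CASE-88 in the call.";
        foreach (Match m in ticketId.Matches(text))
            Console.WriteLine($"{m.Value} at offset {m.Index}");
        // CASE-88 is not matched: it has fewer than four digits.
    }
}
```

Anchoring the pattern with `\b` word boundaries keeps it from firing inside longer tokens, which matters once matches feed into entity linking.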
## Programmatic access
The configuration lives inside the workspace's graph (under internal nodes), but you typically author it through the UI. The points where you'll write C# are:
- Entity post-processing scripts — run per-match with `Tokens`, `Document`, `Graph`, and `Q()` in scope. See Entity post-processing.
- AI tool annotations — if you want a chat tool to depend on extracted entities, query the `_Mentions` edges directly with `Q().IsRelatedToVia(...)`.
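As an illustration of the shape a post-processing script might take: only the fact that `Tokens`, `Document`, `Graph`, and `Q()` are in scope is given above — the members used below (`Tokens` elements exposing a `Text` property, a `Document.Text` property, and a boolean return meaning keep/drop) are assumptions for the sketch.

```csharp
// Sketch of an entity post-processing script, not a verified API surface.
// Goal: keep a spotted match only when the surrounding document provides
// supporting context, dropping coincidental hits.
//
// Assumed (hypothetical): each element of Tokens has a .Text property,
// Document.Text holds the full source field, and returning false drops
// the match while true keeps it.
var matchedText = string.Join(" ", Tokens.Select(t => t.Text));

// Example rule: only accept the match if the document also mentions
// a product keyword; otherwise treat it as a false positive.
bool hasContext = Document.Text.Contains("Widget",
    StringComparison.OrdinalIgnoreCase);

return hasContext;
```

Because the script runs on every match, keep it cheap: prefer checks against the already-loaded `Document` over graph lookups via `Q()` unless the decision genuinely needs graph state.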