# NLP Configuration
The workspace ships an NLP framework that extracts entities, links them to graph nodes, and produces document-level annotations searchable alongside structured data. This section is the operator's reference — what to configure, where to configure it, and how each model fits into a pipeline.
For per-page configuration screenshots, see the workspace UI itself; this section concentrates on the model.
## The pipeline shape
```mermaid
flowchart LR
  Doc[(Text<br/>e.g. SupportCase.Content)] --> Tokenizer
  Tokenizer --> POS[Part-of-speech<br/>tagging]
  POS --> Spotter[Spotter<br/>models]
  POS --> Pattern[Pattern<br/>spotters]
  Spotter --> Link[Entity linking]
  Pattern --> Link
  Link --> Post[Post-processing]
  Post --> Graph[(Annotated nodes<br/>+ link edges)]
```
You bind a pipeline to one or more node-field pairs (e.g. `SupportCase.Content`). Every ingested or reparsed value of that field runs through the pipeline.
## In this section
- NLP pipelines — assigning languages, modes, and target fields.
- Spotter models — gazetteer extraction (fixed terms / aliases).
- Pattern spotter — regex-style extraction for IDs and codes.
- Entity linking — wiring extracted terms to graph nodes.
- Entity post-processing — custom C# that runs on each match.
- Advanced extraction — experiments, reparsing, multilingual setup.
## When to use which model
| Goal | Model |
|---|---|
| Match a fixed vocabulary that already lives in the graph | Dynamic spotter — auto-syncs. |
| Match a curated alias list you maintain externally | Manual spotter — upload a list. |
| Capture identifier patterns (ticket IDs, part numbers, dates) | Pattern spotter. |
| Decide whether a match is correct based on document context | Entity post-processing. |
| Link captured terms to existing nodes via `_Mentions` edges | Entity linking. |
| Auto-create new nodes when a pattern hits unknown text | Entity linking → "Create new node". |
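To make the pattern-spotter row concrete, the sketch below shows the kind of regular expression a pattern spotter would be built around. The ticket-ID shape (`CASE-` plus 4–6 digits) and the surrounding harness are illustrative assumptions, not the framework's API — in practice you would enter only the pattern in the pattern-spotter configuration.

```csharp
using System;
using System.Text.RegularExpressions;

class TicketIdPatternSketch
{
    static void Main()
    {
        // Hypothetical ticket-ID shape: "CASE-" followed by 4 to 6 digits.
        var ticketId = new Regex(@"\bCASE-\d{4,6}\b");

        string text = "Customer referenced CASE-10492 and CASE-88 in the call.";
        foreach (Match m in ticketId.Matches(text))
            Console.WriteLine($"{m.Value} at offset {m.Index}");
        // CASE-88 is not matched: it has fewer than four digits.
    }
}
```

Anchoring the pattern with `\b` word boundaries keeps it from firing inside longer tokens, which matters once matches feed into entity linking.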
## Programmatic access
The configuration lives inside the workspace's graph (under internal nodes), but you typically author it through the UI. The points where you'll write C# are:
- Entity post-processing scripts — run per-match with `Tokens`, `Document`, `Graph`, and `Q()` in scope. See Entity post-processing.
- AI tool annotations — if you want a chat tool to depend on extracted entities, query the `_Mentions` edges directly with `Q().IsRelatedToVia(...)`.
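As an illustration of the shape a post-processing script might take: only the fact that `Tokens`, `Document`, `Graph`, and `Q()` are in scope is given above — the members used below (`Tokens` elements exposing a `Text` property, a `Document.Text` property, and a boolean return meaning keep/drop) are assumptions for the sketch.

```csharp
// Sketch of an entity post-processing script, not a verified API surface.
// Goal: keep a spotted match only when the surrounding document provides
// supporting context, dropping coincidental hits.
//
// Assumed (hypothetical): each element of Tokens has a .Text property,
// Document.Text holds the full source field, and returning false drops
// the match while true keeps it.
var matchedText = string.Join(" ", Tokens.Select(t => t.Text));

// Example rule: only accept the match if the document also mentions
// a product keyword; otherwise treat it as a false positive.
bool hasContext = Document.Text.Contains("Widget",
    StringComparison.OrdinalIgnoreCase);

return hasContext;
```

Because the script runs on every match, keep it cheap: prefer checks against the already-loaded `Document` over graph lookups via `Q()` unless the decision genuinely needs graph state.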