# NLP Overview

Natural Language Processing (NLP) in Curiosity Workspace turns raw text into structured signals you can search, filter, and connect to your graph.

You typically use NLP to:

- extract entities and concepts from text
- normalize content across languages and writing styles
- link mentions in text to existing nodes in your graph
- improve retrieval and downstream AI workflows

# How NLP fits into the platform

NLP interacts with:

- Graph: entities can become nodes; links become edges (mentions → entities).
- Search: extracted fields can be indexed and used as facets.
- AI: LLM workflows become more reliable when grounded on extracted and linked entities.
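
To make these interactions concrete, here is a minimal sketch in plain Python. It is not the Curiosity Workspace API: the record fields (`doc_id`, `entity_type`, `node_id`) and the helper names are assumptions chosen for illustration. It shows how a list of linked mentions could be turned into graph edges and into facet counts for search.

```python
from collections import Counter, defaultdict

# Hypothetical output of an NLP step: mentions already linked to node IDs.
# The field names below are illustrative assumptions, not a platform schema.
linked_mentions = [
    {"doc_id": "ticket-1", "entity_type": "Product", "node_id": "product:router-x"},
    {"doc_id": "ticket-1", "entity_type": "Customer", "node_id": "customer:acme"},
    {"doc_id": "ticket-2", "entity_type": "Product", "node_id": "product:router-x"},
]


def build_edges(mentions):
    """Graph view: one (document, 'mentions', entity) edge per distinct link."""
    return sorted({(m["doc_id"], "mentions", m["node_id"]) for m in mentions})


def build_facets(mentions):
    """Search view: per entity type, count the documents that mention each node."""
    seen = set()
    facets = defaultdict(Counter)
    for m in mentions:
        key = (m["doc_id"], m["node_id"])
        if key in seen:  # count each document at most once per entity
            continue
        seen.add(key)
        facets[m["entity_type"]][m["node_id"]] += 1
    return {etype: dict(counts) for etype, counts in facets.items()}


edges = build_edges(linked_mentions)
facets = build_facets(linked_mentions)
```

The same linked node IDs could also be passed to an LLM workflow as grounding context, so that the model refers to known entities rather than free-form strings.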

# Key building blocks

- Pipelines: a sequence of steps applied to a field (tokenization, entity detection, etc.)
- Models: spotters, patterns, and classifiers used by pipelines to capture entities
- Entity capture: the act of extracting entities from text into structured outputs
- Entity linking: connecting captured entities to existing nodes (or creating new ones when appropriate)
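
The building blocks above can be sketched end to end in a few lines of Python. This is a generic illustration under stated assumptions, not the platform's implementation: the `Mention` type, the dictionary spotter, the `ERR-` pattern, and the node ID format are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    entity_type: str
    start: int
    end: int


# Hypothetical models: a dictionary spotter and a regex pattern.
DICTIONARY = {"router-x": "Product", "acme": "Customer"}
ERROR_CODE = re.compile(r"ERR-\d{4}")


def tokenize(text):
    """Pipeline step 1: split text into tokens with character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+(?:-\w+)*", text)]


def capture(text):
    """Pipeline step 2: entity capture using the spotter and the pattern."""
    mentions = []
    for token, start, end in tokenize(text):
        if token.lower() in DICTIONARY:
            mentions.append(Mention(token, DICTIONARY[token.lower()], start, end))
        elif ERROR_CODE.fullmatch(token):
            mentions.append(Mention(token, "ErrorCode", start, end))
    return mentions


def link(mentions, nodes, create_missing=True):
    """Entity linking: reuse an existing node, or create one when allowed."""
    links = []
    for m in mentions:
        key = (m.entity_type, m.text.lower())
        if key not in nodes:
            if not create_missing:
                continue  # leave unknown entities unlinked
            nodes[key] = f"{m.entity_type.lower()}:{m.text.lower()}"
        links.append((m, nodes[key]))
    return links


text = "Router-X fails with ERR-4021 at Acme."
nodes = {("Product", "router-x"): "product:router-x"}  # existing graph nodes
mentions = capture(text)
links = link(mentions, nodes)
```

Here `Router-X` resolves to the existing node, while the error code and the customer get new nodes because `create_missing` is left at its default. Passing `create_missing=False` keeps the graph unchanged and links only known entities.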

# When to use NLP (and when not to)

Use NLP when:

- your critical information is embedded in free text (tickets, notes, transcripts)
- you need structured facets that don’t exist as explicit fields
- you want graph navigation from text mentions

Avoid overusing NLP when:

- your source already provides structured fields (use connector mapping first)
- entity capture would be too noisy without domain tuning
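
One rough way to judge whether entity capture is too noisy is to review a small sample by hand and estimate precision before enabling it broadly. The sketch below assumes you have exported captured entities as `(document, entity)` pairs and marked the correct ones yourself; the sample data and the function name are hypothetical.

```python
def precision(predicted, confirmed):
    """Share of captured entities that a reviewer confirmed as correct."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(confirmed)) / len(predicted)


# Hypothetical sample: "Apple" and "Orange" are sometimes fruit, not companies.
captured = {("ticket-1", "Apple"), ("ticket-2", "Apple"), ("ticket-3", "Orange")}
confirmed = {("ticket-1", "Apple")}

score = precision(captured, confirmed)
```

A low score on such a sample suggests tuning the models for your domain first, or relying on structured fields where they exist.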

# Next steps

- Configure semantic retrieval: Embeddings
- Extract structured signals: Entity Extraction
