# Monitoring

Monitoring ensures your Curiosity Workspace environment is healthy, performant, and secure. Effective monitoring covers:

  • availability (is the workspace reachable?)
  • performance (latency, throughput)
  • ingestion and background jobs (success/failure, duration)
  • search/index health
  • security signals (auth failures, token usage, admin actions)

# What to monitor (baseline)

  • Service health
    • uptime, restarts, error rates
  • Ingestion
    • last successful run time
    • items processed per run
    • failure rate and top failure reasons
  • Indexing
    • queue depth / progress (if available)
    • completion status for rebuilds
  • Query performance
    • slow queries (graph and search)
    • timeouts and resource saturation
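
The ingestion items in the list above (last successful run, failure rate, top failure reasons) can be derived from a simple run history. The following is a minimal sketch, not a Curiosity Workspace API: the `IngestionRun` record, its fields, and `summarize_runs` are hypothetical names, and it assumes you already collect run outcomes from your own connector logs or scheduler.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class IngestionRun:
    """Hypothetical record of one ingestion run (not a Workspace type)."""
    finished_at: datetime
    succeeded: bool
    items_processed: int
    failure_reason: Optional[str] = None


def summarize_runs(runs: List[IngestionRun], now: datetime, max_age: timedelta) -> dict:
    """Compute baseline ingestion health signals from a run history."""
    successes = [r for r in runs if r.succeeded]
    last_success = max((r.finished_at for r in successes), default=None)
    reasons = Counter(r.failure_reason or "unknown" for r in runs if not r.succeeded)
    return {
        "last_success": last_success,
        # Stale when there has never been a success or the last one is too old.
        "stale": last_success is None or now - last_success > max_age,
        "failure_rate": (len(runs) - len(successes)) / len(runs) if runs else 0.0,
        "top_failure_reasons": reasons.most_common(3),
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    history = [
        IngestionRun(now - timedelta(hours=5), True, 1200),
        IngestionRun(now - timedelta(hours=1), False, 0, "timeout"),
    ]
    print(summarize_runs(history, now, max_age=timedelta(hours=2)))
```

The `max_age` value is a judgment call; a reasonable starting point is roughly twice the expected interval between scheduled runs.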

# Logs and audit trails

Recommended log categories:

  • ingestion connector logs
  • endpoint invocation logs
  • admin configuration changes
  • authentication and authorization events

# Alerting

Start with a small set of high-signal alerts:

  • workspace unavailable
  • repeated ingestion failures
  • index rebuild stuck or failed
  • elevated error rate or latency regression
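
As an illustration of a high-signal rule, "repeated ingestion failures" can be expressed as a count of consecutive failures at the end of the run history, so that a single transient failure does not trigger an alert. This is a sketch under assumptions: the function names and the default threshold of 3 are illustrative, not Workspace settings.

```python
from typing import Sequence


def consecutive_failures(outcomes: Sequence[bool]) -> int:
    """Count trailing failures in a chronological list (True = success)."""
    count = 0
    for succeeded in reversed(outcomes):
        if succeeded:
            break
        count += 1
    return count


def should_alert(outcomes: Sequence[bool], threshold: int = 3) -> bool:
    """Alert only when the most recent `threshold` runs all failed."""
    return consecutive_failures(outcomes) >= threshold


if __name__ == "__main__":
    recent = [True, True, False, False, False]
    print(should_alert(recent))
```

Because the count resets on any success, the alert clears on its own once ingestion recovers.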

# Metrics and Observability

To gain deeper insights into your workspace, monitor these key metrics:

  • Ingestion Throughput: Number of nodes/edges created per minute.
  • Indexing Queue Length: Number of nodes pending indexing.
  • File Indexing Queue: Number of files pending processing.
  • Search Latency: P95 and P99 response times for search queries.
  • Resource Usage: CPU, RAM, and Disk I/O trends.
  • Error Rates: Frequency of 5xx responses or failed background tasks.
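
For reference, P95 and P99 are the latencies below which 95% and 99% of requests complete. The sketch below computes them with the nearest-rank method from a list of raw samples; it assumes you have per-request latencies available (for example, from your own request logs), and monitoring tools may use a different interpolation method and report slightly different values.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_ms = [120, 80, 95, 300, 110, 105, 90, 100, 85, 2000]
    print("P50:", percentile(latencies_ms, 50))
    print("P95:", percentile(latencies_ms, 95))
```

Tail percentiles are preferred over averages here because a few very slow queries can be invisible in the mean while still affecting users.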

# Dashboards and Alerts

# Built-in Dashboards

The Workspace UI includes a Monitoring tab with real-time charts for the most critical metrics. Use these to identify sudden spikes or performance regressions.

# External Integration (Prometheus/Grafana) (Coming soon)

Once this integration is available, Curiosity Workspace will expose a `/metrics` endpoint in Prometheus format. The intended setup is:

  1. Configure Prometheus to scrape the workspace URL.
  2. Import our official Grafana dashboard for a comprehensive view of system health.
  3. Set up alerts in Grafana for metrics exceeding your predefined thresholds.
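
To illustrate what threshold checks against Prometheus-format output look like, here is a minimal sketch that parses the text exposition format and reports which metrics exceed a limit. Since this integration is marked as coming soon, the metric names (`workspace_indexing_queue_length`, `workspace_http_errors_total`) and the thresholds are hypothetical placeholders, and the parser handles only simple sample lines.

```python
from typing import Dict, List

# Hypothetical sample of Prometheus text-format output; metric names are placeholders.
SAMPLE = """\
# HELP workspace_indexing_queue_length Nodes pending indexing.
# TYPE workspace_indexing_queue_length gauge
workspace_indexing_queue_length 1200
workspace_http_errors_total{code="500"} 7
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample line (name plus optional labels) to its value."""
    values: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Labeled sample: everything up to the last "}" is the series key.
            head, _, rest = line.rpartition("}")
            key = head + "}"
        else:
            key, _, rest = line.partition(" ")
        fields = rest.split()  # value, then an optional timestamp
        values[key] = float(fields[0])
    return values


def breached(values: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the series whose current value exceeds its threshold."""
    return sorted(k for k, limit in thresholds.items() if values.get(k, 0.0) > limit)


if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print(breached(metrics, {"workspace_indexing_queue_length": 1000}))
```

In a real deployment Prometheus and Grafana perform this scraping and evaluation for you; the sketch only shows the shape of the data and the comparison.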

# Log Structure

Workspace logs are written to the Docker console. To ingest them into log management tools (e.g., ELK Stack, Datadog), configure custom log sinks that emit structured JSON.
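
Once logs are emitted as JSON, they can be summarized with a few lines of code. The sketch below counts log lines by severity; the field name `level` and the sample records are assumptions for illustration, since the actual schema depends on the log sink you configure. Lines that are not valid JSON objects are counted separately rather than dropped.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_level(lines: Iterable[str]) -> Counter:
    """Count JSON log lines by their (assumed) "level" field."""
    counts: Counter = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["unparsed"] += 1  # plain-text or truncated line
            continue
        if isinstance(record, dict):
            counts[str(record.get("level", "unknown")).lower()] += 1
        else:
            counts["unparsed"] += 1  # valid JSON, but not an object
    return counts


if __name__ == "__main__":
    sample = [
        '{"level": "Error", "message": "connector failed"}',
        '{"level": "info", "message": "run complete"}',
        "plain text line",
    ]
    print(count_by_level(sample))
```

The same approach works on a stream, for example by passing `sys.stdin` when piping container log output into the script.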

# Next steps