Curiosity

Cost and operations

An agent makes one model call per step, so a 5-step agent makes 5 model calls. Model choice and tool design are your main cost levers.

Model selection by role:

Role	Model
Tool routing / classification	`gpt-4o-mini`, `claude-haiku-4-5`
Final answer synthesis	`gpt-4o`, `claude-sonnet-4-6`
Air-gapped / no egress	Local 70B on GPU

Consider using a smaller model for tool selection and a larger one only for the final answer.
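As a rough illustration of that split, the sketch below maps each role to a model from the table and compares the cost of a mixed run against sending every call to the larger model. The role keys (`routing`, `synthesis`), the `Price` type, and the helper functions are hypothetical names introduced for this example, and the prices in the usage note are placeholders rather than real vendor pricing.

```python
from dataclasses import dataclass

# Role -> model, using models named in the table above.
# The role keys themselves are hypothetical labels for this sketch.
MODEL_BY_ROLE = {
    "routing": "gpt-4o-mini",
    "synthesis": "gpt-4o",
}


@dataclass(frozen=True)
class Price:
    """Price per 1,000 tokens, supplied by the caller (no real pricing here)."""
    input_per_1k: float
    output_per_1k: float


def call_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single model call at the given price."""
    return (input_tokens / 1000) * price.input_per_1k \
        + (output_tokens / 1000) * price.output_per_1k


def run_cost(steps, prices, model_by_role=MODEL_BY_ROLE) -> float:
    """Total cost of an agent run.

    steps:  list of (role, input_tokens, output_tokens), one entry per model call
    prices: dict mapping model name -> Price
    """
    return sum(
        call_cost(prices[model_by_role[role]], in_tok, out_tok)
        for role, in_tok, out_tok in steps
    )
```

For example, with placeholder prices (arbitrary units, not actual rates), a 5-step run with four routing calls and one synthesis call can be compared against an all-large configuration:

```python
prices = {
    "gpt-4o-mini": Price(input_per_1k=1.0, output_per_1k=2.0),   # placeholder
    "gpt-4o": Price(input_per_1k=10.0, output_per_1k=20.0),      # placeholder
}
steps = [("routing", 1000, 100)] * 4 + [("synthesis", 2000, 500)]

mixed = run_cost(steps, prices)
all_large = run_cost(
    steps, prices, model_by_role={"routing": "gpt-4o", "synthesis": "gpt-4o"}
)
print(mixed, all_large)
```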

Cost guardrails:

Monitor per-tool metrics:

GET /api/chatai/tools/metrics

Returns per-tool call counts, latency (including p95), and error rate. Wire these into your monitoring stack.
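One way to wire this into monitoring is to poll the endpoint and flag tools that cross a threshold. The sketch below is a minimal example under stated assumptions: the response schema is not documented here, so the JSON shape (a list of objects with `tool`, `error_rate`, and `p95_ms` fields) is assumed, as are the threshold defaults and the helper names. Authentication is omitted and would likely be needed in practice.

```python
import json
from urllib.request import urlopen


def fetch_metrics(base_url: str) -> list:
    """GET the per-tool metrics endpoint and decode the JSON body.

    Assumes an unauthenticated endpoint that returns a JSON list;
    adjust for your deployment.
    """
    with urlopen(f"{base_url}/api/chatai/tools/metrics") as resp:
        return json.load(resp)


def flag_tools(metrics, max_error_rate=0.05, max_p95_ms=2000.0):
    """Return (tool, reasons) pairs for tools exceeding a threshold.

    Field names `tool`, `error_rate`, and `p95_ms` are assumptions
    about the response shape, not a documented schema.
    """
    flagged = []
    for entry in metrics:
        reasons = []
        if entry["error_rate"] > max_error_rate:
            reasons.append("error_rate")
        if entry["p95_ms"] > max_p95_ms:
            reasons.append("p95_ms")
        if reasons:
            flagged.append((entry["tool"], reasons))
    return flagged
```

The flagged list can then be forwarded to whatever alerting channel your monitoring stack already uses; keeping the threshold check separate from the HTTP call makes it easy to test against a recorded response.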

Security checklist for tools: