Documentation
Everything you need to instrument your AI agents and get quality monitoring in minutes.
Quick Start
Get from zero to quality monitoring in under 5 minutes.
1. Install the SDK

```typescript
$ npm install @sentygent/sdk
```

2. Create an agent in the dashboard
Go to app.sentygent.com, create a new agent, and copy its slug. You will also find your API key under Settings.
3. Set environment variables

```typescript
SENTYGENT_API_KEY=sk-...
SENTYGENT_SERVICE=my-agent # optional, defaults to 'default'
```

4. Instrument your agent
```typescript
import Anthropic from '@anthropic-ai/sdk';
import { SentygentClient, instrumentAnthropic } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: process.env.SENTYGENT_SERVICE, // optional
});

const anthropic = instrumentAnthropic(new Anthropic(), sentygent);
// Use anthropic as normal — all calls traced automatically

// Always call shutdown() before process exits
await sentygent.shutdown();
```

Anthropic
Use instrumentAnthropic to wrap your Anthropic client. All messages.create() calls are captured automatically — model, tokens, cost, and latency.
```typescript
import Anthropic from '@anthropic-ai/sdk';
import { SentygentClient, instrumentAnthropic } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: 'my-chatbot',
  debug: true,
});

// Wrap once — all subsequent calls are captured
const anthropic = instrumentAnthropic(new Anthropic(), sentygent);

async function chat(userMessage: string) {
  await sentygent.trace(`chat-${Date.now()}`, async (span) => {
    span.captureLifecycle('message_received', { content: userMessage });
    const response = await anthropic.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1024,
      messages: [{ role: 'user', content: userMessage }],
    });
    // Content blocks are a union type, so narrow before reading .text
    const block = response.content[0];
    const text = block.type === 'text' ? block.text : '';
    span.captureLifecycle('message_sent', { content: text });
  });
}

await sentygent.shutdown();
```

Note: Always call await sentygent.shutdown() before your process exits. This flushes any pending events to the Sentygent backend.
AWS Bedrock
Use instrumentBedrock to wrap your BedrockRuntimeClient. Supports ConverseCommand with automatic token tracking. Use sentygent.request() for request-response patterns.
```typescript
import { BedrockRuntimeClient, ConverseCommand } from '@aws-sdk/client-bedrock-runtime';
import { SentygentClient, instrumentBedrock } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: 'my-bedrock-agent',
});

const bedrock = instrumentBedrock(
  new BedrockRuntimeClient({ region: process.env.AWS_REGION }),
  sentygent,
);

await sentygent.request(`session-${Date.now()}`, async (span) => {
  span.captureLifecycle('message_received', { content: userQuestion });

  // Optional: capture RAG retrieval step
  await span.captureRetrieval({
    provider: 'pinecone',
    query: userQuestion,
    execute: () => retrieveContext(userQuestion),
    extractResults: (r) => ({ resultsCount: r.chunks.length }),
    searchType: 'semantic',
  });

  const response = await bedrock.send(new ConverseCommand({
    modelId: 'anthropic.claude-3-5-sonnet-20241022-v2:0',
    messages: [{ role: 'user', content: [{ text: userQuestion }] }],
    inferenceConfig: { maxTokens: 512 },
  }));

  span.captureLifecycle('message_sent', {
    content: response.output?.message?.content?.[0]?.text,
  });
});
```

OpenAI
Use instrumentOpenAI to wrap your OpenAI client. All chat.completions.create() calls are captured automatically — model, tokens, cost, and latency. Streaming is supported.
```typescript
import OpenAI from 'openai';
import { SentygentClient, instrumentOpenAI } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: 'my-agent',
});

const openai = instrumentOpenAI(new OpenAI(), sentygent);

// All calls are captured automatically
await sentygent.trace('conv-123', async () => {
  const msg = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: 'Hello!' }],
  });
});
```

Tracing Patterns
Choose the tracing pattern that fits your architecture.
| Pattern | Use when | Lifecycle |
|---|---|---|
| trace(id, fn) | Wrapping logic in a callback | Callback-scoped, auto-instrumentors work |
| request(id, fn) | Each HTTP request = one trace | Auto conversation_start / conversation_end |
| startTrace(id) | Chatbots, REPLs, event-driven apps | Returns a Span directly, you control timing |
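For instance, the request(id, fn) pattern maps one HTTP request to one trace. The sketch below shows the shape of that pattern; a minimal local stub stands in for SentygentClient so the example is self-contained, and in real code you would construct the client from @sentygent/sdk instead.

```typescript
// Sketch only: a stub client standing in for SentygentClient, so the
// request-per-trace pattern is visible and runnable on its own.
type Span = { captureLifecycle: (event: string, data?: object) => void };

const sentygent = {
  // The real request() also emits conversation_start / conversation_end.
  async request(id: string, fn: (span: Span) => Promise<void>): Promise<void> {
    const span: Span = { captureLifecycle: () => {} };
    await fn(span);
  },
};

// One incoming request = one trace, bracketed by lifecycle events.
async function handleChat(question: string): Promise<string> {
  let answer = '';
  await sentygent.request(`req-${Date.now()}`, async (span) => {
    span.captureLifecycle('message_received', { content: question });
    answer = `echo: ${question}`; // stand-in for the instrumented LLM call
    span.captureLifecycle('message_sent', { content: answer });
  });
  return answer;
}
```

In a web framework you would call handleChat from your route handler, so each request gets its own trace ID.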
Imperative tracing with startTrace()
For event-driven architectures, REPLs, or long-running conversations where you can't wrap everything in a callback:
```typescript
import { SentygentClient } from '@sentygent/sdk';

const sentygent = new SentygentClient({ apiKey: '...', service: 'my-chatbot' });

// Get a span without a callback — use it across turns
const span = sentygent.startTrace('conv-123');
span.captureLifecycle('conversation_start');

// ... later, in event handlers ...
span.reportLLM({ provider: 'openai', model: 'gpt-4o', duration: 500, ... });

// When done — triggers flush + quality scoring
span.captureLifecycle('conversation_end');
```

Vercel AI SDK
Use instrumentVercelAI to wrap Vercel AI SDK functions. All generateText(), streamText(), and generateObject() calls are captured automatically.
```typescript
import { generateText, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { SentygentClient, instrumentVercelAI } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: 'my-nextjs-app',
});

const ai = instrumentVercelAI({ generateText, streamText }, sentygent);

// Use ai.generateText / ai.streamText as normal
const result = await ai.generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Hello!',
});
```

Other Providers
For Cohere, Mistral, Groq, or any provider without auto-instrumentation, use span.captureLLM() to manually capture calls. Use span.reportLLM() if you already have the result and measured timing yourself.
```typescript
await span.captureLLM({
  provider: 'other',
  model: 'mistral-large-latest',
  execute: () => mistral.chat.complete({ ... }),
  extractUsage: (r) => ({
    promptTokens: r.usage?.prompt_tokens ?? 0,
    completionTokens: r.usage?.completion_tokens ?? 0,
  }),
});
```

Multi-agent
Use span.child(name, { agent: 'slug' }) to create child spans for sub-agents. Each sub-agent appears separately in the dashboard with its own cost breakdown.
Prerequisites: Register each agent slug (e.g. orchestrator, research-agent, writer-agent) in the Sentygent dashboard before running.
```typescript
import { SentygentClient } from '@sentygent/sdk';

const sentygent = new SentygentClient({
  apiKey: process.env.SENTYGENT_API_KEY,
  service: 'orchestrator',
});

await sentygent.trace(`multi-agent-${Date.now()}`, async (span) => {
  span.captureLifecycle('message_received', { content: userMessage });

  // Research sub-agent — appears as separate agent in dashboard
  const researchSpan = span.child('research', { agent: 'research-agent' });
  const research = await researchSpan.captureLLM({
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
    execute: () => callLLM('Summarize AI safety research findings'),
    extractUsage: (r) => r.usage,
  });

  // Writer sub-agent — per-agent cost visible in dashboard
  const writerSpan = span.child('write', { agent: 'writer-agent' });
  await writerSpan.captureLLM({
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
    execute: () => callLLM('Write polished summary from research notes'),
    extractUsage: (r) => r.usage,
  });

  span.captureLifecycle('message_sent', { content: research.text });
});

await sentygent.shutdown();
```

RAG / Retrieval
Use span.captureRetrieval() to instrument knowledge base lookups, vector searches, or any retrieval step. This surfaces retrieval quality metrics in your dashboard.
```typescript
await span.captureRetrieval({
  provider: 'pinecone', // your KB / vector DB name
  query: userQuestion,
  execute: () => vectorSearch(userQuestion),
  extractResults: (r) => ({
    resultsCount: r.chunks.length,
    relevantCount: r.chunks.filter((c) => c.score >= 0.5).length,
    meanScore: r.chunks.reduce((s, c) => s + c.score, 0) / r.chunks.length,
    topScore: Math.max(...r.chunks.map((c) => c.score)),
  }),
  searchType: 'semantic',
  tags: { step: 'retrieve' },
});
```

captureRetrieval options
| Option | Type | Description |
|---|---|---|
| provider | string | Knowledge base or vector DB name |
| query | string | The search query |
| execute | () => Promise&lt;T&gt; | Function that performs the retrieval |
| extractResults | (r: T) => object | Extract metrics from the result |
| searchType | 'semantic' \| 'keyword' \| string | Type of search performed |
| tags | Record&lt;string, string&gt; | Dimensional metadata for filtering |
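The extractResults callback is a pure mapping from your retriever's raw response to these metrics, so it can be factored into a small helper. The sketch below does exactly that; the chunk shape and the 0.5 relevance cutoff are illustrative assumptions about your vector store's response, not something the SDK prescribes.

```typescript
// Pure helper computing the retrieval metrics shown in the example above.
// Guards against empty result sets, which the inline version would not.
type Chunk = { score: number };

function retrievalMetrics(chunks: Chunk[], relevanceThreshold = 0.5) {
  const scores = chunks.map((c) => c.score);
  return {
    resultsCount: chunks.length,
    relevantCount: scores.filter((s) => s >= relevanceThreshold).length,
    meanScore: scores.reduce((sum, s) => sum + s, 0) / (scores.length || 1),
    topScore: scores.length ? Math.max(...scores) : 0,
  };
}
```

With this in place the option becomes extractResults: (r) => retrievalMetrics(r.chunks).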
API Reference
Key methods of the Sentygent SDK.
SentygentClient
| Method | Description |
|---|---|
| new SentygentClient({ apiKey, service? }) | Initialize the client. service is optional (defaults to 'default'). Also accepts agent as alias. |
| sentygent.trace(id, fn) | Start a trace for a full conversation or workflow. |
| sentygent.request(id, fn) | Start a trace for a single request-response cycle (auto start/end). |
| sentygent.startTrace(id) | Imperative trace — returns a Span directly. For chatbots, REPLs, event-driven apps. |
| sentygent.shutdown() | Flush pending events and shut down. Call before process exit. |
Instrumentation helpers
| Function | Description |
|---|---|
| instrumentOpenAI(client, sentygent) | Wrap OpenAI SDK. All chat.completions.create() calls auto-traced (streaming + non-streaming). |
| instrumentAnthropic(client, sentygent) | Wrap Anthropic SDK. All messages.create() calls auto-traced. |
| instrumentBedrock(client, sentygent) | Wrap BedrockRuntimeClient. Supports ConverseCommand. |
| instrumentVercelAI({ generateText, ... }, sentygent) | Wrap Vercel AI SDK functions. Supports generateText, streamText, generateObject. |
Span methods
| Method | Description |
|---|---|
| span.captureLifecycle(event, data?) | Record a lifecycle event (e.g. message_received, message_sent). |
| span.llm(execute, options?) | Simplified LLM capture. Auto-detects provider, model, tokens from OpenAI/Anthropic/Vercel AI responses. |
| span.captureLLM(options) | Full LLM capture with execute callback. You provide provider, model, extractUsage. |
| span.reportLLM(options) | Report a pre-executed LLM call. You provide timing and usage — no callback needed. |
| span.captureTool(options) | Capture a tool call with execute callback. SDK measures timing. |
| span.reportTool(options) | Report a pre-executed tool call. You provide timing and result — no callback needed. |
| span.captureRetrieval(options) | Capture a RAG/retrieval step with execute callback. SDK measures timing. |
| span.reportRetrieval(options) | Report a pre-executed retrieval. You provide timing and metrics — no callback needed. |
| span.child(name, { agent }) | Create a child span for a sub-agent with its own slug. |
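The capture* and report* pairs split the same job two ways: capture* runs your callback and measures timing for you, while report* hands the SDK a call you already executed and timed yourself. The sketch below illustrates the report* side with reportTool; the option names (durationMs, result) are illustrative assumptions, so check the SDK's types for the exact fields, and a tiny stub stands in for a real span so the example runs on its own.

```typescript
// Stub span: records reports locally instead of sending them, so the
// shape of the report* pattern is visible without the SDK.
type ToolReport = { name: string; durationMs: number; result: unknown };
const reported: ToolReport[] = [];
const span = { reportTool: (r: ToolReport) => { reported.push(r); } };

// Stand-in tool; real tools are usually async.
function lookupWeather(city: string): string {
  return `sunny in ${city}`;
}

// You execute and time the call yourself, then report the measurement.
const start = Date.now();
const result = lookupWeather('Oslo');
span.reportTool({ name: 'lookupWeather', durationMs: Date.now() - start, result });
```

Prefer captureTool when the SDK can wrap the call directly; reach for reportTool only when the call already happened elsewhere (e.g. inside a framework hook that hands you the finished result).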