API Reference
Complete reference for @isoldex/sentinel. For a guided introduction, see the Getting Started docs.
Sentinel class
The main class. Wraps a Playwright browser instance, manages the LLM provider, and exposes all automation methods.
new Sentinel(options)
import { Sentinel } from '@isoldex/sentinel';
const sentinel = new Sentinel({
  apiKey: process.env.GEMINI_API_KEY,
  headless: true,
  browser: 'chromium',
  verbose: 1,
  enableCaching: true,
});
await sentinel.init();
// ...
await sentinel.close();
sentinel.init()
Launches the browser and creates a browser context. Must be called before any other method. Returns Promise<void>.
sentinel.goto(url, options?)
Navigates to a URL and waits for networkidle by default. Accepts all Playwright GotoOptions as the optional second argument.
sentinel.close()
Closes all pages, the browser context, and the browser. Always call in a finally block.
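The advice to call close() in a finally block can be sketched as a small wrapper. This is a minimal illustration, not part of the library: `withSession` and the `Lifecycle` interface are hypothetical names, and the interface only mirrors the init() and close() methods documented above.

```typescript
// Structural stand-in for the two lifecycle methods documented above.
interface Lifecycle {
  init(): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical helper (not part of the library): runs `work` between
// init() and close(), closing the session even when `work` throws.
async function withSession<S extends Lifecycle, T>(
  session: S,
  work: (session: S) => Promise<T>,
): Promise<T> {
  await session.init();
  try {
    return await work(session);
  } finally {
    // Runs whether `work` resolved or rejected, so the browser is not leaked.
    await session.close();
  }
}
```

Under these assumptions, a Sentinel instance could be passed as the first argument, with the automation steps placed in the callback.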
Static methods
Sentinel.parallel(tasks, options)
const results = await Sentinel.parallel(
  [
    { url: 'https://site-a.com', goal: 'Extract product name and price' },
    { url: 'https://site-b.com', goal: 'Extract product name and price' },
  ],
  {
    apiKey: process.env.GEMINI_API_KEY,
    concurrency: 2,
    headless: true,
  }
);
// results: ParallelResult[]
results.forEach((r) => {
  if (r.success) console.log(r.data);
  else console.error(r.error);
});
Sentinel.extend(page, options)
Adds act(), extract(), and observe() to an existing Playwright Page object. See the Page extension section.
act()
Executes a single natural-language action. Returns an ActionResult.
sentinel.act(instruction, options?)
// Basic
const result = await sentinel.act('click the login button');
// With variables and an explicit retry count
const fillResult = await sentinel.act('fill email with %email%', {
  variables: { email: 'user@example.com' },
  retries: 3,
});
// Inspect the result of the first (click) call
console.log(result.success); // true
console.log(result.action); // 'click'
console.log(result.selector); // '#login-btn'
console.log(result.attempts); // 1
All action types
// All supported action types:
await sentinel.act('click the submit button');
await sentinel.act('fill the email field with user@example.com');
await sentinel.act('append " (edited)" to the title field');
await sentinel.act('hover over the user avatar');
await sentinel.act('press Enter');
await sentinel.act('select "Germany" from the country dropdown');
await sentinel.act('double-click the image');
await sentinel.act('right-click the file row');
await sentinel.act('scroll down 300 pixels');
await sentinel.act('scroll up');
await sentinel.act('scroll to the footer');
extract()
Extracts structured data from the current page. Accepts a Zod schema or a JSON Schema object. Returns the typed extracted data directly (not wrapped in a result object).
sentinel.extract(instruction, schema)
import { z } from 'zod';
// With Zod schema (recommended; gives full TypeScript inference)
const product = await sentinel.extract(
  'Extract product name, price, and rating',
  z.object({
    name: z.string(),
    price: z.string(),
    rating: z.number(),
  })
);
// product is typed as { name: string; price: string; rating: number }
console.log(product.name);
// With plain JSON schema
const raw = await sentinel.extract(
  'Extract all review titles',
  {
    type: 'object',
    properties: { titles: { type: 'array', items: { type: 'string' } } },
  }
);
observe()
Returns a natural-language description of the interactive elements currently visible on the page. Useful for debugging, building dynamic agents, or deciding what action to take next.
sentinel.observe(instruction?)
const elements = await sentinel.observe();
// Returns a description of all interactive elements currently visible
const specific = await sentinel.observe('find all buttons in the checkout section');
console.log(specific);
// "checkout-btn: [button] 'Proceed to checkout' at #checkout-form > button.primary
// back-btn: [button] 'Back to cart' at #checkout-form > button.secondary"
run() — Agent
Runs an autonomous agent loop. The agent plans, executes, verifies, and reflects until the goal is achieved or maxSteps is exceeded. Includes built-in loop detection.
sentinel.run(goal, options?)
const result = await sentinel.run(
  'Find the cheapest laptop under €500 and add it to the cart',
  {
    maxSteps: 25,
    onStep: (event) => {
      console.log(`[${event.step}/${event.totalSteps}] ${event.action.instruction}`);
      console.log(` → ${event.action.action} on ${event.action.selector}`);
    },
  }
);
console.log(result.goalAchieved); // true
console.log(result.totalSteps); // 14
console.log(result.data); // { name: '...', price: '€349' }
console.log(result.selectors); // { addToCartBtn: '#add-to-cart', ... }
console.log(result.message); // "Goal achieved: ..."
Session & Navigation
Session persistence (cookies + localStorage)
// Save session (cookies + localStorage) for reuse
const sentinel = new Sentinel({
  apiKey: process.env.GEMINI_API_KEY,
  sessionPath: './sessions/github.json',
});
await sentinel.init();
// Open the page that shows the login form when no session is active
await sentinel.goto('https://github.com/login');
// Log in only if a login form is still shown
if (await sentinel.hasLoginForm()) {
  await sentinel.act('fill username with myuser');
  await sentinel.act('fill password with %pw%', { variables: { pw: process.env.GH_PASS! } });
  await sentinel.act('click sign in');
  await sentinel.saveSession();
}
// On the next run, cookies are restored automatically
userDataDir — IndexedDB / Service Workers
// Persist IndexedDB, WebSQL, Service Workers (e.g. WhatsApp Web, Notion)
const sentinel = new Sentinel({
  apiKey: process.env.GEMINI_API_KEY,
  userDataDir: './profiles/whatsapp',
});
await sentinel.init();
await sentinel.goto('https://web.whatsapp.com');
// QR scan only needed on first run
| Property | Type | Default | Description |
|---|---|---|---|
| sentinel.saveSession() | Promise<void> | — | Saves current cookies + localStorage to sessionPath. |
| sentinel.hasLoginForm() | Promise<boolean> | — | Returns true if a login/sign-in form is detected on the current page. |
Tab management
sentinel.newTab / switchTab / closeTab / tabCount
// Open a new tab
const tabId = await sentinel.newTab('https://example.com');
// Switch to it
await sentinel.switchTab(tabId);
// How many tabs are open?
const count = await sentinel.tabCount();
// Close current tab
await sentinel.closeTab();
| Property | Type | Default | Description |
|---|---|---|---|
| newTab(url?) | Promise<string> | — | Opens a new browser tab, optionally navigating to url. Returns a tabId. |
| switchTab(tabId) | Promise<void> | — | Switches focus to the tab with the given tabId. |
| closeTab(tabId?) | Promise<void> | — | Closes the specified tab, or the current tab if no tabId is provided. |
| tabCount() | Promise<number> | — | Returns the number of currently open tabs. |
Record & Replay
Record actions, export as TypeScript or JSON
// Start recording
await sentinel.startRecording();
// Perform actions normally
await sentinel.goto('https://github.com');
await sentinel.act('click Sign in');
await sentinel.act('fill username with myuser');
// Stop recording (returns the workflow as a JSON string)
const recorded = await sentinel.stopRecording();
// Export the recorded workflow as TypeScript source or as JSON
const code = await sentinel.exportWorkflowAsCode();
const json = await sentinel.exportWorkflowAsJSON();
// Replay later
await sentinel.replay(JSON.parse(json));
| Property | Type | Default | Description |
|---|---|---|---|
| startRecording() | Promise<void> | — | Starts capturing all act() calls into a workflow. |
| stopRecording() | Promise<string> | — | Stops recording and returns the workflow as JSON string. |
| exportWorkflowAsCode() | Promise<string> | — | Returns the recorded workflow as executable TypeScript source code. |
| exportWorkflowAsJSON() | Promise<string> | — | Returns the recorded workflow as a JSON string for storage or replay. |
| replay(workflow) | Promise<void> | — | Replays a recorded workflow (parsed JSON object). |
Vision
When visionFallback: true is set, Sentinel automatically falls back to screenshot-based grounding when the accessibility tree fails to locate an element (Canvas, complex Shadow DOM, non-standard widgets).
sentinel.screenshot() / describeScreen()
// Enable vision grounding (screenshot fallback for Canvas / Shadow DOM)
const sentinel = new Sentinel({
  apiKey: process.env.GEMINI_API_KEY,
  visionFallback: true,
});
// init() must run before any other method
await sentinel.init();
// Take a screenshot (returns Buffer)
const buf = await sentinel.screenshot();
// Describe current screen state
const description = await sentinel.describeScreen();
console.log(description);
// "A checkout page with a total of €349 visible in the top right..."
| Property | Type | Default | Description |
|---|---|---|---|
| screenshot() | Promise<Buffer> | — | Takes a full-page screenshot and returns the PNG buffer. |
| describeScreen() | Promise<string> | — | Returns an LLM-generated natural-language description of the current viewport. |
Page extension
Sentinel.extend(page, options) enriches an existing Playwright Page object with AI methods without creating a new browser instance. Ideal for incremental adoption in existing Playwright test suites.
Sentinel.extend(page, options)
import { chromium } from 'playwright';
import { expect } from '@playwright/test';
import { Sentinel } from '@isoldex/sentinel';
const browser = await chromium.launch();
const page = await browser.newPage();
// Extend an existing page object
const ai = await Sentinel.extend(page, { apiKey: process.env.GEMINI_API_KEY });
await page.goto('https://example.com');
// Now use act/extract/observe alongside existing Playwright APIs
await ai.act('click the login button');
// extract() takes a Zod schema or a JSON Schema object
const data = await ai.extract('get the page title', {
  type: 'object',
  properties: { title: { type: 'string' } },
});
// Standard Playwright still works
await expect(page.locator('h1')).toBeVisible();
Events & token tracking
sentinel.on() / getTokenUsage() / exportLogs()
// Listen to events
sentinel.on('action', (e) => console.log('action:', e));
sentinel.on('navigate', (e) => console.log('navigated to:', e.url));
sentinel.on('close', () => console.log('browser closed'));
// Token usage
const usage = sentinel.getTokenUsage();
console.log(usage);
// { promptTokens: 12400, completionTokens: 840, totalTokens: 13240, estimatedCost: 0.0018 }
// Export all logs
const logs = sentinel.exportLogs();
| Property | Type | Default | Description |
|---|---|---|---|
| on('action', cb) | void | — | Emitted after every act() call. cb receives ActionResult. |
| on('navigate', cb) | void | — | Emitted on every page navigation. cb receives { url }. |
| on('close', cb) | void | — | Emitted when the browser is closed. |
| getTokenUsage() | TokenUsage | — | Returns cumulative prompt/completion/total tokens and estimated cost. |
| exportLogs() | LogEntry[] | — | Returns all recorded log entries for the current session. |
| sentinel.page | Page | — | Direct access to the underlying Playwright Page object. |
| sentinel.context | BrowserContext | — | Direct access to the underlying Playwright BrowserContext. |
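Because getTokenUsage() is described as cumulative, the cost of a single step can be estimated by taking a snapshot before and after it and subtracting. The sketch below assumes only the four fields shown in the example output above; `TokenUsageLike` and `usageDelta` are hypothetical names, not library exports.

```typescript
// Local mirror of the fields shown in the getTokenUsage() example output.
interface TokenUsageLike {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  estimatedCost: number;
}

// Hypothetical helper: usage attributable to whatever ran between two
// cumulative snapshots.
function usageDelta(before: TokenUsageLike, after: TokenUsageLike): TokenUsageLike {
  return {
    promptTokens: after.promptTokens - before.promptTokens,
    completionTokens: after.completionTokens - before.completionTokens,
    totalTokens: after.totalTokens - before.totalTokens,
    estimatedCost: after.estimatedCost - before.estimatedCost,
  };
}
```

Since estimatedCost is a floating-point number, differences are best compared with a small tolerance rather than exact equality.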
SentinelOptions
| Property | Type | Default | Description |
|---|---|---|---|
| apiKey | string | — | API key for the selected LLM provider. Required for Gemini and OpenAI. |
| headless | boolean | true | Run browser in headless mode. |
| browser | 'chromium' \| 'firefox' \| 'webkit' | 'chromium' | Playwright browser engine. |
| viewport | { width, height } | 1280×720 | Browser viewport dimensions. |
| verbose | 0 \| 1 \| 2 \| 3 | 0 | Log level. 3 = full LLM reasoning JSON. |
| enableCaching | boolean | false | Enable locator + prompt cache to cut repeated costs to near-zero. |
| visionFallback | boolean | false | Enable screenshot-based grounding as fallback for inaccessible elements. |
| provider | LLMProvider | — | Custom LLM provider instance. Overrides apiKey + default Gemini. |
| sessionPath | string | — | File path to save/restore cookies and localStorage. |
| userDataDir | string | — | Persistent browser profile directory (preserves IndexedDB, Service Workers). |
| proxy | ProxyOptions | — | Proxy configuration. See ProxyOptions. |
| humanLike | boolean | false | Add random delays between actions to simulate human behaviour. |
| domSettleTimeoutMs | number | 500 | Milliseconds to wait for DOM mutations to settle after each action. |
| locatorCache | ILocatorCache | — | Custom locator cache implementation (default: in-memory Map). |
| promptCache | IPromptCache | — | Custom prompt cache implementation (default: in-memory Map). |
| maxElements | number | 150 | Max interactive elements included in each LLM context snapshot. |
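The examples on this page pass process.env.GEMINI_API_KEY straight to apiKey, which is typed string in the table above, while an unset environment variable is undefined at runtime. A small guard makes a missing key fail early with a readable message. `requireEnv` is a hypothetical helper, not part of the library, and takes the environment map as a parameter so it stays dependency-free.

```typescript
// Hypothetical helper (not part of the library): read a required variable
// from an environment map and fail fast if it is unset or blank.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In a Node.js script this would typically be called as requireEnv('GEMINI_API_KEY', process.env) when building the options object.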
ActOptions
| Property | Type | Default | Description |
|---|---|---|---|
| variables | Record<string, string> | — | Variable interpolation map. Keys are used as %key% in instruction strings. |
| retries | number | 3 | Number of action attempts before throwing ActionError. |
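To make the %key% convention concrete, here is a minimal sketch of how such placeholders could be substituted. It is an illustration only: `interpolate` is a hypothetical function, and the library's own handling (for example of unknown keys) may differ.

```typescript
// Hypothetical helper illustrating %key% substitution. Unknown keys are
// left untouched here; that behaviour is an assumption, not documented.
function interpolate(instruction: string, variables: Record<string, string>): string {
  return instruction.replace(/%(\w+)%/g, (match: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(variables, key)
      ? variables[key]
      : undefined;
    return value ?? match;
  });
}
```

With this sketch, interpolate('fill email with %email%', { email: 'user@example.com' }) yields 'fill email with user@example.com'.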
ActionResult
| Property | Type | Default | Description |
|---|---|---|---|
| success | boolean | — | Whether the action completed successfully. |
| action | string | — | The action type that was executed (e.g. 'click', 'fill'). |
| instruction | string | — | The original instruction string. |
| selector | string \| null | — | The CSS selector used for the action, if applicable. |
| message | string | — | Human-readable outcome message. |
| attempts | number | — | Number of attempts made before success or final failure. |
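The relationship between retries, attempts, and the final error can be sketched as a plain retry loop. This is an assumption about the general pattern, not the library's implementation: `withRetries`, `OutcomeLike`, and `RetriesExhaustedError` are hypothetical names standing in for act(), ActionResult, and ActionError.

```typescript
// Local stand-in for the ActionResult fields used in this sketch.
interface OutcomeLike {
  success: boolean;
  message: string;
}

// Stand-in for ActionError; only the documented `attempts` field is modelled.
class RetriesExhaustedError extends Error {
  attempts: number;
  constructor(message: string, attempts: number) {
    super(message);
    this.name = 'RetriesExhaustedError';
    this.attempts = attempts;
  }
}

// Calls `attempt` up to `retries` times and reports how many tries it took.
async function withRetries(
  attempt: (attemptNumber: number) => Promise<OutcomeLike>,
  retries = 3,
): Promise<OutcomeLike & { attempts: number }> {
  let lastMessage = 'no attempts were made';
  for (let n = 1; n <= retries; n++) {
    const outcome = await attempt(n);
    if (outcome.success) return { ...outcome, attempts: n };
    lastMessage = outcome.message;
  }
  // Every attempt failed: surface the last failure message.
  throw new RetriesExhaustedError(lastMessage, Math.max(retries, 0));
}
```

In this sketch a success on the second try resolves with attempts equal to 2, and three consecutive failures reject with an error whose attempts field is 3.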
AgentRunOptions
| Property | Type | Default | Description |
|---|---|---|---|
| maxSteps | number | 20 | Maximum number of agent steps before aborting. |
| onStep | (event: AgentStepEvent) => void | — | Callback invoked after each step with real-time progress. |
AgentStepEvent
| Property | Type | Default | Description |
|---|---|---|---|
| step | number | — | Current step index (1-based). |
| totalSteps | number | — | Maximum steps configured. |
| action | ActionResult | — | Result of the action taken in this step. |
| reasoning | string | — | LLM reasoning for choosing this action. |
AgentResult
| Property | Type | Default | Description |
|---|---|---|---|
| success | boolean | — | True if the agent completed without throwing. |
| goalAchieved | boolean | — | True if the LLM confirmed the goal was met. |
| totalSteps | number | — | Number of steps executed. |
| message | string | — | Summary message from the agent. |
| history | ActionResult[] | — | Full list of actions taken. |
| data | Record<string, unknown> | — | Structured data extracted during the run. |
| selectors | Record<string, string> | — | camelCase key → CSS selector map for elements interacted with. |
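The history array also makes it possible to check after the fact whether a run got stuck repeating itself. The sketch below shows one simple heuristic: flag a loop when the last few steps are identical. It is not the library's built-in loop detection, whose algorithm is not documented here; `isLooping` and `StepLike` are hypothetical names.

```typescript
// Local mirror of the ActionResult fields this heuristic looks at.
interface StepLike {
  action: string;
  selector: string | null;
  instruction: string;
}

// Returns true when the last `window` steps are the same action on the
// same selector with the same instruction.
function isLooping(history: readonly StepLike[], window = 3): boolean {
  if (window < 2 || history.length < window) return false;
  const keys = history
    .slice(-window)
    .map((step) => `${step.action}|${step.selector ?? ''}|${step.instruction}`);
  return keys.every((key) => key === keys[0]);
}
```

A larger window makes the check more conservative, at the cost of spending more steps before a stuck run is noticed.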
ParallelTask / ParallelResult / ParallelOptions
ParallelTask
| Property | Type | Default | Description |
|---|---|---|---|
| url | string | — | URL to navigate to before running the goal. |
| goal | string | — | Natural-language goal for this browser session. |
| options | AgentRunOptions | — | Per-task agent options (maxSteps, onStep). |
ParallelResult
| Property | Type | Default | Description |
|---|---|---|---|
| success | boolean | — | Whether this task completed without error. |
| data | Record<string, unknown> | — | Extracted data if the task succeeded. |
| error | string \| null | — | Error message if the task failed. |
| agentResult | AgentResult \| null | — | Full agent result for detailed inspection. |
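When many tasks run in parallel, it is often convenient to split the results into successes and failures before further processing. The following is a minimal sketch using only the success, data, and error fields from the table above; `ParallelResultLike` and `partitionResults` are hypothetical names.

```typescript
// Local mirror of the documented ParallelResult fields used here.
interface ParallelResultLike {
  success: boolean;
  data: Record<string, unknown>;
  error: string | null;
}

interface Partitioned {
  succeeded: Array<{ index: number; data: Record<string, unknown> }>;
  failed: Array<{ index: number; error: string }>;
}

// Splits results by outcome, keeping each entry's position in the array.
function partitionResults(results: readonly ParallelResultLike[]): Partitioned {
  const out: Partitioned = { succeeded: [], failed: [] };
  results.forEach((result, index) => {
    if (result.success) out.succeeded.push({ index, data: result.data });
    else out.failed.push({ index, error: result.error ?? 'unknown error' });
  });
  return out;
}
```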
ParallelOptions
| Property | Type | Default | Description |
|---|---|---|---|
| apiKey | string | — | LLM API key shared across all parallel sessions. |
| concurrency | number | 3 | Maximum number of concurrent browser sessions. |
| headless | boolean | true | Headless mode for all sessions. |
| ...SentinelOptions | — | — | All other SentinelOptions are forwarded to each session. |
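The effect of the concurrency option can be illustrated with a generic worker pool that never runs more than a fixed number of tasks at once. This sketch shows the general technique only and makes no claim about how Sentinel.parallel is implemented; `mapWithConcurrency` is a hypothetical name.

```typescript
// Runs `worker` over `items` with at most `concurrency` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  // One shared iterator: each lane pulls the next unclaimed [index, item] pair.
  const queue = items.entries();
  const lane = async (): Promise<void> => {
    for (const [index, item] of queue) {
      results[index] = await worker(item, index);
    }
  };
  const laneCount = Math.max(1, Math.min(Math.floor(concurrency), items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Note that this sketch rejects as soon as one worker throws. A variant that mirrors the per-task success and error fields of ParallelResult would catch errors inside the worker and return them as values instead.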
ProxyOptions
| Property | Type | Default | Description |
|---|---|---|---|
| server | string | — | Proxy server URL, e.g. 'http://proxy.example.com:8080'. |
| username | string | — | Proxy authentication username. |
| password | string | — | Proxy authentication password. |
| bypass | string | — | Comma-separated list of hosts to bypass the proxy for. |
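Since bypass is a single comma-separated string, building it from a list of hosts avoids stray spaces and empty entries. The sketch below assumes that username, password, and bypass are optional, which the table does not state explicitly; `ProxyOptionsLike` and `buildProxyOptions` are hypothetical names.

```typescript
// Local mirror of the documented fields. Which fields are optional is an
// assumption made for this sketch.
interface ProxyOptionsLike {
  server: string;
  username?: string;
  password?: string;
  bypass?: string;
}

// Builds a proxy object, joining bypass hosts into one comma-separated string.
function buildProxyOptions(
  server: string,
  bypassHosts: readonly string[] = [],
  credentials?: { username: string; password: string },
): ProxyOptionsLike {
  const hosts = bypassHosts
    .map((host) => host.trim())
    .filter((host) => host.length > 0);
  return {
    server,
    ...(credentials ?? {}),
    ...(hosts.length > 0 ? { bypass: hosts.join(',') } : {}),
  };
}
```

The resulting object would then be passed as the proxy field of SentinelOptions.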
LLMProvider interface
Implement this interface to use any LLM with Sentinel. Pass the instance as provider in SentinelOptions.
interface LLMProvider {
  complete(prompt: string, options?: CompletionOptions): Promise<string>;
  completeJSON<T>(prompt: string, schema: ZodSchema<T>): Promise<T>;
}
See the LLM Providers page for built-in implementations (GeminiProvider, OpenAIProvider, AnthropicProvider, OllamaProvider).
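As an illustration of the two-method contract, the sketch below implements a provider that returns canned replies, which can be handy in tests. To stay dependency-free it declares local stand-ins: `SchemaLike` models only the parse() method that Zod schemas expose, and `CompletionOptionsLike` is a placeholder because the real CompletionOptions shape is not documented on this page. `CannedProvider` is a hypothetical class, not a built-in.

```typescript
// Stand-ins so the sketch compiles without dependencies. In real code,
// ZodSchema comes from 'zod' and the other types from the library.
interface SchemaLike<T> {
  parse(input: unknown): T;
}
type CompletionOptionsLike = Record<string, unknown>;

interface LLMProviderLike {
  complete(prompt: string, options?: CompletionOptionsLike): Promise<string>;
  completeJSON<T>(prompt: string, schema: SchemaLike<T>): Promise<T>;
}

// Hypothetical provider that answers from a fixed prompt-to-reply map.
class CannedProvider implements LLMProviderLike {
  private readonly replies: ReadonlyMap<string, string>;

  constructor(replies: ReadonlyMap<string, string>) {
    this.replies = replies;
  }

  async complete(prompt: string): Promise<string> {
    const reply = this.replies.get(prompt);
    if (reply === undefined) throw new Error(`No canned reply for prompt: ${prompt}`);
    return reply;
  }

  async completeJSON<T>(prompt: string, schema: SchemaLike<T>): Promise<T> {
    // Parse the raw text as JSON, then let the schema validate and type it.
    const parsed: unknown = JSON.parse(await this.complete(prompt));
    return schema.parse(parsed);
  }
}
```

A real provider would replace the map lookup with a call to the model's API and keep the JSON parsing and schema validation step.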
Error classes
All errors extend SentinelError which extends the native Error.
Error hierarchy
import {
  SentinelError,
  ActionError,
  ExtractionError,
  NavigationError,
  AgentError,
  NotInitializedError,
} from '@isoldex/sentinel';
try {
  await sentinel.act('click the non-existent button');
} catch (err) {
  if (err instanceof ActionError) {
    console.error('Action failed after', err.attempts, 'attempts');
    console.error('Instruction:', err.instruction);
    console.error('Selector tried:', err.selector);
  }
}
| Property | Type | Default | Description |
|---|---|---|---|
| SentinelError | class | — | Base class. All Sentinel errors extend this. |
| ActionError | class | — | Thrown when act() fails after all retries. Has .instruction, .selector, .attempts. |
| ExtractionError | class | — | Thrown when extract() cannot parse the LLM response into the requested schema. |
| NavigationError | class | — | Thrown on navigation failure (timeout, network error, invalid URL). |
| AgentError | class | — | Thrown when run() exceeds maxSteps without achieving the goal. |
| NotInitializedError | class | — | Thrown when any method is called before sentinel.init(). |
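Because every error extends SentinelError, a handler can check the most specific classes first and fall back to the base class. The sketch below defines local stand-in classes with the documented names so it runs on its own; in real code they would be imported from '@isoldex/sentinel'. The constructor signatures are assumptions, and `describeError` is a hypothetical helper.

```typescript
// Local stand-ins mirroring the documented hierarchy and field names.
class SentinelError extends Error {}
class NavigationError extends SentinelError {}
class ActionError extends SentinelError {
  instruction: string;
  selector: string | null;
  attempts: number;
  constructor(message: string, instruction: string, selector: string | null, attempts: number) {
    super(message);
    this.instruction = instruction;
    this.selector = selector;
    this.attempts = attempts;
  }
}

// Checks subclasses before the base class so the most specific branch wins.
function describeError(err: unknown): string {
  if (err instanceof ActionError) {
    return `action "${err.instruction}" failed after ${err.attempts} attempt(s)`;
  }
  if (err instanceof NavigationError) return `navigation failed: ${err.message}`;
  if (err instanceof SentinelError) return `sentinel error: ${err.message}`;
  if (err instanceof Error) return `unexpected error: ${err.message}`;
  return `non-error value thrown: ${String(err)}`;
}
```

Ordering matters here: testing for SentinelError first would swallow ActionError and NavigationError, since both are instances of the base class.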