Field Test · 04.2026 · 5 sites

Sentinel vs Stagehand.

Head-to-head comparison on 5 real websites. Same model (Gemini Flash), same instructions, same machine. Default settings for both frameworks. April 2026.

Sites passed: 5/5 (vs. Stagehand 2/5)

Token efficiency: 6–10× (2–5k vs 29–51k tokens/action)

Cost per run: $0.003 (vs. $0.03+ for Stagehand)

Real-world E2E results

Both frameworks ran the same 5 tasks with identical instructions. Stagehand used its default DOM mode (v3.2).

Website	Sentinel	Stagehand	Cost (Sentinel)	Cost (Stagehand)
Amazon.de	✓ 3 steps	✓ 6 actions	$0.003	~$0.03
npmjs.com	✓ 5 steps	✓ 10 actions	$0.003	~$0.03
Booking.com	✓ 6 steps	✗ 13 actions	$0.003	failed
Wikipedia DE	✓ 4 steps	✗ timeout	$0.003	failed
Durchblicker.at	✓ 17 steps	✗ 26 actions	$0.018	failed

Stagehand failed on Booking.com (couldn't extract prices), Wikipedia (cookie redirect crash), and Durchblicker (stopped at engine power selection after 26 actions).

Metric	Sentinel	Stagehand
Default model	Gemini 3 Flash	Gemini / GPT-4o
Tokens per action	2–5k	29–51k
Cost per run	~$0.002	~$0.03–0.08
Cost / 1k runs	~$2	~$30–80
E2E success rate	5/5 sites	2/5 sites
fillForm(json)	✓	✗
Network intercept	✓	✗
TOTP / MFA	✓	✗
Planner model split	✓	✗
Click verification	✓	✗
Form intelligence	✓	✗
Self-healing	✓	✗
Parallel sessions	✓ built-in	Manual
MCP Server	✓	✗
CLI	✓	✗
Custom LLM	✓ (+Ollama)	Partial
License	MIT	MIT

Extended comparison

Sentinel vs. Stagehand, BrowserUse (Python), and plain Playwright — across the metrics that matter for production use.

Metric	Sentinel	Stagehand	BrowserUse	Playwright
Language	TypeScript	TypeScript	Python	TS / Python / Java
AI actions	✓	✓	✓	✗ (manual only)
Cost / run	~$0.002	~$0.08	~$0.05	$0
Self-healing	✓	✗	Partial	✗
Parallel built-in	✓	Manual	Manual	✓
Shadow DOM	✓	Partial	✗	✓
OpenTelemetry	✓	✗	✗	✗
MCP Server	✓	✗	✗	✗
CLI	✓	✗	✗	✓
Playwright compat	✓ drop-in	Partial	✗	✓ native
License	MIT	MIT	MIT	Apache-2.0

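BrowserUse costs are estimated from GPT-4o at default settings. The Stagehand figure in this table (~$0.08) is likewise its GPT-4o default, the upper end of the ~$0.03–0.08 range above. Plain Playwright has no AI cost but requires manual selector maintenance.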

Methodology

Sites: Amazon.de, npmjs.com, Booking.com, Wikipedia DE, Durchblicker.at
Model: gemini-3-flash-preview (same for both)
Date: April 11, 2026
Sentinel: v4.1.0, default settings
Stagehand: v3.2, DOM mode (default), same API key
Instructions: identical goal text for both frameworks
Machine: same machine, sequential execution
Cost: token usage × official Gemini API pricing
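The cost line above can be sketched as a small calculation. This is a minimal illustration, not the benchmark's actual accounting code: the `Pricing` and `TokenUsage` types, the `runCostUsd` helper, and every number in the example are hypothetical placeholders. Substitute the token counts from your own runs and the current official per-million-token rates.

```typescript
// Hypothetical shapes for illustration; not part of Sentinel or Stagehand.
interface Pricing {
  inputPerMillionUsd: number;  // price per 1M input tokens
  outputPerMillionUsd: number; // price per 1M output tokens
}

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Cost of one run = sum over all actions of (tokens × price per token).
function runCostUsd(actions: TokenUsage[], pricing: Pricing): number {
  return actions.reduce(
    (total, action) =>
      total +
      (action.inputTokens / 1e6) * pricing.inputPerMillionUsd +
      (action.outputTokens / 1e6) * pricing.outputPerMillionUsd,
    0,
  );
}

// Placeholder prices and token counts, NOT real pricing or measured data.
const examplePricing: Pricing = { inputPerMillionUsd: 1, outputPerMillionUsd: 2 };
const exampleRun: TokenUsage[] = [
  { inputTokens: 3000, outputTokens: 200 },
  { inputTokens: 4000, outputTokens: 300 },
];

console.log(`$${runCostUsd(exampleRun, examplePricing).toFixed(4)} per run`);
```

Because cost scales linearly with tokens per action, a framework that sends 6–10× more tokens per action costs roughly 6–10× more per action at the same price.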

▌ typescript · benchmark-task.ts

// Same instructions, same model (Gemini Flash), same machine.
// Both frameworks tested on 5 real websites, April 11, 2026.
// (Framework imports omitted.)

// Shared API key, used by both frameworks
const apiKey = process.env.GEMINI_API_KEY;

// Sentinel
const sentinel = new Sentinel({ apiKey });
await sentinel.init();
await sentinel.goto('https://www.amazon.de');
const sentinelResult = await sentinel.run(
  'Search for "mechanical keyboard", extract the first 3 product names and prices'
);
// 3 steps, $0.003, extracted: [{name, price}, ...]

// Stagehand (same model: Gemini Flash)
const stagehand = new Stagehand({
  env: 'LOCAL',
  model: { modelName: 'google/gemini-3-flash-preview', apiKey },
});
await stagehand.init();
const stagehandResult = await stagehand.agent().execute({
  instruction: 'Search for "mechanical keyboard", extract first 3 products',
  maxSteps: 10,
});
// 6 actions, ~$0.03 (6-10x more tokens per action)

Prices based on official Gemini API pricing as of April 2026. Both frameworks tested with the same model (gemini-3-flash-preview) and default configurations. Stagehand v3.2 used DOM mode (their current default). Test code available on GitHub.

Real-world costs are even lower

The $0.002/run baseline assumes cold LLM calls. With locatorCache and promptCache enabled, repeated runs skip the LLM entirely, reducing effective cost by 60–90% on stable pages. In a typical test suite where most pages don't change between runs, the real cost is closer to $0.0002–$0.0008/run.
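The range above follows directly from applying the 60–90% reduction to the $0.002 baseline. A minimal sketch of that arithmetic (the `effectiveCostUsd` helper is a hypothetical name used for illustration, not a Sentinel API):

```typescript
// Effective cost after caching = baseline × (1 − fraction of cost avoided).
function effectiveCostUsd(baselineUsd: number, reduction: number): number {
  if (reduction < 0 || reduction > 1) {
    throw new RangeError("reduction must be between 0 and 1");
  }
  return baselineUsd * (1 - reduction);
}

const baselineUsd = 0.002; // cold-call baseline from the text above

const bestCase = effectiveCostUsd(baselineUsd, 0.9);  // 90% reduction
const worstCase = effectiveCostUsd(baselineUsd, 0.6); // 60% reduction

console.log(`$${bestCase.toFixed(4)} to $${worstCase.toFixed(4)} per run`);
// prints "$0.0002 to $0.0008 per run"
```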

▌ Cost / Operations Calculator

Run the numbers.

Adjust runs and per-run costs to match your setup.

$780 saved vs Stagehand / mo

Runs / month: 10,000 (adjustable from 100 to 500k)

Cost per run (adjust to your pricing) and monthly cost at 10,000 runs / mo:

Framework	Cost / run	Monthly cost
Sentinel	$0.0020	$20
BrowserUse	$0.050	$500
Stagehand	$0.080	$800

► Sentinel is 40× cheaper than Stagehand at these settings. Defaults: Sentinel $0.0020/run (Gemini Flash), Stagehand $0.080/run (GPT-4o), BrowserUse $0.050/run. With locatorCache + promptCache enabled, Sentinel's effective cost drops further on repeated tasks.
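To reproduce the calculator's numbers offline, the arithmetic can be sketched as below. The function and variable names are hypothetical, and the per-run costs are the calculator defaults quoted above; swap in your own run volume and pricing.

```typescript
// Monthly cost = runs per month × cost per run.
function monthlyCostUsd(runsPerMonth: number, costPerRunUsd: number): number {
  return runsPerMonth * costPerRunUsd;
}

const runsPerMonth = 10000;

// Calculator defaults from the text above (USD per run).
const perRunUsd = { sentinel: 0.002, browserUse: 0.05, stagehand: 0.08 };

const sentinelMonthly = monthlyCostUsd(runsPerMonth, perRunUsd.sentinel);
const browserUseMonthly = monthlyCostUsd(runsPerMonth, perRunUsd.browserUse);
const stagehandMonthly = monthlyCostUsd(runsPerMonth, perRunUsd.stagehand);

// Savings and cost ratio relative to Stagehand.
const savedVsStagehand = stagehandMonthly - sentinelMonthly;
const ratioVsStagehand = stagehandMonthly / sentinelMonthly;

console.log(
  `Sentinel $${sentinelMonthly.toFixed(0)}, ` +
    `BrowserUse $${browserUseMonthly.toFixed(0)}, ` +
    `Stagehand $${stagehandMonthly.toFixed(0)}`,
);
console.log(
  `Saved vs Stagehand: $${savedVsStagehand.toFixed(0)} / mo ` +
    `(${ratioVsStagehand.toFixed(0)}x cheaper)`,
);
```

The ratio depends only on the per-run costs, so it stays at 40× for any run volume; the absolute savings scale linearly with runs per month.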