Sentinel vs Stagehand.
Head-to-head comparison on 5 real websites. Same model (Gemini Flash), same instructions, same machine. Default settings for both frameworks. April 2026.
Real-world E2E results
Both frameworks ran the same 5 tasks with identical instructions. Stagehand used its default DOM mode (v3.2).
| Website | Sentinel | Stagehand | Sentinel cost | Stagehand cost |
|---|---|---|---|---|
| Amazon.de | ✓ 3 steps | ✓ 6 actions | $0.003 | ~$0.03 |
| npmjs.com | ✓ 5 steps | ✓ 10 actions | $0.003 | ~$0.03 |
| Booking.com | ✓ 6 steps | ✗ 13 actions | $0.003 | failed |
| Wikipedia DE | ✓ 4 steps | ✗ timeout | $0.003 | failed |
| Durchblicker.at | ✓ 17 steps | ✗ 26 actions | $0.018 | failed |
Stagehand failed on Booking.com (couldn't extract prices), Wikipedia (cookie redirect crash), and Durchblicker (stopped at engine power selection after 26 actions).
| Metric | Sentinel | Stagehand |
|---|---|---|
| Default model | Gemini 3 Flash | Gemini / GPT-4o |
| Tokens per action | 2–5k | 29–51k |
| Cost per run | ~$0.002 | ~$0.03–0.08 |
| Cost / 1k runs | ~$2 | ~$30–80 |
| E2E success rate | 5/5 sites | 2/5 sites |
| fillForm(json) | ✓ | ✗ |
| Network intercept | ✓ | ✗ |
| TOTP / MFA | ✓ | ✗ |
| Planner model split | ✓ | ✗ |
| Click verification | ✓ | ✗ |
| Form intelligence | ✓ | ✗ |
| Self-healing | ✓ | ✗ |
| Parallel sessions | ✓ built-in | Manual |
| MCP Server | ✓ | ✗ |
| CLI | ✓ | ✗ |
| Custom LLM | ✓ (+Ollama) | Partial |
| License | MIT | MIT |
Extended comparison
Sentinel vs. Stagehand, BrowserUse (Python), and plain Playwright — across the metrics that matter for production use.
| Metric | Sentinel | Stagehand | BrowserUse | Playwright |
|---|---|---|---|---|
| Language | TypeScript | TypeScript | Python | TS / Python / Java |
| AI actions | ✓ | ✓ | ✓ | ✗ (manual only) |
| Cost / run | ~$0.002 | ~$0.08 | ~$0.05 | $0 |
| Self-healing | ✓ | ✗ | Partial | ✗ |
| Parallel built-in | ✓ | Manual | Manual | ✓ |
| Shadow DOM | ✓ | Partial | ✗ | ✓ |
| OpenTelemetry | ✓ | ✗ | ✗ | ✗ |
| MCP Server | ✓ | ✗ | ✗ | ✗ |
| CLI | ✓ | ✗ | ✗ | ✓ |
| Playwright compat | ✓ drop-in | Partial | ✗ | ✓ native |
| License | MIT | MIT | MIT | Apache-2.0 |
BrowserUse costs are estimated from GPT-4o at default settings. The Stagehand figure of ~$0.08 is the upper end of the ~$0.03–0.08 range reported above. Plain Playwright has no AI cost but requires manual selector maintenance.
Methodology
- Sites: Amazon.de, npmjs.com, Booking.com, Wikipedia DE, Durchblicker.at
- Model: gemini-3-flash-preview (same for both)
- Date: April 11, 2026
- Sentinel: v4.1.0, default settings
- Stagehand: v3.2, DOM mode (default), same API key
- Instructions: identical goal text for both frameworks
- Machine: same machine, sequential execution
- Cost: token usage × official Gemini API pricing
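The cost rule in the last bullet can be sketched as a small helper. This is only an illustration of "token usage × price": the `TokenPricing` and `TokenUsage` types, the `runCostUsd` function, and the price values are hypothetical placeholders, not the rates or code used in the benchmark. Substitute the official per-token prices for your model.

```typescript
// Hypothetical prices in USD per 1M tokens. These are placeholders,
// NOT the official Gemini rates; look up the current pricing yourself.
interface TokenPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Token counts reported by the LLM for one step/action of a run.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Sum token usage × price over every step of a run.
function runCostUsd(steps: TokenUsage[], pricing: TokenPricing): number {
  let total = 0;
  for (const step of steps) {
    total += (step.inputTokens / 1_000_000) * pricing.inputPerMillion;
    total += (step.outputTokens / 1_000_000) * pricing.outputPerMillion;
  }
  return total;
}

// Example: a 3-step run with made-up token counts.
const placeholderPricing: TokenPricing = { inputPerMillion: 0.5, outputPerMillion: 3.0 };
const exampleRun: TokenUsage[] = [
  { inputTokens: 3000, outputTokens: 200 },
  { inputTokens: 4000, outputTokens: 150 },
  { inputTokens: 2500, outputTokens: 300 },
];
console.log(`Estimated cost: $${runCostUsd(exampleRun, placeholderPricing).toFixed(4)}`);
```

Because cost scales linearly with tokens, the per-action token counts in the comparison table are the main driver of the per-run cost difference.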
```typescript
// Same instructions, same model (Gemini Flash), same machine.
// Both frameworks tested on 5 real websites, April 11, 2026.

// Sentinel
const sentinel = new Sentinel({ apiKey: process.env.GEMINI_API_KEY });
await sentinel.init();
await sentinel.goto('https://www.amazon.de');
const sentinelResult = await sentinel.run(
  'Search for "mechanical keyboard", extract the first 3 product names and prices'
);
// 3 steps, $0.003, extracted: [{name, price}, ...]

// Stagehand (same model: Gemini Flash, same API key)
const stagehand = new Stagehand({
  env: 'LOCAL',
  model: { modelName: 'google/gemini-3-flash-preview', apiKey: process.env.GEMINI_API_KEY },
});
await stagehand.init();
const stagehandResult = await stagehand.agent().execute({
  instruction: 'Search for "mechanical keyboard", extract first 3 products',
  maxSteps: 10,
});
// 6 actions, ~$0.03 (6-10x more tokens per action)
```
Prices are based on official Gemini API pricing as of April 2026. Both frameworks were tested with the same model (gemini-3-flash-preview) and default configurations. Stagehand v3.2 used DOM mode (its current default). Test code is available on GitHub.
Real-world costs are even lower
The $0.002/run baseline assumes cold LLM calls. With locatorCache and promptCache enabled, repeated runs on stable pages skip the LLM entirely, reducing effective cost by 60–90%. In a typical test suite where most pages don't change between runs, the real cost is closer to $0.0002–$0.0008/run.
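The arithmetic behind that range is a straight percentage reduction of the baseline. A minimal sketch, assuming the 60–90% figure applies uniformly to the $0.002 baseline (`effectiveCostUsd` is a hypothetical helper for illustration, not part of Sentinel):

```typescript
// Apply a fractional cost reduction (0..1) to a per-run baseline cost.
function effectiveCostUsd(baselineUsd: number, reduction: number): number {
  if (reduction < 0 || reduction > 1) {
    throw new RangeError("reduction must be between 0 and 1");
  }
  return baselineUsd * (1 - reduction);
}

const baselineUsd = 0.002; // cold-call baseline from the text
const bestCase = effectiveCostUsd(baselineUsd, 0.9); // 90% reduction
const worstCase = effectiveCostUsd(baselineUsd, 0.6); // 60% reduction

console.log(`$${bestCase.toFixed(4)}–$${worstCase.toFixed(4)} per run`);
// prints "$0.0002–$0.0008 per run"
```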
Run the numbers.
Adjust runs and per-run costs to match your setup.
Sentinel is 40× cheaper than Stagehand at these settings. Defaults: $0.0020/run for Sentinel (Gemini Flash) vs $0.080 for Stagehand (GPT-4o) vs $0.050 for BrowserUse. With locatorCache + promptCache enabled, Sentinel's effective cost drops further on repeated tasks.
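To redo this comparison with your own numbers, the calculation is just per-run cost × run count, plus a ratio between two frameworks. The sketch below uses the default per-run costs quoted above; the `FrameworkCost` type and both helper functions are hypothetical names introduced for illustration.

```typescript
interface FrameworkCost {
  name: string;
  costPerRunUsd: number;
}

// Total spend for a given number of runs.
function totalCostUsd(costPerRunUsd: number, runs: number): number {
  return costPerRunUsd * runs;
}

// How many times more expensive `other` is than `base`.
function costRatio(base: FrameworkCost, other: FrameworkCost): number {
  return other.costPerRunUsd / base.costPerRunUsd;
}

// Default per-run costs quoted on this page; replace with your own pricing.
const frameworks: FrameworkCost[] = [
  { name: "Sentinel", costPerRunUsd: 0.002 },
  { name: "Stagehand", costPerRunUsd: 0.08 },
  { name: "BrowserUse", costPerRunUsd: 0.05 },
];

const runs = 1000;
for (const fw of frameworks) {
  console.log(`${fw.name}: $${totalCostUsd(fw.costPerRunUsd, runs).toFixed(2)} per ${runs} runs`);
}

const ratio = costRatio(frameworks[0], frameworks[1]);
console.log(`Stagehand / Sentinel cost ratio: ${ratio.toFixed(0)}x`);
```

With the defaults, 1,000 runs come to about $2 for Sentinel and $80 for Stagehand, which matches the "Cost / 1k runs" row and the 40× figure.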