Changelog.

All notable changes to @isoldex/sentinel. Latest version: v4.1.0 · npm · GitHub

v4.1.0 · 2026-04-11

Three killer features: fillForm(json), intercept(), TOTP/MFA. Plus widget detection, click-target verification, and form intelligence.

  • added: sentinel.fillForm(json) — declarative form filling with a single JSON object. Sentinel maps keys to form fields via LLM and fills them automatically.
  • added: sentinel.intercept(urlPattern, trigger) — network interception: capture raw API responses during browser actions instead of scraping DOM.
  • added: TOTP/MFA automation — mfa: { type: 'totp', secret: '...' } auto-generates 2FA codes during login flows. generateTOTP() also exported standalone.
  • added: plannerModel / plannerProvider — use a stronger model for planning (e.g. Gemini 3.1 Pro) while a cheap model (Flash) handles execution.
  • added: mode: 'aom' | 'hybrid' | 'vision' — configurable element detection strategy with vision fallback on coordinate mismatch.
  • added: Click-target verification — verifies the element at click coordinates matches the intended target, with Playwright locator fallback on mismatch.
  • added: Widget pattern detection — 9 patterns for custom dropdowns, datepickers, sliders, and CSS-library components (React Select, Ant Design, MUI, etc.).
  • added: Universal slider-fill — 3-strategy cascade: native range input, sibling text-input (Amazon-style price filters), or keyboard simulation via aria-valuemin/max.
  • added: Validation error detection — reads form error messages via aria-invalid, role='alert', class*='error' and passes them to the planner.
  • added: Form field/button separation — planner prompt structurally separates form fields from buttons with filled/unfilled status indicators.
  • added: Proactive blocker dismissal — cookie banners and modals are dismissed at the start of each step, not just on failure.
  • changed: State verification uses compact fingerprint (role+name+region+value+error+state). Catches dropdown openings, focus shifts, and value changes.
  • changed: Unicode regex (\p{L}\p{N}) for text normalization — supports all Latin-script languages.
  • changed: Action-level retry with exponential backoff for transient failures (timeout, detached, disposed).
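
For readers unfamiliar with how TOTP codes are derived, the following is a minimal sketch of the standard RFC 6238 algorithm (HMAC-SHA1, 30-second step, 6 digits). It is not Sentinel's generateTOTP() implementation, whose signature and defaults are not documented here; the names base32Decode and totpSketch are hypothetical.

```typescript
import { createHmac } from "node:crypto";

// Decode an RFC 4648 base32 string (padding and whitespace are ignored).
function base32Decode(input: string): Buffer {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  const clean = input.toUpperCase().replace(/[\s=]+/g, "");
  const bytes: number[] = [];
  let bits = 0;
  let value = 0;
  for (let i = 0; i < clean.length; i++) {
    const idx = alphabet.indexOf(clean.charAt(i));
    if (idx === -1) throw new Error("Invalid base32 character: " + clean.charAt(i));
    value = (value << 5) | idx;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// Hypothetical helper: RFC 6238 TOTP over HMAC-SHA1.
function totpSketch(
  secretBase32: string,
  timeMs: number = Date.now(),
  stepSeconds: number = 30,
  digits: number = 6
): string {
  const counter = Math.floor(timeMs / 1000 / stepSeconds);
  // The counter is encoded as an 8-byte big-endian integer.
  const msg = Buffer.alloc(8);
  msg.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  msg.writeUInt32BE(counter % 0x100000000, 4);
  const hmac = createHmac("sha1", base32Decode(secretBase32)).update(msg).digest();
  // Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
  const offset = hmac.readUInt8(hmac.length - 1) & 0x0f;
  const code = hmac.readUInt32BE(offset) & 0x7fffffff;
  return String(code % Math.pow(10, digits)).padStart(digits, "0");
}
```

With the RFC test secret (ASCII "12345678901234567890", base32 GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ), totpSketch(secret, 59000) returns "287082", which matches the published test vector for counter 1.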

v4.0.0 · 2026-04-11

Top-3 candidate ranking, pre-action validation, cookie auto-recovery, spatial region tags, contenteditable support.

  • breaking: Default viewport changed from 1280×720 to 1920×1080.
  • breaking: LLM action schema uses candidates[] array instead of single elementId.
  • breaking: Prompt format changed from JSON to pipe-delimited (id | role | name | region).
  • added: Top-3 candidate ranking — LLM returns up to 3 element candidates per action; next candidate is tried instantly without a new LLM call on failure.
  • added: Pre-action validation — validateTarget() checks disabled, hidden, or overlay-blocked elements before clicking.
  • added: Cookie/overlay auto-recovery — automatically dismisses cookie banners and closes modals before retrying.
  • added: Spatial region tags — every element gets a region field (header/nav/sidebar/main/footer/modal/popup).
  • added: contenteditable support — rich-text editors are detected and handled correctly (fill uses Ctrl+A).
  • added: Scroll discovery — batch-scrolls to find elements in virtual-scrolling containers.
  • added: Vision-augmented planning — planner receives a visual page description on complex pages (>100 elements).
  • added: DOM fallback for off-screen AOM — triggers full DOM parse when all AOM elements are outside the viewport.
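
The pipe-delimited prompt format and the candidate fallback described above can be illustrated roughly as follows. This is a sketch under assumptions: the ElementSummary shape and both function names are hypothetical, and the real action schema may carry more than element ids in candidates[].

```typescript
// Hypothetical element shape; fields follow the (id | role | name | region) format.
interface ElementSummary {
  id: number;
  role: string;
  name: string;
  region: string;
}

// Render one element per line in pipe-delimited form.
function toPromptLines(elements: ElementSummary[]): string {
  return elements
    .map((el) => [el.id, el.role, el.name, el.region].join(" | "))
    .join("\n");
}

// Try up to three ranked candidates in order and stop at the first success,
// so a failed first choice does not require another LLM call.
async function tryCandidates(
  candidates: number[],
  attempt: (elementId: number) => Promise<boolean>
): Promise<number | null> {
  for (const id of candidates.slice(0, 3)) {
    if (await attempt(id)) return id;
  }
  return null;
}
```

A row such as { id: 1, role: "button", name: "Login", region: "header" } renders as the line 1 | button | Login | header. The sketch does not escape pipe characters inside names, which a real implementation would need to handle.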

v3.9.0 · 2026-04-08

OpenTelemetry traces and metrics — every LLM call, act(), extract(), and agent step is instrumented.

  • added: OpenTelemetry support — 6 span types (sentinel.agent → sentinel.agent.step → sentinel.act → sentinel.llm, plus sentinel.extract, sentinel.observe) and 6 metrics (act.requests, act.duration_ms, llm.requests, llm.tokens, llm.duration_ms, agent.steps). Zero overhead when no OTel SDK is configured.
  • fixed: ActionResult.selector was discarded by the outer Sentinel.act() wrapper — now correctly forwarded from ActionEngine.

v3.8.0 · 2026-04-08

Stable CSS selector export after every run() — paste directly into Playwright tests.

  • added: AgentResult.selectors — camelCase slug of each instruction maps to the most stable CSS selector found. Priority: data-testid → #id → [name] → [placeholder] → [aria-label] → role:has-text.
  • added: ActionResult.selector — single act() calls now also expose the selector for the acted-on element.
  • added: slugifyInstruction() exported from @isoldex/sentinel — converts a natural-language instruction to a camelCase key.
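
To make the selector priority and the camelCase keys concrete, here is a small sketch. It assumes a hypothetical ElementAttrs shape and does not reproduce slugifyInstruction() itself, whose exact rules (stop words, length limits, non-ASCII handling) are not stated in this changelog.

```typescript
// Hypothetical attribute bag for an element; not the library's internal type.
interface ElementAttrs {
  testId?: string;
  id?: string;
  name?: string;
  placeholder?: string;
  ariaLabel?: string;
  role: string;
  text: string;
}

// Walk the priority list from most to least stable and return the first match.
// Attribute values are not escaped in this sketch.
function pickStableSelector(el: ElementAttrs): string {
  if (el.testId) return '[data-testid="' + el.testId + '"]';
  if (el.id) return "#" + el.id;
  if (el.name) return '[name="' + el.name + '"]';
  if (el.placeholder) return '[placeholder="' + el.placeholder + '"]';
  if (el.ariaLabel) return '[aria-label="' + el.ariaLabel + '"]';
  return el.role + ':has-text("' + el.text + '")';
}

// Assumed slug rule: lowercase ASCII words joined in camelCase.
function toCamelCaseSlug(instruction: string): string {
  const words: string[] = instruction.toLowerCase().match(/[a-z0-9]+/g) || [];
  return words
    .map((w, i) => (i === 0 ? w : w.charAt(0).toUpperCase() + w.slice(1)))
    .join("");
}
```

Under these assumptions, "Click the Login button" becomes the key clickTheLoginButton, and an element with both an id and a name attribute resolves to the #id selector because it ranks higher.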

v3.7.0 · 2026-04-08

Prompt caching — identical (prompt, schema) pairs return instantly at zero token cost.

  • added: promptCache option (false | true | string) — in-memory LRU (200 entries) or file-persisted cache keyed by djb2 hash of prompt + schema. Covers act(), extract(), observe(), and the agent loop.
  • added: sentinel.clearPromptCache() — flush the prompt cache programmatically.
  • added: IPromptCache interface exported for custom backends (Redis, SQLite, etc.).
  • fixed: Sentinel.parallel() factory errors are now isolated per task — a browser launch failure no longer aborts remaining tasks.
  • fixed: extend() CDP session leak — calling extend() on the same page multiple times now detaches the previous session first.
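
The caching scheme above combines two well-known pieces: the djb2 string hash and an LRU eviction policy. The sketch below shows one plausible way to put them together. It is not the library's cache, and the IPromptCache interface is not reproduced because its members are not listed here. LruCache and cacheKey are hypothetical names, and serializing the schema with JSON.stringify is an assumption.

```typescript
// Classic djb2 string hash (hash * 33 + charCode), kept as an unsigned 32-bit value.
function djb2(input: string): number {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Minimal LRU built on Map, which iterates keys in insertion order.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number = 200) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

// Assumed key derivation: hash of the prompt plus the serialized schema.
function cacheKey(prompt: string, schema: unknown): string {
  return djb2(prompt + JSON.stringify(schema)).toString(16);
}
```

A 32-bit hash can collide, so a production cache would likely store the original prompt alongside the value and compare it on lookup.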

v3.6.0 · 2026-04-08

Sentinel.parallel() — concurrent browser sessions with a worker pool, error isolation, and progress callbacks.

  • added: Sentinel.parallel(tasks, options) — runs N independent agent tasks in parallel, each in its own browser session. concurrency option limits simultaneous sessions (default: 3).
  • added: onProgress callback — fires after each task with (completed, total, result).
  • added: ParallelTask, ParallelResult, ParallelOptions types exported.
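
A worker pool with a concurrency limit, per-task error isolation, and a progress callback can be sketched in a few lines. This illustrates the general pattern only: runPool and PoolResult are hypothetical names, and the real ParallelTask, ParallelResult, and ParallelOptions types are not reproduced.

```typescript
// Hypothetical result wrapper: a failure is recorded instead of thrown.
type PoolResult<T> = { ok: true; value: T } | { ok: false; error: unknown };

async function runPool<T>(
  tasks: Array<() => Promise<T>>,
  concurrency: number = 3,
  onProgress?: (completed: number, total: number, result: PoolResult<T>) => void
): Promise<PoolResult<T>[]> {
  const results: PoolResult<T>[] = new Array(tasks.length);
  let next = 0;
  let completed = 0;

  // Each worker pulls the next unclaimed task index until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      let result: PoolResult<T>;
      try {
        result = { ok: true, value: await tasks[index]() };
      } catch (error) {
        // Isolate the failure so the remaining tasks still run.
        result = { ok: false, error };
      }
      results[index] = result;
      completed++;
      if (onProgress) onProgress(completed, tasks.length, result);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return results;
}
```

Results are stored by task index, so the output order matches the input order even when tasks finish out of order.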

v3.5.0 · 2026-04-07

sentinel.extend(page) — add AI capabilities to any existing Playwright Page object.

  • added: sentinel.extend(page) — attaches act(), extract(), and observe() directly to any Playwright Page. Drop-in for existing Playwright projects.
  • added: verbose: 3 — new debug level exposing chunk-processing stats and full LLM decision JSON per act() call.
  • changed: verbose: 1 now logs action summaries only. Reasoning moved to verbose: 2. Minor breaking change for consumers relying on reasoning at level 1.

v3.4.0 · 2026-04-07

Chunk-processing, Shadow DOM, and iframe support.

  • added: filterRelevantElements() — keyword-overlap scoring reduces elements sent to LLM on pages with 200+ interactive elements. maxElements option (default: 50).
  • added: Full Shadow DOM support — parseDOMSnapshot() and parseFormElements() recursively pierce all shadow roots via queryShadowAll(). Covers Salesforce, ServiceNow, Lit, Polymer, Stencil.
  • added: iframe support — parseFrameElements() collects interactive elements from same-origin frames with coordinate offsets.
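
Keyword-overlap scoring, as used to trim large element lists, might look roughly like the sketch below. The scoring details of filterRelevantElements() are not documented in this changelog, so the tokenizer (ASCII-only here for simplicity), the tie-breaking rule, and the names Candidate, tokenize, and filterByKeywordOverlap are all assumptions.

```typescript
// Hypothetical minimal element shape.
interface Candidate {
  id: number;
  name: string;
}

// Simplified ASCII tokenizer; a real one would likely handle Unicode letters.
function tokenize(text: string): string[] {
  const words: string[] = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return words;
}

// Score each element by how many instruction keywords appear in its name,
// then keep the top maxElements.
function filterByKeywordOverlap(
  instruction: string,
  elements: Candidate[],
  maxElements: number = 50
): Candidate[] {
  const keywords = tokenize(instruction);
  const scored = elements.map((el, index) => {
    const words = tokenize(el.name);
    const score = keywords.filter((k) => words.indexOf(k) !== -1).length;
    return { el, score, index };
  });
  // Higher score first; ties keep the original document order.
  scored.sort((a, b) => b.score - a.score || a.index - b.index);
  return scored.slice(0, maxElements).map((s) => s.el);
}
```

For the instruction "click the login button" and elements named Search, Login, and Sign up, a limit of 1 keeps only the Login element.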

v3.3.0 · 2026-04-07

Intelligent error messages with structured diagnostic output and actionable tips.

  • added: ActionResult.attempts — structured array of every tried path (coordinate-click, vision-grounding, locator-fallback) with specific errors.
  • added: Contextual tip in result.message — outside viewport → scroll suggestion, timeout → overlay hint, all paths exhausted → rephrase or enable visionFallback.

Self-healing locators — cache successful element lookups, skip the LLM on repeated calls.

  • addedlocatorCache option (false | true | string) — in-memory or file-persisted cache. On repeated act() calls with the same URL + instruction, Playwright uses the cached selector directly.
  • addedAutomatic cache invalidation — if the cached element is gone, the entry is removed and the LLM path takes over.
  • addedILocatorCache interface exported for custom backends (Redis, etc.).

v3.1.1 · 2026-04-07

Six bug fixes: stale DOM on retries, tab index corruption, elementCounter race, wrong Gemini model, token callback leak, MCP crash on browser failure.

  • fixed: stateParser.invalidateCache() now called at the start of every retry attempt — not just once before the loop.
  • fixed: closeTab() now correctly decrements activePageIndex when a lower-index tab is closed.
  • fixed: elementCounter race condition in StateParser — now a local variable threaded through parallel parse() calls.
  • fixed: GeminiProvider.generateText() with systemInstruction now uses the constructor model, not process.env.GEMINI_VERSION.
  • fixed: onTokenUsage callback nulled out on close() to prevent TokenTracker from being held in memory.
  • fixed: All 7 MCP tool handlers now wrapped in try-catch — browser failures return isError: true instead of crashing the server.
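
The tab index fix is easiest to see with a small function. The sketch below shows the index arithmetic only: nextActiveIndex is a hypothetical helper, and what closeTab() does when the active tab itself is closed is an assumption (here it falls back to the nearest remaining tab).

```typescript
// Returns the active index after the tab at closedIndex has been removed.
// remainingTabs is the tab count after the removal.
function nextActiveIndex(
  activeIndex: number,
  closedIndex: number,
  remainingTabs: number
): number {
  if (remainingTabs === 0) return -1; // no tabs left
  // Closing a lower-index tab shifts the active tab one slot to the left.
  if (closedIndex < activeIndex) return activeIndex - 1;
  // Assumed behavior: closing the active tab selects the nearest remaining one.
  if (closedIndex === activeIndex) return Math.min(activeIndex, remainingTabs - 1);
  // Closing a higher-index tab leaves the active index unchanged.
  return activeIndex;
}
```

With three tabs open and tab 2 active, closing tab 0 yields an active index of 1, which still points at the same page. Without the decrement, the index would point past the end of the list.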

v3.1.0 · 2026-04-07

CLI tool, MCP server, and Playwright Test integration — all in one release.

  • added: CLI — sentinel binary with 4 subcommands: run, act, extract, screenshot. Accepts --api-key, --headless, --model, --output, --url.
  • added: MCP server — 8 tools exposed via stdio transport: sentinel_goto, _act, _extract, _observe, _run, _screenshot, _close, _token_usage.
  • added: Playwright Test fixture — @isoldex/sentinel/test exports test with ai fixture. sentinelOptions configurable globally in playwright.config.ts.

v3.0.0 · 2026-04-07

extract step type in AgentLoop, append action, token tracking, withRetry utility.

  • breaking: AgentLoop constructor now requires extractionEngine as second parameter. Only affects consumers constructing AgentLoop directly — sentinel.run() is unaffected.
  • added: AgentResult.data — the planner can now issue extract steps mid-run. Structured data is returned in the final AgentResult.
  • added: append action type — appends text to an input without clearing existing content.
  • added: Token usage tracking via onTokenUsage callback — all four providers fire it after every LLM call.
  • added: withRetry() utility — unified exponential backoff extracted from all four providers.
  • fixed: domSettleTimeoutMs not forwarded to ActionEngine — now passed to all three waitForPageSettle call sites.
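
Exponential backoff, the idea behind the withRetry() utility, can be sketched as follows. The actual signature, defaults, and retry conditions of withRetry() are not given in this changelog, so retryWithBackoff, RetryOptions, and the default values are assumptions for illustration.

```typescript
// Hypothetical options; the defaults are illustrative only.
interface RetryOptions {
  retries?: number;     // additional attempts after the first call
  baseDelayMs?: number; // delay before the first retry
  factor?: number;      // multiplier applied per attempt
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Call fn until it succeeds or the retries are used up, waiting
// baseDelayMs * factor^attempt between attempts.
async function retryWithBackoff<T>(
  fn: (attempt: number) => Promise<T>,
  options: RetryOptions = {}
): Promise<T> {
  const { retries = 3, baseDelayMs = 200, factor = 2 } = options;
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(baseDelayMs * Math.pow(factor, attempt));
      }
    }
  }
  throw lastError;
}
```

With these defaults the waits are 200 ms, 400 ms, and 800 ms. A production version would typically retry only on transient errors and add jitter to avoid synchronized retries.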

v2.3.3 · 2026-04-06

userDataDir — persistent browser profiles including IndexedDB (WhatsApp Web, PWAs).

  • added: userDataDir option — persists the full Chromium profile including IndexedDB and ServiceWorkers. Required for apps that use IndexedDB for auth (WhatsApp Web, etc.).

v2.3.1 · 2026-04-06

Native vision support for all LLM providers via analyzeImage().

  • added: analyzeImage() method added to all four providers (Gemini, OpenAI, Claude, Ollama). visionFallback: true now works with any vision-capable provider.
  • changed: Default Gemini model updated to gemini-3-flash-preview.

v2.3.0 · 2026-04-06

Contextual button naming, off-screen enrichment, withTimeout on all actions, 4-strategy locator chain.

  • added: Contextual button naming — StateParser walks AOM ancestors to enrich generic labels like 'Select plan' with card context ('Kelag | Fixtarif | 17,40 cent/kWh: Select plan').
  • added: withTimeout wrapper on all actions — 10-second timeout prevents indefinite hangs.
  • added: Viewport bounds check before click + scrollIntoViewIfNeeded fallback.
  • added: Radio/checkbox JS click fallback — handles inputs hidden via CSS by traversing to the closest label.
  • added: 4-strategy locator chain: exact role+name → inexact role+name → CSS :has-text → plain text.
  • changed: MutationObserver DOM settle replaces networkidle — resolves after 300ms of DOM silence (cap: 3s).

v2.0.0 · 2026-04-05

Major release — Sentinel becomes a full AI agent framework.

  • added: Autonomous Agent Loop — sentinel.run(goal) with Plan → Execute → Verify → Reflect cycle.
  • added: Vision Grounding — Gemini Vision fallback in act() via visionFallback option.
  • added: Multi-LLM Provider System — GeminiProvider, OpenAIProvider, ClaudeProvider, OllamaProvider.
  • added: Multi-Tab and Multi-Browser support (Chromium, Firefox, WebKit).
  • added: Session Persistence — saveSession(), sessionPath option.
  • added: Record and Replay — startRecording(), stopRecording(), exportWorkflowAsCode(), replay().
  • added: Proxy and Stealth Mode — proxy option, humanLike delays, User-Agent rotation.
  • added: Event System — Sentinel extends EventEmitter, emits action, navigate, close.
  • added: Token Tracking — getTokenUsage(), exportLogs().
  • added: Structured Error Classes — SentinelError, ActionError, ExtractionError, NavigationError, AgentError, NotInitializedError.

v1.0.0 · 2025-01-01

Initial release — AOM-based browser automation with natural language actions.

  • added: Playwright-based browser automation (Chromium). AOM via CDP.
  • added: sentinel.act(instruction) — click, fill, hover.
  • added: sentinel.extract(instruction, schema) — Zod-typed structured extraction.
  • added: sentinel.observe() — page observation via AOM.
  • added: Semantic verification loop with automatic retry.
  • added: Gemini Flash / Pro integration, verbose logging levels.