Log · since v0.1.0

Changelog.

All notable changes to @isoldex/sentinel. Latest version: v4.1.0

v4.1.0 · 2026-04-11

Three headline features: fillForm(json), intercept(), and TOTP/MFA automation. Also adds widget detection, click-target verification, and form intelligence.

added · sentinel.fillForm(json): declarative form filling from a single JSON object. Sentinel uses the LLM to map keys to form fields and fills them automatically.
added · sentinel.intercept(urlPattern, trigger): network interception that captures raw API responses during browser actions instead of scraping the DOM.
added · TOTP/MFA automation: the mfa: { type: 'totp', secret: '...' } option auto-generates 2FA codes during login flows. generateTOTP() is also exported standalone.
added · plannerModel / plannerProvider: use a stronger model for planning (e.g. Gemini 3.1 Pro) while a cheaper model (Flash) handles execution.
added · mode: 'aom' | 'hybrid' | 'vision' option for a configurable element detection strategy, with vision fallback on coordinate mismatch.
added · Click-target verification: checks that the element at the click coordinates matches the intended target, with a Playwright locator fallback on mismatch.
added · Widget pattern detection: 9 patterns for custom dropdowns, datepickers, sliders, and UI-library components (React Select, Ant Design, MUI, etc.).
added · Universal slider-fill: 3-strategy cascade of native range input, sibling text input (Amazon-style price filters), or keyboard simulation via aria-valuemin/max.
added · Validation error detection: reads form error messages via aria-invalid, role='alert', and class*='error', and passes them to the planner.
added · Form field/button separation: the planner prompt structurally separates form fields from buttons, with filled/unfilled status indicators.
added · Proactive blocker dismissal: cookie banners and modals are dismissed at the start of each step, not just on failure.
changed · State verification uses a compact fingerprint (role+name+region+value+error+state). Catches dropdown openings, focus shifts, and value changes.
changed · Unicode regex (\p{L}\p{N}) for text normalization, supporting all Latin-script languages.
changed · Action-level retry with exponential backoff for transient failures (timeout, detached, disposed).
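
For readers unfamiliar with how TOTP codes are derived, the following is a minimal sketch of the standard RFC 6238 algorithm (HMAC-SHA1, 30-second steps) using Node's built-in crypto module. The names totpSketch and base32Decode are hypothetical and this is not Sentinel's generateTOTP() implementation, whose signature and defaults are not documented here.

```typescript
import { createHmac } from "node:crypto";

// Decode an RFC 4648 base32 string, the usual encoding for TOTP secrets.
function base32Decode(input: string): Buffer {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  const clean = input.toUpperCase().replace(/[\s=]/g, "");
  const bytes: number[] = [];
  let bits = 0;
  let value = 0;
  for (let i = 0; i < clean.length; i++) {
    const index = alphabet.indexOf(clean.charAt(i));
    if (index === -1) throw new Error("Invalid base32 character: " + clean.charAt(i));
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// Hypothetical helper: RFC 6238 TOTP with HMAC-SHA1.
function totpSketch(secretBase32: string, timeMs: number, stepSeconds = 30, digits = 6): string {
  const counter = Math.floor(timeMs / 1000 / stepSeconds);
  // The moving factor is the time-step counter as an 8-byte big-endian integer.
  const message = Buffer.alloc(8);
  message.writeUInt32BE(Math.floor(counter / 2 ** 32), 0);
  message.writeUInt32BE(counter >>> 0, 4);
  const digest = createHmac("sha1", base32Decode(secretBase32)).update(message).digest();
  // Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
  const offset = digest[digest.length - 1] & 0x0f;
  const binary = digest.readUInt32BE(offset) & 0x7fffffff;
  return (binary % 10 ** digits).toString().padStart(digits, "0");
}

// RFC 6238 test vector: ASCII secret "12345678901234567890" at T = 59 s, 8 digits.
console.log(totpSketch("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", 59000, 30, 8)); // 94287082
```

Passing the timestamp explicitly keeps the function deterministic, which makes it easy to check against the published test vectors.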

v4.0.0 · 2026-04-11

Top-3 candidate ranking, pre-action validation, cookie auto-recovery, spatial region tags, contenteditable support.

breaking · Default viewport changed from 1280×720 to 1920×1080.
breaking · LLM action schema uses a candidates[] array instead of a single elementId.
breaking · Prompt format changed from JSON to pipe-delimited (id | role | name | region).
added · Top-3 candidate ranking: the LLM returns up to 3 element candidates per action; on failure, the next candidate is tried immediately without a new LLM call.
added · Pre-action validation: validateTarget() checks for disabled, hidden, or overlay-blocked elements before clicking.
added · Cookie/overlay auto-recovery: automatically dismisses cookie banners and closes modals before retrying.
added · Spatial region tags: every element gets a region field (header/nav/sidebar/main/footer/modal/popup).
added · contenteditable support: rich-text editors are detected and handled correctly (fill uses Ctrl+A).
added · Scroll discovery: batch-scrolls to find elements in virtual-scrolling containers.
added · Vision-augmented planning: the planner receives a visual page description on complex pages (>100 elements).
added · DOM fallback for off-screen AOM: triggers a full DOM parse when all AOM elements are outside the viewport.
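
To make the pipe-delimited prompt format concrete, here is a rough sketch of how a list of elements might be serialized into id | role | name | region rows. The ElementRow type and toPromptLines function are hypothetical, and the handling of pipes and whitespace inside names is an assumption; the changelog does not specify how Sentinel escapes them.

```typescript
// Hypothetical element shape using the four columns named in the changelog entry.
interface ElementRow {
  id: number;
  role: string;
  name: string;
  region: string;
}

// Serialize elements as one "id | role | name | region" row per line.
function toPromptLines(elements: ElementRow[]): string {
  // Assumption: pipes inside a field are replaced so they cannot be mistaken
  // for a column separator, and runs of whitespace are collapsed.
  const clean = (text: string): string =>
    text.replace(/\|/g, "/").replace(/\s+/g, " ").trim();
  return elements
    .map((el) => [String(el.id), clean(el.role), clean(el.name), clean(el.region)].join(" | "))
    .join("\n");
}

console.log(
  toPromptLines([
    { id: 1, role: "button", name: "Accept all", region: "modal" },
    { id: 2, role: "link", name: "Pricing", region: "nav" },
  ])
);
```

A row-per-element format like this is typically shorter than the equivalent JSON, since keys and quotes are not repeated for every element.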

v3.9.0 · 2026-04-08

OpenTelemetry traces and metrics: every LLM call, act(), extract(), and agent step is instrumented.

added · OpenTelemetry support: 6 span types (sentinel.agent → sentinel.agent.step → sentinel.act → sentinel.llm, plus sentinel.extract, sentinel.observe) and 6 metrics (act.requests, act.duration_ms, llm.requests, llm.tokens, llm.duration_ms, agent.steps). Zero overhead when no OTel SDK is configured.
fixed · ActionResult.selector was discarded by the outer Sentinel.act() wrapper; it is now correctly forwarded from ActionEngine.

v3.8.0 · 2026-04-08

Stable CSS selector export after every run(): paste directly into Playwright tests.

added · AgentResult.selectors: maps the camelCase slug of each instruction to the most stable CSS selector found. Priority: data-testid → #id → [name] → [placeholder] → [aria-label] → role:has-text.
added · ActionResult.selector: single act() calls now also expose the selector for the acted-on element.
added · slugifyInstruction() exported from @isoldex/sentinel: converts a natural-language instruction to a camelCase key.
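
The selector priority and the camelCase keys can be illustrated with a small sketch. Both functions below are hypothetical stand-ins, not the library's code: slugSketch is one plausible reading of what slugifyInstruction() does (the real function may drop or shorten words), and pickStableSelector simply walks the priority order listed above over an assumed ElementInfo shape.

```typescript
// Hypothetical description of an element; not a Sentinel type.
interface ElementInfo {
  tag: string;
  role?: string;
  text?: string;
  attrs: Record<string, string | undefined>;
}

// Assumed behavior: lowercase words joined in camelCase.
function slugSketch(instruction: string): string {
  const words = instruction.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return words
    .map((word, i) => (i === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1)))
    .join("");
}

// Walk the documented priority: data-testid, id, name, placeholder, aria-label, role + text.
function pickStableSelector(el: ElementInfo): string | undefined {
  const quote = (value: string): string => JSON.stringify(value); // double-quoted, escaped
  const testId = el.attrs["data-testid"];
  if (testId) return `[data-testid=${quote(testId)}]`;
  const id = el.attrs["id"];
  if (id) return `#${id}`; // assumes the id needs no CSS escaping
  for (const attr of ["name", "placeholder", "aria-label"]) {
    const value = el.attrs[attr];
    if (value) return `${el.tag}[${attr}=${quote(value)}]`;
  }
  // Mirrors the "role:has-text" wording; :has-text() is a Playwright pseudo-class,
  // and this form is only valid when the role name is also a tag name (e.g. button).
  if (el.role && el.text) return `${el.role}:has-text(${quote(el.text)})`;
  return undefined;
}

const key = slugSketch("Click the login button");
const selector = pickStableSelector({ tag: "input", attrs: { name: "email", placeholder: "Email" } });
console.log(key, selector); // clickTheLoginButton input[name="email"]
```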

v3.7.0 · 2026-04-08

Prompt caching: identical (prompt, schema) pairs return instantly at zero token cost.

added · promptCache option (false | true | string): in-memory LRU (200 entries) or file-persisted cache keyed by a djb2 hash of prompt + schema. Covers act(), extract(), observe(), and the agent loop.
added · sentinel.clearPromptCache(): flushes the prompt cache programmatically.
added · IPromptCache interface exported for custom backends (Redis, SQLite, etc.).
fixed · Sentinel.parallel() factory errors are now isolated per task: a browser launch failure no longer aborts remaining tasks.
fixed · extend() CDP session leak: calling extend() on the same page multiple times now detaches the previous session first.
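
As an illustration of the caching scheme described for promptCache, the sketch below combines a djb2 hash with a small Map-based LRU. It is a simplified sketch under assumptions: LruCache and cacheKey are hypothetical names, the way prompt and schema are combined into one key is a guess, and the actual IPromptCache interface is not reproduced here.

```typescript
// djb2 string hash (hash * 33 + charCode), kept in unsigned 32-bit range.
function djb2(text: string): number {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Assumption: the key is derived from the prompt plus the JSON-serialized schema.
function cacheKey(prompt: string, schema: unknown): string {
  return djb2(prompt + "\u0000" + JSON.stringify(schema ?? null)).toString(16);
}

// Minimal LRU: a Map iterates in insertion order, so the first key is the oldest.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries = 200) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  clear(): void {
    this.entries.clear();
  }
}

const cache = new LruCache<string>(200);
cache.set(cacheKey("Extract the price", { price: "number" }), '{"price": 19.99}');
console.log(cache.get(cacheKey("Extract the price", { price: "number" })));
```

Because djb2 is a 32-bit, non-cryptographic hash, collisions are possible; a production cache would likely store the original prompt alongside the value or use a longer digest.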

v3.6.0 · 2026-04-08

Sentinel.parallel(): concurrent browser sessions with a worker pool, error isolation, and progress callbacks.

added · Sentinel.parallel(tasks, options): runs N independent agent tasks in parallel, each in its own browser session. The concurrency option limits simultaneous sessions (default: 3).
added · onProgress callback: fires after each task with (completed, total, result).
added · ParallelTask, ParallelResult, ParallelOptions types exported.
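
The worker-pool pattern behind a call like Sentinel.parallel() can be sketched generically. The code below is an illustration only: runPool and the Settled type are hypothetical, tasks are plain async functions rather than browser sessions, and the real ParallelTask, ParallelResult, and ParallelOptions shapes are not shown in this changelog.

```typescript
type Task<T> = () => Promise<T>;
type Settled<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Run tasks with at most `concurrency` in flight; a failing task never aborts the others.
async function runPool<T>(
  tasks: Task<T>[],
  concurrency = 3,
  onProgress?: (completed: number, total: number, result: Settled<T>) => void
): Promise<Settled<T>[]> {
  const results: Settled<T>[] = new Array(tasks.length);
  let next = 0;
  let completed = 0;

  // Each worker pulls the next unclaimed index until the queue is empty.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      const task = tasks[index];
      if (!task) continue;
      let result: Settled<T>;
      try {
        result = { ok: true, value: await task() };
      } catch (error) {
        result = { ok: false, error }; // error isolation: record and move on
      }
      results[index] = result;
      completed++;
      if (onProgress) onProgress(completed, tasks.length, result);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage: three tasks, two at a time; the second one fails without stopping the rest.
runPool(
  [async () => "a", async () => { throw new Error("launch failed"); }, async () => "c"],
  2,
  (completed, total) => console.log(`${completed}/${total} done`)
).then((results) => console.log(results.map((r) => r.ok)));
```

Results are stored by task index, so the output order matches the input order even when tasks finish out of order.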

v3.5.0 · 2026-04-07

sentinel.extend(page): add AI capabilities to any existing Playwright Page object.

added · sentinel.extend(page): attaches act(), extract(), and observe() directly to any Playwright Page. Drop-in for existing Playwright projects.
added · verbose: 3 is a new debug level exposing chunk-processing stats and the full LLM decision JSON per act() call.
changed · verbose: 1 now logs action summaries only. Reasoning moved to verbose: 2. Minor breaking change for consumers relying on reasoning at level 1.

v3.4.0 · 2026-04-07

Chunk-processing, Shadow DOM, and iframe support.

added · filterRelevantElements(): keyword-overlap scoring reduces the elements sent to the LLM on pages with 200+ interactive elements. maxElements option (default: 50).
added · Full Shadow DOM support: parseDOMSnapshot() and parseFormElements() recursively pierce all shadow roots via queryShadowAll(). Covers Salesforce, ServiceNow, Lit, Polymer, Stencil.
added · iframe support: parseFrameElements() collects interactive elements from same-origin frames with coordinate offsets.
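
Keyword-overlap scoring, as mentioned for filterRelevantElements(), could look roughly like the sketch below. This is an assumed interpretation rather than the library's implementation: filterRelevantSketch, its threshold parameter, and the Interactive type are hypothetical, and the tokenizer is ASCII-only for brevity.

```typescript
// Hypothetical minimal element shape.
interface Interactive {
  id: number;
  role: string;
  name: string;
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Keep the maxElements elements whose role/name share the most words with the instruction.
// Assumption: filtering only kicks in once the page has at least `threshold` elements.
function filterRelevantSketch(
  instruction: string,
  elements: Interactive[],
  maxElements = 50,
  threshold = 200
): Interactive[] {
  if (elements.length < threshold) return elements;
  const query = new Set(tokenize(instruction));
  const scored = elements.map((el, index) => {
    const words = new Set(tokenize(el.role + " " + el.name));
    let score = 0;
    words.forEach((word) => {
      if (query.has(word)) score++;
    });
    return { el, index, score };
  });
  // Highest score first; ties keep their original page order.
  scored.sort((a, b) => b.score - a.score || a.index - b.index);
  return scored.slice(0, maxElements).map((entry) => entry.el);
}

const page: Interactive[] = [
  { id: 1, role: "link", name: "Home" },
  { id: 2, role: "button", name: "Login" },
  { id: 3, role: "textbox", name: "Search" },
  { id: 4, role: "button", name: "Sign up" },
];
// Small threshold so the filter applies to this tiny example.
console.log(filterRelevantSketch("click the login button", page, 2, 3).map((el) => el.id)); // [ 2, 4 ]
```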

v3.3.0 · 2026-04-07

Intelligent error messages with structured diagnostic output and actionable tips.

added · ActionResult.attempts: structured array of every tried path (coordinate-click, vision-grounding, locator-fallback) with specific errors.
added · Contextual tip in result.message: outside viewport → scroll suggestion, timeout → overlay hint, all paths exhausted → rephrase or enable visionFallback.

v3.2.0 · 2026-04-07

Self-healing locators: cache successful element lookups and skip the LLM on repeated calls.

added · locatorCache option (false | true | string): in-memory or file-persisted cache. On repeated act() calls with the same URL + instruction, Playwright uses the cached selector directly.
added · Automatic cache invalidation: if the cached element is gone, the entry is removed and the LLM path takes over.
added · ILocatorCache interface exported for custom backends (Redis, etc.).

v3.1.1 · 2026-04-07

Six bug fixes: stale DOM on retries, tab index corruption, elementCounter race, wrong Gemini model, token callback leak, MCP crash on browser failure.

fixed · stateParser.invalidateCache() is now called at the start of every retry attempt, not just once before the loop.
fixed · closeTab() now correctly decrements activePageIndex when a lower-index tab is closed.
fixed · elementCounter race condition in StateParser: it is now a local variable threaded through parallel parse() calls.
fixed · GeminiProvider.generateText() with systemInstruction now uses the constructor model, not process.env.GEMINI_VERSION.
fixed · onTokenUsage callback is nulled out on close() to prevent TokenTracker from being held in memory.
fixed · All 7 MCP tool handlers are now wrapped in try-catch: browser failures return isError: true instead of crashing the server.
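
The tab index fix is easiest to see with a small model. The sketch below is a hypothetical reconstruction of the bookkeeping, not Sentinel's closeTab(): TabState and closeTabSketch are illustrative names, tabs are represented as strings, and the behavior when the active tab itself is closed is an assumption.

```typescript
// Hypothetical tab bookkeeping: an ordered list of pages plus the active index.
interface TabState {
  pages: string[];
  activePageIndex: number;
}

function closeTabSketch(state: TabState, closedIndex: number): TabState {
  if (closedIndex < 0 || closedIndex >= state.pages.length) return state;
  const pages = state.pages.filter((_, i) => i !== closedIndex);
  let active = state.activePageIndex;
  if (closedIndex < active) {
    // Every tab to the right of the closed one shifts left by one,
    // so the index must follow to keep pointing at the same page.
    active -= 1;
  } else if (closedIndex === active) {
    // Assumption: focus falls to the tab now at the same position, or the last tab.
    // With no tabs left this yields -1.
    active = Math.min(active, pages.length - 1);
  }
  return { pages, activePageIndex: active };
}

const before: TabState = { pages: ["inbox", "docs", "checkout"], activePageIndex: 2 };
const after = closeTabSketch(before, 0);
console.log(after.pages[after.activePageIndex]); // checkout
```

Without the decrement, the index would stay at 2 after the close and point past the end of the two remaining tabs.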

v3.1.0 · 2026-04-07

CLI tool, MCP server, and Playwright Test integration, all in one release.

added · CLI: sentinel binary with 4 subcommands: run, act, extract, screenshot. Accepts --api-key, --headless, --model, --output, --url.
added · MCP server: 8 tools exposed via stdio transport: sentinel_goto, _act, _extract, _observe, _run, _screenshot, _close, _token_usage.
added · Playwright Test fixture: @isoldex/sentinel/test exports test with an ai fixture. sentinelOptions is configurable globally in playwright.config.ts.

v3.0.0 · 2026-04-07

extract step type in AgentLoop, append action, token tracking, withRetry utility.

breaking · AgentLoop constructor now requires extractionEngine as its second parameter. Only affects consumers constructing AgentLoop directly; sentinel.run() is unaffected.
added · AgentResult.data: the planner can now issue extract steps mid-run. Structured data is returned in the final AgentResult.
added · append action type: appends text to an input without clearing existing content.
added · Token usage tracking via the onTokenUsage callback: all four providers fire it after every LLM call.
added · withRetry() utility: unified exponential backoff extracted from all four providers.
fixed · domSettleTimeoutMs was not forwarded to ActionEngine; it is now passed to all three waitForPageSettle call sites.
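
A retry helper with exponential backoff, in the spirit of the withRetry() utility, might look like the following. This is a generic sketch under assumptions: withRetrySketch, RetryOptions, and all default values are hypothetical and do not reflect the actual signature or defaults of the exported utility.

```typescript
// Hypothetical options; the real utility's parameters are not documented in this changelog.
interface RetryOptions {
  retries?: number;       // how many times to retry after the first failure
  baseDelayMs?: number;   // delay before the first retry
  maxDelayMs?: number;    // upper bound on any single delay
  isTransient?: (error: unknown) => boolean; // only transient errors are retried
}

async function withRetrySketch<T>(fn: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const { retries = 3, baseDelayMs = 200, maxDelayMs = 5000, isTransient = () => true } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up when retries are exhausted or the error is not worth retrying.
      if (attempt >= retries || !isTransient(error)) throw error;
      // Delay doubles on each attempt: base, 2x base, 4x base, and so on, up to the cap.
      const delay = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage: retry only errors whose message suggests a transient failure.
let calls = 0;
withRetrySketch(
  async () => {
    calls++;
    if (calls < 3) throw new Error("timeout while waiting for element");
    return "ok";
  },
  { baseDelayMs: 10, isTransient: (e) => e instanceof Error && /timeout|detached/.test(e.message) }
).then((value) => console.log(value, "after", calls, "calls"));
```

Production implementations often add random jitter to the delay so that many clients retrying at once do not synchronize; that is omitted here for clarity.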

v2.3.3 · 2026-04-06

userDataDir: persistent browser profiles including IndexedDB (WhatsApp Web, PWAs).

added · userDataDir option: persists the full Chromium profile including IndexedDB and ServiceWorkers. Required for apps that use IndexedDB for auth (WhatsApp Web, etc.).

v2.3.1 · 2026-04-06

Native vision support for all LLM providers via analyzeImage().

added · analyzeImage() method added to all four providers (Gemini, OpenAI, Claude, Ollama). visionFallback: true now works with any vision-capable provider.
changed · Default Gemini model updated to gemini-3-flash-preview.

v2.3.0 · 2026-04-06

Contextual button naming, off-screen enrichment, withTimeout on all actions, 4-strategy locator chain.

added · Contextual button naming: StateParser walks AOM ancestors to enrich generic labels like 'Select plan' with card context ('Kelag | Fixtarif | 17,40 cent/kWh: Select plan').
added · withTimeout wrapper on all actions: a 10-second timeout prevents indefinite hangs.
added · Viewport bounds check before click + scrollIntoViewIfNeeded fallback.
added · Radio/checkbox JS click fallback: handles inputs hidden via CSS by traversing to the closest label.
added · 4-strategy locator chain: exact role+name → inexact role+name → CSS :has-text → plain text.
changed · MutationObserver DOM settle replaces networkidle: resolves after 300ms of DOM silence (cap: 3s).

v2.0.0 · 2026-04-05

Major release: Sentinel becomes a full AI agent framework.

added · Autonomous Agent Loop: sentinel.run(goal) with a Plan → Execute → Verify → Reflect cycle.
added · Vision Grounding: Gemini Vision fallback in act() via the visionFallback option.
added · Multi-LLM Provider System: GeminiProvider, OpenAIProvider, ClaudeProvider, OllamaProvider.
added · Multi-Tab and Multi-Browser support (Chromium, Firefox, WebKit).
added · Session Persistence: saveSession(), sessionPath option.
added · Record and Replay: startRecording(), stopRecording(), exportWorkflowAsCode(), replay().
added · Proxy and Stealth Mode: proxy option, humanLike delays, User-Agent rotation.
added · Event System: Sentinel extends EventEmitter and emits action, navigate, close.
added · Token Tracking: getTokenUsage(), exportLogs().
added · Structured Error Classes: SentinelError, ActionError, ExtractionError, NavigationError, AgentError, NotInitializedError.

v1.0.0 · 2025-01-01

Initial release: AOM-based browser automation with natural language actions.

added · Playwright-based browser automation (Chromium). AOM via CDP.
added · sentinel.act(instruction): click, fill, hover.
added · sentinel.extract(instruction, schema): Zod-typed structured extraction.
added · sentinel.observe(): page observation via AOM.
added · Semantic verification loop with automatic retry.
added · Gemini Flash / Pro integration, verbose logging levels.
