# DeepSyte

> Give your AI coding assistant eyes on every screen. Test and verify web apps and native mobile apps (screenshots, browser workflows, audits, and native Android/iOS automation), all via MCP tools.

DeepSyte is a Model Context Protocol (MCP) server ecosystem with 130+ tools that gives AI coding assistants the ability to see, interact with, and verify both websites and native mobile apps. The core DeepSyte MCP handles web: remote browser workflows for public pages, managed local browser workflows for private or local environments, campaign workflows, and evidence capture for proof-heavy debugging and QA. The `deepsyte mobile-mcp` server handles Android and iOS: native app gestures, element finding, app lifecycle, device control, and AI-first screen analysis via Appium. It works with Codex, Claude, Cursor, Windsurf, VS Code, and any MCP-compatible client.

## Canonical Endpoints

- MCP endpoint: `https://api.deepsyte.com/mcp`
- OAuth issuer and authorization server: `https://api.deepsyte.com`
- REST API base: `https://api.deepsyte.com`
- Valid Railway service domain: `https://screenshotsmcp-api-production.up.railway.app`
- Do not use `https://deepsyte-api-production.up.railway.app`; it is an obsolete Railway host and returns 404.
- Raw `sk_live_...` API keys are for REST only. MCP and CLI browser workflows require website OAuth and a `dso_...` session.
- If an agent tries a natural request such as "run an SEO test on example.com" before DeepSyte is connected, the MCP `401` response includes `mcp_url`, `login_url`, `setup_url`, and `codex_login_command`. Surface those fields and tell Codex users to run `codex mcp login deepsyte` against `https://api.deepsyte.com/mcp`.
- Cold agents with no DeepSyte memory can discover common MCP tool names from public metadata before auth.
  `/.well-known/mcp.json`, `/.well-known/oauth-protected-resource/mcp`, and the `/mcp` 401 JSON body include an informational `tool_catalog`; execution still requires website OAuth and a `dso_...` Bearer token.
- Common post-auth MCP tools include `screenshot_fullpage`, `take_screenshot`, `screenshot_responsive`, `browser_navigate`, `browser_screenshot`, `browser_click`, `browser_fill`, `browser_wait_for`, `browser_close`, `browser_seo_audit`, `browser_perf_metrics`, `responsive_audit`, `ux_review`, `extract_text_from_image`, `create_audit_report`, `get_audit_report_status`, `get_audit_report_link`, `create_audit_comparison_report`, `get_audit_comparison_report_link`, `create_audit_campaign`, `add_audit_campaign_targets`, `list_campaign_sender_capacity`, `update_campaign_sender_limits`, `launch_audit_campaign`, and `get_audit_campaign_status`.
- MCP screenshot/browser responses are evidence-rich by design: they include captured/current URL, sessionId, Run URL where available, viewport, result counts/filters/scripts, and `Evidence screenshot` URLs when an image was persisted. When a tool supports `caption`, pass a short user-facing reason so VS Code, Windsurf, Codex, and CLI transcripts explain what happened instead of only showing a tool name.

## Mobile MCP: Android & iOS Automation (deepsyte mobile-mcp)

The companion mobile MCP adds 78 tools for native Android and iOS automation. Install it alongside (or instead of) the main DeepSyte web MCP.

### Install

```json
{
  "mcpServers": {
    "deepsyte-mobile": {
      "command": "npx",
      "args": ["-y", "deepsyte@latest", "mobile-mcp"]
    }
  }
}
```

**Requirements:** ADB (Android SDK Platform Tools) in PATH. Appium 3.x running locally (`appium server`). For iOS: macOS + Xcode + WebDriverAgent.

### Mobile Quick Start

```
# 1. Start Appium
appium server

# 2. List connected devices
Ask agent: "What mobile devices are connected?"
→ mobile_devices

# 3. Start a session and automate
Ask agent: "Open com.example.myapp on emulator-5554 and tap the Login button"
→ mobile_session_start → mobile_tap_text → mobile_screenshot
```

### Mobile Session Tools

| Tool | Description |
|---|---|
| `mobile_devices` | List connected ADB devices |
| `mobile_session_start` | Start Appium session. Params: `deviceUdid`, `platformName` (Android\|iOS), `app` or `bundleId`, `noReset`, `extraCaps` |
| `mobile_session_stop` | Close session. Params: `sessionId` |
| `mobile_session_status` | Check session health. Params: `sessionId` |

### Mobile Screenshot & Screen Analysis

| Tool | Description |
|---|---|
| `mobile_screenshot` | ADB screenshot (no session needed). Params: `deviceUdid` |
| `mobile_screenshot_element` | Crop screenshot to element bounds. Params: `sessionId`, `elementId` |
| `mobile_describe_screen` | AI-readable element list from current screen. Params: `sessionId` |
| `mobile_assert_text` | Assert text visible; throws if not found. Params: `sessionId`, `text` |
| `mobile_wait_for_screen_change` | Wait until screen content changes. Params: `sessionId`, `timeout` |
| `mobile_uitree` | Raw accessibility XML via ADB (no session). Params: `deviceUdid` |
| `mobile_source` | Full Appium page source XML. Params: `sessionId` |

### Mobile Gestures

9 tools accept `screenshot: true` to return a confirmation screenshot after the action: `mobile_tap`, `mobile_tap_text`, `mobile_swipe`, `mobile_type`, `mobile_fill_field`, `mobile_long_press`, `mobile_double_tap`, `mobile_drag`, `mobile_scroll_to_text`

| Tool | Description |
|---|---|
| `mobile_tap` | Tap at coordinates. Params: `sessionId`, `x`, `y` |
| `mobile_tap_text` | Find element by visible text and tap. Params: `sessionId`, `text` |
| `mobile_double_tap` | Double-tap. Params: `sessionId`, `x`, `y` |
| `mobile_long_press` | Long press. Params: `sessionId`, `x`, `y`, `duration` |
| `mobile_swipe` | Swipe between two points. Params: `sessionId`, `startX`, `startY`, `endX`, `endY`, `duration` |
| `mobile_drag` | Drag-and-drop. Params: `sessionId`, `startX`, `startY`, `endX`, `endY` |
| `mobile_pinch` / `mobile_zoom` | Pinch/zoom gestures. Params: `sessionId`, `x`, `y`, `scale` |
| `mobile_type` | Type text via keyboard. Params: `sessionId`, `text` |
| `mobile_key` | Press hardware key (HOME, BACK, ENTER, VOLUME_UP…). Params: `sessionId`, `key` |
| `mobile_hide_keyboard` | Dismiss soft keyboard. Params: `sessionId` |
| `mobile_press_home` | Home button (Android only). Params: `sessionId` |
| `mobile_press_back` | Back button (Android only). Params: `sessionId` |
| `mobile_set_orientation` / `mobile_get_orientation` | Rotate device. Params: `sessionId`, `orientation` |

### Mobile Element Tools

| Tool | Description |
|---|---|
| `mobile_find_element` | Find by strategy + selector. Params: `sessionId`, `strategy` (accessibility id\|xpath\|id\|class name), `selector` |
| `mobile_find_all_elements` | Find all matching. Params: `sessionId`, `strategy`, `selector` |
| `mobile_find_by_text` | Cross-platform text search. Params: `sessionId`, `text` |
| `mobile_wait_for_element` | Wait until element appears. Params: `sessionId`, `strategy`, `selector`, `timeout` |
| `mobile_scroll_to_text` | Scroll until text visible. Params: `sessionId`, `text` |
| `mobile_element_text` | Get element's text. Params: `sessionId`, `elementId` |
| `mobile_fill_field` | Clear + type into field. Params: `sessionId`, `elementId`, `text` |
| `mobile_set_value` | Set value bypassing keyboard. Params: `sessionId`, `elementId`, `value` |
| `mobile_element_exists` | Check if element is present. Params: `sessionId`, `strategy`, `selector` |
| `mobile_execute` | Run `mobile:` script escape hatch. Params: `sessionId`, `script`, `args` |

### Mobile App Lifecycle Tools

| Tool | Description |
|---|---|
| `mobile_install_app` | Install APK/IPA. Params: `sessionId`, `appPath` |
| `mobile_uninstall_app` | Remove app. Params: `sessionId`, `bundleId` or `appPackage` |
| `mobile_activate_app` | Bring to foreground. Params: `sessionId`, `bundleId` or `appPackage` |
| `mobile_clear_app_data` | Clear data/cache without uninstalling (Android). Params: `sessionId`, `appPackage` |
| `mobile_app_launch` | Launch by package (ADB, no session). Params: `deviceUdid`, `appPackage`, `activity` |
| `mobile_app_stop` | Force-stop (ADB, no session). Params: `deviceUdid`, `appPackage` |
| `mobile_app_state` | Get app state (foreground/background/not running). Params: `sessionId`, `bundleId` |

### Mobile Device Control Tools

| Tool | Description |
|---|---|
| `mobile_grant_permission` / `mobile_revoke_permission` | ADB permission management (Android). Params: `sessionId`, `appPackage`, `permission` |
| `mobile_deep_link` | Open URL scheme or universal link. Params: `sessionId`, `url` |
| `mobile_clipboard` | Get or set clipboard text. Params: `sessionId`, `action` (get\|set), `text` |
| `mobile_geolocation` | Mock GPS coordinates. Params: `sessionId`, `latitude`, `longitude` |
| `mobile_push_file` / `mobile_pull_file` | File transfer to/from device. Params: `sessionId`, `localPath`, `remotePath` |

### Mobile Alerts, Context & WebView

| Tool | Description |
|---|---|
| `mobile_handle_alert` | Accept or dismiss alert/dialog. Params: `sessionId`, `action` (accept\|dismiss) |
| `mobile_get_contexts` | List Native + WebView contexts. Params: `sessionId` |
| `mobile_switch_context` | Switch context. Params: `sessionId`, `contextName` |

To automate Chrome on-device or a WebView: call `mobile_get_contexts` to find the context name, `mobile_switch_context` to enter it, then use `mobile_browser_*` tools (20 CDP tools: navigate, tap, type, evaluate JS, cookies, localStorage, network throttle, geolocation, user agent, and more).

### Mobile Screen Recording

| Tool | Description |
|---|---|
| `mobile_record_start` | Start screen recording. Params: `deviceUdid` |
| `mobile_record_stop` | Stop and return recording. Params: `deviceUdid` |

### Mobile Workflows

**Explore a native app:** → `mobile_session_start` → `mobile_screenshot` → `mobile_describe_screen` → tap → `mobile_screenshot`

**Test a login flow:** → `mobile_session_start` → `mobile_find_by_text("Email")` → `mobile_fill_field` → `mobile_find_by_text("Password")` → `mobile_fill_field` → `mobile_tap_text("Sign In")` → `mobile_wait_for_screen_change` → `mobile_assert_text("Dashboard")`

**Test a deep link:** → `mobile_session_start` → `mobile_deep_link("myapp://product/123")` → `mobile_assert_text("Product Name")` → `mobile_screenshot`

**Cross-platform:** → Run same flow with `platformName: "Android"` then `platformName: "iOS"`; tool names are identical.

**Slow emulator fix:** → Pass `extraCaps: { uiautomator2ServerLaunchTimeout: 60000, uiautomator2ServerInstallTimeout: 60000, adbExecTimeout: 60000 }` to `mobile_session_start`.

## Agent Skill

Download the complete Agent Skill for all tools: https://deepsyte.com/.skills/deepsyte/SKILL.md

## Discovery Model

- Treat DeepSyte tools as atomic actions.
- Treat the DeepSyte skill as broad guidance for choosing the right path.
- Treat packaged workflows as targeted procedures for repeatable multi-step jobs.
- When the task is an audit, verification flow, or another repeatable multi-step procedure, check the available workflows before improvising.
- For any site audit, performance audit, SEO audit, UX audit, full audit, or another repeatable multi-page website review, read `workflows/sitewide-performance-audit/WORKFLOW.md` before opening browser sessions, running audit tools, or drafting findings.
- If the user gives you a site URL but no page list, infer a representative public page set and start instead of blocking on permission.
- Default authenticated pages to out of scope unless the user explicitly asks for login, dashboard, or another protected flow.
- Do not load every workflow up front.
  Read only the workflow that matches the task.
- If terminal access exists and repeated tool calls are likely, prefer the CLI when it is clearly faster than repeated MCP round-trips. If terminal access is not available, stay in MCP.
- For multi-page performance audits in MCP, avoid opening many new browser sessions in parallel. Measure sequentially unless there is a proven reason to increase concurrency.

Available workflows:

- `workflows/sitewide-performance-audit/WORKFLOW.md`: use when the user asks why a site is slow, wants the slowest pages identified, or wants a repeatable multi-page performance review.
- `workflows/seo-audit/WORKFLOW.md`: use when the user asks to run an SEO test or audit. If DeepSyte is not connected, do not stop vaguely; show the MCP URL `https://api.deepsyte.com/mcp` and the login command from the MCP error.

## Quick Start

### Option A: CLI (fastest)

```bash
npx deepsyte setup --client codex  # or: cursor, vscode, windsurf, claude, claude-code
```

The CLI now also installs or repairs the managed core DeepSyte skill in `~/.agents/skills/deepsyte`, including `workflows/sitewide-performance-audit/WORKFLOW.md`, during successful `login`, `install`, and `setup` flows.

If you prefer to do onboarding in two steps, run `npx deepsyte login` followed by `npx deepsyte install `. For most clients that reaches the same result as `setup --client `. The main nuances are that `install vscode` writes a workspace-local `.vscode/mcp.json`, while `install claude-code` prints the `claude mcp add ...` command for you to run manually.

Use remote workflows first for public sites. Escalate to the managed local browser when you need localhost access, intranet or VPN reachability, authenticated realism, or explicit user-approved local execution. Managed local browser commands require a valid DeepSyte website OAuth session.
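When a request reaches the MCP endpoint before DeepSyte is connected, the `401` response carries the `mcp_url`, `login_url`, `setup_url`, and `codex_login_command` fields listed under Canonical Endpoints. The sketch below shows one way an agent might turn that body into a message for the user. It assumes the 401 body is a flat JSON object keyed by those field names; that layout and the `format_connect_hint` helper are illustrative assumptions, not a documented DeepSyte API.

```python
import json

# Field names come from the Canonical Endpoints section.
# ASSUMPTION: the 401 body is a flat JSON object keyed by these names.
HINT_FIELDS = ("mcp_url", "login_url", "setup_url", "codex_login_command")
CANONICAL_MCP_URL = "https://api.deepsyte.com/mcp"


def format_connect_hint(body_text: str) -> str:
    """Build a short, user-facing connection hint from a 401 response body."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}

    lines = ["DeepSyte is not connected yet."]
    for field in HINT_FIELDS:
        value = body.get(field)
        if value:
            lines.append(f"{field}: {value}")

    if len(lines) == 1:
        # Nothing usable in the body: fall back to the canonical MCP URL.
        lines.append(f"mcp_url: {CANONICAL_MCP_URL}")
    return "\n".join(lines)


sample_body = json.dumps({
    "mcp_url": CANONICAL_MCP_URL,
    "codex_login_command": "codex mcp login deepsyte",
})
print(format_connect_hint(sample_body))
```

Falling back to the canonical URL keeps the message actionable even when the body cannot be parsed.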
## Campaigns and Mailbox

DeepSyte Campaigns creates bulk one-page SEO audits or full SEO audits and sends each recipient their own tokenized report link. AgentMail is the two-way campaign mailbox provider. Resend is only for DeepSyte transactional/report delivery. Gmail, Outlook, Composio, and inbox mining are out of scope for this phase.

Campaign agents can:

- Create campaigns with `create_audit_campaign`; use `report_type: "full_site"` when every recipient should receive a full SEO audit, and `send_hours: "office"` or `send_hours: "anytime"` for the global campaign sending-hours mode.
- Add selected prospects directly with `add_audit_campaign_targets`; set a target `reportType: "full_site"` or `fullAudit: true` for a one-off full SEO audit row.
- Check real AgentMail sender capacity with `list_campaign_sender_capacity`.
- Switch sender domains or inboxes between `warmup` and `manual` daily-limit modes with `update_campaign_sender_limits`.
- Launch with `launch_audit_campaign`.
- Check recipient and send status with `get_audit_campaign_status`.

Campaign cron remains the only automated sender. It enforces campaign sending-hours mode, inbox caps, domain caps, campaign caps, warmup/manual mode, suppressions, and report quality gates. Do not claim that MCP, CLI, or API target import can bypass those gates.

Campaign sending-hours modes are global to the campaign and apply to Resend and AgentMail. `office` means Monday-Friday 9am-4pm in the campaign timezone. `anytime` means every day, 24-hour delivery.

Warmup defaults per inbox are 5/day for days 0-7, 10/day for days 8-14, 15/day for days 15-21, 20/day for days 22-30, 25/day for days 31-45, 30/day for days 46-60, 35/day for days 61-90, and 40/day hard max after 90 days. Manual mode is for prewarmed senders and shows a warning instead of blocking the user.
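The warmup tiers and sending-hours modes above amount to two small lookups. The sketch below restates them in Python for illustration only; the function names are hypothetical, and it assumes the caller has already converted the timestamp to the campaign timezone and that the 4pm end of the `office` window is exclusive.

```python
from datetime import datetime

# (last inbox-age day of the tier, sends per day), from the warmup defaults above.
WARMUP_TIERS = [(7, 5), (14, 10), (21, 15), (30, 20), (45, 25), (60, 30), (90, 35)]
HARD_MAX_PER_DAY = 40  # applies after day 90


def warmup_daily_cap(inbox_age_days: int) -> int:
    """Return the default warmup cap for an inbox of the given age in days."""
    if inbox_age_days < 0:
        raise ValueError("inbox_age_days must be non-negative")
    for last_day, cap in WARMUP_TIERS:
        if inbox_age_days <= last_day:
            return cap
    return HARD_MAX_PER_DAY


def in_send_window(local_now: datetime, send_hours: str) -> bool:
    """Check a campaign-local timestamp against the sending-hours mode.

    ASSUMPTION: `local_now` is already in the campaign timezone, and the
    office window is 09:00 inclusive to 16:00 exclusive.
    """
    if send_hours == "anytime":
        return True
    if send_hours == "office":
        is_weekday = local_now.weekday() < 5  # Monday is 0, Friday is 4
        return is_weekday and 9 <= local_now.hour < 16
    raise ValueError(f"unknown send_hours mode: {send_hours!r}")


print(warmup_daily_cap(10))
print(in_send_window(datetime(2026, 5, 12, 10, 30), "office"))
```

Campaign cron also enforces domain caps, campaign caps, suppressions, and report quality gates, none of which are modeled here.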
Campaign emails are plain-text-first, one recipient at a time, with no tracking pixels, no tracking redirects, no attachments, and at most one report link in the first outbound email. Campaign cron waits at least 5 minutes between report emails for the same account owner, and AgentMail inboxes also use a daily first-send offset plus deterministic spacing jitter from the campaign sending-hours window and effective daily cap.

Reports must not be evidence-free or partial. DataForSEO enrichment is used only for full SEO report paths, including paid full reports and campaign rows marked full.

AI Campaign Intelligence is evidence-bound and uses MiniMax Token Plan in production (`MINIMAX_API_KEY`, optional `MINIMAX_BASE_URL`, optional `MINIMAX_MODEL`, default model `MiniMax-M2.7`). Before campaign launch/delivery, rows can be scored `qualified`, `weak_fit`, `review`, or `block` from page HTML/text, domain signals, contact quality, business category, and existing suppressions. Campaign angles store `angle`, `angleReason`, `subjectHint`, `firstLine`, and `ctaStyle`; outbound templates stay controlled and AI only fills fact-backed snippets.

Public free reports may include AI report intro, executive summary, and upsell bridge. Paid full reports may include an AI strategic roadmap, 30-day plan, DataForSEO opportunity explanations, and AI-selected representative page sets. The MCP `ux_review` tool also uses MiniMax for text reasoning over verified browser evidence; server-side OCR is deprecated in favor of MiniMax Token Plan MCP `understand_image`.

Inbound AgentMail replies are classified automatically; safe replies can be answered automatically, while unsubscribe, angry, legal-risk, bounce-like, and not-interested replies suppress instead of selling.

## Hosted Audit Reports

DeepSyte hosted audit reports use tokenized `/r/[token]` links and can be created from the dashboard, workspace sharing, campaigns, MCP, or CLI.
Use `create_audit_report` when an agent needs a shareable single-page or full-site report link. Use `create_audit_comparison_report` when the user wants a before/after report from two audit ids for the same site.

Report tools and commands return `auditId`, `reportType`, `confidenceStatus`, `marketDataStatus`, `score`, `topFinding`, `reportUrl`, `pdfAction`, `runUrl`, `evidenceSummary`, `sourceSummary`, and `dataforseoSummary` when full-report market data exists. Hosted report pages also render the canonical `SeoReportModel`: weighted score buckets, separate confidence and freshness scores, source freshness, provider status, and evidence-linked action items.

Confidence statuses are `complete`, `usable_with_warning`, `retry_required`, and `blocked`. Full-report market statuses are `market_complete`, `market_partial`, `market_unavailable`, and `provider_failed`. Campaign delivery and workspace sharing must not send `retry_required` or `blocked` reports unless an explicit debug path is used.

CLI report commands:

```bash
deepsyte reports run https://example.com --type single-page --share
deepsyte reports run https://example.com --type full-site --share
deepsyte reports status
deepsyte reports link
deepsyte reports compare --before --after --share
```

Hosted report pages render the reconciled evidence model: SEO, schema, social, performance, HAR-style waterfall rows when captured, LCP proof, network, console, accessibility, responsive proof, screenshots, run links, scope, confidence reasons, market-data status, endpoint counts/costs/errors, and print/PDF layout.

`.au` domains default to DataForSEO Australia `2036` + `en` unless env overrides are set. If ranked keywords are empty, the full-report path seeds keyword and SERP checks from page titles, H1s, service URLs, business name, and audit facts before accepting a zero-data result. AI can explain evidence after reconciliation, but it must not invent raw traffic, ranking, revenue, competitor, or urgency claims.
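The delivery rule for confidence statuses described in this section can be expressed as a small guard. This is a sketch, not DeepSyte's implementation: the `can_deliver` name and the `debug` flag (standing in for the "explicit debug path") are assumptions made for illustration.

```python
# Status values are taken from the Hosted Audit Reports section.
CONFIDENCE_STATUSES = {"complete", "usable_with_warning", "retry_required", "blocked"}
UNSENDABLE_STATUSES = {"retry_required", "blocked"}


def can_deliver(confidence_status: str, debug: bool = False) -> bool:
    """Decide whether a report may go out via campaign delivery or sharing.

    ASSUMPTION: `debug=True` models the explicit debug path; the real
    mechanism for enabling that path is not described in this document.
    """
    if confidence_status not in CONFIDENCE_STATUSES:
        raise ValueError(f"unknown confidence status: {confidence_status!r}")
    if confidence_status in UNSENDABLE_STATUSES:
        return debug
    return True


for status in sorted(CONFIDENCE_STATUSES):
    print(status, can_deliver(status))
```

Rejecting unknown statuses, rather than treating them as sendable, keeps a newly introduced status from slipping past the gate unnoticed.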
## Escalation Ladder for website auth

Some sites (Cloudflare Turnstile, WorkOS AuthKit, Clerk bot-detection, Akamai/PerimeterX-protected signups) silently reject requests from the Railway-hosted cloud browser. `solve_captcha` returns a valid token, but the fingerprint gets filtered by Siteverify, and no amount of token injection fixes it. When a valid-looking submit silently does nothing (URL doesn't change, no error, form resets), escalate instead of retrying:

1. Start with MCP tools (`browser_navigate`, `smart_login`, `solve_captcha`).
2. If MCP stalls silently and `deepsyte whoami` confirms an authenticated website session, switch to the CLI local browser: `npx deepsyte browser:start `, then drive real Chrome one atomic command at a time with `browser:click`, `browser:fill`, `browser:paste` (React-compatible), `browser:wait-for`, `browser:inspect`, `browser:eval`. Real Chrome on the user's residential IP passes WorkOS/Turnstile trust checks silently, often without even showing a CAPTCHA checkbox.
3. Always call `deepsyte auth:plan ` before a fresh auth attempt and `deepsyte auth:record ` after. Inbox, password, and per-site auth state persist in the DB so the next run resumes at the right stage.
4. Use `+alias` emails (e.g. `you+smithery@agentmail.to`) to reuse a single inbox for multiple signups; most providers treat each alias as a distinct identity.

The interactive rule: read the returned PNG after every `browser:*` command, confirm the state, then issue the next command. No preset scripts.

For repeatable public-page performance audits, use the CLI only when the command path is already available or can be approved up front. If command approval would stall the run and MCP is already available, begin with MCP and collect metrics sequentially.

Important: `deepsyte skills ...` only manages the local core DeepSyte skill. For community skill discovery or installation, use the `find-skills` workflow or `npx skills find ...` / `npx skills add ...` instead.
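Step 4 of the escalation ladder relies on `+alias` addressing. A minimal sketch of deriving a per-site alias from one base inbox follows; `alias_email` is a hypothetical helper, and the tag character rules are an assumption chosen to stay within widely accepted local-part characters.

```python
def alias_email(base_address: str, tag: str) -> str:
    """Return `local+tag@domain` for a base inbox address.

    Any existing `+tag` on the base address is dropped so aliases do not stack.
    """
    local, sep, domain = base_address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {base_address!r}")

    # ASSUMPTION: restrict tags to letters, digits, '-' and '_' to stay safe.
    cleaned = tag.replace("-", "").replace("_", "")
    if not cleaned or not cleaned.isalnum() or not tag.isascii():
        raise ValueError(f"unsupported alias tag: {tag!r}")

    base_local = local.split("+", 1)[0]
    return f"{base_local}+{tag}@{domain}"


print(alias_email("you@agentmail.to", "smithery"))
```

Whether a given provider treats aliases as distinct identities varies by provider, so treat a rejected alias as a signal to record via the auth helpers rather than retry blindly.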
### Option B: Manual MCP OAuth

For Codex, add native HTTP MCP config to `~/.codex/config.toml`, then trigger Codex OAuth explicitly:

```toml
[mcp_servers.deepsyte]
url = "https://api.deepsyte.com/mcp"
scopes = ["mcp:tools"]
oauth_resource = "https://api.deepsyte.com/mcp"
```

```bash
codex mcp login deepsyte
```

If your Codex Desktop build signs in but does not mount `mcp__deepsyte` tools into a fresh thread, use `mcp-remote` as a temporary fallback:

```toml
[mcp_servers.deepsyte]
command = "npx"
args = ["-y", "mcp-remote@0.1.38", "https://api.deepsyte.com/mcp"]
```

Add the base MCP URL to an OAuth-capable MCP client:

```json
{
  "mcpServers": {
    "deepsyte": {
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

The client should open the DeepSyte website for sign-in before MCP tools work.

## VS Code Extension Preview

A native DeepSyte VS Code extension is now being developed in the monorepo for a dedicated Activity Bar sidebar, automatic browser OAuth sign-in, automatic editor MCP setup, automatic managed core skill sync, native MCP registration, command palette actions, a live activity timeline panel, and browser workflow UX inside the editor.

Current preview commands include `DeepSyte: Sign In`, `DeepSyte: Check Status`, `DeepSyte: Take Screenshot`, `DeepSyte: Open Timeline`, `DeepSyte: Configure Editor Integration`, and `DeepSyte: Sync Core Skill`. The sidebar also exposes quick actions and recent activity directly in VS Code, and the extension opens browser OAuth, configures the editor automatically, and repairs the managed core skill when no credentials are stored.

Until the Marketplace release is ready, the recommended VS Code setup is to install the preview VSIX, sign in once, and use `DeepSyte: Configure Editor Integration` only if you need to repair the automatic MCP setup or `DeepSyte: Sync Core Skill` when you need to repair the managed local skill.
## Chrome Extension Preview

The monorepo also includes a Chrome extension preview under `packages/chrome-extension`.

- **Public pages** use DeepSyte cloud capture after website authorization.
- **Localhost and private pages** stay local-first, so the extension can still capture and inspect pages that the cloud browser cannot reach.
- **Page tools** in the popup can read visible text and DOM HTML for the active tab.
- **Saved key validation** happens before the extension accepts a pasted API key, so revoked keys are rejected instead of being silently stored.
- **Viewer cloud actions** reuse existing cloud-backed captures when available and only upload local-only captures when needed.

Use the extension when you want a browser-native client for current-tab capture and inspection, while keeping parity with the DeepSyte platform for public URLs.

## CLI (for agents and humans)

DeepSyte has a CLI that exposes all tools as terminal commands. **AI agents can and should use the CLI directly via terminal/run_command.** It's often faster than MCP tool calls and returns structured text output.

Install: `npm install -g deepsyte`

npm: https://www.npmjs.com/package/deepsyte

Or use without installing: `npx deepsyte `

### Auth

```
deepsyte login    # OAuth (opens browser, saves website-issued session)
deepsyte whoami   # Check auth status
deepsyte logout   # Clear credentials
```

If `deepsyte login` opens a Railway 404 page, update the saved API URL to `https://api.deepsyte.com` or reinstall the latest CLI, then run login again.
### Campaigns

```
deepsyte campaigns create --name "Sydney plumbers" --send-at 2026-05-12T00:00:00.000Z --timezone Australia/Sydney --report-type single_page --send-hours office --file targets.csv
deepsyte campaigns add-targets --campaign --file targets.csv
deepsyte campaigns capacity
deepsyte campaigns sender-limits --account --mode warmup --daily-limit 10
deepsyte campaigns sender-limits --domain --mode manual --daily-limit 40
deepsyte campaigns launch
deepsyte campaigns status
```

Campaign target CSVs require website and email columns. Common aliases include `website`, `url`, `website_url`, `email`, `recipient_email`, `name`, `company`, and `first_line`. Optional `full_audit`, `full_seo`, `report_type`, or `audit_type` columns can mark one row as a full SEO audit.

### Screenshots

```
deepsyte screenshot                                    # 1280×800 viewport
deepsyte screenshot --width 1920 --height 1080 --full-page
deepsyte screenshot --format jpeg --delay 2000
deepsyte fullpage                                      # Dedicated full-page capture
deepsyte responsive                                    # Desktop + tablet + mobile
deepsyte mobile                                        # iPhone 14 Pro (393×852)
deepsyte tablet                                        # iPad (820×1180)
deepsyte dark                                          # Dark mode emulated
deepsyte element --selector "#hero"                    # CSS element capture
deepsyte diff                                          # Pixel-diff two URLs
deepsyte cross-browser                                 # Chromium + Firefox + WebKit
deepsyte batch                                         # Multiple URLs (max 10)
deepsyte pdf                                           # Export as PDF
deepsyte screenshots                                   # List recent screenshot jobs
deepsyte screenshot:status                             # Check screenshot job status
```

### Browser Sessions

```
deepsyte browse                                        # Start session → sessionId
deepsyte browse:click                                  # Click element
deepsyte browse:click-at 320 480                       # Coordinate click
deepsyte browse:fill                                   # Type into input
deepsyte browse:hover ".menu-trigger"                  # Hover state
deepsyte browse:select "select[name=country]" "Australia"
deepsyte browse:wait-for ".results-loaded"             # Wait for selector
deepsyte browse:screenshot                             # Capture current state
deepsyte browse:text                                   # Get visible text
deepsyte browse:html                                   # Get page HTML
deepsyte browse:a11y                                   # Accessibility tree
deepsyte browse:eval "document.title"                  # Evaluate JavaScript
deepsyte browse:console --level error                  # Console logs
deepsyte browse:network-errors                         # Failed requests
deepsyte browse:network-requests                       # Request waterfall
deepsyte browse:cookies get                            # Inspect cookies
deepsyte browse:storage getAll                         # Inspect storage
deepsyte browse:back                                   # Back in history
deepsyte browse:forward                                # Forward in history
deepsyte browse:viewport 393 852                       # Resize existing session
deepsyte browse:seo                                    # Session SEO audit
deepsyte browse:perf                                   # Session performance metrics
deepsyte browse:captcha                                # Solve CAPTCHA in-session
deepsyte browse:scroll --y 500                         # Scroll down
deepsyte browse:key Enter                              # Press key
deepsyte browse:goto                                   # Navigate
deepsyte browse:close                                  # End session
```

### Reviews & Audits

```
deepsyte review          # AI UX review (standalone, no Run created)
deepsyte seo             # SEO metadata, run-backed (shows verdict + summary in /dashboard/runs)
deepsyte perf            # Core Web Vitals, run-backed against sitewide-performance-audit workflow
deepsyte a11y            # Accessibility tree (standalone)
deepsyte ocr             # Extract text from image via AI vision (OCR)
deepsyte ocr --session   # OCR from browser session screenshot
deepsyte breakpoints     # Responsive breakpoints (standalone)
```

`perf` and `seo` open a workflow-aware browser session so the run lands in the dashboard Runs UI with a structured outcome (verdict, summary, findings, proof coverage, next actions). `review`, `a11y`, and `breakpoints` stay standalone and do not create a Run.
### Disposable Email

```
deepsyte auth:test               # Reuse auth memory + primary inbox
deepsyte auth:find-login         # Discover likely login pages
deepsyte auth:smart-login --username user@example.com --password secret
deepsyte auth:authorize-email    # Connect Gmail once for OTP reads
deepsyte auth:read-email         # Read latest Gmail OTP
deepsyte inbox:create            # Create or reuse the primary test inbox
deepsyte inbox:check             # Read messages, extract OTP codes
deepsyte inbox:send --to user@example.com --subject "Test" --text "Hello"
```

### Reusable Website Auth

- Start with `auth_test_assist` or `deepsyte auth:test ` for login, sign-up, or verification flows.
- Read the helper's recommended auth path, account-exists confidence, likely auth method, and expected follow-up before choosing sign-in or sign-up.
- Treat the helper's reusable strategy as the default cross-site guidance, and treat per-site hints as evidence rather than universal rules.
- Reuse the saved primary inbox and password unless you explicitly need a fresh registration.
- If sign-in fails because the account does not exist, switch to sign-up with the same saved credentials.
- If `smart_login` is uncertain on Clerk or other multi-step auth UIs, fall back to browser tools and inspect network or console evidence before concluding the login failed.
- Use `check_inbox` for verification codes or email links.
- After the attempt, call `auth_test_assist` again with `action: "record"` to save the outcome for future runs.
### Setup & Install

```
deepsyte setup                       # Interactive: login + choose IDE + auto-configure (recommended)
deepsyte setup --client cursor       # Non-interactive: for AI agents, skips prompt
deepsyte setup --client windsurf
deepsyte setup --client vscode
deepsyte setup --client claude
deepsyte setup --client claude-code
deepsyte browser open https://example.com                  # Launch extension-free local browser with explicit approval
deepsyte browser open https://example.com --record-video   # Record the full managed local browser session to a local .webm file
deepsyte browser back                # Navigate browser history backward
deepsyte browser forward             # Navigate browser history forward
deepsyte browser status              # Inspect the tracked managed local browser
deepsyte browser goto https://example.org   # Navigate the managed local browser
deepsyte browser click-at 320 480    # Click viewport coordinates in the managed local browser
deepsyte browser hover ".menu-trigger"      # Trigger hover states in the managed local browser
deepsyte browser wait-for ".results-loaded" --timeout 8000
deepsyte browser select "select[name=country]" "Australia"
deepsyte browser viewport 393 852    # Resize the managed local browser viewport
deepsyte browser screenshot          # Save a local screenshot from the managed browser
deepsyte browser text                # Read visible text from the managed browser
deepsyte browser console --level error      # Read captured console logs from the managed browser
deepsyte browser network-errors      # Read failed network requests from the managed browser
deepsyte browser network-requests --resource-type fetch --min-duration 200
deepsyte browser cookies get         # Inspect cookies in the managed browser
deepsyte browser storage getAll --type localStorage
deepsyte browser eval "document.title"      # Evaluate JavaScript in the managed browser
deepsyte browser a11y --max-depth 6  # Inspect the accessibility tree from the managed browser
deepsyte browser perf                # Read performance metrics from the managed browser
deepsyte browser seo                 # Audit SEO metadata from the managed browser
deepsyte browser close               # Close the tracked managed local browser
deepsyte skills list                 # List installed skills under ~/.agents/skills
deepsyte skills sync                 # Install, update, or repair the managed core skill
deepsyte skills update               # Alias for core skill sync
deepsyte install cursor              # Writes ~/.cursor/mcp.json
deepsyte install vscode              # Writes .vscode/mcp.json
deepsyte install windsurf            # Writes ~/.codeium/windsurf/mcp_config.json
deepsyte install claude              # Writes Claude Desktop config
deepsyte install claude-code         # Prints `claude mcp add` command
```

For community skills such as Anthropic's `frontend-design`, use `find-skills` or run `npx skills find frontend design` followed by `npx skills add anthropics/skills@frontend-design -g -y`.

### One-liner Install

```
# macOS/Linux
curl -fsSL https://deepsyte.com/install.sh | bash

# Windows PowerShell
irm https://deepsyte.com/install.ps1 | iex

# Or just use npx (no install needed)
npx deepsyte setup
```

### Agent Tips

- **AI agents: use `npx deepsyte setup --client ` to install non-interactively.**
- **Use CLI when you have terminal access**: structured text output, no JSON-RPC overhead.
- **For auth testing, start with `npx deepsyte auth:test https://example.com`** so you reuse saved inbox credentials, remembered auth history, and the helper's site-specific confidence signals.
- Every screenshot command returns a public CDN URL.
- CLI browser commands print the target session, current URL, result counts, and evidence screenshot path/URL. Read that output and relay the evidence details in your final answer.
- Browser sessions: start with `browse`, get sessionId, pass it to subsequent `browse:*` commands, and always `browse:close` when done.
- Use remote `browse:*` commands for public-site MCP-parity workflows. Use `deepsyte browser ...` only for the separate managed local browser used for localhost, VPN-only, or approval-gated work.
- Managed local browser commands under `deepsyte browser ...` now support continuous console/network capture while the browser stays open, plus history navigation, coordinate clicks, hover states, wait conditions, dropdown selection, viewport resizing, screenshots, text, HTML, cookies/storage inspection, script evaluation, accessibility trees, performance metrics, SEO audits, timestamped evidence bundle export via `browser evidence`, finalized video-inclusive export via `browser close --evidence`, and optional local `.webm` session recording against the tracked local browser.
- Prefer evidence-rich workflows when debugging: capture screenshots, logs, recordings, and bundle exports so the result is reviewable by both humans and agents.
- Credentials are stored in `~/.config/deepsyte/config.json`. Once logged in, all commands are authenticated.
- Use `npx deepsyte` if unsure whether it's installed globally.

## Webhooks

Subscribe an HTTPS endpoint to receive HMAC-signed events:

```
POST   /v1/webhooks                 # create endpoint, secret returned once
GET    /v1/webhooks                 # list endpoints
PATCH  /v1/webhooks/:id             # update url, events, enabled flag
POST   /v1/webhooks/:id/rotate      # rotate signing secret
POST   /v1/webhooks/:id/test        # fire test.ping
GET    /v1/webhooks/:id/deliveries  # last 50 delivery attempts
DELETE /v1/webhooks/:id             # remove endpoint
```

Available events: `screenshot.completed`, `screenshot.failed`, `run.completed`, `run.failed`, `quota.warning`, `test.ping`. Default subscription is `["*"]` (forward-compatible).

Headers on every delivery: `Webhook-Id`, `Webhook-Timestamp`, `Webhook-Signature: t=<timestamp>,v1=<signature>`. Verify the signature within 5 minutes of `Webhook-Timestamp`.

Retries: 6 attempts in total, the initial delivery plus retries on a 1m / 5m / 30m / 2h / 12h schedule. Exhausted deliveries are visible in `GET /v1/webhooks/:id/deliveries`.
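
The delivery check described above can be sketched roughly as follows. This is a minimal illustration, not the documented algorithm: it assumes the signature is a hex-encoded HMAC-SHA256 over `<timestamp>.<raw body>` and that `Webhook-Timestamp` is in Unix seconds, neither of which is stated here. The function names are hypothetical; confirm the real signing scheme in the full webhook reference before relying on it.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 5 * 60  # "verify within 5 minutes" from the section above


def parse_signature_header(header: str) -> dict:
    """Split a 't=...,v1=...' header into {'t': ..., 'v1': ...}."""
    parts = {}
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        parts[key] = value
    return parts


def verify_delivery(secret: str, body: bytes, signature_header: str,
                    now: Optional[float] = None) -> bool:
    """Return True only for a fresh delivery with a matching signature."""
    parts = parse_signature_header(signature_header)
    try:
        timestamp = int(parts["t"])  # ASSUMPTION: Unix seconds
    except (KeyError, ValueError):
        return False

    current = time.time() if now is None else now
    if abs(current - timestamp) > TOLERANCE_SECONDS:
        return False  # outside the 5-minute window

    # ASSUMPTION: signed payload is "<t>.<raw body>", HMAC-SHA256, hex digest.
    signed = f"{timestamp}.".encode() + body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), parts.get("v1", "").encode())
```

Always pass the raw request bytes as `body`; re-serializing parsed JSON can change whitespace and break the comparison.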
Full reference: https://deepsyte.com/docs/api/webhooks

## Server URL

MCP endpoint: `https://api.deepsyte.com/mcp`
REST API base: `https://api.deepsyte.com`
Transport: Streamable HTTP

OAuth discovery should point to `https://api.deepsyte.com/.well-known/oauth-protected-resource/mcp` and `https://api.deepsyte.com/.well-known/oauth-authorization-server`.

## Authentication

REST API requests use API keys. MCP and CLI access use website OAuth sessions.

- MCP: Use the MCP endpoint and complete the browser sign-in prompt
- REST API: Pass the API key as an `Authorization: Bearer sk_live_...` header

Rate limits by plan:
- Free: 100 screenshots/month
- Starter: 2,000 screenshots/month
- Pro: 10,000 screenshots/month

---

## Tools Reference (53+ tools)

### Screenshot Tools (no session needed)

#### take_screenshot
Capture a screenshot of any URL and return a public image URL.

Parameters:
- url (string, required): The URL to screenshot
- width (number, default: 1280): Viewport width in pixels (320–3840)
- height (number, default: 800): Viewport height in pixels (240–2160)
- fullPage (boolean, default: true): Capture full scrollable page
- maxHeight (number, optional): Cap extremely tall captures
- format (string, default: "png"): png, jpeg, or webp
- delay (number, default: 0): Wait ms after page load (0–10000)

#### screenshot_fullpage
Capture entire scrollable page.
Params: url, width, format, maxHeight

#### screenshot_mobile
iPhone 14 Pro (393×852).
Params: url, fullPage, format

#### screenshot_tablet
iPad (820×1180).
Params: url, fullPage, format

#### screenshot_responsive
Desktop + tablet + mobile in ONE call. Most efficient for responsive visual comparison. For detailed checks (overflow, touch targets, font sizes), follow up with responsive_audit in a browser session.
Params: url, fullPage, format

#### screenshot_dark
Dark mode emulated (prefers-color-scheme: dark).
Params: url, width, height, format

#### screenshot_element
Specific element by CSS selector. SPA-friendly with auto-wait.
Params: url, selector, format, delay

#### screenshot_pdf
Export as PDF (A4 with backgrounds).
Params: url

#### screenshot_batch
Capture multiple URLs in one call (max 10).
Params: urls[], width, height, format, fullPage

#### screenshot_cross_browser
Chromium + Firefox + WebKit simultaneously.
Params: url, width, height, fullPage

#### screenshot_diff
Pixel-diff two URLs. Returns diff image + percentage changed + match score. To capture multiple URLs for comparison, use screenshot_batch.
Params: urlA, urlB, width, height, threshold

#### find_breakpoints
Detect responsive breakpoints (scans 320px–1920px). Returns structured width table with overflow status (✅/❌), height, and scrollWidth at each width. For element-level issues (culprit elements, touch targets, font sizes), follow up with responsive_audit.
Params: url

#### responsive_audit
One-call responsive design audit in a browser session. Checks: horizontal overflow with culprit elements, touch target sizes (≥44×44px), text below 16px, viewport meta tag, input font sizes for iOS zoom prevention, and interactive element spacing. Returns structured pass/fail report.
Params: sessionId

#### list_recent_screenshots
View recent captures.
Params: limit (1–20)

#### get_screenshot_status
Check if a job is done.
Params: id

---

### Browser Session Tools

Start with `browser_navigate` → get sessionId → pass to all tools → `browser_close` when done.

**Both `browser_navigate` and `browser_close` return a `Run URL` pointing to the live dashboard for this run (timeline + captures + replay + console + network). Always surface the Run URL to the user at the end of the task so they can review the evidence. If a `Share URL` is also returned, include it for teammates who don't have an account.**

#### browser_navigate
Open URL, returns screenshot + sessionId + `Run URL` (dashboard deep-link for this run). Surface the Run URL to the user at the end of the task. Supports workflow-aware outcome context for run summaries.
Params: url, sessionId (optional), width, height, record_video, task_type, user_goal, workflow_name, workflow_required, auth_scope, tool_path, page_set, required_evidence

#### browser_click
Click by CSS selector or visible text.
Params: sessionId, selector

#### browser_click_at
Click at x,y coordinates, for CAPTCHAs, canvas, iframes.
Params: sessionId, x, y, clickCount, delay

#### browser_fill
Type into input field (clears first).
Params: sessionId, selector, value

#### browser_hover
Trigger hover states/tooltips/dropdowns.
Params: sessionId, selector

#### browser_select_option
Select from dropdown.
Params: sessionId, selector, value

#### browser_press_key
Keyboard: Enter, Tab, Escape, Control+a, etc.
Params: sessionId, key

#### browser_scroll
Scroll by pixel amount.
Params: sessionId, x, y

#### browser_wait_for
Wait for element to appear.
Params: sessionId, selector, timeout

#### browser_go_back / browser_go_forward
Browser history navigation.
Params: sessionId

#### browser_set_viewport
Resize viewport mid-session (e.g. desktop ↔ mobile).
Params: sessionId, width, height

#### browser_close
Free resources. Always call when done. Returns a `Run URL` pointing to the dashboard view of this run; you MUST include this Run URL in your final reply so the user can review the captured timeline, evidence, console, and network. Also returns a `Share URL` when one exists (public link for teammates).
Params: sessionId

#### browser_screenshot
Screenshot current page state.
Params: sessionId

#### browser_get_text
All visible text (or specific element). Returns fast "no matching element" error instead of hanging.
Params: sessionId, selector (optional), timeout (default 5000ms, 500–30000)

#### browser_get_html
DOM source. Returns fast "no matching element" error instead of hanging.
Params: sessionId, selector (optional), outer, timeout (default 5000ms, 500–30000)

#### browser_get_accessibility_tree
Full a11y tree; best for understanding page structure.
Params: sessionId, interestingOnly, maxDepth

#### accessibility_snapshot
A11y tree for any URL without a session.
Params: url, interestingOnly, maxDepth

#### accessibility_audit
Run a real WCAG 2.1 AA compliance audit on a URL. Checks landmarks, skip links, focus indicators, heading hierarchy, image alt text, aria-hidden on decorative SVGs, color contrast ratios, form labels, touch targets, and reduced-motion handling. Returns categorized PASS/FAIL results with WCAG criteria references. For element-level responsive checks (overflow culprits, touch target sizes, font sizes), use responsive_audit in a browser session.
Params: url, width, height

#### browser_evaluate
Run JavaScript, return result.
Params: sessionId, script

---

### Performance & SEO

#### browser_perf_metrics
Core Web Vitals: LCP, FCP, CLS, TTFB, DOM size, resource counts. For the full request waterfall with timing data, use browser_network_requests.
Good thresholds: TTFB < 800ms, FCP < 1.8s, LCP < 2.5s, CLS < 0.1
Params: sessionId

#### browser_network_requests
Full network waterfall with timing.
Params: sessionId, resourceType, minDuration, limit

#### browser_seo_audit
Meta, OG, Twitter cards, headings, JSON-LD, alt text, structured data.
Params: sessionId

#### seo_batch_compare
Compare SEO metadata across 2–10 URLs in one call. Returns a comparison table showing which meta fields are duplicated across pages, catching identical titles, descriptions, OG tags, and canonical issues that single-page tools miss. No browser session needed. For deeper single-page analysis, use browser_seo_audit in a browser session. For social card previews, use og_preview.
Params: urls (array of 2–10 URLs)

#### og_preview
Preview how a URL will look when shared on social media. Extracts all OG and Twitter Card meta tags from the rendered page, validates them, screenshots the og:image, and generates a social card mockup. Works with JS-rendered pages (SPAs). No browser session needed.
For full SEO metadata (headings, structured data, robots), use browser_seo_audit. To compare OG tags across multiple pages, use seo_batch_compare.
Params: url (required), platform (twitter|facebook|linkedin|slack|all, default: all)

---

### Debugging

#### browser_console_logs
Console errors, warnings, logs, exceptions.
Params: sessionId, level, limit

#### browser_network_errors
Failed requests (4xx, 5xx).
Params: sessionId, limit

#### browser_cookies
Get/set/clear cookies.
Params: sessionId, action, cookies[]

#### browser_storage
Read/write localStorage and sessionStorage.
Params: sessionId, storageType, action, key, value

---

### Smart Login

#### auth_test_assist
Start here for website login, sign-up, and verification testing. Reuses the saved inbox/password, checks remembered auth state for the site's normalized origin, and returns reusable auth strategy plus site-specific signals such as recommended auth path, account-exists confidence, likely auth method, expected follow-up, and known-site history.
Params: url, action, intent, loginUrl, outcome, verification_required, username, display_name, force_new_inbox, notes

#### find_login_page
Discover login pages via sitemap.xml + common paths. After finding the login URL, use auth_test_assist to plan the auth flow, or smart_login to attempt sign-in directly.
Params: url

#### smart_login
Auto-detect form fields, fill credentials, submit with click, Enter, and form-submit fallbacks, then report result.
Params: loginUrl, username, password, usernameSelector, passwordSelector, submitSelector
Returns: screenshot + status (SUCCESS/FAILED/UNCERTAIN) + sessionId.

---

### CAPTCHA Solving

#### solve_captcha
Auto-detect and solve Cloudflare Turnstile, reCAPTCHA v2/v3, hCaptcha using AI (CapSolver). For Clerk-powered sites, automatically calls sign-up/sign-in API with the solved token.
Params: sessionId, type (auto), sitekey (auto), pageUrl (auto), autoSubmit (default: true)

---

### Disposable Email (AgentMail)

Each user needs their own AgentMail API key (free at https://console.agentmail.to). Configure in Dashboard → Settings.

#### create_test_inbox
Standalone inbox helper. Create or reuse the saved primary inbox and return its email, password, inbox ID, and known-site history. For website auth work, start with auth_test_assist first so you also get reusable cross-site strategy and remembered per-site guidance.
Params: username (optional), display_name (optional), force_new (optional)

#### check_inbox
Read messages, auto-extracts OTP codes and verification links.
Params: inbox_id, limit

#### send_test_email
Send email from an inbox.
Params: inbox_id, to, subject, text

---

### Gmail Verification (OAuth)

#### authorize_email_access
One-time OAuth setup for Gmail.

#### read_verification_email
Read OTP codes from user's Gmail inbox.
Params: sender (optional), subject_keyword (optional), max_age_minutes

---

### AI-Powered Analysis

#### ux_review
AI-powered UX review using vision. Returns actionable feedback across Accessibility, SEO, Performance, Navigation, Content, and Mobile-friendliness. For deeper checks, follow up with accessibility_audit (WCAG compliance), responsive_audit (overflow, touch targets, font sizes), or browser_perf_metrics (Core Web Vitals).
Params: url, width, height

#### extract_text_from_image
Extract text from an image using AI vision (OCR). Works on screenshots, photos of text, infographics, social cards, Canva graphics, and any image with embedded text. If you need a screenshot URL first, use take_screenshot or browser_screenshot.
Params: image_url (optional: public URL of image), sessionId (optional: screenshot current page), selector (optional: OCR a specific element), prompt (optional: custom extraction prompt)
Requires either image_url or sessionId. Use when page text is embedded in images rather than DOM.
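
Most MCP clients assemble tool calls for you, but it can help to see what a request for one of these tools might look like on the wire. The sketch below builds a JSON-RPC `tools/call` request for `extract_text_from_image` and enforces the "requires either image_url or sessionId" rule before sending. The envelope follows the general MCP `tools/call` shape; the helper name `build_ocr_call` is hypothetical, and the argument names are taken from the Params line above.

```python
from typing import Optional


def build_ocr_call(request_id: int,
                   image_url: Optional[str] = None,
                   session_id: Optional[str] = None,
                   selector: Optional[str] = None,
                   prompt: Optional[str] = None) -> dict:
    """Build a tools/call request body for extract_text_from_image."""
    # The tool needs a public image URL or an open browser session to read from.
    if not image_url and not session_id:
        raise ValueError("extract_text_from_image needs image_url or sessionId")

    arguments = {
        "image_url": image_url,
        "sessionId": session_id,
        "selector": selector,
        "prompt": prompt,
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "extract_text_from_image",
            # Drop unset optional arguments instead of sending nulls.
            "arguments": {k: v for k, v in arguments.items() if v is not None},
        },
    }
```

Validating locally like this is a design choice: it turns a round trip that would fail on the server into an immediate, readable error in the agent's own process.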
---

## Common Workflows

### Responsive Testing
→ `find_breakpoints` to scan all widths for overflow and layout shifts
→ `browser_navigate` (mobile viewport) → `responsive_audit` for element-level checks
→ `browser_set_viewport` (tablet) → `responsive_audit` again
→ `screenshot_responsive` for visual comparison across viewports

### Full Site Audit
→ First read `workflows/sitewide-performance-audit/WORKFLOW.md` before any browser or audit tool use
→ State that you read it, the page set, whether authenticated pages are in scope, and whether you will use MCP or CLI first
→ If the user gave the site URL but not the page list, infer the representative public pages and begin
→ `browser_navigate` → `browser_get_accessibility_tree` → `browser_perf_metrics` → `browser_seo_audit` → `og_preview` → `browser_console_logs` → `browser_network_errors`

### Sign-Up Testing with Disposable Email
1. `auth_test_assist` → decide whether sign-in or sign-up should be attempted first
2. `find_login_page` if needed
3. Reuse the saved primary inbox and password from the helper unless you explicitly need a fresh registration
4. `smart_login` for the first auth attempt or `browser_fill` / `browser_click` for manual multi-step flows
5. `check_inbox` → extract OTP or verification link
6. Record the outcome with `auth_test_assist`
7. Report reusable auth-system heuristics first, then the site-specific evidence that supported them

### Before/After Comparison
→ `screenshot_diff` with urlA and urlB → pixel diff + match score

### Cross-Browser Testing
→ `screenshot_cross_browser` for Chromium + Firefox + WebKit

---

## REST API

Base URL: https://api.deepsyte.com

### POST /v1/screenshot
Enqueue a screenshot job.
Headers: Authorization: Bearer sk_live_...
Body: { url, width, height, fullPage, format, delay }
Response: { id, status: "pending" }

### GET /v1/screenshot/:id
Poll for status.
Response: { id, status, url, error, createdAt }

---

## Installation by Client

### Cursor
~/.cursor/mcp.json:
```json
{
  "mcpServers": {
    "deepsyte": {
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

### Windsurf
```json
{
  "mcpServers": {
    "deepsyte": {
      "serverUrl": "https://api.deepsyte.com/mcp"
    }
  }
}
```

### Claude Desktop / Claude Code
```json
{
  "mcpServers": {
    "deepsyte": {
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

### VS Code (Copilot)
.vscode/mcp.json (VS Code uses a top-level `servers` key rather than `mcpServers`):
```json
{
  "servers": {
    "deepsyte": {
      "type": "http",
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

### Gemini CLI / OpenCode / Roo Code / Other MCP clients
```json
{
  "mcpServers": {
    "deepsyte": {
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

---

## Documentation

Full docs: https://deepsyte.com/docs
Agent Skill: https://deepsyte.com/.skills/deepsyte/SKILL.md
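
As a closing illustration of the REST API flow described earlier (enqueue with `POST /v1/screenshot`, then poll `GET /v1/screenshot/:id`), here is a minimal sketch using only the Python standard library. It assumes that any `status` other than `"pending"` means the job has finished, which this document does not confirm; the helper names `api_request` and `capture_screenshot`, the polling interval, and the poll limit are all illustrative choices.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

API_BASE = "https://api.deepsyte.com"


def api_request(method: str, path: str, api_key: str,
                payload: Optional[dict] = None) -> dict:
    """Send one authenticated JSON request to the REST API."""
    data = json.dumps(payload).encode() if payload is not None else None
    request = urllib.request.Request(
        API_BASE + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",  # sk_live_... REST key
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def capture_screenshot(url: str, api_key: str,
                       request: Callable[..., dict] = api_request,
                       interval: float = 2.0, max_polls: int = 30,
                       sleep: Callable[[float], None] = time.sleep) -> dict:
    """Enqueue a screenshot job, then poll until it leaves 'pending'."""
    job = request("POST", "/v1/screenshot", api_key, {"url": url, "format": "png"})
    for _ in range(max_polls):
        job = request("GET", f"/v1/screenshot/{job['id']}", api_key)
        # ASSUMPTION: anything other than "pending" is a terminal state.
        if job.get("status") != "pending":
            return job
        sleep(interval)
    raise TimeoutError(f"screenshot job {job['id']} still pending after {max_polls} polls")
```

The `request` and `sleep` parameters exist so the polling logic can be exercised without network access; in normal use, `capture_screenshot("https://example.com", api_key)` is enough, and the returned dictionary carries the `url` or `error` field from the status response.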