# DeepSyte

> Give your AI coding assistant eyes on every screen. Test and verify web apps and native mobile apps — screenshots, browser workflows, audits, and native Android/iOS automation — all via MCP tools.

DeepSyte is a Model Context Protocol (MCP) server ecosystem with 130+ tools that gives AI coding assistants the ability to see, interact with, and verify both websites and native mobile apps. The core DeepSyte MCP handles web: remote browser workflows for public pages, managed local browser workflows for private or local environments, campaign workflows, and evidence capture for proof-heavy debugging and QA. The `deepsyte mobile-mcp` server handles Android and iOS: native app gestures, element finding, app lifecycle, device control, and AI-first screen analysis via Appium. It works with Codex, Claude, Cursor, Windsurf, VS Code, and any MCP-compatible client.

## Canonical Endpoints

- MCP endpoint: `https://api.deepsyte.com/mcp`
- OAuth issuer and authorization server: `https://api.deepsyte.com`
- REST API base: `https://api.deepsyte.com`
- Valid Railway service domain: `https://screenshotsmcp-api-production.up.railway.app`
- Do not use `https://deepsyte-api-production.up.railway.app`; it is an obsolete Railway host and returns 404.
- Raw `sk_live_...` API keys are for REST only. MCP and CLI browser workflows require website OAuth and a `dso_...` session.
- If an agent tries a natural request such as "run an SEO test on example.com" before DeepSyte is connected, the MCP `401` response includes `mcp_url`, `login_url`, `setup_url`, and `codex_login_command`. Surface those fields and tell Codex users to run `codex mcp login deepsyte` against `https://api.deepsyte.com/mcp`.
- Cold agents with no DeepSyte memory can discover common MCP tool names from public metadata before auth. `/.well-known/mcp.json`, `/.well-known/oauth-protected-resource/mcp`, and the `/mcp` 401 JSON body include an informational `tool_catalog`; execution still requires website OAuth and a `dso_...` Bearer token.
- Common post-auth MCP tools include `screenshot_fullpage`, `take_screenshot`, `screenshot_responsive`, `browser_navigate`, `browser_screenshot`, `browser_click`, `browser_fill`, `browser_wait_for`, `browser_close`, `browser_seo_audit`, `browser_perf_metrics`, `responsive_audit`, `ux_review`, `extract_text_from_image`, `create_audit_report`, `get_audit_report_status`, `get_audit_report_link`, `create_audit_comparison_report`, `get_audit_comparison_report_link`, `create_audit_campaign`, `add_audit_campaign_targets`, `list_campaign_sender_capacity`, `update_campaign_sender_limits`, `launch_audit_campaign`, and `get_audit_campaign_status`.
- MCP screenshot/browser responses are evidence-rich by design: they include captured/current URL, sessionId, Run URL where available, viewport, result counts/filters/scripts, and `Evidence screenshot` URLs when an image was persisted. When a tool supports `caption`, pass a short user-facing reason so VS Code, Windsurf, Codex, and CLI transcripts explain what happened instead of only showing a tool name.
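
The credential and `401` rules above can be sketched in a few lines of Python. This is a minimal illustration, not DeepSyte's client code: the four field names come from the list above, while the function names, the sample values, and the assumption that the fields sit at the top level of the JSON body are hypothetical.

```python
import json

MCP_URL = "https://api.deepsyte.com/mcp"


def credential_kind(token: str) -> str:
    """Classify a credential by prefix, following the rule stated above."""
    if token.startswith("sk_live_"):
        return "rest_api_key"   # REST only; not valid for MCP or CLI browser workflows
    if token.startswith("dso_"):
        return "oauth_session"  # website OAuth session used by MCP and the CLI
    return "unknown"


def connect_guidance(body_text: str) -> list:
    """Turn a 401 JSON body into lines an agent can show the user.

    Assumes the four fields are top-level keys (hypothetical body shape).
    """
    body = json.loads(body_text)
    lines = [f"MCP URL: {body.get('mcp_url', MCP_URL)}"]
    labels = (
        ("login_url", "Log in"),
        ("setup_url", "Setup guide"),
        ("codex_login_command", "Codex users run"),
    )
    for field, label in labels:
        if body.get(field):
            lines.append(f"{label}: {body[field]}")
    return lines


# Hypothetical 401 body; real values come from the server.
sample_401 = json.dumps({
    "mcp_url": MCP_URL,
    "login_url": "https://example.invalid/login",
    "setup_url": "https://example.invalid/setup",
    "codex_login_command": "codex mcp login deepsyte",
})

for line in connect_guidance(sample_401):
    print(line)
```

Falling back to the canonical MCP URL when `mcp_url` is missing keeps the guidance useful even if the body is incomplete.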

## Mobile MCP — Android & iOS Automation (deepsyte mobile-mcp)

The companion mobile MCP adds 78 tools for native Android and iOS automation. Install it alongside (or instead of) the main DeepSyte web MCP.

### Install

```json
{
  "mcpServers": {
    "deepsyte-mobile": {
      "command": "npx",
      "args": ["-y", "deepsyte@latest", "mobile-mcp"]
    }
  }
}
```

**Requirements:** ADB (Android SDK Platform Tools) in PATH. Appium 3.x running locally (`appium server`). For iOS: macOS + Xcode + WebDriverAgent.

### Mobile Quick Start

```
# 1. Start Appium
appium server

# 2. List connected devices
Ask agent: "What mobile devices are connected?" → mobile_devices

# 3. Start a session and automate
Ask agent: "Open com.example.myapp on emulator-5554 and tap the Login button"
→ mobile_session_start → mobile_tap_text → mobile_screenshot
```

### Mobile Session Tools

| Tool | Description |
|---|---|
| `mobile_devices` | List connected ADB devices |
| `mobile_session_start` | Start Appium session. Params: `deviceUdid`, `platformName` (Android\|iOS), `app` or `bundleId`, `noReset`, `extraCaps` |
| `mobile_session_stop` | Close session. Params: `sessionId` |
| `mobile_session_status` | Check session health. Params: `sessionId` |

### Mobile Screenshot & Screen Analysis

| Tool | Description |
|---|---|
| `mobile_screenshot` | ADB screenshot (no session needed). Params: `deviceUdid` |
| `mobile_screenshot_element` | Crop screenshot to element bounds. Params: `sessionId`, `elementId` |
| `mobile_describe_screen` | AI-readable element list from current screen. Params: `sessionId` |
| `mobile_assert_text` | Assert text visible — throws if not found. Params: `sessionId`, `text` |
| `mobile_wait_for_screen_change` | Wait until screen content changes. Params: `sessionId`, `timeout` |
| `mobile_uitree` | Raw accessibility XML via ADB (no session). Params: `deviceUdid` |
| `mobile_source` | Full Appium page source XML. Params: `sessionId` |

### Mobile Gestures

Nine tools accept `screenshot: true` to return a confirmation screenshot after the action: `mobile_tap`, `mobile_tap_text`, `mobile_swipe`, `mobile_type`, `mobile_fill_field`, `mobile_long_press`, `mobile_double_tap`, `mobile_drag`, and `mobile_scroll_to_text`.

| Tool | Description |
|---|---|
| `mobile_tap` | Tap at coordinates. Params: `sessionId`, `x`, `y` |
| `mobile_tap_text` | Find element by visible text and tap. Params: `sessionId`, `text` |
| `mobile_double_tap` | Double-tap. Params: `sessionId`, `x`, `y` |
| `mobile_long_press` | Long press. Params: `sessionId`, `x`, `y`, `duration` |
| `mobile_swipe` | Swipe between two points. Params: `sessionId`, `startX`, `startY`, `endX`, `endY`, `duration` |
| `mobile_drag` | Drag-and-drop. Params: `sessionId`, `startX`, `startY`, `endX`, `endY` |
| `mobile_pinch` / `mobile_zoom` | Pinch/zoom gestures. Params: `sessionId`, `x`, `y`, `scale` |
| `mobile_type` | Type text via keyboard. Params: `sessionId`, `text` |
| `mobile_key` | Press hardware key (HOME, BACK, ENTER, VOLUME_UP…). Params: `sessionId`, `key` |
| `mobile_hide_keyboard` | Dismiss soft keyboard. Params: `sessionId` |
| `mobile_press_home` | Home button (Android only). Params: `sessionId` |
| `mobile_press_back` | Back button (Android only). Params: `sessionId` |
| `mobile_set_orientation` / `mobile_get_orientation` | Set or read the device orientation. Params: `sessionId`, `orientation` (set only) |
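
One way an agent wrapper might apply the `screenshot: true` option only where it is supported is sketched below. The tool names are the ones listed in this section; the helper name and the argument dictionaries are illustrative assumptions.

```python
# Tools this section lists as accepting `screenshot: true`.
SCREENSHOT_CAPABLE = {
    "mobile_tap", "mobile_tap_text", "mobile_swipe", "mobile_type",
    "mobile_fill_field", "mobile_long_press", "mobile_double_tap",
    "mobile_drag", "mobile_scroll_to_text",
}


def with_confirmation(tool: str, args: dict) -> dict:
    """Return a copy of args, adding screenshot=True when the tool supports it."""
    merged = dict(args)
    if tool in SCREENSHOT_CAPABLE:
        merged["screenshot"] = True
    return merged


# Hypothetical session ID and coordinates.
print(with_confirmation("mobile_tap", {"sessionId": "abc", "x": 120, "y": 480}))
print(with_confirmation("mobile_key", {"sessionId": "abc", "key": "BACK"}))
```

For tools outside the list, such as `mobile_key`, a separate `mobile_screenshot` call is the fallback.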

### Mobile Element Tools

| Tool | Description |
|---|---|
| `mobile_find_element` | Find by strategy + selector. Params: `sessionId`, `strategy` (accessibility id\|xpath\|id\|class name), `selector` |
| `mobile_find_all_elements` | Find all matching. Params: `sessionId`, `strategy`, `selector` |
| `mobile_find_by_text` | Cross-platform text search. Params: `sessionId`, `text` |
| `mobile_wait_for_element` | Wait until element appears. Params: `sessionId`, `strategy`, `selector`, `timeout` |
| `mobile_scroll_to_text` | Scroll until text visible. Params: `sessionId`, `text` |
| `mobile_element_text` | Get element's text. Params: `sessionId`, `elementId` |
| `mobile_fill_field` | Clear + type into field. Params: `sessionId`, `elementId`, `text` |
| `mobile_set_value` | Set value bypassing keyboard. Params: `sessionId`, `elementId`, `value` |
| `mobile_element_exists` | Check if element is present. Params: `sessionId`, `strategy`, `selector` |
| `mobile_execute` | Run `mobile:` script escape hatch. Params: `sessionId`, `script`, `args` |

### Mobile App Lifecycle Tools

| Tool | Description |
|---|---|
| `mobile_install_app` | Install APK/IPA. Params: `sessionId`, `appPath` |
| `mobile_uninstall_app` | Remove app. Params: `sessionId`, `bundleId` or `appPackage` |
| `mobile_activate_app` | Bring to foreground. Params: `sessionId`, `bundleId` or `appPackage` |
| `mobile_clear_app_data` | Clear data/cache without uninstalling (Android). Params: `sessionId`, `appPackage` |
| `mobile_app_launch` | Launch by package (ADB, no session). Params: `deviceUdid`, `appPackage`, `activity` |
| `mobile_app_stop` | Force-stop (ADB, no session). Params: `deviceUdid`, `appPackage` |
| `mobile_app_state` | Get app state (foreground/background/not running). Params: `sessionId`, `bundleId` |

### Mobile Device Control Tools

| Tool | Description |
|---|---|
| `mobile_grant_permission` / `mobile_revoke_permission` | ADB permission management (Android). Params: `sessionId`, `appPackage`, `permission` |
| `mobile_deep_link` | Open URL scheme or universal link. Params: `sessionId`, `url` |
| `mobile_clipboard` | Get or set clipboard text. Params: `sessionId`, `action` (get\|set), `text` |
| `mobile_geolocation` | Mock GPS coordinates. Params: `sessionId`, `latitude`, `longitude` |
| `mobile_push_file` / `mobile_pull_file` | File transfer to/from device. Params: `sessionId`, `localPath`, `remotePath` |

### Mobile Alerts, Context & WebView

| Tool | Description |
|---|---|
| `mobile_handle_alert` | Accept or dismiss alert/dialog. Params: `sessionId`, `action` (accept\|dismiss) |
| `mobile_get_contexts` | List Native + WebView contexts. Params: `sessionId` |
| `mobile_switch_context` | Switch context. Params: `sessionId`, `contextName` |

To automate Chrome on-device or a WebView: call `mobile_get_contexts` to find the context name, `mobile_switch_context` to enter it, then use `mobile_browser_*` tools (20 CDP tools: navigate, tap, type, evaluate JS, cookies, localStorage, network throttle, geolocation, user agent, and more).
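
Choosing which context to pass to `mobile_switch_context` could look like the sketch below. It assumes `mobile_get_contexts` yields a plain list of names that follow Appium's usual `NATIVE_APP` / `WEBVIEW_<package>` convention; the actual response shape and the helper name are assumptions.

```python
from typing import Optional


def pick_webview_context(contexts: list, package_hint: str = "") -> Optional[str]:
    """Pick a WebView context name, preferring one that contains package_hint."""
    webviews = [name for name in contexts if name.upper().startswith("WEBVIEW")]
    if package_hint:
        for name in webviews:
            if package_hint in name:
                return name
    # Fall back to the first WebView, or None when only the native context exists.
    return webviews[0] if webviews else None


# Hypothetical context list for a hybrid app with Chrome also running.
sample_contexts = ["NATIVE_APP", "WEBVIEW_com.example.myapp", "WEBVIEW_chrome"]
print(pick_webview_context(sample_contexts))
print(pick_webview_context(sample_contexts, package_hint="chrome"))
```

Returning `None` rather than raising lets the caller decide whether to wait and retry, since WebView contexts often appear only after the web content has loaded.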

### Mobile Screen Recording

| Tool | Description |
|---|---|
| `mobile_record_start` | Start screen recording. Params: `deviceUdid` |
| `mobile_record_stop` | Stop and return recording. Params: `deviceUdid` |

### Mobile Workflows

**Explore a native app:**
→ `mobile_session_start` → `mobile_screenshot` → `mobile_describe_screen` → tap → `mobile_screenshot`

**Test a login flow:**
→ `mobile_session_start` → `mobile_find_by_text("Email")` → `mobile_fill_field` → `mobile_find_by_text("Password")` → `mobile_fill_field` → `mobile_tap_text("Sign In")` → `mobile_wait_for_screen_change` → `mobile_assert_text("Dashboard")`

**Test a deep link:**
→ `mobile_session_start` → `mobile_deep_link("myapp://product/123")` → `mobile_assert_text("Product Name")` → `mobile_screenshot`

**Cross-platform:**
→ Run the same flow with `platformName: "Android"`, then with `platformName: "iOS"`; tool names are identical.
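
To make the cross-platform point concrete, the login flow above can be written as data and generated once per platform. This is a sketch under assumptions: the step order and tool names come from this section, while the function name, device IDs, credentials, and the choice of `bundleId` over `app` are hypothetical.

```python
def login_flow(platform: str, device_udid: str, app_id: str,
               email: str, password: str) -> list:
    """Return the login-flow steps from this section as (tool, args) pairs.

    sessionId and elementId are omitted on purpose: they come from earlier
    tool responses at run time.
    """
    if platform not in ("Android", "iOS"):
        raise ValueError(f"unsupported platformName: {platform!r}")
    return [
        ("mobile_session_start", {"deviceUdid": device_udid,
                                  "platformName": platform,
                                  "bundleId": app_id}),
        ("mobile_find_by_text", {"text": "Email"}),
        ("mobile_fill_field", {"text": email}),
        ("mobile_find_by_text", {"text": "Password"}),
        ("mobile_fill_field", {"text": password}),
        ("mobile_tap_text", {"text": "Sign In"}),
        ("mobile_wait_for_screen_change", {}),
        ("mobile_assert_text", {"text": "Dashboard"}),
    ]


# Hypothetical device IDs and credentials, for illustration only.
android = login_flow("Android", "emulator-5554", "com.example.myapp",
                     "qa@example.com", "not-a-real-password")
ios = login_flow("iOS", "hypothetical-ios-udid", "com.example.myapp",
                 "qa@example.com", "not-a-real-password")

# The tool names and their order match on both platforms.
print([tool for tool, _ in android] == [tool for tool, _ in ios])
```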

**Slow emulator fix:**
→ Pass `extraCaps: { uiautomator2ServerLaunchTimeout: 60000, uiautomator2ServerInstallTimeout: 60000, adbExecTimeout: 60000 }` to `mobile_session_start`.
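
A small helper can keep the slow-emulator capabilities in one place. The parameter names (`deviceUdid`, `platformName`, `app`, `bundleId`, `noReset`, `extraCaps`) and the three timeout values come from this document; the helper itself, and the choice to apply the timeouts only on Android because they target the UiAutomator2 driver and ADB, are assumptions.

```python
SLOW_EMULATOR_CAPS = {
    "uiautomator2ServerLaunchTimeout": 60000,
    "uiautomator2ServerInstallTimeout": 60000,
    "adbExecTimeout": 60000,
}


def session_start_args(device_udid: str, platform: str = "Android",
                       app: str = "", bundle_id: str = "",
                       no_reset: bool = True,
                       slow_emulator: bool = False) -> dict:
    """Build arguments for mobile_session_start (hypothetical helper)."""
    if not app and not bundle_id:
        raise ValueError("provide either app or bundle_id")
    args = {"deviceUdid": device_udid, "platformName": platform,
            "noReset": no_reset}
    if app:
        args["app"] = app
    if bundle_id:
        args["bundleId"] = bundle_id
    # The timeouts target UiAutomator2 and ADB, so only add them on Android.
    if slow_emulator and platform == "Android":
        args["extraCaps"] = dict(SLOW_EMULATOR_CAPS)
    return args


print(session_start_args("emulator-5554", bundle_id="com.example.myapp",
                         slow_emulator=True))
```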

## Agent Skill

Download the complete Agent Skill for all tools: https://deepsyte.com/.skills/deepsyte/SKILL.md

## Discovery Model

- Treat DeepSyte tools as atomic actions.
- Treat the DeepSyte skill as broad guidance for choosing the right path.
- Treat packaged workflows as targeted procedures for repeatable multi-step jobs.
- When the task is an audit, verification flow, or another repeatable multi-step procedure, check the available workflows before improvising.
- For any site audit, performance audit, SEO audit, UX audit, full audit, or another repeatable multi-page website review, read `workflows/sitewide-performance-audit/WORKFLOW.md` before opening browser sessions, running audit tools, or drafting findings.
- If the user gives you a site URL but no page list, infer a representative public page set and start instead of blocking on permission.
- Default authenticated pages to out of scope unless the user explicitly asks for login, dashboard, or another protected flow.
- Do not load every workflow up front. Read only the workflow that matches the task.
- If terminal access exists and repeated tool calls are likely, prefer the CLI when it is clearly faster than repeated MCP round-trips. If terminal access is not available, stay in MCP.
- For multi-page performance audits in MCP, avoid opening many new browser sessions in parallel. Measure sequentially unless there is a proven reason to increase concurrency.
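
The sequential-measurement guidance in the last bullet can be sketched as a plain loop that records one result per page and never overlaps sessions. The `measure` callable stands in for whatever opens a browser session and reads metrics; it, the `lcp_ms` field, and the fake measurement are hypothetical.

```python
import time
from typing import Callable


def measure_sequentially(urls: list, measure: Callable[[str], dict],
                         pause_s: float = 0.0) -> list:
    """Measure one page at a time, keeping input order and recording failures."""
    results = []
    for url in urls:
        try:
            results.append({"url": url, **measure(url)})
        except Exception as exc:  # keep going; one bad page should not end the audit
            results.append({"url": url, "error": str(exc)})
        if pause_s:
            time.sleep(pause_s)
    return results


def slowest(results: list) -> dict:
    """Return the result with the highest lcp_ms, ignoring failed pages."""
    measured = [r for r in results if "lcp_ms" in r]
    return max(measured, key=lambda r: r["lcp_ms"])


def fake_measure(url: str) -> dict:
    """Stand-in for a real measurement; longer URLs pretend to be slower."""
    return {"lcp_ms": 1000 + 100 * len(url)}


pages = ["https://example.com/", "https://example.com/pricing"]
report = measure_sequentially(pages, fake_measure)
print(slowest(report)["url"])
```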

Available workflows:
- `workflows/sitewide-performance-audit/WORKFLOW.md` — use when the user asks why a site is slow, wants the slowest pages identified, or wants a repeatable multi-page performance review.
- `workflows/seo-audit/WORKFLOW.md` — use when the user asks to run an SEO test or audit. If DeepSyte is not connected, do not stop vaguely; show the MCP URL `https://api.deepsyte.com/mcp` and the login command from the MCP error.

## Quick Start

### Option A: CLI (fastest)

```bash
npx deepsyte setup --client codex     # or: cursor, vscode, windsurf, claude, claude-code
```

The CLI now also installs or repairs the managed core DeepSyte skill in `~/.agents/skills/deepsyte`, including `workflows/sitewide-performance-audit/WORKFLOW.md`, during successful `login`, `install`, and `setup` flows.

If you prefer to do onboarding in two steps, run `npx deepsyte login` followed by `npx deepsyte install <client>`. For most clients that reaches the same result as `setup --client <client>`. The main nuances are that `install vscode` writes a workspace-local `.vscode/mcp.json`, while `install claude-code` prints the `claude mcp add ...` command for you to run manually.

Use remote workflows first for public sites. Escalate to the managed local browser when you need localhost access, intranet or VPN reachability, authenticated realism, or explicit user-approved local execution. Managed local browser commands require a valid DeepSyte website OAuth session.

## Campaigns and Mailbox

DeepSyte Campaigns creates bulk one-page SEO audits or full SEO audits and sends each recipient their own tokenized report link. AgentMail is the two-way campaign mailbox provider. Resend is only for DeepSyte transactional/report delivery. Gmail, Outlook, Composio, and inbox mining are out of scope for this phase.

Campaign agents can:

- Create campaigns with `create_audit_campaign`; use `report_type: "full_site"` when every recipient should receive a full SEO audit, and `send_hours: "office"` or `send_hours: "anytime"` for the global campaign sending-hours mode.
- Add selected prospects directly with `add_audit_campaign_targets`; set a target `reportType: "full_site"` or `fullAudit: true` for a one-off full SEO audit row.
- Check real AgentMail sender capacity with `list_campaign_sender_capacity`.
- Switch sender domains or inboxes between `warmup` and `manual` daily-limit modes with `update_campaign_sender_limits`.
- Launch with `launch_audit_campaign`.
- Check recipient and send status with `get_audit_campaign_status`.

Campaign cron remains the only automated sender. It enforces campaign sending-hours mode, inbox caps, domain caps, campaign caps, warmup/manual mode, suppressions, and report quality gates. Do not claim that MCP, CLI, or API target import can bypass those gates.

Campaign sending-hours modes are global to the campaign and apply to Resend and AgentMail. `office` means Monday-Friday 9am-4pm in the campaign timezone. `anytime` means every day, 24-hour delivery.
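
As an illustration of the two modes, the check below decides whether a moment falls inside the sending window. It is a sketch, not the campaign cron's logic: it expects a datetime already converted to the campaign timezone, and it assumes the office window is 09:00 inclusive to 16:00 exclusive, which the text above does not specify.

```python
from datetime import datetime


def in_send_window(local_dt: datetime, send_hours: str) -> bool:
    """Return True when local_dt (in the campaign timezone) allows sending."""
    if send_hours == "anytime":
        return True
    if send_hours == "office":
        is_weekday = local_dt.weekday() < 5  # Monday=0 .. Friday=4
        # Assumption: 9am is inclusive and 4pm is exclusive.
        return is_weekday and 9 <= local_dt.hour < 16
    raise ValueError(f"unknown send_hours mode: {send_hours!r}")


tuesday_morning = datetime(2026, 5, 12, 10, 0)
saturday_morning = datetime(2026, 5, 16, 10, 0)
print(in_send_window(tuesday_morning, "office"))
print(in_send_window(saturday_morning, "office"))
print(in_send_window(saturday_morning, "anytime"))
```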

Warmup defaults per inbox are 5/day for days 0-7, 10/day for days 8-14, 15/day for days 15-21, 20/day for days 22-30, 25/day for days 31-45, 30/day for days 46-60, 35/day for days 61-90, and 40/day hard max after 90 days. Manual mode is for prewarmed senders and shows a warning instead of blocking the user.
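
The warmup schedule above maps directly onto a lookup table. The day ranges and caps below are copied from that paragraph; the function name and the choice to count inbox age in whole days are assumptions.

```python
# (last day of the range, daily cap), taken from the warmup defaults above.
WARMUP_STEPS = [(7, 5), (14, 10), (21, 15), (30, 20), (45, 25), (60, 30), (90, 35)]
WARMUP_HARD_MAX = 40  # applies after day 90


def warmup_daily_cap(inbox_age_days: int) -> int:
    """Return the default daily send cap for an inbox in warmup mode."""
    if inbox_age_days < 0:
        raise ValueError("inbox_age_days must be non-negative")
    for last_day, cap in WARMUP_STEPS:
        if inbox_age_days <= last_day:
            return cap
    return WARMUP_HARD_MAX


for day in (0, 8, 31, 91):
    print(day, warmup_daily_cap(day))
```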

Campaign emails are plain-text-first, one recipient at a time, with no tracking pixels, no tracking redirects, no attachments, and at most one report link in the first outbound email. Campaign cron waits at least 5 minutes between report emails for the same account owner, and AgentMail inboxes also use a daily first-send offset plus deterministic spacing jitter from the campaign sending-hours window and effective daily cap. Reports must not be evidence-free or partial. DataForSEO enrichment is used only for full SEO report paths, including paid full reports and campaign rows marked full.

AI Campaign Intelligence is evidence-bound and uses MiniMax Token Plan in production (`MINIMAX_API_KEY`, optional `MINIMAX_BASE_URL`, optional `MINIMAX_MODEL`, default model `MiniMax-M2.7`). Before campaign launch/delivery, rows can be scored `qualified`, `weak_fit`, `review`, or `block` from page HTML/text, domain signals, contact quality, business category, and existing suppressions. Campaign angles store `angle`, `angleReason`, `subjectHint`, `firstLine`, and `ctaStyle`; outbound templates stay controlled and AI only fills fact-backed snippets. Public free reports may include AI report intro, executive summary, and upsell bridge. Paid full reports may include an AI strategic roadmap, 30-day plan, DataForSEO opportunity explanations, and AI-selected representative page sets. The MCP `ux_review` tool also uses MiniMax for text reasoning over verified browser evidence; server-side OCR is deprecated in favor of MiniMax Token Plan MCP `understand_image`. Inbound AgentMail replies are classified automatically; safe replies can be answered automatically, while unsubscribe, angry, legal-risk, bounce-like, and not-interested replies suppress instead of selling.

## Hosted Audit Reports

DeepSyte hosted audit reports use tokenized `/r/[token]` links and can be created from the dashboard, workspace sharing, campaigns, MCP, or CLI. Use `create_audit_report` when an agent needs a shareable single-page or full-site report link. Use `create_audit_comparison_report` when the user wants a before/after report from two audit ids for the same site.

Report tools and commands return `auditId`, `reportType`, `confidenceStatus`, `marketDataStatus`, `score`, `topFinding`, `reportUrl`, `pdfAction`, `runUrl`, `evidenceSummary`, `sourceSummary`, and `dataforseoSummary` when full-report market data exists. Hosted report pages also render the canonical `SeoReportModel`: weighted score buckets, separate confidence and freshness scores, source freshness, provider status, and evidence-linked action items. Confidence statuses are `complete`, `usable_with_warning`, `retry_required`, and `blocked`. Full-report market statuses are `market_complete`, `market_partial`, `market_unavailable`, and `provider_failed`. Campaign delivery and workspace sharing must not send `retry_required` or `blocked` reports unless an explicit debug path is used.
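
The delivery rule in the last sentence can be expressed as a small gate on `confidenceStatus`. The status values come from the paragraph above; the function name and the `debug_override` flag, which stands in for the "explicit debug path", are hypothetical.

```python
CONFIDENCE_STATUSES = ("complete", "usable_with_warning", "retry_required", "blocked")
UNDELIVERABLE = {"retry_required", "blocked"}


def can_deliver(confidence_status: str, debug_override: bool = False) -> bool:
    """Return True when a report may be sent or shared under the rule above."""
    if confidence_status not in CONFIDENCE_STATUSES:
        raise ValueError(f"unknown confidenceStatus: {confidence_status!r}")
    return debug_override or confidence_status not in UNDELIVERABLE


for status in CONFIDENCE_STATUSES:
    print(status, can_deliver(status))
```

Rejecting unknown statuses outright, instead of treating them as deliverable, keeps a future status value from slipping past the gate unnoticed.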

CLI report commands:

```bash
deepsyte reports run https://example.com --type single-page --share
deepsyte reports run https://example.com --type full-site --share
deepsyte reports status <auditId>
deepsyte reports link <auditId>
deepsyte reports compare --before <beforeAuditId> --after <afterAuditId> --share
```

Hosted report pages render the reconciled evidence model: SEO, schema, social, performance, HAR-style waterfall rows when captured, LCP proof, network, console, accessibility, responsive proof, screenshots, run links, scope, confidence reasons, market-data status, endpoint counts/costs/errors, and print/PDF layout. `.au` domains default to DataForSEO Australia `2036` + `en` unless env overrides are set. If ranked keywords are empty, the full-report path seeds keyword and SERP checks from page titles, H1s, service URLs, business name, and audit facts before accepting a zero-data result. AI can explain evidence after reconciliation, but it must not invent raw traffic, ranking, revenue, competitor, or urgency claims.

## Escalation Ladder for Website Auth

Some sites (Cloudflare Turnstile, WorkOS AuthKit, Clerk bot-detection, Akamai/PerimeterX-protected signups) silently reject requests from the Railway-hosted cloud browser. `solve_captcha` returns a valid token, but the fingerprint gets filtered by Siteverify — no amount of token injection fixes it. When a valid-looking submit silently does nothing (URL doesn't change, no error, form resets), escalate instead of retrying:

1. Start with MCP tools (`browser_navigate`, `smart_login`, `solve_captcha`).
2. If MCP stalls silently and `deepsyte whoami` confirms an authenticated website session, switch to the CLI local browser: `npx deepsyte browser:start <url>`, then drive real Chrome one atomic command at a time with `browser:click`, `browser:fill`, `browser:paste` (React-compatible), `browser:wait-for`, `browser:inspect`, `browser:eval`. Real Chrome on the user's residential IP passes WorkOS/Turnstile trust checks silently, often without even showing a CAPTCHA checkbox.
3. Always call `deepsyte auth:plan <url>` before a fresh auth attempt and `deepsyte auth:record <url> <outcome>` after. Inbox, password, and per-site auth state persist in the DB so the next run resumes at the right stage.
4. Use `+alias` emails (e.g. `you+smithery@agentmail.to`) to reuse a single inbox for multiple signups — most providers treat each alias as a distinct identity.
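
Generating the `+alias` addresses from step 4 is a simple string operation. The helper below is an illustrative sketch; its name and the tag validation rule are assumptions, and whether a given provider honours plus-addressing still has to be checked per site.

```python
def alias_email(base: str, tag: str) -> str:
    """Build local+tag@domain from a base address, replacing any existing alias."""
    local, sep, domain = base.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {base!r}")
    if not tag or not tag.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"tag must be alphanumeric, '-' or '_': {tag!r}")
    local = local.split("+", 1)[0]  # drop an existing +alias, if any
    return f"{local}+{tag}@{domain}"


print(alias_email("you@agentmail.to", "smithery"))
```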

The interactive rule: read the returned PNG after every `browser:*` command, confirm the state, then issue the next command. No preset scripts.

For repeatable public-page performance audits, use the CLI only when the command path is already available or can be approved up front. If command approval would stall the run and MCP is already available, begin with MCP and collect metrics sequentially.

Important: `deepsyte skills ...` only manages the local core DeepSyte skill. For community skill discovery or installation, use the `find-skills` workflow or `npx skills find ...` / `npx skills add ...` instead.

### Option B: Manual MCP OAuth

For Codex, add native HTTP MCP config to `~/.codex/config.toml`, then trigger Codex OAuth explicitly:

```toml
[mcp_servers.deepsyte]
url = "https://api.deepsyte.com/mcp"
scopes = ["mcp:tools"]
oauth_resource = "https://api.deepsyte.com/mcp"
```

```bash
codex mcp login deepsyte
```

If your Codex Desktop build signs in but does not mount `mcp__deepsyte` tools into a fresh thread, use `mcp-remote` as a temporary fallback:

```toml
[mcp_servers.deepsyte]
command = "npx"
args = ["-y", "mcp-remote@0.1.38", "https://api.deepsyte.com/mcp"]
```

Add the base MCP URL to an OAuth-capable MCP client:

```json
{
  "mcpServers": {
    "deepsyte": {
      "url": "https://api.deepsyte.com/mcp"
    }
  }
}
```

The client should open the DeepSyte website for sign-in before MCP tools work.

## VS Code Extension Preview

A native DeepSyte VS Code extension is now being developed in the monorepo for a dedicated Activity Bar sidebar, automatic browser OAuth sign-in, automatic editor MCP setup, automatic managed core skill sync, native MCP registration, command palette actions, a live activity timeline panel, and browser workflow UX inside the editor.

Current preview commands include `DeepSyte: Sign In`, `DeepSyte: Check Status`, `DeepSyte: Take Screenshot`, `DeepSyte: Open Timeline`, `DeepSyte: Configure Editor Integration`, and `DeepSyte: Sync Core Skill`. The sidebar also exposes quick actions and recent activity directly in VS Code, and the extension opens browser OAuth, configures the editor automatically, and repairs the managed core skill when no credentials are stored.

Until the Marketplace release is ready, the recommended VS Code setup is to install the preview VSIX, sign in once, and use `DeepSyte: Configure Editor Integration` only if you need to repair the automatic MCP setup or `DeepSyte: Sync Core Skill` when you need to repair the managed local skill.

## Chrome Extension Preview

The monorepo also includes a Chrome extension preview under `packages/chrome-extension`.

- **Public pages** use DeepSyte cloud capture after website authorization.
- **Localhost and private pages** stay local-first, so the extension can still capture and inspect pages that the cloud browser cannot reach.
- **Page tools** in the popup can read visible text and DOM HTML for the active tab.
- **Saved key validation** happens before the extension accepts a pasted API key, so revoked keys are rejected instead of being silently stored.
- **Viewer cloud actions** reuse existing cloud-backed captures when available and only upload local-only captures when needed.

Use the extension when you want a browser-native client for current-tab capture and inspection, while keeping parity with the DeepSyte platform for public URLs.

## CLI (for agents and humans)

DeepSyte has a CLI that exposes all tools as terminal commands. **AI agents can and should use the CLI directly via terminal/run_command** — it's often faster than MCP tool calls and returns structured text output.

- Install: `npm install -g deepsyte`
- npm: https://www.npmjs.com/package/deepsyte
- Or use without installing: `npx deepsyte <command>`

### Auth
```bash
deepsyte login                    # OAuth (opens browser, saves website-issued session)
deepsyte whoami                   # Check auth status
deepsyte logout                   # Clear credentials
```

If `deepsyte login` opens a Railway 404 page, update the saved API URL to `https://api.deepsyte.com` or reinstall the latest CLI, then run login again.

### Campaigns
```bash
deepsyte campaigns create --name "Sydney plumbers" --send-at 2026-05-12T00:00:00.000Z --timezone Australia/Sydney --report-type single_page --send-hours office --file targets.csv
deepsyte campaigns add-targets --campaign <id> --file targets.csv
deepsyte campaigns capacity
deepsyte campaigns sender-limits --account <id> --mode warmup --daily-limit 10
deepsyte campaigns sender-limits --domain <domainId> --mode manual --daily-limit 40
deepsyte campaigns launch <id>
deepsyte campaigns status <id>
```

Campaign target CSVs require website and email columns. Common aliases include `website`, `url`, `website_url`, `email`, `recipient_email`, `name`, `company`, and `first_line`. Optional `full_audit`, `full_seo`, `report_type`, or `audit_type` columns can mark one row as a full SEO audit.
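
Before uploading a file, a pre-flight check along these lines can confirm that each row resolves to a website and an email. The column names come from the paragraph above. How the aliases group together, the truthy values for the flag columns, and the `full_site` value for `report_type` / `audit_type` are assumptions, and the real importer may behave differently.

```python
import csv
import io

WEBSITE_ALIASES = ("website", "url", "website_url")
EMAIL_ALIASES = ("email", "recipient_email")
FULL_AUDIT_FLAGS = ("full_audit", "full_seo")
REPORT_TYPE_COLUMNS = ("report_type", "audit_type")
TRUTHY = {"1", "true", "yes", "y"}  # assumed spellings


def read_targets(csv_text: str) -> list:
    """Parse target rows, resolving header aliases to website and email."""
    targets = []
    for line_no, raw in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        # Normalise header case and whitespace; skip overflow cells (key None).
        row = {k.strip().lower(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        website = next((row[k] for k in WEBSITE_ALIASES if row.get(k)), "")
        email = next((row[k] for k in EMAIL_ALIASES if row.get(k)), "")
        if not website or not email:
            raise ValueError(f"line {line_no}: website and email are required")
        full = (any(row.get(k, "").lower() in TRUTHY for k in FULL_AUDIT_FLAGS)
                or any(row.get(k, "").lower() == "full_site"
                       for k in REPORT_TYPE_COLUMNS))
        targets.append({"website": website, "email": email, "full_audit": full})
    return targets


# Hypothetical two-row file using alias headers.
sample_csv = (
    "url,recipient_email,company,full_audit\n"
    "https://example.com,owner@example.com,Example Pty,yes\n"
    "https://example.org,hello@example.org,Example Org,\n"
)
for target in read_targets(sample_csv):
    print(target)
```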

### Screenshots
```bash
deepsyte screenshot <url>                    # 1280×800 viewport
deepsyte screenshot <url> --width 1920 --height 1080 --full-page
deepsyte screenshot <url> --format jpeg --delay 2000
deepsyte fullpage <url>                      # Dedicated full-page capture
deepsyte responsive <url>                    # Desktop + tablet + mobile
deepsyte mobile <url>                        # iPhone 14 Pro (393×852)
deepsyte tablet <url>                        # iPad (820×1180)
deepsyte dark <url>                          # Dark mode emulated
deepsyte element <url> --selector "#hero"    # CSS element capture
deepsyte diff <urlA> <urlB>                  # Pixel-diff two URLs
deepsyte cross-browser <url>                 # Chromium + Firefox + WebKit
deepsyte batch <url1> <url2> <url3>          # Multiple URLs (max 10)
deepsyte pdf <url>                           # Export as PDF
deepsyte screenshots                         # List recent screenshot jobs
deepsyte screenshot:status <id>              # Check screenshot job status
```

### Browser Sessions
```bash
deepsyte browse <url>                                    # Start session → sessionId
deepsyte browse:click <sessionId> <selector>             # Click element
deepsyte browse:click-at <sessionId> 320 480             # Coordinate click
deepsyte browse:fill <sessionId> <selector> <value>      # Type into input
deepsyte browse:hover <sessionId> ".menu-trigger"       # Hover state
deepsyte browse:select <sessionId> "select[name=country]" "Australia"
deepsyte browse:wait-for <sessionId> ".results-loaded"  # Wait for selector
deepsyte browse:screenshot <sessionId>                   # Capture current state
deepsyte browse:text <sessionId>                         # Get visible text
deepsyte browse:html <sessionId>                         # Get page HTML
deepsyte browse:a11y <sessionId>                         # Accessibility tree
deepsyte browse:eval <sessionId> "document.title"       # Evaluate JavaScript
deepsyte browse:console <sessionId> --level error        # Console logs
deepsyte browse:network-errors <sessionId>               # Failed requests
deepsyte browse:network-requests <sessionId>             # Request waterfall
deepsyte browse:cookies <sessionId> get                  # Inspect cookies
deepsyte browse:storage <sessionId> getAll               # Inspect storage
deepsyte browse:back <sessionId>                         # Back in history
deepsyte browse:forward <sessionId>                      # Forward in history
deepsyte browse:viewport <sessionId> 393 852             # Resize existing session
deepsyte browse:seo <sessionId>                          # Session SEO audit
deepsyte browse:perf <sessionId>                         # Session performance metrics
deepsyte browse:captcha <sessionId>                      # Solve CAPTCHA in-session
deepsyte browse:scroll <sessionId> --y 500               # Scroll down
deepsyte browse:key <sessionId> Enter                    # Press key
deepsyte browse:goto <sessionId> <newUrl>                # Navigate
deepsyte browse:close <sessionId>                        # End session
```

### Reviews & Audits
```bash
deepsyte review <url>              # AI UX review (standalone, no Run created)
deepsyte seo <url>                 # SEO metadata, run-backed (shows verdict + summary in /dashboard/runs)
deepsyte perf <url>                # Core Web Vitals, run-backed against sitewide-performance-audit workflow
deepsyte a11y <url>                # Accessibility tree (standalone)
deepsyte ocr <imageUrl>            # Extract text from image via AI vision (OCR)
deepsyte ocr --session <id>        # OCR from browser session screenshot
deepsyte breakpoints <url>         # Responsive breakpoints (standalone)
```

`perf` and `seo` open a workflow-aware browser session so the run lands in the dashboard Runs UI with a structured outcome (verdict, summary, findings, proof coverage, next actions). `review`, `a11y`, and `breakpoints` stay standalone and do not create a Run.

### Disposable Email
```bash
deepsyte auth:test <url>           # Reuse auth memory + primary inbox
deepsyte auth:find-login <url>     # Discover likely login pages
deepsyte auth:smart-login <url> --username user@example.com --password secret
deepsyte auth:authorize-email      # Connect Gmail once for OTP reads
deepsyte auth:read-email           # Read latest Gmail OTP
deepsyte inbox:create              # Create or reuse the primary test inbox
deepsyte inbox:check <inboxId>     # Read messages, extract OTP codes
deepsyte inbox:send <inboxId> --to user@example.com --subject "Test" --text "Hello"
```

### Reusable Website Auth
- Start with `auth_test_assist` or `deepsyte auth:test <url>` for login, sign-up, or verification flows.
- Read the helper's recommended auth path, account-exists confidence, likely auth method, and expected follow-up before choosing sign-in or sign-up.
- Treat the helper's reusable strategy as the default cross-site guidance, and treat per-site hints as evidence rather than universal rules.
- Reuse the saved primary inbox and password unless you explicitly need a fresh registration.
- If sign-in fails because the account does not exist, switch to sign-up with the same saved credentials.
- If `smart_login` is uncertain on Clerk or other multi-step auth UIs, fall back to browser tools and inspect network or console evidence before concluding the login failed.
- Use `check_inbox` for verification codes or email links.
- After the attempt, call `auth_test_assist` again with `action: "record"` to save the outcome for future runs.

### Setup & Install
```bash
deepsyte setup                     # Interactive: login + choose IDE + auto-configure (recommended)
deepsyte setup --client cursor     # Non-interactive: for AI agents, skips prompt
deepsyte setup --client windsurf
deepsyte setup --client vscode
deepsyte setup --client claude
deepsyte setup --client claude-code
deepsyte browser open https://example.com  # Launch extension-free local browser with explicit approval
deepsyte browser open https://example.com --record-video  # Record the full managed local browser session to a local .webm file
deepsyte browser back                     # Navigate browser history backward
deepsyte browser forward                  # Navigate browser history forward
deepsyte browser status                   # Inspect the tracked managed local browser
deepsyte browser goto https://example.org # Navigate the managed local browser
deepsyte browser click-at 320 480         # Click viewport coordinates in the managed local browser
deepsyte browser hover ".menu-trigger"    # Trigger hover states in the managed local browser
deepsyte browser wait-for ".results-loaded" --timeout 8000
deepsyte browser select "select[name=country]" "Australia"
deepsyte browser viewport 393 852         # Resize the managed local browser viewport
deepsyte browser screenshot               # Save a local screenshot from the managed browser
deepsyte browser text                     # Read visible text from the managed browser
deepsyte browser console --level error    # Read captured console logs from the managed browser
deepsyte browser network-errors           # Read failed network requests from the managed browser
deepsyte browser network-requests --resource-type fetch --min-duration 200
deepsyte browser cookies get              # Inspect cookies in the managed browser
deepsyte browser storage getAll --type localStorage
deepsyte browser eval "document.title"   # Evaluate JavaScript in the managed browser
deepsyte browser a11y --max-depth 6       # Inspect the accessibility tree from the managed browser
deepsyte browser perf                     # Read performance metrics from the managed browser
deepsyte browser seo                      # Audit SEO metadata from the managed browser
deepsyte browser close                    # Close the tracked managed local browser
deepsyte skills list               # List installed skills under ~/.agents/skills
deepsyte skills sync               # Install, update, or repair the managed core skill
deepsyte skills update             # Alias for core skill sync
deepsyte install cursor            # Writes ~/.cursor/mcp.json
deepsyte install vscode            # Writes .vscode/mcp.json
deepsyte install windsurf          # Writes ~/.codeium/windsurf/mcp_config.json
deepsyte install claude            # Writes Claude Desktop config
deepsyte install claude-code       # Prints `claude mcp add` command
```

For community skills such as Anthropic's `frontend-design`, use `find-skills` or run `npx skills find frontend design` followed by `npx skills add anthropics/skills@frontend-design -g -y`.

### One-liner Install
```
# macOS/Linux
curl -fsSL https://deepsyte.com/install.sh | bash

# Windows PowerShell
irm https://deepsyte.com/install.ps1 | iex

# Or just use npx (no install needed)
npx deepsyte setup
```

### Agent Tips
- **AI agents: use `npx deepsyte setup --client <ide>` to install non-interactively.**
- **Use CLI when you have terminal access** — structured text output, no JSON-RPC overhead.
- **For auth testing, start with `npx deepsyte auth:test https://example.com`** so you reuse saved inbox credentials, remembered auth history, and the helper's site-specific confidence signals.
- Every screenshot command returns a public CDN URL.
- CLI browser commands print the target session, current URL, result counts, and evidence screenshot path/URL. Read that output and relay the evidence details in your final answer.
- Browser sessions: start with `browse`, get sessionId, pass it to subsequent `browse:*` commands, and always `browse:close` when done.
- Use remote `browse:*` commands for public-site MCP-parity workflows. Use `deepsyte browser ...` only for the separate managed local browser used for localhost, VPN-only, or approval-gated work.
- Managed local browser commands under `deepsyte browser ...` now support continuous console/network capture while the browser stays open, plus history navigation, coordinate clicks, hover states, wait conditions, dropdown selection, viewport resizing, screenshots, text, HTML, cookies/storage inspection, script evaluation, accessibility trees, performance metrics, SEO audits, timestamped evidence bundle export via `browser evidence`, finalized video-inclusive export via `browser close --evidence`, and optional local `.webm` session recording against the tracked local browser.
- Prefer evidence-rich workflows when debugging: capture screenshots, logs, recordings, and bundle exports so the result is reviewable by both humans and agents.
- Credentials are stored in `~/.config/deepsyte/config.json`. Once logged in, all commands are authenticated.
- Use `npx deepsyte` if unsure whether it's installed globally.

## Webhooks

Subscribe an HTTPS endpoint to receive HMAC-signed events:

```
POST   /v1/webhooks                  # create endpoint, secret returned once
GET    /v1/webhooks                  # list endpoints
PATCH  /v1/webhooks/:id              # update url, events, enabled flag
POST   /v1/webhooks/:id/rotate       # rotate signing secret
POST   /v1/webhooks/:id/test         # fire test.ping
GET    /v1/webhooks/:id/deliveries   # last 50 delivery attempts
DELETE /v1/webhooks/:id              # remove endpoint
```

Available events: `screenshot.completed`, `screenshot.failed`, `run.completed`, `run.failed`, `quota.warning`, `test.ping`. The default subscription is `["*"]`, which is forward-compatible: it also matches event types added later.

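Headers on every delivery: `Webhook-Id`, `Webhook-Timestamp`, and `Webhook-Signature: t=<unix>,v1=<hex>`, where `v1` is the hex-encoded HMAC-SHA256 of `"${ts}.${body}"` (the timestamp, a dot, then the raw request body). Reject any delivery whose `Webhook-Timestamp` is more than 5 minutes from the current time.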
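
The signature check described above can be sketched in Python as follows. This is a minimal sketch, not the official verifier. It assumes the signing secret is used as a raw UTF-8 HMAC key, that the `t` value in `Webhook-Signature` is the same timestamp that appears in the signed string, and that the body is hashed byte-for-byte as received. The function names are illustrative only.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional

TOLERANCE_SECONDS = 5 * 60  # the 5-minute verification window


def parse_signature_header(header: str) -> Dict[str, str]:
    """Split 't=<unix>,v1=<hex>' into {'t': '<unix>', 'v1': '<hex>'}."""
    fields: Dict[str, str] = {}
    for item in header.split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore fragments that have no '='
            fields[key.strip()] = value.strip()
    return fields


def verify_webhook(secret: str, body: bytes, signature_header: str,
                   now: Optional[float] = None) -> bool:
    """Return True only if the signature matches and the timestamp is fresh."""
    fields = parse_signature_header(signature_header)
    try:
        timestamp = fields["t"]
        received = fields["v1"]
        sent_at = int(timestamp)
    except (KeyError, ValueError):
        return False

    current = time.time() if now is None else now
    if abs(current - sent_at) > TOLERANCE_SECONDS:
        return False

    # Assumption: the signed payload is "<timestamp>.<raw body>" and the
    # secret is used as a UTF-8 key without further decoding.
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

Pass the request body exactly as it arrived on the wire. Re-serializing parsed JSON before hashing will usually change the bytes and make a valid signature fail.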

Retries: 6 attempts in total (the initial delivery, then retries after 1m / 5m / 30m / 2h / 12h). Exhausted deliveries are visible in `GET /v1/webhooks/:id/deliveries`.
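
To size a receiver's idempotency window, it can help to lay the schedule out on a timeline. The sketch below shows one reading of the retry line above, under two assumptions the source does not confirm: the first attempt is immediate, and each listed delay is measured from the previous failed attempt rather than from the original event.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Delays taken from the documented schedule: 1m / 5m / 30m / 2h / 12h.
RETRY_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=12),
]


def attempt_times(first_attempt: datetime) -> List[datetime]:
    """Return when each of the 6 attempts would fire under the assumptions above."""
    times = [first_attempt]
    for delay in RETRY_DELAYS:
        # Assumption: each delay starts counting from the previous attempt.
        times.append(times[-1] + delay)
    return times


if __name__ == "__main__":
    start = datetime(2025, 1, 1, tzinfo=timezone.utc)
    for number, moment in enumerate(attempt_times(start), start=1):
        print(number, moment.isoformat())
```

Under this reading the final attempt lands a little over 14.5 hours after the first, so event IDs should be remembered for at least that long to deduplicate retries.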

Full reference: https://deepsyte.com/docs/api/webhooks

## Server URL

MCP endpoint: `https://api.deepsyte.com/mcp`
REST API base: `https://api.deepsyte.com`
Transport: Streamable HTTP

OAuth discovery should point to `https://api.deepsyte.com/.well-known/oauth-protected-resource/mcp` and `https://api.deepsyte.com/.well-known/oauth-authorization-server`.

## Authentication

REST API requests use API keys. MCP and CLI access use website OAuth sessions.

- MCP: Point your client at the MCP endpoint (`https://api.deepsyte.com/mcp`) and complete the browser sign-in prompt
- REST API: Pass your API key in the `Authorization: Bearer sk_live_...` header

Monthly screenshot limits by plan:
- Free: 100 screenshots/month
- Starter: 2,000 screenshots/month
- Pro: 10,000 screenshots/month

---

## Tools Reference (53+ tools)

### Screenshot Tools (no session needed)

#### take_screenshot
Capture a screenshot of any URL and return a public image URL.

Parameters:
- url (string, required): The URL to screenshot
- width (number, default: 1280): Viewport width in pixels (320–3840)
- height (number, default: 800): Viewport height in pixels (240–2160)
- fullPage (boolean, default: true): Capture full scrollable page
- maxHeight (number, optional): Cap extremely tall captures
- format (string, default: "png"): png, jpeg, or webp
- delay (number, default: 0): Wait ms after page load (0–10000)
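
Agents that assemble `take_screenshot` arguments programmatically may want to check them against the documented ranges before sending the call. The helper below is a hypothetical client-side sketch: `build_take_screenshot_args` is not part of DeepSyte, and the server may validate differently. The defaults and limits are copied from the parameter list above.

```python
from typing import Any, Dict, Optional

ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def build_take_screenshot_args(
    url: str,
    width: int = 1280,
    height: int = 800,
    full_page: bool = True,
    fmt: str = "png",
    delay: int = 0,
    max_height: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a take_screenshot argument dict, or raise ValueError if out of range."""
    if not 320 <= width <= 3840:
        raise ValueError("width must be between 320 and 3840 pixels")
    if not 240 <= height <= 2160:
        raise ValueError("height must be between 240 and 2160 pixels")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("format must be png, jpeg, or webp")
    if not 0 <= delay <= 10000:
        raise ValueError("delay must be between 0 and 10000 ms")

    args: Dict[str, Any] = {
        "url": url,
        "width": width,
        "height": height,
        "fullPage": full_page,
        "format": fmt,
        "delay": delay,
    }
    if max_height is not None:
        # maxHeight is optional, so it is only sent when explicitly set.
        args["maxHeight"] = max_height
    return args
```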

#### screenshot_fullpage
Capture entire scrollable page.
Params: url, width, format, maxHeight

#### screenshot_mobile
iPhone 14 Pro (393×852).
Params: url, fullPage, format

#### screenshot_tablet
iPad (820×1180).
Params: url, fullPage, format

#### screenshot_responsive
Desktop + tablet + mobile in ONE call. Most efficient for responsive visual comparison. For detailed checks (overflow, touch targets, font sizes), follow up with responsive_audit in a browser session.
Params: url, fullPage, format

#### screenshot_dark
Dark mode emulated (prefers-color-scheme: dark).
Params: url, width, height, format

#### screenshot_element
Specific element by CSS selector. SPA-friendly with auto-wait.
Params: url, selector, format, delay

#### screenshot_pdf
Export as PDF (A4 with backgrounds).
Params: url

#### screenshot_batch
Capture multiple URLs in one call (max 10).
Params: urls[], width, height, format, fullPage

#### screenshot_cross_browser
Chromium + Firefox + WebKit simultaneously.
Params: url, width, height, fullPage

#### screenshot_diff
Pixel-diff two URLs. Returns diff image + percentage changed + match score. To capture multiple URLs for comparison, use screenshot_batch.
Params: urlA, urlB, width, height, threshold

#### find_breakpoints
Detect responsive breakpoints (scans 320px–1920px). Returns structured width table with overflow status (✅/❌), height, and scrollWidth at each width. For element-level issues (culprit elements, touch targets, font sizes), follow up with responsive_audit.
Params: url

#### responsive_audit
One-call responsive design audit in a browser session. Checks: horizontal overflow with culprit elements, touch target sizes (≥44×44px), text below 16px, viewport meta tag, input font sizes for iOS zoom prevention, and interactive element spacing. Returns structured pass/fail report.
Params: sessionId

#### list_recent_screenshots
View recent captures.
Params: limit (1–20)

#### get_screenshot_status
Check if a job is done.
Params: id

---

### Browser Session Tools

Start with `browser_navigate` → get sessionId → pass to all tools → `browser_close` when done. **Both `browser_navigate` and `browser_close` return a `Run URL` pointing to the live dashboard for this run (timeline + captures + replay + console + network). Always surface the Run URL to the user at the end of the task so they can review the evidence. If a `Share URL` is also returned, include it for teammates who don't have an account.**
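
The lifecycle above maps naturally onto a `try`/`finally` block, so the session is closed even when a step fails. The sketch below is illustrative only. `call_tool` stands in for whatever function your MCP client uses to invoke a tool, and the result keys `sessionId`, `runUrl`, and `shareUrl` are assumed names for the documented session ID, Run URL, and Share URL. Check the actual tool output for the real field names.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical signature: (tool name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_session(
    call_tool: ToolCaller,
    url: str,
    steps: List[Tuple[str, Dict[str, Any]]],
) -> Dict[str, Optional[str]]:
    """Open a session, run each step with the sessionId, and always close."""
    opened = call_tool("browser_navigate", {"url": url})
    session_id = opened["sessionId"]  # assumed key name
    closed: Dict[str, Any] = {}
    try:
        for tool_name, args in steps:
            call_tool(tool_name, {"sessionId": session_id, **args})
    finally:
        # browser_close frees resources, so it runs even if a step raised.
        closed = call_tool("browser_close", {"sessionId": session_id})
    return {
        "runUrl": closed.get("runUrl") or opened.get("runUrl"),
        "shareUrl": closed.get("shareUrl"),
    }
```

A caller would pass steps such as `[("browser_click", {"selector": "text=Sign in"}), ("browser_screenshot", {})]` and relay the returned Run URL in its final answer.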

#### browser_navigate
Open URL, returns screenshot + sessionId + `Run URL` (dashboard deep-link for this run). Surface the Run URL to the user at the end of the task. Supports workflow-aware outcome context for run summaries.
Params: url, sessionId (optional), width, height, record_video, task_type, user_goal, workflow_name, workflow_required, auth_scope, tool_path, page_set, required_evidence

#### browser_click
Click by CSS selector or visible text.
Params: sessionId, selector

#### browser_click_at
Click at x,y coordinates — for CAPTCHAs, canvas, iframes.
Params: sessionId, x, y, clickCount, delay

#### browser_fill
Type into input field (clears first).
Params: sessionId, selector, value

#### browser_hover
Trigger hover states/tooltips/dropdowns.
Params: sessionId, selector

#### browser_select_option
Select from dropdown.
Params: sessionId, selector, value

#### browser_press_key
Keyboard: Enter, Tab, Escape, Control+a, etc.
Params: sessionId, key

#### browser_scroll
Scroll by pixel amount.
Params: sessionId, x, y

#### browser_wait_for
Wait for element to appear.
Params: sessionId, selector, timeout

#### browser_go_back / browser_go_forward
Browser history navigation.
Params: sessionId

#### browser_set_viewport
Resize viewport mid-session (e.g. desktop ↔ mobile).
Params: sessionId, width, height

#### browser_close
Free resources. Always call when done. Returns a `Run URL` pointing to the dashboard view of this run — you MUST include this Run URL in your final reply so the user can review the captured timeline, evidence, console, and network. Also returns a `Share URL` when one exists (public link for teammates).
Params: sessionId

#### browser_screenshot
Screenshot current page state.
Params: sessionId

#### browser_get_text
All visible text (or a specific element). Fails fast with a "no matching element" error instead of hanging.
Params: sessionId, selector (optional), timeout (default 5000ms, 500–30000)

#### browser_get_html
DOM source. Fails fast with a "no matching element" error instead of hanging.
Params: sessionId, selector (optional), outer, timeout (default 5000ms, 500–30000)

#### browser_get_accessibility_tree
Full a11y tree — best for understanding page structure.
Params: sessionId, interestingOnly, maxDepth

#### accessibility_snapshot
A11y tree for any URL without a session.
Params: url, interestingOnly, maxDepth

#### accessibility_audit
Run a real WCAG 2.1 AA compliance audit on a URL. Checks landmarks, skip links, focus indicators, heading hierarchy, image alt text, aria-hidden on decorative SVGs, color contrast ratios, form labels, touch targets, and reduced-motion handling. Returns categorized PASS/FAIL results with WCAG criteria references. For element-level responsive checks (overflow culprits, touch target sizes, font sizes), use responsive_audit in a browser session.
Params: url, width, height

#### browser_evaluate
Run JavaScript, return result.
Params: sessionId, script

---

### Performance & SEO

#### browser_perf_metrics
Core Web Vitals: LCP, FCP, CLS, TTFB, DOM size, resource counts. For the full request waterfall with timing data, use browser_network_requests.
Good thresholds: TTFB < 800ms, FCP < 1.8s, LCP < 2.5s, CLS < 0.1
Params: sessionId
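
When summarizing `browser_perf_metrics` output, it can help to compare each value against the "good" thresholds listed above. The sketch below assumes the metrics have already been extracted into a dict. The key names (`ttfb_ms`, `fcp_ms`, `lcp_ms`, `cls`) are hypothetical and are not the tool's actual output format.

```python
from typing import Dict

# Thresholds from the text: TTFB < 800ms, FCP < 1.8s, LCP < 2.5s, CLS < 0.1.
GOOD_THRESHOLDS: Dict[str, float] = {
    "ttfb_ms": 800,
    "fcp_ms": 1800,
    "lcp_ms": 2500,
    "cls": 0.1,
}


def check_metrics(metrics: Dict[str, float]) -> Dict[str, bool]:
    """Map each reported metric to True when it is under its 'good' threshold."""
    return {
        name: metrics[name] < limit
        for name, limit in GOOD_THRESHOLDS.items()
        if name in metrics  # skip metrics the page did not report
    }


def failing_metrics(metrics: Dict[str, float]) -> Dict[str, float]:
    """Return only the metrics at or above their threshold, with their values."""
    results = check_metrics(metrics)
    return {name: metrics[name] for name, ok in results.items() if not ok}
```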

#### browser_network_requests
Full network waterfall with timing.
Params: sessionId, resourceType, minDuration, limit

#### browser_seo_audit
Meta, OG, Twitter cards, headings, JSON-LD, alt text, structured data.
Params: sessionId

#### seo_batch_compare
Compare SEO metadata across 2–10 URLs in one call. Returns a comparison table showing which meta fields are duplicated across pages — catches identical titles, descriptions, OG tags, and canonical issues that single-page tools miss. No browser session needed. For deeper single-page analysis, use browser_seo_audit in a browser session. For social card previews, use og_preview.
Params: urls (array of 2–10 URLs)

#### og_preview
Preview how a URL will look when shared on social media. Extracts all OG and Twitter Card meta tags from the rendered page, validates them, screenshots the og:image, and generates a social card mockup. Works with JS-rendered pages (SPAs). No browser session needed. For full SEO metadata (headings, structured data, robots), use browser_seo_audit. To compare OG tags across multiple pages, use seo_batch_compare.
Params: url (required), platform (twitter|facebook|linkedin|slack|all, default: all)

---

### Debugging

#### browser_console_logs
Console errors, warnings, logs, exceptions.
Params: sessionId, level, limit

#### browser_network_errors
Failed requests (4xx, 5xx).
Params: sessionId, limit

#### browser_cookies
Get/set/clear cookies.
Params: sessionId, action, cookies[]

#### browser_storage
Read/write localStorage and sessionStorage.
Params: sessionId, storageType, action, key, value

---

### Smart Login

#### auth_test_assist
Start here for website login, sign-up, and verification testing. Reuses the saved inbox/password, checks remembered auth state for the site's normalized origin, and returns reusable auth strategy plus site-specific signals such as recommended auth path, account-exists confidence, likely auth method, expected follow-up, and known-site history.
Params: url, action, intent, loginUrl, outcome, verification_required, username, display_name, force_new_inbox, notes

#### find_login_page
Discover login pages via sitemap.xml + common paths. After finding the login URL, use auth_test_assist to plan the auth flow, or smart_login to attempt sign-in directly.
Params: url

#### smart_login
Auto-detect form fields, fill credentials, submit with click, Enter, and form-submit fallbacks, then report result.
Params: loginUrl, username, password, usernameSelector, passwordSelector, submitSelector

Returns: screenshot + status (SUCCESS/FAILED/UNCERTAIN) + sessionId.

---

### CAPTCHA Solving

#### solve_captcha
Auto-detect and solve Cloudflare Turnstile, reCAPTCHA v2/v3, hCaptcha using AI (CapSolver).
For Clerk-powered sites, automatically calls sign-up/sign-in API with the solved token.
Params: sessionId, type (auto), sitekey (auto), pageUrl (auto), autoSubmit (default: true)

---

### Disposable Email (AgentMail)

Each user needs their own AgentMail API key (free at https://console.agentmail.to). Configure in Dashboard → Settings.

#### create_test_inbox
Standalone inbox helper. Create or reuse the saved primary inbox and return its email, password, inbox ID, and known-site history. For website auth work, start with auth_test_assist first so you also get reusable cross-site strategy and remembered per-site guidance.
Params: username (optional), display_name (optional), force_new (optional)

#### check_inbox
Read messages, auto-extracts OTP codes and verification links.
Params: inbox_id, limit

#### send_test_email
Send email from an inbox.
Params: inbox_id, to, subject, text

---

### Gmail Verification (OAuth)

#### authorize_email_access
One-time OAuth setup for Gmail.

#### read_verification_email
Read OTP codes from user's Gmail inbox.
Params: sender (optional), subject_keyword (optional), max_age_minutes

---

### AI-Powered Analysis

#### ux_review
AI-powered UX review using vision. Returns actionable feedback across Accessibility, SEO, Performance, Navigation, Content, and Mobile-friendliness. For deeper checks, follow up with accessibility_audit (WCAG compliance), responsive_audit (overflow, touch targets, font sizes), or browser_perf_metrics (Core Web Vitals).
Params: url, width, height

#### extract_text_from_image
Extract text from an image using AI vision (OCR). Works on screenshots, photos of text, infographics, social cards, Canva graphics, and any image with embedded text. If you need a screenshot URL first, use take_screenshot or browser_screenshot.
Params: image_url (optional — public URL of image), sessionId (optional — screenshot current page), selector (optional — OCR a specific element), prompt (optional — custom extraction prompt)
Requires either image_url or sessionId. Use when page text is embedded in images rather than DOM.

---

## Common Workflows

### Responsive Testing
→ `find_breakpoints` to scan all widths for overflow and layout shifts
→ `browser_navigate` (mobile viewport) → `responsive_audit` for element-level checks
→ `browser_set_viewport` (tablet) → `responsive_audit` again
→ `screenshot_responsive` for visual comparison across viewports

### Full Site Audit
→ First read `workflows/sitewide-performance-audit/WORKFLOW.md` before any browser or audit tool use
→ State that you read it, the page set, whether authenticated pages are in scope, and whether you will use MCP or CLI first
→ If the user gave the site URL but not the page list, infer the representative public pages and begin
→ `browser_navigate` → `browser_get_accessibility_tree` → `browser_perf_metrics` → `browser_seo_audit` → `og_preview` → `browser_console_logs` → `browser_network_errors`

### Sign-Up Testing with Disposable Email
1. `auth_test_assist` → decide whether sign-in or sign-up should be attempted first
2. `find_login_page` if needed
3. Reuse the saved primary inbox and password from the helper unless you explicitly need a fresh registration
4. `smart_login` for the first auth attempt or `browser_fill` / `browser_click` for manual multi-step flows
5. `check_inbox` → extract OTP or verification link
6. Record the outcome with `auth_test_assist`
7. Report reusable auth-system heuristics first, then the site-specific evidence that supported them

### Before/After Comparison
→ `screenshot_diff` with urlA and urlB → pixel diff + match score

### Cross-Browser Testing
→ `screenshot_cross_browser` for Chromium + Firefox + WebKit

---

## REST API

Base URL: https://api.deepsyte.com

### POST /v1/screenshot
Enqueue a screenshot job.
Headers: Authorization: Bearer sk_live_...
Body: { url, width, height, fullPage, format, delay }
Response: { id, status: "pending" }

### GET /v1/screenshot/:id
Poll for status.
Response: { id, status, url, error, createdAt }
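
The enqueue-then-poll flow above can be sketched with the Python standard library. This is a rough sketch under stated assumptions, not an official client. It treats any status other than `"pending"` as terminal, which the source does not confirm, and the polling interval and attempt cap are arbitrary choices. The `send` and `sleep` parameters exist only so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "https://api.deepsyte.com"


def http_json(method: str, path: str, api_key: str,
              body: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Send an authenticated JSON request to the REST API and decode the reply."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": "Bearer " + api_key,  # sk_live_... REST key
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def capture(api_key: str, url: str, *,
            send: Callable[..., Dict[str, Any]] = http_json,
            sleep: Callable[[float], None] = time.sleep,
            interval: float = 2.0, max_polls: int = 30,
            **options: Any) -> Dict[str, Any]:
    """Enqueue a screenshot job, then poll until it leaves the pending state."""
    job = send("POST", "/v1/screenshot", api_key, {"url": url, **options})
    for _ in range(max_polls):
        status = send("GET", "/v1/screenshot/" + str(job["id"]), api_key)
        # Assumption: anything other than "pending" means the job finished.
        if status.get("status") != "pending":
            return status
        sleep(interval)
    raise TimeoutError("screenshot job %s is still pending" % job["id"])
```

A call such as `capture(api_key, "https://example.com", width=1280, fullPage=True)` returns the final status object. Inspect its `url` and `error` fields to tell success from failure.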

---

## Installation by Client

### Cursor
~/.cursor/mcp.json:
```json
{ "mcpServers": { "deepsyte": { "url": "https://api.deepsyte.com/mcp" } } }
```

### Windsurf
~/.codeium/windsurf/mcp_config.json:
```json
{ "mcpServers": { "deepsyte": { "serverUrl": "https://api.deepsyte.com/mcp" } } }
```

### Claude Desktop / Claude Code
```json
{ "mcpServers": { "deepsyte": { "url": "https://api.deepsyte.com/mcp" } } }
```

### VS Code (Copilot)
.vscode/mcp.json:
```json
{ "servers": { "deepsyte": { "type": "http", "url": "https://api.deepsyte.com/mcp" } } }
```

### Gemini CLI / OpenCode / Roo Code / Other MCP clients
```json
{ "mcpServers": { "deepsyte": { "url": "https://api.deepsyte.com/mcp" } } }
```

---

## Documentation

Full docs: https://deepsyte.com/docs
Agent Skill: https://deepsyte.com/.skills/deepsyte/SKILL.md