Files
website-enchun-mgr/.agent/skills/agent-browser/references/snapshot-refs.md
pkupuk ad8e2e313e chore(agent): configure AI agents and tools
Add configuration for BMad, Claude, OpenCode, and other AI agent tools and workflows.
2026-02-11 11:51:23 +08:00

4.0 KiB

Snapshot + Refs Workflow

The core innovation of agent-browser: compact element references that reduce context usage dramatically for AI agents.

How It Works

The Problem

Traditional browser automation sends full DOM to AI agents:

Full DOM/HTML sent → AI parses → Generates CSS selector → Executes action
~3000-5000 tokens per interaction

The Solution

agent-browser uses compact snapshots with refs:

Compact snapshot → @refs assigned → Direct ref interaction
~200-400 tokens per interaction

The Snapshot Command

# Basic snapshot (shows page structure)
agent-browser snapshot

# Interactive snapshot (-i flag) - RECOMMENDED
agent-browser snapshot -i

Snapshot Output Format

Page: Example Site - Home
URL: https://example.com

@e1 [header]
  @e2 [nav]
    @e3 [a] "Home"
    @e4 [a] "Products"
    @e5 [a] "About"
  @e6 [button] "Sign In"

@e7 [main]
  @e8 [h1] "Welcome"
  @e9 [form]
    @e10 [input type="email"] placeholder="Email"
    @e11 [input type="password"] placeholder="Password"
    @e12 [button type="submit"] "Log In"

@e13 [footer]
  @e14 [a] "Privacy Policy"

Using Refs

Once you have refs, interact directly:

# Click the "Sign In" button
agent-browser click @e6

# Fill email input
agent-browser fill @e10 "user@example.com"

# Fill password
agent-browser fill @e11 "password123"

# Submit the form
agent-browser click @e12

Ref Lifecycle

IMPORTANT: Refs are invalidated when the page changes!

# Get initial snapshot
agent-browser snapshot -i
# @e1 [button] "Next"

# Click triggers page change
agent-browser click @e1

# MUST re-snapshot to get new refs!
agent-browser snapshot -i
# @e1 [h1] "Page 2"  ← Different element now!

Best Practices

1. Always Snapshot Before Interacting

# CORRECT
agent-browser open https://example.com
agent-browser snapshot -i          # Get refs first
agent-browser click @e1            # Use ref

# WRONG
agent-browser open https://example.com
agent-browser click @e1            # Ref doesn't exist yet!

2. Re-Snapshot After Navigation

agent-browser click @e5            # Navigates to new page
agent-browser snapshot -i          # Get new refs
agent-browser click @e1            # Use new refs

3. Re-Snapshot After Dynamic Changes

agent-browser click @e1            # Opens dropdown
agent-browser snapshot -i          # See dropdown items
agent-browser click @e7            # Select item

4. Snapshot Specific Regions

For complex pages, snapshot specific areas:

# Snapshot just the form
agent-browser snapshot @e9

Ref Notation Details

@e1 [tag type="value"] "text content" placeholder="hint"
│    │   │             │               │
│    │   │             │               └─ Additional attributes
│    │   │             └─ Visible text
│    │   └─ Key attributes shown
│    └─ HTML tag name
└─ Unique ref ID

Common Patterns

@e1 [button] "Submit"                    # Button with text
@e2 [input type="email"]                 # Email input
@e3 [input type="password"]              # Password input
@e4 [a href="/page"] "Link Text"         # Anchor link
@e5 [select]                             # Dropdown
@e6 [textarea] placeholder="Message"     # Text area
@e7 [div class="modal"]                  # Container (when relevant)
@e8 [img alt="Logo"]                     # Image
@e9 [checkbox] checked                   # Checked checkbox
@e10 [radio] selected                    # Selected radio

Troubleshooting

"Ref not found" Error

# Ref may have changed - re-snapshot
agent-browser snapshot -i

Element Not Visible in Snapshot

# Scroll to reveal element
agent-browser scroll --bottom
agent-browser snapshot -i

# Or wait for dynamic content
agent-browser wait 1000
agent-browser snapshot -i

Too Many Elements

# Snapshot specific container
agent-browser snapshot @e5

# Or use get text for content-only extraction
agent-browser get text @e5