# Overview

> Use HyperAgent to automate browser tasks with AI

HyperAgent is an open-source browser automation framework that extends Playwright with AI capabilities. Write natural language commands instead of complex selectors, and let HyperAgent handle the tedious parts of web automation.

<CardGroup cols={2}>
  <Card title="GitHub Repository" icon="github" href="https://github.com/hyperbrowserai/HyperAgent">
    View source code
  </Card>

  <Card title="npm Package" icon="npm" href="https://www.npmjs.com/package/@hyperbrowser/agent">
    Install the Node.js SDK
  </Card>
</CardGroup>

<CardGroup cols={1}>
  <Card title="Templates" icon="layer-group" href="https://www.hyperbrowser.ai/templates">
    View a curated list of templates
  </Card>
</CardGroup>

## The Challenge with Browser Automation

Browser automation tools like Puppeteer and Playwright offer powerful functionality for scripting clicks, typing, scrolling, and more. But they require you to understand the DOM structure and locate elements through HTML attributes, CSS selectors, or complex XPath queries.

This gets harder fast:

* **Selectors break** when websites update their markup
* **Iframes isolate content**, requiring nested queries to reach elements inside them
* **Shadow DOM encapsulation** makes elements even harder to access
* **Dynamic content** means selectors that worked yesterday might fail today

You end up spending more time maintaining selectors than building features.
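To make the first point concrete, here is a minimal sketch of why position-based selectors are fragile. It uses a toy in-memory tree rather than a real browser DOM; `ToyNode`, `node`, `findByPath`, and `findById` are hypothetical helpers written for this illustration and are not part of Playwright or HyperAgent.

```typescript
// A toy stand-in for a DOM element (hypothetical, for illustration only).
interface ToyNode {
  tag: string;
  id: string | null;
  children: ToyNode[];
}

function node(tag: string, children: ToyNode[] = [], id: string | null = null): ToyNode {
  return { tag, id, children };
}

// Positional lookup: follow child indexes from the root,
// much like an absolute XPath such as /body/form[1]/input[1].
function findByPath(root: ToyNode, path: number[]): ToyNode | undefined {
  let current: ToyNode | undefined = root;
  for (const index of path) {
    current = current?.children[index];
  }
  return current;
}

// Attribute lookup: depth-first search for a matching id.
function findById(root: ToyNode, id: string): ToyNode | undefined {
  if (root.id === id) return root;
  for (const child of root.children) {
    const match = findById(child, id);
    if (match) return match;
  }
  return undefined;
}

// The page as it looked when the script was written.
const before = node("body", [
  node("header"),
  node("form", [node("input", [], "departure")]),
]);

// A redesign wraps the form in a new container element.
const after = node("body", [
  node("header"),
  node("div", [node("form", [node("input", [], "departure")])]),
]);

const path = [1, 0]; // body -> second child -> first child

console.log(findByPath(before, path)?.tag); // "input"
console.log(findByPath(after, path)?.tag); // "form": the same path now hits the wrong element
console.log(findById(after, "departure")?.tag); // "input"
```

The attribute-based lookup survives this particular change, but it breaks just as easily if the `id` is renamed or generated dynamically, which is the maintenance burden described above.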

## What HyperAgent Does

HyperAgent lets you describe what you want in plain English. The AI works out how to interact with the page, however the DOM is structured.

```typescript
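// Instead of this brittle absolute XPath (the xpath= prefix is needed because
// Playwright only auto-detects XPath selectors that start with // or ..):
await page.locator('xpath=/html[1]/body[1]/c-wiz[2]/div[1]/div[2]/c-wiz[1]/div[1]/c-wiz[1]/div[2]/div[1]/div[1]/div[1]/div[1]/div[2]/div[1]/div[6]/div[2]/div[2]/div[1]/div[1]/input[1]').fill('Miami');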

// Write this:
await page.perform("type Miami into the departure city field");
```

HyperAgent handles iframes, shadow DOM, and dynamic content automatically. Because your instructions describe intent rather than markup, your automation can keep working when a site changes.

## Core Methods

<CardGroup cols={2}>
  <Card title="page.ai()" href="/hyperagent/page-ai">
    Execute complex multi-step tasks with natural language
  </Card>

  <Card title="page.perform()" href="/hyperagent/page-perform">
    Fast, single-action execution
  </Card>

  <Card title="page.extract()" href="/hyperagent/extract">
    Pull structured data with Zod schemas
  </Card>

  <Card title="Playwright Compatible" href="/sessions/playwright">
    Use standard Playwright when you need deterministic control
  </Card>
</CardGroup>

```typescript
import { HyperAgent } from "@hyperbrowser/agent";
import { z } from "zod";

const agent = new HyperAgent();
const page = await agent.newPage();

await page.goto("https://flights.google.com");

// AI handles the complexity
await page.ai("search for flights from Miami to LAX on Dec 15");

// Single actions when you know what you need
await page.perform("click the first result");

// Extract structured data
const flight = await page.extract(
  "get the price and duration of the selected flight",
  z.object({
    price: z.number(),
    duration: z.string(),
  })
);

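// Drop down to standard Playwright when you need deterministic control.
// .first() avoids a strict-mode error when several buttons match.
await page.locator('css=button').first().click();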

await agent.closeAgent();
```

## Key Features

<AccordionGroup>
  <Accordion title="Automatic Element Location">
    Describe the element in natural language. HyperAgent finds it regardless of DOM structure, iframes, or shadow DOM.
  </Accordion>

  <Accordion title="Action Caching">
    Record your automation once, then replay it without LLM calls. You get deterministic execution at a fraction of the cost.
  </Accordion>

  <Accordion title="Multiple LLM Providers">
    Use OpenAI, Anthropic, or Google Gemini. Switch providers with one line of code.
  </Accordion>

  <Accordion title="Cloud Ready">
    Run locally for development, scale to hundreds of sessions with [Hyperbrowser](/sessions/create) in production.
  </Accordion>

  <Accordion title="CDP-First Architecture">
    Native Chrome DevTools Protocol integration for precise coordinates, deep iframe tracking, and automatic ad filtering.
  </Accordion>
</AccordionGroup>
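
The record-once, replay-later idea behind action caching can be sketched in a few lines. This is a conceptual illustration only: `ActionCache`, `ResolvedAction`, and `fakeResolver` are hypothetical names, not HyperAgent's API or cache format, and the resolver is synchronous here for brevity where a real LLM call would be asynchronous.

```typescript
// Hypothetical shape of a resolved action; not HyperAgent's real cache format.
interface ResolvedAction {
  kind: "click" | "fill";
  selector: string;
}

// Turns a natural language instruction into a concrete action.
// In a real system this would be the expensive LLM call.
type Resolver = (instruction: string) => ResolvedAction;

class ActionCache {
  private entries = new Map<string, ResolvedAction>();
  private resolver: Resolver;
  resolverCalls = 0;

  constructor(resolver: Resolver) {
    this.resolver = resolver;
  }

  // Return the recorded action if one exists; otherwise resolve and record it.
  resolve(instruction: string): ResolvedAction {
    const cached = this.entries.get(instruction);
    if (cached !== undefined) return cached;

    this.resolverCalls += 1;
    const action = this.resolver(instruction);
    this.entries.set(instruction, action);
    return action;
  }
}

// Stand-in for the LLM: derives a fake action from the instruction text.
const fakeResolver: Resolver = (instruction) => ({
  kind: instruction.startsWith("type") ? "fill" : "click",
  selector: `[data-demo="${instruction.length}"]`,
});

const cache = new ActionCache(fakeResolver);
const firstRun = cache.resolve("click the first result"); // resolved and recorded
const replay = cache.resolve("click the first result"); // served from the cache

console.log(cache.resolverCalls); // 1
console.log(firstRun === replay); // true
```

The replayed action is identical to the recorded one, which is where the determinism comes from. The trade-off is that a recorded action can go stale if the page changes, so a production cache would need some way to invalidate or re-record entries.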

## Get Started

<CodeGroup>
  ```bash npm
  npm install @hyperbrowser/agent
  ```

  ```bash yarn
  yarn add @hyperbrowser/agent
  ```

  ```bash pnpm
  pnpm add @hyperbrowser/agent
  ```
</CodeGroup>

```typescript
import { HyperAgent } from "@hyperbrowser/agent";

const agent = new HyperAgent();
const page = await agent.newPage();

await page.goto("https://news.ycombinator.com");
await page.ai("find the top story and summarize it");

await agent.closeAgent();
```

<CardGroup cols={1}>
  <Card title="Quickstart" icon="rocket" href="/hyperagent/quickstart">
    Build your first automation
  </Card>
</CardGroup>
