# Browser-Use

> Fast and efficient browser automation with the open-source browser-use framework

Browser-Use is an open-source framework optimized for fast, efficient browser automation. It lets an AI agent interact with websites the way a person would: clicking, typing, scrolling, and navigating.

It is well suited to automating repetitive web tasks, extracting data from complex sites, and testing web applications at scale. Hyperbrowser hosts the browser-use framework, so you can run an agent task with a single API call.

You can view your Browser-Use tasks in the [dashboard](https://app.hyperbrowser.ai/features/agents/browser-use).

<Info>
  Browser-Use agents run asynchronously by default. Start a task, then poll for results. Our SDKs include a `startAndWait()` helper that handles polling automatically and returns when the task completes.
</Info>

## How It Works

You can use Browser-Use in two ways:

1. **Start and Wait**: The SDKs provide a `startAndWait()` method (`start_and_wait()` in Python) that blocks until the task completes and returns the result.
2. **Async Pattern**: Start a task, get a job ID, then poll for status and results. This is useful for long-running tasks or when you want more control.

## Installation

<CodeGroup>
  ```bash npm theme={null}
  npm install @hyperbrowser/sdk dotenv
  ```

  ```bash yarn theme={null}
  yarn add @hyperbrowser/sdk dotenv
  ```

  ```bash pip theme={null}
  pip install hyperbrowser python-dotenv
  ```

  ```bash uv theme={null}
  uv add hyperbrowser python-dotenv
  ```
</CodeGroup>

## Quick Start

The simplest way to run a Browser-Use task is with `startAndWait()`, which handles the entire lifecycle for you:

<CodeGroup>
  ```typescript Node.js theme={null}
  import { Hyperbrowser } from "@hyperbrowser/sdk";
  import { config } from "dotenv";

  config();

  const client = new Hyperbrowser({
    apiKey: process.env.HYPERBROWSER_API_KEY,
  });

  async function main() {
    const result = await client.agents.browserUse.startAndWait({
      task: "Go to Hacker News and tell me the title of the top post",
      llm: "gemini-2.5-flash",
      maxSteps: 20,
    });

    console.log(`Output:\n${result.data?.finalResult}`);
  }

  main().catch((err) => {
    console.error(`Error: ${err.message}`);
  });
  ```

  ```python Python theme={null}
  from hyperbrowser import Hyperbrowser
  from hyperbrowser.models import StartBrowserUseTaskParams
  import os
  from dotenv import load_dotenv

  load_dotenv()

  client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))

  result = client.agents.browser_use.start_and_wait(
      params=StartBrowserUseTaskParams(
          task="Go to Hacker News and tell me the title of the top post",
          llm="gemini-2.5-flash",
          max_steps=20
      )
  )

  print(f"Output:\n{result.data.final_result}")
  ```

  ```bash cURL theme={null}
  # Start the task
  curl -X POST https://api.hyperbrowser.ai/api/task/browser-use \
    -H "Content-Type: application/json" \
    -H "x-api-key: YOUR_API_KEY" \
    -d '{
      "task": "Go to Hacker News and tell me the title of the top post",
      "llm": "gemini-2.5-flash",
      "maxSteps": 20
    }'

  # Response: {"jobId": "abc123", "liveUrl": "https://..."}

  # Check status
  curl https://api.hyperbrowser.ai/api/task/browser-use/abc123/status \
    -H "x-api-key: YOUR_API_KEY"

  # Get full results
  curl https://api.hyperbrowser.ai/api/task/browser-use/abc123 \
    -H "x-api-key: YOUR_API_KEY"
  ```
</CodeGroup>
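
If you would rather call the REST API from Python without the SDK, the cURL calls above map onto plain HTTP requests. The sketch below only builds the requests with the standard library and does not send them. The function names are illustrative, and the endpoint, headers, and body fields are copied from the cURL example rather than from a separate API reference.

```python
import json
import urllib.request

# Base endpoint taken from the cURL example above.
API_BASE = "https://api.hyperbrowser.ai/api/task/browser-use"


def build_start_request(
    api_key: str,
    task: str,
    llm: str = "gemini-2.5-flash",
    max_steps: int = 20,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a task."""
    body = json.dumps({"task": task, "llm": llm, "maxSteps": max_steps})
    return urllib.request.Request(
        API_BASE,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def build_status_request(api_key: str, job_id: str) -> urllib.request.Request:
    """Build (but do not send) the GET request that checks a job's status."""
    return urllib.request.Request(
        f"{API_BASE}/{job_id}/status",
        headers={"x-api-key": api_key},
        method="GET",
    )
```

Pass either request to `urllib.request.urlopen()` to send it.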

## Async Pattern

When you need more control, use the async pattern to start a task and poll for results:

<CodeGroup>
  ```typescript Node.js theme={null}
  import { Hyperbrowser } from "@hyperbrowser/sdk";
  import { config } from "dotenv";

  config();

  const client = new Hyperbrowser({
    apiKey: process.env.HYPERBROWSER_API_KEY,
  });

  async function main() {
    try {
      // Start the task
      const task = await client.agents.browserUse.start({
        task: "What is the title of the first post on Hacker News today?",
        llm: "gemini-2.5-flash",
        maxSteps: 20,
      });

      console.log(`Task started: ${task.jobId}`);
      console.log(`Watch live: ${task.liveUrl}`);

      // Poll for completion
      let result;
      while (true) {
        result = await client.agents.browserUse.getStatus(task.jobId);
        console.log(`Status: ${result.status}`);

        if (result.status === "completed" || result.status === "failed") {
          break;
        }

        await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5s
      }

      const fullResult = await client.agents.browserUse.get(task.jobId);

      if (fullResult.status === "completed") {
        console.log("Result:", fullResult.data?.finalResult);
        console.log("Steps taken:", fullResult.data?.steps?.length);
      } else {
        console.error("Task failed:", fullResult.error);
      }
    } catch (err) {
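      // In strict TypeScript, `err` is `unknown` inside a catch block, so narrow it first
      console.error(`Error: ${err instanceof Error ? err.message : String(err)}`);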
    }
  }

  main();
  ```

  ```python Python theme={null}
  import asyncio
  from hyperbrowser import AsyncHyperbrowser
  from hyperbrowser.models import StartBrowserUseTaskParams
  from dotenv import load_dotenv
  import os

  load_dotenv()

  client = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


  async def main():
      try:
          # Start the task
          task = await client.agents.browser_use.start(
              params=StartBrowserUseTaskParams(
                  task="What is the title of the first post on Hacker News today?",
                  llm="gemini-2.5-flash",
                  max_steps=20,
              )
          )

          print(f"Task started: {task.job_id}")
          print(f"Watch live: {task.live_url}")

          # Poll for completion
          while True:
              result = await client.agents.browser_use.get_status(task.job_id)
              print(f"Status: {result.status}")

              if result.status in ["completed", "failed"]:
                  break

              await asyncio.sleep(5)  # Wait 5s

          full_result = await client.agents.browser_use.get(task.job_id)

          if full_result.status == "completed":
              print("Result:", full_result.data.final_result)
              print(
                  "Steps taken:",
                  len(full_result.data.steps) if full_result.data.steps else 0,
              )
          else:
              print("Task failed:", full_result.error)
      except Exception as e:
          print(f"Error: {e}")


  if __name__ == "__main__":
      asyncio.run(main())
  ```
</CodeGroup>

## Stop a Running Task

Stop a task before it completes:

<CodeGroup>
  ```typescript Node.js theme={null}
  await client.agents.browserUse.stop("job-id");
  ```

  ```python Python theme={null}
  client.agents.browser_use.stop("job-id")
  ```

  ```bash cURL theme={null}
  curl -X PUT https://api.hyperbrowser.ai/api/task/browser-use/job-id/stop \
    -H "x-api-key: YOUR_API_KEY"
  ```
</CodeGroup>
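
One common reason to stop a task is a client-side deadline. The sketch below polls until the job reaches a terminal status or the deadline passes, and calls stop in the latter case. The `poll_with_deadline` name, the injected `get_status` and `stop` callables, and the `"timed_out"` return value are assumptions made for illustration; in real code you would pass thin wrappers around `client.agents.browser_use.get_status` and `client.agents.browser_use.stop`.

```python
import time
from typing import Callable

# The two terminal statuses checked in the polling example on this page.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_with_deadline(
    job_id: str,
    get_status: Callable[[str], str],
    stop: Callable[[str], None],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job finishes; stop it if the deadline passes first."""
    deadline = clock() + timeout_s
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            # Out of time: ask the service to stop the task.
            stop(job_id)
            return "timed_out"  # hypothetical marker, not an API status
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper easy to test without waiting in real time.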

## Parameters

<ParamField path="task" type="string" required>
  Natural language description of what the agent should accomplish.
</ParamField>

<ParamField path="version" type="string" default="0.1.40">
  Version of browser-use to use.
  Options: `0.1.40`, `0.7.10`, `latest`

  <Warning>
    Be cautious with `latest`. We periodically update it to track the newest browser-use release, which can change how the agent behaves or introduce other breaking changes.
  </Warning>
</ParamField>

<ParamField path="llm" type="string" default="gemini-2.0-flash">
  Language model to use. Options: `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `claude-sonnet-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-20250514`, `gemini-2.0-flash`, `gemini-2.5-flash`
</ParamField>

<ParamField path="maxSteps" type="number" default="20">
  Maximum number of steps the agent can take. Increase this value if tasks cannot finish within the limit.
</ParamField>

<ParamField path="sessionId" type="string">
  ID of an existing browser session to reuse. Useful for multi-step workflows that need to maintain the same browser session.
</ParamField>

<ParamField path="useVision" type="boolean" default="true">
  Enable screenshot analysis for better context understanding.
</ParamField>

<ParamField path="validateOutput" type="boolean" default="false">
  Validate agent output against a schema.
</ParamField>

<ParamField path="useVisionForPlanner" type="boolean" default="false">
  Provide screenshots to the planning component.
</ParamField>

<ParamField path="maxActionsPerStep" type="number" default="10">
  Maximum actions per step before reassessing.
</ParamField>

<ParamField path="maxInputTokens" type="number" default="128000">
  Maximum tokens for LLM input.
</ParamField>

<ParamField path="plannerLlm" type="string" default="gemini-2.0-flash">
  Separate language model for planning (can be different from main LLM).
</ParamField>

<ParamField path="pageExtractionLlm" type="string" default="gemini-2.0-flash">
  Separate language model for extracting structured data from pages.
</ParamField>

<ParamField path="plannerInterval" type="number" default="10">
  How often (in steps) the planner reassesses strategy.
</ParamField>

<ParamField path="maxFailures" type="number" default="3">
  Maximum consecutive failures before aborting the task.
</ParamField>

<ParamField path="initialActions" type="array">
  List of actions to execute before starting the main task.
</ParamField>

<ParamField path="sensitiveData" type="object">
  Key-value pairs of sensitive values to mask in the data sent to the LLM. The keys act as placeholders (for example, x\_user and x\_pass): the LLM sees only the placeholders, and browser-use filters the real values out of the input text. The real values are injected directly into form fields after the LLM call.
</ParamField>
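
To make the placeholder idea behind `sensitiveData` concrete, here is a minimal sketch of the masking step. It is not browser-use's implementation; the `mask_sensitive` helper and the sample values are invented for illustration.

```python
def mask_sensitive(text: str, sensitive_data: dict[str, str]) -> str:
    """Replace each secret value found in text with its placeholder name."""
    # Replace longer values first so a secret containing another is masked whole.
    ordered = sorted(sensitive_data.items(), key=lambda kv: len(kv[1]), reverse=True)
    for placeholder, value in ordered:
        if value:
            text = text.replace(value, placeholder)
    return text


# Hypothetical credentials: keys are placeholders, values are the real secrets.
sensitive_data = {"x_user": "alice@example.com", "x_pass": "hunter2"}

# The task text refers to the placeholders only, never to the real values.
task = "Log in with x_user and x_pass, then open the billing page"

print(mask_sensitive("Signed in as alice@example.com", sensitive_data))
```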

<ParamField path="outputModelSchema" type="object">
  Valid JSON schema for structured output.
</ParamField>
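
As an illustration, a schema for the Hacker News task used on this page might look like the following. The field names (`title`, `points`) and the `missing_required` helper are made up for this sketch, which checks only top-level `required` keys and is not a full JSON Schema validator. It also assumes the final result arrives as a JSON string.

```python
import json

# Hypothetical schema describing the structured output we want back.
output_model_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "points": {"type": "integer"},
    },
    "required": ["title"],
}


def missing_required(schema: dict, payload: dict) -> list[str]:
    """Return the top-level required keys that are absent from payload."""
    return [key for key in schema.get("required", []) if key not in payload]


# Assumption: the agent's final result is a JSON string matching the schema.
final_result = '{"title": "Show HN: Example", "points": 120}'
print(missing_required(output_model_schema, json.loads(final_result)))
```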

<ParamField path="keepBrowserOpen" type="boolean" default="false">
  Keep session alive after task completes.
</ParamField>

<ParamField path="sessionOptions" type="object">
  [Session configuration](/api-reference/start-a-browser-use-task#body-session-options) (proxy, stealth, captcha solving, etc.). Only applies when creating a new session. If you provide an existing `sessionId`, these options are ignored.
</ParamField>

<ParamField path="useCustomApiKeys" type="boolean" default="false">
  Use your own LLM API keys instead of Hyperbrowser's. You will only be charged for browser usage.
</ParamField>

<ParamField path="apiKeys" type="object">
  API keys for `openai`, `anthropic`, and `google`. Required when `useCustomApiKeys` is `true`. Provide a key for each provider whose models you use.

  ```typescript theme={null}
  {
    openai: "...",
    anthropic: "...",
    google: "..."
  }
  ```
</ParamField>

<Tip>
  The agent may not complete the task within the specified `maxSteps`. If that happens, try increasing the `maxSteps` parameter.

  Additionally, the browser session used by the AI Agent will time out based on your team's default Session Timeout settings or the session's `timeoutMinutes` parameter if provided. You can adjust the default Session Timeout in the [Settings page](https://app.hyperbrowser.ai/settings).
</Tip>

## Reuse Browser Sessions

Pass an existing `sessionId` to a Browser-Use task to run it in that session. To keep the session open after the task finishes, also set `keepBrowserOpen` to `true`.

<CodeGroup>
  ```typescript Node.js theme={null}
  import { Hyperbrowser } from "@hyperbrowser/sdk";
  import { config } from "dotenv";

  config();

  const client = new Hyperbrowser({
    apiKey: process.env.HYPERBROWSER_API_KEY,
  });

  const main = async () => {
    const session = await client.sessions.create();

    try {
      const result = await client.agents.browserUse.startAndWait({
        task: "What is the title of the first post on Hacker News today?",
        sessionId: session.id,
        keepBrowserOpen: true,
      });

      console.log(`Output:\n${result.data?.finalResult}`);

      const result2 = await client.agents.browserUse.startAndWait({
        task: "Tell me how many upvotes the first post has.",
        sessionId: session.id,
      });

      console.log(`\nOutput:\n${result2.data?.finalResult}`);
    } catch (err) {
      console.error(`Error: ${err}`);
    } finally {
      await client.sessions.stop(session.id);
    }
  };

  main().catch((err) => {
    console.error(`Error: ${err.message}`);
  });
  ```

  ```python Python theme={null}
  import os
  from hyperbrowser import Hyperbrowser
  from hyperbrowser.models import StartBrowserUseTaskParams
  from dotenv import load_dotenv

  load_dotenv()

  client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


  def main():
      session = client.sessions.create()

      try:
          resp = client.agents.browser_use.start_and_wait(
              StartBrowserUseTaskParams(
                  task="What is the title of the first post on Hacker News today?",
                  session_id=session.id,
                  keep_browser_open=True,
              )
          )

          print(f"Output:\n{resp.data.final_result}")

          resp2 = client.agents.browser_use.start_and_wait(
              StartBrowserUseTaskParams(
                  task="Tell me how many upvotes the first post has.",
                  session_id=session.id,
              )
          )

          print(f"\nOutput:\n{resp2.data.final_result}")
      except Exception as e:
          print(f"Error: {e}")
      finally:
          client.sessions.stop(session.id)


  if __name__ == "__main__":
      try:
          main()
      except Exception as e:
          print(f"Error: {e}")
  ```
</CodeGroup>

<Warning>
  Set `keepBrowserOpen: true` on every task whose session you plan to reuse afterward. Otherwise, the session is closed automatically when the task completes.
</Warning>
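
When chaining several tasks on one session, this rule means every task except the last needs `keepBrowserOpen`. A small sketch of that bookkeeping follows; the `chain_task_params` helper is hypothetical, and the dictionary keys mirror the Python field names used in the example above.

```python
def chain_task_params(tasks: list[str], session_id: str) -> list[dict]:
    """Build one params dict per task, all bound to the same session."""
    last = len(tasks) - 1
    return [
        {
            "task": task,
            "session_id": session_id,
            # Keep the browser open after every task except the final one.
            "keep_browser_open": index < last,
        }
        for index, task in enumerate(tasks)
    ]


for params in chain_task_params(
    ["Open Hacker News", "Read the top post title"], "session-123"
):
    print(params)
```

Each dictionary could then be unpacked into `StartBrowserUseTaskParams(**params)`, assuming the model accepts these keyword arguments as it does in the example above.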

## Use Your Own API Keys

You can supply your own LLM API keys to a Browser-Use task so that the steps it takes are not charged as credits to your Hyperbrowser account. Only the credits for the [usage of the browser itself](/reference/pricing#browser-sessions) are charged. When `useCustomApiKeys` is `true`, provide a key for each provider whose models you select in the `llm`, `plannerLlm`, and `pageExtractionLlm` parameters.

<CodeGroup>
  ```typescript Node.js theme={null}
  import { Hyperbrowser } from "@hyperbrowser/sdk";
  import { config } from "dotenv";

  config();

  const client = new Hyperbrowser({
    apiKey: process.env.HYPERBROWSER_API_KEY,
  });

  const main = async () => {
    const result = await client.agents.browserUse.startAndWait({
      task: "What is the title of the first post on Hacker News today?",
      llm: "gpt-4o",
      plannerLlm: "gpt-4o",
      pageExtractionLlm: "gpt-4o",
      useCustomApiKeys: true,
      apiKeys: {
        openai: "<OPENAI_API_KEY>",
        // Below are needed if Claude or Gemini models are used
        // anthropic: "<ANTHROPIC_API_KEY>",
        // google: "<GOOGLE_API_KEY>",
      },
    });

    console.log(`Output:\n\n${result.data?.finalResult}`);
  };

  main().catch((err) => {
    console.error(`Error: ${err.message}`);
  });
  ```

  ```python Python theme={null}
  import os
  from hyperbrowser import Hyperbrowser
  from hyperbrowser.models import StartBrowserUseTaskParams, BrowserUseApiKeys
  from dotenv import load_dotenv

  load_dotenv()

  client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


  def main():
      resp = client.agents.browser_use.start_and_wait(
          StartBrowserUseTaskParams(
              task="What is the title of the first post on Hacker News today?",
              llm="gpt-4o",
              planner_llm="gpt-4o",
              page_extraction_llm="gpt-4o",
              use_custom_api_keys=True,
              api_keys=BrowserUseApiKeys(
                  openai="<OPENAI_API_KEY>",
                  # Below are needed if Claude or Gemini models are used
                  # anthropic="<ANTHROPIC_API_KEY>",
                  # google="<GOOGLE_API_KEY>",
              )
          )
      )

      print(f"Output:\n\n{resp.data.final_result}")


  if __name__ == "__main__":
      try:
          main()
      except Exception as e:
          print(f"Error: {e}")
  ```

  ```bash cURL theme={null}
  curl -X POST https://api.hyperbrowser.ai/api/task/browser-use \
    -H "Content-Type: application/json" \
    -H "x-api-key: YOUR_HYPERBROWSER_API_KEY" \
    -d '{
      "task": "What is the title of the first post on Hacker News today?",
      "llm": "gpt-4o",
      "plannerLlm": "gpt-4o",
      "pageExtractionLlm": "gpt-4o",
      "useCustomApiKeys": true,
      "apiKeys": {
        "openai": "YOUR_OPENAI_API_KEY"
      }
    }'
  ```
</CodeGroup>

You can provide keys for multiple providers:

```json theme={null}
{
  "apiKeys": {
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
    "google": "..."
  }
}
```
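
Before starting a task with custom keys, it can help to check that every provider you need is covered. The sketch below infers the provider from the model-name prefix, which is an assumption based on the model names listed under the `llm` parameter and not an official mapping; the helper names are hypothetical.

```python
# Assumed mapping from model-name prefix to the provider key it needs.
PROVIDER_BY_PREFIX = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
}


def required_providers(*models: str) -> set[str]:
    """Return the providers needed for the given model names."""
    providers = set()
    for model in models:
        for prefix, provider in PROVIDER_BY_PREFIX.items():
            if model.startswith(prefix):
                providers.add(provider)
                break
        else:
            # No prefix matched this model name.
            raise ValueError(f"Unrecognized model name: {model}")
    return providers


def missing_api_keys(api_keys: dict[str, str], *models: str) -> set[str]:
    """Return the providers that are needed but have no key supplied."""
    return {p for p in required_providers(*models) if not api_keys.get(p)}


print(missing_api_keys({"openai": "sk-..."}, "gpt-4o", "gemini-2.0-flash"))
```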

## Session Configuration

Configure the browser environment with proxies, stealth mode, CAPTCHA solving, and more:

<CodeGroup>
  ```typescript Node.js theme={null}
  import { Hyperbrowser } from "@hyperbrowser/sdk";
  import { config } from "dotenv";

  config();

  const client = new Hyperbrowser({
    apiKey: process.env.HYPERBROWSER_API_KEY,
  });

  const main = async () => {
    const result = await client.agents.browserUse.startAndWait({
      task: "go to Hacker News and summarize the top 5 posts of the day",
      sessionOptions: {
        acceptCookies: true,
      }
    });

    console.log(`Output:\n\n${result.data?.finalResult}`);
  };

  main().catch((err) => {
    console.error(`Error: ${err.message}`);
  });
  ```

  ```python Python theme={null}
  import os
  from hyperbrowser import Hyperbrowser
  from hyperbrowser.models import StartBrowserUseTaskParams, CreateSessionParams
  from dotenv import load_dotenv

  load_dotenv()

  client = Hyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))


  def main():
      resp = client.agents.browser_use.start_and_wait(
          StartBrowserUseTaskParams(
              task="go to Hacker News and summarize the top 5 posts of the day",
              session_options=CreateSessionParams(
                  accept_cookies=True,
              ),
          )
      )

      print(f"Output:\n\n{resp.data.final_result}")


  if __name__ == "__main__":
      try:
          main()
      except Exception as e:
          print(f"Error: {e}")
  ```

  ```bash cURL theme={null}
  curl -X POST https://api.hyperbrowser.ai/api/task/browser-use \
    -H 'Content-Type: application/json' \
    -H 'x-api-key: <YOUR_API_KEY>' \
    -d '{
        "task": "go to Hacker News and summarize the top 5 posts of the day",
        "sessionOptions": {
            "acceptCookies": true
        }
    }'
  ```
</CodeGroup>

<Info>
  `sessionOptions` applies only when a new session is created. If you provide an existing `sessionId`, these options are ignored.
</Info>
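
Because the options are ignored silently in that case, you may want to catch the mistake on the client side. The sketch below builds a request body and warns when both fields are supplied. The `build_task_body` helper is hypothetical, and dropping the options locally is a design choice of this sketch, not something the API requires.

```python
import warnings
from typing import Optional


def build_task_body(
    task: str,
    session_id: Optional[str] = None,
    session_options: Optional[dict] = None,
) -> dict:
    """Build a request body, never sending sessionOptions with a sessionId."""
    body: dict = {"task": task}
    if session_id is not None:
        body["sessionId"] = session_id
        if session_options:
            # The service would ignore these, so flag the likely mistake.
            warnings.warn(
                "sessionOptions is ignored when sessionId is set; dropping it",
                stacklevel=2,
            )
    elif session_options:
        body["sessionOptions"] = session_options
    return body


print(build_task_body("Summarize the top post", session_options={"acceptCookies": True}))
```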

<Warning>
  Proxies and CAPTCHA solving add latency. Only enable them when necessary for your use case.
</Warning>
