Browser-Use is an open-source solution optimized for fast, efficient browser automation. It enables AI to interact with websites naturally—clicking, typing, scrolling, and navigating just like a human would.
It is well suited to automating repetitive web tasks, extracting data from complex sites, and testing web applications at scale. Hyperbrowser hosts the browser-use framework so you can run agent tasks with a single API call.
You can view your Browser-Use tasks in the dashboard.
Browser-Use agents run asynchronously by default. Start a task, then poll for results. Our SDKs include a startAndWait() helper that handles polling automatically and returns when the task completes.
How It Works
You can use Browser-Use in two ways:
- Start and Wait: SDKs provide a startAndWait() method that blocks until the task completes and returns the result
- Async Pattern: Start a task, get a job ID, then poll for status and results—useful for long-running tasks or when you want more control
Installation
npm install @hyperbrowser/sdk dotenv
Quick Start
The simplest way to run a Browser-Use task is with startAndWait(), which handles the entire lifecycle for you:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

// Load HYPERBROWSER_API_KEY from .env
config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  // Blocks until the task finishes, then returns the result
  const result = await client.agents.browserUse.startAndWait({
    task: "Go to Hacker News and tell me the title of the top post",
    llm: "gemini-2.0-flash",
    maxSteps: 20,
  });
  console.log(`Output:\n${result.data?.finalResult}`);
}

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
Async Pattern
When you need more control, use the async pattern to start a task and poll for results:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

async function main() {
  try {
    // Start the task
    const task = await client.agents.browserUse.start({
      task: "What is the title of the first post on Hacker News today?",
      llm: "gemini-2.0-flash",
      maxSteps: 20,
    });
    console.log(`Task started: ${task.jobId}`);
    console.log(`Watch live: ${task.liveUrl}`);

    // Poll for completion
    let result;
    while (true) {
      result = await client.agents.browserUse.getStatus(task.jobId);
      console.log(`Status: ${result.status}`);
      if (result.status === "completed" || result.status === "failed") {
        break;
      }
      await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5s
    }

    // Fetch the full result once the task has finished
    const fullResult = await client.agents.browserUse.get(task.jobId);
    if (fullResult.status === "completed") {
      console.log("Result:", fullResult.data?.finalResult);
      console.log("Steps taken:", fullResult.data?.steps?.length);
    } else {
      console.error("Task failed:", fullResult.error);
    }
  } catch (err) {
    console.error(`Error: ${err.message}`);
  }
}

main();
Stop a Running Task
Stop a task before it completes:
await client.agents.browserUse.stop("job-id");
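The polling loop in the async pattern has no upper bound on how long it waits. A minimal sketch of combining polling with stop() is shown below: it polls until the task finishes and stops the task if a deadline passes first. The BrowserUseJobs interface, the runWithDeadline name, and the deadline and interval parameters are hypothetical, introduced only for illustration; the interface mirrors just the start, getStatus, and stop calls used in this guide and is not the SDK's published type.

```typescript
// Hypothetical minimal shape: only the three calls used in this guide.
interface BrowserUseJobs {
  start(params: { task: string }): Promise<{ jobId: string }>;
  getStatus(jobId: string): Promise<{ status: string }>;
  stop(jobId: string): Promise<unknown>;
}

interface DeadlineOutcome {
  jobId: string;
  lastStatus: string; // last status seen while polling
  stoppedByDeadline: boolean;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Starts a task and polls until it reports "completed" or "failed".
// If the deadline passes first, the task is stopped.
async function runWithDeadline(
  jobs: BrowserUseJobs,
  task: string,
  deadlineMs: number,
  pollMs: number = 5000,
): Promise<DeadlineOutcome> {
  const { jobId } = await jobs.start({ task });
  const deadline = Date.now() + deadlineMs;
  let lastStatus = "unknown";

  while (Date.now() < deadline) {
    lastStatus = (await jobs.getStatus(jobId)).status;
    if (lastStatus === "completed" || lastStatus === "failed") {
      return { jobId, lastStatus, stoppedByDeadline: false };
    }
    await sleep(pollMs);
  }

  // Deadline reached without a terminal status: stop the task.
  await jobs.stop(jobId);
  return { jobId, lastStatus, stoppedByDeadline: true };
}
```

Assuming the SDK object is structurally compatible with this interface, it could be called as runWithDeadline(client.agents.browserUse, "...", 120000). Injecting the object rather than importing the SDK inside the helper keeps the sketch testable with a fake.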
Parameters
task
Natural language description of what the agent should accomplish.
Version of browser-use to use.
Options: 0.1.40, 0.7.10, latest
Be cautious when using the latest version. We periodically update it to match the newest browser-use release, which may change how the agent works or introduce other breaking changes.
llm
string
default:"gemini-2.0-flash"
Language model to use. Options: gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, claude-sonnet-4-5, claude-sonnet-4-20250514, claude-3-7-sonnet-20250219, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, gemini-2.0-flash, gemini-2.5-flash
maxSteps
Maximum number of steps the agent can take. Increase this if tasks cannot complete within the given number of steps.
sessionId
ID of an existing browser session to reuse. Useful for multi-step workflows that need to maintain the same browser session.
Enable screenshot analysis for better context understanding.
Validate agent output against a schema.
Provide screenshots to the planning component.
Maximum actions per step before reassessing.
Maximum tokens for LLM input.
plannerLlm
string
default:"gemini-2.0-flash"
Separate language model for planning (can be different from main LLM).
pageExtractionLlm
Separate language model for extracting structured data from pages.
How often (in steps) the planner reassesses strategy.
Maximum consecutive failures before aborting the task.
List of actions to execute before starting the main task.
Key-value pairs used to mask the data sent to the LLM. The LLM sees only placeholders (such as x_user and x_pass); browser-use filters your sensitive data out of the input text. Real values are injected directly into form fields after the LLM call.
Valid JSON schema for structured output.
keepBrowserOpen
Keep the session alive after the task completes.
sessionOptions
Session configuration (proxy, stealth, captcha solving, etc.). Only applies when creating a new session. If you provide an existing sessionId, these options are ignored.
useCustomApiKeys
Use your own LLM API keys instead of Hyperbrowser’s. You will only be charged for browser usage.
apiKeys
API keys for openai, anthropic, and google. Required when useCustomApiKeys is true. Provide a key for each provider whose models you use.
{
  openai: "...",
  anthropic: "...",
  google: "..."
}
The agent may not complete the task within the specified maxSteps. If that happens, increase maxSteps.
The browser session used by the agent also times out according to your team’s default Session Timeout setting, or the session’s timeoutMinutes parameter if provided. You can adjust the default Session Timeout on the Settings page.
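To make the sensitive-data masking described in the parameter list more concrete, here is a minimal sketch of the placeholder idea. It is an illustration only and not browser-use's actual implementation: the maskSecrets and unmaskSecrets names are hypothetical, and the secret values are made up. The placeholder keys x_user and x_pass come from the parameter description.

```typescript
// Replaces every occurrence of a secret value with its placeholder key,
// so the text handed to a model never contains the real value.
function maskSecrets(text: string, secrets: Record<string, string>): string {
  let masked = text;
  for (const [placeholder, value] of Object.entries(secrets)) {
    if (value.length === 0) continue; // an empty value would split the text into characters
    masked = masked.split(value).join(placeholder);
  }
  return masked;
}

// The reverse step: swaps placeholders back to real values,
// e.g. just before a value is typed into a form field.
function unmaskSecrets(text: string, secrets: Record<string, string>): string {
  let restored = text;
  for (const [placeholder, value] of Object.entries(secrets)) {
    restored = restored.split(placeholder).join(value);
  }
  return restored;
}

// Made-up example values
const exampleSecrets = { x_user: "alice@example.com", x_pass: "hunter2" };
const taskText = "Log in as alice@example.com with password hunter2";

console.log(maskSecrets(taskText, exampleSecrets));
// prints "Log in as x_user with password x_pass"
```

The point of the split is that the model plans with placeholders while the real values are only substituted at the moment of form input.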
Reuse Browser Sessions
Pass an existing sessionId to run a Browser-Use task on that session. To keep the session open after the task finishes, also set the keepBrowserOpen parameter.
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const session = await client.sessions.create();

  try {
    // First task: keep the browser open so the session can be reused
    const result = await client.agents.browserUse.startAndWait({
      task: "What is the title of the first post on Hacker News today?",
      sessionId: session.id,
      keepBrowserOpen: true,
    });
    console.log(`Output:\n${result.data?.finalResult}`);

    // Second task: runs on the same session
    const result2 = await client.agents.browserUse.startAndWait({
      task: "Tell me how many upvotes the first post has.",
      sessionId: session.id,
    });
    console.log(`\nOutput:\n${result2.data?.finalResult}`);
  } catch (err) {
    console.error(`Error: ${err}`);
  } finally {
    await client.sessions.stop(session.id);
  }
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
Always set keepBrowserOpen: true on any task whose session you want to reuse afterwards. Otherwise, the session is closed automatically when the task completes.
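When several tasks run back to back on one session, it is easy to forget the flag on one of them. The sketch below builds the parameter objects for such a chain and sets keepBrowserOpen on every task except the last. The chainTasks name and the ChainedTaskParams type are hypothetical; only task, sessionId, and keepBrowserOpen come from this guide, and leaving the last task with keepBrowserOpen: false is a design choice of this sketch, not a requirement.

```typescript
// Hypothetical parameter shape using only fields shown in this guide.
interface ChainedTaskParams {
  task: string;
  sessionId: string;
  keepBrowserOpen: boolean;
}

// Builds one parameter object per task. Every task except the last keeps
// the browser open so the next task can reuse the same session.
function chainTasks(sessionId: string, tasks: string[]): ChainedTaskParams[] {
  return tasks.map((task, index) => ({
    task,
    sessionId,
    keepBrowserOpen: index < tasks.length - 1,
  }));
}

const chained = chainTasks("example-session-id", [
  "What is the title of the first post on Hacker News today?",
  "Tell me how many upvotes the first post has.",
]);
console.log(chained.map((p) => p.keepBrowserOpen));
// prints [ true, false ]
```

Each object could then be passed to startAndWait() in order, with the session still stopped in a finally block as in the example above.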
Use Your Own API Keys
You can supply your own LLM API keys so that the steps a Browser-Use task takes do not consume credits on your Hyperbrowser account; you are then charged only for browser usage. When useCustomApiKeys is set to true, provide a key for each provider whose models you select in the llm, plannerLlm, and pageExtractionLlm parameters.
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "What is the title of the first post on Hacker News today?",
    llm: "gpt-4o",
    plannerLlm: "gpt-4o",
    pageExtractionLlm: "gpt-4o",
    useCustomApiKeys: true,
    apiKeys: {
      openai: "<OPENAI_API_KEY>",
      // Below are needed if Claude or Gemini models are used
      // anthropic: "<ANTHROPIC_API_KEY>",
      // google: "<GOOGLE_API_KEY>",
    },
  });
  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
You can provide keys for multiple providers:
{
  "apiKeys": {
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
    "google": "..."
  }
}
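Since the required keys depend on which models you pick, a small pre-flight check can catch a missing key before a task is started. The sketch below maps a model name to its provider by prefix and reports which providers still lack a key. The providerFor and missingApiKeys names are hypothetical, and the prefix rule is an assumption based on the model names listed under the llm parameter (gpt-, claude-, gemini-); it is not an official mapping.

```typescript
type Provider = "openai" | "anthropic" | "google";

// Assumed prefix-to-provider mapping, derived from the model names in this guide.
function providerFor(model: string): Provider {
  if (model.startsWith("gpt-")) return "openai";
  if (model.startsWith("claude-")) return "anthropic";
  if (model.startsWith("gemini-")) return "google";
  throw new Error(`Unknown model family: ${model}`);
}

// Returns the providers that the selected models need but apiKeys does not cover.
function missingApiKeys(
  models: string[],
  apiKeys: Partial<Record<Provider, string>>,
): Provider[] {
  const needed = Array.from(new Set(models.map(providerFor)));
  return needed.filter((provider) => !apiKeys[provider]);
}

// llm, plannerLlm, and pageExtractionLlm selections, with only an OpenAI key supplied
console.log(
  missingApiKeys(["gpt-4o", "claude-sonnet-4-5", "gpt-4o"], { openai: "sk-..." }),
);
// prints [ 'anthropic' ]
```

An empty array means every selected model has a matching key.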
Session Configuration
Configure the browser environment with proxies, stealth mode, CAPTCHA solving, and more:
import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const main = async () => {
  const result = await client.agents.browserUse.startAndWait({
    task: "go to Hacker News and summarize the top 5 posts of the day",
    // Applied only because a new session is created for this task
    sessionOptions: {
      acceptCookies: true,
    },
  });
  console.log(`Output:\n\n${result.data?.finalResult}`);
};

main().catch((err) => {
  console.error(`Error: ${err.message}`);
});
sessionOptions applies only when a new session is created. If you provide an existing sessionId, these options are ignored.
Proxies and CAPTCHA solving add latency. Only enable them when necessary for your use case.