> ## Documentation Index > Fetch the complete documentation index at: https://hyperbrowser.ai/docs/llms.txt > Use this file to discover all available pages before exploring further. # page.ai() > Execute complex multi-step browser tasks with AI-powered automation The `page.ai()` method executes multi-step browser tasks using natural language. It runs an agentic loop that observes the page, makes decisions, and executes actions until the goal is complete. `agent.executeTask()` does the same thing—it's a convenience method that creates a page for you. Use whichever fits your workflow. ## Basic Usage Execute a multi-step task on a specific page: ```typescript theme={null} import { HyperAgent } from "@hyperbrowser/agent"; const agent = new HyperAgent({ llm: { provider: "openai", model: "gpt-4o" }, }); const page = await agent.newPage(); await page.goto("https://flights.google.com"); const { output, actionCache } = await page.ai( "Search for round-trip flights from Miami to LAX, leaving Dec 15 and returning Dec 22" ); console.log(output); // The actionCache can be saved and replayed later console.log(`Completed in ${actionCache.steps.length} steps`); await agent.closeAgent(); ``` ### Parameters Natural language description of what you want to accomplish. Be specific for best results. Configuration options for the task execution. Maximum number of actions the AI can take. Increase for complex tasks. Reuse DOM snapshots across steps for faster execution. Enable screenshots with element overlays for visual understanding. Stream DOM updates for more responsive execution. Define a Zod schema for structured output extraction at the end of the task. ### Example with Options ```typescript theme={null} import { z } from "zod"; const { output, actionCache } = await page.ai( "Find the cheapest flight from NYC to London next month", { maxSteps: 30, useDomCache: true, outputSchema: z.object({ airline: z.string(), price: z.number(), departure: z.string(), arrival: z.string(), }), } ); console.log(output); // Typed as { airline, price, departure, arrival } ``` ## agent.executeTask() `executeTask()` is a shorthand that creates a page and runs `page.ai()` for you: ```typescript theme={null} // This: const result = await agent.executeTask("Go to amazon.com and find the top seller"); // Is equivalent to: const page = await agent.newPage(); const result = await page.ai("Go to amazon.com and find the top seller"); ``` It accepts the same parameters as `page.ai()`: ### With Output Schema ```typescript theme={null} import { z } from "zod"; const result = await agent.executeTask( "Navigate to imdb.com, search for 'The Matrix', and extract the movie details", { outputSchema: z.object({ director: z.string().describe("The name of the movie director"), releaseYear: z.number().describe("The year the movie was released"), rating: z.string().describe("The IMDb rating of the movie"), }), } ); console.log(result.output); // { director: "Lana Wachowski, Lilly Wachowski", releaseYear: 1999, rating: "8.7/10" } ``` ## Return Value Both methods return a `TaskOutput` object: ```typescript theme={null} interface TaskOutput { taskId: string; // Unique identifier for this task status: TaskStatus; // "completed" | "failed" | "cancelled" output: string | T; // Result (or typed if outputSchema provided) steps: AgentStep[]; // Array of steps taken actionCache: ActionCacheOutput; // Recorded actions for replay } ``` ### Task Status | Status | Description | | ----------- | ------------------------------------ | | `completed` | Task finished successfully | | `failed` | Task encountered an error | | `cancelled` | Task was cancelled before completion | ## Visual Mode Enable visual mode when the AI needs to understand page layout or when dealing with complex visual elements: ```typescript theme={null} const { output } = await page.ai( "Find the product image and describe what's shown", { enableVisualMode: true, } ); ``` Visual mode uses screenshots which increases token usage and latency. Only enable when visual understanding is necessary. ## Real-World Examples ### Flight Search ```typescript theme={null} const page = await agent.newPage(); await page.goto("https://flights.google.com"); const { output, actionCache } = await page.ai( "Search for round-trip flights from Rio de Janeiro to Los Angeles, " + "leaving December 11, 2025 and returning December 22, 2025. " + "Select the option with the lowest carbon emissions.", { useDomCache: true, enableDomStreaming: true, } ); // Save actionCache for later replay console.log(JSON.stringify(actionCache, null, 2)); ``` ### E-commerce Price Comparison ```typescript theme={null} import { z } from "zod"; const result = await agent.executeTask( "Go to amazon.com, search for 'mechanical keyboard', and compare the top 3 results", { outputSchema: z.object({ products: z.array(z.object({ name: z.string(), price: z.number(), rating: z.number(), reviewCount: z.number(), })), recommendation: z.string(), }), } ); console.log(result.output.products); console.log(result.output.recommendation); ``` ### Google Form Submission ```typescript theme={null} const agent = new HyperAgent({ llm: { provider: "openai", model: "gpt-4o" }, }); const page = await agent.newPage(); await page.goto("https://docs.google.com/forms/d/e/1FAIpQLScPkE8wNLpPSkP2d__Ee7xx5Pj7_XDuZ0p16geYWrp73Nutmw/viewform?usp=dialog"); // Fill each field await page.perform("fill the name field with John Doe"); await page.perform("fill the email field with john@example.com"); await page.perform("fill the feedback text area with This is a test submission"); await page.perform("select 5 rating option"); // Submit await page.perform("click the submit button"); await agent.closeAgent(); ``` ## Action Cache Output Every `page.ai()` call returns an `actionCache` that records all actions taken: ```json theme={null} { "taskId": "abc-123", "createdAt": "2025-01-15T10:30:00Z", "status": "completed", "steps": [ { "stepIndex": 0, "actionType": "actElement", "instruction": "Click the source location input", "method": "click", "arguments": [], "frameIndex": 0, "xpath": "/html/body/div[1]/input[1]", "success": true, "message": "Successfully clicked element" } ] } ``` This cache can be saved and [replayed later](/docs/hyperagent/action-cache) for deterministic execution without LLM calls. ## Error Handling ```typescript theme={null} try { const result = await page.ai("Complete the checkout process", { maxSteps: 50, }); if (result.status === "failed") { console.error("Task failed:", result.output); } } catch (error) { console.error("Execution error:", error); } ``` ## Best Practices Instead of "search for flights", say "search for round-trip flights from Miami to LAX, departing December 15 and returning December 22, 2025". Simple tasks: 10-20 steps. Complex multi-page workflows: 50+ steps. Monitor task outputs and adjust as needed. When you need specific data extracted, define a Zod schema to get typed, validated output. Store the returned `actionCache` to replay the same automation later without LLM calls. ## Next Steps Fast single-action execution Extract structured data Replay automations without LLM calls Configure LLM providers