Action Caching records every action taken during a page.ai() call. Replay these recordings later for deterministic, LLM-free automation that costs less and runs faster than calling the LLM on every run.

Why Use Action Caching?

Without Caching     | With Caching
LLM call every run  | LLM call once, replay free
Variable behavior   | Deterministic execution
Higher latency      | Near-instant replay
Higher cost         | Pay once

Recording Actions

Every page.ai() call automatically returns an actionCache:
import { HyperAgent } from "@hyperbrowser/agent";
import fs from "fs";

const agent = new HyperAgent();
const page = await agent.newPage();

await page.goto("https://flights.google.com");

// Execute task - actionCache is automatically generated
const { output, actionCache } = await page.ai(
  "Search for flights from Miami to LAX on December 15"
);


console.log(`Recorded ${actionCache.steps.length} steps`);
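
A recording is only useful for later runs if it is persisted somewhere. The sketch below shows one way to save a cache as JSON and load it back, assuming the cache serializes cleanly with JSON.stringify. The helper names (saveActionCache, loadActionCache, demoRoundTrip) and the SavedActionCache shape are hypothetical and are not part of HyperAgent.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical minimal shape: only the field this sketch checks is required.
interface SavedActionCache {
  steps: unknown[];
  [key: string]: unknown;
}

// Write the cache as pretty-printed JSON so it can be diffed and reviewed.
function saveActionCache(filePath: string, cache: SavedActionCache): void {
  fs.writeFileSync(filePath, JSON.stringify(cache, null, 2), "utf8");
}

// Read a cache back and fail loudly if it has no steps array.
function loadActionCache(filePath: string): SavedActionCache {
  const parsed: unknown = JSON.parse(fs.readFileSync(filePath, "utf8"));
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    !Array.isArray((parsed as { steps?: unknown }).steps)
  ) {
    throw new Error(`${filePath} does not look like an action cache`);
  }
  return parsed as SavedActionCache;
}

// Round trip through a temporary file to show the two helpers together.
function demoRoundTrip(): number {
  const filePath = path.join(os.tmpdir(), `action-cache-${process.pid}.json`);
  saveActionCache(filePath, { taskId: "abc-123", steps: [{ stepIndex: 0 }] });
  const restored = loadActionCache(filePath);
  fs.unlinkSync(filePath);
  return restored.steps.length;
}
```

In a real script you would pass the actionCache returned by page.ai() to saveActionCache and keep the file under version control or in object storage.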

Replaying Actions

Use agent.runFromActionCache() to replay recorded actions:
import { HyperAgent, ActionCacheOutput } from "@hyperbrowser/agent";
import fs from "fs";

const agent = new HyperAgent();


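// Load a recording saved earlier as JSON (the file name is an assumption)
const actionCache: ActionCacheOutput = JSON.parse(
  fs.readFileSync("action-cache.json", "utf8")
);

// Replay without LLM calls
const result = await agent.runFromActionCache(actionCache.steps, {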
  maxXPathRetries: 3,
});

console.log("Replay status:", result.status);
await agent.closeAgent();

Generate Script from Action Cache

Instead of replaying actions programmatically, you can generate a standalone TypeScript script from recorded actions using agent.createScriptFromActionCache():
import { HyperAgent } from "@hyperbrowser/agent";

const agent = new HyperAgent({
  llm: { provider: "anthropic", model: "claude-sonnet-4-0" },
});

const page = await agent.newPage();

// Record the automation
const { actionCache } = await page.ai(
  "Go to https://demo.automationtesting.in/Frames.html, " +
  "select the iframe within iframe tab, " +
  "and fill in the text box in the nested iframe"
);

// Generate a reusable script
const script = agent.createScriptFromActionCache(actionCache.steps);
console.log(script);

await agent.closeAgent();
This outputs a standalone script that you can save and run directly, with no LLM calls needed:
// Generated script
import { HyperAgent } from "@hyperbrowser/agent";

async function main() {
  const agent = new HyperAgent();
  const page = await agent.newPage();

  await page.goto("https://demo.automationtesting.in/Frames.html");
  await page.performClick("/html/body/section/div/div/div/ul/li[2]/a", {
    performInstruction: "Click the iframe within iframe tab"
  });
  await page.performFill("/html/body/section/div/div/div/input", "Hello", {
    performInstruction: "Fill in the text box",
    frameIndex: 2
  });

  await agent.closeAgent();
}

main();
The generated script calls the perform helpers (performClick, performFill, and so on) with the cached XPaths, so it runs without LLM calls as long as those XPaths still match the page.
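
To make the relationship between a cached step and a generated line concrete, here is a minimal sketch of how one step could be rendered as a perform helper call. It is an illustration only and is not how createScriptFromActionCache is implemented. The CachedStep shape, the HELPER_BY_METHOD mapping, and stepToScriptLine are hypothetical, and only the click and fill methods shown in this guide are mapped.

```typescript
// Hypothetical step shape, modeled on the cache JSON shown in this guide.
interface CachedStep {
  instruction: string;
  method: string;
  arguments: string[];
  frameIndex: number;
  xpath: string;
}

// Assumed mapping from cached method names to perform helpers.
// Only the two methods that appear in this guide are listed; extend as needed.
const HELPER_BY_METHOD: Record<string, string> = {
  click: "performClick",
  fill: "performFill",
};

// Render one cached step as a line of script source code.
function stepToScriptLine(step: CachedStep): string {
  const helper = HELPER_BY_METHOD[step.method];
  if (helper === undefined) {
    throw new Error(`No perform helper mapped for method "${step.method}"`);
  }
  // JSON.stringify yields correctly quoted and escaped string literals.
  const args = [step.xpath, ...step.arguments].map((a) => JSON.stringify(a));
  const options = JSON.stringify({
    performInstruction: step.instruction,
    frameIndex: step.frameIndex,
  });
  return `await page.${helper}(${args.join(", ")}, ${options});`;
}
```

Emitting the instruction alongside the XPath is what lets a generated line fall back to the LLM if the XPath stops matching.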

How Replay Works

  1. XPath First: Attempts to find elements using cached XPaths
  2. Retry on Failure: Retries up to maxXPathRetries times
  3. LLM Fallback: If the XPath still fails after all retries, falls back to the LLM using the cached instruction
  4. Continue or Stop: Stops on first failure by default
const result = await page.runFromActionCache(cache, {
  maxXPathRetries: 3,  // Retry XPath 3 times before LLM fallback
  debug: true,         // Log execution details
});

// Check what happened
for (const step of result.steps) {
  console.log(`Step ${step.stepIndex}:`, {
    usedXPath: step.usedXPath,
    fallbackUsed: step.fallbackUsed,
    success: step.success,
  });
}
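
The four replay steps listed above can be sketched as a small control loop. This is not HyperAgent's implementation: the two executors are injected as hypothetical callbacks, and the sketch assumes that maxXPathRetries counts the total number of XPath attempts, which the text above does not specify.

```typescript
interface ReplayStep {
  stepIndex: number;
  instruction: string;
  xpath: string;
}

interface StepOutcome {
  stepIndex: number;
  usedXPath: boolean;
  fallbackUsed: boolean;
  success: boolean;
}

// Hypothetical executor: resolves to true when the action succeeded.
type Executor = (step: ReplayStep) => Promise<boolean>;

async function replaySteps(
  steps: ReplayStep[],
  runByXPath: Executor,
  runByInstruction: Executor,
  maxXPathRetries = 3
): Promise<StepOutcome[]> {
  const outcomes: StepOutcome[] = [];
  for (const step of steps) {
    let success = false;
    // Steps 1 and 2: try the cached XPath, up to maxXPathRetries attempts.
    for (let attempt = 0; attempt < maxXPathRetries && !success; attempt++) {
      success = await runByXPath(step);
    }
    const usedXPath = success;
    // Step 3: fall back to the cached instruction if the XPath never worked.
    const fallbackUsed = !success;
    if (fallbackUsed) {
      success = await runByInstruction(step);
    }
    outcomes.push({ stepIndex: step.stepIndex, usedXPath, fallbackUsed, success });
    // Step 4: stop on the first failure.
    if (!success) {
      break;
    }
  }
  return outcomes;
}
```

Injecting the executors keeps the loop testable without a browser: a test can pass stubs that succeed or fail on chosen steps.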

Action Cache Format

The cache is a JSON structure containing all recorded steps:
{
  "taskId": "abc-123",
  "createdAt": "2025-01-15T10:30:00Z",
  "status": "completed",
  "steps": [
    {
      "stepIndex": 0,
      "actionType": "actElement",
      "instruction": "Click the departure city input",
      "method": "click",
      "arguments": [],
      "frameIndex": 0,
      "xpath": "/html/body/div[2]/div[4]/input[1]",
      "success": true
    },
    {
      "stepIndex": 1,
      "actionType": "actElement",
      "instruction": "Type 'Miami' into the input",
      "method": "fill",
      "arguments": ["Miami"],
      "frameIndex": 0,
      "xpath": "/html/body/div[2]/div[4]/input[1]",
      "success": true
    }
  ]
}
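
Because a cache file may be edited by hand or come from an older recording, it can help to check its shape before replaying it. The sketch below derives TypeScript types from the sample JSON above and adds a runtime check. The type and function names (CacheStepShape, CacheFileShape, isCacheStep, isCacheFile) are hypothetical, the field list is inferred only from the sample, and the library's own exported types may differ.

```typescript
// Field list inferred from the sample JSON; real caches may carry more fields.
interface CacheStepShape {
  stepIndex: number;
  actionType: string;
  instruction: string;
  method: string;
  arguments: string[];
  frameIndex: number;
  xpath: string;
  success: boolean;
}

interface CacheFileShape {
  taskId: string;
  createdAt: string;
  status: string;
  steps: CacheStepShape[];
}

// Runtime check for a single step.
function isCacheStep(value: unknown): value is CacheStepShape {
  if (typeof value !== "object" || value === null) return false;
  const s = value as Record<string, unknown>;
  const args = s["arguments"];
  return (
    typeof s["stepIndex"] === "number" &&
    typeof s["actionType"] === "string" &&
    typeof s["instruction"] === "string" &&
    typeof s["method"] === "string" &&
    Array.isArray(args) &&
    args.every((a) => typeof a === "string") &&
    typeof s["frameIndex"] === "number" &&
    typeof s["xpath"] === "string" &&
    typeof s["success"] === "boolean"
  );
}

// Runtime check for the whole cache document.
function isCacheFile(value: unknown): value is CacheFileShape {
  if (typeof value !== "object" || value === null) return false;
  const c = value as Record<string, unknown>;
  const steps = c["steps"];
  return (
    typeof c["taskId"] === "string" &&
    typeof c["createdAt"] === "string" &&
    typeof c["status"] === "string" &&
    Array.isArray(steps) &&
    steps.every(isCacheStep)
  );
}
```

A typical use is to call isCacheFile on the result of JSON.parse and refuse to replay when it returns false.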

Direct XPath Execution

For maximum control, use the perform helpers to execute actions directly:
const page = await agent.newPage();
await page.goto("https://example.com");

// Execute by XPath with LLM fallback
await page.performClick(
  "/html/body/button[1]",
  { performInstruction: "Click the submit button" }
);

await page.performFill(
  "/html/body/input[1]",
  "user@example.com", // placeholder address
  { performInstruction: "Fill the email field" }
);

Available Perform Actions

Helper                             | Description
performClick(xpath)                | Click an element
performFill(xpath, text)           | Clear and fill an input
performType(xpath, text)           | Type into an element
performPress(xpath, key)           | Press a keyboard key
performSelectOption(xpath, option) | Select from dropdown
performCheck(xpath)                | Check a checkbox
performUncheck(xpath)              | Uncheck a checkbox
performHover(xpath)                | Hover over an element
performScrollToElement(xpath)      | Scroll element into view
Each helper accepts an options object:
await page.performClick(xpath, {
  performInstruction: "Click the login button", // Fallback instruction
  frameIndex: 0,  // Target iframe (0 = main frame)
  maxSteps: 3,    // Retries before fallback
});

When to Use Action Caching

Scenario                                          | Recommendation
Repetitive tasks (daily scraping, scheduled jobs) | ✅ Record once, replay indefinitely
E2E testing                                       | ✅ Fast, deterministic test runs
High-volume automation                            | ✅ Eliminate per-run LLM costs
Stable page structures                            | ✅ XPaths remain valid longer
Dynamic pages with frequent layout changes        | ⚠️ May require frequent re-recording
One-time tasks                                    | ❌ Just use page.ai() directly

Monitoring Fallback Rates

When a cached XPath no longer matches the page, HyperAgent falls back to the LLM to find the element, provided a performInstruction is available for that step. You'll see logs like this:
⚠️ [runCachedStep] Cached action failed. Falling back to LLM...
   Instruction: "Select the LATAM/Delta flight with the lowest carbon emissions"
   ❌ Cached XPath Failed: "/html[1]/body[1]/c-wiz[2]/div[1]/.../li[5]/div[1]/div[1]"
   ✅ LLM Resolved New XPath: "/html[1]/body[1]/c-wiz[2]/div[1]/.../li[4]/div[1]/div[1]"
What this means:
  • The cached XPath pointed to li[5] but the element moved to li[4]
  • The LLM successfully found the correct element using the instruction
  • The action completed, but with added latency and cost
When to re-record:
  • If you see fallback warnings frequently, the page structure has probably changed
  • Re-run the original page.ai() task to capture fresh XPaths
  • Save the new actionCache to replace your stale recording
Enable debug: true on your agent to see more detailed logging.
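
Beyond reading logs, the per-step replay results can be turned into a number to track over time. The sketch below computes a fallback rate from the usedXPath, fallbackUsed, and success fields shown in the replay example and flags a recording for re-recording. The function names and the 0.2 default threshold are assumptions chosen for illustration, not HyperAgent defaults.

```typescript
// Per-step result fields, as read in the replay example earlier in this guide.
interface ReplayStepResult {
  stepIndex: number;
  usedXPath: boolean;
  fallbackUsed: boolean;
  success: boolean;
}

// Fraction of steps that needed the LLM fallback (0 for an empty run).
function fallbackRate(steps: ReplayStepResult[]): number {
  if (steps.length === 0) return 0;
  const fallbacks = steps.filter((s) => s.fallbackUsed).length;
  return fallbacks / steps.length;
}

// Flag a recording as stale when any step failed outright or when the
// fallback rate exceeds the threshold. The 0.2 default is arbitrary.
function shouldReRecord(steps: ReplayStepResult[], threshold = 0.2): boolean {
  return steps.some((s) => !s.success) || fallbackRate(steps) > threshold;
}
```

Logging fallbackRate(result.steps) after each scheduled run gives an early signal that a page is drifting before replays start to fail.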