BrowserAgent

The BrowserAgent is the primary class for browser automation with Magnitude. It provides a high-level, AI-powered API for controlling a web browser.

An agent is created and initialized using the startBrowserAgent function.

`startBrowserAgent(options?)`

This function creates, initializes, and returns a BrowserAgent instance.

import { startBrowserAgent } from 'magnitude-core';

const agent = await startBrowserAgent();
await agent.nav("https://google.com");
// ... perform actions
await agent.stop();

options

object

An optional configuration object that can be used to customize the agent’s behavior, including Playwright launch options and LLM settings. See Agent Options for a full reference.

browser

object

An object to configure the browser instance. It can contain either Playwright launchOptions or contextOptions.

launchOptions

object

Standard Playwright launch options. See the Playwright documentation for a full list of options.

Example: Launching with a proxy

const agent = await startBrowserAgent({
  browser: {
    launchOptions: {
      proxy: {
        server: 'http://my-proxy-server.com:8080'
      }
    }
  }
});

contextOptions

object

Standard Playwright browser context options. See the Playwright documentation for a full list of options.

Example: Setting a custom user agent

const agent = await startBrowserAgent({
  browser: {
    contextOptions: {
      userAgent: 'MyCustomBrowser/1.0'
    }
  }
});

url

string

An initial URL to navigate to immediately after the browser starts.

virtualScreenDimensions

{ width: number, height: number }

Sets the resolution of the screenshot that the LLM sees. This does not change the browser’s viewport size, but scales the screenshot to these dimensions. This is useful for ensuring a consistent input size for the vision model.
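Because the screenshot is scaled while the viewport stays the same size, a position on the screenshot differs from the matching viewport position by a scale factor on each axis. The sketch below illustrates only that arithmetic. How Magnitude maps coordinates internally is not described here, and `ScreenSize`, `ScreenPoint`, and `virtualToViewport` are hypothetical names, not part of magnitude-core.

```typescript
// Illustrative only: these names are hypothetical and not part of magnitude-core.
interface ScreenSize {
  width: number;
  height: number;
}

interface ScreenPoint {
  x: number;
  y: number;
}

// Map a point picked on the scaled screenshot back to viewport coordinates.
function virtualToViewport(
  point: ScreenPoint,
  virtualDims: ScreenSize,
  viewportDims: ScreenSize
): ScreenPoint {
  const scaleX = viewportDims.width / virtualDims.width;
  const scaleY = viewportDims.height / virtualDims.height;
  return {
    x: Math.round(point.x * scaleX),
    y: Math.round(point.y * scaleY),
  };
}

// A 1920x1080 viewport shown to the model as a 1280x720 screenshot.
const viewportDims: ScreenSize = { width: 1920, height: 1080 };
const virtualDims: ScreenSize = { width: 1280, height: 720 };

// The center of the screenshot maps to the center of the viewport.
const mapped = virtualToViewport({ x: 640, y: 360 }, virtualDims, viewportDims);
console.log(mapped); // { x: 960, y: 540 }
```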

llm

object

Configure the LLM to be used by the agent. See the LLM Providers documentation for details.

narrate

boolean

If true, the agent prints a running commentary of its actions to the console.
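To show how several of the options above might be combined, here is a minimal sketch of a single options object. It assumes that `url`, `narrate`, and `virtualScreenDimensions` sit at the top level next to `browser`, which this page does not state explicitly. `SketchAgentOptions` is a hypothetical local type used only to keep the example self-contained; the real type comes from magnitude-core and may differ.

```typescript
// Hypothetical local type mirroring the options described above.
// Assumption: url, narrate and virtualScreenDimensions are top-level keys.
interface SketchAgentOptions {
  url?: string;
  narrate?: boolean;
  virtualScreenDimensions?: { width: number; height: number };
  browser?: {
    launchOptions?: { headless?: boolean };
    contextOptions?: { userAgent?: string };
  };
}

const sketchOptions: SketchAgentOptions = {
  // Page to open as soon as the browser starts.
  url: 'https://news.ycombinator.com',
  // Print a running commentary to the console.
  narrate: true,
  // Size of the screenshot shown to the LLM.
  virtualScreenDimensions: { width: 1024, height: 768 },
  browser: {
    contextOptions: { userAgent: 'MyCustomBrowser/1.0' },
  },
};

// With magnitude-core installed, the object would be passed as:
//   const agent = await startBrowserAgent(sketchOptions);
console.log(JSON.stringify(sketchOptions, null, 2));
```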

Methods

`act(description, options?)`

Executes one or more browser actions based on a natural language description. Magnitude interprets the description and determines the necessary interactions (clicking, typing, scrolling, etc.).

// Simple step
await agent.act("Click the main login button");

// Step with data
await agent.act("Enter {username} into the user field", {
  data: { username: "test@example.com" }
});

description

string

required

A natural language description of the action(s) to perform. Can include placeholders like {key} which will be substituted by values from options.data.

options

object

Optional parameters for the step, such as data, an object whose values are substituted for the {key} placeholders in description.
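The placeholder behavior described above can be pictured with a small stand-alone function. This is only a sketch of the substitution semantics, not Magnitude's implementation: `fillPlaceholders` is a hypothetical helper, and leaving unknown keys untouched is an assumption made for the example.

```typescript
// Hypothetical helper, not part of magnitude-core.
// Replaces each {key} in the description with the matching value from data.
function fillPlaceholders(
  description: string,
  data: Record<string, string>
): string {
  return description.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    const value = data[key];
    // Assumption: placeholders without a matching key are left as written.
    return value !== undefined ? value : match;
  });
}

const filled = fillPlaceholders('Enter {username} into the user field', {
  username: 'test@example.com',
});
console.log(filled); // Enter test@example.com into the user field
```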

`nav(url: string)`

Navigates to a URL.

await agent.nav('https://google.com');

`page` and `context`

Access the underlying Playwright Page and BrowserContext.

const page = agent.page;
const context = agent.context;
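Having the Playwright Page available means regular Playwright calls can be mixed with agent steps. The sketch below uses `page.title()` and `page.url()`, both standard Playwright Page methods. `PageLike` and `describePage` are hypothetical names, and the stand-in object exists only so the example runs without a browser.

```typescript
// Minimal structural type covering the two Playwright Page methods used here.
interface PageLike {
  url(): string;
  title(): Promise<string>;
}

// Hypothetical helper: summarizes where the browser currently is.
async function describePage(page: PageLike): Promise<string> {
  const title = await page.title();
  return `${title} (${page.url()})`;
}

// Stand-in for agent.page so the sketch runs without launching a browser.
const standInPage: PageLike = {
  url: () => 'https://example.com/',
  title: async () => 'Example Domain',
};

describePage(standInPage).then((summary) => console.log(summary));

// With a real agent, the call would be:
//   console.log(await describePage(agent.page));
```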

`extract(instructions, schema)`

Uses AI to extract structured data from the current page, based on your instructions and a Zod schema.

import { startBrowserAgent } from 'magnitude-core';
import { z } from 'zod';

const HackerNewsStory = z.object({
  rank: z.number().describe('The story rank'),
  title: z.string().describe('The title of the story'),
  url: z.string().url().describe('The URL of the story'),
});

const HackerNewsSchema = z.array(HackerNewsStory);

const agent = await startBrowserAgent();
await agent.nav("https://news.ycombinator.com");

const stories = await agent.extract(
  "Extract the top 5 stories from Hacker News",
  HackerNewsSchema
);

console.log(stories);
await agent.stop();

instructions

string

required

Natural language instructions for the AI, describing what data to extract from the page.

schema

ZodSchema

required

A Zod schema that defines the structure of the data to be extracted. The AI will do its best to populate the fields of the schema based on the page content and your instructions.

Adding .describe() calls to your schema fields can significantly improve the accuracy of the extraction by providing more context to the AI.
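Since the AI fills in the schema on a best-effort basis, it can be worth sanity-checking the result before relying on it. The sketch below shows one way to do that for the Hacker News example. `ExtractedStory` and `checkStories` are hypothetical names, and the specific checks are illustrative assumptions, not something Magnitude performs.

```typescript
// Shape of one item, matching the HackerNewsStory schema in the example above.
interface ExtractedStory {
  rank: number;
  title: string;
  url: string;
}

// Hypothetical helper: returns a list of problems found in the extracted data.
function checkStories(
  stories: ExtractedStory[],
  expectedCount: number
): string[] {
  const problems: string[] = [];

  if (stories.length !== expectedCount) {
    problems.push(`expected ${expectedCount} stories, got ${stories.length}`);
  }

  // Each rank should appear only once.
  const ranks = new Set(stories.map((story) => story.rank));
  if (ranks.size !== stories.length) {
    problems.push('duplicate ranks found');
  }

  for (const story of stories) {
    if (story.title.trim() === '') {
      problems.push(`story with rank ${story.rank} has an empty title`);
    }
  }

  return problems;
}

// Made-up sample data standing in for the value returned by agent.extract().
const sampleStories: ExtractedStory[] = [
  { rank: 1, title: 'First story', url: 'https://example.com/1' },
  { rank: 2, title: 'Second story', url: 'https://example.com/2' },
];

console.log(checkStories(sampleStories, 2)); // []
```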

`stop()`

Closes the browser and cleans up any resources used by the agent.
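If a step throws before stop() is reached, the browser may be left running. One common way to guard against that is a try/finally wrapper, sketched below. `Stoppable` and `withAgent` are hypothetical names and not part of magnitude-core; the helper relies only on the agent having an async `stop()` method.

```typescript
// Anything with an async stop() method, such as a BrowserAgent.
interface Stoppable {
  stop(): Promise<void>;
}

// Hypothetical helper: starts an agent, runs the task, and always stops the agent.
async function withAgent<A extends Stoppable, T>(
  start: () => Promise<A>,
  task: (agent: A) => Promise<T>
): Promise<T> {
  const agent = await start();
  try {
    return await task(agent);
  } finally {
    // Runs whether the task succeeded or threw.
    await agent.stop();
  }
}

// With magnitude-core installed, usage might look like:
//   await withAgent(
//     () => startBrowserAgent(),
//     async (agent) => {
//       await agent.nav('https://google.com');
//       await agent.act('Click the main login button');
//     }
//   );
```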
