# Memory and Prompt Caching
Source: https://docs.magnitude.run/advanced/memory

Configure memory settings and prompt caching optimizations

## Prompt caching

By default, when Anthropic models are used, Magnitude automatically uses Anthropic's [prompt caching feature](https://www.anthropic.com/news/prompt-caching) to save on token costs as much as possible.

With prompt caching, Magnitude typically uses around 40% fewer tokens than it would uncached. The effects of prompt caching compound with longer tasks.

## Configuring prompt caching

If you would like to explicitly disable prompt caching, you can do so:

```ts
const cachedAgent = await startBrowserAgent({
    url: 'https://magnitasks.com',
    llm: {
        provider: 'claude-code',
        options: {
            model: 'claude-sonnet-4-20250514',
            promptCaching: false
        }
    }
});
```

Otherwise, prompt caching will be enabled by default with the `anthropic` or `claude-code` providers. Note that prompt caching is not available on Bedrock.

## Configuring retained screenshots

By default, Magnitude will retain 1 screenshot plus any screenshots that were included in a previous cache write.

If your task requires complicated interactions and you want to ensure the agent retains more screenshots in working memory, you can increase the `minScreenshots` option:

```ts
const cachedAgent = await startBrowserAgent({
    url: 'https://magnitasks.com',
    llm: {
        provider: 'claude-code',
        options: {
            model: 'claude-sonnet-4-20250514'
        }
    },
    minScreenshots: 3,
});
```

## How Magnitude's memory works

Anthropic provides up to 4 cache control breakpoints at a time. The Magnitude agent uses one cache control on the system prompt so that it is always cached, and then cycles the other 3 with prompt cache writes and reads to optimize screenshot tokens during the task.

In addition, Magnitude only keeps one screenshot unless already cached, plus the last 20 thoughts from the agent.
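The retention rule just described can be sketched in a few lines. This is only an illustration under stated assumptions, not Magnitude's actual implementation: the `MemoryItem` type and the `lastN` and `trimMemory` functions are hypothetical names, and the `cached` flag stands in for "was included in a previous cache write".

```typescript
// Hypothetical shapes for illustration; these are not Magnitude's internal types.
type MemoryItem =
    | { kind: 'screenshot'; cached: boolean; image: string }
    | { kind: 'thought'; text: string };

// Return the last n elements. slice(-0) would return everything, so guard n <= 0.
function lastN<T>(items: T[], n: number): T[] {
    return n > 0 ? items.slice(-n) : [];
}

// Keep the newest `minScreenshots` screenshots, any screenshot already cached,
// and the newest `maxThoughts` thoughts. Original ordering is preserved.
function trimMemory(
    history: MemoryItem[],
    minScreenshots = 1,
    maxThoughts = 20
): MemoryItem[] {
    const screenshotIdx: number[] = [];
    const thoughtIdx: number[] = [];
    history.forEach((item, i) => {
        (item.kind === 'screenshot' ? screenshotIdx : thoughtIdx).push(i);
    });
    const keepShots = new Set(lastN(screenshotIdx, minScreenshots));
    const keepThoughts = new Set(lastN(thoughtIdx, maxThoughts));
    return history.filter((item, i) =>
        item.kind === 'screenshot'
            ? item.cached || keepShots.has(i)
            : keepThoughts.has(i)
    );
}

// Simulate 25 steps, each producing one thought and one screenshot.
// The first two screenshots are marked as already cached.
const history: MemoryItem[] = [];
for (let step = 0; step < 25; step++) {
    history.push({ kind: 'thought', text: `thought ${step}` });
    history.push({ kind: 'screenshot', cached: step < 2, image: `shot-${step}` });
}
const trimmed = trimMemory(history);
console.log(`kept ${trimmed.length} of ${history.length} items`);
```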
This means that context length will grow roughly linearly with task complexity, and is extremely performant compared to other computer use agent implementations which do not implement sliding-window memory or prompt caching.

# LLM Roles
Source: https://docs.magnitude.run/advanced/roles

Designate different LLMs for different responsibilities

You can customize the Magnitude agent to use different LLMs for each of the three primary operations: `act`, `extract`, and `query`.

By default, when a single LLM is provided, all responsibilities will be handled by that LLM. However, by specifying different LLMs for certain roles, you may be able to reduce cost and improve speed.

Example:

```typescript
import { startBrowserAgent } from 'magnitude-core';
import z from 'zod';

async function main() {
    const agent = await startBrowserAgent({
        url: 'https://magnitasks.com/tasks',
        narrate: true,
        llm: [
            {
                provider: 'claude-code',
                options: {
                    model: 'claude-sonnet-4-20250514'
                }
            },
            {
                roles: ['extract'],
                provider: 'google-ai',
                options: {
                    model: 'gemini-2.5-flash-lite-preview-06-17', //'gemini-2.5-flash'
                }
            },
            {
                roles: ['query'],
                provider: 'google-ai',
                options: {
                    // Balance intelligent querying and cheap tokens
                    model: 'gemini-2.5-flash'
                }
            }
        ]
    });

    const tasks = await agent.extract(
        'Extract all tasks in To Do column',
        z.array(z.object({ title: z.string(), desc: z.string() }))
    );
    // ^ this will use gemini-2.5-flash-lite-preview-06-17

    await agent.act('Move each to in progress', { data: tasks });
    // ^ this will use Claude

    const numTodosMoved = await agent.query(
        'How many todos were moved?',
        z.number()
    );
    // ^ this will use gemini-2.5-flash

    console.log(numTodosMoved);

    await agent.stop();
}

main();
```

One great use case for this is to reduce the cost of extracting data. While `act` requires an intelligent and [visually grounded model](/core-concepts/compatible-llms), `extract` and `query` do not require grounded models, and can often work fine with less intelligent models.
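To make the routing in the example above concrete, here is a minimal sketch of how a role could be matched to a client. It assumes that an entry listing the role wins and that an entry without `roles` acts as the fallback, which is inferred from the example's comments rather than from Magnitude's source. `LLMClient`, `Role`, and `resolveClient` are hypothetical names.

```typescript
// Hypothetical types mirroring the shape of the `llm` option shown above.
type Role = 'act' | 'extract' | 'query';

interface LLMClient {
    provider: string;
    options: { model: string };
    roles?: Role[];
}

// Pick the client for a role: an entry that lists the role wins,
// otherwise fall back to the first entry with no `roles` (assumed default).
function resolveClient(clients: LLMClient | LLMClient[], role: Role): LLMClient {
    const list = Array.isArray(clients) ? clients : [clients];
    const specific = list.find(c => c.roles?.includes(role));
    if (specific) return specific;
    const fallback = list.find(c => !c.roles || c.roles.length === 0);
    if (!fallback) throw new Error(`No LLM configured for role "${role}"`);
    return fallback;
}

const llms: LLMClient[] = [
    { provider: 'claude-code', options: { model: 'claude-sonnet-4-20250514' } },
    {
        roles: ['extract'],
        provider: 'google-ai',
        options: { model: 'gemini-2.5-flash-lite-preview-06-17' }
    },
    { roles: ['query'], provider: 'google-ai', options: { model: 'gemini-2.5-flash' } }
];

for (const role of ['act', 'extract', 'query'] as Role[]) {
    console.log(role, '->', resolveClient(llms, role).options.model);
}
```

Under these assumptions a single client (not wrapped in an array) handles every role, which matches the default behavior described above.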
General recommendations:

* `act`: MUST use an [intelligent, visually grounded model](/core-concepts/compatible-llms)
* `extract`: Can use a fast and cheap model, like `gemini-2.5-flash` or even `gemini-2.5-flash-lite`
* `query`: Can use any model that's reasonably intelligent but fast, depending on the complexity of the queries you plan to ask. `gemini-2.5-flash` might be a good option.

# Creating PRs
Source: https://docs.magnitude.run/contributing/creating-prs

How to propose code changes to Magnitude

## Accepted PRs

Before you dive into making code changes, you should understand whether your work is likely to get merged!

Generally accepted PRs:

* Bugfixes (for known or newly identified issues)
* [Open issues](https://github.com/magnitudedev/magnitude/issues), especially issues that are affecting or blocking multiple people.

However, if there's something else you plan to implement, feel free to create an issue for it or run it by us on Discord first. You can wing it and create the PR too, but there are no guarantees it will be accepted.

If you're ever not sure whether a PR is likely to get accepted, feel free to ping us on Discord.

Keep in mind that even a working implementation of a feature or a fix for an issue may receive additional scrutiny based on conventions or quality.

Other general guidelines:

* Changes in `magnitude-test` are more likely to be accepted quickly, as the surface area to gain feature parity with other test runners is quite large.
* Changes to `magnitude-core` may also be accepted, but you may want to run these past us on Discord first.

## PR Creation Process

To make a contribution to Magnitude, follow this process:

First, fork the Magnitude monorepo on GitHub into a version that lives on your account. This will enable you to push changes onto your fork that you can use to create PRs into the main repo.

1. Go to [https://github.com/magnitudedev/magnitude](https://github.com/magnitudedev/magnitude)
2. Click "Fork", name it whatever, then "Create fork"
3. Clone your fork locally

Follow the [Development Setup](/contributing/setup) instructions.

Create changes and push them to your fork, adhering to the conventions listed [below](#pr-checklist).

[Create a PR](https://github.com/magnitudedev/magnitude/pulls) in the Magnitude repository requesting to merge from the appropriate branch on your fork. Follow the instructions below for what should be included in your [PR description](#pr-checklist).

We regularly keep an eye out for new PRs, and may leave comments about any additional changes you need to make in order for your PR to be merged.

Once we've taken a look, if the PR looks good, it'll be merged and you'll officially be a Magnitude contributor 😎

Upon the next release, your changes will be live and everyone can enjoy your contributions! 🎉

PRs will be reviewed and accepted on a case-by-case basis. Submitting a PR does not guarantee that your work will be merged.

## PR Checklist

There are a few things you should be sure to do in any PR. Please follow these steps to help your PR get merged smoothly.

### ✅ Manually verify your own changes

You should confirm your changes are working as expected, for example by using yalc to publish and test locally (see [local testing](/contributing/setup#local-testing)).

### ✅ Add a clear description

If you've implemented a new feature, give example usage that demonstrates what changes you've made.

You should also describe what you've done to confirm that your changes are working as expected (any manual testing).

### ✅ Create a `changeset` describing your changes

Once you've made your other changes, run this from the monorepo root:

```
bun changeset add
```

In this menu, add a comment describing your change. Unless it's a big change and you've received confirmation that your change should be a `major` or `minor` version bump, ensure that your changeset is marked as `patch`.

Commit this changeset file.
This will ensure that the next release is tracking your change and will credit you as the author for the change.

### ✅ Update docs

If you've made a dev-facing change and there is an appropriate place in the docs that needs updating, feel free to do so. This is not required, but we appreciate it!

# Contributing to Magnitude
Source: https://docs.magnitude.run/contributing/introduction

Let's build some cool sh*t

Magnitude aims to be the best browser agent available, and our open source community enables us to grow and improve.

If you have a suggestion that you'd like to implement or want to help improve or fix an existing issue, we are open to contributions!

Read on to learn how to build the Magnitude repo locally and make changes, or how to suggest improvements.

* [Making Suggestions](/contributing/suggestions): How to suggest improvements to Magnitude
* [Creating PRs](/contributing/creating-prs): How to start creating and merging Magnitude code changes
* [Development Setup](/contributing/setup): Learn how to set up the Magnitude codebase

# Development Setup
Source: https://docs.magnitude.run/contributing/setup

Learn how to set up the Magnitude codebase

## Prerequisites

### Bun

While not explicitly required, it's recommended to use [Bun](https://bun.sh/) as your node runtime since that's what our team uses.

To install bun:

```sh
curl -fsSL https://bun.sh/install | bash
```

```powershell
powershell -c "irm bun.sh/install.ps1 | iex"
```

All other contribution-related docs will refer to `bun` in commands, but you can replace it with `npm` if preferred.

## Monorepo Setup

```sh
git clone https://github.com/magnitudedev/magnitude.git
```

Then run:

```sh
bun i && bun run build
```

The monorepo now has dependencies installed and is built.

## Building Packages

Whenever you make a change to `magnitude-core` that you want to be reflected in `magnitude-test` during testing, make sure to rebuild with `bun run build`.
## Local Testing

When you make changes to `magnitude-test` that you want to test in your project or elsewhere, you need a way to refer to your local package. To do this, you can use [yalc](https://github.com/wclr/yalc).

Install yalc:

```sh
bun i -g yalc
```

Then in the monorepo:

```sh
cd packages/magnitude-test
bun run pubdev
```

In your other project:

```sh
yalc add magnitude-test
```

This will add the yalc (local) version of `magnitude-test` instead of the one published on npm.

Run `pubdev` again as needed to update your other project with your modified version of Magnitude.

# Making Suggestions
Source: https://docs.magnitude.run/contributing/suggestions

How to suggest improvements to Magnitude

There are a couple of ways to share feedback or suggestions for Magnitude.

**1. Join our [Discord](https://discord.gg/VcdpMh9tTy)**

Our Discord community frequently gives us feedback on how test cases are running or what features are missing, which enables us to improve Magnitude.

**2. Create a [GitHub Issue](https://github.com/magnitudedev/magnitude/issues)**

Another way to bring an issue or feature request to our attention is to create a GitHub issue. We check these often!

**3. Create a [GitHub Discussion](https://github.com/magnitudedev/magnitude/discussions/categories/ideas)**

For feature suggestions that are more abstract or need brainstorming, feel free to create a discussion instead of an issue to engage with us and others in the community.
# Agent Options
Source: https://docs.magnitude.run/core-concepts/agent-options

Configure the Magnitude agent and browser

Magnitude can be customized by passing in options when you start a browser agent:

```ts
await startBrowserAgent({
    // Starting URL for agent
    url: "https://google.com",
    // Show thoughts and actions
    narrate: true,
    // LLM configuration
    llm: {
        provider: 'anthropic',
        options: {
            model: 'claude-sonnet-4-20250514',
            apiKey: process.env.ANTHROPIC_API_KEY
        }
    },
    // Any system instructions specific to your agent or website
    prompt: 'Prefer mouse to keyboard when filling out form fields'
});
```

Only some LLMs are compatible with Magnitude. See [compatible LLMs](/core-concepts/compatible-llms) for details.

For information on configuring the test runner instead, see [Configure Test Runner](/testing/test-configuration).

## Browser Options

Various browser options can also be passed, such as browser launch options or context options:

```ts
const agent = await startBrowserAgent({
    url: "https://google.com",
    browser: {
        // Configure the launched browser:
        launchOptions: {
            // chromium launch options, for example enabling CDP
            args: ["--remote-debugging-port=9222"]
        },
        contextOptions: {
            // see https://playwright.dev/docs/api/class-browser#browser-new-context
            // for comprehensive list of options
            viewport: {
                width: 1280,
                height: 720
            }
        }
    }
});
```

See Playwright's docs on [Launch Options](https://playwright.dev/docs/api/class-browsertype#browser-type-launch) and [Browser Context](https://playwright.dev/docs/api/class-browsercontext) for more details on what can be configured.
You can also connect via CDP to an open CDP-enabled browser:

```ts
const agent = await startBrowserAgent({
    url: "https://google.com",
    browser: {
        cdp: "http://localhost:9222"
    }
});
```

# Browser Interaction
Source: https://docs.magnitude.run/core-concepts/browser-interaction

Instruct and control what the agent should do in the browser

## Taking Action

Use **act()** to tell the agent what to do.

Instructions provided to `act` can be high-level tasks:

```ts
await agent.act('log in to the app');
```

or low-level actions:

```ts
await agent.act('click the submit button');
```

Think of it like you're telling a coworker to do something. Breaking it up into tiny steps is unnecessary, but you want to be specific enough to make the objective clear.

### Chaining Acts

Combine multiple **act** calls to accomplish complex sequences of interactions:

```ts
await agent.act('go to tasks page');
await agent.act('assign all pending tasks to Bob');
await agent.act('move all pending tasks to "In Progress"');
```

You can also chain multiple steps together in the same act call for convenience:

```ts
await agent.act([
    'go to tasks page',
    'assign all pending tasks to Bob',
    'move all pending tasks to "In Progress"'
]);
```

### Providing Data

You can provide arbitrary data fields that the agent will use where appropriate during its actions:

```ts
await agent.act('create a new task', {
    data: {
        title: 'important task',
        description: 'some description'
    }
});
```

### Custom Prompting

Provide custom system prompt instructions as needed:

```ts
await agent.act('create a new task', {
    prompt: 'tasks should be written in spanish'
});
```

## Navigating Directly

While the agent is capable of navigating to URLs on its own, you may sometimes want to navigate to a specific URL directly. To do this, use `nav`:

```ts
await agent.nav('https://google.com');
```

## Agent Capabilities

### What can the agent do in act?
The agent is capable of **mouse**, **keyboard**, and **browser**-specific actions, including but not limited to:

* Clicking with the mouse
* Dragging with the mouse
* Typing long blocks of content
* Pressing specific keystrokes
* Switching tabs
* Navigating to URLs

### What is the agent aware of?

The agent knows about and sees:

* The current screenshot plus some past screenshots
* History of its own actions from the same `act()`
* All currently open tabs
* Which tab is active

# Compatible LLMs
Source: https://docs.magnitude.run/core-concepts/compatible-llms

Visually grounded LLMs compatible with Magnitude

Magnitude requires an LLM that is both:

1. Very good at instruction following and planning
2. **Visually grounded**, meaning it understands precise coordinates in an image to interact with the browser accurately.

Very few LLMs meet these criteria, which is why we recommend **Claude Sonnet 4**, which has strong reasoning abilities and is grounded.

To use Sonnet, simply set `ANTHROPIC_API_KEY` in your environment. You can also choose a specific Claude model by configuring an [Anthropic](/reference/llm-providers#anthropic) or [Bedrock](/reference/llm-providers#aws-bedrock) client.

Most LLMs are NOT grounded, for example models from OpenAI, Gemini, or Llama.

## Other compatible models

If you are looking for a cheaper / open source alternative with comparable performance, we recommend **Qwen 2.5 VL 72B**.
Here's an example of how you could configure Magnitude to use Qwen via OpenRouter:

```typescript
const agent = await startBrowserAgent({
    url: "https://google.com",
    llm: {
        provider: 'openai-generic',
        options: {
            baseUrl: 'https://openrouter.ai/api/v1',
            model: 'qwen/qwen2.5-vl-72b-instruct',
            apiKey: process.env.OPENROUTER_API_KEY
        }
    }
});
```

```typescript magnitude.config.ts
import { type MagnitudeConfig } from 'magnitude-test';

export default {
    url: "http://localhost:5173",
    llm: {
        provider: 'openai-generic',
        options: {
            baseUrl: 'https://openrouter.ai/api/v1',
            model: 'qwen/qwen2.5-vl-72b-instruct',
            apiKey: process.env.OPENROUTER_API_KEY
        }
    }
} satisfies MagnitudeConfig;
```

For instructions on configuring LLMs with various providers, see [LLM Providers](/reference/llm-providers).

Other visually grounded models in the 32B-72B parameter range may be appropriate for Magnitude, depending on the LLM and your test case complexity. Some of these include:

* [Qwen2.5-VL-32B](https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct)
* [UI-TARS-72B](https://huggingface.co/ByteDance-Seed/UI-TARS-72B-DPO)
* [Molmo-72B](https://huggingface.co/allenai/Molmo-72B-0924)

These models are mostly untested with Magnitude and may not be suitable for running tests. If any of these LLMs struggle to follow instructions or have issues with accuracy, please try a recommended model instead.
# Data Extraction
Source: https://docs.magnitude.run/core-concepts/data-extraction

Intelligently turn browser content into structured data

## Extract 101

Pass instructions and a zod schema to `extract()` in order to intelligently collect data from the current page:

```ts
import z from 'zod';

const numInProgress = await agent.extract(
    'how many items are "In Progress"?',
    z.number()
);
```

Schemas can be any valid zod schema to capture complex data:

```ts
const tasks = await agent.extract(
    'list all tasks',
    z.array(z.object({
        title: z.string(),
        status: z.enum(['todo', 'inprogress', 'done']),
        description: z.string(),
        priority: z.enum(['low', 'medium', 'high', 'urgent']),
        labels: z.array(z.string()),
        assignee: z.string()
    }))
);
```

### Data Chaining

Capturing structured data on its own is helpful. You could save that data to a filesystem, upload it to a database, or pass it off to another process.

However, you might want to integrate that data into another web application or trigger additional agent workflows with it.

A great way to do this is by using standard control flow based on extracted data, or passing `data` to `act` where needed:

```ts
const urgentTasks = tasks.filter(
    task => task.priority === 'urgent' && task.status === 'todo'
);

if (urgentTasks.length > 10) {
    // The second argument is an options object; `data` is one of its fields
    await agent.act('create a new task', {
        data: {
            title: 'get some of these urgent tasks done!',
            description: urgentTasks.map(task => task.title).join(', ')
        }
    });
}
```

### Extractable content

`extract()` will show the agent:

1. A screenshot of the browser window
2. A simplified version of the DOM content
3. The instructions and schema you provide

As long as it's clear enough how that data should be converted to the provided zod schema, the agent will return data conforming to the schema based on what it sees in the browser.

Magnitude supports any schema that can be defined with `zod`, including arrays, composite objects, numbers, strings, etc.
See [https://zod.dev/](https://zod.dev/) for more information about zod.

# Playwright Access
Source: https://docs.magnitude.run/core-concepts/playwright

Combine agentic flows with Playwright operations

Magnitude interacts with the browser using Playwright. The agent exposes its `BrowserContext` and the `Page` it's currently on via `agent.context` and `agent.page`.

This can be useful when you need some lower-level browser interactions, for example manipulating cookies, listening to network traffic, or emulating specific keystrokes.

## Example

```ts
// Inject some authorization cookies directly with browser context
await agent.context.addCookies([{
    name: 'session_id',
    value: 'fake-session-token',
    domain: 'localhost',
    path: '/'
}]);

// Mock the settings API
// (mockProfileData is your own fixture object, defined elsewhere)
await agent.page.route('**/api/user/settings', async route => {
    await route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(mockProfileData)
    });
});

// Manually emulate a keypress
await agent.page.keyboard.press('ArrowRight');
```

If the agent switches tabs, `agent.page` will always refer to the agent's active tab.

See Playwright's docs for [Page](https://playwright.dev/docs/api/class-page) and [BrowserContext](https://playwright.dev/docs/api/class-browsercontext) for details.

# Asking Questions
Source: https://docs.magnitude.run/core-concepts/query

Query the agent about things that happened

You can use `agent.query` to ask the agent about the actions it just took or anything that happened in the last call to `agent.act`.

# Introduction
Source: https://docs.magnitude.run/getting-started/introduction

Learn about Magnitude's superpowers

# What is Magnitude?

Magnitude is an open source AI browser automation framework. It uses vision AI to enable you to control your browser with natural language.
Magnitude has **four key abilities**:

* Sees and understands any interface to plan out actions
* Executes precise actions using mouse and keyboard
* Intelligently extracts useful structured data
* Built-in test runner with powerful visual assertions

Combining these abilities in our framework makes automating any browser task possible.

Keep going to jump into an example and learn more!

# Quickstart
Source: https://docs.magnitude.run/getting-started/quickstart

Set up a Magnitude project and run an example

## Get Started

```sh
npx create-magnitude-app
```

This script will create a project from a starter template based on your preferences. Simply follow the instructions to set up your project and configure an LLM.

```typescript
import { startBrowserAgent } from "magnitude-core";
import z from 'zod';
import dotenv from 'dotenv';

dotenv.config();

async function main() {
    const agent = await startBrowserAgent({
        // Starting URL for agent
        url: 'https://docs.magnitude.run/getting-started/quickstart',
        // Show thoughts and actions
        narrate: true,
        // LLM configuration
        llm: {
            provider: 'claude-code',
            options: {
                model: 'claude-sonnet-4-20250514'
            }
        },
    });

    // Intelligently extract data based on the DOM content matching a provided zod schema
    const gettingStarted = await agent.extract('Extract how to get started with Magnitude', z.object({
        // Agent can extract existing data or new insights
        difficulty: z.enum(['easy', 'medium', 'hard']),
        steps: z.array(z.string()),
    }));

    // Navigate to a new URL
    await agent.nav('https://magnitasks.com');

    // Magnitude can handle high-level tasks
    await agent.act('Create a task', {
        // Optionally pass data that the agent will use where appropriate
        data: {
            title: 'Get started with Magnitude',
            description: gettingStarted.steps.map(step => `• ${step}`).join('\n')
        }
    });

    // It can also handle low-level actions
    await agent.act('Drag "Get started with Magnitude" to the top of the in progress column');

    // Stop agent and browser
    await agent.stop();
}

main();
```

```sh
cd my-project
npm start
```

This will kick off a basic example included in the template that shows you a bit of what Magnitude can do.

Try tweaking one of the `act()` calls, run `npm start` again, and see what happens, or replace the URL and try an automation on a completely different site.

🚀 Now you're ready to automate anything!

Continue reading the docs to learn more about what you can do with the agent, or just keep building and let us know if you have any questions in our [Discord](https://discord.gg/VcdpMh9tTy)!

For testing web apps with our native test runner, see [testing setup](/testing/test-setup).

# Kernel
Source: https://docs.magnitude.run/integrations/kernel

Run Magnitude agents on Kernel cloud browsers

[Kernel](https://www.onkernel.com/docs/introduction) offers fast browser infrastructure for running browser automations at scale. By integrating with Kernel, you can run Magnitude agents and automations in production with cloud-hosted browsers.

## Benefits of using Kernel with Magnitude

* **No local browser management**: Run automations without installing or maintaining browsers locally
* **Scalability**: Launch multiple browser sessions in parallel
* **Stealth mode**: Built-in anti-detection features for web scraping
* **Session persistence**: Maintain browser state across automation runs
* **Live view**: Debug your cloud automations with real-time browser viewing

## Adding Kernel to existing Magnitude implementations

Ready to go live? Run your existing Magnitude automation in production with Kernel’s cloud browsers by updating your browser configuration.

### 1. Install the Kernel SDK

```bash
npm install @onkernel/sdk
```

### 2. Initialize Kernel and create a browser

Import the libraries and create a cloud browser session:

```typescript
import { startBrowserAgent } from "magnitude-core";
import Kernel from '@onkernel/sdk';
import z from 'zod';

const client = new Kernel({
    apiKey: process.env.KERNEL_API_KEY,
});

const kernelBrowser = await client.browsers.create({
    viewport: {
        width: 1920,
        height: 1080
    }
});

console.log(`Live view url: ${kernelBrowser.browser_live_view_url}`);
```

### 3. Update your browser configuration

Replace your existing browser setup to use Kernel's CDP URL and display settings:

```typescript
const agent = await startBrowserAgent({
    url: 'https://magnitasks.com',
    narrate: true,
    browser: {
        cdp: kernelBrowser.cdp_ws_url,
        contextOptions: {
            viewport: {
                width: 1920,
                height: 1080
            }
        }
    },
    llm: {
        provider: 'anthropic',
        options: {
            model: 'claude-sonnet-4-20250514'
        }
    }
});
```

### 4. Use your agent

Use Magnitude's agent methods with the Kernel-powered browser:

```typescript
await agent.act([
    'click on "Tasks" in the sidebar',
    'click on the first item in the "In Progress" column'
]);

const assignee = await agent.extract('Extract the task Assignee', z.string());

await agent.stop();

// Clean up the Kernel Browser Session
await client.browsers.deleteByID(kernelBrowser.session_id);
```

## Quick setup with Kernel's app template

Alternatively, you can use Kernel's app template that includes a pre-configured Magnitude integration:

```bash
npx @onkernel/create-kernel-app my-magnitude-app
```

Choose `TypeScript` as the programming language and then select `Magnitude` as the template.

Then follow the [Quickstart guide](https://www.onkernel.com/docs/quickstart) to deploy and run your Magnitude automation on Kernel's infrastructure.

For more information, see the [Kernel documentation](https://www.onkernel.com/docs/integrations/magnitude).
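One thing to note about the cleanup call in step 4: it only runs if every earlier call succeeds. A possible way to guard against leaked cloud sessions is a `try`/`finally` wrapper. The sketch below is an illustration, not part of either SDK: `withSession` is a hypothetical helper, and the demo uses a fake session so it runs without Kernel installed.

```typescript
// Hypothetical helper: create a resource, use it, and always release it,
// even when the automation throws partway through.
async function withSession<S, R>(
    create: () => Promise<S>,
    destroy: (session: S) => Promise<void>,
    use: (session: S) => Promise<R>
): Promise<R> {
    const session = await create();
    try {
        return await use(session);
    } finally {
        await destroy(session);
    }
}

// Demo with a fake session so the sketch runs without any SDK installed.
const events: string[] = [];
withSession(
    async () => {
        events.push('created');
        return { id: 'fake-session' };
    },
    async (s) => {
        events.push(`deleted ${s.id}`);
    },
    async () => {
        throw new Error('automation failed');
    }
).catch((err: Error) => {
    events.push(`caught: ${err.message}`);
    console.log(events);
});
```

With the calls shown in the steps above, `create` would wrap `client.browsers.create(...)`, `destroy` would call `client.browsers.deleteByID(session.session_id)`, and `use` would start the agent against `session.cdp_ws_url` and run the `act` / `extract` calls.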
# MCP
Source: https://docs.magnitude.run/integrations/mcp

Enable Cline, Cursor, or Windsurf to build and run Magnitude tests with MCP

Magnitude provides an official MCP server that enables AI assistants to set up projects, build test cases, and run tests with Magnitude.

**Cline**

### Marketplace Install

To install with Cline, you can find us on the official Cline Marketplace. Go to `MCP Servers -> Marketplace`, search for `Magnitude`, click `Install` and follow the instructions!

### Manual Install

Alternatively, to manually install the MCP server for Cline, follow these steps:

1. Install MCP server via npm:

```
npm i -g magnitude-mcp
```

2. Go to `MCP Servers -> Installed -> Configure MCP Servers` and add our MCP server to the JSON:

```json
{
  "mcpServers": {
    "magnitude": {
      "command": "npx",
      "args": ["magnitude-mcp"]
    }
  }
}
```

Now start a new chat with Cline and ask to set up a new project with Magnitude, build new Magnitude tests, or run tests!

**Cursor**

1. Install MCP server via npm:

```
npm i -g magnitude-mcp
```

2. Open Cursor Settings, go to Features > MCP Servers
3. Click "+ Add new global MCP server" and enter the following code:

```json
{
  "mcpServers": {
    "magnitude": {
      "command": "npx",
      "args": ["magnitude-mcp"]
    }
  }
}
```

**Windsurf**

1. Install MCP server via npm:

```
npm i -g magnitude-mcp
```

2. Add this to your `./codeium/windsurf/model_config.json`:

```json
{
  "mcpServers": {
    "magnitude": {
      "command": "npx",
      "args": ["magnitude-mcp"]
    }
  }
}
```

# BrowserAgent
Source: https://docs.magnitude.run/reference/browser-agent

The BrowserAgent class for browser automation.

# BrowserAgent

The `BrowserAgent` is the primary class for browser automation with Magnitude. It provides a high-level, AI-powered API for controlling a web browser.

An agent is created and initialized using the `startBrowserAgent` function.

## `startBrowserAgent(options?)`

This function creates, initializes, and returns a `BrowserAgent` instance.
```typescript Basic Usage
import { startBrowserAgent } from 'magnitude-core';

const agent = await startBrowserAgent();
await agent.nav("https://google.com");
// ... perform actions
await agent.stop();
```

An optional configuration object that can be used to customize the agent's behavior, including Playwright launch options and LLM settings. See [Agent Options](/core-concepts/agent-options) for a full reference.

An object to configure the browser instance. It can contain either Playwright `launchOptions` or `contextOptions`.

Standard Playwright launch options. See the [Playwright documentation](https://playwright.dev/docs/api/class-browsertype#browser-type-launch) for a full list of options.

```typescript Example: Launching with a proxy
const agent = await startBrowserAgent({
    browser: {
        launchOptions: {
            proxy: {
                server: 'http://my-proxy-server.com:8080'
            }
        }
    }
});
```

Standard Playwright browser context options. See the [Playwright documentation](https://playwright.dev/docs/api/class-browser#browser-new-context) for a full list of options.

```typescript Example: Setting a custom user agent
const agent = await startBrowserAgent({
    browser: {
        contextOptions: {
            userAgent: 'MyCustomBrowser/1.0'
        }
    }
});
```

An initial URL to navigate to immediately after the browser starts.

Sets the resolution of the screenshot that the LLM sees. This does not change the browser's viewport size, but scales the screenshot to these dimensions. This is useful for ensuring a consistent input size for the vision model.

Configure the LLM to be used by the agent. See the [LLM Providers](/reference/llm-providers) documentation for details.

If `true`, the agent will narrate its actions to the console, providing a running commentary of what it's doing.

***

## Methods

### `act(description, options?)`

Executes one or more browser actions based on a natural language description.
Magnitude interprets the description and determines the necessary interactions (clicks, types, scrolls, etc.).

```typescript Step Examples
// Simple step
await agent.act("Click the main login button");

// Step with data
await agent.act("Enter {username} into the user field", {
    data: { username: "test@example.com" }
});
```

A natural language description of the action(s) to perform. Can include placeholders like `{key}` which will be substituted by values from `options.data`.

Optional parameters for the step.

Provides data for the step.

* **`string`**: A single string value.
* **`Record`**: Key-value pairs where keys match placeholders in the `description`.

* **`string`**: Provide additional instructions for the LLM. These are injected into the system prompt.

### `nav(url: string)`

Navigates to a URL.

```typescript
await agent.nav('https://google.com');
```

### `page` and `context`

Access the underlying Playwright `Page` and `BrowserContext`.

```typescript
const page = agent.page;
const context = agent.context;
```

### `extract(instructions, schema)`

The `BrowserAgent` provides a powerful `extract()` method that uses AI to pull structured data from a webpage based on your instructions and a Zod schema.

```typescript Example: Extracting Hacker News headlines
import { startBrowserAgent } from 'magnitude-core';
import { z } from 'zod';

const HackerNewsStory = z.object({
    rank: z.number().describe('The story rank'),
    title: z.string().describe('The title of the story'),
    url: z.string().url().describe('The URL of the story'),
});

const HackerNewsSchema = z.array(HackerNewsStory);

const agent = await startBrowserAgent();
await agent.nav("https://news.ycombinator.com");

const stories = await agent.extract(
    "Extract the top 5 stories from Hacker News",
    HackerNewsSchema
);

console.log(stories);

await agent.stop();
```

Natural language instructions for the AI, describing what data to extract from the page.
A Zod schema that defines the structure of the data to be extracted. The AI will do its best to populate the fields of the schema based on the page content and your instructions.

Adding `.describe()` calls to your schema fields can significantly improve the accuracy of the extraction by providing more context to the AI.

### `stop()`

Closes the browser and cleans up any resources used by the agent.

# LLM Providers

Source: https://docs.magnitude.run/reference/llm-providers

Instructions for configuring LLMs with different providers

You can specify which LLM to use in `magnitude.config.ts`. If no LLM is configured and `ANTHROPIC_API_KEY` is available, Magnitude will use Claude Sonnet 4 automatically.

While a variety of LLM providers are supported, many of them do NOT have any visually grounded model. See [LLM Configuration](/customizing/llm-configuration) for details.

Magnitude uses [BAML](https://docs.boundaryml.com/ref/llm-client-providers/overview)'s providers under the hood, so their docs may be a useful secondary reference for credential configuration.
To configure your LLM, pass one of the client interfaces described below to your `magnitude.config.ts`, like:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "http://localhost:5173",
  llm: {
    provider: 'anthropic', // your provider of choice
    options: {
      // any required + optional configuration for that provider
      model: 'claude-3-7-sonnet-latest',
      apiKey: process.env.ANTHROPIC_API_KEY
    }
  }
} satisfies MagnitudeConfig;
```

# Providers

## Google AI Studio

```typescript theme={null}
interface GoogleAIClient {
  provider: 'google-ai',
  options: {
    model: string,
    apiKey?: string // defaults to GOOGLE_API_KEY
    temperature?: number,
    baseUrl?: string // defaults to https://generativelanguage.googleapis.com/v1beta
  }
}
```

## Google Vertex AI

```typescript theme={null}
interface GoogleVertexClient {
  provider: 'vertex-ai',
  options: {
    model: string,
    location: string,
    baseUrl?: string,
    projectId?: string,
    credentials?: string | object,
    temperature?: number,
  }
}
```

The easiest way to authenticate with Vertex AI is to authenticate using the `gcloud` CLI.

1. Create a project in [Google Cloud](https://console.cloud.google.com).
2. Enable Vertex AI in that project by going to [Vertex AI Dashboard](https://console.cloud.google.com/vertex-ai/dashboard) and selecting "Enable all Recommended APIs"
3. Install the `gcloud` CLI ([instructions](https://cloud.google.com/sdk/docs/install))
4. Run `gcloud auth application-default login --project `

Once you've done these steps, you can set up a project to use Vertex with the available credentials like this:

```ts theme={null}
import { type MagnitudeConfig } from "magnitude-test";

export default {
  url: "http://localhost:5173",
  llm: {
    provider: 'vertex-ai',
    options: {
      model: 'google/gemini-2.5-pro-preview-05-06',
      location: 'us-central1'
    }
  }
} satisfies MagnitudeConfig;
```

If running in GCP, it will query the metadata server to use the attached service account.
More info: [BAML Google Vertex Provider Docs](https://docs.boundaryml.com/ref/llm-client-providers/google-vertex#authentication)

## Anthropic

```typescript theme={null}
interface AnthropicClient {
  provider: 'anthropic',
  options: {
    model: string,
    apiKey?: string,
    temperature?: number
  }
}
```

## OpenAI

```typescript theme={null}
interface OpenAIClient {
  provider: 'openai',
  options: {
    model: string,
    apiKey?: string,
    temperature?: number
  }
}
```

## OpenAI-compatible (OpenRouter, Ollama, etc.)

```typescript theme={null}
interface OpenAIGenericClient {
  provider: 'openai-generic',
  options: {
    model: string,
    baseUrl: string,
    apiKey?: string,
    temperature?: number,
    headers?: Record<string, string>
  }
}
```

## AWS Bedrock

```typescript theme={null}
interface BedrockClient {
  provider: 'aws-bedrock',
  options: {
    model: string,
    // passed to inference_configuration
    temperature?: number
  }
}
```

Authenticate with Bedrock using environment variables:

```sh theme={null}
export AWS_ACCESS_KEY_ID="your_key"
export AWS_SECRET_ACCESS_KEY="your_secret"
export AWS_REGION="us-east-1"
```

## Azure OpenAI

```typescript theme={null}
interface AzureOpenAIClient {
  provider: 'azure-openai',
  options: {
    resourceName: string,
    deploymentId: string,
    apiVersion: string,
    apiKey: string
  }
}
```

More info on authenticating with Azure: [https://docs.boundaryml.com/ref/llm-client-providers/open-ai-from-azure](https://docs.boundaryml.com/ref/llm-client-providers/open-ai-from-azure)

## Configuring Moondream

Moondream cloud is the easiest way to get set up, and offers 5,000 free requests per day. Get an API key [here](https://moondream.ai/c/cloud/api-keys).

Moondream is open source and can also be self-hosted instead of using their cloud option. See [here](https://moondream.ai/c/moondream-server) for instructions.
If self-hosting, configure the `baseUrl` to point to your server:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "http://localhost:5173",
  grounding: {
    provider: 'moondream',
    options: {
      baseUrl: 'your-self-hosted-moondream-endpoint',
      apiKey: process.env.MOONDREAM_API_KEY // not necessary if self-hosted
    }
  }
} satisfies MagnitudeConfig;
```

# TestCaseAgent

Source: https://docs.magnitude.run/reference/test-case-agent

Reference for the `agent` object used in Magnitude tests.

The `agent` object is your primary tool for interacting with the browser within a Magnitude test. It's an instance of the `TestCaseAgent` class.

`TestCaseAgent` extends [`BrowserAgent`](/reference/browser-agent) and includes all of its methods, such as `act()`, `nav()`, and `extract()`.

In addition to the inherited methods, `TestCaseAgent` provides the following methods for making assertions:

## `agent.check(description)`

Verifies that a certain condition holds true on the web page based on a natural language description. The AI evaluates the description against the current page state (DOM, visibility, text content).

```typescript Complete Test Case Example theme={null}
import { test } from 'magnitude-test';

test('should create three todos', async (agent) => {
  await agent.act('create three todos');
  await agent.check('three todos exist');
});
```

A natural language statement describing the expected condition or state to verify.

# Test Declaration

Source: https://docs.magnitude.run/reference/test-declaration

Reference for the `test` function used to define Magnitude test cases.

Magnitude tests are defined using the globally available `test` function imported from `magnitude-test`.

```typescript Basic Usage theme={null}
import { test } from 'magnitude-test';

test('Descriptive test title', async (agent) => {
  // Test logic using agent
});
```

## `test(title, options?, testFn)`

Defines a new test case.

A descriptive title for the test case.
This title appears in test reports and logs.

Optional configuration specific to this test case.

Overrides the base URL defined in the global configuration or test group for this specific test case.

An asynchronous function containing the test logic. It receives an `agent` object with the properties described below.

The agent object providing methods for AI interaction (`agent.act()`, `agent.check()`) and access to Playwright's `agent.page` and `agent.context`. See [AI Steps and Checks](./ai-steps-checks), [Low-Level AI Actions](./ai-low-level), and [Playwright Access](./playwright-access).

## `test.group(id, options?, groupFn)`

Defines a group of test cases, allowing shared options (like `url`) to be applied to all tests within the group.

```typescript Group Example theme={null}
import { test } from 'magnitude-test';

test.group('User Authentication Flow', { url: '/login' }, () => {
  test('should display login form', async (agent) => {
    await agent.check("Login form is visible");
  });

  test('should allow login with valid credentials', async (agent) => {
    await agent.act("Log in with valid credentials");
    await agent.check("User is redirected to dashboard");
  });
});
```

A descriptive identifier for the test group.

Optional configuration applied to all tests within this group. See properties below.

Sets a base URL for all tests within the group. Can be overridden by individual test options.

A synchronous function that contains the `test()` declarations belonging to this group.

# Building Test Cases

Source: https://docs.magnitude.run/testing/building-test-cases

How to design and build effective test cases

## Test Cases

Each Magnitude test case navigates to a URL in a browser, executes **Test Steps** on the web application at that URL, and verifies any **Checks** along the way.
For example:

```typescript theme={null}
test('can add and remove todos', async (agent) => {
  await agent.act('Add a todo');
  await agent.act('Remove the todo');
});
```

A test case is designed to represent a single user flow in your web app.

### Configure Test Cases

Each test can additionally be configured with a different starting URL (defaults to the [configured project](/customization/configuration) `url` in `magnitude.config.ts`):

```typescript theme={null}
test('can add and remove todos', { url: "https://mytodoapp.com" }, async (agent) => {
  await agent.act('Add a todo');
  await agent.act('Remove the todo');
});
```

## Test Steps

When you define a step, you provide a description for what Magnitude should do during that step, for example:

```typescript theme={null}
test('example', async (agent) => {
  await agent.act('Log in'); // step description
});
```

Each step should make sense on its own and describe a portion of the user flow.

Steps should only be specific enough that it's clear from your app's interface how to complete the step. For example, to log into an app, you don't need to say which fields to type into or which buttons to press; just provide any necessary data and say "Log in".

### Checks

A **check** is a **natural language visual assertion** that you can add to any step in your test case.

Think `assert` in other testing frameworks, except it can "see" the website and understand natural language descriptions.

Examples of valid checks:

* "Only 3 todos should be listed"
* "Make sure image of giraffe is visible"
* "The response from the chat bot should make sense and answer the user's question"

Checks are validated after the step they are attached to is executed.
To actually use a check in a test case, include it after the relevant step:

```typescript theme={null}
test('example', async (agent) => {
  await agent.act('Log in');
  await agent.check('Dashboard is visible');
});
```

### Test Data

You can provide additional **test data** relevant to a specific step like this:

```typescript theme={null}
test('example', async (agent) => {
  await agent.act('Log in', { data: { email: "foo@bar.com", password: "foo" } });
  await agent.check('Dashboard is visible');
});
```

The key/value pairs are completely up to you, but it should be clear enough what they should be used for.

You can also provide completely freeform data by passing in a string instead of a key/value object:

```typescript theme={null}
test('example', async (agent) => {
  await agent.act('Add 3 todos', { data: 'Use "Take out trash" for the first todo and make up the other 2' });
});
```

### Custom LLM prompting

You can pass custom instructions to any `act` call by specifying the `prompt` option.

```typescript theme={null}
test('example', async (agent) => {
  await agent.act('create 3 todos', { prompt: 'all todos must be animal-related' });
});
```

You can also do this at the test or group level to apply to all acts within that block.
```typescript theme={null}
test.group('todo list', { prompt: 'Each todo should be exactly 5 words'}, () => {
  test('can add todos', { url: 'https://magnitodo.com', prompt: 'All todos should be animal related' }, async (agent) => {
    await agent.act('create 3 todos', { prompt: 'the first and last word on the todo must start with the same letter'});
  });
});
```

### Example of migrating a Playwright test case to Magnitude

A simple test case from the Playwright demo TODO app:

```typescript theme={null}
test('should allow me to add todo items', async ({ page }) => {
  const newTodo = page.getByPlaceholder('What needs to be done?');

  await newTodo.fill(TODO_ITEMS[0]);
  await newTodo.press('Enter');

  await expect(page.getByTestId('todo-title')).toHaveText([
    TODO_ITEMS[0]
  ]);

  await newTodo.fill(TODO_ITEMS[1]);
  await newTodo.press('Enter');

  await expect(page.getByTestId('todo-title')).toHaveText([
    TODO_ITEMS[0],
    TODO_ITEMS[1]
  ]);
});
```

The same test case in Magnitude:

```typescript theme={null}
test('should allow me to add todo items', async (agent) => {
  await agent.act('Create todo', { data: TODO_ITEMS[0] });
  await agent.check('First todo appears in list');
  await agent.act('Create another todo', { data: TODO_ITEMS[1] });
  await agent.check('List has two todos');
});
```

# Testing in CI

Source: https://docs.magnitude.run/testing/ci

Run Magnitude tests with GitHub Actions

You can kick off Magnitude tests from GitHub Actions by:

1. Ensuring that your development server is accessible in the test runner
2. Ensuring `magnitude-test` gets installed on the test runner
3. Running the appropriate `npx magnitude` CLI command
4. Including the appropriate LLM client credentials

Here's an example `.github/workflows/magnitude.yaml`, from our [example repo](https://github.com/magnitudedev/magnitude-demo-repo/blob/main/.github/workflows/magnitude.yaml):

```yaml theme={null}
name: Run Magnitude Tests
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    env:
      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Install playwright
        run: npx playwright install chromium
      - name: Start development server
        run: npm run dev &
      - name: Wait for server to start
        run: sleep 5
      - name: Run tests with Xvfb
        uses: GabrielBB/xvfb-action@v1
        with:
          run: npx magnitude -p
```

# Running Tests

Source: https://docs.magnitude.run/testing/running-tests

How to run test cases

To run your Magnitude test cases, use the CLI:

```
npx magnitude
```

## Test in Parallel

You can run your Magnitude tests in parallel simply by providing the `--workers` or `-w` flag with the desired number of parallel workers:

```
npx magnitude -w 4
```

If any Magnitude test fails, the CLI process will exit with status code 1. When deployed as part of a CI/CD pipeline, e.g. with a GitHub Action, this will fail the deployment.

## Test Failures

Unlike existing frameworks like Playwright, the criteria for test case failure are not based on whether a selector fails or some expression evaluates to false.

Instead, Magnitude decides to fail a test case if either **(1) any act cannot be completed** or **(2) a check does not hold true**.

It will attempt to execute a test case according to the provided steps and only fail if there is no clear way to accomplish the test case, or if any check isn't satisfied.

## Integrating with CI/CD

You can run Magnitude tests in CI anywhere that you could run Playwright tests, just include LLM client credentials.
For instructions on running test cases on GitHub Actions, see [here](/testing/ci).

# Configure Test Runner

Source: https://docs.magnitude.run/testing/test-configuration

Customize browser settings, web servers, and more

When you run `npx magnitude init`, a `magnitude.config.ts` will be generated for you. By default it looks something like:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "http://localhost:5173"
} satisfies MagnitudeConfig;
```

`url` is the default URL that all test cases will use if not specified.

However, there's a lot more you can customize to get Magnitude working exactly as you want.

## Browser Options

You can customize options to pass to each [Playwright browser context](https://playwright.dev/docs/api/class-browser#browser-new-context) that gets created while running Magnitude tests.

Common options you may want to customize might be `viewport` or even `recordVideo` to capture videos of tests. For example:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "http://localhost:5173",
  browser: {
    contextOptions: {
      viewport: { width: 800, height: 600 },
      recordVideo: {
        dir: './videos/',
        size: { width: 800, height: 600 }
      }
    }
  }
} satisfies MagnitudeConfig;
```

## Development web server

Magnitude can automatically launch your development server when tests run. Configure `webServer` with the command to start the server and the URL it will listen on:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: 'http://localhost:3000',
  webServer: {
    command: 'npm run start',
    url: 'http://localhost:3000',
    timeout: 120_000,
    reuseExistingServer: true
  }
} satisfies MagnitudeConfig;
```

Magnitude checks if `webServer.url` is already reachable. If so and `reuseExistingServer` is `true`, the command is skipped. The server process is killed automatically once the test run completes.
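To make the reuse rule for the development web server concrete, here is a minimal sketch of the decision it describes. It is an illustration only, not Magnitude's implementation: `planWebServer`, `WebServerPlan`, and the `alreadyReachable` flag are hypothetical names, and the sketch assumes the command runs in every case other than "reachable and `reuseExistingServer: true`", which the text above does not spell out.

```typescript
// Illustrative sketch only; none of these names come from Magnitude itself.
interface WebServerConfig {
  command: string;
  url: string;
  timeout?: number;
  reuseExistingServer?: boolean;
}

// Hypothetical result type: either skip the command or run it.
type WebServerPlan =
  | { action: 'skip' }
  | { action: 'run'; command: string };

// `alreadyReachable` stands in for the reachability check on `webServer.url`.
function planWebServer(config: WebServerConfig, alreadyReachable: boolean): WebServerPlan {
  // Documented case: URL is reachable and reuse is enabled, so skip the command.
  if (alreadyReachable && config.reuseExistingServer === true) {
    return { action: 'skip' };
  }
  // Assumption: in all other cases the configured command is started.
  return { action: 'run', command: config.command };
}

const exampleConfig: WebServerConfig = {
  command: 'npm run start',
  url: 'http://localhost:3000',
  timeout: 120_000,
  reuseExistingServer: true
};

console.log(planWebServer(exampleConfig, true).action);  // prints "skip"
console.log(planWebServer(exampleConfig, false).action); // prints "run"
```

Keeping the decision separate from the actual network probe makes the rule easy to reason about: the only documented way to skip the command is for both conditions to hold at once.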
## Test failure behaviour

The default behaviour for failed tests is to terminate execution after the first failed test. To change this and allow other tests to run regardless of failed tests, you can either:

* Run Magnitude with the `--no-fail-fast` CLI flag
* Add `continueAfterFailure: true` to your `magnitude.config.ts`

Example:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "...",
  continueAfterFailure: true,
  llm: {
    //...
  }
} satisfies MagnitudeConfig;
```

## Test URL resolution

Each test uses a URL that is built from the broader scope of configuration in this order:

1. **Environment variable** (`MAGNITUDE_TEST_URL`)
2. **Global configuration** (`magnitude.config.ts`)
3. **Test group options**
4. **Individual test options**

For the `url` option at any of these levels, you can provide a relative path to attach to the upper level's URL, or a full URL to override it.

For example, if you provide `{url: "https://localhost:8080"}` in `magnitude.config.ts`, and an individual test has `{url: "/items?id=1"}`, the test runner will navigate to `https://localhost:8080/items?id=1`.

## Telemetry Opt-Out

By default Magnitude collects basic anonymized telemetry when you run a test, such as the duration of the test and number of tokens used.

We use this information to help us understand our usage and grow as an open source project.
We appreciate it if you leave it on :)

To opt out of telemetry:

```typescript theme={null}
import { type MagnitudeConfig } from 'magnitude-test';

export default {
  url: "http://localhost:5173",
  telemetry: false
} satisfies MagnitudeConfig;
```

# Testing Setup

Source: https://docs.magnitude.run/testing/test-setup

Set up a Magnitude project and run an example

## Setup

```sh theme={null}
npm install --save-dev magnitude-test
```

> or see our [demo repo](https://github.com/magnitudedev/magnitude-demo-repo) if you don't have a project to try it on

```sh theme={null}
npx magnitude init
```

This will create a basic tests directory `tests/magnitude` with:

* `magnitude.config.ts`: Magnitude test configuration file
* `example.mag.ts`: An example test file

The easiest way to set up an LLM for Magnitude is to set the `ANTHROPIC_API_KEY` environment variable. Sonnet 4 will be used by default. See [LLM Configuration](/customizing/llm-configuration) for more details.

🚀 Now you're ready to run tests!

## Running tests

**Run your Magnitude tests with:**

```sh theme={null}
npx magnitude
```

This will run all Magnitude test files discovered with the `*.mag.ts` pattern. If the agent finds a problem with your app, it will tell you what happened and describe the bug!

> To run many tests in parallel, add `-w `

To learn more about different options for running tests see [here](/testing/running-tests).

## Building test cases

Now that you've got Magnitude set up, you can create real test cases for your app.
Here's an example for a general idea:

```ts theme={null}
import { test } from 'magnitude-test';

test('can log in and create company', async (agent) => {
  await agent.act('Log in to the app', {
    data: { username: 'test-user@magnitude.run', password: 'test' }
  });
  await agent.check('Can see dashboard');
  await agent.act('Create a new company', {
    data: 'Make up the first 2 values and use defaults for the rest'
  });
  await agent.check('Company added successfully');
});
```

Acts, checks, and data are all natural language. Think of it like you're describing how to test a particular flow to a co-worker - what steps they need to take, what they should check for, and what test data to use.

For more information on how to build test cases see our docs.