Low-Level AI Actions
Bypass the planner and trigger specific web actions with AI
While ai.step() is the primary method for defining actions using natural language, the ai object also provides lower-level methods for more direct control over specific browser interactions, such as clicking and typing into specific targets.
These actions bypass the planner and go straight to Moondream. This is useful when you want to use AI selectors directly while we continue to refine the agent's behavior on steps and checks.
Action Method Comparison
| | Example | Web Actions | Brittleness | Control |
|---|---|---|---|---|
| 🧠 High Level AI | ai.step(description) | One or Many | Low | Medium |
| 🤖 Low Level AI | ai.click(target) | One | Medium | High |
| ⚙️ Playwright | page.mouse.click(x, y) | One | ⚠️ High | High |
When to use low level actions
Low-level actions are useful when:
- You need precise control over a specific interaction.
- Moondream is having trouble clicking a specific target and you want to control the prompt directly.
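As a rough illustration of the second case, the sketch below mixes one high-level step with one low-level click whose prompt is written by hand. The `MixedAI` interface and `createMixedRecorder` helper are hypothetical stand-ins, not part of the framework: they only record calls so the example runs without a browser, and the real `ai` object is the one supplied to your test.

```typescript
// Hypothetical stand-in for the `ai` object; it records calls instead of driving a browser.
interface MixedAI {
  step(description: string): Promise<void>;
  click(target: string): Promise<void>;
}

function createMixedRecorder(log: string[]): MixedAI {
  return {
    async step(description: string): Promise<void> {
      log.push(`step: ${description}`);
    },
    async click(target: string): Promise<void> {
      log.push(`click: ${target}`);
    },
  };
}

async function checkoutFlow(ai: MixedAI): Promise<void> {
  // Broad intent: leave the choice of web actions to the planner.
  await ai.step("Add the first product to the cart");
  // Troublesome target: skip the planner and write the prompt yourself.
  await ai.click("small cart icon with a red badge, top right of the page");
}

const mixedLog: string[] = [];
const mixedDone: Promise<void> = checkoutFlow(createMixedRecorder(mixedLog));
```

The page text and target descriptions here are made up for the example; the point is only that the two styles can sit side by side in one flow.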
ai.click(target)
Directly performs a click action on an element identified by the target description.
target: A natural language description of the element to click. The AI identifies the element based on this description and performs a click.
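A minimal sketch of a single click call follows. Only the `click(target)` signature comes from this page; the `ClickAI` interface, the `createClickRecorder` helper, and the target text are assumptions introduced so the example is self-contained.

```typescript
// Hypothetical stand-in for the `ai` object; the real one is supplied by the test framework.
interface ClickAI {
  click(target: string): Promise<void>;
}

// Records each call instead of clicking anything, so the sketch runs on its own.
function createClickRecorder(log: string[]): ClickAI {
  return {
    async click(target: string): Promise<void> {
      log.push(`click: ${target}`);
    },
  };
}

// One low-level call maps to exactly one web action.
async function openSettings(ai: ClickAI): Promise<void> {
  // Describe the element specifically enough to tell it apart from lookalikes.
  await ai.click("gear icon in the top-right corner of the header");
}

const clickLog: string[] = [];
const clickDone: Promise<void> = openSettings(createClickRecorder(clickLog));
```

Because the description is passed along as the prompt, a more specific description (position, color, nearby text) is usually the first thing to adjust when the wrong element is picked.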
ai.type(target, content)
Directly performs a typing action into a specified input element.
target: A natural language description of the input element to type into.
content: The text content to type into the target element.
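The two arguments can be sketched as below. The `type(target, content)` signature is taken from this page; the `TypingAI` interface, the `createTypingRecorder` helper, and the form field description are hypothetical and exist only to make the example runnable without a browser.

```typescript
// Hypothetical stand-in for the `ai` object; it records calls instead of typing.
interface TypingAI {
  type(target: string, content: string): Promise<void>;
}

function createTypingRecorder(log: Array<[string, string]>): TypingAI {
  return {
    async type(target: string, content: string): Promise<void> {
      log.push([target, content]);
    },
  };
}

async function fillEmail(ai: TypingAI, email: string): Promise<void> {
  // First argument: which input to use. Second argument: the literal text to enter.
  await ai.type("email field in the login form", email);
}

const typeLog: Array<[string, string]> = [];
const typeDone: Promise<void> = fillEmail(createTypingRecorder(typeLog), "user@example.com");
```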
ai.exec(action)
Execute any action using its JSON representation.
action: An ActionIntent object describing the exact action. See properties below for variants.
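Since the action is plain JSON, a recorded list of actions can be stored as data and replayed. The sketch below assumes an ActionIntent is a union discriminated by a `variant` field with `target` and `content` properties; these field names, the `ActionIntentSketch` types, and the `ExecAI` stub are assumptions for illustration, so check them against the properties documented for your version.

```typescript
// Assumed shape of an action: a union discriminated by `variant`. Field names are illustrative.
type ClickIntentSketch = { variant: "click"; target: string };
type TypeIntentSketch = { variant: "type"; target: string; content: string };
type ActionIntentSketch = ClickIntentSketch | TypeIntentSketch;

// Hypothetical stand-in for the `ai` object; it describes each action instead of running it.
interface ExecAI {
  exec(action: ActionIntentSketch): Promise<void>;
}

function describeIntent(action: ActionIntentSketch): string {
  switch (action.variant) {
    case "click":
      return `click ${action.target}`;
    case "type":
      return `type "${action.content}" into ${action.target}`;
  }
}

function createExecRecorder(log: string[]): ExecAI {
  return {
    async exec(action: ActionIntentSketch): Promise<void> {
      log.push(describeIntent(action));
    },
  };
}

// Replay actions one at a time, in the order they were recorded.
async function replay(ai: ExecAI, actions: ActionIntentSketch[]): Promise<void> {
  for (const action of actions) {
    await ai.exec(action);
  }
}

const recordedJson =
  '[{"variant":"click","target":"Sign in button"},' +
  '{"variant":"type","target":"email field","content":"user@example.com"}]';
const recordedIntents = JSON.parse(recordedJson) as ActionIntentSketch[];

const execLog: string[] = [];
const execDone: Promise<void> = replay(createExecRecorder(execLog), recordedIntents);
```

Note that `JSON.parse` does no validation, so the cast is only safe for input you trust; real code would check each object's `variant` before executing it.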