Designate different LLMs for different responsibilities: act, extract, query.
By default, when a single LLM is provided, that LLM handles all responsibilities. However, by assigning different LLMs to certain roles, you may be able to reduce cost and improve speed.
Example:
act requires an intelligent and visually grounded model, whereas extract and query do not require grounded models and can often work fine with less intelligent models.
General recommendations:
act: MUST use an intelligent, visually grounded model.
extract: Can use a fast and cheap model, like gemini-2.5-flash or even gemini-2.5-flash-lite.
query: Can use any model that is reasonably intelligent but fast, depending on the complexity of the queries you plan to ask. gemini-2.5-flash might be a good option.
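To make the role split concrete, here is a minimal sketch of how per-role model assignment with a single-LLM fallback could be resolved. It is not the library's actual API: the `Role`, `ModelSpec`, and `resolveRoles` names and the `my-grounded-model` identifier are all hypothetical placeholders chosen for illustration. Check the library's own configuration reference for the real option names.

```typescript
// Hypothetical types for illustration only; not the library's actual API.
type Role = "act" | "extract" | "query";

interface ModelSpec {
  model: string;   // model identifier
  roles?: Role[];  // roles this model handles; omitted means "all remaining roles"
}

const ALL_ROLES: Role[] = ["act", "extract", "query"];

// Resolve which model handles each role. A spec that lists a role explicitly
// wins over a spec with no roles, which acts as the catch-all default.
function resolveRoles(specs: ModelSpec | ModelSpec[]): Record<Role, string> {
  const list = Array.isArray(specs) ? specs : [specs];
  const fallback = list.find((s) => s.roles === undefined);
  const resolved = {} as Record<Role, string>;
  for (const role of ALL_ROLES) {
    const explicit = list.find((s) => (s.roles ?? []).includes(role));
    const chosen = explicit ?? fallback;
    if (!chosen) {
      throw new Error(`No model configured for role "${role}"`);
    }
    resolved[role] = chosen.model;
  }
  return resolved;
}

// Single LLM: every role falls back to it.
const single = resolveRoles({ model: "my-grounded-model" });

// Split roles following the general recommendations.
const split = resolveRoles([
  { model: "my-grounded-model", roles: ["act"] }, // placeholder name
  { model: "gemini-2.5-flash-lite", roles: ["extract"] },
  { model: "gemini-2.5-flash", roles: ["query"] },
]);

console.log(single);
console.log(split);
```

In this sketch, leaving out a role without providing a catch-all model raises an error, so a misconfigured split fails early rather than silently routing act to a model that is not visually grounded.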