Google’s AI can now surf the web for you, click on buttons, and fill out forms with Gemini 2.5 Computer Use

by | Oct 7, 2025 | Technology

Some of the largest providers of large language models (LLMs) have sought to move beyond multimodal chatbots — extending their models out into “agents” that can actually take more actions on behalf of the user across websites. Recall OpenAI’s ChatGPT Agent (formerly known as “Operator”) and Anthropic’s Computer Use, both released over the last two years. Now, Google is getting into that same game as well. Today, the search giant’s DeepMind AI lab subsidiary unveiled a new, fine-tuned and custom-trained version of its powerful Gemini 2.5 Pro LLM known as “Gemini 2.5 Pro Computer Use,” which can use a virtual browser to surf the web on your behalf, retrieve information, fill out forms, and even take actions on websites — all from a user’s single text prompt.”These are early days, but the model’s ability to interact with the web – like scrolling, filling forms + navigating dropdowns – is an important next step in building general-purpose agents,” said Google CEO Sundar Pichai, as part of a longer statement on the social network, X.The model is not available for consumers directly from Google, though.Instead, Google partnered with another company, Browserbase, founded by former Twilio engineer Paul Klein in early 2024, which offers virtual “headless” web browser specifically for use by AI agents and applications. (A “headless” browser is one that doesn’t require a graphical user interface, or GUI, to navigate the web, though in this case and others, Browserbase does show a graphical representation for the user). Users can demo the new Gemini 2.5 Computer Use model directly on Browserbase here and even compare it side-by-side with the older, rival offerings from OpenAI and Anthropic in a new “Browser Arena” launched by the startup (though only one additional model can be selected alongside Gemini at a time).For AI builders and developers, it’s being made as a raw, albeit propreitary LLM through the Gemini API in Google AI Studio for rapid prototyping, and Google Cloud’s Vertex AI model selector and applications building platform. The new offering builds on the capabilities of Gemini 2.5 Pro, released back in March 2025 but which has been updated significantly several times since then, with a specific focus on enabling AI agents to perform direct interactions with user interfaces, including browsers and mobile applications.Overall, it appears Gemini 2.5 Computer Use is designed to let developers create agents th …

Article Attribution | Read More at Article Source