Rick Osterloh and Parisa Tabriz explain how Gemini turns Chrome from a passive window into an agentic assistant with Auto Browse, Nano Banana image editing, History Recall, and a User Alignment Critic for safety.
Google's Rick Osterloh (SVP Platforms & Devices) and Parisa Tabriz (VP Chrome) demo Gemini's new Chrome integration, which adds agentic browsing via Auto Browse, on-device image editing with Nano Banana, tab management through History Recall, and a User Alignment Critic that guards against prompt injection. The update is available now on Windows, Mac, and Chrome OS for Google One subscribers.
Rick Osterloh, Google's SVP of Platforms & Devices, and Parisa Tabriz, VP of Chrome and a 15-year Google security veteran, sit down with Logan Kilpatrick from Google DeepMind to demo and discuss Gemini's integration into Chrome. The update adds a side panel assistant (Ctrl+G), Auto Browse for agentic web task automation, Nano Banana for on-device image editing, and History Recall for natural language browsing history search. Auto Browse is currently limited to Google One Ultra and Pro subscribers in the US. The single most important takeaway: Google built a dedicated "User Alignment Critic" model that monitors every prompt in real time to guard against prompt injection when Auto Browse navigates the open web. Osterloh explicitly called Auto Browse "transitional technology," with MCP and UCP protocols as the long-term path to agent-to-agent commerce.
Auto Browse is deliberately conservative on sensitive actions. It can navigate sites, search, and fill forms autonomously, but it won't post to social media or complete a purchase without human confirmation. Google Password Manager login requires per-site permission. This is a trust-first rollout for a product with billions of users.
Google built a dedicated AI model just to watch other AI models. The User Alignment Critic is a separate agent that monitors every prompt flowing to Gemini in Chrome, checking if instructions align with the user's original intent. This is their answer to prompt injection on the open web, not a filter on the main model but a parallel "Overwatch agent."
> We created this concept of the User Alignment Critic, which is effectively an Overwatch agent that is looking at the prompts that are going by and trying to see if that really is something that you should be cautious about or whether it's okay. It's constantly got your back.

Tabriz says people are already replacing SEO with "agent optimization." She's seeing websites create agent-specific buttons that swap to markdown so AI models don't waste context parsing CSS. Chrome's VP of security framing this as an active trend, not a prediction, is a strong signal.
Auto Browse is transitional technology. Osterloh was explicit: today agents literally navigate pages like a human would. The future uses protocols like MCP and Google's UCP for structured agent-to-agent commerce. Sites that expose APIs now will be ahead when that shift happens.
Chrome's cross-device identity layer is the AI context advantage. Users already sync passwords, autofill, and history across Windows, Mac, Chrome OS, Android, and iOS. Gemini inherits all of that context without the user uploading anything. Tabriz called Chrome "an operating system on top of your operating system."
History Recall solves tab hoarding through loss aversion. People keep tabs open because they're afraid of losing context. History Recall lets you ask natural language questions about your browsing history and get links back.

> We actually know a lot of people keep their tabs around because they're afraid of closing them because they don't know how to get back to it.
>
> I'm a clean tabs person now. I just trust search.
AI compute costs forced the Google One subscription model. Osterloh was candid: AI inference is significantly more expensive than traditional web transactions. The tiered Google One subscription (free, Pro, Ultra) funds continued infrastructure investment. He called it "one of our biggest growth engines."
The Chrome-DeepMind collaboration required redefining shared vocabulary. Tabriz described getting frustrated because "quality" meant different things to the Chrome team and the modeling team. They had to physically get in the same room with a whiteboard to align.
> I think this totally changes what a browser is. The auto browse capability turns the browser a bit into an automated workflow system. This is only scratching the surface, but I think this is a real beginning of a huge change in how people use computers.

I build infrastructure that connects AI agents to web data at WebSearchAPI.ai. When Google ships an agentic browser that can navigate sites, fill forms, and make purchases on a user's behalf, that changes the relationship between AI systems and the web itself. I watched this 48-minute conversation to understand what's actually shipping, what's still experimental, and what it means for anyone building on the web.

Google launched a major update to Gemini in Chrome that works across Windows, Mac, and Chrome OS. The integration adds a side panel accessible via Ctrl+G (or a button click) that supports multi-conversation multitasking, connected app integrations with YouTube, Gmail, Flights, and other Google services, and personal intelligence that syncs your Gemini preferences across devices.
> This week we launched another really big update of Gemini in Chrome. It works across Windows and Mac and Chrome OS, and we brought a more integrated experience into it. You can open it with a control G or shortcut to get a side panel experience that helps with multitasking so you can have multiple conversations at once.

The update also previews Auto Browse, an agentic feature where Gemini can navigate websites and complete tasks on your behalf. Auto Browse is currently limited to Google One Ultra and Pro subscribers in the US.
For anyone building web applications, the most interesting detail is the connected apps architecture. Gemini in Chrome already integrates with YouTube, Gmail, shopping, and flights. Tabriz confirmed they're building more integrations. This means Chrome is evolving from a rendering engine into a platform that can take actions across Google's ecosystem and, through Auto Browse, the open web.
| Feature | What It Does | Availability |
|---|---|---|
| Gemini Side Panel | Multi-conversation AI assistant via Ctrl+G | All Chrome users (free tier limited) |
| Auto Browse | Agentic web task automation | Google One Ultra & Pro (US only) |
| Nano Banana | On-device image editing in context | Google One subscribers |
| History Recall | AI-powered browsing history search | All Gemini in Chrome users |
| Connected Apps | Integrations with Gmail, YouTube, Flights | All Gemini in Chrome users |
| Personal Intelligence | Cross-device preference syncing | Rolling out |
Before this update, AI assistance on desktop lived in a separate browser tab. You'd go to gemini.google.com, ChatGPT, or Claude, type your question, and context-switch back to what you were doing. Mobile had it better because AI assistants were integrated at the OS level, activated by a button press or voice.
> For a long time, the desktop with AI has evolved as a web app, a direct web app that you'd go to. And then mobile sort of evolved in a slightly different direction where it was a full mobile experience with the assistant integrated usually through activation through a button press. We actually felt there was this really important opportunity to bring some of the power that people have in using in-context assistance on a mobile device to the web applications they're using every day.

According to Osterloh, the gap was clear: mobile users got contextual AI assistance, but desktop users had to leave their workflow. The fix was embedding Gemini directly into Chrome's side panel so you can interact with AI while staying on whatever page you're working with.
This matters beyond convenience. When the AI assistant lives inside the browser, it can see your current tab, understand the page content, and take action without you copying and pasting context into a separate window. That's the difference between a chatbot and an assistant.
Chrome runs on Windows, Mac, Chrome OS, Linux, Android, and iOS. Users already sign in and sync their autofill data, passwords, history, and bookmarks across devices. That cross-device identity layer is what makes Chrome different from a standalone AI app.
> We think of Chrome as a platform in and of itself. It really is an operating system on top of your operating system. With Gemini we're bringing assistance and intelligence into the browser with your personal context. We can help on a lot of long-lived journeys, like shopping. Most people don't sit down and buy a new car in one session. They start on one device, pick things up on their laptop, then their phone.

The personal context angle is where this gets interesting for AI applications. Tabriz described use cases ranging from summarizing YouTube videos to synthesizing content across multiple tabs. Her team's own kids use tab groups for schoolwork, asking Gemini to generate quizzes from their biology research tabs.
From an API perspective, this creates a new kind of context window. Instead of a user manually feeding documents into an AI chat, the browser already has the context: browsing history, open tabs, logged-in accounts. The question is how much of that context Google will expose to Gemini, and how users will control what's shared.
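To make the idea concrete, here is a minimal sketch of what "the browser already has the context" could mean in practice: a hypothetical snapshot of browser state folded into a single prompt, with a character budget so page text doesn't swamp the model. Every name here is illustrative, not Chrome's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class BrowserContext:
    """Hypothetical snapshot of what an in-browser assistant can see."""
    current_url: str
    page_text: str
    open_tabs: list = field(default_factory=list)       # titles of other tabs
    recent_history: list = field(default_factory=list)  # recent page titles

def build_prompt(question: str, ctx: BrowserContext, max_chars: int = 2000) -> str:
    """Fold browser-held context into one prompt, truncating page text
    so the assembled context stays within a rough budget."""
    parts = [
        f"User question: {question}",
        f"Current page: {ctx.current_url}",
        f"Page excerpt: {ctx.page_text[:max_chars]}",
    ]
    if ctx.open_tabs:
        parts.append("Other open tabs: " + "; ".join(ctx.open_tabs))
    if ctx.recent_history:
        parts.append("Recent history: " + "; ".join(ctx.recent_history))
    return "\n".join(parts)
```

The interesting product question is exactly the one raised above: which of these fields the user gets to toggle on and off.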

History Recall lets you ask Gemini natural language questions about your browsing history. If you were researching Italian restaurants last week and closed those tabs, you can ask "what were those restaurants I was searching for last week?" and get the answer back with links.

> One thing we also launched in Gemini in Chrome is History Recall. We actually know a lot of people keep their tabs around because they're afraid of closing them because they don't know how to get back to it. In Gemini in Chrome, we made it easier to ask for what you know you had open and actually get it back.
>
> I'm a clean tabs person now. I just trust search. I use Gemini all the time to find where I left off.
This is a practical feature that solves a real behavior problem. Tab hoarding happens because closing a tab feels like losing context. History Recall turns that loss aversion into a non-issue by making your browsing history queryable through natural language.
The deeper implication: if users trust that the AI can recall anything they've seen, they'll change how they browse. Fewer open tabs means less memory pressure, faster browsing, and a different relationship with the browser itself.
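Mechanically, even a crude version of the feature is easy to picture. The sketch below is my own illustration, not Google's implementation (which presumably uses a language model rather than keyword overlap): it ranks recent history entries against a natural-language question.

```python
import re
from datetime import datetime, timedelta

# Toy stopword list; a real system would do semantic matching instead.
STOPWORDS = {"what", "were", "those", "i", "was", "for", "the"}

def recall(query: str, history: list, days: int = 30) -> list:
    """Rank history entries by how many query words appear in the title.
    `history` items look like {"title": str, "url": str, "visited": datetime}."""
    words = set(re.findall(r"\w+", query.lower())) - STOPWORDS
    cutoff = datetime.now() - timedelta(days=days)
    scored = []
    for entry in history:
        if entry["visited"] < cutoff:
            continue  # too old to be "last week"-ish
        title_words = set(re.findall(r"\w+", entry["title"].lower()))
        score = len(words & title_words)
        if score:
            scored.append((score, entry))
    return [e for _, e in sorted(scored, key=lambda p: -p[0])]
```

The hard part Google actually solved is not the ranking but the trust: users only close tabs once recall feels reliable.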
Nano Banana is Google's on-device image generation and editing model. In Chrome, it lets you transform images directly on the web page you're viewing. The demo showed editing a kitchen photo from Redfin, changing wall colors and furniture styles without leaving the page.
> One of the things that we've implemented, which I think is great, is integration of Nano Banana. You can be on a web page and maybe you're trying to do some home decoration, you could ask Gemini to change the color of the carpet or the sofa, and then you can see it as it would appear on the page. These kinds of use cases that you would have to download the image, upload it to a different web app in a different tab, it's just made super easy with this integration.

The key detail: Nano Banana runs on-device, not in the cloud. That's why it can transform images in real time without the round-trip latency of uploading to a server. For privacy-sensitive use cases like editing photos of your home, on-device processing means the images never leave your machine.
Auto Browse is the most significant feature in this update. It lets Gemini navigate websites, click buttons, fill forms, search for products, and draft emails on your behalf. During the demo, Tabriz asked Gemini to find pirate-themed party supplies on Etsy, and the agent navigated to the site, searched for items, and compiled results while she continued working.
> I think this totally changes what a browser is. The big evolution in desktop computing has been that you spend most of your time in front of a browser. Most apps are web apps. But what bringing Gemini into this means is that now you can start to automate some workflows, start to take care of things asynchronously. The auto browse capability turns the browser a bit into an automated workflow system.

Osterloh used it to buy Stanford basketball tickets through SeatGeek without directing the agent to any specific website. The model chose SeatGeek on its own, found the right event, and selected seats.
> The browser is the original user agent. It is moving from this passive window that just renders content to being an agent on your behalf, able to operate on the web and handle these complex tasks that typically require lots of manual work. You oversee and make the fun decisions, maybe picking something, and make the sensitive decisions like taking the action on the purchase.

The practical design decisions are telling. Auto Browse can:

- Navigate websites and click through pages
- Search for products and compile results
- Fill out forms
- Draft emails

But it won't autonomously:

- Post to social media
- Complete a purchase without a final human confirmation
- Log in with Google Password Manager credentials without per-site permission
This conservative approach makes sense for a product with billions of users. As Tabriz put it, "we're more on the conservative side and often asking the user, because we want people to trust the technology."
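That confirmation policy reduces to a small gating rule. A toy version follows; the action names and the policy set are my own stand-ins, not Chrome's actual taxonomy:

```python
# Assumed policy for illustration; not Google's actual sensitive-action list.
SENSITIVE_ACTIONS = {"purchase", "social_post", "send_email", "password_login"}

def gate(action: str, confirmed: bool = False) -> str:
    """Return 'execute' for routine navigation and research steps,
    'ask_user' for sensitive actions lacking explicit confirmation."""
    if action in SENSITIVE_ACTIONS and not confirmed:
        return "ask_user"
    return "execute"
```

The design insight is that the gate sits outside the model: even a perfectly prompt-injected agent still hits the confirmation wall before money moves.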
Logan Kilpatrick raised the obvious question: the internet has CAPTCHAs specifically designed to stop automated systems. How does an agentic browser handle that?

> The sort of agent-browser-human interaction cycle, obviously the internet has CAPTCHAs to stop agentic systems from engaging on websites. What does Auto Browse do if it hits a CAPTCHA? Does it prompt a human in the loop?

> Figuring out that balance of what should the agent do on your behalf and when does it make sense to have the human in the loop is something we're continuing to think really deeply about. Right now we're pretty conservative. We want the user to be engaged when it's posting to social media or in that final purchase confirmation step.
Tabriz didn't directly answer the CAPTCHA question, which is itself informative. The agent likely can't solve CAPTCHAs (that would undermine their purpose), so it presumably hands control back to the user. The broader design philosophy is human-in-the-loop for sensitive actions, with the agent handling the tedious navigation and research.
For web developers, this raises a new question: should your site optimize for agent visitors? Tabriz mentioned seeing websites create "agent buttons" that swap to markdown so AI models don't waste context parsing CSS. That's early, but it points toward a web where pages serve both human and machine readers.
The conversation took an interesting turn when Osterloh brought up MCP (Model Context Protocol) and Google's own UCP announcement. Today, Auto Browse literally navigates pages like a human would. In the future, that changes.
> Today to buy things online through an agent, you often have to use Auto Browse and it'll literally navigate the page as you would. In the future, this will evolve more and more to using protocols and standards and things like MCP and what we announced recently with UCP, creating much easier facilitation of commerce between users and their agents and other websites and their agents.

This is the biggest signal in the conversation. Google is explicitly positioning Auto Browse as a transitional technology. The long-term vision is a protocol-based web where AI agents interact with structured APIs rather than rendering and clicking through web pages.
That future has immediate implications for developers. Sites that expose structured APIs and adopt agent-friendly protocols will get better results from AI browsers than sites that force pixel-level page navigation. It's a shift from optimizing for human eyes to optimizing for machine consumption.
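One lightweight step a site can take today, without waiting on MCP or UCP adoption, is plain HTTP content negotiation: return a lean markdown representation when the client asks for it. This is a hedged sketch of the pattern, not a prescribed standard; `text/markdown` is a registered media type, but whether a given agent sends it in `Accept` is an assumption.

```python
def negotiate(accept_header: str, html: str, markdown: str):
    """Pick a representation from the Accept header: clients that list
    text/markdown get the lean version, everyone else gets HTML.
    Returns (content_type, body)."""
    # Strip quality parameters like ";q=0.8" and whitespace.
    preferred = [p.split(";")[0].strip() for p in accept_header.split(",")]
    if "text/markdown" in preferred:
        return "text/markdown", markdown
    return "text/html", html
```

A real server would also honor `q` weights and emit a `Vary: Accept` header so caches keep the two representations apart.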
According to Tabriz, standards are evolving faster than ever in the AI space, with protocols going "from zero to the universal thing in a matter of weeks." Chrome has historically pushed to demonstrate what's possible and then worked with standards bodies to formalize it.

One of the most quotable moments came when Tabriz described a shift she's already seeing in how people think about web presence.
> You start seeing some changes already where people are not doing SEO anymore, but they're starting to think of agent optimization. How do you optimize your site for an agent to be able to interact with it?

From a security perspective, she immediately flagged the dark side: prompt injection becomes a real risk when AI agents navigate the open web. Attackers could embed hidden instructions in web pages that manipulate what the agent does on the user's behalf.
This connects to a broader trend we're seeing at WebSearchAPI.ai. When AI systems crawl and interact with web content, the page becomes an attack surface. A page optimized for agent interaction is also a page that could contain adversarial prompts. The defense-in-depth approach Chrome is taking, with sandboxing and the User Alignment Critic, is a direct response to this threat.
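Defenses against this usually start with cheap heuristics before any model-based check. The example below is deliberately naive and entirely my own illustration: regexes like these are trivially evaded, which is exactly why Chrome pairs sandboxing with a dedicated critic model rather than relying on pattern matching.

```python
import re

# Illustrative patterns only; real attacks are far more varied.
INJECTION_PATTERNS = [
    r"ignore (all |your )?(previous|prior) instructions",
    r"you are now",
    r"disregard the user",
]

def looks_injected(page_text: str) -> bool:
    """Flag page text containing common prompt-injection phrasings.
    Useful only as a first, cheap filter in front of smarter checks."""
    lowered = page_text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)
```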
Chrome has billions of users. Some want bleeding-edge AI features. Others just want to check email. Designing for both audiences simultaneously is the core product challenge.
> There's going to be a segment of your users who understand and dive in to every single feature that you develop. They want more complexity, more power. They're vocal about it. But you also have to be cautious that you don't lose the long tail of people that are just trying to check their email. Power users can just hit Ctrl+G and it opens. People that don't want it at all can remove it. We can meet the user where they are.

Google's solution is tiered access through Google One subscriptions. The free tier gets basic Gemini in Chrome. Pro and Ultra subscribers get Auto Browse and advanced features. This creates a natural gradient where the most complex features reach users who've self-selected as power users.
The Google One subscription model evolved from storage plans created ten years ago. As Osterloh explained, AI compute costs are significantly higher than traditional web transactions, and the subscription model funds continued investment in infrastructure and R&D. He called Google One subscriptions "one of our biggest growth engines."
The User Alignment Critic is a separate AI model that monitors prompts going to Gemini in Chrome. It acts as a real-time safety layer, checking whether the instructions the agent receives actually align with what the user intended.
> We created this concept of the User Alignment Critic, which is effectively an Overwatch agent that is looking at the prompts that are going by and trying to see if that really is something that you should be cautious about or whether it's okay. It's constantly got your back.

This is Google's answer to prompt injection in an agentic browsing context. When Auto Browse navigates a website, that website could contain hidden text designed to redirect the agent's behavior. The User Alignment Critic watches for this, comparing incoming instructions against the user's original intent.

> I've been at Google a long time, and I started in the security space. Security is a core value for Chrome. For this to be successful, people need to trust it. We made changes in Chrome to take advantage of sandboxing so that Auto Browse is limited in what domains it accesses. There are open problems. At the end of the day, people will use technology and products if they can trust it and feel safe.
The layered defense approach includes:

- Sandboxing that limits which domains Auto Browse can access
- The User Alignment Critic monitoring prompts in real time
- Human confirmation for purchases and social media posts
- Per-site permission before Google Password Manager logins
This is a mature security architecture for a v1 product. The fact that Google built a dedicated critic model rather than relying on the main Gemini model's safety filters shows they understand the threat model is different when AI navigates the open web.
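Google hasn't published how the critic works, but the shape of the check is easy to sketch: compare each proposed step against the user's stated goal, and escalate any step that introduces sensitive actions the goal never mentioned. Everything below is my guess at the pattern using token overlap; the real critic is presumably a model, not a word-set comparison.

```python
# Assumed sensitive vocabulary for illustration only.
SENSITIVE_VERBS = {"transfer", "delete", "email", "password", "buy"}

def critic(user_goal: str, proposed_step: str) -> str:
    """Naive alignment check. Approve steps whose wording overlaps the
    stated goal; escalate steps introducing sensitive verbs the goal
    never mentioned; send everything else for review."""
    goal_words = set(user_goal.lower().split())
    step_words = set(proposed_step.lower().split())
    surprise = (step_words & SENSITIVE_VERBS) - goal_words
    if surprise:
        return "escalate"   # e.g. an injected "email your password" step
    if goal_words & step_words:
        return "approve"
    return "review"
```

The structural point survives the simplification: the critic judges steps against the *original* intent, so instructions smuggled in mid-session carry no authority.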

Tabriz described the collaboration between Chrome and Google DeepMind as one of Chrome's biggest cross-team efforts. The challenge wasn't just technical. The teams had different definitions of quality.

> When we start off, we're saying the same word but we mean totally different things. What quality means to some of the folks working in the modeling team was very different than how we think about quality. Practically, we gotta get in the same room with a whiteboard and actually flesh some of this stuff out because we're getting frustrated by each other but realizing that we're just coming from different worlds.
The testing challenge was especially hard. Testing a text generation model is straightforward: does the output look right? Testing an agentic browser that navigates the open web is a different problem entirely. The team had to build tooling to evaluate whether complex multi-step tasks completed correctly, across different websites, with different page layouts.
Tabriz mentioned that Chrome's existing work on accessibility (screen readers that navigate page elements) and automated testing gave them a head start. DeepMind brought vision-based UI understanding. Combining those capabilities is what made Auto Browse reliable enough to ship.
Auto Browse turns Chrome into an agent runtime, not just a browser. It can navigate sites, search for products, draft emails, and complete multi-step tasks autonomously, with human confirmation for sensitive actions.
Nano Banana runs image editing on-device inside Chrome, so you can edit images on any web page without uploading to a cloud service. Privacy benefit: images never leave your machine.
History Recall makes browsing history queryable through natural language, which solves tab hoarding by making it safe to close tabs without losing context.
The User Alignment Critic is a dedicated AI monitor that watches for prompt injection and intent misalignment when Auto Browse navigates the open web. It's the first major browser-level defense against adversarial web content targeting AI agents.
Google sees Auto Browse as transitional technology. The long-term vision is protocol-based agent interaction using MCP and UCP, not pixel-level page navigation.
Chrome's agentic features are gated behind Google One subscriptions because AI compute costs are significantly higher than traditional web transactions, making the subscription model necessary for sustainable investment.
Agent optimization is replacing SEO according to Tabriz. Websites will need to serve both human visitors and AI agent visitors, a shift that creates new opportunities and new attack surfaces.
This post is based on Gemini in Chrome: Your agentic browsing assistant by Google for Developers.