Rick Osterloh and Parisa Tabriz explain how Gemini turns Chrome from a passive window into an agentic assistant with Auto Browse, Nano Banana image editing, History Recall, and a User Alignment Critic for safety.
Google's Rick Osterloh (SVP Platforms & Devices) and Parisa Tabriz (VP Chrome) demo Gemini's new Chrome integration, which adds agentic browsing via Auto Browse, on-device image editing with Nano Banana, tab management through History Recall, and a User Alignment Critic that guards against prompt injection. The update is available now on Windows, Mac, and Chrome OS for Google One subscribers.
Rick Osterloh, Google's SVP of Platforms & Devices, and Parisa Tabriz, VP of Chrome and a 15-year Google security veteran, sit down with Logan Kilpatrick from Google DeepMind to demo and discuss Gemini's integration into Chrome. The update adds a side panel assistant (Ctrl+G), Auto Browse for agentic web task automation, Nano Banana for on-device image editing, and History Recall for natural language browsing history search. Auto Browse is currently limited to Google One Ultra and Pro subscribers in the US. The single most important takeaway: Google built a dedicated "User Alignment Critic" model that monitors every prompt in real time to guard against prompt injection when Auto Browse navigates the open web. Osterloh explicitly called Auto Browse "transitional technology," with MCP and UCP protocols as the long-term path to agent-to-agent commerce.
Auto Browse is deliberately conservative on sensitive actions. It can navigate sites, search, and fill forms autonomously, but it won't post to social media or complete a purchase without human confirmation. Google Password Manager login requires per-site permission. This is a trust-first rollout for a product with billions of users.
Google built a dedicated AI model just to watch other AI models. The User Alignment Critic is a separate agent that monitors every prompt flowing to Gemini in Chrome, checking if instructions align with the user's original intent. This is their answer to prompt injection on the open web, not a filter on the main model but a parallel "Overwatch agent."
> We created this concept of the User Alignment Critic, which is effectively an Overwatch agent that is looking at the prompts that are going by and trying to see if that really is something that you should be cautious about or whether it's okay. It's constantly got your back.

Tabriz says people are already replacing SEO with "agent optimization." She's seeing websites create agent-specific buttons that swap to markdown so AI models don't waste context parsing CSS. Chrome's VP of security framing this as an active trend, not a prediction, is a strong signal.
Auto Browse is transitional technology. Osterloh was explicit: today agents literally navigate pages like a human would. The future uses protocols like MCP and Google's UCP for structured agent-to-agent commerce. Sites that expose APIs now will be ahead when that shift happens.
Chrome's cross-device identity layer is the AI context advantage. Users already sync passwords, autofill, and history across Windows, Mac, Chrome OS, Android, and iOS. Gemini inherits all of that context without the user uploading anything. Tabriz called Chrome "an operating system on top of your operating system."
History Recall solves tab hoarding through loss aversion. People keep tabs open because they're afraid of losing context. History Recall lets you ask natural language questions about your browsing history and get links back.

> We actually know a lot of people keep their tabs around because they're afraid of closing them because they don't know how to get back to it.
>
> I'm a clean tabs person now. I just trust search.
AI compute costs forced the Google One subscription model. Osterloh was candid: AI inference is significantly more expensive than traditional web transactions. The tiered Google One subscription (free, Pro, Ultra) funds continued infrastructure investment. He called it "one of our biggest growth engines."
The Chrome-DeepMind collaboration required redefining shared vocabulary. Tabriz described getting frustrated because "quality" meant different things to the Chrome team and the modeling team. They had to physically get in the same room with a whiteboard to align.
> I think this totally changes what a browser is. The auto browse capability turns the browser a bit into an automated workflow system. This is only scratching the surface, but I think this is a real beginning of a huge change in how people use computers.

I build infrastructure that connects AI agents to web data at WebSearchAPI.ai. When Google ships an agentic browser that can navigate sites, fill forms, and make purchases on a user's behalf, that changes the relationship between AI systems and the web itself. I watched this 48-minute conversation to understand what's actually shipping, what's still experimental, and what it means for anyone building on the web.

Google launched a major update to Gemini in Chrome that works across Windows, Mac, and Chrome OS. The integration adds a side panel accessible via Ctrl+G (or a button click) that supports multi-conversation multitasking, connected app integrations with YouTube, Gmail, Flights, and other Google services, and personal intelligence that syncs your Gemini preferences across devices.
> This week we launched another really big update of Gemini in Chrome. It works across Windows and Mac and Chrome OS, and we brought a more integrated experience into it. You can open it with a control G or shortcut to get a side panel experience that helps with multitasking so you can have multiple conversations at once.

The update also previews Auto Browse, an agentic feature where Gemini can navigate websites and complete tasks on your behalf. Auto Browse is currently limited to Google One Ultra and Pro subscribers in the US.
For anyone building web applications, the most interesting detail is the connected apps architecture. Gemini in Chrome already integrates with YouTube, Gmail, shopping, and flights. Tabriz confirmed they're building more integrations. This means Chrome is evolving from a rendering engine into a platform that can take actions across Google's ecosystem and, through Auto Browse, the open web.
| Feature | What It Does | Availability |
|---|---|---|
| Gemini Side Panel | Multi-conversation AI assistant via Ctrl+G | All Chrome users (free tier limited) |
| Auto Browse | Agentic web task automation | Google One Ultra & Pro (US only) |
| Nano Banana | On-device image editing in context | Google One subscribers |
| History Recall | AI-powered browsing history search | All Gemini in Chrome users |
| Connected Apps | Integrations with Gmail, YouTube, Flights | All Gemini in Chrome users |
| Personal Intelligence | Cross-device preference syncing | Rolling out |
Before this update, AI assistance on desktop lived in a separate browser tab. You'd go to gemini.google.com, ChatGPT, or Claude, type your question, and context-switch back to what you were doing. Mobile had it better because AI assistants were integrated at the OS level, activated by a button press or voice.
> For a long time, the desktop with AI has evolved as a web app, a direct web app that you'd go to. And then mobile sort of evolved in a slightly different direction where it was a full mobile experience with the assistant integrated usually through activation through a button press. We actually felt there was this really important opportunity to bring some of the power that people have in using in-context assistance on a mobile device to the web applications they're using every day.

According to Osterloh, the gap was clear: mobile users got contextual AI assistance, but desktop users had to leave their workflow. The fix was embedding Gemini directly into Chrome's side panel so you can interact with AI while staying on whatever page you're working with.
This matters beyond convenience. When the AI assistant lives inside the browser, it can see your current tab, understand the page content, and take action without you copying and pasting context into a separate window. That's the difference between a chatbot and an assistant.
Chrome runs on Windows, Mac, Chrome OS, Linux, Android, and iOS. Users already sign in and sync their autofill data, passwords, history, and bookmarks across devices. That cross-device identity layer is what makes Chrome different from a standalone AI app.
> We think of Chrome as a platform in and of itself. It really is an operating system on top of your operating system. With Gemini we're bringing assistance and intelligence into the browser with your personal context. We can help on a lot of long-lived journeys, like shopping. Most people don't sit down and buy a new car in one session. They start on one device, pick things up on their laptop, then their phone.

The personal context angle is where this gets interesting for AI applications. Tabriz described use cases ranging from summarizing YouTube videos to synthesizing content across multiple tabs. Her team's own kids use tab groups for schoolwork, asking Gemini to generate quizzes from their biology research tabs.
From an API perspective, this creates a new kind of context window. Instead of a user manually feeding documents into an AI chat, the browser already has the context: browsing history, open tabs, logged-in accounts. The question is how much of that context Google will expose to Gemini, and how users will control what's shared.
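To make the idea concrete, here is a minimal sketch of what "the browser already has the context" could mean in practice: a hypothetical snapshot of browser state folded into a single prompt, with a character budget so page text doesn't swamp the model. Every name here is illustrative, not Chrome's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class BrowserContext:
    """Hypothetical snapshot of what an in-browser assistant can see."""
    current_url: str
    page_text: str
    open_tabs: list = field(default_factory=list)       # titles of other tabs
    recent_history: list = field(default_factory=list)  # recent page titles

def build_prompt(question: str, ctx: BrowserContext, max_chars: int = 2000) -> str:
    """Fold browser-held context into one prompt, truncating page text
    so the assembled context stays within a rough budget."""
    parts = [
        f"User question: {question}",
        f"Current page: {ctx.current_url}",
        f"Page excerpt: {ctx.page_text[:max_chars]}",
    ]
    if ctx.open_tabs:
        parts.append("Other open tabs: " + "; ".join(ctx.open_tabs))
    if ctx.recent_history:
        parts.append("Recent history: " + "; ".join(ctx.recent_history))
    return "\n".join(parts)
```

The interesting product question is exactly the one raised above: which of these fields the user gets to toggle on and off.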

History Recall lets you ask Gemini natural language questions about your browsing history. If you were researching Italian restaurants last week and closed those tabs, you can ask "what were those restaurants I was searching for last week?" and get the answer back with links.

> One thing we also launched in Gemini in Chrome is History Recall. We actually know a lot of people keep their tabs around because they're afraid of closing them because they don't know how to get back to it. In Gemini in Chrome, we made it easier to ask for what you know you had open and actually get it back.
>
> I'm a clean tabs person now. I just trust search. I use Gemini all the time to find where I left off.
This is a practical feature that solves a real behavior problem. Tab hoarding happens because closing a tab feels like losing context. History Recall turns that loss aversion into a non-issue by making your browsing history queryable through natural language.
The deeper implication: if users trust that the AI can recall anything they've seen, they'll change how they browse. Fewer open tabs means less memory pressure, faster browsing, and a different relationship with the browser itself.
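Mechanically, even a crude version of the feature is easy to picture. The sketch below is my own illustration, not Google's implementation (which presumably uses a language model rather than keyword overlap): it ranks recent history entries against a natural-language question.

```python
import re
from datetime import datetime, timedelta

# Toy stopword list; a real system would do semantic matching instead.
STOPWORDS = {"what", "were", "those", "i", "was", "for", "the"}

def recall(query: str, history: list, days: int = 30) -> list:
    """Rank history entries by how many query words appear in the title.
    `history` items look like {"title": str, "url": str, "visited": datetime}."""
    words = set(re.findall(r"\w+", query.lower())) - STOPWORDS
    cutoff = datetime.now() - timedelta(days=days)
    scored = []
    for entry in history:
        if entry["visited"] < cutoff:
            continue  # too old to be "last week"-ish
        title_words = set(re.findall(r"\w+", entry["title"].lower()))
        score = len(words & title_words)
        if score:
            scored.append((score, entry))
    return [e for _, e in sorted(scored, key=lambda p: -p[0])]
```

The hard part Google actually solved is not the ranking but the trust: users only close tabs once recall feels reliable.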
Nano Banana is Google's on-device image generation and editing model. In Chrome, it lets you transform images directly on the web page you're viewing. The demo showed editing a kitchen photo from Redfin, changing wall colors and furniture styles without leaving the page.
> One of the things that we've implemented, which I think is great, is integration of Nano Banana. You can be on a web page and maybe you're trying to do some home decoration, you could ask Gemini to change the color of the carpet or the sofa, and then you can see it as it would appear on the page. These kinds of use cases that you would have to download the image, upload it to a different web app in a different tab, it's just made super easy with this integration.

The key detail: Nano Banana runs on-device, not in the cloud. That's why it can transform images in real time without the round-trip latency of uploading to a server. For privacy-sensitive use cases like editing photos of your home, on-device processing means the images never leave your machine.
Auto Browse is the most significant feature in this update. It lets Gemini navigate websites, click buttons, fill forms, search for products, and draft emails on your behalf. During the demo, Tabriz asked Gemini to find pirate-themed party supplies on Etsy, and the agent navigated to the site, searched for items, and compiled results while she continued working.
> I think this totally changes what a browser is. The big evolution in desktop computing has been that you spend most of your time in front of a browser. Most apps are web apps. But what bringing Gemini into this means is that now you can start to automate some workflows, start to take care of things asynchronously. The auto browse capability turns the browser a bit into an automated workflow system.

Osterloh used it to buy Stanford basketball tickets through SeatGeek without directing the agent to any specific website. The model chose SeatGeek on its own, found the right event, and selected seats.
> The browser is the original user agent. It is moving from this passive window that just renders content to being an agent on your behalf, able to operate on the web and handle these complex tasks that typically require lots of manual work. You oversee and make the fun decisions, maybe picking something, and make the sensitive decisions like taking the action on the purchase.

The practical design decisions are telling. Auto Browse can:

- Navigate websites and click through pages
- Search for products and compile results
- Fill out forms
- Draft emails

But it won't autonomously:

- Post to social media
- Complete a purchase without a final human confirmation
- Log in with Google Password Manager credentials without per-site permission
This conservative approach makes sense for a product with billions of users. As Tabriz put it, "we're more on the conservative side and often asking the user, because we want people to trust the technology."
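That confirmation policy reduces to a small gating rule. A toy version follows; the action names and the policy set are my own stand-ins, not Chrome's actual taxonomy:

```python
# Assumed policy for illustration; not Google's actual sensitive-action list.
SENSITIVE_ACTIONS = {"purchase", "social_post", "send_email", "password_login"}

def gate(action: str, confirmed: bool = False) -> str:
    """Return 'execute' for routine navigation and research steps,
    'ask_user' for sensitive actions lacking explicit confirmation."""
    if action in SENSITIVE_ACTIONS and not confirmed:
        return "ask_user"
    return "execute"
```

The design insight is that the gate sits outside the model: even a perfectly prompt-injected agent still hits the confirmation wall before money moves.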
Logan Kilpatrick raised the obvious question: the internet has CAPTCHAs specifically designed to stop automated systems. How does an agentic browser handle that?

> The sort of agent-browser-human interaction cycle, obviously the internet has CAPTCHAs to stop agentic systems from engaging on websites. What does Auto Browse do if it hits a CAPTCHA? Does it prompt a human in the loop?

> Figuring out that balance of what should the agent do on your behalf and when does it make sense to have the human in the loop is something we're continuing to think really deeply about. Right now we're pretty conservative. We want the user to be engaged when it's posting to social media or in that final purchase confirmation step.
Tabriz didn't directly answer the CAPTCHA question, which is itself informative. The agent likely can't solve CAPTCHAs (that would undermine their purpose), so it presumably hands control back to the user. The broader design philosophy is human-in-the-loop for sensitive actions, with the agent handling the tedious navigation and research.
For web developers, this raises a new question: should your site optimize for agent visitors? Tabriz mentioned seeing websites create "agent buttons" that swap to markdown so AI models don't waste context parsing CSS. That's early, but it points toward a web where pages serve both human and machine readers.
The conversation took an interesting turn when Osterloh brought up MCP (Model Context Protocol) and Google's own UCP announcement. Today, Auto Browse literally navigates pages like a human would. In the future, that changes.
> Today to buy things online through an agent, you often have to use Auto Browse and it'll literally navigate the page as you would. In the future, this will evolve more and more to using protocols and standards and things like MCP and what we announced recently with UCP, creating much easier facilitation of commerce between users and their agents and other websites and their agents.

This is the biggest signal in the conversation. Google is explicitly positioning Auto Browse as a transitional technology. The long-term vision is a protocol-based web where AI agents interact with structured APIs rather than rendering and clicking through web pages.
That future has immediate implications for developers. Sites that expose structured APIs and adopt agent-friendly protocols will get better results from AI browsers than sites that force pixel-level page navigation. It's a shift from optimizing for human eyes to optimizing for machine consumption.
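One lightweight step a site can take today, without waiting on MCP or UCP adoption, is plain HTTP content negotiation: return a lean markdown representation when the client asks for it. This is a hedged sketch of the pattern, not a prescribed standard; `text/markdown` is a registered media type, but whether a given agent sends it in `Accept` is an assumption.

```python
def negotiate(accept_header: str, html: str, markdown: str):
    """Pick a representation from the Accept header: clients that list
    text/markdown get the lean version, everyone else gets HTML.
    Returns (content_type, body)."""
    # Strip quality parameters like ";q=0.8" and whitespace.
    preferred = [p.split(";")[0].strip() for p in accept_header.split(",")]
    if "text/markdown" in preferred:
        return "text/markdown", markdown
    return "text/html", html
```

A real server would also honor `q` weights and emit a `Vary: Accept` header so caches keep the two representations apart.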
According to Tabriz, standards are evolving faster than ever in the AI space, with protocols going "from zero to the universal thing in a matter of weeks." Chrome has historically pushed to demonstrate what's possible and then worked with standards bodies to formalize it.

One of the most quotable moments came when Tabriz described a shift she's already seeing in how people think about web presence.
> You start seeing some changes already where people are not doing SEO anymore, but they're starting to think of agent optimization. How do you optimize your site for an agent to be able to interact with it?

From a security perspective, she immediately flagged the dark side: prompt injection becomes a real risk when AI agents navigate the open web. Attackers could embed hidden instructions in web pages that manipulate what the agent does on the user's behalf.
This connects to a broader trend we're seeing at WebSearchAPI.ai. When AI systems crawl and interact with web content, the page becomes an attack surface. A page optimized for agent interaction is also a page that could contain adversarial prompts. The defense-in-depth approach Chrome is taking, with sandboxing and the User Alignment Critic, is a direct response to this threat.
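Defenses against this usually start with cheap heuristics before any model-based check. The example below is deliberately naive and entirely my own illustration: regexes like these are trivially evaded, which is exactly why Chrome pairs sandboxing with a dedicated critic model rather than relying on pattern matching.

```python
import re

# Illustrative patterns only; real attacks are far more varied.
INJECTION_PATTERNS = [
    r"ignore (all |your )?(previous|prior) instructions",
    r"you are now",
    r"disregard the user",
]

def looks_injected(page_text: str) -> bool:
    """Flag page text containing common prompt-injection phrasings.
    Useful only as a first, cheap filter in front of smarter checks."""
    lowered = page_text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)
```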
Chrome has billions of users. Some want bleeding-edge AI features. Others just want to check email. Designing for both audiences simultaneously is the core product challenge.
> There's going to be a segment of your users who understand and dive in to every single feature that you develop. They want more complexity, more power. They're vocal about it. But you also have to be cautious that you don't lose the long tail of people that are just trying to check their email. Power users can just hit Ctrl+G and it opens. People that don't want it at all can remove it. We can meet the user where they are.

Google's solution is tiered access through Google One subscriptions. The free tier gets basic Gemini in Chrome. Pro and Ultra subscribers get Auto Browse and advanced features. This creates a natural gradient where the most complex features reach users who've self-selected as power users.
The Google One subscription model evolved from storage plans created ten years ago. As Osterloh explained, AI compute costs are significantly higher than traditional web transactions, and the subscription model funds continued investment in infrastructure and R&D. He called Google One subscriptions "one of our biggest growth engines."
The User Alignment Critic is a separate AI model that monitors prompts going to Gemini in Chrome. It acts as a real-time safety layer, checking whether the instructions the agent receives actually align with what the user intended.
> We created this concept of the User Alignment Critic, which is effectively an Overwatch agent that is looking at the prompts that are going by and trying to see if that really is something that you should be cautious about or whether it's okay. It's constantly got your back.

This is Google's answer to prompt injection in an agentic browsing context. When Auto Browse navigates a website, that website could contain hidden text designed to redirect the agent's behavior. The User Alignment Critic watches for this, comparing incoming instructions against the user's original intent.

> I've been at Google a long time, and I started in the security space. Security is a core value for Chrome. For this to be successful, people need to trust it. We made changes in Chrome to take advantage of sandboxing so that Auto Browse is limited in what domains it accesses. There are open problems. At the end of the day, people will use technology and products if they can trust it and feel safe.
The layered defense approach includes:

- Sandboxing that limits which domains Auto Browse can access
- The User Alignment Critic monitoring prompts in real time
- Human confirmation for purchases and social media posts
- Per-site permission before Google Password Manager logins
This is a mature security architecture for a v1 product. The fact that Google built a dedicated critic model rather than relying on the main Gemini model's safety filters shows they understand the threat model is different when AI navigates the open web.
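Google hasn't published how the critic works, but the shape of the check is easy to sketch: compare each proposed step against the user's stated goal, and escalate any step that introduces sensitive actions the goal never mentioned. Everything below is my guess at the pattern using token overlap; the real critic is presumably a model, not a word-set comparison.

```python
# Assumed sensitive vocabulary for illustration only.
SENSITIVE_VERBS = {"transfer", "delete", "email", "password", "buy"}

def critic(user_goal: str, proposed_step: str) -> str:
    """Naive alignment check. Approve steps whose wording overlaps the
    stated goal; escalate steps introducing sensitive verbs the goal
    never mentioned; send everything else for review."""
    goal_words = set(user_goal.lower().split())
    step_words = set(proposed_step.lower().split())
    surprise = (step_words & SENSITIVE_VERBS) - goal_words
    if surprise:
        return "escalate"   # e.g. an injected "email your password" step
    if goal_words & step_words:
        return "approve"
    return "review"
```

The structural point survives the simplification: the critic judges steps against the *original* intent, so instructions smuggled in mid-session carry no authority.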

Tabriz described the collaboration between Chrome and Google DeepMind as one of Chrome's biggest cross-team efforts. The challenge wasn't just technical. The teams had different definitions of quality.

> When we start off, we're saying the same word but we mean totally different things. What quality means to some of the folks working in the modeling team was very different than how we think about quality. Practically, we gotta get in the same room with a whiteboard and actually flesh some of this stuff out because we're getting frustrated by each other but realizing that we're just coming from different worlds.
The testing challenge was especially hard. Testing a text generation model is straightforward: does the output look right? Testing an agentic browser that navigates the open web is a different problem entirely. The team had to build tooling to evaluate whether complex multi-step tasks completed correctly, across different websites, with different page layouts.
Tabriz mentioned that Chrome's existing work on accessibility (screen readers that navigate page elements) and automated testing gave them a head start. DeepMind brought vision-based UI understanding. Combining those capabilities is what made Auto Browse reliable enough to ship.
Auto Browse turns Chrome into an agent runtime, not just a browser. It can navigate sites, search for products, draft emails, and complete multi-step tasks autonomously, with human confirmation for sensitive actions.
Nano Banana runs image editing on-device inside Chrome, so you can edit images on any web page without uploading to a cloud service. Privacy benefit: images never leave your machine.
History Recall makes browsing history queryable through natural language, which solves tab hoarding by making it safe to close tabs without losing context.
The User Alignment Critic is a dedicated AI monitor that watches for prompt injection and intent misalignment when Auto Browse navigates the open web. It's the first major browser-level defense against adversarial web content targeting AI agents.
Google sees Auto Browse as transitional technology. The long-term vision is protocol-based agent interaction using MCP and UCP, not pixel-level page navigation.
Chrome's agentic features are gated behind Google One subscriptions because AI compute costs are significantly higher than traditional web transactions, making the subscription model necessary for sustainable investment.
Agent optimization is replacing SEO according to Tabriz. Websites will need to serve both human visitors and AI agent visitors, a shift that creates new opportunities and new attack surfaces.
This post is based on Gemini in Chrome: Your agentic browsing assistant by Google for Developers.