Google I/O 2026 keynote in 35 minutes

The Verge

210,441 views • 1 month ago

Video Summary

Google has unveiled a suite of AI advancements across its products, integrating conversational AI and new models such as Gemini Omni and Gemini 3.5 Flash. New features include "Ask Maps" for complex queries, an enhanced "Ask YouTube" that summarizes video content and links directly to relevant segments, and "Docs Live," which allows voice-driven document creation and editing. Gemini Omni, a new model, promises stronger understanding and editing capabilities for video and images, including interactive simulations, alongside content credential verification to distinguish AI-generated content. Gemini 3.5 Flash is highlighted for its speed and efficiency, powering new agentic capabilities through "Antigravity 2.0" and "Gemini Spark," a personal AI agent designed for task management. Google is also overhauling Search with an intelligent search box capable of cross-modal queries and introducing "Search Agents" for 24/7 information gathering and task execution. Another notable development is the "Universal Cart," an intelligent shopping assistant that works across merchants. The Gemini app has been redesigned with a "Neural Expressive" aesthetic and enhanced live experiences. Finally, AI simulations such as "AlphaEarth Foundations" and AI-driven drug discovery at "Isomorphic Labs" are presented as ways to accelerate scientific progress and improve human health. According to the keynote, Gemini 3.5 Flash is four times faster than other frontier models in output tokens per second.

Short Highlights

Google Maps and YouTube are being upgraded with conversational AI, allowing users to ask complex questions and receive summarized, relevant video content.
"Docs Live" introduces voice-driven document creation and editing, with an example showing the AI pulling information from a resume and an email to draft talking points for a career day presentation.
Gemini Omni is a new model capable of creating and editing content from any input, with advanced capabilities in video generation and simulation, including realistic physics.
Gemini 3.5 Flash is a highly capable and fast model, co-optimized with the "Antigravity 2.0" harness, enabling agentic capabilities and powering new products like "Gemini Spark."
Google is enhancing its Search experience with an intelligent search box and AI Overviews, and is introducing "Search Agents" that work 24/7 to gather information and help users take action.

Key Details

"Ask Maps" and "Ask YouTube" Enhancements [00:05]

Maps has received its biggest upgrade in a decade with the introduction of "Ask Maps," enabling users to pose complex and lengthy questions.
"Ask YouTube" reimagines the search experience by providing digestible overviews, helpful tips, and directly jumping to the most relevant parts of videos for complex queries.
The system remembers context, allowing for follow-up questions, and can present information in tables for easy comparison, as demonstrated with a query about teaching a child to ride a bike.
Ask YouTube is currently in testing and will roll out broadly in the US this summer.

People come to YouTube every day to ask a lot of questions. It's a lot of great videos. Sometimes it's hard to know where to start.

Docs Live and Voice Capabilities [01:43]

"Docs Live" allows users to verbally brain dump ideas, with Gemini processing them to create and edit documents in real-time.
A demo showcased a user requesting talking points for a career day, asking for funny analogies, and requesting details from an email, all processed and drafted within the document.
Users can further refine the output by asking for formatting changes, like organizing analogies into a table, and adding specific narrative elements.
Future capabilities will include creating and editing new documents entirely through voice commands.
Docs Live will roll out to Pro and Ultra subscribers this summer, with similar voice capabilities coming to Gmail and Google Keep.

Oh, actually, can you just pull my resume from Drive? Although, that might be boring. Um, maybe can you come up with some funny analogies so it'll be more of an engaging talk for the students?

Gemini Omni: Multimodal Understanding and Creation [03:34]

Gemini Omni is introduced as a new model that can create anything from any input, combining Gemini's intelligence with generative media models.
It boasts advanced world understanding, multimodality, and editing capabilities, surpassing previous models in simulating concepts like kinetic energy and gravity.
Omni can translate complex ideas into highly accurate videos, exemplified by an animation explaining protein folding.
The model offers natural language editing for videos, allowing users to change styles, adjust details, add elements, and even modify the camera angle, all through conversational prompts.
Content credentials verification (SynthID) is being expanded across Search and Chrome to indicate the origin of content (AI or camera) and whether it has been edited with generative AI.

Models like Veo, Nano Banana, and Genie are able to create extremely realistic videos, images, and interactive simulations.

Gemini 3.5 Flash and Agentic Capabilities [06:56]

Gemini 3.5 Flash is presented as a model combining frontier intelligence with action, outperforming Gemini 3.1 Pro across most benchmarks and showing extraordinary progress in coding and on GDPval.
It is significantly faster than other frontier models, which puts it in a league of its own when intelligence is plotted against output speed.
"Antigravity 2.0" is a new standalone desktop application focused on agent-first experiences, including conversations, artifact production, and multi-agent orchestration.
The "Antigravity harness" for Gemini has been enhanced with primitives like sub-agents, hooks, and asynchronous task management, with Gemini 3.5 Flash co-optimized for it.
A demonstration showed an agent built with Antigravity and Gemini 3.5 Flash successfully creating a working operating system that could even run Doom.
Gemini 3.5 Flash is available for everyone today across Google products and APIs.

Second, 3.5 Flash is a very capable model at the frontier and comparable to the best models, but much, much faster, which is why, when you look at the intelligence versus output speed, it's in a whole league of its own in the top-right quadrant.

Gemini Spark and AI Agents in Action [10:26]

Gemini Spark is introduced as a personal AI agent that helps users navigate their digital lives by taking actions on their behalf, running 24/7 on Google Cloud.
It leverages Gemini 3.5 and the Google Antigravity harness for long-running tasks and integrates with various services, including third-party tools via MCP.
Users can interact with Spark through the Gemini app, email, or chat.
A demonstration showed Spark creating multiple threads for tasks like finding meetings with a specific person, writing a welcome note to a neighbor, and creating a categorized to-do list for kids' school year activities.
Gemini Spark is rolling out to trusted testers this week and as a beta for Google AI Ultra subscribers next week.

Spark will integrate seamlessly with tools, starting with our own and, in the coming weeks, with third-party tools through MCP, and you can work with Spark however is most convenient: in the Gemini app or, soon, through email and chat.

Next-Generation Search and Universal Cart [13:26]

A brand new intelligent search box is launching, integrating powerful AI tools and enabling cross-modal queries (text, images, files, videos).
This is described as the biggest upgrade to the search box in over 25 years, rolling out today.
AI Overviews and AI Mode are being merged into a seamless AI search experience, allowing users to flow effortlessly from questions to follow-ups with context maintained.
The era of "Search Agents" begins, allowing users to set up information agents that work 24/7 in the background to find information and take action, such as apartment hunting or tracking sneaker drops.
The "Universal Cart" is a new intelligent shopping cart that works across merchants, allowing users to add items from Search, Gemini, YouTube, or Gmail, and automatically finding deals, price drops, and stock alerts.

We're entering the era of search agents. Now to start, you can set information agents to work for you 24/7 in the background.

Generative UI in Search and Google Pix [15:26]

Search will incorporate "Generative UI" with Antigravity, allowing it to build ideal, custom formats for questions on the fly, including dynamic layouts and interactive widgets.
An example demonstrated an interactive visual of how black holes affect spacetime and how orbiting objects create gravitational waves, built dynamically in response to user queries.
Generative UI with Antigravity is rolling out to Search this summer, free of charge.
"Google Pix," a new image creation and editing tool in Google Workspace, allows users to create and edit images with creative controls, understanding how objects work together.
Pix outputs are watermarked with SynthID, and the tool is rolling out this summer.

Generative UI with Antigravity is rolling out to Search this summer for everyone, free of charge.

Google Flow and Flow Music Advancements [25:43]

A new agent in "Google Flow" allows for multiple actions to be executed at once, moving beyond single prompt execution.
An agent can analyze a single image and generate multiple video angles or perform large-scale edits, such as changing a desert scene from day to night, demonstrating precise context understanding.
"Flow Tools" enables users to "vibe code" any creative tool within Flow, allowing for custom tools for unique creative processes.
"Google Flow Music" allows artists to create original songs, with an example of transforming a piano riff into an R&B demo with female vocals.
These features in Google Flow and Google Flow Music are available today.

Our next update is Flow Tools. Now you can vibe code any creative tool you could think of right in Flow.

Audio Glasses and AI in Wearables [28:35]

Google's first audio glasses will arrive this fall, offering all-day help with Gemini spoken privately into the ear, allowing users to remain hands-free and heads-up.
The glasses pair with both Android and iOS devices and support features like listening to music, taking photos, making calls, and accessing phone apps.
A live demo showcased navigation assistance, ordering coffee via DoorDash through voice commands, and receiving important message alerts, all without using a phone.
The glasses can also provide glanceable displays on a watch.
A live demo also showed taking a photo of an audience, transforming it into a cartoon with a blimp added to the sky, previewed on a watch.

They are designed to give you all day help with Gemini that is spoken into your ear privately rather than shown on a display.

AGI, Cybersecurity, and Scientific Acceleration [32:20]

Artificial General Intelligence (AGI) is on the horizon, poised to be the most profound technology ever invented, with the potential to propel human progress.
Google is investing in cybersecurity with tools like "CodeMender," an agent that automatically finds and fixes software vulnerabilities.
"Gemini for Science" is introduced, bringing together AI tools to accelerate research, assisting in solving complex problems, streamlining daily tasks, and generating hypotheses.
AI simulations, like "AlphaEarth Foundations" (a digital twin of the planet), are critical for understanding and predicting complex dynamic systems to address global issues like deforestation and food security.
"Isomorphic Labs" is using AI to model molecular interactions to accelerate drug discovery for immune disorders and cancer, aiming to solve all disease.