Google Intros Gemini 2.5, AWS Counters with Amazon Nova Act for the Browser
The cloud giant AI wars are heating up with Google announcing its Gemini 2.5 model and Amazon countering with the launch of Amazon Nova Act for running AI in the browser, part of an access expansion of the Nova AI ecosystem.
Gemini 2.5
The first model offered in this family is Gemini 2.5 Pro Experimental. This is one of the new "thinking models" that can handle more complex tasks than previous generations, with Google noting its advanced reasoning and planning abilities. This space is associated with "agentic AI," in which autonomous agents carry out tasks on their own according to human instructions, including operating computers independently.
"We've been focused on coding performance, and with Gemini 2.5 we've achieved a big leap over 2.0 -- with more improvements to come," said Google in discussing agentic capabilities in a March 25 announcement. "2.5 Pro excels at creating visually compelling web apps and agentic code applications, along with code transformation and editing. On SWE-Bench Verified, the industry standard for agentic code evals, Gemini 2.5 Pro scores 63.8% with a custom agent setup."
Benchmarks (source: Google).
It also facilitates what has been dubbed "vibe coding," in which developers simply provide instructions to AI, which then generates all or almost all of the code, basically eliminating manual heads-down coding.
For a vibe coding example, the announcement shows how 2.5 Pro used its reasoning capabilities to create a video game, producing the executable code from a single-line prompt.
Other highlights of the announcement include:
- Developer Access via API & Cloud: The model is available to developers through Google AI Studio now, with Vertex AI availability coming soon, enabling integration into enterprise apps, custom tools, and agentic systems.
- Powers Google Workspace Features: Gemini 2.5 Pro is integrated into Google Workspace, enhancing the AI capabilities within Google's suite of productivity tools.
- Multi-Modal & Real-World Readiness: Gemini 2.5 is natively multimodal, according to Google. It supports processing of text, audio, images, video, and code, with a 1 million token context window.
- LMArena Leaderboard Topper: Gemini 2.5 Pro Experimental debuts at #1 on the LMArena leaderboard by a significant margin, indicating both strong capability and a high-quality response style.
- Top Score on Humanity's Last Exam: The model achieved a state-of-the-art score of 18.8% on Humanity's Last Exam among models without tool use, demonstrating its ability to handle complex knowledge and reasoning tasks.
- High Score on SWE-Bench Verified: On SWE-Bench Verified, a key industry benchmark for agentic code evaluations, Gemini 2.5 Pro scores 63.8% with a custom agent setup.
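For a rough sense of what the 1 million token context window listed above means for a code repository, the Python sketch below estimates token counts from character counts. The 4-characters-per-token ratio is a common rule of thumb and an assumption here, not a figure from Google; real tokenizers vary with language and content. The function names are hypothetical.

```python
# Rough context-window budgeting. CHARS_PER_TOKEN is an assumed heuristic,
# not a property of Gemini's tokenizer.
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the announcement
CHARS_PER_TOKEN = 4                # assumption: rule-of-thumb ratio


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_window(documents: list[str], reserve_tokens: int = 0) -> bool:
    """Return True if all documents plus a reserved budget fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    files = ["x" * 400_000, "y" * 1_200_000]  # stand-ins for source files
    print(fits_in_window(files))
```

An estimate like this is only a first check; an exact count would have to come from the model's own tokenizer.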
"Gemini 2.5 builds on what makes Gemini models great -- native multimodality and a long context window," Google said. "2.5 Pro ships today with a 1 million token context window (2 million coming soon), with strong performance that improves over previous generations. It can comprehend vast datasets and handle complex problems from different information sources, including text, audio, images, video and even entire code repositories."
Amazon Nova
The Amazon Nova foundational model isn't as new as Gemini 2.5, having been introduced in December 2024, but the company today (March 31) announced expanded access to the Nova AI ecosystem, which includes a new website and the new Nova Act for running AI in the browser. This is part of a larger effort to make AI more accessible and easier to use for developers and businesses.
While the new website, nova.amazon.com, helps users easily explore the company's foundation models, Amazon also introduced Amazon Nova Act, a new AI model trained to perform actions within a web browser. Specifically, the company released a research preview of the Amazon Nova Act SDK, which will help developers to experiment with an early version of the new model.
nova.amazon.com (source: Amazon).
"Nova.amazon.com puts the power of Amazon's frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova," the company said. "We've created this experience to inspire builders, so that they can quickly test their ideas with Nova models, and then implement them at scale in Amazon Bedrock. It is an exciting step forward for rapid exploration with AI, including bleeding-edge capabilities such as the Nova Act SDK for building agents that take actions on the web. We're excited to see what they build and to hear their useful feedback."
The Nova family of models now includes:
- Nova Micro: Fast, text-only model
- Nova Lite: Understands text, images and video
- Nova Pro: Best combo of quality and speed
- Nova Canvas: Image generation model
- Nova Reel: Video generation model
Here again, agentic AI is front and center, with the Nova Act SDK enabling developers to build web browser agents by breaking down workflows into atomic commands.
"We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user," Amazon said. "Today, such agents are still in an early stage. The Nova Act SDK is a crucial step forward, toward building reliable agents by enabling developers to break down complex workflows into atomic commands (e.g., search, checkout, answer questions about the screen). It also enables developers to add more detailed instructions to those commands where needed (e.g., 'don't accept the insurance upsell'), call APIs, and more to further strengthen reliability."
Users were invited to explore the Amazon Nova models or download the Amazon Nova Act SDK.
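The idea of breaking a workflow into atomic commands, as Amazon describes it, can be sketched in Python. The example below does not use the Nova Act SDK; `RecordingAgent` and `book_rental_car` are hypothetical stand-ins that only record the commands they receive, to show how a workflow might be split into small steps with extra instructions attached where needed.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingAgent:
    """Hypothetical stand-in for a browser agent; it only logs commands."""
    log: list[str] = field(default_factory=list)

    def act(self, command: str, instructions: str = "") -> None:
        # A real agent would drive the browser here; this stub records the step.
        step = command if not instructions else f"{command} ({instructions})"
        self.log.append(step)


def book_rental_car(agent: RecordingAgent, city: str) -> list[str]:
    """Run one workflow as a sequence of small, separately checkable commands."""
    agent.act(f"search for rental cars in {city}")
    agent.act("select the cheapest result")
    agent.act("proceed to checkout", instructions="don't accept the insurance upsell")
    return agent.log


if __name__ == "__main__":
    for step in book_rental_car(RecordingAgent(), "Seattle"):
        print(step)
```

Keeping each command small means a single step can be checked or retried on its own, which is presumably the reliability benefit Amazon has in mind.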
About the Author
David Ramel is an editor and writer at Converge 360.