Google releases its most powerful model to attack OpenAI, shifting focus to AI agents

老好人啊

After releasing its strongest quantum chip, Google has made another important move in AI.
In the early morning of December 12, Beijing time, Google released its new Gemini 2.0 model, just ahead of OpenAI's announcement that ChatGPT was officially arriving on the iPhone.
Google CEO Sundar Pichai said that this is Google's most powerful model to date. With improvements in multimodal aspects such as native image and native audio output, Gemini 2.0 is able to build new AI agents, taking Google one step closer to its vision of building a universal assistant.
Note that Gemini 2.0 is open mainly to developers and trusted testers. For now, the Gemini 2.0 Flash experimental model is available to all Gemini users.
Gemini 2.0 Flash builds on 1.5 Flash, previously Google's most popular model among developers. Compared with 1.5 Flash, it delivers better performance at the same fast response times. Google claims that 2.0 Flash even outperforms 1.5 Pro on key benchmarks while running twice as fast.
2.0 Flash also adds new capabilities. In addition to multimodal inputs such as images, video, and audio, it supports multimodal outputs, such as natively generated content that mixes images and text and controllable multilingual text-to-speech (TTS) audio. It can also natively call tools such as Google Search, code execution, and third-party user-defined functions.
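To make the idea of calling user-defined functions concrete, here is a minimal sketch of how an application might route a model's tool request to local code. It does not use any Google API: the `TOOLS` registry, the `tool` decorator, the `get_weather` function, and the JSON shape of the tool call are all hypothetical and chosen only for illustration.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry. The names and the JSON call format below are
# assumptions for illustration, not part of any Google API.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a user-defined function so it can be requested by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_weather(city: str) -> str:
    # Stand-in for a real lookup; returns canned data.
    return f"Sunny in {city}"


def dispatch(tool_call_json: str) -> str:
    """Run the function named in a tool call and return its result as text."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"error: unknown tool {call['name']!r}"
    return fn(**call.get("args", {}))


# A tool call as a model might emit it (format assumed for illustration).
result = dispatch('{"name": "get_weather", "args": {"city": "Beijing"}}')
print(result)
```

In a real integration, the returned string would be passed back to the model so that it can compose its final answer.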
Gemini users worldwide can now try a chat-optimized version of 2.0 Flash on desktop and mobile web, and it will soon come to the Gemini mobile app. Users can also try the Gemini assistant running on the new model. Early next year, Google will expand Gemini 2.0 to more of its products.
The biggest change in Gemini 2.0 is its shift in focus toward AI agents, with the aim of becoming the foundation model for all AI agents. On that basis, Google has built a series of Gemini 2.0 prototypes that can help users complete tasks.
Among them, the upgraded Project Astra is a research prototype that explores the future capabilities of a universal AI assistant. Since launching Project Astra at Google I/O, Google has been collecting feedback from trusted testers who use it on Android phones. The upgraded version can converse in multiple languages and in mixed languages, and can use new tools such as Google Search, Google Lens, and Google Maps. It can remember up to 10 minutes of conversation and understands language with latency close to that of human conversation.
The all-new Project Mariner explores the future of human-agent interaction, starting with the browser. Project Mariner is an early research prototype built with Gemini 2.0. It can understand and reason over information on a browser page, including pixels and web elements such as text, code, images, and forms, and then uses an experimental Chrome extension to help users complete tasks. With this release, Project Mariner's previously slow speed has been improved.
In short, users can use this feature to let the browser help them complete specific tasks, such as batch searching for email addresses on certain websites, thereby achieving a certain degree of "automatic operation" of the browser.
Jules is a coding agent designed for developers, which can be directly integrated into GitHub workflows to assist developers in completing development tasks.
In Google's demonstration video, the presenter enters a long prompt describing detailed programming tasks. Jules then analyzes the requirements and proposes a three-step plan. After the presenter clicks 'agree', the model starts generating code automatically. This should help developers work more efficiently.
At the end of last year, Google released Gemini 1.0, whose main strength was organizing and understanding information. Gemini 2.0 aims to make that information more useful. Sundar Pichai said that Gemini 2.0's progress stems from Google's 10-year investment in full-stack AI innovation, and that the model is built on Trillium, Google's custom sixth-generation TPU.
Just as Google was drawing attention with its most powerful model, OpenAI's 12-day product launch event was still under way. On the same day, OpenAI publicly demonstrated ChatGPT's integration with Apple Intelligence, but the content was fairly unremarkable. The sudden release of Gemini 2.0 clearly stole much of OpenAI's spotlight.
Backed by Gemini 2.0, Google has launched three agent products at once, which marks another important step in its competition with Microsoft-backed OpenAI, Amazon, and Anthropic.
Agents have become the core battleground in the large-model field. An intelligent agent is a system that can perceive its environment, make decisions, and take actions to achieve specific goals, and it is regarded as a key vehicle for putting large language models (LLMs) into practical use.
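The perceive, decide, act cycle in that definition can be sketched in a few lines. The toy below is only an illustration under stated assumptions: the `Environment` class, the `decide` rule, and the number-line world are hypothetical, and in an LLM-based agent the `decide` step would typically be a model call rather than a hand-written rule.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Toy world (hypothetical): the agent sits on a number line."""
    position: int = 0

    def observe(self) -> int:
        return self.position

    def apply(self, action: int) -> None:
        self.position += action


def decide(observation: int, goal: int) -> int:
    """Pick a step of -1, 0, or +1 that moves toward the goal."""
    if observation < goal:
        return 1
    if observation > goal:
        return -1
    return 0


def run_agent(env: Environment, goal: int, max_steps: int = 100) -> int:
    """Loop perceive -> decide -> act until the goal is reached.

    Returns the number of actions taken (or max_steps if the goal was not reached).
    """
    for step in range(max_steps):
        observation = env.observe()          # perceive
        action = decide(observation, goal)   # decide
        if action == 0:                      # goal reached, stop
            return step
        env.apply(action)                    # act
    return max_steps


env = Environment(position=0)
steps = run_agent(env, goal=5)
print(steps, env.position)
```

The `max_steps` cap is a deliberate design choice: an agent that acts on its own needs a bound so that a bad decision rule cannot loop forever.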
Nearly two months ago, Microsoft released 10 AI agents for sales, operations, and other scenarios. It later announced that its Copilot Studio platform supports building autonomous agents, and released 5 pre-built agents at the same time. At the recently concluded re:Invent 2024, Amazon released six large models at once, including Amazon Nova Premier, a multimodal large model designed for complex reasoning tasks.
In both consumer and enterprise scenarios, AI agents offer plenty of room for imagination and clear commercial prospects. Several industry insiders predict that 2025 will be the year AI agents take off commercially. By then, competition over agents among technology giants such as Google and OpenAI will only grow fiercer.