首页 News 正文

Google Snipes OpenAI, Concentrates Fire on Attacking AI Agents

六月清晨搅
231 0 0

On December 12th, as OpenAI announced the full integration of ChatGPT with Apple, Google released a new generation of big model Gemini 2.0. It is worth noting that Gemini 2.0 is specifically designed for AI agents.
Google CEO Sundar Pichai stated in an open letter, "Over the past year, we have been investing in developing more 'proxy' models that can better understand the world around you, think multiple steps ahead, and perform tasks under your supervision. Today, we are pleased to welcome a new generation of models - Gemini 2.0, which is our most powerful model to date. Through new advances in multimodality, such as native image and audio output, as well as the use of native tools, we are able to build new AI agents that bring us closer to the vision of universal AI assistants
Demis Hassabis, CEO of Google DeepMind, also stated that 2025 will be the era of AI agents, and Gemini 2.0 will be the latest generation model to support our work based on agents.
At present, Gemini 2.0 version has not been officially launched, and Google has stated that it has been provided to some developers for internal testing. The Gemini 2.0 Flash experimental version, which is stronger than Gemini 1.5 Pro, was launched immediately. The experimental version has been opened on the web, and Gemini users can access Gemini 2.0 Flash through the PC end. The mobile end is about to be launched.
According to benchmark test results released by Google, in terms of multimodal image and video capabilities, as well as encoding and mathematical abilities, the Flash experimental version of Gemini 2.0 almost outperforms Gemini 1.5 Pro in all aspects, and its response speed has been doubled.
Google focuses its firepower on fiercely attacking AI intelligent agents
Through Google's latest update, we can now glimpse a corner of the glacier in its AI layout - everything for intelligent agents.
1. More powerful multimodal capabilities:
Gemini 2.0 Flash Experimental Edition not only supports multimodal inputs such as images, videos, and audio, but also multimodal outputs such as native generated images combined with text, as well as controllable multilingual text to speech (TTS) audio.
2. More professional AI search:
Google has launched a new intelligent agent feature called Deep Research in Gemini Advanced. This feature combines Google's search expertise with Gemini's advanced reasoning abilities to generate research reports around a complex topic, serving as a personal research assistant.
3. Multiple intelligent agents have been updated and launched:
Updated the intelligent agent Project Astra based on Gemini 2.0: Astra's new features include support for multilingual mixed dialogue; Ability to directly call Google Lens and map functions in Gemini applications; Improved memory ability, with up to 10 minutes of intra session memory, resulting in more coherent conversations; With the help of new streaming processing technology and native audio understanding capabilities, this intelligent agent is able to understand language with a latency close to human dialogue. It is worth noting that Astra is a forward-looking project developed by Google for the glasses project. Google mentioned that it is porting Project Astra to more mobile devices such as glasses.
Release Project Mariner, an intelligent agent for browsers: This agent is capable of understanding and inferring information on the browser screen, including pixels and web elements such as text, code, and images, and then using this information through Chrome extensions to help you complete tasks.
Release AI programming agent Jules specially designed for developers: Jules supports direct integration into GitHub workflows, allowing users to describe problems in natural language and generate code that can be merged into GitHub projects;
Release game intelligent agent: capable of real-time interpretation of screen images, providing next operation suggestions through user actions on the game screen, or directly communicating with you through voice communication while you are playing games.
Google has stated that it will expand Gemini 2.0 to more of its products early next year. The previously launched AI Overviews will integrate Gemini 2.0 to enhance complex problem-solving capabilities, including advanced mathematical formulas, multimodal queries, and programming. Limited testing has been conducted this week, and it is expected to be promoted next year and expanded to more countries and languages.
LogoMoney.com 系信息发布平台,仅提供信息存储空间服务。
声明:该文观点仅代表作者本人,本文不代表LogoMoney.com立场,且不构成建议,请谨慎对待。
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

  •   工信部党组书记李乐成会见德国汽车工业协会主席希尔德加德·穆勒   4月27日,工业和信息化部党组书记李乐成在北京会见德国汽车工业协会主席希尔德加德·穆勒,双方就深化中德汽车产业合作进行了交流。李乐成表 ...
    moonlightplay
    前天 10:42
    支持
    反对
    回复
    收藏
  •   美国总统特朗普近日在接受媒体采访时表示,他第二个任期不仅治理美国,也治理全世界。   特朗普于4月24日接受了《大西洋》(The Atlantic)月刊采访,这段专访于4月28日发布。   “第一次当总统时,我要做两 ...
    lfancn
    8 小时前
    支持
    反对
    回复
    收藏
  •   东风有限回应武汉工厂关停事宜   据第一财经,4月29日,东风汽车有限公司证实,该公司武汉工厂目前正常运行,后续也不会关停。东风有限称,该公司将在东风与日产母公司的支持下平稳有序发展,持续加速向新能源 ...
    king19831101
    10 小时前
    支持
    反对
    回复
    收藏
  •   4月29日凌晨,阿里巴巴开源新一代通义千问模型Qwen3(千问3),参数量为DeepSeek-R1的三分之一,成本大幅下降。据称,该模型性能全面超越R1、OpenAI-o1等领先模型,登顶全球最强开源模型。   千问3是国内首个“ ...
    风雨中行走
    昨天 10:32
    支持
    反对
    回复
    收藏
六月清晨搅 注册会员
  • 粉丝

    0

  • 关注

    0

  • 主题

    30