A watershed for the open-source community: Meta releases Llama 3, with the largest version set to reach 400 billion parameters
六月清晨搅
Posted on 2024-4-19 16:12:24
To defend its position in open-source large AI (artificial intelligence) models, social media giant Meta has launched its latest open-source model.
On April 18 local time, Meta announced on its official website the release of its latest large model, Llama 3. So far, Meta has released the two smaller versions of Llama 3, with 8 billion (8B) and 70 billion (70B) parameters, each with an 8k context window. Meta stated that, thanks to higher-quality training data and instruction fine-tuning, Llama 3 is a "significant improvement" over the previous generation, Llama 2.
Meta plans to release a larger version of Llama 3 with over 400 billion parameters. It also plans to add new capabilities such as multimodality and longer context windows, and to publish a Llama 3 research paper.
Meta wrote in the announcement, "Through Llama 3, we are committed to building open-source models that can compete with today's best proprietary models. We want to address developer feedback, improve the overall helpfulness of Llama 3, and continue to play a leading role in the responsible use and deployment of LLMs (large language models)."
On the 18th, Meta's stock price (Nasdaq: META) closed at $501.80 per share, up 1.54%, with a total market value of $1.28 trillion.
"The best open-source large model currently on the market"
According to Meta, Llama 3 delivers state-of-the-art performance on a range of industry benchmarks, offers new capabilities including improved reasoning, and is currently the best open-source large model on the market.
At the architecture level, Llama 3 uses a standard decoder-only Transformer architecture and a tokenizer with a 128K-token vocabulary. Llama 3 was pretrained on two 24K-GPU clusters built by Meta, using over 15T tokens of publicly available data, including 5% non-English data covering over 30 languages. The training dataset is seven times the size of the one used for Llama 2 and contains four times as much code.
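As a rough sanity check on these figures, the quoted numbers can be turned into approximate token counts. The sketch below is illustrative only: the constant and function names are made up for this example, and because Meta says "over 15T", the results should be read as approximate lower bounds rather than official figures.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# All inputs come from the article; all names here are illustrative.
TOTAL_TOKENS = 15e12         # "over 15T" tokens, so effectively a lower bound
NON_ENGLISH_SHARE = 0.05     # 5% non-English data
DATA_RATIO_VS_LLAMA2 = 7     # training data is "seven times" that of Llama 2


def implied_figures(total_tokens, non_english_share, ratio_vs_llama2):
    """Return (non-English tokens, implied Llama 2 training tokens)."""
    non_english_tokens = total_tokens * non_english_share
    llama2_tokens = total_tokens / ratio_vs_llama2
    return non_english_tokens, llama2_tokens


non_english, llama2_tokens = implied_figures(
    TOTAL_TOKENS, NON_ENGLISH_SHARE, DATA_RATIO_VS_LLAMA2
)
print(f"Non-English tokens: ~{non_english / 1e12:.2f}T")
print(f"Implied Llama 2 training set: ~{llama2_tokens / 1e12:.2f}T")
```

The implied Llama 2 figure of a little over 2T tokens is in line with the roughly 2T tokens Meta reported for Llama 2, so the "seven times" claim is consistent with the 15T total.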
According to Meta's test results, the Llama 3 8B model outperforms Gemma 7B and Mistral 7B Instruct on multiple benchmarks such as MMLU, GPQA, and HumanEval, while the 70B model surpasses Sonnet, the mid-tier version of the well-known closed-source model Claude 3, and scores three wins and two losses against Google's Gemini Pro 1.5.
Llama 3 performs exceptionally well on multiple performance benchmarks. Source: Meta official website
Beyond standard benchmark datasets, Meta also set out to optimize Llama 3's performance in real-world scenarios and developed a high-quality human evaluation set for this purpose. The set contains 1,800 prompts covering 12 key use cases, such as asking for advice, closed-ended question answering, brainstorming, coding, and writing, and is kept confidential, even from Meta's own modeling teams.
On this evaluation set, Llama 3 significantly outperforms Llama 2 and also surpasses well-known models such as Claude 3 Sonnet, Mistral Medium, and GPT-3.5.
Llama 3 achieved excellent results on the human evaluation set. Source: Meta official website
Although the 400B+ model of Llama 3 is still being trained, Meta has shown some of its test results, seemingly to benchmark it against Opus, the strongest version of Claude 3. However, Meta has not released comparisons between the larger Llama 3 model and GPT-4-class competitors.
The 400B+ model of Llama 3 is still being trained. Source: Meta official website
Llama 3 will soon be available to developers on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, with hardware platform support from AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm. To support responsible development with Llama 3, Meta will also provide new trust and safety tools, including Llama Guard 2, Code Shield, and CyberSec Eval 2.
Meanwhile, Meta has released an official web version of Meta AI built on Llama 3. The platform is still in its early stages and offers only two main functions: chat and image generation. Chat can be used without registering, while image generation requires users to register and log in to an account.
Injecting vitality into the open source community
Meta's AI path has always been closely linked to open source, and once Llama 3 was launched, it was warmly welcomed by the open source community.
Although some have complained about the size of Llama 3's 8k context window, Meta said it would expand the context window soon. Matt Shumer, CEO and co-founder of email startup Otherside AI, is also optimistic and said, "We are entering a new world where GPT-4 level models are open source and accessible for free."
According to Jim Fan, a senior research scientist at Nvidia, the upcoming larger Llama 3 model marks a "watershed" for the open-source community, one that could change how many academic researchers and startups make decisions. He added that a surge in vitality is expected across the entire ecosystem.
However, it is worth noting that Meta has not released the training data for Llama 3, stating only that it comes entirely from publicly available sources. Strictly speaking, so-called "open source" software should be fully open to the public throughout development and distribution, including the product's source code, training data, and other content. Previously, DBRX, billed as the "strongest open source model" and released by data company Databricks, not only required hardware far beyond that of an ordinary computer but also had this same issue.
The launch of Llama 3 closely follows progress on Meta's self-developed chips. Just last week, Meta announced the latest version of its in-house chip, MTIA, a custom chip series that Meta designed specifically for AI training and inference. Compared with MTIA v1, Meta's first-generation AI inference accelerator, which was officially announced in May last year, the latest version delivers significantly better performance and is designed specifically for the ranking and recommendation systems of Meta's social apps. Analysts say Meta's goal is to reduce its dependence on chip manufacturers such as Nvidia.