Nvidia and other tech giants exposed for using YouTube data without authorization to train AI models, involving more than 170,000 videos
六月清晨搅
Posted on 2024-7-17 15:00:42
According to media reports, several large tech companies, including Apple, NVIDIA, Salesforce, and Anthropic, have been found to have trained their AI models on data taken from Google's video platform YouTube without authorization. The companies used a dataset supplied by a third party that contains a large volume of subtitle text scraped from YouTube videos, in violation of YouTube's ban on unauthorized scraping of content from the platform. The report says that all of these companies used a dataset called "YouTube Subtitles" when training their AI models. The dataset is 5.7 GB in size and contains 489 million words from 173,500 videos across more than 48,000 YouTube channels. It consists of plain-text video subtitles, including both subtitles uploaded by video creators and text automatically transcribed by YouTube. In addition to English, the subtitles usually come with translations into languages such as Japanese, German, and Arabic.
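LogoMoney.com is an information publishing platform and provides only information storage services.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of LogoMoney.com and does not constitute advice; please treat it with caution.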