Is 3D content generation the next step for multimodal AI? A new tool takes off on GitHub
jiangcr81
Posted on 2024-3-5 16:39:44
Multimodal content generation demonstrates the vast application space of AIGC, and 3D is expected to become the next modality to achieve breakthroughs.
Recently, a new AI tool called DUSt3R has become popular on GitHub, Microsoft's code hosting platform. It needs only 2 images and 2 seconds to complete a 3D reconstruction, without any additional measured data. The tool climbed to second place on the GitHub trending list shortly after its release. One user who tested it reconstructed his kitchen from two photos, and the entire process took less than 2 seconds.
Traditional 3D reconstruction with multi-view stereo (MVS) typically requires estimating camera parameters first and then triangulating corresponding pixels in 3D space. What sets DUSt3R apart from this traditional approach is that it offers a new paradigm for 3D reconstruction from arbitrary images, without prior information such as camera calibration or viewpoint poses. As a result, 3D modeling and reconstruction can be performed given only two or more images.
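To make the triangulation step of the traditional pipeline concrete, here is a minimal sketch in Python. It assumes the camera centers and the viewing-ray directions for one matched pixel are already known, which is the kind of prior information DUSt3R is described as not needing. It uses the midpoint method, one simple way to triangulate. The function name `triangulate_midpoint` and the sample numbers are illustrative assumptions and are not taken from DUSt3R or any library.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Triangulate a 3D point from two viewing rays (midpoint method).

    c1, c2: camera centers; d1, d2: ray directions through the matched pixels.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]      # vector from camera 2 to camera 1
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    denom = a * c - b * b                    # zero when the rays are parallel
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")

    t1 = (b * e - c * d) / denom             # parameter of closest point on ray 1
    t2 = (a * e - b * d) / denom             # parameter of closest point on ray 2

    p1 = [o + t1 * v for o, v in zip(c1, d1)]
    p2 = [o + t2 * v for o, v in zip(c2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Hypothetical setup: two cameras one unit apart, both seeing the same point.
point = triangulate_midpoint(
    [0.0, 0.0, 0.0], [0.5, 0.2, 2.0],        # camera 1 center and ray
    [1.0, 0.0, 0.0], [-0.5, 0.2, 2.0],       # camera 2 center and ray
)
print(point)  # recovers a point close to [0.5, 0.2, 2.0]
```

In a real pipeline the rays come from estimated camera intrinsics and poses, so errors in calibration propagate directly into the triangulated points. That dependence is what a calibration-free approach tries to avoid.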
3D modeling refers to the process of using software to create mathematical representations of 3D objects or shapes. 3D modeling technology and models are widely used in fields such as healthcare, gaming, film and television, architecture, product design, and virtual reality.
AI+3D modeling uses artificial intelligence to automatically generate high-quality 3D models. Traditional 3D modeling requires artists to spend a great deal of time and effort on manual creation, whereas AI generation trains machine learning algorithms so that computers can learn to produce 3D models automatically, greatly improving efficiency and accuracy while reducing overall production cost.
Users only need to input keywords or upload a 2D image, and these tools can directly generate multiple preliminary 3D models within a certain period of time. If the user is satisfied, they can choose to further generate more accurate 3D models.
Many 3D content generation tools are already available. Overseas, AI+3D technology is mainly divided into industrial and non-industrial applications. Non-industrial applications are represented mainly by Google's DreamFusion and Nvidia's Magic3D, which target the design of 3D assets for games and the metaverse. Industrial applications rely mainly on generative design software, such as PTC's Creo and Autodesk's Fusion 360, which provide generative design capabilities.
Well-known 3D generation AI models in China include MVDream, developed by the ByteDance research team, and DreamCraft3D, developed by DeepSeek, a large-model company under the quantitative fund High-Flyer Quant. In addition, Yingmou Technology has been generating 3D models through facial capture since 2016. According to the company's CTO, Zhang Qixuan, its facial 3D generation service is probably the only 3D generation product in China that has entered the game production pipeline.
From ChatGPT for text-to-text generation, to DALL·E for text-to-image, and then to Sora for text-to-video, multimodality has become a consensus trend in AI development. Zhongtai Securities has explicitly suggested that after text, code, images, and video, the next modality to break through is likely to be 3D, and that the next step after Sora is text-to-3D. As digitalization continues and 3D assets grow rapidly, automated generation of 3D models may become a new development trend. The production capacity of AIGC, represented by various large models, empowers 3D modeling, and the continuing growth in demand for text-to-3D will drive the rapid development of AI-empowered 3D modeling.
However, AI+3D modeling currently faces many challenges, including a shortage of 3D data and assets, the difficulty of training AI models, limited real-time rendering technology for AI, and difficulty in commercial deployment.
The institution further stated that, from an industry perspective, it recommends continuing to track progress in text-to-3D modeling, with a focus on Guanglianda and Yingjianke in the BIM field; Zhongwang Software, Sochen Technology, and Haochen Software in the CAX field; and Huada Jiutian and Gailun Electronics in the EDA field.
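LogoMoney.com is an information publishing platform and provides only information storage services.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of LogoMoney.com and does not constitute advice; please treat it with caution.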