
Google Releases Game-Simulation AI: GameNGen (AI Reading Summary, 包阅AI)

包阅 Guided Reading Summary

1. Keywords: Google, GameNGen, video games, simulation, AI model

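2. Summary:

A Google research team has released GameNGen, a generative AI model that can simulate the video game Doom. It runs the simulation at 20 frames per second and, in human evaluations, came close to the real game. It is a modified open-source model; GameNGen itself has not been open-sourced, but the weights of the underlying model are available. The new paradigm may lower game development costs.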

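3. Main points:

– Google releases the GameNGen model

  – Modified from the open-source Stable Diffusion v1.4 text-to-image model

  – Generates game frames from previous frames and action inputs

– Training data and results

  – A game-playing agent trained with reinforcement learning collected around 900 million frames and their corresponding actions as training data

  – Can simulate and maintain the game's complex state

  – In human evaluation, judges preferred the simulated game clip 40% of the time

– Potential advantages and problems

  – Advantages include lower development costs and guarantees on frame rate and memory footprint

  – During development the model suffered from error accumulation and degrading sample quality, which was addressed by adding noise to the training data and giving the model a noise level input

  – GameNGen is not open-sourced, but the underlying model weights are available, which prompted discussion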


Article URL: https://www.infoq.com/news/2024/09/google-gamengen/?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=global

Source: infoq.com

Author: Anthony Alford

Published: 2024/9/10 0:00

Language: English

Total word count: 559 words

Estimated reading time: 3 minutes

Score: 86

Tags: AI in games, GameNGen, Stable Diffusion, Google Research, Doom simulation


The original article follows

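This content was reposted at a user's recommendation to share knowledge and viewpoints. If it infringes any rights, please contact us for removal. Contact email: media@ilingban.com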

A research team from Google recently published a paper on GameNGen, a generative AI model that can simulate the video game Doom. GameNGen can simulate the game at 20 frames per second (FPS) and in human evaluations was preferred only slightly less often than the actual game.

GameNGen (pronounced “game engine”) is based on the open-source Stable Diffusion v1.4 text-to-image model. Google modified it so that instead of generating an image from a text prompt, it generates a frame of gameplay from previous frames and an action input (such as a key press or mouse click). To create training data at scale, Google used a game-playing agent trained with reinforcement learning (RL), collecting around 900M frames along with corresponding actions. After training, the model is able to simulate and maintain the complex state of the real game, including player health and items. Google evaluated GameNGen by showing human judges a side-by-side comparison of video clips from the simulated game with clips from the real game. The judges preferred the simulated game clip 40% of the time. According to Google,

While many important questions remain, we are hopeful that this paradigm could have important benefits. For example, the development process for video games under this new paradigm might be less costly and more accessible, whereby games could be developed and edited via textual descriptions or example images. A small part of this vision, namely creating modifications or novel behaviors for existing games, might be achievable in the shorter term. For example, we might be able to convert a set of frames into a new playable level or create a new character just based on example images, without having to author code. Other advantages of this new paradigm include strong guarantees on frame rates and memory footprints.
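The generation loop described earlier (previous frames plus an action input go in, the next frame comes out, and that frame becomes context for the following step) can be illustrated with a small sketch. This is not Google's code: `toy_next_frame` is a hypothetical stand-in for the diffusion model, and `CONTEXT_LEN` and `FRAME_SHAPE` are made-up values chosen for illustration. The fixed-size context window also hints at why such a model can offer bounded memory use.

```python
from collections import deque

import numpy as np

CONTEXT_LEN = 4        # hypothetical: how many past frames/actions the model sees
FRAME_SHAPE = (8, 8)   # hypothetical: tiny toy resolution


def toy_next_frame(frames, actions):
    """Hypothetical stand-in for the generative model.

    It shifts the most recent frame horizontally by the latest action,
    so the effect of each action is easy to observe.
    """
    return np.roll(frames[-1], shift=actions[-1], axis=1)


def rollout(first_frame, player_actions, next_frame_fn=toy_next_frame):
    """Generate frames autoregressively from a fixed-size context window."""
    frames = deque([first_frame] * CONTEXT_LEN, maxlen=CONTEXT_LEN)
    actions = deque([0] * CONTEXT_LEN, maxlen=CONTEXT_LEN)
    generated = []
    for action in player_actions:
        actions.append(action)
        frame = next_frame_fn(list(frames), list(actions))
        frames.append(frame)  # the model's own output becomes future context
        generated.append(frame)
    return generated


if __name__ == "__main__":
    first = np.zeros(FRAME_SHAPE)
    first[:, 0] = 1.0  # a bright column at x = 0
    for frame in rollout(first, [1, 1, -1]):
        print(int(frame[0].argmax()))  # x position of the bright column
```

Because every generated frame is fed back in as context, any mistake the real model makes is carried into later steps, which is why this kind of loop is sensitive to accumulated error.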


GameNGen Architecture. Image Source: GameNGen Project Website

Google’s research paper on GameNGen cited the It Runs Doom subreddit, which is dedicated to “odd hardware that runs Doom.” Users in that subreddit started a discussion thread about GameNGen, with one describing it thus:

You ever dreamt of being in a game? Details are hazy, things aren’t all where they should be, but in general the game is recognizable. That’s what this is. The AI is recalling what should happen from its memory, but it literally doesn’t know a single thing about the game it’s in. It doesn’t know what the game’s code is or what the next level is, it’s just going off of memory because it’s watched so much Doom gameplay.

Users on Hacker News also discussed the model. One user noted that:

Apparently there’s more cause, effect, and sequencing in diffusion models than what I expected, which would be roughly ‘none’. Google here uses [Stable Diffusion] 1.4, as the core of the diffusion model, which is a nice reminder that open models are useful to even giant cloud monopolies.

The same user also called out a problem the researchers discovered while developing GameNGen. They noticed that initially, when the model generated game frames, it suffered from “error accumulation and fast degradation in sample quality.” To correct this, they added noise to the training data and included a noise level input to the model. This allowed the model to learn to “de-noise” its autoregressive output.
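As a rough illustration of that fix, the sketch below corrupts a stack of context frames with noise during training and returns the noise level so it can be passed to the model as an extra input. The details are assumptions rather than anything stated in the article: Gaussian noise, a discretized level, and the hypothetical constants `NUM_NOISE_BUCKETS` and `MAX_NOISE_STD`.

```python
import numpy as np

NUM_NOISE_BUCKETS = 10  # hypothetical: number of discrete noise levels
MAX_NOISE_STD = 0.7     # hypothetical: standard deviation at the highest level


def corrupt_context(context_frames, rng):
    """Add Gaussian noise to the context frames at a randomly chosen level.

    Returns the noisy frames and the integer noise level (bucket index).
    In training, the bucket would be fed to the model alongside the frames
    so it can learn how much to trust its context.
    """
    bucket = int(rng.integers(0, NUM_NOISE_BUCKETS))
    std = MAX_NOISE_STD * bucket / (NUM_NOISE_BUCKETS - 1)
    noise = rng.normal(0.0, 1.0, size=context_frames.shape) * std
    return context_frames + noise, bucket


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    context = np.zeros((4, 8, 8))  # 4 past frames of 8x8 pixels
    noisy, level = corrupt_context(context, rng)
    print(noisy.shape, level)
```

Training on deliberately degraded context resembles what the model encounters at inference time, when its context consists of its own imperfect earlier outputs.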

Although Google did not release the GameNGen code, the weights for the underlying open-source Stable Diffusion model are available on Hugging Face.