OpenAI has trained a neural network to play the game Minecraft

Published: 2023-03-10

According to an article by Duncan MacRae for AInews.com, OpenAI has trained a neural network to play Minecraft using a small amount of labeled contractor data together with a large unlabeled video dataset of human Minecraft play.

The AI research and deployment company says that, with fine-tuning, its model can learn to craft diamond tools, a task that typically takes proficient humans over 20 minutes (24,000 actions). Its approach, which is fairly general and represents a step toward general computer-using agents, uses the native human interface of keypresses and mouse movements.

A representative for the Microsoft-backed company said: "The internet contains an enormous collection of publicly available videos that we can learn from. You can watch a person make a gorgeous presentation, a digital artist draw a beautiful sunset, and a Minecraft player build an intricate house. However, these videos only provide a record of what happened, not precisely how it was achieved, so you will not know the exact sequence of mouse movements and key presses.

"If we would like to build large-scale foundation models in these domains as we have done in language with GPT, this lack of action labels poses a new challenge not present in the language domain, where 'action labels' are simply the next words in a sentence."

OpenAI proposes a novel yet simple semi-supervised imitation learning method called Video PreTraining (VPT) to make use of the abundance of unlabeled video data available on the internet. To start, the team collects a small dataset from contractors in which it records both their video and the actions they took, in this case keypresses and mouse movements. With this data, the company trains an inverse dynamics model (IDM), which predicts the action being taken at each step in the video. Importantly, the IDM can use both past and future information to infer the action at each step.

"This task is much easier, and thus requires far less data, than the behavioral cloning task of predicting actions given past video frames only, which requires inferring what the person wants to do and how to accomplish it," the spokesperson continued. "We can then use the trained IDM to label a much larger dataset of online videos and learn to act via behavioral cloning."
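
The two-stage idea described above can be illustrated with a toy sketch. This is not OpenAI's implementation: the real system trains neural networks on video frames, whereas the example below assumes a hypothetical one-dimensional world in which a "frame" is just an integer position, the actions are left, stay and right, and both models are simple lookup tables. The names `demonstrate`, `train_idm`, `pseudo_label`, `behavioral_cloning` and `GOAL` are all invented for illustration.

```python
import random
from collections import Counter, defaultdict

ACTIONS = ("left", "stay", "right")
STEP = {"left": -1, "stay": 0, "right": 1}
GOAL = 5  # hypothetical target position that demonstrators walk toward


def demonstrate(start, length, rng, noise):
    """Simulate one player. Returns (frames, actions); a frame is a position."""
    frames, actions, pos = [start], [], start
    for _ in range(length):
        action = "right" if pos < GOAL else "left" if pos > GOAL else "stay"
        if rng.random() < noise:  # occasionally act at random
            action = rng.choice(ACTIONS)
        pos += STEP[action]
        frames.append(pos)
        actions.append(action)
    return frames, actions


def train_idm(labeled):
    """Non-causal IDM: sees the frame before AND after each step."""
    votes = defaultdict(Counter)
    for frames, actions in labeled:
        for t, action in enumerate(actions):
            votes[frames[t + 1] - frames[t]][action] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in votes.items()}


def pseudo_label(idm, frames):
    """Use the IDM to recover an action for every step of an unlabeled video."""
    return [idm[frames[t + 1] - frames[t]] for t in range(len(frames) - 1)]


def behavioral_cloning(videos, idm):
    """Causal policy: maps the current frame only to its most common action."""
    votes = defaultdict(Counter)
    for frames in videos:
        for frame, action in zip(frames, pseudo_label(idm, frames)):
            votes[frame][action] += 1
    return {frame: c.most_common(1)[0][0] for frame, c in votes.items()}


rng = random.Random(0)

# Step 1: a small labeled "contractor" dataset (video plus recorded actions).
labeled = [demonstrate(0, 8, rng, noise=0.0), demonstrate(10, 8, rng, noise=0.0)]
idm = train_idm(labeled)

# Step 2: a much larger set of videos with the actions thrown away.
videos = [demonstrate(i % 11, 15, rng, noise=0.1)[0] for i in range(550)]

# Step 3: pseudo-label the videos with the IDM, then clone the behavior.
policy = behavioral_cloning(videos, idm)

print("IDM:", idm)
print("Policy at positions 0, 5, 10:", policy[0], policy[GOAL], policy[10])
```

In this toy setting the IDM's job is trivial, because the difference between two neighboring frames fully determines the action, which mirrors the point that looking at both past and future makes labeling far easier than deciding what to do next. The policy, by contrast, sees only the current frame and has to learn the demonstrators' intent from the pseudo-labeled data.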

According to OpenAI, VPT paves the way for agents to learn to act by watching the vast number of videos available online.

According to the spokesperson, "VPT offers the exciting possibility of directly learning large-scale behavioral priors in more domains than just language, in contrast to contrastive methods that would only yield representational priors. Although we only experiment in Minecraft, the game's native human interface (mouse and keyboard) is very generic, so we believe our results bode well for other similar domains, such as computer usage."