AIRIS is a learning AI teaching itself how to play Minecraft




A new learning AI has been left to its own devices within an instance of Minecraft, where it is learning to play the game by doing, according to AI development company SingularityNET and the Artificial Superintelligence Alliance (ASI Alliance). The AI, named AIRIS (Autonomous Intelligent Reinforcement Inferred Symbolism), starts from nothing inside Minecraft and learns to play using only the game’s feedback loop.

AI has been set loose to learn a game before, but often in more linear 2D spaces. With Minecraft, AIRIS can enter a more complex 3D world and slowly start navigating and exploring to see what it can do and, more importantly, whether the AI can understand game design goals without necessarily being told them. How does it react to changes in the environment? Can it figure out different paths to the same place? Can it play the game with anything resembling the creativity that human players employ in Minecraft?

VentureBeat reached out to SingularityNET and ASI Alliance to ask why they chose Minecraft specifically.

“Early versions of AIRIS were tested in simple 2D grid world puzzle game environments,” a representative from the company replied. “We needed to test the system in a 3D environment that was more complex and open ended. Minecraft fits that description nicely, is a very popular game, and has all of the technical requirements needed to plug an AI into it. Minecraft is also already used as a Reinforcement Learning benchmark. That will allow us to directly compare the results of AIRIS to existing algorithms.”

They also provided a more in-depth explanation of how it works.

“The agent is given two types of input from the environment and a list of actions that it can perform. The first type of input is a 5 x 5 x 5 3D grid of the block names that surround the agent. That’s how the agent ‘sees’ the world. The second type of input is the current coordinates of the agent in the world. That gives us the option to give the agent a location that we want it to reach. The list of actions in this first version are to move or jump in one of 8 directions (the four cardinal directions and diagonally) for a total of 16 actions. Future versions will have many more actions as we expand the agent’s capabilities to include mining, placing blocks, collecting resources, fighting mobs, and crafting.

“The agent begins in ‘Free Roam’ mode and seeks to explore the world around it, building an internal map of where it has been that can be viewed with the included visualization tool. It learns how to navigate the world and as it encounters obstacles like trees, mountains, caves, etc. it learns and adapts to them. For example, if it falls into a deep cave, it will explore its way out. Its goal is to fill in any empty space in its internal map. So it seeks out ways to get to places it hasn’t yet seen.

“If we give the agent a set of coordinates, it will stop freely exploring and navigate its way to wherever we want it to go, exploring its way through areas that it has never seen. That could be on top of a mountain, deep in a cave, or in the middle of an ocean. Once it reaches its destination, we can give it another set of coordinates or return it to free roam to explore from there.

“The free exploration and ability to navigate through unknown areas is what sets AIRIS apart from traditional Reinforcement Learning. These are tasks that RL is not capable of doing regardless of how many millions of training episodes or how much compute you give it.”
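To make that description concrete, here is a minimal Python sketch of the interface the company describes: a 5 x 5 x 5 grid of block names, the agent's coordinates, and 16 move-or-jump actions. It is not AIRIS code. Every name in it (`Observation`, `make_observation`, `block_at`, `nearest_unseen`, `walkable`) is hypothetical, and the breadth-first search is a generic stand-in for "seek out places it hasn't yet seen," not the method AIRIS actually uses.

```python
from collections import deque
from dataclasses import dataclass
from itertools import product

# Eight horizontal headings: four cardinal plus four diagonal, as (dx, dz).
DIRECTIONS = [(dx, dz) for dx, dz in product((-1, 0, 1), repeat=2)
              if (dx, dz) != (0, 0)]

# Each heading can be walked or jumped, giving 8 * 2 = 16 actions.
ACTIONS = [(kind, heading) for kind in ("move", "jump") for heading in DIRECTIONS]


@dataclass
class Observation:
    blocks: list     # 5 x 5 x 5 nested list of block names around the agent
    position: tuple  # (x, y, z) world coordinates of the agent


def make_observation(position, block_at):
    """Build the 5 x 5 x 5 view centred on the agent.

    `block_at` is a hypothetical callback that returns a block name
    for a world coordinate; a real game bridge would supply it.
    """
    x, y, z = position
    offsets = range(-2, 3)
    blocks = [[[block_at((x + dx, y + dy, z + dz)) for dz in offsets]
               for dy in offsets]
              for dx in offsets]
    return Observation(blocks=blocks, position=position)


def nearest_unseen(start, seen, walkable, limit=10_000):
    """Breadth-first search over (x, z) cells for the closest cell not in `seen`.

    Assumption: a flat 2D map and a `walkable` predicate, both simplifications.
    Returns None if no unseen cell is found within `limit` visited cells.
    """
    queue = deque([start])
    visited = {start}
    while queue and len(visited) <= limit:
        cell = queue.popleft()
        if cell not in seen:
            return cell
        for dx, dz in DIRECTIONS:
            neighbour = (cell[0] + dx, cell[1] + dz)
            if neighbour not in visited and walkable(neighbour):
                visited.add(neighbour)
                queue.append(neighbour)
    return None
```

In this toy version, free roam would amount to repeatedly calling `nearest_unseen` and stepping toward the result, while a goal-directed run would swap in the target coordinates instead. How AIRIS itself chooses actions is not detailed in the company's statement.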

For game development, a successful use case for AIRIS might be automated bug and stress testing. A hypothetical AIRIS that can run across the entirety of Fallout 4 could create bug reports when interacting with NPCs or enemies, for example. Quality assurance testers would still need to check what the AI has documented, but it would speed up a laborious and often frustrating part of development.

Moreover, AIRIS is a first step toward self-directed learning for AI in complex, omnidirectional virtual worlds. That should be exciting for AI enthusiasts as a whole.


