Introduction To The Reinforcement Learning Environment!
3 main points
✔️ Introduction to the environments used to train various reinforcement learning agents
✔️ Different environments have different evaluation objectives, including those for complex and long-horizon tasks
✔️ It is important to choose the right environment according to what you want to evaluate
Introduction
With recent advances in reinforcement learning research, various environments for training reinforcement learning agents have been proposed. So, which environment should we choose for the agent we want to train? In this article, we will introduce some of the most frequently used environments among the various reinforcement learning environments, to help you choose the right environment for your future experiments using reinforcement learning.
In this chapter, we introduce environments for applying reinforcement learning to robotics.
Robosuite is an environment that provides basic manipulation tasks (Lift, Assembly, etc.). It can also be used to evaluate a variety of controllers and robots, and currently provides models for Panda, Sawyer, IIWA (KUKA), Jaco, Kinova3, UR5e, and Baxter. In addition, this environment provides not only tasks solved by a single robot, but also tasks solved by multiple robots, as shown in the figure below. The beauty of this environment is that it is relatively easy to create a new environment yourself, making it a very useful benchmark environment.
IKEA Furniture Assembly
It is an environment for assembling furniture using robots and provides a long-horizon task that is very complex and requires the agent to run for a long period of time. The environment provides over 80 pieces of furniture, and the background, lights, and textures of the environment can be easily changed. Currently, models of the Baxter and Sawyer robots are available. The environment also provides access to depth and segmentation images.
This environment provides fifty different Sawyer robot manipulation tasks. It is mainly used for Multi-Task Learning and related settings, and it offers several evaluation modes that differ in the number of tasks used for training and testing.
Like the environments introduced above, RLBench provides various manipulation tasks. However, it is also designed for research fields such as geometric computer vision. It also comes with simple tutorials for research areas such as Few-shot Learning, Meta Learning, Sim-To-Real, and Multi-Task Learning.
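The idea of evaluation modes that differ in how many tasks are used for training and testing can be sketched as follows. This is only an illustration: the placeholder task names, the mode names, the split sizes, and the `split_tasks` helper are all hypothetical and are not taken from the environment's own API.

```python
def split_tasks(task_names, num_train, num_test):
    """Return (train_tasks, test_tasks) with no overlap between the two."""
    if num_train + num_test > len(task_names):
        raise ValueError("not enough tasks for the requested split")
    train = task_names[:num_train]
    # Held-out tasks are taken from the end of the list
    test = task_names[len(task_names) - num_test:]
    return train, test


# Fifty placeholder task names (hypothetical, for illustration only)
all_tasks = [f"task-{i:02d}" for i in range(50)]

# Two hypothetical evaluation modes with different train/test sizes
modes = {
    "small": split_tasks(all_tasks, num_train=10, num_test=5),
    "large": split_tasks(all_tasks, num_train=45, num_test=5),
}
```

Keeping the test tasks disjoint from the training tasks is what lets such a mode measure generalization to unseen tasks rather than performance on tasks already practiced.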
In this chapter, we present several environments for applying reinforcement learning to games.
Classic video games have been adapted to the Gym environment, and there are about 1000 different games available.
ViZDoom is a simulator based on the shooting game Doom, which can be used for reinforcement learning from image input. Several tasks are available in this environment as well, such as defeating enemies and collecting first aid kits.
This is an interface for running the StarCraft II game: the agent receives observations through the interface and sends actions back to the game through it. It is the environment primarily used in DeepMind's research, among others.
This is a simulator that lets an agent play Minecraft.
This chapter introduces so-called Suites, which are sets of tasks and environments of various types.
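The cycle of receiving observations and sending actions described above can be sketched in a few lines of Python. This sketch does not use the real StarCraft II interface: `ToyGame`, its `reset` and `step` methods, and `choose_action` are hypothetical stand-ins, assumed only to show the shape of the loop.

```python
class ToyGame:
    """Hypothetical stand-in for a game interface (not the real StarCraft II API)."""

    def __init__(self, target=3):
        self.target = target  # position the agent must reach
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return {"position": self.position, "target": self.target}

    def step(self, action):
        """Apply an action (-1 or +1) and return (observation, reward, done)."""
        self.position += action
        done = self.position == self.target
        reward = 1.0 if done else 0.0
        obs = {"position": self.position, "target": self.target}
        return obs, reward, done


def choose_action(obs):
    """Trivial hand-written policy: move toward the target."""
    return 1 if obs["position"] < obs["target"] else -1


def run_episode(game, max_steps=20):
    """Receive observations and send actions until the episode ends."""
    obs = game.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = choose_action(obs)
        obs, reward, done = game.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

In a learning setting, `choose_action` would be replaced by the agent's policy, while the loop itself stays the same.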
DeepMind Control Suite
This environment, provided by DeepMind, offers a variety of control tasks. Typical examples include Walker, a task that aims to make the agent walk, and Reacher, a task that aims to make an arm reach a target, both set in a 2D environment. It is often used as a benchmark in various studies, and it is easy to switch between giving the agent an image observation and a low-dimensional state.
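Switching between an image observation and a low-dimensional state can be pictured with a toy task. The sketch below is not the DeepMind Control Suite API: the `PointMass` class and its `from_pixels` flag are hypothetical names chosen for illustration.

```python
class PointMass:
    """Hypothetical 1-D toy task; not part of the DeepMind Control Suite."""

    WIDTH = 8  # number of pixels in the rendered image row

    def __init__(self, from_pixels=False):
        self.from_pixels = from_pixels
        self.position = 0  # integer cell index in [0, WIDTH - 1]
        self.velocity = 0

    def _state(self):
        # Low-dimensional state: position and velocity
        return [float(self.position), float(self.velocity)]

    def _pixels(self):
        # One-row grayscale "image": 255 marks the cell the point occupies
        return [255 if i == self.position else 0 for i in range(self.WIDTH)]

    def observe(self):
        """Return pixels or state depending on the flag set at construction."""
        return self._pixels() if self.from_pixels else self._state()

    def step(self, action):
        """Move by action (-1, 0 or +1), clipped to the image bounds."""
        new_position = min(max(self.position + action, 0), self.WIDTH - 1)
        self.velocity = new_position - self.position
        self.position = new_position
        return self.observe()
```

The same underlying dynamics produce both observation types, which is what makes it possible to compare state-based and image-based agents on an identical task.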
Open AI Gym Atari
This is the environment provided by OpenAI, where 59 Atari games are available and images are given as observations.
Open AI Gym Mujoco
This is an environment that provides tasks related to continuous control and uses a very fast physics simulator called MuJoCo. In this environment, a low-dimensional state is basically given as the observation.
Open AI Gym Robotics
This environment provides tasks in which a goal can be specified for two robots: ShadowHand (bottom left in the figure) and Fetch (bottom right).
In this chapter, we introduce environments for Navigation tasks.
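A task in which a goal is given to the robot is often paired with a sparse reward that only checks whether the goal has been reached. The following is a minimal sketch of that idea. The dictionary keys mirror the layout commonly used by goal-based Gym tasks, while the numeric values, the `threshold` default, and the `sparse_goal_reward` helper are assumptions made for illustration.

```python
import math


def sparse_goal_reward(achieved_goal, desired_goal, threshold=0.05):
    """Return 0.0 if the achieved goal is within `threshold` of the desired goal, else -1.0."""
    distance = math.dist(achieved_goal, desired_goal)
    return 0.0 if distance <= threshold else -1.0


# Illustrative goal-conditioned observation (values are made up)
obs = {
    "observation": [0.10, 0.20, 0.30],    # e.g. gripper position
    "achieved_goal": [0.10, 0.20, 0.30],  # where the object currently is
    "desired_goal": [0.10, 0.20, 0.80],   # where the object should end up
}

reward = sparse_goal_reward(obs["achieved_goal"], obs["desired_goal"])
```

Because the reward depends only on the achieved and desired goals, it can be recomputed for a different goal after the fact, which is useful for goal-relabeling methods.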
The environment is provided by DeepMind and offers tasks related to difficult 3D Navigation. It also offers a variety of tasks, such as puzzle-solving tasks.
It is a grid-based, lightweight, and fast environment that is mainly used for simple experiments. This environment is also easy to use because it is easily modifiable and extensible.
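To give a feel for why grid-based environments are lightweight and easy to modify, here is a minimal sketch of one written in plain Python. `TinyGrid` is a hypothetical class written for this article and does not reproduce the API of the environment described above.

```python
class TinyGrid:
    """Hypothetical grid world: the agent starts at (0, 0) and must reach the goal cell."""

    MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, size=4, walls=()):
        self.size = size
        self.walls = set(walls)           # cells the agent cannot enter
        self.goal = (size - 1, size - 1)  # bottom-right corner
        self.agent = (0, 0)

    def reset(self):
        """Put the agent back at the start and return its position."""
        self.agent = (0, 0)
        return self.agent

    def step(self, action):
        """Move one cell if possible and return (position, reward, done)."""
        d_row, d_col = self.MOVES[action]
        row, col = self.agent[0] + d_row, self.agent[1] + d_col
        inside = 0 <= row < self.size and 0 <= col < self.size
        if inside and (row, col) not in self.walls:
            self.agent = (row, col)
        done = self.agent == self.goal
        return self.agent, (1.0 if done else 0.0), done
```

Extending such an environment amounts to changing the grid size or the wall layout, for example `TinyGrid(size=8, walls={(1, 1), (2, 1)})`.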
AI2THOR is a simulator for home navigation in which the agent can interact with various furniture and objects. The agent's actions are basically discrete, and the environment is mainly suited to long-horizon tasks.
Gibson, like AI2THOR, is an environment that provides tasks related to indoor navigation, but in Gibson the robot is mainly moved by continuous control.
Habitat is a simulator provided by Facebook and is a photorealistic environment. Therefore, it is a simulator that can be considered effective for sim2real and other applications.
Multi-agent Particle Environment
A simple environment for training Multi-Agent RL, where observations are continuous values and actions are discrete.
OpenAI Multi-Agent Competition Environments
As shown in the figure below, it offers a variety of multi-agent tasks for continuous control and focuses mainly on competition between agents.
OpenAI Multi-Agent Hide and Seek
This is a multi-agent environment that also focuses on learning how to use given tools (objects).
Safety has attracted much attention in reinforcement learning research. In particular, when robots are operated, choosing the wrong action can lead to accidents. In this chapter, we introduce environments in which the safety of reinforcement learning can be evaluated.
This environment provides a total of 6 assistive tasks (ScratchItch, BedBathing, Feeding, Drinking, Dressing, ArmManipulation) and 4 types of robots (PR2, Jaco, Baxter, Sawyer). The human can be in one of two states: static, or moving according to the actions of a separate policy. In addition, the human can be either a male or a female model, each made up of 40 joints.
This environment has a number of obstacles as shown in the figure below, and is mainly used to study safety during reinforcement learning exploration.
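Studying safety during exploration usually means tracking how often the agent touches an obstacle separately from how much reward it collects. The following is a minimal sketch of that bookkeeping, assuming a hypothetical one-dimensional corridor with hazard cells. `HazardCorridor`, its cost convention, and `explore` are illustrative and are not the API of the environment described above.

```python
import random


class HazardCorridor:
    """Hypothetical corridor: reach the last cell; landing on a hazard incurs a cost."""

    def __init__(self, length=6, hazards=(2, 4)):
        self.length = length
        self.hazards = set(hazards)
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        """Action moves +1 or +2 cells forward. Returns (obs, reward, cost, done)."""
        self.position = min(self.position + action, self.length - 1)
        cost = 1.0 if self.position in self.hazards else 0.0
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, cost, done


def explore(env, rng, max_steps=20):
    """Random exploration that accumulates reward and safety cost separately."""
    env.reset()
    total_reward = total_cost = 0.0
    for _ in range(max_steps):
        _, reward, cost, done = env.step(rng.choice([1, 2]))
        total_reward += reward
        total_cost += cost
        if done:
            break
    return total_reward, total_cost
```

Keeping the cost out of the reward makes it possible to compare two agents that reach the same return but differ in how many unsafe states they visited along the way.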
Autonomous driving
Autonomous Vehicle Simulator
This is a simulator for automated driving from Microsoft AI & Research and is based on Unreal Engine / Unity.
CARLA is an environment where automated driving can be trained and evaluated, and the provided APIs can be used to set various environmental conditions such as traffic conditions, pedestrian behavior, and weather. It also has access to a variety of sensor data, including LIDARs, multiple cameras, depth sensors, and GPS. Furthermore, users can create their own maps.
This is a plug-in for GTAV (Grand Theft Auto V) that makes it possible to train autonomous driving from image input.
In this article, we have introduced various environments for reinforcement learning. Since you need to choose an environment depending on what you want to evaluate, it is very important to have some knowledge of what each environment can do or what you can evaluate. It is also interesting to think about what kind of new environment is needed if there is something that cannot be evaluated in the environment that exists now.
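Choosing an environment according to what you want to evaluate can be treated as a simple lookup. The sketch below encodes a few of the properties stated in this article. The field names, the category labels, and the `find_environments` helper are hypothetical, and the catalog is deliberately incomplete.

```python
# A few environments from this article, with properties as described in the text
CATALOG = [
    {"name": "AI2THOR", "category": "navigation", "actions": "discrete"},
    {"name": "Gibson", "category": "navigation", "actions": "continuous"},
    {"name": "Multi-agent Particle Environment", "category": "multi-agent", "actions": "discrete"},
    {"name": "OpenAI Multi-Agent Competition Environments", "category": "multi-agent", "actions": "continuous"},
    {"name": "Open AI Gym Mujoco", "category": "suite", "actions": "continuous"},
]


def find_environments(catalog, **requirements):
    """Return the names of environments whose fields match every requirement."""
    return [
        entry["name"]
        for entry in catalog
        if all(entry.get(key) == value for key, value in requirements.items())
    ]


# Example: indoor navigation with continuous control
candidates = find_environments(CATALOG, category="navigation", actions="continuous")
```

An empty result from such a lookup is one concrete signal that a new environment may be needed.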