ModularEvoGym | Yuxing Wang (王昱兴)

ModularEvoGym [1] is based on Evolution Gym [2], a large-scale benchmark for co-optimizing the design and control of Voxel-based Soft Robots (VSRs). We modified the original state representation to formulate a new modular observation space. The input state of the robot at time step $t$ is represented as $s_{t}^{c}=\lbrace s_{t}^{v},s_{t}^{g}\rbrace$, where $s_{t}^{v}=\lbrace s_{t}^{v_{1}}, s_{t}^{v_{2}},...,s_{t}^{v_N}\rbrace$, $s_{t}^{v_i}$ is composed of each voxel’s local information which contains the relative position of its four corners with respect to the center of mass of the robot and its material information (e.g., soft voxel, rigid voxel, horizontal actuator and vertical actuator). $s_{t}^{g}$ is the task-related observation such as terrain information of the environment and goal-relevant information. During the simulation, voxels (except empty voxels) only sense locally, and based on the input sensory information, a controller outputs control signals to vary the volume of actuator voxels. The morphology of the robot is unchangeable during the interaction with the environment.

With the support of ModularEvoGym, we can do many interesting things. We modeled the local observations of all voxels as a sequence and adopted the self-attention mechanism [3] to develop a more efficient controller (shown below) that handles incompatible state-action spaces. This controller can be trained by popular Reinforcement Learning algorithms (e.g., PPO [4]) to simultaneously control a variety of robot morphologies.

For encoding a VSR’s morphology, apart from using methods such as direct encoding and Compositional Pattern Producing Network (CPPN) [5], which rely on having access to the whole design space, we provided another choice, Neural Cellular Automata (NCA) [6], which takes multiple actions to grow a robot from an initial seed (morphology), as shown below. NCA encodes complex patterns in a neural network and generates different developmental outcomes while using a smaller set of trainable parameters.

Built upon the aforementioned successes, we also presented an efficient Curriculum-based Co-design (CuCo) method for learning to design and control VSRs through an easy-to-difficult process [7]. The following pictures show a developmental process of a walker agent.

$3 \times 3$ design space, $9$ voxels.

$5 \times 5$ design space, $25$ voxels.

$7 \times 7$ design space, $49$ voxels.

$9 \times 9$ design space, $81$ voxels.

$11 \times 11$ design space, $121$ voxels.

Reference

[1] ModularEvoGym
[2] Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots
[3] Attention Is All You Need
[4] Proximal Policy Optimization Algorithms
[5] Compositional Pattern Producing Network
[6] Neural Cellular Automata
[7] Curriculum-based Co-design of Morphology and Control of Voxel-based Soft Robots