DeXtreme achieves human-like in-hand manipulation while training entirely in simulation. The inset video shows a live visualisation (no physics involved) of the hand and cube states in Omniverse, i.e. the joint values of the hand and the pose of the cube are teleported into the simulation. The Omniverse video is recorded at 10 Hz while the real-world video is captured at 30 Hz, so the inset video may show some repeated frames and lag. The live visualisation is possible only because of the explicit representation used for tracking the cube, and we found it very helpful for diagnosing the performance of both the pose estimator and the policy.


We present our techniques to train a) a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand and b) a robust pose estimator suitable for providing reliable real-time information on the state of the object being manipulated.

DeXtreme policies are trained to adapt to a wide range of conditions in simulation. Consequently, our vision-based policies significantly outperform the best vision policies in the literature on the same reorientation task and are competitive with policies that are given privileged state information via motion capture systems.

Recent work has demonstrated the ability of deep reinforcement learning (RL) algorithms to learn complex robotic behaviours in simulation, including in the domain of multi-fingered manipulation. However, such models can be challenging to transfer to the real world due to the gap between simulation and reality.

Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation across diverse hardware and simulator setups, in our case the Allegro Hand and Isaac Gym GPU-based simulation. Furthermore, it opens up possibilities for researchers to achieve such results with commonly available, affordable robot hands and cameras.


All the videos below are recorded at 1x real time.

A video of a longer rollout: 86 CS

An example of a longer rollout where the hand achieves 86 consecutive successes (CS).

Goal frame hold rollout: 43 CS

An example of a rollout achieving 43 CS where the cube is held at the goal pose for 10 frames in a row.

Goal frame hold rollout: 29 CS

An example of a rollout achieving 29 CS where the cube is held at the goal pose for 10 frames in a row.

Goal frame hold rollout: 13 CS

An example of a rollout achieving 13 CS where the cube is held at the goal pose for 10 frames in a row.
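
The goal-hold criterion used in the rollouts above (the cube must stay at the goal pose for 10 frames in a row before a success is counted) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function name `count_goal_hold_successes`, the per-frame boolean input `at_goal`, and the assumption that a new goal is issued immediately after each success are all hypothetical and chosen only for illustration.

```python
from typing import Iterable


def count_goal_hold_successes(at_goal: Iterable[bool], hold_frames: int = 10) -> int:
    """Count successes in one rollout under a goal-hold criterion.

    at_goal: hypothetical per-frame flags, True when the cube pose is
             within tolerance of the current goal pose.
    hold_frames: number of consecutive at-goal frames required before a
                 success is counted (10 in the rollouts described above).
    """
    successes = 0
    streak = 0  # consecutive frames spent at the current goal
    for flag in at_goal:
        # Extend the streak while at the goal, otherwise start over.
        streak = streak + 1 if flag else 0
        if streak == hold_frames:
            successes += 1
            # Assumption: a new goal is sampled after each success,
            # so the hold counter restarts from zero.
            streak = 0
    return successes
```

Under these assumptions, a rollout whose cube leaves the goal on the ninth frame earns no success for that attempt, which is why goal-hold rollouts are a stricter test than rollouts that count a success on first contact with the goal pose.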


