We used MimicGen to autonomously generate over 50,000 demonstrations from fewer than 200 human demonstrations across 18 tasks, multiple simulators, and the real world. This took considerably less human effort than prior work.
In the example below, MimicGen generated 1000 demos for each of 3 reset distributions from just 10 human demos.
10 Human Demos
Generated Dataset (Nominal Variant)
Generated Dataset (Greater Variability)
Generated Dataset (Greatest Variability)
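To put these figures in perspective, the short sketch below computes how many generated demonstrations correspond to each human demonstration. The counts are taken from the text above; the helper name `amplification` is purely illustrative and is not part of MimicGen.

```python
def amplification(generated: int, human: int) -> float:
    """Return the number of generated demos per human demo."""
    if human <= 0:
        raise ValueError("human must be a positive count")
    return generated / human


# Example from the text: 1000 generated demos for each of 3 reset
# distributions, all derived from the same 10 human demos.
per_variant = amplification(1000, 10)
all_variants = amplification(3 * 1000, 10)

# Overall claim: over 50,000 generated demos from fewer than 200 human demos.
# Using the two bounds directly gives a lower bound on the true ratio.
overall_lower_bound = amplification(50_000, 200)

print(per_variant, all_variants, overall_lower_bound)
```

Because the overall figure combines "over 50,000" with "fewer than 200", the last value is only a lower bound on the actual ratio.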
We showcase several datasets generated by MimicGen across broad task reset distributions below.
Three Piece Assembly
Square
Stack Three
Coffee
Threading
Mug Cleanup
Below, MimicGen generated demonstrations for unseen mugs.
10 human demos on 1 mug
1000 generated demos across 12 mugs
Below, MimicGen generated demonstrations for new robot arms.
10 human demos (Panda)
1000 generated demos (Sawyer)
1000 generated demos (IIWA)
1000 generated demos (UR5e)
Coffee Preparation
Kitchen
Pick Place
MimicGen works for contact-rich tasks requiring millimeter precision. Furthermore, it is simulator-agnostic -- these tasks are from Isaac Gym Factory, as opposed to the other robosuite MuJoCo tasks.
Gear Assembly
Frame Assembly
Mobile Kitchen
100 MimicGen demos on the Coffee task, with the pod placed anywhere in a large region
Stack Three D1 (91%)
Coffee D1 (93%)
Threading D1 (80%)
Square D1 (69%)
Three Piece Assembly D1 (61%)
Mug Cleanup O2 (67%)
Kitchen D1 (78%)
Coffee Preparation D1 (59%)
Nut-and-Bolt Assembly D1 (96%)
Gear Assembly D1 (76%)
Frame Assembly D1 (71%)
Mobile Kitchen D0 (77%)
Below, we show MimicGen datasets generated on Square D2 from two source datasets: 10 demos from a better-quality operator and 10 demos from a worse-quality operator (both are from the robomimic Square MH dataset). Surprisingly, policies trained on each dataset achieve comparable results, which suggests that in the large-scale data regime, the quality of the source demonstrations might not matter as much.
Dataset generated from 10 demos by the better-quality operator.
Dataset generated from 10 demos by the worse-quality operator.
Below, we show policies trained on 200 demos on Square D0 -- on the left, the policy was trained on 200 demos generated by MimicGen from 10 human demos, and on the right, the policy was trained on 200 human demos. The agent performance is comparable. This raises important questions about the presence of redundancies in large human datasets and when to request additional data from a human.
Square D0 Policy (79%) trained on 200 MimicGen demos.
Square D0 Policy (84%) trained on 200 human demos.
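One way to reason about whether two success rates such as 79% and 84% are "comparable" is to look at the uncertainty around each estimate. The sketch below computes a Wilson score interval for a success rate. Note that this page does not state how many evaluation rollouts were used, so `ASSUMED_ROLLOUTS = 100` is a hypothetical value chosen only for illustration, and the resulting intervals are not reported results.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial success rate (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    return center - half, center + half


# ASSUMPTION: the rollout count is not given on this page; 100 is hypothetical.
ASSUMED_ROLLOUTS = 100

mimicgen_lo, mimicgen_hi = wilson_interval(79, ASSUMED_ROLLOUTS)  # 79% policy
human_lo, human_hi = wilson_interval(84, ASSUMED_ROLLOUTS)        # 84% policy

# The two estimates are hard to distinguish if their intervals overlap.
intervals_overlap = mimicgen_lo <= human_hi and human_lo <= mimicgen_hi
print(intervals_overlap)
```

With a different number of rollouts the intervals would be wider or narrower, so the overlap check should be rerun with the actual evaluation count.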
The video above shows an example of MimicGen using a source human trajectory to generate a demonstration on a new scene.
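The idea of reusing a source trajectory in a new scene can be illustrated with a small geometric sketch. It assumes that the source end-effector waypoints are re-expressed relative to the object they interact with, and then mapped onto the object's pose in the new scene. This is a planar (x, y, heading) simplification written for illustration; the function names are hypothetical, and a real system would work with full 6-DoF poses and handle further details that are omitted here.

```python
import math
from typing import List, Tuple

Pose2D = Tuple[float, float, float]  # (x, y, heading in radians) in the world frame


def to_object_frame(pose: Pose2D, obj: Pose2D) -> Pose2D:
    """Express a world-frame pose relative to the object's frame."""
    dx, dy = pose[0] - obj[0], pose[1] - obj[1]
    c, s = math.cos(-obj[2]), math.sin(-obj[2])
    return (c * dx - s * dy, s * dx + c * dy, pose[2] - obj[2])


def to_world_frame(rel: Pose2D, obj: Pose2D) -> Pose2D:
    """Map an object-relative pose back into the world frame."""
    c, s = math.cos(obj[2]), math.sin(obj[2])
    return (obj[0] + c * rel[0] - s * rel[1],
            obj[1] + s * rel[0] + c * rel[1],
            rel[2] + obj[2])


def adapt_trajectory(source_traj: List[Pose2D], source_obj: Pose2D,
                     new_obj: Pose2D) -> List[Pose2D]:
    """Keep each waypoint's pose relative to the object, but in the new scene."""
    return [to_world_frame(to_object_frame(p, source_obj), new_obj)
            for p in source_traj]


# Source scene: the end effector approaches an object located at (0.5, 0.0).
source_obj = (0.5, 0.0, 0.0)
source_traj = [(0.3, 0.0, 0.0), (0.4, 0.0, 0.0), (0.5, 0.0, 0.0)]

# New scene: the object has moved and is rotated by 90 degrees.
new_obj = (0.2, 0.4, math.pi / 2)
new_traj = adapt_trajectory(source_traj, source_obj, new_obj)
```

In this example the final source waypoint coincides with the object, so the final adapted waypoint lands on the object's new position, approached from the rotated direction.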
Stack
Stack Three
Square
Coffee
Threading
Three Piece Assembly
Hammer Cleanup
Mug Cleanup
Pick Place
Nut Assembly
Kitchen
Coffee Preparation
Mobile Kitchen
Nut-and-Bolt Assembly
Gear Assembly
Frame Assembly
Stack (Real)
Coffee (Real)
Stack D0
Stack D1
Stack Three D0
Stack Three D1
Square D0
Square D1
Square D2
Coffee D0
Coffee D1
Coffee D2
Threading D0
Threading D1
Threading D2
Three Piece Assembly D0
Three Piece Assembly D1
Three Piece Assembly D2
Hammer Cleanup D0
Hammer Cleanup D1
Mug Cleanup D0
Mug Cleanup D1
Pick Place D0
Nut Assembly D0
Mobile Kitchen D0
Kitchen D0
Kitchen D1
Coffee Preparation D0
Coffee Preparation D1
Nut-and-Bolt Assembly D0
Nut-and-Bolt Assembly D1
Nut-and-Bolt Assembly D2
Gear Assembly D0
Gear Assembly D1
Gear Assembly D2
Frame Assembly D0
Frame Assembly D1
Frame Assembly D2
Stack Real D0
Stack Real D1
Coffee Real D0
Coffee Real D1
@inproceedings{mandlekar2023mimicgen,
title={MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations},
author={Mandlekar, Ajay and Nasiriany, Soroush and Wen, Bowen and Akinola, Iretiayo and Narang, Yashraj and Fan, Linxi and Zhu, Yuke and Fox, Dieter},
booktitle={7th Annual Conference on Robot Learning},
year={2023}
}
This work was made possible due to the help and support of Sandeep Desai (robot hardware), Ravinder Singh (IT), Alperen Degirmenci (compute cluster), Anima Anandkumar (access to robot hardware), Yifeng Zhu (some robosuite tasks and robot control software), Cheng Chi (diffusion policy experiments), Shuo Cheng (drawer design used in Coffee Preparation task), Balakumar Sundaralingam (code release), and Stan Birchfield (dataset release).