MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations

Ajay Mandlekar¹    Soroush Nasiriany*¹,²    Bowen Wen*¹    Iretiayo Akinola¹
Yashraj Narang¹    Linxi "Jim" Fan¹    Yuke Zhu¹,²    Dieter Fox¹
¹NVIDIA       ²The University of Texas at Austin
Published at CoRL 2023.

Update (July 2024): Data Generation Code has been released!


MimicGen automatically generates large datasets from a few human demonstrations

MimicGen produces large-scale datasets with minimal human effort

We used MimicGen to autonomously generate over 50,000 demonstrations from fewer than 200 human demonstrations across 18 tasks, multiple simulators, and the real world. This took considerably less human effort than prior work.

MimicGen can generate diverse datasets from just 10 human demos including...

New Reset Distributions

In the example below, MimicGen generated 1000 demos for each of 3 reset distributions from just 10 human demos.

10 Human Demos

Generated Dataset (Nominal Variant)

Generated Dataset (Greater Variability)

Generated Dataset (Greatest Variability)

We showcase several datasets generated by MimicGen across broad task reset distributions below.

Three Piece Assembly

Square

Stack Three

Coffee

Threading

Mug Cleanup

New Objects

Below, MimicGen generated demonstrations for unseen mugs.

10 human demos on 1 mug

1000 generated demos across 12 mugs

New Robot Hardware

Below, MimicGen generated demonstrations for new robot arms.

10 human demos (Panda)

1000 generated demos (Sawyer)

1000 generated demos (IIWA)

1000 generated demos (UR5e)

Long-Horizon Tasks

Coffee Preparation

Kitchen

Pick Place

High-Precision Tasks

MimicGen works for contact-rich tasks requiring millimeter precision. Furthermore, it is simulator-agnostic: these tasks are from Isaac Gym Factory, whereas the other tasks are robosuite tasks simulated in MuJoCo.

Gear Assembly

Frame Assembly

Mobile Manipulation

Mobile Kitchen

Real-World Tasks

100 MimicGen demos on Coffee task with pod placed anywhere in large region

MimicGen datasets can produce performant policies across diverse tasks with simple Behavioral Cloning

Stack Three D1 (91%)

Coffee D1 (93%)

Threading D1 (80%)

Square D1 (69%)

Three Piece Assembly D1 (61%)

Mug Cleanup O2 (67%)

Kitchen D1 (78%)

Coffee Preparation D1 (59%)

Nut-and-Bolt Assembly D1 (96%)

Gear Assembly D1 (76%)

Frame Assembly D1 (71%)

Mobile Kitchen D0 (77%)

MimicGen can produce good quality datasets and policies from human operators of differing quality.

Below, we show MimicGen datasets generated on Square D2 from two source datasets: 10 demos from a better quality operator and 10 demos from a worse quality operator (both are from the robomimic Square MH dataset). Surprisingly, policies trained on each dataset achieve comparable results, which suggests that in the large-scale data regime, data quality might not matter as much.

Dataset generated from 10 better quality operator demos.

Dataset generated from 10 worse quality operator demos.

Using MimicGen to generate as much data as a human operator collects can result in comparable policy performance.

Below, we show policies trained on 200 demos on Square D0 -- on the left, the policy was trained on 200 demos generated by MimicGen from 10 human demos, and on the right, the policy was trained on 200 human demos. The agent performance is comparable. This raises important questions about the presence of redundancies in large human datasets and when to request additional data from a human.

Square D0 Policy (79%) trained on 200 MimicGen demos.

Square D0 Policy (84%) trained on 200 human demos.

MimicGen Data Generation Example

The video above shows an example of MimicGen using a source human trajectory to generate a demonstration on a new scene.
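The idea of reusing a source trajectory in a new scene can be sketched in a few lines. This is only an illustration under stated assumptions, not the released MimicGen code: it assumes a source segment is stored as a sequence of 4x4 end-effector poses, that the relevant object's pose is known in both the source scene and the new scene, and that the segment is re-targeted by preserving the end-effector's pose relative to that object. The helper names `make_pose` and `transform_segment` are hypothetical.

```python
import numpy as np


def make_pose(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a yaw angle and an xyz translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    pose = np.eye(4)
    pose[:3, :3] = np.array([[c, -s, 0.0],
                             [s, c, 0.0],
                             [0.0, 0.0, 1.0]])
    pose[:3, 3] = translation
    return pose


def transform_segment(src_eef_poses, src_obj_pose, new_obj_pose):
    """Map source end-effector poses into a new scene (hypothetical helper).

    Each source pose is expressed relative to the source object pose and then
    re-attached to the object's pose in the new scene, so the end-effector
    keeps the same pose relative to the object.
    """
    delta = new_obj_pose @ np.linalg.inv(src_obj_pose)
    return np.stack([delta @ pose for pose in src_eef_poses])


# Source scene: object at x = 0.5, end-effector 0.1 ahead of and 0.1 above it.
src_obj = make_pose(0.0, [0.5, 0.0, 0.0])
src_eef = [make_pose(0.0, [0.6, 0.0, 0.1])]

# New scene: the object has moved and is rotated by 90 degrees about z.
new_obj = make_pose(np.pi / 2, [0.2, 0.3, 0.0])

new_poses = transform_segment(src_eef, src_obj, new_obj)
print(np.round(new_poses[0][:3, 3], 3))
```

In this toy setup, the relative end-effector pose with respect to the object is identical before and after the transformation; only the world-frame pose changes. A real system would also need to move the arm to the start of the transformed segment and check whether the resulting rollout succeeds, which this sketch leaves out.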

Tasks

Below, we visualize trajectories for each task.

Stack

Stack Three

Square

Coffee

Threading

Three Piece Assembly

Hammer Cleanup

Mug Cleanup

Pick Place

Nut Assembly

Kitchen

Coffee Preparation

Mobile Kitchen

Nut-and-Bolt Assembly

Gear Assembly

Frame Assembly

Stack (Real)

Coffee (Real)

Task Reset Distributions

Below, we show the reset distributions for each task.

Stack D0

Stack D1

Stack Three D0

Stack Three D1

Square D0

Square D1

Square D2

Coffee D0

Coffee D1

Coffee D2

Threading D0

Threading D1

Threading D2

Three Piece Assembly D0

Three Piece Assembly D1

Three Piece Assembly D2

Hammer Cleanup D0

Hammer Cleanup D1

Mug Cleanup D0

Mug Cleanup D1

Pick Place D0

Nut Assembly D0

Mobile Kitchen D0

Kitchen D0

Kitchen D1

Coffee Preparation D0

Coffee Preparation D1

Nut-and-Bolt Assembly D0

Nut-and-Bolt Assembly D1

Nut-and-Bolt Assembly D2

Gear Assembly D0

Gear Assembly D1

Gear Assembly D2

Frame Assembly D0

Frame Assembly D1

Frame Assembly D2

Stack Real D0

Stack Real D1

Coffee Real D0

Coffee Real D1

BibTeX

@inproceedings{mandlekar2023mimicgen,
    title={MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations},
    author={Mandlekar, Ajay and Nasiriany, Soroush and Wen, Bowen and Akinola, Iretiayo and Narang, Yashraj and Fan, Linxi and Zhu, Yuke and Fox, Dieter},
    booktitle={7th Annual Conference on Robot Learning},
    year={2023}
}

Acknowledgements

This work was made possible by the help and support of Sandeep Desai (robot hardware), Ravinder Singh (IT), Alperen Degirmenci (compute cluster), Anima Anandkumar (access to robot hardware), Yifeng Zhu (some robosuite tasks and robot control software), Cheng Chi (diffusion policy experiments), Shuo Cheng (drawer design used in Coffee Preparation task), Balakumar Sundaralingam (code release), and Stan Birchfield (dataset release).