Codebase Overview

Codebase Structure

We outline some important folders and files below.

  • mimicgen/scripts: utility scripts

    • generate_dataset.py: main script for data generation (see the example invocation after this list)

  • mimicgen/exps/templates: collection of data generation config json templates for each task

  • mimicgen/configs: implementation of data generation config classes

    • config.py: base config class

    • task_spec.py: TaskSpec object for specifying sequence of subtasks for each task

    • robosuite.py: robosuite-specific config classes

  • mimicgen/env_interfaces: implementation of Environment Interface classes that help simulation environments provide datagen info during data generation

  • mimicgen/datagen: implementation of core Data Generation classes

    • data_generator.py: DataGenerator class used to generate new trajectories

    • datagen_info.py: DatagenInfo class to group information from the sim environment needed during data generation

    • selection_strategy.py: SelectionStrategy classes that contain different heuristics for selecting source demos during each data generation trial

    • waypoint.py: collection of Waypoint classes to help end effector controllers execute waypoint targets and waypoint sequences

  • mimicgen/envs and mimicgen/models: the collection of robosuite simulation environments and assets released with this project

  • mimicgen/utils: collection of utility functions and classes

  • docs: files related to documentation
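For example, data generation is typically launched by pointing the main script at a config json produced from one of the templates above. The exact command line flags can vary by version, so treat this as an illustrative invocation and check the script's --help output:

```
$ python mimicgen/scripts/generate_dataset.py --config /path/to/config.json
```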

Important Modules

We provide more guidance below on some important modules and how they relate to one another.

MimicGen starts with a handful of source demonstrations and generates new demonstrations automatically. It treats each task as a sequence of object-centric subtasks and attempts to generate trajectories one subtask at a time. To do this, MimicGen must parse each source demonstration into contiguous subtask segments, which it does using Subtask Termination Signals. It also requires object poses at the start of each subtask, both in the source demonstrations and in the current scene during data generation. Object poses, subtask termination signals, and the other information needed at data generation time are collected into DatagenInfo objects, which are read from the source demonstrations and from the current scene. This information is provided through Environment Interface classes, which connect the underlying simulation environments to DatagenInfo objects.
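To make the data flow concrete, here is a minimal sketch of the kind of container a DatagenInfo object amounts to. The field names below are simplified stand-ins chosen to mirror the concepts above; the real class in mimicgen/datagen/datagen_info.py carries additional fields.

```python
from dataclasses import dataclass
from typing import Dict

import numpy as np


@dataclass
class DatagenInfoSketch:
    """Simplified stand-in for DatagenInfo (see mimicgen/datagen/datagen_info.py)."""

    # 4x4 end effector pose in the world frame at the current timestep
    eef_pose: np.ndarray

    # maps object name -> 4x4 object pose in the world frame
    object_poses: Dict[str, np.ndarray]

    # maps signal name -> 0/1 flag marking whether that subtask has ended
    subtask_term_signals: Dict[str, int]
```

An Environment Interface class fills in one of these at each timestep by querying the underlying simulation environment, both when parsing the source demonstrations and when observing the current scene during generation.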

Data generation is carried out by the DataGenerator class. Each data generation attempt requires choosing one or more subtask segments from the source demonstrations to transform; this choice is made by a SelectionStrategy instance. The transformation produces a sequence of end effector target poses for a controller to execute; these targets are managed by Waypoint classes.
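The sketch below shows the overall shape of one generation attempt. Every name in it (env.get_object_pose, env.move_eef_to, the segment fields, and so on) is a hypothetical stand-in rather than the real DataGenerator API; it is only meant to show how segment selection, pose transformation, and waypoint execution fit together.

```python
import numpy as np


def generate_one_attempt(env, task_spec, src_segments, selection_strategy):
    """Hypothetical sketch of one data generation attempt (not the real API)."""
    for i, subtask in enumerate(task_spec):
        # SelectionStrategy: pick one source subtask segment to transform.
        segment = selection_strategy.select(src_segments[i])

        # Re-express the source end effector poses relative to the relevant
        # object's pose in the current scene instead of its pose in the
        # source demonstration.
        cur_obj_pose = env.get_object_pose(subtask["object_ref"])  # 4x4 pose now
        src_obj_pose = segment.object_pose                         # 4x4 pose in source demo
        rel = cur_obj_pose @ np.linalg.inv(src_obj_pose)
        targets = [rel @ eef_pose for eef_pose in segment.eef_poses]

        # Waypoint classes: hand the transformed targets to the end effector
        # controller one at a time.
        for target in targets:
            env.move_eef_to(target)
```

The real pipeline also handles details this sketch omits, such as interpolating from the robot's current pose to the first transformed target and discarding attempts that fail to complete the task.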

The sequence of object-centric subtasks, along with the other important settings for each data generation run, is communicated to MimicGen through the TaskSpec object, which is read as part of the MimicGen config.
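As an illustration, the subtask sequence a TaskSpec encodes for a two-subtask pick-and-place style task might look like the following. This is written as plain Python data with hypothetical object names; the key names mirror the style of the json templates under mimicgen/exps/templates, but consult those templates and mimicgen/configs/task_spec.py for the authoritative schema.

```python
# Hypothetical two-subtask specification; entries are in execution order.
task_spec = [
    {
        "object_ref": "mug",                    # object this subtask's motion is relative to
        "subtask_term_signal": "grasp",         # signal that ends this subtask in source demos
        "subtask_term_offset_range": [10, 20],  # random end offset, in timesteps
        "selection_strategy": "nearest_neighbor_object",  # heuristic for picking a source segment
    },
    {
        "object_ref": "machine",
        "subtask_term_signal": None,            # final subtask runs to the end of the demo
        "subtask_term_offset_range": None,
        "selection_strategy": "random",
    },
]
```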