Codebase Overview
Codebase Structure
We outline some important folders and files below.
mimicgen/scripts: utility scripts
  generate_dataset.py: main script for data generation
mimicgen/exps/templates: collection of data generation config json templates for each task
mimicgen/configs: implementation of data generation config classes
  config.py: base config class
  task_spec.py: TaskSpec object for specifying sequence of subtasks for each task
  robosuite.py: robosuite-specific config classes
mimicgen/env_interfaces: implementation of Environment Interface classes that help simulation environments provide datagen info during data generation
mimicgen/datagen: implementation of core Data Generation classes
  data_generator.py: DataGenerator class used to generate new trajectories
  datagen_info.py: DatagenInfo class to group information from the sim environment needed during data generation
  selection_strategy.py: SelectionStrategy classes that contain different heuristics for selecting source demos during each data generation trial
  waypoint.py: collection of Waypoint classes to help end effector controllers execute waypoint targets and waypoint sequences
mimicgen/envs and mimicgen/models: files containing collection of robosuite simulation environments and assets released with this project
mimicgen/utils: collection of utility functions and classes
docs: files related to documentation
Important Modules
We provide more guidance on the most important modules and how they relate to one another.
MimicGen starts with a handful of source demonstrations and generates new demonstrations automatically. It treats each task as a sequence of object-centric subtasks and attempts to generate trajectories one subtask at a time. To do this, MimicGen must parse source demonstrations into contiguous subtask segments, which it does using Subtask Termination Signals. It also requires object poses at the start of each subtask, both in the source demonstrations and in the current scene during data generation. Object poses, subtask termination signals, and any other information needed at data generation time are collected into DatagenInfo objects, which are read both from the source demonstrations and from the current scene. This information is provided through Environment Interface classes, which connect the underlying simulation environments to DatagenInfo objects.
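As a rough illustration of how termination signals can split a demonstration into contiguous subtask segments, here is a minimal sketch. It is not MimicGen's implementation: the function name `split_into_subtask_segments` is hypothetical, and it assumes each signal is a per-timestep 0/1 array that switches to 1 once the corresponding subtask is complete, with no signal needed for the final subtask.

```python
import numpy as np


def split_into_subtask_segments(term_signals, num_steps):
    """Split a demo of num_steps timesteps into contiguous (start, end) segments.

    term_signals: one 0/1 array per subtask except the last (assumed format).
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    boundaries = [0]
    for i, signal in enumerate(term_signals):
        done_steps = np.flatnonzero(np.asarray(signal))
        if done_steps.size == 0:
            raise ValueError(f"signal {i} never fires, cannot segment the demo")
        # The first timestep where the signal is 1 starts the next subtask.
        boundary = int(done_steps[0])
        if boundary <= boundaries[-1]:
            raise ValueError(f"signal {i} fires out of order")
        boundaries.append(boundary)
    boundaries.append(num_steps)
    return list(zip(boundaries[:-1], boundaries[1:]))


# Example: an 8-step demo with three subtasks (two termination signals).
signals = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
segments = split_into_subtask_segments(signals, num_steps=8)
```

In this sketch the final segment simply runs to the end of the demonstration, which matches the idea that the last subtask ends when the task does.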
Data generation is carried out by the DataGenerator class. Each data generation attempt requires choosing one or more subtask segments from the source demonstrations to transform; this choice is made by a SelectionStrategy instance. The transformation involves keeping track of a collection of end effector target poses for a controller to execute, and these are managed by the Waypoint classes.
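The two steps described above, selecting a source segment and transforming it into end effector targets, can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the DataGenerator, SelectionStrategy, or Waypoint code: the function names are hypothetical, the selection heuristic shown is a plain nearest-neighbor match on object position, and the transform assumes that poses are 4x4 homogeneous matrices in the world frame and that each target should keep the same pose relative to the object as in the source segment.

```python
import numpy as np


def select_nearest_source(current_obj_pos, source_obj_positions):
    """Pick the source demo whose object position is closest to the current one."""
    diffs = np.asarray(source_obj_positions) - np.asarray(current_obj_pos)
    return int(np.argmin(np.linalg.norm(diffs, axis=1)))


def transform_eef_poses(src_eef_poses, src_obj_pose, cur_obj_pose):
    """Re-express source end effector poses relative to the current object pose.

    src_eef_poses: array of shape (N, 4, 4); object poses: shape (4, 4).
    """
    # Map from the source object frame to the current object frame.
    src_to_cur = cur_obj_pose @ np.linalg.inv(src_obj_pose)
    # matmul broadcasts the (4, 4) transform over all N source poses.
    return np.matmul(src_to_cur, np.asarray(src_eef_poses))


# Example: three source demos, object currently at x = 0.5.
best = select_nearest_source([0.5, 0.0, 0.0],
                             [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.4, 0.0, 0.0]])

src_obj = np.eye(4)
cur_obj = np.eye(4)
cur_obj[:3, 3] = [1.0, 2.0, 3.0]   # object moved by a pure translation
src_eef = np.eye(4)[None].copy()   # a single source end effector pose
src_eef[0, :3, 3] = [0.1, 0.0, 0.0]
targets = transform_eef_poses(src_eef, src_obj, cur_obj)
```

The resulting target poses would then be handed to a controller as a waypoint sequence; how they are interpolated and executed is outside the scope of this sketch.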
The sequence of object-centric subtasks and other important data generation settings for each data generation run are communicated to MimicGen through the TaskSpec object, which is read as part of the MimicGen config.
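To make the idea of a task specification concrete, the sketch below models a task as an ordered list of object-centric subtasks. It is a hypothetical stand-in, not the TaskSpec class in mimicgen/configs/task_spec.py: the class names, field names (`object_ref`, `term_signal`, `selection_strategy`), and the example values are all assumptions chosen for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class SubtaskSketch:
    object_ref: str                 # object the subtask is relative to (assumed field)
    term_signal: Optional[str]      # name of the termination signal, None for the last subtask
    selection_strategy: str = "random"


@dataclass
class TaskSpecSketch:
    subtasks: List[SubtaskSketch] = field(default_factory=list)

    def add_subtask(self, object_ref, term_signal=None, selection_strategy="random"):
        self.subtasks.append(SubtaskSketch(object_ref, term_signal, selection_strategy))
        return self

    def validate(self):
        # Every subtask except the last needs a signal marking where it ends.
        for i, subtask in enumerate(self.subtasks[:-1]):
            if subtask.term_signal is None:
                raise ValueError(f"subtask {i} is missing a termination signal")

    def to_list(self):
        # Plain dicts are convenient for storing in a json config.
        return [asdict(subtask) for subtask in self.subtasks]


# Example: a two-subtask task (grasp an object, then place it on a target).
spec = TaskSpecSketch()
spec.add_subtask(object_ref="cube", term_signal="grasp")
spec.add_subtask(object_ref="target")
spec.validate()
spec_as_list = spec.to_list()
```

Keeping the subtask sequence as plain data makes it straightforward to store alongside the other data generation settings in a config file.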