Policies & Environments Reference

This page provides detailed information about all available policies and environments, including their configuration parameters (kwargs) for both model loading and inference.

Policies

OpenVLA

Description

OpenVLA (Open Vision-Language-Action) is a 7B parameter transformer-based vision-language-action model for robotic manipulation. It takes visual observations and natural language instructions as input and outputs robot actions.

Available Versions

7b - OpenVLA 7B parameter model (openvla/openvla-7b)
latest - Alias for the 7B model (default)

Container Image

maplerobotics/openvla:latest

Inputs

image - Visual observation (RGB image)
instruction - Natural language task instruction

Outputs

action - Predicted robot action vector (unnormalized to target space)

Model Load Parameters (model_load_kwargs)

OpenVLA uses standard model loading and does not require additional kwargs for loading.

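Inference Parameters (model_kwargs)

Parameter	Type	Required	Description
unnorm_key	str	YES	Key of the dataset statistics used to unnormalize predicted actions (e.g., bridge_orig)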

Important Notes

unnorm_key is REQUIRED: OpenVLA outputs normalized actions that must be converted using dataset-specific statistics. Without unnormalization, the actions cannot be executed.
The unnorm_key must match the environment/dataset you’re evaluating on (e.g., use bridge_orig).
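
To illustrate what unnormalization does, here is a minimal sketch that assumes normalized actions lie in [-1, 1] and are mapped linearly onto per-dimension bounds selected by unnorm_key. The statistics table, its field names (low, high), and the unnormalize helper are hypothetical; the real statistics ship with the model and the exact formula may differ.

```python
from typing import Dict, List, Sequence

# Hypothetical per-dimension bounds for a 7-dimensional action
# (6 end-effector deltas + gripper). Real values come from the dataset.
EXAMPLE_STATS: Dict[str, Dict[str, List[float]]] = {
    "bridge_orig": {
        "low": [-0.03, -0.04, -0.03, -0.08, -0.09, -0.20, 0.0],
        "high": [0.03, 0.04, 0.03, 0.08, 0.09, 0.20, 1.0],
    }
}


def unnormalize(
    normalized: Sequence[float],
    unnorm_key: str,
    stats: Dict[str, Dict[str, List[float]]] = EXAMPLE_STATS,
) -> List[float]:
    """Map actions from [-1, 1] back to the dataset's action range."""
    if unnorm_key not in stats:
        raise KeyError(f"unknown unnorm_key {unnorm_key!r}; available: {sorted(stats)}")
    low, high = stats[unnorm_key]["low"], stats[unnorm_key]["high"]
    if not (len(normalized) == len(low) == len(high)):
        raise ValueError("action dimension does not match the statistics")
    # Clip to [-1, 1], then rescale each dimension linearly to [low, high].
    return [
        0.5 * (min(max(x, -1.0), 1.0) + 1.0) * (hi - lo) + lo
        for x, lo, hi in zip(normalized, low, high)
    ]
```

Under this scheme a normalized action of all zeros maps to the midpoint of each range, which is why the same model output means different physical motions under different unnorm_key values.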

SmolVLA

Description

SmolVLA (Small Vision-Language-Action) is a compact vision-language-action model for robotic manipulation. It supports multi-modal observations including images and proprioceptive state, and directly outputs actions in the target space without requiring unnormalization.

Available Versions

libero - SmolVLA fine-tuned for LIBERO benchmark (HuggingFaceVLA/smolvla_libero)
base - Base SmolVLA trained on diverse datasets (lerobot/smolvla_base)

Container Image

maplerobotics/smolvla:latest

Inputs

image - Visual observation (RGB image, can include multiple camera views)
state - Proprioceptive robot state (joint positions, velocities, etc.)
instruction - Natural language task instruction

Outputs

action - Predicted robot action vector in target action space

Model Load Parameters (model_load_kwargs)

SmolVLA uses standard model loading and does not require additional kwargs for loading.

Inference Parameters (model_kwargs)

SmolVLA does not require any additional inference parameters. All observations are passed through the adapter, and the model directly outputs actions in the target space.

Important Notes

SmolVLA handles multi-modal observations automatically through the adapter system.
The libero version is specifically fine-tuned for LIBERO tasks and may perform better than base on those benchmarks.
Unlike OpenVLA, SmolVLA does not require action unnormalization.

OpenPI

Description

OpenPI (π₀ / π₀.₅) is Physical Intelligence’s family of vision-language-action models for robotic manipulation. Available in multiple sizes and task-specific variants, OpenPI supports multi-modal observations and is trained on diverse real-world robot datasets.

Available Versions

Base models (for fine-tuning):

pi0_base - π₀ base model
pi0_fast_base - π₀ fast variant base model
pi05_base - π₀.₅ base model

DROID fine-tuned (mobile manipulation):

pi0_fast_droid
pi0_droid
pi05_droid

ALOHA fine-tuned (bimanual manipulation):

pi0_aloha_towel
pi0_aloha_tupperware
pi0_aloha_pen_uncap

LIBERO fine-tuned (long-horizon manipulation):

pi05_libero

Bridge and Fractal Dataset fine-tuned (long-horizon manipulation):

HaomingSong/openpi0-bridge-lora
HaomingSong/openpi0-fractal-lora

Aliases:

latest - Alias for pi05_droid (default)

Container Image

maplerobotics/openpi:latest

Model Source

Models are downloaded from Google Cloud Storage (gs://openpi-assets) using anonymous access (no credentials required).

Inputs

image - Visual observation (supports multiple camera views)
state - Proprioceptive robot state
prompt - Natural language instruction

Outputs

action - Predicted robot action vector in target action space

Model Load Parameters (model_load_kwargs)

Parameter	Type	Required	Description
config_name	str	NO	OpenPI model configuration identifier. Auto-inferred from the version if not provided. Values: same as version names (e.g., pi05_droid, pi0_base)

Inference Parameters (model_kwargs)

OpenPI does not require any additional inference parameters. All observations are passed through the adapter.

Important Notes

The config_name parameter is automatically inferred from the version, so manual specification is typically not needed.
OpenPI models are downloaded from a public Google Cloud Storage bucket (gs://openpi-assets) and require fsspec[gs] and gcsfs to be installed.
Different variants are optimized for different robot platforms (DROID for mobile manipulation, ALOHA for bimanual tasks, etc.).
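
The version-to-config inference described above can be sketched as follows. This is an illustration only: resolve_config_name and the ALIASES table are hypothetical names, and the sketch covers just the versions whose names double as configuration identifiers, plus the documented latest alias.

```python
from typing import Any, Dict, Optional

# Alias documented in "Available Versions"; other versions map to themselves.
ALIASES: Dict[str, str] = {"latest": "pi05_droid"}


def resolve_config_name(
    version: str,
    model_load_kwargs: Optional[Dict[str, Any]] = None,
) -> str:
    """Return the explicit config_name if given, else infer it from the version."""
    kwargs = model_load_kwargs or {}
    explicit = kwargs.get("config_name")
    if explicit:
        # An explicit value in model_load_kwargs takes precedence.
        return str(explicit)
    return ALIASES.get(version, version)
```

Passing config_name in model_load_kwargs is only needed when the automatic choice is not the configuration you want.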

Environments

LIBERO

Description

LIBERO (Language-Instructed Benchmarks for Embodied Robot Learning) is a suite of robotic manipulation tasks with natural language instructions. It uses MuJoCo for physics simulation with OSMesa for headless rendering.

Container Image

maplerobotics/libero:latest

Task Suites

Suite	Tasks	Description
`libero_spatial`	10	Spatial reasoning tasks
`libero_object`	10	Object manipulation tasks
`libero_goal`	10	Goal-conditioned tasks
`libero_10`	10	Diverse benchmark tasks
`libero_90`	90	Large-scale diverse task suite

Environment Setup Parameters (env_kwargs)

LIBERO environments are configured through task selection and do not require additional kwargs for standard usage.

Container Configuration

Rendering: OSMesa (headless, software rendering)
GPU: Not required (CPU-only)
Memory Limit: 4GB
Environment Variables: MUJOCO_GL=osmesa

Important Notes

LIBERO uses OSMesa for rendering, so no GPU or X11 display is required.
Task instructions are automatically loaded when setting up a task.
Use maple env list-tasks libero to see all available tasks.

RoboCasa

Description

RoboCasa is a large-scale simulation framework for training robots to perform everyday tasks in kitchen environments. It provides both atomic (single-step) and composite (multi-step) manipulation tasks.

Container Image

maplerobotics/robocasa:latest

Task Categories

Category	Tasks	Description
`atomic`	25	Low-level primitive operations that cannot be decomposed further
`composite`	97	High-level multi-step behaviors composed of atomic actions in structured sequences

Environment Setup Parameters (env_kwargs)

Parameter	Type	Required	Description
robot	str	NO	Robot to place in the environment. Defaults to PandaOmron if not provided. Values: robot names (e.g., PandaOmron, GR1)
layout_id	int	NO	Layout ID selecting the kitchen layout variation.
style_id	int	NO	Style ID selecting the kitchen style variation.

Container Configuration

Rendering: OSMesa (headless, software rendering)
GPU: Not required (CPU-only)
Memory Limit: 4GB
Environment Variables: MUJOCO_GL=osmesa

Important Notes

RoboCasa uses OSMesa for rendering, so no GPU or X11 display is required.
Composite tasks involve multi-step reasoning and are more challenging than atomic tasks.
Use maple env list-tasks robocasa to see all available tasks with instructions.
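
As a sketch of how the optional RoboCasa parameters might be assembled before being passed as env kwargs, the helper below builds the dictionary and serializes it to JSON. The function name robocasa_env_kwargs is hypothetical, and no range checks are applied because the valid layout and style IDs are not listed on this page.

```python
import json
from typing import Any, Dict, Optional


def robocasa_env_kwargs(
    robot: str = "PandaOmron",
    layout_id: Optional[int] = None,
    style_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Build env_kwargs, omitting optional IDs that were not set."""
    kwargs: Dict[str, Any] = {"robot": robot}
    for name, value in (("layout_id", layout_id), ("style_id", style_id)):
        if value is None:
            continue
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")
        kwargs[name] = value
    return kwargs


# JSON string suitable for a JSON-valued command-line option.
env_kwargs_json = json.dumps(robocasa_env_kwargs(robot="GR1", layout_id=2))
```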

AlohaSim

Description

AlohaSim is a simulation environment suite for the ALOHA (A Low-cost Open-source Hardware System for Bimanual Teleoperation) robot. It provides a collection of bimanual manipulation tasks for robot learning and evaluation with MuJoCo physics simulation.

Container Image

maplerobotics/alohasim:latest

Task Suites

Suite	Tasks	Description
`basic`	5	Basic bimanual manipulation tasks
`instruction`	12	Language-conditioned instruction following tasks
`dexterous`	3	Complex dexterous manipulation tasks

Environment Setup Parameters (env_kwargs)

AlohaSim environments are configured through task selection and do not require additional kwargs for standard usage.

Container Configuration

Rendering: EGL (headless rendering with hardware acceleration)
GPU: Recommended for EGL rendering
Memory Limit: 4GB
Environment Variables:
- MUJOCO_GL=egl
- PYOPENGL_PLATFORM=egl

Usage Example

# Start an AlohaSim environment
maple serve env alohasim

# Run evaluation on basic tasks
maple eval policy-id env-id \
    --tasks basic \
    --seeds 0,1,2

Important Notes

AlohaSim is designed for bimanual manipulation with the ALOHA robot platform.
The environment uses EGL for rendering, which can leverage GPU acceleration when available.
Tasks range from basic pick-and-place to complex dexterous manipulation requiring coordinated bimanual control.
Use maple env list-tasks alohasim to see all available tasks with instructions.

SimplerEnv

Description

SimplerEnv combines Bridge and Fractal simulation environments, providing tasks for both WidowX and Google Robot platforms with natural language instructions.

Container Image

maplerobotics/simplerenv:latest

Task Suites

Suite	Tasks	Description
`bridge`	4	Tasks with the WidowX robot
`fractal`	16	Tasks with the Google Robot

Environment Setup Parameters (env_kwargs)

SimplerEnv environments are configured through task selection and do not require additional kwargs for standard usage.

Container Configuration

Rendering: EGL (headless rendering with hardware acceleration)
GPU: Recommended for EGL rendering
Memory Limit: 4GB
Environment Variables:
- MUJOCO_GL=egl
- PYOPENGL_PLATFORM=egl
- SAPIEN_DISABLE_VULKAN_RAY_TRACING=1
- SAPIEN_DISABLE_VULKAN_RAY_QUERY=1

Important Notes

SimplerEnv uses EGL for rendering, which can leverage GPU acceleration when available.
Bridge tasks use the WidowX robot platform, while Fractal tasks use the Google Robot.
Use maple env list-tasks simplerenv to see all available tasks.
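
If you need to reproduce the container's rendering setup in your own process (for example, when debugging outside the container), the environment variables listed above can be applied as in this sketch. The helper name rendering_env is hypothetical; the variable names and values are the ones from Container Configuration.

```python
import os
from typing import Dict, Mapping, Optional

# Values copied from the SimplerEnv container configuration.
SIMPLERENV_VARS: Dict[str, str] = {
    "MUJOCO_GL": "egl",
    "PYOPENGL_PLATFORM": "egl",
    "SAPIEN_DISABLE_VULKAN_RAY_TRACING": "1",
    "SAPIEN_DISABLE_VULKAN_RAY_QUERY": "1",
}


def rendering_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base environment with the rendering variables set."""
    env = dict(os.environ if base is None else base)
    env.update(SIMPLERENV_VARS)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run, so the variables are in place before the simulator is imported in the child process.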

Common Patterns

Policy-Environment Compatibility

Policy	Environment	Required kwargs
OpenVLA	LIBERO	`unnorm_key="libero_spatial"` (or other LIBERO suite)
OpenVLA	SimplerEnv	`unnorm_key="bridge"` or `unnorm_key="fractal"`
SmolVLA (libero)	LIBERO	No kwargs needed
SmolVLA (base)	Multiple	No kwargs needed
OpenPI (libero)	LIBERO	No kwargs needed
OpenPI (droid)	Multiple	No kwargs needed
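
The table above can be encoded as a small lookup, shown here as a sketch. The names REQUIRED_MODEL_KWARGS and default_model_kwargs are hypothetical, and the unnorm_key values are the examples from the table; substitute the key for the suite you are actually evaluating.

```python
from typing import Any, Dict, Tuple

# Only OpenVLA needs model kwargs; every other pairing maps to {}.
REQUIRED_MODEL_KWARGS: Dict[Tuple[str, str], Dict[str, Any]] = {
    ("openvla", "libero"): {"unnorm_key": "libero_spatial"},
    ("openvla", "simplerenv"): {"unnorm_key": "bridge"},
}


def default_model_kwargs(policy: str, env: str) -> Dict[str, Any]:
    """Return a copy of the kwargs needed for a policy/environment pair."""
    key = (policy.lower(), env.lower())
    return dict(REQUIRED_MODEL_KWARGS.get(key, {}))
```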

Passing Model Kwargs

Model kwargs can be passed in two ways:

Via Command Line

# During evaluation
maple eval policy-id env-id \
    --tasks task_suite \
    --model-kwargs '{"unnorm_key": "libero_spatial"}'

# During single run
maple run policy-id env-id task_name \
    --model-kwargs '{"unnorm_key": "libero_spatial"}'
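
When launching evaluations from a script, hand-quoting the JSON is error-prone. The sketch below builds the same command as an argument list, using only the subcommand and flags shown above; the helper name eval_command is hypothetical.

```python
import json
import shlex
from typing import Any, Dict, List


def eval_command(
    policy_id: str,
    env_id: str,
    tasks: str,
    model_kwargs: Dict[str, Any],
) -> List[str]:
    """Build the argument list for an evaluation run."""
    return [
        "maple", "eval", policy_id, env_id,
        "--tasks", tasks,
        # json.dumps guarantees valid JSON with double-quoted keys.
        "--model-kwargs", json.dumps(model_kwargs),
    ]


cmd = eval_command(
    "policy-id", "env-id", "libero_spatial", {"unnorm_key": "libero_spatial"}
)
# shlex.join produces a shell-safe string for logging or copy-pasting.
printable = shlex.join(cmd)
```

Passing cmd as a list to subprocess.run avoids shell quoting altogether; printable is only for display.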

Via Configuration File

# config.yaml
evaluation:
  model_kwargs:
    unnorm_key: "libero_spatial"

Environment Kwargs

Environment kwargs can be passed similarly:

# During environment setup
maple run policy-id env-id task_name \
    --env-kwargs '{}'

Currently, LIBERO, AlohaSim, and SimplerEnv do not require environment kwargs for standard usage. RoboCasa accepts the optional robot, layout_id, and style_id parameters listed in its section above. Future environment backends may expose additional configuration options.

Adding Custom Parameters

When developing custom policies or environments, you can extend the kwargs system:

For Policies

Implement the act() method to accept and use model_kwargs:

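from typing import Any, Dict, List, Optional

def act(
    self,
    handle: PolicyHandle,
    payload: Any,
    instruction: str,
    model_kwargs: Optional[Dict[str, Any]] = None,
) -> List[float]:
    # Default to None rather than {}: a mutable default is shared across calls
    model_kwargs = model_kwargs or {}

    # Extract custom kwargs, falling back to defaults
    temperature = model_kwargs.get("temperature", 1.0)
    top_p = model_kwargs.get("top_p", 0.9)

    # Use in inference
    ...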

For Environments

Implement the setup() method to accept and use env_kwargs:

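from typing import Any, Dict, Optional

def setup(
    self,
    handle: EnvHandle,
    task: str,
    seed: Optional[int] = None,
    env_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict:
    # Default to None rather than {}: a mutable default is shared across calls
    env_kwargs = env_kwargs or {}

    # Extract custom kwargs, falling back to defaults
    render_mode = env_kwargs.get("render_mode", "rgb_array")
    camera_id = env_kwargs.get("camera_id", 0)

    # Use in setup
    ...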
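
One design choice worth considering in custom backends is rejecting unknown kwargs, so that a misspelled key fails loudly instead of silently falling back to a default. The sketch below shows one way to do this; merge_kwargs and the two defaults tables are hypothetical and reuse the example parameter names from the snippets above.

```python
from typing import Any, Dict, Mapping, Optional

POLICY_DEFAULTS: Dict[str, Any] = {"temperature": 1.0, "top_p": 0.9}
ENV_DEFAULTS: Dict[str, Any] = {"render_mode": "rgb_array", "camera_id": 0}


def merge_kwargs(
    defaults: Mapping[str, Any],
    overrides: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Overlay user kwargs on defaults, raising on keys that are not supported."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise ValueError(f"unsupported kwargs: {unknown}")
    # Later entries win, so user-supplied values replace the defaults.
    return {**defaults, **overrides}
```

Inside act() or setup(), a call such as merge_kwargs(POLICY_DEFAULTS, model_kwargs) would replace the individual .get() lookups.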

See the Adding Policies and Adding Environments guides for more details.