maple.backend.policy.gr00tn15

GR00T N1.5/N1.6 policy backend.

This module implements the policy backend for NVIDIA Isaac GR00T N1.5/N1.6, a foundation model for generalized humanoid robot reasoning and skills. GR00T takes visual observations, proprioceptive state, and natural language instructions as input and outputs robot actions using a flow matching transformer architecture.

GR00T is a cross-embodiment model that can be post-trained for specific robot platforms. It uses SigLIP 2 for vision encoding, T5 for text encoding, and a flow matching diffusion transformer for action prediction.
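As a rough illustration of how flow matching produces an action at inference time, the sketch below integrates a velocity field from Gaussian noise (t=0) to an action (t=1) with fixed Euler steps. It is not the GR00T implementation: `sample_action`, `toy_velocity`, `TARGET`, and the default step count are hypothetical, and the toy velocity function stands in for the learned transformer, which would also condition on images, state, and the instruction.

```python
import random

def sample_action(velocity_fn, action_dim, num_steps=4, seed=0):
    """Euler-integrate dx/dt = velocity_fn(x, t) from t=0 (noise) to t=1 (action).

    Hypothetical helper for illustration; num_steps=4 is an assumed default.
    """
    rng = random.Random(seed)
    # Start from a Gaussian noise sample.
    x = [rng.gauss(0.0, 1.0) for _ in range(action_dim)]
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        # One explicit Euler step along the predicted velocity.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy stand-in for the learned network: it always points at a fixed action.
TARGET = [0.5, -0.25, 1.0]

def toy_velocity(x, t):
    # On the straight-line path x_t = (1 - t) * x_0 + t * x_1, the velocity
    # that reaches x_1 at t=1 is (x_1 - x_t) / (1 - t). t < 1 at every step.
    return [(target - xi) / (1.0 - t) for target, xi in zip(TARGET, x)]

action = sample_action(toy_velocity, action_dim=len(TARGET))
```

With this toy velocity field the final Euler step lands on `TARGET` regardless of the starting noise; a trained model would instead predict the velocity from its inputs.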

Available versions:
- 3b: GR00T N1.5 3B parameter model
- n1.5-3b: GR00T N1.5 3B parameter model
- n1.6-3b: GR00T N1.6 3B parameter model (latest)
- latest: Alias for the N1.6 3B model
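The version strings above can be normalized to a canonical name before loading a checkpoint. The sketch below is a minimal, self-contained illustration and not this module's API: `VERSION_ALIASES`, `CANONICAL_VERSIONS`, and `resolve_version` are hypothetical names, and treating `3b` as an alias of `n1.5-3b` is an assumption based on the two entries sharing one description.

```python
# Hypothetical alias table built from the version list above.
VERSION_ALIASES = {
    "3b": "n1.5-3b",       # assumed alias: same description as n1.5-3b
    "latest": "n1.6-3b",   # documented alias for the N1.6 3B model
}
CANONICAL_VERSIONS = ("n1.5-3b", "n1.6-3b")

def resolve_version(name):
    """Map a user-supplied version string to a canonical version name."""
    key = name.strip().lower()
    key = VERSION_ALIASES.get(key, key)
    if key not in CANONICAL_VERSIONS:
        known = sorted(set(CANONICAL_VERSIONS) | set(VERSION_ALIASES))
        raise ValueError(f"Unknown GR00T version {name!r}; expected one of {known}")
    return key
```

Failing early on an unknown string keeps a typo such as `n1.5-3B-` from reaching the model loader.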

Supported embodiments and data configs:
- GR1: fourier_gr1_arms_only (Fourier GR1 humanoid)
- LIBERO: libero (LIBERO benchmark tasks)
- ALOHA: aloha, aloha_2 (Aloha bimanual)
- SO100: so100_dualcam (SO-100/SO-101 arms)
- OXE_DROID: oxe_droid (Open X-Embodiment DROID)
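The embodiment list above is effectively a lookup table from embodiment to data config names. The sketch below shows one way to represent and query it; `DATA_CONFIGS` and `embodiment_for` are hypothetical names for illustration, not part of the backend.

```python
# Hypothetical table mirroring the embodiment list above.
DATA_CONFIGS = {
    "GR1": ("fourier_gr1_arms_only",),
    "LIBERO": ("libero",),
    "ALOHA": ("aloha", "aloha_2"),
    "SO100": ("so100_dualcam",),
    "OXE_DROID": ("oxe_droid",),
}

def embodiment_for(data_config):
    """Return the embodiment that owns a data config name, or raise ValueError."""
    for embodiment, configs in DATA_CONFIGS.items():
        if data_config in configs:
            return embodiment
    raise ValueError(f"Unsupported data config: {data_config!r}")
```

A reverse lookup like this can be used to validate a data config string before a policy is constructed.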

Classes

GR00TN15Policy()

Backend for NVIDIA Isaac GR00T N1.5/N1.6 vision-language-action models.