maple.backend.policy.gr00tn15.GR00TN15Policy.act
- GR00TN15Policy.act(handle: PolicyHandle, payload: Any, instruction: str, model_kwargs: Dict[str, Any] | None = None) → List[float]
Get action prediction for a single observation.
Sends visual observations, proprioceptive state, and a language instruction to the GR00T model and receives a predicted action. GR00T uses flow matching to iteratively denoise actions from Gaussian noise.
The server expects observations in the following format:
- observation.images.*: Base64-encoded camera images
- observation.state: Robot proprioceptive state as a list
- prompt: Natural language instruction
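As an illustration of the server-side key layout, here is a minimal sketch that assembles such a dict by hand. The helper name `build_server_observation`, the camera name `front`, and the placeholder image bytes are all hypothetical and not part of maple. The sketch also assumes the images are already compressed (for example PNG or JPEG) before Base64 encoding, which the documentation does not specify.

```python
import base64
from typing import Any, Dict, List


def build_server_observation(
    images: Dict[str, bytes],
    state: List[float],
    instruction: str,
) -> Dict[str, Any]:
    """Assemble a dict with the key layout listed above.

    Hypothetical helper for illustration only; not part of maple.
    `images` maps a camera name to already-encoded image bytes.
    """
    request: Dict[str, Any] = {}
    for camera_name, image_bytes in images.items():
        # Base64 output is pure ASCII, so the decoded string is JSON-safe.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        request[f"observation.images.{camera_name}"] = encoded
    # Proprioceptive state is sent as a plain list of floats.
    request["observation.state"] = [float(x) for x in state]
    request["prompt"] = instruction
    return request


example = build_server_observation(
    images={"front": b"\x89PNG placeholder bytes"},  # placeholder, not a real image
    state=[0.0, 0.5, -0.25],
    instruction="pick up the red block",
)
```

In normal use `act` is expected to handle this formatting, so a dict like this is mainly useful for inspecting or debugging what reaches the server.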
- Parameters:
handle – Policy handle for the running container.
payload – Observation payload containing:
  - Image keys (e.g., 'image', 'wrist_image'): camera observations
  - 'state' or 'observation.state': robot proprioceptive state
instruction – Natural language instruction for the task.
model_kwargs – Optional runtime parameters. Not used for GR00T; configuration is done at load time via model_load_kwargs.
- Returns:
Predicted action as list of floats.
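The payload keys accepted by `act` ('image', 'wrist_image', 'state') differ from the keys the server expects (observation.images.*, observation.state, prompt). The sketch below shows one plausible way the two could be reconciled. The function `to_server_keys` and its rule that any non-state key is treated as a camera image are assumptions made for illustration; the actual translation inside GR00TN15Policy may differ.

```python
from typing import Any, Dict

# Both spellings of the state key are accepted, per the parameter description.
STATE_KEYS = ("state", "observation.state")


def to_server_keys(payload: Dict[str, Any], instruction: str) -> Dict[str, Any]:
    """Hypothetical mapping from client payload keys to server keys."""
    request: Dict[str, Any] = {"prompt": instruction}
    for key, value in payload.items():
        if key in STATE_KEYS:
            request["observation.state"] = list(value)
        elif key.startswith("observation.images."):
            # Already in server form; pass through unchanged.
            request[key] = value
        else:
            # Assumption: every remaining key names a camera image.
            request[f"observation.images.{key}"] = value
    return request


mapped = to_server_keys(
    {
        "image": "<base64 string>",        # placeholder value
        "wrist_image": "<base64 string>",  # placeholder value
        "state": [0.1, 0.2],
    },
    "open the drawer",
)
```

Under these assumptions, `mapped` holds the keys `observation.images.image`, `observation.images.wrist_image`, `observation.state`, and `prompt`.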