maple.backend.policy.smolvla.SmolVLAPolicy.act

SmolVLAPolicy.act(handle: PolicyHandle, payload: Any, instruction: str, model_kwargs: Dict[str, Any] | None = {}) → List[float]

Get action prediction for a single observation.

Sends visual observations, proprioceptive state, and language instruction to the SmolVLA model and receives a predicted action.

Parameters:
  • handle – Policy handle for the running container.

  • payload – Observation payload containing image and state keys. Image keys are detected automatically and base64-encoded. Non-image keys (e.g., 'state') are passed through unchanged.

  • instruction – Natural language instruction for the task.

  • model_kwargs – Model-specific parameters (optional for SmolVLA).
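
The payload handling described above can be illustrated with a minimal sketch. The rule used to detect image keys is not documented here, so this example assumes that any bytes value is an image; encode_payload is a hypothetical helper written for illustration, not part of the maple API.

```python
import base64
from typing import Any, Dict


def encode_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: base64-encode image values, pass others through.

    Assumption: a value is treated as an image if it is raw bytes.
    The real detection rule used by SmolVLAPolicy.act may differ.
    """
    encoded: Dict[str, Any] = {}
    for key, value in payload.items():
        if isinstance(value, (bytes, bytearray)):
            # Image data becomes an ASCII-safe base64 string.
            encoded[key] = base64.b64encode(bytes(value)).decode("ascii")
        else:
            # Non-image keys such as 'state' are left untouched.
            encoded[key] = value
    return encoded


# Placeholder bytes stand in for a real encoded camera frame.
payload = {"image": b"\x89PNG fake bytes", "state": [0.0, 0.1, 0.2]}
encoded = encode_payload(payload)
```

After this call, encoded["image"] is a string that decodes back to the original bytes, and encoded["state"] is the same list that was passed in.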

Returns:

Predicted action as a list of floats in the target action space.
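
A call to act might look like the sketch below. Because a running container and a real PolicyHandle are not available here, StubPolicy is a hypothetical stand-in that only mirrors the call shape of the signature above; the handle value, the payload contents, and the 7-element action are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


class StubPolicy:
    """Hypothetical stand-in that mirrors the call shape of SmolVLAPolicy.act."""

    def act(
        self,
        handle: Any,
        payload: Any,
        instruction: str,
        model_kwargs: Optional[Dict[str, Any]] = None,
    ) -> List[float]:
        # A real policy would query the model container; the stub returns zeros.
        # The action dimension of 7 is an arbitrary assumption.
        return [0.0] * 7


def predict_action(policy: Any, handle: Any, payload: Any, instruction: str) -> List[float]:
    """Call act once and check that the result is a list of floats."""
    action = policy.act(handle, payload, instruction)
    if not isinstance(action, list) or not all(isinstance(x, float) for x in action):
        raise TypeError("expected a list of floats")
    return action


action = predict_action(
    StubPolicy(),
    handle="stub-handle",  # placeholder for a real PolicyHandle
    payload={"image": b"raw image bytes", "state": [0.0, 0.1]},
    instruction="pick up the red block",
)
```

With the real policy, the same call pattern would apply, with handle referring to the running container and the length of the returned list determined by the target action space.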