maple.backend.policy.smolvla.SmolVLAPolicy.act

SmolVLAPolicy.act(handle: PolicyHandle, payload: Any, instruction: str, model_kwargs: Dict[str, Any] | None = {}) → List[float]

Get action prediction for a single observation.

Sends visual observations, proprioceptive state, and a language instruction to the SmolVLA model and returns the predicted action.

Parameters:

handle – Policy handle for the running container.
payload – Observation payload containing image and state keys. Image keys are detected automatically and base64-encoded. Non-image keys (e.g., ‘state’) are passed through unchanged.
instruction – Natural language instruction for the task.
model_kwargs – Model-specific parameters (optional for SmolVLA).
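
The payload handling described above can be illustrated with a small sketch. This is not the library's implementation: the helper name `encode_payload`, the `image` key, and the rule "raw bytes mean image data" are all assumptions made for illustration, since the actual detection logic is not documented here.

```python
import base64
from typing import Any, Dict


def encode_payload(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: base64-encode image-like values, pass the rest through."""
    encoded: Dict[str, Any] = {}
    for key, value in payload.items():
        if isinstance(value, (bytes, bytearray)):
            # Assumption: raw bytes mark an image key. The real policy may
            # detect images differently (for example by key name or array shape).
            encoded[key] = base64.b64encode(bytes(value)).decode("ascii")
        else:
            # Non-image keys such as 'state' are left unchanged.
            encoded[key] = value
    return encoded


# 'image' is an assumed key name; 'state' is the key named in the description.
example = encode_payload({"image": b"\x89PNG", "state": [0.1, 0.2]})
```

Encoding to an ASCII string keeps the payload JSON-serializable, which is a common reason for base64-encoding images before sending them to a model server.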

Returns:

Predicted action as a list of floats in the target action space.
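
A call to `act` might be wrapped as in the sketch below. The wrapper `predict_action` and the `_StubPolicy` class are hypothetical and exist only so the example runs without a container; in real use, `policy` would be a `SmolVLAPolicy` instance and `handle` the `PolicyHandle` of a running container. The `image` payload key and the instruction text are also assumptions.

```python
from typing import Any, Dict, List, Optional


def predict_action(
    policy: Any,
    handle: Any,
    payload: Dict[str, Any],
    instruction: str,
    model_kwargs: Optional[Dict[str, Any]] = None,
) -> List[float]:
    """Hypothetical wrapper: call policy.act and check the returned action."""
    # Pass a fresh dict when no model-specific parameters are given.
    action = policy.act(handle, payload, instruction, model_kwargs=model_kwargs or {})
    if not all(isinstance(x, (int, float)) for x in action):
        raise TypeError("expected a list of numeric action values")
    return [float(x) for x in action]


class _StubPolicy:
    """Stand-in for SmolVLAPolicy so the sketch runs offline (not part of maple)."""

    def act(self, handle, payload, instruction, model_kwargs=None):
        # A real policy would query the model; the stub returns a fixed action.
        return [0, 0, 0]


action = predict_action(
    _StubPolicy(),
    handle=None,  # a real PolicyHandle in practice
    payload={"image": b"", "state": [0.0]},
    instruction="pick up the cube",
)
```

Checking the returned values before forwarding them to a robot or simulator is a cheap guard against a malformed response.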