maple.backend.policy.base.PolicyBackend.serve

PolicyBackend.serve(version: str, model_path: Path, device: str, host_port: int | None = None, model_load_kwargs: Dict[str, Any] | None = {}) → PolicyHandle

Start policy container and load model.

Orchestrates the complete container startup and model loading process:

1. Start Docker container with GPU support
2. Wait for container to become healthy
3. Load model weights with specified configuration
4. Verify model is ready for inference
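The four steps above can be pictured as a short sequence with a polling loop for the health check. The following is only a sketch of that ordering, not the actual implementation: `serve_sketch`, `wait_until_healthy`, and the four callables passed in are hypothetical names, and the timeout and polling interval are assumed values.

```python
import time
from typing import Any, Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 60.0,      # assumed value, not from the source
    interval_s: float = 1.0,      # assumed value, not from the source
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll `probe` until it returns True; raise TimeoutError after `timeout_s`."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return
        if clock() >= deadline:
            raise TimeoutError(f"container not healthy after {timeout_s} s")
        sleep(interval_s)


def serve_sketch(
    start: Callable[[], Any],
    probe: Callable[[], bool],
    load_model: Callable[[Any], None],
    verify_ready: Callable[[Any], None],
) -> Any:
    """Run the four documented steps in order. All callables are hypothetical."""
    handle = start()             # 1. start the container
    wait_until_healthy(probe)    # 2. wait for it to become healthy
    load_model(handle)           # 3. load the model weights
    verify_ready(handle)         # 4. confirm the model can serve inference
    return handle
```

Passing the clock and sleep functions as parameters keeps the polling loop easy to exercise without real delays.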

The container is configured with:

- Model weights mounted as read-only volume
- GPU device request if CUDA device specified
- Memory and shared memory limits
- Port mapping for HTTP API
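As an illustration of the configuration listed above, the sketch below assembles keyword arguments in the shape accepted by the Docker SDK for Python's `containers.run` (`volumes`, `ports`, `mem_limit`, `shm_size`, `device_requests`). The helper name `build_run_kwargs`, the in-container port, the mount point, and the limit values are all assumptions made for the example. The GPU request is written as a plain dict here; with the real SDK one would typically construct a `docker.types.DeviceRequest`.

```python
from pathlib import Path
from typing import Any, Dict, Optional

CONTAINER_PORT = "8000/tcp"   # assumed in-container HTTP API port
MODEL_MOUNT = "/models"       # assumed mount point inside the container


def build_run_kwargs(
    model_path: Path,
    device: str,
    host_port: Optional[int] = None,
    mem_limit: str = "16g",    # assumed limit
    shm_size: str = "2g",      # assumed limit
) -> Dict[str, Any]:
    """Build hypothetical container settings matching the documented list."""
    kwargs: Dict[str, Any] = {
        # Model weights mounted read-only.
        "volumes": {str(model_path): {"bind": MODEL_MOUNT, "mode": "ro"}},
        # A host port of None lets Docker pick a random free port.
        "ports": {CONTAINER_PORT: host_port},
        "mem_limit": mem_limit,
        "shm_size": shm_size,
        "detach": True,
    }
    # Request GPUs only when a CUDA device is specified.
    if device.startswith("cuda"):
        kwargs["device_requests"] = [{"count": -1, "capabilities": [["gpu"]]}]
    return kwargs
```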

Parameters:

version – Model version to serve (must exist in _hf_repos).
model_path – Filesystem path to model weights.
device – Device to load model on ('cpu', 'cuda:0', etc.).
host_port – Optional specific port to bind (random if None).
model_load_kwargs – Model-specific loading parameters.

Returns:

PolicyHandle for the running container.
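A call might look like the sketch below. The wrapper `start_policy` is a hypothetical helper, not part of the library; it only forwards arguments to `serve` using the parameter names from the signature above. Because the documented default for `model_load_kwargs` is a mutable `{}`, the sketch always passes a fresh dict rather than relying on the default. Valid `version` strings depend on the backend's `_hf_repos` and are not shown here.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def start_policy(
    backend: Any,                      # a concrete PolicyBackend instance
    version: str,
    weights_dir: str,
    device: str = "cpu",
    host_port: Optional[int] = None,   # None asks for a random port
    load_kwargs: Optional[Dict[str, Any]] = None,
) -> Any:
    """Forward arguments to backend.serve and return the resulting handle."""
    return backend.serve(
        version=version,
        model_path=Path(weights_dir),
        device=device,
        host_port=host_port,
        # Copy so the caller's dict and the shared default are never mutated.
        model_load_kwargs=dict(load_kwargs or {}),
    )
```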