GPUStack Runner

This repository is the Docker image packaging hub for GPUStack Runner. It provides a collection of Dockerfiles for building images of various inference services across different accelerated backends.

Agenda

Directory Structure
Dockerfile Convention
Docker Image Naming Convention
Integration Process

Directory Structure

The pack skeleton is organized by backend:

pack
├── {BACKEND 1}
│   └── Dockerfile
├── {BACKEND 2}
│   └── Dockerfile
├── {BACKEND 3}
│   └── Dockerfile
├── ...
│   └── Dockerfile
└── {BACKEND N}
    └── Dockerfile
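
Because every backend lives in its own directory with a single Dockerfile, the available backends can be discovered by scanning the tree. The following is a minimal sketch under that assumption; `list_backends` is a hypothetical helper written for illustration and is not part of this repository.

```python
from pathlib import Path


def list_backends(pack_dir):
    """Return the sorted names of backend directories that contain a Dockerfile.

    Hypothetical helper: assumes the layout pack/{BACKEND}/Dockerfile shown above.
    """
    root = Path(pack_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        # Skip plain files and directories without a Dockerfile.
        if entry.is_dir() and (entry / "Dockerfile").is_file()
    )
```

Calling `list_backends("pack")` from the repository root would return one entry per backend directory.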

Dockerfile Convention

Each Dockerfile follows these conventions:

Begin with comments that describe the packaging logic step by step and explain how the build arguments (ARGs) are used.
Use ARG for all required and optional build arguments. If a required argument is unused, mark it as (PLACEHOLDER).
Use heredoc syntax for RUN commands to improve readability.

Example Dockerfile Structure

# Describe package logic and ARG usage.
#
ARG PYTHON_VERSION=...                                 # REQUIRED
ARG CMAKE_MAX_JOBS=...                                 # REQUIRED
ARG {OTHERS}                                           # OPTIONAL
ARG {BACKEND}_VERSION=...                              # REQUIRED
ARG {BACKEND}_ARCHS=...                                # REQUIRED
ARG {BACKEND}_{OTHERS}=...                             # OPTIONAL
ARG {SERVICE}_BASE_IMAGE=...                           # REQUIRED
ARG {SERVICE}_VERSION=...                              # REQUIRED
ARG {SERVICE}_{OTHERS}=...                             # OPTIONAL
ARG {SERVICE}_{FRAMEWORK}_VERSION=...                  # REQUIRED
ARG {SERVICE}_{FRAMEWORK}_{OTHERS}=...                 # OPTIONAL

# Stage Bake Runtime
FROM {BACKEND DEVEL IMAGE} AS runtime
SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG ...
RUN <<EOF
    # TODO: install runtime dependencies
EOF

# Stage Install Service
FROM {SERVICE}_BASE_IMAGE AS {service}
SHELL ["/bin/bash", "-eo", "pipefail", "-c"]
ARG TARGETPLATFORM
ARG TARGETOS
ARG TARGETARCH
ARG ...
RUN <<EOF
    # TODO: install service and dependencies
EOF

WORKDIR /
ENTRYPOINT [ "tini", "--" ]
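
Two of the conventions above (a leading comment block and heredoc `RUN` commands) are simple enough to check mechanically. The sketch below is a rough heuristic, not a Dockerfile parser; `check_dockerfile_conventions` is a hypothetical name and the repository may enforce these rules differently or not at all.

```python
def check_dockerfile_conventions(text):
    """Return a list of convention violations found in Dockerfile text.

    Heuristic only: it inspects lines one by one and does not parse
    Dockerfile syntax (continuation lines, for example, are not followed).
    """
    problems = []
    lines = text.splitlines()

    # Convention: the file begins with a descriptive comment block.
    non_empty = [line for line in lines if line.strip()]
    if not non_empty or not non_empty[0].lstrip().startswith("#"):
        problems.append("missing leading comment block")

    # Convention: RUN instructions use heredoc syntax (RUN <<EOF ... EOF).
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith("RUN ") and "<<" not in stripped:
            problems.append(f"line {number}: RUN does not use heredoc syntax")

    return problems
```

An empty result means neither heuristic found a problem; it does not prove the file follows every convention.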

Docker Image Naming Convention

The Docker image naming convention is as follows:

Multi-architecture image names: {NAMESPACE}/{REPOSITORY}:{TAG}.
Single-architecture image tags: {BACKEND}{BACKEND_VERSION%.*}[-{BACKEND_VARIANT}]-{SERVICE}{SERVICE_VERSION}-{OS}-{ARCH}.
Multi-architecture image tags: {BACKEND}{BACKEND_VERSION%.*}[-{BACKEND_VARIANT}]-{SERVICE}{SERVICE_VERSION}[-dev].
All names and tags must be lowercase.
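
The `{BACKEND_VERSION%.*}` notation follows shell parameter expansion: it removes the last dot and everything after it, so a three-part version keeps only its first two parts. The sketch below composes tags according to the templates above. The function names and the full version strings used in the usage note (`8.1.rc1`, `12.8.1`) are assumptions for illustration.

```python
def strip_last_component(version):
    """Mimic the shell expansion ${VAR%.*}: drop the last dot and what follows."""
    head, dot, _ = version.rpartition(".")
    # Without any dot the shell leaves the value unchanged.
    return head if dot else version


def build_tag(backend, backend_version, service, service_version,
              variant=None, os_name=None, arch=None, dev=False):
    """Compose an image tag following the naming templates (hypothetical helper).

    With os_name and arch the result is a single-architecture tag;
    otherwise it is a multi-architecture tag, optionally suffixed with -dev.
    """
    parts = [f"{backend}{strip_last_component(backend_version)}"]
    if variant:
        parts.append(variant)
    parts.append(f"{service}{service_version}")
    if os_name and arch:
        parts += [os_name, arch]
    elif dev:
        parts.append("dev")
    # All names and tags must be lowercase.
    return "-".join(parts).lower()
```

For example, `build_tag("cann", "8.1.rc1", "vllm", "0.9.2", variant="910b", os_name="linux", arch="amd64")` yields `cann8.1-910b-vllm0.9.2-linux-amd64`, and `build_tag("cuda", "12.8.1", "vllm", "0.9.2", dev=True)` yields `cuda12.8-vllm0.9.2-dev`.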

Example

NAMESPACE: gpustack
REPOSITORY: runner

| Accelerated Backend | OS/ARCH | Inference Service | Single-Arch Image Name | Multi-Arch Image Name |
| --- | --- | --- | --- | --- |
| Ascend CANN 910b | linux/amd64 | vLLM | `gpustack/runner:cann8.1-910b-vllm0.9.2-linux-amd64` | `gpustack/runner:cann8.1-910b-vllm0.9.2` |
| Ascend CANN 910b | linux/arm64 | vLLM | `gpustack/runner:cann8.1-910b-vllm0.9.2-linux-arm64` | `gpustack/runner:cann8.1-910b-vllm0.9.2` |
| NVIDIA CUDA 12.8 | linux/amd64 | vLLM | `gpustack/runner:cuda12.8-vllm0.9.2-linux-amd64` | `gpustack/runner:cuda12.8-vllm0.9.2` |
| NVIDIA CUDA 12.8 | linux/arm64 | vLLM | `gpustack/runner:cuda12.8-vllm0.9.2-linux-arm64` | `gpustack/runner:cuda12.8-vllm0.9.2` |

Build and Release Workflow

Build a single-architecture image for each OS/ARCH, e.g. gpustack/runner:cann8.1-910b-vllm0.9.2-linux-amd64.
Combine the single-architecture images into one multi-architecture image, e.g. gpustack/runner:cann8.1-910b-vllm0.9.2-dev.
After testing, rename the multi-architecture image to the final tag, e.g. gpustack/runner:cann8.1-910b-vllm0.9.2.
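
The three steps above could be driven with `docker buildx`. The sketch below only assembles the command lines and does not run them. It assumes `docker buildx build` for the per-platform images and `docker buildx imagetools create` both for combining them and for retagging; the repository's actual workflow may use different commands, and `release_commands` and its `context` parameter are hypothetical.

```python
def release_commands(repository, tag, platforms, context):
    """Return the docker command lines for build, combine, and promote.

    Hypothetical helper: nothing is executed, the caller decides how to run
    the returned argument lists (e.g. with subprocess.run).
    """
    single = [f"{repository}:{tag}-{p.replace('/', '-')}" for p in platforms]
    dev = f"{repository}:{tag}-dev"
    final = f"{repository}:{tag}"

    commands = []
    # Step 1: build and push one single-architecture image per platform.
    for platform, image in zip(platforms, single):
        commands.append(["docker", "buildx", "build", "--platform", platform,
                         "--tag", image, "--push", context])
    # Step 2: combine them into a multi-architecture image tagged -dev.
    commands.append(["docker", "buildx", "imagetools", "create",
                     "--tag", dev, *single])
    # Step 3: after testing, publish the same manifest under the final tag.
    commands.append(["docker", "buildx", "imagetools", "create",
                     "--tag", final, dev])
    return commands
```

Returning argument lists instead of running them keeps the sketch easy to inspect and lets the testing step sit between the second and third command.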

Integration Process

Ingesting a New Accelerated Backend

To add support for a new accelerated backend:

Create a new directory under pack/ named after the new backend.
Add a Dockerfile in the new directory following the Dockerfile Convention.
Update pack.yml to include the new backend in the build matrix.
Update matrix.yml to include the new backend and its variants.
Update _RE_DOCKER_IMAGE in runner.py to recognize the new backend.
[Optional] Update tests if necessary.

Ingesting a New Inference Service

To add support for a new inference service:

Modify the Dockerfile of the relevant backend in pack/{BACKEND}/Dockerfile to include the new service.
Update pack.yml to include the new service in the build matrix.
Update matrix.yml to include the new service.
Update _RE_DOCKER_IMAGE in runner.py to recognize the new service.
[Optional] Update tests if necessary.
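
Both procedures include updating `_RE_DOCKER_IMAGE` so that image names for the new backend or service are recognized. The actual pattern in runner.py is not reproduced here; the sketch below is a hypothetical pattern derived only from the naming convention in this document, and it shows why the list of known backends and services has to grow. The backend name `mybackend` is invented for the example.

```python
import re


def compile_image_pattern(backends, services):
    """Build a regex for image names that follow the naming convention.

    Hypothetical stand-in for _RE_DOCKER_IMAGE, not the repository's pattern.
    """
    backend_alt = "|".join(re.escape(b) for b in backends)
    service_alt = "|".join(re.escape(s) for s in services)
    return re.compile(
        rf"^(?:(?P<namespace>[a-z0-9._-]+)/)?(?P<repository>[a-z0-9._-]+):"
        rf"(?P<backend>{backend_alt})(?P<backend_version>\d+(?:\.\d+)*)"
        rf"(?:-(?P<backend_variant>[a-z0-9]+))?"
        rf"-(?P<service>{service_alt})(?P<service_version>\d+(?:\.\d+)*)"
        rf"(?:-(?P<os>[a-z]+)-(?P<arch>[a-z0-9]+))?"
        rf"(?P<dev>-dev)?$"
    )


# Known names before the integration (taken from the examples in this document).
known = compile_image_pattern(("cann", "cuda"), ("vllm",))
# After adding an invented backend, the alternation must include it.
extended = compile_image_pattern(("cann", "cuda", "mybackend"), ("vllm",))
```

With these definitions, `known.match("gpustack/runner:cann8.1-910b-vllm0.9.2-linux-amd64")` succeeds and exposes groups such as `backend_variant` and `arch`, while an image name for `mybackend` matches only the `extended` pattern.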

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License in the LICENSE file.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
