InferX platform k8s deployment

inferx-net edited this page Aug 11, 2025 · 12 revisions

System Requirements

  • OS: Linux kernel > 5.8.0; tested on Ubuntu 20.04 and 22.04
  • Processor: X86-64/Amd64
  • Docker: > 17.09.0
  • KVM: bare metal, or a VM with nested virtualization. Enable virtualization technology in the BIOS (usually under the Security tab).
  • Memory: >=64GB
  • CUDA: >= 12.5
  • Kubernetes: K8S or K3S

The deployment uses the following container images:

  1. inferx/inferx_dashboard:v0.1.3: The InferX web UI dashboard.
  2. inferx/inferx_one:v0.1.3: The InferX platform services, such as the REST API gateway, the scheduler, etc.
  3. inferx/spdk-container:v0.1.3: Optional. A simple wrapper around https://spdk.io/. It is only needed when using the InferX blob store.
  4. quay.io/keycloak/keycloak:latest: The Keycloak image, used for authentication.
  5. postgres:14.5: Standard Postgres container. It stores the InferX audit log, secrets and the Keycloak configuration.
  6. quay.io/coreos/etcd:v3.5.13: Stores InferX configuration such as tenants, namespaces and model functions.
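
The host requirements listed above can be checked on a candidate node before deploying. The script below is a minimal sketch and not part of the InferX tooling: the helper names `version_gt` and `report` are made up for illustration, and it assumes GNU `sort -V`, a readable `/proc/meminfo`, and (for the CUDA line) an installed `nvidia-smi`. It only reports what it finds and never changes the system.

```shell
#!/bin/sh
# Pre-flight sketch for the requirements above (hypothetical helper, not shipped with InferX).

# version_gt A B: succeeds when version A is strictly greater than version B.
# Relies on GNU "sort -V" (version sort).
version_gt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report LABEL STATUS DETAIL: print one aligned result line.
report() {
  printf '%-8s %-5s %s\n' "$1" "$2" "$3"
}

# Kernel: drop the distro suffix, e.g. 5.15.0-91-generic -> 5.15.0
kernel=$(uname -r | sed 's/[^0-9.].*//')
if version_gt "$kernel" "5.8.0"; then
  report kernel OK "$kernel"
else
  report kernel WARN "$kernel (need > 5.8.0)"
fi

# KVM: /dev/kvm exists when virtualization is enabled and the kvm module is loaded.
if [ -e /dev/kvm ]; then
  report kvm OK "/dev/kvm present"
else
  report kvm WARN "/dev/kvm missing"
fi

# Memory: MemTotal is in kB and is usually a little below the installed size.
mem_gib=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
report memory INFO "${mem_gib} GiB reported (need >= 64GB installed)"

# Docker and CUDA: print what the tools themselves report.
if command -v docker >/dev/null 2>&1; then
  report docker INFO "$(docker --version)"
else
  report docker WARN "docker not found"
fi
if command -v nvidia-smi >/dev/null 2>&1; then
  report cuda INFO "$(nvidia-smi | grep -o 'CUDA Version: [0-9.]*')"
else
  report cuda WARN "nvidia-smi not found"
fi
```

Memory, Docker and CUDA are reported as INFO rather than pass/fail, since the reported values need a human reading against the thresholds above.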

Kube cluster setup steps

  1. Label the GPU node:
kubectl label node <NodeName> inferx_storage=data --overwrite
kubectl label node <NodeName> inferx_nodeType=inferx_file --overwrite
  2. Start the pods. In the inferx folder, run:
make runkblob
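
After these two steps it can help to confirm that the labels were applied and to see where the pods were scheduled. The helpers below are a sketch under assumptions: the function names and the `KUBECTL` override variable are hypothetical, and because this page does not state which namespace the pods run in, the pod listing covers all namespaces.

```shell
#!/bin/sh
# Sketch: verify the node labels from step 1 and list the pods started in step 2.
# KUBECTL is an override introduced for this sketch (e.g. KUBECTL="k3s kubectl" on K3S).
KUBECTL="${KUBECTL:-kubectl}"

# nodes_with_label KEY=VALUE: print matching nodes, one "node/<name>" per line.
nodes_with_label() {
  $KUBECTL get nodes -l "$1" -o name
}

# node_is_labeled NODE: succeeds when NODE carries both labels set in step 1.
node_is_labeled() {
  nodes_with_label "inferx_storage=data" | grep -qx "node/$1" &&
    nodes_with_label "inferx_nodeType=inferx_file" | grep -qx "node/$1"
}

# show_pods: list pods in every namespace together with the node they run on.
show_pods() {
  $KUBECTL get pods --all-namespaces -o wide
}
```

For example, `node_is_labeled <NodeName> && show_pods` prints the pod list only when both labels are present on the node.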

Submit models

  1. Clone the inferx repo:
git clone https://github.com/inferx-net/inferx.git
  2. Set up the environment variables:
export INFX_GATEWAY_URL="http://localhost:31501" # inferx_one exposes 31501 as the node port
export IFERX_APIKEY="87831cdb-d07a-4dc1-9de0-fb232c9bf286" # admin API key, configured in the nodeagent YAML
  3. Submit the first model:
cd inferx/config
/opt/inferx/bin/ixctl create public.json # create tenant
/opt/inferx/bin/ixctl create Qwen_namespace.json # create namespace
/opt/inferx/bin/ixctl create Qwen2.5-Coder-1.5B-Instruct.json # create first model
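
The three `ixctl create` calls are run in the order tenant, namespace, model. A small wrapper can stop at the first failure so that a later object is not submitted after an earlier one failed. This is a sketch only: `check_env`, `submit_all` and the `IXCTL` override are hypothetical names, and it assumes that `ixctl` exits with a non-zero status when a create fails, which this page does not confirm.

```shell
#!/bin/sh
# Sketch: submit the JSON files in the given order and stop at the first failure.
# IXCTL is an override introduced for this sketch; the default is the path used above.
IXCTL="${IXCTL:-/opt/inferx/bin/ixctl}"

# check_env: succeeds only when both variables exported in step 2 are non-empty.
check_env() {
  [ -n "${INFX_GATEWAY_URL:-}" ] && [ -n "${IFERX_APIKEY:-}" ]
}

# submit_all FILE...: run "ixctl create" on each file, in order.
submit_all() {
  check_env || { echo "INFX_GATEWAY_URL and IFERX_APIKEY must be set" >&2; return 1; }
  for f in "$@"; do
    [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
    # Assumption: a failed create returns a non-zero exit status.
    "$IXCTL" create "$f" || { echo "create failed: $f" >&2; return 1; }
  done
}
```

From `inferx/config`, the equivalent of the commands above would be `submit_all public.json Qwen_namespace.json Qwen2.5-Coder-1.5B-Instruct.json`.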