Skip to content

User Guide

Welcome to gpuctl! This guide will help you master gpuctl's core features from scratch and efficiently manage GPU compute resources.

Contents

  • Quickstart


    Complete installation and submit your first job in under 5 minutes.

    Quickstart

  • Training Jobs


    LlamaFactory and DeepSpeed distributed training, with complete examples for single-node multi-GPU and multi-node multi-GPU scenarios.

    Training Jobs

  • Inference Services


    Deploy inference services using VLLM and similar frameworks, with multi-replica and auto-scaling support.

    Inference Services

  • Notebook


    Launch a JupyterLab environment with GPU resources attached in one command, for rapid prototyping.

    Notebook Development

  • Compute Jobs


    Deploy CPU services like nginx and redis without worrying about Kubernetes Deployment details.

    Compute Jobs

  • Resource Pool Management


    Partition nodes into resource pools for training/inference isolation and fine-grained scheduling.

    Resource Pool Management

  • Quotas & Namespaces


    Set CPU, memory, and GPU quotas per team or user to prevent resource abuse.

    Quotas & Namespaces


YAML Configuration Overview

All resources are defined through declarative YAML. The following describes the common fields:

kind: training          # Job type: training / inference / notebook / compute / pool / quota
version: v0.1           # Version, currently fixed at v0.1

job:
  name: my-job          # Job name (also used as the K8s resource name)
  priority: medium      # Priority: high / medium / low
  description: "..."    # Optional description

environment:
  image: my-image:tag   # Container image
  imagePullSecret: xxx  # Image pull secret (optional)
  command: [...]        # Startup command
  args: [...]           # Command arguments (optional)
  env:                  # Environment variables (optional)
    - name: KEY
      value: VALUE

resources:
  pool: default         # Resource pool name (default: default)
  gpu: 0                # Number of GPUs (0 for CPU-only jobs)
  gpu-type: A100-100G   # GPU model (optional, K8s schedules any GPU if omitted)
  cpu: 4                # CPU cores
  memory: 8Gi           # Memory size

service:                # Only applicable to inference / notebook / compute
  replicas: 1           # Number of replicas
  port: 8080            # Service port
  healthCheck: /health  # Health check path (optional)

storage:
  workdirs:             # Host directory mount list
    - path: /data/models
    - path: /output

Naming Rules

The job.name field is used directly as the Kubernetes resource metadata.name. Names must follow K8s naming conventions: lowercase letters, numbers, and hyphens only, max 63 characters.