
Instructions

This page demonstrates how to run the Triton Inference Server with MinIO object storage acting as the model repository for inference.

For further information, please check the official Triton Inference Server and MinIO documentation.

Docker-compose sample

docker-compose.yml
version: "3.8"
services:
  tritonserver:
    image: nvcr.io/nvidia/tritonserver:21.03-py3
    #image: nvcr.io/nvidia/tritonserver:22.07-py3
    ports:
      - "8888:8000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]
    environment:
      AWS_ACCESS_KEY_ID: user
      AWS_SECRET_ACCESS_KEY: user123456
    command: tritonserver --model-store=s3://minio:9000/models --model-control-mode="poll" --log-info true --strict-model-config=false
    shm_size: 20gb
    ulimits:
      stack: 67108864
      memlock: -1
    container_name: test_triton
    restart: unless-stopped
    networks:
      cicdnetwork:
        ipv4_address: 172.25.100.3

  minio:
    image: minio/minio:RELEASE.2022-03-24T00-43-44Z
    shm_size: 20gb
    volumes:
      - ./database/data:/data:rw
      - ./database/config:/root/.minio
    ports:
      - "9020:9000"
      - "9021:9001"
    environment:
      MINIO_ROOT_USER: user
      MINIO_ROOT_PASSWORD: user123456
    entrypoint: sh
    container_name: test_minio
    command: -c 'mkdir -p /data/models && minio server /data --console-address "0.0.0.0:9001"'
    restart: unless-stopped
    networks:
      cicdnetwork:
        ipv4_address: 172.25.100.9

networks:
  cicdnetwork:
    driver: bridge
    ipam:
      config:
        - subnet: 172.25.100.0/16
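
Because the server is started with --model-control-mode="poll", Triton periodically rescans the models bucket, so pushing a model into MinIO is enough to have it loaded. Below is a minimal sketch of uploading a local Triton model directory with boto3; the model name my_model, the local directory layout, and host port 9020 (the S3 port published by the compose file above) are assumptions for illustration, not part of the original setup.

upload_model.py
# Minimal sketch: copy a local Triton model directory into the MinIO
# "models" bucket created by the minio entrypoint in the compose file above.
# The model name, local path, and endpoint/credentials are assumptions
# taken from that compose file.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9020",      # MinIO S3 port published above
    aws_access_key_id="user",                  # matches MINIO_ROOT_USER
    aws_secret_access_key="user123456",        # matches MINIO_ROOT_PASSWORD
)

model_dir = "./my_model"   # hypothetical local directory: config.pbtxt, 1/model.onnx, ...
bucket = "models"          # bucket backing s3://minio:9000/models

# Upload every file so the bucket ends up with my_model/config.pbtxt,
# my_model/1/model.onnx, etc. Triton's poll mode picks the model up shortly after.
for root, _, files in os.walk(model_dir):
    for name in files:
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, ".").replace(os.sep, "/")
        s3.upload_file(local_path, bucket, key)
        print(f"uploaded {local_path} -> s3://{bucket}/{key}")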


Check model

You can use the following URL in a browser, or with curl, to get a model's configuration. Note that with the compose file above, Triton's HTTP port 8000 is published on host port 8888, so from the Docker host you would use port 8888 instead.

http://(Server IP):8000/v2/models/(Model name)/config
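
The same endpoints can be called from a script. Below is a small sketch using Triton's HTTP/REST API; it assumes the compose file above (host port 8888) and a placeholder model name my_model.

check_model.py
# Minimal sketch: query server readiness and a model's generated configuration
# over Triton's HTTP API. Host port 8888 and model name "my_model" are
# assumptions based on the compose file above.
import requests

base = "http://localhost:8888"

# Readiness check: returns HTTP 200 once the server is ready to serve requests.
print(requests.get(f"{base}/v2/health/ready").status_code)

# Fetch the model configuration (including fields auto-completed because
# --strict-model-config=false is set).
resp = requests.get(f"{base}/v2/models/my_model/config")
print(resp.json())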