
Instructions

This page demonstrates how to run the Triton Inference Server with MinIO object storage acting as the model repository for inference.

For further information, please check the official Triton Inference Server and MinIO documentation.

Docker-compose sample

docker-compose.yml
version: "3.8"
services:
  tritonserver:
    image: nvcr.io/nvidia/tritonserver:21.03-py3
    #image: nvcr.io/nvidia/tritonserver:22.07-py3
    ports:
      - "8888:8000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]
    environment:
      AWS_ACCESS_KEY_ID: user
      AWS_SECRET_ACCESS_KEY: user123456
    command: tritonserver --model-store=s3://minio:9000/models --model-control-mode="poll" --log-info true --strict-model-config=false
    shm_size: 20gb
    ulimits:
      stack: 67108864
      memlock: -1
    container_name: test_triton
    restart: unless-stopped
    networks:
      cicdnetwork:
        ipv4_address: 172.25.100.3

  minio:
    image: minio/minio:RELEASE.2022-03-24T00-43-44Z
    shm_size: 20gb
    volumes:
      - ./database/data:/data:rw
      - ./database/config:/root/.minio
    ports:
      - "9020:9000"
      - "9021:9001"
    environment:
      MINIO_ROOT_USER: user
      MINIO_ROOT_PASSWORD: user123456
    entrypoint: sh
    container_name: test_minio
    command: -c 'mkdir -p /data/models && minio server /data --console-address "0.0.0.0:9001"'
    restart: unless-stopped
    networks:
      cicdnetwork:
        ipv4_address: 172.25.100.9

networks:
  cicdnetwork:
    driver: bridge
    ipam:
      config:
        - subnet: 172.25.100.0/16
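
Because the server is started with --model-control-mode="poll", Triton periodically rescans the models bucket, so pushing a model into MinIO is enough to have it loaded. Below is a minimal sketch of uploading a local Triton model directory with boto3; the model name my_model, the local directory layout, and host port 9020 (the S3 port published by the compose file above) are assumptions for illustration, not part of the original setup.

upload_model.py
# Minimal sketch: copy a local Triton model directory into the MinIO
# "models" bucket created by the minio entrypoint in the compose file above.
# The model name, local path, and endpoint/credentials are assumptions
# taken from that compose file.
import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9020",      # MinIO S3 port published above
    aws_access_key_id="user",                  # matches MINIO_ROOT_USER
    aws_secret_access_key="user123456",        # matches MINIO_ROOT_PASSWORD
)

model_dir = "./my_model"   # hypothetical local directory: config.pbtxt, 1/model.onnx, ...
bucket = "models"          # bucket backing s3://minio:9000/models

# Upload every file so the bucket ends up with my_model/config.pbtxt,
# my_model/1/model.onnx, etc. Triton's poll mode picks the model up shortly after.
for root, _, files in os.walk(model_dir):
    for name in files:
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, ".").replace(os.sep, "/")
        s3.upload_file(local_path, bucket, key)
        print(f"uploaded {local_path} -> s3://{bucket}/{key}")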


Check model

You can use the following URL in a browser, or with curl, to get a model's configuration. Note that with the compose file above, Triton's HTTP port 8000 is published on host port 8888, so from the Docker host you would use port 8888 instead.

http://(Server IP):8000/v2/models/(Model name)/config
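
The same endpoints can be called from a script. Below is a small sketch using Triton's HTTP/REST API; it assumes the compose file above (host port 8888) and a placeholder model name my_model.

check_model.py
# Minimal sketch: query server readiness and a model's generated configuration
# over Triton's HTTP API. Host port 8888 and model name "my_model" are
# assumptions based on the compose file above.
import requests

base = "http://localhost:8888"

# Readiness check: returns HTTP 200 once the server is ready to serve requests.
print(requests.get(f"{base}/v2/health/ready").status_code)

# Fetch the model configuration (including fields auto-completed because
# --strict-model-config=false is set).
resp = requests.get(f"{base}/v2/models/my_model/config")
print(resp.json())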