Overview
Container GPU Passthrough
About Advantech Container Catalog
The Advantech Container Catalog is a comprehensive collection of ready-to-use, containerized software packages designed to accelerate the development and deployment of Edge AI applications. By offering pre-integrated solutions optimized for embedded hardware, it removes the software and hardware compatibility hurdles commonly encountered in GPU/NPU-accelerated environments.
Key benefits of the Container Catalog include:
Feature / Benefit | Description |
---|---|
Accelerated Edge AI Development | Ready-to-use containerized solutions for fast prototyping and deployment |
Hardware Compatibility Solved | Resolves incompatibilities between embedded hardware and AI software packages |
GPU/NPU Access Ready | Supports passthrough for efficient hardware acceleration |
Model Conversion & Optimization | Built-in AI model quantization and format conversion support |
Optimized for CV & LLM Applications | Pre-optimized containers for computer vision and large language models |
Scalable Device Management | Supports large-scale IoT deployments via EdgeSync, Kubernetes, etc. |
Lower Entry Barrier for Developers | High-level language (Python, C#, etc.) support enables easier development |
Developer Accessibility | Junior engineers can build embedded AI applications more easily |
Increased Customer Stickiness | Simplified tools lead to higher adoption and retention |
Open Ecosystem | 3rd-party developers can integrate new apps to expand the platform |
Container Overview
This container, Container GPU Passthrough, provides a ready-to-use environment with optimized AI frameworks, GPU passthrough, and industrial-grade reliability on GPU-accelerated Advantech hardware platforms. It lets users focus on developing AI applications for Advantech Edge AI systems powered by GPU chipsets, eliminating the complexity of hardware setup and AI framework compatibility.
Key Features
- Full Hardware Acceleration: Optimized access to GPU
- Complete AI Framework Stack: PyTorch, TensorFlow, ONNX Runtime
- Industrial Vision Support: Accelerated OpenCV pipelines
- Edge AI Capabilities: Support for computer vision, LLMs, and time-series analysis
- Performance Optimized: Tuned specifically for the Advantech EPC-R7300 and other Advantech GPU-accelerated devices
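A quick way to confirm that the frameworks above actually see the GPU inside the container is a short Python check. This is a minimal sketch; it assumes the PyTorch, ONNX Runtime, and OpenCV builds shipped in this container are on the Python path:

```python
import torch
import cv2
import onnxruntime as ort

# PyTorch reports whether the CUDA device is visible through passthrough.
print("PyTorch CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

# ONNX Runtime should list CUDAExecutionProvider when passthrough works.
print("ONNX Runtime providers:", ort.get_available_providers())

# A CUDA-enabled OpenCV build reports at least one device here.
print("OpenCV CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
```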
Host Device Prerequisites
Item | Specification |
---|---|
Compatible Hardware | Advantech GPU-accelerated devices - refer to Compatible hardware |
Host OS | Ubuntu 20.04 |
Required Software Packages | See Software Installation below |
Software Installation | Host Software Package Installation |
Container Environment Overview
Container Quick Start Guide
For the software components included in the container image, the container quick start guide (including the docker-compose file), and more, please refer to the Advantech EdgeSync Container Repository
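For illustration, the container can also be launched with GPU passthrough from Python via the Docker SDK (docker-py). This is a sketch only: the image tag below is a placeholder, and you should substitute the image and options from the EdgeSync Container Repository's docker-compose file.

```python
import docker

client = docker.from_env()

# Request all GPUs, the SDK equivalent of `docker run --gpus all`.
# On Jetson-based devices such as the EPC-R7300, passing runtime="nvidia"
# to containers.run() may be required instead of device_requests.
logs = client.containers.run(
    "advantech/edge-ai:latest",  # hypothetical image tag; use the catalog image
    command="nvidia-smi",
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
    remove=True,
)
print(logs.decode())
```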
Supported AI Capabilities
Vision Models
Model Family | Versions | Performance (FPS) | Quantization Support |
---|---|---|---|
YOLO | v3/v4/v5 (up to v5.6.0), v6 (up to v6.2), v7 (up to v7.0), v8 (up to v8.0) | YOLOv5s: 45-60 @ 640x640, YOLOv8n: 40-55 @ 640x640, YOLOv8s: 30-40 @ 640x640 | INT8, FP16, FP32 |
SSD | MobileNetV1/V2 SSD, EfficientDet-D0/D1 | MobileNetV2 SSD: 50-65 @ 300x300, EfficientDet-D0: 25-35 @ 512x512 | INT8, FP16, FP32 |
Faster R-CNN | ResNet50/ResNet101 backbones | ResNet50: 3-5 @ 1024x1024 | FP16, FP32 |
Segmentation | DeepLabV3+, UNet | DeepLabV3+ (MobileNetV2): 12-20 @ 512x512 | INT8, FP16, FP32 |
Classification | ResNet (18/50), MobileNet (V1/V2/V3), EfficientNet (B0-B2) | ResNet18: 120-150 @ 224x224, MobileNetV2: 180-210 @ 224x224 | INT8, FP16, FP32 |
Pose Estimation | PoseNet, HRNet (up to W18) | PoseNet: 15-25 @ 256x256 | FP16, FP32 |
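As an example, the YOLO entries above can be exercised through ONNX Runtime's CUDA execution provider. The sketch below assumes a yolov5s.onnx file exported separately (the path is a placeholder); model-specific pre- and post-processing is omitted.

```python
import numpy as np
import onnxruntime as ort

# Prefer the CUDA execution provider, falling back to CPU if unavailable.
session = ort.InferenceSession(
    "yolov5s.onnx",  # hypothetical path; export or download the model separately
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# YOLOv5s takes NCHW float32 input at 640x640, matching the table above.
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: dummy})
print("Output shapes:", [o.shape for o in outputs])
```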
Supported AI Model Formats
Format | Support Level | Compatible Versions | Notes |
---|---|---|---|
ONNX | Full | 1.10.0 - 1.16.3 | Recommended for cross-framework compatibility |
PyTorch (JIT) | Full | 1.8.0 - 2.0.0 | Native support via TorchScript |
TensorFlow SavedModel | Full | 2.8.0 - 2.12.0 | Recommended TF deployment format |
TFLite | Partial | Up to 2.12.0 | May have limited hardware acceleration |
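Converting a PyTorch model to the recommended ONNX format is done with torch.onnx.export. The sketch below uses torchvision's ResNet18 purely as a stand-in for your own model and assumes a recent torchvision (0.13+) for the weights argument:

```python
import torch
import torchvision

# ResNet18 stands in for your own model; any torch.nn.Module works.
model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["logits"],
    opset_version=13,  # within the ONNX 1.10.0 - 1.16.3 range listed above
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
)
print("Exported resnet18.onnx")
```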
Recommended Language Models
Model Family | Versions | Memory Requirements | Performance Notes |
---|---|---|---|
DeepSeek Coder | Mini (1.3B), Light (1.5B) | 2-3 GB | 10-15 tokens/sec in FP16 |
TinyLlama | 1.1B | 2 GB | 8-12 tokens/sec in FP16 |
Phi | Phi-1.5 (1.3B), Phi-2 (2.7B) | 1.5-3 GB | Phi-1.5: 8-12 tokens/sec in FP16, Phi-2: 4-8 tokens/sec in FP16 |
Llama 2 | 7B (Quantized to 4-bit) | 3-4 GB | 1-2 tokens/sec in INT4/INT8 |
Mistral | 7B (Quantized to 4-bit) | 3-4 GB | 1-2 tokens/sec in INT4/INT8 |
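Any of the smaller models above can be run in FP16 on the GPU with Hugging Face Transformers. The sketch below uses the public TinyLlama 1.1B chat checkpoint as an example; it assumes transformers and a CUDA-enabled PyTorch are installed in the container.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # ~2 GB in FP16, per the table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer("Edge AI is", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```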
DeepSeek R1 1.5B optimization recommendations:
- Supports INT4 and INT8 quantization for inference
- Typical throughput: 8-12 tokens/sec in FP16, 12-18 tokens/sec in INT8
- Recommended batch size: 1-2 for real-time applications
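A possible way to apply the INT8 recommendation is loading the model with a bitsandbytes quantization config through Transformers. This is a sketch under assumptions: the checkpoint name is assumed (the distilled 1.5B model), and bitsandbytes plus the accelerate package must be available for the target platform.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed checkpoint id for the distilled 1.5B model; adjust to your deployment.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
quant_config = BitsAndBytesConfig(load_in_8bit=True)  # INT8 weights via bitsandbytes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # requires the accelerate package
)

# Batch size 1, per the real-time recommendation above.
prompt = "Explain GPU passthrough in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```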