Overview
eIQ GenAI Flow 2.0 on NXP i.MX95
Overview
The eIQ GenAI Flow docker container by Advantech WEDA is a modular and scalable pipeline to create exciting AI-powered experiences on edge devices. The Flow provides a one stop shop to build production-grade, generative AI using the i.MX 95’s powerful NPU to accelerate LLM inference. The docker container and python API architecture is the ultimate tool for AI practitioners to bring their ideas to NXP’s edge computing ecosystem.
Block Diagram
Advantech WEDA container successfully containerizes NXP eIQ GenAI Flow 2.0, making deployment and future deployment easier with Advantech WEDA. More details regarding NXP eIQ GenAI Flow V2.0
Key Features
| Feature | Description |
|---|---|
| eIQ® GenAI Flow 2.0 | Integrates multiple AI technologies to create a seamless HMI experience. |
| Edge AI Capabilities | Optimized support for LLM and ASR leveraging NXP NPU acceleration. |
| Hardware Acceleration | Direct passthrough access to NPU hardware for high-performance, low-latency, and low-power inference. |
| Preconfigured Environment | Bundles drivers, toolchains, and AI libraries to eliminate setup time. |
| Rapid Prototyping & Deployment | Streamlines testing AI models and validating PoCs without rebuilding from scratch. |
Value Proposition
Customers can kickstart GenAI development on low-power NXP i.mx95 platform with full end-to-end Gen AI workflow, fully containerized with WEDA, including ASR(Automatic Speech Recognition), LLM, RAG(Retrieval-Augumented Generation), and TTS (Text-to-speech). Plus, containerized GenAI workflow can further be deployed massively through WEDA API, enabling not only 1st edge device development but also scaling & managing Edge AI applications to edge devices.
• What are the Key takeaways?
- Running end-to-end GenAI workflow even on Lower power platform
- Enable deployment via containerized environment and handle edge AI development lifecycle, including AI model & containerized edge Application mass deployment, device and data management through WEDA.
Hardware Specifications
| Component | Specification |
|---|---|
| Target Hardware | Advantech AOM-5521 |
| SoC | NXP i.MX95 |
| GPU | Arm® Mali™ G310 |
| NPU | eIQ neutron N3-1034S |
| Memory | 8 GB LPDDR5 |
Operating System
| Environment | Operating System |
|---|---|
| Device Host | Yocto 5.2 Walnascar |
| Container | Ubuntu:24.04 |
| Base Container | NXP-iMX95-Neutron-Passthrough Container |
Software Components
| Component | Version | Description |
|---|---|---|
| eIQ GenAI Flow | 2.0 | |
| Python | 3.13 | Python runtime for building applications |
| ONNX Runtime | 1.22.0 from NXP BSP v6.12.20 | Inference runtime |
Supported AI Capabilities
ASR Models
| Model | Format | Note |
|---|---|---|
| moonshine-tiny | .onnx | Provide By NXP eIQ GenAI Flow |
| moonshine-base | .onnx | Provide By NXP eIQ GenAI Flow |
| whisper-small.en | .onnx | Provide By NXP eIQ GenAI Flow |
LLM Models
| Model | Format | Note |
|---|---|---|
| danube-500M-q4 | .onnx | Provide By NXP eIQ GenAI Flow |
| danube-500M-q8 | .onnx | Provide By NXP eIQ GenAI Flow |
TTS Models
| Model | Format | Note |
|---|---|---|
| vits-english | .onnx | Provide By NXP eIQ GenAI Flow |
- The models provide by NXP eIQ GenAI Flow is encrypted .
Hardware Acceleration Support
| Accelerator | Support Level | Compatible Libraries |
|---|---|---|
| NPU | INT8 (primary), INT4 | NXP eIQ GenAI Flow |
Precision Support
| Precision | Support Level | Notes |
|---|---|---|
| INT8 | NPU,CPU | Primary mode for NPU acceleration, best performance-per-watt |
Repository Structure
eIQ-GenAI-Flow2.0-Container-on-NXP-i.mx95/
├── README.md # Overview and quick start steps
├── run.sh # Script to launch the container
└── docker-compose.yml # Configuration file of docker compose
Quick Start Guide
Prerequisites
- Please ensure docker & docker compose are available and accessible on device host OS
- Since default eMMC boot provides only 16 GB storage which is in-sufficient to run/build the container image, it is required to boot the Host OS using a 32 GB (minimum) SD card.
For container quick start, including the docker-compose file and more, please refer to Advantech Container Github Repository
Best Practices
Precision Selection
- Prefer INT8 for NPU acceleration: The i.MX95 NPU is optimized for quantized INT8 models. Always convert to INT8 using post-training quantization or quantization-aware training for maximum performance and efficiency.
Known Limitations
-
Minimum storage Storage required for running Docker containers is 32 GB
-
NPU passthrough Neutron EP error sometimes, reboot the development board may solve this problem.
Neutron: NEUTRON_IOCTL_BUFFER_CREATE failed!: Cannot allocate memory
Copyright © 2026 Advantech Corporation. All rights reserved.
