eIQ GenAI Flow 2.0 on NXP i.MX95

Overview

The eIQ GenAI Flow Docker container by Advantech WEDA is a modular, scalable pipeline for creating exciting AI-powered experiences on edge devices. The Flow provides a one-stop shop for building production-grade generative AI, using the i.MX 95's powerful NPU to accelerate LLM inference. The Docker container and Python API architecture are the ultimate tools for AI practitioners to bring their ideas to NXP's edge computing ecosystem.

Block Diagram

The Advantech WEDA container packages NXP eIQ GenAI Flow 2.0, making both initial deployment and future updates easier through Advantech WEDA. For more details, see NXP eIQ GenAI Flow V2.0.

Key Features

  • eIQ® GenAI Flow 2.0: Integrates multiple AI technologies to create a seamless HMI experience.
  • Edge AI Capabilities: Optimized support for LLM and ASR leveraging NXP NPU acceleration.
  • Hardware Acceleration: Direct passthrough access to NPU hardware for high-performance, low-latency, and low-power inference.
  • Preconfigured Environment: Bundles drivers, toolchains, and AI libraries to eliminate setup time.
  • Rapid Prototyping & Deployment: Streamlines testing AI models and validating PoCs without rebuilding from scratch.

Value Proposition

Customers can kickstart GenAI development on the low-power NXP i.MX95 platform with a full end-to-end GenAI workflow, fully containerized with WEDA, including ASR (Automatic Speech Recognition), LLM, RAG (Retrieval-Augmented Generation), and TTS (Text-to-Speech). In addition, the containerized GenAI workflow can be deployed at scale through the WEDA API, enabling not only first-device development but also scaling and managing edge AI applications across fleets of edge devices.

• What are the key takeaways?

  1. Runs an end-to-end GenAI workflow even on a low-power platform.
  2. Enables deployment via a containerized environment and covers the edge AI development lifecycle, including mass deployment of AI models and containerized edge applications, plus device and data management through WEDA.
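The ASR → RAG → LLM → TTS workflow described above can be sketched as a simple pipeline. The snippet below is an illustrative stub, not the actual eIQ GenAI Flow Python API: every function name and signature here is hypothetical, and the ASR/LLM/TTS bodies are placeholders standing in for real model inference.

```python
# Illustrative pipeline sketch -- NOT the eIQ GenAI Flow API.
# All names, signatures, and bodies are hypothetical placeholders.

def asr(audio: bytes) -> str:
    """Speech-to-text stage (stub standing in for e.g. a Whisper model)."""
    return "what is the operating temperature range"

def retrieve(query: str, docs: list[str]) -> str:
    """Naive RAG retrieval: pick the document sharing the most words with the query."""
    q = set(query.split())
    return max(docs, key=lambda d: len(q & set(d.split())))

def llm(prompt: str) -> str:
    """Text-generation stage (stub): echo the retrieval-grounded prompt."""
    return f"Answer based on: {prompt}"

def tts(text: str) -> bytes:
    """Text-to-speech stage (stub): return the reply as raw bytes."""
    return text.encode("utf-8")

def genai_flow(audio: bytes, docs: list[str]) -> bytes:
    """Chain ASR -> RAG -> LLM -> TTS, mirroring the workflow's stages."""
    query = asr(audio)
    context = retrieve(query, docs)
    reply = llm(f"{context}\n\nQuestion: {query}")
    return tts(reply)

docs = ["operating temperature range is -40 to 85 C",
        "the board has 8 GB LPDDR5"]
speech = genai_flow(b"\x00\x01", docs)
print(speech.decode("utf-8"))
```

On the real device, each stub would be replaced by an NPU-accelerated model call; the value of the container is that those stages ship preconfigured.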

Hardware Specifications

  • Target Hardware: Advantech AOM-5521
  • SoC: NXP i.MX95
  • GPU: Arm® Mali™ G310
  • NPU: NXP eIQ Neutron N3-1034S
  • Memory: 8 GB LPDDR5

Operating System

  • Device Host: Yocto 5.2 (Walnascar)
  • Container: Ubuntu 24.04
  • Base Container: NXP-iMX95-Neutron-Passthrough container

Software Components

  • eIQ GenAI Flow: 2.0
  • Python: 3.13 (Python runtime for building applications)
  • ONNX Runtime: 1.22.0, from NXP BSP v6.12.20 (inference runtime)

Supported AI Capabilities

ASR Models

  • moonshine-tiny (.onnx): provided by NXP eIQ GenAI Flow
  • moonshine-base (.onnx): provided by NXP eIQ GenAI Flow
  • whisper-small.en (.onnx): provided by NXP eIQ GenAI Flow

LLM Models

  • danube-500M-q4 (.onnx): provided by NXP eIQ GenAI Flow
  • danube-500M-q8 (.onnx): provided by NXP eIQ GenAI Flow

TTS Models

  • vits-english (.onnx): provided by NXP eIQ GenAI Flow

  • Note: the models provided by NXP eIQ GenAI Flow are encrypted.

Hardware Acceleration Support

  • NPU: INT8 (primary) and INT4; compatible with NXP eIQ GenAI Flow

Precision Support

  • INT8: supported on NPU and CPU; primary mode for NPU acceleration, best performance per watt

Repository Structure

eIQ-GenAI-Flow2.0-Container-on-NXP-i.mx95/
├── README.md                               # Overview and quick start steps
├── run.sh                                  # Script to launch the container
└── docker-compose.yml                      # Configuration file of docker compose

Quick Start Guide

Prerequisites

  • Ensure Docker and Docker Compose are installed and accessible on the device host OS.
  • The default eMMC boot provides only 16 GB of storage, which is insufficient to run or build the container image; boot the host OS from an SD card of at least 32 GB.

For a container quick start, including the docker-compose file and more, please refer to the Advantech Container GitHub repository.
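For orientation only, a compose file for an NPU-passthrough container typically maps the NPU device node and model storage into the container. The sketch below is a hypothetical illustration: the image tag, device path, and volume paths are assumptions, not the contents of the repository's docker-compose.yml, which should be used as-is.

```yaml
# Hypothetical sketch only -- image tag, device node, and volume paths are
# assumptions. Use the docker-compose.yml from the Advantech repository.
services:
  genai-flow:
    image: advantech/eiq-genai-flow:2.0      # assumed image tag
    devices:
      - /dev/neutron:/dev/neutron            # assumed Neutron NPU device node
    volumes:
      - ./models:/opt/models                 # assumed model directory
    network_mode: host
    restart: unless-stopped
```

The essential point is the `devices` mapping, which gives the container direct passthrough access to the NPU for accelerated inference.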

Best Practices

Precision Selection

  • Prefer INT8 for NPU acceleration: The i.MX95 NPU is optimized for quantized INT8 models. Always convert to INT8 using post-training quantization or quantization-aware training for maximum performance and efficiency.
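To make the INT8 recommendation concrete, the snippet below sketches the core arithmetic of symmetric per-tensor post-training quantization: mapping float weights to int8 codes with a single scale factor, then dequantizing to estimate the error. It illustrates the general technique only, not the eIQ or ONNX quantization toolchain.

```python
# Symmetric per-tensor INT8 quantization -- the basic arithmetic that
# post-training quantization tools apply to each weight tensor.
# Illustration of the general technique, not the eIQ toolchain.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to int8 codes in [-127, 127] using one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the int8 codes."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, 0.731]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(q)        # int8 codes
print(max_err)  # worst-case rounding error, bounded by scale / 2
```

Because each weight needs only 8 bits plus one shared scale, INT8 models are roughly 4x smaller than FP32 and map directly onto the NPU's integer compute units, which is where the performance-per-watt advantage comes from.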

Known Limitations

  • Minimum storage: running the Docker container requires at least 32 GB of storage.

  • NPU passthrough: the Neutron execution provider occasionally fails with the error below; rebooting the development board may resolve the problem.

Neutron: NEUTRON_IOCTL_BUFFER_CREATE failed!: Cannot allocate memory

Copyright © 2026 Advantech Corporation. All rights reserved.