Armino AI Solution Introduction
Overview
The Armino AI Solution is an intelligent AI device solution developed by Beken Corporation on the Armino SMP architecture. It covers the full device-to-cloud and cloud-to-large-model interaction path, integrates with multiple large language models, and gives developers a complete framework for rapidly building intelligent AI devices.
Design Philosophy
The Armino AI Solution uses a three-layer “Device-Cloud-Model” architecture:
Device Side: Intelligent devices based on the BK7258 chip, responsible for audio/video capture, local processing, and user interaction
Cloud Side: BK/Agora RTC/VolcEngine RTC servers, responsible for audio/video data transmission and routing
AI Model Side: Supports multiple large language models (OpenAI, Doubao, DeepSeek, etc.), providing AI conversation and image recognition capabilities
Core Features
1. Multimodal Interaction
Voice Interaction: Supports voice wake-up, speech recognition, and speech synthesis for a natural human-machine dialogue experience
Visual Interaction: Supports dual-screen display (two SPI LCDs), providing rich visual feedback and emotional expression
Image Recognition: Supports real-time image capture and recognition, and can switch between large language models and image recognition models
2. Real-time Audio/Video Communication
Low Latency Transmission: Supports low-latency audio/video transmission
Multiple Encoding Formats:
- Video: H.264, JPEG
- Audio: G.711A/U, G.722, OPUS, PCM
Intelligent Bitrate Control: Supports adaptive bandwidth estimation and bitrate adjustment
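The source does not describe how bandwidth estimation or bitrate adjustment works internally. As a purely illustrative host-side sketch in Python (not firmware code and not an Armino or RTC SDK API), the following shows one common approach, additive-increase/multiplicative-decrease (AIMD), driven by a reported packet-loss ratio. The class name, thresholds, and step sizes are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AimdBitrateController:
    """Illustrative AIMD controller; every constant here is an assumption."""

    min_kbps: int = 100
    max_kbps: int = 2000
    current_kbps: int = 500
    increase_step_kbps: int = 50   # additive increase while the link looks healthy
    decrease_factor: float = 0.75  # multiplicative decrease when loss is high
    loss_threshold: float = 0.05   # loss ratio above which the controller backs off

    def update(self, loss_ratio: float) -> int:
        """Return the new target bitrate for a reported loss ratio in [0.0, 1.0]."""
        if not 0.0 <= loss_ratio <= 1.0:
            raise ValueError("loss_ratio must be within [0.0, 1.0]")
        if loss_ratio > self.loss_threshold:
            target = int(self.current_kbps * self.decrease_factor)
        else:
            target = self.current_kbps + self.increase_step_kbps
        # Clamp the result to the configured range.
        self.current_kbps = max(self.min_kbps, min(self.max_kbps, target))
        return self.current_kbps
```

Calling `update()` once per feedback interval raises the target slowly while loss stays low and cuts it quickly when loss exceeds the threshold, which is the usual trade-off AIMD makes between link utilization and congestion recovery.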
3. Device-side Audio Processing
AEC (Echo Cancellation): Eliminates speaker echo to improve speech recognition accuracy
NS (Noise Suppression): Suppresses environmental noise to improve voice quality
KWS (Keyword Wake-up): Supports custom wake words for local wake-up
Prompt Tone Playback: Supports multiple event prompt tones to enhance user experience
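The source lists the device-side audio features but not how they are chained. One plausible ordering, assumed here for illustration only, is echo cancellation first, then noise suppression, then wake-word detection on the cleaned signal. The Python sketch below models only that ordering. The stage functions are hypothetical placeholders that perform no real signal processing and do not correspond to any Armino API.

```python
from typing import List, Sequence, Tuple

Frame = List[int]  # one frame of 16-bit PCM samples


def cancel_echo(mic: Sequence[int], speaker_ref: Sequence[int]) -> Frame:
    """Placeholder AEC: subtracts the speaker reference sample by sample.

    A real echo canceller uses an adaptive filter; this only marks the stage.
    """
    if len(mic) != len(speaker_ref):
        raise ValueError("mic and speaker_ref frames must have equal length")
    return [m - r for m, r in zip(mic, speaker_ref)]


def suppress_noise(frame: Sequence[int], gate: int = 50) -> Frame:
    """Placeholder NS: a simple gate that zeroes samples quieter than `gate`."""
    return [s if abs(s) >= gate else 0 for s in frame]


def detect_wake_word(frame: Sequence[int], energy_threshold: int = 1000) -> bool:
    """Placeholder KWS: an energy check standing in for a keyword model."""
    return sum(abs(s) for s in frame) >= energy_threshold


def process_frame(mic: Sequence[int], speaker_ref: Sequence[int]) -> Tuple[Frame, bool]:
    """Run the assumed AEC -> NS -> KWS order on a single frame."""
    cleaned = suppress_noise(cancel_echo(mic, speaker_ref))
    return cleaned, detect_wake_word(cleaned)
```

Running wake-word detection after the two cleanup stages reflects the stated purpose of AEC in the list above: removing speaker echo so that recognition sees a cleaner signal.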
4. Multi-model Support
Large Language Models: Supports mainstream large language models such as OpenAI, Doubao, and DeepSeek
Image Recognition Models: Supports image recognition and analysis
Flexible Switching: Supports runtime switching between different models to meet different application scenarios
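How runtime switching is implemented is not described in the source. One way to picture it is a table of named model backends with a single active selection, sketched below in Python. The `ModelSwitcher` class, the backend names used as keys, and the prompt-in/reply-out signature are hypothetical and are not part of any Armino API.

```python
from typing import Callable, Dict, Optional

# A backend takes a text prompt and returns a text reply (hypothetical signature).
Backend = Callable[[str], str]


class ModelSwitcher:
    """Keeps a table of named backends and routes requests to the active one."""

    def __init__(self) -> None:
        self._backends: Dict[str, Backend] = {}
        self._active: Optional[str] = None

    def register(self, name: str, backend: Backend) -> None:
        """Add a backend; the first one registered becomes the default."""
        self._backends[name] = backend
        if self._active is None:
            self._active = name

    def switch(self, name: str) -> None:
        """Select a different backend at runtime."""
        if name not in self._backends:
            raise KeyError(f"unknown model backend: {name}")
        self._active = name

    @property
    def active(self) -> Optional[str]:
        return self._active

    def ask(self, prompt: str) -> str:
        """Forward the prompt to whichever backend is currently active."""
        if self._active is None:
            raise RuntimeError("no model backend registered")
        return self._backends[self._active](prompt)
```

With this shape, an application would register, for example, a conversation backend and an image-recognition backend once at startup, then call `switch()` when the scenario changes; callers of `ask()` do not need to know which backend is active.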
5. Complete Peripheral Support
Display: Dual SPI LCD screens (GC9D01 160x160)
Input: Microphone, buttons, gyroscope, NFC
Output: Speaker, LED effects, vibration motor
Storage: SD NAND 128MB
Power: Lithium battery, charging management (ETA3422)
Camera: DVP camera (GC2145)
System Architecture
Relationship between BK AI and BK AVDK SMP
The BK AI Solution delivered in this repository relates to the underlying platform BK AVDK SMP (Armino SMP SDK) as follows:
Development and scope: BK AI is a scenario-oriented solution built on BK AVDK SMP, focusing on AI business logic, RTC, and cloud large-model integration. The chip, RTOS, drivers, and Wi-Fi/BLE stacks are provided by BK AVDK SMP.
Build and compilation: BK AI does not ship a standalone build system. Firmware build, toolchain, Kconfig, and project generation rely on the BK AVDK SMP build environment (for example, point SDK_DIR to the SMP SDK tree).
Code boundary: This repository mainly contains solution and business implementation code. It does not include hardware drivers, RTOS, memory management, or Wi-Fi/BLE stacks; use the interfaces and components provided by BK AVDK SMP when you need those capabilities.
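Because the build depends on SDK_DIR pointing at the SMP SDK tree, a small pre-build check can catch a missing or wrong path early. The Python helper below is a hypothetical sketch, not part of the BK AVDK SMP tooling. It assumes SDK_DIR is supplied as an environment variable, which the source does not confirm, and it only verifies that the path exists as a directory.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_sdk_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory named by SDK_DIR, or raise with a clear message.

    Assumption: SDK_DIR is an environment variable. Pass `env` to check a
    custom mapping instead of the real process environment.
    """
    env = os.environ if env is None else env
    value = env.get("SDK_DIR", "").strip()
    if not value:
        raise RuntimeError("SDK_DIR is not set; point it to the BK AVDK SMP SDK tree")
    sdk_dir = Path(value).expanduser().resolve()
    if not sdk_dir.is_dir():
        raise RuntimeError(f"SDK_DIR does not name a directory: {sdk_dir}")
    return sdk_dir
```

A wrapper script could call `resolve_sdk_dir()` before invoking the actual build so that a misconfigured path fails with one readable message instead of a toolchain error later on.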
The following describes the typical processor split and software layering on BK7258 from the BK AVDK SMP platform perspective (BK AI business code runs at the application and service layers and depends on OS, drivers, and networking from SMP).
The Armino SMP architecture uses AP (Application Processor) + CP (Communication Processor):
AP (CPU1 + CPU2): Runs multimedia applications, AI interaction, audio/video processing, and related core functions
CP (CPU0): Runs Wi-Fi, BLE, and low-power protocol stacks
Software architecture layers (illustrative):
Application Layer (AI Interaction, UI Display)
↓
Service Layer (Media Service, Network Transfer, Audio Engine)
↓
RTC Layer (Agora RTC SDK)
↓
OS Layer (RTOS)
↓
Hardware Layer (BK7258)
Main Application Scenarios
Intelligent Companion Devices: Provides voice dialogue, emotional expression, and visual feedback, suitable for scenarios such as children’s companionship and elderly care
Intelligent Education Devices: Supports voice Q&A and image recognition for learning assistance and knowledge Q&A
Smart Home Control: Controls smart home devices through voice interaction, providing convenient control methods
Enterprise Service Robots: Supports multimodal interaction for customer service, guidance, and similar scenarios
Technical Advantages
Solution stack with SMP: BK AI provides scenario solutions and reference projects with documentation; hardware drivers, RTOS, and stacks are provided by BK AVDK SMP, together forming a complete development path
Modular Design: Each functional module is independent, facilitating customization and expansion
Rich Reference Designs: Includes reference implementations of common peripherals to accelerate product development
Flexible Configuration: Supports the Kconfig configuration system, so features can be tailored to project requirements
Comprehensive Documentation: Provides detailed development documentation, API references, and usage examples