Armino AI Solution Introduction
=====================================

:link_to_translation:`zh_CN:[中文]`

Overview
----------------

The Armino AI Solution is an AI device solution developed by Beken Corporation on the Armino SMP architecture. It covers the full interaction path from device to cloud and from cloud to large models, integrates with multiple large language models, and gives developers a complete framework for rapidly building intelligent AI devices.

Design Philosophy
------------------

The Armino AI Solution adopts a three-layer architecture design of "Device-Cloud-Model":

- **Device Side**: Intelligent devices based on the BK7258 chip, responsible for audio/video capture, local processing, and user interaction
- **Cloud Side**: BK/Agora RTC/VolcEngine RTC servers, responsible for audio/video data transmission and routing
- **AI Model Side**: Supports multiple large language models (OpenAI, Doubao, DeepSeek, etc.), providing AI conversation and image recognition capabilities

Core Features
----------------

1. Multimodal Interaction
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- **Voice Interaction**: Supports voice wake-up, speech recognition, and speech synthesis, providing a natural human-machine dialogue experience
- **Visual Interaction**: Supports dual-screen display (two SPI LCDs), providing rich visual feedback and emotional expression
- **Image Recognition**: Supports real-time image capture and recognition, and can switch between large language models and image recognition models

2. Real-time Audio/Video Communication
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- **Low Latency Transmission**: Supports low-latency audio/video transmission
- **Multiple Encoding Formats**:

  - Video: H.264, JPEG
  - Audio: G.711A/U, G.722, OPUS, PCM

- **Intelligent Bitrate Control**: Supports adaptive bandwidth estimation and bitrate adjustment
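
The bitrate adjustment described above can be pictured as a small feedback loop. The following is a minimal host-side Python sketch, not SDK code: the function name ``next_bitrate``, the loss thresholds, the step sizes, and the bitrate limits are all assumptions chosen for illustration, and the RTC SDK in use may apply a different strategy.

.. code-block:: python

    def next_bitrate(current_bps, estimated_bw_bps, loss_ratio,
                     min_bps=100_000, max_bps=2_000_000):
        """Return the next target bitrate in bits per second.

        All thresholds and step sizes are illustrative assumptions.
        """
        if loss_ratio > 0.10:
            # Heavy packet loss: back off multiplicatively (to 70%).
            target = current_bps * 7 // 10
        elif loss_ratio < 0.02:
            # Clean link: probe upward by 8%, but keep 10% headroom
            # below the current bandwidth estimate.
            target = min(current_bps * 108 // 100, estimated_bw_bps * 9 // 10)
        else:
            # Moderate loss: hold the current bitrate.
            target = current_bps
        # Clamp to the range the encoder is configured for.
        return max(min_bps, min(max_bps, target))

Called once per feedback report, this kind of rule raises the bitrate slowly on a clean link and lowers it quickly under loss, which is the usual trade-off for low-latency streams.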

3. Device-side Audio Processing
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- **AEC (Echo Cancellation)**: Removes speaker echo from the microphone signal to improve speech recognition accuracy
- **NS (Noise Suppression)**: Suppresses environmental noise to improve voice quality
- **KWS (Keyword Wake-up)**: Supports custom wake words for local wake-up
- **Prompt Tone Playback**: Supports multiple event prompt tones to enhance user experience
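
One way to picture how these stages fit together is a front end that cleans each microphone frame and only forwards audio once the wake word has been detected. The sketch below is a host-side Python illustration under stated assumptions: the stage order (AEC, then NS, then KWS), the function names, and the toy stand-in stages are all hypothetical, and the real algorithms on the device are far more sophisticated.

.. code-block:: python

    def toy_aec(mic, ref):
        # Stand-in only: subtract the speaker reference sample by sample.
        return [m - r for m, r in zip(mic, ref)]

    def toy_ns(frame, floor=3):
        # Stand-in only: zero samples whose magnitude is below a noise floor.
        return [s if abs(s) >= floor else 0 for s in frame]

    def toy_kws(frame, threshold=100):
        # Stand-in only: "detect" the wake word when the peak is loud enough.
        return max(abs(s) for s in frame) >= threshold

    def run_frontend(frames, aec=toy_aec, ns=toy_ns, kws=toy_kws):
        """Run (mic, ref) frame pairs through AEC -> NS -> KWS.

        Frames are dropped until the wake word is detected; the detecting
        frame triggers a prompt-tone event and later frames are forwarded.
        """
        awake = False
        forwarded, events = [], []
        for mic, ref in frames:
            clean = ns(aec(mic, ref))
            if not awake:
                if kws(clean):
                    awake = True
                    events.append("wake_tone")
                continue
            forwarded.append(clean)
        return forwarded, events

    frames = [
        ([12, 11], [10, 10]),   # background noise only, dropped
        ([210, 5], [10, 5]),    # loud frame, treated as the wake word
        ([60, 12], [10, 10]),   # speech after wake-up, forwarded
    ]
    forwarded, events = run_frontend(frames)

Passing the stages in as parameters keeps the control flow separate from the signal processing, so each stand-in can be replaced without touching the loop.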

4. Multi-model Support
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- **Large Language Models**: Supports mainstream large language models such as OpenAI, Doubao, and DeepSeek
- **Image Recognition Models**: Supports image recognition and analysis
- **Flexible Switching**: Supports runtime switching between different models to meet different application scenarios
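
Runtime switching of this kind is commonly built around a small registry that maps a name to a back end. The following host-side Python sketch shows the idea only: ``ModelRouter``, its methods, and the back-end names are hypothetical, and the handlers are local stand-ins rather than calls to any real model service.

.. code-block:: python

    from typing import Callable, Dict, Optional


    class ModelRouter:
        """Keeps named model back ends and routes requests to the active one."""

        def __init__(self) -> None:
            self._backends: Dict[str, Callable[[str], str]] = {}
            self._active: Optional[str] = None

        def register(self, name: str, handler: Callable[[str], str]) -> None:
            self._backends[name] = handler
            if self._active is None:
                # The first registered back end becomes the default.
                self._active = name

        def switch(self, name: str) -> None:
            if name not in self._backends:
                raise KeyError(f"unknown model back end: {name}")
            self._active = name

        def ask(self, request: str) -> str:
            if self._active is None:
                raise RuntimeError("no model back end registered")
            return self._backends[self._active](request)


    # Stand-in handlers; a real device would forward the request to the cloud.
    router = ModelRouter()
    router.register("chat", lambda text: f"[chat] {text}")
    router.register("vision", lambda text: f"[vision] {text}")
    router.switch("vision")

After ``router.switch("vision")``, subsequent ``router.ask(...)`` calls go to the vision handler until another switch is requested.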

5. Complete Peripheral Support
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

- **Display**: Dual SPI LCD screens (GC9D01 160x160)
- **Input**: Microphone, buttons, gyroscope, NFC
- **Output**: Speaker, LED effects, vibration motor
- **Storage**: 128 MB SD NAND
- **Power**: Lithium battery, charging management (ETA3422)
- **Camera**: DVP camera (GC2145)

System Architecture
-------------------

.. rubric:: Relationship between BK AI and BK AVDK SMP

The BK AI Solution delivered in this repository relates to the underlying platform BK AVDK SMP (Armino SMP SDK) as follows:

#. **Development and scope**: BK AI is a scenario-oriented solution built on BK AVDK SMP, focusing on AI business logic, RTC, and cloud large-model integration. The chip, RTOS, drivers, and Wi-Fi/BLE stacks are provided by BK AVDK SMP.

#. **Build and compilation**: BK AI does **not** ship a standalone build system. Firmware build, toolchain, Kconfig, and project generation rely on the **BK AVDK SMP** build environment (for example, point ``SDK_DIR`` to the SMP SDK tree).

#. **Code boundary**: This repository mainly contains **solution and business implementation** code. It does **not** include hardware drivers, RTOS, memory management, or Wi-Fi/BLE stacks; use the interfaces and components provided by **BK AVDK SMP** when you need those capabilities.
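
Because the build depends on an external SDK tree, a quick preflight check before building can catch a missing or wrong path early. The helper below is a host-side Python sketch under assumptions: it treats ``SDK_DIR`` as an environment variable (the variable may instead be passed on the build command line), and ``resolve_sdk_dir`` is a hypothetical name that is not part of BK AVDK SMP.

.. code-block:: python

    import os
    from pathlib import Path


    def resolve_sdk_dir(env=None):
        """Return the absolute SDK path taken from SDK_DIR, or raise.

        ``env`` defaults to the process environment; pass a dict to test.
        """
        env = os.environ if env is None else env
        raw = env.get("SDK_DIR")
        if not raw:
            raise RuntimeError(
                "SDK_DIR is not set; point it to the BK AVDK SMP tree")
        path = Path(raw).expanduser()
        if not path.is_dir():
            raise RuntimeError(f"SDK_DIR is not a directory: {path}")
        return path.resolve()

The check only verifies that the directory exists; it does not validate the contents of the SDK tree.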

The following describes the typical processor split and software layering on BK7258 from the **BK AVDK SMP** platform perspective (BK AI business code runs at the application and service layers and depends on OS, drivers, and networking from SMP).

The Armino SMP architecture uses AP (Application Processor) + CP (Communication Processor):

- **AP (CPU1 + CPU2)**: Runs multimedia applications, AI interaction, audio/video processing, and related core functions
- **CP (CPU0)**: Runs Wi-Fi, BLE, and low-power protocol stacks

Software architecture layers (illustrative):

::

    Application Layer (AI Interaction, UI Display)
            ↓
    Service Layer (Media Service, Network Transfer, Audio Engine)
            ↓
    RTC Layer (Agora RTC SDK)
            ↓
    OS Layer (RTOS)
            ↓
    Hardware Layer (BK7258)

Main Application Scenarios
--------------------------

1. **Intelligent Companion Devices**: Provide voice dialogue, emotional expression, and visual feedback for scenarios such as children's companionship and elderly care

2. **Intelligent Education Devices**: Support voice Q&A and image recognition for learning assistance and knowledge lookup

3. **Smart Home Control**: Controls smart home devices through voice interaction

4. **Enterprise Service Robots**: Support multimodal interaction for customer service, guidance, and similar scenarios

Technical Advantages
--------------------

1. **Solution stack with SMP**: BK AI provides scenario solutions, reference projects, and documentation, while BK AVDK SMP provides hardware drivers, the RTOS, and protocol stacks; together they form a complete development path

2. **Modular Design**: Each functional module is independent, which simplifies customization and extension

3. **Rich Reference Designs**: Includes reference implementations for common peripherals to accelerate product development

4. **Flexible Configuration**: Uses the Kconfig configuration system, so features can be enabled or trimmed according to requirements

5. **Comprehensive Documentation**: Provides detailed development documentation, API references, and usage examples