Rock Paper Scissors

1. Introduction

This project is a rock-paper-scissors game demo built on an open-source gesture detection model and the TensorFlow Lite Micro lightweight deep learning framework. It captures gesture images through a DVP camera, runs TensorFlow Lite Micro inference on CPU1 to recognize the gesture, plays it against a randomly generated system gesture, and then shows the match result on the dual LCD screens with a voice announcement.

1.1 System Architecture

Dual-Core Architecture:
- CPU0: Runs WiFi, Bluetooth
- CPU1: Runs multimedia, TensorFlow Lite Micro AI inference tasks (480MHz)
AI Processing Flow:
- Image Capture → Preprocessing → TensorFlow Inference → Post-processing → Result Output

1.2 Hardware Configuration

SPI LCD X2 (GC9D01) - Dual screen display
DVP Camera (GC2145) - Image capture
Microphone - Voice input
Speaker - Voice announcement
SD NAND 128MB - Resource storage
NFC (MFRC522) - Near field communication
Gyroscope (SC7A20H) - Attitude detection
Charging Management Chip (ETA3422)
Lithium Battery

2. Project Structure

2.1 Directory Organization

rock_paper_scissors/
├── main/
│   ├── app_cpu0_main.c           # CPU0 main program
│   ├── app_cpu1_main.c           # CPU1 entry
│   ├── app_main_cpu1.cc          # CPU1 TensorFlow task
│   ├── tflite/                   # TensorFlow Lite code
│   │   ├── main_functions.cc     # Inference main logic
│   │   ├── gesture_detection_model_data.cc  # Model data
│   │   ├── image_provider.cc     # Image input interface
│   │   ├── detection_responder.cc # Result processing
│   │   └── model_settings.h      # Model configuration
│   ├── media/                    # Media processing
│   │   ├── media_main.c          # Media main control
│   │   ├── media_audio.c         # Audio processing
│   │   └── resource/             # Voice resources
│   ├── display/                  # Display control
│   │   ├── lvgl_app.c            # LVGL application
│   │   └── lv_*.c                # Display interfaces
│   └── CMakeLists.txt            # Build configuration
├── config/                       # Chip configuration
└── CMakeLists.txt                # Project build

2.2 Key Components

bk_tflite_micro: TensorFlow Lite Micro component
multimedia: Multimedia processing
lvgl: Graphics library
audio_play: Audio playback
media_service: Media service

3. TensorFlow Lite Micro Code Implementation

3.1 Model Configuration

Model Parameters (model_settings.h):

// Input image dimensions
constexpr int kNumCols = 192;      // Width
constexpr int kNumRows = 192;      // Height
constexpr int kNumChannels = 3;    // RGB channels

// Output categories
constexpr int kCategoryCount = 2;

// Maximum image size
constexpr int kMaxImageSize = kNumCols * kNumRows * kNumChannels;
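
As a quick sanity check of these constants, the sketch below recomputes the input buffer size at compile time. The constants are repeated so the snippet stands alone. kTensorArenaSize is a hypothetical name for the 360 KB arena that setup() allocates in section 3.3, and one byte per element is assumed because the model input is INT8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Repeated from model_settings.h so the snippet compiles on its own.
constexpr int kNumCols = 192;
constexpr int kNumRows = 192;
constexpr int kNumChannels = 3;
constexpr int kMaxImageSize = kNumCols * kNumRows * kNumChannels;

// Hypothetical name for the 360 KB tensor arena allocated in setup().
constexpr std::size_t kTensorArenaSize = 360 * 1024;

// Assumption: an INT8 input tensor uses one byte per element.
constexpr std::size_t kInputBytes = kMaxImageSize * sizeof(std::int8_t);

static_assert(kMaxImageSize == 110592, "192 * 192 * 3 elements");
static_assert(kInputBytes < kTensorArenaSize,
              "the input tensor must fit inside the arena");
```

The input tensor alone takes 108 KB, so the remainder of the arena is what is left for intermediate tensors and the output.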

3.2 Model Initialization

CPU1 Initialization Flow (app_main_cpu1.cc):

extern "C" void app_main_cpu1(void *arg) {
    // 1. Set CPU frequency to 480MHz
    bk_pm_module_vote_cpu_freq(PM_DEV_ID_LIN, PM_CPU_FRQ_480M);

    // 2. Create TensorFlow inference task
    xTaskCreate(tflite_task, "test", 1024*16, NULL, 3, NULL);

    // 3. CPU1 main loop
    while (1) {
        vTaskDelay(pdMS_TO_TICKS(1000));
    }
}

TensorFlow Task Initialization:

void tflite_task(void *arg) {
    // 1. Register debug log callback
    RegisterDebugLogCallback(debugLogCallback);

    // 2. Allocate model data buffer from PSRAM
    data_ptr = (unsigned char*)psram_malloc(
        g_gesture_detection_model_data_len);

    if (data_ptr == NULL) {
        os_printf("===data buffer malloc error====\r\n");
        // A FreeRTOS task function must not return; delete the task instead
        vTaskDelete(NULL);
    }

    // 3. Copy model data to PSRAM
    os_memcpy(data_ptr, g_gesture_detection_model_data,
              g_gesture_detection_model_data_len);

    // 4. Call setup to initialize model
    setup();

    // 5. Inference loop
    while (1) {
        loop();  // Execute one inference
        vTaskDelay(pdMS_TO_TICKS(5000));
    }
}

3.3 Model Loading

Setup Function Implementation (main_functions.cc):

void setup() {
    // 1. Initialize target platform
    tflite::InitializeTarget();

    // 2. Load model
    model = tflite::GetModel(data_ptr);

    // 3. Verify model version
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        MicroPrintf("Model schema version mismatch!\r\n");
        return;
    }

    // 4. Create operator resolver (13 operators)
    static tflite::MicroMutableOpResolver<13> micro_op_resolver;
    micro_op_resolver.AddConv2D(tflite::Register_CONV_2D_INT8());
    micro_op_resolver.AddPad();
    micro_op_resolver.AddMaxPool2D();
    micro_op_resolver.AddAdd();
    micro_op_resolver.AddQuantize();
    micro_op_resolver.AddConcatenation();
    micro_op_resolver.AddReshape();
    micro_op_resolver.AddPadV2();
    micro_op_resolver.AddResizeNearestNeighbor();
    micro_op_resolver.AddLogistic();
    micro_op_resolver.AddTranspose();
    micro_op_resolver.AddSplitV();
    micro_op_resolver.AddMul();

    // 5. Allocate Tensor Arena from PSRAM (360KB)
    uint8_t *tensor_arena = (uint8_t *)psram_malloc(360 * 1024);
    if (tensor_arena == NULL) {
        MicroPrintf("Tensor arena allocation failed\r\n");
        return;
    }

    // 6. Create interpreter
    static tflite::MicroInterpreter static_interpreter(
        model, micro_op_resolver, tensor_arena, 360 * 1024);
    interpreter = &static_interpreter;

    // 7. Allocate tensor memory
    TfLiteStatus allocate_status = interpreter->AllocateTensors();
    if (allocate_status != kTfLiteOk) {
        MicroPrintf("AllocateTensors() failed\r\n");
        return;
    }

    // 8. Get input tensor
    input = interpreter->input(0);
}

3.4 Inference Execution

Loop Function Implementation:

void loop() {
    TfLiteTensor* output = NULL;
    uint64_t before, after;

    // 1. Get camera image
    MicroPrintf("======Start detecting gesture======\r\n");
    before = (uint64_t)rtos_get_time();

    if (kTfLiteOk != GetImage(kNumCols, kNumRows, kNumChannels,
                              input->data.int8, 0)) {
        MicroPrintf("Image capture failed.\r\n");
        return;
    }

    // 2. Execute model inference
    if (kTfLiteOk != interpreter->Invoke()) {
        MicroPrintf("Invoke failed.\r\n");
        return;
    }

    // 3. Get output tensor and quantization parameters
    output = interpreter->output(0);
    g_scale = output->params.scale;
    g_zero_point = output->params.zero_point;

    // 4. Post-process to parse results
    uint8_t result = GESTURE_MAX;
    post_process(output->data.int8, &result);

    // 5. Calculate elapsed time (image capture + inference + post-processing)
    after = (uint64_t)rtos_get_time();
    MicroPrintf("Detection time: %d ms\r\n", (int)(after - before));
}
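
The scale and zero point read in step 3 describe the affine mapping between the INT8 values in the output tensor and real numbers: real = (q - zero_point) * scale. The sketch below shows that mapping and its inverse. Dequantize and Quantize are hypothetical helper names, and the scale and zero point used in the usage note are made-up values for illustration, not the model's actual parameters.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Affine dequantization: real = (q - zero_point) * scale.
inline float Dequantize(std::int8_t q, float scale, int zero_point) {
    return static_cast<float>(static_cast<int>(q) - zero_point) * scale;
}

// Inverse mapping, rounded to nearest and clamped to the int8 range.
// Useful for converting a real-valued threshold once, so that the
// per-box comparison can be done on raw int8 values.
inline std::int8_t Quantize(float real, float scale, int zero_point) {
    long q = std::lround(real / scale) + zero_point;
    if (q < -128) q = -128;
    if (q > 127) q = 127;
    return static_cast<std::int8_t>(q);
}
```

With an illustrative scale of 0.5 and zero point of -128, the raw value -4 dequantizes to 62.0, and Quantize(62.0f, 0.5f, -128) maps back to -4.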

3.5 Result Post-processing

Gesture Recognition Post-processing:

uint8_t post_process(int8_t *out_data, uint8_t *result) {
    *result = GESTURE_MAX;

    // Iterate through 2268 detection boxes
    for(int i = 0; i < 2268; i++) {
        // 1. Dequantize confidence score
        float score = (out_data[i*8 + 4] - g_zero_point) * g_scale;

        // 2. Filter by confidence threshold (>62)
        if(score > 62) {
            // 3. Dequantize bounding box coordinates
            int x = (out_data[i*8 + 0] - g_zero_point) * g_scale;
            int y = (out_data[i*8 + 1] - g_zero_point) * g_scale;
            int w = (out_data[i*8 + 2] - g_zero_point) * g_scale;
            int h = (out_data[i*8 + 3] - g_zero_point) * g_scale;

            // 4. Dequantize gesture category scores
            float paper = (out_data[i*8 + 5] - g_zero_point) * g_scale;
            float rock = (out_data[i*8 + 6] - g_zero_point) * g_scale;
            float scissors = (out_data[i*8 + 7] - g_zero_point) * g_scale;

            // 5. Determine gesture type (threshold >90)
            if (paper > 90) {
                MicroPrintf("Paper detected!\r\n");
                *result = GESTURE_PAPER;
            } else if (rock > 90) {
                MicroPrintf("Rock detected!\r\n");
                *result = GESTURE_ROCK;
            } else if (scissors > 90) {
                MicroPrintf("Scissors detected!\r\n");
                *result = GESTURE_SCISSORS;
            }

            break;  // Only process first high-confidence detection
        }
    }

    return 0;
}
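
post_process stops at the first box whose confidence passes the threshold. A possible variation, shown here only as a sketch and not as the project's method, scans all boxes, keeps the most confident one, and then takes the strongest class. The Gesture enum and DecodeBest are hypothetical names. The box count is derived under the assumption of a YOLO-style head with 3 anchors per cell on 24x24, 12x12 and 6x6 grids (strides 8, 16 and 32 on a 192x192 input), which is consistent with the 2268 used above. A positive scale is also assumed, so raw INT8 class values can be compared directly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum; the project's GESTURE_* values may be ordered differently.
enum class Gesture { kPaper = 0, kRock = 1, kScissors = 2, kNone = 3 };

// Per-box layout: x, y, w, h, confidence, paper, rock, scissors.
constexpr int kValuesPerBox = 8;

// Assumed head layout: 3 anchors on 24x24, 12x12 and 6x6 grids.
constexpr int kNumBoxes = 3 * (24 * 24 + 12 * 12 + 6 * 6);
static_assert(kNumBoxes == 2268, "matches the loop bound in post_process");

inline Gesture DecodeBest(const std::int8_t* out, int num_boxes,
                          float scale, int zero_point,
                          float box_threshold, float class_threshold) {
    // 1. Find the box with the highest confidence above the threshold.
    int best_box = -1;
    float best_score = box_threshold;
    for (int i = 0; i < num_boxes; ++i) {
        const float score = (out[i * kValuesPerBox + 4] - zero_point) * scale;
        if (score > best_score) {
            best_score = score;
            best_box = i;
        }
    }
    if (best_box < 0) return Gesture::kNone;

    // 2. Pick the strongest class of that box (raw values compare
    //    the same way as dequantized ones when scale > 0).
    const std::int8_t* cls = out + best_box * kValuesPerBox + 5;
    int best_class = 0;
    for (int c = 1; c < 3; ++c) {
        if (cls[c] > cls[best_class]) best_class = c;
    }

    // 3. Apply the class threshold to the dequantized score.
    const float cls_score = (cls[best_class] - zero_point) * scale;
    if (cls_score <= class_threshold) return Gesture::kNone;
    return static_cast<Gesture>(best_class);
}
```

A call such as DecodeBest(output->data.int8, kNumBoxes, g_scale, g_zero_point, 62, 90) would reuse the thresholds from post_process. Whether picking the best box improves accuracy over the first match depends on the model and has not been measured here.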

3.6 Image Input Interface

GetImage Function (image_provider.cc):

GetImage obtains input data from the camera or from pre-stored images and converts it to INT8 format. The listing is an outline of the steps, not the full implementation:

TfLiteStatus GetImage(int image_width, int image_height,
                     int channels, int8_t* image_data, int index) {
    // 1. Get image from camera or cache
    // 2. Convert to 192x192x3 INT8 format
    // 3. Fill into image_data buffer
    return kTfLiteOk;
}
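
The conversion in step 2 depends on the camera's pixel format, which this document does not specify. As a minimal sketch, assuming the frame arrives as RGB565, has already been scaled to the model resolution, and that the model's input quantization maps 0..255 to -128..127, the per-pixel conversion could look like this. Rgb565ToInt8 and ConvertFrame are hypothetical names, not functions from image_provider.cc.

```cpp
#include <cassert>
#include <cstdint>

// Expand one RGB565 pixel into three INT8 channel values (R, G, B).
inline void Rgb565ToInt8(std::uint16_t px, std::int8_t out[3]) {
    const int r5 = (px >> 11) & 0x1F;
    const int g6 = (px >> 5) & 0x3F;
    const int b5 = px & 0x1F;

    // Replicate the high bits into the low bits so that full-scale
    // 5/6-bit values map to 255 rather than 248/252.
    const int r8 = (r5 << 3) | (r5 >> 2);
    const int g8 = (g6 << 2) | (g6 >> 4);
    const int b8 = (b5 << 3) | (b5 >> 2);

    // Assumption: shift 0..255 to -128..127 for the INT8 input tensor.
    out[0] = static_cast<std::int8_t>(r8 - 128);
    out[1] = static_cast<std::int8_t>(g8 - 128);
    out[2] = static_cast<std::int8_t>(b8 - 128);
}

// Convert a whole frame; dst must hold width * height * 3 values.
inline void ConvertFrame(const std::uint16_t* src, int width, int height,
                         std::int8_t* dst) {
    for (int i = 0; i < width * height; ++i) {
        Rgb565ToInt8(src[i], dst + 3 * i);
    }
}
```

If the camera delivers YUV or JPEG data instead, a different conversion is needed before this step.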

3.7 External Call Interface

The following C interfaces are exposed to other modules:

// Initialization interface
extern "C" void tflite_task_init_c(void *arg) {
    bk_pm_module_vote_cpu_freq(PM_DEV_ID_LIN, PM_CPU_FRQ_480M);
    RegisterDebugLogCallback(debugLogCallback);

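    data_ptr = (unsigned char*)psram_malloc(
        g_gesture_detection_model_data_len);
    // Same allocation check as in tflite_task
    if (data_ptr == NULL) {
        os_printf("===data buffer malloc error====\r\n");
        return;
    }
    os_memcpy(data_ptr, g_gesture_detection_model_data,
              g_gesture_detection_model_data_len);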

    setup();
}

// Processing interface
extern "C" void tflite_process(uint8_t *data, uint32_t len,
                               uint8_t *result) {
    process_pic(data, len, result);
}

4. Testing Instructions

4.1 Gesture Direction

During testing, orient the gesture as shown in the figure and hold it approximately 50 cm above the camera.

Figure 1. Rock_Paper_Scissors Gesture Direction

4.2 Test Steps

Preparation Phase
- After power-on, a voice announcement says “Get ready”
- Position your gesture facing the camera
- Hold the gesture steady until the voice prompt says “Put your hand down”
Detection Phase
- The DVP camera captures a 192x192 image
- CPU1 executes TensorFlow Lite inference (approximately 200-400ms)
- The system recognizes the gesture type
Result Display
- If detection succeeds:
  - The dual LCD screens display the player's gesture and the system's gesture
  - A voice announcement gives the match result (win/lose/draw)
- If detection fails:
  - A voice prompt says “Detection failed”
  - You can retry
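
The win/lose/draw decision announced in the result step can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the Gesture and Outcome enums and their ordering are hypothetical, and std::mt19937 stands in for whatever random source the firmware uses for the system's gesture.

```cpp
#include <cassert>
#include <random>

// Hypothetical ordering chosen so that each gesture beats the one
// before it cyclically: paper beats rock, scissors beats paper,
// rock beats scissors.
enum class Gesture : int { kRock = 0, kPaper = 1, kScissors = 2 };
enum class Outcome { kDraw, kWin, kLose };

// Outcome from the player's point of view.
inline Outcome Judge(Gesture player, Gesture system) {
    const int diff =
        (static_cast<int>(player) - static_cast<int>(system) + 3) % 3;
    if (diff == 0) return Outcome::kDraw;
    return diff == 1 ? Outcome::kWin : Outcome::kLose;
}

// Draw the system's gesture uniformly from the three options.
inline Gesture RandomGesture(std::mt19937& rng) {
    std::uniform_int_distribution<int> dist(0, 2);
    return static_cast<Gesture>(dist(rng));
}
```

The modular difference avoids a nine-entry lookup table: 0 means a draw, 1 means the player's gesture is one step ahead of the system's and wins, and 2 means it loses.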

4.3 Debug Information

View the inference logs over the serial port (115200 baud):

======Start detecting gesture======
Paper detected!
Detection time: 285 ms

5. References

TensorFlow Lite Micro Developer Guide
Model Training: Based on YOLO object detection architecture
Component Source: components/bk_tflite_micro/
Example Project: projects/rock_paper_scissors/