Incorrect MobileNet_v2 inference results on esp-p4 (AIV-750) #200

Open
3 tasks done
WhiteDoveBuct opened this issue Feb 14, 2025 · 18 comments
@WhiteDoveBuct

WhiteDoveBuct commented Feb 14, 2025

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate.
  • Provided a clear description of your suggestion.
  • Included any relevant context or examples.

Issue or Suggestion Description

My PyTorch inference code:

import torch
from torchvision import models, transforms
from PIL import Image
import torch.nn as nn
import os

def convert_relu6_to_relu(model):
    for child_name, child in model.named_children():
        if isinstance(child, nn.ReLU6):
            setattr(model, child_name, nn.ReLU())
        else:
            convert_relu6_to_relu(child)
    return model

def get_jpeg_paths(directory):
    jpeg_paths = []
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith(".jpeg"):  # check the file extension
                jpeg_paths.append(os.path.join(root, file))
    return jpeg_paths

def replace_extension_to_bin(file_path: str) -> str:
    # Split the path into base name and extension
    base_name = os.path.splitext(file_path)[0]

    # Build the new path with a .bin extension
    new_file_path = base_name + '.bin'
    
    return new_file_path

# Load the pretrained MobileNetV2 model
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
# onnx_model_path = "/home/lzm/esp-dl/tools/quantization/models/torch/mobilenet_v2_relu.onnx"
# model = torch.onnx._load(onnx_model_path)
model = convert_relu6_to_relu(model)
model.eval()  # switch to evaluation mode

# Preprocessing transforms
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load the test images
image_pathes = get_jpeg_paths("/home/lzm/esp-dl/examples/mobilenet_v2/main/images")
image_pathes = sorted(image_pathes)

print(image_pathes)
for image_path in image_pathes:
    # print(idx)
    print(image_path)
    image = Image.open(image_path).convert("RGB")
    # print(image)
    input_tensor = transform(image).unsqueeze(0)  # add the batch dimension
    # print(input_tensor.shape)

    tensor_permuted = input_tensor.permute(0, 2, 3, 1)
    np_flat = torch.flatten(tensor_permuted).numpy()
    # print(np_flat[:10])

    # Save as a binary file
    out_bin_path = replace_extension_to_bin(image_path)
    with open(out_bin_path, "wb") as f:
        f.write(np_flat.tobytes())

    # Run inference
    with torch.no_grad():
        output = model(input_tensor)
    # print(output)

    # Get the predicted class
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    predicted_class = probabilities.argmax().item()

    # Print the result
    # print(f"Predicted class index: {predicted_class}")
    print(f"{image_path}: {predicted_class} : {probabilities[predicted_class].item()}")

My PyTorch inference results:

[screenshot: PyTorch inference output]

0 : 0.9897634983062744 (class index and confidence, respectively)

The code running on my ESP32-P4 chip:

extern const uint8_t _0_start[] asm("_binary_0_bin_start");
extern const uint8_t _0_end[] asm("_binary_0_bin_end");

extern const uint8_t _5_1_start[] asm("_binary_5_1_bin_start");
extern const uint8_t _5_1_end[] asm("_binary_5_1_bin_end");

extern const uint8_t _5_start[] asm("_binary_5_bin_start");
extern const uint8_t _5_end[] asm("_binary_5_bin_end");

extern const uint8_t _10_start[] asm("_binary_10_bin_start");
extern const uint8_t _10_end[] asm("_binary_10_bin_end");

extern const uint8_t _997_start[] asm("_binary_997_bin_start");
extern const uint8_t _997_end[] asm("_binary_997_bin_end");

extern const uint8_t _999_start[] asm("_binary_999_bin_start");
extern const uint8_t _999_end[] asm("_binary_999_bin_end");

extern "C" void app_main(void)
{
    ESP_LOGI(TAG, "get into app_main");
    Model *model = new Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION);

    ///< get model input
    // std::map<std::string, TensorBase *> graph_test_inputs = get_graph_test_inputs(model);
    std::map<std::string, TensorBase *> graph_inputs = model->get_inputs();
    std::string input_name = graph_inputs.begin()->first;
    TensorBase * input = graph_inputs.begin()->second;
    for(int i=0;i<input->shape.size();i++){
        ESP_LOGI(TAG, "shape[%d]=%d",i, input->shape[i]);
    }
    ESP_LOGI(TAG, "dtype=%d", (int)(input->dtype));
    ESP_LOGI(TAG, "exponent=%d", input->exponent);
    ///>==========================================

    ///< get model output
    std::map<std::string, dl::TensorBase *> model_outputs_map = model->get_outputs();
    TensorBase * output = model_outputs_map.begin()->second;
    ///>==========================================

    uint8_t *test_data[] = {
        (uint8_t *) _0_start,
        (uint8_t *) _5_1_start,
        (uint8_t *) _5_start,
        (uint8_t *) _10_start,
        (uint8_t *) _997_start,
        (uint8_t *) _999_start
    };

    uint32_t test_data_size[] = {
        (uint32_t)(_0_end   -   _0_start),
        (uint32_t)(_5_1_end -   _5_1_start),
        (uint32_t)(_5_end   -   _5_start),
        (uint32_t)(_10_end  -   _10_start),
        (uint32_t)(_997_end -   _997_start),
        (uint32_t)(_999_end -   _999_start)
    };
   
    int test_size = 6;

    for(int idx=0; idx<test_size; idx++){
        
        uint8_t *test_jpg_data = test_data[idx];  // preprocessed float32 data saved by the Python script, not a raw JPEG
        uint32_t test_jpg_size = test_data_size[idx];
        ESP_LOGI(TAG, "test_jpg_size: %lu", test_jpg_size);

        float * test_float_data = (float*)test_jpg_data;
        // for(int i=0; i<10; i++){
        //     ESP_LOGI(TAG, "test_float_data[%d]=%f",i,test_float_data[i]);
        // }

        // uint8_t *outbuf = (uint8_t *)heap_caps_malloc(test_jpg_size, MALLOC_CAP_SPIRAM);
        ///> prepare test data
        TensorBase test_float_input({224,224,3}, test_jpg_data, 0, dtype_t::DATA_TYPE_FLOAT, false, MALLOC_CAP_SPIRAM);
        
        ///< pre process
        std::map<std::string, TensorBase *> graph_test_inputs;
        TensorBase *test_input = new TensorBase(input->shape, nullptr, input->exponent, input->dtype, false, MALLOC_CAP_SPIRAM);
        test_input->assign(&test_float_input);
        graph_test_inputs.emplace(input_name, test_input);
        ///>============================================

        ///> forward
        model->run(graph_test_inputs);
        
        ///< post process
        dl::TensorBase test_output(output->shape, nullptr, 0, dl::DATA_TYPE_FLOAT);
        bool need_softmax = true;
        dl::module::Softmax * softmax_module = nullptr;
        if (need_softmax) {
            softmax_module = new dl::module::Softmax(nullptr, -1, dl::MODULE_NON_INPLACE, dl::QUANT_TYPE_SYMM_8BIT);
        }
        if (need_softmax) {
            softmax_module->run(output, &test_output);
        } else {
            test_output.assign(output);  // dequantize the model output into the float tensor
        }

        /// argmax over the (softmaxed) output
        float *output_ptr = (float *)test_output.data;
        float max_conf = 0.0;
        int max_idx = 0;
        for (int i = 0; i < test_output.get_size(); i++) {
            if (*output_ptr > max_conf) {
                max_conf = *output_ptr;
                max_idx = i;
            }
            output_ptr++;
        }
        ESP_LOGI(TAG, "max_conf=%f", max_conf);
        ESP_LOGI(TAG, "max_idx=%d", max_idx);
        ///>====================================

        ///< release resources
        if(softmax_module != nullptr) {
            delete softmax_module;
            softmax_module = nullptr;
        }
        // ::compare_test_outputs(model, model->get_outputs());
        for (auto graph_test_inputs_iter = graph_test_inputs.begin(); graph_test_inputs_iter != graph_test_inputs.end();
            graph_test_inputs_iter++) {
            if (graph_test_inputs_iter->second) {
                delete graph_test_inputs_iter->second;
            }
        }
        graph_test_inputs.clear();
        ///>==================================
    }

    delete model;
    ESP_LOGI(TAG, "exit app_main");
}

The inference results on my ESP32-P4 chip:

[screenshot: ESP32-P4 inference output]

Model path used on the esp32-p4: esp-dl/examples/mobilenet_v2/models/esp32p4/mobilenet_v2.espdl

@github-actions github-actions bot changed the title from "Incorrect MobileNet_v2 inference results on esp-p4" to "Incorrect MobileNet_v2 inference results on esp-p4 (AIV-750)" Feb 14, 2025
@100312dog
Contributor

100312dog commented Feb 17, 2025

@WhiteDoveBuct In your C++ code you are feeding JPEG data directly to the model. The JPEG first has to be decoded, then resized, normalized, and quantized to int8 before it can be used as input.

[screenshot: the preprocessing section of the Python code]
That corresponds to this part of the Python code; the center crop can be skipped.
For details, refer to the example at https://github.com/espressif/esp-dl/tree/master/examples/imagenet_cls.
https://github.com/espressif/esp-dl/blob/master/esp-dl/vision/image/dl_image_preprocessor.hpp
image_preprocessor wraps the resize, normalization, and quantize steps mentioned above.
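
For illustration, a minimal Python sketch of those steps (decode, resize, normalize, quantize); the mean/std values come from the transform in the question, while the exponent of -5 is an assumption and must match the exponent the loaded model actually reports for its input tensor:

import numpy as np
from PIL import Image

# Sketch only: "0.jpeg" is an example path; EXPONENT is an assumption taken from
# the model input tensor logged by app_main above.
EXPONENT = -5
SCALE = 2.0 ** EXPONENT
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

img = Image.open("0.jpeg").convert("RGB").resize((224, 224))  # decode + resize
x = np.asarray(img, dtype=np.float32) / 255.0                 # HWC, values in [0, 1]
x = (x - MEAN) / STD                                          # per-channel normalization
q = np.clip(np.round(x / SCALE), -128, 127).astype(np.int8)   # symmetric int8 quantization
print(q.shape, q.dtype)                                       # (224, 224, 3) int8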

@WhiteDoveBuct
Author

That is not the case. The data I feed in has already been preprocessed by the Python script; I posted that code in the question above. Here is a screenshot of it again:

[screenshot: the Python preprocessing code]

@WhiteDoveBuct
Author

The binary file saved there is the input to my C++ code. In principle, all I need to do in the C++ code is quantize the float data before feeding it to the model. My quantization in the C++ code is shown below (see the question above for the full code):

[screenshot: the C++ quantization code]
The place marked 1 in red loads the float32 data into a TensorBase.
The place marked 2 in red converts the float32 data to the int8 the model requires.
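
For reference, a rough Python equivalent of that conversion, as a sketch only: it assumes symmetric int8 quantization with round-to-nearest, which is what assign is being relied on to do here, and "0.bin" is just an example path.

import numpy as np

def quantize_symm_int8(x: np.ndarray, exponent: int) -> np.ndarray:
    """q = round(x / 2**exponent), clipped to the int8 range [-128, 127]."""
    scale = 2.0 ** exponent
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

x = np.fromfile("0.bin", dtype=np.float32)  # the float32 tensor saved by the Python script
q = quantize_symm_int8(x, exponent=-5)      # the exponent must match the model's input tensor
print(x[:4], q[:4])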

@WhiteDoveBuct
Author

@100312dog Please take a closer look at the C++ inference code. I am not passing in the JPEG image directly; I pass in the binary data produced by the Python script, read it directly in the C++ code, and then use TensorBase's assign method to quantize the input data.

@100312dog
Contributor

@WhiteDoveBuct Ah, I see now. Looking at it again, I think the issue is that esp-dl expects the input in BHWC order, while your Python script ends up saving BCHW.

@WhiteDoveBuct
Author

> @WhiteDoveBuct Ah, I see now. Looking at it again, I think the issue is that esp-dl expects the input in BHWC order, while your Python script ends up saving BCHW.

That is not the reason. I already convert the layout to BHWC in the Python script, as shown below:

[screenshot: the permute call in the Python script]

At the place marked 2 in red, BCHW is converted to BHWC.
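
A quick way to double-check the layout of a saved .bin, as a sketch assuming float32 data, batch size 1, and a 224x224x3 input ("0.bin" is an example path):

import numpy as np
import torch

# Read the file back and reinterpret it the way esp-dl will: a flat BHWC tensor.
np_flat = np.fromfile("0.bin", dtype=np.float32)
bhwc = np_flat.reshape(1, 224, 224, 3)
bchw = torch.from_numpy(bhwc).permute(0, 3, 1, 2)  # back to BCHW for PyTorch
print(bchw.shape)                                   # torch.Size([1, 3, 224, 224])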

@100312dog
Contributor

100312dog commented Feb 17, 2025

@WhiteDoveBuct Could you share the original images and the Python and C++ code? I'll give it a try and see where the problem is.

@WhiteDoveBuct
Author

WhiteDoveBuct commented Feb 17, 2025

> @WhiteDoveBuct Could you share the original images and the Python and C++ code? I'll give it a try and see where the problem is.

python: mobilenet_v2/main/test_ty_model_with_pic.py
c++: mobilenet_v2/main/app_main.cpp

main.zip

@WhiteDoveBuct
Author

@100312dog Please take a look. I wanted to upload the whole project, but for some reason the upload failed, so I only uploaded an archive containing the source code and the test images.

[screenshot]

@100312dog
Contributor

100312dog commented Feb 18, 2025

@WhiteDoveBuct
[screenshot: inference results on my side]
This is the result on my side.

In the project you provided, I only changed test_ty_model_with_pic.py
[screenshot: the modified test_ty_model_with_pic.py]

The test project is here; please try it and check whether the result is correct:
test_issue.zip

idf.py set-target esp32p4
idf.py flash monitor

@WhiteDoveBuct
Author

> @WhiteDoveBuct [screenshot] This is the result on my side.
>
> In the project you provided, I only changed test_ty_model_with_pic.py [screenshot]
>
> The test project is here; please try it and check whether the result is correct: test_issue.zip
>
> idf.py set-target esp32p4
> idf.py flash monitor

It is still not correct. I have already tried all of that; the code in my original question does exactly this:

[screenshot: the code from the original question]
One difference: I am using an esp32-p4 board, while it looks like you are using an esp32-s3?
Another question: where does your model come from?
The model I am using is: esp-dl/examples/mobilenet_v2/models/esp32p4/mobilenet_v2.espdl
esp-dl: v3.0.0

@WhiteDoveBuct
Author

@100312dog I see you are using esp-dl v3.1.0, but v3.1.0 has not been released on GitHub. I am using v3.0.0.

@100312dog
Contributor

100312dog commented Feb 18, 2025

@Ginosko-mia
Contributor

Ginosko-mia commented Feb 18, 2025

> @100312dog I see you are using esp-dl v3.1.0, but v3.1.0 has not been released on GitHub. I am using v3.0.0.

I ran it with esp-dl 3.0.0 and the result is correct. Please first try the released mobilenet_v2.espdl (esp-dl/examples/mobilenet_v2/models/esp32p4/mobilenet_v2.espdl) and check whether it runs correctly. Note that the released version has exponent=-6, while the one you ran before has exponent=-5.
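
To illustrate why the exponent matters (a minimal sketch of symmetric int8 quantization, not code from this thread): the same float value maps to different int8 codes depending on the exponent, so data quantized for exponent -5 is read as half the intended value by a model whose input exponent is -6.

import numpy as np

def quantize(x: float, exponent: int) -> int:
    """q = round(x / 2**exponent), clipped to the int8 range."""
    return int(np.clip(np.round(x / 2.0 ** exponent), -128, 127))

x = 0.75
print(quantize(x, -5))  # 24 -> code produced when quantizing with exponent -5
print(quantize(x, -6))  # 48 -> code a model with input exponent -6 expects for 0.75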

@WhiteDoveBuct
Author

> @100312dog I see you are using esp-dl v3.1.0, but v3.1.0 has not been released on GitHub. I am using v3.0.0.

> I ran it with esp-dl 3.0.0 and the result is correct. Please first try the released mobilenet_v2.espdl (esp-dl/examples/mobilenet_v2/models/esp32p4/mobilenet_v2.espdl) and check whether it runs correctly. Note that the released version has exponent=-6, while the one you ran before has exponent=-5.

I really do not understand what you mean. The mobilenet_v2.espdl in v3.0.0 (esp-dl/examples/mobilenet_v2/models/esp32p4/mobilenet_v2.espdl) has exponent=-5, not exponent=-6. Only the model on the master branch has exponent=-6, yet you say v3.0.0 works fine? At the very least the model must be wrong, right?

@Ginosko-mia
Contributor

@WhiteDoveBuct OK. I was running the master branch, and it produces the correct result.

@WhiteDoveBuct
Author

WhiteDoveBuct commented Feb 18, 2025

Summary of the cause: v3.0.0 on GitHub has a problem, and it reproduces the behaviour I described above (if you keep v3.0.0 and only replace the model with the one from master, the result is still wrong). Exactly what is broken in v3.0.0 will have to be pointed out by the developers, but the problem can be worked around by switching to the master branch, rebuilding, and reflashing.
This issue can be closed. Thanks everyone for the help.

@BlueSkyB
Collaborator

The model serialization and deserialization were changed earlier, but esp-dl did not update its TAG.
If your esp-ppq and esp-dl versions are mismatched, it can cause problems like this.
esp-dl and esp-ppq are still iterating quickly, so the TAG sometimes lags behind master; you can update both and use the master branch for both.
Later we will try to pin the esp-dl and esp-ppq versions to each other.
