夜夜大叔 · July 29, 2021

[Zhouyi AIPU Simulation] ShuffleNet model on win10 + wsl2 + ubuntu + docker

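Workflow

  1. Set up wsl2 + ubuntu + docker
  2. Tips for deploying zepan/zhouyi with docker
  3. Prepare the model
  4. Prepare the calibration dataset
  5. Prepare the input sample and the output reference
  6. Edit the configuration file
  7. Verify the results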

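1. Set up wsl2 + ubuntu + docker

  • For setting up wsl2 + ubuntu on Windows, see the linked tutorial.
  • To access the Ubuntu files from win10 (pinning the folder to Quick Access makes it easier to find later), run the following inside Ubuntu once it is set up:

    explorer.exe .
  • For setting up docker, see the linked tutorial.
  • Note: inside wsl, docker has to be started with

    sudo service docker start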

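2. Tips for deploying zepan/zhouyi with docker

  • Deploy by following the linked official tutorial.
  • After exiting the container, run docker ps -a to look up the container ID.
  • Copy the main files from the container to the host:

    docker cp [container ID]:/root/demos/tflite [host directory]
  • Start a new Zhouyi container with that directory mounted, so that files are shared between the container and the host in real time:

    sudo docker run -i -t --privileged=true -v [host directory]:/root/demos/tflite zepan/zhouyi  /bin/bash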


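3. Prepare the model

  • The ONNX models shufflenet-9 and shufflenet-v2-10 can be downloaded directly from GitHub.
  • Note: use the linked website to inspect the model's inputs and outputs. Their names and shapes are needed later when editing the configuration file.
  • Inputs and outputs of shufflenet-9: (figure: shufflenet-9.png)
  • Inputs and outputs of shufflenet-v2-10: (figure: image.png)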
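
The same information can also be read programmatically. The following is a minimal sketch, assuming the onnx Python package is available in the container. The helper names dims_of and describe_io are made up for illustration and are not part of any toolchain.

```python
def dims_of(value_info):
    """Return the dimensions of an ONNX ValueInfoProto as a list.

    Fixed dimensions come back as ints, symbolic ones as their name,
    and "?" is used when the model gives neither.
    """
    dims = []
    for d in value_info.type.tensor_type.shape.dim:
        if d.dim_param:
            dims.append(d.dim_param)
        elif d.dim_value > 0:
            dims.append(d.dim_value)
        else:
            dims.append("?")
    return dims


def describe_io(model_path):
    """Return ({input name: dims}, {output name: dims}) for an ONNX file."""
    import onnx  # imported lazily so dims_of works without onnx installed

    graph = onnx.load(model_path).graph
    # Some exporters also list weights under graph.input; skip those.
    weights = {init.name for init in graph.initializer}
    inputs = {i.name: dims_of(i) for i in graph.input if i.name not in weights}
    outputs = {o.name: dims_of(o) for o in graph.output}
    return inputs, outputs
```

Calling describe_io on the downloaded .onnx file should give the names and shapes that go into the input, input_shape and output fields of the configuration file.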

4. Prepare the calibration dataset

  • In /root/demos/tflite/, create preprocess_dataset_onnx.py with the following code and run it

    import os
    from torchvision import transforms
    from PIL import Image
    import numpy as np
    
    imgs_path = './img/'
    # Sort the file names so the order is deterministic.
    # Assumption: label.txt lists the images in the same (sorted) order.
    imgs_path_list = [os.path.join(imgs_path, i) for i in sorted(os.listdir(imgs_path))]
    # convert('RGB') guards against grayscale or CMYK JPEGs
    imgs_list = [Image.open(i).convert('RGB') for i in imgs_path_list]
    # Standard ImageNet preprocessing; named "preprocess" so the
    # torchvision "transforms" module is not shadowed
    preprocess = transforms.Compose([
      transforms.Resize(224),
      transforms.CenterCrop(224),
      transforms.ToTensor(),
      transforms.Normalize([0.485, 0.456, 0.406],[0.229, 0.224, 0.225])
    ])
    imgs_list = [np.array(preprocess(i)) for i in imgs_list]
    # CHW -> HWC
    imgs_list = [np.transpose(i,(1,2,0)) for i in imgs_list]
    imgs_list = np.array(imgs_list)
    print(imgs_list.shape)
    np.save('./preprocess/data.npy',imgs_list)
    
    # Save the labels
    label_array = []
    with open('label.txt') as f:
      line = f.readlines()
      for i in range(imgs_list.shape[0]):
          # The class id is taken from column 29 up to the last two characters,
          # which relies on fixed-width file names in label.txt
          label_array = label_array + [line[i][29:-2]]
    label_array = [int(i) for i in label_array]
    label_array = np.array(label_array)
    print(label_array.shape)
    np.save('./preprocess/label.npy',label_array)
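
The fixed slice line[i][29:-2] in the script above depends on the exact file-name width and line ending of label.txt. A more tolerant parser could look like the sketch below. It assumes each line has the form "<file name> <class id>" separated by whitespace, which should be checked against your own label.txt. parse_label_line and load_labels are illustrative names.

```python
def parse_label_line(line):
    """Split '<file name> <class id>' into (file_name, class_id)."""
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("unexpected label line: %r" % line)
    return parts[0], int(parts[-1])


def load_labels(lines, file_names):
    """Return the class ids in the same order as file_names.

    Looking labels up by file name avoids relying on the directory
    listing and label.txt being in the same order.
    """
    table = dict(parse_label_line(l) for l in lines if l.strip())
    return [table[name] for name in file_names]
```

With this approach the labels follow the image list even when only a subset of the images is used for calibration.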

5. Prepare the input sample and the output reference

  • In /root/demos/tflite/, create preprocess.py with the following code and run it

    import cv2
    import numpy as np
    import onnxruntime as ort
    
    input_height=224
    input_width=224
    input_channel=3
    
    img_path = "./img/ILSVRC2012_val_00000004.JPEG"
    
    # Load the image, convert BGR -> RGB and resize to the model input size
    orig_image = cv2.imread(img_path)
    image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (input_width, input_height))
    # Shift to a zero-centred range (mean 127.5, scale 1)
    image = (image - 127.5) / 1
    image_1 = np.expand_dims(image, axis=0)
    # int8 NHWC copy: this becomes the simulator input
    image = image_1.astype(np.int8)
    
    # float32 NCHW copy: this is fed to onnxruntime for the reference output
    image_1 = image_1.astype(np.float32)
    image_1 = image_1.transpose([0,3,1,2])
    # Same model path as input_model in the cfg file
    ort_session = ort.InferenceSession("./preprocess/shufflenet-v2-10.onnx")
    outputs = ort_session.run(None, {'input':image_1})
    
    print("onnx result:",outputs[0])
    
    # Scale the reference output and store it as raw bytes
    pred = 255 * outputs[0]
    pred = pred.astype(np.uint8)
    with open('./preprocess/output_ref.bin', 'wb') as fw:
      fw.write(pred.tobytes())
    
    image.tofile("./preprocess/input.bin")
    print("save to input.bin OK")
  • Also modify quant_predict.py in /root/demos/tflite/ so that it is ready for the test later on

    from PIL import Image
    import cv2
    from matplotlib import pyplot as plt
    import matplotlib.patches as patches
    import numpy as np
    import os
    import imagenet_classes as class_name
    
    current_dir = os.getcwd()
    label_offset = 1
    outputfile = current_dir + '/preprocess/output.bin'
    npyoutput = np.fromfile(outputfile, dtype=np.uint8)
    outputclass = npyoutput.argmax()
    head5p = npyoutput.argsort()[-5:][::-1]
    
    labelfile = current_dir + '/preprocess/output_ref.bin'
    npylabel = np.fromfile(labelfile, dtype=np.int8)
    labelclass = npylabel.argmax()
    head5t = npylabel.argsort()[-5:][::-1]
    
    print("predict first 5 label:")
    for i in head5p:
      print("    index %4d, prob %3d, name: %s"%(i, npyoutput[i], class_name.class_names[i-label_offset]))
      
    print("true first 5 label:")
    for i in head5t:
      print("    index %4d, prob %3d, name: %s"%(i, npylabel[i], class_name.class_names[i-label_offset]))
    
    # Show input picture
    print('Detect picture save to result.jpeg')
    
    input_path = './preprocess/input.bin'
    npyinput = np.fromfile(input_path, dtype=np.int8)
    image = np.clip(np.round(npyinput)+128, 0, 255).astype(np.uint8)
    image = np.reshape(image, (224, 224, 3))
    im = Image.fromarray(image)
    im.save('result.jpeg')
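
preprocess.py stores each pixel as int8 after subtracting 127.5, and quant_predict.py adds 128 to rebuild a viewable picture. The sketch below mirrors that arithmetic in plain Python to check how much the round trip changes a pixel. to_int8 and to_uint8 are illustrative names, and int() is used because it truncates toward zero in the same way as the float-to-int8 cast in NumPy.

```python
def to_int8(pixel):
    """Mimic (pixel - 127.5).astype(np.int8): shift, then truncate toward zero."""
    return int(pixel - 127.5)


def to_uint8(value):
    """Mimic np.clip(value + 128, 0, 255) used to rebuild the picture."""
    return max(0, min(255, value + 128))


# Largest difference between an original pixel and its restored value
worst = max(abs(to_uint8(to_int8(p)) - p) for p in range(256))
print("int8 range:", to_int8(0), "to", to_int8(255))
print("largest round-trip error:", worst)
```

The restored picture differs from the resized input by at most one grey level, so result.jpeg is a faithful preview of what the simulator received.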

6. Edit the configuration file

  • Using tflite_mobilenet_v2_run.cfg as a reference, create your own onnx_shufflenet_run.cfg with the following content

    [Common]
    mode = run
    
    [Parser]
    model_type = onnx
    input_data_format = NCHW
    model_name = shufflenet
    detection_postprocess = 
    model_domain = image_classification
    input_model = ./preprocess/shufflenet-v2-10.onnx
    input = input
    input_shape = [1, 3, 224, 224]
    output = 
    
    [AutoQuantizationTool]
    quantize_method = SYMMETRIC
    ops_per_channel = DepthwiseConv
    reverse_rgb = False
    calibration_data = ./preprocess/data.npy
    calibration_label = ./preprocess/label.npy
    label_id_offset = 0
    preprocess_mode = normalize
    quant_precision = int8
    
    [GBuilder]
    inputs=./preprocess/input.bin
    simulator=aipu_simulator_z1
    outputs=./preprocess/output.bin
    profile= True
    target=Z1_0701
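
Because the .cfg file uses INI syntax, a quick pre-flight check can catch path or shape mistakes before running aipubuild. This is a sketch built on Python's standard configparser and is not part of the Zhouyi toolchain. check_config is an illustrative name, and the set of checks is chosen for illustration only.

```python
import configparser
import json
import os


def check_config(text, base_dir="."):
    """Parse cfg text and return (input_shape, missing_files)."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    # input_shape is written like a JSON list, e.g. [1, 3, 224, 224]
    shape = json.loads(cfg["Parser"]["input_shape"])
    # Files that have to exist before the build starts
    needed = [
        cfg["Parser"]["input_model"],
        cfg["AutoQuantizationTool"]["calibration_data"],
        cfg["AutoQuantizationTool"]["calibration_label"],
        cfg["GBuilder"]["inputs"],
    ]
    missing = [p for p in needed
               if not os.path.exists(os.path.join(base_dir, p))]
    return shape, missing
```

Reading config/onnx_shufflenet_run.cfg into a string and passing it to check_config from /root/demos/tflite/ should return the input shape and an empty list of missing files once steps 3 to 5 are done.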

7. Verify the results

  • Run the command

    aipubuild config/onnx_shufflenet_run.cfg
  • This produces the following output

    root@f4e7a897f777:~/demos/tflite# aipubuild config/onnx_shufflenet_run.cfg
    WARNING:tensorflow:
    The TensorFlow contrib module will not be included in TensorFlow 2.0.
    For more information, please see:
    * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
    * https://github.com/tensorflow/addons
    * https://github.com/tensorflow/io (for I/O related ops)
    If you depend on functionality not listed there, please file an issue.
    
    [I] Parsing model....
    [I] [Parser]: Begin to parse onnx model shufflenet...
    2021-07-27 15:46:56.578841: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
    2021-07-27 15:46:56.591916: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2499995000 Hz
    2021-07-27 15:46:56.602401: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7bee6d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2021-07-27 15:46:56.602450: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
    /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
    return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
    [I] [Parser]: Parser done!
    [I] Parse model complete
    [I] Quantizing model....
    [I] AQT start: model_name:shufflenet, calibration_method:MEAN, batch_size:1
    [I] ==== read ir ================
    [I]     float32 ir txt: /tmp/AIPUBuilder_1627400815.3939884/shufflenet.txt
    [I]     float32 ir bin2: /tmp/AIPUBuilder_1627400815.3939884/shufflenet.bin
    [I] ==== read ir DONE.===========
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
    
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.
    
    [I] ==== auto-quantization ======
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use eager execution and:
    `tf.data.TFRecordDataset(path)`
    WARNING:tensorflow:Entity <bound method ImageNet.data_transform_fn of <AIPUBuilder.AutoQuantizationTool.auto_quantization.data_set.ImageNet object at 0x7f64b4cef588>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: <cyfunction ImageNet.data_transform_fn at 0x7f65cc8a7d38> is not a module, class, method, function, traceback, frame, or code object
    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py:330: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.
    
    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py:330: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.
    
    WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/func_graph.py:915: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
    
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_one_shot_iterator(dataset)`.
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
    
    WARNING:tensorflow:From /usr/local/bin/aipubuild:8: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.
    
    
    [I]     step1: get max/min statistic value DONE
    [W] shift value is discrete in Depthwise, layer Conv_12, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_42, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_76, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_91, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_106, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_151, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_181, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_200, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_215, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_230, fixed by constraining shift value, may lead to acc drop
    [W] shift value is discrete in Depthwise, layer Conv_245, fixed by constraining shift value, may lead to acc drop
    [I]     step2: quantization each op DONE
    [I]     step3: build quantization forward DONE
    [I]     step4: show output scale of end node:
    [I]             layer_id:137, layer_top:Gemm_260, output_scale:[9.276978]
    [I] ==== auto-quantization DONE =
    [I] Quantize model complete
    [I] Building ...
    [I] [common_options.h: 276] BuildTool version: 4.0.175. Build for target Z1_0701 at frequency 800MHz
    [I] [common_options.h: 297] using default profile events to profile AIFF
    
    [I] [IRChecker] Start to check IR: /tmp/AIPUBuilder_1627400815.3939884/shufflenet_int8.txt
    [I] [IRChecker] model_name: shufflenet
    [I] [IRChecker] IRChecker: All IR pass
    [I] [graph.cpp : 846] loading graph weight: /tmp/AIPUBuilder_1627400815.3939884/shufflenet_int8.bin size: 0x2322f0
    [I] [builder.cpp:1059] Total memory for this graph: 0x90f940 Bytes
    [I] [builder.cpp:1060] Text   section:  0x000b61c0 Bytes
    [I] [builder.cpp:1061] RO     section:  0x00007300 Bytes
    [I] [builder.cpp:1062] Desc   section:  0x00010900 Bytes
    [I] [builder.cpp:1063] Data   section:  0x00286f80 Bytes
    [I] [builder.cpp:1064] BSS    section:  0x0057a800 Bytes
    [I] [builder.cpp:1065] Stack         :  0x00040400 Bytes
    [I] [builder.cpp:1066] Workspace(BSS):  0x00049800 Bytes
    [I] [main.cpp  : 467] # autogenrated by aipurun, do NOT modify!
    LOG_FILE=log_default
    FAST_FWD_INST=0
    INPUT_INST_CNT=1
    INPUT_DATA_CNT=2
    CONFIG=Z1-0701
    LOG_LEVEL=0
    INPUT_INST_FILE0=/tmp/temp_3bbac3852f594e475c81cc8abeccf.text
    INPUT_INST_BASE0=0x0
    INPUT_INST_STARTPC0=0x0
    INPUT_DATA_FILE0=/tmp/temp_3bbac3852f594e475c81cc8abeccf.ro
    INPUT_DATA_BASE0=0x10000000
    INPUT_DATA_FILE1=/tmp/temp_3bbac3852f594e475c81cc8abeccf.data
    INPUT_DATA_BASE1=0x20000000
    OUTPUT_DATA_CNT=2
    OUTPUT_DATA_FILE0=output.bin
    OUTPUT_DATA_BASE0=0x20a05a00
    OUTPUT_DATA_SIZE0=0x3e8
    OUTPUT_DATA_FILE1=profile_data.bin
    OUTPUT_DATA_BASE1=0x20410b80
    OUTPUT_DATA_SIZE1=0xf00
    RUN_DESCRIPTOR=BIN[0]
    
    [I] [main.cpp  : 118] run simulator:
    aipu_simulator_z1 /tmp/temp_3bbac3852f594e475c81cc8abeccf.cfg
    [INFO]:SIMULATOR START!
    [INFO]:========================================================================
    [INFO]:                             STATIC CHECK
    [INFO]:========================================================================
    [INFO]:  INST START ADDR : 0x0(0)
    [INFO]:  INST END ADDR   : 0xb61bf(745919)
    [INFO]:  INST SIZE       : 0xb61c0(745920)
    [INFO]:  PACKET CNT      : 0xb61c(46620)
    [INFO]:  INST CNT        : 0x2d870(186480)
    [INFO]:------------------------------------------------------------------------
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x3f41: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x3f41(16193) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x46b0: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x46b0(18096) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x4ad4: 0x472021b(POP R27,Rc7) vs 0x5f00000(MVI R0,0x0,Rc7), PACKET:0x4ad4(19156) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x4ae1: 0x472021b(POP R27,Rc7) vs 0x5f00000(MVI R0,0x0,Rc7), PACKET:0x4ae1(19169) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x4c46: 0x472021b(POP R27,Rc7) vs 0x9f80020(ADD.S R0,R0,0x1,Rc7), PACKET:0x4c46(19526) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x4de3: 0x4520180(BRL R0) vs 0x47a03e5(ADD R5,R0,R31,Rc7), PACKET:0x4de3(19939) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x57ab: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x57ab(22443) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x6c16: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x6c16(27670) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x7535: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x7535(30005) SLOT:0 vs 3
    [WARN]:[0803] INST WR/RD REG CONFLICT! PACKET 0x808b: 0x4720204(POP R4,Rc7) vs 0x47a1be0(ADD R0,R6,R31,Rc7), PACKET:0x808b(32907) SLOT:0 vs 3
    [INFO]:========================================================================
    [INFO]:                             STATIC CHECK END
    [INFO]:========================================================================
    
    [INFO]:AIPU START RUNNING: BIN[0]
    [INFO]:TOTAL TIME: 23.970811s.
    [INFO]:SIMULATOR EXIT!
    [I] [main.cpp  : 135] Simulator finished.
    Total errors: 0,  warnings: 0
  • The simulator reports a total time of 23.97 s, which looks good
  • Run the test script

    python quant_predict.py
  • This gives the final result

    root@f4e7a897f777:~/demos/tflite# python quant_predict.py
    predict first 5 label:
      index  231, prob 150, name: Shetland sheepdog, Shetland sheep dog, Shetland
      index  232, prob  143, name: collie
      index  158, prob   46, name: papillon
      index  342, prob   45, name: hog, pig, grunter, squealer, Sus scrofa
      index  340, prob   31, name: sorrel
    true first 5 label:
      index  232, prob  123, name: collie
      index  231, prob  109, name: Shetland sheepdog, Shetland sheep dog, Shetland
      index  158, prob  41, name: papillon
      index  170, prob  36, name: borzoi, Russian wolfhound
      index  161, prob  34, name: Afghan hound, Afghan
    Detect picture save to result.jpeg
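
One simple way to quantify how close the quantized result is to the reference is to count how many top-5 classes the two share. The sketch below uses the indices printed above. top5_overlap is an illustrative helper and is not part of quant_predict.py.

```python
def top5_overlap(predicted, reference):
    """Return the class indices present in both lists, in predicted order."""
    ref = set(reference)
    return [i for i in predicted if i in ref]


predicted = [231, 232, 158, 342, 340]  # "predict first 5 label" above
reference = [232, 231, 158, 170, 161]  # "true first 5 label" above

shared = top5_overlap(predicted, reference)
print("shared classes:", shared, "(%d of 5)" % len(shared))
print("top-1 matches:", predicted[0] == reference[0])
```

For this image three of the five classes overlap, and the top two (Shetland sheepdog and collie) are swapped between the simulator output and the ONNX reference.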