nihui · 2022-11-30 · Guangdong

[ListenAI CSK6 Vision AI Development Kit Trial] Gesture Recognition and TinyMaix Neural Network Inference Experiments

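CSK6 is ListenAI's new-generation AI SoC product series. It uses a heterogeneous multi-core architecture that integrates an ARM Star MCU, a HiFi4 DSP, and ListenAI's newly designed NPU neural network core, with 128 GOPS of compute. The heterogeneous multi-core design lets the chip serve audio, image, and video AI applications at relatively low power.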

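The chips in this series integrate SRAM and PSRAM and support built-in or external Flash. They provide an Audio Codec with up to 4 inputs and 2 outputs, a VGA-resolution DVP camera interface, up to 6 touch-detection channels, and peripheral interfaces such as SPI, UART, USB, SDIO, I2C, and I2S. This range of interfaces supports development of many kinds of applications.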

CSK6011-NanoKit V1 is a NanoKit development board that carries the CSK6011A fully offline module.

0x0 Environment setup

https://docs.listenai.com/chi...

Native Linux support is a big plus. Here are the setup steps on Fedora 36.

Install snapd, then create a symlink by hand; otherwise the later snap install fails with: classic confinement requires snaps under /snap or symlink from /snap to /var/lib/snapd/snap

dnf install snapd
ln -s /var/lib/snapd/snap /snap

Download the offline installer package, extract it, and run the steps listed in install.sh:

cp -f 99-lisa.rules /etc/udev/rules.d/99-lisa.rules
udevadm control --reload-rules
udevadm trigger

snap install ./csk6_integration_installer_linux_v1.6.5.snap --classic --dangerous

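Open a new terminal and run lisa info zep to show the state of the installed environment. If it reports that the lisa tool has an update, press Y and Enter to update.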

0x1 Gesture recognition

https://docs.listenai.com/chi...

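Following the development guide in the docs, download the SDK and the sample project. This takes quite a while; going through a proxy speeds it up.

  1. In prj.conf, set CONFIG_WEBUSB=y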
  2. lisa zep build -b csk6011a_nano
  3. lisa zep flash

Everything went smoothly up to this point, but I hit a small snag when flashing the resource files:

lisa zep exec cskburn -s /dev/ttyACM0 -C 6 0x400000 ./resource/cp.bin -b 748800
lisa zep exec cskburn -s /dev/ttyACM0 -C 6 0x500000 ./resource/res.bin -b 748800

The error output:

Partition 1: 0x00400000 (751.35 KB) - ./resource/cp.bin
Waiting for device...
Entering update mode...
Detected flash size: 16 MB
Burning partition 1/1... (0x00400000, 751.35 KB)
ERROR: Failed burning partition 1

✖ Command failed : cskburn -s /dev/ttyACM0 -C 6 0x400000 ./resource/cp.bin -b 748800
 ›   Error: Command failed : cskburn -s /dev/ttyACM0 -C 6 0x400000 ./resource/cp.bin -b 748800

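Following the troubleshooting help, I quit the tio serial program, but the error persisted: the update always failed at around 25%.

After some trial and error I finally found the problem. The USB cable was plugged into a blue USB port on the front of the case. Moving it to a red USB port on the back of the case let the flashing complete without trouble.

Once flashing is done, press the reset button on the board to reboot it.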
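
As a sanity check on the two burn addresses used above, here is a minimal Python sketch that confirms cp.bin fits between 0x400000 and 0x500000 and that both offsets sit inside the detected 16 MB flash. The byte size of cp.bin is reconstructed from the "751.35 KB" that cskburn printed, assuming 1 KB = 1024 bytes; the size of res.bin is not shown in the output, so it is not checked.

```python
FLASH_SIZE = 16 * 1024 * 1024          # "Detected flash size: 16 MB"
CP_ADDR, RES_ADDR = 0x400000, 0x500000  # offsets passed to cskburn

# Assumption: cskburn's "751.35 KB" means 751.35 * 1024 bytes (rounded).
CP_SIZE = round(751.35 * 1024)


def fits(start, size, limit):
    """True if a blob of `size` bytes at `start` ends at or before `limit`."""
    return start + size <= limit


headroom_kb = (RES_ADDR - CP_ADDR - CP_SIZE) / 1024

print("cp.bin fits before res.bin:", fits(CP_ADDR, CP_SIZE, RES_ADDR))
print("res.bin offset inside flash:", RES_ADDR < FLASH_SIZE)
print(f"headroom after cp.bin: {headroom_kb:.2f} KB")
```

Under those assumptions the layout has no overlap, which is consistent with the failure coming from the USB port rather than from the addresses.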

https://docs.listenai.com/chi...

As the docs describe, connect a second Type-C cable to the computer.

git clone https://cloud.listenai.com/zephyr/applications/csk_view_finder_spd.git

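Open csk_view_finder_spd/src/index.html in the open-source Chromium browser and click to select and connect the device; the camera detection results then show up.

0x2 A few small tweaks

The default image resolution is very low

Near the top of app_algo_hsd_sample_for_csk6/src/webusb_render.c there is a #define WEBUSB_IMAGE_DOWNSAMPLING (6)

Changing it to 4 gives slightly better quality at a lower fps. Changing it to 2 causes tearing, probably because the larger image cannot be transferred in time.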
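
To get a feel for why a smaller downsampling factor costs so much, here is a rough Python estimate of the per-frame payload. Everything about the frame is an assumption for illustration only: a 640x480 VGA source, 1 byte per pixel, and the factor applied as a divisor on both width and height. The sample's real buffer format may differ.

```python
# Assumptions (not from the sample code): VGA source frame, 1 byte per pixel,
# and WEBUSB_IMAGE_DOWNSAMPLING dividing both width and height.
FRAME_W, FRAME_H, BYTES_PER_PIXEL = 640, 480, 1


def preview_bytes(factor):
    """Return (width, height, bytes per frame) for a given downsampling factor."""
    w, h = FRAME_W // factor, FRAME_H // factor
    return w, h, w * h * BYTES_PER_PIXEL


default_size = preview_bytes(6)[2]
for factor in (6, 4, 2):
    w, h, size = preview_bytes(factor)
    print(f"factor {factor}: {w}x{h}, {size} bytes/frame, "
          f"{size / default_size:.2f}x the default")
```

Because the factor applies to both axes, the payload grows with its square: under these assumptions, going from 6 to 2 means roughly nine times as much data per frame.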

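Printing LOG_INF output

Add CONFIG_LOG_DEFAULT_LEVEL=3 to prj.conf, then rebuild and flash; the LOG_INF messages are then printed on the serial port.

app_algo_hsd_sample_for_csk6/src/main.c has a main callback function, on_receive_hsd_result

With LOG_INF enabled: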

[00:05:04.368,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.435,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.501,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.568,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.635,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.702,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.769,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.836,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.903,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.971,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:05.038,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:05.105,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:05.172,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:05.239,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:05.306,000] <inf> main: gesture result id: 142 ,state: 4

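The gesture-recognition callback fires roughly every 70 ms. LOG_INF output is buffered, though, so in practice the lines usually show up four at a time.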
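
The interval can also be read straight off the timestamps. Here is a small Python sketch that parses the first five log lines above and averages the gaps; the helper names are mine, and the parser assumes the [hh:mm:ss.mmm,uuu] timestamp layout seen in this log.

```python
import re
from statistics import mean

LOG = """\
[00:05:04.368,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.435,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.501,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.568,000] <inf> main: gesture result id: 142 ,state: 4
[00:05:04.635,000] <inf> main: gesture result id: 142 ,state: 4
"""

# Timestamp layout assumed from the log above: [hh:mm:ss.mmm,uuu]
STAMP = re.compile(r"\[(\d+):(\d+):(\d+)\.(\d+),(\d+)\]")


def timestamps_ms(log):
    """Extract each line's timestamp as milliseconds since boot."""
    stamps = []
    for line in log.splitlines():
        m = STAMP.match(line)
        if m:
            h, mi, s, ms, us = map(int, m.groups())
            stamps.append(((h * 60 + mi) * 60 + s) * 1000 + ms + us / 1000)
    return stamps


stamps = timestamps_ms(LOG)
gaps = [b - a for a, b in zip(stamps, stamps[1:])]
print(f"mean interval: {mean(gaps):.2f} ms "
      f"(about {1000 / mean(gaps):.1f} callbacks per second)")
```

Since the timestamps are taken when the message is logged rather than when it is displayed, the buffering does not distort this measurement.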

SDK paths on the system

Installed by default at $HOME/snap/lisa/x1/.listenai/csk-sdk/zephyr/include/zephyr/

The toolchain is at $HOME/snap/lisa/x1/.listenai/lisa-zephyr/packages/node_modules/@binary/gcc-arm-none-eabi-10.3/binary/

These paths are also printed by lisa zep build

0x3 TinyMaix experiment

https://github.com/sipeed/Tin...

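TinyMaix is an ultra-lightweight neural network inference library for microcontrollers, i.e. a TinyML inference library, that lets you run lightweight deep learning models on any MCU~

ARM Star MCU: up to 300 MHz clock

Starting from the hello_world project, copy the entire TinyMaix source tree into it and edit the CMake file to pull in the code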

cmake_minimum_required(VERSION 3.20.0)

find_package(Zephyr REQUIRED HINTS $ENV{ZEPHYR_BASE})
project(hello_world)

aux_source_directory(${CMAKE_CURRENT_SOURCE_DIR}/src/TinyMaix/src lib_tinymaix)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src/TinyMaix/include)

target_sources(app PRIVATE src/TinyMaix/examples/mnist/main.c ${lib_tinymaix})

Edit TinyMaix/include/tm_port.h to port the csk-specific configuration, and stub out the debug timing macros that do not compile as-is

#define TM_LOCAL_MATH   (1)         //use local math func (like exp()) to avoid libm

#define tm_malloc(x)    csk_malloc(x)
#define tm_free(x)      csk_free(x)

#define  TM_GET_US()
#define  TM_DBGT_INIT()
#define  TM_DBGT_START()
#define  TM_DBGT(x)

Edit prj.conf to set a larger heap size

CONFIG_HEAP_MEM_POOL_SIZE=300000
CONFIG_CSK_HEAP=y
CONFIG_CSK_HEAP_MEM_POOL_SIZE=842736

Same routine as before: build and flash

lisa zep build -b csk6011a_nano
lisa zep flash

The serial port prints the mnist inference result; the digit is correctly recognized as 2

*** Booting Zephyr OS build v1.1.1-alpha.2  ***
mnist demo
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,116,125,171,255,255,150, 93,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,169,253,253,253,253,253,253,218, 30,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,169,253,253,253,213,142,176,253,253,122,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0, 52,250,253,210, 32, 12,  0,  6,206,253,140,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0, 77,251,210, 25,  0,  0,  0,122,248,253, 65,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0, 31, 18,  0,  0,  0,  0,209,253,253, 65,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,117,247,253,198, 10,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 76,247,253,231, 63,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,128,253,253,144,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,176,246,253,159, 12,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 25,234,253,233, 35,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,198,253,253,141,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0, 78,248,253,189, 12,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0, 19,200,253,253,141,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,134,253,253,173, 12,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,248,253,253, 25,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,248,253,253, 43, 20, 20, 20, 20,  5,  0,  5, 20, 20, 37,150,150,150,147, 10,  0,
  0,  0,  0,  0,  0,  0,  0,  0,248,253,253,253,253,253,253,253,168,143,166,253,253,253,253,253,253,253,123,  0,
  0,  0,  0,  0,  0,  0,  0,  0,174,253,253,253,253,253,253,253,253,253,253,253,249,247,247,169,117,117, 57,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,118,123,123,123,166,253,253,253,155,123,123, 41,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
================================ model stat ================================
mdl_type=0 (int8))
out_deq=1
input_cnt=1, output_cnt=1, layer_cnt=6
input 3dims: (28, 28, 1)
output 1dims: (1, 1, 10)
main buf size 1464; sub buf size 0
//Note: PARAM is layer param size, include align padding

Idx     Layer            outshape       inoft   outoft  PARAM   MEMOUT OPS
---     Input            28, 28,  1     -       0       0       784     0
###L71: body oft = 64
###L72: type=0, is_out=0, size=152, in_oft=0, out_oft=784, in_dims=[3,28,28,1], out_dims=[3,13,13,4], in_s=0.004, in_zp=-128, out_s=0.016, out_zp=-128
###L85: Conv2d: kw=3, kh=3, sw=2, sh=2, dw=1, dh=1, act=1, pad=[0,0,0,0], dmul=0, ws_oft=80, w_oft=96, b_oft=136
000     Conv2D           13, 13,  4     0       784     72      676     6084
###L71: body oft = 216
###L72: type=0, is_out=0, size=432, in_oft=784, out_oft=0, in_dims=[3,13,13,4], out_dims=[3,6,6,8], in_s=0.016, in_zp=-128, out_s=0.016, out_zp=-128
###L85: Conv2d: kw=3, kh=3, sw=2, sh=2, dw=1, dh=1, act=1, pad=[0,0,0,0], dmul=0, ws_oft=80, w_oft=112, b_oft=400
001     Conv2D            6,  6,  8     784     0       352     288     10368
###L71: body oft = 648
###L72: type=0, is_out=0, size=1360, in_oft=0, out_oft=1400, in_dims=[3,6,6,8], out_dims=[3,2,2,16], in_s=0.016, in_zp=-128, out_s=0.057, out_zp=-128
###L85: Conv2d: kw=3, kh=3, sw=2, sh=2, dw=1, dh=1, act=1, pad=[0,0,0,0], dmul=0, ws_oft=80, w_oft=144, b_oft=1296
002     Conv2D            2,  2, 16     0       1400    1280    64      4608
###L71: body oft = 2008
###L72: type=1, is_out=0, size=48, in_oft=1400, out_oft=0, in_dims=[3,2,2,16], out_dims=[1,1,1,16], in_s=0.057, in_zp=-128, out_s=0.022, out_zp=-128
003     GAP               1,  1, 16     1400    0       0       16      64
###L71: body oft = 2056
###L72: type=2, is_out=0, size=304, in_oft=0, out_oft=1448, in_dims=[1,1,1,16], out_dims=[1,1,1,10], in_s=0.022, in_zp=-128, out_s=0.151, out_zp=42
###L96: FC: ws_oft=64, w_oft=104, b_oft=264
004     FC                1,  1, 10     0       1448    240     10      160
###L71: body oft = 2360
###L72: type=3, is_out=1, size=48, in_oft=1448, out_oft=0, in_dims=[1,1,1,10], out_dims=[1,1,1,10], in_s=0.151, in_zp=42, out_s=0.004, out_zp=-128
005     Softmax           1,  1, 10     1448    0       0       10      60

Total param ~1.9 KB, OPS ~0.02 MOPS, buffer 1.4 KB

0: 0.004
1: 0.004
2: 0.996
3: 0.004
4: 0.000
5: 0.000
6: 0.004
7: 0.004
8: 0.004
9: 0.004
### Predict output is: Number 2, prob 0.996
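
The model stat table above can be cross-checked with a few lines of arithmetic. The Python sketch below recomputes the three Conv2D output shapes and OPS figures from the kernel size, stride, and zero padding shown in the log, assuming OPS counts one operation per kernel tap per output element (an assumption that happens to reproduce the logged numbers). The GAP, FC, and Softmax figures are copied from the log rather than derived.

```python
def conv_out(size, kernel, stride):
    """Output length of a convolution with no padding (pad=[0,0,0,0] in the log)."""
    return (size - kernel) // stride + 1


def conv_ops(out_hw, out_ch, kernel, in_ch):
    """Assumed OPS model: one operation per kernel tap per output element."""
    return out_hw * out_hw * out_ch * kernel * kernel * in_ch


hw, in_ch = 28, 1                      # input 3dims: (28, 28, 1)
layers = []
for out_ch in (4, 8, 16):              # output channels of the three Conv2D layers
    hw = conv_out(hw, kernel=3, stride=2)
    layers.append((hw, out_ch, conv_ops(hw, out_ch, 3, in_ch)))
    in_ch = out_ch

# Copied from the OPS column of the log, not derived here.
OTHER_OPS = {"GAP": 64, "FC": 160, "Softmax": 60}
total_ops = sum(ops for _, _, ops in layers) + sum(OTHER_OPS.values())

for hw_out, ch, ops in layers:
    print(f"Conv2D -> {hw_out}x{hw_out}x{ch}: {ops} OPS")
print(f"total: {total_ops} OPS (~{total_ops / 1e6:.2f} MOPS)")
```

The recomputed shapes (13, 6, 2) and per-layer OPS (6084, 10368, 4608) match the table, and the total rounds to the reported ~0.02 MOPS.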

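0x4 Summary

  • Native Windows/Linux/macOS support for the development environment is a big plus
  • The csk docs are detailed, and the sample demos are easy to get started with and work well
  • Looking forward to more low-level technical details being opened up, such as compiling custom models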
  • qwqwqwq