爱笑的小姐姐 · 2021年05月24日

深度学习-性能优化0.手动编译TensorFlow支持TensorRT

本文转自:知乎
作者:djh

手动编译TensorFlow支持TensorRT

1.问题阐述

使用 pip 安装的TensorFlow 如下:

pip install tensorflow-gpu==1.12.0

存在问题,导致TRT的某些接口找不到,如下:在NVIDIA社区有人提问。

**** Failed to initialize TensorRT. This is either because the TensorRT installation path is not in LD_LIBRARY_PATH, or because you do not have it installed. If not installed, please go to https://developer.nvidia.com/tensorrt to download and install TensorRT ****
...
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: libnvinfer.so.4: cannot open shared object file: No such file or directory

然后就傻傻的去安装,tensorRT。到官方网站,找到tensorRT的安装。 https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html 然后还是没有用,如果你只是用tensorRT,又会有如下错误:

Traceback (most recent call last):
File "tftrt_sample.py", line 291, in <module>
timings,comp,valfp32,mdstats=timeGraph(getFP32(f.batch_size,wsize),f.batch_size,f.num_loops,
File "tftrt_sample.py", line 111, in getFP32
trt_graph = trt.create_inference_graph(getResnet50(), [ "resnet_v1_50/predictions/Reshape_1"],
AttributeError: module 'tensorrt' has no attribute 'create_inference_graph

后面想一想就知道了,TensorRT 和 TensorFlow里面的TensorRT 应该是两个不同的东西。对吧。

你只需要TensorFlow支持TensorRT就行了。TensorFlow默认是没有编译出TRT的,所以需要自己编译。

2.编译

TensorFlow 的编译如下

https://www.tensorflow.org/install/source

编译都是一切正常,需要注意一下几点,

2.1、在config阶段,需要打开,支持TRT,如下

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.
Please specify the location where TensorRT is installed. [Default is /usr/lib/x86_64-linux-gnu]:/home/jj/TensorRT-5.0

2.2、各个版本要对应,我的版本是

Python3.6
CUDA9.0
cudnn7.3
TensorRT5.0

支持以后就在tensorflow的框架中正常使用trt的接口了。

3、详细编译配置参考附录

./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.23.2 installed.
Please specify the location of python. [Default is /data0/tools/anaconda3/bin/python]:


Found possible Python library paths:
  /data0/tools/anaconda3/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is [/data0/tools/anaconda3/lib/python3.6/site-packages]

Do you wish to build TensorFlow with Apache Ignite support? [Y/n]: y
Apache Ignite support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [Y/n]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 9.0


Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.0


Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.5


Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.0]: /data0/home/djh/cudnn/cudacudnn7.5


Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

Please specify the location where TensorRT is installed. [Default is /usr/lib/x86_64-linux-gnu]:
/data0/home/djh/tensorRT/TensorRT-5.1.2.2/targets/x86_64-linux-gnu

Please specify the NCCL version you want to use. If NCCL 2.2 is not installed, then you can use version 1.3 that can be fetched automatically but it may have worse performance with multiple GPUs. [Default is 2.2]: 2.4


NCCL libraries found in /usr/lib/x86_64-linux-gnu/libnccl.so
This looks like a system path.
Assuming NCCL header path is /usr/include
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]:


Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:


Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:


Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl            # Build with MKL support.
    --config=monolithic     # Config for mostly static monolithic build.
    --config=gdr            # Build with GDR support.
    --config=verbs          # Build with libverbs support.
    --config=ngraph         # Build with Intel nGraph support.
Configuration finished

其他参考文章

https://github.com/yazone/ai\_learning\_path​github.comhttps://github.com/yazone/ai\_learning\_path/tree/master/04.AI%E8%90%BD%E5%9C%B0%E5%AE%9E%E8%B7%B5/04.%E6%80%A7%E8%83%BD%E4%BC%98%E5%8C%96/nvidia%E6%80%A7%E8%83%BD%E4%BC%98%E5%8C%96​github.com

更多嵌入式AI技术相关内容请关注嵌入式AI专栏。
推荐阅读
关注数
18790
内容数
1342
嵌入式端AI,包括AI算法在推理框架Tengine,MNN,NCNN,PaddlePaddle及相关芯片上的实现。欢迎加入微信交流群,微信号:aijishu20(备注:嵌入式)
目录
极术微信服务号
关注极术微信号
实时接收点赞提醒和评论通知
安谋科技学堂公众号
关注安谋科技学堂
实时获取安谋科技及 Arm 教学资源
安谋科技招聘公众号
关注安谋科技招聘
实时获取安谋科技中国职位信息