In Depth | A Function-by-Function Guide to the OpenVINO Inference Engine SDK

Date: 2021-01-20  Source: OpenVINO Chinese Community (OpenVINO中文社区)

Introduction

Article URL: //www.cghlg.com/article/202101/422243.htm

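OpenVINO is a toolkit that Intel developed for its own hardware platforms to build high-performance computer vision and deep learning vision applications. It supports Intel CPUs, GPUs, FPGAs, VPUs, and other Intel hardware. OpenVINO consists of two major modules: the Model Optimizer, which converts models, and the Inference Engine, which runs inference. This article explains the commonly used C++ API functions of the Inference Engine and how to use them; the Inference Engine API is also available for C and Python. The OpenVINO version installed by the author is 2020.3.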

Workflow

A typical Inference Engine workflow consists of the following steps:

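1. Create the inference object: the Core object can work with different devices, and it manages all device plugins automatically. Use Core::SetConfig to configure device properties and Core::AddExtension to register third-party device libraries that add custom layer implementations.
2. Read the intermediate representation: use Core::ReadNetwork to read the intermediate representation (IR) files and create a CNNNetwork object. The network resides in host memory.
3. Configure inputs and outputs: CNNNetwork::getInputsInfo and CNNNetwork::getOutputsInfo return the input and output information, through which the precision, data layout, and so on of the input and output layers are set.
4. Load the network: Core::LoadNetwork compiles the network and loads it onto the device, producing an ExecutableNetwork object.
5. Set the input data: use the ExecutableNetwork object to create an InferRequest; host memory can be copied directly into device memory.
6. Run inference: choose either synchronous inference with InferRequest::Infer or asynchronous inference with InferRequest::StartAsync.
7. Retrieve the output: InferRequest::GetBlob reads the inference result.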

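API Details

The sections below go through the API functions mentioned in the workflow above one by one. The function declarations are taken from an installed OpenVINO library; the implementations can also be studied in the OpenVINO source code. For reasons of space, this article focuses on the function interfaces and their usage.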

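Creating the Inference Object

The Core class is the central management class of InferenceEngine. It is responsible for device management, network loading, and related tasks. Its header is openvino/inference_engine/ie_core.hpp, and its implementation is in openvino/inference_engine/src/inference_engine/ie_core.cpp. The Core constructor is declared as follows: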

explicit Core(const std::string& xmlConfigFile = std::string());
// xmlConfigFile: path to the plugin configuration file; if empty, the default configuration is loaded

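An object is usually created without a plugin configuration file, in which case the default plugin libraries are used. The default settings are in openvino/deployment_tools/inference_engine/lib/intel64/plugins.xml and are shown below: name is the name of the supported device type, and location is the name of the library that implements that device.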

<ie>
    <plugins>
        <plugin name="GNA" location="libGNAPlugin.so">
        </plugin>
        <plugin name="HETERO" location="libHeteroPlugin.so">
        </plugin>
        <plugin name="CPU" location="libMKLDNNPlugin.so">
        </plugin>
        <plugin name="MULTI" location="libMultiDevicePlugin.so">
        </plugin>
        <plugin name="GPU" location="libclDNNPlugin.so">
        </plugin>
        <plugin name="MYRIAD" location="libmyriadPlugin.so">
        </plugin>
        <plugin name="HDDL" location="libHDDLPlugin.so">
        </plugin>
        <plugin name="FPGA" location="libdliaPlugin.so">
        </plugin>
    </plugins>
</ie>

A more detailed device configuration file looks like this:

<ie>
    <plugins>
        <plugin name="" location="">
            <extensions>
                <extension location=""/>
            </extensions>
            <properties>
                <property key="" value=""/>
            </properties>
        </plugin>
    </plugins>
</ie>

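All of the default plugin libraries above except FPGA can be built from source, which is located in openvino/blob/master/inference-engine/src/inference_engine. SetConfig configures device properties, which correspond to the property field in the XML above, and AddExtension attaches third-party libraries to a device, which correspond to the extension field in the XML above. SetConfig is declared as follows: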

void SetConfig(const std::map<std::string, std::string>& config, const std::string& deviceName = std::string());
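// config: names and values of the parameters to set
// deviceName: name of the device to configure; optional. If omitted, the configuration is changed for all registered devices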

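SetConfig sets property values for a device. The first parameter holds the device properties and their values, and the second is the device name; if the device name is empty, the properties are set for all devices. The list of available device properties can be found in openvino/inference_engine/include/ie_plugin_config.hpp.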
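
The rule that an empty device name applies a setting to every registered device can be illustrated with a small standalone model. This is only a sketch: ToyRegistry and its member functions are hypothetical names that are not part of the Inference Engine, the key and value strings are placeholders, and the real Core presumably hands the configuration to each plugin rather than storing it in a map.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for per-device configuration; not an Inference Engine type.
using Config = std::map<std::string, std::string>;

class ToyRegistry {
public:
    // Register a device with an empty configuration.
    void registerDevice(const std::string& name) { configs_[name]; }

    // Models the documented SetConfig rule: an empty deviceName applies
    // the key/value pairs to every registered device.
    void setConfig(const Config& config, const std::string& deviceName = std::string()) {
        for (auto& entry : configs_) {
            if (deviceName.empty() || entry.first == deviceName) {
                for (const auto& kv : config) entry.second[kv.first] = kv.second;
            }
        }
    }

    // Returns the stored value, or an empty string when the key is not set.
    std::string get(const std::string& deviceName, const std::string& key) const {
        auto dev = configs_.find(deviceName);
        if (dev == configs_.end()) return std::string();
        auto it = dev->second.find(key);
        return it == dev->second.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, Config> configs_;
};
```

Calling setConfig without a device name updates both a registered "CPU" and a registered "GPU" entry, while passing "GPU" changes only that one.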

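AddExtension attaches a third-party library to a device in order to support layers that OpenVINO does not implement. Extensions are currently not supported for the device names HETERO and MULTI; the second overload takes a device name, so it can be used to check the device name.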

void AddExtension(const IExtensionPtr& extension);
void AddExtension(IExtensionPtr extension, const std::string& deviceName);
// extension: pointer to an already loaded extension
// deviceName: device name

The following example code shows the steps for creating the inference object:

// Create the Core object with the default plugins.xml file
Core ie;

// Set device properties
ie.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, config_file}}, device_name);

// Attach a third-party library to the device to support user-defined layers
IExtensionPtr extension_ptr = make_so_pointer<IExtension>(extension_name);
ie.AddExtension(extension_ptr, "CPU");

// Attach an in-process extension to support user-defined layers
IExtensionPtr inPlaceExtension = std::make_shared<InPlaceExtension>();
ie.AddExtension(inPlaceExtension);

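Reading the Intermediate Representation

The intermediate representation is read with Core::ReadNetwork. The function takes the intermediate representation produced by the Model Optimizer and loads it into host memory. It is declared as follows: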

CNNNetwork ReadNetwork(const std::string& modelPath, const std::string& binPath = "") const;
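// modelPath: path to the IR configuration file
// binPath: path to the IR weights file. If empty, a weights file with the same name as modelPath is tried; if no such file is found, no weights are loaded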
CNNNetwork ReadNetwork(const std::string& model, const Blob::CPtr& weights) const;
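// model: the IR configuration passed as a string (the model description itself rather than a file path)
// weights: shared pointer to a constant Blob holding the weights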

The second overload is rarely used, so only the first is shown here. ReadNetwork creates a CNNNetwork object:

/** Read network model **/
CNNNetwork network = ie.ReadNetwork(modelPath);
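
When binPath is left empty, the weights file is looked up under the same name as the model file. The sketch below shows one way such a name could be derived. It is a guess at the naming rule, not the library's code: defaultWeightsPath is a hypothetical helper, and the ".bin" extension is an assumption based on the usual IR file pair.

```cpp
#include <string>

// Hypothetical helper: derive "<model>.bin" from "<model>.xml".
// Only an extension that follows the last path separator is replaced.
std::string defaultWeightsPath(const std::string& modelPath) {
    const std::size_t slash = modelPath.find_last_of("/\\");
    const std::size_t dot = modelPath.find_last_of('.');
    const bool hasExt = dot != std::string::npos &&
                        (slash == std::string::npos || dot > slash);
    const std::string stem = hasExt ? modelPath.substr(0, dot) : modelPath;
    return stem + ".bin";
}
```

A dot inside a directory name is deliberately ignored, so a path such as "my.models/ssd" keeps its directory part intact.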

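Configuring Inputs and Outputs

This step sets the input and output properties, including the data precision, the layout of the input data (NCHW, NHWC, and so on), and the batch size.

The supported layouts can be looked up in openvino/inference_engine/include/ie_common.h. The layouts currently supported for input and output data include the following: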

NCHW = 1,  //!< NCHW layout for input / output blobs
NHWC = 2,  //!< NHWC layout for input / output blobs
NCDHW = 3,  //!< NCDHW layout for input / output blobs
NDHWC = 4,  //!< NDHWC layout for input / output blobs
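
The difference between NCHW and NHWC is simply the order in which the four indices are flattened into a linear offset. The following sketch makes that explicit. Dims4, offsetNCHW, and offsetNHWC are illustrative names and are not Inference Engine types or functions.

```cpp
#include <cstddef>

// Dimensions of a 4-D tensor; the field names are illustrative.
struct Dims4 {
    std::size_t n, c, h, w;
};

// NCHW: the width index varies fastest, and each channel forms its own plane.
std::size_t offsetNCHW(const Dims4& d, std::size_t n, std::size_t c,
                       std::size_t h, std::size_t w) {
    return ((n * d.c + c) * d.h + h) * d.w + w;
}

// NHWC: the channel index varies fastest, so the channels of one pixel are adjacent.
std::size_t offsetNHWC(const Dims4& d, std::size_t n, std::size_t c,
                       std::size_t h, std::size_t w) {
    return ((n * d.h + h) * d.w + w) * d.c + c;
}
```

For a tensor of shape 2x3x4x5, moving one step along the channel axis advances the offset by 20 elements in NCHW but by a single element in NHWC.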

The precision values can be looked up in openvino/inference_engine/include/ie_precision.hpp. The precisions currently supported are:

enum ePrecision : uint8_t {
    UNSPECIFIED = 255, /**< Unspecified value. Used by default */
    MIXED = 0,         /**< Mixed value. Can be received from network. No applicable for tensors */
    FP32 = 10,         /**< 32bit floating point value */
    FP16 = 11,         /**< 16bit floating point value */
    Q78 = 20,          /**< 16bit specific signed fixed point precision */
    I16 = 30,          /**< 16bit signed integer value */
    U8 = 40,           /**< 8bit unsigned integer value */
    I8 = 50,           /**< 8bit signed integer value */
    U16 = 60,          /**< 16bit unsigned integer value */
    I32 = 70,          /**< 32bit signed integer value */
    I64 = 72,          /**< 64bit signed integer value */
    U64 = 73,          /**< 64bit unsigned integer value */
    BIN = 71,          /**< 1bit integer value */
    BOOL = 41,         /**< 8bit bool type */
    CUSTOM = 80        /**< custom precision has it's own name and size of elements */
};
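
The bit widths given in the comments of this enum determine how much memory a tensor needs. The sketch below copies a few of the values into a local enum and computes a buffer size. ToyPrecision, bitsPerElement, and bytesFor are hypothetical names, and dense packing without padding is an assumption that may not match how the library stores every precision.

```cpp
#include <cstddef>
#include <cstdint>

// Local copy of a few values from the enum above, for illustration only.
enum class ToyPrecision : std::uint8_t {
    FP32 = 10, FP16 = 11, I16 = 30, U8 = 40, I32 = 70, BIN = 71, I64 = 72
};

// Bit width per element, taken from the comments in the enum listing.
std::size_t bitsPerElement(ToyPrecision p) {
    switch (p) {
        case ToyPrecision::FP32: return 32;
        case ToyPrecision::FP16: return 16;
        case ToyPrecision::I16:  return 16;
        case ToyPrecision::U8:   return 8;
        case ToyPrecision::I32:  return 32;
        case ToyPrecision::BIN:  return 1;
        case ToyPrecision::I64:  return 64;
    }
    return 0;  // not reached for the values listed above
}

// Bytes needed for `count` densely packed elements, rounded up to whole bytes.
std::size_t bytesFor(ToyPrecision p, std::size_t count) {
    return (bitsPerElement(p) * count + 7) / 8;
}
```

Under these assumptions a 3x224x224 input takes 150528 bytes as U8 and four times that as FP32, which is one reason image inputs are often fed as U8.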

Example code for configuring the input data properties:

InputsDataMap inputInfo(network.getInputsInfo());

InputInfo::Ptr& input = inputInfo.begin()->second;
auto inputName = inputInfo.begin()->first;

// Set the precision and data layout
input->setPrecision(Precision::U8);
input->getInputData()->setLayout(Layout::NCHW);

// Set the batch size
ICNNNetwork::InputShapes inputShapes = network.getInputShapes();
SizeVector& inSizeVector = inputShapes.begin()->second;
inSizeVector[0] = 1;  // set batch to 1
network.reshape(inputShapes);

Example code for configuring the output data properties:

OutputsDataMap outputInfo(network.getOutputsInfo());
for (auto &output : outputInfo) {
    // Set the precision and data layout
    output.second->setPrecision(Precision::FP32);
    output.second->setLayout(Layout::NCHW);
}

Loading the Network

Loading the network compiles it and loads it onto the device. The function used is Core::LoadNetwork, declared as follows:

ExecutableNetwork LoadNetwork(
const CNNNetwork network, const std::string& deviceName,
const std::map<std::string, std::string>& config = std::map<std::string, std::string>());
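// network: the network created in step 2, reading the intermediate representation
// deviceName: name of the device that runs inference
// config: device configuration properties; optional. The same properties can also be set with SetConfig

The function is used as follows: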

ExecutableNetwork executable_network = ie.LoadNetwork(network, device_name, configure);

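The loading performed by LoadNetwork is implemented in the device plugin, that is, the shared library (.so) listed in plugins.xml. The entry function is CreatePluginEngine.

Setting the Input Data

Before inference runs, the network's input data must be set. GetBlob returns a Blob pointer, and the input data is then copied into device memory. openvino/inference_engine/sample provides a general-purpose copy helper, matU8ToBlob: the image is first resized on the host and then copied into device memory. It is called as follows:

// infer_request is described in the section on running inference below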
Blob::Ptr input = infer_request.GetBlob(input_name);
for (size_t b = 0; b < batch_size; b++) {
    matU8ToBlob<uint8_t>(image, input, b);
}

matU8ToBlob is defined in the header openvino/inference_engine/samples/cpp/common/samples/ocv_common.hpp. Its implementation is as follows:

template <typename T>
void matU8ToBlob(const cv::Mat& orig_image, InferenceEngine::Blob::Ptr& blob, int batchIndex = 0) {
    // orig_image: the original image
    // blob: memory for the input data
    // batchIndex: index within the batch

    // Read the network input dimensions first
    InferenceEngine::SizeVector blobSize = blob->getTensorDesc().getDims();
    const size_t width = blobSize[3];
    const size_t height = blobSize[2];
    const size_t channels = blobSize[1];
    if (static_cast<size_t>(orig_image.channels()) != channels) {
        THROW_IE_EXCEPTION << "The number of channels for net input and image must match";
    }
    T* blob_data = blob->buffer().as<T*>();

    // Resize the original image on the CPU
    cv::Mat resized_image(orig_image);
    if (static_cast<int>(width) != orig_image.size().width ||
            static_cast<int>(height) != orig_image.size().height) {
        cv::resize(orig_image, resized_image, cv::Size(width, height));
    }

    // Offset of this batch item within the buffer
    int batchOffset = batchIndex * width * height * channels;

    // Copy the data from host to device; only 1-channel and 3-channel inputs are supported
    if (channels == 1) {
        for (size_t  h = 0; h < height; h++) {
            for (size_t w = 0; w < width; w++) {
                blob_data[batchOffset + h * width + w] = resized_image.at<uchar>(h, w);
            }
        }
    } else if (channels == 3) {
        for (size_t c = 0; c < channels; c++) {
            for (size_t  h = 0; h < height; h++) {
                for (size_t w = 0; w < width; w++) {
                    blob_data[batchOffset + c * width * height + h * width + w] =
                            resized_image.at<cv::Vec3b>(h, w)[c];
                }
            }
        }
    } else {
        THROW_IE_EXCEPTION << "Unsupported number of channels";
    }
}
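
The core of matU8ToBlob is a conversion from interleaved pixel data (OpenCV stores the channels of each pixel next to each other) to the planar NCHW layout of the blob. The sketch below reproduces only that index arithmetic with standard containers, so it runs without OpenCV or OpenVINO. interleavedToPlanar is a hypothetical name, and the resize step of the original function is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies one interleaved (HWC) 8-bit image into slot `batchIndex` of a planar
// (NCHW) buffer. `dst` must already hold batch * channels * height * width elements.
template <typename T>
void interleavedToPlanar(const std::vector<std::uint8_t>& src, std::vector<T>& dst,
                         std::size_t channels, std::size_t height, std::size_t width,
                         std::size_t batchIndex = 0) {
    // Start of this image inside the batched destination buffer.
    const std::size_t batchOffset = batchIndex * channels * height * width;
    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t h = 0; h < height; ++h) {
            for (std::size_t w = 0; w < width; ++w) {
                // Source: channel index varies fastest. Destination: one plane per channel.
                dst[batchOffset + (c * height + h) * width + w] =
                    static_cast<T>(src[(h * width + w) * channels + c]);
            }
        }
    }
}
```

Unlike the sample, this version keeps the batch offset in std::size_t and handles any channel count, which is a design choice of the sketch rather than a property of the original code.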

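The Blob class is OpenVINO's basic data unit. Its header is openvino/inference_engine/include/ie_blob.h, and network input and output data are passed through Blob objects.

Running Inference

OpenVINO provides two modes for running inference. In synchronous mode, Infer blocks until inference finishes. In asynchronous mode, StartAsync returns immediately, and Wait is then called to wait for completion. For video analysis and video-based object detection, OpenVINO officially recommends the asynchronous mode: it achieves a higher frame rate and better utilization of the inference device, because data copying overlaps with inference on the device and the device spends less time waiting.

A call in synchronous mode looks like this: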

InferRequest infer_request = executable_network.CreateInferRequest();
infer_request.Infer();
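// Retrieve the network output

Asynchronous mode only works correctly inside a loop. Two inference requests must be created, and StartAsync and Wait must be used together: StartAsync submits the request for the next frame, and Wait collects the result of the previous frame.

InferRequest::Ptr async_infer_request_curr = executable_network.CreateInferRequestPtr();
InferRequest::Ptr async_infer_request_next = executable_network.CreateInferRequestPtr();

while (true)
{
    // Fill the next request with the data of the next frame
    frameToBlob(next_frame, async_infer_request_next, imageInputName);
    // Submit the inference request for the next frame
    async_infer_request_next->StartAsync();
    // Collect the inference result of the previous frame
    if (OK == async_infer_request_curr->Wait(IInferRequest::WaitMode::RESULT_READY))
    {
        // Retrieve the network output
    }
    // Swap the two request pointers
    async_infer_request_curr.swap(async_infer_request_next);
}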
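
The control flow of this two-request loop can be traced with a toy model that involves no real concurrency. In the sketch below, ToyRequest and runPingPong are hypothetical names, "inference" just doubles an integer, and wait returning false stands in for a Wait call that does not return OK because nothing was started. The point is only to show that results lag one frame behind submissions and that the last frame needs one more wait after the loop.

```cpp
#include <utility>
#include <vector>

// Toy request whose "inference" doubles the input. Not an Inference Engine type.
class ToyRequest {
public:
    void setInput(int value) { input_ = value; }
    void startAsync() { started_ = true; }
    // Returns false when no inference was started on this request.
    bool wait() {
        if (!started_) return false;
        output_ = input_ * 2;
        started_ = false;
        return true;
    }
    int output() const { return output_; }

private:
    int input_ = 0;
    int output_ = 0;
    bool started_ = false;
};

std::vector<int> runPingPong(const std::vector<int>& frames) {
    ToyRequest a, b;
    ToyRequest* curr = &a;
    ToyRequest* next = &b;
    std::vector<int> results;
    for (int frame : frames) {
        next->setInput(frame);  // fill the next request
        next->startAsync();     // submit it
        if (curr->wait()) {     // collect the previous frame, if one was submitted
            results.push_back(curr->output());
        }
        std::swap(curr, next);  // the submitted request becomes the current one
    }
    // The last submitted frame is still pending when the loop ends.
    if (curr->wait()) results.push_back(curr->output());
    return results;
}
```

In the first iteration nothing is collected, because the current request has not been started yet; from then on each iteration collects the frame submitted in the previous one.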

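Retrieving the Output

To retrieve the output, call GetBlob to obtain the address of the output data and then parse it. The procedure is similar in synchronous and asynchronous mode. Method 1 below obtains the Blob pointer, whose data can then be parsed; method 2 obtains the output values directly.

// Method 1 for retrieving the output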
Blob::Ptr output_blob = infer_request.GetBlob(output_name);
MemoryBlob::CPtr moutput = as<MemoryBlob>(infer_request.GetBlob(output_name));

// Method 2 for retrieving the output
const float *detections = async_infer_request_curr->GetBlob(output_name)->buffer().as<PrecisionTrait<Precision::FP32>::value_type*>();
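
How the float buffer is parsed depends on the model. As an illustration, the sketch below assumes an SSD-style detection output in which each proposal is a row of seven floats (image_id, label, confidence, x_min, y_min, x_max, y_max) with coordinates normalized to [0, 1] and a negative image_id marking the end of the valid rows. That layout is an assumption about the model, not something stated in this article, and Detection and parseDetections are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// One parsed detection; coordinates are in pixels.
struct Detection {
    int label;
    float confidence;
    float xmin, ymin, xmax, ymax;
};

// Parses up to maxProposals rows of 7 floats each (assumed SSD-style layout).
std::vector<Detection> parseDetections(const float* data, std::size_t maxProposals,
                                       float threshold, int imageWidth, int imageHeight) {
    const std::size_t kRowSize = 7;
    std::vector<Detection> out;
    for (std::size_t i = 0; i < maxProposals; ++i) {
        const float* row = data + i * kRowSize;
        if (row[0] < 0) break;             // negative image_id: no more valid rows
        if (row[2] < threshold) continue;  // skip low-confidence proposals
        out.push_back({static_cast<int>(row[1]), row[2],
                       row[3] * imageWidth, row[4] * imageHeight,
                       row[5] * imageWidth, row[6] * imageHeight});
    }
    return out;
}
```

With a pointer like detections above, a call such as parseDetections(detections, maxProposals, 0.5f, width, height) would yield the boxes to draw, where maxProposals, width, and height come from the output dimensions and the source frame.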

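Closing Remarks

This article described the commonly used API functions of the OpenVINO Inference Engine and how to use them. The sample programs in openvino/inference_engine/sample were consulted while writing it.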

References

https://github.com/openvinotoolkit/openvino
https://docs.openvinotoolkit.org/latest/classInferenceEngine_1_1Core.html
https://docs.openvinotoolkit.org/latest/_docs_IE_DG_inference_engine_intro.html
https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html