IBD: The metrics and evaluation method for DNN processor benchmark while doing Inference task
Journal of Intelligent & Fuzzy Systems (IF 1.7), Pub Date: 2021-02-18, DOI: 10.3233/jifs-202552
Wei Min Zhang 1, Long Zhang 2, Zheyu Zhang 1, Mingjun Sun 1

With the many varieties of AI hardware on the market, it is often hard to decide which one is the most suitable to use, not merely which one delivers the best raw performance. As industry-wide demand for deep learning deployment grows, inference benchmarks that measure the effectiveness of DNN processors become important and are of great help in selecting and optimizing AI hardware. To systematically benchmark deep learning deployment platforms and provide a more objective and useful comparison of metrics, this paper presents an end-to-end benchmark evaluation system called IBD, which combines four steps covering three components and six metrics. Performance comparison results are obtained on chipsets from Qualcomm, HiSilicon, and NVIDIA, all of which provide hardware acceleration for AI inference. To comprehensively reflect the current state of DNN processor deployment performance, we chose six devices covering three deployment scenarios (cloud, desktop, and mobile) and ten models with diverse characteristics drawn from three kinds of applications, all trained with three major training frameworks. Several important observations were made using our methodology. Experimental results showed that workload diversity should focus on the differences that come from training frameworks, inference frameworks paired with specific processors, input size, and precision (floating-point and quantized).
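The abstract does not spell out the IBD measurement procedure itself. As a rough illustration of the kind of per-model latency/throughput measurement such an inference benchmark typically performs, the following is a minimal, hypothetical Python sketch; the names `run_benchmark` and `infer_fn` are assumptions for illustration, not the paper's actual implementation or metric definitions.

```python
import time
import statistics
from typing import Callable, Dict, List

import numpy as np


def run_benchmark(infer_fn: Callable[[np.ndarray], np.ndarray],
                  input_shape: tuple,
                  dtype=np.float32,
                  warmup: int = 10,
                  runs: int = 100) -> Dict[str, float]:
    """Measure single-batch inference latency and throughput.

    `infer_fn` wraps whatever inference framework/processor is under test
    (e.g. a GPU engine or a mobile SoC delegate); `dtype` lets the same
    harness cover floating-point and quantized (e.g. uint8) workloads.
    """
    # Random input; scaled so both float and uint8 dtypes get non-trivial data.
    x = (np.random.rand(*input_shape) * 255).astype(dtype)

    # Warm-up runs exclude one-time costs (compilation, allocation)
    # from the timed measurements.
    for _ in range(warmup):
        infer_fn(x)

    latencies: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        infer_fn(x)
        latencies.append((time.perf_counter() - start) * 1000.0)  # milliseconds

    latencies.sort()
    return {
        "mean_ms": statistics.mean(latencies),
        "p50_ms": latencies[len(latencies) // 2],
        "p99_ms": latencies[int(len(latencies) * 0.99) - 1],
        "throughput_fps": 1000.0 / statistics.mean(latencies),
    }
```

For example, passing a lambda that calls an ONNX Runtime or TensorRT session as `infer_fn`, with an input shape such as `(1, 3, 224, 224)`, would time one model on one device; the paper's actual six metrics and four-step procedure may differ.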

Updated: 2021-02-19