Optimized hand pose estimation CrossInfoNet-based architecture for embedded devices
Machine Vision and Applications (IF 3.3), Pub Date: 2022-08-13, DOI: 10.1007/s00138-022-01332-8
Marek Šimoník, Michal Krumnikl

We present CrossInfoMobileNet, a convolutional neural network for hand pose estimation that is based on CrossInfoNet and specifically tuned to mobile phone processors through the optimization, modification, and replacement of computationally critical CrossInfoNet components. By introducing the state-of-the-art MobileNetV3 network as a feature extractor and refiner, and by replacing the ReLU activation with the better-performing H-Swish activation function, we obtained a network that requires 2.37 times fewer multiply-add operations and 2.22 times fewer parameters than CrossInfoNet while maintaining the same error on state-of-the-art datasets. This reduction in multiply-add operations resulted in real-world performance that is on average 1.56 times faster on both desktop and mobile devices, making the network more suitable for embedded applications. The full source code of CrossInfoMobileNet, including the sample dataset and its evaluation, is available online through Code Ocean.
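
The abstract names H-Swish as the replacement for ReLU but does not define it. As a minimal sketch, assuming the usual definition from the MobileNetV3 literature, h-swish(x) = x * ReLU6(x + 3) / 6, the two activations can be compared on scalars as follows. The function names here are illustrative and are not taken from the CrossInfoMobileNet source code.

```python
def relu(x: float) -> float:
    """Standard ReLU: passes positive values, zeroes out negatives."""
    return max(x, 0.0)


def relu6(x: float) -> float:
    """ReLU clamped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard-swish, assumed here as x * ReLU6(x + 3) / 6.

    It is a piecewise approximation of swish that avoids the sigmoid,
    which keeps it cheap to evaluate on mobile processors.
    """
    return x * relu6(x + 3.0) / 6.0


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  h_swish={h_swish(x):.4f}")
```

Unlike ReLU, this function is slightly negative for inputs between -3 and 0 and matches the identity for inputs of 3 and above.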




Updated: 2022-08-15