Rethinking Generalization in American Sign Language Prediction for Edge Devices with Extremely Low Memory Footprint
arXiv - CS - Human-Computer Interaction. Pub Date: 2020-11-27, DOI: arxiv-2011.13741
Aditya Jyoti Paul, Puranjay Mohan, Stuti Sehgal

Rapid growth in available compute over the last few years has driven major advances in artificially intelligent systems that solve diverse real-world problems. A major roadblock to the ubiquitous adoption of these models, however, is their enormous computational complexity and memory footprint. Efficient architectures and training techniques are therefore required for deployment on extremely low-resource inference endpoints. This paper proposes an architecture for detecting letters of the American Sign Language alphabet on an ARM Cortex-M7 microcontroller with just 496 KB of framebuffer RAM. Parameter quantization is a common technique for shrinking models, but it can cause varying drops in test accuracy. This paper proposes using interpolation as augmentation, among other techniques, as an efficient way to reduce this drop; it also helps the model generalize well to previously unseen noisy data. The proposed model is about 185 KB after quantization and runs inference at 20 frames per second.
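To make the quantization trade-off in the abstract concrete, here is a minimal sketch of symmetric 8-bit quantization of a weight vector. It is an illustration only and not the authors' pipeline: the function names `quantize_int8` and `dequantize`, the per-tensor symmetric scheme, and the random stand-in weights are all assumptions. It shows where the 4x storage saving comes from and why rounding introduces a bounded error that can lower test accuracy.

```python
import random


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


random.seed(0)
# Hypothetical stand-in for a layer's weights; not taken from the paper.
weights = [random.gauss(0.0, 0.1) for _ in range(4096)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest integer bounds the per-weight error by scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage estimate: 4 bytes per float32 weight versus 1 byte per int8 weight.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(q)

print(f"float32: {float32_bytes} B, int8: {int8_bytes} B")
print(f"max rounding error: {max_error:.6f} (bound: {scale / 2:.6f})")
```

The per-weight error is small, but it accumulates across layers, which is one reason a quantized model can lose accuracy and why training-time techniques that improve robustness may help narrow that gap.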

Updated: 2020-12-01