Cache Bypassing for Machine Learning Algorithms
arXiv - CS - Hardware Architecture Pub Date : 2021-02-13 , DOI: arxiv-2102.06892
Asim Ikram, Muhammad Awais Ali, Mirza Omer Beg

Graphics Processing Units (GPUs) were once used solely for graphical computation tasks, but with the growth of machine learning applications, the use of GPUs for general-purpose computing has increased in recent years. GPUs employ a massive number of threads, which in turn achieve a high degree of parallelism, to perform tasks. Although GPUs have substantial computational power, they suffer from cache contention due to the SIMT execution model they use. One solution to this problem is called "cache bypassing". This paper presents a predictive model that analyzes the memory access patterns of various machine learning algorithms and determines whether certain data should be stored in the cache or not. It presents insights on how well each model performs on different datasets and also shows how minimizing the size of each model affects its performance. The accuracy of most of the models was found to be around 90%, with KNN performing best, though not with the smallest model size. We further expand the feature set by splitting the addresses into chunks of 4 bytes. We observe that this substantially improves the neural network, raising its accuracy to 99.9% with three neurons.
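The address-splitting feature transformation described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the 8-byte address width, the little-endian chunk ordering, and the function name are assumptions; the paper only states that addresses are split into 4-byte chunks before being fed to the models.

```python
def address_to_chunks(addr: int, chunk_bytes: int = 4, addr_bytes: int = 8) -> list[int]:
    """Split a memory address into fixed-size chunks to use as ML features.

    The paper splits addresses into 4-byte chunks; the 8-byte total
    address width and least-significant-chunk-first ordering here are
    illustrative assumptions.
    """
    chunk_bits = chunk_bytes * 8
    mask = (1 << chunk_bits) - 1          # e.g. 0xFFFFFFFF for 4-byte chunks
    n_chunks = addr_bytes // chunk_bytes
    return [(addr >> (i * chunk_bits)) & mask for i in range(n_chunks)]

# A 64-bit address becomes two 32-bit feature values.
features = address_to_chunks(0x7FFE12345678)
# features[0] holds the low 4 bytes, features[1] the high 4 bytes.
```

Turning one wide integer into several smaller ones gives the classifier multiple lower-cardinality inputs, which is one plausible reason the paper reports a large accuracy gain for the neural network after this transformation.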

Updated: 2021-02-16