Cache What You Need to Cache
ACM Transactions on Storage (IF 1.7), Pub Date: 2020-07-07, DOI: 10.1145/3397766
Hua Wang 1 , Jiawei Zhang 1 , Ping Huang 2 , Xinbo Yi 1 , Bin Cheng 3 , Ke Zhou 1
The SSD plays an important role in caching systems due to its high performance-to-cost ratio. Since the cache space is typically smaller than the backend storage by an order of magnitude or more, the write density (defined as writes per unit time and space) of the SSD cache is much higher than that of HDD storage, which poses a serious challenge to the SSD's lifetime. Meanwhile, under social network workloads, quite a lot of writes to the SSD cache are unnecessary. For example, our study of Tencent's photo caching shows that about 61% of all photos are accessed only once, yet they are still swapped in and out of the cache. Therefore, if we can proactively predict such photos and prevent them from entering the cache, we can eliminate unnecessary SSD cache writes and improve cache space utilization. To cope with this challenge, we put forward a "one-time-access criterion" applied to the cache space and further propose a "one-time-access-exclusion" policy. Based on these two techniques, we design a prediction-based classifier to enforce the policy. Unlike state-of-the-art history-based predictions, our prediction is non-history oriented, which makes achieving good prediction accuracy challenging. To address this issue, we integrate a decision tree into the classifier, extract social-related information as classification features, and apply cost-sensitive learning to improve classification precision. With these techniques, we attain a prediction accuracy greater than 80%. Experimental results show that the one-time-access-exclusion approach delivers outstanding cache performance in most respects. Take LRU, for instance: applying our approach improves the hit rate by 4.4%, reduces cache writes by 56.8%, and cuts the average access latency by 5.5%.
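To illustrate the idea of the one-time-access-exclusion policy, the sketch below shows an LRU cache with an admission filter: objects that a predictor flags as likely one-time accesses are served from the backend without ever being written to the cache. This is a minimal illustration, not the authors' implementation; `AdmissionFilteredLRU` and the `predictor` callback are hypothetical names, and the predictor stands in for the paper's decision-tree classifier over social-related features.

```python
from collections import OrderedDict

class AdmissionFilteredLRU:
    """LRU cache with a one-time-access-exclusion admission filter (illustrative)."""

    def __init__(self, capacity, predictor):
        self.capacity = capacity
        self.cache = OrderedDict()      # key -> value, ordered from LRU to MRU
        self.predictor = predictor      # predictor(key, features) -> True if worth caching
        self.hits = self.misses = self.cache_writes = 0

    def get(self, key, fetch_from_backend, features=None):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = fetch_from_backend(key)
        # Admission decision: only objects predicted to be re-accessed enter
        # the cache; predicted one-time objects cause no SSD cache write.
        if self.predictor(key, features):
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the LRU entry
            self.cache[key] = value
            self.cache_writes += 1
        return value
```

For example, with a predictor that rejects keys known to be one-time accesses, fetching such a key leaves the cache untouched (no write), while a normal key is admitted and hits on its second access. This is the mechanism behind the reported reduction in cache writes: filtered objects never consume SSD write cycles or cache space.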
