HCMonitor: An accurate measurement system for high concurrent network services
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2021-04-27, DOI: 10.1002/cpe.6081
Hui Song 1, Wenli Zhang 1, Ke Liu 1, Yifan Shen 1,2, Mingyu Chen 1,2,3

This article aims to improve the monitoring accuracy of highly concurrent network services. As modern network services grow rapidly in data centers, tail latency has become one of the most crucial factors in user experience, and latency measurement and anomaly detection are essential for evaluating service performance. Existing monitoring tools fall into two categories according to their estimation method. First, sampling-based approaches sample network packets to lighten the measurement load. Second, full-traffic approaches such as wrk analyze all packets from the kernel network stack and thereby fold client-side overhead into the reported response delay. We therefore propose a high-performance monitoring system named HCMonitor, which computes the server-side response latency and the round-trip time of each request. It sustains full-traffic monitoring by running in userspace with “zero copy” and pipelining. Because it works on switch-mirrored traffic, the measured latency excludes the kernel network stack overhead and the client-side queuing delay. The result is improved accuracy, together with online analysis, anomaly detection, real-time display, and transparency to network services. Our evaluations show that HCMonitor achieves over 200 times the throughput of tcpdump. Compared with wrk, tail latency accuracy improves by up to 72%–76% in highly concurrent networks.
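
To make the measured quantities concrete, the following is a minimal sketch of how per-request response latency and a tail percentile could be derived from timestamped, mirrored packets. It is not HCMonitor's implementation: the `(timestamp, flow_id, kind)` record format, the function names `response_latencies` and `percentile`, the first-in-first-out pairing of requests and responses on a flow, and the nearest-rank percentile definition are all assumptions made for illustration.

```python
import math
from typing import Dict, Iterable, List, Tuple

# Hypothetical record: (timestamp in seconds, flow id, "req" or "resp").
Packet = Tuple[float, str, str]


def response_latencies(packets: Iterable[Packet]) -> List[float]:
    """Pair each request with the next response seen on the same flow.

    Assumes packets arrive in timestamp order and that responses on a
    flow come back in the same order as their requests.
    """
    pending: Dict[str, List[float]] = {}
    latencies: List[float] = []
    for ts, flow, kind in packets:
        if kind == "req":
            pending.setdefault(flow, []).append(ts)
        elif kind == "resp" and pending.get(flow):
            # Oldest outstanding request on this flow gets this response.
            latencies.append(ts - pending[flow].pop(0))
    return latencies


def percentile(values: List[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    trace = [
        (0.0, "a", "req"),
        (0.5, "b", "req"),
        (1.0, "a", "resp"),
        (3.5, "b", "resp"),
    ]
    lat = response_latencies(trace)
    print("latencies:", lat)
    print("p99:", percentile(lat, 99))
```

Because the timestamps in this sketch are taken at the observation point rather than inside the load generator, client-side queuing does not enter the computed latency, which is the effect the abstract attributes to measuring on mirrored traffic.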

Updated: 2021-04-27