Predicting Network Flow Characteristics using Deep Learning and Real-World Network Traffic
IEEE Transactions on Network and Service Management (IF 5.3), Pub Date: 2020-12-01, DOI: 10.1109/tnsm.2020.3025131
Christoph Hardegen, Benedikt Pfulb, Sebastian Rieger, Alexander Gepperth

We present a processing pipeline for flow-based traffic classification using a machine learning component leveraging Deep Neural Networks (DNNs). The system is trained to predict likely characteristics of real-world traffic flows from a campus network ahead of time, e.g., a flow’s throughput or duration. Training and evaluation of DNN models are continuously performed on a flow data stream collected from a university data center. Instead of the common binary classification into “mice” and “elephant” (throughput) or “short-term” and “long-term” (duration) flows, predicted flow characteristics are quantized into three classes. Various communication contexts (subset of network traffic, e.g., only TCP) and flow feature groups (subset of flow features, e.g., only a flow’s 5-tuple), which are supported through an enrichment strategy, are considered and investigated. An in-depth description of the data acquisition process, including preprocessing steps and anonymization used to protect sensitive information, is given. Additionally, we employ an accelerated variant of t-distributed Stochastic Neighbor Embedding (t-SNE) to visualize network traffic data. This enables the understanding of traffic characteristics and relations between communication flows at a glance. Furthermore, possible use-cases and a high-level architecture for flow-based routing scenarios utilizing the developed pipeline are proposed.
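
To make the three-class quantization concrete, here is a minimal Python sketch. The abstract does not say how the class boundaries are chosen, so the use of terciles (two cut points that split the observed values into three roughly equal groups) is an assumption made only for illustration, as are the function names, the class labels, and the sample throughput values.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical class labels; the paper's own class names are not given in the abstract.
LABELS = ["low", "medium", "high"]


def fit_thresholds(values):
    """Return two cut points (terciles) computed from observed flow values.

    Assumption: boundaries are derived from the empirical distribution.
    The paper may define its class boundaries differently.
    """
    return quantiles(values, n=3)


def quantize(value, thresholds):
    """Map a single flow value (e.g. throughput) to a class index 0, 1 or 2."""
    return bisect_right(thresholds, value)


# Made-up throughput samples, purely illustrative.
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9]
thresholds = fit_thresholds(observed)
classes = [LABELS[quantize(v, thresholds)] for v in observed]
```

The same two functions would apply to flow duration; a classifier would then be trained to predict the class index rather than the raw value.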

Updated: 2020-12-01