Modeling Network Traffic with Relational Features

If you tap the wires of a computer network, you will be splashed by a flood of 0's and 1's. Can you make sense of it to see what is going on in the network and who talks to whom? Can you spot cyber-attacks in such data? And can a computer do that? We employ machine-learning and data-mining techniques to this end.

This is a contractual research project conducted with the world's major network technology provider, CISCO. The Intelligent Data Analysis (IDA) lab at the department tries to bridge the gap between the crisp but hard-to-interpret data stream observable in low-level network traffic on the one hand and the intuitive, if vague, concepts of events such as “port scanning” or “DoS attacks” on the other. Our approach lies in identifying characteristic patterns (relational features) that occur in the low-level data and correlate with the high-level events of interest. In the first year of the project, which started in October 2013, we were mainly concerned with splitting the demanding computations (involving some NP-complete procedures) into a part that can be applied in real time as the data stream flows in and a part that has to be computed offline. We designed an effective system that CISCO appreciated enough to file a US patent application on our behalf. In the follow-up, we plan to experiment with advanced machine-learning techniques that adapt deep-learning concepts to a logic-based representation of data and patterns.
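To give a flavor of what a relational feature might look like, here is a minimal sketch in Python. It is an illustration only, not the project's actual feature language or system: the flow-record layout, the sample addresses, the function name `many_ports_feature`, and the threshold `min_ports` are all assumptions made for this example. The feature holds for a pair of hosts when the source contacts several distinct ports on the same destination, a pattern one would expect to correlate with an event such as “port scanning”.

```python
from collections import defaultdict

# Hypothetical flow records: (source host, destination host, destination port).
# The layout and the addresses are made up for illustration.
FLOWS = [
    ("10.0.0.5", "10.0.0.9", 22),
    ("10.0.0.5", "10.0.0.9", 23),
    ("10.0.0.5", "10.0.0.9", 80),
    ("10.0.0.5", "10.0.0.9", 443),
    ("10.0.0.7", "10.0.0.9", 443),
    ("10.0.0.7", "10.0.0.9", 443),
]


def many_ports_feature(flows, min_ports=3):
    """Return the (src, dst) pairs where src contacted at least
    `min_ports` distinct ports on dst.

    In logical terms the pattern reads roughly as: there exist distinct
    ports P1..Pk such that flow(S, D, P1), ..., flow(S, D, Pk) all hold.
    """
    ports_seen = defaultdict(set)
    for src, dst, port in flows:
        ports_seen[(src, dst)].add(port)
    return {pair for pair, ports in ports_seen.items() if len(ports) >= min_ports}


if __name__ == "__main__":
    # Pairs for which the feature holds in the sample data.
    print(many_ports_feature(FLOWS))
```

On the sample data the feature holds only for the first source host, which touches four distinct ports on one destination; the second host repeats a single port and does not match. A learning algorithm would then assess how well such features correlate with labeled high-level events.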

Involved: Filip Železný, Gustav Šourek