The evolution of cyber threats has highlighted the limitations of traditional signature-based methods for traffic analysis and intrusion detection, pushing towards the adoption of Machine Learning-based approaches.
This is the background to the new AI4Cyber study, dedicated to the analysis of two variants of the CICIDS2017 dataset, which is widely recognised as a benchmark in the scientific literature. The first is based on CSV files derived from a revised version of the original dataset, while the second requires generating network flows from raw PCAP files using NetFlowMeter, our tool developed to overcome the shortcomings of CICFlowMeter.
The research consisted of two phases: an exploratory phase, conducted with decision trees, which revealed particularly discriminating features, and a second phase devoted to semi-supervised anomaly detection, using an autoencoder trained on normal traffic.
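To make the second phase more concrete, here is a minimal sketch of the decision rule that typically sits on top of an autoencoder trained on normal traffic. It is not the code used in the study. It assumes that the reconstructions have already been produced by some model, and the function names and the 99th-percentile cut-off are illustrative choices.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between a flow's features and its reconstruction."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("feature vectors must be non-empty and equally long")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def percentile(values, q):
    """Percentile q (0..100) of a non-empty list, with linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def fit_threshold(normal_errors, q=99.0):
    """Choose the alert threshold from errors measured on normal traffic only.

    Assumption: q=99 is a hypothetical setting; it accepts that roughly 1% of
    normal validation flows will be flagged, i.e. a built-in false positive rate.
    """
    return percentile(normal_errors, q)


def flag_anomalies(errors, threshold):
    """A flow is anomalous when the model reconstructs it worse than the threshold."""
    return [error > threshold for error in errors]
```

The threshold is where the trade-off with false positives becomes visible: lowering `q` catches more attacks but flags more legitimate flows, and raising it does the opposite.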
The analysis revealed critical issues related to false positives, which may reduce the effectiveness of detection systems. To mitigate this risk, we propose the use of more advanced models, ensemble learning techniques, and integration with rule-based filtering mechanisms. We also reiterate the importance of a rigorous approach to validating datasets and third-party tools.
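As an illustration of how rule-based filtering could sit behind an anomaly detector, the sketch below suppresses alerts for flows that match an allowlist rule. The `Flow` fields, the rule itself, and all values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    # Hypothetical, simplified flow record; real flow exports carry many more fields.
    src_ip: str
    dst_port: int
    score: float  # anomaly score, e.g. an autoencoder's reconstruction error


def is_allowlisted(flow, trusted_sources, benign_ports):
    """Example rule: a trusted host talking to a known benign service port."""
    return flow.src_ip in trusted_sources and flow.dst_port in benign_ports


def triage(flows, threshold, trusted_sources, benign_ports):
    """Return only the flows that exceed the threshold and match no allowlist rule."""
    alerts = []
    for flow in flows:
        if flow.score <= threshold:
            continue  # the model considers this flow normal
        if is_allowlisted(flow, trusted_sources, benign_ports):
            continue  # model alert overridden by the rule layer
        alerts.append(flow)
    return alerts
```

A rule layer of this kind reduces false positives only as far as its rules are trustworthy. An overly broad allowlist would hide real attacks, so such rules need the same validation as the model.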
If you wish to learn more, here is the link to our comprehensive study.
In addition, you can subscribe to the Cyber Studios by Tinexta Defence mailing list to receive updates on upcoming research.


