Generative Adversarial Network for intrusion detection system

Author: Muhsin Ali

Ali, Muhsin, 2021 Generative Adversarial Network for intrusion detection system, Flinders University, College of Science and Engineering

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

Generative Adversarial Networks (GANs) have been extensively studied in image processing and computer vision (CV). However, their influence on the cyber security field remains an area of open research. Cyber security attacks and incidents on the Internet of Things (IoT) have increased, and many deep learning algorithms are being proposed to detect and mitigate such intrusions. As new methods to prevent attacks are introduced, attackers also improve their techniques. Adversarial attacks have emerged as a prominent technique for fooling network security systems with deceptive input. This research project investigates whether GAN-generated data can deceive a machine learning-based intrusion detection system (IDS). We used the BoT-IoT dataset in our analysis. First, we trained our ML models and established a benchmark; second, we generated fake data with a GAN to deceive the IDS. In the first approach, three machine learning/deep learning-based IDS models, Random Forest (RF), Support Vector Machine (SVM) and Artificial Neural Network (ANN), were trained and gave an accuracy of up to 99% on binary-class training data. After the fake data was generated with the GAN, the three trained models were tested again on the fake data and compared. The results showed that the GAN successfully fooled RF and ANN: their accuracy dropped to 50% and 38.63%, respectively. In contrast, SVM outperformed RF and ANN and proved more robust against the generated data.

In the second approach, a Wasserstein GAN was trained on the training dataset and then tested with test data, for which malicious data was predicted as normal (non-malicious). RF and SVM models were used for the IDS. As in the first approach, RF and SVM achieved an accuracy of 99.9% in binary and multiclass classification on the actual dataset. The evaluation concluded that ML-based IDS can be compromised by using GANs.
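The benchmark-then-attack evaluation summarised above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the data is synthetic (scikit-learn's `make_classification` stands in for the preprocessed BoT-IoT features), the ANN is omitted for brevity, and the hypothetical helper `make_fake_malicious` is a simple placeholder for a trained GAN generator, so the printed numbers say nothing about the reported results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: synthetic stand-in for a binary IDS dataset (1 = malicious, 0 = normal).
X, y = make_classification(n_samples=600, n_features=10, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "SVM": SVC(kernel="rbf"),
}


def make_fake_malicious(X_mal, X_norm, alpha=0.8):
    """Hypothetical placeholder for a GAN generator (NOT a GAN):
    shifts malicious rows toward the mean of the normal class."""
    return (1.0 - alpha) * X_mal + alpha * X_norm.mean(axis=0)


# Fake samples are derived from malicious rows, so their true label stays 1.
X_fake = make_fake_malicious(X_test[y_test == 1], X_train[y_train == 0])
y_fake = np.ones(len(X_fake), dtype=int)

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = {
        # Step 1: benchmark accuracy on held-out real data.
        "benchmark_accuracy": accuracy_score(y_test, model.predict(X_test)),
        # Step 2: accuracy on fake malicious data; a low value means the IDS was fooled.
        "fake_accuracy": accuracy_score(y_fake, model.predict(X_fake)),
    }

for name, scores in results.items():
    print(name, scores)
```

Comparing `benchmark_accuracy` with `fake_accuracy` per model mirrors the comparison described in the abstract; in a real experiment the placeholder would be replaced by samples drawn from a trained GAN or Wasserstein GAN generator.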

Keywords: Generative Adversarial Network, Intrusion Detection System, GAN, IDS

Subject: Computer Science thesis

Thesis type: Masters
Completed: 2021
School: College of Science and Engineering
Supervisor: Dr. Saeed Rehman