Generating scalable and labelled IoT intrusion dataset using Leaky GANs

Bipraneel Roy, Hon Cheung, Chun Ruan

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

Abstract

This paper introduces the Leaky Generative Adversarial Network for Synthetic Dataset Generation (L-GANSDG) framework, which employs the Leaky ReLU activation function with a negative slope (α) of 0.2. This design improves training convergence and prevents the problem of dying neurons. The proposed approach optimizes the generation of a synthetic intrusion dataset that is scalable, reproducible, and composed of diverse attack and benign footprints. Results show that L-GANSDG achieves an optimal Discriminator accuracy of 1.0 and a loss of 0.001 within 100 epochs. The intrusion dataset generated by the proposed formulation, named GenMix, is produced in a tabular format and does not require any physical infrastructure or specialized domain expertise. GenMix has been validated through comprehensive statistical and machine learning efficacy analyses. Statistical evaluation confirms that 93% of the GenMix data points cluster closely around the mean, and 100% of the features maintain acceptable correlations with the training set. Machine learning efficacy evaluation indicates that GenMix closely replicates the performance obtained on the benchmark UNSW_NB15 [1] dataset across several key performance metrics, including accuracy, F1-score, precision, recall, and False Alarm Rate (FAR). In addition, the proposed L-GANSDG significantly reduces resource demands in terms of cost, time, and operational effort, highlighting GenMix's robustness and efficiency. Because the framework eliminates the need for infrastructure deployment or expert knowledge, it ensures scalability and reproducibility.
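As an illustration of the activation named in the abstract, the following is a minimal sketch of Leaky ReLU with α = 0.2 and its gradient, set against plain ReLU. It is not the authors' implementation; the function names (leaky_relu, leaky_relu_grad, relu_grad) are hypothetical, and only the slope value 0.2 comes from the abstract. The point it shows is that the Leaky ReLU gradient stays nonzero for negative inputs, which is the usual explanation for why it avoids dying neurons.

```python
ALPHA = 0.2  # negative slope stated in the abstract


def leaky_relu(x: float, alpha: float = ALPHA) -> float:
    """Return x for positive inputs and alpha * x otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x: float, alpha: float = ALPHA) -> float:
    """Derivative of leaky_relu (taken as alpha at x == 0 by convention)."""
    return 1.0 if x > 0 else alpha


def relu_grad(x: float) -> float:
    """Derivative of plain ReLU, shown for comparison only."""
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    # For negative inputs the ReLU gradient is zero, so the unit stops
    # learning; the Leaky ReLU gradient stays at alpha instead.
    for x in (-2.0, -0.5, 1.5):
        print(x, leaky_relu(x), leaky_relu_grad(x), relu_grad(x))
```

In a deep learning library the same behaviour would normally come from a built-in layer configured with a slope of 0.2, rather than from hand-written functions like these.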
Original language: English
Title of host publication: 2024 IEEE International Conference on Future Machine Learning and Data Science, FMLDS 2024
Subtitle of host publication: 20-23 November 2024, Sydney, Australia
Editors: Adel Al-Jumaily, Md Rafiqul Islam, Syed Mohammad Shamsul Islam, Md Rezaul Bashar
Place of Publication: U.S.
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 179-184
Number of pages: 6
ISBN (Electronic): 9798350391213
ISBN (Print): 9798350391220
DOIs
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Future Machine Learning and Data Science, FMLDS 2024 - Sydney, Australia
Duration: 20 Nov 2024 to 23 Nov 2024

Conference

Conference: 2024 IEEE International Conference on Future Machine Learning and Data Science, FMLDS 2024
Country/Territory: Australia
City: Sydney
Period: 20/11/24 to 23/11/24

Keywords

  • Deep Learning
  • GAN
  • Intrusion Detection
  • IoT
  • Synthetic Intrusion Dataset Generation
