Semi-Supervised Learning for Intrusion Detection in IoT Networks with UNSW_NB15 Dataset

Sindibad Ali Fayyadh Fayyadh

Dr. Ayla Kayabaş

Keywords: Bi-LSTM (Bidirectional Long Short-Term Memory)-CNN (Convolutional Neural Network), UNSW_NB15 dataset, RNN (Recurrent Neural Network)-variational encoder and decoder.


In today's technology environment, attacks and anomalies are commonplace, and because they can cause a variety of problems, it is crucial to detect them as quickly as possible. Anomaly detection plays a central role here: it identifies behavior that deviates fundamentally from the norm and looks for unanticipated events in data streams, which are frequently referred to as anomalous events. The proposed approach therefore employs a Bi-LSTM combined with a CNN, which can identify the presence of anomalies readily and rapidly. The UNSW-NB15 dataset is used for the proposed approach. It was generated at the Cyber Range Lab using the IXIA PerfectStorm tool and includes nine distinct attack types. For the Bi-LSTM with CNN model to function properly, rectified linear weights are also employed, since they supply optimal weights rather than standard weights. Several data pre-processing steps are carried out, including checking for missing values, remote sampling, outlier removal, feature scaling, and encoding of categorical data. The model uses batch normalization, average pooling, and activation functions such as softmax. The proposed model is evaluated using performance metrics such as precision, accuracy, and recall, and is then compared with existing methods. It outperformed them, achieving a higher accuracy rate in detecting and classifying anomalies.
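As an illustration of the evaluation metrics named above, the following is a minimal sketch of how accuracy, per-class precision, and per-class recall can be computed for a multi-class detector. It is not the authors' evaluation code. The function name `classification_metrics` and the toy label lists are hypothetical, and the choice to report 0.0 for a class that is never predicted is an assumption made here for simplicity.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return overall accuracy plus per-class precision and recall."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    true_pos = Counter()    # samples predicted as class c that really are c
    pred_count = Counter()  # samples predicted as class c
    true_count = Counter()  # samples whose real class is c
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            true_pos[t] += 1

    labels = sorted(set(true_count) | set(pred_count))
    accuracy = sum(true_pos.values()) / len(y_true)
    # Assumption: a class with no predictions (or no true samples) scores 0.0.
    precision = {
        c: true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    }
    recall = {
        c: true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    }
    return accuracy, precision, recall


# Hypothetical toy labels; the real dataset has a normal class and nine attack classes.
y_true = ["Normal", "Normal", "DoS", "DoS", "Exploits", "Normal"]
y_pred = ["Normal", "DoS", "DoS", "DoS", "Normal", "Normal"]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy: {accuracy:.3f}")
for label in precision:
    print(f"{label}: precision={precision[label]:.3f} recall={recall[label]:.3f}")
```

Reporting precision and recall per class, rather than accuracy alone, matters for intrusion data because rare attack classes can be missed entirely while overall accuracy still looks high, as the "Exploits" class in this toy example shows.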