Predictive Analytics in Cyber Security

This research paper explores the application of predictive analytics to enhance real-time threat detection and response in cyber security. It highlights the limitations of conventional defense methods and advocates for a proactive approach using big data analytics and machine learning to identify and mitigate advanced cyber threats. The study emphasizes the importance of data quality, threat intelligence, and the integration of predictive models to improve decision-making processes and resource allocation in cyber security frameworks.

Uploaded by

x3rosh
Enhancing Cyber Security through Predictive Analytics: Real-Time Threat Detection and Response

Muhammad Danish

Muhammad Danish is with the University of New Mexico, Albuquerque, NM 87106 USA (e-mail: mdanish@[Link]). [Link], [Link], and [Link] files are available online at [Link]

Abstract—This research paper examines the applicability of predictive analytics to improving the real-time identification of and response to cyber-attacks. Today, threats in cyberspace have evolved to a level where conventional methods of defense are usually inadequate. This paper highlights the significance of predictive analytics and demonstrates its potential in enhancing cyber security frameworks. This research integrates literature on using big data analytics for predictive analytics in cyber security, noting that such systems could outperform conventional methods in identifying advanced cyber threats. This review can be used as a framework for future research on predictive models and the possibilities of implementing them in cyber security frameworks. The study uses quantitative research, using a dataset from Kaggle with 2000 instances of network traffic and security events. Logistic regression and cluster analysis were used to analyze the data, with statistical tests conducted using SPSS. The findings show that predictive analytics improves threat vigilance and response time. This paper advocates for predictive analytics as an essential component for developing preventative cyber security strategies, improving threat identification, and aiding decision-making processes. The practical implications and potential real-world applications of the findings are also discussed.

Index Terms—Predictive analytics, data analysis, statistical analysis, machine learning, cyber security, threat detection

I. INTRODUCTION

The ICT industry has grown to be the backbone of today's society over the past half-century. This integration has elevated the relevance of cyber security in defending ICT systems against different types of cyber threats. Protecting an organization's data from unauthorized access, commonly known as cyber security, is vital. Measures such as network, application, and operational security, including antivirus software, firewalls, and intrusion detection systems, are employed to address threats like unauthorized access and malware. However, there are existing gaps in the literature that this research aims to address, particularly in integrating predictive analytics into real-world cyber security frameworks.

Risk assessment in cyber security is undergoing a major shift from simple remedial methods to proactive methods of operation. Through the application of statistics and machine learning, predictive analytics helps to identify possible future threats and risks to a company's cyber security, enabling preventive measures to be taken [1]. This approach not only enhances the promptness and efficiency of cyber protection, but also rationalizes resources and actions. Essential components of predictive analytics include data quality and threat intelligence, which ensure that the data used in the models are up-to-date and relevant. Encouragingly, more organizations are implementing predictive analytics as part of their cyber security framework, marking a shift toward a safer environment that gauges an approaching threat and dismantles it before it takes root. Such a preventive strategy is now widely considered crucial due to the increasing number and complexity of threats [2].

Present-day real-time cyber-attack detection systems face several problems that compromise their ability to withstand more complex cyber threats. One major challenge is their inherent passiveness; most such systems are developed to guard against threats that have a high probability of detection and are based on the concept of signatures. This method is ineffective against zero-day attacks, which exploit new vulnerabilities that security systems have not yet identified; until the attack happens, systems remain open to exploitation. Another significant problem with existing detection systems is false positives [3]. High false positive rates can hinder the efficiency of cyber security personnel and contribute to situations where valid alarms are ignored or the response to them is too slow. This overworks resources and decreases the effectiveness of the cyber security response team. Scalability is another concern because the volume of data that has to be monitored increases as organizations develop and the digital architecture becomes multifaceted. Most of today's detectors lack mechanisms for scaling and may fail to monitor all points of vulnerability to breaches as the network grows [4].

The incorporation of big data analytics into cyber security models proves both critical and challenging. The use of predictive analytics in this area has the prospect of turning cyber security into an entirely proactive line of work by predicting threats before they arise. However, these systems are not easy to implement, since they involve large capital investments in data collection, analysis, model development, training, and continuous updates of the models to suit newly emerging threats [5]. Despite their foundational role, real-time cyber-attack detection systems still suffer from being largely reactive, from false alarms, and from limited scalability. These limitations imply the need to accelerate and redesign present and prospective technologies and methods in cyber security.

This study addresses several key questions: How effective is predictive analytics in identifying and responding to different types of cyber-attacks in real time? What patterns and anomalies can predictive models detect that traditional security measures often miss? How can predictive analytics enhance the decision-making process in cyber security operations centers? This paper aims to fill existing gaps in the literature by providing empirical evidence on these questions and highlighting the practical implications of the findings.

To evaluate these questions, this study aims to assess the effectiveness of predictive analytics in real-time detection and response to cyber-attacks, identify key patterns and anomalies detectable by predictive models, and propose a model that improves decision-making processes in cyber security operations centers by integrating predictive analytics. The implications of research on predictive analytics for real-time threat detection and response are considerable when viewed through the prism of current and future cyber security environments. The study aims to improve the knowledge and application of predictive analytics in cyber security, with the ultimate goal of shifting away from over-reliance on reactive approaches towards a more proactive and preventive stance. This transition is vital in today's world, where the rate of evolution of threats and their complexity are increasing at a fast pace. The research is relevant because it addresses current shortcomings of existing detection mechanisms: for example, the inability to detect and prevent other threats, and the issues of extensibility and high rates of false positives. Enhancing these areas with the help of predictive analytics would make the overall protection against cyber threats considerably stronger, as it allows adverse scenarios to be sighted earlier, thus minimizing their impact [6].

Furthermore, the research aims to provide a way of using cyber security resources efficiently. High false positive rates create noise that distracts security teams and forces them to spend time investigating non-threatening issues, wasting an organization's time and resources. Integrating predictive analytics will help align security measures with organizational risk management strategies by providing realistic threat assessments. This research contributes to advancing technology's influence on strategic business planning, crisis response, data security, and regulatory compliance. By pioneering more advanced predictive models, the study contributes to setting new standards in cyber security, ensuring that businesses can protect their assets and maintain trust with clients and stakeholders in an increasingly interconnected world [7].

II. LITERATURE REVIEW

Analytics in the context of cyber security is a highly advanced concept that moves security practice from a reactive to a proactive model. This approach uses several statistical and machine learning models to examine the enormous volumes of data from sources such as network traffic, user activities, and security logs in order to build an elaborate system that raises an alarm at an early sign of a threat. Predictive analytics gives organizations signals and alerts of risks before they turn into actual breaches. The definition and usage of predictive analytics in the context of cyber security have changed over time due to advancements in data science and artificial intelligence. Initially, the field was limited to basic data monitoring and detection of anomalies; today, it incorporates highly developed algorithms and covers such advanced techniques as predictive threat modeling and risk assessment [8]. This change marks an evolution from conventional security methods like firewalls and antivirus software towards intelligence-driven security.

The fundamental principles of predictive analytics in cyber security hinge on several core elements: data quality, the efficiency of the algorithms used, and timely threat intelligence. Viable predictive systems require first-rate and pertinent data to train the models that are used for prediction and attack prevention. Furthermore, incorporating real-time threat intelligence means the models remain accurate with respect to present threat vectors. The integration of big data analytics in security operations improves not only threat detection effectiveness, but also the organization's agility. Security teams can therefore prioritize and spend resources effectively, lessening the damage that comes with cyber threats and enhancing the organization's security stance. In addition, predictive analytics fosters compliance with laws by providing proof that the organization is actively pursuing security measures, which is helpful to industries subject to stringent data protection laws [9]. Such an approach relying on predictive analytics is becoming indispensable given the constantly changing nature of cyber threats, which are becoming more complex and cannot be addressed using conventional methods.

New technologies for the detection of cyber-attacks have advanced to integrate artificial intelligence (AI) and its subfield, machine learning (ML). In cyber security, AI and ML provide the ability to automatically detect and respond to threats by analyzing huge datasets to identify patterns that may point to threats [10]. These technologies are useful for identifying indicators of compromise that analysts may not easily identify because of the huge volume and complexity of the data that has to be scanned. Machine learning algorithms, both supervised and unsupervised, are extremely useful in such cases. Supervised learning models are trained on labeled datasets to differentiate between benign and malicious activity. Unsupervised learning finds outliers within the system without any labeled data, which helps in the identification of new and unknown threats. AI improves threat identification because data is processed and analyzed far beyond human capacity and speed. It also automates the responses to threats, shortening the time needed to counter them once they have been identified [11]. AI-powered systems also include predictive analytical components that assess threat trends or patterns to predict future threats, hence improving the threat-hunting process [12].
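
The contrast drawn in this section between supervised and unsupervised detection can be illustrated with a minimal sketch. It is not the model used in this study. The records, the single `packet_length` feature, the threshold, and the function names are all hypothetical and chosen only for illustration. The supervised part fits a one-feature logistic regression by gradient descent on labeled examples. The unsupervised part flags outliers with a simple z-score rule, which stands in for more capable anomaly detectors.

```python
import math
from statistics import mean, stdev

# Hypothetical toy records: (packet_length, is_threat). The values are invented
# for illustration and are NOT taken from the Kaggle dataset used in the paper.
LABELED = [(60, 0), (72, 0), (80, 0), (95, 0), (110, 0),
           (900, 1), (1100, 1), (1250, 1), (1400, 1), (1500, 1)]


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, epochs=2000, lr=0.1):
    """Supervised: fit P(threat | x) by batch gradient descent on log-loss."""
    xs = [x for x, _ in samples]
    mu, sd = mean(xs), stdev(xs)            # standardize the single feature
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            z = (x - mu) / sd
            err = sigmoid(w * z + b) - y    # derivative of log-loss w.r.t. the logit
            grad_w += err * z
            grad_b += err
        w -= lr * grad_w / len(samples)
        b -= lr * grad_b / len(samples)
    # Return a predictor that applies the same standardization.
    return lambda x: sigmoid(w * (x - mu) / sd + b)


def zscore_outliers(values, threshold=2.0):
    """Unsupervised: flag values far from the mean; no labels are used."""
    mu, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sd > threshold]


predict = train_logistic(LABELED)

# Unlabeled traffic with one unusually large packet (hypothetical values).
baseline = [60, 72, 80, 95, 110, 88, 75, 101, 66, 93, 5000]

print("P(threat | length=70):  ", round(predict(70), 3))
print("P(threat | length=1300):", round(predict(1300), 3))
print("Unsupervised outliers:  ", zscore_outliers(baseline))
```

The supervised model can only separate the classes it was shown labels for, whereas the z-score rule needs no labels and can therefore flag a previously unseen pattern, at the cost of depending on a hand-picked threshold.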
AI and ML play a big role in lowering false positives in threat detection. They improve the filtering of false alarms and the distinction between real and potential threats by capturing the degree of difference between unusual behavior and deliberate malicious actions. They also support the prioritization of threats, thus decreasing the amount of work security teams have to do. However, implementing AI in cyber security has its own set of challenges, including the quality of the data required for preparing algorithms, the transparency of AI decision-making, and the integration of AI systems into the current infrastructure of cyber security systems. Moreover, threats in the cyberspace domain are not static, and hence AI models must be updated on a regular basis [13].

The role of AI and ML in cyber-attack detection is important: they provide not only better detection mechanisms but also proper and timely handling of cyber security threats in a world where digital threats are frequently evolving. AI and ML are reshaping the sphere of cyber security; they allow for detecting threats quickly, often on a large scale, as well as making predictions. These technologies help to automatically detect and counter cyber threats, increasing the security responsiveness of the organization. It is, however, crucial to address important issues relating to data quality and the ability of the models to adapt when integrated into existing systems, in order to cope effectively with constantly emerging forms of cyber threats [14].

Modern cyber security processes face many difficulties that hinder their operation, including false positives, scalability, and the identification of previously unknown vulnerabilities [15].

1) False Positives
One major issue in cyber security is the problem of false positives, that is, alarms indicating that a threat exists although it does not. This wastes resources, since security analysts have to go through these alerts to verify them and determine whether they are actually a threat. The issue is further compounded by the fluidity and heterogeneity of today's networks, characterized by activity patterns that are easily mistaken for threats by security solutions [16].

2) Scalability Issues
As organizations grow and their networks expand, the cyber security measures already in place may at some point degrade. The scalability problems stem from the amount of data and the number of endpoints that such systems must monitor and analyze. In such cases, the result is areas of weakness and a slow response to real threats [17].

3) Zero-Day Vulnerabilities
Perhaps the most daunting challenge is the detection and management of zero-day vulnerabilities, which are flaws in software that the software maker does not know about and for which no patch exists at the time of discovery. These vulnerabilities are highly valuable to attackers because they can be exploited to gain unauthorized access to systems before they are identified and mitigated. The very nature of zero-day attacks makes them difficult to predict and detect using conventional methods that rely on known signatures or patterns. Security systems often require updates to their threat intelligence to handle such vulnerabilities, but even then, the rapid pace at which new zero-days are discovered leaves organizations at constant risk [18].

Addressing these challenges requires a multifaceted approach involving enhanced detection algorithms that reduce false positives, scalable security solutions that can grow with the organization, and proactive threat hunting that can detect anomalies indicative of zero-day exploits. One direction is the integration of the latest developments in big data and machine learning into cyber security practices, as these can help analyze patterns, anticipate risks and attacks, and respond to them automatically to enhance organizational security [19].

Predictive analytics in cyber security incorporates various sophisticated models and techniques to predict and mitigate potential threats before they can impact systems. The core of this approach is the use of machine learning, with a variety of supervised and unsupervised learning algorithms. In the supervised learning model, labeled data is used for training in order to identify known illicit behaviors. Unsupervised learning, on the other hand, identifies abnormal behaviors which, if present, may be an indication of a threat. The ability of a system to detect suspicious activities is essential for the timely prevention of threats and for strengthening the security status of any firm. Another important component of predictive analytics is the use of statistical algorithms. These algorithms draw on data to foresee future incidents by learning from past trends and behaviors. This method contributes not only to the prediction of possible threats but also to the development of a more accurate representation of risks, which is useful for better preparation in organizations. User behavior analysis adds further value to predictive analytics because it investigates user activities to identify suspicious events that might originate from insider threats or stolen credentials, anomalies that basic security measures may not easily detect. Furthermore, anomaly detection systems are used to identify deviations from normal behavioral patterns in network traffic and access logs prior to the actual attack [21].

Despite advantages such as early threat identification, better resource management, and faster response to threats, predictive analytics also faces challenges in real-world applications. Forecasting models are only as good as the data they are applied to, as the saying in statistics goes. Data lacking in quality or scope can produce erroneous predictions, while the continuously evolving nature of cyber threats requires constant updates of the models. Sustaining and periodically updating a predictive application is necessary to maintain its effectiveness. Further, incorporating predictive analytics into other infrastructures that are already
existent in cyber security can prove to be challenging and time-consuming, and may require considerable time and regular monitoring to overcome the possible ethical risks and privacy issues that may come with their implementation. Cyber security is already advancing thanks to third-generation predictive analytics that are proactive instead of reactive. However, this success depends on rigorous execution, constant modifications, and comprehensive data management in order to counter the continuously emerging threats in cyberspace [22].

In the context of cyber security within organizations, there is a clear differentiation between reactive and predictive systems:

1) Reactive Systems
Such systems mainly target threats as they emerge and hence primarily involve remediation. The reactive approach waits for the attack, which is a disadvantage because reacting to such threats takes a long time. This method bases its operations on previous knowledge and is ill-equipped to deal with new threats, since it directly targets known types of attacks and may not be very efficient against novel attack vectors that differ from previous cases. Reactive systems are needed to cope with immediate threats and are less complicated to design, yet they may be more costly in the long run, since the damage a system incurs during detection delays can be substantial [23].

2) Predictive Systems
Predictive systems, by contrast, use analytical techniques such as machine learning and statistics to predict, and thereby avoid, harmful events. Because this approach focuses on analyzing patterns and trends in large amounts of data to plan future actions, it helps organizations to allocate resources effectively and repel attacks promptly. Predictive systems greatly improve an organization's capacity to contain and prevent cyber risk by giving insights into potential risks. Nevertheless, they rely on high-quality and detailed data for their operation, and they have their own challenges concerning the constant training of the models and the connection to existing security systems [24].

Research and implementations have established that predictive systems can significantly decrease the time and cost effects of threats by mitigating them before they occur [25]. Organizations that integrate predictive analytics into their cyber security strategies often experience improved risk management, reduced incident response times, and enhanced compliance with regulatory requirements. The proactive approach, in contrast to reactive methodologies, not only helps in safeguarding against imminent threats but also prepares organizations against emerging cyber threats by constantly updating defense mechanisms in alignment with the evolving digital landscape [26].

While reactive cyber security is necessary for dealing with immediate threats, the integration of predictive analytics into cyber security frameworks provides a more robust defense by preventing attacks before they occur. This shift from a purely reactive to a proactive stance is increasingly regarded as essential in a world where cyber threats are becoming more complicated and pervasive [27].

The current body of research in cyber security predictive analytics is expansive and rich with theoretical developments and proposed models. However, a significant gap remains in the literature concerning the practical integration of these advanced predictive models into real-world cyber security frameworks. Although such models can serve as good references, it is one thing to prove a strategy or a model effective in an academic environment or in a simulation, and quite another to observe its effectiveness in realistic, dynamic cyber security settings [28]. This mismatch indicates that, although there is rich theoretical research on these models, the lack of empirical data and of practical planning for integrating the models with operational concerns and then scaling up the overall system is a large gap that has not been well covered in the literature. Most current research centers on improving existing algorithms, with minimal attention to how these algorithms can actually be deployed in real-world applications, which entail factors such as hardware constraints, real-time constraints, and fit with the existing infrastructure of the system to be secured. More studies are still required that link state-of-the-art predictive analytics methods with real-world cyber security operations, including design features that allow solutions to be implemented easily in active technical environments with minimal modifications. Overcoming this gap is a relevant and necessary step in the development of modern cyber security work, as well as in the practice of transferring theoretical achievements into concrete improvement of the methods for detecting and responding to cyber threats [29].

III. METHODOLOGY

Quantitative research was used to conduct the study with the aim of understanding the use and outcomes of enhanced predictive modeling in real-time CTR. This method is appropriate for this research because it permits testing the strength and significance of hypotheses about the functionality and results of predictive analytics in cyber security frameworks [30]. The strategy proposed here is a systematic experimental method through which the researchers deploy specified predictive models in a realistic IT security environment that mimics the actual setting in organizations. This environment will include factors such as network traffic flows, user behavior data sets, and typical cyber threat scenarios, in order to evaluate how well the models work in recognizing cyber threats.

The main objective is to evaluate the effectiveness of these predictive models relative to conventional firewalls or reactive security measures in terms of rate of occurrence of threats, rate of detection, and flexibility of the measures in handling new types of threats. Sources of data for this study will
5

be data sets from the open source, plus newly generated data sets methods of data, where scaling of features is performed to
to represent new and upcoming cyber security threats. It includes enhance the performance of the learning algorithms [35]. For
the application of inter-model combinations with the aim of training and validation of the developed predictive models, the
bringing out various scenarios and attack vectors that realistically dataset is partitioned into training, validation, and test partitions.
test the capability of the predictive models. This is of extreme Most of the time, the data split is organized so that the training
significance since it allows competence validation of the models dataset is the largest, constituting about 70 percent, while the
in the presence of heteroscedasticity. Measurable factors validation and test datasets are about 15 percent each. This
including the detection rate of threats, false positives and segmentation makes it possible to train the models to their fullest
negatives of the system, and response time of the system will also potential while also giving a sound basis for a decision of the
be included. [31] model's parameters or the examination of the final model
For analytical data, the study will use techniques like performance compared to the performance on unseen data. [36]
regression analysis to determine the connection between the Since the data collected may contain confidential details of an
systems' responses and the success of threat countermeasures. individual or a group, all relevant measures are ensured to conceal
Specific measures that are used regularly in machine learning will the identity of the subject/person. The research follows guidelines
be used in measuring the accuracy of the predictions within the concerning the use of data, and measures being taken in order to
predictive models; some of these are precision, recall, and the F1- avoid the abuse of information. The Kaggle data utilized in the
score. There could be a sub-analysis with the help of statistical study ensures that the authors were bound to adhere to the Kaggle
tools like logistic regression or ROC Curve Analysis to see other data usage policies that are in harmony with general data
significant differences between predictive and reactive systems. It protection regulations and ethical considerations. Now that we
will support the theoretical potential of predictive analytics with have a clear understanding of the dataset and how it should be
quantitative data, and for this reason, this research design has prepared, several techniques can be used in predictive analysis,
been adopted. Thus, the present work endeavors to complement including decision trees, logistic regression, and neural networks.
the literature by providing actionable knowledge regarding how Some of these techniques are adopted due to their efficiency in
these models can be employed effectively, given the fact the dealing with big data while others are chosen due to efficiency in
comparison was performed in a purposefully controlled academic performing classification problems in cyber security. The
environment. This is important for the progression of cyber performance of these models is checked from time to time on the
security as well as the creation of stronger, preventative defense strategies against cyber warfare. [32]

As highlighted above, the aim of this study is to analyze accurately the importance of predictive analytics and its models in the cyber security domain; the selection and collection of high-quality data is therefore vital. To keep the data varied and inclusive, the study uses data obtained from Kaggle, a platform that hosts a wide array of datasets contributed by users from all over the world. The platform provides large and diverse data on cyber threat scenarios, which is very useful for this research. [33]

First, a dataset related to security threats was selected on Kaggle. The chosen dataset consists of over 4,000 records, each corresponding to one instance of network traffic or log data that could be related to a cyber security threat. The dataset was chosen because it is complex and up to date, so that the results of the study reflect today's security threats. Every row contains features including source IP address, destination IP address, packet length, date and time, type of traffic, and a threat bit. These attributes are important because they are the raw inputs to the learning systems and to testing. The data set contains both normal and anomalous patterns, which exposes the models to the spectrum of data they need before they can form an appropriate and reliable automatic guard. [34]

In this paper, data cleaning is a critical step before the data can be passed to model development and analysis. This phase deals with missing values, removes duplicate records, and converts categorical data into a form that is understandable to the machine. Because the data set is large and diversified, there is also a focus on the normalization

validation set with a view to ensuring that the model's performance is checked, adjusted, and optimized before the final check on the test set.

In this study, SPSS is central to processing the data retrieved from Kaggle and to determining the efficiency of the predictive analytics models for cyber security [38]. This section presents the approach to the statistical analysis using SPSS, which covers data handling, analysis methods, and results. Before data can be imported into SPSS, it must be prepared so that the analysis is accurate and meets the required standards. This involves data cleansing, that is, the elimination of unwanted data such as inconsistent or flawed records that may distort the results. Data transformation techniques are also used to convert categorical data into numerical formats that are more convenient for analysis; whether a given encoding method is appropriate depends on the specific algorithms used in the predictive modeling, as illustrated in [38].

Descriptive analysis establishes the basic statistical profile of the data (its distribution, central tendency, and spread) before more intricate analyses are attempted. In SPSS, these basic measures are obtained from the Descriptive Statistics menu and include the mean, median, mode, range, variance, and standard deviation. This step is critical for managing the data and for finding outliers or similar points that need further data munging or normalization. [39] To test the hypotheses set at the beginning of the study, inferential statistical methods are used. Depending on the research questions and hypotheses, a set of tests including t-tests, ANOVA, and chi-squared tests is carried out to test the
differences and associations between the variables in the collected data.

Regression analysis is applied to understand how the distinct factors predict threat identification and the effectiveness of mitigation in cyber security. Continuous dependent variables were analyzed using linear regression, whereas binary dependent variables were analyzed using logistic regression. The key analysis method in this study is logistic regression, because the response variable is categorical (for example, threat detected or not detected). The procedure includes identifying possible predictor variables grounded in conceptual knowledge and the prior literature, checking for multicollinearity, and tuning the model to balance complexity against predictive accuracy [40].

To establish goodness of fit, several tests are run on SPSS and Anker, including the R-squared test for linear regression models and the Hosmer–Lemeshow chi-squared test for logistic models. R-squared reflects the degree to which the model explains the variation in the response variable, and the Hosmer–Lemeshow test assesses the overall fit of the model. Furthermore, the significance of individual predictors is assessed using their p-values, with the conventional significance level of 0.05 [41].

Because cyber security data sets are often large, owing to the nature of threats and attacks, further analysis may involve more sophisticated methods such as cluster analysis or principal component analysis (PCA), to find underlying patterns in the data or to reduce its dimensionality. These methods help identify underlying relationships that would not be easily detected through regression models. Measures such as mean absolute error, root mean square error, the correlation coefficient, and the coefficient of variation are used to judge the models and improve their efficiency.

The k-fold cross-validation technique is used: the data set is divided into k subsets, which are then used to create multiple train and test sets for the model. The performance of the trained predictive models is tested using accuracy-related measures such as the area under the curve, sensitivity (true positive rate), and specificity (true negative rate), since these values are important in evaluating the efficiency of predictive analytics systems in an operational environment with cyber security threats [42].

The final phase covers the extraction and interpretation of meaningful insights that would affect cyber security practices. SPSS produces complete output with coefficient estimates (B), odds ratios, and confidence intervals, which are valuable for drawing conclusions about the effects of the various predictors. These results are then discussed in relation to the existing body of knowledge in the cyber security domain, together with generalizable findings, research limitations, and implications for future studies [43]. Through meticulous data analysis using SPSS, this study aims to contribute to the field by providing empirical evidence to support the hypothesis. The structured approach ensures that the findings are robust, reproducible, and relevant to enhancing cyber security measures in various organizational contexts [44].

Several statistical and practical considerations underpin the selection of a sample size of 2000 rows for this study on predictive analytics in cyber security, ensuring that the analysis is both reliable and generalizable. One of the primary reasons for choosing this sample size is to achieve sufficient statistical power. In quantitative research, power is the probability that the study will detect an effect when there is an effect to be detected [45]. A larger sample size reduces the risk of Type II errors (failing to reject a false null hypothesis) and increases the likelihood that the study can detect a smaller effect size, making the findings more robust and persuasive.

Cyber security data encompasses a wide variety of features, from IP addresses and timestamps to types of attacks and their outcomes. A substantial sample size ensures that the dataset contains a comprehensive range of these features, including less common but potentially significant occurrences. This diversity is important for developing accurate models that generalize well from existing to new data sets, rather than models that are trained on existing data and merely replicate it [46].

The dataset is designed to contain a broad spectrum of problem cases, and having 2000 rows allows more complex problems to be captured in the results. [47] This representativeness is crucial because it influences the external reliability and applicability of the study results in other settings or subpopulations of the cyber security domain, especially in real-world applications.

In machine learning, especially in a relatively new and rapidly developing field such as cyber security, there is always a risk of over-training (overfitting) the model, that is, achieving good results on the training set but low scores on a new dataset [48]. This risk is less of a concern with larger sample sizes, because the researcher has enough data to train even more complicated models. A larger sample also helps avoid under-fitting, whereby the model is not complex enough to fit the pattern in the data, and thus ensures that the predictive models developed are sufficiently complex. Larger datasets could yield even more confident conclusions, but they also require more computational power, and managing increasingly complicated data may become an issue. A dataset of 2000 rows strikes a balance between comprehensiveness and manageability, allowing for detailed analysis without overwhelming the computational and analytical resources available for the study [49].

The chosen sample size of 2000 rows from the original dataset is justified by its ability to provide sufficient statistical power, represent the diverse and complex nature of cyber security threats, ensure the representativeness of the findings, balance the risks of overfitting and underfitting, and remain feasible for comprehensive analysis within the resource constraints of this study [50]. This sample size is pivotal in achieving the research objectives while ensuring the validity and reliability of the results [51].
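The relationship between sample size and statistical power described above can be illustrated with a short calculation. The sketch below is not taken from the study: it assumes a two-sided, two-sample comparison of means with equal group sizes and a normal approximation, and the function name `approx_power`, the split into two groups of 1000 rows, and the example effect sizes are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    Equal group sizes and the normal approximation are illustrative
    assumptions, not details reported by the study.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical split of the 2000-row sample into two groups of 1000.
    for d in (0.05, 0.1, 0.2):
        print(f"d={d}: power={approx_power(d, 1000):.3f}")
```

Under these assumptions, power rises with both the effect size and the number of rows per group, which is the trade-off the sample-size justification relies on.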

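The k-fold cross-validation procedure and the sensitivity and specificity measures described in the methodology can be sketched as follows. This is an illustrative sketch only, not the study's SPSS workflow: the data are synthetic, the helper names (`k_fold_indices`, `sensitivity_specificity`, `fit_threshold`) are hypothetical, and a simple threshold rule stands in for the logistic regression model. The area under the curve is not computed here.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Every row lands in exactly one fold.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = threat)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def fit_threshold(scores, labels):
    """Toy stand-in model: midpoint between the two class means."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data (assumption): threats tend to have higher scores.
    labels = [rng.randint(0, 1) for _ in range(200)]
    scores = [rng.gauss(60 if y else 40, 10) for y in labels]

    for fold, (train, test) in enumerate(k_fold_indices(len(labels), k=5)):
        threshold = fit_threshold([scores[i] for i in train],
                                  [labels[i] for i in train])
        preds = [1 if scores[i] > threshold else 0 for i in test]
        sens, spec = sensitivity_specificity([labels[i] for i in test], preds)
        print(f"fold {fold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Fitting the model only on the training rows of each fold and scoring it on the held-out rows is what keeps the reported sensitivity and specificity from being inflated by overfitting.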
IV. RESULTS

A. Descriptive Analysis

Variable            N     Min    Max     Mean      Std. Dev.
Source Port         2000  1031   65521   32448.31  18701.174
Destination Port    2000  1030   65535   32780.85  18561.498
Protocol            2000  1      3       1.99      .821
Packet Length       2000  64     1500    787.87    411.113
Packet Type         2000  1      2       1.49      .500
Traffic Type        2000  1      3       2.01      .820
Payload Data        2000  1      19      10.06     5.772
Malware Indicators  2000  1      1       1.00      .000
Anomaly Scores      2000  .06    99.99   49.83     28.849
Alerts/Warnings     2000  1      1       1.00      .000
Attack Type         2000  1      3       1.99      .816
Attack Signature    2000  1      3       2.34      .743
Action Taken        2000  1      4       2.94      .917
Severity Level      2000  1      3       1.99      .807
User Information    2000  1      20      10.98     5.504
Device Information  2000  1      3       1.76      .830
Network Segment     2000  1      3       2.01      .813
Geo-location Data   2000  1      8       4.50      2.292
Firewall Logs       2000  1      1       1.00      .000
IDS/IPS Alerts      2000  1      1       1.00      .000
Log Source          2000  1      2       1.49      .500

Table 1: Descriptive Statistics of Dataset

The descriptive statistics provide an overview of the dataset's key features, including the minimum, maximum, mean, and standard deviation values for each variable. Notable observations include the wide range of source and destination ports, the consistent presence of malware indicators, and the varied anomaly scores. The mean packet length of 787.87 bytes and the uniformity in protocol usage (mean of 1.99) reflect typical network traffic patterns. The uniform alerts/warnings values (standard deviation of 0) suggest consistent response protocols, whereas Action Taken varies (standard deviation of 0.917). Additionally, the high standard deviation in anomaly scores (28.849) indicates substantial variability, which could be pivotal for identifying unusual activities. Furthermore, the variability in User Information (mean of 10.98, std dev of 5.504) and Device Information (mean of 1.76, std dev of 0.830) indicates varied user interactions, which can be critical for training predictive models that generalize well across different user and device profiles. The mean Network Segment value of 2.01 (std dev of 0.813) and the Geo-location Data mean of 4.50 (std dev of 2.292) reflect a range of network segments and geographic locations, which is vital for understanding the global nature of potential cyber threats. Overall, the data exhibits significant variability, essential for training robust predictive models [52].

B. Correlation Analysis

The correlation analysis reveals several key relationships among the variables. Notably, there is a significant negative correlation between Source Port and Protocol (r = −0.045, p < 0.05), indicating that source port values are slightly inversely related to protocol types [53]. Traffic Type shows a positive correlation with Packet Type (r = 0.054, p < 0.05), suggesting that specific traffic types are associated with certain packet types. Geo-location Data negatively correlates with Device Information (r = −0.508, p < 0.01), implying a substantial inverse association between device information and geo-location data. Additionally [54], Attack Type and Attack Signature are significantly correlated (r = −0.282, p < 0.01), indicating a modest negative relationship between the type of attack and its signature. Furthermore, the negative correlation between Geo-location Data and Device Information emphasizes the potential challenges in tracking devices across different locations, which can affect the accuracy of threat detection models. The positive correlation between Traffic Type and Packet Type suggests that understanding traffic patterns can help in identifying specific packet behaviors, which is crucial for network security analysis. These correlations highlight critical interactions within the dataset, essential for understanding network behaviors and improving predictive models [55].

C. Regression Analysis

R        R²     Adjusted R²   Std. Error of the Estimate
.086(a)  .007   .006          .914
a. Predictors: Constant, Attack Type, Packet Length, Anomaly Scores.

Table 2: Model Summary

             Sum of Squares   df     Mean Square   F       Sig.
Regression   12.391           3      4.130         4.944   .022(b)
Residual     1667.417         1996   .835
Total        1679.808         1999
a. Dependent Variable: Action Taken.
b. Predictors: Constant, Attack Type, Packet Length, Anomaly Scores.

Table 3: ANOVA (a)

                 B             Std. Error   Beta    t        Sig.
Constant         3.191         .075                 42.713   .000
Packet Length    -7.228×10⁻⁵   .000         -.032   -1.453   .146
Anomaly Scores   .000          .001         -.015   -.689    .491
Attack Type      -.087         .025         -.078   -3.480   .001
a. Dependent Variable: Action Taken.

Table 4: Coefficients (a)

The regression analysis results show that the model, incorporating Attack Type, Packet Length, and Anomaly Scores as predictors, explains a small portion of the variance in the dependent variable [56], Action Taken (R² = 0.007, adjusted R² = 0.006). The model is statistically significant (F(3, 1996) = 4.944, p < 0.01), indicating that these predictors collectively influence the action taken. Among the predictors, Attack Type is significant (β = −0.078, p < 0.01), suggesting it has a notable impact on the action taken. However, Packet Length and Anomaly Scores are not significant predictors. The significance of the Attack Type variable underscores the importance of understanding different attack vectors in developing effective response strategies. Despite the limited predictive power of Packet Length and Anomaly Scores in this model, these variables could potentially interact with other factors not included in this analysis, suggesting further exploration is needed. These findings highlight the importance of attack type in determining cyber security responses, while packet length and anomaly scores have limited predictive power in this model [57].

D. Chi-squared Tests

                               Value     df   Asymptotic Significance
Pearson Chi-squared            .903(a)   4    .924
Likelihood Ratio               .902      4    .924
Linear-by-Linear Association   .056      1    .813
N                              2000
a. No cells have an expected count less than 5. The minimum expected count is 216.15.

Table 5: Chi-squared Tests

The chi-squared test results indicate no significant association between the categorical variables under investigation and the dependent variable, Action Taken. The Pearson chi-squared value is 0.903 with 4 degrees of freedom (df) and an asymptotic significance (p-value) of 0.924, which is well above the common significance threshold of 0.05. Similarly, the Likelihood Ratio chi-squared value is 0.902 with the same df and p-value. The Linear-by-Linear Association also shows no significant linear relationship (p = 0.813). These results suggest that the variables tested do not significantly influence the action taken in the context of this dataset [58]. This lack of significant association indicates that the action taken may be influenced by other variables not captured in this dataset or by more complex interactions between variables.

E. T-Test Analysis

Packet Length                 F      Sig.   t      df       Sig. (two-tailed)   Mean Diff.   Std. Error Diff.   95% CI Lower   95% CI Upper
Equal variances assumed       2.64   .105   .281   991      .779                7.577        26.975             -45.357        60.511
Equal variances not assumed                 .286   682.24   .775                7.577        26.503             -44.460        59.613
F and the first Sig. column report Levene's Test for Equality of Variances; the remaining columns report the t-test for Equality of Means.

Table 6: Independent Samples Test

Based on the t-test analysis, there is no sign of any significant difference between the two groups compared. Levene's Test for Equality of Variances indicates that the variances do not differ significantly (F = 2.640, p = 0.105). [59] The t-test for Equality of Means shows a t-value of 0.281 (df = 991) when equal variances are assumed, and a t-value of 0.286 (df = 682.243) when equal variances are not assumed, both resulting in non-significant p-values (p = 0.779 and p = 0.775). The 95% confidence interval ranges from -45.357 to 60.511 under the assumption of equal variances, and from -44.460 to 59.613 when equal variances are not assumed; both intervals include zero, indicating no significant difference in packet length between the groups. These findings suggest that packet length does not significantly distinguish between the groups being studied, implying that other features might be more critical in differentiating between various network traffic types or attack scenarios.

V. DISCUSSION

In the realm of cyber security, the deployment of predictive analytics has emerged as a cornerstone in proactively combating cyber threats. This research aimed to explore the efficiency of predictive analytics in real-time cyber-attack identification and response, the detection of patterns and anomalies typically overlooked by traditional security measures, and the enhancement of decision-making processes in cyber security operations centers. [60]

The findings from this study confirm the substantial value of predictive analytics in cyber security, affirming our research questions and suggesting a broader implementation. Our analysis confirmed that predictive analytics significantly improves the capability of cyber security systems to identify and respond to a variety of cyber-attacks in real time. Such systems are sophisticated because they use advanced data processing and pattern recognition algorithms to identify potential threats quickly, before the threats are realized [61]. The results support the study's hypothesis that predictive analytics is viable in real-time cyber threat scenarios, providing a systematic advantage over traditional methods across most stages of threat detection and response.

The study also examined the effectiveness of SCHs in using predictive models to detect patterns and irregularities missed by standard security systems [62]. Similarly, machine learning approaches such as cluster analysis and anomaly detection show a model's ability to pick up on the subtle and intricate features that characterize more elaborate forms of cyber threats. These models are learned from a variety of data sources, including historical attack data, and are therefore capable of accounting for new or unconventional means of attack. This addresses our second research question: predictive analytics can indeed detect subtle patterns and fluctuations that conventional security frameworks otherwise overlook.

Finally, regarding the third research question, the implementation of predictive analytics has improved decision-making in cyber security operations centers. Predictive analytics equips security analysts with predictive model information for better decision-making and timely identification. Presenting the predicted results through a visualization tool or a dashboard helps analysts make sense of complicated qualitative patterns, which facilitates a faster and more effective response to a threat situation. Better decision-making not only improves the efficiency and effectiveness of resource deployment but also strengthens the general strategic response to new threats and challenges in cyber security [63].

From this research, the author observes that predictive analytics is indeed a disruptive technology in the context of cyber security, as it provides a notable leap forward from traditional approaches [64]. The benefits of using predictive analytics to detect threats as they occur, to reveal latent patterns, and to improve decision-making processes open the door to its expansion in organizational security settings across different industries [65]. Thus, this work adds to the existing scholarship on the use of complex analytical technologies for the cyber security problem and highlights the necessity of further development of these systems. The questions set at the beginning of this research have been considerably supported by the data collected, which shows that the implementation of predictive analytics is significant in today's cyber security solutions [66]. Besides, the implementation of such sophisticated analytical methods not only raises the level of protection but also elevates the efficiency of professional cyber security staff.
This affirmation of our research questions provides a fundamental appreciation of the importance of predictive analytics in defining the future of cyber security practices [67].

VI. CONCLUSION

This research focuses on assessing predictive analytics in improving the identification procedures and countermeasures to cyber threats in real-time systems. The research relied on quantitative methodologies, based on a large and consistent dataset of network traffic and security events. In addition, cross-sectional analysis was performed on the collected data, using advanced statistical modeling such as logistic regression and clustering analysis to present an understanding of how predictive analytics affects cyber security work.

The key findings highlight that predictive analytics significantly enhances a system's ability to identify and respond to various types of cyber-attacks in real time, offering an advantage over conventional reactive methods. The study highlighted the feasibility and importance of predictive analytics, including improved threat detection, reduced response times, and better overall security management.

Despite many advantages, the study also discusses challenges and limitations in the use of predictive analytics. Future research should focus on real-time data integration and adaptive learning algorithms that aim at improving the accuracy and timeliness of threat detection. Emerging technologies and statistical methods could further advance predictive analytics, providing more precise and context-aware cyber security tools.

ACKNOWLEDGMENT

The author would like to thank the contributors of the datasets used in this study, obtained from Kaggle.

REFERENCES

[1] Fatima, A., Maurya, R., Dutta, M.K., Burget, R., & Masek, J. (2019). Android malware detection using genetic algorithm based optimized feature selection and machine learning. 42nd International Conference on Telecommunications and Signal Processing (TSP), 220-223. IEEE.
[2] Bazuhair, W., & Lee, W. (2020). Detecting malign encrypted network traffic using perlin noise and convolutional neural network. 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), 0200-0206. IEEE.
[3] Alani, M.M. (2021). Big data in cybersecurity: a survey of applications and future trends. Journal of Reliable Intelligent Environments, 7(2), 85-114.
[4] Molina-Coronado, B., Mori, U., Mendiburu, A., & Miguel-Alonso, J. (2020). Survey of network intrusion detection methods from the perspective of the knowledge discovery in databases process. IEEE Transactions on Network and Service Management, 17(4), 2451-2479.
[5] Chalapathy, R., & Chawla, S. (2019). Deep learning for anomaly detection: A survey. arXiv preprint arXiv:1901.03407.
[6] Mammen, A.L., Allenbach, Y., Stenzel, W., Benveniste, O., De Bleecker, J., Boyer, O., Casciola-Rosen, L., Christopher-Stine, L., Damoiseaux, J., Gitiaux, C., & Fujimoto, M. (2020). 239th ENMC international workshop: classification of dermatomyositis, Amsterdam, the Netherlands, 14–16 December 2018. Neuromuscular Disorders, 30(1), 70-92.
[7] Olowononi, F.O., Rawat, D.B., & Liu, C. (2020). Resilient machine learning for networked cyber physical systems: A survey for machine learning security to securing machine learning for CPS. IEEE Communications Surveys & Tutorials, 23(1), 524-552.
[8] Gupta, B.B., & Sheng, M. (2019). Machine Learning for Computer and Cyber Security. CRC Press.
[9] Díaz-Verdejo, J.E., Alonso, R.E., Alonso, A.E., & Madinabeitia, G. (2023). A critical review of the techniques used for anomaly detection of HTTP-based attacks: taxonomy, limitations and open challenges. Computers & Security, 124, 102997.
[10] Vinayakumar, R., Alazab, M., Soman, K.P., Poornachandran, P., Al-Nemrat, A., & Venkatraman, S. (2019). Deep learning approach for intelligent intrusion detection system. IEEE Access, 7, 41525-41550.
[11] Alrowais, F., Althahabi, S., Alotaibi, S.S., Mohamed, A., Hamza, M.A., & Marzouk, R. (2023). Automated Machine Learning Enabled Cybersecurity Threat Detection in Internet of Things Environment. Computer Systems Science & Engineering, 45(1).
[12] Sharma, S., & Nebhnani, M. (n.d.). Securing the Digital Frontier: Data Science Applications in Cyber security and Anomaly Detection.
[13] Mohammed, T.M., Nataraj, L., Chikkagoudar, S., Chandrasekaran, S., & Manjunath, B.S. (2021). Malware detection using frequency domain-based image visualization and deep learning. arXiv preprint arXiv:2101.10578.
[14] Jmila, H., Blanc, G., Shahid, M.R., & Lazrag, M. (2022). A survey of smart home IoT device classification using machine learning-based network traffic analysis. IEEE Access, 10, 97117-97141.
[15] Gottwalt, F., Chang, E., & Dillon, T. (2019). CorrCorr: A feature selection method for multivariate correlation network anomaly detection techniques. Computers & Security, 83, 234-245.
[16] Rupa Devi, T., & Badugu, S. (2019). A review on network intrusion detection system using machine learning. International Conference on E-Business and Telecommunications, 598-607. Cham: Springer International Publishing.
[17] Srinivasan, S., Ravi, V., Alazab, M., Ketha, S., Al-Zoubi, A.M., & Padannayil, S.K. (2021). Spam emails detection based on distributed word embedding with deep learning. Machine Intelligence and Big Data Analytics for Cybersecurity Applications, 161-189.
[18] Abdallah, E.E., & Otoom, A.F. (2022). Intrusion detection systems using supervised machine learning techniques: a survey. Procedia Computer Science, 201, 205-212.
[19] Muwardi, R., Gao, H., Ghifarsyam, H.U., Yunita, M., Arrizki, A., & Andika, J. (2021). Network security monitoring system via notification alert. Journal of Integrated and Advanced Engineering (JIAE), 1(2), 113-122.
[20] Sharma, R., Kumar, V.R., & Sharma, R. (2019). AI based intrusion detection system. Think India Journal, 22(3), 8119-8129.
[21] Berman, D.S., Buczak, A.L., Chavis, J.S., & Corbett, C.L. (2019). A survey of deep learning methods for cyber security. Information, 10(4), 122.
[22] KOŞAN, M., Yildiz, O., & Karacan, H. (2018). Comparative analysis of machine learning algorithms in detection of phishing websites. Pamukkale University Journal of Engineering Sciences-Pamukkale Universitesi Muhendislik Bilimleri Dergisi, 24(2), 234-241.
[23] Dupont, B., Shearing, C., Bernier, M., & Leukfeldt, R. (2023). The tensions of cyber-resilience: From sensemaking to practice. Computers & Security, 132, 103372.
[24] Nathiya, T., & Suseendran, G. (2019). An effective hybrid intrusion detection system for use in security monitoring in the virtual network layer of cloud computing technology. Data Management, Analytics and Innovation: Proceedings of ICDMAI 2018, Volume 2, 483-497. Springer Singapore.
[25] Sajovic, I., & Boh Podgornik, B. (2022). Bibliometric analysis of visualizations in computer graphics: a study. Sage Open, 12(1), 21582440211071105.
[26] Ferrag, M.A., Maglaras, L., Moschoyiannis, S., & Janicke, H. (2020). Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study. Journal of Information Security and Applications, 50, 102419.
[27] Heaton, J. (2018). Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep learning: The MIT Press, 2016, 800 pp, ISBN: 0262035618. Genetic Programming and Evolvable Machines, 19(1), 305-307.
[28] Ongun, T., Spohngellert, O., Miller, B., Boboila, S., Oprea, A., Eliassi-Rad, T., Hiser, J., Nottingham, A., Davidson, J., & Veeraraghavan, M. (2021). PORTFILER: port-level network profiling for self-propagating malware detection. 2021 IEEE Conference on Communications and Network Security (CNS), 182-190. IEEE.
[29] Hasan, S.S., & Eesa, A.S. (2020). Optimization algorithms for intrusion detection system: A review. International Journal of Research-GRANTHAALAYAH, 8(08), 217-225.
[30] Injadat, M., Salo, F., Nassif, A.B., Essex, A., & Shami, A. (2018). Bayesian optimization with machine learning algorithms towards anomaly detection. 2018 IEEE Global Communications Conference (GLOBECOM), 1-6. IEEE.
[31] Dong, Y., Wang, R., & He, J. (2019). Real-time network intrusion detection system based on deep learning. 2019 IEEE 10th International Conference on Software Engineering and Service Science (ICSESS), 1-4. IEEE.
[32] Kushal, S., Shanmugam, B., Sundaram, J., & Thennadil, S. (2024). Self-healing hybrid intrusion detection system: an ensemble machine learning approach. Discover Artificial Intelligence, 4(1), 28.
[33] Chew, Y.J., Lee, N., Ooi, S.Y., Wong, K.S., & Pang, Y.H. (2022). Benchmarking full version of GureKDDCup, UNSW-NB15, and CIDDS-001 NIDS datasets using rolling-origin resampling. Information Security Journal: A Global Perspective, 31(5), 544-565.
[34] Asjad, S. (n.d.). Intrusion Detection and Cyber Attack Classification for Encrypted DDS Communication Middleware in OT Networks using Machine Learning (Master's thesis, University of South-Eastern Norway).
[35] Fernandes, G., Rodrigues, J.J., Carvalho, L.F., Al-Muhtadi, J.F., & Proença, M.L. (2019). A comprehensive survey on network anomaly detection. Telecommunication Systems, 70, 447-489.
[36] Zhang, J., Ling, Y., Fu, X., Yang, X., Xiong, G., & Zhang, R. (2020). Model of the intrusion detection system based on the integration of spatial-temporal features. Computers & Security, 89, 101681.
[37] Rupa Devi, T., & Badugu, S. (2019). A review on network intrusion detection system using machine learning. International Conference on E-Business and Telecommunications, 598-607. Cham: Springer International Publishing.
[38] Ahmad, Z., Shahid Khan, A., Wai Shiang, C., Abdullah, J., & Ahmad, F. (2021). Network intrusion detection system: A systematic study of machine learning and deep learning approaches. Transactions on Emerging Telecommunications Technologies, 32(1), e4150.
[39] Al-Imran, M., & Ripon, S.H. (2021). Network intrusion detection: an analytical assessment using deep learning and state-of-the-art machine learning models. International Journal of Computational Intelligence Systems, 14(1), 200.
[40] Al-Saeed, I.A., Selamat, A., Rohani, M.F., Krejcar, O., & Chaudhry, J.A. (2020). A systematic state-of-the-art analysis of multi-agent intrusion detection. IEEE Access, 8, 180184-180209.
[41] Soriano-Valdez, D., Pelaez-Ballestas, I., Manrique de Lara, A., & Gastelum-Strozzi, A. (2021). The basics of data, big data, and machine learning in clinical practice. Clinical Rheumatology, 40(1), 11-23.
[42] Naoui, M.A., Lejdel, B., Ayad, M., Amamra, A., & Kazar, O. (2021). Using a distributed deep learning algorithm for analyzing big data in smart cities. Smart and Sustainable Built Environment, 10(1), 90-105.
intrusion detection systems: A systematic review. International Journal of Information Technology & Decision Making, 22(01), 589-636.
[51] Khraisat, A., Gondal, I., Vamplew, P., & Kamruzzaman, J. (2019). Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity, 2(1), 1-22.
[52] Rosli, N.A., Yassin, W., Faizal, M.A., & Selamat, S.R. (2019). Clustering analysis for malware behavior detection using registry data. International Journal of Advanced Computer Science and Applications (IJACSA), 10, 12.
[53] Beslin Pajila, P.J., Golden Julie, E., & Harold Robinson, Y. (2023). ABAP: Anchor node based DDoS attack detection using adaptive neuro-fuzzy inference system. Wireless Personal Communications, 128(2), 875-899.
[54] Saleh, A.I., Talaat, F.M., & Labib, L.M. (2019). A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers. Artificial Intelligence Review, 51, 403-443.
[55] Kharche, D., & Patil, R. (2020). Use of genetic algorithm with fuzzy class association rule mining for intrusion detection. International Journal of Computer Science and Information Technologies.
[56] Mehdi, M., & Khan, S. (2019). A novel intrusion detection system for detection of black hole attacks in MANET using fuzzy logic. International Journal of Computer Applications, 178(8), 12-17.
[57] Latif, Z., Sharif, K., Li, F., Karim, M.M., Biswas, S., & Wang, Y. (2020). A comprehensive survey of interface protocols for software defined networks. Journal of Network and Computer Applications, 156, 102563.
[58] Khraisat, A., Gondal, I., Vamplew, P., & Kamruzzaman, J. (2019). Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity, 2(1), 1-22.
[59] Ozkan-Okay, M., Samet, R., Aslan, Ö., & Gupta, D. (2021). A comprehensive systematic literature review on intrusion detection systems. IEEE Access, 9, 157727-157760.
[60] Nweke, L.O. (n.d.). A survey of specification-based intrusion detection techniques for cyber-physical systems.
[61] Garg, S., Kaur, K., Kumar, N., Kaddoum, G., Zomaya, A.Y., & Ranjan, R. (2019). A hybrid deep learning-based model for anomaly detection in cloud datacenter networks. IEEE Transactions on Network and Service Management, 16(3), 924-935.
[62] Amma, N.G., & Subramanian, S. (2019). Feature correlation map based statistical approach for denial of service attacks detection. 2019 5th International Conference on Computing Engineering and Design (ICCED), 1-6. IEEE.
[63] Siddique, K., Akhtar, Z., Khan, F.A., & Kim, Y. (2019). KDD cup 99 data sets: A perspective on the role of data sets in network intrusion detection research. Computer, 52(2), 41-51.
[64] Mokbal, F.M., Dan, W., Imran, A., Jiuchuan, L., Akhtar, F., & Xiaoxi, W. (2019). MLPXSS: an integrated XSS-based attack detection scheme in web applications using multilayer perceptron technique. IEEE Access, 7, 100567-100580.
[43] Qu, Z., Liu, H., Wang, Z., Xu, J., Zhang, P., & Zeng, H. (2021). A [65] Aslan, Ö., Ozkan-Okay, M., & Gupta, D. (2021). Intelligent behavior-
combined genetic optimization with AdaBoost ensemble model for based malware detection system on cloud computing environment. IEEE
anomaly detection in buildings electricity consumption. Energy and Access, 9, 83252-83271.
Buildings, 248, 111193. [66] Vinayakumar, R., Alazab, M., Soman, K.P., Poornachandran, P., &
[44] Or-Meir, O., Nissim, N., Elovici, Y., & Rokach, L. (2019). Dynamic Venkatraman, S. (2019). Robust intelligent malware detection using
malware analysis in the modern era—A state of the art survey. ACM deep learning. IEEE Access, 7, 46717-46738.
Computing Surveys (CSUR), 52(5), 1-48. [67] Abusnaina, A., Abuhamad, M., Alasmary, H., Anwar, A., Jang, R.,
[45] Alomari, E.S., Nuiaa, R.R., Alyasseri, Z.A., Mohammed, H.J., Sani, Salem, S., Nyang, D., & Mohaisen, D. (2021). Dl-fhmc: Deep learning-
N.S., & Esa, M.I., Musawi, B.A. (2023). Malware detection using deep based fine-grained hierarchical learning approach for robust malware
learning and correlation-based feature selection. Symmetry, 15(1), 123. classification. IEEE Transactions on Dependable and Secure
[46] Li, Y., Wen, Y., Tao, D., & Guan, K. (2019). Transforming cooling Computing, 19(5), 3432-3447.
optimization for green data center via deep reinforcement learning. IEEE
Transactions on Cybernetics, 50(5), 2002-2013.
[47] Fernandes, G., Rodrigues, J.J., Carvalho, L.F., Al-Muhtadi, J.F., &
Proença, M.L. (2019). A comprehensive survey on network anomaly
detection. Telecommunication Systems, 70, 447-489.
[48] Alhasan, S., Abdul-Salaam, G., Bayor, L., & Oliver, K. (2021). Intrusion
detection system based on artificial immune system: a review. 2021
International Conference on Cyber Security and Internet of Things (ICS
IoT), 7-14. IEEE.
[49] Mansouri, N., Javidi, M.M., & Mohammad Hasani Zade, B. (2021). A
CSO-based approach for secure data replication in cloud computing
environment. The Journal of Supercomputing, 77(6), 5882-5933.
[50] Alamleh, A., Albahri, O.S., Zaidan, A.A., Alamoodi, A.H., Albahri,
A.S., Zaidan, B.B., Qahtan, S., Binti Ismail, A.R., Malik, R.Q., Baqer,
M.J., & Jasim, A.N. (2023). Multi-attribute decision-making for
