| # | Project Title | Problem Statement | Data Sources / Datasets | Proposed Approach | Expected Outcome |
|---|---|---|---|---|---|
| 1 | Multimodal Intrusion Detection System | Combine network flow, log, and user behavior data to detect complex attacks. | CICIDS 2017 + LANL Log Dataset | Late Fusion CNN-LSTM + Feature Embedding | Hybrid model with improved detection accuracy for layered threats. |
| 2 | Deep Log Analyzer for Anomaly Detection | Use unstructured log data to identify abnormal events and correlate them with network activity. | HDFS, BGL, or LogHub | BERT-like log embedding + LSTM Autoencoder | Context-aware log-based intrusion detection. |
| 3 | Phishing Detection using Email Text + URL + HTML Structure | Combine text, links, and webpage layout data for phishing classification. | PhishTank + Enron Dataset | Multimodal CNN (for HTML) + BERT (for text) + Tabular ML Fusion | Robust phishing detection using multiple evidence sources. |
| 4 | Deep Learning for Malware Behavior Analysis | Learn malware patterns by analyzing both static features and dynamic behavior logs. | EMBER + Malimg + Process Behavior Logs | Dual-stream CNN + GRU Architecture | Model capable of identifying malware families by code and behavior. |
| 5 | Insider Threat Detection using Activity Logs and Emails | Integrate text (emails) and behavioral logs for insider risk prediction. | CERT Insider Threat Dataset | Text Embedding (BERT) + Graph Neural Network on activity graph | Early warning system for insider threat prediction. |
| 6 | Adaptive Ransomware Detection with File + Process Monitoring | Detect ransomware in real-time using file access patterns and system process traces. | Ransomware Activity Dataset | CNN on file metadata + LSTM for process sequence | Hybrid DL system for early ransomware detection. |
| 7 | Anomaly Detection in IoT Networks using Multimodal Data | Fuse device telemetry, network flow, and firmware logs for IoT security. | IoT-23 Dataset | Autoencoder + Transformer Fusion | Unified model for detecting IoT-based botnet attacks. |
| 8 | Fake Domain & URL Detection using Text + WHOIS + DNS Data | Combine textual and metadata information to detect malicious domains. | DGArchive + Alexa Top Domains | BERT for text + XGBoost for structured data | Accurate detection of AI-generated or malicious domains. |
| 9 | Hybrid Model for Network Attack Prediction | Forecast potential network intrusions using time-series data and historical alerts. | CICIDS + UNSW-NB15 | CNN-LSTM Hybrid + Temporal Graphs | Predictive system for upcoming attack patterns. |
| 10 | Federated Learning for Intrusion Detection | Train ML models collaboratively across multiple simulated sites without sharing raw data. | Synthetic network datasets | Federated CNN-LSTM + Secure Aggregation | Privacy-preserving collaborative intrusion detector. |
| 11 | Cross-Domain Threat Correlation using AI | Link attack events across logs, alerts, and emails using ML. | MISP Threat Intel + LogHub | Multi-view Graph Neural Networks | Unified cyber threat intelligence graph. |
| 12 | Explainable AI for Intrusion Detection | Build an interpretable ML/DL model for cybersecurity analysts. | NSL-KDD or CICIDS | XGBoost + SHAP + Attention Layers | Model that explains attack features transparently. |
| 13 | Multi-Source Botnet Detection | Detect botnets using network flows + DNS queries + host logs. | CTU-13 + IoT-23 | CNN for flows + GNN for host relations | Context-rich botnet detection framework. |
| 14 | Audio Deepfake and Voice Spoofing Detection | Detect synthetic audio/video using multimodal cues. | ASVspoof + DFDC Dataset | Spectrogram CNN + Visual Transformer | Real-time detection of deepfakes and voice clones. |
| 15 | Cyber Threat Prediction from Social Media | Identify emerging threats using posts, hashtags, and dark web chatter. | Twitter + OSINT datasets | BERT + Topic Modeling + Time-Series Analysis | Early prediction of cyber events from social signals. |
| 16 | Zero-Day Attack Prediction using Generative AI | Generate and detect new exploit patterns from mixed datasets. | ExploitDB + CVE + Vulnerability Data | Variational Autoencoder + GAN-based anomaly detector | Generative model predicting unseen attacks. |
| 17 | Multimodal Authentication using Voice + Typing + Face Data | Develop robust multi-factor user verification. | Combined open datasets (e.g., MOBIO, Keystroke) | Siamese CNN + LSTM | Secure continuous authentication model. |
| 18 | AI-Driven Threat Hunting Assistant | Use ML to analyze alerts, correlate patterns, and recommend responses. | SOC Alert Logs | BERT-based Alert Summarizer + Reinforcement Learning | Semi-automated analyst assistant for SOC. |
| 19 | Cloud Security Threat Detection using Multi-Log Fusion | Detect resource misuse by correlating VM, API, and network logs. | AWS CloudTrail, Azure Audit Logs | Multi-log Transformer Fusion | Cloud-native anomaly detector with explainable insights. |
| 20 | Hybrid Graph-Based Cyber Attack Chain Detection | Detect multi-stage attacks using linked security events. | DARPA Intrusion Dataset | Graph Neural Network + Event Sequence Learning | Multi-stage attack chain visualization and detection. |
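
Several of the proposed approaches rely on fusing modalities, and project 1 names late fusion explicitly. As a rough illustration only, the sketch below shows the fusion step on its own: each modality is assumed to have its own trained branch that emits an attack probability, and the probabilities are combined by a weighted average. The function name `late_fusion`, the modality names, the scores, the weights, and the 0.5 threshold are all hypothetical placeholders, not values from the table; a real system would obtain the scores from the CNN and LSTM branches and would likely learn the weights.

```python
from typing import Dict


def late_fusion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-modality attack probabilities by a weighted average.

    Both arguments are keyed by modality name. Weights are normalized
    here, so they do not need to sum to 1.
    """
    if not scores:
        raise ValueError("at least one modality score is required")
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * p for name, p in scores.items())
    return weighted_sum / total_weight


# Hypothetical outputs of three independently trained branches
# (network flow, system log, user behavior); not real model results.
branch_scores = {"flow": 0.9, "log": 0.6, "user": 0.3}

# Assumed, hand-picked weights; in practice these could be tuned or learned.
branch_weights = {"flow": 0.5, "log": 0.3, "user": 0.2}

fused = late_fusion(branch_scores, branch_weights)
is_attack = fused >= 0.5  # illustrative decision threshold
print(f"fused score = {fused:.2f}, flagged = {is_attack}")
```

Because the branches are combined only at the decision level, each one can be trained, replaced, or dropped independently, which is the usual argument for late fusion over feeding concatenated raw features into a single model.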