SMS Spam Detection with ML Algorithms

The document presents a comparative study of machine learning algorithms for SMS spam detection. It describes extracting features from a dataset of SMS messages, training classifiers, and using the trained classifiers to classify new messages. It evaluates the performance of logistic regression, naive Bayes, SVM, and neural network models on the task, finding that neural networks achieved the best accuracy of 97.67%. The document concludes that machine learning is effective for SMS spam detection and can help shield users from unwanted messages.

Uploaded by

Kavya Shetty

ALVA’S INSTITUTE OF ENGINEERING AND TECHNOLOGY

Comparative Study of Machine Learning Algorithms for SMS Spam Detection

Presented by,
[Link]
4al17cs042
UNDER THE GUIDANCE OF
Mrs. Reena Lobo
Assistant Professor
Contents
• Introduction
• Statement of Problem
• Objectives
• Methodology
• Application
• Conclusion
• References
Introduction
• The short message service (SMS) became popular after it was first offered as a service in the second-generation (2G) terrestrial mobile network architecture.
• Its popularity has been exploited by some advertising companies and others to spread unwanted advertising, promote offers, and send unwanted material to end users.
• These undesirable messages, known as spam, make it difficult for users to receive the messages they want and cause them frustration and irritation.
Statement of Problem
• SMS has become such an integral part of communication in contemporary society that service providers use it as the primary means of passing information to their subscribers.
• At the same time, spammers have exploited this opportunity to push their messages to mobile phone users in the interest of driving their business agenda.
Objectives
• Spam messages bother many users, not only because they are annoying but also because they intrude on end users’ devices and occupy memory that could have been used for other purposes.
• Since many mobile phone users rely on SMS communication, it is important to shield them from the negative impact of mobile spam as they use the service.
Methodology
System Model
• The SMS spam filtering system performs various functions that can be divided into five major sub-systems:
1. Feature Extraction System
2. Classifier Training System
3. Trained Classification System
4. Classification System
5. Decision System
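One way to picture how the five sub-systems fit together is the minimal sketch below. It assumes scikit-learn's CountVectorizer for feature extraction and LogisticRegression as the classifier. The toy messages, the helper names (spam_probability, decide), the 0.5 threshold, and the mapping of sub-system names to code are illustrative assumptions, not the presenters' implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data (hypothetical; not taken from the real dataset).
train_texts = [
    "WIN a FREE prize now, claim your cash",
    "URGENT free entry, call now to claim prize",
    "Are we still meeting for lunch today",
    "I will call you when I get home",
]
train_labels = ["spam", "spam", "ham", "ham"]

# 1. Feature extraction: turn each message into a vector of word counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# 2. Classifier training: fit a model on the labelled feature vectors.
classifier = LogisticRegression()
classifier.fit(X_train, train_labels)


# 3-4. Trained classifier / classification: score a new message with the
# fitted model, reusing the vocabulary learned during training.
def spam_probability(message):
    features = vectorizer.transform([message])
    spam_index = list(classifier.classes_).index("spam")
    return classifier.predict_proba(features)[0][spam_index]


# 5. Decision: compare the score with a threshold (0.5 is an assumption).
def decide(message, threshold=0.5):
    return "spam" if spam_probability(message) >= threshold else "ham"


print(decide("claim your free prize now"))
```

Keeping the decision step separate from the classifier makes it possible to move the threshold, for example to block fewer legitimate messages at the cost of letting more spam through.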
Importing Libraries and the Data Set
• The dataset, gathered in 2012 and containing 5,574 SMS text messages, was collected from the UCI Machine Learning Repository.
• After the dataset is saved in Google Drive, the first step in the process is importing the libraries using the Python programming language.
• The Python code for these steps is listed below.
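# Standard library: word counting and punctuation handling
from collections import Counter
import string

# Numerical, tabular and plotting libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Evaluation metrics and the first classifier
from sklearn.metrics import precision_score, f1_score, roc_auc_score, accuracy_score, recall_score
from sklearn.linear_model import LogisticRegression

# Natural language toolkit for text preprocessing
import nltk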
Cleaning and Visualizing the Dataset
• The dataset obtained from Google Drive contains the identification number of each message, its classification as either spam or ham, the text message itself, and three other columns that come with default values.
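A minimal sketch of this cleaning step with pandas is given here. The column names (id, label, text, extra_1 to extra_3) and the three sample rows are hypothetical stand-ins, since the slides do not show the actual headers. Encoding the class as 0/1 is a common convention and likewise an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the raw file; column names are assumptions.
raw = pd.DataFrame({
    "id": [0, 1, 2],
    "label": ["ham", "spam", "ham"],
    "text": ["See you at 5", "FREE entry, text WIN now", "Ok, thanks"],
    "extra_1": [None, None, None],
    "extra_2": [None, None, None],
    "extra_3": [None, None, None],
})

# Keep only the columns that carry information.
clean = raw.drop(columns=["extra_1", "extra_2", "extra_3"])

# Encode the class as a number for the classifiers: ham -> 0, spam -> 1.
clean["is_spam"] = clean["label"].map({"ham": 0, "spam": 1})

# Class balance is usually the first thing worth visualizing.
print(clean["label"].value_counts())
```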
Presented by,
[Link]
4al17cs054
The cleaning and visualization process ends with the addition of one more column that helps in understanding the data. The column “Word Count” records the number of words in each text message.
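The “Word Count” column could be computed as in the short sketch below. It assumes the message text sits in a column named text (a hypothetical name) and that words are separated by whitespace; the sample messages are made up.

```python
import pandas as pd

# Made-up sample messages; "text" is an assumed column name.
messages = pd.DataFrame(
    {"text": ["See you at 5", "FREE entry, text WIN now", "Ok"]}
)

# Split each message on whitespace and count the resulting pieces.
messages["Word Count"] = messages["text"].str.split().str.len()

print(messages)
```

Splitting on whitespace treats punctuation attached to a word as part of that word, which is usually sufficient for a quick look at message length.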
Splitting the Dataset into Train and Test
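The supports reported for the classifiers (957 ham and 158 spam, 1,115 messages in total) are consistent with holding out about 20% of the 5,574 messages for testing. The sketch below assumes that 80/20 ratio and scikit-learn's train_test_split. The placeholder messages, the labels, and random_state are illustrative assumptions.

```python
from sklearn.model_selection import train_test_split

# Placeholder data with the same size as the dataset (5,574 messages).
texts = [f"message {i}" for i in range(5574)]
labels = ["spam" if i % 7 == 0 else "ham" for i in range(5574)]

# Hold out 20% of the messages for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```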
Logistic Regression
Accuracy – 94.26%
Precision for detecting ham – 0.94
Precision for detecting spam – 0.99
Recall rate of ham – 1.0
Recall rate of spam – 0.6
Support of ham – 957
Support of spam – 158
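As a consistency check on the logistic regression figures, overall accuracy can be recovered from the per-class recall and support, because recall times support gives the number of correctly classified messages in each class. The helper name accuracy_from_recall is hypothetical, and the result is approximate because the reported recall values are rounded.

```python
def accuracy_from_recall(recalls, supports):
    """Overall accuracy = correctly classified messages / all messages."""
    correct = sum(r * n for r, n in zip(recalls, supports))
    return correct / sum(supports)


# Logistic regression figures in (ham, spam) order.
acc = accuracy_from_recall(recalls=[1.0, 0.6], supports=[957, 158])
print(f"{acc:.4f}")
```

The result is about 0.943, which agrees with the reported 94.26% to within the rounding of the recall values.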


Naïve Bayes Algorithm
Accuracy – 88.16%
Precision for detecting ham – 0.97
Precision for detecting spam – 0.56
Recall rate of ham – 0.89
Recall rate of spam – 0.82
Support of ham – 957
Support of spam – 158
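The low spam precision of Naïve Bayes can be understood by rebuilding an approximate confusion matrix from the recall and support values. The sketch below does that arithmetic; the helper name precision_from_recall is hypothetical, and the counts are estimates because the reported recalls are rounded to two decimals.

```python
def precision_from_recall(recall_ham, recall_spam, n_ham, n_spam):
    """Rebuild an approximate confusion matrix, then compute precision."""
    ham_as_ham = round(recall_ham * n_ham)      # ham correctly kept
    spam_as_spam = round(recall_spam * n_spam)  # spam correctly caught
    ham_as_spam = n_ham - ham_as_ham            # false alarms
    spam_as_ham = n_spam - spam_as_spam         # missed spam
    precision_ham = ham_as_ham / (ham_as_ham + spam_as_ham)
    precision_spam = spam_as_spam / (spam_as_spam + ham_as_spam)
    return precision_ham, precision_spam


# Naive Bayes figures: recall of ham, recall of spam, and the supports.
p_ham, p_spam = precision_from_recall(0.89, 0.82, 957, 158)
print(f"ham precision ~ {p_ham:.2f}, spam precision ~ {p_spam:.2f}")
```

Roughly 105 of the 957 ham messages are flagged as spam, which is comparable to the roughly 130 spam messages caught, so only a little over half of the messages flagged as spam are actually spam.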


SVM
Accuracy – 94.26%
Precision for detecting ham – 0.94
Precision for detecting spam – 0.99
Recall rate of ham – 1.0
Recall rate of spam – 0.6
Support of ham – 957
Support of spam – 158


Neural Network
The neural network was built and evaluated using the TensorFlow backend. The model had 137,153 parameters, all of them trainable, and it achieved an accuracy of 97.67%. Consequently, it gave better results than the previous models.
Application

The spam filtering concept used for email can be applied to SMS as well. This is useful for users because spam and ham messages are distinguished automatically, so users are shielded from unwanted messages.
Conclusion

• Machine learning is the most popular technique used in the classification of messages into spam or ham.
• Its successful use in email spam classification systems makes it a viable option for the classification of mobile spam messages.
• Consequently, machine learning techniques have been adopted in implementing the system of spam detection, classification, and blocking.
References

• K. Mathew and B. Issac, "Intelligent Spam Classification for Mobile Text Message," in 2011 International Conference on Computer Science and Network Technology, Kuching, Malaysia, 2011.
• H. Shirani-Mehr, "SMS Spam Detection using Machine Learning Approach," Stanford University, CA, 2013.
THANK YOU!!!!!!!!
