
Analyzing Nyka Product Reviews with NLP

The document discusses Natural Language Processing (NLP) and its application in analyzing Nyka product reviews, utilizing the NLTK Python package. It outlines the data preprocessing steps, including text cleaning and vectorization methods such as Bag of Words and TF-IDF. Additionally, it provides links to GitHub repositories related to the project.

Uploaded by

ashokchakram5

NLP: Natural Language Processing
Topic: Nyka product reviews
Group 1
NLP
• Natural language processing (NLP) is a field that focuses on making natural human language usable by computer programs.
• NLTK, or Natural Language Toolkit, is a Python package that you can use for NLP.
• Much of the data you might analyze is unstructured and contains human-readable text.
• Since machines can process only numeric values, we use NLP techniques to convert text into vectors.
LIBRARIES USED
• import pandas as pd
• import nltk
• from nltk.tokenize import sent_tokenize, word_tokenize
• from nltk.corpus import stopwords
• from nltk.stem import PorterStemmer, LancasterStemmer, SnowballStemmer
• from nltk.stem import WordNetLemmatizer
• from sklearn.feature_extraction.text import CountVectorizer
• from sys import getsizeof
• from sklearn.feature_extraction.text import TfidfVectorizer
• import re
• import warnings
• warnings.filterwarnings('ignore')
Nyka product reviews: DataFrame

Number of data points = 61284

Dependent and independent columns

Independent text column: Review text

DataFrame after removing NaN-value rows

Number of data points = 61276
Steps involved in Text Pre-Processing

1. Convert text into lowercase or uppercase.
2. Demojize the emojis.
3. Remove the HTML tags.
4. Remove web page links (Hypertext Transfer Protocol Secure, i.e. HTTPS URLs).
5. Remove special characters except for space.
6. Remove stop words.
7. Apply stemming and lemmatization.
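The steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the group's actual code: `EMOJI_NAMES` and `STOP_WORDS` are tiny hypothetical stand-ins (a real pipeline would more likely use a dedicated emoji library and NLTK's stop-word list), digits are kept alongside letters, and step 7 (stemming and lemmatization) is left out because it depends on NLTK and its downloaded data.

```python
import re

# Hypothetical stand-ins for an emoji library and NLTK's stop-word list.
EMOJI_NAMES = {"😍": " heart_eyes ", "👍": " thumbs_up "}
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "i", "and", "to", "of"}


def clean_review(text):
    """Apply steps 1-6 of the pre-processing list to one review string."""
    text = text.lower()                                 # 1. lowercase
    for symbol, name in EMOJI_NAMES.items():            # 2. demojize
        text = text.replace(symbol, name)
    text = re.sub(r"<[^>]+>", " ", text)                # 3. remove HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # 4. remove URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)            # 5. remove special characters
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # 6. stop words
    return " ".join(tokens)


cleaned = clean_review("I LOVE this lipstick 😍 <br> See https://example.com !!!")
print(cleaned)
```

Because step 5 removes everything except letters, digits, and whitespace, the underscore in a demojized name such as `heart_eyes` becomes a space, so the emoji ends up as two ordinary tokens.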
Text pre-processing
Cleaned review
Converting text to vectors using Bag of Words
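As a rough illustration of the Bag of Words idea, the sketch below builds a vocabulary from two made-up reviews and counts how often each vocabulary word occurs in each review. It is written from scratch in plain Python so that the mechanics are visible; the deck itself imports scikit-learn's `CountVectorizer`, which performs the same kind of counting with its own tokenization rules. The sample reviews and the names `docs`, `vocab`, and `bow_vector` are assumptions made for this example.

```python
from collections import Counter

# Two made-up, already-cleaned reviews (assumed data for illustration).
docs = ["good product good price", "bad product"]

# Vocabulary: every distinct token, sorted so the column order is stable.
vocab = sorted({token for doc in docs for token in doc.split()})


def bow_vector(doc):
    """Return the count of each vocabulary word in one document."""
    counts = Counter(doc.split())
    return [counts[token] for token in vocab]


bow_matrix = [bow_vector(doc) for doc in docs]
print(vocab)       # ['bad', 'good', 'price', 'product']
print(bow_matrix)  # [[0, 2, 1, 1], [1, 0, 0, 1]]
```

Each row is one review and each column is one vocabulary word, so word order within a review is discarded and only the counts remain.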
Converting text to vectors using TF-IDF
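A minimal sketch of TF-IDF, assuming one common textbook variant: term frequency is the word count divided by the review length, and inverse document frequency is ln(N / df), where N is the number of reviews and df is the number of reviews containing the word. scikit-learn's `TfidfVectorizer`, which the deck imports, uses a smoothed IDF and L2 normalization by default, so its numbers will differ from these. The sample reviews and all names are assumptions made for this example.

```python
import math
from collections import Counter

# Two made-up, already-cleaned reviews (assumed data for illustration).
docs = ["good product good price", "bad product"]
tokenized = [doc.split() for doc in docs]
vocab = sorted({t for tokens in tokenized for t in tokens})
n_docs = len(tokenized)

# Document frequency: number of reviews that contain each token.
doc_freq = {t: sum(t in tokens for tokens in tokenized) for t in vocab}
# Unsmoothed IDF; a word present in every review gets weight 0.
idf = {t: math.log(n_docs / doc_freq[t]) for t in vocab}


def tfidf_vector(tokens):
    """Weight each vocabulary word by (count / length) * idf."""
    counts = Counter(tokens)
    return [counts[t] / len(tokens) * idf[t] for t in vocab]


tfidf_matrix = [tfidf_vector(tokens) for tokens in tokenized]
for row in tfidf_matrix:
    print([round(value, 3) for value in row])
```

The word "product" appears in both reviews, so its weight is zero in this variant, while "good" and "bad" each appear in only one review and receive positive weights.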
Vector conversion of Review Rating Column

Steps involved in converting numbers into vectors:

• Dealing with NaN values (removing them)
• Separate the ratings into those >3 and those <3, and convert them into positive or negative reviews
• Applying One-Hot Encoding

One-Hot Encoding on Rating column
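The rating steps can be sketched in plain Python as below. The sample ratings are made up, and the handling of a rating of exactly 3 is an assumption: the slide mentions only ratings above or below 3, so this sketch drops 3s together with the NaN values. In practice the same result could be obtained with pandas, which the deck imports.

```python
import math

# Made-up ratings, including one missing value (assumed data).
ratings = [5, 4, float("nan"), 1, 3, 2]


def to_sentiment(rating):
    """Map a rating to 'positive' or 'negative'; None means drop the row."""
    if rating is None or math.isnan(rating):
        return None          # step 1: remove NaN values
    if rating > 3:
        return "positive"    # step 2: ratings above 3
    if rating < 3:
        return "negative"    # step 2: ratings below 3
    return None              # assumption: a rating of exactly 3 is dropped


labels = [s for s in map(to_sentiment, ratings) if s is not None]

# Step 3: one-hot encode, one column per class in sorted order.
classes = sorted(set(labels))
one_hot = [[int(label == c) for c in classes] for label in labels]

print(classes)  # ['negative', 'positive']
print(one_hot)  # [[0, 1], [0, 1], [1, 0], [1, 0]]
```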
GitHub Links

Name                 GitHub Link
Gogula Vinay         [Link] arning_NLP_Assingnment-[Link]
Dhanya Palacharla    [Link] n%20nykaa%[Link]
Sahithi Chowdary     [Link] ews-NLP-
Vijay Bhaskar        [Link]
Suresh Gurrala       [Link]
