Script Bot vs Smart Bot Explained

Uploaded by

gtcxgamer

Unit -6

NLP
Q:- 1 What are the applications of NLP?
Ans:- 1. Automatic Text Summarization:- It is the process of creating the most meaningful and relevant summary of voluminous text from multiple sources.
2. Sentiment Analysis:- It is the process of identifying the sentiment expressed in text such as social media posts. It reflects whether a person's overall opinion is positive, negative or neutral.
3. Text Classification:- It is the process of classifying unstructured text into groups or categories, e.g. a spam filter.
4. Virtual Assistant:- Virtual assistants like Alexa, Siri and Cortana use NLP to interpret the human voice, understand the user's intent and perform tasks accordingly, such as playing your favourite playlist or setting an alarm.
5. Chatbot:- A chatbot automates tasks such as saying good morning when you wake up, telling you the news every day, helping you choose a route to school with less traffic, or ordering a coffee for you on your way back home. Examples: Mitsuku bot, Haptik, Ochatbot.
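
As a rough illustration of sentiment analysis from the list above, the sketch below labels a post as positive, negative or neutral by counting words from two small word lists. The word lists, the function name `sentiment` and the sample posts are all assumptions made for this example; real sentiment analysers are trained on far larger data.

```python
# Toy word lists, assumed purely for illustration.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "sad", "terrible"}

def sentiment(post):
    """Label a post by counting positive and negative words."""
    # Lowercase each word and strip common punctuation from its ends.
    words = [w.strip(".,!?").lower() for w in post.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

posts = [
    "I love this phone, great camera!",
    "Terrible battery, bad screen.",
    "It arrived on Monday.",
]
labels = [sentiment(p) for p in posts]
print(labels)  # ['positive', 'negative', 'neutral']
```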
Q:- 2 What is the difference between a Script Bot and a Smart Bot?
Ans:-
Script Bot | Smart Bot
1. Script bots are simple chatbots with limited functionality. | 1. Smart bots are flexible, powerful, AI-based models with wider functionality.
2. They are scripted for a specific task. | 2. They support machine learning algorithms that let the machine learn from experience. They simulate human-like interactions with users.
3. Script bots are easy to make. | 3. Smart bots are difficult to make.
4. They need less programming skill. | 4. They need a lot of programming and work on a bigger database.
5. Script bots are best suited for straightforward interactions, such as customer care services. | 5. Virtual assistants like Google Assistant, Alexa and Siri are examples of smart bots.
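
To make the idea of a script bot concrete, here is a minimal sketch that answers from a fixed script and falls back to a default reply for anything it was not scripted for. The keywords, replies and the name `script_bot_reply` are hypothetical, chosen only for this example.

```python
# Hypothetical script: keyword -> canned reply (assumed for illustration).
SCRIPT = {
    "hours": "We are open 9 am to 5 pm, Monday to Friday.",
    "refund": "Refunds are processed within 7 working days.",
    "hello": "Hello! How can I help you today?",
}
FALLBACK = "Sorry, I did not understand. Please contact customer care."

def script_bot_reply(message):
    """Return the reply for the first scripted keyword found in the message."""
    text = message.lower()
    for keyword, reply in SCRIPT.items():
        if keyword in text:
            return reply
    # The bot cannot learn new replies; unknown input gets the fallback.
    return FALLBACK

print(script_bot_reply("What are your HOURS?"))
print(script_bot_reply("Tell me a joke"))
```

A smart bot would replace the fixed lookup with a model that learns from data, which is why it needs more programming and a bigger database.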

Q:- 3 Name the techniques of NLP.
Ans:- The techniques of NLP are:
1. Bag of Words
2. Term Frequency
3. Inverse Document Frequency
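
The notes only name these techniques. As a rough illustration, and assuming one common textbook definition (term frequency is the number of times a word occurs in a document; inverse document frequency is log10 of the total number of documents divided by the number of documents containing the word), the two can be combined as sketched below. The tiny corpus and the function names are made up for this example.

```python
import math

# Two already-normalised documents (assumed sample data).
docs = [
    ["cat", "sat", "mat", "cat"],
    ["dog", "sat", "log"],
]

def term_frequency(word, doc):
    """Number of times the word occurs in one document."""
    return doc.count(word)

def inverse_document_frequency(word, docs):
    """log10(total documents / documents containing the word)."""
    containing = sum(1 for doc in docs if word in doc)
    if containing == 0:
        return 0.0  # word absent from the corpus; avoid dividing by zero
    return math.log10(len(docs) / containing)

def tf_idf(word, doc, docs):
    return term_frequency(word, doc) * inverse_document_frequency(word, docs)

print(tf_idf("cat", docs[0], docs))  # rare word: weight above zero
print(tf_idf("sat", docs[0], docs))  # word in every document: weight 0.0
```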
Q:- 4 What are the steps for Text Normalisation? Explain.
Ans:- Text Normalisation:- It is the process of cleaning textual data by converting the text into a standard form. Slang, short forms, misspelled words, abbreviations etc. are converted into their canonical form. The steps are:
a. Sentence Segmentation:- It is the process of sentence boundary detection, which divides the corpus into sentences.
b. Tokenization:- It is the process of dividing the sentences into tokens. A token is a word, number or special character.
c. Removing stopwords, special characters and numbers:- Words that do not provide any information about the corpus are removed in this step, along with special characters and numbers.
d. Converting text to a common case:- It is the process of converting the whole corpus to lowercase so that the same word in different cases is not treated as different words.
e. Stemming:- The process of removing affixes from a word to get back its base word is called stemming. The stemmed word may not be meaningful.
f. Lemmatization:- The process of removing affixes from a word to create a meaningful base word is called lemmatization.
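
Steps a to d above can be sketched with Python's standard `re` module, as below. This is a simplified illustration, not a full normaliser: the stopword list is a short assumed sample, sentence boundaries are detected only at `.`, `!` or `?` followed by whitespace, and tokens are lowercased before the stopword check so that "The" and "the" are treated alike. Function names are hypothetical.

```python
import re

# Short assumed stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "to", "of", "in"}

def segment_sentences(corpus):
    """Step a: split the corpus after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", corpus) if s.strip()]

def tokenize(sentence):
    """Step b: words and numbers become tokens; so does each special character."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def clean(tokens):
    """Steps c and d: drop numbers and special characters, lowercase, drop stopwords."""
    kept = []
    for token in tokens:
        if not token.isalpha():
            continue  # number or special character
        token = token.lower()
        if token not in STOPWORDS:
            kept.append(token)
    return kept

corpus = "The cat is in the garden. It caught 2 mice!"
sentences = segment_sentences(corpus)
normalised = [clean(tokenize(s)) for s in sentences]
print(sentences)   # ['The cat is in the garden.', 'It caught 2 mice!']
print(normalised)  # [['cat', 'garden'], ['it', 'caught', 'mice']]
```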
Q:- 5 Give examples of stemming and lemmatization.
Ans:- Stemming:- Healed - heal, Studies - studi, Caring - car
Lemmatization:- Studies - study, Caring - care
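
The examples above can be reproduced with a toy sketch. The stemmer below simply strips the first matching suffix from an assumed list, which is why it can produce non-words such as "studi" and "car". The lemmatizer is only a hypothetical lookup table; a real lemmatizer uses a vocabulary and morphological analysis. All names here are made up for the example.

```python
# Assumed suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix; the result may not be a real word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters of the word after stripping.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"healed": "heal", "studies": "study", "caring": "care"}

def lookup_lemma(word):
    """Return the meaningful base word if known, else the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)

for w in ["Healed", "Studies", "Caring"]:
    print(w, "stem:", naive_stem(w), "lemma:", lookup_lemma(w))
```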
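Q:- 6 What is Bag of Words?
Ans:- Bag of Words is an NLP model that records the unique words in a corpus and how often each one occurs. After text normalisation, the corpus becomes a normalised corpus, which is treated as just a collection (a "bag") of meaningful words with no regard to their sequence.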
Q:- 7 What are the steps involved in Bag of Words?
Ans:- The steps involved in the Bag of Words algorithm are:
1. Text Normalisation: The collected data is processed to get a normalised corpus.
2. Create Dictionary: Make a list of all the unique words in the normalised corpus.
3. Create Document Vectors: For a document in the corpus, list each dictionary word with its number of occurrences in that document.
4. Create Document Vectors for all the Documents: Repeat Step 3 for all documents in the corpus to create a "Document Vector Table".
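
The dictionary and document vector steps above can be sketched as follows, assuming text normalisation has already produced a list of tokens for each document. The sample documents and the function names are made up for illustration.

```python
# Already-normalised documents (assumed sample data).
documents = [
    ["cat", "sat", "mat"],
    ["dog", "sat", "log", "dog"],
]

def build_dictionary(docs):
    """List every unique word in the corpus, in first-seen order."""
    dictionary = []
    for doc in docs:
        for word in doc:
            if word not in dictionary:
                dictionary.append(word)
    return dictionary

def document_vector(doc, dictionary):
    """Count how often each dictionary word occurs in one document."""
    return [doc.count(word) for word in dictionary]

dictionary = build_dictionary(documents)
# Repeating the vector step for every document gives the document vector table.
table = [document_vector(doc, dictionary) for doc in documents]

print(dictionary)  # ['cat', 'sat', 'mat', 'dog', 'log']
for row in table:
    print(row)     # [1, 1, 1, 0, 0] then [0, 1, 0, 2, 1]
```

Each row of the table has one entry per dictionary word, so documents of different lengths become vectors of the same length.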
Q:- 8 Create a step by step approach to implement a bag of words algorithm.
