NLP 2

The document outlines a series of Python programming tasks focused on Natural Language Processing (NLP) using the NLTK library. It covers various techniques such as tokenization, stop word removal, stemming, word sense disambiguation, part of speech tagging, and converting audio to text. Each week includes specific aims, procedures, code examples, and expected outputs for practical implementation of NLP concepts.

#WEEK 01

Write a Python program to perform the following tasks on text:

a) Tokenization
b) Stop word removal
Aim: To implement tokenization and stop word removal on a given text using Python in order to
preprocess textual data for Natural Language Processing (NLP) applications.
Tools: Python, NLTK
Procedure:
1. Install NLTK using pip install nltk.
2. Write a Python script to tokenize text into words/sentences.
3. Implement stop word removal using the NLTK stopword corpus.
4. Execute the program and analyze the output.
#CODE
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
# Download the tokenizer models and the stop word lists (needed once)
nltk.download('punkt_tab')
nltk.download('stopwords')
text = "Natural Language Processing is a branch of Artificial Intelligence."
# Step 1: Tokenization
tokens = word_tokenize(text)
print("Tokens:")
print(tokens)
# Step 2: Stop word removal (compare in lowercase, keep the original casing)
stop_words = set(stopwords.words('english'))
filtered_tokens = [word for word in tokens if word.lower() not in stop_words]
print("\nTokens after Stop Word Removal:")
print(filtered_tokens)
#OUTPUT
Tokens:
['Natural', 'Language', 'Processing', 'is', 'a', 'branch', 'of', 'Artificial', 'Intelligence', '.']
Tokens after Stop Word Removal:
['Natural', 'Language', 'Processing', 'branch', 'Artificial', 'Intelligence', '.']
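To see what the two steps do without any downloaded data, the same pipeline can be sketched with the standard library only. This is an illustration under stated assumptions, not NLTK's method: the regular expression is a rough stand-in for word_tokenize, and STOP_WORDS is a hypothetical, deliberately tiny list (NLTK's English list is much longer).

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only.
STOP_WORDS = {"is", "a", "of", "the", "and", "in", "to"}

def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens whose lowercase form is in the stop word set."""
    return [tok for tok in tokens if tok.lower() not in stop_words]

text = "Natural Language Processing is a branch of Artificial Intelligence."
tokens = tokenize(text)
filtered = remove_stop_words(tokens)
print(tokens)
print(filtered)
```

For this sentence the sketch yields the same tokens as the NLTK program; on text with contractions or abbreviations the two tokenizers would likely differ.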
#WEEK 02
Install the NLTK toolkit and perform stemming
• Aim: To implement a simplified version of the Porter stemming algorithm from scratch (the NLTK implementation is used in Week 06).
• Tools: Python
#CODE
class PorterStemmer:
    """Simplified suffix-stripping stemmer (not the full Porter algorithm)."""

    def __init__(self):
        self.vowels = "aeiou"
        # Suffix groups kept for reference; the step methods hard-code their rules
        self.suffixes = {
            1: ["s", "es", "ed", "ing"],
            2: ["ly", "er", "ment"],
            3: ["iest", "ness", "ful", "ous"]
        }

    def is_vowel(self, ch):
        return ch in self.vowels

    def step1(self, word):
        # Plural and -ied/-ies endings
        if word.endswith("sses"):
            return word[:-2]  # sses -> ss
        if word.endswith("ied") or word.endswith("ies"):
            return word[:-2]  # ied/ies -> i
        if word.endswith("s") and not word.endswith("ss"):
            return word[:-1]  # remove plural s
        return word

    def step2(self, word):
        # Verb endings
        if word.endswith("ing"):
            return word[:-3]
        if word.endswith("ed"):
            return word[:-2]
        return word

    def step3(self, word):
        # Derivational suffixes
        if word.endswith("ness"):
            return word[:-4]
        if word.endswith("ful"):
            return word[:-3]
        if word.endswith("ous"):
            return word[:-3]
        return word

    def stem(self, word):
        word = word.lower()
        word = self.step1(word)
        word = self.step2(word)
        word = self.step3(word)
        return word

ps = PorterStemmer()
words = ["running", "happiness", "easily", "jumping", "fairly", "savings", "flies"]
for word in words:
    print(f"Original: {word}, Stemmed: {ps.stem(word)}")
#OUTPUT
Original: running, Stemmed: runn
Original: happiness, Stemmed: happi
Original: easily, Stemmed: easily
Original: jumping, Stemmed: jump
Original: fairly, Stemmed: fairly
Original: savings, Stemmed: sav
Original: flies, Stemmed: fli
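The stem "runn" for "running" shows one rule the simplified class leaves out: after removing "ing" or "ed", the Porter algorithm collapses a trailing double consonant unless it is l, s, or z. A minimal sketch of that rule follows. The helper names undouble and strip_ing are hypothetical, and vowel detection is reduced to the letters a, e, i, o, u, which is coarser than the original algorithm's treatment of y.

```python
VOWELS = "aeiou"

def undouble(stem):
    """Collapse a trailing double consonant, except l, s, and z."""
    if (len(stem) >= 2 and stem[-1] == stem[-2]
            and stem[-1] not in VOWELS and stem[-1] not in "lsz"):
        return stem[:-1]
    return stem

def strip_ing(word):
    """Remove -ing only when the remaining stem contains a vowel."""
    stem = word[:-3]
    if word.endswith("ing") and any(ch in VOWELS for ch in stem):
        return undouble(stem)
    return word

for word in ["running", "jumping", "falling", "sing"]:
    print(word, "->", strip_ing(word))
```

The vowel check keeps short words such as "sing" intact, since stripping "ing" there would leave only "s".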
#WEEK 03
Write Python programs for:
• Word Analysis
• Word Generation
AIM: To write Python programs for:
a) Word Analysis – to analyze words in a given text
b) Word Generation – to generate words using linguistic rules and random methods
Tools: Python, NLTK
Procedure:
1. Write a function to analyze words based on their frequency and linguistic features.
2. Implement a function to generate words based on affix rules.
#CODE
import nltk
from collections import Counter
from nltk.tokenize import word_tokenize
nltk.download('punkt_tab')
text = "This is a sample text for word frequency analysis. Analysis is important."
# Tokenize, then count how often each token occurs
tokens = word_tokenize(text)
word_freq = Counter(tokens)
print("Word Frequencies:", word_freq)
#OUTPUT
Word Frequencies: Counter({'is': 2, '.': 2, 'This': 1, 'a': 1, 'sample': 1, 'text': 1, 'for': 1, 'word': 1,
'frequency': 1, 'analysis': 1, 'Analysis': 1, 'important': 1})
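In the output above, 'analysis' and 'Analysis' are counted separately and the full stop is treated as a word. If case-insensitive word counts are wanted, one option is to lowercase the text and keep only alphabetic runs before counting. The sketch below uses only the standard library; the regular expression is an assumption that suits plain English text like this sample.

```python
import re
from collections import Counter

text = "This is a sample text for word frequency analysis. Analysis is important."

# Lowercase first so that "Analysis" and "analysis" fall into the same bucket,
# and keep only runs of letters so punctuation is not counted.
words = re.findall(r"[a-z]+", text.lower())
word_freq = Counter(words)

print("analysis:", word_freq["analysis"])
print("is:", word_freq["is"])
```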
#CODE
# Word Generation using Prefix and Suffix
root_word = input("Enter a root word: ")
prefixes = ["un", "re", "pre"]
suffixes = ["ing", "ed", "ly"]
print("\nGenerated Words:")
for p in prefixes:
    print(p + root_word)
for s in suffixes:
    print(root_word + s)
#OUTPUT
Enter a root word: happy
Generated Words:
unhappy
rehappy
prehappy
happying
happyed
happyly
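Because the program joins the root and the affix by plain concatenation, "happy" plus "ly" comes out as "happyly". A spelling rule can be layered on top. The sketch below uses a hypothetical helper, add_suffix, and one simplified English rule: a final y after a consonant becomes i unless the suffix starts with i. It does not check whether the result is a real word.

```python
VOWELS = "aeiou"

def add_suffix(root, suffix):
    """Attach a suffix, changing a final consonant+y to i (simplified rule)."""
    if (len(root) >= 2 and root.endswith("y") and root[-2] not in VOWELS
            and not suffix.startswith("i")):
        return root[:-1] + "i" + suffix
    return root + suffix

for suffix in ["ing", "ed", "ly"]:
    print(add_suffix("happy", suffix))
```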
#WEEK 04
Create a sample list of at least 5 words with ambiguous senses and write a Python program to
implement WSD.
• Aim: To implement Word Sense Disambiguation (WSD) using the Lesk algorithm in
Python with the help of the NLTK WordNet corpus, and to identify the correct meaning of
an ambiguous word based on its context in a sentence.
Tools: Python, WordNet (NLTK)
Procedure
1. Import the required modules: lesk from nltk.wsd and wordnet from nltk.corpus.
2. Define a sentence containing an ambiguous word whose meaning depends on context.
3. Specify the target word to be disambiguated.
4. Split the sentence into individual words to form the context.
5. Apply the lesk() function by passing the context words and the target word.
6. Obtain the most appropriate WordNet synset returned by the Lesk algorithm.
7. Display the identified sense along with its definition.
#CODE
import nltk
from nltk.wsd import lesk
from nltk.corpus import wordnet
nltk.download('wordnet')
# (sentence, ambiguous word) pairs
sentences = [
    ("The bank will not be open until tomorrow.", "bank"),
    ("He hit the ball with a bat.", "bat"),
    ("The plant produces electricity.", "plant"),
    ("The crane lifted the heavy container.", "crane"),
    ("She bought a new mouse for her laptop.", "mouse")
]
for sentence, word in sentences:
    # The context is the sentence split into words
    sense = lesk(sentence.split(), word)
    print("Sentence:", sentence)
    print("Ambiguous Word:", word)
    print("Best Sense:", sense)
    print("Definition:", sense.definition() if sense else "No sense found")
    print("-" * 60)
#OUTPUT
Sentence: The bank will not be open until tomorrow.
Ambiguous Word: bank
Best Sense: Synset('deposit.v.02')
Definition: put into a bank account
------------------------------------------------------------
Sentence: He hit the ball with a bat.
Ambiguous Word: bat
Best Sense: Synset('squash_racket.n.01')
Definition: a small racket with a long handle used for playing squash
------------------------------------------------------------
Sentence: The plant produces electricity.
Ambiguous Word: plant
Best Sense: Synset('plant.v.06')
Definition: put firmly in the mind
------------------------------------------------------------
Sentence: The crane lifted the heavy container.
Ambiguous Word: crane
Best Sense: Synset('grus.n.01')
Definition: a small constellation in the southern hemisphere near Phoenix
------------------------------------------------------------
Sentence: She bought a new mouse for her laptop.
Ambiguous Word: mouse
Best Sense: Synset('mouse.v.02')
Definition: manipulate the mouse of a computer
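Several of the senses above look wrong for their sentences, which is typical of the basic Lesk idea: it picks the sense whose dictionary gloss shares the most words with the context, and short sentences give it little to match on. The sketch below illustrates that overlap count on a hypothetical two-sense inventory. The glosses and the stop word list are written for this example and are not taken from WordNet, and the code is a simplification rather than NLTK's implementation.

```python
# Hypothetical mini sense inventory: sense name -> gloss (not WordNet data).
SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a river or lake",
    },
}

STOP_WORDS = {"a", "an", "and", "the", "that", "or", "of", "to", "on", "at", "he", "she"}

def content_words(text):
    """Lowercase, strip basic punctuation, and drop stop words."""
    return {w.strip(".,").lower() for w in text.split()} - STOP_WORDS

def simplified_lesk(sentence, word):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = content_words(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("He sat on the bank of the river", "bank"))
print(simplified_lesk("She deposits money at the bank", "bank"))
```

When no gloss shares a word with the context, every overlap is zero and the first sense listed wins, which is one way an unrelated sense can be returned.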
#WEEK 05
Create a sample list of at least 10 words for POS tagging and find the POS for any given
word
Aim: To create a sample list of words and perform Part of Speech (POS) tagging using the NLTK
toolkit in Python and to find the POS tag for any given word.
Tools: Python, NLTK
Procedure
1. Install the NLTK library using pip.
2. Import the required NLTK modules in Python.
3. Download necessary NLTK data packages such as tokenizer and POS tagger.
4. Create a sample list of at least 10 words.
5. Use the pos_tag() function to assign POS tags to each word in the list.
6. Display the words along with their corresponding POS tags.
7. Provide a given word and find its POS tag using the same POS tagging function.
8. Observe and verify the output.
#CODE
import nltk
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')
words = [
    "running", "dog", "beautiful", "quickly", "eat",
    "computer", "happy", "students", "write", "very"
]
# Tag the whole list at once
pos_tags = nltk.pos_tag(words)
print("Word\t\tPOS Tag")
print("------------------------")
for word, tag in pos_tags:
    print(f"{word}\t\t{tag}")
# Tag a single given word
given_word = "running"
tag = nltk.pos_tag([given_word])
print(f"\nGiven Word: {given_word}")
print(f"POS Tag : {tag[0][1]}")
#OUTPUT
Word POS Tag
------------------------
running VBG
dog NN
beautiful JJ
quickly RB
eat VBP
computer NN
happy JJ
students NNS
write VBP
very RB
Given Word: running
POS Tag : VBG
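The tags in this output are Penn Treebank abbreviations. A small lookup table makes them readable. The sketch below covers only the six tags that appear above; the dictionary and the helper name describe are assumptions made for this example.

```python
# Descriptions of the Penn Treebank tags that appear in the output above.
TAG_DESCRIPTIONS = {
    "JJ": "adjective",
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "RB": "adverb",
    "VBG": "verb, gerund or present participle",
    "VBP": "verb, non-3rd person singular present",
}

def describe(tag):
    """Return a readable description for a tag, or a fallback if unknown."""
    return TAG_DESCRIPTIONS.get(tag, "unknown tag")

tagged = [("running", "VBG"), ("dog", "NN"), ("students", "NNS")]
for word, tag in tagged:
    print(f"{word}: {tag} ({describe(tag)})")
```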
#WEEK 06
Install the NLTK toolkit and perform stemming
• Aim: To apply the Porter stemming algorithm using the PorterStemmer class provided by NLTK.
• Tools: Python, NLTK
#CODE
from nltk.stem import PorterStemmer
ps = PorterStemmer()
words = ["running", "flies", "jumps", "easily", "fairly"]
# Stem every word in the list
stemmed_words = [ps.stem(word) for word in words]
print("Stemmed Words:", stemmed_words)
#OUTPUT
Stemmed Words: ['run', 'fli', 'jump', 'easili', 'fairli']
#WEEK 07
1. Perform morphological analysis using the NLTK library
Aim: To perform lemmatization using the WordNet Lemmatizer in Python and convert inflected
words into their base (dictionary) forms using appropriate parts of speech (POS).
Tools: Python, NLTK
Procedure:
1. Install and import the NLTK library in Python.
2. Download the required datasets, such as wordnet.
3. Input the sentence for analysis.
4. Perform lemmatization using WordNet Lemmatizer with proper POS mapping.
5. Display the output lemmas.
#CODE
import nltk
from nltk.stem import WordNetLemmatizer
nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()
# (word, WordNet POS) pairs: a = adjective, v = verb, n = noun
words = [("better", "a"), ("running", "v"), ("wolves", "n")]
for word, pos in words:
    print(lemmatizer.lemmatize(word, pos=pos))
#OUTPUT
good
run
wolf
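The procedure calls for a POS mapping, while the program passes the WordNet POS letters by hand. When the tags come from a Penn Treebank tagger such as the one used in Week 05, they can be mapped to the letters the lemmatizer expects. The helper below, penn_to_wordnet, is a hypothetical name, and treating every unrecognized tag as a noun is an assumption chosen to match the lemmatizer's default.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter (defaults to noun)."""
    if tag.startswith("J"):
        return "a"  # adjective (JJ, JJR, JJS)
    if tag.startswith("V"):
        return "v"  # verb (VB, VBD, VBG, ...)
    if tag.startswith("R"):
        return "r"  # adverb (RB, RBR, RBS)
    return "n"      # noun and everything else

for tag in ["JJ", "VBG", "NNS", "RB"]:
    print(tag, "->", penn_to_wordnet(tag))
```

The returned letter can then be passed as the pos argument of lemmatize, as in the program above.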
2. Generate n-grams using the NLTK ngrams utility
#CODE
import nltk
from nltk.util import ngrams
from nltk.tokenize import word_tokenize
nltk.download('punkt_tab')  # tokenizer data required by word_tokenize
text = input("Enter the text: ")
n = int(input("Enter the value of n: "))
tokens = word_tokenize(text)
# Slide a window of n tokens across the text
generated_ngrams = list(ngrams(tokens, n))
print(f"\n{n}-grams are:")
for gram in generated_ngrams:
    print(gram)
#OUTPUT
Enter the text: Swarm intelligence is inspired by nature
Enter the value of n: 3
3-grams are:
('Swarm', 'intelligence', 'is')
('intelligence', 'is', 'inspired')
('is', 'inspired', 'by')
('inspired', 'by', 'nature')
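An n-gram is just a sliding window of n consecutive tokens, so the same result can be produced without NLTK. The sketch below uses a hypothetical helper, make_ngrams, and splits on whitespace instead of calling word_tokenize.

```python
def make_ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    # zip stops at the shortest slice, so only full windows are produced
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = "Swarm intelligence is inspired by nature".split()
trigrams = make_ngrams(tokens, 3)
for gram in trigrams:
    print(gram)
```

A sentence of 6 tokens gives 6 - 3 + 1 = 4 trigrams, matching the output above.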
3. Implement N-Grams Smoothing
#CODE
import nltk
from nltk.util import ngrams
from collections import Counter
corpus = [
    "I love natural language processing",
    "I love machine learning",
    "natural language processing is fun"
]
# Build one lowercase token stream from all sentences
tokens = []
for sentence in corpus:
    tokens.extend(sentence.lower().split())
vocab = set(tokens)
V = len(vocab)  # vocabulary size
bigrams = list(ngrams(tokens, 2))
bigram_counts = Counter(bigrams)
unigram_counts = Counter(tokens)
def bigram_probability(w1, w2):
    # Add-one (Laplace) smoothing: (count(w1, w2) + 1) / (count(w1) + V)
    bigram_count = bigram_counts[(w1, w2)]
    unigram_count = unigram_counts[w1]
    probability = (bigram_count + 1) / (unigram_count + V)
    return probability
print("P(love | I) =", bigram_probability("i", "love"))
print("P(language | love) =", bigram_probability("love", "language"))
print("P(fun | machine) =", bigram_probability("machine", "fun"))
#OUTPUT
P(love | I) = 0.2727272727272727
P(language | love) = 0.09090909090909091
P(fun | machine) = 0.1
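The values above can be checked by hand. The corpus has V = 9 distinct words, "i" occurs twice and is followed by "love" both times, so P(love | i) = (2 + 1) / (2 + 9) = 3/11. The sketch below recomputes this with exact fractions and verifies that the smoothed probabilities for one history word sum to 1. It rebuilds the counts with the standard library only, and the function name smoothed is hypothetical.

```python
from collections import Counter
from fractions import Fraction

corpus = [
    "I love natural language processing",
    "I love machine learning",
    "natural language processing is fun",
]
tokens = [w for sentence in corpus for w in sentence.lower().split()]
vocab = sorted(set(tokens))
V = len(vocab)
bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)

def smoothed(w1, w2):
    """Add-one smoothed P(w2 | w1) as an exact fraction."""
    return Fraction(bigram_counts[(w1, w2)] + 1, unigram_counts[w1] + V)

print("P(love | i) =", smoothed("i", "love"))
total = sum(smoothed("love", w2) for w2 in vocab)
print("Sum over all w2 of P(w2 | love) =", total)
```

One caveat of counting this way: the last token of the stream ("fun") never starts a bigram, so its smoothed probabilities sum to 9/10 instead of 1.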
#WEEK 08
Convert an audio file to text and a text file to audio using Python.
Aim: To implement text-to-audio and audio-to-text conversion using the Python libraries pyttsx3 and
SpeechRecognition.
Procedure
1. Install required Python libraries: pyttsx3 and SpeechRecognition.
2. Import the required modules in Python.
3. Create a function to convert text to speech using pyttsx3.
4. Create another function to convert speech to text using SpeechRecognition.
5. Use a microphone to capture speech input.
6. Run the program and observe the audio output and recognized text.
#CODE
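# Install the dependencies first (run in a terminal, not inside Python):
#   pip install pyttsx3 SpeechRecognition pyaudio
import pyttsx3
import speech_recognition as sr

def text_to_audio(text):
    # Speak the text through the default audio output
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

def audio_to_text():
    # Record from the microphone and send the audio to Google's web recognizer
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Speak something...")
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return "Sorry, I could not understand the audio."
    except sr.RequestError:
        return "Network error."

text_to_audio("NLP is interesting.")
result = audio_to_text()
print("You said:", result)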
#OUTPUT
The program speaks the sentence:
“NLP is interesting.”
It then prints "Speak something...", listens on the microphone, and prints the recognized speech after "You said:".
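The task statement mentions files, whereas the program above works with the speakers and the microphone. A file-based variant can be sketched with the same two libraries, assuming pyttsx3's save_to_file and SpeechRecognition's AudioFile behave as documented. The function names and file paths are hypothetical, the audio format that save_to_file produces may depend on the platform's speech driver, and AudioFile is assumed to be given a WAV, AIFF, or FLAC file.

```python
from pathlib import Path

def text_file_to_audio(text_path, audio_path):
    """Read a text file and synthesize its contents into an audio file."""
    text = Path(text_path).read_text(encoding="utf-8")
    # Imported here so that a missing input file is reported before the
    # audio library is loaded.
    import pyttsx3
    engine = pyttsx3.init()
    engine.save_to_file(text, str(audio_path))
    engine.runAndWait()

def audio_file_to_text(audio_path):
    """Transcribe an audio file with Google's web recognizer."""
    if not Path(audio_path).is_file():
        raise FileNotFoundError(audio_path)
    import speech_recognition as sr
    recognizer = sr.Recognizer()
    with sr.AudioFile(str(audio_path)) as source:
        audio = recognizer.record(source)  # read the whole file
    return recognizer.recognize_google(audio)

# Example usage with hypothetical file names:
#   text_file_to_audio("notes.txt", "notes.wav")
#   print(audio_file_to_text("notes.wav"))
```

As in the microphone version, recognize_google needs a network connection and can raise sr.UnknownValueError or sr.RequestError, which a caller would want to handle.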
