NLP Questions and Answers Guide

The document provides an overview of Natural Language Processing (NLP) covering various topics including applications, components, phases, and key concepts such as morphology, typology, and parsing. It discusses techniques like stemming, lemmatization, and the use of libraries like NLTK, along with the importance of syntactic analysis and treebanks. Additionally, it addresses N-gram models, smoothing techniques, and the limitations of N-gram models in NLP.

Uploaded by

harini.konkala

NLP

One-Mark Questions with Answers

UNIT-I

1.) List a few applications of NLP.

• Question answering
• Spam detection
• Machine translation
• Speech correction
• Chatbots
• Speech recognition

2.) Components of NLP

• NLU (natural language understanding)

• NLG (natural language generation)

3.) Name five phases involved in NLP.

• Lexical and morphological analysis

• Syntactic analysis
• Semantic analysis
• Discourse integration
• Pragmatic analysis

4.) Differentiate lexeme and lemma.

Aspect | Lexeme | Lemma
Definition | The base unit of meaning; an abstract unit representing all inflected forms of a word. | The dictionary form or canonical form of a lexeme.
Represents | All the inflectional variants (e.g., walk, walks, walked, walking). | A single standard form, typically used as a headword in dictionaries.
Example | Lexeme: RUN → run, runs, ran, running | Lemma: run
Used in | Linguistic analysis, corpus studies, NLP | Dictionaries, NLP, morphological parsing
Nature | Abstract and general | Specific and representative

5.) Define Morphology

Morphology is the branch of linguistics that studies the structure and formation of words. It
analyzes how morphemes (the smallest units of meaning) combine to form words, including roots,
prefixes, and suffixes.

Example: In the word “unhappiness”, un- (prefix), happy (root), and -ness (suffix) are all morphemes.

6.) What is typology?

Typology in linguistics is the study and classification of languages based on their structural features,
such as word order, sentence structure, or morphological patterns. It helps identify similarities and
differences among languages, regardless of their historical or genetic relationships.

Example: English follows SVO (Subject-Verb-Object) word order, while Hindi follows SOV (Subject-
Object-Verb).

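7.) Write a note on fusional languages.

Fusional languages have a feature-per-morpheme ratio higher than one: a single morpheme expresses several grammatical features at once (as in Arabic, Czech, Latin, Sanskrit, German, etc.). For instance, in Latin "amo" ("I love"), the ending -o simultaneously marks first person, singular number, and present tense.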

Ex: The word form "head" used in different senses and roles:

She nodded her head

She is the head of the department

check the head of the page

We should head back home now

The toothpaste came out of the head of the tube

8.) Features of NLTK

• Tokenization
• Lowercasing
• Removing stopwords
• Punctuation removal
• Stemming
• Lemmatization
• POS tagging
• Named Entity Recognition (NER)

9.) Define stemming

Stemming is the process of reducing a word to its base or root form, called a "stem."

It helps group related words together so they can be analyzed as a single item, regardless of tense or
form.

Ex: Helping - help

studying - studi

flying - fli

helper - help
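
The examples above can be reproduced with a very small rule-based sketch, which also shows why a stem such as "studi" need not be a dictionary word. The function `toy_stem` and its suffix list are hypothetical and far simpler than a real stemmer such as Porter's; they are chosen only to match the examples listed here.

```python
def toy_stem(word: str) -> str:
    """Strip one common suffix, then map a final consonant + 'y' to 'i'.

    Illustrative only: real stemmers apply many ordered rules.
    """
    word = word.lower()
    for suffix in ("ing", "er", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # 'study' -> 'studi', 'fly' -> 'fli'
    if len(word) > 2 and word.endswith("y") and word[-2] not in "aeiou":
        word = word[:-1] + "i"
    return word


for w in ("helping", "studying", "flying", "helper"):
    print(w, "->", toy_stem(w))
```

Because the rules only look at spelling, the sketch is fast but has no notion of whether its output is a real word.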

10.) Define Lemmatizing

Lemmatizing is the process of reducing a word to its lemma, or base form. Unlike stemming, it
produces a valid English word that makes sense on its own.

Stemming:
→ "caring" → "car" (an aggressive suffix-stripping stemmer can cut too much; the result is not meaningful in context)

→ Fast but less accurate.

Lemmatizing:

→ "caring" → "care" (meaningful root word)

→ Slower but context-aware and grammatically correct.
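
To illustrate what "context-aware" means here, the following minimal sketch looks up a lemma from the word together with its part of speech. The lexicon `TOY_LEXICON` and the function `toy_lemmatize` are hypothetical stand-ins; a real lemmatizer consults a full dictionary such as WordNet.

```python
# Hypothetical miniature lexicon: (surface form, part of speech) -> lemma.
TOY_LEXICON = {
    ("caring", "VERB"): "care",
    ("ran", "VERB"): "run",
    ("better", "ADJ"): "good",
    ("meeting", "VERB"): "meet",
    ("meeting", "NOUN"): "meeting",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Return the lexicon lemma, or the lowercased word if it is unknown."""
    key = (word.lower(), pos)
    return TOY_LEXICON.get(key, word.lower())


print(toy_lemmatize("caring", "VERB"))
print(toy_lemmatize("meeting", "NOUN"))
print(toy_lemmatize("meeting", "VERB"))
```

The same surface form ("meeting") maps to different lemmas depending on its part of speech, which a spelling-only stemmer cannot do.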

11.) List the libraries that are imported when working with NLTK.

import contractions

import nltk

import re

from nltk.tokenize import word_tokenize

from nltk.corpus import stopwords

from nltk.stem import PorterStemmer, WordNetLemmatizer

from nltk import pos_tag

12.) Differentiate chunking and chinking

Chunking: The process of identifying and grouping the words of a sentence into phrases, such as noun phrases and verb phrases.

Chinking: The process of removing specific token patterns from inside a chunk (like verbs or adverbs that don't belong).
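
The difference can be sketched in plain Python over a hand-tagged sentence. The tags, the sentence, and the helper names `chunk` and `chink` are assumptions made for illustration; in practice a grammar-based chunker would be used on the output of a POS tagger.

```python
from itertools import groupby

# Hand-tagged tokens (the tags are assumed, not produced by a tagger).
tagged = [("the", "DT"), ("little", "JJ"), ("dog", "NN"),
          ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]


def chunk(tokens, keep_tags):
    """Chunking: group maximal runs of tokens whose tag is in keep_tags."""
    runs = groupby(tokens, key=lambda t: t[1] in keep_tags)
    return [[word for word, _ in run] for keep, run in runs if keep]


def chink(tokens, drop_tags):
    """Chinking: treat the sentence as one chunk, then cut out drop_tags."""
    runs = groupby(tokens, key=lambda t: t[1] in drop_tags)
    return [[word for word, _ in run] for drop, run in runs if not drop]


noun_phrases = chunk(tagged, {"DT", "JJ", "NN"})
after_chinking = chink(tagged, {"VBD", "IN"})
print(noun_phrases)
print(after_chinking)
```

Chunking says what to include, while chinking says what to exclude; on this sentence both routes leave the same two noun phrases.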

13.) Define NER (Named Entity Recognition)

NER is the process of identifying named entities in a given sentence and classifying them into predefined categories.

Ex: Person names (e.g., Mahatma Gandhi)

Organizations (e.g., MRCET)

Locations (e.g., Hyderabad, India)

Dates (e.g., 20 June 2025)

Monetary values mentioned in the text (e.g., ₹500, $1000)

Time, Percentages, Events, etc.
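
As a rough sketch of the idea, names can be matched against a list of known entities and dates or amounts against patterns. The gazetteer, the regular expressions, and the function `toy_ner` are all hypothetical; real NER systems are usually statistical models trained on annotated text rather than hand-written lists.

```python
import re

# Hypothetical gazetteer of known names and their entity types.
GAZETTEER = {
    "Mahatma Gandhi": "PERSON",
    "MRCET": "ORGANIZATION",
    "Hyderabad": "LOCATION",
    "India": "LOCATION",
}
# Simple patterns for dates like "20 June 2025" and amounts like "₹500".
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2} (?:January|February|March|April|May|June|July|"
                       r"August|September|October|November|December) \d{4}\b"),
    "MONEY": re.compile(r"[₹$]\d+(?:,\d{3})*(?:\.\d+)?"),
}


def toy_ner(text):
    """Return a list of (start offset, entity text, label), sorted by offset."""
    found = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            found.append((m.start(), m.group(), label))
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append((m.start(), m.group(), label))
    return sorted(found)


text = "MRCET in Hyderabad, India collected ₹500 on 20 June 2025."
for _, entity, label in toy_ner(text):
    print(entity, "->", label)
```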

UNIT-II

1.) Define Parsing/Syntax Analysis.

A. Parsing (syntax analysis) is the process of analyzing a sentence's grammatical structure according to the rules of a formal grammar. It identifies the syntactic structure of a sentence and determines how the words relate to each other.

2.) Applications of Syntactic analysis

• Grammar checking (e.g., Grammarly)

• Question answering systems

• Chatbots

• Machine translation

• Text summarization

3.) List out Approaches to Syntax Analysis.

• Top-Down Parsing – Starts from the start symbol and tries to derive the sentence.
• Bottom-Up Parsing – Builds the parse tree from the input up to the start symbol.
• Chart Parsing – Uses dynamic programming to store intermediate parsing results.
• Shift-Reduce Parsing – A bottom-up method using a stack to shift and reduce tokens.
• Recursive Descent Parsing – A top-down parser using recursive functions for grammar rules.
• Dependency Parsing – Focuses on word-to-word relations (head-dependent).
• Constituency Parsing – Breaks sentences into phrase structures (like NP, VP).
• Probabilistic Parsing – Uses probabilities to select the most likely parse tree.

4.) Define Treebanks

Treebanks are annotated text corpora that include syntactic or grammatical structure (usually in
the form of parse trees) for each sentence. They are used in Natural Language Processing (NLP) and
linguistics to train and evaluate parsers and grammar models.

Example: A sentence like "The cat sat on the mat." would be annotated to show how words group
into phrases (like noun phrases and verb phrases).

5.) What are the types of syntax trees?

There are two main types of syntax trees in linguistics:

1. Constituency Tree (Phrase Structure Tree):
Shows how words group into phrases (like noun phrases or verb phrases) based on grammar rules.
Example: [NP The cat] [VP sat [PP on [NP the mat]]]

2. Dependency Tree:
Shows word-to-word relationships, where one word (the "head") governs the others (its "dependents").
Example: In "The cat sat," "sat" is the main verb, and "cat" is its subject dependent.

These trees help analyze sentence structure and grammatical relationships.
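
A constituency tree can be held in memory as nested tuples, as in the sketch below. The representation, the added top-level "S" label, and the helper names `leaves` and `phrases` are assumptions made for illustration, not a standard treebank format.

```python
# Constituency tree for "The cat sat on the mat" as nested tuples:
# (label, child, child, ...); a leaf is a plain string.
# The top-level "S" node is assumed; the bracketed example omits it.
tree = ("S",
        ("NP", "The", "cat"),
        ("VP", "sat",
         ("PP", "on",
          ("NP", "the", "mat"))))


def leaves(node):
    """Return the words of a (sub)tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def phrases(node, label):
    """Collect the word span of every subtree carrying the given label."""
    if isinstance(node, str):
        return []
    spans = [" ".join(leaves(node))] if node[0] == label else []
    for child in node[1:]:
        spans.extend(phrases(child, label))
    return spans


print(leaves(tree))
print(phrases(tree, "NP"))
```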

6.) Uses of Treebanks.

• Training parsers (e.g., probabilistic context-free grammar parsers, neural parsers)

• Evaluating parsing algorithms

• Linguistic research
• Building tools for translation, sentiment analysis, etc.

7.) Write about the data-driven approach.

A data-driven approach in linguistics and NLP relies on large annotated datasets (corpora) to learn
patterns and make predictions. Instead of using fixed grammar rules, this approach uses statistical
models or machine learning algorithms trained on real language data.

Example: A machine translation system trained on parallel corpora learns how to translate based on
patterns in the data, not predefined rules.

8.) Define dependency graph

A. A dependency graph represents how the words in a sentence are connected according to their grammatical roles: each word is a node, and each directed edge links a head word to one of its dependents.

Ex: "Don't drink and drive."
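
One possible way to store such a graph is as a list of labelled arcs, sketched below for the example sentence. The particular analysis and the relation names (`aux`, `advmod`, `conj`, `cc`) follow Universal Dependencies conventions and are an assumption, as are the helper names; other annotation schemes would draw the arcs differently.

```python
# Each arc is (head, relation, dependent).
# Assumed analysis of "Do n't drink and drive" with "drink" as the root.
arcs = [
    ("ROOT", "root", "drink"),
    ("drink", "aux", "Do"),
    ("drink", "advmod", "n't"),
    ("drink", "conj", "drive"),
    ("drive", "cc", "and"),
]


def dependents(head):
    """Return the words that depend directly on the given head."""
    return [dep for h, _, dep in arcs if h == head]


def head_of(word):
    """Return the head of a word, or None if the word is not in the graph."""
    for h, _, dep in arcs:
        if dep == word:
            return h
    return None


print(dependents("drink"))
print(head_of("and"))
```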

9.) Where are dependency graphs used?

• NLP parsers (like spaCy, Stanford NLP)

• Grammar checking tools

• Machine translation

• Information extraction

10.) List out the tools used to build Phrase structure trees.

• NLTK (Natural Language Toolkit) — Python

• Stanford Parser / CoreNLP

• spaCy + Benepar (Berkeley Neural Parser)

• RSyntaxTree (Web GUI Tool)

• SyntaxNet

11.) Write about types of Parsing algorithms.

• Shift-Reduce Parsing

• Chart Parsing (e.g., the CYK Algorithm)

• Hypergraph-based Parsing
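
As an illustration of chart parsing, here is a minimal CYK recognizer. The toy grammar (in Chomsky normal form), the lexicon, and the function name `cyk_recognize` are assumptions made for this sketch; it only answers whether a sentence is derivable and does not build the tree.

```python
from itertools import product

# Toy grammar in Chomsky normal form (assumed for this sketch).
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
LEXICON = {"the": {"Det"}, "cat": {"N"}, "dog": {"N"}, "saw": {"V"}}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that can span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two halves
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return start in chart[0][n]


print(cyk_recognize("the cat saw the dog".split()))
print(cyk_recognize("cat the saw".split()))
```

The chart stores every intermediate result once, so a span that takes part in several analyses is never re-parsed.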

12.) Define Hypergraph.

A hypergraph is a type of graph in which an edge, called a hyperedge, can connect more than two
vertices. It is used to represent multi-way relationships between elements.

Vertices: A, B, C, D

[Figure: several of the vertices joined together by a single hyperedge E1]
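
A hypergraph can be sketched as a mapping from each hyperedge to the set of vertices it joins. The membership of E1 (A, B and C) and the extra edge E2 are assumptions made for illustration, as are the helper names.

```python
# A hypergraph as a mapping: hyperedge name -> set of vertices it connects.
vertices = {"A", "B", "C", "D"}
hyperedges = {
    "E1": {"A", "B", "C"},  # assumed: one hyperedge joining three vertices
    "E2": {"C", "D"},       # hypothetical ordinary edge (two vertices)
}


def incident_edges(v):
    """Names of the hyperedges that contain vertex v."""
    return sorted(name for name, members in hyperedges.items() if v in members)


def neighbours(v):
    """All vertices that share at least one hyperedge with v."""
    linked = set()
    for members in hyperedges.values():
        if v in members:
            linked |= members
    linked.discard(v)
    return linked


print(incident_edges("C"))
print(sorted(neighbours("A")))
```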

13.) Write about Probabilistic Context-Free Grammar.

A. Probabilistic Context-Free Grammar (PCFG) is an extension of CFG (Context-Free Grammar) where:

• Each production rule has an associated probability.

• These probabilities help choose the most likely parse tree when a sentence has multiple
possible parses. The probability of a parse tree is the product of the probabilities of the rules used to build it.
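
The sketch below scores two competing parses by multiplying the probabilities of the rules each one uses. The rule probabilities are made-up numbers, and the phrase "saw the man with the telescope" is a standard attachment-ambiguity example chosen for illustration; word-level (lexical) rules are left out to keep it short.

```python
from math import prod

# Hypothetical rule probabilities; rules sharing a left-hand side sum to 1.
RULE_PROB = {
    "VP -> V NP": 0.7,
    "VP -> VP PP": 0.3,
    "NP -> Det N": 0.6,
    "NP -> NP PP": 0.4,
    "PP -> P NP": 1.0,
}


def tree_probability(rules_used):
    """Probability of a parse = product of the probabilities of its rules."""
    return prod(RULE_PROB[rule] for rule in rules_used)


# Two parses of "saw the man with the telescope" (lexical rules omitted).
parses = {
    "VP attachment": ["VP -> VP PP", "VP -> V NP", "NP -> Det N",
                      "PP -> P NP", "NP -> Det N"],
    "NP attachment": ["VP -> V NP", "NP -> NP PP", "NP -> Det N",
                      "PP -> P NP", "NP -> Det N"],
}
best = max(parses, key=lambda name: tree_probability(parses[name]))
print(best)
```

With these invented numbers the NP attachment scores higher; a grammar estimated from a treebank could prefer the other reading.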

14.) List out Types of Generative models.

• PCFG (Probabilistic Context-Free Grammar)

• Lexicalized PCFG
• Generative Neural Parsers
• Data-Oriented Parsing (DOP)
• Bayesian Generative Models
• Stochastic Tree-Substitution Grammars (STSG)
• Generative Dependency Parsers
• Minimalist Grammars (generative, theoretical)

15.) What are the advantages of discriminative models for parsing?

• Can use rich and overlapping features (lexical, syntactic, semantic).

• Do not need the independence assumptions that generative models make about those features
• Often provide higher parsing accuracy

UNIT-III

1.) How many types of N-gram models are there? What are they?

Types of N-Gram Models

• Unigram
• Bigram

• Trigram

• Higher-order N-gram Models
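
The model types differ only in how many consecutive words form one unit, which a short sketch makes concrete. The function name `ngrams` and the sample sentence are chosen for illustration.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "I enjoy ripe mango".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
print(ngrams(tokens, 3))  # trigrams
```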

2.) What is the purpose of language model evaluation?

Language model evaluation measures:

• The accuracy of word predictions

• The fluency and naturalness of generated text

• How well the model captures language structure and meaning

3.) Define perplexity.

Perplexity is a measurement of how well a language model predicts a sequence of words.

It indicates how “confused” or “surprised” the model is when it sees the actual text; a lower perplexity means the model predicts the text better.
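
Under the usual definition, perplexity is the inverse probability of the text normalised by its length, i.e. the exponential of the average negative log-probability per word. The sketch below assumes the model's probability for each word is already available; the function name and the sample probabilities are illustrative.

```python
import math


def perplexity(word_probs):
    """exp of the average negative log-probability over the sequence."""
    n = len(word_probs)
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / n)


# A model that gives every word probability 1/4 is as uncertain as a
# uniform choice among four words.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
print(perplexity([0.9, 0.8, 0.95]))
```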

4.) Types of Smoothing techniques.

• Add-One (Laplace) Smoothing

• Add-k Smoothing
• Good-Turing Discounting
• Backoff and Interpolation

5.) Describe the role of smoothing in N-gram models. Why is it necessary?

Answer:
Smoothing helps when some N-grams in the test sentence do not appear in the training corpus,
resulting in zero probabilities.

Example: If the bigram "enjoy mango" never appeared in training, then for the sentence "I enjoy mango":

P("mango" | "enjoy") = 0 → Whole sentence probability = 0

Solution:

• Laplace Smoothing: Adds 1 to all counts to avoid zeros.

• Backoff Models: Fall back to smaller N-grams if higher ones are missing.

Smoothing ensures the model assigns non-zero probabilities to unseen sequences.
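
The effect of add-one smoothing on the "enjoy mango" case can be sketched with a tiny bigram model. The three-sentence corpus and the function names are invented for illustration, and sentence-boundary markers are left out for brevity.

```python
from collections import Counter

# Tiny invented training corpus; "enjoy mango" never occurs in it.
corpus = [["I", "enjoy", "tea"], ["I", "enjoy", "coffee"], ["I", "like", "mango"]]
unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter((a, b) for sent in corpus for a, b in zip(sent, sent[1:]))
V = len(unigrams)  # vocabulary size


def p_mle(word, prev):
    """Unsmoothed estimate: count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]


def p_laplace(word, prev):
    """Add-one estimate: (count(prev, word) + 1) / (count(prev) + V)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)


print(p_mle("mango", "enjoy"))
print(p_laplace("mango", "enjoy"))
```

The unseen bigram now receives a small non-zero probability, at the cost of taking some probability mass away from bigrams that were actually observed.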

6.) What are the limitations of N-gram models and how can they be addressed?

Limitations:

• Data sparsity: Many possible word sequences may not appear in training data.
• Limited context: N-gram models only look at a few previous words.
• High memory use: Storing large N-gram tables is resource-heavy.

Solutions:

• Smoothing: Adjusts probabilities of unseen N-grams (e.g., Laplace Smoothing).

• Backoff and Interpolation: Uses lower-order N-grams when higher-order ones are
unavailable.
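
Backoff and interpolation can be sketched side by side on a toy bigram model. The token stream, the weights `lam` and `alpha`, and the function names are assumptions; the backoff variant shown is the simplified "stupid backoff" score, which is not a normalised probability.

```python
from collections import Counter

# Invented training text, treated as one token stream for brevity.
tokens = "I enjoy tea I enjoy coffee I like mango".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
total = len(tokens)


def p_unigram(w):
    return unigrams[w] / total


def p_bigram(w, prev):
    return bigrams[(prev, w)] / unigrams[prev] if unigrams[prev] else 0.0


def p_interpolated(w, prev, lam=0.7):
    """Interpolation: always mix the bigram and unigram estimates."""
    return lam * p_bigram(w, prev) + (1 - lam) * p_unigram(w)


def score_backoff(w, prev, alpha=0.4):
    """Backoff: use the bigram if it was seen, else a discounted unigram."""
    if bigrams[(prev, w)] > 0:
        return p_bigram(w, prev)
    return alpha * p_unigram(w)


print(p_interpolated("mango", "enjoy"))
print(score_backoff("mango", "enjoy"))
print(score_backoff("tea", "enjoy"))
```

Interpolation always consults the lower-order model, whereas backoff consults it only when the higher-order count is zero.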

Common questions

PCFG enhances syntactic parsing by assigning probabilities to production rules, allowing the parser to select the most likely parse tree among multiple possibilities, which improves disambiguation of sentences with multiple interpretations. This probabilistic approach considers real language usage patterns, thus providing more accurate parsing outcomes than standard CFG, which assumes all rules are equally likely.

Data-driven approaches, leveraging large annotated corpora for pattern learning and predictions, do not rely on fixed grammar rules and can adapt to diverse linguistic phenomena. They provide flexibility and efficiency in language processing but require substantial data volumes and computational resources. Challenges include handling data sparsity for less frequent constructs and ensuring generalization across languages, which may not always have comparable data availability.

Morphology studies word structure and formation, crucial for understanding how morphemes assemble to convey meaning. It informs NLP applications like machine translation and text analysis by providing insight into word variants and their grammatical functions. For example, in morphological parsing, recognizing how 'unhappiness' comprises 'un-' (prefix), 'happy' (root), and '-ness' (suffix) aids in deriving syntactical and semantic components, enhancing the comprehension and processing of complex syntactic structures.

A dependency tree shows word-to-word relations where words are connected based on grammatical roles, with each word dependent on another (e.g., 'The cat sat' uses 'sat' as the head with 'cat' as a subject). In contrast, a constituency tree reflects phrase structure, showing hierarchical groupings of words into phrases such as noun or verb phrases (e.g., '[NP The cat] [VP sat]'). Both are essential as they provide different perspectives on sentence structure, aiding in comprehensive language analysis and applications such as translation and sentiment analysis.

Linguistic typology highlights the syntactic structure differences, such as English's SVO (Subject-Verb-Object) order versus Hindi's SOV (Subject-Object-Verb) order. These structural differences require machine translation systems to adeptly rearrange sentence elements to maintain meaning across languages. These insights are crucial for developing algorithms that can navigate and transform between divergent language structures, ensuring the preservation of semantic accuracy in machine translation.

Syntactic analysis helps by structuring questions and potential answers into parsed representations that facilitate identifying relationships between words and phrases, allowing systems to understand and generate grammatical structures necessary for extracting relevant information. This analysis is crucial for developing semantically appropriate answers and is key in accurately understanding user input, making it an essential feature for the effectiveness and reliability of question-answering systems.

Smoothing helps N-gram models handle cases where certain N-grams in the test sentence do not appear in the training data, which would otherwise lead to zero probabilities and ineffective predictions. Techniques such as Laplace Smoothing (adding 1 to all N-gram counts) and Backoff and Interpolation (using lower-order N-grams when higher-order are missing) adjust probabilities to ensure that sentences with unseen N-grams are assigned non-zero probabilities, thereby improving model robustness.

Parsing is integral because it helps grammar checking tools analyze sentence structure, ensuring syntactic rules are adhered to by identifying grammatical relations and errors. Various approaches such as chart parsing for efficiency in error detection and dependency parsing for relational accuracy are used, enabling such tools to offer precise suggestions for correction and enhancement of text fluidity and grammatical correctness.

Stemming reduces a word to its base form or 'stem,' which is often not a valid word (e.g., 'caring' -> 'car') and is faster but less accurate. This is suitable where speed is essential and precise word forms are unimportant, such as in basic search engines. Lemmatizing reduces a word to its lemma, producing a meaningful base form (e.g., 'caring' -> 'care'), and is context-aware, making it more accurate but slower. It is preferred in applications requiring grammatical correctness, such as NLP tasks like machine translation and part-of-speech tagging.

Treebanks, which contain syntactically annotated text corpora, help train parsers by providing reference structures for language models to learn sentence parsing. They assist in evaluating parsing algorithms and contribute to linguistics research by offering insights into syntactic patterns and variations, supporting the development of models for translation, sentiment analysis, and more, thus bridging theory with practical applications.
