NLP Question Bank for B.Tech CSE 2024-25

The document is a question bank for a course on Introduction to Natural Language Processing at Vignan Institute of Technology and Science for the academic year 2024-25. It includes descriptive and objective questions categorized by units, covering various topics such as morphemes, tokenization, syntactic parsing, semantic ambiguity, and language modeling. Each question is accompanied by marks allocation and learning outcomes.

Uploaded by

kattasatwika56

VIGNAN INSTITUTE OF TECHNOLOGY AND SCIENCE

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


AY 2024-25 III YEAR II SEM
INTRODUCTION TO NATURAL LANGUAGE PROCESSING
QUESTION BANK

Course: B.Tech    Year / Semester: III/II

Subject Name: INT TO NLP    Branch Name(s): CSE(AIML)

Subject Code(s): _________

DESCRIPTIVE QUESTION BANK

UNIT- 1

Q.NO DESCRIPTION OF QUESTION MARKS CO PO BTL
1 a Define a morpheme. 2 1 1 1
1 b State the purpose of tokenization in document analysis. 3 1 1 1
1 c Explain in detail some early NLP systems. 5 1 1 2
1 d List and explain in detail the issues and challenges of NLP systems. 5 1 1 2
2 a Define vowel harmony. Provide an example from Turkish. 2 1 2 1
2 b State morphological ambiguity with an example. 3 1 2 1
2 c Discuss in detail the complexity approaches of NLP systems. 5 1 2 1
2 d What are the measures used to find the performance of NLP methods? 5 1 2 2
3 a Name two structural elements commonly found in documents. 2 1 1 1
3 b Define polysynthetic morphology. Provide an example. 3 1 2 1
3 c "Ambiguous document boundaries affect text segmentation." Justify the statement. 5 1 2 1
3 d Explain generative sequence classification methods for sentence segmentation. 5 1 2 5
4 a Mention two agglutinative languages and explain their characteristic feature. 2 1 1 1
4 b Discuss the impact of morphological ambiguity on POS tagging. 3 1 1 2
4 c Explain the trade-offs between precision and recall in document structure analysis. 10 1 2 2
5 a Name two applications of topic boundary detection. 2 1 1 1
5 b How does functional morphology differ from rule-based morphology? 3 1 1 1
5 c Discuss the common features used for both text and speech in document structure analysis. 10 1 2 5
6 a Name two challenges of analyzing multi-column layouts. 2 1 1 1
6 b Define finite-state transducers (FSTs). 3 1 1 2
6 c Elaborate on the major types of morphological models and analyze their advantages and limitations. 10 1 2 2

UNIT-2
Q.NO DESCRIPTION OF QUESTION MARKS CO PO BTL
1 a Name two tasks in NLP that rely on syntactic parsing. 2 2 1 1
1 b Explain the ambiguity in syntactic parsing. 3 2 1 1
1 c What are the limitations in syntax parsing? 5 2 1 2
1 d Compare constituency parsing and dependency parsing. Highlight their differences and use cases. 5 2 2 2
2 a Name two widely used treebanks in NLP. 2 2 2 3
2 b What are non-projective dependencies? Provide an example. 3 2 2 3
2 c Give all possible parse trees for the sentence "Stolen painting found by tree." 5 2 2 1
2 d Explain any one parsing algorithm with the help of an example. 5 2 1 3
3 a Define recursion in phrase structure trees. 2 2 1 1
3 b How do dependency graphs handle coordination and subordination in sentences? Illustrate with examples. 3 2 2 2
3 c Describe the significance of treebanks in data-driven syntactic analysis. 5 2 2 2
3 d Justify the need for data-driven approaches over rule-based systems in modern syntactic parsing. 5 2 2 2
4 a Discuss the process of creating a treebank. 2 2 1 4
4 b Define treebank. 3 2 1 2
4 c Explain the following terms: (a) parsing natural language; (b) models for ambiguity resolution in parsing. 10 2 2 1
5 a Define dependency graph. 2 2 1
5 b How does context-free grammar (CFG) play a role in syntactic parsing? 3 2 1 4
5 c Discuss the role of a stack in shift-reduce parsing. Explain its function with an example. 10 2 2 2
6 a Describe the components of a phrase structure tree. 2 2 1 1
6 b Define unsupervised parsing. 3 2 1 4
6 c Compare manually annotated treebanks with automatically generated treebanks. 10 2 2 2

UNIT-3
Q.NO DESCRIPTION OF QUESTION MARKS CO PO BTL
1 a Define semantic ambiguity. 2 3 1 1
1 b Name two strategies for resolving structural ambiguity. 3 3 1 1
1 c Explain the concept of ambiguity in syntactic parsing. 5 3 1 2
1 d How does semantic ambiguity differ from syntactic ambiguity, and how can it be resolved in NLP systems? 5 3 1 4
2 a How are probabilities assigned in a PCFG? 2 3 2 3
2 b How do generative models assign probabilities to parse trees? 3 3 2 1
2 c How do generative models handle ambiguity in parsing? 5 3 2 2
2 d Compare generative and discriminative models for ambiguity resolution in parsing. 5 3 2 3
3 a How do discriminative models address ambiguity? 2 3 1 1
3 b Define universal dependencies. 3 3 2 1
3 c Discuss the advantages of using PCFGs over traditional context-free grammars (CFGs) in parsing tasks. 5 3 2 2
3 d How do generative models handle uncertainty in parsing? Provide an example. 5 3 1
4 a What is tokenization? 2 3 1 1
4 b Why is encoding important in NLP? 3 3 1 3
4 c How do discriminative models address the issue of ambiguity in parsing? Provide an example. 10 3 2 1
5 a How is word segmentation performed in Thai? 2 3 1 1
5 b Name one challenge of tokenizing Chinese text. 3 3 1 1
5 c Discuss the role of universal dependencies in multilingual parsing. How do they address language-specific challenges? 10 3 2 2
6 a Define semantic interpretation. 2 3 1 2
6 b What is Abstract Meaning Representation (AMR)? 3 3 1 2
6 c Compare rule-based and machine learning-based approaches to tokenization. 10 3 2 2

UNIT-4
Q.NO DESCRIPTION OF QUESTION MARKS CO PO BTL
1 a Mention the common challenges in word sense disambiguation. 2 4 1
1 b State the difference between a predicate and an argument. 3 4 1
1 c Elaborate on the importance of predicate-argument structure in understanding natural language semantics. 5 4 1
1 d Discuss how different meaning representation systems (e.g., AMR, FOL, FrameNet) model the same sentence differently. 5 4
2 a What is a predicate-argument structure? 2 4 2
2 b Define FrameNet. 3 4 2
2 c Explain the concept of predicate-argument structure in semantic parsing. 5 4 2
2 d Discuss the importance of FrameNet as a resource for analyzing predicate-argument structures. 5 4
3 a How does FrameNet support semantic parsing? 2 4 1
3 b What is the difference between shallow and deep semantic parsing? 3 4 2
3 c Explain the role of logical forms in meaning representation. Provide an example of a logical form for a simple sentence. 5 4 2
3 d Explain the role of dependency parsing in extracting predicate-argument structures. 5 4
4 a What is Abstract Meaning Representation (AMR)? 2 4 1
4 b Define Discourse Representation Theory (DRT). 3 4 1
4 c Discuss the role of predicate-argument structures in question-answering systems. 10 4 2
5 a How do meaning representations contribute to text summarization? 2 4 1
5 b What is SEMAFOR? 3 4 1
5 c How do meaning representation systems handle ambiguity in natural language? Provide examples. 10 4 2
6 a Define OpenIE. 2 4 1
6 b How do meaning representations aid in machine translation? 3 4 1
6 c Discuss the role of meaning representations in cross-lingual NLP tasks. Provide examples from multilingual datasets. 10 4 2

UNIT-5

Q.NO DESCRIPTION OF QUESTION MARKS CO PO BTL
1 a What is the role of context in language modeling? 2 5 1 1
1 b Define unigram model. 3 5 1 2
1 c Describe the problems involved in language-specific modeling. 5 5 1 2
1 d Outline methods for language model adaptation in dynamic conversational AI systems. 5 5 2 2
2 a How does cross-validation improve model evaluation? 2 5 2 1
2 b Define posterior probability. 3 5 2 4
2 c Compare and contrast multilingual and cross-lingual language modeling. 5 5 2 2
2 d How do Bayesian topic-based language models improve the representation of thematic content in text? 5 5
3 a How does Bayesian estimation address overfitting? 2 5 1 1
3 b Why is adaptation important for domain-specific applications? 3 5 2 4
3 c Compare maximum likelihood estimation (MLE) with Bayesian parameter estimation. Highlight their strengths and weaknesses. 5 5 2 3
3 d Explain the process of adapting a general-purpose language model to a specific domain. Provide an example. 5 5
4 a How does transfer learning support language model adaptation? 2 5 1 5
4 b Compare maximum likelihood estimation (MLE) with Bayesian estimation. 3 5 1 1
4 c How does Bayesian parameter estimation address the problem of overfitting in language models? 10 5 2 2
5 a How does Bayesian estimation improve robustness? 2 5 1 5
5 b How do class-based models reduce complexity? 3 5 1 1
5 c What is the Markov assumption in n-gram models? How does it simplify the computation of probabilities? 10 5 2 2
6 a What is crosslingual language modeling? 2 5 1 5
6 b How does transfer learning support language model adaptation? 3 5 1 1
6 c Define language modeling and explain its importance in natural language processing (NLP) tasks like machine translation and speech recognition. 10 5 2 2
OBJECTIVE QUESTION BANK
UNIT-I
MULTIPLE CHOICE QUESTIONS:
1) What is a token in NLP?
a) A word form as it appears in text
b) The abstract representation of a word
c) The smallest meaningful unit of language
d) A rule-based morphological model
2) Which of the following is an example of a bound morpheme?
a) "run"
b) "-ing"
c) "dog"
d) "happy"
3) Which language type combines multiple meanings into a single morpheme?
a) Agglutinative
b) Isolating
c) Fusional
d) Polysynthetic
4) What is overstemming?
a) Removing too little of a word during stemming
b) Removing too much of a word during stemming
c) Adding unnecessary affixes during lemmatization
d) Failing to identify morphemes in a word
5) Irregular verbs like "go-went-gone" pose challenges for which type of morphological model?
a) Dictionary lookup
b) Finite-state morphology
c) Unification-based morphology
d) All of the above
6) Which model uses feature structures to represent linguistic information?
a) Dictionary lookup
b) Finite-state morphology
c) Unification-based morphology
d) Functional morphology
7) Functional morphology focuses on the __________ relationships between morphemes.
a) syntactic and semantic
b) phonological and orthographic
c) visual and structural
d) lexical and derivational
8) Which of the following is a challenge in sentence boundary detection?
a) Handling abbreviations like "Dr."
b) Identifying multi-column layouts
c) Extracting headings and subheadings
d) Analyzing font sizes and styles
9) Which of the following is NOT a method for document structure analysis?
a) Rule-based methods
b) Machine learning-based methods
c) OCR-based methods
d) Template-based methods
10) Which metric measures the correctness of predictions in document structure analysis?
a) Recall
b) Precision
c) Accuracy
d) F1-score
Fill in the blanks:
11. A __________ is the smallest meaningful unit of language.
12. The word "unhappiness" consists of a root ("happy"), a prefix ("un-"), and a suffix ("-ness"). The root is also called the __________.
13. Languages like Chinese, which have minimal inflection, are classified as __________ languages.
14. In Turkish, words like "evlerimizde" are formed by stringing together multiple morphemes. This is an example of an __________ language.
15. The process of forming new words by adding affixes is known as __________.
16. Irregular verbs like "go-went-gone" are examples of __________ in morphology.
17. The phenomenon where a single word has multiple interpretations is called __________.
18. Topic boundary detection helps in identifying __________ shifts in a document.
19. Layout analysis in document processing involves identifying __________ elements like headings and paragraphs.
20. Hybrid approaches combine the strengths of __________ and machine learning methods.

UNIT-1 : Objective Key


Q.No. Answer Q.No. Answer
01 a 11 morpheme
02 b 12 lexeme
03 c 13 isolating
04 b 14 agglutinative
05 d 15 derivation
06 c 16 irregularities
07 a 17 ambiguity
08 a 18 thematic
09 c 19 structural
10 d 20 rule-based
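
Worked illustration for the precision and recall questions in this unit. This is a minimal sketch, not part of the original question bank: the function name precision_recall_f1 and the boundary offsets are hypothetical, and the code assumes predictions and gold annotations are compared as sets of exact matches.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of predicted items (e.g. boundary offsets) against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predictions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical character offsets of sentence boundaries in one document.
gold_boundaries = {12, 30, 47, 63}
predicted_boundaries = {12, 30, 50}

p, r, f1 = precision_recall_f1(predicted_boundaries, gold_boundaries)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Predicting fewer boundaries tends to raise precision and lower recall, which is the trade-off descriptive question 4 c asks about.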
UNIT-II

Objective Questions:
1. What is the primary goal of syntactic parsing?
a) To identify the grammatical structure of a sentence
b) To translate text into another language
c) To extract keywords from a document
d) To classify text into categories
2. Which of the following is NOT a type of ambiguity in syntactic parsing?
a) Lexical ambiguity
b) Structural ambiguity
c) Semantic ambiguity
d) Phonological ambiguity
3. What is the role of context-free grammar (CFG) in parsing?
a) To model the hierarchical structure of sentences
b) To classify words into parts of speech
c) To generate random sentences
d) To translate text into another language
4. Which of the following is a challenge in parsing natural language?
a) Long-range dependencies
b) Short sentences
c) Lack of punctuation
d) Uniform word order
5. Which parsing approach focuses on identifying relationships between words?
a) Constituency parsing
b) Dependency parsing
c) Rule-based parsing
d) Probabilistic parsing
6. Which of the following is a widely used treebank?
a) WordNet
b) Penn Treebank
c) Wikipedia Corpus
d) Google Books Corpus
7. What is annotation bias in treebanks?
a) Errors introduced by manual annotators
b) Systematic errors due to subjective decisions
c) Missing annotations in the dataset
d) Automatic generation of annotations
8. What is a projective dependency?
a) A dependency that crosses over other dependencies
b) A dependency that does not cross over others
c) A dependency that involves multiple heads
d) A dependency that is optional
9. Which of the following is NOT a constituent in a phrase structure tree?
a) Noun phrase (NP)
b) Verb phrase (VP)
c) Adjective phrase (AP)
d) Word embedding (WE)
10. What is recursion in phrase structure trees?
a) Repeating the same word
b) Embedding one phrase within another
c) Translating text into multiple languages
d) Classifying text into categories

Fill in the blanks
11. The process of analyzing the grammatical structure of a sentence is called __________.
12. A sentence like "I saw the man with the telescope" demonstrates __________ ambiguity.
13. Context-free grammar (CFG) uses __________ to define the rules for sentence structure.
14. Parsing noisy text, such as social media posts, often requires handling __________ and informal language.
15. The two main types of syntactic parsing are constituency parsing and __________ parsing.
16. A __________ is a collection of syntactically annotated sentences used to train parsers.
17. The Penn Treebank is an example of a __________ treebank.
18. Annotation bias in treebanks can lead to __________ in parser performance.
19. Treebanks are essential for training __________ models in NLP.
20. Automatically generated treebanks often lack the __________ of manually annotated ones.

UNIT-2 – Objective Key


Q.No. Answer Q.No. Answer
01 a 11 syntactic parsing
02 d 12 structural
03 a 13 production rules
04 a 14 abbreviations
05 b 15 dependency
06 b 16 treebank
07 b 17 manually annotated
08 b 18 systematic errors
09 d 19 statistical
10 b 20 accuracy
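
Worked illustration for the projectivity questions in this unit. A dependency tree is projective when no two arcs cross once the words are laid out in sentence order. The sketch below is not part of the original question bank: the function name is_projective, the head-list encoding (1-based head index per token, 0 for the artificial root), and the hand-written head assignments for both sentences are assumptions made for illustration.

```python
from itertools import combinations


def is_projective(heads):
    """Return True if no two dependency arcs cross.

    heads[i] is the 1-based position of the head of token i+1;
    0 stands for an artificial root placed before the first token.
    """
    # Store each arc as a (left, right) span over token positions.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for (l1, r1), (l2, r2) in combinations(arcs, 2):
        # Two arcs cross when exactly one endpoint of one lies strictly inside the other.
        if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
            return False
    return True


# "I saw her": I -> saw, saw -> root, her -> saw
print(is_projective([2, 0, 2]))

# "A hearing is scheduled on the issue today":
# "on" attaches to "hearing", so its arc crosses the root arc to "scheduled".
print(is_projective([2, 4, 4, 0, 2, 7, 5, 4]))
```

The second sentence is a commonly cited kind of non-projective structure, which is the sort of example descriptive question 2 b asks for.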

UNIT-III
1. What is syntactic ambiguity?
a) Multiple meanings of a word
b) Multiple possible parses of a sentence
c) Errors in tokenization
d) Lack of punctuation
2. What is the role of probabilistic models in ambiguity resolution?
a) To generate random parses
b) To assign probabilities to different parses
c) To classify text into categories
d) To translate text into another language
3. Which model assigns probabilities to parse trees based on training data?
a) Generative model
b) Discriminative model
c) Rule-based model
d) Hybrid model
4. Structural ambiguity often arises due to:
a) Differences in word order
b) Ambiguous punctuation
c) Multiple interpretations of a sentence's structure
d) Lack of capitalization
5. PCFGs are commonly used in which NLP task?
a) Named entity recognition
b) Dependency parsing
c) Text classification
d) Sentiment analysis
6. Which model is better suited for noisy text?
a) Generative model
b) Discriminative model
c) Rule-based model
d) Hybrid model
7. Which model assigns probabilities to correct parses directly?
a) Generative model
b) Discriminative model
c) Rule-based model
d) Hybrid model
8. Generative models are particularly useful for tasks involving __________.
a) Long-range dependencies
b) Joint probability modeling
c) Conditional probability modeling
d) Fast parsing
9. Free word-order languages pose challenges for __________ parsing.
a) Constituency
b) Dependency
c) Rule-based
d) Probabilistic
10. What is tokenization?
a) Assigning probabilities to words
b) Splitting text into meaningful units
c) Translating text into another language
d) Classifying text into categories

11. Probabilistic models resolve ambiguity by assigning __________ to different parses.
12. Rule-based disambiguation relies on predefined __________ to resolve ambiguity.
13. Ambiguity resolution is critical for tasks like __________ and machine translation.
14. A PCFG assigns probabilities to __________ based on training data.
15. Generative models model the __________ probability of sentences and parse trees.
16. Features in discriminative models help capture __________ relationships in the data.
17. Generative models are often used for tasks like __________ parsing.
18. Multilingual parsing must account for differences in __________ and morphology.
19. Agglutinative languages like Turkish have complex __________ structures.
20. Universal dependencies provide a __________ framework for representing syntactic structures.
UNIT-3 – Objective Key.
Q.No. Answer Q.No. Answer
01 b 11 probabilities
02 b 12 grammatical rules
03 a 13 semantic parsing
04 c 14 parse trees
05 b 15 joint
06 b 16 contextual
07 b 17 probabilistic
08 b 18 syntax
09 b 19 morphological
10 b 20 language-independent
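
Worked illustration for the PCFG questions in this unit. In a PCFG, the probability of a parse tree is the product of the probabilities of the rules used to build it, and the parser prefers the highest-scoring tree. The sketch below is not part of the original question bank: the grammar, its probabilities, and the names RULE_PROBS and tree_prob are hypothetical, and lexical rules (tag to word) are left unscored to keep the arithmetic short.

```python
# Hypothetical rule probabilities; the rules for each left-hand side sum to 1.
RULE_PROBS = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("VP", "PP")): 0.4,
    ("NP", ("Det", "N")): 0.5,
    ("NP", ("Pron",)): 0.3,
    ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,
}


def tree_prob(tree):
    """Probability of a parse tree given as nested tuples (label, child, ...)."""
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return 1.0  # assumption: lexical rules (tag -> word) are not scored here
    rhs = tuple(child[0] for child in children)
    prob = RULE_PROBS[(label, rhs)]
    for child in children:
        prob *= tree_prob(child)  # multiply in the rules used lower in the tree
    return prob


# Two parses of "I saw the man with the telescope".
the_man = ("NP", ("Det", "the"), ("N", "man"))
with_telescope = ("PP", ("P", "with"), ("NP", ("Det", "the"), ("N", "telescope")))
subject = ("NP", ("Pron", "I"))

vp_attachment = ("S", subject, ("VP", ("VP", ("V", "saw"), the_man), with_telescope))
np_attachment = ("S", subject, ("VP", ("V", "saw"), ("NP", the_man, with_telescope)))

parses = {"VP attachment": vp_attachment, "NP attachment": np_attachment}
best = max(parses, key=lambda name: tree_prob(parses[name]))
print(best, tree_prob(parses[best]))
```

With these made-up numbers the reading in which the telescope modifies the seeing scores higher. A different treebank would give different rule probabilities and possibly the other reading.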

UNIT-IV

Objective Questions:
1. Which resource provides detailed verb classifications and argument structures?
a) WordNet
b) VerbNet
c) AMR
d) DRT
2. Which of the following is NOT a resource for predicate-argument structure analysis?
a) PropBank
b) FrameNet
c) WordNet
d) VerbNet
3. How do transformer-based models like BERT enhance predicate-argument structure identification?
a) By generating random parses
b) By capturing contextual relationships
c) By translating text into another language
d) By classifying text into categories
4. What is the role of dependency parsing in predicate-argument structure extraction?
a) To classify text into categories
b) To identify syntactic relationships between words
c) To translate text into another language
d) To generate random parses
5. Which system is commonly used for semantic role labeling (SRL)?
a) SpaCy
b) SEMAFOR
c) NLTK
d) Hugging Face
6. What is the main limitation of FrameNet for predicate-argument structure analysis?
a) Lack of multilingual support
b) Limited coverage of verbs
c) Inability to handle noisy text
d) Lack of dependency parsing
7. Which representation framework uses graphs to model predicate-argument structures?
a) AMR
b) DRT
c) FrameNet
d) PropBank
8. What is the primary challenge of identifying predicate-argument structures in low-resource languages?
a) Lack of annotated data
b) Uniform word order
c) Simple morphology
d) Short sentences
9. Which of the following is NOT a meaning representation framework?
a) AMR
b) DRT
c) WordNet
d) FrameNet
10. Which system generates meaning representations using open information extraction?
a) SEMAFOR
b) OpenIE
c) AllenNLP
d) Hugging Face

11. FrameNet is a resource that represents relationships between __________ and their arguments.
12. PropBank annotates sentences with __________ roles to capture predicate-argument structures.
13. Semantic role labeling (SRL) identifies the roles of __________ in a sentence.
14. Transformer-based models like BERT improve predicate-argument structure identification by capturing __________ relationships.
15. Predicate-argument structures contribute to tasks like __________ and question answering.
16. Abstract Meaning Representation (AMR) uses __________ to represent sentence meaning.
17. Ontologies like WordNet provide __________ information for meaning representation.
18. Discourse Representation Theory (DRT) is a framework for representing __________ meaning.
19. Neural networks capture __________ relationships to improve meaning representation.
20. Logical forms are used to represent sentence meaning in a __________ format.

UNIT-4 – Objective Key.


Q.No. Answer Q.No. Answer
01 b 11 predicates
02 c 12 semantic
03 b 13 arguments
04 b 14 contextual
05 b 15 information extraction
06 b 16 graphs
07 a 17 semantic
08 a 18 sentence
09 c 19 contextual
10 b 20 formal

UNIT-V
Objective Questions:
1. What is the primary goal of a language model?
a) To classify text into categories
b) To estimate the probability of a sequence of words
c) To translate text into another language
d) To generate random sentences
2. Which of the following is NOT a key component of language modeling?
a) Context handling
b) Ambiguity resolution
c) Word alignment
d) Probability estimation
3. Which of the following is NOT a type of n-gram model?
a) Unigram
b) Bigram
c) Trigram
d) Quadragram
4. What is posterior probability?
a) The probability of generating random sentences
b) The updated probability after observing data
c) The probability of translating text
d) The probability of classifying text
5. How does Bayesian estimation address overfitting?
a) By ignoring prior knowledge
b) By incorporating prior knowledge
c) By generating random sentences
d) By translating text into another language
6. What is the role of evidence in Bayesian estimation?
a) To classify text into categories
b) To normalize the posterior probability
c) To translate text into another language
d) To generate random sentences
7. What is multilingual language modeling?
a) Building models for multiple languages
b) Translating text into another language
c) Classifying text into categories
d) Generating random sentences
8. What is crosslingual language modeling?
a) Building models for a single language
b) Aligning meaning representations across languages
c) Translating text into another language
d) Generating random sentences
9. What is the role of shared embeddings in multilingual models?
a) To classify text into categories
b) To align representations across languages
c) To translate text into another language
d) To generate random sentences
10. What are parallel corpora?
a) Texts in a single language
b) Aligned texts in multiple languages
c) Randomly generated sentences
d) Classified texts

Fill-in-the-Blanks
11. A language model estimates the __________ of a sequence of words.
Answer :
12. The primary goal of language modeling is to predict the next word based on the __________.
Answer :
13. Language models are widely used in tasks like __________ and speech recognition.
Answer :
14. The Markov assumption simplifies the computation by assuming that the probability of a word depends only on the __________ words.
Answer :
15. BLEU score measures the __________ between generated and reference texts.
Answer :
16. One limitation of perplexity is that it does not account for __________ meaning.
Answer :
17. Bayesian parameter estimation uses __________ distributions to incorporate prior knowledge.
Answer :
18. Prior distributions represent __________ about the parameters before observing data.
Answer :
19. Crosslingual language modeling involves aligning __________ representations across languages.
Answer :
20. mBERT is a __________ version of BERT.
Answer :

UNIT-5 – Objective Key.


Q.No. Answer Q.No. Answer
01 b 11 probability
02 c 12 context
03 d 13 machine translation
04 b 14 previous n-1
05 b 15 overlap
06 b 16 semantic
07 a 17 prior
08 b 18 beliefs
09 b 19 meaning
10 b 20 multilingual
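
Worked illustration for the Markov assumption and maximum likelihood questions in this unit. Under a bigram model, each word is conditioned only on the one word before it, and the maximum likelihood estimate of P(word | previous) is count(previous, word) / count(previous). The sketch below is not part of the original question bank: the three-sentence corpus, the "<s>" and "</s>" boundary markers, and the function names train_bigram and sentence_prob are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(tokens[:-1])  # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts


def sentence_prob(sentence, context_counts, bigram_counts):
    """Unsmoothed MLE probability of a sentence under the bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # unseen context: MLE has no estimate, treat as zero
        # Markov assumption: condition only on the single previous word.
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob


corpus = ["the cat sat", "the dog sat", "the cat ran"]
contexts, bigrams = train_bigram(corpus)

print(sentence_prob("the cat sat", contexts, bigrams))
print(sentence_prob("the dog ran", contexts, bigrams))
```

The second sentence gets probability zero because the bigram "dog ran" never occurs in the toy corpus. That zero is the overfitting symptom of unsmoothed maximum likelihood that the Bayesian estimation questions in this unit refer to.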
