Comprehensive NLP Concepts and Techniques

The document outlines a comprehensive list of questions related to Natural Language Processing (NLP) across five units, covering topics such as applications, components, parsing, language modeling, semantic interpretation, and discourse processing. Each unit contains specific questions aimed at exploring fundamental concepts and methodologies in NLP. The questions encourage detailed explanations and examples, facilitating a deeper understanding of the subject matter.
Copyright
© All Rights Reserved

NLP List of Questions

UNIT-I
1. What are the different applications of NLP?
2. What are the different components of NLP? Write about the different steps in NLP.
3. Explain all the NLP steps in detail.
4. Write short notes on tokens, lexemes, and morphemes.
5. What is morphological typology? Explain.
6. Explain the concepts of tokenization, filtering stopwords, stemming, and part-of-speech tagging.
7. Explain the concepts of Lemmatization, Named Entity Recognition, Term Frequency, and Inverse Document Frequency.
8. Explain the concept of Chunking and Chinking.
UNIT-II
9. Explain what parsing natural language means.
10. What is a Treebank? Explain in detail.
11. Explain Context Free Grammar.
12. Explain the concept of Syntax analysis using Dependency Graph.
13. Explain the concept of Syntax analysis using Phrase Structure Trees.
14. Write about Shift Reduce Parsing in NLP.
15. Write a note on Chart Parsing (CYK Parsing).
16. What is a Hypergraph?
17. Write about models for ambiguity resolution in parsing.

UNIT-III

18. What are the different use cases of Language Modelling?
19. Explain the concept of N-Grams in detail.
20. Explain the concept of Maximum Likelihood Estimate.
21. Write a note on Evaluating Language Models.
22. Explain how perplexity is used to evaluate a language model.
23. Explain the concept of Laplace Smoothing with respect to Language Modelling.
24. Explain what sampling sentences from a language model means.
25. Write about Back off and Interpolation.
26. Explain what parameter estimation is, why it is important, and how it works.
27. Describe Language Model Adaptation.
28. What are the different types of Language models? Explain in detail.

UNIT-IV

29. Write a note on Semantic Interpretation.
30. Explain the different System Paradigms in Semantic Interpretation.
31. Write a rule-based algorithm for Semantic Parsing and explain how it works with an example.
32. Write the supervised learning algorithm for Semantic Parsing and explain.
33. Write the unsupervised learning algorithm for Semantic Parsing and explain.
34. Write the semi-supervised learning algorithm for Semantic Parsing and explain.

UNIT-V

35. Explain how Semantic Role Labelling can be done using FrameNet and PropBank.
36. Distinguish between FrameNet and PropBank.
37. Write the algorithm for Semantic Role Labelling and explain.
38. Explain what Deep Semantic Parsing (Meaning Representation) is, with examples.
39. Explain the rule-based and supervised learning algorithms for Deep Semantic Parsing (Meaning Representation).
40. Explain what Discourse Processing is.
UNIT–III: Language Modeling

1. What is an N-gram in language modeling?

2. List the types of evaluation in language modeling.

3. What is the purpose of smoothing in language models?

4. What does perplexity measure in language model evaluation?

5. List the language-specific modelling problems in NLP.

6. Define language model adaptation.

UNIT–IV: Semantic Parsing

1. What is semantic parsing?

2. Define semantic interpretation.

3. What does WSD stand for in word sense systems?

4. Name one approach used in semantic parsing.

5. What is the role of software in semantic parsing?

6. Give one application of semantic parsing.

UNIT–V: Predicate-Argument Structure & Discourse Processing

1. What is a predicate in predicate-argument structure?

2. What is an argument in language semantics?

3. What is meaning representation?

4. Name one meaning representation system.

5. What is the purpose of software in meaning representation?

6. Define discourse cohesion.

7. What is reference resolution?

Common questions

Semantic interpretation converts natural language into a structured representation of meaning, which applications such as question answering and knowledge extraction depend on. System paradigms differ in how much hand-built linguistic knowledge they encode and how much they learn from data. Rule-based systems rely on predefined grammars and lexicons. Statistical systems apply machine learning to infer meaning from data, which makes them easier to scale to new text. Hybrid systems combine the two, aiming for the accuracy of rules and the adaptability of learned models.
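
As a rough illustration of the rule-based paradigm, the sketch below maps a question to a logical-form string by trying hand-written patterns in order. The rules, the predicate names (capital_of, author_of), and the interpret function are all hypothetical and chosen only for this example; a real system would use a grammar and lexicon rather than two regular expressions.

```python
import re

# Hypothetical pattern-to-template rules, tried in order.
RULES = [
    (re.compile(r"^what is the capital of (?P<x>[\w ]+)\?$"), "capital_of({x})"),
    (re.compile(r"^who wrote (?P<x>[\w ]+)\?$"), "author_of({x})"),
]


def interpret(question):
    """Return a logical-form string for the question, or None if no rule fires."""
    text = question.strip().lower()
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # Normalize the captured phrase into a single constant name.
            argument = match.group("x").strip().replace(" ", "_")
            return template.format(x=argument)
    return None


print(interpret("What is the capital of France?"))
print(interpret("Who wrote War and Peace?"))
```

Questions that match no rule return None, which shows the usual trade-off of this paradigm: predictable output on covered inputs and no output at all elsewhere.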

Context Free Grammar (CFG) defines sentence structure through production rules, each of which rewrites a single nonterminal as a sequence of terminals and nonterminals. Because rules can nest, a CFG captures the hierarchical nature of language and gives parsing algorithms a framework for deciding whether a sentence is grammatical under the grammar. The resulting parse trees help natural language systems analyze sentence structure in applications like machine translation and speech recognition.
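
To make the link between production rules and parsing concrete, here is a minimal sketch of a CYK recognizer over a toy grammar in Chomsky normal form. The grammar, the lexicon, and the function name cyk_recognize are assumptions made for illustration; the recognizer only answers yes or no and does not build a parse tree.

```python
from itertools import product

# Toy grammar in Chomsky normal form (an assumption for illustration).
# Keys are right-hand sides, values are the nonterminals that produce them.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICAL_RULES = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICAL_RULES.get(word, ()))
    # Fill longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return start in table[0][n]


print(cyk_recognize("the dog chased the cat".split()))
print(cyk_recognize("chased the dog".split()))
```

The first call succeeds because the words combine into NP and VP and then S; the second yields only a VP, so the start symbol is never reached.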

Dependency parsing takes a relational view of syntax: it links each word to its head and labels the link with a syntactic function such as subject or object. Phrase structure trees instead group words into nested constituents such as noun phrases and verb phrases, so grammatical relations have to be read off the tree shape rather than from explicit labels. Dependency representations are often preferred in applications that need flexible, easily interpreted structures, such as syntax-based machine translation, and they suit languages with relatively free word order.
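
The two representations can be compared on a single sentence. The sketch below encodes "the dog chased the cat" both ways; the data layout and the helper names dependents and leaves are assumptions for illustration, and the relation labels follow Universal Dependencies naming.

```python
# Sentence: "the dog chased the cat"
WORDS = ["the", "dog", "chased", "the", "cat"]

# Dependency analysis: one (head_index, relation) pair per word; -1 marks the root.
DEPENDENCIES = [(1, "det"), (2, "nsubj"), (-1, "root"), (4, "det"), (2, "obj")]

# Phrase structure analysis: (label, child, ...) tuples with words at the leaves.
PHRASE_TREE = (
    "S",
    ("NP", ("Det", "the"), ("N", "dog")),
    ("VP", ("V", "chased"), ("NP", ("Det", "the"), ("N", "cat"))),
)


def dependents(index):
    """Words attached directly to the word at `index`, with their relations."""
    return [
        (WORDS[i], relation)
        for i, (head, relation) in enumerate(DEPENDENCIES)
        if head == index
    ]


def leaves(tree):
    """Words of a phrase structure tree, read left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


print(dependents(2))        # arguments of the verb "chased"
print(leaves(PHRASE_TREE))  # the sentence recovered from the constituents
```

In the dependency encoding the subject and object of the verb are one lookup away, while the phrase structure encoding makes the noun phrase and verb phrase boundaries explicit.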

Tokenization, stopword filtering, and stemming are preprocessing steps that make text easier to analyze. Tokenization breaks text into individual units, or tokens, which later steps operate on. Filtering stopwords removes very common words that carry little content, which reduces dimensionality and lets later stages concentrate on informative terms. Stemming reduces words to a root form so that variants such as "jumping" and "jumps" are counted together. Together these steps support tasks such as text classification, sentiment analysis, and information retrieval.
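
A minimal sketch of the three steps chained together is shown below. The stopword list and the suffix-stripping stemmer are deliberately tiny stand-ins invented for this example; they are not the Porter stemmer or any library's stopword list.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "in", "on", "to"}

# Suffixes removed by this toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, filter stopwords, then stem what remains."""
    return [stem(token) for token in remove_stopwords(tokenize(text))]


print(preprocess("The cats are jumping on the beds"))
```

A crude stemmer like this one will produce non-words for many inputs, which is the usual argument for lemmatization when readable base forms matter.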

Laplace (add-one) Smoothing addresses data sparsity by adding one to every n-gram count before the counts are normalized into probabilities; the add-k variant adds a smaller constant k instead. This prevents zero probabilities for unseen events. Its main weakness is that it moves too much probability mass from observed events to unseen ones, because every count is adjusted uniformly rather than according to context or likelihood, and this can reduce accuracy in language applications.
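
The effect is easiest to see on a bigram model. The following sketch estimates P(word | prev) = (count(prev, word) + k) / (count(prev) + k * V), where V is the number of word types that can be predicted. The function names, the sentence-boundary markers, and the two-sentence corpus are assumptions for illustration.

```python
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigram histories and bigrams over tokenized sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        # Only tokens that can be predicted belong to the vocabulary,
        # so the start marker is left out.
        vocab.update(tokens[1:])
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts, vocab


def laplace_prob(prev, word, history_counts, bigram_counts, vocab_size, k=1.0):
    """Add-k estimate of P(word | prev); k=1 gives Laplace smoothing."""
    numerator = bigram_counts[(prev, word)] + k
    denominator = history_counts[prev] + k * vocab_size
    return numerator / denominator


corpus = [["i", "like", "tea"], ["i", "like", "coffee"]]
histories, bigrams, vocab = train_bigram_counts(corpus)
seen = laplace_prob("like", "tea", histories, bigrams, len(vocab))
unseen = laplace_prob("like", "i", histories, bigrams, len(vocab))
print(seen, unseen)
```

With this corpus the unsmoothed estimate of P(tea | like) is 1/2, while the smoothed one is 2/7, and the unseen bigram "like i" receives 1/7 instead of zero. The drop from 1/2 to 2/7 is the mass that smoothing redistributes.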

Perplexity gauges the quality of a language model by how well it predicts a held-out test set. It is the inverse probability of the test set, normalized by the number of words, so a lower perplexity means the model assigns higher probability to the test data and is less "surprised" by it. Because it is computed on text the model was not trained on, perplexity reflects how well the model generalizes beyond its training data. Perplexities are comparable only between models that share the same vocabulary.
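
As a small worked sketch, perplexity can be computed from the probabilities a model assigned to each token of the test set as exp(-(1/N) * sum(log p_i)). The function name and the example probabilities are assumptions for illustration, and the model that produced the probabilities is left out.

```python
import math


def perplexity(token_probs):
    """Perplexity of a test sequence from the model's per-token probabilities."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Sum log probabilities instead of multiplying to avoid underflow.
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


# A model that spreads probability uniformly over four words.
uniform = perplexity([0.25] * 8)

# A model that is confident and right most of the time.
confident = perplexity([0.9, 0.8, 0.9, 0.7])

print(uniform, confident)
```

The uniform model has a perplexity equal to its number of choices, which is why perplexity is often read as an effective branching factor. A zero probability for any test token makes the logarithm undefined, which is one reason smoothing is applied before evaluation.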

FrameNet and PropBank take different approaches to semantic role labeling (SRL). FrameNet defines roles through frame semantics: each frame describes a conceptual scenario, and its roles (frame elements) are named for that scenario and shared by all words that evoke it. PropBank annotates roles through verb-specific framesets, using numbered arguments that stay close to the verb's syntactic behavior. FrameNet offers richer semantic detail but requires extensive annotation, whereas PropBank maps more directly onto syntactic structure, which is convenient for training machine learning models. Both resources support tasks such as information extraction and machine translation.
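
The contrast can be sketched by annotating one sentence in both styles. The dictionaries below are a simplified, hand-written illustration and not the file format of either resource; the labels for the sell.01 roleset and the Commerce_sell frame are given as commonly described and should be checked against the resources themselves.

```python
# Sentence: "John sold the car to Mary"

# PropBank style: numbered arguments defined per verb sense (roleset).
PROPBANK = {
    "predicate": "sold",
    "roleset": "sell.01",
    "roles": {"ARG0": "John", "ARG1": "the car", "ARG2": "Mary"},
}

# FrameNet style: named frame elements defined per frame.
FRAMENET = {
    "target": "sold",
    "frame": "Commerce_sell",
    "roles": {"Seller": "John", "Goods": "the car", "Buyer": "Mary"},
}


def align_roles(propbank, framenet):
    """Pair PropBank and FrameNet labels that annotate the same text span."""
    label_by_span = {span: label for label, span in framenet["roles"].items()}
    return {
        label: label_by_span[span]
        for label, span in propbank["roles"].items()
        if span in label_by_span
    }


print(align_roles(PROPBANK, FRAMENET))
```

The numbered labels only acquire a meaning through the roleset, while the frame element names are meaningful on their own and would be reused for other verbs that evoke the same frame.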

Semi-supervised learning for semantic parsing combines a small amount of annotated data with a larger unannotated corpus. A typical scheme trains an initial model on the annotated examples, uses it to label the unannotated text, and retrains on the predictions it is most confident about. This bridges the gap between supervised methods, which need costly annotation, and unsupervised methods, which tend to be less accurate, and it can improve generalization and adaptation to new domains. Such algorithms let NLP systems scale to settings where annotated data is scarce while limiting the loss in accuracy.
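
One common semi-supervised scheme is self-training, sketched below. To stay short, the sketch labels whole utterances with an intent rather than producing full semantic parses, and the keyword-vote learner, the function names, and the data are all hypothetical; only the self_train loop is the point of the example.

```python
from collections import Counter, defaultdict


def train(examples):
    """Toy learner: count how often each word occurs with each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in text.split():
            counts[word][label] += 1
    return counts


def predict(model, text):
    """Return (label, confidence), where confidence is the winning vote share."""
    votes = Counter()
    for word in text.split():
        votes.update(model.get(word, {}))
    if not votes:
        return None, 0.0
    label, count = votes.most_common(1)[0]
    return label, count / sum(votes.values())


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Repeatedly add confident predictions to the training set and retrain."""
    labeled, pool = list(labeled), list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for text in pool:
            label, confidence = predict(model, text)
            if confidence >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
        model = train(labeled)
    return model, labeled


seed = [("play some jazz", "music"), ("weather in paris", "weather")]
raw = ["play jazz", "weather tomorrow in paris", "book a table"]
model, grown = self_train(seed, raw)
print(grown)
```

The confidence threshold is the main design choice: set too low, the model reinforces its own mistakes; set too high, it never uses the unannotated data.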

Sampling sentences from a language model means generating text by repeatedly drawing the next word from the model's predicted probability distribution, conditioned on the words generated so far. It underlies text generation, predictive typing, and dialogue systems. Because each draw is random, repeated sampling produces varied output that reflects the range of language the model has learned, which makes responses in automated systems sound more natural and less repetitive.
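
The procedure can be sketched with a bigram model: start from a sentence-start marker, draw the next word in proportion to its count after the previous word, and stop at the end marker. The function names, the boundary markers, and the three-sentence corpus are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Map each history word to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            model[prev][word] += 1
    return model


def sample_sentence(model, rng, max_len=20):
    """Draw words one at a time in proportion to their bigram counts."""
    prev, words = "<s>", []
    while len(words) < max_len:
        followers = model[prev]
        choices = list(followers.keys())
        weights = list(followers.values())
        word = rng.choices(choices, weights=weights, k=1)[0]
        if word == "</s>":
            break
        words.append(word)
        prev = word
    return words


corpus = [["i", "like", "tea"], ["i", "like", "coffee"], ["you", "like", "tea"]]
model = train_bigram_model(corpus)
rng = random.Random(0)  # fixed seed so runs are repeatable
for _ in range(3):
    print(" ".join(sample_sentence(model, rng)))
```

The sampler can produce sentences that never occurred in the corpus, such as "you like coffee", because it only requires each adjacent pair of words to have been seen.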

Treebanks are corpora in which every sentence carries a syntactic annotation, and they are foundational resources for developing and evaluating parsing algorithms. They enable supervised learning by supplying annotated examples, which improves parser accuracy and robustness. Their drawbacks are the effort and expertise that manual annotation requires, limited coverage of languages and domains, and annotation choices that reflect the linguistic theory adopted when the treebank was created, which limits how well a treebank transfers to other languages and formalisms.
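
Many treebanks store each tree as a bracketed string in the style of the Penn Treebank. The sketch below reads such a string into nested lists; the function name and the example sentence are assumptions for illustration, and malformed input with missing brackets is not handled gracefully.

```python
def parse_bracketed(text):
    """Parse '(S (NP ...) (VP ...))' into nested [label, child, ...] lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read_node():
        nonlocal position
        if tokens[position] != "(":
            raise ValueError("expected '('")
        position += 1
        node = [tokens[position]]  # constituent or part-of-speech label
        position += 1
        while tokens[position] != ")":
            if tokens[position] == "(":
                node.append(read_node())  # nested constituent
            else:
                node.append(tokens[position])  # a word
                position += 1
        position += 1  # consume ')'
        return node

    tree = read_node()
    if position != len(tokens):
        raise ValueError("trailing input after tree")
    return tree


tree = parse_bracketed("(S (NP (DT the) (NN dog)) (VP (VBD barked)))")
print(tree)
```

Once trees are in this form, a supervised parser can count how often each label expands into each sequence of children, which is how treebank annotations turn into training signal.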
