
Section 1

The document consists of a series of questions and answers related to Information Retrieval (IR), covering key concepts such as indexing, querying, and the Bag of Words model. It includes practical examples and tasks like constructing term-document matrices and explaining true and false positives. Overall, it serves as a guide to understanding the fundamental principles and operations within the field of IR.

Uploaded by

om651994
Copyright
© All Rights Reserved

Q1. What is Information Retrieval (IR)?

A) Storing data in databases
B) Finding structured data only
C) Finding unstructured documents that satisfy an information need
D) Designing computer networks
Q2. Which of the following is NOT a key component of IR?
A) Indexing
B) Querying
C) Sorting
D) Matching and ranking
Q3. What is the main purpose of indexing?
A) To delete irrelevant documents
B) To organize data for efficient searching
C) To display results to the user
D) To translate queries
Q4. A query in IR is:
A) A database table
B) A user request for information
C) A ranked list
D) A document
Q5. Matching and ranking are used to:
A) Store documents
B) Create metadata
C) Order results by relevance
D) Compress files
Q6. In IR, documents are usually considered:
A) Highly structured
B) Completely numeric
C) Unstructured text
D) Encrypted files
Q7. What does the Bag of Words (BoW) model ignore?
A) Word frequency
B) Word order
C) Document length
D) Stop words
Q8. Boolean retrieval model is based on:
A) Probabilities
B) Neural networks
C) Logical operators (AND, OR, NOT)
D) Machine learning
Q9. Re-ordering words in a document:
A) Changes its main meaning
B) Destroys the topic
C) Does not affect the main idea
D) Makes it unreadable
Q10. Given the following term-document matrix:
Term       D1  D2  D3
data       1   0   1
mining     1   1   0
retrieval  0   1   1
Which documents satisfy the query:
data AND mining?
Answer: D1

Q11. Using the same matrix, which documents satisfy:
retrieval OR data?
Answer: D1, D2, D3

Q12. Which documents satisfy:
mining AND NOT data?
Answer: D2
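
The answers to Q10 to Q12 can be checked by treating each row of the matrix as a vector of 0/1 flags and combining the rows element by element. The following is a minimal sketch under that assumption; the helper names (vec_and, vec_or, vec_not, matching_docs) are illustrative and not part of the original worksheet.

```python
# Term-document incidence matrix from Q10: one list of 0/1 flags per term,
# in the document order D1, D2, D3.
DOCS = ["D1", "D2", "D3"]
MATRIX = {
    "data":      [1, 0, 1],
    "mining":    [1, 1, 0],
    "retrieval": [0, 1, 1],
}


def vec_and(a, b):
    """Element-wise AND of two incidence rows."""
    return [x & y for x, y in zip(a, b)]


def vec_or(a, b):
    """Element-wise OR of two incidence rows."""
    return [x | y for x, y in zip(a, b)]


def vec_not(a):
    """Element-wise NOT of an incidence row (0 becomes 1, 1 becomes 0)."""
    return [1 - x for x in a]


def matching_docs(vector):
    """Names of the documents whose flag is 1 in the result vector."""
    return [doc for doc, flag in zip(DOCS, vector) if flag]


q10 = matching_docs(vec_and(MATRIX["data"], MATRIX["mining"]))
q11 = matching_docs(vec_or(MATRIX["retrieval"], MATRIX["data"]))
q12 = matching_docs(vec_and(MATRIX["mining"], vec_not(MATRIX["data"])))

print("data AND mining     ->", q10)
print("retrieval OR data   ->", q11)
print("mining AND NOT data ->", q12)
```

NOT is applied to the row before it is combined with AND, which is why Q12 keeps only documents that contain "mining" and lack "data".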

Q13. Construct a term-document incidence matrix for these documents:
D1: "data mining techniques"
D2: "information retrieval systems"
D3: "data retrieval"
Term         D1  D2  D3
data         1   0   1
mining       1   0   0
techniques   1   0   0
information  0   1   0
retrieval    0   1   1
systems      0   1   0

Q14. Explain the difference between True Positive and False Positive in IR.
- True Positive: a relevant document that is retrieved.
- False Positive: an irrelevant (non-relevant) document that is retrieved.
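
As a small illustration of the two definitions, the retrieved documents and the relevant documents can be modeled as sets. This is only a sketch: the document IDs below are made-up example data, not results from the worksheet.

```python
# Hypothetical example data (not from the worksheet).
retrieved = {"D1", "D2", "D3"}  # what the system returned
relevant = {"D1", "D3", "D4"}   # what actually satisfies the information need

# True positives: retrieved AND relevant.
true_positives = retrieved & relevant

# False positives: retrieved but NOT relevant.
false_positives = retrieved - relevant

print("True positives: ", sorted(true_positives))
print("False positives:", sorted(false_positives))
```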

Q15. Why is the Bag of Words model useful in information retrieval?
The Bag of Words model is useful because it simplifies text representation: each document is
treated as an unordered collection of its words. Ignoring word order and focusing on which
words are present allows fast comparison between documents and queries.
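
To make the "ignores word order" point concrete, here is a minimal sketch that represents a text as a bag of words using Python's collections.Counter. The function name bag_of_words and the sample sentences are illustrative assumptions, and tokenization is simplified to lowercasing and splitting on whitespace.

```python
from collections import Counter


def bag_of_words(text):
    """Map each word to its count; word order is discarded."""
    return Counter(text.lower().split())


# Two texts with the same words in a different order (sample data).
a = bag_of_words("data retrieval systems")
b = bag_of_words("systems retrieval data")
same_bag = (a == b)

# Comparing a query to a document reduces to comparing their word sets.
query = bag_of_words("data mining")
shared_terms = set(a) & set(query)

print("Same bag of words:", same_bag)
print("Terms shared with the query:", sorted(shared_terms))
```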

Q16. Write a program that takes the following documents and builds a term-document
matrix.
Documents:
D1 = "data mining techniques"
D2 = "information retrieval systems"
D3 = "data retrieval"
Output should be a matrix like:
Term D1 D2 D3
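
One possible solution sketch for this exercise is shown below. It assumes whitespace tokenization and lowercase terms, and it lists terms in the order they first appear; the function names build_matrix and format_matrix are illustrative choices.

```python
documents = {
    "D1": "data mining techniques",
    "D2": "information retrieval systems",
    "D3": "data retrieval",
}


def build_matrix(documents):
    """Return (terms, matrix); matrix[term] holds one 0/1 flag per document."""
    tokenized = {doc_id: set(text.lower().split())
                 for doc_id, text in documents.items()}
    # Collect terms in first-seen order so the rows follow the documents.
    terms = []
    for text in documents.values():
        for term in text.lower().split():
            if term not in terms:
                terms.append(term)
    matrix = {term: [int(term in tokenized[doc_id]) for doc_id in documents]
              for term in terms}
    return terms, matrix


def format_matrix(documents, terms, matrix):
    """Render the matrix as aligned plain text."""
    width = max(len(t) for t in terms + ["Term"]) + 2
    lines = ["Term".ljust(width) + " ".join(documents)]
    for term in terms:
        lines.append(term.ljust(width) + "  ".join(str(v) for v in matrix[term]))
    return "\n".join(lines)


terms, matrix = build_matrix(documents)
print(format_matrix(documents, terms, matrix))
```

The rows produced for these three documents match the incidence matrix given as the answer to Q13.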

Q17. Write a function that extracts all unique terms (the vocabulary) from a list of
documents.
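
A minimal sketch of such a function follows, assuming that terms are lowercase whitespace-separated words and that a sorted list is an acceptable return format. The name extract_vocabulary is an illustrative choice.

```python
def extract_vocabulary(documents):
    """Return the sorted list of unique terms across all documents."""
    vocabulary = set()
    for text in documents:
        # A set keeps each term once, however often it occurs.
        vocabulary.update(text.lower().split())
    return sorted(vocabulary)


vocab = extract_vocabulary([
    "data mining techniques",
    "information retrieval systems",
    "data retrieval",
])
print(vocab)
```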

Q18. Write a function that takes a query like "data AND retrieval" and returns the
documents that satisfy it.
Q19. Write a program that checks whether a document matches a query using AND logic.
Q20. Display the results as: Relevant documents: D1, D3
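
The query exercises above (parsing an AND query, checking a document against it, and printing the result line) can be sketched together as follows. The sketch assumes the three documents from Q16, whitespace tokenization, and queries made only of terms joined by " AND "; the function names matches_and and search_and are illustrative.

```python
# Assumed document collection (taken from Q16).
documents = {
    "D1": "data mining techniques",
    "D2": "information retrieval systems",
    "D3": "data retrieval",
}


def matches_and(text, terms):
    """Return True if every query term occurs in the document (AND logic)."""
    words = set(text.lower().split())
    return all(term.lower() in words for term in terms)


def search_and(query, documents):
    """Return the IDs of documents that contain all terms of an AND query."""
    terms = [t.strip() for t in query.split(" AND ")]
    return [doc_id for doc_id, text in documents.items()
            if matches_and(text, terms)]


result = search_and("data AND retrieval", documents)
print("Relevant documents:", ", ".join(result))
```

With these three documents only D3 contains both "data" and "retrieval", so the sketch prints "Relevant documents: D3". A single-term query such as "data" returns D1 and D3, which matches the output format shown in the exercise.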
