CS 3308 Discussion Assignment Unit 5
How to Determine Which Type of Query to Use
The type of query to use depends on what kind of information you’re looking for and how
specific you want your results to be. If you need exact matches for terms, a Boolean retrieval
query works best. It uses logical operators such as AND, OR, and NOT. For example, if you search
for “machine AND learning”, the system will return only documents that contain both words
(Manning, Raghavan, & Schütze, 2009).
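To make the AND example concrete, here is a minimal sketch of Boolean retrieval over a tiny in-memory collection. The sample documents and the helper names build_index and boolean_and are made up for illustration, and whitespace tokenization is a simplifying assumption, not how a production system would tokenize.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Return the ids of documents that contain every given term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

# Hypothetical sample collection
docs = {
    1: "machine learning for search",
    2: "learning to cook",
    3: "machine maintenance manual",
}
index = build_index(docs)
print(boolean_and(index, "machine", "learning"))  # {1}
```

Only document 1 contains both words, so it is the only match. OR and NOT would use set union and set difference in the same way.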
If you’re looking for flexible matches, such as when you’re unsure of a word’s spelling, a
wildcard query is better. For example, typing comput* could return computer, computing, and
computation. This is especially useful for dealing with variations in word endings or plurals.
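As a rough illustration of wildcard matching, the sketch below checks a pattern against a small vocabulary using Python's standard fnmatch module. The vocabulary and the function name wildcard_match are hypothetical. Scanning every term like this is only practical for a toy example; Manning et al. (2009) describe permuterm and k-gram indexes for doing this efficiently.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, vocabulary):
    """Return the vocabulary terms matching a pattern that uses * or ?."""
    return sorted(term for term in vocabulary if fnmatchcase(term, pattern))

# Hypothetical vocabulary taken from an index
vocabulary = {"computer", "computing", "computation", "compare", "commute"}

print(wildcard_match("comput*", vocabulary))   # ['computation', 'computer', 'computing']
print(wildcard_match("comput?r", vocabulary))  # ['computer']
```

Here * stands for any run of characters and ? stands for exactly one character.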
When you want to search for an exact phrase, such as “artificial intelligence”, a phrase query
should be used. Phrase queries require the words to appear next to each other and in the
given order, which improves the precision of search results.
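One common way to support phrase queries is a positional index, which stores where each term occurs in each document. The sketch below is a simplified version under the assumption of whitespace tokenization; the sample documents and the names build_positional_index and phrase_search are invented for this example.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> {doc_id: [positions where the term occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

def phrase_search(index, phrase):
    """Return ids of documents containing the phrase terms consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # Term i of the phrase must sit at position start + i
            if all(start + i in index[t].get(doc_id, [])
                   for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits

# Hypothetical sample collection
docs = {
    1: "artificial intelligence is a field",
    2: "intelligence that is artificial",
    3: "artificial general intelligence",
}
index = build_positional_index(docs)
print(phrase_search(index, "artificial intelligence"))  # {1}
```

Documents 2 and 3 contain both words, but not side by side in that order, so only document 1 matches.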
Differences Between Query Types
Query Type     | Description                                                     | Example            | Use Case
Boolean Query  | Combines words using AND, OR, NOT to match logical conditions.  | data AND mining    | When looking for documents containing all or some specific terms.
Wildcard Query | Uses symbols like * or ? to match partial or unknown words.     | comput*            | When you’re unsure of spelling or want to include word variations.
Phrase Query   | Matches exact sequences of words.                               | "machine learning" | When the exact phrase meaning matters.
Improving Scoring and Ranking Efficiency
Search engines must process millions of documents quickly. Some techniques that improve
efficiency include:
1. Inverted Indexing – This structure maps terms to their document occurrences, allowing
faster lookups (Manning et al., 2009).
2. Term Weighting (tf-idf) – Weights a term higher when it is frequent within a document
but rare across the collection, so documents are ranked by their distinctive terms rather
than by raw keyword frequency.
3. Cosine Similarity – Computes the cosine of the angle between the document and query
vectors to measure how similar they are (Wikipedia, 2024).
4. Caching Frequent Queries – Storing results of common searches speeds up retrieval for
future requests.
5. Query Optimization and Pruning – Ignoring stop words and eliminating unnecessary
comparisons reduces computation time.
By combining these methods, modern search systems like Google can return relevant results
almost instantly.
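To show how tf-idf weighting and cosine similarity fit together, here is a small sketch that ranks documents against a query. It assumes raw term counts for tf and log(N / df) for idf, which is only one of several common weighting variants. The sample documents and the function names tfidf_vectors, cosine, and rank are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return ({doc_id: {term: weight}}, idf) using tf * log(N / df)."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n = len(tokenized)
    # Document frequency: number of documents containing each term
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    idf = {t: math.log(n / count) for t, count in df.items()}
    vectors = {
        d: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for d, tokens in tokenized.items()
    }
    return vectors, idf

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs):
    """Return document ids ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0)
         for t, tf in Counter(query.lower().split()).items()}
    scores = {d: cosine(q, vec) for d, vec in vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical sample collection
docs = {
    1: "data mining finds patterns in data",
    2: "mining coal is hard work",
    3: "cooking pasta at home",
}
print(rank("data mining", docs))  # [1, 2, 3]
```

Document 1 ranks first because it contains both query terms, and "data" is the rarer of the two in this collection. Document 2 shares only "mining", and document 3 shares nothing, so it scores zero.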
References
Manning, C. D., Raghavan, P., & Schütze, H. (2009). An Introduction to Information
Retrieval. Cambridge University Press. Retrieved from
[Link]
Wikipedia. (2024). Cosine similarity. Retrieved from
[Link]