Database vs. Search Engine Overview

A database is an organized collection of related records stored digitally, arranged in a structured order for efficient searching. A search engine is a software system designed to search for information on a computer system like the World Wide Web. It has four main components: a crawler that gathers web pages and stores them in an index; the index database; a search engine that finds matches to user queries in the index; and a user interface. Basic and advanced search techniques can be used to conduct targeted searches within databases and search engines.

Uploaded by

bhavgifee
Copyright
© Attribution Non-Commercial (BY-NC)

DATABASE & SEARCH ENGINE

PRESENTED BY:
[Link]
Database
What is a database?
• A database is an organized collection of related records that is stored digitally.
• It is arranged in a structured order for ease and speed of search.
• An example is the Library Literature Database on the New York Public Library website, which “indexes periodicals and books, reports, pamphlets, and library school theses on all aspects of library and information science” from 1984 to the present.
What Is a Search Engine?
• The term “search engine” usually refers to a web search engine, which searches for information on the web.

• Search engines are huge databases of web page files that have been assembled automatically by machines.

• By performing a search using a search engine, you are asking the engine to scan its index of sites and match your keywords and phrases with those in the text of documents within the engine's database.
• A search engine is a document retrieval system designed to help find information stored on a computer system, such as the World Wide Web.

• A search engine lets you ask for content that meets specific criteria, typically items containing a given word or phrase, and retrieves a list of the items that match. This list is often sorted by some measure of relevance.

• When you use a search engine, you are NOT searching the entire web as it exists at this moment. You are searching a portion of the web, captured in a fixed index created at an earlier date.
What Is a Search Engine?

• A client/server application
• A document retrieval system
• Uses regularly updated indexes to operate quickly and efficiently
• Designed to help find information stored:
  • On a computer system, such as on the World Wide Web
  • Inside a corporate or proprietary network
  • In a personal computer
• Different selection and relevance criteria can apply in different environments, or for different uses
• Allows one to ask for content meeting specific criteria
  • Typically those containing a given word or phrase
• Retrieves a list of items that match those criteria
Search Engines Consist of Four Discrete Software Components

• Spider/Crawler: a software program that gathers information and puts it into the search engine’s database. It visits web pages, often starting at the main page of a site, reads them, and then follows the links to other pages.

• The Database or Index: the web pages are systematically stored and updated here.

• Search Engine Result Engine: the software that sifts through the pages stored in the index to find matches to a search and ranks them in order of what it believes is most relevant.

• The Interface: what we use to query the database. It usually consists of a search box in which you type your query and a button to launch the search. Sometimes there are menus for choosing search functions that refine the query.
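The spider's behaviour described above (start at a main page, read it, follow its links) can be sketched as a breadth-first traversal. This is only an illustrative sketch: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are made-up stand-ins for real pages and a real crawler, and no network access is involved.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "site/home": ("welcome to the library site", ["site/about", "site/books"]),
    "site/about": ("about the library", ["site/home"]),
    "site/books": ("books and periodicals", ["site/books/new"]),
    "site/books/new": ("new books this month", []),
    "site/orphan": ("no other page links here", []),
}

def crawl(start_url, web):
    """Visit pages breadth-first from start_url, following links.

    Returns the crawler's "database" as a dict of {url: page text}.
    """
    database = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in database or url not in web:
            continue  # already stored, or a dead link
        text, links = web[url]
        database[url] = text   # store the page that was read
        queue.extend(links)    # then follow its links to other pages
    return database

pages = crawl("site/home", TOY_WEB)
```

In this toy run, `site/orphan` is never stored because no crawled page links to it, which is one simple way a page can stay out of a search engine's database.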
• The Spider retrieves pages from the World Wide Web.

• The data retrieved by the spider is systematically indexed and stored in the search engine’s database.

• When a user types in a search query, the Search Engine Result Engine looks up the Index and provides a listing of the best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. Most search engines support the Boolean operators AND, OR and NOT to further specify the search query.
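One common way to make such lookups fast is an inverted index, which maps each word to the set of pages containing it. The sketch below is an assumption about how an index of this kind could be built, not a description of any particular engine; the page ids and texts are invented. With sets, the Boolean operators AND, OR and NOT correspond to intersection, union and difference.

```python
def build_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = {}
    for page_id, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(page_id)
    return index

# Hypothetical pages gathered by a spider.
PAGES = {
    "p1": "library science periodicals",
    "p2": "computer science books",
    "p3": "library books and pamphlets",
}
INDEX = build_index(PAGES)

def lookup(word):
    """Return the set of page ids containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())

both = lookup("library") & lookup("books")       # library AND books
either = lookup("library") | lookup("science")   # library OR science
without = lookup("science") - lookup("library")  # science NOT library
```

Here `both` contains only `p3`, `either` contains all three pages, and `without` contains only `p2`.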
Let's see how Google processes a query

1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book: it tells which pages contain the words that match any particular query term.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
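The two stages in the steps above can be imitated in a few lines: one function plays the index-server role (which documents contain the term?) and another plays the doc-server role (fetch the text and cut a snippet). This is a simplified sketch under stated assumptions, not Google's implementation; the `DOCS` contents, the function names, and the three-word snippet window are all illustrative choices.

```python
import re

# Hypothetical stored documents, keyed by document id.
DOCS = {
    1: "A database is an organized collection of related records that is stored digitally.",
    2: "Search engines are huge databases of web page files assembled automatically by machines.",
}

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def index_lookup(term, docs):
    """Stage 1 (index-server role): ids of documents containing the term."""
    term = term.lower()
    return [doc_id for doc_id, text in docs.items() if term in tokenize(text)]

def make_snippet(term, text, width=3):
    """Stage 2 (doc-server role): a few words around the first match."""
    words = text.split()
    for i, word in enumerate(words):
        if re.sub(r"\W", "", word).lower() == term.lower():
            start = max(0, i - width)
            return " ".join(words[start:i + width + 1])
    return ""

def search(term, docs):
    """Run both stages and return (doc id, snippet) pairs."""
    return [(doc_id, make_snippet(term, docs[doc_id]))
            for doc_id in index_lookup(term, docs)]

results = search("collection", DOCS)
```

A real index server would consult a prebuilt index rather than rescanning every document on each query; the scan here only keeps the sketch short.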
Types of Search Engines
• Crawler-based search engines – These build their indexes automatically using spiders. E.g. Google, AltaVista.

• Directory – These are created and maintained by human editors. The editors review and select sites for inclusion in their directories on the basis of previously determined selection criteria. Their databases are organised by category or subject to permit browsing but are in general much smaller than those of crawler-based engines. E.g. Yahoo, LookSmart.

• Regional – Regional search engines focus on one particular language or region. E.g. [Link], [Link]

• Metasearcher – Metasearchers use a uniform platform to search using several engines simultaneously. E.g. [Link], ProFusion, Vivisimo.
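As a rough illustration of the metasearcher idea, the sketch below sends one query to several engines and merges their ranked lists, dropping duplicates. The two engine functions and their URLs are hypothetical stand-ins that return fixed results, the interleaving rule is just one plausible merging choice, and the engines are called one after another here rather than truly simultaneously.

```python
def engine_a(query):
    # Hypothetical engine: returns URLs in its own ranked order.
    return ["a.example/1", "shared.example/x", "a.example/2"]

def engine_b(query):
    # Hypothetical second engine with a partly overlapping result list.
    return ["shared.example/x", "b.example/1"]

def metasearch(query, engines):
    """Query every engine, then merge the results without duplicates.

    Results are interleaved by rank position: all first hits,
    then all second hits, and so on.
    """
    ranked_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    longest = max((len(results) for results in ranked_lists), default=0)
    for position in range(longest):
        for results in ranked_lists:
            if position < len(results) and results[position] not in seen:
                seen.add(results[position])
                merged.append(results[position])
    return merged

merged = metasearch("library science", [engine_a, engine_b])
```

The merged list contains each URL once, even though `shared.example/x` was returned by both engines.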
Invisible Web
• Search engines do not necessarily reach all parts of the Web or index all pages at a site.

• The Invisible Web, as it is called, largely comprises databases not easily indexed by the search engines, pages deep in a web site that don't get crawled, file formats that the search engines ignore, and services for subscribers only (often for a fee).

• No one has a firm estimate of its size, but some have guessed at 500 billion pages.
SEARCH ENGINE APPLICATIONS
Search engines allow field searches such as search in title, date last updated, search in the URL, etc.

A search engine searches a huge database of web pages.

The results are displayed in order of how often the specified keywords occur. They can also be reordered by date of posting.
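Ordering by keyword occurrence and reordering by date can be sketched as two sorts over the same result list. The page records below are invented for illustration, and counting raw keyword occurrences is the simple measure this slide describes, not a claim about how any current engine ranks results.

```python
from datetime import date

# Hypothetical indexed pages with their posting dates.
PAGES = [
    {"url": "x.example/a", "text": "search engine basics and search tips", "posted": date(2009, 5, 1)},
    {"url": "x.example/b", "text": "database basics", "posted": date(2010, 1, 15)},
    {"url": "x.example/c", "text": "search search search engine", "posted": date(2008, 3, 9)},
]

def keyword_count(text, keywords):
    """Total number of times any of the keywords occurs in the text."""
    words = text.lower().split()
    return sum(words.count(keyword.lower()) for keyword in keywords)

def rank_by_occurrence(pages, keywords):
    """Keep pages that match at least once, highest keyword count first."""
    hits = [page for page in pages if keyword_count(page["text"], keywords) > 0]
    return sorted(hits, key=lambda page: keyword_count(page["text"], keywords), reverse=True)

def reorder_by_date(results):
    """Reorder an existing result list, most recently posted first."""
    return sorted(results, key=lambda page: page["posted"], reverse=True)

by_relevance = rank_by_occurrence(PAGES, ["search", "engine"])
by_date = reorder_by_date(by_relevance)
```

With these records, page `c` ranks first by keyword count (four occurrences against three for page `a`), while page `a` comes first when the same results are reordered by date.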
THE METHOD TO CONDUCT SEARCH
When you conduct a search for a specific title or author, what type of search are you conducting?
Field searching allows the researcher to select a specific portion of the electronic record to search, be that title, author, publication year, etc. If someone were looking for articles by John Updike, the searcher could simply type “Updike, John” into the author field to search for all articles contained in the database written by John Updike.

What are basic search techniques?

The first basic principle of conducting a search is to choose appropriate keywords, using a thesaurus if deemed necessary. In choosing keywords, the researcher should consider variant word forms, differing spellings and related words.
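The advice about variant forms and spellings can be pictured as expanding each keyword into a list of alternatives before searching. The tiny `VARIANTS` table below is a hand-made, hypothetical thesaurus used only for illustration; a real thesaurus or the database's own subject headings would take its place.

```python
# Hypothetical thesaurus: keyword -> variant forms, spellings, related words.
VARIANTS = {
    "catalog": ["catalogs", "catalogue", "catalogues"],
    "library": ["libraries"],
}

def expand_keywords(keywords, variants):
    """Return the keywords plus their known variants, without duplicates."""
    expanded = []
    for keyword in keywords:
        for term in [keyword] + variants.get(keyword, []):
            if term not in expanded:
                expanded.append(term)
    return expanded

def matches_any(text, terms):
    """True if the text contains at least one of the terms as a whole word."""
    words = set(text.lower().split())
    return any(term in words for term in terms)

terms = expand_keywords(["library", "catalog"], VARIANTS)
found_with_variants = matches_any("public libraries online", terms)
found_without = matches_any("public libraries online", ["library"])
```

The plural "libraries" is found only when the expanded term list is used, which is the practical reason for thinking about variants up front.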
List some advanced search techniques.
To conduct a more specific search, field searching is recommended. This means searching particular fields such as Author, Title, Year of Publication, or Language for precise keywords. Thus a researcher could enter “1999” in the year of publication field to find documents published in that year, “French” in the language field to find documents written in French, or “small” in the title field to find books with the word “small” in the title.
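Field searching amounts to matching a term against one named part of each record instead of the whole record. The sketch below assumes records are simple dictionaries; the record contents, titles, and the `field_search` helper are invented for illustration and do not reflect any particular database's interface.

```python
# Hypothetical bibliographic records; titles and the second author are placeholders.
RECORDS = [
    {"author": "Updike, John", "title": "A small example title", "year": "1999", "language": "English"},
    {"author": "Updike, John", "title": "Another example title", "year": "2001", "language": "English"},
    {"author": "Author, Example", "title": "Small placeholder", "year": "1999", "language": "French"},
]

def field_search(records, field, term):
    """Return records whose given field contains the term (case-insensitive)."""
    term = term.lower()
    return [record for record in records if term in record[field].lower()]

by_author = field_search(RECORDS, "author", "Updike, John")
from_1999 = field_search(RECORDS, "year", "1999")
in_french = field_search(RECORDS, "language", "French")
small_titles = field_search(RECORDS, "title", "small")
```

Because each call looks at a single field, searching the year field for "1999" cannot accidentally match a title that happens to contain that number.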
In addition to the basic search techniques, some interfaces offer a proximity operator, such as “with,” “adjacent” or “near,” that can be used to further limit or expand a search.
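A proximity operator can be understood as a check on how many words apart two terms are. The exact meaning of “with,” “adjacent” and “near” differs between interfaces, so the definitions below are assumptions made for this sketch: `near` allows up to three words between positions by default, and `adjacent` requires the terms to be next to each other.

```python
def near(text, term_a, term_b, max_distance=3):
    """True if the two terms occur within max_distance word positions of each other."""
    words = text.lower().split()
    positions_a = [i for i, word in enumerate(words) if word == term_a.lower()]
    positions_b = [i for i, word in enumerate(words) if word == term_b.lower()]
    return any(abs(a - b) <= max_distance
               for a in positions_a for b in positions_b)

def adjacent(text, term_a, term_b):
    """True if the two terms appear directly next to each other, in either order."""
    return near(text, term_a, term_b, max_distance=1)

TEXT = "library and information science theses"
library_near_science = near(TEXT, "library", "science")
library_adjacent_science = adjacent(TEXT, "library", "science")
information_adjacent_science = adjacent(TEXT, "information", "science")
```

Tightening `max_distance` limits the search, and loosening it expands the search, which mirrors how these operators are used in practice.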
THANK YOU

Common questions

Databases are organized collections of related records, structured for efficient querying and retrieval of data stored digitally, such as library catalogues. In contrast, search engines are designed to locate information stored on computer systems, like the internet, by indexing and retrieving web page files based on user queries. While databases focus on curated and structured data management, search engines dynamically index and retrieve data from vast, less structured web environments.

Search engines create and maintain their indexes by employing spiders to traverse the web, gathering data from websites. The spider reads web pages, follows links, and systematically collects information, which is then indexed and organized in search engine databases. This indexing is crucial for efficient data retrieval as it enables the search engine result engine to quickly sift through massive datasets to find and rank pages relevant to user queries, facilitating the rapid delivery of search results.

A search engine system primarily comprises four components: the spider/crawler, the database or index, the search engine result engine, and the interface. The spider or crawler gathers information by visiting web pages and following links to other pages. The collected data is then indexed and stored systematically in the search engine's database. When a user inputs a query via the interface, it is sent to the index servers to find pages containing the search terms. These pages are then ranked by the search engine result engine based on relevance criteria and returned to the user's interface, often with a document summary including the title and parts of the text.

Metasearch engines distinguish themselves by utilizing a single platform to search across several other search engines simultaneously, which allows users to access a broader range of search results than a single crawler-based search engine can provide. In contrast, crawler-based search engines use spiders to index web pages and rank them independently. Hence, while crawler-based engines build and search their own indexes, metasearch engines rely on aggregating results from various indexes without maintaining their own.

Search engines have limitations in reaching all parts of the web due to the vast and dynamic nature of internet content. The 'Invisible Web' refers to areas of the internet that are not indexed by search engines. This includes databases that present challenges for indexing, pages deep within a site that are not accessed by spiders, file formats ignored by search engines, and subscriber-only services. As a result, a significant portion of online information remains inaccessible through conventional search engines, with estimates suggesting the Invisible Web could involve up to 500 billion pages.

Boolean operators refine search engine queries by allowing users to define relationships between keywords and phrases, thereby enhancing the specificity of search results. Common Boolean operators include AND, OR, and NOT. 'AND' narrows search results by including only pages containing all specified terms, 'OR' broadens results to include pages with any of the listed terms, and 'NOT' excludes pages containing certain terms. Utilizing these operators helps users filter relevant information more effectively during searches.

Directory-based search engines differ from crawler-based search engines in that they rely on human editors to review, select, and categorize sites based on predetermined criteria. This results in smaller, more curated databases organized by subject. In contrast, crawler-based engines use automated spiders to index sites, offering broader, more extensive coverage of the web. For users, directory-based engines can offer more focused and high-quality results in specific categories, while crawler-based engines provide access to a wider array of data, potentially requiring more effort to filter for relevance.

Search engines have evolved to offer region-specific search results, enhancing their relevance and accessibility to global users. This development involves tailoring search algorithms and indexes to prioritize content based on location, language, and cultural context. Regional search engines like Google.co.in focus on tailoring results to local interests and language preferences, enhancing user experience by delivering more applicable results. The impact on global user access is significant, as users can receive more pertinent and contextually appropriate information without geographic or language barriers, fostering increased internet utility across diverse populations.

Field searching in databases allows users to target specific portions of an electronic record, such as the title, author, or publication year, which enables more precise retrieval than general keyword searching. This structured approach can filter results based on defined record fields, whereas general keyword searching casts a wider net by identifying relevant terms throughout the entire record. For example, field searching for "Updike, John" in the author field retrieves works specifically by that author, whereas keyword searching might yield broader results.

The user interface is a vital component of a search engine as it facilitates user interaction with the system. It typically includes a search box for typing queries and control elements to launch searches and refine results. The interface translates user queries into formats that the search engine understands, allowing operational commands such as applying Boolean operators or choosing specific search fields. Additionally, a well-designed interface increases usability by making it intuitive for users to access information efficiently.
