
Understanding SQL Query Processing Steps

The document discusses SQL query processing in database management systems (DBMS), outlining its importance for efficient data retrieval and manipulation. It details the steps involved in query processing, including parsing, optimization, and evaluation, and provides an example to illustrate the process. Additionally, it covers data abstraction in DBMS, explaining its levels and advantages, as well as some drawbacks.

Uploaded by

nahidhassanjs4
Copyright
© All Rights Reserved

SQL query processing is a fundamental aspect of database management systems (DBMS). When you
run a query, the DBMS processes it to fetch or manipulate data in the database. This process
involves several stages that ensure the query is executed efficiently and correctly. Understanding
how this works is essential for optimizing database performance.

In this blog, we will explore what SQL queries and query processing are, and walk through an
example to explain them. By the end of the article, you should understand how queries are
interpreted and processed in a DBMS.

What is a Query in DBMS?

A query in a database management system (DBMS) is a request for data, or for an action on data,
that is executed against the database. The request is expressed in a query language, typically
SQL, and instructs the DBMS to carry out operations such as selecting, modifying, deleting or
adding data. Queries are how database users interact with the database, which makes it practical
to manage complex databases. They range from fetching a single record from one table to joining
several tables and cleaning up records.

What is Query Processing in DBMS?

In simple terms, query processing in a DBMS is the entire chain of steps that the system goes
through to handle a request and return the requested data. In response to a query, a DBMS does
not simply fetch data; it also looks for ways to reduce the work needed to answer the query. This
involves breaking the query down, restructuring it, and executing it in the most effective manner
it can find.

The principal objective of query processing is to reduce the time and effort needed to obtain the
results. The DBMS interprets the query and selects an efficient way to execute it, taking into
account the structure of the database, existing indexes, and the available resources. Query
processing is central to the performance and responsiveness of a database system, especially for
large and complex data sets and queries.

Steps in Query Processing


Query processing in DBMS involves several key steps to ensure that a query is executed efficiently
and accurately. The main steps include parsing and translation, optimization, and evaluation. Each
of these steps plays a crucial role in transforming a high-level query into a form that can be
executed by the database engine.

Parsing and Translation

Parsing and translation are the initial steps in query processing. When a query is submitted, the
DBMS first parses the SQL statement to check for syntax errors and to understand the query
structure.

• Parsing:

o The query is broken down into smaller components (keywords, operators, operands).

o The DBMS checks for syntax errors and validates the query structure.

• Translation:

o The query is converted into an internal representation, often an abstract syntax tree
(AST).

o Semantic checks ensure the query references valid database objects (tables,
columns) and uses consistent data types.

Key Points:

• The parsing step catches syntax errors before the query is executed.

• Translation transforms the query into a form the optimizer can work with.

• Together, these steps ensure that the query is logically correct and ready for optimization.
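The difference between a syntax check and a semantic check can be observed from the client side. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS, which is an assumption (the article does not name a specific system), together with a hypothetical two-column Employees table. The helper name check is also made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, Salary REAL)")


def check(sql):
    """Return 'ok' if the statement is accepted, else the DBMS error text."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)


# Misspelled keyword: rejected while parsing.
syntax_result = check("SELEC Salary FROM Employees")

# Valid syntax, but the table does not exist: rejected by a semantic check.
semantic_result = check("SELECT Salary FROM Staff")

# Valid syntax and valid objects: accepted.
valid_result = check("SELECT Salary FROM Employees")

print(syntax_result)
print(semantic_result)
print(valid_result)
```

In both failing cases SQLite refuses the statement before reading any rows, which is the behaviour the key points above describe. Other systems word their error messages differently.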

Optimization

The optimization phase determines the most efficient way to execute the parsed query. This is
crucial for performance, especially with large databases or complex queries.

• Query Optimizer:

o Generates multiple candidate execution plans.

o Evaluates each plan based on cost factors like CPU time, memory, and I/O
operations.

o Selects the execution plan with the lowest estimated cost.

• Optimization Considerations:

o Indexes: Decides whether to use indexes or perform full table scans.

o Data Size: Considers the volume of data involved.

o Joins and Conditions: Analyses the complexity of joins and the selectivity of
query conditions.

Key Points:

• The optimizer explores different strategies to execute the query efficiently.

• The final execution plan is chosen to minimise estimated resource usage.

• The optimization phase can include query rewriting or applying transformations.
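A toy cost model can make the "lowest estimated cost" selection concrete. Everything in the sketch below is an illustrative assumption: the cost formulas, the index depth, the table sizes, and the function names. Real optimizers use far richer statistics, so treat this only as a picture of the idea.

```python
def full_scan_cost(table_pages):
    # A full table scan reads every page of the table once.
    return table_pages


def index_scan_cost(table_rows, selectivity, index_depth=3):
    # Walk down the index once, then fetch each matching row.
    # Pessimistic assumption: one page read per matching row.
    matching_rows = table_rows * selectivity
    return index_depth + matching_rows


def choose_plan(table_rows, rows_per_page, selectivity):
    """Estimate the cost of each candidate plan and pick the cheapest."""
    table_pages = -(-table_rows // rows_per_page)  # ceiling division
    costs = {
        "full_table_scan": full_scan_cost(table_pages),
        "index_scan": index_scan_cost(table_rows, selectivity),
    }
    best = min(costs, key=costs.get)
    return best, costs


# A condition that matches 0.1% of the rows versus one that matches 50%.
selective = choose_plan(table_rows=100_000, rows_per_page=100, selectivity=0.001)
broad = choose_plan(table_rows=100_000, rows_per_page=100, selectivity=0.5)

print(selective[0])
print(broad[0])
```

Under these assumed numbers the highly selective condition favours the index, while the broad condition favours the full scan. This is why the selectivity of query conditions appears among the optimization considerations.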


Evaluation

The evaluation phase is where the DBMS executes the optimised query plan, interacting with the
storage engine to retrieve or modify data.

• Execution:

o Follows the steps outlined in the execution plan.

o Performs operations like reading data, applying filters, joining tables, sorting, and
aggregating results.

• Data Interaction:

o Uses indexes or performs full table scans as per the execution plan.

o Handles intermediate results in memory or on disk, depending on the query’s
complexity and available resources.

Key Points:

• The DBMS retrieves the required data based on the optimised plan.

• Intermediate results are managed efficiently to complete the query.

• The final results are formatted and sent back to the user or application.
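One common way to picture plan execution is as a chain of operators, each pulling rows from the one below it. The sketch below models that with Python generators. The operator names and the small orders data set are hypothetical, and the article does not say that any particular DBMS is built this way.

```python
def table_scan(rows):
    """Leaf operator: yield every row of the table."""
    for row in rows:
        yield row


def filter_rows(rows, predicate):
    """Pass on only the rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def sort_rows(rows, key):
    """Blocking operator: it must read all input before emitting any row."""
    yield from sorted(rows, key=lambda row: row[key])


def project(rows, columns):
    """Keep only the requested columns of each row."""
    for row in rows:
        yield {column: row[column] for column in columns}


# Hypothetical table contents, invented for this sketch.
orders = [
    {"id": 1, "customer": "Ada", "total": 250},
    {"id": 2, "customer": "Bo", "total": 80},
    {"id": 3, "customer": "Cy", "total": 120},
]

# Plan: scan -> filter (total > 100) -> sort by total -> project.
plan = project(
    sort_rows(
        filter_rows(table_scan(orders), lambda row: row["total"] > 100),
        "total",
    ),
    ["customer", "total"],
)

result = list(plan)
print(result)
```

The scan, filter, and project steps handle one row at a time, while the sort has to hold all of its input. That is the kind of intermediate result a DBMS may keep in memory or spill to disk.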

Query Processing Example

To better understand how query processing works, let’s walk through a practical example. Suppose
we have a simple database with a table called Employees, which contains the following columns:
EmployeeID, FirstName, LastName, Department, and Salary. Now, imagine you want to retrieve
the names and departments of employees who earn more than $50,000.

SQL Query:

SELECT FirstName, LastName, Department

FROM Employees

WHERE Salary > 50000;

Step 1: Parsing and Translation

• Parsing:

o The DBMS checks the syntax of the SQL query.

o The query is broken down into components: SELECT, FROM, WHERE, etc.

• Translation:

o The query is converted into an abstract syntax tree (AST).

o Semantic checks verify that the table Employees and the columns FirstName,
LastName, Department, and Salary exist, and that Salary can be compared with a number.

o The query is now in a form that the optimizer can understand.
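Step 1 can be sketched for this one query shape. The tokenizer, the parse_select function, the dictionary layout of the AST, and the CATALOG structure are all hypothetical simplifications written for this example. A real SQL parser handles a far larger grammar.

```python
import re

QUERY = "SELECT FirstName, LastName, Department FROM Employees WHERE Salary > 50000;"

# Hypothetical catalog: table name -> set of known column names.
CATALOG = {
    "Employees": {"EmployeeID", "FirstName", "LastName", "Department", "Salary"},
}


def tokenize(sql):
    """Split the query into identifiers/keywords, numbers, and operators."""
    return re.findall(r"[A-Za-z_]\w*|\d+|>=|<=|<>|[<>=,;]", sql)


def parse_select(tokens):
    """Parse 'SELECT cols FROM table WHERE col op number' into a nested dict."""
    pos = 0

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of query")
        token = tokens[pos]
        pos += 1
        return token

    def expect(keyword):
        token = take()
        if token.upper() != keyword:
            raise SyntaxError(f"expected {keyword}, found {token}")

    expect("SELECT")
    columns = [take()]
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1  # skip the comma
        columns.append(take())
    expect("FROM")
    table = take()
    expect("WHERE")
    left, op, right = take(), take(), take()
    return {
        "type": "select",
        "columns": columns,
        "from": table,
        "where": {"op": op, "left": left, "right": int(right)},
    }


def semantic_check(ast, catalog):
    """Verify that the table and every referenced column exist in the catalog."""
    if ast["from"] not in catalog:
        raise LookupError(f"unknown table {ast['from']}")
    known = catalog[ast["from"]]
    for column in ast["columns"] + [ast["where"]["left"]]:
        if column not in known:
            raise LookupError(f"unknown column {column}")


tokens = tokenize(QUERY)
ast = parse_select(tokens)
semantic_check(ast, CATALOG)
print(tokens)
print(ast)
```

The nested dictionary plays the role of the AST: the optimizer would work from this structure rather than from the original query text.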

Step 2: Optimization

• Generating Execution Plans:

o The optimizer considers different ways to execute the query, such as:

▪ Index Scan: If an index exists on the Salary column, the optimizer may
choose to use it to quickly find rows where Salary > 50000.

▪ Full Table Scan: If no suitable index is found, the optimizer may opt for a
full table scan.

• Choosing the Best Plan:

o The optimizer evaluates each plan’s cost, considering factors like the number of
rows to be scanned and the resources required.

o The plan with the lowest estimated cost, perhaps using an index scan, is selected.
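Many systems let you inspect the plan the optimizer picked. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in DBMS (an assumption, since the article does not name a system). The index name idx_employees_salary is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, "
    "Department TEXT, Salary REAL)"
)

QUERY = (
    "SELECT FirstName, LastName, Department "
    "FROM Employees WHERE Salary > 50000"
)


def plan_details(sql):
    # Each EXPLAIN QUERY PLAN row has four columns; the last is the description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# No index on Salary yet: the only option is to scan the table.
plan_without_index = plan_details(QUERY)

# Add an index on Salary and ask for the plan again.
conn.execute("CREATE INDEX idx_employees_salary ON Employees (Salary)")
plan_with_index = plan_details(QUERY)

print(plan_without_index)
print(plan_with_index)
```

The first plan describes a scan of Employees and the second a search that uses idx_employees_salary. The exact wording of the plan text varies between SQLite versions, so it is not reproduced here.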

Step 3: Evaluation

• Executing the Plan:

o The DBMS executes the chosen execution plan.

o If an index scan was selected, the DBMS uses the index to locate the rows where Salary
> 50000 without reading the whole table.

o From the matching rows, only the FirstName, LastName, and Department columns are
returned.

• Returning the Results:

o The DBMS assembles the final result set containing the names and departments of
employees earning more than $50,000.

o The results are sent back to the user or application that issued the query.

Through these steps, the DBMS ensures that the query is executed efficiently, even for large
datasets or complex conditions. This process helps maintain the performance and reliability of the
database system.
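To try the example end to end, the sketch below runs the same query against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the four sample rows are assumptions made for illustration; they do not come from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, "
    "Department TEXT, Salary REAL)"
)

# Invented sample data; one salary sits exactly on the 50000 boundary.
sample_rows = [
    (1, "Asha", "Rao", "Engineering", 72000),
    (2, "Ben", "Cole", "Support", 41000),
    (3, "Chen", "Li", "Finance", 50000),
    (4, "Dana", "Ortiz", "Engineering", 65000),
]
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?, ?)", sample_rows)

# The query from the example, unchanged.
results = conn.execute(
    "SELECT FirstName, LastName, Department "
    "FROM Employees "
    "WHERE Salary > 50000"
).fetchall()

for first_name, last_name, department in results:
    print(first_name, last_name, department)
```

Only the two employees earning strictly more than 50000 are returned. The employee earning exactly 50000 is excluded because the condition uses > rather than >=.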

Conclusion

Query processing in a DBMS is a vital mechanism that ensures efficient and accurate execution of
queries. By breaking query handling into steps such as parsing, optimization, and evaluation,
the DBMS can identify a good execution plan, resulting in faster and more efficient data retrieval.
Understanding these stages helps you optimise query performance and improve overall system
efficiency.

What is Data Abstraction in DBMS?

Data abstraction is the process of hiding unwanted or irrelevant details from the end user. It
gives different users different views of the data and helps achieve data independence. Limiting
what each user can see also enhances the security of the data.

Database systems consist of complicated data structures and relations. So that users can access
the data easily, these complications are kept hidden, and only the relevant part of the database is
made accessible to the users through data abstraction.

Levels of abstraction for DBMS

Database systems include complex data structures. To reduce complexity for users when they
retrieve data, and to keep the system efficient, developers use levels of abstraction that hide
irrelevant details from the users. Levels of abstraction also simplify database design.

There are three main levels of abstraction for a DBMS:

• Physical or Internal Level
• Logical or Conceptual Level
• View or External Level

Physical or Internal Level

This is the lowest level of abstraction in a DBMS. It defines how the data is actually stored: the
data structures used to hold the data and the access methods used by the database. How to store
the data is decided by developers or database application programmers. The entire database is
described at this level in storage terms, which makes it a very complex level to understand. For
example, a customer's information appears as a table at the higher levels, but at this level it is
stored as blocks of storage whose sizes are measured in bytes, gigabytes, etc.

Database administrators (DBAs) decide which data is kept on which disk drive, how the data is
fragmented, where it is stored, and so on. They also decide whether the data is centralized or
distributed. Although we see the data in the form of tables at the view level, here it is actually
stored in the form of files. How the database is managed at the physical level depends entirely
on the DBA.

Logical or Conceptual Level

The logical level is the intermediate level, one step above the physical level. It describes what
data is stored in the database and what relationships exist among those data. It describes the
database as a whole: which tables are to be created and how those tables are linked.

It is less complex than the physical level. The logical level is used by developers and database
administrators (DBAs). Overall, the logical level contains tables (with their fields and
attributes) and the relationships among table attributes.

Example: Suppose we use the relational model to store the data. To store the data of a student,
the columns in the student table will be student_name, age, mail_id, roll_no, etc. We have to
define all of these at this level when we create the database. The data itself is stored in the
database, but the structure of the tables (the student table, teacher table, books table, etc.)
is defined here at the conceptual or logical level, along with how the tables are related to
each other. In short, we are creating a blueprint of the data at the conceptual level.
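The blueprint idea can be shown with a schema definition. The sketch below declares the student table from the example and a books table linked to it, again using Python's sqlite3 module as a stand-in DBMS. The column types and the books columns (book_id, title, issued_to) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical-level definitions: which tables exist and how they relate.
conn.executescript(
    """
    CREATE TABLE student (
        roll_no      INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        age          INTEGER,
        mail_id      TEXT
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        issued_to INTEGER REFERENCES student (roll_no)
    );
    """
)

# The DBMS records this schema in its catalog, separately from the stored rows.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# PRAGMA table_info returns one row per column; the name is the second field.
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

print(tables)
print(student_columns)
```

Nothing here says how or where the rows are stored on disk. That decision belongs to the physical level.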

View or External Level

This is the highest level. At the view level there are many different views, and each view
defines only a part of the entire data. It simplifies interaction with the user by providing
multiple views of the same database.

The view level can be used by all users, whatever their role. It is the least complex level and
the easiest to understand.

For example, a user can interact with the system through a GUI, which is the view level, and
enter details on screen without knowing how the data is stored or what other data is stored. These
details are hidden from the user.

Example: If we have a login ID and password in a university system, then as a student we can
view our marks, attendance, fee structure, etc. A member of the university faculty will have a
different view, with options such as viewing their salary, editing a student's marks, and
entering student attendance. So the student and the faculty member have different views, which
also increases the security of the system: the student cannot edit their marks, but a faculty
member who is authorized to do so can. Similarly, the dean of the college or university will
have further authorization and a view to match. Different users thus have different views
according to the authorization they have.
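In SQL, one way to provide such per-role views is CREATE VIEW. The sketch below defines a narrow student view and a wider faculty view over the same hypothetical marks table, using Python's sqlite3 module as a stand-in DBMS. The table layout, the sample rows, and the view names are all assumptions. SQLite has no user accounts, so tying each view to a login is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE marks (
        roll_no      INTEGER,
        student_name TEXT,
        subject      TEXT,
        marks        INTEGER,
        grader_note  TEXT
    );
    INSERT INTO marks VALUES
        (1, 'Asha', 'DBMS', 88, 'late submission'),
        (2, 'Ben',  'DBMS', 74, 'regraded');

    -- Students see their marks but not the internal grader notes.
    CREATE VIEW student_view AS
        SELECT roll_no, subject, marks FROM marks;

    -- Faculty see every column.
    CREATE VIEW faculty_view AS
        SELECT roll_no, student_name, subject, marks, grader_note FROM marks;
    """
)


def column_names(view):
    # cursor.description holds one entry per result column; the name comes first.
    cursor = conn.execute(f"SELECT * FROM {view}")
    return [entry[0] for entry in cursor.description]


student_cols = column_names("student_view")
faculty_cols = column_names("faculty_view")

print(student_cols)
print(faculty_cols)
```

Both views read the same stored rows; they differ only in which part of the data they expose.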

Advantages of Data Abstraction in DBMS

The abstraction levels within a Database Management System (DBMS) offer various advantages
to users and applications:

• Separation of Data Presentation and Storage: Abstraction levels establish a distinction
between how data appears to users or applications and how it is stored and retrieved by the
system. This separation enables modifications to the physical storage and access methods
without impacting the external or conceptual levels.

• Simplified Database Management: Abstraction levels help database administrators manage
the database effectively. They can implement changes to the physical storage and access
methods without disrupting users or applications interacting with the database.

• Performance Optimization: Abstraction levels allow the system to optimise physical storage
and access methods for better performance, while preserving the user or application
perspective of the data.

• Meaningful Data Representation: Abstraction levels enable users or applications to
perceive data in a meaningful manner without concerning themselves with the underlying
implementation details. This flexibility makes it simpler to adapt to evolving business
requirements and user preferences over time.
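The separation between storage and presentation can be checked with a small experiment: change something at the physical level and confirm that the same query still returns the same rows. The sketch below adds an index, using Python's sqlite3 module as a stand-in DBMS. The table contents and the index name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, student_name TEXT, age INTEGER)")

# 200 invented students with ages cycling through 18 to 22.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"student_{i}", 18 + i % 5) for i in range(1, 201)],
)

QUERY = "SELECT roll_no, student_name FROM student WHERE age = 20 ORDER BY roll_no"

# Result before any physical tuning.
before = conn.execute(QUERY).fetchall()

# Physical-level change: add an access path on the age column.
conn.execute("CREATE INDEX idx_student_age ON student (age)")

# Same query text, same logical schema, new physical layout.
after = conn.execute(QUERY).fetchall()

print(before == after)
print(len(after))
```

The query text and its results are unchanged. Only the way the DBMS may locate the rows has changed, which is the kind of modification the advantages above describe.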

Disadvantages of Data Abstraction in DBMS

Below are some drawbacks of data abstraction:

• Complexity for Developers: Data abstraction can be confusing for developers because of the
intricacies present at multiple levels within the database.

• Increased Navigation Complexity: Adding an extra layer to the code can make navigation
harder, potentially leading to difficulties in accessing and manipulating data.

• Limitations on Modifying DBMS Behavior: Changing the behavior of the DBMS at lower
abstraction levels can be a daunting task, or even impossible, because of the abstraction in
place.
