UNIT
01 Introduction to Databases
Names of Sub-Units
Introduction, Characteristics of the Database Approach, File Based Systems, Disadvantages of Files
Systems, Data Processing, Types of Data Processing, Access Methods, Data Models, DBMS Vs RDBMS,
Advantages of DBMS, Application of Databases
Overview
In this unit, you first learn the basics of database management systems. Next, the unit discusses
the differences between DBMS and RDBMS. Further, the unit discusses various applications of
databases. In the end, the unit outlines the various access methods and data models.
Learning Objectives
In this unit, you will learn to:
Understand the fundamentals of the Database Management System
Describe the differences between DBMS and RDBMS
Explain the various Applications of Databases
Outline the various Access Methods and Data Models
JGI JAIN
DEEMED-TO-BE UNIVERSITY
Relational Database Management System
Learning Outcomes
At the end of this unit, you would:
Examine the significance of Database and Database Management System
Assess Data model ideas
Analyse how DBMS and RDBMS principles are used
Examine the issues with DBMS vs RDBMS Methods
Pre-Unit Preparatory Material
[Link]
1.1 INTRODUCTION
The term “data” is the plural of the Latin word ‘datum’, which means ‘something given’. Data
forms the key ingredient of any information system: without data, an information system cannot
produce any type of information. Data plays such an important role because it represents facts,
observations, assumptions and occurrences regarding the people, processes, functions and events
related to an organisation’s internal and external environment. Data or information in a structured
form is necessary to support business processes in terms of decision making and improved
efficiency.
1.2 DATA AND INFORMATION
Effective use of data enables an organisation to streamline the process of producing products as per the
requirements of customers. Therefore, data has to be present in a structured form, i.e., in a form from
which some relevant information can be derived after processing it. The processed form of data is called
information, which supports business decision making and facilitates efficient business processes.
In an organisation, data can be available in various forms such as text, graphics, audio, video and pre-
specified information.
To perform its business functions and processes effectively, an organisation needs to collect data related
to its target market, customers and competitors. However, data seems to be useless until it is processed
to extract the desired results. When data is processed and converted into a form that has a specific
meaning, it becomes information. For example, when a market researcher asks people to complete
questionnaires about a product or a service, the collected questionnaires are data. When this data is
processed and analysed to prepare a market report, the resulting report is information. Thus, it can be
said that information is a well-processed form of data that has a specific meaning and purpose. In other
words, processed and interpreted data is called information, i.e., data has been evaluated and worked
upon, and some conclusions have been drawn from it. Information is created when data is organised into
charts, summaries, averages and ranked lists, which help an organisation to make decisions. Decisions
based on this acquired information are referred to as “informed decisions”. Information is organised,
structured and derived by processing the data collected from various sources.
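The step from data to information can be illustrated with a minimal sketch. The product names and questionnaire scores below are made-up values used only for illustration: the raw scores play the role of data, while the averages and the ranked list derived from them play the role of information.

```python
from statistics import mean

# Raw data: hypothetical questionnaire scores (1 to 5) collected per product.
responses = {
    "Product A": [4, 5, 3, 4],
    "Product B": [2, 3, 2, 3],
    "Product C": [5, 5, 4, 5],
}

# Processing: summarise each product's scores into an average.
averages = {product: mean(scores) for product, scores in responses.items()}

# Information: a ranked list that can support an informed decision.
ranking = sorted(averages, key=averages.get, reverse=True)

for position, product in enumerate(ranking, start=1):
    print(position, product, averages[product])
```

The individual scores say little on their own; the ranked averages are what a decision maker would actually read.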
1.3 DATABASE
A database is a collection of data that is organised in a logical and integrated manner. Such a
collection of data forms the basis for data storage and can be accessed for information
processing. Thus, a database is organised in such a way that the data in it can be easily accessed, managed
and updated.
A database provides data for many business applications as and when required. Examples of databases
are as follows:
Train booking database
Airlines booking database
Employee details database
Cricket database
Sales database
The main features of a database are as follows:
It should be well-organised.
It should be relevant.
It should be easily accessible/retrievable.
It should provide an easy base for data processing.
1.4 CHARACTERISTICS OF THE DATABASE APPROACH
Database approach is significantly superior to traditional file management systems. The database
approach offers several qualities that make it more durable. Let’s have a look at the major features of
the database approach.
1. Manages Information: Because information is always beneficial for whatever task we do, a database
takes care of its information. It keeps track of all the data we need. We become more intentional
users of our data by managing information in a database.
2. Easy Operation Implementation: All operations, including insert, delete, update and search, are
performed in a flexible and user-friendly manner. These operations are relatively straightforward to
carry out using a database, even for a person with limited technical knowledge.
This ease of use adds to the power of a database.
3. Multiple Views of Database: A view is essentially a subset of the database. A view is created for and
dedicated to a certain user or group of users of the system. Different users may have differing
perspectives on the same database. Every view shows only the data that is relevant to a single person
or a group of users. The users need not be aware of how and where their data is stored.
4. Data for Specific Purpose: A database is a collection of data with a defined purpose. A database for
a student management system, for example, is used to keep track of a student’s grades, fees and
attendance. This information is used to keep track of students’ progress.
5. Having Users of Specific Interest: A database usually has an intended group of users and
applications in which these users are interested. In a library system, for example, there are three
groups of users: the college’s formal administration, the librarian and the students.
6. Represent Some Aspects of Real-World Applications: A database is a representation of some
characteristics of real-world applications. Any change in the real world is reflected in the database.
Consider a railway reservation system: the database serves specific applications that keep track of
attendance, waiting lists, train arrival and departure times, specific days and so on, for each train.
If anything changes in the real-world reservation system, the change is reflected in the database as well.
7. Self-Describing Nature: A database is self-describing: along with the data, it holds a description of
the whole data structure, as well as the constraints that apply to it. This distinguishes
it from earlier file management systems, in which the data definition was written into the application
software itself rather than stored with the data. When needed, users and the DBMS software refer to
these stored definitions.
8. Logical Relationship Between Records and Data: A database establishes a logical link between its
entries and information. As a result, a user may use a single database query to retrieve a variety of
entries based on logical requirements.
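Several of the characteristics above can be tried out in a few lines. The following is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database; the `student` table, its columns and the sample rows are assumptions made for illustration, not part of any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Define the structure; the definition is stored inside the database itself.
cur.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " grade TEXT,"
    " fees_due REAL)"
)

# Insert, update and delete are each a single statement.
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "A", 0.0), (2, "Ravi", "B", 1500.0), (3, "Meena", "A", 200.0)],
)
cur.execute("UPDATE student SET grade = 'A' WHERE roll_no = 2")
cur.execute("DELETE FROM student WHERE roll_no = 3")

# A view exposes only the columns relevant to one group of users.
cur.execute("CREATE VIEW grade_view AS SELECT roll_no, name, grade FROM student")
grades = cur.execute("SELECT * FROM grade_view ORDER BY roll_no").fetchall()
print(grades)

# Self-describing nature: the catalog lists the stored definitions.
catalog = cur.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
conn.close()
```

A user who reads `grade_view` sees grades but not fees, and never needs to know how the rows are laid out on disk.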
1.5 FILE BASED SYSTEMS
File-based data systems are systems that are used to manage and maintain data files. To retrieve
information from such files, the entries are either searched sequentially or an indexing system could
be used to locate such information. This works fine when the number of items stored is very small.
Sometimes, it also works fine if the data is needed to be stored and retrieved only, even if the number of
items stored is quite large.
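The sequential search described above can be sketched as follows. The employee entries are made-up sample data, and `io.StringIO` merely stands in for a data file on disk so that the example is self-contained.

```python
import csv
import io

# A small data file; io.StringIO stands in for a file on disk.
data_file = io.StringIO(
    "emp_id,name,department\n"
    "101,Anita,Sales\n"
    "102,Vikram,Accounts\n"
    "103,Sunil,Sales\n"
)

def find_employee(file_obj, emp_id):
    """Read the entries one by one until the requested one is found."""
    file_obj.seek(0)
    for row in csv.DictReader(file_obj):
        if row["emp_id"] == emp_id:
            return row
    return None  # reached the end of the file without a match

match = find_employee(data_file, "102")
print(match["name"])
```

Every lookup starts again from the first entry, which is acceptable for a handful of entries but becomes slow as the file grows.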
Following are the features of a file-based data management system:
Any user can benefit from a file-based system for basic data management.
The data in the file-based system has to be consistent. The consistency attribute should not be
affected by any transactions performed in the file-based system.
Any unlawful or possibly harmful activities on the data should be prohibited by the file-based system.
The file-based system should allow many processes to access the same data at the same time, and
this should be properly coordinated.
The data should be uniformly organised and kept in the file-based system to make it easy to retrieve.
1.5.1 Disadvantages of File Systems
The following are some of the main drawbacks of file-based systems:
Capacity constraints
Functionality is limited
There is less security
Data inconsistency increases
There are no backup or recovery options
These drawbacks can be explained as follows:
The file-based system has a limited storage capacity and cannot handle huge volumes of data.
Because this system is simple, it is unable to handle complex searches, data recovery and other
tasks.
Because the file-based system lacks a sophisticated method for removing duplicate data, the same
data may be stored more than once, which leads to inconsistency.
In a file-based system, data is vulnerable to corruption and destruction.
In a file-based system, data files can be stored in numerous locations. As a result, it’s difficult to
readily exchange data with numerous people.
1.6 DATA PROCESSING
Data processing is a technique for manipulating information. It refers to the transformation of
raw data into information that is both meaningful and machine-readable. It can also refer to the
processing of business data using automated methods, which usually involves very basic, repetitive
tasks applied to huge amounts of similar data. Raw data is the source that is processed to provide
useful results.
1.6.1 Types of Data Processing
Depending on the purpose of the data, several types of data processing procedures are available. The
five major forms of data processing are:
1. Commercial Data Processing
2. Scientific Data Processing
3. Batch Processing
4. Online Processing
5. Real-Time Processing
Let’s now discuss each in detail.
1. Commercial Data Processing: Commercial data processing is a way of using relational databases
in a commercial setting, which involves batch processing. It entails feeding the system a large amount
of data and producing a significant volume of output with fewer computing operations. It essentially
combines commerce and computers to benefit a company. Because the data handled by
this system is typically standardised, there is a considerably lower risk of mistakes. Many manual
tasks are automated using computers to make them easier and more error-free. In the business
world, computers are employed to transform raw data into information that is helpful to the
company. Accounting software is a good example of a data processing application.
2. Scientific Data Processing: Scientific data processing, unlike commercial data processing, makes
extensive use of computer processes while requiring fewer inputs and outputs. Arithmetic and
comparison operations are among the computing operations. Any risk of mistakes is unacceptable
in this sort of processing since it would lead to erroneous decisions. As a result, the process of
verifying, categorising and standardising the data is carried out with great care, and a variety
of scientific procedures are employed to guarantee that no incorrect correlations or conclusions
are formed. This takes longer than data processing in a business setting. Processing, managing
and distributing science data products are common examples of scientific data processing, as are
facilitating scientific analysis of algorithms, calibration data and data products, as well as keeping
all software, calibration data and data products under strict configuration control.
3. Batch Processing: Batch processing is a form of data processing in which several
cases are processed together. It is most commonly used when the data is homogeneous and available
in large amounts, so that it can be gathered and processed in batches. Batch processing can be
simultaneous, sequential or concurrent. Simultaneous batch processing happens
when all of the cases are processed at the same time by the same resource. Sequential batch
processing happens when distinct cases are processed by the same resource immediately one after
another.
When cases are processed by the same resources yet partially overlap in time, this is referred to as
concurrent batch processing. Batch processing is usually utilised in banking applications or areas
where higher degrees of security are necessary. The computational time is reduced in this processing
since the output is extracted by applying a function to the entire data set. It is capable of completing
tasks with relatively little human interaction.
4. Online Processing: In today’s database systems, “online” means “interactive”. Online processing is
the opposite of batch processing. Online processing engines, like traditional query processing
engines, may be built out of a variety of relatively basic operators. Online analytical
processing usually requires big chunks of data from large databases. Despite this, today’s online
analytical tools offer interactive performance. Precomputation is the
key to their success.
In most Online Analytical Processing systems, the response to each point and click is computed long
before the user even opens the programme. In reality, many online processing systems do the
calculation inefficiently, but because the processing is done ahead of time, the end-user is unaware of
the issue. This form of processing is employed when data must be processed continuously and is
supplied to the system automatically.
5. Real-Time Processing: Conventional data management systems generally limit the capability of
processing data on an as-needed basis because they depend on periodic batch updates,
resulting in a time lag of many hours between an event occurring and it being recorded or updated.
This necessitated the development of systems that could capture, update and process data on an
as-needed basis, i.e., in real time, decreasing the time lag between occurrence and processing to
practically zero. Huge amounts of data are being pushed into the systems of businesses; therefore,
storing and analysing it in real time would be a game changer for an organisation.
Most businesses seek real-time data insights to completely comprehend the environment both
within and beyond their walls. This necessitates the development of a system capable of real-time
data processing and analytics. This form of processing generates findings as they occur. The most
frequent technique is to collect data straight from its source, which is also known as a stream, and
make conclusions without having to transmit or download it. Data virtualisation techniques are
another important approach in real-time processing, in which useful information is extracted for
data processing while the data remains in its original form.
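The contrast between batch processing and real-time processing can be sketched in a few lines. The transaction amounts and the function names `process_in_batches` and `process_as_stream` are hypothetical and chosen only for illustration; a real system would read from files, queues or sensors rather than from a list.

```python
# Hypothetical transaction amounts, in the order in which they occurred.
transactions = [120.0, 75.5, 300.0, 42.25, 18.0]

def process_in_batches(records, batch_size):
    """Batch processing: gather records into groups, then handle each group at once."""
    totals = []
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        totals.append(sum(batch))  # one function applied to the whole batch
    return totals

def process_as_stream(records):
    """Real-time style: update the result as soon as each record arrives."""
    running_total = 0.0
    for amount in records:
        running_total += amount
        yield running_total  # an up-to-date result after every event

batch_totals = process_in_batches(transactions, batch_size=2)
stream_totals = list(process_as_stream(transactions))
print(batch_totals)
print(stream_totals)
```

In the batch version a result becomes available only once a whole batch has been collected, whereas the stream version produces a current result after every single event.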
1.7 ACCESS METHODS
The part of a computer’s operating system responsible for structuring data sets and directing them
to specified storage devices is known as an access method. Data access methods are the instructions
or procedures used to process queries and to retrieve and change stored data into useful information.
The data to be queried is stored in database objects such as tables. Random and sequential access are
the two basic types of access methods. The simplest access
method is sequential access. Information is read in order, starting at the
beginning and reading one record after the other. This mode of access is the most common.
Sequential access is more convenient when the storage medium is magnetic tape, rather than disk.
The random access method is of great use for immediate access to large amounts of information. When a
query concerning a particular subject arrives, the block that contains the answer is computed,
and then that block is read directly to provide the desired information.
Other access methods can also be used such as Indexed sequential access method. These methods
generally use an index for the file. The index block contains pointers to the various blocks of the file. To
find a record in the database or file, first the index block is searched where the pointers to the block are
stored, and then the pointer is used to access the file directly and to find the desired record.
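The three access methods described above can be compared in a minimal sketch. It assumes fixed-length records of 16 bytes and uses `io.BytesIO` in place of a file on disk; the record size, the names stored and the helper functions are all illustrative assumptions, and the dictionary is only a stand-in for an index block.

```python
import io

RECORD_SIZE = 16  # assumed: every record occupies exactly 16 bytes

names = ["Asha", "Ravi", "Meena", "Sunil"]
storage = io.BytesIO()  # stands in for a file on disk
for name in names:
    storage.write(name.encode("ascii").ljust(RECORD_SIZE))  # pad with spaces

def read_sequential(f, target):
    """Sequential access: read records in order from the beginning."""
    f.seek(0)
    reads = 0
    while True:
        chunk = f.read(RECORD_SIZE)
        if not chunk:
            return None, reads  # end of file, target not found
        reads += 1
        if chunk.decode("ascii").strip() == target:
            return reads - 1, reads  # (record position, records read)

def read_random(f, position):
    """Random access: compute the record's offset and read it directly."""
    f.seek(position * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").strip()

# Indexed access: look up the key in the index, then follow the pointer.
index = {name: position for position, name in enumerate(names)}

position, reads = read_sequential(storage, "Meena")
print(position, reads)
print(read_random(storage, 3))
print(read_random(storage, index["Ravi"]))
```

Sequential access had to read three records to reach the third one, while random and indexed access each needed a single read once the position was known.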
1.8 DATA MODELS
A database model can be defined as the theoretical foundation of a database, which fundamentally
determines the manner in which data can be stored, retrieved, or manipulated inside a database
system. A database model not only defines the way to structure the database but also defines the set of
operations that can be performed on the data.
The conceptual and external levels of database abstraction use certain data structures, which help
in utilising the database effectively. However, database users raise several questions, such as:
How are these data structures decided? What data structures and associated operators should a system
support? The answers to these questions depend upon the approach or model used for database
management.
The following are three kinds of data models:
The relational data model: In the relational data model, data is stored in tables, which consist
of several rows and columns. These tables are referred to as relations, in which a row represents a
relationship among a set of values. A table can therefore be considered a collection of such
relationships. The term “relation” is derived from the mathematical concept of a relation,
which is similar.
The hierarchical data model: In the hierarchical model, data is organised in a tree-like structure.
Records are connected through parent-child links, in which each child record has exactly one parent
record, while a parent record can have several child records.
The network data model: In the network model, the data is represented as records, and the
relationships among them are represented as links. In the network database system, records are
connected to one another using links. Therefore, a link can also be referred to as an association
between two records.
In the Network model, links are implemented by adding pointer fields to the record at the time of
mapping to files. There must be one pointer field for each link with which it is associated. Various
operations are performed on the network database with the help of data manipulation language.
Some operations that can be performed in the network database are find, insert, modify, and
delete. The insertion and deletion of records involve the use of connect, disconnect, and reconnect
operations.
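The idea of records joined by links, with connect and disconnect operations, can be sketched as follows. This is only an in-memory illustration, not an actual network DBMS; the `Record` class, the `connect` and `disconnect` functions and the sample names are assumptions, and a Python list of references plays the role of the pointer fields.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare records by identity, not by content
class Record:
    """A record whose links are kept as pointer fields (object references)."""
    name: str
    links: list = field(default_factory=list)

def connect(owner, member):
    """Create a link (association) between two records."""
    if member not in owner.links:
        owner.links.append(member)

def disconnect(owner, member):
    """Remove the link between two records, if it exists."""
    if member in owner.links:
        owner.links.remove(member)

department = Record("Sales")
employee_a = Record("Anita")
employee_b = Record("Sunil")

connect(department, employee_a)
connect(department, employee_b)
disconnect(department, employee_b)

# Navigation follows the links from one record to the next.
linked_names = [member.name for member in department.links]
print(linked_names)
```

In the relational model the same association would instead be expressed by a value shared between two tables, not by a stored pointer.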
1.9 DATABASE MANAGEMENT SYSTEM (DBMS)
A Database Management System (DBMS) is software that stores and retrieves data for users while taking
necessary security precautions. It is made up of a collection of programs that manipulate the database.
The DBMS accepts an application’s request for data and directs the operating system to supply the
requested data. A DBMS helps users and other third-party applications to store and retrieve data in
large systems.
A DBMS is a software package for defining, manipulating, retrieving and managing data in databases.
A database management system manipulates data, data format, field names, record structure and
file structure. It also lays forth the rules for validating and manipulating the data. As the profession
of database administration advances, database management solutions are built on particular data
handling ideas. The first databases only dealt with particular pieces of data that had to be structured in
a specific way. Today’s more advanced systems can handle a variety of less structured data and connect
it in more complex ways.
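The tasks named above (defining, manipulating, validating and retrieving data) can be seen together in a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for a DBMS; the `account` table, its columns and the rule that a balance may not be negative are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: field names, types and a validation rule.
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " holder TEXT NOT NULL,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)

# Data manipulation: a valid row is accepted.
conn.execute("INSERT INTO account VALUES (1, 'Anita', 500.0)")

# The DBMS itself, not the application, rejects data that breaks the rule.
try:
    conn.execute("INSERT INTO account VALUES (2, 'Sunil', -50.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Data retrieval.
rows = conn.execute("SELECT acc_no, holder, balance FROM account").fetchall()
print(rejected, rows)
conn.close()
```

Because the rule lives in the database definition, every application that uses the table is subject to the same validation.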
1.9.1 Advantages of DBMS
The advantages of DBMS are as follows:
1. Improved data sharing: The database management strategy has the advantage of assisting in the
creation of an environment where end users have greater access to more and better-managed data.
End users can respond rapidly to changes in their environment with this kind of access.
2. Improved data security: The higher the number of people that have access to the data, the greater
the danger of a data security breach. Corporations spend a lot of time, effort and money to make
sure that their data is used correctly. A database management system (DBMS) provides a framework
for enforcing data privacy and security standards more effectively.
3. Better data integration: Access to well-managed data allows for a more holistic perspective of
the organisation’s activities and a better image of the big picture. It’s a lot simpler to observe how
activities in one part of the organisation influence the rest of the company.
4. Minimised data inconsistency: Data inconsistency exists when different versions of the same data
appear in different places. For example, data inconsistency exists when a company’s sales department
stores a sales representative’s name as “Aniket Sharma” but the company’s personnel department
stores the same person’s name as “Aniket K. Sharma”, or when a company’s regional
sales office lists a product for $45.95 but its national sales office lists the identical product for
$43.95. In a well-designed database, the chances of data inconsistency are significantly reduced.
5. Improved data access: The DBMS enables the generation of fast responses to ad hoc queries. A query
is a specific request sent to the database management system (DBMS) for data manipulation,
such as reading or updating data. Simply put, a query is a question, and an ad hoc query is a
question asked on the spur of the moment. The DBMS sends a response
(known as the query result set) back to the application.
6. Improved decision making: Better-managed data and enhanced data access allow for the generation
of higher-quality information, which may then be used to make better decisions. The quality of the
underlying data determines the quality of the information generated. Data quality refers to a holistic
approach to ensuring data correctness, validity and timeliness. While the database management system
(DBMS) does not guarantee data quality, it establishes a foundation for data quality activities.
7. Increased end-user productivity: End users can make rapid, educated decisions based on the
availability of data and the tools that turn data into useful information, which can be the difference
between success and failure in the global economy.
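Two of the advantages above, minimised data inconsistency and ad hoc queries, can be illustrated with a small sketch built on Python's `sqlite3` module. The `employee` and `sale` tables, their columns and the sample rows are assumptions chosen to echo the examples in the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, emp_id INTEGER, amount REAL)")
conn.execute("INSERT INTO employee VALUES (7, 'Aniket Sharma')")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?)", [(1, 7, 45.95), (2, 7, 43.95)])

# The name is stored once, so a single update corrects it for every department.
conn.execute("UPDATE employee SET name = 'Aniket K. Sharma' WHERE emp_id = 7")

# An ad hoc query: number and total of sales per representative.
result_set = conn.execute(
    "SELECT e.name, COUNT(*), ROUND(SUM(s.amount), 2)"
    " FROM sale AS s JOIN employee AS e ON e.emp_id = s.emp_id"
    " GROUP BY e.emp_id"
).fetchall()
print(result_set)
conn.close()
```

Neither the sales department nor the personnel department keeps its own copy of the name, so the two cannot drift apart.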
1.9.2 Disadvantages of DBMS
The disadvantages of DBMS are as follows:
1. Increased costs: Database systems need complex hardware and software, as well as highly
experienced employees, which is one of the drawbacks of DBMS. The cost of maintaining the hardware,
software and employees needed to run and administer a database system may be significant. When
database systems are installed, expenditures such as training, licencing and regulatory compliance
are sometimes ignored.
2. Management complexity: Database systems work with a variety of technologies and have a big
influence on a company’s resources and culture. The changes brought on by
the implementation of a database system must be effectively managed to ensure that they help the
organisation achieve its goals. Because database systems store critical corporate data that is
accessible from many sources, security concerns must be reviewed regularly.
3. Maintaining currency: To maximise the efficiency of the database system, you must keep it current.
As a result, you must keep all components up to date and apply the latest patches and security
measures. Personnel training expenses are often high due to the fast advancement of database
technology.
4. Dependence on vendors: Companies may be hesitant to switch database vendors due
to significant investments in technology and personnel training. As a result, vendors are less likely
to offer price discounts to existing customers, and those customers may have fewer
options for database system components.
5. Frequent upgrade/replacement cycles: Vendors of database management systems often update
their products by introducing new features. These new features are frequently bundled in new
upgrade versions of the software. Some of these versions require hardware upgrades. Not only do
the upgrades themselves cost money, but so does training database users and administrators to use
and manage the new features effectively.
1.10 APPLICATION OF DATABASES
Databases are created to store information. Various fields in which database is utilised are as follows:
Railway Reservation System: In the railway reservation system, the database is required to store
information such as availability of tickets, status of trains, passenger details etc. The information
stored in the database is updated on a timely basis.
Library Management System: A library holds a lot of books, which must be managed
properly so that students can get them easily when required. The database is used to store various
types of information related to books, such as details regarding issuance of books, student details,
book availability details etc.
Banking: In the banking sector, the database is used to store information related to customers, such as
their address, phone number and account balance. The database also contains information
about the employees working in the bank.
Human Resource (HR) Database: HR databases are used by the professionals responsible for managing
human resources in businesses to store the personal information of all the employees in the
organisation. Besides personal information,
an HR database helps in recording many other types of HR-related data, such as training and
recruitment details. Data stored in HR databases can also include managers’ information, the
number of holidays and absences, standard working hours, clocking on and off times, timesheets,
overheads, etc., which can help HR professionals with workforce management.
Hospital Management System: Databases are created to store the records of the patients as well as
other details that need to be maintained regarding doctors, medicines, services and equipment.
Online Shopping: Nowadays, web-based shopping or online shopping is very popular. People
avoid visiting shops to save travelling time and find it easier to shop from home through web-based
shopping sites such as Amazon, Flipkart, Myntra etc. These websites use databases to
store information about the items available or sold, as well as sensitive
information about their customers.
1.11 RELATIONAL DATABASE MANAGEMENT SYSTEM (RDBMS)
The RDBMS or Relational Database Management System is a database management system that stores
data in tables that have connections with other tables in the database. A relational database is a form
of database that stores and gives access to data elements that are related to one another. The relational
model, a simple and intuitive way of representing data in tables, is the foundation of relational
databases. Each row in a table in a relational database is a record with a unique ID called the key.
The columns of the table hold the attributes of the data, and each record generally contains a value
for each attribute, making it simple to build associations between data points.
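The role of keys in connecting tables can be shown with a minimal sketch that uses Python's built-in `sqlite3` module. The `train` and `booking` tables, their columns and the sample rows are assumptions made for illustration; note that SQLite enforces foreign keys only after the `PRAGMA foreign_keys = ON` statement has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Each row of train is a record identified by its key, train_no.
conn.execute("CREATE TABLE train (train_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE booking ("
    " booking_id INTEGER PRIMARY KEY,"
    " passenger TEXT NOT NULL,"
    " train_no INTEGER NOT NULL REFERENCES train(train_no))"
)

conn.execute("INSERT INTO train VALUES (101, 'Morning Express')")
conn.execute("INSERT INTO booking VALUES (1, 'Asha', 101)")

# A booking for a train that does not exist would break the connection between the tables.
try:
    conn.execute("INSERT INTO booking VALUES (2, 'Ravi', 999)")
    link_enforced = False
except sqlite3.IntegrityError:
    link_enforced = True

# The shared key makes it simple to associate each booking with its train.
joined = conn.execute(
    "SELECT b.passenger, t.name FROM booking AS b"
    " JOIN train AS t ON t.train_no = b.train_no"
).fetchall()
print(link_enforced, joined)
conn.close()
```

The connection between the two tables is nothing more than the value of `train_no` appearing in both, which is what allows the join to bring related rows together.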
1.12 DBMS VS RDBMS
A database management system is a software programme that allows users to store information. Several
implementations and theories exist in database design for storing physical data. The RDBMS or
Relational Database Management System is a database management system that stores data in tables
that have connections with other tables in the database. There are no connections between tables in a
DBMS (Database Management System).
Now that we know what RDBMS and DBMS mean, we can talk about the distinctions between the two.
Here are some of the differences we see straight away when comparing DBMS and RDBMS:
Data is stored in files in a DBMS, whereas information is stored in tables in an RDBMS.
An RDBMS may be used by numerous users, whereas a DBMS can only be used by one.
An RDBMS allows for client-server interaction and
architecture, whereas a DBMS does not.
In terms of hardware and software, a DBMS is less demanding than an RDBMS. You’ll need a more
powerful computer to run an RDBMS effectively.
Data redundancy is possible in a DBMS, i.e., data might
be repeated. In an RDBMS, however, keys and indexes help to prevent redundant data.
Table 1: Differences between DBMS and RDBMS
DBMS | RDBMS
Data is often stored in a DBMS in either a hierarchical or navigational format. | Tables in an RDBMS contain a primary key identifier, and data values are kept in the form of tables.
Because a DBMS stores data on a file system, there will be no relationship between the tables. | Data values are kept in the form of tables in an RDBMS, thus a connection between these data values will also be recorded as a table.
The database management system (DBMS) must provide certain standard methods for accessing the data. | To retrieve the stored information, the RDBMS offers a tabular structure of the data and a relationship between them.
DBMSs are designed for small businesses that deal with small amounts of data. A DBMS can only be used by one person. | A relational database management system (RDBMS) is built to handle huge amounts of data. It may be used by several people.
Distributed databases are not supported by DBMS. | Distributed databases are supported by RDBMS.
File systems, XML and other DBMS are examples. | Examples of RDBMS are Microsoft Access (MS Access), Microsoft SQL Server, Oracle and OpenOffice Base.
1.13 CONCLUSION
A database management system (DBMS) is a set of programmes that maintains the database
structure and regulates access to the data contained in the database.
A DBMS provides special-purpose languages: a data definition language, a data manipulation language
and a query language.
A relational database management system (RDBMS) is a more sophisticated form of a database
management system (DBMS).
An RDBMS stores data in tabular format, i.e., information is stored in tables, and individual
data components can be accessed from those tables.
An RDBMS may be used by numerous users, whereas a DBMS can only be used by one user.
1.14 GLOSSARY
Data: A collection of unorganized facts, such as symbols, alphabets, or numbers, used for representing
ideas and objects
Information: The organized form of data
Database: A collection of information organised in such a way that it can easily be accessed, managed, or
updated by users
Database Management System (DBMS): The application that controls the creation, maintenance,
and use of a database
1.15 SELF-ASSESSMENT QUESTIONS
A. Essay Type Questions
1. Database is organised in such a way that the data in it can be easily accessed, managed and updated.
Write a short note on database and its features.
2. Database approach is significantly superior to traditional file management systems. Enlist the
various characteristics of database approach.
3. The file systems are inefficient while handling large amount of information. Discuss some features
and limitations of file based systems.
4. Data processing is a technique for manipulating information. What are its different types?
5. A database management system (DBMS) is software that stores and retrieves data for users while
taking necessary security precautions. Write some advantages of database management system.
1.16 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS
B. Hints for Essay Type Questions
1. Database refers to the collection of data that is organised in a logical and integrated manner. Refer
to section ‘Database’ for more details.
2. The database approach offers several qualities that make it more durable. Refer to section
‘Characteristics of database approach’ for more details.
3. File based data systems are systems that are used to manage and maintain data files. Refer to
section ‘File based systems’ for more details.
4. The five major types of data processing technique are Commercial Data Processing, Scientific Data
Processing, Batch Processing, Online Processing and Real-Time Processing. Refer to section ‘Data
Processing’ for more details.
5. DBMS aids users and other third-party applications in storing and retrieving data in big systems.
Refer to section ‘Database Management System’ for more details.
1.17 POST-UNIT READING MATERIAL
[Link]
[Link]
1.18 TOPICS FOR DISCUSSION FORUMS
Discuss the evolution of RDBMS in detail.