CHAPTER 1 - Introduction
Chapter 1 – Slide 1
BASIC DEFINITIONS
▪ Data –
• facts that are recorded and can be accessed
• facts that can be recorded and have an implicit meaning.
• Data formats – text, numbers, figures, graphics, images,
audio/video recordings and more.
• Data is recorded and kept because it is considered to be of use to
an intended user.
▪ Information - refers to the data that is accessed by a user
for some particular purpose
• Typically, getting the needed information from a collection of
data requires performing an activity, such as searching through,
processing, or manipulating the data in some form or fashion.
Chapter 1 – Slide 2
BASIC DEFINITIONS
▪ Database - structured collection of related data stored on a
computer medium
• Organizes the data in a way that facilitates efficient access to the
information captured in the data
▪ Database management system (DBMS) - software used for:
• Creation of databases
• Insertion, storage, retrieval, update, and deletion of the data in the
database
• Maintenance of databases
▪ Database system - computer-based system whose purpose is
to enable an efficient interaction between the users and the
information captured in a database
• The DBMS software together with the data itself.
• Sometimes, the applications are also included.
Chapter 1 – Slide 3
Operations with Databases
• Design
– Define structure and types of data
• Construction
– Create data structures of DB, populate DB with data
• Manipulation of Data
– Insert, delete, update
– Query: “Which department pays the highest salary?”
– Create reports:
“List monthly salaries of employees, organised
by department, with average salary and total
sum of salaries for each dept”
4
Databases Are Everywhere
• Database = a large (?) collection of related data
• Classically, a DB models a real-world organisation
(e.g., enterprise, university)
– Entities (e.g., students, courses)
– Relationships (e.g., “Martin is taking IDS in 2009/10”)
• Changes in the organisation = changes in the database
• Examples:
– personnel records
– banking
– airline reservations
5
Scientific Databases (Examples)
• Biology:
▪ e.g., DNA sequences of genes, amino-acid sequences
of proteins, genes expressed in tissues
▪ (up to several Gigabytes)
• Astronomy:
▪ e.g., location and spectra of astronomic objects
▪ (up to several Terabytes)
• Physics:
▪ e.g., sensor measurements in particle physics
experiments
▪ (up to several Petabytes)
6
DB Tendencies
• Data are recorded by sensors
DBs grow in size
DBs become more widespread
• Computers are becoming more powerful
DB Management Systems
can run on laptops
(and on phones—and soon on chip cards?)
• Multimedia data arise everywhere
Requirements for larger storage
New query operations
7
An Ideal DB Implementation
Should Support:
• Structure • Performance
– data types – retrieve and store data
– data behaviour quickly
• Persistence • Data Integrity
– store data on • Sharing
secondary storage – concurrency
• Retrieval • Reliability and resilience
– a declarative query • Large data volumes
language
– a procedural database
programming language
8
Database Management System (DBMS)
• A DBMS is a software package designed
to store and manage databases
• A DBMS provides generic functionality that
otherwise would have to be implemented over
and over again
Reduced application development time
• Several brands, e.g.,
– Oracle Xi/Yg (Oracle), DB2 (IBM), SQL Server, Access
(Microsoft), MySQL, PostgreSQL (open source)
9
DATABASE METADATA
▪ A DBMS catalog stores the description of a particular database
• e.g. data structures, types, and constraints
• The description is called meta-data.
▪ Database metadata – represents the structure of the database
• Database content that is not the data itself (data about the data)
• Contains:
o Names of data structures
o Data types
o Data descriptions
o Other information describing the characteristics of the data
▪ Metadata - data that describes the structure and the properties of
the data
• Metadata is essential for the proper understanding and use of the data
Chapter 1 – Slide 10
DATABASE METADATA
Data without metadata - example
Chapter 1 – Slide 11
DATABASE METADATA
Data with metadata - example
Chapter 1 – Slide 12
DATABASE METADATA
Chapter 1 – Slide 13
DATABASE SCHEMA VS. DATABASE STATE
▪ Database Schema
• Includes descriptions of the database structure, data types, and
the constraints on the database.
• Schema is also called intension.
• It changes very infrequently.
▪ Database State:
• The database content at a particular moment in time.
• Refers to the of a database at a moment in time.
• Also called instance (or occurrence or snapshot or extension).
• It changes every time the database is updated.
▪ Valid State:
• A state that satisfies the structure and constraints of the
database.
Chapter 1 – Slide 14
EXAMPLE OF A DATABASE SCHEMA
Chapter 1 – Slide 15
EXAMPLE OF A DATABASE STATE
Chapter 1 – Slide 16
DBMS LANGUAGES
1. DDL – Data Definition Language
2. DQL – Data Query Language
3. DML – Data Manipulation Language
4. DCL – Data Control Language
Chapter 1 – Slide
17
DBMS LANGUAGES
▪ Data Definition Language (DDL):
• Used by the DBA and database designers to specify the conceptual
schema of a database.
• In many DBMSs, the DDL is also used to define internal and external
schemas (views).
• CREATE (generates a new table)
• ALTER (alters table)
• DROP (removes a table from the database)
▪ Data Manipulation Language (DML):
• Used to specify database retrievals and updates
• Alternatively, stand-alone DML commands can be applied directly
(called a query language).
• INSERT: command to add new or new value to the database.
• UPDATE: command to change or update current/existing data to a more
recent value within the database.
• DELETE: command to delete or delete the values or data information of
the current table in the database.
Chapter 1 – Slide 18
TYPICAL DATABASE SYSTEM ARCHITECTURE
Chapter 1 – Slide 19
TYPICAL DATABASE SYSTEM ARCHITECTURE
Chapter 1 – Slide 20
EXAMPLE OF A DATABASE
▪ UNIVERSITY Registration System.
▪ Some entities:
• STUDENTs
• COURSEs
• SECTIONs (of COURSEs)
• (academic) DEPARTMENTs
• INSTRUCTORs
Chapter 1 – Slide 21
EXAMPLE OF A DATABASE
▪ Some relationships:
• SECTIONs are of specific COURSEs
• STUDENTs take SECTIONs
• COURSEs have prerequisite COURSEs
• INSTRUCTORs teach SECTIONs
• COURSEs are offered by DEPARTMENTs
• STUDENTs major in DEPARTMENTs
▪ Note: The above entities and relationships are typically
expressed in a conceptual data model, such as the
ENTITY-RELATIONSHIP data model.
Chapter 1 – Slide 22
EXAMPLE OF A SIMPLE DATABASE
Chapter 1 – Slide 23
Chapter 1 – Slide
24
Chapter 1 – Slide
25
TYPICAL DBMS FUNCTIONALITY
▪ Define a particular database in terms of its data types,
structures, and constraints
▪ Construct or Load the initial database contents on a
secondary storage medium
▪ Manipulating the database:
• Retrieval: Querying, generating reports
• Modification: Insertions, deletions and updates to its
content
• Accessing the database through Web applications
▪ Processing and Sharing by a set of concurrent users and
application programs – yet, keeping all data valid and
consistent
Chapter 1 – Slide 26
ADDITIONAL DBMS FUNCTIONALITY
▪ DBMS may additionally provide:
• Protection or Security measures to prevent unauthorized access
• Presentation and Visualization of data
• Data visualization is the graphical representation of information
and data. By using visual elements like charts, graphs, and maps,
data visualization tools provide an accessible way to see and
understand trends, outliers, and patterns in data. Examples : Tableau,
Zoho Analytics
• Backup and Restoring the database.
Chapter 1 – Slide 27
INTERACTION WITH DATABASE
▪ Front-end applications - provide a mechanism for easy
interaction between the users and the DBMS
▪ End-users (business-users) - users using a database
system to support their tasks and processes
▪ Indirect interaction - end-user communicating with the
database through front-end applications
▪ Direct interaction - end-user communicating with the
database directly through DBMS
Chapter 1 – Slide 28
INTERACTION WITH DATABASE
Direct vs. Indirect Interaction
Chapter 1 – Slide 29
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
Chapter 1 – Slide 30
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Requirements collection, definition, and visualization -
results in the requirements specifying which data the
future database system will hold and in what fashion, and
what the capabilities and functionalities of the database
system will be
• The collected requirements should be clearly defined and stated
in a written document, and then visualized
Chapter 1 – Slide 31
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Requirements collection, definition, and visualization
• Conceptual database model – a visualization of requirements by
using a conceptual data modeling technique (such as entity-
relationship [ER] modeling)
Chapter 1 – Slide 32
Chapter 1 – Slide 33
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
Iterative nature of the database requirements collection,
definition, and visualization process
Chapter 1 – Slide 34
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Requirements collection, definition, and visualization
Chapter 1 – Slide 35
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database modeling (logical database modeling ) -
creation of the database model that is implementable by
the DBMS software
• Logical database modeling follows conceptual database
modeling
Chapter 1 – Slide 36
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database modeling (logical database modeling )
Chapter 1 – Slide 37
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database implementation - using a DBMS to implement
the database model as an actual database
• Most modern databases are implemented using a relational
DBMS (RDBMS) software
• SQL (Structured Query Language) is a language used by most
RDBMS packages
Chapter 1 – Slide 38
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database implementation - using a DBMS to implement
the database model as an actual database
Chapter 1 – Slide 39
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Developing front-end applications - designing and
creating applications for indirect use by the end-users
• Front-end applications are based on the database model and the
requirements specifying the front-end functionalities
• Front-end applications contain interfaces (such as forms and
reports) accessible via a navigation mechanism (such as a menu)
Chapter 1 – Slide 40
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Developing front-end applications - designing and
creating applications for indirect use by the end-users
• Database front-end example
Chapter 1 – Slide 41
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database deployment - releasing the database system
for use by the end users
Chapter 1 – Slide 42
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database use - the insertion, modification, deletion and
retrieval of the data in the database system
Chapter 1 – Slide 43
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database use - the insertion, modification, deletion and
retrieval of the data in the database system
• Example of data in a database that can be inserted, modified,
deleted or retrieved
Chapter 1 – Slide 44
STEPS IN THE DEVELOPMENT OF DATABASE
SYSTEMS
▪ Database administration and maintenance - performing
activities that support the database end user, including
dealing with technical issues, such as:
• Providing security for the information contained in the database
• Ensuring sufficient hard-drive space for the database content
• Implementing the backup and recovery procedures
Chapter 1 – Slide 45
Database Actors
Database Application
Designers Programmers “on the scenes”
Database End Users
Administrator • sophisticated
• casual
(DBA) • ‘parametric’ or
‘canned’ transactions
Database
DBMS developers Operators and Maintenance
Tool Developers Personnel “behind the scenes”
Database Management System
Chapter 1 – Slide 46
PEOPLE INVOLVED WITH DATABASE SYSTEMS
▪ Database analysts, designers, and developers
• Database analysts - involved in the requirements collection,
definition, and visualization stage
• Database designers (a.k.a. database modelers or architects) -
involved in the database modeling stage
o Responsible to define the content, the structure, the constraints, and
functions or transactions against the database.
o They must communicate with the end-users and understand their
needs.
• Database developers – in charge of implementing the database
model as a functioning database using the DBMS software
Chapter 1 – Slide 47
PEOPLE INVOLVED WITH DATABASE SYSTEMS
▪ Front-end applications analysts and developers
• Front-end application analysts - in charge of collecting and
defining requirements for front-end applications
• Front-end applications developers - in charge of creating the
front-end applications
Chapter 1 – Slide 48
PEOPLE INVOLVED WITH DATABASE SYSTEMS
▪ Database administrators (DBAs) - perform the tasks
related to the maintenance and administration of a
database system.
▪ Responsible for
• authorizing access to the database
• coordinating and monitoring its use
• acquiring software and hardware resources
• controlling its use
• monitoring efficiency of operations.
Chapter 1 – Slide 49
PEOPLE INVOLVED WITH DATABASE SYSTEMS
▪ Database end users - They use the data for queries,
reports, and some of them update the database content.
• use a database system to support their work- or life-related tasks
and processes
▪ End-users can be categorized into:
• Casual: access database occasionally when needed
• Naïve or Parametric: they make up a large section of the end-
user population.
o They use previously well-defined functions in the form of “canned
transactions” against the database.
o Users of Mobile Apps mostly fall in this category
o Bank-tellers or reservation clerks are parametric users who do this
activity for an entire shift of operations.
o Social Media Users post and read information from websites
Chapter 1 – Slide 50
PEOPLE INVOLVED WITH DATABASE SYSTEMS
• Sophisticated:
o These include business analysts, scientists, engineers, others
thoroughly familiar with the system capabilities.
o Many use tools in the form of software packages that work closely
with the stored database.
• Stand-alone:
o Mostly maintain personal databases using ready-to-use packaged
applications.
o An example is the user of a tax program that creates its own internal
database.
o Another example is a user that maintains a database of personal
photos and videos.
Chapter 1 – Slide 51
OPERATIONAL VERSUS ANALYTICAL
DATABASES
▪ Operational information (transactional information) - the
information collected and used in support of day to day
operational needs in businesses and other organizations
▪ Operational database - collects and presents operational
information in support of daily operational procedures and
processes
▪ Analytical information - the information collected and used in
support of analytical tasks
• Analytical information is based on operational (transactional)
information
▪ Analytical database - collects and presents analytical
information in support of analytical tasks
Chapter 1 – Slide 52
Chapter 1 – Slide 53
ADVANTAGES OF USING THE DATABASE
APPROACH
▪ Controlling redundancy in data storage and in development
and maintenance efforts.
▪ Restricting unauthorized access to data.
▪ Providing Storage Structures (e.g. indexes) for efficient Query
Processing
▪ Providing optimization of queries for efficient processing.
▪ Providing backup and recovery services.
▪ Providing multiple interfaces to different classes of users.
▪ Enforcing integrity constraints on the database.
Chapter 1 – Slide 54
File System Approach
• Uncontrolled redundancy
• Inconsistent data
• Inflexibility
• Limited data sharing
• Poor enforcement of standards
• Low programmer productivity
• Excessive program maintenance
• Excessive data maintenance
Chapter 1 – Slide 55
DBMS Approach
• Controlled redundancy • Services & controls
– consistency of data &
– security & privacy
integrity constraints controls
• Integration of data – backup & recovery
– self-contained
– enforcement of
standards
– represents semantics
of application • Flexibility
• Data and operation – data independence
sharing – data accessibility
– multiple interfaces – reduced program
maintenance
• Ease of application
development 56
A DBMS Presents Programmers and Users with
a Simplified Environment
Users/Programmers
Database System
Queries / Application Programs
DBMS Software
Software to Process Queries / Programs
Software to Access Stored Data
“Catalogue”, Stored Database Stored
“Data dictionary” Definition Database
(Metadata)
Chapter 1 – Slide 57
Data Model, Schema and Instance
Data Model Schema
• A set of concepts that can • A formal definition that fixes
be used to describe the all the relevant features of
structure of a database: the those parts of the real
data types, relationships, world that are of interest to
constraints, semantics and the users of the database
operational behaviour
• The schema of a db is held
• Hides details of data
storage in the data dictionary
Student(studno,name,address)
Schema Course(courseno,lecturer)
(in relational data model)
Student(123,Egger,Bozen)
Instance Course(CS321,Nutt)
58
Copyright (c) 2020 Nenad Jukic and Prospect Press
Other Data Models
Relational model is good for:
• Large amounts of data and simple operations
• Limited navigation, touching only small numbers of
relations/tables
Difficult applications for relational model:
• VLSI design (CAD in general)
ADDER
ALU A
FA
CPU ADDER
ALU
ADDER
• CASE
• Graphical data
• Bill of materials, transitive closure Chapter 1 – Slide 59
Object Data Models
Where number of “relations” is large, relationships are complex
• Object Data Model
• “Knowledge Data Model” (= Objects + Deductive Rules)
Object Data Model (Principles)
1. Complex Objects –
Nested Structure (pointers or references)
2. Encapsulation, set of methods/access functions
3. Object Identity
4. Inheritance – Defining new classes like old classes
Object model: usually, objects are found via explicit navigation.
Also query language in some systems.
Chapter 1 – Slide 60
Data Models
60’s
Hierarchical Network
70's
80's Choice for most
Relational
applications today
90’s
Object Bases Knowledge Bases, Rules
00’s
Semistructured Data, XML Semantic Web, RDF
Chapter 1 – Slide 61
Sharing—Multiple views of data
Database Management System
Database
Chapter 1 – Slide 62
Characteristics of the DB Approach
• Insulation of programs and data
from each other
• Support of multiple user views
• Use of a catalogue to store the schema
How can one realise these principles?
Chapter 1 – Slide
63
Three Levels of Abstraction
ANSI/SPARC architecture for DBMSs (1978):
• Many external views
• One conceptual View 1 View 2 View 3
(= logical) schema mappings
• One physical
(= internal) schema Conceptual Schema
– Views describe how
users see the data mapping
– Conceptual schema Physical Schema
defines logical
structure
– Physical schema
describes the files and
indexes used Chapter 1 – Slide 64
Data Independence
New New functions
hardware Change in
use
New
users New data
Database
User's
view
Change in
New storage Linkage to other technology
techniques databases
• Logical data independence
– change the logical schema without having to change the
external schemas
• Physical data independence
– change the internal schema without having to change the
logical schema
Change the mapping, not the schema!
Chapter 1 – Slide 65
Database Languages
• Data Definition Language (DDL)
– Commands for setting up the schema of a database
– The process of designing a schema can be complex,
may use a design methodology and/or tool
• Data Manipulation Language (DML)
– Commands to manipulate data in database:
RETRIEVE, INSERT, DELETE, MODIFY
– Also called “query language”
Chapter 1 – Slide 66
Host Languages
C, C++, Fortran, Lisp, Java, Perl, …
Application prog.
Calls to
DB DBMS
Local Vars
(Memory)
(Storage)
Host language is completely general (Turing complete)
but gives no support for data manipulation
Query language—less general, “non procedural” and
optimizable Chapter 1 – Slide 67
Building an Application with a DBMS
• Requirements gathering (natural language, pictures)
• Requirements modeling (conceptual data model, ER)
– Decide what entities should be part of the application
and how they should be related
• Schema design and implementation
– Decide on a set of tables, attributes
– Create the tables in the database system
– Populate database (insert records/tuples)
• Write application programs using the DBMS
– … a lot easier now that the data
management is taken care of
Chapter 1 – Slide 68
category Conceptual
name Modeling name
ssn
cid
Course
Student
Takes
quarter
Advises Teaches
Professor
office name faculty
Chapter 1 – Slide
69
Schema Design and Implementation
• Tables:
Student: Takes:
SSN Name Category SSN CID
123-45-6789 Charles undergrad 123-45-6789 CSE444
234-56-7890 Dan grad 123-45-6789 CSE444
… … 234-56-7890 CSE142
…
Course:
CID Name Quarter
CSE444 Databases fall
CSE541 Operating systems winter
• The logical schema separates the logical view
from the physical view of the data.
Chapter 1 – Slide 70
Querying a Database
• “Find all courses that Mary takes”
• S(tructured) Q(uery) L(anguage)
▪ select [Link]
from Student s, Takes t,
Course c
where [Link] = ’Mary’ and
[Link] = [Link] and
[Link] = [Link]
• The query processor figures out
how to answer the query efficiently
Chapter 1 – Slide 71
Query Optimization
Goal: Declarative SQL query Query execution plan
select [Link] [Link]
from Student s, Takes t,
Course c
where [Link] = ’Mary’ and
[Link] = [Link] and cid=cid
[Link] = [Link]
sid=sid
name= ’Mary’
Student Takes Course
Plan: Tree of relational algebra operators,
choice of algorithm for each operator
Ideally: Find best plan Practically: Avoid worst plans! Chapter 1 – Slide 72
Traditional and Novel Data Management Issues
• Traditional Data Management:
– Relational data for enterprise applications
– Storage
– Query processing/optimization
– Transaction processing
• Novel Data Management:
– Integration of data from multiple databases, warehousing
– Data management for decision support, data mining
– Managing documents, audio, and visual data
– Exchange of data on the web: XML
– Data Streams
– Incomplete and probabilistic data
Chapter 1 – Slide 73