Database Management System
Definition of a Database
A database (DB) is a collection of interrelated computer files, whose data contents and structure
are described in a data dictionary and which is under the control of a database management
system (DBMS).
Database Management System
A database management system is a collection of programs that carry out activities on a
database, including
• setting up storage structures
• loading data
• accepting and performing updates
• Accepting data requests from users and programs.
Database Applications:
• Enterprise Information
◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets and other accounting
information.
◦ Human resources: For information about employees, salaries, payroll taxes, and
benefits, and for generation of paychecks.
◦ Manufacturing: For management of the supply chain and for tracking production of
items in factories, inventories of items in warehouses and stores, and orders for items.
◦ Online retailers: For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product evaluations.
• Banking and Finance
◦ Banking: For customer information, accounts, loans, and banking transactions.
◦ Credit card transactions: For purchases on credit cards and generation of monthly
statements.
◦ Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds; also for storing real-time market data to enable
online trading by customers and automated trading by the firm.
• Universities: For student information, course registrations, and grades (in addition to standard
enterprise information such as human resources and accounting).
• Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
University Database Example:
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate transcripts
In the early days, database applications were built directly on top of file systems
Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
Multiple file formats, duplication of information in different files
• Difficulty in accessing data
Need to write a new program to carry out each new task
• Data isolation
Multiple files and formats
• Integrity problems
✓ Integrity constraints
✓ Hard to add new constraints or change existing ones
• Atomicity of updates
✓ Failures may leave database in an inconsistent state with partial updates carried out
✓ Example: Transfer of funds from one account to another should either complete or
not happen at all
• Concurrent access by multiple users
✓ Concurrent access needed for performance
✓ Uncontrolled concurrent accesses can lead to inconsistencies
✓ Example: Two people reading a balance (say 100) and updating it by withdrawing
money (say 50 each) at the same time
• Security problems
Hard to provide user access to some, but not all, data
A good DBMS performs the following functions
• maintain data dictionary
• support multiple views of data
• enforce integrity constraints
• enforce access constraints
• support concurrency control
• support backup and recovery procedures
• support logical transactions
Advantages of Using a DBMS
• Less redundancy: A DBMS centralizes data storage, which prevents the duplication of
data across different files and applications.
• Less risk of inconsistency: By avoiding redundancy, a DBMS ensures that data is
consistent. Updates are made in one place, so all users see the most current information.
• Maintenance of data integrity: DBMS allows for the enforcement of data integrity
constraints, ensuring that the stored data is accurate and reliable.
• Application of access restrictions: A central control system allows administrators to
define and enforce access restrictions, ensuring that only authorized users can access
specific data.
• Balance between different requirements: A DBMS allows multiple users to access and
share the same data concurrently, while the system manages potential conflicts and
ensures data consistency.
• Data independence: The separation of data structure (physical) from the data
representation (logical) means that applications do not need to be rewritten if the
database structure is changed, promoting flexibility.
Disadvantages of DBMS
• Complexity: DBMS can be hard to design, implement, and manage, needing specialized
knowledge.
• Cost: High setup costs, including hardware, software, and skilled personnel, can be
expensive. Ongoing maintenance adds to the cost.
• Performance Overhead: DBMS might slow down simple tasks due to their extra features
and general-purpose nature.
• Security Risks: Centralizing data can create security risks. If the system is hacked, all data
could be compromised.
• Resource Intensive: DBMS need a lot of memory, storage, and processing power, which
can be costly.
• Data Integrity Issues: Complex systems can lead to data integrity problems if not
managed well.
Levels of Data Abstraction
1. Physical level: describes how a record (e.g., customer) is stored.
✓ File organization (e.g., sequential or indexed)
✓ Storage space allocation and record placement
✓ Data compression and encryption techniques
2. Logical level: describes data stored in database, and the relationships among the data.
✓ The data stored in the database
✓ The relationships between different data items
✓ The data types and their constraints
3. View level: application programs hide details of data types. Views can also hide information
(e.g., salary) for security purposes.
✓ A subset of the logical data
✓ How application programs present data, hiding the underlying data types
✓ It acts as a security mechanism by hiding sensitive information, such as a user's salary
Transaction
A transaction is a collection of one or more operations on one or more databases, which reflects
a single real-world transition.
Commit
Use the COMMIT statement to end the current transaction and make permanent all changes
performed in the transaction.
Rollback
ROLLBACK is used in a transaction to erase all data modifications made from the start of the
transaction or to a save point.
Reasons for Rollback
• User changes their mind (“ctl-C”/cancel)
• App program finds a problem
• System-initiated abort
1. System crash
2. Housekeeping
Atomicity
Atomicity means two possible outcomes for a transaction. Either, it is commits: all the changes
are made or it is aborts: no changes are made.
Integrity
Data integrity refers to the overall completeness, accuracy and consistency of data.
The following three integrity constraints are used in a relational database structure to achieve
data integrity:
• Entity Integrity: This is concerned with the concept of primary keys. The rule states that
every table must have its own primary key and that each has to be unique and not null.
• Referential Integrity: This is the concept of foreign keys. The rule states that the foreign
key value can be in two states. The first state is that the foreign key value would refer to
a primary key value of another table, or it can be null. Being null could simply mean that
there are no relationships, or that the relationship is unknown.
• Domain Integrity: This states that all columns in a relational database are in a defined
domain.
Consistency
Consistency states that data cannot be written that would violate the database’s own rules for
valid data. If a certain transaction occurs that attempts to introduce inconsistent data, the entire
transaction is rolled back and an error returned to the user.
ACID Properties:
Atomic
State shows either all the effects of transactions, or none of them
Consistent
Transactions moves from a state where integrity holds, to another where integrity holds.
The consistency property ensures that any transaction will bring the database from one
valid state to another.
Isolated
The isolation property ensures that the concurrent execution of transactions is the same
as transactions running one after another
Durable
Once a transactions has committed, its effects remain in the database
Database Users:
• Application Programmers: They are the developers who interact with the database by
means of DML queries. These DML queries are written in the application programs like C,
C++, JAVA, Pascal etc.
• Sophisticated Users: They are database developers, who write SQL queries to
select/insert/delete/update data. They do not use any application or programs to request
the database. They directly interact with the database by means of query language like
SQL.
• Specialized Users: These are also sophisticated users, but they write special database
application programs. They are the developers who develop the complex programs to the
requirement.
• Native Users: These are the users who use the existing application to interact with the
database. For example, online library system, ticket booking systems, ATMs etc. which
has existing application and users use them to interact with the database to fulfill their
requests.
Database administrator's duties include:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting user authority to access the database
• Specifying integrity constraints
• Acting as liaison with users
• Monitoring performance and responding to changes in requirements
Database Architecture:
Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a database.
Three-tier architecture: E.g. web-based applications, and applications built using “middleware”