DBMS Spring 2024

Q: How does a Foreign Key contribute to maintaining data integrity and consistency within a relational database?

A foreign key contributes to data integrity by enforcing referential integrity, ensuring that if a value exists in the referencing table's foreign key, it has a corresponding primary key value in the referenced table. This prevents orphan records—records that reference non-existent data. Furthermore, foreign keys help maintain data consistency by ensuring changes in primary key tables are either propagated or restricted based on the constraint conditions like ON DELETE CASCADE or ON DELETE RESTRICT. This avoids invalid data references, contributing to higher data accuracy and quality .

Q: How can normalization up to the Third Normal Form (3NF) streamline data management in relational databases and prevent anomalies?

Normalization up to 3NF organizes data to reduce redundancy and dependency, thus streamlining data management. By ensuring that each non-key attribute is only dependent on the primary key and eliminating transitive dependencies where non-key attributes depend on other non-key attributes, normalization prevents anomalies like update, insert, and delete anomalies. In 3NF, all data dependencies are clear, which ensures consistent data representation and minimizes redundancy, leading to efficient data management and storage .

Q: Explain the role of locking in maintaining data consistency during concurrent transactions and describe the different levels of lock granularity.

Locking in concurrent transactions prevents data inconsistency by controlling access to shared data among multiple transactions. It ensures that a transaction holds exclusive access to its data until the lock is released, preventing conflicts such as lost updates or dirty reads. Different lock granularities include database-level locks (entire database), table-level locks (entire table), page-level locks (fixed-size data blocks), row-level locks (specific rows), and field-level locks (specific columns). Each level balances between the complexity of managing locks and achieving high concurrency .

Q: What is the significance of replication transparency in a Distributed Database Management System and how does it enhance system performance and data availability?

Replication transparency ensures that data replicas across multiple sites appear as single data entitie to users, allowing seamless data access regardless of its replication status. This enhances system performance by enabling faster data access from the nearest data instance and boosts availability by allowing system operation continuity even if one replica fails. The DDBMS manages replication and consistency automatically, which simplifies data administration and ensures reliable performance and high data availability .

Q: Discuss the significance of COMMIT, ROLLBACK, and SAVEPOINT in transaction management within databases.

COMMIT, ROLLBACK, and SAVEPOINT are critical commands for managing transactions in databases. COMMIT saves all changes made during the transaction, making them permanent and visible to other transactions. ROLLBACK undoes all changes since the last COMMIT, restoring the database state before the transaction, crucial for error recovery. SAVEPOINT allows setting a point in a transaction to partially roll back to, preserving earlier changes while undoing subsequent ones. This flexibility helps manage errors and partial transaction completion .

Q: In what ways can Foreign Keys improve the overall quality of data within a database system?

Foreign keys improve database quality by enforcing referential integrity, which prevents the existence of undefined, or orphan, records in the referencing table. By maintaining consistency across referenced data, foreign keys ensure that updates to parent tables (such as deletions) are correctly handled according to specified constraints, reducing incidences of stale or inaccurate data. They also assist in establishing reliable, meaningful relationships between tables, leading to a well-structured database that supports accurate data retrieval and reporting .

Q: Analyze the relationship and dependencies between various elements in a distributed database, such as query processing and transaction management, to maintain ACID properties.

In a distributed database, maintaining ACID properties involves intricate relationships between distributed query processing and transaction management. Query processing must efficiently handle data residing at multiple sites, possibly involving complex optimizations to minimize data transfer and computational overhead. Distributed transaction management ensures atomicity, consistency, isolation, and durability across different nodes. Concurrency control mechanisms are vital to synchronize transactions running across sites, while fault tolerance and network reliability ensure continuous operation and data integrity despite potential failures .

Q: What are the challenges associated with managing data across different sites in a Distributed Database Management System, and how does a DDBMS overcome them?

Challenges in managing data across different sites in a DDBMS include ensuring data consistency, managing concurrent access from multiple nodes, handling network failures, and maintaining data transparency. A DDBMS overcomes these challenges through concurrency control mechanisms to prevent data conflicts, replication strategies to ensure data availability and consistency across sites, and algorithms for distributed query processing and transaction management that maintain ACID properties. It also provides location transparency to users so they can access data without knowing its physical location .

Q: What are the key characteristics of a Distributed Database Management System (DDBMS), and how do they address the challenges of managing distributed data?

The key characteristics of a DDBMS include data distribution, location transparency, replication transparency, concurrency control, distributed query processing, distributed transaction management, failure transparency, and heterogeneity. These characteristics address challenges such as ensuring users interact with a database without needing to know data locations (location transparency), maintaining data consistency across multiple copies (replication transparency), managing simultaneous operations without data conflicts (concurrency control), and ensuring system functionality despite site or network failures (failure transparency).

Q: Why is location transparency a crucial feature of a Distributed Database Management System, and how does it impact user interaction with the database?

Location transparency in a DDBMS allows users to interact with the database as if it were a single, central entity, irrespective of where the data is physically stored across different sites. This feature simplifies user interaction by abstracting the complexity of data distribution, boosting system usability, and allowing seamless query operations without requiring users to manage data location concerns. It enhances the user experience by making the distributed architecture invisible and more intuitive .

The document outlines a DBMS Spring 2024 exam with various questions related to database concepts. It covers topics such as entities and attributes, data dictionaries, foreign keys, distributed database management systems, transaction management, and an ER diagram for an organization. The exam includes both short answer questions and a practical ER diagram design task.

Uploaded by

alinazirgee232

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views12 pages

DBMS Spring 2024

Uploaded by

alinazirgee232

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

DBMS SPRING 2024

THE ANSWERS MUST BE ATTEMPTED ON THE ANSWER SHEET PROVIDED

Q.1. Answer the following short questions: (6x5=30)
a) Define entity and attribute of an entity. Briefly explain the difference between base
attribute and derived attribute of an entity with the help of an example.
● Entity: In the context of databases, an entity is a real-world object that is distinguishable
from other objects. It represents a "thing" or "concept" about which data can be stored.
Examples include a person, a place, an event, or an object. For instance, in a university
database, "Student," "Course," and "Professor" could be entities.
● Attribute: An attribute is a characteristic or property that describes an entity. It represents
a piece of information that is relevant to the entity. For example, for a "Student" entity,
attributes might include "StudentID," "Name," "DateOfBirth," and "Address."
● Difference between Base and Derived Attribute:
○ Base Attribute (Stored Attribute): A base attribute is an attribute whose value is
directly stored in the database. It is a fundamental piece of data that is not
calculated or derived from other attributes.
■ Example: For a "Student" entity, "DateOfBirth" would be a base attribute
because its value is directly recorded.
○ Derived Attribute: A derived attribute is an attribute whose value can be calculated
or derived from other stored attributes in the database. Its value is not stored
explicitly but is computed when needed.
■ Example: For a "Student" entity, "Age" would be a derived attribute. It can be
calculated from the "DateOfBirth" (a base attribute) and the current date. You
don't need to store "Age" explicitly as it changes over time and can always be
computed.
b) Briefly explain data dictionary and its contents.
● Data Dictionary: A data dictionary (also known as a metadata repository) is a centralized
repository of information about data. It contains definitions and descriptions of the data
elements used in a database. Essentially, it's "data about data" or "metadata." It's crucial
for managing and understanding the database structure.
● Contents of a Data Dictionary: A data dictionary typically contains information about:
1. Tables/Relations: Names of tables, their owners, creation dates, and descriptions.
2. Columns/Attributes: Names of columns in each table, their data types (e.g., INT,
VARCHAR, DATE), lengths, precision, nullability constraints (whether they can be
empty), default values, and descriptions.
3. Indexes: Names of indexes, the columns they are built on, and their types (e.g.,
primary, unique, non-unique).
4. Constraints: Definitions of various constraints like primary keys, foreign keys,
unique constraints, check constraints, and their associated tables and columns.
5. Views: Definitions of virtual tables (views), including the underlying tables they are
based on.
6. Stored Procedures and Functions: Definitions of stored programs within the
database.
7. Users and Privileges: Information about database users, their roles, and the
permissions they have on different database objects.
8. Triggers: Definitions of automated actions that are executed in response to certain
events (e.g., INSERT, UPDATE, DELETE).
9. Relationships: Information about relationships between tables (e.g., one-to-many,
many-to-many), often inferred from foreign key definitions.
c) What is a foreign key? How it helps in maintaining the data integrity and data quality?
● Foreign Key: A foreign key is a column or a set of columns in a database table that refers
to the primary key in another table (or sometimes, in the same table). It establishes a link
or relationship between two tables. The table containing the foreign key is called the
"referencing" or "child" table, and the table containing the primary key is called the
"referenced" or "parent" table.
● How it helps in maintaining Data Integrity and Data Quality:
○ Referential Integrity: The primary role of a foreign key is to enforce referential
integrity. This means that if a foreign key value exists in the referencing table, it
must have a matching primary key value in the referenced table. This prevents
"orphan" records, where a child record refers to a non-existent parent.
■ Example: If you have a "Students" table and a "Courses" table, and
"CourseID" is a foreign key in "Students" referencing the "CourseID" primary
key in "Courses," you cannot enroll a student in a course that doesn't exist in
the "Courses" table.
○ Data Consistency: By enforcing relationships, foreign keys ensure that data
across related tables remains consistent. Changes in the primary key table can be
propagated or restricted based on the foreign key constraints (e.g., ON DELETE
CASCADE, ON DELETE RESTRICT).
○ Data Accuracy: By preventing invalid references, foreign keys contribute
significantly to data accuracy. You won't have records pointing to non-existent data,
which would lead to incorrect or misleading information.
○ Data Quality: Overall, by maintaining referential integrity, consistency, and
accuracy, foreign keys significantly improve the overall quality of the data in a
database. They help in building reliable and well-structured databases.
d) Briefly explain distributed database management system (DDBMS).
● Distributed Database Management System (DDBMS): A Distributed Database
Management System (DDBMS) is a software system that manages a distributed
database, making the distribution transparent to users. In a DDBMS, a single logical
database is physically stored across multiple interconnected computers (nodes) at
different locations, but it appears to users as a single, centralized database. The DDBMS
handles the complexities of data distribution, communication, and concurrency control
across these nodes.
● Key Characteristics and Concepts of DDBMS:
1. Data Distribution: Data is fragmented and stored across multiple sites. This can
be done horizontally (different rows at different sites) or vertically (different columns
at different sites), or a combination.
2. Location Transparency: Users do not need to know where the data is physically
stored. They can query the database as if it were a single, centralized system.
3. Replication Transparency: Data can be replicated (copies stored at multiple sites)
for availability and performance. The DDBMS manages these copies to ensure
consistency.
4. Concurrency Control: The DDBMS must ensure that simultaneous operations
from different sites do not interfere with each other and maintain data consistency.
5. Distributed Query Processing: Queries may involve data from multiple sites. The
DDBMS optimizes query execution across the network.
6. Distributed Transaction Management: Transactions may span multiple sites. The
DDBMS ensures atomicity, consistency, isolation, and durability (ACID properties)
across all participating sites.
7. Failure Transparency: The system should ideally continue to operate even if some
sites or network links fail.
8. Heterogeneity: A DDBMS might manage data across different types of databases
(e.g., Oracle, SQL Server) and operating systems.
e) What is the use of locking in concurrent transactions? Discuss different levels
(granularity) of a lock. Briefly explain COMMIT, ROLLBACK, and SAVEPOINT transaction
management.
● Use of Locking in Concurrent Transactions: In a multi-user database environment,
multiple transactions can run concurrently. If not managed properly, these concurrent
transactions can lead to data inconsistency issues (e.g., lost updates, dirty reads,
non-repeatable reads, phantom reads). Locking is a mechanism used in concurrent
transaction management to control access to shared data. When a transaction accesses
a piece of data, it acquires a lock on that data, preventing other transactions from
accessing or modifying it in a conflicting way until the lock is released. This ensures data
integrity and consistency by serializing conflicting operations.
● Different Levels (Granularity) of a Lock: Lock granularity refers to the size of the data
item that is locked. Different levels of granularity offer trade-offs between concurrency and
overhead.
1. Database Level Lock: The entire database is locked. This offers the lowest
concurrency as only one transaction can access the database at a time. It's rarely
used in practice due to severe performance implications.
2. Table Level Lock: The entire table is locked. When a transaction acquires a
table-level lock, no other transaction can access any part of that table. This is
simpler to implement but reduces concurrency significantly, especially for large
tables with frequent access.
3. Page Level Lock: The database page (a fixed-size block of data, typically 4KB or
8KB) containing the data is locked. This offers better concurrency than table-level
locks but can lead to "false contention" if unrelated rows happen to be on the same
page.
4. Row Level Lock: A specific row (tuple) within a table is locked. This offers the
highest level of concurrency because only the row being accessed or modified is
locked, allowing other transactions to access other rows in the same table.
However, managing row-level locks involves more overhead due to the large
number of locks that might be required.
5. Field (Column) Level Lock: Only a specific field (column) of a row is locked. This
offers the highest granularity and thus potentially the highest concurrency, but it is
very complex to implement and manage, so it's rarely used in commercial DBMS.
● COMMIT, ROLLBACK, and SAVEPOINT Transaction Management: These are
fundamental commands in SQL for managing transactions, ensuring the ACID properties
(Atomicity, Consistency, Isolation, Durability).
1. COMMIT:
■ Purpose: The COMMIT command is used to permanently save all changes
made during the current transaction to the database.
■ Effect: Once a transaction is committed, its changes are durable, visible to all
other transactions, and cannot be undone by a ROLLBACK command. It
marks the successful completion of a transaction.
2. ROLLBACK:
■ Purpose: The ROLLBACK command is used to undo all changes made
during the current transaction since the last COMMIT or ROLLBACK.
■ Effect: It reverts the database to its state before the transaction began. All
changes made within the uncommitted transaction are discarded, ensuring
atomicity (all or nothing). This is crucial for error recovery or when a
transaction cannot be completed successfully.
3. SAVEPOINT:
■ Purpose: A SAVEPOINT (or "checkpoint") allows you to set a named point
within a transaction to which you can later roll back. It does not end the
transaction.
■ Effect: If an error occurs or a decision is made to undo only a portion of a
long transaction, you can ROLLBACK to a specific SAVEPOINT instead of
rolling back the entire transaction. Changes made after the SAVEPOINT are
undone, while changes made before the SAVEPOINT remain intact within the
current transaction, awaiting a final COMMIT or a ROLLBACK to the
beginning of the transaction.
■ Syntax Example:
START TRANSACTION;
INSERT INTO accounts VALUES (...);
SAVEPOINT sp1;
UPDATE products SET price = ...;
ROLLBACK TO sp1; -- Undoes only the update to products
COMMIT; -- Commits the insert into accounts

Answer the following questions: (3x10=30)

i. Draw an ER diagram for an organization to keep track of all employees, their skills,
projects assigned and their departments. Every employee has a unique number and each
employee is given a job title. We are interested in collecting more data specific to
engineers and secretary. The relevant data to be recorded for engineers is the type of
degree and for secretaries is their typing speed. Many employees can work on a project
and vice-versa. An employee can have many skills and they will only use a given set of
skills on a particular project. There are 10 different departments each with a unique
name. An employee can report to only one department. Each department is managed by
a chairman. Each department deals with many vendors for purchasing various kinds of
equipment. A vendor typically supplies equipment to many departments. It is required to
store the details of each order delivered by a vendor to a certain department.
● Entities:
○ Employee (EmployeeNumber (PK), JobTitle, Name, Address, Phone, etc.)
○ Engineer (TypeDegree) - Subtype of Employee
○ Secretary (TypingSpeed) - Subtype of Employee
○ Skill (SkillName (PK), Description)
○ Project (ProjectID (PK), ProjectName, Description, StartDate, EndDate, etc.)
○ Department (DepartmentName (PK), Location, Phone, etc.)
○ Vendor (VendorID (PK), VendorName, ContactPerson, Phone, etc.)
○ Order (OrderID (PK), OrderDate, TotalAmount, etc.)
● Relationships:
○ Works_For: Employee (Many) -- Works_For -- Department (One)
■ (Each employee reports to only one department, but a department has many
employees)
○ Manages: Employee (One, Chairman) -- Manages -- Department (One)
■ (Each department is managed by a chairman, who is an employee)
■ This is a recursive relationship on Employee or a specific attribute in
Department pointing to Employee. Let's assume a ChairmanID in Department
references EmployeeNumber.
○ Has_Skill: Employee (Many) -- Has_Skill -- Skill (Many)
■ (An employee can have many skills, and a skill can be possessed by many
employees)
○ Works_On: Employee (Many) -- Works_On -- Project (Many)
■ (Many employees can work on a project and vice-versa)
○ Uses_Skill_On_Project: This is a ternary relationship between Employee, Skill,
and Project to capture "An employee will only use a given set of skills on a
particular project." This implies an associative entity or a relationship with attributes.
○ Supplies: Vendor (Many) -- Supplies -- Department (Many)
■ (A vendor supplies equipment to many departments, and a department deals
with many vendors)
○ Delivers: Vendor (One) -- Delivers -- Order (Many)
■ (A vendor delivers many orders)
○ Receives: Department (One) -- Receives -- Order (Many)
■ (An order is delivered to a certain department)
● Subtype/Supertype:
○ Employee is a supertype. Engineer and Secretary are disjoint subtypes.
ER Diagram (Conceptual Sketch):
+----------------+ 1 N
| Department |<-----------|------------>| Employee |
| -DeptName (PK) | (Manages) (Works_For)
| -Location | (Chairman)
+----------------+ ^
| (ISA)
+----------------+
| Employee |
| -EmployeeNumber (PK)|
| -JobTitle |
| -Name |
| ... |
+----------------+
/ \
/ \
/ \
+-----------+ +-------------+
| Engineer | | Secretary |
| -TypeDegree | | -TypingSpeed|
+-----------+ +-------------+

N M N
+-----+----------+-----------+-----+
|Skill| | |Project|
|-SkillName(PK) | --Has_Skill-- | | --Works_On-- | -ProjectID(PK)|
| | | | | |
+-----+----------+-----------+-----+
|
| N
| (Used_For) -- Ternary Relationship
|
+------------------------------+
| Uses_Skill_On_Project (Associative Entity) |
| -EmployeeNumber (FK) |
| -SkillName (FK) |
| -ProjectID (FK) |
| (PK = Composite of all 3 FKs) |
+------------------------------+

N M N 1
+------+----------+-----------+-----+
|Vendor| | |Order|
|-VendorID(PK)|--Supplies---| | --Delivers-- | -OrderID(PK)|
| | | | | -OrderDate |
+------+----------+-----------+-----+
\ /
\ /
\ /
M N
+-----------------+
| Department |
| -DeptName (PK) |
+-----------------+
|
| 1
| (Receives)
|
| N
+-----+
|Order|
+-----+

Self-correction for ER Diagram:

● The Manages relationship: A Department is managed by an Employee (Chairman). This
implies a 1-to-1 relationship between Department and Employee where the employee has
the "Chairman" role. It's often modeled by adding a ChairmanEmployeeNumber (FK)
attribute to the Department entity, referencing Employee.
● The ternary Uses_Skill_On_Project correctly represents the constraint that "An employee
will only use a given set of skills on a particular project."
ii. Normalize the following relation step by step up to 3NF. You must specify the definition
of the normal forms and set of relations obtained after each Normal form.
Original Relation:
StdID StdNa Course [Link]. UnitCo UnitNa PF Adnan OOR Nadee ZaliD Asim
me ID de me m
001 Usman A203 3 U10 ICT Nadia Adnan
U11 PF Adnan
U12 DOP Omer
003 Ahmad A104 4 U10 ICT Nadia
U31 DBA Omer
007 Qamar A203 3 U10 ICT ZaliD
U11 PF Nadee
m
U23 DBA Asim
010 Fatima A323 2 U33 DBA Asim
This table is highly un-normalized. It has:
● Repeating groups (multiple UnitCode, UnitName, PF, Adnan, OOR, etc. for a single
StdID/CourseID combination).
● Multiple attributes seem to represent instructors (PF, Adnan, OOR, Nadeem, ZaliD, Asim).
This indicates a design flaw or misinterpretation of columns. Let's assume Instructor is a
single concept and the given columns are instructor names.
Let's redefine the columns for clarity based on what seems to be the intention:
● StdID: Student ID
● StdName: Student Name
● CourseID: Course ID
● [Link].: Credit Hours of the course
● UnitCode: Unit/Subject Code within a course
● UnitName: Name of the Unit/Subject
● Instructor: Instructor of the Unit/Subject (This is inferred from the list of names in the
columns PF, Adnan, etc. I'll treat Instructor as a single column and assume the previous
columns were examples of instructor names, which should be normalized into rows.)
Original Relation (interpreted): R (StdID, StdName, CourseID, [Link]., UnitCode, UnitName,
Instructor)
Functional Dependencies (FDs) observed from the data and typical academic structure:
1. StdID -> StdName (Student ID determines student name)
2. CourseID -> [Link]. (Course ID determines credit hours)
3. UnitCode -> UnitName (Unit code determines unit name)
4. StdID, CourseID, UnitCode -> Instructor (A student taking a specific course will have a
specific instructor for a given unit within that course. This seems like a reasonable
assumption from the data where different instructors are listed per unit code for the same
student and course.)
5. CourseID, UnitCode -> UnitName (A unit code within a course determines the unit name)
- This might conflict with FD 3 if UnitCode is globally unique. Let's assume UnitCode is
unique within a course and possibly globally. From the given data, U10 maps to ICT for
different CourseIDs, implying UnitCode alone determines UnitName. Let's stick with
UnitCode -> UnitName.
Candidate Key(s): From the FDs, the minimal set of attributes that determines all other
attributes is {StdID, CourseID, UnitCode}. This will be our primary key for the initial
un-normalized relation.
Step 1: First Normal Form (1NF)
Definition of 1NF: A relation is in 1NF if and only if all attribute values are atomic (indivisible)
and there are no repeating groups. Each row-column intersection contains exactly one value.
Normalization Process: The given table has repeating groups for UnitCode, UnitName, and
Instructor (derived from the multiple name columns like PF, Adnan, etc.) under the StdID,
CourseID combination. To achieve 1NF, we eliminate these repeating groups by making each
row unique and ensuring atomic values. We'll combine the Instructor columns into a single
Instructor column.
Resulting Relations (in 1NF):
STUDENT_COURSE_UNIT_INSTRUCTOR (StdID, StdName, CourseID, [Link]., UnitCode,
UnitName, Instructor)
● Primary Key: {StdID, CourseID, UnitCode}
● Data after 1NF transformation:
StdID StdName CourseID [Link]. UnitCode UnitName Instructor
001 Usman A203 3 U10 ICT Nadia
001 Usman A203 3 U11 PF Adnan
001 Usman A203 3 U12 DOP Omer
003 Ahmad A104 4 U10 ICT Nadia
003 Ahmad A104 4 U31 DBA Omer
007 Qamar A203 3 U10 ICT ZaliD
007 Qamar A203 3 U11 PF Nadeem
007 Qamar A203 3 U23 DBA Asim
010 Fatima A323 2 U33 DBA Asim
Step 2: Second Normal Form (2NF)
Definition of 2NF: A relation is in 2NF if and only if it is in 1NF and all non-key attributes are
fully functionally dependent on the entire primary key. This means there should be no partial
dependencies (where a non-key attribute depends on only a part of a composite primary key).
Functional Dependencies (from 1NF):
● StdID -> StdName (Partial dependency on {StdID, CourseID, UnitCode})
● CourseID -> [Link]. (Partial dependency on {StdID, CourseID, UnitCode})
● UnitCode -> UnitName (Partial dependency on {StdID, CourseID, UnitCode})
● StdID, CourseID, UnitCode -> Instructor (Full dependency)
Normalization Process: We identify and remove partial dependencies by creating new
relations for the attributes that are only partially dependent on the primary key.
● StdName depends only on StdID. Extract StdID, StdName into a new relation.
● [Link]. depends only on CourseID. Extract CourseID, [Link]. into a new relation.
● UnitName depends only on UnitCode. Extract UnitCode, UnitName into a new relation.
Resulting Relations (in 2NF):
1. STUDENT (StdID, StdName)
○ Primary Key: {StdID}
○ FDs: StdID -> StdName
○ Data: | StdID | StdName | | :---- | :------ | | 001 | Usman | | 003 | Ahmad | | 007 |
Qamar | | 010 | Fatima |
2. COURSE (CourseID, [Link].)
○ Primary Key: {CourseID}
○ FDs: CourseID -> [Link].
○ Data: | CourseID | [Link]. | | :------- | :------ | | A203 | 3 | | A104 | 4 | | A323 | 2 |
3. UNIT (UnitCode, UnitName)
○ Primary Key: {UnitCode}
○ FDs: UnitCode -> UnitName
○ Data: | UnitCode | UnitName | | :------- | :------- | | U10 | ICT | | U11 | PF | | U12 |
DOP | | U31 | DBA | | U23 | DBA | | U33 | DBA |
4. ENROLLMENT_UNIT_INSTRUCTOR (StdID, CourseID, UnitCode, Instructor)
○ Primary Key: {StdID, CourseID, UnitCode}
○ Foreign Keys:
■ StdID references STUDENT(StdID)
■ CourseID references COURSE(CourseID)
■ UnitCode references UNIT(UnitCode)
○ FDs: StdID, CourseID, UnitCode -> Instructor (This is the only remaining FD, and
it's a full dependency)
○ Data: | StdID | CourseID | UnitCode | Instructor | | :---- | :------- | :------- | :--------- | |
001 | A203 | U10 | Nadia | | 001 | A203 | U11 | Adnan | | 001 | A203 | U12 | Omer | |
003 | A104 | U10 | Nadia | | 003 | A104 | U31 | Omer | | 007 | A203 | U10 | ZaliD | |
007 | A203 | U11 | Nadeem | | 007 | A203 | U23 | Asim | | 010 | A323 | U33 | Asim |
Step 3: Third Normal Form (3NF)
Definition of 3NF: A relation is in 3NF if and only if it is in 2NF and there are no transitive
dependencies. A transitive dependency occurs when a non-key attribute is dependent on
another non-key attribute. (i.e., A -> B and B -> C, then A -> C is a transitive dependency if B is
not a superkey).
Functional Dependencies (from 2NF relations):
● STUDENT: StdID -> StdName (Already in 3NF)
● COURSE: CourseID -> [Link]. (Already in 3NF)
● UNIT: UnitCode -> UnitName (Already in 3NF)
● ENROLLMENT_UNIT_INSTRUCTOR: StdID, CourseID, UnitCode -> Instructor
Let's check ENROLLMENT_UNIT_INSTRUCTOR for transitive dependencies.
● Primary Key: {StdID, CourseID, UnitCode}
● Non-key attribute: Instructor
Are there any FDs where a non-key attribute determines another non-key attribute? No,
Instructor is the only non-key attribute, and it is fully dependent on the primary key.
However, there might be an implicit dependency regarding Instructor and UnitCode. It is
common that a particular UnitCode always has the same instructor or a specific
instructor teaches a specific UnitCode across courses/students. From the given data:
● U10 is taught by Nadia (001, A203, U10) and ZaliD (007, A203, U10). This implies
UnitCode does NOT determine Instructor.
● U11 is taught by Adnan (001, A203, U11) and Nadeem (007, A203, U11).
● U31 is taught by Omer (003, A104, U31).
● U23 is taught by Asim (007, A203, U23).
● U33 is taught by Asim (010, A323, U33).
It seems Instructor is specific to the student's enrollment in a particular unit within a course.
Thus, StdID, CourseID, UnitCode -> Instructor appears to be the correct and only dependency
for that table.
Conclusion for 3NF: All relations obtained in 2NF are already in 3NF. There are no transitive
dependencies.
Final Relations in 3NF:
1. STUDENT (StdID PK, StdName)
2. COURSE (CourseID PK, [Link].)
3. UNIT (UnitCode PK, UnitName)
4. ENROLLMENT_UNIT_INSTRUCTOR (StdID PK, CourseID PK, UnitCode PK,
Instructor)
○ StdID references STUDENT(StdID)
○ CourseID references COURSE(CourseID)
○ UnitCode references UNIT(UnitCode)
iii. Create the following tables and apply the described constraints:
Table Name: Product
Attribute Constraints
ProdID Primary Key
Prodname Not Null
Price Between 1000 and 15000
Stock Greater than 0
Table Name: Invoice
Attribute Constraints
Invoice# Primary Key
ProdID Primary Key, Foreign Key (references Product)
Qty. Ordered Greater than 0
SQL DDL Statements:
-- Create the Product table
CREATE TABLE Product (
ProdID INT PRIMARY KEY,
Prodname VARCHAR(255) NOT NULL,
Price DECIMAL(10, 2) CHECK (Price BETWEEN 1000 AND 15000),
Stock INT CHECK (Stock > 0)
);

-- Create the Invoice table
CREATE TABLE Invoice (
Invoice# INT, -- This should be InvoiceID or similar
for clarity
ProdID INT,
Qty_Ordered INT CHECK (Qty_Ordered > 0),
PRIMARY KEY (Invoice#, ProdID), -- Composite Primary Key
FOREIGN KEY (ProdID) REFERENCES Product(ProdID)
);

Correction on Invoice table design: The table structure for Invoice with Invoice# and ProdID both
as Primary Keys implies that a single invoice line item is identified by both the invoice number
and the product. This is typical for an Invoice_Line_Item table. If Invoice is meant to represent
the header, then Invoice# would be the sole primary key, and ProdID (along with Qty_Ordered)
would be part of a separate Invoice_Line_Item table. Given the prompt's table, I'll stick to a
composite primary key for Invoice as provided.
b) Write SQL statements for the above database:
i) Update SQL statements to update all products by 15% of their current price, whose
current price is between 1000 and 5000.
UPDATE Product
SET Price = Price * 1.15
WHERE Price BETWEEN 1000 AND 5000;

ii) Display Invoice#, ProdID, Prodname and Price for all the invoices generated in the year
2023. (Assuming Invoice date is available in Invoice table or derived from Invoice#)
Assumption: The Invoice table as defined does not have a date column. To display invoices
generated in 2023, we'd need an InvoiceDate column in the Invoice table. Let's assume an
InvoiceDate column exists in the Invoice table.
If InvoiceDate is not present, it's impossible to filter by year based solely on Invoice#. I'll proceed
with the assumption of InvoiceDate being present.
-- Assuming an InvoiceDate column exists in the Invoice table
SELECT
[Link]#,
[Link],
[Link],
[Link]
FROM
Invoice I
JOIN
Product P ON [Link] = [Link]
WHERE
STRFTIME('%Y', [Link]) = '2023'; -- For SQLite or similar
(adjust for other SQL dialects)

-- For SQL Server:
-- WHERE YEAR([Link]) = 2023;

-- For MySQL:
-- WHERE YEAR([Link]) = 2023;

-- For PostgreSQL:
-- WHERE EXTRACT(YEAR FROM [Link]) = 2023;

iii) How many products have stock below 100?

SELECT
COUNT(ProdID) AS ProductsBelow100Stock
FROM
Product
WHERE
Stock < 100;

iv) Display minimum, maximum, and average price using the product table.
SELECT
MIN(Price) AS MinimumPrice,
MAX(Price) AS MaximumPrice,
AVG(Price) AS AveragePrice
FROM
Product;

Common questions