0% found this document useful (0 votes)
6 views14 pages

3 Unit

Codd's Rules outline twelve principles for relational database management systems, emphasizing the organization, accessibility, and integrity of data stored in tables. Key concepts include the treatment of NULL values, the importance of primary and foreign keys, and the process of normalization to reduce redundancy. Additionally, the document discusses functional dependencies and their types, as well as the significance of lossless and lossy decomposition in maintaining data integrity.

Uploaded by

Kalyani Ghadge
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views14 pages

3 Unit

Codd's Rules outline twelve principles for relational database management systems, emphasizing the organization, accessibility, and integrity of data stored in tables. Key concepts include the treatment of NULL values, the importance of primary and foreign keys, and the process of normalization to reduce redundancy. Additionally, the document discusses functional dependencies and their types, as well as the significance of lossless and lossy decomposition in maintaining data integrity.

Uploaded by

Kalyani Ghadge
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

Codd’s Rules

Rule 1: The Information Rule


(jo bhi information store kar rhe h wo table ke form me hi karege)
 All information, whether it is user information or metadata, that is stored in a database must
be entered as a value in a cell of a table.
 A database contains various information, and this information must be stored in each of a
table in the form of row and column.
 It is said that everything within the database is organized in a table layout.

Rule 2: The Guaranteed Access Rule


(jo bhi hamne table banay h wo accessible hona chahiye , to access it we shold have table
name, primary key, column name)
 Each data element is guaranteed to be accessible logically with a combination of the table
name, primary key (row value), and attribute name (column value).

Rule 3: Systematic Treatment of NULL Values


 This rule define the systematic treatment of NULL value in database record.
 The null value has various meaning in the database, like missing the data, no value in cell,
inappropriate information, unknown data and the primary key should not be NULL.
 Every NULL value in a database must be given a systematic and uniform treatment.

Rule 4: Active Online Catalog Rule


(table se related data ko bhi store krna zaruri h, usko bhi waise hi access karo jaise ki table ko
krte h , privileges hone chahiye ki ikon access kr skta h )
 The database catalog, which contains metadata about the database, must be stored and
accessed using the same relational database management system.
 It represents the entire logical structure of the descriptive database that must be stored
online and is known as a database dictionary.
 It authorizes users to access the database and implement a similar query language to access
the database.

Rule 5. Comprehensive Data Sublanguage Rule


(atleast ek language aaisa hona chahiye jisse kii hum data ko access, define kr ske,
manipulate kr skte h ex: sql)
 The relational database supports various language
 and if we want to access the database, the language must be the explicit, linear or well-
defined syntax, character strings and supports the comprehensive: data definition, view
definition, data manipulation, integrity constraints, and limit transaction management
operations.
 If the database allows access to the data without any language, it is considered a violation of
the database.

Rule 6. View Updating Rule


(view is like a stored query ya jinki physical existence nhi hoti h but give views the same
importance as we give to table)
 All views table can be theoretically updated and must be practically updated by the database
systems.

Rule 7: Relational Level Operation (High-Level Insert, Update and delete) Rule
(jo bhi hum database bana rhe h unse atleast insert delete update kii facility honi chahiye)
 A database system should follow high-level relational operations such as insert, update, and
delete in each level or a single row.
 It also supports union, intersection and minus operation in the database system.

Rule 8: Physical Data Independence Rule


(agar physical level hum kuch chnge kr rhe h to it should not affect logical level or view level,
hum base level me kuch bhi chnge kr skte h)
 All stored data in a database or an application must be physically independent to access the
database.
 Each data should not depend on other data or an application.
 If data is updated or the physical structure of the database is changed, it will not show any
effect on external applications that are accessing the data from the database

Rule 9: Logical Data Independence Rule


(agar logical level pe kuch chnge kiya to view level pe kuch nhi dekhna chahiye uspe koi
effect n pade)
 It is similar to physical data independence.
 It means, if any changes occurred to the logical level (table structures), it should not affect
the user's view (application). For example, suppose a table either split into two tables, or two
table joins to create a single table, these changes should not be impacted on the user view
application.

Rule 10: Integrity Independence Rule


(jo bhi intergrity constraints laga rhe h aaisa nhi kii application program ko bole chnge krne
ko nhi jo sublanguage h jo backend p use krte h usme chnge krte h )
 A database must maintain integrity independence when inserting data into table's cells
using the SQL query language.
 All entered values should not be changed or rely on any external factor or application to
maintain integrity.
 It is also helpful in making the database-independent for each front-end application.

Rule 11: Distribution Independence Rule


(databse bht bada h to ek jgh pe nhi rkh skte h to multiple jagah pe rkh diya pr user ko ye
chij na dekhe)
 The distribution independence rule represents a database that must work properly, even if it
is stored in different locations and used by different end- users.
 Suppose a user accesses the database through an application; in that case, they should not
be aware that another user uses particular data, and the data they always get is only located
on one site.
 The end users can access the database, and these access data should be independent for
every user to perform the SQL queries.

Rule 12: Non-Subversion Rule


(jo hum language banai ge wo high level language hoga pr jab hum complie karege to low
level language me covert ho jai ga, so jab hum covert karege to aaisa koi constraint ya loop
hole nhi hona chhaiye)
 The non-submersion rule defines RDBMS as a SQL language to store and manipulate the
data in the database.
 When a query is fired in the database, it will be converted into low level language so that it
can be understood by the underlying systems to retrieve the data.
 In such case, when accessing or manipulating the records at low level language, there should
not be any loopholes that alter the integrity of the database.

Keys
1. Primary Key
 It's the most important key in a table.
 It uniquely identifies each row (tuple) in the table.
 It cannot contain null values.
 Each table can have only one primary key.
 Examples: "emp_id" in an "Employees" table, "product_id" in a "Products" table

2. Foreign Key:
 It's a column (or set of columns) in one table that references the primary key of another
table.
 It establishes relationships between tables.
 It can contain null values.
 A table can have multiple foreign keys.
 Example: "department_id" in the "Employees" table could reference the
"department_id" primary key in the "Departments" table.

3. Candidate Key:
 It's a column (or set of columns) that could potentially act as a primary key because it
uniquely identifies each row.
 A table can have multiple candidate keys.
 The primary key is chosen from among the candidate keys.
 Example: A table with columns "emp_id" and "email" could have both as candidate keys,
but one would be chosen as the primary key.

4. Super Key(use to identify record)


 A super key is any set of attributes in a table that uniquely identifies all rows in that
table.
 Example: In the same "Students" table, all the individual attributes ("student_id",
"name", "email") and any combination of them (e.g., "student_id", "name") are super
keys as they can uniquely identify each student.

5. Unique Key (column or set of columns)


 A unique key in DBMS is used to uniquely identify a tuple in a table and is used to
prevent duplicity of the values in a table.
 The usage of a primary key is the same, but there is a difference between both keys.
 A primary key cannot take a NULL value, but a unique key can have one NULL value
as its value

6. Composite Key:
 It's a primary key that consists of multiple columns, together uniquely identifying each
row.
 Used when a single column cannot uniquely identify rows.
 Example: A "Courses" table might use "course_code" and "semester" as a composite
primary key.
Normalization

 Normalization is the process of organizing the data in the database.


 Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
 Normalization divides the larger table into smaller and links them using relationships.
 The normal form is used to reduce redundancy from the database table.

Advantages of Normalization

 Normalization helps to minimize data redundancy.


 Greater overall database organization.
 Data consistency within the database.
 Much more flexible database design.
 Enforces the concept of relational integrity(sets relation between table).

Disadvantages of Normalization

 You cannot start building the database before knowing what the user needs.
 The performance degrades when normalizing the relations to higher normal
forms, i.e., 4NF, 5NF.
 It is very time-consuming and difficult to normalize relations of a higher degree.
 Careless decomposition may lead to a bad database design, leading to serious
problems.

Decomposition
 The term decomposition refers to the process in which we break down a table in a
database into various elements or parts.
 Thus, decomposition replaces a given relation with a collection of various smaller
relations.
 Thus, in a database, we can make any table break down into multiple tables when
we want to collect a particular set of data.
 Decomposition must always be lossless. This way, we can rest assured that the
data/information that was there in the original relation can be reconstructed
accurately on the basis of the decomposed relations.
 In case the relation is not decomposed properly, then it may eventually lead to
problems such as information loss.
Types of Decomposition (short notes)

Decomposition is of two major types in DBMS:

 Lossless
 Lossy

Lossless Decomposition
 Lossless join decomposition is a decomposition of a relation R into relations R1, and
R2 such that if we perform a natural join of relation R1 and R2, it will return the
original relation R.
 This is effective in removing redundancy from databases while preserving the original
data.
 In other words by lossless decomposition, it becomes feasible to reconstruct the
relation R from decomposed tables R1 and R2 by using Joins.
 Only 1NF,2NF,3NF, and BCNF are valid for lossless join decomposition.
Here's an example of lossless decomposition:
Imagine a table called Student that stores student ID, name, department, and course
enrolled. This table can be decomposed into two tables:
StudentDetails(StudentID, Name)
Enrollment(StudentID, Department, Course)
In this case, StudentID acts as the common attribute for joining the tables. Performing a
natural join on StudentID will retrieve all the original student information without any
loss.

Lossy Decomposition
 Lossy decomposition, also known as careless decomposition, is a term used in database
management systems (DBMS) when breaking down a table into smaller tables results in
some information loss.
 This means you cannot perfectly recreate the original table by joining the decomposed
tables back together.
 Lossy decomposition typically occurs when the chosen decomposition violates
the conditions of lossless decomposition.
Here's an example of lossy decomposition:
Let's say we decompose the Student table again, but this time into:
StudentIDName(StudentID, Name)
DepartmentCourse(Department, Course)
There's no common attribute to join these tables. Even though both tables contain
relevant information, you cannot reconstruct the original data about a specific
student's department and course enrollment by simply joining them.

Functional Dependencies (FD)


 A dependency function (FD) is a database constraint that determines the
relationship of one attribute(column) to another in a database management
system (DBMS).
 Functional dependencies help maintain the quality of data in the database.
 Functional dependence is a relationship that exists between two attributes.
 It usually exists between the primary key and non-prime attributes in the
table.
Example: X -> Y
In this case, the left side of the arrow is the determinant and the right of the
arrow is dependent. X will be the primary attribute and Y will be a non-prime
attribute of the table. It shows that the attributes of column X uniquely identify
the attribute of column Y to satisfy this functional dependency.

How to represent functional dependency in DBMS?


Functional dependency is expressed in the form of equations. Here you have an array of
objects (A, B, C, etc.) and an arrow (->) that represents the dependency. For example, if we
have an employee record with fields “EmployeeID”, “FirstName” and “LastName”, we can
specify the function as follows:
EmployeeID -> FirstName, LastName
To represent functional dependency in DBMS has two main features: left (LHS) and right
(RHS) of the arrow (->).
For example, if we have a table with attributes “X”, “Y” and “Z” and the attribute “X” can
determine the value of the attributes “Y” and “Z”.
X -> Y, Z
This symbol indicates that the value in property “X” determines the values in property “Y”
and “Z”. So if you know the value of “X”, you can also determine the value of “Y” and “Z”.

Types of Functionalities in DBMS


The following are some important types of functionality in DBMS
1. Trivial Functional Dependency
Trivial functional dependency occurs when an attribute or attributes
(columns) appear in the DBMS. When the (->) function is present, the left
element (LHS) already determines the right element (RHS) without additional
information.

Example: Suppose we have a employee table with attributes “EmployeeID”


and “EmployeeName”. In this case, if we declare the dependency function as

EmployeeID -> EmployeeName

It will be a trivial functional dependency. Because there is only one


“EmployeeName” in “EmployeeID”. That is, the value of “EmployeeID”
determines the value of “EmployeeName” without the need for any
additional information or conditions.

2. Non-trivial Functional Dependency

Non-trivial functional dependencies are a type of dependency between


attributes (columns) in a table. Here the relationship is neither obvious nor
important. It provides valuable information about how values in one feature
set determine values in another feature set.

Example: Here is the functional dependency based on the employee’s date of


birth (EmployeeDOB) and Department. In this case the non-trivial
dependency will be

EmployeeDOB, Department -> EmployeeName.


In this functional dependency given combination of employee’s date of birth and class can
uniquely determines employee names. It is important because it provides important
information about the relationship between attributes in the language.

3. Multivalued Functional Dependency

Multivalued functional dependencies in a database occur when one or more


attributes determine multiple independent values in other attributes. Indicates that
changes in the defined attributes will lead to various combinations of values in the
attributes that specify the relationship in the data.
Example:

In the above case, the multivalued dependency holds as:


Ali has StudentID 101 and enrolled in {Math, English}.
Bob has StudentID 102 and enrolled in {History, English}.
Cavin has StudentID 103 and enrolled in {Physics, Hindi}.
4. Transitive Functional Dependency
Transitive functional dependency in DBMS is the relationship between attributes
(columns) of a database table. This occurs when the price of one property
determines the price of another property through an intermediate (third) factor.

Example: Consider a database table named “Employee_Info” with the following


properties:

Empoyee_ID (a unique identifier for each student),

Employee_Name,

Employee_Address

Employee_City.

In this example we can assume that Employee_Address is linked to Employee_City


and Employee_City is linked to Employee_ID. This creates a transitive dependency in
the DBMS where Employee_ID directly determines Employee_Address

Normal Form
First normal form (1NF)

 First Normal Form (1NF): This is the most basic level of normalization. In 1NF, each
table cell should contain only a single value, and each column should have a unique
name. The first normal form helps to eliminate duplicate data and simplify queries.

 A relation is said to be in 1NF (first normal form), if it doesn’t contain any multi-
valued attribute.

 In other words you can say that a relation is in 1NF if each attribute contains only
atomic (single) value only.

 As per the rule of first normal form, an attribute (column) of a table cannot hold
multiple values. It should hold only atomic values.

Example: Let’s say a company wants to store the names and contact details of its employees.
It creates a table in the database that looks like this:

 Two employees (Jon & Lester) have two mobile numbers that caused
the Emp_Mobile field to have multiple values for these two employees.

 This table is not in 1NF as the rule says “each attribute of a table must have atomic
(single) values”, the Emp_Mobile values for employees Jon & Lester violates that
rule.

 To make the table complies with 1NF we need to create separate rows for the each
mobile number in such a way so that none of the attributes contains multiple values.
For a table to be in the first normal form, it must meet the following criteria:

 A single cell must not hold more than one value (atomicity)
 There must be a primary key for identification
 No duplicated rows or columns
 Each column must have only one value for each row in the table

Second Normal Form (2NF):


 Second Normal Form (2NF): 2NF eliminates redundant data by requiring that each
non-key attribute be dependent on the primary key. This means that each column
should be directly related to the primary key, and not to other columns.

 The first condition for the table to be in Second Normal Form is that the table has to
be in First Normal Form.

 The normalization of 1NF relations to 2NF involves the elimination of partial


dependencies.

 A partial dependency in DBMS exists when any non-prime attributes, i.e., an


attribute not a part of the candidate key, is not fully functionally dependent on one
of the candidate keys.

An attribute that is not part of any candidate key is known as non-prime attribute.
A table is said to be in 2NF if it meets the following criteria:
 Table is in 1NF (First normal form)
 Has no partial dependency. That is, all non-key attributes are fully dependent on a
primary key.
Now the tables are in Second normal form (2NF).

Consider the table Location:

The Location table possesses a composite primary key cust_id, storeid. The
non-key attribute is store_location. In this case, store_location only depends
on storeid, which is a part of the primary key. Hence, this table does not fulfill
the second normal form.

To bring the table to Second Normal Form, you need to split the table into two
parts. This will give you the below tables:

As you have removed the partial functional dependency from the location
table, the column store_location entirely depends on the primary key of that
table, storeid.
Third Normal Form (3NF):
 3NF builds on 2NF by requiring that all non-key attributes are independent of
each other. This means that each column should be directly related to the
primary key, and not to any other columns in the same table.

 The first condition for the table to be in Third Normal Form is that the table
should be in the Second Normal Form.

 The second condition is that there should be no transitive dependency for


non-prime attributes, which indicates that non-prime attributes (which are not
a part of the candidate key) should not depend on other non-prime attributes
in a table.

 Therefore, a transitive dependency is a functional dependency in which A → C


(A determines C) indirectly, because of A → B and B → C (where it is not the
case that B → A).

 The third Normal Form ensures the reduction of data duplication. It is also
used to achieve data integrity.

Below is a student table that has student id, student name, subject id, subject name, and
address of the student as its columns.

In the above student table, stu_id determines subid, and subid determines sub.
Therefore, stu_id determines sub via subid. This implies that the table possesses a transitive
functional dependency, and it does not fulfill the third normal form criteria.

Now to change the table to the third normal form, you need to divide the table as shown
below:

As you can see in both the tables, all the non-key attributes are now fully functional,
dependent only on the primary key.

In the first table, columns name, subid, and addresses only depend on stu_id. In the second
table, the sub only depends on subid.

You might also like