What is normalization?
Database normalization is a data design and
organization process applied to data structures based on rules that help build
relational databases. In other words, process of
decomposing a complex relation into a simple and stable data structure.
Normalization usually involves dividing a database into two or more
tables and defining relationships between the tables.
Benefits of normalization:
1.
Minimize data redundancy.
2.
Reducing required storage space
3.
Preventing the need to restructure existing
tables to accommodate new data.
4.
Increased speed and flexibility of queries,
sorts, and summaries.
First Normal Form (1NF)
A relation schema is in 1NF:
– If and only if all the attributes
of the relation R are atomic in nature.
– Atomic: the smallest level to which
data may be broken down and remain meaningful
- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
Second
Normal Form (2NF)
A Relation is said to be in Second
Normal Form if and only if:
– It
is in the First normal form, and
– No
partial dependency exists between non-key attributes and key attributes.
• An attribute of a relation R that belongs to any key of R is said to
be a
prime attribute and that which doesn’t is a non-prime attribute
- Meet all the requirements of the first normal form.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.
Third
Normal Form (3NF)
A relation R is said to
be in the Third Normal Form (3NF) if and only if
− It is in 2NF and
− No transitive dependency exists between
non-key attributes and
key attributes.
- Meet all the requirements of the second normal form.
- Remove columns that are not dependent upon the primary key.
Boyce-Codd
Normal Form (BCNF or 3.5NF)
The Boyce-Codd Normal Form, also
referred to as the "third and half (3.5) normal form", adds one more
requirement:
- if and only if all the determinants are candidate keys.
BCNF relation is a strong 3NF, but
not every 3NF relation is BCNF.
- Meet all the requirements of the third normal form.
- Every determinant must be a candidate key.
Fourth
Normal Form (4NF)
Finally, fourth normal form (4NF) has
one additional requirement:
- Meet all the requirements of the third normal form.
- A relation is in 4NF if it has no multi-valued dependencies.
Keys:
1.
Super Key:
An
attribute or group of attributes, that is sufficient to distinguish every tuple
in the relation from every other one.
Each
Superkey is called a candidate key.
Any
superset of a candidate Key is a super key.
A
candidate key is all those set of attributes which can uniquely identify a row.
However, any subset of those set of attributes would not identify a row
uniquely.
2.
Candidate Key:
A
candidate key can be any column or a combination of columns that can be qualify
as a unique key in database.
There
can be multiple candidate keys in one table. Each candidate key can qualify as
s primary key.
3.
Primary key:
A
primary key is a column or combination of columns that uniquely identify a
record.
Only one
candidate key can be a primary key.
A primary key which is a combination of
more than one attribute is called a Composite primary key.
4.
Foreign Key:
A foreign
key is a “copy” of a primary key that has been exported from one relation into
another to represent the existence of a relationship between them.
Overlapping candidate keys: Two candidate keys overlap if they involve any attribute in
common.
Attribute that does not participate in
any candidate key is called a Non-key attribute
No comments:
Post a Comment