Normalisation - D B M S
- Functional Dependency (FD)
- It's a relationship between the primary key attribute (usually) of the relation to that of the other attribute of the relation
- X → Y, the left side of FD is known as a Determinant, the right side of the production is known as a Dependent
- Redundancy
- Row Level => Removed by using Primary Key
- Column Level => Insertion, Deletion, Update Anomaly
- Closure Method => To find all Candidate Keys in a Table
- Prime Attribute => Attribute which is used in making of the Candidate key
- Steps
- Check all the Attributes in the Right side of Functional Dependency
- The one which is not present on the right side must be present on the left side to make Candidate key
- Now check all possibilities with this Attribute
- After you find 1 Candidate key, check if any of its Attribute is present on the right side of FD
- Replace that Attribute with its Left side Attribute and now check for this combination
- Repeat 4th process till no other combination is found
- Types of FD
- Trivial FD
- A → B has trivial functional dependency if B is a subset of A
- A → A, B → B are also Trivial FD
- Non-trivial FD
- A → B has a non-trivial functional dependency if B is not a subset of A [A intersection B is NULL]
- Rules of FD (Armstrong’s axioms)
- Reflexive
- If "A" is a set of attributes and "B" is a subset of "A", Then, A → B holds
- If A ⊇ B then A → B
- Augmentation
- If B can be determined from A, then adding an attribute to this functional dependency won’t change anything
- If A → B holds, then AX → BX holds too, "X" being a set of attributes
- Transitivity
- If A determines B and B determines C, we can say that A determines C
- If A → B and B → C then A → C
- Equivalence of FD
- Check if X covers Y => Take LHS from Y and check if you get its RHS from X
- Check if Y covers X
- Properties
- Anomalies (Abnormalities)
- Due to these anomalies, DB size increases and DB performance become very slow
- To rectify these anomalies and the effect of these of DB, we use Database optimization technique called NORMALISATION
- There are three types of anomalies introduced by data redundancy
- Insertion anomaly
- When certain data (attribute) can not be inserted into the DB without the presence of other data
- Deletion anomaly
- Situation where the deletion of data results in the unintended loss of some other important data
- Updation anomaly (or modification anomaly)
- When an update of a single data value requires multiple rows of data to be updated
- Due to updation to many places, may be Data inconsistency arises, if one forgets to update the data at all the intended places
- What is Normalisation?
- Normalisation is a step towards DB optimization
- Normalisation is used to minimize the redundancy from a relations
- It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies
- Normalisation divides the composite attributes into individual attributes OR larger table into smaller and links them using relationships
- Decompose Tables into multiple smaller tables till SRP (Single Responsibility Principle) is achieved
- Types of Normal forms => The normal form is used to reduce redundancy from the database table
- 1NF
- Every relation cell must have atomic value
- Relation must not have multi-valued attributes
- 2NF
- Relation must be in 1NF
- There should not be any partial dependency
- All non-prime attributes must be fully dependent on PK
- Non prime attribute can not depend on the part of the PK
- 3NF
- Relation must be in 2NF
- No transitivity dependency exists
- Non-prime attribute should not find a non-prime attribute
- BCNF (Boyce-Codd normal form)
- Relation must be in 3NF
- FD => A -> B, A must be a super key
- We must not derive prime attribute from any prime or non-prime attribute
- Advantages of Normalisation
- Normalisation helps to minimize data redundancy
- Greater overall database organization
- Data consistency is maintained in DB
- Decomposition
- Lossless / Lossy Join Decomposition
- Spurious Tuples => Extra Tuples after joining
- Common Attribute should be CK, SK of either Table
- Dependency Preserving Decomposition => FD should be preserved
- Union of Decomposed Tables FD should be equal to the original Table FD
- Make all FD of each Table
- Check which FD is true by comparing it with original Table FD
- Take Union of all FD found
- Check if the Original FD can be found by these