Manav Goyal

Normalisation - D B M S

Functional Dependency (FD)
- It's a relationship between the primary key attribute (usually) of the relation to that of the other attribute of the relation
- X → Y, the left side of FD is known as a Determinant, the right side of the production is known as a Dependent
- Redundancy
  - Row Level => Removed by using Primary Key
  - Column Level => Insertion, Deletion, Update Anomaly
- Closure Method => To find all Candidate Keys in a Table
  - Prime Attribute => Attribute which is used in making of the Candidate key
  - Steps
    1. Check all the Attributes in the Right side of Functional Dependency
    2. The one which is not present on the right side must be present on the left side to make Candidate key
    3. Now check all possibilities with this Attribute
    4. After you find 1 Candidate key, check if any of its Attribute is present on the right side of FD
    5. Replace that Attribute with its Left side Attribute and now check for this combination
    6. Repeat 4th process till no other combination is found
- Types of FD
  - Trivial FD
    - A → B has trivial functional dependency if B is a subset of A
    - A → A, B → B are also Trivial FD
  - Non-trivial FD
    - A → B has a non-trivial functional dependency if B is not a subset of A [A intersection B is NULL]
- Rules of FD (Armstrong’s axioms)
  - Reflexive
    - If "A" is a set of attributes and "B" is a subset of "A", Then, A → B holds
    - If A ⊇ B then A → B
  - Augmentation
    - If B can be determined from A, then adding an attribute to this functional dependency won’t change anything
    - If A → B holds, then AX → BX holds too, "X" being a set of attributes
  - Transitivity
    - If A determines B and B determines C, we can say that A determines C
    - If A → B and B → C then A → C
- Equivalence of FD
  - Check if X covers Y => Take LHS from Y and check if you get its RHS from X
  - Check if Y covers X
- Properties
Anomalies (Abnormalities)
- Due to these anomalies, DB size increases and DB performance become very slow
- To rectify these anomalies and the effect of these of DB, we use Database optimization technique called NORMALISATION
- There are three types of anomalies introduced by data redundancy
  - Insertion anomaly
    - When certain data (attribute) can not be inserted into the DB without the presence of other data
  - Deletion anomaly
    - Situation where the deletion of data results in the unintended loss of some other important data
  - Updation anomaly (or modification anomaly)
    - When an update of a single data value requires multiple rows of data to be updated
    - Due to updation to many places, may be Data inconsistency arises, if one forgets to update the data at all the intended places
What is Normalisation?
- Normalisation is a step towards DB optimization
- Normalisation is used to minimize the redundancy from a relations
- It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies
- Normalisation divides the composite attributes into individual attributes OR larger table into smaller and links them using relationships
- Decompose Tables into multiple smaller tables till SRP (Single Responsibility Principle) is achieved
Types of Normal forms => The normal form is used to reduce redundancy from the database table
- 1NF
  - Every relation cell must have atomic value
  - Relation must not have multi-valued attributes
- 2NF
  - Relation must be in 1NF
  - There should not be any partial dependency
    - All non-prime attributes must be fully dependent on PK
    - Non prime attribute can not depend on the part of the PK
- 3NF
  - Relation must be in 2NF
  - No transitivity dependency exists
    - Non-prime attribute should not find a non-prime attribute
- BCNF (Boyce-Codd normal form)
  - Relation must be in 3NF
  - FD => A -> B, A must be a super key
    - We must not derive prime attribute from any prime or non-prime attribute
Advantages of Normalisation
- Normalisation helps to minimize data redundancy
- Greater overall database organization
- Data consistency is maintained in DB
Decomposition
- Lossless / Lossy Join Decomposition
  - Spurious Tuples => Extra Tuples after joining
  - Common Attribute should be CK, SK of either Table
- Dependency Preserving Decomposition => FD should be preserved
  - Union of Decomposed Tables FD should be equal to the original Table FD
  1. Make all FD of each Table
  2. Check which FD is true by comparing it with original Table FD
  3. Take Union of all FD found
  4. Check if the Original FD can be found by these