Data Modeling - D B M S

Entity-Relationship (ER) Model


  • It is a high level data model based on a perception of a real world that consists of a collection of basic objects, called entities and of relationships among these objects
  • Graphical representation of ER Model is ER diagram, which acts as a blueprint of DB
  • Error
  • Entity
    • An Entity is a “thing” or “object” in the real world that is distinguishable from all other objects
    • It has physical existence
    • Entity can be uniquely identified (By a primary attribute, aka Primary Key)
    • Types
      • Strong Entity
        • Can be uniquely identified
        • Represented by Square
      • Weak Entity
        • Can’t be uniquely identified, Depends on some other strong entity
        • It doesn’t have sufficient attributes, to select a uniquely identifiable attribute
        • Weak entity depends on strong entity for existence
        • Represented by Double Square
    • Entity set
      • It is a set of entities of the same type that share the same properties, or attributes
  • Attribute
    • An entity is represented by a set of attributes
    • Each entity has a value for each of its attributes
    • For each attribute, there is a set of permitted values, called the domain/value set
    • Represented by Eclipse
    • Types of Attributes
      • Simple
        • Attributes which can’t be divided further
      • Composite
        • Can be divided into subparts (that is, other attributes)
        • If user wants to refer to an entire attribute or to only a component of the attribute
      • Single-valued
        • Only one value attribute
        • Represented by Single Eclipse
      • Multi-valued
        • Attribute having more than one value
        • Limit constraint may be applied, upper or lower limits
        • Represented by Double Eclipse
      • Derived
        • Value of this type of attribute can be derived from the value of other related attributes
        • Represented by Dotted Eclipse
      • NULL Value
        • An attribute takes a null value when an entity does not have a value for it
        • It may indicate "not applicable", value doesn’t exist
          • Example => Person having no middle-name
        • It may indicate "unknown"
          • Unknown can indicate missing entry
            • Example => Name value of a customer is NULL, means it is missing as name must have some value
          • Not known
      • Stored
        • Which can't be derived from other Attributes
      • Key
        • Having unique value
        • Represented by Circle with Underline
      • Required & Optimal
        • If the value is Required or not
      • Complex
        • Composite + Multi-valued
  • Relationship
    • Association among two or more entities
    • Attribute of Relation is called Descriptive Attribute other than the 2 Foreign keys referencing to Primary Key of each table
    • Types
      • Strong Relationship
        • Between two independent entities
        • Represented by Diamond
      • Weak Relationship
        • Between weak entity and its owner/strong entity
        • Represented by Double-Diamond
    • Degree of Relationship => Number of entities participating in a relationship
      • Unary, Only one entity participates
        • Example => Employee manages employee
      • Binary, two entities participates
        • Binary are common
        • Example => Student takes Course
      • Ternary relationship, three entities participates
        • Example => Employee works-on branch, employee works-on job
  • Relationships Constraints
    • Mapping Cardinality/Cardinality Ratio
      • Number of entities to which another entity can be associated via a relationship
      • Types
        • One to One (1:1)
          • Entity in A associates with at most one entity in B, where A & B are entity sets. And an entity of B is associated with at most one entity of A
          • One Value should be taken from both Entity => Primary Key can be from any of 2
          • Can be Reduced to 2 Tables
        • One to Many (1:M)
          • Entity in A associated with N entity in B. While entity in B is associated with at most one entity in A
          • Primary Key in Relationship table will be from the "many" side
          • Can be merged with the "many" table
        • Many to One (M:1)
          • Entity in A associated with at most one entity in B. While entity in B can be associated with N entity in A
          • Primary Key will be from the "many" side
          • Can be merged with the "many" table
        • Many to Many (M:N)
          • Entity in A associated with N entity in B. While entity in B also associated with N entity in A
          • Composite Key will be formed
          • Can't be reduced
    • Participation Constraints (Minimum cardinality constraint)
      • Types
        • Partial Participation
          • Not all entities are involved in the relationship instance
          • Represented by Single-line
        • Total Participation
          • Each entity must be involved in at least one relationship instance
          • Represented by Double-line
      • Weak entity has total participation constraint, but strong may not have total
  • Specialisation
    • Specialisation is splitting up the entity set into further sub entity sets on the basis of their functionalities, specialties and features
    • It is a Top-Down approach
    • Eg: Person entity set can be divided into customer, student, employee. Person is superclass and other specialised entity sets are subclasses
      • We have "is-a" relationship between superclass and subclass
      • Represented by triangle component
    • Why Specialisation?
      • Certain attributes may only be applicable to a few entities of the parent entity set
      • DB designer can show the distinctive features of the sub entities
      • To group such entities we apply Specialisation, to overall refine the DB blueprint
  • Generalisation
    • It is just a reverse of Specialisation
    • DB Designer, may encounter certain properties of two entities are overlapping. Designer may consider to make a new generalised entity set. That generalised entity set will be a super class.
    • "is-a" relationship is present between subclass and super class
    • Eg: Car, Jeep and Bus all have some common attributes, to avoid data repetition for the common attributes. DB designer may consider to Generalise to a new entity set "Vehicle"
    • It is a Bottom-up approach
    • Why Generalisation?
      • Makes DB more refined and simpler
      • Common attributes are not repeated
  • Attribute Inheritance
    • Both Specialisation and Generalisation, has attribute inheritance
    • The attributes of higher level entity sets are inherited by lower level entity sets
    • Eg: Customer & Employee inherit the attributes of Person
  • Participation Inheritance
    • If a parent entity set participates in a relationship then its child entity sets will also participate in that relationship
  • Aggregation
    • To show relationships among relationships
    • Abstraction is applied to treat relationships as higher-level entities. We can call it Abstract entity
    • Avoid redundancy by aggregating relationship as an entity set itself
  • Steps to make ER diagram
    • Requirement engineering
    • Identify identity sets
    • Identify attributes & its types
    • Identify relationships & its constraints (Mapping, Participation)

Relational Model


  • Relational Model (RM) organizes the data in the form of relations (tables)
  • A relational DB consists of collection of tables, each of which is assigned a unique name
  • Row => Represents a relationship among a set of values, and table is collection of such relationships
  • Tuple => A single row of the table representing a single data point / a unique record
  • Columns => Represents the attributes of the relation
    • Each attribute, there is a permitted value, called domain of the attribute
  • Relation Schema => Defines the design and structure of the relation, contains the name of the relation and all the columns/attributes
    • Eg: TableName (Column1, Column2, Column3)
  • Implementation (Database creation)
    • RDBMS is a software that implements RM
    • Eg: Oracle, IBM, MySQL, MS Access
  • Degree of table => Number of attributes/columns in a given table/relation
  • Cardinality => Total no. of tuples in a given relation
  • Relational Key => Set of attributes which can uniquely identify an each tuple
  • Important properties of a Table in Relational Model
    • The name of relation is distinct among all other relation
    • The values have to be atomic. Can’t be broken down further
    • The name of each attribute/column must be unique
    • Each tuple must be unique in a table
    • The sequence of row and column has no significance
    • Tables must follow integrity constraints => Helps to maintain data consistency across the tables
  • Relational Model Keys
    • Super Key (SK)
      • Any Permutation & Combination of attributes present in a table which can uniquely identify each tuple
    • Candidate Key (CK)
      • Minimum subset of super keys, which can uniquely identify each tuple
      • It contains no redundant attribute
      • CK value shouldn’t be NULL
    • Primary Key (PK)
      • Selected out of CK set, has the least no. of attributes
    • Alternate Key (AK)
      • All CK except PK
    • Foreign Key (FK)
      • It creates relation between two tables
      • FK helps to cross reference between two different relations
      • A relation, say r1 (Child table/Referencing relation), may include among its attributes the PK of an other relation, say r2 (Parent table/Referenced relation)
        • This attribute is called FK from r1 referencing r2
    • Composite Key
      • PK formed using at least 2 attributes
    • Compound Key
      • PK which is formed using 2 FK
    • Surrogate Key
      • Synthetic PK
      • Generated automatically by DB, usually an integer value
      • May be used as PK
  • Integrity Constraints
    • CRUD Operations must be done with some integrity policy so that DB is always consistent
    • Introduced so that we do not accidentally corrupt the DB
    • Domain Constraints
      • Restricts the value in the attribute of relation, specifies the Domain
      • Restrict the Data types of every attribute
    • Entity Constraints
      • Every relation should have PK & PK != NULL
    • Referential Constraints
      • Specified between two relations & helps maintain consistency among tuples of two relations
      • FK must have the matching PK for its each value in the parent table or it must be NULL
      • Insert constraint
        • Value can't be inserted in the child table if the value is not present in the parent table
      • Delete constraint
        • Value can't be deleted in the parent table if the value is present in the child table
        • On Delete Cascade
          • When deleting value from the parent table, delete corresponding entry from the child table
        • On Delete Null
          • When deleting value from the parent table, value of foreign key of corresponding entry from the child table becomes null
    • Key Constraints
      • NOT NULL => This constraint will restrict the user from not having a NULL value
        • It ensures that every element in the database has a value
      • UNIQUE => It helps us to ensure that all the values consisting in a column are different from each other
      • DEFAULT => It is used to set the default value to the column
        • The default value is added to the columns if no value is specified for them
      • CHECK => It keeps the check that integrity of data is maintained before and after the completion of the CRUD
        • Limits value range (Domain)
      • PRIMARY KEY => This is an attribute or set of attributes that can uniquely identify each entity in the entity set
        • The primary key must contain unique as well as not null values
      • FOREIGN KEY => Common attribute that defines some relationship between two entities must be the primary key of an entity set and will become the foreign key of another entity set
        • This key will prevent every action which can result in loss of connection between tables

Transform ER Model to Relational Model (ER-Diagram to Tables)


  • Strong Entity
    • Becomes an individual table with entity name, attributes becomes columns of the relation
    • Entity’s Primary Key (PK) is used as Relation’s PK
    • FK are added to establish relationships with other relations
  • Weak Entity
    • A table is formed with all the attributes of the entity
    • PK of its corresponding Strong Entity will be added as FK
    • PK of the relation will be a composite PK, {FK + Partial discriminator Key}
  • Single Values Attributes
    • Represented as columns directly in the tables/relations
  • Composite Attributes
    • Handled by creating a separate attribute itself in the original relation for each composite attribute
  • Multivalued Attributes
    • New tables (named as original attribute name) are created for each multivalued attribute
    • PK of the entity is used as column FK in the new table
    • Multivalued attribute’s similar name is added as a column to define multiple values
    • PK of the new table would be {FK + multivalued name}
  • Derived Attributes => Not considered in the tables
  • Generalisation
    • Method-1
      • Create a table for the higher level entity set
      • For each lower-level entity set, create a table that includes a column for each of the attributes of that entity set plus a column for each attribute of the primary key of the higher-level entity set
    • Method-2
      • If the generalisation is disjoint and complete—that is, if no entity is a member of two lower-level entity sets directly below a higher-level entity set, and if every entity in the higher level entity set is also a member of one of the lower-level entity sets. Here, do not create a table for the higher-level entity set
      • Instead, for each lower-level entity set, create a table that includes a column for each of the attributes of that entity set plus a column for each attribute of the higher-level entity sets
      • Drawbacks
        • If the second method were used for an overlapping generalisation, some values would be stored twice unnecessarily
        • Similarly, if the generalisation were not complete, if some accounts were neither savings nor current accounts, then such accounts could not be represented with the second method
  • Aggregation
    • Table of the relationship set is made
    • Attributes includes primary keys of entity set and aggregation set’s entities
    • Also, add descriptive attribute if any on the relationship
Share: