Relational databases are quite popular, even though it was a system designed in the 1970s
Also known as relational database management systems (RDBMS), relational databases commonly use Structured Query Language (SQL) for operations such as creating, reading, updating, and deleting data
Relational databases store information in discrete tables, which can be JOINed together by fields known as foreign keys
Eg: MySQL, Microsoft SQL Server, Oracle
They are ubiquitous, highly optimised for working with structured data
They provide a stronger guarantee of data normalisation
Scalability issues (Horizontal Scaling)
Data become huge, system become more complex
Object Oriented Databases
The object-oriented data model, is based on the object-oriented-programming paradigm, which is now in wide use
Inheritance, object-identity, and encapsulation (information hiding), with methods to provide an interface to objects, are among the key concepts of object-oriented programming that have found applications in data modelling
The object-oriented data model also supports a rich type system, including structured and collection types
While inheritance and, to some extent, complex types are also present in the E-R model, encapsulation and object-identity distinguish the object-oriented data model from the E-R model
In Object-oriented databases data is treated as an object
All bits of information come in one instantly available object package instead of multiple tables
Eg: ObjectDB, GemStone etc
Advantages
Data storage and retrieval is easy and quick
Can handle complex data relations and more variety of data types that standard relational databases
Relatively friendly to model the advance real world problems
Works with functionality of OOPs and Object Oriented languages
Disadvantages
High complexity causes performance issues like read, write, update and delete operations are slowed down
Not much of a community support as isn’t widely adopted as relational databases
Does not support views like relational databases
Hierarchical Databases
Based on a concrete hierarchy
The schema for hierarchical databases is defined by its tree-like organisation, in which there is typically a root “parent” directory of data stored as records that links to various other subdirectory branches, and each subdirectory branch, or child record, may link to various other subdirectory branches
Data within records is stored in the form of fields, and each field can only contain one value
Retrieving hierarchical data from a hierarchical database architecture requires traversing the entire tree, starting at the root node
Since the disk storage system is also inherently a hierarchical structure, these models can also be used as physical models
The key advantage of a hierarchical database is its ease of use
The one-to-many organisation of data makes traversing the database simple and fast, which is ideal for use cases such as website drop-down menus or computer folders
Due to the separation of the tables from physical storage structures, information can easily be added or deleted without affecting the entirety of the database
And most major programming languages offer functionality for reading tree structure databases
The major disadvantage of hierarchical databases is their inflexible nature
The tree-like organisation of data requires top-to-bottom sequential searching, which is time consuming, and requires repetitive storage of data in multiple different entities, which can be redundant
Eg: IBM IMS
Network Databases
Extension of Hierarchical databases
The child records are given the freedom to associate with multiple parent records
Organised in a Graph structure
Can handle complex relations
Maintenance is tedious
M:N links may cause slow retrieval
Not much web community support
Eg: Integrated Data Store (IDS), IDMS (Integrated Database Management System), Raima Database Manager, TurboIMAGE
DB Optimization
Distributed Database
A single logical database that is, spread across multiple locations (servers) and logically interconnected by network
This is the product of applying DB optimisation techniques like Clustering, Partitioning and Sharding
Clustering
Database Clustering (making Replica-sets) is the process of combining more than one servers or instances connecting a single database
Database clustering, SQL server clustering, and SQL clustering are closely associated with SQL is the language used to manage the database information, sometimes used to handle large traffic
Replicate the same dataset on different servers
In cluster architecture, all requests are split with many computers so that an individual user request is executed and produced by a number of computer systems
The clustering is serviceable definitely by the ability of load balancing and high-availability
If one node collapses, the request is handled by another node
Consequently, there are few or no possibilities of absolute system failures
Advantages
Data Redundancy
Clustering of databases helps with data redundancy, as we store the same data at multiple servers
The redundancy that clustering offers is required and is quite certain due to the synchronisation
In case any of the servers had to face a failure due to any possible reason, the data is available at other servers to access
Load balancing
Scalability doesn’t come by default with the database, brought by clustering
Basically, what load balancing does is allocating the workload among the different servers that are part of the cluster
This can provide scaling seamlessly as required
This links directly to high availability
High availability
High availability refers the amount of time a database is considered available
The amount of availability you need greatly depends on the number of transactions you are running on your database and how often you are running any kind of analytics on your data
Partitioning
The technique used to divide stored database objects into separate servers
Due to this, there is an increase in performance, controllability of the data
When Horizontally scaling our machines/servers, dealing with relational databases becomes quite tough to maintain the relations
If we apply partitioning to the database that is already scaled out i.e. equipped with multiple servers, we can partition our database among those servers and handle the big data easily
Types
Vertical Partitioning
Slicing relation vertically / column-wise
Need to access different servers to get complete tuples
Horizontal Partitioning
Slicing relation horizontally / row-wise
Independent chunks of data tuples are stored in different servers
When Partitioning is Applied?
Dataset become much huge that managing and dealing with it become a tedious task
The number of requests are enough larger that the single DB server access is taking huge time and hence the system’s response time become high
Instead of having all the data sit on one DB instance, we split it up and introduce a Routing layer so that we can forward the request to the right instances that actually contain the data
Pros
Scalability, Availability
Cons
Complexity => Making partition mapping, Routing layer to be implemented in the system, Non-uniformity that creates the necessity of Re-Sharding
Scatter-Gather problem => Not well suited for Analytical type of queries, as the data is spread across different DB instances