NoSQL databases (aka "not only SQL") are non-tabular databases and store data differently than relational tables
NoSQL databases come in a variety of types based on their data model
The main types are document, key-value, wide-column, and graph
They provide flexible schemas and scale easily with large amounts of data and high user loads
They do not require a schema to be defined in advance
Their data structures are not tabular; they are more flexible and can adjust dynamically
They can handle huge amounts of data (big data)
Most NoSQL databases are open source and support horizontal scaling
They simply store data in some format other than relational tables
History behind NoSQL
NoSQL databases emerged in the late 2000s as the cost of storage dramatically decreased
Data was becoming increasingly unstructured, so structuring it (defining a schema in advance) had become costly
NoSQL databases give developers the flexibility to adapt quickly to changing requirements
With cloud computing, developers also wanted the ability to distribute data across multiple servers and regions to make their applications resilient, to scale out instead of scaling up, and to geo-place their data intelligently
NoSQL Databases Advantages
Flexible Schema
An RDBMS has a pre-defined schema, which becomes an issue when we do not have all the data up front or when we need to change the schema. Changing a schema on the go is a huge task
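As a rough illustration of a flexible schema, the sketch below uses plain Python dictionaries to stand in for documents in one collection. The collection and field names (users, phone, address) are made up for this example and are not tied to any particular database.

```python
# A plain list stands in for a document collection (hypothetical data).
users = []

# The first document has only the fields known at the time.
users.append({"_id": 1, "name": "Asha"})

# A later document adds new fields; no schema migration is needed.
users.append({"_id": 2, "name": "Ben", "phone": "555-0100",
              "address": {"city": "Juneau", "state": "Alaska"}})

def field_names(collection):
    """Return the set of top-level field names seen across all documents."""
    names = set()
    for doc in collection:
        names.update(doc.keys())
    return names

# Readers must tolerate missing fields, for example with dict.get().
phones = [doc.get("phone") for doc in users]
```

The flexibility moves some responsibility to the application: code that reads the documents has to cope with fields that are absent in older records.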
Horizontal Scaling
Horizontal scaling, also known as scale-out, refers to bringing on additional nodes to share the load. This is difficult with relational databases due to the difficulty in spreading out related data across nodes
With non-relational databases, this is made simpler since collections are self-contained and not coupled relationally. This allows them to be distributed across nodes more simply, as queries do not have to “join” them together across nodes
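Horizontal scaling is achieved through sharding (partitioning data across nodes) or replica sets (copying data to several nodes so that reads can be spread across them)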
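One simple way to picture sharding is hash-based placement: hash each key and use the result to pick a node. The sketch below assumes three hypothetical node names and modulo placement; it is an illustration only, not how any specific database assigns shards.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names

def shard_for(key, nodes=NODES):
    """Pick a node by hashing the key (simple modulo sharding)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def distribute(keys, nodes=NODES):
    """Group keys by the node that would own them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[shard_for(key, nodes)].append(key)
    return placement

# Spread 1000 made-up keys across the three nodes.
placement = distribute([f"user:{i}" for i in range(1000)])
```

Modulo placement moves most keys when the number of nodes changes, which is why real systems often prefer schemes such as consistent hashing or range-based sharding.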
High Availability
Many NoSQL databases are highly available thanks to automatic replication, i.e. data is copied to several servers, so after a failure it can be recovered in its last consistent state from a replica
If a server fails, we can access the data from another server, because a NoSQL database stores data on multiple servers
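The failover idea can be sketched with a toy in-memory model. The class names (Replica, ReplicatedStore) and server names are invented for this example; real databases also handle leader election and bringing a recovered replica back up to date, which this sketch leaves out.

```python
class Replica:
    """A toy in-memory server holding a copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

class ReplicatedStore:
    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        # Write to every replica that is currently reachable.
        for r in self.replicas:
            if r.up:
                r.data[key] = value

    def get(self, key):
        # Read from the first reachable replica that has the key.
        for r in self.replicas:
            if r.up and key in r.data:
                return r.data[key]
        raise KeyError(key)

store = ReplicatedStore([Replica("s1"), Replica("s2"), Replica("s3")])
store.put("state", "Alaska")
store.replicas[0].up = False      # simulate a server failure
value = store.get("state")        # still served by another replica
```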
Easy insert and read operations
Queries in NoSQL databases can be faster than in SQL databases
Data in SQL databases is typically normalised, so queries for a single object or entity require you to join data from multiple tables
As your tables grow in size, the joins can become expensive
However, data in NoSQL databases is typically stored in a way that is optimised for queries
The rule of thumb when you use MongoDB is data that is accessed together should be stored together
However, delete and update operations are more difficult
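The difference between joining normalised data and reading data that is stored together can be sketched in plain Python. The customer and order records below are made up, and the dictionaries only stand in for tables and documents.

```python
# Normalised (relational style): one entity is split across tables.
customers = {1: {"name": "Asha"}}
orders = [{"order_id": 10, "customer_id": 1, "total": 25.0},
          {"order_id": 11, "customer_id": 1, "total": 40.0}]

def load_customer_normalised(customer_id):
    """Rebuild the entity with a join-like lookup over two tables."""
    customer = dict(customers[customer_id])
    customer["orders"] = [
        {"order_id": o["order_id"], "total": o["total"]}
        for o in orders if o["customer_id"] == customer_id
    ]
    return customer

# Denormalised (document style): the same entity stored together.
customer_docs = {
    1: {"name": "Asha",
        "orders": [{"order_id": 10, "total": 25.0},
                   {"order_id": 11, "total": 40.0}]}
}

def load_customer_document(customer_id):
    """A single lookup by key returns the whole entity."""
    return customer_docs[customer_id]
```

Both functions return the same entity, but the document version needs one lookup. The trade-off is that any value duplicated across documents has to be changed in every copy, which is one reason updates and deletes are harder.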
Caching mechanism
NoSQL is particularly well suited to cloud applications
When to use NoSQL?
Fast-paced Agile development
Storage of unstructured and semi-structured data
Huge volumes of data
Requirements for scale-out architecture
Modern application paradigms like micro-services and real-time streaming
NoSQL DB Misconceptions
Misconception: relationship data is best suited for relational databases
NoSQL databases can store relationship data; they just store it differently than relational databases do
Related data doesn’t have to be split between tables
Misconception: NoSQL databases don't support ACID transactions
Some NoSQL databases like MongoDB do, in fact, support ACID transactions
Types of NoSQL Data Models
Key-Value Stores
Every data element in the database is stored as a key-value pair consisting of an attribute name (or "key") and a value
In a sense, a key-value store is like a relational database with only two columns: the key or attribute name (such as "state") and the value (such as "Alaska")
Key-value databases use compact, efficient index structures to be able to quickly and reliably locate a value by its key, making them ideal for systems that need to be able to find and retrieve data in constant time
There are several use-cases where choosing a key value store approach is an optimal solution
Real-time random data access, e.g. user session attributes in an online application such as gaming or finance
Caching mechanism for frequently accessed data or configuration based on keys
Applications designed around simple key-based queries
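A minimal sketch of the session and caching use cases, assuming a Python dict as the underlying index (dict lookups are constant time on average). The KeyValueStore class, its ttl parameter, and the key names are hypothetical and only illustrate the idea.

```python
import time

class KeyValueStore:
    """A toy key-value store with optional per-key expiry (TTL)."""
    def __init__(self, clock=time.monotonic):
        self._data = {}          # key -> (value, expires_at or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires_at = self._clock() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return default
        return value

sessions = KeyValueStore()
sessions.set("session:42", {"user": "asha"}, ttl=1800)  # expires in 30 minutes
session = sessions.get("session:42")
```

Every operation goes through a single key, which is why this model fits applications built around simple key-based queries and does not suit searches by value.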
Wide-Column Stores
The data is stored so that each value of a column sits next to the other values from that same column
While a relational database stores data in rows and reads data row by row, a column store is organised as a set of columns
This means that when you want to run analytics on a small number of columns, you can read those columns directly without consuming memory with the unwanted data
Values within a column are usually of the same type, so they compress more efficiently, making reads even faster
Updates are slower, as new values need to be inserted in between existing ones
Eg: Cassandra (wide-column store); Redshift and Snowflake (columnar data warehouses)
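The row versus column layout can be sketched with plain Python lists. The table contents are made up, and run-length encoding is used here only as one simple example of why same-typed, repetitive column data compresses well.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "state": "Alaska", "sales": 120},
    {"id": 2, "state": "Texas",  "sales": 300},
    {"id": 3, "state": "Alaska", "sales": 80},
]

def to_columns(records):
    """Pivot a list of rows into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns

columns = to_columns(rows)

# An aggregate over one column touches only that column's values.
total_sales = sum(columns["sales"])

def run_length_encode(values):
    """Compress runs of repeated values as [value, count] pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded
```

Adding a new record means appending to every column list, which hints at why writes and updates cost more in this layout than reads over a few columns.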
Document Based Stores
These databases store data in documents similar to JSON (JavaScript Object Notation) objects
Each document contains pairs of fields and values
The values can typically be a variety of types including things like strings, numbers, booleans, arrays, or objects
Some document databases, such as MongoDB, support ACID transactions, making them suitable for transactional workloads
Eg: MongoDB, CouchDB
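A small sketch of what such a document looks like, parsed with Python's standard json module. The document contents and the get_path helper are invented for illustration; document databases offer their own query syntax for nested fields.

```python
import json

# A hypothetical document: fields map to values of different types.
raw = """
{
  "_id": 1,
  "name": "Asha",
  "active": true,
  "tags": ["admin", "beta"],
  "address": {"city": "Juneau", "state": "Alaska"}
}
"""
doc = json.loads(raw)

def get_path(document, path, default=None):
    """Follow a dotted path such as 'address.state' through nested objects."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current

state = get_path(doc, "address.state")
```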
Graph Based Stores
A graph database focuses on the relationship between data elements
Each element is stored as a node (such as a person in a social media graph)
The connections between elements are called links or relationships
A graph database is optimised to capture and search the connections between data elements, overcoming the overhead associated with JOINing multiple tables in SQL
Very few real-world business systems can survive solely on graph queries. As a result graph databases are usually run alongside other more traditional databases
Use cases include fraud detection, social networks, and knowledge graphs
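The node-and-relationship model can be sketched as an adjacency list with a breadth-first search. The people and the "follows" relationships are made up, and a real graph database would use its own storage and query language rather than this in-memory dict.

```python
from collections import deque

# Nodes are people; each edge is a "follows" relationship (made-up data).
follows = {
    "asha": ["ben", "chen"],
    "ben": ["dana"],
    "chen": ["dana", "eli"],
    "dana": [],
    "eli": ["asha"],
}

def within_hops(graph, start, max_hops):
    """Return nodes reachable from start in at most max_hops edges (BFS)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]
    return seen  # maps each reachable node to its distance in hops

nearby = within_hops(follows, "asha", 2)
```

Each hop follows stored links directly from a node, whereas the relational equivalent would typically need one self-join per hop.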
NoSQL Databases Disadvantages
Data Redundancy
Storage is currently so cheap that most consider this a minor drawback, and some NoSQL databases also support compression to reduce the storage footprint
Update and delete operations are costly
No single NoSQL data model fulfils all of your application needs
Many NoSQL databases do not fully support ACID properties, although some do
Many do not support data entry with consistency constraints