"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 01, 2020

NoSQL Internals and Design Practices

Objective – The objective of this paper is to analyze NoSQL internals from an RDBMS developer's perspective and provide design guidelines for NoSQL applications.
Analysis
RDBMS – Relational databases are built to maintain the ACID properties and keep a single version of the truth. They play a critical role in OLTP applications in domains such as banking, finance, and payments.
Database design – The schema is normalized to avoid data redundancy. Primary keys and indexes are created so that query plans can use them to filter the required rows and return results in the shortest time.

Query Execution – Data is typically stored in B-Tree structures, and a clustered index determines the physical order of the rows. This is why a search on the primary key (when it is the clustered index) is quick compared with a search on a non-indexed column. The database engine optimizes the execution plan by leveraging indexes (clustered and non-clustered), statistics, and partitioning. Depending on the query, join and sort operators are applied to produce the execution plan. The plan is reused if it already exists in memory.
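To make the index point concrete, here is a small sketch using SQLite through Python's built-in sqlite3 module. The accounts table, its columns, and the index name are made up for illustration, and SQLite is only a stand-in for whichever RDBMS you use; the exact plan text varies between engines and versions. It asks the engine for its plan for a primary key lookup, for a filter on a non-indexed column, and for the same filter after a secondary index is added.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable steps of SQLite's plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each returned row holds the plan description.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table: INTEGER PRIMARY KEY makes `id` the key of the table's B-tree.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(i, f"owner{i}", 100.0 * i) for i in range(1, 1001)],
)

# Lookup by primary key: the engine can search the B-tree directly.
pk_plan = query_plan(conn, "SELECT * FROM accounts WHERE id = ?", (42,))

# Filter on a non-indexed column: the engine has to scan the table.
scan_plan = query_plan(conn, "SELECT * FROM accounts WHERE owner = ?", ("owner42",))

# After adding a secondary (non-clustered) index, the same filter can use it.
conn.execute("CREATE INDEX idx_accounts_owner ON accounts (owner)")
indexed_plan = query_plan(conn, "SELECT * FROM accounts WHERE owner = ?", ("owner42",))

print(pk_plan)
print(scan_plan)
print(indexed_plan)
```

The first plan should report a SEARCH on the primary key, the second a SCAN of the table, and the third a SEARCH that names idx_accounts_owner.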
This paper was very useful for understanding OLTP internals. Reposting notes from my blog post:
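  • WAL (write-ahead logging) – Changes are written to the log before they are applied to the data files; modified pages are flushed to disk when a checkpoint is reached
  • Buffer Manager – Cache for recently fetched / recently used data pages
  • Two-Phase locking – Locks are acquired in a growing phase and released in a shrinking phase; whether concurrency is handled pessimistically (locks) or optimistically (row versions) depends on the isolation level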
  • Concurrency control – Based on isolation levels
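The WAL, buffer and checkpoint notes above can be sketched as a toy key-value store. This is only an illustration under simplifying assumptions: the class name TinyWalStore, the JSON file layout, and the single-threaded design are all invented here, and a real engine logs page-level changes with log sequence numbers, locking, and much more careful recovery.

```python
import json
import os
import tempfile

class TinyWalStore:
    """Toy key-value store: log first, apply in memory, flush at checkpoint."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "wal.log")
        self.data_path = os.path.join(directory, "data.json")
        self.pages = {}  # stands in for the buffer manager's cached pages
        self._recover()

    def _recover(self):
        # Load the last checkpointed state, then replay the log on top of it.
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.pages = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    record = json.loads(line)
                    self.pages[record["key"]] = record["value"]

    def put(self, key, value):
        # Write-ahead rule: the log record reaches disk before the change is applied.
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.pages[key] = value

    def checkpoint(self):
        # Flush the cached state to the data file, then truncate the log.
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.pages, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)
        open(self.log_path, "w").close()

with tempfile.TemporaryDirectory() as d:
    store = TinyWalStore(d)
    store.put("acct:1", 100)
    store.checkpoint()
    store.put("acct:1", 250)  # logged, but not yet checkpointed
    # Simulate a crash: drop the in-memory state and reopen from disk.
    recovered = TinyWalStore(d).pages
    print(recovered)
```

The second put never reached the data file, yet the reopened store still sees it because recovery replays the log on top of the last checkpoint.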
NoSQL Databases 
Similar to the OLTP aspects above, there are a few papers that describe designing NoSQL apps for read-heavy / write-heavy workloads. This paper was very useful for understanding the NoSQL perspective on designing apps for columnar databases.

For Heavy Writes
  • Tall Skinny Tables
  • Consolidate data into single columns
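As a rough picture of the two bullets above, here is how the same events might be laid out wide versus tall and skinny, with each event's fields consolidated into one serialized column. Plain dicts stand in for a column-family table, and the row key format (user id, "#", zero-padded timestamp) and the single column name "d" are assumptions made for this sketch, not something prescribed by the paper.

```python
import json

def wide_row(user_id, events):
    """Wide layout: one row per user, one column per event timestamp."""
    return {user_id: {f"evt:{ts}": json.dumps(payload, sort_keys=True)
                      for ts, payload in events}}

def tall_rows(user_id, events):
    """Tall, skinny layout: one row per event, keyed by user id and timestamp.

    All fields of an event are consolidated into a single serialized column.
    """
    rows = {}
    for ts, payload in events:
        row_key = f"{user_id}#{ts:013d}"  # zero-padded so keys sort by time
        rows[row_key] = {"d": json.dumps(payload, sort_keys=True)}
    return rows

# Hypothetical sample events: (epoch milliseconds, payload)
events = [
    (1577836800000, {"type": "login", "ip": "10.0.0.1"}),
    (1577836860000, {"type": "purchase", "amount": 25}),
]

wide = wide_row("user42", events)
tall = tall_rows("user42", events)

print(len(wide), "wide row with", len(wide["user42"]), "columns")
print(len(tall), "tall rows:", list(tall))
```

In the tall layout every write appends a new small row with a single column, instead of adding another column to an ever-growing row.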
For Heavy Reads
  • Fewer column families
  • Use bloom filters
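For the bloom filter bullet, a minimal sketch of the data structure itself may help. A bloom filter answers "definitely not present" or "possibly present", which lets a read skip files that cannot contain the requested row key. The class below is illustrative only: the sizes, the SHA-256 based hashing, and the method names are my assumptions, not the implementation used by any particular database.

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key was never added; True means it may have been.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bloom = BloomFilter()
for row_key in ("user1", "user2", "user3"):
    bloom.add(row_key)

present = [bloom.might_contain(k) for k in ("user1", "user2", "user3")]
print(present)
```

A key that was added always comes back as possibly present; a key that was never added usually comes back as absent, and only then does the read path get to skip the lookup.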
There are multiple types of NoSQL databases (key-value, document-based, columnar, etc.).

Happy Learning!!!
