"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

October 05, 2011

NOSQL - can it replace RDBMS Databases?

[You may also like - Big Data - Basics - Getting Started

NOSQL Databases are quite popular last couple of years. Will it replace RDBMS databases? What are their advantages? I have been working in SQL Server for last 5 years. Do I need to learn NOSQL as well?

First to get answer for this question, I need to understand

Ø  What NOSQL Provides
o   No Strict Schema. Store New Data without considering earlier stored data types
o   Provide Indexing, Querying features same as a RDBMS
o   Distributed Caching etc..
This link provides good read on Disadvantages of RDBMS databases. Also check whitepaper on NOSQL Technology from CouchBase site.

I want to do a few things on a NOSQL Database. I tried with MongoDB. Downloaded 32 bit Windows Installer. Unzipped the files. Detailed Steps provided in link

After creating Data directory, Run mongod.exe


Next is running Mongo.exe


Now we can try sample examples
  • Creating Table - Yes supported
  • Storing records - Yes supported
  • Execution plan - Yes supported
  • Stored Procedures – Javascript code in MongoDB. Good Read link
  • SQL Joins are not supported in MongoDB link
  • Triggers - Not Supported link
  • Indexes - Yes
This link SQL - MongoDB Mapping Chart was very useful to try our example. Reusing Schema created in link

Step 1 - Table Creation


Step 2 - Inserting Record and Selecting Records


Step 3 - Checking Execution Plan



Step 4 - Drop a collection
--Equivalent to Dropping a Table in TSQL

db.Earnings.drop();

Step 5 - Inserting Records
-- Creating Table
db.createCollection("Earnings");

db.Earnings.insert({Name:"Raja",SaleDate:"10/05/2011",SaleValue:12500,City:"Chennai", tags:["BestPerformer","top10"]});
db.Earnings.insert({Name:"Raja",SaleDate:"10/07/2011",SaleValue:5500,City:"Chennai",tags:["BestPerformer","top40"]});
db.Earnings.insert({Name:"Raja",SaleDate:"10/09/2011",SaleValue:8500,City:"Mumbai",tags:["BestPerformer","top20"]});
db.Earnings.insert({Name:"Robert",SaleDate:"10/05/2011",SaleValue:2500,City:"Hyderebad",tags:["AveragePerformer","top50"]});
db.Earnings.insert({Name:"Robert",SaleDate:"10/07/2011",SaleValue:7500,City:"Hyderabad",tags:["BestPerformer","top20"]});
db.Earnings.insert({Name:"Robert",SaleDate:"10/09/2011",SaleValue:8500,City:"Chennai",tags:["GoodPerformer","top30"]});
db.Earnings.insert({Name:"Allen",SaleDate:"10/05/2011",SaleValue:6500,City:"Delhi",tags:["AvgPerformer","top50"]});
db.Earnings.insert({Name:"Allen",SaleDate:"10/07/2011",SaleValue:6500,City:"Hyderabad",tags:["AvgPerformer","top50"]});
db.Earnings.insert({Name:"Allen",SaleDate:"10/09/2011",SaleValue:6500,City:"Bangalore",tags:["MediumPerformer","top70"]});

Step 6 -Listing all records

db.Earnings.find();

Step 7 - Select only few columns

db.Earnings.find({},{Name:1,City:1,SaleDate:1});
db.Earnings.find({},{Name:1,City:1,SaleDate:1});

Step 8 - Find all best performers based on tags

db.Earnings.find({tags:"BestPerformer"});
db.Earnings.find({tags:"top10"});

Step 9 - Is it case Sensitive ?

db.Earnings.find({tags:"bestperformer"});
Yes, It is case sensitive, Zero Records returned for above query

Step 10 - Using Regular Expressions
--Find all employees who name starts with R

db.Earnings.find({Name:/^R/});

Step 11 - Update Records

db.Earnings.update({Name:"Raja"}, {$set:{Name:"Raja NewName"}},false, true);

Step 12 - Using Query operators

db.Earnings.find({SaleValue:{$gt:8000}});
db.Earnings.find({SaleValue:{$gt:8000}},{Name:1,City:1,SaleDate:1,SaleValue:1});

Step 13 - Delete a record

db.Earnings.remove({City:"Chennai"});

Step 14 - Execution Plan without Index

db.Earnings.find({City:"Chennai"}).explain();

Step 15 - Execution Plan with Index

db.users.ensureIndex({City:1});
db.Earnings.find({City:"Chennai"}).explain();
I didn't see much difference between both the queries. Records are very less in our case. It would be good to try it with a larger data set

The next question that comes in my mind is - Can I use NOSQL DB for OLTP Application?
Stackoverflow question Which NoSQL DB is best fitted for OLTP financial systems? was the best answer I could find for my question
The aspects highlighted are
  • Consistency
  • Database Integrity
Facebook, Twitter – Social Networking Sites dealing with Data related to individuals. Data collected is
  • Check on Consistency / Integrity of Data may not be a priority as its mostly demographic information, interests, friends, hobbies etc.. related information is captured
  • Distributed Caching enables faster query and data retrieval for end user
  • Not Financial/Secure information/Trasnsactions (ex-Banking details of user)
Learning’s
  • Examples above would show MongoDB equivalent syntax for TSQL statement. Learning curve looks pretty straight forward
  • Based on the need you may decide SQL or NOSQL
    • Ordering Details of ecommerce site is best suited to be on RDBMS SQL platform to maintain Data Integrity, Consistency
    • Listing items / Managing Recommendations / Searching Inventory can be based on NOSQL databases
  • For BI platform, NOSQL can be a Datasource. Data Quality/ Data Cleaning Process – ETL has to be done on NSQL databases to clean-up data issues/ fix Integrity/Consistency issues in the NOSQL Databases. It would be a good learning to see a whitepaper/case study on BI Analytics on NOSQL Database
In a Nutshell, NOSQL - can it replace RDBMS Databases?
  • Yes for certain aspects (Non Critical features of OLTP App). Example - For of a web application (Listing items / Managing Recommendations / Searching Inventory/Product Reviews/Managing Customer Contacts) can be stored on NOSQL databases
  • RDBMS would still be preferred option for Ordering / critical financial related transactions (ex-online banking, online shopping – payments)
  • For BI – NOSQL *NO*, If NOSQL providers come up with additional tools/features for BI reporting on top of NOSQL then it would be worth to take a look
Predicted market share of NOSQL DB by 2015 - $1.8Bln. (Source)
Slightly Deviating from the topic, let’s see Databases used by product companies, PB Databases (1 Petabyte DB = 1000 Tera byte). Sneak peek into What DBs are used by product companies

  • Google - MySQLdb, Big Table
  • AOL - Oracle, Sybase, PostgreSQL, MySQL
  • AT&T Research - proprietary DB called Daytona
  • Ebay – Teradata
  • YahooOracle, HBASE (DB over Hadoop)
Source - Link. This is few years old. They might have moved to NOSQL for certain areas.

More Reads
Please feel free to add your comments.

Happy Learning!!!

No comments: