Limitations of Hadoop 1.0
- No Random Access --> Hadoop is geared toward batch access (OLAP)
- Not suitable for real-time access
- No Updates --> Access pattern is WORM (Write Once, Read Many times), which is the workload Hadoop is best suited for
Why HBase
- Flexible Schema Design --> A new column can be added whenever a row is written
- Multiple timestamped versions of a single cell's data
- Column-oriented storage (columns are grouped into column families)
- Cache columns at client side
- Compression configured per column family
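The flexible schema and cell versioning points above can be pictured with a small in-memory model. This is only a sketch of the data model, not the HBase client API: the class `ToyTable`, its methods, and the sample row keys and values are all hypothetical names made up for illustration.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (NOT the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # row key -> "family:qualifier" -> list of (timestamp, value), newest first
        self.rows = defaultdict(dict)

    def put(self, row, column, value, ts):
        # Any row may introduce a new column; nothing is declared up front.
        versions = self.rows[row].setdefault(column, [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries for this cell.
        del versions[self.max_versions:]

    def get(self, row, column, versions=1):
        # Return up to `versions` entries for the cell, newest first.
        return self.rows.get(row, {}).get(column, [])[:versions]


table = ToyTable(max_versions=2)
table.put("user1", "info:name", "Ann", ts=1)
table.put("user1", "info:name", "Anna", ts=2)
table.put("user1", "info:name", "Anne", ts=3)
table.put("user2", "info:email", "bob@example.com", ts=1)  # column only user2 has

print(table.get("user1", "info:name"))              # [(3, 'Anne')]
print(table.get("user1", "info:name", versions=5))  # [(3, 'Anne'), (2, 'Anna')]
print(table.get("user2", "info:name"))              # []
```

Each cell keeps a short history keyed by timestamp, and a row only stores the columns it was actually given, which is why adding a column does not require changing existing rows.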
Read vs Write
- Trade-off between Availability (compromise on writes) and Consistency (compromise on reads)
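HBase
- NoSQL - A class of non-relational storage systems
- An RDBMS stores data row by row; HBase stores it column-oriented, grouped by column family
- HBase relies on HDFS for data replication
- ZooKeeper - Coordinates the cluster. The client first asks ZooKeeper where the hbase:meta catalog table is located, then reads and writes directly against the Region Servers. The HMaster handles region assignment and schema changes and is not in the path of client data requests
- Region Server - Serves the regions assigned to it. The Region Server process runs on the slave nodes (Data Nodes)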
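To make "serves the region" concrete: a table is split into regions, each covering a contiguous, sorted range of row keys, and each region is assigned to one Region Server. The sketch below shows how a row key maps to a region by binary search over the region start keys. The split points, server names, and the function `locate_region` are assumptions chosen for illustration, not HBase code.

```python
import bisect

# Hypothetical split points: each region starts at one of these row keys.
# The empty string marks the first region, which starts at the beginning of the table.
REGION_START_KEYS = ["", "g", "n", "t"]
# Hypothetical Region Server assigned to each region (one server can hold several regions).
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs1"]


def locate_region(row_key):
    """Return (region index, server) for the region whose key range contains row_key."""
    # The owning region is the last one whose start key is <= row_key.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return idx, REGION_SERVERS[idx]


print(locate_region("apple"))  # (0, 'rs1')
print(locate_region("g"))      # (1, 'rs2')
print(locate_region("zebra"))  # (3, 'rs1')
```

Because row keys are kept sorted, a lookup only needs the list of region boundaries, and rows with nearby keys land on the same server.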
Happy Learning!!!