I wanted to try out the suggested pattern of tables. This can be compared to TSQL Equivalent of DDL, DML Scripts. Querying, Select with filters is the key learning from this exercise. Key Learning's are select with filter examples. There are no aggregate functions available in Hbase. Hive and Phoenix which sits on top of Hbase Serves for this purpose of aggregations.
HBASE (One Liner)- Low Latency, Consistent, best suited for random read/write big data access
Few bookmarks provide great short intro - link1
Hbase Queries (Tried it on cloudera-quickstart-vm-4.4.0-1-vmware)
hbase shell
Disable and Drop Table
Disable 'Employee'
Drop 'Employee'
Create Table
Details is a column family. Inside this column family there are members Name, Location, DOB and Salary
create 'Employee', {NAME => 'Details'}
Insert Records
put 'Employee', 'row1', 'Details:Name', 'Raj'
put 'Employee', 'row1', 'Details:Location', 'Chennai'
put 'Employee', 'row1', 'Details:DOB', '01011990'
put 'Employee', 'row1', 'Details:Salary', '1990'
put 'Employee', 'row2', 'Details:Name', 'Raja'
put 'Employee', 'row2', 'Details:Location', 'Delhi'
put 'Employee', 'row2', 'Details:DOB', '01011991'
put 'Employee', 'row2', 'Details:Salary', '5000'
put 'Employee', 'row3', 'Details:Name', 'Kumar'
put 'Employee', 'row3', 'Details:Location', 'Mumbai'
put 'Employee', 'row3', 'Details:DOB', '01011992'
put 'Employee', 'row3', 'Details:Salary', '50000'
Select based on Column Qualifiers
scan 'Employee', {COLUMNS => ['Details:Name', 'Details:Location']}
scan 'Employee', {COLUMNS => ['Details:Location']}
Single Filter - Search by Location Column Filter
scan 'Employee' ,{ FILTER => "SingleColumnValueFilter('Details','Location',=, 'binary:Chennai')" }
ValueFilter - Search by Location Value
scan 'Employee' , { COLUMNS => 'Details:Location', FILTER => "ValueFilter (=, 'binary:Chennai')"}
Multiple Filter - Search by Name and Location
scan 'Employee' ,{ FILTER => "SingleColumnValueFilter('Details','Location',=, 'binary:Chennai') AND SingleColumnValueFilter('Details','Name',=, 'binary:Raj')"}
Timestamp and Filter - Search with Timestamp and Filter
scan 'Employee' ,{TIMERANGE=>[1407324224184,1407324224391], COLUMNS => 'Details:Location', FILTER => "ValueFilter (=, 'binary:Chennai')"}
Timestamp and Filter - Search with multiple Filter on same column (contains)
scan 'Employee' ,{ FILTER => "SingleColumnValueFilter('Details','Location',=, 'binary:Chennai') OR SingleColumnValueFilter('Details','Location',=, 'binary:Delhi')"}
Filter using regExMatch
scan 'Employee' ,{ FILTER => "SingleColumnValueFilter('Details','Location',=, 'regexstring:Che*',true,true)" }
Search using Prefix
scan 'Employee' ,{ FILTER => "SingleColumnValueFilter('Details','Location',=, 'binary:Chennai') AND PrefixFilter ('row1')"}
Return only one record
scan 'Employee', {COLUMNS => ['Details:Location'], LIMIT => 1, STARTROW => 'Chennai'}
Return two records
scan 'Employee', {COLUMNS => ['Details:Location'], LIMIT => 2, STARTROW => 'Chennai'}
Return records between range of rowkeys
scan 'Employee', {COLUMNS => ['Details:Location'], STARTROW => 'row1',STOPROW=>'row2'}
Get Specific Row with Filter
get 'Employee', 'row1', {FILTER => "ValueFilter (=, 'binary:Chennai')"}
Count records
count 'Employee'
Timestamp Range - Return within the timestamp range
scan 'Employee' ,{TIMERANGE=>[1407324224391,1407324234707]}
Hbase Table 3 Column Family
create 'Employee1', {NAME => 'Name'}, {NAME => 'Location'}, {NAME => 'DOB'}
Query with column family
scan 'Employee1', {COLUMNS => ['Name']}
Delete a Column Family
alter ‘Employee', ‘delete’ => ‘Location'
No comments:
Post a Comment