Sunday, April 14, 2013

Hadoop tools

Karmasphere

Abstraction layer and tool on top of Hadoop that lets users query Hadoop files using SQL. The tool consists of an interactive web interface where users can create a project containing a set of queries. These queries can be shared and parameterised. The output can also be used by tools like Tableau.
A drawback is that it doesn't solve Hadoop's latency issue but rather provides a tracker that shows which queries are running and which ones have finished.

HBase

Open-source, non-relational (NoSQL) database written in Java that sits on top of Hadoop and thus takes advantage of HDFS to store Bigtable-style tables. It uses a key-value programming model in which data is represented as "row"/"column-family"/"column"/"timestamp"/"value".
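
To make the key-value layout concrete, here is a minimal in-memory sketch in Python. It is not the HBase client API: the class ToyTable, its put/get methods, and the sample row and column names are all hypothetical, chosen only to illustrate how a value is addressed by row, column family, column and timestamp. The "latest version at or before a timestamp" lookup is a simplification made for this toy.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of the row/column-family/column/timestamp/value layout.

    Hypothetical illustration only; this is NOT the HBase API.
    """

    def __init__(self):
        # row -> (family, column) -> {timestamp: value}
        self._cells = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, column, timestamp, value):
        # Each write adds a new version of the cell, keyed by its timestamp.
        self._cells[row][(family, column)][timestamp] = value

    def get(self, row, family, column, timestamp=None):
        # Look up all stored versions of the cell without creating empty entries.
        versions = self._cells.get(row, {}).get((family, column), {})
        if not versions:
            return None
        if timestamp is None:
            # No timestamp given: return the most recent version.
            return versions[max(versions)]
        # Otherwise return the newest version written at or before `timestamp`.
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None


table = ToyTable()
table.put("user1", "info", "email", 1, "old@example.com")
table.put("user1", "info", "email", 2, "new@example.com")
print(table.get("user1", "info", "email"))               # newest version
print(table.get("user1", "info", "email", timestamp=1))  # version as of t=1
```

The point of the sketch is that the full key is the combination of row, column family, column and timestamp, so writing the same cell twice keeps both versions rather than overwriting the first.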


Flume

Mahout

Hive

Pig