Hadoop
Hadoop is an open source project of the Apache Foundation. It is a framework written in Java, originally developed by Doug Cutting in 2005 to support distribution for Nutch, the text search engine. Hadoop uses Google's MapReduce and Google File System technologies as its foundation. Some of the major features of Hadoop are given below:
Hadoop is easily scalable: new nodes can be added to an existing cluster as data grows, without reloading or reformatting the data already stored there.
Hadoop is fault tolerant: data stored in HDFS is automatically replicated to other nodes, so the failure of a single node does not cause data loss.
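The replication idea can be sketched in a few lines of Python. This is a toy single-process model, not how HDFS is implemented; the function names and the four-node cluster are assumptions made for illustration, and the replication factor of 3 matches the HDFS default.

```python
import random

def place_blocks(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (HDFS defaults to 3)."""
    return {block: random.sample(nodes, replication) for block in blocks}

def readable_after_failure(placement, failed_node):
    """A block survives one node failure if any replica lives on another node."""
    return all(any(node != failed_node for node in replicas)
               for replicas in placement.values())

nodes = ["node1", "node2", "node3", "node4"]
placement = place_blocks(["blk_0", "blk_1", "blk_2"], nodes)

# With 3 replicas spread over 4 nodes, at least 2 copies of every
# block remain readable after any single node goes down.
assert readable_after_failure(placement, "node2")
```

Because each block has more replicas than any single failure can remove, the assertion holds no matter which node fails.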
Hadoop is fast at data processing, which is attributable to its ability to do parallel processing; it can run batch jobs roughly ten times faster than a single-threaded server or a mainframe.
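The MapReduce model that enables this parallelism can be illustrated with a minimal single-machine sketch. Real Hadoop jobs are written against the Hadoop Java API and run distributed; the function names below are illustrative stand-ins for the map, shuffle, and reduce phases.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word, as a Hadoop mapper would."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle/sort: group all values by key before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data", "hadoop processes data"]
counts = reduce_phase(shuffle(map_phase(lines)))
assert counts == {"hadoop": 2, "stores": 1, "data": 2, "processes": 1}
```

In a real cluster the map and reduce calls run on many nodes at once, which is where the speedup over a single-threaded server comes from.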
Comparison
Coming to the comparison, both Pig and Hive are high-level languages that compile down to MapReduce. HBase is different: it allows Hadoop to support lookups and transactions on key/value pairs. HBase permits:
1. fast random lookups, as opposed to scanning all of the data sequentially,
2. inserts/updates/deletes in the middle of the data, not just simple appends.
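Both points can be sketched with an in-memory key/value table. This is a deliberate simplification: real HBase stores byte-array keys in sorted order across region servers, and the `Get`/`Put`/`Delete` operations referenced in the comments are its actual API names, while the table layout here is invented for illustration.

```python
# A toy "table" keyed by row key, with one column family ("cf").
table = {f"row{i:05d}": {"cf:value": i * i} for i in range(10_000)}

# HBase-style Get: one random lookup by row key, touching no other rows.
assert table["row00123"]["cf:value"] == 123 * 123

# The sequential-scan alternative: walk every row to find the same answer.
found = next(v["cf:value"] for k, v in table.items() if k == "row00123")
assert found == 123 * 123

# HBase-style Put/Delete in the middle of the keyspace, not an append.
table["row00123"] = {"cf:value": -1}
del table["row00042"]
assert "row00042" not in table
```

Plain HDFS files support neither of these: they are written once and read by scanning, which is exactly the gap HBase fills.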
Now, coming to Pig and Hive:
Pig does not need an underlying structure (schema) for the data, while Hive imposes structure via a metastore; this makes Pig more suitable for ETL tasks. On the other hand, Hive's metastore provides a data dictionary, which makes the data much easier to explore.
Hive requires far fewer lines of code than Pig because of its resemblance to SQL; it is essentially a subset of SQL with small variations to enable MapReduce-style computation.
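To see why the SQL resemblance keeps Hive code short, consider a word count expressed as one declarative query. SQLite is used here purely as a stand-in for HiveQL (real Hive syntax differs in details, e.g. table properties and tokenizing with LATERAL VIEW), and the table name `words` is an assumption for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany("INSERT INTO words VALUES (?)",
                 [("hadoop",), ("data",), ("hadoop",)])

# One GROUP BY statement does what a hand-written MapReduce job
# (map -> shuffle -> reduce) needs dozens of lines of Java for.
rows = conn.execute(
    "SELECT word, COUNT(*) FROM words GROUP BY word ORDER BY word"
).fetchall()
assert rows == [("data", 1), ("hadoop", 2)]
```

Hive compiles a query like this into the same map/shuffle/reduce stages, so the brevity comes from the language, not from skipping work.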
Pig is faster at importing data but slower in actual execution than Hive.