
Brief introduction to Hive: Apache Hive is data warehouse software that facilitates querying and managing large datasets residing in distributed storage. Hive provides a SQL-like language called HiveQL for querying the data. Hive is considered friendlier and more familiar to users who are used to using SQL for...

Have you ever wondered how to process huge volumes of data residing on multiple systems? One simple solution is Hadoop’s MapReduce. MapReduce is a software framework for easily writing applications that process vast amounts of data residing on multiple systems. Although it is a...

This blog is for those who want to build their own custom types in Hadoop, which is possible only with Writable and WritableComparable. After reading this blog you will get a clear understanding of: What are Writables? Importance of Writables in Hadoop. Why are Writables introduced in Hadoop? What...
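As a rough illustration of the custom-type idea, here is a minimal sketch of the serialization contract that Writable asks a type to fulfil. The class `PageVisit`, its fields, and the `roundTrip` helper are hypothetical names made up for this example. To keep it runnable without Hadoop on the classpath, the class does not actually implement `org.apache.hadoop.io.WritableComparable`; it only mirrors that interface's `write`, `readFields`, and `compareTo` methods using the standard `java.io` types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical custom type. In a real Hadoop job it would be declared as
// "implements org.apache.hadoop.io.WritableComparable<PageVisit>";
// here it only mirrors that contract so the sketch runs with the JDK alone.
public class PageVisit implements Comparable<PageVisit> {
    private String url = "";
    private long visits;

    // A no-argument constructor lets the framework create an empty
    // instance and then fill it in through readFields().
    public PageVisit() {}

    public PageVisit(String url, long visits) {
        this.url = url;
        this.visits = visits;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeLong(visits);
    }

    // Deserialize the fields in exactly the order write() emitted them.
    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();
        visits = in.readLong();
    }

    // Ordering used when the type acts as a key: by URL, then by visit count.
    @Override
    public int compareTo(PageVisit other) {
        int byUrl = url.compareTo(other.url);
        return byUrl != 0 ? byUrl : Long.compare(visits, other.visits);
    }

    public String getUrl() { return url; }

    public long getVisits() { return visits; }

    // Helper for this example only: write to a byte buffer and read back.
    public static PageVisit roundTrip(PageVisit original) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));
        PageVisit copy = new PageVisit();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }
}
```

Calling `PageVisit.roundTrip(new PageVisit("/home", 42L))` should give back an object with the same URL and visit count, which is the property a custom type has to preserve when it is moved between nodes. A type meant to be used as a key would normally also override `equals` and `hashCode`, which this sketch leaves out for brevity.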