
This blog is about analyzing the data of youtube.This total analysis is performed in Hadoop MapReduce. This youtube data is publicly available and the youtube data set is described below under the heading Data Set Description. Using that dataset we will perform some Analysis and will draw out some...

Pig was initially developed by Yahoo! to allow individuals using Apache Hadoop to focus more on analyzing large datasets and spend less time in writing mapper and reducer programs.Pig enables users to write complex data analysis code without prior knowledge in Java. Pig’s simple SQL-like scripting language is called...

Why was Hive introduced? Few years ago, Hadoop came into existence for solving queries on huge structured, Semi-structured, and unstructured datasets. Hadoop is considered to be the bestsolution to store and process those huge datasets, because of its advantages like scalability, less infrastructure costs(commodity hardware), data security(replication factor) and...