Abstract: The term ‘Big Data’ refers to data sets whose size (volume), complexity (variability), and rate of growth (velocity) make them difficult to capture, manage, process, or analyze. Big data is a collection of large data sets that include different types of data: structured, unstructured, and semi-structured. Hadoop is an open source software project that enables the distributed processing of large data sets […]