Big Data

Notes in This Topic

How to Process Data Using Spark 2
Mohd Naeem

Data note · Spark 2 extends RDDs (Resilient Distributed Datasets) with the "DataFrame". A DataFrame contains Row objects, which gives you the power to use…

How to Process Data Using Spark
Mohd Naeem

Data note · Apache Spark is a lightning-fast distributed processing engine for Hadoop. It executes in memory, which is why it is the fastest of all processing…

How to Directly Use MapReduce to Process Data
Mohd Naeem

Data note · As we know, the core of Hadoop's distributed processing system is MapReduce. On top of it, we can use technologies like…

How to Process Data with Pig with MapReduce and TEZ
Mohd Naeem

Data note · As we know, HDFS is the distributed storage system of Hadoop. Similarly, MapReduce is the core processing engine. Recently, Tez is…

How to Process Data Using Hadoop Hive
Mohd Naeem

Data note · As we saw in our last post, we successfully uploaded data to HDFS (which stands for Hadoop Distributed File System and is the…

How to Ingest Data into Hadoop File System (HDFS)
Mohd Naeem

Data note · In the Hadoop architecture, HDFS is the distributed file system, while MapReduce and Tez are the distributed processing engines. To process a huge amount…
