Big Data Notes

MongoDB – A Dive Deep
Mohd Naeem

Data note · The previous session on MongoDB covered big data integration with MongoDB and how to use Spark and Python to access data…

Big Data Integration with MongoDB Using Spark
Mohd Naeem

Data note · Why MongoDB? Let's evaluate MongoDB against the CAP theorem to answer 'Why MongoDB?'. Partition tolerance is a must in big data scenarios as…

Big Data Integration with Cassandra Using Spark
Mohd Naeem

Data note · Why Cassandra: Before we discuss Cassandra, we also have to discuss the CAP theorem. As per CAP (Consistency, Availability…

How to Interact with HDFS Using HBase and Pig
Mohd Naeem

Data note · Interacting with HDFS using HBase and Python was very powerful, but it was also very engaging, as we had to do a…

How to Interact with HDFS Using HBase and Python
Mohd Naeem

Data note · What is HBase: HBase is a NoSQL/non-relational answer to your big data queries, where relational databases can't be as scalable as non-relational…

Exchanging Data between MySQL and Hadoop Using Sqoop Import and Export
Mohd Naeem

Data note · The Hadoop Distributed File System can take in data not only from flat files but also from structured as well as unstructured sources.…

How to Process Data Using Spark 2
Mohd Naeem

Data note · Spark 2 extends RDDs (Resilient Distributed Datasets) with the "DataFrame". A DataFrame contains Row objects, which gives you the power to use…

How to Process Data Using Spark
Mohd Naeem

Data note · Apache Spark is a lightning-fast distributed processing engine for Hadoop. It executes in memory, which is why it is the fastest of all processing…
