Big Data – Page 3 – Mohd Naeem

March 10, 2020 Mohd Naeem

Apache Phoenix – Another Query Engine with a SQL Interface Fine Tuned for Performance with HBase

Data note · Apache Phoenix is another query engine, similar to Apache Drill, but unlike Drill, which can connect to any database, it can only…

February 7, 2020 Mohd Naeem

Apache Drill – A Data Engine Which Can Play with Data from Any Data Source

Data note · Apache Drill can sit on top of almost any data source, be it relational, non-relational, S3, JSON, etc. It presents an…

January 20, 2020 Mohd Naeem

MongoDB – A Dive Deep

Data note · The previous session on MongoDB covered big data integration with MongoDB and how to use Spark and Python to access data…

December 18, 2019 Mohd Naeem

Big Data Integration with MongoDB Using Spark

Data note · Why MongoDB? Let's evaluate MongoDB against the CAP theorem to answer 'Why MongoDB'. Partition tolerance is a must in big data scenarios, as…

November 14, 2019 Mohd Naeem

Big Data Integration with Cassandra Using Spark

Data note · Why Cassandra: Before we discuss Cassandra, we also have to discuss the CAP theorem. As per CAP (Consistency, Availability…

October 12, 2019 Mohd Naeem

How to Interact with HDFS Using HBase and Pig

Data note · Interacting with HDFS using HBase and Python was very powerful, but it was also very involved, as we had to do a…

September 10, 2019 Mohd Naeem

How to Interact with HDFS Using HBase and Python

Data note · What is HBase: HBase is a NoSQL (non-relational) answer to your big data queries, where relational databases can't be as scalable as non-relational…

August 9, 2019 Mohd Naeem

Exchanging Data between MySQL and Hadoop Using Sqoop Import and Export

Data note · The Hadoop Distributed File System (HDFS) can take in data not only from flat files but also from structured as well as unstructured sources.…

June 4, 2019 Mohd Naeem

How to Process Data Using Spark 2

Data note · Spark 2 builds on RDDs (Resilient Distributed Datasets) with the "DataFrame". A DataFrame contains Row objects, which gives you the power to use…

May 9, 2019 Mohd Naeem

How to Process Data Using Spark

Data note · Apache Spark is a lightning-fast distributed processing engine for Hadoop. It executes in memory, which is why it is the fastest of all processing…

