Big Data Integration with MongoDB Using Spark
Data note · Why MongoDB?: Let's evaluate MongoDB against the CAP theorem to answer 'Why MongoDB'. Partition tolerance is a MUST in big data scenarios as…
Data note · Why Cassandra: Before we discuss Cassandra, we also have to discuss the CAP theorem. As per CAP (Consistency, Availability…
Data note · Interacting with HDFS using HBase and Python was very powerful, but it was also very engaging as we had to do a…
Data note · What is HBase: HBase is a NoSQL (non-relational) answer to your big data queries, where relational databases can't be as scalable as non-relational…
Data note · The Hadoop distributed file system can retrieve data not only from flat files but also from structured as well as unstructured sources.…
Editorial reprint · As Spark 2 supports Datasets, which are an extension of RDDs, we can use these Datasets to model into a Machine Learning…
Data note · Spark 2 extends RDDs (Resilient Distributed Datasets) with the "DataFrame". A DataFrame contains Row objects, which gives you the power to use…
Data note · Apache Spark is a lightning-fast distributed processing service for Hadoop. It executes in memory, which is why it is the fastest of all processing…
Data note · As we know, the core of Hadoop's distributed processing system is MapReduce. On top of it, we can use technologies like…
Data note · As we know, HDFS is the distributed storage system of Hadoop. Similarly, MapReduce is the core processing engine. Recently, Tez is…