Apache Kafka – A Tool for Streaming Data into the Cluster
Data note · What is Streaming? So what if you have to capture live data or logs from web servers, you have data…
Data note · What is Apache Flume: As we know, Apache Kafka is a generic streaming tool which can handle not only Hadoop specific…
Data note · What is Apache Zeppelin - Notebook interface to the core as well as custom Big Data technologies. An analysis and visualization tool…
Data note · There are quite a few important under-the-hood players in a Hadoop system - those who manage the cluster - they…
Data note · What is Presto: Has a SQL interface to query. Connects to multiple databases, including Cassandra (which Drill can't). A big plus - OLTP…
Ops note · Python – Part 1 of 5 What is Python and Why: An open source programming language, interpreted at run-time (unlike compiled Java, C#, like…
Ops note · Linux Commands - Part 1 of 4 What is Linux - An open-source operating system. The software consists of a core, called as…
Ops note · As we know, "Hortonworks Sandbox" is a customized Hadoop VM, which you can install using any of the virtualization tools like…
Data note · Apache Phoenix is another query engine similar to Apache Drill, but unlike Drill, which can connect to any database, it can only…
Data note · Apache Drill can sit on top of any data source - be it relational, non-relational, S3, JSON etc. It presents an…