S P L A S H

Big Data analytics training in Pallikaranai, Chennai


The SPLASH Big Data Hadoop and Spark Training Course is curated by Big Data industry experts. It provides in-depth knowledge of Big Data and Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Hive, Pig, HBase, Apache Spark, PySpark, Oozie, Flume and Sqoop.


The course is designed to deliver real-time solutions to big data problems. SPLASH covers not only how to install and use the tools, but also how each ecosystem tool works in detail.


The big data job market is diverse; the most popular job roles in big data are Hadoop Administrator and Big Data Developer. The customized SPLASH Apache Spark Developer training covers Spark in detail, with Spark with Scala training in Pallikaranai, Chennai and Spark with ML training in Pallikaranai, Chennai.


Big Data Hadoop Training syllabus

 

Introduction to Big Data

  • Rise of Big Data
  • Compare Hadoop vs traditional systems
  • Hadoop Master-Slave Architecture
  • Understanding HDFS Architecture
  • NameNode, DataNode, Secondary NameNode
  • Learn about Resource Manager, Node Manager
 

HDFS and MapReduce Architecture

  • Core components of Hadoop
  • Understanding Hadoop Master-Slave Architecture
  • Learn about NameNode, DataNode, Secondary NameNode
  • Understanding HDFS Architecture
  • Anatomy of Read and Write data on HDFS
  • MapReduce Architecture Flow
  • JobTracker and TaskTracker
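
As a small illustration of the HDFS write topic above, the sketch below works out how a file is split into blocks and how much raw storage it uses once replicated. It assumes the common Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name `hdfs_block_layout` is hypothetical and not part of any Hadoop API.

```python
import math


def hdfs_block_layout(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, size of the last block in MB, raw storage in MB).

    Assumed defaults: 128 MB blocks and 3 replicas. Both are configurable
    per cluster, so treat these numbers as an example only.
    """
    if file_size_mb <= 0:
        return 0, 0, 0
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    # The last block only occupies the bytes that remain; it is not padded.
    last_block_mb = file_size_mb - (num_blocks - 1) * block_size_mb
    # Every block is stored `replication` times across DataNodes.
    raw_storage_mb = file_size_mb * replication
    return num_blocks, last_block_mb, raw_storage_mb


blocks, last_mb, raw_mb = hdfs_block_layout(1000)
print(blocks, last_mb, raw_mb)  # prints: 8 104 3000
```

Under these assumptions, a 1000 MB file becomes 8 blocks (seven full blocks plus one 104 MB block) and consumes about 3000 MB of raw cluster storage.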

Hadoop Configuration

  • Hadoop Modes
  • Hadoop Terminal Commands
  • Cluster Configuration
  • Web Ports
  • Hadoop Configuration Files
  • Reporting, Recovery
  • MapReduce in Action

MapReduce Framework

  • Overview of the MapReduce Framework
  • Use cases of MapReduce
  • MapReduce Architecture
  • Anatomy of MapReduce Program
  • Mapper/Reducer Class, Driver code
  • Understand Combiner and Partitioner
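
The roles of the Mapper, Combiner, Partitioner, Reducer and driver listed above can be sketched with a plain Python word count. This is a single-process simulation for illustration only, not Hadoop's Java API; all function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`) are hypothetical, and the byte-sum partitioning rule is an assumption chosen to keep the result deterministic.

```python
from collections import defaultdict


def mapper(line):
    """Emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate counts on the map side to reduce shuffle traffic."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    """Pick the reducer for a key (assumed rule: byte sum modulo reducers)."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all partial counts received for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    """Driver: run map -> combine -> partition/shuffle -> reduce."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split stands in for one map task
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            partitions[partitioner(key, num_reducers)][key].append(value)
    result = {}
    for part in partitions:  # each partition stands in for one reduce task
        for key in sorted(part):  # keys reach a reducer in sorted order
            word, total = reducer(key, part[key])
            result[word] = total
    return result


splits = [["big data big"], ["data lake", "big"]]
counts = run_job(splits)
print(counts["big"], counts["data"], counts["lake"])  # prints: 3 2 1
```

In this sketch the combiner turns the two `("big", 1)` pairs from the first split into one `("big", 2)` pair before the shuffle, which is the saving a combiner is meant to provide.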

Apache Pig

  • Pig vs MapReduce
  • Pig Architecture & Data types
  • Pig Latin Relational Operators
  • Pig Latin Join and CoGroup
  • Pig Latin Group and Union
  • Describe, Explain, Illustrate
  • Pig Latin: File Loaders & UDF

Apache Hive and HiveQL

  • What is Hive
  • Hive DDL - Create/Show Database
  • Hive DDL - Create/Show/Drop Tables
  • Hive DML - Load Files & Insert Data
  • Hive SQL - Select, Filter, Join, Group By
  • Hive Architecture & Components
  • Difference between Hive and RDBMS

Advanced HiveQL

  • Multi-Table Inserts
  • Joins
  • Grouping Sets, Cubes, Rollups
  • Custom Map and Reduce scripts
  • Hive SerDe
  • Hive UDF
  • Hive UDAF

Apache Flume, Sqoop, Oozie

  • Sqoop - How Sqoop works
  • Sqoop Architecture
  • Flume - How it works
  • Flume Complex Flow - Multiplexing
  • Oozie - Simple/Complex Flow
  • Oozie Service/Scheduler
  • Use Cases - Time and Data triggers

Hadoop 2.0, YARN, MRv2

  • Hadoop 1.0 Limitations
  • MapReduce Limitations
  • HDFS 2: Architecture
  • HDFS 2: High availability
  • HDFS 2: Federation
  • YARN Architecture
  • Classic vs YARN
  • YARN multitenancy
  • YARN Capacity Scheduler

Big Data involves several stages, such as data streaming, distributed data storage, processing, creating a data lake, reporting and visualization. Each task is pipelined to the others; creating the pipeline tasks and integrating them in a real-time environment is the real challenge. Hadoop mainly deals with batch processing. To handle real-time data in distributed data storage and provide the database access that applications require, NoSQL is the ultimate choice.

We at SPLASH are certified NoSQL experts who offer MongoDB NoSQL training and Cassandra NoSQL training in Pallikaranai, Chennai.

SPLASH trainers are experts in the field of data. We have 12+ years of experience in RDBMS and have implemented more than 5 projects in the fields of NoSQL, Big Data, statistical analysis, machine learning and deep learning. Our training courses and syllabus are customized for the job market.