Hadoop Classroom Training

The Big Data and Hadoop Certification course is designed to prepare you for a job in the Big Data world. It covers essential Hadoop 2.7 skills and gives you practical experience through long-term, real-world projects. You’ll use Hadoop 2.7 with CloudLab, a cloud-based Hadoop lab environment, to complete your hands-on project work.

Schedules for Hadoop Classroom Training

Schedule | Timings | Demo Date | Start Date
1 | 7:30 AM to 9 AM | Oct 14th | Oct 15th
2 | 9:30 AM to 11 AM | Sep 15th | Sep 15th
3 | 11 AM to 12:30 PM | Recently Started
4 | 8 PM to 9:30 PM | Recently Started
5 | 10:30 AM to 1 PM (W) | Recently Started

Trainer : Mr Vamsi (19+ Yrs Exp)


Course Fee: 18,000/-

Duration: 4 Weeks (Mon - Sat)


HIGHLIGHTS

  • Daily Tasks
  • Weekly Interviews
  • Real-time Project
  • Resume Guidance
  • Certification Guidance
  • Placement Services
 

Hadoop Training Course Contents:


CHAPTER 1 : INTRODUCTION

  • What is Cloud Computing
  • What is Grid Computing
  • What is Virtualization
  • How the above three are inter-related
  • What is Big Data
  • Introduction to Analytics and the need for big data analytics
  • Hadoop Solutions - Big Picture
  • Hadoop distributions
  • Comparing Hadoop Vs. Traditional systems
  • Volunteer Computing
  • Data Retrieval - Random Access Vs. Sequential Access
  • NoSQL Databases

CHAPTER 2 : THE MOTIVATION FOR HADOOP

  • Problems with traditional large-scale systems
  • Data Storage literature survey
  • Data Processing literature survey
  • Network Constraints
  • Requirements for a new approach

CHAPTER 3 : HADOOP BASIC CONCEPTS

  • What is Hadoop?
  • The Hadoop Distributed File System
  • How MapReduce Works
  • Anatomy of a Hadoop Cluster

CHAPTER 4 : HADOOP DAEMONS

  • Master Daemons
      • Name node
      • Job tracker
      • Secondary name node
  • Slave Daemons
      • Data node
      • Task tracker

CHAPTER 5 : HDFS (HADOOP DISTRIBUTED FILE SYSTEM)

  • Blocks and Splits
  • Input Splits
  • HDFS Splits
  • Data Replication
  • Hadoop Rack Awareness
  • Data high availability
  • Data Integrity
  • Cluster architecture and block placement
  • Accessing HDFS
  • Java Approach
  • CLI Approach

CHAPTER 6 : PROGRAMMING PRACTICES & PERFORMANCE TUNING

  • Developing MapReduce programs in:
      • Local mode: running without HDFS and MapReduce
      • Pseudo-distributed mode: running all daemons on a single node
      • Fully distributed mode: running daemons on dedicated nodes

CHAPTER 7 : HADOOP ADMINISTRATIVE TASKS - Setup Hadoop cluster of Apache, Cloudera and Hortonworks

  • Install and configure Apache Hadoop
  • Make a fully distributed Hadoop cluster on a single laptop/desktop (Pseudo Mode)
  • Install and configure Cloudera Hadoop distribution in fully distributed mode
  • Install and configure Hortonworks Hadoop distribution in fully distributed mode
  • Monitoring the cluster
  • Getting used to the management consoles of Cloudera and Hortonworks
  • Name Node in Safe mode
  • Meta Data Backup
  • Integrating Kerberos security in Hadoop
  • Ganglia and Nagios Cluster monitoring
  • Benchmarking the Cluster
  • Commissioning/Decommissioning Nodes.

CHAPTER 8 : HADOOP DEVELOPER TASKS - Writing a MapReduce Program

  • Examining a Sample Map Reduce Program
  • With Several Examples
  • Basic API Concepts
  • The Driver Code
  • The Mapper
  • The Reducer
  • Hadoop's Streaming API

CHAPTER 9 : Performing several Hadoop Jobs

  • The configure and close Methods
  • Sequence Files
  • Record Reader
  • Record Writer
  • Role of Reporter
  • Output Collector
  • Processing video files and audio files
  • Processing image files
  • Processing XML files
  • Processing Zip files
  • Counters
  • Directly Accessing HDFS
  • Tool Runner
  • Using The Distributed Cache.

CHAPTER 10 : Common Map Reduce Algorithms

  • Sorting and Searching
  • Indexing
  • Classification/Machine Learning
  • Term Frequency - Inverse Document Frequency
  • Word Co-Occurrence
  • Hands-On Exercise: Creating an Inverted Index
  • Identity Mapper
  • Identity Reducer
  • Exploring well-known problems using MapReduce applications

CHAPTER 11 : Debugging Map Reduce Programs

  • Testing with MRUnit
  • Logging
  • Other Debugging Strategies.

CHAPTER 12 : Advanced Map Reduce Programming

  • A Recap of the Map Reduce Flow
  • Custom Writables and Writable Comparables
  • The Secondary Sort
  • Creating Input Formats and Output Formats
  • Pipelining Jobs With Oozie
  • Map-Side Joins
  • Reduce-Side Joins.

CHAPTER 13 : Monitoring and debugging on a Production Cluster

  • Counters
  • Skipping Bad Records
  • Rerunning failed tasks with IsolationRunner

CHAPTER 14 : Tuning for Performance

  • Reducing network traffic with combiner
  • Reducing the amount of input data
  • Using Compression
  • Running with speculative execution
  • Refactoring code and rewriting algorithms
  • Parameters affecting Performance
  • Other Performance Aspects
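
To give a feel for why a combiner reduces network traffic, here is a small plain-Python word count sketch (not Hadoop code). The names map_words, combine and reduce_counts and the two input splits are assumptions made for this illustration. It compares how many key-value pairs would leave the map side with and without map-side pre-aggregation.

```python
from collections import Counter

def map_words(line):
    """Mapper (illustrative): emit (word, 1) for every word in a line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner (illustrative): pre-sum counts per word on the map side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def reduce_counts(pairs):
    """Reducer (illustrative): sum all counts per word across every split."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Hypothetical input: each string stands for the data seen by one map task
splits = ["a b a a", "b b a"]

# Pairs that would cross the network without and with a combiner
without_combiner = [p for s in splits for p in map_words(s)]
with_combiner = [p for s in splits for p in combine(map_words(s))]

pairs_without = len(without_combiner)
pairs_with = len(with_combiner)
result_without = reduce_counts(without_combiner)
result_with = reduce_counts(with_combiner)
```

Both paths produce the same final counts, but the combined path sends fewer pairs to the reducer. This works here because summation is associative and commutative; an operation without those properties cannot safely be used as a combiner.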

CHAPTER 15 : Hadoop Ecosystem- Hive

  • Hive concepts
  • Hive architecture
  • Install and configure Hive on cluster
  • Create a database and access it from the console
  • Buckets, Partitions
  • Joins in Hive
  • Inner joins
  • Outer joins
  • Hive UDF
  • Hive UDAF
  • Hive UDTF
  • Develop and run sample applications in Java to access Hive
  • Load Data into Hive and process it using Hive

CHAPTER 16 : PIG

  • Pig basics
  • Install and configure PIG on a cluster
  • PIG Vs MapReduce and SQL
  • PIG Vs Hive
  • Write sample Pig Latin scripts
  • Modes of running PIG
  • Running in Grunt shell
  • Programming in Eclipse
  • Running as Java program
  • PIG UDFs
  • PIG Macros
  • Load data into Pig and process it using Pig

CHAPTER 17 : SQOOP

  • Install and configure Sqoop on cluster
  • Connecting to RDBMS
  • Installing MySQL
  • Import data from Oracle/MySQL to Hive
  • Export data to Oracle/MySQL
  • Internal mechanism of import/export
  • Import millions of records into HDFS from RDBMS using Sqoop

CHAPTER 18 : HBASE

  • HBase concepts
  • HBase architecture
  • Region server architecture
  • File storage architecture
  • HBase basics
  • Column access
  • Scans
  • HBase Use Cases
  • Install and configure HBase on cluster
  • Create a database; develop and run sample applications
  • Access data stored in HBase using clients like Java
  • MapReduce client to access the HBase data
  • HBase and Hive Integration
  • HBase admin tasks
  • Defining Schema and basic operation

CHAPTER 19 : CASSANDRA

  • Cassandra core concepts
  • Install and configure Cassandra on cluster
  • Create a database and tables and access them from the console
  • Developing applications to access data in Cassandra through Java
  • Install and Configure OpsCenter to access Cassandra data using browser

CHAPTER 20 : OOZIE

  • Oozie architecture
  • XML file specifications
  • Install and configure Oozie on cluster
  • Specifying Work flow
  • Action nodes
  • Control nodes
  • Oozie job coordinator
  • Accessing Oozie jobs from the command line and the web console
  • Create sample workflows in Oozie and run them on a cluster

CHAPTER 21 : ZooKeeper, Flume, Chukwa, Avro, Scribe, Thrift, HCatalog

  • Flume and Chukwa Concepts
  • Use cases of Thrift, Avro and Scribe
  • Install and configure Flume on cluster
  • Create a sample application to capture logs from Apache using Flume

CHAPTER 22 : ANALYTICS BASICS

  • Analytics and big data analytics
  • Commonly used analytics algorithms
  • Analytics tools like R and Weka
  • R language basics
  • Mahout

CHAPTER 23 : CDH4 ENHANCEMENTS

  • Name Node High Availability
  • Name Node federation
  • Fencing
  • YARN
24x7 LIVE Online Server (Lab) with Real-time Databases. Course includes ONE Real-time Project.
All Classes are Instructor-Led & LIVE. Completely Practical and Real-time with Study Material, Session Notes, Tasks and 24x7 LIVE Server.
 

Hadoop Training - Highlights :

  • Completely Practical and Real-time
  • Suitable for Starters + Working Professionals
  • Session wise Handouts and Tasks + Solutions
  • TWO Real-time Case Studies, One Project
  • Weekly Mock Interviews, Certifications
  • Certification & Interview Guidance
 
 

Job-Oriented Real-time Training @ SQL School Training Institute - Trainer: Mr. Sai Phanindra T