Hadoop Admin Course

This document outlines the course content for a Hadoop Administration course. The course covers topics such as the fundamentals of HDFS and MapReduce, planning and installing Hadoop clusters, configuring Hive and Pig, cluster operations and maintenance, security, and advanced topics including real-time event processing with Spark, Kafka, and Storm. The course aims to provide students with the skills needed to administer Hadoop clusters in production environments.

Uploaded by

krishna

Name: Hadoop Administration Course

Version: 2016-2017
Updated Date: 7th Jan 2017
Owner: AADS Education www.aadseducation.com

Big Data HADOOP Administration


Course Outline
Introduction to Apache Hadoop

i. The Case for Apache Hadoop


 Why Hadoop is needed
 What problems Hadoop solves
 What comprises Hadoop and the Hadoop Ecosystem

ii. HDFS
 What features HDFS provides
 How HDFS reads and writes files
 How the NameNode uses memory
 How Hadoop provides file security
 How to use the NameNode Web UI
 How to use the Hadoop File Shell

iii. Getting Data Into HDFS


 How to import data into HDFS with Flume
 How to import data into HDFS with Sqoop
 What REST interfaces Hadoop provides
 Best practices for importing data

iv. MapReduce
 What MapReduce is
 What features MapReduce provides
 What the basic concepts of MapReduce are
 What the architecture of MapReduce is
 What features MapReduce version 2 provides
 How MapReduce handles failure
 How to use the JobTracker Web UI


Planning, Installing, and Configuring a Hadoop Cluster

i. Planning Your Hadoop Cluster


 What issues to consider when planning your Hadoop cluster
 What types of hardware are typically used for Hadoop nodes
 How to optimally configure your network topology
 How to select the right operating system and Hadoop distribution
 How to plan for cluster management

ii. Hadoop Installation and Initial Configuration


 The different installation configurations available in Hadoop
 How to install Hadoop
 How to specify Hadoop configuration
 How to configure HDFS
 How to configure MapReduce
 How to locate and configure Hadoop log files

iii. Installing and Configuring Hive, Impala, and Pig


 Hive features and basic configuration
 Impala features and basic configuration
 Pig features and installation

iv. Hadoop Clients


 What Hadoop clients are
 How to install and configure Hadoop clients
 How to install and configure Hue
 How Hue authenticates and authorizes user access

v. Advanced Cluster Configuration


 Advanced Configuration Parameters
 Configuring Hadoop Ports
 Explicitly Including and Excluding Hosts
 Configuring HDFS for Rack Awareness
 Configuring HDFS High Availability

We will also cover
 Real-time scenarios
 Troubleshooting and issue handling
 How to handle interview questions
 FAQs

vi. Hadoop Security


 Why security is important for Hadoop
 How Hadoop's security model evolved
 What Kerberos is and how it relates to Hadoop
 What to consider when securing Hadoop

Cluster Operations and Maintenance

i. Managing and Scheduling Jobs


 How to view and stop jobs running on a cluster
 The options available for scheduling Hadoop jobs
 How to configure the Fair Scheduler

ii. Cluster Maintenance


 How to check the status of HDFS
 How to copy data between clusters
 How to add and remove nodes
 How to rebalance the cluster
 How to upgrade your cluster

iii. Cluster Maintenance and Troubleshooting


 What general system conditions to monitor
 How to monitor a Hadoop cluster
 Some techniques for troubleshooting problems on a Hadoop cluster
 Some common misconfigurations, and their resolutions


Security and HDFS Federation

i. Kerberos Configuration
 What are the phases required for a client to access a service
 Kerberos Client Commands
 Configuring HDFS Security
 Configuring MapReduce Security
 Troubleshooting Hadoop Security

ii. Configuring HDFS Federation


 What is HDFS Federation
 Benefits of HDFS Federation
 How HDFS Federation works
 Federation Configuration

Advanced Topics (Real-Time Event Processing)

i. Apache Spark

 What is Spark
 How Spark works
 Spark Use Cases
 Installing and configuring Spark
 Real-time event processing with Spark

ii. Apache Kafka

 What is Kafka
 How Kafka works
 Installing and configuring Kafka
 Real-time event processing with Kafka

iii. Apache Storm

 What is Storm
 How Storm works
 Installing and configuring Storm
 Real-time event processing with Storm