This training provides you with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster. From installation and configuration through management, scaling, and advanced tuning, this training prepares you for the real-world challenges faced by Hadoop administrators.
You will benefit from the Cloudera Administrator Training for Apache Hadoop course if:
you are a system administrator,
you have basic Linux experience.
Prior knowledge of Apache Hadoop is not required.
Xebia University (based in Hilversum, Amsterdam area) is an official training partner of Cloudera, the leader in Apache Hadoop-based software and services.
What you will achieve
This training alternates between instructional sessions and hands-on labs.
After completing this 4-day training:
You will know:
the internals of YARN, MapReduce, and HDFS,
how to load data into the cluster from dynamically-generated files using Flume and from RDBMS using Sqoop,
best practices for preparing and maintaining Apache Hadoop in production.
You will have hands-on experience in:
proper cluster configuration and deployment to integrate with the data center,
configuring the FairScheduler to provide service-level agreements for multiple users of a cluster.
You will have the skills to:
determine the correct hardware and infrastructure for your cluster,
troubleshoot, diagnose, tune, and solve Hadoop issues.
Note: This training does not focus on the ins and outs of Cloudera Manager. You will see what Cloudera Manager looks like, but we will focus on the different Hadoop components and their configuration (exercises are done without using Cloudera Manager). If you require training that focuses on how Cloudera Manager works, please contact us.
Please note that you need to bring your own laptop for this training.
This laptop should meet the following requirements:
At least 2GB RAM (4GB or more preferred);
10GB of free hard disk space;
VMware Player 5.x or above (Windows) / VMware Fusion 4.x or above (Mac);
Internet access is mandatory. This course uses Amazon EC2-based virtual machines, port-forwarding SSH via ports 80 and 443. Access to the EC2 instances on those ports must be direct, with no HTTP proxy or other port filtering in place.
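If you want to check your network before the course, the following is a minimal sketch of a reachability test for ports 80 and 443. The function name check_ports and its parameters are illustrative and not part of the course materials, and you would need to supply the hostname of a real machine yourself. Note that a successful TCP connection shows only that the port is reachable; it does not prove that no HTTP proxy or transparent filter sits in between.

```python
import socket


def check_ports(host, ports=(80, 443), timeout=5.0):
    """Return {port: True/False} for whether a TCP connection to host:port succeeds.

    Hypothetical helper for illustration only; it tests plain TCP reachability
    and cannot detect a transparent proxy on the path.
    """
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            results[port] = False
    return results
```

Calling check_ports with the hostname of a server you know listens on both ports returns a dictionary such as one entry per port, each marked True if the connection succeeded. If either port reports False on the network you plan to use, ask your network administrator about outbound filtering before the training.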