
Big Data on AWS

5 Labs 5h 46m 50 Credits

Scientists, developers, and other technologists from many different industries are taking advantage of AWS to perform big data analytics and meet the challenges of the increasing volume, variety, and velocity of digital information. AWS offers a portfolio of cloud computing services to help you manage big data by reducing costs, scaling to meet demand, and increasing the speed of innovation. In this quest, you’ll learn to work with advanced services for Big Data.

Objectives

This quest is designed to teach you how to work with AWS services to perform big data analytics on the cloud.

Quest Outline

Hands-on Lab

Working with Amazon Redshift

This lab demonstrates how to use Amazon Redshift to create a cluster, load data, run queries, and monitor performance. Note: Students will download a free SQL client as part of this lab.

English 日本語 简体中文
Hands-on Lab

Exploring Google Ngrams with Amazon EMR

This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for Big Data processing and use Hive with SQL-style queries to analyze data. You will create a Hadoop cluster using Amazon EMR, which will allow you to run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and you will run queries to analyze it.

English 日本語 简体中文
Hands-on Lab

Analyze Big Data with Hadoop

In this lab, you will deploy a fully functional Hadoop cluster, ready to analyze log data in just a few minutes. You will start by launching an Amazon EMR cluster and then use a HiveQL script to process sample log data stored in an Amazon S3 bucket. HiveQL is a SQL-like scripting language for data warehousing and analysis. You can then use a similar setup to analyze your own log files.

Hands-on Lab

Advanced Amazon Redshift: Table Layout and Schema Design

In this lab, you will take a close look at different types of table layout and schema design. You will create tables using various methods for data compression and distribution, and analyze which methods work best, including incorporating Amazon Redshift recommendations. You will conclude the lab by building five different versions of the same table and analyzing how the differences impact storage requirements and query performance. Prerequisites: To successfully complete this lab, you should be familiar with Redshift concepts. Knowledge of SQL programming is required, although full solution code is provided.
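
As a rough illustration of what "versions of the same table" can look like, the sketch below generates Redshift CREATE TABLE statements that differ only in column compression (ENCODE) and distribution (DISTSTYLE, DISTKEY, SORTKEY). The table name, columns, and the three variants are hypothetical and are not taken from the lab itself; the script only builds SQL text and does not connect to a cluster.

```python
# Hypothetical column list, chosen only for illustration (not from the lab).
COLUMNS = [
    ("order_id", "BIGINT"),
    ("customer_id", "BIGINT"),
    ("order_date", "DATE"),
    ("status", "VARCHAR(16)"),
]


def create_table_sql(name, encodings=None, diststyle="EVEN", distkey=None, sortkey=None):
    """Build a Redshift CREATE TABLE statement for one layout variant."""
    encodings = encodings or {}
    col_lines = []
    for col, col_type in COLUMNS:
        line = f"  {col} {col_type}"
        if col in encodings:
            # Per-column compression encoding, e.g. RAW, AZ64, ZSTD.
            line += f" ENCODE {encodings[col]}"
        col_lines.append(line)
    sql = f"CREATE TABLE {name} (\n" + ",\n".join(col_lines) + "\n)"
    sql += f"\nDISTSTYLE {diststyle}"
    if distkey:
        # DISTKEY is only meaningful together with DISTSTYLE KEY.
        sql += f"\nDISTKEY ({distkey})"
    if sortkey:
        sql += f"\nSORTKEY ({', '.join(sortkey)})"
    return sql + ";"


# Three hypothetical variants of the same table.
variants = {
    # No explicit encodings, rows spread round-robin across slices.
    "orders_even": create_table_sql("orders_even"),
    # Full copy of the table on every node.
    "orders_all": create_table_sql("orders_all", diststyle="ALL"),
    # Compressed columns, rows co-located by customer, sorted by date.
    "orders_key_zstd": create_table_sql(
        "orders_key_zstd",
        encodings={"order_id": "AZ64", "status": "ZSTD"},
        diststyle="KEY",
        distkey="customer_id",
        sortkey=["order_date"],
    ),
}

for ddl in variants.values():
    print(ddl, end="\n\n")
```

After loading the same data into each variant, one way to compare them is to query the `SVV_TABLE_INFO` system view for table size and to run `ANALYZE COMPRESSION` on a loaded table to see the encodings Redshift suggests.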

Hands-on Lab

Advanced Amazon Redshift: Data Loading

In this lab, you will experiment with and compare different types of data loading using Amazon Redshift. You will create tables, load data from Amazon S3 and from remote hosts, and practice troubleshooting data loading errors. For the lab to function as written, please DO NOT change the auto-assigned region.
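
For orientation, the sketch below assembles the kind of Redshift COPY statements these loading styles typically involve: one reading delimited files from Amazon S3, and one reading from remote hosts through an SSH manifest stored in S3. It also includes a query against the `STL_LOAD_ERRORS` system table, a common starting point when a load fails. The table name, bucket paths, and IAM role ARN are placeholders invented for this example, not values from the lab, and the script only builds SQL text.

```python
def copy_from_s3(table, s3_path, iam_role, delimiter="|", gzip=False):
    """Build a COPY statement that loads delimited files from Amazon S3."""
    parts = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role}'",
        f"DELIMITER '{delimiter}'",
    ]
    if gzip:
        # Tells COPY that the input files are gzip-compressed.
        parts.append("GZIP")
    return "\n".join(parts) + ";"


def copy_from_remote_hosts(table, manifest_s3_path, iam_role, delimiter="|"):
    """Build a COPY statement that loads from remote hosts over SSH.

    The S3 object is an SSH manifest describing the hosts to connect to
    and the command each host runs to emit the data.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{manifest_s3_path}'",
        f"IAM_ROLE '{iam_role}'",
        f"DELIMITER '{delimiter}'",
        "SSH",
    ]
    return "\n".join(parts) + ";"


# Most recent load errors, with the file, line, column, and reason.
LOAD_ERRORS_SQL = (
    "SELECT starttime, filename, line_number, colname, err_reason\n"
    "FROM stl_load_errors\n"
    "ORDER BY starttime DESC\n"
    "LIMIT 10;"
)

# Placeholder values, hypothetical and for illustration only.
ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"

s3_copy = copy_from_s3("orders", "s3://example-bucket/orders/", ROLE, gzip=True)
ssh_copy = copy_from_remote_hosts("orders", "s3://example-bucket/ssh_manifest", ROLE)

print(s3_copy, ssh_copy, LOAD_ERRORS_SQL, sep="\n\n")
```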

English 日本語
