Data Warehousing with Hadoop: HDInsight and Retail Sales Implementation Using Hive
- illustrate various data modeling approaches that are adopted for provisioning data warehousing solutions
- identify the roles of fact and dimension tables in the dimensional design process
- recall the essential dimensional design processes and the objectives of the various steps involved
- depict the essential business use cases of data warehousing in the retail sales domain
- create dimension tables in Hive
- create fact tables for retail use cases in Hive
- load data in the dimension and fact tables
- identify the essential queries needed to fetch data from retail schemas
- construct and execute queries to get the desired outputs from Hive tables
- create dashboards in Power BI from Hive data
- write Hive queries to extract data from dimension and fact tables using joins
This course covers the implementation of data warehousing in retail sales. Learners design and implement data warehousing solutions using Hive and Power BI on HDInsight.
Data Warehousing with Hadoop: Managing Big Data Using HDInsight Hadoop
- recognize the critical features provided by HDInsight to manage big data
- list the essential types of clusters that can be implemented with HDInsight
- recall the various open source components of HDInsight and their roles in managing cluster, data, and jobs
- demonstrate how to set up Hadoop clusters on Azure HDInsight
- create Hadoop HDInsight clusters using the Azure Resource Manager template
- specify the essential capabilities of HDInsight and the types of storage that can be provisioned to store data
- illustrate the critical capabilities afforded by the Azure Management Console
- create, manage, and monitor HDInsight clusters using the Azure Management Console
- set up the HDInsight Emulator and use PowerShell to execute essential commands
- identify the various approaches of programming in HDInsight
- develop and execute MapReduce programs using cmdlets and Hadoop streaming
- set up Hadoop clusters on HDInsight and execute MapReduce applications
Explore the fundamentals of Azure HDInsight and the essential architectural components.
Data Warehousing with Hadoop: Microsoft Analytics Platform System and Hive
- illustrate capabilities, features, and objectives of the Microsoft Analytics Platform System
- specify how to manage data using PolyBase and the various essential benefits provided by PolyBase
- identify the role of parallel data warehousing architecture in Microsoft Analytics Platform System
- recall the various data exploration architectures that can be implemented using HDInsight and the Microsoft Analytics Platform System
- describe the role of Hive as a data warehouse system for Hadoop
- describe the architectural composition of Hive in HDInsight
- set up the development environment for Hive using the Azure HDInsight tool for VSCode
- connect and submit queries to HDInsight clusters using VSCode
- specify the various clauses that can be used in Hive Query Language to manage objects and query data
- work with Azure PowerShell and Beeline to execute Hive Query Language queries
- create a database and tables, and load data into Hive tables from Azure Blob Storage and SQL Server
- work with partition tables and manage Hive data formats
- demonstrate how to install Hue and manage Hive queries from the Hue interface
- demonstrate the approaches involved in retrieving Hive data and creating visualizations in Power BI
- work with Hive as an ETL tool
- compare HBase and Hive from the data modeling perspective
- create a Hive table and load data from an external SQL Server
Explore the Microsoft Analytics Platform System and use Hive to manage data from a data warehouse perspective.
Data Warehousing with Hadoop: Spark, HDInsight and Cluster Management
- specify the essential capabilities of Spark and its essential architectural components
- list the data structures along with the RDD and lineage concepts that are used in Spark
- set up Spark clusters using PowerShell and an Azure Resource Manager template
- describe the relationship between Spark SQL and Hive
- specify the essential concepts of Spark SQL and DataFrame
- demonstrate the approach of customizing HDInsight clusters using bootstrap
- install Hadoop applications on Azure HDInsight
- illustrate the use of Ambari as a tool to manage clusters
- manage Hadoop clusters in HDInsight using Azure CLI
- specify the approach of troubleshooting and tuning HDInsight clusters
- monitor Hadoop clusters in HDInsight to collect metrics for analysis
- set up Spark clusters and manage them using the Ambari GUI
Discover how to work with Spark and its in-memory data management capabilities, and how to manage and troubleshoot HDInsight clusters using Ambari and the Azure CLI.