This blog covers the Step-By-Step Activity Guides of the Microsoft Azure Data Engineer Associate [DP-200 & DP-201] Hands-On Labs Training program that you must perform to learn this course.
An Azure Data Engineer designs and implements the management, security, monitoring, and privacy of data using the full stack of Azure data services to satisfy business needs.
The walkthrough of the Step-By-Step Activity Guides of the Microsoft Azure Data Engineer Associate [DP-200 & DP-201] Training program will prepare you thoroughly for the DP-200 & DP-201 certifications.
DP-200 | Implementing an Azure Data Solution
- Azure for the Data Engineer.
- Working with Data Storage.
- Enabling Team-Based Data Science with Azure Databricks.
- Building a Globally Distributed Database with Cosmos DB.
- Working with Relational Data Stores in the Cloud.
- Performing Real-Time Analytics with Stream Analytics.
- Orchestrating Data Movement with Azure Data Factory.
- Securing Azure Data Platforms.
- Monitoring and Troubleshooting Data Storage and Processing.
DP-201 | Designing an Azure Data Solution
- Case Study
- Architect an Enterprise-grade Conversational Bot in Azure
- Azure Real-Time Reference Architectures
- Data Platform Security Design Considerations
- Designing for Resiliency and Scale
- Design for Efficiency and Operations
Here’s a quick guide on how to start learning Data Engineering on Azure and clear the Azure Data Engineer Associate certification by doing hands-on labs.
To know more about the Azure Data Engineer Associate certification, click here.
Skills Measured in Exam DP-200
- Implement Data Storage Solutions (40-45%)
- Manage and Develop Data Processing (25-30%)
- Monitor and Optimize Data Solutions (30-35%)
Lab 1: Azure for the Data Engineer
Exercise 1: Identify the evolving world of data
- In this exercise, we’ll identify the data requirements from the case study and determine whether the data for each requirement is structured, semi-structured, or unstructured.
- Non-relational: document data, graph data, column-family data, etc.
- Relational: data stored in tables, e.g. customer and employee tables.
Exercise 2: Determine the Azure Data Platform services
- In this exercise, we’ll determine the data platform technology that delivers the identified data requirements.
Exercise 3: Identify the tasks to be performed by the Data Engineer
- In this exercise, we’ll select one of the requirements and determine the high-level tasks that a data engineer will perform to meet it, such as:
- Provisioning data storage services.
- Ingesting streaming and batch data.
- Transforming data.
Exercise 4: Finalize the data engineering deliverables
- In this exercise, we’ll finalize the data engineering deliverables for AdventureWorks.
Lab 2: Working with Data Storage
Exercise 1: Choose a data storage approach in Azure
- In this exercise, we’ll identify the data storage requirements for the static images for the website, and for the predictive analytics solution from the case study.
- Each data set has different requirements, and it’s our job to figure out which storage solution is best.
Exercise 2: Create an Azure Storage Account
- In this exercise, we’ll create an Azure resource group in the region closest to the lab location.
- Create containers named images, phonecalls, and tweets within the storage account.
- Upload some graphics to the images container of the storage account (a Python sketch of these steps follows below).
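The lab performs these steps in the Azure portal; the following is a minimal Python sketch of the same container creation and upload using the azure-storage-blob SDK. The connection string, container names, and local file path are placeholders.

```python
from azure.storage.blob import BlobServiceClient

# Connection string from the storage account's "Access keys" blade (placeholder).
service = BlobServiceClient.from_connection_string("<storage-account-connection-string>")

# Create the three containers used in the lab.
for name in ("images", "phonecalls", "tweets"):
    service.create_container(name)

# Upload a sample graphic to the images container.
images = service.get_container_client("images")
with open("logo.png", "rb") as data:  # hypothetical local graphic file
    images.upload_blob(name="logo.png", data=data, overwrite=True)
```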
Exercise 3: Explain Azure Data Lake Storage
- In this exercise, we’ll create and configure a storage account as a Data Lake Storage Gen2 storage type in the region closest to the lab location, within the resource group.
Exercise 4: Upload data into Azure Data Lake
- In this exercise, we’ll install and start Microsoft Azure Storage Explorer and upload some data files to the containers of the Data Lake Storage Gen2 account; an SDK-based sketch follows below.
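Storage Explorer handles the upload through its UI; a roughly equivalent sketch with the azure-storage-file-datalake SDK is shown below. The account name, key, file system, and paths are placeholders.

```python
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<datalake-account>.dfs.core.windows.net",
    credential="<account-key>",  # placeholder; prefer Azure AD credentials in practice
)

fs = service.get_file_system_client("data")   # the container (file system) used in the lab
fs.create_directory("raw")                    # target folder for the sample files

with open("sales.csv", "rb") as data:         # hypothetical local data file
    fs.get_file_client("raw/sales.csv").upload_data(data, overwrite=True)
```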
Lab 3: Enabling Team-Based Data Science with Azure Databricks
Exercise 1: Explain Azure Databricks
- Azure Databricks is an easy-to-set-up data analytics platform based on the Apache Spark “big data” engine.
Exercise 2: Work with Azure Databricks
- In this exercise, we’ll create an Azure Databricks Premium tier instance in a resource group, open Azure Databricks, launch the Databricks workspace, and create a Spark cluster.
Exercise 3: Read data with Azure Databricks
- In this exercise, we’ll confirm that the Databricks cluster has been created, collect the Azure Data Lake Storage Gen2 account name, and enable the Databricks instance to access the Data Lake Storage Gen2 account.
- We’ll create a Databricks notebook, connect to the Data Lake Store, and then read data in Azure Databricks (see the notebook sketch below).
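A minimal sketch of the notebook cell that does this, assuming access-key authentication to the Data Lake Storage Gen2 account (account, container, and file names are placeholders):

```python
# Databricks notebook cell: configure access to ADLS Gen2 with the storage account key.
account = "<datalake-account>"
spark.conf.set(
    f"fs.azure.account.key.{account}.dfs.core.windows.net",
    "<account-key>",  # in practice, store this in a Databricks secret scope
)

# Read a CSV file from the Data Lake into a Spark DataFrame.
path = f"abfss://data@{account}.dfs.core.windows.net/raw/sales.csv"
df = spark.read.csv(path, header=True, inferSchema=True)

display(df)  # Databricks notebook helper; use df.show() elsewhere
```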
Exercise 4: Perform basic transformations with Azure Databricks
- In this exercise, we’ll retrieve specific columns of a dataset, rename a column, and add an annotation. If time permits, we’ll perform additional transformations (a sample cell follows below).
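Continuing the notebook sketch from Exercise 3, these transformations might look like the following; the column names are assumptions for illustration.

```python
from pyspark.sql import functions as F

transformed = (
    df.select("OrderId", "CustomerName", "Amount")        # retrieve specific columns
      .withColumnRenamed("Amount", "OrderAmount")         # rename a column
      .withColumn("Source", F.lit("ADLS Gen2 raw zone"))  # add an annotation column
)

display(transformed)
```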
Lab 4: Building Globally Distributed Databases with Cosmos DB
Exercise 1: Create an Azure Cosmos DB database built to scale
- In this exercise, we’ll create an Azure Cosmos DB instance.
Exercise 2: Insert and query data in your Azure Cosmos DB database
- In this exercise, we’ll set up an Azure Cosmos DB database and container and then add data using the portal.
- We’ll run queries in the Azure portal and run complex operations on our data; an SDK-based sketch follows below.
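The portal’s Data Explorer is used in the lab; the following is a minimal sketch of the same insert-and-query flow with the azure-cosmos Python SDK. The endpoint, key, database, container, and item shape are placeholders.

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://<cosmos-account>.documents.azure.com:443/",
    credential="<primary-key>",
)
database = client.create_database_if_not_exists("RetailDemo")
container = database.create_container_if_not_exists(
    id="Products",
    partition_key=PartitionKey(path="/category"),
)

# Add (upsert) a document.
container.upsert_item({"id": "1", "category": "bikes", "name": "Road Bike", "price": 799})

# Query it back with the same SQL-like syntax used in the portal.
for item in container.query_items(
    query="SELECT c.name, c.price FROM c WHERE c.category = 'bikes'",
    enable_cross_partition_query=True,
):
    print(item)
```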
Exercise 3: Distribute your data globally with Azure Cosmos DB
- In this exercise, we’ll replicate data to multiple regions and manage failover.
Lab 5: Working with Relational Data Stores in the Cloud
Exercise 1: Use Azure SQL Database
- Azure SQL Database is a “database as a service” offering that runs the SQL Server database engine under the hood. It is not 100% compatible with on-premises SQL Server: some SQL Server features are not supported, so minor changes to our code might be required.
- In this exercise, we’ll create and configure a SQL Database instance and connect to it, as sketched below.
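A minimal sketch of connecting to the new database from Python with pyodbc; server, database, and login are placeholders, and the client IP must first be allowed through the server firewall.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<server-name>.database.windows.net,1433;"
    "DATABASE=<database-name>;"
    "UID=<sql-admin-user>;PWD=<password>;"
    "Encrypt=yes;"
)

cursor = conn.cursor()
cursor.execute("SELECT @@VERSION")   # quick sanity check that the connection works
print(cursor.fetchone()[0])
conn.close()
```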
Exercise 2: Describe Azure Synapse Analytics
- In this exercise, we’ll create and configure an Azure Synapse Analytics instance, configure the server firewall, and then pause the warehouse database.
Exercise 3: Creating an Azure Synapse Analytics database and tables
- In this exercise, we’ll install SQL Server Management Studio, connect to the data warehouse instance, create a SQL Data Warehouse database, and create SQL Data Warehouse tables (see the DDL sketch below).
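The lab runs the DDL from SSMS; the sketch below sends an equivalent CREATE TABLE to the data warehouse through pyodbc. The distribution choice, table, and column names are assumptions for illustration.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<synapse-server>.database.windows.net,1433;"
    "DATABASE=<sql-pool-name>;UID=<admin-user>;PWD=<password>;Encrypt=yes;"
)
conn.autocommit = True

conn.execute("""
CREATE TABLE dbo.FactSales
(
    SaleId     INT           NOT NULL,
    CustomerId INT           NOT NULL,
    SaleAmount DECIMAL(18,2) NOT NULL,
    SaleDate   DATE          NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerId),   -- spread rows across the distributions
    CLUSTERED COLUMNSTORE INDEX        -- default, analytics-friendly storage
);
""")
```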
Exercise 4: Using PolyBase to Load Data into Azure Synapse Analytics
- PolyBase allows us to query external data sources such as SQL Server, Oracle, Teradata, MongoDB, and Azure Blob Storage.
- In this exercise, we’ll collect the Data Lake Storage container and key details and then create a dbo.Dates table from Azure Data Lake Storage using PolyBase, as sketched below.
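A compressed sketch of the PolyBase pattern the exercise follows (scoped credential, external data source, file format, external table, then CTAS into dbo.Dates), sent as T-SQL through pyodbc; the lab runs the same statements in SSMS. The storage account, container, key, and column names are placeholders.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<synapse-server>.database.windows.net,1433;"
    "DATABASE=<sql-pool-name>;UID=<admin-user>;PWD=<password>;Encrypt=yes;"
)
conn.autocommit = True

statements = [
    "CREATE MASTER KEY;",  # once per database, required before the scoped credential

    """CREATE DATABASE SCOPED CREDENTIAL DataLakeCredential
       WITH IDENTITY = 'user', SECRET = '<storage-account-key>';""",

    """CREATE EXTERNAL DATA SOURCE DataLakeStorage
       WITH (TYPE = HADOOP,
             LOCATION = 'abfss://data@<datalake-account>.dfs.core.windows.net',
             CREDENTIAL = DataLakeCredential);""",

    """CREATE EXTERNAL FILE FORMAT CsvFormat
       WITH (FORMAT_TYPE = DELIMITEDTEXT,
             FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));""",

    """CREATE EXTERNAL TABLE dbo.DatesExternal (DateKey INT, CalendarDate DATE)
       WITH (LOCATION = '/Dates/', DATA_SOURCE = DataLakeStorage, FILE_FORMAT = CsvFormat);""",

    """CREATE TABLE dbo.Dates
       WITH (DISTRIBUTION = ROUND_ROBIN, CLUSTERED COLUMNSTORE INDEX)
       AS SELECT * FROM dbo.DatesExternal;""",
]

for sql in statements:
    conn.execute(sql)
```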
Lab 6: Performing Real-Time Analytics with Stream Analytics
Exercise 1: Explain data streams and event processing
- In this exercise, we’ll identify the data stream ingestion technology for AdventureWorks and the high-level tasks that a data engineer will conduct to complete the social media analysis requirements from the case study and the scenario.
Exercise 2: Data Ingestion with Event Hubs
- In this exercise, we’ll create and configure an Event Hubs namespace, create an event hub, and configure Event Hubs security; a small producer sketch follows below.
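The lab’s telecom event generator application does the sending; the following is a minimal Python sketch of pushing a test event to the same event hub with the azure-eventhub SDK. The connection string, hub name, and payload are placeholders.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

producer = EventHubProducerClient.from_connection_string(
    conn_str="<event-hubs-namespace-connection-string>",
    eventhub_name="<event-hub-name>",
)

with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"callId": 1, "durationSeconds": 42})))
    producer.send_batch(batch)
```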
Exercise 3: Starting the telecom event generator application
- In this exercise, we’ll update the application connection string and run the application.
Exercise 4: Processing Data with Stream Analytics Jobs
In this exercise, we’ll do the following tasks:
- Provision a Stream Analytics job and Specify the Stream Analytics job input.
- Specify the Stream Analytics job output and define a Stream Analytics query (a sample query follows this list).
- Start the Stream Analytics job and Validate streaming data is collected.
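For reference, here is a sketch of the kind of windowed query defined in this exercise, kept as a Python string (it is pasted into the job’s Query blade in the portal). The input/output aliases and column names are assumptions for illustration.

```python
STREAM_ANALYTICS_QUERY = """
SELECT
    System.Timestamp AS WindowEnd,
    SwitchNum,
    COUNT(*) AS CallCount
INTO
    [call-output]
FROM
    [call-input] TIMESTAMP BY CallRecTime
GROUP BY
    SwitchNum,
    TumblingWindow(second, 5)
"""
# [call-input] is the event hub input alias and [call-output] the job output alias;
# the query counts calls per switch over 5-second tumbling windows.
```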
Lab 7: Orchestrating Data Movement with Azure Data Factory
Exercise 1: Setup Azure Data Factory
- In this exercise, we’ll set up Azure Data Factory.
Exercise 2: Ingest data using the Copy Activity
- In this exercise, we’ll add the Copy Activity to the designer, create a new HTTP dataset to use as a source, create a new ADLS Gen2 dataset as a sink, and test the Copy Activity.
Exercise 3: Transforming Data with Mapping Data Flow
- In this exercise, we’ll prepare the environment, add a data source, use Mapping Data Flow transformations to write to a data sink, and then run the pipeline.
Exercise 4: Azure Data Factory and Databricks
- In this exercise, we’ll generate a Databricks access token, create a Databricks notebook, create linked services, create a pipeline that uses the Databricks Notebook activity, and then trigger a pipeline run.
Lab 8: Securing Azure Data Platforms
Exercise 1: An introduction to security
- We will find accurate and timely information about Azure security. In this exercise, we’ll look at security as a layered (defense-in-depth) approach.
Exercise 2: Key security components
- In this exercise, we’ll assess data and storage security hygiene.
Exercise 3: Securing Storage Accounts and Data Lake Storage
- In this exercise, we’ll determine the appropriate security approach for Azure Blob storage, such as the shared access signature sketch below.
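One concrete approach from this exercise is granting time-limited, read-only access to a container with a shared access signature instead of sharing account keys. A minimal sketch with azure-storage-blob; the account name, key, and container are placeholders.

```python
from datetime import datetime, timedelta
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

sas_token = generate_container_sas(
    account_name="<storage-account>",
    container_name="images",
    account_key="<account-key>",
    permission=ContainerSasPermissions(read=True, list=True),  # read-only access
    expiry=datetime.utcnow() + timedelta(hours=1),              # token expires after an hour
)

print(f"https://<storage-account>.blob.core.windows.net/images?{sas_token}")
```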
Exercise 4: Securing Data Stores
- In this exercise, we’ll enable auditing, query the database, and view the audit log (see the sketch below).
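Auditing itself is switched on in the portal; once audit records land in the storage account, they can be read back with sys.fn_get_audit_file. A minimal sketch via pyodbc, with the server, database, and audit log path as placeholders.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=tcp:<server-name>.database.windows.net,1433;"
    "DATABASE=<database-name>;UID=<sql-admin-user>;PWD=<password>;Encrypt=yes;"
)

rows = conn.execute("""
    SELECT TOP 10 event_time, action_id, statement
    FROM sys.fn_get_audit_file(
        'https://<audit-storage-account>.blob.core.windows.net/sqldbauditlogs/', DEFAULT, DEFAULT)
    ORDER BY event_time DESC;
""").fetchall()

for row in rows:
    print(row.event_time, row.action_id, row.statement)
```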
Exercise 5: Securing Streaming Data
- In this exercise, we’ll change Event Hubs permissions.
Lab 9: Monitoring and Troubleshooting Data Storage and Processing
Exercise 1: Explain the monitoring capabilities that are available
- In this exercise, we’ll define a corporate monitoring approach, looking at services such as:
- Network Performance Monitor.
- Application Gateway Analytics.
Exercise 2: Troubleshoot common data storage issues
- In this exercise, we’ll identify common issues related to data storage, such as:
- Consistency
- Corruption
Exercise 3: Troubleshoot common data processing issues
- In this exercise, we’ll determine issues that are related to data processing.
Exercise 4: Manage disaster recovery
- In this exercise, we’ll manage disaster recovery.
- In Azure, there are two core services that we’ll take advantage of: Azure Site Recovery (ASR) and Azure Backup. The two complement each other to provide an end-to-end business continuity and disaster recovery solution with unlimited scale.
Skills Measured in Exam DP-201
- Design Azure data storage solutions (40-45%)
- Design data processing solutions (25-30%)
- Design for data security and compliance (25-30%)
Lab 1 – Data Platform Architecture Considerations
Exercise 1: Design with Security in Mind
- In this exercise, we’ll identify the security requirements of AdventureWorks from the case study. Microsoft treats security and compliance for your data as a layered (defense-in-depth) model, dividing the security of your application into seven different layers.
Exercise 2: Design for Performance and Scalability
- In this exercise, we’ll determine the scalability and performance requirements as identified from the case study.
- Scale up: adding more resources (such as CPU and memory) to the same instance.
- Scale out: adding more instances to a cluster and distributing the load across them.
Exercise 3: Design for Availability and Recoverability
- In this exercise, we’ll determine the recoverability and availability requirements as identified from the case study.
Exercise 4: Design for Efficiency and Operations
- In this exercise, we’ll determine the operations and efficiency requirements for AdventureWorks.
Lab 2 – Azure Batch Processing Reference Architectures
Exercise 1: Design an Enterprise BI solution in Azure
- From the case study, we’ll identify the requirements that would form part of the Batch mode processing of data in an Enterprise BI solution in AdventureWorks.
- We’ll also build a high-level Architecture that reflects the Enterprise BI solution in AdventureWorks.
Exercise 2: Automate enterprise BI solutions in Azure
- In this exercise, we’ll enhance a high-level Architecture to include automation of an Enterprise BI solution in AdventureWorks.
Exercise 3: Conversational bot solutions in Azure
- In this exercise, we’ll enhance a high-level Architecture to include a conversational bot solution in AdventureWorks.
Lab 3 – Azure Real-Time Reference Architectures
Exercise 1: Architect a stream processing pipeline with Azure Stream Analytics
- In this exercise, we’ll identify the requirements that would form part of the real-time processing of data in AdventureWorks from the case study.
- We’ll build a high-level Architecture that reflects a stream processing pipeline with Azure Stream Analytics.
Exercise 2: Design a stream processing pipeline with Azure Databricks
- In this exercise, we’ll create a high-level Architecture to include a stream processing pipeline with Azure Databricks solution in AdventureWorks.
Exercise 3: Create an Azure IoT reference architecture
- In this exercise, we’ll confirm which components would form part of an Azure IoT reference architecture.
Lab 4 – Azure Data Platform Security Considerations
Exercise 1: Defense in Depth Security Approach
- We’ll identify the security requirements for AdventureWorks from the case study.
Exercise 2: Identity Management
- We’ll define the primary authentication mechanism for each technology used to meet AdventureWorks requirements.
Lab 5 – Designing for Scale and Resiliency
Exercise 1: Adjust Workload Capacity by Scaling
- In this exercise, we’ll list the services from the case study that would benefit from scaling and how the scale units are measured per service.
Exercise 2: Design for Optimized Storage and Database Performance
- In this exercise, we’ll define a service feature that can be used to optimize storage and database performance.
Exercise 3: Design a Highly Available Solution
- In this exercise, we’ll define a service feature that provides high availability where possible.
Exercise 4: Incorporate Disaster Recovery into Architectures
- In this exercise, we’ll outline the disaster recovery approach for the data services used by AdventureWorks.
Lab 6 – Designing for Efficiency and Operations
Exercise 1: Maximize the Efficiency of your Cloud Environment
- In this exercise, we’ll provide a link to the Azure Pricing Calculator and a list of best practices that the IS department should follow to minimize costs.
Exercise 2: Use Monitoring and Analytics to Gain Operational Insights
- In this exercise, we’ll draft a monitoring and analytics strategy that should be adopted by AdventureWorks.
Exercise 3: Use Automation to Reduce Effort and Error
- In this exercise, we’ll list the options for automation languages and approaches.
Next Task For You
To know more about data engineering for beginners, why you should learn it, job opportunities, and what to study, work through the hands-on labs above to clear [DP-200] Implementing an Azure Data Solution and [DP-201] Designing an Azure Data Solution, and join the waitlist for the Microsoft Azure Data Engineer Associate certification.