Channel: Cloud Training Program

Azure Data Engineer [DP-203] Q/A | Day 1 Live Session Review


An Azure Data Engineer provides end-to-end support for building and managing data solutions using Azure data services. They optimize data workflows (ingesting, transforming and processing data) and deploy the resulting data solutions.

In this post, we will be sharing the Azure Data Engineer Day 1 Live Session Review FAQs. This will help you in understanding some basic concepts.

First of all, there are 17 modules & 15+ hands-on labs which are important to learn to become an Azure Data Engineer.

In the first Live Session (Day 1) of the Azure Data Engineer Training Program, we covered the concepts of Azure for Data Engineer and an overview of Azure SQL Database.

Note: If you want to know about the Certification read our blog post.

So, here we discuss some FAQs asked during the Live Session from Module 1: Azure for Data Engineer.

>Basic Data Terminologies

To get started with the first Module of Azure Data Engineer, it is important to clarify the data types and uses of various data storage services that we use on-premise. This will give us a better understanding of Azure Services that provide similar resources in the Cloud.

Q1: What is a Database and a Data Warehouse?

A: A database is used for storing transactional data, so its data is updated frequently. It is normalized to reduce redundancy and to optimize transactional queries (insert, update, delete).

A data warehouse, on the other hand, is used for storing historical and analytical data to obtain business insights. The data is read regularly but loaded periodically, for example weekly, monthly or yearly. It is therefore often only partially normalized (or denormalized), since the main focus is on SELECT (read) queries rather than insert and update queries.

Q2: Why do we need a Data Warehouse if we have a Database?

A: A database mainly serves the reads and writes of transactions. Although it is normalized to optimize those queries, serving analytical queries from the same database leads to more blocked queries, locks and deadlocks.

Hence, to separate the transactional and analytical environments, we need a data warehouse.

A data warehouse mainly serves reads for insights and analytics, so it is optimized for reading data quickly. It maintains the relationships between the data in a different way.

Both a database and a data warehouse store data in tables of rows and columns, but their purpose, and hence their logical structure, is different.

Q3: How do we convert data from a Database to a Data Warehouse?

A: In a database, data is organized as tables with rows and columns. These tables are related to each other through primary and foreign keys.

A data warehouse has two types of tables, Fact Tables and Dimension Tables, which are also related through primary and foreign keys. The numeric data from the database that can be aggregated is stored in the Fact Tables, and the data by which these numeric values can be categorized or grouped is stored in the Dimension Tables.

We migrate data from the database into a data warehouse by loading it into Fact and Dimension Tables.
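The split into Fact and Dimension Tables can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumed names: the tables `customers`, `orders`, `dim_customer` and `fact_sales`, and the sample rows, are hypothetical and not part of the training material, and a real warehouse load would normally run through an Azure service rather than SQLite.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Transactional (database) side: normalized tables. Names are hypothetical.
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    order_date TEXT,
    amount REAL
);
INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Leeds');
INSERT INTO orders VALUES
    (10, 1, '2021-01-05', 120.0),
    (11, 1, '2021-02-10', 80.0),
    (12, 2, '2021-01-20', 50.0);

-- Warehouse side: one Dimension Table (grouping attributes)
-- and one Fact Table (numeric values that can be aggregated).
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE fact_sales (
    order_id INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    order_month TEXT,
    amount REAL
);

-- Load the warehouse tables from the transactional tables.
INSERT INTO dim_customer SELECT customer_id, name, city FROM customers;
INSERT INTO fact_sales
    SELECT order_id, customer_id, substr(order_date, 1, 7), amount FROM orders;
""")

# Analytical query: aggregate the fact values, grouped by a dimension attribute.
sales_by_city = cur.execute("""
    SELECT d.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_key = f.customer_key
    GROUP BY d.city
    ORDER BY d.city
""").fetchall()

print(sales_by_city)
```

Here `amount` is the aggregatable value kept in the Fact Table, while `city` is a grouping attribute kept in the Dimension Table.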

> Module 1: Azure For Data Engineer

After clearing the concepts of data storage facilities above, we move to Module 1 where we understand the roles and responsibilities of a Data Engineer and the changing nature of data in today’s world. We also understand a broad level of some Azure Data Services and where they can be used when building data solutions.


Q4: What are the various Data Abundance factors a Data Engineer needs to consider?

A: The following factors are crucial for a Data Engineer to consider:

  • Process: The Data Engineer needs to prepare and structure the data so that it is ready for further use. This covers data ingestion, storage, transformation, management, aggregation and reporting.
  • Consumers: Data is generated and consumed in a wide variety of formats and from many sources, such as relational data, non-relational data, and sensor and device data.
  • Variety: A Data Engineer should be able to work with the variety of data that is generated and consumed.
  • Responsibilities and Technologies: Data Engineers should be familiar with the technologies that help them handle and process data. Azure provides multiple services for working with data.

Q5: On-Premise v/s Cloud Technologies (Azure)

A: Azure migration offers many benefits such as:

  • Computing Environment: Azure offers computing services that are easy to provision and quick to start using.
  • Licensing Model: You can use your own on-premises product licenses while migrating to the Cloud.
  • Maintainability: Azure maintains and manages the services, so you do not need to take care of that yourself.
  • Scalability: Additional resources can be provisioned easily.
  • Availability: Azure maintains replicas of data and resources so that it can recover quickly in case of a disaster or failure.


Q6: Data Engineering Job Responsibilities

A: A Data Engineer needs to keep learning new skills and platforms in order to work with a variety of data. Implementing and provisioning data solutions for customers is equally important. They should be able to apply a suitable data loading approach as scenarios change:

  • Extract-Transform-Load (ETL) extracts the data from the sources and transforms it according to the requirements before loading it into the destination.
  • Extract-Load-Transform (ELT) loads the raw data as-is into the destination and then transforms it iteratively, as requirements arise. Because the raw data stays available for new transformations, ELT is often the more flexible choice, particularly for large and varied data.
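The difference in ordering between the two approaches can be shown with a small sketch in plain Python. The sensor rows, the `to_celsius` transformation and the function names are all hypothetical; in practice the destination would be a data store, not an in-memory list or dictionary.

```python
# Hypothetical raw rows extracted from a source system.
raw_rows = [
    {"sensor": "a", "temp_f": 68.0},
    {"sensor": "b", "temp_f": 212.0},
]


def to_celsius(row):
    """Example transformation: convert a Fahrenheit reading to Celsius."""
    return {
        "sensor": row["sensor"],
        "temp_c": round((row["temp_f"] - 32) * 5 / 9, 1),
    }


def run_etl(source):
    """ETL: transform in flight, so only the transformed rows reach the destination."""
    destination = [to_celsius(row) for row in source]
    return destination


def run_elt(source):
    """ELT: load the raw rows first, then transform them inside the destination."""
    destination = {"raw": list(source)}
    # The raw copy stays available, so further transformations can be added later.
    destination["curated"] = [to_celsius(row) for row in destination["raw"]]
    return destination


etl_result = run_etl(raw_rows)
elt_result = run_elt(raw_rows)
print(etl_result)
print(elt_result["curated"])
```

Both approaches end with the same curated rows; the difference is that the ELT destination also keeps the raw rows for later reprocessing.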

Read more about the difference between a Data Engineer, Data Scientist and Data Analyst here.

Q7: Azure Services for various Industrial Use Cases

A: Azure has various uses in a variety of industrial domains, some of which are:

  • Web Retail: Azure Cosmos DB is a good fit here, since it excels at replication and provides fast query handling for mobile and web applications used across the globe.
  • HealthCare: Azure Databricks is well suited for big data analytics and AI solutions to be used in the healthcare sector for prediction or building other useful models.
  • IoT Scenarios: Data Engineers can use Azure IoT Hub to deploy and track IoT sensors capturing real-time data.

Q8: Structured Data v/s Unstructured Data

A: Structured data is data that has been organized into a formatted repository, typically a database with rows and columns. It has relational keys and can easily be mapped into pre-designed fields. It is the most common type in application development and the simplest to manage. Example: Relational data.

Semi-structured data does not reside in a rigid, well-structured format, but it has some organizational properties, such as tags or key-value pairs, that make it easier to analyze. With some processing, it can sometimes be stored in a relational database. Example: XML data.

Unstructured data has no designated structure. It is often kept in non-relational (NoSQL) or file and object stores. Example: Word documents, PDFs, text files, media files and logs.
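As a rough illustration of the three categories, the sketch below reads one sample of each with Python's standard library. The sample values are made up, and JSON is used for the semi-structured case as another tag/key-value format alongside XML.

```python
import csv
import io
import json

# Structured: fixed columns, every row fits the same pre-designed fields.
structured = "id,name,city\n1,Asha,Pune\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, with optional and nested fields.
semi_structured = '{"id": 2, "name": "Ben", "tags": ["vip"], "address": {"city": "Leeds"}}'
doc = json.loads(semi_structured)

# Unstructured: free text with no fields; it has to be searched or parsed.
unstructured = "Ben called on Monday about a late delivery to Leeds."
mentions_leeds = "Leeds" in unstructured

print(rows[0]["city"])
print(doc["address"]["city"])
print(mentions_leeds)
```

The structured and semi-structured samples can be addressed by field name, while the unstructured text can only be searched.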


Read more about different types of Structured, Semi-Structured and Unstructured Data.

Q9: What are the different Azure Services to work with Data?

A: Some of the commonly used Azure Data Services are:

>Overview Of Azure SQL Database

After understanding the roles and responsibilities of a Data Engineer and the various Azure Services that can be used, we started with the first part of Module 3: Azure SQL Database. We had a glimpse of the Azure SQL Database and a small lab demo of how to provision the same on Azure.

Q10: What is an Azure SQL database?

A: The Azure SQL Database is a fully managed platform as a service (PaaS) database engine that handles most of the database management functions such as upgrading, patching, backups, and monitoring without user involvement.

It runs on the latest stable version of the SQL Server database engine on a patched OS, with 99.99% availability. With Azure SQL Database, you can create a highly available and high-performance data storage layer for applications and solutions in Azure.
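To put the 99.99% availability figure in perspective, the short calculation below converts an availability percentage into the downtime it still allows. The helper name `allowed_downtime_minutes` is hypothetical, and the actual guarantee depends on the service tier and the current Azure SLA.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Return the downtime (in minutes) that an availability target permits."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


per_year = round(allowed_downtime_minutes(99.99, 365), 2)
per_month = round(allowed_downtime_minutes(99.99, 30), 2)
print(per_year, per_month)
```

At 99.99%, this works out to roughly 52.56 minutes per year, or about 4.32 minutes in a 30-day month.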


Q11: Will a free Azure subscription allow us to complete all the labs in this course?

A: You will be able to complete most of the DP-203 training labs using an Azure free trial account (free Azure subscription).

Feedback Received…

Here is some positive feedback from our trainees who attended the session:

Read more about the DP-203 Certification and whether it is the right certification for you, from our blog on Exam DP-203: Data Engineering on Microsoft Azure


Quiz Time (Sample Exam Questions)!

With our Azure Data Engineer Training Program, we cover 150+ sample exam questions to help you prepare for the DP-203 Certification.

Check out one of the questions and see if you can crack this…

Ques. You must apply patches to the Azure SQL database regularly. State whether the above statement is true or false.

A. True

B. False

Comment with your answer & we will tell you if you are correct or not!


Next Task For You

Begin your journey towards becoming a Microsoft Certified: Azure Data Engineer by checking out our FREE CLASS.

The post Azure Data Engineer [DP-203] Q/A | Day 1 Live Session Review appeared first on Cloud Training Program.

