Azure Data Engineering Course Content

The document outlines the content of an Azure Data Engineering course, covering key topics such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics. It includes details on building ETL pipelines, data integration, performance tuning, and real-time data processing. The course culminates in an end-to-end project that integrates all three services.

Uploaded by

ampooja24

Azure Data Engineering Course Content:

Azure Data Factory (ADF)

1. Overview of Azure Data Factory

o What is ADF, its use cases, and real-world applications

o Key components: Linked Services, Datasets, Pipelines, Triggers

2. Building Pipelines

o Creating ETL pipelines from scratch

o Data transformation activities: Copy Data, Mapping Data Flows

o Integration with Blob Storage, SQL Database

3. Parameterization and Dynamic Content

o Using parameters, variables, and expressions

o Dynamic Linked Services and Datasets

4. Pipeline Control Flow

o Conditional execution, ForEach loops, and failure handling

5. Monitoring and Debugging

o Debugging pipelines in real time

o Monitoring pipeline runs and understanding metrics

6. Integration with Other Azure Services

o Using Databricks notebooks in ADF pipelines

o Moving data to/from Azure Synapse Analytics
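
As an illustration of the parameterization and Copy Data topics above, here is a minimal sketch of what a parameterized pipeline definition can look like when assembled as JSON. The pipeline, dataset, and parameter names (CopyRawToSql, BlobInputDataset, SqlOutputDataset, inputFolder, folder) are hypothetical, and the source and sink type names are assumptions that depend on the connectors actually used.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return an ADF-style pipeline definition as a plain dict.

    All names passed in are hypothetical; nothing is deployed to Azure here.
    """
    return {
        "name": name,
        "properties": {
            # Pipeline-level parameter, supplied at trigger or debug time.
            "parameters": {
                "inputFolder": {"type": "String", "defaultValue": "raw"},
            },
            "activities": [
                {
                    "name": "CopyBlobToSql",
                    "type": "Copy",
                    "inputs": [
                        {
                            "referenceName": source_dataset,
                            "type": "DatasetReference",
                            # Dynamic content: the dataset parameter is filled
                            # from the pipeline parameter via an expression.
                            "parameters": {
                                "folder": {
                                    "value": "@pipeline().parameters.inputFolder",
                                    "type": "Expression",
                                }
                            },
                        }
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumed connector types; adjust to the real source and sink.
                    "typeProperties": {
                        "source": {"type": "DelimitedTextSource"},
                        "sink": {"type": "AzureSqlSink"},
                    },
                }
            ],
        },
    }


pipeline = build_copy_pipeline("CopyRawToSql", "BlobInputDataset", "SqlOutputDataset")
print(json.dumps(pipeline, indent=2))
```

The printed JSON is only a starting point for comparison with what the ADF authoring UI generates; the UI output remains the reference for the exact properties a given connector expects.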

Azure Databricks

1. Overview of Azure Databricks

o Databricks architecture and ecosystem

o Setting up workspace: Clusters, Notebooks, and Libraries

2. Introduction to Apache Spark

o Core Spark concepts: RDDs, DataFrames, and Datasets

o Introduction to Delta Lake and its benefits

3. Optimization Techniques

o Understanding shuffling, partitioning, and caching

o Optimizing queries with broadcast joins

4. Structured Streaming

o Processing real-time data streams

o Handling streaming data with Delta Lake

5. Integration with ADF and Synapse

o Connecting Databricks to other Azure services

o Writing processed data to Synapse and Blob Storage
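
The broadcast join listed under Optimization Techniques can be pictured with a small plain-Python sketch. This is not Spark code: it only mimics the idea that the small table is turned into an in-memory lookup and made available to every partition of the large table, so the large table does not need to be shuffled. The function name and sample rows are hypothetical. In PySpark, the corresponding hint is the `broadcast` function from `pyspark.sql.functions`.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join each partition of the large side against a shared lookup."""
    # Build the lookup once from the small side. In Spark, this is the copy
    # that would be shipped ("broadcast") to every executor.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined = []
    # Each partition is processed independently; no rows of the large side
    # move between partitions.
    for partition in large_partitions:
        for row in partition:
            for match in lookup.get(row[key], []):
                joined.append({**row, **match})
    return joined


# Hypothetical sample data: a "large" table split into two partitions
# and a small dimension table.
orders = [
    [{"country_id": 1, "amount": 10}, {"country_id": 2, "amount": 5}],
    [{"country_id": 1, "amount": 7}, {"country_id": 3, "amount": 2}],
]
countries = [
    {"country_id": 1, "country": "IN"},
    {"country_id": 2, "country": "US"},
]

result = broadcast_hash_join(orders, countries, "country_id")
for row in result:
    print(row)
```

The order with country_id 3 has no match in the small table, so it is dropped, as in any inner join. The approach pays off only while the small side fits comfortably in memory on every worker.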

Azure Synapse Analytics

1. Overview of Azure Synapse Analytics

o Synapse architecture: serverless SQL pool (formerly SQL on-demand) and dedicated SQL pool

o Key concepts: Partitioning, Indexing, and Data Distribution

2. Data Integration

o Loading data from ADF and Databricks to Synapse

o Working with external tables and PolyBase

3. Performance Tuning

o Query optimization techniques in Synapse

o Managing table partitions and statistics

4. Security and Monitoring

o Implementing Row-Level Security (RLS)

o Monitoring Synapse performance and job activity

5. End-to-End Project

o Design and implement a complete ETL pipeline connecting ADF, Databricks, and Synapse

o Perform data transformations, aggregations, and reporting using Synapse
