#ETL Developer

ETL Developer is responsible for extracting data from various sources by designing, building and maintaining data pipelines. ETL Developer role is in high demand and offers excellent pay scale in the tech industry. They often use Azure and AWS cloud platforms to handle their operations.

✅ SQL for ETL Development
✅ Data Mapping, Transformations
✅ Azure Data Factory (ADF)
✅ Azure Databricks
✅ SSIS, ADF For ETL & LET
✅ DWH & Star Schema Design
✅ Pipeline Tuning, Batch Processing
✅ Python, PySpark for Automations
✅ Real Time Project
✅ 1:1 Mentorship, Resume

Module 1: SQL Server TSQL (MS SQL) Queries

Ch 1: Data Analyst Job Roles

Introduction to Data
Data Analyst Job Roles
Data Analyst Challenges
Data and Databases Intro

Ch 2: Database Intro & Installations

Database Types (OLTP, DWH, ..)
DBMS: Basics
SQL Server 2025 Installations
SSMS Tool Installation
Server Connections, Authentications

Ch 3: SQL Basics V1 (Commands)

Creating Databases (GUI)
Creating Tables, Columns (GUI)
SQL Basics (DDL, DML, etc..)
Creating Databases, Tables
Data Inserts (GUI, SQL)
Basic SELECT Queries

Ch 4: SQL Basics V2 (Commands, Operators)

DDL : Create, Alter, Drop, Add, modify, etc..
DML: Insert, Update, Delete, select into, etc..
DQL: Fetch, Insert… Select, etc..
SQL Operations: LIKE, BETWEEN, IN, etc..
Special Operators

Ch 5: Data Types

Integer Data Types
Character, MAX Data Types
Decimal & Money Data Types
Boolean & Binary Data Types
Date and Time Data Types
SQL_Variant Type, Variables

Ch 6: Excel Data Imports

Data Imports with Excel
SQL Native Client
Order By: Asc, Desc
Order By with WHERE
TOP & OFFSET
UNION, UNION ALL

Ch 7: Schemas & Batches

Schemas: Creation, Usage
Schemas & Table Grouping
Real-world Banking Database
2 Part, 3 Part & 4 Part Naming
Batch Concept & “Go” Command

Ch 8: Constraints, Keys & RDBMS – Level 1

Null, Not Null Constraints
Unique Key Constraint
Primary Key Constraint
Foreign Key & References
Default Constraint & Usage
DB Diagrams & ER Models

Ch 9: Normal Forms & RDBMS – Level 2

Normal Forms: 1 NF, 2 NF
3 NF, BCNF and 4 NF
Adding PK to Tables
Adding FK to Tables
Cascading Keys
Self Referencing Keys
Database Diagrams

Ch 10: Joins & Queries

Joins: Table Comparisons
Inner Joins & Matching Data
Outer Joins: LEFT, RIGHT
Full Outer Joins & Aliases
Cross Join & Table Combination
Joining more than 2 tables

Ch 11: Views & RLS

Views: Realtime Usage
Storing SELECT in Views
DML, SELECT with Views
RLS: Row Level Security
WITH CHECK OPTION
Important System Views

Ch 12: Stored Procedures

Stored Procedures: Realtime Use
Parameters Concept with SPs
Procedures with SELECT
System Stored Procedures
Metadata Access with SPs
SP Recompilations
Stored Procedures, Tuning

Ch 13: User Defined Functions

Using Functions in MSSQL
Scalar Functions in Real-world
Inline & Multiline Functions
Parameterized Queries
Date & Time Functions
String Functions & Queries
Aggregated Functions & Usage

Ch 14: Triggers & Automations

Need for Triggers in Real-world
DDL & DML Triggers
For / After Triggers
Instead Of Triggers
Memory Tables with Triggers
Disabling DMLs & Triggers

Ch 15: Transactions & ACID

Transaction Concepts in OLTP
Auto Commit Transaction
Explicit Transactions
COMMIT, ROLLBACK
Checkpoint & Logging
Lock Hints & Query Blockin
READPAST, LOCKHINT

Ch 16: CTEs & Tuning

Common Table Expression
Creating and Using CTEs
CTEs, In-Memory Processing
Using CTEs for DML Operations
Using CTEs for Tuning
CTEs: Duplicate Row Deletion

Ch 17: Indexes Basics, Tuning

Indexes & Tuning
Clustered Index, Primary Key
Non Clustered Index & Unique
Creating Indexes Manually
Composite Keys, Query Optimizer
Composite Indexes & Usage

Ch 18: Group By Queries

Group By, Distinct Keywords
GROUP BY, HAVING
Cube( ) and Rollup( )
Sub Totals & Grand Totals
Grouping( ) & Usage
Group By with UNION
Group By with UNION ALL

Ch 19: Joins with Group By

Joins with Group By
3 Table, 4 Table Joins
Join Queries with Aliases
Join Queries & WHERE
Join Queries & Group By
Joins with Sub Queries
Query Execution Order

Ch 20: Sub Queries

Sub Queries Concept
Sub Queries & Aggregations
Joins with Sub Queries
Sub Queries with Aliases
Sub Queries, Joins, Where
Correlated Queries

Ch 21: Cursors & Fetch

Cursors: Realtime Usage
Local & Global Cursors
Scroll & Forward Only Cursors
Static & Dynamic Cursors
Fetch, Absolute Cursors

Ch 22: Window Functions, CASE

IIF Function and Usage
CASE Statement Usage
Window Functions (Rank)
Row_Number( )
Rank( ), DenseRank( )
Partition By & Order By

Ch 23: Merge(Upsert) & CASE, IIF

Merge Statement
Upsert Operations with Merge
Matched and Not Matched
IIF & CASE Statements
Merge Statement inside SPs
Merge with OLTP & DWH

Ch 24: Key Take-Aways from Module 1

Case Study 1: Medicare: Tasks, Solutions
Case Study 2: ECommerce: Task, Solutions
Chapter Wise Assignments: Solutions
Dailly Assignments: Review (Feedback)
Weekly Mock Interview: Feedbacks

Module 2: Python (For Data Analysts)

Ch 1: Python Introduction

Python Introduction
Python Versions
Python Implementations
Python Installations
Python IDE & Usage
Jupyter Notebooks

Ch 2: Python Operations

Basic Operations in Python
Python Scripts, Print()
Single, Multiline Statements
Python: Internal Architecture
Compiler Versus Interpreter

Ch 3: Data Types & Variables

Integer / Int Data Types
Float, String Data Types
Sequence Types: List, Tuple
Range, Complex & memview
Retrieving Data Type: type()

Ch 4: Python Operators

Arithmetic, Assignment Ops
Comparison Operators
Operator Precedence
If … Else Statement, Pass
Short Hand If, OR, AND
ELIF and ELSE IF Statements

Ch 5: Python Loops, Iterations

Python Loop & Realtime Use
Python While Loop Statement
Break and Continue Statement
Iterations & Conditions
Exit Conditions & For Loops
iter() and Looping Options

Ch 6: Python Functions

Python Functions & Usage
Function Parameters
Default & List Parameters
Python Lambda Functions
Recursive Functions, Usage
Return & Print @ Lamdba

Ch 7: Python Modules

Import Python Modules
Built In Modules & dir
datetime module in Python
Date Objections Creation
strftime Method & Usage
imports & datetime.now()

Ch 8: Python User Inputs & TRY

Try Except, Exception Handling
Raise an exception method
TypeError, Scripting in Python
Python User Inputs
Python Index Numbers
input() & raw_input()

Ch 9: Python File Handling

File Handling, Activities
Loop, Write, Close Files
Appending, Overwriting
import os, path.exists
f.open, f.write
f.read, f.close

Ch 10: Pandas DataFrames 1

Installation of Pandas
Python Modules & Pandas
Pandas Codebase & Usage
import pandas.DataFrame
Pandas Series, arrays

Ch 11: Pandas DataFrames 2

Indexes & Named Options
Locate Row and Load Rows
Row Index & Index Lists
Load Files Into a DataFrame
df.to_string() Function
tail() & null() Function

Ch 12: Pandas Transformations

Pandas – Cleaning Data
Replace, Transform Columns
Data Discovery & Column Fill
Identify & Remove Duplicates
dropna(), fillna() Functions
Data Plotting & matlib Lib

Ch 13: Realtime Project (Banking / Finance) For Data Analysis [End to End]

Module 3: Azure Data Engineer (ETL, DWH)

Part 1: ADF & Synapse

Ch 1: ETL, DWH Introduction

Data Warehouse (DWH)
Cloud Concepts: IaaS, PaaS
SaaS & Azure Cloud Concepts
Azure Resources & Groups
Storage, ETL, IoT Resources

Ch 2: Azure Intro, Azure SQL

Azure SQL Server, SQL DB
Azure SQL Database (OLTP)
Azure SQL Pool (DWH)
Connections from SSMS Tool
Connections from ADS Tool
Pause / Resume SQL Pool
Source Data Configurations

Ch 3: Azure Synapse (DWH)

Synapse Pool Architecture
Control Node, Compute Node
DMS & Partitioned Tables
Creating Tables with TSQL
Distributions: RR, Hash, Repl
Big Data Loads with TQL
Important DMFs & DMVs

Ch 4: Azure Data Factory (ADF)

Need for ADF & Pipelines
Data Orchestration with IR
Integration Runtime Engine
Linked Services, Datasets
Pipelines: Copy Data Activity
Data Flow Activity with IR

Ch 5: Azure SQL DB Loads

ADF: Author
Azure SQL Database Reads
Azure SQL Pool Writes
Synapse Analytics with IR
Pipeline Design, Validation
Pipeline Runs, Monitoring

Ch 6: BLOB Data Loads

Azure Storage Account
Azure BLOB Containers
BLOB Storage in ADF
Synapse Analytics with IR
ADF Pipeline Edits
Pipeline Runs, Monitoring

Ch 7: Pipeline Settings

ADF Pipeline Settings
Staging : Advantages
Reliable Logging
Best Effort Logging
DIU & DOCP with IR
Compressions, Health Check

Ch 8: File Incremental Loads

File Incremental Loads
Storage Account, Data Lake
Binary Copy, Schema Drift
Staging Concept in ADF
Initial, Incremental Loads
Schema & Data Changes

Ch 9: Table Incremental Loads

Implement SCD with ADF
Self Hosted IR: Realtime Use
On-premise Data: Incr Loads
Copy Method: Upsert, Keys
Staging & ADF Optimizations
Pipeline Runs, Activity IDs

Ch 10: ADF Data Flow – 1

Data Flow Transformations
Spark Clusters for Debugging
Optimized Clusters, Preview
Conditional Split, SELECT
Sort, Union Transformations
Pipelines with Data Flow

Ch 11: ADF Data Flow – 2

Working with Multiple Tables
Join Transform, Broadcast
Row Filters, Column Filters
Surrogate Keys, Derived Cols
ETL Loads Dates, Sink Options
Aggregated Data Loads

Ch 12: ADF Data Flow – 3

Pivot Transformation
Group By & Pivot Keys
Column Pattern, Deduplicate
Lookup, Cached Lookup
Tuning Transformations
Tuning Data Flow, Spark

Ch 13: ADF Data Flow – 4

Lookup Transformation
Cache Lookup
Inline Datasets
Data Validations
Lookup Versus Joins
Broadcast Options

Ch 14: ADF Metrics, Alerts

Azure Insights
Azure Metrics for ADF
Azure Metrics for Synapse
CPU, Memory Metrics
Alerts and Notifications
Action Groups, Tuning Options

Ch 15: ADF Parameters, Security

Linked Service Parameters
Creating Logins
Users and ETL Permissions
Parameterize Logins
Parameterize Users
Dynamic Linked Services

Ch 16: Parameters, SCD & ETL

ADF Templates in Realtime
Implementing ADF SCD
Table Incremental Loads
Control Tables, Watermarks
Pipeline Parameters, SPs
Dynamic Data Sets, SCD

Ch 17: CDC with ADF

Using CDC in ADF
Control Tables (CT): Upserts
Handling Inserts, Updates
Change Tracking (CT) Tables
SCD Type 1 & Type 2
ADF, Synapse: Limitations

Ch 18: Synapse Analytics – 1

Azure Synapse Analytics
Dedicated SQL Pools
TSQL: Stored Procedures
Synapse Pipelines, Tuning
SP Activity in Pipelines, Jobs
Comparing ADF & Synapse

Ch 19: Synapse Analytics – 2

Serverless Pools in Synapse
TSQL Scripts with Serverless
ADLS Data Imports with SPs
Synapse Analytics
Synapse Optimizations
Synapse Security & Logins

Ch 20: Synapse Analytics – 3

Synapse Notebooks
Synapse Analytics with Pools
Staging, Aggregations
Pipelines with Notebooks
Pipelines Vs Notebooks
Scheduling Notebooks

Ch 21: CI CD with GitHub

Creating Github Account
GIT: Main, Branches
Connecting with ADF
Version Changes
Builds and Deployments
CI-CD Integrations

Part 2: Azure IoT & ADLS

Ch 22: Azure Intro & Storage

Storage, ETL, IoT Resources
Azure Storage Components
Azure Storage Account, HNS
Azure Data Lake Storage
Storage Explorer Config

Ch 23: Azure Storage Operations

BLOB Storage: Containers
File & Folder Uploads, Edits
Azure Tables: Row Key
Partition Key, Timestamp
Use Cases of Azure Tables

Ch 24: Azure Storage Security

Access Keys & Admin Access
SAS Keys Generation, Ips
Azure AD Users, Groups
IAM & RBAC with Entra Users
ACLs and ADLS Security

Ch 25: Azure SQL DB Migrations

On-Premise SQL DB bacpac
Azure SQL Deployment
Azure Storage from SSMS
Azure SQL DB Migration
Migration Verifications
Testing Migrations in SQL

Ch 26: Azure Tables & Files

Azure Tables
Entities and Properties
Storage Service Operations
Storage Browser Operations
OData Queries & Filters
Azure BLOB Vs Azure Tables

Ch 27: Azure Stream Analytics

Azure IoT Hubs & Devices
APIs with Connection Strings
Azure Steam Analytic Jobs
Inputs, Outputs, SAQL Query
LIVE Feed: JSON, AVRO Files
Watermark & LIVE Streams

Ch 28: Azure Event Hubs

Azure Event Hubs
Namespaces & Instances
Instance Configurations
Access Policies & Security
Event Based Data Ingestions
IoT Hubs Versus Event Hubs

Ch 29: Azure Key Vaults

Azure Encryptions at REST
SMK & CMK Encryptions
Azure Key Vaults & Keys
Key Access Policies
Rest, Transit Encryptions
Realtime Considerations

Ch 30: Access Tiers & Blob Types

Azure Access Tiers
Hot, Cold, Cool & Archive
Block & Page Blobs
Append Blobs
Storage Snapshots
Version Checks @ Snapshots

Ch 31: Azure Metrics & Alerts

Azure Encryptions @ REST
Azure Key Vaults & Keys
SMK & CMK Encryptions
Azure Metrics: Ingress
Egress, E2E Latency Issues
Performance Tuning Options

Ch 32: Storage Optimizations

BLOB Types & Content Types
Hot, Cool, Cold, Archive Types
Creating, Using Access Policies
Immutable Storage, Rotation
Containerization, Indexing
Replication: LRS, ZRS, RA-GRS

Ch 33: Azure Pricing, Functions

Azure Logic Apps: Usage
Log Apps Usage in ETL
Snapshots, Azure Functions
Azure Functions Realtime Use
ETL & DWH with Functions
Azure Resource Pricing

Part 3: Databricks (ETL, DWH) with AI

Ch 34: Databricks Intro & Architecture

Databricks Introduction
Databricks Features
Key Components, Architecture
Control Pange, Compute Pane
Azure Databricks Resource
Databricks Workspace

Ch 35: Spark Cluster Architecture

Spark Clusters: Types, Policies
Driver Node: Purpose, Compute
Worker Node: Purpose, Compute
Cluster Manager, Cluster Types
Resilent Distributed Datasets
DAG, Scaling, Photon Acceleration

Ch 36: DBFS Operations

DBFS Concepts: File Store, Tables
DBFS File Uploads, Infer Schema
Header Row Promotion
Create Table using UI
HIVE Metastore Catalog
Spark Database & Tables

Ch 37: Notebooks Intro

ETL & ELT Process
Workspace Options: Notebooks
Notebooks: SQL, Python, Scala
When to use which Notebooks?
Notebook Exports, Imports
Cloning and Markdown Cells

Ch 38: Spark SQL Notebooks

Creating Spark Databases
Connecting to Spark Databases
Creating Spark tables
Data Inserts & DML Operations
DDL Operations on Spark Tables
SQL Notebook: Limitations

Ch 39: Medallion Architecture

Medallion Architecture: Scaling
Raw Data with Medallion
Transformations (ETL)
Bronze Layer: Raw Data
Silver Layer with Temp Views
Gold Layer with Spark Tables

Ch 40: Python Intro, Variables

Python Introduction
Python Usage in ETL, DDL, DML
Spark Environment for Python
PySpark: Python inside Spark
Python Variables
Python Print Statement

Ch 41: Python Data Types

Python Data Types
String, Numeric Types
Date Data Type
List, Tuple & Sets
Data frames: Purpose
Data frames as Spreadsheets

Ch 42: PySpark Transformations

Custom DataFrames
Single List, Mixed List Options
Concat Function & Index Options
Removing Empty Rows
Replacing Null Values
Merge, Joins, Join Kind

Ch 43: Medallion with DBFS

DBFS File Source & DataFrames
Temp View For Medallion
Dataframe Loads to TempView
Data Filters in Temp View
Data Aggregations in Temp View
Creating Parquet Tables

Ch 44: Medallion with Azure SQL DB

Azure SQL DB Connections
Azure SQL Server & DB Names
Connection String & URL Format
Dataframes @ read.jdbc()
Aggregated / Incremental Tfns
Data Loads into Spark Database

Ch 45: Delta Tables (PySpark)

Delta Tables Concept
Creating, Using Delta Tables
DML Operations in Delta Tables
Upsert: Incremental Loads
Delta Tables in HIVE Metastore
MERGE INTO Statement (Spark)

Ch 46: SCD with Azure SQL DB

Delta Tables: Upsert Activity
Reading Azure SQL DB Tables
Temp Views with Upserts
Upsert: Incremental Loads
MERGE INTO Statement (Spark)
SCD with Azure SQL DB (OLTP)

Ch 47: Python Widgets (PySpark)

Widgets: Notebook Parameters
widgets.text()
widget.get()
Reading Widgets into Variables
Using Variables in Notebook
Aggregated Loads with Widgets

Ch 48: Workflows (PySpark)

Python Notebook Schedules
Adding Tasks to Jobs
Job Clusters & Cluster Sizes
High Performance Cluster
Job Notifications, Verifications

Ch 49: Unity Catalog

Unity Catalog & Big Data Storage
Unity Catalog Connectors
Catalog Explorer, HIVE
Ubuntu VM: Azure Resource
Cluster Size & VM Size Options
Default Spark Database, Usage

Ch 50: Delta Lake

Declarative Frameworks
Streaming Data Handling
Batch Data Handling
Data Quality & Data Lineage
Medallion with Delta Lake
Time Travel with Delta Lake

Ch 51: Auto Loader in Delta Lake

Structured Streaming Source
Incremental File Uploads
Read, Write Stream (OLTP)
Upsert: Incremental Loads
Data Ingestions & List Modes
File Notification, CheckPoints

Ch 52: Delta LIVE Tables (DLT)

Delta LIVE Tables: Scope
Creating DLT Tables
Streaming Tables
Lake Flow Concepts
Materialized Views
Pipeline Graphs, Lake Flow

Ch 53: Security

RBAC Concepts: IAM Roles
Databricks Workspace Security
Notebook Job Level Security
Job Level Security, Sharing
JDBC Connections: Server Host
Access Tokens & API Access

Ch 54: Scala Notebooks – V1

Scala Notebooks: Realtime Use
JVM and Scala Notebooks
Data Frames, Temp Views
Creating Temp Tables in Scala
Medallion with DBFS, SQL DB
Parquet Tables & Delta Tables

Integrations

👉🏻ADF with Databricks

👉🏻ADF with Storage Tables

👉🏻ADF with Databricks, Azure Storage

Microsoft Fabric

👉🏻Advantages

👉🏻Fabric Configurations

👉🏻Azure to Fabric Migrations

👉🏻Realtime Project For your Resume: End to End Project

👉🏻Databricks Data Engineer: Certification Guidance

Brochure Download Request a Callback

Azure Data Engineer curriculum covering Cloud ETL, Azure Data Factory, Databricks, Spark SQL, PySpark, and Delta Lake

What is the ETL Developer course and who should join?

This course is designed for aspiring Data Engineers, ETL Developers, SQL Developers, BI Engineers, Python ETL developers, and professionals aiming to build real-time ETL/ELT pipelines using SQL, Azure, ADF, Synapse, Databricks, and Python.