
Databricks Data Engineer Associate focuses on building scalable data pipelines using Apache Spark and Delta Lake on Databricks. It equips you to handle ETL, data transformations, and performance optimization in cloud environments, leading to roles like Data Engineer, Spark Developer, and Cloud Data Engineer.
- ✅ Azure Fundamentals & Core Data Services
- ✅ Cloud, Big Data, ETL & DWH
- ✅ Data Warehouse: Synapse, Spark
- ✅ ETL: ADF, Databricks, ASA Jobs
- ✅ Data Lake, Delta Lake, Delta Live Tables (DLT), Unity Catalog
- ✅ Python, PySpark, Scala, IoT
- ✅ Logic Apps, Azure Functions, ME
- ✅ End-to-End Project Execution & Migration
Module 1: SQL Server T-SQL (MS SQL) Queries
Ch 1: Databricks Job Roles
- Introduction to Data
- Data Analyst Job Roles
Ch 2: Database Intro & Installations
- Database Types (OLTP, DWH, etc.)
- DBMS: Basics
- SQL Server 2025 Installations
- SSMS Tool Installation
- Server Connections, Authentications
Ch 3: SQL Basics V1 (Commands)
- Creating Databases (GUI)
- Creating Tables, Columns (GUI)
- SQL Basics (DDL, DML, etc.)
- Creating Databases, Tables
- Data Inserts (GUI, SQL)
- Basic SELECT Queries
Ch 4: SQL Basics V2 (Commands, Operators)
- DDL: Create, Alter, Drop, Add, Modify, etc.
- DML: Insert, Update, Delete, Select Into, etc.
- DQL: Fetch, Insert… Select, etc.
- SQL Operators: LIKE, BETWEEN, IN, etc.
Ch 5: Data Types
- Integer Data Types
- Character, MAX Data Types
- Decimal & Money Data Types
- Boolean & Binary Data Types
- Date and Time Data Types
- SQL_Variant Type, Variables
Ch 6: Excel Data Imports
- Data Imports with Excel
- SQL Native Client
- Order By: Asc, Desc
- Order By with WHERE
- TOP & OFFSET
- UNION, UNION ALL
Ch 7: Schemas & Batches
- Schemas: Creation, Usage
- Schemas & Table Grouping
- Real-world Banking Database
- 2 Part, 3 Part & 4 Part Naming
- Batch Concept & “GO” Command
Ch 8: Constraints, Keys & RDBMS – Level 1
- Null, Not Null Constraints
- Unique Key Constraint
- Primary Key Constraint
- Foreign Key & References
- Default Constraint & Usage
- DB Diagrams & ER Models
Ch 9: Normal Forms & RDBMS – Level 2
- Normal Forms: 1 NF, 2 NF
- 3 NF, BCNF and 4 NF
- Adding PK to Tables
- Adding FK to Tables
- Cascading Keys
- Self Referencing Keys
- Database Diagrams
Ch 10: Joins & Queries
- Joins: Table Comparisons
- Inner Joins & Matching Data
- Outer Joins: LEFT, RIGHT
- Full Outer Joins & Aliases
- Cross Join & Table Combination
- Joining more than 2 tables
Ch 11: Views & RLS
- Views: Realtime Usage
- Storing SELECT in Views
- DML, SELECT with Views
- RLS: Row Level Security
- WITH CHECK OPTION
- Important System Views
Ch 12: Stored Procedures
- Stored Procedures: Realtime Use
- Parameters Concept with SPs
- Procedures with SELECT
- System Stored Procedures
- Metadata Access with SPs
- SP Recompilations
- Stored Procedures, Tuning
Ch 13: User Defined Functions
- Using Functions in MSSQL
- Scalar Functions in Real-world
- Inline & Multiline Functions
- Parameterized Queries
- Date & Time Functions
- String Functions & Queries
- Aggregated Functions & Usage
Ch 14: Triggers & Automations
- Need for Triggers in Real-world
- DDL & DML Triggers
- For / After Triggers
- Instead Of Triggers
- Memory Tables with Triggers
- Disabling DMLs & Triggers
Ch 15: Transactions & ACID
- Transaction Concepts in OLTP
- Auto Commit Transaction
- Explicit Transactions
- COMMIT, ROLLBACK
- Checkpoint & Logging
- Lock Hints & Query Blocking
- READPAST, LOCKHINT
Ch 16: CTEs & Tuning
- Common Table Expression
- Creating and Using CTEs
- CTEs, In-Memory Processing
- Using CTEs for DML Operations
- Using CTEs for Tuning
- CTEs: Duplicate Row Deletion
Ch 17: Indexes Basics, Tuning
- Indexes & Tuning
- Clustered Index, Primary Key
- Non Clustered Index & Unique
- Creating Indexes Manually
- Composite Keys, Query Optimizer
- Composite Indexes & Usage
Ch 18: Group By Queries
- Group By, Distinct Keywords
- GROUP BY, HAVING
- Cube( ) and Rollup( )
- Sub Totals & Grand Totals
- Grouping( ) & Usage
- Group By with UNION
- Group By with UNION ALL
Ch 19: Joins with Group By
- Joins with Group By
- 3 Table, 4 Table Joins
- Join Queries with Aliases
- Join Queries & WHERE
- Join Queries & Group By
- Joins with Sub Queries
- Query Execution Order
Ch 20: Sub Queries
- Sub Queries Concept
- Sub Queries & Aggregations
- Joins with Sub Queries
- Sub Queries with Aliases
- Sub Queries, Joins, Where
- Correlated Queries
Ch 21: Cursors & Fetch
- Cursors: Realtime Usage
- Local & Global Cursors
- Scroll & Forward Only Cursors
- Static & Dynamic Cursors
- Fetch, Absolute Cursors
Ch 22: Window Functions, CASE
- IIF Function and Usage
- CASE Statement Usage
- Window Functions (Rank)
- Row_Number( )
- Rank( ), Dense_Rank( )
- Partition By & Order By
Ch 23: Merge (Upsert) & CASE, IIF
- Merge Statement
- Upsert Operations with Merge
- Matched and Not Matched
- IIF & CASE Statements
- Merge Statement inside SPs
- Merge with OLTP & DWH
Module 2: Databricks
Ch 1: Databricks Introduction
- Cloud ETL, DWH
- Cloud Computing
- Databricks Concepts
- Databricks Advantages
- Databricks Key Features
- Big Data in Cloud
- Databricks Account
Ch 2: Databricks Architecture
- Unified Cloud Platform
- Unity Catalog
- Apache Spark
- LakeHouse (Cloud)
- Volumes, Files & Tables
- Control Plane, Compute Plane
- Deployment Modes
- Cloud Providers: Azure/AWS/Google
- Azure Cloud: Advantages
- Databricks Runtime (DBR)
- RDD & DAG Components
- Databricks One: Hadoop, Map Reduce
Ch 3: Spark Cluster Architecture (Cloud Computing)
- Spark Components
- Apache Spark Clusters
- Cloud Computing Concepts
- Classic Cluster Types
- Serverless Clusters
- Compute Operations
- Apache Spark Ecosystem
- Driver Node, Worker Node
- Cluster Manager & Executors
Ch 4: Unity Catalog
- Unity Catalog Concepts
- Region, Properties
- Databricks Workspace UI
- Organizing Workspace Objects
- File Uploads
- Spark Table Creations
- Creating Volumes
- UI: Limitations
Ch 5: SparkSQL – 1
- Spark SQL Notebooks
- Creating Schemas, Tables
- Spark Data Types
- Data Partitioning
- Managed Tables
- SQL Queries with the PySpark API
- Union, Views in Spark
- Dropping Objects
Ch 6: Spark SQL – 2
- Spark Joins
- Aggregations
- Math, Sort Functions
- String, DateTime Functions
- Conditional Statements
- SQL Expressions with expr()
- Spark SQL Aggregations
Ch 7: Spark SQL – 3
- Spark Time Travel
- Data Recovery & Undo
- Version Number
- Describe
- Describe Extended
- TimeStamp As Of Concept
Ch 8: Python Intro & Print
- Python Introduction
- Python Versions
- Python Implementations
- Python in Spark (PySpark)
- Python Print()
- Single, Multiline Statements
Ch 9: Python Variables
- Defining Variables
- Using Variables
- Printing Variables
- Display Variables
- Variable Types
- Multi Value Variables
- Multi Value Assigning
- If … Else Statement
Ch 10: Python Operators
- Integer Operators
- String Operators
- Arithmetic Operators
- Assignment Operators
- Comparison Operators
- Formatted Strings
- Indexing Operators
- Short Hand If, OR, AND
- ELIF and ELSE IF Statements
Ch 11: Python Data Types
- Python Data Types
- Integer / Int Data Types
- Float, String Data Types
- List Data Type
- List Items, Indexes
- Tuple Data Type
- Dictionary Data Type
Ch 12: Python Dataframes
- Pandas Module (Python)
- Dataframes from Lists
- Dataframe from Dict
- Pandas Dataframes
- Dataframe print, display
- Dataframe from Files
- spark.read.csv()
- spark.read.format()
Ch 13: Medallion Architecture
- Understanding Medallion Concepts
- Bronze, Silver and Gold
- Raw Data
- Data Preparation (Prepping)
- Temporary Views
- Aggregated Data Flow
- Big Data Analytics
Ch 14: PySpark: Medallion Loads – 1
- Reading from Volumes
- Dataframes, Temp Views
- Data Prep (Silver)
- Filtering DataFrame Records
- Removing Duplicate Records
- Sorting and Limiting Records
- Spark SQL Dataframes
- Gold Layer Implementation
- Testing Aggregated Loads
Ch 15: PySpark: Medallion Loads – 2
- Azure SQL DB Connections
- JDBC & Credentials
- SQL Queries in PySpark
- Data Prep (Silver)
- Filtering Null Values
- Grouping and Aggregating
- Spark SQL Dataframes
- Gold Layer Implementation
- Testing Aggregated Loads
Ch 16: PySpark: Delta Tables
- Delta Tables (Spark)
- Parquet Versus Delta
- Deleting and Updating Records
- Table Utility Commands
- Delta Transaction Log
Ch 17: PySpark: SCD
- Slowly Changing Dimension
- Parquet Versus Delta
- Deleting and Updating Records
- Table Utility Commands
- Merge Into Statement
- Incremental Loads
- Merge with OLTP Data Sources
- Merge Temp Views & Spark Table
Ch 18: PySpark: Widgets
- Need for Widgets
- Text Widgets
- User Parameters
- Manual Executions
- Parameters & JSON
Ch 19: Lakeflow Jobs
- Workflows & CRON
- Job Compute, Running Tasks
- Python Tasks (Notebooks)
- Parameters into Notebook Tasks
- Parameters into Python Script Tasks
- Concurrent Executions, Dependencies
- Branching Control with the If-Else Task
Ch 20: Databricks Tuning
- OPTIMIZE
- VACUUM
- Lazy Evaluation
- Caching, Data Shuffling
- Broadcast Joins
- Data Skipping
- Z Ordering
- Liquid Clustering
- Spark Configurations
Ch 21: Databricks Security
- Databricks Security
- MFA (Multi Factor Authentication)
- IAM (Identity & Access Management)
- ACL Concepts
- Workspace Users & Groups
- Workspace Security
- Notebook Security
- Job Security
- Cluster Access Control
Ch 22: Auto Loader – 1
- File Incremental Loads
- Cloud Files
- Cloud File Processing
- Checkpoint Files
- Creating Directories in Volumes
- Reading Streams with Auto Loader
- Workspace Modules
- Testing Auto Loader (Initial Loads)
Ch 23: Auto Loader – 2
- Metadata & WithColumns
- Schema Evolution
- addNewColumns
- Rescue
- failOnNewColumns
- Writing to a Data Stream
- Testing Auto Loader (Incremental Loads)
Ch 24: Spark Structured Streaming
- Delta Lake Concepts
- Lakeflow SDP
- Declarative Pipelines
- Streaming Tables
- CDC: Change Data Capture
- Bronze Tables
- Silver Tables, Timestamp
- Gold Tables
- Big Data Analytics
- SDP (Spark Declarative Pipelines)
- Exploratory Data Analysis
Ch 25: Version Control & GitHub
- Local Development
- Runtime Compatibility
- Git and GitHub Pre-requisites
- Git and GitHub Basics
- Linking GitHub and Databricks
- Databricks Git Folders
- Project Code to GitHub
- Adding Modules to the Project Code
- Databricks Job Updates, Runs
Ch 26: Realtime Project @ Ecommerce / Banking / Sales
- Detailed Project Requirements
- Project Solutions
- Project FAQs
- Project Flow
- LakeBridge
- Interview Questions & Answers
- Resume Guidance (1:1)

What is the Databricks Data Engineer Associate Training?
This training covers Databricks concepts end-to-end including Spark SQL, PySpark, Delta Lake, Lakehouse, Auto Loader, DLT, Unity Catalog, Workflows, Streaming, Medallion Architecture, and Real-Time Projects.
Who should join this course?
Aspiring Data Engineers, Cloud Engineers, BI Developers, Data Science Engineers, and freshers who want to build a strong career in Databricks and modern Data Engineering.
What modules are included in this training?
Module 1: MSSQL
Module 2: Python
Module 3: Databricks (Complete)
Module 4: Databricks Data Engineer Associate Exam Guidance
Is SQL included as part of the training?
Yes. SQL Server basics to advanced topics including DDL, DML, Joins, Constraints, Keys, Views, Procedures, Functions, CTEs, Tuning, Indexes, Group By, Subqueries, Transactions, and Window Functions.
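As a small illustration of the window functions mentioned above, the sketch below compares ROW_NUMBER, RANK, and DENSE_RANK over a partition. It is only a sketch: it runs against an in-memory SQLite database through Python's `sqlite3` module as a stand-in for SQL Server (this assumes the bundled SQLite is version 3.25 or later), and the `sales` table and its rows are hypothetical sample data. The window syntax shown here is also accepted by T-SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Asha", 500),
        ("East", "Ben", 500),   # ties with Asha on amount
        ("East", "Chen", 300),
        ("West", "Dana", 700),
        ("West", "Eli", 400),
    ],
)

# ROW_NUMBER always increments; RANK leaves a gap after ties;
# DENSE_RANK does not leave a gap.
query = """
SELECT region, rep, amount,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC, rep) AS row_num,
       RANK()       OVER (PARTITION BY region ORDER BY amount DESC)      AS rnk,
       DENSE_RANK() OVER (PARTITION BY region ORDER BY amount DESC)      AS dense_rnk
FROM sales
ORDER BY region, row_num
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In the East partition the two tied rows share rank 1, so the third row gets RANK 3 but DENSE_RANK 2.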
Do I need Python knowledge to learn Databricks?
Yes, and this course teaches Python from scratch including data types, loops, functions, modules, file handling, exception handling, and full pandas for ETL.
What Databricks basics will I learn?
You will learn Workspace, Notebooks, Clusters, Filesystems, Catalogs, Schemas, and Databricks Architecture including Spark and Lakehouse fundamentals.
Does the course include Spark SQL?
Yes. Spark SQL API, creating schemas, altering columns, unions, math functions, sort functions, string functions, date/time functions, conditional logic, expr() and complex SQL expressions.
Will I learn PySpark in detail?
Yes. Creating DataFrames, reading/writing CSV/JSON/ORC/Parquet, schema inference, grouping, filtering, joins, union, pivot/unpivot, transformations, and rendering outputs.
Is Unity Catalog included in the curriculum?
Yes. Managed tables, external tables, volumes, catalogs, schemas, views, access control, workspace binding, lineage, metastore, system tables, and securable objects.
Will I learn Data Ingestion & Auto Loader?
Yes. Auto Loader streaming ingestion, schema inference, evolution, streaming reads/writes, cancellations, and workspace modules.
Is Medallion Architecture taught?
Yes. Bronze, Silver, Gold layers, aggregated loads, temp views, parquet tables, file/table sources, and building reliable pipelines using Medallion principles.
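The Bronze, Silver, Gold flow can be sketched without a cluster. The example below is a minimal pure-Python illustration, not PySpark and not the course's own code: the sample records and the helper names `to_silver` and `to_gold` are hypothetical. Bronze holds raw rows as ingested, Silver filters nulls, removes duplicates and casts types, and Gold holds an aggregate ready for reporting.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": "East", "amount": "100.0"},
    {"order_id": "1", "region": "East", "amount": "100.0"},  # duplicate row
    {"order_id": "2", "region": "West", "amount": None},     # missing amount
    {"order_id": "3", "region": "East", "amount": "50.5"},
    {"order_id": "4", "region": "West", "amount": "20.0"},
]


def to_silver(records):
    """Drop rows with a null amount, drop repeated order_ids, cast types."""
    seen = set()
    silver_rows = []
    for rec in records:
        if rec["amount"] is None or rec["order_id"] in seen:
            continue
        seen.add(rec["order_id"])
        silver_rows.append(
            {
                "order_id": int(rec["order_id"]),
                "region": rec["region"],
                "amount": float(rec["amount"]),
            }
        )
    return silver_rows


def to_gold(records):
    """Aggregate cleaned rows into total amount per region."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["region"]] += rec["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In a Databricks notebook the same three steps would typically be DataFrame reads, transformations, and table writes; the layering idea is what this sketch is meant to show.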
What Delta Lake concepts does this course cover?
Delta Table API, delete/update/merge, time travel, history, schema evolution, DML operations, retention, transaction logs, and Delta Lake SCD Type 2 implementation.
Will I learn SCD Type 2 in real-time?
Yes. Incremental loads, new/existing record handling, history retention, upserts, and automation using Delta Lake and notebooks.
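To make the new/existing record handling concrete, here is a minimal pure-Python sketch of SCD Type 2 logic. It is an illustration under stated assumptions, not a Delta Lake MERGE: the function `apply_scd2`, the column names (`key`, `city`, `valid_from`, `valid_to`, `is_current`) and the sample rows are all hypothetical. A changed record closes the current row and appends a new current row, an unchanged record is skipped, and a new key is simply appended.

```python
from datetime import date


def apply_scd2(dimension, incoming, load_date):
    """Return a new dimension list after an SCD Type 2 style merge."""
    result = [dict(row) for row in dimension]  # copy so the input stays untouched
    current = {row["key"]: row for row in result if row["is_current"]}

    for rec in incoming:
        existing = current.get(rec["key"])
        if existing is not None and existing["city"] == rec["city"]:
            continue  # no change: keep the current row as it is
        if existing is not None:
            # Changed record: close the old version to retain history.
            existing["valid_to"] = load_date
            existing["is_current"] = False
        new_row = {
            "key": rec["key"],
            "city": rec["city"],
            "valid_from": load_date,
            "valid_to": None,
            "is_current": True,
        }
        result.append(new_row)
        current[rec["key"]] = new_row
    return result


dim = [
    {"key": 1, "city": "Pune", "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True},
    {"key": 2, "city": "Delhi", "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True},
]
changes = [
    {"key": 1, "city": "Mumbai"},   # existing key, changed value
    {"key": 2, "city": "Delhi"},    # existing key, unchanged
    {"key": 3, "city": "Chennai"},  # new key
]

updated = apply_scd2(dim, changes, date(2024, 6, 1))
for row in updated:
    print(row)
```

After the merge, key 1 has two rows (the closed Pune version and the current Mumbai version), which is the history retention that distinguishes Type 2 from a plain upsert.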
Does the course include Streaming & Structured Streaming?
Yes. Streaming simulations, micro-batches, schema evolution, watermarking, time-based aggregations, triggers, and Delta streaming pipelines.
Do you cover Databricks Workflows (Jobs)?
Yes. Job scheduling, CRON, task dependencies, branching logic, passing parameters into notebooks and Python scripts, concurrent executions, and job clusters.
Is Databricks Tuning part of the training?
Yes. Explain plans, lazy evaluation, caching, data shuffling, broadcast joins, partitioning, data skipping, Z-ordering, Liquid Clustering, and Spark configs.
Will I learn GitHub Integration?
Yes. Git prerequisites, linking GitHub with Databricks, Git folders, adding modules, version control, code sync, and pipeline updates.
Does the course include Delta Live Tables (DLT)?
Yes. Pipeline clusters, Data Quality checks, declarative pipelines, streaming datasets, parameterization, and DLT streaming live tables.
Is a real-time project included?
Yes. E-Commerce/Banking/Sales projects with requirements, solutions, FAQs, architecture flow, interview questions, and resume guidance.
Is exam preparation for Databricks Data Engineer Associate included?
Yes. Exam guidance, sample questions, mock exams, and hands-on practice for the certification.
SQL SCHOOL
24x7 LIVE Online Server (Lab) with Real-time Databases.
Course includes ONE Real-time Project.
Why Choose SQL School
- 100% Real-Time and Practical
- ISO 9001:2008 Certified
- Weekly Mock Interviews
- 24/7 LIVE Server Access
- Realtime Project FAQs
- Course Completion Certificate
- Placement Assistance
- Job Support