
The Databricks Data Engineer Associate certification focuses on building scalable data pipelines with Apache Spark and Delta Lake on Databricks. It equips you to handle ETL, data transformations, and performance optimization in cloud environments, preparing you for roles such as Data Engineer, Spark Developer, and Cloud Data Engineer.
- ✅ Azure Fundamentals & Core Data Services
- ✅ Cloud, Big Data, ETL & DWH
- ✅ Data Warehouse: Synapse, Spark
- ✅ ETL: ADF, Databricks, ASA Jobs
- ✅ Data Lake, Delta Lake, Delta Live Tables (DLT), Unity Catalog
- ✅ Python, PySpark, Scala, IoT
- ✅ Logic Apps, Azure Functions, ME
- ✅ End-to-End Project Execution & Migration
Module 1: SQL Server TSQL (MS SQL) Queries
Ch 1: Data Analyst Job Roles
- Introduction to Data
- Data Analyst Job Roles
- Data Analyst Challenge
- Data and Databases Intro
Ch 2: Database Intro & Installations
- Database Types (OLTP, DWH, etc.)
- DBMS: Basics
- SQL Server 2025 Installations
- SSMS Tool Installation
- Server Connections, Authentications
Ch 3: SQL Basics V1 (Commands)
- Creating Databases (GUI)
- Creating Tables, Columns (GUI)
- SQL Basics (DDL, DML, etc.)
- Creating Databases, Tables
- Data Inserts (GUI, SQL)
- Basic SELECT Queries
Ch 4: SQL Basics V2 (Commands, Operators)
- DDL: Create, Alter, Drop, Add, Modify, etc.
- DML: Insert, Update, Delete, Select Into, etc.
- DQL: Fetch, Insert…Select, etc.
- SQL Operators: LIKE, BETWEEN, IN, etc.
- Special Operators
Ch 5: Data Types
- Integer Data Types
- Character, MAX Data Types
- Decimal & Money Data Types
- Boolean & Binary Data Types
- Date and Time Data Types
- SQL_Variant Type, Variables
Ch 6: Excel Data Imports
- Data Imports with Excel
- SQL Native Client
- Order By: Asc, Desc
- Order By with WHERE
- TOP & OFFSET
- UNION, UNION ALL
Ch 7: Schemas & Batches
- Schemas: Creation, Usage
- Schemas & Table Grouping
- Real-world Banking Database
- 2 Part, 3 Part & 4 Part Naming
- Batch Concept & “Go” Command
Ch 8: Constraints, Keys & RDBMS – Level 1
- Null, Not Null Constraints
- Unique Key Constraint
- Primary Key Constraint
- Foreign Key & References
- Default Constraint & Usage
- DB Diagrams & ER Models
Ch 9: Normal Forms & RDBMS – Level 2
- Normal Forms: 1 NF, 2 NF
- 3 NF, BCNF and 4 NF
- Adding PK to Tables
- Adding FK to Tables
- Cascading Keys
- Self Referencing Keys
- Database Diagrams
Ch 10: Joins & Queries
- Joins: Table Comparisons
- Inner Joins & Matching Data
- Outer Joins: LEFT, RIGHT
- Full Outer Joins & Aliases
- Cross Join & Table Combination
- Joining more than 2 tables
Ch 11: Views & RLS
- Views: Realtime Usage
- Storing SELECT in Views
- DML, SELECT with Views
- RLS: Row Level Security
- WITH CHECK OPTION
- Important System Views
Ch 12: Stored Procedures
- Stored Procedures: Realtime Use
- Parameters Concept with SPs
- Procedures with SELECT
- System Stored Procedures
- Metadata Access with SPs
- SP Recompilations
- Stored Procedures, Tuning
Ch 13: User Defined Functions
- Using Functions in MSSQL
- Scalar Functions in Real-world
- Inline & Multiline Functions
- Parameterized Queries
- Date & Time Functions
- String Functions & Queries
- Aggregated Functions & Usage
Ch 14: Triggers & Automations
- Need for Triggers in Real-world
- DDL & DML Triggers
- For / After Triggers
- Instead Of Triggers
- Memory Tables with Triggers
- Disabling DMLs & Triggers
Ch 15: Transactions & ACID
- Transaction Concepts in OLTP
- Auto Commit Transaction
- Explicit Transactions
- COMMIT, ROLLBACK
- Checkpoint & Logging
- Lock Hints & Query Blocking
- READPAST, LOCKHINT
Ch 16: CTEs & Tuning
- Common Table Expression
- Creating and Using CTEs
- CTEs, In-Memory Processing
- Using CTEs for DML Operations
- Using CTEs for Tuning
- CTEs: Duplicate Row Deletion
Ch 17: Indexes Basics, Tuning
- Indexes & Tuning
- Clustered Index, Primary Key
- Non Clustered Index & Unique
- Creating Indexes Manually
- Composite Keys, Query Optimizer
- Composite Indexes & Usage
Ch 18: Group By Queries
- Group By, Distinct Keywords
- GROUP BY, HAVING
- Cube( ) and Rollup( )
- Sub Totals & Grand Totals
- Grouping( ) & Usage
- Group By with UNION
- Group By with UNION ALL
Ch 19: Joins with Group By
- Joins with Group By
- 3 Table, 4 Table Joins
- Join Queries with Aliases
- Join Queries & WHERE
- Join Queries & Group By
- Joins with Sub Queries
- Query Execution Order
Ch 20: Sub Queries
- Sub Queries Concept
- Sub Queries & Aggregations
- Joins with Sub Queries
- Sub Queries with Aliases
- Sub Queries, Joins, Where
- Correlated Queries
Ch 21: Cursors & Fetch
- Cursors: Realtime Usage
- Local & Global Cursors
- Scroll & Forward Only Cursors
- Static & Dynamic Cursors
- Fetch, Absolute Cursors
Ch 22: Window Functions, CASE
- IIF Function and Usage
- CASE Statement Usage
- Window Functions (Rank)
- Row_Number( )
- Rank( ), Dense_Rank( )
- Partition By & Order By
Ch 23: Merge (Upsert) & CASE, IIF
- Merge Statement
- Upsert Operations with Merge
- Matched and Not Matched
- IIF & CASE Statements
- Merge Statement inside SPs
- Merge with OLTP & DWH
Ch 24: Key Take-Aways from Module 1
- Case Study 1: Medicare: Tasks, Solutions
- Case Study 2: ECommerce: Task, Solutions
- Chapter Wise Assignments: Solutions
- Daily Assignments: Review (Feedback)
- Weekly Mock Interviews: Feedback
Module 2: Python Concepts
Ch 1: What is Data Engineering?
- Database Types
- ETL
- DWH
- Cloud Computing
- Databricks
- Need for Python in Databricks
Ch 2: Python Introduction
- Python Introduction
- Python Versions
- Python Implementations
- Python Installations
- Python IDE & Usage
- Jupyter Notebooks
Ch 3: Python Operations
- Basic Operations in Python
- Python Scripts, Print()
- Single, Multiline Statements
- Python: Internal Architecture
- Compiler Versus Interpreter
Ch 4: Data Types & Variables
- Integer / Int Data Types
- Float, String Data Types
- Sequence Types: List, Tuple
- Range, Complex & memoryview
- Retrieving Data Type: type()
Ch 5: Python Operators
- Arithmetic, Assignment Ops
- Comparison Operators
- Operator Precedence
- If … Else Statement, Pass
- Short Hand If, OR, AND
- ELIF and Nested IF Statements
Ch 6: Python Loops, Iterations
- Python Loop & Realtime Use
- Python While Loop Statement
- Break and Continue Statement
- Iterations & Conditions
- Exit Conditions & For Loops
- iter() and Looping Options
Ch 7: Python Functions
- Python Functions & Usage
- Function Parameters
- Default & List Parameters
- Python Lambda Functions
- Recursive Functions, Usage
- Return & Print @ Lambda
Ch 8: Python Modules
- Import Python Modules
- Built In Modules & dir
- datetime module in Python
- Date Object Creation
- strftime Method & Usage
- imports & datetime.now()
Ch 9: Python User Inputs & TRY
- Try Except, Exception Handling
- Raise an exception method
- TypeError, Scripting in Python
- Python User Inputs
- Python Index Numbers
- input() & raw_input()
Ch 10: Python File Handling
- File Handling, Activities
- Loop, Write, Close Files
- Appending, Overwriting
- import os, path.exists
- open(), f.write()
- f.read(), f.close()
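A minimal sketch of the file-handling pattern above, using only the Python standard library; the file name is a placeholder used for illustration.

```python
import os

path = "notes.txt"  # hypothetical file name, for illustration only

# Write (overwrites any existing content)
with open(path, "w") as f:
    f.write("first line\n")

# Append a new line without losing existing content
with open(path, "a") as f:
    f.write("second line\n")

# Read the file back, guarding with os.path.exists as in the chapter
if os.path.exists(path):
    with open(path, "r") as f:
        print(f.read())
```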
Ch 11: Pandas DataFrames 1
- Installation of Pandas
- Python Modules & Pandas
- Pandas Codebase & Usage
- from pandas import DataFrame
- Pandas Series, arrays
Ch 12: Pandas DataFrames 2
- Indexes & Named Options
- Locate Row and Load Rows
- Row Index & Index Lists
- Load Files Into a DataFrame
- df.to_string() Function
- tail() & isnull() Functions
Ch 13: Pandas Transformations
- Pandas – Cleaning Data
- Replace, Transform Columns
- Data Discovery & Column Fill
- Identify & Remove Duplicates
- dropna(), fillna() Functions
- Data Plotting & matplotlib Library
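A short, illustrative pandas cleaning sketch covering duplicate removal, null handling, and plotting; the column names and values are made up for the example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sales data with a duplicate row and a missing value
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "amount":   [100.0, 250.0, 250.0, None],
})

df = df.drop_duplicates()              # identify & remove duplicate rows
df["amount"] = df["amount"].fillna(0)  # fill nulls with a default value

# Simple plot of the cleaned data
df.plot(x="order_id", y="amount", kind="bar")
plt.show()
```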
Ch 14: Key Take-Aways from Module 2
- Case Study @ ECommerce: Task, Solutions
- Chapter Wise Assignments: Solutions
- Daily Assignments: Review (Feedback)
- Weekly Mock Interviews: Feedback
Module 3: Databricks
Ch 1: Databricks Intro
- Big Data
- Open Source ETL
- What is a Data Lakehouse?
- Hadoop, MapReduce and Apache Spark
Ch 2: Databricks Architecture
- Unity Catalog Volumes
- Clusters
- Apache Spark and Databricks
- Apache Spark Ecosystem
- Compute Activities
Ch 3: Databricks Workspace
- Workspace Objects
- Databricks Notebooks
- Databricks Managed Resources
- Databricks Workspace UI
- UI Updates
Ch 4: Databricks Notebooks
- Databricks Notebooks
- Mix Languages in Notebooks
- Adding Comments and Markdown Text to Notebooks
- Organizing your Workspace Objects
- SparkSQL Notebooks
Ch 5: SparkSQL Notebooks – 1
- Spark SQL API
- Creating a Catalog, Schema
- Adding New Columns
- Changing Data Types
- Removing Columns
- Union
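A minimal sketch of the Spark SQL steps in this chapter, run from a Python cell with spark.sql(); the catalog, schema, table, and column names are placeholders.

```python
# In a Databricks notebook, `spark` is predefined; names below are placeholders.
spark.sql("CREATE CATALOG IF NOT EXISTS demo_catalog")
spark.sql("CREATE SCHEMA IF NOT EXISTS demo_catalog.sales")

spark.sql("""
    CREATE TABLE IF NOT EXISTS demo_catalog.sales.countries (
        country_id INT,
        name       STRING
    )
""")

# Adding a new column to an existing Delta table
spark.sql("ALTER TABLE demo_catalog.sales.countries ADD COLUMN region STRING")

# Combining two result sets with UNION
df = spark.sql("""
    SELECT name FROM demo_catalog.sales.countries WHERE region = 'EMEA'
    UNION
    SELECT name FROM demo_catalog.sales.countries WHERE region = 'APAC'
""")
df.show()
```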
Ch 6: SparkSQL Notebooks – 2
- Math Functions
- Sort Functions
- String Functions
- Datetime Functions
- Conditional Statements
- SQL Expressions with expr()
Ch 7: SparkSQL Notebooks – 3
- Volume for our Data Assets
- Uploading the Countries Data Files
- File Formats, Schema Inference
- How to Partition your Data
- Databricks File System Utilities
- Creating Views with SQL
- Creating Catalogs, Schemas and Volumes with SQL
Ch 8: PySpark – 1
- DataFrames
- Creation of DataFrames
- Pandas DataFrames
- DataFrame( )
- List Values, Mixed Values
- spark.read.csv()
- spark.read.format()
- Filtering DataFrames
- Grouping your DataFrame
- Pivot your DataFrame
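An illustrative PySpark sketch of the reading, filtering, grouping, and pivoting steps above; the volume path and column names are assumptions made for the example.

```python
from pyspark.sql import functions as F

# Path and column names are placeholders for illustration.
df = (spark.read.format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("/Volumes/demo_catalog/sales/raw/orders.csv"))

# Filtering and grouping
high_value = df.filter(F.col("amount") > 100)
by_country = high_value.groupBy("country").agg(F.sum("amount").alias("total_amount"))

# Pivot: one column per order year, aggregated amounts as values
pivoted = df.groupBy("country").pivot("order_year").agg(F.sum("amount"))

by_country.show()
pivoted.show()
```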
Ch 9: PySpark – 2
- DataFrameReader
- DataFrameWriter Methods
- CSV Data into a DataFrame
- Reading Single Files
- Reading Multiple Files
- Schema with an SQL String
- Schema Programmatically
Ch 10: PySpark – 3
- Writing DataFrames to CSV
- Working with JSON
- Working with ORC
- Working with Parquet
- Working with Delta Lake
- Rendering your DataFrame
- Creating DataFrames from Python Data Structures
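A short sketch of writing one DataFrame to the formats listed above; the output paths and target table name are placeholders.

```python
# Writing the same DataFrame to different formats; paths are placeholders.
df = spark.createDataFrame(
    [(1, "India"), (2, "Japan")],
    ["country_id", "name"],
)

df.write.mode("overwrite").csv("/Volumes/demo_catalog/sales/out/countries_csv", header=True)
df.write.mode("overwrite").json("/Volumes/demo_catalog/sales/out/countries_json")
df.write.mode("overwrite").parquet("/Volumes/demo_catalog/sales/out/countries_parquet")

# Delta is the default table format on Databricks
df.write.mode("overwrite").format("delta").saveAsTable("demo_catalog.sales.countries_copy")
```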
Ch 11: Unity Catalog (Dev)
- Unity Catalog Managed Tables
- SQL Queries with the PySpark API
- Managed Tables with SQL
- Creating Views with SQL
- Creating Catalogs, Schemas and Volumes with SQL
- Dropping Unity Catalog Objects with SQL
- Temporary Views
- External Tables, External Volumes
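A short sketch of managed tables, views, and temporary views using Spark SQL from Python; catalog, schema, and table names are placeholders.

```python
# Managed table, view, and temporary view; all names are placeholders.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo_catalog.sales.orders (
        order_id INT, amount DOUBLE
    )
""")

spark.sql("""
    CREATE OR REPLACE VIEW demo_catalog.sales.big_orders AS
    SELECT * FROM demo_catalog.sales.orders WHERE amount > 1000
""")

# Temporary view: visible only in the current Spark session
spark.table("demo_catalog.sales.orders").createOrReplaceTempView("orders_tmp")
spark.sql("SELECT COUNT(*) FROM orders_tmp").show()

# Dropping Unity Catalog objects
spark.sql("DROP VIEW IF EXISTS demo_catalog.sales.big_orders")
```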
Ch 12: Unity Catalog (Admin)
- Metastore and the Unity Catalog Object Model
- Databricks Account Console
- Data Discovery and Lineage
- System Tables
- Databricks Account and Workspace Roles
- Unity Catalog Privileges and Securable Objects
- Workspace Access Control Lists (ACLs)
- Workspace-Catalog Binding
- Workspace Compute Policies
Ch 13: PySpark Transformations – 1
- Data Preparation
- Selecting Columns
- Column Transformations
- Renaming Columns
- Changing Data Types
- select() and selectExpr()
- withColumn()
Ch 14: PySpark Transformations – 2
- Basic Arithmetic and Math Functions
- String Functions
- Datetime Conversions
- Date and Time Functions
- Joining DataFrames
- Unioning DataFrames
Ch 15: PySpark Transformations – 3
- Filtering DataFrame Records
- Removing Duplicate Records
- Sorting and Limiting Records
- Filtering Null Values
- Grouping and Aggregating
- Pivoting and Unpivoting
- Conditional Expressions
Ch 16: Medallion Architecture
- Medallion Architecture
- Aggregated Data Loads
- Bronze, Silver and Gold
- Temp Views
- Spark Tables (Parquet)
- Work with File, Table Sources
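An illustrative Bronze → Silver → Gold flow, assuming the demo_catalog catalog and its bronze/silver/gold schemas already exist; paths and column names are placeholders.

```python
from pyspark.sql import functions as F

# Bronze: raw ingest (path and table names are placeholders)
bronze = (spark.read.format("csv")
          .option("header", "true")
          .option("inferSchema", "true")
          .load("/Volumes/demo_catalog/sales/raw/"))
bronze.write.mode("overwrite").saveAsTable("demo_catalog.bronze.orders")

# Silver: cleaned and conformed
silver = (spark.table("demo_catalog.bronze.orders")
          .dropDuplicates()
          .filter(F.col("amount").isNotNull()))
silver.write.mode("overwrite").saveAsTable("demo_catalog.silver.orders")

# Gold: business-level aggregates
gold = silver.groupBy("country").agg(F.sum("amount").alias("total_amount"))
gold.write.mode("overwrite").saveAsTable("demo_catalog.gold.orders_by_country")
```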
Ch 17: Delta Lake – 1
- Storage Layer
- Delta Table API
- Deleting Records
- Updating Records
- Merging Records
- History and Time Travel
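A minimal sketch of the Delta Table API operations listed in this chapter (delete, update, history, time travel); the table name and predicates are placeholders.

```python
from delta.tables import DeltaTable

# DeltaTable wraps an existing Delta table; the name is a placeholder.
tbl = DeltaTable.forName(spark, "demo_catalog.silver.orders")

tbl.delete("amount < 0")                        # deleting records
tbl.update(condition="country = 'IN'",
           set={"country": "'India'"})          # updating records

# History and time travel
tbl.history().show()
old_df = spark.read.option("versionAsOf", 0).table("demo_catalog.silver.orders")
old_df.show()
```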
Ch 18: Delta Lake – 2 (SCD)
- Schema Evolution
- Delta Lake Data Files
- Deleting and Updating Records
- Merge Into
- Table Utility Commands
- Exploratory Data Analysis
Ch 19: Implementation of SCD Type 2
- Incremental Loads
- Upserts Versus SCD
- New Row Inserts
- Existing Row Updates
- Old History Retention
- Delta Transaction Log
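A simplified Delta Lake MERGE sketch of the SCD Type 2 flow above: it closes the current history row when a tracked attribute changes and inserts brand-new keys. The table, column, and flag names (dim_customer, is_current, start_date, end_date) are assumptions for illustration; a complete implementation also re-inserts changed rows as new current versions.

```python
from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "demo_catalog.gold.dim_customer")   # placeholder dimension table
updates = spark.table("demo_catalog.silver.customer_changes")          # placeholder change feed

(target.alias("t")
 .merge(updates.alias("s"), "t.customer_id = s.customer_id AND t.is_current = true")
 .whenMatchedUpdate(
     condition="t.address <> s.address",
     set={"is_current": "false", "end_date": "current_date()"})   # close the old history row
 .whenNotMatchedInsert(values={
     "customer_id": "s.customer_id",
     "address":     "s.address",
     "is_current":  "true",
     "start_date":  "current_date()",
     "end_date":    "null"})                                      # new row insert
 .execute())
```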
Ch 20: Widgets
- Text Widgets
- User Parameters
- Manual Executions
- Lake Bridge
- Databricks BridgeOne
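A minimal text-widget sketch; the widget name, default value, and table are placeholders, and dbutils is available only inside Databricks notebooks.

```python
# Text widget with a default value; names are placeholders.
dbutils.widgets.text("load_date", "2024-01-01", "Load Date")

load_date = dbutils.widgets.get("load_date")
df = spark.table("demo_catalog.bronze.orders").filter(f"order_date = '{load_date}'")
df.show()
```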
Ch 21: Lakeflow Jobs
- Workflows & CRON
- Job Compute, Running Tasks
- Python Script Tasks
- Parameters into Notebook Tasks
- Parameters into Python Script Tasks
- Concurrent Executions, Dependencies
- Branching Control with the If-Else Task
Ch 22: Databricks Tuning
- How Spark Optimizes your Code
- Lazy Evaluation
- Explain Plan
- Inspecting Query Performance
- Caching, Data Shuffling
- Broadcast Joins
- When to Partition
- Data Skipping
- Z Ordering
- Liquid Clustering
- Spark Configurations
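A short sketch of the plan-inspection, caching, and broadcast-join ideas above; the table names are placeholders.

```python
from pyspark.sql import functions as F

orders = spark.table("demo_catalog.silver.orders")        # large fact table (placeholder)
countries = spark.table("demo_catalog.silver.countries")  # small lookup table (placeholder)

# Broadcast join: ship the small table to every executor to avoid a shuffle
joined = orders.join(F.broadcast(countries), "country_id")

joined.explain()   # inspect the physical plan (look for BroadcastHashJoin)
joined.cache()     # cache a DataFrame that is reused several times
joined.count()     # an action materializes the cache (Spark is lazily evaluated)
```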
Ch 23: Version Control & GitHub
- Local Development
- Runtime Compatibility
- Git and GitHub Pre-requisites
- Git and GitHub Basics
- Linking GitHub and Databricks
- Databricks Git Folders
- Project Code to GitHub
- Adding Modules to the Project Code
- Databricks Job Updates, Runs
Ch 24: Spark Structured Streaming
- Streaming Simulator Notebook
- Micro-batch Size
- Schema Inference and Evolution
- Time Based Aggregations and Watermarking
- Writing Streams
- Trigger Intervals
- Delta Table Streaming Reads and Writes
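An illustrative Structured Streaming sketch with a watermark and a time-window aggregation; the source table, event_time column, checkpoint path, and target table are assumptions.

```python
from pyspark.sql import functions as F

# Streaming read from a Delta table; names and paths are placeholders.
events = spark.readStream.table("demo_catalog.bronze.events")

# Time-based aggregation with a watermark to bound late-arriving data
counts = (events
          .withWatermark("event_time", "10 minutes")
          .groupBy(F.window("event_time", "5 minutes"), "event_type")
          .count())

query = (counts.writeStream
         .outputMode("append")
         .option("checkpointLocation", "/Volumes/demo_catalog/chk/event_counts")
         .trigger(processingTime="1 minute")
         .toTable("demo_catalog.silver.event_counts"))
```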
Ch 25: Auto Loader
- Reading Streams with Auto Loader
- Reading a Data Stream
- Manually Cancel your Data Streams
- Writing to a Data Stream
- Workspace Modules
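A minimal Auto Loader (cloudFiles) ingestion sketch; the landing path, schema/checkpoint locations, and target table name are placeholders.

```python
# Auto Loader reads new files incrementally from a landing location.
stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "csv")
          .option("cloudFiles.schemaLocation", "/Volumes/demo_catalog/chk/orders_schema")
          .load("/Volumes/demo_catalog/sales/landing/"))

query = (stream.writeStream
         .option("checkpointLocation", "/Volumes/demo_catalog/chk/orders")
         .trigger(availableNow=True)   # process all available files, then stop
         .toTable("demo_catalog.bronze.orders_autoloader"))

# query.stop()  # manually cancel the stream if it is running continuously
```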
Ch 26: Lakeflow Declarative Pipelines
- Delta Live Tables
- Data Generator Notebook
- Pipeline Clusters
- Databricks CLI
- Data Quality Checks
- Streaming Dataset “Simulator”
- Streaming Live Tables
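A small declarative-pipeline sketch of the ideas above (streaming live tables plus a data quality expectation); it runs only inside a Lakeflow Declarative Pipelines (DLT) pipeline, and the source table name is an assumption.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested as a streaming live table")
def bronze_orders():
    # Source table name is a placeholder
    return spark.readStream.table("demo_catalog.landing.orders")

@dlt.table(comment="Cleaned orders")
@dlt.expect_or_drop("valid_amount", "amount > 0")   # data quality check
def silver_orders():
    return dlt.read_stream("bronze_orders").withColumn("loaded_at", F.current_timestamp())
```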
Ch 27: Security: ACLs
- Overview of ACLs
- Adding a New User to our Workspace
- Workspace Access Control
- Cluster Access Control
- Groups
Ch 28: Realtime Project @ Ecommerce / Banking / Sales
- Detailed Project Requirements
- Project Solutions
- Project FAQs
- Project Flow
- Interview Questions & Answers
- Resume Guidance (1:1)
Ch 29: Key Take-Aways from Module 3
👉 Realtime Project: Requirement, CI/CD, Solution, FAQs
👉 Chapter Wise Assignments: Solutions
👉 Daily Assignments: Review (Feedback)
👉 Weekly Mock Interview: Feedback
Module 4: Databricks Data Engineer Associate
- Databricks Data Engineer Associate Exam Guidance
- Exam Samples
- Mock Exams

What is the Databricks Data Engineer Associate Training?
This training covers Databricks concepts end-to-end including Spark SQL, PySpark, Delta Lake, Lakehouse, Auto Loader, DLT, Unity Catalog, Workflows, Streaming, Medallion Architecture, and Real-Time Projects.
Who should join this course?
Aspiring Data Engineers, Cloud Engineers, BI Developers, Data Science Engineers, and freshers who want to build a strong career in Databricks and modern Data Engineering.
What modules are included in this training?
Module 1: MSSQL
Module 2: Python
Module 3: Databricks (Complete)
Module 4: Databricks Data Engineer Associate Exam Guidance
Is SQL included as part of the training?
Yes. The course covers SQL Server from basics to advanced topics, including DDL, DML, Joins, Constraints, Keys, Views, Procedures, Functions, CTEs, Tuning, Indexes, Group By, Subqueries, Transactions, and Window Functions.
Do I need Python knowledge to learn Databricks?
Python is used heavily in Databricks, but no prior knowledge is required: this course teaches Python from scratch, including data types, loops, functions, modules, file handling, exception handling, and pandas for ETL.
What Databricks basics will I learn?
You will learn Workspace, Notebooks, Clusters, Filesystems, Catalogs, Schemas, and Databricks Architecture including Spark and Lakehouse fundamentals.
Does the course include Spark SQL?
Yes. Spark SQL API, creating schemas, altering columns, unions, math functions, sort functions, string functions, date/time functions, conditional logic, expr() and complex SQL expressions.
Will I learn PySpark in detail?
Yes. Creating DataFrames, reading/writing CSV/JSON/ORC/Parquet, schema inference, grouping, filtering, joins, union, pivot/unpivot, transformations, and rendering outputs.
Is Unity Catalog included in the curriculum?
Yes. Managed tables, external tables, volumes, catalogs, schemas, views, access control, workspace binding, lineage, metastore, system tables, and securable objects.
Will I learn Data Ingestion & Auto Loader?
Yes. Auto Loader streaming ingestion, schema inference, evolution, streaming reads/writes, cancellations, and workspace modules.
Is Medallion Architecture taught?
Yes. Bronze, Silver, Gold layers, aggregated loads, temp views, parquet tables, file/table sources, and building reliable pipelines using Medallion principles.
What Delta Lake concepts does this course cover?
Delta Table API, delete/update/merge, time travel, history, schema evolution, DML operations, retention, transaction logs, and Delta Lake SCD Type 2 implementation.
Will I learn SCD Type 2 in real-time?
Yes. Incremental loads, new/existing record handling, history retention, upserts, and automation using Delta Lake and notebooks.
Does the course include Streaming & Structured Streaming?
Yes. Streaming simulations, micro-batches, schema evolution, watermarking, time-based aggregations, triggers, and Delta streaming pipelines.
Do you cover Databricks Workflows (Jobs)?
Yes. Jobs scheduling, CRON, task dependencies, branching logic, passing parameters into notebooks/py scripts, concurrent executions, and job clusters.
Is Databricks Tuning part of the training?
Yes. Explain plans, lazy evaluation, caching, data shuffling, broadcast joins, partitioning, data skipping, Z-ordering, Liquid Clustering, and Spark configs.
Will I learn GitHub Integration?
Yes. Git prerequisites, linking GitHub with Databricks, Git folders, adding modules, version control, code sync, and pipeline updates.
Does the course include Delta Live Tables (DLT)?
Yes. Pipeline clusters, Data Quality checks, declarative pipelines, streaming datasets, parameterization, and DLT streaming live tables.
Is a real-time project included?
Yes. E-Commerce/Banking/Sales projects with requirements, solutions, FAQs, architecture flow, interview questions, and resume guidance.
Is exam preparation for Databricks Data Engineer Associate included?
Yes. Exam guidance, sample questions, mock exams, and hands-on practice for the certification.
Placement Partners


SQL SCHOOL
24x7 LIVE Online Server (Lab) with Real-time Databases.
Course includes ONE Real-time Project.
Why Choose SQL School
- 100% Real-Time and Practical
- ISO 9001:2008 Certified
- Weekly Mock Interviews
- 24/7 LIVE Server Access
- Realtime Project FAQs
- Course Completion Certificate
- Placement Assistance
- Job Support