Training: Implement data engineering solutions using Azure Databricks (DP-750)

Ref. DP-750T00
Duration: 4 days
Exam:
Level: Intermediate

Implement data engineering solutions using Azure Databricks Training (DP-750)

Azure Databricks has become the reference platform for large-scale data engineering in Microsoft Azure. The DP-750 training prepares you to design, implement and operate production-grade data pipelines leveraging Apache Spark, Delta Lake and the Lakehouse architecture.

Over four days, you work on Unity Catalog, Delta Live Tables, ETL and ELT pipelines, Workflows orchestration, integration with Azure Data Lake Storage Gen2 and secure data sharing. The training is delivered in Geneva and Lausanne by a Microsoft Certified Trainer.

Participant Profiles

Objectives

  • Design an Azure Databricks Lakehouse architecture with Unity Catalog for data governance
  • Develop ETL and ELT pipelines in PySpark and SQL with Delta Lake
  • Implement streaming and incremental pipelines with Delta Live Tables
  • Orchestrate complex workflows with Databricks Workflows and integrate with Azure Data Factory
  • Optimize Spark performance: partitioning, caching, AQE, Photon, autoscaling
  • Secure and govern data with Unity Catalog, row-level security and data lineage

Prerequisites

Course Content

Module 1 : Explore Azure Databricks

  • Get started with Azure Databricks
  • Identify Azure Databricks workloads
  • Understand key concepts
  • Data governance using Unity Catalog and Microsoft Purview
  • Module assessment

Module 2 : Understand Azure Databricks architecture

  • Understand Azure Databricks architecture
  • Understand Unity Catalog managed storage
  • Understand external storage
  • Understand default storage
  • Module assessment

Module 3 : Understand Azure Databricks Integrations

  • Understand integration with Microsoft Fabric
  • Understand integration with Power BI
  • Understand integration with VS Code
  • Understand integration with Power Platform
  • Understand integration with Copilot Studio
  • Understand integration with Microsoft Purview
  • Understand integration with Microsoft Foundry
  • Module assessment

Module 4 : Select and Configure Compute in Azure Databricks

  • Choose an appropriate compute type
  • Configure compute performance
  • Configure compute features
  • Install libraries for compute
  • Configure compute access
  • Module assessment

Module 5 : Create and organize objects in Unity Catalog

  • Apply naming conventions
  • Create catalog
  • Create schema
  • Create tables and views
  • Create volumes
  • Implement DDL operations
  • Implement foreign catalog
  • Configure AI/BI Genie instructions

Module 6 : Secure Unity Catalog objects

  • Understand query lifecycle
  • Implement access control strategies
  • Understand fine-grained access control
  • Implement row filtering and column masking
  • Access Azure Key Vault secrets
  • Authenticate data access with service principals
  • Authenticate resource access with managed identities
  • Module assessment

Module 7 : Govern Unity Catalog objects

  • Create and preserve table definitions
  • Configure ABAC with tags and policies
  • Apply data retention policies
  • Set up and manage data lineage
  • Configure audit logging
  • Design secure Delta Sharing strategy
  • Module assessment

Module 8 : Design and implement data modeling with Azure Databricks

  • Design ingestion logic and data source configuration
  • Choose a data ingestion tool
  • Choose a data table format
  • Design and implement a data partitioning scheme
  • Choose a slowly changing dimension (SCD) type
  • Implement a slowly changing dimension (SCD) type 2
  • Design and implement a temporal (history) table to record changes over time
  • Choose granularity on a column or table based on requirements
  • Choose managed vs unmanaged tables
  • Design and implement a clustering strategy

Module 9 : Ingest data into Unity Catalog

  • Ingest data with Lakeflow Connect
  • Ingest data with notebooks
  • Ingest data with SQL methods
  • Ingest data with CDC feed
  • Ingest data with Spark Structured Streaming
  • Ingest data with Auto Loader
  • Ingest data with Lakeflow Spark Declarative Pipelines
  • Module assessment

Module 10 : Cleanse, transform, and load data into Unity Catalog

  • Profile data
  • Choose column data types
  • Resolve duplicates and nulls
  • Transform data with filters and aggregations
  • Transform data with joins and set operators
  • Transform data with denormalization and pivots
  • Load data with merge, insert, and append
  • Module assessment

Module 11 : Implement and manage data quality constraints with Azure Databricks

  • Implement validation checks
  • Implement data type checks
  • Detect and manage schema drift
  • Manage data quality with pipeline expectations
  • Module assessment

Module 12 : Design and implement data pipelines with Azure Databricks

  • Design order of operations for a pipeline
  • Choose notebook vs Lakeflow Pipelines
  • Design Lakeflow job logic
  • Design error handling in pipelines and jobs
  • Create pipeline with notebook
  • Create pipeline with Lakeflow Spark Declarative Pipelines
  • Module assessment

Module 13 : Implement Lakeflow Jobs with Azure Databricks

  • Create job setup and configuration
  • Configure job triggers
  • Schedule a job
  • Configure job alerts
  • Configure automatic restarts
  • Module assessment

Module 14 : Implement development lifecycle processes in Azure Databricks

  • Apply Git version control best practices
  • Manage branching and pull requests
  • Implement testing strategy
  • Configure and package Declarative Automation Bundles
  • Deploy bundle with Databricks CLI
  • Module assessment

Module 15 : Monitor, troubleshoot and optimize workloads in Azure Databricks

  • Monitor and manage cluster consumption
  • Troubleshoot and repair Lakeflow Jobs
  • Troubleshoot Spark jobs and notebooks
  • Investigate caching, skewing, spilling, shuffle
  • Implement log streaming with Azure Log Analytics
  • Module assessment

Documentation

Course material included.

Complementary Courses

Eligible Funding

ITTA is a partner of a continuing education fund dedicated to temporary workers. This fund can subsidize your training, provided that you are subject to the “Service Provision” collective labor agreement (CCT) and meet certain conditions, including having worked at least 88 hours in the past 12 months.

Additional Information

Azure Databricks: the data engineering platform at the heart of the Lakehouse

Azure Databricks unifies data engineering, data science, machine learning and BI on a single platform. The Implement data engineering solutions using Azure Databricks (DP-750) training focuses on the data engineering pillar: ingestion, transformation, quality, governance and data exposure. You work with Apache Spark as optimized by Databricks (Photon engine), Delta Lake for ACID reliability, and the Lakehouse architecture, which combines the benefits of a data lake and a data warehouse.

Unity Catalog: unified governance

Unity Catalog is the governance layer you configure during the course: a single metastore catalog for all Databricks workspaces, granular permissions (catalog, schema, table, view, column), automatic data lineage and secure sharing via Delta Sharing. Mastery of Unity Catalog has become essential for enterprise Databricks architectures.
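As a hedged illustration of the permission hierarchy described above, the following Python sketch builds the SQL GRANT statements a principal typically needs to read a single table: USE CATALOG on the catalog, USE SCHEMA on the schema and SELECT on the table. The helper name read_grants and the object names main, sales, orders and data-analysts are hypothetical. The sketch only prints the statements and does not connect to a workspace.

```python
def read_grants(catalog: str, schema: str, table: str, principal: str) -> list:
    """Build the GRANT statements a principal needs to query one table.

    Hypothetical helper for illustration: it returns SQL text only.
    """
    def quote(name: str) -> str:
        # Backtick-quote identifiers so names with hyphens or spaces stay valid.
        return "`" + name.replace("`", "``") + "`"

    cat, sch, tbl, who = quote(catalog), quote(schema), quote(table), quote(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {cat}.{sch} TO {who}",
        f"GRANT SELECT ON TABLE {cat}.{sch}.{tbl} TO {who}",
    ]


# Example object and group names are assumptions, not part of the course.
for statement in read_grants("main", "sales", "orders", "data-analysts"):
    print(statement)
```

Granting at the three levels separately reflects the idea that access to a table also requires access to its parent schema and catalog.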

Delta Lake and the medallion architecture

Delta Lake brings ACID transactions, time travel, schema evolution and performance optimizations on top of Parquet files in Azure Data Lake Storage. The training covers advanced techniques: MERGE INTO for upserts, OPTIMIZE and Z-ordering for performance, VACUUM for retention, and change data feed for change propagation. The medallion architecture (bronze / silver / gold) is presented as a reference pattern.
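To make the upsert and time travel ideas concrete, here is a toy model in plain Python, with no Spark or Delta Lake dependency. It is a sketch under simplifying assumptions: each merge stores a full snapshot, whereas Delta Lake records changes in a transaction log over Parquet files. The class name ToyVersionedTable and the sample rows are hypothetical.

```python
from copy import deepcopy
from typing import Optional


class ToyVersionedTable:
    """Toy model: every committed merge appends a new immutable snapshot."""

    def __init__(self, key: str):
        self.key = key
        self._versions = [{}]  # version 0 is the empty table

    def merge(self, updates: list) -> int:
        """Upsert rows by key: update when matched, insert when not matched."""
        snapshot = deepcopy(self._versions[-1])
        for row in updates:
            snapshot[row[self.key]] = dict(row)
        self._versions.append(snapshot)
        return len(self._versions) - 1  # the new version number

    def read(self, version: Optional[int] = None) -> list:
        """Return rows of the latest version, or of an earlier one (time travel)."""
        snapshot = self._versions[-1 if version is None else version]
        return [snapshot[k] for k in sorted(snapshot)]


customers = ToyVersionedTable(key="id")
v1 = customers.merge([{"id": 1, "city": "Geneva"}, {"id": 2, "city": "Lausanne"}])
v2 = customers.merge([{"id": 2, "city": "Bern"}, {"id": 3, "city": "Zurich"}])

print(customers.read())            # latest state after both merges
print(customers.read(version=v1))  # state as it was after the first merge
```

Reading version v1 after the second merge still returns the original Lausanne row, which is the behavior time travel gives you on a real Delta table.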

Delta Live Tables: declarative pipelines

Delta Live Tables (DLT), listed in the course modules under its current name Lakeflow Spark Declarative Pipelines, is a declarative framework for building reliable data pipelines. Instead of orchestrating individual notebooks, you declare transformations, and DLT manages dependencies, retries, data quality (expectations) and monitoring. The training shows how to migrate existing pipelines to DLT and how to combine streaming and batch in the same pipeline.
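The expectation mechanism can be sketched in plain Python. In a real pipeline, expectations are declared with decorators such as dlt.expect, dlt.expect_or_drop and dlt.expect_or_fail. The function below only emulates those three behaviors on a list of dictionaries, and its name, parameters and sample rows are hypothetical.

```python
def apply_expectation(rows, name, predicate, action="warn"):
    """Return (kept_rows, violation_count) for one expectation.

    action="warn": keep every row and only count violations.
    action="drop": remove violating rows.
    action="fail": raise as soon as any violation exists.
    """
    if action not in ("warn", "drop", "fail"):
        raise ValueError(f"unknown action: {action}")
    violations = [r for r in rows if not predicate(r)]
    if violations and action == "fail":
        raise ValueError(f"expectation '{name}' failed for {len(violations)} row(s)")
    kept = [r for r in rows if predicate(r)] if action == "drop" else list(rows)
    return kept, len(violations)


orders = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": -5},
    {"id": None, "amount": 3},
]

kept, bad = apply_expectation(orders, "non_negative_amount",
                              lambda r: r["amount"] >= 0, action="warn")
print(len(kept), bad)

kept, bad = apply_expectation(orders, "valid_id",
                              lambda r: r["id"] is not None, action="drop")
print(len(kept), bad)
```

Choosing between warn, drop and fail is a design decision per rule: warn suits metrics you want to observe, drop protects downstream tables, and fail stops the update when bad data must never pass.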

Spark performance and optimization

Optimizing Spark requires understanding its internals: partitioning, shuffle, broadcast joins, AQE (Adaptive Query Execution) and Photon (the native Databricks engine written in C++). You learn to read the Spark UI, identify bottlenecks, adjust cluster configurations and choose the right APIs (DataFrame or SQL, avoiding the low-level RDD API where possible).
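One bottleneck you look for in the Spark UI is partition skew. The plain-Python sketch below applies the kind of rule AQE uses for skewed joins: a partition counts as skewed when it is larger than a factor times the median partition size and also larger than an absolute threshold. The defaults used here (factor 5 and 256 MB) are assumed to mirror the open-source Spark settings spark.sql.adaptive.skewJoin.skewedPartitionFactor and spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes; check the values for your runtime. The function name and the sample sizes are hypothetical.

```python
from statistics import median

MB = 1024 * 1024


def skewed_partitions(sizes_bytes, factor=5.0, threshold_bytes=256 * MB):
    """Return the indexes of partitions that look skewed.

    A partition is flagged only if it exceeds both factor * median
    and the absolute threshold (assumed defaults, see the text above).
    """
    if not sizes_bytes:
        return []
    med = median(sizes_bytes)
    return [
        i for i, size in enumerate(sizes_bytes)
        if size > factor * med and size > threshold_bytes
    ]


# Hypothetical shuffle partition sizes, in bytes.
sizes = [100 * MB, 120 * MB, 110 * MB, 2000 * MB]
print(skewed_partitions(sizes))
```

The absolute threshold keeps small datasets from being flagged: a 200 MB partition among 10 MB partitions is far above the median but still below 256 MB, so it is not treated as skewed.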

Audience and prerequisites

The Implement data engineering solutions using Azure Databricks (DP-750) training targets data engineers, ETL engineers and data architects who will design production Databricks pipelines. Prerequisites: Python or Scala knowledge, Azure fundamentals (equivalent to AZ-900), SQL experience. Prior Spark knowledge is a plus but not required.

FAQ: Implement data engineering solutions using Azure Databricks (DP-750)

What’s the difference between Azure Databricks and Microsoft Fabric?

Microsoft Fabric provides a unified SaaS experience (Lakehouse, Data Warehouse, Real-Time Analytics, Power BI). Azure Databricks remains the leading platform for advanced Spark workloads, large-scale ML and multi-cloud architectures. The DP-750 training covers Azure Databricks in depth; DP-600 and DP-700 cover Microsoft Fabric.

Do I need to know Apache Spark before DP-750?

No. The training introduces Spark progressively. However, SQL experience and knowledge of at least one programming language (Python, Scala) are essential.

Does the DP-750 course lead to a Microsoft certification?

DP-750 is a Microsoft Applied Skill with no formal exam associated. For a certification covering Azure Databricks, see Azure Data Engineer Associate (DP-203), which includes Databricks in its scope.

Does the training cover real-time streaming workloads?

Yes. Structured Streaming and Delta Live Tables in continuous mode are covered, with CDC (Change Data Capture) use cases and Event Hubs / Kafka integration.

Registration fee
CHF 3'000.-

  • Tue 16 Jun, 09:00 to Fri 19 Jun, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 16 Jun, 09:00 to Fri 19 Jun, 17:00. Location: Genève, Route des Jeunes 35, 1227 Genève. Session tag: DP-750T00
  • Tue 21 Jul, 09:00 to Fri 24 Jul, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 21 Jul, 09:00 to Fri 24 Jul, 17:00. Location: Lausanne, Av. Mon-Repos 24, 1005 Lausanne. Session tag: DP-750T00
  • Tue 25 Aug, 09:00 to Fri 28 Aug, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 25 Aug, 09:00 to Fri 28 Aug, 17:00. Location: Genève, Route des Jeunes 35, 1227 Genève. Session tag: DP-750T00
  • Tue 29 Sep, 09:00 to Fri 2 Oct, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 29 Sep, 09:00 to Fri 2 Oct, 17:00. Location: Lausanne, Av. Mon-Repos 24, 1005 Lausanne. Session tag: DP-750T00
  • Tue 3 Nov, 09:00 to Fri 6 Nov, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 3 Nov, 09:00 to Fri 6 Nov, 17:00. Location: Genève, Route des Jeunes 35, 1227 Genève. Session tag: DP-750T00
  • Tue 8 Dec, 09:00 to Fri 11 Dec, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 8 Dec, 09:00 to Fri 11 Dec, 17:00. Location: Lausanne, Av. Mon-Repos 24, 1005 Lausanne. Session tag: DP-750T00
  • Tue 12 Jan, 09:00 to Fri 15 Jan, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 12 Jan, 09:00 to Fri 15 Jan, 17:00. Location: Genève, Route des Jeunes 35, 1227 Genève. Session tag: DP-750T00
  • Tue 16 Feb, 09:00 to Fri 19 Feb, 17:00. Location: Virtual. Session tag: DP-750T00
  • Tue 16 Feb, 09:00 to Fri 19 Feb, 17:00. Location: Lausanne, Av. Mon-Repos 24, 1005 Lausanne. Session tag: DP-750T00

Contact

ITTA
Route des jeunes 35
1227 Carouge, Switzerland

Opening hours

Monday to Friday
8:30 AM to 6:00 PM
Tel. 058 307 73 00
