Azure Databricks Training

Level

Intermediate

Duration

24h / 3 days

Date

Individually arranged

Price

Individually arranged

Azure Databricks is a cloud-based big data service built on Apache Spark. It is used to prepare and explore data and to build and train machine learning models. The platform is designed for scalable, high-performance data processing while remaining easy to use, and it makes it easier for teams to coordinate their work and share code.

What You Will Learn

  • Fundamentals of the Azure Databricks platform.
  • Data processing and preparation techniques.
  • Data analysis using Databricks SQL.
  • Utilization of Apache Spark for data processing.

Who is this training for?

  • Individuals who want to leverage data to optimize processes.
  • Those who wish to deepen their understanding of Apache Spark.
  • Individuals with basic knowledge of data analysis.
  • Developers, Data Engineers, and Data Scientists.

Training Program

  1. What is the Databricks Lakehouse Platform

  • Description of the Databricks Lakehouse Platform
  • Origin of the Lakehouse data management paradigm
  • Fundamental challenges in managing and using data
  • Security features of the Databricks Lakehouse Platform
  • Examples of organizations benefiting from Databricks Lakehouse

  2. What is Databricks SQL

  • Fundamental concepts for using Databricks SQL effectively
  • Tools and features for querying data and sharing insights
  • Supporting data analysis workflows and business insight sharing

  3. What is Databricks Machine Learning

  • Overview of Databricks Machine Learning
  • Benefits for data science and machine learning teams
  • Core components and functionalities
  • Examples of real-world customer use cases

  4. Databricks Data Science and Data Engineering Workspace

  • Overview of the workspace
  • Assets provided by the workspace
  • Example development workflow for querying and aggregating data

  5. Databricks Workspaces and Services

  • Databricks architecture and services
  • Data Science and Engineering Workspace
  • Creating and managing interactive clusters
  • Notebook basics
  • Git versioning with Databricks Repos
  • Using Databricks Repos
  • Getting started with the Databricks platform

  6. Delta Lakehouse

  • What is Delta Lake
  • Managing Delta tables
  • Manipulating tables with Delta Lake
  • Advanced Delta features

  7. Relational Entities on Databricks

  • Databases and views
  • Views and Common Table Expressions (CTEs)

  8. ETL with Spark SQL

  • Querying files directly
  • Providing options
  • Creating Delta tables
  • Writing to tables
  • Cleaning data
  • Advanced SQL transformations
  • User-defined functions (UDFs)

  9. Getting Started with Databricks SQL

  • Navigating Databricks SQL
  • Unity Catalog on Databricks SQL
  • Schemas, tables, and views
  • Basic SQL operations
  • Ingesting data
  • Joins
  • Delta commands in Databricks SQL

  10. Presenting Data Visually

  • Data visualization concepts
  • Visualizations in Databricks SQL
  • Dashboards
  • Notifying stakeholders

  11. Apache Spark Programming – DataFrames

  • Databricks platform and ecosystem
  • Spark SQL
  • DataFrames and SparkSession
  • Reader and writer APIs
  • Data sources
  • DataFrame, column, and expressions
  • Transformations, actions, and rows

  12. Apache Spark Programming – Transformations

  • Aggregation and aggregation functions
  • Date and time processing
  • Dates and timestamps
  • Complex data types
  • Additional functions
  • UDFs and vectorized UDFs

  13. Apache Spark Programming – Spark Internals

  • Spark architecture
  • Spark cluster and execution model
  • Shuffling and caching
  • Query optimization
  • Partitioning

  14. Apache Spark Programming – Structured Streaming

  • Streaming fundamentals in Apache Spark

Contact us

We will organize training tailored to your needs.

Przemysław Wołosz

Key Account Manager

przemyslaw.wolosz@infoShareAcademy.com

The controller of your personal data is InfoShare Academy Sp. z o.o. with its registered office in Gdańsk, al. Grunwaldzka 427B, 80-309 Gdańsk, KRS: 0000531749, NIP: 5842742121. Personal data are processed in accordance with the information clause.