Data Profiling and Validation Training

Level

Intermediate

Duration

16h / 2 days

Date

Individually arranged

Price

Individually arranged

This training introduces data profiling and data quality validation. You will learn tools and methods for measuring data quality, profiling datasets, and handling errors. The program covers the key techniques in both areas, from building quality metrics to implementing automated tests. In data-driven environments, companies cannot afford errors caused by inconsistent or incomplete data, so the training shows how to detect, measure, and eliminate such issues using practical methods and tools.

What You Will Learn

  • Profile data in terms of structure, relationships, and patterns
  • Build data quality metrics aligned with business rules
  • Implement your own functions for detecting errors and anomalies in data
  • Understand practical aspects of testing data quality and compliance in production environments

Requirements

  • Knowledge of Python (version 3.10 or higher)
  • Knowledge of Docker

Who is this training for?

  • Data Engineers responsible for data processing and preparation
  • Data Architects building data warehouses and integration systems
  • QA Engineers in data projects
  • Analysts and developers working in ETL/ELT environments

Training Program

  1. Day 1: Data Profiling

  • Introduction to Data Profiling

    • Purpose and importance of data profiling
  • Types of Data Profiling

    • Column profiling
    • Relationship profiling
    • Dependency profiling
    • Pattern profiling
    • Type profiling
  • Profiling Metrics and Analysis

    • Creating profiling functions and metrics
    • Detecting data anomalies
    • Predicting anomalies based on historical patterns
  2. Day 2: Data Quality

  • Introduction to Data Quality

    • Data quality concepts and objectives
    • Data quality vs. business rules
  • Data Quality Dimensions

    • Completeness
    • Consistency
    • Accuracy
    • Uniqueness
    • Timeliness
    • Precision
  • Implementation and Organization

    • Implementing data quality tests
    • Roles and responsibilities in Data Quality
    • Relationship between Data Quality and Data Governance
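
To give a feel for the topics listed above, here is a minimal sketch of column profiling combined with two data quality dimensions (completeness and uniqueness). It is an illustration only, not material taken from the course: the function names, the sample data, and the thresholds are all assumptions, and it uses just the Python standard library.

```python
from collections import Counter
from typing import Any


def profile_column(values: list[Any]) -> dict[str, Any]:
    """Basic column profile: row count, nulls, distinct values, type mix."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "types": dict(Counter(type(v).__name__ for v in non_null)),
    }


def completeness(values: list[Any]) -> float:
    """Share of values that are not None (1.0 for an empty column)."""
    if not values:
        return 1.0
    return sum(v is not None for v in values) / len(values)


def uniqueness(values: list[Any]) -> float:
    """Share of non-null values that are distinct (1.0 if none exist)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return 1.0
    return len(set(non_null)) / len(non_null)


def run_checks(values: list[Any], min_completeness: float,
               min_uniqueness: float) -> list[str]:
    """Return the names of the quality checks that fall below threshold."""
    failures = []
    if completeness(values) < min_completeness:
        failures.append("completeness")
    if uniqueness(values) < min_uniqueness:
        failures.append("uniqueness")
    return failures


# Hypothetical sample column, invented for illustration only:
# one missing value, one duplicate, and one value of the wrong type.
customer_ids = [101, 102, 102, None, "104"]

profile = profile_column(customer_ids)
# Hypothetical thresholds standing in for business rules.
failures = run_checks(customer_ids, min_completeness=0.9, min_uniqueness=1.0)

print(profile)
print(failures)
```

In this sketch the type mix in the profile exposes the stray string identifier, while the threshold checks turn the metrics into pass/fail results that could run as automated tests.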

Contact us

We will organize training tailored to your needs.

Przemysław Wołosz

Key Account Manager

przemyslaw.wolosz@infoShareAcademy.com

    The controller of your personal data is InfoShare Academy Sp. z o.o. with its registered office in Gdańsk, al. Grunwaldzka 427B, 80-309 Gdańsk, KRS: 0000531749, NIP: 5842742121. Personal data are processed in accordance with the information clause.