dpejcoch
The World According to David Pejčoch

Teaching and Coaching

University of Economics in Prague (Master level)

  • 4IZ210: Processing of Information and Knowledge
  • 4IZ562: Data Quality Management

Business Institut (MBA)

Management of Data, Information and Knowledge

On Demand Workshops

  • Data and Information Quality Management
  • SAS Programming
  • SQL

Data and Information Quality Management

Introduction to DQM

  • Quality of Data vs. Quality of Information
  • Data Quality characteristics
  • Causes of non-quality
  • Impacts of non-quality
  • Data Quality requirements
  • Methodologies
  • Examples of tools

Typical Tasks within DQM Process

  • Data Profiling and Standardization
  • Imputation of missing values
  • Data Verification and Validation
  • Unification and Deduplication
  • Geocoding
  • Householding
  • Data Enrichment
  • Data Quality Monitoring
  • Maintenance of Quality Knowledge Base
  • Data Quality Assessment
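
Two of the tasks listed above, profiling and imputation of missing values, can be illustrated with a minimal sketch. The column `ages`, the helper names `profile` and `impute_median`, and the choice of median imputation are assumptions made for illustration; they are not taken from the workshop material.

```python
from statistics import median

def profile(values):
    """Return basic profiling counts for one column (None marks a missing value)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }

def impute_median(values):
    """Replace missing values with the median of the values that are present."""
    present = [v for v in values if v is not None]
    fill = median(present)
    return [fill if v is None else v for v in values]

# Hypothetical column with two missing entries.
ages = [34, None, 29, 41, None, 29]

column_profile = profile(ages)
imputed = impute_median(ages)
print(column_profile)
print(imputed)
```

Median imputation is only one option; a real project would choose the method according to the distribution of the data and the way it is used downstream.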

Master Data Management: Process View

  • Master Data and Master Data Management (MDM)
  • Key goals of MDM
  • Towards a single version of the truth: historical overview
  • Architecture of MDM solution
  • Importance of the relation between MDM and Data Governance
  • Specific forms of MDM
  • User roles in MDM
  • MDM implementation dos and don'ts
  • Examples of tools

Master Data Management: Technology View

  • The role of Metadata
  • Application integration: batch and online integration
  • Principles of Service Oriented Architecture
  • Importance of the Canonical Data Model
  • Data Integration: Methods for Matching and Merging, with a benchmark of different approaches
  • Identification of Surviving Record
  • Examples of Architecture
  • Hadoop and MDM
  • Automated vs. Manual deduplication
  • Exceptions from deduplication
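
As a rough illustration of matching, merging and identifying a surviving record, the sketch below groups records whose names are similar and keeps the most recently updated record from each group. The record layout, the 0.85 threshold, the use of `difflib.SequenceMatcher` as the similarity measure and the "latest update wins" survivorship rule are all assumptions for this example, not the approaches benchmarked in the workshop.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive string similarity in the range 0..1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def deduplicate(records, threshold=0.85):
    """Cluster records by name similarity and return one survivor per cluster."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            # Compare against the first record of each existing cluster.
            if similarity(rec["name"], cluster[0]["name"]) >= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    # Hypothetical survivorship rule: the most recently updated record wins.
    return [max(c, key=lambda r: r["updated"]) for c in clusters]

# Hypothetical customer records; dates are ISO strings, so they sort correctly.
records = [
    {"name": "John Smith", "updated": "2014-03-01"},
    {"name": "Jon Smith", "updated": "2015-01-10"},
    {"name": "Mary Jones", "updated": "2013-07-22"},
]

survivors = deduplicate(records)
print(survivors)
```

Comparing each record against every cluster does not scale to large data sets; production matching usually adds a blocking step first, and borderline matches are typically routed to manual review.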

Improving the Quality of Data

  • Data input controls
  • ETL embedded controls
  • Data Reconciliation
  • Controls integrated into web services
  • Quality of semi-structured data
  • Quality of unstructured data
  • Big Data Quality and Governance

SAS Programming

SAS: Level 1

  • Introduction to the family of SAS tools
  • How to connect to different data sources
  • Introduction to SAS Data Step (functionality for sequential data processing)
  • Using Data Step for loading data from flat files, ODBC sources and XML
  • Introduction to the SAS SQL procedure (PROC SQL, the implementation of ANSI SQL in SAS)
  • Moving data from Teradata to SAS and back without performance problems
  • Introduction to basic SAS procedures for analysing data
  • Vertical and horizontal merging of data using Data Step and PROC SQL
  • Introduction to SAS functions
  • Introduction to SAS formats: how to use formats in reporting and for lookups to reference data sources

SAS: Level 2

  • How to create samples using SAS
  • Introduction to SAS reporting functionality
  • Data visualization using SAS
  • Introduction to SAS Macro Language: how to parameterize reports, create loops and automate code
  • Fast joining using data stored in memory (SAS Hash Object)
  • Freely available SAS samples
  • SAS tips and tricks

SQL

SQL Basics

  • How to connect to training data
  • Basic SELECT syntax
  • ANSI Joins using simple keys
  • ANSI Joins using composite keys
  • Vertical data merge
  • Nested SELECTs
  • Using aggregate functions
  • Conditions inside aggregate functions
  • Using BETWEEN
  • Combining AND and OR conditions
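
Several of the topics above (aggregate functions, conditions inside aggregate functions, BETWEEN) can be tried out with a small self-contained sketch. It uses Python's built-in `sqlite3` module instead of the workshop's training database, and the `orders` table with its columns and rows is invented for illustration.

```python
import sqlite3

# Hypothetical training data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        ('A', 100.0, 'paid'),
        ('A',  50.0, 'cancelled'),
        ('B', 200.0, 'paid'),
        ('B',  30.0, 'paid');
""")

# A CASE expression inside SUM() limits the aggregate to rows that meet
# a condition, while COUNT(*) still counts every row in the group.
# BETWEEN is inclusive on both ends.
rows = conn.execute("""
    SELECT customer,
           COUNT(*) AS all_orders,
           SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paid_amount
    FROM orders
    WHERE amount BETWEEN 30 AND 200
    GROUP BY customer
    ORDER BY customer
""").fetchall()

print(rows)
conn.close()
```

The same `SUM(CASE WHEN ... END)` pattern is standard SQL, so it should carry over to other databases, though details such as data types may differ.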

Teradata SQL

  • Teradata DB architecture overview
  • Table types in Teradata
  • Using HELP and SHOW to view object metadata
  • Anatomy of Explain Plan
  • Performance tricks
  • Demystification of skewed queries
  • LIKE / LIKE ANY / LIKE ALL
  • Using CAST() to change data type
  • Using EXTRACT function to get parts of date / time
  • Extracting bitcounts from flags
  • How to check table space
  • Collecting statistics

Page last updated on 15 February 2015
©2011 - 2015 D. Pejčoch