
DMBOK Version 2 Exam Prep Summary

The document serves as a reference for the DAMA-DMBOK Version 2, summarizing key concepts in data governance, architecture, modeling, and quality management. It emphasizes the importance of data quality assessments through both bottom-up and top-down approaches, and outlines various types of metadata and their significance in data management. Additionally, it highlights the role of data stewardship and the necessity of effective document and content management practices.

Uploaded by

salehmughir


DMBOK 1 Notes
Contents
3. Data Governance
4. Data Architecture
5. Data Modelling and Design
9. Document and Content Management
10. Reference and Master Data
8. Data Integration and Interoperability
11. Data Warehousing and BI
12. Metadata Management
Metadata types according to DMBOK 1
13. Data Quality Management

This document is your reference in the event of a question based on the DMBOK Version 1. All
questions ought to be based on DMBOK Version 2, but some questions appear to have been recycled
from the previous exams. We have reported them.

Summary of DAMA-DMBOK Version 2© as preparation for CDMP Exams by


Veronica Diesel (CDMP, Data Modelling & Design, Data Governance, Data Quality)
Education Director DAMA SA (December 2020)

3. Data Governance


4. Data Architecture


5. Data Modelling and Design


Data models express two primary types of data rules:

Cardinality rules define the quantity of each entity instance that can participate in a relationship
between entities. For example, “Each company can employ many persons.”

Referential Integrity rules ensure valid values. For example, “A person can exist without working for
a company, but a company cannot exist unless at least one person is employed by that company.”
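As a rough illustration of the two rule types, the sketch below uses Python's built-in sqlite3 module. The table and column names (company, person, company_id) and both helper functions are hypothetical, chosen only to mirror the examples above. The nullable foreign key lets a person exist without a company and lets many persons point at one company. The "at least one person per company" rule cannot be expressed as a foreign key, so this sketch assumes it is checked with a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Nullable foreign key: a person can exist without working for a company,
    -- and one company can be referenced by many persons (one-to-many).
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        company_id INTEGER REFERENCES company(id)
    );
    INSERT INTO company (id, name) VALUES (1, 'Acme'), (2, 'Empty Ltd');
    INSERT INTO person (name, company_id) VALUES ('Ann', 1), ('Bob', 1), ('Cy', NULL);
""")


def rejects_unknown_company(conn):
    """Referential integrity: a person may not reference a company that does not exist."""
    try:
        conn.execute("INSERT INTO person (name, company_id) VALUES ('Dee', 99)")
    except sqlite3.IntegrityError:
        return True
    return False


def companies_without_staff(conn):
    """Hypothetical check for the 'at least one employed person' rule."""
    rows = conn.execute("""
        SELECT c.name FROM company c
        LEFT JOIN person p ON p.company_id = c.id
        WHERE p.id IS NULL
        ORDER BY c.name
    """).fetchall()
    return [name for (name,) in rows]


fk_enforced = rejects_unknown_company(conn)
violations = companies_without_staff(conn)
```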



Denormalization is the deliberate transformation of a normalized logical data model into tables with
redundant data. In other words, it intentionally puts one data element in multiple places. This
process introduces a risk of data errors due to duplication, so implement data quality checks to
ensure that the copies of each data element remain correctly stored. Denormalize only to
improve database query performance, by either segregating or combining data to reduce query set
sizes, combining data to reduce joins, or performing and storing costly data calculations.
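The duplication risk and the accompanying quality check can be sketched as follows. This is a minimal example under assumed names: the customer and customer_order tables and the find_stale_copies helper are hypothetical. The customer name is copied onto each order to avoid a join, and the check looks for copies that have drifted from the master value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Denormalized: customer_name is copied onto each order to avoid a join at query time.
    CREATE TABLE customer_order (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        customer_name TEXT NOT NULL,
        amount REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO customer_order VALUES
        (10, 1, 'Acme', 50.0),
        (11, 2, 'Globex', 75.0),
        (12, 1, 'Acme', 20.0);
""")

# The master value changes, but only one of the redundant copies is updated.
conn.execute("UPDATE customer SET name = 'Acme Corp' WHERE id = 1")
conn.execute("UPDATE customer_order SET customer_name = 'Acme Corp' WHERE id = 10")


def find_stale_copies(conn):
    """Data quality check: return order ids whose copied name differs from the master."""
    rows = conn.execute("""
        SELECT o.id FROM customer_order o
        JOIN customer c ON c.id = o.customer_id
        WHERE o.customer_name <> c.name
        ORDER BY o.id
    """).fetchall()
    return [order_id for (order_id,) in rows]


stale = find_stale_copies(conn)
```

In this sketch order 12 still carries the old name, which is the kind of error the data quality check is meant to surface.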


9. Document and Content Management


Non-value-added information should be removed to avoid wasting space and incurring maintenance
costs. Develop policies and procedures for its removal.

Many organisations do not prioritise removing non-value-added information because:

• Policies are not adequate:
o One person’s non-value-added information may be valuable to another person
o There may be possible future needs for the information
• There is no buy-in for records management
o Inability to decide what to keep
o Perceived cost of making a decision to remove the information
o Electronic space is cheap; it is easier to buy more than to put archive/disposal
processes in place


10. Reference and Master Data


8. Data Integration and Interoperability


11. Data Warehousing and BI


Active Data Warehousing
With the onset of Operational BI (and other general requirements from the business)
pushing for lower latency and more integration of real-time or near-real-time data into
the data warehouse, new architectural approaches are emerging to deal with the
inclusion of volatile data. A common application of operational BI is automated
banking machine (ABM) data provisioning. When a customer makes a banking transaction,
historical balances and new balances resulting from immediate banking actions need to
be presented to that customer in real time.
The impact of the changes from new volatile data must be isolated from the bulk of the
historical, non-volatile DW data. Typical architectural approaches for isolation include a
combination of building partitions and using union queries for the different partitions,
when necessary.
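The partition-plus-union idea can be sketched as follows, assuming two hypothetical tables (txn_history for the non-volatile bulk and txn_current for the volatile partition) and a hypothetical view txn_all that unions them. This is only one way to realise the isolation described above, shown with Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Bulk historical, non-volatile data, typically loaded in batches.
    CREATE TABLE txn_history (account TEXT, amount REAL, posted_on TEXT);
    -- Small volatile partition that receives real-time or near-real-time inserts.
    CREATE TABLE txn_current (account TEXT, amount REAL, posted_on TEXT);
    -- Union view, used only when a query needs both partitions.
    CREATE VIEW txn_all AS
        SELECT account, amount, posted_on FROM txn_history
        UNION ALL
        SELECT account, amount, posted_on FROM txn_current;
    INSERT INTO txn_history VALUES
        ('A-1', 100.0, '2020-11-30'),
        ('A-1', -30.0, '2020-12-01');
""")


def balance(conn, account):
    """Balance across the historical and volatile partitions."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM txn_all WHERE account = ?", (account,)
    ).fetchone()
    return total


before = balance(conn, "A-1")
# A withdrawal at the banking machine touches only the volatile partition.
conn.execute("INSERT INTO txn_current VALUES ('A-1', -20.0, '2020-12-02')")
after = balance(conn, "A-1")
(history_rows,) = conn.execute("SELECT COUNT(*) FROM txn_history").fetchone()
```

The new transaction changes the balance the customer sees, while the historical table is left untouched.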

Fact Tables
Fact tables contain important business measures. They are entities whose attributes
represent measures (facts). Each row corresponds to a particular measurement, and the measures
are numeric. Fact tables have many rows and take up the most space in the database (around 90%).

Dimension Tables
Dimension tables represent the important objects of the business and contain textual descriptions
of the business. They are the source of the “query by” or “report by” constraints. They are
denormalised and account for about 10% of the total data. All designs will have a Date dimension
and an Organisation or Party dimension as a minimum. Each dimension has a surrogate or natural key.


12. Metadata Management


Meta-data is a broad term that includes many potential subject areas. These subject areas include:
• Business analytics: Data definitions, reports, users, usage, performance.
• Business architecture: Roles and organizations, goals and objectives.
• Business definitions: The business terms and explanations for a particular concept, fact, or
other item found in an organization.
• Business rules: Standard calculations and derivation methods.
• Data governance: Policies, standards, procedures, programs, roles, organizations,
stewardship assignments.
• Data integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII,
migration / conversion.
• Data quality: Defects, metrics, ratings.
• Document content management: Unstructured data, documents, taxonomies, ontologies,
name sets, legal discovery, search engine indexes.
• Information technology infrastructure: Platforms, networks, configurations, licenses.
• Logical data models: Entities, attributes, relationships and rules, business names and
definitions.
• Physical data models: Files, tables, columns, views, business definitions, indexes, usage,
performance, change management.
• Process models: Functions, activities, roles, inputs / outputs, workflow, business rules,
timing, stores.
• Systems portfolio and IT governance: Databases, applications, projects and programs,
integration roadmap, change management.
• Service-oriented architecture (SOA) information: Components, services, messages, master
data.
• System design and development: Requirements, designs and test plans, impact.
• Systems management: Data security, licenses, configuration, reliability, service levels.

Metadata types according to DMBOK 1


Meta-data is classified into four major types: business, technical and operational,
process, and data stewardship.

Business Metadata:

Business meta-data includes the business names and definitions of subject and concept areas,
entities, and attributes; attribute data types and other attribute properties; range descriptions;
calculations; algorithms and business rules; and valid domain values and their definitions. Business
meta-data relates the business perspective to the meta-data user.

• Business data definitions, including calculations.
• Business rules and algorithms, including hierarchies.
• Data lineage and impact analysis.
• Data model: enterprise level conceptual and logical.
• Data quality statements, such as confidence and completeness indicators.
• Data stewardship information and owning organization(s).
• Data update cycle.
• Historical data availability.
• Historical or alternate business definitions.
• Regulatory or contractual constraints.
• Reports lists and data contents.
• System of record for data elements.
• Valid value constraints (sample or list).

Technical and operational meta-data:

Technical and operational meta-data provides developers and technical users with information
about their systems.
Technical meta-data includes physical database table and column names, column properties, other
database object properties, and data storage. The database administrator needs to know users'
patterns of access, frequency, and report / query execution time. Capture this meta-data using
routines within a DBMS or other software.
Operational meta-data is targeted to the IT operations users’ needs, including information about
data movement, source and target systems, batch programs, job frequency, schedule anomalies,
recovery and backup information, archive rules, and usage.
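As one possible example of capturing technical meta-data with routines within a DBMS, the sketch below reads table, column, and index names from SQLite's own catalog (the sqlite_master table and PRAGMA table_info). The customer table and the technical_metadata helper are hypothetical. Other database systems expose similar information through their own catalogs, which are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT);
    CREATE INDEX idx_customer_email ON customer(email);
""")


def technical_metadata(conn):
    """Collect table, column, and index names from SQLite's catalog."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {
                "name": col[1],
                "type": col[2],
                "not_null": bool(col[3]),
                "primary_key": bool(col[5]),
            }
            for col in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        indexes = [
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'index' AND tbl_name = ? ORDER BY name",
                (table,),
            )
        ]
        catalog[table] = {"columns": columns, "indexes": indexes}
    return catalog


meta = technical_metadata(conn)
```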

Examples of technical and operational meta-data include:


• Audit controls and balancing information.
• Data archiving and retention rules.
• Encoding / reference table conversions.
• History of extracts and results.
• Identification of source system fields.
• Mappings, transformations, and statistics from the system of record to target
data stores (OLTP, OLAP).
• Physical data model, including data table names, keys, and indexes.
• Program job dependencies and schedule.
• Program names and descriptions.
• Purge criteria.
• Recovery and backup rules.
• Relationships between the data models and the data warehouse / marts.
• Systems of record feeding target data stores (OLTP, OLAP, SOA).
• User report and query access patterns, frequency, and execution time.
• Version maintenance.

Process Metadata

Process meta-data is data that defines and describes the characteristics of other system elements
(processes, business rules, programs, jobs, tools, etc.).
Examples of process meta-data include:
• Data stores and data involved.
• Government / regulatory bodies.
• Organization owners and stakeholders.
• Process dependencies and decomposition.
• Process feedback loop documentation.
• Process name.
• Process order and timing.
• Process variations due to input or timing.
• Roles and responsibilities.
• Value chain activities.


Data Stewardship Metadata

Data stewardship meta-data is data about data stewards, stewardship processes, and responsibility
assignments. Data stewards assure that data and meta-data are accurate, with high quality across
the enterprise. They establish and monitor sharing of data.
Examples of data stewardship meta-data include:
• Business drivers / goals.
• Data CRUD rules.
• Data definitions - business and technical.
• Data owners.
• Data sharing rules and agreements / contracts.
• Data stewards, roles and responsibilities.
• Data stores and systems involved.
• Data subject areas.
• Data users.
• Government / regulatory bodies.
• Governance organization structure and responsibilities.

Unstructured Metadata:

Examples of descriptive meta-data include:
• Catalog information.
• Thesauri keyword terms.

Examples of structural meta-data include:
• Dublin Core.
• Field structures.
• Format (Audio / visual, booklet).
• Thesauri keyword labels.
• XML schemas.

Examples of administrative meta-data include:
• Source(s).
• Integration / update schedule.
• Access rights.
• Page relationships (e.g. site navigational design).

Bibliographic meta-data, record-keeping meta-data, and preservation meta-data are all meta-data
schemes applied to documents, but from different focuses:
• Bibliographic meta-data is the library card of the document.
• Record-keeping meta-data is concerned with validity and retention.
• Preservation meta-data is concerned with storage, archival, condition, and conservation of
material.

Industry / Consensus and International Meta-data Standards


Two major types of meta-data standards exist:
• industry or consensus standards (terms used interchangeably)
• international standards.


13. Data Quality Management

Prior to defining data quality metrics, it is crucial to perform an assessment of the data using two
different approaches, bottom-up and top-down.

The bottom-up assessment of existing data quality issues involves inspection and evaluation of the
data sets themselves. Direct data analysis will reveal potential data anomalies that should be
brought to the attention of subject matter experts for validation and analysis. Bottom-up
approaches highlight potential issues based on the results of automated processes, such as
frequency analysis, duplicate analysis, cross-data set dependency, orphan child data rows, and
redundancy analysis.
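Three of the automated checks named above (frequency analysis, duplicate analysis, and orphan child rows) can be sketched in plain Python. The sample records and the function names are hypothetical, and a real profiling run would normally be executed against the actual data sets. The anomalies it reports are only candidates until subject matter experts validate them.

```python
from collections import Counter

# Hypothetical sample data for illustration only.
customers = [
    {"id": 1, "name": "Acme", "country": "ZA"},
    {"id": 2, "name": "Globex", "country": "ZA"},
    {"id": 3, "name": "Acme", "country": "ZA"},     # same name and country as id 1
    {"id": 4, "name": "Initech", "country": "zz"},  # rare value worth reviewing
]
orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 9},  # refers to a customer that does not exist
]


def frequency_analysis(rows, column):
    """Count how often each value of a column occurs."""
    return Counter(row[column] for row in rows)


def duplicate_analysis(rows, key_columns):
    """Return key combinations that occur more than once."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def orphan_rows(children, parents, fk, pk="id"):
    """Return child rows whose foreign key matches no parent row."""
    parent_keys = {parent[pk] for parent in parents}
    return [child for child in children if child[fk] not in parent_keys]


country_counts = frequency_analysis(customers, "country")
duplicates = duplicate_analysis(customers, ["name", "country"])
orphans = orphan_rows(orders, customers, "customer_id")
```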

However, potential anomalies, and even true data flaws, may not be relevant within the business
context unless vetted with the constituency of data consumers. The top-down approach to data
quality assessment involves engaging business users to document their business processes and the
corresponding critical data dependencies. It involves understanding how those
processes consume data, and which data elements are critical to the success of the business
application. By reviewing the types of reported, documented, and diagnosed data flaws, the data
quality analyst can assess the kinds of business impacts that are associated with data issues.
