DMBOK 1 Notes
Contents
3. Data Governance
4. Data Architecture
5. Data Modelling and Design
9. Document and Content Management
10. Reference and Master Data
8. Data Integration and Interoperability
11. Data Warehousing and BI
12. Metadata Management
Metadata types according to DMBOK 1
13. Data Quality Management
This document is your reference in the event of a question based on the DMBOK Version 1. All
questions ought to be based on DMBOK Version 2, but some questions appear to have been recycled
from the previous exams. We have reported them.
Summary of DAMA-DMBOK Version 2© as preparation for CDMP Exams by
Veronica Diesel (CDMP, Data Modelling & Design, Data Governance, Data Quality)
Education Director DAMA SA (December 2020)
3. Data Governance
4. Data Architecture
5. Data Modelling and Design
Data models express two primary types of data rules:
Cardinality rules define the quantity of each entity instance that can participate in a relationship
between entities. For example, “Each company can employ many persons.”
Referential Integrity rules ensure valid values. For example, “A person can exist without working for
a company, but a company cannot exist unless at least one person is employed by that company.”
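As an illustration only (not taken from DMBOK), the two rule types from the example can be checked over in-memory records. The `companies` and `persons` structures and the function name `check_rules` are hypothetical.

```python
from collections import Counter


def check_rules(companies, persons):
    """Check the company/person example rules.

    companies: set of company ids (hypothetical structure)
    persons:   dict of person id -> company id, or None if not employed
    Returns (dangling_persons, empty_companies).
    """
    # Valid values: an employer reference must point to an existing company.
    # A person with no employer (None) is allowed by the rule.
    dangling = sorted(
        p for p, c in persons.items() if c is not None and c not in companies
    )
    # A company cannot exist unless at least one person is employed by it.
    headcount = Counter(c for c in persons.values() if c is not None)
    empty = sorted(c for c in companies if headcount[c] == 0)
    return dangling, empty
```

Here a person may have no employer, but a company with a headcount of zero is reported as a violation, mirroring the quoted example.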
Denormalization is the deliberate transformation of a normalized logical data model into tables with
redundant data. In other words, it intentionally puts one data element in multiple places. This
process does introduce risk of data errors due to duplication. Implement data quality checks to
ensure that the copies of the data elements stay correctly stored. Denormalize only to
improve database query performance, whether by segregating or combining data to reduce query set
sizes, combining data to reduce joins, or performing and storing costly data calculations.
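A minimal sketch of the idea, assuming a hypothetical `customer` / `orders` schema in an in-memory SQLite database: the customer name is copied into each order row to avoid a join, and a data quality check looks for copies that have drifted from the master value.

```python
import sqlite3


def build_demo():
    """Create a tiny denormalized schema (hypothetical tables and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        -- Denormalized: customer_name is copied into each order to avoid a join.
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(customer_id),
            customer_name TEXT,
            amount REAL
        );
        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO orders VALUES (10, 1, 'Acme', 50.0), (11, 2, 'Globex', 75.0);
    """)
    return conn


def find_stale_copies(conn):
    """Data quality check: orders whose copied name differs from the master."""
    return conn.execute("""
        SELECT o.order_id
        FROM orders o
        JOIN customer c ON c.customer_id = o.customer_id
        WHERE o.customer_name <> c.name
        ORDER BY o.order_id
    """).fetchall()
```

If a customer is renamed in `customer` but the copy in `orders` is not updated, `find_stale_copies` returns the affected order ids, which is the duplication risk the text warns about.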
9. Document and Content Management
Non-value added information should be removed to avoid wasting space and the costs of
maintenance. Develop policies and procedures.
Many organisations do not prioritise removing non-value added information because:
• Policies are not adequate:
o One person’s non-value added information may be valuable to another person
o There may be possible future needs for the information
• There is no buy-in for records management
o Inability to decide what to keep
o Perceived cost of making a decision to remove the information
o Electronic space is cheap; it is easier to buy more than to put archive/disposal
processes in place
10. Reference and Master Data
8. Data Integration and Interoperability
11. Data Warehousing and BI
Active Data Warehousing
With the onset of Operational BI (and other general requirements from the business)
pushing for lower latency and more integration of real time or near real time data into
the data warehouse, new architectural approaches are emerging to deal with the
inclusion of volatile data. A common application of operational BI is automated
banking machine (ABM) data provisioning. When a customer makes a banking transaction,
historical balances and the new balances resulting from immediate banking actions need to
be presented to the customer in real time.
The impact of the changes from new volatile data must be isolated from the bulk of the
historical, non-volatile DW data. Typical architectural approaches for isolation include a
combination of building partitions and using union queries for the different partitions,
when necessary.
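The isolation approach can be sketched as follows, assuming two hypothetical SQLite tables: `txn_history` for the bulk, non-volatile data and `txn_today` for the small volatile partition, combined through a `UNION ALL` view. This is one possible arrangement, not a prescribed design.

```python
import sqlite3


def build_warehouse():
    """Create a historical partition, a volatile partition, and a union view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE txn_history (account TEXT, amount REAL);  -- bulk, non-volatile
        CREATE TABLE txn_today   (account TEXT, amount REAL);  -- small, volatile
        -- Union query across the partitions, used only when both are needed.
        CREATE VIEW txn_all AS
            SELECT account, amount FROM txn_history
            UNION ALL
            SELECT account, amount FROM txn_today;
        INSERT INTO txn_history VALUES ('A1', 100.0), ('A1', -20.0);
    """)
    return conn


def balance(conn, account):
    """Current balance: historical rows plus today's volatile rows."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM txn_all WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]
```

New transactions are written only to `txn_today`, so the historical table is untouched while the view still presents an up-to-date balance.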
Fact Tables
Fact tables contain important business measures. They are entities which contain attributes
representing measures (facts). Each row corresponds to a particular measurement, and the measures
are numeric. Fact tables have many rows and take up the most space in the database (around 90%).
Dimension Tables
Dimension tables represent the important objects of the business and contain textual descriptions
of the business. They are the source of the “query by” or “report by” constraints. They are
denormalised and account for about 10% of the total data. All designs will have a Date dimension
and an Organisation or Party dimension as a minimum. Dimensions have a surrogate or natural key.
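A small star-schema sketch in SQLite may help fix the distinction. The table names (`fact_sales`, `dim_date`, `dim_party`), columns, and sample rows are hypothetical; the dimensions use integer surrogate keys.

```python
import sqlite3


def build_star():
    """Create one fact table with Date and Party dimensions (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,  -- surrogate key
            calendar_date TEXT,
            month_name TEXT
        );
        CREATE TABLE dim_party (
            party_key INTEGER PRIMARY KEY,  -- surrogate key
            party_name TEXT,
            region TEXT
        );
        -- Fact rows hold numeric measures plus keys to the dimensions.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            party_key INTEGER REFERENCES dim_party(party_key),
            sales_amount REAL
        );
        INSERT INTO dim_date VALUES (1, '2020-12-01', 'December'),
                                    (2, '2021-01-01', 'January');
        INSERT INTO dim_party VALUES (1, 'Acme', 'North'), (2, 'Globex', 'South');
        INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 40.0), (2, 1, 60.0);
    """)
    return conn


def sales_by_region(conn):
    """A 'report by' query: the dimension supplies the grouping attribute."""
    return conn.execute("""
        SELECT p.region, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_party p ON p.party_key = f.party_key
        GROUP BY p.region
        ORDER BY p.region
    """).fetchall()
```

The textual attribute `region` comes from the dimension table, while the summed measure comes from the fact table.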
12. Metadata Management
Meta-data is a broad term that includes many potential subject areas. These subject areas include:
• Business analytics: Data definitions, reports, users, usage, performance.
• Business architecture: Roles and organizations, goals and objectives.
• Business definitions: The business terms and explanations for a particular concept, fact, or
other item found in an organization.
• Business rules: Standard calculations and derivation methods.
• Data governance: Policies, standards, procedures, programs, roles, organizations,
stewardship assignments.
• Data integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII,
migration / conversion.
• Data quality: Defects, metrics, ratings.
• Document content management: Unstructured data, documents, taxonomies, ontologies,
name sets, legal discovery, search engine indexes.
• Information technology infrastructure: Platforms, networks, configurations, licenses.
• Logical data models: Entities, attributes, relationships and rules, business names and
definitions.
• Physical data models: Files, tables, columns, views, business definitions, indexes, usage,
performance, change management.
• Process models: Functions, activities, roles, inputs / outputs, workflow, business rules,
timing, stores.
• Systems portfolio and IT governance: Databases, applications, projects and programs,
integration roadmap, change management.
• Service-oriented architecture (SOA) information: Components, services, messages, master
data.
• System design and development: Requirements, designs and test plans, impact.
• Systems management: Data security, licenses, configuration, reliability, service levels.
Metadata types according to DMBOK 1
Meta-data is classified into four major types: business, technical and operational,
process, and data stewardship.
Business Metadata:
Business meta-data includes the business names and definitions of subject and concept areas,
entities, and attributes; attribute data types and other attribute properties; range descriptions;
calculations; algorithms and business rules; and valid domain values and their definitions. Business
meta-data relates the business perspective to the meta-data user.
• Business data definitions, including calculations.
• Business rules and algorithms, including hierarchies.
• Data lineage and impact analysis.
• Data model: enterprise level conceptual and logical.
• Data quality statements, such as confidence and completeness indicators.
• Data stewardship information and owning organization(s).
• Data update cycle.
• Historical data availability.
• Historical or alternate business definitions.
• Regulatory or contractual constraints.
• Reports lists and data contents.
• System of record for data elements.
• Valid value constraints (sample or list).
Technical and operational meta-data:
Technical and operational meta-data provides developers and technical users with information
about their systems.
Technical meta-data includes physical database table and column names, column properties, other
database object properties, and data storage. The database administrator needs to know users'
patterns of access, frequency, and report / query execution time. Capture this meta-data using
routines within a DBMS or other software.
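As one hedged example of such a routine, SQLite exposes table and column metadata through its `sqlite_master` catalog and the `PRAGMA table_info` statement. The helper name `describe_tables` is hypothetical, and other DBMSs provide their own catalog views for the same purpose.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite database."""
    tables = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalog = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(c[1], c[2]) for c in cols]
    return catalog
```

This captures only physical table and column names and declared types; access patterns and execution times would need the DBMS's own logging or monitoring facilities.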
Operational meta-data is targeted to the IT operations users’ needs, including information about
data movement, source and target systems, batch programs, job frequency, schedule anomalies,
recovery and backup information, archive rules, and usage.
Examples of technical and operational meta-data include:
• Audit controls and balancing information.
• Data archiving and retention rules.
• Encoding / reference table conversions.
• History of extracts and results.
• Identification of source system fields.
• Mappings, transformations, and statistics from the system of record to target
data stores (OLTP, OLAP).
• Physical data model, including data table names, keys, and indexes.
• Program job dependencies and schedule.
• Program names and descriptions.
• Purge criteria.
• Recovery and backup rules.
• Relationships between the data models and the data warehouse / marts.
• Systems of record feeding target data stores (OLTP, OLAP, SOA).
• User report and query access patterns, frequency, and execution time.
• Version maintenance.
Process Metadata
Process meta-data is data that defines and describes the characteristics of other system elements
(processes, business rules, programs, jobs, tools, etc.).
Examples of process meta-data include:
• Data stores and data involved.
• Government / regulatory bodies.
• Organization owners and stakeholders.
• Process dependencies and decomposition.
• Process feedback loop documentation.
• Process name.
• Process order and timing.
• Process variations due to input or timing.
• Roles and responsibilities.
• Value chain activities.
Data Stewardship Metadata
Data stewardship meta-data is data about data stewards, stewardship processes, and responsibility
assignments. Data stewards assure that data and meta-data are accurate, with high quality across
the enterprise. They establish and monitor sharing of data.
Examples of data stewardship meta-data include:
• Business drivers / goals.
• Data CRUD rules.
• Data definitions - business and technical.
• Data owners.
• Data sharing rules and agreements / contracts.
• Data stewards, roles and responsibilities.
• Data stores and systems involved.
• Data subject areas.
• Data users.
• Government / regulatory bodies.
• Governance organization structure and responsibilities.
Unstructured Metadata:
Examples of descriptive meta-data include:
• Catalog information.
• Thesauri keyword terms.
Examples of structural meta-data include:
• Dublin Core.
• Field structures.
• Format (Audio / visual, booklet).
• Thesauri keyword labels.
• XML schemas.
Examples of administrative meta-data include:
• Source(s).
• Integration / update schedule.
• Access rights.
• Page relationships (e.g. site navigational design).
Bibliographic meta-data, record-keeping meta-data, and preservation meta-data are all meta-data
schemes applied to documents, but from different focuses:
• Bibliographic meta-data is the library card of the document.
• Record-keeping meta-data is concerned with validity and retention.
• Preservation meta-data is concerned with storage, archival, condition, and conservation of
material.
Industry / Consensus and International Meta-data Standards
Two major types of meta-data standards exist:
• industry or consensus standards (terms used interchangeably)
• international standards.
13. Data Quality Management
Prior to defining data quality metrics, it is crucial to perform an assessment of the data using two
different approaches, bottom-up and top-down.
The bottom-up assessment of existing data quality issues involves inspection and evaluation of the
data sets themselves. Direct data analysis will reveal potential data anomalies that should be
brought to the attention of subject matter experts for validation and analysis. Bottom-up
approaches highlight potential issues based on the results of automated processes, such as
frequency analysis, duplicate analysis, cross-data set dependency, orphan child data rows, and
redundancy analysis.
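Three of the automated checks named above can be sketched in plain Python over lists of row dictionaries. The function names and the row layout are assumptions made for illustration; a real assessment would normally use a profiling tool or SQL against the actual data sets.

```python
from collections import Counter


def frequency_analysis(rows, column):
    """Count how often each value of `column` occurs."""
    return Counter(row[column] for row in rows)


def duplicate_analysis(rows, key_columns):
    """Return key tuples that appear in more than one row."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def orphan_children(children, parents, fk, pk):
    """Return child rows whose foreign key matches no parent row."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]
```

The results are only potential anomalies: each finding still has to be taken to subject matter experts for validation.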
However, potential anomalies, and even true data flaws may not be relevant within the business
context unless vetted with the constituency of data consumers. The top-down approach to data
quality assessment involves engaging business users to document their business processes and the
corresponding critical data dependencies. The top-down approach involves understanding how their
processes consume data, and which data elements are critical to the success of the business
application. By reviewing the types of reported, documented, and diagnosed data flaws, the data
quality analyst can assess the kinds of business impacts that are associated with data issues.