
Data Modeling Techniques for Business

The document discusses the application of data modeling in business, detailing its importance in creating structured representations of data for databases. It outlines different types of data models (conceptual, logical, physical) and their characteristics, along with various data modeling techniques and the significance of handling missing values. Additionally, it emphasizes the need for business modeling to design organizational structures and processes, including the use of gap analysis to identify discrepancies between current and desired states.

Unit II
Application of Modeling in Business
⚫ Data Modeling:
⚫ Data modeling is the process of creating a data model for the data to be stored in a database.
⚫ This data model is a conceptual representation of data objects, the associations between different data objects, and the rules.
⚫ Data modeling helps in the visual representation of data and in enforcing business rules and government policies on the data.
⚫ Data Model:
⚫ The data model is an abstract model that organizes data description, data semantics, and consistency constraints of data.
⚫ The data model emphasizes what data is needed and how it should be organized.
⚫ A data model is like an architect's building plan: it helps to build the model and set the relationships between data items.
Why Use a Data Model?
⚫ It ensures that all data objects required by the database are accurately represented.
⚫ A data model helps design the database at the conceptual, logical, and physical levels.
⚫ The data model structure helps to define the relational tables, primary and foreign keys, and stored procedures.
⚫ It also provides a clear picture of the data and can be used by database developers to create a physical database.
Types of Data Models
⚫ There are three main types of data models:
⚫ Conceptual data models.
⚫ Logical data models.
⚫ Physical data models.
⚫ Each model has a specific purpose.
⚫ These data models are used to represent the data, how it is stored in the database, and the relationships between data items.
Conceptual Data Model
⚫ A conceptual data model is an organized view of database concepts and their relationships.
⚫ The purpose of creating a conceptual data model is to establish entities, their attributes, and relationships.
⚫ Business stakeholders and data architects create this type of data model.
⚫ The three basic features used to represent data in a data model are:
⚫ Entity: A real-world thing
⚫ Attribute: Characteristics or properties of an entity
⚫ Relationship: An association between two entities
⚫ Data model example:
⚫ Customer and Product are two entities.
⚫ Customer number and name are attributes of the Customer entity.
⚫ Product name and price are attributes of the Product entity.
⚫ Sale is the relationship between the Customer and the Product.
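The Customer/Product/Sale example above can be sketched in code. This is only an illustration: the class layout, field names, and sample values ("Alice", "Laptop", 999.0) are assumptions, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Entity: Customer."""
    number: int   # attribute: customer number
    name: str     # attribute: customer name


@dataclass(frozen=True)
class Product:
    """Entity: Product."""
    name: str     # attribute: product name
    price: float  # attribute: product price


@dataclass(frozen=True)
class Sale:
    """Relationship: links one customer to one product."""
    customer: Customer
    product: Product


# Hypothetical sample data, for illustration only.
alice = Customer(number=1, name="Alice")
laptop = Product(name="Laptop", price=999.0)
sale = Sale(customer=alice, product=laptop)

print(sale.customer.name, "bought", sale.product.name)
```

At the conceptual level nothing is said about storage; the sketch only names the entities, their attributes, and the relationship between them.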
⚫ Characteristics of a conceptual data model:
⚫ Offers organisation-wide coverage of the business concepts.
⚫ This type of data model is designed and developed for a business audience.
⚫ The conceptual model is developed independently of hardware specifications such as data storage capacity or location, and of software specifications such as DBMS vendor and technology.
⚫ The focus is to represent data as a user will
Logical Data Model
⚫ The logical data model is used to define the structure of data elements and to set relationships between them.
⚫ The logical data model adds further information to the conceptual data model elements.
⚫ The advantage of a logical data model is that it provides the foundation for the physical model.
⚫ However, the modeling structure remains generic.
⚫ Characteristics of a logical data model:
⚫ Describes data needs for a single project, but can be integrated with other logical data models depending on the scope of the project.
⚫ Designed and developed independently of the DBMS.
⚫ Data attributes have datatypes with exact precision and length.
Physical Data Model
⚫ A physical data model describes a database-specific implementation of the data model.
⚫ It offers database abstraction and helps generate the schema.
⚫ The physical data model helps in visualizing the database structure by replicating database column keys, constraints, indexes, triggers, and other RDBMS features.
⚫ Characteristics of a physical data model:
⚫ The data model contains relationships between tables, which address the cardinality and nullability of the relationships.
⚫ Developed for a specific version of a DBMS, location, data storage, or technology to be used in the project.
⚫ Columns should have exact datatypes, lengths, and default values assigned.
⚫ Primary and Foreign keys, views, indexes,
⚫ The data model should be detailed enough to be used for building the physical database.
⚫ The information in the data model can be used for defining the relationships between tables, primary and foreign keys, and stored procedures.
⚫ A data model helps businesses communicate within and across organizations.
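As a rough sketch of what a physical data model turns into, the snippet below creates a small schema in an in-memory SQLite database using Python's standard `sqlite3` module. The choice of SQLite, the table and column names, and the sample rows are all assumptions made for illustration; a real physical model would target whichever DBMS the project actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: exact datatypes, nullability, a default value,
# primary/foreign keys, and an index. Note that SQLite accepts but does
# not enforce the length in VARCHAR(100).
conn.executescript("""
CREATE TABLE customer (
    customer_number INTEGER PRIMARY KEY,
    name            VARCHAR(100) NOT NULL
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       VARCHAR(100) NOT NULL,
    price      DECIMAL(10, 2) NOT NULL DEFAULT 0
);
CREATE TABLE sale (
    sale_id         INTEGER PRIMARY KEY,
    customer_number INTEGER NOT NULL REFERENCES customer(customer_number),
    product_id      INTEGER NOT NULL REFERENCES product(product_id)
);
CREATE INDEX idx_sale_customer ON sale(customer_number);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO product (product_id, name) VALUES (10, 'Laptop')")
conn.execute("INSERT INTO sale VALUES (100, 1, 10)")

# The price was omitted on insert, so the column default applies.
default_price = conn.execute(
    "SELECT price FROM product WHERE product_id = 10"
).fetchone()[0]

# A sale that points at a non-existent customer violates the foreign key.
try:
    conn.execute("INSERT INTO sale VALUES (101, 999, 10)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

The same entities as in the conceptual example now carry the database-specific details (keys, nullability, defaults, indexes) that the physical level is responsible for.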
Databases
⚫ A database is a structured set of data held in a computer’s memory or in the cloud that is accessible in various ways.
⚫ Databases make structured storage secure, efficient, and fast. They provide a framework for how the data should be stored, structured, and retrieved.
⚫ Example: when you turn on Netflix, it suggests what to watch next based on your previous selections.
⚫ Types of databases:
⚫ Relational
⚫ Non-relational

⚫ Relational databases:
⚫ In a relational database, the data is organized and stored in tables that can be linked to each other using some relation.
⚫ For example, an airline company can have a table of passengers for all flights and another table for passengers on a specific flight. A flight code can connect these two tables.
⚫ This ability to connect tables allows developers and data scientists to better understand the relations between the different elements of the tables.
⚫ Understanding these relationships can give us hints and insights that make analyzing and visualizing the data easier.
⚫ The way to communicate and interact with relational databases is through the SQL language.
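To make the idea of linking tables through a flight code concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is a simplified variant of the airline example: the two tables (`flights` and `passengers`), their columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables linked by the flight_code column.
conn.executescript("""
CREATE TABLE flights (
    flight_code TEXT PRIMARY KEY,
    destination TEXT
);
CREATE TABLE passengers (
    name        TEXT,
    flight_code TEXT REFERENCES flights(flight_code)
);
INSERT INTO flights VALUES ('AB123', 'Delhi'), ('CD456', 'Mumbai');
INSERT INTO passengers VALUES
    ('Asha', 'AB123'), ('Ravi', 'CD456'), ('Meera', 'AB123');
""")

# SQL JOIN: follow the flight code from one table to the other.
rows = conn.execute(
    """
    SELECT p.name, f.destination
    FROM passengers AS p
    JOIN flights AS f ON f.flight_code = p.flight_code
    WHERE f.flight_code = ?
    ORDER BY p.name
    """,
    ("AB123",),
).fetchall()

print(rows)
```

The query returns the passengers on one flight together with a column that lives in the other table, which is the kind of cross-table relation the bullets describe.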
⚫ Non-relational databases:
⚫ Non-relational databases are also known as NoSQL databases.
⚫ A popular form of NoSQL database is the key-value store. Keys must be unique; as long as they are, a key-value pair can store all the relations in one document.
⚫ Relational databases use tables as their core storage unit. A table in a database consists of a collection of rows and columns, and you can connect several tables using relations.
⚫ In NoSQL, however, the data is stored in document-like storage. You can still perform all tasks, such as adding, deleting, and updating data, as long as you know how the document is structured.
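The key-value idea can be imitated with a plain Python dictionary. This is only a toy stand-in for a NoSQL store, not the API of any real database; the key format (`"customer:1"`) and the document layout are assumptions.

```python
# A dict maps unique keys to document-like values.
store = {}

# Add: the whole customer, including related orders, lives in one document.
store["customer:1"] = {
    "name": "Alice",
    "orders": [{"product": "Laptop", "price": 999.0}],
}

# Update: possible because we know how the document is structured.
store["customer:1"]["orders"].append({"product": "Mouse", "price": 25.0})
store["customer:1"]["name"] = "Alice B."

# Read: no join needed, the related data is already nested in the value.
order_total = sum(order["price"] for order in store["customer:1"]["orders"])

# Delete: remove a document by its key.
store["customer:2"] = {"name": "Bob", "orders": []}
del store["customer:2"]

print(order_total, list(store))
```

Because keys are unique, assigning to an existing key replaces the document rather than creating a second copy.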
Types of Data and Variables
⚫ There are two types of variables in data:
⚫ Numerical
⚫ Categorical
⚫ Numerical data is divided into:
⚫ Continuous
⚫ Discrete
⚫ Categorical data is divided into:
⚫ Nominal
⚫ Ordinal
⚫ Numerical
Numerical data is information that is measurable and is represented as numbers, not words or text.
⚫ Continuous values can take any value within a range; they have no logical end.
⚫ Example: variables that represent money or height.
⚫ Discrete values are countable and have a logical end.
⚫ Example: variables for days in the month.
⚫ Categorical
Categorical data is data that isn’t a number; it can be a string of text or a date.
⚫ Ordinal values are values that have a set order to them.
⚫ Example: the priority of a bug, such as “Critical” or “Low”, or the ranking in a race, such as “First” or “Third”.
⚫ Nominal values are values with no set order to them.
⚫ Example: variables such as “Country” or “Marital Status”.
⚫ There is a special type of categorical data called binary. Binary data types have only two values: yes or no. This can be represented in different ways, such as “True” and “False” or 1 and 0.
⚫ Example: whether a person has stopped their subscription service or not, or whether a person bought a car or not.
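One practical consequence of these categories is how a variable is turned into numbers for analysis. The sketch below shows one common convention for each categorical type; the function names, the intermediate priority levels ("Medium", "High"), and the country list are assumptions added for illustration.

```python
# Assumed full ordering for the bug-priority example ("Low" ... "Critical").
PRIORITY_ORDER = ["Low", "Medium", "High", "Critical"]


def encode_ordinal(value, order=PRIORITY_ORDER):
    """Ordinal: map each value to its rank, so the order is preserved."""
    return order.index(value)


def encode_nominal(value, categories):
    """Nominal: one-hot encode, so no artificial order is introduced."""
    return [1 if value == category else 0 for category in categories]


def encode_binary(value):
    """Binary: map the two possible values to 1 and 0."""
    return 1 if value in ("Yes", "True", True) else 0


print(encode_ordinal("Critical"))                            # rank of the priority
print(encode_nominal("India", ["India", "Japan", "Kenya"]))  # one-hot vector
print(encode_binary("No"))
```

Ranks are meaningful for ordinal values because "Critical" really is above "Low"; for nominal values such as a country, a rank would suggest an order that does not exist, which is why a one-hot vector is used instead.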
Data Modeling Techniques
⚫ Data modeling is a process through which data is stored in a structured format in a database. Data modeling enables organizations to make data-driven decisions.
⚫ Different methodologies/techniques:
⚫ Hierarchical Model
⚫ Relational Model
⚫ Network Model
⚫ Object-oriented Model
⚫ Hierarchical model:
⚫ As the name indicates, this data model uses a hierarchy to structure the data in a tree-like format. However, retrieving and accessing data is difficult in a hierarchical database, which is why it is rarely used now.
⚫ Network model:
⚫ The network model is inspired by the hierarchical model. However, unlike the hierarchical model, it makes it easier to convey complex relationships, as each record can be linked to multiple parent records.
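The difference between one parent per record and several parents per record can be shown with two small dictionaries. This is a conceptual sketch only, not how hierarchical or network database systems are implemented; the record names and helper functions are hypothetical.

```python
# Hierarchical model: every record has exactly one parent (a tree).
hier_parent = {
    "Sales": "Company",
    "Engineering": "Company",
    "Asha": "Sales",
    "Ravi": "Engineering",
}


def path_to_root(node, parent):
    """Walk up the tree from a record to the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


# Network model: a record may be linked to several parent records.
net_parents = {
    "Asha": ["Sales", "Project X"],
    "Ravi": ["Engineering", "Project X"],
}


def children_of(owner, parents):
    """List the records that have `owner` among their parents."""
    return sorted(child for child, owners in parents.items() if owner in owners)


print(path_to_root("Asha", hier_parent))
print(children_of("Project X", net_parents))
```

In the tree, "Asha" can belong only to "Sales"; in the network version the same record also belongs to "Project X", a relationship the strict hierarchy cannot express without duplicating the record.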
⚫ Relational model:
⚫ Proposed as an alternative to the hierarchical model by IBM researcher E. F. Codd.
⚫ Here data is represented in the form of tables.
⚫ It reduces complexity and provides a clear overview of the data.
⚫ Object-oriented model:
⚫ This database model consists of a collection of objects, each with its own features and methods.
⚫ This type of database model is also called the post-relational database model.
⚫ Importance of Data Modeling:
⚫ A clear representation of data makes it easier to analyze the data properly.
⚫ Data modeling represents the data properly in a model.
⚫ It reduces the chance of data redundancy and omission, which helps in clear analysis and processing.
⚫ Data modeling improves data quality and enables the concerned stakeholders to make data-driven decisions.
Missing Imputations
⚫ Many datasets contain missing values for various reasons. They are often encoded as NaNs, blanks, or NA.
⚫ One way to handle this problem is to drop the observations that have missing data.
⚫ However, this loses data points that carry valuable information.
⚫ A better strategy is to impute the missing values. In other words, we infer the missing values from the existing data.
⚫ Imputation using mean/median values:
⚫ This works by calculating the mean or median of the non-missing values in a column and then replacing the missing values in that column with it. Each column is treated separately and independently of the others.
⚫ It can only be used with numeric data.
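Mean/median imputation for a single column can be sketched with the standard `statistics` module. In this sketch missing values are represented as `None`, which is an assumption (the slides mention NaNs, blanks, and NA), and the function name `impute_column` is hypothetical.

```python
from statistics import mean, median


def impute_column(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical numeric column with one missing entry.
column = [1.0, None, 2.0, 9.0]

print(impute_column(column, strategy="mean"))
print(impute_column(column, strategy="median"))
```

With an outlier such as 9.0 in the column, the mean and the median give different fill values, which is the usual reason for choosing the median on skewed data.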
⚫ Imputation using most frequent or zero/constant values:
⚫ Most frequent is another statistical strategy for imputing missing values. It works with categorical features (strings or numerical representations) by replacing missing data with the most frequent value within each column.
⚫ Zero or constant imputation, as the name suggests, replaces the missing values with either zero or any constant value you specify.
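Both strategies fit in a few lines of standard-library Python. As before, `None` stands in for a missing value, and the function names and sample columns are assumptions for illustration.

```python
from collections import Counter


def impute_most_frequent(values):
    """Replace None with the most frequent observed value in the column.

    On a tie, Counter.most_common returns the value seen first.
    """
    observed = [v for v in values if v is not None]
    most_common_value = Counter(observed).most_common(1)[0][0]
    return [most_common_value if v is None else v for v in values]


def impute_constant(values, fill_value=0):
    """Replace None with zero, or with any constant the caller specifies."""
    return [fill_value if v is None else v for v in values]


print(impute_most_frequent(["red", None, "blue", "red"]))
print(impute_constant([1, None, 3]))
print(impute_constant(["a", None], fill_value="unknown"))
```

Most-frequent imputation works on strings as well as numbers, while constant imputation makes the filled entries easy to spot later because they all share the same sentinel value.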
⚫ Imputation using k-NN:
⚫ k-nearest neighbours (k-NN) is an algorithm used for simple classification.
⚫ The algorithm uses ‘feature similarity’ to predict the values of any new data points.
⚫ This means that a new point is assigned a value based on how closely it resembles the points in the training set.
⚫ This can be very useful for predicting missing values: find the k closest neighbours of the observation with missing data and impute from them.
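A small from-scratch sketch of k-NN imputation is given below. It assumes that only one column (`target`) has missing values, that the other columns are numeric and complete, that distance is Euclidean, and that the fill value is the mean of the neighbours' values; all of these are simplifying assumptions, and the function name `knn_impute` is hypothetical.

```python
import math


def knn_impute(rows, target, k=2):
    """Fill None entries in column `target` with the mean of that column
    over the k complete rows that are closest in the remaining columns."""
    complete = [r for r in rows if r[target] is not None]
    features = [i for i in range(len(rows[0])) if i != target]
    result = []
    for row in rows:
        if row[target] is not None:
            result.append(list(row))
            continue

        def dist(other, row=row):
            # Euclidean distance over the non-target columns.
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in features))

        nearest = sorted(complete, key=dist)[:k]
        filled = list(row)
        filled[target] = sum(n[target] for n in nearest) / len(nearest)
        result.append(filled)
    return result


# Hypothetical data: column 0 is a feature, column 1 has a missing value.
rows = [
    (1.0, 10.0),
    (2.0, 20.0),
    (10.0, 100.0),
    (1.5, None),
]
imputed = knn_impute(rows, target=1, k=2)
print(imputed[3])
```

The incomplete row sits between the first two rows in feature space, so its missing value is filled from those two neighbours and the distant third row is ignored.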
⚫ Imputation using deep learning:
⚫ This method works very well with categorical and non-numerical features.
⚫ It relies on a library that trains machine learning models based on deep neural networks to impute missing values in a dataframe.
Need for Business Modeling
⚫ A business model is a structured model, like a blueprint for the final product to be developed. It gives structure and dynamics for planning and provides the foundation for the final product.
⚫ With the help of modeling techniques, we can create a complete description of existing and proposed organizational structures, processes, and information used by the enterprise.
⚫ Purpose of Business Modeling:
⚫ Business modeling is used to design the current and future state of an enterprise.
⚫ The model is used by the business analyst and the stakeholders to ensure that they have an accurate understanding of the current “As-Is” model of the enterprise.
⚫ It is also used to verify that stakeholders have a shared understanding of the proposed “To-Be” model of the solution.
⚫ Analyzing requirements is part of the business modeling process. Functional requirements are gathered during the “Current State”.
⚫ These requirements are provided by the stakeholders and cover the business processes, data, and business rules that describe the desired functionality to be designed in the “Future State”.
Performing GAP Analysis
⚫ After defining the business needs, the current state (e.g. current business processes, business functions, features of a current system, and services/products offered) must be identified, by seeking input from staff and other stakeholders including business owners, to understand how people, processes, technology, structure, and architecture are supporting the business.
⚫ A gap analysis is then performed to assess whether any gap prevents the business needs from being achieved, by comparing the identified current state with the desired outcomes.
⚫ If there is no gap (i.e. the current state is adequate to meet the business needs and desired outcomes), it will probably not be necessary to launch the IT project.
⚫ Otherwise, the problems/issues that need to be addressed in order to bridge the gap should be identified.
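The comparison at the heart of a gap analysis can be pictured as a set difference between the desired state and the current state. This is a deliberately simplified sketch: real gap analyses are qualitative, and the capability names and the `gap_analysis` function are hypothetical.

```python
def gap_analysis(current_state, desired_state):
    """Return the desired capabilities that the current state does not provide."""
    return sorted(set(desired_state) - set(current_state))


# Hypothetical capabilities gathered from staff and stakeholders.
current = {"online ordering", "email support"}
desired = {"online ordering", "email support", "order tracking", "live chat"}

gaps = gap_analysis(current, desired)
if gaps:
    decision = "identify issues to address: " + ", ".join(gaps)
else:
    decision = "no gap; a new project is probably not needed"

print(decision)
```

An empty result corresponds to the "no gap" case, where the current state already meets the business needs; a non-empty result is the list of items to investigate before launching a project.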
