Business Analytics: Data Warehousing Insights
Business Analytics: Data Warehousing Insights
1
Data Warehouses and Data Mart
Business intelligence (BI) is all about turning
an organization’s data into insights that can be
used to inform business decisions.
BI analysts will use BI tools, software or services
to access and analyze datasets and translate
their findings into reports, summaries,
dashboards, graphs, charts or maps.
In recent years, the advent of modern data
visualization and reporting tools has
transformed the discipline, empowering
businesses to use big data insights to identify,
develop and create new business opportunities.
2
Business Intelligence
“a set of methodologies, processes, architectures and
technologies that transform raw data into meaningful and
useful information”.
This can then be “used to enable more effective strategic,
tactical and operational insights and decision-making”.
This definition acknowledges that data cannot be
effectively analyzed or used to generate meaningful
insights if it is poor quality.
BI should not be confused with ‘business analytics’.
Business intelligence is descriptive and uses metrics to
generate clear snapshots of business performance.
Meanwhile, business analytics is predictive, and describes
what organizations should do in future to generate better
outcomes.
3
Data Warehouse
A Data Warehouse (DW) is a relational database that
is designed for query and analysis rather than
transaction processing.
It includes historical data derived from transaction
data from single and multiple sources.
A Data Warehouse provides integrated, enterprise-
wide, historical data and focuses on providing support
for decision-makers for data modeling and analysis.
A Data Warehouse is a group of data specific to the
entire organization, not only to a particular group of
users.
It is not used for daily operations and transaction
processing but used for making decisions.
4
A Data Warehouse can be viewed as a data
system with the following attributes:
It is a database designed for investigative
tasks, using data from various applications.
It supports a relatively small number of
clients with relatively long interactions.
5
It includes current and historical data to
provide a historical perspective of
information.
Its usage is read-intensive.
It contains a few large tables.
"Data Warehouse is a subject-oriented,
integrated, and time-variant store of
information in support of management's
decisions."
6
7
Subject-Oriented
A data warehouse target on the modeling
and analysis of data for decision-makers.
Therefore, data warehouses typically
provide a concise and straightforward view
around a particular subject, such as
customer, product, or sales, instead of the
global organization's ongoing operations.
This is done by excluding data that are not
useful concerning the subject and including
all data needed by the users to understand
the subject.
8
Integrated
A data warehouse integrates various
heterogeneous data sources like RDBMS,
flat files, and online transaction records.
It requires performing data cleaning and
integration during data warehousing to
ensure consistency in naming conventions,
attributes types, etc., among different data
sources.
9
10
Time-Variant
Historical information is kept in a data
warehouse. For example, one can retrieve
files from 3 months, 6 months, 12 months,
or even previous data from a data
warehouse.
These variations with a transactions
system, where often only the most current
file is kept.
11
Non-Volatile
The data warehouse is a physically separate data
storage, which is transformed from the source
operational RDBMS.
The operational updates of data do not occur in the
data warehouse, i.e., update, insert, and delete
operations are not performed.
It usually requires only two procedures in data
accessing: Initial loading of data and access to data.
Therefore, the DW does not require transaction
processing, recovery, and concurrency capabilities,
which allows for substantial speedup of data retrieval.
Non-Volatile defines that once entered into the
warehouse, and data should not change.
12
13
Goals of Data Warehousing
To help reporting as well as analysis
Maintain the organization's historical
information
Be the foundation for decision making.
14
15
Data Mart
A Data Mart is a subset of a directorial
information store, generally oriented to a
specific purpose or primary data subject
which may be distributed to provide
business needs.
Data Marts are analytical record stores
designed to focus on particular business
functions for a specific community within
an organization.
Data marts are derived from subsets of
data in a data warehouse, though in the
bottom- up data warehouse design
16
methodology, the data warehouse is
The fundamental use of a data mart is
Business Intelligence (BI) applications.
BI is used to gather, store, access, and
analyze record.
It can be used by smaller businesses to
utilize the data they have accumulated
since it is less expensive than
implementing a data warehouse.
17
Reasons for creating a data
mart
Creates collective data by a group of users
Easy access to frequently needed data
Ease of creation
Improves end-user response time
Lower cost than implementing a complete
data warehouses
Potential clients are more clearly defined
than in a comprehensive data warehouse
It contains only essential business data and
is less cluttered.
18
Types of Data Marts
There are mainly two approaches to
designing data marts. These approaches are
Dependent Data Marts
Independent Data Marts
19
20
A dependent data mart is a logical subset of a
physical subset of a higher data warehouse.
According to this technique, the data marts are
treated as the subsets of a data warehouse.
In this technique, firstly a data warehouse is created
from which further various data marts can be
created.
These data mart are dependent on the data
warehouse and extract the essential record from it.
In this technique, as the data warehouse creates the
data mart; therefore, there is no need for data mart
integration. It is also known as a top-down
approach.
21
Independent Data Marts
The second approach is Independent data marts
(IDM) Here, firstly independent data marts are
created, and then a data warehouse is designed
using these independent multiple data marts.
In this approach, as all the data marts are
designed independently; therefore, the
integration of data marts is required.
It is also termed as a bottom-up approach as
the data marts are integrated to develop a data
warehouse.
Other than these two categories, one more type
exists that is called "Hybrid Data Marts."
22
23
Hybrid Data Marts
It allows us to combine input from sources
other than a data warehouse.
This could be helpful for many situations;
especially when Adhoc integrations are
needed, such as after a new group or
product is added to the organizations.
24
Steps in Implementing a Data
Mart
The significant steps in implementing a
data mart are to design the schema,
construct the physical storage, populate
the data mart with data from source
systems, access it to make informed
decisions and manage it over time.
25
Designing
The design step is the first in the data
mart process. This phase covers all of
the functions from initiating the request
for a data mart through gathering data
about the requirements and developing
the logical and physical design of the
data mart.
It involves the following tasks:
Gathering the business and technical
requirements
Identifying data sources
26 Selecting the appropriate subset of data
Constructing
This step contains creating the physical
database and logical structures
associated with the data mart to provide
fast and efficient access to the data.
It involves the following tasks:
Creating the physical database and logical
structures such as table spaces associated
with the data mart.
Creating the schema objects such as tables
and indexes describe in the design step.
Determining how best to set up the tables
and access structures.
27
Populating
This step includes all of the tasks related to the
getting data from the source, cleaning it up,
modifying it to the right format and level of detail,
and moving it into the data mart.
It involves the following tasks:
Mapping data sources to target data sources
Extracting data
Cleansing and transforming the information.
Loading data into the data mart
Creating and storing metadata
28
It involves the following tasks:
Set up and intermediate layer (Meta
Layer) for the front-end tool to use. This
layer translates database operations
and objects names into business
conditions so that the end-clients can
interact with the data mart using words
which relates to the business functions.
Set up and manage database architectures
like summarized tables which help queries
agree through the front-end tools execute
29
rapidly and efficiently.
Managing
This step contains managing the data mart
over its lifetime. In this step, management
functions are performed as:
Providing secure access to the data.
Managing the growth of the data.
Optimizing the system for better performance.
Ensuring the availability of data even with system
failures.
30
31
32
Knowledge Management
The Connection Between Business Intelligence
and Knowledge Management
Information simply has to be accessible – gathering,
managing and utilizing information is an inevitable part
of running any modern business.
There are two information management
technologies we use: Business Intelligence
(BI) and Knowledge Management (KM).
However, terms information, data and
knowledge are often used interchangeably,
thus, apart from being confused about each
phrase, there is oftentimes much confusion
around the definitions of BI and KM.
33
Business Intelligence (BI): Generally, BI is considered
to be a set of tools and techniques applied to gather data
and transform it into information that can be used in
business analysis for the purposes of business
development.
Every company gathers, collects, or to say more
accurately, deals with a large amounts of data,
including various business documents, emails, newspaper
articles, web pages, reports, contracts, technical journals
and reviews, spreadsheets, graphs and charts and other
relevant sources of business data.
BI technologies usually deal with large amounts of
unstructured data via the use of data warehousing and
online analytical processing (OLAP).
34
All these data needs to be organized and validated
– prepared for business analytics.
Thus, “BI is about providing the right data at
the right time to the right people so that
they can take the right decisions.”
Knowledge Management can be defined in
many ways as it spans many multi-
disciplinary approaches – content
management, collaboration, the science of
organizational behavior, analyses like observation
of trends and appearance of anomalies, clustering,
classification, summarization, taxonomy building
and so on.
35
“Knowledge management is the process of
capturing, distributing, and effectively using
knowledge.”
KM refers to a set of techniques used to capture,
share, and use the information available in order
to achieve business objectives and to aid in
business decision making based on business
analytics.
There has been immense growth in the domain of
knowledge management in the last decade and
new applications and solutions that empower
knowledge sharing and knowledge management
have appeared.
36
Knowledge is a mix of contextual
information, experiences, rules, and values.
Richer, deeper, and more valuable.
Consider knowing –
What? - based upon assembling information
and eventually applying it.
How? – applying knowledge leads to learning
how to do something.
Why? – casual knowledge of why something
occurs.
37
38
Tacit vs. Explicit Knowledge
Tacit knowledge is personal, context-
specific and hard to formalize and
communicate.
A [knowledge] developed and internalized
by the knower over a long period of
time . . . incorporates so much accrued and
embedded learning that its rules may be
impossible to separate from how an
individual acts.
39
Explicit knowledge can be easily collected,
organized and transferred through digital
means.
A theory of the world, conceived of as a set
of all of the conceptual entities describing
classes of objects, relationships, processes,
and behavioral norms.
Often referred to as knowing that’, or
declarative knowledge.
40
41
Why Manage Knowledge?
Information and knowledge have become
the fields in which businesses compete.
Several important factors include:
Sharing Best Practice
Globalization
Rapid Change
Downsizing
Managing Information and Communication
Overload – Knowledge Embedded in Products
Sustainable Competitive Advantage.
42
43
Sharing Best Practices
Sharing best practices means leveraging the
knowledge gained by a subset of the
organization.
Increasingly important in organizations who
depend on applying their expertise such as
accounting, consulting and training firms.
KM systems capture best practices to
disseminate their experience within the firm.
Problems often arise from employees who may
be reluctant to share their knowledge
(managers must encourage and reward open
sharing).
44
Globalization
Historically three factors, land, labor and
capital were the key to economic success.
Knowledge has become a fourth factor.
Knowledge-based businesses can grow
without traditional land, labor, and capital
requirements.
Key competitive factor will be how well an
organization acquires and applies
knowledge
45
Other factors
Rapid change: firms must be nimble and adaptive
to compete.
Downsizing: sometimes the wrong people get fired
when creating a leaner organization.
Managing Info. and Comm. Overload: data must
be categorized in some manner if it is to be useful
rather than overwhelming.
Knowledge Embedded in Products: the intangibles
that add the most value to goods and services are
becoming increasingly knowledge-based.
Sustainable Competitive Advantage: KM is the
way to do this. Shorter innovation life cycles keep
companies ahead of the competition.
46
Categorizing knowledge
Process knowledge – best practices, useful
for increasing efficiency.
Factual knowledge – easy to document;
basic information about people/things.
Catalog knowledge – know where things
are; like directories of expertise.
Cultural knowledge – knowing how things
get done politically and culturally.
47
Process Knowledge
Definition:
Process knowledge refers to understanding the best
practices, workflows, and methodologies used to
complete tasks efficiently. It involves knowing how
things should be done to optimize productivity and
quality.
Characteristics:
It includes step-by-step procedures and workflows.
Often developed through experience, training, and
observation.
Can be standardized through documentation and
automation.
Helps in continuous improvement and efficiency.
48
Examples:
A software company follows an Agile
methodology for software development.
Manufacturing industries use Six Sigma to
improve quality control.
Hospitals have standard operating procedures
(SOPs) for patient care.
Importance:
Reduces errors and inefficiencies in operations.
Improves consistency and quality of outcomes.
Enables new employees to learn processes
quickly.
Helps organizations adapt to changing
environments through refined workflows.
49
Factual Knowledge
Definition:
Factual knowledge consists of basic, objective, and easily verifiable
information about people, objects, events, and concepts. It is often
recorded and shared in structured formats like databases, manuals, or
books.
Characteristics:
It is static and does not change frequently.
Can be easily documented and shared.
Used for decision-making and reference.
Examples:
A customer service representative needs customer details (name, address,
contact information).
Engineers refer to mathematical formulas and physical laws.
An AI model needs a dataset containing facts about weather conditions to
make predictions.
Importance:
Forms the foundation for knowledge-based decision-making.
Essential for training AI models and automation.
50 Helps organizations store and retrieve critical information efficiently.
Catalog Knowledge
Definition:
Catalog knowledge refers to knowing where to find information,
resources, or expertise when needed. It acts as a directory that
helps individuals or teams navigate complex systems.
Characteristics:
Helps in locating resources quickly.
Often stored in knowledge management systems, directories, or
databases.
Used for networking and collaboration.
Examples:
A university maintains a faculty directory with specializations and
contact details.
IT support teams have a knowledge base for troubleshooting guides.
A hospital maintains a medical inventory catalog to track the
availability of drugs and equipment.
Importance:
Reduces time spent on searching for information.
Improves collaboration and communication.
Enhances decision-making by connecting people with the right
expertise.
51
Cultural Knowledge
Definition:
Cultural knowledge refers to understanding how things get
done within an organization, community, or society based
on norms, traditions, and political structures. It includes
unwritten rules, beliefs, and power dynamics.
Characteristics:
Often implicit and learned through experience.
Can vary significantly between organizations and
regions.
Helps in navigating workplace relationships and
decision-making.
Examples:
In a multinational company, employees need to understand
cultural differences in communication styles.
A new employee in a government agency learns about
bureaucratic procedures and how to get approvals
efficiently.
In academia, researchers need to understand funding
priorities and grant application processes.
52
Importance:
KM involves four main
processes
Generation – all activities that discover
“new” knowledge.
Capture – all continuous processes of
scanning, organizing, and packaging
knowledge after it has been generated.
Codification – the representation of
knowledge in a manner that can be easily
accessed and transferred.
Transfer – transmitting knowledge from one
person or group to another, and the
absorption of that knowledge.
53
Knowledge Generation
Concerns the intentional activities of an
organization to acquire/create new
knowledge.
Two primary ways are knowledge creation
and knowledge sharing.
Methods include: – Research and
Development – Adaptation – Buy or Rent –
Shared Problem Solving – Communities of
Practice.
54
Knowledge capture takes into account the
media to be used in the codification process.
The 3 main knowledge capture activities are:
Scanning (gather “raw” information) – can
be electronic or human.
Organizing (move it into an acceptable
form) – must be easy for all types of users
to access.
Designingknowledge maps
(providinga guide for navigating the
knowledge base)
55
Designing Knowledge Maps
A knowledge map serves as both a guide to
where knowledge exists in an organization
and an inventory of the knowledge assets
available.
A knowledge map can consist of nothing
more than a list of people, documents, and
databases telling employees where to go
when they need help.
Provides access to resources that would
otherwise be difficult or impossible to find.
56
57
Knowledge Codification
Knowledge must be used or shared to be of
value.
Codification puts the knowledge into a form
that makes it easy to find and use.
It is difficult to measure knowledge in
discreet units (since it changes over time).
Knowledge has a shelf life.
58
Knowledge Transfer
Knowledge Transfer describe four different
modes of knowledge conversion (transfer):
Socialization: from tacit knowledge to tacit
knowledge
Externalization: from tacit knowledge to
explicit knowledge
Combination: from explicit knowledge to
explicit knowledge
Internalization: from explicit knowledge to
tacit knowledge
59
The Knowledge Management
Process
60
Similarities and Differences between BI and
KM
Confusion between these two technologies comes from the fact
that they deal with many similar processes.
Both business intelligence and knowledge management
capture, collect, organize, analyze and aggregate data in
order to find the best solutions regarding business decision
making processes.
Business intelligence goes as far back as the 19th century and
the beginnings of entrepreneurship and it has been developing
steadily over many years.
BI enables organizations to integrate data across the enterprise,
unlock the information and empower knowledge worker to make
better (and faster) decisions – it focuses on explicit knowledge.
However, KM deals with the creation of new knowledge and the
dispersion of existing knowledge throughout an organization – it
encompasses both tacit and explicit knowledge, thus, we can say
that KM can influence the very nature of business intelligence.
61
Types of Decisions
Modern cloud-native organizations have constantly growing
streams of raw data flowing from every corner of the
enterprise.
Determining the impact this data has on business
performance can be an overwhelming task requiring teams
of analysts.
That’s where employing business intelligence (BI) can help.
By presenting current and historical data within a business
context, the data insights supplied by BI tools enable
organizations to make smarter, more confident decisions
that provide strategic direction for years to come.
Instead of relying on intuition and “gut feel,” companies
can use BI to find new ways to increase revenue, track
performance, boost operational efficiency, identify market
trends, expose problems, and much, much more.
62
Decision-making
Simply put, decision-making is the process of deciding
something, especially with a group of people.
From a business decision perspective, the aim is to achieve
business objectives to satisfy stakeholder requirements,
needs, and expectations.
For the decision to be effective, however, decision makers
must forecast the outcome of each option and determine
which is best for a particular situation.
That makes decision support systems (DSS) like decision
intelligence and business intelligence absolute essentials.
Business intelligence refers to the technology tools and
processes that enable businesses to organize, analyse, and
contextualize business data from around the company.
Business intelligence tools and decision-making transform
raw data into meaningful and actionable information.
63
The role of business intelligence
Companies make big mistakes when they
base business decisions on what they think
will happen instead of relying on facts.
Using BI and advanced analytics,
organizations can extract crucial facts from
the mountain of data, transforming it into
information companies can act on to make
informed strategic decisions.
The result: improved business processes,
operational efficiency, and business
productivity.
64
Business intelligence
decisions
Business intelligence decisions typically fall
into three categories: strategic, tactical,
and operational.
An organization needs to gain a complete
understanding of these types of decisions
in business intelligence to make better-
informed decisions that lead to increased
customer retention, stakeholder
satisfaction, operational efficiency, and
revenue.
65
The relationship between business
intelligence and business analytics
Business intelligence tells you what is
currently happening and what happened in
the past to bring you to that state.
On the other hand, business analytics is an
umbrella term for predictive data analysis
techniques and prescriptive .
Using business intelligence and analytics
efficiently is the difference between
companies that succeed and those that fail
in the modern environment.
66
Three primary types of business
intelligence decisions
Business intelligence supports the three
types of decision-making mentioned above:
strategic
tactical
operational
67
Strategic decisions
Strategic decisions comprise the highest level of
organizational business decisions and are usually
less frequent and made by the organization’s
executives. Yet, their impact is enormous and far-
reaching.
Some types of strategic decisions include selecting
a particular market to penetrate, a company to
acquire, or whether to hire additional staff.
Decisions made at this level usually involve
significant expenditure.
However, they are generally non-repetitive in
nature and are taken only after careful analysis and
evaluation of many alternatives.
68
Tactical decisions
Tactical decisions (or semi structured
decisions) occur with greater frequency
(e.g., weekly or monthly) and fall into the
mid-management level. Often, they relate
to the implementation of strategic
decisions.
Examples of tactical decisions include
product price changes, work
schedules,departmental reorganization,
and similar activities.
The impact of these types of decisions is
medium regarding risk to the organization
69
and impact on profitability.
Operational decisions
Operational decisions (or structured
decisions) usually happen frequently (e.g.,
daily or hourly), relate to day-to-day
operations of the enterprise, and have a
lesser impact on the organization.
Operational decisions determine the day-
to-day profitability of the business, how
effectively it retains customers, or how well
it manages risk.
Answering a sales inquiry, approving a
quotation, or calculating employee bonuses
may be examples of this decision type.
70
Strategic: Long-term, complex, made
by senior managers
Tactical: Medium-term, less complex,
made by mid-level managers
Operational: Day-to-day, simple,
routine, made by junior managers
71
How to make the best decisions
for your business
How do you make the best business decisions? Some
people trust intuition or gut feeling. Others reach out to
constituents and experts for advice.
Still, others decision-making to information systems and
automation.
However, the smartest business decisions are made by
those who look at the numbers.
In a competitive business landscape, where agility,
flexibility, and a real-time decision-making process are
critical and timely, accurate data analysis is more important
than ever.
In that respect, relying on the types of decisions in business
intelligence is non-negotiable.
It is required for long-standing success and market
dominance.
72
Decision Making Process
As a business owner, you need to make decisions
that have a large impact on the success of your
business.
You need to make decisions ranging from the small,
such as what color your company logo should be, to
the larger, such as whether to expand your business.
No matter what your decision, it should be based on
facts, not emotions.
Decision-making is a crucial aspect of running a
business. Whenever you decide what to do next, a
chain of events is triggered, which may eventually
lead to the outcome you desire.
These events constitute the Decision-Making Process
using predictive analytics consulting.
73
The decision-making process is a term used to
describe how a company gets to a point where
it can make the best decision for its company.
This process is a way of thinking, a way of
seeing the world, and a method of arriving at
what you believe are the best possible
decisions.
74
From the day a company is founded, it is
believed that each executive should also be a
manager in the right sense.
It is believed that the executives should be
able to take the right action with the help of
the appropriate decision-making process.
And the best decision-making process is one
that is not only efficient but also cost-
effective.
75
In order to make the best decisions, companies need to use tools
to decipher their data and better understand their customers.
Business Intelligence (BI) is a set of tools used by companies to
analyze data using data analytics tools, gain valuable insights,
and make better and more informed decisions using Business
Intelligence reports.
Since its early days, business intelligence has been a tool for
decision-making. It’s a way to ensure that managers and
executives make better decisions so that businesses can run
more effectively.
These decisions can range from basic operations like deciding
how much to produce to broader strategic considerations like
deciding what products to sell.
However, BI tools vary considerably in how they are designed, how
they interact with other applications, and how they are used.
Ultimately, BI is about having access to the right information at the
right time.
76
THE EFFECTIVE STEPS IN THE DECISION-
MAKING PROCESS THAT USES BI
Gathering information
If you’re like most of us, you’re using
spreadsheets and dashboards to visualize
and present data on the Web.
But some businesses are turning to
dashboards to make smart decisions that
improve operations, enhance employee
satisfaction, process more data, and have
greater visibility into the bottom line.
77
Design and analyze
Routinely encountering data requires a
person to make decisions on how to process
the data.
It can be viewed as a few steps involved in
this process.
First, it involves the analysis of data, followed
by the identification of the most appropriate
decision- making model.
The implementation comes after the analysis
of data.
The data analysis process is then repeated,
and the model is altered based on the latest
78
information or discoveries, if necessary.
Select and implement using ad hoc
query, what-if, and forecasting
The importance of data in business is beyond
dispute. But, the problem is that most companies
are not making the most of the information
available to them.
What they have is not necessarily the right data
and is not being put to its best use.
Ad hoc queries, what-if scenarios, and forecasting
are particularly effective tools for uncovering
hidden information.
It can influence risk and make decisions. Ad-hoc
query, What-If, and Forecasting are three terms
used in business intelligence to help make
decisions.
79
Do evaluations using the
vital tools
Effective decision-making is critical to business
success. The good news is that business intelligence
tools can help you make better decisions.
Using dashboards and reports can help you identify
key performance indicators, make comparisons
between data sets.
Also, identify opportunities to improve your business. If
you have any experience managing a business, you
know that these vital tools are a necessity, providing
you with the information you need to make business
decisions.
These tools can be used for assessing your
performance, motivating staff, and improving the
customer experience.
80
Separate the components that
are not related
At one point in time, every decision we make
appears to have little connection to anything else.
This is because we are able to separate decisions
into distinctly different components, such as the
sales forecast, the profit, the hiring policy, and so
on.
The trick is to recognize that these components are
inseparable. Moreover, a decision made without the
sales forecast is like an engine without the air.
A decision made without a profit is like an engine
without fuel.
A decision made without a hiring policy is like an
engine without the air.
81
Increase the coherence
Business intelligence is a technology that
extracts and analyzes historical data to
help organizations make better decisions.
It is a key asset for businesses that use it to
gain value from all their data and to make
better decisions on how to spend their
resources.
82
There is a trend among different organizations in
today’s world to adopt the use of Business
Intelligence.
The use of business intelligence in these
organizations helps in making decisions faster and
more effectively.
With the help of business intelligence, organizations
can keep track of their processes.
Moreover, the performance of their business, and
the consumption of resources and get insights into
the organization’s current situation.
Business intelligence solutions help in making more
effective decisions.
83
Decision Support Systems
Broadly speaking, a decision support system
(DSS) is an analytics software program
used to gather and analyze data to inform
decision making.
There are many different types of decision
support systems, from modern business
intelligence which uses AI and machine
learning to suggest insights and analyses for
humans to perform, to model-based DSS
systems which use predefined criteria to
perform automated calculations and deliver
best-case decisions.
For all types, DSS is used in timely problem
84
solving to improve efficiency and streamline
Traditional vs Modern DSS
Traditional DSS: Historically, DSS and BI
tools relied on preconfigured, historical data
with no ability to drive real-time decisions
and action. With this approach, decisions are
made based on the past.
Modern DSS: New tools and processes allow
for “active intelligence”, a state of continuous
intelligence with an end-to-end analytics data
pipeline delivering real- time, up-to-date
information designed to trigger immediate
insights and actions.
85
DSS Characteristics
Prior to decision support systems,
organizational leaders relied heavily on a
combination of their experience and
professional training, and applied those to
thoughtful use of the advanced insights
generated by a data analytics platform.
Decision support systems systematize that
by taking organizational data, analyzing it,
and presenting it for use in company decision
making.
86
Categories of decision support
systems
Data-driven DSS
A data-driven DSS gives users access to a
large amount of internal and external
data.
This DSS will query a database using the
web, an external server or a company's
mainframe.
It relies on data mining to provide patterns
and information about the data being
assessed.
87
Users rely on data-driven decision support systems to
make decisions about businesses, inventories and
products.
Managers might find data-driven decision support
systems most helpful when analyzing current and
historical data to report on the conditions of a
department or the business.
CEOs, managers and staff might use a data-driven DSS.
Software examples of a data-driven DSS include:
Geographic Information Systems (GIS)
File drawer systems
Executive information systems
Computer-based databases with query systems
88
Model-driven DSS
A model-driven DSS allows a user to analyze and
manipulate specific models of data, such as
statistics, finances or scheduling.
These decision support systems are specific to the
type of model the user wants to interact with and
typically offer less data than other DSS types.
They analyze scenarios and data to allow the user to
manipulate a model, such as creating a work schedule.
They might use simple analysis tools or complex
statistics, depending on the model's purpose and the
user's needs.
Managers, staff and third parties who interact with a
business might use a model-driven DSS.
89
Software examples of a model-driven DSS
include:
Scheduling software
Financial modeling
Decision analysis modeling
Optimization software
90
Knowledge-driven DSS
With a knowledge-driven DSS, a knowledge-management
system monitors continually updated data about an
organization to support decisions.
The DSS uses diagnosis, prediction, interpretation and
classification to recommend actions consistent with the
business.
A knowledge-driven DSS can be helpful to managers because it
performs tasks faster than a human might. They can also help
consumers decide which products and services to buy. This kind
of DSS often relies on a data-mining component.
Managers, staff and external users, such as customers, might
use a knowledge- driven DSS.
Software examples of a knowledge-driven DSS include:
Software that identifies new or current customers who might be
interested in products
Product selection software
91
Document-driven DSS
A document-driven DSS retrieves unstructured
information from a variety of electronic sources. It
searches web pages, documents in databases and other
information based on a user's search terms to gather
relevant information.
A document-driven DSS might be specific to a business'
private files or as broad as a common internet search
engine.
Anyone using a database's search function or an internet
search engine is using a document-driven DSS.
Software examples of a document-driven DSS include:
Search engines
Database search software
Article databases with search functions
92
Communication-driven DSS
A communication-driven DSS uses tools to
support communication and collaboration.
Email is an example of a communication-
driven DSS.
This type of DSS includes share tools that
allow multiple people to work on a project
at once and software that allows for digital
communication between people.
93
It improves a shared project's efficiency and effectiveness and
can help facilitate meetings and conversations.
Internal team members, virtual business meeting hosts and
online chat and video meeting software users can benefit from
a communication-driven DSS.
A communication-driven DSS might also be called a group DSS.
A communication- driven DSS focuses on communication and
collaboration, while a group DSS helps groups streamline the
decision-making process.
A communication-driven DSS, for example, might help two
people who work for the same company on different shifts
share documents. It might also allow employees on opposite
sides of the country to meet virtually to view a shared file.
Software examples of a communication-driven DSS include:
Chat and instant messaging services
Collaboration software, such as document sharing and editing
software
94
Intelligent DSS
Any DSS with artificial intelligence in its design is an example of
intelligent DSS (IDSS).
Within an IDSS, AI does data mining and processing to filter
through large datasets.
An IDSS is designed to offer similar services to a human
consultant.
They're programmed to identify patterns and trends to
guide decision-making.
They can also resolve problems and analyze solutions. AI
components add advantages, such as fuzzy logic and machine
learning, to a DSS.
Managers, diagnosticians and other decision-makers might use
an IDSS.
Software examples of an intelligent DSS include:
Smart manufacturing systems
Medical diagnostic systems
95
Manual DSS
A manual DSS relies on individuals instead of
computers to support decision- making.
A group of experts analyses the strengths,
weaknesses, opportunities and threats of their
organization or project.
A manual DSS is much slower than a computer-
based DSS, but certain types of analysis still need a
human eye at every step.
Economists, executives and managers might use a
manual DSS.
Examples of manual DSS include:
Cost-benefit analyses
Decision matrixes
96
Hybrid DSS
A hybrid DSS combines parts of multiple DSS types to create a
complex outcome.
Large issues in industries such as finance and health care might
require the tools of multiple decision support systems, such as a
knowledge-driven DSS and a data-driven DSS.
A hybrid DSS might use additional software to help these
components work together.
Sometimes a human analyses and combines the results of each
DSS.
A hybrid DSS might also describe a system in which a human works
with a DSS to extract and manipulate data.
Medical professionals, financial decision-makers and researchers
might use a hybrid DSS.
Software examples of a hybrid DSS include:
Risk assessment
Clinical DSS
Web-based DSS
97
Decision Support System
Examples.
GPS route planning.
Crop-planning.
Clinical DSS
98
Business Intelligence
Business intelligence combines business
analytics, data mining, data visualization,
data tools and infrastructure, and best
practices to help organizations make more
data-driven decisions.
99
Techniques Used In Business Intelligence
1. Data Visualization
When data is stored as a set or matrix of
numbers, it is precise but difficult to interpret.
When looking at more than one dimension of
the data, this becomes even harder. Creating
charts, graphics or dashboards from the data
makes it much easier for people to understand
and interpret.
2. Data Mining
Data mining is a computer supported method to
reveal previously unknown or unnoticed
relations among data entities.
10
0
Example
In retail: shopping basket analysis can
examine products consumers buy together
in order to better promote other products.
In banking: using an automated risk
assessment based on historical data to
evaluate whether a customer is likely to
pay back a loan.
In insurance: mining behavioural and
historical data to detect fraud.
In health: analysis of complications and
common diseases may help to reduce risk.
10
1
Reporting
One area where BI tools commonly help
business users is by designing, scheduling
and generating reports, for example regular
performance, sales or marketing reports.
Reports output by BI tools efficiently gather
and present information to support the
management, planning and decision
making process.
Once the report is designed it can be
automatically run at set intervals and sent
to a predefined distribution list so key
people can see regularly updated numbers.
10
2
Time-Series Analysis And
Predictive Techniques
Nearly all data warehouses and all
enterprise data have a time dimension.
For example, product sales, phone calls,
patient hospitalizations, etc.
Time-series analysis can reveal changes in
user behaviour over time, relationships
between sales of different products, or
changes in sales figures based on
marketing promotions.
Historic data can also be used to
extrapolate and try to predict future trends,
10
outcomes or financial results.
3
Online Analytical Processing
(OLAP)
OLAP is best known for the OLAP-cubes which provide a
visualization of multidimensional data.
OLAP cubes display dimensions on the cube edges (e.g. time,
product, customer type, customer age etc.).
The values in the cube represent the measured facts (e.g. value
of contracts, number of sold products etc.).
The user can navigate through OLAP cubes using drill-up, drill-
down and drill-across features.
The drill-up functionality enables the user to easily zoom out to
more coarse-grained details.
Conversely, drill-down displays the information with more
details.
Finally, drilling-across means that the user can navigate to
another OLAP cube to see the relations on another dimension(s).
All the functionality is provided in real-time.
10
4
Statistical Analysis
Statistical analysis uses mathematic
foundations to qualify the significance and
reliability of the observed relations.
The most interesting features are
distribution analysis, confidence intervals
(for example for changes in user
behaviours, etc.)
Statistical analysis is used for devising and
analysing the results from data mining.
10
5
The Necessary BI Skills
Data Analysis: Most BI skills and intelligence analyst-
related skills are about using data to make better decisions.
You need to be good at examining many different sources of
data and then making accurate conclusions about them.
Problem-solving: BI isn’t just about analyzing data; it’s also
about creating business strategies and solving real-world
business problems with that data. For example, you could be
the one to extract actionable insights from specific retail KPIs
that need to be visualized and presented during a meeting.
Specific industry knowledge: While some of this can and
will be learned on the job, you need to have a solid grasp of
the industry’s dynamics, particularly the areas of the field
that you’re looking to work in. Over time, you’ll want to
become an expert in your industry as this will increase your
ability to connect data with business problem-solving.
10
6
Communication skills: In addition to acquiring intelligence
analyst-related skills, you’ll need to be able to communicate
your findings effectively to the other professionals you’ll be
working with. To some extent, if you work in back-end BI, you
won’t need to communicate quite as much. However, if you
work in the front- end, you’ll be responsible for communicating
technical concepts to non-technical people. This kind of role
requires excellent communication skills.
Data visualization: Expanding on the point above, in order to
ensure good communication you will also need to have data
visualization skills. Visualizations are the best tools to make
trends and general insights understandable. Being able to
clearly see how the data changes in time is what makes it
possible to extract relevant conclusions from it. For this purpose,
you should be able to differentiate between various charts and
report types as well as understand when and how to use them to
benefit the BI process.
10
7
Advanced vision and attention to detail: By its
very nature, business intelligence is incredibly detail-
oriented. As a BI analyst or developer, you'll often work
with the smallest fragment of information with the
objective of turning it into actionable insight. You will
need a great deal of forward-thinking vision and the
ability to pay very close attention to detail to succeed
in the fast-paced world of BI.
Statistical analysis: Statistical knowledge is another
important skill especially if you want to become a BI
analyst. Understanding various statistical components
such as mean, median, range, variance, and others,
can enable you to go deeper into the data and extract
relevant conclusions from it.
10
8
Programming knowledge: On a more technical side of
things, having programming language knowledge can also
be a very valuable skill when it comes to pursuing a career
in BI. Many solutions require the use of different
programming languages to perform advanced analysis
such as R, Python, Javascript, just to name a few, and
knowing them can significantly enhance your skillset.
Technical notion: Our next BI skill is not fundamental, but
it can certainly make you a more complete and prepared
professional. Business intelligence is an industry that
highly relies on technology and having a technical notion of
how to manage these technologies can be a plus. With this,
we do not mean that you need to know how to use every
tool in the market, but understanding how these
technologies can work to your advantage.
10
9
Business acumen: To thrive in a business
intelligence career, you will need to
possess a swift ability to understand your
company’s business model and how to
tailor your efforts to not only gain
maximum value from your key performance
indicators (and the KPI management
process) but also make strategic decisions
that will help your organization succeed on
a continual basis.
11
0
Benefits of business intelligence
Data clarity
Increased efficiency
Better customer experience
Improved employee satisfaction
11
1
How to develop a business intelligence
strategy
A BI strategy is your blueprint for success.
You’ll need to decide how data is used,
gather key roles, and define responsibilities
in the initial phases. It may sound simple at
a high level; however, starting with
business goals is your key to success.
11
2
How to create a BI strategy from
the ground up:
Know your business strategy and goals.
Identify key stakeholders.
Choose a sponsor from your key
stakeholders.
Choose your BI platform and tools.
Create a BI team.
Define your scope.
Prepare your data infrastructure.
Define your goals and roadmap.
11
3
Advantages of BI include:
Data visibility
Accurate reports
Streamlined processes
Disadvantages of BI include:
Initial cost
User resistance
Data skills gap
11
4
OLAP
OLAP is an acronym for Online Analytical
Processing. OLAP performs
multidimensional analysis of business data
and provides the capability for complex
calculations, trend analysis, and
sophisticated data modeling.
11
5
OLAP offers five key benefits:
Business-focused multidimensional data
Business-focused calculations
Trustworthy data and calculations
Speed-of-thought analysis
Flexible, self-service reporting
11
6
Characteristics of OLAP
It defines which the system targeted to
deliver the most feedback to the client
within about five seconds, with the
elementary analysis taking no more than
one second and very few taking more than
20 seconds.
11
7
Analysis
It defines which the method can cope with any
business logic and statistical analysis that is relevant
for the function and the user, keep it easy enough for
the target client. Although some pre programming
may be needed we do not think it acceptable if all
Share.
It defines which the system tools all the security
requirements for understanding and, if multiple write
connection is needed, concurrent update location at
an appropriated level, not all functions need customer
to write data back, but for the increasing number
which does, the system should be able to manage
multiple updates in a timely, secure manner.
11
8
Multidimensional
This is the basic requirement. OLAP system
must provide a multidimensional
conceptual view of the data, including full
support for hierarchies, as this is certainly
the most logical method to analyse
business and organizations.
Information
The system should be able to hold all the
data needed by the applications.
Data sparsity should be handled in an
efficient manner.
11
OLAP Operations in the Multidimensional
9
Roll-Up
The roll-up operation (also known as drill-up
or aggregation operation) performs
aggregation on a data cube, by climbing
down concept hierarchies, i.e., dimension
reduction. Roll-up is like zooming-out on the
data cubes.
When a roll-up is performed by dimensions
reduction, one or more dimensions are
removed from the cube.
12
0
12
1
Drill-Down
The drill-down operation (also called roll-down) is the
reverse operation of roll-up.
Drill- down is like zooming-in on the data cube. It
navigates from less detailed record to more detailed
data.
Drill-down can be performed by either stepping down a
concept hierarchy for a dimension or adding additional
dimensions.
Figure shows a drill-down operation performed on the
dimension time by stepping down a concept hierarchy
which is defined as day, month, quarter, and year.
Drill-down appears by descending the time hierarchy
from the level of the quarter to a more detailed level of
the month.
12
2
12
3
Slice
A slice is a subset of the cubes
corresponding to a single value for one or
more members of the dimension.
For example, a slice operation is executed
when the customer wants a selection on
one dimension of a three-dimensional cube
resulting in a two-dimensional site.
So, the Slice operations perform a selection
on one dimension of the given cube, thus
resulting in a sub cube.
12
4
12
5
Dice
The dice operation describes a sub cube by
operating a selection on two or more
dimension.
12
6
Pivot
The pivot operation is also called a rotation.
Pivot is a visualization operations which
rotates the data axes in view to provide an
alternative presentation of the data.
It may contain swapping the rows and
columns or moving one of the row-
dimensions into the column dimensions.
12
7
12
8
12
9
Difference between OLTP and
OLAP
OLTP (On-Line Transaction Processing) is
featured by a large number of short on-line
transactions (INSERT, UPDATE, and
DELETE).
The primary significance of OLTP operations
is put on very rapid query processing,
maintaining record integrity in multi-access
environments, and effectiveness consistent
by the number of transactions per second.
13
0
OLAP (On-line Analytical Processing) is
represented by a relatively low volume of
transactions.
Queries are very difficult and involve
aggregations. For OLAP operations,
response time is an effectiveness measure.
OLAP applications are generally used by
Data Mining techniques.
In OLAP database there is aggregated,
historical information,
13
1
13
2
Types of OLAP
13
3
Relational OLAP (ROLAP)
Server
These are intermediate servers which stand in between a relational
back-end server and user frontend tools.
They use a relational or extended-relational DBMS to save and handle
warehouse data, and OLAP middleware to provide missing pieces.
ROLAP servers contain optimization for each DBMS back end,
implementation of aggregation navigation logic, and additional tools
and services.
ROLAP technology tends to have higher scalability than MOLAP
technology.
ROLAP systems work primarily from the data that resides in a
relational database, where the base data and dimension tables are
stored as relational tables. This model permits the multidimensional
analysis of data.
This technique relies on manipulating the data stored in the relational
database to give the presence of traditional OLAP's slicing and dicing
functionality. In essence, each method of slicing and dicing is
equivalent to adding a "WHERE" clause in the SQL statement.
13
4
13
5
Multidimensional OLAP (MOLAP)
Server
A MOLAP system is based on a native
logical model that directly supports
multidimensional data and operations.
Data are stored physically into
multidimensional arrays, and positional
techniques are used to access them.
One of the significant distinctions of
MOLAP against a ROLAP is that data are
summarized and are stored in an optimized
format in a multidimensional cube, instead
of in a relational database.
In MOLAP model, data are structured into
13
6
proprietary formats by client's reporting.
13
7
Hybrid OLAP (HOLAP) Server
HOLAP incorporates the best features of
MOLAP and ROLAP into a single
architecture.
HOLAP systems save more substantial
quantities of detailed data in the relational
tables while the aggregations are stored in
the pre-calculated cubes.
HOLAP also can drill through from the cube
down to the relational tables for delineated
data. The Microsoft SQL Server 2000
provides a hybrid OLAP server.
13
8
13
9
Relational Online Analytical
Processing (ROLAP)
ROLAP servers are placed between
relational backend server and client front-
end tools.
It uses relational or extended DBMS to
store and manage warehouse data. ROLAP
has basically 3 main components:
Database Server, ROLAP server, and Front-
end tool.
14
0
Advantages of ROLAP –
ROLAP is used for handle the large amount
of data. ROLAP tools don’t use pre-
calculated data cubes.
Data can be stored efficiently.
ROLAP can leverage functionalities inherent
in the relational database.
14
1
Disadvantages of ROLAP –
Performance of ROLAP can be slow.
In ROALP, difficult to maintain aggregate
tables. Limited by SQL functionalities.
14
2
Multidimensional Online
Analytical Processing (MOLAP)
MOLAP does not uses relational database to
storage.
It stores in optimized multidimensional
array storage. The storage utilization may
be low With multidimensional data stores.
Many MOLAP server handle dense and
sparse data sets by using two levels of data
storage representation. MOLAP has 3
components : Database Server, MOLAP
server, and Front-end tool.
14
3
Advantages of MOLAP –
MOLAP is basically used for complex
calculations.
MOLAP is optimal for operation such as
slice and dice.
MOLAP allows fastest indexing to the pre-
computed summarized data.
Disadvantages of MOLAP –
MOLAP can’t handle large amount of data.
In MOLAP, Requires additional investment.
Without re-aggregation, difficult to change
14
4 dimension.
Hybrid Online Analytical
Processing (HOLAP)
Hybrid is a combination of both ROLAP and
MOLAP. It offers functionalities of both
ROLAP and as well as MOLAP like faster
computation of MOLAP and higher
scalability of ROLAP.
The aggregations are stored separately in
MOLAP store. Its server allows storing the
large data volumes of detailed information.
14
5
Advantages of HOLAP –
HOLAP provides the functionalities of both
MOLAP and ROLAP.
HOLAP provides fast access at all levels of
aggregation.
Disadvantages of HOLAP –
HOLAP architecture is very complex to
understand because it supports both
MOLAP and ROLAP.
14
6
14
7
OLAP offers five key benefits:
Business-focused multidimensional data
Business-focused calculations
Trustworthy data and calculations
Speed-of-thought analysis
Flexible, self-service reporting
14
8
Advantages
Business-centered multidimensional
information.
Business-centered figuring’s.
Dependable information and figuring’s.
Speed-of-thought examination.
Adaptable, self-administration detailing.
14
9
Disadvantages
Pre-demonstrating is an absolute necessity.
As to business information, the traditional OLAP
tools don't take into consideration quick
investigation without pre-demonstrating.
Extraordinary reliance on IT.
Helpless calculation capacity.
Shy Interactive examination capacity.
Slow in responding.
Theoretical model.
Extraordinary, expected danger.
15
0
Analytic functions
Analytical functions are one of the most
popular tools among BI/Data analysts for
performing complex data analysis.
These functions perform computations over
multiple rows and return the multiple rows
as well.
15
1
analytic_function_name([argument_list])
OVER (
[PARTITION BY partition_expression,…]
[ORDER BY sort_expression, … [ASC|
DESC]])
There are three parts to this syntax,
namely function, partition by and order by.
15
2
analytic_function_name: name of the
function — like RANK(), SUM(), FIRST(), etc
partition_expression: column/expression on
the basis of which the partition or window
frames have to be created
sort_expression: column/expression on the
basis of which the rows in the partition will
be sorted
15
3
15
4
Trendline definition is a function that fits
a linear or exponential model and returns
the fitted values or model.
The numeric_expr represents the Y value
for the trend. The series (time columns)
represent the X value.
15
5
Syntax:
TRENDLINE(numeric_expr, ([series]) BY ([partitionBy]),
model_type, result_type)
numeric_expr represents the data to the trend. This is the
Y-axis, usually a measure column.
series is the X-axis is a list of numerics or time dimension
attribute columns.
partitionBY is a list of dimension attribute columns that are
in the view but not on the X-axis.
model_type is one of the following (‘LINEAR’,
‘EXPONENTIAL’).
result_type is one of the following (‘VALUE’, ‘MODEL’).
‘VALUE‘ will return all the regression Y values given X in the
fit. ‘MODEL’ will return all the parameters in a JSON format
string.
15
6
TRENDLINE((Sales),(Order Date) BY
(Product Sub Category), 'LINEAR', 'VALUE')
numeric_expr = Sales
series = Order Date (Time dimension)
partitionBy = Product Sub Category
model_type = LINEAR
result_type = VALUE
15
7
CLUSTER((dimension_expr1 , ...
dimension_exprN), (expr1, ... exprN),
output_column_name, options,
[runtime_binded_options])
dimension_expr represents the list of
dimensions, e.g., (productID, companyID),
to be clustered.
expr represents the list of dimension
attributes or measures to be used to
cluster the dimension_expr.
15
8
output_column_name is the output column.
The valid values are ‘clusterId’,
‘clusterName’, ‘clusterDescription’,
‘clusterSize’, ‘distanceFromCenter’,
‘centers’.
options mean the string list of name=value
pairs separated by ‘;’. The value can
include %1 … %N, specified using
runtime_binded_options.
runtime_binded_options is an optional
comma-separated list of run-time binded
columns or literal expressions.
15
9
OUTLIER((dimension_expr1 , ... dimension_exprN), (expr1, .. exprN),
output_column_name, options, [runtime_binded_options]))])
dimension_expr indicates a list of dimensions.
expr represents a list of dimension attributes or measures to find
outlier.
output_column_name indicates the output column name. Valid
values are 'isOutlier' and 'distance'.
options indicates a string list of name/value pairs separated by a
semi-colon (;).
The value can include %1 ... %N, which can be specified using
runtime_binded_options.
runtime_binded_options is an option comma separated list (,) of run-
time binded columns and or literal expressions.
REGR(y_axis_measure_expr, (x_axis_expr), (category_expr1, ...,
category_exprN), output_column_name, options,
[runtime_binded_options])
16
0