Data Management and Business Intelligence

This document provides an outline for a course on foundations of business intelligence. It covers organizing data in traditional file environments and databases, as well as using databases to improve business performance. Key topics include the database approach to data management, entity-relationship diagrams, normalization, SQL queries, data warehousing, online analytical processing, and data mining. The goal is to help students understand how to design databases and leverage tools and analytics to make better business decisions.

Uploaded by

Sam Dari

IS 402E

Information Technology Management

Instructors: Tuba BAKICI, Hadj BARKAT, Arnaud POISSON, Pantelis FRANGOUDIS

Rennes, September 2016

OUTLINE

• Foundations of Business Intelligence:
– Organizing Data in a Traditional File Environment
– The Database Approach to Data Management
– Using Databases to Improve Business Performance and Decision Making
– Managing Data Resources

ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT

• File organization concepts
– Database: Group of related files
– File: Group of records of the same type
– Record: Group of related fields
– Field: Group of characters forming a word, a group of words, or a number
• A record describes an entity (a person, place, or thing on which we store information)
• Attribute: Each characteristic, or quality, describing an entity
– Example: Attributes DATE or GRADE belong to entity COURSE
THE DATA HIERARCHY
A computer system organizes data in a hierarchy that starts with the bit, which represents either a
0 or a 1. Bits can be grouped to form a byte to represent one character, number, or symbol. Bytes
can be grouped to form a field, and related fields can be grouped to form a record. Related records
can be collected to form a file, and related files can be organized into a database.
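
The hierarchy can be sketched in a few lines of Python. This is only an illustration: the field names (`student_id`, `course`, `grade`) and all sample values are assumptions, not taken from the slides.

```python
# Illustrative sketch of the data hierarchy; all names and values are assumed.

# Bit -> byte: eight bits encode one character.
bits = "01000001"
byte_value = int(bits, 2)          # 65
character = chr(byte_value)        # "A"

# Bytes -> field: a group of characters forms a field value.
field = "COURSE".encode("ascii")   # six bytes, one per character

# Fields -> record: related fields describe one entity.
record = {"student_id": 1, "course": "IS 402E", "grade": "A"}

# Records -> file: a group of records of the same type.
course_file = [
    record,
    {"student_id": 2, "course": "IS 402E", "grade": "B"},
]

# Files -> database: a group of related files.
database = {"COURSE": course_file, "STUDENT": []}
```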
ORGANIZING DATA IN A TRADITIONAL FILE ENVIRONMENT

• Problems with the traditional file environment (files maintained separately by different departments)
– Data redundancy:
• Presence of duplicate data in multiple files
– Data inconsistency:
• Same attribute has different values in different files
– Program-data dependence:
• Changes in a program require changes to the data accessed by that program
– Lack of flexibility
– Poor security
– Lack of data sharing and availability
TRADITIONAL FILE PROCESSING

The use of a traditional approach to file processing encourages each functional area in a
corporation to develop specialized applications. Each application requires a unique data file that
is likely to be a subset of the master file. These subsets of the master file lead to data
redundancy and inconsistency, processing inflexibility, and wasted storage resources.
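
A minimal sketch of how redundancy and inconsistency show up when two departments keep separate files. The department files, customer names, and helper functions below are hypothetical and exist only for illustration.

```python
# Two departments keep their own copy of customer data (hypothetical sample data).
accounting_file = [
    {"customer_id": 101, "name": "Ann Lee", "city": "Rennes"},
    {"customer_id": 102, "name": "Bo Chan", "city": "Paris"},
]
sales_file = [
    {"customer_id": 101, "name": "Ann Lee", "city": "Nantes"},  # stale address
    {"customer_id": 103, "name": "Cy Diaz", "city": "Lyon"},
]

def find_redundant_ids(file_a, file_b):
    """Return customer ids stored in both files (data redundancy)."""
    ids_a = {row["customer_id"] for row in file_a}
    ids_b = {row["customer_id"] for row in file_b}
    return sorted(ids_a & ids_b)

def find_inconsistencies(file_a, file_b, attribute):
    """Return ids whose value for `attribute` differs between the two files."""
    by_id_b = {row["customer_id"]: row for row in file_b}
    return [
        row["customer_id"]
        for row in file_a
        if row["customer_id"] in by_id_b
        and row[attribute] != by_id_b[row["customer_id"]][attribute]
    ]

redundant = find_redundant_ids(accounting_file, sales_file)
inconsistent = find_inconsistencies(accounting_file, sales_file, "city")
```

Here customer 101 is stored twice (redundancy), and the two copies disagree on the city (inconsistency).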
THE DATABASE APPROACH TO DATA MANAGEMENT

• Database
– Serves many applications by centralizing data and
controlling redundant data
• Database management system (DBMS)
– Interfaces between applications and physical data files
– Separates logical and physical views of data
– Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Uncouples programs and data
• Enables organization to centrally manage data and data security
HUMAN RESOURCES DATABASE WITH MULTIPLE VIEWS

A single human resources database provides many different views of data, depending on the
information requirements of the user. Illustrated here are two possible views, one of interest to a
benefits specialist and one of interest to a member of the company’s payroll department.
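
One way to picture the two views is with SQL views over a single table, sketched here with Python's built-in `sqlite3` module. The table layout, column names, and sample rows are assumptions; the original figure is not reproduced in this text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        ssn         TEXT,
        health_plan TEXT,
        gross_pay   REAL
    );
    INSERT INTO employee VALUES (1, 'Ann Lee', '000-00-0001', 'Plan A', 4200.0);
    INSERT INTO employee VALUES (2, 'Bo Chan', '000-00-0002', 'Plan B', 3900.0);

    -- View for a benefits specialist: no pay data.
    CREATE VIEW benefits_view AS
        SELECT name, ssn, health_plan FROM employee;

    -- View for the payroll department: no health data.
    CREATE VIEW payroll_view AS
        SELECT name, ssn, gross_pay FROM employee;
""")

benefits_rows = conn.execute("SELECT * FROM benefits_view ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll_view ORDER BY name").fetchall()
```

Both views read the same stored rows, so a change to an employee record is visible to both users without duplicating the data.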
THE DATABASE APPROACH TO DATA MANAGEMENT

• Relational DBMS
– Represents data as two-dimensional tables
– Each table contains data on an entity and its attributes
• Table: grid of columns and rows
– Rows (tuples): Records for different entities
– Fields (columns): Represent an attribute of the entity
– Key field: Field used to uniquely identify each record
– Primary key: Key field that serves as the unique identifier for every record in the table
– Foreign key: Primary key of one table used in a second table as a look-up field to identify records from the original table
Relational Database Tables

A relational database organizes data in the form of two-dimensional tables. Illustrated here are
tables for the entities SUPPLIER and PART showing how they represent each entity and its attributes.
Supplier Number is a primary key for the SUPPLIER table and a foreign key for the PART table.
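
A minimal sketch of the primary key and foreign key relationship using `sqlite3`. The column names, supplier numbers, and part names are assumed sample values, and the helper `try_insert_part` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE supplier (
        supplier_number INTEGER PRIMARY KEY,
        supplier_name   TEXT NOT NULL
    );
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT NOT NULL,
        supplier_number INTEGER NOT NULL REFERENCES supplier(supplier_number)
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A');
    INSERT INTO part VALUES (137, 'Door latch', 8259);
""")

def try_insert_part(part_number, part_name, supplier_number):
    """Insert a part; return False if a key constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO part VALUES (?, ?, ?)",
                (part_number, part_name, supplier_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert_part(150, "Door seal", 8259)      # supplier exists
rejected = try_insert_part(152, "Hinge", 9999)    # no such supplier
duplicate = try_insert_part(137, "Copy", 8259)    # primary key already used
```

The foreign key stops a part from pointing at a supplier that does not exist, and the primary key stops two parts from sharing a number.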
THE DATABASE APPROACH TO DATA MANAGEMENT

• Operations of a Relational DBMS
– Three basic operations used to develop useful sets of data
• SELECT: Creates subset of data of all records that meet stated criteria
• JOIN: Combines relational tables to provide user with more information than available in individual tables
• PROJECT: Creates subset of columns in table, creating tables with only the information specified
THE THREE BASIC OPERATIONS OF A RELATIONAL DBMS

The select, join, and project operations enable data from two different tables to be combined and only
selected attributes to be displayed.
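
The three operations can be sketched in plain Python over tables held as lists of dictionaries. The sample rows and the function names `select`, `join`, and `project` are illustrative assumptions, not part of any DBMS API.

```python
# Hypothetical sample tables, represented as lists of dicts.
PART = [
    {"part_number": 137, "part_name": "Door latch", "supplier_number": 8259},
    {"part_number": 145, "part_name": "Side mirror", "supplier_number": 8444},
    {"part_number": 150, "part_name": "Door seal", "supplier_number": 8259},
]
SUPPLIER = [
    {"supplier_number": 8259, "supplier_name": "Supplier A"},
    {"supplier_number": 8444, "supplier_name": "Supplier B"},
]

def select(table, predicate):
    """SELECT: keep only the rows that meet the stated criteria."""
    return [row for row in table if predicate(row)]

def join(left, right, key):
    """JOIN: combine rows of two tables that share the same key value."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]

def project(table, columns):
    """PROJECT: keep only the named columns."""
    return [{column: row[column] for column in columns} for row in table]

wanted = select(PART, lambda row: row["part_number"] in (137, 150))
combined = join(wanted, SUPPLIER, "supplier_number")
result = project(combined, ["part_number", "part_name", "supplier_name"])
```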
THE DATABASE APPROACH TO DATA MANAGEMENT

• Capabilities of database management systems
– Data dictionary: Automated or manual file storing definitions of data elements and their characteristics
– Data manipulation language: Used to add, change, delete, retrieve data from database
• Structured Query Language (SQL)
• Microsoft Access user tools for generating SQL
EXAMPLE OF AN SQL QUERY

Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150. They
produce a list with the same results as Figure 6-5.
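
The SQL statements from the figure are not reproduced in this text. As a rough sketch of what such a query could look like, the following runs one against an in-memory SQLite database; the table and column names and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_number INTEGER PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (
        part_number     INTEGER PRIMARY KEY,
        part_name       TEXT,
        supplier_number INTEGER REFERENCES supplier(supplier_number)
    );
    INSERT INTO supplier VALUES (8259, 'Supplier A'), (8444, 'Supplier B');
    INSERT INTO part VALUES
        (137, 'Door latch', 8259),
        (145, 'Side mirror', 8444),
        (150, 'Door seal', 8259);
""")

# Select suppliers for parts 137 or 150 by joining PART and SUPPLIER.
QUERY = """
    SELECT part.part_number, part.part_name,
           supplier.supplier_number, supplier.supplier_name
    FROM part
    JOIN supplier ON part.supplier_number = supplier.supplier_number
    WHERE part.part_number IN (137, 150)
    ORDER BY part.part_number
"""
rows = conn.execute(QUERY).fetchall()
```

The `WHERE` clause plays the role of SELECT, the `JOIN ... ON` clause the role of JOIN, and the column list the role of PROJECT.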
THE DATABASE APPROACH TO DATA MANAGEMENT

• Designing Databases
– Conceptual (logical) design: abstract model from business perspective
– Physical design: How database is arranged on direct-access storage
devices

• Design process identifies:
– Relationships among data elements, redundant database elements
– Most efficient way to group data elements to meet business requirements, needs of application programs

• Normalization
– Streamlining complex groupings of data to minimize redundant data
elements and awkward many-to-many relationships
AN UNNORMALIZED RELATION FOR ORDER

An unnormalized relation contains repeating groups. For example, there can be many parts and
suppliers for each order. There is only a one-to-one correspondence between Order_Number and
Order_Date.
NORMALIZED TABLES CREATED FROM ORDER

After normalization, the original relation ORDER has been broken down into four smaller relations.
The relation ORDER is left with only two attributes and the relation LINE_ITEM has a combined, or
concatenated, key consisting of Order_Number and Part_Number.
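
A small sketch of the normalization step in Python. The flat rows, the attribute names other than Order_Number, Order_Date, and Part_Number, and the placement of each attribute in a relation are assumptions, since the figures are not reproduced here.

```python
# Unnormalized ORDER rows: order, part, and supplier details repeat (assumed sample data).
unnormalized = [
    {"order_number": 1, "order_date": "2016-09-01", "part_number": 137,
     "part_name": "Door latch", "unit_price": 22.0, "part_quantity": 2,
     "supplier_number": 8259, "supplier_name": "Supplier A"},
    {"order_number": 1, "order_date": "2016-09-01", "part_number": 150,
     "part_name": "Door seal", "unit_price": 6.0, "part_quantity": 4,
     "supplier_number": 8259, "supplier_name": "Supplier A"},
    {"order_number": 2, "order_date": "2016-09-02", "part_number": 137,
     "part_name": "Door latch", "unit_price": 22.0, "part_quantity": 1,
     "supplier_number": 8259, "supplier_name": "Supplier A"},
]

def normalize(rows):
    """Split the flat rows into four relations so each fact is stored once."""
    orders, parts, suppliers, line_items = {}, {}, {}, {}
    for r in rows:
        # ORDER keeps only Order_Number (the key) and Order_Date.
        orders[r["order_number"]] = {"order_date": r["order_date"]}
        suppliers[r["supplier_number"]] = {"supplier_name": r["supplier_name"]}
        parts[r["part_number"]] = {
            "part_name": r["part_name"],
            "unit_price": r["unit_price"],
            "supplier_number": r["supplier_number"],  # foreign key to SUPPLIER
        }
        # LINE_ITEM uses the concatenated key (Order_Number, Part_Number).
        line_items[(r["order_number"], r["part_number"])] = {
            "part_quantity": r["part_quantity"]
        }
    return orders, parts, suppliers, line_items

orders, parts, suppliers, line_items = normalize(unnormalized)
```

After the split, the supplier name is stored once instead of three times, and each order line is identified by its order number and part number together.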
THE DATABASE APPROACH TO DATA MANAGEMENT

• Entity-relationship diagram
• Used by database designers to document the data
model
• Illustrates relationships between entities

Caution: If a business doesn’t get its data model right, the system won’t be able to serve the business well
AN ENTITY-RELATIONSHIP DIAGRAM

This diagram shows the relationships between the entities SUPPLIER, PART, LINE_ITEM, and ORDER
that might be used to model the database in Figure 6-10.
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Business intelligence infrastructure
– Today includes an array of tools for obtaining useful information from separate systems and from big data
• Contemporary tools:
• Data warehouses
• Data marts
• Hadoop
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Data warehouse:
– Stores current and historical data from many core
operational transaction systems
– Consolidates and standardizes information for use across
enterprise, but data cannot be altered
– Provides analysis and reporting tools
• Data marts:
– Subset of data warehouse
– Summarized or focused portion of data for use by specific
population of users
– Typically focuses on single subject or line of business
COMPONENTS OF A DATA WAREHOUSE

A contemporary business intelligence infrastructure features capabilities and tools to manage and
analyze large quantities and different types of data from multiple sources. Easy-to-use query and
reporting tools for casual business users and more sophisticated analytical toolsets for power users
are included.
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• In-memory computing
– Used in big data analysis
– Uses the computer’s main memory (RAM) for data storage to avoid delays in retrieving data from disk storage
– Can reduce hours/days of processing to seconds
– Requires optimized hardware
• Analytic platforms
– High-speed platforms using both relational and non-relational tools optimized for large datasets
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Analytical tools: Relationships, patterns, trends
– Tools for consolidating, analyzing, and providing
access to vast amounts of data to help users make
better business decisions
• Multidimensional data analysis (OLAP)
• Data mining
• Text mining
• Web mining
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Online analytical processing (OLAP)
– Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time period) is a different dimension
• Example: How many washers were sold in the East in June compared with other regions?
– OLAP enables rapid, online answers to ad hoc queries
MULTIDIMENSIONAL DATA MODEL

The view that is showing is product versus region. If you rotate the cube 90 degrees, the face that
will show is product versus actual and projected sales. If you rotate the cube 90 degrees again, you
will see region versus actual and projected sales. Other views are possible.
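
A minimal sketch of viewing the same facts along different pairs of dimensions. It uses product, region, and month as the dimensions (the figure itself uses actual and projected sales as the third dimension); the sales figures and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, month, units_sold).
FACTS = [
    ("washers", "East", "June", 120),
    ("washers", "West", "June", 90),
    ("washers", "Central", "June", 75),
    ("washers", "East", "May", 100),
    ("bolts", "East", "June", 300),
    ("bolts", "West", "June", 250),
]
DIMENSIONS = ("product", "region", "month")

def roll_up(facts, row_dim, col_dim, **filters):
    """Sum units for one two-dimensional view of the cube, with optional filters."""
    view = defaultdict(int)
    for fact in facts:
        cell = dict(zip(DIMENSIONS, fact[:3]))
        if all(cell[dim] == value for dim, value in filters.items()):
            view[(cell[row_dim], cell[col_dim])] += fact[3]
    return dict(view)

# Washers sold in June by region: East compared with the other regions.
washers_june = roll_up(FACTS, "product", "region", product="washers", month="June")

# "Rotating" the cube: the same facts viewed as product versus month.
product_by_month = roll_up(FACTS, "product", "month")
```

Each call answers an ad hoc question from the same stored facts; only the choice of dimensions and filters changes.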
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Data mining:
– Finds hidden patterns, relationships in datasets
• Example: customer buying patterns
– Infers rules to predict future behavior
– Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
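
As an illustration of the first type, associations, the sketch below counts how often pairs of items are bought together and derives simple rules. The baskets, the 0.6 confidence threshold, and the function name are assumptions; production data mining tools use more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets.
BASKETS = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

def association_rules(baskets, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), together in pair_counts.items():
        for antecedent, consequent in ((a, b), (b, a)):
            # Confidence: share of baskets with the antecedent that also hold the consequent.
            confidence = together / item_counts[antecedent]
            if confidence >= min_confidence:
                support = together / len(baskets)  # share of all baskets with both items
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)

rules = association_rules(BASKETS)
```

A rule such as ("salsa", "chips", 0.5, 1.0) reads: every basket containing salsa also contained chips, and the pair appears in half of all baskets.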
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Text mining
– Extracts key elements from large unstructured data sets
• Stored e-mails
• Call center transcripts
• Legal cases
• Patent descriptions
• Service reports, and so on
– Sentiment analysis software
• Mines e-mails, blogs, social media to detect opinions
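
A deliberately naive sketch of the idea behind sentiment analysis: count positive and negative cue words in a message. The word lists and sample messages are invented for illustration, and real sentiment analysis software relies on far richer language models.

```python
import re

# Tiny hypothetical word lists; real tools use far richer models.
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "rude", "refund"}

def sentiment(text):
    """Classify text as 'positive', 'negative', or 'neutral' by counting cue words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

MESSAGES = [
    "Love the new app, support was fast and helpful!",
    "The washer arrived broken and the agent was rude.",
    "Order received on Tuesday.",
]
labels = [sentiment(message) for message in MESSAGES]
```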
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Web mining
– Discovery and analysis of useful patterns and
information from Web
– Understand customer behavior
– Evaluate effectiveness of Web site, and so on
– Web content mining
• Mines content of Web pages
– Web structure mining
• Analyzes links to and from Web page
– Web usage mining
• Mines user interaction data recorded by Web server
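
Web usage mining can be sketched as parsing web server log lines and summarizing them. The log lines below are made-up examples in Common Log Format, and the regular expression handles only that simple layout.

```python
import re
from collections import Counter

# Hypothetical web server log lines in Common Log Format.
LOG_LINES = [
    '10.0.0.1 - - [01/Sep/2016:10:00:00 +0200] "GET /products HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Sep/2016:10:00:05 +0200] "GET /products HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Sep/2016:10:01:00 +0200] "GET /checkout HTTP/1.1" 200 128',
    '10.0.0.3 - - [01/Sep/2016:10:02:00 +0200] "GET /missing HTTP/1.1" 404 0',
]
# Groups: 1 = client address, 2 = method, 3 = path, 4 = status code.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+$')

def page_views(lines):
    """Count successful (status 200) requests per page."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group(4) == "200":
            counts[match.group(3)] += 1
    return counts

def visitors_per_page(lines):
    """Count distinct client addresses per requested page."""
    seen = {}
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            seen.setdefault(match.group(3), set()).add(match.group(1))
    return {page: len(clients) for page, clients in seen.items()}

views = page_views(LOG_LINES)
visitors = visitors_per_page(LOG_LINES)
```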
USING DATABASES TO IMPROVE BUSINESS PERFORMANCE AND
DECISION MAKING

• Databases and the Web
– Many companies use Web to make some internal databases available to customers or partners
– Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
– Advantages of using Web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add Web interface to system
LINKING INTERNAL DATABASES TO THE WEB

Users access an organization’s internal database through the Web using their desktop PCs and Web
browser software.
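
The three tiers can be sketched without any networking: one function stands in for the web server, one for the application server, and an in-memory SQLite database for the database server. The URL layout `/parts/<number>`, the function names, and the sample data are all assumptions.

```python
import json
import sqlite3

# Database server role: an in-memory SQLite database stands in for the DBMS.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE part (part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL);
    INSERT INTO part VALUES (137, 'Door latch', 22.0), (150, 'Door seal', 6.0);
""")

def lookup_part(part_number):
    """Application server role: turn a request parameter into a parameterized query."""
    row = db.execute(
        "SELECT part_number, part_name, unit_price FROM part WHERE part_number = ?",
        (part_number,),
    ).fetchone()
    if row is None:
        return None
    return dict(zip(("part_number", "part_name", "unit_price"), row))

def handle_request(path):
    """Web server role: map a URL path such as /parts/137 to a status and JSON body."""
    prefix = "/parts/"
    if path.startswith(prefix) and path[len(prefix):].isdecimal():
        part = lookup_part(int(path[len(prefix):]))
        if part is not None:
            return 200, json.dumps(part)
    return 404, json.dumps({"error": "not found"})

status, body = handle_request("/parts/137")
```

The database itself is unchanged by adding this interface; only the request-handling layer in front of it is new.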
