DataAnalytics Chap 3

Chapter 3 discusses data mining, which involves extracting knowledge from large datasets through a series of steps including data cleaning, integration, selection, transformation, mining, evaluation, and presentation. It outlines the advantages and disadvantages of data mining, such as its efficiency in decision-making and privacy concerns. The chapter also covers mining frequent patterns, associations, and correlations, along with concepts of characterization and discrimination in data analysis.

Uploaded by

rushabhghotkule129

CHAPTER – 3

MINING FREQUENT PATTERNS, ASSOCIATION AND CORRELATIONS

Data Mining:

• Data mining is the technique of extracting information from huge sets of data.
In other words, data mining is the procedure of mining knowledge from data.
• Data mining is defined as “extracting or mining knowledge from massive
amounts of data.”
The essential steps in the process of knowledge discovery:
1. Data Cleaning: In this step, noisy and inconsistent data are removed or
corrected.
2. Data Integration: In this step, multiple data sources are combined.
3. Data Selection: In this step, data relevant to the analysis tasks are retrieved from
the dataset.
4. Data Transformation: In this step, data are transformed or consolidated into
forms appropriate for mining by performing aggregation or summary operations.
5. Data Mining: In this step, intelligent methods are applied in order to extract data
patterns.
6. Pattern Evaluation: In this step, the extracted data patterns are evaluated.
7. Knowledge Presentation: In this step, the mined knowledge is presented to the user.
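The seven steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two data sources, the field names, the pair-counting "mining" step, and the 0.5 support threshold are all hypothetical and do not come from the chapter.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data from two sources (names and values are illustrative only).
store_a = [
    {"id": 1, "items": "bread, butter, milk"},
    {"id": 2, "items": "bread, butter"},
    {"id": 3, "items": None},  # noisy record: missing item list
]
store_b = [
    {"id": 4, "items": "Bread, Milk"},
    {"id": 5, "items": "bread, butter, jam"},
]

def clean(records):
    """Step 1: drop records whose item list is missing."""
    return [r for r in records if r["items"]]

def integrate(*sources):
    """Step 2: combine multiple sources into one list."""
    return [r for source in sources for r in source]

def select(records):
    """Step 3: keep only the field relevant to the analysis."""
    return [r["items"] for r in records]

def transform(item_strings):
    """Step 4: turn each record into a set of lowercase item names."""
    return [{i.strip().lower() for i in s.split(",")} for s in item_strings]

def mine(transactions):
    """Step 5: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts

def evaluate(pair_counts, n_transactions, min_support=0.5):
    """Step 6: keep only pairs whose support meets the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in pair_counts.items()
        if count / n_transactions >= min_support
    }

def present(patterns):
    """Step 7: render the surviving patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]

transactions = transform(select(integrate(clean(store_a), clean(store_b))))
patterns = evaluate(mine(transactions), len(transactions))
report = present(patterns)
```

With this toy data, `report` lists the two item pairs that appear in at least half of the four usable transactions.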
Advantages of Data Mining:
• Data mining is a quick process that makes it easy for new users to analyze enormous amounts of
data in a short time.
• It enables organizations to obtain knowledge-based data.
• Compared to other statistical data applications, data mining is efficient and cost-effective.
• It helps in the decision-making process.
• It helps to predict future trends.

Disadvantages of Data Mining:

• It can violate the privacy of users, which raises safety and security concerns for them.
• Protecting user identity is a major issue when using data mining.
• Data mining techniques are not 100% accurate and may cause serious consequences in certain
conditions.
• It requires its own space as well as maintenance. This can greatly increase the implementation cost.
What Kinds of Patterns Can Be Mined?

• Frequent patterns are patterns that occur frequently in data. Finding frequent patterns plays an essential
role in mining associations, correlations, and many other relationships among data.
• There are a number of data mining functionalities, such as characterization and
discrimination; the mining of frequent patterns, associations, and correlations;
classification and regression; clustering analysis; and outlier analysis.
• In general, these tasks can be classified into two categories, namely descriptive and
predictive.
• Descriptive mining tasks characterize properties of the data in a target dataset.
• Predictive mining tasks perform induction on the current data in order to make predictions.
Class/Concept Description:
• Class/Concept description refers to associating data with classes or concepts.
• Class groups similar items into categories (e.g., "computers" or "printers"),
focusing on what the data represents.
• Concept describes characteristics or behaviors (e.g., "big spenders" or "budget
spenders"), focusing on how the data behaves or is perceived.
• Classes organize data for structure, while concepts provide insights for decisions or
patterns. For example, businesses use classes to group products and concepts to
target customer behavior.
Characterization and Discrimination:
Data Characterization:
• It refers to summarizing the data of the class under study. This class under study is called the target class.
• It is a summarization of the general characteristics or features of a target class of data.
• The data corresponding to the user-specified class are typically collected by a query.
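A minimal sketch of characterization, assuming a hypothetical list of student records: a filter plays the role of the query that collects the target class, and a few averages summarize its general features. The field names and values are illustrative only.

```python
from statistics import mean

# Hypothetical student records; field names and values are illustrative.
students = [
    {"name": "A", "score": 91, "study_hours": 20},
    {"name": "B", "score": 62, "study_hours": 8},
    {"name": "C", "score": 88, "study_hours": 18},
    {"name": "D", "score": 55, "study_hours": 6},
]

# The "query" that collects the user-specified target class: scores above 85.
target_class = [s for s in students if s["score"] > 85]

# Characterization: summarise the general features of the target class.
summary = {
    "count": len(target_class),
    "avg_score": mean(s["score"] for s in target_class),
    "avg_study_hours": mean(s["study_hours"] for s in target_class),
}
```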

Data Discrimination:
• It refers to comparing a class under study with one or more predefined groups or classes.
• Data discrimination is a comparison of the general features of the target class data objects against the general
features of objects from one or multiple contrasting classes.
• Ex: a university analyzing student performance:
• Target Class: High-performing students (scoring above 85%).
• Contrasting Class: Average-performing students (scoring between 50% and 70%).
• Comparison (Discrimination): High performers spend more time studying, attend extra sessions, and
participate in projects, while average performers spend less time studying and focus only on exams.
• Purpose: By comparing features, the university identifies patterns to help average performers improve
and to better allocate resources for student success.
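The university example can be sketched as a comparison of average feature values between the two classes. The score thresholds follow the example; the records, the field names, and the choice of a simple difference of means are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical records; only the score thresholds come from the example.
students = [
    {"score": 92, "study_hours": 22, "extra_sessions": 5},
    {"score": 88, "study_hours": 18, "extra_sessions": 3},
    {"score": 65, "study_hours": 9, "extra_sessions": 1},
    {"score": 55, "study_hours": 7, "extra_sessions": 0},
]

target = [s for s in students if s["score"] > 85]           # high performers
contrast = [s for s in students if 50 <= s["score"] <= 70]  # average performers

def profile(group, features):
    """Average value of each feature within one class."""
    return {f: mean(s[f] for s in group) for f in features}

features = ["study_hours", "extra_sessions"]
target_profile = profile(target, features)
contrast_profile = profile(contrast, features)

# Discrimination: feature-by-feature gap between the two class profiles.
difference = {f: target_profile[f] - contrast_profile[f] for f in features}
```

A positive value in `difference` means the target class scores higher on that feature than the contrasting class.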
Mining Frequent Patterns, Associations and Correlations:
Mining of Frequent Patterns:
• There are many kinds of frequent patterns, including frequent itemsets, frequent
subsequences, and frequent substructures.
• A frequent itemset is a set of items that often appear together in a transactional dataset.
• Ex.: butter and bread (frequently bought together)
• A frequent subsequence is a sequence of events that often occurs in the same order, such as
the order in which customers tend to make purchases.
• Ex: Landing Page → Product Page → Add to Cart → Checkout.
• A substructure can refer to different structural forms (e.g., graphs, trees) that may be
combined with itemsets or subsequences.
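As a rough illustration of frequent subsequences, the sketch below counts contiguous page runs across hypothetical browsing sessions. Restricting the search to contiguous runs is a simplifying assumption; sequential pattern mining in general also allows gaps between events. The session data and the `min_count` threshold are invented for the example.

```python
from collections import Counter

# Hypothetical clickstream sessions (page names follow the example above).
sessions = [
    ["Landing Page", "Product Page", "Add to Cart", "Checkout"],
    ["Landing Page", "Product Page", "Add to Cart"],
    ["Landing Page", "Product Page"],
    ["Product Page", "Add to Cart", "Checkout"],
]

def frequent_subsequences(sessions, length, min_count):
    """Count contiguous page runs of a given length, at most once per session."""
    counts = Counter()
    for pages in sessions:
        runs = {tuple(pages[i:i + length]) for i in range(len(pages) - length + 1)}
        counts.update(runs)
    return {run: c for run, c in counts.items() if c >= min_count}

frequent_pairs = frequent_subsequences(sessions, length=2, min_count=3)
```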
Mining of Association:
• Associations are used in retail sales to identify items that are frequently
purchased together.
• Association mining is the process of uncovering relationships among data and
determining association rules.
• Ex: 70% of the time milk is sold with bread, and only 30% of the time biscuits are sold
with bread.

Mining of Correlations:
• It is a kind of additional analysis performed to uncover interesting statistical
correlations between associated attribute-value pairs or between two itemsets, in order to
determine whether they have a positive, negative, or no effect on each other.
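The chapter does not name a specific correlation measure. One commonly used option is lift, assumed here purely for illustration: lift(x, y) = P(x and y) / (P(x) × P(y)). A value above 1 suggests a positive correlation, below 1 a negative one, and a value of 1 suggests the items are independent. The transactions are hypothetical.

```python
import math

# Hypothetical transactions, for illustration only.
transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk"},
]

def lift(transactions, x, y):
    """Lift of items x and y: P(x and y) / (P(x) * P(y))."""
    n = len(transactions)
    p_x = sum(x in t for t in transactions) / n
    p_y = sum(y in t for t in transactions) / n
    p_xy = sum(x in t and y in t for t in transactions) / n
    if p_x == 0 or p_y == 0:
        raise ValueError("both items must occur at least once")
    return p_xy / (p_x * p_y)

def correlation_kind(value, tol=1e-9):
    """Classify a lift value as positive, negative, or no correlation."""
    if math.isclose(value, 1.0, abs_tol=tol):
        return "none"
    return "positive" if value > 1.0 else "negative"

bread_butter = correlation_kind(lift(transactions, "bread", "butter"))
bread_milk = correlation_kind(lift(transactions, "bread", "milk"))
```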
Support:
The support of a rule x → y (where x and y are each items or itemsets) is defined as the proportion of
transactions in the dataset which contain both x and y.
Support(x → y) = (No. of transactions which contain both x and y) / (Total no. of
transactions)

Confidence:
The confidence of a rule x → y is the proportion of transactions containing x that also contain y.
Confidence(x → y) = Support(x → y) / Support(x)
Apriori Algorithm:
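The source gives only the heading for this topic. As a hedged illustration, the sketch below follows the commonly described level-wise Apriori idea: find frequent single items, join them into larger candidates, prune any candidate that has an infrequent subset, and repeat. The function name, the data, and the 0.5 threshold are assumptions, and the code favors clarity over efficiency.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update({c: support(c) for c in current})
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical transactions, for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "jam"},
]
frequent_itemsets = apriori(baskets, min_support=0.5)
```

In this toy run, the candidate {bread, butter, milk} is pruned without counting because its subset {butter, milk} is not frequent.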
Frequent Pattern Growth Algorithm:
