Data Warehousing and Mining Question Bank


P.V.P SIDDHARTHA INSTITUTE OF TECHNOLOGY

BRANCH: CSE    REGULATION: PVP23
Course: [Link]    SUBJECT: DATA WAREHOUSING AND DATA MINING
Year and Semester: III Year / I Semester    Subject Code: 23CS3501
QUESTION BANK

UNIT I

Q. NO  PART A  CO  LEVEL
1 Define Data Mining. 1 L1
2 What is a data warehouse? 1 L1
3 List any two differences between OLTP and OLAP. 1 L1
4 What are the types of data that can be mined? 1 L1
5 Define an attribute and give an example. 1 L1
6 What is the purpose of data cleaning? 1 L1
7 What is meant by data discretization? 1 L1
8 Define a data cube. 1 L1
9 List any two real-life applications of data mining. 1 L1
10 What is data reduction? 1 L1
PART B
11 Classify and describe the different types of data that can be mined. 1 L3
12 Normalize the following group of data using the techniques listed below. 1 L3
   200, 300, 400, 600, 1000
   i. Min-max normalization
   ii. Z-score normalization
   iii. Decimal scaling
   a) Write your observations on the above techniques.
13 In what ways are the various data mining functionalities applied to solve real-world problems? 1 L3
14 Identify and discuss the challenges and issues in data mining. 1 L3
15 Illustrate the multi-tiered architecture of a data warehouse with a neat diagram. 1 L3
16 Illustrate OLAP operations by applying examples for roll-up, drill-down, slice, dice, and pivot. 1 L3
17 How are statistical measures used to measure the central tendency of data? 1 L3
18 Suppose that the data for analysis includes the attribute age. 1 L3
   The age values for the data tuples are (in increasing order):
   13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72
   (i) Use min-max normalization to transform the value 45 for age onto the range [0, 1].
   (ii) Use z-score normalization to transform the value 45 for age, where the standard deviation of age is 20.64 years.
19 With a neat sketch, illustrate the KDD (Knowledge Discovery in Databases) process. 1 L3
20 How are data reduction techniques utilized in dimensionality reduction? 1 L3
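
For checking hand calculations in Question 12, the three normalization techniques can be sketched in Python. This is a minimal sketch rather than a model answer: the function names are illustrative, and the z-score assumes the population standard deviation (a sample standard deviation would give slightly different values).

```python
import statistics

data = [200, 300, 400, 600, 1000]  # values from Question 12


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale linearly so the minimum maps to new_min and the maximum to new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Subtract the mean and divide by the population standard deviation (assumption)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer that makes every |v| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


print(min_max(data))
print(z_score(data))
print(decimal_scaling(data))
```

The same `min_max` and z-score formulas apply to the single value 45 in Question 18, using the minimum, maximum, mean, and standard deviation of the age data.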

UNIT II

Q. NO  PART A  CO  LEVEL
1 Define Machine Learning. 1 L1
2 List two real-world applications of Machine Learning. 1 L2
3 What are the main paradigms of machine learning? 1 L1
4 Mention any two stages in a typical Machine Learning process. 1 L1
5 List any two datasets used in machine learning. 1 L1
6 Define proximity measure in machine learning. 1 L1
7 What is Euclidean distance? 1 L2
8 Give one difference between KNN and Weighted KNN. 1 L2
9 Define KNN Regression. 1 L1
10 List any two performance evaluation metrics for classification. 1 L2
PART B
11 Compare and contrast traditional programming with modern machine learning approaches. 2 L3
12 Distinguish the paradigms of machine learning with examples. 2 L3
13 Illustrate the stages of the machine learning process with a neat diagram. 2 L3
14 Describe the different types of datasets used for classification and regression. 2 L2
15 How is machine learning utilized in real-world applications? 2 L3
16 Identify the evaluation measures used for classification and regression. 2 L3
17 Illustrate the K-Nearest Neighbors (KNN) algorithm with an example. 2 L3
18 Apply the K-nearest neighbor classifier to predict whether a patient is diabetic using the features BMI and Age. Assume K = 3. 4 L4
   Test example: BMI = 43.6, Age = 40, Sugar = ?
   Training examples are given in the table below.
   BMI   Age  Sugar
   33.6  50   1
   26.6  30   0
   23.4  40   0
   43.1  67   0
   35.3  23   1
   35.9  67   1
   36.7  45   1
   25.7  46   0
   23.3  29   0
   31.0  56   1
19 Compare and contrast standard KNN with Weighted KNN. 2 L3
20 Apply the K-Nearest Neighbors (KNN) algorithm with Euclidean distance to predict the Fruit Type of Strawberry (Sweetness = 5, Sourness = 5). Analyse the outputs when K = 1, 3, 5. 4 L4
   Fruit       Sweetness  Sourness  Fruit Type
   Lemon       1          9         Sour
   Grapefruit  2          8         Sour
   Orange      3          7         Sour
   Cherry      6          4         Sweet
   Banana      9          1         Sweet
   Grapes      8          2         Sweet
   Strawberry  5          5         ?
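
The KNN procedure asked for in Question 18 can be sketched as follows. This is a minimal sketch under stated assumptions: the features are used unscaled, distance is Euclidean, and the function name `knn_predict` is illustrative rather than taken from any library.

```python
import math
from collections import Counter

# Training examples from Question 18: (BMI, Age) -> Sugar (1 = diabetic, 0 = not diabetic)
train = [
    ((33.6, 50), 1), ((26.6, 30), 0), ((23.4, 40), 0), ((43.1, 67), 0),
    ((35.3, 23), 1), ((35.9, 67), 1), ((36.7, 45), 1), ((25.7, 46), 0),
    ((23.3, 29), 0), ((31.0, 56), 1),
]


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Sort by Euclidean distance; the sort is stable, so ties keep table order.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]


prediction = knn_predict(train, (43.6, 40), k=3)
print(prediction)
```

The same function can be applied to the fruit data in Question 20. There, several fruits are equidistant from Strawberry, so the outcome for K = 3 and K = 5 depends on how ties are broken, which is worth noting in the analysis.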

UNIT III

Q. NO.  PART A  CO  LEVEL
1 Define classification in the context of machine learning. 1 L1
2 What is a decision tree? 1 L1
3 Mention any two attribute selection measures used in decision trees. 1 L2
4 What is overfitting in classification? 1 L2
5 Define entropy. 1 L2
6 What is pruning in decision trees? 1 L1
7 State Bayes Theorem. 1 L1
8 What is Naïve Bayes classifier? 1 L1
9 Define information gain. 1 L2
10 List any two methods to improve classification accuracy. 1 L2
PART B
11 Construct a decision tree using the ID3 algorithm for the following example and classify which patients are at high risk for heart disease. 3 L4
12 Construct a decision tree using the ID3 algorithm for the following example and classify whether the person gets the job offer or not. 3 L4
13 Apply the Naïve Bayes algorithm to the data below and analyze whether the person has flu or not for the given sample. 4 L4
   Data sample: X = (Chills = 'Y', RunnyNose = 'N', Headache = 'No', Fever = 'Y', Flu = ?)
14 Analyze the given data and apply the Naïve Bayes algorithm to the given sample to predict whether an accident will happen or not. 4 L4
   {Weather Condition = Rain, Road Condition = Good, Traffic Condition = Normal, Engine Problem = No, Accident = ?}
15 Write the Naïve Bayes algorithm and then solve the problem below. Consider the given dataset and apply the Naïve Bayes algorithm to predict which type of fruit has the following properties: Fruit = {Yellow, Sweet, Long}. 4 L4
   Fruit   Yellow  Sweet  Long  Total
   Mango   350     450    0     650
   Banana  400     300    350   400
   Others  50      100    50    150
   Total   800     850    400   1200
16 Illustrate different attribute selection measures in decision trees with an example. 2 L3
17 Compare and contrast Bagging and Boosting. 2 L3
18 How does tree pruning contribute to improving the effectiveness of a decision tree during its construction? 2 L3
19 Compare and contrast Naïve Bayes and Decision Tree classifiers. 2 L3
20 Illustrate the techniques used to improve classification accuracy. 2 L3
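
For Question 15, the Naïve Bayes computation on the fruit table can be sketched in Python. This is a minimal sketch, assuming no smoothing is applied (so a zero count gives a zero likelihood) and that the scores are left unnormalized; the function name is illustrative.

```python
# Counts from the table in Question 15: feature counts per fruit, plus class totals.
counts = {
    "Mango":  {"Yellow": 350, "Sweet": 450, "Long": 0,   "Total": 650},
    "Banana": {"Yellow": 400, "Sweet": 300, "Long": 350, "Total": 400},
    "Others": {"Yellow": 50,  "Sweet": 100, "Long": 50,  "Total": 150},
}
grand_total = 1200


def naive_bayes_scores(counts, grand_total, features):
    """Return P(class) times the product of P(feature | class), per class (unnormalized)."""
    scores = {}
    for fruit, row in counts.items():
        score = row["Total"] / grand_total  # prior P(class)
        for feature in features:
            score *= row[feature] / row["Total"]  # likelihood P(feature | class)
        scores[fruit] = score
    return scores


scores = naive_bayes_scores(counts, grand_total, ["Yellow", "Sweet", "Long"])
best = max(scores, key=scores.get)
print(scores, best)
```

Because Mango has a count of 0 for Long, its score is zero without smoothing; applying Laplace smoothing would be a different design choice and would change the numbers.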

UNIT IV

Q. NO.  PART A  CO  LEVEL
1 What is a linear discriminant? 1 L1
2 Define perceptron learning rule. 1 L1
3 What is the role of the activation function in a perceptron? 1 L2
4 Mention one difference between logistic regression and linear regression. 1 L2
5 What is a support vector machine (SVM)? 1 L1
6 Define the margin in the context of SVM. 1 L1
7 What is a linearly non-separable case in classification? 1 L2
8 How is a kernel used in non-linear SVMs? 1 L2
9 Define Logistic regression. 1 L1
10 List two applications for linear regression. 1 L1

PART B
11 Analyze the given data and apply a Support Vector Machine to plot the hyperplane for the following data points on linearly separable data: (1,1), (2,1), (1,-1), (2,-1), (4,0), (5,1), (5,-1), (6,0). 4 L4
12 Illustrate the perceptron learning algorithm with an example. 2 L3
13 Analyze the given data and apply a Support Vector Machine to plot the hyperplane for the following data points on linearly separable data: (2,2), (−2,2), (−2,−2), (2,−2) as positively labelled points and (1,1), (1,−1), (−1,−1), (−1,1) as negatively labelled points. 4 L4
14 Explain how SVM works in the linearly separable case with an example. 2 L3
15 Illustrate the concept of non-linear SVM and how the kernel trick is used. 2 L3
16 Compare and contrast linear SVM and non-linear SVM with examples. 2 L3
17 Compare and contrast logistic regression and linear regression with examples. 2 L3
18 Illustrate the concept of the logistic regression model with its sigmoid function. 2 L3
19 How is linear regression used for prediction? Explain with an example. 2 L3
20 Apply logistic regression to demonstrate binary classification. 2 L3
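
As one possible illustration for Question 12, the perceptron learning rule can be sketched in Python. The training set (the logical AND function), the zero initial weights, the learning rate of 1.0, and the step threshold are all assumptions chosen for illustration; none of them come from the question bank.

```python
# Toy training set (an assumption for illustration): the logical AND function.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def step(net):
    """Threshold activation: output 1 only when the net input is positive."""
    return 1 if net > 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=20):
    """Apply w <- w + lr * (target - output) * x until an epoch has no errors."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            net = sum(w * x for w, x in zip(weights, inputs)) + bias
            delta = target - step(net)
            if delta != 0:
                errors += 1
                weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
                bias += learning_rate * delta
        if errors == 0:
            break  # every sample classified correctly
    return weights, bias


weights, bias = train_perceptron(samples)
predictions = [step(sum(w * x for w, x in zip(weights, inputs)) + bias) for inputs, _ in samples]
print(weights, bias, predictions)
```

The loop stops early because AND is linearly separable; on data that is not linearly separable the rule would keep updating until `max_epochs` is reached.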

UNIT V
Q. NO.  PART A  CO  LEVEL
1 What is clustering in data mining? 1 L1
2 Define partitional clustering. 1 L1
3 What is the basic idea of agglomerative clustering? 1 L1
4 List two differences between hierarchical and partitional clustering. 1 L2
5 Define centroid in K-Means clustering. 1 L1
6 What is soft clustering? 1 L2
7 What is the main difference between K-Means and Fuzzy C-Means clustering? 1 L2
8 What is the goal of Expectation Maximization (EM) clustering? 1 L2
9 Define intra-cluster and inter-cluster similarity. 1 L2
10 State one limitation of K-Means clustering. 1 L2
PART B
11 Compute the hierarchical clustering on the data below and represent the output as a dendrogram. 4 L3
      a     b     c     d     e     f
   a  0
   b  0.12  0
   c  0.51  0.25  0
   d  0.84  0.16  0.14  0
   e  0.28  0.77  0.70  0.45  0
   f  0.34  0.61  0.93  0.20  0.67  0
12 Compute the hierarchical clustering on the given data and represent the output as a dendrogram. 4 L3
      a     b     c     d     e     f
   a  0
   b  0.71  0
   c  5.66  4.95  0
   d  3.61  2.92  2.24  0
   e  4.24  3.54  1.41  1.00  0
   f  3.20  2.50  2.50  0.50  1.12  0
13 Apply K-Means clustering on a two-dimensional dataset containing six data points {(1,1), (2,1), (1,4), (4,3), (5,4), (6,5)} using K = 2 and Euclidean distance, with initial cluster centroids c1 = (1,1) and c2 = (2,1). 4 L3
14 Apply K-Means clustering with three clusters to the points A1(2,10), A2(2,5), A3(8,4), A4(5,8), A5(7,5), A6(6,4), A7(1,2), A8(4,9). 4 L3
15 Illustrate Fuzzy C-Means clustering with an example. 3 L3
16 Compare K-Means and Fuzzy C-Means with an example and list the key differences. 3 L3
17 How does the Expectation-Maximization (EM) clustering process work? Explain with an example. 3 L3
18 How is partitioning of data done in the context of reorganization, compression, and summarization? 3 L3
19 Compare and contrast agglomerative and divisive clustering. 3 L3
20 Illustrate the K-Means clustering algorithm with an example. 3 L3
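
The K-Means iterations asked for in Question 13 can be traced with a short Python sketch. It is a minimal sketch under stated assumptions: ties in distance go to the lower-numbered centroid, an empty cluster keeps its previous centroid, and the function name `k_means` is illustrative.

```python
import math

points = [(1, 1), (2, 1), (1, 4), (4, 3), (5, 4), (6, 5)]  # data from Question 13
initial_centroids = [(1, 1), (2, 1)]                       # c1 and c2 from Question 13


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid (ties go to the lower index).
        new_assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its assigned points.
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid unchanged
        centroids = updated
    return assignments, centroids


assignments, centroids = k_means(points, initial_centroids)
print(assignments, centroids)
```

Cluster indices 0 and 1 correspond to c1 and c2. Passing the eight points of Question 14 with three chosen initial centroids reuses the same function, although that question does not specify the initial centroids.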


Course Coordinators
1. Dr [Link]
2. Ms K.N Divya
3. Ms A Madhuri

HOD,CSE
