Probability and Statistics Tutorial for Engineers

Q: What is a trimmed mean and how does it help in analyzing data? Provide an example using data set calculations.

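A trimmed mean is the average computed after discarding a fixed percentage of the smallest and largest values, which limits the influence of extreme outliers. Conventions differ on what the percentage refers to: some texts read a 20% trimmed mean as removing 20% of the observations in total (10% from each end), while others, including R's `mean(x, trim = 0.2)`, remove 20% from each end. For the total marks of 10 students, the first convention drops the single lowest and single highest total, and the second drops the two lowest and two highest; the remaining totals are then averaged. Either way, the result is a measure of central tendency that is less swayed by outliers than the ordinary mean.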
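
A minimal Python sketch of the calculation, using the ten total marks obtained by adding each student's Mathematics, Physics and Chemistry marks from Question 7 of the tutorial. The helper name `trimmed_mean` and its `proportion_each_end` parameter are illustrative assumptions, and because texts differ on whether "20%" means 20% in total or 20% from each end, both readings are computed.

```python
from statistics import mean


def trimmed_mean(values, proportion_each_end):
    """Mean after dropping floor(n * proportion_each_end) values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion_each_end)  # values to drop per end
    kept = ordered[k:len(ordered) - k]
    return mean(kept)


# Total marks (Mathematics + Physics + Chemistry) for the 10 students in Question 7.
totals = [157, 149, 178, 187, 100, 109, 197, 159, 177, 163]

plain_mean = mean(totals)                  # 157.6
trim_10_each = trimmed_mean(totals, 0.10)  # drops 100 and 197 -> 159.875
trim_20_each = trimmed_mean(totals, 0.20)  # drops 100, 109, 187, 197

print(plain_mean, trim_10_each, round(trim_20_each, 4))
```

Both trimmed values sit above the plain mean here because the two lowest totals (100 and 109) lie further from the bulk of the data than the two highest.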

Q: Outline the steps involved in conducting a survey from inception to report preparation, and explain the rationale behind this sequence.

The survey process involves: (1) Setting up administrative organization to establish clear roles and protocols; (2) Designing forms to ensure data collection consistency; (3) Selecting, training, and supervising field investigators to maintain data integrity; (4) Controlling the quality of fieldwork and performing field edits for data accuracy; (5) Following up non-responses to maximize response rate; (6) Processing data for analysis; (7) Preparing the final report to communicate findings effectively. The order follows the flow of the data: the organization and forms must exist before investigators can be trained on them, fieldwork and its quality checks come before non-response follow-up, and only complete, checked data is processed and reported.

Q: What is the significance of the harmonic mean in statistical analyses, especially when compared to other averages?

The harmonic mean is the appropriate average when the quantities being averaged are rates or ratios, such as speeds over equal distances or, in finance, price-to-earnings ratios. For positive values it is never larger than the arithmetic mean, and the two coincide only when all values are equal, because the harmonic mean gives more weight to the smaller values. That matters whenever small values dominate the overall result: a slow leg of a journey takes more time, and so pulls the average speed down more than a fast leg of the same length pulls it up.
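
As an illustration of the equal-distance case, the sketch below compares the harmonic and arithmetic means of two speeds with the true average speed (total distance divided by total time). The speeds and the leg length are hypothetical values chosen for easy arithmetic, not data from the tutorial.

```python
from statistics import harmonic_mean, mean

# Hypothetical trip: two legs of equal length driven at different speeds.
speeds_kmh = [40, 60]
leg_km = 120

# True average speed = total distance / total time.
total_time_h = sum(leg_km / s for s in speeds_kmh)          # 3 h + 2 h
true_average = len(speeds_kmh) * leg_km / total_time_h      # 240 km / 5 h

hm = harmonic_mean(speeds_kmh)  # matches the true average speed
am = mean(speeds_kmh)           # overstates it

print(true_average, hm, am)
```

The harmonic mean reproduces the true average of 48 km/h, while the arithmetic mean gives 50 km/h.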

Q: What is the primary difference between qualitative and quantitative variables, and how can you categorize the examples provided?

Qualitative variables describe non-numeric characteristics such as categories or labels, while quantitative variables take numeric values. Among the examples, 'time to travel to work,' 'price for a canteen meal,' 'delivery time for a parcel,' 'height of a child,' and 'wavelength of light' are quantitative and are usually treated as continuous. 'Shoe size' and 'number of goals in a hockey match' are quantitative and discrete. 'Customer satisfaction on a scale from 1 to 10' is strictly an ordinal rating, though it is often treated as a discrete quantitative variable. 'Preferred political party,' 'eye color,' 'gender,' 'blood type,' and 'subject line of an email' are qualitative.

Q: How does the histogram differ from a bar chart, and in what scenario is each more suitable?

Histograms are used to represent the distribution of numerical data and show frequency distributions using bars without gaps. Bar charts display categorical data with gaps between the bars. A histogram is more suitable for visualizing the distribution of a continuous data set like exam scores, while a bar chart is better for comparing discrete categories, such as the number of students in different clubs.

Q: Discuss how understanding sample spaces can simplify probability calculations in complicated scenarios, using an example of card draws from a deck.

The sample space is the set of all possible outcomes, and fixing it first makes clear what must be counted. When four cards are drawn without replacement from a 52-card deck, the sample space consists of all C(52, 4) = 270,725 four-card combinations, each equally likely. The favorable outcomes are the hands containing at least one ace; it is easier to count the complement, the C(48, 4) = 194,580 hands with no ace, so P(at least one ace) = 1 − C(48, 4)/C(52, 4) ≈ 0.2813.
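
The count can be checked with a short Python sketch. It assumes a standard 52-card deck with 4 aces and uses the complement (hands with no ace); the variable names are illustrative.

```python
from math import comb

# Sample space: all unordered 4-card hands from a 52-card deck.
total_hands = comb(52, 4)

# Complement event: all four cards come from the 48 non-aces.
no_ace_hands = comb(48, 4)

p_at_least_one_ace = 1 - no_ace_hands / total_hands

print(total_hands, no_ace_hands, round(p_at_least_one_ace, 4))
```

Rounded to four decimal places this gives 0.2813, the answer quoted for Question 21 of the tutorial.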

Q: How can understanding discrete and continuous variables enhance data analysis, and what impact does incorrect classification have?

Understanding discrete (countable) and continuous (measurable) variables is crucial for choosing appropriate statistical methods and visualizations, for instance histograms for continuous data and bar charts for discrete data. Incorrect classification can lead to misinterpretation or to inappropriate tests, which undermines the reliability of the results. For example, treating shoe size (discrete) as continuous could lead to incorrect assumptions about the data distribution.

Q: What role does the concept of intersection play in probability, particularly in context with set operations like intersections of sets X and Y?

In probability, the intersection of two events is the set of outcomes that satisfy both events at once. For X = {x | x = 3n − 1, n ∈ N, n < 3} and Y = {y | y is a prime number < 7}, taking N = {1, 2, 3, ...} gives X = {2, 5} and Y = {2, 3, 5}, so X ∩ Y = {2, 5}: each element must meet both membership conditions. Intersections of this kind underlie joint and conditional probability calculations.
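
The two sets can be built and intersected directly in Python. This sketch assumes that N denotes the natural numbers starting at 1, and `is_prime` is a small helper written for the example.

```python
def is_prime(k):
    """Trial-division primality check, adequate for small k."""
    if k < 2:
        return False
    return all(k % d != 0 for d in range(2, int(k ** 0.5) + 1))


# X = {3n - 1 : n in N, n < 3}, assuming N starts at 1, so n is 1 or 2.
X = {3 * n - 1 for n in range(1, 3)}

# Y = {y : y is a prime number < 7}.
Y = {y for y in range(2, 7) if is_prime(y)}

common = X & Y  # set intersection

print(sorted(X), sorted(Y), sorted(common))
```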

Q: How can Bayesian theorem be applied to determine student performance classification, and what evidence supports this application?

Bayes' theorem can be applied by treating a classification such as 'passed Mathematics' as the hypothesis and an observation such as 'total score above 120' as the evidence. The prior probability of the hypothesis and the probability of the evidence under each competing hypothesis are estimated as relative frequencies from the marks of the 10 students, and the theorem combines them into the posterior probability P(hypothesis | evidence). The application is verified by checking that this posterior equals the conditional probability obtained by counting directly in the same data.

Q: Describe the process of using the DNase data set in R to introduce a new variable 'product' and explain its potential application.

To add a variable 'product' to the DNase data set in R, note that DNase ships with R's `datasets` package, so it can be loaded with `data(DNase)` rather than read from a file; `read.table()` is only needed for a copy saved to disk. Inspect it with `View(DNase)` in the data editor or `head(DNase)` in the console. The data set has three columns, `Run`, `conc` (concentration), and `density`, so the new variable is created with `DNase$product <- DNase$conc * DNase$density`. It can then be used in further statistical analysis, such as regression models.

The document provides a tutorial on probability and statistics concepts for engineering students. It includes 36 questions covering topics like populations and observations, qualitative and quantitative variables, probability, probability distributions, sampling, and descriptive statistics. The questions are meant to help students learn and practice applying these statistical concepts.

Uploaded by

gagandeepsinghgxd

College of Engineering Pune

(An Autonomous Institute of Government of Maharashtra)

Department of Mathematics
(MA- 21001) Probability and Statistics for Engineers
T.Y. B. Tech. Semester VI (Computer, E and TC, Instrumentation, Mechanical,
Electrical)
Academic Year 2022-23 (ODD Semester)

Tutorial 1 on Unit 1
1. Describe both the population and the observations for the following research questions:

(a) Evaluation of the satisfaction of employees from an airline.

(b) Description of the marks of students from an assignment.
(c) Comparison of two drugs which deal with high blood pressure.

2. Which of the following variables are qualitative, and which are quantitative? Specify
which of the quantitative variables are discrete and which are continuous:

Time to travel to work, shoe size, preferred political party, price for a canteen meal,
eye colour, gender, wavelength of light, customer satisfaction on a scale from 1 to 10,
delivery time for a parcel, blood type, number of goals in a hockey match, height of a
child, subject line of an email.

3. Identify the scale of the following variables:

(a) Political party voted for in an election (b) The difficulty of different levels in a
computer game (c) Production time of a car (d) Age of turtles (e) Calendar year (f)
Price of a chocolate bar (g) Identification number of a student (h) Final ranking at a
beauty contest (i) Intelligence quotient.

4. Make yourself familiar with the DNase data set from R.

(a) First, browse through the introduction to R in Appendix A. Then, read in the data.
(b) View the data both in the R data editor and in the R console.
(c) Create a new data matrix which consists of the first 5 rows and first 5 variables of
the data. Print this data set on the R console. Now, save this data set in your preferred
format.
(d) Add a new variable “product” to the data set which is the product of concentration
and density.

5. Identify proper order of various stages in execution of the survey from beginning to
end.

• Setting up administrative organization

• Selection, training and supervision of field investigators
• Design of forms
• Processing data
• Control over quality of the field work and field edit
• Follow up of non response
• Preparing Report

(Ans: Setting up administrative organization, Design of forms, Selection, training and
supervision of field investigators, Control over quality of the field work and field edit,
Follow up of non response, Processing data, Preparing Report)

6. Explain the difference between histogram and bar chart. Give a situation in which one
is a better representation than the other.

7. Consider the marks obtained by students in Mathematics, Physics and Chemistry out
of 100, 50 and 50 in their board exams in this order for 10 students:
Student   Mathematics (100)   Physics (50)   Chemistry (50)
1         80                  45             32
2         78                  43             28
3         87                  42             49
4         95                  45             47
5         53                  32             15
6         67                  23             19
7         99                  50             48
8         79                  45             35
9         89                  39             49
10        85                  36             42
(i) Create a frequency table for ungrouped data for total marks obtained by 10
students.
(ii) Create a frequency distribution for the grouped data with class interval of 10.
What is relative frequency?
(iii) Draw a pie chart, histogram and divided bar diagram for the above data to
explain some salient features about the data as you feel fit.
(iv) Calculate the mean, median and mode for the above data giving the details.
Also compute 20 percent trimmed mean for above data set.
(v) Implement all the above in R.
Refer to the data in Question 7 for the problems 8-14 below:

8. Define at least five different events and find their probabilities.

9. What is the probability that a student scored more than 90 marks in Mathematics and
between 70 and 80 (inclusive) in Physics and Chemistry combined together? Hence or
otherwise say if the events are independent?

10. What is the probability that a student scored 45 marks in Chemistry given that he
scored at least a total of 150 marks?

11. What is the probability that a student scored between 70 and 80 marks in Mathematics
or less than 40 marks in Physics?

12. What is the probability that a student scored at least 80% marks in Physics given that
he scored at most 80% marks in Mathematics and Chemistry combined together?

13. If 5 students are chosen randomly then what is the probability that none of the students
scored more than 175 marks out of 200?
14. If it is decided that a student will get a valid score if his total is more than 120 and he
will be declared as passed in Mathematics, Physics and Chemistry if he scored more
than 70, 30 and 35 marks respectively then
(i) draw a Venn diagram to show the number of students passing in the three subjects and
(ii) set up a scenario for verifying Bayes' theorem and verify it.

15. Consider the following data which gives the percentages of the families that are in the
upper income level for some individuals in 15 schools of the city.
72.2, 31.9, 26.5, 29.1, 27.3, 8.6, 22.3, 26.5, 20.4, 12.8, 25.1, 19.2, 24.1, 58.2, 68.1.
Construct a relative frequency histogram of the data.

16. We have measured certain variables in a classroom. Label the variables as either
discrete or continuous. Fill in the table below with your answer.

Variable Name                                              Type of Variable (Answer)
Number of books in the classroom                           Discrete
Time it takes for students to finish their quiz            Continuous
Shoe size                                                  Discrete
Number of students that have their lunch in the canteen    Discrete

17. Twelve students compete in a race. In how many ways can the first three prizes be given?
(Ans: 12 x 11 x 10 = 1320)

18. Suppose P = {a| a is an odd prime number < 7} and Q = {b| b ∈ N, 0 ≤ b < 5},
where N is a set of all natural numbers. Find the number of proper subsets of P and
Q. (Ans: 3 and 15 respectively.)

19. Let 50 patients represent the sample units. 20 out of 50 experience stomach ailment after
the drug is given. Find the sample proportion for which the drug was a success and the
sample proportion for which the drug was not successful. Observe that the sample proportion
is the sample mean of 1s and 0s, where 1 counts a success and 0 a failure of the
drug treatment.

20. Suppose X = {x| x = 3n − 1, n ∈ N, n < 3} and Y = {y| y is a prime number < 7}.
Then find X ∩ Y. (Ans: X ∩ Y = {2, 5})

21. Four cards are drawn at random (without replacement) from a well shuffled deck of
playing cards. Then find the probability that there is at least one ace among them.
(Find the answer correct up to four decimal places.) (Ans: 0.2813)

22. Let E and F be two events with P(E ∪ F) = 0.7, P(E) = 0.5, P(F) = 0.3. Find
P(E ∩ F^c), where F^c is the complement of F. (Ans: 0.4)

23. A fair six-sided die is rolled twice independently. What is the probability of getting 1
in the first roll but not getting 3 or 4 in the second roll? (Ans: 1/9)

24. Write the sample space for the given experiment: Three items are selected at random
from a manufacturing process, and each selected item is classified as defective (D) or
non-defective (N).
25. Two lottery tickets are to be chosen from 20 for first and second prize. Find number of
sample points in S. (Ans: 20P2 )

26. (a) How many ways can five people be lined up to get on a bus? (Ans: 5!)
(b) If a certain two persons refuse to follow each other, how many ways are possible?
(Ans: 12 × 3! = 72)
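
A brute-force check of both parts in Python. It assumes that "refuse to follow each other" means the two people never stand in adjacent positions, and the labels A to E (with A and B as the unwilling pair) are hypothetical.

```python
from itertools import permutations

people = "ABCDE"  # hypothetical labels; A and B refuse to be adjacent

all_lineups = list(permutations(people))


def adjacent(lineup, p, q):
    """True if p and q occupy neighbouring positions in the line."""
    return abs(lineup.index(p) - lineup.index(q)) == 1


separated = [ln for ln in all_lineups if not adjacent(ln, "A", "B")]

print(len(all_lineups), len(separated))
```

The enumeration agrees with the counting argument: arrange the other three people in 3! ways, then place the pair into 2 of the 4 gaps in 4 × 3 = 12 ordered ways.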

27. A college freshman must take a science course, a social science course and a
mathematics course. If he may select any of three sciences, any of four social studies and
any of two mathematics courses, how many ways can he arrange his program? (Ans: 24)

28. In how many ways can 6 trees be planted in a circle? (Ans: 5!)

29. What is the use of harmonic mean in statistics?

30. The average age of 6 persons living in a house is 23.5 years. Three of them are majors
and their average age is 42 years. The difference in ages of the three minor children is
the same. What is the mean of the ages of the minor children? (Ans: 5)

31. How many numbers are there between 500 and 1000 which have exactly one of their
digits as 7? (Ans: 1 × 9 × 9 + 4 × 1 × 9 + 4 × 9 × 1 = 153)
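
The count can be confirmed by direct enumeration. This Python sketch reads "between 500 and 1000" as the integers 501 to 999; including the endpoints would not change the result, since neither 500 nor 1000 contains a 7.

```python
# Integers strictly between 500 and 1000 with exactly one digit equal to 7.
matches = [n for n in range(501, 1000) if str(n).count("7") == 1]

# Split by position of the 7 to mirror the three terms of the counting argument.
seven_in_hundreds = [n for n in matches if str(n)[0] == "7"]
seven_in_tens = [n for n in matches if str(n)[1] == "7"]
seven_in_units = [n for n in matches if str(n)[2] == "7"]

print(len(seven_in_hundreds), len(seven_in_tens), len(seven_in_units), len(matches))
```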

32. Two dice are rolled. Find the probability that the sum is a) equal to 1; b) equal to 4; c)
less than 13. (Ans: a) 0; b) 1/12; c) 1.)
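
A small Python sketch that enumerates the 36 equally likely outcomes, assuming two fair six-sided dice. The helper `prob` is an illustrative name; it returns exact fractions so the results can be compared with the stated answers.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first die, second die).
rolls = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on (d1, d2)."""
    favourable = sum(1 for roll in rolls if event(roll))
    return Fraction(favourable, len(rolls))


p_sum_is_1 = prob(lambda r: sum(r) == 1)   # impossible: the minimum sum is 2
p_sum_is_4 = prob(lambda r: sum(r) == 4)   # (1,3), (2,2), (3,1)
p_sum_lt_13 = prob(lambda r: sum(r) < 13)  # certain: the maximum sum is 12

print(p_sum_is_1, p_sum_is_4, p_sum_lt_13)
```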

33. A die is rolled and a coin is tossed. Find the probability that the die shows an odd
number and the coin shows a head. (Ans: 0.25)

34. Bag I contains 4 white and 6 black balls, while Bag II contains 4 white and
3 black balls. One ball is drawn at random from one of the bags, and it is found to be
black. Find the probability that it was drawn from Bag I. (Ans: 7/12)
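
The posterior can be worked out with exact fractions. This Python sketch assumes each bag is equally likely to be chosen, which the question implies but does not state; the dictionary names are illustrative.

```python
from fractions import Fraction

# Assumption: either bag is picked with probability 1/2.
prior = {"I": Fraction(1, 2), "II": Fraction(1, 2)}

# P(black | bag): Bag I has 6 black of 10 balls, Bag II has 3 black of 7.
black_given_bag = {"I": Fraction(6, 10), "II": Fraction(3, 7)}

# Law of total probability for drawing a black ball.
p_black = sum(prior[bag] * black_given_bag[bag] for bag in prior)

# Bayes' theorem: P(Bag I | black).
posterior_bag_I = prior["I"] * black_given_bag["I"] / p_black

print(p_black, posterior_bag_I)
```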

35. The probabilities of getting 10 − 20, 21 − 30, 31 − 40, and over 40 cars for service in a
day are 0.20, 0.35, 0.25, and 0.12 respectively. Find the probability of getting at least
21 cars on that day.

36. A virus has infected 1.8% of a population. A test detects this virus 95% of the time
when it is actually present, but it returns a false positive 3% of the time when the virus
is not present. If a person chosen at random from this population tests positive for the virus,
what is the probability that this person is actually infected? [Round to the nearest
percent]
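
One way to set up the calculation is Bayes' theorem with the three rates given in the question. The Python sketch below is a minimal version; the variable names are illustrative.

```python
# Rates stated in the question.
prevalence = 0.018           # P(infected)
sensitivity = 0.95           # P(positive | infected)
false_positive_rate = 0.03   # P(positive | not infected)

# Law of total probability: overall chance of a positive test.
p_positive = (prevalence * sensitivity
              + (1 - prevalence) * false_positive_rate)

# Bayes' theorem: P(infected | positive).
p_infected_given_positive = prevalence * sensitivity / p_positive

print(round(p_infected_given_positive * 100))  # nearest percent
```

Because the virus is rare, false positives from the large uninfected group outnumber true positives, so the posterior stays well below the test's 95% detection rate.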
