
SMDM 2025: Customer & Student Analysis

The document outlines three problems: 1) Analyzing wholesale customer data to identify highest/lowest spending regions/channels, item variability, outliers, and recommendations. 2) Analyzing student data from Clear Mountain State University to construct contingency tables for gender vs major/intentions/employment/computer ownership, calculate probabilities, and assess independence and normal distributions. 3) Comparing types of shingles to determine best options.

Uploaded by

Ankit Sharma

SMDM PROJECT

PROBLEM 1-Wholesale Customers Analysis

PROBLEM 2-Clear Mountain State University (CMSU)

PROBLEM 3-A & B shingles

PROBLEM 1-Wholesale Customers Analysis


1.1 Use methods of descriptive statistics to summarize data. Which Region and
which Channel spent the most? Which Region and which Channel spent the
least?

1.2 There are 6 different varieties of items that are considered. Describe and
comment/explain all the varieties across Region and Channel? Provide a
detailed justification for your answer.

1.3 On the basis of a descriptive measure of variability, which item shows the
most inconsistent behaviour? Which items show the least inconsistent
behaviour?

1.4 Are there any outliers in the data? Back up your answer with a suitable
plot/technique with the help of detailed comments.

1.5 On the basis of your analysis, what are your recommendations for the
business? How can your analysis help the business to solve its problem? Answer
from the business perspective.

CONFIDENTIAL DATA: GREAT LAKES LEARNING


PROBLEM 2-Clear Mountain State University (CMSU)
2.1. For this data, construct the following contingency tables (Keep Gender as
row variable)
2.1.1. Gender and Major
2.1.2. Gender and Grad Intention
2.1.3. Gender and Employment
2.1.4. Gender and Computer
2.2. Assume that the sample is representative of the population of CMSU. Based
on the data, answer the following questions:
2.2.1. What is the probability that a randomly selected CMSU student will be
male?
2.2.2. What is the probability that a randomly selected CMSU student will be
female?
2.3. Assume that the sample is representative of the population of CMSU. Based
on the data, answer the following questions:
2.3.1. Find the conditional probability of different majors among the male
students in CMSU.
2.3.2. Find the conditional probability of different majors among the female
students of CMSU.
2.4. Assume that the sample is representative of the population of CMSU. Based
on the data, answer the following questions:
2.4.1. Find the probability that a randomly chosen student is a male and intends
to graduate.
2.4.2. Find the probability that a randomly selected student is a female and does
NOT have a laptop.
2.5. Assume that the sample is representative of the population of CMSU. Based
on the data, answer the following questions:
2.5.1. Find the probability that a randomly chosen student is a male or has
full-time employment.
2.5.2. Find the conditional probability that given a female student is randomly
chosen, she is majoring in international business or management.
2.6. Construct a contingency table of Gender and Intent to Graduate at 2 levels
(Yes/No). The Undecided students are not considered now and the table is a 2x2
table. Do you think the graduate intention and being female are independent
events?
2.7. Note that there are four numerical (continuous) variables in the data set, GPA,
Salary, Spending, and Text Messages.
Answer the following questions based on the data
2.7.1. If a student is chosen randomly, what is the probability that his/her GPA is
less than 3?
2.7.2. Find the conditional probability that a randomly selected male earns 50 or
more. Find the conditional probability that a randomly selected female earns 50 or
more.

2.8. Note that there are four numerical (continuous) variables in the data set, GPA,
Salary, Spending, and Text Messages. For each of them comment whether they
follow a normal distribution. Write a note summarizing your conclusions.


Common questions

Different varieties of wholesale items can be described and analyzed by summarizing sales data using descriptive statistics to compare how each variety performs across different regions and channels. This would include calculating measures like mean, median, and mode for each variety and considering variability and dispersion measures to understand the spread. Furthermore, analyzing data through cross-tabulation can help identify how each item variety performs in different contexts, potentially revealing discrepancies or opportunities for growth. Visual tools like bar charts or heat maps could also be employed to more easily observe patterns and relationships.

Analyzing the interrelation between GPA and salary can provide CMSU with insights into how academic performance translates into financial success post-graduation. If a strong correlation exists, it could be used to motivate students, showing the tangible benefits of academic excellence. This knowledge may also guide career counseling, immersive internships, and partnerships with industries to provide pathways for students with varying academic achievements. Consequently, institutional policies might focus more on supporting students in fields demonstrating high salary returns or on offering additional support to students struggling academically to enhance their future earnings potential.

Whether the number of text messages sent by CMSU students follows a normal distribution affects which statistical methods are appropriate. If the distribution is approximately normal, parametric tests and normal-based probability calculations can be applied directly. If it is not (count data such as text messages are often right-skewed), non-parametric methods or a transformation of the data may be more suitable. Note that the central limit theorem still justifies normal approximations for sample means when the sample is reasonably large, whatever the shape of the underlying data. Knowing the shape of the distribution also indicates whether the mean or the median better describes typical student usage.
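
As a rough illustration of a normality screen, the sketch below computes sample skewness and excess kurtosis, both of which are close to 0 for normally distributed data. The function name and the sample values are hypothetical and are not taken from the CMSU survey; this is only a quick screen, and a histogram, Q-Q plot, or a formal test such as Shapiro-Wilk would normally accompany it.

```python
from statistics import mean, pstdev

def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) using population moments."""
    m, s = mean(values), pstdev(values)
    n = len(values)
    skew = sum(((v - m) / s) ** 3 for v in values) / n
    kurt = sum(((v - m) / s) ** 4 for v in values) / n - 3
    return skew, kurt

# Hypothetical daily text-message counts with a long right tail.
text_messages = [50, 60, 80, 100, 120, 150, 200, 300, 900]
skew, kurt = skewness_and_excess_kurtosis(text_messages)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

A clearly positive skewness, as in this made-up sample, points to a right-skewed variable for which the median may describe typical behaviour better than the mean.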

Statistical measures such as z-scores or the interquartile range (IQR) can be used to identify outliers in a retail sales dataset, whereby data points that fall beyond the calculated threshold (e.g., 1.5 times the IQR above the third quartile or below the first quartile) are considered outliers. Techniques like box plots are effective for visualizing outliers, as they clearly show the distribution of data and any data points that deviate significantly from the rest of the dataset. Scatter plots can also be useful, especially when showing relationships between different variables where an outlier may disrupt an expected pattern.
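
The 1.5 × IQR rule described above can be sketched as follows. The function name and the spend figures are hypothetical, and the quartiles use the "inclusive" interpolation method, which is one of several common conventions; other conventions can give slightly different fences.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical annual spend figures, not taken from the wholesale dataset.
spend = [120, 135, 150, 160, 170, 185, 200, 950]
print(iqr_outliers(spend))  # [950]
```

These are the same fences a box plot draws its whiskers against, so the flagged values correspond to the points plotted individually beyond the whiskers.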

Based on wholesale customer analysis regarding item variability, businesses could be recommended to manage inventory with differentiated strategies tailored to the variability of each item. For high variability items, implementing just-in-time inventory practices or predictive analytics could reduce holding costs and stockouts. Conversely, for items with low variability, maintaining a steady inventory level and securing long-term supply contracts may ensure cost efficiency. Furthermore, leveraging data analytics to better predict demand patterns may enhance customer satisfaction and optimize stock levels, potentially increasing sales revenue.

The relationship between gender and the intention to graduate at CMSU can be evaluated by constructing a contingency table where gender is the row variable and intention to graduate (Yes/No) is the column variable. A chi-square test of independence can then assess whether there is a statistically significant association between the two variables. Additionally, the odds ratio quantifies the strength and direction of the association between gender and graduate intention. If the test does not reject independence, the data provide no evidence of an association. For the event-level question, independence can also be checked directly by comparing P(Female and Yes) with P(Female) × P(Yes): the events are independent only if the two are equal.
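
A minimal sketch of the chi-square test and odds ratio for a 2x2 table is given below. The counts are hypothetical and are not the CMSU results, and the function names are illustrative. The statistic is the plain Pearson chi-square without a continuity correction, so it can differ from library routines that apply Yates' correction to 2x2 tables by default (for example `scipy.stats.chi2_contingency`).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and p-value, df = 1."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))

def odds_ratio(table):
    """Odds ratio (a*d)/(b*c); undefined when b or c is zero."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical counts: rows = Female, Male; columns = Yes, No.
counts = [[10, 10], [15, 5]]
stat, p_value = chi_square_2x2(counts)
print(f"chi2={stat:.3f}, p={p_value:.3f}, odds ratio={odds_ratio(counts):.3f}")
```

With these made-up counts the p-value is above 0.05, so independence would not be rejected at the 5% level even though the odds ratio differs from 1.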

Conditional probabilities can be used to analyze employment status among CMSU students by constructing a contingency table with employment status as one variable and gender as another. By calculating the probability of being employed given the gender of the student, we can identify trends or biases in employment status between male and female students. This analysis can uncover crucial disparities or equalities that might be present and provide insights into employment opportunities afforded to different genders.
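
The calculation of P(employment status | gender) can be sketched as below. The records and the function name are hypothetical stand-ins for rows of the survey, not actual CMSU data; each conditional probability is a cell count divided by its row (gender) total.

```python
from collections import Counter

# Hypothetical (gender, employment) records, not taken from the CMSU survey.
records = [
    ("Female", "Full-Time"), ("Female", "Part-Time"), ("Female", "Part-Time"),
    ("Female", "Unemployed"), ("Male", "Full-Time"), ("Male", "Full-Time"),
    ("Male", "Part-Time"), ("Male", "Part-Time"),
]

def conditional_probabilities(pairs):
    """Return {gender: {status: P(status | gender)}} from (gender, status) pairs."""
    joint = Counter(pairs)                          # cell counts
    gender_totals = Counter(g for g, _ in pairs)    # row totals
    table = {}
    for (gender, status), count in joint.items():
        table.setdefault(gender, {})[status] = count / gender_totals[gender]
    return table

probs = conditional_probabilities(records)
print(probs["Female"]["Part-Time"])  # 2 of 4 female records -> 0.5
```

The same row-wise division applies to the other contingency tables in Problem 2, such as majors given gender.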

To determine which region and which channel spent the most and the least, summarize the data with descriptive statistics: total the expenditure across all item categories, first grouped by region and then grouped by channel, and rank each set of totals. The region (or channel) with the largest total is the highest spender and the one with the smallest total is the lowest. The same totals can also be computed for each region and channel combination if a finer breakdown is needed.
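
As an illustration of totalling and ranking spend, the sketch below groups hypothetical rows by region and by channel. The region labels, channel labels, amounts, and function names are all made up for the example and do not come from the wholesale dataset.

```python
from collections import defaultdict

# Hypothetical rows: (region, channel, total spend across all items).
rows = [
    ("A", "Retail", 300), ("A", "Hotel", 500),
    ("B", "Retail", 200), ("B", "Hotel", 250),
    ("C", "Retail", 900), ("C", "Hotel", 1100),
]

def totals_by(rows, key_index):
    """Sum the spend column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)

def extremes(totals):
    """Return (highest spender, lowest spender) from a totals mapping."""
    return max(totals, key=totals.get), min(totals, key=totals.get)

region_totals = totals_by(rows, 0)
channel_totals = totals_by(rows, 1)
print(extremes(region_totals))   # ('C', 'B')
print(extremes(channel_totals))  # ('Hotel', 'Retail')
```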

The probability that a randomly chosen student from CMSU has a GPA less than 3 can be calculated by dividing the number of students with GPAs below 3 by the total number of students. This probability can significantly influence academic support programs by highlighting the need for remedial classes or tutoring programs tailored to students who struggle academically. Identifying students at risk of falling below a certain GPA can help target resources and interventions to improve their academic outcomes.
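
The count-based calculation described above can be sketched as an empirical proportion. The GPA values and the helper name are hypothetical and are used only to show the arithmetic.

```python
def empirical_probability(values, predicate):
    """Share of observations for which predicate(value) is true."""
    values = list(values)
    return sum(1 for v in values if predicate(v)) / len(values)

# Hypothetical GPAs, not taken from the CMSU survey.
gpas = [2.4, 2.9, 3.0, 3.1, 3.3, 3.5, 2.8, 3.7]
p_below_3 = empirical_probability(gpas, lambda g: g < 3)
print(p_below_3)  # 3 of 8 values are below 3 -> 0.375
```

The same helper, applied to a subset such as male students only, gives conditional proportions like the share of males earning 50 or more.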

To determine which wholesale item shows the most and least variability, one would use a descriptive measure of variability, such as the coefficient of variation or standard deviation. An item with a high standard deviation relative to its mean would show more inconsistency, suggesting potential challenges in demand forecasting and inventory management. Conversely, an item with low variability would have more predictable demand. For inventory management, items with high variability may require more adaptive strategies, like dynamic safety stock levels and close monitoring, whereas items with low variability can be managed with steadier stock policies.
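
A minimal sketch of ranking items by coefficient of variation (sample standard deviation divided by the mean) follows. The item names and figures are hypothetical. The coefficient of variation is used here because it is unit-free, which makes items with very different average spend comparable.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return stdev(values) / mean(values)

# Hypothetical spend per customer for two items.
items = {
    "Item A": [10, 12, 11, 9, 13],
    "Item B": [2, 30, 5, 40, 8],
}
cv = {name: coefficient_of_variation(v) for name, v in items.items()}
most_inconsistent = max(cv, key=cv.get)
least_inconsistent = min(cv, key=cv.get)
print(most_inconsistent, least_inconsistent)  # Item B Item A
```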
