DESCRIPTIVE STATISTICS
Descriptive statistics help summarize and describe the important features of a
dataset. In Excel, you can easily calculate measures such as mean, median,
mode, and standard deviation using built-in functions.
The sections below explain each measure and show how to calculate it with the
corresponding Excel function.
1. Mean (Average)
The mean is the sum of all data points divided by the number of data points. It
represents the average of the dataset.
Formula in Excel:
=AVERAGE(range)
Example: Suppose we have a dataset in Excel from cells A1 to A5:
10, 15, 20, 25, 30
To find the mean:
=AVERAGE(A1:A5) → (10+15+20+25+30) / 5 = 20
2. Median
The median is the middle value of a dataset when it is arranged in ascending or
descending order. If there is an even number of observations, the median is the
average of the two middle values.
Formula in Excel:
=MEDIAN(range)
Example: Using the same dataset as above:
10, 15, 20, 25, 30
The median is 20, since it’s the middle value.
If the dataset is even (e.g., 10, 15, 20, 25), Excel will calculate the average of the
two middle values (15 and 20):
=MEDIAN(A1:A4) → (15+20) / 2 = 17.5
3. Mode
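The mode is the most frequent value in a dataset. A dataset can have more than
one mode if multiple values repeat with the same frequency.
Formula in Excel:
=MODE.SNGL(range)
MODE.SNGL returns a single mode. The older =MODE(range) behaves the same way
and remains available for compatibility. To list every mode of a dataset with
several modes, use =MODE.MULT(range).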
Example:
10, 15, 15, 20, 25
The mode is 15, as it appears twice, while the others only appear once.
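The three measures above can be cross-checked outside Excel. The following is a minimal sketch using Python's standard `statistics` module, with the same example values as the text. The variable names and the tied dataset in the last step are illustrative assumptions, not part of the worksheet.

```python
import statistics

data = [10, 15, 20, 25, 30]        # values from cells A1:A5
even_data = [10, 15, 20, 25]       # the even-sized example
mode_data = [10, 15, 15, 20, 25]   # the mode example

mean_value = statistics.mean(data)           # counterpart of =AVERAGE(A1:A5)
median_odd = statistics.median(data)         # counterpart of =MEDIAN(A1:A5)
median_even = statistics.median(even_data)   # averages the two middle values
mode_value = statistics.mode(mode_data)      # most frequent value

# Hypothetical dataset with two tied modes; multimode lists all of them.
tied_modes = statistics.multimode([10, 10, 15, 15, 20])

print(mean_value, median_odd, median_even, mode_value, tied_modes)
```

The last line illustrates the point made in the text that a dataset can have more than one mode.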
4. Standard Deviation
Standard deviation measures the amount of variation or dispersion in a dataset.
A higher standard deviation means the data points are spread out more widely
from the mean, while a lower standard deviation indicates that the data points
are closer to the mean.
In Excel, STDEV.S and STDEV.P are functions used to calculate the standard
deviation, but they apply to different types of data:
● STDEV.S (Sample Standard Deviation): This is used when your data
represents a sample (a subset) of a larger population. It divides by n − 1,
which corrects for the bias in estimating the population standard
deviation.
● STDEV.P (Population Standard Deviation): This is used when your data
represents the entire population. It divides by n, the total number of data
points, because you already have the entire population.
Example
Enter the data into Excel from cells A2:A7 for the students' names, and cells
B2:B7 for the scores.
Student Score
A 85
B 92
C 88
D 94
E 78
F 80
Calculate the Standard Deviation Using STDEV.S (Sample)
In an empty cell, type the following formula:
=STDEV.S(B2:B7)
This calculates the standard deviation assuming the data is a sample.
So, the sample standard deviation using STDEV.S is approximately 6.40.
Calculate the Standard Deviation Using STDEV.P (Population)
In an empty cell, type the following formula:
=STDEV.P(B2:B7)
This calculates the standard deviation assuming the data is the entire
population.
So, the population standard deviation using STDEV.P is approximately 5.84.
In Excel, you would use STDEV.S when you're working with a sample and
STDEV.P when you're dealing with the whole population.
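The difference between the two divisors can be checked with a short Python sketch. It assumes the six scores from the table above and uses the standard library's `statistics.stdev` (divides by n − 1) and `statistics.pstdev` (divides by n) as stand-ins for STDEV.S and STDEV.P.

```python
import statistics

scores = [85, 92, 88, 94, 78, 80]  # scores from cells B2:B7

# Sample standard deviation: squared deviations are divided by n - 1.
sample_sd = statistics.stdev(scores)

# Population standard deviation: squared deviations are divided by n.
population_sd = statistics.pstdev(scores)

print(round(sample_sd, 2), round(population_sd, 2))
```

The sample figure is always the larger of the two for the same data, because it divides by the smaller number.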
CORRELATION AND REGRESSION ANALYSIS
1. Correlation Analysis
Correlation measures the strength and direction of the relationship between
two variables. It is represented by the correlation coefficient (r), which ranges
from -1 to 1:
r = 1 → Perfect positive correlation
r = -1 → Perfect negative correlation
r = 0 → No linear correlation
Example:
Suppose we have data on students' study hours and their exam scores:
Student Study hours Exam score
1 2 50
2 4 65
3 5 70
4 7 85
5 8 90
To calculate the correlation coefficient in Excel:
1. Use the formula:
=CORREL(B2:B6, C2:C6)
2. For this data, CORREL returns r ≈ 0.998, which indicates a very strong
positive correlation between study hours and exam scores.
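To see what the correlation coefficient measures, it can be computed by hand from the Pearson formula. This is a minimal Python sketch using the five rows of the table above; the function name `pearson_r` is an illustrative choice.

```python
import math

hours = [2, 4, 5, 7, 8]         # study hours, cells B2:B6
scores = [50, 65, 70, 85, 90]   # exam scores, cells C2:C6


def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(hours, scores)
print(round(r, 3))
```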
2. Regression Analysis
Regression helps in predicting the value of a dependent variable (Y) based on
an independent variable (X). The simplest form is Linear Regression:
Y = a + bX
Y = Dependent variable (Exam Score)
X = Independent variable (Study Hours)
a = Intercept
b = Slope
Example:
Using the same dataset, we can run a linear regression in Excel:
1. Go to Data → Data Analysis → Regression.
2. Select "Input Y Range" (Exam Scores) and "Input X Range" (Study Hours).
3. Click OK to get results. For this data, the fitted equation is approximately
Y = 37.11 + 6.71X.
If a student studies for 6 hours, the predicted score is:
Y = 37.11 + (6.71 × 6) ≈ 77.4
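As a cross-check on the Regression tool's output, the intercept and slope for the five rows in the table can be computed directly from the least-squares formulas. This is a sketch under that assumption; `fit_line` is a hypothetical helper name, not an Excel feature.

```python
hours = [2, 4, 5, 7, 8]         # X: study hours
scores = [50, 65, 70, 85, 90]   # Y: exam scores


def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(hours, scores)
predicted = intercept + slope * 6   # predicted score for 6 study hours
print(round(intercept, 2), round(slope, 2), round(predicted, 1))
```

In Excel, the same two coefficients are also available from the worksheet functions =INTERCEPT(C2:C6, B2:B6) and =SLOPE(C2:C6, B2:B6).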
Data Analysis Tools in Excel
1. Solver
Used for optimization problems, such as minimizing cost or maximizing profit.
Example: Finding the best mix of products to maximize revenue under budget
constraints.
How to Enable Solver:
1. Go to File → Options → Add-ins.
2. Select Excel Add-ins → Check Solver Add-in → Click OK.
3. Access it from Data → Solver.
2. Analysis ToolPak
A set of advanced statistical and data analysis tools.
Includes Correlation, Regression, Descriptive Statistics, Histograms, ANOVA, etc.
How to Enable Analysis ToolPak:
1. Go to File → Options → Add-ins.
2. Select Excel Add-ins → Check Analysis ToolPak → Click OK.
3. Access it from Data → Data Analysis.
Solver Example: Maximizing Student Study Hours While Balancing Other Activities
Problem Statement:
A student has 30 hours per week available for study and other activities. They
want to maximize study hours while ensuring enough time for exercise and
free time, with the following conditions:
● Total Available Hours: 30
● Exercise Time ≥ 5 hours
● Free Time ≥ 4 hours
Find the maximum study hours possible under these constraints.
1. Set Up the Data Table
Activity       Hours Allocated
Study          ? (to be maximized)
Exercise       5
Free Time      4
Total Hours    30
● The study hours cell is the one we want to maximize.
● The total hours must not exceed 30.
2. Steps to Use Solver in Excel
1. Enter Data
○ In an Excel sheet, list Study, Exercise, Free Time in column A and
their respective hours in column B.
○ Create a cell (e.g., B5) to sum the total hours using =B2 + B3 + B4.
2. Open Solver
○ Go to Data → Solver (If not visible, enable it from File → Options →
Add-ins → Solver Add-in).
3. Set Solver Parameters
○ Set Objective: Select the Study Hours cell (B2).
○ Choose "Max" (since we want to maximize study time).
○ By Changing Variable Cells: Select B2 (Study Hours).
○ Add Constraints:
■ B3 ≥ 5 (Exercise must be at least 5 hours).
■ B4 ≥ 4 (Free Time must be at least 4 hours).
■ B5 ≤ 30 (Total Hours cannot exceed 30).
4. Solve the Problem
○ Click Solve → Select "Keep Solver Solution"
Solution Outcome
Activity       Optimized Hours
Study          21
Exercise       5
Free Time      4
Total Hours    30
The student can study for 21 hours, ensuring a balance with other activities.
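The same small problem can be verified with a brute-force search in Python. This sketch is not how Solver works internally; it simply tries every whole-hour split that respects the constraints and keeps the one with the most study time. The function and constant names are illustrative assumptions.

```python
TOTAL_HOURS = 30     # weekly hours available
MIN_EXERCISE = 5     # Exercise Time >= 5
MIN_FREE_TIME = 4    # Free Time >= 4


def best_allocation(total_hours, min_exercise, min_free_time):
    """Search whole-hour splits and return (study, exercise, free_time)."""
    best = None
    for exercise in range(min_exercise, total_hours + 1):
        for free_time in range(min_free_time, total_hours + 1):
            # Whatever is left after exercise and free time goes to study.
            study = total_hours - exercise - free_time
            if study < 0:
                continue  # this split would exceed the total
            if best is None or study > best[0]:
                best = (study, exercise, free_time)
    return best


study, exercise, free_time = best_allocation(
    TOTAL_HOURS, MIN_EXERCISE, MIN_FREE_TIME
)
print(study, exercise, free_time)
```

Study time is largest when exercise and free time sit exactly at their minimums, which matches the Solver result.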
When to Use Solver?
Solver Use Case    Example
Maximization       Maximizing study time while balancing activities
Minimization       Minimizing cost while achieving a goal
Target Value       Achieving a specific profit with limited resources
SCENARIO ANALYSIS AND WHAT-IF ANALYSIS IN EXCEL
Scenario analysis and What-If analysis help users evaluate different possibilities and
their outcomes in Excel. The three main What-If analysis tools are:
1. Goal Seek – Determines the required input to achieve a specific output.
2. Data Tables – Analyzes multiple values of one or two variables.
3. Scenario Manager – Compares multiple sets of values for multiple variables.
1. Goal Seek
Example: Finding the Required Sales to Achieve Target Profit
Steps:
Open Excel and enter the following data:
     A                         B
2    Selling Price per Unit    50
3    Cost per Unit             30
4    Units Sold                100
5    Profit                    =B2*B4 - B3*B4
We want to find how many units need to be sold to achieve a profit of $5000.
1. Go to Data → What-If Analysis → Goal Seek.
2. In the "Set cell" box, select the Profit cell (B5).
3. In the "To value" box, enter 5000 (target profit).
4. In the "By changing cell" box, select B4 (Units Sold).
5. Click OK and Excel will find the required units.
Output: Excel calculates that 250 units need to be sold to achieve $5000 profit.
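Goal Seek works backwards from a target output to the input that produces it. The idea can be sketched in Python with a simple bisection search. Bisection is a stand-in chosen for clarity here, not a description of Excel's internal method, and `goal_seek` is a hypothetical helper name.

```python
def profit(units, price=50.0, cost=30.0):
    """Same calculation as the worksheet: price * units - cost * units."""
    return price * units - cost * units


def goal_seek(func, target, low, high, tol=1e-6):
    """Find x in [low, high] with func(x) close to target.

    Assumes func is increasing on the interval and the target is bracketed.
    """
    if not (func(low) <= target <= func(high)):
        raise ValueError("target is not bracketed by the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid    # too little profit: need more units
        else:
            high = mid   # enough profit: try fewer units
    return (low + high) / 2


required_units = goal_seek(profit, target=5000, low=0, high=1000)
print(round(required_units))
```

Because each unit contributes 50 − 30 = 20 in profit, the answer can also be read off directly as 5000 / 20.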
2. Data Tables
Example: Analyzing Profit for Different Units Sold
We want to analyze how profit changes for different sales quantities.
Steps for One-Variable Data Table:
Set up the following input cells:
     A                B
1    Selling Price    50
2    Cost per Unit    30
3    Units Sold       100
4    Profit           =B1*B3 - B2*B3
Below the inputs, create the data table with the Units Sold values to test:
     A             B
6    Units Sold    =B4
7    50
8    100
9    150
10   200
11   250
1. In B6, enter =B4 so that the data table refers to the Profit formula.
2. Select the range A6:B11.
3. Go to Data → What-If Analysis → Data Table.
4. Leave Row Input Cell empty and set Column Input Cell to B3 (Units Sold).
5. Click OK.
Output: Excel calculates the profits for different sales volumes.
Units Sold Profit
50 1000
100 2000
150 3000
200 4000
250 5000
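A one-variable data table re-evaluates one formula for a list of input values. The following Python sketch mirrors that idea for the profit example, assuming the price of 50 and cost of 30 from the worksheet. The names are illustrative.

```python
PRICE = 50   # Selling Price
COST = 30    # Cost per Unit


def profit(units):
    """Worksheet formula with Units Sold as the only changing input."""
    return PRICE * units - COST * units


units_options = [50, 100, 150, 200, 250]

# One row per input value, like the filled-in Profit column.
one_var_table = {units: profit(units) for units in units_options}

for units, value in one_var_table.items():
    print(units, value)
```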
Steps for Two-Variable Data Table:
Now, let's analyze how profit changes based on both Selling Price and Units Sold.
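Set up the table in an empty area of the sheet, for example A13:E16. The inputs
stay in B1:B3 and the Profit formula =B1*B3 - B2*B3 is in B4:
      A       B      C      D      E
13    =B4     50     100    150    200     (Units Sold →)
14    40
15    50
16    60
(Price ↓)
1. In A13, the top-left corner of the table, enter =B4 (the Profit formula).
2. Select the entire table (A13:E16).
3. Go to Data → What-If Analysis → Data Table.
4. Set Row Input Cell to B3 (Units Sold), because the units run across the top
row.
5. Set Column Input Cell to B1 (Selling Price), because the prices run down the
first column.
6. Click OK.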
Output: Excel fills in profit values based on different selling prices and units sold.
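The grid that a two-variable data table produces can be previewed with a nested loop. This Python sketch assumes the cost of 30 per unit and the price and unit values listed in the table. The variable names are illustrative.

```python
COST = 30                              # Cost per Unit (held fixed)
prices = [40, 50, 60]                  # values down the first column
units_options = [50, 100, 150, 200]    # values across the top row

# Outer key: selling price; inner key: units sold; value: profit.
two_var_table = {
    price: {units: price * units - COST * units for units in units_options}
    for price in prices
}

for price in prices:
    row = [two_var_table[price][units] for units in units_options]
    print(price, row)
```

Each printed row corresponds to one price, and each entry in it to one Units Sold column.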
3. Scenario Manager
Example: Comparing Different Pricing Strategies
Steps:
Open Excel and enter:
     A                B
1    Selling Price    50
2    Cost per Unit    30
3    Units Sold       100
4    Profit           =B1*B3 - B2*B3
1. Go to Data → What-If Analysis → Scenario Manager.
2. Click Add and create scenarios with B1 (Selling Price) and B3 (Units Sold)
as the changing cells:
○ Scenario 1 (Low Price): Selling Price = 40, Units Sold = 120.
○ Scenario 2 (Standard Price): Selling Price = 50, Units Sold = 100.
○ Scenario 3 (High Price): Selling Price = 60, Units Sold = 80.
3. Click Summary and select the Profit cell (B4) as the result cell.
Output:
Scenario          Selling Price    Units Sold    Profit
Low Price         40               120           1200
Standard Price    50               100           2000
High Price        60               80            2400
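Scenario Manager stores named sets of input values and reports the result for each. The following Python sketch represents the three scenarios as dictionaries and recomputes the profit column, assuming the fixed cost of 30 per unit. The structure and names are illustrative.

```python
COST = 30  # Cost per Unit, the same in every scenario

# Each scenario changes two inputs at once: price and units sold.
scenarios = {
    "Low Price": {"price": 40, "units": 120},
    "Standard Price": {"price": 50, "units": 100},
    "High Price": {"price": 60, "units": 80},
}

# Scenario summary: profit for each named set of inputs.
summary = {
    name: values["price"] * values["units"] - COST * values["units"]
    for name, values in scenarios.items()
}

best_scenario = max(summary, key=summary.get)
print(summary, best_scenario)
```

Unlike a data table, each scenario changes several inputs together, which is why the values are stored as named sets.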
Conclusion
● Goal Seek finds the required input for a specific target.
● Data Tables analyze multiple outcomes for one or two variables.
● Scenario Manager compares multiple sets of variables at once.
Introduction to DAX (Data Analysis Expressions) in Excel
What is DAX?
DAX (Data Analysis Expressions) is a powerful formula language used in Power
BI, Excel Power Pivot, and SQL Server Analysis Services (SSAS) for creating
advanced calculations and aggregations on data models. It extends Excel’s
capabilities for handling large datasets, relationships, and complex business
logic beyond basic formulas.
Why Use DAX in Excel?
● Performs complex calculations beyond traditional Excel formulas.
● Works efficiently with large datasets and data models.
● Enables data summarization, filtering, and relationships across multiple
tables.
● Supports advanced aggregations, time intelligence, and conditional
calculations.
Key Concepts of DAX
DAX formulas work on columns and tables, unlike Excel formulas that
primarily work on individual cells. Some key concepts include:
1. Calculated Columns
● Used to create new columns based on existing data.
Example:
TotalPrice = Sales[Quantity] * Sales[UnitPrice]
● This creates a new column in the Sales table calculating the total price.
2. Measures
● Measures return aggregated values (e.g., SUM, AVERAGE) based on
filters.
Example:
Total Sales = SUM(Sales[TotalPrice])
● This measure returns the total sales amount dynamically.
3. Aggregation Functions
● DAX supports SUM, AVERAGE, MIN, MAX, COUNT, DISTINCTCOUNT.
Example:
AvgSales = AVERAGE(Sales[TotalPrice])
● This calculates the average TotalPrice across the rows of the Sales table.
4. Filtering Functions
● Used to filter data dynamically.
Example:
HighValueSales = CALCULATE(SUM(Sales[TotalPrice]), Sales[TotalPrice] > 1000)
● This calculates the sum of sales where TotalPrice is greater than $1000.
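To make the difference between a calculated column, a measure, and a filtered measure concrete, the following Python sketch imitates the three DAX examples above on a tiny in-memory table. The three sales rows are hypothetical data invented for illustration, and plain Python only approximates how DAX evaluates filters.

```python
# Hypothetical Sales rows; only Quantity and UnitPrice are stored.
sales = [
    {"Quantity": 2, "UnitPrice": 300.0},
    {"Quantity": 5, "UnitPrice": 250.0},
    {"Quantity": 1, "UnitPrice": 1500.0},
]

# Calculated column: evaluated once per row and stored with the row.
for row in sales:
    row["TotalPrice"] = row["Quantity"] * row["UnitPrice"]

# Measures: aggregate over whichever rows are currently visible.
total_sales = sum(row["TotalPrice"] for row in sales)
avg_sales = total_sales / len(sales)

# Filtered measure: the same sum, restricted to rows above 1000.
high_value_sales = sum(
    row["TotalPrice"] for row in sales if row["TotalPrice"] > 1000
)

print(total_sales, round(avg_sales, 2), high_value_sales)
```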
5. Time Intelligence Functions
● Handles date-based calculations like YTD (Year to Date), MTD (Month to
Date), and comparisons.
Example:
SalesYTD = TOTALYTD(SUM(Sales[TotalPrice]), Sales[Date])
● This calculates year-to-date (YTD) sales.
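A year-to-date total adds up every sale from January 1 of a given year through a cutoff date. The following Python sketch shows that logic with hypothetical dated rows. The explicit `as_of` parameter is a simplification of how DAX takes the cutoff from the current filter context.

```python
from datetime import date

# Hypothetical (date, TotalPrice) rows spanning two calendar years.
sales = [
    (date(2023, 12, 20), 400.0),
    (date(2024, 1, 15), 500.0),
    (date(2024, 3, 10), 700.0),
    (date(2024, 6, 5), 300.0),
]


def sales_ytd(rows, as_of):
    """Sum amounts dated in as_of's year, up to and including as_of."""
    return sum(
        amount
        for day, amount in rows
        if day.year == as_of.year and day <= as_of
    )


ytd_march = sales_ytd(sales, date(2024, 3, 31))
ytd_december = sales_ytd(sales, date(2024, 12, 31))
print(ytd_march, ytd_december)
```

The December 2023 row is excluded from both totals because the running sum restarts at the beginning of each year.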
6. Relationship Handling
● DAX can work across multiple related tables using RELATED and
RELATEDTABLE functions.
Example:
ProductCategory = RELATED(Products[Category])
● Retrieves product category from a related table.
DAX vs. Excel Formulas
Feature          DAX                           Excel Formulas
Works on         Tables, Columns               Cells, Ranges
Aggregations     Advanced (SUMX, CALCULATE)    Basic (SUM, AVERAGE)
Filtering        Dynamic Filtering             Uses IF & FILTER functions
Speed            Optimized for Big Data        Slower for large datasets
Relationships    Works across multiple tables  Limited to single sheets
Where to Use DAX?
● Power Pivot: For creating data models with calculated columns and
measures.
● Power BI: For advanced reporting and dashboards.
● SSAS (SQL Server Analysis Services): For complex business intelligence
models.
Conclusion
DAX is a powerful tool for complex calculations and data modeling in Excel
Power Pivot. It enables dynamic aggregations, filtering, and relationship-based
computations, making it essential for business analytics and reporting.