Lecture 1
Daily Class Notes
Program : [Link] CSE
Subject : Decoding Data : Learn AI ,ML, DS & Analytics
Subject Code : ENAI-111
IIMT University, GN Campus
By Pragati Mishra
Assistant Professor([Link], [Link])
*Notes Content adapted from Google/sites/books
Syllabus
Introduction to Data Analytics:
Working with Formula and Functions, Introduction to Power BI & Charts,
Logical functions using Excel, Analysing Data with Excel
Today’s Topic :
Introduction to Data Analytics:
• Working with Formula and Functions
Introduction to Data Analytics
Data Analytics is the process of collecting, organizing and studying data to find
useful information understand what’s happening and make better decisions. In
simple words it helps people and businesses learn from data like what worked in
the past, what is happening now and what might happen in the future.
People often mix up data analytics and data analysis but they’re not exactly the
same. Data analysis is just one part of data analytics it focuses on finding
meaning in data. On the other hand data analytics includes more than just
analysis. It also involves things like coming up with ideas and predictions from
data and building the tools and systems needed to handle large amounts of data.
Importance and Usage of Data Analytics
Data analytics is used in many fields like banking, farming, shopping,
government and more. It helps in many ways:
Data Analytics Importance
• Helps in Decision Making: It gives clear facts and patterns from data
which help people make smarter choices.
• Helps in Problem Solving: It points out what's going wrong and why
making it easier to fix problems.
• Helps Identify Opportunities: It shows trends and new chances for
growth that might not be obvious.
• Improved Efficiency: It helps reduce waste, saves time and makes work
smoother by finding better ways to do things.
Process of Data Analytics
Data analysts, data scientists and data engineers together create data pipelines
which helps to set up the model and do further analysis. Data Analytics can be
done in the following steps which are mentioned below:
Data Analytics Process
1. Data Collection : Data collection is the first step where raw information
is gathered from different places like websites, apps, surveys or machines.
Sometimes data comes from many sources and needs to be joined together.
Other times only a small useful part of the data is selected.
2. Data Cleansing : Once the data is collected it usually contains mistakes
like wrong entries, missing values or repeated rows. In this step the data is
cleaned to fix those problems and remove anything that isn’t needed. Clean
data makes the results more accurate and trustworthy.
3. Data Analysis and Data Interpretation: After cleaning the data is studied
using tools like Excel, Python, R or SQL. Analysts look for patterns, trends
or useful information that can help solve problems or answer questions.
The goal here is to understand what the data is telling us.
4. Data Visualization: Data visualization is the process of creating visual
representation of data using the plots, charts and graphs which helps to
analyze the patterns, trends and get the valuable insights of the data. By
comparing the datasets and analyzing it data analysts find the useful data
from the raw data.
Types of Data Analytics
There are different types of data analysis in which raw data is converted into
valuable insights. Some of the types of data analysis are mentioned below:
Types of Data Analytics
1. Descriptive Data Analytics : Descriptive data analytics helps to
summarize and understand past data. It shows what has happened by using
tables, charts and averages. Companies use it to compare results, find
strengths and weaknesses and spot any unusual patterns.
2. Diagnostic Data Analytics: Diagnostic data analytics looks at why
something happened in the past. It uses tools like correlation, regression or
comparison to find the cause of a problem. This helps companies
understand the reason behind a drop in sales or a sudden change in
performance.
3. Predictive Data Analytics: Predictive data analytics is used to guess what
might happen in the future. It looks at current and past data to find patterns
and make forecasts. Businesses use it to predict things like customer
behavior, future sales or possible risks.
4. Prescriptive Data Analytics: Prescriptive data analytics helps to choose
the best action or solution. It looks at different options and suggests what
should be done next. Companies use it for things like loan approval, pricing
decisions and managing machines or schedules.
Methods of Data Analytics
There are two types of methods in data analytics which are mentioned below:
1. Qualitative Data Analytics
Qualitative data analysis doesn’t use statistics and derives data from the words,
pictures and symbols. Some common qualitative methods are:
• Narrative Analytics is used for working with data acquired from diaries,
interviews and so on.
• Content Analytics is used for Analytics of verbal data and behaviour.
• Grounded theory is used to explain some given event by studying.
2. Quantitative Data Analysis
Quantitative data Analytics is used to collect data and then process it into the
numerical data. Some of the quantitative methods are mentioned below:
• Hypothesis testing assesses the given hypothesis of the data set.
• Sample size determination is the method of taking a small sample from a
large group of people and then analysing it.
• Average or mean of a subject is dividing the sum total numbers in the list
by the number of items present in that list.
Skills Required for Data Analytics
There are multiple skills which are required to be a Data analyst. Some of the
main skills are mentioned below:
• Some of the common programming languages which are used
are R and Python.
• For databases Structured Query Language (SQL) is a programming
language used.
• Machine Learning is used in data analysis.
• In order to better analyse and interpret probability and statistics are used.
• For collecting and organising data, Data Management is used in data
analysis.
• To use charts and graphs Data visualisation is used.
Data analytics helps businesses grow by turning raw data into useful insights. It
supports smarter decisions, better planning and improve customer service across
many industries not just finance. From retail to government data analytics plays
a key role in today’s world by helping organizations find patterns, solve problems
and plan for the future.
Working with Formulas and Functions
In data analytics, formulas and functions are the essential tools used to clean,
transform, aggregate, and analyze data, typically within spreadsheet software (like
Excel, Google Sheets) or programming environments (like Python/Pandas, R).
The Role of Formulas and Functions in Analytics
Tool Definition Role in Analytics Example
An expression starting Used for custom, ad-hoc
with = that performs a calculations or = (A2 * B2) / 100
Formula calculation using combining multiple (Calculating a
references, constants, functions to create a weighted value)
and operators. metric.
A predefined, reusable Used for standardized,
=AVERAGE(D:D)
procedure that takes efficient data (Calculating the
Function
inputs (arguments) and manipulation and mean sales value)
returns a result. summary statistics.
Key Analytical Functions and Their Applications
Data analysts rely on specific function categories to handle different phases of the
analytical process:
Logical and Conditional Functions (Data Cleaning & Flagging)
These functions help categorize data and check for conditions.
• IF:Used to create a new column based on a condition (e.g., flagging
customers as "High Value" if their purchase amount is above a threshold).
• AND / OR: Used to combine multiple conditions within an IF statement.
• IFERROR: Used for error handling; replaces calculation errors (like #DIV/0!)
with a cleaner value (like 0 or "N/A").
Statistical and Aggregation Functions (Summary Analysis)
These are crucial for summarizing large datasets into meaningful metrics.
• SUM / AVERAGE / COUNT / MAX / MIN: Basic functions for descriptive statistics.
• MEDIAN / MODE / STDEV: Functions for measuring central tendency and
variability in a dataset.
• Conditional Aggregation (SUMIFS, COUNTIFS, AVERAGEIFS): Essential for
performing segmentation and calculating metrics for specific subsets of data.
o Example: =SUMIFS(Sales, Region, "West", Product, "A") aggregates sales for
only the "West" region and "Product A".
Lookup and Reference Functions (Data Integration)
These are used to merge data, enriching datasets by pulling corresponding values
from other tables or sheets.
• VLOOKUP / HLOOKUP: Searches for a value in one column/row and returns a
value from the same row/column in the designated index.
• INDEX / MATCH / XLOOKUP: The modern, more robust alternative to
VLOOKUP, allowing for lookups in any direction and providing greater
flexibility.
Text Functions (Data Standardization)
Used to prepare unstructured text data for analysis.
• TRIM: Removes unnecessary spaces, which often cause mismatched values.
• LEFT / RIGHT / MID: Used to extract specific portions of a text string (e.g.,
extracting product IDs from a longer code).
• CONCAT / &: Used to join different fields together (e.g., combining First Name
and Last Name).
Absolute vs. Relative References in Analytics
The strategic use of cell references is key to building scalable analytical reports.
Reference
Example Analytical Use Case
Type
Copying a formula down thousands of rows to apply the
Relative A1
same logic to every single data point.
Locking a reference to a fixed assumption or constant
Absolute $B$2 (e.g., the inflation rate, a target benchmark, or a conversion
factor) so it doesn't shift when the formula is copied.
$C2 or Creating cross-tabulations or sensitivity tables where one
Mixed
C$2 axis is fixed (absolute) and the other is dynamic (relative).
Advanced Array and Data Modeling Techniques
Modern data analysis often requires handling data in ranges (arrays) rather than
single cells.
• Array Functions (e.g., FILTER, UNIQUE): These automatically return multiple
results to adjacent cells (spilling). They are used to quickly create dynamic
subsets of data based on criteria, eliminating manual copy-pasting.
• Named Ranges: Assigning meaningful names (e.g., Sales_Data, Tax_Rate) to
cell ranges or constants. This makes complex formulas much easier to read,
debug, and maintain.