Unit 1
Business Analytics Basics
Study Notes
1. Definition of Analytics
• Analytics is the systematic computational analysis of data or statistics.
• It involves using data, statistical methods, and models to identify patterns, relationships,
and insights.
• Helps in decision-making, problem-solving, and performance improvement.
Analytics is the process of discovering, interpreting, and communicating significant patterns in
data to find meaningful insights. It uses mathematical and statistical methods, machine
learning, and other techniques to analyse large datasets and help make informed decisions,
predict future trends, and solve problems.
Core concepts of analytics
• Discovery: Sifting through large, complex datasets to find hidden patterns.
• Interpretation: Understanding what these patterns mean and how they can be applied.
• Communication: Presenting the findings in a clear and useful way.
How it works
• Uses data: Analytics draws conclusions from data, including historical and real-time
data.
• Employs various methods: It uses a variety of techniques, such as statistical analysis,
predictive modelling, and machine learning.
• Builds models: It can use algorithms to create models that predict future outcomes or
automate decisions.
What it is used for
• Business improvement: Helping organizations increase sales, reduce costs, and make
better business decisions.
• Forecasting: Predicting future needs or trends, such as forecasting inventory
requirements for a clothing company based on past sales data.
• Understanding customers: Analysing customer data to understand their wants and
needs.
• Informing strategy: Providing insights that help businesses remain competitive.
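The inventory-forecasting use case can be illustrated with a very small sketch. It assumes a simple moving average as the forecasting method, which is only one of many possible techniques; the function name `moving_average_forecast` and the sales figures are hypothetical and made up for illustration.

```python
from statistics import mean


def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` observations")
    return mean(history[-window:])


# Hypothetical monthly unit sales for one clothing item (made-up numbers).
monthly_units = [120, 135, 150, 160, 155, 170]

# Average of the last three months: (160 + 155 + 170) / 3
forecast = moving_average_forecast(monthly_units, window=3)
print(f"Forecast for next month: {forecast:.1f} units")  # 161.7
```

A moving average smooths out month-to-month noise but ignores trend and seasonality, so a real clothing forecast would likely need a richer model.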
2. Evolution of Analytics
1. Traditional Analytics – Manual analysis using spreadsheets and reports.
2. Business Intelligence (BI) – Automated dashboards and historical reporting.
3. Advanced Analytics – Predictive and prescriptive modelling using AI/ML.
4. Real-time Analytics – Streaming data and instant decision-making (IoT, sensors).
The evolution of analytics has progressed from manual data collection and basic reporting in
the 1950s to modern, AI-driven, real-time systems. Key stages include the "spreadsheet era"
for simplified analysis, the "big data era" of the 1990s-2000s for managing massive datasets,
and the current phase where AI, machine learning, and cloud computing enable advanced
predictive and prescriptive insights.
1. Early Beginnings (1950s-1960s)
• Manual data collection: Businesses began manually collecting data to improve
operations.
• Early data processing: Mainframe computers were used to automate administrative
and transactional data processing tasks, a significant improvement over manual methods.
• Pioneering work: The US Census Bureau pioneered the use of computers for data
processing in the early 1950s, when it took delivery of the UNIVAC I (1951).
2. The Spreadsheet Era (Late 20th Century)
• Democratization of analysis: Microsoft Excel and similar tools made data analysis
accessible to a wider audience, moving it beyond experts.
• Structured data: These tools allowed for easier manipulation of structured data for
tasks like financial modelling and budgeting.
3. The Rise of Big Data and Business Intelligence (1990s-2000s)
• Internet and connected devices: The growth of the internet led to a massive increase
in data volume, creating the "big data" era.
• Sophisticated tools: Businesses adopted more advanced databases, data warehouses,
and Business Intelligence (BI) tools like online analytical processing (OLAP) to
manage and analyse this new scale of information.
4. The Modern AI Era (2010s-Present)
• Cloud computing: The cloud made analytics more scalable, accessible, and cost-
effective, enabling both big data and advanced analytics.
• AI and Machine Learning: Artificial intelligence and machine learning are now
integral, automating processes and enabling sophisticated predictive and prescriptive
analytics that go beyond "what happened" to "what will happen" and "what should we
do".
• New capabilities: Current trends include augmented analytics (AI assisting human
analysts), edge computing (processing data at the source), and data-as-a-service
models.
• Data sources: Modern systems can mine both structured and unstructured data from
sources like social media and IoT devices.
3. Growing Role of Business Analytics
• Enables data-driven decisions instead of intuition-based choices.
• Enhances efficiency, productivity, and profitability.
• Helps identify market trends, customer behaviour, and operational improvements.
• Provides a competitive edge in dynamic markets.
The role of business analytics is growing because it enables data-driven decision-making,
improves operational efficiency, and provides a competitive advantage by transforming raw
data into actionable insights. Key drivers include the explosion of big data, the need to
understand customer behaviour, and the ability to forecast future trends and manage risks more
effectively. This is leading to increased demand for analytics professionals and making
business analytics a cornerstone of modern business strategy and growth.
Key areas of impact
• Data-driven decision-making: Analytics provides the factual evidence needed to
make informed decisions, moving away from reliance on intuition.
• Improved operational efficiency: By analysing data, businesses can identify and
streamline inefficiencies, reduce costs, and optimize resource allocation.
• Enhanced customer understanding: Analytics helps organizations gain deep insights
into customer behaviour and preferences, which leads to more personalized marketing,
better customer experiences, and increased loyalty.
• Predictive analysis: Businesses can forecast market trends, anticipate future needs,
and stay ahead of competitors by using predictive models.
• Risk management: Analytics helps identify potential risks and allows companies to
develop proactive strategies to mitigate them.
• Innovation: By analysing market trends and consumer needs, businesses can uncover
new opportunities for products and services.
• Workforce optimization: Analytics can be used in human resources to predict
potential employee turnover and improve retention strategies.
Driving forces
• Big Data: The sheer volume of data generated daily from sources like customer
transactions and social media requires sophisticated analytics to make sense of it.
• Competitive pressure: Companies are using analytics to gain a competitive edge by
making faster, more informed decisions than their rivals.
• Technological advancements: Advancements in areas like AI and machine learning
are making it possible to develop more powerful analytics tools for real-time insights.
Career prospects
• The field is experiencing rapid growth, with demand for business analytics
professionals increasing significantly.
• Career opportunities are expanding across various industries, with high earning
potential for skilled professionals.
• Skills in areas like data analysis, problem-solving, strategic thinking, and technical
proficiency are highly valued.
4. Business Analytics vs Business Analysis
Aspect | Business Analytics | Business Analysis
Focus | Data-driven insights | Process improvement
Objective | Predict & optimize outcomes | Identify needs & solutions
Tools | Statistics, ML, visualization | Process models, requirements docs
Outcome | Actionable insights | Business requirements
Business analysis focuses on identifying business needs and finding solutions through
processes and functions, while business analytics is a data-driven field focused on using past
performance to predict future outcomes and optimize operations. Business analysts work on
requirements and processes, whereas business analytics professionals use statistical analysis
and data mining to inform strategic decisions.
Feature | Business Analysis | Business Analytics
Primary Goal | Define business needs and identify solutions to problems | Analyse past data to forecast future performance and improve operations
Focus | Business processes, functions, and requirements | Data, patterns, trends, and statistical analysis
Key Question | "What are the requirements to improve this process?" | "Why did this happen, and what will happen if we change this?"
Typical Skills | Requirements gathering, stakeholder management, process modelling | Statistics, data mining, predictive modelling, data visualization
Outcome | Process improvements, organizational changes, strategic planning | Data-driven insights, forecasts, and optimized business procedures
5. Business Intelligence vs Data Science
Aspect | Business Intelligence (BI) | Data Science
Focus | Historical data reporting | Predictive modelling
Techniques | Dashboards, SQL, visualization | Machine learning, AI
Output | “What happened?” | “What will happen?”
Goal | Improve understanding | Create predictive systems
Business Intelligence (BI) focuses on analysing past and present data to understand business
performance, while Data Science (DS) uses advanced algorithms and machine learning to predict
future trends and solve complex problems. BI answers "what happened?" and "why?", while
Data Science answers "what will happen?" and "what should we do about it?"
• Goal: BI aims to monitor operations and track KPIs, answering "What happened?" DS
aims to forecast trends and automate decisions, answering "What will happen next?".
• Data: BI primarily uses structured, internal data from sources like data warehouses. DS
works with both structured and unstructured data, including text and images, from
diverse sources.
• Methods: BI uses descriptive and diagnostic analytics, relying on reporting,
dashboards, and visualizations. DS uses advanced statistics, machine learning, and
algorithms to build predictive models.
• Skills: BI requires expertise in SQL and BI tools (Tableau, Power BI) and strong
business acumen. DS requires strong programming skills (Python, R), advanced
statistics, and machine learning expertise.
Key distinctions include:
• Focus: BI is backward-looking (descriptive and diagnostic analytics), while Data
Science is forward-looking (predictive and prescriptive analytics).
• Data Types: BI primarily uses structured, internal data (e.g., sales figures in a data
warehouse), whereas Data Science works with both structured and unstructured data
(e.g., text, images, sensor data).
• Tools & Techniques: BI utilizes tools like Power BI and Tableau for reporting and
visualization, while Data Science involves programming languages (Python, R),
statistical modelling, and machine learning frameworks.
• Deliverables: BI produces dashboards and reports for business users, while Data
Science creates predictive models and algorithms that are often integrated into business
processes.
• Complexity & Skills: Data Science requires a more advanced, technical skill set
(programming, advanced statistics) compared to the business acumen and SQL/tool
proficiency needed for BI.
6. Data Analyst vs Business Analyst
Role | Data Analyst | Business Analyst
Focus | Data collection & analysis | Process & business needs
Tools | Excel, Python, SQL, BI tools | Process maps, requirements docs
Goal | Extract insights from data | Bridge business & IT functions
Data Analysts focus on extracting insights from raw data using technical tools (SQL, Python,
BI tools): they find patterns and present findings. Business Analysts connect those insights
to business needs, focusing on process improvement and strategy, and explaining to
decision-makers what the data shows, why it matters, and how to act on it. This requires
stronger soft skills such as communication and stakeholder management. Both roles use data to
support decisions, but Data Analysts go deeper into data manipulation and interpretation,
while Business Analysts translate data into actionable business strategy and solutions.
Data Analyst
• Focus: Analysing complex datasets, identifying trends, statistical analysis, reporting.
• Key Skills: Advanced SQL, Python/R, Statistics, Data Visualization (Tableau/Power
BI).
• Core Task: Turning raw data into understandable insights and visualizations.
• Background: Often STEM fields (Math, CS).
Business Analyst
• Focus: Understanding business problems, improving processes, strategic
recommendations, bridging tech and business.
• Key Skills: Communication, Process Improvement, Critical Thinking, Stakeholder
Management, Basic SQL.
• Core Task: Translating data insights into business strategy and solutions.
• Background: Often Business/Finance/Admin fields.
Key Differences Summarized
• Goal: DA finds "what's in the data," while BA defines "how to use it" for business.
• Tools: DA leans technical (coding, advanced stats); BA leans strategic (process
mapping, requirements gathering).
• Output: DA delivers reports/dashboards; BA delivers solutions/process
improvements.
Overlap & Reality
• Hybrid Roles: Many companies blend these roles, requiring a mix of both technical
and business skills.
• Collaboration: The best results come from Data Analysts providing insights that
Business Analysts use to build strategies.
7. Types of Analytics
1. Descriptive Analytics – What happened? (Reports, dashboards, KPIs)
2. Diagnostic Analytics – Why did it happen? (Root cause analysis, correlations)
3. Predictive Analytics – What will happen? (Forecasting, regression, ML)
4. Prescriptive Analytics – What should we do? (Optimization, simulations)
The four main types of analytics, in increasing order of complexity, are Descriptive (what
happened?), Diagnostic (why did it happen?), Predictive (what might happen?),
and Prescriptive (what should we do about it?). Together they form a progression from
understanding the past to shaping the future: identifying patterns, then causes, then likely
outcomes, and finally recommended actions, using data and models.
1. Descriptive Analytics
• Answers: "What happened?"
• Focus: Summarizes historical data to provide insights into past performance, often
through dashboards, reports, and visualizations (e.g., total sales last quarter).
2. Diagnostic Analytics
• Answers: "Why did it happen?"
• Focus: Drills down into data to find the root causes and relationships behind past
events, uncovering anomalies and patterns (e.g., why sales dropped in a specific
region).
3. Predictive Analytics
• Answers: "What might happen next?"
• Focus: Uses statistical models and machine learning to forecast future trends and likely
outcomes based on historical data (e.g., predicting customer churn or future demand).
4. Prescriptive Analytics
• Answers: "What should we do about it?"
• Focus: Recommends specific actions or decisions to achieve desired outcomes, often
using optimization and simulation to guide next steps (e.g., suggesting personalized
offers to retain customers).
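The four types can be walked through on one toy dataset. The sketch below is only an illustration under stated assumptions: the regional sales figures are made up, the predictive step uses a plain least-squares trend line, and the prescriptive step is a simple hand-written rule standing in for the optimization or simulation a real system would use. The names `sales`, `linear_forecast`, and `actions` are hypothetical.

```python
from statistics import mean

# Hypothetical quarterly unit sales per region (made-up numbers).
sales = {
    "North": [100, 110, 120, 130],
    "South": [100, 95, 85, 70],
}

# 1. Descriptive: what happened? Total sales per quarter.
totals = [sum(quarter) for quarter in zip(*sales.values())]
print(totals)  # [200, 205, 205, 200]

# 2. Diagnostic: why? Break the flat total down by region.
change = {region: s[-1] - s[0] for region, s in sales.items()}
print(change)  # {'North': 30, 'South': -30}


# 3. Predictive: what might happen? Extend a least-squares trend line.
def linear_forecast(series):
    """Fit a straight line through the series and extend it one period."""
    n = len(series)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(series)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar + slope * (n - x_bar)


forecasts = {region: linear_forecast(s) for region, s in sales.items()}
print(forecasts)  # {'North': 140.0, 'South': 62.5}

# 4. Prescriptive: what should we do? A simple rule, not a real optimizer.
actions = {
    region: ("investigate and run a retention campaign"
             if forecasts[region] < s[-1] else "maintain current plan")
    for region, s in sales.items()
}
print(actions)
```

Note how the descriptive totals look almost flat, and only the diagnostic breakdown reveals that growth in one region is masking a decline in the other.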
Advanced Analytics
• Cognitive Analytics: An emerging type that automates the analytical process, using AI
and generative models to understand, predict, prescribe, and even act, moving beyond
human-defined rules.
8. Concept of Insights
• Insights are actionable findings derived from data analysis.
• They help organizations make strategic, tactical, and operational decisions.
• Example: “Customers aged 25–34 respond best to digital ads on weekends.”
Insights are deep, clear understandings of complex situations, people, or problems. They reveal
hidden truths or new perspectives and move beyond raw data to provide actionable intelligence
for better decisions and strategies, especially in business and marketing. While data is facts and
information is processed data, an insight is the "aha!" moment: a realization about why things
happen that enables meaningful change.
Key aspects of insight:
• Deep Understanding: It's not just knowing what (data), but understanding why
(insight).
• Actionable: True insights drive strategic action, transforming observations into
effective plans.
• Contextual: Insights often emerge from specific contexts, revealing customer
behaviours or market trends.
• Revealing: They uncover underlying motivations, human truths, or cause-and-effect
relationships.
• Intuitive & Sudden: Can sometimes come as a flash of clarity or a clear, intuitive
grasp.
Examples in practice:
• Business: Understanding that parents use convenience foods not just for ease, but to
feel they are providing thoughtful meals, leading to product innovation.
• Psychology: A client gaining new understanding of their own patterns to change
behaviour.
• Marketing: Discovering a core human truth about a target audience that informs a new
campaign.
In essence, insights bridge the gap between knowing facts and making smart, impactful choices.
9. Importance of Data in Business Analytics
• Data is the foundation of analytics.
• Enables businesses to:
o Understand trends & patterns
o Predict future performance
o Improve customer experience
o Optimize operations and costs
Data is the lifeblood of business analytics. Analysis turns raw facts into actionable
insights that drive informed decisions, boost efficiency, deepen customer
understanding, mitigate risks, and foster innovation. This allows companies to optimize
operations, personalize experiences, predict trends, and gain a competitive edge
in a data-driven world.
Key Roles of Data in Business Analytics:
• Informed Decision-Making: Replaces guesswork with evidence, helping leaders
make strategic choices about products, marketing, and investments.
• Operational Efficiency: Analyses processes (supply chain, production) to find
bottlenecks, reduce waste, cut costs, and improve productivity.
• Customer Understanding: Uncovers customer behaviours, preferences, and needs
from digital footprints, enabling personalized experiences and loyalty.
• Risk Management: Identifies potential risks and uncertainties, allowing for proactive
mitigation strategies.
• Revenue Growth: Optimizes marketing ROI, identifies new opportunities, and refines
product development.
• Performance Measurement: Tracks performance across departments (sales,
marketing, HR) to ensure resources are used effectively.
• Innovation & Strategy: Identifies market trends and gaps, fostering adaptability and
data-driven recommendations for growth.
• Human Resources: Optimizes talent management by predicting turnover, identifying
top performers, and designing better training.
In essence, data provides the objective foundation for understanding the past,
navigating the present, and strategically shaping the future of a business, making it
indispensable for success.
10. Differences Between Data, Information, and Knowledge
Level | Definition | Example
Data | Raw, unprocessed facts | “500 sales transactions”
Information | Processed data with context | “Sales increased by 10% in July”
Knowledge | Applied information for action | “Increase stock in July due to rising demand”
Data, Information, and Knowledge form a hierarchy. Data are raw facts (e.g., 10, 20,
30). Information is data organized with context (e.g., 10°C, 20°C, 30°C, showing a rising
temperature). Knowledge is applying that information with experience to understand
patterns (e.g., understanding that rising temperatures mean summer is coming) and make
decisions. Wisdom, a further level, adds judgment about when and how to apply it.
Data
• Definition: Raw, unprocessed facts, figures, symbols, or observations.
• Characteristics: Lacks inherent meaning or context.
• Example: Numbers like 75, 80, 90.
Information
• Definition: Data that has been processed, organized, structured, or put into context.
• Characteristics: Answers "who, what, where, when," providing meaning.
• Example: A student's scores (75, 80, 90) are data; their average (81.67) is
information.
Knowledge
• Definition: The application and understanding of information, combined with
experience, intuition, and context.
• Characteristics: Involves insight, judgment, and the ability to make informed decisions
or solve problems.
• Example: Knowing that a rising average temperature trend (information) indicates
summer is approaching (knowledge).
Key Differences
• Data: Building blocks; needs processing.
• Information: Processed data; provides context.
• Knowledge: Applied information; enables action and understanding.
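The step from data to information to knowledge can be sketched in a few lines, reusing the student-score example. This is only an illustration: the rule that turns the information into a recommendation is a hypothetical stand-in for the experience and judgment that real knowledge involves.

```python
from statistics import mean

# Data: raw numbers with no context.
scores = [75, 80, 90]

# Information: the same numbers processed and given context.
average = round(mean(scores), 2)  # 81.67
is_improving = all(later > earlier
                   for earlier, later in zip(scores, scores[1:]))  # True

# Knowledge: information applied to decide what to do.
# The rule below is hypothetical; in practice it would come from experience.
if is_improving:
    recommendation = "Scores are rising; keep the current study plan."
else:
    recommendation = "Scores are not rising; review the study plan."

print(f"Average score: {average}")
print(recommendation)
```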
11. Quality of Data
Good data should be:
• Accurate – Correct and error-free
• Complete – No missing values
• Consistent – Uniform across sources
• Timely – Up to date
• Relevant – Suitable for the purpose
Poor data quality leads to faulty insights and bad decisions.
Data quality is the degree to which a dataset is correct, complete, and useful for its intended
purpose. It's assessed across several key dimensions, including accuracy (correctness),
completeness (all necessary data is present), consistency (data aligns across different
sources), and timeliness (data is up-to-date). Ensuring high data quality is critical for reliable
decision-making, operational efficiency, and maintaining trust with stakeholders.
Key dimensions of data quality
• Accuracy: The degree to which data correctly represents the real-world object or
event it describes.
• Completeness: The extent to which all required data is available and present within a
dataset.
• Consistency: The uniformity of data across different systems and datasets.
• Timeliness: The degree to which data is available and up-to-date when it is needed.
• Validity: The degree to which data conforms to a defined format, rule, or standard.
• Uniqueness: The absence of duplicate records within a dataset.
• Integrity: The maintenance of referential relationships between datasets, ensuring no
broken links.
• Relevance: Whether the data is actually needed for the intended purpose.
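Several of these dimensions can be checked mechanically. The sketch below covers completeness, uniqueness, validity, and timeliness on a handful of made-up records; the field names, the validity rules (an "@" in the email, a non-negative amount), and the 90-day freshness threshold are all assumptions chosen for illustration, not standard definitions.

```python
from collections import Counter
from datetime import date

# Hypothetical payment records; all values are made up.
records = [
    {"id": 1, "email": "a@example.com", "amount": 120.0, "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "amount": 80.0, "updated": date(2024, 5, 2)},
    {"id": 2, "email": "b@example.com", "amount": 80.0, "updated": date(2024, 5, 2)},
    {"id": 3, "email": "not-an-email", "amount": -5.0, "updated": date(2023, 1, 15)},
]

REQUIRED = ("id", "email", "amount")  # fields every record must carry
AS_OF = date(2024, 5, 31)             # assumed date of the quality check
MAX_AGE_DAYS = 90                     # assumed freshness threshold


def completeness(rows):
    """Share of rows in which every required field is present and not None."""
    complete = sum(all(r.get(f) is not None for f in REQUIRED) for r in rows)
    return complete / len(rows)


def duplicate_ids(rows):
    """Uniqueness: ids that appear more than once."""
    counts = Counter(r["id"] for r in rows)
    return sorted(i for i, n in counts.items() if n > 1)


def invalid_ids(rows):
    """Validity: ids of rows that break a format or range rule.

    A missing email is a completeness problem, so it is not flagged here.
    """
    bad = []
    for r in rows:
        email_ok = r["email"] is None or "@" in r["email"]
        amount_ok = r["amount"] is not None and r["amount"] >= 0
        if not (email_ok and amount_ok):
            bad.append(r["id"])
    return bad


def stale_ids(rows):
    """Timeliness: ids of rows last updated more than MAX_AGE_DAYS ago."""
    return [r["id"] for r in rows if (AS_OF - r["updated"]).days > MAX_AGE_DAYS]


print(completeness(records))   # 0.75
print(duplicate_ids(records))  # [2]
print(invalid_ids(records))    # [3]
print(stale_ids(records))      # [3]
```

Accuracy and relevance are harder to automate, because they require comparing the data with the real world or with the purpose of the analysis.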
Why data quality is important
• Informed decisions: High-quality data leads to more reliable insights and better
decision-making.
• Operational efficiency: Reduces errors, boosts productivity, and streamlines
processes.
• Enhanced trust: Builds credibility and confidence among stakeholders.
• Compliance and risk mitigation: Helps organizations adhere to regulations and
avoid costly penalties.
• Cost savings: Minimizes the need for costly revisions and rework.
12. 5Vs of Big Data
1. Volume – Massive amounts of data generated daily
2. Velocity – Speed of data generation and processing
3. Variety – Different types (text, audio, video, etc.)
4. Veracity – Reliability and accuracy of data
5. Value – The usefulness of data insights
The 5 Vs of Big Data define its core challenges and opportunities: Volume (huge
amounts), Velocity (speed of generation), Variety (different types like text,
video), Veracity (data quality/trustworthiness), and Value (deriving useful insights). These
characteristics explain why traditional tools struggle with big data, requiring specialized
methods to manage and extract meaningful information for better decision-making in fields
from business to science.
Here's a breakdown of each 'V':
• Volume: The sheer scale of data generated, like social media posts, sensor data, or
transactions, measured in terabytes, petabytes, and beyond.
• Velocity: The extreme speed at which data is created, transmitted, and needs
processing, often in real-time (e.g., stock market data, GPS).
• Variety: Data comes in diverse formats, including structured (databases), semi-
structured (JSON, XML), and unstructured (text, images, audio, video).
• Veracity: The uncertainty or trustworthiness of the data; ensuring accuracy,
completeness, and reliability is crucial.
• Value: The ultimate goal – transforming this vast, complex data into actionable
insights, business intelligence, and tangible benefits.
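Variety is the easiest "V" to see in code, because each format needs different handling. The sketch below reads one structured, one semi-structured, and one unstructured input using only the Python standard library; the sample data is made up, and the keyword check on the free text is a deliberately naive stand-in for real text analysis.

```python
import csv
import io
import json

# Structured: fixed columns, as in a database table or CSV export.
structured = "order_id,amount\n1001,250.00\n1002,99.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total_amount = sum(float(r["amount"]) for r in rows)

# Semi-structured: self-describing, nested, fields may be missing or null.
semi_structured = '{"order_id": 1003, "items": [{"sku": "A1", "qty": 2}], "note": null}'
order = json.loads(semi_structured)
total_qty = sum(item["qty"] for item in order["items"])

# Unstructured: free text with no schema at all.
unstructured = "Great service, but delivery was late. Would order again."
mentions_late = "late" in unstructured.lower()

print(total_amount)   # 349.5
print(total_qty)      # 2
print(mentions_late)  # True
```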
13. Big Data Collection and Ethics
• Ethical collection ensures fairness, transparency, and consent.
• Avoid misuse or unauthorized sharing.
• Must respect user privacy and follow data protection regulations.
Big data collection ethics involves principles like transparency, consent, and fairness to
ensure data is used responsibly and respects individual privacy. Key ethical challenges
arise from the potential for misuse, discrimination, and bias, necessitating a focus on
proper permissions, anonymization, and accountability. Implementing ethical
guidelines is crucial to prevent harm and build trust in data-driven decision-making.
Core principles
• Transparency: Be clear with individuals about what data is being collected, why, and
how it will be used.
• Informed Consent: Obtain explicit consent from individuals after they have been
properly informed about data collection practices.
• Privacy: Avoid collecting sensitive or personally identifiable information
unnecessarily and comply with all relevant privacy laws.
• Fairness: Design data collection and analysis processes to avoid biased or exclusionary
methods that could lead to discrimination.
• Accountability: Establish clear responsibility for data handling and maintain
mechanisms for review and oversight.
• Data Minimization: Collect only the data that is strictly necessary for a specific,
legitimate purpose.
• Legitimate Purpose: Collect data only for clear, specific, and valid purposes.
Ethical practices for organizations
• Implement a governance framework: Establish a comprehensive framework to guide
data collection, processing, and decision-making.
• Conduct ethical impact assessments: Evaluate potential risks and harms before
collecting and using data.
• Prioritize individual control: Give individuals more meaningful control over how
their personal information is collected, used, and shared.
• Avoid harm: Use big data to solve societal challenges, not to exploit vulnerabilities or
cause harm to individuals or groups.
• Anonymize data: Protect individual identities by anonymizing data whenever
possible.
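Data minimization and identity protection can be combined in a small sketch. It keeps only the fields an analysis needs and replaces the email address with a keyed hash. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can re-link records and the remaining attributes may still identify a person. The field names, `NEEDED_FIELDS`, and the helper functions are hypothetical.

```python
import hashlib
import hmac
import secrets

# Secret key for the keyed hash. Generated here for the demo only; in
# practice it would be stored securely and kept apart from the data.
SECRET_KEY = secrets.token_bytes(32)

# Assumed list of fields the analysis actually needs.
NEEDED_FIELDS = ("age_band", "city")


def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize(record):
    """Keep only the needed fields plus a pseudonymous customer reference."""
    reduced = {field: record[field] for field in NEEDED_FIELDS}
    reduced["customer_ref"] = pseudonymize(record["email"])
    return reduced


# Hypothetical raw record (made-up values).
raw = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "phone": "555-0100",
    "age_band": "25-34",
    "city": "Pune",
}

safe = minimize(raw)
print(sorted(safe))  # ['age_band', 'city', 'customer_ref']
```

Because the same email always maps to the same reference under one key, records for a customer can still be joined for analysis without exposing the address itself.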
14. Data Sources and Collection Methods
Sources:
• Internal: CRM, ERP, sales logs, HR systems
• External: Social media, public data, IoT devices, APIs
Methods:
• Surveys & interviews
• Web scraping
• Transactional systems
• Sensors & IoT devices
Data sources are the origin points of information, categorized as primary (first-hand) or
secondary (existing), while data collection methods are the techniques used to gather this data.
The best choice depends on the research objectives, time, and budget.
Data Sources
• Primary Sources: Provide original, real-time data collected specifically for the current
research purpose.
o Examples: Data from your own surveys, interviews, observations, or
experiments; internal company records (accounting ledgers, employee reports)
when the organization gathers them itself for the analysis.
• Secondary Sources: Consist of data already collected by someone else for different
purposes, which is then used by the current researcher.
o Examples: Government census reports, published academic journals, industry
reports, online databases, or news articles.
Data Collection Methods
These are the tools and procedures used to obtain data from the sources.
• Surveys and Questionnaires: Widely used for gathering information from a large
number of respondents, online or on paper, and often producing quantitative data.
• Interviews and Focus Groups: Involve one-on-one or group conversations to gather
in-depth, qualitative insights into opinions, beliefs, and motivations.
• Observation: Involves watching and recording behaviours or events in their natural
context to gather objective data on real-world actions.
• Experiments: Involve manipulating variables under controlled conditions to establish
cause-and-effect relationships and test hypotheses.
• Document and Records Analysis: Reviewing existing written or digital documents,
such as financial statements, health records, or web logs, to extract relevant data.
• Online Tracking/Web Scraping: Automated processes to extract data from websites,
social media, or other online platforms to analyse online behaviour or trends.
15. Data Privacy, Security, and Ethical Considerations
• Data Privacy: Protecting personal and sensitive information.
• Data Security: Using encryption, firewalls, and access control.
• Ethics: Responsible use of data to maintain public trust.
• Follow data protection regulations such as GDPR and CCPA.
Data privacy, security, and ethics all concern the responsible handling of personal information.
The focus is on transparency, consent, data minimization, and strong security (such as
encryption and access control) to protect individuals, while navigating ethical challenges such
as algorithmic bias and data ownership and balancing utility against individual rights.
Regulations such as GDPR and principles such as fairness guide this work. Security
safeguards data, privacy defines rightful use, and ethics provides the moral framework for both,
ensuring trust and avoiding misuse.
Data Privacy
• Definition: Individuals' right to control their personal information and how it's
collected, used, and shared.
• Key Principles: Transparency (clear policies), consent (informed agreement), purpose
limitation (specific use), data minimization (collect only what's needed).
• Rights: Access, correction, deletion, portability (GDPR, CCPA).
Data Security
• Definition: Protecting data from unauthorized access, alteration, or destruction
(Confidentiality, Integrity, Availability).
• Measures: Encryption, authentication, access controls, secure storage, preventing
breaches.
• Role: A fundamental component of data governance, essential for trust and
compliance.
Ethical Considerations
• Informed Consent & Autonomy: Respecting individuals' right to choose how their
data is used.
• Fairness & Bias: Preventing algorithms from perpetuating societal biases present in
training data.
• Transparency & Accountability: Explaining data practices and being answerable for
failures.
• Data Ownership: Shifting perspective from organization ownership to individual data
rights.
• Privacy vs. Utility: Balancing the benefits of data analysis with individual privacy.
Interconnections & Best Practices
• Ethics informs Privacy, Privacy relies on Security: Ethical principles guide privacy
rules, while strong security ensures privacy promises are kept.
• Privacy by Design: Building privacy and security into systems from the start, not as
an afterthought.
• Anonymization/De-identification: Techniques to protect identity, though challenging
in practice.
• Regulatory Compliance: Adhering to laws like GDPR (EU) and CCPA (California, US),
which enforce these principles.