Course Name: Advanced Business Analytics Using R & Python
Home Assignment 1 - Answers
CO1
Question 1: Explain the levels of measurement in data and illustrate them with examples.
Answer:
1. Nominal scale: Categorical labels without intrinsic order (e.g., gender, product category). Analysis:
mode, contingency tables, chi-square tests.
2. Ordinal scale: Categories with a meaningful order but unknown distance between levels (e.g., customer
satisfaction: low/medium/high). Analysis: median, percentiles, non-parametric tests such as the Mann-Whitney U test.
3. Interval scale: Ordered numeric scales with equal intervals but no true zero (e.g., temperature in
Celsius). Analysis: mean, standard deviation, correlation, regression. Differences are meaningful, but ratios
are not (20 °C is not "twice as hot" as 10 °C).
4. Ratio scale: Numeric with a meaningful zero, allowing ratios (e.g., income, weight). Analysis: all arithmetic
operations are valid, including geometric means and the coefficient of variation.
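The four scales above can be illustrated with a minimal Python sketch that applies a suitable summary statistic to each one. The sample values and variable names (product_category, satisfaction, temperature_c, income) are made up for illustration and are not part of the assignment data.

```python
from statistics import mean, median_low, mode, stdev

# Hypothetical sample data, one list per level of measurement.
product_category = ["toys", "books", "toys", "games", "toys"]  # nominal
satisfaction = ["low", "high", "medium", "medium", "high"]     # ordinal
temperature_c = [10.0, 20.0, 30.0]                             # interval
income = [20000.0, 40000.0, 60000.0]                           # ratio

# Nominal: only counting is meaningful, so report the most frequent label.
nominal_mode = mode(product_category)

# Ordinal: map labels to ranks (order only) and take the median rank.
# median_low always returns an observed rank, so it maps back to a label.
rank = {"low": 1, "medium": 2, "high": 3}
label = {r: name for name, r in rank.items()}
ordinal_median = label[median_low(rank[s] for s in satisfaction)]

# Interval: means and differences are fine, but a ratio depends on the
# arbitrary zero point. The same two temperatures give different ratios
# in Celsius and in Kelvin.
mean_temp_c = mean(temperature_c)
ratio_celsius = temperature_c[1] / temperature_c[0]
ratio_kelvin = (temperature_c[1] + 273.15) / (temperature_c[0] + 273.15)

# Ratio: a true zero makes ratios and the coefficient of variation valid.
income_ratio = income[2] / income[0]
cv_income = stdev(income) / mean(income)

print("Mode of product category:", nominal_mode)
print("Median satisfaction:", ordinal_median)
print("Mean temperature (C):", mean_temp_c)
print("20C/10C ratio in Celsius vs Kelvin:", ratio_celsius, round(ratio_kelvin, 3))
print("Income ratio and CV:", income_ratio, cv_income)
```

The Celsius and Kelvin ratios differ (2.0 versus roughly 1.035), which is why ratios are reserved for ratio-scale data.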
Question 2: What is Big Data? Discuss its characteristics and relevance in business contexts.
Answer:
1. Definition: Big Data refers to datasets that are so large, fast-moving, or complex that traditional data
processing tools are inadequate. Big Data often requires distributed storage and parallel processing.
2. Key characteristics (the Vs): Volume (massive scale), Velocity (speed of generation and processing),
Variety (structured, unstructured, semi-structured), Veracity (uncertainty and quality), and Value
(extractable insights).
3. Business relevance: Enables improved customer segmentation, real-time analytics (fraud detection,
recommendation systems), operational optimization (IoT sensor data), and data-driven product innovation.
Deploying Big Data solutions also requires attention to governance, privacy, and infrastructure.
CO2
Question 1: Write a short note on univariate and bivariate analysis with suitable examples.
Answer:
1. Univariate analysis: Examination of a single variable to summarize its distribution, central tendency,
and spread (e.g., mean, median, mode, variance). Example: Analyzing the monthly sales distribution using a
histogram and descriptive statistics.
2. Bivariate analysis: Analysis of the relationship between two variables (e.g., scatterplot, correlation,
cross-tabulation). Example: Assessing the relationship between advertising spend and sales using a scatterplot
and Pearson correlation; for pairs of categorical variables, use cross-tabulation and chi-square tests.
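As a rough illustration of both kinds of analysis, the sketch below summarizes a single variable and then measures the relationship between two. The figures for ad_spend and sales, the region/purchase pairs, and the helper pearson_r are all hypothetical and use only the Python standard library.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical monthly figures (arbitrary currency units).
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 41.0, 68.0, 79.0, 110.0]

# Univariate analysis: describe one variable on its own.
sales_summary = {
    "mean": mean(sales),
    "median": median(sales),
    "stdev": stdev(sales),  # sample standard deviation
}

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Bivariate analysis (numeric pair): strength of the linear relationship.
r = pearson_r(ad_spend, sales)

# Bivariate analysis (categorical pair): a cross-tabulation of counts,
# which is the input a chi-square test would work from.
observations = [("north", "yes"), ("north", "no"),
                ("south", "yes"), ("south", "yes")]
crosstab = Counter(observations)

print(sales_summary)
print("Pearson r:", round(r, 3))
print(dict(crosstab))
```

A value of r close to 1 suggests a strong positive linear association in this made-up sample; it does not by itself establish that advertising causes sales.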
Question 2: Explain normalization and its importance in preparing data.
Answer:
1. Definition: Scaling numeric features to a common range (e.g., 0–1) or transforming them to have
zero mean and unit variance (standardization).
2. Common methods: Min-max scaling: x' = (x - min) / (max - min); Standardization (Z-score): x' = (x - µ) / σ.
3. Importance: Improves numerical stability, helps gradient-based optimization converge faster, prevents
features with large scales from dominating distance- and margin-based models (k-NN, SVM, k-means), and makes
regularization penalties comparable across features.
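The two formulas above can be sketched in plain Python as follows. The function names min_max_scale and z_score and the income values are illustrative assumptions, and mapping a constant feature to 0.0 is one possible convention rather than a standard.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values to [0, 1] using x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant feature maps to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values using x' = (x - mu) / sigma (population sigma)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Assumed convention: a constant feature maps to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical feature on a large scale.
income = [20000.0, 40000.0, 60000.0, 80000.0]

scaled = min_max_scale(income)
standardized = z_score(income)

print("Min-max scaled:", scaled)
print("Z-scores:", [round(z, 3) for z in standardized])
```

In practice the minimum, maximum, mean, and standard deviation are usually computed on the training data only and then reused to transform new data, so that information from the test set does not leak into the model.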