Lecture Discussion on
MACHINE LEARNING AND
PREDICTIVE ANALYTICS
A Machine Learning Approach:
Supervised Learning, Deep Learning, and Forecasting
By Joven R. Ramos
CPEN106: Elective 2 - Big Data Analytics
COURSE OUTCOMES
CO2: Analyze different machine learning models and big data
technologies to solve real-world data problems. (Bloom’s Level:
Analyze – Level 4)
CO3: Design and implement a machine learning project
addressing real-world challenges, incorporating data ethics
and privacy considerations. (Bloom’s Level: Create – Level 6)
INTENDED LEARNING OUTCOMES
ILO3. Differentiate supervised learning algorithms.
ILO4. Evaluate models based on performance metrics.
ILO5. Explain deep learning fundamentals.
ILO6. Implement a neural network model for classification tasks.
ILO7. Develop a predictive model using regression or time-series analysis.
ILO8. Apply forecasting techniques to business or environmental data.
MACHINE LEARNING 1:
SUPERVISED LEARNING ALGORITHMS
Concept Overview
Supervised learning involves training a model on labeled data, where the
input features (X) are mapped to an output (Y). The goal is to learn a function
that maps inputs to correct outputs.
Types of Supervised Learning
1. Regression Algorithms (For Predicting Continuous Values) - used when the
target variable is numeric and continuous (e.g., predicting house prices,
stock prices, temperature).
2. Classification Algorithms (For Categorical Predictions) - used when the
target variable is categorical (e.g., classifying emails as spam/not spam,
detecting fraud).
Regression Algorithms (For Predicting Continuous Values)
ALGORITHM | DESCRIPTION | REAL-WORLD APPLICATION
LINEAR REGRESSION | Finds the best-fit line for predicting continuous values. | Predicting house prices based on features like size and location.
POLYNOMIAL REGRESSION | Extends linear regression by fitting a polynomial curve. | Modeling non-linear relationships in stock market trends.
RIDGE & LASSO REGRESSION | Regularized versions of linear regression to prevent overfitting. | Used in healthcare for predicting disease progression.
SUPPORT VECTOR REGRESSION (SVR) | Uses Support Vector Machines for regression tasks. | Predicting air pollution levels.
DECISION TREE REGRESSION | Splits data into regions to make predictions. | Predicting electricity consumption based on past usage.
RANDOM FOREST REGRESSION | An ensemble of decision trees to improve accuracy. | Forecasting rainfall and weather conditions.
GRADIENT BOOSTING MACHINES (GBM) | Boosts weak models iteratively to reduce errors. | Sales forecasting in retail businesses.
XGBOOST REGRESSION | Optimized version of GBM for faster and better performance. | Used in financial analytics for predicting stock prices.
ARTIFICIAL NEURAL NETWORKS (ANN) FOR REGRESSION | Deep learning model that captures complex patterns. | Predicting customer lifetime value in e-commerce.
Example Code: Predicting House Prices
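A minimal sketch of linear regression for house prices, assuming NumPy is available. The feature values, the price formula used to generate the targets, and the helper names `fit_linear_regression` and `predict` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical training data: [size in square meters, distance to city center in km].
X = np.array([
    [50.0, 10.0],
    [70.0, 8.0],
    [90.0, 6.0],
    [110.0, 5.0],
    [130.0, 3.0],
    [150.0, 2.0],
])
# Hypothetical prices (in thousands), generated without noise from
# price = 20 + 1.5 * size - 2.0 * distance so the fit can be checked by hand.
y = 20.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1]


def fit_linear_regression(features, targets):
    """Return (intercept, coefficients) of the least-squares best fit."""
    design = np.column_stack([np.ones(len(features)), features])
    params, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return params[0], params[1:]


def predict(features, intercept, coefficients):
    """Apply the fitted linear function to new feature rows."""
    return intercept + features @ coefficients


intercept, coefficients = fit_linear_regression(X, y)
new_house = np.array([[100.0, 4.0]])  # 100 square meters, 4 km from the center
predicted_price = predict(new_house, intercept, coefficients)[0]
print(f"Predicted price: {predicted_price:.1f} thousand")
```

Because the targets here are noise-free, the fit recovers the generating coefficients; real housing data would leave residual error that the metrics later in this lecture can quantify.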
Classification Algorithms (For Categorical Predictions)
ALGORITHM | DESCRIPTION | REAL-WORLD APPLICATION
LOGISTIC REGRESSION | A statistical model for binary classification problems. | Predicting whether a loan will be approved (Yes/No).
K-NEAREST NEIGHBORS (KNN) | Classifies data points based on their closest neighbors. | Recommender systems (e.g., Netflix suggesting movies).
SUPPORT VECTOR MACHINE (SVM) | Finds the best hyperplane to separate classes. | Face recognition systems in security applications.
DECISION TREE CLASSIFIER | Splits data into branches for classification. | Medical diagnosis (e.g., detecting diabetes from patient records).
RANDOM FOREST CLASSIFIER | An ensemble of decision trees for better accuracy. | Fraud detection in banking transactions.
GRADIENT BOOSTING MACHINES (GBM) | Iteratively improves weak classifiers. | Customer churn prediction in telecom.
XGBOOST CLASSIFIER | A highly efficient gradient boosting algorithm. | Credit scoring models in finance.
NAÏVE BAYES CLASSIFIER | Uses probability for classification. | Spam email detection.
ARTIFICIAL NEURAL NETWORKS (ANN) FOR CLASSIFICATION | Deep learning model for complex patterns. | Image recognition (e.g., Google Photos categorizing images).
CONVOLUTIONAL NEURAL NETWORKS (CNN) | Extracts spatial features for classification tasks. | Detecting pneumonia from X-ray images.
RECURRENT NEURAL NETWORKS (RNN/LSTM) | Processes sequential data for classification. | Sentiment analysis in customer reviews.
Classification (Predicting Categorical Labels)
Example Code: Classifying Emails as Spam or Not Spam
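A minimal sketch of spam classification using a Naïve Bayes classifier written in plain Python. The six training messages, their labels, and the function names `train_naive_bayes` and `classify` are hypothetical; a real project would use a much larger labeled corpus and usually a library implementation.

```python
import math
from collections import Counter

# Hypothetical labeled messages (1 = spam, 0 = not spam).
train = [
    ("win a free prize now", 1),
    ("free money claim your prize", 1),
    ("limited offer win cash now", 1),
    ("meeting agenda for monday", 0),
    ("please review the project report", 0),
    ("lunch on monday with the team", 0),
]


def train_naive_bayes(examples):
    """Count words per class and documents per class."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def classify(text, word_counts, doc_counts, vocab):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen during training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(train)
spam_result = classify("claim your free prize", *model)
ham_result = classify("project meeting on monday", *model)
print(spam_result, ham_result)
```

Working in log space avoids multiplying many small probabilities together, and Laplace smoothing keeps a word that is unseen in one class from forcing that class's probability to zero.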
MACHINE LEARNING 2:
DEEP LEARNING FUNDAMENTALS
Deep learning is a subset of machine learning that uses neural networks with
multiple layers to model complex patterns in data.
Key Components of Neural Networks
Neurons (loosely modeled on brain cells) receive inputs, compute a weighted
sum, and pass the result on as output.
Activation Functions (ReLU, Sigmoid, Softmax) apply a non-linear transformation
that determines each neuron's output.
Loss Functions measure the model’s error.
Backpropagation & Optimization (Gradient Descent) adjust weights to
reduce error.
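The four components above can be seen together in a single trainable neuron. The following is a minimal sketch, assuming NumPy; the toy dataset (the logical AND function), the learning rate, and the iteration count are illustrative assumptions, not values from the lecture.

```python
import numpy as np


def sigmoid(z):
    """Activation function: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def binary_cross_entropy(y_true, y_prob):
    """Loss function: average negative log-likelihood of the true labels."""
    eps = 1e-12  # keeps log() away from zero
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))


# Toy dataset (assumption for illustration): the logical AND function.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.5  # illustrative value

initial_loss = binary_cross_entropy(y, sigmoid(X @ weights + bias))

for _ in range(5000):
    # Forward pass: weighted sum followed by the activation.
    y_prob = sigmoid(X @ weights + bias)
    # Backward pass: gradient of the loss with respect to weights and bias.
    error = y_prob - y
    grad_w = X.T @ error / len(y)
    grad_b = error.mean()
    # Gradient descent step: move the parameters against the gradient.
    weights -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_prob = sigmoid(X @ weights + bias)
final_loss = binary_cross_entropy(y, final_prob)
predictions = (final_prob >= 0.5).astype(int)
print(initial_loss, final_loss, predictions)
```

With a single neuron, backpropagation reduces to one application of the chain rule; in a multi-layer network the same error signal is passed backward layer by layer.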
Neural Network Structure
Input Layer: Accepts input features.
Hidden Layers: Process and extract patterns using weights and activation
functions (ReLU, Sigmoid).
Output Layer: Produces the final prediction.
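As a sketch of how data flows through the three layer types, the forward pass below uses NumPy with randomly initialized, untrained weights. The layer sizes (4 inputs, 8 hidden units, 3 output classes) and the random seed are assumptions chosen for illustration.

```python
import numpy as np


def relu(z):
    """Hidden-layer activation: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, z)


def softmax(z):
    """Output activation: turns each row of scores into class probabilities."""
    shifted = z - z.max(axis=1, keepdims=True)  # improves numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 4, 8, 3  # illustrative sizes

# Untrained weights; training would adjust these via backpropagation.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)


def forward(x):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    hidden = relu(x @ W1 + b1)
    return softmax(hidden @ W2 + b2)


batch = rng.normal(size=(5, n_features))  # 5 synthetic input rows
probs = forward(batch)
print(probs.shape)
```

Each row of `probs` sums to 1, so the output layer can be read as a probability distribution over the classes.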
Implementing Neural Networks
1. Convolutional Neural Networks (CNNs)
Purpose:
CNNs are designed for image recognition by extracting spatial features from
images.
✅ Use Cases:
Face recognition (e.g., unlocking smartphones).
Medical imaging (e.g., detecting pneumonia from X-rays).
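To make "extracting spatial features" concrete, here is a minimal NumPy sketch of the sliding-window operation that a convolutional layer applies. The 6x6 synthetic image and the hand-picked vertical-edge kernel are assumptions for illustration; in a trained CNN the kernel values are learned from data.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the image patch under it and sum.
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output


# Synthetic 6x6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d(image, kernel)
print(feature_map)
```

The feature map is non-zero only where the window straddles the edge, which is how a convolutional filter localizes a pattern regardless of where it appears in the image.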
Example Code: CNN for Image Classification (MNIST Digits)
2. Long Short-Term Memory (LSTM) Networks
Purpose:
LSTMs are ideal for processing sequential data like time-series, speech, and
text.
✅ Use Cases:
Stock price prediction.
Speech recognition (Google Assistant, Siri).
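The reason LSTMs handle sequences well is their gated cell state. Below is a minimal NumPy sketch of one LSTM time step applied across a short sequence; the weights are random and untrained, and the hidden size, seed, gate ordering, and input values are assumptions for illustration only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the parameters of the four gates."""
    n = h_prev.shape[0]
    gates = W @ x_t + U @ h_prev + b   # shape (4 * n,)
    i = sigmoid(gates[0:n])            # input gate: how much new info to add
    f = sigmoid(gates[n:2 * n])        # forget gate: how much memory to keep
    o = sigmoid(gates[2 * n:3 * n])    # output gate: how much memory to expose
    g = np.tanh(gates[3 * n:4 * n])    # candidate values for the cell state
    c_t = f * c_prev + i * g           # updated long-term memory
    h_t = o * np.tanh(c_t)             # updated hidden state (the output)
    return h_t, c_t


rng = np.random.default_rng(0)
input_size, hidden_size = 1, 4  # illustrative sizes
W = rng.normal(scale=0.5, size=(4 * hidden_size, input_size))
U = rng.normal(scale=0.5, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

sequence = np.array([0.1, 0.3, 0.2, 0.5, 0.4])  # hypothetical scaled values
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for value in sequence:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
print(h)
```

The cell state `c` is carried from step to step and is only modified through the forget and input gates, which is what lets information persist over many time steps.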
Example Code: LSTM for Stock Price Prediction
3. CNN-LSTM Hybrid Model
CNN-LSTM models combine CNNs (for feature extraction) and LSTMs (for
sequence learning).
✅ Use Cases:
Video classification (e.g., detecting suspicious activities in security
footage).
Weather prediction (using images + time-series data).
Example Code: CNN-LSTM for Video Classification
PREDICTIVE ANALYTICS
Developing Predictive Models
Predictive analytics uses historical data to predict future outcomes.
✅ Use Cases:
Retail: Forecasting customer demand.
Finance: Predicting loan defaults.
Example: Predicting Sales Using Regression
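A minimal sketch of a sales forecast using a linear trend fitted with NumPy's `polyfit`. The twelve monthly sales figures are hypothetical, built from an assumed upward trend plus small fixed deviations, and the variable names are illustrative.

```python
import numpy as np

months = np.arange(1, 13)
# Hypothetical monthly sales: an upward trend plus small fixed deviations.
deviations = np.array([3, -2, 1, -3, 2, -1, 2, -3, 1, -2, 3, -1], dtype=float)
sales = 200.0 + 15.0 * months + deviations

# Fit sales = slope * month + intercept by least squares.
slope, intercept = np.polyfit(months, sales, deg=1)

# Use the fitted line to predict the next, unseen month.
forecast_month_13 = intercept + slope * 13
print(f"Estimated growth per month: {slope:.2f}")
print(f"Forecast for month 13: {forecast_month_13:.1f}")
```

A straight-line trend is only a baseline; data with seasonality or changing growth rates would call for the time-series techniques covered in the forecasting part of this lecture.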
Forecasting Techniques
Forecasting is the process of using historical data to make future predictions. It
is widely used in business, finance, weather prediction, and environmental
monitoring.
This section explains three major time-series forecasting techniques:
Moving Averages – Simple smoothing technique for reducing noise.
ARIMA (Auto-Regressive Integrated Moving Average) – Statistical
forecasting model.
LSTM (Long Short-Term Memory Networks) – Deep learning for advanced
time-series forecasting.
1. Moving Averages (MA) – Smoothing Fluctuations
✅ Concept:
A moving average (MA) is a simple method for smoothing short-term
fluctuations in time-series data.
It works by averaging the most recent data points over a fixed
window size.
It is used mainly for trend analysis rather than direct forecasting.
✅ Use Cases:
✔ Stock price trends (e.g., 50-day moving average in stock trading).
✔ Temperature smoothing for weather trends.
✔ Sales forecasting (e.g., moving average of monthly revenue).
Example: Moving Average in Python
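A minimal sketch of a simple moving average using NumPy. The price series and the window size of 3 are hypothetical values chosen so the result is easy to check by hand.

```python
import numpy as np


def moving_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the full window fits.
    return np.convolve(values, kernel, mode="valid")


prices = np.array([10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0])  # hypothetical
smoothed = moving_average(prices, window=3)
print(smoothed)
```

The smoothed series is shorter than the input because the first full window only becomes available at the third observation; a larger window smooths more but reacts more slowly to real changes.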
✅ Interpretation:
The moving-average line smooths out random fluctuations, making the
underlying trend easier to see.
Moving Averages work best for short-term forecasting but struggle with
seasonality.
2. ARIMA (Auto-Regressive Integrated Moving Average)
✅ Concept:
ARIMA is a powerful statistical model for forecasting time-series data by
considering:
Auto-regression (AR): Uses past values to predict future values.
Integration (I): Differencing is used to make the data stationary.
Moving Average (MA): Uses past forecast errors to improve predictions.
✅ Use Cases:
✔ Stock market forecasting.
✔ Weather forecasting (e.g., rainfall prediction).
✔ Demand forecasting for businesses (e.g., sales, inventory).
Example: ARIMA for Stock Price Prediction
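The sketch below is a deliberately simplified, NumPy-only illustration of the AR and I components: an ARIMA(1,1,0) model with no moving-average term, fitted by ordinary least squares. It is not a full ARIMA implementation. The price series is hypothetical and constructed so that its first differences follow an exact AR(1) pattern, and the function names are assumptions.

```python
import numpy as np


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]."""
    x_prev, x_next = series[:-1], series[1:]
    design = np.column_stack([np.ones(len(x_prev)), x_prev])
    (c, phi), *_ = np.linalg.lstsq(design, x_next, rcond=None)
    return c, phi


def forecast_arima_110(series, steps):
    """Forecast with a simplified ARIMA(1,1,0): difference, AR(1), integrate."""
    diffs = np.diff(series)        # I: first differencing removes the trend
    c, phi = fit_ar1(diffs)        # AR: model each change from the previous one
    forecasts = []
    last_value, last_diff = series[-1], diffs[-1]
    for _ in range(steps):
        last_diff = c + phi * last_diff       # predict the next change
        last_value = last_value + last_diff   # undo the differencing
        forecasts.append(last_value)
    return np.array(forecasts)


# Hypothetical prices whose changes follow d[t] = 1 + 0.5 * d[t-1].
prices = np.array([100.0, 104.0, 107.0, 109.5, 111.75, 113.875, 115.9375])
forecast = forecast_arima_110(prices, steps=2)
print(forecast)
```

In practice a library implementation would normally be used instead, for example the `ARIMA` class in statsmodels, which also estimates the MA term and provides diagnostics; real stock prices are far noisier than this constructed series.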
✅ Interpretation:
ARIMA models time-series trends and patterns to predict future values.
It works best when the series is stationary after differencing, meaning its
mean and variance stay roughly constant over time.
3. LSTM (Long Short-Term Memory Networks) – Deep Learning for Time-Series Forecasting
✅ Concept:
LSTM is a type of Recurrent Neural Network (RNN) designed to capture
long-term dependencies in time-series data.
It remembers past values over time, making it ideal for complex
forecasting problems.
✅ Use Cases:
✔ Weather forecasting (e.g., temperature, rainfall).
✔ Stock market trend prediction.
✔ Energy consumption forecasting.
Example: LSTM for Weather Forecasting
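Before an LSTM can be trained on a weather series, the series is usually reshaped into fixed-length input windows paired with the value that follows each window. The sketch below shows only that preparation step, assuming NumPy; the temperature readings, the window length of 3, and the function name `make_windows` are hypothetical.

```python
import numpy as np


def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    inputs = np.array([series[i:i + window] for i in range(len(series) - window)])
    targets = series[window:]
    # Add a trailing feature axis: (samples, timesteps, features).
    return inputs[..., np.newaxis], targets


# Hypothetical daily temperatures in degrees Celsius.
temps = np.array([30.0, 31.0, 29.0, 28.0, 30.0, 32.0, 33.0])
X, y = make_windows(temps, window=3)
print(X.shape, y.shape)
```

The `(samples, timesteps, features)` layout matches the input shape that recurrent layers in common deep learning libraries such as Keras expect; each row of `X` is three consecutive days and the matching entry of `y` is the fourth.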
✅ Interpretation:
LSTM can capture long-term dependencies in weather data.
Works best when large datasets are available for training.
Comparison of Forecasting Techniques
METHOD | BEST FOR | STRENGTHS | WEAKNESSES
Moving Averages (MA) | Short-term trend smoothing | Simple, fast, removes noise | Doesn’t predict future values
ARIMA | Forecasting financial & sales data | Strong statistical basis | Requires stationary data
LSTM | Advanced time-series forecasting | Learns complex patterns | Requires large datasets & more computing power
PERFORMANCE METRICS FOR EVALUATION
Model Evaluation Metrics
Evaluating machine learning and deep learning models is crucial to
understanding their performance and reliability. Different problems require
different evaluation metrics.
1. Regression Model Metrics. Used when the model predicts continuous
numerical values, such as house prices or temperature forecasts.
METRIC | DESCRIPTION | BEST VALUE
Mean Absolute Error (MAE) | Measures the average absolute difference between actual and predicted values. | Closer to 0
Mean Squared Error (MSE) | Measures the average squared difference between actual and predicted values. | Closer to 0
Root Mean Squared Error (RMSE) | The square root of MSE, interpretable in the same units as the output. | Closer to 0
R² Score (Coefficient of Determination) | Measures how well the model explains the variance in the target variable. | Closer to 1
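A minimal sketch of the four regression metrics in the table, computed with NumPy. The actual and predicted values are hypothetical and the function name `regression_metrics` is an assumption.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R² for paired actual and predicted values."""
    errors = y_true - y_pred
    mae = np.mean(np.abs(errors))
    mse = np.mean(errors ** 2)
    rmse = np.sqrt(mse)
    ss_res = np.sum(errors ** 2)                     # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


y_true = np.array([3.0, 5.0, 7.0, 9.0])   # hypothetical actual values
y_pred = np.array([2.0, 5.0, 8.0, 11.0])  # hypothetical predictions
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MSE exceeds MAE here because squaring penalizes the single error of size 2 more heavily than the two errors of size 1, which is why MSE and RMSE are more sensitive to outliers.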
2. Classification Model Metrics. Used when the model predicts categorical
labels, such as spam detection or disease diagnosis.
METRIC | DESCRIPTION | BEST VALUE
Accuracy | Percentage of correctly classified instances. | Closer to 1
Precision | Measures how many predicted positives were actually positive. Important for fraud detection. | Closer to 1
Recall (Sensitivity) | Measures how many actual positives were correctly identified. Important for medical diagnosis. | Closer to 1
F1 Score | Harmonic mean of precision and recall. Best for imbalanced datasets. | Closer to 1
Confusion Matrix | Table showing TP, FP, FN, and TN for model predictions. | N/A
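A minimal sketch showing how the metrics in the table follow from the four confusion-matrix counts, written in plain Python. The label lists are hypothetical and the function names are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts, metrics)
```

The guards against zero denominators return 0.0 when a model predicts no positives or when no positives exist; that convention is a design choice in this sketch, and libraries may handle the case differently.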
3. Deep Learning Model Metrics. Deep learning models are evaluated with the
same metrics as other machine learning models, plus a few that are commonly
reported for architectures such as CNNs and LSTMs.
METRIC | DESCRIPTION | BEST VALUE
Cross-Entropy Loss (Categorical/Binary) | Measures the error for classification tasks in neural networks. | Closer to 0
AUC-ROC (Area Under Curve - Receiver Operating Characteristic) | Measures how well the model differentiates between classes. | Closer to 1
Perplexity (For NLP Models) | Measures how well a probabilistic model predicts a text sequence. | Lower is better
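A minimal sketch of the three metrics in the table, assuming NumPy. Perplexity is computed here as the exponential of the average cross-entropy, and AUC-ROC as the fraction of positive/negative pairs that the model ranks correctly. All probabilities, scores, and labels are hypothetical, as are the function names.

```python
import numpy as np


def categorical_cross_entropy(true_idx, probs):
    """Average negative log-probability assigned to the correct class."""
    eps = 1e-12  # keeps log() away from zero
    picked = probs[np.arange(len(true_idx)), true_idx]
    return -np.mean(np.log(np.clip(picked, eps, 1.0)))


def perplexity(true_idx, probs):
    """Exponential of the average cross-entropy (natural log)."""
    return np.exp(categorical_cross_entropy(true_idx, probs))


def auc_roc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))


# Hypothetical predicted class probabilities for two samples.
probs = np.array([[0.5, 0.25, 0.25],
                  [0.25, 0.5, 0.25]])
true_idx = np.array([0, 1])
loss = categorical_cross_entropy(true_idx, probs)
ppl = perplexity(true_idx, probs)

# Hypothetical binary labels and model scores.
labels = np.array([1, 1, 0, 0])
scores = np.array([0.9, 0.4, 0.5, 0.1])
auc = auc_roc(labels, scores)
print(loss, ppl, auc)
```

A perplexity of 2 means the model is, on average, as uncertain as if it were choosing uniformly between two options; the pairwise AUC computation is quadratic in the number of samples, so it suits small examples rather than large datasets.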
4. Time-Series Forecasting Metrics. Used when predicting values over time,
such as stock prices or weather forecasting.
METRIC | DESCRIPTION | BEST VALUE
Mean Absolute Percentage Error (MAPE) | Measures percentage error in predictions. | Closer to 0
Symmetric Mean Absolute Percentage Error (sMAPE) | Symmetric variant of MAPE that avoids division by zero when only the actual value is zero. | Closer to 0
Dynamic Time Warping (DTW) | Measures similarity between two time series as an alignment distance. | Lower is better
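A minimal sketch of the three forecasting metrics, assuming NumPy. The sMAPE variant shown divides by the mean of the absolute actual and forecast values, which is one common definition among several. The DTW function is the basic dynamic-programming form with absolute difference as the local cost. All series values are hypothetical.

```python
import numpy as np


def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return np.mean(np.abs((actual - forecast) / actual)) * 100.0


def smape(actual, forecast):
    """Symmetric MAPE using the mean of |actual| and |forecast| as denominator."""
    denom = (np.abs(actual) + np.abs(forecast)) / 2.0
    return np.mean(np.abs(actual - forecast) / denom) * 100.0


def dtw_distance(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


actual = np.array([100.0, 200.0, 400.0])    # hypothetical observations
forecast = np.array([110.0, 180.0, 400.0])  # hypothetical forecasts
mape_value = mape(actual, forecast)
smape_value = smape(actual, forecast)

# The second series repeats its first value, so DTW can align the two exactly.
dtw_same_shape = dtw_distance([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 3.0, 4.0])
dtw_offset = dtw_distance([0.0, 0.0], [1.0, 1.0])
print(mape_value, smape_value, dtw_same_shape, dtw_offset)
```

Unlike MAPE and sMAPE, DTW can compare series of different lengths, because it stretches or compresses the time axis to find the best alignment.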
Summary of Key Model Evaluation Metrics
TYPE OF MODEL | KEY METRICS
Regression | MAE, MSE, RMSE, R² Score
Classification | Accuracy, Precision, Recall, F1 Score, Confusion Matrix
Deep Learning | Cross-Entropy Loss, AUC-ROC, Perplexity (for NLP)
Time-Series Forecasting | MAPE, sMAPE, DTW