Stock Price Prediction Using ML Models

The document outlines a study on evaluating deep learning and machine learning models for stock price prediction using datasets from S&P 500 and STOXX 600. It details the methodological approach, including data preprocessing, model training strategies, and the performance of various models like LSTM and XGBoost. The study concludes with insights on future research directions, particularly in sentiment analysis related to stock prices.

Uploaded by

g.carrivale

Stock Market Forecasting
Gabriele Carrivale - 872488
Martino Pettinari - 866496
Stefano Madona - 874799
Overview
01 Introduction
02 Datasets
03 The Methodological Approach
04 Results and Evaluation
05 Discussion
06 Conclusions
Introduction
Mission

Focus: Financial market

Goal: Evaluate deep learning and machine learning models for stock price prediction
Datasets
Data Preprocessing
1. Data collection: 250 stocks from the S&P 500 and 250 from the STOXX 600.
2. Data transformation: The dataset contains only 7 columns; each row holds the metrics and the company's ticker.
3. Merging and cleaning:
The datasets were merged and rows with null values were removed.
The Close, Date and Volume columns were kept, and the date was set as the index.
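
The merging and cleaning step can be sketched in plain Python as below. This is a minimal illustration, not the authors' code: the row layout (dicts with `Date`, `Close`, `Volume` and `Ticker` keys), the function name `merge_and_clean`, and the sample values are all assumptions made for the example.

```python
from datetime import date

KEEP = ("Date", "Close", "Volume")  # columns kept after cleaning, per the slide

def merge_and_clean(*datasets):
    """Merge row lists, drop rows with null values, and index rows by date.

    Each dataset is a list of dicts (hypothetical layout). Returns
    {ticker: {date: {"Close": ..., "Volume": ...}}} with dates in order.
    """
    by_ticker = {}
    for rows in datasets:
        for row in rows:
            # Drop rows with a missing value in any kept column.
            if any(row.get(col) is None for col in KEEP):
                continue
            series = by_ticker.setdefault(row["Ticker"], {})
            series[row["Date"]] = {"Close": row["Close"], "Volume": row["Volume"]}
    # Sort each ticker's rows chronologically so the date acts as an ordered index.
    return {t: dict(sorted(s.items())) for t, s in by_ticker.items()}

# Made-up sample rows, for illustration only.
sp500 = [
    {"Date": date(2024, 1, 3), "Close": 101.0, "Volume": 1200, "Ticker": "AAA"},
    {"Date": date(2024, 1, 2), "Close": 100.0, "Volume": 1000, "Ticker": "AAA"},
    {"Date": date(2024, 1, 4), "Close": None, "Volume": 900, "Ticker": "AAA"},
]
stoxx600 = [
    {"Date": date(2024, 1, 2), "Close": 50.0, "Volume": 500, "Ticker": "BBB"},
]
cleaned = merge_and_clean(sp500, stoxx600)
```

In practice a dataframe library would likely handle this step; the plain-Python form is used here only to keep the sketch dependency-free.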
Datasets Example
Data Exploration
[Figure: correlation between individual stocks and their own market, for STOXX600 and S&P500]

Data Sequence
Sequences were built for the classification and regression tasks.

For the classification task, three methods were applied:

1. Mean Value Approach
2. Maximum Value Approach
3. Initial Value Approach
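
The slides name the three labeling methods without defining them. The sketch below shows one plausible reading, stated here as an assumption: each sequence is a sliding window of closing prices, and the label is 1 when the next close exceeds the window's mean, maximum, or first value. The function names, the window size, and the prices are hypothetical.

```python
from statistics import mean

# Reference value per method. These definitions are an assumed reading of the
# slide's method names, not taken from the study.
REFERENCES = {
    "mean": mean,                          # Mean Value Approach
    "max": max,                            # Maximum Value Approach
    "initial": lambda window: window[0],   # Initial Value Approach
}

def make_sequences(closes, window_size):
    """Return (window, next_close) pairs from a list of closing prices."""
    return [
        (closes[i:i + window_size], closes[i + window_size])
        for i in range(len(closes) - window_size)
    ]

def label(window, next_close, method):
    """Return 1 if next_close exceeds the window's reference value, else 0."""
    return int(next_close > REFERENCES[method](window))

closes = [10.0, 12.0, 11.0, 13.0, 12.5]  # made-up prices
pairs = make_sequences(closes, window_size=3)
labels = {m: [label(w, y, m) for w, y in pairs] for m in REFERENCES}
# labels == {"mean": [1, 1], "max": [1, 0], "initial": [1, 1]}
```

For the regression task the same `(window, next_close)` pairs can be used directly, with `next_close` as the target.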


The Methodological Approach
Training Approach
Strategies to train the models:

1. Index-Specific Models: one model for each major index.
2. Generalized Models for Individual Stocks: one model trained on all data.
3. General Model with Fine-Tuning.
Prediction Paradigm
Classification-Based Prediction: a classification task that determines whether the stock price will increase or decrease on the next trading day.

Regression-Based Prediction: estimates the stock price for the following trading day within a selected period.
Models
Long Short-Term Memory (LSTM) and bidirectional LSTM
(BiLSTM): Capture long-term temporal dependencies and
patterns.
[1]
CNN-BiLSTM: Combines CNNs and BiLSTMs. The CNN extracts key features, while the BiLSTM processes sequential patterns.

Random Forest Regressor and XGBoost Regressor: Traditional machine learning models optimized for non-linear relationships and efficient prediction.
LSTM Fine-Tuning
Goal: improve prediction accuracy on the data

Framework: KerasTuner

Optimal configuration:
First layer: 90–100 neurons and 0.2 dropout rate.
Second layer: 30–40 neurons and 0.5 dropout rate.

Tuning was performed on the S&P500 index data.

Challenges
Key challenges:
Data retrieval and cleaning: Handling invalid tickers and API restrictions for sentiment data.

Noisy and incomplete data: Addressed through preprocessing techniques.

Computational constraints: Required limiting dataset size and sequence length.
Results and Evaluation
Classification

Classification results on the entire dataset

Regression

Performance metrics for the entire dataset

Regression for Index

Performance metrics for the S&P500 dataset

Performance metrics for the STOXX600 dataset

Predictions

Comparison between the best and worst predictions when the entire dataset is used to train the models.
Confidence Interval

[2]
Regression results when jointly predicting the mean and the log variance
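
One common way to predict a mean and a log variance jointly is to train with a Gaussian negative log-likelihood and derive the interval from the predicted variance. The sketch below assumes that setup; the slides do not state the loss or the confidence level, so the formula, the 95% level (`z=1.96`), and the function names are assumptions rather than the study's method.

```python
import math

def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of y under N(mu, exp(log_var)), constant term dropped."""
    return 0.5 * (log_var + (y - mu) ** 2 / math.exp(log_var))

def confidence_interval(mu, log_var, z=1.96):
    """Return (lower, upper) bounds mu -/+ z * sigma, with sigma = exp(log_var / 2)."""
    sigma = math.exp(0.5 * log_var)
    return mu - z * sigma, mu + z * sigma

# Made-up prediction: mean 100 with variance 4, so sigma is 2.
low, high = confidence_interval(mu=100.0, log_var=math.log(4.0))
```

Predicting the log variance rather than the variance keeps the implied variance positive for any network output, which is the usual reason for this parameterization.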
Discussion
Future Work
Analyze sentiment related to S&P 500 companies.

VADER to generate sentiment scores and merge these with stock price data.

Future research could investigate advanced sentiment analysis models such as BERT or FinBERT.
Conclusion
Recap
Evaluated classification and regression models.
Maximum Value-Based Classification: highest accuracy, at 75%.
XGBoost outperformed the other models.
Training on the full dataset captured broad market trends.

The BiLSTM-based confidence interval captured market volatility.

Sentiment analysis using Twitter data showed potential for future work.
Thank You!
