Modifying Neural Network for MNIST Data

This document contains instructions for an assignment in an introductory artificial intelligence course. It includes four questions. The first asks students to modify neural network code to write error values to a file, try different network architectures, change the dropout rate, and add noise to the data. The second question describes building a neural network to predict soccer match outcomes. The third question involves calculations with a network of linear neurons. The fourth question asks students to explain the relationship between gradient descent and backpropagation.

Uploaded by

Selmanurto

CS 470/670 – Intro to Artificial Intelligence – Fall 2018

Instructor: Marc Pomplun

Assignment #4
Posted on November 21 - Due by November 29, 2:00pm

Question 1: Modifying the Neural Network Code

On the course homepage, you will find the neural network code that I showed in class in a
file named “MNIST_demo.c”. In order to make it work, you will also need to download
the following files from the website at:

[Link]

[Link]: training set images (9912422 bytes)
[Link]: training set labels (28881 bytes)
[Link]: test set images (1648877 bytes)
[Link]: test set labels (4542 bytes)

Unpack all files into the same directory as the network code, and then the program should
be able to learn the MNIST data and test its accuracy for 150 epochs. It will take quite a
while, though. So make sure you have other things to do while the network does its job.

(a) Modify the code so that it writes the error for the training set and the error for the test
set after each epoch into a file. Let the network run for at least 150 epochs (if you let it
run overnight, you may as well perform more epochs) and plot training and test error as
functions of epoch number, as in Figure 9.9 (right panel) of Chapter 9 that I sent you.
If your plot looks completely different from Figure 9.9, please let me know and we can
check what went wrong.
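
One possible way to approach the file output in (a) is sketched below. It is only an illustration: the function name log_epoch_errors, the file name, and the comma-separated layout are assumptions made here, and how MNIST_demo.c actually stores its training and test errors is not shown in this handout.

```c
#include <stdio.h>

/* Append one line "epoch,train_error,test_error" to a text file.
   Returns 0 on success and -1 if the file cannot be opened or written.
   The function name and the CSV layout are assumptions for this sketch. */
int log_epoch_errors(const char *path, int epoch,
                     double train_error, double test_error)
{
    FILE *fp = fopen(path, "a");   /* append, so earlier epochs are kept */
    if (fp == NULL)
        return -1;

    int written = fprintf(fp, "%d,%.6f,%.6f\n", epoch, train_error, test_error);

    /* Report failure if either the write or the close went wrong. */
    if (fclose(fp) != 0 || written < 0)
        return -1;
    return 0;
}
```

Calling such a function once at the end of every epoch yields a file with one row per epoch, which most plotting tools can read directly.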

(b) See what happens if you use fewer neurons in the two hidden layers. You can choose
any numbers, but make at least one layer significantly smaller than in the original
network. Again, plot the training and test errors across epochs. How do the results differ
from the initial ones?

(c) Restore the original number of neurons, and now change the dropout rate. Originally,
25% of input units (see line 164 in the code) and 50% of hidden-layer neurons (see line
172) are randomly chosen to drop out, i.e., give no output. Note that when we run the
network in production (non-training) mode without dropout, we need to increase the
output of neurons accordingly to keep the overall activation at the same level (lines 180
and 182). Change the dropout rate for the hidden-layer and/or input-layer units to
whichever value you like, and run the network again for at least 150 epochs. Plot the
results and compare them to the ones you got in (a).
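
To make the idea of a dropout rate concrete, here is a minimal sketch of the masking step alone. The function apply_dropout and its parameters are hypothetical and are not taken from MNIST_demo.c. The sketch assumes activations are stored as an array of double, and it leaves out the production-mode compensation mentioned above.

```c
#include <stdlib.h>

/* Zero each activation independently with probability drop_rate (0.0 to 1.0).
   Returns the number of units that were dropped.
   Hypothetical helper: uses rand(), so call srand() first for varied runs. */
size_t apply_dropout(double *activations, size_t n, double drop_rate)
{
    size_t dropped = 0;
    for (size_t i = 0; i < n; i++) {
        /* u is uniformly distributed in [0, 1) */
        double u = rand() / ((double)RAND_MAX + 1.0);
        if (u < drop_rate) {
            activations[i] = 0.0;   /* this unit gives no output */
            dropped++;
        }
    }
    return dropped;
}
```

With drop_rate set to 0.25 or 0.5, roughly that share of units is silenced on each call. The exact count varies from call to call because every unit is decided independently.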

(d) Add noise to the data by randomly choosing a certain percentage of pixels in both the
training and test images and flipping their intensity so that intensity i will become (255
– i). For example, if a pixel has the original value 10, then after this transformation it
has the value 245. The easiest way to do this is to modify the readData() function. You
can choose any percentage you like, and any of the three networks that you created
above. Again, run it for at least 150 epochs and compare the results to the original
network (whichever one you chose).
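
The pixel transformation in (d) could look roughly like the sketch below. The function flip_pixels is a hypothetical helper, and it assumes that pixel intensities are available as an array of unsigned char; how readData() actually stores the images is not shown in this handout. Each pixel is chosen independently, so the requested percentage is met only on average, not exactly.

```c
#include <stdlib.h>

/* Invert each pixel (intensity i becomes 255 - i) with probability
   `fraction` (0.0 to 1.0). Returns the number of pixels inverted.
   Hypothetical helper; the buffer layout is an assumption. */
size_t flip_pixels(unsigned char *pixels, size_t n, double fraction)
{
    size_t flipped = 0;
    for (size_t i = 0; i < n; i++) {
        /* u is uniformly distributed in [0, 1) */
        double u = rand() / ((double)RAND_MAX + 1.0);
        if (u < fraction) {
            pixels[i] = (unsigned char)(255 - pixels[i]);
            flipped++;
        }
    }
    return flipped;
}
```

For instance, a pixel with value 10 that is selected ends up with value 245, matching the example above.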

Question 2: The Soccer Network

Here is an idea for an ANN that would make you rich if it performed well. This ANN
predicts the results of soccer matches. The network receives information about the two
competing teams and the conditions of the match and is supposed to predict how many
goals each team will score. With this knowledge, you could bet on the projected winning
team and win a lot of money.

Let us say that every team consists of 20 players. You are providing the following input
data to the network:

• The skill level of every player on each of the two teams. Skill is rated by a group
of soccer reporters on a scale from 0 (“is unable to kick the ball”) to 10 (“world
class player”).

• The number of matches that each team has played during the last two weeks. There
are never more than seven matches in that period of time.

• The statistics of former matches between the same two teams within the past 10
years (e.g., Team A won 30% of the matches, Team B 45%, and 25% of the matches
were tied).

• The continent that each team comes from (North America, South America, Europe,
Africa, Asia, or Australia).

• Where the match takes place (Team A’s stadium, Team B’s stadium, or neutral
place).

• The phase of the soccer season (early season vs. late season).

You want to build and train a backpropagation network that, based on this information, is
able to predict the number of goals each team will score. Describe an appropriate way of
formatting the input, interpreting the output, collecting exemplars, constructing the
network, training the network, and testing the network. Give reasons for the decisions that
you make. Describe everything in great detail so that a computer programmer who does
not know anything about ANNs would be able to successfully build this network
application, predict results, and become rich. The programmer can look up the BPN
equations for training and operation in a book, but needs precise explanations for
everything else. Please help him/her out!
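
As one illustration of what "formatting the input" can mean, the sketch below encodes two of the inputs listed above. It is not a model answer. The choice of a one-hot code for the continent and a 0-to-1 scale for the skill rating, as well as the names encode_continent and scale_skill, are assumptions made here.

```c
/* The six continents named in the input list, plus a count. */
enum continent {
    NORTH_AMERICA, SOUTH_AMERICA, EUROPE, AFRICA, ASIA, AUSTRALIA,
    NUM_CONTINENTS
};

/* Write a one-hot code for continent c into out[0..NUM_CONTINENTS-1]:
   the position belonging to c is set to 1.0, all others to 0.0. */
void encode_continent(enum continent c, double out[NUM_CONTINENTS])
{
    for (int i = 0; i < NUM_CONTINENTS; i++)
        out[i] = (i == (int)c) ? 1.0 : 0.0;
}

/* Map a skill rating on the 0..10 scale to the range 0..1. */
double scale_skill(double skill)
{
    return skill / 10.0;
}
```

A one-hot code avoids suggesting an order among the continents, which a single number from 0 to 5 would do. Whether that is the best choice is part of what the question asks you to justify.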

Question 3: Linear Neurons

The following is a network of linear neurons - that is, neurons whose output is identical to
their net input, x⋅w (in other words, their output function that translates net input into output
is simply the identity function). These neurons do not receive any “dummy” inputs (biases
or offsets). The numbers in the circles indicate the output of a neuron, and the labels of
connections indicate the value of the corresponding weight.

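[Figure: three-layer network of linear neurons (input layer with inputs 2 and 1, hidden layer, output layer); each connection is labeled with its weight.]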

(a) Just as a warm-up exercise, compute the output of the hidden-layer and the output-layer
neurons for the given input (2, 1).

(b) Only mandatory for CS670: Show that a network of linear neurons, such as this one,
always computes a linear function, regardless of its number of layers and neurons.
Hint: A function y = f(x) is linear if and only if it can be expressed as y = Ax for some
matrix A.
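
As a sketch of the idea behind the hint, assume the weights into each layer are collected in a matrix whose rows are the weight vectors of that layer's neurons. The symbols W_1, W_2, h, and L below are notation introduced here, not taken from the assignment. Each layer then multiplies its input by a matrix, and chaining the layers gives:

```latex
\[
h = W_1 x, \qquad y = W_2 h = W_2 (W_1 x) = (W_2 W_1)\, x = A x
\quad \text{with } A = W_2 W_1 .
\]
\[
\text{For } L \text{ weight layers:} \quad
y = W_L W_{L-1} \cdots W_1 \, x = A x,
\qquad A = W_L W_{L-1} \cdots W_1 .
\]
```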

(c) Only mandatory for CS670: Given that our three-layer network computes a linear
function, we suddenly notice that our network is wastefully large. It must be possible
to compute exactly the same function with a two-layer network. Draw such a network,
including all of its weights, that only consists of an input layer and an output layer and
computes the same function as the network shown above. Hint: In the network above,
determine how the output of each output-layer neuron depends on the two network
inputs, and then you should be able to find the correct weights for the two-layer
network. There is a more elegant way of deriving the solution that is related to (b), but
any correct solution gets full points, regardless of your approach.
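
The matrix-based route mentioned in the hint can be sketched in code as a single matrix product. The function collapse_layers is a hypothetical helper, and the row-major layout (one row of weights per receiving neuron) is an assumption of this sketch, not something specified in the assignment.

```c
#include <stddef.h>

/* Compute a = w2 * w1 with row-major storage.
   w1 is n_hid x n_in  (row h holds the weights into hidden neuron h),
   w2 is n_out x n_hid (row o holds the weights into output neuron o),
   a  is n_out x n_in  (row o holds the weights of the two-layer network). */
void collapse_layers(const double *w1, const double *w2, double *a,
                     size_t n_in, size_t n_hid, size_t n_out)
{
    for (size_t o = 0; o < n_out; o++) {
        for (size_t i = 0; i < n_in; i++) {
            double sum = 0.0;
            /* Add up the contribution of every path input i -> h -> output o. */
            for (size_t h = 0; h < n_hid; h++)
                sum += w2[o * n_hid + h] * w1[h * n_in + i];
            a[o * n_in + i] = sum;
        }
    }
}
```

With made-up weights w1 = {1, 2, 3, 4} and w2 = {1, 0, -1, 1} for a 2-2-2 network, the call fills a with {1, 2, 2, 2}. These numbers are for illustration only and are not the weights of the network shown above.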

Question 4 (Bonus): From Gradient Descent to Backpropagation

Explain in your own words how the concepts of gradient descent and backpropagation
are related to each other.

Please put your answers to all questions in a single text file and upload it to your course
directory. Alternatively, you can submit some or all answers as a hardcopy at the start of
the class.

Common questions

Gradient descent is an optimization algorithm that minimizes a loss function by repeatedly stepping in the direction of steepest decrease, i.e., along the negative gradient. Backpropagation is the procedure that makes this practical for multi-layer networks: it uses the chain rule to compute the partial derivative of the loss with respect to each weight, propagating error terms backward from the output layer. Gradient descent then uses those derivatives to update the weights in all layers.
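
In symbols, the two ideas fit together roughly as follows. The notation is chosen here for illustration and is not taken from the source: E is the loss, \eta the learning rate, w_{ij} the weight from neuron i to neuron j, o_i the output of neuron i, net_j the net input of neuron j, and f the output function. Gradient descent prescribes the update in the first line; backpropagation supplies the derivative through the chain rule, layer by layer:

```latex
\[
w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}},
\qquad
\frac{\partial E}{\partial w_{ij}}
  = \frac{\partial E}{\partial \mathrm{net}_j} \,
    \frac{\partial \mathrm{net}_j}{\partial w_{ij}}
  = \delta_j \, o_i ,
\qquad
\delta_j := \frac{\partial E}{\partial \mathrm{net}_j} .
\]
\[
\text{For a hidden neuron } j \text{ feeding neurons } k: \quad
\delta_j = f'(\mathrm{net}_j) \sum_k w_{jk} \, \delta_k .
\]
```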

A multi-layer network of linear neurons can be condensed into a two-layer network (an input layer and an output layer) because the function it computes has the form y = Ax for a single matrix A. Multiplying the weight matrices of the successive layers gives A, and using its entries as the weights of the two-layer network reproduces the original function exactly with a smaller architecture.

Introducing noise by flipping pixel intensity values makes the task harder for the network during training and can make the model more robust to input variability. Because randomly chosen pixels are modified, the network has to rely on features that survive the corruption. Comparing the error plots before and after the noise is added shows how resilient the model is and how well it generalizes to corrupted data.

Using diverse features such as player skills, match histories, and contextual information can lead to different prediction accuracies. A rich feature set lets the network learn more nuanced relationships, but it also increases the input size and may require more training data to prevent overfitting. Evaluating the model with different feature sets shows which attributes matter most for predictive accuracy and informs the choice of inputs.

To track performance, modify the neural network code to write the error rates for both the training and test sets into a file after each epoch. Running the network for at least 150 epochs allows the errors to be plotted as functions of epoch number, which makes it possible to analyze training and generalization over time. If the plot differs strongly from the reference figure, that points to possible issues with learning dynamics or data handling.

Reducing the number of neurons in one or more hidden layers reduces the capacity of the neural network to represent complex functions. Fewer neurons might increase training and test error rates, indicating underfitting. By comparing the errors after lowering neuron counts to the original errors, one can evaluate how hidden-layer size affects learning and model generalization.

Each player can be represented by a skill level, and each team by its number of recent matches. Compile these inputs along with historical match data, team geographic origin, match location, and seasonal phase into a single normalized input vector. Construct a backpropagation network that accommodates these inputs with appropriate layers and activation functions to predict team scores, and partition the data into training and test sets that reflect real-world conditions so that the performance analysis is meaningful.

Altering dataset image intensity, such as flipping pixel values, complicates the recognition task by changing the pixel distributions. This can lead the network to rely less on absolute intensities and more on robust, context-aware feature detection, potentially improving its adaptability across varied inputs. Evaluating how the error rate changes after the modification offers insight into the network's dependency on raw pixel intensity.

Linear neurons output their net input directly, which limits their ability to model complex, non-linear decision boundaries. They compute linear functions y = Ax, whereas non-linear neurons apply activation functions that enable multi-layer networks to capture intricate patterns. Thus, architectures composed entirely of linear neurons cannot capture the richness of data relationships that networks employing non-linearities can.

Different dropout rates alter the network's ability to generalize by preventing co-adaptation of features. Changing the dropout rates from the original 25% for input units and 50% for hidden-layer neurons influences both training progress and the tendency to overfit. To analyze a modified dropout rate, compare the resulting training and test error plots with the baseline, which helps in choosing dropout parameters that give better learning outcomes.


Common questions

What is the relationship between gradient descent and backpropagation in training neural networks?

Can a multi-layer network of linear neurons be simplified without losing its functional representation, and how can this be achieved?

How does adding noise to dataset image pixels affect neural network training and evaluation processes?

What contrasts in network performance might one expect when using varied feature sets for predicting soccer match outcomes?

What modifications can be made to neural network code to track performance across epochs, and how do these changes impact the interpretation of training and test errors over time?

How does altering the number of neurons in hidden layers affect the performance of a neural network in learning tasks?

Describe an input format and feature vector preparation for a neural network predicting soccer match outcomes, considering both structural and operational aspects of the network.

How would modifying dataset image intensity impact neural network training results, and what does this imply for feature dependencies?

In what ways do linear neurons differ from their non-linear counterparts in neural network architectures, particularly regarding function computation capabilities?

What is the impact of different dropout rates on the performance of a neural network, and how should they be chosen and interpreted in experiments?
