Deep Learning for Plant Disease Detection
APPROACH
ABSTRACT:
In India, agriculture plays an essential role because of the rapid growth of the population and
the increased demand for food, so crop yields need to increase. India is an agriculture-based
country where approximately 70% of the population depends on agriculture. Plant disease
detection is therefore very important, because agriculture is the backbone of a country like India.
Farmers are often not aware of what type of disease a plant has or how to protect it from that
disease. To overcome this, we develop a technique that detects plant disease using image
processing. One major cause of low crop yield is disease caused by bacteria, viruses and fungi,
which can be mitigated by using plant disease detection techniques. Deep learning methods can
be used for disease identification because they learn directly from the data and are optimized for
the outcome of a specific task. These techniques will help in identifying plant diseases, thereby
increasing crop yield. This paper describes plant disease identification using deep learning
approaches, namely CNN and VGG16, and studies these two DL techniques for disease
identification and classification in detail.
TABLE OF CONTENTS
CHAPTER 1 : INTRODUCTION
1.1 GENERAL
1.1.1 THE MACHINE LEARNING SYSTEM
1.1.2 FUNDAMENTAL
1.2 JUPYTER
1.3 MACHINE LEARNING
1.4 CLASSIFICATION TECHNIQUES
1.4.1 NEURAL NETWORK AND DEEP LEARNING
1.4.2 METHODOLOGIES - GIVEN INPUT AND EXPECTED OUTPUT
1.5 OBJECTIVE AND SCOPE OF THE PROJECT
1.6 EXISTING SYSTEM
1.6.1 DISADVANTAGES OF EXISTING SYSTEM
1.6.2 LITERATURE SURVEY
1.7 PROPOSED SYSTEM
1.7.1 PROPOSED SYSTEM ADVANTAGES
INTRODUCTION
1.1 GENERAL
Digital image processing refers to the processing of a two-dimensional picture by a digital
computer. In a broader context, it implies digital processing of any two-dimensional data. A
digital image is an array of real or complex numbers represented by a finite number of bits. An
image given in the form of a transparency, slide, photograph or X-ray is first digitized
and stored as a matrix of binary digits in computer memory. This digitized image can then be
processed and/or displayed on a high-resolution television monitor. For display, the image is
stored in a rapid-access buffer memory, which refreshes the monitor at a rate of 25 frames per
second to produce a visually continuous display.
[Figure: block diagram of an image processing system: digitizer, mass storage, hard copy device, display device]
A digitizer converts an image into a numerical representation suitable for input into a
digital computer. Some common digitizers are
1. Microdensitometer
2. Flying spot scanner
3. Image dissector
4. Vidicon camera
5. Photosensitive solid-state arrays.
IMAGE PROCESSOR:
[Figure: image processor block diagram showing problem domain, image acquisition, segmentation, and representation and description]
As detailed in the diagram, the first step in the process is image acquisition by an imaging
sensor in conjunction with a digitizer to digitize the image. The next step is the preprocessing step,
where the image is improved before being fed as an input to the other processes. Preprocessing typically
deals with enhancement, noise removal, region isolation, etc. Segmentation partitions an image
into its constituent parts or objects. The output of segmentation is usually raw pixel data, which
consists of either the boundary of the region or the pixels in the region themselves.
Representation is the process of transforming the raw pixel data into a form useful for
subsequent processing by the computer. Description deals with extracting features that are basic
in differentiating one class of objects from another. Recognition assigns a label to an object
based on the information provided by its descriptors. Interpretation involves assigning meaning
to an ensemble of recognized objects. The knowledge about a problem domain is incorporated
into the knowledge base. The knowledge base guides the operation of each processing module
and also controls the interaction between the modules. Not all modules need be necessarily
present for a specific function. The composition of the image processing system depends on its
application. The frame rate of the image processor is normally around 25 frames per second.
DIGITAL COMPUTER:
MASS STORAGE:
The secondary storage devices normally used are floppy disks, CD ROMs etc.
The hard copy device is used to produce a permanent copy of the image and for the
storage of the software involved.
OPERATOR CONSOLE:
Digital image processing refers to the processing of an image in digital form. Modern cameras
may capture the image directly in digital form, but generally images originate in optical form.
They are captured by video cameras and digitized. The digitization process includes
sampling and quantization. These images are then processed by at least one of the five
fundamental processes, not necessarily all of them.
1. Image Enhancement
2. Image Restoration
3. Image Analysis
4. Image Compression
5. Image Synthesis
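The sampling and quantization steps of digitization mentioned above can be illustrated with a minimal NumPy sketch. The intensity profile, the 4x4 sampling grid and the helper name quantize are assumptions made only for illustration; they are not part of the original system.

```python
import numpy as np

def quantize(samples, bits):
    """Map values in [0, 1] to integer levels 0 .. 2**bits - 1 (bits <= 8)."""
    levels = 2 ** bits
    clipped = np.clip(samples, 0.0, 1.0)
    return np.round(clipped * (levels - 1)).astype(np.uint8)

# Sampling: evaluate a hypothetical continuous intensity profile on a 4x4 grid.
ys, xs = np.mgrid[0:4, 0:4]
continuous = (xs + ys) / 6.0          # intensities between 0.0 and 1.0

# Quantization: store each sample as an 8-bit gray level.
image = quantize(continuous, bits=8)
```

Using fewer bits gives fewer gray levels, which is the trade-off between storage size and fidelity that quantization controls.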
IMAGE ENHANCEMENT:
Image enhancement operations improve the qualities of an image, such as improving its
contrast and brightness characteristics, reducing its noise content, or sharpening its details.
Enhancement only reveals the same information in a more understandable form. It
does not add any information to the image.
IMAGE RESTORATION:
Image restoration, like enhancement, improves the qualities of an image, but all the operations
are based on known or measured degradations of the original image. Image restoration
is used to correct images with problems such as geometric distortion, improper focus, repetitive
noise, and camera motion.
IMAGE ANALYSIS:
IMAGE COMPRESSION:
Image compression reduces the data content necessary to describe the image, and decompression
reconstructs the image from the compressed data. Most images contain a lot of redundant
information, and compression removes these redundancies. Because compression reduces the size,
the image can be stored or transported more efficiently. The compressed image is decompressed
when displayed. Lossless compression preserves the exact data of the original image, whereas
lossy compression does not reproduce the original image exactly but provides a much higher
compression ratio.
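The difference between lossless and lossy compression can be sketched as follows. This is only an illustration under stated assumptions: the image is synthetic, zlib stands in for a lossless codec, and dropping the low 4 bits of each pixel stands in for a lossy step. Real image codecs such as PNG or JPEG are considerably more elaborate.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 8-bit image: a smooth gradient plus a little noise.
image = (np.linspace(0, 255, 64 * 64).reshape(64, 64)
         + rng.integers(0, 4, size=(64, 64))).clip(0, 255).astype(np.uint8)

# Lossless: a zlib round trip restores every pixel exactly.
packed = zlib.compress(image.tobytes())
restored = np.frombuffer(zlib.decompress(packed), dtype=np.uint8).reshape(image.shape)

# Lossy (illustrative): keep only the top 4 bits of each pixel before packing.
coarse = (image >> 4) << 4
packed_lossy = zlib.compress(coarse.tobytes())
```

The lossy version is smaller because the discarded low bits carried most of the noise, but the original pixel values cannot be recovered from it.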
IMAGE SYNTHESIS:
Image synthesis operations create images from other images or non-image data. Image
synthesis operations generally create images that are either physically impossible or impractical
to acquire.
Jupyter
Jupyter, previously known as IPython Notebook, is a web-based, interactive development
environment. Originally developed for Python, it has since expanded to support over 40 other
programming languages including Julia and R.
Jupyter allows for notebooks to be written that contain text, live code, images, and equations.
These notebooks can be shared, and can even be hosted on GitHub for free.
For each section of this tutorial, you can download a Jupyter notebook that allows you to edit and
experiment with the code and examples for each topic. Jupyter is part of the Anaconda
distribution; it can be started from the command line using the jupyter command, for example
jupyter notebook.
Machine Learning
We will now move on to the task of machine learning itself. In the following sections we will
describe how to use some basic algorithms, and perform regression, classification, and clustering
on some freely available medical datasets concerning breast cancer and diabetes, and we will
also take a look at a DNA microarray dataset.
SciKit-Learn
SciKit-Learn provides a standardised interface to many of the most commonly used machine
learning algorithms, and is the most popular and frequently used library for machine learning for
Python. As well as providing many learning algorithms, SciKit-Learn has a large number of
convenience functions for common preprocessing tasks (for example, normalisation or k-fold
cross validation).
SciKit-Learn is a very large software library.
Clustering
Clustering algorithms focus on grouping data. In general, clustering algorithms are
unsupervised: they require no y response variable as input. That is to say, they
attempt to find groups or clusters within data where you do not know the label for each sample.
SciKit-Learn has many clustering algorithms, but in this section we will demonstrate
hierarchical clustering on a DNA expression microarray dataset using an algorithm from the
SciPy library.
We will plot a visualisation of the clustering using what is known as a dendrogram, also using
the SciPy library.
The goal is to cluster the data properly in logical groups, in this case into the cancer types
represented by each sample’s expression data. We do this using agglomerative hierarchical
clustering, using Ward’s linkage method:
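A minimal sketch of agglomerative clustering with Ward's linkage is given below. It assumes SciPy is installed and uses synthetic data in place of the microarray dataset, which is not reproduced here. The array sizes and the choice of two clusters are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
# Synthetic stand-in for expression data: two groups of 5 samples, 20 features each.
group_a = rng.normal(loc=0.0, scale=0.5, size=(5, 20))
group_b = rng.normal(loc=5.0, scale=0.5, size=(5, 20))
X = np.vstack([group_a, group_b])

# Build the agglomerative merge tree using Ward's linkage.
Z = linkage(X, method="ward")

# Cut the tree so that at most 2 flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
```

The same linkage matrix Z can be passed to scipy.cluster.hierarchy.dendrogram to draw the dendrogram when Matplotlib is available.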
Dimensionality Reduction
Another important method in machine learning, and data science in general, is dimensionality
reduction. For this example, we will look at the Wisconsin breast cancer dataset once again. The
dataset consists of over 500 samples, where each sample has 30 features. The features relate to
images of a fine needle aspirate of breast tissue, and the features describe the characteristics of
the cells present in the images. All features are real values. The target variable is a discrete value
(either malignant or benign) and is therefore a classification dataset.
You will recall from the Iris example in Sect. 7.3 that we plotted a scatter matrix of the data,
where each feature was plotted against every other feature in the dataset to look for potential
correlations (Fig. 3). By examining this plot you could probably find features which would
separate the dataset into groups. Because the dataset only had 4 features we were able to plot
each feature against each other relatively easily. However, as the numbers of features grow, this
becomes less and less feasible, especially if you consider the gene expression example in Sect.
9.4 which had over 6000 features.
One method used to handle highly dimensional data is Principal Component
Analysis, or PCA. PCA is an unsupervised algorithm for reducing the number of dimensions of a
dataset. For example, for plotting purposes you might want to reduce your data down to 2 or 3
dimensions, and PCA allows you to do this by generating components, which are combinations
of the original features, that you can then use to plot your data.
You supply PCA with your data, X, and you specify the number of components you wish to
reduce its dimensionality to. This is known as transforming the data.
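As a rough illustration of what transforming the data means, PCA can be written in a few lines of NumPy using the singular value decomposition. The function name pca_transform and the random placeholder data are assumptions for this sketch. With SciKit-Learn, the class sklearn.decomposition.PCA provides the same projection.

```python
import numpy as np

def pca_transform(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 30))          # placeholder for a 30-feature dataset
X_2d = pca_transform(X, n_components=2)  # two columns, ready for a scatter plot
```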
Again, you would not use this model for new data. In a real-world scenario you would, for
example, perform a 10-fold cross validation on the dataset, choosing the model parameters that
perform best on the cross validation. This model would be much more likely to perform well on
new data. At the very least, you would randomly select a subset, say 30% of the data, as a test set
and train the model on the remaining 70% of the dataset. You would evaluate the model based on
the score on the test set and not on the training set.
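The hold-out split described above can be sketched as follows. The helper name and the sample count of 500 are placeholders chosen for illustration. In practice, sklearn.model_selection.train_test_split does the same job.

```python
import numpy as np

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

# Hypothetical dataset of 500 samples: 70% for training, 30% held out.
train_idx, test_idx = train_test_split_indices(500, test_fraction=0.3)
```

Shuffling before splitting matters when the samples are stored in class order, because otherwise the test set could contain only one class.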
ARTIFICIAL NEURAL NETWORK
Essentially, the idea of an artificial neural network (ANN) is based on how information is
processed inside human and animal brains. This can be oversimplified as
a complex network of billions of nerve cells interconnected with each other and communicating
via pulses called action potentials. An ANN aims to mimic this process as closely as possible,
which means mimicking the most important ability of the human mind: the ability to learn.
This differs from the linear, step-by-step approach of conventional algorithms for solving
problems. In other words, an ANN can be simply defined as a computer algorithm consisting of
simple entities interconnected with each other, which together produce a behaviour in response
to different states of input. It is structured in layers, each consisting of a number of nodes that are
interconnected with each other through mathematical functions. According to another author
(2011), the basic element of any ANN is a neuron, which is modelled on the neuron in a
biological neural network.
Artificial neural networks can perform different tasks, depending on the types of transfer
functions and the neuron interconnections. One of the most commonly used networks is the
two-layer perceptron.
In order to design a neural network we also need to determine the number of neurons in the first
layer. Common sense says that more neurons should be able to approximate functions better.
There is, however, always an upper limit: too many neurons can also result in poor function
approximation.
• Supervised learning
In this case a network is trained by a sequence of pairs of vectors. The first one is the input
vector and the second one the target vector. The weights can be modified at each step (after each
pair) or a matrix of all the vectors can be formed and then used for training. The training
processes are therefore called incremental or batch learning.
• Unsupervised learning
In this case there are no target vectors. The weights are modified based only on the input vectors.
The algorithm used to train the two-layer perceptron is the so-called backpropagation algorithm
(a generalization of the delta rule). It is composed of a forward stage and a backward stage. The
main goal of the algorithm is to make the difference between the actual and the target outputs as
small as possible.
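The forward and backward stages can be sketched for a small two-layer perceptron trained in batch mode. Everything specific here is an assumption chosen for illustration: the logical OR training pairs, three hidden neurons, sigmoid transfer functions, the learning rate and the iteration count.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Toy batch: 4 input vectors with their target vectors (logical OR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 3))
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1))
b2 = np.zeros(1)
lr = 0.5
losses = []

for _ in range(2000):
    # Forward stage: compute hidden and output activations.
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    losses.append(float(np.mean((Y - T) ** 2)))
    # Backward stage: propagate the output error back through each layer
    # (gradients of the squared error, up to a constant factor).
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * (H.T @ dY)
    b2 -= lr * dY.sum(axis=0)
    W1 -= lr * (X.T @ dH)
    b1 -= lr * dH.sum(axis=0)
```

Because all four pairs are used in every update, this is the batch variant. Updating after each individual pair would give the incremental variant.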
Keras additionally requires either Theano or TensorFlow to be installed. In the examples in this
chapter we are using Theano as a backend, however the code will work identically for either
backend. You can install Theano using pip, but it has a number of dependencies that must be
installed first. Refer to the Theano and TensorFlow documentation for more information [12].
Keras is a modular API. It allows you to create neural networks by building a stack of modules,
from the input of the neural network, to the output of the neural network, piece by piece until you
have a complete network. Also, Keras can be configured to use your Graphics Processing Unit,
or GPU. This makes training neural networks far faster than if we were to use a CPU. We begin
by importing Keras:
We may want to view the network's accuracy on the test set (or its loss on the training set) over
time (measured at each epoch), to get a better idea of how well it is learning. An epoch is one
complete cycle through the training data.
Fortunately, this is quite easy to plot as Keras’ fit function returns a history object which we can
use to do exactly this:
This will result in a plot similar to that shown. Often you will also want to plot the loss on the
test set and training set, and the accuracy on the test set and training set.
Plotting the loss and accuracy can be used to see if you are over fitting (you experience tiny loss
on the training set, but large loss on the test set) and to see when your training has plateaued.
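The over-fitting check described above can be sketched in plain Python. The numbers below are invented for illustration and only mimic the shape of the history.history dictionary that Keras returns from fit. The key val_loss is present only when validation data is supplied.

```python
def best_epoch(history):
    """Return the 1-based epoch with the lowest validation loss."""
    val_loss = history["val_loss"]
    return min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Hypothetical per-epoch values, shaped like Keras' history.history dict.
history = {
    "loss":     [0.90, 0.60, 0.40, 0.25, 0.15, 0.08],
    "val_loss": [0.95, 0.70, 0.55, 0.50, 0.58, 0.71],
}

epoch = best_epoch(history)
# A large final gap between validation and training loss suggests over-fitting.
gap = history["val_loss"][-1] - history["loss"][-1]
```

In this made-up run the validation loss is lowest at epoch 4 and rises afterwards while the training loss keeps falling, which is the typical over-fitting signature.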
PROBLEM STATEMENT:
Agriculture is one of the most important sources of income for farmers. Farmers can grow a
variety of plants, but diseases hamper plant growth. One of the major factors leading to the
destruction of plants is disease attack, which may reduce plant productivity by 10% to 95%.
Classifying healthy and diseased plants with a machine learning approach can help control the
spread of diseases, because pesticides can then be applied only in the quantity needed and
excess use can be avoided. Automatic identification of plant diseases is an important task, as it
can help farmers monitor large fields of plants and identify diseases. This paper surveys the
machine learning methods used by researchers to identify and classify plant diseases. These
methods will help a system identify the disease on a plant by image processing; the system will
then inform the farmer about the disease in detail and specify the treatment to get rid of it and
increase productivity.
1.5 OBJECTIVE AND SCOPE OF THE PROJECT
Plant leaf diseases are perceived as an important problem because they reduce crop
yields through increased competition for nutrients, water, and sunlight and, in addition, serve as
hosts for diseases and pests. Thus, it is crucial to identify weeds early in their growth in order to
avoid their side effects on crop growth. Conventional machine learning techniques previously
used for discriminating crops from plant leaf disease faced challenges of effectiveness and
reliability in detecting rice leaf disease at preliminary stages of growth. This work proposes the
application of a deep learning technique for plant nutrient disease classification. A new
Convolutional Neural Network (CNN) architecture is designed to classify plant deficiencies at
their early growth stages. The presented technique is evaluated using a rice leaf disease dataset.
Average accuracy, precision, recall, and F1-score are used as evaluation metrics.
EXISTING SYSTEM
The Agricultural and Rural Department of Qinghai Province of China suggested that pepper
seedlings should be replanted during the period of two leaves and one heart [18]. Tang et al.
proposed that pepper seedlings need to be replanted when 1 to 2 true leaves have grown, to
improve the effective utilization rate of the greenhouse and reduce energy consumption [19].
Another study claimed that the seedlings' survival rate could be 95% when they were
transplanted with 2 to 3 true leaves [20]. Considering these comprehensively, we choose the two
leaves and one heart period (with 2 true leaves) of the pepper seedling for transplanting in the
plug tray. Two leaves and one heart refers to the period in which the pepper seedlings have
grown two true leaves and one top bud besides the two cotyledons. According to this, the
standard of qualified and unqualified seedlings in this study was set. Qualified seedlings are
pepper seedlings which have clearly grown two true leaves and one top bud during the two-leaf
and one-heart period, as shown in Figure 3a. Unqualified seedlings are those which have not yet
grown true leaves, as shown in Figure 3b, or which have only one true leaf, as shown in
Figure 3c. Lack of seedlings refers to no seedling growth in the cell, as shown in Figure 3d.
There are two types of cells that need to be replanted in the plug tray.
One method for classifying rice leaf disease aimed to improve the classification
performance by combining the classification of whole plants and of individual leaves.
Leaves are first separated from the plants, then features are extracted from both the whole
plants and the segmented leaves. Classification is performed for the leaves and the
plants, and finally Bayes belief integration is used to fuse the classification results. Two
significant pattern recognition approaches, artificial neural networks (ANN) and support vector
machines (SVM), have been used to separate weeds from sugar beet plants using shape features.
The shape features comprise Fourier descriptors and moment invariant features. Four species of
prevalent weeds in sugar beet fields were examined. The results indicate that SVM slightly
outperforms ANN. In [13] the authors developed a machine vision technique based on video
processing, together with a hybrid ANN and ant colony algorithm classifier, for distinguishing
potato plants from three weed species. Texture features obtained from the gray level
co-occurrence matrix (GLCM) and the histogram, moment invariants, color features, and shape
features are extracted. Then, the Gamma test is used to select the significant features.
An SVM is combined with spectral reflectance measurements to develop a
corn/silverbeet (crop-weed) differentiation system. The intensities of the reflectance of laser
beams off soil and vegetation at three wavelengths are gathered by a weed sensor. These
reflectance measurements are used to compute the Normalized Difference Vegetation Indices
(NDVIs). Two experiments are performed; in the first one, the obtained NDVI values are fed to
an SVM to achieve the classification process, while in the second one, the raw reflected
intensities are provided to the SVM for crop-weed discrimination. Strothmann et al. [15]
proposed a crop-weed discrimination system based on in-field-labeling. A multiwavelength laser
line profile (MWLP) approach is used to scan plants and obtain spectral reflection intensities,
scattering information at several wavelengths and 3D data. The spectral features are applied for
separating soil and biomass, while the 3D surface features are exploited for discriminating crops
and weeds.
PLAN OF ACTION:
Step 1: Data preprocessing: all the images in the dataset are resized to 100x100 pixels.
Step 2: Splitting: the data is divided into two parts, an 80% training set and a 20% test set.
Step 3: Data augmentation: the training set images are rotated, resized and perturbed with
random noise in order to avoid overfitting.
Step 4: Feature extraction: features are extracted in the early layers of the CNN architecture
using the convolution operation.
Step 5: Training the model: we use CNN and VGG16 based architectures. Once the
architecture is defined, we train the model on the training set.
Step 6: Evaluation: the accuracy of the model is evaluated on the test set.
Step 7: Tuning: if the results are not satisfactory, the model is tuned by changing architecture
parameters such as the kernel size and the number of nodes in the last fully connected layer.
Step 8: Storing the weights: the final trained model is saved in a model_name.h5 file so that it
can be used on new data.
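Steps 2 and 3 of the plan can be sketched with NumPy alone. This is a simplified illustration under stated assumptions: the images are random placeholders already resized to 100x100, rotation is restricted to multiples of 90 degrees, the noise level is arbitrary, and the resizing part of the augmentation is omitted.

```python
import numpy as np

def augment(image, rng):
    """Rotate by a random multiple of 90 degrees and add Gaussian noise."""
    rotated = np.rot90(image, k=int(rng.integers(0, 4)))
    noisy = rotated + rng.normal(scale=0.02, size=rotated.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
images = rng.random((10, 100, 100, 3))   # placeholder for 10 resized RGB images

# Step 2: 80% training set, 20% test set.
split = int(0.8 * len(images))
train, test = images[:split], images[split:]

# Step 3: augmentation is applied to the training set only.
train_aug = np.stack([augment(img, rng) for img in train])
```

Keeping the test set free of augmentation means the evaluation in Step 6 measures performance on unmodified images.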
OBJECTIVE:
To better understand the mental health conditions and provide better patient care, early detection
of mental health problems is an essential step. Different from the diagnosis of other chronic
conditions that rely on laboratory tests and measurements, mental illnesses are typically
diagnosed based on an individual’s self-report to specific questionnaires designed for the
detection of specific patterns of feelings or social interactions [3]. Due to the increasing
availability of data pertaining to an individual's mental health status, artificial intelligence (AI)
and machine learning (ML) technologies are being applied to improve our understanding of
mental health conditions and have been engaged to assist mental health providers for improved
clinical decision-making.
[Figure: processing pipeline: input image, preprocessing, data augmentation, post-processing, classification, output image]
Proposed approach:
Data Collection Phase: The dataset used is aimed at ground-based weed or species
spotting, and its authors also suggested a benchmark measure to enable easy comparison of
classification results among researchers.
Data Normalization Phase: In this phase the values are normalized to the range between 0 and 1
so that all inputs share a common scale during training.
Train-Test Split Phase: This phase splits the data into training and testing subsets. For
example, the data are divided into 80% training data and 20% test data.
Data-Preprocessing Phase: Before the data is fed to the model, all null and redundant values
are removed.
Model-Building Phase: In this phase we use the sklearn package of Python, which contains
many modules for classification and regression tasks. In the course of
exploring the right architecture for our model, we consider prior work on classifying leaves
using the VGGNet16 architecture. One related work implemented a 26-layer deep learning
model consisting of 8 residual blocks to classify 10,000 images of 100 ornamental plant species,
achieving classification rates of up to 91.78%.
Prediction Phase: In this phase we test our model with the test input data and make the
prediction.
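The normalization and preprocessing phases listed above can be sketched as follows. The helper names and the tiny example arrays are hypothetical. The sketch assumes 8-bit pixel data and a numeric table in which missing values are stored as NaN.

```python
import numpy as np

def normalize(pixels):
    """Scale 8-bit pixel values to the range [0, 1]."""
    return pixels.astype(np.float32) / 255.0

def drop_null_and_duplicate_rows(rows):
    """Remove rows containing NaN, then remove exact duplicate rows."""
    clean = rows[~np.isnan(rows).any(axis=1)]
    return np.unique(clean, axis=0)

pixels = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = normalize(pixels)

rows = np.array([[1.0, 2.0], [1.0, 2.0], [np.nan, 3.0], [4.0, 5.0]])
cleaned = drop_null_and_duplicate_rows(rows)
```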
ARCHITECTURE FOR PROPOSED SYSTEM:
Preprocessing and Augmentation.
Data set: The dataset used, provided by the signal processing group of Aarhus University
in collaboration with Southern Denmark University, comprises 5539 images of roughly 960
unique plants categorized into 6 diseases, captured at early growth stages. It includes annotated
RGB images with an approximate physical resolution of 10 pixels per mm. This dataset is
adopted for research that investigates plant nutrient deficiency identification at the early
germination stage. Thus, farmers (or robots for automatic deficiency control) may be able to
handle nutrient deficiency before the weeds begin to compete with crops for nutrition.
A CNN is adopted for rice leaf disease classification to automatically discriminate between weed
species and crops at early growth stages. The proposed CNN consists of an input layer, hidden
layers, and an output layer. The original disease images are all resized to 128x128 pixels
(a size chosen empirically to give satisfactory performance with acceptable
processing speed) and fed to the input layer. The hidden layers consist of 5 stages of learning
layers, as illustrated in Fig. 1. The filters all have kernel size 3x3, with 32, 64, 128, 256 and
1024 filters in the convolutional layers of the respective stages.
All convolutional layers are followed by Rectified Linear Unit (ReLU) layers, which
apply the function f(x) = max(0, x) to every value of their input. Thus, the negative
input elements are set to 0. This decreases the training time and introduces nonlinearity into the
model and the whole network without affecting the receptive fields of the convolutional layer.
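The ReLU function f(x) = max(0, x) described above is applied element by element. A minimal NumPy sketch, with an invented 2x2 feature map as input:

```python
import numpy as np

def relu(x):
    """Element-wise f(x) = max(0, x)."""
    return np.maximum(0, x)

feature_map = np.array([[-2.0, 0.5],
                        [3.0, -0.1]])
activated = relu(feature_map)   # negative entries become 0, others are unchanged
```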
Generally, the convolutional layers are used for feature extraction and the fully connected layers
are used for classification tasks. Thus, the lower part of the CNN includes convolutional layers
while the higher part comprises some fully connected layers. The fully connected layers have a
large number of parameters, which requires high computational power and tends to cause
overfitting. On the other hand, the global average pooling layer computes the mean of each
feature map and delivers it to the next layer. Hence, it does not need any parameters, which
reduces overfitting [18]. Our proposed CNN architecture employs the global average pooling
layer before the fully connected layers in order to reduce the number of parameters and avoid
overfitting.
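The effect of global average pooling on the parameter count can be sketched with NumPy. The 4x4x3 feature maps and the 64-unit dense layer are illustrative assumptions and do not reflect the sizes used in the proposed network.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Reduce (height, width, channels) feature maps to one mean per channel."""
    return feature_maps.mean(axis=(0, 1))

feature_maps = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
pooled = global_average_pool(feature_maps)

# Weights of a hypothetical 64-unit dense layer fed by flattening
# versus by global average pooling (biases ignored).
flatten_weights = feature_maps.size * 64
pooled_weights = pooled.size * 64
```

The pooling itself has no trainable parameters. The saving comes from the much smaller input that the following fully connected layer receives.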
Algorithms:
CNN
Introduction
In the past few decades, Deep Learning has proved to be a very powerful tool because of its
ability to handle large amounts of data. The interest to use hidden layers has surpassed traditional
techniques, especially in pattern recognition. One of the most popular deep neural networks is
Convolutional Neural Networks.
Since the 1950s, the early days of AI, researchers have struggled to make a system that can
understand visual data. In the following years, this field came to be known as Computer Vision.
In 2012, computer vision took a quantum leap when a group of researchers from the University
of Toronto developed an AI model that surpassed the best image recognition algorithms and that
too by a large margin.
The AI system, which became known as AlexNet (named after its main creator, Alex
Krizhevsky), won the 2012 ImageNet computer vision contest with an amazing 85 percent
accuracy. The runner-up scored a modest 74 percent on the test.
At the heart of AlexNet were Convolutional Neural Networks, a special type of neural network
that roughly imitates human vision. Over the years CNNs have become a very important part of
many Computer Vision applications and hence a part of any computer vision course. So let's take
a look at the workings of CNNs.
Background of CNNs
CNNs were first developed and used around the 1980s. The most that a CNN could do at that
time was recognize handwritten digits. It was mostly used in the postal sector to read zip codes,
pin codes, etc. The important thing to remember about any deep learning model is that it requires
a large amount of data to train and also requires a lot of computing resources. This was a major
drawback for CNNs at that time, and hence CNNs were limited to the postal sector and
failed to enter the wider world of machine learning.
In 2012 Alex Krizhevsky realized that it was time to bring back the branch of deep learning that
uses multi-layered neural networks. The availability of large sets of data, to be more specific
ImageNet datasets with millions of labeled images and an abundance of computing resources
enabled researchers to revive CNNs.
But we don’t really need to go behind the mathematics part to understand what a CNN is or how
it works.
Bottom line is that the role of the ConvNet is to reduce the images into a form that is easier to
process, without losing features that are critical for getting a good prediction.
Before we go to the working of CNNs, let's cover the basics, such as what an image is and how it
is represented. An RGB image is nothing but a matrix of pixel values having three planes,
whereas a grayscale image is the same but has a single plane.
For simplicity, let's stick with grayscale images as we try to understand how CNNs work.
In a convolution, we take a filter/kernel (3×3 matrix) and slide it over the input image to get the
convolved feature. This convolved feature is passed on to the next layer.
For an RGB image, the kernel is applied across all three color channels and the results are
summed.
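The sliding of a 3×3 kernel over a grayscale image can be sketched directly in NumPy. The 5×5 image and the kernel values are made up for illustration. As in CNN layers, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the covered patch by the kernel and sum the products.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]], dtype=float)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)

feature = convolve2d_valid(image, kernel)   # 3x3 convolved feature
```

A 5×5 input with a 3×3 kernel and no padding yields a 3×3 output, which is why padding is added when the spatial size should be preserved.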
Convolutional neural networks are composed of multiple layers of artificial neurons. Artificial
neurons, a rough imitation of their biological counterparts, are mathematical functions that
calculate the weighted sum of multiple inputs and output an activation value. When you input
an image into a ConvNet, each layer generates several activation maps that are passed on to the
next layer.
The first layer usually extracts basic features such as horizontal or diagonal edges. This output is
passed on to the next layer which detects more complex features such as corners or
combinational edges. As we move deeper into the network it can identify even more complex
features such as objects, faces, etc.
Based on the activation map of the final convolution layer, the classification layer outputs a set
of confidence scores (values between 0 and 1) that specify how likely the image is to belong to a
“class.” For instance, if you have a ConvNet that detects cats, dogs, and horses, the output of the
final layer is the probability that the input image contains each of those animals.
Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the spatial size
of the Convolved Feature. This decreases the computational power required to process
the data by reducing its dimensions. There are two types of pooling: average pooling and max
pooling.
In Max Pooling, we take the maximum value from the portion of the image covered by the
kernel. Max Pooling also acts as a noise suppressant: it discards the noisy activations altogether
and so performs de-noising along with dimensionality reduction.
Average Pooling, on the other hand, returns the average of all the values from the portion of
the image covered by the kernel, so it performs dimensionality reduction with only a mild
noise-suppressing effect. For this reason, Max Pooling often performs better than Average
Pooling in practice.
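Both pooling types can be sketched with a single NumPy helper. The function name pool2d, the 2x2 window and the example feature map are assumptions made for illustration.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping size x size pooling over a 2-D feature map."""
    h, w = x.shape[0] // size, x.shape[1] // size
    # Split the map into (h, w) blocks of shape (size, size).
    blocks = x[:h * size, :w * size].reshape(h, size, w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 5, 6]], dtype=float)

max_pooled = pool2d(x, mode="max")
avg_pooled = pool2d(x, mode="avg")
```

Each 2x2 block collapses to one value, so a 4x4 map becomes 2x2. Max pooling keeps the strongest activation in each block, while average pooling blends all four.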
Limitations:
CNNs are powerful but resource-intensive, and they provide in-depth results. At their root,
they recognize patterns and details so minute and inconspicuous that they go unnoticed by the
human eye. But when it comes to understanding the contents of an image, they fail.
VGG-16
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is an annual computer vision
competition. Each year, teams compete on two tasks. The first is to detect objects within an
image coming from 200 classes, which is called object localization. The second is to classify
images, each labeled with one of 1000 categories, which is called image classification. VGG-16
was proposed by Karen Simonyan and Andrew Zisserman of the Visual Geometry Group
Lab of Oxford University in 2014 in the paper “Very Deep Convolutional Networks for
Large-Scale Image Recognition”. This model won 1st and 2nd place in the above categories
in the 2014 ILSVRC challenge.
VGG-16 architecture
This model achieves 92.7% top-5 test accuracy on the ImageNet dataset, which contains over 14
million images; the ILSVRC subset used for the challenge covers 1000 classes.
The network takes RGB images of a fixed size of 224*224 as input, so the input is a tensor of
(224, 224, 3). The model processes the input image and outputs a vector of 1000 values.
This vector represents the classification probability for the corresponding class. Suppose we
have a model that predicts that the image belongs to class 0 with probability 0.1, class 1 with
probability 0.05, class 2 with probability 0.05, class 3 with probability 0.03, class 780 with
probability 0.72, class 999 with probability 0.05 and all other classes with 0. The classification
vector then holds 0.1 at index 0, 0.05 at indices 1 and 2, 0.03 at index 3, 0.72 at index 780,
0.05 at index 999 and 0 everywhere else.
To make sure these probabilities add to 1, we use the softmax function. For raw scores
$z_1, \dots, z_{1000}$, the softmax function is defined as:
$$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{1000} e^{z_j}}$$
After this we take the 5 most probable candidates into a vector, which is compared against the
ground-truth label during evaluation.
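The softmax normalization and the selection of the 5 most probable classes can be sketched in plain Python. The raw scores below are hypothetical and chosen only so that class 780 comes out on top; the helper names `softmax` and `top_k` are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k=5):
    """Return the indices of the k largest probabilities, best first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

# Hypothetical raw scores for a 1000-class model; class 780 scores highest.
scores = [0.0] * 1000
scores[780] = 5.0
scores[0] = 3.0
scores[1] = 2.5
scores[2] = 2.4
scores[999] = 2.3

probs = softmax(scores)
top5 = top_k(probs)      # [780, 0, 1, 2, 999]
```

A prediction counts as correct under the top-5 criterion when the true class index appears anywhere in `top5`.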
Architecture:
The input to the network is an image of dimensions (224, 224, 3). The first two layers
have 64 channels of 3*3 filter size and same padding. Then, after a max pool layer of stride (2,
2), there are two convolution layers with 128 filters of size (3, 3). This is
followed by a max pooling layer of stride (2, 2), the same as the previous one. Then there
are 3 convolution layers with 256 filters of size (3, 3) and a max pool layer. After that there
are 2 sets of 3 convolution layers and a max pool layer, each with 512 filters of (3, 3) size and
same padding. In these convolution layers, the filters are of size 3*3 instead of the 11*11 used
in AlexNet and the 7*7 used in ZF-Net. Some VGG configurations also use 1*1 filters, which
manipulate the number of input channels. A padding of 1 pixel (same padding) is applied in each
convolution layer to preserve the spatial resolution of the image.
After the stack of convolution and max-pooling layers, we get a (7, 7, 512) feature map. We
flatten this output to make it a (1, 25088) feature vector. After this there are 3 fully connected
layers: the first takes the flattened feature vector as input and outputs a (1, 4096) vector, the
second also outputs a vector of size (1, 4096), and the third outputs 1000 channels
for the 1000 classes of the ILSVRC challenge. The output of the third fully connected layer is
passed to a softmax layer in order to normalize the classification vector, from which the
top-5 categories are taken for evaluation. All the hidden layers use ReLU as their
activation function. ReLU is computationally efficient, which results in faster learning,
and it also decreases the likelihood of the vanishing gradient problem.
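The layer arithmetic above can be checked with a short sketch. It assumes the standard VGG-16 configuration (13 convolution layers with 3*3 filters and same padding, five 2*2 max-pool layers with stride 2, and three fully connected layers); the names `VGG16_CFG` and `vgg16_summary` are illustrative.

```python
# Integers are 3x3 convolution layers (number of output channels);
# "M" is a 2x2 max-pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]

def vgg16_summary(height=224, width=224, channels=3):
    """Return the final feature-map shape and the total parameter count."""
    params = 0
    for layer in VGG16_CFG:
        if layer == "M":
            height, width = height // 2, width // 2      # pooling halves H and W
        else:
            params += 3 * 3 * channels * layer + layer   # weights + biases
            channels = layer                             # same padding keeps H, W
    feature_shape = (height, width, channels)
    flat = height * width * channels                     # flattened vector length
    for units in FC_SIZES:
        params += flat * units + units
        flat = units
    return feature_shape, params

feature_shape, total_params = vgg16_summary()
# feature_shape is (7, 7, 512) and total_params is 138,357,544
```

Most of the parameters sit in the first fully connected layer (25088 * 4096 weights), which is why the network is large despite its small filters.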
PROJECT DESCRIPTION
2.1 INTRODUCTION
Digital image processing is the use of computer algorithms to perform image processing on
digital pictures. It permits a far wider range of algorithms to be applied to the input data and
can avoid issues like the build-up of noise and signal distortion during processing. Digital
image processing has a very important role in the agriculture field. It is widely used to detect
crop disease with high accuracy. Detection and recognition of diseases in plants using
digital image processing is very effective in identifying the symptoms of diseases at
their early stages. Plant pathologists can analyze the digital pictures using digital image
processing to diagnose crop diseases. Computer systems are developed for agricultural
applications, like detection of leaf diseases, fruit diseases, etc. In all these techniques,
digital pictures are collected using a camera and image processing techniques are applied to
these pictures to extract valuable data that are essential for analysis. The diseases are mostly on
the leaves and stem of the plant. They include potassium, magnesium, calcium, zinc, or iron
deficiencies, as well as damage due to insects, rust, nematodes, etc. It is an important task for
farmers to find these problems as early as possible, because deficiencies visible on plant leaves
reduce productivity. Image processing techniques have been used to detect diseases on mango,
pomegranate, guava, sapota, etc.
Deep learning is a type of machine learning in which a model learns to perform classification
tasks directly from images, text, or sound. Deep learning is usually implemented using neural
network architecture. The term deep refers to the number of layers in the network—the more the
layers, the deeper the network. Traditional neural networks contain only two or three layers,
while deep networks can have hundreds.
A deep neural network combines multiple non-linear processing layers, using simple elements
operating in parallel. It is inspired by the biological nervous system and consists of an input
layer, several hidden layers, and an output layer. The layers are interconnected via nodes, or
neurons, with each hidden layer using the output of the previous layer as its input.[1] The recent
advances in deep-learning technologies based on neural networks have led to the emergence of
high performance algorithms for interpreting images, such as object detection, semantic
segmentation, instance segmentation, and image generation. As neural networks can learn the
high-dimensional hierarchical features of objects from large sets of training data, deep-learning
algorithms can acquire a high generalization ability to recognize images, i.e., they can interpret
images that they have not been shown before, which is one of the traits of artificial intelligence.
Soon after the success of deep learning algorithms in general scene recognition
challenges, attempts at automation began for imaging tasks that are conducted by human experts,
such as medical diagnosis and biological image analysis. However, despite significant advances
in image recognition algorithms, the implementation of these tools for practical applications
remains challenging because of the unique requirements for developing deep-learning algorithms
that necessitate the joint development of hardware, datasets, and software.
Agriculture plays an important role in the Indian economy, and India ranks second in
rice production. Almost all the states in India practice rice cultivation, including
Tamil Nadu, West Bengal, Punjab, Uttar Pradesh, Assam, Bihar, etc. The agriculture sector
contributes about 19.9% of the total gross domestic product. Rice is one of the
predominantly consumed food grains in India. The growth and quality of rice plants are affected by
diseases, which in turn reduce the profit of farming. Different varieties of diseases may
occur in an individual rice crop, and they are difficult for farmers to identify with the limited
knowledge gained through experience. For precise and early diagnosis of plant
diseases, an automatic data-processing expert system is therefore crucial, so that
healthy and successful cultivation is viable.
Deep learning has entered the field of agriculture to resolve
different types of problems such as weed and seed detection, plant disease classification, fruit
counting, root segmentation, etc. Deep learning is an advancement of machine learning
that trains on a huge amount of data, automatically learns the features
of the input, and gives the output based on decision rules. A CNN is effective in processing
visual imagery. It is a feed-forward artificial neural network with three kinds of layers: input
layer, hidden layers, and output layer. The hidden layers are composed of convolutional layers,
pooling layers, normalization layers, and fully connected layers, and they contain a set of
learnable parameters (weights) through which the network can learn the spatial relationships of
the input data and perform a classification task.
Transfer learning is a method by which a pre-trained convolutional neural network can be
re-purposed for a new problem. The training time of the model is thereby reduced
compared to a model developed from scratch, and the performance of the proposed model
is enhanced.
The hardware requirements may serve as the basis for a contract for the implementation of the
system and should therefore be a complete and consistent specification of the whole system.
They are used by software engineers as the starting point for the system design. They show what
the system does, not how it should be implemented.
PROCESSOR : Intel I5
RAM : 4GB
HARD DISK : 40 GB
All Potato and Tomato images come from the “PlantVillage Dataset,” an open-access repository
with a total of 54,323 images. All Rice imagery originates from the “Rice Diseases Image
Dataset”. All pictures were collected in a controlled environment, which is expected to
introduce model bias. To assess this, a test dataset of 50 images sourced from Google will also be
created. These photos contain more plant anatomy, background data on the field, and different
disease stages.
Step 1 is designed to study the impact that the image size has on model efficiency. Five image
sizes from 150 x 150 to 255 x 255 are tested in total. Initially, pre-trained weights are
downloaded for the CNN. All layers except the last two are frozen, as is the default in transfer
learning. The last two layers contain new weights and are specific to the classification of plant
diseases. Freezing allows these layers to be trained independently, without backpropagating
gradients through the frozen layers. The final layers are trained in this way using the 1cycle
policy.
Data Collection
This study used a publicly available Kaggle dataset for rice leaf disease detection. The database
was created with images taken from publicly available rice leaf disease datasets. The Kaggle
dataset contains images of diseased rice leaves as well as images of healthy leaves.
From the total images we have chosen 100 images with disease and 122 normal images.
Data Augmentation
Since the image classes are heavily imbalanced, we augment the training data to get balanced
distribution among the classes. We mirror and rotate the images to create a new augmented data
set.
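Mirroring and rotation can be sketched on a tiny image stored as a nested list. This is only an illustration of the two operations; the function names are hypothetical, and a real pipeline would normally rely on an image library or the augmentation utilities of the training framework.

```python
def mirror(image):
    """Flip an image left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def augment(image):
    """Return the original plus one mirrored and three rotated copies."""
    variants = [image, mirror(image)]
    rotated = image
    for _ in range(3):                 # 90, 180 and 270 degrees
        rotated = rotate90(rotated)
        variants.append(rotated)
    return variants

# Hypothetical 2 x 2 grayscale "leaf" image.
leaf = [[1, 2],
        [3, 4]]
augmented = augment(leaf)              # five images from one original
```

Applying more augmentations to the smaller class than to the larger one is one way to even out the class distribution.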
Image Preprocessing:-
Image processing is divided into analogue image processing and digital image
processing.
Digital image processing is the use of computer algorithms to perform image processing
on digital images. As a subfield of digital signal processing, digital image processing has
many advantages over analogue image processing. It allows a much wider range of
algorithms to be applied to the input data. The aim of digital image processing is to
improve the image data (features) by suppressing unwanted distortions and/or
enhancing some important image features so that our AI computer vision models
can benefit from this improved data.
Read Images: - In this step, we store the path to our image dataset in a variable and then
create a function to load the folders containing images into arrays.
Resize image: - In this step, in order to visualize the change, we create two
functions to display the images: the first displays one image and the second
displays two images. After that, we create a function called processing that receives
the images as a parameter. The reason for resizing is that images captured by a
camera and fed to our AI algorithm vary in size; therefore, we should establish a base size
for all images fed into our AI algorithms.
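The idea of bringing every image to one base size can be shown with a nearest-neighbour resize written in plain Python. This is a sketch for illustration only: `resize_nearest` is a hypothetical helper, and the project itself would normally call the resize routine of an image library.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    return [
        # Map each target pixel back to the closest source pixel.
        [image[r * old_height // new_height][c * old_width // new_width]
         for c in range(new_width)]
        for r in range(new_height)
    ]

small = [[1, 2],
         [3, 4]]
resized = resize_nearest(small, 4, 4)
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```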
DATA SPLITTING:-
A dataset used for machine learning should be partitioned into three subsets: training,
test, and validation sets.
Training set: - A data scientist uses a training set to train a model and define its optimal
parameters, that is, the parameters it has to learn from data.
Test set: - A test set is needed for an evaluation of the trained model and its capability for
generalization. The latter means a model’s ability to identify patterns in new unseen data
after having been trained on the training data. It’s crucial to use different subsets for
training and testing to avoid model overfitting, which is the incapacity for generalization
mentioned above.
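A three-way split can be sketched as follows. The 70/15/15 ratios, the fixed seed and the file names are assumptions made for the example, not values taken from the project.

```python
import random

def split_dataset(items, train=0.7, val=0.15, seed=42):
    """Shuffle items and split them into training, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])      # the remainder becomes the test set

# 222 hypothetical file names (100 diseased + 122 healthy images).
paths = [f"img_{i:03d}.jpg" for i in range(222)]
train_set, val_set, test_set = split_dataset(paths)
```

Because the three lists are slices of one shuffled list, no image can appear in more than one subset.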
MODELING:-
During this stage, a data scientist trains numerous models to define which one of them
provides the most accurate predictions.
Model training:-
It’s time to train the model with this limited number of images. The fastai library offers many
architectures to use, which makes it very easy to use transfer learning. We can create a
convolutional neural network (CNN) model using the pre-trained models that work for
most of the applications/datasets.
We are going to use the ResNet architecture, as it is both fast and accurate for many datasets
and problems. The 18 in resnet18 represents the number of layers in the neural
network.
We also pass the metric to measure the quality of the model’s predictions using the
validation set from the dataloader. We are using error_rate which tells us how frequently
the model is making incorrect predictions.
The fine_tune method is analogous to the fit() method in other ML libraries. Now, to
train the model, we need to specify the number of times (epochs) we want to train the
model on each image.
TRANSFER LEARNING:
Transfer learning refers to the automatic extraction of features from new data sets using
pre-trained models. This method is a convenient way to apply deep learning without the need for
large data sets and time-consuming calculations and training. In some cases where the sample
size is small, transfer learning is an effective machine learning method [25–27]. Pre-trained
convolutional neural network models are used. These models have been trained on the 1000 classes
and 1.2 million samples of the ImageNet dataset, and have strong feature extraction capabilities.
This study is based on four pre-trained convolutional neural network models (Alexnet,
Inception-v3, Resnet-18, VGG16), and the last three layers of each network’s structure are
modified to make them suitable for the plant disease classification application.
Taking the Alexnet network model as an example, the last three layers of the Alexnet
convolutional neural network are discarded, and the bottleneck layer is taken as the feature
extraction result of the new model. A fully connected layer, a softmax layer and a
classification output layer are then added to form a new classification model.
Model training:-
It’s time to train the model with this limited number of images. The fastai library offers many
architectures to use, which makes it very easy to use transfer learning.
We can create a convolutional neural network (CNN) model using the pre-trained models
that work for most of the applications/datasets.
We are going to use the VGG16 architecture, as it is both fast and accurate for many datasets
and problems. The 16 in VGG16 represents the number of weight layers in the neural
network.
We also pass the metric to measure the quality of the model’s predictions using the
validation set from the dataloader. We are using error_rate which tells us how frequently
the model is making incorrect predictions.
The fine_tune method is analogous to the fit() method in other ML libraries. Now, to
train the model, we need to specify the number of times (epochs) we want to train the
model on each image.
CNN is a type of neural network widely used for image recognition and image
classification. CNN uses supervised learning. A CNN consists of filters or neurons that have
weights and biases. Every filter takes some inputs and performs convolution on the
acquired input. The CNN classifier has four kinds of layers: convolutional, pooling, Rectified
Linear Unit (ReLU), and fully connected layers.
i. Convolutional layer
This layer extracts the features from the image which is applied as input. The neurons
convolve the input image and produce a feature map, and this output
is fed as an input to the next layer.
ii. Pooling layer
This layer is used to decrease the dimensions of the feature map while still maintaining all the
important features. This layer is usually placed between two convolutional layers.
A fully connected (FC) layer is one in which each neuron in the previous layer is connected to
each neuron in the next layer. It is used to classify the input image into various
classes based on the training dataset.
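How a convolutional layer followed by ReLU turns an image into a feature map can be sketched in plain Python. The image, the vertical-edge kernel and the helper names are illustrative assumptions; in a trained CNN the kernel values are learned rather than fixed by hand.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN convolution layers compute."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            # Multiply the kernel with the patch under it and sum the products.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

# Hypothetical 4 x 4 image: a bright vertical stripe on a dark background.
image = [[0, 9, 9, 0]] * 4
vertical_edge = [[-1, 0, 1]] * 3       # responds to dark-to-bright transitions

raw = conv2d(image, vertical_edge)     # [[27, -27], [27, -27]]
feature_map = relu(raw)                # [[27, 0], [27, 0]]
```

The kernel fires on the left edge of the stripe, and ReLU removes the negative response produced at the opposite edge.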
It has four phases:
1. Model construction
2. Model training
3. Model testing
4. Model evaluation
Model construction depends on machine learning algorithms. In this project’s case, it was
Convolutional Neural Networks. After model construction it is time for model training.
Here, the model is trained using training data and the expected output for this data. Once the
model has been trained it is possible to carry out model testing. During this phase a
second set of data is loaded. This data set has never been seen by the model, and therefore
its true accuracy can be verified. After the model training is complete, the saved model
can be used in the real world. The name of this phase is model evaluation.
VGG16 model:
Transfer learning generally refers to a process where a model trained on one problem is
used in some way on a second related problem. In deep learning, transfer learning is a
technique whereby a neural network model is first trained on a problem similar to the
problem that is being solved. One or more layers from the trained model are then used in
a new model trained on the problem of interest.
Transfer learning has the benefit of decreasing the training time for a neural network
model and can result in lower generalization error.
The weights in re-used layers may be used as the starting point for the training process
and adapted in response to the new problem. This usage treats transfer learning as a type
of weight initialization scheme. This may be useful when the first related problem has a
lot more labeled data than the problem of interest and the similarity in the structure of the
problem may be useful in both contexts.
VGG16 is a convolutional neural network (CNN) model that was used to win the ILSVRC (ImageNet)
contest in 2014. It is considered one of the most influential vision model architectures. The most
distinctive thing about VGG16 is that instead of having a large number of hyper-parameters,
its authors focused on having convolution layers of a 3x3 filter with stride 1, always using
same padding, and max pool layers of a 2x2 filter with stride 2. It follows this arrangement of
convolution and max pool layers consistently throughout the whole model. Finally, it has 3 FC
(fully connected) layers, the last of which is followed by a softmax for output. The 16 in VGG16
refers to the model having 16 layers that have weights. This network is pretty large: it has
about 138 million parameters.
PERFORMANCE METRICS:
Data was divided into two portions, training data and testing data, consisting of
70% and 30% of the data respectively. Both algorithms were applied to the same dataset using
Enthought Canopy and results were obtained.
Prediction accuracy is the main evaluation parameter that we used in this work. Accuracy is the
overall success rate of the algorithm and can be defined as
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
CONFUSION MATRIX:
It is the most commonly used evaluation metric in predictive analysis, mainly because it is very
easy to understand and it can be used to compute other essential metrics such as accuracy, recall,
precision, etc. It is an NxN matrix that describes the overall performance of a model when used
on some dataset, where N is the number of class labels in the classification problem.
Accuracy is the number of true positive and true negative predictions divided by the total
number of positive and negative samples. True Positive (TP), True Negative (TN), False Negative
(FN) and False Positive (FP) counts predicted by all algorithms are presented in the table.
True positive (TP) indicates that a positive sample is predicted as positive; it is the
number of positive samples that the model correctly predicted.
False negative (FN) indicates that a positive sample is predicted as negative; it is the
number of positive samples that the model missed.
False positive (FP) indicates that a negative sample is predicted as positive; it is the
number of negative samples that the model wrongly flagged as positive.
True negative (TN) indicates that a negative sample is predicted as negative; it is the
number of negative samples that the model correctly predicted.
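The four counts and the metrics derived from them can be computed with a few lines of Python. The labels and predictions below are invented for illustration and are not results from this project.

```python
def confusion_counts(actual, predicted, positive="diseased"):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1      # positive sample predicted as positive
            else:
                fp += 1      # negative sample predicted as positive
        else:
            if a == positive:
                fn += 1      # positive sample predicted as negative
            else:
                tn += 1      # negative sample predicted as negative
    return tp, tn, fp, fn

# Hypothetical ground truth and model output for six leaf images.
actual    = ["diseased", "diseased", "healthy", "healthy", "diseased", "healthy"]
predicted = ["diseased", "healthy",  "healthy", "diseased", "diseased", "healthy"]

tp, tn, fp, fn = confusion_counts(actual, predicted)   # (2, 2, 1, 1)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

For a problem with N classes, the same counting is repeated per class, treating that class as positive and all others as negative.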
The below state chart diagram describes the flow of control from one state to another state
(event) in the flow of the events from the creation of an object to its termination.
ER DIAGRAM:
The ER diagram is used to represent relational databases. To automate the normalization described
above, we need a description of the relations as well as functional dependency (FD) information.
The ER diagram tools take this as input, provide the relations together with their FD information,
and normalize them.
DEPLOYMENT DIAGRAM:
Data Dictionary:
Data stored in a dictionary can be modified, so dictionaries are called mutable objects. A
dictionary in Python is a collection of data values used to store data like a map: unlike other
data types that hold only a single value as an element, a dictionary holds key:value pairs.
Since Python 3.7, a dictionary keeps its items in insertion order. Looking values up by key is
what makes the dictionary efficient. Each key is separated from its value by a colon (:), whereas
each key-value pair is separated by a comma. A dictionary in Python works similarly to a
dictionary in the real world. Keys of a dictionary must be unique and of an immutable data type
such as strings, integers, and tuples, but the values can be repeated and be of any type.
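These dictionary properties can be seen in a short example. The class indices and label names are hypothetical and only serve to illustrate the behaviour.

```python
# Hypothetical mapping from class index to disease label.
labels = {0: "healthy", 1: "leaf blight", 2: "brown spot"}

labels[3] = "leaf smut"          # dictionaries are mutable: add a new pair
labels[1] = "bacterial blight"   # keys are unique: this overwrites the old value

# Keys must be hashable (immutable) objects, for example tuples.
image_sizes = {("potato", "train"): 150, ("potato", "test"): 150}

list_key_rejected = False
try:
    bad = {["rice"]: 1}          # a list is mutable, so it cannot be a key
except TypeError:
    list_key_rejected = True

key_order = list(labels)         # insertion order is preserved: [0, 1, 2, 3]
```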
Using this table, the user is able to upload the attribute details, and then the algorithm is
selected to do the prediction.
In our project we use waterfall model as our software development cycle because of its step-by-
step procedure while implementing.
FEASIBILITY STUDY
The feasibility of the project is analysed in this phase and business proposal is put forth with a
very general plan for the project and some cost estimates. During system analysis the feasibility
study of the proposed system is to be carried out. This is to ensure that the proposed system is
not a burden to the company. For feasibility analysis, some understanding of the major
requirements for the system is essential.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
Economic feasibility:
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can pour into the research and development
of the system is limited. The expenditures must be justified. The developed system is well
within the budget, and this was achieved because most of the technologies used are freely
available. Only the customized products had to be purchased.
Technical feasibility:
This study is carried out to check the technical feasibility, that is, the technical requirements of
the system. Any system developed must not place a high demand on the available technical
resources, as this would lead to high demands being placed on the client. The developed system
must have modest requirements, as only minimal or no changes are required for implementing
this system.
Social feasibility:
The aspect of study is to check the level of acceptance of the system by the user. This includes
the process of training the user to use the system efficiently. The user must not feel threatened by
the system, instead must accept it as a necessity. The level of acceptance by the users solely
depends on the methods that are employed to educate the user about the system and to make him
familiar with it. His level of confidence must be raised so that he is also able to make some
constructive criticism, which is welcomed, as he is the final user of the system.
Requirements analysis is a very critical process that enables the success of a system or software
project to be assessed. Requirements are generally split into two types: functional and
non-functional requirements.
Functional Requirements: These are the requirements that the end user specifically
demands as basic facilities that the system should offer. All these functionalities need to be
necessarily incorporated into the system as a part of the contract. These are represented or
stated in the form of input to be given to the system, the operation performed and the output
expected. They are basically the requirements stated by the user which one can see directly in
the final product, unlike the non-functional requirements.
Examples of functional requirements:
1) Authentication of user whenever he/she logs into the system
2) System shutdown in case of a cyber-attack
3) A verification email is sent to user whenever he/she register for the first time on some
software system.
Non-functional requirements: These are basically the quality constraints that the
system must satisfy according to the project contract. The priority or extent to which these
factors are implemented varies from one project to other. They are also called non-behavioral
requirements.
They basically deal with issues like:
Portability
Security
Maintainability
Reliability
Scalability
Performance
Reusability
Flexibility
Examples of non-functional requirements:
1) Emails should be sent with a latency of no greater than 12 hours from such an activity.
2) The processing of each request should be done within 10 seconds
3) The site should load in 3 seconds when the number of simultaneous users is > 10000
CHAPTER 3
SOFTWARE SPECIFICATION
3.1 GENERAL
ANACONDA
It is a free and open-source distribution of the Python and R programming languages for
scientific computing (data science, machine learning applications, large-scale data processing,
predictive analytics, etc.), that aims to simplify package management and deployment.
The big difference between Conda and the pip package manager is in how package dependencies
are managed, which is a significant challenge for Python data science and the reason Conda
exists. Pip installs all Python package dependencies required, whether or not those conflict with
other packages you installed previously.
So your working installation of, for example, Google Tensorflow, can suddenly stop working
when you pip install a different package that needs a different version of the Numpy library.
More insidiously, everything might still appear to work but now you get different results from
your data science, or you are unable to reproduce the same results elsewhere because you didn't
pip install in the same order.
Conda analyzes your current environment, everything you have installed, any version limitations
you specify (e.g. you only want tensorflow >= 2.0) and figures out how to install compatible
dependencies. Or it will tell you that what you want can't be done. Pip, by contrast, will just
install the thing you wanted and any dependencies, even if that breaks other things.
Open source packages can be individually installed from the Anaconda repository, Anaconda Cloud
(anaconda.org), or your own private repository or mirror, using the conda install command.
Anaconda Inc compiles and builds all the packages in the Anaconda repository itself, and
provides binaries for Windows 32/64 bit, Linux 64 bit and MacOS 64-bit. You can also install
anything on PyPI into a Conda environment using pip, and Conda knows what it has installed
and what pip has installed. Custom packages can be made using the conda build command, and
can be shared with others by uploading them to Anaconda Cloud, PyPI or other repositories.
The default installation of Anaconda2 includes Python 2.7 and Anaconda3 includes Python 3.7.
However, you can create new environments that include any version of Python packaged with
conda.
JupyterLab
Jupyter Notebook
QtConsole
Spyder
Glueviz
Orange
Rstudio
Visual Studio Code
Microsoft .NET is a set of Microsoft software technologies for rapidly building and integrating
XML Web services, Microsoft Windows-based applications, and Web solutions. The .NET
Framework is a language-neutral platform for writing programs that can easily and securely
interoperate. There’s no language barrier with .NET: there are numerous languages available to
the developer including Managed C++, C#, Visual Basic and Java Script. The .NET framework
provides the foundation for components to interact seamlessly, whether locally or remotely on
different platforms. It standardizes common data types and communications protocols so that
components created in different languages can easily interoperate.
“.NET” is also the collective name given to various software components built upon the .NET
platform. These will be both products (Visual Studio.NET and Windows.NET Server, for
instance) and services (like Passport, .NET My Services, and so on).
Easy to code
Free and Open Source
Object-Oriented Language
GUI Programming Support
High-Level Language
Extensible feature
Python is Portable language
Python is Integrated language
Interpreted
Large Standard Library
Dynamically Typed Language
3.3 PYTHON:
Python is a powerful multi-purpose programming language created by Guido van
Rossum.
It has simple easy-to-use syntax, making it the perfect language for someone
trying to learn computer programming for the first time.
Features Of Python :
1. Easy to code:
Python is a high-level programming language. Python is very easy to learn compared
to other languages like C, C#, JavaScript, Java, etc. It is very easy to code in Python, and
anybody can learn the basics of Python in a few hours or days. It is also a developer-friendly
language.
3. Object-Oriented Language:
One of the key features of Python is object-oriented programming. Python supports
object-oriented concepts such as classes, objects, encapsulation, etc.
5. High-Level Language:
Python is a high-level language. When we write programs in Python, we do not need to
remember the system architecture, nor do we need to manage the memory.
6. Extensible feature:
Python is an extensible language. We can write some of our Python code in C or C++
and compile that code with a C/C++ compiler.
9. Interpreted Language:
Python is an interpreted language because Python code is executed line by line. Unlike
other languages such as C, C++, Java, etc., there is no separate compile step, which makes it
easier to debug our code. The source code of Python is converted into an intermediate form
called bytecode.
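The conversion to bytecode can be observed with the standard library. The snippet below is a small sketch; the source line and variable names are made up, and the exact instruction names vary between Python versions.

```python
import dis

source = "total = price * quantity"

# compile() turns source text into a code object that holds the bytecode.
code = compile(source, "<example>", "exec")

# Each instruction is one step executed by the interpreter's virtual machine.
opnames = [ins.opname for ins in dis.get_instructions(code)]

namespace = {"price": 4, "quantity": 3}
exec(code, namespace)            # run the compiled bytecode
result = namespace["total"]      # 12
```

Calling `dis.dis(code)` prints the same instructions in a readable listing, which is handy when exploring how a statement is executed.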
APPLICATIONS OF PYTHON :
WEB APPLICATIONS
You can create scalable Web Apps using frameworks and CMS (Content Management
Systems) that are built on Python. Some of the popular platforms for creating Web Apps
are: Django, Flask, Pyramid, Plone, Django CMS.
Sites like Mozilla, Reddit, Instagram and PBS are written in Python.
3.2.1 SCIENTIFIC AND NUMERIC COMPUTING
There are numerous libraries available in Python for scientific and numeric computing.
There are libraries like SciPy and NumPy that are used in general purpose computing.
And there are specific libraries like EarthPy for earth science, AstroPy for astronomy,
and so on.
Also, the language is heavily used in machine learning, data mining and deep learning.
Python is slow compared to compiled languages like C++ and Java. It might not be a
good choice if resources are limited and efficiency is a must.
However, Python is a great language for creating prototypes. For example: You can use
Pygame (library for creating games) to create your game's prototype first. If you like the
prototype, you can use language like C++ to create the actual game.
3.2.3 GOOD LANGUAGE TO TEACH PROGRAMMING
GENERAL
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies and/or a finished product. It is the
process of exercising software with the intent of ensuring that the software system meets its
requirements and user expectations and does not fail in an unacceptable manner. There are
various types of tests, and each test type addresses a specific testing requirement.
DEVELOPING METHODOLOGIES
The test process is initiated by developing a comprehensive plan to test the general
functionality and special features on a variety of platform combinations. Strict quality control
procedures are used.
The process verifies that the application meets the requirements specified in the system
requirements document and is bug free. The following considerations are used to develop the
framework for the testing methodologies.
TYPES OF TESTING
WHITE BOX TESTING:
White box testing is a software testing technique in which the internal structure, design and
coding of the software are tested to verify the flow from input to output and to improve design,
usability and security. Because the code is visible to testers, it is also called clear box testing,
open box testing, transparent box testing, code-based testing, glass box testing or structural
testing.
White box testing techniques analyze the internals of the software (the data structures used,
the internal design, the code structure and the way the software works) rather than just its
functionality, as in black box testing.
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic
is functioning properly and that program inputs produce valid outputs. All decision branches and
internal code flow should be validated. It is the testing of individual software units of the
application, and it is done after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at component level and test a specific business process, application, and/or
system configuration. Unit tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and expected
results.
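A unit test of this kind can be sketched with Python's built-in unittest module. The function normalize_pixels below is a hypothetical unit (scaling pixel values to the range 0 to 1) chosen only for illustration; it is an assumption and not taken from the project code. The two test methods cover both decision branches of the unit.

```python
import unittest


def normalize_pixels(pixels):
    """Hypothetical unit under test: scale 0-255 pixel values to 0.0-1.0."""
    if any(p < 0 or p > 255 for p in pixels):
        raise ValueError("pixel values must be in the range 0-255")
    return [p / 255 for p in pixels]


class NormalizePixelsTest(unittest.TestCase):
    def test_valid_input_is_scaled(self):
        # Clearly defined input and expected result.
        self.assertEqual(normalize_pixels([0, 255]), [0.0, 1.0])

    def test_out_of_range_input_is_rejected(self):
        # The error branch must be exercised as well.
        with self.assertRaises(ValueError):
            normalize_pixels([300])


def run_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizePixelsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```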
Functional test
Functional tests provide systematic demonstrations that functions tested are available as
specified by the business and technical requirements, system documentation, and user manuals.
Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
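The first two items (valid and invalid input classes) can be illustrated with a small check. This is only a sketch: the function is_valid_upload and the set of accepted file extensions are assumptions made for the example, not the project's actual input rules.

```python
import os

# Assumed set of accepted image formats (illustrative only).
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_valid_upload(filename):
    """Hypothetical input check: accept only names with an allowed image extension."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS


def check_input_classes():
    """Exercise one class of valid inputs and one class of invalid inputs."""
    valid_inputs = ["leaf.jpg", "LEAF.PNG", "tomato_01.jpeg"]
    invalid_inputs = ["notes.txt", "leaf", "archive.zip"]
    all_valid_accepted = all(is_valid_upload(name) for name in valid_inputs)
    all_invalid_rejected = not any(is_valid_upload(name) for name in invalid_inputs)
    return all_valid_accepted, all_invalid_rejected


print(check_input_classes())
```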
System Test
System testing ensures that the entire integrated software system meets requirements. It
tests a configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.
Performance Test
The performance test ensures that the output is produced within the time limits. It measures
the time taken by the system to process a request sent by the user, compute the result and
return a response.
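A time-limit check of this kind can be sketched with time.perf_counter from the standard library. The function classify_stub is a hypothetical stand-in for the real prediction step, and the one-second limit is an assumed value; neither comes from the project.

```python
import time

# Assumed response-time limit in seconds (illustrative only).
TIME_LIMIT_SECONDS = 1.0


def classify_stub(image):
    """Hypothetical stand-in for the model's prediction step."""
    return "healthy" if sum(image) % 2 == 0 else "diseased"


def timed_call(func, *args):
    """Run func(*args) and return its result with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed_call(classify_stub, [10, 20, 30])
within_limit = elapsed <= TIME_LIMIT_SECONDS
print(result, within_limit)
```

In a real performance test, the stub would be replaced by the deployed model and the measurement repeated over many requests.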
Integration Testing
Software integration testing is the incremental testing of two or more integrated software
components on a single platform, with the aim of exposing failures caused by interface
defects.
The task of the integration test is to check that components or software applications (for
example, components in a software system or, one step up, software applications at the company
level) interact without error.
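The idea of an interface defect can be sketched with two small components. Both functions below are hypothetical and the 0.5 brightness threshold is an assumption; the point is only that classify expects values already scaled by preprocess, so the two must be tested together.

```python
def preprocess(pixels):
    """Hypothetical component 1: scale 0-255 pixel values to 0.0-1.0."""
    return [p / 255 for p in pixels]


def classify(features):
    """Hypothetical component 2: expects features in 0.0-1.0.

    The 0.5 mean-brightness threshold is an illustrative assumption.
    """
    mean = sum(features) / len(features)
    return "healthy" if mean >= 0.5 else "diseased"


def predict(pixels):
    """Integrated pipeline: the output of preprocess feeds classify."""
    return classify(preprocess(pixels))


# Integration check: the components agree on the value range at their interface.
print(predict([255, 255, 255]), predict([0, 0, 30]))
```

Calling classify directly on raw pixel values, for example classify([10, 10]), returns "healthy" although predict([10, 10]) returns "diseased". Each component passes its own unit tests, and only a test of the combined pipeline reveals the mismatch.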
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional requirements.
IMPLEMENTATION
4.1 GENERAL
CHAPTER 5
CONCLUSION AND REFERENCES
CONCLUSION
A CNN architecture was developed to discriminate between plant images of crop species and weed
species, and their diseases, at several early growth stages. The proposed CNN-based leaf deficiency
identification model is capable of distinguishing three different leaf deficiencies from healthy leaves.
Because a CNN does not require tedious preprocessing of input images or hand-designed features, and
offers a faster convergence rate and good training performance, it is preferred over conventional
algorithms in many applications. The classification accuracy can be further increased by providing more
images in the dataset and tuning the parameters of the CNN model. In this paper, we proposed a crop
disease detection system using a CNN, based on deep learning. The described system can be used
efficiently by farmers, as it gives instant information about crop disease. It also helps reduce the
outbreaks and upsurges that cause huge losses to crops and pastures and threaten the
livelihoods of vulnerable farmers. Compared with traditional crop disease detection systems,
the described system gives an accuracy rate of 89%, which corresponds to correct detection of
roughly 9 crop images out of every 10. The experimental results demonstrate the effectiveness of the
proposed system, and it can be used widely by farmers to detect crop disease.
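The relationship between an accuracy figure and the number of correctly detected images can be made concrete with a short calculation. The labels below are invented purely for illustration and are not the project's experimental data; with 9 of 10 predictions correct the accuracy is 90%, which is close to the 89% reported above.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented example labels (not the project's results).
actual = ["blight", "healthy", "rust", "healthy", "blight",
          "rust", "healthy", "blight", "rust", "healthy"]
predicted = ["blight", "healthy", "rust", "healthy", "blight",
             "rust", "healthy", "blight", "rust", "blight"]  # last one is wrong

score = accuracy(predicted, actual)
print(f"{score:.0%}")
```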
FUTURE SCOPE
This system considers only the leaf of the plant to detect the disease of that crop. It would be more
useful if other parts of the crop, such as roots, stems and branches, were also considered, which could
increase the detection accuracy beyond the current level. Image categorization will also be added to
check whether a given image shows a leaf of a supported category. At present, if the model is given an
input other than a leaf image, it still reports a disease name for it.
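One simple way such a check could be approximated is to reject predictions whose top class probability falls below a threshold. This is a swapped-in illustration and not the image categorization step planned for the system: the function names, the class labels and the 0.6 threshold are all assumptions, and a confidence threshold alone is known to be an imperfect filter for unrelated images.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(scores, labels, threshold=0.6):
    """Return the top label, or "unknown" when the model is not confident enough.

    The threshold value is an illustrative assumption.
    """
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] < threshold:
        return "unknown"
    return labels[best]


labels = ["healthy", "blight", "rust"]  # hypothetical class names
print(predict_or_reject([5.0, 0.1, 0.2], labels))
print(predict_or_reject([1.0, 1.0, 1.0], labels))
```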
5.2 REFERENCES:
[1]. F. Ahmed, H. A. Al-Mamun, A. S. M. H. Bari, E. Hossain, “Classification of crops and weeds from
digital images: A SVM approach”, Elsevier, 2012.
[2]. Suhaili Beeran Kutty, Noor Ezan Abdullah, Dr. Hadzli Hashim, A’zraa Afhzan Ab Rahim, Aida
Sulinda Kusim, Tuan Norjihan Tuan Yaakub, Puteri Nor Ashikin Megat Yunus, Mohd Fauzi Abd Rahman,
“Classification of Watermelon Leaf Diseases Using Neural Network Analysis”, 2013 IEEE Business
Engineering and Industrial Applications Colloquium (BEIAC).
[3]. Godliver Owomugisha, John A. Quinn, Ernest Mwebaze and James Lwasa, “Automated Vision-
Based Diagnosis of Banana Bacterial Wilt Disease and Black Sigatoka Disease”, Proceedings of the
International conference on the use of mobile ICT in Africa 2014, ISBN: 978-0-7972-1533-7.
[4]. Dhakate, Mrunmayee, and A. B. Ingole. "Diagnosis of pomegranate plant diseases using neural
network." Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2015
Fifth National Conference on. IEEE, 2015.
[5]. Singh, Arti, et al. "Machine learning for high-throughput stress phenotyping in plants." Trends in
plant science 21.2 (2016): 110-124.
[6]. Mohanty, Sharada P., David P. Hughes, and Marcel Salathé. "Using deep learning for image-based
plant disease detection." Frontiers in plant science 7 (2016): 1419.
[7]. Mahlein, Anne-Katrin. "Plant disease detection by imaging sensors–parallels and specific
demands for precision agriculture and plant phenotyping." Plant disease 100.2 (2016): 241-251.
[8]. Behmann, Jan, et al. "A review of advanced machine learning methods for the detection of biotic
stress in precision crop protection." Precision Agriculture 16.3 (2015): 239-260.
[9]. Prince, Gillian, John P. Clarkson, and Nasir M. Rajpoot. "Automatic detection of diseased tomato
plants using thermal and stereo visible light images." PloS one 10.4 (2015): e0123262.
[10]. Khirade, Sachin D., and A. B. Patil. "Plant disease detection using image processing."
Computing Communication Control and Automation (ICCUBEA), 2015 International Conference on.
IEEE, 2015.