DL Decode
Unit lI
Deep Learning
Recurrent Neural Networks
Because the definition of s at time t refers back to the identical definition at time t - 1, the equation is recurrent.
We now rewrite equation (Q.2.4) using the variable h as the state, to show that the state is the network's hidden units. Equation (Q.2.5) or a related equation is frequently used by recurrent neural networks.
Fig.: A recurrent network that processes information from the input x by incorporating it into the state h. Circuit schematic (left). The same network as a computational graph that has been unfolded (right), where each node is now connected to a specific time occurrence.
When the network is trained for forecasting the future from the past, it learns to use the state h as a summary of the inputs seen so far. Since it converts an arbitrary length sequence (x(t), x(t-1), ..., x(1)) to a fixed length vector h, this summary is inherently lossy. This summary may retain some former sequence elements with greater precision than others, depending on the training criterion. For instance, it might not be necessary to store all of the data in the input sequence up to time t, only enough to predict the rest of the sentence, if the RNN is used in statistical language modelling, which typically predicts the next word given previous words.
The circumstance when we require h to be rich enough to allow one to roughly recover the input sequence, as in autoencoders, is the most challenging.
Q.4 Explain architecture of recurrent neural network.
Ans.: Fig. Q.4.1 shows architecture of recurrent neural network.
Fig. Q.4.1 Architecture of recurrent neural network (input layer, recurrent network of neurons, output layer (classes))
Ans.: Computing the gradient through a recurrent neural network is straightforward. The generalized back-propagation method is simply applied to the unrolled computational graph. No specialized algorithms are required. Applying back-propagation to the unrolled graph is called the back-propagation through time (BPTT) technique. The gradients obtained by back-propagation can then be used with any general-purpose gradient-based approach to train an RNN.
We give an example of how to compute gradients via BPTT for the aforementioned RNN equations.
Starting at the end of the sequence, we work our way backward. At the final time step $\tau$, $h^{(\tau)}$ only has $o^{(\tau)}$ as a descendent, so its gradient is simple:
$\nabla_{h^{(\tau)}} L = V^{\top} \nabla_{o^{(\tau)}} L$ ...(Q.5.3)
We can then iterate backwards in time to back-propagate gradients through time, from $t = \tau - 1$ down to $t = 1$, noting that $h^{(t)}$ (for $t < \tau$) has as descendants both $o^{(t)}$ and $h^{(t+1)}$. Its gradient is thus given by,
$\nabla_{h^{(t)}} L = W^{\top} \mathrm{diag}\left(1 - (h^{(t+1)})^{2}\right) \nabla_{h^{(t+1)}} L + V^{\top} \nabla_{o^{(t)}} L$
The equations we want to employ compute the contribution of a single edge in the computational graph to the gradient. The calculus $\nabla_{W} f$ operator, on the other hand, accounts for the contribution of W to the value of f resulting from every edge in the computational graph. In order to clear up this ambiguity, we construct dummy variables $W^{(t)}$, which are duplicates of W that are only utilized at time step t. The weights' contribution to the gradient at time step t is then denoted by the symbol $\nabla_{W^{(t)}}$.
The gradient on the remaining parameters is represented by the following notation:
$\nabla_{b} L = \sum_{t} \mathrm{diag}\left(1 - (h^{(t)})^{2}\right) \nabla_{h^{(t)}} L$ ...(Q.5.7)
$\nabla_{V} L = \sum_{t} \left(\nabla_{o^{(t)}} L\right) h^{(t)\top}$ ...(Q.5.8)
$\nabla_{W} L = \sum_{t} \sum_{i} \frac{\partial L}{\partial h_{i}^{(t)}} \nabla_{W^{(t)}} h_{i}^{(t)}$ ...(Q.5.9)
$= \sum_{t} \mathrm{diag}\left(1 - (h^{(t)})^{2}\right) \left(\nabla_{h^{(t)}} L\right) h^{(t-1)\top}$ ...(Q.5.10)
In theory, practically any loss may be used with a recurrent network, much like with a feedforward network. The task should guide the loss selection. We often want to interpret the output of the RNN as a probability distribution, similar to how we would with a feedforward network, and we typically use the cross-entropy of that distribution to quantify the loss.
The cross-entropy loss associated, for instance, with a feedforward network and a unit Gaussian output distribution is called mean squared error.
The RNN is then trained to estimate the conditional distribution of the following sequence element, $y^{(t)}$.
This may mean that we maximize the log-likelihood
$\log p\left(y^{(t)} \mid x^{(1)}, \ldots, x^{(t)}\right)$ ...(Q.6.1)
or, if the model includes connections from the output at one time step to the next time step,
$\log p\left(y^{(t)} \mid x^{(1)}, \ldots, x^{(t)}, y^{(1)}, \ldots, y^{(t-1)}\right)$
$L = \sum_{t} L^{(t)}$ ...(Q.6.4)
Where
$L^{(t)} = -\log P\left(y^{(t)} \mid y^{(t-1)}, \ldots, y^{(1)}\right)$ ...(Q.6.5)
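The backward recursion of back-propagation through time described in Q.5 can be illustrated with a minimal sketch. It assumes a scalar RNN (one hidden unit, tanh activation) and a squared-error loss, both chosen only to keep the example short; the names `forward` and `bptt` and the parameter order `(w, u, b, v, c)` are hypothetical and do not come from the text.

```python
import math

def forward(params, xs, ys):
    """Run the scalar RNN forward and return (loss, hidden states, outputs)."""
    w, u, b, v, c = params
    h = 0.0
    hs = [h]            # hs[0] is the initial state, hs[t] the state after step t
    outs = []
    loss = 0.0
    for x, y in zip(xs, ys):
        h = math.tanh(w * h + u * x + b)   # hidden state update
        o = v * h + c                      # output at this step
        hs.append(h)
        outs.append(o)
        loss += 0.5 * (o - y) ** 2         # assumed squared-error loss
    return loss, hs, outs

def bptt(params, xs, ys):
    """Back-propagate through the unrolled graph, from the last step to the first."""
    w, u, b, v, c = params
    loss, hs, outs = forward(params, xs, ys)
    gw = gu = gb = gv = gc = 0.0
    dh_next = 0.0                          # gradient arriving from h(t+1); zero at the last step
    for t in range(len(xs), 0, -1):
        do = outs[t - 1] - ys[t - 1]       # dL/do(t)
        gv += do * hs[t]
        gc += do
        dh = v * do + dh_next              # h(t) has descendants o(t) and h(t+1)
        da = (1.0 - hs[t] ** 2) * dh       # back through tanh: the diag(1 - h^2) factor
        gw += da * hs[t - 1]
        gu += da * xs[t - 1]
        gb += da
        dh_next = w * da                   # contribution passed on to h(t-1)
    return loss, (gw, gu, gb, gv, gc)
```

The returned gradients can be handed to any gradient-based optimizer, for example plain gradient descent that subtracts a small multiple of each gradient from its parameter.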
Fig. Q.6.1
Q.7 Explain advantages and disadvantages of RNN.
Ans.: Advantages
a) RNN can process inputs of any length.
b) The RNN model is designed to remember information across time steps, which is very helpful in any time series predictor.
c) Even if the input size is larger, the model size does not increase.
d) The weights can be shared across the time steps.
e) RNN can use their internal memory for processing arbitrary series of inputs, which is not the case with feedforward neural networks.
DECODE A Guide for Engineering Students
3.2 Types of Recurrent Neural Networks
Q.9 Explain types of recurrent neural networks.
4. Many-to-many: Several inputs are used for generating several outputs. Named entity recognition is a famous example of this.
(a) One-to-one (b) One-to-many (c) Many-to-one (d) Many-to-many
Fig. Q.10.1
Q.10
Sr. No. | RNN | CNN
2. | RNN has no restriction on the length of inputs and outputs. | CNN has finite inputs and finite outputs.
3. | RNN is primarily used for speech and text analysis. | CNN can be used for video and image processing.
4. | RNN works on loops to handle sequential data. | CNN has a feedforward network.
5. | While training the model, RNN uses back-propagation through time to calculate the gradients of the loss. | While training the model, CNN uses simple back-propagation.
(a) Recurrent Neural Network (b) Feed-forward Neural Network
Fig. Q.11.1
Q.11 What is difference between RNN and feed-forward neural networks?
Ans.: In a feed-forward neural network, the information ...
3.3: Long Short-Term Memory Network
Q.12 ... exploding gradients?
Ans.: Vanishing gradients can be overcome with the ReLU activation function, LSTM and GRU.
3.4 Encoder Decoder Architectures
The context C, which represents a semantic summary of the input sequence, is obtained from the encoder RNN's final hidden state. [Refer Fig. Q.16.1 on next page]
To maximize the average of log P(y(1), ..., y(ny) | x(1), ..., x(nx)) across all the pairings of x and y sequences in the training set, the two RNNs in a sequence-to-sequence architecture are trained simultaneously. The final state h(nx) of the encoder RNN is generally used as the representation C of the input sequence that is sent as input to the decoder RNN.
The decoder RNN can accept input in at least two different ways. The input may be given as the RNN's starting state or it may be linked to the hidden components at each time step. Both of these approaches can be combined.
There is no need for the hidden layer size of the encoder and decoder to be the same.
This design clearly has a limitation when the context C generated by the encoder RNN has a size that is too small to adequately describe a lengthy sequence. As opposed to being a fixed-size vector, making C a variable-length sequence has been suggested, together with an attention mechanism that links the components of the sequence C to the components of the output sequence.
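The flow of information in an encoder-decoder architecture can be illustrated with a toy sketch. It assumes scalar hidden states, tanh activations and fixed weights (no training), so it only shows how a variable-length input is reduced to a fixed-size context C and how the decoder can receive C both as its starting state and at every time step. All function names and weight values are hypothetical.

```python
import math

def encode(xs, w=0.5, u=1.0):
    """Encoder RNN: read the whole input and return the final hidden state as context C."""
    h = 0.0
    for x in xs:
        h = math.tanh(w * h + u * x)
    return h

def decode(context, steps, w=0.7, r=0.9, v=2.0):
    """Decoder RNN: generate `steps` outputs conditioned on the context."""
    s = math.tanh(r * context)              # way 1: context sets the starting state
    outputs = []
    for _ in range(steps):
        s = math.tanh(w * s + r * context)  # way 2: context is fed to the hidden unit at each step
        outputs.append(v * s)
    return outputs

# The input length (5) and the output length (3) are independent of each other.
context = encode([0.1, -0.4, 0.3, 0.8, -0.2])
outputs = decode(context, steps=3)
```

Because C is a single number here, a long input is squeezed into very little capacity, which is the limitation that motivates a variable-length context with attention.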
3.5: Recursive Neural Networks
Q.17 Write short note on recursive neural networks.
Ans.: Recursive neural networks are another generalization of recurrent networks. Their computational graph is structured as a deep tree, rather than the chain-like structure of RNNs. Fig. Q.17.1 depicts the computational graph for a recursive network.
Recursive networks have been effectively used in computer vision ...
Ideally, the learner would independently identify and infer the tree structure that is optimal for every given input.
The recursive network idea has many variations. The data can be associated with a tree structure, and the inputs and targets with every node of the tree. The computation performed at each node need not be the conventional artificial neuron computation.
4 Autoencoders
Q.2 Explain properties of Autoencoder.
Ans.:
Data-specific: Autoencoders are only able to meaningfully compress data similar to what they have been trained on, since they learn features specific to the given training data.
Lossy: The output of the autoencoder will not be exactly the same as the input; it will be a close but degraded representation.
Unsupervised: Autoencoders are considered an unsupervised learning technique since they don't need explicit labels to train on.
Q.3 Explain architecture of Autoencoder.
Ans.: Fig. Q.3.1 shows architecture of Autoencoder. (See Fig. Q.3.1 on next page.)
The network is trained to represent compressed knowledge of the original input. The input is compressed into a lower-dimensional code and then the output is reconstructed from this representation.
Fig. Q.3.1 Architecture of autoencoder (Encoder, Code, Decoder)
An autoencoder consists of 3 components: encoder, code and decoder. The encoder compresses the input and produces the code; the decoder then reconstructs the input using only this code.
Q.4 Which hyperparameters must be set before training the Autoencoders?
Ans.: There are four hyperparameters that must be set before training ...
Q.5 List the types of autoencoder.
Ans.: The different types of autoencoders are as follows:
1. Undercomplete autoencoders
2. Sparse autoencoders
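The encoder, code and decoder roles can be made concrete with a small sketch of an undercomplete autoencoder. It assumes a linear model with tied weights (the decoder reuses the encoder weights), a 2-D input and a 1-D code, trained by plain gradient descent on the squared reconstruction error. These choices, the toy data and every name below are illustrative assumptions, not the architecture of Fig. Q.3.1.

```python
def encode(w, x):
    """Encoder: compress a 2-D input into a 1-D code."""
    return w[0] * x[0] + w[1] * x[1]

def decode(w, h):
    """Decoder: reconstruct the 2-D input using only the code."""
    return [w[0] * h, w[1] * h]

def mean_loss(w, data):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        r = decode(w, encode(w, x))
        total += (x[0] - r[0]) ** 2 + (x[1] - r[1]) ** 2
    return total / len(data)

def train(data, w, lr=0.01, steps=500):
    """Gradient descent on the mean reconstruction error."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x in data:
            h = encode(w, x)
            r = decode(w, h)
            e = [x[0] - r[0], x[1] - r[1]]      # reconstruction error
            ew = e[0] * w[0] + e[1] * w[1]
            for i in range(2):
                # d/dw_i of ||x - w h||^2 with h = w . x (tied weights)
                grad[i] += (-2.0 * e[i] * h - 2.0 * ew * x[i]) / len(data)
        w = [w[i] - lr * grad[i] for i in range(2)]
    return w

# Toy data lying on a line, so a 1-D code can represent it well.
data = [[a * 0.6, a * 0.8] for a in (-2.0, -1.0, 1.0, 2.0)]
initial_w = [0.5, 0.1]
trained_w = train(data, initial_w)
```

The example also reflects the data-specific and lossy properties: the learned code only works for inputs near the training line, and inputs off that line are reconstructed with error.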
Fig. Q.6.1 shows a simple single-layer sparse autoencoder with equal numbers of inputs (x), outputs (x̂) and hidden nodes (h).
Sparsity can be imposed by adding a penalty term to the cost function, or by manually zeroing all but the strongest hidden unit activations.
Advantages
1. Sparse autoencoders have a sparsity penalty, a value close to zero but not exactly zero. The sparsity penalty is applied on the hidden layer in addition to the reconstruction error. This prevents overfitting.
2. They take the highest activation values in the hidden layer and zero out the rest of the hidden nodes.
Disadvantages
1. For it to work, it is essential that the individual nodes of a trained model which activate are data dependent, and that different inputs will result in activations of different nodes through the network.
Q.7 Discuss about Denoising Autoencoders.
Ans.: An autoencoder can learn useful representations by changing the reconstruction error term of the cost function rather than adding a penalty to the reconstruction error.
Denoising autoencoders are also an example of how overcomplete, high-capacity models may be used as autoencoders as long as care is taken to prevent them from learning the identity function.
4.3: Stochastic Encoders and Decoders
Q.8 Write short note on Stochastic Encoders and Decoders.
Ans.: Autoencoders are feedforward networks and use the same loss functions and output units that are used in traditional feedforward networks.
For designing the output units and the loss function of a feedforward network, an output distribution p(y | x) is defined and the negative log-likelihood -log p(y | x) is minimized, where y is a vector of targets, e.g. class labels.
As for a feedforward network, we can assume that for a given code h the decoder is providing a conditional distribution $p_{decoder}(x \mid h)$. If x is real-valued, the negative log-likelihood yields a mean squared error criterion. Binary values correspond to a Bernoulli distribution.
In general, the encoder and decoder distributions $p_{encoder}(h \mid x)$ and $p_{decoder}(x \mid h)$ need not necessarily be conditional distributions compatible with a unique joint distribution.
A denoising autoencoder (DAE) is an autoencoder that reduces the risk of learning the identity function: noise is added to the input image, the corrupted input is fed to the network, and the reconstruction loss is measured against the original input as the output.
Fig. Q.11.2 illustrates the training procedure of DAE. A corruption process $C(\tilde{x} \mid x)$ is introduced. It represents a conditional distribution over corrupted samples $\tilde{x}$, given a data sample x.
1. Sample a training example x from the training data.
2. Sample a corrupted version $\tilde{x}$ from $C(\tilde{x} \mid x)$.
3. Use $(x, \tilde{x})$ as a training example for estimating the autoencoder reconstruction distribution $p_{reconstruct}(x \mid \tilde{x}) = p_{decoder}(x \mid h)$, with h the output of encoder $f(\tilde{x})$ and $p_{decoder}$ typically defined by a decoder $g(h)$.
Gradient-based approximate minimization (such as minibatch gradient descent) can be performed on the negative log-likelihood $-\log p_{decoder}(x \mid h)$. As long as the encoder is deterministic, the denoising autoencoder is a feedforward network, so it can be trained using exactly the same techniques as any other feedforward network. DAE performs stochastic gradient descent on the following expectation:
$-\mathbb{E}_{x \sim p_{data}(x)} \, \mathbb{E}_{\tilde{x} \sim C(\tilde{x} \mid x)} \log p_{decoder}\left(x \mid h = f(\tilde{x})\right)$
where $p_{data}(x)$ is the training distribution.
4.5: Contractive Autoencoders
Q.12 Write short note on Contractive Autoencoders.
Ans.: The main goal of the Contractive Autoencoder (CAE) is to have a robust learned representation that is less sensitive to small variation in the data. This is shown in Fig. Q.12.1.
$L = \left| x - g(f(x)) \right| + \lambda \left\| J_{h}(x) \right\|_{F}^{2}$
Fig. Q.12.1 Loss function with penalty term - Frobenius norm of the Jacobian matrix
Contractive autoencoder is similar to denoising autoencoder in the sense that, in the presence of small Gaussian noise, the denoising reconstruction error is equivalent to a contractive penalty on the reconstruction function that maps x to r = g(f(x)).
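The contractive penalty, the squared Frobenius norm of the Jacobian of the hidden activations with respect to the input, has a simple closed form when the encoder is a single sigmoid layer. The sketch below assumes such an encoder, h = sigmoid(Wx + b), and a squared reconstruction error; both are illustrative assumptions, and the names `encode`, `contractive_penalty` and `cae_loss` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(W, b, x):
    """Encoder f(x): one sigmoid layer, W has one row per hidden unit."""
    return [sigmoid(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def contractive_penalty(W, b, x):
    """Squared Frobenius norm of the Jacobian dh/dx.

    For a sigmoid layer J[i][j] = h_i * (1 - h_i) * W[i][j], so the sum of the
    squared entries factorizes per hidden unit.
    """
    h = encode(W, b, x)
    return sum((hi * (1.0 - hi)) ** 2 * sum(w * w for w in row)
               for hi, row in zip(h, W))

def cae_loss(x, reconstruction, W, b, lam):
    """Reconstruction error plus lambda times the contractive penalty."""
    recon = sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruction))
    return recon + lam * contractive_penalty(W, b, x)
```

A larger `lam` pushes the encoder toward saturated or small-weight hidden units, which makes the code less sensitive to small changes in x at the cost of reconstruction quality.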
This means that, in the case of denoising autoencoders, the reconstruction function resists small but finite-sized perturbations of the input, while in contractive autoencoders the feature extraction function resists infinitesimal perturbations of the input.
Observations: The hidden layer node values can be calculated using equation (Q.12.2), and the loss function can be defined by equation (Q.12.3).
CAE surpasses results obtained by regularizing autoencoders using weight decay or by denoising. CAE is a better choice compared to the denoising autoencoder to learn useful feature extraction.
The model is encouraged to learn an encoding where similar inputs have similar encodings. So the model is forced to learn how to contract a neighborhood of inputs into a smaller neighborhood of outputs.
Fig. Q.12.2 indicates how the derivative of the reconstructed data (i.e. slope) is essentially zero for local neighborhoods of input data.
Fig. Q.12.2 Slope of the reconstructed data
This can be achieved by penalizing the instances where a small change in the input leads to a large change in the encoding space. For this, the loss term should penalize large derivatives of the hidden layer activations with respect to the input training instances.
The regularization loss term used is the squared Frobenius norm $\|A\|_{F}$ of the Jacobian matrix J of the hidden layer activations w.r.t. the input observations. A Frobenius norm is an L2 norm for a matrix. The Jacobian matrix represents all first-order partial derivatives of a vector valued function.
Dimensionality reduction benefits the task of information retrieval, i.e. finding entries in a database that resemble a query entry. For certain types of low dimensional data, search can become more efficient due to dimensionality reduction. As one of the applications of autoencoders is dimensionality reduction, they can be applied to information retrieval using semantic hashing.
By training the dimensionality reduction algorithm to produce a binary code, all the database entries can be stored in a hash table keyed by that code. Retrieval then amounts to returning all database entries that have the same code as the query. By flipping some bits from the encoding of the query, less similar entries can also be searched very efficiently. This approach of information retrieval is suitable for ...
For this, noise is injected before the sigmoid function during training and its magnitude should increase over time. So in order to preserve as much information as possible ...
But the autoencoder cannot perform the reconstruction task for images unlike those it was trained on; the reconstruction loss for such an image will be very high. So by setting an appropriate threshold it can be easily identified as an anomaly or unusual image. Due to this, autoencoders are good at powering anomaly detection systems.
Denoising autoencoders do not search for noise in the image; instead they extract the image from the noisy data fed as input to them by learning its representation. The noise-free image is then formed.
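The hash-table lookup used in semantic hashing, including the search for less similar entries by flipping bits of the query code, can be sketched as follows. The binary codes are assumed to have already been produced by a trained encoder; the names `build_index` and `search` and the toy codes are hypothetical.

```python
from itertools import combinations

def build_index(codes):
    """Map each binary code (a tuple of 0/1) to the ids of the entries that share it."""
    index = {}
    for entry_id, code in codes.items():
        index.setdefault(tuple(code), []).append(entry_id)
    return index

def search(index, query_code, max_flips=1):
    """Return ids whose code differs from the query code in at most max_flips bits."""
    query_code = tuple(query_code)
    results = []
    for flips in range(max_flips + 1):
        # Probe every code obtained by flipping exactly `flips` bits of the query.
        for positions in combinations(range(len(query_code)), flips):
            probe = list(query_code)
            for p in positions:
                probe[p] ^= 1
            results.extend(index.get(tuple(probe), []))
    return results

codes = {"doc_a": (0, 1, 1), "doc_b": (0, 1, 0), "doc_c": (1, 0, 0)}
index = build_index(codes)
exact = search(index, (0, 1, 1), max_flips=0)
nearby = search(index, (0, 1, 1), max_flips=1)
```

Each probe is a constant-time dictionary lookup, so the cost depends on the number of bit patterns probed rather than on the size of the database.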
5 Representation Learning
5.1: Greedy Layer-Wise Unsupervised Pre-training
Q.1 What is Representation Learning?
Ans.: Representation learning is concerned with training machine learning algorithms to learn useful representations.
Deep neural networks can be considered representation learning models that typically encode information which is projected into a different subspace. These representations are then usually passed on to a linear classifier to, for instance, train a classifier.
Representation learning can be divided into:
a) Supervised representation learning: Learning representations on task A using annotated data, which are then used to solve task B.
b) Unsupervised representation learning: Learning representations on a task in an unsupervised way. These are ...
Greedy layer-wise unsupervised pretraining uses an unsupervised feature learning algorithm L, which takes a training set of examples and returns an encoder or feature function f. The raw input data is X, Y holds the associated targets and T is the fine-tuning procedure.
    f ← Identity function
    X̃ = X
    for k = 1, ..., m do
        f(k) = L(X̃)
        f ← f(k) ∘ f
        X̃ ← f(k)(X̃)
    end for
    if fine-tuning then
        f ← T(f, X, Y)
    end if
    Return f
Greedy: Optimize each piece of the solution independently, one piece at a time.
Layer-Wise: The independent pieces are the layers of the network.
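The greedy layer-wise procedure can be sketched in a few lines. The sketch below follows the loop structure of the protocol, with the learner L passed in as a function that returns a layer. The `standardizing_learner` is only a toy stand-in for a real unsupervised learner such as an autoencoder, and all names are hypothetical.

```python
import math

def greedy_layerwise_pretrain(X, learner, num_layers, fine_tuner=None, Y=None):
    """Train one layer at a time; each layer sees the output of the previous ones."""
    layers = []
    X_tilde = X
    for _ in range(num_layers):
        f_k = learner(X_tilde)                 # f(k) = L(X~): fit one layer greedily
        layers.append(f_k)
        X_tilde = [f_k(x) for x in X_tilde]    # X~ <- f(k)(X~)

    def f(x):
        # Composition f(m) o ... o f(1)
        for layer in layers:
            x = layer(x)
        return x

    if fine_tuner is not None:
        f = fine_tuner(f, X, Y)                # optional joint fine-tuning step T
    return f

def standardizing_learner(X):
    """Toy unsupervised learner: returns a layer that standardizes each feature."""
    n, d = len(X), len(X[0])
    means = [sum(x[j] for x in X) / n for j in range(d)]
    # Fall back to 1.0 when a feature is constant, to avoid division by zero.
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x in X) / n) or 1.0
            for j in range(d)]

    def layer(x):
        return [(x[j] - means[j]) / stds[j] for j in range(d)]
    return layer
```

Passing the learner and the fine-tuner as arguments keeps the greedy loop independent of the model used for each layer.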
... the cost function varies so much from one example to another that mini-batches give only a very noisy estimate of the gradient, or a region surrounded by areas where the Hessian matrix is so poorly conditioned that gradient descent methods must use very small steps.
We cannot characterize exactly what aspects of the pretrained parameters are retained during the supervised training stages.
Pretraining: It is called pretraining because it is only a first step before a joint training algorithm is applied to fine-tune all layers together.
Q.6 Explain disadvantage of Unsupervised Pretraining.
Ans.:
1. Unsupervised pretraining does not offer a clear way to adjust the strength of the regularization arising from the unsupervised stage. When we perform unsupervised and supervised learning simultaneously, instead of using the pretraining strategy, there is a single hyperparameter, usually a coefficient attached to the unsupervised cost, that determines how strongly the unsupervised objective will regularize the supervised model.
2. Each of the two separate training phases has its own hyperparameters. The performance of the second phase cannot be predicted during the first phase, so there is a long delay between proposing a hyperparameter setting for the first phase and being able to evaluate it.
Two extreme forms of transfer learning:
1. One-shot learning: Only one example of the transfer task is given for one-shot learning. It is possible because the representation learns to cleanly separate the underlying classes during the first stage. During the transfer learning stage, only one labeled example is needed to infer the label of many possible test examples that all cluster around the same point in representation space.
Explain Symbolic representation.
The input is associated with a single symbol taken from a dictionary. It corresponds to n detectors, each corresponding to a different region of the representation space and a different region in input space. It is also called ...
What is Nondistributed representations? Explain example of it.
Decision tree: Only one leaf is activated when the input is given.
d. Gaussian mixture and mixtures of experts: Each input is represented with multiple values, but those values cannot readily be controlled separately from each other.
5.4 Variants of CNN : DenseNet
Ans.: A standard ConvNet uses several convolutions to extract high level characteristics from the input picture.
Identity mapping is suggested in ResNet to promote gradient propagation from one module to another.
Each layer in DenseNet receives extra inputs from all layers that came before it and transmits its own feature-maps to all layers that come after it. Concatenation is used. Each layer receives "collective knowledge" from the layers that came before it.
Fig. Q.11.1 shows the DenseNet block.
Because each layer reuses the feature-maps of all earlier layers, the network can be more compact and thin, with fewer channels. The number of extra channels added by each layer is the growth rate k.
Therefore, it has greater memory and processing efficiency.
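The channel bookkeeping implied by concatenation and the growth rate k can be checked with a short sketch. It assumes that every layer of a dense block adds exactly k feature-maps and that a transition layer with compression factor theta keeps floor(theta * m) of the m feature-maps it receives. The function names and the example sizes are hypothetical.

```python
import math

def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return (input channels seen by each layer, channels leaving the block)."""
    # Layer i receives the block input plus the k feature-maps of each earlier layer.
    inputs = [in_channels + i * growth_rate for i in range(num_layers)]
    out_channels = in_channels + num_layers * growth_rate
    return inputs, out_channels

def transition_channels(m, theta):
    """Feature-maps produced by a transition layer with compression factor theta."""
    if not 0 < theta <= 1:
        raise ValueError("theta must satisfy 0 < theta <= 1")
    return math.floor(theta * m)

layer_inputs, block_out = dense_block_channels(in_channels=24, growth_rate=12, num_layers=3)
compressed = transition_channels(block_out, theta=0.5)
```

With a small growth rate each layer contributes only a few new feature-maps, which is why the network can stay thin even though later layers see many input channels.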
Q.12 Draw and explain DenseNet Architecture.
Fig. Q.12.2 DenseNet-B
If a dense block has m feature-maps, the transition layer produces θm output feature-maps, where 0 < θ ≤ 1 is referred to as the compression factor.
The quantity of feature-maps across transition layers remains constant when θ = 1. DenseNet-C is a DenseNet with a value of θ < 1, and θ = 0.5 in the experiment.
The model is known as DenseNet-BC when both the bottleneck and transition layers with θ < 1 are implemented.
DenseNets with/without B/C, with various numbers of layers L and growth rates k, are also trained at this point.
Q.13 Why do we need DenseNets?
Ans.: DenseNet was specially developed to improve the accuracy lost to the vanishing gradient in deep neural networks: due to the long distance between input and output layers, the information vanishes before reaching its destination.
Q.14 List the advantages of DenseNets.
Ans.: Advantages:
1. Parameter efficiency: Every layer adds only a limited number ...
6 Applications of Deep Learning
6.1 Overview of Deep Learning Applications : Image Classification
Q.1 How is deep learning applied to computer vision tasks?
Ans.: With the help of convolutional neural networks, deep learning is able to perform the following tasks:
a) Object recognition b) Face recognition
c) Motion detection d) Pose estimation
e) Semantic segmentation
Object recognition (detection): Nowadays AI is able to recognize both static and dynamically moving objects with up to 99 % accuracy. In general, it is a matter of dividing the image into fragments and letting algorithms find the similarities to one of the existing objects in order to assign it to one of the classes. Classification plays an ...
... computer vision task due to the wide variety of human shapes and appearance, difficult illumination and crowded scenery. For these ...
... motion capture devices are used to estimate the location of human joints.
Semantic segmentation is a type of deep learning that attempts to classify the input into one of several classes, such as road, sky or grass. These labels are processed ...
Q.4 What is image classification in deep learning?
Ans.: Image classification is where a computer can analyse an image and identify the 'class' the image falls under. For example, given an image of a sheep, image classification is the process of the computer analyzing the image and telling you it is a sheep.
Early image classification relied on raw pixel data. This meant that computers would break down images into individual pixels.
Fig. Q.4.1 What the computer sees: 82 % cat, 15 % dog, 2 % hat, 1 % mug
... they can also be segmented into vegetation, roads, water resources and buildings. This is done to create statistical measures to be applied to the overall image.
2. Unsupervised classification: The unsupervised classification technique is a fully automated method that does not leverage training data. This means machine learning algorithms are used to analyze and cluster unlabeled datasets by discovering hidden patterns or data groups without the need for human intervention.
An image is analyzed in the form of pixels. An image is an array of pixels, where the size of the matrix depends on the resolution of the image. Image classification is done by grouping pixels into specified categories referred to as classes.
The image is segregated into its most prominent features, giving the classifier algorithm an idea about the class the image may belong to. Thus the feature extraction process is the most important step in classification.
CNN layers can be of four main types: convolution, ReLU, pooling and fully-connected layer.
1. Convolution Layer: A convolution is the simple application of a filter to an input that results in an activation. The convolution layer has a set of trainable filters that have a small receptive range but can be applied to the full depth of the data provided. Convolution layers are the major building blocks used in convolutional neural networks.
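The first three layer types can be illustrated on a tiny single-channel image. The sketch below assumes a "valid" convolution with stride 1, implemented as cross-correlation (the operation deep learning libraries commonly call convolution), followed by ReLU and 2x2 max pooling. The function names and the example filter are hypothetical, and a trainable layer would learn the filter values instead of fixing them.

```python
def conv2d(image, kernel):
    """Slide the filter over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2x2(feature_map):
    """Keep the largest value of each non-overlapping 2x2 window."""
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, len(feature_map[0]) - 1, 2)]
            for i in range(0, len(feature_map) - 1, 2)]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
activation = relu(conv2d(image, kernel))
pooled = max_pool2x2(activation)
```

A fully-connected layer would then flatten the pooled feature-maps and map them to class scores.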
Fig. Q.6.1 (Convolution, Pooling, Fully-connected; example output Tiger: 0.02)
A social network is formally defined as a set of social actors, or nodes, that are connected by one or more types of relations. The members are individuals or entities that are related.
Q.8 What is social network analysis?
Ans.: Social Network Analysis (SNA) is the study of social relations among a set of actors.
Relation: It is the collection of ties of a specific kind among members of a group. Example: The set of friendships among pairs of members of the group.
Q.9 List the principles of social network analysis.
1. Actors and their actions are viewed as interdependent units.
2. Relational ties between actors are channels for the transfer or "flow" of resources.
3. Network models focusing on individuals view the network structure environment as providing opportunities or constraints.
4. Network models conceptualize structure (social, economic, political and so forth) as lasting patterns of relations among actors.
Q.10 Explain terminology used in social network analysis.
Ans.: Terminology used in social network analysis is actor, relational tie, dyad, triad, subgroup, group and relation.
Actor: Actors are discrete individual, corporate, or collective social units. Examples: People in a group, departments within a corporation, public service agencies in a city, nation-states in a world system.
Relational tie: Actors are linked to one another by social ties. A tie establishes a linkage between a pair of actors.
Dyad: A tie between two actors. It consists of a pair of actors and the tie(s) between them.
Triad: Triples of actors and associated ties. A subset of three actors and the tie(s) among them.
Subgroup: A subgroup of actors is defined as any subset of actors and all ties among them.
Q.11 What is social network analysis? Explain.
Ans.: Social Network Analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs and other connected information/knowledge entities. The term "social network" was introduced by Barnes in 1954.
SNA is the study of social relations among a set of actors. The methods of data collection in network analysis are aimed at collecting relational data in a reliable manner.
Data collection is typically carried out using standard questionnaires and observation techniques that aim to ensure the accuracy and completeness of network data.
SNA is based on an assumption of the importance of relationships among interacting units.
The social network perspective encompasses theories, models and applications that are expressed in terms of relational concepts or processes.
Advantage of social network analysis: Network analysis focuses on interaction. It allows us to examine how the configuration of networks influences how individuals and groups, organizations, or systems function.
Features of social network analysis: Structural intuition, systematic relational data, graphic representation, and mathematical or computational models.
Social network analysis:
a) Refers to the set of actors and the ties among them.
b) Views characteristics of the social units as arising out of structural or relational processes, or focuses on properties of the relational system themselves.
A social networking service lets a user connect with many other people and share data between people. A user can create a personal profile, add other users as friends, exchange data, and create and join common interest communities.
Twitter is a networking and microblogging service. The users of Twitter can exchange text-based posts called tweets. A tweet was originally limited to a maximum of 140 characters but can be augmented by pictures or audio recordings. The main concept of Twitter was to build a social network formed by friends and followers. Friends are people who you follow; followers are those who follow you.
The role of social networks in labor markets deserves attention for at least two reasons: First, because of the central role networks play in disseminating information about job openings, they place a ...
Social network analysis helps deep learning through network embeddings. Network embedding maps network data into low dimensional representations, on which applications such as node classification, clustering, link prediction etc. are possible. Neural networks can be used to learn representations from the network structure; this is network representation learning.
The building blocks of models with embedding look-up tables are shown in Fig. Q.13.1. Sampling and modeling are the two key components of this approach.
Fig. Q.13.1 Building blocks of models with embedding look-up tables
2. Autoencoder based models
Two neural network modules of an autoencoder are:
i) Encoder and ii) Decoder
The function of the encoder is to map the features of each node into a latent space. Information about the network is then reconstructed by the decoder from this latent space. The size of the hidden representation layer is usually small as compared to the input/output layer.
Q.14 What is graph convolutional approaches?
Ans.: Graph Convolutional Network (GCN) is an approach for learning directly on graphs.
The choice of convolutional architecture is motivated via a localized first-order approximation of spectral graph convolutions. The model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes.
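One graph convolution layer can be sketched with plain lists. The text does not state the layer formula, so the sketch assumes the commonly cited GCN propagation rule: add self-loops to the adjacency matrix, normalize it symmetrically by node degree, average neighbor features, apply a weight matrix and then ReLU. The function name `gcn_layer` and the toy graph are hypothetical.

```python
import math

def gcn_layer(adj, features, weights):
    """Compute ReLU(D^-1/2 (A + I) D^-1/2 H W) for a small dense graph."""
    n = len(adj)
    # Add self-loops so that each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    in_dim, out_dim = len(weights), len(weights[0])
    # Aggregate the features of each node's neighborhood (local graph structure).
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(in_dim)]
           for i in range(n)]
    # Linear transform of the aggregated features followed by ReLU.
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(in_dim)))
             for o in range(out_dim)]
            for i in range(n)]

# Two nodes joined by one edge, one feature per node.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
node_features = [[1.0], [3.0]]
hidden = gcn_layer(adjacency, node_features, weights=[[1.0]])
```

The dense loops are only for readability. With a sparse adjacency structure the aggregation touches each edge once, which is where the linear scaling in the number of edges comes from.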
6.3: Speech Recognition
Q.16 What is speech recognition?
Ans.: Speech recognition is the ability of a machine or program to convert spoken language into readable text.
Fig.: Sound, Pre-processing, Feature extraction, Classification model, Language Model (LM), Prediction
The features that are used for ASR are extracted with a specific number of values or coefficients, which are generated by applying various methods on the input. Feature extraction techniques are ...
The Language Model (LM) is an important module as it captures ... It is used to take the predictions from the classification model as well as to make corrections ... recognition.
Hidden sequence is h = (h_1, h_2, ..., h_N) and
Output sequence is y = (y_1, y_2, ..., y_N)
RNNs compute the sequence of hidden vectors h as: ...
on Nes
n s h m y m v t a n t
as
strength recommender systems.
apual hr in both forwaard
int
backowons
is
mh the nput
shows recommend systems concept.
d
R:AAs
i m t n s
mre
and
tinis is
hidden
h iudden
nidirectional RNN,
unidirectiona
state
vector for each
direction.
bidirectional RNThat
Fig
Q18.1
ot u s n g
inshmd
Buy
wh speeh
egnition.
u s a i Ar in.
wdel c l a s s i t i c a t i o n
of audio
input signal Similar
t r a m e - w i s e
Nrome
using
alignment
betiween nput audio and netwothy
correspon
Recommend
Models (HMNM) or
Connectionist
Temporal Classifica
Tempor
Markov Fig. Q.18.1 Recommendation systems
(CTC) loss between almost every modern
objective
function. Alignment the input sped
the
.
Recommendation systems are a key part of
CTC is an
ot the words is Cons The systems help drive customer interaction
the output
sequence consumer website.
signal and usin and sales by helping customers discover products and services
tems and ideas to a users specific way of hinking. Recommend There are a variety of applications for recommendations including
items
widely used on the Web for recommending produ products (e.g., Amazon or similar
systems are movies (e.g. Netflix), consumer
[...] object storage, or standard SQL database, NoSQL [...]
3. Analyzing : The recommender system finds items with similar user engagement data after analysis.
4. Filtering : This is the last step, where data gets filtered to access the relevant information required to provide recommendations to the user.
[...] available data to fully implement. The main benefit is that it doesn't need history of user ratings. [...] this, user will need [...]
5. Hybrid recommender systems combine various inputs and different recommendation strategies to take advantage of the synergy among them.
4. Old customers can have a glut of information.
5. Customer data is volatile.
Q.20 Explain various types of recommender system ?
Ans.: In general, there are three types of recommender system :
1. Collaborative recommender system is a system that produces its result based on past ratings of users with similar preferences.
2. Content based recommender system is a system that produces its result based on the similarity of the content of the documents or items.
[...]
1. Many commercial recommender systems are based on large datasets. As a result, the user-item matrix used for collaborative filtering could be extremely large and sparse, which brings about challenges in the performance of the recommendation.
2. As the numbers of users and items grow, traditional CF algorithms will suffer serious scalability problems.
3. Gray sheep refers to the users whose opinions do not consistently agree or disagree with any group of people and thus do not benefit from collaborative filtering.
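The idea of producing a result from past ratings of users with similar preferences can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular system: the toy `ratings` data, the function names and the choice of cosine similarity over co-rated items are all hypothetical. Missing entries in each user's dictionary stand for the empty cells of a sparse user-item matrix.

```python
import math

# Hypothetical toy user-item ratings; an item missing from a user's
# dictionary means "not rated" (an empty cell of the sparse matrix).
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 3, "item3": 5, "item4": 5},
    "carol": {"item1": 1, "item2": 5, "item4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    # None signals that no other user has rated this item.
    return num / den if den else None

prediction = predict_rating("alice", "item4", ratings)
print(prediction)
```

In this sketch the prediction for alice leans towards bob's rating, because bob's past ratings are closer to hers than carol's. The loop over all users also hints at the scalability problem noted above: the work grows with the number of users and items.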
Q.23 Draw and explain architecture of content based recommender systems.
Ans. : Content based recommenders refer to such approaches that provide recommendations by comparing representations of content describing an item to representations of content that interests the user. These approaches are sometimes also referred to as content based filtering. [...] information sources of documents and products.
. Content based recommendation systems try to recommend items similar to those a given user has liked in the past. User preference is also represented by the same [...]
. The simplest approach to content based recommendation is to compute the similarity of the user profile with each item.
. Fig. Q.23.1 shows high level architecture of content based recommender systems. (See Fig. Q.23.1 on next page.)
1. Content Analyzer
Extracts the features (keywords, n-grams) from the source [...]
[Fig. Q.23.1: architecture of content based recommender systems; legible labels: Feedback, Content analyzer, Profile learner, User profile, training examples, Active user]
. Uses different strategies to [...]
. Users have no detailed knowledge of collection makeup and the [...] content-based queries to obtain the results of their interest.
Q.24 Explain advantages and disadvantages of content based filtering.
Ans.: Advantages
1. User Independence : Recommends only the items that interest [...]
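The simplest content based approach described in the answer to Q.23, computing the similarity of the user profile with each item, can be sketched as below. This is only an illustration under assumptions: the item descriptions, the bag-of-words `keyword_vector` helper (a stand-in for the Content Analyzer) and the use of cosine similarity are hypothetical choices, not taken from the text.

```python
import math
from collections import Counter

def keyword_vector(text):
    """Bag-of-words term counts; a stand-in for the Content Analyzer step."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical item descriptions.
items = {
    "doc_a": "deep learning for speech recognition",
    "doc_b": "recipes for quick vegetarian dinners",
    "doc_c": "recurrent networks for speech and language",
}

# Hypothetical descriptions of items the user liked in the past.
liked = [
    "introduction to deep learning",
    "speech recognition with recurrent networks",
]

# User profile: the sum of the keyword vectors of the liked items.
profile = Counter()
for text in liked:
    profile.update(keyword_vector(text))

# Rank the items by similarity of their content to the user profile.
ranked = sorted(
    items,
    key=lambda name: cosine(profile, keyword_vector(items[name])),
    reverse=True,
)
print(ranked)
```

Only this one user's history is used here, which is the sense of the user independence advantage. An item that shares no keywords with the profile scores zero however good it is.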
[...] different items [...] user does [...] cannot be expanded [...] try a different type of product.
4. Cold Start Problem : for a new user, [...] new system [...] systems don't [...] historical user information to recommend items.
Q.26 What is natural language processing ?
[...] Business [...] many examples of NLP that people [...] every day are spell check, [...] taken for granted.
[Fig. Q.28.1 labels: input, Convolution layer, Max-pool over time, Softmax classification]
Fig. Q.28.1 CNN based framework for NLP
The steps to perform sentence modeling with CNN are as follows :
1. Sentences are tokenized into words. Then it is further transformed into word embedding matrix of dimension 'd [...]
Q.29 Explain recurrent neural network based framework for NLP.
Ans. : RNN are effective for sequential data processing. In RNN [...] sequence from previous computed results. Recurrent unit is sequentially fed with the sequences represented by fixed size vector of tokens. RNN based framework is shown in Fig. Q.29.1.
[Fig. Q.29.1: RNN based framework; legible labels: W, Unfold]
[...] inputs of arbitrary length with RNN and proper composition of [...] can be created.
. Mainly RNNs are used in different NLP tasks like :
1. Natural language generation (e.g. image captioning, machine translation, visual question answering)
2. Word - [...]
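The description of a recurrent unit fed sequentially with fixed size token vectors, handling inputs of arbitrary length, can be sketched as follows. Everything named here is an assumption made for illustration: the vocabulary, the dimensions, the random untrained weights and the tanh recurrence are hypothetical and are not taken from the text.

```python
import math
import random

random.seed(0)
EMBED_DIM, HIDDEN_DIM = 4, 3  # hypothetical sizes

def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

# Hypothetical vocabulary with one fixed-size vector per token.
vocab = ["deep", "learning", "is", "fun"]
embedding = {tok: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for tok in vocab}

W_xh = rand_matrix(HIDDEN_DIM, EMBED_DIM)   # input-to-hidden weights
W_hh = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)  # hidden-to-hidden weights

def encode(tokens):
    """Feed token vectors one at a time; return the final hidden state."""
    h = [0.0] * HIDDEN_DIM
    for tok in tokens:
        x = embedding[tok]
        pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, h))]
        h = [math.tanh(p) for p in pre]
    return h

short = encode(["deep", "learning"])
longer = encode(["deep", "learning", "is", "fun"])
print(len(short), len(longer))
```

Both sequences end up as a hidden state of the same size even though their lengths differ. The same weights are reused at every step, which is what the unfolded view of the network depicts.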
Solved Model Question Paper [End Sem]
Deep Learning
B.E. (IT) Semester VII (As Per 2019 Pattern)
Time : 2½ Hours] [Maximum Marks : 70
N.B. :
i) Attempt Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, Q.7 or Q.8.
ii) Neat diagrams must be drawn wherever necessary.
iii) Figures to the right side indicate full marks.
iv) Assume suitable data, if necessary.
Q.1 a) What is a directed graphical model in RNN ? (Refer Q.6 of Chapter - 3) [6]
b) Explain types of recurrent neural networks. (Refer Q.10 of Chapter - 3) [6]
c) Draw and explain encoder decoder architectures. (Refer Q.16 of Chapter - 3) [6]
OR
Q.2 a) Explain unfolding computational graphs. (Refer Q.2 of Chapter - 3) [6]
b) Explain memoryless models for sequences. (Refer Q.12 of Chapter - 3) [6]
c) Write short note on recursive neural networks. (Refer Q.17 of Chapter - 3) [6]
Q.3 a) Explain Denoising Autoencoders. (Refer Q.11 of Chapter - 4) [6]
b) Write short note on Stochastic Encoders and Decoders. (Refer Q.8 of Chapter - 4) [6]
c) Explain any two applications of Autoencoders. (Refer Q.14 of Chapter - 4)
OR
Q.4 a) Explain Sparse Autoencoder with its advantages and disadvantages. (Refer Q.6 of Chapter - 4)
b) Write short note on Contractive Autoencoders. (Refer Q.12 of Chapter - 4) [5]
c) What is autoencoder ? Explain properties of Autoencoder. (Refer Q.1 and Q.2 of Chapter - 4) [7]
Q.5 a) Draw and explain DenseNet Architecture. (Refer Q.12 of Chapter - 5) [6]
b) Write short note on Dense Block. (Refer Q.11 of Chapter - 5) [6]
c) Write and explain an algorithm for Greedy Layer-Wise Unsupervised Pretraining. (Refer Q.3 of Chapter - 5) [4]
OR
Q.6 a) What is distributed representation ? Explain symbolic representation. What is Nondistributed representations ? Explain example of it. (Refer Q.9 and Q.10 of Chapter - 5) [8]
b) What is transfer learning ? Explain its types. (Refer Q.8 of Chapter - 5) [6]
c) When and why does unsupervised pretraining ideas work ? (Refer Q.4 of Chapter - 5) [4]
Q.7 a) Explain supervised and unsupervised classification. (Refer Q.3 of Chapter - 6) [6]
b) How is deep learning applied to computer vision tasks ? (Refer Q.1 of Chapter - 6) [6]
c) How to formulate Automatic Speech Recognition (ASR) ? (Refer Q.16 of Chapter - 6)
OR
Q.8 a) What is recommender systems ? Explain in detail. (Refer Q.18 of Chapter - 6) [6]
b) What is social network analysis ? Explain. (Refer Q.11 of Chapter - 6) [8]
c) List the application areas of image classification. (Refer Q.5 of Chapter - 6) [3]
END...
A Guide for Engineering Students