Lecture 1 Kaldi

Kaldi Lecture of Dan Povey

Speech Recognition

- a practical guide

In this lecture:
Overview of the course
Getting started
Speech feature extraction

Overview of the course

[Figure: three boxes labeled Unknown Unknowns, Known Unknowns, and Known Knowns, shown three times with Speech Recognition placed in a different box each time. Now: Unknown Unknowns. After these lectures: Known Unknowns. After further study: Known Knowns.]

Structure of this lecture series

A series of 45-minute lectures

Each one will combine:

Some of the theory of speech recognition

Practical examples with the Kaldi toolkit
Note: various toolkits exist (search term: speech recognition toolkit).
I believe Kaldi is the best one... but I wrote much of it.
Note: Kaldi was released ~1 year ago.

Prerequisites
It will be helpful if you have encountered:
Statistical models
UNIX shell scripts
C++
If a section requires background knowledge
of some kind, we will suggest search terms,
e.g.: bash scripting

What this course is about

[Figure: diagram relating Speech Processing to Natural Language Processing, Machine Learning, and Signal Processing.]

What this course is about

[Figure: diagram of topics around Speech Processing: Language Modeling, Automatic Speech Recognition (ASR), Dialog Systems/UI, Speech signal processing, Speaker Recognition, Text to Speech.]

What is Speech Recognition?

[Figure: speech recognition converts a waveform into text, e.g. "She asked for ..."]

How we do it
Given training data from the target
language, we'll train a statistical model of
speech (search term: statistical model).
This model will assign probabilities to (some
sentence) producing (some waveform).
Given a waveform, we can work out the
most likely sentence.
This won't be guaranteed accurate.

Data resources required

A labeled corpus
i.e. a collection of recordings of speech
a record of what was spoken for each one
A pronouncing dictionary, a.k.a. lexicon
Says, for each word, what the sequence of
phonemes (speech sounds) is.
Not necessary in phonetically written languages
Possibly some extra text to train language model

Finding speech data

A lot of speech resources are available from
the Linguistic Data Consortium (LDC)
Also Appen, ELRA
None of this is free. Typically one to
several thousand dollars for LDC databases
Not a download. It's FedEx.
Some lexicons available for free (e.g. CMUDict)
A limited amount of free speech data is
available (search term: gutenberg audio).

Other Resources
To do large-scale speech training (on hundreds of
hours of data), would also need:
A cluster of machines (at least 20 or so cores in
total, preferably more), running e.g. GridEngine
A few hundred gigabytes of space on a fast disk
(e.g. NFS mounted)
Fast local network

What you will be able to do

If you listen to and understand this lecture
series, you should be able to:
build and (somewhat) understand a command-line speech recognition system
You will not be able to:
build a dialog system or speech user interface
get perfect accuracy (50-95% is normal
range, except for yes/no/digit type dialogs)

How to follow these lectures

I will be describing how to run the Kaldi
software
Better to watch or attend the lecture without
taking notes
Slides and video will be made available (follow
links from [Link])
For running the examples, do it after the
lecture (get the commands from the slides)

Getting started

What you need

Some kind of UNIX-based system (Linux, Mac,
cygwin should all work).
Plenty of memory (e.g. 5G), disk space (e.g.
20G).
Fast Web connection, or LDC data on your
system.
You may need to install some packages
e.g. subversion (svn), wget, g++
System-dependent: figure it out yourself or
ask your sysadmin.

Installing Kaldi
$ ## see instructions at [Link]
$ ## first cd to somewhere with a lot of space.
$ svn co [Link] kaldi-trunk
$ cd kaldi-trunk/tools
$ ./[Link] ## Installs some stuff Kaldi depends on... takes a while
$ cd ../src
$ ./configure
$ make -j 8
$ ## -j 8 makes with 8 jobs in parallel; should not
$ ## exceed number of cores on your machine.

If that worked, congratulations.

Otherwise, try to figure out what went wrong.
Look carefully at the output of steps that
failed.

How to get help

If any step in this course doesn't run...
Check for obvious stuff like programs that
are invoked but not installed.
Ask at kaldi-developers@[Link]
Please, no non-Kaldi questions, e.g. how do I
change directories, how do I install awk.
If you fix something, contact us.

What we installed (1)

$ cd ~/kaldi-trunk # assuming it was in your homedir
$ ls
COPYING INSTALL [Link] egs misc src tools windows
$ # Note: tools/, src/ and egs/ are most important.
$ ls tools/
ATLAS  [Link]  [Link]  CLAPACK_include  irstlm  INSTALL  [Link]
sctk-2.4.0  openfst  [Link].bz2  [Link]  openfst-1.2.10
sph2pipe_v2.5  install_atlas.sh  [Link]  sph2pipe_v2.[Link]

Various tools Kaldi depends on.

OpenFst: Weighted Finite State Transducer library
ATLAS/CLAPACK: standard linear algebra libraries
scoring, audio format conversion tools....

What we installed (2)

$ cd ~/kaldi-trunk # assuming it was in your homedir
$ cd src
$ ls
Doxyfile  configure  fstext  lat           nnet_cpu      tied
INSTALL   decoder    gmm     latbin        nnetbin       tiedbin
Makefile  doc        gmmbin  lm            nnetbin_cpu   transform
NOTES     feat       hmm     machine-type  optimization  tree
TODO      featbin    itf     makefiles     rnn           util
base      fgmmbin    [Link]  matrix        sgmm
bin       fstbin     [Link]  nnet          sgmmbin

Mostly directories containing code.

Those ending in bin/ contain Kaldi programs
There are a large number of programs, each
with a fairly simple function.

Running the examples

$ cd ~/kaldi-trunk # assuming it was in your homedir
$ cd egs
$ ls
[Link] gp rm swbd timit wsj
$ cd rm
$ ls
[Link] s1 s2 s3 s4
$ cd s3 # The s3 example scripts are the most normal ones.
$ ls
RESULTS conf data exp local [Link] [Link] scripts steps

There are example scripts for various data-sets.

We'll use Resource Management (smallest one).
Very easy task: clean, planned speech, small
vocabulary. (Spoken commands to computer).

Finding the data

$ cd ~/kaldi-trunk/egs/rm
$ cat [Link]
About the Resource Management corpus:
Clean speech in a medium-vocabulary task consisting
of commands to a (presumably imaginary) computer system. About 3
hours of training data.
Available from the LDC as catalog number LDC93S3A (it may be
possible to get the same data using combinations of other catalog
numbers, but this is the one we used).

See if you have this data on your system

It's $1000 from LDC if non-member.
Look for directory containing subdirs:
rm1_audio1 rm1_audio2 rm2_audio

If you don't have the data

If your institution is not an LDC member and
doesn't want to pay for the data:
you can use the scripts in rm/s4
Uses precomputed features derived from a
subset of the RM data
Will be downloaded from the Internet.

Thanks to Vassil Panayotov for contributing this recipe.

Looking at the data

$ find /export/corpora5/LDC/LDC93S3A/rm_comp | head
/export/corpora5/LDC/LDC93S3A/rm_comp
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2/rm2
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2/rm2/ex_train
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2/rm2/ex_train/lpn0_7
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2/rm2/ex_train/lpn0_7/[Link]
/export/corpora5/LDC/LDC93S3A/rm_comp/rm2_audio/3-2.2/rm2/ex_train/lpn0_7/[Link]

$ less /export/corpora5/LDC/LDC93S3A/rm_comp/rm1_audio1/rm1/doc/al_sents.txt
; al_sents.txt - updated 09/20/89
<snip>
What is the constellation's gross displacement in long tons? (SR001)
Is Ranger's earliest CASREP rated worse than hers? (SR002)
Show me all alerts. (SR003)
Give Bainbridge's CASREPs from the last 7 months. (SR004)
Show the Enterprise's home port. (SR005)
Draw Texas's last 3 H.F.D.F. sensor posits. (SR006)

Note: .wav files are not really .wav, they are .sph (search term: sphere format)
Use tools/sph2pipe_v2.5/sph2pipe to convert

The word-pair grammar

$ less /export/corpora5/LDC/LDC93S3A/rm_comp/rm1_audio1/rm1/doc/wp_gram.txt

/*
****************************************************************************
*
COPYRIGHT 1987. BBN LABORATORIES, INCORPORATED
*
*
ALL RIGHTS RESERVED
***************************************************************************
* File: patts_snor_word_pair.text
*
* This file contains a specification for the 'word-pair' grammar developed
* at BBN.
* The grammar allows all two word sequences (bigrams) possible in the DARPA
* continuous speech resource management database as defined by the sentence
* pattern grammar.
...

The RM database comes with a word-pair grammar

For the other Kaldi examples, we use statistical
language models (search term: n-gram model).

Bayes rule and ASR

$$P(S \mid \text{audio}) = \frac{p(\text{audio} \mid S)\, P(S)}{p(\text{audio})}$$

Note: p() = likelihood, P() = probability.

Here, S is the sequence of words, and P(S) is the language
model, e.g. an n-gram model or a probabilistic grammar.
p(audio | S) is a sentence-dependent statistical model
of audio production, trained from data.
Given a test utterance, we pick S to maximize
P(S | audio), i.e. the most likely sentence.
Note: p(audio) is a normalizer that doesn't matter: it is the same
for every S, so it does not change which S is the maximum.
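The decision rule can be sketched in a few lines of C++. This is only an illustration, assuming we already have a short list of candidate sentences with log-domain scores from the two models; the struct, the function name and the scores are hypothetical and are not part of Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One candidate word sequence S with (hypothetical) log-domain model scores.
struct Hypothesis {
  std::string words;           // the candidate sentence S
  double log_p_audio_given_s;  // log p(audio | S), from the acoustic model
  double log_p_s;              // log P(S), from the language model
};

// Returns the index of the hypothesis maximizing
// log p(audio | S) + log P(S). The normalizer p(audio) is the same for
// every S, so it is left out. Returns hyps.size() if the list is empty.
std::size_t MostLikelySentence(const std::vector<Hypothesis> &hyps) {
  std::size_t best = hyps.size();
  double best_score = 0.0;
  for (std::size_t i = 0; i < hyps.size(); ++i) {
    double score = hyps[i].log_p_audio_given_s + hyps[i].log_p_s;
    if (best == hyps.size() || score > best_score) {
      best = i;
      best_score = score;
    }
  }
  return best;
}
```

Working with logs turns the product p(audio | S) P(S) into a sum, which avoids numerical underflow on long utterances.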

Preparing the data

$ cd ~/kaldi-trunk/egs/rm/s3
$ ## we're running the steps from [Link] ##
$ local/rm_data_prep.sh /export/corpora5/LDC/LDC93S3A/rm_comp
$ local/rm_format_data.sh
$ ls data
lang  lang_test  local  test_feb89  test_feb91  test_mar87
test_oct87  test_oct89  test_sep92  train

Putting data in form that Kaldi scripts understand.

data/lang contains language-specific stuff (also see
data/lang_test which contains the grammar too).
data/train contains training data (data/test_feb89
etc. have same format)

Language-specific stuff
$ head -5 data/lang/[Link]
<eps> 0
aa 1
ae 2
ah 3
ao 4
aw 5
$ head -2 data/lang/[Link]
$ head -4 data/lang/[Link]
<eps> 0
A 1
A42128 2
AAW 3
$ cat data/lang/[Link]
48
$ ## Note: just one silence phone in this setup.

*.txt are symbol tables in OpenFst format

Map between strings and ints; Kaldi code uses ints.

The lexicon
$ fstprint --isymbols=data/lang/[Link] --osymbols=data/lang/[Link] \
    data/lang/[Link] | head
0  1   <eps>  <eps>     0.693147182
0  1   sil    <eps>     0.693147182
1  1   ax     A         0.693147182
1  2   ax     A         0.693147182
1  3   ey     A42128
1  15  ey     AAW
1  21  ae     ABERDEEN
1  26  ax     ABOARD
1  30  ax     ABOVE

The lexicon (pronouncing dictionary) is in binary OpenFst format.
Can view it as text using the command above.

Weighted Finite State Transducers (WFSTs)

Various resources for learning WFSTs, OpenFst
Informal intro by me to WFSTs (read slides first)
[Link]

More formal one, search for

[Link]

Paul Dixon tutorial:

apsipa_09_tutorial_dixon_furui.pdf

For OpenFst resources/tutorial: [Link]

Next slides: very quick intro.

WFST quick intro: FSAs

Finite State acceptor (FSA) is a finite representation
of a possibly infinite set of strings.
Has a finite #states. One is initial state.
States can be labeled final.
Arcs between states have symbols on them (or
special symbol epsilon meaning no symbol)
String == symbol-sequence.
String accepted if there's a path with that
symbol-sequence on, from initial->final state.
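A minimal sketch of string acceptance, assuming an acceptor without epsilon arcs (handling epsilon would need an extra closure step). The types and names here are illustrative only and are not OpenFst's API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// An arc between two states, carrying one symbol.
struct Arc {
  int from;
  std::string symbol;
  int to;
};

// A finite state acceptor: one initial state, any number of final states.
struct Acceptor {
  int initial_state;
  std::set<int> final_states;
  std::vector<Arc> arcs;
};

// Returns true if some path from the initial state to a final state
// carries exactly the given symbol sequence.
bool Accepts(const Acceptor &fsa, const std::vector<std::string> &input) {
  std::set<int> current = {fsa.initial_state};
  for (const std::string &symbol : input) {
    std::set<int> next;  // states reachable after consuming this symbol
    for (const Arc &arc : fsa.arcs)
      if (arc.symbol == symbol && current.count(arc.from) > 0)
        next.insert(arc.to);
    current.swap(next);
    if (current.empty()) return false;  // no path matches this prefix
  }
  for (int state : current)
    if (fsa.final_states.count(state) > 0) return true;
  return false;
}
```

With arcs 0 -SHOW-> 1, 1 -ME-> 1 and 1 -ALERTS-> 2, and state 2 final, this accepts SHOW ALERTS, SHOW ME ALERTS, SHOW ME ME ALERTS and so on: a finite object representing an infinite set of strings.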

WFST quick intro: WFSAs

WFSA is like FSA but adding costs to the
transitions and final-states.
String accepted with weight determined by
minimum-cost path from initial->final.
The notion of cost can be generalized.
We call them weights. Operations + and *,
satisfying axioms of a semiring
A weight is multiplied along paths, added
across paths.
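For the minimum-cost case, the semiring is the tropical semiring: "multiplying" weights means adding costs, and "adding" weights means taking the minimum. A small sketch, with hypothetical function names that are not OpenFst's API:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Tropical semiring over costs: "multiply" is +, "add" is min.
double Times(double a, double b) { return a + b; }
double Plus(double a, double b) { return std::min(a, b); }

// Identity elements: One() for Times, Zero() for Plus.
double One() { return 0.0; }
double Zero() { return std::numeric_limits<double>::infinity(); }

// Weight of one path: arc costs and the final cost multiplied together.
double PathWeight(const std::vector<double> &arc_costs, double final_cost) {
  double w = One();
  for (double c : arc_costs) w = Times(w, c);
  return Times(w, final_cost);
}

// Weight of a string: the weights of all its accepting paths added together.
double StringWeight(const std::vector<double> &path_weights) {
  double w = Zero();
  for (double p : path_weights) w = Plus(w, p);
  return w;
}
```

A string with no accepting path gets weight Zero(), i.e. infinite cost, which is how "not accepted" is expressed in this semiring.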

WFST quick intro: FSTs

Finite State Transducer (FST) is (from the point of
view of its name) an object that
transduces (converts) one string into another.
Like FSA but two symbols on each arc: input and
output.
Mathematically, represents a set of pairs of
strings: (input-string, output-string).
The name "transducer" is a bit misleading.
Notion of composition (like function composition)

WFST quick intro: WFSTs

WFST combines the two-symbol idea of FSTs
with the weighting idea of WFSAs.
Keywords:
Determinization, minimization, composition
equivalent, epsilon-free, functional
on-demand algorithm
weight-pushing, epsilon removal
You might want to find out what these mean.

Data directory format

$ ls data/train

## note: it would look like this after the next step.

spk2gender spk2utt text utt2spk [Link]

$ head -2 data/train/[Link]
trn_adg04_sr009 sph2pipe -f wav /foo/rm1_audio1/rm1/ind_trn/adg0_4/[Link] |
trn_adg04_sr049 sph2pipe -f wav /foo/rm1_audio1/rm1/ind_trn/adg0_4/[Link] |
$ head -2 data/train/text
trn_adg04_sr009 SHOW THE GRIDLEY+S TRACK IN BRIGHT ORANGE
trn_adg04_sr049 IS DIXON+S LENGTH GREATER THAN THAT OF RANGER
$ head -2 data/train/utt2spk
trn_adg04_sr009 adg0
trn_adg04_sr049 adg0

Most of these files map from utterance-id to (something).
Kaldi Table concept: collection of objects indexed by
a string.
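As an illustration of these mappings: spk2utt is the inverse of utt2spk, listing the utterances of each speaker. The sketch below inverts the mapping in memory. It is not how the Kaldi scripts do it, and the type and function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// (utterance-id, speaker-id) pairs, as in the lines of utt2spk.
typedef std::vector<std::pair<std::string, std::string> > Utt2Spk;
// speaker-id -> list of utterance-ids, as in the lines of spk2utt.
typedef std::map<std::string, std::vector<std::string> > Spk2Utt;

// Inverts utt2spk, keeping each speaker's utterances in input order.
Spk2Utt InvertUtt2Spk(const Utt2Spk &utt2spk) {
  Spk2Utt spk2utt;
  for (const auto &entry : utt2spk)
    spk2utt[entry.second].push_back(entry.first);
  return spk2utt;
}
```

For the two utt2spk lines shown above, the result has a single entry: adg0 mapped to trn_adg04_sr009 and trn_adg04_sr049.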

The Table concept

A Table is a collection of objects indexed by a string
(string must be nonempty, space-free).
E.g. a collection of matrices indexed by utterance-id, representing features.
Templates in C++: e.g. vector<int> is a vector of
integers. Mechanism for generic code.
The basic concept is: Table<Object>, e.g. Table<int>,
Table<Matrix<float> >
Handles access to objects on disk (or pipes, etc.)

Tables: form on disk

Two ways objects are stored on disk:
scp (script) mechanism: .scp file specifies mapping
from key (the string) to filename or pipe:
$ head -2 data/train/[Link]
trn_adg04_sr009 sph2pipe -f wav /foo/rm1_audio1/rm1/ind_trn/adg0_4/[Link] |
trn_adg04_sr049 sph2pipe -f wav /foo/rm1_audio1/rm1/ind_trn/adg0_4/[Link] |

ark (archive) mechanism: data is all in one file,
with utterance ids (example below is in text mode):
$ head -2 data/train/text
trn_adg04_sr009 SHOW THE GRIDLEY+S TRACK IN BRIGHT ORANGE
trn_adg04_sr049 IS DIXON+S LENGTH GREATER THAN THAT OF RANGER
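The .scp line format is simple enough to parse by hand. A rough sketch, assuming the key is the first whitespace-delimited token, the rest of the line is the filename or command, and a trailing '|' marks a command whose output is read through a pipe. The names are illustrative; this is not Kaldi's own parsing code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// One parsed line of a .scp file.
struct ScpEntry {
  std::string key;     // the Table key, e.g. an utterance-id
  std::string target;  // a filename, or a command ending in '|'
  bool is_pipe;        // true if target is a command to read from
};

// Splits a line at the first run of spaces/tabs into key and target.
// Returns false if either part is missing.
bool ParseScpLine(const std::string &line, ScpEntry *entry) {
  const std::string ws = " \t";
  std::size_t key_begin = line.find_first_not_of(ws);
  if (key_begin == std::string::npos) return false;
  std::size_t key_end = line.find_first_of(ws, key_begin);
  if (key_end == std::string::npos) return false;
  std::size_t target_begin = line.find_first_not_of(ws, key_end);
  if (target_begin == std::string::npos) return false;
  std::size_t target_end = line.find_last_not_of(ws);
  entry->key = line.substr(key_begin, key_end - key_begin);
  entry->target = line.substr(target_begin, target_end - target_begin + 1);
  entry->is_pipe = entry->target[entry->target.size() - 1] == '|';
  return true;
}
```

Because only the first token is the key, the target may itself contain spaces, as in the sph2pipe commands above.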

Specifying Tables on command line

Strings passed from command line say how to read
or write Tables.
Note: the type of object expected, and whether to
read or write, is determined by the program itself.
A string interpreted as specifying how to write a
Table, we call a wspecifier in code, etc.
A string that specifies how to read a Table is
called an rspecifier.

Examples of writing Tables

wspecifier                 meaning
ark:[Link]                 Write to archive [Link]
scp:[Link]                 Write to files using mapping in [Link]
ark:-                      Write archive to stdout
ark,t:|gzip -c >[Link]     Write text-form archive to [Link]
ark,t:-                    Write text-form archive to stdout
ark,scp:[Link],[Link]      Write archive and scp file (see below)

The last one is a special case: write archive, and .scp file
specifying offsets into that archive (for efficient
random access). Here, .scp file is like an index.

Examples of reading Tables

rspecifier                 meaning
ark:[Link]                 Read from archive [Link]
scp:[Link]                 Read as specified in [Link]
ark:-                      Read archive from stdin
ark:gunzip -c [Link]|      Read archive from [Link]
ark,s,cs:-                 Read archive (sorted) from stdin...

In the last one, s asserts archive is sorted, cs asserts
it will be called in sorted order.
Allows memory-efficient random access on archive.
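The specifier strings above all have the shape "options:target", with comma-separated options before the first colon. A sketch of splitting one apart, under that assumption only; it does not validate the options, and the names are hypothetical rather than taken from the Kaldi source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// An rspecifier or wspecifier split into its two parts.
struct Specifier {
  std::vector<std::string> options;  // e.g. {"ark", "t"}
  std::string target;                // e.g. "-" or "|gzip -c >out.gz"
};

// Splits "opt1,opt2:target" at the first ':'. Commas after that colon
// belong to the target. Returns false if there is no ':'.
bool SplitSpecifier(const std::string &spec, Specifier *out) {
  std::size_t colon = spec.find(':');
  if (colon == std::string::npos) return false;
  out->options.clear();
  std::size_t begin = 0;
  while (begin <= colon) {
    std::size_t comma = spec.find(',', begin);
    if (comma == std::string::npos || comma > colon) comma = colon;
    out->options.push_back(spec.substr(begin, comma - begin));
    begin = comma + 1;
  }
  out->target = spec.substr(colon + 1);
  return true;
}
```

Splitting at the first colon only is what lets a target such as a shell command contain further colons or commas.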

C++ level Table code

Note: there is actually no Table<Object> class.
There are three: SequentialTableReader,
RandomAccessTableReader, and TableWriter.
SequentialTableReader<Matrix<float> > mat1_reader(rspecifier1);
RandomAccessTableReader<Matrix<float> > mat2_reader(rspecifier2);
TableWriter<Matrix<float> > mat_writer(wspecifier);
// For each matrix in the first table, look up the matrix with the same
// key in the second table and write out their product.
for (; !mat1_reader.Done(); mat1_reader.Next()) {
  const Matrix<float> mat1(mat1_reader.Value());
  std::string key = mat1_reader.Key();
  if (mat2_reader.HasKey(key)) {
    // Random-access lookup needs the key.
    Matrix<float> mat2(mat2_reader.Value(key));
    Matrix<float> prod([Link](), [Link]());
    [Link](1.0, mat1, kNoTrans, mat2, kNoTrans);
    mat_writer.Write(key, prod);
  }
}

Shell level Table example

This fake example imagines the code on the
previous slide was in a program called multiply-matrices.
In reality, Kaldi programs are a little higher level
than this (although there is a program transform-feats that does this as a special case).
$ multiply-matrices scp:[Link] \
    "ark:gunzip -c [Link]|" \
    "ark,t:|gzip -c >transformed_feats.gz"

Feature processing

Speech audio processing

The most useful information in speech is in the frequency
domain
e.g. position of peaks in amplitude called
formants that vary between vowels
We use short-time Fourier spectrum
Further process this to reduce dimension and
make it more Gaussian distributed (search term: gaussian distribution).

Audio processing (simple version)

Input is 16kHz sampled audio.
Take a 25ms window (shift by 10ms each time; we
will output a sequence of vectors, one every 10ms)
Multiply by windowing function e.g. Hamming (search term: Hamming window)
Do Fourier transform (search term: FFT)
Take log energy in each frequency bin
Do discrete cosine transform (DCT): (gives us the
cepstrum) (search term: cepstrum)
Keep the first 13 coefficients of the cepstrum.
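The framing arithmetic can be sketched as follows. At 16kHz, a 25ms window is 400 samples and a 10ms shift is 160 samples. The sketch assumes every frame must fit entirely inside the signal and uses the standard Hamming formula; the function names are illustrative and are not Kaldi's API.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Number of complete frames in a signal of num_samples samples
// (0 if the signal is shorter than one window).
int CountFrames(int num_samples, int window_size, int window_shift) {
  if (num_samples < window_size) return 0;
  return 1 + (num_samples - window_size) / window_shift;
}

// Hamming window: w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)).
// Assumes size >= 2.
std::vector<double> HammingWindow(int size) {
  const double kTwoPi = 6.283185307179586;
  std::vector<double> w(size);
  for (int n = 0; n < size; ++n)
    w[n] = 0.54 - 0.46 * std::cos(kTwoPi * n / (size - 1));
  return w;
}
```

For example, CountFrames(16000, 400, 160) gives 98 frames for one second of audio, slightly fewer than 100 because the last windows would run past the end of the signal.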

Audio processing (details)

Pre-scale the frequency axis with mel (perceptual)
scale before doing DCT (search term: mel scale)
Don't take DCT of individual frequency components:
average energy over triangular bins, equally
spaced in mel scale
Pre-emphasize signal (do s'(t) = s(t) - 0.97 s(t-1)) ...
reduces aliasing artifacts w/ Hamming (?)
Add a little noise to signal: dithering --> no log(0)
Result is MFCC (Mel Frequency Cepstral Coeffs.)
Kaldi also supports PLP (perceptual linear
prediction) -- usually a bit better.
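Two of these steps are easy to sketch. The mel mapping below uses one commonly used definition, mel(f) = 1127 ln(1 + f/700); the slides do not say which variant is meant, so treat it as an assumption. The pre-emphasis leaves the first sample unchanged, which is also an assumption. All names are illustrative, not Kaldi's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hz <-> mel, using mel(f) = 1127 ln(1 + f / 700).
double HzToMel(double hz) { return 1127.0 * std::log(1.0 + hz / 700.0); }
double MelToHz(double mel) { return 700.0 * (std::exp(mel / 1127.0) - 1.0); }

// Center frequencies (in Hz) of num_bins triangular bins whose centers
// are equally spaced on the mel scale between low_hz and high_hz.
std::vector<double> MelBinCenters(int num_bins, double low_hz,
                                  double high_hz) {
  double low_mel = HzToMel(low_hz), high_mel = HzToMel(high_hz);
  double step = (high_mel - low_mel) / (num_bins + 1);
  std::vector<double> centers(num_bins);
  for (int i = 0; i < num_bins; ++i)
    centers[i] = MelToHz(low_mel + (i + 1) * step);
  return centers;
}

// In-place pre-emphasis: s'(t) = s(t) - coeff * s(t-1) for t >= 1.
// Runs backwards so each s(t-1) is still the original value when used.
void PreEmphasize(std::vector<double> *signal, double coeff) {
  for (std::size_t t = signal->size(); t-- > 1;)
    (*signal)[t] -= coeff * (*signal)[t - 1];
}
```

Equal spacing in mel means the bins are narrow at low frequencies and wide at high frequencies when measured in Hz, which is the point of the perceptual scale.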

Audio processing (script)

## assumes your shell is bash. Uses 4 cpus (parameter 4)
featdir=mfcc_feats ## Note: put this somewhere with disk space
for x in train test_mar87 test_oct87 test_feb89 test_oct89 \
    test_feb91 test_sep92; do
  steps/make_mfcc.sh data/$x exp/make_mfcc/$x $featdir 4
  #steps/make_plp.sh data/$x exp/make_plp/$x $featdir 4
done

For training set and each of the test sets, make the
features with 4 CPUs (on local machine).
Puts features e.g. in data/train/[Link]
$ head data/train/[Link]
trn_adg04_sr009 /home/dpovey/data/kaldi_rm_feats/raw_mfcc_train.[Link]
trn_adg04_sr049 /home/dpovey/data/kaldi_rm_feats/raw_mfcc_train.[Link]
trn_adg04_sr089 /home/dpovey/data/kaldi_rm_feats/raw_mfcc_train.[Link]

Audio processing (script)

Main command run by steps/make_mfcc.sh:
$ head -1 exp/make_mfcc/train/make_mfcc.[Link]
compute-mfcc-feats --verbose=2 --config=conf/[Link] \
scp:exp/make_mfcc/train/[Link] \
ark,scp:/data/mfcc/raw_mfcc_train.[Link],/data/mfcc/raw_mfcc_train.[Link]

First argument scp:... tells it to find filenames
(actually commands) in [dir]/[Link]
Second argument ark,scp:... tells it to write an
archive, and an index into the archive.
Archive contains (num-frames)x13 matrix of
features, for each utterance.

Audio processing (code)

// simplified extract from src/featbin/[Link]
int main(int argc, char *argv[]) {
  // <snip>: parse command line arguments.
  Mfcc mfcc(mfcc_opts);
  SequentialTableReader<WaveHolder> reader(wav_rspecifier);
  BaseFloatMatrixWriter writer(feat_wspecifier);  // note: a typedef.
  for (; ![Link](); [Link]()) {
    string utt = [Link]();
    const WaveData &wave_data = [Link]();
    int32 channel = 0;  // Let's assume mono data for now.
    BaseFloat vtln_warp = 1.0;  // Gloss over VTLN (vocal tract len. norm.)
    SubVector<BaseFloat> waveform(wave_data.Data(), channel);
    Matrix<BaseFloat> features;
    [Link](waveform, vtln_warp, &features, NULL);
    [Link](utt, features);
  }
}

Note on Tables
We said Table types were templated on the type
they store, e.g. TableWriter<Matrix<float> >
This is a simplification: we actually template on a
Holder type that tells the Table code how to read
and write the object.
Necessary because objects don't have uniform read/write
methods. (must work for fundamental types)

Audio processing (code)

// simplified extract from src/feat/[Link]
void Mfcc::Compute(const VectorBase<BaseFloat> &wave,
                   Matrix<BaseFloat> *output) {
  int32 rows_out = NumFrames([Link](), opts_.frame_opts),
        cols_out = opts_.num_ceps;
  output->Resize(rows_out, cols_out);
  Vector<BaseFloat> window;  // windowed waveform.
  Vector<BaseFloat> mel_energies;  // energies for mel bins.
  for (int32 r = 0; r < rows_out; r++) {  // r is frame index..
    ExtractWindow(wave, r, opts_.frame_opts,
                  feature_window_function_, &window);
    srfft_->Compute([Link](), true);  // split-radix FFT
    ComputePowerSpectrum(&window);
    SubVector<BaseFloat> power_spectrum(window, 0, [Link]()/2 + 1);
    mel_banks_.Compute(power_spectrum, &mel_energies);
    mel_energies.ApplyLog();  // take the log.
    SubVector<BaseFloat> this_mfcc(output->Row(r));
    // this_mfcc = dct_matrix_ * mel_energies [which now have log]
    this_mfcc.AddMatVec(1.0, dct_matrix_, kNoTrans, mel_energies, 0.0);
  }
}

End of this lecture
