
Sequence Analysis

Sequence analysis involves using various analytical methods to study DNA, RNA, or peptide sequences to understand their features, functions, and evolutionary relationships. Key methodologies include sequence alignment, gene prediction, and protein structure prediction, utilizing tools like BLAST and ClustalW. The document outlines the importance of comparing sequences to identify similarities, variations, and molecular structures, as well as the methodologies employed in these analyses.

Uploaded by

shubhamchauhan307


Sequence Analysis Dr. P. Saha

Sequence Analysis

Sequence analysis refers to the process of subjecting a DNA, RNA or peptide sequence to
any of a wide range of analytical methods to understand its features, function, structure, or
evolution. Methodologies used include sequence alignment, searches against biological
databases, and others. Since the development of methods for high-throughput production of
gene and protein sequences, the rate at which new sequences are added to the databases has
increased exponentially. Such a collection of sequences does not, by itself, increase the
scientist's understanding of the biology of organisms. However, comparing these new sequences
to those with known functions is a key way of understanding the biology of the organism from
which a new sequence comes. Thus, sequence analysis can be used to assign function to
genes and proteins by studying the similarities between the compared sequences.
Nowadays, there are many tools and techniques that perform sequence comparisons
(sequence alignment) and analyze the resulting alignment to understand its biology.

Sequence analysis in molecular biology includes a very wide range of relevant topics:

1. The comparison of sequences in order to find similarity, often to infer whether they are
related (homologous)
2. Identification of intrinsic features of the sequence such as active sites, post-
translational modification sites, gene structures, reading frames, distributions of
introns and exons, and regulatory elements
3. Identification of sequence differences and variations such as point mutations and
single nucleotide polymorphisms (SNPs), which can serve as genetic markers
4. Revealing the evolution and genetic diversity of sequences and organisms
5. Identification of molecular structure from sequence alone

There are millions of protein and nucleotide sequences known. These sequences fall into
many groups of related sequences known as protein families or gene families. Relationships
between these sequences are usually discovered by aligning them together and assigning this
alignment a score. There are two main types of sequence alignment: pair-wise sequence
alignment compares only two sequences at a time, whereas multiple sequence alignment compares
many sequences at once. Two important algorithms for aligning pairs of sequences are the
Needleman-Wunsch algorithm (global alignment) and the Smith-Waterman algorithm (local
alignment). Popular tools for sequence alignment include:

• Pair-wise alignment - BLAST

• Multiple alignment - ClustalW, PROBCONS, MUSCLE, MAFFT, and T-Coffee.
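The two pairwise algorithms named above share the same dynamic-programming recurrence and differ mainly in initialisation and in whether scores may drop below zero. The following is a minimal sketch, not the implementation used by any of the listed tools. The function name alignment_score, the scoring values (match +1, mismatch -1) and the linear gap penalty of -2 are illustrative assumptions; real tools typically use substitution matrices and affine gap penalties.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal alignment score of strings a and b.

    local=False uses the Needleman-Wunsch (global) recurrence;
    local=True uses the Smith-Waterman (local) recurrence.
    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0  # best cell seen so far (only used for local alignment)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment: a negative prefix is never worth keeping.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(alignment_score("GATTACA", "GATTACA"))                   # global
    print(alignment_score("TTTACGTTT", "GGGACGGGG", local=True))   # local
```

The global variant scores the two sequences end to end, while the local variant reports the best-scoring pair of substrings, which is why local alignment is the usual choice when only a domain or motif is shared.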

A common use for pairwise sequence alignment is to take a sequence of interest and compare
it to all known sequences in a database to identify homologous sequences. In general, the
matches in the database are ordered so that the most closely related sequences come first,
followed by sequences with diminishing similarity. These matches are usually reported with a
measure of statistical significance such as an expectation value (E-value).
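As a rough illustration of how an expectation value relates to an alignment score, the sketch below uses the Karlin-Altschul form E = K * m * n * exp(-lambda * S), where m and n are the query and database lengths. The function names are hypothetical, and lambda and K depend on the scoring system, so they are left as required parameters; the values in the usage lines are placeholders rather than parameters of any real scoring matrix. Length corrections applied by real search tools are ignored.

```python
import math


def expect_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least raw_score.

    Karlin-Altschul form: E = K * m * n * exp(-lambda * S).
    lam and k depend on the scoring system and must be supplied.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam, k):
    """Normalised score S' = (lambda * S - ln K) / ln 2, so that E = m * n * 2**(-S')."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # lam and k below are placeholder values for illustration only.
    for s in (40, 50, 60):
        print(s, expect_value(s, 100, 1_000_000, lam=0.3, k=0.1))
```

A smaller E-value means the match is less likely to have arisen by chance, which is why search results sorted by decreasing similarity are also sorted by increasing E-value.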

Profile comparison

In 1987 Michael Gribskov, Andrew McLachlan and David Eisenberg introduced the method
of profile comparison for identifying distant similarities between proteins. Rather than using
a single sequence, profile methods use a multiple sequence alignment to encode a profile
which contains information about the conservation level of each residue. These profiles can
then be used to search collections of sequences to find sequences that are related. Profiles are
also known as Position Specific Scoring Matrices (PSSMs). In 1993 a probabilistic
interpretation of profiles was introduced by David Haussler and colleagues using hidden
Markov models. These models have become known as profile-HMMs. In recent years
methods have been developed that allow the comparison of profiles directly to each other.
These are known as profile-profile comparison methods.
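A profile of the kind described above can be sketched as a log-odds matrix built from the columns of a multiple sequence alignment. This is a simplified illustration, not the original method of Gribskov and colleagues: the function names, the uniform background frequencies, the pseudocount of 1 and the small DNA example are all assumptions made for the sake of a short example.

```python
import math
from collections import Counter


def build_pssm(alignment, alphabet="ACGT", pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned sequences.

    Assumptions: uniform background frequencies, a fixed pseudocount,
    and gap or unknown characters are simply ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(alphabet)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in alphabet)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        pssm.append({
            residue: math.log2(((counts[residue] + pseudocount) / total) / background)
            for residue in alphabet
        })
    return pssm


def score_window(pssm, window):
    """Sum the per-position scores; characters outside the alphabet score 0."""
    return sum(column.get(residue, 0.0) for column, residue in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the best-scoring window in sequence.

    The sequence must be at least as long as the profile.
    """
    width = len(pssm)
    return max(
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    aligned = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
    profile = build_pssm(aligned)
    print(best_hit(profile, "GGCGTATAATGCGC"))
```

Conserved columns contribute large positive scores for the conserved residue and negative scores for everything else, so a profile can recognise a related sequence even when no single member of the family matches it closely.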

Sequence assembly

Sequence assembly refers to the reconstruction of a DNA sequence by aligning and merging
small DNA fragments. It is an integral part of modern DNA sequencing. Since presently
available DNA sequencing technologies are ill-suited for reading long sequences, large pieces
of DNA (such as genomes) are often sequenced by (1) cutting the DNA into small pieces, (2)
reading the small fragments, and (3) reconstituting the original DNA by merging the
information from the various fragments.
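Step (3) can be illustrated with a greedy overlap-merge strategy: repeatedly join the two fragments with the longest suffix-prefix overlap. This is a toy sketch under strong assumptions (error-free reads, a single strand, no repeats); the function names and the min_len threshold are hypothetical, and production assemblers use considerably more elaborate graph-based methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments greedily by longest overlap; return remaining contigs."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps enough to merge
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
    print(greedy_assemble(reads))
```

Fragments that share no sufficiently long overlap are returned as separate contigs, which mirrors the gaps that remain when sequencing coverage is incomplete.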

Gene prediction

Gene prediction or gene finding refers to the process of identifying the regions of genomic
DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may
also include prediction of other functional elements such as regulatory regions. Gene finding
is one of the first and most important steps in understanding the genome of a species once it
has been sequenced. In general, the prediction of bacterial genes is significantly simpler and
more accurate than the prediction of genes in eukaryotic species, which usually have complex
intron/exon patterns.
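One reason bacterial gene finding is simpler is that, without introns, a candidate protein-coding gene is often just an open reading frame (ORF). The sketch below scans all six reading frames for ATG-to-stop stretches. It is a deliberately naive illustration: the function names and the min_codons cut-off are assumptions, only ATG is treated as a start codon, and real gene finders also model codon usage, ribosome binding sites and other signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=3):
    """Find ATG...stop ORFs in all six reading frames.

    Returns (strand, start, end, sequence) tuples. Coordinates are 0-based,
    end-exclusive, and refer to the strand that was scanned. min_codons
    counts the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, pos + 3, seq[start:pos + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGGGTAACC"):
        print(orf)
```

In eukaryotic genomes this approach breaks down because coding sequence is interrupted by introns, so an ORF scan alone would miss or truncate most genes.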

Protein Structure Prediction

[Figure: Target protein structure (3dsm, shown in ribbons), with Calpha backbones (in gray) of
354 predicted models for it submitted in the CASP8 structure-prediction experiment.]

The 3D structures of molecules are of great importance to their functions in nature. Since
structural prediction of large molecules at an atomic level is a largely intractable problem,
some biologists introduced ways to predict 3D structure at a primary sequence level. This
includes biochemical or statistical analysis of amino acid residues in local regions and
structural inference from homologs (or other potentially related proteins) with known 3D
structures. A large number of diverse approaches have been proposed to solve the structure
prediction problem. To determine which methods are most effective, a structure prediction
competition called CASP (Critical Assessment of Structure Prediction) was founded.

Methodology

The tasks that lie in the space of sequence analysis are often non-trivial to resolve and require
the use of relatively complex approaches. Of the many types of methods used in practice, the
most popular include:

• Dynamic programming
• Artificial Neural Network
• Hidden Markov Model
• Support Vector Machine
• Clustering
• Bayesian Network
• Regression Analysis

Dot-matrix methods
A dot-matrix plot provides a global view of local similarities between two sequences.
Dot-matrix methods are appropriate:

• for comparing large sequences (several thousand residues)

• if one does not know in advance whether two sequences share detectable similarity or
which parts of the sequences are related to each other.

They are useful for:

• detection of repeats within protein sequences

• detection of shared domains between protein sequences
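A dot matrix can be sketched in a few lines: place one sequence along each axis and mark a cell whenever the windows starting at those two positions match. The function names, the window and threshold parameters, and the text rendering are illustrative assumptions; plotting tools normally add filtering and graphical output.

```python
def dot_matrix(a, b, window=1, threshold=None):
    """Return a boolean matrix; cell [i][j] is True when the windows starting
    at a[i] and b[j] share at least `threshold` identical positions.

    With the defaults (window=1, threshold=window) a cell is set for every
    identical pair of residues.
    """
    if threshold is None:
        threshold = window
    rows = len(a) - window + 1
    cols = len(b) - window + 1
    return [
        [sum(a[i + k] == b[j + k] for k in range(window)) >= threshold
         for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text, one row per line ('*' = match, '.' = no match)."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in matrix)


if __name__ == "__main__":
    # Comparing a sequence with itself: repeats show up as off-diagonal lines.
    seq = "ABCABC"
    print(render(dot_matrix(seq, seq, window=2)))
```

Similar regions appear as diagonal runs of marks. When a sequence is compared with itself, the main diagonal is trivially filled and any additional parallel diagonals indicate internal repeats; larger windows suppress the random single-residue matches that clutter the plot.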
