Multalin Tool for Sequence Alignment

The document provides an index of topics covered in a Bioinformatics course. The topics include using various bioinformatics tools like Multalin, RNAfold, BLAST, EMBOSS, Clustal Omega, KEGG pathways, PDB, SCOP, CATH, tRNAscanSE, Rasmol, and DendroUPGMA. Procedures for performing multiple sequence alignment of 16S rRNA using Multalin, predicting RNA secondary structure using RNAfold, performing BLAST searches, and utilizing other databases and tools are outlined. Screenshots from using the various tools are also included to demonstrate how to access and analyze data using different bioinformatics resources.

Uploaded by

Polu Chattopadhyay


(Semester IV, Paper code: MMCB4413)

Roll number: 636

Registration number: A01-1112-0097-21
Subject: Bioinformatics
INDEX

Sl. No. Topic Prof.

1. Multalin AB

2. RNA fold AB

3. BLAST KS

4. EMBOSS and Clustal Omega KS

5. KEGG pathway SSC

6. PDB SSC

- The prediction of primary sequence
- Secondary structure prediction tool
- Prediction of tertiary structure of protein by comparative/homology modelling

7. SCOP and CATH SSC

8. tRNAscan-SE AB

9. Visualization of the structure of Aspirin using Rasmol KS

10. DendroUPGMA SSC

ACCESSION TO MULTALIN
Date:
Introduction
JOB TITLE: To find the similarities between the 16S rRNA sequences of two species using
MultAlin
INTRODUCTION:
16S rRNA
16S ribosomal RNA (or 16S rRNA) is the RNA component of the 30S small subunit of
a prokaryotic ribosome (SSU rRNA). Homologous small-subunit rRNAs are found in the
ribosomes of eukaryotic mitochondria and chloroplasts. The 3' end of 16S rRNA base-pairs
with the Shine-Dalgarno sequence of mRNA during translation initiation.
MultAlin
MultAlin is a multiple sequence alignment tool for protein and nucleic acid sequences, created
by Florence Corpet.
Exercise: The following steps are performed to compare the 16S rRNA of two species.
MultAlin for Methanopterin and THF reductase

Step 1: First, we open the NCBI (National Center for Biotechnology Information) webpage.
Step 2: Select Genome from the drop-down box and then press Enter.
Step 3: On the Genome page, scroll down and select the prokaryotic reference
genomes option.

Select any two organisms for the multiple sequence alignment. In our case we have chosen
Archaeoglobus fulgidus and Acetobacterium woodii.
Step 4: Scroll down to choose RefSeq (reference sequence). Then customise the view of the
webpage to show genes and RNA.
Scroll down and then click on FASTA.
Now go back, copy the FASTA sequence of each organism individually and paste both
into the input box on the MultAlin page.
After pasting the sequences, click “Start MultAlin”.
The run shows the alignment, in which each position is coloured red, blue
or black.
Red: High-consensus (highly conserved) positions.
Blue: Low-consensus (variable) positions.
Black: Neutral positions (no consensus).
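
The colour scheme above can be sketched in a few lines of Python. This is only an illustration of the idea, not MultAlin's own code: the function name `colour_columns`, the toy sequences, and the 90% / 50% thresholds are assumptions made for the example (the thresholds can be changed on the MultAlin form).

```python
from collections import Counter

# Assumed thresholds for this sketch; adjust to match the options used in the run.
HIGH = 0.90  # fraction needed for a "red" (high-consensus) column
LOW = 0.50   # fraction needed for a "blue" (low-consensus) column


def colour_columns(aligned, high=HIGH, low=LOW):
    """Return one colour label per column of an already-aligned set of sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    colours = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]  # gaps never count as consensus
        if not residues:
            colours.append("black")
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        fraction = top_count / len(column)
        if fraction >= high:
            colours.append("red")
        elif fraction >= low:
            colours.append("blue")
        else:
            colours.append("black")
    return colours


# Toy alignment (made-up sequences, not real 16S rRNA).
example = ["ACGU-A", "ACGUCC", "ACCUCG", "AGAUCU"]
print(colour_columns(example))
# ['red', 'blue', 'blue', 'red', 'blue', 'black']
```

With only two sequences, as in this exercise, a column is either fully identical (red) or half-matching, so most of the information is in where the red blocks fall.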
ACCESSION TO RNA fold
Date:
Introduction
RNA folding is the process by which a linear ribonucleic acid (RNA) molecule acquires secondary
structure through intra-molecular interactions.
Theory
Ribonucleic acid (RNA) is one of the key players in molecular biology and has attracted
theoretical and experimental physicists because of its intriguing structural and functional properties.
RNA molecules are used in the synthesis of proteins, where messenger RNA carries the coding
information. Both DNA and RNA are composed of subunits called nucleotides, each containing a base.
The nucleotides are linked together by phosphodiester linkages between the 3' hydroxyl group on the
sugar of one nucleotide and the 5' phosphate of the next one. As a result, the strand has a 5' end,
where a free phosphate group can be found, and a 3' end with a free hydroxyl group. An important
motivation for predicting RNA secondary structure is that there are many sequences whose structures
have not yet been experimentally determined and for which there are no homologues in the databases
from which the structure could be derived. In such cases, prediction is the practical way to obtain a
structure. Moreover, RNA secondary structure prediction has applications in the design of nucleic
acid probes.
Procedure
1. The NCBI website was opened. The Genome option was selected. Then prokaryotic reference genomes was chosen.

2. We search for the reference sequence. In this case we used Escherichia coli.
3. We obtain the RNase P sequence.

4. The FASTA sequence is copied.

5. The DNA sequence is converted to an RNA sequence using the Biomodel transcription and translation tool.
Link: [Link]
6. The RNAfold web server was opened. The sequence was copied and pasted into the sequence field of
the RNAfold web server.

7. A page showing the dot-bracket structure opens for the minimum free energy (MFE) structure.
An equivalent graphical output is also shown, with a drawing of the minimum free energy
structure.
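
Two of the steps above are easy to reproduce locally: the DNA-to-RNA conversion (step 5) and reading the dot-bracket string (step 7). The sketch below is a minimal illustration with hypothetical function names (`dna_to_rna`, `paired_positions`) and a hand-written toy hairpin; it does not compute a minimum free energy structure and is not RNAfold output.

```python
def dna_to_rna(dna):
    """Rewrite a coding-strand DNA sequence as RNA by replacing T with U.

    The FASTA header line must be removed before calling this function.
    """
    seq = "".join(dna.split()).upper()  # drop line breaks and spaces
    bad = set(seq) - set("ACGTN")
    if bad:
        raise ValueError(f"unexpected characters: {sorted(bad)}")
    return seq.replace("T", "U")


def paired_positions(dot_bracket):
    """Return 0-based (i, j) index pairs for every matching '(' and ')'."""
    stack, pairs = [], []
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


# Toy hairpin: the structure string is written by hand for illustration.
rna = dna_to_rna("GGGAAATCCC")
print(rna)                             # GGGAAAUCCC
print(paired_positions("(((....)))"))  # [(0, 9), (1, 8), (2, 7)]
```

In dot-bracket notation a dot is an unpaired base and each matching pair of brackets is one base pair, so the toy string describes a three-pair stem closing a four-base loop.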
ACCESSION TO BLAST - BASIC LOCAL ALIGNMENT SEARCH TOOL
Date:
INTRODUCTION:
BLAST, or Basic Local Alignment Search Tool, is an alignment tool that finds sequences in
a large database which show significant alignment to our query sequence. These sequences
are called subject sequences. BLAST is accessed through the National Center for
Biotechnology Information (NCBI) website.
Four of the main BLAST programs are:
● blastn- Nucleotide BLAST (nucleotide query sequence is compared with nucleotide
subject sequences).
● blastp- Protein BLAST (protein query sequence is compared with protein subject
sequences).
● blastx- (translated nucleotide query sequence is compared against protein subject sequences).
● tblastn- (protein query sequence is compared against translated nucleotide subject sequences).

PROCEDURE:
STEP 1: We open our web browser and go to the NCBI website. We select Protein from the
drop-down menu, type p53 in the search bar and click on Search.

Query sequence: P53 [Cricetulus griseus]

GenBank: AAC53040.1

STEP 2: We select the first result and click on the ‘FASTA’ to obtain the FASTA sequence.
STEP 3: We select the entire sequence and copy it.

STEP 4: On a separate tab we open ‘BLAST’ and click on Protein BLAST.

STEP 5: In the space provided under “Enter query sequence” we paste our query sequence.

STEP 6: We scroll down and click on ‘BLAST’ to run BLAST analysis.

OBSERVATION:
● At the top of our result page we see information about our query sequence, such as its
query ID, molecule type and length.

● On scrolling down, we see under ‘Descriptions’ the list of 100 subjects which showed the
most significant alignment to our query sequence. This also shows the
scientific name, the maximum and total scores based on alignment, the query coverage
(how much of our query sequence is covered by the subject sequence), the E-value
(expect value), which here is zero, meaning a match this good is not expected by chance,
the percentage identity, and the accession number and length of the subjects.

● Clicking on ‘Graphic Summary’ we can see the graphical representation of the
query and subject sequence alignment. Here, the sequences are red, indicating that the
alignment score is 200 or more.

● Under ‘Alignments’ we can see the protein sequence alignment. The sequence of the
query is written on the 1st line and that of the subject on the 3rd line. The 2nd line
summarises the match: if the two residues are identical, the amino acid letter is written;
if they are chemically similar, a plus sign (+) is written; and if they do not match, a
blank is left. Gaps in either sequence are shown as dashes.

● Under ‘Taxonomy’ we can see the lineage, taxonomy and the organisms from which
the subject proteins are obtained.
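
The three-line layout seen under ‘Alignments’ can be imitated with a short Python sketch. It is only an illustration: BLAST decides which pairs earn a plus sign from a substitution matrix, whereas the small `SIMILAR` set below is an assumed, hand-picked subset of similar residue pairs, and the sequences are made up.

```python
# Assumed, illustrative subset of "similar" amino acid pairs. BLAST itself uses the
# scores of a substitution matrix rather than a fixed list like this one.
SIMILAR = {frozenset(p) for p in ("IV", "IL", "LM", "KR", "DE", "ST", "FY", "DN", "EQ")}


def midline(query, subject):
    """Build the middle line for two aligned sequences of equal length."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            out.append(" ")        # gap in either sequence
        elif q == s:
            out.append(q)          # identical residue: repeat the letter
        elif frozenset((q, s)) in SIMILAR:
            out.append("+")        # similar residue
        else:
            out.append(" ")        # mismatch
    return "".join(out)


def identities_and_positives(mid):
    """Count identities (letters) and positives (letters plus '+') in a middle line."""
    identities = sum(c.isalpha() for c in mid)
    return identities, identities + mid.count("+")


# Made-up aligned fragments, for illustration only.
query = "MEEPQSDL-IG"
subject = "MEDPQSELSVW"
mid = midline(query, subject)
print(query)
print(mid)      # ME+PQS+L +
print(subject)
print(identities_and_positives(mid))  # (6, 9)
```

Dividing the identity count by the alignment length gives the percentage identity reported in the results table.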

CONCLUSION: BLAST is a local alignment tool which helps us find sequences that show
significant alignment to our query sequence, giving us an idea about the query sequence, its
function or its species of origin.
Pairwise sequence alignment using EMBOSS needle
Date:
ACCESSION TO CLUSTAL OMEGA
Date:
Introduction
Clustal Omega is a multiple sequence alignment tool that is very useful for aligning divergent sequences
and finding relationships among them. It aligns multiple nucleotide or protein sequences
efficiently. Clustal programs use progressive alignment methods, which align the most similar sequences first
and work their way down to the least similar sequences until a global alignment is created. ClustalW
is a matrix-based algorithm, whereas tools like T-Coffee and Dialign are consistency-based. ClustalW
has a fairly efficient algorithm that competes well against other software. The program requires three
or more sequences in order to calculate a global alignment; for pairwise sequence alignment (2
sequences), use tools such as EMBOSS Needle or LALIGN.
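
For the pairwise case, EMBOSS Needle performs a global alignment with the Needleman-Wunsch algorithm. The sketch below shows that algorithm in its simplest form. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for readability; EMBOSS Needle itself uses a substitution matrix with separate gap-opening and gap-extension penalties, so its scores will differ.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b with a simple linear gap penalty.

    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a aligned against leading gaps
    for j in range(1, cols):
        score[0][j] = j * gap          # b aligned against leading gaps

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


best, row_a, row_b = needleman_wunsch("GATTACA", "GATACA")
print(best)   # 4
print(row_a)
print(row_b)
```

Several alignments can share the best score, so the traceback returns one of them; the score itself is unique.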

Steps
● Select the first 10 sequences
● Download the file
● Copy and paste it into Clustal Omega
Red- small hydrophobic residues
Blue- acidic
Magenta- basic
Asterisk (*) - conserved

Finally, here is the phylogenetic tree obtained by performing Clustal Omega.

Accession to pathway database: KEGG pathway
Date:

a. Metabolism
- Glycolysis
b. Genetic Information Processing
- RNA polymerase
c. Environmental Information Processing
- Bacterial secretion system
d. Cellular Processes
- Endocytosis
e. Organismal systems
- Neutrophil Extracellular Trap Formation
f. Human Diseases
- Vibrio cholerae infection
g. Drug Development
- Cephalosporins
Accession to RCSB protein database – PDB
Date:
The prediction/ characterization of primary sequence

OPEN NCBI
Copy sequence and paste
Secondary structure prediction tool
Prediction of tertiary structure of protein by comparative/ homology modelling
ACCESSION TO SCOP AND CATH DATABASES
DATE:
1. SCOP
INTRODUCTION: SCOP is a protein classification database. It stands for Structural
Classification of Proteins. It provides a detailed description of the structural and evolutionary
relationships between all the proteins with known structures. The various levels of SCOP are:
class, fold, superfamily, family, protein domain and species.
ACCESSION STEPS
STEP 1: Type [Link] or search for SCOP in the Google search box.
The homepage of the SCOP database appears. The PDB ID of our desired protein (as
obtained from NCBI, 1A6M in this case) is typed in the search box. Results are displayed.

STEP 2: The lineage (class, fold, superfamily, family and domain) can be observed. It also
shows that the ID 1A6M is of myoglobin from the species Physeter catodon.
STEP 3: The structure of the myoglobin molecule is observed.
2. CATH
INTRODUCTION: CATH is a protein classification database. It stands for Class,
Architecture, Topology and Homologous superfamily. It provides information according to
the evolutionary relationship of protein domains.
STEP 1: Type [Link] or search for CATH in the Google search box. The homepage of the
CATH database appears. The PDB ID of our desired protein (as obtained from NCBI,
1A6M in this case) is typed in the search box.

STEP 2: The appropriate results are displayed for the ID submitted.

STEP 3: The first option ([Link]) is selected. A general summary of the superfamily is
shown.
STEP 4: The structure is shown as follows:

STEP 5: The CATH classification is displayed. It shows that
the submitted ID is of a protein which belongs to- Class: Mainly Alpha, Architecture:
Orthogonal Bundle, Topology: Globin-like, Homologous superfamily: Globins.
STEP 6: The functional families under the superfamily (of 1A6M) can be seen along with
their total sequences.

STEP 7: Structural neighbourhood of the superfamily is seen. It shows that the sequences
belonging to the same homologous superfamily have very similar percent identity.
ACCESSION TO tRNAscanSE
Date:
Introduction
tRNAscan-SE has been the software of choice for predicting transfer RNA (tRNA) genes in genomic
sequences. Its users include not only basic researchers but also biologists, database annotators
and sequencing centres. One or more sequences can be analysed together. Users are also asked
to state the source of the genome, if known. The sequence may be uploaded in FASTA format or
typed/pasted into the text box on the website. The tRNAscan-SE web server used here is a convenient,
ready-to-use means of identifying tRNA genes in one or more query sequences. The graphical interface
also provides easy navigation to the details of prediction results and a quick way to learn about the
features of the software without requiring familiarity with UNIX-based commands or installation on
one’s own computer. However, web-based analysis limits query sequences to a maximum of five
million base pairs. The standalone version can be used for larger genomic sequences.
Go to the tRNAscan-SE web server.
[Figure: predicted tRNA secondary structure with the 5’ and 3’ ends labelled; Class 1 tRNA]

Types of base pair = 3: AU, GC and GU

In the structure diagram, GC pairs in the RNA helices are marked with red dots and AU pairs
with blue dots.
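
The three base-pair types can be tallied from a sequence and a list of paired positions, as in the sketch below. The names `pair_type` and `count_pair_types`, the toy sequence and the pair list are all hypothetical; the pairs are written by hand rather than taken from a tRNAscan-SE prediction.

```python
from collections import Counter

# The two Watson-Crick pairs (AU, GC) and the GU wobble pair.
PAIR_NAMES = {frozenset("AU"): "AU", frozenset("GC"): "GC", frozenset("GU"): "GU"}


def pair_type(x, y):
    """Classify a pair of RNA bases as 'AU', 'GC', 'GU' or 'other'."""
    return PAIR_NAMES.get(frozenset((x.upper(), y.upper())), "other")


def count_pair_types(sequence, pairs):
    """Count base-pair types for 0-based (i, j) positions in an RNA sequence."""
    counts = Counter()
    for i, j in pairs:
        counts[pair_type(sequence[i], sequence[j])] += 1
    return dict(counts)


# Toy stem: positions 0-9, 1-8 and 2-7 are paired (hand-written example).
sequence = "GAGAAAAUUC"
pairs = [(0, 9), (1, 8), (2, 7)]
print(count_pair_types(sequence, pairs))  # {'GC': 1, 'AU': 1, 'GU': 1}
```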
JOB: Visualization of the structure of Aspirin using Rasmol
Date:
ACCESSION TO DendroUPGMA
Date:
Introduction
UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a straightforward approach to
constructing a phylogenetic tree from a distance matrix. It is the only method of phylogenetic
reconstruction dealt with here in which the resulting trees are rooted. The term "unweighted" indicates that
all distances contribute equally to each average that is computed; it does not refer to the arithmetic by
which the average is obtained.
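
The clustering itself can be written compactly. The following is a minimal textbook-style sketch of UPGMA, not the code behind the DendroUPGMA server; the function name, the tree representation (nested tuples of `(left, right, height)`) and the toy distance matrix are assumptions made for the example.

```python
def upgma(names, dist):
    """Build a rooted UPGMA tree from a symmetric distance matrix.

    Leaves are the given names; each internal node is (left, right, height).
    """
    # cluster id -> (subtree, number of leaves in the subtree)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # pairwise distances, keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)

        # Distance from the merged cluster to every other cluster: the average
        # over all leaf pairs, so each original distance counts equally.
        merged = {}
        for c in clusters:
            d_ac = d[tuple(sorted((a, c)))]
            d_bc = d[tuple(sorted((b, c)))]
            merged[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

        d = {pair: value for pair, value in d.items() if a not in pair and b not in pair}
        d.update(merged)
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), size_a + size_b)
        next_id += 1

    (tree, _size), = clusters.values()
    return tree


# Toy distance matrix for three taxa (made-up values).
names = ["A", "B", "C"]
dist = [[0, 2, 6],
        [2, 0, 6],
        [6, 6, 0]]
print(upgma(names, dist))  # ('C', ('A', 'B', 1.0), 3.0)
```

Each node height is half the distance between the two clusters it joins, which is why the resulting tree is rooted and every leaf lies at the same distance from the root.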

DendroUPGMA homepage

Distance matrix

Similarity matrix
Steps
1. We take the FASTA sequences of the testis-determining factor gene of human and
monkey separately from NCBI and input them as instructed in the dialogue box.

2. We decide upon the parameters on which we want to base this dendrogram and click on
Submit.

3. We get the different types of matrices based on the parameters we had set.
