BLAST and ORF Analysis in Bioinformatics

The document outlines procedures for pair-wise sequence alignment using BLAST, identifying Open Reading Frames (ORFs) with bioinformatics tools, and predicting protein 3D structures through homology modeling. It emphasizes the importance of sequence similarity in evolutionary relationships, gene prediction, and protein function. The document provides step-by-step instructions for using NCBI BLAST, ORF Finder, and SWISS-MODEL for these analyses.

Aim: Pair-wise alignment of sequences (BLAST) and interpretation of the output

Materials Required: Computer with internet access, access to the NCBI BLAST tool ([Link]), sample DNA or protein sequences
Introduction
The principle behind pair-wise sequence alignment using BLAST (Basic Local Alignment
Search Tool) is based on finding regions of similarity between two sequences. This similarity
can indicate evolutionary relationships, functional similarities, or even structural homologies.
BLAST is designed to perform local sequence alignments, focusing on finding short, highly
similar regions within sequences. Instead of aligning entire sequences end-to-end (global
alignment), BLAST looks for subsequences with high similarity. This approach is faster and
more efficient, especially for large databases. BLAST uses a heuristic search strategy, which
accelerates the alignment process by finding initial "seed" matches between the query and
database sequences. It starts by identifying short matching segments ("words") between the
sequences, which are then extended in both directions to maximize the alignment score. Only
regions with scores above a threshold are retained, which makes the search much faster than
exhaustive dynamic-programming methods. For proteins, BLAST uses a substitution matrix
(such as BLOSUM or PAM) to score matches and mismatches; for nucleotides, it uses simple
match and mismatch scores. Gaps are penalized in both cases. A high alignment score
suggests close similarity, while lower scores are less likely to indicate functional or
evolutionary relationships. The output provides critical data points, including Percent
Identity (percentage of identical matches), Query Coverage (portion of the query sequence
covered by the alignment), Bit Score (a normalized alignment score), and E-value (the
number of hits with an equal or better score expected by chance). Through these metrics,
one can determine the degree of similarity,
possible function, and potential evolutionary relationship. BLAST includes a graphical
overview of the alignments, showing the positions and quality of alignments along the
sequences. This allows for quick assessment of regions of similarity and conservation.
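
The seed-and-extend idea described above can be illustrated with a small Python sketch. This is a simplified teaching example, not the actual BLAST implementation: it uses exact word matches, ungapped extension, and hypothetical scoring values (match = 1, mismatch = -2, drop-off = 3) chosen only for illustration. The function names are likewise made up for this example.

```python
def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    seeds = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, w=3, match=1, mismatch=-2, drop=3):
    """Extend a seed without gaps in both directions. Extension stops when the
    running score falls more than `drop` below the best score seen so far."""
    def walk(qi, sj, step):
        best = score = 0
        best_len = length = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            score += match if query[qi] == subject[sj] else mismatch
            length += 1
            if score > best:
                best, best_len = score, length
            elif best - score > drop:
                break
            qi += step
            sj += step
        return best, best_len

    right_score, right_len = walk(i + w, j + w, 1)
    left_score, left_len = walk(i - 1, j - 1, -1)
    score = w * match + left_score + right_score
    # (score, query start, query end (exclusive), subject start)
    return score, i - left_len, i + w + right_len, j - left_len


def best_local_hit(query, subject, w=3, threshold=5):
    """Return the highest-scoring ungapped hit at or above threshold, or None."""
    best = None
    for i, j in find_seeds(query, subject, w):
        hit = extend_seed(query, subject, i, j, w)
        if hit[0] >= threshold and (best is None or hit[0] > best[0]):
            best = hit
    return best


query, subject = "TTACGTACGGG", "AAAACGTACGCCC"
print(best_local_hit(query, subject))  # (7, 2, 9, 3): query[2:9] is "ACGTACG"
```

Real BLAST additionally scores protein words with a substitution matrix, allows gapped extension, and evaluates the statistical significance of each hit.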
Procedure:
1. Accessing BLAST:
• Open a web browser and go to the NCBI BLAST website.
2. Choosing the BLAST Program:
• Depending on the sequence type (DNA or protein), select the appropriate BLAST
program (e.g., BLASTn for DNA or BLASTp for protein).
3. Input the Query Sequence:
• Copy the sequence you want to analyze and paste it into the query box.
• Choose a relevant database, such as nr (non-redundant) or RefSeq.
4. Setting Parameters:
• You may adjust parameters such as the Expect threshold (E-value), scoring matrix,
and gap penalties to fine-tune the search.
• Restrict the search by organism if you are looking for sequences from a specific species.
5. Run BLAST:
• Click on the "BLAST" button to start the search.
6. Review the Results:
• The results include a list of sequences that align with your query, along with
alignment scores and E-values.
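
To help interpret the reported metrics, the following Python sketch shows how percent identity, query coverage, bit score, and E-value relate to an alignment. It uses the standard Karlin-Altschul relations, S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S'). The function names are hypothetical, and the lambda and K values in the usage lines are illustrative assumptions only; the values BLAST actually used are printed in the statistics section of each report.

```python
import math


def percent_identity(aligned_query, aligned_subject):
    """Identical positions divided by alignment length (gap columns included)."""
    pairs = list(zip(aligned_query, aligned_subject))
    identical = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * identical / len(pairs)


def query_coverage(q_start, q_end, query_len):
    """Percentage of the query spanned by one alignment (q_end is exclusive)."""
    return 100.0 * (q_end - q_start) / query_len


def bit_score(raw_score, lam, k):
    """Normalize a raw score S using the statistical parameters lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative values only (assumed, not taken from a real BLAST run).
bits = bit_score(raw_score=120, lam=0.267, k=0.041)
print(round(bits, 1), e_value(bits, query_len=300, db_len=5_000_000))
print(percent_identity("ACGT-A", "ACCTTA"))  # 4 identical columns out of 6
```

A smaller E-value means the hit is less likely to have arisen by chance. Because the E-value depends on database size, the same alignment can receive different E-values in different databases.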

Result: Run BLAST and paste your result

Discussion/Conclusion:
Aim: Finding of ORF using bioinformatic tools
Materials Required: Computer with internet, NCBI ORF Finder, Sample DNA sequence
Introduction:
An Open Reading Frame (ORF) is a sequence of DNA that starts with a start codon, usually
ATG, and ends with a stop codon (TAA, TAG, or TGA). ORFs represent
potential protein-coding regions and are fundamental for locating genes within a DNA
sequence. Identifying ORFs is a crucial step in gene prediction, helping scientists to locate
genes and infer possible protein functions. A double-stranded DNA sequence has six reading
frames: three on the forward strand and three on the reverse-complement strand. Each frame
can potentially contain
start and stop codons, so it is necessary to examine all six frames to identify all possible
ORFs. ORF identification is essential in genomic annotation, enabling scientists to determine
gene locations in both newly sequenced genomes and known genomes of various organisms.
ORF identification is significant in genetic engineering, molecular biology, and
biotechnology, where understanding gene locations and structures is foundational.
Bioinformatic tools are invaluable in automating ORF detection by analyzing DNA
sequences and finding regions between start and stop codons that meet length requirements,
which is especially useful for filtering out short, non-functional ORFs.
NCBI’s ORF Finder is a commonly used online tool, offering a simple interface that
identifies ORFs in all six reading frames and provides information on start and stop positions,
length, and potential translations. EMBOSS getorf, part of the EMBOSS suite, is another
widely used tool available online or locally, providing users with ORF sequences in each
frame. Commercial software like Geneious Prime also has ORF detection features with
graphical displays, simplifying the analysis for researchers. For those with programming
knowledge, Biopython is a Python library that can identify ORFs programmatically, enabling
batch processing of multiple sequences.
The mechanism of ORF identification involves scanning DNA sequences for start codons and
continuing until a stop codon is reached, often with a minimum length filter to eliminate short
ORFs that may not code for functional proteins. After identifying an ORF, tools can translate
it into the corresponding amino acid sequence, offering insights into the protein it may
encode. Interpreting ORF results involves examining start and stop positions, ORF length,
and amino acid sequence. This translated sequence can be further analyzed by comparing it
against databases, such as through a BLAST search, to predict function or identify
homologous sequences. ORF finding is fundamental in gene prediction, protein engineering,
and comparative genomics. It allows scientists to identify, clone, and express proteins for
research or therapeutic applications, compare gene structures across species, and identify
evolutionary patterns.
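
The scanning mechanism described above can be sketched in plain Python. This is a minimal illustration, not a reimplementation of NCBI ORF Finder or EMBOSS getorf: it assumes the standard genetic code, only ATG as a start codon, and a hypothetical default minimum length of 30 nucleotides. Positions on the "-" strand are reported relative to the reverse-complemented sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(dna, min_len=30):
    """Scan all six frames for ATG...stop ORFs of at least min_len nucleotides.

    Returns tuples (strand, frame, start, end, protein), with 0-based start and
    exclusive end measured on the strand that was scanned.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    end = i + 3
                    if end - start >= min_len:
                        orfs.append((strand, frame + 1, start, end,
                                     translate(seq[start:end])))
                    start = None
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC", min_len=9))
# [('+', 3, 2, 17, 'MKFG*')]
```

The translated protein string can then be used as a BLASTp query to look for homologous sequences. Dedicated tools offer further options, such as alternative start codons and other genetic codes, that this sketch leaves out.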
Procedure:
1: Go to the NCBI ORF Finder website.
2: Input the DNA sequence in FASTA format.
3: Configure any additional settings if needed (e.g., minimum ORF length).
4: Run the tool and interpret the results by reviewing the ORFs and their positions.
5: Download or copy the sequences of interest for further analysis (e.g., BLAST for
homology).
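
Since the tool expects FASTA input, it can help to check a sequence file before pasting it. The sketch below is a minimal FASTA reader with a simple DNA alphabet check; the function names and the decision to accept N as an ambiguous base are assumptions made for this example.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts).upper() for name, parts in records.items()}


def is_dna(seq):
    """True if the sequence is non-empty and contains only A, C, G, T, or N."""
    return bool(seq) and set(seq) <= set("ACGTN")


example = ">seq1 sample\nATGC\nGGTA\n\n>seq2\nacgtn\n"
for name, seq in parse_fasta(example).items():
    print(name, len(seq), is_dna(seq))
```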
Result: Paste your result
Conclusion:
Aim: Demonstration and prediction of the 3D structure of a protein using bioinformatics tools
Materials and Software Required: Computer with internet access; bioinformatics tools:
UniProt/PDB for sequence retrieval, SWISS-MODEL for homology modeling
Theory:
Proteins are essential macromolecules in all living organisms, playing vital roles in nearly all
cellular processes. Their functions include enzymatic catalysis, transport, signal transduction,
and structural support. The function of a protein is intricately linked to its 3D structure, which
is determined by the sequence of amino acids that make up the protein. Protein structures are
organized into four levels.
1. Primary Structure: The linear sequence of amino acids in a protein chain, linked by
peptide bonds. This sequence determines the way a protein will fold and, ultimately,
its function.
2. Secondary Structure: Localized conformations within the polypeptide chain, formed
through hydrogen bonding between backbone atoms. The two main types are:
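o Alpha-helix: A right-handed coiled structure stabilized by hydrogen bonds
between the backbone C=O of each residue and the backbone N-H of the residue
four positions further along the chain.
o Beta-sheet: A pleated, sheet-like structure in which strands lie side by side
(parallel or antiparallel), with hydrogen bonds between adjacent strands.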
3. Tertiary Structure: The complete 3D arrangement of all atoms within a single
polypeptide chain, formed through interactions among amino acid side chains. This
includes hydrogen bonds, ionic interactions, hydrophobic interactions, and disulfide
bonds. The tertiary structure defines the protein's specific shape and function.
4. Quaternary Structure: The arrangement of multiple polypeptide chains (subunits) in a
multi-subunit protein. Each subunit may have its own tertiary structure, but together
they function as a single unit.
The 3D structure of a protein is crucial for understanding how it interacts with other
molecules, substrates, or ligands. Knowledge of the structure allows researchers to design
drugs, study disease mechanisms, and understand enzyme catalysis and receptor-ligand
interactions. Since experimental methods such as X-ray crystallography and NMR are time-
consuming and expensive, bioinformatics-based structure prediction has become invaluable
in studying protein structures.
Methods of Protein Structure Prediction
There are three primary methods for predicting protein structures computationally:
1. Homology Modeling: This method predicts the 3D structure of a target protein based
on the structure of a homologous protein (template) with a known structure. It works
well when there is significant sequence similarity between the target and template.
Homology modeling relies on the principle that similar sequences have similar
structures. A widely used online tool for homology modeling, SWISS-MODEL
allows users to input a target protein sequence, select a homologous template, and
automatically build a 3D model.
2. Threading (Fold Recognition): Used when no suitable template is available, but the
target protein may have a fold similar to known protein folds. Threading methods
compare the target sequence with a library of known structures to identify compatible
folds, even in cases with low sequence similarity.
3. Ab Initio Prediction: This approach does not rely on template structures and predicts
protein structures solely based on the physical and chemical properties of amino
acids. It is used when no homologs or templates exist, and it is computationally
intensive. More recently, deep-learning tools such as AlphaFold, which learn from
known structures and multiple sequence alignments rather than from physics alone,
have greatly improved prediction accuracy for proteins without close templates.

Procedure: Homology modeling using SWISS-MODEL

Step 1: Sequence Retrieval
1. Open PDB ([Link]).
2. Enter the name of the protein of interest (e.g., Human Hemoglobin).
3. Download the FASTA format sequence for further analysis.
Step 2: 3D Structure Prediction
1. Go to SWISS-MODEL ([Link]).
2. Input the protein sequence (FASTA format).
3. Run the template search and select a suitable template from the results for
homology modeling.
4. Run the model-building process, which may take several minutes.
5. Assess the quality of the model using the Ramachandran plot.
6. Download the predicted 3D structure file in PDB format.
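
A Ramachandran plot displays the backbone dihedral angles phi and psi of each residue. As a rough illustration of what is being plotted, the sketch below computes a dihedral angle from four atom positions and applies it to backbone N, CA, and C atoms. The input layout (a list of dictionaries of coordinates) and the function names are assumptions made for this example; reading coordinates from the downloaded PDB file is not shown.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0, d2 = _dot(b0, b1), _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi_psi(residues):
    """Return (phi, psi) for each interior residue.

    residues: list of dicts with 'N', 'CA', and 'C' coordinate tuples, in
    chain order (a hypothetical layout chosen for this sketch).
    """
    angles = []
    for i in range(1, len(residues) - 1):
        prev, cur, nxt = residues[i - 1], residues[i], residues[i + 1]
        phi = dihedral(prev["C"], cur["N"], cur["CA"], cur["C"])
        psi = dihedral(cur["N"], cur["CA"], cur["C"], nxt["N"])
        angles.append((phi, psi))
    return angles


print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))  # 90.0
```

In a good model, most residues fall in the favored regions of the plot; many residues in disallowed regions suggest local errors in the backbone.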

Result: Paste a picture of the predicted model along with the Ramachandran plot
Conclusion:
