Breakout: Sequence Comparison (BLAST) & Visualization

What is BLAST and how can we analyze the results? 

The Basic Local Alignment Search Tool (BLAST) is a software program for comparing a biological sequence to the sequences in a database. Unlike a global alignment, which attempts to line up sequences from end to end, BLAST uses a multi-step process: it first finds matches of short sequences (words), then extends each match base by base in both directions until the compared sequences no longer match up. Thus, you can find regions of similarity between sequences of distantly related organisms almost as well as longer stretches of homology between closely related organisms. BLAST was originally developed to identify related sequences in evolutionary studies, but its use has expanded to mapping sequences onto genomes, identifying a sequence's source organism, and more.
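
The word-match-then-extend idea described above can be sketched in a few lines of Python. This is a simplified illustration only: it extends exact matches and stops at the first mismatch, whereas the real BLAST programs score the extension and tolerate mismatches and gaps. The function names and the default word size of 4 are chosen for this example and are not part of BLAST.

```python
def find_seeds(query, subject, word_size):
    """Yield (query_pos, subject_pos) for every exact word shared by both sequences."""
    # Index every word of length word_size found in the subject.
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    # Look up each query word in that index.
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            yield i, j


def extend_seed(query, subject, i, j, word_size):
    """Extend a seed in both directions while the bases keep matching."""
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + word_size, j + word_size
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return start_q, start_s, end_q - start_q


def local_matches(query, subject, word_size=4):
    """Return distinct (query_start, subject_start, length) local matches (0-based)."""
    return {extend_seed(query, subject, i, j, word_size)
            for i, j in find_seeds(query, subject, word_size)}


# Toy sequences invented for this example; they share the stretch "ACGTGC".
print(local_matches("TTACGTGCAA", "GGGACGTGCCC"))
```

Several overlapping words seed the same region here, and all of them extend to the same local match, which is why the result is collected in a set.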

Some helpful information about BLAST from workshops taught by our Faculty:

A few examples of interest to biology teachers

Identifying a sequence: Which pathogen is it?
  • Compare a sample sequence using BLASTn:
    • produced with a universal primer set to the bacterial 16S rRNA database
    • produced with a universal primer set to the fungal Internal transcribed spacer (ITS) region database
    • or a viral genomic sequence to the viral RefSeq genome database
  • Tips on analyzing the results:
    • Text-based Alignments
    • Graphical Alignment display
Comparing protein sequences to see evolution: A hospital outbreak in the NIH Clinical Center!
  • Tips on analyzing the results:
    • Text-based Alignments
    • Graphical Alignment display
    • TreeView display
Assessing PCR primers:  Cloning a region or Designing a diagnostic
  • Designing PCR primers for Huntington's Disease Diagnosis 
    • You HAVE PCR primers and want to use the Human Genome BLAST to test them on a genome for specificity in amplifying the triplet expansion region of the HTT gene:  CAGCAGCGGCTGTGCCTGCGG  &  CCATGGCGACCCTGGAAAAGC
      • Human Genome BLAST result
      • Tip:  Use BLASTn. Paste your primer sequences with ~20 "N"s between the two, select the RefSeq Genome database, and limit the search to Human in the Organism section. Choose the BLASTn program (NOT MegaBLAST), then expand the Advanced parameters section to set the word size to 7 and raise the Expect (E-value) threshold to at least 1, because the shorter the hit, the larger its Expect value.
  • Designing PCR primers to clone and then study an enterobacterial toxin
    • You would like to have Primer BLAST suggest some good PCR primers to clone the E. coli O157:H7 Shiga toxin A gene
      • Primer BLAST result
      • Tip:  You can start from a Gene record's "Genomic regions, transcripts and products" graphical view section, a Nucleotide sequence record's Graphics view display, or the Genome Data Viewer. To limit the search to helpful results, identify the regions (positions) that you would like each primer to anneal to and select them (control-click lets you select two different regions, for example around a coding sequence or a variation). Then click the "Tools" button > BLAST and Primer search > Primer BLAST (selection) to initiate the search.
  • Tips on analyzing the results:
    • Text-based Alignments
    • Graphical Alignment display
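
The primer-specificity tip for the HTT primers can be sketched in Python. The first function builds the single query described in the tip (two primers joined by about 20 "N"s). The second uses the standard relationship between bit score and Expect value, E = m × n × 2^(−S'), to show why short hits need a relaxed Expect threshold. The function names, the example bit scores, and the rough 3-billion-base search space are assumptions made for illustration; BLAST itself uses effective lengths, so real values will differ.

```python
FORWARD = "CAGCAGCGGCTGTGCCTGCGG"  # primer sequences quoted in the text above
REVERSE = "CCATGGCGACCCTGGAAAAGC"


def build_primer_query(primer_a, primer_b, spacer_length=20):
    """Join two primers with a run of N so that one BLASTn search covers both."""
    return primer_a + "N" * spacer_length + primer_b


def expected_chance_hits(bit_score, query_length, database_length):
    """Approximate Expect value: E = m * n * 2**(-S') for bit score S'."""
    return query_length * database_length * 2.0 ** (-bit_score)


query = build_primer_query(FORWARD, REVERSE)
print(query)
print(len(query))  # 21 + 20 + 21 bases

# Hypothetical bit scores: a short hit scores low, so its Expect value is large.
search_space = 3_000_000_000  # rough size of a human-genome search, an assumption
for bits in (30, 40, 60):
    print(bits, expected_chance_hits(bits, len(query), search_space))
```

Because the Expect value doubles for every bit of score lost, a match only as long as one primer can easily exceed a strict Expect threshold and be hidden from the results unless that threshold is raised.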
Learning about a gene sequence:  Understanding the impact of a cancer patient's sequence
  • Map a cancer patient's sequence to well-annotated RefSeq sequences to learn more about the sequence, its originating gene, and possible genetic variants.
  • Using Human Genome BLAST to:
    • find where it matches best to a similar sequence in the reference genome
    • visualize in the Genome Data Viewer to find where the gene is in the chromosome and compare with information in the same area in annotation tracks
  • Using BLASTx against the human refseq_protein database to:
    • visualize the alignment in "Pairwise with dots for identities" and identify encoded variant amino acids in the protein sequence
    • visualize in the Genome Data Viewer to find where the variant is located in the reference gene, and compare with known functional regions in the protein and with information in the same area in annotation tracks
  • You can map the hit to data in the same region in other tracks and link to key database records to learn more:
    • ClinVar (clinical variation assertion) database to learn what is known about the variant
    • Gene database to learn what is known about the gene, from its wild-type function to the tissues in which it is normally expressed.
  • You can use Primer BLAST to predict a PCR Primer pair for amplifying/sequencing this region which might serve as a potential diagnostic.
  • You can use BLASTp to find orthologs, examine how far back in evolution the gene goes, and see if the identified variant shows evidence of conservation.
    • Orthologs - Jawed Vertebrates | Do Protein Alignment, root with Human, zoom in to residue 61 to show invariance | Phylogenetic TreeView is available!
  • You can use CD-Search (a.k.a. Reverse Position-Specific BLAST, or RPS-BLAST) to identify conserved domains and key residues with functional annotations, view them in a 3D structure of the protein, and then pinpoint the location of a variant and predict its potential impact.
  • Tips on analyzing the results:
    • Text-based Alignments
    • TreeView Display
    • Using the Sequence Viewer's Graphics display (all sequences) or Genome Data Viewer (eukaryotes)
    • iCn3D
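
The "Pairwise with dots for identities" view mentioned above can be imitated with a short Python sketch: one aligned row is shown in full, and the other shows a dot wherever it is identical, so variant residues stand out. The function names and the toy peptide rows are invented for this illustration and do not come from a real BLAST result.

```python
def dots_for_identities(top_row, bottom_row):
    """Return bottom_row with '.' wherever it is identical to top_row."""
    if len(top_row) != len(bottom_row):
        raise ValueError("aligned rows must have the same length")
    return "".join("." if a == b else b for a, b in zip(top_row, bottom_row))


def variant_columns(top_row, bottom_row):
    """Return the 1-based alignment columns where the two rows differ."""
    return [col for col, (a, b) in enumerate(zip(top_row, bottom_row), start=1)
            if a != b]


# Toy aligned rows (hypothetical), differing at a single column.
top = "MKTAYIAKQR"
bottom = "MKTAYLAKQR"
print(top)
print(dots_for_identities(top, bottom))
print(variant_columns(top, bottom))
```

Note that the columns reported here are positions in the alignment; mapping them back to residue numbers in a protein record requires the alignment's start coordinates.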

Last Reviewed: July 25, 2024