BLAST Statistics: The Expect Value
The sequence alignment scores we have been discussing imply a particular model of biological sequences. Based on that model, we can calculate or simulate the distribution of random (chance) alignment scores. BLAST can therefore report whether an alignment score is likely to have come from that distribution of random alignment scores. The relevant statistic is called the Expect value or e-value.
Expect value — for a particular match, the number of chance alignments expected with the same score or a better one.
The Expect value is an exponentially decreasing function of the score and is directly proportional to the size of the search space. If the Expect value is much less than one, the alignment score is unlikely to be due to chance. If the Expect value is near one or greater, the score may be due to chance, although that does not mean it is.
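This relationship is commonly written with the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m is the query length, and n is the database length. The sketch below is a minimal illustration of that formula, not BLAST's own code. The function name is hypothetical, and the values of K and lambda are illustrative assumptions; in practice they depend on the scoring matrix and gap penalties.

```python
import math


def expect_value(raw_score, query_len, db_len, k=0.041, lam=0.267):
    """Estimate E = K * m * n * exp(-lambda * S).

    k and lam are illustrative assumptions, not values taken from any
    particular BLAST search.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Higher scores give exponentially smaller Expect values
# for the same query and database.
for score in (40, 80, 120):
    e = expect_value(score, query_len=300, db_len=1_000_000_000)
    print(f"raw score {score}: E = {e:.3g}")
```

Each fixed increase in score divides the Expect value by the same factor, while doubling either the query length or the database length doubles it.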
Important: Since the Expect value increases with the database size, you should always search the smallest database that contains the sequences of interest. On the web, you can choose a smaller database from the menu and restrict the database using an organism limit or one of the exclude filters.
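To make the effect of database size concrete, the sketch below uses the bit-score form of the same relationship, E = m * n * 2^(-S'), where S' is the bit score. The function name, the query length, the bit score, and the three database sizes are all made-up values for illustration, and the sketch ignores the effective-length corrections that BLAST applies.

```python
def expect_from_bit_score(bit_score, query_len, db_len):
    """Estimate E = m * n * 2**(-S') from a bit score S'.

    Uses raw lengths; real BLAST searches use corrected (effective) lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# The same alignment (same bit score) searched against three
# hypothetical databases of increasing total length.
query_len = 300
bit_score = 45.0
for db_len in (10**7, 10**9, 10**11):
    e = expect_from_bit_score(bit_score, query_len, db_len)
    print(f"database length {db_len:.0e}: E = {e:.2g}")
```

With these made-up numbers, the same alignment has an Expect value well below one in the smallest database and close to one in the largest, which is why a smaller, targeted database makes real matches easier to recognize.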
The example below shows how the e-value is used to interpret the significance of a match in BLAST results.
The Expect Threshold and Max Target Sequences
When running a BLAST search, you probably want to see all the significant matches. In other words, you want to make sure that the list of results extends all the way to the Expect value cutoff set for the search. The default cutoff (Expect threshold) is set to 0.05 for protein and nucleotide searches. However, the default number of matches returned is only 100. The number of matches is controlled by the Max target sequences setting. In most searches, you will need to increase this setting to see all significant matches. The Expect threshold and Max target sequences settings are under the expandable Algorithm parameters section of the BLAST form, as shown in the image below.
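The interaction between the two settings can be sketched as a filter followed by a cap. This is a simplified model for illustration only, not a description of how BLAST applies the limits internally. The Hit type, the function name, and the hit list are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    evalue: float


def reported_hits(hits, expect_threshold=0.05, max_target_sequences=100):
    """Return (hits shown, number of hits passing the threshold).

    Simplified model: keep hits at or below the Expect threshold,
    sort them best first, then cap the list at max_target_sequences.
    """
    significant = sorted(
        (h for h in hits if h.evalue <= expect_threshold),
        key=lambda h: h.evalue,
    )
    return significant[:max_target_sequences], len(significant)


# 250 made-up significant hits plus two weak ones above the threshold.
hits = [Hit(f"subject_{i}", 10.0 ** (-i)) for i in range(2, 252)]
hits += [Hit("weak_1", 0.5), Hit("weak_2", 3.0)]

shown, total = reported_hits(hits)
print(f"{len(shown)} of {total} significant hits shown")

# Raising the cap lets the list reach the Expect threshold.
shown, total = reported_hits(hits, max_target_sequences=500)
print(f"{len(shown)} of {total} significant hits shown")
```

If the last hit in the list still has an Expect value far below the threshold, the list was probably cut off by Max target sequences rather than by the Expect threshold.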
Last Reviewed: July 3, 2023