Position-Specific Iterated BLAST (PSI-BLAST)
- blastpgp
-
The blastpgp program can do an iterative search in which
sequences found in one round of searching are used to build
a score model for the next round of searching. In this usage,
the program is called Position-Specific Iterated BLAST, or PSI-BLAST.
As explained in the accompanying paper, the BLAST algorithm is
not tied to a specific score matrix. Traditionally, it has been
implemented using an AxA substitution matrix where A is the alphabet size.
PSI-BLAST instead uses a QxA matrix, where Q is the length of the query
sequence; the score of a letter depends both on its position in the
query and on the letter at that position in the subject sequence.
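
The difference between the two kinds of matrix can be sketched in a few
lines of Python. This is only an illustration, not code from blastpgp: the
four-letter alphabet, the match/mismatch scores, the function names, and the
adjusted entry at the second query position are all assumptions made up for
the example, and gaps are ignored.

```python
# Toy alphabet and scores; illustrative values only, not real matrix entries.
ALPHABET = "ACDE"


def score_with_substitution_matrix(query, subject, matrix):
    """Ungapped score with an AxA matrix: depends only on the letter pair."""
    return sum(matrix[q][s] for q, s in zip(query, subject))


def score_with_position_matrix(subject, position_matrix):
    """Ungapped score with a QxA matrix: row i belongs to query position i."""
    return sum(position_matrix[i][s] for i, s in enumerate(subject))


# AxA matrix: +2 for a match, -1 for a mismatch (hypothetical values).
substitution = {a: {b: (2 if a == b else -1) for b in ALPHABET} for a in ALPHABET}

query = "ACD"

# QxA matrix: one row per query position. Each row starts as a copy of the
# substitution row for the query letter at that position.
position_matrix = [dict(substitution[q]) for q in query]

# Hypothetical adjustment: suppose earlier rounds showed that the second
# query position often holds an E, so E is rewarded there and only there.
position_matrix[1]["E"] = 3

subject = "AED"
axa_score = score_with_substitution_matrix(query, subject, substitution)
qxa_score = score_with_position_matrix(subject, position_matrix)
```

With the AxA matrix, E against C scores the same wherever it occurs; with the
QxA matrix, the same letter can be scored differently at each query position.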
-
The position-specific matrix for round i+1 is built from a constrained
multiple alignment among the query and the sequences found with
sufficiently low e-value in round i. The top part of the output for
each round divides the sequences into two groups: those found
previously and used in the score model, and those not used in the
score model. The output currently includes many diagnostics
requested by users at NCBI. To skip quickly from the output of
one round to the next, search for the string "producing", which is
part of the header for each round and likely does not appear elsewhere
in the output. PSI-BLAST "converges" and stops if every sequence
found in round i+1 with an e-value below the threshold was already in
the model at the beginning of that round.
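
The convergence rule can be sketched as a simple loop. This is a minimal
sketch under stated assumptions, not the blastpgp implementation: the
function name, the `search` callback (which is assumed to take the set of
sequences currently in the model and return a mapping from sequence
identifier to e-value), and the `max_rounds` limit are all hypothetical, and
the score model is reduced to a plain set of sequence identifiers.

```python
def iterate_until_converged(search, threshold, max_rounds=10):
    """Repeat the search until no new sequence falls below the threshold.

    search(model) is a hypothetical callback returning {seq_id: e_value}
    for one round, given the set of sequence ids currently in the model.
    Returns (model, rounds_run, converged).
    """
    model = set()  # before the first round only the query is used
    for round_number in range(1, max_rounds + 1):
        hits = search(model)
        significant = {seq_id for seq_id, e_value in hits.items()
                       if e_value < threshold}
        # Converged: everything below the threshold was already in the
        # model at the beginning of this round.
        if significant <= model:
            return model, round_number, True
        # Otherwise the next round's model is built from this round's
        # sufficiently significant sequences.
        model = significant
    return model, max_rounds, False


def fake_search(model):
    """Made-up results standing in for a real database search."""
    if not model:
        return {"s1": 1e-5, "s2": 0.5}
    if model == {"s1"}:
        return {"s1": 1e-6, "s3": 1e-4, "s2": 0.2}
    return {"s1": 1e-6, "s3": 1e-5, "s2": 0.1}


final_model, rounds_run, converged = iterate_until_converged(fake_search, 0.001)
```

In this made-up run, the third round finds no sequence below the threshold
that was not already in the model, so the loop stops there.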