skbio.alignment.local_pairwise_align_protein

skbio.alignment.local_pairwise_align_protein(seq1, seq2, gap_open_penalty=11, gap_extend_penalty=1, substitution_matrix=None)

Locally align exactly two protein sequences using the Smith-Waterman algorithm.

State: Experimental as of 0.4.0.

Parameters:
  • seq1 (Protein) – The first unaligned sequence.
  • seq2 (Protein) – The second unaligned sequence.
  • gap_open_penalty (int or float, optional) – Penalty for opening a gap. This value is subtracted from the previous best alignment score, so it is typically positive.
  • gap_extend_penalty (int or float, optional) – Penalty for extending a gap. This value is subtracted from the previous best alignment score, so it is typically positive.
  • substitution_matrix (2D dict (or similar), optional) – Lookup table of substitution scores. These values are added to the previous best alignment score. If None, BLOSUM50 is used.
Returns:
  A tuple of three items: a TabularMSA object containing the aligned sequences, the alignment score (float), and the start/end positions of each input sequence (an iterable of two-item tuples). The start/end positions are indexes into the unaligned sequences.
Return type:
  tuple
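
The scoring rules and the shape of the return value described above can be illustrated with a small pure-Python sketch. This is not scikit-bio's implementation: the function name local_align_sketch, the toy match/mismatch matrix, and the gap convention (the first gap position is charged gap_open, each further position gap_extend) are assumptions made for illustration, and the end positions are treated as inclusive. It returns plain strings rather than a TabularMSA.

```python
def local_align_sketch(seq1, seq2, sub, gap_open=11, gap_extend=1):
    """Toy Smith-Waterman with affine gaps. A sketch, not skbio's code."""
    n, m = len(seq1), len(seq2)
    neg = float("-inf")
    # H: best local score of an alignment ending at (i, j)
    # E: same, but the last column is a gap in seq1
    # F: same, but the last column is a gap in seq2
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Penalties are subtracted from the previous best score.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            # Substitution scores are added to the previous best score.
            diag = H[i - 1][j - 1] + sub[seq1[i - 1]][seq2[j - 1]]
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j

    # Trace back from the best cell until the score drops to zero.
    # (If best == 0 the alignment is empty and the positions are not meaningful.)
    a1, a2 = [], []
    i, j, state = bi, bj, "H"
    while state != "H" or H[i][j] > 0:
        if state == "H":
            if H[i][j] == H[i - 1][j - 1] + sub[seq1[i - 1]][seq2[j - 1]]:
                a1.append(seq1[i - 1])
                a2.append(seq2[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            a1.append("-")
            a2.append(seq2[j - 1])
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # this column opened the gap
            j -= 1
        else:
            a1.append(seq1[i - 1])
            a2.append("-")
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"  # this column opened the gap
            i -= 1

    aligned = ("".join(reversed(a1)), "".join(reversed(a2)))
    # Start/end positions index into the unaligned inputs (end inclusive here).
    return aligned, float(best), [(i, bi - 1), (j, bj - 1)]


# Toy 2D dict: +5 for a match, -4 for a mismatch (illustrative, not BLOSUM50).
alphabet = "ACDEFGHIKW"
toy = {a: {b: 5 if a == b else -4 for b in alphabet} for a in alphabet}

aligned, score, positions = local_align_sketch(
    "ACDEFGHIK", "ACDEGHIK", toy, gap_open=2, gap_extend=1)
print(aligned)    # ('ACDEFGHIK', 'ACDE-GHIK')
print(score)      # 38.0
print(positions)  # [(0, 8), (0, 7)]
```

The three returned items mirror the tuple described above: the aligned sequences, the score, and one (start, end) pair per input sequence. With the default penalties of 11 and 1, a gap is only introduced when the matches it enables outweigh its cost.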

Notes

The default gap_open_penalty and gap_extend_penalty values are derived from the NCBI BLAST server [1].

The BLOSUM (blocks substitution matrix) family of amino acid substitution matrices was originally defined in [2].
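
The substitution_matrix parameter is described as a "2D dict (or similar)". A minimal sketch of that shape, assuming a nested dict indexed as matrix[a][b], is shown here. The helper name make_toy_matrix and its match/mismatch values are hypothetical and are not BLOSUM scores.

```python
def make_toy_matrix(alphabet, match=5, mismatch=-4):
    """Build a nested dict where matrix[a][b] scores aligning a with b."""
    return {
        a: {b: (match if a == b else mismatch) for b in alphabet}
        for a in alphabet
    }


# The 20 standard amino acid one-letter codes.
toy_matrix = make_toy_matrix("ACDEFGHIKLMNPQRSTVWY")

print(toy_matrix["A"]["A"])  # 5
print(toy_matrix["A"]["W"])  # -4
```

A real matrix such as BLOSUM50 has the same two-level lookup shape but assigns a distinct score to each residue pair.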

References

[1] http://blast.ncbi.nlm.nih.gov/Blast.cgi
[2] Amino acid substitution matrices from protein blocks. S Henikoff and J G Henikoff. Proc Natl Acad Sci U S A. Nov 15, 1992; 89(22): 10915-10919.