Parameters: |
- query_sequence (string) – The query sequence. This may be upper or lowercase and must be drawn from the set
{A, C, G, T, N} (nucleotide) or from the set
{A, R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V, B, Z, X, *}
(protein).
- gap_open_penalty (int, optional) – The penalty applied to creating a gap in the alignment. This CANNOT
be 0.
Default is 5.
- gap_extend_penalty (int, optional) – The penalty applied to extending a gap in the alignment. This CANNOT
be 0.
Default is 2.
- score_size (int, optional) – If your estimated best alignment score is < 255 this should be 0.
If your estimated best alignment score is >= 255, this should be 1.
If you don’t know, this should be 2.
Default is 2.
- mask_length (int, optional) – The minimum distance between the optimal and suboptimal alignment ending
positions: the suboptimal ending position is at least mask_length away
from the optimal one. We suggest using len(query_sequence)/2 if
you don’t have special concerns.
Detailed description of mask_length: After locating the optimal
alignment ending position, the suboptimal alignment score can be
heuristically found by checking the second largest score in the array
that contains the maximal score of each column of the SW matrix. In
order to avoid picking the scores that belong to the alignments
sharing the partial best alignment, the SSW C library masks the reference
loci within mask_length of the best alignment ending
position and locates the second largest score among the unmasked
elements.
Default is 15.
- mask_auto (bool, optional) – This will automatically set the used mask length to be
max(int(len(query_sequence)/2), mask_length).
Default is True.
- score_only (bool, optional) – This will prevent the best alignment beginning positions (BABP) and the
cigar from being returned as a result. This overrides any setting on
score_filter, distance_filter, and override_skip_babp. It has the
highest precedence.
Default is False.
- score_filter (int, optional) – If set, this will prevent the cigar and best alignment beginning
positions (BABP) from being returned if the optimal alignment score is
less than score_filter, which saves some computation time. This filter
may be overridden by score_only (prevents BABP and cigar, regardless
of other arguments), distance_filter (may prevent cigar, but will
cause BABP to be calculated), and override_skip_babp (will ensure
BABP is returned).
Default is None.
- distance_filter (int, optional) – If set, this will prevent the cigar from being returned if the length
of the query_sequence or the target_sequence is less than
distance_filter, which saves some computation time. The results of
this filter may be overridden by score_only (prevents BABP and cigar,
regardless of other arguments), and score_filter (may prevent cigar).
override_skip_babp has no effect with this filter applied, as BABP
must be calculated to perform the filter.
Default is None.
- override_skip_babp (bool, optional) – When True, the best alignment beginning positions (BABP) will always be
returned unless score_only is set to True.
Default is False.
- protein (bool, optional) – When True, the query_sequence and target_sequence will be read as
protein sequence. When False, the query_sequence and
target_sequence will be read as nucleotide sequence. If True, a
substitution_matrix must be supplied.
Default is False.
- match_score (int, optional) – When using a nucleotide sequence, the match_score is the score added
when a match occurs. This is ignored if substitution_matrix is
provided.
Default is 2.
- mismatch_score (int, optional) – When using a nucleotide sequence, the mismatch_score is the score applied
when a mismatch occurs. This should be a negative integer.
This is ignored if substitution_matrix is provided.
Default is -3.
- substitution_matrix (2D dict, optional) – Provides the score for each possible substitution of sequence
characters. This may be used for protein or nucleotide sequences. The
entire set of possible combinations for the relevant sequence type MUST
be enumerated in the dict of dicts. This will override match_score
and mismatch_score. Required when protein is True.
Default is None.
- suppress_sequences (bool, optional) – If True, the query and target sequences will not be returned. When False,
they are included in the result for convenience.
Default is False.
- zero_index (bool, optional) – If True, all indices will start at 0. If False, all indices will
start at 1.
Default is True.
|
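The score_size rule described above can be written as a small helper. This is only a sketch: `choose_score_size` and its `estimated_best_score` argument are hypothetical names introduced for illustration, not part of the documented interface.

```python
def choose_score_size(estimated_best_score=None):
    """Pick a score_size value from an estimated best alignment score.

    Hypothetical helper: returns 2 when no estimate is available,
    0 for estimates below 255, and 1 otherwise.
    """
    if estimated_best_score is None:
        return 2  # unknown score: the documented default
    return 0 if estimated_best_score < 255 else 1
```

Passing no estimate yields the documented default of 2.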
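The interaction of mask_length and mask_auto, and the masking heuristic for the suboptimal score, might be sketched in plain Python as follows. This is a simplified illustration of the description above and not the SSW C library's code; `effective_mask_length`, `suboptimal_score`, and the `column_max` list (the maximal score of each column of the SW matrix) are hypothetical names.

```python
def effective_mask_length(query_sequence, mask_length=15, mask_auto=True):
    """Mask length actually used, following the mask_auto rule."""
    if mask_auto:
        return max(int(len(query_sequence) / 2), mask_length)
    return mask_length


def suboptimal_score(column_max, mask_length):
    """Second-best score after masking positions near the best one.

    column_max holds the maximal score of each column. Positions closer
    than mask_length to the best ending position are ignored, and the
    best position itself is always excluded.
    """
    if not column_max:
        return None
    # Ending position of the optimal alignment (first maximum on ties).
    best_end = max(range(len(column_max)), key=column_max.__getitem__)
    radius = max(mask_length, 1)
    unmasked = [
        score
        for i, score in enumerate(column_max)
        if abs(i - best_end) >= radius
    ]
    return max(unmasked) if unmasked else None
```

With a 40-character query and the default arguments, `effective_mask_length` returns 20, since half the query length exceeds the default mask_length of 15.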
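The precedence among score_only, score_filter, distance_filter, and override_skip_babp can be hard to follow in prose. The sketch below encodes one reading of the rules as stated above; it is not the library's implementation, and `returned_fields` along with its `score`, `query_len`, and `target_len` arguments are hypothetical names used for illustration.

```python
def returned_fields(score, query_len, target_len, score_only=False,
                    score_filter=None, distance_filter=None,
                    override_skip_babp=False):
    """Report whether BABP and the cigar would be returned.

    Assumption: this follows the parameter descriptions literally.
    """
    # score_only has the highest precedence.
    if score_only:
        return {"babp": False, "cigar": False}

    babp, cigar = True, True

    # score_filter suppresses both when the score is too low.
    if score_filter is not None and score < score_filter:
        babp, cigar = False, False

    # distance_filter needs BABP to be computed, and may drop the cigar.
    if distance_filter is not None:
        babp = True
        if query_len < distance_filter or target_len < distance_filter:
            cigar = False

    # override_skip_babp guarantees BABP (score_only was handled above).
    if override_skip_babp:
        babp = True

    return {"babp": babp, "cigar": cigar}
```

For example, under this reading a low score with score_filter set and override_skip_babp=True still yields BABP but no cigar.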
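Because substitution_matrix must enumerate every combination for the relevant sequence type, it can help to build the dict of dicts programmatically. The following is a minimal sketch for the nucleotide alphabet; `build_nucleotide_matrix` and `is_fully_enumerated` are hypothetical helpers, and scoring N like any other character is an assumption of this example, not documented behavior.

```python
NUCLEOTIDES = "ACGTN"


def build_nucleotide_matrix(match_score=2, mismatch_score=-3,
                            alphabet=NUCLEOTIDES):
    """Build a 2D dict covering every pair of characters in alphabet."""
    return {
        row: {
            col: (match_score if row == col else mismatch_score)
            for col in alphabet
        }
        for row in alphabet
    }


def is_fully_enumerated(matrix, alphabet=NUCLEOTIDES):
    """Check that every (row, col) pair of the alphabet is present."""
    return all(
        row in matrix and all(col in matrix[row] for col in alphabet)
        for row in alphabet
    )
```

A matrix built this way could then be passed as substitution_matrix, in which case match_score and mismatch_score are ignored as described above.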