GeneticCode.translate(sequence, reading_frame=1, start='ignore', stop='ignore')
Translate RNA sequence into protein sequence.
State: Stable as of 0.4.0.
Returns: Translated sequence.
Return type: Protein
Notes
Input RNA sequence metadata are included in the translated protein sequence. Positional metadata are not included.
Examples
Translate RNA into protein using NCBI’s standard genetic code (table ID 1, the default genetic code in scikit-bio):
>>> from skbio import RNA, GeneticCode
>>> rna = RNA('AGUAUUCUGCCACUGUAAGAA')
>>> sgc = GeneticCode.from_ncbi()
>>> sgc.translate(rna)
Protein
--------------------------
Stats:
length: 7
has gaps: False
has degenerates: False
has definites: True
has stops: True
--------------------------
0 SILPL*E
In this command, we used the default start behavior, which starts translation at the beginning of the reading frame, regardless of the presence of a start codon. If we specify “require”, translation will start at the first start codon in the reading frame (in this example, CUG), ignoring all prior positions:
>>> sgc.translate(rna, start='require')
Protein
--------------------------
Stats:
length: 5
has gaps: False
has degenerates: False
has definites: True
has stops: True
--------------------------
0 MPL*E
Note that CUG, which normally codes for leucine (L), is an alternative start codon in this genetic code. Since we specified “require” mode, the start codon is translated as methionine (M) rather than L. This behavior most closely matches the underlying biology, since translation is initiated with fMet and fMet doesn’t have a corresponding IUPAC character.
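The start-codon behavior described above can be sketched in plain Python. This is only an illustration and not scikit-bio's implementation: `translate_sketch` is a hypothetical helper that handles reading frame 1 only, and the table is NCBI table 1 written out by hand, with AUG, CUG and UUG as its start codons.

```python
from itertools import product

# NCBI translation table 1, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Start codons of table 1: the canonical AUG plus two alternatives.
START_CODONS = {"AUG", "CUG", "UUG"}


def translate_sketch(rna, start="ignore"):
    """Translate reading frame 1 of an RNA string (hypothetical helper)."""
    # Split into complete codons; a trailing partial codon is dropped.
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    if start == "require":
        for index, codon in enumerate(codons):
            if codon in START_CODONS:
                break
        else:
            raise ValueError("no start codon in reading frame")
        # Whatever the start codon normally encodes, report it as M.
        rest = "".join(CODON_TABLE[c] for c in codons[index + 1:])
        return "M" + rest
    return "".join(CODON_TABLE[c] for c in codons)


print(translate_sketch("AGUAUUCUGCCACUGUAAGAA"))                   # SILPL*E
print(translate_sketch("AGUAUUCUGCCACUGUAAGAA", start="require"))  # MPL*E
```

Only the first start codon is replaced with M. The second CUG in the sequence is an ordinary internal codon and is still translated as L.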
Translate the same RNA sequence, also specifying that translation terminate at the first stop codon in the reading frame:
>>> sgc.translate(rna, start='require', stop='require')
Protein
--------------------------
Stats:
length: 3
has gaps: False
has degenerates: False
has definites: True
has stops: False
--------------------------
0 MPL
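The effect of requiring a stop codon can be sketched on an already translated string. The helper name `trim_at_first_stop` is hypothetical, and using `*` to mark a stop simply follows the output shown above. It is not how scikit-bio implements the option internally.

```python
def trim_at_first_stop(protein):
    """Return the residues before the first '*' (hypothetical helper)."""
    stop_index = protein.find("*")
    if stop_index == -1:
        # Mirrors "require": a missing stop codon is treated as an error.
        raise ValueError("no stop codon in reading frame")
    return protein[:stop_index]


print(trim_at_first_stop("MPL*E"))  # MPL
```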
Passing “require” to both start and stop trims the translation to the CDS (and in fact requires that one is present in the reading frame). Changing the reading frame to 2 causes an exception to be raised because a start codon doesn’t exist in the reading frame:
>>> sgc.translate(rna, start='require', stop='require',
... reading_frame=2) # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
ValueError: ...
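To see why reading frame 2 fails, it helps to list the codons in each forward frame. The sketch below is a hypothetical helper, not part of scikit-bio. It assumes forward frames 1 to 3 only and the start codons of NCBI table 1 (AUG, CUG, UUG).

```python
# Start codons of NCBI table 1.
START_CODONS = {"AUG", "CUG", "UUG"}


def codons_in_frame(rna, reading_frame=1):
    """Split an RNA string into complete codons for forward frame 1, 2 or 3."""
    if reading_frame not in (1, 2, 3):
        raise ValueError("this sketch only handles forward frames 1-3")
    offset = reading_frame - 1
    return [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]


rna = "AGUAUUCUGCCACUGUAAGAA"
for frame in (1, 2):
    codons = codons_in_frame(rna, frame)
    starts = [codon for codon in codons if codon in START_CODONS]
    print(frame, codons, starts)
# 1 ['AGU', 'AUU', 'CUG', 'CCA', 'CUG', 'UAA', 'GAA'] ['CUG', 'CUG']
# 2 ['GUA', 'UUC', 'UGC', 'CAC', 'UGU', 'AAG'] []
```

Frame 2 contains no start codon at all, so requiring one leaves nothing to translate, which is consistent with the ValueError raised above.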