DNA.gc_frequency(relative=False)
Calculate frequency of G’s and C’s in the sequence.
State: Stable as of 0.4.0.
This calculates the minimum GC frequency: only the IUPAC characters G, C, and S (which stands for G or C) are counted. Other degenerate characters that could represent G or C are not counted.
| Parameters: | relative (bool, optional) – If False, return the frequency of G, C, and S characters (i.e., the count). If True, return the relative frequency, i.e., the proportion of G, C, and S characters in the sequence. In this case the sequence is degapped before the operation, so gap characters are not included when calculating the length of the sequence. |
|---|---|
| Returns: | Either frequency (count) or relative frequency (proportion), depending on relative. |
| Return type: | int or float |
Examples
>>> from skbio import DNA
>>> DNA('ACGT').gc_frequency()
2
>>> DNA('ACGT').gc_frequency(relative=True)
0.5
>>> DNA('ACGT--..').gc_frequency(relative=True)
0.5
>>> DNA('--..').gc_frequency(relative=True)
0
S means G or C, so it counts:
>>> DNA('ASST').gc_frequency()
2
Other degenerates don’t count:
>>> DNA('RYKMBDHVN').gc_frequency()
0
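The counting rule documented above can be sketched in plain Python, without scikit-bio. This is a minimal illustration, not the library's implementation: the function name `gc_frequency_sketch` and the constants `GC_CHARS` and `GAP_CHARS` are hypothetical, and the gap characters `-` and `.` are assumed from the examples above.

```python
# Hypothetical names for illustration only; not part of the skbio API.
GC_CHARS = frozenset("GCS")  # G, C, and the degenerate S (G or C)
GAP_CHARS = frozenset("-.")  # assumed gap characters, as in the examples above


def gc_frequency_sketch(sequence, relative=False):
    """Count G, C, and S in a plain string, mirroring the documented behavior."""
    count = sum(1 for ch in sequence if ch in GC_CHARS)
    if not relative:
        return count
    # Measure length after degapping so gaps do not dilute the proportion.
    degapped_length = sum(1 for ch in sequence if ch not in GAP_CHARS)
    if degapped_length == 0:
        # An all-gap (or empty) sequence has no characters to divide by.
        return 0
    return count / degapped_length
```

Calling `gc_frequency_sketch('ACGT--..', relative=True)` divides the count of 2 by the degapped length of 4, which is why the gapped and ungapped examples above give the same proportion.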