skbio.stats.distance.DissimilarityMatrix

class skbio.stats.distance.DissimilarityMatrix(data, ids=None)

Store dissimilarities between objects.

A DissimilarityMatrix instance stores a square, hollow, two-dimensional matrix of dissimilarities between objects. Objects could be, for example, samples or DNA sequences. A sequence of IDs accompanies the dissimilarities.

Methods are provided to read dissimilarity matrices from disk and write them back, and to perform common operations such as extracting dissimilarities by object ID.

Parameters:
  • data (array_like or DissimilarityMatrix) – One of: a square, hollow, two-dimensional numpy.ndarray of dissimilarities (floats); a structure that numpy.asarray can convert to such an array; or a one-dimensional vector of dissimilarities (floats) in the condensed form defined by scipy.spatial.distance.squareform. Can instead be a DissimilarityMatrix (or subclass) instance, in which case the instance’s data will be used. Data will be converted to a float dtype if necessary. No copy is made if data is already a numpy.ndarray with a float dtype.
  • ids (sequence of str, optional) – Sequence of strings to be used as object IDs. Its length must match the number of rows/columns in data. If None (the default), IDs will be monotonically increasing integers cast as strings, with numbering starting from zero, e.g., ('0', '1', '2', '3', ...).

See also

DistanceMatrix, scipy.spatial.distance.squareform

Notes

The dissimilarities are stored in redundant (square-form) format [1].

The data are not checked for symmetry, nor guaranteed/assumed to be symmetric.

References

[1] http://docs.scipy.org/doc/scipy/reference/spatial.distance.html

Attributes

T Transpose of the dissimilarity matrix.
data Array of dissimilarities.
default_write_format
dtype Data type of the dissimilarities.
ids Tuple of object IDs.
png Display heatmap in IPython Notebook as PNG.
shape Two-element tuple containing the dissimilarity matrix dimensions.
size Total number of elements in the dissimilarity matrix.
svg Display heatmap in IPython Notebook as SVG.

Built-ins

x in dm Check if the specified ID is in the dissimilarity matrix.
dm1 == dm2 Compare this dissimilarity matrix to another for equality.
dm[x] Slice into dissimilarity data by object ID or numpy indexing.
__init_subclass__ This method is called when a class is subclassed.
dm1 != dm2 Determine whether two dissimilarity matrices are not equal.
str(dm) Return a string representation of the dissimilarity matrix.

Methods

copy() Return a deep copy of the dissimilarity matrix.
filter(ids[, strict]) Filter the dissimilarity matrix by IDs.
from_iterable(iterable, metric[, key, keys]) Create DissimilarityMatrix from an iterable given a metric.
index(lookup_id) Return the index of the specified ID.
plot([cmap, title]) Create a heatmap of the dissimilarity matrix.
read(file[, format]) Create a new DissimilarityMatrix instance from a file.
redundant_form() Return an array of dissimilarities in redundant format.
to_data_frame() Create a pandas.DataFrame from this DissimilarityMatrix.
transpose() Return the transpose of the dissimilarity matrix.
write(file[, format]) Write an instance of DissimilarityMatrix to a file.