RDKit
Open-source cheminformatics and machine learning.
RDKit::MolOps Namespace Reference

Groups a variety of molecular query and transformation operations. More...

Classes

struct  AdjustQueryParameters
 

Functions

RDKIT_GRAPHMOL_EXPORT int countAtomElec (const Atom *at)
 return the number of electrons available on an atom to donate for More...
 
RDKIT_GRAPHMOL_EXPORT int getFormalCharge (const ROMol &mol)
 sums up all atomic formal charges and returns the result More...
 
RDKIT_GRAPHMOL_EXPORT bool atomHasConjugatedBond (const Atom *at)
 returns whether or not the given Atom is involved in a conjugated bond More...
 
RDKIT_GRAPHMOL_EXPORT unsigned int getMolFrags (const ROMol &mol, std::vector< int > &mapping)
 find fragments (disconnected components of the molecular graph) More...
 
RDKIT_GRAPHMOL_EXPORT unsigned int getMolFrags (const ROMol &mol, std::vector< std::vector< int >> &frags)
 find fragments (disconnected components of the molecular graph) More...
 
RDKIT_GRAPHMOL_EXPORT std::vector< boost::shared_ptr< ROMol > > getMolFrags (const ROMol &mol, bool sanitizeFrags=true, std::vector< int > *frags=0, std::vector< std::vector< int >> *fragsMolAtomMapping=0, bool copyConformers=true)
 splits a molecule into its component fragments More...
 
template<typename T >
RDKIT_GRAPHMOL_EXPORT std::map< T, boost::shared_ptr< ROMol > > getMolFragsWithQuery (const ROMol &mol, T(*query)(const ROMol &, const Atom *), bool sanitizeFrags=true, const std::vector< T > *whiteList=0, bool negateList=false)
 splits a molecule into pieces based on labels assigned using a query More...
 
RDKIT_GRAPHMOL_EXPORT double computeBalabanJ (const ROMol &mol, bool useBO=true, bool force=false, const std::vector< int > *bondPath=0, bool cacheIt=true)
 calculates Balaban's J index for the molecule More...
 
RDKIT_GRAPHMOL_EXPORT double computeBalabanJ (double *distMat, int nb, int nAts)
 
RDKIT_GRAPHMOL_EXPORT unsigned getNumAtomsWithDistinctProperty (const ROMol &mol, std::string prop)
 returns the number of atoms which have a particular property set More...
 
Ring finding and SSSR
RDKIT_GRAPHMOL_EXPORT int findSSSR (const ROMol &mol, std::vector< std::vector< int >> &res)
 finds a molecule's Smallest Set of Smallest Rings More...
 
RDKIT_GRAPHMOL_EXPORT int findSSSR (const ROMol &mol, std::vector< std::vector< int >> *res=0)
 
RDKIT_GRAPHMOL_EXPORT void fastFindRings (const ROMol &mol)
 use a DFS algorithm to identify ring bonds and atoms in a molecule More...
 
RDKIT_GRAPHMOL_EXPORT int symmetrizeSSSR (ROMol &mol, std::vector< std::vector< int >> &res)
 symmetrize the molecule's Smallest Set of Smallest Rings More...
 
RDKIT_GRAPHMOL_EXPORT int symmetrizeSSSR (ROMol &mol)
 
Shortest paths and other matrices
RDKIT_GRAPHMOL_EXPORT double * getAdjacencyMatrix (const ROMol &mol, bool useBO=false, int emptyVal=0, bool force=false, const char *propNamePrefix=0, const boost::dynamic_bitset<> *bondsToUse=0)
 returns a molecule's adjacency matrix More...
 
RDKIT_GRAPHMOL_EXPORT double * getDistanceMat (const ROMol &mol, bool useBO=false, bool useAtomWts=false, bool force=false, const char *propNamePrefix=0)
 Computes the molecule's topological distance matrix. More...
 
RDKIT_GRAPHMOL_EXPORT double * getDistanceMat (const ROMol &mol, const std::vector< int > &activeAtoms, const std::vector< const Bond *> &bonds, bool useBO=false, bool useAtomWts=false)
 Computes the molecule's topological distance matrix. More...
 
RDKIT_GRAPHMOL_EXPORT double * get3DDistanceMat (const ROMol &mol, int confId=-1, bool useAtomWts=false, bool force=false, const char *propNamePrefix=0)
 Computes the molecule's 3D distance matrix. More...
 
RDKIT_GRAPHMOL_EXPORT std::list< int > getShortestPath (const ROMol &mol, int aid1, int aid2)
 Find the shortest path between two atoms. More...
 
Stereochemistry
RDKIT_GRAPHMOL_EXPORT void cleanupChirality (RWMol &mol)
 removes bogus chirality markers (those on non-sp3 centers): More...
 
RDKIT_GRAPHMOL_EXPORT void assignChiralTypesFrom3D (ROMol &mol, int confId=-1, bool replaceExistingTags=true)
 Uses a conformer to assign ChiralType to a molecule's atoms. More...
 
RDKIT_GRAPHMOL_EXPORT void assignStereochemistryFrom3D (ROMol &mol, int confId=-1, bool replaceExistingTags=true)
 Uses a conformer to assign ChiralTypes to a molecule's atoms and stereo flags to its bonds. More...
 
RDKIT_GRAPHMOL_EXPORT void detectBondStereochemistry (ROMol &mol, int confId=-1)
 Uses a conformer to assign directionality to the single bonds around double bonds. More...
 
RDKIT_GRAPHMOL_EXPORT void setDoubleBondNeighborDirections (ROMol &mol, const Conformer *conf=NULL)
 
RDKIT_GRAPHMOL_EXPORT void assignStereochemistry (ROMol &mol, bool cleanIt=false, bool force=false, bool flagPossibleStereoCenters=false)
 Assign stereochemistry tags to atoms (i.e. R/S) and bonds (i.e. Z/E) More...
 
RDKIT_GRAPHMOL_EXPORT void removeStereochemistry (ROMol &mol)
 Removes all stereochemistry information from atoms (i.e. R/S) and bonds. More...
 
RDKIT_GRAPHMOL_EXPORT void findPotentialStereoBonds (ROMol &mol, bool cleanIt=false)
 finds bonds that could be cis/trans in a molecule and mark them as Bond::STEREOANY. More...
 

Dealing with hydrogens

enum  AdjustQueryWhichFlags {
  ADJUST_IGNORENONE = 0x0, ADJUST_IGNORECHAINS = 0x1, ADJUST_IGNORERINGS = 0x4, ADJUST_IGNOREDUMMIES = 0x2,
  ADJUST_IGNORENONDUMMIES = 0x8, ADJUST_IGNOREMAPPED = 0x10, ADJUST_IGNOREALL = 0xFFFFFFF
}
 
RDKIT_GRAPHMOL_EXPORT ROMoladdHs (const ROMol &mol, bool explicitOnly=false, bool addCoords=false, const UINT_VECT *onlyOnAtoms=NULL, bool addResidueInfo=false)
 returns a copy of a molecule with hydrogens added in as explicit Atoms More...
 
RDKIT_GRAPHMOL_EXPORT void addHs (RWMol &mol, bool explicitOnly=false, bool addCoords=false, const UINT_VECT *onlyOnAtoms=NULL, bool addResidueInfo=false)
 
RDKIT_GRAPHMOL_EXPORT ROMolremoveHs (const ROMol &mol, bool implicitOnly=false, bool updateExplicitCount=false, bool sanitize=true)
 returns a copy of a molecule with hydrogens removed More...
 
RDKIT_GRAPHMOL_EXPORT void removeHs (RWMol &mol, bool implicitOnly=false, bool updateExplicitCount=false, bool sanitize=true)
 
RDKIT_GRAPHMOL_EXPORT ROMolmergeQueryHs (const ROMol &mol, bool mergeUnmappedOnly=false)
 
RDKIT_GRAPHMOL_EXPORT void mergeQueryHs (RWMol &mol, bool mergeUnmappedOnly=false)
 
RDKIT_GRAPHMOL_EXPORT ROMoladjustQueryProperties (const ROMol &mol, const AdjustQueryParameters *params=NULL)
 returns a copy of a molecule with query properties adjusted More...
 
RDKIT_GRAPHMOL_EXPORT void adjustQueryProperties (RWMol &mol, const AdjustQueryParameters *params=NULL)
 
RDKIT_GRAPHMOL_EXPORT ROMolrenumberAtoms (const ROMol &mol, const std::vector< unsigned int > &newOrder)
 returns a copy of a molecule with the atoms renumbered More...
 

Sanitization

enum  SanitizeFlags {
  SANITIZE_NONE = 0x0, SANITIZE_CLEANUP = 0x1, SANITIZE_PROPERTIES = 0x2, SANITIZE_SYMMRINGS = 0x4,
  SANITIZE_KEKULIZE = 0x8, SANITIZE_FINDRADICALS = 0x10, SANITIZE_SETAROMATICITY = 0x20, SANITIZE_SETCONJUGATION = 0x40,
  SANITIZE_SETHYBRIDIZATION = 0x80, SANITIZE_CLEANUPCHIRALITY = 0x100, SANITIZE_ADJUSTHS = 0x200, SANITIZE_ALL = 0xFFFFFFF
}
 
enum  AromaticityModel {
  AROMATICITY_DEFAULT = 0x0, AROMATICITY_RDKIT = 0x1, AROMATICITY_SIMPLE = 0x2, AROMATICITY_MDL = 0x4,
  AROMATICITY_CUSTOM = 0xFFFFFFF
}
 Possible aromaticity models. More...
 
RDKIT_GRAPHMOL_EXPORT void sanitizeMol (RWMol &mol, unsigned int &operationThatFailed, unsigned int sanitizeOps=SANITIZE_ALL)
 carries out a collection of tasks for cleaning up a molecule and More...
 
RDKIT_GRAPHMOL_EXPORT void sanitizeMol (RWMol &mol)
 
RDKIT_GRAPHMOL_EXPORT int setAromaticity (RWMol &mol, AromaticityModel model=AROMATICITY_DEFAULT, int(*func)(RWMol &)=NULL)
 Sets up the aromaticity for a molecule. More...
 
RDKIT_GRAPHMOL_EXPORT void cleanUp (RWMol &mol)
 Designed to be called by the sanitizer to handle special cases before. More...
 
RDKIT_GRAPHMOL_EXPORT void assignRadicals (RWMol &mol)
 Called by the sanitizer to assign radical counts to atoms. More...
 
RDKIT_GRAPHMOL_EXPORT void adjustHs (RWMol &mol)
 adjust the number of implicit and explicit Hs for special cases More...
 
RDKIT_GRAPHMOL_EXPORT void Kekulize (RWMol &mol, bool markAtomsBonds=true, unsigned int maxBackTracks=100)
 Kekulizes the molecule. More...
 
RDKIT_GRAPHMOL_EXPORT void setConjugation (ROMol &mol)
 flags the molecule's conjugated bonds More...
 
RDKIT_GRAPHMOL_EXPORT void setHybridization (ROMol &mol)
 calculates and sets the hybridization of all a molecule's Stoms More...
 

Detailed Description

Groups a variety of molecular query and transformation operations.

Enumeration Type Documentation

◆ AdjustQueryWhichFlags

Enumerator
ADJUST_IGNORENONE 
ADJUST_IGNORECHAINS 
ADJUST_IGNORERINGS 
ADJUST_IGNOREDUMMIES 
ADJUST_IGNORENONDUMMIES 
ADJUST_IGNOREMAPPED 
ADJUST_IGNOREALL 

Definition at line 264 of file MolOps.h.

◆ AromaticityModel

Possible aromaticity models.

  • AROMATICITY_DEFAULT at the moment always uses AROMATICITY_RDKIT
  • AROMATICITY_RDKIT is the standard RDKit model (as documented in the RDKit Book)
  • AROMATICITY_SIMPLE only considers 5- and 6-membered simple rings (it does not consider the outer envelope of fused rings)
  • AROMATICITY_MDL
  • AROMATICITY_CUSTOM uses a caller-provided function
Enumerator
AROMATICITY_DEFAULT 

future proofing

AROMATICITY_RDKIT 
AROMATICITY_SIMPLE 
AROMATICITY_MDL 
AROMATICITY_CUSTOM 

use a function

Definition at line 414 of file MolOps.h.

◆ SanitizeFlags

Enumerator
SANITIZE_NONE 
SANITIZE_CLEANUP 
SANITIZE_PROPERTIES 
SANITIZE_SYMMRINGS 
SANITIZE_KEKULIZE 
SANITIZE_FINDRADICALS 
SANITIZE_SETAROMATICITY 
SANITIZE_SETCONJUGATION 
SANITIZE_SETHYBRIDIZATION 
SANITIZE_CLEANUPCHIRALITY 
SANITIZE_ADJUSTHS 
SANITIZE_ALL 

Definition at line 345 of file MolOps.h.

Function Documentation

◆ addHs() [1/2]

RDKIT_GRAPHMOL_EXPORT ROMol* RDKit::MolOps::addHs ( const ROMol mol,
bool  explicitOnly = false,
bool  addCoords = false,
const UINT_VECT onlyOnAtoms = NULL,
bool  addResidueInfo = false 
)

returns a copy of a molecule with hydrogens added in as explicit Atoms

Parameters
molthe molecule to add Hs to
explicitOnly(optional) if this true, only explicit Hs will be added
addCoords(optional) If this is true, estimates for the atomic coordinates of the added Hs will be used.
onlyOnAtoms(optional) if provided, this should be a vector of IDs of the atoms that will be considered for H addition.
addResidueInfo(optional) if this is true, add residue info to hydrogen atoms (useful for PDB files).
Returns
the new molecule

Notes:

  • it makes no sense to use the addCoords option if the molecule's heavy atoms don't already have coordinates.
  • the caller is responsible for deleteing the pointer this returns.

◆ addHs() [2/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::addHs ( RWMol mol,
bool  explicitOnly = false,
bool  addCoords = false,
const UINT_VECT onlyOnAtoms = NULL,
bool  addResidueInfo = false 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ adjustHs()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::adjustHs ( RWMol mol)

adjust the number of implicit and explicit Hs for special cases

Currently this:

  • modifies aromatic nitrogens so that, when appropriate, they have an explicit H marked (e.g. so that we get things like "c1cc[nH]cc1"
Parameters
molthe molecule of interest

Assumptions

  • this is called after the molecule has been sanitized, aromaticity has been perceived, and the implicit valence of everything has been calculated.

◆ adjustQueryProperties() [1/2]

RDKIT_GRAPHMOL_EXPORT ROMol* RDKit::MolOps::adjustQueryProperties ( const ROMol mol,
const AdjustQueryParameters params = NULL 
)

returns a copy of a molecule with query properties adjusted

Parameters
molthe molecule to adjust
paramscontrols the adjustments made
Returns
the new molecule

Referenced by RDKit::MolOps::AdjustQueryParameters::AdjustQueryParameters().

◆ adjustQueryProperties() [2/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::adjustQueryProperties ( RWMol mol,
const AdjustQueryParameters params = NULL 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ assignChiralTypesFrom3D()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::assignChiralTypesFrom3D ( ROMol mol,
int  confId = -1,
bool  replaceExistingTags = true 
)

Uses a conformer to assign ChiralType to a molecule's atoms.

Parameters
molthe molecule of interest
confIdthe conformer to use
replaceExistingTagsif this flag is true, any existing atomic chiral tags will be replaced

If the conformer provided is not a 3D conformer, nothing will be done.

◆ assignRadicals()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::assignRadicals ( RWMol mol)

Called by the sanitizer to assign radical counts to atoms.

◆ assignStereochemistry()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::assignStereochemistry ( ROMol mol,
bool  cleanIt = false,
bool  force = false,
bool  flagPossibleStereoCenters = false 
)

Assign stereochemistry tags to atoms (i.e. R/S) and bonds (i.e. Z/E)

Does the CIP stereochemistry assignment for the molecule's atoms (R/S) and double bond (Z/E). Chiral atoms will have a property '_CIPCode' indicating their chiral code.

Parameters
molthe molecule to use
cleanItif true, atoms with a chiral specifier that aren't actually chiral (e.g. atoms with duplicate substituents or only 2 substituents, etc.) will have their chiral code set to CHI_UNSPECIFIED. Bonds with STEREOCIS/STEREOTRANS specified that have duplicate substituents based upon the CIP atom ranks will be marked STEREONONE.
forcecauses the calculation to be repeated even if it has already been done
flagPossibleStereoCentersset the _ChiralityPossible property on atoms that are possible stereocenters

Notes:M

  • Throughout we assume that we're working with a hydrogen-suppressed graph.

◆ assignStereochemistryFrom3D()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::assignStereochemistryFrom3D ( ROMol mol,
int  confId = -1,
bool  replaceExistingTags = true 
)

Uses a conformer to assign ChiralTypes to a molecule's atoms and stereo flags to its bonds.

Parameters
molthe molecule of interest
confIdthe conformer to use
replaceExistingTagsif this flag is true, any existing info about stereochemistry will be replaced

If the conformer provided is not a 3D conformer, nothing will be done.

◆ atomHasConjugatedBond()

RDKIT_GRAPHMOL_EXPORT bool RDKit::MolOps::atomHasConjugatedBond ( const Atom at)

returns whether or not the given Atom is involved in a conjugated bond

◆ cleanUp()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::cleanUp ( RWMol mol)

Designed to be called by the sanitizer to handle special cases before.

Currently this:

  • modifies nitro groups, so that the nitrogen does not have an unreasonable valence of 5, as follows:
    • the nitrogen gets a positive charge
    • one of the oxygens gets a negative chage and the double bond to this oxygen is changed to a single bond The net result is that nitro groups can be counted on to be: "[N+](=O)[O-]"
  • modifies halogen-oxygen containing species as follows: [Cl,Br,I](=O)(=O)(=O)O -> [X+3]([O-])([O-])([O-])O [Cl,Br,I](=O)(=O)O -> [X+3]([O-])([O-])O [Cl,Br,I](=O)O -> [X+]([O-])O
  • converts the substructure [N,C]=P(=O)-* to [N,C]=[P+](-[O-])-*
Parameters
molthe molecule of interest

◆ cleanupChirality()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::cleanupChirality ( RWMol mol)

removes bogus chirality markers (those on non-sp3 centers):

◆ computeBalabanJ() [1/2]

RDKIT_GRAPHMOL_EXPORT double RDKit::MolOps::computeBalabanJ ( const ROMol mol,
bool  useBO = true,
bool  force = false,
const std::vector< int > *  bondPath = 0,
bool  cacheIt = true 
)

calculates Balaban's J index for the molecule

Parameters
molthe molecule of interest
useBOtoggles inclusion of the bond order in the calculation (when false, we're not really calculating the J value)
forceforces the calculation (instead of using cached results)
bondPathwhen included, only paths using bonds whose indices occur in this vector will be included in the calculation
cacheItIf this is true, the calculated value will be cached as a property on the molecule
Returns
the J index

◆ computeBalabanJ() [2/2]

RDKIT_GRAPHMOL_EXPORT double RDKit::MolOps::computeBalabanJ ( double *  distMat,
int  nb,
int  nAts 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ countAtomElec()

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::countAtomElec ( const Atom at)

return the number of electrons available on an atom to donate for

The result is determined using the default valency, number of lone pairs, number of bonds and the formal charge. Note that the atom may not donate all of these electrons to a ring for aromaticity (also used in Conjugation and hybridization code).

Parameters
atthe atom of interest
Returns
the number of electrons

◆ detectBondStereochemistry()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::detectBondStereochemistry ( ROMol mol,
int  confId = -1 
)

Uses a conformer to assign directionality to the single bonds around double bonds.

Parameters
molthe molecule of interest
confIdthe conformer to use

◆ fastFindRings()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::fastFindRings ( const ROMol mol)

use a DFS algorithm to identify ring bonds and atoms in a molecule

NOTE: though the RingInfo structure is populated by this function, the only really reliable calls that can be made are to check if mol.getRingInfo().numAtomRings(idx) or mol.getRingInfo().numBondRings(idx) return values >0

◆ findPotentialStereoBonds()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::findPotentialStereoBonds ( ROMol mol,
bool  cleanIt = false 
)

finds bonds that could be cis/trans in a molecule and mark them as Bond::STEREOANY.

Parameters
molthe molecule of interest
cleanIttoggles removal of stereo flags from double bonds that can not have stereochemistry

This function finds any double bonds that can potentially be part of a cis/trans system. No attempt is made here to mark them cis or trans. No attempt is made to detect double bond stereo in ring systems.

This function is useful in the following situations:

  • when parsing a mol file; for the bonds marked here, coordinate information on the neighbors can be used to indentify cis or trans states
  • when writing a mol file; bonds that can be cis/trans but not marked as either need to be specially marked in the mol file
  • finding double bonds with unspecified stereochemistry so they can be enumerated for downstream 3D tools

The CIPranks on the neighboring atoms are checked in this function. The _CIPCode property if set to any on the double bond.

◆ findSSSR() [1/2]

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::findSSSR ( const ROMol mol,
std::vector< std::vector< int >> &  res 
)

finds a molecule's Smallest Set of Smallest Rings

Currently this implements a modified form of Figueras algorithm (JCICS - Vol. 36, No. 5, 1996, 986-991)

Parameters
molthe molecule of interest
resused to return the vector of rings. Each entry is a vector with atom indices. This information is also stored in the molecule's RingInfo structure, so this argument is optional (see overload)
Returns
number of smallest rings found

Base algorithm:

  • The original algorithm starts by finding representative degree 2 nodes.
  • Representative because if a series of deg 2 nodes are found only one of them is picked.
  • The smallest ring around each of them is found.
  • The bonds that connect to this degree 2 node are them chopped off, yielding new deg two nodes
  • The process is repeated on the new deg 2 nodes.
  • If no deg 2 nodes are found, a deg 3 node is picked. The smallest ring with it is found. A bond from this is "carefully" (look in the paper) selected and chopped, yielding deg 2 nodes. The process is same as above once this is done.

Our Modifications:

  • If available, more than one smallest ring around a representative deg 2 node will be computed and stored
  • Typically 3 rings are found around a degree 3 node (when no deg 2s are available) and all the bond to that node are chopped.
  • The extra rings that were found in this process are removed after all the nodes have been covered.

These changes were motivated by several factors:

  • We believe the original algorithm fails to find the correct SSSR (finds the correct number of them but the wrong ones) on some sample mols
  • Since SSSR may not be unique, a post-SSSR step to symmetrize may be done. The extra rings this process adds can be quite useful.

◆ findSSSR() [2/2]

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::findSSSR ( const ROMol mol,
std::vector< std::vector< int >> *  res = 0 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ get3DDistanceMat()

RDKIT_GRAPHMOL_EXPORT double* RDKit::MolOps::get3DDistanceMat ( const ROMol mol,
int  confId = -1,
bool  useAtomWts = false,
bool  force = false,
const char *  propNamePrefix = 0 
)

Computes the molecule's 3D distance matrix.

Parameters
molthe molecule of interest
confIdthe conformer to use
useAtomWtssets the diagonal elements of the result to 6.0/(atomic number)
forceforces calculation of the matrix, even if already computed
propNamePrefixused to set the cached property name (if set to an empty string, the matrix will not be cached)
Returns
the distance matrix.

Notes

  • If propNamePrefix is not empty the result of this is cached in the molecule's local property dictionary, which will handle deallocation. In other cases the caller is responsible for freeing the memory.

◆ getAdjacencyMatrix()

RDKIT_GRAPHMOL_EXPORT double* RDKit::MolOps::getAdjacencyMatrix ( const ROMol mol,
bool  useBO = false,
int  emptyVal = 0,
bool  force = false,
const char *  propNamePrefix = 0,
const boost::dynamic_bitset<> *  bondsToUse = 0 
)

returns a molecule's adjacency matrix

Parameters
molthe molecule of interest
useBOtoggles use of bond orders in the matrix
emptyValsets the empty value (for non-adjacent atoms)
forceforces calculation of the matrix, even if already computed
propNamePrefixused to set the cached property name
Returns
the adjacency matrix.

Notes

  • The result of this is cached in the molecule's local property dictionary, which will handle deallocation. The caller should not delete this pointer.

◆ getDistanceMat() [1/2]

RDKIT_GRAPHMOL_EXPORT double* RDKit::MolOps::getDistanceMat ( const ROMol mol,
bool  useBO = false,
bool  useAtomWts = false,
bool  force = false,
const char *  propNamePrefix = 0 
)

Computes the molecule's topological distance matrix.

Uses the Floyd-Warshall all-pairs-shortest-paths algorithm.

Parameters
molthe molecule of interest
useBOtoggles use of bond orders in the matrix
useAtomWtssets the diagonal elements of the result to 6.0/(atomic number) so that the matrix can be used to calculate Balaban J values. This does not affect the bond weights.
forceforces calculation of the matrix, even if already computed
propNamePrefixused to set the cached property name
Returns
the distance matrix.

Notes

  • The result of this is cached in the molecule's local property dictionary, which will handle deallocation. The caller should not delete this pointer.

◆ getDistanceMat() [2/2]

RDKIT_GRAPHMOL_EXPORT double* RDKit::MolOps::getDistanceMat ( const ROMol mol,
const std::vector< int > &  activeAtoms,
const std::vector< const Bond *> &  bonds,
bool  useBO = false,
bool  useAtomWts = false 
)

Computes the molecule's topological distance matrix.

Uses the Floyd-Warshall all-pairs-shortest-paths algorithm.

Parameters
molthe molecule of interest
activeAtomsonly elements corresponding to these atom indices will be included in the calculation
bondsonly bonds found in this list will be included in the calculation
useBOtoggles use of bond orders in the matrix
useAtomWtssets the diagonal elements of the result to 6.0/(atomic number) so that the matrix can be used to calculate Balaban J values. This does not affect the bond weights.
Returns
the distance matrix.

Notes

  • The results of this call are not cached, the caller should delete this pointer.

◆ getFormalCharge()

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::getFormalCharge ( const ROMol mol)

sums up all atomic formal charges and returns the result

◆ getMolFrags() [1/3]

RDKIT_GRAPHMOL_EXPORT unsigned int RDKit::MolOps::getMolFrags ( const ROMol mol,
std::vector< int > &  mapping 
)

find fragments (disconnected components of the molecular graph)

Parameters
molthe molecule of interest
mappingused to return the mapping of Atoms->fragments. On return mapping will be mol->getNumAtoms() long and will contain the fragment assignment for each Atom
Returns
the number of fragments found.

◆ getMolFrags() [2/3]

RDKIT_GRAPHMOL_EXPORT unsigned int RDKit::MolOps::getMolFrags ( const ROMol mol,
std::vector< std::vector< int >> &  frags 
)

find fragments (disconnected components of the molecular graph)

Parameters
molthe molecule of interest
fragsused to return the Atoms in each fragment On return mapping will be numFrags long, and each entry will contain the indices of the Atoms in that fragment.
Returns
the number of fragments found.

◆ getMolFrags() [3/3]

RDKIT_GRAPHMOL_EXPORT std::vector<boost::shared_ptr<ROMol> > RDKit::MolOps::getMolFrags ( const ROMol mol,
bool  sanitizeFrags = true,
std::vector< int > *  frags = 0,
std::vector< std::vector< int >> *  fragsMolAtomMapping = 0,
bool  copyConformers = true 
)

splits a molecule into its component fragments

Parameters
molthe molecule of interest
sanitizeFragstoggles sanitization of the fragments after they are built
fragsused to return the mapping of Atoms->fragments. if provided, frags will be mol->getNumAtoms() long on return and will contain the fragment assignment for each Atom
fragsMolAtomMappingused to return the Atoms in each fragment On return mapping will be numFrags long, and each entry will contain the indices of the Atoms in that fragment.
copyConformerstoggles copying conformers of the fragments after they are built
Returns
a vector of the fragments as smart pointers to ROMols

◆ getMolFragsWithQuery()

template<typename T >
RDKIT_GRAPHMOL_EXPORT std::map<T, boost::shared_ptr<ROMol> > RDKit::MolOps::getMolFragsWithQuery ( const ROMol mol,
T(*)(const ROMol &, const Atom *)  query,
bool  sanitizeFrags = true,
const std::vector< T > *  whiteList = 0,
bool  negateList = false 
)

splits a molecule into pieces based on labels assigned using a query

Parameters
molthe molecule of interest
querythe query used to "label" the molecule for fragmentation
sanitizeFragstoggles sanitization of the fragments after they are built
whiteListif provided, only labels in the list will be kept
negateListif true, the white list logic will be inverted: only labels not in the list will be kept
Returns
a map of the fragments and their labels

◆ getNumAtomsWithDistinctProperty()

RDKIT_GRAPHMOL_EXPORT unsigned RDKit::MolOps::getNumAtomsWithDistinctProperty ( const ROMol mol,
std::string  prop 
)

returns the number of atoms which have a particular property set

◆ getShortestPath()

RDKIT_GRAPHMOL_EXPORT std::list<int> RDKit::MolOps::getShortestPath ( const ROMol mol,
int  aid1,
int  aid2 
)

Find the shortest path between two atoms.

Uses the Bellman-Ford algorithm

Parameters
molmolecule of interest
aid1index of the first atom
aid2index of the second atom
Returns
an std::list with the indices of the atoms along the shortest path

Notes:

  • the starting and end atoms are included in the path
  • if no path is found, an empty path is returned

◆ Kekulize()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::Kekulize ( RWMol mol,
bool  markAtomsBonds = true,
unsigned int  maxBackTracks = 100 
)

Kekulizes the molecule.

Parameters
molthe molecule of interest
markAtomsBondsif this is set to true, isAromatic boolean settings on both the Bonds and Atoms are turned to false following the Kekulization, otherwise they are left alone in their original state.
maxBackTracksthe maximum number of attempts at back-tracking. The algorithm uses a back-tracking procedure to revist a previous setting of double bond if we hit a wall in the kekulization process

Notes:

◆ mergeQueryHs() [1/2]

RDKIT_GRAPHMOL_EXPORT ROMol* RDKit::MolOps::mergeQueryHs ( const ROMol mol,
bool  mergeUnmappedOnly = false 
)

returns a copy of a molecule with hydrogens removed and added as queries to the heavy atoms to which they are bound.

This is really intended to be used with molecules that contain QueryAtoms

Parameters
molthe molecule to remove Hs from
Returns
the new molecule

Notes:

  • Atoms that do not already have hydrogen count queries will have one added, other H-related queries will not be touched. Examples:
    • C[H] -> [C;!H0]
    • [C;H1][H] -> [C;H1]
    • [C;H2][H] -> [C;H2]
  • Hydrogens which aren't connected to a heavy atom will not be removed. This prevents molecules like "[H][H]" from having all atoms removed.
  • the caller is responsible for deleteing the pointer this returns.
  • By default all hydrogens are removed, however if mergeUnmappedOnly is true, any hydrogen participating in an atom map will be retained

◆ mergeQueryHs() [2/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::mergeQueryHs ( RWMol mol,
bool  mergeUnmappedOnly = false 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ removeHs() [1/2]

RDKIT_GRAPHMOL_EXPORT ROMol* RDKit::MolOps::removeHs ( const ROMol mol,
bool  implicitOnly = false,
bool  updateExplicitCount = false,
bool  sanitize = true 
)

returns a copy of a molecule with hydrogens removed

Parameters
molthe molecule to remove Hs from
implicitOnly(optional) if this true, only implicit Hs will be removed
updateExplicitCount(optional) If this is true, when explicit Hs are removed from the graph, the heavy atom to which they are bound will have its counter of explicit Hs increased.
sanitize(optional) If this is true, the final molecule will be sanitized
Returns
the new molecule

Notes:

  • Hydrogens which aren't connected to a heavy atom will not be removed. This prevents molecules like "[H][H]" from having all atoms removed.
  • Labelled hydrogen (e.g. atoms with atomic number=1, but mass > 1), will not be removed.
  • two coordinate Hs, like the central H in C[H-]C, will not be removed
  • Hs connected to dummy atoms will not be removed
  • Hs that are part of the definition of double bond Stereochemistry will not be removed
  • Hs that are not connected to anything else will not be removed
  • the caller is responsible for deleteing the pointer this returns.

Referenced by RDKit::ForwardSDMolSupplier::ForwardSDMolSupplier(), RDKit::SDMolSupplier::SDMolSupplier(), and RDKit::SDMolSupplier::~SDMolSupplier().

◆ removeHs() [2/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::removeHs ( RWMol mol,
bool  implicitOnly = false,
bool  updateExplicitCount = false,
bool  sanitize = true 
)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ removeStereochemistry()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::removeStereochemistry ( ROMol mol)

Removes all stereochemistry information from atoms (i.e. R/S) and bonds.

Parameters
molthe molecule of interest

◆ renumberAtoms()

RDKIT_GRAPHMOL_EXPORT ROMol* RDKit::MolOps::renumberAtoms ( const ROMol mol,
const std::vector< unsigned int > &  newOrder 
)

returns a copy of a molecule with the atoms renumbered

Parameters
molthe molecule to work with
newOrderthe new ordering of the atoms (should be numAtoms long) for example: if newOrder is [3,2,0,1], then atom 3 in the original molecule will be atom 0 in the new one
Returns
the new molecule

Notes:

  • the caller is responsible for deleteing the pointer this returns.

Referenced by RDKit::MolOps::AdjustQueryParameters::AdjustQueryParameters().

◆ sanitizeMol() [1/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::sanitizeMol ( RWMol mol,
unsigned int &  operationThatFailed,
unsigned int  sanitizeOps = SANITIZE_ALL 
)

carries out a collection of tasks for cleaning up a molecule and

that it makes "chemical sense"

This functions calls the following in sequence

  1. MolOps::cleanUp()
  2. mol.updatePropertyCache()
  3. MolOps::symmetrizeSSSR()
  4. MolOps::Kekulize()
  5. MolOps::assignRadicals()
  6. MolOps::setAromaticity()
  7. MolOps::setConjugation()
  8. MolOps::setHybridization()
  9. MolOps::cleanupChirality()
  10. MolOps::adjustHs()
Parameters
mol: the RWMol to be cleaned
operationThatFailed: the first (if any) sanitization operation that fails is set here. The values are taken from the SanitizeFlags enum. On success, the value is SanitizeFlags::SANITIZE_NONE
sanitizeOps: the bits here are used to set which sanitization operations are carried out. The elements of the SanitizeFlags enum define the operations.

Notes:

  • If there is a failure in the sanitization, a SanitException will be thrown.
  • in general the user of this function should cast the molecule following this function to a ROMol, so that new atoms and bonds cannot be added to the molecule and screw up the sanitizing that has been done here

◆ sanitizeMol() [2/2]

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::sanitizeMol ( RWMol mol)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ setAromaticity()

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::setAromaticity ( RWMol mol,
AromaticityModel  model = AROMATICITY_DEFAULT,
int(*)(RWMol &)  func = NULL 
)

Sets up the aromaticity for a molecule.

This is what happens here:

  1. find all the simple rings by calling the findSSSR function
  2. loop over all the Atoms in each ring and mark them if they are candidates for aromaticity. A ring atom is a candidate if it can spare electrons to the ring and if it's from the first two rows of the periodic table.
  3. based on the candidate atoms, mark the rings to be either candidates or non-candidates. A ring is a candidate only if all its atoms are candidates
  4. apply Hueckel rule to each of the candidate rings to check if the ring can be aromatic
Parameters
molthe RWMol of interest
modelthe aromaticity model to use
funca custom function for assigning aromaticity (only used when model=AROMATICITY_CUSTOM)
Returns
>0 on success, <= 0 otherwise

Assumptions:

◆ setConjugation()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::setConjugation ( ROMol mol)

flags the molecule's conjugated bonds

◆ setDoubleBondNeighborDirections()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::setDoubleBondNeighborDirections ( ROMol mol,
const Conformer conf = NULL 
)

◆ setHybridization()

RDKIT_GRAPHMOL_EXPORT void RDKit::MolOps::setHybridization ( ROMol mol)

calculates and sets the hybridization of all a molecule's Stoms

◆ symmetrizeSSSR() [1/2]

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::symmetrizeSSSR ( ROMol mol,
std::vector< std::vector< int >> &  res 
)

symmetrize the molecule's Smallest Set of Smallest Rings

SSSR rings obatined from "findSSSR" can be non-unique in some case. For example, cubane has five SSSR rings, not six as one would hope.

This function adds additional rings to the SSSR list if necessary to make the list symmetric, e.g. all atoms in cubane will be part of the same number of SSSRs. This function choses these extra rings from the extra rings computed and discarded during findSSSR. The new ring are chosen such that:

  • replacing a same sized ring in the SSSR list with an extra ring yields the same union of bond IDs as the orignal SSSR list
Parameters
mol- the molecule of interest
resused to return the vector of rings. Each entry is a vector with atom indices. This information is also stored in the molecule's RingInfo structure, so this argument is optional (see overload)
Returns
the total number of rings = (new rings + old SSSRs)

Notes:

◆ symmetrizeSSSR() [2/2]

RDKIT_GRAPHMOL_EXPORT int RDKit::MolOps::symmetrizeSSSR ( ROMol mol)

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.