RDKit
Open-source cheminformatics and machine learning.
Substructure search of a library of molecules.
#include <SubstructLibrary.h>
Public Member Functions | |
SubstructLibrary () | |
SubstructLibrary (boost::shared_ptr< MolHolderBase > molecules) | |
SubstructLibrary (boost::shared_ptr< MolHolderBase > molecules, boost::shared_ptr< FPHolderBase > fingerprints) | |
MolHolderBase & | getMolHolder () |
Get the underlying molecule holder implementation. | |
const MolHolderBase & | getMolecules () const |
FPHolderBase & | getFingerprints () |
Get the underlying fingerprint implementation. | |
const FPHolderBase & | getFingerprints () const |
unsigned int | addMol (const ROMol &mol) |
Add a molecule to the library. | |
std::vector< unsigned int > | getMatches (const ROMol &query, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1, int maxResults=-1) |
Get the matching indices for the query. | |
std::vector< unsigned int > | getMatches (const ROMol &query, unsigned int startIdx, unsigned int endIdx, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1, int maxResults=-1) |
Get the matching indices for the query between the given indices. | |
unsigned int | countMatches (const ROMol &query, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1) |
Return the number of matches for the query. | |
unsigned int | countMatches (const ROMol &query, unsigned int startIdx, unsigned int endIdx, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1) |
Return the number of matches for the query between the given indices. | |
bool | hasMatch (const ROMol &query, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1) |
Returns true if any match exists for the query. | |
bool | hasMatch (const ROMol &query, unsigned int startIdx, unsigned int endIdx, bool recursionPossible=true, bool useChirality=true, bool useQueryQueryMatches=false, int numThreads=-1) |
boost::shared_ptr< ROMol > | getMol (unsigned int idx) const |
Returns the molecule at the given index. | |
boost::shared_ptr< ROMol > | operator[] (unsigned int idx) |
Returns the molecule at the given index. | |
unsigned int | size () const |
Returns the number of molecules in the library. | |
Substructure search of a library of molecules.
This class allows for multithreaded substructure searches of large datasets.
The implementations can use fingerprints to speed up searches and have molecules cached as binary forms to reduce memory usage.
basic usage:
Using different mol holders and pattern fingerprints.
Cached molecule holders create molecules on demand. There are currently three styles of cached molecules.
CachedMolHolder: stores molecules in the RDKit binary format.
CachedSmilesMolHolder: stores molecules in SMILES format.
CachedTrustedSmilesMolHolder: stores molecules in SMILES format that is treated as trusted.
The CachedTrustedSmilesMolHolder is intended for molecules from a trusted source. It assumes that RDKit was used to sanitize and canonicalize each SMILES string. In practice this is considerably faster than using arbitrary SMILES strings, since molecules can be rebuilt without repeating the full sanitization step.
When loading from external data, as opposed to using the "addMol" API, care must be taken to ensure that the pattern fingerprints and SMILES are synchronized, so that the fingerprint at a given index belongs to the molecule at that index.
Each pattern holder has an API point for making its fingerprint. This is useful to ensure that the pattern stored in the database will be compatible with the patterns made when analyzing queries.
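Why stored and query fingerprints must be generated the same way is easiest to see from how a fingerprint prescreen typically works. The sketch below is a standard-library toy model, not RDKit code: the `Fingerprint` alias, `mayContain`, and `screenCandidates` are hypothetical names, and real pattern fingerprints are much longer bit vectors. A molecule remains a candidate only if every bit set in the query fingerprint is also set in the molecule fingerprint.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Toy fingerprint type; the width of 64 bits is chosen only for brevity.
using Fingerprint = std::bitset<64>;

// A molecule can only contain the query if the query's bits are a subset of
// the molecule's bits. Passing the screen does not prove a match; failing it
// rules one out.
bool mayContain(const Fingerprint &molFp, const Fingerprint &queryFp) {
  return (molFp & queryFp) == queryFp;
}

// Returns the indices that survive the screen and still need a full
// substructure match.
std::vector<std::size_t> screenCandidates(
    const std::vector<Fingerprint> &libraryFps, const Fingerprint &queryFp) {
  std::vector<std::size_t> candidates;
  for (std::size_t i = 0; i < libraryFps.size(); ++i) {
    if (mayContain(libraryFps[i], queryFp)) {
      candidates.push_back(i);
    }
  }
  return candidates;
}
```

If the library and the query used different fingerprint schemes, the subset test would compare unrelated bits and could wrongly discard true matches.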
Definition at line 342 of file SubstructLibrary.h.
RDKit::SubstructLibrary::SubstructLibrary () |
inline |
Definition at line 349 of file SubstructLibrary.h.
RDKit::SubstructLibrary::SubstructLibrary (boost::shared_ptr< MolHolderBase > molecules) |
inline |
Definition at line 355 of file SubstructLibrary.h.
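RDKit::SubstructLibrary::SubstructLibrary (boost::shared_ptr< MolHolderBase > molecules, boost::shared_ptr< FPHolderBase > fingerprints) |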
inline |
Definition at line 358 of file SubstructLibrary.h.
unsigned int RDKit::SubstructLibrary::addMol | ( | const ROMol & | mol | ) |
Add a molecule to the library.
mol | Molecule to add |
Returns the index of the molecule in the library.
unsigned int RDKit::SubstructLibrary::countMatches | ( | const ROMol & | query, |
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 |
||
) |
Return the number of matches for the query.
query | Query to match against molecules |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
unsigned int RDKit::SubstructLibrary::countMatches | ( | const ROMol & | query, |
unsigned int | startIdx, | ||
unsigned int | endIdx, | ||
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 |
||
) |
Return the number of matches for the query between the given indices.
query | Query to match against molecules |
startIdx | Start index of the search |
endIdx | Ending idx (non-inclusive) of the search. |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
FPHolderBase & RDKit::SubstructLibrary::getFingerprints () |
inline |
Get the underlying fingerprint implementation.
Throws a value error if no fingerprints have been set.
Definition at line 378 of file SubstructLibrary.h.
const FPHolderBase & RDKit::SubstructLibrary::getFingerprints () const |
inline |
Definition at line 384 of file SubstructLibrary.h.
std::vector<unsigned int> RDKit::SubstructLibrary::getMatches | ( | const ROMol & | query, |
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 , |
||
int | maxResults = -1 |
||
) |
Get the matching indices for the query.
query | Query to match against molecules |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
maxResults | Maximum results to return, -1 means return all [default -1] |
std::vector<unsigned int> RDKit::SubstructLibrary::getMatches | ( | const ROMol & | query, |
unsigned int | startIdx, | ||
unsigned int | endIdx, | ||
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 , |
||
int | maxResults = -1 |
||
) |
Get the matching indices for the query between the given indices.
query | Query to match against molecules |
startIdx | Start index of the search |
endIdx | Ending idx (non-inclusive) of the search. |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
maxResults | Maximum results to return, -1 means return all [default -1] |
boost::shared_ptr< ROMol > RDKit::SubstructLibrary::getMol (unsigned int idx) const |
inline |
Returns the molecule at the given index.
idx | Index of the molecule in the library |
Definition at line 514 of file SubstructLibrary.h.
References RDKit::MolHolderBase::getMol(), and PRECONDITION.
const MolHolderBase & RDKit::SubstructLibrary::getMolecules () const |
inline |
Definition at line 371 of file SubstructLibrary.h.
References PRECONDITION.
MolHolderBase & RDKit::SubstructLibrary::getMolHolder () |
inline |
Get the underlying molecule holder implementation.
Definition at line 366 of file SubstructLibrary.h.
References PRECONDITION.
bool RDKit::SubstructLibrary::hasMatch | ( | const ROMol & | query, |
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 |
||
) |
Returns true if any match exists for the query.
query | Query to match against molecules |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
bool RDKit::SubstructLibrary::hasMatch | ( | const ROMol & | query, |
unsigned int | startIdx, | ||
unsigned int | endIdx, | ||
bool | recursionPossible = true , |
||
bool | useChirality = true , |
||
bool | useQueryQueryMatches = false , |
||
int | numThreads = -1 |
||
) |
Returns true if any match exists for the query between the given indices.
query | Query to match against molecules |
startIdx | Start index of the search |
endIdx | Ending idx (non-inclusive) of the search. |
recursionPossible | flags whether or not recursive matches are allowed [ default true ] |
useChirality | use atomic CIP codes as part of the comparison [ default true ] |
useQueryQueryMatches | if set, the contents of atom and bond queries will be used as part of the matching [ default false ] |
numThreads | If -1 use all available processors [default -1] |
boost::shared_ptr< ROMol > RDKit::SubstructLibrary::operator[] (unsigned int idx) |
inline |
Returns the molecule at the given index.
idx | Index of the molecule in the library |
Definition at line 524 of file SubstructLibrary.h.
References RDKit::MolHolderBase::getMol(), and PRECONDITION.
unsigned int RDKit::SubstructLibrary::size () const |
inline |
Returns the number of molecules in the library.
Definition at line 531 of file SubstructLibrary.h.
References PRECONDITION.