command line utility for working with FragmentCatalogs (CASE-type analysis)

**Usage**

BuildFragmentCatalog [optional args] <filename>

filename, the name of a delimited text file containing InData, is required for some modes of operation (see below)

**Command Line Arguments**

- -n *maxNumMols*: specify the maximum number of molecules to be processed
- -b: build the catalog and OnBitLists *requires InData*
- -s: score compounds *requires InData and a Catalog, can use OnBitLists*
- -g: calculate info gains *requires Scores*
- -d: show details about high-ranking fragments *requires a Catalog and Gains*
- --catalog=*filename*: filename with the pickled catalog. If -b is provided, this file will be overwritten.
- --onbits=*filename*: filename to hold the pickled OnBitLists. If -b is provided, this file will be overwritten.
- --scores=*filename*: filename to hold the text score data. If -s is provided, this file will be overwritten.
- --gains=*filename*: filename to hold the text gains data. If -g is provided, this file will be overwritten.
- --details=*filename*: filename to hold the text details data. If -d is provided, this file will be overwritten.
- --minPath=2: specify the minimum length for a path
- --maxPath=6: specify the maximum length for a path
- --smiCol=1: specify which column in the input data file contains SMILES
- --actCol=-1: specify which column in the input data file contains activities
- --nActs=2: specify the number of possible activity values
- --nBits=-1: specify the maximum number of bits to show details for

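The option list above maps naturally onto Python's standard `getopt` module. The following is only a sketch of how such flags could be parsed, not the utility's own argument handling (which is not shown on this page). The function name `parse_options`, the dictionary keys, and the `File` suffix are assumptions made for illustration; the option names and default values are taken from the list above.

```python
import getopt

# Option names come from the documented argument list; everything else
# (key names, the "File" suffix, the function name) is hypothetical.
SHORT_OPTS = "n:bsgd"
FILE_OPTS = ["catalog", "onbits", "scores", "gains", "details"]
INT_OPTS = ["minPath", "maxPath", "smiCol", "actCol", "nActs", "nBits"]
LONG_OPTS = [name + "=" for name in FILE_OPTS + INT_OPTS]
MODE_FLAGS = {"-b": "doBuild", "-s": "doScore", "-g": "doGains", "-d": "doDetails"}


def parse_options(argv):
    """Parse an argument list (without the program name).

    Returns (options, positional_args).
    """
    # Defaults mirror the values shown in the documented argument list.
    opts = {
        "maxNumMols": None,
        "doBuild": False, "doScore": False, "doGains": False, "doDetails": False,
        "minPath": 2, "maxPath": 6, "smiCol": 1,
        "actCol": -1, "nActs": 2, "nBits": -1,
    }
    parsed, positional = getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)
    for name, value in parsed:
        if name in MODE_FLAGS:
            opts[MODE_FLAGS[name]] = True
        elif name == "-n":
            opts["maxNumMols"] = int(value)
        else:
            key = name[2:]  # strip the leading "--"
            if key in INT_OPTS:
                opts[key] = int(value)
            else:
                opts[key + "File"] = value
    return opts, positional


opts, positional = parse_options(
    ["-n", "100", "-b", "--catalog=frags.pkl", "--minPath=3", "data.txt"]
)
```

Note that `getopt.getopt` stops at the first non-option argument, so in this sketch the input filename has to come last, which matches the usage line.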
RunDetails

- _cvsVersion = "$Revision$"
- idx1 = _cvsVersion.find(':') + 1
- idx2 = _cvsVersion.rfind('$')
- __VERSION_STRING = "%s" % (_cvsVersion[idx1:idx2])

Imports: sys, os, cPickle, next, Chem, RDConfig, FragmentCatalog, DbConnect, numpy, InfoTheory, types

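The version-string variables listed among the module variables slice the text between the first `:` and the last `$` of a CVS keyword. The sketch below wraps the same slicing in a function to show what it yields. The name `extract_version` and the expanded keyword `"$Revision: 1.23 $"` are made up for illustration.

```python
def extract_version(cvs_version):
    """Apply the same find/rfind slicing as the module-level variables."""
    # find() returns -1 when ':' is absent, so idx1 becomes 0 in that case.
    idx1 = cvs_version.find(':') + 1
    # Index of the closing '$'.
    idx2 = cvs_version.rfind('$')
    return "%s" % (cvs_version[idx1:idx2])


expanded = extract_version("$Revision: 1.23 $")  # hypothetical expanded keyword
unexpanded = extract_version("$Revision$")       # the literal value listed above

print(repr(expanded))    # ' 1.23 '
print(repr(unexpanded))  # '$Revision'
```

With an expanded keyword the result keeps its surrounding spaces; with the unexpanded keyword the slice starts at index 0 and returns everything except the trailing `$`.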
builds a fragment catalog from a set of molecules in a delimited text block

**Arguments**

- suppl: a mol supplier
- maxPts: (optional) if provided, this will set an upper bound on the number of points to be considered
- groupFileName: (optional) name of the file containing functional group information
- minPath, maxPath: (optional) the minimum and maximum path lengths to be considered
- reportFreq: (optional) how often to display status information

**Returns**

a FragmentCatalog

scores the compounds in a supplier using a catalog

**Arguments**

- suppl: a mol supplier
- catalog: the FragmentCatalog
- maxPts: (optional) the maximum number of molecules to be considered
- actName: (optional) the name of the molecule's activity property. If this is not provided, the molecule's last property will be used.
- acts: (optional) a sequence of activity values (integers). If not provided, the activities will be read from the molecules.
- nActs: (optional) number of possible activity values
- reportFreq: (optional) how often to display status information

**Returns**

a 2-tuple:

1) the results table (a 3D array of ints nBits x 2 x nActs)
2) a list containing the on bit lists for each molecule

similar to _ScoreMolecules()_, but uses pre-calculated bit lists for the molecules (this speeds things up a lot)

**Arguments**

- bitLists: sequence of on bit sequences for the input molecules
- suppl: the input supplier (we read activities from here)
- catalog: the FragmentCatalog
- maxPts: (optional) the maximum number of molecules to be considered
- actName: (optional) the name of the molecule's activity property. If this is not provided, the molecule's last property will be used.
- nActs: (optional) number of possible activity values
- reportFreq: (optional) how often to display status information

**Returns**

the results table (a 3D array of ints nBits x 2 x nActs)

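To make the shape of the results table concrete, here is a minimal plain-Python sketch that fills an nBits x 2 x nActs table from pre-calculated on-bit lists and integer activities. It is not the library's implementation. The function name `score_from_lists_sketch`, the 0-based bit IDs, and the meaning of the middle axis (index 0 counts molecules with the bit set, index 1 counts molecules without it) are all assumptions, since this page does not specify them.

```python
def score_from_lists_sketch(bit_lists, acts, n_bits, n_acts=2):
    """Count, per bit and per activity, molecules with and without the bit.

    Assumed layout: table[bit][0][act] = molecules with the bit set,
                    table[bit][1][act] = molecules with the bit not set.
    """
    # Nested lists stand in for the 3D integer array described above.
    table = [[[0] * n_acts for _ in range(2)] for _ in range(n_bits)]
    for on_bits, act in zip(bit_lists, acts):
        on = set(on_bits)
        for bit in range(n_bits):
            state = 0 if bit in on else 1
            table[bit][state][act] += 1
    return table


# Three hypothetical molecules, three bits, two activity classes.
bit_lists = [[0, 2], [0], [1, 2]]
acts = [1, 1, 0]
table = score_from_lists_sketch(bit_lists, acts, n_bits=3)
```

Because the on-bit lists are supplied directly, no fragment matching against the catalog is needed in the loop, which is consistent with the speed-up mentioned above.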
calculates info gains by constructing fingerprints *DOC*

Returns a 2-tuple:

1) gains matrix
2) list of fingerprints

calculates info gains from a set of fingerprints *DOC*

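Since the gain calculation itself is not documented here, the following is a sketch of the standard entropy-based information gain for a single bit, computed from a small counts table (one row per bit state, one column per activity value). It is a plain-Python stand-in written for Python 3, not the routine this module uses; `InfoTheory` appears in the import list, but its calls are not shown on this page. The names `entropy` and `info_gain` are assumptions.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h


def info_gain(table):
    """Information gain of splitting on one bit.

    table[state][act] holds the number of molecules with the given bit
    state (rows) and activity value (columns).
    """
    n_acts = len(table[0])
    col_totals = [sum(row[a] for row in table) for a in range(n_acts)]
    total = sum(col_totals)
    if total == 0:
        return 0.0
    # Weighted entropy that remains after splitting on the bit.
    remainder = sum(sum(row) / total * entropy(row) for row in table)
    return entropy(col_totals) - remainder


perfect = info_gain([[5, 0], [0, 5]])      # bit separates the classes fully
uninformative = info_gain([[2, 2], [3, 3]])  # bit says nothing about activity
```

A bit that splits two equally sized activity classes perfectly gains one full bit of information, while a bit whose on and off populations have the same class mix gains nothing.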
gains should be a sequence of sequences. The idCol entry of each sub-sequence should be a catalog ID. _ProcessGainsData()_ provides suitable input.

Generated by Epydoc 3.0.1 on Sat Apr 23 18:49:15 2016 | http://epydoc.sourceforge.net |