Package rdkit :: Package Dbase :: Package Pubmed :: Module Searches

Module Searches

Tools for doing PubMed searches and processing the results

NOTE: much of the example code in the documentation here uses XML
files from the test_data directory in order to avoid having to call
out to PubMed itself.  Actual calls to the functions would not include
the _conn_ argument.
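
The idea behind the _conn_ argument can be illustrated with a minimal sketch. The helper parse_response below is hypothetical and not part of this module; it only assumes that the parsing step accepts any open file-like object, so a saved XML file can stand in for a live connection. The XML content is illustrative.

```python
import io
import xml.etree.ElementTree as ET

def parse_response(conn):
    """Parse XML from any file-like object (hypothetical helper).

    conn may be a network response or a local file opened for reading.
    """
    return ET.parse(conn).getroot()

# Stand-in for a saved response such as a file in test_data (contents are illustrative).
fake_conn = io.StringIO(
    "<eSummaryResult><DocSum><Id>11960484</Id></DocSum></eSummaryResult>"
)
root = parse_response(fake_conn)

# Collect the id of every document summary found in the parsed tree.
docsum_ids = [docsum.findtext("Id") for docsum in root.findall("DocSum")]
```

Because the parser only needs a readable object, tests can run without network access.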

Functions
 
openURL(url, args)
 
GetNumHits(query, url=QueryParams.searchBase)
returns the number of hits for the query provided
 
GetSearchIds(query, url=QueryParams.searchBase)
returns a tuple of pubmed ids (strings) for the query provided
 
GetSummaries(ids, query=None, url=QueryParams.summaryBase, conn=None)
gets a set of document summary records for the ids provided
 
GetRecords(ids, query=None, url=QueryParams.fetchBase, conn=None)
gets a set of full document records for the ids provided
 
CheckForLinks(ids, query=None, url=QueryParams.linkBase, conn=None)
 
GetLinks(ids, query=None, url=QueryParams.linkBase, conn=None)
 
_test()

Imports: RDConfig, QueryParams, Records, urllib, urllib2, ElementTree


Function Details

GetNumHits(query, url=QueryParams.searchBase)

returns the number of hits for the query provided

To do a search, we need a query object:
>>> query = QueryParams.details()

set up the search parameters:
>>> query['term'] = 'penzotti je AND grootenhuis pd'
>>> query['field'] = 'auth'

now get the number of hits:
>>> counts = GetNumHits(query)
>>> counts
2

alternatively, we can search using field specifiers:
>>> query = QueryParams.details()
>>> query['term'] = 'penzotti je[au] AND hydrogen bonding[mh]'
>>> counts = GetNumHits(query)
>>> counts
3
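
As a rough illustration of what a hit count involves, the sketch below URL-encodes a query that uses field specifiers and reads a count out of a search response. It is not this module's implementation: count_hits is a hypothetical helper, the 'db' key is an assumed extra parameter, and the response text is an illustrative sample shaped like an E-utilities search result rather than data fetched from PubMed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Parameters mirroring the field-specifier example; 'db' is an assumed extra key.
params = {"db": "pubmed", "term": "penzotti je[au] AND hydrogen bonding[mh]"}

# urlencode escapes the brackets and turns spaces into '+'.
query_string = urlencode(params)

# Illustrative response text (assumed shape, not a real server reply).
sample = """<eSearchResult>
  <Count>3</Count>
</eSearchResult>"""

def count_hits(xml_text):
    """Return the integer stored in the top-level <Count> element."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Count"))

num_hits = count_hits(sample)
```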

GetSearchIds(query, url=QueryParams.searchBase)

returns a tuple of pubmed ids (strings) for the query provided

To do a search, we need a query object:
>>> query = QueryParams.details()

set up the search parameters:
>>> query['term'] = 'penzotti je AND grootenhuis pd'
>>> query['field'] = 'auth'

now get the search ids:
>>> ids = GetSearchIds(query)
>>> len(ids)
2
>>> ids[0]
'11960484'
>>> ids[1]
'10893315'
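
For readers curious how a tuple of id strings might be produced, here is a minimal sketch. The helper extract_ids is hypothetical, and the sample assumes the search response lists ids as <Id> elements inside an <IdList>; the two ids are the ones shown in the example above.

```python
import xml.etree.ElementTree as ET

# Illustrative search response (assumed shape) holding the ids from the example.
sample = """<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>11960484</Id>
    <Id>10893315</Id>
  </IdList>
</eSearchResult>"""

def extract_ids(xml_text):
    """Return the ids in document order as a tuple of strings."""
    root = ET.fromstring(xml_text)
    return tuple(node.text for node in root.findall("IdList/Id"))

ids = extract_ids(sample)
```

The ids stay as strings, matching the quoted values in the doctest output.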

GetSummaries(ids, query=None, url=QueryParams.summaryBase, conn=None)

gets a set of document summary records for the ids provided

>>> ids = ['11960484']
>>> summs = GetSummaries(ids,conn=open(os.path.join(testDataDir,'summary.xml'),'r'))
>>> len(summs)
1
>>> rec = summs[0]
>>> isinstance(rec,Records.SummaryRecord)
1
>>> rec.PubMedId
'11960484'
>>> rec.Authors
'Penzotti JE, Lamb ML, Evensen E, Grootenhuis PD'
>>> rec.Title
'A computational ensemble pharmacophore model for identifying substrates of P-glycoprotein.'
>>> rec.Source
'J Med Chem'
>>> rec.Volume
'45'
>>> rec.Pages
'1737-40'
>>> rec.HasAbstract
'1'

GetRecords(ids, query=None, url=QueryParams.fetchBase, conn=None)

gets a set of full document records for the ids provided

>>> ids = ['11960484']
>>> recs = GetRecords(ids,conn=open(os.path.join(testDataDir,'records.xml'),'r'))
>>> len(recs)
1
>>> rec = recs[0]
>>> rec.PubMedId
'11960484'
>>> rec.Authors
u'Penzotti JE, Lamb ML, Evensen E, Grootenhuis PD'
>>> rec.Title
u'A computational ensemble pharmacophore model for identifying substrates of P-glycoprotein.'
>>> rec.Source
u'J Med Chem'
>>> rec.Volume
'45'
>>> rec.Pages
'1737-40'
>>> rec.PubYear
'2002'
>>> rec.Abstract[:10]
u'P-glycopro'

We've also got access to keywords:
>>> str(rec.keywords[0])
'Combinatorial Chemistry Techniques'
>>> str(rec.keywords[3])
'Indinavir / chemistry'

and chemicals:
>>> rec.chemicals[0]
'P-Glycoprotein'
>>> rec.chemicals[2]
'Nicardipine <55985-32-5>'
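
The keyword and chemical strings above follow a visible pattern: a keyword may carry a qualifier after ' / ', and a chemical may carry a registry number in angle brackets. The sketch below reproduces that formatting only. Both helpers are hypothetical, and treating a registry number of "0" as absent is an assumption, not something this page states.

```python
def format_chemical(name, registry_number=None):
    """Format a chemical name, appending the registry number when one is given.

    Assumption: a missing number is passed as None, an empty string, or "0".
    """
    if registry_number and registry_number != "0":
        return "%s <%s>" % (name, registry_number)
    return name

def format_keyword(descriptor, qualifier=None):
    """Format a keyword as 'descriptor / qualifier', or the descriptor alone."""
    if qualifier:
        return "%s / %s" % (descriptor, qualifier)
    return descriptor

# Inputs taken from the example output above.
chemical_with_number = format_chemical("Nicardipine", "55985-32-5")
chemical_without_number = format_chemical("P-Glycoprotein")
qualified_keyword = format_keyword("Indinavir", "chemistry")
```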