How Python connects you with biological databases #2 – PubMed

PubMed is the biggest database of scientific papers in biology. Other databases gather information about papers from all scientific fields (e.g. Google Scholar, Scopus), but in the biological sciences PubMed from NCBI is still the most commonly used.

PubMed can be accessed fairly easily through NCBI's Entrez API, but there is an even easier way, which I recommend: BioPython.

To use it, you need BioPython installed on your computer. Then import Entrez from BioPython:

from Bio import Entrez

Simple examples are available in the documentation:

http://biopython.org/DIST/docs/api/Bio.Entrez-module.html

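# NCBI asks for a contact e-mail address with every request
Entrez.email = "Your.Name.Here@example.org"
pmid = "19304878"
# Look up PubMed articles related to the given PMID
handle = Entrez.elink(dbfrom="pubmed", id=pmid, linkname="pubmed_pubmed")
record = Entrez.read(handle)
handle.close()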
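
Once parsed, an elink record is a list of link sets, each holding the linked IDs under "LinkSetDb". Here is a minimal sketch of how those IDs could be pulled out. The helper name linked_pmids and the sample record are my own illustrations (the IDs are placeholders, not real PMIDs); only the key names follow the structure that Entrez.read returns for elink.

```python
def linked_pmids(record, linkname="pubmed_pubmed"):
    """Collect the IDs stored under one link name in a parsed elink record."""
    ids = []
    for linkset in record:
        for linksetdb in linkset.get("LinkSetDb", []):
            if linksetdb.get("LinkName") == linkname:
                ids.extend(link["Id"] for link in linksetdb["Link"])
    return ids


# Hand-made stand-in with the same shape as a parsed elink result
# (placeholder IDs, for illustration only).
sample_record = [
    {
        "DbFrom": "pubmed",
        "IdList": ["19304878"],
        "LinkSetDb": [
            {
                "DbTo": "pubmed",
                "LinkName": "pubmed_pubmed",
                "Link": [{"Id": "111"}, {"Id": "222"}],
            }
        ],
    }
]

print(linked_pmids(sample_record))
```

With a real result you would pass the record from Entrez.read(handle) instead of sample_record.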

More detailed instructions are in the Tutorial (covering both PubMed and MEDLINE):

http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc129

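Entrez.email = "example@example.com"
# Search PubMed for "orchid" and return at most 463 article IDs
handle = Entrez.esearch(db="pubmed", term="orchid", retmax=463)
record = Entrez.read(handle)
# List of matching PubMed IDs (PMIDs), as strings
idlist = record["IdList"]
handle.close()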
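
The search record also carries a "Count" field with the total number of hits, so you can check whether retmax cut the ID list short. Here is a rough sketch. The helper name search_was_truncated and the sample values are made up for illustration; "Count" and "IdList" are the keys of a parsed esearch record.

```python
def search_was_truncated(record):
    """Return True if the search matched more articles than were returned."""
    total_hits = int(record["Count"])  # "Count" comes back as a string
    returned = len(record["IdList"])
    return total_hits > returned


# Hand-made stand-ins shaped like parsed esearch results (placeholder values).
truncated = {"Count": "5", "RetMax": "2", "IdList": ["111", "222"]}
complete = {"Count": "2", "RetMax": "20", "IdList": ["111", "222"]}

print(search_was_truncated(truncated))
print(search_was_truncated(complete))
```

If the check returns True, you could repeat the search with a larger retmax.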

To get detailed information about the articles you found, pass their IDs to Entrez.efetch:

# Fetch the full records for all IDs in idlist as XML
handle = Entrez.efetch(db="pubmed", retmode="xml", id=idlist)
results = Entrez.read(handle)
handle.close()

You can then handle the results as nested dictionaries and lists in Python.
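
As an example of walking through those results, here is a small sketch that maps each PMID to its article title. The helper name article_titles and the sample data are my own placeholders. I also assume that newer BioPython versions return a dictionary with a "PubmedArticle" key while older ones return a plain list of articles, so the sketch accepts both.

```python
def article_titles(results):
    """Map each PMID to its article title in parsed efetch results."""
    # Assumption: newer BioPython returns {"PubmedArticle": [...]},
    # older versions returned the list of articles directly.
    articles = results["PubmedArticle"] if isinstance(results, dict) else results
    titles = {}
    for article in articles:
        citation = article["MedlineCitation"]
        titles[str(citation["PMID"])] = str(citation["Article"]["ArticleTitle"])
    return titles


# Hand-made stand-in shaped like parsed PubMed XML (placeholder values).
sample_results = {
    "PubmedArticle": [
        {"MedlineCitation": {"PMID": "111", "Article": {"ArticleTitle": "First title"}}},
        {"MedlineCitation": {"PMID": "222", "Article": {"ArticleTitle": "Second title"}}},
    ]
}

for pmid, title in article_titles(sample_results).items():
    print(pmid, title)
```

Other fields, such as the abstract or the author list, sit under the same "MedlineCitation" and "Article" entries and can be read the same way when they are present.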

BioPython really makes it easier to work with many biological databases.
