[Biopython] Integrating SQL query to biopython

Peter Cock p.j.a.cock at googlemail.com
Thu May 24 09:10:13 UTC 2012


On Thu, May 24, 2012 at 6:58 AM, Animesh Agrawal
<animesh.agrawal at anu.edu.au> wrote:
> Hi,
>
> I am running small SQL queries to select sequences from a local BioSQL
> database. One example of such a query is as follows:
>
> SELECT  biosequence.*
> FROM    biosequence JOIN bioentry USING (bioentry_id)
> WHERE   biosequence.seq NOT LIKE "%X%"
> AND     biosequence.alphabet = 'protein'
>
> I am wondering how to integrate this SQL query with Biopython code to get
> the output in the form of SeqRecord or Seq objects.

From your direct database access, get the bioentry table's primary key
(the bioentry_id column, which your query already returns as part of
biosequence.*), and then use that to create a DBSeqRecord object (which
is a subclass of SeqRecord and will also load the sequence for you).

You will also need the adaptor object (server.adaptor) as the other
initialization argument, which is how the DBSeqRecord knows which
database to read from. Get that by connecting to the BioSQL database
through the Biopython code as usual.

Something like this (untested):

from BioSQL import BioSeqDatabase
from BioSQL.BioSeq import DBSeqRecord
# Connect to the BioSQL database as usual:
server = BioSeqDatabase.open_database(driver="MySQLdb", user="root",
                     passwd="", host="localhost", db="bioseqdb")
# A bioentry_id value taken from your own SQL query:
primary_id = .... #your code here
# Use Biopython's BioSQL SeqRecord loading (note the spelling "adaptor"):
record = DBSeqRecord(server.adaptor, primary_id)
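
For the "your code here" step, a minimal sketch (also untested against a real BioSQL server) might look like the following. It assumes you have some DB-API connection to the same database and that you only need the bioentry_id column rather than biosequence.*. The helper names select_bioentry_ids and load_records are made up for illustration, and the in-memory SQLite tables are a toy stand-in with just the columns the query touches, not the real BioSQL schema.

```python
import sqlite3

# The query from the original question, narrowed to return only the key
# that DBSeqRecord needs. Single quotes are used for the LIKE pattern.
QUERY = """
SELECT biosequence.bioentry_id
FROM   biosequence JOIN bioentry USING (bioentry_id)
WHERE  biosequence.seq NOT LIKE '%X%'
AND    biosequence.alphabet = 'protein'
"""


def select_bioentry_ids(connection):
    """Run the filter query on a DB-API connection and return the IDs."""
    cursor = connection.cursor()
    cursor.execute(QUERY)
    return [row[0] for row in cursor.fetchall()]


def load_records(adaptor, bioentry_ids):
    """Wrap each ID in a DBSeqRecord (hypothetical helper).

    Needs Biopython and a real BioSQL database; pass server.adaptor.
    """
    # Imported here so the toy demo below runs without Biopython installed.
    from BioSQL.BioSeq import DBSeqRecord
    return [DBSeqRecord(adaptor, bioentry_id) for bioentry_id in bioentry_ids]


# Toy stand-in tables (NOT the full BioSQL schema) to show the ID selection.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE biosequence (bioentry_id INTEGER, alphabet TEXT, seq TEXT);
INSERT INTO bioentry VALUES (1, 'clean_protein'), (2, 'protein_with_X'),
                            (3, 'some_dna');
INSERT INTO biosequence VALUES (1, 'protein', 'MKVLA'),
                               (2, 'protein', 'MKXLA'),
                               (3, 'dna', 'ACGT');
""")
demo_ids = select_bioentry_ids(demo)
print(demo_ids)
```

Against the real database you would then call something like load_records(server.adaptor, ids) to get a list of DBSeqRecord objects, each of which loads its sequence and annotation from the database when accessed.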

Peter



