[Biopython] SQL Alchemy based BioSQL
Kyle Ellrott
kellrott at gmail.com
Wed Aug 26 01:01:30 UTC 2009
I've added a new database function, lookupFeature, to quickly search for
sequence features without having to load all of them for any particular
sequence.
Because it's a non-standard function, I've taken the opportunity to
play around with some more dynamic search features.
Once we get the interface for these types of searches locked down on
lookupFeature, a similar system could be implemented in the standard
'lookup' call.
The work is posted at http://github.com/kellrott/biopython
The following is an example of a working search that pulls all of the
protein_ids from NC_004663.1 between 60,000 and 70,000 on the positive
strand:
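import sys
from BioSQL import BioSQLAlchemy as BioSeqDataBase

# Connect to the BioSQL server and select the 'bacteria' sub-database.
server = BioSeqDataBase.open_database(driver="mysql", user='test',
                                      host='localhost', db='testdb')
db = server['bacteria']

# Fetch the sequence record so its primary id can be used in the search.
seq = db.lookup(version="NC_004663.1")

# Positional arguments are column conditions; keyword arguments restrict
# the search to one bioentry and one qualifier name.
features = db.lookupFeatures(BioSeqDataBase.Column('strand') == 1,
                             BioSeqDataBase.Column('start_pos') < 70000,
                             BioSeqDataBase.Column('end_pos') > 60000,
                             bioentry_id=seq._primary_id, name="protein_id")

# print(len(features))
for feature in features:
    print(feature)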
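
For anyone wondering how an expression like Column('strand') == 1 can be
passed as an argument at all, here is a minimal sketch of the general idea.
It is not the code in the fork: the Column and Condition classes and the
filter_rows helper below are hypothetical stand-ins that work on plain
dictionaries rather than a database. The comparison operators are
overloaded so that they return a condition object instead of a boolean,
and the search function applies every condition it is given.

```python
import operator


class Condition(object):
    """A deferred comparison such as start_pos < 70000."""

    def __init__(self, name, op, value):
        self.name = name
        self.op = op
        self.value = value

    def matches(self, row):
        # Apply the stored operator to the named field of one row.
        return self.op(row[self.name], self.value)


class Column(object):
    """Hypothetical stand-in: comparisons build Conditions, not booleans."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return Condition(self.name, operator.eq, value)

    def __lt__(self, value):
        return Condition(self.name, operator.lt, value)

    def __gt__(self, value):
        return Condition(self.name, operator.gt, value)


def filter_rows(rows, *conditions):
    """Keep the rows that satisfy every condition (hypothetical helper)."""
    return [r for r in rows if all(c.matches(r) for c in conditions)]


# Made-up feature locations, for illustration only.
rows = [
    {"name": "a", "strand": 1, "start_pos": 59000, "end_pos": 61000},
    {"name": "b", "strand": -1, "start_pos": 62000, "end_pos": 63000},
    {"name": "c", "strand": 1, "start_pos": 71000, "end_pos": 72000},
    {"name": "d", "strand": 1, "start_pos": 65000, "end_pos": 66000},
]

hits = filter_rows(
    rows,
    Column("strand") == 1,
    Column("start_pos") < 70000,
    Column("end_pos") > 60000,
)
print([r["name"] for r in hits])  # ['a', 'd']
```

Note that the pair of conditions start_pos < 70000 and end_pos > 60000 is
an overlap test: feature "a" above is returned even though it starts before
60,000, because part of it falls inside the window.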
> Kyle:
>> > I've posted a git fork of biopython with a BioSQL system based on SQL
>> > Alchemy. It can be found at git://github.com/kellrott/biopython.git
>> > It successfully completes unit tests copied from test_BioSQL and
>> > test_BioSQL_SeqIO.