[BioPython] import error

James Swetnam jswetnam at gmail.com
Tue Mar 11 18:21:05 EDT 2008


Hello.

First off, apologies if my problem has been resolved in a previous
mailing; the archive search on the OBF wiki is disabled.  Also, it's
quite possible I'm doing something boneheaded, as I still consider
myself a fairly novice Python programmer.  So apologies if I make you
read through this just to correct an indentation error or something
similar!

I'm trying to use the Biopython BioSQL bindings to populate a locally
served MySQL database with what I like to call 'chimeric' SeqRecord
objects.  I take as a starting point a large, FASTA-formatted file of
short (~35 aa), translated protein sequences from the LANL HIV Sequence
Database.  Every one of these LANL protein sequences is a subset of a
longer sequence available in GenBank.  Each of the sequences I
download thus has an associated GenBank accession number.

I'd like to combine the specificity afforded by the LANL
sequences with the 'meta' information given by the GenBank files into
one record for each translated protein sequence.  Thus, in very broad
pseudocode, my procedure is as follows:

for every sequence in the FASTA-formatted LANL file
	get the GenBank accession number
	grab the GenBank file and parse it into a SeqRecord
	replace the Seq object in the GenBank SeqRecord with the LANL protein sequence
	let Biopython do its magic and populate my BioSQL database with my chimeric SeqRecord
	...
	Profit!

The entire procedure is rather short, thanks to the developers' hard  
work and the magic of abstraction.  Here's the actual code:

http://pastebin.com/m118199fe
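
For readers who cannot open the paste, here is a rough, plain-Python sketch of the loop described in the pseudocode. It is not the contents of the paste and it does not use Biopython: `Record` is a hypothetical stand-in for a SeqRecord, `fetch_genbank` is a hypothetical callable that returns the parsed GenBank record for an accession, and the accession is assumed to be the last dot-separated field of each FASTA header.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, Iterable, Iterator, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for a Biopython SeqRecord."""
    id: str
    seq: str
    annotations: Dict[str, str] = field(default_factory=dict)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def accession_from_header(header: str) -> str:
    """Assumption: the accession is the last dot-separated field of the name."""
    tokens = header.split()
    name = tokens[0] if tokens else ""
    return name.rsplit(".", 1)[-1]


def chimeric_records(
    fasta_lines: Iterable[str],
    fetch_genbank: Callable[[str], Record],
) -> Iterator[Record]:
    """Pair each short protein sequence with its GenBank metadata."""
    for header, protein in read_fasta(fasta_lines):
        genbank = fetch_genbank(accession_from_header(header))
        # Keep the GenBank id and annotations, swap in the protein sequence.
        yield replace(genbank, seq=protein)
```

In the real script the yielded objects would be SeqRecord instances, and the iterator would be what gets handed to `db.load(...)`.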

OK. Fine.  But I'm getting an error when I do this, which originates  
deep in the bowels of the MySQLdb library, which I'd rather not touch  
without a lot more coffee than I have available.

-----------------------------
degas:v3_sequence_browser james$ ipython populate_database.py
/sw/lib/python2.5/site-packages/Bio/config/DBRegistry.py:149:  
DeprecationWarning: Concurrent behavior has been deprecated, as this  
functionality needs Bio.MultiProc, which itself has been deprecated.  
If you need the concurrent behavior, please let the Biopython  
developers know by sending an email to biopython-dev at biopython.org to  
avoid permanent removal of this feature.
   DeprecationWarning)
---------------------------------------------------------------------------
<type 'exceptions.TypeError'>             Traceback (most recent call  
last)

/Users/james/src/v3_sequence_browser/populate_database.py in <module>()
      35
      36 db = server.new_database("v3")
---> 37 db.load(v3prod)
      38 server.adaptor.commit()
      39

/sw/lib/python2.5/site-packages/BioSQL/BioSeqDatabase.py in load(self,  
record_iterator)
     412                 break
     413             num_records += 1
--> 414             db_loader.load_seqrecord(cur_record)
     415
     416         return num_records

/sw/lib/python2.5/site-packages/BioSQL/Loader.py in  
load_seqrecord(self, record)
      28         """Load a Biopython SeqRecord into the database.
      29         """
---> 30         bioentry_id = self._load_bioentry_table(record)
      31         self._load_bioentry_date(record, bioentry_id)
      32         self._load_biosequence(record, bioentry_id)

/sw/lib/python2.5/site-packages/BioSQL/Loader.py in  
_load_bioentry_table(self, record)
     248                                    division,
     249                                    description,
--> 250                                    version))
     251         # now retrieve the id for the bioentry
     252         bioentry_id = self.adaptor.last_id('bioentry')

/sw/lib/python2.5/site-packages/BioSQL/BioSeqDatabase.py in  
execute(self, sql, args)
     275         """Just execute an sql command.
     276         """
--> 277         self.cursor.execute(sql, args or ())
     278
     279     def get_subseq_as_string(self, seqid, start, end):

/sw/lib/python2.5/site-packages/MySQLdb/cursors.py in execute(self,  
query, args)
     149             query = query.encode(charset)
     150         if args is not None:
--> 151             query = query % db.literal(args)
     152         try:
     153             r = self._query(query)

<type 'exceptions.TypeError'>: not all arguments converted during  
string formatting
WARNING: Failure executing file: <populate_database.py>
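
For reference, this TypeError message is the one Python's `%` string formatting raises when it is given more arguments than the format string has placeholders, and the last frame above performs exactly such a step in `query % db.literal(args)`. The snippet below reproduces the message without MySQLdb. The query text and values are made up for illustration and are not the SQL that Loader.py sends, so this shows only what the message means, not what is causing it in this script.

```python
def format_query(query, args):
    """Mimic the `query % args` step from the last traceback frame."""
    return query % tuple(args)


# Hypothetical query with two %s placeholders (an assumption for the demo).
QUERY = "INSERT INTO demo (name, version) VALUES (%s, %s)"

# Two placeholders, two arguments: formatting succeeds.
print(format_query(QUERY, ("'K03455'", "1")))

# Two placeholders, three arguments: formatting fails.
try:
    format_query(QUERY, ("'K03455'", "1", "'extra'"))
except TypeError as exc:
    message = str(exc)
    print(message)
```

A mismatch between the number of placeholders in the SQL string and the number of values passed alongside it is therefore one thing worth checking.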


Any direct help or references are much appreciated.

James Swetnam
Research Technician
Department of Pharmacology
NYU School of Medicine





