[BioPython] FormatIO + Fasta parser + BioDB.

Brad Chapman chapmanb at uga.edu
Wed May 5 14:08:42 EDT 2004


Hi Christian;

> I'm writing a procedure to store files in a BioDB but I have the
> following error:
> """
> ...
>   File "/usr/lib/python2.2/site-packages/BioSQL/Loader.py", line 209, in
> _load_bioentry_table
>     if record.id.find('.') >= 0: # try to get a version from the id
> AttributeError: 'NoneType' object has no attribute 'find'
> """
> I feel that is because I don't define the title2ids function for the
> Fasta parser. If I'm right, how can I tell to the FormatIO module to use
> a title2ids function?

Yes, you have diagnosed the problem exactly. Without a title2ids
function the Fasta parser leaves record.id as None, which is why
the loader fails on record.id.find('.'). The solution is actually
not to use the FormatIO module. It is better suited to automated
format conversions, and here you will probably need finer-grained
control to parse ids and descriptions out of the Fasta title
headers.

To do this, use the standard Fasta.SequenceParser and Fasta.Iterator
classes, along with a title2ids function. Adjusted code that
should work:

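from Bio import Fasta
from BioSQL import BioSeqDatabase

def your_title_to_ids(title):
    # write this for your specific FASTA titles
    # to return name, id and description
    raise NotImplementedError("adapt this to your FASTA title format")

def SequenceStoreFile(SeqFile, database, format='genbank'):
    # format is kept from the original signature; only FASTA is parsed here
    server = BioSeqDatabase.open_database(driver='MySQLdb', user='bio',
                                          passwd='bio', host='localhost',
                                          db='bio')
    # looking up a sub-database that does not exist raises KeyError,
    # so create it in that case
    try:
        db = server[database]
    except KeyError:
        db = server.new_database(database)
    # the parser calls your_title_to_ids on each title line, so every
    # record gets an id before it reaches the loader
    parser = Fasta.SequenceParser(title2ids = your_title_to_ids)
    itr = Fasta.Iterator(SeqFile, parser)
    db.load(itr)

if __name__ == "__main__":
    SequenceStoreFile(open('example.fasta'), 'estC', 'fasta')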
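
To give an idea of what the title2ids function might look like,
here is a minimal sketch. It assumes titles of the form
"IDENTIFIER free text description", where the first
whitespace-delimited token is the identifier; your titles may well
differ. The name first_token_title_to_ids is just a placeholder,
and the return order (name, id, description) follows the comment
in the code above, so please check it against your Biopython
version.

```python
def first_token_title_to_ids(title):
    """Split a FASTA title (without the leading '>') into ids.

    Assumption: the first whitespace-delimited token is the
    identifier and everything after it is the description.
    Returns (name, id, description), the order given in the
    code comment above.
    """
    parts = title.split(None, 1)  # split once on any whitespace
    if not parts:
        raise ValueError("empty FASTA title")
    seq_id = parts[0]
    if len(parts) > 1:
        description = parts[1].strip()
    else:
        description = ""
    # With no separate name in the title, reuse the identifier.
    name = seq_id
    return name, seq_id, description
```

You would then pass it in as
Fasta.SequenceParser(title2ids = first_token_title_to_ids). Since
every record gets a real string id this way, the
record.id.find('.') call in the loader has something to work on.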

Sorry about the confusion and I hope this helps.
Brad

