[BioSQL-l] Loading SwissProt to BioSQL?

Hilmar Lapp hlapp at gnf.org
Wed Jan 26 14:01:25 EST 2005


I can't comment on the biopython side. Just one thought: if perl isn't 
too scary for you and you think you can manage to install the 
requirements (or you have a sysadmin you can bug with this), loading 
the database through bioperl-db's load_seqdatabase.pl may get you set 
up to then retrieve the content through biopython.
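
As a rough sketch of what that load step could look like when driven 
from Python: the helper below only assembles the command line. The 
function name and the database name, user, host and namespace are 
placeholders made up for illustration, and the options shown (--host, 
--dbname, --dbuser, --format, --namespace) are what I'd expect 
load_seqdatabase.pl to accept, so please check the script's own 
documentation for your bioperl-db version before relying on them.

```python
import shlex


def build_load_command(datafile, dbname="biosql", dbuser="biosql",
                       host="localhost", namespace="swissprot",
                       script="load_seqdatabase.pl"):
    """Assemble an argument list for bioperl-db's load_seqdatabase.pl.

    All default values are placeholders (assumptions), not settings
    taken from any real installation.
    """
    return [
        "perl", script,
        "--host", host,
        "--dbname", dbname,
        "--dbuser", dbuser,
        "--format", "swiss",        # Bio::SeqIO name for SwissProt flat files
        "--namespace", namespace,   # BioSQL namespace to load the entries into
        datafile,
    ]


if __name__ == "__main__":
    cmd = build_load_command("uniprot_sprot.dat")
    # Print the command for inspection instead of running it.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually run the load, something like:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to check the connection 
settings before starting what can be a long-running load.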

Note though that in this case it is also untested whether the way 
bioperl-db stores annotation and feature annotation is compatible with 
how biopython wants to find it. We just came upon a similar problem 
between bioperl and biojava.

	-hilmar

On Jan 26, 2005, at 9:12 AM, Nathan Edwards wrote:

>
> What is the current recommended solution for those wanting to load 
> SwissProt database files (uniprot_sprot.dat) to BioSQL via BioPython?
>
> As far as I can tell, the situation is this:
>
> * The SwissProt .dat file parsers under Bio.SwissProt.SProt don't 
> produce SeqRecord objects required by BioSeqDatabase.load.
>
> * The SwissProt .dat file parsers under FormatIO can't parse SwissProt 
> .dat files, period.
>
> Does *anyone* have a working SwissProt .dat file to BioSQL solution 
> in Python?
>
> The most recent solution I see suggested:
>
> http://biopython.org/pipermail/biopython/2004-May/002088.html
>
> doesn't work as advertised. SProt.SequenceParser produces SeqRecord 
> objects with only minimal instantiation of SeqRecord fields (sequence 
> and accession), and BioSeqDatabase dies because it expects a field that 
> SequenceParser never instantiates. FormatIO.readFile doesn't 
> recognize uniprot_sprot.dat as any format it knows, and if forced 
> with format='swissprot/38' or format='swissprot/40', it dies 
> while parsing.
>
> Note: My entire BioPython, BioSQL, etc installation is new, pristine, 
> the latest update from CVS.
>
> Thanks,
>
> nathan
>
> -- 
> Nathan Edwards, Ph.D.
> Center for Bioinformatics and Computational Biology
> 3119 Agriculture/Life Sciences Surge Building #296
> University of Maryland, College Park, MD 20742-3360
> Phone: +1 301-405-9901
> Email: nedwards at umiacs.umd.edu
> WWWeb: http://www.umiacs.umd.edu/~nedwards
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------
