[Bioperl-l] retrieval of PRELIMINARY uniprot sequences using Bio::Registry fails

Daniel Lang daniel.lang at biologie.uni-freiburg.de
Tue Sep 5 09:57:59 UTC 2006


Hi Brian,

sorry for the belated response!
I've compiled you a set of 100 PRELIMINARY entries from the latest
uniprot_trembl release. I've tried to reproduce the bug using only these
as input to build an index, but (sadly) all of them can be retrieved
using the latest checkout:-(
Maybe it's not connected to these entries after all, but to the size or
some other feature of the uniprot distribution?
I could now get it to work using the 1.5.1 release.
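
For anyone who wants to build a similar test set, here is a minimal shell
sketch. It is not necessarily how the attached file was produced, and the
function name extract_preliminary is made up for illustration. It assumes
that every record starts with an "ID" line and ends with a "//" line, and
it prints the first N records whose ID line carries the PRELIMINARY status:

```shell
# Hypothetical helper: print the first N PRELIMINARY records of a
# Swiss-Prot/TrEMBL flat file.
#   $1 = flat file, $2 = maximum number of records to print
extract_preliminary() {
    awk -v max="$2" '
        /^ID /  { keep = ($0 ~ /PRELIMINARY;/) }   # decide once per record, at its ID line
        keep    { print }                          # copy every line of a selected record
        /^\/\// { if (keep && ++n >= max) exit     # stop after N selected records
                  keep = 0 }
    ' "$1"
}

# Example call (file names taken from this thread):
# extract_preliminary uniprot_trembl.dat 100 > 100_exemplary_PRELIMINARY.swiss
```

The resulting file can then be indexed on its own to check whether the
entries themselves trigger the problem.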

Originally I built the index using the flat protocol. When I try bdb with
bioperl-live, even more problems occur:

bp_bioflat_index.pl --dbname sw -i bdb -f swiss -l . -c uniprot_sprot.dat

------------- EXCEPTION  -------------
MSG: The lineage 'Eukaryota, Metazoa, Chordata, Craniata, Vertebrata,
Euteleostomi, Amphibia, Batrachia, Anura, Mesobatrachia, Pipoidea,
Pipidae, Xenopodinae, Xenopus, Silurana, Xenopus, tropicalis' had two
non-consecutive nodes with the same name. Can't cope!
STACK Bio::DB::Taxonomy::list::add_lineage
/home/lang/bioperl/bioperl-live/Bio/DB/Taxonomy/list.pm:163
STACK Bio::DB::Taxonomy::list::new
/home/lang/bioperl/bioperl-live/Bio/DB/Taxonomy/list.pm:100
STACK Bio::DB::Taxonomy::new
/home/lang/bioperl/bioperl-live/Bio/DB/Taxonomy.pm:106
STACK Bio::Species::classification
/home/lang/bioperl/bioperl-live/Bio/Species.pm:171
STACK Bio::SeqIO::swiss::_read_swissprot_Species
/home/lang/bioperl/bioperl-live/Bio/SeqIO/swiss.pm:1049
STACK Bio::SeqIO::swiss::next_seq
/home/lang/bioperl/bioperl-live/Bio/SeqIO/swiss.pm:240
STACK Bio::DB::Flat::parse_one_record
/home/lang/bioperl/bioperl-live/Bio/DB/Flat.pm:333
STACK Bio::DB::Flat::BDB::_index_file
/home/lang/bioperl/bioperl-live/Bio/DB/Flat/BDB.pm:235
STACK Bio::DB::Flat::BDB::build_index
/home/lang/bioperl/bioperl-live/Bio/DB/Flat/BDB.pm:218
STACK toplevel
/share/apps/bioperl/bioperl-live/scripts_temp/bp_bioflat_index.pl:113

But I think this is connected to the recent changes to taxonomy handling
in Bio::Taxon...
I'm unsure whether to submit this separately, but I could also provide an
example of a swissprot entry that causes this error.
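
To illustrate the condition the exception complains about, here is a small
shell sketch. It is a hypothetical helper, not the actual check in
Bio::DB::Taxonomy::list. It prints any name that occurs at two
non-adjacent positions in a comma-separated lineage:

```shell
# Hypothetical helper: report names that appear twice in a lineage with at
# least one other name in between.
#   $1 = lineage as a single comma-separated string
find_nonconsecutive_duplicates() {
    printf '%s\n' "$1" | awk -F', *' '{
        for (i = 1; i <= NF; i++) {
            # seen before, and not at the directly preceding position
            if (($i in last) && last[$i] < i - 1) print $i
            last[$i] = i
        }
    }'
}

# The lineage from the exception message above, joined into one string.
lineage='Eukaryota, Metazoa, Chordata, Craniata, Vertebrata, Euteleostomi'
lineage="$lineage, Amphibia, Batrachia, Anura, Mesobatrachia, Pipoidea"
lineage="$lineage, Pipidae, Xenopodinae, Xenopus, Silurana, Xenopus, tropicalis"
find_nonconsecutive_duplicates "$lineage"   # prints: Xenopus
```

Here "Xenopus" appears both before and after "Silurana", which is the
pattern the error message describes.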

Thanks, again.

Daniel

Brian Osborne wrote:
> Daniel,
> 
> Bug, presumably in SeqIO/swiss.pm. Can you send me a small file with such a
> PRELIMINARY entry? 
> 
> Brian O.
> 
> 
> On 9/1/06 6:11 AM, "Daniel Lang" <daniel.lang at biologie.uni-freiburg.de>
> wrote:
> 
>> Hi,
>>
>> when using Bio::Registry (bioperl-live) to fetch uniprot entries from
>> locally indexed uniprot *.dat files, I found that several entries
>> could not be retrieved even though they are present in the
>> files! A closer look reveals that they have the status PRELIMINARY:
>>
>> uniprot_trembl.dat:ID   Q16EZ1_AEDAE   PRELIMINARY;   PRT;   222 AA.
>>
>> Grepping for PRELIMINARY finds nothing anywhere in my cvs checkout.
>> I also can't retrieve the sequences from the online database defined as
>> follows:
>> [swissprot_ebi]
>> protocol=biofetch
>> location=http://www.ebi.ac.uk/cgi-bin/dbfetch
>> dbname=swall
>>
>> Is this a bug or a feature? If it's a feature, how can I bypass it?
>>
>> Thanks in advance,
>> Daniel
> 
> 



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 100_exemplary_PRELIMINARY.swiss
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20060905/74109b66/attachment.ksh>