[Bioperl-l] get_SeqFeatures doesn't like genbank CON files

Staffa, Nick (NIH/NIEHS) staffa at niehs.nih.gov
Thu Mar 29 19:06:22 UTC 2007


If I run the following code on the GenBank flat files gbconN.seq (N=1..4),
I run out of memory.  So I wrote a plain Perl script that splits them into
many files, one for each GenBank CON entry for D. pseudoobscura.
These entries have complete feature tables but no actual sequence,
just join statements referencing the WGS files AADExxxxxxxxxxx.
When I run this code on them, the BioPerl modules don't seem to like the
join statements being where they are, and for some reason object to "gap".
I AM glad that BioPerl allowed the program to process all the files.
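
For readers who want to reproduce the splitting step, here is a minimal sketch of one way to do it in plain Perl. It is not the script mentioned above. The function name split_genbank, the ".gb" suffix, and the choice to name each file after the LOCUS identifier are assumptions made for illustration. It relies on each GenBank entry ending with a line that holds only "//", and it assumes Unix line endings.

```perl
use strict;
use warnings;

# Split a multi-entry GenBank flat file into one file per entry.
# (Hypothetical helper; file naming is an assumption.)
sub split_genbank {
    my ($infile, $outdir) = @_;
    my @written;

    open(my $in, '<', $infile) or die "Cannot open $infile: $!";

    # Every entry is terminated by a "//" line, so read one entry
    # per iteration instead of loading the whole file into memory.
    local $/ = "//\n";

    while (my $record = <$in>) {
        # Skip chunks without a LOCUS line (e.g. trailing blank lines).
        next unless $record =~ /^LOCUS\s+(\S+)/m;
        my $name = $1;

        # Drop any release header text that precedes the LOCUS line.
        $record =~ s/\A.*?^(?=LOCUS)//ms;

        my $outfile = "$outdir/$name.gb";
        open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
        print {$out} $record;
        close($out);
        push @written, $outfile;
    }
    close($in);
    return @written;
}
```

A call such as split_genbank("gbcon1.seq", "entries") would write one file per entry into the existing directory "entries" and return the list of paths written.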

The code:
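   use Bio::SeqIO;

   # No -format given, so Bio::SeqIO guesses it from the file name
   my $seqio_object = Bio::SeqIO->new(-file => $filename);
   my $seq_object = $seqio_object->next_seq;   # first entry in the file
   my $sequence_length = $seq_object->length();
   my @features = $seq_object->get_SeqFeatures(); # just top level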


The log:
-------------------- WARNING ---------------------
MSG: exception while parsing location line
[join(AADE01003924.1:1..5157,gap(128),complement(AADE01002963.1:1..8959),gap
(50),complement(AA
DE01002322.1:801..13635),AADE01008784.1:1..995,complement(AADE01002422.1:1..
12770),gap(105),complement(AADE01006425.1:1..1791),gap(940),c
omplement(AADE01002137.1:1..15323),gap(962),AADE01003112.1:1..8150,gap(194),
AADE01000989.1:1..38476,AADE01012537.1:1..1696,gap(243),AADE0
1012620.1:1..612,complement(AADE01002972.1:1..8912),gap(1646),complement(AAD
E01009428.1:602..2135),AADE01000086.1:1..143541,complement(AA
...
...
...
01003505.1:1..6496,gap(1445),AADE01004655.1:1..3580,gap(328),AADE01002622.1:
1..11193,gap(90),complement(AADE01006718.1:1..1606),gap(423),
complement(AADE01004351.1:1..4128))] in reading EMBL/GenBank/SwissProt,
ignoring feature CONTIG (seqid=CH379058):
------------- EXCEPTION  -------------
MSG: operator "gap" unrecognized by parser
STACK Bio::Factory::FTLocationFactory::from_string
/usr/lib/perl5/site_perl/5.8.5/Bio/Factory/FTLocationFactory.pm:179
STACK Bio::Factory::FTLocationFactory::from_string
/usr/lib/perl5/site_perl/5.8.5/Bio/Factory/FTLocationFactory.pm:175
STACK (eval) /usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/FTHelper.pm:127
STACK Bio::SeqIO::FTHelper::_generic_seqfeature
/usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/FTHelper.pm:126
STACK Bio::SeqIO::genbank::next_seq
/usr/lib/perl5/site_perl/5.8.5/Bio/SeqIO/genbank.pm:514
STACK toplevel find_orthos.pl:24

This occurs even with the addition of -format => "genbank"
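
Since the warning shows the whole CONTIG location string, the join statement can at least be inspected without going through the location parser. Below is a minimal sketch in plain Perl, with no BioPerl calls. The function name parse_contig and the hash keys are hypothetical. It only handles the three element forms visible in the log above: ACC:start..end, complement(ACC:start..end), and gap(N). It dies on anything else.

```perl
use strict;
use warnings;

# Parse a CONTIG "join(...)" string into a list of hash refs,
# one per segment or gap. (Hypothetical helper for illustration.)
sub parse_contig {
    my ($contig) = @_;

    $contig =~ s/\s+//g;    # undo flat-file line wrapping
    $contig =~ /^join\((.*)\)$/
        or die "Not a join() statement: $contig";
    my $body = $1;

    my @parts;
    # None of the handled element forms contains a comma,
    # so a plain split on commas is enough here.
    for my $elem (split /,/, $body) {
        if ($elem =~ /^gap\((\d+)\)$/) {
            push @parts, { type => 'gap', length => $1 };
        }
        elsif ($elem =~ /^complement\(([\w.]+):(\d+)\.\.(\d+)\)$/) {
            push @parts, { type => 'segment', acc => $1,
                           start => $2, end => $3, strand => -1 };
        }
        elsif ($elem =~ /^([\w.]+):(\d+)\.\.(\d+)$/) {
            push @parts, { type => 'segment', acc => $1,
                           start => $2, end => $3, strand => 1 };
        }
        else {
            die "Unrecognized element: $elem";
        }
    }
    return @parts;
}

# Total length implied by the join: segment spans plus gap sizes.
sub contig_length {
    my $total = 0;
    for my $p (@_) {
        $total += $p->{type} eq 'gap'
            ? $p->{length}
            : $p->{end} - $p->{start} + 1;
    }
    return $total;
}
```

For the first three elements in the log, join(AADE01003924.1:1..5157,gap(128),complement(AADE01002963.1:1..8959)), parse_contig returns three elements and contig_length gives 5157 + 128 + 8959 = 14244.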


 
Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov))
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina





