[Bioperl-l] How to Handle Parse Errors

Heikki Lehvaslaiho heikki at nildram.co.uk
Sat Jul 5 22:46:52 EDT 2003


On Sat, 2003-07-05 at 21:20, Hilmar Lapp wrote:
> Aaron's suggestion was to warn and skip the feature, right?

Brian put the quick fix into Bio::Factory::FTLocationFactory 1.10 by
adding "bond" to the list of keys that need split locations. No
warnings are printed any more. I copied this over to the branch.
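
For illustration only, here is a minimal standalone sketch of the idea.
It is not the actual FTLocationFactory code: %SPLIT_OPS, parse_location
and split_top_level are names made up for this example, and it only
understands "n", "n..m" and nested op(...) forms. The eval at the end
shows one way a caller could skip an unparseable location instead of
dying.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the list of operators that produce split
# locations. This is an assumption for the sketch, NOT the real
# Bio::Factory::FTLocationFactory source.
my %SPLIT_OPS = map { $_ => 1 } qw(join order bond);

# Split a string on commas that are not nested inside parentheses.
sub split_top_level {
    my ($str) = @_;
    my ($depth, $cur, @parts) = (0, '');
    for my $ch (split //, $str) {
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')';
        if ($ch eq ',' && $depth == 0) {
            push @parts, $cur;
            $cur = '';
        } else {
            $cur .= $ch;
        }
    }
    push @parts, $cur if length $cur;
    return @parts;
}

# Turn a location string into a flat list of [start, end] pairs.
# Dies if it meets an operator that is not in %SPLIT_OPS.
sub parse_location {
    my ($str) = @_;
    if ($str =~ /^(\w+)\((.*)\)$/) {
        my ($op, $inner) = ($1, $2);
        die qq{operator "$op" unrecognized by parser\n}
            unless $SPLIT_OPS{$op};
        return map { parse_location($_) } split_top_level($inner);
    }
    if ($str =~ /^(\d+)(?:\.\.(\d+))?$/) {
        return [ $1, defined $2 ? $2 : $1 ];
    }
    die qq{cannot parse location "$str"\n};
}

# The location line from the bug report now parses into sub-locations.
my $loc = 'join(bond(201),bond(203),bond(204),bond(204),bond(204),bond(204))';
my @pairs = parse_location($loc);
print scalar(@pairs), " sub-locations\n";

# Caller-side skipping: trap the exception, warn, and carry on.
# "foo" is a deliberately unknown operator for the demonstration.
for my $line ('join(1..5,10..20)', 'foo(3..8)') {
    my @p = eval { parse_location($line) };
    if ($@) {
        warn "skipping [$line]: $@";
        next;
    }
    print "$line => ", join(',', map { "$_->[0]..$_->[1]" } @p), "\n";
}
```

Removing "bond" from %SPLIT_OPS makes the first call die with the
"unrecognized by parser" message, which is roughly the situation in the
original report.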

	-Heikki

> I agree with this. Handling operators in the location line that carry
> feature semantics requires more than a quick fix.
> 
> 	-hilmar
> 
> On Friday, July 4, 2003, at 07:25  AM, Heikki Lehvaslaiho wrote:
> 
> > dmcwilli,
> >
> > This is bug report #1371.  I'll do the fixes Aaron suggests now so that
> > we get the fixes into the bioperl release next week. If someone wants to
> > do something more clever with undocumented feature keys, feel free, but
> > only in the cvs head.
> >
> > Thanks for reminding me of this,
> >
> > 	-Heikki
> >
> > On Fri, 2003-07-04 at 14:28, dmcwilli wrote:
> >> There was a question like this in May, I think, but I have been unable
> >> to find help for this in the FAQ or recent postings.
> >>
> >> I am trying to parse GenBank records and find those which have the
> >> Feature /region_name="Transit peptide".  I did a broad Entrez search
> >> and downloaded the results, so I'm accessing the file locally.  The
> >> parser fails and exits the script prematurely when it encounters a
> >> record with the Feature "Het" with the message:
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: exception while parsing location line
> >> [join(bond(201),bond(203),bond(204),bond(204),bond(204),bond(204))] in
> >> reading EMBL/GenBank/SwissProt, ignoring feature Het (seqid=8RUC_G):
> >> ------------- EXCEPTION -------------
> >> MSG: operator "bond" unrecognized by parser STACK
> >> Bio::Factory::FTLocationFactory::from_string
> >> /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:160
> >> STACK Bio::Factory::FTLocationFactory::from_string
> >> /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:157
> >> STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:124
> >> STACK Bio::SeqIO::FTHelper::_generic_seqfeature
> >> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:123 STACK
> >> Bio::SeqIO::genbank::next_seq
> >> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm:396 STACK toplevel
> >> ./biopl5.pl:20
> >> --------------------------------------
> >> ---------------------------------------------------
> >> Can't call method "primary_tag" on an undefined value at
> >> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm line 400, <GEN0>
> >> line 23630.
> >> # end of message
> >>
> >> My code is:
> >>
> >> #!/usr/bin/perl
> >> #
> >> # tpfilter.pl
> >> # Get transit peptides from files in genbank format.  Uses BioPerl
> >> # David R. McWilliams dmcwilli at utk.edu
> >> # 04-Jul-03
> >>
> >> use strict;
> >> use warnings ;
> >> use Bio::SeqIO;
> >> use Bio::Seq;
> >>
> >> my $file = shift @ARGV;
> >> my $in = Bio::SeqIO->new(-format => 'genbank', -file => $file);
> >>
> >> my $datetime = scalar(localtime()) ;
> >> print "# Output of $0 on $file.\n" ;
> >> print "# $datetime\n" ;
> >>
> >> my $fnd = 0 ;
> >> while ( my $seq = $in->next_seq ) {
> >>     foreach my $feature ( $seq->get_SeqFeatures ) {
> >>         if ( $feature->primary_tag eq 'Region' ) {
> >>             if ( $feature->has_tag('region_name') ) {
> >>                 my ($tag) = $feature->get_tag_values('region_name');
> >>                 if ( $tag =~ /transit|signal/i ) {
> >>                     $fnd++;
> >>                     print ">", $seq->display_id(), "|",
> >>                           "tp=", $feature->start, "..", $feature->end, "|",
> >>                           $seq->species->binomial(), "|",
> >>                           $seq->description(), "\n";
> >>                     print $seq->subseq($feature->start, $feature->end), "\n";
> >>                 }
> >>             }
> >>         }
> >>     }
> >> }
> >> print "# Found $fnd seqs w/ tp.\n" ;
> >> 		
> >> # end code
> >>
> >> If I remove the offending records by hand, this works fine.  So, is
> >> there a way to continue to parse the offending records, even though
> >> the parser does not recognize this particular feature, or is there a
> >> way to catch the error and skip the record without aborting the rest
> >> of the script?
> >>
> >> Regards,
> >> 	
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at portal.open-bio.org
> >> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > -- 
> > ______ _/      _/_____________________________________________________
> >       _/      _/                      http://www.ebi.ac.uk/mutations/
> >      _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
> >     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
> >    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
> >   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
> >      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> > ___ _/_/_/_/_/________________________________________________________