[Bioperl-l] Odd problem with get_tag_values

Adlai Burman adlai at refenestration.com
Fri Feb 24 21:22:09 UTC 2012


Hey, Brian.
No, I am not absolutely sure about that, but I am checking now. In the process of checking, I found that one of the files that parsed successfully (NC_015139) has NO feature tags other than "source": no "CDS", no "gene", etc.
OK, now I am sure. I checked one record that didn't parse, and all of its CDSs have "gene" tags. Go figure.
Regarding your suggestion: I agree, checking for the existence of a tag is an important thing to do and everything parses great in a script I wrote which uses that. This, however, might be problematic for two reasons:
(1) The fact that the aforementioned featureless record parses, while one of the crashers does have a full complement of properly placed "gene" tags, suggests that this might not address the problem, and
(2) On a more humbling note, I don't know how to embed such a check into the one-line hash generator, my %strands = map {$_->get_tag_values('gene'), $_->strand} @cds_features; , which would be perfect for what I am coding now.
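
One possible way to embed the check is to filter with grep before the map runs, so that get_tag_values is only ever called on features that pass has_tag. This is a minimal sketch, not a tested fix for the files in question: the two features are made-up stand-ins for parsed CDS features, and it keeps only the first "gene" value, because get_tag_values returns a list and a feature with several values would otherwise misalign the key/value pairs.

```perl
use strict;
use warnings;
use Bio::SeqFeature::Generic;

# Hypothetical stand-ins for parsed features: one CDS with a "gene" tag
# and one without. Real features would come from get_SeqFeatures.
my @cds_features = (
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 1, -end => 90, -strand => 1,
        -tag         => { gene => 'rbcL' },
    ),
    Bio::SeqFeature::Generic->new(
        -primary_tag => 'CDS', -start => 200, -end => 290, -strand => -1,
        -tag         => { product => 'hypothetical protein' },
    ),
);

# grep drops features lacking a "gene" tag before map sees them.
# The inner parentheses make Perl parse the braces as a block, and
# [0] takes only the first value of the tag.
my %strands = map  { ( ($_->get_tag_values('gene'))[0] => $_->strand ) }
              grep { $_->has_tag('gene') }
              @cds_features;

print "$_\t$strands{$_}\n" for sort keys %strands;
```

The same reasoning would explain why a record with no CDS features parses cleanly: the list handed to map is empty, so get_tag_values is never called.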

Thanks for your response,
Adlai
On Feb 24, 2012, at 10:03 PM, Brian Osborne wrote:

> Adlai,
> 
> You are absolutely sure that every single CDS feature has a "gene" tag inside it?
> 
> If this is not the case then you have to use the "if ($cds_feature->has_tag("gene")) …" type of logic.
> 
> Brian O.
> 
> On Feb 24, 2012, at 3:43 PM, Adlai Burman wrote:
> 
>> I have come across a perplexing problem with trying to parse sequence features into hashes from gb records. This is the minimal code which shows my problem:
>> 
>> #!/usr/bin/perl
>> use strict;
>> use warnings;
>> use IO::String;
>> use Bio::Perl;
>> use Bio::SeqIO;
>> 
>> my @files = </Users/adlai/Dropbox/atrsh/*>;
>> foreach my $file(@files){
>> 
>> 
>> my @cds_features = grep {$_->primary_tag eq 'CDS' } Bio::SeqIO->new(-file => $file)->next_seq->get_SeqFeatures;
>> my %strands = map {$_->get_tag_values('gene'), $_->strand} @cds_features; ##This Is The Culprit. 
>> .
>> .
>> .
>> #do nifty stuff
>> }
>> 
>> For some files this approach works just fine.
>> For others the script dies immediately with the error message:
>> 
>> ------------- EXCEPTION -------------
>> MSG: asking for tag value that does not exist gene
>> STACK Bio::SeqFeature::Generic::get_tag_values /Users/adlai/Downloads/BioPerl-1.6.1/Bio/SeqFeature/Generic.pm:517
>> STACK toplevel tosend.pl:16
>> -------------------------------------
>> 
>> The difference in the files that parse and those that don't seems to be that the files that crash have "intron" and "exon" tags. They ALL have "gene" tags.
>> Does anyone know why this is a problem and what can be done to circumvent it?
>> 
>> Thanks,
>> Adlai
>> 
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 




