[Bioperl-l] Parsing "PCR_primers" tag from GenBank file

Roy Chaudhuri roy.chaudhuri at gmail.com
Sun Jun 14 21:24:34 UTC 2015


Hi Horacio,

The two "satellite" tags in GQ344853 are in different features, so your
code prints them separately, whereas the two PCR_primers tags are both
in the same feature (source). get_tag_values (and get_tagset_values,
which is similar but doesn't throw an error if the tag isn't found)
returns a list with one element per occurrence of the tag in the
feature, so you need to loop over that list if you want to separate the
values. Your original code passed the list directly to print, which
writes the elements one after the other with no separator between them.

Here's a modification of your code which should be closer to what you want:

#!/usr/bin/env perl
use strict;
use warnings FATAL=>qw(all);
use Bio::SeqIO;
my $seqio_object = Bio::SeqIO->new(-file => 'GQ344853.gb' );
while (my $seq = $seqio_object->next_seq) {
    # Header line for the record: identifier and sequence length
    print $seq->primary_id, "\t", $seq->length, "\n";
    for my $feat_object ($seq->get_SeqFeatures) {
        for my $tag (qw(satellite PCR_primers)) {
            # get_tagset_values returns an empty list if the tag is absent,
            # so no has_tag check is needed; print one line per value
            for my $value ($feat_object->get_tagset_values($tag)) {
                print "$tag\t$value\n";
            }
        }
    }
}
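
To see why the original output ran together without needing the GenBank file, here is a minimal plain-Perl sketch. It assumes nothing about BioPerl beyond "the call returns a list": the two strings are made-up placeholders rather than the real qualifier values from GQ344853, and tag_lines is a hypothetical helper, not a BioPerl method.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): one "tag<TAB>value" line per value.
sub tag_lines {
    my ($tag, @values) = @_;
    return map { "$tag\t$_\n" } @values;
}

# Made-up placeholder values standing in for the list that
# $feat_object->get_tag_values('PCR_primers') returns for a feature
# carrying two PCR_primers qualifiers.
my @values = ('fwd_seq: aaaa, rev_seq: cccc', 'fwd_seq: gggg, rev_seq: tttt');

# print writes list elements back to back (the output field separator $,
# is empty by default), so the two values run together here.
print @values, "\n";

# join inserts an explicit separator between the values.
print join("\t", @values), "\n";

# One line per value, as in the nested loop above.
print tag_lines('PCR_primers', @values);
```

If you would rather keep everything for a record on a single line, the join form is probably the smallest change to your original script.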

Cheers,
Roy.



On 14/06/2015 00:34, Horacio Montenegro wrote:
>      hi,
>
>      I am trying to parse a GenBank file to extract primers and other
> info, outputting it separated with tabs. However, some records have
> two "PCR_primers" tags, and they are not being separated. The
> "satellite" tag also is doubled, but each one is being correctly
> separated with tabs. How can I manage to output each primer pair
> separated?
>
>      thanks, Horacio
>
>      One example of a record with two "PCR_primers" tags is Accession
> GQ344853, GI 282937571. Below is the code snippet to reproduce the
> behaviour:
>
> #!/usr/bin/env perl
> use Bio::DB::EUtilities;
> use Bio::SeqIO;
> my $seqio_object = Bio::SeqIO->new(-file => 'GQ344853.gb' );
> while (my $seq = $seqio_object->next_seq) {
>      print $seq->primary_id, "\t", $seq->length, "\t";
>      for my $feat_object ($seq->get_SeqFeatures) {
>          print $feat_object->get_tag_values("satellite"), "\t" if
> ($feat_object->has_tag("satellite"));
>          print $feat_object->get_tag_values("PCR_primers"), "\t" if
> ($feat_object->has_tag('PCR_primers'));
>      }
>      print "\n";
> }