[Bioperl-l] Question about parsing a gb file

Torsten Seemann torsten.seemann at infotech.monash.edu.au
Sun Mar 29 20:25:48 EDT 2009


> Hi everybody, I have a little problem/question in parsing a genbank file.
> I've got a Bio::Seq object $s to which I've added
> some Bio::SeqFeature::Generic features. Everything seems to be OK so far, since I can
> see all the properties of $s set correctly in my visual debugger;
> for instance, I can find the display_name property of the SeqFeature in
> the $s object.
> Then I run print Bio::SeqIO->new(-format => 'genbank')->write_seq($s)
> to write out the genbank file, but in the output I can no longer find some
> properties of the sequence, like the "display_name".
> What is happening?
> my $s = $str->next_seq();
> my $f = Bio::SeqFeature::Generic->new(
>            -start        => 10,
>            -end          => 100,
>            -strand       => -1,
>            -primary      => 'CDS', # -primary_tag is a synonym
>            -source_tag   => 'repeatmasker',
>            -display_name => 'alu family'
>             );
> $s->add_SeqFeature($f);
> print Bio::SeqIO->new(-format => 'genbank')->write_seq($s)

The logical conclusion is that the 'genbank' output format does not
store the -display_name attribute of a SeqFeature. If you look at the
output of your script you will see only this:

     CDS             complement(10..100)

You will have to add appropriate -tag => { name => value, .... } entries to
your SeqFeature, using qualifier names from the Genbank/EMBL feature table:
http://www.ncbi.nlm.nih.gov/collab/FT/

In particular I think you want to do the following:

my $f = Bio::SeqFeature::Generic->new(
            -start        => 10,
            -end          => 100,
            -strand       => -1,
            -primary      => 'CDS', # -primary_tag is a synonym
            -tag          => {      # qualifiers written to the feature table
               product   => 'alu family',
               note      => 'repeatmasker',
               locus_tag => 'GENE00432',  # etc
            },
);
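
For reference, here is a minimal end-to-end sketch of the same idea, assuming BioPerl is installed. The sequence, its display_id and the tag values are made up for illustration, and the output is collected in an in-memory string ($genbank) only so that it is easy to inspect; writing straight to a file or STDOUT should work the same way.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqIO;

# Hypothetical 120 bp sequence, long enough to hold a 10..100 feature.
my $s = Bio::Seq->new(
    -seq        => 'ACGT' x 30,
    -display_id => 'example',
    -alphabet   => 'dna',
);

# Store the name as a feature-table qualifier instead of -display_name.
my $f = Bio::SeqFeature::Generic->new(
    -start   => 10,
    -end     => 100,
    -strand  => -1,
    -primary => 'CDS',
    -tag     => {
        product   => 'alu family',
        note      => 'repeatmasker',
        locus_tag => 'GENE00432',
    },
);
$s->add_SeqFeature($f);

# Write the record to an in-memory filehandle backed by $genbank.
my $genbank = '';
open(my $fh, '>', \$genbank) or die "Cannot open in-memory handle: $!";
Bio::SeqIO->new(-format => 'genbank', -fh => $fh)->write_seq($s);

print $genbank;

# The tags can also be read back from the feature object itself.
my ($product) = $f->get_tag_values('product');
print "product tag: $product\n";
```

The CDS entry in the printed record should now carry /product, /note and /locus_tag qualifiers under the complement(10..100) location line.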

Hope this helps,

--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
University, AUSTRALIA


