[Bioperl-l] Question about parsing a gb file
Mark A. Jensen
maj at fortinbras.us
Sun Mar 29 20:42:28 EDT 2009
Paolo, you may also get some insight by looking through the thread started by
Govind Chandra subsequent to this one; see Chris's and Hilmar's
informative comments there regarding SeqFeature and Annotation.
cheers Mark
----- Original Message -----
From: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>
To: "Paolo Pavan" <paolo.pavan at gmail.com>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Sunday, March 29, 2009 8:25 PM
Subject: Re: [Bioperl-l] Question about parsing a gb file
> Hi everybody, I have a little problem/question about parsing a GenBank file.
> I've got a Bio::Seq object $s to which I've added
> some Bio::SeqFeature::Generic features. Everything seems to be OK at this
> point, since I can see all the properties of $s set correctly in my visual
> debugger; for instance, I can find the display_name property of the
> SeqFeature in the $s object.
> Then I call print Bio::SeqIO->new(-format => 'genbank')->write_seq($s)
> to write out the GenBank file, but there I can no longer find some
> properties of the sequence, like the "display_name".
> What is happening?
> my $s = $str->next_seq();
> my $f = Bio::SeqFeature::Generic->new(
> -start => 10,
> -end => 100,
> -strand => -1,
> -primary => 'CDS', # -primary_tag is a synonym
> -source_tag => 'repeatmasker',
> -display_name => 'alu family'
> );
> $s->add_SeqFeature($f);
> print Bio::SeqIO->new(-format => 'genbank')->write_seq($s)
The logical conclusion is that the 'genbank' output format does not
store the -display_name attribute of a SeqFeature. If you look at the
output of your script you will see only this:
CDS complement(10..100)
You will have to add appropriate -tag => { name => value, ... } entries to
your SeqFeature, using qualifier names from the Genbank/EMBL feature table:
http://www.ncbi.nlm.nih.gov/collab/FT/
In particular I think you want to do the following:
my $f = Bio::SeqFeature::Generic->new(
    -start   => 10, -end => 100,
    -strand  => -1,
    -primary => 'CDS', # -primary_tag is a synonym
    -tag     => {      # each entry is written as a /name="value" qualifier
        product   => 'alu family',
        note      => 'repeatmasker',
        locus_tag => 'GENE00432', # etc
    }
);
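If the feature already exists and carries a display_name, one option is to copy that value into a qualifier before writing. The following is only a minimal sketch under a few assumptions: the sequence is a made-up placeholder standing in for $str->next_seq(), and mapping display_name to 'product' and source_tag to 'note' is just one plausible choice of qualifiers, not something the genbank writer does for you.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::SeqIO;

# Placeholder sequence (assumption); in the thread $s comes from $str->next_seq().
my $s = Bio::Seq->new(
    -display_id => 'example',
    -alphabet   => 'dna',
    -seq        => 'ACGT' x 30,    # 120 bp, long enough for a 10..100 feature
);

# The feature as originally built: display_name and source_tag, but no tags.
my $f = Bio::SeqFeature::Generic->new(
    -start        => 10,
    -end          => 100,
    -strand       => -1,
    -primary      => 'CDS',
    -source_tag   => 'repeatmasker',
    -display_name => 'alu family',
);

# Copy the values into qualifiers so they show up in the feature table.
# The choice of 'product' and 'note' is an assumption for this example.
my $name = $f->display_name;
$f->add_tag_value( product => $name ) if defined $name && length $name;
my $source = $f->source_tag;
$f->add_tag_value( note => $source ) if defined $source && length $source;

$s->add_SeqFeature($f);

# Write the record into an in-memory buffer rather than straight to STDOUT.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory filehandle: $!";
my $out = Bio::SeqIO->new( -fh => $fh, -format => 'genbank' );
$out->write_seq($s);
$out->close;

print $buffer;
```

With the qualifiers in place, the CDS entry in the FEATURES table should carry /product and /note lines under the complement(10..100) location. Note that write_seq() does the printing itself, so wrapping it in print, as in the original snippet, also prints its return value.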
Hope this helps,
--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
University, AUSTRALIA