[Bioperl-l] minor bug in Bio::FeatureIO::gff

Chris Fields cjfields at illinois.edu
Fri Mar 13 04:01:10 UTC 2009


Russell,

I would file that in Bugzilla.  We need to take it into
consideration when refactoring Bio::FeatureIO.

chris

On Mar 12, 2009, at 9:07 PM, Smithies, Russell wrote:

> I think there's a bug in Bio::FeatureIO::gff when it reads FASTA
> sequence embedded in a GFF file.
> If the GFF file has no ##FASTA directive, the parser ignores the
> FASTA header and takes the first line of sequence as the primary_id
> and display_id.
>
> For example, here's some GFF:
>
> super_1:34972746,34974962	BlastN	barley_ta_match	1558	1764	.	+	.	Parent=barley_transgrp_blast:TC135274;Note=%22%22
> super_1:34972746,34974962	BlastN	barley_ta_match	1911	2262	.	+	.	Parent=barley_transgrp_blast:TC135274;Note=%22%22
> >super_1:34972746,34974962
> ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT
>
>
> This is what I get from Data::Dumper:
> $VAR1 = bless( {
>                 'primary_id' =>  
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC',
>                 'primary_seq' => bless( {
>                                           'display_id' =>  
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> ',
>                                           'primary_id' =>  
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> ',
>                                           'desc' => '',
>                                           'seq' =>  
> 'GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT',
>                                           'alphabet' => 'dna'
>                                         }, 'Bio::PrimarySeq' )
>               }, 'Bio::Seq' );
>
> If I put the ##FASTA directive back in the GFF file,
> I get this (which is correct) from Data::Dumper:
> $VAR1 = bless( {
>                 'primary_id' => 'super_1:34972746,34974962',
>                 'primary_seq' => bless( {
>                                           'display_id' =>  
> 'super_1:34972746,34974962',
>                                           'primary_id' =>  
> 'super_1:34972746,34974962',
>                                           'desc' => '',
>                                           'seq' =>  
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACCGTTGCCG
> CCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT',
>                                           'alphabet' => 'dna'
>                                         }, 'Bio::PrimarySeq' )
>               }, 'Bio::Seq' );
>
>
> It also breaks other things, because the $seq->end coordinate is now
> greater than the sequence length.
> Also, I think _handle_feature should warn rather than die with a stack
> trace when it gets an unknown directive type, if only to stop it dying
> when reading GFF dumped from GBrowse.
>
>
> --Russell
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
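
For illustration, the behaviour Russell asks for (treat a '>' line as
the start of the FASTA section even without ##FASTA, and warn instead
of dying on an unknown directive) could be sketched in plain Perl
roughly as below. This is not the Bio::FeatureIO::gff code and does not
use BioPerl at all; split_gff_and_fasta, %KNOWN_DIRECTIVE and the
returned hash layout are hypothetical names made up for this sketch,
and the list of known directives is an assumption that may be
incomplete.

```perl
use strict;
use warnings;

# Directive names this sketch recognises (assumed list, not BioPerl's);
# anything else produces a warning rather than a fatal error.
my %KNOWN_DIRECTIVE = map { $_ => 1 } qw(
    gff-version sequence-region feature-ontology attribute-ontology
    source-ontology species genome-build
);

# Hypothetical helper: split GFF text into feature lines, directive
# lines, and FASTA sequences keyed by the first word of each header.
sub split_gff_and_fasta {
    my ($text) = @_;
    my (@features, @directives, %seqs);
    my $in_fasta = 0;
    my $current_id;

    for my $line (split /\r?\n/, $text) {
        next if $line =~ /^\s*$/;                 # skip blank lines

        # A '>' line starts a FASTA record, with or without ##FASTA.
        if ($line =~ /^>\s*(\S+)/) {
            $in_fasta   = 1;
            $current_id = $1;
            $seqs{$current_id} = '' unless exists $seqs{$current_id};
            next;
        }
        if ($in_fasta) {
            if (defined $current_id) {
                (my $residues = $line) =~ s/\s+//g;
                $seqs{$current_id} .= $residues;  # append sequence line
            }
            else {
                warn "Sequence line before any FASTA header, skipping\n";
            }
            next;
        }
        if ($line =~ /^##FASTA\s*$/) { $in_fasta = 1; next; }
        next if $line =~ /^###\s*$/;              # '###' separator line
        if ($line =~ /^##\s*(\S+)/) {
            warn "Unknown directive '$1', ignoring\n"
                unless $KNOWN_DIRECTIVE{$1};
            push @directives, $line;
            next;
        }
        next if $line =~ /^#/;                    # ordinary comment
        push @features, $line;
    }
    return { features => \@features, directives => \@directives,
             seqs => \%seqs };
}

# Usage with data from the report: one feature line, no ##FASTA directive.
my $gff = join "\n",
    join("\t", 'super_1:34972746,34974962', 'BlastN', 'barley_ta_match',
         1558, 1764, '.', '+', '.',
         'Parent=barley_transgrp_blast:TC135274;Note=%22%22'),
    '>super_1:34972746,34974962',
    'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC',
    'GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT';

my $parsed = split_gff_and_fasta($gff);
for my $id (sort keys %{ $parsed->{seqs} }) {
    printf "%s => %d residues\n", $id, length($parsed->{seqs}{$id});
}
```

With this approach the header line supplies the ID and both sequence
lines end up in the sequence, which matches the "correct" Data::Dumper
output quoted above. Whether the real fix belongs in the FASTA handoff
or in the directive handling of Bio::FeatureIO::gff is a separate
question for the refactoring.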



