[Bioperl-l] minor bug in Bio::FeatureIO::gff
Chris Fields
cjfields at illinois.edu
Fri Mar 13 04:01:10 UTC 2009
Russell,
I would file that in bugzilla. We need to take that into
consideration when refactoring Bio::FeatureIO.
chris
On Mar 12, 2009, at 9:07 PM, Smithies, Russell wrote:
> I think there's a bug in Bio::FeatureIO::gff when it's reading FASTA
> from a GFF file.
> If there's no ##FASTA directive in the GFF file, it ignores the
> FASTA header and takes the first line of sequence as the primary_id
> and display_id.
>
> Eg:
>
> Here's some gff:
>
> super_1:34972746,34974962 BlastN barley_ta_match 1558 1764 . + . Parent=barley_transgrp_blast:TC135274;Note=%22%22
> super_1:34972746,34974962 BlastN barley_ta_match 1911 2262 . + . Parent=barley_transgrp_blast:TC135274;Note=%22%22
> >super_1:34972746,34974962
> ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT
>
>
> This is what I get from Data::Dumper:
> $VAR1 = bless( {
> 'primary_id' =>
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC',
> 'primary_seq' => bless( {
> 'display_id' =>
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> ',
> 'primary_id' =>
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC
> ',
> 'desc' => '',
> 'seq' =>
> 'GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT',
> 'alphabet' => 'dna'
> }, 'Bio::PrimarySeq' )
> }, 'Bio::Seq' );
>
> If I put the ##FASTA directive back in the GFF file,
> I get this (which is correct) from Data::Dumper:
> $VAR1 = bless( {
> 'primary_id' => 'super_1:34972746,34974962',
> 'primary_seq' => bless( {
> 'display_id' =>
> 'super_1:34972746,34974962',
> 'primary_id' =>
> 'super_1:34972746,34974962',
> 'desc' => '',
> 'seq' =>
> 'ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACCGTTGCCG
> CCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT',
> 'alphabet' => 'dna'
> }, 'Bio::PrimarySeq' )
> }, 'Bio::Seq' );
>
>
> It also breaks other stuff, as the $seq->end coord is now greater than
> the sequence length.
> Also, I think _handle_feature should warn rather than die with a
> stack trace when it gets an unknown directive type, if only to stop
> it dying when reading GFF dumped from GBrowse.
>
>
> --Russell
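
For illustration only, here is a minimal sketch of the behaviour the report asks for: treat a line starting with '>' as the beginning of the FASTA section even when the ##FASTA directive is missing. It is plain Perl and does not use or reflect the actual Bio::FeatureIO::gff code; split_gff_and_fasta is a hypothetical helper, and falling back on the '>' header is an assumption about the desired fix, not a description of what BioPerl does.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: split the lines of a GFF file
# into feature lines and FASTA records. The FASTA section starts at an
# explicit ##FASTA directive or, as a fallback, at the first '>' header.
sub split_gff_and_fasta {
    my @lines = @_;
    my (@features, %seqs, $in_fasta, $id);
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;          # strip the line ending
        next if $line =~ /^\s*$/;      # skip blank lines
        if ($line =~ /^##FASTA/) {     # explicit start of the FASTA section
            $in_fasta = 1;
            next;
        }
        if ($line =~ /^>\s*(\S+)/) {   # FASTA header, with or without ##FASTA
            $in_fasta = 1;
            $id       = $1;            # first word of the header is the id
            $seqs{$id} = '';
            next;
        }
        if ($in_fasta) {
            # sequence line: append to the current record
            $seqs{$id} .= $line if defined $id;
        }
        elsif ($line !~ /^#/) {
            push @features, $line;     # ordinary feature line
        }
    }
    return (\@features, \%seqs);
}

# Sample input modelled on the report: features followed directly by a
# FASTA record, with no ##FASTA directive in between.
my @gff = (
    "super_1:34972746,34974962\tBlastN\tbarley_ta_match\t1558\t1764\t.\t+\t.\tParent=barley_transgrp_blast:TC135274\n",
    ">super_1:34972746,34974962\n",
    "ATGGGGCGCGGCTGGAGGGGGTTGTTGTTGCTGATTCTGCCGCTTCTCTGCTTCGTGACC\n",
    "GTTGCCGCCGCGGCGGACGCCTCCGCGGGCGACGCCGATCCGGTCTACAGGTCAGTGGTT\n",
);

my ($features, $seqs) = split_gff_and_fasta(@gff);
for my $seq_id (sort keys %$seqs) {
    printf "%s: %d bp\n", $seq_id, length $seqs->{$seq_id};
}
```

With this approach the header line supplies the id and both sequence lines end up in the sequence, which matches the output shown above for the file that does contain ##FASTA.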