[Bioperl-l] GFF file output missing semicolon

Wes Barris wes.barris at csiro.au
Sun Nov 23 19:35:46 EST 2003


Jason Stajich wrote:

> I think that the gff2 dumping was not particularly good - I think I made
> some fixes to clean it up on the main trunk in the last few months.  I can
> certainly dump with Bio::Tools::GFF and load into Gbrowse just fine with
> the current code.  Wes, you might try the bioperl 1.3.x series
> Bio::Tools::GFF instead.

I could try, but this is running on a production server, and I had a heck
of a time finding a combination of bioperl and gbrowse that would work
together.  At the time, the only combo I could get to work was
bioperl-1.2.2 and gbrowse-1.50.  If I installed bioperl-1.2.3, what
version of gbrowse is guaranteed to work with it?
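
In the meantime, one stopgap that would leave the installed versions alone
is to post-process each GFF2 line after it is written.  The following is
only a rough sketch of that idea, not something tested against gbrowse.
The helper name normalize_gff2_group is hypothetical (it is not part of
bioperl), and it assumes tab-separated columns and that no quoted value
contains a semicolon.  It collapses the whitespace around the semicolons
and terminates the last tag/value pair with one.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of bioperl: tidy the group (9th) column of
# one GFF2 line so that every tag/value pair, including the last, is
# followed by a semicolon.
sub normalize_gff2_group {
    my ($line) = @_;
    chomp $line;

    # GFF2 columns are tab-separated; the 9th column holds the tag/value pairs.
    my @cols = split /\t/, $line, 9;
    return $line unless @cols == 9;    # comment or malformed line: leave as-is

    my $group = $cols[8];
    $group =~ s/^\s+//;        # drop leading whitespace
    $group =~ s/\s+$//;        # drop trailing whitespace
    $group =~ s/\s*;\s*$//;    # drop an existing terminal semicolon, if any

    # Naive split on semicolons with any surrounding whitespace.
    # Assumption: no quoted value contains a semicolon.
    my @pairs = grep { length } split /\s*;\s*/, $group;

    $cols[8] = join(' ; ', @pairs) . ' ;';
    return join("\t", @cols);
}
```

It could be run as a filter over the finished file, calling
normalize_gff2_group() on each input line and printing the result followed
by a newline.  Since the spec does not require the terminal semicolon, this
only works around the browser; it does not make the file any more valid.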


> 
> -jason
> 
> On Fri, 21 Nov 2003, Lincoln Stein wrote:
> 
> 
>>Hi,
>>
>>The GFF2 spec specifies that the semicolon separates tag/value pairs.  It does
>>not say that the last tag/value should be terminated by a semicolon.  It also
>>specifies that any amount of whitespace can occur around the semicolon.
>>
>>Lincoln
>>
>>On Thursday 20 November 2003 11:19 pm, Wes Barris wrote:
>>
>>>Hi,
>>>
>>>I have written a bioperl program that parses blast files and generates
>>>a gff file.  I have everything working except there is one small detail
>>>that I have not been able to figure out.  When generating each line
>>>of gff output, the semicolon is left off at the end of the Accession
>>>name.  Here is a sample line from a gff file that I generated:
>>>
>>>AF354168        mirseeker       pred_miRNA      188152  188251  198     -
>>>   . Note "mirseeker score 17.58"   ; Accession
>>>"s-h_19_r_99330000-99363000"
>>>
>>>Notice that:
>>>
>>>1) There are three space characters between the Note value and the
>>>    semicolon that precedes "Accession".
>>>
>>>2) At the end of the line, after the Accession, there are three space
>>>    characters and no semicolon.  Without that semicolon, the genome
>>>    browser doesn't display the "rollover" information properly.
>>>
>>>3) The "Note" field is written before the "Accession" field.  I thought
>>>    that the Accession should come first.
>>>
>>>Here is the relevant portion of my code:
>>>
>>>       while( my $hsp = $hit->next_hsp ) {
>>>          my $strand = 1;
>>>          $strand = -1 if ($hsp->strand('query') == -1 ||
>>>                           $hsp->strand('hit') == -1);
>>>          my $feature = new Bio::SeqFeature::Generic(
>>>                         -source_tag=>$source,
>>>                         -primary_tag=>$feature_type,
>>>                         -start=>$hsp->start('hit'),
>>>                         -end=>$hsp->end('hit'),
>>>                         -score=>$hit->raw_score,
>>>                         -strand=>$strand,
>>>                         -tag=>{
>>>                                 Accession=>$result->query_name,
>>>                                 Note=>$result->query_description,
>>>                                 }
>>>                         );
>>>          $feature->seq_id($hit->accession);
>>>          $gffio->write_feature($feature);       #Bio::SeqFeatureI
>>>       }
>>>
>>>Perhaps I am not adding the "Accession" and "Note" fields properly???
>>
>>
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu


-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au


