[Bioperl-l] methods, etc. for Bio::SearchIO on exonerate output

Jason Stajich jason.stajich at duke.edu
Wed Aug 31 13:34:44 EDT 2005


http://fungal.genome.duke.edu/~jes12/software/scripts/process_exonerate_gff3.pl

You may still want to massage it some, but I use the script in this
basic form, maybe with a few tweaks.

Note that it requires you to run exonerate with specific --ryo
options so that the report output includes the lengths of the query
and hit sequences. This should be covered in the perldoc in the script.

Without the --ryo options enabled, you'll need to modify the script
further so that it has access to the original sequence db: use
Bio::DB::Fasta and put in some $dbh->length($seqid) calls instead.
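
As a rough illustration of what those length lookups supply, here is a
minimal sketch in Python rather than Perl. It is a stand-in, not the
script's code and not the Bio::DB::Fasta API: the function name
fasta_lengths and the example records are hypothetical, and it reads the
whole FASTA stream instead of using an index the way Bio::DB::Fasta does.

```python
import io


def fasta_lengths(handle):
    """Return {seq_id: length} for every record in a FASTA stream."""
    lengths = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            lengths[seq_id] = 0
        elif seq_id is not None:
            # Sequence lines may be wrapped, so accumulate across lines.
            lengths[seq_id] += len(line)
    return lengths


# Hypothetical two-record database, used only to exercise the function.
example = io.StringIO(">contig1 demo\nACGTAC\nGT\n>prot1\nMKV\n")
lengths = fasta_lengths(example)
```

Looking up lengths["contig1"] then plays the role that
$dbh->length($seqid) would play in the Perl script.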

I don't think the part that writes the HSP/match lines is actually
correct - it tries to roll up gapped HSPs from the similarity features.

I end up ignoring all but the 'exon' and 'gene' lines for my GBrowse
instance and/or grepping out just the lines I really think I need.
You may want to s/exon/CDS/ for the protein2genome output as well.
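
The filtering and renaming described above could be sketched as follows,
again in Python rather than Perl. This is an assumption-laden example:
filter_gff is a hypothetical helper, the sample lines are made up, and it
assumes tab-delimited nine-column GFF with the feature type in column 3.

```python
def filter_gff(lines, keep=("gene", "exon"), rename=None):
    """Keep only GFF features whose type is in `keep`, optionally renaming types."""
    rename = rename or {}
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Pass directives and comments (e.g. ##gff-version) through unchanged.
            out.append(line)
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] not in keep:
            continue
        # Rename only the type column, never other fields.
        cols[2] = rename.get(cols[2], cols[2])
        out.append("\t".join(cols))
    return out


# Made-up sample records, for illustration only.
sample = [
    "##gff-version 3",
    "contig1\texonerate\tgene\t100\t900\t250\t+\t.\tID=gene1",
    "contig1\texonerate\texon\t100\t300\t.\t+\t.\tParent=gene1",
    "contig1\texonerate\tsimilarity\t100\t900\t250\t+\t.\tID=sim1",
]
filtered = filter_gff(sample, rename={"exon": "CDS"})
```

Unlike a blanket s/exon/CDS/, this touches only the type column, so an ID
or attribute that happens to contain "exon" is left alone.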

-jason

On Aug 31, 2005, at 1:04 PM, Cook, Malcolm wrote:

> Jason,
>
> This message is in regard to an old thread in which you offered
> to share a 'script for munging over' exonerate output for loading
> in DB::GFF (cf. http://bioperl.org/pipermail/bioperl-l/2005-April/018741.html)
>
> Would you still be willing to share that script, if you've got it
> around?
>
> Thanks, and regards,
>
> Malcolm Cook - mec at stowers-institute.org - 816-926-4449
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, MO  USA
>
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12



