[Bioperl-l] methods, etc. for Bio::SearchIO on exonerate output

Cook, Malcolm MEC at Stowers-Institute.org
Fri Sep 2 10:36:59 EDT 2005


Jason,
 
Thanks for the scripts and clues (esp re: using the --ryo option to
inject the needed length into the exonerate output to compensate).
 
I'm considering asking the exonerate author to conform to the GFF spec.
Do you think this is a road worth taking?
 
Cheers,
 
Malcolm
 
-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at duke.edu] 
Sent: Wednesday, August 31, 2005 12:35 PM
To: Cook, Malcolm
Cc: bioperl-l
Subject: Re: [Bioperl-l] methods, etc. for Bio::SearchIO on exonerate
output


http://fungal.genome.duke.edu/~jes12/software/scripts/process_exonerate_gff3.pl

You may still want to massage it some, but I use the script in this
basic form, maybe with a few tweaks:

Note that it requires you to run exonerate with specific --ryo options
so that it includes the length of the query and hit sequences in the
report output. This should be covered in the perldoc in the script.

Without the --ryo options enabled, you'll need to modify the script
further: give it access to the original sequence db via Bio::DB::Fasta
and put in some $dbh->length($seqid) calls instead.
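
As a rough illustration of that fallback (looking lengths up from the
original sequence file when the report does not carry them), here is a
minimal sketch. It is written in Python rather than Perl and does not
use Bio::DB::Fasta; the function name fasta_lengths and the sample
records are hypothetical, and it assumes plain FASTA input where the
sequence ID is the first token on each '>' line.

```python
import io


def fasta_lengths(handle):
    """Return {seq_id: length} for every record in a FASTA stream.

    Hypothetical helper: a stand-in for per-ID length lookups against
    the original sequence database, not the script's actual code.
    """
    lengths = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            lengths[seq_id] = 0
        elif seq_id is not None:
            # Sequence data may span several lines; sum them up.
            lengths[seq_id] += len(line)
    return lengths


# Made-up records, for illustration only.
example = io.StringIO(">query1 test protein\nMKVL\nAG\n>hit1\nACGTACGT\n")
lengths = fasta_lengths(example)
print(lengths["query1"], lengths["hit1"])
```

Reading the whole file once is fine for small databases; for large
ones an indexed lookup, which is what Bio::DB::Fasta provides, avoids
rescanning the file.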

I don't think the part which writes HSP/match lines is actually correct
- it is trying to roll gapped HSPs from the similarity features. 

I end up ignoring all but the 'exon' and 'gene' lines for my gbrowse
instance and/or grepping out the lines I really think I need.  
You may want to s/exon/CDS/ for the protein2genome output as well.
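
The filtering described above can be sketched as follows, again in
Python rather than Perl. This is only an assumption about what such a
filter might look like: the function name filter_gff and the sample
lines are hypothetical, and the input is assumed to be standard
nine-column, tab-separated GFF. Renaming only the type column is a
slightly narrower operation than a blanket s/exon/CDS/, which would
also rewrite any "exon" appearing in the attributes.

```python
def filter_gff(lines, keep=("gene", "exon"), rename=None):
    """Keep only GFF feature lines whose type (column 3) is in `keep`.

    `rename` optionally maps a feature type to a new name, for example
    {"exon": "CDS"} for protein2genome output. Comment and pragma
    lines (starting with '#') and malformed lines are dropped.
    """
    rename = rename or {}
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF feature line
        if cols[2] not in keep:
            continue
        cols[2] = rename.get(cols[2], cols[2])
        kept.append("\t".join(cols))
    return kept


# Made-up feature lines, for illustration only.
sample = [
    "##gff-version 3",
    "chr1\texonerate\tgene\t100\t900\t.\t+\t.\tID=g1",
    "chr1\texonerate\texon\t100\t300\t.\t+\t.\tParent=g1",
    "chr1\texonerate\tsimilarity\t100\t900\t.\t+\t.\tID=s1",
]
for row in filter_gff(sample, rename={"exon": "CDS"}):
    print(row)
```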

-jason

On Aug 31, 2005, at 1:04 PM, Cook, Malcolm wrote:


Jason, 

This message is in regard to an old thread in which you offered to
share a 'script for munging over' exonerate output for loading into
DB::GFF (cf.
http://bioperl.org/pipermail/bioperl-l/2005-April/018741.html)

Would you be willing to still share that script, if you've got it
around? 

Thanks, and regards, 

Malcolm Cook - mec at stowers-institute.org - 816-926-4449
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, MO  USA




--
Jason Stajich
Duke University
http://www.duke.edu/~jes12




