[Bioperl-l] bp_search2gff.pl
Chris Fields
cjfields at uiuc.edu
Fri Oct 5 15:51:20 EDT 2007
We might want to file this as a bug so we can track it.
The core devs have been mulling over the state of GFF/GFF3 in
BioPerl; proper handling of any SearchIO data is certainly included
in that. I believe a way forward will be planned soon (after
Genome Informatics).
chris
On Oct 5, 2007, at 2:35 PM, Eric Just wrote:
> Hello,
>
> I have been playing with the bp_search2gff.pl script (on HEAD of
> bioperl-live). There are a couple of issues I was wondering about.
>
> One is the ID that gets generated for a match feature when the --match
> option is set. The ID is set to the ID of the query sequence. This
> can be problematic if you are representing the query sequence and the
> BLAST hit in the same GFF file. When using the resulting GFF file for
> loading into Chado, it also creates a problem if you have more than
> one hit for a given query sequence, for example if you ran two
> different analyses that each had a hit for a given query. Would it be
> possible to have an option to create a unique ID for match features?
> One suggestion would be to build the ID from the ID of the query +
> the ID of the hit + the source.
>
> As long as two different analyses were loaded as different sources,
> this would ensure unique IDs for the match features.
>
>
> Also, is there a reason for writing the Target string as
>
> Target=Sequence:SOME_ID
>
> as opposed to
>
> Target=SOME_ID
>
>
> The latter seems a little more in line with the GFF3 spec and plays a
> little nicer with the GMOD tools.
>
> Thanks for looking into this.
>
> Eric
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
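
For illustration only, the two suggestions in the quoted message could be sketched roughly as follows. This is Python rather than the script's Perl, and it is not how bp_search2gff.pl is implemented: the function names, the ":" separator, and the sample IDs are all hypothetical. The sketch also assumes the IDs contain none of the characters GFF3 reserves in column 9 (";", "=", "&", ","), so no escaping is attempted. The Target value follows the GFF3 layout "target_id start end [strand]".

```python
def match_id(query_id, hit_id, source, sep=":"):
    """Build a match-feature ID from query ID + hit ID + source.

    Hypothetical scheme: two analyses loaded under different sources
    produce different IDs even for the same query/hit pair.
    """
    return sep.join((query_id, hit_id, source))


def target_attribute(target_id, start, end, strand=None):
    """Format a Target attribute with a bare ID (no 'Sequence:' prefix)."""
    parts = [target_id, str(start), str(end)]
    if strand is not None:
        parts.append(strand)
    return "Target=" + " ".join(parts)


def match_attributes(query_id, hit_id, source, target_id, start, end):
    """Assemble the ID and Target attributes for column 9 of a match line."""
    return ";".join([
        "ID=" + match_id(query_id, hit_id, source),
        target_attribute(target_id, start, end),
    ])


if __name__ == "__main__":
    # Same query and hit, two different analyses (sources): the IDs differ.
    print(match_attributes("q1", "h7", "blastn", "q1", 1, 250))
    print(match_attributes("q1", "h7", "tblastx", "q1", 1, 250))
```

Whether the separator or the order of the three parts would suit the Chado loader is an open question; the only point is that including the source keeps IDs distinct across analyses.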
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign