[Biojava-l] Error parsing GFF3 file
Josh Goodman
jogoodma at indiana.edu
Mon Jan 3 15:13:27 UTC 2011
Negative locations can also occur when the genome sequence is incomplete and other
experimental evidence shows that the actual sequence extends beyond the existing start location.
Since coordinate systems are almost always anchored to the assembled genome sequence, any
features from that other evidence that lie upstream of the start get assigned negative coordinates.
You can see an example of this in Drosophila melanogaster
(ftp://ftp.flybase.org/genomes/Drosophila_melanogaster/current/gff/dmel-3L-r5.32.gff.gz). In Dmel
3L you see an aberration breakpoint and chromosome band features all upstream of the sequenced start
site at position 1.
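
To see whether a given GFF3 file contains such features before handing it to a parser, a small
pre-check along these lines may help. This is only a sketch using plain Java, not BioJava API: the
class name NegativeCoordinateCheck, the method findOutOfRange, and the two sample feature lines
are made up for illustration. It relies only on the GFF3 column layout, where start and end are
the 4th and 5th tab-separated columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of BioJava: flags GFF3 feature lines
// whose start or end coordinate is below 1.
public class NegativeCoordinateCheck {

    public static List<String> findOutOfRange(List<String> gffLines) {
        List<String> flagged = new ArrayList<>();
        for (String line : gffLines) {
            if (line.equals("##FASTA")) {
                break; // everything after this directive is sequence data
            }
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines, comments and directives
            }
            String[] cols = line.split("\t", -1);
            if (cols.length < 5) {
                continue; // not a complete feature line
            }
            try {
                long start = Long.parseLong(cols[3]); // column 4: start
                long end = Long.parseLong(cols[4]);   // column 5: end
                if (start < 1 || end < 1) {
                    flagged.add(line);
                }
            } catch (NumberFormatException e) {
                // non-numeric coordinates are ignored in this sketch
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the FlyBase file.
        List<String> sample = Arrays.asList(
            "##gff-version 3",
            "3L\tdemo\tchromosome_band\t-200\t-50\t.\t+\t.\tID=band1",
            "3L\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1");
        for (String line : findOutOfRange(sample)) {
            System.out.println(line);
        }
    }
}
```

With a real file you would pass in its lines instead of the sample list, for example the result
of java.nio.file.Files.readAllLines on the uncompressed GFF3 file.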
Cheers,
Josh
On 12/31/2010 01:21 PM, Scooter Willis wrote:
> Philipp
>
> I think it is complaining about the negative location (-1864985,746). Is
> this a circular genome? That seems to be a rather large sequence segment, and
> I think it is correct to complain about the negative location. We tried to
> plan ahead for circular genomes and genes that cross the begin/end
> boundary, and at the same time not have the programmer's brain explode trying
> to handle all the combinations that exist. It gets really fun when you have
> a negative strand.
>
> One of the challenges of a valid gff3 file is that you can make sure the
> ontology is correct and the file format is correct, but when you try to
> bring it all together to do something with the data (turn it into a protein)
> you need to check harder.
>
> If this is a valid location, can you send me the gff3 segment and the DNA
> sequence that describes the features? I will see what I can do to make it
> work without, as mentioned above, my head exploding. Let me know what the end
> goal is for parsing the gff3 file and what is missing when you try to map to a
> GeneSequence/ProteinSequence.
>
> Thanks
>
> Scooter
>
> On Fri, Dec 31, 2010 at 12:07 PM, Philipp Comans <philipp.comans at mytum.de> wrote:
>
>> Hello everyone,
>>
>> I am trying to parse the file available here:
>> ftp://ftp.jgi-psf.org/pub/JGI_data/Amphimedon_queenslandica/annotation/Aqu1.gff3.gz
>> with the following commands:
>>
>> import java.util.Iterator;
>>
>> import org.biojava3.genome.parsers.gff.FeatureI;
>> import org.biojava3.genome.parsers.gff.FeatureList;
>> import org.biojava3.genome.parsers.gff.GFF3Reader;
>>
>> public class GFFReader3 {
>>
>>     public static void main(String[] args) throws Exception {
>>
>>         FeatureList features = (FeatureList) GFF3Reader.read(
>>                 "/Users/philipp/Dropbox/IDP/JGI_data/annotation/Smiles.gff3");
>>         Iterator<FeatureI> featureIterator = features.iterator();
>>
>>         FeatureI currentFeature = null;
>>
>>         while (featureIterator.hasNext()) {
>>             currentFeature = featureIterator.next();
>>             System.out.println(currentFeature);
>>         }
>>
>>     }
>>
>> }
>>
>> The error I get is:
>> 31.12.2010 18:05:10 org.biojava3.genome.parsers.gff.GFF3Reader read
>> INFO: Gff.read(): Reading
>> /Users/philipp/Dropbox/IDP/JGI_data/annotation/Aqu1.gff3
>> Exception in thread "main" java.lang.IllegalArgumentException: Improper location parameters: (-1864985,746)
>>     at org.biojava3.genome.parsers.gff.Location.<init>(Location.java:75)
>>     at org.biojava3.genome.parsers.gff.Location.union(Location.java:258)
>>     at org.biojava3.genome.parsers.gff.FeatureList.add(FeatureList.java:49)
>>     at org.biojava3.genome.parsers.gff.GFF3Reader.read(GFF3Reader.java:59)
>>     at GFFReader3.main(GFFReader3.java:11)
>>
>> I find this very strange because the file is a valid GFF document according to
>> http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online
>>
>> Is this a bug or am I doing something wrong?
>> Thanks for your help, I wish you a happy New Year!
>>
>> Philipp
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>>