[Biopython-dev] GFF file parsing and error handling
carl crott
carlcrott at gmail.com
Mon Jan 9 14:36:00 UTC 2012
Hey all,
I'm posting here because I know there has been talk about GFF file parsing,
and I'd love to code a bit as soon as I understand what's going on with
these files.
I've got this GFF file (placed in a spreadsheet for readability):
https://docs.google.com/spreadsheet/ccc?key=0AtOqyz8P_fJ0dGVOMzNSM29qUVdjZmZ4emdIQ3U2OUE&hl=en_US#gid=0
Lines 178 and 179 are the problematic ones.
What is going on here?
I know that these genes are listed in reverse order, so the following
sequence of features is a normal gene arrangement:
stop_codon
CDS
CDS
start_codon
But my guesses as to what is happening (between lines 178 and 179) are:
1) The gene stretches from the end of one chromosome to another?
2) It is simply a stop_codon with no attached CDS or start_codon?
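One way to tell the two guesses apart is to look at the record that follows each stop_codon. The following is a minimal sketch, not Biopython code: the Feature tuple, the classify_stop_codons function, and the sample coordinates are all hypothetical and made up for illustration. It assumes the records are already parsed and kept in file order, with genes listed in reverse order as described above.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    """Hypothetical container for the GFF columns used in this check."""
    seqid: str   # column 1: chromosome or contig name
    ftype: str   # column 3: e.g. stop_codon, CDS, start_codon
    start: int   # column 4
    end: int     # column 5
    strand: str  # column 7


def classify_stop_codons(features: List[Feature]) -> List[Tuple[int, str]]:
    """Return (index, reason) for every stop_codon that is not
    followed by a CDS record on the same sequence."""
    problems = []
    for i, feat in enumerate(features):
        if feat.ftype != "stop_codon":
            continue
        nxt = features[i + 1] if i + 1 < len(features) else None
        if nxt is None:
            problems.append((i, "stop_codon is the last record"))
        elif nxt.seqid != feat.seqid:
            # Guess 1: the records on either side sit on different sequences.
            problems.append((i, "next record is on a different sequence"))
        elif nxt.ftype != "CDS":
            # Guess 2: a stop_codon with no attached CDS.
            problems.append((i, "stop_codon is not followed by a CDS"))
    return problems


# Made-up records, not taken from the spreadsheet.
features = [
    Feature("chr1", "stop_codon", 100, 102, "-"),
    Feature("chr1", "CDS", 103, 200, "-"),
    Feature("chr1", "CDS", 300, 400, "-"),
    Feature("chr1", "start_codon", 398, 400, "-"),
    Feature("chr1", "stop_codon", 900, 902, "-"),
    Feature("chr2", "stop_codon", 50, 52, "-"),
    Feature("chr2", "CDS", 53, 120, "-"),
]
for index, reason in classify_stop_codons(features):
    print(index, reason)
```

If the sequence name in column 1 changes between the two problem lines, that points toward the first guess; if it stays the same and no CDS follows, that points toward the second.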
I've successfully managed to parse out the gene intervals and now I'm
working on the error handling.
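For the error handling, a per-line check along these lines might be a starting point. This is only a sketch under stated assumptions: GFFParseError and parse_gff_line are hypothetical names, not part of Biopython, and the only facts relied on are that a GFF record has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes) and that start must not exceed end.

```python
class GFFParseError(ValueError):
    """Hypothetical exception for a line that is not a valid GFF record."""


def parse_gff_line(line, line_number=0):
    """Split one GFF line into a dict, raising GFFParseError on bad input."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise GFFParseError(
            f"line {line_number}: expected 9 tab-separated columns, "
            f"got {len(fields)}"
        )
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    try:
        start_i, end_i = int(start), int(end)
    except ValueError:
        raise GFFParseError(
            f"line {line_number}: start/end are not integers: "
            f"{start!r}, {end!r}"
        ) from None
    if start_i > end_i:
        raise GFFParseError(
            f"line {line_number}: start {start_i} is greater than end {end_i}"
        )
    # '+', '-' and '.' are the usual strand values; GFF3 also allows '?'.
    if strand not in {"+", "-", ".", "?"}:
        raise GFFParseError(f"line {line_number}: bad strand {strand!r}")
    return {
        "seqid": seqid, "source": source, "type": ftype,
        "start": start_i, "end": end_i, "score": score,
        "strand": strand, "phase": phase, "attributes": attributes,
    }
```

Raising one exception type per bad line lets the caller decide whether to stop at the first error or to collect the messages and keep going through the rest of the file.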
Thanks,
Carl
--
Carl Crott
Web Applications Engineer
www.black-glass.com
412-610-0600