[Biojava-l] Parsing EMBL files from Ensembl
Stein Aerts
stein.aerts@esat.kuleuven.ac.be
Thu, 21 Nov 2002 09:23:58 +0100
As of today, something has apparently changed in the "export data" function
of Ensembl. When I retrieve a gene by its Ensembl ID, e.g.
ENSG00000110092 with 2000 bp on either side, and request only gene
features, the resulting EMBL-formatted file used to have (until yesterday)
ID = ENSG00000110092, but now its ID line reads:
ID Chromosome 11 71948701 to 71966070 ENSEMBL; DNA; HUM; 17370 BP.
More importantly, the EMBL format parser in BioJava cannot parse it.
It complains with:
This line could not be parsed: CDS join(-1151..-840,1654..1777,1995..2434)
This line could not be parsed: 12014..12175)
This line could not be parsed: CDS join(3927..4460,4728..4887,8890..9038,12014..12178)
This line could not be parsed: exon -1151..-840
This line could not be parsed: exon 1654..1777
and so on.
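
Several of the rejected lines carry negative coordinates (e.g. -1151..-840). Whether that is what actually trips the BioJava parser is only a guess on my part and is not confirmed. As a minimal sketch for inspecting an exported file independently of BioJava, the standalone class below pulls the start..end ranges out of a location string and flags any coordinate below 1. EmblLocationCheck and its method names are hypothetical and not part of BioJava.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of BioJava: a rough check of EMBL location strings.
public class EmblLocationCheck {

    // One "start..end" range; either coordinate may carry a leading minus sign.
    private static final Pattern RANGE = Pattern.compile("(-?\\d+)\\.\\.(-?\\d+)");

    /** Returns every {start, end} pair found in a string such as join(-1151..-840,1654..1777). */
    public static List<int[]> ranges(String location) {
        List<int[]> out = new ArrayList<>();
        Matcher m = RANGE.matcher(location);
        while (m.find()) {
            out.add(new int[] {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2))
            });
        }
        return out;
    }

    /** True if any coordinate in the location is below 1 (assumed here to be the suspicious case). */
    public static boolean hasCoordinateBelowOne(String location) {
        for (int[] r : ranges(location)) {
            if (r[0] < 1 || r[1] < 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Location strings copied from the error messages above.
        String[] locations = {
            "join(-1151..-840,1654..1777,1995..2434)",
            "join(3927..4460,4728..4887,8890..9038,12014..12178)",
            "-1151..-840",
            "1654..1777"
        };
        for (String loc : locations) {
            System.out.println(loc + " -> coordinate below 1: " + hasCoordinateBelowOne(loc));
        }
    }
}
```

This only scans plain start..end ranges; it ignores other location operators and does not try to reproduce what the BioJava parser does.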
Could anyone help me out with this?
Thanks a lot,
Stein Aerts.