[BioPython] Cannot parse/convert embl formatted files

Martin MOKREJŠ mmokrejs at ribosome.natur.cuni.cz
Sun Aug 13 00:16:07 UTC 2006


Hi Chris,

Chris Fields wrote:
> Just so everybody knows, EMBL recently made a few major revisions to  
> their sequence format. These are now corrected in Bioperl CVS and  
> will be available for the next dev release (hopefully out within a  
> few months).

I will test that later. Thanks.

> 
> Odd about the unbalanced quotes; is that on the Bioperl end?  I  
> missed that bit...

No, the input EMBL files are broken:

And the relevant EMBL file was:

ID   5OSAR003520 standard; RNA; PLN; 213 BP.
...
FT   5'UTR           1..213
FT                   /source="REFSEQ::XM_479174:1..213"
FT                   /gene="B1056G08.147"
FT                   /product="putative dihydropterin pyrophosphokinase
FT   repeat_region   61..87
...
// 

Still, I believe the parser could ignore this minor error and terminate
the string (or treat it as terminated) when it is actually terminated
by a following feature line.
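
To make the idea concrete, here is a rough sketch of such a repair done as a
pre-processing pass over the raw lines, before any parser sees them. It is
not how Biopython or Bioperl handle this; the function name
close_unbalanced_quotes is made up for illustration. It assumes that a
feature key always starts in column 6 of an FT line, and that quotes inside
a value are escaped by doubling them, so that an odd number of quotes on a
line means a value was opened or closed there.

```python
def close_unbalanced_quotes(lines):
    """Return EMBL lines with unterminated qualifier values closed.

    Hypothetical helper, not part of any library. A value is treated
    as terminated when the next feature key line (or any non-FT line)
    starts while a double quote is still open.
    """
    fixed = []
    quote_open = False
    for line in lines:
        line = line.rstrip("\n")
        is_ft = line.startswith("FT")
        # Feature key lines have a non-blank character in column 6;
        # qualifier and continuation lines are blank there.
        starts_feature = is_ft and len(line) > 5 and line[5] != " "
        if quote_open and (starts_feature or not is_ft):
            # The previous value was never closed: close it now.
            fixed[-1] += '"'
            quote_open = False
        # An odd number of quotes on an FT line toggles the open state
        # (doubled "" escapes inside a value keep the count even).
        if is_ft and line.count('"') % 2 == 1:
            quote_open = not quote_open
        fixed.append(line)
    if quote_open and fixed:
        # Input ended while a value was still open.
        fixed[-1] += '"'
    return fixed
```

The repaired lines could then be joined and handed to whatever EMBL parser
is in use. Values that legitimately span several lines are left alone,
because the quote is closed again before the next feature key appears.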

M.


