[Biopython] Could Bio.SeqIO write EMBL file?

Wed Jan 6 13:28:42 UTC 2010

Hi Peter,

Thanks again for this fast answer.

You've recently been fixing the fasta-m10 al_start and al_end code for
me, so I am now working with the development version of Biopython
from git. I have no problem updating it and testing it here.

I am working with about 30 bacterial genomes from the human gut and
am expecting 100 more genomes to work with this year. I can send you
one of the files if you wish. Just let me know.

Kind regards,
Anne.

On 6 Jan 2010, at 13:15, Peter wrote:

> On Wed, Jan 6, 2010 at 12:20 PM, Anne Pajon <ap12 at sanger.ac.uk> wrote:
>> Dear,
>>
>> I'm reading an EMBL file with Bio.SeqIO to add an extra feature qualifier
>> to each of the annotations, and would like to write the modified annotated
>> sequence back to an EMBL file.
>> ...
>> While running the above I'm getting this error:
>> Reading format 'embl' is supported, but not writing
>>
>> Is there a workaround? I know from the documentation on the wiki that
>> Biopython does not have a writer for EMBL format. Is there a plan to add
>> one in the future? I volunteer to test it, or if it does not exist yet I may
>> be able to contribute to writing it. Please let me know.
>>
>> Kind regards,
>> Anne.
>
> Hello Anne,
>
> The intention was to eventually have both GenBank and EMBL output
> working in SeqIO - and they should be able to share a lot of code.
> However, out of practicality, GenBank output was prioritised (and
> bar a few bits of annotation, seems to be working nicely). There
> hadn't been much interest in EMBL output in comparison.
>
> Getting something basic working shouldn't be too hard (id, features and
> sequence), and having someone interested help test this would be very
> valuable. Did you install Biopython from source? Are you happy using
> git (to grab code for testing)? Neither is essential for trying out new
> Python code, but would make things a bit simpler.
>
> Also, what kind of organisms are you working with? What I'm getting
> at here is how complex are the feature locations going to be?
>
> Peter
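
The workflow described in the quoted message (read EMBL, add a qualifier to
every feature, write the records back out) might look roughly like the sketch
below. The script in the original message was elided, so this is not that
code: the function names, the "note" qualifier and the file names are all
hypothetical. It also assumes a Biopython version whose SeqIO accepts file
names and can write the requested format; with the version discussed in this
thread, "embl" output raises the error quoted above, while "genbank" output
is reported to work.

```python
def add_qualifier(record, key, value):
    """Append `value` under `key` in the qualifiers of every feature.

    Hypothetical helper: it only relies on `record.features` and on each
    feature having a dict-like `qualifiers` attribute, as SeqRecord and
    SeqFeature objects do.
    """
    for feature in record.features:
        # Qualifier values are kept as lists, so append rather than overwrite.
        feature.qualifiers.setdefault(key, []).append(value)
    return record


def annotate_embl(in_path, out_path, key, value, out_format="embl"):
    """Read EMBL records, add one qualifier per feature, write them out.

    Assumption: the installed Biopython can write `out_format`. If it
    cannot, SeqIO.write raises an error; passing out_format="genbank"
    is the fallback suggested by the thread.
    """
    # Imported here so add_qualifier can be used without Biopython installed.
    from Bio import SeqIO

    records = (
        add_qualifier(record, key, value)
        for record in SeqIO.parse(in_path, "embl")
    )
    # SeqIO.write returns the number of records written.
    return SeqIO.write(records, out_path, out_format)
```

For example, annotate_embl("genome.embl", "genome_out.gbk", "note", "checked",
out_format="genbank") would write GenBank output, which avoids the missing
EMBL writer at the cost of changing the file format.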

--
Dr Anne Pajon - Pathogen Genomics, Team 81
Sanger Institute, Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SA, United Kingdom
+44 (0)1223 494 798 (office) | +44 (0)7958 511 353 (mobile)
