[Biopython] Could Bio.SeqIO write EMBL file?

Anne Pajon ap12 at sanger.ac.uk
Thu Mar 4 13:31:33 UTC 2010


Dear Peter,

Sorry for taking so long to get back to you.

I've managed to fork the Biopython repository on GitHub, and I think I
am now ready to help improve Bio/SeqIO/InsdcIO.py by adding the missing
fields on the ID line and adding a PR line. I may also look at the SQ
line.

Does this sound right to you? Please let me know.

Kind regards,
Anne.


On 12 Jan 2010, at 12:33, Peter wrote:

> On Tue, Jan 12, 2010 at 10:27 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>> On Mon, Jan 11, 2010 at 5:32 PM, Anne Pajon <ap12 at sanger.ac.uk> wrote:
>>> Here is the diff between the EMBL output from Bio.SeqIO and the genbank
>>> output from Bio.SeqIO converted with the EMBOSS tool to an EMBL file:
>>>
>>> ...
>>>
>>> The main differences are on line breaks.
>>
>> I hadn't yet done a comparison against EMBOSS (what version do you
>> have?), but yes, it looks like I am wrapping the feature tables using a
>> shorter line length - we should check that, and it would be easy to
>> adjust in Bio/SeqIO/InsdcIO.py
>
> The spec is pretty clear that the feature lines should be up to 80
> characters. The premature wrapping was because I had been
> testing length < 80 instead of <= 80, which is now fixed in git.
>
> Peter
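
To illustrate the off-by-one described above, here is a minimal sketch in
plain Python. It is not the actual code in Bio/SeqIO/InsdcIO.py: the helper
names (fits_strict, fits_inclusive, wrap_words) and the 21-character "FT"
prefix are assumptions made only for this illustration. It shows how testing
length < 80 instead of <= 80 wraps a line that would fit exactly.

```python
MAX_WIDTH = 80  # maximum line length, as stated in the quoted message


def fits_strict(line, max_width=MAX_WIDTH):
    """Buggy test: rejects a line that is exactly max_width characters."""
    return len(line) < max_width


def fits_inclusive(line, max_width=MAX_WIDTH):
    """Corrected test: a line of exactly max_width characters is allowed."""
    return len(line) <= max_width


def wrap_words(words, prefix, fits):
    """Greedily pack words onto lines that each start with prefix.

    `fits` decides whether a candidate line is still short enough.
    The first word on a line is always accepted, even if it is too long.
    """
    lines = []
    current = prefix
    for word in words:
        if current == prefix:
            candidate = current + word
        else:
            candidate = current + " " + word
        if current != prefix and not fits(candidate):
            lines.append(current)      # close the line before this word
            current = prefix + word    # start a new line with the word
        else:
            current = candidate
    lines.append(current)
    return lines


# Hypothetical example: 21-character prefix plus two 29-character words
# joined by one space gives a line of exactly 80 characters.
prefix = "FT" + " " * 19
words = ["a" * 29, "b" * 29]
print(len(wrap_words(words, prefix, fits_strict)))     # 2 (wrapped too early)
print(len(wrap_words(words, prefix, fits_inclusive)))  # 1 (fits in 80 columns)
```

With the strict comparison the 80-character line is split in two, which
matches the premature wrapping seen in the diff; the inclusive comparison
keeps it on one line.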

--
Dr Anne Pajon - Pathogen Genomics, Team 81
Sanger Institute, Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SA, United Kingdom
+44 (0)1223 494 798 (office) | +44 (0)7958 511 353 (mobile)





