[Biopython] SeqIO.parse for imgt

Peter Cock p.j.a.cock at googlemail.com
Wed Nov 23 17:56:06 UTC 2016


Dear Chang,

This is hopefully fixed as of:

https://github.com/biopython/biopython/commit/b61e52b3ccb47785648e25f160a7b650d23ecc29

Are you able to re-install Biopython from GitHub to test this on your machine?

Thanks,

Peter

On Fri, Nov 11, 2016 at 5:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Thank you Chang, James,
>
> Those details are just the kind of thing I was hoping for.
> I don't have time to make the IMGT parser changes today,
> so have opened an issue for this on the Biopython GitHub:
>
> https://github.com/biopython/biopython/issues/988
>
> Peter
>
> On Fri, Nov 11, 2016 at 5:00 PM, Liu, Chang <cliu32 at wustl.edu> wrote:
>> Thank you very much, James!
>> Hi, Peter, here you go - thank you in advance for updating the 'imgt' parser. I really appreciate it. Please let me know if I can be of any assistance!
>> Chang
>>
>> -----Original Message-----
>> From: James Robinson [mailto:jrobinso at ebi.ac.uk]
>> Sent: Friday, November 11, 2016 10:54 AM
>> To: Liu, Chang <cliu32 at wustl.edu>
>> Cc: p.j.a.cock at googlemail.com
>> Subject: Re: [IPD #99553] hla.dat file and biopython, follow up
>>
>> Hi,
>>
>> The key change after 3.16 is the addition of an SV value to the ID line; this should make the format more similar to the ENA style.
>>
>> ID   HLA00001   standard; DNA; HUM; 3503 BP.
>>
>> becomes
>>
>> ID   HLA00001; SV 1; standard; DNA; HUM; 3503 BP.
>>
>> We have also added the SV value as a line in the file:
>>
>> SV   HLA00001.1
>>
>> This line is added between the AC and DT lines.
>>
>> The other change is the removal of the third DT line. We previously had three lines but have reduced this to two:
>>
>> DT   01-AUG-1989 (Rel. 1.0.0, Created, Version 1)
>> DT   16-DEC-1998 (Rel. 1.0.0, Sequence Updated, Version 1)
>> DT   14-APR-2014 (Rel. 3.16.0, Current Release, Version 1)
>>
>> becomes
>>
>> DT   01-AUG-1989 (Rel. 1.0.0, Created, Version 1)
>> DT   14-OCT-2016 (Rel. 3.26.0, Last Updated, Version 1)
>>
>> In addition, the text within the CC lines has changed from:
>>
>> CC   --------------------------------------------------------------------------
>> CC   Copyrighted by the IMGT/HLA Database, Distributed under the Creative
>> CC   Commons Attribution-NoDerivs License, see;
>> CC   http://www.ebi.ac.uk/imgt/hla/licence.html for further details.
>> CC   --------------------------------------------------------------------------
>>
>> to
>>
>> CC   --------------------------------------------------------------------------
>> CC   IPD-IMGT/HLA Release Version 3.26.0
>> CC   --------------------------------------------------------------------------
>> CC   Copyrighted by the IPD-IMGT/HLA Database, Distributed under the Creative
>> CC   Commons Attribution-NoDerivs License, see;
>> CC   http://www.ebi.ac.uk/ipd/imgt/hla/licence.html for further details.
>> CC   --------------------------------------------------------------------------
>>
>> Thanks
>>
>> James
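
The ID line change James describes above can be sketched in plain Python. This is only an illustration of the two layouts, not the code in Biopython's 'imgt' parser: the helper name parse_id_line and the dictionary keys (data_class, molecule, division) are assumptions made for this example, borrowing EMBL/ENA terminology.

```python
def parse_id_line(line):
    """Split an IPD-IMGT/HLA ID line in either the old or the new layout.

    Hypothetical helper for illustration only; it is not Biopython's parser.
    """
    if not line.startswith("ID   "):
        raise ValueError(f"not an ID line: {line!r}")

    # Drop the "ID   " prefix and the trailing full stop, then split on ';'.
    fields = [f.strip() for f in line[5:].strip().rstrip(".").split(";")]

    if len(fields) == 6 and fields[1].startswith("SV "):
        # New layout: ID   HLA00001; SV 1; standard; DNA; HUM; 3503 BP.
        accession = fields[0]
        version = int(fields[1].split()[1])
        data_class, molecule, division, length = fields[2:]
    elif len(fields) == 4:
        # Old layout: ID   HLA00001   standard; DNA; HUM; 3503 BP.
        accession, data_class = fields[0].split()
        version = None  # the old layout carries no SV value
        molecule, division, length = fields[1:]
    else:
        raise ValueError(f"unrecognised ID line layout: {line!r}")

    return {
        "accession": accession,
        "version": version,
        "data_class": data_class,
        "molecule": molecule,
        "division": division,
        "length": int(length.split()[0]),  # "3503 BP" -> 3503
    }


old = parse_id_line("ID   HLA00001   standard; DNA; HUM; 3503 BP.")
new = parse_id_line("ID   HLA00001; SV 1; standard; DNA; HUM; 3503 BP.")
print(old["version"], new["version"])
```

With a Biopython build that includes the fix linked at the top of this message, SeqIO.parse("hla.dat", "imgt") should then iterate over the records in a post-3.16 file; the filename here is just the one mentioned in the subject line.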