[Biopython] SeqIO.parse for imgt

Peter Cock p.j.a.cock at googlemail.com
Fri Nov 4 16:54:21 UTC 2016


This does look like a small change. We would expect six
semi-colons in the ID line (giving seven fields), e.g.

ID X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP.
ID CD789012; SV 4; linear; genomic DNA; HTG; MAM; 500 BP.

0. Primary accession number
1. Sequence version number
2. Topology: 'circular' or 'linear'
3. Molecule type (e.g. 'genomic DNA')
4. Data class (e.g. 'STD')
5. Taxonomic division (e.g. 'PRO')
6. Sequence length (e.g. '4639675 BP.')
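
As a rough sketch of that layout (illustrative only, not the actual
Biopython parser code; the field names and the split_id_line helper are
made up for this example), splitting one of the lines above on
semi-colons gives the seven fields in order:

```python
# Hypothetical field names, one per entry in the numbered list above.
EMBL_ID_FIELDS = [
    "accession", "version", "topology", "molecule_type",
    "data_class", "division", "length",
]


def split_id_line(line):
    """Split an EMBL-style ID line on semi-colons and strip whitespace."""
    if not line.startswith("ID"):
        raise ValueError("Not an ID line: %r" % line)
    # Drop the leading "ID" tag, then split the remainder into fields.
    return [field.strip() for field in line[2:].split(";")]


fields = split_id_line("ID X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP.")
record = dict(zip(EMBL_ID_FIELDS, fields))
print(len(fields), record["data_class"], record["division"])
```

Six semi-colons means seven fields, so a parser written against this
layout can check len(fields) == 7 before unpacking.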

However, the hla.dat file uses only five semi-colons. It looks like the
data class has been removed, leaving just HUM (human) as the
taxonomic division, e.g.

ID   HLA00001; SV 1; standard; DNA; HUM; 3503 BP.
ID   HLA02169; SV 1; standard; DNA; HUM; 3291 BP.
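
A minimal sketch of how a tolerant reader might cope with both layouts,
assuming my reading above is right that the data class is the missing
field (the names FIELDS_7, FIELDS_6 and parse_id_line are hypothetical,
not part of Biopython):

```python
# Hypothetical field names for the usual seven-field ID line.
FIELDS_7 = [
    "accession", "version", "topology", "molecule_type",
    "data_class", "division", "length",
]
# Assumption: the six-field variant simply lacks the data class.
FIELDS_6 = [name for name in FIELDS_7 if name != "data_class"]


def parse_id_line(line):
    """Map an ID line to named fields, accepting six or seven fields."""
    fields = [field.strip() for field in line[2:].split(";")]
    if len(fields) == 7:
        names = FIELDS_7
    elif len(fields) == 6:
        names = FIELDS_6
    else:
        raise ValueError("Unexpected number of ID fields: %d" % len(fields))
    return dict(zip(names, fields))


hla = parse_id_line("ID   HLA00001; SV 1; standard; DNA; HUM; 3503 BP.")
# Under this mapping "standard" ends up in the topology slot.
print(hla["division"], hla["length"], "data_class" in hla)
```

Branching on the field count keeps the existing seven-field handling
untouched, which is why this could be a minor tweak if the list of
changes from EMBL-EBI confirms the layout.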

Peter


On Fri, Nov 4, 2016 at 4:37 PM, Liu, Chang <cliu32 at wustl.edu> wrote:
> Hi, Peter,
> Thank you for the quick response. I have sent a message to embl-ebi
> requesting a list of changes. Hope this can be fixed with a minor
> tweak. Will keep you posted when I hear back.
> Best regards,
> Chang

