[Biopython] Problems with reading Swiss format records (swissprot specific date fields)

Peter Cock p.j.a.cock at googlemail.com
Fri Mar 8 16:00:39 UTC 2013


On Mon, Mar 4, 2013 at 3:40 PM, Jan T Kim <jttkim at googlemail.com> wrote:
> Dear All,
>
> trying to parse the attached Swissprot record gives me a stack trace:
>
>     Traceback (most recent call last):
>       File "./swisstest", line 7, in <module>
>         e = Bio.SeqIO.read(sys.argv[1], 'swiss')
>       File "/usr/lib/pymodules/python2.7/Bio/SeqIO/__init__.py", line 599, in read
>         first = iterator.next()
>       File "/usr/lib/pymodules/python2.7/Bio/SeqIO/__init__.py", line 537, in parse
>         for r in i:
>       File "/usr/lib/pymodules/python2.7/Bio/SeqIO/SwissIO.py", line 97, in SwissIterator
>         annotations['date'] = swiss_record.created[0]
>     TypeError: 'NoneType' object has no attribute '__getitem__'
>
> The problem is at line 99 (rather than 97) of
> https://github.com/biopython/biopython/blob/master/Bio/SeqIO/SwissIO.py :
>
>     annotations['date'] = swiss_record.created[0]
>
> without an "if swiss_record.created is not None" test or something
> similar. The parse function of Bio.SwissProt initialises the created
> instance variable to None, and only if a "DT" record containing the
> string "INTEGRATED" (case insensitive) is found, created is set to that
> date.
>
> The same kind of problem occurs with the sequence_update variable in the
> next statement:
>
>     annotations['date_last_sequence_update'] = swiss_record.sequence_update[0]
>
> Would it be sensible to set the 'date' and 'date_last_sequence_update'
> entries of the annotations dictionary only if the values are actually
> found in the swiss_record? I understand that with a genuine SwissProt
> record, they should always be there, but this happened to me when working
> on files generated from the refseq protein database using the EMBOSS
> seqret program with -osformat=swiss, which doesn't seem like an entirely
> exotic use case to me.
>
> Best regards, Jan

Good idea - this is fixed now and should work in the next release:
https://github.com/biopython/biopython/commit/6d4d3838920bbb92e4acacc94d76ab3312417ca8
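
For illustration, the guard suggested above could look roughly like the
sketch below. This is not necessarily what the commit does: FakeSwissRecord
and date_annotations are hypothetical names made up for this example, and
only the attribute names and annotation keys are taken from the lines quoted
above.

```python
class FakeSwissRecord(object):
    """Hypothetical stand-in for a parsed Swiss-Prot record (illustration only)."""

    def __init__(self, created=None, sequence_update=None):
        # As described above, both fields stay None unless a matching
        # "DT" line is found while parsing.
        self.created = created
        self.sequence_update = sequence_update


def date_annotations(swiss_record):
    """Build the date annotations, skipping any field that is still None."""
    annotations = {}
    if swiss_record.created is not None:
        annotations['date'] = swiss_record.created[0]
    if swiss_record.sequence_update is not None:
        annotations['date_last_sequence_update'] = \
            swiss_record.sequence_update[0]
    return annotations


# A record without DT lines (e.g. converted from another format) no longer
# raises TypeError; it just yields no date annotations.
no_dates = date_annotations(FakeSwissRecord())

# A record with both dates present behaves as before. The tuple layout and
# date strings here are assumptions for the example, not real data.
with_dates = date_annotations(
    FakeSwissRecord(created=('01-JAN-1990', 0),
                    sequence_update=('01-FEB-1991', 1)))
```

With this approach the 'date' and 'date_last_sequence_update' keys are simply
absent when the input file has no corresponding DT lines, so code reading
them should use annotations.get(...) or check for the key first.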

Can we use your example file for a test case?

Thanks,

Peter


