[Biopython-dev] [Biopython - Bug #3315] Bio.SwissProt fails parsing .dat dumps

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Mon Dec 12 14:57:21 UTC 2011

Issue #3315 has been updated by Peter Cock.

The database problem is visible at http://www.uniprot.org/uniprot/C6KIH8.txt where the line is just: RX   DOI=DOI;

You said you'd reported this record (C6KIH8_AURAN) to SwissProt/UniProt, along with other problems in the past, so this is a recurring problem.

Regarding the proposed fix: not quite. We need to use the warnings module rather than a print statement.
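
To illustrate the difference, here is a minimal sketch using only the standard library. The names MalformedLineWarning and report_bad_line are hypothetical and are not part of Biopython. The point is that a warning, unlike a print statement, lets the calling code decide whether to record, ignore or escalate the message.

```python
import warnings


class MalformedLineWarning(UserWarning):
    """Hypothetical warning category for unparseable record lines."""


def report_bad_line(line):
    # Unlike print, this goes through the warnings machinery, so the
    # caller decides whether to show, ignore, or escalate the message.
    warnings.warn("Skipping malformed line: %r" % line, MalformedLineWarning)


# A caller can record warnings instead of letting them reach stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_bad_line("RX   DOI=DOI;")

# Or turn them into exceptions to keep a fail-fast behaviour.
with warnings.catch_warnings():
    warnings.simplefilter("error", MalformedLineWarning)
    try:
        report_bad_line("RX   DOI=DOI;")
        escalated = False
    except MalformedLineWarning:
        escalated = True
```

With a print statement neither of these options would be available to someone parsing a large dump in a script.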

I'm looking at it, but I have to download the latest uniprot_trembl.dat first (last month's was fine, as was uniprot_sprot.dat both this month and last month).

Bug #3315: Bio.SwissProt fails parsing .dat dumps

Author: Leszek Pryszcz
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 

The SwissProt module fails when parsing the .dat dump of Uniprot_trembl version 201111.
The error is due to corrupted RX lines in the .dat file for Aureococcus anophagefferens (i.e. C6KIH8_AURAN):
> RX   DOI=DOI; 10.1111/j.1529-8817.2010.00841.x;

I have reported the problem. The thing is, it has happened before: I previously reported a similar issue in releases 201010, 201011, 201012...
> RX   DOI=10.1098/rspb= .2010.1301;

Would it be possible to alter the error handling in Bio.SwissProt._read_rx, so that the module warns about a corrupted entry instead of the parser failing?
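
A rough sketch of what such tolerant handling could look like follows. It is a standalone, hypothetical function (parse_rx_line) and not the actual Bio.SwissProt._read_rx code, and it assumes the "KEY=value;" form of RX lines shown in the examples above. Tokens that do not split into exactly one key and one value, or whose value merely repeats the key (as in DOI=DOI), trigger a warning and are skipped.

```python
import warnings


def parse_rx_line(line):
    """Return (key, value) pairs from one RX line, warning on bad tokens.

    Hypothetical helper for illustration; assumes "KEY=value;" tokens.
    """
    body = line[2:].strip()  # drop the "RX" line code and its padding
    refs = []
    for token in body.split(";"):
        token = token.strip()
        if not token:
            continue  # empty piece after the trailing semicolon
        parts = token.split("=")
        # Malformed: no "=", more than one "=", empty value,
        # or a placeholder value that just repeats the key.
        if len(parts) != 2 or not parts[1] or parts[0] == parts[1]:
            warnings.warn(
                "Ignoring malformed RX token %r in line %r" % (token, line)
            )
            continue
        refs.append((parts[0], parts[1]))
    return refs


# The two corrupted lines quoted in this report are skipped with warnings.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = parse_rx_line("RX   DOI=DOI; 10.1111/j.1529-8817.2010.00841.x;")
    second = parse_rx_line("RX   DOI=10.1098/rspb= .2010.1301;")
```

Rejecting any token with a second "=" is a simplification chosen for this sketch; a real parser would need to decide how strict to be about values that legitimately contain that character.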

