[Biopython-dev] [Biopython] skipping a bad record read in SeqIO

Peter biopython at maubp.freeserve.co.uk
Sun Jun 7 07:52:04 EDT 2009


On Sun, Jun 7, 2009 at 3:36 AM, Iddo Friedberg<idoerg at gmail.com> wrote:
> Suppose an iterator based reader throws an exception due to a bad record. I
> want to note that in stderr and move on to the next record. How do I do that?

The short answer is you can't (at least not easily), but the details
would depend on which parser you are using (i.e. which file format).

Do you have a corrupt file, or do you think you might have found a bug
in a parser? More details would help.

If you really have to do this and the file format is simple, I would
suggest you manually split the file into chunks and then pass them to
SeqIO one by one. Not elegant, but it would work. For example, with a
GenBank file, loop over the file line by line, caching the data until
you reach a new LOCUS line. Then turn the cached lines into a StringIO
handle and give it to Bio.SeqIO.read() to parse that single record (in
a try/except).
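
A rough sketch of that idea, under a few assumptions: the helper names
(iter_record_chunks, parse_skipping_bad, parse_genbank_skipping_bad) are
made up for illustration and are not part of Biopython, each record is
assumed to start with a line beginning "LOCUS", and any exception from
the parser is treated as a bad record. Only Bio.SeqIO.read() itself is
the real library call.

```python
import sys
from io import StringIO


def iter_record_chunks(handle, start_prefix="LOCUS"):
    """Yield the text of one record at a time (hypothetical helper).

    A new chunk starts at every line beginning with start_prefix.
    Any lines before the first such line come out as their own chunk.
    """
    cached = []
    for line in handle:
        if line.startswith(start_prefix) and cached:
            yield "".join(cached)
            cached = []
        cached.append(line)
    if cached:
        yield "".join(cached)


def parse_skipping_bad(handle, parse_one, log=sys.stderr):
    """Parse each chunk with parse_one, logging and skipping failures."""
    for number, chunk in enumerate(iter_record_chunks(handle), start=1):
        try:
            yield parse_one(StringIO(chunk))
        except Exception as err:  # broad on purpose; narrow it if you can
            log.write("Skipping record %i: %s\n" % (number, err))


def parse_genbank_skipping_bad(handle):
    """GenBank wrapper; needs Biopython installed when it is called."""
    from Bio import SeqIO  # imported here so the helpers above stay stdlib-only
    return parse_skipping_bad(handle, lambda h: SeqIO.read(h, "genbank"))
```

You would then loop over parse_genbank_skipping_bad(open_handle) much as
you would over Bio.SeqIO.parse(). Note that this only helps when the
damage is inside a record; if a LOCUS line itself is mangled, two
records get glued into one chunk and both are lost.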

Peter