[Biopython] Handling records referencing other records

Ivan Gregoretti ivangreg at gmail.com
Fri Sep 18 15:27:15 UTC 2015


Hi John.

For your question (3), Python itself can help: wrap the feat.extract()
call in a try...except statement, catch the ValueError, and skip that
feature. The tutorial covers the syntax:

https://docs.python.org/2/tutorial/errors.html
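
A minimal sketch of the idea, assuming the error you quoted is raised as
a ValueError by feat.extract(). The helper name extract_cds and its two
return lists are made up for illustration; they are not part of
Biopython.

```python
def extract_cds(features, parent_seq):
    """Extract every CDS feature, skipping those that raise ValueError.

    `features` is meant to be something like record.features and
    `parent_seq` something like record.seq (assumed names for this sketch).
    Returns (extracted, skipped), where skipped holds (feature, message).
    """
    extracted = []
    skipped = []
    for feat in features:
        if feat.type != "CDS":
            continue
        try:
            extracted.append(feat.extract(parent_seq))
        except ValueError as err:
            # e.g. "Feature references another sequence."
            skipped.append((feat, str(err)))
    return extracted, skipped
```

Inside your SeqIO.parse loop you would call it once per record, passing
the record's features and sequence, and drop any record whose skipped
list is not empty. Catching only ValueError keeps unrelated bugs visible
instead of silently swallowing them.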

Cheers,

Ivan




Ivan Gregoretti, PhD
Bioinformatics



On Fri, Sep 18, 2015 at 10:30 AM, Athey, John * <John.Athey at fda.hhs.gov> wrote:
> Hello all,
>
>
>
> I’m looking for advice on how to handle Genbank records that reference other
> records as part of their location. My program iterates through large
> Genbank-formatted files with SeqIO.parse and extracts the CDS for subsequent
> analysis, using feat.extract(). However, upon hitting a record where the
> feature location references another record, it SOMETIMES fails. For example,
> http://www.ncbi.nlm.nih.gov/nuccore/DQ100169 seems to be handled correctly,
> while http://www.ncbi.nlm.nih.gov/nuccore/DQ100170 gives a “ValueError:
> Feature references another sequence.” Curiously, in both cases the CDS
> feature itself doesn’t specify another record, only the parent gene does.
>
>
>
> My questions about this are:
>
> 1) Why does the extraction fail on some records but not on all of them?
>
> 2) Is there a way to extract the data I’m looking for without causing
> this error?
>
> 3) If the answer to (2) is no, is there some other way to check whether
> the sequence will cause this error, skip extracting that sequence, and
> exclude that record from the analysis?
>
>
>
> Thanks for any help you can provide!
>
>


