[BioPython] Determining if GenBank record is circular

Chris Lasher chris.lasher at gmail.com
Wed Sep 3 12:34:24 EDT 2008


On Tue, Sep 2, 2008 at 5:00 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>
> On Tue, Sep 2, 2008 at 2:25 AM, Chris Lasher <chris.lasher at gmail.com> wrote:
> > On Mon, Sep 1, 2008 at 8:19 PM, Iddo Friedberg <idoerg at gmail.com> wrote:
> >>
> >> Should be in LOCUS:
> >>
> >> LOCUS       NC_002678            7036071 bp    DNA     circular BCT 22-JUL-2008
> >
> > Ah, sure. Let me re-state my question more precisely: Where is this
> > represented in the SeqRecord object created by SeqIO.parse(), or is it
> > represented at all?
>
> Currently if the sequence is circular I don't think it is represented
> at all when parsed in a SeqRecord.
>
> Bio.SeqIO uses the Bio.GenBank.FeatureParser, which gets passed this
> information from the Scanner via the residue_type event.  This is a
> combined lump of data containing both the sequence type (DNA, RNA etc)
> and if it is linear or circular.  It is currently only used to
> determine the Seq alphabet, and has never been recorded.  So in
> addition to not recording if the LOCUS line said the sequence was
> circular, if the LOCUS line contained cDNA, mRNA, ... this fine detail
> is also currently lost in the SeqRecord representation.  On the other
> hand, the Bio.GenBank.RecordParser stores all this as the record's
> residue_type property (a single combined field, presumably reflecting
> the layout of early GenBank files).
>
> It would be a logical improvement to record the sequence data
> (molecule type and if circular) in the SeqRecord's annotations
> dictionary - perhaps as two fields but we'd need to check if that
> would be straightforward for EMBL files too.  Alternatively, if
> Biopython included a native CircularSeq object, we could use that
> explicitly when the sequence is declared as circular.  This might be
> considered a little surprising though.
>
> Do you want to file a bug on this Chris?

Would you mind filing it, Peter? I've got a poster to complete very
soon. I think you did a fine job describing the features we'd like to
add.
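
In the meantime, a stopgap could be to read the topology straight off
the LOCUS line instead of going through the SeqRecord. What follows is
only a rough sketch, not anything Biopython provides: the function name
parse_locus_line and the TOPOLOGIES tuple are made up for illustration,
and it assumes a nucleotide LOCUS line laid out like Iddo's example,
with "bp" followed by the molecule type and then an optional
linear/circular field.

```python
# Hypothetical helper, not part of Biopython. Assumes a whitespace-separated
# nucleotide LOCUS line of the form:
#   LOCUS  <name>  <length> bp  <molecule type>  [linear|circular]  <division>  <date>
TOPOLOGIES = ("linear", "circular")


def parse_locus_line(line):
    """Return (molecule_type, topology) taken from a GenBank LOCUS line.

    topology is "linear", "circular", or None when the line does not say.
    Both values are None if the line has no "bp" field (e.g. protein records).
    """
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a GenBank LOCUS line: %r" % line)
    if "bp" not in tokens:
        return None, None
    # Everything after the "bp" unit: molecule type, optional topology, ...
    after = tokens[tokens.index("bp") + 1:]
    # Assumption: the molecule type is present and comes first after "bp".
    molecule_type = None
    if after and after[0] not in TOPOLOGIES:
        molecule_type = after[0]
    # The topology, if given, is one of the first two fields after "bp".
    topology = None
    for token in after[:2]:
        if token in TOPOLOGIES:
            topology = token
            break
    return molecule_type, topology


if __name__ == "__main__":
    example = ("LOCUS       NC_002678            7036071 bp    DNA     "
               "circular BCT 22-JUL-2008")
    print(parse_locus_line(example))
```

The two return values line up with the two annotation fields Peter
suggested (molecule type and whether the sequence is circular), so the
same split could be reused if the parser ever records them.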

Thanks,
Chris

