[BioPython] Determining if GenBank record is circular
Chris Lasher
chris.lasher at gmail.com
Wed Sep 3 16:34:24 UTC 2008
On Tue, Sep 2, 2008 at 5:00 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>
> On Tue, Sep 2, 2008 at 2:25 AM, Chris Lasher <chris.lasher at gmail.com> wrote:
> > On Mon, Sep 1, 2008 at 8:19 PM, Iddo Friedberg <idoerg at gmail.com> wrote:
> >>
> >> Should be in LOCUS:
> >>
> >> LOCUS NC_002678 7036071 bp DNA circular BCT 22-JUL-2008
> >
> > Ah, sure. Let me re-state my question more precisely: Where is this
> > represented in the SeqRecord object created by SeqIO.parse(), or is it
> > represented at all?
>
> Currently if the sequence is circular I don't think it is represented
> at all when parsed in a SeqRecord.
>
> Bio.SeqIO uses the Bio.GenBank.FeatureParser, which gets passed this
> information from the Scanner via the residue_type event. This is a
> combined lump of data containing both the sequence type (DNA, RNA etc)
> and if it is linear or circular. It is currently only used to
> determine the Seq alphabet, and has never been recorded. So in
> addition to not recording if the LOCUS line said the sequence was
> circular, if the LOCUS line contained cDNA, mRNA, ... this fine detail
> is also currently lost in the SeqRecord representation. On the other
> hand, the Bio.GenBank.RecordParser stores all this as the record's
> residue_type property (a single combined field, presumably reflecting
> the layout of early GenBank files).
>
> It would be a logical improvement to record the sequence data
> (molecule type and if circular) in the SeqRecord's annotations
> dictionary - perhaps as two fields, but we'd need to check if that
> would be straightforward for EMBL files too. Alternatively, if
> Biopython included a native CircularSeq object, we could use that
> explicitly when the sequence is declared as circular. This might be
> considered a little surprising though.
>
> Do you want to file a bug on this, Chris?
Would you mind filing it, Peter? I've got a poster to complete very
soon. I think you did a fine job describing the features we'd like to
add.
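
Until the parser records this, a rough workaround might be to read the
LOCUS line directly. This is only a minimal sketch, assuming the
whitespace-separated LOCUS layout shown in Iddo's example above;
parse_locus_residue_type is a hypothetical helper name, not part of
Biopython, and it returns the two fields (molecule type and topology)
separately:

```python
def parse_locus_residue_type(locus_line):
    """Return (molecule_type, topology) from a GenBank LOCUS line.

    Hypothetical helper, not a Biopython API. Assumes the fields are
    whitespace-separated and that the molecule type and topology follow
    the length unit ("bp" or "aa"). Either value may be None if absent.
    """
    tokens = locus_line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % locus_line)

    # Locate the length unit; the fields of interest come after it.
    for unit in ("bp", "aa"):
        if unit in tokens:
            rest = tokens[tokens.index(unit) + 1:]
            break
    else:
        return None, None

    molecule_type = None
    topology = None
    for token in rest:
        lowered = token.lower()
        if lowered in ("circular", "linear"):
            topology = lowered
            break
        if molecule_type is None:
            # First token after the unit is taken as the molecule type
            # (DNA, mRNA, ...), an assumption about the layout.
            molecule_type = token
        else:
            # A second non-topology token is the division code, so the
            # line states no topology.
            break
    return molecule_type, topology


line = "LOCUS NC_002678 7036071 bp DNA circular BCT 22-JUL-2008"
molecule_type, topology = parse_locus_residue_type(line)
is_circular = (topology == "circular")
```

The result could then be stored by hand in the SeqRecord's annotations
dictionary under whatever keys you choose, until the parser does it
itself.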
Thanks,
Chris