[Biopython] error in parsing GenBank

Peter Cock p.j.a.cock at googlemail.com
Wed Oct 3 10:42:58 EDT 2012


On Wed, Oct 3, 2012 at 3:39 PM, francesco chiani
<francesco.chiani at gmail.com> wrote:
> Thanks for the replies:
>
> Here is the gbk file:
>
> LOCUS       allele_48659_OTTMUSE00000300743_L1L2_Bact_P        37935 bp
> dna     linear   UNK
> ACCESSION   unknown
> DBSOURCE    accession design_id=48659
> COMMENT     cassette : L1L2_Bact_P
> COMMENT     design_id : 48659
> FEATURES             Location/Qualifiers
> ...

As Nick guessed, I think the problem is you have 'dna' (lower case)
in the LOCUS line, rather than 'DNA' (upper case). Where did this
file come from? e.g. What software tool or database made it?

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord#MoleculeTypeB

In any case, perhaps Biopython could check for 'dna' as well
(as some tools don't seem to obey this bit of the standard)?
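
In the meantime, a possible workaround is to upper-case the molecule
type on the LOCUS line before handing the file to the parser. The
following is only a rough sketch: normalise_locus_lines is a
hypothetical helper (not part of Biopython), and it assumes the
molecule type directly follows the 'bp' token on a single,
unwrapped LOCUS line.

```python
import io
import re

# Assumption: the molecule type is the token right after "bp" on the LOCUS line.
_LOCUS_MOL_TYPE = re.compile(r"(\bbp\s+)(dna|rna)\b")


def normalise_locus_lines(handle):
    """Yield lines from handle, upper-casing 'dna'/'rna' on LOCUS lines only.

    Hypothetical helper for illustration; all other lines pass through unchanged.
    """
    for line in handle:
        if line.startswith("LOCUS"):
            line = _LOCUS_MOL_TYPE.sub(
                lambda m: m.group(1) + m.group(2).upper(), line
            )
        yield line


# Small demonstration on an in-memory record fragment (made-up locus name).
example = io.StringIO(
    "LOCUS       example_locus        37935 bp    dna     linear   UNK\n"
    "ACCESSION   unknown\n"
)
fixed = io.StringIO("".join(normalise_locus_lines(example)))
print(fixed.getvalue(), end="")
```

The resulting handle could then be given to SeqIO.parse(fixed, "genbank")
from Bio.SeqIO. I have not tried this on your file, so other lines in
the record may still need attention.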

Thanks

Peter

