[Biopython-dev] [Bug 2833] Features insertion on previous bioentry_id

Peter biopython at maubp.freeserve.co.uk
Wed Jun 3 12:54:40 UTC 2009


On Tue, Jun 2, 2009 at 9:29 PM, Cymon Cox <cy at cymon.org> wrote:
>
> Whoa, I see now that in Loader._load_bioentry_table that if the
> rec.annotations["gi"] is missing, it gets filled with the accession.version:
>
>        if "gi" in record.annotations :
>            identifier = record.annotations["gi"]
>        else :
>            identifier = record.id
>
> So biopythons BioSQL identifiers are not equivalent to GenBank identifiers.
> I wonder why this is done and identifier is not just left NULL, and the
> unique constraint maintained by accession/version...
>

Remember, it isn't just GenBank files that get imported into BioSQL.
While the record.id is the accession.version when loading a GenBank
file, this is not the case in general.

According to the CVS log, this was changed in BioSQL/Loader.py revision
1.33 to cope with loading a FASTA file into a BioSQL database (Bug
2425). Presumably I was trying to mimic the BioPerl loading of FASTA
files. Before this change, the bioentry.identifier was taken as the GI
number if available.

In other words, this change had nothing directly to do with the uniqueness rules.
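
To make the fallback concrete, here is a rough standalone sketch of the
logic quoted above. The function name choose_identifier and the sample
values are made up for illustration; this is not the actual Loader code,
just the same if/else in isolation.

```python
def choose_identifier(annotations, record_id):
    """Mirror the quoted fallback: prefer the GI number, else use record.id.

    Hypothetical helper for illustration only; the real logic lives in
    Loader._load_bioentry_table.
    """
    if "gi" in annotations:
        return annotations["gi"]
    return record_id


# GenBank-style record: a GI annotation is present, so it wins.
# (Both values below are invented placeholders.)
genbank_identifier = choose_identifier({"gi": "12345"}, "ABC123.1")

# FASTA-style record: no annotations at all, so record.id is used.
# Here record.id is whatever the FASTA title line started with, which
# need not look like accession.version.
fasta_identifier = choose_identifier({}, "my_fasta_seq")

print(genbank_identifier)
print(fasta_identifier)
```

So for a FASTA record the bioentry.identifier ends up being whatever
record.id happened to be, rather than NULL.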

Peter



