[BioSQL-l] bioentry.name and bioentry.identifier only 40 characters?

Peter biopython at maubp.freeserve.co.uk
Fri Sep 26 14:46:19 UTC 2008


On Fri, Sep 26, 2008 at 3:24 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
>
> Interesting. I see your logic and I guess you have a point.
>
> I've almost exclusively been loading all kinds of genbank, {swiss,uni}prot,
> unigene etc files, which don't suffer from this problem, as the name then is
> either identical to the accession, or is a short gene symbol, and the
> identifier is the GI#, or empty.

In my case it's also been mostly GenBank files, where long names are
not an issue.

> I myself wouldn't load FASTA files w/o any processing/parsing the identifier
> token, but maybe that's not a reasonable expectation to put on everyone
> else?

I think it is not unreasonable to want to import FASTA files directly (but this
is not something I've actually done other than for testing).  For example,
I might want to import NCBI FASTA files directly, but these probably have
names under 40 letters.  Another example would be sequencing or assembly
output, but I don't have a feel for how long the names typically are there.
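
One way to get a feel for this would be to scan a FASTA file for identifier
tokens longer than the 40 character limit.  A minimal sketch in plain Python,
assuming the identifier is the first whitespace-delimited token after the ">";
the function name and the example records are made up for illustration:

```python
import io

MAX_LEN = 40  # field width discussed in this thread for bioentry.name / identifier


def long_fasta_ids(handle, max_len=MAX_LEN):
    """Return the FASTA identifier tokens longer than max_len characters."""
    too_long = []
    for line in handle:
        if line.startswith(">"):
            # Assumption: the identifier is the first token on the header line
            fields = line[1:].split()
            if fields and len(fields[0]) > max_len:
                too_long.append(fields[0])
    return too_long


# Hypothetical records: one short NCBI-style name, one long assembler-style name
example = io.StringIO(
    ">gi|12345|ref|NC_000001.1| short example\nACGT\n"
    ">contig_from_some_hypothetical_assembler_run_0001_length_5000 x\nACGT\n"
)
print(long_fasta_ids(example))
```

On the made-up input above only the second, assembler-style name is reported.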

I was half expecting someone on the list to say "Oh yes - we had to increase
the field size when we were importing XYZ".

Peter


