[Biopython-dev] [Bug 2425] Fasta ID parsing error
bugzilla-daemon at portal.open-bio.org
Fri Sep 26 12:44:16 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2425
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-09-26 08:44 EST -------
(In reply to comment #1)
> I assume in your example you expected "region1.fasta.screen.Contig1" to be
> used as the record key in BioSQL? There is a 40 character limit on this
> field, which should be fine for most FASTA identifiers.
In BioSQL v1.0.1, fields bioentry.accession and dbxref.accession were increased
from 40 to 128 characters. See
http://lists.open-bio.org/pipermail/biosql-l/2008-August/001311.html
However, bioentry.name is still only 40 characters.
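As a rough illustration of those limits, here is a small Python sketch that checks a candidate value against the column widths mentioned above. The names FIELD_LIMITS and fits are hypothetical, and the widths are simply the ones quoted in this comment rather than values read from the BioSQL schema files.

```python
# Column widths as quoted in this comment for BioSQL v1.0.1.
# Assumption: these are hard-coded for illustration, not read from the schema.
FIELD_LIMITS = {
    "bioentry.accession": 128,
    "dbxref.accession": 128,
    "bioentry.name": 40,
}


def fits(field, value):
    """Return True if value is no longer than the assumed width of field."""
    return len(value) <= FIELD_LIMITS[field]


if __name__ == "__main__":
    # The identifier discussed in comment #1.
    print(fits("bioentry.name", "region1.fasta.screen.Contig1"))
```

A check like this could be run before loading records, so that over-long identifiers are reported up front instead of failing at the database.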
It looks like for a FASTA file like this:
>gi|9629357|ref|NC_001802.1| Human immunodeficiency virus type 1, complete genome
GGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTTAAGCC
TCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGA
...
BioPerl will use "gi|9629357|ref|NC_001802.1|" as both bioentry.name and
bioentry.identifier, with "Human immunodeficiency virus type 1, complete genome"
as bioentry.description, 0 as the version (the BioSQL convention when the
version is unknown), and bioentry.taxon_id and bioentry.division left as NULL.
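The mapping described above can be sketched in Python as follows. This is only an illustration of the behaviour reported for BioPerl, assuming the title line is split at the first whitespace. The function names split_fasta_title and bioentry_fields are hypothetical and are not part of BioPerl, Biopython or BioSQL.

```python
def split_fasta_title(line):
    """Split a FASTA title line into (identifier, description).

    Assumption: the identifier is everything up to the first whitespace
    and the description is whatever follows.
    """
    title = line[1:] if line.startswith(">") else line
    parts = title.strip().split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def bioentry_fields(line):
    """Build a dict mirroring the bioentry values described in this comment."""
    identifier, description = split_fasta_title(line)
    return {
        "name": identifier,
        "identifier": identifier,
        "description": description,
        "version": 0,      # BioSQL convention when the version is unknown
        "taxon_id": None,  # stored as NULL
        "division": None,  # stored as NULL
    }


if __name__ == "__main__":
    title = (">gi|9629357|ref|NC_001802.1| "
             "Human immunodeficiency virus type 1, complete genome")
    for key, value in bioentry_fields(title).items():
        print(key, "=", repr(value))
```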