[Bioperl-l] Problems with Genbank Proteins File

Rodrigo Jardim jardim.rodrigo at gmail.com
Sun Nov 22 16:06:40 UTC 2009


I am having a problem parsing a GenBank protein file. I think this is because
the file has its fields in a different order. For example:

In most GenBank files:
======================
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi

In GenBank protein files:
=========================
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.
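
To make the difference in field order concrete, here is a minimal sketch
that lists the top-level keywords of each record in the order they appear.
It is plain Python using only the standard library, not BioPerl, and it is
not how Bio::SeqIO parses these files; the function name `header_keywords`
is a hypothetical helper introduced only for this illustration. It assumes
that top-level keywords start in the first column and that continuation
lines are indented.

```python
import re


def header_keywords(record_text):
    """Return the top-level keywords of a flat-file header, in order."""
    keywords = []
    for line in record_text.splitlines():
        # Assumption: a keyword is a run of capitals starting in column 1;
        # indented continuation lines do not match and are skipped.
        match = re.match(r"^([A-Z]+)\s", line)
        if match:
            keywords.append(match.group(1))
    return keywords


# The two header excerpts quoted in this message.
NUCLEOTIDE = """\
LOCUS       AA399704                  183 bp   mRNA    linear   EST 03-MAR-2000
ACCESSION   AA399704
VERSION     AA399704.1  GI:2053305
DEFINITION  TEUF0001 T.cruzi epimastigote non-normalized cDNA Library
            Trypanosoma cruzi cDNA clone 1 5' similar to T. cruzi gene for
            histone H2b (X60982), mRNA sequence.
KEYWORDS    EST.
SOURCE      Trypanosoma cruzi
"""

PROTEIN = """\
LOCUS       XP_628849                510 aa            linear   INV 31-OCT-2008
DEFINITION  hypothetical protein [Dictyostelium discoideum AX4].
ACCESSION   XP_628849
VERSION     XP_628849.1  GI:66799847
DBSOURCE    REFSEQ: accession XM_628847.1
KEYWORDS    .
SOURCE      Dictyostelium discoideum AX4.
"""

print(header_keywords(NUCLEOTIDE))
print(header_keywords(PROTEIN))
```

In the first excerpt DEFINITION comes after ACCESSION and VERSION; in the
second it comes directly after LOCUS, and there is an extra DBSOURCE line.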

When I try to parse it, BioPerl aborts with an error message.

Any ideas?

Thanks all,

-- 
Regards,
Rodrigo Jardim
jardim.rodrigo at gmail.com


