[Bioperl-l] Error parsing Genbank file

Ryan Golhar golharam at umdnj.edu
Thu Jan 6 16:21:18 EST 2005


What is the fix for CONTIG entries?

BTW, I'm new to BioPerl.

Ryan

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at duke.edu] 
Sent: Wednesday, January 05, 2005 4:37 PM
To: golharam at umdnj.edu
Cc: 'Bioperl List'
Subject: Re: [Bioperl-l] Error parsing Genbank file


We can't parse WGS files.  The fix it needs is very similar to how we 
handle CONTIG entries if you want to have a go at fixing it.
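
Until the parser handles WGS lines, one possible workaround is to skip WGS master records (like the one quoted below, which carries no sequence data) before handing the file to Bio::SeqIO. The following is a minimal sketch in plain Perl, not a fix to the genbank parser itself. It assumes that records end with a "//" line and that a WGS master record can be recognised by a line starting with "WGS" in column 1. The helper names split_records, is_wgs_master and filter_wgs are made up for illustration and are not part of BioPerl.

```perl
use strict;
use warnings;

# Split GenBank flat-file text into records; each record ends with a "//" line.
sub split_records {
    my ($text) = @_;
    my @records;
    my $current = '';
    for my $line (split /^/m, $text) {
        $current .= $line;
        if ($line =~ m{\A//\s*\z}) {
            push @records, $current;
            $current = '';
        }
    }
    # Keep a trailing record that lacks its "//" terminator.
    push @records, $current if $current =~ /\S/;
    return @records;
}

# True if the record has a top-level WGS line (assumed to mark a WGS master
# record). "KEYWORDS    WGS." does not match because WGS is not in column 1.
sub is_wgs_master {
    my ($record) = @_;
    return $record =~ /^WGS[ \t]+\S/m ? 1 : 0;
}

# Return the text of the records to keep, followed by the LOCUS names
# of the records that were skipped.
sub filter_wgs {
    my ($text) = @_;
    my $kept = '';
    my @skipped;
    for my $record (split_records($text)) {
        if (is_wgs_master($record)) {
            my ($locus) = $record =~ /^LOCUS\s+(\S+)/m;
            push @skipped, defined $locus ? $locus : 'unknown';
        }
        else {
            $kept .= $record;
        }
    }
    return ($kept, @skipped);
}
```

The kept text could then be written to a temporary file, or opened as an in-memory filehandle with open(my $fh, '<', \$kept) and passed to Bio::SeqIO->new(-fh => $fh, -format => 'genbank'). Whether that suits a given file is an assumption to check. The skipped LOCUS names are returned so they can be reported rather than silently dropped.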

On Jan 5, 2005, at 3:41 PM, Ryan Golhar wrote:

> Hi all,
>
> I have a GenBank file that Bio::SeqIO::genbank is choking on.  The
> entry is just a WGS entry referencing a bunch of other entries.  It
> dies on line 492 with the error "Unexpected error in feature table for
> Skipping feature, attempting to recover".
>
> I'm using the following code:
>
> #!/usr/bin/perl
>
> use strict;
> use Bio::SeqIO;
>
> my $usage = "$0 <genbank file> <fasta file>\n";
> my $file = shift or die $usage;
> my $outfilename = shift or die $usage;
>
> my $infile = Bio::SeqIO->new('-file' => "<$file",
> 			    '-format' => "genbank");
>
> my $outfile = Bio::SeqIO->new('-file' => ">$outfilename",
> 			    '-format' => "fasta");
>
> while (my $seq = $infile->next_seq) {
> #	print STDERR $seq->accession_number,"\n";
> 	
> 	$outfile->write_seq($seq);
> }
>
> Here is the contents of the genbank entry:
>
> LOCUS       CAAB01000000           12381 rc    DNA     linear   VRT 22-AUG-2002
> DEFINITION  Takifugu rubripes whole genome shotgun sequencing project.
> ACCESSION   CAAB00000000
> VERSION     CAAB00000000.1  GI:22418063
> KEYWORDS    WGS.
> SOURCE      Takifugu rubripes (Fugu rubripes)
>   ORGANISM  Takifugu rubripes
>             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>             Actinopterygii; Neopterygii; Teleostei; Euteleostei; Neoteleostei;
>             Acanthomorpha; Acanthopterygii; Percomorpha; Tetraodontiformes;
>             Tetradontoidea; Tetraodontidae; Takifugu.
> REFERENCE   1  (bases 1 to 12381)
>   AUTHORS   The Fugu Genome Sequencing Consortium.
>   TITLE     Direct Submission
>   JOURNAL   Submitted (01-JUL-2002) The Fugu Genome Sequencing Consortium,
>             http://www.fugubase.org/ http://www.jgi.doe.gov/fugu
> COMMENT     The Takifugu rubripes whole genome shotgun (WGS) project has the
>             project accession CAAB00000000.  This version of the project (01)
>             has the accession number CAAB01000000, and consists of sequences
>             CAAB01000001-CAAB01012381.
> FEATURES             Location/Qualifiers
>      source          1..12381
>                      /organism="Takifugu rubripes"
>                      /mol_type="genomic DNA"
>                      /db_xref="taxon:31033"
> WGS         CAAB01000001-CAAB01012381
> //
>
>
>
> -----
> Ryan Golhar
> Computational Biologist
> The Informatics Institute at
> The University of Medicine & Dentistry of NJ
>
> Phone: 973-972-5034
> Fax: 973-972-7412
> Email: golharam at umdnj.edu
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org 
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/


