[Bioperl-l] How do you create a genbank file?

Wes Barris wes.barris at csiro.au
Fri Oct 17 01:48:17 EDT 2003


Hi,

I need to convert a set of FASTA sequences into a set of GenBank entries.
This is easy to do, except that there are a few fields of the GenBank
format that I have not yet figured out how to set using BioPerl.

Here is an example of what I have been able to produce:

LOCUS       AB050006                 588 bp    dna     linear   UNK
DEFINITION  Bos taurus bASC mRNA for apoptosis-associated speck-like protein
             containing a CARD, complete cds.
ACCESSION   AB050006
FEATURES             Location/Qualifiers
      source          1..588
                      /organism="Bos taurus"
.
.
.

Here is an example of what I would like it to look like:

LOCUS       AB050006                 588 bp    mRNA     linear   MAM
DEFINITION  Bos taurus bASC mRNA for apoptosis-associated speck-like protein
             containing a CARD, complete cds.
ACCESSION   AB050006
VERSION     AB050006.1  GI:26453358
SOURCE      Bos taurus (cow)
   ORGANISM  Bos taurus
FEATURES             Location/Qualifiers
      source          1..588
                      /organism="Bos taurus"
.
.
.

My question is: how do I set the following?

mRNA (instead of dna)
MAM (instead of UNK)
VERSION     AB050006.1  GI:26453358		<- I can't get this line to appear
SOURCE      Bos taurus (cow)
   ORGANISM  Bos taurus


Here is a portion of my code:

use Bio::SeqIO;
use Bio::SeqFeature::Generic;

my $seq_in  = Bio::SeqIO->new( -format => 'fasta',   -file => $infile );
my $seq_out = Bio::SeqIO->new( -format => 'genbank', -file => ">$outfile" );
while (my $seq = $seq_in->next_seq()) {
    # parseStackDefline() (defined elsewhere) splits the fasta defline into its fields.
    my ($gi, $accession, $clone, $clonelib, $length, $file, $direction,
        $description, $tissuetype, $organism) = parseStackDefline($seq->description());
    $seq->desc($description);
    $seq->accession_number($accession);
    $seq->primary_id($gi);

    # Build a 'source' feature covering the whole sequence; skip unknown qualifiers.
    my $feat = Bio::SeqFeature::Generic->new(-primary => 'source', -start => 1, -end => $length);
    $feat->add_tag_value('clone', $clone)           if ($clone ne 'unknown');
    $feat->add_tag_value('clonelib', $clonelib)     if ($clonelib ne 'unknown');
    $feat->add_tag_value('tissuetype', $tissuetype) if ($tissuetype ne 'unknown');
    $feat->add_tag_value('organism', $organism)     if ($organism ne 'unknown');
    $seq->add_SeqFeature($feat);

    $seq_out->write_seq($seq);
}

-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au



