[Bioperl-l] Help with Bio::SeqIO

download on demand downloadondemand at gmail.com
Sun Nov 4 18:39:42 UTC 2007


Hi to all.

I have a problem with a simplest script:



         use Bio::SeqIO;
         # get command-line arguments, or die with a usage statement
         my $usage = "x2y.pl infile infileformat outfile outfileformat\n";
         my $infile = shift or die $usage;
         my $infileformat = shift or die $usage;
#         my $outfile = shift or die $usage;
         my $outfileformat = shift or die $usage;

         # create one SeqIO object to read in,and another to write out
         my $seq_in = Bio::SeqIO->new('-file' => "<$infile",
                                      '-format' => $infileformat);
         my $seq_out = Bio::SeqIO->new('-fh' => \*STDOUT,
                                       '-format' => $outfileformat);

         # write each entry in the input file to the output file
         while (my $inseq = $seq_in->next_seq) {

#            $seq_out->write_seq($inseq); # Whole sequence not needed

for my $feat_object ($inseq->get_SeqFeatures)
    {
    if ($feat_object->primary_tag eq "CDS")
        {
        print $feat_object->get_tag_values('product'),"\n";
        print
$feat_object->location->start,"..",$feat_object->location->end,"\n";
        print $feat_object->spliced_seq->seq,"\n\n";
        }
    }



The result seems OK to me, but in case of first CDS of NC_005213.gbk from
here <ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Nanoarchaeum_equitans/> the
output is wrong:

It is:
hypothetical protein
1..490885
TAAATGCGATTGCTATTAGAA..................................Truncated
sequence...................................

Should be:
hypothetical protein
879..490883
ATGCGATTGCTATTAGAA...................................Truncated
sequence....................................TAA



This CDS have an unnatural location string:
CDS             complement(join(490883..490885,1..879)), but spliced_seq
should handle these things?

Please help me!
Best regards, N.



More information about the Bioperl-l mailing list