[Bioperl-l] Re: [Bioperl-guts-l] SeqIO does not work with raw mode?

Jason Stajich jason at cgt.mc.duke.edu
Thu Apr 10 10:01:04 EDT 2003


On Thu, 10 Apr 2003, Juergen Rose wrote:

> Hi,
>
> when I try to read a GenBank file (gbest1.seq) with bioperl-1.2.1 SeqIO and
> print the sequence with the following script:
>
> my $in  = Bio::SeqIO->new('-file' => "$infile",'-format' => 'genbank');
> while ( my $seq = $in->next_seq() ) {
>    my $acc=$seq->id;
>    print "id=$acc\n";
>    my $stringio = IO::String->new($string);
>    my $test = Bio::SeqIO->new('-fh' => $stringio,'-format' => "$out_fmt");
>    $test->write_seq($seq);
>    $string =~ s|(>)(\w+)|$1<font color="Red">$2</font>|g;
>    print $string;
> }
>
> I get the expected output only for the first entry, AA000001. The output
> for the second entry (AA000002) ends with the last bases of the first
> entry, the output for the third entry ends with bases of the first and
> the second entry, etc.
> It seems to me that no null byte is written after the end of the sequence
> in $string. Is this a feature or an error? If this is a feature, is there
> an easy way to get the sequence length for a $seq object?
>
I'm not sure exactly what you want, but to get the sequence length do:

print "sequence length is ", $seq->length(), "\n";

If you just want the sequence string, call:

$seq->seq();

To get the HTML tags you want, you might find this easier:

printf "><font color=\"Red\">%s</font> %s\n",$seq->display_id,
                                             $seq->description;
my $str = $seq->seq();
my $width = 60; # fasta sequence width
if(length($str) > 0) { $str =~ s/(.{1,$width})/$1\n/g;}
else { $str = "\n"; }
print $str;
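
If it helps, the same formatting can be wrapped in a small subroutine. This is only a sketch: format_html_fasta is a made-up name (it is not part of BioPerl), and it takes plain strings rather than a sequence object, so you would pass in $seq->display_id, $seq->description and $seq->seq() yourself. The ID, description and sequence in the example call are toy data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): build a FASTA-style record
# with the ID wrapped in a red <font> tag.
sub format_html_fasta {
    my ($id, $desc, $str, $width) = @_;
    $width ||= 60;                        # fasta sequence width
    $desc = '' unless defined $desc;      # tolerate a missing description
    my $out = sprintf("><font color=\"Red\">%s</font> %s\n", $id, $desc);
    if (length($str) > 0) {
        # break the sequence into lines of at most $width characters
        $str =~ s/(.{1,$width})/$1\n/g;
    } else {
        $str = "\n";
    }
    return $out . $str;
}

# Toy data standing in for a real sequence object
print format_html_fasta('AA000001', 'example EST', 'ACGT' x 20, 30);
```

Building the whole record as one string and printing it once also means there is no shared buffer to carry over between entries.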

The bptutorial.pl and the documentation for Bio::Seq and Bio::PrimarySeq
(perldoc Bio::Seq  or perldoc Bio::PrimarySeq or perldoc Bio::PrimarySeqI )
might also help you get a handle on how to use the objects.
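
As for the leftover bases at the end of each record: looking at the script, $string is never emptied between iterations. As far as I can tell, each new IO::String handle then starts writing at the beginning of the old buffer and overwrites it without truncating, which would leave the tail of any longer, earlier sequence in place. Clearing $string (or declaring it with my inside the loop) before creating the IO::String should avoid that. The sketch below reproduces the effect with a core Perl in-memory filehandle opened read/write, which I am assuming behaves like IO::String in this respect; write_into is a made-up helper and the sequences are toy data.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text at the start of the buffer that $bufref
# points to, without truncating it first ('+<' keeps the old contents).
sub write_into {
    my ($bufref, $text) = @_;
    open(my $fh, '+<', $bufref) or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh);
}

my $string = '';

write_into(\$string, "AAAAAAAAAA\n");   # first, longer record
write_into(\$string, "CCCC\n");         # second, shorter record
print $string;                          # tail of the first record is still there

$string = '';                           # reset the buffer before reuse
write_into(\$string, "CCCC\n");
print $string;                          # only the second record
```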


-jason


> Please also send your answers to my private email; I do not follow the
> mailing list yet.
>
>
> Thank you for your answers
> 	Juergen
>
> The output of the above script is:
>
> id=AA000001
> TAANTGAGATCTAGGTATTAACCTGCTGTCTAGCGAAAACTAGTCACTAAGTCCTGGCCTGAGAGATACCCACATTTCCTTTAGAACAAACAGAACTAATACCTGTGTACATTTCTGAGAGCCTGATGTGTGAGTCCTTAAAATGTAGACCTTGCAGGAGGCTTAGACCTCAGTTTCACCTAATGCATGTGGAGGAAATGGAGGTGAGAATAGTCACCTGAAGAGTGCAAGCGCTCCAGCTCCAGCACACACACTCTTCCCTGGGCAGCAGGAAAAGGAGGTAACAAGGACTTGGGCTGACATCTGAAGCACTANGCTAATGTGCCTGGTAGAGGGGAGCCTCAGGAAGNCACAAGATGGTCATTCCACCTNGTAGCTGTCCACAAACCTGAGGTTTCCACATCGTTTTTAAAGGGCACAGTGGGCAAATGTGNCAAGGCAGAAAACCAATAACCATTTCAAGGGNTCACTTGNid=AA000002
> CCACCTTTCCCTCCACTCCTCACGTTCTCACCTGTAAAGCGTCCCTCCCTCATCCCCATGCCCCCTTACCCTGCAGGGTAGAGTAGGCTAGAAACCAGAGAGCTCCAAGCTCCATCTGTGGAGAGGTGCCATCCTTGGGCTGCAGAGAGAGGAGAATTTGCCCCAAAGCTGCCTGCAGAGCTTCACCACCCTTAGTCTCACAAAGCCTTGAGTTCATAGCATTTCTTGAGTTTTCACCCTGCCCAGCAGGACACTGCAGCACCCAAAGGGCTTCCCAGGAGTAGGGTTGCCCTCAAGAGGCTCTTGGGTCTGATGGCCACATCCTGGAATTGTTTTCAAGTTGATGGTCACAGCCCTGAGGCATGTAGGGGCGTGGGGATGCGCTCTGCTCTGCTCTCCTCTCCTGAACCCCTGAACCCTCTGGCTACCCCAGAGCACTTAGAGCCAG
> ATAACCATTTCAAGGGNTCACTTGN
> id=AA000003
> GGCGCTTCCAATGCCAGTGCTCCAGCAAACCCGTGCCGAAGATCATGGGCTGTGACGCATGCCTTTAATCCCAACGCTCAGAAGGCAGAGACAGGCAAATCTCAGTGAGTCTGAGGCCAGCCTGGTCTATACAGGAAGTTCCAGGACAGCCAGGGTCTCTGGAACTCGAGGTTCCTGAAGAGCTCCACTAAGGACTCCAGATCGCCAGCCTCTGTGTGTACTTCTCTCCTAATTTGAAGATTCATTTCCTATCTCCTCAAACTCACTACTATTCAACGTGTCATGGTTTCCAAACCCAGGACTGAATGGAGCATGTCCTGTCCCACCTGGTATTGGCAGGGGTGTCCCTGCATGCAGCGGTG
> TGTAGGGGCGTGGGGATGCGCTCTGCTCTGCTCTCCTCTCCTGAACCCCTGAACCCTCTGGCTACCCCAGAGCACTTAGAGCCAG
> ATAACCATTTCAAGGGNTCACTTGN
> id=AA000004
> GTCCGAGAAGGAAGTGCATCTGCATACGACCACCACACCTGAGTGCAGAAGCTATTCCACGTGCAGTCATGGCAGGTGAGTCGCTGACTATCACCTTGGATACGGTGTCGAGCTGGGAGCTGTAAACCTGACTATGAGGATCTGCTAGGAGGTCATCTCCTCAGATGACATTGGGCTGCGCATATATCTCATCTGCAGGAGACGACGAGGACTAGTCGAGACCTTCTGTACTCTTGATGCTGCGACTCTAGCATGGCTGACTGAGGAGCATGCGCAACATGTGTCACACGCTCATCC
> GGACTGAATGGAGCATGTCCTGTCCCACCTGGTATTGGCAGGGGTGTCCCTGCATGCAGCGGTG
> TGTAGGGGCGTGGGGATGCGCTCTGCTCTGCTCTCCTCTCCTGAACCCCTGAACCCTCTGGCTACCCCAGAGCACTTAGAGCCAG

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
