Bioperl: Is this a bug?

Ewan Birney birney@sanger.ac.uk
Sat, 31 Oct 1998 12:56:18 +0000 (GMT)


On Fri, 30 Oct 1998, Osborne, Brian wrote:

> To the group,
> 
> First, congratulations on the impending release. This
> group has been an example not just of fine coding but
> of fine collaboration as well. My question is
> demonstrated here :
> 
> use Bio::Seq;
> 
> $myEnt = ">somesequence
> gagagaatatagggcgctcgctt
> gagatagatataggggggggggg
> ";
> 
> $mySeq = Bio::Seq->new(-seq =>$myEnt,-ffmt => 'Fasta' );
> 
> print $mySeq->id;
> 


This should be the following code:

$id = "somesequence";
$seq = "ATGGGGCCTGGGCCCCTGGGGCCCCGGGT";

$mySeq = Bio::Seq->new(-id => $id, -seq => $seq);
# could add a -ffmt => 'fasta' if we liked


print $mySeq->layout; # prints fasta format because that is the default.

# if you already have the sequence as a fasta-formatted string,
# you have to do the parsing yourself at the moment

$string = ">newseq\nTTGGAAAATGGGTGGGCCCCCGGTGG\n";

# this is not the ideal pattern match as the sequence has to be
# on one line. Beware!

$string =~ /^>(\S+)\s+([A-Za-z]+)/ || die("Not a fasta format string [$string]");

$id = $1;
$seq = $2; 

# etc....
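
If the sequence is spread over several lines (as in your example), the
one-line pattern match above is not enough. A minimal sketch of one way
to do it in plain Perl follows. Note that parse_fasta_string is a
hypothetical helper written for this mail, not a method of Bio::Seq, and
that it assumes a single record made up of letters only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Seq): split a single-record
# fasta string into id, description and sequence.
sub parse_fasta_string {
    my ($string) = @_;

    # The first line is the header, everything after it is sequence.
    my ($header, @lines) = split /\r?\n/, $string;
    defined $header && $header =~ /^>(\S+)[ \t]*(.*)$/
        or die "Not a fasta format string [$string]";
    my ($id, $desc) = ($1, $2);

    # Join the remaining lines and drop any embedded whitespace.
    my $seq = join '', @lines;
    $seq =~ s/\s+//g;

    # Assumption: letters only, as in the one-line pattern match.
    $seq =~ /^[A-Za-z]+$/
        or die "No sequence found for [$id]";

    return ($id, $desc, $seq);
}

my ($id, $desc, $seq) = parse_fasta_string(
    ">somesequence\ngagagaatatagggcgctcgctt\ngagatagatataggggggggggg\n"
);
print "$id\n$seq\n";
```

The resulting $id and $seq can then be passed to Bio::Seq->new with the
-id and -seq arguments as shown earlier.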


You can read from a fasta file using the -file option:

$seq = Bio::Seq->new(-file => $filename);

BUT not at the moment from a stream (sadly).
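
Until stream reading is supported, one possible workaround is to read the
records from the filehandle yourself and build the objects afterwards.
The following is only a sketch: read_fasta_stream is a hypothetical
helper, not part of the package, and the in-memory filehandle merely
stands in for STDIN or a pipe.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Seq): read every fasta record
# from an already-open filehandle and return a list of [id, sequence].
sub read_fasta_stream {
    my ($fh) = @_;
    my (@records, $id, $seq);

    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ /^>(\S*)/) {
            my $new_id = $1;
            die "Fasta header without an id at line $." if $new_id eq '';
            # A new header closes the previous record, if there was one.
            push @records, [$id, $seq] if defined $id;
            ($id, $seq) = ($new_id, '');
        }
        elsif (defined $id) {
            # Sequence line: strip whitespace and append.
            $line =~ s/\s+//g;
            $seq .= $line;
        }
    }
    # Do not forget the last record in the stream.
    push @records, [$id, $seq] if defined $id;
    return @records;
}

# An in-memory filehandle stands in for STDIN or a pipe here.
my $data = ">seq1\nATGG\nCCTT\n>seq2\nGGGA\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
for my $rec (read_fasta_stream($fh)) {
    print "$rec->[0]\t$rec->[1]\n";
}
close $fh;
```

Each [id, sequence] pair can then be handed to Bio::Seq->new through the
-id and -seq arguments.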



On object creation the -ffmt argument only sets the default layout
used when the sequence is written out. It has nothing to do with the
input (perhaps slightly counterintuitive).

James Gilbert from the Sanger Centre wants something like the code snippet
you have just written. It's on the list of things to look at for after
the release.


If you can't get this to work, do come back again with another post.


Thanks for trying out the package.



> 
> 
> The result is : 
> 
> No_Id_Given
> 
> 
> 
> If I do :
> 
> print $mySeq->layout
> 
> the result is :
> 
> >No_Id_Given No Description Given
> >somesequence
> gagagaatatagggcgctcgctt
> gagatagatataggggggggggg
> 
> 
> If I invoke new() with the "-file" option, and the input file is a fasta
> file and I specify -ffmt => 'Fasta', the id is taken from the string
> (skipping white space) after the ">" (and I can see why in the
> parse_fasta method). But you can see that if I use the string above then
> no id is found. Is this the desired result? Shouldn't these behave the
> same? That is, I'm specifying fasta so the header should at least be
> removed. I ftp'd my Seq.pm and Object.pm from bio.perl.org this afternoon.
> 
> Thanks again,
> 
> Brian O.
> 
> Brian Osborne
> Cadus Pharmaceutical Corporation
> 777 Old Saw Mill River Rd.
> Tarrytown NY USA
> 10591-6705
> brian.osborne@cadus.com
> TEL 914 467 6291
> FAX 914 345 3565
> 
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://bio.perl.org/
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> ====================================================================
> 

Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/
