[Bioperl-l] Re: Arrays of BioSeq and TCoffee

Brian Osborne brian_osborne@cognia.com
Thu Jan 16 20:34:06 EST 2003


James,

>> I've got round this in a
>> clumsy way (couple of lines), but wondered if there was an easy (more
>> efficient) way of keeping the '>'?

Like this:

@seqs = split /(?=>.+)/,$str;

"?=" indicates a positive lookahead assertion. Don't ask me for an
explanation.
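
For anyone who does want one, here is a minimal sketch of what the lookahead does. The sample string and the variable names other than @seqs are made up for illustration, and the second pattern is a variation that is not from the thread: it anchors the split to the start of a line, so a '>' inside a header does not start a new record.

```perl
use strict;
use warnings;

# Hypothetical input: three FASTA records held in one string.
my $str = ">seq1\nFTTATT\n>seq2\nFTTGTT\n>seq3\nFTATTT\n";

# Split at every position that is immediately followed by '>'.
# A lookahead asserts what comes next without consuming it, so the
# '>' is not thrown away as a delimiter and stays with its record.
my @seqs = split /(?=>)/, $str;

# A zero-width match at the very start of the string does not create
# an empty leading field, so this reports 3 records.
print scalar(@seqs), " records\n";

# Variation (an assumption, not part of the original answer): only
# split where '>' begins a line, in case a header contains '>'.
my $tricky  = ">a x>y\nAC\n>b\nGT\n";
my @loose   = split /(?=>)/,   $tricky;   # also splits inside the header
my @by_line = split /^(?=>)/m, $tricky;   # splits only at line starts
```

Without the lookahead, split />/ would consume each '>' and also return an empty first element, which is why the plain version needs the extra clean-up lines.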

 ;-)


Brian O.

-----Original Message-----
From: bioperl-l-admin@bioperl.org [mailto:bioperl-l-admin@bioperl.org]On
Behalf Of James Wasmuth
Sent: Thursday, January 16, 2003 3:14 PM
To: James Wasmuth
Cc: bioperl-l@bioperl.org
Subject: [Bioperl-l] Re: Arrays of BioSeq and TCoffee

Sorry peeps, realised my mistake...

didn't reference the array, kinda misread the documentation, oh and
didn't read the synopsis at all, wrist has been duly slapped...
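
For anyone hitting the same error, a rough sketch of the difference between passing an array and passing a reference to it. The sub count_seqs is a hypothetical stand-in for a method that expects a single array reference, as the T-Coffee align method is described as doing in the original post; it is not part of BioPerl, and the strings are placeholders for Bio::Seq objects.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that expects ONE argument:
# a reference to an array of sequence objects.
sub count_seqs {
    my ($seqs_ref) = @_;
    die "expected an array reference\n" unless ref $seqs_ref eq 'ARRAY';
    return scalar @$seqs_ref;
}

# Placeholders standing in for Bio::Seq objects.
my @seq = ('seq1', 'seq2', 'seq3');

# Passing a reference hands over the whole array as one argument.
my $n = count_seqs(\@seq);

# Passing the array itself flattens it into three separate arguments,
# so the sub only sees the first element and rejects it.
my $flattened_ok = eval { count_seqs(@seq); 1 } ? 1 : 0;

print "by reference: $n sequences\n";
print "flattened call accepted: $flattened_ok\n";
```

In the script from the original post, the corresponding change would be to call align with \@seq rather than @seq.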

I do have a Perl-based question tho.  I need to separate a string into
an array of fasta sequences.  At the moment I use split command but need
to give it an expression with which to delimit, the obvious choice is
'>', but that means I lose it from the header.  I've got round this in a
clumsy way (couple of lines), but wondered if there was an easy (more
efficient) way of keeping the '>'?


James Wasmuth wrote:

> Hi people,
>
> I have a number of fasta format sequences in one string, which I wish
> to align using T-Coffee, without writing the sequences to a file first.
>
> According to doc for T-Coffee module, the align method will take a
> filename or an array of references for Bio::Seq objects...
>
> I've tried the following script but it fails, I am informed that the
> first Bio::Seq object contains less than 2 sequences.  Well of course
> it does, it's a Bio::Seq object...  Can anyone see my error...
>
> my @fasta_seq = (">seq1\nFTTATT\n", ">seq2\nFTTGTT\n", ">seq3\nFTATTT\n");
> my @seq;
> foreach (@fasta_seq)    {
>        my $stringfh = new IO::String($_);
>        my $seqio = new Bio::SeqIO(-fh => $stringfh, -format => 'fasta');
>        push @seq, $seqio->next_seq;     }
> # here I know that @seq is an array of Bio::Seq objects
>
> my @params = (some params);
> my $factory = new Bio::Tools::Run::Alignment::TCoffee (@params);
> my $aln = $factory->align(@seq);
>
>
> Can a Bio::Seq object hold more than one sequence? Is that my problem?
>
> Many Thanks
>
> J
>
>

--

Nematode Bioinformatics
Blaxter Nematode Genomics Group
Institute of Cell, Animal and Population Biology
Ashworth Labs
University of Edinburgh
King's Buildings
Edinburgh
EH9 3JT

0131 650 7403



_______________________________________________
Bioperl-l mailing list
Bioperl-l@bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-l