Bioperl: ary/FASTA

Ewan Birney birney@sanger.ac.uk
Thu, 3 Jun 1999 12:03:42 +0100 (BST)


On Wed, 2 Jun 1999, Nirav Merchant wrote:

> Greetings,
> 	I am reading in a file in FASTA format. It has multiple sequences and "-"
> characters. I use the ary function in Bio::Seq to split it, and the resultant
> array is devoid of "-" characters, i.e. AA-TT becomes AATT.
> 	How can I maintain the "-" in the individual sequences while reading in
> the sequence?

If you want to read in a multiple alignment, the Bio::SimpleAlign module
works well (and will preserve the '-' characters). If you just want to read
sequences with '-' characters in them, this will also work, but beware: it
reads the *entire* file into memory, so a large file will cause trouble
unless you have a lot of memory.

Go:

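use strict;
use warnings;
use Bio::SimpleAlign;

# Read the whole alignment from standard input; '-' characters are kept.
my $aln = Bio::SimpleAlign->new();
$aln->read_fasta(\*STDIN);

foreach my $seq ( $aln->eachSeq() ) {
	# $seq is a Bio::Seq object
	print "The name is ", $seq->id(), "\n";
}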

Look at examples/simplealign.pl for more ideas about how to use it, and
the Bio::SimpleAlign documentation.
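
If the file is too big to hold in memory and you do not need an alignment
object, one option is to read it one record at a time in plain Perl. The
following is only a rough sketch, not part of bioperl: next_fasta_record is
a hypothetical helper name, and it assumes the input starts with a '>'
header line. It strips only whitespace from the sequence lines, so the '-'
characters survive. The sample data is held in a string so the sketch runs
on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not a bioperl function): returns the next FASTA
# record from an open filehandle as (id, sequence), or an empty list at
# end of file. Assumes the input starts with a '>' header line.
sub next_fasta_record {
    my ($fh) = @_;
    local $/ = "\n>";               # one record per read
    my $record = <$fh>;
    return unless defined $record;
    chomp $record;                  # drop the trailing "\n>" separator
    $record =~ s/^>//;              # the first record still has its '>'
    my ($header, @lines) = split /\n/, $record;
    my ($id) = $header =~ /^(\S+)/; # id is the first word of the header
    my $seq = join '', @lines;
    $seq =~ s/\s+//g;               # remove whitespace only, keep '-'
    return ($id, $seq);
}

# Sample data in a string, read through an in-memory filehandle.
my $fasta = ">seq1 example\nAA-TT\n>seq2\nCC--\nGG\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
while ( my ($id, $seq) = next_fasta_record($fh) ) {
    print "$id\t$seq\n";
}
close $fh;
```

This prints seq1 with AA-TT and seq2 with CC--GG. To read a real file, pass
\*STDIN or a filehandle opened on the file in place of $fh; only one record
is held in memory at a time.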


Email me if you want to do something else, or if the documentation is
unclear.


> 
> Any pointers will be appreciated.
> 
> Thanks,
> Nirav
> 
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://bio.perl.org/
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
> ====================================================================
> 

Ewan Birney
<birney@sanger.ac.uk>
http://www.sanger.ac.uk/Users/birney/