[Bioperl-l] Muscle Alignment and Memory Allocation

armendarez77 at hotmail.com armendarez77 at hotmail.com
Tue Jul 13 18:00:27 UTC 2010







Hello,

I need to align 20-30 large full-genome sequences (150,000+ bp each), but MUSCLE runs out of memory. I've tried passing -maxmb both on the command line and as an argument to Bio::Tools::Run::Alignment::Muscle, but either I'm using it incorrectly or it has no effect.

I've also tried aligning two sequences at a time and then aligning the resulting alignments with the -profile option, but that still needs too much memory.
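
For reference, the pairwise-then-profile strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exact commands that were run: the file names (pair1.fasta, merged.afa, and so on) and the helper functions are hypothetical, and it assumes MUSCLE 3.x, where profile-profile alignment is invoked with -profile -in1 -in2. By default the script only prints the MUSCLE commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the pairwise-then-profile strategy; all file names are hypothetical.
# With DRY_RUN=1 (the default here) commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Align the sequences in one FASTA file (here, a pair of genomes).
align_pair() {
    run muscle -in "$1" -out "$2" -maxiters 1 -diags1
}

# Merge two existing alignments with MUSCLE's profile-profile mode.
align_profiles() {
    run muscle -profile -in1 "$1" -in2 "$2" -out "$3"
}

align_pair pair1.fasta pair1.afa
align_pair pair2.fasta pair2.afa
align_profiles pair1.afa pair2.afa merged.afa
```

Each pairwise step still has to align two sequences of roughly 150,000 bp, so this layout reduces the number of sequences per run but not the length of each one.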

Do you have any advice on how to do such alignments?  My attempts are below.

Thank you,

Veronica


MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg  length 152670
00:00:00    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 2
00:00:01  105 MB(-37%)  Iter   1    6.25%  Align node
*** OUT OF MEMORY ***
Memory allocated so far 3211.48 MB

Alignment not completed, cannot save.



Using -maxmb at the command line:

$ muscle -in 07-13-2010_fullGenomes.fasta -clwout 07-13-2010_fullGenomes.clw -maxiters 1 -diags1 -sv -maxmb 4000

MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg  length 152670
00:00:00    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 2
00:00:01  105 MB(-37%)  Iter   1    6.25%  Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.74 MB

Alignment not completed, cannot save.


Using Bio::Tools::Run::Alignment::Muscle and -maxmb:

SCRIPT:
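use strict;
use warnings;
use Bio::AlignIO;
use Bio::Tools::Run::Alignment::Muscle;

# FASTA file of genomes, given as the first command-line argument
my $inputFile = $ARGV[0];
my $factory = Bio::Tools::Run::Alignment::Muscle->new(-maxmb => 4000);

# Run MUSCLE and write the resulting alignment in ClustalW format
my $alnObj = $factory->align($inputFile);
my $output = "output.clw";
my $clwOut = Bio::AlignIO->new(-format => 'clustalw', -file => ">$output");
$clwOut->write_aln($alnObj);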

OUTPUT:

MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg  length 152670
00:00:01    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 1
00:00:01    26 MB(-9%)  Iter   1  100.00%  K-mer dist pass 2
00:00:01  105 MB(-37%)  Iter   1    6.25%  Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.9 MB

Alignment not completed, cannot save.

--------------------- WARNING ---------------------
MSG: Muscle call crashed: 512 [command /usr/bin/muscle -in 07-13-2010_fullGenomes.fasta  -out /tmp/ubyNWLmbV8/GggmsmA0vA]

---------------------------------------------------





 		 	   		  


