[Bioperl-l] Muscle Alignment and Memory Allocation
armendarez77 at hotmail.com
Tue Jul 13 18:00:27 UTC 2010
Hello,
I need to align 20-30 full genome sequences (150,000+ bp each) with MUSCLE, but it runs out of memory. I've tried setting -maxmb both on the command line and as an argument to Bio::Tools::Run::Alignment::Muscle, but either I'm using it incorrectly or it has no effect.
I've also tried aligning two sequences at a time and then aligning the resulting alignments with the -profile option, but that still needs too much memory.
Do you have any advice on how to do such alignments? My attempts are below.
Thank you,
Veronica
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3211.48 MB
Alignment not completed, cannot save.
Using -maxmb at the command line:
$ muscle -in 07-13-2010_fullGenomes.fasta -clwout 07-13-2010_fullGenomes.clw -maxiters 1 -diags1 -sv -maxmb 4000
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.74 MB
Alignment not completed, cannot save.
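For scale, a back-of-the-envelope estimate may help explain why raising -maxmb makes no difference here. The sketch below assumes that the "Align node" step fills a full dynamic-programming matrix at one byte per cell. That is an assumption about MUSCLE's internals, not something the log states, and the helper name dp_matrix_mb is hypothetical. The 6.25% progress figure is 1/16, so with 17 sequences the failure appears to happen on the very first of the 16 node alignments.

```shell
#!/bin/sh
# Rough size, in MB, of a full len_a x len_b dynamic-programming matrix,
# ASSUMING bytes_per_cell bytes per cell (default 1). This is an assumption
# about the aligner's internals, not a figure taken from the MUSCLE log.
dp_matrix_mb() {
    len_a=$1
    len_b=$2
    bytes_per_cell=${3:-1}
    echo $(( len_a * len_b * bytes_per_cell / 1048576 ))
}

# Average and maximum sequence lengths reported in the log above.
echo "avg x avg: $(dp_matrix_mb 152670 152670) MB"
echo "max x max: $(dp_matrix_mb 165101 165101) MB"
echo "requested -maxmb: 4000 MB"
```

Under that assumption a single pairwise step already needs on the order of 22 GB, several times the 4000 MB requested, so a higher limit alone would not be expected to rescue the run.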
Using Bio::Tools::Run::Alignment::Muscle and -maxmb:
SCRIPT:
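use strict;
use warnings;
use Bio::Tools::Run::Alignment::Muscle;
use Bio::AlignIO;

my $inputFile = $ARGV[0];
# Intended to raise MUSCLE's memory limit to 4000 MB.
my $factory = Bio::Tools::Run::Alignment::Muscle->new(-maxmb=>4000);
my $alnObj = $factory->align($inputFile);
my $output = "output.clw";
my $clwOut = Bio::AlignIO->new(-format=>'clustalw', -file=>">$output");
$clwOut->write_aln($alnObj);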
OUTPUT:
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.9 MB
Alignment not completed, cannot save.
--------------------- WARNING ---------------------
MSG: Muscle call crashed: 512 [command /usr/bin/muscle -in 07-13-2010_fullGenomes.fasta -out /tmp/ubyNWLmbV8/GggmsmA0vA]
---------------------------------------------------
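One detail in the warning may be worth checking: the command the wrapper reports running does not appear to include -maxmb. The sketch below tests a logged command string for a given flag. The helper name has_flag is hypothetical, and the command string is copied from the warning above.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the command line in $1 contains the flag $2
# as a separate whitespace-delimited word. has_flag is a hypothetical helper.
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Command copied from the wrapper's warning message.
logged_cmd='/usr/bin/muscle -in 07-13-2010_fullGenomes.fasta -out /tmp/ubyNWLmbV8/GggmsmA0vA'

for flag in -in -out -maxmb; do
    if has_flag "$logged_cmd" "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: missing"
    fi
done
```

If -maxmb is reported as missing, the option set in the constructor may never have reached muscle, in which case the wrapper run would be effectively the same as a run without the option.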