[Bioperl-l] Muscle Alignment and Memory Allocation
armendarez77 at hotmail.com
Thu Jul 15 15:20:43 EDT 2010
Thank you.
I'll look into those programs.
Veronica
Date: Tue, 13 Jul 2010 15:44:32 -0700
From: jason at bioperl.org
To: armendarez77 at hotmail.com
CC: randalls at bioinfo.wsu.edu; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Muscle Alignment and Memory Allocation
Veronica -
I think whole-genome alignment is better handled by a program other
than MUSCLE, or by something other than a typical MSA approach.
See the extensive literature on tools built for this, such as LAGAN,
PECAN, MAVID, MAUVE, and MERCATOR (scaffold, then align with MAVID or
other tools), to name a few.
If you insist on a traditional multiple-sequence-alignment-only
approach, you may also want to try MAFFT, but it is better suited to
many sequences than to long whole-genome sequences.
-jason
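A rough way to see why a classical dynamic-programming alignment
struggles at this length: a full pairwise matrix has one cell per pair
of positions, so memory grows with the product of the two lengths. The
sketch below assumes one byte per cell and a full (unbanded) matrix.
Both are illustrative assumptions, not a description of MUSCLE's
internals, and the function name dp_matrix_mb is made up for this
example. The length 165101 is the maximum length reported in the MUSCLE
log in this thread.

```shell
#!/bin/sh
# Estimate the size, in MB, of a full dynamic-programming matrix for two
# sequences of the given lengths.
# $1, $2: sequence lengths; $3: bytes per cell (defaults to 1, an assumption).
dp_matrix_mb() {
    len_a=$1
    len_b=$2
    bytes_per_cell=${3:-1}
    echo $(( len_a * len_b * bytes_per_cell / 1024 / 1024 ))
}

# Longest sequence in the log is 165101 bp.
dp_matrix_mb 165101 165101    # prints 25995, i.e. roughly 25 GB
```

Even at one byte per cell the estimate is several times larger than the
roughly 3.2 GB at which the runs in this thread stopped.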
armendarez77 at hotmail.com wrote, On 7/13/10 11:57 AM:
That would be nice, but not possible right now :)
Date: Tue, 13 Jul 2010 11:43:35 -0700
From: randalls at bioinfo.wsu.edu
To: armendarez77 at hotmail.com
CC: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Muscle Alignment and Memory Allocation
One suggestion is to use a computer with a lot more memory.
Randall Svancara
Systems Administrator/DBA/Developer
Main Bioinformatics Laboratory
----- Original Message -----
From: armendarez77 at hotmail.com
To: bioperl-l at lists.open-bio.org
Sent: Tuesday, July 13, 2010 11:00:27 AM
Subject: [Bioperl-l] Muscle Alignment and Memory Allocation
Hello,
I need to align 20-30 large full-genome sequences (150,000+ bp each),
but I run out of memory. I've tried using -maxmb at the command line and
as an argument for Bio::Tools::Run::Alignment::Muscle, but either I'm
using it incorrectly or it's not working.
I've also tried aligning two sequences at a time and then merging those
alignments with the -profile option, but that still needs too much memory.
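The pairwise-then-profile workflow described above could be scripted
roughly as follows. This is a dry-run sketch: it only prints the
commands and does not run MUSCLE. The -profile, -in1, -in2 and -out
options are from the MUSCLE 3.x command line; the function name
plan_profile_merge and the file names (pair1.afa and so on) are made up
for illustration.

```shell
#!/bin/sh
# Print the MUSCLE commands that would merge a list of existing
# alignments one at a time with profile-profile alignment.
# Arguments: two or more aligned FASTA files (hypothetical names).
plan_profile_merge() {
    acc=$1            # alignment accumulated so far
    shift
    step=1
    for next in "$@"; do
        out="merged_${step}.afa"
        echo "muscle -profile -in1 $acc -in2 $next -out $out"
        acc=$out      # the merged result feeds the next step
        step=$((step + 1))
    done
}

plan_profile_merge pair1.afa pair2.afa pair3.afa
```

Each profile-profile step still aligns across the full alignment
length, so this ordering by itself is unlikely to lower peak memory
much, which matches the result reported above.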
Do you have any advice on how to do such alignments? My attempts are
below.
Thank you,
Veronica
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3211.48 MB
Alignment not completed, cannot save.
Using -maxmb at the command line:
$ muscle -in 07-13-2010_fullGenomes.fasta -clwout 07-13-2010_fullGenomes.clw \
      -maxiters 1 -diags1 -sv -maxmb 4000
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.74 MB
Alignment not completed, cannot save.
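One detail in these logs may be worth checking: the runs in this thread
all stop at roughly 3210 MB, which is close to the address space
usually available to a 32-bit process on Linux. If the muscle binary
(or the whole system) is 32-bit, asking for 4000 MB with -maxmb could
not succeed regardless of installed RAM. This is only a guess from the
numbers. The sketch below shows the arithmetic and one way to check the
system word size; the function name addressable_mb is made up for this
example.

```shell
#!/bin/sh
# Upper bound, in MB, on the address space of a process with the given
# pointer width in bits (2^bits bytes = 2^(bits-20) MB).
addressable_mb() {
    echo $(( 1 << ($1 - 20) ))
}

addressable_mb 32    # theoretical ceiling for a 32-bit process: 4096

# Word size of the running system (32 or 64), where getconf is available.
getconf LONG_BIT 2>/dev/null || echo "getconf not available"
```

Running the file utility on the muscle executable would show whether
that particular binary is 32-bit.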
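Using Bio::Tools::Run::Alignment::Muscle and -maxmb:
SCRIPT:
use strict;
use warnings;
use Bio::Tools::Run::Alignment::Muscle;
use Bio::AlignIO;

my $inputFile = $ARGV[0];   # FASTA file of sequences to align
my $factory = Bio::Tools::Run::Alignment::Muscle->new(-maxmb => 4000);
my $alnObj = $factory->align($inputFile);
my $output = "output.clw";
# Write the alignment in ClustalW format ($output already ends in .clw)
my $clwOut = Bio::AlignIO->new(-format => 'clustalw', -file => ">$output");
$clwOut->write_aln($alnObj);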
OUTPUT:
MUSCLE v3.6 by Robert C. Edgar
http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.9 MB
Alignment not completed, cannot save.
--------------------- WARNING ---------------------
MSG: Muscle call crashed: 512 [command /usr/bin/muscle -in
07-13-2010_fullGenomes.fasta -out /tmp/ubyNWLmbV8/GggmsmA0vA]
---------------------------------------------------
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l