[Bioperl-l] Muscle Alignment and Memory Allocation

armendarez77 at hotmail.com armendarez77 at hotmail.com
Thu Jul 15 15:20:43 EDT 2010


Thank you.

I'll look into those programs.

Veronica

Date: Tue, 13 Jul 2010 15:44:32 -0700
From: jason at bioperl.org
To: armendarez77 at hotmail.com
CC: randalls at bioinfo.wsu.edu; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Muscle Alignment and Memory Allocation

Veronica -

I think whole-genome alignment is better done with a program other
than MUSCLE, or with something other than a typical MSA approach.

See the extensive literature on this type of approach: LAGAN,
PECAN, MAVID, MAUVE, and MERCATOR (scaffold, then align with MAVID or
other tools), to name a few.

If you insist on a traditional multiple-sequence-alignment-only
approach, you may also want to try MAFFT, but it is better suited to
many sequences than to long whole-genome sequences.

-jason



armendarez77 at hotmail.com wrote, On 7/13/10 11:57 AM:

That would be nice, but not possible right now :)

Date: Tue, 13 Jul 2010 11:43:35 -0700
From: randalls at bioinfo.wsu.edu
To: armendarez77 at hotmail.com
CC: bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Muscle Alignment and Memory Allocation

One suggestion is to use a computer with a lot more memory...

Randall Svancara
Systems Administrator/DBA/Developer
Main Bioinformatics Laboratory



----- Original Message -----
From: armendarez77 at hotmail.com
To: bioperl-l at lists.open-bio.org
Sent: Tuesday, July 13, 2010 11:00:27 AM
Subject: [Bioperl-l] Muscle Alignment and Memory Allocation

Hello,

I need to align 20-30 large full genome sequences (150,000+ bp each),
but I run out of memory. I've tried using -maxmb at the command line and
as an argument for Bio::Tools::Run::Alignment::Muscle, but I'm either
using it wrong or it's not working.

I've also tried aligning two sequences at a time and then aligning those
alignments with the -profile option, but that still needs too much memory.

Do you have any advice on how to do such alignments? My attempts are
below.

Thank you,

Veronica


MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3211.48 MB

Alignment not completed, cannot save.



Using -maxmb at the command line:

$ muscle -in 07-13-2010_fullGenomes.fasta \
    -clwout 07-13-2010_fullGenomes.clw \
    -maxiters 1 -diags1 -sv -maxmb 4000

MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:00 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.74 MB

Alignment not completed, cannot save.
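
For a rough sense of scale, here is a back-of-the-envelope estimate in Perl. It assumes, purely for illustration, that aligning one pair of sequences needs a full quadratic dynamic-programming matrix at one byte per cell. The thread does not say what MUSCLE actually allocates, and dp_matrix_bytes is a made-up helper name. Under that assumption, a single pair at the reported maximum length of 165101 would already need far more than the 4000 MB passed to -maxmb.

```perl
use strict;
use warnings;

# Hypothetical helper: bytes needed for a full len_a x len_b matrix.
# The one-byte-per-cell figure used below is an assumption, not a
# measured property of MUSCLE.
sub dp_matrix_bytes {
    my ($len_a, $len_b, $bytes_per_cell) = @_;
    return $len_a * $len_b * $bytes_per_cell;
}

my $max_len = 165_101;    # "max length" from the MUSCLE log above
my $bytes   = dp_matrix_bytes($max_len, $max_len, 1);

# Report in decimal gigabytes (1 GB = 1e9 bytes).
printf "%.1f GB for one pairwise matrix\n", $bytes / 1e9;
```

With these numbers the script prints 27.3 GB, so raising a memory cap by a few hundred MB would not be expected to help if the assumption is anywhere near right.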


Using Bio::Tools::Run::Alignment::Muscle and -maxmb:

SCRIPT:

use strict;
use warnings;
use Bio::AlignIO;
use Bio::Tools::Run::Alignment::Muscle;

my $inputFile = $ARGV[0];
my $factory = Bio::Tools::Run::Alignment::Muscle->new(-maxmb => 4000);

my $alnObj = $factory->align($inputFile);
# Use the name as-is; ">$output.clw" would have written output.clw.clw
my $output = "output.clw";
my $clwOut = Bio::AlignIO->new(-format => 'clustalw', -file => ">$output");
$clwOut->write_aln($alnObj);

OUTPUT:

MUSCLE v3.6 by Robert C. Edgar

http://www.drive5.com/muscle This software is donated to the public
domain. Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

07-13-2010_fullGenomes 17 seqs, max length 165101, avg length 152670
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 1
00:00:01 26 MB(-9%) Iter 1 100.00% K-mer dist pass 2
00:00:01 105 MB(-37%) Iter 1 6.25% Align node
*** OUT OF MEMORY ***
Memory allocated so far 3210.9 MB

Alignment not completed, cannot save.

--------------------- WARNING ---------------------
MSG: Muscle call crashed: 512 [command /usr/bin/muscle -in
07-13-2010_fullGenomes.fasta -out /tmp/ubyNWLmbV8/GggmsmA0vA]

---------------------------------------------------
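
The 512 in the warning looks like the raw wait status that Perl's system() leaves in $?, not the exit code itself. Assuming that is what the wrapper prints (the thread does not show the wrapper source), the exit code sits in the high byte. A minimal sketch, with decode_wait_status as a hypothetical helper name:

```perl
use strict;
use warnings;

# Split a raw wait status (the value Perl stores in $? after system())
# into the child's exit code and the terminating signal number, if any.
sub decode_wait_status {
    my ($status) = @_;
    my $exit_code = $status >> 8;     # high byte: exit code
    my $signal    = $status & 127;    # low 7 bits: signal number
    return ($exit_code, $signal);
}

my ($exit_code, $signal) = decode_wait_status(512);
print "exit code $exit_code, signal $signal\n";
```

For 512 this prints "exit code 2, signal 0", which would mean muscle exited on its own with a non-zero status and was not killed by a signal.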






_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l