[Bioperl-l] help with parsing meme output

Stefan Kirov skirov at utk.edu
Thu Jun 23 21:12:35 EDT 2005


James,
your first suggestion (to use $psmIO->length) was correct, but the next one
was not. The code
while    (my %header=$psmIO->header) {
makes no sense.
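The reason can be shown in plain Perl with no BioPerl involved. A hash
assignment used as a loop condition evaluates to the number of elements on
the right-hand side, so the loop is either skipped or repeats forever; it
does not step through records. In the sketch below, header() is a
hypothetical stand-in that returns a fixed list, not the BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical stand-in: returns the same key/value list on every call.
sub header { return (sites => 12, width => 8) }

# In scalar context a list assignment yields the number of right-hand elements.
my $count = (my %h = header());
print "assignment evaluates to $count\n";

# So the condition below stays true on every pass; the guard is the only exit.
my $passes = 0;
while (my %header = header()) {
    last if ++$passes >= 3;
}
print "left the loop by hand after $passes passes\n";
```

An iterator-style call such as next_psm works as a loop condition because it returns a false value once the input is exhausted.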
What you probably want to do is:
while (my $psm=$psmIO->next_psm) {
    my %header=$psm->header; # Bio::Matrix::PSM::Psm method
    # Do something with the hash
}
But header has a different purpose: it holds data about a particular
prediction, such as the number of sites, the width of the motif, etc.
If you need the lengths of the initial sequences, it is exactly as you suggested:
my %lengths=$psmIO->length;
foreach my $id (keys %lengths) {
    print "Initial sequence $id length is ",$lengths{$id},"\n";
}
That's all!
To get the length of a particular hit (that is, the sequence on which a
predicted motif is based):
while (my $psm=$psmIO->next_psm) {
    my $instances=$psm->instances;
    foreach my $instance (@{$instances}) {
        print "Hit length is ",$instance->length,"\n";
    }
}
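Putting the two answers together for the original question (sequence id,
sequence length, start and score per hit), a minimal sketch might look like
the following. It is an outline under stated assumptions, not a recipe from
the module documentation: hit_rows is a hypothetical helper name, and it
assumes that $instance->primary_id returns the same ids that key the hash
from $psmIO->length, which nobody in this thread has confirmed. The BioPerl
part only runs when a MEME output file is given as the first argument.

```perl
use strict;
use warnings;

# Hypothetical helper: one row per motif hit.
# $psmIO is expected to behave like a Bio::Matrix::PSM::IO parser.
sub hit_rows {
    my ($psmIO) = @_;
    my %lengths = $psmIO->length;    # input sequence lengths keyed by sequence id
    my @rows;
    while (my $psm = $psmIO->next_psm) {
        foreach my $instance (@{ $psm->instances }) {
            # ASSUMPTION: primary_id matches the keys of %lengths.
            my $id = $instance->primary_id;
            push @rows, [ $id, $lengths{$id}, $instance->start, $instance->score ];
        }
    }
    return @rows;
}

if (@ARGV) {
    require Bio::Matrix::PSM::IO;
    my $psmIO = Bio::Matrix::PSM::IO->new(-file => $ARGV[0], -format => 'meme');
    foreach my $row (hit_rows($psmIO)) {
        # Columns: sequence id, sequence length, hit start, hit score.
        print join("\t", map { defined $_ ? $_ : 'NA' } @$row), "\n";
    }
}
```

Run it as, for example, perl parsememe.pl memeout.txt (the script name is arbitrary). A hit whose id has no entry in %lengths prints NA in the length column, which would be a sign that the id assumption does not hold for that file.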
Let me know if there are further questions.
Stefan

James Wasmuth wrote:

> It would appear that $psmIO->header is not implemented in PSM/IO.pm.
> Does anyone know if this is to be done?
>
> Nandita Mullapudi wrote:
>
>> thanks James,
>> this one gives the error
>>
>> Can't use an undefined value as an ARRAY reference at
>> /usr/lib/perl5/site_perl/5.6.1/Bio/Matrix/PSM/IO/meme.pm line
>> 159, <GEN0> line 43.
>>
>> i've attached the text output i am trying to parse
>>
>> -nandita
>>
>>
>> ---- Original message ----
>>  
>>
>>> Date: Thu, 23 Jun 2005 21:36:03 +0100
>>> From: James Wasmuth <james.wasmuth at ed.ac.uk>
>>> Subject: Re: [Bioperl-l] help with parsing meme output
>>> To: Nandita Mullapudi <nandita at uga.edu>
>>> Cc: bioperl-l at portal.open-bio.org
>>>
>>> Does this behave itself?
>>> while    (my %header=$psmIO->header) {
>>>     for (my $i=0; $i<=$#{$header{instances}};$i++)    {
>>>         print $header{instances}->[$i],"\t",$header{lengths}->[$i],"\n";
>>>     }
>>> }
>>>
>>>
>>> I don't use these modules but having looked at the docs this
>>> should work. Although the notes in Bio::Matrix::PSM::IO for
>>> this method say it should be obsolete.
>>>
>>> If you still get no joy then attach a copy of the output file
>>> to an email. This should provide people with an example.
>>>
>>> Nandita Mullapudi wrote:
>>>
>>>   
>>>
>>>> ok i've got to be missing something here.
>>>>
>>>> this is my code:
>>>>
>>>> use strict;
>>>> use warnings;
>>>> use Bio::Matrix::PSM::IO;
>>>> use Bio::Matrix::PSM::InstanceSite;
>>>>
>>>> my $psmIO = new Bio::Matrix::PSM::IO(    -file => 'memeout.txt',
>>>>                                        -format => 'meme');
>>>>
>>>> while    (my %header=$psmIO->header) {
>>>>   foreach my $seqid (@{$header{instances}}) {
>>>> print "$header->length";
>>>>
>>>> }
>>>> }
>>>>
>>>>
>>>> and the error i get is " Global symbol "$header" requires
>>>> explicit package name at parsememe2.pl line 15.
>>>> Execution of parsememe2.pl aborted due to compilation errors.
>>>>
>>>>
>>>> thanks for your help.
>>>> -n
>>>>
>>>>
>>>>
>>>>
>>>> ---- Original message ----
>>>>
>>>>
>>>>     
>>>>
>>>>> Date: Thu, 23 Jun 2005 20:51:48 +0100
>>>>> From: James Wasmuth <james.wasmuth at ed.ac.uk>
>>>>> Subject: Re: [Bioperl-l] help with parsing meme output
>>>>> To: Nandita Mullapudi <nandita at uga.edu>
>>>>> Cc: bioperl-l at bioperl.org
>>>>>
>>>>> Nandita
>>>>>
>>>>> The BioPerl module $header->length() comes from is
>>>>> PSM/PsmHeader.pm
>>>>>
>>>>> This should be inherited when you "use Bio::Matrix::PSM::IO"
>>>>>
>>>>> have a look 
>>>>> http://doc.bioperl.org/releases/bioperl-1.4/Bio/Matrix/PSM/IO.html
>>>>>
>>>>> What you want should be covered there. Otherwise shout and
>>>>> someone will answer
>>>>>
>>>>> -james
>>>>>
>>>>>
>>>>>
>>>>> Nandita Mullapudi wrote:
>>>>>
>>>>>  
>>>>>       
>>>>>
>>>>>> thanks James,
>>>>>>
>>>>>>> Is it the length from the input sequence that you want?
>>>>>>>
>>>>>>> my %length= $header->length();
>>>>>>> Function: Returns the length of the input sequence or motifs
>>>>>>> as a hash, indexed by a sequence ID (motif id or accession number)
>>>>>>
>>>>>> yes, i want the length from the input sequence. I am not sure
>>>>>> i can use the above without specifying which module / package
>>>>>> it refers to?
>>>>>> also , where can i find this info? :)
>>>>>> thanks,
>>>>>> -nandita
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>    
>>>>>>         
>>>>>>
>>>>>>> james
>>>>>>>
>>>>>>>
>>>>>>> Nandita Mullapudi wrote:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>      
>>>>>>>           
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I am trying to use Bio::Matrix::PSM::IO to parse meme
>>>>>>>> output.
>>>>>>>> I need to extract the values corresponding to length of the
>>>>>>>> sequence, seq id, and motif id, start and
>>>>>>>> significance/score.
>>>>>>>> I can get the last three using
>>>>>>>> foreach my $instance (@{ $instances }) {
>>>>>>>>     my $start = $instance -> start;
>>>>>>>>     my $score = $instance -> score;
>>>>>>>>
>>>>>>>> But i cannot find out how to get the seq id and seq length.
>>>>>>>> any ideas?
>>>>>>>> thanks
>>>>>>>> -nandita
>>>>>>>>
>>>>>>>> ***************************************************
>>>>>>>> Graduate Student, Kissinger Lab.
>>>>>>>> Dept. of Genetics
>>>>>>>> UGA, Athens GA 30602 USA
>>>>>>>> lab phone: 706-542-6563
>>>>>>>> cell phone: 706-254-2444
>>>>>>>> Lab add: C318 Life Sciences
>>>>>>>> ****************************************************
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at portal.open-bio.org
>>>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>>
>>>>>>>>
>>>>>>>>   
>>>>>>>>        
>>>>>>>>             
>>>>>>>
>>>>>>> -- 
>>>>>>> http://www.nematodes.org/~james
>>>>>>>
>>>>>>> "Until man duplicates a blade of grass, nature can laugh at
>>>>>>> his so-called scientific knowledge...."
>>>>>>>         --Thomas Edison
>>>>>>> Blaxter Nematode Genomics Group   |
>>>>>>> Institute of Evolutionary Biology |
>>>>>>> Ashworth Laboratories, KB         | tel: +44 131 650 7403
>>>>>>> University of Edinburgh           | web: www.nematodes.org
>>>>>>> Edinburgh                         |
>>>>>>> EH9 3JT                           |
>>>>>>> UK                                |   
>>>>>>>
>



