[Bioperl-l] Re: Thanks

Barry Moore barry.moore at genetics.utah.edu
Wed Aug 24 08:30:00 EDT 2005


Usha-

It is important that you keep Bioperl related discussions on the bioperl 
list, that way others can benefit from the discussion in the future by 
searching the archives.  Having said that, I am constantly accidentally 
replying directly to people and not to the list because I hit reply 
instead of reply all, so I'm not really a good one to talk.

It seems from nature of your questions to this list that you might be 
quite new to perl programming.  You participation on the list is still 
welcome, but this is the kind of problem that you want to learn how to 
solve yourself.  You want to to avoid at all cost giving the impression 
that you are asking the list to do your debugging for you  - otherwise 
people will just stop replying to your messages.  An very valuable 
article about asking questions to forums like this one is found at 
http://www.catb.org/~esr/faqs/smart-questions.html#answers 
<http://www.catb.org/%7Eesr/faqs/smart-questions.html#answers>.  Read 
it.  Live it.  O.K.  enough preaching...

If you have Programming Perl 
<http://www.amazon.com/exec/obidos/tg/detail/-/0596000278/qid=1124883861/sr=8-1/ref=pd_bbs_1/102-6339742-6061723?v=glance&s=books&n=507846> 
read the first half of chapter 20.  If you don't have the Perl book, you 
can get the same info from 
http://www.perldoc.com/perl5.8.4/pod/perldebug.html.  After you 
understand the perl debugger (or if you already do) look at the error 
message that you got.  In the first line it reports "seq doesn't 
validate".  So something is wrong with some sequence that you are trying 
to use in your script.  Since perl itself isn't' aware of sequences, and 
since the stack trace below shows that the exception occurred while in 
Bio::PrimarySeq you can conclude that some sequence that your are 
sending to bioperl is bad.  Now fire up perl running your script with 
the debugger like this perl -d yourscript.pl.  Use 'n' to step through 
your code to the open command on line 14.  No errors yet?  You've just 
loaded your sequences.fasta sequence, so that sequence must be OK.  
Continue stepping through your code until you get the error again.  
Where exactly did the error occur?  When you try to set $input2 as a new 
Bio::Seq object?  Run the debugger again, and step through to the line 
just before where the error occurred.  Use the debugger's x command to 
see what the values of  $seq_id, $probe_id, $position, $probe_sequence 
are.  This should give you a clue s to what your problem is.  One more 
clue comes from the error message.  It says "Attempting to set the 
sequence to [PROBE_SEQUENCE] which does not look healthy".  The error 
message says that you are trying to set the sequence to PROBE_SEQUENCE.  
Try to figure out why this error is occurring and how to solve it.  If 
you're still stuck let us know what you've tried and ask us again. 

Barry


Usha Rani Reddi wrote:

>Hi,
>Thanks a lot for your help. When I tried to run the given code I got the
>following message.
>
>MSG: seq doesn't validate, mismatch is 1
>---------------------------------------------------
>
>------------- EXCEPTION  -------------
>MSG: Attempting to set the sequence to [PROBE_SEQUENCE] which does not
>look healthy
>STACK Bio::PrimarySeq::seq
>/usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:268
>STACK Bio::PrimarySeq::new
>/usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:217
>STACK Bio::Seq::new /usr/lib/perl5/site_perl/5.8.5/Bio/Seq.pm:498
>STACK toplevel barr:23
>
>What should I do next? Please help me.
>Thanks
>Usha.
>
>----- Original Message -----
>From: Barry Moore <bmoore at genetics.utah.edu>
>Date: Tuesday, August 23, 2005 2:40 pm
>Subject: [Bioperl-l] RE: Local bl2seq
>
>  
>
>>Usha-
>>
>>I think the code below will wrap your existing code in the loop you
>>need.  You will want to get a copy of a good perl programming book 
>>likeProgramming Perl from O'Reilly.  It will help you out with all 
>>thoselittle perl details like loop structures etc.
>>
>>Barry
>>
>>#!/usr/bin/perl
>>
>>use strict;
>>use warnings;
>>use Bio::SeqIO;
>>use Bio::Tools::Run::StandAloneBlast;
>>use Bio::Seq;
>>
>>my $seqio_obj = Bio::SeqIO->new(-file => " sequences.fasta",
>>                               -format => "fasta" );
>>
>>my $seq_obj = $seqio_obj->next_seq;
>>
>>open (IN, " location/of/your/probe/file") or die "Can't open IN";
>>
>>while (my $row = <IN>) {
>>   chomp $row;
>>   #Assuming your file is tab delimited...
>>   my ($seq_id, $probe_id, $position, $probe_sequence) = split /\t/,
>>$row;
>>
>>   my $input2 = Bio::Seq->new(-id=>"testquery2",
>>                              -seq=> $probe_sequence
>>                           );
>>
>>   my $factory = Bio::Tools::Run::StandAloneBlast->new('program' =>
>>'blastn',
>>                                                       'outfile' =>
>>'bl2seq1.out');
>>
>>   my $blast_report = $factory->bl2seq($seq_obj, $input2);
>>
>>   #Here is where you want to throw out good matches.  You'll need to
>>determine
>>   #what method you want to do that.  Maybe since you want there 
>>to be
>>no good
>>   #hits you would just call $blast_report->max_significance and make
>>sure it's
>>   #value is too high to be significant.
>>   if ($blast_report->max_significance > 0.01) {
>>       print "$row\n";
>>   }
>>}
>>
>>-----Original Message-----
>>From: Usha Rani Reddi [mailto:ureddi at emich.edu] 
>>Sent: Tuesday, August 23, 2005 5:51 AM
>>To: Barry Moore
>>Cc: bioperl-l at portal.open-bio.org
>>Subject: Local bl2seq
>>
>>Hi,
>>I am trying to use BLAST to compare the sequences. I did the 
>>program in
>>Bioperl. Below is my piece of code
>>use Bio::SeqIO;
>>use Bio::Tools::Run::StandAloneBlast;
>>use Bio::Seq;
>>$seqio_obj = Bio::SeqIO->new(-file => "sequences.fasta",
>>                            -format => "fasta" );
>>$seq_obj = $seqio_obj->next_seq;
>>$input2 = Bio::Seq->new(-id=>"testquery2",
>>                                
>>-seq=>"ggacccgatgactagccccttgatcgtagcagtggcaagtca");
>>         
>>$factory = Bio::Tools::Run::StandAloneBlast->new('program' =>
>>'blastn','outfile' => 'bl2seq1.out');
>>$blast_report = $factory->bl2seq($seq_obj, $input2);
>>
>>I need help for looping input2. I want to extract this part of 
>>sequencefrom a file containing 200000 records. Using perl I am 
>>extracting the
>>sequence part for file of format given below.
>>SEQ_ID	PROBE_ID	POSITION	PROBE_SEQUENCE
>>NC_004116	1	1	AATTAACATTGTTGATTTTATTCTTCAACATC
>>NC_004116	3	13	TGATTTTATTCTTCAACATCTGTGGAAAACTT
>>NC_004116	5	25	TCAACATCTGTGGAAAACTTTATTTTTTTATG
>>
>>code for extracting PROBE_SEQUENCE looks like this
>>
>>$NemSeq =<STDIN>;
>>
>>chomp $NemSeq;
>>
>>unless (open(seqfile, $NemSeq)) {
>>print "Cannot open file \n";
>>exit;
>>}
>>@NemSeq = <seqfile> ;
>>
>>close seqfile;
>>
>>for (my $k = 0 ; $k < scalar @NemSeq ; ++$k) {
>>   #print $k, $NemSeq[$k];
>>   @Nem =split(/\t/,$NemSeq[$k]);
>>   $input= $Nem[3];
>>
>>   #print scalar(@Nem);
>>   #print $Nem[3], "\n";
>>   
>>}
>>
>>
>>@Nem =split(/\t/,$NemSeq)
>>
>>$input2 = substr(@NemSeq,4,32);
>>
>>So far I could successfully use bioperl(bl2seq) to compare whole 
>>genomewith single probe. 
>>I want to compare all the 200000 thousand probes. I am interested only
>>in mismatches, for this particular scenario my assumption is that more
>>than 90% of them will match. I want to send only the mismatches to
>>output file and discard the matches. I would like to classify the
>>mismatches based on the percentage dissimilarity, is there a way in
>>Bioperl for this? Thanks a lot for the reply. Please help me with 
>>this.Thanks
>>Usha
>>
>>
>>----- Original Message -----
>>From: Barry Moore <barry.moore at genetics.utah.edu>
>>Date: Monday, August 22, 2005 11:45 pm
>>Subject: Re: [Bioperl-l] bl2seq
>>
>>    
>>
>>>Usha,
>>>
>>>The best advice I can give you is that you need to focus your 
>>>question a 
>>>bit more.  What method are you using to compare your probe to 
>>>      
>>>
>>your 
>>    
>>
>>>fasta?  Regex, BLAST, Needle, RNAHybrid...?  You say your 
>>>      
>>>
>>sequence 
>>    
>>
>>>is 
>>>working fine for single sequence.  Are you using Bioperl for 
>>>      
>>>
>>that?  
>>    
>>
>>>Can 
>>>you tell us exactly what isn't working for you or what questions 
>>>you 
>>>have about working with multiple sequences?  Are you already 
>>>      
>>>
>>using 
>>    
>>
>>>Bioperl with your single sequence comparison? Can you show us 
>>>      
>>>
>>some 
>>    
>>
>>>code?
>>>Barry
>>>
>>>Usha Rani Reddi wrote:
>>>
>>>      
>>>
>>>>Hi,
>>>>I am trying to compare two hundred thousand probes(each one of 
>>>>        
>>>>
>>>them) to
>>>      
>>>
>>>>another genome. Format of the file containing probes is like this
>>>>SEQ_ID	PROBE_ID	POSITION	PROBE_SEQUENCE
>>>>NC_004116	1	1	AATTAACATTGTTGATTTTATTCTTCAACATC
>>>>NC_004116	3	13	TGATTTTATTCTTCAACATCTGTGGAAAACTT
>>>>NC_004116	5	25	TCAACATCTGTGGAAAACTTTATTTTTTTATG
>>>>NC_004116	7	37	GAAAACTTTATTTTTTTATGGTACAATATAAC
>>>>NC_004116	9	49	TTTTTATGGTACAATATAACAATAATTATCCA
>>>>NC_004116	11	61	AATATAACAATAATTATCCACAAGACAATAAG
>>>>NC_004116	13	73	ATTATCCACAAGACAATAAGGAAGAAGCTATG
>>>>NC_004116	15	85	ACAATAAGGAAGAAGCTATGACGGAAAACGAA
>>>>What I am trying to do is compare PROBE_SEQUENCE to fasta file of
>>>>Streptococcus agalactiae. I am trying to loop through the probes 
>>>>        
>>>>
>>>but not
>>>      
>>>
>>>>sure how to proceed. My program is working fine for single 
>>>>        
>>>>
>>>sequence. One
>>>      
>>>
>>>>more thing is I am not interested in matches, I want to display 
>>>>        
>>>>
>>only> >mismatches. I am new to Bioperl, some one please help me 
>>with this.
>>    
>>
>>>>Thanks
>>>>Usha
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l at portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>> 
>>>>
>>>>        
>>>>
>>>-- 
>>>Barry Moore
>>>Dept. of Human Genetics
>>>University of Utah
>>>Salt Lake City, UT
>>>
>>>
>>>
>>>
>>>      
>>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l at portal.open-bio.org
>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>    
>>

-- 
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT



More information about the Bioperl-l mailing list